Transform JSON Messages into a New Topic using JQ

This lab contains a reusable data transform built with jaq, a Rust implementation of the popular jq command-line JSON processor.

See the jq manual for more information on how to write a filter: https://jqlang.github.io/jq/manual/

Prerequisites

You must have the following:

  • At least version 1.75 of Rust installed on your host machine.

  • The Wasm target for Rust installed. To install this target, run the following:

    rustup target add wasm32-wasi
  • rpk installed on your host machine.

  • Docker and Docker Compose installed on your host machine.
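
Before starting, it can help to confirm that the tools listed above are on your PATH. The following Python sketch is not part of the lab: the helper name missing_tools is hypothetical, and the binary names (rustc, rustup, rpk, docker) are assumed to match a standard installation. It only checks that each binary can be found. It does not verify the Rust version, the Wasm target, or the Docker Compose plugin.

```python
import shutil


def missing_tools(names):
    """Return the tools from names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


# Assumed binary names for the prerequisites in this lab.
required = ["rustc", "rustup", "rpk", "docker"]

missing = missing_tools(required)
if missing:
    print("Missing from PATH:", ", ".join(missing))
else:
    print("All required tools were found on PATH.")
```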

Run the lab

  1. Clone this repository:

    git clone https://github.com/redpanda-data/redpanda-labs.git
  2. Change into the data-transforms/rust/jq/ directory:

    cd redpanda-labs/data-transforms/rust/jq
  3. Set the REDPANDA_VERSION environment variable to at least version v23.3.1, the release that introduced data transforms. For all available versions, see the GitHub releases.

    For example:

    export REDPANDA_VERSION=25.1.1
  4. Set the REDPANDA_CONSOLE_VERSION environment variable to the version of Redpanda Console that you want to run. For all available versions, see the GitHub releases.

    You must use at least version v3.0.0 of Redpanda Console to deploy this lab.

    For example:

    export REDPANDA_CONSOLE_VERSION=3.0.0
  5. Start Redpanda in Docker by running the following command:

    docker compose up -d --wait
  6. Set up your rpk profile:

    rpk profile create jq --from-profile profile.yml
  7. Create the required topics:

    rpk topic create src sink
  8. Build and deploy the transform function:

    rpk transform build
    rpk transform deploy --var=FILTER='del(.email)' --input-topic=src --output-topic=sink

    This example accepts the following environment variable:

    • FILTER (required): The jq expression that will run on each record’s value.

  9. Run rpk topic produce:

    rpk topic produce src
  10. Paste the following into the prompt and press Ctrl+D to exit:

    {"foo":42,"email":"help@example.com"}
  11. Consume the sink topic to confirm that the email field was removed and the record was written to the sink topic:

    rpk topic consume sink --num 1
    {
      "topic": "sink",
      "value": "{\"foo\":42}",
      "timestamp": 1707749921393,
      "partition": 0,
      "offset": 0
    }

You can also see this in Redpanda Console.
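
To make the effect of the filter concrete, here is a small Python sketch that mimics what del(.email) does to a record value, and shows how to decode the escaped "value" field in the rpk topic consume output. It is an illustration only, not the lab's Rust transform: the function names del_email and consumed_value are hypothetical, and the sketch assumes every record value is a JSON object.

```python
import json


def del_email(value: bytes) -> bytes:
    """Mimic the jq filter del(.email) on one JSON-encoded record value."""
    doc = json.loads(value)
    # Remove the key if present; all other keys are left untouched.
    doc.pop("email", None)
    # Compact separators match the compact form seen in the sink topic.
    return json.dumps(doc, separators=(",", ":")).encode("utf-8")


def consumed_value(rpk_json: str) -> dict:
    """Decode the nested JSON in the "value" field of the consume output."""
    envelope = json.loads(rpk_json)
    # "value" is itself a JSON document stored as an escaped string.
    return json.loads(envelope["value"])


src_value = b'{"foo":42,"email":"help@example.com"}'
sink_value = del_email(src_value)
print(sink_value.decode("utf-8"))  # {"foo":42}

# Build an envelope shaped like the consume output, then decode it again.
envelope = json.dumps({"topic": "sink", "value": sink_value.decode("utf-8")})
print(consumed_value(envelope))  # {'foo': 42}
```

The "value" field in the consume output is a string, so it has to be parsed a second time to get back the JSON object that the transform wrote.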

Clean up

To shut down and delete the containers along with all your cluster data:

docker compose down -v