Configure Data Transforms

Learn how to configure data transforms in Redpanda, including the transform.yaml file, environment variables, and memory settings. This topic covers configuration of both the transform functions and the WebAssembly (Wasm) engine’s environment.

Configure transform functions

This section covers how to configure transform functions using the transform.yaml configuration file, command-line overrides, and environment variables.

Transform configuration file

When you initialize a data transforms project, a transform.yaml file is generated in the directory you provide. Use this file to set the transform function's input and output topics, the language used for the data transform, and any environment variables. The file supports the following fields:

  • name: The name of the transform function.

  • description: A description of what the transform function does.

  • input-topic: The topic from which data is read.

  • output-topics: A list of up to eight topics to which the transformed data is written.

  • language: The language used for the transform function. The language is set to the one you defined during initialization.

  • env: A dictionary of custom environment variables that are passed to the transform function. Do not prefix keys with REDPANDA_. Check the list of all limitations.

Here is an example of a transform.yaml file:

name: redpanda-example
description: |
  This transform function is an example to demonstrate how to configure data transforms in Redpanda.
input-topic: example-input-topic
output-topics:
  - example-output-topic-1
  - example-output-topic-2
language: tinygo-no-goroutines
env:
  DATA_TRANSFORMS_ARE_AWESOME: 'true'
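
The two limits stated in the field list (at most eight output topics, and no env keys prefixed with REDPANDA_) can be checked before deployment. The following is a minimal sketch, not part of any Redpanda tooling: the TransformConfig type and its Validate method are hypothetical names introduced for illustration, and the struct is populated by hand rather than parsed from transform.yaml.

```go
package main

import (
	"fmt"
	"strings"
)

// TransformConfig mirrors the transform.yaml fields described above.
// Hypothetical type for illustration; it is not part of a Redpanda SDK.
type TransformConfig struct {
	Name         string
	Description  string
	InputTopic   string
	OutputTopics []string
	Language     string
	Env          map[string]string
}

// maxOutputTopics is the documented upper limit on output topics.
const maxOutputTopics = 8

// Validate checks only the two constraints stated in the field list.
// Other limitations may apply that this sketch does not cover.
func (c TransformConfig) Validate() error {
	if n := len(c.OutputTopics); n > maxOutputTopics {
		return fmt.Errorf("output-topics lists %d topics, limit is %d", n, maxOutputTopics)
	}
	for key := range c.Env {
		if strings.HasPrefix(key, "REDPANDA_") {
			return fmt.Errorf("env key %q must not start with REDPANDA_", key)
		}
	}
	return nil
}

func main() {
	// Values copied from the example transform.yaml above.
	cfg := TransformConfig{
		Name:         "redpanda-example",
		InputTopic:   "example-input-topic",
		OutputTopics: []string{"example-output-topic-1", "example-output-topic-2"},
		Language:     "tinygo-no-goroutines",
		Env:          map[string]string{"DATA_TRANSFORMS_ARE_AWESOME": "true"},
	}
	if err := cfg.Validate(); err != nil {
		fmt.Println("invalid configuration:", err)
		return
	}
	fmt.Println("configuration passes the documented checks")
}
```

A check like this could run in CI before a deploy step, so that a misnamed environment variable is caught early.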

Override configurations with command-line options

You can set the name of the transform function, environment variables, and input and output topics on the command line when you deploy the transform. These command-line settings take precedence over those specified in the transform.yaml file.
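
The precedence rule can be illustrated with a small sketch. This is not how the deploy command is implemented; Settings and Resolve are hypothetical names. It also assumes that environment variables are merged key by key, with the command-line value winning on conflict. The text above does not say whether the merge is per key or replaces the whole map.

```go
package main

import "fmt"

// Settings holds the values that can come from transform.yaml or the command line.
// Hypothetical type for illustration only.
type Settings struct {
	Name         string
	InputTopic   string
	OutputTopics []string
	Env          map[string]string
}

// Resolve starts from the file settings and applies every non-empty
// command-line value on top, so the command line takes precedence.
func Resolve(file, cli Settings) Settings {
	out := file
	if cli.Name != "" {
		out.Name = cli.Name
	}
	if cli.InputTopic != "" {
		out.InputTopic = cli.InputTopic
	}
	if len(cli.OutputTopics) > 0 {
		out.OutputTopics = cli.OutputTopics
	}
	// Assumption: env is merged per key. Build a fresh map so that
	// neither input map is mutated.
	out.Env = make(map[string]string, len(file.Env)+len(cli.Env))
	for k, v := range file.Env {
		out.Env[k] = v
	}
	for k, v := range cli.Env {
		out.Env[k] = v
	}
	return out
}

func main() {
	file := Settings{
		Name:         "redpanda-example",
		InputTopic:   "example-input-topic",
		OutputTopics: []string{"example-output-topic-1"},
		Env:          map[string]string{"DATA_TRANSFORMS_ARE_AWESOME": "true"},
	}
	// Only the input topic is overridden; everything else comes from the file.
	cli := Settings{InputTopic: "override-input-topic"}

	resolved := Resolve(file, cli)
	fmt.Println("name:", resolved.Name)
	fmt.Println("input topic:", resolved.InputTopic)
	fmt.Println("output topics:", resolved.OutputTopics)
}
```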

Built-in environment variables

In addition to custom environment variables set on the command line or in the configuration file, Redpanda makes some built-in environment variables available to your transform functions. These variables include:

  • REDPANDA_INPUT_TOPIC: The input topic specified.

  • REDPANDA_OUTPUT_TOPIC_0..REDPANDA_OUTPUT_TOPIC_N: The output topics in the order specified on the command line or in the configuration file. For example, REDPANDA_OUTPUT_TOPIC_0 holds the first output topic, REDPANDA_OUTPUT_TOPIC_1 holds the second, and so on.

Transform functions are isolated from the broker’s internal environment variables to maintain security and encapsulation. Each transform function only uses the environment variables explicitly provided to it.
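
Inside a transform function, the built-in and custom variables described above can be read like ordinary environment variables. The following is a minimal sketch using only the Go standard library. It assumes that output topic indexes are contiguous starting at 0, and the outputTopics helper is a hypothetical name, not part of the transform SDK.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// outputTopics collects REDPANDA_OUTPUT_TOPIC_0, REDPANDA_OUTPUT_TOPIC_1, ...
// and stops at the first index that is not set.
// lookup has the same signature as os.LookupEnv so it can be swapped in tests.
func outputTopics(lookup func(string) (string, bool)) []string {
	var topics []string
	for i := 0; ; i++ {
		topic, ok := lookup("REDPANDA_OUTPUT_TOPIC_" + strconv.Itoa(i))
		if !ok {
			return topics
		}
		topics = append(topics, topic)
	}
}

func main() {
	// Built-in variable holding the input topic.
	input, ok := os.LookupEnv("REDPANDA_INPUT_TOPIC")
	if !ok {
		fmt.Println("REDPANDA_INPUT_TOPIC is not set")
	}
	fmt.Println("input topic:", input)

	// Built-in variables holding the output topics, in order.
	fmt.Println("output topics:", outputTopics(os.LookupEnv))

	// Custom variable taken from the example transform.yaml.
	fmt.Println("custom variable:", os.Getenv("DATA_TRANSFORMS_ARE_AWESOME"))
}
```

Passing the lookup function as a parameter keeps the helper testable outside a deployed transform, where the built-in variables are not set.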

Configure the Wasm engine

This section covers how to configure the Wasm engine environment using Redpanda cluster configuration properties.

Enable data transforms

To use data transforms, you must enable them for a Redpanda cluster by setting the data_transforms_enabled cluster property to true.

Configure transform logging

The following properties configure logging for data transforms: