aws_s3

Type: Output
Available in: Cloud, Self-Managed

Sends message parts as objects to an Amazon S3 bucket. Each object is uploaded to the path specified by the path field.

Common:

    output:
      label: ""
      aws_s3:
        bucket: "" # No default (required)
        path: ${!counter()}-${!timestamp_unix_nano()}.txt
        tags: {}
        content_type: application/octet-stream
        metadata:
          exclude_prefixes: []
        max_in_flight: 64
        batching:
          count: 0
          byte_size: 0
          period: ""
          check: ""
          processors: [] # No default (optional)

Advanced:

    output:
      label: ""
      aws_s3:
        bucket: "" # No default (required)
        path: ${!counter()}-${!timestamp_unix_nano()}.txt
        tags: {}
        content_type: application/octet-stream
        content_encoding: ""
        cache_control: ""
        content_disposition: ""
        content_language: ""
        content_md5: ""
        website_redirect_location: ""
        metadata:
          exclude_prefixes: []
        storage_class: STANDARD
        kms_key_id: ""
        checksum_algorithm: ""
        server_side_encryption: ""
        force_path_style_urls: false
        max_in_flight: 64
        timeout: 5s
        object_canned_acl: private
        batching:
          count: 0
          byte_size: 0
          period: ""
          check: ""
          processors: [] # No default (optional)
        region: "" # No default (optional)
        endpoint: "" # No default (optional)
        tcp:
          connect_timeout: 0s
          keep_alive:
            idle: 15s
            interval: 15s
            count: 9
          tcp_user_timeout: 0s
        credentials:
          profile: "" # No default (optional)
          id: "" # No default (optional)
          secret: "" # No default (optional)
          token: "" # No default (optional)
          from_ec2_role: "" # No default (optional)
          role: "" # No default (optional)
          role_external_id: "" # No default (optional)

To give each object a different path, use the function interpolations described in Bloblang queries, which are evaluated for each message of a batch.
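As a minimal sketch of per-message path interpolation, the configuration below derives each object key from message metadata plus a random identifier. The bucket name is hypothetical, and the example assumes that messages carry a kafka_topic metadata key (for instance, because they were consumed by a Kafka input); neither detail comes from this page.

```yaml
output:
  aws_s3:
    bucket: example-bucket # hypothetical bucket name
    # The interpolations are resolved once per message, so every object
    # in a batch receives its own key.
    # Assumes a kafka_topic metadata key is present on each message.
    path: ${!meta("kafka_topic")}/${!uuid_v4()}.json
    content_type: application/json
```

Using a random identifier rather than a counter is one way to avoid key collisions when several instances write to the same bucket; the counter-and-timestamp default serves a similar purpose.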
Metadata

Metadata fields on messages are sent as headers. To mutate or remove these values, see the metadata docs.

Tags

The tags field allows you to specify key/value pairs to attach to objects as tags, where the values support interpolation functions:

    output:
      aws_s3:
        bucket: TODO
        path: ${!counter()}-${!timestamp_unix_nano()}.tar.gz
        tags:
          Key1: Value1
          Timestamp: ${!meta("Timestamp")}

Credentials

By default, Redpanda Connect uses a shared credentials file when connecting to AWS services. You can also set credentials explicitly at the component level, which allows you to transfer data across accounts. You can find out more in Amazon Web Services.

Batching

It’s common to want to upload messages to S3 as batched archives. The easiest way to do this is to batch your messages at the output level and join the batch of messages with an archive or compress processor.

For example, the following configuration uploads messages as a .tar.gz archive of documents:

    output:
      aws_s3:
        bucket: TODO
        path: ${!counter()}-${!timestamp_unix_nano()}.tar.gz
        batching:
          count: 100
          period: 10s
          processors:
            - archive:
                format: tar
            - compress:
                algorithm: gzip

Alternatively, this configuration uploads JSON documents as a single large document containing an array of objects:

    output:
      aws_s3:
        bucket: TODO
        path: ${!counter()}-${!timestamp_unix_nano()}.json
        batching:
          count: 100
          processors:
            - archive:
                format: json_array

Performance

This output benefits from sending multiple messages in flight in parallel for improved performance. You can tune the maximum number of in-flight messages (or message batches) with the max_in_flight field.

Fields

batching
Allows you to configure a batching policy.
Type: object

    # Examples:
    batching:
      byte_size: 5000
      count: 0
      period: 1s

    # ---
    batching:
      count: 10
      period: 1s

    # ---
    batching:
      check: this.contains("END BATCH")
      count: 0
      period: 1m

batching.byte_size
An amount of bytes at which the batch should be flushed. If 0, size-based batching is disabled.
Type: int
Default: 0

batching.check
A Bloblang query that should return a boolean value indicating whether a message should end a batch.
Type: string
Default: ""

    # Examples:
    check: this.type == "end_of_transaction"

batching.count
A number of messages at which the batch should be flushed. If 0, count-based batching is disabled.
Type: int
Default: 0

batching.period
A period in which an incomplete batch should be flushed regardless of its size.
Type: string
Default: ""

    # Examples:
    period: 1s

    # ---
    period: 1m

    # ---
    period: 500ms

batching.processors[]
A list of processors to apply to a batch as it is flushed. This allows you to aggregate and archive the batch however you see fit. Note that all resulting messages are flushed as a single batch, so splitting the batch into smaller batches using these processors is a no-op.
Type: processor

    # Examples:
    processors:
      - archive:
          format: concatenate

    # ---
    processors:
      - archive:
          format: lines

    # ---
    processors:
      - archive:
          format: json_array

bucket
The bucket to upload messages to.
Type: string

cache_control
The cache control to set for each object. This field supports interpolation functions.
Type: string
Default: ""

checksum_algorithm
The algorithm used to validate each object during its upload to the Amazon S3 bucket.
Type: string
Default: ""
Options: CRC32, CRC32C, SHA1, SHA256

content_disposition
The content disposition to set for each object. This field supports interpolation functions.
Type: string
Default: ""

content_encoding
An optional content encoding to set for each object. This field supports interpolation functions.
Type: string
Default: ""

content_language
The content language to set for each object. This field supports interpolation functions.
Type: string
Default: ""

content_md5
The content MD5 to set for each object. This field supports interpolation functions.
Type: string
Default: ""

content_type
The content type to set for each object. This field supports interpolation functions.
Type: string
Default: application/octet-stream

credentials
Optional manual configuration of AWS credentials to use. More information can be found in Amazon Web Services.
Type: object

credentials.from_ec2_role
Use the credentials of a host EC2 machine configured to assume an IAM role associated with the instance.
Type: bool

credentials.id
The ID of credentials to use.
Type: string

credentials.profile
A profile from ~/.aws/credentials to use.
Type: string

credentials.role
A role ARN to assume.
Type: string

credentials.role_external_id
An external ID to provide when assuming a role.
Type: string

credentials.secret
The secret for the credentials being used. This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see Manage Secrets before adding it to your configuration.
Type: string

credentials.token
The token for the credentials being used, required when using short term credentials.
Type: string

endpoint
Allows you to specify a custom endpoint for the AWS API.
Type: string

force_path_style_urls
Forces the client API to use path style URLs, which helps when connecting to custom endpoints.
Type: bool
Default: false

kms_key_id
An optional server-side encryption key.
Type: string
Default: ""

max_in_flight
The maximum number of messages to have in flight at a given time. Increase this to improve throughput.
Type: int
Default: 64

metadata
Specify criteria for which metadata values are attached to objects as headers.
Type: object

metadata.exclude_prefixes[]
Provide a list of explicit metadata key prefixes to be excluded when adding metadata to sent messages.
Type: array
Default: []

object_canned_acl
The object canned ACL value.
Type: string
Default: private
Options: private, public-read, public-read-write, authenticated-read, aws-exec-read, bucket-owner-read, bucket-owner-full-control

path
The path of each message to upload. This field supports interpolation functions.
Type: string
Default: ${!counter()}-${!timestamp_unix_nano()}.txt

    # Examples:
    path: ${!counter()}-${!timestamp_unix_nano()}.txt

    # ---
    path: ${!meta("kafka_key")}.json

    # ---
    path: ${!json("doc.namespace")}/${!json("doc.id")}.json

region
The AWS region to target.
Type: string

server_side_encryption
An optional server-side encryption algorithm.
Type: string
Default: ""

storage_class
The storage class to set for each object. This field supports interpolation functions.
Type: string
Default: STANDARD
Options: STANDARD, REDUCED_REDUNDANCY, GLACIER, STANDARD_IA, ONEZONE_IA, INTELLIGENT_TIERING, DEEP_ARCHIVE

tags
Key/value pairs to store with the object as tags. This field supports interpolation functions.
Type: string
Default: {}

    # Examples:
    tags:
      Key1: Value1
      Timestamp: ${!meta("Timestamp")}

tcp
Configure TCP socket-level settings to optimize network performance and reliability. These low-level controls are useful for:
- High-latency networks: Increase connect_timeout to allow more time for connection establishment
- Long-lived connections: Configure keep_alive settings to detect and recover from stale connections
- Unstable networks: Tune keep-alive probes to balance between quick failure detection and avoiding false positives
- Linux systems with specific requirements: Use tcp_user_timeout (Linux 2.6.37+) to control data acknowledgment timeouts
Most users should keep the default values. Only modify these settings if you’re experiencing connection stability issues or have specific network requirements.
Type: object

tcp.connect_timeout
Maximum amount of time a dial will wait for a connect to complete. Zero disables.
Type: string
Default: 0s

tcp.keep_alive
TCP keep-alive probe configuration.
Type: object

tcp.keep_alive.count
Maximum unanswered keep-alive probes before dropping the connection. Zero defaults to 9.
Type: int
Default: 9

tcp.keep_alive.idle
Duration the connection must be idle before sending the first keep-alive probe. Zero defaults to 15s.
Negative values disable keep-alive probes.
Type: string
Default: 15s

tcp.keep_alive.interval
Duration between keep-alive probes. Zero defaults to 15s.
Type: string
Default: 15s

tcp.tcp_user_timeout
Maximum time to wait for acknowledgment of transmitted data before killing the connection. Linux-only (kernel 2.6.37+), ignored on other platforms. When enabled, keep_alive.idle must be greater than this value per RFC 5482. Zero disables.
Type: string
Default: 0s

timeout
The maximum period to wait on an upload before abandoning it and reattempting.
Type: string
Default: 5s

website_redirect_location
The website redirect location to set for each object. This field supports interpolation functions.
Type: string
Default: ""
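As a rough sketch of how several of the fields above can be combined, the following configuration targets a custom endpoint with explicit credentials. The bucket name, region, endpoint URL, and environment variable names are assumptions made for illustration and do not come from this page. The ${VAR} form is environment-variable substitution in the configuration file, which keeps the secret out of the configuration itself.

```yaml
output:
  aws_s3:
    bucket: example-bucket            # hypothetical bucket name
    path: ${!counter()}-${!timestamp_unix_nano()}.txt
    region: us-east-1                 # assumed region
    endpoint: http://localhost:9000   # hypothetical custom endpoint
    force_path_style_urls: true       # path-style URLs help with custom endpoints
    timeout: 30s                      # wait longer than the 5s default before reattempting
    max_in_flight: 64
    credentials:
      id: ${S3_ACCESS_KEY_ID}         # hypothetical environment variable
      secret: ${S3_SECRET_ACCESS_KEY} # hypothetical environment variable
```

Whether a longer timeout is appropriate depends on object size and network conditions; the value shown is only a starting point.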