Amazon Kinesis
Synopsis
Creates a target that writes log messages to Amazon Kinesis Data Streams, with support for batching and AWS authentication. Delivery is batched, with a configurable per-batch record limit. Amazon Kinesis Data Streams is a fully managed streaming data service that enables real-time data processing at scale.
Schema
```yaml
- name: <string>
  description: <string>
  type: amazonkinesis
  pipelines: <pipeline[]>
  status: <boolean>
  properties:
    key: <string>
    secret: <string>
    session: <string>
    region: <string>
    endpoint: <string>
    stream: <string>
    partition_key: <string>
    max_events: <numeric>
    timeout: <numeric>
    field_format: <string>
    interval: <string|numeric>
    cron: <string>
    debug:
      status: <boolean>
      dont_send_logs: <boolean>
```
Configuration
The following fields are used to define the target:
| Field | Required | Default | Description |
|---|---|---|---|
| name | Y | - | Target name |
| description | N | - | Optional description |
| type | Y | - | Must be amazonkinesis |
| pipelines | N | - | Optional post-processor pipelines |
| status | N | true | Enable/disable the target |
AWS Credentials
| Field | Required | Default | Description |
|---|---|---|---|
| key | N* | - | AWS access key ID for authentication |
| secret | N* | - | AWS secret access key for authentication |
| session | N | - | Optional session token for temporary credentials |
| region | Y | - | AWS region (e.g., us-east-1, eu-west-1) |
| endpoint | N | - | Custom Kinesis endpoint URL (for testing or local development) |
* = Conditionally required. AWS credentials (key and secret) are required unless using IAM role-based authentication on AWS infrastructure.
Stream Configuration
| Field | Required | Default | Description |
|---|---|---|---|
| stream | Y | - | Kinesis Data Stream name |
| partition_key | N | "default" | Partition key for distributing records across shards |
| max_events | N | 500 | Maximum number of events per batch (1-500) |
| timeout | N | 30 | Connection timeout in seconds |
| field_format | N | - | Data normalization format. See applicable Normalization section |
Amazon Kinesis Data Streams supports a maximum of 500 records per PutRecords request, with a total request payload of up to 5 MiB and each record up to 1 MiB. The max_events parameter must be between 1 and 500.
Scheduler
| Field | Required | Default | Description |
|---|---|---|---|
| interval | N | realtime | Execution frequency. See Interval for details |
| cron | N | - | Cron expression for scheduled execution. See Cron for details |
Debug Options
| Field | Required | Default | Description |
|---|---|---|---|
| debug.status | N | false | Enable debug logging |
| debug.dont_send_logs | N | false | Process logs but don't send to target (testing) |
Details
Amazon Kinesis Data Streams is a fully managed streaming data service that captures and stores data in real time. This target allows you to send log messages to Kinesis streams for processing by downstream applications.
Authentication Methods
The target supports static credentials (an access key and secret key), optionally with a session token for temporary credentials. When deployed on AWS infrastructure, it can instead use IAM role-based authentication, with no explicit credentials in the configuration.
Stream and Shard Architecture
Kinesis Data Streams uses shards as the base throughput unit. Each shard provides:
- Write capacity: 1 MB/second or 1,000 records per second
- Read capacity: 2 MB/second
Records are distributed across shards based on the partition key. A well-distributed partition key ensures even load across shards.
Partition Key Strategy
The partition_key parameter determines how records are distributed across shards:
Static Partition Key (default: "default")
- All records go to the same shard
- Simple but can create hot shards
- Suitable for low-volume streams or testing
Dynamic Partition Key
- Use different keys to distribute load
- Records with the same key go to the same shard
- Maintains ordering for records with the same key
- Better performance for high-volume streams
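The mapping from partition key to shard can be sketched in Python. Kinesis hashes the partition key with MD5 and maps the resulting 128-bit integer into the stream's hash-key range; the sketch below assumes that range is split evenly across shards (true for a freshly created stream, though resharding can change shard boundaries). The function name and shard count are illustrative, not part of the Kinesis API:

```python
import hashlib

def shard_for_key(partition_key: str, shard_count: int) -> int:
    # MD5-hash the partition key and treat the digest as a 128-bit
    # integer; each shard owns a contiguous, evenly sized slice of
    # that range (an assumption that holds before any resharding).
    hashed = int(hashlib.md5(partition_key.encode("utf-8")).hexdigest(), 16)
    range_size = 2 ** 128 // shard_count
    return min(hashed // range_size, shard_count - 1)

# A static partition key routes every record to one shard (hot-shard risk):
static_shards = {shard_for_key("default", 4) for _ in range(1000)}

# Per-source keys spread records across the available shards:
dynamic_shards = {shard_for_key(f"server-{i:02d}", 4) for i in range(100)}
```

Records sharing a key always land on the same shard, which is what preserves their ordering.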
Batch Processing
The target accumulates messages in memory and sends them in batches using the PutRecords API. Batches are sent when the event count limit (max_events) is reached or during finalization. The maximum batch size is 500 records per request (AWS Kinesis limit).
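The chunking behavior described above can be sketched as follows; batch_events is an illustrative helper, not the target's actual internals:

```python
def batch_events(events, max_events=500):
    # PutRecords accepts at most 500 records per request, so buffered
    # events are split into chunks of at most max_events before sending.
    if not 1 <= max_events <= 500:
        raise ValueError("max_events must be between 1 and 500")
    return [events[i:i + max_events] for i in range(0, len(events), max_events)]

batches = batch_events(list(range(1200)), max_events=500)
# three batches: 500, 500, and 200 records
```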
Data Retention
Kinesis Data Streams retains data for 24 hours by default, with the option to extend retention up to 365 days. Data is available for consumption by multiple applications simultaneously.
Encryption
Kinesis automatically encrypts data at rest using AWS KMS. Data in transit is encrypted using TLS. All connections to Kinesis use HTTPS endpoints.
Error Handling
A PutRecords request can partially fail: the response reports a FailedRecordCount, and each failed record carries an error code so it can be identified and retried while the rest of the batch succeeds. Common failure reasons include:
- Throttling due to exceeding shard limits
- Invalid partition key
- Record size exceeding limits (1 MB per record)
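Picking out the failed records for retry can be sketched against the PutRecords response shape: the response mirrors the request order, successful entries carry a SequenceNumber, and failed entries carry an ErrorCode instead. The helper name and the sample sequence numbers below are illustrative:

```python
def failed_records(request_records, response):
    # The PutRecords response lists one result per request record, in
    # order. Failed entries have ErrorCode/ErrorMessage instead of a
    # SequenceNumber; FailedRecordCount gives the total failures.
    if response.get("FailedRecordCount", 0) == 0:
        return []
    return [record for record, result in zip(request_records, response["Records"])
            if "ErrorCode" in result]

# Sample response (sequence numbers abbreviated) with one throttled record:
response = {
    "FailedRecordCount": 1,
    "Records": [
        {"SequenceNumber": "49590338...", "ShardId": "shardId-000000000000"},
        {"ErrorCode": "ProvisionedThroughputExceededException",
         "ErrorMessage": "Rate exceeded for shard"},
        {"SequenceNumber": "49590339...", "ShardId": "shardId-000000000001"},
    ],
}
to_retry = failed_records(["rec-a", "rec-b", "rec-c"], response)
```

Only the throttled record ("rec-b" here) needs to be resent, typically after a backoff.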
Integration with AWS Services
Kinesis Data Streams integrates with other AWS services:
- AWS Lambda for serverless processing
- Amazon Kinesis Data Firehose for delivery to data stores
- Amazon Kinesis Data Analytics for SQL-based stream processing
- Amazon CloudWatch for monitoring and alarms
Examples
Basic Configuration
The minimum configuration for a Kinesis target:
```yaml
targets:
  - name: basic_kinesis
    type: amazonkinesis
    properties:
      key: "AKIAIOSFODNN7EXAMPLE"
      secret: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
      region: "us-east-1"
      stream: "application-logs"
```
With IAM Role
Configuration using IAM role authentication (no explicit credentials):
```yaml
targets:
  - name: iam_kinesis
    type: amazonkinesis
    properties:
      region: "us-east-1"
      stream: "application-logs"
```
When using IAM role authentication, ensure the EC2 instance, ECS task, or Lambda function has an IAM role with appropriate Kinesis permissions attached.
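A minimal IAM policy for that role might look like the following. The exact set of actions depends on how the target interacts with the stream; kinesis:PutRecords is the minimum needed to write, and the describe/list actions are commonly required for stream discovery. The account ID and stream ARN are placeholders:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "kinesis:PutRecord",
        "kinesis:PutRecords",
        "kinesis:DescribeStreamSummary",
        "kinesis:ListShards"
      ],
      "Resource": "arn:aws:kinesis:us-east-1:123456789012:stream/application-logs"
    }
  ]
}
```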
With Custom Partition Key
Configuration with a custom partition key for better distribution:
```yaml
targets:
  - name: distributed_kinesis
    type: amazonkinesis
    properties:
      key: "AKIAIOSFODNN7EXAMPLE"
      secret: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
      region: "us-east-1"
      stream: "distributed-logs"
      partition_key: "server-01"
```
High Throughput
Configuration optimized for high-volume data:
```yaml
targets:
  - name: high_volume_kinesis
    type: amazonkinesis
    properties:
      key: "AKIAIOSFODNN7EXAMPLE"
      secret: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
      region: "us-east-1"
      stream: "high-volume-logs"
      partition_key: "load-balanced"
      max_events: 500
      timeout: 60
```
With Temporary Credentials
Configuration using temporary session credentials:
```yaml
targets:
  - name: temp_creds_kinesis
    type: amazonkinesis
    properties:
      key: "ASIATEMP1234567890AB"
      secret: "tempSecretKeyExample1234567890"
      session: "FwoGZXIvYXdzEBYaDH...temporary-session-token"
      region: "us-west-2"
      stream: "temporary-logs"
```
With Field Normalization
Using field normalization for standard format:
```yaml
targets:
  - name: normalized_kinesis
    type: amazonkinesis
    properties:
      key: "AKIAIOSFODNN7EXAMPLE"
      secret: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
      region: "us-east-1"
      stream: "normalized-logs"
      field_format: "cim"
```
With Checkpoint Pipeline
Configuration with checkpoint pipeline for reliability:
```yaml
targets:
  - name: reliable_kinesis
    type: amazonkinesis
    pipelines:
      - checkpoint
    properties:
      key: "AKIAIOSFODNN7EXAMPLE"
      secret: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
      region: "us-east-1"
      stream: "critical-logs"
      max_events: 100
```
Multiple Regions
Configuration for a Kinesis stream in a different region:
```yaml
targets:
  - name: eu_kinesis
    type: amazonkinesis
    properties:
      key: "AKIAIOSFODNN7EXAMPLE"
      secret: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
      region: "eu-west-1"
      stream: "eu-application-logs"
      partition_key: "eu-server"
```
Scheduled Batching
Configuration with scheduled batch delivery:
```yaml
targets:
  - name: scheduled_kinesis
    type: amazonkinesis
    properties:
      key: "AKIAIOSFODNN7EXAMPLE"
      secret: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
      region: "us-east-1"
      stream: "scheduled-logs"
      max_events: 500
      interval: "5m"
```
Debug Configuration
Configuration with debugging enabled:
```yaml
targets:
  - name: debug_kinesis
    type: amazonkinesis
    properties:
      key: "AKIAIOSFODNN7EXAMPLE"
      secret: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
      region: "us-east-1"
      stream: "test-logs"
      debug:
        status: true
        dont_send_logs: true
```
Local Development
Configuration with custom endpoint for local testing (e.g., LocalStack):
```yaml
targets:
  - name: local_kinesis
    type: amazonkinesis
    properties:
      key: "test"
      secret: "test"
      region: "us-east-1"
      endpoint: "http://localhost:4566"
      stream: "local-test-stream"
```
Production Configuration
Configuration for production with optimal settings:
```yaml
targets:
  - name: production_kinesis
    type: amazonkinesis
    pipelines:
      - checkpoint
    properties:
      key: "AKIAIOSFODNN7EXAMPLE"
      secret: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
      region: "us-east-1"
      stream: "production-logs"
      partition_key: "prod-cluster-01"
      max_events: 500
      timeout: 60
      field_format: "cim"
```