AWS Kinesis Agent configuration and setup for Data Streaming to the Cloud
Abstract
Amazon Kinesis Agent is a tool that helps you collect, process, and transfer data in real time to AWS services such as Amazon Kinesis Data Streams, Amazon Kinesis Data Firehose, and Amazon CloudWatch. In this guide, we’ll walk through the steps to set up and configure the AWS Kinesis Agent on Ubuntu.
Target Infrastructure
Install kinesis agent
Depending on your operating system, there are two paths to consider: a shorter, automated setup and a longer one that involves building the agent manually. If your operating system provides a pre-packaged version of the Kinesis agent in its repository, you can install it with the package manager. Otherwise, you’ll need to build it from source.
On CentOS-family systems (the output below comes from Amazon Linux 2), installation is straightforward:
[cloudshell-user@ip-xx-xx-xx-xx ~]$ sudo yum install aws-kinesis-agent
Installed:
aws-kinesis-agent.noarch 0:2.0.8-1.amzn2
Dependency Installed:
alsa-lib.x86_64 0:1.1.4.1-2.amzn2 atk.x86_64 0:2.22.0-3.amzn2.0.2 avahi-libs.x86_64 0:0.6.31-20.amzn2.0.2
cairo.x86_64 0:1.15.12-4.amzn2 copy-jdk-configs.noarch 0:3.3-10.amzn2 cups-libs.x86_64 1:1.6.3-51.amzn2.0.3
dejavu-fonts-common.noarch 0:2.33-6.amzn2 dejavu-sans-fonts.noarch 0:2.33-6.amzn2 fontconfig.x86_64 0:2.13.0-4.3.amzn2
fontpackages-filesystem.noarch 0:1.44-8.amzn2 freetype.x86_64 0:2.8-14.amzn2.1.1 fribidi.x86_64 0:1.0.2-1.amzn2.1.2
gdk-pixbuf2.x86_64 0:2.36.12-3.amzn2 giflib.x86_64 0:4.1.6-9.amzn2.0.2 graphite2.x86_64 0:1.3.10-1.amzn2.0.2
gtk-update-icon-cache.x86_64 0:3.22.30-3.amzn2 gtk2.x86_64 0:2.24.31-1.amzn2.0.2 harfbuzz.x86_64 0:1.7.5-2.amzn2
hicolor-icon-theme.noarch 0:0.12-7.amzn2 hwdata.x86_64 0:0.252-9.3.amzn2 jasper-libs.x86_64 0:1.900.1-33.amzn2.0.1
java-1.8.0-openjdk.x86_64 1:1.8.0.382.b05-1.amzn2.0.1 java-1.8.0-openjdk-headless.x86_64 1:1.8.0.382.b05-1.amzn2.0.1 javapackages-tools.noarch 0:3.4.1-11.amzn2
jbigkit-libs.x86_64 0:2.0-11.amzn2.0.2 libICE.x86_64 0:1.0.9-9.amzn2.0.2 libSM.x86_64 0:1.2.2-2.amzn2.0.2
libX11.x86_64 0:1.6.7-3.amzn2.0.3 libX11-common.noarch 0:1.6.7-3.amzn2.0.3 libXau.x86_64 0:1.0.8-2.1.amzn2.0.2
libXcomposite.x86_64 0:0.4.4-4.1.amzn2.0.2 libXcursor.x86_64 0:1.1.15-1.amzn2 libXdamage.x86_64 0:1.1.4-4.1.amzn2.0.2
libXext.x86_64 0:1.3.3-3.amzn2.0.2 libXfixes.x86_64 0:5.0.3-1.amzn2.0.2 libXft.x86_64 0:2.3.2-2.amzn2.0.2
libXi.x86_64 0:1.7.9-1.amzn2.0.2 libXinerama.x86_64 0:1.1.3-2.1.amzn2.0.2 libXrandr.x86_64 0:1.5.1-2.amzn2.0.3
libXrender.x86_64 0:0.9.10-1.amzn2.0.2 libXtst.x86_64 0:1.2.3-1.amzn2.0.2 libXxf86vm.x86_64 0:1.1.4-1.amzn2.0.2
libdrm.x86_64 0:2.4.97-2.amzn2 libfontenc.x86_64 0:1.1.3-3.amzn2.0.2 libglvnd.x86_64 1:1.0.1-0.1.git5baa1e5.amzn2.0.1
libglvnd-egl.x86_64 1:1.0.1-0.1.git5baa1e5.amzn2.0.1 libglvnd-glx.x86_64 1:1.0.1-0.1.git5baa1e5.amzn2.0.1 libjpeg-turbo.x86_64 0:2.0.90-2.amzn2.0.6
libpciaccess.x86_64 0:0.14-1.amzn2 libpng.x86_64 2:1.5.13-8.amzn2.0.5 libthai.x86_64 0:0.1.14-9.amzn2.0.2
libtiff.x86_64 0:4.0.3-35.amzn2.0.14 libwayland-client.x86_64 0:1.17.0-1.amzn2.0.1 libwayland-server.x86_64 0:1.17.0-1.amzn2.0.1
libxcb.x86_64 0:1.12-1.amzn2.0.2 libxshmfence.x86_64 0:1.2-1.amzn2.0.2 libxslt.x86_64 0:1.1.28-6.amzn2
lksctp-tools.x86_64 0:1.0.17-2.amzn2.0.2 log4j-cve-2021-44228-hotpatch.noarch 0:1.3-7.amzn2 mesa-libEGL.x86_64 0:18.3.4-5.amzn2.0.1
mesa-libGL.x86_64 0:18.3.4-5.amzn2.0.1 mesa-libgbm.x86_64 0:18.3.4-5.amzn2.0.1 mesa-libglapi.x86_64 0:18.3.4-5.amzn2.0.1
pango.x86_64 0:1.42.4-4.amzn2 pcsc-lite-libs.x86_64 0:1.8.8-7.amzn2 pixman.x86_64 0:0.34.0-1.amzn2.0.2
pkgconfig.x86_64 1:0.27.1-4.amzn2.0.2 python-javapackages.noarch 0:3.4.1-11.amzn2 python-lxml.x86_64 0:3.2.1-4.amzn2.0.4
ttmkfdir.x86_64 0:3.0.9-42.amzn2.0.2 tzdata-java.noarch 0:2023c-1.amzn2.0.1 xorg-x11-font-utils.x86_64 1:7.5-21.amzn2
xorg-x11-fonts-Type1.noarch 0:7.5-9.amzn2
Complete!
If you’re using Ubuntu, we’ll need to take an additional step: building the aws-kinesis-agent manually and then installing it. Please follow these instructions:
sudo apt install git
git clone https://github.com/awslabs/amazon-kinesis-agent.git
# the setup script must be run from inside the cloned repository
cd amazon-kinesis-agent
sudo ./setup --install
BUILD SUCCESSFUL
Total time: 44 seconds
Configuration file installed at: /etc/aws-kinesis/agent.json
Configuration details:
{
"cloudwatch.emitMetrics": true,
"kinesis.endpoint": "",
"firehose.endpoint": "",
"flows": [
{
"filePattern": "/tmp/app.log*",
"kinesisStream": "yourkinesisstream",
"partitionKeyOption": "RANDOM"
},
{
"filePattern": "/tmp/app.log*",
"deliveryStream": "yourdeliverystream"
}
]
}
Amazon Kinesis Agent is installed successfully.
Your installation has completed!
aws-kinesis-agent log file will be found at: /var/log/aws-kinesis-agent
Control aws-kinesis-agent service:
sudo service aws-kinesis-agent [start|stop|restart|status]
To make the agent automatically start at system startup, type:
sudo chkconfig aws-kinesis-agent on
Prepare Logs for ingestion
sudo mkdir /var/log/kinesis
$ cd /etc/aws-kinesis/
$ ls -l
total 5
drwxr-xr-x 2 root root 1024 Sep 6 2022 agent.d
-rw-r--r-- 1 root root 338 Sep 6 2022 agent.json
-rw-r--r-- 1 root root 2160 Sep 6 2022 log4j.xml
sudo nano agent.json
We will use aws-kinesis-agent with Kinesis Firehose rather than a Kinesis Data Stream, although the same agent can work with multiple sources and sinks. The agent can also emit Amazon CloudWatch metrics for monitoring and troubleshooting the streaming process; for this setup we will set cloudwatch.emitMetrics to false. Update the configuration to the following:
{
"cloudwatch.emitMetrics": false,
"kinesis.endpoint": "",
"firehose.endpoint": "firehose.eu-west-1.amazonaws.com",
"awsAccessKeyId": "ADD_KEY_HERE",
"awsSecretAccessKey": "ADD_SECRET_HERE",
"flows": [
{
"filePattern": "/var/log/kinesis/*.log*",
"deliveryStream": "terraform-kinesis-firehose-logs-s3-stream"
}
]
}
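Before restarting the agent, it can help to sanity-check the edited file. The following is a minimal sketch, not part of the agent itself: the function name `check_agent_config` is hypothetical, and the rule that each flow names exactly one of `kinesisStream` or `deliveryStream` is an assumption drawn from the sample configurations in this post.

```python
import json


def check_agent_config(text):
    """Return a list of problems found in an agent.json document (empty if none)."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(config, dict):
        return ["top-level value must be a JSON object"]

    problems = []
    flows = config.get("flows") or []
    if not flows:
        problems.append("no flows defined")
    for index, flow in enumerate(flows):
        if "filePattern" not in flow:
            problems.append(f"flow {index}: missing filePattern")
        # Assumption: a flow targets either a data stream or a delivery stream.
        targets = [key for key in ("kinesisStream", "deliveryStream") if key in flow]
        if len(targets) != 1:
            problems.append(f"flow {index}: expected one destination, found {len(targets)}")
    return problems


# The configuration from this post, with placeholder credentials.
SAMPLE_CONFIG = """
{
  "cloudwatch.emitMetrics": false,
  "kinesis.endpoint": "",
  "firehose.endpoint": "firehose.eu-west-1.amazonaws.com",
  "awsAccessKeyId": "ADD_KEY_HERE",
  "awsSecretAccessKey": "ADD_SECRET_HERE",
  "flows": [
    {
      "filePattern": "/var/log/kinesis/*.log*",
      "deliveryStream": "terraform-kinesis-firehose-logs-s3-stream"
    }
  ]
}
"""

print(check_agent_config(SAMPLE_CONFIG))  # prints []
```

To check the real file, read /etc/aws-kinesis/agent.json and pass its contents to the same function.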
Update the following parameters in configuration:
Param | Description |
---|---|
firehose.endpoint | Firehose endpoint for the region where the delivery stream is provisioned; the hostname starts with firehose. (here firehose.eu-west-1.amazonaws.com) |
filePattern | Location of the logs you want to monitor and push to the cloud |
deliveryStream | Name of the Firehose delivery stream as shown in the AWS console (here terraform-kinesis-firehose-logs-s3-stream) |
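The endpoint value follows the region of the delivery stream. As a small illustration, the sketch below derives the hostname from a region name. The helper `firehose_endpoint` is hypothetical, and the pattern is assumed to hold only for the standard `amazonaws.com` domain used in this post; other AWS partitions (for example the China regions) use a different domain.

```python
import re

# Loose shape check for region names such as eu-west-1 or ap-southeast-2.
REGION_PATTERN = re.compile(r"[a-z]{2}(-[a-z]+)+-\d+")


def firehose_endpoint(region):
    """Build the Firehose hostname for a region in the standard AWS partition."""
    if not REGION_PATTERN.fullmatch(region):
        raise ValueError(f"does not look like an AWS region name: {region!r}")
    return f"firehose.{region}.amazonaws.com"


print(firehose_endpoint("eu-west-1"))  # prints firehose.eu-west-1.amazonaws.com
```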
If you are running aws-kinesis-agent on an EC2 instance in AWS, attach an IAM role to the instance instead of putting credentials in the configuration file. If you are running it in an on-premises data center, provide awsAccessKeyId and awsSecretAccessKey in the aws-kinesis-agent JSON config.
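The difference between the two cases is only whether the credential keys appear in the file. A minimal sketch of generating both variants follows; `build_agent_config` is a hypothetical helper, and the key names are the ones used in the configuration shown in this post.

```python
import json


def build_agent_config(endpoint, file_pattern, delivery_stream,
                       access_key=None, secret_key=None):
    """Render an agent.json document; credentials are included only if both are given."""
    config = {
        "cloudwatch.emitMetrics": False,
        "firehose.endpoint": endpoint,
        "flows": [
            {"filePattern": file_pattern, "deliveryStream": delivery_stream}
        ],
    }
    # On EC2 with an attached IAM role, leave both arguments unset.
    if access_key and secret_key:
        config["awsAccessKeyId"] = access_key
        config["awsSecretAccessKey"] = secret_key
    return json.dumps(config, indent=2)


# EC2 variant: no credentials written to disk.
print(build_agent_config(
    "firehose.eu-west-1.amazonaws.com",
    "/var/log/kinesis/*.log*",
    "terraform-kinesis-firehose-logs-s3-stream",
))
```

For an on-premises host, pass `access_key` and `secret_key` as well, and restrict read access to the resulting file since it then contains secrets.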
Now aws-kinesis-agent monitors the logs folder, and once new records are appended it ingests them into the cloud.
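To see this in action, you can append a few test records to a file that matches the configured filePattern. The sketch below is only an illustration: `append_test_records` and the record layout are made up for this example, and it writes to a temporary directory so it runs without root. Point it at /var/log/kinesis to feed the agent; this assumes the agent's default behavior of treating each line as one record.

```python
import datetime
import pathlib
import tempfile


def append_test_records(log_dir, count=5, name="app.log"):
    """Append `count` timestamped lines to log_dir/name and return the file path."""
    path = pathlib.Path(log_dir) / name
    with path.open("a", encoding="utf-8") as handle:
        for index in range(count):
            stamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
            handle.write(f"{stamp} INFO test record {index}\n")
    return path


# Demo against a throwaway directory instead of /var/log/kinesis.
demo_dir = tempfile.mkdtemp()
log_path = append_test_records(demo_dir, count=3)
print(log_path.read_text(encoding="utf-8"))
```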
Checking the results of aws-kinesis-agent
With the AWS Kinesis Agent installed and configured, you’re ready to start streaming data into your AWS environment. Once new records are appended to the logs, the agent starts processing them, and you can track the activity in the agent’s log file:
tail -f /var/log/aws-kinesis-agent/aws-kinesis-agent.log
2023-10-18 16:40:38.902+0000 (Agent.MetricsEmitter RUNNING) com.amazon.kinesis.streaming.agent.Agent [INFO] Agent: Progress: 30000 records parsed (2519250 bytes), and 30000 records sent successfully to destinations. Uptime: 13200218ms
2023-10-18 16:41:08.889+0000 (FileTailer[fh:terraform-kinesis-firehose-logs-s3-stream:/var/log/data/*.log].MetricsEmitter RUNNING) com.amazon.kinesis.streaming.agent.tailing.FileTailer [INFO] FileTailer[fh:terraform-kinesis-firehose-logs-s3-stream:/var/log/data/*.log]: Tailer Progress: Tailer has parsed 30000 records (2519250 bytes), transformed 0 records, skipped 0 records, and has successfully sent 30000 records to destination.
2023-10-18 16:41:08.902+0000 (Agent.MetricsEmitter RUNNING) com.amazon.kinesis.streaming.agent.Agent [INFO] Agent: Progress: 30000 records parsed (2519250 bytes), and 30000 records sent successfully to destinations. Uptime: 13230218ms
2023-10-18 16:41:38.889+0000 (FileTailer[fh:terraform-kinesis-firehose-logs-s3-stream:/var/log/data/*.log].MetricsEmitter RUNNING) com.amazon.kinesis.streaming.agent.tailing.FileTailer [INFO] FileTailer[fh:terraform-kinesis-firehose-logs-s3-stream:/var/log/data/*.log]: Tailer Progress: Tailer has parsed 30000 records (2519250 bytes), transformed 0 records, skipped 0 records, and has successfully sent 30000 records to destination.
2023-10-18 16:41:38.902+0000 (Agent.MetricsEmitter RUNNING) com.amazon.kinesis.streaming.agent.Agent [INFO] Agent: Progress: 30000 records parsed (2519250 bytes), and 30000 records sent successfully to destinations. Uptime: 13260218ms
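The Agent progress lines can also be checked programmatically, for example to alert when parsed and sent counts drift apart. This is a rough sketch that assumes the line format stays exactly as in the output above; `parse_progress` and the `unsent` field are names invented for this example.

```python
import re

# Matches the "Agent: Progress" lines as they appear in the log excerpt above.
PROGRESS = re.compile(
    r"Agent: Progress: (\d+) records parsed \((\d+) bytes\), "
    r"and (\d+) records sent successfully"
)


def parse_progress(line):
    """Extract counters from an Agent progress line, or return None if it does not match."""
    match = PROGRESS.search(line)
    if match is None:
        return None
    parsed, size, sent = (int(value) for value in match.groups())
    return {"parsed": parsed, "bytes": size, "sent": sent, "unsent": parsed - sent}


SAMPLE_LINE = (
    "2023-10-18 16:40:38.902+0000 (Agent.MetricsEmitter RUNNING) "
    "com.amazon.kinesis.streaming.agent.Agent [INFO] Agent: Progress: "
    "30000 records parsed (2519250 bytes), and 30000 records sent "
    "successfully to destinations. Uptime: 13200218ms"
)

print(parse_progress(SAMPLE_LINE))
```

A non-zero `unsent` value that keeps growing between consecutive progress lines would suggest the agent is reading records but failing to deliver them.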
Conclusion
The agent is a Java-based application for collecting data and sending it to Kinesis Data Streams and Kinesis Data Firehose. It monitors the specified files and delivers new data to your stream as it arrives. Built-in handling of file rotation, checkpointing, and automatic retries on failure helps keep delivery reliable.
Because kinesis-agent is built in Java, it’s important to allocate sufficient heap memory for the JVM. Before running the agent, make sure your machine has enough free resources and review the configuration accordingly.
In the next post, Connecting Kinesis Firehose DataStream with aws-kinesis-agent to ingest Log Data into AWS S3, we will configure the cloud side of Kinesis Firehose to receive data from kinesis-agent and persist it to S3.
References (Links)
- https://docs.aws.amazon.com/streams/latest/dev/writing-with-agents.html
- https://docs.aws.amazon.com/streams/latest/dev/kinesis-tutorial-cli-installation.html
- Connecting Kinesis Firehose DataStream with aws-kinesis-agent to ingest Log Data into AWS S3