How can I put data records into a Kinesis data stream using the KPL?


I want to write and put data records into a Kinesis data stream using the Amazon Kinesis Producer Library (KPL). How can I do this?

Short description

To put a record into a Kinesis data stream using the KPL, you must meet the following requirements:

  • You have a running Amazon Elastic Compute Cloud (Amazon EC2) Linux instance.
  • An AWS Identity and Access Management (IAM) role is attached to your instance.
  • The AmazonKinesisFullAccess managed policy is attached to the instance's IAM role.
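
To check that the instance is actually using the attached role before you continue, you can inspect the caller identity. The following is a minimal sketch and not part of the original procedure: role_name_from_arn is a hypothetical helper name, and it assumes that the identity ARN has the usual assumed-role form.

```shell
# Hypothetical helper: extract the IAM role name from an assumed-role ARN.
# Returns a non-zero status if the ARN does not describe an assumed role
# (for example, when the CLI is using IAM user credentials instead).
role_name_from_arn() {
  case "$1" in
    arn:*:sts::*:assumed-role/*/*)
      # Drop everything up to "assumed-role/", then keep the first path segment.
      local rest="${1#*:assumed-role/}"
      printf '%s\n' "${rest%%/*}"
      ;;
    *)
      return 1
      ;;
  esac
}
```

On the instance, you could feed it the ARN reported by the AWS CLI:

role_name_from_arn "$(aws sts get-caller-identity --query Arn --output text)"

If the command prints the name of the role that you attached, the instance profile is in effect.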

Resolution

To put records into a Kinesis data stream using the KPL:

1.    Connect to your Linux instance.

2.    Install the latest version of the OpenJDK 8 developer package:

sudo yum install java-1.8.0-openjdk-devel

3.    Confirm that Java is installed:

java -version

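The output looks similar to the following. In this example, the default provider is still Java 1.7, which the next step changes to Java 1.8: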

java version "1.7.0_181"
OpenJDK Runtime Environment (amzn-2.6.14.8.80.amzn1-x86_64 u181-b00)
OpenJDK 64-Bit Server VM (build 24.181-b00, mixed mode)

4.    Run the following commands, and at each prompt select the Java 1.8 entry to make it the default java and javac provider:

sudo /usr/sbin/alternatives --config java 
sudo /usr/sbin/alternatives --config javac
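
To confirm without reading the banner by eye that the default is now Java 8, a small helper can parse the version string. This is a sketch only: java_major_version is a hypothetical name, and it assumes that the banner contains a quoted version string as in the output shown earlier.

```shell
# Hypothetical helper: extract the Java major version from a `java -version` banner.
# Handles the legacy "1.x.y" scheme (1.8.0_181 -> 8) and the newer "x.y.z" scheme (11.0.2 -> 11).
java_major_version() {
  local banner="$1" version
  # Pull the first quoted version string out of the banner, e.g. 1.8.0_181.
  version=$(printf '%s\n' "$banner" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
  case "$version" in
    "")
      # No quoted version string found.
      return 1
      ;;
    1.*)
      # Legacy scheme: the major version is the second component.
      version=${version#1.}
      printf '%s\n' "${version%%.*}"
      ;;
    *)
      # Newer scheme: the major version is the first component.
      printf '%s\n' "${version%%[._-]*}"
      ;;
  esac
}
```

Because java -version writes to standard error, redirect it when you call the helper:

java_major_version "$(java -version 2>&1)"

After step 4, the command should print 8.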

5.    Add a repository with an Apache Maven package:

sudo wget http://repos.fedorapeople.org/repos/dchen/apache-maven/epel-apache-maven.repo -O /etc/yum.repos.d/epel-apache-maven.repo

6.    Set the version number for the Maven packages:

sudo sed -i 's/\$releasever/6/g' /etc/yum.repos.d/epel-apache-maven.repo

7.    Use yum to install Maven:

sudo yum install -y apache-maven

8.    Confirm that Maven is installed properly:

mvn -version

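The output looks similar to the following. The Java version and Java home lines reflect the Java installation that Maven uses on your instance: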

Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z)
Maven home: /usr/share/apache-maven
Java version: 1.7.0_181, vendor: Oracle Corporation
Java home: /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.181.x86_64/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "4.14.33-51.37.amzn1.x86_64", arch: "amd64", family: "unix"

9.    Install git, and then download the KPL from Amazon Web Services - Labs:

sudo yum install git
git clone https://github.com/awslabs/amazon-kinesis-producer

10.    Open the amazon-kinesis-producer/java/amazon-kinesis-producer-sample/ directory, and then list the files:

cd amazon-kinesis-producer/java/amazon-kinesis-producer-sample/
ls
default_config.properties  pom.xml  README.md  src  target

11.    Run a command similar to the following to create a Kinesis data stream:

aws kinesis create-stream --stream-name kinesis-kpl-demo --shard-count 2

For more information about the number of shards needed, see Resharding, scaling, and parallel processing.
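
As a rough starting point for the shard count, you can work back from the expected write load. The sketch below assumes the per-shard write limits of 1 MiB per second and 1,000 records per second. estimate_shards is a hypothetical helper, and the result is a lower bound that ignores read throughput and uneven partition keys.

```shell
# Hypothetical helper: estimate the minimum shard count from expected write load.
# Arguments: peak ingest in KiB per second, peak records per second.
# Assumes per-shard write limits of 1 MiB/s and 1,000 records/s.
estimate_shards() {
  local kib_per_sec="$1" records_per_sec="$2" by_bytes by_records shards
  by_bytes=$(( (kib_per_sec + 1023) / 1024 ))       # ceil(KiB/s / 1024)
  by_records=$(( (records_per_sec + 999) / 1000 ))  # ceil(records/s / 1000)
  # The tighter of the two limits decides the shard count.
  shards=$(( by_bytes > by_records ? by_bytes : by_records ))
  # A stream needs at least one shard.
  if [ "$shards" -lt 1 ]; then
    shards=1
  fi
  printf '%s\n' "$shards"
}
```

For example, estimate_shards 1500 500 prints 2, which matches the --shard-count value used in the create-stream command above.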

12.    Run list-streams to confirm that the stream was created:

aws kinesis list-streams
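
A stream that appears in the list-streams output might still be in the CREATING state, and the producer can't write to it until it's ACTIVE. The following sketch polls the stream status. wait_for_stream_active is a hypothetical helper name, and the attempt count and delay defaults are arbitrary.

```shell
# Hypothetical helper: poll until the stream reports ACTIVE.
# Arguments: stream name, max attempts (default 30), seconds between polls (default 5).
wait_for_stream_active() {
  local stream="$1" attempts="${2:-30}" delay="${3:-5}" status="" i=0
  while [ "$i" -lt "$attempts" ]; do
    # Ask only for the status field; stop if the CLI call itself fails.
    status=$(aws kinesis describe-stream-summary \
      --stream-name "$stream" \
      --query 'StreamDescriptionSummary.StreamStatus' \
      --output text) || return 1
    if [ "$status" = "ACTIVE" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "Stream $stream is not ACTIVE after $attempts attempts (last status: $status)" >&2
  return 1
}
```

For the stream created in step 11, you could run:

wait_for_stream_active kinesis-kpl-demo

The AWS CLI also provides a built-in waiter, aws kinesis wait stream-exists --stream-name kinesis-kpl-demo, which serves the same purpose without a custom loop.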

13.    Open the SampleProducerConfig.java file in your cloned copy of the repository, and then modify the following fields:
  • For public static final String STREAM_NAME_DEFAULT, enter the name of the Kinesis data stream that you previously created.
  • For public static final String REGION_DEFAULT, enter the Region that you're using.

Example:

cd src/com/amazonaws/services/kinesis/producer/sample
vi SampleProducerConfig.java

public static final String STREAM_NAME_DEFAULT = "kinesis-kpl-demo";
public static final String REGION_DEFAULT = "us-east-1";
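
If you prefer to script this edit instead of using vi, the two constants can be rewritten with sed. This is a sketch under assumptions: set_string_constant is a hypothetical helper, it relies on GNU sed (as found on Amazon Linux), and it expects each constant to be declared on a single line as in the example above. Values that contain the characters | or & need escaping first.

```shell
# Hypothetical helper: set a single-line `String NAME = "...";` constant in a Java source file.
# Arguments: file path, constant name, new value.
set_string_constant() {
  local file="$1" name="$2" value="$3"
  # Replace only the quoted value; the rest of the declaration is left untouched.
  sed -i "s|\(String ${name} = \)\"[^\"]*\"|\1\"${value}\"|" "$file"
}
```

From the directory that contains the file, you could then run:

set_string_constant SampleProducerConfig.java STREAM_NAME_DEFAULT kinesis-kpl-demo
set_string_constant SampleProducerConfig.java REGION_DEFAULT us-east-1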

14.    Run the following command in the amazon-kinesis-producer-sample directory to download the sample's dependencies and build it with Maven:

mvn clean package

15.    Run the following command in the amazon-kinesis-producer-sample directory to run the producer and send data to the Kinesis data stream:

mvn exec:java -Dexec.mainClass="com.amazonaws.services.kinesis.producer.sample.SampleProducer"

16.    Check the Incoming Data (Count) graph on the Monitoring tab of the Kinesis console to verify the number of records sent to the stream.

Note: The record count in the graph might be lower than the number of records that the producer sent. This happens because the KPL uses aggregation, which packs multiple user records into a single Kinesis Data Streams record.
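
The same figure can be read from the command line through Amazon CloudWatch. The following is a sketch: incoming_records_sum is a hypothetical helper, it relies on GNU date (as found on Amazon Linux), and it assumes that the instance role is also allowed to call CloudWatch, which the Kinesis policy on its own might not grant.

```shell
# Hypothetical helper: total IncomingRecords for a stream over the last N minutes.
# Arguments: stream name, look-back window in minutes (default 15).
incoming_records_sum() {
  local stream="$1" minutes="${2:-15}" start end
  # Build UTC timestamps for the query window (GNU date syntax).
  start=$(date -u -d "${minutes} minutes ago" +%Y-%m-%dT%H:%M:%SZ)
  end=$(date -u +%Y-%m-%dT%H:%M:%SZ)
  # Fetch one-minute sums and add them up on the client side.
  aws cloudwatch get-metric-statistics \
    --namespace AWS/Kinesis \
    --metric-name IncomingRecords \
    --dimensions Name=StreamName,Value="$stream" \
    --start-time "$start" \
    --end-time "$end" \
    --period 60 \
    --statistics Sum \
    --query 'sum(Datapoints[].Sum)' \
    --output text
}
```

For example:

incoming_records_sum kinesis-kpl-demo 15

As with the console graph, the result counts Kinesis Data Streams records, so with aggregation it can be lower than the number of user records that the producer sent.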


Related information

Writing to your Kinesis Data Stream using the KPL

AWS OFFICIAL
Updated 3 years ago