Apache Kafka

Install Kafka:

Download the Kafka tar file.
Extract it to a location, say /usr/local/kafka_2.11-0.8.2.2.
Set the following variables in .bashrc:

###Kafka

export KAFKA_HOME=/usr/local/kafka_2.11-0.8.2.2

export PATH=$PATH:$KAFKA_HOME/bin

###

With Kafka, we can create multiple types of clusters, such as the following:

A single node – single broker cluster
A single node – multiple broker cluster
Multiple nodes – multiple broker cluster

A single node – a single broker cluster

· Starting the ZooKeeper server:

>bin/zookeeper-server-start.sh config/zookeeper.properties

· Starting the Kafka broker:

>bin/kafka-server-start.sh config/server.properties

· Creating a Kafka topic:

>kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic kafkatopic

· Get list of topics:

>kafka-topics.sh --list --zookeeper localhost:2181

· Start console-based producer

>kafka-console-producer.sh --broker-list localhost:9092 --topic kafkatopic

Type the following messages:

Welcome to Kafka DS

This is single node single broker cluster

Just started !! Jai Ganesh

· Start command line consumer client

>kafka-console-consumer.sh --zookeeper localhost:2181 --topic kafkatopic --from-beginning

Output:

Welcome to Kafka DS

This is single node single broker cluster

Just started !! Jai Ganesh

A single node – multiple broker clusters

· Starting the ZooKeeper server:

>bin/zookeeper-server-start.sh config/zookeeper.properties

· Starting the Kafka broker:

To set up multiple brokers on a single node, each broker needs its own server properties file. Each file must define unique values for the following properties: broker.id, port, log.dir

>bin/kafka-server-start.sh config/server-1.properties

>bin/kafka-server-start.sh config/server-2.properties
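One way the per-broker files might be produced is sketched below: a small shell function copies a base properties file and overrides the three properties named above. The function name make_broker_config, the ports 9093 and 9094 (chosen to match the producer command later in this post), and the /tmp/kafka-logs-N directories are assumptions for illustration, not values shipped with Kafka.

```shell
#!/bin/sh
# Hypothetical helper: derive a per-broker properties file from a base file.
# Usage: make_broker_config BASE_FILE OUT_FILE BROKER_ID PORT LOG_DIR
make_broker_config() {
    base=$1; out=$2; id=$3; port=$4; logdir=$5
    # Drop any existing definitions of the three properties (log.dirs too,
    # since it would take precedence over log.dir), then append our values.
    sed -E '/^(broker\.id|port|log\.dirs?)=/d' "$base" > "$out"
    {
        echo "broker.id=$id"
        echo "port=$port"
        echo "log.dir=$logdir"
    } >> "$out"
}

# Example calls (paths and values are assumptions):
# make_broker_config config/server.properties config/server-1.properties 1 9093 /tmp/kafka-logs-1
# make_broker_config config/server.properties config/server-2.properties 2 9094 /tmp/kafka-logs-2
```

All other settings, such as zookeeper.connect, are carried over unchanged from the base file, so both brokers register with the same ZooKeeper.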

· Creating a Kafka topic using the command line

>kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 2 --partitions 4 --topic replicated-kafkatopic

Note: The replication factor must not exceed the number of running brokers. Otherwise, topic creation fails with an exception like the following:

kafka.admin.AdminOperationException: replication factor: 3 larger than available brokers: 2

at kafka.admin.AdminUtils$.assignReplicasToBrokers(AdminUtils.scala:70)

at kafka.admin.AdminUtils$.createTopic(AdminUtils.scala:171)

at kafka.admin.TopicCommand$.createTopic(TopicCommand.scala:93)

at kafka.admin.TopicCommand$.main(TopicCommand.scala:55)

at kafka.admin.TopicCommand.main(TopicCommand.scala)
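To avoid hitting this exception, a script could compare the two numbers before calling kafka-topics.sh. The following is a minimal sketch; the function name check_replication_factor is hypothetical, and the caller is assumed to know how many brokers are running.

```shell
#!/bin/sh
# Hypothetical pre-check: succeed only if the replication factor
# does not exceed the number of available brokers.
# Usage: check_replication_factor REPLICATION_FACTOR BROKER_COUNT
check_replication_factor() {
    rf=$1; brokers=$2
    if [ "$rf" -gt "$brokers" ]; then
        echo "replication factor: $rf larger than available brokers: $brokers" >&2
        return 1
    fi
    return 0
}

# Example: create the topic only when the check passes.
# check_replication_factor 2 2 && kafka-topics.sh --create --zookeeper localhost:2181 \
#     --replication-factor 2 --partitions 4 --topic replicated-kafkatopic
```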

· Starting a producer to send messages

>kafka-console-producer.sh --broker-list localhost:9093,localhost:9094 --topic replicated-kafkatopic

If multiple producers need to connect to different combinations of brokers, specify the broker list for each producer, as in the command above.
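The broker list has to reach the producer as a single comma-separated argument, so it should contain no spaces. A small helper can build it from a host and a set of ports; the name broker_list is hypothetical and the sketch assumes all brokers run on the same host.

```shell
#!/bin/sh
# Hypothetical helper: join host:port pairs into one comma-separated
# string with no spaces, suitable for --broker-list.
# Usage: broker_list HOST PORT [PORT...]
broker_list() {
    host=$1; shift
    list=""
    for port in "$@"; do
        # Add a comma only when the list already has an entry.
        list="${list:+$list,}$host:$port"
    done
    printf '%s\n' "$list"
}

# Example:
# kafka-console-producer.sh --broker-list "$(broker_list localhost 9093 9094)" --topic replicated-kafkatopic
```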

· Starting a consumer to consume messages

>kafka-console-consumer.sh --zookeeper localhost:2181 --from-beginning --topic replicated-kafkatopic

Multiple nodes – multiple broker cluster

Install Kafka on each node of the cluster, and make sure all the brokers on the different nodes connect to the same ZooKeeper. Then start the brokers on every machine by following the same steps used above to start multiple brokers on a single machine.
