Apache Kafka is a distributed streaming platform written in Java and Scala. It was originally developed at LinkedIn and is now maintained by the Apache Software Foundation.
Apache Kafka is used to build real-time streaming data pipelines that reliably move data between systems and applications. It provides unified, high-throughput, low-latency processing of real-time data.
This tutorial shows you how to install and configure Apache Kafka on CentOS 7, including the installation and configuration of Apache Zookeeper.
Prerequisites
- CentOS 7 Server
- Root privileges
What will we do?
- Install Java OpenJDK 8
- Install and Configure Apache Zookeeper
- Install and Configure Apache Kafka
- Configure Apache Zookeeper and Apache Kafka as Services
- Testing
Step 1 – Install Java OpenJDK 8
Apache Kafka is written in Java and Scala, so Java must be installed on the server.
Install Java OpenJDK 8 to the CentOS 7 server using the yum command below.
sudo yum install -y java-1.8.0-openjdk java-1.8.0-openjdk-devel
After the installation is complete, check the installed Java version.
java -version
Java OpenJDK 8 is now installed.
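Since the rest of this guide assumes Java 8, it can help to check the version in a script rather than by eye. The following is a minimal sketch: the function name `java_major` is hypothetical, and it assumes the version string has the usual `1.8.0_181` or `11.0.2` shape.

```shell
# java_major VERSION: print the major Java version for a version string.
# Hypothetical helper; legacy strings like 1.8.0_181 map to 8,
# newer ones like 11.0.2 map to 11.
java_major() {
  case "$1" in
    1.*) rest="${1#1.}"; echo "${rest%%.*}" ;;
    *)   echo "${1%%.*}" ;;
  esac
}

# `java -version` prints to stderr; the version sits between double quotes.
if command -v java >/dev/null 2>&1; then
  version=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  echo "Java major version: $(java_major "$version")"
fi
```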
Step 2 – Install Apache Zookeeper
Apache Kafka uses Zookeeper for controller election, cluster membership, and topic configuration. Zookeeper is a distributed configuration and synchronization service.
In this step, we will install Apache Zookeeper from the binary release.
Before installing Apache Zookeeper, add a new user named ‘zookeeper’ with the home directory ‘/opt/zookeeper’.
useradd -d /opt/zookeeper -s /bin/bash zookeeper
passwd zookeeper
Now go to the ‘/opt’ directory and download the Apache Zookeeper binary file.
cd /opt
wget https://www-us.apache.org/dist/zookeeper/stable/zookeeper-3.4.12.tar.gz
Extract the zookeeper.tar.gz file to the ‘/opt/zookeeper’ directory and change the directory’s owner to the ‘zookeeper’ user and group.
tar -xf zookeeper-3.4.12.tar.gz -C /opt/zookeeper --strip-components=1
sudo chown -R zookeeper:zookeeper /opt/zookeeper
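The strip option removes the leading directory (here `zookeeper-3.4.12/`) from every path in the archive, so the files land directly in the target directory. A small sketch on a throwaway archive illustrates the effect; the directory and file names below are made up for the demonstration and are not part of the Zookeeper release.

```shell
# Build a throwaway archive with one top-level directory, like a release tarball.
work=$(mktemp -d)
mkdir -p "$work/src/demo-1.0/bin" "$work/dest"
echo "hello" > "$work/src/demo-1.0/bin/run.sh"
tar -czf "$work/demo.tar.gz" -C "$work/src" demo-1.0

# Extract while dropping the first path component (demo-1.0/).
tar -xf "$work/demo.tar.gz" -C "$work/dest" --strip-components=1

# Files now sit directly under dest/ rather than dest/demo-1.0/.
ls "$work/dest"
```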
Next, we need to create a new zookeeper configuration.
Log in as the ‘zookeeper’ user and create a new configuration file ‘zoo.cfg’ under the ‘conf’ directory.
su - zookeeper
vim conf/zoo.cfg
Paste the following configuration there.
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/zookeeper/data
clientPort=2181
Save and exit.
The basic Apache Zookeeper configuration is now complete, and Zookeeper will listen on port 2181.
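In this configuration, `initLimit` and `syncLimit` are expressed in ticks, so their real durations depend on `tickTime`, which is given in milliseconds. As a rough sketch, the helper below (the name `zk_limits_ms` is hypothetical) reads a `zoo.cfg`-style file of plain key=value lines and prints both limits in milliseconds.

```shell
# zk_limits_ms FILE: print initLimit and syncLimit converted to milliseconds.
# Hypothetical helper; assumes plain key=value lines without spaces.
zk_limits_ms() {
  awk -F= '
    $1 == "tickTime"  { tick  = $2 }
    $1 == "initLimit" { initl = $2 }
    $1 == "syncLimit" { syncl = $2 }
    END {
      print "initLimit_ms=" initl * tick
      print "syncLimit_ms=" syncl * tick
    }
  ' "$1"
}

# Usage on the server:
#   zk_limits_ms /opt/zookeeper/conf/zoo.cfg
```

With the values used in this guide, that works out to 10 x 2000 = 20000 ms for `initLimit` and 5 x 2000 = 10000 ms for `syncLimit`.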
Step 3 – Download and Install Apache Kafka
In this step, we will install and configure Apache Kafka.
Add a new user named ‘kafka’ with the home directory ‘/opt/kafka’.
useradd -d /opt/kafka -s /bin/bash kafka
passwd kafka
Go to the ‘/opt’ directory and download the Apache Kafka compressed binary files.
cd /opt
wget http://www-eu.apache.org/dist/kafka/2.0.0/kafka_2.11-2.0.0.tgz
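The archive name encodes two versions: `kafka_2.11-2.0.0.tgz` is Kafka 2.0.0 built for Scala 2.11. A short sketch that splits the two apart, using a hypothetical helper name `kafka_versions` and assuming the `kafka_<scala>-<kafka>.tgz` naming shown above:

```shell
# kafka_versions FILE: print the Scala and Kafka versions encoded in a
# release archive name of the form kafka_<scala>-<kafka>.tgz.
kafka_versions() {
  name="${1##*/}"        # drop any leading directory
  name="${name%.tgz}"    # drop the extension
  name="${name#kafka_}"  # drop the fixed prefix
  echo "scala=${name%%-*} kafka=${name#*-}"
}

kafka_versions kafka_2.11-2.0.0.tgz
```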
Extract the kafka_*.tgz file to the ‘/opt/kafka’ directory and change the owner of all files to the ‘kafka’ user and group.
tar -xf kafka_2.11-2.0.0.tgz -C /opt/kafka --strip-components=1
sudo chown -R kafka:kafka /opt/kafka
Next, log in as the ‘kafka’ user and edit the server configuration.
su - kafka
vim config/server.properties
Add the following line at the end of the file.
delete.topic.enable = true
Save and exit.
Apache Kafka has been downloaded, and the basic setup is completed.
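If you prefer to script the edit instead of using an editor, the setting can be appended only when it is not already present. This is a sketch under simple assumptions: `ensure_property` is a hypothetical helper, it ignores commented-out lines, and it expects the file to end with a newline.

```shell
# ensure_property FILE KEY VALUE: append KEY=VALUE to a properties file
# unless an uncommented line already sets KEY.
ensure_property() {
  file="$1"; key="$2"; value="$3"
  # awk exits 0 when some line has exactly KEY (ignoring spaces) before the "=".
  if awk -F= -v k="$key" '
       { gsub(/[[:space:]]/, "", $1) }
       $1 == k { found = 1 }
       END { exit !found }
     ' "$file"; then
    echo "$key is already set in $file"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Usage on the server, as the kafka user:
#   ensure_property /opt/kafka/config/server.properties delete.topic.enable true
```

Running the helper twice leaves a single entry in the file, which makes it safe to reuse in provisioning scripts.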
Step 4 – Configure Apache Kafka and Zookeeper as Services
In this tutorial, Apache Zookeeper and Apache Kafka run as systemd services.
We need to create a new service file for each of them.
Go to the ‘/lib/systemd/system’ directory and create a new service file named ‘zookeeper.service’.
cd /lib/systemd/system/
vim zookeeper.service
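Paste the following configuration there. Note that this unit runs as the ‘kafka’ user and starts Zookeeper with the scripts and the ‘zookeeper.properties’ file bundled in the Kafka distribution under ‘/opt/kafka’.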
[Unit]
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
User=kafka
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
Save and exit.
Next, create the service file for Apache Kafka ‘kafka.service’.
vim kafka.service
Paste the following configuration there.
[Unit]
Requires=zookeeper.service
After=zookeeper.service

[Service]
Type=simple
User=kafka
ExecStart=/bin/sh -c '/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties'
ExecStop=/opt/kafka/bin/kafka-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
Save and exit, then reload the systemd manager configuration.
systemctl daemon-reload
Start Apache Zookeeper and Apache Kafka, and enable them at boot, using the systemctl commands below.
systemctl start zookeeper
systemctl enable zookeeper
systemctl start kafka
systemctl enable kafka
Apache Zookeeper and Apache Kafka are now up and running. Zookeeper listens on port ‘2181’ and Kafka on port ‘9092’; check this using the netstat command below.
netstat -plntu
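To check the two ports from a script instead of reading the table by eye, the listing can be filtered. This is a minimal sketch, assuming the usual `netstat -plntu` layout in which the local address is the fourth column; `port_listening` is a hypothetical helper name.

```shell
# port_listening PORT: read `netstat -plntu` output on stdin and succeed
# if some TCP socket is in LISTEN state on PORT.
port_listening() {
  awk -v port="$1" '
    /^tcp/ && /LISTEN/ {
      n = split($4, parts, ":")   # handles both 0.0.0.0:9092 and :::9092
      if (parts[n] == port) found = 1
    }
    END { exit !found }
  '
}

# Usage on the server:
#   netstat -plntu | port_listening 2181 && echo "Zookeeper is listening"
#   netstat -plntu | port_listening 9092 && echo "Kafka is listening"
```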
Step 5 – Testing
Log in as the ‘kafka’ user and go to the ‘bin/’ directory.
su - kafka
cd bin/
Now create a new topic named ‘HakaseTesting’.
./kafka-topics.sh --create --zookeeper localhost:2181 \
  --replication-factor 1 --partitions 1 \
  --topic HakaseTesting
Then run ‘kafka-console-producer.sh’ with the ‘HakaseTesting’ topic.
./kafka-console-producer.sh --broker-list localhost:9092 \
  --topic HakaseTesting
Type any content into the producer shell; each line is sent as a message.
Next, open a new terminal, log in to the server, log in as the ‘kafka’ user, and go to the ‘bin/’ directory.
Run the ‘kafka-console-consumer.sh’ for the ‘HakaseTesting’ topic.
./kafka-console-consumer.sh --bootstrap-server localhost:9092 \
  --topic HakaseTesting --from-beginning
Now, whatever you type in the ‘kafka-console-producer.sh’ shell appears in the ‘kafka-console-consumer.sh’ shell.
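The same round trip can also be run without two interactive terminals, which is handy for a quick health check. The sketch below makes several assumptions: `kafka_smoke_test` is a hypothetical helper, the broker address matches this guide, and the 10-second consumer timeout is an arbitrary choice. It pipes one line into the console producer and then looks for that line in the consumer output.

```shell
# kafka_smoke_test BIN_DIR TOPIC MESSAGE: send one message to TOPIC and
# succeed if the console consumer reads it back.
# BIN_DIR is the directory holding the Kafka CLI scripts, e.g. /opt/kafka/bin.
kafka_smoke_test() {
  kafka_bin="$1"; topic="$2"; message="$3"

  # The console producer sends each line of stdin as one message.
  printf '%s\n' "$message" |
    "$kafka_bin/kafka-console-producer.sh" \
      --broker-list localhost:9092 --topic "$topic" >/dev/null || return 1

  # Read the topic from the start; --timeout-ms makes the consumer exit
  # once no new message arrives, instead of blocking forever.
  "$kafka_bin/kafka-console-consumer.sh" \
    --bootstrap-server localhost:9092 --topic "$topic" \
    --from-beginning --timeout-ms 10000 2>/dev/null |
    grep -Fxq -- "$message"
}

# Usage on the server, as the kafka user:
#   kafka_smoke_test /opt/kafka/bin HakaseTesting "hello kafka" && echo "round trip OK"
```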
The installation and configuration of Apache Kafka on CentOS 7 have been completed successfully.