Set up a local Kafka cluster and walk through basic Kafka commands. This follows the Udemy Kafka Course.

What is Kafka?

Kafka is a distributed streaming platform used to:

  • Publish and subscribe to streams of records
  • Store streams of records
  • Process streams of records

The official Kafka documentation has a more detailed description of Kafka and how it is used. A great use case is using Kafka as the pipeline of events in an event-sourced application.

Environment set up

  1. Install Docker. See instructions here for installing on Windows.
  2. Install Kubernetes. See instructions here for installing on Windows.
  3. Install Helm. See instructions here for installing on Windows.
  4. Add the Bitnami repo.
     > helm repo add bitnami https://charts.bitnami.com/bitnami
     > helm repo update
    
  5. Create a namespace for Kafka.
     > kubectl create namespace kafka
     > kubectl config set-context docker-for-desktop --namespace kafka
    
  6. Install the Bitnami Kafka chart.
     > helm install bitnami/kafka --name kafka --namespace kafka
    
  7. Verify the installation.
     > helm list
    
     > helm status kafka
    
  8. List the services (the service names are the domains used in the tutorial).
     > kubectl get svc
     NAME                       TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                      AGE
     kafka                      ClusterIP   10.99.206.57   <none>        9092/TCP                     56m
     kafka-headless             ClusterIP   None           <none>        9092/TCP                     56m
     kafka-zookeeper            ClusterIP   10.98.222.47   <none>        2181/TCP,2888/TCP,3888/TCP   56m
     kafka-zookeeper-headless   ClusterIP   None           <none>        2181/TCP,2888/TCP,3888/TCP   56m
    
  9. List the pods.
     > kubectl get pods
     NAME                READY     STATUS    RESTARTS   AGE
     kafka-0             1/1       Running   1          1h
     kafka-zookeeper-0   1/1       Running   0          1h
    
  10. Install telepresence in order to proxy the Kafka service locally. We need this so that the in-cluster DNS names (such as kafka:9092) resolve correctly when connecting to Kafka from our machine.
    $ curl -s https://packagecloud.io/install/repositories/datawireio/telepresence/script.deb.sh | sudo bash
    $ sudo apt install --no-install-recommends telepresence
    

    If you are using Windows then you must use the Windows Subsystem for Linux (WSL).

  11. Install .NET Core SDK.
    $ wget -q https://packages.microsoft.com/config/ubuntu/16.04/packages-microsoft-prod.deb
    $ sudo dpkg -i packages-microsoft-prod.deb
    $ sudo apt-get install apt-transport-https
    $ sudo apt-get update
    $ sudo apt-get install dotnet-sdk-2.2
    

    If you are using Windows you must install in WSL so that you can use telepresence.

Tutorial - CLI

This section will go through the steps to create a simple producer and consumer using the Kafka command line scripts.

Create a topic

Connect to the Kafka instance.

> kubectl exec -it kafka-0 bash 

Create a topic.

$ kafka-topics.sh --zookeeper kafka-zookeeper:2181 --topic logging.tutorial.main --create --partitions 3 --replication-factor 1

kafka-zookeeper:2181 is the address of the Zookeeper service retrieved from kubectl get svc. Best practice is a replication factor greater than 1, but since we only have one broker the most we can set is replication-factor=1. The topic naming is arbitrary and you can choose whatever topic name you like. See Kafka topic naming article for some thoughts on how to name topics.
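The partition count matters because the producer routes each keyed message to a partition by hashing the key, so all messages with the same key stay in order on one partition. A minimal Python sketch of the idea (CRC32 is used here only as a stand-in for Kafka's real murmur2 hash):

```python
import zlib

def pick_partition(key: str, num_partitions: int) -> int:
    # Stand-in for Kafka's default partitioner, which hashes the
    # key with murmur2; CRC32 is used here only for illustration.
    return zlib.crc32(key.encode("utf-8")) % num_partitions

# The same key always lands on the same partition, preserving
# per-key ordering across the 3 partitions created above.
print(pick_partition("user-42", 3) == pick_partition("user-42", 3))
```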

Produce messages

Connect to the producer console.

> kubectl exec -it kafka-0 bash 
$ kafka-console-producer.sh --broker-list kafka:9092 --topic logging.tutorial.main

Once the console launches you can type a message and press Enter.

$ kafka-console-producer.sh --broker-list kafka:9092 --topic logging.tutorial.main
>hello
>world

To exit the producer console, press Ctrl-C.

Produce messages with acknowledgement

Connect to the producer console.

> kubectl exec -it kafka-0 bash 
$ kafka-console-producer.sh --broker-list kafka:9092 --topic logging.tutorial.main --producer-property acks=all

Notice the additional --producer-property argument. acks=1 is the default, which means only the leader broker acknowledges the message. acks=all means that the leader and the in-sync replica brokers acknowledge the message. acks=0 means that the producer does not wait for acknowledgement.
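A toy model of what each acks level waits for (this illustrates the semantics only, not the actual broker protocol):

```python
def required_acks(acks: str, in_sync_replicas: int) -> int:
    # Number of broker acknowledgements the producer waits for.
    if acks == "0":
        return 0                  # fire and forget
    if acks == "1":
        return 1                  # leader only (the default)
    if acks == "all":
        return in_sync_replicas   # leader plus in-sync followers
    raise ValueError(f"unknown acks setting: {acks}")

# With our single-broker cluster, acks=all waits on one broker,
# so it behaves the same as acks=1.
print(required_acks("all", in_sync_replicas=1))
```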

Once the console launches you can type a message and press Enter.

$ kafka-console-producer.sh --broker-list kafka:9092 --topic logging.tutorial.main --producer-property acks=all
>hello with ack
>world with ack

To exit the producer console, press Ctrl-C.

Consume messages

Connect to the consumer console.

> kubectl exec -it kafka-0 bash 
$ kafka-console-consumer.sh --bootstrap-server kafka:9092 --topic logging.tutorial.main

Open a new terminal and connect to the producer console.

> kubectl exec -it kafka-0 bash 
$ kafka-console-producer.sh --broker-list kafka:9092 --topic logging.tutorial.main

Write some messages in the producer console and you should see the messages appear in the consumer console.

Consume messages from beginning

Connect to the consumer console.

> kubectl exec -it kafka-0 bash 
$ kafka-console-consumer.sh --bootstrap-server kafka:9092 --topic logging.tutorial.main --from-beginning

Note the additional --from-beginning argument.

You should then see all the messages written to the logging.tutorial.main topic.
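The difference comes down to where the consumer starts reading in the partition log. A toy sketch of the offset behaviour (assumed simplification: a fresh console consumer with no committed offset starts at the end of the log):

```python
# A topic partition is an append-only log; each record has an offset.
log = ["hello", "world", "hello with ack", "world with ack"]

def consume(log, from_beginning=False):
    # Without --from-beginning a new consumer starts at the end of
    # the log and only sees records produced after it connects.
    start = 0 if from_beginning else len(log)
    return log[start:]

print(consume(log, from_beginning=True))  # all four messages
print(consume(log))                       # [] until new messages arrive
```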

Consume messages within a group

Connect to the consumer console.

> kubectl exec -it kafka-0 bash 
$ kafka-console-consumer.sh --bootstrap-server kafka:9092 --topic logging.tutorial.main --group my-app

Note the additional --group argument.

Open a new terminal and connect to another consumer console with the same group.

> kubectl exec -it kafka-0 bash 
$ kafka-console-consumer.sh --bootstrap-server kafka:9092 --topic logging.tutorial.main --group my-app

Open a new terminal and connect to the producer console.

> kubectl exec -it kafka-0 bash 
$ kafka-console-producer.sh --broker-list kafka:9092 --topic logging.tutorial.main

Write some messages in the producer console and you should see the messages evenly split between the two consumer consoles.
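The even split happens because Kafka assigns each partition of the topic to exactly one consumer in the group. A rough Python sketch of a round-robin assignment over the 3 partitions we created (the real group assignor is more involved):

```python
def assign(partitions, consumers):
    # Each partition goes to exactly one consumer in the group;
    # here we deal them out round-robin for illustration.
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

# Two consoles in group my-app sharing the 3 partitions:
print(assign([0, 1, 2], ["console-1", "console-2"]))
# {'console-1': [0, 2], 'console-2': [1]}
```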

Tutorial - i-heart-fsharp

In this section we will create a producer and consumer in F#. See the source code for more details.

Clone the repo

> git clone https://github.com/ameier38/kafka-beginners-course.git
> cd kafka-beginners-course

Install dependencies.

> .paket/paket.exe install

If you are using macOS or Linux you will need to install Mono.

Restore the project.

> dotnet restore

Compile the application.

> dotnet publish -o out

This will compile the application and add the compiled assets into a directory called out.

Start the consumer. We use telepresence to proxy the services locally.

> telepresence --run-shell --method inject-tcp
> dotnet out/Tutorial.dll consumer kafka:9092 test_topic test_group

This will start a consumer that will try to connect to a Kafka broker at kafka:9092, listening on the topic test_topic within the group test_group. Using telepresence allows us to use the same DNS names as inside the Kubernetes cluster.

Open a new terminal and start the producer.

> telepresence --run-shell --method inject-tcp
> dotnet out/Tutorial.dll producer kafka:9092 test_topic test_key

This will start a producer that will try to connect to a Kafka broker at kafka:9092 producing to the topic test_topic using the key test_key.

Enter messages into the producer terminal and you should see the messages appear in the consumer terminal.

Summary

In this post we covered:

  • Setting up a local Kafka cluster on Kubernetes with the Bitnami Helm chart
  • Producing and consuming messages using the Kafka command line scripts
  • Producing and consuming messages from an F# application using telepresence

Much thanks to the engineers at Confluent and Jet.com for all the work on the Kafka and F# libraries :raised_hands:!

Additional Resources

I hope you enjoyed the post. If you run into any issues setting up the project leave a comment and I can try to help debug :bug:.