Quick Start

Step 0: Clean up 0.7 data, if it exists

Since 0.8 is not backward compatible with 0.7.x, if you have an existing 0.7 installation you will need to delete all existing ZooKeeper data and Kafka log data (unless you use a different ZooKeeper namespace and a new Kafka log directory).
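
With all servers stopped, something like the following clears both data sets. The paths are an assumption based on the default data directories in the sample configs (dataDir in config/zookeeper.properties and the log directory in config/server.properties), so check your own settings first:
> rm -rf /tmp/zookeeper
> rm -rf /tmp/kafka-logs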

Step 1: Download the code

Download a recent stable release.
> tar xzf kafka-<VERSION>.tgz
> cd kafka-<VERSION>
> ./sbt update
> ./sbt package
> ./sbt assembly-package-dependency

Step 2: Start the server

First start the ZooKeeper server. You can use the convenience script packaged with Kafka to get a quick-and-dirty single-node ZooKeeper instance.

> bin/zookeeper-server-start.sh config/zookeeper.properties
[2013-04-22 15:01:37,495] INFO Reading configuration from: config/zookeeper.properties (org.apache.zookeeper.server.quorum.QuorumPeerConfig)
...
Now start the Kafka server:
> bin/kafka-server-start.sh config/server.properties
[2013-04-22 15:01:47,028] INFO Verifying properties (kafka.utils.VerifiableProperties)
[2013-04-22 15:01:47,051] INFO Property socket.send.buffer.bytes is overridden to 1048576 (kafka.utils.VerifiableProperties)
...

Step 3: Send some messages

Kafka comes with a command line client that will take input from standard input and send it out as messages to the Kafka cluster. By default each line will be sent as a separate message. The topic test is created automatically when messages are first sent to it. Omitting log output, you should see something like this:
> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test 
This is a message
This is another message

Step 4: Start a consumer

Kafka also has a command line consumer that will dump out messages to standard output.
> bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
This is a message
This is another message

If you have each of the above commands running in a different terminal then you should now be able to type messages into the producer terminal and see them appear in the consumer terminal.

Both of these command line tools have additional options. Running the command with no arguments will display usage information documenting them in more detail.

Setting up replication

If you want to set up replication across more than one broker, you can follow this guide; a minimal local sketch is shown below.
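
As a sketch of running multiple brokers on a single machine, make one copy of the sample config per additional broker (the broker.id, port, and log.dir values below are illustrative; pick any that do not collide with the first broker):
> cp config/server.properties config/server-1.properties

Then edit config/server-1.properties to give the new broker a unique id, port, and log directory:
broker.id=1
port=9093
log.dir=/tmp/kafka-logs-1

Finally, start the second broker against the same ZooKeeper instance as the first:
> bin/kafka-server-start.sh config/server-1.properties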

Using the Producer API

You can follow this guide to learn how to use the producer API; a minimal sketch is shown below.
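
Below is a minimal sketch of sending a message with the 0.8 Java producer, assuming a broker on localhost:9092 and the test topic from the steps above; the class name and property values are illustrative, not part of Kafka:

import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class TestProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Brokers used to bootstrap topic metadata (need not be the full cluster)
        props.put("metadata.broker.list", "localhost:9092");
        // Encode message values as UTF-8 strings
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        // Wait for the leader to acknowledge each send
        props.put("request.required.acks", "1");

        Producer<String, String> producer =
            new Producer<String, String>(new ProducerConfig(props));
        producer.send(new KeyedMessage<String, String>("test", "This is a message"));
        producer.close();
    }
}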

Using the high level Consumer API

You can follow this guide to learn how to use the high level consumer API; a minimal sketch is shown below.
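
Below is a minimal sketch of reading the test topic with the 0.8 high level consumer, assuming the local ZooKeeper instance from the steps above; the group id and class name are illustrative:

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class TestHighLevelConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181");
        props.put("group.id", "test-group"); // illustrative consumer group

        ConsumerConnector connector =
            kafka.consumer.Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        // Ask for a single stream for the "test" topic
        Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
        topicCountMap.put("test", 1);
        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
            connector.createMessageStreams(topicCountMap);

        // Blocks, printing messages as they arrive
        ConsumerIterator<byte[], byte[]> it = streams.get("test").get(0).iterator();
        while (it.hasNext())
            System.out.println(new String(it.next().message()));
    }
}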

Using the SimpleConsumer API

For most applications, the high level consumer API is good enough. Some applications, however, need features the high level consumer does not yet expose (e.g., setting the initial offset when restarting the consumer). They can instead use our low level SimpleConsumer API. The logic will be a bit more complicated; you can follow the examples here, and a minimal fetch sketch is shown below.
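
Below is a minimal sketch of a single fetch with the 0.8 SimpleConsumer, assuming the broker on localhost:9092 is the leader for partition 0 of test; a real application must first discover the partition leader and handle fetch errors, and the client id and byte sizes here are illustrative:

import java.nio.ByteBuffer;
import kafka.api.FetchRequest;
import kafka.api.FetchRequestBuilder;
import kafka.javaapi.FetchResponse;
import kafka.javaapi.consumer.SimpleConsumer;
import kafka.message.MessageAndOffset;

public class TestSimpleConsumer {
    public static void main(String[] args) {
        // Connect directly to the broker leading partition 0 of "test"
        SimpleConsumer consumer =
            new SimpleConsumer("localhost", 9092, 100000, 64 * 1024, "test-client");

        // Fetch up to 100000 bytes of messages starting at offset 0
        FetchRequest req = new FetchRequestBuilder()
            .clientId("test-client")
            .addFetch("test", 0, 0L, 100000)
            .build();
        FetchResponse resp = consumer.fetch(req);

        for (MessageAndOffset messageAndOffset : resp.messageSet("test", 0)) {
            ByteBuffer payload = messageAndOffset.message().payload();
            byte[] bytes = new byte[payload.limit()];
            payload.get(bytes);
            System.out.println(messageAndOffset.offset() + ": " + new String(bytes));
        }
        consumer.close();
    }
}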

Kafka Hadoop Consumer

Providing a horizontally scalable solution for aggregating and loading data into Hadoop was one of our basic use cases. To support this use case, we provide a Hadoop-based consumer which spawns off many map tasks to pull data from the Kafka cluster in parallel. This provides extremely fast pull-based Hadoop data load capabilities (we were able to fully saturate the network with only a handful of Kafka servers).

Usage information on the Hadoop consumer can be found here.