Setting up Kafka and Storm

This post describes how to set up a Kafka and Storm cluster, based on my own experience.

Prerequisites
  • JDK 1.7

 

Setting up Kafka 0.8.x and Storm 0.9.2

Setting up Kafka 0.8.x and Storm 0.9.2 has become relatively straightforward.

Steps to set up Kafka

  • Download Kafka from https://www.apache.org/dyn/closer.cgi?path=/kafka/0.8.1.1/kafka_2.9.2-0.8.1.1.tgz
  • gunzip kafka_2.9.2-0.8.1.1.tgz
  • tar -xvf kafka_2.9.2-0.8.1.1.tar
  • cd kafka_2.9.2-0.8.1.1
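
The download and unpack steps above can also be scripted. The sketch below is one way to do it, not the only one. It assumes the Apache archive site keeps old releases under /dist/kafka/<version>/ (the closer.cgi link usually opens a mirror-selection page rather than the file itself) and that curl is installed. The function names are hypothetical helpers of mine.

```shell
#!/bin/sh
# Versions taken from the download link above.
SCALA_VERSION=2.9.2
KAFKA_VERSION=0.8.1.1

# Builds the archive file name from the two versions.
kafka_archive() {
    printf 'kafka_%s-%s.tgz\n' "$SCALA_VERSION" "$KAFKA_VERSION"
}

# Assumption: archive.apache.org layout is /dist/kafka/<version>/<archive>.
kafka_url() {
    printf 'https://archive.apache.org/dist/kafka/%s/%s\n' \
        "$KAFKA_VERSION" "$(kafka_archive)"
}

# Downloads the archive if it is not already present, then unpacks it.
# tar -xzf does the gunzip and tar -xvf steps in a single command.
fetch_kafka() {
    archive=$(kafka_archive)
    if [ ! -f "$archive" ]; then
        curl -fLO "$(kafka_url)" || return 1
    fi
    tar -xzf "$archive"
}

# Usage:
#   fetch_kafka && cd "kafka_${SCALA_VERSION}-${KAFKA_VERSION}"
```

Nothing runs until fetch_kafka is called, so the file can be sourced safely from another script.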

Starting Kafka

  • cd bin (the commands below use paths relative to the bin directory)
  • ./zookeeper-server-start.sh ../config/zookeeper.properties &
  • ./kafka-server-start.sh ../config/server.properties &
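
Because both servers are started in the background, the Kafka broker can come up before ZooKeeper is ready to accept connections. A small wait loop is one way to avoid that race. This is a minimal sketch, assuming a netcat (nc) that supports the -z option and the default ports 2181 (ZooKeeper) and 9092 (Kafka). The function name wait_for_port is my own.

```shell
#!/bin/sh
# Waits until host:port accepts TCP connections, or gives up after a timeout.
# $1 = host, $2 = port, $3 = timeout in seconds (defaults to 30).
wait_for_port() {
    host=$1
    port=$2
    timeout=${3:-30}
    elapsed=0
    # nc -z only probes the port; it does not send any data.
    until nc -z "$host" "$port" 2>/dev/null; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
}

# Usage, from the bin directory:
#   ./zookeeper-server-start.sh ../config/zookeeper.properties &
#   wait_for_port localhost 2181 || echo "ZooKeeper did not start"
#   ./kafka-server-start.sh ../config/server.properties &
#   wait_for_port localhost 9092 || echo "Kafka did not start"
```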

Testing Kafka Setup

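  • ./kafka-console-producer.sh --broker-list localhost:9092 --topic test
    • Type a few messages, one per line (in 0.8.x the console producer connects to the broker with --broker-list, not to ZooKeeper)
  • ./kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
    • The messages you typed should be printed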
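
The manual check above can be turned into a scripted round trip. The sketch below assumes the Kafka 0.8.x console tools, a broker on localhost:9092, ZooKeeper on localhost:2181, and a new or empty topic, so that the first message read back is the one just sent. The function name kafka_round_trip is hypothetical.

```shell
#!/bin/sh
# Sends one message through Kafka and checks that it comes back.
# $1 = Kafka bin directory, $2 = topic, $3 = message text.
kafka_round_trip() {
    bin_dir=$1
    topic=$2
    message=$3

    # The console producer reads messages from stdin, one per line.
    printf '%s\n' "$message" |
        "$bin_dir/kafka-console-producer.sh" \
            --broker-list localhost:9092 --topic "$topic" || return 1

    # --max-messages 1 makes the consumer exit after one message
    # instead of waiting for more input.
    received=$("$bin_dir/kafka-console-consumer.sh" \
        --zookeeper localhost:2181 --topic "$topic" \
        --from-beginning --max-messages 1)

    # Match the whole line exactly, ignoring any extra status lines.
    printf '%s\n' "$received" | grep -qxF -- "$message"
}

# Usage, from the Kafka directory:
#   kafka_round_trip ./bin test "hello kafka" && echo "Kafka is working"
```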

 

Setting up Storm-Kafka 

Storm-Kafka is now maintained as part of the Storm release itself.

1) The following Oracle (Sun) jars are not available in the public Maven repository, so download them and install them into your local Maven repository:

mvn install:install-file -Dfile=jms-1.1.jar -DgroupId=javax.jms -DartifactId=jms -Dversion=1.1 -Dpackaging=jar
mvn install:install-file -Dfile=jmxtools-1.2.1.jar -DgroupId=com.sun.jdmk -DartifactId=jmxtools -Dversion=1.2.1 -Dpackaging=jar
mvn install:install-file -Dfile=jmxri-1.2.1.jar -DgroupId=com.sun.jmx -DartifactId=jmxri -Dversion=1.2.1 -Dpackaging=jar

2) If you are writing your own Maven pom.xml, make sure to exclude ZooKeeper and log4j from the Kafka dependency, like this:

<dependency>
  <groupId>org.apache.kafka</groupId>
  <artifactId>kafka_2.9.2</artifactId>
  <version>0.8.1.1</version>
  <exclusions>
    <exclusion>
      <groupId>org.apache.zookeeper</groupId>
      <artifactId>zookeeper</artifactId>
    </exclusion>
    <exclusion>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
    </exclusion>
  </exclusions>
</dependency>

Sample Code 

This sample code should help you get going: https://github.com/sathya21/kafka-storm-implementation

 

Setting up Kafka 0.7 and Storm 0.9

Although the current Kafka version is 0.8.0, Storm-Kafka for Storm 0.9 works with Kafka 0.7.2 compiled with Scala 2.9.2.

Steps to set up Kafka

  • In the Kafka 0.7.2 source tree, edit the sbt build properties (project/build.properties) so that they read:
    • project.name=Kafka
    • sbt.version=0.7.5
    • project.version=0.7.2
    • build.scala.versions=2.9.2
    • contrib.root.dir=contrib
    • lib.dir=lib
    • target.dir=target/scala_2.9.2
    • dist.dir=dist
  • ./sbt update
  • ./sbt "++2.9.2 package"
  • Open bin/kafka-run-class.sh and replace every reference to the 2.8.0 Scala directory with the 2.9.2 Scala directory (in vi: :%s/2.8.0/2.9.2/g)
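
The last step, pointing the run-class script at the Scala 2.9.2 build output, can also be done without opening an editor. This is a minimal sketch using sed, assuming the script is bin/kafka-run-class.sh in the source tree. The function name switch_scala_version is my own.

```shell
#!/bin/sh
# Rewrites every "2.8.0" in the given script to "2.9.2", keeping a .bak copy.
# $1 = path to the script (assumed: bin/kafka-run-class.sh).
switch_scala_version() {
    # The dots are escaped so they match literally, not "any character".
    sed -i.bak 's/2\.8\.0/2.9.2/g' "$1"
}

# Usage, from the top of the Kafka source tree:
#   switch_scala_version bin/kafka-run-class.sh
#   grep -n '2.9.2' bin/kafka-run-class.sh   # confirm the paths changed
```

The .bak copy makes it easy to restore the original script if the build directory turns out to be named differently.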

Starting Kafka

  • cd bin (the commands below use paths relative to the bin directory)
  • ./zookeeper-server-start.sh ../config/zookeeper.properties &
  • ./kafka-server-start.sh ../config/server.properties &

Testing Kafka Setup

  • ./kafka-console-producer.sh --zookeeper localhost:2181 --topic test
    • Type a few messages, one per line
  • ./kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
    • The messages you typed should be printed

Setting up Storm-Kafka 

  • Add the storm-kafka dependency to m2-pom.xml
<dependency>
  <groupId>storm</groupId>
  <artifactId>storm-kafka</artifactId>
  <version>0.9.0-wip16a-scala292</version>
</dependency>

  • Change the version of the storm dependency in the same file
<dependency>
  <groupId>storm</groupId>
  <artifactId>storm</artifactId>
  <version>0.9.0-rc2</version>
  <scope>provided</scope>
</dependency>
  • vi ./src/jvm/storm/starter/KafkaTopology.java
  • Change the host to point to your ZooKeeper
    • SpoutConfig kafkaConf = new SpoutConfig(new SpoutConfig.ZkHosts("localhost:2181", "/brokers"), "test", "/kafkastorm", "discovery");
  • Run it using mvn -f m2-pom.xml compile exec:java -Dexec.classpathScope=compile -Dexec.mainClass=storm.starter.KafkaTopology

Testing the setup

  • Use kafka-console-producer.sh to produce data to the "test" topic. You should see the data printed on your KafkaTopology console.
