How To Read Kafka From Spark Structured Streaming?



This post provides a very basic code sample showing how to read from Kafka with Spark Structured Streaming.

Assumptions:

  • Your Kafka cluster is running with brokers Host1 and Host2
  • The topics available in Kafka are Topic1 and Topic2
  • The topics contain text data (words)
  • We will count how often each word occurs in the stream
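
To make the goal in the last bullet concrete, here is a minimal batch sketch of the same word count on an in-memory list, using plain Scala collections and no Spark or Kafka. The object name `WordCountSketch`, the function `countWords`, and the sample lines are hypothetical and only serve as illustration.

```scala
object WordCountSketch {
  // Split each line on runs of whitespace and count the occurrences of each word.
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split("\\s+"))   // one element per word
      .filter(_.nonEmpty)          // drop the empty token produced by leading whitespace
      .groupBy(identity)           // word -> all of its occurrences
      .map { case (word, occurrences) => word -> occurrences.size }

  def main(args: Array[String]): Unit = {
    // Hypothetical message values, standing in for records read from a topic
    val sample = Seq("hello kafka", "hello spark")
    countWords(sample).toSeq.sortBy(_._1).foreach {
      case (word, count) => println(s"$word -> $count")
    }
    // prints:
    // hello -> 2
    // kafka -> 1
    // spark -> 1
  }
}
```

The streaming version does the same thing, except that the input never ends and the counts keep updating as new records arrive.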
 

Sample Code:


import org.apache.spark.sql.functions.{explode, split}
import spark.implicits._   // enables the $"column" syntax (already imported in spark-shell and Databricks notebooks)

// Kafka connection setup
val kafka = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "HOST1:PORT1,HOST2:PORT2")   // comma-separated list of host:port brokers
  .option("subscribe", "TOPIC1,TOPIC2")    // comma-separated list of topics
  .option("startingOffsets", "latest") // start at the end of each topic, so only new records are read
  .load()

// In our case we expect plain text, which we use
// for the classic word-count use case.
// "value" refers to the value in each Kafka (key, value) record; it arrives as binary, hence the cast
// Split each line on whitespace and explode the array; the resulting column is named `word`
val df = kafka.select(explode(split($"value".cast("string"), "\\s+")).as("word"))
  .groupBy($"word")
  .count()

// Follow the word counts as they update
// (display is a Databricks notebook helper; elsewhere, write the stream to a sink with writeStream)
display(df.select($"word", $"count"))
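
The `display` call above is only available in Databricks notebooks. Outside a notebook, one possible equivalent is a small standalone application that writes the running counts to the console sink. The sketch below assumes the `spark-sql-kafka-0-10` connector is on the classpath; the object name `KafkaWordCount` and the `kafkaOptions` map are illustrative names, and the broker and topic values are the same placeholders as in the sample.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode, split}

object KafkaWordCount {
  // Placeholder connection settings; replace with real brokers and topics
  val kafkaOptions: Map[String, String] = Map(
    "kafka.bootstrap.servers" -> "HOST1:PORT1,HOST2:PORT2",
    "subscribe"               -> "TOPIC1,TOPIC2",
    "startingOffsets"         -> "latest"
  )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("KafkaWordCount")
      .getOrCreate()

    // Read the Kafka records and turn each message value into one row per word
    val words = spark.readStream
      .format("kafka")
      .options(kafkaOptions)
      .load()
      .select(explode(split(col("value").cast("string"), "\\s+")).as("word"))

    // Running count per word
    val counts = words.groupBy(col("word")).count()

    // "complete" mode re-emits the full result table on every trigger
    val query = counts.writeStream
      .outputMode("complete")
      .format("console")
      .start()

    // Block until the query is stopped or fails
    query.awaitTermination()
  }
}
```

The console sink is meant for debugging and small experiments; for anything longer-lived, a durable sink with a checkpoint location is the usual choice.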



