Sample Code - Spark Structured Streaming vs Spark Streaming



This post gives sample code for Spark Structured Streaming and Spark Streaming. The major differences between Spark Structured Streaming and Spark Streaming are:

  • Structured Streaming works on DataFrames/Datasets, whereas Spark Streaming works on RDDs (DStreams).
  • Structured Streaming does not expose micro-batches in its API (the way Spark Streaming does). Instead, each incoming row of the data stream is treated as an append to an unbounded input table, and the result table is updated incrementally. So Structured Streaming is closer to real-time from that aspect.

Below is a sample piece of code that demonstrates how data is read and processed in both Structured Streaming and Spark Streaming. It shows how you set up a Spark Structured Streaming environment as well as a Spark Streaming environment. This is not a complete end-to-end application; it just gives you an easy understanding.

Sample Code - Structured Streaming :


import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
    .appName("StructuredStreamingWordCount")
    .master("local[*]")
    .getOrCreate()

import spark.implicits._

// Read lines from a socket source as an unbounded streaming DataFrame
val lines = spark.readStream
    .format("socket")
    .option("host", "localhost")
    .option("port", 9999)
    .load()

// Split each line into words
val data = lines.as[String].flatMap(_.split(" "))

// Running count of each distinct word
val countOfWords = data.groupBy("value").count()

// Streaming aggregations need "complete" (or "update") output mode
countOfWords.writeStream
    .outputMode("complete")
    .format("console")
    .option("truncate", "false")
    .start()
    .awaitTermination()
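
To try this locally, start a simple socket server with nc -lk 9999 in a terminal and type words into it; Spark will print the continuously updated counts table to the console after each trigger. Note that outputMode("complete") is required here because the query contains an aggregation; the default append mode does not support streaming aggregations without a watermark.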


 

Sample Code - Spark Streaming :


import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Create a streaming context with a 1-second batch interval
val conf = new SparkConf().setAppName("SparkStreamingWordCount").setMaster("local[*]")
val streamContext = new StreamingContext(conf, Seconds(1))

// Read lines from a socket source as a DStream of strings
val data = streamContext.socketTextStream("localhost", 9999)

val wordCounts = data.flatMap(_.split(" "))        // split each line into words
                     .filter(w => w.length() > 0)  // remove empty words
                     .map(w => (w, 1L))
                     .reduceByKey(_ + _)           // count by word within each batch

wordCounts.print()

streamContext.start()
streamContext.awaitTermination()
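
One difference worth noting: reduceByKey above counts words within each one-second batch independently, whereas the Structured Streaming example keeps a running total in its result table. Below is a minimal sketch of how the DStream version could keep a running count across batches using updateStateByKey; the checkpoint path is just an illustrative assumption, and these lines would go before streamContext.start().

// Minimal sketch of a running count across batches.
// updateStateByKey requires checkpointing; the path below is illustrative.
streamContext.checkpoint("/tmp/spark-streaming-checkpoint")

val runningCounts = wordCounts.updateStateByKey[Long] {
    (newCounts: Seq[Long], state: Option[Long]) =>
        Some(newCounts.sum + state.getOrElse(0L))
}

runningCounts.print()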


Reference - https://spark.apache.org/docs/latest/streaming-programming-guide.html

Additional Read -
  • Sample Code for PySpark Cassandra Application
  • Sample Code for Spark Cassandra Scala Application