




How To Fix Spark Error - org.apache.spark.SparkException: Exception Thrown in AwaitResult



In this post, we will see how to fix the Spark error - org.apache.spark.SparkException: Exception thrown in awaitResult. This is a fairly common occurrence, and the error below can be seen in the Spark master terminal -


org.apache.spark.SparkException: Exception thrown in awaitResult

Work through the points below to fix this -

  • Check the Spark version used in the project, especially if it involves a cluster of nodes (master, slaves). The Spark version running on the slave nodes should be the same as the Spark version dependency used to compile the jar, so make sure to use the appropriate version in the pom.xml.
In the example below, the Spark version xx.yy.zz should be the "common" version used across all the nodes in the cluster.


<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>xx.yy.zz</version>
</dependency>

  • Also, the Scala version used should be compatible with the corresponding Spark version - e.g. a spark-core_2.10 artifact must be compiled against Scala 2.10.x.
 

  • Check whether any large dataset is being broadcast, if applicable. The broadcast threshold, set by spark.sql.autoBroadcastJoinThreshold, defaults to 10 MB. Depending on your use case, you might consider modifying the threshold value.
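
For example, a minimal sketch of raising the threshold at runtime, assuming an active SparkSession named spark (the 100 MB value is only an illustration, not a recommendation):

// Runtime-settable SQL conf; the value here is in bytes
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", 100L * 1024 * 1024)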
 

  • Check your code for any shuffle operations, since these cause data movement across the network.
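
As a minimal sketch (the pairs RDD here is hypothetical), preferring map-side aggregation reduces how much data crosses the network during the shuffle:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("shuffle-demo").getOrCreate()
val pairs = spark.sparkContext.parallelize(Seq(("a", 1), ("b", 1), ("a", 1)))

// groupByKey ships every record across the network before summing
val perKeySlow = pairs.groupByKey().mapValues(_.sum)

// reduceByKey combines values map-side first, so far less data is shuffled
val perKeyFast = pairs.reduceByKey(_ + _)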
 

  • The output of each task run is returned to the TaskRunner, wrapped in a TaskResult. If the serialized result is larger than the maximum direct result size (defined by spark.task.maxDirectResultSize, default 1 MB), it is saved to the BlockManager; otherwise it is sent back to the Driver directly.
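
A minimal sketch of tuning these limits when building the session - the 10m and 2g values are assumptions to adjust for your workload:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("result-size-tuning")
  // results larger than this go through the BlockManager instead of being sent inline
  .config("spark.task.maxDirectResultSize", "10m")
  // upper bound on the total serialized size of results collected back to the driver
  .config("spark.driver.maxResultSize", "2g")
  .getOrCreate()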
 

  • Try increasing the spark.sql.broadcastTimeout value. The default is 300 seconds.
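
For example (runtime conf on an existing SparkSession; the 1200-second value is an arbitrary illustration):

// Broadcast join wait time in seconds (default 300)
spark.conf.set("spark.sql.broadcastTimeout", 1200L)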
 

  • Try disabling broadcasting (if applicable) - spark.sql.autoBroadcastJoinThreshold=-1
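
In code, assuming an active SparkSession named spark:

// -1 disables automatic broadcast joins entirely
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", -1L)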
 


  • Try increasing the Spark driver memory - spark.driver.memory=<8,16,....>G. Note that driver memory has to be set before the driver JVM starts, e.g. via spark-submit --driver-memory 8g or in spark-defaults.conf.

Try all the above steps and see if that helps to solve the issue.
