How To Fix Spark error - "org.apache.spark.SparkException: Job aborted"



In this post, we will see how to fix the Spark error "org.apache.spark.SparkException: Job aborted". It can occur for various reasons, so I would advise that you check the below points with respect to your Spark project.

  • Make sure the classpath is correct. Consider the example below: Spark should know where to go and find the class name (i.e. the classpath location), so in this case the correct jar location.

./spark-submit \
--class "<CORRECT_CLASSPATH_NAME>" \
--master "spark://xx.yy:7077" \
/AA/BB/target/project-1.0-SNAPSHOT-jar-with-dependencies.jar
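
A quick way to confirm that the class is actually packaged inside the jar is to list the jar contents and grep for it (note that jar entries use / instead of . as the package separator):

jar tf /AA/BB/target/project-1.0-SNAPSHOT-jar-with-dependencies.jar | grep "<CLASS_NAME>"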

  • Version mismatch is one of the most common root causes of this type of error. Check for any mismatch between the Spark connector and the Spark version used in the project. So if the Spark version is xx.yy.zz, then the connector version should also correspond to xx.yy.zz. This needs to be taken care of when you build the dependencies. If you are using Scala, you can use the SBT tool; otherwise you can use a pom.xml (Maven) to build.
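
A minimal build.sbt sketch for illustration - the versions below are examples only and must be aligned with whatever your cluster actually runs:

// build.sbt - keep sparkVersion equal to the Spark version on the cluster
scalaVersion := "2.11.12"

val sparkVersion = "2.4.1"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-sql"  % sparkVersion % "provided",
  // the connector version should correspond to the same Spark line
  "com.datastax.spark" %% "spark-cassandra-connector" % "2.4.1"
)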
   

  • Make sure you have all the dependencies in place. Download all the dependency jars and place them in the jars folder of the Spark master. It is good practice to create a fat jar. If you skip creating a fat jar, then you have to ensure that the job is submitted with all the correct packages specified -

    spark-submit \
    --packages datastax:spark-cassandra-connector:2.4.1-s_2.11 \
    --class "<CORRECT_CLASSPATH_NAME>" \
    --master "spark://xx.yy:7077" \
    /AA/BB/target/project-1.0-SNAPSHOT-jar-with-dependencies.jar
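
If you build with SBT and prefer the fat-jar route, the sbt-assembly plugin is one common way to bundle everything (a sketch - the plugin version is only an example):

// project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")

Running sbt assembly then produces a single jar with the dependencies bundled, which can be passed straight to spark-submit as above.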
    

 

  • Make sure you are using the correct IP address or the public IP/DNS while specifying the Spark master, in the case of AWS or any other cloud cluster.

 --master spark://<CORRECT_IP_of_SPARK_MASTER>:7077   

OR 

 --master spark://<CORRECT_PUBLIC_DNS_IP_of_SPARK_MASTER>:7077  
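
A simple way to verify that the master is actually reachable on port 7077 from the machine you are submitting from (assuming netcat is available on that machine):

nc -vz <CORRECT_IP_of_SPARK_MASTER> 7077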



  • Make sure to refresh the metadata if you are using Hive (which internally uses the metastore). If using Hive, Spark should be aware of the latest metastore metadata and block location data for the table being used. If some new data is loaded into the tables, i.e. the HDFS/S3 data directory for the table is updated, then we need to refresh so that these changes are taken into account. Use the below command for this -

> spark.catalog.refreshTable("<TABLE_NAME>")
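
The SQL form can also be used if that is more convenient:

> spark.sql("REFRESH TABLE <TABLE_NAME>")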

  • Check the availability of free RAM - whether it matches what the job being executed expects. Run the below command on each of the servers in the cluster and check how much RAM and disk space they have on offer.

free -h
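
To check the free disk space on each node as well:

df -h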

  • If you are using any HDFS files in the Spark job, make sure to specify and correctly use the HDFS URL. Cross-check that the NameNode is up and running.
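
For example, a read with a fully qualified HDFS URL looks like the below - the host, port and path are placeholders, so substitute your NameNode address:

val df = spark.read.text("hdfs://<NAMENODE_HOST>:8020/path/to/input")

And a quick health check of HDFS from the command line (shows whether the NameNode is responding and how many DataNodes are live):

hdfs dfsadmin -report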
 

 

  • If you are getting a NullPointerException, there is a possibility that you are running an operation such as an aggregation against empty or null data. Check that.
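
A small Scala sketch of this kind of guard - the column name, input path and app name are purely illustrative:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, sum}

val spark = SparkSession.builder.appName("npe-guard-example").getOrCreate()

// "amount" and the input path are hypothetical - substitute your own
val df = spark.read.parquet("/AA/BB/input")
val cleaned = df.filter(col("amount").isNotNull)   // drop nulls before aggregating

if (cleaned.isEmpty) {                             // Dataset.isEmpty is available from Spark 2.4
  println("No data to aggregate - skipping")
} else {
  cleaned.agg(sum("amount")).show()
}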
 

  • If the job failure is memory-related, verify the memory flags and check what values are being set (or defaulted). You might need to tune those. Some of the important flags are given below -
    • spark.executor.memory – Size of memory to use for each executor that runs the task.
    • spark.executor.cores – Number of virtual cores.
    • spark.driver.memory – Size of memory to use for the driver.
    • spark.driver.cores – Number of virtual cores to use for the driver.
    • spark.executor.instances – Number of executors. Set this parameter unless spark.dynamicAllocation.enabled is set to true.
    • spark.default.parallelism – Default number of partitions in resilient distributed datasets (RDDs) returned by transformations like join, reduceByKey, and parallelize when no partition number is set by the user.
    • To use all the resources available in a cluster, set the maximizeResourceAllocation parameter to true (an Amazon EMR-specific setting) so that values such as spark.executor.cores and spark.executor.memory are computed for you.
    • When tuning manually, use the same value for spark.driver.memory as for spark.executor.memory, and the same value for spark.driver.cores as for spark.executor.cores.
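
These can be passed directly on the command line, for example (the values here are only placeholders - tune them for your own cluster and data volume):

spark-submit \
--conf spark.executor.memory=4g \
--conf spark.executor.cores=2 \
--conf spark.executor.instances=4 \
--conf spark.driver.memory=4g \
--conf spark.driver.cores=2 \
--class "<CORRECT_CLASSPATH_NAME>" \
--master "spark://xx.yy:7077" \
/AA/BB/target/project-1.0-SNAPSHOT-jar-with-dependencies.jar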
Hope these help you to solve the Spark error "org.apache.spark.SparkException: Job aborted".

