How to Set Up Custom Logging for the Spark Driver and Executors on YARN



In this post, we will see how to set up custom logging for the Spark driver and executors on YARN. The post covers jobs submitted to a YARN cluster in cluster mode (--master yarn-cluster; from Spark 2.0 onward the same mode is written as --master yarn --deploy-mode cluster).

In cluster mode, the Spark driver runs on one of the nodes in the YARN cluster rather than on the machine you submit from. Logging for the job is controlled by a log4j.properties file, which by default is read from $SPARK_CONF_DIR. In a production environment you usually do not have write access to this file, because it is shared by all users of the system. As a workaround, you can ship a custom log4j.properties file with your own job, without modifying the shared production version. The examples use log4j 1.x property syntax; Spark 3.3 and later ship with Log4j 2, which uses a different configuration format.

  Now let's proceed with our custom log4j setup and usage.  

Sample (Custom) log4j.properties file:

Let's say we want the Spark job to write its log to our own file. For that, we create a modified version of the log4j.properties file. Note that this sample shows only a fraction of a complete log4j.properties file. We store the modified file in HDFS or S3 so that the job can access it, e.g. hdfs://customDir/customFiles/log4j.properties


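# Custom log4j.properties file

# Send everything at INFO and above to the FILE appender
log4j.rootCategory=INFO, FILE

log4j.appender.console=org.apache.log4j.ConsoleAppender

# Custom log file; the FILE appender needs a class and a layout of its own
log4j.appender.FILE=org.apache.log4j.FileAppender
log4j.appender.FILE.File=/tmp/SparkDriver.log
log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
log4j.appender.FILE.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n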
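The sample above writes to /tmp on whichever node happens to run the JVM. As a hedged alternative, the sketch below writes a fuller version of the file from the shell and points the appender at ${spark.yarn.app.container.log.dir}, a system property that Spark on YARN sets for each container so that YARN can display and aggregate the log. The local file name custom-log4j.properties, the rolling sizes, and the pattern are illustrative assumptions, not values from this post.

```shell
#!/bin/sh
# Write an illustrative log4j 1.x properties file.
# OUT is a hypothetical local file name; rename it as needed before uploading.
OUT="custom-log4j.properties"

# The quoted 'EOF' keeps the shell from expanding ${...}; log4j resolves
# ${spark.yarn.app.container.log.dir} from a JVM system property at startup.
cat > "$OUT" <<'EOF'
log4j.rootCategory=INFO, FILE

# Rolling file appender inside the YARN container log directory
log4j.appender.FILE=org.apache.log4j.RollingFileAppender
log4j.appender.FILE.File=${spark.yarn.app.container.log.dir}/SparkDriver.log
log4j.appender.FILE.MaxFileSize=50MB
log4j.appender.FILE.MaxBackupIndex=5
log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
log4j.appender.FILE.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
EOF

echo "wrote $OUT"
```

The generated file can then be uploaded, for example with hdfs dfs -put -f custom-log4j.properties hdfs://customDir/customFiles/ (the target directory is the one used in this post).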



 

1. Custom Logging (log4j) for Both Spark Driver and Executor:

Now, when submitting the Spark job in YARN cluster mode, add the custom log4j.properties file to the spark-submit command with --files, as shown below. In this case the custom configuration applies to both the Spark driver and the executors, replacing the default log4j configuration.


/usr/bin/spark-submit \
--class com.Your.SparkCode \
--files hdfs://customDir/customFiles/log4j.properties \
--master yarn-cluster \
yourSparkCode.jar

 

2. Custom Logging (log4j) Only for Spark Driver:

In this case, let's say:

  • We want only the Spark driver to use the custom log4j.properties.
  • The executors will continue to use the default (common) version of the log4j.properties file.
  • We save the custom log4j file in hdfs://customDir/customFiles/custom-log4j-driver.properties
  • For this we use the built-in configuration property spark.driver.extraJavaOptions to pass additional custom options to the Spark driver JVM.

/usr/bin/spark-submit \
--class com.Your.SparkCode \
--files hdfs://customDir/customFiles/custom-log4j-driver.properties \
--master yarn-cluster \
--conf spark.driver.extraJavaOptions=-Dlog4j.configuration=custom-log4j-driver.properties \
yourSparkCode.jar
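The file name given to -Dlog4j.configuration has to match, including letter case, one of the files shipped with --files. A minimal shell sketch of that check follows. The helper name files_contain is hypothetical, and it ignores the path#alias form that --files also accepts.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if one of the comma-separated --files
# entries has the given file name as its last path component.
# The ( ... ) body runs in a subshell, so IFS and "set -f" do not leak out.
files_contain() (
  files_list=$1
  wanted=$2
  # Split on commas only, and disable globbing while the list is unquoted
  IFS=,
  set -f
  for entry in $files_list; do
    if [ "$(basename "$entry")" = "$wanted" ]; then
      exit 0
    fi
  done
  exit 1
)

FILES="hdfs://customDir/customFiles/custom-log4j-driver.properties"
if files_contain "$FILES" "custom-log4j-driver.properties"; then
  echo "ok: file name matches an entry in --files"
else
  echo "mismatch: -Dlog4j.configuration names a file that --files does not ship" >&2
fi
```

Running such a check before spark-submit catches a misspelled or differently cased file name early, instead of the job silently falling back to the default logging configuration.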

 

3. Different Custom Logging (log4j) for Spark Driver and Executors:

In this case, let's say:

  • We want the Spark driver and the executors to use different versions of the custom log4j.properties.
  • We save the custom log4j file for the Spark driver in hdfs://customDir/customFiles/custom-log4j-driver.properties
  • We save the custom log4j file for the executors in hdfs://customDir/customFiles/custom-log4j-executor.properties
  • For this we use the built-in configuration properties:
    • spark.driver.extraJavaOptions to pass additional custom options to the Spark driver JVM.
    • spark.executor.extraJavaOptions to pass additional custom options to the executor JVMs.
 


/usr/bin/spark-submit \
--class com.Your.SparkCode \
--files hdfs://customDir/customFiles/custom-log4j-driver.properties,hdfs://customDir/customFiles/custom-log4j-executor.properties \
--master yarn-cluster \
--conf spark.driver.extraJavaOptions=-Dlog4j.configuration=custom-log4j-driver.properties \
--conf spark.executor.extraJavaOptions=-Dlog4j.configuration=custom-log4j-executor.properties \
yourSparkCode.jar
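The command above repeats each file twice: once as a full path in --files and once as a bare file name in -Dlog4j.configuration. The bare name is enough because YARN copies each shipped file into the container's working directory. The sketch below derives the second form from the first, so the two cannot drift apart. The function name submit_command and its argument order are hypothetical, and it only prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: print the spark-submit command for separate driver and executor
# log4j files. The function name and its arguments are hypothetical.
# The ( ... ) body runs in a subshell so the variables stay local.
submit_command() (
  driver_conf=$1
  executor_conf=$2
  main_class=$3
  app_jar=$4
  # --files gets the full remote paths; -Dlog4j.configuration gets only the
  # file name (last path component) of each one.
  echo "/usr/bin/spark-submit" \
    "--class $main_class" \
    "--files $driver_conf,$executor_conf" \
    "--master yarn-cluster" \
    "--conf spark.driver.extraJavaOptions=-Dlog4j.configuration=$(basename "$driver_conf")" \
    "--conf spark.executor.extraJavaOptions=-Dlog4j.configuration=$(basename "$executor_conf")" \
    "$app_jar"
)

# Dry run: print the command for review instead of executing it
submit_command \
  hdfs://customDir/customFiles/custom-log4j-driver.properties \
  hdfs://customDir/customFiles/custom-log4j-executor.properties \
  com.Your.SparkCode \
  yourSparkCode.jar
```

Printing first and executing only after review is a deliberate choice here, since a wrong path only shows up once the job is already running on the cluster.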

Hope this post helps you set up a custom log4j configuration for Spark.
