How To Fix Spark Error - "org.apache.spark.sql.AnalysisException: resolved attribute(s)" ?

In this post, we will see how to fix the Spark error "org.apache.spark.sql.AnalysisException: resolved attribute(s)".

When you apply a UDF to dataframe columns in Spark, you sometimes get the error below -


Exception in thread "main" org.apache.spark.sql.AnalysisException: 
resolved attribute(s) xxxx#yy missing from



You may also get errors like -


Reference ‘xxx’ is ambiguous

Most occurrences of these errors come from join, aggregation and similar operations on dataframe(s), typically when columns in the dataframes involved share the same AttributeReference (for example, when one dataframe was derived from the other). This creates ambiguity, and the basic approach is to resolve that ambiguity or break the shared AttributeReference. Try the approaches below -

Approach 1:

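If you reuse a dataframe reference (for example, joining a dataframe with one derived from it), the column names can become ambiguous. One approach is to clone the dataframe. In the Java line below, cloneDataset is a helper you write yourself, not a Spark API method -


final Dataset<Row> join = cloneDataset(df1.join(df2, columns));

OR, in PySpark (column_names lists every column of df1, e.g. df1.columns) -

df1_cloned = df1.toDF(*column_names)
df1_cloned.join(df2, ['column_names_to_join'])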
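
The PySpark variant above can be wrapped in a small helper. This is only a sketch: clone_dataframe and join_with_clone are hypothetical names, not Spark functions, and it assumes that re-projecting the columns with toDF gives them fresh internal references, which is the usual reason this workaround is suggested.

```python
def clone_dataframe(df):
    """Return a DataFrame with the same column names as `df`.

    Assumption: toDF(*names) re-projects every column under an alias,
    so the result no longer shares attribute references with `df`.
    """
    return df.toDF(*df.columns)


def join_with_clone(df1, df2, join_columns):
    """Join a clone of df1 with df2 on the given column names."""
    # Clone df1 first so the join does not see references shared with df2.
    df1_cloned = clone_dataframe(df1)
    return df1_cloned.join(df2, list(join_columns))
```

Usage would look like join_with_clone(df1, df2, ['col1']). Whether the clone is enough depends on how df1 and df2 were built, so check the result on your own data.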

Approach 2:

When you join two dataframes that share more than one key column with the same name, join them by listing the exact columns you are joining on.


df1.join(df2, ['col1', 'col2', 'col3', 'col4'])
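
Joining on a list of names makes Spark keep a single copy of each key column, but any other column the two dataframes share stays duplicated. A minimal sketch of a guard around this join is shown below; check_join_keys and join_on_keys are hypothetical helper names introduced for illustration.

```python
def check_join_keys(left_cols, right_cols, keys):
    """Validate the keys and return shared columns that are not keys."""
    missing = [k for k in keys if k not in left_cols or k not in right_cols]
    if missing:
        raise ValueError(f"join keys missing from one side: {missing}")
    # Shared non-key columns will still appear twice after the join.
    return [c for c in left_cols if c in right_cols and c not in keys]


def join_on_keys(df1, df2, keys):
    """Join on explicit key names and report leftover duplicate columns."""
    leftover = check_join_keys(df1.columns, df2.columns, keys)
    if leftover:
        print(f"columns still duplicated after the join: {leftover}")
    # A list of names makes Spark emit each key column only once.
    return df1.join(df2, list(keys))
```

If the helper reports leftover columns, rename or drop them on one side before the join, as in the next approaches.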

Approach 3:

Let's say you have a dataframe df1 and you derive a second dataframe df2 from it. Because df2 was derived from df1, the common columns have the same name(s). If you need to join df1 and df2:
- Rename the columns that are common to both dataframes


df2_modified = df2.withColumnRenamed('col1', 'col1_renamed').withColumnRenamed('col2', 'col2_renamed')

- Now that the column ambiguity is handled, join df1 with df2_modified instead of df2, using the renamed column in the join condition


df1.join(df2_modified, df1['col1'] == df2_modified['col1_renamed'])
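
When many columns are shared, chaining withColumnRenamed by hand gets tedious. The sketch below builds the rename mapping from the two column lists and applies it in a loop. The helper names and the "_renamed" suffix are assumptions made for illustration, and key is assumed to be a column present in both dataframes.

```python
def build_rename_map(left_cols, right_cols, suffix="_renamed"):
    """Map every column shared by both sides to a suffixed name."""
    return {c: c + suffix for c in right_cols if c in left_cols}


def rename_columns(df, mapping):
    """Apply withColumnRenamed once per entry in the mapping."""
    for old_name, new_name in mapping.items():
        df = df.withColumnRenamed(old_name, new_name)
    return df


def join_after_rename(df1, df2, key, suffix="_renamed"):
    """Rename df2's shared columns, then join on the renamed key."""
    mapping = build_rename_map(df1.columns, df2.columns, suffix)
    df2_modified = rename_columns(df2, mapping)
    # After the rename, the two sides no longer share any column name.
    return df1.join(df2_modified, df1[key] == df2_modified[mapping[key]])
```

The joined result then contains both col1 and col1_renamed, so you can drop whichever copy you do not need.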

Approach 4:

You could also use alias, as shown below, to remove the column ambiguity. Here we assume that col1 is the column creating the ambiguity. Note that select keeps only the columns you list, so add any other columns you need to the select.


import pyspark.sql.functions as Func

df1_modified = df1.select(Func.col("col1").alias("col1_renamed"))

Now use the df1_modified dataframe in the join instead of df1.
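
If you want to keep every column of df1 and alias only the ambiguous ones, something along these lines may help. It is a sketch under assumptions: alias_names and alias_ambiguous are hypothetical helpers, the "_renamed" suffix is arbitrary, and column names are assumed not to contain dots or other characters that need quoting.

```python
def alias_names(columns, ambiguous, suffix="_renamed"):
    """Pair each column with its output name; only ambiguous ones change."""
    return [(c, c + suffix if c in ambiguous else c) for c in columns]


def alias_ambiguous(df, ambiguous, suffix="_renamed"):
    """Select all columns of df, aliasing the ambiguous ones."""
    # Imported here so alias_names stays usable without a PySpark install.
    from pyspark.sql import functions as Func

    pairs = alias_names(df.columns, ambiguous, suffix)
    return df.select([Func.col(old).alias(new) for old, new in pairs])
```

For example, alias_ambiguous(df1, {"col1"}) would return df1 with col1 exposed as col1_renamed and all other columns unchanged.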

Hope this helps.

