In this post , we will see How to Fix Spark Error - "package does not exist". These packages can be anything that is needed in the Spark project viz. Spark Streaming Lib , Spark-Sql Lib etc. You might see the error in the spark terminal as below (Any or More of them) -
/Spark_RDD.java:4: error: package org.apache.spark.api.java does not exist
package org.apache.spark.mllib.fpm does not exist
package org.apache.spark.streaming does not exist
package org.apache.spark.sql does not exist
This error mostly occur due to missing dependency.You might have to add these to your dependency lib or pom.xml (if using Maven). To Fix it , cross-check the below in your respective case as applicable.
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_**aa.bb**</artifactId> <!-- matching Scala version -->
<version>**xx.yy.zz**</version> <!-- matching Spark Core version -->
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_aa.bb</artifactId> <!-- matching Scala version -->
<version>xx.yy.zz</version> <!-- matching Spark Core version -->
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_aa.bb</artifactId> <!-- matching Scala version -->
<version>xx.yy.zz</version> <!-- matching Spark Core version -->
</dependency>
<project>
<groupId>com.myproject.app</groupId>
<artifactId>java</artifactId>
<modelVersion>1.0.0</modelVersion>
<name>examples</name>
<packaging>jar</packaging>
<version>0.0.1</version>
<repositories>
<repository>
<id>Akka repository</id>
<url>http://repo.akka.io/releases</url>
</repository>
<repository>
<id>scala-tools</id>
<url>https://oss.sonatype.org/content/groups/scala-tools</url>
</repository>
<repository>
<id>apache</id>
<url>https://repository.apache.org/content/repositories/releases</url>
</repository>
<repository>
<id>twitter</id>
<url>http://maven.twttr.com/</url>
</repository>
<repository>
<id>central2</id>
<url>http://central.maven.org/maven2/</url>
</repository>
</repositories>
<dependencies>
<dependency> <!-- Spark dependency -->
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.3.1</version>
<scope>provided</scope>
</dependency>
<dependency> <!-- Spark dependency -->
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>1.3.1</version>
<scope>provided</scope>
</dependency>
<dependency> <!-- Spark dependency -->
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive_2.10</artifactId>
<version>1.3.1</version>
<scope>provided</scope>
</dependency>
<dependency> <!-- Spark dependency -->
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.10</artifactId>
<version>1.3.1</version>
</dependency>
<dependency> <!-- Spark dependency -->
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka_2.10</artifactId>
<version>1.3.1</version>
</dependency>
<dependency> <!-- Spark dependency -->
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib</artifactId>
<version>1.3.1</version>
</dependency>
<dependency> <!-- Cassandra -->
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector</artifactId>
<version>1.0.0-rc5</version>
</dependency>
<dependency> <!-- Cassandra -->
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector-java</artifactId>
<version>1.0.0-rc5</version>
</dependency>
<dependency> <!-- Elastic search connector -->
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch-hadoop-mr</artifactId>
<version>2.0.0.RC1</version>
</dependency>
<dependency> <!-- Jetty demmo -->
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-client</artifactId>
<version>8.1.14.v20131031</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>2.3.3</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
<version>3.0</version>
</dependency>
<dependency>
<groupId>net.sf.opencsv</groupId>
<artifactId>opencsv</artifactId>
<version>2.0</version>
</dependency>
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_${scala.binary.version}</artifactId>
<version>2.2.1</version>
</dependency>
</dependencies>
<properties>
<java.version>1.7</java.version>
</properties>
<build>
<pluginManagement>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.1</version>
<configuration>
<source>${java.version}</source>
<target>${java.version}</target>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-assembly-plugin</artifactId>
<version>2.2.2</version>
<!-- The configuration of the plugin -->
<configuration>
<!-- Specifies the configuration file of the assembly plugin -->
<descriptors>
<descriptor>src/main/assembly/assembly.xml</descriptor>
</descriptors>
</configuration>
</plugin>
</plugins>
</pluginManagement>
</build>
</project>
Hope this is useful.
apache spark, hadoop spark, spark ml, python spark, package spark does not exist,apache spark, spark download, error package org apache-spark sql does not exist, Spark Import Issue , package org.apache.spark.api.java does not exist , package org.apache.spark.mllib.fpm does not exist , package org.apache.spark.streaming does not exist , package org.apache.spark.sql does not exist, apache spark issue