Unable to run the Java Spark standalone program


I am unable to run this Java program:

package com.sparkdemo.spark_kafka_cassandra;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class JSONtoRDD {

    public static void main(String[] args) {
        // configure spark
        SparkSession spark = SparkSession
                .builder()
                .appName("Spark Example - Read JSON to RDD")
                .master("local[2]")
                .getOrCreate();

        // read JSON file to RDD
        String jsonPath = "data/employees.json";
        JavaRDD<Row> items = spark.read().json(jsonPath).toJavaRDD();

        items.foreach(item -> {
            System.out.println(item);
        });
    }
}

I get this error:

19/03/30 10:36:36 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/home/edureka/Desktop/LabDetails/Spark_Workspace/spark-kafka-cassandra/spark-warehouse').
19/03/30 10:36:36 INFO SharedState: Warehouse path is 'file:/home/edureka/Desktop/LabDetails/Spark_Workspace/spark-kafka-cassandra/spark-warehouse'.
19/03/30 10:36:40 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
19/03/30 10:36:53 INFO FileSourceStrategy: Pruning directories with:
19/03/30 10:36:53 INFO FileSourceStrategy: Post-Scan Filters:
19/03/30 10:36:53 INFO FileSourceStrategy: Output Data Schema: struct<value: string>
19/03/30 10:36:53 INFO FileSourceScanExec: Pushed Filters:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 10582
    at com.thoughtworks.paranamer.BytecodeReadingParanamer$ClassReader.accept(BytecodeReadingParanamer.java:563)
    at com.thoughtworks.paranamer.BytecodeReadingParanamer$ClassReader.access$200(BytecodeReadingParanamer.java:338)
    at com.thoughtworks.paranamer.BytecodeReadingParanamer.lookupParameterNames(BytecodeReadingParanamer.java:103)
    at com.thoughtworks.paranamer.CachingParanamer.lookupParameterNames(CachingParanamer.java:90)
    at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.getCtorParams(BeanIntrospector.scala:44)
    at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$1(BeanIntrospector.scala:58)
    at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$1$adapted(BeanIntrospector.scala:58)
    at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:240)
    at scala.collection.Iterator.foreach(Iterator.scala:937)
    at scala.collection.Iterator.foreach$(Iterator.scala:937)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1425)
    at scala.collection.IterableLike.foreach(IterableLike.scala:70)
    at scala.collection.IterableLike.foreach$(IterableLike.scala:69)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at scala.collection.TraversableLike.flatMap(TraversableLike.scala:240)
    at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:237)
    at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
    at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.findConstructorParam$1(BeanIntrospector.scala:58)
    at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$19(BeanIntrospector.scala:176)
    at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233)
    at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:32)
    at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:29)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:194)
    at scala.collection.TraversableLike.map(TraversableLike.scala:233)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:226)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:194)
    at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$14(BeanIntrospector.scala:170)
    at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$14$adapted(BeanIntrospector.scala:169)
    at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:240)
    at scala.collection.immutable.List.foreach(List.scala:388)
    at scala.collection.TraversableLike.flatMap(TraversableLike.scala:240)
    at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:237)
    at scala.collection.immutable.List.flatMap(List.scala:351)
    at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.apply(BeanIntrospector.scala:169)
    at com.fasterxml.jackson.module.scala.introspect.ScalaAnnotationIntrospector$._descriptorFor(ScalaAnnotationIntrospectorModule.scala:22)
    at com.fasterxml.jackson.module.scala.introspect.ScalaAnnotationIntrospector$.fieldName(ScalaAnnotationIntrospectorModule.scala:30)
    at com.fasterxml.jackson.module.scala.introspect.ScalaAnnotationIntrospector$.findImplicitPropertyName(ScalaAnnotationIntrospectorModule.scala:78)
    at com.fasterxml.jackson.databind.introspect.AnnotationIntrospectorPair.findImplicitPropertyName(AnnotationIntrospectorPair.java:467)
    at com.fasterxml.jackson.databind.introspect.POJOPropertiesCollector._addFields(POJOPropertiesCollector.java:351)
    at com.fasterxml.jackson.databind.introspect.POJOPropertiesCollector.collectAll(POJOPropertiesCollector.java:283)
    at com.fasterxml.jackson.databind.introspect.POJOPropertiesCollector.getJsonValueMethod(POJOPropertiesCollector.java:169)
    at com.fasterxml.jackson.databind.introspect.BasicBeanDescription.findJsonValueMethod(BasicBeanDescription.java:223)
    at com.fasterxml.jackson.databind.ser.BasicSerializerFactory.findSerializerByAnnotations(BasicSerializerFactory.java:348)
    at com.fasterxml.jackson.databind.ser.BeanSerializerFactory._createSerializer2(BeanSerializerFactory.java:210)
    at com.fasterxml.jackson.databind.ser.BeanSerializerFactory.createSerializer(BeanSerializerFactory.java:153)
    at com.fasterxml.jackson.databind.SerializerProvider._createUntypedSerializer(SerializerProvider.java:1203)
    at com.fasterxml.jackson.databind.SerializerProvider._createAndCacheUntypedSerializer(SerializerProvider.java:1157)
    at com.fasterxml.jackson.databind.SerializerProvider.findValueSerializer(SerializerProvider.java:481)
    at com.fasterxml.jackson.databind.SerializerProvider.findTypedValueSerializer(SerializerProvider.java:679)
    at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:107)
    at com.fasterxml.jackson.databind.ObjectMapper._configAndWriteValue(ObjectMapper.java:3559)
    at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsString(ObjectMapper.java:2927)
    at org.apache.spark.rdd.RDDOperationScope.toJson(RDDOperationScope.scala:52)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:142)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
    at org.apache.spark.sql.execution.datasources.json.TextInputJsonDataSource$.inferFromDataset(JsonDataSource.scala:103)
    at org.apache.spark.sql.execution.datasources.json.TextInputJsonDataSource$.infer(JsonDataSource.scala:98)
    at org.apache.spark.sql.execution.datasources.json.JsonDataSource.inferSchema(JsonDataSource.scala:64)
    at org.apache.spark.sql.execution.datasources.json.JsonFileFormat.inferSchema(JsonFileFormat.scala:60)
    at org.apache.spark.sql.execution.datasources.DataSource.$anonfun$getOrInferFileFormatSchema$12(DataSource.scala:183)
    at scala.Option.orElse(Option.scala:289)
    at org.apache.spark.sql.execution.datasources.DataSource.getOrInferFileFormatSchema(DataSource.scala:180)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:373)
    at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
    at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:391)
    at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:325)
    at com.sparkdemo.spark_kafka_cassandra.JSONtoRDD.main(JSONtoRDD.java:17)
19/03/30 10:36:54 INFO SparkContext: Invoking stop() from shutdown hook
19/03/30 10:36:54 INFO SparkUI: Stopped Spark web UI at http://192.168.0.28:4040
19/03/30 10:36:54 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
19/03/30 10:36:55 INFO MemoryStore: MemoryStore cleared
19/03/30 10:36:55 INFO BlockManager: BlockManager stopped
19/03/30 10:36:55 INFO BlockManagerMaster: BlockManagerMaster stopped
19/03/30 10:36:55 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
19/03/30 10:36:55 INFO SparkContext: Successfully stopped SparkContext
19/03/30 10:36:55 INFO ShutdownHookManager: Shutdown hook called
19/03/30 10:36:55 INFO ShutdownHookManager: Deleting directory /tmp/spark-b7e8657d-1cc6-428f-a790-723eab56c07b
Jul 30, 2019 in Apache Spark by Junaid

1 answer to this question.


There is nothing wrong with the code itself; the problem is most likely in how the project was set up and executed. An ArrayIndexOutOfBoundsException thrown from BytecodeReadingParanamer usually means an outdated paranamer jar on the classpath that cannot parse newer bytecode, which is why rebuilding the project against the jars that ship with Spark generally resolves it.
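Before re-doing the setup, one way to confirm this is to check which jar the failing class is actually loaded from. The sketch below is a hypothetical diagnostic (the class name is taken from the top of your stack trace); if it prints a location outside /usr/lib/spark/jars, your build path is picking up a stale copy:

package com.sparkdemo.spark_kafka_cassandra;

// Hypothetical helper: prints the jar that provides the paranamer class
// appearing at the top of the stack trace.
public class WhichJar {
    public static void main(String[] args) throws ClassNotFoundException {
        Class<?> c = Class.forName("com.thoughtworks.paranamer.BytecodeReadingParanamer");
        // getCodeSource() returns the location the class was loaded from
        // (it can be null for bootstrap classes, but not for a jar on the build path)
        System.out.println(c.getProtectionDomain().getCodeSource().getLocation());
    }
}

With that checked, follow these steps: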

Step 1:

Start the Spark daemons (on a typical installation, by running the start-all.sh script from Spark's sbin directory).

[screenshot: Spark daemons starting]

Step 2:

Now create a Java project and copy the same code into it.

Then right-click the project --> Build Path --> Configure Build Path --> Libraries --> Add External JARs.

Select all the jars from the /usr/lib/spark/jars folder and click Apply. (If you manage the project with Maven instead, the same Spark jars can be declared as dependencies rather than added by hand.)

image

Step 3:

Now run the code and have a look at the result.

[screenshot: program output after a successful run]
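As a sanity check once the Spark jars are on the build path, a slightly expanded version of the same program can confirm the JSON file is read correctly before converting to an RDD. This is a minimal sketch, assuming data/employees.json exists and is in JSON Lines format (one JSON object per line), which is what spark.read().json() expects by default; the multiLine option shown in the comments is only needed for a pretty-printed JSON array:

package com.sparkdemo.spark_kafka_cassandra;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class JSONtoRDD {

    public static void main(String[] args) {
        SparkSession spark = SparkSession
                .builder()
                .appName("Spark Example - Read JSON to RDD")
                .master("local[2]")
                .getOrCreate();

        // spark.read().json() expects JSON Lines by default, e.g.:
        //   {"name": "Alice", "salary": 3000}
        //   {"name": "Bob", "salary": 3500}
        // For a pretty-printed JSON array, enable multiLine instead:
        //   spark.read().option("multiLine", "true").json(jsonPath)
        String jsonPath = "data/employees.json";
        Dataset<Row> df = spark.read().json(jsonPath);

        // Verify the inferred schema and a sample of rows before going to RDDs
        df.printSchema();
        df.show();

        // Convert to a JavaRDD and print each row
        JavaRDD<Row> items = df.toJavaRDD();
        items.foreach(item -> System.out.println(item));

        spark.stop();
    }
}

If the file is actually a single pretty-printed JSON array rather than JSON Lines, df.printSchema() will typically show only a _corrupt_record column, which is a quick way to spot the format problem.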

answered Jul 30, 2019 by Lohit
