Scala: org.apache.poi.openxml4j.exceptions.InvalidFormatException: Your InputStream was neither an OLE2 stream nor an OOXML stream

0 votes

Hi Team,

I am trying to read an Excel file using the Spark CLI, but I get the error "org.apache.poi.openxml4j.exceptions.InvalidFormatException: Your InputStream was neither an OLE2 stream, nor an OOXML stream".

Below is the code I am using:

import com.crealytics.spark.excel

val df = spark.read.format("com.crealytics.spark.excel")
.option("useHeader", "true")
.option("startColumn", 0)
.option("treatEmptyValuesAsNulls", "false")
.option("inferSchema", "false")
.option("location", "/home/Desktop/lucky/logs.xlsx")
.option("addColorColumns", "False")
.load()

Jul 30, 2019 in Apache Spark by Nitin
1,448 views

1 answer to this question.

0 votes

Try the code below:

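import org.apache.spark.sql.DataFrame

// Reads an Excel file into a DataFrame via the spark-excel data source.
// sqlContext is assumed to be an existing SQLContext (for example spark.sqlContext).
def readExcel(file: String): DataFrame = sqlContext.read
    .format("com.crealytics.spark.excel")
    .option("location", file)
    .option("useHeader", "true")
    .option("treatEmptyValuesAsNulls", "true")
    .option("inferSchema", "true")
    .option("addColorColumns", "False")
    .load()

val data = readExcel("path to your excel file")

data.show(false)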
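
If the same exception still appears, it may help to check what the file actually contains. The message suggests that POI did not recognise the leading bytes of the stream: an OLE2 file (.xls) begins with the bytes D0 CF 11 E0 A1 B1 1A E1, and an OOXML file (.xlsx) is a ZIP archive beginning with 50 4B 03 04. The following is only a minimal sketch for a file on the local filesystem; the names ExcelMagicCheck, classify and classifyFile are made up for illustration and are not part of POI or spark-excel.

```scala
import java.io.FileInputStream
import java.util.Arrays

// Hypothetical helper (not part of POI or spark-excel): inspects the
// leading bytes of a file to see which container format it looks like.
object ExcelMagicCheck {
  // OLE2 compound documents (.xls) start with these 8 bytes.
  private val Ole2Magic: Array[Byte] =
    Array(0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1).map(_.toByte)

  // OOXML files (.xlsx) are ZIP archives, which start with "PK" 03 04.
  private val ZipMagic: Array[Byte] =
    Array(0x50, 0x4B, 0x03, 0x04).map(_.toByte)

  // True when `header` begins with all bytes of `magic`.
  private def hasPrefix(header: Array[Byte], magic: Array[Byte]): Boolean =
    header.length >= magic.length &&
      Arrays.equals(header.take(magic.length), magic)

  // Classifies a byte header by its magic number.
  def classify(header: Array[Byte]): String =
    if (hasPrefix(header, Ole2Magic)) "OLE2 (.xls)"
    else if (hasPrefix(header, ZipMagic)) "ZIP container (possibly .xlsx)"
    else "neither OLE2 nor ZIP"

  // Reads up to 8 bytes from a local file and classifies them.
  def classifyFile(path: String): String = {
    val in = new FileInputStream(path)
    try {
      val buf = new Array[Byte](8)
      val n = in.read(buf)
      classify(if (n > 0) buf.take(n) else Array.empty[Byte])
    } finally in.close()
  }
}
```

For example, ExcelMagicCheck.classifyFile("/home/Desktop/lucky/logs.xlsx") could be run in the Spark shell against the path from the question. A result of "neither OLE2 nor ZIP" would suggest that the file is not a real Excel workbook (for instance a CSV or HTML export saved with an .xlsx extension), or that the path points at something other than the intended file.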

answered Jul 30, 2019 by Raman
