Scala: org.apache.poi.openxml4j.exceptions.InvalidFormatException: Your InputStream was neither an OLE2 stream, nor an OOXML stream

0 votes

Hi Team,

I am trying to read an Excel file using the Spark CLI, but I get the following error: "org.apache.poi.openxml4j.exceptions.InvalidFormatException: Your InputStream was neither an OLE2 stream, nor an OOXML stream".

Below is the code I am using:

import com.crealytics.spark.excel

val df = spark.read.format("com.crealytics.spark.excel")
    .option("useHeader", "true")
    .option("startColumn", 0)
    .option("treatEmptyValuesAsNulls", "false")
    .option("inferSchema", "false")
    .option("location", "/home/Desktop/lucky/logs.xlsx")
    .option("addColorColumns", "False")
    .load()
Jul 30, 2019 in Apache Spark by Nitin
303 views

1 answer to this question.

0 votes

Try the following code:

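import org.apache.spark.sql.DataFrame

// Read an Excel file into a DataFrame with the spark-excel data source.
// Assumes a `sqlContext` is in scope; with a SparkSession, use `spark.read` instead.
def readExcel(file: String): DataFrame = sqlContext.read
    .format("com.crealytics.spark.excel")
    .option("location", file)
    .option("useHeader", "true")
    .option("treatEmptyValuesAsNulls", "true")
    .option("inferSchema", "true")
    .option("addColorColumns", "False")
    .load()

val data = readExcel("path to your excel file")

// Print the rows without truncating long cell values.
data.show(false)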
answered Jul 30, 2019 by Raman
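
The error text says that the bytes POI received did not start with either an OLE2 (.xls) or an OOXML (.xlsx, which is a ZIP container) signature. Before changing reader options, it may help to confirm what the file actually contains. The following is a minimal sketch, not part of spark-excel or POI: the object name ExcelHeaderCheck and its methods are hypothetical, and it only inspects a file on the local filesystem (it does not handle HDFS paths).

```scala
import java.io.FileInputStream

// Hypothetical helper for illustration; not part of spark-excel or Apache POI.
object ExcelHeaderCheck {
  // Signature of an OLE2 compound document (legacy .xls).
  private val Ole2Magic: Array[Byte] =
    Array(0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1).map(_.toByte)

  // Signature of a ZIP local file header; OOXML files (.xlsx) are ZIP archives.
  private val ZipMagic: Array[Byte] =
    Array(0x50, 0x4B, 0x03, 0x04).map(_.toByte)

  // Classify a file by its leading bytes.
  def detect(header: Array[Byte]): String =
    if (header.take(Ole2Magic.length).sameElements(Ole2Magic)) "OLE2"
    else if (header.take(ZipMagic.length).sameElements(ZipMagic)) "OOXML"
    else "UNKNOWN"

  // Read up to the first 8 bytes of a local file and classify them.
  def detectFile(path: String): String = {
    val in = new FileInputStream(path)
    try {
      val buf = new Array[Byte](8)
      val n = in.read(buf)
      detect(if (n > 0) buf.take(n) else Array.empty[Byte])
    } finally in.close()
  }
}
```

For example, ExcelHeaderCheck.detectFile("/home/Desktop/lucky/logs.xlsx") returning "UNKNOWN" would suggest the file is not a real Excel workbook (for instance, a CSV or HTML file saved with an .xlsx extension). A result of "OOXML" only shows that the file is a ZIP container, not that it is a valid workbook.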
