Scala: org.apache.poi.openxml4j.exceptions.InvalidFormatException: Your InputStream was neither an OLE2 stream nor an OOXML stream


Hi Team,

I am trying to read an Excel file using the Spark CLI, but I get the error "org.apache.poi.openxml4j.exceptions.InvalidFormatException: Your InputStream was neither an OLE2 stream, nor an OOXML stream".

Below is the code I am using:

import com.crealytics.spark.excel

val df = spark.read.format("com.crealytics.spark.excel")
    .option("useHeader", "true")
    .option("startColumn", 0)
    .option("treatEmptyValuesAsNulls", "false")
    .option("inferSchema", "false")
    .option("location", "/home/Desktop/lucky/logs.xlsx")
    .option("addColorColumns", "false")
    .load()
Jul 30, 2019 in Apache Spark by Nitin
610 views

1 answer to this question.


Try the code below:

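import org.apache.spark.sql.DataFrame

// Reads an Excel file into a DataFrame through the spark-excel data source.
// `spark` is the SparkSession used in the question; on Spark 1.x use `sqlContext.read` instead.
def readExcel(file: String): DataFrame = spark.read
    .format("com.crealytics.spark.excel")
    .option("location", file)
    .option("useHeader", "true")
    .option("treatEmptyValuesAsNulls", "true")
    .option("inferSchema", "true")
    .option("addColorColumns", "false")
    .load()

val data = readExcel("path to your excel file")

// Print the rows without truncating long cell values.
data.show(false)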
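
If the error persists, it may be worth checking the file itself. POI raises this exception when the first bytes of the input match neither the OLE2 header (legacy .xls) nor the ZIP header that every OOXML (.xlsx) file starts with, which commonly happens when a CSV or HTML export has been given an .xlsx extension, or when the file is empty or truncated. The following is a minimal sketch of such a check, not part of spark-excel or POI; the names `ExcelSniffer`, `sniff`, and `sniffFile` are hypothetical and introduced only for illustration.

```scala
import java.io.FileInputStream

// Hypothetical helper (not a spark-excel or POI API): inspects the
// leading bytes of a file to guess which container format it uses.
object ExcelSniffer {
  // OOXML (.xlsx) files are ZIP archives and start with "PK\u0003\u0004".
  private val OoxmlMagic: Array[Byte] =
    Array(0x50, 0x4B, 0x03, 0x04).map(_.toByte)

  // OLE2 compound documents (legacy .xls) start with this 8-byte signature.
  private val Ole2Magic: Array[Byte] =
    Array(0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1).map(_.toByte)

  private def hasPrefix(header: Array[Byte], magic: Array[Byte]): Boolean =
    header.take(magic.length).sameElements(magic)

  // Classifies a header as "OOXML", "OLE2", or "unknown".
  def sniff(header: Array[Byte]): String =
    if (hasPrefix(header, OoxmlMagic)) "OOXML"
    else if (hasPrefix(header, Ole2Magic)) "OLE2"
    else "unknown"

  // Reads at most the first 8 bytes of a local file and classifies them.
  def sniffFile(path: String): String = {
    val in = new FileInputStream(path)
    try {
      val header = Iterator
        .continually(in.read())
        .takeWhile(_ != -1)
        .take(8)
        .map(_.toByte)
        .toArray
      sniff(header)
    } finally in.close()
  }
}
```

Calling `ExcelSniffer.sniffFile("/home/Desktop/lucky/logs.xlsx")` on the machine that holds the file (the path is the one from the question) and getting "unknown" would suggest that the file is not a real Excel workbook, in which case no combination of reader options will help and the file needs to be re-saved from Excel or read with the matching format, such as CSV. Note that a ZIP header only shows the file is a ZIP archive, so "OOXML" here is a necessary check rather than proof of a valid workbook.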
answered Jul 30, 2019 by Raman
