FileStreamSink: Error while looking for metadata directory - java.lang.IllegalArgumentException: java.net.UnknownHostException: hive

+2 votes


I tried to read a CSV file in Spark SQL, but I am getting the below error.

val dfs ="com.databricks.spark.csv").option("header", "true").option("inferSchema", "true").load("hdfs://hive/bike")
20/02/13 09:38:05 WARN streaming.FileStreamSink: Error while looking for metadata directory.
java.lang.IllegalArgumentException: hive
  at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(
  at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(
  at org.apache.hadoop.hdfs.DFSClient.<init>(
  at org.apache.hadoop.hdfs.DFSClient.<init>(
  at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(
  at org.apache.hadoop.fs.FileSystem.createFileSystem(
  at org.apache.hadoop.fs.FileSystem.access$200(
  at org.apache.hadoop.fs.FileSystem$Cache.getInternal(
  at org.apache.hadoop.fs.FileSystem$Cache.get(
  at org.apache.hadoop.fs.FileSystem.get(
  at org.apache.hadoop.fs.Path.getFileSystem(
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$org$apache$spark$sql$execution$datasources$DataSource$$checkAndGlobPathIfNecessary$1.apply(DataSource.scala:547)
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$org$apache$spark$sql$execution$datasources$DataSource$$checkAndGlobPathIfNecessary$1.apply(DataSource.scala:545)
  at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
  at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
  at scala.collection.immutable.List.foreach(List.scala:392)
  at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
  at scala.collection.immutable.List.flatMap(List.scala:355)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:359)
  at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
  ... 49 elided
Caused by: java.net.UnknownHostException: hive
  ... 73 more
Feb 13, 2020 in Big Data Hadoop by akhtar
• 38,180 points

1 answer to this question.

0 votes


You tried to read a file from your HDFS cluster, so you have to give the full path of the file, including the NameNode (master) IP and port. In `hdfs://hive/bike`, Spark treats `hive` as the NameNode hostname and fails to resolve it, which is why you see the UnknownHostException.
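A quick way to see why the error mentions `hive` as a host is to parse the bad path with plain `java.net.URI` (this sketch is not Spark-specific, and `namenode.example.com:8020` below is a hypothetical host and port, not your cluster's actual address):

```scala
import java.net.URI

object UriDemo extends App {
  // In hdfs://hive/bike, "hive" sits in the authority (host) slot of the URI,
  // so Hadoop treats it as the NameNode hostname -- not as a directory name.
  val bad = new URI("hdfs://hive/bike")
  println(bad.getHost)  // hive
  println(bad.getPath)  // /bike  (note: the leading "hive" segment is gone)

  // With an explicit host and port, the intended /hive/bike path survives.
  val good = new URI("hdfs://namenode.example.com:8020/hive/bike")
  println(good.getHost) // namenode.example.com
  println(good.getPort) // 8020
  println(good.getPath) // /hive/bike
}
```

So the `UnknownHostException: hive` simply means DNS lookup failed for a "host" named `hive` that was never meant to be a host in the first place.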

val dfs ="com.databricks.spark.csv").option("header", "true").option("inferSchema", "true").load("hdfs://<master-ip>:<port>/hive/bike")

Hope this solves your problem.

Thank You

answered Feb 13, 2020 by MD
• 95,140 points
