Can we change the path where the Hive data is stored in HDFS?

0 votes
Say, in my requirement, I want to keep all the Hive data together: the tables created through the Hive process and the ones created in Spark. Is there a way I can do it?
Jul 14, 2019 in Apache Spark by Kevin
854 views

1 answer to this question.

0 votes

Yes, you can, but the location has to be in HDFS. Refer to the code below:

import org.apache.spark.sql.{Row, SaveMode}
import org.apache.spark.sql.types.{StructType, StructField, StringType}

// Spark 1.x: HiveContext gives access to the Hive metastore
// (sc is the SparkContext available in spark-shell)
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)

// Load the source data (requires the spark-avro package)
val results = hiveContext.read.format("com.databricks.spark.avro").load("people.avro")

// Build a comma-separated "col TYPE" string from the DataFrame schema
// (extend the match for data types other than StringType as needed)
val schema = results.schema.map(x => x.name.concat(" ").concat(x.dataType.toString() match {
  case "StringType" => "STRING"
})).mkString(",")

// EXTERNAL tables let you choose the HDFS directory via the LOCATION clause
val hive_sql = "CREATE EXTERNAL TABLE people_and_age (" + schema + ") ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LOCATION '/user/ravi/people_age'"
val hive_sql1 = "CREATE EXTERNAL TABLE people_and_age1 (" + schema + ") ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LOCATION '/user/ravi/people_age'"

hiveContext.sql(hive_sql)
hiveContext.sql(hive_sql1)

// Also save the DataFrame as a table through the DataFrameWriter API
results.write.mode(SaveMode.Overwrite).saveAsTable("people_age")

hiveContext.sql("select * from people_and_age").show()
hiveContext.sql("select * from people_and_age1").show()
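On Spark 2.x and later, the same idea can be expressed through SparkSession instead of HiveContext. A minimal sketch, assuming a spark-shell built with Hive support; the HDFS directory /user/hive/mydata and the table name people_age here are hypothetical examples, not part of the original answer. Setting the "path" option before saveAsTable makes Spark create an external table at that location rather than under the default warehouse directory:

```scala
import org.apache.spark.sql.SparkSession

// spark.sql.warehouse.dir controls where managed tables land by default
val spark = SparkSession.builder()
  .appName("custom-hive-location")
  .config("spark.sql.warehouse.dir", "/user/hive/warehouse")
  .enableHiveSupport()
  .getOrCreate()

val df = spark.read.format("com.databricks.spark.avro").load("people.avro")

// With a "path" option, saveAsTable registers an EXTERNAL table whose
// data files live at the given HDFS location
df.write
  .mode("overwrite")
  .option("path", "/user/hive/mydata/people_age")
  .saveAsTable("people_age")
```

Both Hive-created and Spark-created tables can then point at directories under the same HDFS parent, which keeps all the data together as the question asks.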
answered Jul 14, 2019 by Yogi
