How to connect Spark to a remote Hive server

0 votes
Can anyone help me understand how I can load data from a Hive server, which is installed remotely, into a Spark DataFrame? Do I need a Hive JDBC connector?
May 17, 2018 in Big Data Hadoop by code799
12,095 views

3 answers to this question.

0 votes
Use org.apache.spark.sql.hive.HiveContext and you can run queries against Hive.

But I would suggest connecting Spark to HDFS and performing the analytics over the stored data directly. That is much more efficient than connecting Spark to Hive and then running the analysis through it.
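Note that on Spark 2.x and later, HiveContext is deprecated in favor of SparkSession with Hive support enabled. A minimal sketch, assuming Spark 2.x+ with Hive libraries on the classpath (the database and table names are placeholders):

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: assumes hive-site.xml is on the classpath so Spark
// can find the remote metastore.
val spark = SparkSession.builder()
  .appName("hive app")
  .enableHiveSupport()
  .getOrCreate()

// "my_db" and "my_table" are placeholder names.
val df = spark.sql("SELECT * FROM my_db.my_table")
df.show()
```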
answered May 17, 2018 by Shubham
• 13,490 points
+1 vote

JDBC is not required here.

Create a Hive-aware SQLContext (a HiveContext) as below; this works for me:


val conf = new org.apache.spark.SparkConf().setAppName("hive app")
val sc = new org.apache.spark.SparkContext(conf)
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)

// dbname must be defined first, e.g. val dbname = "default"
val df1 = sqlContext.sql(s"USE $dbname")
val dfUnion1 = sqlContext.sql("SELECT * FROM table_name")


answered Mar 8, 2019 by Vijay Dixon
• 190 points
0 votes

Hi,

JDBC is not required.

HiveServer2 does ship a JDBC driver, which supports both embedded and remote access to HiveServer2. Remote HiveServer2 mode is recommended for production use, as it is more secure and does not require direct HDFS/metastore access to be granted to users. From Spark, however, you only need to point at the Hive metastore:

  • Put hive-site.xml on your classpath, and set hive.metastore.uris to where your Hive metastore is hosted.
  • Import org.apache.spark.sql.hive.HiveContext, which can run SQL queries over Hive tables.
  • Define val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc).
  • Run sqlContext.sql("show tables") to verify that it works.
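For reference, a minimal hive-site.xml fragment for the first step might look like this (the hostname is a placeholder; 9083 is the conventional metastore Thrift port):

```xml
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <!-- placeholder host; 9083 is the default metastore Thrift port -->
    <value>thrift://metastore-host:9083</value>
  </property>
</configuration>
```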

answered Jul 30, 2019 by Gitika
• 65,910 points
