How to connect Spark to a remote Hive server?

0 votes
Can anyone help me understand how to load data from a remotely installed Hive server into a Spark DataFrame? Do I need a Hive JDBC connector?
May 16, 2018 in Big Data Hadoop by code799
1,405 views

3 answers to this question.

0 votes
Use org.apache.spark.sql.hive.HiveContext; with it you can run queries against Hive tables.

However, I would suggest connecting Spark to HDFS directly and running your analytics on the stored data. That would be much more efficient than connecting Spark to Hive and then performing the analysis through it.
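As a rough illustration of reading the stored data straight from HDFS, here is a minimal sketch. It assumes Spark 2.x or later (for SparkSession) and that the table's files are stored as Parquet; the NameNode host, port, and warehouse path are hypothetical placeholders, not values from this thread.

```scala
import org.apache.spark.sql.SparkSession

object ReadFromHdfs {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("read from hdfs")
      .getOrCreate()

    // Hypothetical location of the table's files on the remote cluster.
    // Replace the host, port, and path with the values for your environment.
    val path = "hdfs://namenode-host:8020/user/hive/warehouse/mydb.db/table_name"

    // Assumption: the files are Parquet. Use spark.read.orc, .csv, etc.
    // if the data is stored in a different format.
    val df = spark.read.parquet(path)

    df.printSchema()
    df.show(10)

    spark.stop()
  }
}
```

Note that reading the files directly bypasses the Hive metastore, so you have to know the file format and location yourself.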
answered May 16, 2018 by Shubham
• 13,300 points

+1 vote

JDBC is not required here.

Create a HiveContext as shown below; this works for me:


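import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Name of the Hive database to query (placeholder; replace with your own).
val dbname = "default"

val conf = new SparkConf().setAppName("hive app")
val sc = new SparkContext(conf)

// HiveContext picks up the metastore settings from hive-site.xml on the classpath.
val sqlContext = new HiveContext(sc)

// Switch to the target database, then load a table into a DataFrame.
sqlContext.sql(s"use $dbname")
val dfUnion1 = sqlContext.sql("select * from table_name")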


answered Mar 8 by Vijay Dixon
• 180 points

0 votes

Hi,

JDBC is not required.

HiveServer2 does have a JDBC driver, which supports both embedded and remote access to HiveServer2. Remote HiveServer2 mode is recommended for production use, as it is more secure and does not require granting users direct HDFS or metastore access. To query Hive tables from Spark without JDBC, however, you can connect through the metastore:

  • Put hive-site.xml on your classpath and set hive.metastore.uris to the address where your Hive metastore is hosted.
  • Import org.apache.spark.sql.hive.HiveContext, which can run SQL queries over Hive tables.
  • Define val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc).
  • Run sqlContext.sql("show tables").show() to verify that it works.
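On Spark 2.0 and later, HiveContext is deprecated in favour of SparkSession with Hive support enabled, so the same steps might look roughly like the sketch below. It assumes the spark-hive module is on the classpath; the metastore host is a hypothetical placeholder (9083 is the usual default port), and table_name is a placeholder table. Setting hive.metastore.uris in code is shown here as an alternative to putting it in hive-site.xml.

```scala
import org.apache.spark.sql.SparkSession

object RemoteHiveExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("remote hive example")
      // Hypothetical metastore address; replace with your own host and port.
      // This can be omitted if hive-site.xml on the classpath already sets it.
      .config("hive.metastore.uris", "thrift://metastore-host:9083")
      .enableHiveSupport()
      .getOrCreate()

    // Quick check that the remote metastore is reachable.
    spark.sql("show tables").show()

    // Load a Hive table into a DataFrame (table_name is a placeholder).
    val df = spark.table("table_name")
    df.show(10)

    spark.stop()
  }
}
```

With this approach Spark asks the metastore only for table metadata and then reads the underlying files itself, so the machine running Spark also needs network access to the cluster's HDFS.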

answered Jul 30 by Gitika
• 25,340 points
