How do I connect to a Hive metastore programmatically in Spark SQL?

0 votes

I'm using HiveContext with Spark SQL and trying to connect to a remote Hive metastore. The only way I have found to set the Hive metastore is to include hive-site.xml on the classpath (or copy it to /etc/spark/conf/). Is there any way to set this parameter programmatically in Java code, without including hive-site.xml? If so, which Spark configuration should I use?

Sep 5 in Apache Spark by nitinrawat895
• 10,670 points
53 views

1 answer to this question.

0 votes

In Spark 2.0 and later, you can set the metastore URI directly on the SparkSession builder. Don't forget to replace the value of "hive.metastore.uris" with your own. This assumes that you already have a Hive metastore service running (not a HiveServer).

It should look something like this:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession
      .builder()
      .appName("interfacing spark sql to hive metastore without configuration file")
      .config("hive.metastore.uris", "thrift://localhost:9083") // replace with your Hive metastore service's thrift URL
      .enableHiveSupport() // don't forget to enable Hive support
      .getOrCreate()

    import spark.implicits._
    import spark.sql

    // create an arbitrary frame
    val frame = Seq(("one", 1), ("two", 2), ("three", 3)).toDF("word", "count")

    // see the frame created
    frame.show()
    /**
     * +-----+-----+
     * | word|count|
     * +-----+-----+
     * |  one|    1|
     * |  two|    2|
     * |three|    3|
     * +-----+-----+
     */

    // write the frame as a table registered in the metastore
    frame.write.mode("overwrite").saveAsTable("t4")
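
To check that the session is really talking to the remote metastore, one option is to list the tables it reports and read the table "t4" back. The following is only a minimal sketch: the object name MetastoreCheck is hypothetical, the thrift URL is the same placeholder used above, and it assumes the metastore service is reachable and that "t4" has already been written.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper object, not part of the original answer.
object MetastoreCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .appName("metastore check")
      // Assumption: placeholder URL, replace with your metastore's thrift URL.
      .config("hive.metastore.uris", "thrift://localhost:9083")
      .enableHiveSupport()
      .getOrCreate()

    // List the tables the metastore reports for the current database.
    spark.catalog.listTables().show()

    // Read back the table written earlier and display its rows.
    spark.table("t4").show()

    spark.stop()
  }
}
```

If "t4" appears in the listing and its rows match the frame shown earlier, the programmatic setting has most likely taken effect without a hive-site.xml on the classpath.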
answered Sep 5 by ravikiran
• 4,560 points
