How to add third party java jars for use in PySpark?

0 votes

For my project, I required some third party Database client libraries in Java. I want to access them through

java_gateway.py
E.g: to make the client class (not a jdbc driver!) available to the python client via the java gateway:

java_import(gateway.jvm, "org.mydatabase.MyDBClient")
It is not clear where to add the third party libraries to the jvm classpath. I tried to add to compute-classpath.sh but that did nto seem to work: I get

 Py4jError: Trying to call a package
Also, when comparing to Hive: the hive jar files are NOT loaded via compute-classpath.sh so that makes me suspicious. There seems to be some other mechanism happening to set up the jvm side classpath.

Can someone help?

Thanks in advance!

Jul 4, 2018 in Apache Spark by Shubham
• 12,270 points
774 views

1 answer to this question.

Your answer

Your name to display (optional):
Privacy: Your email address will only be used for sending these notifications.
0 votes

You can add external jars as arguments to PySpark.

pyspark --jars file1.jar,file2.jar

answered Jul 4, 2018 by nitinrawat895
• 9,070 points

Related Questions In Apache Spark

0 votes
1 answer

How to use Spark jars for Yarn distribution?

First, store upload this archive to hdfs and ...READ MORE

answered Mar 28 in Apache Spark by Raj
25 views
0 votes
1 answer

Which query to use for better performance, join in SQL or using Dataset API?

DataFrames and SparkSQL performed almost about the ...READ MORE

answered Apr 19, 2018 in Apache Spark by kurt_cobain
• 9,260 points
63 views
0 votes
4 answers

How to change the spark Session configuration in Pyspark?

You can dynamically load properties. First create ...READ MORE

answered Dec 10, 2018 in Apache Spark by Vini
7,614 views
0 votes
1 answer

How to add modify access for Web UI user?

For a user to have modification access ...READ MORE

answered Mar 14 in Apache Spark by Raj
25 views
+1 vote
1 answer
0 votes
1 answer

Writing File into HDFS using spark scala

The reason you are not able to ...READ MORE

answered Apr 5, 2018 in Big Data Hadoop by kurt_cobain
• 9,260 points
3,412 views
0 votes
1 answer

Is there any way to check the Spark version?

There are 2 ways to check the ...READ MORE

answered Apr 19, 2018 in Apache Spark by nitinrawat895
• 9,070 points
506 views
0 votes
1 answer

What's the difference between 'filter' and 'where' in Spark SQL?

Both 'filter' and 'where' in Spark SQL ...READ MORE

answered May 23, 2018 in Apache Spark by nitinrawat895
• 9,070 points
2,948 views
0 votes
1 answer

How to find max value in pair RDD?

Use Array.maxBy method: val a = Array(("a",1), ("b",2), ...READ MORE

answered May 25, 2018 in Apache Spark by nitinrawat895
• 9,070 points
1,589 views
0 votes
1 answer

How to convert rdd object to dataframe in spark

SqlContext has a number of createDataFrame methods ...READ MORE

answered May 30, 2018 in Apache Spark by nitinrawat895
• 9,070 points
779 views

© 2018 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
"PMP®","PMI®", "PMI-ACP®" and "PMBOK®" are registered marks of the Project Management Institute, Inc. MongoDB®, Mongo and the leaf logo are the registered trademarks of MongoDB, Inc.