Is there any way to include a Python package with a Hadoop streaming job?

0 votes
I want to include a Python package (NLTK) with a Hadoop streaming job, but I am not sure how to do this without including every file manually via the CLI argument "-file". Please note: I don't have the option to install this package on all the slaves.

Can someone please help me with how to do this?

Thanks in advance!
May 10, 2018 in Big Data Hadoop by kurt_cobain
• 9,350 points
2,223 views

1 answer to this question.

0 votes

I don't know the exact answer to your question, but here is one thing you can try:

Zip up the package into a .tar.gz or a .zip and pass the whole archive to your hadoop command with a single -file option. I've done this in the past with Perl, but not with Python.
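As a rough illustration of the packaging step, here is a minimal sketch that builds such an archive with the standard library. The package name "mypkg", its contents, and the archive name "deps" are hypothetical stand-ins (a tiny fake package is created so the snippet runs on its own); with a real package you would point root_dir at the directory that contains it.

```python
import os
import shutil
import tempfile
import zipfile

# Hypothetical stand-in for the real package directory, created here
# only so the sketch is runnable without any third-party package.
src_root = tempfile.mkdtemp()
os.makedirs(os.path.join(src_root, "mypkg"))
with open(os.path.join(src_root, "mypkg", "__init__.py"), "w") as f:
    f.write("def tokenize(text):\n    return text.split()\n")

# Build deps.zip with the package directory at the top level of the
# archive, so that "import mypkg" can later resolve against the zip.
out_dir = tempfile.mkdtemp()
archive_path = shutil.make_archive(
    os.path.join(out_dir, "deps"),  # archive name without extension
    "zip",
    root_dir=src_root,              # directory that contains the package
    base_dir="mypkg",               # only this sub-directory is archived
)

# List the archive contents to confirm the layout.
with zipfile.ZipFile(archive_path) as zf:
    names = zf.namelist()

print(archive_path)
print(names)
```

The result is one file, so it can be shipped with one -file argument instead of one argument per source file.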

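You can also use Python's zipimport (http://docs.python.org/library/zipimport.html), which allows you to import modules directly from a .zip archive. Note that zipimport works with .zip archives only, not .tar.gz, and it loads only .py and .pyc files, not compiled extension modules.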
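A minimal sketch of how a mapper script might use this, assuming the zip archive ends up in the task's working directory. The helper enable_zip_imports, the archive name "deps.zip", and the package "mypkg" are all hypothetical; the archive is built on the fly here only so the example is self-contained.

```python
import os
import sys
import tempfile
import zipfile


def enable_zip_imports(zip_path):
    """Put a zip archive at the front of sys.path so that Python's
    zipimport machinery can load modules from it."""
    zip_path = os.path.abspath(zip_path)
    if zip_path not in sys.path:
        sys.path.insert(0, zip_path)
    return zip_path


# Hypothetical stand-in for the archive shipped with the job.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "deps.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr(
        "mypkg/__init__.py",
        "def tokenize(text):\n    return text.split()\n",
    )

enable_zip_imports(archive)

# The import is resolved from inside deps.zip, not from site-packages.
import mypkg

tokens = mypkg.tokenize("hello hadoop streaming")
print(tokens)
```

In a real mapper, the call to enable_zip_imports would come before any import of the packaged module, and the path would point at wherever the shipped archive is placed on the worker.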

I hope this answers your question to some extent.

answered May 10, 2018 by nitinrawat895
• 11,380 points
