Is there any way to include a python package with Hadoop streaming job

0 votes
I want to include a python package (NLTK) with a Hadoop streaming job, but am not sure how to do this without including every file manually via the CLI argument, "-file".Please note- I don't have the option to install this package on all the slaves.

Can someone please helps me, how to do this?

Thanks in advance!
May 10, 2018 in Big Data Hadoop by kurt_cobain
• 9,390 points
1,970 views

1 answer to this question.

0 votes

See I don't know the answer to your question exactly but one thing you can surely try: 

Just zip up the package into a .tar.gz or a .zip and pass the entire tarball or archive in a -file option to your hadoop command. I've done this in the past with Perl but not in case of Python.

You can also use Python's zipimport at http://docs.python.org/library/zipimport.html, which allows you to import modules directly from a zip.

Hope this will answer your question to some extent.

answered May 10, 2018 by nitinrawat895
• 11,380 points

Related Questions In Big Data Hadoop

0 votes
1 answer
0 votes
1 answer

Is there a way to rebalance single Datanode in Hadoop.

Currently Hadoop does not automatically do this. ...READ MORE

answered Apr 15, 2018 in Big Data Hadoop by kurt_cobain
• 9,390 points
501 views
0 votes
9 answers

Is there any way to check which Hadoop daemons are running?

use jps command, It will show all the running ...READ MORE

answered Dec 27, 2018 in Big Data Hadoop by Rakesh
• 160 points
45,257 views
0 votes
1 answer

Is there any way to access Hadoop web UI in linux?

In this case what you can do ...READ MORE

answered May 9, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
3,541 views
–1 vote
1 answer

Hadoop dfs -ls command?

In your case there is no difference ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by kurt_cobain
• 9,390 points
4,261 views
+1 vote
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 11,380 points
10,560 views
0 votes
1 answer

How to get started with Hadoop?

Well, hadoop is actually a framework that ...READ MORE

answered Mar 21, 2018 in Big Data Hadoop by coldcode
• 2,080 points
899 views
+2 votes
11 answers

hadoop fs -put command?

Hi, You can create one directory in HDFS ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
104,219 views
0 votes
1 answer

Is there any way to increase Java Heap size in Hadoop?

You can add some more memory by ...READ MORE

answered Apr 12, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
4,524 views
0 votes
1 answer

Is there any way to write "map only" Hadoop jobs ?

You can easily set the number of ...READ MORE

answered Apr 16, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
662 views
webinar REGISTER FOR FREE WEBINAR X
REGISTER NOW
webinar_success Thank you for registering Join Edureka Meetup community for 100+ Free Webinars each month JOIN MEETUP GROUP