Is there any way to include a Python package with a Hadoop streaming job?

0 votes
I want to include a Python package (NLTK) with a Hadoop streaming job, but I am not sure how to do this without listing every file manually via the CLI argument "-file". Please note that I don't have the option to install this package on all the slaves.

Can someone please help me with this?

Thanks in advance!
May 10, 2018 in Big Data Hadoop by kurt_cobain
• 9,260 points
405 views

1 answer to this question.

0 votes

I don't know the exact answer to your question, but here is one thing you can try:

Zip the package up into a .tar.gz or a .zip and pass the whole archive to your hadoop command with a single -file option. I've done this in the past with Perl, but not with Python.
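
To make the idea concrete, here is a rough sketch of how the command line could be assembled so that the archive is shipped as one -file entry next to the scripts. It only builds and prints the command and does not run Hadoop. The jar location, the HDFS paths and the file names (mapper.py, reducer.py, nltk.zip) are placeholders I made up, so adjust them to your cluster.

```python
import shlex


def build_streaming_command(streaming_jar, input_path, output_path,
                            mapper, reducer, shipped_files):
    """Return the argv list for a Hadoop streaming job.

    Every entry in shipped_files gets its own -file option, so a zipped
    package counts as a single entry however many modules it contains.
    """
    command = [
        "hadoop", "jar", streaming_jar,
        "-input", input_path,
        "-output", output_path,
        "-mapper", mapper,
        "-reducer", reducer,
    ]
    for path in shipped_files:
        command += ["-file", path]
    return command


# All values below are hypothetical placeholders, not real cluster paths.
command = build_streaming_command(
    streaming_jar="/path/to/hadoop-streaming.jar",
    input_path="/user/demo/input",
    output_path="/user/demo/output",
    mapper="mapper.py",
    reducer="reducer.py",
    shipped_files=["mapper.py", "reducer.py", "nltk.zip"],
)

# Print a copy-pasteable shell line instead of executing it.
print(" ".join(shlex.quote(part) for part in command))
```

Once the printed line looks right, you can paste it into a shell or hand the list to subprocess.run.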

You can also use Python's zipimport (http://docs.python.org/library/zipimport.html), which lets you import modules directly from a zip archive. Note that this works for .zip files only, not for .tar.gz.

Hope this answers your question to some extent.
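
Here is a minimal sketch of importing from a zip, which is roughly what a mapper script would do at the top. To keep it runnable anywhere, it builds a tiny stand-in package called demo_pkg (a made-up name) instead of using NLTK. I have not checked whether NLTK itself imports cleanly from an archive, and its corpora and other data files would still have to be shipped separately.

```python
import os
import sys
import tempfile
import zipfile


def build_demo_archive(directory):
    """Create a zip holding a tiny hypothetical package, demo_pkg."""
    archive_path = os.path.join(directory, "demo_pkg.zip")
    with zipfile.ZipFile(archive_path, "w") as archive:
        archive.writestr("demo_pkg/__init__.py", "")
        archive.writestr(
            "demo_pkg/tokens.py",
            "def split_words(line):\n    return line.strip().split()\n",
        )
    return archive_path


def import_from_archive(archive_path):
    """Put the zip on sys.path and import a module that lives inside it."""
    if archive_path not in sys.path:
        # Python's built-in zipimport machinery handles .zip entries on sys.path.
        sys.path.insert(0, archive_path)
    from demo_pkg import tokens
    return tokens


# In a real job the zip would already sit in the task's working directory;
# here it is created in a temporary directory so the sketch is self-contained.
workdir = tempfile.mkdtemp()
tokens = import_from_archive(build_demo_archive(workdir))
words = tokens.split_words("hello hadoop streaming\n")
print(words)
```

This only works for pure-Python modules. Compiled extension modules cannot be loaded from inside a zip.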

answered May 10, 2018 by nitinrawat895
• 10,690 points
