How to include third party library in Python MapReduce?

0 votes

I am writing MapReduce job in Python, and want to use some third libraries like chardet.

I konw that we can use option -libjars=... to include them for java MapReduce.

But how to include third party libraries in Python MapReduce Job ?

Nov 27, 2018 in Big Data Hadoop by Neha
• 6,280 points
37 views

1 answer to this question.

0 votes

Problem has been solved by zipimport.

Then I zip chardet to file module.mod, and used like this:

importer = zipimport.zipimporter('module.mod')
chardet = importer.load_module('chardet')

Add -file module.mod in hadoop streaming command.

Now chardet can be used in script.

answered Nov 27, 2018 by Frankie
• 9,810 points

Related Questions In Big Data Hadoop

0 votes
1 answer

How to use custom FileInputFormat in MapReduce?

You have to override isSplitable method. ...READ MORE

answered Apr 10, 2018 in Big Data Hadoop by Shubham
• 13,190 points
92 views
0 votes
1 answer

How to implement data locality in Hadoop MapReduce?

You can use this getFileBlockLocations method of ...READ MORE

answered Apr 20, 2018 in Big Data Hadoop by kurt_cobain
• 9,240 points
30 views
0 votes
1 answer

How to execute python script in hadoop file system (hdfs)?

If you are simply looking to distribute ...READ MORE

answered Sep 19, 2018 in Big Data Hadoop by digger
• 27,620 points
1,289 views
0 votes
1 answer

Hadoop Pig: How to include external jar file in PIG?

You can do this: register /local/path/to/Jar_name.jar READ MORE

answered Nov 16, 2018 in Big Data Hadoop by Omkar
• 67,120 points
25 views
0 votes
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 10,110 points
2,046 views
0 votes
1 answer

hadoop.mapred vs hadoop.mapreduce?

org.apache.hadoop.mapred is the Old API  org.apache.hadoop.mapreduce is the ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 10,110 points
196 views
0 votes
10 answers

hadoop fs -put command?

copy command can be used to copy files ...READ MORE

answered Dec 7, 2018 in Big Data Hadoop by Sujay
10,472 views
0 votes
1 answer

Hadoop dfs -ls command?

In your case there is no difference ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by kurt_cobain
• 9,240 points
764 views
0 votes
1 answer

How to format the output being written by MapReduce in Hadoop?

Here is a simple code demonstrate the ...READ MORE

answered Sep 5, 2018 in Big Data Hadoop by Frankie
• 9,810 points
75 views
0 votes
1 answer

What is Custom partitioner in Hadoop? How to write partition function ?

Don't think that in Hadoop the same ...READ MORE

answered Sep 18, 2018 in Big Data Hadoop by Frankie
• 9,810 points
104 views