Error running Hadoop MapReduce in Python using Hadoop Streaming

I was trying to run a sample MapReduce job written in Python using Hadoop Streaming on the Cloudera QuickStart VM, but I am stuck.

Here is my mapper code:

#!/usr/bin/env python

import re
import sys

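for line in sys.stdin:
  val = line.strip()
  # Fixed-width record: slice out the year, temperature, and quality code
  (year, temp, q) = (val[15:19], val[87:92], val[92:93])
  # Skip missing readings (+9999) and readings with a bad quality code
  if (temp != "+9999" and re.match("[01459]", q)):
    print "%s\t%s" % (year, temp)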

Here is my reducer code:

#!/usr/bin/env python

import sys

(last_key, max_val) = (None, -sys.maxint)
for line in sys.stdin:
  (key, val) = line.strip().split("\t")
  # Input arrives sorted by key, so a key change means the previous year is done
  if last_key and last_key != key:
    print "%s\t%s" % (last_key, max_val)
    (last_key, max_val) = (key, int(val))
  else:
    (last_key, max_val) = (key, max(max_val, int(val)))

# Emit the maximum for the final key
if last_key:
  print "%s\t%s" % (last_key, max_val)

source: https://github.com/tomwhite/hadoop-book/tree/master/ch02-mr-intro/src/main/python
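
To check the logic of the two scripts without a cluster, the map, sort, and reduce steps can be mirrored in a single process. The following is only a minimal sketch, written for Python 3 rather than the Python 2 of the scripts above. The names `map_lines`, `reduce_pairs`, and `make_record` are hypothetical, and the records are synthetic fixed-width lines, not real weather data.

```python
import re


def map_lines(lines):
    """Mirror the mapper: yield (year, temp) for valid readings."""
    for line in lines:
        val = line.strip()
        year, temp, q = val[15:19], val[87:92], val[92:93]
        if temp != "+9999" and re.match("[01459]", q):
            yield year, temp


def reduce_pairs(pairs):
    """Mirror the reducer: pairs must be sorted by key; yield (year, max_temp)."""
    last_key, max_val = None, None
    for key, val in pairs:
        if last_key is not None and last_key != key:
            # Key changed, so the previous year is complete
            yield last_key, max_val
            last_key, max_val = key, int(val)
        else:
            max_val = int(val) if max_val is None else max(max_val, int(val))
            last_key = key
    if last_key is not None:
        yield last_key, max_val


def make_record(year, temp, quality):
    """Build a synthetic 93-character line (hypothetical test data)."""
    chars = ["0"] * 93
    chars[15:19] = year
    chars[87:92] = temp
    chars[92:93] = quality
    return "".join(chars)


sample = [
    make_record("1950", "+0022", "1"),
    make_record("1950", "-0011", "1"),
    make_record("1949", "+0111", "1"),
    make_record("1949", "+9999", "1"),  # missing reading, filtered out
    make_record("1949", "+0078", "2"),  # quality code not accepted, filtered out
]

# Hadoop Streaming sorts the mapper output by key before the reducer reads it;
# sorted() stands in for that step here.
result = list(reduce_pairs(sorted(map_lines(sample))))
for year, max_temp in result:
    print("%s\t%s" % (year, max_temp))
```

If this produces one line per year with the highest temperature, the script logic is probably fine and the problem is more likely in how the job is submitted. The same check is often done from the shell by piping the input file through the mapper, `sort`, and the reducer.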

This is the command I am running to start the MapReduce job:


hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-streaming.jar \
-input /user/cloudera/sample.txt \
-output /user/cloudera/output \
-mapper /home/cloudera/streaming-sample/max_temperature_map.py \
-reducer /home/cloudera/streaming-sample/max_temperature_reduce.py
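
Multi-line shell commands depend on every continued line ending in a backslash, which is easy to get wrong. As a sketch only, the same arguments can be assembled as a list in Python so that no line continuations are involved. The helper name `build_streaming_command` is hypothetical; the paths and options are the ones from the command above.

```python
def build_streaming_command(jar, input_path, output_path, mapper, reducer):
    """Return the argument list for a Hadoop Streaming job."""
    return [
        "hadoop", "jar", jar,
        "-input", input_path,
        "-output", output_path,
        "-mapper", mapper,
        "-reducer", reducer,
    ]


cmd = build_streaming_command(
    jar="/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-streaming.jar",
    input_path="/user/cloudera/sample.txt",
    output_path="/user/cloudera/output",
    mapper="/home/cloudera/streaming-sample/max_temperature_map.py",
    reducer="/home/cloudera/streaming-sample/max_temperature_reduce.py",
)

# Print the command as a single line for inspection
print(" ".join(cmd))

# To actually launch the job (needs a working Hadoop installation):
#   import subprocess
#   subprocess.check_call(cmd)
```

Passing the arguments as a list means each option and its value reach `hadoop` as separate arguments, regardless of how the call is wrapped across lines in the source.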

This is the error log snippet that I am getting:

[screenshot of the error log]

Please help me understand what I am doing wrong here.

Apr 2, 2018 in Big Data Hadoop by nitinrawat895

