How to execute a combiner and partitioner program without adding a package name?

Hi. I am new to Hadoop and I am currently learning about combiners and partitioners. I was wondering whether it is possible to execute a combiner and partitioner program without adding a package declaration. If yes, then how?
Jul 16, 2019 in Big Data Hadoop by Ritu

1 answer to this question.


Yes, it is possible. If you omit the package declaration, the class lives in Java's default (unnamed) package, so you can compile it in place and run it by its simple class name. I have shared an example word-count program below for reference; it uses a custom partitioner, and the reducer also serves as the combiner:

import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Partitioner;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

// Note: there is no "package" declaration, so the class sits in the
// default package and can be run by its simple name.
public class WithPartitioner {

  public static class Map extends MapReduceBase implements
      Mapper<LongWritable, Text, Text, IntWritable> {

    @Override
    public void map(LongWritable key, Text value,
        OutputCollector<Text, IntWritable> output, Reporter reporter)
        throws IOException {

      String line = value.toString();
      StringTokenizer tokenizer = new StringTokenizer(line);

      // Emit (word, 1) for every token. For the input line
      // "I am fine I am fine" the mapper emits:
      //   I 1, am 1, fine 1, I 1, am 1, fine 1
      // and the framework then groups values by key, e.g. I -> (1, 1).
      while (tokenizer.hasMoreTokens()) {
        value.set(tokenizer.nextToken());
        output.collect(value, new IntWritable(1));
      }
    }
  }

  // The output types of the Mapper must match the type arguments of the
  // Partitioner.
  public static class MyPartitioner implements Partitioner<Text, IntWritable> {

    @Override
    public int getPartition(Text key, IntWritable value, int numPartitions) {
      String myKey = key.toString().toLowerCase();
      if (myKey.equals("hadoop")) {
        return 0;
      }
      if (myKey.equals("data")) {
        return 1;
      } else {
        return 2;
      }
    }

    @Override
    public void configure(JobConf arg0) {
      // Gives you the JobConf instance if you want to read or change
      // job configuration.
    }
  }

  public static class Reduce extends MapReduceBase implements
      Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterator<IntWritable> values,
        OutputCollector<Text, IntWritable> output, Reporter reporter)
        throws IOException {

      int sum = 0;
      while (values.hasNext()) {
        sum += values.next().get();
      }

      // e.g. (beer, 3)
      output.collect(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {

    JobConf conf = new JobConf(WithPartitioner.class);
    conf.setJobName("wordcount");

    // Force the job to run 3 reducers, one per partition returned by
    // MyPartitioner.
    conf.setNumReduceTasks(3);

    conf.setMapperClass(Map.class);
    conf.setCombinerClass(Reduce.class); // the reducer doubles as the combiner
    conf.setReducerClass(Reduce.class);
    conf.setPartitionerClass(MyPartitioner.class);

    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);

    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);

    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    JobClient.runJob(conf);
  }
}
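
To actually execute it without a package name, you can compile the file against the Hadoop classpath, bundle the classes into a jar, and pass the simple class name to hadoop jar. A minimal sketch, assuming the hadoop command is on your PATH; the jar name and the HDFS paths /input and /output are placeholders:

# Compile in place; with no package declaration there is no directory layout to maintain
javac -cp "$(hadoop classpath)" WithPartitioner.java

# Package the generated classes, including the nested Map, Reduce, and MyPartitioner classes
jar cf withpartitioner.jar WithPartitioner*.class

# Run using the simple class name, since the class is in the default package
hadoop jar withpartitioner.jar WithPartitioner /input /output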
answered Jul 16, 2019 by Raman
