How to execute a combiner and partitioner program without adding a package name?

Hi. I am new to Hadoop and am currently learning about combiners and partitioners. Is it possible to run a combiner and partitioner program without declaring a package name? If yes, how?
Jul 16 in Big Data Hadoop by Ritu

1 answer to this question.

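Yes, it is possible. If the source file has no package statement, the class lives in Java's default (unnamed) package, so you refer to it by its simple class name when you submit the job, for example: hadoop jar <your-jar> WithPartitioner <input-path> <output-path> (the jar name and both paths are placeholders). I have shared an example program below for reference: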

import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Partitioner;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

// No package statement: the class sits in the default package.
public class WithPartitioner {

    // Mapper: emits (word, 1) for every whitespace-separated token in a line.
    public static class Map extends MapReduceBase implements
            Mapper<LongWritable, Text, Text, IntWritable> {

        @Override
        public void map(LongWritable key, Text value,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {

            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);

            // Example: the line "I am fine I am fine" is emitted as
            // (I,1) (am,1) (fine,1) (I,1) (am,1) (fine,1).
            // Grouped by key, this becomes I -> (1,1), am -> (1,1), fine -> (1,1).
            while (tokenizer.hasMoreTokens()) {
                // The input Text object is reused as the output key.
                value.set(tokenizer.nextToken());
                output.collect(value, new IntWritable(1));
            }
        }
    }

    // The Partitioner's type arguments must match the Mapper's output key/value types.
    public static class MyPartitioner implements Partitioner<Text, IntWritable> {

        @Override
        public int getPartition(Text key, IntWritable value, int numPartitions) {
            String myKey = key.toString().toLowerCase();

            int partition;
            if (myKey.equals("hadoop")) {
                partition = 0;
            } else if (myKey.equals("data")) {
                partition = 1;
            } else {
                partition = 2;
            }
            // Keep the index within range in case fewer than 3 reducers run.
            return partition % numPartitions;
        }

        @Override
        public void configure(JobConf job) {
            // Called once with the job's JobConf; read any settings the
            // partitioner needs here. Nothing to configure in this example.
        }
    }

    // Reducer (also registered as the combiner in main): sums the counts per word.
    public static class Reduce extends MapReduceBase implements
            Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        public void reduce(Text key, Iterator<IntWritable> values,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {

            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }

            // Example: a word seen three times is written as (word, 3).
            output.collect(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WithPartitioner.class);
        conf.setJobName("wordcount");

        // Run 3 reducers, one for each partition index MyPartitioner returns.
        conf.setNumReduceTasks(3);

        conf.setMapperClass(Map.class);
        // Summing is associative and commutative, so the reducer can double as the combiner.
        conf.setCombinerClass(Reduce.class);
        conf.setReducerClass(Reduce.class);
        conf.setPartitionerClass(MyPartitioner.class);

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        // args[0] = input path, args[1] = output path (must not already exist).
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}
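
If you want to see what the combiner and the partitioner do to the data before running anything on a cluster, here is a minimal stand-alone sketch in plain Java. It uses no Hadoop classes at all. The class name CombinePartitionSketch and its two helper methods are made up for illustration, and it only imitates the arithmetic, not the real framework. The routing rule is copied from MyPartitioner, with a modulo guard so the index stays valid when fewer than three partitions exist. Like WithPartitioner, it has no package statement.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical illustration only: plain Java, no Hadoop dependency, default package.
class CombinePartitionSketch {

    // Same routing rule as MyPartitioner: "hadoop" -> 0, "data" -> 1, anything else -> 2.
    // The modulo keeps the result below numPartitions.
    static int partitionFor(String key, int numPartitions) {
        String k = key.toLowerCase();
        int target;
        if (k.equals("hadoop")) {
            target = 0;
        } else if (k.equals("data")) {
            target = 1;
        } else {
            target = 2;
        }
        return target % numPartitions;
    }

    // Imitates a summing combiner applied to one mapper's output:
    // every token contributes 1 and equal keys are added together.
    static Map<String, Integer> combine(String line) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            counts.merge(tokenizer.nextToken(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> combined =
                combine("Hadoop stores data and Hadoop processes data");
        for (Map.Entry<String, Integer> e : combined.entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue()
                    + "\t-> partition " + partitionFor(e.getKey(), 3));
        }
    }
}
```

Assuming the file is saved as CombinePartitionSketch.java, you can compile it with javac CombinePartitionSketch.java and run it with java CombinePartitionSketch. With three partitions, "Hadoop" and "data" (each counted twice) go to partitions 0 and 1, and every other word goes to partition 2. Keep in mind that in a real job Hadoop may run the combiner zero, one, or several times, so the job must produce the same result without it.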
answered Jul 16 by Raman
