How to execute combiner and partitioning program without adding package name

0 votes
Hi. I am new to Hadoop and I am learning combining and partitioning as of now. I was just wondering if it is possible to execute the combiner and partitioner program without adding packages. If yes, then how?
Jul 16, 2019 in Big Data Hadoop by Ritu
741 views

1 answer to this question.

0 votes

Yes, it is possible to do so without adding packages. I have shared an example program below for reference:

import java.io.IOException;

import java.util.Iterator;

import java.util.StringTokenizer;


import org.apache.hadoop.fs.Path;

import org.apache.hadoop.io.IntWritable;

import org.apache.hadoop.io.LongWritable;

import org.apache.hadoop.io.Text;

import org.apache.hadoop.mapred.FileInputFormat;

import org.apache.hadoop.mapred.FileOutputFormat;

import org.apache.hadoop.mapred.JobClient;

import org.apache.hadoop.mapred.JobConf;

import org.apache.hadoop.mapred.MapReduceBase;

import org.apache.hadoop.mapred.Mapper;

import org.apache.hadoop.mapred.OutputCollector;

import org.apache.hadoop.mapred.Partitioner;

import org.apache.hadoop.mapred.Reducer;

import org.apache.hadoop.mapred.Reporter;

import org.apache.hadoop.mapred.TextInputFormat;

import org.apache.hadoop.mapred.TextOutputFormat;


public class WithPartitioner {


 public static class Map extends MapReduceBase implements

   Mapper<LongWritable, Text, Text, IntWritable> {


  @Override

  public void map(LongWritable key, Text value,

    OutputCollector<Text, IntWritable> output, Reporter reporter)

    throws IOException {


   String line = value.toString();

   StringTokenizer tokenizer = new StringTokenizer(line);


   while (tokenizer.hasMoreTokens()) {

    value.set(tokenizer.nextToken());

    output.collect(value, new IntWritable(1));


    // // I am fine I am fine

    // v

    // I 1

    // am 1

    // fine 1

    // I 1

    // am 1

    // fine 1


    // I (1,1)


   }


  }

 }


 // Output types of Mapper should be same as arguments of Partitioner

 public static class MyPartitioner implements Partitioner<Text, IntWritable> {


  @Override

  public int getPartition(Text key, IntWritable value, int numPartitions) {


   String myKey = key.toString().toLowerCase();


   if (myKey.equals("hadoop")) {

    return 0;

   }

   if (myKey.equals("data")) {

    return 1;

   } else {

    return 2;

   }

  }


  @Override

  public void configure(JobConf arg0) {


   // Gives you a new instance of JobConf if you want to change Job

   // Configurations


  }

 }


 public static class Reduce extends MapReduceBase implements

   Reducer<Text, IntWritable, Text, IntWritable> {


  @Override

  public void reduce(Text key, Iterator<IntWritable> values,

    OutputCollector<Text, IntWritable> output, Reporter reporter)

    throws IOException {


   int sum = 0;

   while (values.hasNext()) {

    sum += values.next().get();

    // sum = sum + 1;

   }


   // beer,3


   output.collect(key, new IntWritable(sum));

  }

 }


 public static void main(String[] args) throws Exception {


  JobConf conf = new JobConf(WithPartitioner.class);

  conf.setJobName("wordcount");


  // Forcing program to run 3 reducers

  conf.setNumReduceTasks(3);


  conf.setMapperClass(Map.class);

  conf.setCombinerClass(Reduce.class);

  conf.setReducerClass(Reduce.class);

  conf.setPartitionerClass(MyPartitioner.class);


  conf.setOutputKeyClass(Text.class);

  conf.setOutputValueClass(IntWritable.class);


  conf.setInputFormat(TextInputFormat.class);

  conf.setOutputFormat(TextOutputFormat.class);


   FileInputFormat.setInputPaths(conf, new Path(args[0]));

   FileOutputFormat.setOutputPath(conf, new Path(args[1]));

  JobClient.runJob(conf);

 }

}
answered Jul 16, 2019 by Raman

Related Questions In Big Data Hadoop

0 votes
1 answer

How to find the running namenodes and secondary name nodes in hadoop?

Name nodes: hdfs getconf -namenodes Secondary name nodes: hdfs getconf ...READ MORE

answered Nov 26, 2018 in Big Data Hadoop by Omkar
• 69,210 points
2,376 views
0 votes
1 answer

How to install and configure a multi-node Hadoop cluster?

I would recommend you to install Cent ...READ MORE

answered Mar 22, 2018 in Big Data Hadoop by Shubham
• 13,490 points
2,130 views
0 votes
1 answer

How to create a FileSystem object that can be used for reading from and writing to HDFS?

Read operation on HDFS In order to read ...READ MORE

answered Mar 21, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points

edited Mar 22, 2018 by nitinrawat895 2,678 views
0 votes
1 answer

How to set a custom install directory for a deb package with fpm

Here's something that you can try... the last ...READ MORE

answered Mar 26, 2018 in Big Data Hadoop by Amrinder
• 140 points
1,136 views
+1 vote
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 11,380 points
10,598 views
0 votes
1 answer

hadoop.mapred vs hadoop.mapreduce?

org.apache.hadoop.mapred is the Old API  org.apache.hadoop.mapreduce is the ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 11,380 points
2,204 views
+2 votes
11 answers

hadoop fs -put command?

Hi, You can create one directory in HDFS ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
104,693 views
–1 vote
1 answer

Hadoop dfs -ls command?

In your case there is no difference ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by kurt_cobain
• 9,390 points
4,283 views
0 votes
1 answer

How to delete a directory from Hadoop cluster which is having comma(,) in its name?

Just try the following command: hadoop fs -rm ...READ MORE

answered May 7, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
2,790 views
+2 votes
1 answer

How to calculate Maximum salary of the employee with the name using the Map Reduce Technique

Please try the below code and it ...READ MORE

answered Jul 25, 2018 in Big Data Hadoop by Neha
• 6,300 points
5,259 views
webinar REGISTER FOR FREE WEBINAR X
REGISTER NOW
webinar_success Thank you for registering Join Edureka Meetup community for 100+ Free Webinars each month JOIN MEETUP GROUP