How to execute a combiner and partitioner program without adding a package name

Hi. I am new to Hadoop and am currently learning about combining and partitioning. I was wondering whether it is possible to execute a combiner and partitioner program without declaring a package. If yes, then how?
Jul 16, 2019 in Big Data Hadoop by Ritu

1 answer to this question.


Yes, it is possible to do so without adding a package declaration: the class simply lives in Java's default package. I have shared an example program below for reference:

import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Partitioner;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

// Note: there is no package declaration, so the class is in the default package.
public class WithPartitioner {

 public static class Map extends MapReduceBase implements
   Mapper<LongWritable, Text, Text, IntWritable> {

  @Override
  public void map(LongWritable key, Text value,
    OutputCollector<Text, IntWritable> output, Reporter reporter)
    throws IOException {

   String line = value.toString();
   StringTokenizer tokenizer = new StringTokenizer(line);

   // For an input line "I am fine I am fine" the mapper emits
   // (I,1) (am,1) (fine,1) (I,1) (am,1) (fine,1); the framework
   // then groups the values by key, e.g. I -> (1,1).
   while (tokenizer.hasMoreTokens()) {
    value.set(tokenizer.nextToken());
    output.collect(value, new IntWritable(1));
   }
  }
 }

 // Output types of the Mapper must match the type arguments of the Partitioner.
 public static class MyPartitioner implements Partitioner<Text, IntWritable> {

  @Override
  public int getPartition(Text key, IntWritable value, int numPartitions) {

   String myKey = key.toString().toLowerCase();

   if (myKey.equals("hadoop")) {
    return 0;
   }
   if (myKey.equals("data")) {
    return 1;
   } else {
    return 2;
   }
  }

  @Override
  public void configure(JobConf arg0) {
   // Receives the job's JobConf in case you want to read job configuration.
  }
 }

 public static class Reduce extends MapReduceBase implements
   Reducer<Text, IntWritable, Text, IntWritable> {

  @Override
  public void reduce(Text key, Iterator<IntWritable> values,
    OutputCollector<Text, IntWritable> output, Reporter reporter)
    throws IOException {

   int sum = 0;
   while (values.hasNext()) {
    sum += values.next().get();
   }

   // Emits e.g. (beer, 3).
   output.collect(key, new IntWritable(sum));
  }
 }

 public static void main(String[] args) throws Exception {

  JobConf conf = new JobConf(WithPartitioner.class);
  conf.setJobName("wordcount");

  // Force the job to run 3 reducers, one per partition returned above.
  conf.setNumReduceTasks(3);

  conf.setMapperClass(Map.class);
  conf.setCombinerClass(Reduce.class);
  conf.setReducerClass(Reduce.class);
  conf.setPartitionerClass(MyPartitioner.class);

  conf.setOutputKeyClass(Text.class);
  conf.setOutputValueClass(IntWritable.class);

  conf.setInputFormat(TextInputFormat.class);
  conf.setOutputFormat(TextOutputFormat.class);

  FileInputFormat.setInputPaths(conf, new Path(args[0]));
  FileOutputFormat.setOutputPath(conf, new Path(args[1]));
  JobClient.runJob(conf);
 }
}
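To compile and run it without a package name, a sketch like the following should work (the jar name, input path `/input`, and output path `/output` are placeholders, not from the original answer):

```shell
# Compile against the Hadoop classpath; with no package declaration,
# the .class files land in the current directory (the default package).
javac -classpath "$(hadoop classpath)" WithPartitioner.java

# Bundle the classes (including the inner classes) into a jar.
jar cf withpartitioner.jar WithPartitioner*.class

# Run the job; because the class is in the default package, you pass
# the bare class name instead of a dotted package path.
hadoop jar withpartitioner.jar WithPartitioner /input /output
```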
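To see why the job above needs exactly 3 reducers, here is a minimal standalone sketch of the same routing logic as MyPartitioner.getPartition (the class name PartitionDemo is hypothetical and needs no Hadoop installation):

```java
public class PartitionDemo {

    // Mirrors MyPartitioner.getPartition: "hadoop" goes to reducer 0,
    // "data" to reducer 1, and every other word to reducer 2.
    static int getPartition(String key) {
        String myKey = key.toLowerCase();
        if (myKey.equals("hadoop")) {
            return 0;
        }
        if (myKey.equals("data")) {
            return 1;
        }
        return 2;
    }

    public static void main(String[] args) {
        for (String word : new String[] {"Hadoop", "data", "fine"}) {
            // Prints: Hadoop -> reducer 0, data -> reducer 1, fine -> reducer 2
            System.out.println(word + " -> reducer " + getPartition(word));
        }
    }
}
```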
answered Jul 16, 2019 by Raman
