Can one implement the combiner and reducer separately?

Question

I was looking at a few Hadoop projects on the internet and on GitHub. In many of those programs, people have used the reducer as the combiner. Is that just because it suits the use case, or is it the only way to implement a combiner? In other words, can I implement the combiner and the reducer separately in a MapReduce program?

Ashish · Answer 1 · Apr 10, 2018

Yes, you can write a combiner that is separate from the reducer, but you will still implement it through the reducer interface. It is important to understand where a combiner should be used. Its primary goal is to minimize the number of key/value pairs that are shuffled across the network between mappers and reducers, and so to save as much bandwidth as possible.

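You can think of the combiner as a mini reducer that may be called several times during the map phase in order to shrink the set of key/value pairs that will eventually be sent to the reducer. This is why a combiner must implement the Reducer interface (or extend the Reducer class). Since Hadoop may run it zero, one, or many times, its input and output key/value types must both match the mapper's output types, and the final result must not depend on whether it ran.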
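To make the "separate combiner" case concrete, here is a minimal sketch in plain Java, with no Hadoop classes, that simulates computing an average per key. The class and method names (CombinerSketch, SumCount, map, combine, reduce) and the sample readings are made up for illustration. The combiner only merges partial (sum, count) pairs, while the reducer also does the final division, so the two have to be different pieces of code.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java simulation of the idea; none of this is Hadoop API.
class CombinerSketch {

    // Intermediate value: a partial sum and how many readings it covers.
    static final class SumCount {
        final long sum;
        final long count;
        SumCount(long sum, long count) { this.sum = sum; this.count = count; }
    }

    // Map side: each raw reading becomes the pair (reading, 1).
    static SumCount map(long reading) {
        return new SumCount(reading, 1);
    }

    // Combiner: merges partial pairs. Input and output have the same type,
    // so running it zero, one, or many times does not change the final answer.
    static SumCount combine(List<SumCount> values) {
        long sum = 0;
        long count = 0;
        for (SumCount v : values) {
            sum += v.sum;
            count += v.count;
        }
        return new SumCount(sum, count);
    }

    // Reducer: merges the pairs it receives and then divides.
    // The division is the step a combiner must not perform.
    static double reduce(List<SumCount> values) {
        SumCount total = combine(values);
        return (double) total.sum / total.count;
    }

    public static void main(String[] args) {
        // Two hypothetical map tasks holding readings for the same key.
        List<SumCount> task1 = List.of(map(10), map(20), map(30));
        List<SumCount> task2 = List.of(map(40));

        // With the combiner, one pair per map task is shuffled.
        double withCombiner = reduce(List.of(combine(task1), combine(task2)));

        // Without it, all four pairs are shuffled.
        List<SumCount> all = new ArrayList<>(task1);
        all.addAll(task2);
        double withoutCombiner = reduce(all);

        System.out.println(withCombiner + " " + withoutCombiner); // prints "25.0 25.0"
    }
}
```

If the averaging reducer were reused as the combiner here, the job would average the per-task averages (20 and 40) and report 30 instead of 25. In a real Hadoop job you would write the two as separate Reducer subclasses and register them with Job.setCombinerClass and Job.setReducerClass. For operations like sum, count, or max, where merging partial results and producing the final result are the same step, reusing the reducer as the combiner is fine, which is probably why you see it so often.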