Hadoop: the difference between a Reduce task and a Reducer

“The reducer is different from the reduce task. The reducer can run multiple reduce tasks.” Can someone explain this with the following example?

foo.txt: Very good, this is the foo file
bar.txt: This is the bar file

I am using 2 reducers. What are the reduce tasks here, and on what basis are multiple reduce tasks created from the reducer?

Reducer is a class that contains the reduce function, shown below:

protected void reduce(KEYIN key, Iterable<VALUEIN> values, Context context)
        throws IOException, InterruptedException {
    // per-key reduce logic goes here
}

A Reduce task is a program running on a node. It executes the reduce function of the Reducer class.

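You can think of a Reduce task as a running instance of the Reducer: each Reduce task creates its own Reducer object and calls reduce once for every key assigned to it, so a single Reducer class can back several Reduce tasks.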
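To make the class-versus-task distinction concrete, here is a minimal plain-Java sketch of a word count over the two files from the question, split across 2 reduce tasks. It does not use the Hadoop API: `ReduceTaskSketch`, `SumReducer`, `partitionFor`, and `run` are hypothetical names invented for illustration, and the partitioning only mimics the hash-modulo scheme of Hadoop's default partitioner (a real job hashes the `Text` key, so actual assignments may differ).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ReduceTaskSketch {

    // Hypothetical stand-in for a Reducer subclass: it only holds the reduce logic.
    static class SumReducer {
        int reduce(String key, Iterable<Integer> values) {
            int sum = 0;
            for (int v : values) {
                sum += v;
            }
            return sum;
        }
    }

    // Assumption: hash-modulo partitioning, so a given key always lands in the same task.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Simulates map + shuffle + reduce; returns one output map per reduce task.
    static List<Map<String, Integer>> run(List<String> lines, int numReduceTasks) {
        // Shuffle: one bucket per reduce task, values grouped by key.
        List<Map<String, List<Integer>>> buckets = new ArrayList<>();
        for (int i = 0; i < numReduceTasks; i++) {
            buckets.add(new TreeMap<>());
        }
        for (String line : lines) {
            for (String word : line.toLowerCase().split("\\W+")) {
                if (word.isEmpty()) {
                    continue;
                }
                buckets.get(partitionFor(word, numReduceTasks))
                       .computeIfAbsent(word, k -> new ArrayList<>())
                       .add(1);
            }
        }

        // Reduce: each task gets its own SumReducer instance and
        // calls reduce once per key in its bucket.
        List<Map<String, Integer>> outputs = new ArrayList<>();
        for (Map<String, List<Integer>> bucket : buckets) {
            SumReducer reducer = new SumReducer();
            Map<String, Integer> out = new TreeMap<>();
            for (Map.Entry<String, List<Integer>> e : bucket.entrySet()) {
                out.put(e.getKey(), reducer.reduce(e.getKey(), e.getValue()));
            }
            outputs.add(out);
        }
        return outputs;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "Very good, this is the foo file",   // foo.txt
                "This is the bar file");             // bar.txt
        List<Map<String, Integer>> outputs = run(lines, 2);
        for (int i = 0; i < outputs.size(); i++) {
            System.out.println("reduce task " + i + ": " + outputs.get(i));
        }
    }
}
```

There is one `SumReducer` class but two reduce tasks, each with its own instance and its own disjoint set of keys. In a real job the number of reduce tasks is a job setting (for example `job.setNumReduceTasks(2)`), not something the Reducer class decides.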

For more details, see the Payload section of the Apache MapReduce tutorial.
