java - Use of numofreducers in map reduce
I have a simple doubt about MapReduce.
Why do we set the number of reducers in the MapReduce driver class? If it is not set, the default value is 1. If it is set to 100, then 100 reduce tasks run. What is the advantage of that? Does it reduce the load on a single node (with 1 reduce task, the task runs on just one node)? Are there any other advantages?
Thanks for the help.
The right number of reduces seems to be:
0.95 or 1.75 multiplied by (<no. of nodes> * <no. of maximum containers per node>).
With 0.95, all of the reduces can launch immediately and start transferring map outputs as the maps finish. With 1.75, the faster nodes will finish their first round of reduces and launch a second wave of reduces, doing a much better job of load balancing.
Increasing the number of reduces increases the framework overhead, but it also improves load balancing and lowers the cost of failures.
The scaling factors above are slightly less than whole numbers to reserve a few reduce slots in the framework for speculative tasks and failed tasks.
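As a rough illustration of the rule of thumb above, here is a minimal Java sketch. The class name `ReducerCount`, the method `recommended`, and the cluster figures (10 nodes, 8 containers per node) are made up for the example, and rounding to the nearest whole number is my own assumption. In a real driver the result would be passed to `job.setNumReduceTasks(...)`.

```java
// Hypothetical helper, not part of Hadoop: applies the 0.95 / 1.75 rule of thumb.
class ReducerCount {

    // factor is expected to be 0.95 (single wave) or 1.75 (two waves).
    static int recommended(int nodes, int maxContainersPerNode, double factor) {
        int totalSlots = nodes * maxContainersPerNode;
        // Rounding to the nearest integer is an assumption; never return less than 1.
        long rounded = Math.round(factor * totalSlots);
        return (int) Math.max(1L, rounded);
    }

    public static void main(String[] args) {
        int nodes = 10;                // assumed cluster size
        int maxContainersPerNode = 8;  // assumed containers per node

        int singleWave = recommended(nodes, maxContainersPerNode, 0.95);
        int twoWaves = recommended(nodes, maxContainersPerNode, 1.75);

        System.out.println("0.95 factor: " + singleWave + " reduces");
        System.out.println("1.75 factor: " + twoWaves + " reduces");

        // In the driver class you would then call, for example:
        // job.setNumReduceTasks(singleWave);
    }
}
```

With these assumed figures there are 80 containers in total, so the sketch gives 76 reduces for the 0.95 factor and 140 for the 1.75 factor.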
So the main advantages are load balancing and running the reduce tasks in parallel across the cluster.
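To see why more reducers spread the work, it may help to look at how keys are assigned to reduce tasks. The sketch below is plain Java that mimics the formula used by Hadoop's default `HashPartitioner`, `(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks`. The class name `PartitionSketch`, its methods, and the sample keys are made up for illustration, and it uses `String` keys rather than Hadoop `Writable` types.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone sketch; it does not use any Hadoop classes.
class PartitionSketch {

    // Same arithmetic as Hadoop's default HashPartitioner, applied to a String key.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Counts how many distinct keys each reduce task would receive.
    static int[] keysPerReducer(List<String> keys, int numReduceTasks) {
        int[] counts = new int[numReduceTasks];
        for (String key : keys) {
            counts[partitionFor(key, numReduceTasks)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample keys are invented for the example.
        List<String> keys = Arrays.asList(
                "apple", "banana", "cherry", "date", "elderberry", "fig", "grape", "honeydew");

        // With 1 reducer every key lands in partition 0, so one task does all the work.
        System.out.println("1 reducer:  " + Arrays.toString(keysPerReducer(keys, 1)));

        // With 4 reducers the keys are split across up to 4 tasks that can run in parallel.
        System.out.println("4 reducers: " + Arrays.toString(keysPerReducer(keys, 4)));
    }
}
```

How evenly the keys spread depends on their hash codes, so a skewed key set can still overload one reducer even when many are configured.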