Tuesday, April 7, 2015

Configure UBER mode - MapReduce job for small dataset

Uber job configuration in YARN - Hadoop2
previous post - what is Uber mode

How to configure an uber job?

    To enable uber jobs, set the following properties in mapred-site.xml (the usual home for mapreduce.* properties).
    mapreduce.job.ubertask.enable=true
    mapreduce.job.ubertask.maxmaps=9 (default 9)
    mapreduce.job.ubertask.maxreduces=0 (default 1)
    mapreduce.job.ubertask.maxbytes=4194304 (4 MB, the value is in bytes)
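
Hadoop site files are XML, so the key=value lines above have to be written as property elements. A minimal sketch, assuming the settings go in mapred-site.xml and that the intended byte limit is 4 MB (4194304 bytes); adjust the values to your cluster:

```xml
<!-- Fragment for mapred-site.xml (assumed location); place inside <configuration> -->
<property>
  <name>mapreduce.job.ubertask.enable</name>
  <value>true</value>
</property>
<property>
  <name>mapreduce.job.ubertask.maxmaps</name>
  <value>9</value>
</property>
<property>
  <name>mapreduce.job.ubertask.maxreduces</name>
  <value>0</value>
</property>
<property>
  <!-- Value is in bytes: 4 * 1024 * 1024 = 4194304 (4 MB) -->
  <name>mapreduce.job.ubertask.maxbytes</name>
  <value>4194304</value>
</property>
```

The same properties can also be passed per job on the command line with -D (for example -Dmapreduce.job.ubertask.enable=true), provided the job driver uses ToolRunner/GenericOptionsParser.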

mapreduce.job.ubertask.maxbytes

    The value is given in bytes, so a 4 MB limit is written as 4194304. The total input size of a job must be less than or equal to this value for the job to be uberized.
    Ex. if your dataset is 5 MB but you have set mapreduce.job.ubertask.maxbytes to 4 MB, the job will not run in uber mode.
    If you omit this property, it defaults to the HDFS block size (dfs.blocksize, 128 MB by default in Hadoop 2). With that default, a dataset larger than one block, say 200 MB, will not run in uber mode.
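
The threshold rules described here can be sketched as a small standalone check. This is not Hadoop's own code: the class name UberCheck and the method isUberEligible are hypothetical, and the sketch covers only the enable flag and the three limits discussed in this post. The real decision inside the MapReduce ApplicationMaster also looks at memory, CPU, and whether the job is a chain job.

```java
// Hypothetical helper, for illustration only; not part of the Hadoop API.
class UberCheck {

    // A job qualifies (under the simplified rules of this post) when uber mode
    // is enabled and every count/size is less than or equal to its limit.
    static boolean isUberEligible(boolean enabled,
                                  int numMaps, int numReduces, long inputBytes,
                                  int maxMaps, int maxReduces, long maxBytes) {
        return enabled
                && numMaps <= maxMaps
                && numReduces <= maxReduces
                && inputBytes <= maxBytes;
    }

    public static void main(String[] args) {
        final long MB = 1024L * 1024L;

        // Limits from the example configuration: 9 maps, 0 reduces, 4 MB.
        int maxMaps = 9;
        int maxReduces = 0;
        long maxBytes = 4 * MB;

        // 5 MB input exceeds the 4 MB limit, so the job is not uberized.
        System.out.println(isUberEligible(true, 1, 0, 5 * MB, maxMaps, maxReduces, maxBytes)); // false

        // 3 MB input, 1 map, 0 reduces: within every limit.
        System.out.println(isUberEligible(true, 1, 0, 3 * MB, maxMaps, maxReduces, maxBytes)); // true
    }
}
```

Note that with maxreduces=0, as in the example configuration, only map-only jobs pass this check.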
