Tuesday, April 7, 2015

Uber mode in YARN (Hadoop2) - Running MapReduce jobs on small datasets

What is Uber mode in YARN - Hadoop2

You might have seen a line like this while running a MapReduce job in Hadoop2:

    mapreduce.Job: Job job_1387204213494_0005 running in uber mode : false
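If you want to check this flag from a script or a test rather than by eye, a small parser is enough. The following is only a sketch: the class name `UberModeLogCheck` and the method `parseUberFlag` are made up for illustration, and the pattern assumes the client log keeps the exact wording shown above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: extracts the uber flag from a client log line.
public class UberModeLogCheck {
    // Matches "Job <job id> running in uber mode : <true|false>".
    private static final Pattern UBER_LINE =
        Pattern.compile("Job (job_\\d+_\\d+) running in uber mode : (true|false)");

    /** Returns the uber flag if the line carries one, otherwise an empty Optional. */
    public static Optional<Boolean> parseUberFlag(String logLine) {
        Matcher m = UBER_LINE.matcher(logLine);
        return m.find() ? Optional.of(Boolean.parseBoolean(m.group(2))) : Optional.empty();
    }

    public static void main(String[] args) {
        String line = "mapreduce.Job: Job job_1387204213494_0005 running in uber mode : false";
        System.out.println(parseUberFlag(line)); // prints Optional[false]
    }
}
```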

What is Uber mode in Hadoop2?

    Normally, mappers and reducers run in containers allocated by the ResourceManager (RM); the RM creates a separate container for each mapper and reducer.
    The uber configuration allows mappers and reducers to run in the same process as the ApplicationMaster (AM).

Uber jobs:

    Uber jobs are jobs that are executed within the MapReduce ApplicationMaster, rather than having the AM communicate with the RM to create the mapper and reducer containers.
    The AM runs the map and reduce tasks within its own process and avoids the overhead of launching and communicating with remote containers.

Why?

    If you have a small dataset and want to run MapReduce on it, the uber configuration helps by cutting the extra time MapReduce normally spends launching separate containers for the mapper and reducer phases.

Can I configure uber mode for all MapReduce jobs?

    No. As of now, only the following are supported:
       map-only jobs
       jobs with one reducer
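The restriction above can be pictured as a small eligibility check. The property names below are the `mapreduce.job.ubertask.*` settings listed in mapred-default.xml. The fallback values (disabled, at most 9 maps, at most 1 reduce, input no larger than one HDFS block) are the defaults as I understand them, so verify them against your release. The class `UberEligibilitySketch` and its method are hypothetical and deliberately simplified: the real ApplicationMaster also looks at memory, CPU and chained jobs, and it reads a Hadoop `Configuration` rather than a plain `Map`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the uber decision; not the actual Hadoop code.
public class UberEligibilitySketch {
    // Property names as listed in mapred-default.xml.
    public static final String ENABLE = "mapreduce.job.ubertask.enable";
    public static final String MAX_MAPS = "mapreduce.job.ubertask.maxmaps";
    public static final String MAX_REDUCES = "mapreduce.job.ubertask.maxreduces";
    public static final String MAX_BYTES = "mapreduce.job.ubertask.maxbytes";

    /** Returns true when the job is small enough, ignoring memory/CPU/chain checks. */
    public static boolean qualifiesForUber(Map<String, String> conf,
                                           int numMaps, int numReduces,
                                           long inputBytes, long hdfsBlockSize) {
        // Fallback values are assumed defaults; uber mode is off unless enabled.
        boolean enabled = Boolean.parseBoolean(conf.getOrDefault(ENABLE, "false"));
        int maxMaps = Integer.parseInt(conf.getOrDefault(MAX_MAPS, "9"));
        int maxReduces = Integer.parseInt(conf.getOrDefault(MAX_REDUCES, "1"));
        long maxBytes = Long.parseLong(
            conf.getOrDefault(MAX_BYTES, Long.toString(hdfsBlockSize)));
        return enabled
            && numMaps <= maxMaps
            && numReduces <= maxReduces
            && inputBytes <= maxBytes;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(ENABLE, "true");
        long blockSize = 128L * 1024 * 1024; // assumed 128 MB block size

        // 3 maps, 1 reducer, about 10 MB of input: small enough.
        System.out.println(qualifiesForUber(conf, 3, 1, 10_000_000L, blockSize)); // prints true
        // Same job with 2 reducers: exceeds the reducer limit.
        System.out.println(qualifiesForUber(conf, 3, 2, 10_000_000L, blockSize)); // prints false
    }
}
```

Because the flag is off unless it is set, a job that never enables `mapreduce.job.ubertask.enable` would report "uber mode : false" no matter how small it is.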
