Job Scheduling for Multi-User MapReduce Clusters
Matei Zaharia and Dhruba Borthakur and Joydeep Sen Sarma and Khaled Elmeleegy and Scott Shenker and Ion Stoica
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2009-55
April 30, 2009
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-55.pdf
Sharing a MapReduce cluster between users is attractive because it enables statistical multiplexing (lowering costs) and allows users to share a common large data set. However, we find that traditional scheduling algorithms can perform very poorly in MapReduce due to two aspects of the MapReduce setting: the need for data locality (running computation where the data is) and the dependence between map and reduce tasks. We illustrate these problems through our experience designing a fair scheduler for MapReduce at Facebook, which runs a 600-node multi-user data warehouse on Hadoop. We developed two simple techniques, delay scheduling and copy-compute splitting, which improve throughput and response times by factors of 2 to 10. Although we focus on multi-user workloads, our techniques can also raise throughput in a single-user, FIFO workload by a factor of 2.
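The delay-scheduling idea summarized in the abstract (wait a bounded number of scheduling opportunities for a data-local slot before accepting a non-local one) can be sketched as follows. This is a minimal illustrative sketch, not the paper's Hadoop implementation; the class names, the `skip_count` bookkeeping, and the `max_skips` threshold are all assumptions introduced here.

```python
# Illustrative sketch of delay scheduling (hypothetical names, not the
# paper's actual Hadoop Fair Scheduler code).
from dataclasses import dataclass, field

@dataclass
class Task:
    # Nodes that hold this task's input data (data-local nodes).
    preferred_nodes: set

@dataclass
class Job:
    pending_tasks: list
    # How many scheduling opportunities this job has skipped while
    # waiting for a data-local slot.
    skip_count: int = 0

def delay_schedule(jobs, free_node, max_skips):
    """Pick a task for free_node, preferring data-local tasks.

    jobs is assumed to be ordered by fair-sharing priority. A job with
    no local task on free_node is skipped, up to max_skips times, before
    it is allowed to launch a non-local task.
    """
    for job in jobs:
        local = [t for t in job.pending_tasks
                 if free_node in t.preferred_nodes]
        if local:
            job.skip_count = 0          # locality achieved; reset the delay
            job.pending_tasks.remove(local[0])
            return local[0]
        # No local task here: let the job wait, but only for a bounded
        # number of scheduling opportunities.
        job.skip_count += 1
        if job.skip_count >= max_skips and job.pending_tasks:
            job.skip_count = 0
            task = job.pending_tasks.pop(0)
            return task                 # give up on locality this round
    return None  # no job had a runnable task
```

Under this sketch, a job at the head of the queue briefly yields its turn when the free node holds none of its data, which is the source of the throughput gains the abstract reports.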
BibTeX citation:
@techreport{Zaharia:EECS-2009-55,
    Author = {Zaharia, Matei and Borthakur, Dhruba and Sen Sarma, Joydeep and Elmeleegy, Khaled and Shenker, Scott and Stoica, Ion},
    Title = {Job Scheduling for Multi-User MapReduce Clusters},
    Year = {2009},
    Month = {Apr},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-55.html},
    Number = {UCB/EECS-2009-55},
    Abstract = {Sharing a MapReduce cluster between users is attractive because it enables statistical multiplexing (lowering costs) and allows users to share a common large data set. However, we find that traditional scheduling algorithms can perform very poorly in MapReduce due to two aspects of the MapReduce setting: the need for data locality (running computation where the data is) and the dependence between map and reduce tasks. We illustrate these problems through our experience designing a fair scheduler for MapReduce at Facebook, which runs a 600-node multi-user data warehouse on Hadoop. We developed two simple techniques, delay scheduling and copy-compute splitting, which improve throughput and response times by factors of 2 to 10. Although we focus on multi-user workloads, our techniques can also raise throughput in a single-user, FIFO workload by a factor of 2.}
}
EndNote citation:
%0 Report
%A Zaharia, Matei
%A Borthakur, Dhruba
%A Sen Sarma, Joydeep
%A Elmeleegy, Khaled
%A Shenker, Scott
%A Stoica, Ion
%T Job Scheduling for Multi-User MapReduce Clusters
%I EECS Department, University of California, Berkeley
%D 2009
%8 April 30
%@ UCB/EECS-2009-55
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-55.html
%F Zaharia:EECS-2009-55