Designing Distributed Systems for Heterogeneity

Philip Brighten Godfrey

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2009-82
May 21, 2009

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-82.pdf

Modern distributed and networked systems are highly heterogeneous in many dimensions, including available bandwidth, processor speed, disk capacity, security, failure rate, and pattern of failures. The theme of this dissertation is that this heterogeneity can not only be handled, but rather should generally be viewed as an asset.

We begin by introducing a framework, the price of heterogeneity, to model the effect of heterogeneity in parallel and distributed systems. Our results in this framework show broad classes of systems in which heterogeneity cannot be a disadvantage. We then develop practical methods for distributed systems to adapt to and take advantage of heterogeneity. The $Y_0$ distributed hash table achieves improved load balance, route length, and congestion with low overhead in environments with heterogeneous node capacities, such as bandwidth or processing speed. Addressing heterogeneity in reliability, we show that randomization in node selection strategies typically reduces failure rates---a property that permits better understanding of subtle properties of existing systems, as well as the design of new systems. Finally, we study how to improve stability in the Internet's interdomain routing protocol, while carefully managing tradeoffs with network operators' perferred routes. These results show how both performance and reliability can be improved in heterogeneous environments.

Advisor: Ion Stoica


BibTeX citation:

@phdthesis{Godfrey:EECS-2009-82,
    Author = {Godfrey, Philip Brighten},
    Title = {Designing Distributed Systems for Heterogeneity},
    School = {EECS Department, University of California, Berkeley},
    Year = {2009},
    Month = {May},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-82.html},
    Number = {UCB/EECS-2009-82},
    Abstract = {Modern distributed and networked systems are highly heterogeneous in
many dimensions, including available bandwidth, processor speed, disk
capacity, security, failure rate, and pattern of failures. The theme of
this dissertation is that this heterogeneity can not only be handled,
but rather should generally be viewed as an asset.

We begin by introducing a framework, the price of heterogeneity, to
model the effect of heterogeneity in parallel and distributed systems.
Our results in this framework show broad classes of systems in which
heterogeneity cannot be a disadvantage. We then develop practical
methods for distributed systems to adapt to and take advantage of
heterogeneity.  The $Y_0$ distributed hash table achieves improved load
balance, route length, and congestion with low overhead in environments
with heterogeneous node capacities, such as bandwidth or processing
speed.  Addressing heterogeneity in reliability, we show that
randomization in  node selection strategies typically reduces failure
rates---a property that permits better understanding of subtle
properties of existing systems, as well as the design of new systems.
Finally, we study how to improve stability in the Internet's interdomain
routing protocol, while carefully managing tradeoffs with network
operators' perferred routes.  These results show how both performance
and reliability can be improved in heterogeneous environments.}
}

EndNote citation:

%0 Thesis
%A Godfrey, Philip Brighten
%T Designing Distributed Systems for Heterogeneity
%I EECS Department, University of California, Berkeley
%D 2009
%8 May 21
%@ UCB/EECS-2009-82
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-82.html
%F Godfrey:EECS-2009-82