Next Generation Datacenter Architecture

Xiang Gao

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2020-30

May 1, 2020

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2020/EECS-2020-30.pdf

Modern datacenters are the foundation of large-scale Internet services, such as search engines, cloud computing, and social networks. In this thesis, I will investigate the new challenges in building and managing large-scale datacenters. Specifically, I will show trends and challenges in the software stack, the hardware stack, and the network stack of modern datacenters, and propose new approaches to cope with these challenges.

In the software stack, there is a movement toward serverless computing, where cloud customers write short-lived functions to run their workloads in the cloud without the hassle of managing servers. Yet the storage stack has not changed to accommodate serverless workloads. We build a storage system called Savanna that significantly improves the performance of serverless applications. In the hardware stack, with the end of Moore's Law, researchers are proposing disaggregated datacenters with pools of standalone resource blades. These resource blades are directly connected by the network fabric, and I will present the requirements for building such a network fabric. Lastly, in the network stack, new congestion control algorithms (e.g., pFabric) have been proposed to reduce flow completion time. However, these algorithms require specialized hardware to achieve desirable performance. I designed pHost, a simple end-host-based congestion control scheme that achieves performance comparable to pFabric without any specialized hardware support.

Advisors: Scott Shenker and Sylvia Ratnasamy


BibTeX citation:

@phdthesis{Gao:EECS-2020-30,
    Author= {Gao, Xiang},
    Title= {Next Generation Datacenter Architecture},
    School= {EECS Department, University of California, Berkeley},
    Year= {2020},
    Month= {May},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2020/EECS-2020-30.html},
    Number= {UCB/EECS-2020-30},
    Abstract= {
Modern datacenters are the foundation of large-scale Internet services, such as search engines, cloud computing, and social networks. In this thesis, I will investigate the new challenges in building and managing large-scale datacenters. Specifically, I will show trends and challenges in the software stack, the hardware stack, and the network stack of modern datacenters, and propose new approaches to cope with these challenges.

In the software stack, there is a movement toward serverless computing, where cloud customers write short-lived functions to run their workloads in the cloud without the hassle of managing servers. Yet the storage stack has not changed to accommodate serverless workloads. We build a storage system called Savanna that significantly improves the performance of serverless applications. In the hardware stack, with the end of Moore's Law, researchers are proposing disaggregated datacenters with pools of standalone resource blades. These resource blades are directly connected by the network fabric, and I will present the requirements for building such a network fabric. Lastly, in the network stack, new congestion control algorithms (e.g., pFabric) have been proposed to reduce flow completion time. However, these algorithms require specialized hardware to achieve desirable performance. I designed pHost, a simple end-host-based congestion control scheme that achieves performance comparable to pFabric without any specialized hardware support.},
}

EndNote citation:

%0 Thesis
%A Gao, Xiang 
%T Next Generation Datacenter Architecture
%I EECS Department, University of California, Berkeley
%D 2020
%8 May 1
%@ UCB/EECS-2020-30
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2020/EECS-2020-30.html
%F Gao:EECS-2020-30