Sagar Karandikar

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2018-154

December 1, 2018

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-154.pdf

We present FireSim, an open-source simulation platform that enables cycle-exact microarchitectural simulation of large scale-out clusters by combining FPGA-accelerated simulation of silicon-proven RTL designs with a scalable, distributed network simulation. Unlike prior FPGA-accelerated simulation tools, FireSim runs on Amazon EC2 F1, a public cloud FPGA platform, which greatly improves usability, provides elasticity, and lowers the cost of large-scale FPGA-based experiments. We describe the design and implementation of FireSim and show how it can provide sufficient performance to run modern applications at scale, to enable true hardware-software co-design. As an example, we demonstrate automatically generating and deploying a target cluster of 1,024 3.2 GHz quad-core server nodes, each with 16 GB of DRAM, interconnected by a 200 Gbit/s network with 2 microsecond latency, which simulates at a 3.4 MHz processor clock rate (less than 1,000x slowdown relative to real time). In aggregate, this FireSim instantiation simulates 4,096 cores and 16 TB of memory, runs 14 billion instructions per second, and harnesses 12.8 million dollars worth of FPGAs, at a total cost of only $100 per simulation hour to the user. We present several examples to show how FireSim can be used to explore various research directions in warehouse-scale machine design, including modeling networks with high bandwidth and low latency, integrating arbitrary RTL designs for a variety of commodity and specialized datacenter nodes, and modeling a variety of datacenter organizations, as well as reusing the scale-out FireSim infrastructure to enable fast, massively parallel cycle-exact single-node microarchitectural experimentation.
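
The aggregate figures quoted in the abstract follow from the per-node numbers by simple arithmetic. The sketch below is not part of FireSim; it only re-derives the totals from the values stated above. The function and variable names are illustrative, and the assumption that each simulated core retires roughly one instruction per cycle is ours, made only to relate the simulated clock rate to the quoted instruction throughput.

```python
# Per-node figures quoted in the abstract.
NODES = 1024
CORES_PER_NODE = 4
DRAM_PER_NODE_GB = 16
TARGET_CLOCK_HZ = 3.2e9   # clock rate of the modeled (target) servers
SIM_CLOCK_HZ = 3.4e6      # rate at which the simulation advances target cycles

# Assumption (not stated in the abstract): about one instruction per cycle per core.
ASSUMED_IPC = 1.0


def aggregate_figures():
    """Re-derive the cluster-wide totals from the per-node figures."""
    total_cores = NODES * CORES_PER_NODE
    total_dram_tb = NODES * DRAM_PER_NODE_GB / 1024  # 1 TB taken as 1,024 GB
    slowdown = TARGET_CLOCK_HZ / SIM_CLOCK_HZ
    simulated_cycles_per_s = total_cores * SIM_CLOCK_HZ
    simulated_instr_per_s = simulated_cycles_per_s * ASSUMED_IPC
    return {
        "total_cores": total_cores,
        "total_dram_tb": total_dram_tb,
        "slowdown": slowdown,
        "simulated_cycles_per_s": simulated_cycles_per_s,
        "simulated_instr_per_s": simulated_instr_per_s,
    }


if __name__ == "__main__":
    for name, value in aggregate_figures().items():
        print(f"{name}: {value:,.1f}")
```

Under these assumptions the script gives 4,096 cores, 16 TB of DRAM, a slowdown of roughly 941x (consistent with "less than 1,000x"), and about 13.9 billion simulated core-cycles per second, which is in line with the quoted 14 billion instructions per second.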

Advisors: Krste Asanović


BibTeX citation:

@mastersthesis{Karandikar:EECS-2018-154,
    Author= {Karandikar, Sagar},
    Editor= {Asanović, Krste and Katz, Randy H.},
    Title= {FireSim: FPGA-Accelerated Cycle-Exact Scale-Out System Simulation in the Public Cloud},
    School= {EECS Department, University of California, Berkeley},
    Year= {2018},
    Month= {Dec},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-154.html},
    Number= {UCB/EECS-2018-154},
    Abstract= {We present FireSim, an open-source simulation platform that enables cycle-exact microarchitectural simulation of large scale-out clusters by combining FPGA-accelerated simulation of silicon-proven RTL designs with a scalable, distributed network simulation. Unlike prior FPGA-accelerated simulation tools, FireSim runs on Amazon EC2 F1, a public cloud FPGA platform, which greatly improves usability, provides elasticity, and lowers the cost of large-scale FPGA-based experiments. We describe the design and implementation of FireSim and show how it can provide sufficient performance to run modern applications at scale, to enable true hardware-software co-design. As an example, we demonstrate automatically generating and deploying a target cluster of 1,024 3.2 GHz quad-core server nodes, each with 16 GB of DRAM, interconnected by a 200 Gbit/s network with 2 microsecond latency, which simulates at a 3.4 MHz processor clock rate (less than 1,000x slowdown relative to real time). In aggregate, this FireSim instantiation simulates 4,096 cores and 16 TB of memory, runs 14 billion instructions per second, and harnesses 12.8 million dollars worth of FPGAs, at a total cost of only $100 per simulation hour to the user. We present several examples to show how FireSim can be used to explore various research directions in warehouse-scale machine design, including modeling networks with high bandwidth and low latency, integrating arbitrary RTL designs for a variety of commodity and specialized datacenter nodes, and modeling a variety of datacenter organizations, as well as reusing the scale-out FireSim infrastructure to enable fast, massively parallel cycle-exact single-node microarchitectural experimentation.},
}

EndNote citation:

%0 Thesis
%A Karandikar, Sagar 
%E Asanović, Krste 
%E Katz, Randy H. 
%T FireSim: FPGA-Accelerated Cycle-Exact Scale-Out System Simulation in the Public Cloud
%I EECS Department, University of California, Berkeley
%D 2018
%8 December 1
%@ UCB/EECS-2018-154
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-154.html
%F Karandikar:EECS-2018-154