Arka Bhattacharya and David E. Culler

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2014-29

April 15, 2014

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-29.pdf

The proliferation of large High-Performance Computing (HPC) clusters executing computation-intensive jobs on large data sets has made cluster power proportionality very important. Publicly available traces show that many clusters have low average utilization, yet existing power-proportionality techniques have seen low adoption. A major reason is that these techniques require modifications to the existing cluster software and network stack and do not address the reliability concerns that may arise during server power-cycling.

We present Hypnos, a defensive power-proportionality system that is unobtrusive and extensible, and that gracefully handles the server software and hardware failures that may occur during server power-cycling. We deployed Hypnos on a 57-server production cluster. From a 21-day run, we obtained a 36% energy saving despite multiple server and network failures.


BibTeX citation:

@techreport{Bhattacharya:EECS-2014-29,
    Author= {Bhattacharya, Arka and Culler, David E.},
    Title= {Hypnos: Unobtrusive Power Proportionality for HPC frameworks},
    Year= {2014},
    Month= {Apr},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-29.html},
    Number= {UCB/EECS-2014-29},
    Abstract= {The proliferation of large High-Performance Computing (HPC) clusters executing computation-intensive jobs on large data sets has made cluster power proportionality very important. Publicly available traces show that many clusters have low average utilization, yet existing power-proportionality techniques have seen low adoption. A major reason is that these techniques require modifications to the existing cluster software and network stack and do not address the reliability concerns that may arise during server power-cycling.

We present Hypnos, a defensive power-proportionality system that is unobtrusive and extensible, and that gracefully handles the server software and hardware failures that may occur during server power-cycling. We deployed Hypnos on a 57-server production cluster. From a 21-day run, we obtained a 36% energy saving despite multiple server and network failures.},
}

EndNote citation:

%0 Report
%A Bhattacharya, Arka 
%A Culler, David E. 
%T Hypnos: Unobtrusive Power Proportionality for HPC frameworks
%I EECS Department, University of California, Berkeley
%D 2014
%8 April 15
%@ UCB/EECS-2014-29
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-29.html
%F Bhattacharya:EECS-2014-29