Single and Multi-CPU Performance Modeling for Embedded Systems
Trevor Conrad Meyerowitz
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2008-36
April 14, 2008
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2008/EECS-2008-36.pdf
The combination of increasing design complexity, increasing concurrency, growing heterogeneity, and shrinking time-to-market windows has caused a crisis for embedded system developers. To deal with this problem, dedicated hardware is being replaced by a growing number of microprocessors in these systems, making software a dominant factor in design time and cost. The use of higher-level models for design space exploration and early software development is critical. Much progress has been made on increasing the speed of cycle-level simulators for microprocessors, but they may still be too slow for large-scale systems and are too low-level (i.e., they require a detailed implementation) for effective design space exploration. Furthermore, constructing such optimized simulators is a significant task because the particularities of the hardware must be accounted for. As a result, these simulators are not very flexible.
This thesis focuses on modeling the performance of software executing on embedded processors in the context of a heterogeneous multi-processor system on chip in a more flexible and scalable manner than current approaches. We contend that such systems need to be modeled at a higher level of abstraction and, to ensure accuracy, the higher level must have a connection to lower levels. First, we describe different levels of abstraction for modeling such systems and how their speed and accuracy relate. Next, the high-level modeling of both individual processing elements and a bus-based microprocessor system is presented. Finally, an approach for automatically annotating timing information obtained from a cycle-level model back to the original application source code is developed. The annotated source code can then be simulated without the underlying architecture while still maintaining good timing accuracy. These methods are driven by execution traces produced by lower-level models and were developed for ARM microprocessors and MuSIC, a heterogeneous multiprocessor for Software Defined Radio from Infineon. The annotated source code executed between one and three orders of magnitude faster than equivalent cycle-level models, with good accuracy for most applications tested.
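To make the idea of trace-driven timing annotation concrete, the following is a minimal sketch and not the thesis's actual tool. It assumes a hypothetical trace format of (source location, cycles) pairs, and it assumes that each location is annotated with the average of its observed cycle counts; the names `build_annotations`, `TimedRun`, and `sum_array` are illustrative only. The annotated application then runs natively and accumulates the annotated delays, with no architecture model in the loop.

```python
from collections import defaultdict


def build_annotations(trace):
    """Average the cycles observed for each source location in a trace.

    `trace` is a list of (location, cycles) pairs: a hypothetical stand-in
    for the execution trace a cycle-level model might produce.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for location, cycles in trace:
        totals[location] += cycles
        counts[location] += 1
    return {loc: totals[loc] / counts[loc] for loc in totals}


class TimedRun:
    """Accumulates annotated delays while the application runs natively."""

    def __init__(self, annotations):
        self.annotations = annotations
        self.cycles = 0.0

    def delay(self, location):
        # Charge the annotated cost of one execution of this location.
        self.cycles += self.annotations[location]


def sum_array(values, timer):
    """Toy application with a delay call at each annotated location."""
    total = 0
    timer.delay("init")
    for v in values:
        total += v
        timer.delay("loop_body")
    return total


# Hypothetical trace: one "init" execution and three loop iterations.
trace = [("init", 2), ("loop_body", 4), ("loop_body", 6), ("loop_body", 5)]
annotations = build_annotations(trace)

timer = TimedRun(annotations)
result = sum_array([1, 2, 3, 4], timer)
print(result, timer.cycles)
```

Averaging is the simplest possible annotation policy and is chosen here only for brevity; it ignores data-dependent and history-dependent timing such as cache or pipeline state, which is one reason an approach like this may be less accurate for some applications.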
Advisor: Alberto L. Sangiovanni-Vincentelli
BibTeX citation:
@phdthesis{Meyerowitz:EECS-2008-36,
    Author = {Meyerowitz, Trevor Conrad},
    Title = {Single and Multi-CPU Performance Modeling for Embedded Systems},
    School = {EECS Department, University of California, Berkeley},
    Year = {2008},
    Month = {Apr},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2008/EECS-2008-36.html},
    Number = {UCB/EECS-2008-36},
    Abstract = {The combination of increasing design complexity, increasing concurrency, growing heterogeneity, and shrinking time-to-market windows has caused a crisis for embedded system developers. To deal with this problem, dedicated hardware is being replaced by a growing number of microprocessors in these systems, making software a dominant factor in design time and cost. The use of higher-level models for design space exploration and early software development is critical. Much progress has been made on increasing the speed of cycle-level simulators for microprocessors, but they may still be too slow for large-scale systems and are too low-level (i.e., they require a detailed implementation) for effective design space exploration. Furthermore, constructing such optimized simulators is a significant task because the particularities of the hardware must be accounted for. As a result, these simulators are not very flexible. This thesis focuses on modeling the performance of software executing on embedded processors in the context of a heterogeneous multi-processor system on chip in a more flexible and scalable manner than current approaches. We contend that such systems need to be modeled at a higher level of abstraction and, to ensure accuracy, the higher level must have a connection to lower levels. First, we describe different levels of abstraction for modeling such systems and how their speed and accuracy relate. Next, the high-level modeling of both individual processing elements and a bus-based microprocessor system is presented. Finally, an approach for automatically annotating timing information obtained from a cycle-level model back to the original application source code is developed. The annotated source code can then be simulated without the underlying architecture while still maintaining good timing accuracy. These methods are driven by execution traces produced by lower-level models and were developed for ARM microprocessors and MuSIC, a heterogeneous multiprocessor for Software Defined Radio from Infineon. The annotated source code executed between one and three orders of magnitude faster than equivalent cycle-level models, with good accuracy for most applications tested.}
}
EndNote citation:
%0 Thesis
%A Meyerowitz, Trevor Conrad
%T Single and Multi-CPU Performance Modeling for Embedded Systems
%I EECS Department, University of California, Berkeley
%D 2008
%8 April 14
%@ UCB/EECS-2008-36
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2008/EECS-2008-36.html
%F Meyerowitz:EECS-2008-36