Virtual prototyping methodology to boot Linux on the ARM Cortex-A15
Andy Meier – Carbon Design Systems 3/11/2013 10:29 AM EDT
SoC development teams worldwide have begun a steady move to a virtual prototype methodology to improve accuracy and accelerate the design process across a wide range of applications.
For those of you who aren’t familiar with using a virtual prototype, let’s start with a definition, then take a look at how an engineer recently used virtual prototyping to boot Linux on the ARM® Cortex™-A15.
Virtual prototypes are fast, functional software models of a system that can execute production code. With benefits ranging from software development to enabling architectural exploration and early functional verification using abstract models, their rising popularity is easy to understand.
Almost every virtual prototype deployment, though, suffers from a similar problem:
The virtual prototype either runs fast while sacrificing cycle accuracy, or it is cycle accurate but too slow for software development.
Some virtual prototypes attempt to solve this problem by giving up a bit of speed and a bit of accuracy to produce a “best of both worlds” system that claims the strengths of both with none of the downsides.
In practice, however, this pleases no one because it’s too slow for the software team and not accurate enough for architects and firmware engineers.
Fortunately, there’s a way to create a single virtual prototype that is both fast and accurate.
I recently worked with an engineer to help him boot Linux on a virtual prototype containing an ARM Cortex-A15.
In this case, he was developing a mobile application processor but the same steps apply to almost all complex SoC designs.
To get a true measure of the SoC’s performance, the engineer needed to run benchmarks on top of an operating system.
The benchmarks, all run on top of Linux, included Dhrystone, CoreMark and tiobench, a multi-threaded I/O benchmark used to measure file system performance. Running them served two primary purposes.
First, the results helped determine the relative performance of the device under test (DUT). Second, the benchmarks generated large amounts of representative system traffic to stress the system and identify optimization opportunities.
Each benchmark required a significant number of simulation cycles to complete in addition to the huge number of cycles required to simply boot the OS.
Because of this large number of required execution cycles, this type of use case is not typically attempted with traditional cycle-accurate prototypes.
Instead, engineers have opted for cycle-approximate models that can lead to inaccurate and unoptimized SoC designs.
Or, more often, they have skipped this optimization step entirely during the design phase and waited to run these benchmarks on hardware prototypes, when it was too late to make changes based on the results.
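To make the cycle-count argument concrete, here is a back-of-the-envelope sketch. Every number in it is a hypothetical placeholder chosen only for illustration, not a measurement from this project or from any particular tool, and the function name is likewise made up. It simply divides a cycle count by a simulation rate to estimate wall-clock time, and compares running everything on a cycle-accurate model against booting on a faster functional model and running only the benchmark cycle-accurately.

```python
def sim_seconds(cycles: int, cycles_per_second: int) -> float:
    """Wall-clock seconds needed to simulate `cycles` at a given simulation rate."""
    return cycles / cycles_per_second


# All four values below are hypothetical placeholders, not measurements.
# For simplicity, one instruction on the fast model is treated as one cycle.
BOOT_CYCLES = 5_000_000_000      # cycles needed to boot the OS
BENCHMARK_CYCLES = 200_000_000   # cycles needed for one benchmark run
FAST_RATE = 50_000_000           # simulated cycles per second, fast functional model
ACCURATE_RATE = 100_000          # simulated cycles per second, cycle-accurate model

# Option 1: boot and benchmark both run on the cycle-accurate model.
all_accurate = sim_seconds(BOOT_CYCLES + BENCHMARK_CYCLES, ACCURATE_RATE)

# Option 2: boot on the fast model, then run only the benchmark cycle-accurately.
hybrid = (sim_seconds(BOOT_CYCLES, FAST_RATE)
          + sim_seconds(BENCHMARK_CYCLES, ACCURATE_RATE))

print(f"all cycle-accurate:             {all_accurate / 3600:.1f} hours")
print(f"fast boot + accurate benchmark: {hybrid / 60:.1f} minutes")
```

With these assumed rates, the OS boot dominates the all-accurate run, which is why the boot phase is the natural candidate for a faster model. Substituting rates measured on a real setup would give a more meaningful comparison.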
Virtual prototypes in the design flow
Design teams don’t need to accept inaccuracy or wait until design freeze if they use a virtual prototype.
Software from Carbon Design Systems, for example, allows engineers to do advanced performance optimization by leveraging ARM Fast Models for speed and Carbon’s Swap & Play™ technology for 100% accuracy.
The integration with ARM Fast Models enables an engineer to increase simulation performance in selected components during periods when accuracy isn’t critical. Swap & Play then enables ARM Fast Model components to be swapped out in favor of their 100% accurate equivalents when accuracy is required, such as during benchmarking. Essentially, this means performance when it’s wanted and accuracy when it’s needed.
In the system illustration below, the engineer used the Cortex-A15 Linux Carbon Performance Analysis Kit (CPAK) to accelerate analysis, optimization and verification of the SoC’s performance.
The CPAK contains reference hardware and software designs along with analysis and debug software for the Cortex-A15 processor, giving him a way to immediately begin analyzing performance and power constraints.
Figure 1: The Cortex-A15 Linux CPAK was used to accelerate analysis.
After booting the Linux kernel provided with the CPAK, the engineer created a Swap & Play checkpoint corresponding to the start of the Dhrystone benchmark.
Instead of simply swapping over to cycle-accurate execution at that point, however, he continued running in the Fast Model-based system.
He used SoC Designer Plus’ built-in checkpoint manager to create a variety of additional checkpoints, each representing different benchmarks or interesting points of execution.
To obtain accurate results, he then loaded each of the checkpoints into the cycle-accurate implementation of the CPAK and completed the benchmark execution.
This enabled him to pinpoint certain areas of the benchmark for deeper analysis without needing to execute the entire benchmark in cycle-accurate mode.
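The checkpoint flow described above can be illustrated with a toy model. This is only a conceptual sketch and does not use the SoC Designer Plus, Swap & Play or ARM Fast Models APIs. The `ArchState` class, the three-field instruction tuples and both `run_*` functions are hypothetical names invented for the example. The idea it shows is that a fast functional model and a cycle-accurate model share the same architectural state, so a snapshot taken in one can be restored into the other.

```python
from copy import deepcopy
from dataclasses import dataclass


@dataclass
class ArchState:
    """Architectural state shared by both toy models (hypothetical)."""
    pc: int = 0
    acc: int = 0


# Toy program: (opcode, operand, cycle_cost). The first 1000 entries stand in
# for the OS boot; the last three stand in for the benchmark of interest.
PROGRAM = [("add", 1, 1)] * 1000 + [("mul", 3, 4), ("add", 2, 1), ("mul", 2, 4)]
BENCHMARK_START = 1000


def step(state: ArchState, instr: tuple) -> None:
    """Apply the functional effect of one instruction."""
    op, operand, _cost = instr
    if op == "add":
        state.acc += operand
    elif op == "mul":
        state.acc *= operand
    state.pc += 1


def run_fast(state: ArchState, until_pc: int) -> None:
    """Functional execution only: no timing information is tracked."""
    while state.pc < until_pc:
        step(state, PROGRAM[state.pc])


def run_cycle_accurate(state: ArchState, until_pc: int) -> int:
    """Same functional behavior, but also accumulates cycle counts."""
    cycles = 0
    while state.pc < until_pc:
        instr = PROGRAM[state.pc]
        step(state, instr)
        cycles += instr[2]
    return cycles


# 1. "Boot" in fast mode and save a checkpoint at the start of the benchmark.
state = ArchState()
run_fast(state, BENCHMARK_START)
checkpoints = {"benchmark_start": deepcopy(state)}

# 2. Restore the checkpoint into the cycle-accurate model and finish the run.
restored = deepcopy(checkpoints["benchmark_start"])
benchmark_cycles = run_cycle_accurate(restored, len(PROGRAM))

print(f"benchmark cycles measured after restore: {benchmark_cycles}")
```

Because the checkpoint is copied before it is restored, the same snapshot can be reloaded repeatedly, which mirrors the way several checkpoints were saved in fast mode and then examined one at a time in the cycle-accurate system.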
The screenshot below gives a small sample of the system profiling statistics that can be gathered while running the benchmark.
Figure 2: The virtual prototype is tracking several hardware events and statistics while running a benchmark on top of an operating system.
Benefits of a virtual prototype
Take another look at Figure 2. Yes, those are actual hardware events and statistics gathered while running a benchmark on top of an operating system with a virtual prototype. What’s displayed here is only a small sampling of the statistics that can be viewed.
For example, synchronized windows can be used to display a number of hardware and software performance metrics.
SoC development teams have discovered that virtual prototypes eliminate the need for them to configure a hardware prototype of the system for this level of analysis.
Furthermore, Swap & Play and accurate software can help ensure correct architectural tradeoffs and an optimized system.
A hardware prototype can be a reliable gauge, but it may impact the project schedule if the development team needs to re-validate and verify an architectural change.
This could mean time-to-market delays and loss of revenue.
Of course, the engineer could have opted to over-engineer the chip, but he quickly ruled this out because over-engineering can lead to increased chip size and extra power consumption, neither of which is acceptable in any processor market segment.
The SoC development team recently implemented the virtual prototype methodology to boot Linux on the ARM Cortex-A15 and found that it solved several intractable performance and software problems that previously would have required expensive hardware prototype solutions.
That alone should build a solid case for bringing a virtual prototype methodology into any design environment.
About the author
Andy Meier is manager of application engineering at Carbon Design Systems in Acton, Mass.
Before being promoted into his current position, he served as a Carbon Design Systems corporate applications engineer.
Previously, he worked as a senior verification engineer at SiCortex and a verification engineer for Mindspeed Technologies. Meier holds a Bachelor of Science degree in Electrical and Computer Engineering from Worcester Polytechnic Institute in Worcester, Mass.