Altera’s Stratix 10 makes Cortex-A53 an extreme performance player
Bernard Cole October 31, 2013
Sensing a market opportunity for ARM CPUs in extreme performance applications such as radar systems,
backbone network/communications and compute-intensive data centers,
Altera at ARM TechCon took the wraps off its Cortex-A53 based Stratix 10 FPGA family.
With this move, the FPGA powerhouse transforms the ARM architecture
from an outsider looking in at the very high performance market, dominated up to now
by Intel’s x86 and IBM’s Power Architecture, into one that must be considered
an extreme performance player of equal capabilities.
Altera has done this by using Intel’s own 14 nanometer Tri-Gate process based foundry service,
which it uses to build all of its high density FPGAs, and applying it to a core-based design (the Stratix 10)
that combines a 1.5 GHz quad-core 64-bit Cortex-A53 architecture with a 1 GHz programmable fabric.
According to Danny Biran, senior vice president, corporate strategy and marketing at Altera,
the Cortex-A53 is already one of the most power-efficient and compute-capable of ARM’s application-class processors.
“But when delivered on the 14 nm Tri-Gate process it will achieve more than six times more data
throughput compared to today’s highest performing SoC FPGAs,” he said.
The Cortex-A53 also delivers important features to the extreme performance segment of the market,
including virtualization support,
256 TB memory reach and error correction code (ECC) on L1 and L2 caches.
“Furthermore, the Cortex-A53 core can run in 32-bit mode,
which will run Cortex-A9 operating systems and code unmodified,” he said,
“allowing a smooth upgrade path from Altera’s 28 nm and 20 nm SoC FPGAs.”
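As a rough check on the 256 TB figure: it matches the size of a 48-bit byte-addressable space, although the article does not state the address width, so that link is an assumption. A minimal Python sketch of the arithmetic (the helper name is illustrative and does not come from Altera or ARM):

```python
# Binary size units, in bytes.
GIB = 1 << 30
TIB = 1 << 40


def addressable_bytes(address_bits: int) -> int:
    """Number of bytes reachable with a byte-addressed space of the given width."""
    return 1 << address_bits


# Assumption: the "256 TB memory reach" corresponds to 48 address bits.
print(addressable_bytes(48) // TIB)  # 256
# For comparison, a 32-bit address space.
print(addressable_bytes(32) // GIB)  # 4
```

By the same arithmetic, a 32-bit address space tops out at 4 GB, which is the kind of gap a 64-bit core is meant to close.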
The other ingredient in the secret sauce that Altera brings to the competition
in this previously Intel/IBM dominated market is a cleverly structured FPGA fabric
that allows developers to implement clean designs in separate functional layers as performance and capability require.
A logic layer is implemented in the 1 GHz programmable fabric,
which can be used to implement custom functions such as hardware accelerators
and gives designers access to the equivalent of four million 4-input lookup tables (LUTs),
built from LUTs that use six inputs. It would be used for functions such as deep packet inspection,
hardware acceleration, and special cryptographic engines.
Another layer is optimized for DSP and contains hardened floating-point DSP blocks
that are designed to deliver in excess of 10 teraflops of computational performance in the highest end devices.
Here, according to the company, the designer can implement DSP-based operations necessary
for floating-point computations, matrix manipulations, and waveform processing.
To aid developers in building applications,
Biran said the company has combined its own SoC Embedded Design Suite (EDS)
and an ARM Development Studio 5 (DS-5) kit optimized for Altera’s FPGA designs with OpenCL programming tools
for creating, in a high-level design language, the kind of software support
that such extreme performance heterogeneous designs need.
“With this combination of building blocks Altera Stratix 10 SoCs
will have a programmable-logic performance level of more than 1 GHz,”
he said, “twice the core performance of current high-end 28 nm FPGAs.”