Reconfigurable Systems: Past and Next 10 Years

A Reconfigurable System is a standard computer tightly coupled to a Programmable Active Memory (PAM) through a high-bandwidth link. The PAM is a coprocessor based on Field Programmable Gate Arrays (FPGA) and RAM; through software configuration, it can emulate any specific custom hardware, within size limits.

Reconfigurable Systems combine the advantages of software programming with the performance level of application-specific dedicated hardware. As a case in point, consider the DECPeRLe1 system. It is built from chips available in 1992: RAM, FPGA and processor. Yet, five years and two silicon technology generations later, DECPeRLe1 still holds over 10 significant absolute speed records. These include RSA cryptography, where the competition comes from custom VLSI, and applications from high-energy physics, where the competition comes from massively parallel supercomputers.

The key to such an unusual speed advantage is that FPGA architectures have reached a higher computational density (in bit operations per second per square micron) than microprocessors: over two orders of magnitude for our 1992 technology and specific applications. We argue that the computational density ratio between FPGAs and microprocessors keeps increasing as technology keeps shrinking. In 1998, memory chips are 16 times denser than in 1992. Processor chips now deliver 16 times more operations per unit time and area. The computational density of an FPGA is 30 times higher in 1998 than in 1992.
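
The arithmetic behind this claim can be sketched in a few lines. This is only an illustration built from the figures quoted above: the starting ratio of 100 is an assumed lower bound standing in for "over two orders of magnitude", and the function name is hypothetical.

```python
def density_ratio_later(ratio_start: float, fpga_gain: float, cpu_gain: float) -> float:
    """Scale an FPGA/processor density ratio by each side's gain over a period."""
    return ratio_start * fpga_gain / cpu_gain


# Gains taken from the text: FPGA density grew 30x, processor density grew 16x.
# 100.0 is an ASSUMED lower bound for "over two orders of magnitude".
ratio = density_ratio_later(ratio_start=100.0, fpga_gain=30.0, cpu_gain=16.0)
print(ratio)  # 187.5
```

Under these assumptions the gap widens by a factor of 30/16, a little under 2, over the six years.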

Dividing out the effect of feature size on speed and area makes it possible to measure the intrinsic density of an architecture. A similar technique lets DeHon make quantitative comparisons between the computational densities of various recent processor and FPGA chips, across time and silicon technologies. We argue that new architectures such as Field Programmable Arithmetic Arrays (FPAA) could bring significant density advantages over FPGAs.
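
One rough way to picture this normalization is sketched below. It assumes ideal scaling, where area shrinks with the square of the feature size and gate delay shrinks linearly with it, so raw density grows as the inverse cube of the feature size. The scaling exponents, the function name and the sample numbers are illustrative assumptions, not figures from this paper or from DeHon.

```python
def intrinsic_density(raw_density: float, feature_um: float) -> float:
    """Remove the feature-size contribution from a raw density figure.

    raw_density: bit operations per second per square micron.
    ASSUMPTION: area ~ feature^2 and delay ~ feature, so raw ~ 1 / feature^3.
    """
    return raw_density * feature_um ** 3


# Two HYPOTHETICAL chips: same architecture, feature size halved.
# Ideal scaling would make the raw density 8 times higher on the newer chip.
old = intrinsic_density(raw_density=1.0, feature_um=1.0)
new = intrinsic_density(raw_density=8.0, feature_um=0.5)
print(old == new)  # True
```

In this sketch, two chips whose intrinsic figures match differ only through process technology; a difference that survives the normalization would be attributed to the architecture.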

We conclude that Reconfigurable Systems are here to stay: they provide the densest, fastest and cheapest known silicon implementations of The Universal Machine.