First things first, what's an FPGA?
FPGA stands for “Field Programmable Gate Array”. It's basically a reprogrammable microelectronic chip. In other words, you can describe the connections between the logic gates that compose your FPGA to create your very own system, e.g. a specific processor or a hardware accelerator. FPGAs are often used by semiconductor companies to prototype their next CPU or system-on-chip. They are also widely used in self-driving cars, servers, and even smartphones to run complex applications or to accelerate specific processes.
Figure 1: An example of a microelectronic chip
Unlike most ICs, whose circuit is fixed at manufacturing, an FPGA can be reprogrammed after it leaves the factory. Indeed, FPGAs are essentially arrays of gates (composed of transistors) that can be arbitrarily connected together to make a circuit of your choice. Some FPGAs also have built-in hard blocks such as memory controllers, high-speed communication interfaces, DSPs, etc. to improve processing power. To program or reprogram an FPGA, you need to describe the connections within the array of gates.
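To make "describing the connections" concrete, here is a minimal Python sketch that models a circuit as a netlist: a table saying which gate drives each signal and which signals feed it. The names `GATES`, `FULL_ADDER` and `evaluate` are made up for this illustration, and the model is deliberately simplified. Real FPGAs implement logic in configurable lookup tables, and the description is normally written in an HDL rather than Python.

```python
# Behaviour of each gate type on single-bit inputs (0 or 1).
GATES = {
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "xor": lambda a, b: a ^ b,
}

# A netlist: output signal -> (gate type, first input, second input).
# This particular wiring is a 1-bit full adder with inputs a, b and cin.
FULL_ADDER = {
    "p": ("xor", "a", "b"),
    "sum": ("xor", "p", "cin"),
    "g": ("and", "a", "b"),
    "t": ("and", "p", "cin"),
    "cout": ("or", "g", "t"),
}


def evaluate(netlist, inputs, signal):
    """Compute the value of `signal` by following the connections backwards."""
    if signal in inputs:
        return inputs[signal]
    gate, x, y = netlist[signal]
    return GATES[gate](evaluate(netlist, inputs, x), evaluate(netlist, inputs, y))
```

Changing the entries of `FULL_ADDER` changes the circuit without touching the gates themselves, which is the idea behind reprogramming the array.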
In the early days, engineers used graphical editors to describe the connections within the array of gates. This method was visual but inefficient because of the time needed to place all the logic operators and draw the arrows between them. In the 1990s, engineers switched to Hardware Description Languages (HDLs) and compilers called logic synthesizers. Using HDLs such as Verilog shortens development time and enables hardware engineers to create complex systems, CPUs, GPUs, and SoCs within an acceptable timeframe.
In recent years, new tools have allowed engineers to be more productive by generating HDL from languages with a higher level of abstraction. We created Cx and Synflow to head in this direction: enabling all makers, engineers, and programmers to use FPGAs with ease, increasing their productivity, and providing the same Quality of Results as if using HDLs. And if you would like to learn a bit of history about Cx and Synflow, you can have a look at the Open RVC-CAL Compiler; this is where it all started.
Pros and Cons of using an FPGA
If you are not familiar with FPGAs, you may wonder about the pros and cons of using one. Let's take a piece of code:
    for i from 0 to 63
        S1 := (e rightrotate 6) xor (e rightrotate 11) xor (e rightrotate 25)
        ch := (e and f) xor ((not e) and g)
        temp1 := h + S1 + ch + k[i] + w[i]
        S0 := (a rightrotate 2) xor (a rightrotate 13) xor (a rightrotate 22)
        maj := (a and b) xor (a and c) xor (b and c)
        temp2 := S0 + maj
        h := g
        g := f
        f := e
        e := d + temp1
        d := c
        c := b
        b := a
        a := temp1 + temp2
This is part of the SHA-256 algorithm. SHA-256 is a cryptographic hash function from the SHA-2 family, designed by the NSA. More information on Wikipedia. Executed on a basic processor, this loop would take N*64 clock cycles to complete, where N is the number of cycles required to execute all the instructions within the loop. On an FPGA, you can execute as many operations in parallel as the chip's logic resources allow, so all the operations within the loop body can be built as a single circuit that is evaluated in one clock cycle, and the loop takes 64 clock cycles to complete.
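For readers who want to run the loop, here is a Python sketch of it, wrapped in just enough code to hash a short message and compare the result with Python's `hashlib`. The helper names (`rotr`, `schedule`, `compress`, `sha256_single_block`) are chosen for this example, and the wrapper only handles messages of up to 55 bytes so that everything fits in one 64-byte block. The constants are derived the way the SHA-256 definition specifies them, from the fractional parts of the square and cube roots of the first primes.

```python
import hashlib
import math

MASK = 0xFFFFFFFF  # keep every value within 32 bits, like a 32-bit register


def rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK


def first_primes(count):
    primes, candidate = [], 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def icbrt(n):
    """Floor of the cube root of n, found by binary search on integers."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


PRIMES = first_primes(64)
# First 32 fractional bits of the square roots of the first 8 primes.
H0 = [math.isqrt(p << 64) & MASK for p in PRIMES[:8]]
# First 32 fractional bits of the cube roots of the first 64 primes.
K = [icbrt(p << 96) & MASK for p in PRIMES]


def schedule(block):
    """Expand a 64-byte block into the 64 words w[0..63]."""
    w = [int.from_bytes(block[i:i + 4], "big") for i in range(0, 64, 4)]
    for i in range(16, 64):
        s0 = rotr(w[i - 15], 7) ^ rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = rotr(w[i - 2], 17) ^ rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
    return w


def compress(state, block):
    """Run the 64-round loop from the text on one block."""
    w = schedule(block)
    a, b, c, d, e, f, g, h = state
    for i in range(64):
        S1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
        ch = (e & f) ^ (~e & g)
        temp1 = (h + S1 + ch + K[i] + w[i]) & MASK
        S0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        temp2 = (S0 + maj) & MASK
        # All eight updates happen together, like registers on a clock edge.
        h, g, f, e, d, c, b, a = (
            g, f, e, (d + temp1) & MASK, c, b, a, (temp1 + temp2) & MASK)
    return [(s + v) & MASK for s, v in zip(state, (a, b, c, d, e, f, g, h))]


def sha256_single_block(message):
    """Hash a message of at most 55 bytes (one padded block)."""
    assert len(message) <= 55, "this sketch handles a single block only"
    padded = (message + b"\x80" + b"\x00" * (55 - len(message))
              + (8 * len(message)).to_bytes(8, "big"))
    return b"".join(word.to_bytes(4, "big") for word in compress(H0, padded))
```

In software, each line of the loop body costs at least one instruction per iteration. In hardware, the same body would typically be one block of combinational logic feeding the eight registers `a` to `h`, which is why the tuple assignment at the end of the loop is the closest Python equivalent of one clock cycle.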
This is the main reason for using an FPGA: the possibility of executing a very large number of operations in parallel, limited only by the logic resources available on the chip. You also have full control over the timing of your application (hard real time), both the latency and the throughput. Finally, small FPGAs like the iCE from Lattice use less power to run complex apps than most microprocessors. On the other hand, FPGAs run at lower clock frequencies than CPUs, usually up to 250 MHz, and mid/high-end FPGAs require a lot of power to operate properly.
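The trade-off between parallelism and clock frequency can be put into numbers. The sketch below assumes a 3 GHz CPU and N = 30 cycles per loop iteration; both figures are hypothetical and chosen only for illustration. The 250 MHz FPGA clock and the 64 iterations come from the text.

```python
def run_time_ns(cycles, clock_hz):
    """Time in nanoseconds needed to run `cycles` clock cycles at `clock_hz`."""
    return cycles / clock_hz * 1e9


ROUNDS = 64                    # iterations of the SHA-256 loop
FPGA_CLOCK_HZ = 250e6          # upper end of the range given in the text
CPU_CLOCK_HZ = 3e9             # assumption: a 3 GHz processor
CPU_CYCLES_PER_ROUND = 30      # assumption: N, cycles per loop iteration

# FPGA: one iteration per clock cycle.
fpga_ns = run_time_ns(ROUNDS, FPGA_CLOCK_HZ)
# CPU: N cycles per iteration.
cpu_ns = run_time_ns(ROUNDS * CPU_CYCLES_PER_ROUND, CPU_CLOCK_HZ)
# Value of N above which the FPGA finishes first.
break_even_n = CPU_CLOCK_HZ / FPGA_CLOCK_HZ
```

Under these assumptions the FPGA needs about 256 ns and the CPU about 640 ns. The break-even point is N = 12: if the processor needs fewer than 12 cycles per iteration, its higher clock frequency wins despite the sequential execution.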
In conclusion, FPGAs are cool because you can build almost any digital system with one! But you need to keep the drawbacks in mind, e.g. the operating power and the clock frequency.
As introduced before, FPGAs are composed of up to millions of cells that contain logic elements, but also small local memories (registers and RAM). A local memory is directly accessed by logic elements, so the cost of a memory access is often negligible. When designers code hardware, they split an algorithm into independent parts that exclusively access their own local memories and communicate directly (i.e. without caches or a shared memory) with the other parts of the algorithm. It is also possible to access an external RAM or DDR memory using dedicated interfaces and [semiconductor intellectual property (IP) cores](https://en.wikipedia.org/wiki/Semiconductor_intellectual_property_core). In such cases, the timing is important and must be considered with care.
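This way of splitting an algorithm can be modelled in a few lines of Python. In the sketch below, two hypothetical parts, `Accumulator` and `PeakDetector`, each own a private variable that stands for a local register, and they exchange data only through a FIFO. This is a sequential software model written for illustration, not generated hardware; on an FPGA both parts would run at the same time.

```python
from collections import deque


class Accumulator:
    """Part 1: keeps a running sum in its own local register."""

    def __init__(self):
        self.total = 0  # local memory, never read by another part

    def step(self, sample, out_fifo):
        self.total += sample
        out_fifo.append(self.total)  # direct communication with part 2


class PeakDetector:
    """Part 2: remembers the largest value it has received."""

    def __init__(self):
        self.peak = None  # local memory, never read by another part

    def step(self, in_fifo):
        if in_fifo:
            value = in_fifo.popleft()
            if self.peak is None or value > self.peak:
                self.peak = value


def simulate(samples):
    """Run both parts once per simulated clock cycle."""
    fifo = deque()
    accumulator, detector = Accumulator(), PeakDetector()
    for sample in samples:
        accumulator.step(sample, fifo)
        detector.step(fifo)
    return accumulator.total, detector.peak
```

For example, `simulate([3, -1, 4, -6])` returns `(0, 6)`: the running sums are 3, 2, 6 and 0, so the final total is 0 and the peak is 6. Neither part ever reads the other's state, which is what makes them independent.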