What is a custom functional unit?
Cascade automatically constructs a programmable coprocessor architecture as an array of simple computational elements that operate at the instruction level (ADD, XOR, SHIFT, etc.) and that is optimized to execute the offloaded software. Users can increase performance further by specifying that some offloaded functionality be mapped directly onto a custom functional unit: a user-designed hardware implementation embedded in the coprocessor. Cascade lets users explore the performance benefits of deploying custom functional units without having to implement RTL before the architecture is finalized. To ease integration of the custom functional unit into the coprocessor design, Cascade generates RTL Verilog module or VHDL entity declarations that define the block's interface to the coprocessor RTL. This approach provides a simple path to integrating new or existing hardware IP into a software-programmable coprocessor. An example of such a unit can be found in
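As a rough illustration of the kind of offloaded functionality that might be a candidate for a custom functional unit, the C sketch below computes a CRC-8 one byte at a time using only instruction-level operations (XOR, SHIFT, bit test). The function names `crc8_update` and `crc8_buffer`, the choice of CRC-8 with polynomial 0x07, and the idea of treating `crc8_update` as the unit boundary are assumptions made for this example; they are not taken from Cascade or its documentation.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical candidate for a custom functional unit (illustrative only).
 * Folds one data byte into a running CRC-8 (polynomial 0x07, MSB first).
 * In software this costs eight shift/test/XOR iterations per byte.
 */
uint8_t crc8_update(uint8_t crc, uint8_t data)
{
    crc ^= data;
    for (int bit = 0; bit < 8; ++bit) {
        if (crc & 0x80u)
            crc = (uint8_t)((crc << 1) ^ 0x07u); /* shift out MSB, apply polynomial */
        else
            crc = (uint8_t)(crc << 1);           /* shift only */
    }
    return crc;
}

/*
 * Ordinary offloaded software that calls the candidate function
 * once per byte, starting from an initial CRC of zero.
 */
uint8_t crc8_buffer(const uint8_t *buf, size_t len)
{
    uint8_t crc = 0;
    for (size_t i = 0; i < len; ++i)
        crc = crc8_update(crc, buf[i]);
    return crc;
}
```

In a design like this, the eight-iteration inner loop is the sort of logic that could collapse into a small block of dedicated hardware, while `crc8_buffer` would remain ordinary offloaded software. How a function is actually nominated for a custom functional unit is tool-specific and is not shown here.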