
Multipliers: 1-2 multiplier units, 24x16-bit or 32x32-bit max multiply
Adders: 1-2 adder units
Shifters: 1-2 shift units
4 48-bit Accumulators
Logic: 1-2 logic units
Application Specific Logic (ASL) unit
AGU: 0-8 Circular Buffers, 0-4 Page Registers
Registers: 8-32 entry, 32-bit register file
Accumulators: 2-4 accumulators
New instructions: Addition of new, customer-defined instructions
Stack Pointers: 1-2 stack pointers
Constants: 6 customer-defined constants
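
The configurable ranges listed above can be captured in a small validity check. The following is only a sketch: the class name `CoreConfig`, the field names, and the default values are hypothetical and are not part of any SP-5flex tool; only the numeric ranges come from the list.

```python
from dataclasses import dataclass

# Inclusive (min, max) ranges copied from the option list above.
RANGES = {
    "multipliers": (1, 2),
    "adders": (1, 2),
    "shifters": (1, 2),
    "logic_units": (1, 2),
    "circular_buffers": (0, 8),
    "page_registers": (0, 4),
    "registers": (8, 32),
    "accumulators": (2, 4),
    "stack_pointers": (1, 2),
}


@dataclass
class CoreConfig:
    """Hypothetical container for one core configuration (defaults are arbitrary)."""
    multipliers: int = 1
    adders: int = 1
    shifters: int = 1
    logic_units: int = 1
    circular_buffers: int = 0
    page_registers: int = 0
    registers: int = 8
    accumulators: int = 2
    stack_pointers: int = 1

    def problems(self) -> list:
        """Return one message per option that falls outside its listed range."""
        out = []
        for name, (lo, hi) in RANGES.items():
            value = getattr(self, name)
            if not lo <= value <= hi:
                out.append(f"{name}={value} outside {lo}-{hi}")
        return out


if __name__ == "__main__":
    print(CoreConfig(multipliers=2, registers=32, accumulators=4).problems())
    print(CoreConfig(multipliers=3).problems())
```

An empty list means every option is inside its listed range.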

Superscalar issue, executes two instructions per cycle
Multiple data instructions
Hardware automatically detects and handles all data and pipeline hazards
2-4 48-bit Accumulators
Multipliers with a maximum size of 24x16-bit or 32x32-bit
8-32 32-bit General Purpose Registers
All instructions conditional
Static branch prediction, dynamic branch resolution
Minimal branch overhead
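
The 48-bit accumulators are wider than the products they receive from 16-bit data: a signed 16x16-bit product fits in 32 bits, which leaves 16 bits of headroom for long sums. The sketch below models that headroom with a two's-complement 48-bit accumulator. The wraparound on overflow is an assumption made for illustration; whether the hardware wraps or saturates is not stated here, and the function names are hypothetical.

```python
ACC_BITS = 48  # accumulator width from the feature list


def wrap_signed(value: int, bits: int = ACC_BITS) -> int:
    """Reduce value modulo 2**bits and read it as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mac(acc: int, a: int, b: int) -> int:
    """Multiply-accumulate: acc + a*b, wrapped to ACC_BITS (assumed behaviour)."""
    return wrap_signed(acc + a * b)


def dot(xs, ys) -> int:
    """Dot product built from repeated MAC steps on one accumulator."""
    acc = 0
    for x, y in zip(xs, ys):
        acc = mac(acc, x, y)
    return acc


if __name__ == "__main__":
    # 1000 worst-case positive 16-bit products still fit in 48 bits.
    print(dot([32767] * 1000, [32767] * 1000))
```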

Single clock edge, flip-flop based design
VHDL/Verilog based RTL code
Ease-of-synthesis coding style
Fully gated functional units
Programmable clock frequency
Sleep mode
Minimal clock power consumption
32-bit orthogonal instruction set
Bit manipulation instructions: Set, Clear, Extract, Insert, Pack
24x16-bit or 32x32-bit Multiply, MAC, Multiply-Subtract instructions
8-bit, 16-bit SIMD Add, Sub, Mult, MAC, Shift, Saturation instructions
Barrel Shift, Signed, Unsigned instructions
Conditional Branch
Register indirect addressing mode
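
To make the bit manipulation and SIMD saturation entries above concrete, here is a sketch of what such operations typically compute on a 32-bit word. It shows generic semantics only: the function names, operand order, and lane layout are assumptions, not the SP-5flex instruction definitions.

```python
WORD_MASK = 0xFFFFFFFF  # 32-bit word


def extract(word: int, pos: int, width: int) -> int:
    """Return the unsigned bit field of `width` bits starting at bit `pos`."""
    return (word >> pos) & ((1 << width) - 1)


def insert(word: int, field: int, pos: int, width: int) -> int:
    """Return `word` with the bit field at `pos` replaced by `field`."""
    mask = ((1 << width) - 1) << pos
    return ((word & ~mask) | ((field << pos) & mask)) & WORD_MASK


def to_signed16(x: int) -> int:
    """Read the low 16 bits of x as a two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def sat16(x: int) -> int:
    """Clamp x to the signed 16-bit range."""
    return max(-32768, min(32767, x))


def simd_add16_sat(a: int, b: int) -> int:
    """Add two words as two independent signed 16-bit lanes, saturating each."""
    result = 0
    for lane in range(2):
        shift = 16 * lane
        total = sat16(to_signed16(a >> shift) + to_signed16(b >> shift))
        result |= (total & 0xFFFF) << shift
    return result


if __name__ == "__main__":
    print(hex(extract(0xABCD1234, 8, 8)))
    print(hex(insert(0xABCD1234, 0xFF, 8, 8)))
    print(hex(simd_add16_sat(0x7FFF0001, 0x00010001)))
```

In the last call the upper lane would overflow (32767 + 1) and is clamped to 32767, while the lower lane adds normally.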

Since the SP-5flex is a fully synthesizable core, it can be targeted at almost any process technology. With its quad-MAC architecture (4 MACs per cycle on 16-bit data), the following typical performance and core size are achievable:
Process | 0.25 µm | 0.18 µm | 0.13 µm |
Clock Speed | 160 MHz | 220 MHz | 320 MHz |
MAC/Sec @16-bit | 640M | 880M | 1280M |
GOP/Sec @8-bit | 3.8 | 5.3 | 7.7 |
Die Area | 3.6 mm² | 2.5 mm² | 1.6 mm² |
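
The MAC/Sec row follows directly from the quad-MAC figure: 4 MACs per cycle multiplied by the clock speed. A minimal sketch of that arithmetic, with names chosen here purely for illustration:

```python
MACS_PER_CYCLE = 4  # quad-MAC architecture, 16-bit data

# Clock speeds from the table above, in MHz, keyed by process node.
CLOCK_MHZ = {"0.25 µm": 160, "0.18 µm": 220, "0.13 µm": 320}


def peak_mmacs_per_second(clock_mhz: int) -> int:
    """Peak 16-bit throughput in millions of MACs per second."""
    return MACS_PER_CYCLE * clock_mhz


if __name__ == "__main__":
    for process, mhz in CLOCK_MHZ.items():
        print(f"{process}: {peak_mmacs_per_second(mhz)}M MAC/s")
```

Dividing the GOP/Sec row by the clock speed in the same way gives roughly 24 8-bit operations per cycle at each node; how those operations are counted is not stated.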

Typical application power consumption
Voltage | 2.5 V | 1.8 V | 1.0 V |
Process | 0.25 µm | 0.18 µm | 0.13 µm |
Power | 150 mW | 77 mW | 25 mW |
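
Combining the power figures with the clock speeds from the performance table gives a rough efficiency comparison across process nodes. This sketch assumes the power numbers were measured at those listed clock speeds and at the peak quad-MAC rate, which is not stated; treat the results as indicative only. All names are illustrative.

```python
MACS_PER_CYCLE = 4  # quad-MAC architecture, 16-bit data

# (clock in MHz, power in mW) per process node, copied from the two tables.
NODES = {
    "0.25 µm": (160, 150),
    "0.18 µm": (220, 77),
    "0.13 µm": (320, 25),
}


def mw_per_mhz(clock_mhz: float, power_mw: float) -> float:
    """Power normalised by clock frequency."""
    return power_mw / clock_mhz


def pj_per_mac(clock_mhz: float, power_mw: float) -> float:
    """Energy per 16-bit MAC in picojoules at the assumed peak rate.

    mW divided by (million MACs per second) is nJ per MAC; x1000 gives pJ.
    """
    return power_mw / (MACS_PER_CYCLE * clock_mhz) * 1000.0


if __name__ == "__main__":
    for process, (mhz, mw) in NODES.items():
        print(f"{process}: {mw_per_mhz(mhz, mw):.3f} mW/MHz, "
              f"{pj_per_mac(mhz, mw):.1f} pJ/MAC")
```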
Program Memory can be incrementally configured as SRAM or Set Associative Cache
A, B Data Memory Supports up to 2 Memory Load/Store per Cycle
Data Memory Byte, Half-word, Word addressable
DSP Address Generation Unit Supports 8 Circular Buffers, with Transpose Mode for 2-D Data Manipulation, Bit-Reverse, capable of generating 2 Addresses per Cycle
4 Page Registers
31 Prioritized Interrupts, Supports Nested Interrupts, 2 Cycle Typical (7 Maximum) Interrupt Latency
Hardware Stack, Supports Push and Pop. Highly Efficient Context Switching
JTAG Debug Port, DSP Break on Program Counter, Status Register, Memory Address
Full Scan Design
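
The circular-buffer and bit-reverse modes of the address generation unit listed above correspond to two standard address calculations, sketched below in software. This shows the general technique only; the function names and parameters are hypothetical, and the actual AGU register interface is not described here.

```python
def circular_next(addr: int, step: int, base: int, length: int) -> int:
    """Advance addr by step, wrapping inside the buffer [base, base + length)."""
    return base + (addr - base + step) % length


def bit_reverse(index: int, bits: int) -> int:
    """Reverse the low `bits` bits of index, as used for FFT data reordering."""
    out = 0
    for _ in range(bits):
        out = (out << 1) | (index & 1)
        index >>= 1
    return out


if __name__ == "__main__":
    # Walk an 8-entry circular buffer at base 0x100 past its end.
    addr = 0x106
    for _ in range(3):
        print(hex(addr))
        addr = circular_next(addr, 1, 0x100, 8)

    # Access order for an 8-point FFT.
    print([bit_reverse(i, 3) for i in range(8)])
```

In hardware, the wrap and the bit reversal happen as part of address generation, so the inner loop of a filter or FFT needs no separate instructions for them.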