Rapid Prototyping and Emulation of Many-Core Chip Multiprocessors with Integrated Hardware Accelerators

Colin J. Ihrig
University of Pittsburgh
Email: [email protected]

Problem
  Need to study new architectures
  System design is time consuming
  Software simulators do not scale well
  Orders of magnitude slowdown

ACME
  Graphical design entry
  Uses Ptolemy II environment from UC Berkeley
  Components called 'actors'
  Generate systems targeting FPGAs
  System emulation
  Rapid MPSoC prototyping
  Processor and logic design

SoC Generation Tool Flow
  Ptolemy II Graphical Model Creation
  2x2 Mesh Interconnect Network
  Xilinx Platform Studio System
  ACME actor library mirrors Ptolemy's Java library
  Xilinx library contains IP blocks and board descriptions
  [Figure labels: Network Switch; Processing Node; Logic Containing VHDL Actors; MicroBlaze Processor Systems; Serial Port for PC Communication]

Processor Based Actors
  Design complex components
  Example: Switch arbiters
  Describe functionality in C
  Use Java Native Interface within Ptolemy II
  Soft core processors run code on FPGA

ACME Actor Generator
  Actor Generator GUI
  Extend Ptolemy II GUI for graphical actor creation
  Skeleton code generated for:
    Ptolemy Java Actor
    Java Native Interface C and Header files
    ACME VHDL Actor
  Actors automatically incorporated into ACME and Ptolemy II
  [Figure labels: Actor Generator GUI; Generated skeleton code for actor]

Emulation Augmentation
  User specified latency and throughput circuit
  Processor / hardware synchronization via a hardware barrier circuit
  Processors set / reset barrier
  Barrier clocks custom logic
  [Figure labels: Three cycle throughput; One additional latency cycle]

Emulation clock
  Target vs. Host FPGA cycles
  Decouples emulated system from FPGA
  Tracked via hardware counters

SuperCISC Compiler
  C to VHDL compiler
  Processor actors are bottleneck
  Hardware acceleration
  C to VHDL automation flow
  Annotate C code with pragmas
  Construct Super Dataflow Graph
  Custom coprocessors in ACME
  [Figure label: Fast Simplex Link Bus Connections]

  Example of pragma-annotated C code:

    #pragma HWstart
    z = x + y;
    m = y << 3;
    n = m - y;
    if ( n < 0 )
        n = 0;
    if ( q == 3 )
        i = z + 5;
    else
        i = z - 2;
    j = n * i;
    #pragma HWend
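
The pragma-annotated fragment on the poster can be tried in software by wrapping it in an ordinary C function. The sketch below is illustrative only: the function name `hw_kernel`, the choice of `x`, `y`, and `q` as inputs, and the choice of `j` as the returned result are assumptions made here, since the poster does not say which variables are live into or out of the region. The helper `check_hw_kernel` is likewise hypothetical and only exercises the function.

```c
#include <assert.h>

/* Hypothetical wrapper around the poster's annotated region.
   Assumed inputs: x, y, q. Assumed output: j.
   Only non-negative y is used in this sketch, because left-shifting
   a negative int is undefined behavior in C. */
int hw_kernel(int x, int y, int q)
{
    int z, m, n, i, j;

#pragma HWstart
    z = x + y;
    m = y << 3;        /* m = 8 * y */
    n = m - y;         /* n = 7 * y */
    if (n < 0)
        n = 0;         /* clamp negative values to zero */
    if (q == 3)
        i = z + 5;
    else
        i = z - 2;
    j = n * i;
#pragma HWend

    return j;
}

/* Hypothetical self-check: values worked out by hand from the code above. */
void check_hw_kernel(void)
{
    assert(hw_kernel(1, 2, 3) == 112);  /* z=3, n=14, i=8 */
    assert(hw_kernel(1, 2, 0) == 14);   /* z=3, n=14, i=1 */
    assert(hw_kernel(5, 0, 3) == 0);    /* n=0, so j=0    */
}
```

A compiler that does not recognize the `HWstart` / `HWend` pragmas will typically ignore them, possibly with a warning, so the function behaves as plain C. That gives a software reference result to compare against whatever hardware is generated from the same region.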