ECE 545 Project 1 Introduction & Specification

ECE 545 Project 1 Introduction & Specification. Schedule Project 1 RTL design for FPGAs (30 points) Due date: Tuesday, November 21, midnight Final choice.

Jan 16, 2016


Deirdre Hart
Transcript

ECE 545 Project 1: Introduction & Specification


Schedule

Project 1 RTL design for FPGAs (30 points)

Due date: Tuesday, November 21, midnight

Final choice of the project topic: Thursday, October 19

Progress reports: Thursday-Friday, November 2-3, and Thursday-Friday, November 16-17


Groups

• ONE-person and TWO-person teams allowed

• Teams must be formed at the moment when the project topic is selected, i.e., by Thursday, October 19

• TWO-person teams work on more complex versions of each project topic

• One final grade per entire team


Honor Code Rules

• Using somebody else’s code and presenting it as your own is a serious Honor Code violation and may result in an F grade for the entire course.

• All student teams are expected to write and debug their code by themselves and are not allowed to share their code with other teams.

• Students are encouraged to help and support each other in all problems related to
– basic understanding of the problem
– operation of the CAD tools.


Project 1 - Platform & tools

Target devices: Xilinx FPGA Spartan 3 family

Tools:

VHDL Simulation: Aldec Active HDL or ModelSim
VHDL Synthesis: Synplify Pro or Xilinx XST
Implementation: Xilinx ISE or Xilinx WebPack


Project 1 - Final Deliverables

1. All block diagrams and ASM charts describing the entire circuit and its components (electronic form, PDF)

2. All synthesizable VHDL source codes

3. All testbenches used to verify the operation of the entire circuit and its components, and the corresponding input files containing test vectors, and output files containing results

4. Timing waveforms demonstrating the correct operation of the entire circuit and its components

5. Final report


Final Report (1)

1. Short description of the block diagrams and ASM charts. Discussion of any alternative architectures and solutions.

2. List of source codes and a short description of major modules.

3. Source of test vectors and a way of generating these test vectors.

4. Format of input & output files. Short description of a testbench.


Final Report (2)

5. Results
• resource utilization (CLB slices, LUTs, FFs, BRAMs, etc.)
• post-synthesis timing
– clock frequency
– throughput
– latency
– critical path
• post placing & routing timing
– clock frequency
– throughput
– latency
– critical path


Final Report (3)

6. Discussion of the obtained results and any optimizations applied in order to obtain the optimum design.

7. Speed-up vs. software implementation.

8. Discussion of dependence of results on parameters of the application.

9. Deviations from the original specification, encountered problems, and unresolved issues.


Two topics from two different areas to choose from

Cryptography: Stream cipher qualified to Phase 2 of the eSTREAM contest

Digital Signal Processing: Finite Impulse Response Filter


Stream cipher qualified to Phase 2 of the eSTREAM contest


[Figure: Cipher block with a Message / Ciphertext input (m bits), a Ciphertext / Message output (m bits), a Cryptographic Key input (k bits), and an Encrypt/Decrypt control input (1 bit).]


Secret-Key Ciphers

[Figure: Alice performs Encryption and Bob performs Decryption across a Network; both use the same key of Alice and Bob, KAB.]


Block vs. stream ciphers

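Block cipher
– Key K; plaintext blocks $M_1, M_2, \dots, M_n$; ciphertext blocks $C_1, C_2, \dots, C_n$
– $C_i = f_K(M_i)$
– Every block of ciphertext is a function of only one corresponding block of plaintext

Stream cipher (with memory)
– Key K; plaintext blocks $m_1, m_2, \dots, m_n$; ciphertext blocks $c_1, c_2, \dots, c_n$
– $c_i = f_K(m_i, m_{i-1}, \dots, m_2, m_1)$
– Every block of ciphertext is a function of the current and all preceding blocks of plaintext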


Typical stream cipher

[Figure: Sender and Receiver each contain a Pseudorandom Key Generator driven by the key and the initialization vector (seed). At the sender, plaintext mi is combined with keystream ki to give ciphertext ci; at the receiver, ciphertext ci is combined with the same keystream ki to give plaintext mi.]
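As a rough illustration of the diagram, the sketch below assumes the common binary additive construction, in which the keystream is combined with the data by XOR; the slide itself does not name the combining operation. `toy_keystream` is a hypothetical stand-in for the pseudorandom key generator, built on Python's `random` module. It is not cryptographically secure and is not any of the eSTREAM ciphers.

```python
import random


def toy_keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """Hypothetical keystream generator: NOT secure, for illustration only."""
    rng = random.Random(key + iv)  # key and IV together act as the seed
    return bytes(rng.randrange(256) for _ in range(length))


def stream_crypt(data: bytes, key: bytes, iv: bytes) -> bytes:
    """Encrypt or decrypt: c_i = m_i XOR k_i, and m_i = c_i XOR k_i."""
    keystream = toy_keystream(key, iv, len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))


message = b"ECE 545"
ciphertext = stream_crypt(message, b"secret key", b"iv-0001")
recovered = stream_crypt(ciphertext, b"secret key", b"iv-0001")
```

Because XOR is its own inverse, the same function serves the sender and the receiver, provided both sides seed their generators with the same key and initialization vector.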


eSTREAM - Contest for a new stream cipher standard, 2004-2008

PROFILE 1

• Stream cipher suitable for software implementations optimized for high speed
• Key size - 128 bits
• Initialization vector – 64 bits or 128 bits

PROFILE 2

• Stream cipher suitable for hardware implementations with limited memory, number of gates, or power supply
• Key size - 80 bits
• Initialization vector – 32 bits or 64 bits


Schedule of the contest

November 2004: Request for proposals

29 April 2005: Deadline for submissions (34 ciphers; 23 candidates for PROFILE 1, 26 candidates for PROFILE 2)

26-27 May 2005: Stream Cipher Workshop, Denmark

March 2006: End of Phase I

July 2006: Beginning of the evaluation part of Phase II

September 2007: End of Phase II

January 2008: Final report

http://www.ecrypt.eu.org/stream/timetable.html


10 focus candidates

PROFILE 1 (Software)
• Dragon - Ed Dawson, Kevin Chen, Matt Henricksen, William Millan, Leonie Simpson, HoonJae Lee, SangJae Moon
• HC-256 - Hongjun Wu
• LEX - Alex Biryukov
• Phelix - Doug Whiting, Bruce Schneier, Stefan Lucks, Frédéric Muller
• Py - Eli Biham and Jennifer Seberry
• Salsa20 - Daniel Bernstein
• SOSEMANUK - Come Berbain, Olivier Billet, Anne Canteaut, Nicolas Courtois, Henri Gilbert, Louis Goubin, Aline Gouget, Louis Granboulan, Cédric Lauradoux, Marine Minier, Thomas Pornin, Hervé Sibert

PROFILE 2 (Hardware)
• Grain - Martin Hell, Thomas Johansson and Willi Meier
• MICKEY-128 - Steve Babbage and Matthew Dodd
• Phelix - Doug Whiting, Bruce Schneier, Stefan Lucks, Frédéric Muller
• Trivium - Christophe De Cannière and Bart Preneel


Your task

For groups of size ONE: implement ONE out of the following FIVE ciphers

For groups of size TWO: implement TWO out of the following FIVE ciphers

Grain, MICKEY-128, Phelix, Salsa20, Trivium


Optimization Criteria

I. Minimum area

II. Maximum ratio: Throughput divided by Total Circuit Area [CLB slices]


Required interface

[Figure: eSTREAM cipher block with the following ports]
– clk
– reset
– enc_dec
– key_IV (k bits)
– key_IV_ready
– key_IV_write
– data_in (d bits)
– data_in_ready
– data_in_write
– data_out (d bits)
– write
– full

k = 1, 2, 4, 8, 16, 32, 64
d – set of allowed values specific to a given algorithm


Tasks of a TWO-person team

• Implement TWO ciphers

• Compare TWO ciphers against each other


eSTREAM Implementation Hints


Example of an eSTREAM cipher


Linear Feedback Shift Register (LFSR)

An LFSR is denoted <L, C(D)>, where L is the length and C(D) is the connection polynomial:

$C(D) = 1 + c_1 D + c_2 D^2 + \dots + c_L D^L$


Example of LFSR

<4, 1 + D + D^4>: length L = 4, connection polynomial $C(D) = 1 + D + D^4$


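Initial state: $[s_{L-1}, s_{L-2}, \dots, s_1, s_0]$

LFSR recursion:

$s_j = c_1 s_{j-1} \oplus c_2 s_{j-2} \oplus \dots \oplus c_{L-1} s_{j-(L-1)} \oplus c_L s_{j-L}$ for $j \ge L$

[Figure: shift register stages holding $s_{j-1}, s_{j-2}, \dots, s_{j-(L-1)}, s_{j-L}$.]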
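The recursion can be tried out in software before it is written as RTL. The sketch below is a minimal bit-level model, assuming the missing operators in the recursion are XOR (addition modulo 2); the function name `lfsr_sequence` and the chosen initial state are illustrative, not part of the specification.

```python
def lfsr_sequence(taps, initial_state, count):
    """Return the first `count` bits s_0, s_1, ... of an LFSR.

    taps          -- [c_1, c_2, ..., c_L], coefficients of C(D)
    initial_state -- [s_0, s_1, ..., s_{L-1}]
    """
    bits = list(initial_state)
    while len(bits) < count:
        j = len(bits)
        new_bit = 0
        # s_j = c_1 s_{j-1} XOR c_2 s_{j-2} XOR ... XOR c_L s_{j-L}
        for i, c in enumerate(taps, start=1):
            new_bit ^= c & bits[j - i]
        bits.append(new_bit)
    return bits[:count]


# <4, 1 + D + D^4>: c_1 = 1, c_2 = 0, c_3 = 0, c_4 = 1
sequence = lfsr_sequence([1, 0, 0, 1], [0, 1, 1, 0], 19)
# sequence == [0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0]
```

For this example the output repeats after 15 bits, the maximum period for a length-4 register, since the all-zero state is excluded.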


LFSR State Sequence


Non-linear Feedback Shift Register (NFSR)


Doubling the speed of Grain


Resources

eSTREAM PHASE 2 – the ECRYPT Stream Cipher Project, available at

http://www.ecrypt.eu.org/stream/

Source of test vectors

Reference C implementations provided by the authors of the algorithms.


Finite Impulse Response Filter


Topic proposed and co-advised by:

Dr. David Hwang Dr. Kathleen Wage


DSP Project: FIR Digital Filter Design

• Digital filters are widely used in digital communications and audio/video processing.

• In particular, finite impulse response (FIR) filters are used for their ease of implementation and stability.

• In this project, you will investigate different FIR filter structures and their VLSI implementations
– Step 1: Implement and compare direct form versus direct form transposed structures
– Step 2: Implement and compare fast FIR structures which reduce the number of required multiplications per sample


Example: Gigabit Ethernet Transceiver

• As seen in the block diagram, digital filters (boxed in blue) play a crucial role in digital communication chips such as Ethernet transceivers, cable modems, DSL modems, satellite receivers, mobile phones, etc.


Step 1a: Direct Form FIR Filter

[Figure: direct form FIR filter; input x(n) passes through a chain of Z-1 delays, the delayed samples are multiplied by coefficients h0, h1, h2, …, hN-1, and the products are summed to produce y(n).]

• An FIR filter implements a convolution in the time domain
• Critical path of N-tap filter:
– N-1 adds + 1 multiply
• Arithmetic complexity of N-tap filter modeled as:
– N multiplications/sample + N-1 adds/sample
• Problem 1a: Design a parametrizable direct form FIR filter
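The convolution computed by the direct form structure is $y(n) = \sum_{k=0}^{N-1} h_k \, x(n-k)$. A small software model can serve as a golden reference when checking RTL results. The sketch below is one such model; the name `fir_direct` and the integer test data are illustrative only, and it ignores the fixed-point rounding required by the project.

```python
def fir_direct(h, x):
    """Direct form FIR: a delay line of past inputs, N multiplies, N-1 adds."""
    delay_line = [0] * len(h)  # delay_line[k] holds x(n-k)
    y = []
    for sample in x:
        # Shift the new sample into the delay line (the Z^-1 chain).
        delay_line = [sample] + delay_line[:-1]
        acc = 0
        for hk, xk in zip(h, delay_line):
            acc += hk * xk  # the long adder chain is the critical path
        y.append(acc)
    return y


impulse_response = fir_direct([1, 2, 3], [1, 0, 0, 0])  # [1, 2, 3, 0]
step_response = fir_direct([1, 2, 3], [1, 1, 1, 1])     # [1, 3, 6, 6]
```

Feeding a unit impulse returns the coefficients themselves, which is a convenient first test vector.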


Step 1b: Direct Form Transpose FIR Filter

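• Use a signal flow graph reversal to reduce the critical path; the result is the transpose structure

• Critical path of N-tap transposed filter:
– 1 add + 1 multiply

• Arithmetic complexity of N-tap filter modeled as:
– N multiplications/sample + N-1 adds/sample

• Problem 1b: Design a parametrizable direct form transpose FIR filter

[Figure: direct form transpose FIR filter; input x(n) is multiplied by all coefficients hN-1, hN-2, hN-3, …, h0 at once, and the products are accumulated through a chain of adders separated by Z-1 delays to produce y(n).]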
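The transpose structure can also be modelled in a few lines. In the sketch below, which is an illustrative reading of the diagram and not reference code, the registers hold partial sums, so each output needs only one multiply and one add on its path. The name `fir_transposed` is hypothetical.

```python
def fir_transposed(h, x):
    """Transposed direct form FIR: registers store partial sums, not inputs."""
    n = len(h)
    regs = [0] * (n - 1)  # regs[k] feeds the adder that follows tap k
    y = []
    for sample in x:
        products = [hk * sample for hk in h]  # x(n) is broadcast to all taps
        y.append(products[0] + (regs[0] if regs else 0))
        # Each register captures its tap product plus the next register's value.
        regs = [
            products[k + 1] + (regs[k + 1] if k + 1 < n - 1 else 0)
            for k in range(n - 1)
        ]
    return y


impulse_response = fir_transposed([1, 2, 3], [1, 0, 0, 0])  # [1, 2, 3, 0]
step_response = fir_transposed([1, 2, 3], [1, 1, 1, 1])     # [1, 3, 6, 6]
```

The outputs match the direct form for the same coefficients and inputs; only the placement of the delays, and therefore the critical path, differs.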


Step 2: Power Reduction via Parallel Subexpression Sharing

[Figure: 2-parallel FIR structure; inputs x(2n) and x(2n+1) feed three N/2-tap subfilters H0(z), H0(z)+H1(z), and H1(z), whose outputs are combined (using one Z-1 delay) to produce y(2n) and y(2n+1).]

• Direct form and transpose form structures (running at the same rate) require N multiplications/sample and N-1 adds/sample

• Methods exist to reduce this complexity by parallel processing and subexpression sharing. See [1] and [2] for details and derivation.
– In the 2-parallel structure above, two inputs arrive at half the original clock rate and are processed in parallel by three ceil(N/2)-tap filters [ceil() is the ceiling function]
– Arithmetic complexity of the 2-parallel filter is approximately:
  3 x N/2 multiplications / two samples + 3 x (N/2-1) adds / two samples + 4 adds / two samples
  = 3/4 N multiplications/sample + (3N/4 + 1/2) adds/sample
– If power is dominated by multipliers, 25% power savings over traditional structures!

• Problem 2a: Design a 2-parallel parametrizable FIR filter
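A behavioural sketch of the 2-parallel structure is given below. It assumes the standard fast FIR formulation, $Y_0 = H_0 X_0 + z^{-1} H_1 X_1$ and $Y_1 = (H_0 + H_1)(X_0 + X_1) - H_0 X_0 - H_1 X_1$, where the delay is one block of two samples; see [1] and [2] for the derivation. The function names are illustrative, and the model assumes even N and an even number of input samples.

```python
def fir(h, x):
    """Plain convolution, used both as a subfilter and as the reference."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]


def fir_2parallel(h, x):
    """2-parallel fast FIR: three N/2-tap subfilters per pair of samples."""
    h0, h1 = h[0::2], h[1::2]                # polyphase decomposition by 2
    h01 = [a + b for a, b in zip(h0, h1)]    # H0 + H1
    x0, x1 = x[0::2], x[1::2]                # x(2n) and x(2n+1)
    x01 = [a + b for a, b in zip(x0, x1)]    # pre-addition
    u = fir(h0, x0)
    v = fir(h1, x1)
    w = fir(h01, x01)
    y = []
    for k in range(len(x0)):
        v_delayed = v[k - 1] if k >= 1 else 0  # the single Z^-1 delay
        y.append(u[k] + v_delayed)             # y(2n)
        y.append(w[k] - u[k] - v[k])           # y(2n+1)
    return y


h = [1, 2, 3, 4]
x = [1, 2, 3, 4, 5, 6]
parallel_output = fir_2parallel(h, x)
reference_output = fir(h, x)
```

The one pre-addition and three post-additions per pair of samples account for the "4 adds / two samples" term in the complexity estimate.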


Obtaining Coefficients of 2-Parallel Subfilters

• Example for N = 8
– H(z) = {h0, h1, h2, h3, h4, h5, h6, h7}

• Subfilter coefficients obtained by performing a polyphase decomposition by 2. Each subfilter has N/2 = 4 coefficients:
– H0(z) = {h0, h2, h4, h6}
– H1(z) = {h1, h3, h5, h7}
– H0(z) + H1(z) = {h0+h1, h2+h3, h4+h5, h6+h7}
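The decomposition amounts to taking every second coefficient. The short sketch below shows this with list slicing; the helper name `polyphase` and the placeholder coefficient values 0 to 7 (standing in for h0 to h7) are illustrative only.

```python
def polyphase(h, phases):
    """Split coefficient list h into `phases` interleaved subfilters."""
    return [h[p::phases] for p in range(phases)]


h = [0, 1, 2, 3, 4, 5, 6, 7]           # stand-ins for h0 .. h7
h0, h1 = polyphase(h, 2)               # [0, 2, 4, 6] and [1, 3, 5, 7]
h0_plus_h1 = [a + b for a, b in zip(h0, h1)]  # [1, 5, 9, 13]
```

Since the coefficients do not change at run time, the sum H0(z) + H1(z) can be computed once, before filtering starts.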


3-parallel filter

[Figure: 3-parallel FIR structure; inputs x(3n), x(3n+1), and x(3n+2) feed six N/3-tap subfilters H0(z), H1(z), H2(z), H0(z) + H1(z), H1(z) + H2(z), and H0(z) + H1(z) + H2(z), whose outputs are combined (using Z-1 delays) to produce y(3n), y(3n+1), and y(3n+2).]

• In the 3-parallel filter, three inputs arriving at a third of the original rate are processed by six parallel ceil(N/3)-tap filters

• Arithmetic complexity of the 3-parallel filter is approximately:
– 2/3 N multiplications/sample + (2/3 N + 4/3) adds/sample
– 33% reduction in multiplications/sample

• Problem 2b: Design a 3-parallel parametrizable FIR filter


Obtaining Coefficients of 3-Parallel Subfilters

• Example for N = 9
– H(z) = {h0, h1, h2, h3, h4, h5, h6, h7, h8}

• Subfilter coefficients obtained by performing a polyphase decomposition by 3. Each subfilter has N/3 = 3 coefficients:
– H0(z) = {h0, h3, h6}
– H1(z) = {h1, h4, h7}
– H2(z) = {h2, h5, h8}
– H0(z) + H1(z) = {h0+h1, h3+h4, h6+h7}
– H1(z) + H2(z) = {h1+h2, h4+h5, h7+h8}
– H0(z) + H1(z) + H2(z) = {h0+h1+h2, h3+h4+h5, h6+h7+h8}


Further parallelism

• These parallel structures introduce issues such as increased area, adder overhead (pre- and post-processing), etc. which eventually become prohibitive as the subsampling rate increases


Assumptions

All coefficients are loaded to the circuit before the start of processing and do not change during the runtime.

Registers storing coefficients are connected in chain, so coefficients must be loaded serially, in the proper order, starting from the ones with the smallest indices.


Parameters of the design

N: number of taps (N=8, 12, 16, 24, 32)

M: fractional wordlength of input (M=8..10)

K: fractional wordlength of output (K=8..10)

L: fractional wordlength of coefficients (L=7-11)


Required interface - basic architecture

[Figure: FIR Filter block with the following ports]
– clk
– reset_datapath
– reset_coeff
– d_in (1.M)
– coeff (1.L)
– filt_mode (0=load coefficients, 1=run filter)
– load_begin (0=idle, 1=start to load coefficients)
– load_coeff_done
– d_out (1.K)


Required interface – 2-parallel structure

[Figure: FIR Filter block with the following ports]
– clk
– reset_datapath
– reset_coeff
– d_in_1 (1.M)
– d_in_2 (1.M)
– coeff (1.L)
– filt_mode (0=load coefficients, 1=run filter)
– load_begin (0=idle, 1=start to load coefficients)
– load_coeff_done
– d_out_1 (1.K)
– d_out_2 (1.K)


One-Person Team Requirements

• Matlab code will be given for five different configurations (A, B, C, D, E), each with different values of N, M, L, and K.
– CASE A: N = 8, M = 8, K = 8, L = 7
– CASE B: N = 12, M = 9, K = 9, L = 8
– CASE C: N = 16, M = 9, K = 10, L = 9
– CASE D: N = 24, M = 10, K = 11, L = 10
– CASE E: N = 32, M = 11, K = 12, L = 11

• Step 1: Direct form and transpose form structures:
– Generate parametrizable VHDL code; round output of each multiplier to K fractional bits
– Generate test vectors using Matlab and verify the test vectors in RTL for configurations A-E
– Implement configurations B and D on FPGA
  • Optimize for minimum area
  • Optimize for maximum ratio of: throughput / area (CLB slices)

• Step 2: 2-parallel and 3-parallel fast FIR structures
– Generate parametrizable VHDL code; round output of each multiplier to K fractional bits
– Generate test vectors using Matlab and verify the test vectors in RTL for configurations B and D
– Implement configurations B and D on FPGA
  • Optimize for minimum area
  • Optimize for maximum ratio of: throughput / area (CLB slices)


Two-Person Team Additional Requirements

• Step 3: 4-parallel and 6-parallel fast FIR structures. See ref [2] for block diagrams.

– Generate parametrizable VHDL code; round output of each multiplier to K fractional bits

– Generate test vectors using Matlab and verify the test vectors in RTL for configurations B and D

– Implement configurations B and D on FPGA
  • Optimize for minimum area
  • Optimize for maximum ratio of: throughput / area (CLB slices)

• Step 4: Quantization studies
– For the 6-parallel filter and configurations B and D, implement truncation instead of rounding after the multipliers.
  • Optimize for minimum area
  • Optimize for maximum ratio of: throughput / area (CLB slices)
– For the 4-parallel filter and configurations B and D, round to K+4 bits after the multipliers. Round again to K bits right before the filter outputs to produce a 1.K output.
  • Optimize for minimum area
  • Optimize for maximum ratio of: throughput / area (CLB slices)


Required reading

[1] Z. Mou and P. Duhamel, “Short-length FIR filters and their use in fast nonrecursive filtering,” IEEE Transactions on Signal Processing, vol. 39, no. 6, pp. 1322-1332, June 1991.

[2] K.K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation, John Wiley, pp. 256-275, 1999.

Source of test vectors

Matlab implementation – provided by Dr. Hwang


Important Notes on Two’s Complement Arithmetic


Project Notation

• For this project, we are using two’s complement fractional notation
• An m.M number indicates a two’s complement m+M bit word with m integer bits and M fractional bits
• Example: 1.3 number
– 0.111 = +0.875
– 1.000 = -1
– 1.111 = -0.125
• Example: 2.2 number
– 00.11 = +0.75
– 10.00 = -2
– 01.01 = +1.25
• The dynamic range of an m.M number is $[-2^{m-1}, 2^{m-1})$
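The notation can be checked with a short sketch that converts a bit pattern, written with a binary point, into its two's complement value. The helper name `fixed_value` is an assumption made for illustration.

```python
def fixed_value(bits):
    """Value of a two's complement fixed-point word such as '1.111'."""
    int_part, frac_part = bits.split(".")
    word = int_part + frac_part
    raw = int(word, 2)
    if word[0] == "1":                 # sign bit set: subtract 2^(m+M)
        raw -= 1 << len(word)
    return raw / (1 << len(frac_part))  # scale by 2^-M


values = [fixed_value(b) for b in ("0.111", "1.000", "1.111")]
# values == [0.875, -1.0, -0.125]
wider = [fixed_value(b) for b in ("00.11", "10.00", "01.01")]
# wider == [0.75, -2.0, 1.25]
```

The most negative pattern (sign bit set, all other bits clear) gives the lower end of the dynamic range, while the upper end is never reached.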


Two’s Complement Multiplication

• The wordlength required for the product of 1.M x 1.L numbers =
– 2.(M+L) if we assume -1 x -1 = +1 may occur
– 1.(M+L) if we assume -1 x -1 = +1 will never occur

• In general a product of m.M x l.L numbers =
– (m+l).(M+L) if we assume (most neg value of a) x (most neg value of b) may occur
– (m+l-1).(M+L) if we assume (most neg value of a) x (most neg value of b) will never occur

• In this project, we assume that (most neg value of a) x (most neg value of b) will never occur for any multiplier in any filter structure. This is guaranteed by scaling the inputs and coefficients properly in Matlab.
– Examples: 1.5 x 2.5 = 2.10, 1.4 x 1.6 = 1.10, 3.4 x 2.3 = 4.7

[Figure: multiplier with inputs a (1.M) and b (1.L) and output 1.(M+L).]
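The wordlength rule is simple enough to capture in a helper that can be reused when sizing signals. The sketch below is illustrative; the function name and the `allow_most_negative_product` flag are assumptions, with the default matching the project's stated assumption that the most negative product never occurs.

```python
def product_format(a, b, allow_most_negative_product=False):
    """Format of a x b, with formats given as (integer_bits, fractional_bits)."""
    (m, big_m), (l, big_l) = a, b
    integer_bits = m + l if allow_most_negative_product else m + l - 1
    return integer_bits, big_m + big_l


examples = [
    product_format((1, 5), (2, 5)),  # 1.5 x 2.5 -> (2, 10)
    product_format((1, 4), (1, 6)),  # 1.4 x 1.6 -> (1, 10)
    product_format((3, 4), (2, 3)),  # 3.4 x 2.3 -> (4, 7)
]
```

The extra integer bit is needed only for the single corner case in which both operands take their most negative value, for example -1 x -1 = +1, which does not fit in a 1.(M+L) word.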


Two’s Complement Truncation versus Rounding

• In this project, we ask you to round the output of each multiplier to K fractional bits.

• To round a k.K’ number to a k.K number (K < K’):
– Truncate the k.K’ number to become a k.K number
– Add the former fractional K+1 bit to fractional position K

• For information purposes, to truncate a k.K’ number to a k.K number (K < K’):
– Truncate the k.K’ number to become a k.K number

• Rounding and truncation produce equal noise variance, whereas rounding is (approximately) unbiased and truncation is biased


Truncation versus Rounding: Example: 2.5 number to a 2.3 number

ROUNDING
– 00.01110 → 00.011 + 1 = 00.100
– 11.01000 → 11.010 + 0 = 11.010
– 10.00110 → 10.001 + 1 = 10.010

TRUNCATION
– 00.01110 → 00.011
– 11.01000 → 11.010
– 10.00110 → 10.001
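The two procedures can be reproduced bit for bit in software, which is useful when comparing RTL output against reference vectors. The sketch below operates on bit strings with a binary point; the function names are illustrative, and a result that overflows after rounding simply wraps to the output width.

```python
def _parse(bits):
    """Return (signed raw integer, integer bits, fractional bits)."""
    int_part, frac_part = bits.split(".")
    word = int_part + frac_part
    raw = int(word, 2)
    if word[0] == "1":
        raw -= 1 << len(word)
    return raw, len(int_part), len(frac_part)


def _format(raw, int_bits, frac_bits):
    """Write a signed raw integer back as a two's complement bit string."""
    width = int_bits + frac_bits
    word = format(raw & ((1 << width) - 1), "0{}b".format(width))
    return word[:int_bits] + "." + word[int_bits:]


def truncate_fixed(bits, frac_out):
    raw, int_bits, frac_in = _parse(bits)
    return _format(raw >> (frac_in - frac_out), int_bits, frac_out)


def round_fixed(bits, frac_out):
    raw, int_bits, frac_in = _parse(bits)
    drop = frac_in - frac_out
    round_bit = (raw >> (drop - 1)) & 1  # the former fractional bit K+1
    return _format((raw >> drop) + round_bit, int_bits, frac_out)


inputs = ["00.01110", "11.01000", "10.00110"]
rounded = [round_fixed(b, 3) for b in inputs]
# rounded == ["00.100", "11.010", "10.010"]
truncated = [truncate_fixed(b, 3) for b in inputs]
# truncated == ["00.011", "11.010", "10.001"]
```

Python's right shift on negative integers is arithmetic, so it discards the low bits exactly as two's complement truncation does.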


Two’s Complement Addition

• FIR filters perform chains of additions

• A k.K number plus a k.K number requires a (k+1).K number to represent the sum
– Ex. 0.110 (0.75) + 0.110 (0.75) = 01.100 (1.5)
– Ex. 1.000 (-1) + 1.000 (-1) = 10.000 (-2)

• In general, an adder chain summing J numbers, each of wordlength k.K, requires a wordlength of (k + ceil(log2(J))).K after the final adder
– This grows for a large number of coefficients N

[Figure: direct form filter with wordlength annotations; 1.M inputs from the x(n) delay line are multiplied by coefficients h0, h1, … (labeled 1.N), each product (labeled 1.M+N) passes through a ROUND block to 1.K, and the 1.K values are summed to produce y(n).]
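The growth rule can be evaluated for the tap counts used in this project. The sketch below is illustrative only; the function name is an assumption, and `(J - 1).bit_length()` is used as an exact integer form of ceil(log2(J)) for J ≥ 1.

```python
def adder_chain_integer_bits(k, terms):
    """Integer bits needed after summing `terms` numbers of format k.K."""
    growth = (terms - 1).bit_length()  # equals ceil(log2(terms)) for terms >= 1
    return k + growth


# Summing two 1.K numbers needs 2.K, e.g. (-1) + (-1) = -2.
pair = adder_chain_integer_bits(1, 2)          # 2
# Full-precision widths for the project tap counts N = 8, 12, 16, 24, 32.
widths = [adder_chain_integer_bits(1, n) for n in (8, 12, 16, 24, 32)]
# widths == [4, 5, 5, 6, 6]
```

For a 32-tap filter with 1.K products, a full-precision accumulator would therefore need a 6.K word.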


Two’s Complement Adder Chain Trick using Modulo Arithmetic

• Trick: if we know output of adder is bounded within a k’.K value (where k’ is some known value), then all intermediate addition nodes only require k’.K bit wordlengths
– Provides hardware savings for large number of coefficients N!

• This is only true if we know the output of the adder chain is bounded
– Be careful, because x(2n) + x(2n+1) is not guaranteed to be bounded in 1.M; you need the full 2.M
– h(0) + h(1) is not guaranteed to be bounded in 1.L; you need the full 2.L
– In this project, this trick helps after multiplier outputs, not on multiplier inputs

• In our project, the final output y(n) is bounded within a 1.K bit wordlength. This has been controlled by scaling the inputs and coefficients in Matlab.

• To learn about more helpful hardware “tricks” take ECE 645 next semester!

[Figure: two adder chains summing 1.K inputs to produce y(n); one lets the intermediate wordlengths grow (2.K, 3.K, 3.K), the other keeps every intermediate node at 1.K.]
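The reason the trick works is that two's complement addition is arithmetic modulo 2^W: intermediate overflows cancel out as long as the final sum fits in the W-bit word. The sketch below demonstrates this on raw integers of a 4-bit (1.3) word; the function names and the sample values are illustrative only.

```python
def wrap(value, bits):
    """Reduce an integer to a signed two's complement word of `bits` bits."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):       # sign bit set
        value -= 1 << bits
    return value


def adder_chain_wrapped(terms, bits):
    """Sum the terms, wrapping to `bits` bits after every addition."""
    acc = 0
    trace = []
    for term in terms:
        acc = wrap(acc + term, bits)
        trace.append(acc)
    return acc, trace


# Raw 1.3 values: 6 = 0.75, 5 = 0.625, -7 = -0.875, -3 = -0.375
terms = [6, 5, -7, -3]
result, trace = adder_chain_wrapped(terms, 4)
# trace == [6, -5, 4, 1]: the second node overflows (6 + 5 = 11 wraps to -5),
# yet the final result equals the true sum, 1 (0.125).
```

If the true sum did not fit in the 4-bit word, the final result would be wrong, which is why the bound on y(n) has to be known in advance.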