Mattausch, CMOS Design, H21/6/12 1
Arithmetic Modules (Part 2)
• Circuits for Multiplication
– Manual Multiplication Process of Positive Binaries
– Combinational Multiplier Circuits for Positive Multi-Bit Binaries
– Sequential Multiplier Circuits for Positive Multi-Bit Binaries
– Handling of the Sign Bit
• Circuits for Division
– Manual Division Process of Positive Binaries
– Combinational Divider Circuits for Positive Multi-Bit Binaries
– Handling of the Sign Bit
• Parallel Arithmetic for Increased Throughput
– Matrix Arrangement of Simple Arithmetic Modules
– Important Application: Picture Processing
Again, the X (M-bit) and Y (N-bit) inputs are fed serially and in parallel, respectively. Z is determined in a pipelined fashion in M+2N clock cycles.
Each coefficient $Z_k$ (weight $2^k$) of the product is calculated in N-1 clock cycles, but pipelined, for all k.
The critical path of one adder stage allows a shorter clock cycle.
[Figure: serial-parallel multiplier built from clocked full-adder (FA) stages with Cin/Cout. Multiplier bits X in, least-significant bit first; multiplicand bits Y0 to Y3 applied in parallel; product bits Z out, least-significant bit first. The critical path is marked.]
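The bit-serial data flow described above can be sketched behaviourally in Python. This is a minimal sketch under stated assumptions, not the slide's circuit: the names `serial_parallel_multiply` and `bits_to_int` are hypothetical, and the model ignores the pipeline registers, so it needs M+N steps rather than the M+2N clock cycles quoted above.

```python
def serial_parallel_multiply(x, y, m, n):
    """Behavioural model of a serial-parallel multiplier (hypothetical helper).

    x: M-bit multiplier, fed serially, least-significant bit first.
    y: N-bit multiplicand, applied in parallel.
    Returns the product bits as a list, least-significant bit first.
    """
    acc = 0                # running partial sum held in the adder stages
    product_bits = []
    for t in range(m + n):
        # After the M multiplier bits, zeros are fed to flush the result.
        x_bit = (x >> t) & 1 if t < m else 0
        acc += x_bit * y                 # add the partial product x_t * Y
        product_bits.append(acc & 1)     # emit one product bit per step
        acc >>= 1                        # shift the partial sum right
    return product_bits


def bits_to_int(bits):
    """Convert an LSB-first bit list to an integer."""
    return sum(b << i for i, b in enumerate(bits))
```

For example, `bits_to_int(serial_parallel_multiply(5, 3, 3, 2))` evaluates to 15.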
Handling of the Sign Bit in Multiplications
The sign bit of the product is simply determined with an EXOR circuit from the sign bits of multiplicand and multiplier.
The sign bits $S_X$ and $S_Y$ of multiplicand and multiplier determine the sign bit $S_Z$ of the product.
$X_{\text{base 2}} = S_X X_{M-1} X_{M-2} \dots X_3 X_2 X_1 X_0$
$Y_{\text{base 2}} = S_Y Y_{N-1} Y_{N-2} \dots Y_3 Y_2 Y_1 Y_0$
$S_Z = S_X \oplus S_Y = \overline{S_X} \cdot S_Y + S_X \cdot \overline{S_Y}$
S_X  S_Y  S_Z
0    0    0
0    1    1
1    0    1
1    1    0
The sign bit $S_Z$ of the product is only 1 (negative) if $S_X$ and $S_Y$ are different.
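The sign rule can be illustrated with a short Python sketch for sign-magnitude operands. The function name `sign_magnitude_multiply` and the representation of a number as a (sign bit, magnitude) pair are assumptions made for illustration only.

```python
def sign_magnitude_multiply(sx, x_mag, sy, y_mag):
    """Multiply two sign-magnitude numbers (hypothetical helper).

    sx, sy: sign bits (0 = positive, 1 = negative).
    x_mag, y_mag: non-negative magnitudes.
    Returns (sz, z_mag): sign bit and magnitude of the product.
    """
    sz = sx ^ sy              # EXOR of the two sign bits
    z_mag = x_mag * y_mag     # magnitudes are multiplied as positive binaries
    return sz, z_mag
```

With this rule the magnitude multiplier never sees the sign bits, so a multiplier for positive binaries can be reused unchanged.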
Circuits for Division
- Manual Division Process of Positive Binaries
- Combinational Divider Circuits for Positive Multi-Bit Binaries
- Handling of the Sign Bit
Manual Division Process (Positive Binaries)
The division process recursively subtracts $2^i \cdot D$ from R ($R_{\text{initial}} = A$). The sign of the result determines bit $q_i$ of Q.
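Example: 4-Bit Divisor and 7-Bit Dividend
[Figure: worked manual division showing divisor (D), dividend (A), quotient (Q) and remainder (R).]
General Algorithm for M-Bit Divisor and N-Bit Dividend
[Flowchart:]
1. Start: $D' = D \cdot 2^{N-1}$, $Q = 0$, $R = A$
2. $R = R - D'$
3. If $R \geq 0$: $Q = Q \cdot 2^1 + 2^0$. Otherwise: $Q = Q \cdot 2^1$ and restore $R = R + D'$.
4. $D' = D' \cdot 2^{-1}$
5. Test $R \geq D$ (yes/no) to choose between another pass from step 2 and End.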
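The subtract, test and restore steps of the algorithm can be sketched in Python as follows. This is a minimal sketch: the name `restoring_divide` is hypothetical, and the loop simply runs a fixed N iterations, which is an assumption swapped in for the flowchart's final test.

```python
def restoring_divide(a, d, n):
    """Restoring division of positive binaries (hypothetical helper).

    a: dividend, assumed to fit in n bits.
    d: divisor, must be positive.
    Returns (q, r) with a == q * d + r and 0 <= r < d.
    """
    if d <= 0:
        raise ValueError("divisor must be positive")
    d_shifted = d << (n - 1)      # D' = D * 2^(N-1)
    q = 0
    r = a
    for _ in range(n):            # assumption: fixed number of N steps
        r -= d_shifted            # trial subtraction R = R - D'
        if r >= 0:
            q = (q << 1) | 1      # Q = Q * 2 + 1
        else:
            q = q << 1            # Q = Q * 2
            r += d_shifted        # restore R = R + D'
        d_shifted >>= 1           # D' = D' / 2
    return q, r
```

For instance, `restoring_divide(100, 9, 7)` returns `(11, 1)`, since 100 = 11 · 9 + 1.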
Basic Unit of a Combinational Divider Circuit
The quotient-bit-dependent restore can be realized with a multiplexer and the shift function can be realized by interconnecting the divisor bit to the next column.
The basic unit of a divider circuit has to provide a subtract function, a quotient-bit-dependent restore function and a shift function for the divisor bit.
[Figure: divider cell (DC) built from a full adder (FA) and a multiplexer (MUX, inputs 0 and 1). Signals: $S_i$, $S'_i$, $d_j$, $C_i$, $C_{i+1}$, $q_k$.]
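A bit-level Python sketch can make the roles of the full adder and the multiplexer concrete. Several details are assumptions not confirmed by the slide: the subtraction is modelled as two's-complement addition (inverted divisor bit, carry-in of 1 into the least-significant cell), the carry-out of a row is taken as the quotient bit, and all function names are hypothetical.

```python
def full_adder(a, b, c_in):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ c_in
    c_out = (a & b) | (a & c_in) | (b & c_in)
    return s, c_out


def divider_cell(s_i, d_j, c_i, q_k):
    """One divider cell: subtract d_j from s_i, then keep or restore.

    Assumption: the subtraction is s_i + NOT(d_j) + c_i.
    The multiplexer selects the difference if q_k == 1, else the old s_i.
    """
    diff, c_next = full_adder(s_i, 1 - d_j, c_i)
    s_out = diff if q_k else s_i
    return s_out, c_next


def divider_row(r_bits, d_bits):
    """One row of cells on LSB-first bit lists of equal length.

    Returns (q_k, new_r_bits). The carry chain does not depend on q_k,
    so it is rippled first to obtain the quotient bit.
    """
    carry = 1                                   # carry-in 1 completes NOT(d) + 1
    for s_i, d_j in zip(r_bits, d_bits):
        _, carry = divider_cell(s_i, d_j, carry, 0)
    q_k = carry                                 # carry-out 1 means R - D >= 0
    carry = 1
    new_r_bits = []
    for s_i, d_j in zip(r_bits, d_bits):
        s_out, carry = divider_cell(s_i, d_j, carry, q_k)
        new_r_bits.append(s_out)
    return q_k, new_r_bits
```

As a usage example, `divider_row([1, 0, 1], [1, 1, 0])` models 5 - 3 and returns `(1, [0, 1, 0])`, while `divider_row([0, 1, 0], [1, 1, 0])` models 2 - 3 and returns `(0, [0, 1, 0])`, that is, the restored value 2.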
Combinational Array-Divider Circuit (Example for 6-bit dividend and 3-bit divisor)
The critical delay path of array-divider circuits (N-bit dividend, M-bit divisor) contains (N-M/2)•(M-1) full-adder stages.
[Figure: array divider built from divider cells (DC). Inputs: dividend bits a5 to a0 and divisor bits d2, d1, d0; outputs: quotient bits q5 to q0 and remainder bits r2, r1, r0. The critical path is marked.]
Handling of the Sign Bit in Divisions
The sign bits of quotient and remainder are equal. An EXOR circuit of dividend and divisor sign bits determines them.
The sign bits $S_A$ and $S_D$ of dividend and divisor determine the sign bits $S_Q$ and $S_R$ of the quotient and remainder.
$A_{\text{base 2}} = S_A A_{M-1} A_{M-2} \dots A_3 A_2 A_1 A_0$
$D_{\text{base 2}} = S_D D_{N-1} D_{N-2} \dots D_3 D_2 D_1 D_0$
$S_Q = S_R = S_A \oplus S_D$
S_A  S_D  S_Q = S_R
0    0    0
0    1    1
1    0    1
1    1    0
The sign bits $S_Q$ and $S_R$ of the quotient and remainder are only 1 (negative) if $S_A$ and $S_D$ are different.
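The sign rule stated above can be written as a small Python sketch for sign-magnitude operands. The function name `sign_magnitude_divide` and the (sign bit, magnitude) pairs are assumptions for illustration; the sketch follows the rule given on this slide, in which quotient and remainder share one sign bit.

```python
def sign_magnitude_divide(sa, a_mag, sd, d_mag):
    """Divide two sign-magnitude numbers (hypothetical helper).

    sa, sd: sign bits of dividend and divisor (0 = positive, 1 = negative).
    a_mag, d_mag: non-negative magnitudes, d_mag must be non-zero.
    Returns ((sq, q_mag), (sr, r_mag)).
    """
    s = sa ^ sd                            # EXOR of dividend and divisor signs
    q_mag, r_mag = divmod(a_mag, d_mag)    # divide the positive magnitudes
    return (s, q_mag), (s, r_mag)          # same sign bit for both results
```

As with multiplication, the divider for positive binaries is reused unchanged and only one extra EXOR gate is needed.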
Parallel Arithmetic for Increased Throughput
- Matrix Arrangement of Simple Arithmetic Modules (Processing Elements)
- Important Application: Picture Processing
Parallel Processing with Many Simple Elements
High-performance applications can be realized with simple application-specific processing elements (PE). Each PE has the basic structure of datapath, control and memory.
Construct simple processing elements that are optimized for a target application.
[Figure: processing element (PE) consisting of input/output, control, memory, datapath and interconnect unit.]
Parallel Processing Structures with PEs
The common structures for parallel processing with PEs are the linear structure and the matrix structure.
Linear structure for parallel processing:
PE PE PE PE PE
Global Data-Exchange Bus
Local Data-Exchange between PEs
Matrix structure for parallel processing:
PE PE PE PE PE
PE PE PE PE PE
PE PE PE PE PE
Real-Time Picture Processing Needs a PE Matrix
High performance image processing needs the parallel processing power of a processing matrix with PEs.
[Figure: intelligent information processing of a car scene, with the stages picture segmentation, region extraction, object recognition, object tracking and motion.]
Picture Segmentation
• Natural image is partitioned into meaningful regions
• Important initial task for higher level image processing
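To illustrate why segmentation maps well onto a PE matrix, the following Python sketch assigns one hypothetical PE to each pixel and lets every PE exchange labels only with its four direct neighbours, in lockstep. This is not the method used in the research described here: the function name `segment`, the equal-value region criterion and the minimum-label rule are all assumptions chosen for brevity.

```python
def segment(image):
    """Label connected regions of equal pixel value (hypothetical helper).

    image: list of rows of pixel values.
    Each simulated PE starts with a unique label and repeatedly takes the
    minimum label among itself and its 4-neighbours with the same pixel
    value. All PEs update synchronously until no label changes.
    """
    rows, cols = len(image), len(image[0])
    labels = [[r * cols + c for c in range(cols)] for r in range(rows)]
    changed = True
    while changed:
        new_labels = [row[:] for row in labels]
        for r in range(rows):
            for c in range(cols):
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = r + dr, c + dc
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and image[nr][nc] == image[r][c]:
                        # local data exchange with one neighbouring PE
                        new_labels[r][c] = min(new_labels[r][c], labels[nr][nc])
        changed = new_labels != labels
        labels = new_labels
    return labels
```

For the 3x3 picture `[[0, 0, 1], [0, 1, 1], [1, 1, 1]]` the sketch returns `[[0, 0, 2], [0, 2, 2], [2, 2, 2]]`, that is, two regions. In hardware the inner loops would run in parallel, one PE per pixel, which is the motivation for the matrix structure.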
Research at RCNS: Video-Picture Segmentation
Video-picture segmentation is an active research topic that illustrates the need for parallel processing with a PE matrix.
[Figure: color-picture segmentation example on a matrix-processing network of processing elements (PE), P1 to P9, connected through interconnection registers (Type 1 and Type 2).]
(RCNS: Research Center for Nanodevices and Systems)
Video-Picture Segmentation Test-Chip Design (RCNS: Research Center for Nanodevices and Systems)