Combined LNS Adder/Subtractors for DCT Hardware, Jie Ruan & Mark G. Arnold

Post on 21-Dec-2015

1

Combined LNS Adder/Subtractors for DCT Hardware

Jie Ruan & Mark G. Arnold

2

Outline

Logarithmic Number System (LNS)

Discrete Cosine Transform (DCT)

Combined LNS adder/subtractor

3

LNS (Logarithmic Number System)

Represents a number by a sign bit S and an exponent to a certain base b

Word layout: S (sign bit), then the exponent (n-1 bits)

F (precision): the number of fractional bits in the exponent
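The representation above can be illustrated with a small sketch. It is not taken from the slides: the function names `lns_encode` and `lns_decode`, the default base 2, and the use of floating-point logarithms in place of hardware are all assumptions made for illustration.

```python
import math

def lns_encode(value, frac_bits, base=2.0):
    """Return (sign, exponent), with the exponent rounded to frac_bits fractional bits."""
    if value == 0:
        # Zero has no logarithm; a real design would need a special code (not modeled here).
        raise ValueError("zero cannot be represented by a logarithm")
    sign = 1 if value < 0 else 0
    scale = 1 << frac_bits                      # 2**F quantization steps per unit
    exponent = round(math.log(abs(value), base) * scale) / scale
    return sign, exponent

def lns_decode(sign, exponent, base=2.0):
    """Convert a (sign, exponent) pair back to an ordinary number."""
    magnitude = base ** exponent
    return -magnitude if sign else magnitude

# Hypothetical example value: -10 stored with F = 4 fractional bits.
sign, exponent = lns_encode(-10.0, frac_bits=4)
approx = lns_decode(sign, exponent)
```

With F fractional bits the exponent is off by at most half a step, so the relative error of the decoded value is bounded by roughly b^(2^-(F+1)) - 1, independent of the magnitude.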

4

Properties of LNS

Large dynamic range

Easy multiplications, divisions and exponentiations

Additions are not linear operations in LNS

Cost of adders grows exponentially with word length

Advantageous at low precisions

5

LNS Arithmetic Units

Multiplication

• $\log_b(XY) = \log_b X + \log_b Y$

• The cost is a fixed-point adder

Addition

• More complex process than multiplication

• E.g., when calculating $\log_b(X+Y)$, with $x = \log_b X$, $y = \log_b Y$:

1. Calculate $z = x - y$ (corresponds to $Z = X/Y$)

2. Table-lookup $s_b(z) = \log_b(1 + b^z)$ (corresponds to $1 + X/Y$)

3. $\log_b(X+Y) = y + s_b(z)$ (corresponds to $Y(1 + X/Y) = X + Y$)

Subtraction

• $d_b(z) = \log_b|1 - b^z|$
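The three steps above can be sketched in software. This is only an illustration under stated assumptions: base b = 2, operands of the same sign, and `s_b` / `d_b` evaluated with floating-point math where the hardware would read a ROM table. The function names are hypothetical.

```python
import math

B = 2.0  # assumed base b

def s_b(z):
    """Addition function s_b(z) = log_b(1 + b^z); a table lookup in hardware."""
    return math.log(1.0 + B ** z, B)

def d_b(z):
    """Subtraction function d_b(z) = log_b|1 - b^z|; raises ValueError at z = 0 (X == Y)."""
    return math.log(abs(1.0 - B ** z), B)

def lns_mul(x, y):
    """log_b(XY): a single fixed-point addition."""
    return x + y

def lns_add(x, y):
    """log_b(X + Y) for same-sign operands: subtract, look up, add."""
    z = x - y
    return y + s_b(z)

def lns_sub(x, y):
    """log_b|X - Y| for same-sign operands."""
    z = x - y
    return y + d_b(z)

# Hypothetical operands X = 6 and Y = 2.
x, y = math.log(6.0, B), math.log(2.0, B)
product = B ** lns_mul(x, y)      # X * Y
total = B ** lns_add(x, y)        # X + Y
difference = B ** lns_sub(x, y)   # |X - Y|
```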

6

LNS Multiplication and Addition

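[Figure: block diagrams of LNS multiplication and LNS addition]

$x = \log_b X$, $y = \log_b Y$

LNS multiplication: $\log_b(XY) = x + y$

LNS addition: $z = x - y$, then $\log_b(X+Y) = y + s_b(z)$ ($= y + d_b(z)$ when $S_x \neq S_y$)

$s_b(z) = \log_b(1 + b^z)$

$d_b(z) = \log_b|1 - b^z|$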

7

Discrete Cosine Transform

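$$F(u,v) = \frac{c(u)}{2}\,\frac{c(v)}{2} \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y)\, \cos\frac{(2x+1)u\pi}{16}\, \cos\frac{(2y+1)v\pi}{16}, \qquad u, v = 0 \ldots 7, \quad x, y = 0 \ldots 7$$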

An important part in MPEG encoding

2 Dimensional 8x8 DCT

2-D DCT usually performed through 2 rounds of 1-D DCT to reduce the hardware cost
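The two-round approach can be checked numerically. The sketch below is not from the slides; it assumes the usual normalization c(0) = 1/√2 and c(k) = 1 otherwise, and the function names and the sample block are made up for illustration. It compares a direct 2-D DCT with one built from two rounds of 1-D DCTs.

```python
import math

N = 8

def c(k):
    """Assumed normalization factor: 1/sqrt(2) for k = 0, else 1."""
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

def dct_1d(v):
    """8-point 1-D DCT of a list of 8 samples."""
    return [0.5 * c(u) * sum(v[x] * math.cos((2 * x + 1) * u * math.pi / 16)
                             for x in range(N))
            for u in range(N)]

def dct_2d_direct(f):
    """2-D DCT evaluated straight from the double sum."""
    return [[0.25 * c(u) * c(v) * sum(
                f[x][y]
                * math.cos((2 * x + 1) * u * math.pi / 16)
                * math.cos((2 * y + 1) * v * math.pi / 16)
                for x in range(N) for y in range(N))
             for v in range(N)]
            for u in range(N)]

def dct_2d_separable(f):
    """2-D DCT as two rounds of 1-D DCTs: first along y, then along x."""
    rows = [dct_1d(row) for row in f]                        # rows[x][v]
    cols = [dct_1d([rows[x][v] for x in range(N)])           # cols[v][u]
            for v in range(N)]
    return [[cols[v][u] for v in range(N)] for u in range(N)]

# Hypothetical 8x8 test block.
block = [[(3 * x + 5 * y) % 11 for y in range(N)] for x in range(N)]
direct = dct_2d_direct(block)
separable = dct_2d_separable(block)
max_diff = max(abs(direct[u][v] - separable[u][v])
               for u in range(N) for v in range(N))
```

The separable version needs 16 one-dimensional transforms per block instead of a 64-term sum for each of the 64 outputs, which is why a single 1-D DCT unit can be reused for both rounds.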

8

LNS DCT in MPEG encoding

Floating-point cost is too high for portable systems

LNS gives the same visual result as fixed-point at the same precision

LNS has a shorter word length than fixed-point numbers

At the same dynamic range and precision for MPEG-1:

• Fixed-point (12+F) bits

• LNS (6+F) bits

9

Fast DCT algorithm

Chen’s 1-D DCT algorithm (one cycle)

• Directly factorizes the DCT matrix

• 16 multiplications

• 26 additions

• Performs one 8-point 1-D DCT in one cycle

Two-cycle version by reusing hardware

• 14 adders

• 10 multipliers

• Performs one 8-point 1-D DCT in two cycles

10

Diagram of Chen’s 1-D DCT

[Figure: signal-flow graph of Chen’s 1-D DCT. Inputs f(0) to f(7); outputs in the order F(0), F(4), F(2), F(6), F(1), F(5), F(3), F(7); multiplier constants S(1/4), C(1/4), S(1/8), C(1/8), -C(1/8), S(1/16), C(1/16), -S(7/16), C(7/16), S(5/16), C(5/16), -S(3/16), C(3/16); "-" marks a subtraction]

S(m/n) = sin(mπ/n), C(m/n) = cos(mπ/n)

11

Combined LNS adders/subtractors

Many computational units in the DCT compute both X+Y and X-Y from the same pair of inputs

These two computations always access different tables: one reads the $s_b(z)$ table, the other the $d_b(z)$ table

The table-lookup part and some combinational parts can be shared between the two computations

12

Combined LNS adder/subtractors

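$x = \log_b X$, $y = \log_b Y$

X+Y:

1. $z = x - y$

2. Table-lookup $s_b(z) = \log_b(1 + b^z)$

3. $y + s_b(z)$

X-Y:

1. $z = x - y$

2. Table-lookup $d_b(z) = \log_b|1 - b^z|$

3. $y + d_b(z)$

Same hardware: step 1 ($z = x - y$) is identical in both

Same address for different tables: step 2 reads $s_b$ and $d_b$ at the same $z$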
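A minimal software sketch of this sharing follows. It is an illustration under assumptions, not the authors' circuit: base b = 2, floating-point functions standing in for the two ROM tables, and a hypothetical function name `combined_add_sub`. Only the log magnitudes are produced; result signs are not modeled.

```python
import math

B = 2.0  # assumed base b

def s_b(z):
    """s_b(z) = log_b(1 + b^z); one ROM table in hardware."""
    return math.log(1.0 + B ** z, B)

def d_b(z):
    """d_b(z) = log_b|1 - b^z|; the other ROM table. Raises ValueError at z = 0."""
    return math.log(abs(1.0 - B ** z), B)

def combined_add_sub(x, sx, y, sy):
    """Return (log_b|X+Y|, log_b|X-Y|) for X = (-1)^sx * b^x and Y = (-1)^sy * b^y."""
    z = x - y                  # one shared subtraction
    s, d = s_b(z), d_b(z)      # both tables read at the same address z
    if sx == sy:
        return y + s, y + d    # equal signs: sum uses s_b, difference uses d_b
    return y + d, y + s        # opposite signs: the table roles swap

# Hypothetical operands: |X| = 6, |Y| = 2.
lx, ly = math.log(6.0, B), math.log(2.0, B)
same_sum, same_diff = combined_add_sub(lx, 0, ly, 0)   # X = 6, Y = 2
opp_sum, opp_diff = combined_add_sub(lx, 0, ly, 1)     # X = 6, Y = -2
```

Whatever the signs, one output always takes the s_b value and the other the d_b value, so a single address computation can serve both tables.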

13

Combined LNS adder/subtractors (type 1)

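[Figure: type 1 combined LNS adder/subtractor. Inputs x and y; z = x - y addresses both the $s_b(z)$ and $d_b(z)$ tables; each table output is added to y]

Sum output: $\log_b(X+Y) = y + s_b(z)$ ($= y + d_b(z)$ when $S_x \neq S_y$)

Difference output: $\log_b|X-Y| = y + d_b(z)$ ($= y + s_b(z)$ when $S_x \neq S_y$)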

15

Diagram of Chen’s 1-D DCT

[Figure: the signal-flow graph of Chen’s 1-D DCT from slide 10, shown again with the units that use the constants S(1/8), C(1/8) and -C(1/8) called out separately]

S(m/n) = sin(mπ/n), C(m/n) = cos(mπ/n)

16

Combined LNS adder/subtractors

Some computation units perform the computations below

$a_1 X + a_2 Y$

$-a_2 X + a_1 Y$ ($a_1$, $a_2$ are constants)

[Figure: a unit of this kind with the constants S(1/8), C(1/8), S(1/8), -C(1/8)]

The two computations access different tables in an LNS adder

Share the table-lookup part

Add some extra combinational hardware

The table lookups of the two computations use different addresses

17

Combined LNS adder/subtractors (type 2)

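[Figure: type 2 combined LNS adder/subtractor with shared $s_b(z)$ and $d_b(z)$ tables]

Inputs: $\log_b a_1X$, $\log_b a_2Y$, $\log_b a_2X$, $\log_b a_1Y$

Table addresses: $z_1$ and $z_2$

Sum output: $\log_b(a_1X + a_2Y) = y + s_b(z_1)$ ($= y + d_b(z_1)$ when $S_x \neq S_y$)

Difference output: $\log_b(-a_2X + a_1Y) = y + d_b(z_2)$ ($= y + s_b(z_2)$ when $S_x \neq S_y$)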
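The type 2 unit can be sketched in the same style. The pairing of inputs into z1 and z2 below is an assumption derived from the algebra, since the figure itself did not survive extraction: z1 = log_b(a1 X) - log_b(a2 Y) and z2 = log_b(a2 X) - log_b(a1 Y). Base 2, positive constants, floating-point stand-ins for the tables, and the name `rotate_add_sub` are also assumptions.

```python
import math

B = 2.0  # assumed base b

def s_b(z):
    """s_b(z) = log_b(1 + b^z)."""
    return math.log(1.0 + B ** z, B)

def d_b(z):
    """d_b(z) = log_b|1 - b^z|; raises ValueError at z = 0."""
    return math.log(abs(1.0 - B ** z), B)

def rotate_add_sub(x, y, log_a1, log_a2, sx=0, sy=0):
    """Return (log_b|a1*X + a2*Y|, log_b|-a2*X + a1*Y|) for positive constants a1, a2."""
    # Multiplying by a constant is an addition of logarithms.
    a1x, a2y = log_a1 + x, log_a2 + y
    a2x, a1y = log_a2 + x, log_a1 + y
    z1 = a1x - a2y             # address for the first output
    z2 = a2x - a1y             # a different address for the second output
    if sx == sy:
        return a2y + s_b(z1), a1y + d_b(z2)
    return a2y + d_b(z1), a1y + s_b(z2)

# Hypothetical example: a1 = cos(pi/8), a2 = sin(pi/8), X = 3, Y = 5.
A1, A2 = math.cos(math.pi / 8), math.sin(math.pi / 8)
X, Y = 3.0, 5.0
sum_log, diff_log = rotate_add_sub(math.log(X, B), math.log(Y, B),
                                   math.log(A1, B), math.log(A2, B))
```

Unlike type 1, the two lookups use different addresses, yet one of them always goes to the s_b table and the other to the d_b table.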

18

Portions of table-lookup part in LNS adders

[Figure: bar chart of LNS adder area (axis 0 to 3000) versus F = 2, 3, 4, 5, split into combinational and table-lookup parts]

19

ROM size with/without combined LNS adder/subtractors

[Figure: bar chart of ROM bits (axis 0 to 200000) for the one-cycle and two-cycle designs, without and with combined LNS adder/subtractors]

20

Hardware comparison for LNS adder and LNS adder/subtractors

[Figure: chart of area (axis 0 to 4000) versus F = 2, 3, 4, 5 for the ordinary, type 1 and type 2 LNS adders]

21

LNS adder/subtractors in Chen’s hardware

                              LNS adders   Ordinary   Type 1   Type 2

Direct inferred hardware          26           0        10        3

Two-cycle version hardware        14           4         3        2

Each combined unit stands in for two LNS adders: 2(10 + 3) = 26 and 4 + 2(3 + 2) = 14.

22

Hardware comparison for Chen’s DCT algorithm at F=4

[Figure: bar chart with axis from 0 to 140000; labels: Fixed-point, One-cycle, Two-cycle]

23

Conclusion

Combined LNS adder/subtractors give significant area savings in DCT hardware

Suitable for reducing area in portable MPEG devices

Some overhead when converting to/from fixed-point
