M.tech Final Project

Apr 04, 2018

Syed Hyder
  • 7/31/2019 M.tech Final Project


    VIF COLLEGE OF ENGINEERING &

    TECHNOLOGY

(Himayat Nagar, Gandipet X Road)

    (Affiliated to Jawaharlal Nehru Technological University)


    P.G.DEPARTMENT OF ECE

    Seminar Report

    On

    --------------------------------------

    Submitted By

    Name:

Roll No:

Branch:


    VIF COLLEGE OF ENGINEERING &

TECHNOLOGY (---------)


    P.G.DEPARTMENT OF ECE

    Certificate

    This is to certify that Mrs. ---------------------------------

    bearing H.T.No. -------- has satisfactorily completed the course of

Seminar entitled --------------- prescribed by JNTU for the I-SEMESTER of M.Tech (Branch) during the academic year 2010-2011.

Faculty In-charge                              Head of the Department


    Acknowledgement


    Abstract

This paper provides a detailed study of a configurable multiplier optimized for low-power and high-speed operation, which can be configured for a single 16-bit multiplication, a single 8-bit multiplication, or twin parallel 8-bit multiplications. The output product can be truncated to further decrease power consumption and increase speed at the cost of a small loss in output precision. Furthermore, the proposed multiplier maintains an acceptable output quality with sufficient accuracy when truncation is performed. Thus it provides a flexible arithmetic capacity and a tradeoff between output precision and power consumption. The approach also dynamically detects the input range of the operands and disables the switching activity of the ineffective ranges. The ineffective circuitry can therefore be efficiently deactivated, reducing power consumption and increasing the speed of operation. As a result, the proposed multiplier outperforms the conventional multiplier in terms of power and speed efficiency.


    List of Tables


    List of Figures


Table of Contents

    Acknowledgement

    Abstract

    List of Tables

    List of Figures

    Chapter 1: Introduction

    Chapter 2: Multipliers

Chapter 3: Booth's Algorithm

    Chapter 4: Timing and Area Analysis

    Chapter 5: Conclusion

    Appendix A. Test File Verilog Code

    References


    Chapter 1

    Introduction

Portable multimedia and digital signal processing (DSP) systems, which typically require flexible processing ability, low power consumption, and short design cycles, have become increasingly popular over the past few years. Many multimedia and DSP applications are highly multiplication intensive, so the performance and power consumption of these systems are dominated by multipliers. A multiplier manipulates two input operands to generate many partial products for subsequent addition operations, which in CMOS circuit design require many switching activities. This switching activity within the functional unit accounts for the majority of power consumption and also increases delay. Therefore, minimizing the switching activity can effectively reduce power dissipation and increase the speed of operation without impacting the circuit's operational performance. In addition, an energy-efficient multiplier is highly desirable for many multimedia applications.

The first multiplication algorithm developed for early computing requirements follows the steps that we use to multiply two numbers by hand [1]. According to Patterson and Hennessy [1], when this algorithm was translated for computer use, it required five hardware components, as seen in Figure 1.1: one register for each number (multiplicand, multiplier, and product), an ALU, and a control. The algorithm involves multiplying each digit of the multiplier with the multiplicand and adding up the individual results [1]. Since binary multiplication involves only 1s and 0s, multiplying each digit by the multiplicand reduces to shifting and adding of the multiplicand.

Figure 1.1 First Multiplication Hardware Implementation

As seen in Figure 1.1, the control tests the multiplier's least significant bit (LSB). If the LSB is 1, it sends a signal to the ALU to add the multiplicand to the current calculated product. The multiplier is then shifted to the right to fetch the next bit, and the multiplicand is shifted to the left to prepare for the next multiplication iteration. This algorithm, shown as a flow chart in Figure 1.2 [1], is the basis for the pen-and-paper algorithm.
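This control flow can be modeled in a few lines of software. The sketch below is a minimal Python model (the function name and default width are our own illustrative choices) of the shift-and-add loop of Figures 1.1 and 1.2: the control tests the multiplier's LSB, the ALU conditionally adds, and both operand registers shift each iteration.

```python
def shift_add_multiply(multiplicand: int, multiplier: int, bits: int = 32) -> int:
    """Software model of the first multiplication hardware (Figures 1.1 and 1.2)."""
    product = 0
    for _ in range(bits):
        if multiplier & 1:             # control tests the multiplier's LSB
            product += multiplicand    # ALU adds the multiplicand to the product
        multiplicand <<= 1             # multiplicand register shifts left
        multiplier >>= 1               # multiplier register shifts right to fetch the next bit
    return product

print(shift_add_multiply(123, 456))    # 56088, matching long multiplication
```

After `bits` iterations every multiplier bit has been examined once, so the loop performs at most one addition per bit, exactly as the flowchart describes.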


    Figure 1.2 First Multiplication Algorithm Flowchart for 32-bit Numbers

Many improvements have been made to the traditional pen-and-paper algorithm by reducing the number of additions being performed. In 1951, based on the idea that computers are faster at shifting bits than adding them [1], Andrew Donald Booth developed an algorithm now known as Booth's algorithm. There were many such discoveries through the years to improve the efficiency and performance of multiplication algorithms.

Here, an attempt is made to combine configuration, partially guarded computation, and the truncation technique to design a high-speed and power-efficient configurable Booth multiplier (CBM). The main concerns are speed, power efficiency, and structural flexibility. The proposed multiplier not only performs single 16-b, single 8-b, or twin parallel 8-b multiplication operations but also offers a flexible tradeoff between output accuracy and power consumption to achieve more power savings.

Several techniques [1]-[3] for improving speed and power efficiency have been analyzed. Approaches termed guarded evaluation, clock gating, signal gating, and truncation reduce the power consumption and increase the speed of multipliers by eliminating spurious computations according to the dynamic range of the input operands. The work in [4] separated the arithmetic units into most and least significant parts and turned off the most significant part when it did not affect the computation results, to save power. Techniques in [5] that can dynamically adjust two voltage supplies based on the range of the incoming operands and disable ineffective ranges with zero-detection circuitry were presented to decrease the power


consumption of multipliers. In [6], a dynamic-range detector to detect the effective range of two operands was developed. The operand with the smaller dynamic range is processed to generate the Booth encoding, so that partial products have a greater opportunity to be zero, thereby maximizing the reduction in power consumption.

Furthermore, in many multimedia and DSP systems the output product is frequently truncated due to the fixed register size and bus width inside the hardware. With this characteristic, significant power savings can be achieved by directly omitting the adder cells for computing the least significant bits of the output product, but large truncation errors are introduced. Various error-compensation approaches and circuits have therefore been developed, which add estimated compensation carries to the carry inputs of the retained adder cells to reduce the truncation error. In the constant scheme [7], constant error-compensation values were pre-computed and added to reduce the truncation error. In contrast, data-dependent error-compensation approaches [8]-[10] were developed to achieve better accuracy than the constant scheme, wherein data-dependent error-compensation values are added to reduce the truncation error of array and Booth multipliers (BMs).

Here, we attempt to combine configuration, partially guarded computation, and the truncation technique to design a power-efficient configurable BM (CBM). Our main concerns are power efficiency and structural flexibility. Because most common multimedia and DSP applications are based on 8- and 16-b operands, the proposed multiplier is designed to perform not only single 16-b but also single 8-b or twin parallel 8-b multiplication operations. The experimental results demonstrate that the proposed multiplier can provide various configurable characteristics for multimedia and DSP systems and achieve more power savings with slight area overhead.

    Chapter 2


    Multipliers

    A binary multiplier is an electronic circuit used in digital electronics, such as

    a computer, to multiply two binary numbers. It is built using binary adders.

A variety of computer arithmetic techniques can be used to implement a digital multiplier. Most techniques involve computing a set of partial products, and then summing the partial products together. This process is similar to the method taught to primary schoolchildren for conducting long multiplication on base-10 integers, but has been modified here for application to a base-2 (binary) numeral system.

    History

    Until the late 1970s, most minicomputers did not have a multiply instruction, and

so programmers used a "multiply routine" which repeatedly shifts and accumulates partial

    results, often written using loop unwinding. Mainframe computers had multiply

    instructions, but they did the same sorts of shifts and adds as a "multiply routine".

    Early microprocessors also had no multiply instruction. The Motorola 6809,

    introduced in 1978, was one of the earliest microprocessors with a dedicated hardware

    multiply instruction. It did the same sorts of shifts and adds as a "multiply routine", but

    implemented in the microcode of the MUL instruction.

    As more transistors per chip became available due to larger-scale integration, it

    became possible to put enough adders on a single chip to sum all the partial products at

    once, rather than reuse a single adder to handle each partial product one at a time.

    Because some common digital signal processing algorithms spend most of their

    time multiplying, digital signal processor designers sacrifice a lot of chip area in order to

make the multiply as fast as possible; a single-cycle multiply-accumulate unit often used

    up most of the chip area of early DSPs.

    Multiplication basics

    The method taught in school for multiplying decimal numbers is based on calculating

    partial products, shifting them to the left and then adding them together. The most difficult

    part is to obtain the partial products, as that involves multiplying a long number by one

    digit (from 0 to 9):

      123
    x 456
    =====
      738   (this is 123 x 6)
     615    (this is 123 x 5, shifted one position to the left)
  + 492     (this is 123 x 4, shifted two positions to the left)
    =====
    56088

    A binary computer does exactly the same, but with binary numbers. In binary

    encoding each long number is multiplied by one digit (either 0 or 1), and that is much

    easier than in decimal, as the product by 0 or 1 is just 0 or the same number. Therefore, the

    multiplication of two binary numbers comes down to calculating partial products (which

    are 0 or the first number), shifting them left, and then adding them together (a binary

    addition, of course):

       1011   (this is 11 in binary)
     x 1110   (this is 14 in binary)
     ======
       0000   (this is 1011 x 0)
      1011    (this is 1011 x 1, shifted one position to the left)
     1011     (this is 1011 x 1, shifted two positions to the left)
  + 1011      (this is 1011 x 1, shifted three positions to the left)
  =========
   10011010   (this is 154 in binary)

    This is much simpler than in the decimal system, as there is no table of

    multiplication to remember: just shifts and adds.

    This method is mathematically correct and has the advantage that a small CPU may

    perform the multiplication by using the shift and add features of its arithmetic logic unit

    rather than a specialized circuit. The method is slow, however, as it involves many

    intermediate additions. These additions take a lot of time. Faster multipliers may be

    engineered in order to do fewer additions; a modern processor can multiply two 64-bit

    numbers with 16 additions (rather than 64), and can do several steps in parallel.

    The second problem is that the basic school method handles the sign with a separate

    rule ("+ with + yields +", "+ with - yields -", etc.). Modern computers embed the sign of the


    number in the number itself, usually in the two's complement representation. That forces

    the multiplication process to be adapted to handle two's complement numbers, and that

    complicates the process a bit more. Similarly, processors that use ones' complement, sign-

    and-magnitude, IEEE-754 or other binary representations require specific adjustments to

    the multiplication process.

    A more advanced approach: an unsigned example

    For example, suppose we want to multiply two unsigned eight bit integers together: a[7:0]

    and b[7:0]. We can produce eight partial products by performing eight one-bit

    multiplications, one for each bit in multiplicand a:

p0[7:0] = a[0] x b[7:0] = {8{a[0]}} & b[7:0]
p1[7:0] = a[1] x b[7:0] = {8{a[1]}} & b[7:0]
p2[7:0] = a[2] x b[7:0] = {8{a[2]}} & b[7:0]
p3[7:0] = a[3] x b[7:0] = {8{a[3]}} & b[7:0]
p4[7:0] = a[4] x b[7:0] = {8{a[4]}} & b[7:0]
p5[7:0] = a[5] x b[7:0] = {8{a[5]}} & b[7:0]
p6[7:0] = a[6] x b[7:0] = {8{a[6]}} & b[7:0]
p7[7:0] = a[7] x b[7:0] = {8{a[7]}} & b[7:0]

    where {8{a[0]}} means repeating a[0] (the 0th bit of a) 8 times (Verilog notation).

    To produce our product, we then need to add up all eight of our partial products, as

    shown here:

    p0[7] p0[6] p0[5] p0[4] p0[3] p0[2] p0[1] p0[0]

    + p1[7] p1[6] p1[5] p1[4] p1[3] p1[2] p1[1] p1[0] 0

    + p2[7] p2[6] p2[5] p2[4] p2[3] p2[2] p2[1] p2[0] 0 0

    + p3[7] p3[6] p3[5] p3[4] p3[3] p3[2] p3[1] p3[0] 0 0 0

    + p4[7] p4[6] p4[5] p4[4] p4[3] p4[2] p4[1] p4[0] 0 0 0 0

    + p5[7] p5[6] p5[5] p5[4] p5[3] p5[2] p5[1] p5[0] 0 0 0 0 0

    + p6[7] p6[6] p6[5] p6[4] p6[3] p6[2] p6[1] p6[0] 0 0 0 0 0 0

    + p7[7] p7[6] p7[5] p7[4] p7[3] p7[2] p7[1] p7[0] 0 0 0 0 0 0 0

    -------------------------------------------------------------------------------------------

    P[15] P[14] P[13] P[12] P[11] P[10] P[9] P[8] P[7] P[6] P[5] P[4] P[3] P[2] P[1] P[0]

In other words, P[15:0] is produced by summing p0, p1, ..., p7, with each pi shifted left by i bit positions.
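The partial-product construction above can be checked with a short sketch. The following Python model (the function name is our own) forms each p_i as {8{a[i]}} & b, mirroring the Verilog notation, and sums the shifted rows to obtain P[15:0]:

```python
def unsigned_multiply_8x8(a: int, b: int) -> int:
    """Sum the eight masked partial products p_i = ({8{a[i]}} & b) << i."""
    assert 0 <= a < 256 and 0 <= b < 256
    product = 0
    for i in range(8):
        mask = 0xFF if (a >> i) & 1 else 0x00   # replicate bit a[i] eight times
        product += (mask & b) << i              # row i, shifted i positions left
    return product                              # this is P[15:0]

print(unsigned_multiply_8x8(11, 14))            # 154, as in the binary example above
```

Because each row is either zero or a shifted copy of b, the sum reproduces ordinary unsigned multiplication for all 8-bit operands.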


    bit position 0 (LSB) and all the -1's in bit columns 7 through 14 (where each of the MSBs

    are located) are added together, they can be simplified to the single 1 that "magically" is

    floating out to the left. For an explanation and proof of why flipping the MSB saves us the

sign extension, see a computer arithmetic book.


    Chapter 3

Booth's Algorithm

    Booth's multiplication algorithm is a multiplication algorithm that multiplies two

    signed binary numbers in two's complement notation. The algorithm was invented

    by Andrew Donald Booth in 1950 while doing research on crystallography at Birkbeck

    College in Bloomsbury, London. Booth used desk calculators that were faster

    at shifting than adding and created the algorithm to increase their speed. Booth's algorithm

    is of interest in the study of computer architecture.

    The algorithm

Booth's algorithm examines adjacent pairs of bits of the N-bit multiplier Y in signed two's complement representation, including an implicit bit below the least significant bit, y-1 = 0. For each bit yi, for i running from 0 to N-1, the bits yi and yi-1 are considered. Where these two bits are equal, the product accumulator P remains unchanged. Where yi = 0 and yi-1 = 1, the multiplicand times 2^i is added to P; and where yi = 1 and yi-1 = 0, the multiplicand times 2^i is subtracted from P. The final value of P is the signed product.

The representation of the multiplicand and product are not specified; typically, these are both also in two's complement representation, like the multiplier, but any number system that supports addition and subtraction will work as well. As stated here, the order of the steps is not determined. Typically, it proceeds from LSB to MSB, starting at i = 0; the multiplication by 2^i is then typically replaced by incremental shifting of the P accumulator to the right between steps; low bits can be shifted out, and subsequent additions and subtractions can then be done just on the highest N bits of P [1]. There are many variations and optimizations on these details.

The algorithm is often described as converting strings of 1's in the multiplier to a high-order +1 and a low-order -1 at the ends of the string. When a string runs through the MSB, there is no high-order +1, and the net effect is interpretation as a negative of the appropriate value.

    A typical implementation


Booth's algorithm can be implemented by repeatedly adding (with ordinary unsigned binary addition) one of two predetermined values A and S to a product P, then performing a rightward arithmetic shift on P. Let m and r be the multiplicand and multiplier, respectively; and let x and y represent the number of bits in m and r.

1. Determine the values of A and S, and the initial value of P. All of these numbers should have a length equal to (x + y + 1).

   1. A: Fill the most significant (leftmost) bits with the value of m. Fill the remaining (y + 1) bits with zeros.

   2. S: Fill the most significant bits with the value of (-m) in two's complement notation. Fill the remaining (y + 1) bits with zeros.

   3. P: Fill the most significant x bits with zeros. To the right of this, append the value of r. Fill the least significant (rightmost) bit with a zero.

2. Determine the two least significant (rightmost) bits of P.

   1. If they are 01, find the value of P + A. Ignore any overflow.

   2. If they are 10, find the value of P + S. Ignore any overflow.

   3. If they are 00, do nothing. Use P directly in the next step.

   4. If they are 11, do nothing. Use P directly in the next step.

3. Arithmetically shift the value obtained in the 2nd step by a single place to the right. Let P now equal this new value.

4. Repeat steps 2 and 3 until they have been done y times.

5. Drop the least significant (rightmost) bit from P. This is the product of m and r.
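Steps 1-5 can be transcribed almost directly into software. The Python sketch below (function and variable names are our own) follows the A/S/P construction above; the "one more bit" correction for the most negative multiplicand, described later, amounts to simply passing a multiplicand width x one bit larger than strictly needed.

```python
def booth_multiply(m: int, r: int, x: int, y: int) -> int:
    """Booth's algorithm: x-bit multiplicand m, y-bit multiplier r (signed)."""
    total = x + y + 1                        # A, S and P are (x + y + 1) bits wide
    mask = (1 << total) - 1
    A = (m & ((1 << x) - 1)) << (y + 1)      # step 1.1: m in the top x bits, zeros below
    S = (-m & ((1 << x) - 1)) << (y + 1)     # step 1.2: -m in two's complement in the top bits
    P = (r & ((1 << y) - 1)) << 1            # step 1.3: r above an implicit 0 bit
    for _ in range(y):                       # step 4: repeat steps 2 and 3 y times
        low2 = P & 0b11                      # step 2: two least significant bits of P
        if low2 == 0b01:
            P = (P + A) & mask               # 01: P + A, overflow ignored
        elif low2 == 0b10:
            P = (P + S) & mask               # 10: P + S, overflow ignored
        sign = P >> (total - 1)              # step 3: arithmetic right shift by one place
        P = (P >> 1) | (sign << (total - 1))
    P >>= 1                                  # step 5: drop the rightmost bit
    if P >> (x + y - 1):                     # interpret the result as signed two's complement
        P -= 1 << (x + y)
    return P

print(booth_multiply(3, -4, 4, 4))           # -12
print(booth_multiply(-8, 2, 5, 4))           # -16 (extra bit via x = 5)
```

The second call reproduces the -8 x 2 case: widening the multiplicand field to x = 5 is equivalent to adding one more bit to the left of A, S and P.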

    Example

Find 3 x (-4), with m = 3 and r = -4, and x = 4 and y = 4:

m = 0011, -m = 1101, r = 1100

A = 0011 0000 0
S = 1101 0000 0
P = 0000 1100 0

Perform the loop four times:

    1. P = 0000 1100 0. The last two bits are 00.

    P = 0000 0110 0. Arithmetic right shift.

    2. P = 0000 0110 0. The last two bits are 00.


    P = 0000 0011 0. Arithmetic right shift.

    3. P = 0000 0011 0. The last two bits are 10.

    P = 1101 0011 0. P = P + S.

    P = 1110 1001 1. Arithmetic right shift.

    4. P = 1110 1001 1. The last two bits are 11.

    P = 1111 0100 1. Arithmetic right shift.

The product is 1111 0100, which is -12.

The above-mentioned technique is inadequate when the multiplicand is the most negative number that can be represented (e.g. if the multiplicand has 4 bits then this value is -8). One possible correction to this problem is to add one more bit to the left of A, S and P. Below, we demonstrate the improved technique by multiplying -8 by 2 using 4 bits for the multiplicand and the multiplier:

    A = 1 1000 0000 0

    S = 0 1000 0000 0

    P = 0 0000 0010 0

Perform the loop four times:

    1. P = 0 0000 0010 0. The last two bits are 00.

    P = 0 0000 0001 0. Right shift.

    2. P = 0 0000 0001 0. The last two bits are 10.

    P = 0 1000 0001 0. P = P + S.

    P = 0 0100 0000 1. Right shift.

    3. P = 0 0100 0000 1. The last two bits are 01.

    P = 1 1100 0000 1. P = P + A.

    P = 1 1110 0000 0. Right shift.

    4. P = 1 1110 0000 0. The last two bits are 00.

    P = 1 1111 0000 0. Right shift.

The product is 11110000 (after discarding the first and the last bit), which is -16.

    Booth Recoding


    Booth multiplication is a technique that allows for smaller, faster multiplication circuits, by

    recoding the numbers that are multiplied. It is the standard technique used in chip design,

    and provides significant improvements over the "long multiplication" technique.

    Shift and Add

A standard approach that might be taken by a novice to perform multiplication is to "shift and add", or normal "long multiplication". That is, for each column in the multiplier, shift the multiplicand the appropriate number of columns and multiply it by the value of the digit in that column of the multiplier, to obtain a partial product. The partial products are then added to obtain the final result:

    0 0 1 0 1 1

    0 1 0 0 1 1

    0 0 1 0 1 1

    0 0 1 0 1 1

    0 0 0 0 0 0

    0 0 0 0 0 0

    0 0 1 0 1 1

    0 0 1 1 0 1 0 0 0 1

    With this system, the number of partial products is exactly the number of columns

    in the multiplier.

    Reducing the Number of Partial Products

It is possible to reduce the number of partial products by half by using the technique of radix-4 Booth recoding. The basic idea is that, instead of shifting and adding for every column of the multiplier term and multiplying by 1 or 0, we only take every second column, and multiply by ±1, ±2, or 0, to obtain the same results. So, to multiply by 7, we can multiply the partial product aligned against the least significant bit by -1, and multiply the partial product aligned with the third column by 2:


    Partial Product 0 = Multiplicand * -1, shifted left 0 bits (x -1)

    Partial Product 1 = Multiplicand * 2, shifted left 2 bits (x 8)

    This is the same result as the equivalent shift and add method:

    Partial Product 0 = Multiplicand * 1, shifted left 0 bits (x 1)

    Partial Product 1 = Multiplicand * 1, shifted left 1 bits (x 2)

    Partial Product 2 = Multiplicand * 1, shifted left 2 bits (x 4)

    Partial Product 3 = Multiplicand * 0, shifted left 3 bits (x 0)

    The advantage of this method is the halving of the number of partial products. This is

    important in circuit design as it relates to the propagation delay in the running of the

    circuit, and the complexity and power consumption of its implementation.

It is also important to note that there is comparatively little complexity penalty in multiplying by 0, ±1 or ±2. All that is needed is a multiplexer or equivalent, which has a delay time that is independent of the size of the inputs. Negating two's complement numbers has the added complication of needing to add a "1" to the LSB, but this can be overcome by adding a single correction term with the necessary "1"s in the correct positions.

    Radix-4 Booth Recoding

    To Booth recode the multiplier term, we consider the bits in blocks of three, such that each

    block overlaps the previous block by one bit. Grouping starts from the LSB, and the first

    block only uses two bits of the multiplier (since there is no previous block to overlap):

    Figure 1 : Grouping of bits from the multiplier term, for use in Booth recoding. The least

    significant block uses only two bits of the multiplier, and assumes a zero for the third bit.


The overlap is necessary so that we know what happened in the last block, as the MSB of the block acts like a sign bit. We then consult Table 1 to decide what the encoding will be.

Block    Partial Product
000       0
001       1 * Multiplicand
010       1 * Multiplicand
011       2 * Multiplicand
100      -2 * Multiplicand
101      -1 * Multiplicand
110      -1 * Multiplicand
111       0

Table 1: Booth recoding strategy for each of the possible block values.

    Since we use the LSB of each block to know what the sign bit was in the previous block,

    and there are never any negative products before the least significant block, the LSB of the

    first block is always assumed to be 0. Hence, we would recode our example of 7 (binary

    0111) as :

    0 1 1 1

    block 0 : 1 1 0 Encoding : * (-1)

    block 1 : 0 1 1 Encoding : * (2)

    In the case where there are not enough bits to obtain a MSB of the last block, as below, we

    sign extend the multiplier by one bit.

    0 0 1 1 1

    block 0 : 1 1 0 Encoding : * (-1)

    block 1 : 0 1 1 Encoding : * (2)

    block 2 : 0 0 0 Encoding : * (0)

    The previous example can then be rewritten as:

    0 0 1 0 1 1 , multiplicand

    0 1 0 0 1 1 , multiplier

    1 1 -1 , booth encoding of multiplier

    1 1 1 1 1 1 0 1 0 0 , negative term sign extended

    0 0 1 0 1 1

    0 0 1 0 1 1

    0 0 0 0 1 , error correction for negation

    0 0 1 1 0 1 0 0 0 1 , discarding the carried high bit
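The grouping rule and Table 1 can be exercised with a small Python sketch (the names are our own). It recodes a multiplier into radix-4 digits from {-2, -1, 0, +1, +2} and forms the product from the halved set of partial products, each weighted by 4^i:

```python
RECODE = {0b000: 0, 0b001: 1, 0b010: 1, 0b011: 2,
          0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0}   # Table 1

def radix4_recode(y: int, bits: int) -> list:
    """Group bits (y[2i+1], y[2i], y[2i-1]); bits is assumed even
    (sign extend the multiplier by one bit otherwise, as described above)."""
    digits = []
    for i in range(0, bits, 2):
        b2 = (y >> (i + 1)) & 1
        b1 = (y >> i) & 1
        b0 = (y >> (i - 1)) & 1 if i > 0 else 0        # implicit y[-1] = 0
        digits.append(RECODE[(b2 << 2) | (b1 << 1) | b0])
    return digits

def radix4_multiply(m: int, r: int, bits: int) -> int:
    """Sum the halved set of partial products: digit_i * m * 4**i."""
    return sum(d * m * 4 ** i for i, d in enumerate(radix4_recode(r, bits)))

print(radix4_recode(0b0111, 4))                 # [-1, 2], i.e. 7 = -1 + 2*4
print(radix4_multiply(0b001011, 0b010011, 6))   # 209 (= 11 x 19, as in the example)
```

Note that a multiplier of `bits` columns yields only `bits/2` partial products, which is the halving that motivates the technique.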

One possible implementation is in the form of a Booth recoder entity, such as the one in Figure 2, with its outputs being used to form the partial product:


    Figure 2 : Booth Recoder and its associated inputs and outputs.

    In figure 2,

The zero signal indicates whether the multiplicand is zeroed before being used as a partial product.

The shift signal is used as the control to a 2:1 multiplexer, to select whether or not the partial product bits are shifted left one position.

Finally, the neg signal indicates whether or not to invert all of the bits to create a negative product (which must be corrected by adding "1" at some later stage).

The described operations for Booth recoding and partial product generation can be expressed in terms of logical operations if desired but, for synthesis, it was found to be better to implement the truth tables in terms of VHDL case and if/then/else statements.

    Sign Extension Tricks

    Once the Booth recoded partial products have been generated, they need to be shifted and

    added together in the following fashion:

    [Partial Product 1]

    [Partial Product 2] 0 0

    [Partial Product 3] 0 0 0 0

    [Partial Product 4] 0 0 0 0 0 0

The problem with implementing this in hardware is that the first partial product needs to be sign extended by 6 bits, the second by four bits, and so on. This is easily achievable in hardware, but requires more logic gates than if those bits could be permanently kept constant.

    1 1 1 1 1 1 1 0 1 0 0

    0 0 0 0 0 1 0 1 1

    0 0 0 1 0 1 1

    0 0 0 0 1 , error correction for negation

    0 0 1 1 0 1 0 0 0 1

Fortunately, there is a technique that achieves this:

Invert the most significant bit (MSB) of each partial product.

Add an additional '1' to the MSB of the first partial product.
