E. Molenkamp Introduction

Design flow of a simple ALU

1.1. Introduction
1.2. Behavioural description of the alu
1.2.1. Test environment for the behavioural description
1.3. How to implement the behavioural description of the alu?
1.3.1. The test environment
1.4. Separation in datapath and controller
1.4.1. The datapath
1.4.2. The controller
1.4.2.1. behavioural description of the controller
1.4.2.2. finite state machine description of the controller
1.5. Realisation of a VHDL description
1.6. Design files
1.6.1. Design files before synthesis.
1.6.2. Design files for post simulation

1.1. Introduction

[Figure: design flow. Labels in the diagram: behaviour, algorithm, finite state machine, microprogramming, register transfer level (datapath and controller), realisation, VHDL, synthesis tool.]

fig. 1 The design flow of a digital system

How do you design a complex digital system? This document illustrates the design process using a simple ALU as a case study. The disadvantage of a simple example is that a number of design steps are not really necessary because of the reduced complexity; these design steps are taken nevertheless. The design flow is shown in figure 1 (see footnote 1). All design steps, except the microprogramming step, are covered.

Designing means finding and considering alternative solutions to the problem with the design requirements in mind, such as speed, area and power.

It is a benefit that all design steps are written in the same language. This allows each design step to be compared with the behavioural description. How this verification can be performed is also shown.

1 Many synthesis tools nowadays can synthesise VHDL descriptions at a higher level of abstraction than shown in this figure.

Finally, a VHDL description should be synthesised with a given technology in mind. However, which design level is most suitable as input for a synthesiser and, even more important, how should it be described in VHDL? Some synthesis tools can handle rather complex descriptions whereas others need a more detailed description.

The standard VHDL environment declares a predefined type bit. For many descriptions this type is not suitable: it has no three-state support, no "unknown" value and no "don't care" value. To overcome this problem an IEEE working group defined the std_logic_1164 multi-value logic system. All VHDL environments, including the synthesis tools, support this package. It is used in this chapter.

An additional problem is the integer representation of a std_logic_vector. What is the integer value of the vector "111"? Is it 7, or -1? This is not defined (see footnote 2). For this reason the IEEE developed the package numeric_std (see footnote 3). This package declares two vector types, SIGNED and UNSIGNED. A signed vector is interpreted as a two's complement value, whereas an unsigned vector is interpreted as a non-negative binary number (decimal 0 and higher).
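The two interpretations can be checked in simulation. The following is a minimal sketch, not part of the design files; the entity name interpret_111 is only an assumption made for this illustration. It converts the vector "111" to an integer through both types of numeric_std.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Illustrative test bench only; the entity name is an assumption.
entity interpret_111 is
end entity interpret_111;

architecture demo of interpret_111 is
begin
  process
    constant v : std_logic_vector(2 downto 0) := "111";
  begin
    -- Interpreted as a binary number: 4 + 2 + 1 = 7.
    assert to_integer(unsigned(v)) = 7
      report "unsigned interpretation is not 7" severity error;
    -- Interpreted as two's complement: -4 + 2 + 1 = -1.
    assert to_integer(signed(v)) = -1
      report "signed interpretation is not -1" severity error;
    report "both interpretations of ""111"" checked";
    wait;
  end process;
end architecture demo;
```

Both assertions should pass silently; only the final report message is expected in the simulator transcript.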

1.2. Behavioural description of the ALU

A synchronous system (with a synchronous reset).

If input inst is low the operands op1 and op2 are added and otherwise they are

multiplied.

Before a new operation can be performed the output ready should be high.

A new operation starts when start goes to high. Start should remain high until output

ready goes to low.

If start is high then the operands and instruction should be stable.

If output ready is high the calculation is finished.

fig. 2 The informal specification of the ALU

Figure 2 gives the informal specification of the alu. This informal specification seems complete, but it is not. The representation of the operands is missing. Is the reset active high or low? What is the active edge of the clock?

We will use VHDL to specify the system to be designed, which also allows simulation. There is another reason for writing the specification in an executable language: it makes it possible to verify a design against the specification automatically by simulation.

2 There is a non-IEEE package std_logic_unsigned that is often used. In this package a std_logic_vector is interpreted as an unsigned.

3 There is a non-IEEE package std_logic_arith with almost the same contents. The conversion functions in this package start with “conv_” whereas in the package numeric_std they start with “to_”.


library ieee;

use ieee.std_logic_1164.all;

use ieee.numeric_std.all;

entity alu is

generic (bw : natural := 5);

port (op1,op2 : in std_logic_vector(bw-1 downto 0);

inst,start : in std_logic;

reset : in std_logic;

clk : in std_logic;

ready : out std_logic;

res : out std_logic_vector(2*bw-1 downto 0));

end alu;

architecture behaviour of alu is

signal rdy_int : std_logic :='1';

begin

alu:process

constant add : std_logic := '0';

constant mul : std_logic := '1';

constant allzero : std_logic_vector(2*bw-1 downto 0) := (others => '0');

variable op1i,op2i : unsigned (bw-1 downto 0);

begin

wait until rising_edge(clk);

if reset='1' then

rdy_int <='1'; res <= allzero;

else

if start='1' then

rdy_int<='0', '1' after 50 ns; -- delay is used to simulate protocol

op1i := unsigned(op1);

op2i := unsigned(op2);

case inst is

when add => res <= std_logic_vector(resize (('0'&op1i + op2i),2*bw));

when others => res <= std_logic_vector(op1i*op2i);

end case;

end if;

end if;

end process alu;

ready <= rdy_int;

-- check I/O timing

check_stable_inputs:process

begin

wait until rising_edge(clk);

if start='1' then

assert inst'stable report "input INST not stable" severity warning;

assert op1'stable report "input OP1 not stable" severity warning;

assert op2'stable report "input OP2 not stable" severity warning;

end if;

end process;

transition_start_high2low:process

variable prv_start : std_logic :='0';

begin

wait until rising_edge(clk);

if prv_start='1' and start='0' then

assert rdy_int='0' report "incorrect transition start from high to low; rdy_int is not '0'" severity warning;

end if;

prv_start:=start;

end process;

--transition of start from high to low only allowed when ready is low

--PSL default clock is rising_edge(clk);

--PSL start_high2low_readylow:

-- assert always ( {start='1'; start='0'}|-> {rdy_int='0'} );

end behaviour;

fig. 3 The executable specification of the alu


An executable specification of the alu is given in figure 3. The behavioural description contains a section “check I/O timing”. This section is not complete; it only illustrates the issue.

A part of the I/O relation is that after start goes high it should remain high until ready is low. The process transition_start_high2low verifies this. Other I/O timing violations with respect to start and ready can be verified in a similar way.
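As an illustration of such a similar check, the sketch below verifies the rule from figure 2 that a new operation may only start while ready is high. It is not taken from the design files: the entity name check_start_when_ready and its ports are assumptions, and the checker is written as a separate design entity so that it can be compiled on its own.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical checker, not part of the original design files.
entity check_start_when_ready is
  port (clk   : in std_logic;
        start : in std_logic;
        ready : in std_logic);
end entity check_start_when_ready;

architecture check of check_start_when_ready is
begin
  transition_start_low2high: process
    -- value of start sampled at the previous rising clock edge
    variable prv_start : std_logic := '0';
  begin
    wait until rising_edge(clk);
    -- start was low at the previous edge and is high now:
    -- a new operation is requested, so ready must still be high
    if prv_start='0' and start='1' then
      assert ready='1'
        report "start goes high while ready is not '1'"
        severity warning;
    end if;
    prv_start := start;
  end process;
end architecture check;
```

The same process could also be placed directly in the architecture of figure 3, reading rdy_int instead of the port ready.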

If start is high the data inputs should be stable around the rising edge of the clock. The timing details (setup and hold times) depend on the technology and are known only after the design is realized.

This behavioural description suggests that both operations (addition and multiplication) require the same number of clock cycles. Hence the output ready would not really be required, since it is known beforehand when the result is ready. However, we assume that in the realization the two operations may use a different number of clock cycles (“assume”: the designer is informed about this).

Assertions in VHDL can become complicated. An alternative is the use of PSL (Property Specification Language). PSL is also included in the VHDL 1076-2008 standard. A comment line that starts with “--PSL” is a PSL statement and is recognized as such by a tool that supports PSL.

--transition of start from high to low only allowed when ready is low

--PSL default clock is rising_edge(clk);

--PSL start_high2low_readylow:

-- assert always ( {start='1'; start='0'}|-> {rdy_int='0'} );

This assertion verifies that if the sequence “start is high, followed by start is low” occurs, then rdy_int is low in the clock cycle in which start is low.

1.2.1. Test environment for the behavioural description

To verify that your description is correct, the design is simulated. Exhaustive testing is in general not possible. Therefore the selection of the stimuli is important. You should use stimuli that test corner cases such as the extreme values, the protocol, etc.

An example of a test environment is given in figure 4 (it is incomplete w.r.t. the test cases). The designer has to verify that the output is correct.

The data is read at the rising edge of the clock. Therefore the test environment changes the stimuli on the falling edge of the clock.

Most simulation tools have a command to simulate a design until no future activities are planned (e.g. in ModelSim the command run -all). After all stimuli are applied the boolean finished is set to true and the signal clk stops oscillating.

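library ieee;

use ieee.std_logic_1164.all;

use ieee.numeric_std.all;

entity test_environment is

generic (bw : natural := 5);

end test_environment;

architecture behaviour of test_environment is

component alu is

generic (bw : natural := 5);

port (op1,op2 : in std_logic_vector(bw-1 downto 0);

inst,start : in std_logic;

reset : in std_logic;

clk : in std_logic;

ready : out std_logic;

res : out std_logic_vector(2*bw-1 downto 0));

end component alu;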

constant addition : std_logic := '0';

constant multiplication : std_logic := '1';

signal reset,inst,start,ready : std_logic;

signal clk : std_logic := '0';

signal op1,op2 :std_logic_vector(bw-1 downto 0);

signal res : std_logic_vector(2*bw-1 downto 0);

signal finished : boolean := false; -- used to stop simulation

begin

bhv: alu

generic map (bw)

port map (op1,op2,inst,start,reset,clk,ready,res);

clk <= not clk after 5 ns when not finished;

process

-- protocol assumes that data is set (operands and instruction)

procedure protocol (

signal clk, ready : IN std_logic;

signal start : OUT std_logic) is

begin

assert ready='1' report "alu is busy!" severity warning;

start<='1';

wait until falling_edge(clk);

lp0:loop

wait until falling_edge(clk);

if ready='0' then

start<='0';

exit;

end if;

end loop lp0;

lp1:loop

wait until falling_edge(clk);

exit when ready='1';

end loop lp1;

end protocol;

variable operand1, operand2 : integer RANGE 0 TO 2**bw-1;

begin

reset<='1'; start<='0';

inst<='0'; op1<=(others=>'0'); op2<=(others=>'0');

wait until falling_edge(clk);

reset<='0';

for instruction in addition to multiplication loop

inst <= instruction;

op1<=(others=>'0'); op2<=(others=>'0'); -- test zero operands

protocol (clk,ready,start);

op1<=(others=>'1'); op2<=(others=>'0'); -- test ones and zeros operands

protocol (clk,ready,start);

op1<=(others=>'0'); op2<=(others=>'1'); -- test zeros and ones operands

protocol (clk,ready,start);

op1<=(others=>'1'); op2<=(others=>'1'); -- test ones and ones operands

protocol (clk,ready,start);

loop_op1: for operand1 in 1 to 2**bw-1 loop -- some other values

op1 <= std_logic_vector(to_unsigned(operand1,bw));

loop_op2: for operand2 in 1 to 2**bw-1 loop

op2 <= std_logic_vector(to_unsigned(operand2,bw));

protocol (clk,ready,start);

exit loop_op2 when operand2 > 15; -- exhaustive testing takes too long

end loop loop_op2;

exit loop_op1 when operand1 > 15; -- exhaustive testing takes too long

end loop loop_op1;

end loop;

assert false report "simulation finished" severity note;

finished <= true;

wait;

end process;

end behaviour;

fig. 4 A test environment for the behavioural description of the alu

1.3. How to implement the behavioural description of the alu?

There are many ways to implement the behaviour of the alu. The choice depends on the requirements: should it be a fast circuit? Is only a small area available? Is power consumption important? The chosen algorithm influences how well these requirements are met. In this step alternative algorithms should be considered for the different tasks (e.g. instructions). Next the best combination of alternatives should be chosen. At the end of this step you should have a rough indication of the block diagram of the system.

The chosen algorithm is written in VHDL and compared with the behavioural description to detect the errors that are commonly present even in a description that is presumed to be correct. This comparison is discussed in section 1.3.1. The design flow assumes a top-down design approach. However, the designer should be aware of the consequences of design decisions and of the availability of subsystems at a lower level. Therefore a top-down design approach is more or less a 'meet in the middle' approach.

Synthesis tools can synthesise an adder. Multiplication can be more difficult, especially when the operand size is large. Modern FPGAs have a (large) number of combinational block multipliers of approximately 20×20 bits. We assume that these block multipliers cannot be used. Figure 5 gives an algorithm for the multiplier based on the shift-add algorithm. The function addshift_multiply, which describes the algorithm, is declared in the process.

function addshift_multiply (op1,op2 : unsigned) return unsigned is

variable tmp : unsigned (op1'length downto 0);

variable op1i : unsigned(op1'range);

begin

op1i := op1;

tmp := (others => '0');

for i in 1 to op1i'length loop

if op1i(0)='1' then

tmp := tmp + op2; --add

end if;

op1i := tmp(0) & op1i(op1i'length-1 downto 1); -- shift

tmp := '0' & tmp(op1i'length downto 1);

end loop;

return tmp(op1i'length-1 downto 0) & op1i;

end addshift_multiply;

..

case insti is

when add => res <= std_logic_vector(resize (('0'&op1i + op2i),2*bw));

when mul => res <= std_logic_vector(addshift_multiply(op1i,op2i));

when others => res <= allzero;

end case;

fig. 5 The algorithm for multiply
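The function of figure 5 can be checked on its own against the multiplication operator of numeric_std before it is used in the alu. The test bench below is a sketch under assumptions: the entity name tb_addshift_multiply and the test width of 4 bits are chosen for this illustration only, and the function body is copied from figure 5 so that the example is self-contained.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical test bench; name and default width are assumptions.
entity tb_addshift_multiply is
  generic (bw : natural := 4);
end entity tb_addshift_multiply;

architecture check of tb_addshift_multiply is
  -- copy of the function of figure 5
  function addshift_multiply (op1,op2 : unsigned) return unsigned is
    variable tmp  : unsigned(op1'length downto 0);
    variable op1i : unsigned(op1'range);
  begin
    op1i := op1;
    tmp  := (others => '0');
    for i in 1 to op1i'length loop
      if op1i(0)='1' then
        tmp := tmp + op2;                              -- add
      end if;
      op1i := tmp(0) & op1i(op1i'length-1 downto 1);   -- shift
      tmp  := '0' & tmp(op1i'length downto 1);
    end loop;
    return tmp(op1i'length-1 downto 0) & op1i;
  end function addshift_multiply;
begin
  process
    variable a, b : unsigned(bw-1 downto 0);
  begin
    -- exhaustive comparison with the "*" operator of numeric_std
    for i in 0 to 2**bw-1 loop
      for j in 0 to 2**bw-1 loop
        a := to_unsigned(i, bw);
        b := to_unsigned(j, bw);
        assert addshift_multiply(a, b) = a * b
          report "mismatch for " & integer'image(i) & " * " & integer'image(j)
          severity error;
      end loop;
    end loop;
    report "all operand combinations checked";
    wait;
  end process;
end architecture check;
```

For a small width such as 4 bits the exhaustive loop is cheap; for larger widths the number of combinations grows quickly and a selection of operand values would be needed, as in the test environment.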

1.3.1. The test environment

To verify a design step you could use formal verification and check that both descriptions, the behavioural description and the design step, have the same behaviour. We will use simulation. A simulation in which the designer tries to verify the correctness by hand is a bad solution: the designer believes that the design step is correct and is therefore probably not critical enough to detect possible errors. In large companies you often see that the verification is done by another person.

Simulation is possible, but the simulation result should preferably be compared automatically with the behavioural description, and discrepancies between both descriptions should generate at least a warning.

The behaviour of the executable specification (not including timing) should be the same as that of a design step. Due to delays the circuit needs time to stabilize. Therefore the results are to be compared after some time. In a synchronous design you can compare the results just before the next active edge of the clock, assuming that both descriptions need the same number of clock cycles to produce the results. In our case the latter is not true. In the specification multiplication is as fast as addition, whereas in the final result multiplication will use a number of clock cycles. Therefore we use the control signals start and ready. See figure 6 for a VHDL description. The procedure protocol handles the I/O, including the comparison of the alu output values. Notice that in the header of this procedure the objects are of class signal! This assures that within the procedure you have access to the current values of the objects.

The test set is rather simple: extreme values and some other values are applied. In a real test environment the test data is often read from a file.

Furthermore the two component instantiations of the alu are given. This description seems to handle everything very nicely: start the execution, wait for the ready signals, and finally check whether errors or warnings are reported. Output ready is checked on the falling edge of the clock. In the real synchronous design all outputs change just after the rising edge of the clock.

However, there is a problem. Assume that the design step never generates a ‘1’ for ready; then the test environment waits endlessly. In that case no warning is displayed, although the design step is not correct!

This problem can be solved with an additional VHDL process statement that checks that signal ready is not low for too many consecutive clock cycles (i.e. the alu is processing). This can also easily be verified with a PSL statement:

--PSL default clock is rising_edge(clk);

--PSL ready_to_long_low: assert never {not ready2[*20]};

If ready2 is low for 20 consecutive clock cycles a violation is reported. More

complicated checks can be written rather easily with PSL.
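For comparison, the additional VHDL process mentioned above could look like the following sketch. It is an assumption about how such a check might be written, not the author's code: the entity name ready_watchdog, the generic max_low and the port names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical watchdog; name, generic and ports are assumptions.
entity ready_watchdog is
  generic (max_low : positive := 20);
  port (clk   : in std_logic;
        ready : in std_logic);
end entity ready_watchdog;

architecture check of ready_watchdog is
begin
  process
    -- number of consecutive rising clock edges at which ready was low
    variable low_count : natural := 0;
  begin
    wait until rising_edge(clk);
    if ready = '0' then
      -- saturate the counter so that it cannot overflow
      if low_count < max_low then
        low_count := low_count + 1;
      end if;
      assert low_count < max_low
        report "ready is low for too many consecutive clock cycles"
        severity warning;
    else
      low_count := 0;
    end if;
  end process;
end architecture check;
```

In the test environment of figure 6 such a checker could be instantiated with ready connected to ready2. Like the PSL statement, it reports a violation once ready has been low at 20 consecutive rising clock edges, and it keeps reporting as long as ready stays low.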

A separate configuration statement shows the used architectures (figure 7). This makes

it possible to add other configurations using the same test environment.


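architecture structure of test_environment is

component alu is

generic (bw : natural := 5);

port (op1,op2 : in std_logic_vector(bw-1 downto 0);

inst,start : in std_logic;

reset : in std_logic;

clk : in std_logic;

ready : out std_logic;

res : out std_logic_vector(2*bw-1 downto 0));

end component alu;

constant addition : std_logic := '0';

constant multiplication : std_logic := '1';

signal reset,inst,start,ready1, ready2 : std_ulogic;

signal clk : std_logic := '0';

signal op1,op2 :std_logic_vector(bw-1 downto 0);

signal res1, res2 : std_logic_vector(2*bw-1 downto 0);

signal finished : boolean := false; -- used to stop simulation

begin

bhv: alu

generic map (bw)

port map (op1,op2,inst,start,reset,clk,ready1,res1);

design: alu

generic map (bw)

port map (op1,op2,inst,start,reset,clk,ready2,res2);

clk <= not clk after 5 ns when not finished;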
process

-- protocol assumes that data is set (operands and instruction)

procedure protocol (

signal clk, ready1, ready2 : IN std_logic;

signal start : OUT std_logic;

signal res1, res2 : IN std_logic_vector(2*bw-1 downto 0)) is

variable rdy1zero,rdy2zero : boolean := false;

begin

assert (ready1='1') and (ready2='1') report "a alu is busy!"

severity warning;

start<='1';

wait until falling_edge(clk);

lp0:loop -- both ready should be zero, not necessarily at the same time.

wait until falling_edge(clk);

if ready1='0' then rdy1zero:=true; end if;

if ready2='0' then rdy2zero:=true; end if;

exit lp0 when rdy1zero and rdy2zero;

end loop lp0;

start<='0';

lp1:loop -- wait until both alus are ready

wait until falling_edge(clk);

exit lp1 when (ready1='1') and (ready2='1');

end loop lp1;

assert res1=res2 report "mismatch in results" severity warning;

end protocol;

variable operand1, operand2 : integer RANGE 0 TO 2**bw-1;

begin

reset<='1'; start<='0';

inst<='0'; op1<=(others=>'0'); op2<=(others=>'0');

wait until falling_edge(clk);

reset<='0';

for instruction in addition to multiplication loop

inst <= instruction;

op1<=(others=>'0'); op2<=(others=>'0'); -- test zero operands

protocol (clk,ready1, ready2,start,res1,res2);

op1<=(others=>'1'); op2<=(others=>'0'); -- test ones and zeros operands

protocol (clk,ready1, ready2,start,res1,res2);

op1<=(others=>'0'); op2<=(others=>'1'); -- test zeros and ones operands

protocol (clk,ready1, ready2,start,res1,res2);

op1<=(others=>'1'); op2<=(others=>'1'); -- test ones and ones operands

protocol (clk,ready1, ready2,start,res1,res2);

loop_op1: for operand1 in 1 to 2**bw-1 loop -- some other values

op1 <= std_logic_vector(to_unsigned(operand1,bw));

loop_op2: for operand2 in 1 to 2**bw-1 loop

op2 <= std_logic_vector(to_unsigned(operand2,bw));

protocol(clk,ready1, ready2,start,res1,res2);

exit loop_op2 when operand2 > 15; -- exhaustive testing takes too long

end loop loop_op2;

exit loop_op1 when operand1 > 15; -- exhaustive testing takes too long

end loop loop_op1;

end loop;

assert false report "simulation finished" severity note;

finished <= true;

wait;

end process;

end structure;

fig. 6 The test environment

configuration cnf_algorithm of test_environment is

for structure

for bhv:alu use entity work.alu(behaviour); end for;

for design:alu use entity work.alu(algorithm); end for;

end for;

end cnf_algorithm;

fig. 7 The configuration for the test environment for the algorithmic description.

1.4. Separation in datapath and controller

As mentioned previously, it is assumed that an appropriate implementation of an adder is already available. The algorithm for multiplication assumes the repeated use of an adder. Only one adder and some extra logic are necessary for the data manipulation in the datapath; in addition a controller is needed to control the dataflow and the operations in the datapath.

There are still design decisions to be taken, e.g. what should be the width of the internal operands? In this example the internal width of the datapath could be 1, 2 or 4 bits. However, reducing the data width will increase the complexity and the area of the controller, and it will also influence the performance. In this case the data width of the internal operands is the same as that of the external ones. Figure 8 shows the division of the alu into the two subsystems.

In general the number of control signals from controller to datapath is large and will often change during the design process. Also the encoding of the control signals should be left open until the right moment. For readability and maintenance reasons an enumeration type control_signals is declared with the logical names of the control signals; it implies no encoding. The type control_bus is an array indexed by these names. The only task left for the designer is to find the logical names for the control signals and put these in a package. For this simple alu:

library ieee;

use ieee.std_logic_1164.all;

PACKAGE control_names IS

TYPE control_signals IS

(enable_r1,enable_r2,enable_r3,init,shift_add,addition);

-- do not change the following type declaration

TYPE control_bus IS ARRAY (control_signals) OF std_logic;

END control_names;


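library ieee;

use ieee.std_logic_1164.all;

use work.control_names.all;

architecture datapath_controller of alu is

signal control : control_bus;

component datapath

generic (bw : natural := 5);

port (op1,op2 : in std_logic_vector(bw-1 downto 0);

control : in control_bus;

clk : in std_logic;

res : out std_logic_vector(2*bw-1 downto 0));

end component;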

component controller

generic (bw : natural := 5);

port (inst : in std_logic;

start : in std_logic;

clk : in std_logic;

reset : in std_ulogic;

control : out control_bus;

ready : out std_logic);

end component;

begin

dp:datapath

generic map (bw)

port map (op1,op2,control,clk,res);

ct:controller

generic map (bw)

port map (inst,start,clk,reset,control,ready);

end datapath_controller;

fig. 8 The division in a datapath and a controller.

1.4.1. The datapath

[Figure: block diagram of the datapath. It shows the inputs op1 and op2, the registers r1, r2 and r3 with their enable inputs enable_r1, enable_r2 and enable_r3, an adder with inputs a and b and output s, and the output res. The gates are controlled by init, sa and add. Note: add = addition, sa = shift_add, init = initialisation.]

fig. 9 The block diagram of the datapath with generic value bw = 4 (see footnote 4).

4 Meaning of the used symbols:

- double underlined blocks: registers (edge sensitive), possibly with an enable input; the global clock line is not shown.

- other blocks: combinational logic.

- circles: circles are gates. Only the behaviour of all connected outputs of gates can be described. If exactly one gate condition is true, the output of all connected gates is equal to the input of the gate whose condition is true. If no gate condition is true, or two or more gate conditions are true, the output is a don't care. In fact this is an abstraction from implementation, e.g. it could be implemented with a multiplexer.

The description of the datapath is a register transfer level description. A detailed block diagram of this description is given in figure 9. Such a diagram is informal; it only makes the formal description easier to understand. If there is a discrepancy between the block diagram and the formal description of the datapath (figure 10), the latter is correct.

If the control signals init, enable_r1, and enable_r2 are '1', the operand values are loaded in the registers at the rising edge of the clock. If next the control signals add(ition), enable_r1, and enable_r3 are '1', an addition is performed and the result is stored in the registers r3 and r1.

If the datapath contains more ‘complex’ elements (e.g. register file, alu, memory, ...) component instantiations can be used. This has the advantage that the design entity of each sub design can be tested separately. I personally would not use a component instantiation for a component that is not ‘complex’, e.g. a multiplexer.

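[Schematic: two gates (circles) with control signals a and b and inputs ia and ib, whose outputs are connected to the common output y.]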

The previous schematic means that the output y is equal to the value of input ia when control

signal a is '1' (and control signal b='0'), or it is equal to the value of input ib when control

signal b is '1' (and control signal a='0'). In other cases it is don't care. There is no unique way

to describe this in VHDL. In fact it depends probably on the style you like. It could be written

as:

y <= dontcare WHEN dontcare_condition( control(a) & control(b)) ELSE

ia WHEN control(a)='1' ELSE

ib WHEN control(b)='1' ELSE

dontcare;

The function dontcare_condition takes the control values as input and determines whether

there is a don't care condition or not. If both control signals are '1' dontcare_condition results

in true.

To realise the previous schematic representation one could use a multiplexer, or 'pass

transistors' for the gates, and there are many more possibilities.

From a specification point of view the don’t care condition check is nice. However, the synthesis tool may generate a larger and/or slower circuit. Therefore consider a more practical solution without the dontcare_condition check:

y <= ia WHEN control(a)='1' ELSE

ib WHEN control(b)='1' ELSE

dontcare;

You could add an assertion with the function dontcare_condition and report a warning if

multiple control signals are '1'.
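The combination of the practical description with such an assertion could be sketched as follows. The entity gate_select, its port names and the body of dontcare_condition are assumptions made for this illustration; the text only states that the function returns true when both control signals are '1', which is what the sketch implements by counting the control signals that are '1'.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical design entity; all names are assumptions.
entity gate_select is
  generic (bw : natural := 4);
  port (sel_a, sel_b : in  std_logic;
        ia, ib       : in  std_logic_vector(bw-1 downto 0);
        y            : out std_logic_vector(bw-1 downto 0));
end entity gate_select;

architecture rtl of gate_select is
  constant dontcare : std_logic_vector(bw-1 downto 0) := (others => '-');

  -- true if two or more control signals are '1'
  function dontcare_condition (sel : std_logic_vector) return boolean is
    variable ones : natural := 0;
  begin
    for i in sel'range loop
      if sel(i) = '1' then
        ones := ones + 1;
      end if;
    end loop;
    return ones > 1;
  end function dontcare_condition;
begin
  -- practical description without the don't care condition check
  y <= ia when sel_a='1' else
       ib when sel_b='1' else
       dontcare;

  -- simulation-only check: warn if multiple control signals are '1'
  assert not dontcare_condition(sel_a & sel_b)
    report "multiple control signals are '1'"
    severity warning;
end architecture rtl;
```

The concurrent assertion is evaluated whenever sel_a or sel_b changes. Synthesis tools generally ignore it, so it should not affect the size or speed of the circuit.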


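library ieee;

use ieee.std_logic_1164.all;

use work.control_names.all;

entity datapath is

generic (bw : natural := 5);

port (op1,op2 : in std_logic_vector(bw-1 downto 0);

control : in control_bus;

clk : in std_logic;

res : out std_logic_vector(2*bw-1 downto 0));

end datapath;

library ieee;

use ieee.numeric_std.all;

architecture rtl of datapath is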

constant zerobw_1 : std_logic_vector(bw-2 downto 0) := (others=>'0');

constant zerobw : std_logic_vector(bw-1 downto 0) := (others=>'0');

constant dontcare : std_logic_vector(bw-1 downto 0) := (others=>'-');

signal s : std_logic_vector(bw downto 0);

signal a,b,ir1,ir2,ir3 : std_logic_vector(bw-1 downto 0);

signal r1,r2,r3 : std_logic_vector(bw-1 downto 0);

begin

ir1 <= s(0) & r1(bw-1 downto 1) when control(shift_add)='1' else

s(bw-1 downto 0) when control(addition)='1' else

op1 when control(init)='1' else

dontcare;

ir2 <= op2;

ir3 <= s(bw downto 1) when control(shift_add)='1' else

zerobw_1 & s(bw) when control(addition)='1' else

zerobw when control(init)='1' else

dontcare;

a <= r3 when control(shift_add)='1' else

r1 when control(addition)='1' else

dontcare;

b <= r2 when control(addition)='1' else

r2 when control(shift_add)='1' and r1(0)='1' else

zerobw when control(shift_add)='1' and r1(0)='0' else

dontcare;

s <= std_logic_vector( ('0'&unsigned(a)) + unsigned(b));

res <= r3 & r1;

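registers:process

begin

wait until rising_edge(clk);

if control(enable_r1)='1' then r1<=ir1; end if;

if control(enable_r2)='1' then r2<=ir2; end if;

if control(enable_r3)='1' then r3<=ir3; end if;

end process;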

end rtl;

fig. 10 The description of the datapath.

1.4.2. The controller

The datapath description probably contains errors. However, some of these errors can only be detected if the combination of datapath and controller is simulated against the behavioural description. If an error is detected it could be in the datapath, in the controller, or in both components. For debugging reasons it is therefore very important that the controller is first described at a high level, i.e. as a behavioural description again. After simulation of this design step the controller itself is designed.


1.4.2.1. behavioural description of the controller

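library ieee;

use ieee.std_logic_1164.all;

use work.control_names.all;

entity controller is

generic (bw : natural := 5);

port (inst : in std_logic;

start : in std_logic;

clk : in std_logic;

reset : in std_ulogic;

control : out control_bus;

ready : out std_logic);

end controller;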

architecture behaviour of controller is

signal rdy_int : std_ulogic;

begin

cntrl:process

variable count : natural range 0 to bw-1;

constant add : std_logic := '0';

constant mul : std_logic := '1';

variable insti : std_logic;

begin

rst: loop

if reset='1' then

control<=(others=>'0'); rdy_int<='1'; insti:='0';

wait until rising_edge(clk);

else

lp_start:loop

control<=(others=>'0'); rdy_int<='1';

exit when start='1';

wait until rising_edge(clk);

exit rst when reset='1';

end loop lp_start;

insti:=inst; rdy_int<='0';

control<=(enable_r1 | enable_r2 | enable_r3 | init => '1', others=>'0');

wait until rising_edge(clk);

exit rst when reset='1';

case insti is

when add => control<=(enable_r1 | enable_r3 | addition => '1', others=>'0');

rdy_int<='0';

wait until rising_edge(clk);

exit rst when reset='1';

when others => count:=bw-1;

repeat:loop

control<=(enable_r1 | enable_r3 | shift_add => '1',

others=>'0'); rdy_int<='0';

wait until rising_edge(clk);

exit rst when reset='1';

exit repeat when count=0;

count:=count-1;

end loop repeat;

end case;

end if;

end loop rst;

end process;

ready <= rdy_int;

end behaviour;

fig. 11 The behavioural description of the controller.

The behavioural description of the controller is straightforward. However, this is not the only possible behavioural description. How fast should the system respond to the reset? In this description the execution is aborted abruptly by the statement exit rst when reset='1' that follows each wait statement.

For simulation the test environment is used with the configuration shown in figure 12.


configuration cnf_controller_behaviour of test_environment is

for structure

for bhv:alu use entity work.alu(behaviour); end for;

for design:alu use entity work.alu(datapath_controller);

for datapath_controller

for dp:datapath use entity work.datapath(rtl); end for;

for ct:controller use entity work.controller(behaviour); end for;

end for;

end for;

end for;

end cnf_controller_behaviour;

fig. 12 The configuration for the test environment for the datapath with the behavioural description of the controller.

1.4.2.2. finite state machine description of the controller

architecture fsm of controller is

signal rdy_int : std_ulogic;

begin

process

type states is (idle,ini,add,mul);

variable state : states;

variable count : natural range 0 to bw-1;

begin

wait until rising_edge(clk);

if reset='1' then

rdy_int<='1';

count:=0;

state:=idle;

control<=(others=>'0');

else

case state is

when idle => if start='1' then

state:=ini;

end if;

when ini => if inst='1' then

state:=mul; count:=bw-1;

else

state:=add;

end if;

when add => state:=idle;

when mul => if count>0 then

count:=count-1;

else

state:=idle;

end if;

end case;

case state is

when idle => control<=(others=>'0'); rdy_int<='1';

when ini => control<=(enable_r1 | enable_r2 | enable_r3 | init=>'1',

others=>'0'); rdy_int<='0';

when add => control<=(enable_r1 | enable_r3 | addition=>'1',

others=>'0'); rdy_int<='0';

when mul => control<=(enable_r1 | enable_r3 | shift_add=>'1',

others=>'0'); rdy_int<='0';

end case;

end if;

end process;

ready <= rdy_int;

end fsm;

fig. 13 The finite state machine (fsm) description of the controller and the configuration.

In the behavioural description of the controller (figure 11) the state information is not explicit. A finite state machine description is given in figure 13. Four states are used:

idle: wait for a new operation to be performed,

initialization: the operands are read and the registers are initialized,

addition: the operands are added,

multiplication: this state is executed bw times (repeated addition and shifting).

1.5. Realisation of a VHDL description

The design entities datapath(rtl) (figure 10) and controller(fsm) (figure 13) are synthesisable, therefore the alu as a whole can be realized. The synthesis flow depends on the synthesis tool. In this document the Quartus II software of Altera is used. The input for the tooling is:

- package pkg_control_names (file pkg_control_names.vhd)

- entity alu (file alu.vhd)

- architecture datapath_controller of alu (file alu-datapath_controller.vhd)

- entity datapath (file datapath.vhd)

- architecture rtl of datapath (file datapath-rtl.vhd)

- entity controller (file controller.vhd)

- architecture fsm of controller (file controller-fsm.vhd)

Furthermore additional constraints can be added. We want the system to operate at 100 MHz. This information is put in the Synopsys Design Constraints file alu.sdc:

create_clock -period 10.000 -name clk [get_ports clk]

In figure 14 the realization of the alu is given. Furthermore the tool informs the user

about the timing, the area and so on.

After realization the real delays are known. Most synthesis tools, including Quartus II, generate VITAL (VHDL Initiative Towards ASIC Libraries) compliant post simulation files. The names (and extensions) of these files differ between tools. For Quartus II they are:

- *.vho (VHDL output file). This is a low level structural description of the realization.

You need also the technology specific packages. If you use a ModelSim-Altera version

these packages are precompiled.

- *.sdo (Standard delay output file) this file contains the timing information.

Figure 15a shows a part of the generated output file alu.vho. Notice that the generic has been replaced by constraints in the port declarations of the inputs op1 and op2 and of the output res. The entity name is alu. Since in the design flow the entity alu is already declared with a different port declaration, the entity name in the file alu.vho is manually changed to alu_realization. Consequently the name of the file is also changed to alu_realization.vho.

Figure 16 shows the simulation waveform of the test environment in which the behaviour is compared with the realization (post simulation). The middle part shows the output of the behavioural description and the bottom part the output of the realization. Due to the delays in the post simulation you see much more activity. It can happen that the output ready of the realization is already high while not all output signals are valid. It is a synchronous system, and it is known that the clock-to-output delays do not all have the same value. If this is not desirable, an additional state can be added to the state machine to take care of an extra clock cycle delay for the output ready.


a) RTL view of structure of alu

b) RTL view of datapath

c) RTL view of controller

fig. 14 The realization of the alu.


-- VENDOR "Altera"

-- PROGRAM "Quartus II 64-Bit"

-- VERSION "Version 11.0 Build 157 04/27/2011 SJ Full Version"

-- DATE "08/26/2011 09:40:42"

--

-- Device: Altera EP2C5T144C6 Package TQFP144

--

--

-- This VHDL file should be used for ModelSim-Altera (VHDL) only

--

LIBRARY CYCLONEII;

LIBRARY IEEE;

USE CYCLONEII.CYCLONEII_COMPONENTS.ALL;

USE IEEE.STD_LOGIC_1164.ALL;

ENTITY alu IS

PORT (

ready : OUT std_logic;

res : OUT std_logic_vector(9 DOWNTO 0);

op1 : IN std_logic_vector(4 DOWNTO 0);

op2 : IN std_logic_vector(4 DOWNTO 0);

inst : IN std_logic;

start : IN std_logic;

reset : IN std_logic;

clk : IN std_logic

);

END alu;

a) top of generated alu.vho output file by Quartus II. Note that manually the entity name is changed

from alu to alu_realization.

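architecture realization of test_environment is

constant addition : std_logic := '0';

constant multiplication : std_logic := '1';

signal reset,inst,start,ready1, ready2 : std_logic;

signal clk : std_logic := '0';

signal op1,op2 :std_logic_vector(bw-1 downto 0);

signal res1, res2 : std_logic_vector(2*bw-1 downto 0);

signal finished : boolean := false; -- used to stop simulation

begin

bhv: entity work.alu (behaviour)

generic map (bw)

port map (op1,op2,inst,start,reset,clk,ready1,res1);

-- this instantiation depends on the generated simulation output files

-- of the synthesis tool.

realization: entity work.alu_realization(structure)

port map (ready=>ready2, res=> res2, op1=>op1, op2=>op2, inst=>inst,

start=>start, reset=>reset,clk=>clk);

clk <= not clk after 5 ns when not finished;

...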

b) Part of the test environment which can test the realization against the behaviour

fig. 15 Part of test environment used to test the realization.

fig. 16 Part of the simulation (behaviour versus realization).

1.6. Design files

1.6.1. Design files before synthesis.

The script in file compile_all.do compiles all files in a correct order. Use the

configurations to simulate the different design steps.

File compile_all.do

# packages

vcom pkg_control_names.vhd

#

# controller

vcom controller.vhd

vcom controller-behaviour.vhd

vcom controller-fsm.vhd

#

# datapath

vcom datapath.vhd

vcom datapath-rtl.vhd

# alu

vcom alu.vhd

vcom alu-behaviour.vhd

vcom alu-algorithm.vhd

vcom alu-datapath_controller.vhd

#

# test environment

vcom test_environment.vhd

vcom test_environment_behaviour.vhd

vcom test_environment_structure.vhd

#

# configuration

vcom cnf_behaviour.vhd

vcom cnf_algorithm.vhd

vcom cnf_controller_behaviour.vhd

vcom cnf_controller_fsm.vhd

1.6.2. Design files for post simulation

The script in file compile_post_and_simulation.do compiles the generated output file and the test environment.

Note: the entity name alu in the output file alu.vho is manually changed to alu_realization (the name of the file is changed accordingly).

File compile_post_and_simulation.do

vcom alu_realization.vho

vcom test_environment_realization.vhd

vsim -sdftyp /realization/=alu_vhd_fast.sdo -t ps work.test_environment(realization)
