IMPLEMENTATION OF A "DIRECT MAPPED CACHE" IN BEHAVIORAL VERILOG
ECE254 ASSIGNMENT 1 NEERAJ DHOTRE (perm:5483615)
1. Introduction:
A memory hierarchy is imperative due to the prevalence of highly pipelined and superscalar architectures. As main memory
access is a lot slower than the other tasks in the pipeline, data and instructions are stored close to the processor in
a small and comparatively faster memory called a cache. The main aspects of cache design are cache size, memory
mapping function, write policy, and replacement algorithm. In a direct mapped cache, each block of memory is
mapped to one particular row in the cache. The mapping function is simple to implement, but the performance of this type
of cache is not the best. The synchronization of the cache with main memory, the handling of read and write misses, etc.,
make the direct mapped cache a good candidate for this assignment, the aim of which is to learn Verilog modeling and
design simulation with ModelSim.
2. Cache Design:
2.1 Assumptions: The cache is designed with the following assumptions.
The cache lies between the processor and the main memory.
The cache and the processor run on the same fast clock.
Main memory is a single-port synchronous DRAM running on a slower clock (4 times slower).
The processor sends a physical address to the cache.
The processor sends/requests one word of data (32 bits wide) at a time.
The cache implements a "write through" write policy with "no write allocate", i.e. on a write miss
data is written only to main memory.
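Under write-through with no write allocate, a processor write is always forwarded to main memory, and the cached copy is updated only on a hit. A minimal sketch of this policy follows, using the port names defined later in Table 1; data_array, line, and word are illustrative names for the line storage and address fields, not taken from Appendix A:

```verilog
// Write-through, no-write-allocate (sketch; storage names illustrative)
always @(posedge clk) begin
  if (!rd_en && !busy) begin            // processor write request
    if (cache_hit_reg)
      data_array[line][word] <= data;   // write hit: update the cached word too
    // hit or miss: always forward the write to main memory
    mem_addr    <= addr;
    mem_wr      <= 1'b0;                // LOW = write to main memory (Table 1)
    chip_select <= 1'b1;
  end
end
```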
2.2 Cache and Main Memory Size:
To keep the cache size, memory size, data width, etc. flexible for any cache module instance, parameters are used.
Parameters define constants which can be changed during instantiation. The following parameters, with the values specified, were used for the simulations in this assignment.
parameter ADDR_SIZE = 8; meaning an 8-bit address and 2^8 = 256 main memory locations.
parameter DATA_SIZE = 32; meaning the processor is 32-bit and each main memory location is 32 bits wide, making it a
256 x 4 byte = 1024 B memory.
parameter LINE_BITS = 2; meaning 4 lines in the cache.
parameter LINES = 1 << LINE_BITS;
parameter WORD_BITS = 2; meaning 4 words per line in the cache, making the cache a 4 x 4 x 4 byte = 64 B memory.
parameter WORDS = 1 << WORD_BITS;
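These parameters and the resulting address break-up can be sketched as a parameterized module header; the wire names for the address fields are illustrative, and Appendix A is authoritative:

```verilog
module cache #(
  parameter ADDR_SIZE = 8,               // 8-bit address -> 2^8 = 256 locations
  parameter DATA_SIZE = 32,              // 32-bit words
  parameter LINE_BITS = 2,               // 2^2 = 4 cache lines
  parameter LINES     = 1 << LINE_BITS,
  parameter WORD_BITS = 2,               // 2^2 = 4 words per line
  parameter WORDS     = 1 << WORD_BITS,
  parameter TAG_BITS  = ADDR_SIZE - LINE_BITS - WORD_BITS  // 8 - 2 - 2 = 4
) (
  input [ADDR_SIZE-1:0] addr
  /* remaining ports omitted for brevity */
);
  // Direct-mapped address break-up (Figure 1): TAG | LINE | WORD
  wire [WORD_BITS-1:0] word = addr[WORD_BITS-1:0];
  wire [LINE_BITS-1:0] line = addr[WORD_BITS +: LINE_BITS];
  wire [TAG_BITS-1:0]  tag  = addr[ADDR_SIZE-1 -: TAG_BITS];
endmodule
```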
According to these sizes, the address is broken up as shown in Figure 1 for direct mapping in the cache.
Figure 1. Address break-up for direct mapping: TAG (4 bits) | LINE (2 bits) | WORD (2 bits).
2.3 Block Diagram:
The block diagram is shown in Figure 2. A behavioral model is written for the cache and main memory blocks.
The signals from the processor are given as stimulus in the test bench. Table 1 lists the ports of the cache.
Figure 2. Block Diagram showing signal connections.
Port | Direction | Description
clk | input | Common clock between cache and processor
reset | input | Synchronous reset to the cache
rd_en | input | HIGH for read from cache, LOW for write to cache
data[31:0] | bidirectional | Data from/to the processor; direction determined by rd_en
addr[7:0] | input | Address from the processor
data_valid | output | Active-high signal indicating that the output data to the processor is valid
busy | output | Active-high signal indicating that the cache is busy; the processor will not send another request while the cache is busy
mem_addr[7:0] | output | Address bus to main memory
mem_wr | output | HIGH for read from main memory, LOW for write to main memory
chip_select | output | Signal to enable main memory access
mem_data[31:0] | bidirectional | Data from/to main memory
rd_done | input | Signal from main memory that the requested read operation is done
wr_done | input | Signal from main memory that the requested write operation is done
Table 1. Ports of the cache with direction and descriptions.
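The two bidirectional buses in Table 1 imply tri-state driving. One common pattern, sketched here with the signal names of Tables 1 and 2 (the exact conditions in Appendix A may differ), is:

```verilog
// Inside the cache module:
// Drive the processor data bus only on a valid read (rd_en HIGH), otherwise
// float it so the processor can drive it during writes.
reg [31:0] data_out;      // registered output, see Table 2
assign data = (rd_en && data_valid) ? data_out : 32'bz;

// Likewise for the memory bus: drive it only while writing to main memory
// (mem_wr LOW per Table 1), float it during memory reads.
reg [31:0] mem_data_out;
assign mem_data = (chip_select && !mem_wr) ? mem_data_out : 32'bz;
```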
(Figure 2 shows the processor, the cache, and main memory; each cache memory line holds a tag followed by words 1 to 4.)
Register | Size | Description
cache_hit_reg | 1 bit | Indicates a tag match, meaning the requested address is present in the cache
line | 2 bits | Stores the line index into the cache, taken from the input address
tag | 4 bits | Stores the location tag, taken from the input address
count | 2 bits | Keeps track of the number of main memory reads in case of a read miss
data_out | 32 bits | Registered data out before driving it onto the bidirectional data bus to the processor
mem_data_out | 32 bits | Registered data out before driving it onto the bidirectional data bus to memory
mem_data_reg0 to 3 | 32 bits | 4 registers to store the data words read in from main memory
Table 2. Internal registers used in the behavioral model.
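Table 2 lists only the control registers; the cache lines themselves also need storage. One way to declare the line storage, matching Figure 2's tag-plus-four-words layout (these names and the valid bit are assumptions, not taken from Appendix A), is:

```verilog
// 4 lines, each holding a 4-bit tag and 4 words of 32 bits
reg [3:0]  tag_array  [0:3];       // tag per line
reg [31:0] data_array [0:3][0:3];  // [line][word]
reg        valid      [0:3];       // per-line valid bit (assumed: lines start empty)
```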
3. Verilog Implementation
3.1 Verilog code
The Verilog code for the cache is given in Appendix A. The design is implemented in 6 always blocks which execute concurrently. There are 2 combinational blocks and 4 sequential blocks. These blocks perform
the following logical tasks and together model the cache behavior.
3.1.1 Combinational Blocks:
I. Tag comparison: This block continuously checks whether the tag stored in the line referenced by the input address
matches the tag field of that address. It sets cache_hit_reg if there is a tag match, irrespective of read
or write operation.
II. Memory select: This block controls the enabling of main memory. Main memory needs to
be enabled only when data needs to be transferred to/from it. This gives better control over the
rd_done and wr_done signals given out by main memory.
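The two combinational blocks above might look roughly as follows; tag_array and miss_in_progress are illustrative names assumed for this sketch (Appendix A is authoritative):

```verilog
// I. Tag comparison: combinational, independent of read/write
always @(*) begin
  line = addr[3:2];                        // line index field (Figure 1)
  tag  = addr[7:4];                        // tag field (Figure 1)
  cache_hit_reg = (tag_array[line] == tag);
end

// II. Memory select: enable main memory only while data must be
// transferred, giving clean control over its rd_done/wr_done handshakes.
always @(*) begin
  chip_select = miss_in_progress;          // 'miss_in_progress' is illustrative
end
```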
3.1.2 Sequential Blocks:
I. Cache Hit: Only if the tag comparison is successful does this block execute and perform the required data
manipulation.
II. Cache Miss: Only if the tag comparison is unsuccessful does this block execute and perform the required data
manipulation.
III. Data Synchronizing from Memory: There are two blocks, one running on posedge clk and the other on posedge
rd_done. These are required to synchronize the reads from memory in case of a read miss, as the cache
and memory run on different, asynchronous clocks.
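One plausible arrangement of these two synchronizing blocks is sketched below, with count and mem_data_reg0 to 3 as in Table 2; the exact handshake is in Appendix A:

```verilog
reg [1:0]  count;                       // Table 2: tracks the 4 reads of a miss
reg [31:0] mem_data_reg0, mem_data_reg1,
           mem_data_reg2, mem_data_reg3;

// Block 1: clocked by rd_done from the slow memory domain --
// latch each word as the memory reports it ready.
always @(posedge rd_done) begin
  case (count)
    2'd0: mem_data_reg0 <= mem_data;
    2'd1: mem_data_reg1 <= mem_data;
    2'd2: mem_data_reg2 <= mem_data;
    2'd3: mem_data_reg3 <= mem_data;
  endcase
end

// Block 2: on the fast cache clock, step the word counter and
// issue the address of the next word in the block.
always @(posedge clk) begin
  if (rd_done && count != 2'd3) begin
    count    <= count + 2'd1;
    mem_addr <= {tag, line, count + 2'd1};  // TAG | LINE | WORD per Figure 1
  end
end
```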
3.2 Test bench
The test bench code is given in Appendix C. The test bench runs 4 test cases to test the functionality of
the direct mapped cache. The clk signal is given a period of 10 ns and the mem_clk period is 40 ns.
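The two clocks described above can be generated in the test bench as follows (a standard pattern; Appendix C may differ in detail):

```verilog
reg clk, mem_clk;
initial begin
  clk     = 1'b0;
  mem_clk = 1'b0;
end
always #5  clk     = ~clk;      // 10 ns period: fast cache/processor clock
always #20 mem_clk = ~mem_clk;  // 40 ns period: 4x slower memory clock
```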
1) Write Miss: Initially there is nothing in the cache or memory. The processor issues 4 writes to
consecutive memory locations, all of which result in cache write misses. The data is written only to main
memory. As seen in the waveform, data values 56, 57, 58 and 59 were written to memory locations 120, 121, 122 and
123 respectively. The cache_hit_reg signal was always low, meaning a cache miss, and proper busy pulses
were given to the processor for every write.
Figure 3. Waveforms showing the cache write miss test case.
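The four consecutive writes of test case 1 could be issued with a small test-bench task like the one below; data_drv is an illustrative register behind the tri-stated data bus, and Appendix C is authoritative:

```verilog
task do_write(input [7:0] a, input [31:0] d);
  begin
    @(posedge clk);
    rd_en    <= 1'b0;        // LOW = write to cache (Table 1)
    addr     <= a;
    data_drv <= d;           // register driving the bidirectional data bus
    @(posedge clk);
    wait (!busy);            // cache raises busy while forwarding the write
  end
endtask

initial begin
  do_write(8'd120, 32'd56);
  do_write(8'd121, 32'd57);
  do_write(8'd122, 32'd58);
  do_write(8'd123, 32'd59);
end
```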
2) Read Miss: Now the test bench requests the data written in the previous step. This results
in a read miss, and the cache brings the data in from main memory. In this case, as the memory holds only
one word at each location, the cache has to do 4 reads to fetch a block of data and replace a line. As
seen in the waveform in Figure 4, the processor requests the data at location 120, resulting in a read
miss. This triggers 4 reads from main memory. The required data is given to the processor with
data_valid, and cache line 2 is written with the 4 words (56, 57, 58, 59).
Figure 4. Waveforms showing the cache read miss test case.
3) Read Hit: Again the processor requests the same data. This time it is a cache hit, as the data
was brought into the cache in the previous step. The data requested was at location 122, and as
seen in Figure 5, the correct data, 58, was returned.
4) Write Hit: Now the processor writes a word to the cache at the same address from which
it read in the last step. This results in a cache hit and the data is written properly. The data 60 is
requested to be written at location 122. As seen in the waveform in Figure 5, 60 is correctly
written to the cache. Following the write-through method, this data is written to main memory too.
Figure 5. Waveforms showing the cache read hit and write hit test cases.