© 2003 Fast-Chip. All rights reserved. 11/23/2015 3:37:34 AM RTL-Synchronized Transaction Reference Models Dave Whipp Fast-Chip Inc.

Post on 05-Jan-2016


© 2003 Fast-Chip. All rights reserved. 04/20/23 07:17 PM

RTL-Synchronized Transaction Reference Models

Dave Whipp

Fast-Chip Inc.

© 2003 Fast-Chip Confidential. All rights reserved. 04/20/23 07:17 PM

Motivation

› Needed Cycle Verification
    • Now, not 6 months later
› Why build two models, when one will do
    • We had a working “functional” model
› Don’t Chase RTL
    • Avoid modeling artifacts of the implementation


Overview

1. What is Transaction Synchronization

2. Patterns in Transaction Synchronization

3. Methodology, Futures, Summary


Part 1

What is Transaction Synchronization?


A Functional Model

    int classify_packet (Packet packet_data, Uint32 rule_address)
    {
        int result = ITERATE;
        while (result == ITERATE) {
            RuleStruct rule;
            read_rule(&rule, rule_address);
            int field = extract(rule, packet_data);
            interpret(rule, field, &result, &rule_address);
        }
        return result;
    }


“Bringup” Flow

[Diagram: test.script is run on both the C-sim and the RTL-sim; their logs, csim.log and rtl.log, go to Compare.]
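The Compare step can be sketched as a line-by-line log comparison. This is a minimal sketch, not the tool used in the talk: `compare_logs` and its return convention are hypothetical, and it assumes both simulators write their logs in exactly the same text format.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: compare two log files line by line.
 * Returns 0 if the files match, the 1-based number of the first
 * differing line, or -1 if either file cannot be opened.
 * Lines longer than the buffer are compared in chunks, so the
 * reported line number is only approximate for such files. */
long compare_logs(const char *csim_path, const char *rtl_path)
{
    FILE *a = fopen(csim_path, "r");
    FILE *b = fopen(rtl_path, "r");
    char la[512], lb[512];
    long line = 0, result = 0;

    if (!a || !b) {
        if (a) fclose(a);
        if (b) fclose(b);
        return -1;
    }
    for (;;) {
        char *ra = fgets(la, sizeof la, a);
        char *rb = fgets(lb, sizeof lb, b);
        line++;
        if (!ra && !rb)
            break;                      /* both logs ended together */
        if (!ra || !rb || strcmp(la, lb) != 0) {
            result = line;              /* one log ended early, or text differs */
            break;
        }
    }
    fclose(a);
    fclose(b);
    return result;
}
```

Calling `compare_logs("csim.log", "rtl.log")` after both simulations finish gives the first line at which the two models disagree.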


Transaction Interactions

[Diagram: Thread A and Thread B interact through the Rules DB via Read-Rule and Write-Rule transactions.]


Trace Files

› A trace of the sequence of transaction steps
› Each synch point has a name, and thread-ID
    • Comments provide context (values from RTL)
› Often hand-edited during debug

Example:
    [1536] read_rule thread_A # addr=h8a34 data=h1578
    [1544] write_rule thread_B # addr=h8a34 data=h5343
    [1632] read_rule thread_A # addr=h8a34 data=h5343
    [1694] write_rule thread_B # addr=h8a34 data=hf519
    [1694] read_rule thread_A # addr=h8a34 data=hf519
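A trace line in the format above could be parsed roughly as follows. This is a sketch under assumptions: `struct synch_point` and `parse_trace_line` are hypothetical names, and the layout `[time] name thread # comment` is inferred from the example only.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one trace line; field names are illustrative. */
struct synch_point {
    unsigned long time;   /* number in brackets, e.g. 1536 */
    char name[32];        /* synch-point name, e.g. "read_rule" */
    char thread[32];      /* thread ID, e.g. "thread_A" */
};

/* Parse "[time] name thread # comment".
 * Returns 1 on success, 0 if the line does not match.
 * Everything from '#' onward is a comment and is ignored. */
int parse_trace_line(const char *line, struct synch_point *sp)
{
    char buf[256];
    char *hash;

    strncpy(buf, line, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';
    hash = strchr(buf, '#');
    if (hash)
        *hash = '\0';                   /* strip the comment */

    return sscanf(buf, " [%lu] %31s %31s",
                  &sp->time, sp->name, sp->thread) == 3;
}
```

Because the comment is discarded, hand-editing the values after `#` during debug does not change what is parsed.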


“Synchronized” Flow

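[Diagram: the same blocks as the bringup flow (test.script, C-sim, RTL-sim, csim.log, rtl.log, Compare), rearranged so that the C-sim is synchronized to the RTL-sim.]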


Simulation Kernel

[Flowchart: Read Synch; if the synch point is [pending], Call Synch function; if [not pending], Read Stimulus. Pending Synch Points are held in a task list.]
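The pending/not-pending decision can be sketched as a small task list. This is an illustrative guess at the mechanism, not the kernel from the talk: `synch_request`, `synch_dispatch`, `count_resume` and the fixed-size list are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define MAX_PENDING 16

/* A pending synch point: a transaction step waiting for the trace. */
struct pending {
    const char *name;            /* e.g. "read_rule" */
    const char *thread;          /* e.g. "thread_A" */
    void (*resume)(void *cxt);   /* synch function to call */
    void *cxt;
    int in_use;
};

static struct pending task_list[MAX_PENDING];

/* Register a synch function. Returns 0 on success, -1 if the list is full. */
int synch_request(const char *name, const char *thread,
                  void (*resume)(void *), void *cxt)
{
    for (int i = 0; i < MAX_PENDING; i++) {
        if (!task_list[i].in_use) {
            task_list[i] = (struct pending){ name, thread, resume, cxt, 1 };
            return 0;
        }
    }
    return -1;
}

/* Handle one synch entry read from the trace.
 * Returns 1 if a matching task was pending and has been called.
 * Returns 0 if not pending: the caller should read more stimulus. */
int synch_dispatch(const char *name, const char *thread)
{
    for (int i = 0; i < MAX_PENDING; i++) {
        struct pending *p = &task_list[i];
        if (p->in_use && strcmp(p->name, name) == 0 &&
            strcmp(p->thread, thread) == 0) {
            void (*resume)(void *) = p->resume;
            void *cxt = p->cxt;
            p->in_use = 0;       /* free the slot first: resume may re-register */
            resume(cxt);
            return 1;
        }
    }
    return 0;
}

/* Example synch function for experiments: counts how often it ran. */
void count_resume(void *cxt) { ++*(int *)cxt; }
```

A kernel loop built on this would read a synch entry, call `synch_dispatch`, and fall back to reading stimulus whenever it returns 0.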


Memory Access with Arbiter

[Diagram: clients A and B, each through a Delay, share Mem via an arbiter (Arb); a single Monitor observes the access.]


Dual Port Memory Access

[Diagram: clients A and B each reach a dual-port Memory through a Delay; Monitor A and Monitor B observe one port each.]


A Functional Model

[Slide build: the functional model with a cut after the read_rule call; the remainder is labeled continue_read_rule. Not compilable as shown.]

    int classify_packet (Packet packet_data, Uint32 rule_address)
    {
        int result = ITERATE;
        while (result == ITERATE) {
            RuleStruct rule;
            read_rule(&rule, rule_address);
    }

    int continue_read_rule ()
    {
            int field = extract(rule, packet_data);
            interpret(rule, field, &result, &rule_address);
        }
        return result;
    }


Refactoring

1. Move local variables into a “context” structure. Create an instance (on the heap, not the stack) at start of transaction – and delete at end.

2. Replace iterative loops with recursive functions.

3. For each function that requires synchronization (directly or indirectly), replace the call with a request/callback pair.


“Context” Structure

    struct context
    {
        Packet packet_data;
        Uint32 rule_address;
        RuleStruct rule;
        int field;
        int result;

        void (*callback) (int);
    };


Introduce Context Structure

    #include <stdlib.h>  /* calloc, free */

    void classify_packet_request (Packet packet_data, Uint32 rule_address,
                                  void (*callback)(int))
    {
        /* heap-allocated, zero-initialized context for this transaction */
        struct context *cxt = calloc(1, sizeof(struct context));
        cxt->packet_data = packet_data;
        cxt->rule_address = rule_address;
        cxt->callback = callback;

        cxt->result = ITERATE;

        classify_packet_iterate(cxt);
    }

    void classify_packet_reply (struct context *cxt)
    {
        /* copy out what is still needed before freeing the context */
        int result = cxt->result;
        void (*callback)(int) = cxt->callback;
        free(cxt);
        callback(result);
    }


Non-Recursive Implementation

    void classify_packet_iterate (struct context *cxt)
    {
        while (cxt->result == ITERATE)
        {
            read_rule(&cxt->rule, cxt->rule_address);
            cxt->field = extract(cxt->rule, cxt->packet_data);
            interpret(cxt->rule, cxt->field, &cxt->result, &cxt->rule_address);
        }
        classify_packet_reply(cxt);
    }


Recursive Implementation

    void classify_packet_iterate (struct context *cxt)
    {
        if (cxt->result == ITERATE)
        {
            read_rule(&cxt->rule, cxt->rule_address);
            cxt->field = extract(cxt->rule, cxt->packet_data);
            interpret(cxt->rule, cxt->field, &cxt->result, &cxt->rule_address);
            classify_packet_iterate(cxt);
        }
        else
        {
            classify_packet_reply(cxt);
        }
    }


Synchronized Implementation

    void classify_packet_iterate (struct context *cxt)
    {
        if (cxt->result == ITERATE) {
            /* request the read; continue_read_rule runs when it completes */
            read_rule_request(&cxt->rule, cxt->rule_address, &continue_read_rule);
        } else {
            classify_packet_reply(cxt);
        }
    }

    void continue_read_rule (struct context *cxt)
    {
        cxt->field = extract(cxt->rule, cxt->packet_data);
        interpret(cxt->rule, cxt->field, &cxt->result, &cxt->rule_address);
        classify_packet_iterate(cxt);
    }
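A self-contained sketch of the request/callback pattern follows, so the control flow can be run in isolation. Several things here are assumptions rather than the slide's API: the context is passed explicitly to `read_rule_request`, a one-slot pending record stands in for the kernel's task list, `read_rule_synch` plays the part of the kernel reaching a read_rule synch point, and `rules_db`, `record_result` and the toy interpret step are stand-ins.

```c
#include <stdlib.h>

enum { ITERATE = -1 };

struct context {
    unsigned rule_address;
    unsigned rule;              /* stand-in for RuleStruct */
    int result;
    void (*callback)(int);
};

unsigned rules_db[4];           /* toy rules memory owned by the model */

/* One-slot pending read, standing in for the kernel's task list. */
static struct context *pending_cxt;
static void (*pending_cont)(struct context *);

static void classify_packet_iterate(struct context *cxt);

/* Assumed request form: record the continuation, do not read yet. */
static void read_rule_request(struct context *cxt,
                              void (*cont)(struct context *))
{
    pending_cxt = cxt;
    pending_cont = cont;
}

/* Called when the trace reaches a read_rule synch point: the read
 * happens now, so earlier writes to rules_db are visible to it.
 * Returns 1 if a read was pending, 0 otherwise. */
int read_rule_synch(void)
{
    struct context *cxt = pending_cxt;
    void (*cont)(struct context *) = pending_cont;
    if (!cxt)
        return 0;
    pending_cxt = NULL;
    cxt->rule = rules_db[cxt->rule_address & 3];
    cont(cxt);
    return 1;
}

static void classify_packet_reply(struct context *cxt)
{
    int result = cxt->result;
    void (*callback)(int) = cxt->callback;
    free(cxt);
    callback(result);
}

static void continue_read_rule(struct context *cxt)
{
    /* Toy interpret step: rule 0 means "try the next address",
     * any other rule value is the final result. */
    if (cxt->rule == 0)
        cxt->rule_address++;
    else
        cxt->result = (int)cxt->rule;
    classify_packet_iterate(cxt);
}

static void classify_packet_iterate(struct context *cxt)
{
    if (cxt->result == ITERATE)
        read_rule_request(cxt, continue_read_rule);
    else
        classify_packet_reply(cxt);
}

/* Returns 0 on success, -1 if the context cannot be allocated. */
int classify_packet_request(unsigned rule_address, void (*callback)(int))
{
    struct context *cxt = calloc(1, sizeof *cxt);
    if (!cxt)
        return -1;
    cxt->rule_address = rule_address;
    cxt->callback = callback;
    cxt->result = ITERATE;
    classify_packet_iterate(cxt);
    return 0;
}

/* Example callback for experiments: remembers the last result. */
int last_result;
void record_result(int r) { last_result = r; }
```

Each call to `read_rule_synch` advances the transaction by exactly one read, which is what lets a write from another thread land between two reads in trace order.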


Transaction Diagrams

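[Transaction diagram: Classify Packet consists of Read Rule (from the Rules DB), Extract (from the Packet Buffer) and Interpret; [iterate] loops back to Read Rule, [done] ends the transaction.]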


Part 2

Patterns in Transaction Synchronization


Adding a Cache

› Cache needn’t affect transactions
    • Data-RAM not modeled
    • Cache is coherent
    • Can rerun all tests, with no changes to C model
› Tag RAM is an Addition, not Modification
    • Independent Transactions
    • Independent Synchronization


Single Port, Cached

[Diagram: clients A and B, each through a Delay, reach Mem through Arb and a Cache. Labeled points: Read/Write; Tag RAM with Miss Rd/Wr and Hit Rd/Wr; Check ECC; Correct Errors.]


Cache Transaction (Read)

[Transaction diagram: Read Tag branches on [hit] or [miss]; the steps shown are Read Tag, Write Tag, Read Data and Check ECC.]


FIFOs and Counters

› Delay elements need no synchronization
    • But synchronization can increase locality
› Some FIFOs can drop transactions
    • Synchronize overflow: don’t model actual size
› Counters seem to need cycle-based model
    • We want to avoid this
› Correct Synch propagates “forces” to Model


Synchronizing a FIFO

[Diagram: Producer, FIFO, Consumer, with Push, Pop and Drop synch points, Flow Control, and a Force.]


FIFO Transaction Diagram

[Transaction diagram: branches [drop] and [push]; Pop.]


FIFO Synchronization Checker

[Diagram: Producer, FIFO, Consumer, with Push, Pop and Drop synch points feeding a Checker.]

Checker: Queue Size Assertions
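Queue size assertions might look like the sketch below. The specific rules are assumptions, not taken from the talk: a push is legal only below the RTL depth, a drop is expected only when the queue is full, and a pop only when it is non-empty. `RTL_DEPTH`, `struct fifo_model` and `fifo_synch` are hypothetical names; the depth lives in the checker, so the model itself never needs to know the FIFO size.

```c
#include <stddef.h>

/* The model's FIFO tracks occupancy only; capacity is not modeled. */
struct fifo_model {
    size_t size;        /* entries currently queued in the model */
    size_t max_seen;    /* high-water mark, useful in reports */
};

enum fifo_event { FIFO_PUSH, FIFO_DROP, FIFO_POP };

/* Hypothetical checker parameter: depth of the RTL FIFO. */
#define RTL_DEPTH 4

/* Apply one synch event from the trace and check the queue size.
 * Returns 1 if the event is consistent with the occupancy,
 * 0 if an assertion fails (the model is left unchanged). */
int fifo_synch(struct fifo_model *f, enum fifo_event ev)
{
    switch (ev) {
    case FIFO_PUSH:
        if (f->size >= RTL_DEPTH)
            return 0;                   /* RTL pushed into a full queue */
        f->size++;
        if (f->size > f->max_seen)
            f->max_seen = f->size;
        return 1;
    case FIFO_DROP:
        return f->size == RTL_DEPTH;    /* drop expected only when full */
    case FIFO_POP:
        if (f->size == 0)
            return 0;                   /* RTL popped an empty queue */
        f->size--;
        return 1;
    }
    return 0;
}
```

A design with flow control may drop before the queue is completely full; in that case the drop rule would need a threshold instead of an exact match.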


Counters

[Diagram: a Register with a +1 increment and a load input from a Client; Update and Sample synch points; signals select, clk, sample_en and value; a Force applies a value.]


Scaffolding

› Permit verification of incomplete RTL
    • Encourage end-to-end skeletons
    • Implement “incorrect, but simple” algorithms
    • Don’t wait for complete RTL
    • Postpone modeling the algorithm
    • Use synch to avoid chasing a moving target
    • Remove scaffolding once RTL is complete


An Algorithm Cache

[Diagram: Tree Search performs Read Node on Node Memory; a Result Cache with Hit and Miss paths uses a Tag RAM with Hit Rd/Wr and Miss Rd/Wr.]


Algorithm Cache: Transactions

[Transaction diagram: Read Tag branches on [hit] or [miss]; Read Node with [match], [No match] and [iterate] branches; Backdoor search.]


Speculation

› When hardware speculates:
    • Effect precedes cause
    • Transaction model appears incorrect
› Creative accounting can sometimes help
    • Insert a “virtual” delay
    • Filter based on future events


Speculation

[Diagram: Read Ctrl and two Read Data steps.]


Speculation

[Diagram: Read Ctrl and a single Read Data step.]


Speculative Reads

[Diagram: a four-stage pipeline (Stage 1 to Stage 4) performing Lookup (Pipe). Ctrl RAM Update and Data RAM Update each have a write and a read point; annotations ask “Delay (2 clocks)?” and “advance (2 clocks)?”.]


Part 3

Methodology, Futures, Summary


Verification Flow

› RTL Simulation is expensive
    • Licenses
    • CPU time
› Post-Processing is cheap
› Stop simulations when broken
    • But not if bug is in test/model


Methodology

› Cycle-Precise Reference Comparison
    • Without a cycle-accurate model
› Verify the System First
    • Bringup Flow (Functional Model)
    • Synchronized Flow (Transaction-Testbench)
› Postpone module level testing
    • Use scoreboarding to identify unit testbenches
    • Only build unit-testbenches for stable modules


Comparison with Platform-Based

› System-on-Chip Methodology
    • Verify components first
    • Verify system as composition of verified units
› Complex-ASIC Methodology
    • Verify transactions first
    • Verify units in context of verified transactions
    • An “Agile” Methodology


Future Work

› Performance in non-synchronized mode
    • Use threading to avoid fragmentation
› Synchronization as basis of SW architecture
    • Cycle-model plug-in could provide synch
    • Can postpone this plug-in until tapeout
› But what if we want a cycle-model earlier?
    • Example: up-front performance validation


Summary

› Cycle timing is a “Don’t Care”
› Initial verification uses “Functional” model
    • Refactor into “Transaction” model
› RTL provides cycle timing
    • Caches, like FIFOs, are just delay elements
    • “Forces” in testbench propagate to model
› “Coarse-grain first” methodology


Questions

mailto:Dave@Whipp.name

http://Dave.Whipp.name/dv
