
World Class SystemVerilog & UVM Training

Applying Stimulus & Sampling Outputs ‐ UVM Verification Testing Techniques

Clifford E. Cummings

Sunburst Design, Inc. [email protected] www.sunburst-design.com

ABSTRACT

When should testbench stimulus vectors be applied to a design? When should design outputs be verified by the testbench? How should UVM drive and sample DUT signals? The details of driving stimulus and sampling outputs are among the most ad hoc habits of many verification engineers, and little thought has been given to the best usage strategies and the reasons for using them. This paper will detail fundamental techniques that have been proven to work with all phases of design verification. Discussed in this paper are the tradeoffs of three different techniques to apply stimulus. Also discussed in this paper are the best techniques for sampling DUT outputs. Finally, useful new SystemVerilog verification features such as clocking blocks and #1step sampling will be discussed. This paper also explains why the addition of the SystemVerilog program keyword was a bad idea and why it should not be used.

SNUG 2016

Rev 1.0


Table of Contents

1. Introduction
1.1 Introduction to UVM methodologies
1.2 Introduction to terminology
1.3 Assessing knowledge
2. Time‐0 race conditions
2.1 Time‐0 potential problem
2.2 Time‐0 initial and always block behavior
2.3 Time‐0 stimulus assignments
3. Verification goal
4. Stimulus Timing
5. Driving stimulus on the active clock edge ‐ Avoid this
5.1 Testbench blocking assignments
5.2 I/O pads in the RTL model
5.3 Gate‐level simulations with setup and hold delays
5.4 Waveform display debugging
6. Driving stimulus on the inactive clock edge ‐ Avoid this
6.1 Inactive‐clock stimulus problems
7. Driving stimulus using time budgeting ‐ Use this
8. Verification Timing
8.1 Sampling outputs on the active clock edge
8.2 Sampling outputs just before the next active clock edge
9. Clocking Blocks
9.1 Clocking Block Default Timing
9.2 #1step Sampling
9.3 #0 Drive Times
9.4 Example UVM clocking block
9.5 Stimulus using clocking drives
9.6 Why drive signals at time‐0?
9.7 Asynchronous control inputs
9.8 Interface modports and testbenches
10. Death to the SystemVerilog program!
10.1 Cliff's confession
11. Conclusions
12. References
13. Author & Contact Information

Table of Figures

Figure 1 ‐ VCS simulation race‐condition output of Example 1
Figure 2 ‐ Simulator‐B simulation race‐condition output of Example 1
Figure 3 ‐ VCS simulation NO‐race‐condition output of Example 2
Figure 4 ‐ Simulator‐B simulation NO‐race‐condition output of Example 2
Figure 5 ‐ Example of real hardware timing
Figure 6 ‐ Example of applying stimulus on the active clock edge ‐ prone to testbench race conditions
Figure 7 ‐ Asymmetrical stimulus clock ‐ change stimulus on the negedge of (stim)clk
Figure 8 ‐ Stimulus generation for dual‐clock logic ‐ change stimulus on the negedge of vclk
Figure 9 ‐ Synthesis constraint settings using time budgeting
Figure 10 ‐ Example clockvars driven using the clocking block name
Figure 11 ‐ Four asynchronous reset signal scenarios
Figure 12 ‐ Asynchronous mid‐cycle, sub‐cycle reset pulse <insert common code example here>
Figure 13 ‐ SystemVerilog module and program event scheduling

Table of Examples

Example 1 ‐ Time‐0 blocking assignment ‐ race condition
Example 2 ‐ Time‐0 nonblocking assignment ‐ NO race condition
Example 3 ‐ SystemVerilog code to generate asymmetrical stimulus clock
Example 4 ‐ SystemVerilog code to generate dual‐clock logic stimulus clock
Example 5 ‐ Clocking block format using 20% time budget to drive stimulus
Example 6 ‐ Program counter DUT code
Example 7 ‐ CYCLE.sv file
Example 8 ‐ DUT interface with clocking block
Example 9 ‐ tb_driver with initialize() task (no clocking block timing) and drive_tr() task (uses clocking block timing)
Example 10 ‐ sample_dut task checks async reset at beginning and end of the cycle
Example 11 ‐ tb_monitor checks async reset at beginning and end of the cycle
Example 12 ‐ DUT interface with sticky‐bit code to save reset short‐pulse reset condition
Example 13 ‐ tb_monitor modified to test the sticky‐bit reset_n version of the rst_n asynchronous reset



1. Introduction

Although much is known about design verification, little has been written about the strategies that can and should be used to apply stimulus vectors and sample DUT outputs for design validation.

Troublesome verification issues include: time‐0 simulation race conditions, how to apply stimulus to reduce simulation race conditions, stimulus techniques that do not require changing the testbench when timing delays and timing checks are added to the simulation, a consistent way to verify the design outputs, and how to handle asynchronous control signals in verification.

This paper will show common stimulus generation techniques and present guidelines for Best Known Practices. The best technique will also be shown within a common UVM driver in Section 9.5.

This paper will also show the Best Known Practices for sampling DUT outputs and the best technique will be incorporated into a common UVM monitor as shown in Section 9.7.

This paper will also detail problems related to time‐0 simulation issues and how to avoid the problems.

Although this paper describes when to sample DUT outputs for verification purposes, it does not describe how to build a verification scoreboard for the sampled outputs. A paper describing UVM scoreboard architectures can be seen in [2].

1.1 Introduction to UVM methodologies

The testbench techniques described in this paper cover both SystemVerilog and UVM approaches. UVM verification environments typically connect a DUT to an interface that includes a clocking block. The handle of this interface is typically stored in a uvm_config_db that is accessed as a virtual interface by the UVM driver and monitor, or accessed by the UVM agent, which then copies the virtual interface handle to the driver and monitor that are built by the agent. The driver then uses clocking drives that access the clocking block in the real interface to drive stimulus, and the monitor uses clocking samples that again access the clocking block in the real interface to sample DUT outputs.

The use of clocking blocks for testbenches is described in Section 9 of this paper.

Examples of interfaces that are used in a UVM testbench are shown in Example 8 and in Example 12. These interfaces are typically instantiated in a top‐level module and the interface handles are stored in a uvm_config_db for retrieval by the UVM testbench classes.
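The flow described in this section can be sketched as follows. This is only a minimal illustration and not the interface or driver shown later in the paper: the names dut_if, tb_pkg, tb_driver and the "vif" lookup string are hypothetical, the clocking block skews are placeholders, and the test class that would build the driver is omitted.

```systemverilog
`timescale 1ns/1ns
// Hypothetical interface with a clocking block (signal names are assumptions)
interface dut_if (input logic clk);
  logic [7:0] din;
  logic       ld;
  clocking cb1 @(posedge clk);
    default input #1step output #2;   // placeholder skews for illustration
    output din, ld;
  endclocking
endinterface

package tb_pkg;
  import uvm_pkg::*;
  `include "uvm_macros.svh"

  // Hypothetical driver that retrieves the virtual interface handle
  class tb_driver extends uvm_driver #(uvm_sequence_item);
    `uvm_component_utils(tb_driver)
    virtual dut_if vif;

    function new (string name, uvm_component parent);
      super.new(name, parent);
    endfunction

    function void build_phase (uvm_phase phase);
      super.build_phase(phase);
      // Retrieve the handle that the top-level module stored
      if (!uvm_config_db#(virtual dut_if)::get(this, "", "vif", vif))
        `uvm_fatal("NOVIF", "virtual interface not found in uvm_config_db")
    endfunction
  endclass
endpackage

module top;
  import uvm_pkg::*;
  import tb_pkg::*;

  logic  clk;
  dut_if dif (.clk(clk));     // real interface; the DUT would connect to dif

  initial begin
    clk <= '0;                // time-0 nonblocking assignment
    forever #5 clk = ~clk;
  end

  initial begin
    // Store the interface handle for retrieval by the testbench classes
    uvm_config_db#(virtual dut_if)::set(null, "*", "vif", dif);
    run_test();               // test class (not shown) chosen with +UVM_TESTNAME
  end
endmodule
```

In this sketch the driver fetches the handle itself. The agent-based alternative mentioned above would perform the same get() call in the agent and copy the handle into the driver and monitor.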

1.2 Introduction to terminology

There are some terms used in this paper that might not be familiar to some verification engineers and cause undue confusion. Below are descriptions of a couple of terms that could help verification engineers better understand the concepts discussed in this paper.

clockvar is the term used to describe the signals declared in a clocking block as described by the SystemVerilog Standard [5] and by Bromley and Johnston [7].

clocking signal is the name of the signals declared in the clocking block and generally, a clocking signal and the clockvar names are the same [5].

clocking drive refers to stimulus that is driven using clocking block timing defined for a clockvar. The clocking drive operation makes assignments to clockvars using the clocking block name and the assignments are made using the clocking drive operator (<=), which is the same operator that is used for nonblocking assignments [5].
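To make the three terms concrete, the following small module is offered as an illustrative sketch only. The module name clockvar_demo, the signal din and the clocking block name cb1 are assumptions chosen for this illustration, and no output skew is specified, so the default output timing applies.

```systemverilog
`timescale 1ns/1ns
module clockvar_demo;
  logic       clk;
  logic [7:0] din;            // clocking signal

  clocking cb1 @(posedge clk);
    output din;               // clockvar cb1.din (same name as the signal)
  endclocking

  initial begin
    clk <= '0;
    forever #5 clk = ~clk;
  end

  initial begin
    @(cb1);                   // wait for the clocking event
    cb1.din <= 8'hA5;         // clocking drive: clocking block name + clockvar
    @(cb1);
    $display("%0t: din=%h", $time, din);
    $finish;
  end
endmodule
```

Note that the drive is written to cb1.din, not to din directly. An assignment made straight to din would be an ordinary procedural assignment and would not use the clocking block timing.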



1.3 Assessing knowledge

When assessing skills of new college graduates, job candidates or even your current verification engineers, I suggest the following assessment scale: 40% credit for properly starting up a verification test. 40% credit for properly shutting down a verification test. 20% credit for properly testing the DUT after the testbench has started. Almost anybody can get the middle part of a test to work correctly but properly starting and terminating a test is where real verification skill is required. My experience has shown that a large portion of the test debug time is related to the start‐up and shut‐down of a test. Talented engineers can avoid these prolonged debug issues and those are the skills that will be shown in this paper.

2. Time‐0 race conditions

Before starting to develop a SystemVerilog or UVM testbench, an engineer needs to consider what happens at time‐0 during a simulation.

Time‐0 is a tricky place in Verilog and SystemVerilog simulations. It is easy to experience lengthy race‐condition debugging issues at time‐0. If you follow a couple of simple guidelines, it is just as easy to avoid 100% of the time‐0 race conditions.

2.1 Time‐0 potential problem

One of the potential problems related to time‐0 race conditions is that the IEEE Verilog and SystemVerilog Standards require all procedural blocks (initial blocks and always blocks) to start execution at the beginning of the simulation at time‐0 but there is no defined order of execution of these blocks at time‐0. The SystemVerilog code of Example 1 has a time‐0 race condition:

    `define CYCLE 10
    `timescale 1ns/1ns
    module initial_always1;
      logic clk;

      initial $timeformat(-9,0,"ns",6);

      initial @(negedge clk)
        $display("%t: initial #1 negedge clk", $time);

      always begin
        @(negedge clk)
          $display("%t: always #1 negedge clk", $time);
        wait(0);
      end

      initial begin
        clk = '0;
        forever #(`CYCLE/2) clk = ~clk;
      end

      initial @(negedge clk)
        $display("%t: initial #2 negedge clk", $time);

      always begin
        @(negedge clk)
          $display("%t: always #2 negedge clk", $time);
        wait(0);
      end

      initial begin
        repeat(2) @(negedge clk);
        FINISH();
      end

      task FINISH();
        @(posedge clk);
        $display("%t: FINISH\n\n", $time);
        $finish;
      endtask
    endmodule

Example 1 ‐ Time‐0 blocking assignment ‐ race condition

Note that the clock oscillator in Example 1 has a time‐0 negedge clk assignment using a blocking assignment. The example also has an initial block (initial #1) and always block (always #1) that trigger on the negedge clk positioned in the code before the clock oscillator and another initial block (initial #2) and always block (always #2) that trigger on the negedge clk positioned in the code after the clock oscillator. If the clock oscillator starts before the initial and always blocks are active, those blocks will not trigger until one cycle after the simulation starts. If the clock oscillator starts after the initial and always blocks are active, those blocks will trigger at time‐0.

When VCS runs this simulation, the output is shown in Figure 1. Note that all of the blocks triggered at time‐0 except for the initial block that was placed after the clock oscillator. This is perfectly legal behavior for Verilog and SystemVerilog simulators.

     0ns: initial #1 negedge clk
     0ns: always #1 negedge clk
     0ns: always #2 negedge clk
    10ns: initial #2 negedge clk
    25ns: FINISH

Figure 1 ‐ VCS simulation race‐condition output of Example 1

When another simulator ("Simulator‐B") runs this simulation, the output is shown in Figure 2. Note that the blocks that preceded the clock oscillator triggered at time‐0 while the blocks that were positioned after the clock oscillator did not trigger until one cycle later. This too is perfectly legal behavior for Verilog and SystemVerilog simulators.

     0ns: always #1 negedge clk
     0ns: initial #1 negedge clk
    10ns: always #2 negedge clk
    10ns: initial #2 negedge clk
    25ns: FINISH

Figure 2 ‐ Simulator‐B simulation race‐condition output of Example 1

Using blocking assignments at time‐0 frequently causes a race condition. This race condition can be avoided by using nonblocking assignments at time‐0.

The modified SystemVerilog code of Example 2 uses a nonblocking assignment for the first assignment in the clock oscillator, which removes the time‐0 race condition.



    `define CYCLE 10
    `timescale 1ns/1ns
    module initial_always2;
      logic clk;

      initial $timeformat(-9,0,"ns",6);

      initial @(negedge clk)
        $display("%t: initial #1 negedge clk", $time);

      always begin
        @(negedge clk)
          $display("%t: always #1 negedge clk", $time);
        wait(0);
      end

      initial begin
        clk <= '0;
        forever #(`CYCLE/2) clk = ~clk;
      end

      initial @(negedge clk)
        $display("%t: initial #2 negedge clk", $time);

      always begin
        @(negedge clk)
          $display("%t: always #2 negedge clk", $time);
        wait(0);
      end

      initial begin
        repeat(2) @(negedge clk);
        FINISH();
      end

      task FINISH();
        @(posedge clk);
        $display("%t: FINISH\n\n", $time);
        $finish;
      endtask
    endmodule

Example 2 ‐ Time‐0 nonblocking assignment ‐ NO race condition

Note that the clock oscillator in Example 2 has a time‐0 negedge clk assignment that now uses a nonblocking assignment. The example still has an initial block (initial #1) and always block (always #1) that trigger on the negedge clk positioned in the code before the clock oscillator and another initial block (initial #2) and always block (always #2) that trigger on the negedge clk positioned in the code after the clock oscillator. Even if the clock oscillator starts before the initial and always blocks are active, the clock assignment will not complete until after the other blocks have become active.

When VCS runs this simulation, the output is shown in Figure 3. Note that all of the blocks triggered at time‐0. The time‐0 race condition has been removed.



     0ns: initial #1 negedge clk
     0ns: always #1 negedge clk
     0ns: initial #2 negedge clk
     0ns: always #2 negedge clk
    15ns: FINISH

Figure 3 ‐ VCS simulation NO‐race‐condition output of Example 2

When another simulator ("Simulator‐B") runs this simulation, the output is shown in Figure 4. Note that even though the code has executed in a slightly different order, again all of the blocks triggered at time‐0. The time‐0 race condition has been removed, and both simulations give the same result.

It is also worth noting that the FINISH command executed one cycle earlier using both simulators because the initial block with the FINISH command was also now active at time‐0.

     0ns: always #2 negedge clk
     0ns: initial #2 negedge clk
     0ns: always #1 negedge clk
     0ns: initial #1 negedge clk
    15ns: FINISH

Figure 4 ‐ Simulator‐B simulation NO‐race‐condition output of Example 2

2.2 Time‐0 initial and always block behavior

Although all of the simulation results shown in Section 2.1 are legal, in practice the major simulation vendors frequently start up always blocks before initial blocks at time‐0. This behavior is not guaranteed by the IEEE Verilog and SystemVerilog Standards, but this, and the fact that most testbenches drive stimulus across module ports, is why most testbenches work correctly at time‐0. RTL designs are typically coded using always blocks and testbenches are typically coded using initial blocks, so the RTL designs typically become active at time‐0 before the initial blocks send the first stimulus.

This is a point of confusion for most new Verilog users because it sounds like an initial block should execute first at time‐0, but this is not what happens. A more accurate name for the initial block would have been a run_once block!

2.3 Time‐0 stimulus assignments

The best guideline to follow to avoid time‐0 race conditions is to make all time‐0 stimulus assignments using nonblocking assignments. After time‐0, all other stimulus assignments can be made using blocking assignments if stimulus is driven using the time budgeting technique described in Section 7.

This is the race‐free stimulus driving technique that I have used successfully for more than 10 years.
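The guideline can be illustrated with a small self‐contained testbench. This is a sketch under stated assumptions, not code from the paper: the module name tb_time0, the stand‐in register used as a DUT, and the 2ns offset after the clock edge are all hypothetical choices made for the illustration.

```systemverilog
`timescale 1ns/1ns
module tb_time0;
  logic       clk, rst_n;
  logic [7:0] din, q;

  // Stand-in DUT (assumption): 8-bit register with asynchronous reset
  always_ff @(posedge clk or negedge rst_n)
    if (!rst_n) q <= '0;
    else        q <= din;

  initial begin                 // clock oscillator
    clk <= '0;                  // time-0 nonblocking assignment
    forever #5 clk = ~clk;
  end

  initial begin                 // stimulus
    rst_n <= '0;                // time-0 stimulus uses nonblocking assignments
    din   <= '0;
    @(posedge clk);
    #2 rst_n = '1;              // after time-0: blocking assignments applied
       din   = 8'h11;           // away from the active clock edge
    @(posedge clk);
    #2 din   = 8'h22;
    @(posedge clk);
    #2 $display("%0t: q=%h", $time, q);
    $finish;
  end
endmodule
```

Because the time‐0 assignments are nonblocking, the always_ff block is already waiting on negedge rst_n when the reset value is applied, regardless of the order in which the simulator starts the procedural blocks.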

3. Verification goal

When building a testbench, engineers need to ask these questions:

1. When should test vectors be driven?
2. When should outputs be sampled?

The goal is to construct a testbench that can be used for behavioral models, for 0‐delay RTL designs and for gate‐level simulations that include backannotated SDF timing with no modification to the testbench. Engineers certainly do not want to maintain two or more separate testbenches due to poor testbench planning and timing issues related to different DUT implementations.

4. Stimulus Timing

When should stimulus vectors be applied to a design? Is the stimulus timing realistic when compared to real‐world design constraints?

There are three primary stimulus generation techniques that have been used for decades by engineers responsible for building verification environments: (1) apply stimulus on the active clock edge, (2) apply stimulus on the inactive clock edge, and (3) apply stimulus using Time Budgeting techniques.

These techniques, along with new SystemVerilog clocking drive techniques, are described with their advantages and potential pitfalls in the following sections.

5. Driving stimulus on the active clock edge ‐ Avoid this

In theory, driving stimulus vectors on the active clock edge should work with 0‐delay RTL models and many engineers commonly use this technique. I highly discourage this practice for reasons that are described later in this section, but if this technique is used, verification engineers need to understand the limitations and potential pitfalls of this technique.

It is frequently claimed that applying stimulus on an active clock edge more closely replicates actual hardware behavior, but this is not true. In real hardware, input data frequently changes nanoseconds after an active clock edge is observed, as shown in Figure 5.

Figure 5 ‐ Example of real hardware timing

The non‐recommended practice of applying testbench stimulus on the active clock edge would be a setup or hold time violation in real hardware, and prone to simulation race conditions as shown in Figure 6.



Figure 6 ‐ Example of applying stimulus on the active clock edge ‐ prone to testbench race conditions

For Verilog verification, a key to using the active clock‐edge stimulus technique is to drive all stimulus from the testbench using nonblocking assignments, which should guarantee that design inputs change after the exact same active clock edge has been used to sample signals from the previous cycle. You do not want to change inputs on a clock edge and have the same clock edge capture the signals that just changed. Real hardware does not behave that way.

For SystemVerilog verification, a key to using the active clock‐edge stimulus technique is to drive the stimulus using one of the following: nonblocking assignments from modules, any type of assignment from a program, or driving assignments from a clocking block originating from either a module, a class or a program.

In practice, there are many situations where applying stimulus on the active clock edge can cause unnecessary, time‐consuming debug difficulties related to stimulus‐design race conditions. Below are some of the potential race conditions that can occur.

So what are the potential problems and mistakes associated with applying stimulus on the active clock edge?

5.1 Testbench blocking assignments

The first potential problem is a rather obvious mistake that happens frequently but can be detected and corrected quickly. The simple mistake is that the testbench‐writer used a blocking assignment from a module‐based testbench to apply stimulus, and some of the stimulus inputs changed before the active clock edge had a chance to sample the previous DUT inputs. Although simple and obvious, this mistake still happens frequently, especially with new users of Verilog and SystemVerilog.

5.2 I/O pads in the RTL model

Even when doing 0‐delay RTL modeling, it is not uncommon for engineers to instantiate I/O pads in the top‐level module to communicate with the rest of the design. If the datapath I/O pads have short delays and the clock‐tree I/O pads have slightly longer delays, the previous input data values will be changed by the testbench before they are sampled at the DUT inputs on the active clock edge. This is not how the real hardware will work.



This problem can be avoided by adding right‐hand‐side (RHS) delays to the stimulus nonblocking assignments, effectively delaying the change of the data inputs until after the active clock edge, which closely replicates the actual behavior of real hardware. Adding RHS delays to stimulus assignments is somewhat of a coding nuisance, and could cause simulations to run a little slower, but it does solve the problem and is described in [3].

Using SystemVerilog clocking blocks and stimulus clocking drives can also place delays on the stimulus data, thus localizing the delays into a common clocking block, but be aware that the default clocking block output delay is 0, so a non‐zero default clocking block output skew will be necessary to add the equivalent delays.
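As a hedged illustration of the RHS‐delay workaround described above, the sketch below applies stimulus on the active clock edge but delays each update by 1ns. The module name tb_rhs_delay, the counter‐style stimulus and the 1ns value are assumptions made for this illustration only.

```systemverilog
`timescale 1ns/1ns
module tb_rhs_delay;
  logic       clk;
  logic [7:0] din;

  initial begin
    clk <= '0;
    forever #5 clk = ~clk;
  end

  initial begin
    din <= '0;                  // time-0 nonblocking assignment
    repeat (3) begin
      @(posedge clk);
      // RHS (intra-assignment) delay: the right-hand side is evaluated at
      // the clock edge, but din is not updated until 1ns after the edge
      din <= #1 din + 8'd1;
    end
    #2 $finish;
  end

  // Report when the stimulus actually changes
  always @(din) $display("%0t: din=%0d", $time, din);
endmodule
```

The same 1ns offset could instead be expressed once as an output skew in a clocking block, which keeps the delay out of every individual stimulus assignment.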

5.3 Gate‐level simulations with setup and hold delays

One of the major problems related to applying stimulus on the active clock edge manifests itself in gate‐level simulations with backannotated delays, including setup and hold time checking. A gate‐level design with actual clock tree logic typically introduces more delay into the clock path than is introduced into the data path. In an actual hardware design this typically is not a problem, because there are actual clock‐to‐q delays on the driven signals to a design, but when an actual design is driven by a testbench that changes both the clock and data stimulus signals at the same time, it can cause the data signals to change on the inputs of registered devices hundreds of picoseconds to even a couple of nanoseconds before the active clock edge traverses the clock tree to the clock input of the registers. The differential in actual data path versus clock path delays is frequently enough to cause a setup or hold time violation for the gate‐level design interacting with the 0‐delay testbench. These violations typically cause simulation X's to be propagated throughout the gate‐level design, causing verification to fail.

Again, to avoid this problem when applying stimulus on the active clock edge, the verification engineer either needs to add RHS delays to the stimulus nonblocking assignments, or add clocking drive delays to a SystemVerilog testbench.

An important goal of testbench development is to use the same testbench for both 0‐delay RTL designs and gate‐level designs that include delays and timing checks. Using active clock stimulus (typically on the posedge clock) violates this goal.

Guideline: do not apply testbench stimulus on the active clock edge.

5.4 Waveform display debugging

When stimulus inputs change on the active clock edge, debugging a 0‐delay RTL design in a waveform display can be confusing to design engineers. The confusion arises because in a waveform display, the input changes coincident with the rising clock edge and the registered output, as was shown in Figure 6. Although the results are correct for a 0‐delay RTL simulation, any hardware engineer with real‐world experience will find it strange that the inputs changed coincident with the rising clock edge and it will appear that the inputs have potentially violated real setup and hold times. This is one reason that engineers frequently add #1 delays to the RHS of nonblocking assignments, so they can see a clk‐to‐q delay in the waveform display. A technique describing the use of #1 clk‐to‐q delays is described in [3].

If careful, a verification engineer can make the active‐clock stimulus technique work, but there are fewer chances for errors if stimulus is applied sometime after the active clock edge. I have found that engineers make fewer mistakes and spend less time debugging stimulus‐simulation race conditions by applying the stimulus away from the active clock edge.



6. Driving stimulus on the inactive clock edge ‐ Avoid this

To avoid all of the problems related to changing inputs on the active clock edge and to ensure that the same stimulus vectors can be used for 0‐delay RTL simulations and gate‐level simulations with backannotated timing, for more than 10 years I used the technique of applying input stimulus vectors on the inactive clock edge (typically the negedge clk).

Applying vectors on the inactive clock edge accomplishes the following goals:

1. Stimulus can be applied using either blocking or nonblocking assignments from either a module or a program. The active clock captures the inputs and since the next inputs are not placed on the DUT inputs until the next inactive clock edge, there is never a simulation race condition related to the proper or improper use of modules, programs, blocking or nonblocking assignments.

2. 0‐delay RTL models with top‐level I/O pads and accompanying delays never cause problems. DUT inputs change far away from propagated clocks.

3. Gate‐level simulations with delays (including setup and hold time checks) are never a problem, except when the input combinational data paths are longer than half of the clock period. This is addressed in Section 7.

4. Waveform displays of input stimulus are easy to understand. When the stimulus changes on the inactive clock edge, any input combinational logic will react immediately and set up on the register inputs of the design. The next active clock edge will then capture the stimulus and settle immediately (0‐delay RTL models) or shortly thereafter (gate‐level models with delays), which is highly intuitive to most hardware design engineers.
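The inactive‐edge technique listed above can be sketched with a short module‐based testbench. The module name tb_negedge_stim, the stand‐in register DUT and the data values are assumptions for illustration; the paper's own examples use different code.

```systemverilog
`timescale 1ns/1ns
module tb_negedge_stim;
  logic       clk;
  logic [7:0] din, q;

  // Stand-in DUT (assumption): register that captures din on posedge clk
  always_ff @(posedge clk) q <= din;

  initial begin
    clk <= '0;                  // time-0 nonblocking assignment
    forever #5 clk = ~clk;
  end

  initial begin
    din <= 8'h00;               // time-0 stimulus is still nonblocking
    @(negedge clk) din = 8'h01; // blocking assignments are race-free here
    @(negedge clk) din = 8'h02; // because inputs change half a cycle away
    @(negedge clk) din = 8'h03; // from the capturing posedge
    @(negedge clk);
    $display("%0t: q=%h", $time, q);
    $finish;
  end
endmodule
```

Each new value has half of a clock period to propagate before the next posedge clk captures it, which is the half‐cycle limitation discussed in the next subsection.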

6.1 Inactive‐clock stimulus problems

So what are the potential problems associated with applying stimulus on the inactive clock edge?

The most frequent problem associated with applying stimulus on the inactive clock edge is that there is now only half of a clock cycle for the primary DUT data inputs to propagate to the input registers of the design and meet the register setup time when running gate‐level simulations with backannotated delays.

There are two Verilog approaches to address this potential problem. Both approaches use time‐budgeting techniques to address the issues.

The first is to build a clock oscillator with a short high pulse width and a long low pulse width. The short high pulse width is the time budgeted from the beginning of the cycle until when the stimulus inputs will change. The low pulse width is the time budgeted for the stimulus inputs to propagate through the DUT input combinatorial logic. In the following example, it is assumed that the clock period is 10ns and that the stimulus can change 2ns after the active clock edge.

As shown in Figure 7, the stimulus vectors should be applied at time‐0 and on each negedge of the (stim)clk.

Figure 7 ‐ Asymmetrical stimulus clock ‐ change stimulus on the negedge of (stim)clk



The corresponding SystemVerilog code to generate this asymmetrical clock is shown in Example 3.

    `define CYCLE 10
    `timescale 1ns/100ps // Use appropriate timescale resolution
    ...
    parameter HIGH_PULSE_WIDTH = 2; // Choose an appropriate clk-to-q delay
    parameter LOW_PULSE_WIDTH  = `CYCLE - HIGH_PULSE_WIDTH;
    ...
    initial begin // virtual stimulus clock
      clk <= '0;
      forever begin
        #( LOW_PULSE_WIDTH); clk = '1;
        #(HIGH_PULSE_WIDTH); clk = '0;
      end
    end
    ...

Example 3 ‐ SystemVerilog code to generate asymmetrical stimulus clock

This technique works fine as long as only the active system clock edge is used by the DUT and there are no transparent latches enabled by the system clock or flip‐flops that trigger on the negative clock edge.

If both clock edges are separately used to trigger flip‐flops or if the clock level is used in some type of latching logic configuration, then I recommend using a 2X virtual clock with skewed duty cycle to accomplish the same goal.

As shown in Figure 8, the stimulus vectors should be applied at time‐0 and on each negedge of the vclk.

Figure 8 ‐ Stimulus generation for dual‐clock logic ‐ change stimulus on the negedge of vclk

The corresponding SystemVerilog code to generate this dual‐clock logic stimulus clock is shown in Example 4. The rising virtual clock edge is used to toggle the DUT clk.

    `define VCYCLE 5     // Actual clock cycle is 10ns
    `timescale 1ns/100ps // Use appropriate timescale resolution
    ...
    parameter HIGH_PULSE_WIDTH = 1;
    parameter LOW_PULSE_WIDTH  = `VCYCLE - HIGH_PULSE_WIDTH;
    ...
    initial begin // virtual stimulus clock
      vclk <= '0;
      forever begin
        #( LOW_PULSE_WIDTH); vclk = '1; // This example: 4ns
        #(HIGH_PULSE_WIDTH); vclk = '0; // This example: 1ns
      end
    end

    initial begin // actual design clock
      clk <= '0;
      forever @(posedge vclk) clk = ~clk;
    end
    ...

Example 4 ‐ SystemVerilog code to generate dual‐clock logic stimulus clock

7. Driving stimulus using time budgeting ‐ Use this

The concept of time budgeting is a technique that has long been used in synthesis and was described in a 1997 SNUG paper by Anna Ekstrandh and Wayne Bell [1]. A recommended synthesis technique is to register all module outputs and only allow combinatorial logic on the module inputs so that the synthesis compiler would use most of the clock cycle to meet combinational input constraints and very little of the clock cycle would be required to constrain the clk‐to‐q output of the module, as shown in Figure 9.

Figure 9 ‐ Synthesis constraint settings using time budgeting

In the example of Figure 9, a synthesis tool would be instructed to allocate a small percentage of the clock cycle to meet register‐to‐module‐outputs timing (20% of the cycle in this example) and allocate the majority of the clock cycle to meet inputs‐to‐registered‐logic timing (80% of the cycle in this example).

The same concept can be used to drive stimulus into a Verilog or SystemVerilog design.
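As a rough illustration of the arithmetic only, the sketch below splits a clock period into a drive budget and a settling budget. The module name, the parameter names and the 10ns period are assumptions chosen for this example; the 20% figure is the one quoted for Figure 9.

```systemverilog
`timescale 1ns/100ps
module time_budget_sketch;                          // hypothetical module name
  localparam real CYCLE      = 10.0;                // assumed clock period in ns
  localparam real DRIVE_FRAC = 0.2;                 // 20% of the cycle, as in Figure 9
  localparam real TDRIVE     = DRIVE_FRAC * CYCLE;  // clock edge to new stimulus
  localparam real TSETTLE    = CYCLE - TDRIVE;      // left for DUT input logic to settle

  initial begin
    $display("stimulus changes %0.1f ns after the active clock edge", TDRIVE);
    $display("DUT inputs then have %0.1f ns to settle before the next edge", TSETTLE);
  end
endmodule
```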

SystemVerilog provides a clocking block to help define the time budgets of stimulus drives and output samples. Note that the clocking block can be placed in a module, program or interface.

    clocking cb1 @(posedge clk);
      default input #1step output #(`CYCLE * 0.2);
      input <list of all inputs> ;
      output <list of all outputs>;
    endclocking

Example 5 ‐ Clocking block format using 20% time budget to drive stimulus

Remember that testbench outputs are the stimulus that is driven into the DUT and testbench inputs are the DUT signals that were sampled and passed to the testbench.

To drive signals using the specified clocking block delays, the clocking block name must be used with clocking drives, as shown in Figure 10.

    task drive_tr (trans1 tr);
      @vif.cb1;
      vif.cb1.din <= tr.din;
      vif.cb1.ld <= tr.ld;
      vif.cb1.inc <= tr.inc;
      vif.cb1.rst_n <= tr.rst_n;
    endtask

Figure 10 ‐ Example clockvars driven using the clocking block name

The assignments in Figure 10 contain three parts on the left‐side of the clocking drive operator. The first part is the handle to the virtual interface (vif in this example). The second part is the name of the clocking block in the interface (cb1 in this example). The third part is the name of the signals in the interface that are to be driven using the clocking block timing (din, ld, inc, and rst_n in this example). These 3‐part‐reference signals are referred to as clockvars as described in the SystemVerilog Standard and in a paper by Bromley and Johnston [7].

8. Verification Timing

When should design outputs be verified by the testbench? How can a verification strategy be formulated to work with both 0‐delay RTL models and gate‐level models with delays?

There are two primary verification timing techniques that have been used for decades by engineers responsible for building verification environments: (1) sample design outputs on the active clock edge, (2) sample design outputs just before the next active clock edge.

These techniques, along with new SystemVerilog clocking block sampling techniques, are described with their advantages and potential pitfalls in this section.

8.1 Sampling outputs on the active clock edge

In theory, sampling design outputs on the active clock edge should work with 0‐delay RTL models and some engineers do use this technique. I generally discourage this practice for reasons that will be described later, but if this technique is used, verification engineers need to understand the limitations and potential pitfalls of sampling on the active clock edge.

For Verilog verification, a key to sampling outputs on the active clock‐edge is that the input stimulus has to be driven using nonblocking assignments. The theory behind this technique is that stimulus driven with nonblocking assignments will not cause design outputs to be updated until the Verilog nonblocking assignment event region, which in theory means that the old design outputs should still be valid and available to be sampled for verification testing until the new stimulus has been clocked into the design. The design outputs that are being sampled were clocked on the previous rising clock edge and should have settled to their final value almost a full clock cycle earlier.
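The nonblocking‐assignment theory can be sketched with a toy example. Everything here is assumed for illustration: the module name, the single register standing in for a DUT, and the counting stimulus. It shows the mechanism only and is not a recommendation to sample on the active edge.

```systemverilog
`timescale 1ns/1ns
module tb_active_edge_sketch;           // hypothetical testbench name
  logic       clk = 0;
  logic [3:0] d, q;

  always #5 clk = ~clk;

  // Stand-in DUT: a single register stage
  always_ff @(posedge clk) q <= d;

  initial begin
    d <= 4'd0;                          // time-0 stimulus
    repeat (4) begin
      @(posedge clk);
      // q is read in the active region, before the DUT's nonblocking
      // update takes effect, so it still holds the previous cycle's value
      $display("%0t: sampled q=%0d, now driving d=%0d", $time, q, d + 1);
      d <= d + 1;                       // nonblocking: the DUT still sees the old d
    end
    $finish;
  end
endmodule
```

If the stimulus were driven with a blocking assignment instead, the register and the testbench would race for d on the same clock edge.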

For SystemVerilog verification, a key to using the active clock-edge stimulus technique is to drive the stimulus using one of the following: nonblocking assignments from modules, any type of assignment from a program block, or driving assignments from a clocking block originating from either a module, a class or a program.

8.2 Sampling outputs just before the next active clock edge

The best place to sample DUT outputs is at the end of the cycle just before the next active clock edge and SystemVerilog introduced a new type of sampling delay called the #1step to help accomplish this goal. The use of the #1step sampling delay is described in Section 9.2.

9. Clocking Blocks

Clocking blocks play an important role in controlling stimulus driving and output sampling timing in a testbench, especially in a UVM testbench environment.

An excellent paper by Jonathan Bromley and Kevin Johnston [7] goes into great detail on how the SystemVerilog clocking block works and some of its lesser known capabilities and quirks. The reader is encouraged to read that entire paper for a greater understanding of clocking blocks.

Bromley and Johnston also shared 11 guidelines in their paper, most of which I strongly agree with but there are a couple of exceptions or further clarifications that I will make in this section.

The Bromley and Johnston guidelines are:

1. When using a clocking block, the testbench must access only its clockvars and should never access the clocking signals directly.

** I mostly agree but an important exception is described in Section 9.5

2. Testbench code should synchronize itself to a clocking block’s clock event by waiting for the clocking block’s own named event, NOT by waiting for the raw clock event.

** I agree - follow this guideline.

3. Write to output clockvars using the clocking drive operator <=. Never try to write an output clockvar using simple assignment =.

** I agree - further clarification is described at the end of Section 9.5

4. Use input #1step unless you have a special reason to do otherwise. It guarantees that your testbench sees sampled values that are consistent with the values observed by your SystemVerilog assertions, properties and sequences.

** I agree - an additional important reason is described in Section 9.2

5. Use non-zero output skew values in your clocking blocks to make waveform displays clearer, and to avoid problems caused by clock network delays in gate level simulation.

** I agree - an addition to this guideline is described in Section 9.3

6. Never use input #0 in your clocking blocks.


7. Avoid the use of edge specifiers to determine clocking block skew.

** I agree - follow this guideline

8. When a signal is driven by more than one clocking block output to model DDR or similar multi-clock behavior, that signal should be a variable.


9. Declare your clocking block in an interface. Expose the clocking block, and any asynchronous signals that are directly related to it, through a modport of the interface. In your verification code, declare a virtual interface data type that can reference that modport.

** I mostly agree - I will comment on the modport portion of this guideline in Section 9.8

10. Use your clocking block to establish signal directions with respect to the testbench. Do not add the raw signals to a testbench-facing interface's modport.


11. Clocking blocks should usually be accessed through a virtual interface variable pointing to a modport of the clocking block’s enclosing interface. In that situation, each clockvar must be accessed using the three-part dotted name virtual_interface.clocking_block.clockvar

** I mostly agree - I will comment on the modport portion of this guideline in Section 9.8

In addition to guideline #3, I also discuss at the end of Section 9.7 the proper coding style for sampling DUT outputs using a clocking block.

9.1 Clocking Block Default Timing

Clocking block default timing values are #1step for sampling DUT signals and #0 for driving DUT stimulus. The #1step sample time should almost always be used. The #0 drive time should NEVER be used.

9.2 #1step Sampling

The best time to sample DUT outputs is at the end of the cycle, just before the next active clock edge changes the outputs of registered logic. The #1step input sample time specified in a clocking block elegantly accomplishes this goal.

In Verilog testbenches that did not have the #1step sample time, I used to wait for almost a full cycle and then sample the signal one time unit before the next active clock edge, using #(`CYCLE-1). This worked fine unless the CYCLE delay was relatively short, such as a 2ns CYCLE delay, in which case CYCLE-1 would be half of the CYCLE. For faster clock cycles, I would have to sample using either #(`CYCLE-0.1) or perhaps even #(`CYCLE-0.01). The #1step gives the smallest resolution delay before the next active clock edge so engineers don't have to worry about relative clock speeds. With this added observation, this agrees with Bromley & Johnston Guideline #4.
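The two sampling styles can be compared side by side in a small sketch. The module name, the free‐running counter standing in for a DUT output and the clocking block name cb are assumptions for illustration only.

```systemverilog
`timescale 1ns/100ps
`define CYCLE 10
module tb_sample_sketch;                 // hypothetical testbench name
  logic       clk = 0;
  logic [3:0] q   = '0;                  // stand-in for a registered DUT output

  always #(`CYCLE/2) clk = ~clk;
  always @(posedge clk) q <= q + 1;

  // Legacy Verilog style: wait almost a full cycle after the active edge,
  // then sample one time unit before the next one
  initial forever begin
    @(posedge clk);
    #(`CYCLE - 1);
    $display("%0t: legacy sample  q=%0d", $time, q);
  end

  // SystemVerilog style: the clocking block samples q #1step before posedge clk
  clocking cb @(posedge clk);
    default input #1step;
    input q;
  endclocking

  initial forever begin
    @(cb);
    $display("%0t: #1step sample  q=%0d", $time, cb.q);
  end

  initial #(4*`CYCLE) $finish;
endmodule
```

The legacy delay has to be retuned whenever CYCLE or the timescale changes, whereas the #1step form needs no adjustment.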

It has been argued by some engineers that the best place to sample the DUT outputs is one setup‐time delay before the active clock edge, assuming that the propagation of signals needs to settle and be ready before the setup time requirement of the clocked logic. Although it is true that actual signals must be stable for the duration of the setup time before the active clock edge, functional simulation is not the place to prove that this requirement is being met. Functional and gate‐sims with backannotated timing delays should be used to prove that the design is functionally correct and Static Timing Analysis (STA) tools should be used to prove that all timing, including setup and hold times, is being met. Verification engineers should not be required to periodically ensure that maximum setup times are specified in the testbench.

9.3 #0 Drive Times

Driving stimulus to the DUT at #0 after the active clock edge is possibly the worst place to drive stimulus. Unfortunately this is the default stimulus drive time of a clocking block. In real hardware, nobody tries to change the inputs of clocked logic exactly on the active clock edge and yet that is the precise behavior of the default #0 drive time.

With few exceptions, in real hardware, changing the inputs exactly on the active clock edge will violate setup times and/or hold times and could frequently cause metastable values to be generated at the output of the clocked logic.

Engineers should set the drive time of a clocking block to be 20% of the clock cycle, allowing 80% of the clock cycle for input combinational delays to the DUT. If more input combinational settling is required, the clocking block could be set to 10% of the clock cycle to allow 90% of the cycle for input settling. Figure 9 shows that DUT combinational inputs could require most of the clock cycle to process testbench stimulus.

This agrees with Bromley & Johnston Guideline #5, noting that the delay should generally be 10%‐20% of the clock cycle to allow DUT input combinational settling time.

9.4 Example UVM clocking block

To better understand some of the concepts of using clocking blocks, consider a simple program counter with asynchronous low‐true reset, along with synchronous load and increment control signals as shown in Example 6.

    module pcnt (
      output logic [15:0] dout,
      input [15:0] din,
      input ld, inc, clk, rst_n);

      always_ff @(posedge clk, negedge rst_n)
        if (!rst_n) dout <= '0;
        else if (ld) dout <= din;
        else if (inc) dout <= dout + 1;
    endmodule

Example 6 ‐ Program counter DUT code

The CYCLE definition and clocking block drive time (Tdrive) definitions are kept in the file CYCLE.sv.

    `ifndef CYCLE
      `define CYCLE 10
    `endif
    `ifndef Tdrive
      `define Tdrive #(0.2*`CYCLE)
    `endif
    `timescale 1ns/1ns

Example 7 ‐ CYCLE.sv file

The interface with clocking block used in the corresponding UVM testbench is shown in Example 8.

    `include "CYCLE.sv"
    interface dut_if (input clk);
      logic [15:0] dout;
      logic [15:0] din;
      logic ld, inc, rst_n;

      clocking cb1 @(posedge clk);
        default input #1step output `Tdrive;
        input dout;
        output din;
        output ld, inc, rst_n;
      endclocking
    endinterface

Example 8 ‐ DUT interface with clocking block

The next section shows how the UVM testbench driver is set up to drive stimulus to this design.

9.5 Stimulus using clocking drives

Bromley & Johnston Guideline #1 states:

When using a clocking block, the testbench must access only its clockvars and should never access the clocking signals directly.

Although this is generally a good guideline, there is one very important exception to this guideline that I do in all of my testbenches, and that exception occurs at time‐0. In my UVM driver components, I always include an initialize() task that makes direct, non‐clocking block signal assignments at time‐0, and that initialize() task is called at the beginning of the run_phase() and then a forever loop executes a drive_tr() method or methods, all of which exclusively make assignments to the clockvars (signals with clocking block timing) after time‐0. This allows my testbenches to initialize all signals with fixed or random values at time zero and then take advantage of clocking block controlled assignments to the same signals throughout the rest of the simulation.

The tb_driver in Example 9 includes the initialize() task and shows that initialize() is called just before entering the forever loop. For informational purposes, the virtual dut_if is set by the tb_agent (not shown) for this example.

    class tb_driver extends uvm_driver #(trans1);
      `uvm_component_utils(tb_driver)

      virtual dut_if vif;

      function new (string name, uvm_component parent);
        super.new(name, parent);
      endfunction

      task run_phase(uvm_phase phase);
        trans1 tr;
        initialize();
        forever begin
          seq_item_port.get_next_item(tr);
          drive_tr(tr);
          seq_item_port.item_done();
        end
      endtask

      task initialize(); // @0 - Does not use clocking block
        vif.rst_n <= '0;
        vif.ld <= '1;
        vif.inc <= '1;
        vif.din <= '1;
      endtask

      task drive_tr (trans1 tr);
        @vif.cb1;
        vif.cb1.din <= tr.din;
        vif.cb1.ld <= tr.ld;
        vif.cb1.inc <= tr.inc;
        vif.cb1.rst_n <= tr.rst_n;
      endtask
    endclass

Example 9 ‐ tb_driver with initialize() task (no clocking block timing) and drive_tr() task (uses clocking block timing)

In his testbench book [6], Janick Bergeron similarly states that stimulus should not be assigned at time‐0, but again, I have found it useful to make time‐0 stimulus assignments using nonblocking assignments. In a personal conversation with Janick regarding this exception, Janick somewhat‐accurately stated that using nonblocking assignments at time‐0 did not violate his guideline as a nonblocking assignment executes at a later stage of the time‐0 event regions.

Note that the drive_tr() task of Example 9 uses notations similar to: vif.cb1.din <= tr.din

Bromley & Johnston Guideline #3 states: Write to output clockvars using the clocking drive operator <=. Never try to write an output clockvar using simple assignment =.

The added clarification is that the clocking drive operator (<=) is required whenever driving a clockvar (a signal that includes the clocking block name) and using the simple blocking assignment operator (=) is illegal. The SystemVerilog compiler will enforce Bromley & Johnston Guideline #3.

9.6 Why drive signals at time‐0?

One frequently asked question is, why even drive DUT input signals at time‐0? Why not allow DUT inputs to remain uninitialized at X at time‐0 and then use clocking drives after the first testbench active clock edge?

It should be noted that many successful testbenches never drive DUT input signals at time‐0 and these testbenches work just fine.

As noted in Section 2, time‐0 is a tricky place in Verilog and SystemVerilog simulations. Input signals that are allowed to be X at time‐0 have the potential to cause pre‐ and post‐synthesis simulation mismatches [4]. Any uninitialized input signal tested by a procedural if‐statement will fail the if‐test and always take the else branch. Any uninitialized input signal tested by a casex statement will always execute the first casex‐item‐statement. These are well known X‐optimism examples, and have caused companies to have costly re‐spins in their ASIC designs.
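Both X‐optimism cases can be reproduced in a few lines. The module and signal names below are hypothetical, and the two signals are deliberately left uninitialized so that they hold X when the initial block runs.

```systemverilog
module x_optimism_sketch;                // hypothetical module name
  logic       sel;                       // never assigned: stays X
  logic [1:0] op;                        // never assigned: stays 2'bxx

  initial begin
    // An X condition is treated as false, so the else branch runs
    if (sel) $display("if branch taken");
    else     $display("else branch taken with sel=%b", sel);

    // casex treats X bits as don't-cares, so the first item matches
    casex (op)
      2'b00:   $display("first casex item taken with op=%b", op);
      2'b11:   $display("second casex item taken");
      default: $display("default item taken");
    endcase
  end
endmodule
```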

At time‐0, my initialize() task is frequently written to reset the device, while simultaneously setting inputs to either all 1's or random values, and to set load‐control input signals to attempt to load values into my DUT. I do this to ensure that reset properly clears the required register values and has priority over other loading control signals at time‐0.
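A minimal sketch of an initialize() task that selects between all 1's and random data is shown below. It is written without UVM so that it stands alone; the interface init_if_sketch, the class init_sketch and the use_random flag are hypothetical stand‐ins and are not part of the testbench described in this paper.

```systemverilog
interface init_if_sketch;                 // hypothetical stand-in for dut_if
  logic [15:0] din;
  logic        ld, inc, rst_n;
endinterface

class init_sketch;                        // hypothetical, non-UVM stand-in for the driver
  virtual init_if_sketch vif;
  bit use_random;                         // 0: all 1's, 1: random data

  task initialize();                      // time-0, no clocking block timing
    vif.rst_n <= '0;                      // assert reset
    vif.ld    <= '1;                      // competing load request
    vif.inc   <= '1;                      // competing increment request
    if (use_random) vif.din <= $urandom_range(16'hFFFF);
    else            vif.din <= '1;
  endtask
endclass

module tb_init_sketch;                    // hypothetical top-level
  init_if_sketch dif ();
  init_sketch    drv;

  initial begin
    drv            = new();
    drv.vif        = dif;
    drv.use_random = 1;
    drv.initialize();
    #1 $display("rst_n=%b ld=%b inc=%b din=%h",
                dif.rst_n, dif.ld, dif.inc, dif.din);
  end
endmodule
```

Because reset, load and increment are all requested together, a DUT that comes out of time‐0 with anything other than its reset value points to a reset‐priority problem.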

Another advantage to doing the time‐0 assignments becomes visible in the waveform display. I generally do not like to see "red" signals at time‐0, except for uninitialized and non‐reset outputs. When I see "red" at time‐0, I quickly analyze the red signals to make sure that their values are indeed unknown at time‐0. I do not want to waste time analyzing uninitialized DUT inputs.

9.7 Asynchronous control inputs

Unclocked reference models and prediction functions typically take inputs sampled on clock edges to predict what the actual DUT output values should be. Typically inputs sampled on the active clock edge, such as the posedge clk, are the only values required to predict the correct outputs. The exception to this rule is asynchronous control signals.

Consider the example of the asynchronous reset signal. Figure 11 shows four asynchronous reset scenarios that might have to be considered when predicting the DUT output.

Figure 11 ‐ Four asynchronous reset signal scenarios

In the first asynchronous reset scenario, the predicted DUT outputs would need to be reset in cycles 1‐3.

In the second asynchronous reset scenario, the predicted DUT outputs would need to be reset in cycles 1‐2.

In the third asynchronous reset scenario, the predicted DUT outputs would need to be reset in cycles 2‐3.

And in the fourth asynchronous reset scenario, the predicted DUT outputs would only need to be reset in cycle 2.

In the first three asynchronous reset scenarios, it is clear that the reset signal needs to be sampled both at the beginning and at the end of the cycle since a reset at any time during the cycle should cause the DUT outputs to be reset.

If the reset signal is active when the inputs are sampled on the active clock edge, the predicted outputs can be reliably calculated. If reset is not active on the active clock edge, the predicted outputs cannot be guaranteed to not be reset later in the same cycle, which is why a reset signal that was not active when the inputs were sampled needs to be re‐sampled on the next active clock edge to see if it has been asserted during the cycle.

Example 10 shows part of the sample_dut task that is called by the tb_monitor (the full tb_monitor code is shown in Example 11). This UVM testbench is set up so that the sample_dut task is always synchronized to the posedge clk. When called, it first samples all of the inputs, including the asynchronous rst_n input. It then re‐synchronizes to the next clocking block sample signal from the virtual interface, @vif.cb1, which in this example re‐synchronizes to the posedge clk, samples the outputs #1step before that clocking block sample signal, and re‐samples the asynchronous rst_n to see if it is low‐true asserted on this posedge clk edge. If asserted, the rst_n signal that will eventually be passed to the testbench predictor is set to 0; otherwise it keeps its previous value, which might have been low‐true asserted at the beginning of the cycle.

    task sample_dut (output trans1 tr);
      trans1 t = trans1::type_id::create("t");
      //---------------------------------------------
      // Sample DUT synchronous inputs on posedge clk.
      ...
      @vif.cb1;
      if (!vif.rst_n) t.rst_n = '0; // async reset
      t.dout = vif.cb1.dout;
      //---------------------------------------------
      tr = t;
    endtask

Example 10 ‐ sample_dut task checks async reset at beginning and end of the cycle

Note that the sample_dut task uses:

    @vif.cb1 … t.dout = vif.cb1.dout

instead of:

    @(posedge clk) … t.dout = vif.cb1.dout

which agrees with Bromley & Johnston Guideline #2. The @vif.cb1 event samples the outputs #1step before the posedge clk, while the @(posedge clk) gives race‐condition results and appears to cause at least two simulators to sample the output one clock cycle earlier (the vif.cb1.dout sampling does not appear to recognize the current clock edge and appears to sample one clock edge earlier ‐ this is just an observation and may not be consistent between current simulators or consistent with the future behavior of simulators).

The tb_monitor code of Example 11 shows the full UVM monitor example including the full sample_dut() code.

    class tb_monitor extends uvm_monitor;
      `uvm_component_utils(tb_monitor)

      virtual dut_if vif;
      uvm_analysis_port #(trans1) aport;

      function new (string name, uvm_component parent);
        super.new(name, parent);
      endfunction

      function void build_phase(uvm_phase phase);
        super.build_phase(phase);
        aport = new("aport", this); // build the analysis port
      endfunction

      task run_phase(uvm_phase phase);
        trans1 tr;
        tr = trans1::type_id::create("tr");
        //---------------------------------------
        forever begin
          sample_dut(tr);
          aport.write(tr);
        end
      endtask

      //-----------------------------------------------
      // sample_dut assumed to be synced to posedge clk
      // except for first sample at time-0
      //-----------------------------------------------
      task sample_dut (output trans1 tr);
        trans1 t = trans1::type_id::create("t");
        //---------------------------------------------
        // Sample DUT synchronous inputs on posedge clk.
        // DUT inputs should have been valid for most
        // of the previous clock cycle
        //---------------------------------------------
        t.din = vif.din;
        t.ld = vif.ld;
        t.inc = vif.inc;
        t.rst_n = vif.rst_n;
        //---------------------------------------------
        // Wait for posedge clk and sample outputs #1step before.
        // Also re-sample and check async control input signals
        //---------------------------------------------
        @vif.cb1;
        if (!vif.rst_n) t.rst_n = '0; // async reset
        t.dout = vif.cb1.dout;
        //---------------------------------------------
        tr = t;
      endtask
    endclass

Example 11 ‐ tb_monitor checks async reset at beginning and end of the cycle

This technique is generally good enough for testing purposes because the verification engineer typically does not generate sub‐cycle asynchronous control pulses when generating stimulus.

If there is the possibility of generating sub‐cycle asynchronous control pulses either from the stimulus source or from another sub‐block connected to this DUT block, then a sticky‐bit technique will be required to capture the asynchronous control signal activity to pass to the testbench predictor. An example of this scenario is rst_n scenario #4 as shown in Figure 11, and again, isolated as shown in Figure 12.

Figure 12 ‐ Asynchronous mid‐cycle, sub‐cycle reset pulse

As shown in Figure 12, the reset pulse cannot be detected on either the rising edge of cycle #2 or on the rising edge of cycle #3. It will be necessary to capture and hold the reset condition of cycle #2 and sample that condition #1step before cycle #3.

The sticky‐bit code can be easily added to the dut_if.sv file as shown in Example 12. A simple always block captures any active (negedge) transition on the rst_n signal and assigns 0 to the sticky reset_n signal that is used in the tb_monitor component shown in Example 13. At the next posedge clk, either the reset_n signal is still assigned to 0 (if rst_n is still active low) or is set to 1 to clear the active reset_n condition (if rst_n was deasserted before the end of the cycle). The only other DUT interface requirements to make this technique work are to declare the sticky logic reset_n signal in the declarations portion of the interface, and to add the reset_n signal as an input in the clocking block, to allow the tb_monitor to sample the signal #1step before the next posedge clk, which is used to deactivate the sticky‐bit by setting it to a 1.

    `include "CYCLE.sv"
    interface dut_if (input clk);
      logic [15:0] dout;
      logic [15:0] din;
      logic ld, inc, rst_n;
      logic reset_n;

      //----------------------------------------------------
      // Sticky reset_n signal to capture short rst_n pulses
      //----------------------------------------------------
      always_ff @(posedge clk, negedge rst_n)
        if (!rst_n) reset_n <= '0;
        else reset_n <= '1;

      clocking cb1 @(posedge clk);
        default input #1step output `Tdrive;
        input dout;
        output din;
        output ld, inc, rst_n;
        input reset_n;
      endclocking
    endinterface

Example 12 ‐ DUT interface with sticky‐bit code to save short‐pulse reset condition
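The sticky‐bit behavior can be exercised in isolation with a small stand‐alone sketch. The module name and the pulse timing (asserted at 12ns, released at 14ns, between clock edges at 5ns and 15ns) are assumptions chosen to mimic the sub‐cycle pulse of Figure 12; the always_ff block repeats the logic of Example 12.

```systemverilog
`timescale 1ns/1ns
module tb_sticky_reset_sketch;            // hypothetical testbench name
  logic clk = 0;
  logic rst_n, reset_n;

  always #5 clk = ~clk;                   // rising edges at 5, 15, 25, ...

  // Sticky copy of rst_n, same logic as in the interface
  always_ff @(posedge clk, negedge rst_n)
    if (!rst_n) reset_n <= '0;
    else        reset_n <= '1;

  clocking cb1 @(posedge clk);
    default input #1step;
    input reset_n;
  endclocking

  initial begin
    rst_n <= '1;
    #12 rst_n <= '0;                      // asserted mid-cycle
    #2  rst_n <= '1;                      // released before the edge at 15
  end

  // At the 15ns edge the raw rst_n is already back to 1,
  // but the #1step sample of reset_n still reads 0
  initial forever begin
    @(cb1);
    $display("%0t: raw rst_n=%b  sampled sticky reset_n=%b",
             $time, rst_n, cb1.reset_n);
  end

  initial #40 $finish;
endmodule
```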

The tb_monitor also needs to be slightly modified to sample the vif.cb1.reset_n signal and assign those values as appropriate to the t.rst_n signal in the transaction before writing the transaction to the analysis port. The modified tb_monitor code is shown in Example 13.

    class tb_monitor extends uvm_monitor;
      `uvm_component_utils(tb_monitor)
      ...
      task run_phase(uvm_phase phase);
        trans1 tr;
        tr = trans1::type_id::create("tr");
        //---------------------------------------
        forever begin
          sample_dut(tr);
          aport.write(tr);
        end
      endtask

      task sample_dut (output trans1 tr);
        trans1 t = trans1::type_id::create("t");
        t.din = vif.din;
        t.ld = vif.ld;
        t.inc = vif.inc;
        t.rst_n = vif.rst_n;
        //---------------------------------------------
        // ...
        // Sample the sticky-bit reset_n to update rst_n if needed
        //---------------------------------------------
        @vif.cb1;
        if (!vif.cb1.reset_n) t.rst_n = '0; // async reset
        t.dout = vif.cb1.dout;
        //---------------------------------------------
        tr = t;
      endtask
    endclass

Example 13 ‐ tb_monitor modified to test the sticky‐bit reset_n version of the rst_n asynchronous reset

Note that the sample_dut() task of Example 13 uses notations similar to: t.dout = vif.cb1.dout;

The proper coding for sampling DUT outputs using a clocking block requires the use of the clocking sample operator (=) and using a nonblocking assignment operator (<=) for clocking sample operations is illegal. The SystemVerilog compiler will catch this mistake.

9.8 Interface modports and testbenches

Bromley & Johnston Guidelines #9 and #11 recommend using modport versions of the clocking signals. These guidelines are fine and even add a small amount of additional checking to the signals being driven and sampled, but I generally find the use of modports in testbench interfaces to be a level of complexity that is generally not needed. Verification engineers are encouraged to use or not use modports at their discretion.
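For readers who do want to try the modport style, one possible shape is sketched below. The interface name dut_if_mp, the modport name tb, the non‐UVM drv_sketch class and the fixed #2 output skew are all assumptions for illustration; this is not the interface used elsewhere in this paper, and simulator support for assigning a modport view to a virtual interface should be confirmed for the tool in use.

```systemverilog
`timescale 1ns/1ns
interface dut_if_mp (input logic clk);     // hypothetical variant of dut_if
  logic [15:0] dout, din;
  logic        ld, inc, rst_n;

  clocking cb1 @(posedge clk);
    default input #1step output #2;        // 20% of an assumed 10ns cycle
    input  dout;
    output din, ld, inc, rst_n;
  endclocking

  // Testbench-facing modport: only the clocking block is exposed
  modport tb (clocking cb1);
endinterface

class drv_sketch;                          // hypothetical, non-UVM driver stand-in
  virtual dut_if_mp.tb vif;                // virtual interface restricted to the modport

  task drive_din (logic [15:0] value);
    @(vif.cb1);
    vif.cb1.din <= value;                  // three-part clockvar name still applies
  endtask
endclass

module tb_modport_sketch;                  // hypothetical top-level
  logic clk = 0;
  always #5 clk = ~clk;

  dut_if_mp  dif (clk);
  drv_sketch drv;

  initial begin
    drv     = new();
    drv.vif = dif.tb;                      // bind the modport view
    drv.drive_din(16'h1234);
    #5 $display("%0t: din=%h", $time, dif.din);
    $finish;
  end
endmodule
```

Note that a modport exposing only cb1 no longer allows direct writes such as vif.din <= '1, so a time‐0 initialize() task of the kind shown in Example 9 would need the raw signals added to the modport.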

10. Death to the SystemVerilog program!

SystemVerilog‐2005 added a new testbench construct called a program. The primary reason for adding the program construct to SystemVerilog was to help avoid stimulus‐DUT race conditions when stimulus was driven on the active clock edge, which should never be done!

The idea was that the RTL code would be captured within modules while the testbench stimulus generation would be captured within programs.

The perceived benefit of this approach was that if stimulus was driven on the active clock edge, an active clock edge would first execute the RTL code in the active region of the event schedule (shown in the upper half of Figure 13) and allow the RTL to settle to a semi‐final value before new stimulus was sent to the DUT. This means that the RTL would sample all inputs that had set up on the registers before the testbench changed those inputs for the next cycle.

After the RTL had settled to a semi‐final state in the active region, testbench program code would then drive new stimulus to the DUT during the reactive region of the same time slot (shown in the lower half of Figure 13). After the program code had calculated the appropriate stimulus to send to the DUT, those stimulus values would then be sent back into the DUT and any DUT input combinational logic would recalculate the inputs to the DUT registers and these values would remain on the DUT inputs until the next active clock edge.

Figure 13 ‐ SystemVerilog module and program event scheduling

If a verification engineer drives stimulus on the active clock edge, the program testbench scheduling could prove useful to avoid race conditions where the RTL might partially calculate a final value, the stimulus arrives before the RTL is done calculating values, and the stimulus changes some of the inputs that have not yet been registered. So the program essentially made it possible to drive stimulus on the active clock edge and avoid RTL‐stimulus race conditions, but as has been previously discussed, stimulus should NOT be driven on the active clock edge, and hence there is no need for the use of the program block.

As long as the verification engineer does not drive stimulus on the active clock edge, there is no RTL‐stimulus race condition and a program is not needed.

The SystemVerilog Standard also introduced a number of confusing coding restrictions associated with program usage, such as:

A program can only use initial procedures but not always procedures.

A program can hierarchically reference module signals but a module cannot hierarchically reference program signals.

A program can call module tasks and functions while a module cannot call program tasks and functions.

Simulators have not always (and still may not) consistently checked and executed program code.

The SystemVerilog program statement should just die and never be used in your code!

10.1 Cliff's confession

Despite personal reservations, I voted to include programs in the SV2005 standard. In a separate conference call with committee members who were advocating the inclusion of programs in SystemVerilog, I described my technique of avoiding RTL‐stimulus race conditions by explaining how I never drive stimulus on the active clock edge. Those advocating inclusion of the program acknowledged that I had a good technique but argued that if I allowed programs to be added to SystemVerilog, it would be easier for engineers to drive race‐free stimulus without being required to understand my technique. Based on that argument, I voted in favor of adding programs, a vote that I now regret.

If I could remove programs from the SystemVerilog language I would, but due to backward compatible coding reasons, I cannot remove them.

All I can do is give my strongest recommendation to verification engineers:

Guideline: Never use or quit using SystemVerilog programs!

11. Conclusions

Time‐0 is a tricky place in Verilog and SystemVerilog simulations. To avoid time‐0 race conditions, create an initialize() task and assign all inputs at time‐0 using nonblocking assignments (see Section 2).

There are three timing values that need to be properly considered to generate robust, race‐free testbenches: (1) When to drive stimulus, (2) When to sample DUT inputs, (3) When to sample DUT outputs.

This paper has shown a robust technique of driving stimulus using time budgeting to ensure that stimulus is driven safely after the active clock edge. A time budget of waiting 20% of the clock cycle after the active clock edge was recommended, and that value was used in a clocking block. Other values could be used based on the combinational logic input delay of the DUT.

DUT inputs should be sampled on the active clock edge because that is where the DUT will sample those same inputs. The exception is asynchronous control signals that must frequently be re‐sampled at the end of the cycle, and if asynchronous control signals could be sub‐cycle pulses, a sticky‐bit technique may be employed to capture those glitching asynchronous control signals.

Outputs should be sampled at the last possible moment before the next active clock edge. This is accomplished by using the #1step sample time within a clocking block. This was shown in Section 9.2.

Finally, the well‐intentioned SystemVerilog program enhancement only offers value if you are trying to apply stimulus on the active clock edge, which you should never do! The program has a number of annoying restrictions when interacting with a module and just adds confusion to how SystemVerilog events are scheduled. The SystemVerilog program should never be used! The SystemVerilog program should just die!

12. References

[1] Anna Ekstrandh, Wayne Bell, "Evolvable Makefiles and Scripts for Synthesis," SNUG (Synopsys Users Group) 1997 Proceedings, February 1997.

[2] Clifford E. Cummings, "OVM/UVM Scoreboards ‐ Fundamental Architectures," SNUG (Synopsys Users Group) 2013 (Silicon Valley, CA). Also available at www.sunburst‐design.com/papers

[3] Clifford E. Cummings, "Verilog Nonblocking Assignments With Delays, Myths & Mysteries," SNUG (Synopsys Users Group) 2002 (Boston, MA). Also available at www.sunburst‐design.com/papers

[4] Don Mills and Clifford E. Cummings, "RTL Coding Styles That Yield Simulation and Synthesis Mismatches," SNUG (Synopsys Users Group) 1999 (San Jose, CA). Also available at www.lcdm‐eng.com/papers.htm and www.sunburst‐design.com/papers

[5] "IEEE Standard For SystemVerilog ‐ Unified Hardware Design, Specification and Verification Language," IEEE Computer Society and the IEEE Standards Association Corporate Advisory Group, IEEE, New York, NY, IEEE Std 1800™‐2012

[6] Janick Bergeron, Writing Testbenches: Functional Verification of HDL Models, 2nd Edition, Springer Science+Business Media, Inc., 2003. ISBN: 1‐4020‐7401‐8

[7] Jonathan Bromley and Kevin Johnston, "Taming Testbench Timing: Time's Up for Clocking Block Confusions," SNUG (Synopsys Users Group) 2012 (Austin, TX). Also available at www.verilab.com/resources/papers‐and‐presentations/#snug2012clock

13. Author & Contact Information

Cliff Cummings, President of Sunburst Design, Inc., is an independent EDA consultant and trainer with 36 years of ASIC, FPGA and system design experience and 26 years of SystemVerilog, synthesis and methodology training experience.

Mr. Cummings has presented more than 100 SystemVerilog seminars and training classes in the past 15 years and was the featured speaker at the world‐wide SystemVerilog NOW! seminars.

Mr. Cummings participated on every IEEE & Accellera SystemVerilog, SystemVerilog Synthesis, SystemVerilog committee from 1994‐2012, and has presented more than 40 papers on SystemVerilog & SystemVerilog related design, synthesis and verification techniques.

Mr. Cummings holds a BSEE from Brigham Young University and an MSEE from Oregon State University.

Sunburst Design, Inc. offers World Class Verilog & SystemVerilog training courses. For more information, visit the www.sunburst‐design.com website.

Email address: cliffc@sunburst‐design.com

Last Updated: June 2018
