Harnessing Big Data to Simplify Debugging
May 9, 2016
Asi Lifshitz, CTO
www.thevtool.com
Agenda
• Introduction
• What is Big Data, Anyway?
• Simulation Log Files
• Graphical Representation of a Log File
• Summary
RTL Debugging
• Verification is one of the major bottlenecks towards tape-out
• Debugging failing tests is complex and time-consuming
Source: Wilson Research & Mentor Graphics, 2014
Debugging Today
• Iterating between the waveforms and the simulation log file
• Simulation log files can reach several GB
Debugging Tomorrow
• Big Data tools will quickly and efficiently extract data from huge log files
• Extracting and manipulating data gets simpler
• Data can be presented in a graphical way
• Shortening the debug time will shorten the project schedule and increase the engineer’s productivity
Big Data
• Big data is a term for data sets that are so large or complex that traditional data processing applications are inadequate
• The term often refers simply to the usage of advanced methods for extracting value from data, and seldom to a particular size of data set
Big Data – Cont.
• For some organizations, facing a few gigabytes of data for the first time may trigger a need to reconsider data management options
• For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration
Database
• A database is an organized collection of data
• The data is typically organized in a way that supports processes that require information
• A database management system (DBMS) is a computer software application that interacts with the user, other applications, and the database itself to capture and analyze data
Database for Log Files
• A database can be used to query a specific record, i.e., a specific message
• However, when a query requires some computation, a database search engine is needed
– A concrete example that goes beyond a single-record lookup is when the DV engineer wants to see all messages from time point tp1 to time point tp2
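To make the tp1-to-tp2 case concrete, here is a small Python sketch. It is an illustration added for clarity, not the author's implementation. It assumes the messages have already been parsed into `(time, text)` pairs sorted by time; the function `messages_between` and the sample data are hypothetical.

```python
from bisect import bisect_left, bisect_right

def messages_between(records, tp1, tp2):
    """Return all records with tp1 <= time <= tp2.

    `records` must be a list of (time, text) tuples sorted by time.
    """
    # A real index would keep this list of keys around instead of
    # rebuilding it on every call.
    times = [t for t, _ in records]
    lo = bisect_left(times, tp1)    # first record with time >= tp1
    hi = bisect_right(times, tp2)   # one past the last record with time <= tp2
    return records[lo:hi]

# Made-up sample log, for illustration only.
log = [
    (100, "reset released"),
    (250, "write 0x1 to enable"),
    (400, "data mismatch"),
    (900, "test done"),
]

window = messages_between(log, 200, 400)
```

Binary search finds both ends of the window without scanning the whole log, which is the kind of computation a plain keyed lookup does not provide.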
Database Search Engine
• A search engine allows the user to search for information using simple keywords
Apache Lucene
• A free and open-source database search engine, originally written in Java
• Has been ported to Delphi, Perl, C#, C++, Python, Ruby, and PHP
• Suitable for any application that requires full text indexing and searching capability
• The core of its logical architecture is the idea of a document containing fields of text
Lucene for Verification
• A simulation log file is a structured textual file, and as such it can be indexed
• Once indexed, the Lucene API can be used to search for all the "interesting" events that are needed for debugging a failing test
May 9, 201615
UVM
• The Universal Verification Methodology (UVM) is a standardized methodology for verifying integrated circuit designs
• More than 70% of the industry has adopted UVM, and adoption is expected to keep growing
Source: Wilson Research & Mentor Graphics, 2014
May 9, 201616
UVM Messages
• A UVM-based simulation log contains UVM messages that usually have the following format:
– Verbosity
– Filename(line)
– Timepoint
– Emitter
– Message
May 9, 201617
UVM Message Example
• UVM_ERROR /project/sflash/verification/SFLASH_controller_ENV/src/sflash_controller_env_sb.sv(1863) @ 4498000: uvm_test_top.env.sb [WRITE_MODE_SPI_DATA_ERR] Sent data packet contains 0x532e4000, but expected 0x532e4cb3
• UVM_ERROR is the verbosity (or severity)
• /project/sflash/verification/SFLASH_controller_ENV/src/sflash_controller_env_sb.sv(1863) is the filename(line)
• @ 4498000 is the time point
• uvm_test_top.env.sb is the emitter of the message
• [WRITE_MODE_SPI_DATA_ERR] Sent data packet contains 0x532e4000, but expected 0x532e4cb3 is the message
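The five elements can be pulled out of a line like the one above with a regular expression. The following is a minimal Python sketch, not part of the original talk. The pattern and the name `parse_uvm_line` are assumptions based only on the single example message shown here; real logs may need a looser pattern (multi-line messages, time units, other severities). The sketch also splits filename(line) into two separate values for convenience.

```python
import re

# Assumed layout, inferred from the one example message:
#   SEVERITY file(line) @ time: emitter message
UVM_LINE = re.compile(
    r"^(?P<verbosity>UVM_(?:INFO|WARNING|ERROR|FATAL))\s+"
    r"(?P<filename>\S+)\((?P<line>\d+)\)\s+"
    r"@\s*(?P<time>\d+):\s+"
    r"(?P<emitter>\S+)\s+"
    r"(?P<message>.*)$"
)

def parse_uvm_line(text):
    """Return a dict of the message fields, or None if the line does not match."""
    m = UVM_LINE.match(text.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["line"] = int(rec["line"])   # source line number as an integer
    rec["time"] = int(rec["time"])   # simulation time as an integer
    return rec

example = (
    "UVM_ERROR /project/sflash/verification/SFLASH_controller_ENV/src/"
    "sflash_controller_env_sb.sv(1863) @ 4498000: uvm_test_top.env.sb "
    "[WRITE_MODE_SPI_DATA_ERR] Sent data packet contains 0x532e4000, "
    "but expected 0x532e4cb3"
)
rec = parse_uvm_line(example)
```

Lines that are not UVM messages (simulator banners, `$display` output) return `None` and can be skipped or stored separately.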
May 9, 201618
Using Lucene for UVM Messages
• Parse the log file so that every message is broken into the aforementioned five elements and stored as a record in a Lucene database
• The user can then use Lucene's efficient API to extract information
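Lucene's own API is Java and is not reproduced here. As a rough illustration of the idea only, the Python sketch below stores each parsed message as a record with named fields and builds a toy inverted index from (field, token) pairs to record numbers. The class `TinyIndex`, its methods, and the field names are hypothetical. The first sample record uses values from the example message; the second is made up.

```python
import re
from collections import defaultdict

class TinyIndex:
    """Toy stand-in for a full-text index: one record per log message."""

    def __init__(self):
        self.records = []                  # record id -> dict of fields
        self.postings = defaultdict(set)   # (field, token) -> set of record ids

    def add(self, record):
        """Store a record and index every word-like token of every field."""
        rec_id = len(self.records)
        self.records.append(record)
        for field, value in record.items():
            for token in re.findall(r"\w+", str(value).lower()):
                self.postings[(field, token)].add(rec_id)
        return rec_id

    def search(self, field, token):
        """Return records whose `field` contains `token` (case-insensitive)."""
        ids = sorted(self.postings.get((field, token.lower()), ()))
        return [self.records[i] for i in ids]

index = TinyIndex()
index.add({
    "verbosity": "UVM_ERROR",
    "time": 4498000,
    "emitter": "uvm_test_top.env.sb",
    "message": "[WRITE_MODE_SPI_DATA_ERR] Sent data packet contains "
               "0x532e4000, but expected 0x532e4cb3",
})
index.add({  # made-up record, for illustration only
    "verbosity": "UVM_INFO",
    "time": 5000000,
    "emitter": "uvm_test_top.env.apb_agent",
    "message": "[APB_WRITE] write completed",
})

errors = index.search("verbosity", "UVM_ERROR")
```

The lookup cost depends on the number of matching records rather than on the size of the log, which is the property that makes an indexed log practical at several gigabytes.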
May 9, 201619
Extracting UVM Records
• Lucene is designed to handle very large collections of records and returns matching records in negligible time. Example queries:
– All messages of a specific verbosity, optionally within some time range
– Messages containing a specific string
– All messages emitted from the APB UVC writing 0x1 to register sflash_reg.enable
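The three kinds of query listed above can be sketched in Python as plain filters over parsed records. This is an illustration under assumptions, not Lucene code: a real index would answer these queries without scanning every record. The function `query`, the field names, and the sample records (including the `apb` emitter path and the message wording) are hypothetical.

```python
def query(records, verbosity=None, t_from=None, t_to=None,
          emitter_contains=None, text_contains=None):
    """Return the records that match every criterion that is not None."""
    hits = []
    for rec in records:
        if verbosity is not None and rec["verbosity"] != verbosity:
            continue
        if t_from is not None and rec["time"] < t_from:
            continue
        if t_to is not None and rec["time"] > t_to:
            continue
        if emitter_contains is not None and emitter_contains not in rec["emitter"]:
            continue
        if text_contains is not None and text_contains not in rec["message"]:
            continue
        hits.append(rec)
    return hits

# Made-up records, for illustration only.
records = [
    {"verbosity": "UVM_INFO", "time": 1000,
     "emitter": "uvm_test_top.env.apb.driver",
     "message": "write 0x1 to sflash_reg.enable"},
    {"verbosity": "UVM_INFO", "time": 2000,
     "emitter": "uvm_test_top.env.apb.driver",
     "message": "write 0x0 to sflash_reg.mode"},
    {"verbosity": "UVM_ERROR", "time": 4498000,
     "emitter": "uvm_test_top.env.sb",
     "message": "Sent data packet contains 0x532e4000, but expected 0x532e4cb3"},
]

# 1. All messages of a given verbosity, with and without a time range.
errors = query(records, verbosity="UVM_ERROR")
early_info = query(records, verbosity="UVM_INFO", t_from=0, t_to=1500)

# 2. Messages containing a specific string.
mismatches = query(records, text_contains="but expected")

# 3. Messages from the APB UVC that write 0x1 to sflash_reg.enable.
enable_writes = query(records, emitter_contains=".apb.",
                      text_contains="0x1 to sflash_reg.enable")
```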
Why Graphical Representation?
• It is extremely hard to navigate a log file in search of the necessary information without being overwhelmed or missing something important
• A graphical representation of the data is more natural and much easier to analyze
Graphical Representation of a Log File
Graphical Debugging
• The transition from debugging a textual file to a graphical representation is intuitive
• Problems are traced much faster. The engineer can quickly see what is wrong, when the pattern changes, or when some unexpected event has occurred
Summary
• The complexity and size of today's designs require new techniques, as the traditional ones lead to very long debug times
• Harnessing tools that are used for processing Big Data can simplify and shorten the debugging of failing tests
• We hope that this work will encourage more research on bringing these strong capabilities to existing and new EDA tools
Thank You