Show simple item record

dc.creator    Jain, Pranav
dc.creator    Gupta, Kunal
dc.date.accessioned    2024-03-20T14:41:50Z
dc.date.available    2024-03-20T14:41:50Z
dc.date.created    2023-05
dc.date.submitted    May 2023
dc.identifier.uri    https://hdl.handle.net/1969.1/200925
dc.description.abstract    Chip designs must meet several requirements before they are ready for fabrication. One of these requirements is achieving convergence on timing (frequency). Meeting this requirement is a time-consuming task for chip designers in industry for two reasons. First, the standard approach to obtaining this metric involves running logic synthesis and placement, both of which can take hours to weeks on larger RTL designs. Second, since the timing requirement is rarely met after one design iteration, these processes must be rerun multiple times to recalculate the metric and ultimately converge on the design’s requirements. A critical measure of timing convergence is the total negative slack, commonly referred to by its acronym TNS. It is the sum of the timing margins of all ‘negative slack’ paths, those that fail to meet the target clock cycle time. To expedite design convergence, our research team previously presented a machine learning-based approach to estimate TNS values for chip designs expressed in the Verilog hardware description language. This technique was orders of magnitude faster than running logic synthesis and placement on those same chips. In this work, we build on the previous approach by improving the initial data generation process. Obtaining “true” TNS values for training the machine learning models involves running logic synthesis and placement with hundreds of synthesis recipes for each design, resulting in tens of thousands of synthesis and placement runs. Because new designs will be continuously added to the RTL developer’s set of training designs, and a rich training data set must be maintained, it is essential to reduce the number of synthesis and placement runs needed to generate machine learning (ML) training data. By taking advantage of similarities in the distributions of TNS values across chip designs, the number of required synthesis and placement runs for n Verilog RTL designs and m unique synthesis recipes can be reduced from O(nm) to O(n+m) without meaningfully compromising the integrity of the training data or the accuracy of the ML predictions. We present two methods for achieving this, both of which involve finding the common TNS distribution, then normalizing the data and computing the missing values in the data set (an illustrative sketch of this reduction appears after this record). The discoveries made by our research team have the potential to drastically reduce the time to market for a variety of semiconductor computing products, including but not limited to graphics processors, motherboards, and flash memory.
dc.format.mimetype    application/pdf
dc.subject    Machine Learning
dc.subject    Verilog RTL
dc.subject    Total Negative Slack
dc.subject    Logic Synthesis and Placement
dc.title    Streamlining TNS Data Collection for ML-Based RTL QoR Prediction
dc.type    Thesis
thesis.degree.department    Computer Science and Engineering
thesis.degree.discipline    Computer Science
thesis.degree.grantor    Undergraduate Research Scholars Program
thesis.degree.name    B.S.
thesis.degree.level    Undergraduate
dc.contributor.committeeMember    Tyagi, Aakash
dc.type.material    text
dc.date.updated    2024-03-20T14:41:51Z
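
To make the abstract's O(nm) to O(n+m) reduction concrete, the sketch below shows one way such an estimate could work under a shared-TNS-distribution assumption: run logic synthesis and placement for a single reference design under all m recipes and for all n designs under a single reference recipe, then scale to fill in the remaining (design, recipe) cells. The function name, the multiplicative scaling model, and the synthetic TNS numbers are illustrative assumptions, not the two methods presented in the thesis.

# Illustrative sketch only: the reference-design/reference-recipe split, the
# multiplicative scaling model, and all names and numbers below are assumptions,
# not taken from the thesis.
import numpy as np

def estimate_tns_matrix(ref_design_tns, ref_recipe_tns, ref_recipe_idx):
    """Estimate an (n, m) TNS matrix from O(n + m) real synthesis/placement runs.

    ref_design_tns : shape (m,), TNS of ONE reference design under all m recipes
    ref_recipe_tns : shape (n,), TNS of all n designs under ONE reference recipe
    ref_recipe_idx : index of that reference recipe within ref_design_tns
    """
    # Per-recipe scaling factors, normalized to the reference recipe; this is
    # where the "designs share a common TNS distribution" assumption enters.
    anchor = ref_design_tns[ref_recipe_idx]
    recipe_scale = ref_design_tns / anchor
    # Scale each design's measured TNS (under the reference recipe) by the
    # per-recipe factors to fill in the unmeasured (design, recipe) cells.
    return np.outer(ref_recipe_tns, recipe_scale)

# Synthetic example: m = 4 recipes, n = 3 designs, TNS in nanoseconds (<= 0).
ref_design_tns = np.array([-120.0, -80.0, -150.0, -60.0])
ref_recipe_tns = np.array([-120.0, -300.0, -45.0])
print(estimate_tns_matrix(ref_design_tns, ref_recipe_tns, ref_recipe_idx=0))

A full implementation along the thesis's lines would instead fit the common TNS distribution explicitly and normalize each design's values to it before imputing the missing entries; the outer-product scaling above is only the simplest stand-in for that idea.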

