Show simple item record

dc.creator    Jain, Pranav
dc.creator    Gupta, Kunal
dc.date.accessioned    2024-03-20T14:41:50Z
dc.date.available    2024-03-20T14:41:50Z
dc.date.created    2023-05
dc.date.submitted    May 2023
dc.identifier.uri    https://hdl.handle.net/1969.1/200925
dc.description.abstract    Chip designs must meet several requirements before they are ready for fabrication. One of these requirements is achieving convergence on timing (frequency). Meeting this requirement is a time-consuming task for chip designers in industry for two reasons. First, the standard approach to obtaining this metric involves running logic synthesis and placement, both of which can take hours to weeks on larger RTL designs. Second, since the timing requirement is rarely met after one design iteration, these processes must be rerun multiple times to recalculate the metric and ultimately converge on the design’s requirements. A critical measure of timing convergence is the total negative slack, commonly referred to by its acronym TNS. It is the sum of the timing margins of all ‘negative slack’ paths, those that fail to meet the target clock cycle time. To expedite design convergence, our research team previously presented a machine learning-based approach to estimate TNS values for chip designs expressed in the Verilog hardware description language. This technique was orders of magnitude faster than running logic synthesis and placement on those same chips. In this work, we build on the previous approach by improving the initial data generation process. Obtaining “true” TNS values for training the machine learning models involves running logic synthesis and placement with hundreds of synthesis recipes for each design, resulting in tens of thousands of synthesis and placement runs. Because new designs will be continuously added to the RTL developer’s set of training designs, and a rich training data set must be maintained, it is essential to reduce the number of synthesis and placement runs needed to generate machine learning (ML) training data. By taking advantage of similarities in the distributions of TNS values across chip designs, the number of required synthesis and placement runs for n Verilog RTL designs and m unique synthesis recipes can be reduced from O(nm) to O(n+m) without meaningfully compromising the integrity of the training data or the accuracy of the ML predictions. We present two methods for achieving this, both of which involve finding the common TNS distribution, then normalizing the data and computing the missing values in the data set (an illustrative sketch of this reduction appears after this record). The discoveries made by our research team have the potential to drastically reduce the time to market for a variety of semiconductor computing products, including but not limited to graphics processors, motherboards, and flash memory.
dc.format.mimetype    application/pdf
dc.subject    Machine Learning
dc.subject    Verilog RTL
dc.subject    Total Negative Slack
dc.subject    Logic Synthesis and Placement
dc.title    Streamlining TNS Data Collection for ML-Based RTL QoR Prediction
dc.type    Thesis
thesis.degree.department    Computer Science and Engineering
thesis.degree.discipline    Computer Science
thesis.degree.grantor    Undergraduate Research Scholars Program
thesis.degree.name    B.S.
thesis.degree.level    Undergraduate
dc.contributor.committeeMember    Tyagi, Aakash
dc.type.material    text
dc.date.updated    2024-03-20T14:41:51Z
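
To make the abstract's O(nm) to O(n+m) reduction concrete, the sketch below shows one way such an estimate could work under a shared-TNS-distribution assumption: run logic synthesis and placement for a single reference design under all m recipes and for all n designs under a single reference recipe, then scale to fill in the remaining (design, recipe) cells. The function name, the multiplicative scaling model, and the synthetic TNS numbers are illustrative assumptions, not the two methods presented in the thesis.

# Illustrative sketch only: the reference-design/reference-recipe split, the
# multiplicative scaling model, and all names and numbers below are assumptions,
# not taken from the thesis.
import numpy as np

def estimate_tns_matrix(ref_design_tns, ref_recipe_tns, ref_recipe_idx):
    """Estimate an (n, m) TNS matrix from O(n + m) real synthesis/placement runs.

    ref_design_tns : shape (m,), TNS of ONE reference design under all m recipes
    ref_recipe_tns : shape (n,), TNS of all n designs under ONE reference recipe
    ref_recipe_idx : index of that reference recipe within ref_design_tns
    """
    # Per-recipe scaling factors, normalized to the reference recipe; this is
    # where the "designs share a common TNS distribution" assumption enters.
    anchor = ref_design_tns[ref_recipe_idx]
    recipe_scale = ref_design_tns / anchor
    # Scale each design's measured TNS (under the reference recipe) by the
    # per-recipe factors to fill in the unmeasured (design, recipe) cells.
    return np.outer(ref_recipe_tns, recipe_scale)

# Synthetic example: m = 4 recipes, n = 3 designs, TNS in nanoseconds (<= 0).
ref_design_tns = np.array([-120.0, -80.0, -150.0, -60.0])
ref_recipe_tns = np.array([-120.0, -300.0, -45.0])
print(estimate_tns_matrix(ref_design_tns, ref_recipe_tns, ref_recipe_idx=0))

A full implementation along the thesis's lines would instead fit the common TNS distribution explicitly and normalize each design's values to it before imputing the missing entries; the outer-product scaling above is only the simplest stand-in for that idea.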

