The Grafted Superset Approach: Bridging Python to Silicon with Asynchronous Compilation and Beyond
Abstract
In the fields of high-performance scientific computing, data analysis, and machine learning, algorithms are evolving at an unprecedented pace, challenging the computational capabilities of existing general-purpose processors, such as CPUs and GPUs. These processors, while versatile, often fall short in efficiency and performance in the face of increasingly complex demands. This inefficiency is particularly pronounced in the current era, characterized by the end of Dennard scaling and the prevalence of dark silicon, which has severely limited the ability of traditional silicon-based technologies to keep up with the computational growth. In response to these challenges, there is a clear and pressing need for dedicated accelerators, namely Application-Specific Integrated Circuits (ASICs).
The essential role of closed-source Electronic Design Automation (EDA) tools in hardware development is undeniable, yet these tools introduce substantial barriers in the rapidly advancing domain of algorithm and application acceleration. The dependence on manual design methodologies and traditional languages, such as hardware description languages (HDLs) and Tcl, is made more challenging by the restrictive nature of proprietary tools. The result is a slow and tedious development process, ill-suited to the pace of algorithmic evolution and to contemporary energy challenges.
Open-source EDA tools and agile methodologies offer a viable solution to the challenges posed by closed-source systems. Research underscores the effectiveness of open-source initiatives, including OpenLane/OpenROAD, Open PDKs, and the SkyWater SKY130 process, bolstered by supportive frameworks such as Google's Multi-Project Wafer (MPW) lottery. Open-source frameworks for the generation of end-to-end hardware from high-level representations in both the FPGA and ASIC domains are further enhancing the democratization of Open Design Automation. These No-Human-In-Loop (NHIL) tools and approaches enable rapid and flexible development of ASICs from high-level descriptions, aligning with the fast-paced demands of algorithmic evolution and modern energy challenges.
We introduce a Superset Framework, designated as SUF, that integrates seamlessly with the OpenROAD automated ASIC flow. The philosophy underlying our approach is solely to augment and complement existing processes, hence the analogy with a graft. SUF operates alongside OpenROAD, so that each can be updated independently and no OpenROAD files need to be modified (useful in scenarios such as merges and simultaneous usage by multiple users). The principal contribution of SUF is to change the parallelism paradigm of the baseline flow, but it also adds several high-level, open-source tools that cover every stage from input generation to plotting. As depicted in Figure, the framework encapsulates the core OpenROAD flow, enhancing it with our suite of tools and emphasizing asynchronous parallelism to facilitate concurrent execution of multiple processes.
The main feature is the asynchronous flow enabled by our library, ``Scenario'', which allows processes to fork themselves with their own data copy and run on independent threads. This approach generates a dependency graph of actions, exposing parallelism across PDKs and design entries so that each can run independently. The high-level configuration files (PDKs.py and designs.py) are shared from this step through to the end of the flow, so that every process and step sees the same configuration context.
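The idea of forking one independent task per PDK and design entry, each with its own copy of the shared configuration, can be sketched with the Python standard library. This is a minimal illustration only and does not reproduce the ``Scenario'' API: the names run_flow, run_all, PDKS, DESIGNS, and SHARED_CONFIG are hypothetical, and the PDK and design lists are stand-ins for whatever PDKs.py and designs.py actually contain.

```python
import copy
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical stand-ins for the contents of PDKs.py and designs.py.
PDKS = ["sky130hd", "asap7"]
DESIGNS = ["adder", "multiplier"]
SHARED_CONFIG = {"clock_ns": 10.0}


def run_flow(context):
    """Placeholder for one flow run; a real task would invoke the ASIC flow."""
    context["status"] = "done"  # mutates only this task's private copy
    return (context["pdk"], context["design"], context["status"])


def run_all(pdks, designs, shared_config, max_workers=4):
    """Submit one independent task per (PDK, design) pair and collect results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = []
        for pdk, design in product(pdks, designs):
            ctx = copy.deepcopy(shared_config)  # each task gets its own data copy
            ctx.update(pdk=pdk, design=design)
            futures.append(pool.submit(run_flow, ctx))
        return sorted(f.result() for f in futures)


results = run_all(PDKS, DESIGNS, SHARED_CONFIG)
```

Because every task works on a deep copy, no task can alter the configuration seen by another, which is what lets the pairs run in any order.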
Another key aspect is the automated generation of Verilog RTL from Python, optionally incorporating FloPoCo (an open-source arithmetic core generator) and using our open-source translator to Verilog (based on GHDL).
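As a rough illustration of what generating Verilog RTL from Python can look like, the sketch below emits a parameterized registered adder as a string. It is an assumption-laden toy, not SUF's generator: the function name make_adder and the module layout are hypothetical, and neither FloPoCo nor the GHDL-based translator is involved.

```python
def make_adder(name: str, width: int) -> str:
    """Return Verilog source for a registered adder with `width`-bit inputs."""
    if width < 1:
        raise ValueError("width must be positive")
    msb = width - 1
    return "\n".join([
        f"module {name} (",
        "    input  wire clk,",
        f"    input  wire [{msb}:0] a,",
        f"    input  wire [{msb}:0] b,",
        f"    output reg  [{width}:0] sum",  # one extra bit keeps the carry
        ");",
        "    always @(posedge clk) begin",
        "        sum <= a + b;",
        "    end",
        "endmodule",
        "",
    ])


rtl = make_adder("adder8", 8)
```

Sweeping width in a Python loop then yields a family of design entries from a single description.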
SUF also aggregates all intermediate files in both a forward and a backward direction. In the forward direction, it compiles metrics and reports, organizing them into tables and plots formatted for academic publication. In the backward direction, it provides feedback for iterations on the HDL or on the performance specifications (such as clock speed and area) when discrepancies are encountered.
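The forward and backward roles of aggregation can be sketched as a single pass over per-design metrics: one output is a report table, the other is a list of designs that miss the clock target and need another iteration. The metric keys, the function name aggregate, and the sample numbers are all made up for illustration; they are not SUF's report format or measured results.

```python
def aggregate(runs, target_clock_ns):
    """Return (table_text, retry_list) from a list of per-design metric dicts."""
    lines = [f"{'design':<12}{'area_um2':>10}{'path_ns':>9}{'met':>5}"]
    retry = []
    for run in sorted(runs, key=lambda r: r["design"]):
        met = run["critical_path_ns"] <= target_clock_ns
        flag = "yes" if met else "no"
        lines.append(
            f"{run['design']:<12}{run['area_um2']:>10.1f}"
            f"{run['critical_path_ns']:>9.2f}{flag:>5}"
        )
        if not met:
            retry.append(run["design"])  # backward feedback: needs another pass
    return "\n".join(lines), retry


# Illustrative, made-up metrics (not measured results).
sample_runs = [
    {"design": "mult16", "area_um2": 5200.0, "critical_path_ns": 12.4},
    {"design": "adder8", "area_um2": 310.5, "critical_path_ns": 3.1},
]
table, retry = aggregate(sample_runs, target_clock_ns=10.0)
```

Here the table feeds reporting, while retry names the designs whose HDL or clock specification would be revisited.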
A unique aesthetic feature of SUF is the consolidation of all GDS files into a unified gallery. This function not only highlights the framework's aesthetic appeal but also showcases its efficiency, contrasting sharply with the workflows of closed-source EDA systems.
SUF demonstrates the practical and aesthetic advantages of open-source approaches. The framework has processed up to a thousand designs within a five-hour window, while also providing a visual representation of the resulting chip layouts. In light of these results, we encourage researchers in academia and industry, as well as hobbyists, to adopt this tool in their workflows and to contribute. Our open-source community thrives on collaboration and is accessible via our GitHub page.