# A New Era of Open-Source Hardware

**Christopher Batten** 

Computer Systems Laboratory, Electrical and Computer Engineering, Cornell University



## **Accelerators for Machine Learning in the Cloud**



#### **NVIDIA DGX Hopper**

- Graphics processor specialized just for accelerating machine learning
- Available as part of a complete system with both the software and hardware designed by NVIDIA



#### Google TPU v4

- Custom chip specifically designed to accelerate Google's TensorFlow C++ library
- Tightly integrated into Google's data centers



#### **Microsoft Catapult**

- Custom FPGA board for accelerating Bing search and machine learning
- Accelerators developed with/by app developers
- Tightly integrated into Microsoft's data centers and cloud computing platforms


## Accelerators for Machine Learning at the Edge



#### **Amazon Echo**

- Developing AI chips so Echo line can do more on-board processing
- Reduces need for round-trip to cloud
- Co-design the algorithms and the underlying hardware



#### **Facebook Oculus**

- Starting to design custom chips for Oculus VR headsets
- Significant performance demands under strict power requirements



#### **Movidius Myriad 2**



# Top-five software companies are all building custom accelerators

- Facebook: w/ Intel, in-house AI chips, Oculus
- Amazon: Echo, networking chips
- Microsoft: Hiring for AI chips
- Google: TPU, Pixel, convergence
- Apple: SoCs for phones and laptops

Chip startup ecosystem for machine learning accelerators is thriving!

- Graphcore
- Nervana
- Cerebras
- Wave Computing
- Horizon Robotics
- Cambricon
- DeePhi
- Esperanto
- SambaNova
- **Eyeriss**
- Tenstorrent
- Mythic
- ThinkForce
- Groq
- Lightmatter

How can we accelerate innovation in accelerator-centric hardware design?


# **Software Innovation Today**




# Hardware Innovation Today



Like climbing a mountain – nothing is hidden!

#### What you have to build

- New machine learning accelerator
- Other unrelated components: anything you cannot afford to buy, or that COTS IP does not cover

#### **Closed source**

- ARM A57, A7, M4, M0
- ARM on-chip interconnect
- Standard cells, I/O pads, DDR Phy
- SRAM memory compilers
- VCS, Modelsim
- DC, ICC, Formality, Primetime
- Stratus, Innovus, Voltus
- Calibre DRC/RCX/LVS, SPICE

Adapted from M. Taylor, "Open Source HW in 2030," Arch 2030 Workshop @ ISCA'16




## Minimum Viable Product/Prototype






## How can HW design be more like SW design?

| **Open-Source**      | Software                           | Hardware                                                                              |
|----------------------|------------------------------------|---------------------------------------------------------------------------------------|
| high-level languages | Python, Ruby, R, JavaScript, Julia | Chisel, PyMTL, PyRTL, MyHDL, JHDL, Cλash, Calyx                                       |
| libraries            | C++ STL, Python std libs           | BaseJump                                                                              |
| systems              | Linux, Apache, MySQL, memcached    | Rocket, Pulp/Ariane, OpenPiton, BOOM, FabScalar, MIAOW, Nyuzi                         |
| standards            | POSIX                              | RISC-V ISA, RoCC, TileLink                                                            |
| tools                | GCC, LLVM, CPython, MRI, PyPy, V8  | Icarus Verilog, Verilator, qflow, Yosys, TimberWolf, qrouter, magic, klayout, ngspice |
| methodologies        | agile software design              | agile hardware design                                                                 |
| cloud                | IaaS, elastic computing            | IaaS, elastic CAD                                                                     |


```
# Ubuntu Server 16.04 LTS (ami-43a15f3e), c. 2018
% sudo apt-get update
% sudo apt-get -y install build-essential qflow
% mkdir qflow && cd qflow
% wget http://opencircuitdesign.com/qflow/example/map9v3.v
% qflow synthesize place route map9v3 # yosys, graywolf, qrouter
% wget http://opencircuitdesign.com/qflow/example/osu035_stdcells.gds2

# design def/lef -> magic format
% magic
>>> lef read /usr/share/qflow/tech/osu035/osu035_stdcells.lef
>>> def read map9v3.def
>>> writeall force map9v3

# stdcell gds -> magic format
% magic
>>> gds read osu035_stdcells.gds2
>>> writeall force

# design + stdcells magic format -> gds
% magic map9v3
>>> gds write map9v3

# build klayout from source and view the final layout
% sudo apt-get -y install libqt4-dev-bin libqt4-dev libz-dev
% wget http://www.klayout.org/downloads/source/klayout-0.24.9.tar.gz
% tar -xzvf klayout-0.24.9.tar.gz && cd klayout-0.24.9
% ./build.sh -noruby -nopython
% wget http://www.csl.cornell.edu/~cbatten/scmos.lyp
% ./bin.linux-64-gcc-release/klayout -l scmos.lyp ../map9v3.gds
```







## **RISC-V Hardware and Software Ecosystem**

#### Software

**Open-source software:** GCC, binutils, glibc, Linux, BSD, LLVM, QEMU, FreeRTOS, ZephyrOS, LiteOS, SylixOS, ...

**Commercial software:** Lauterbach, Segger, IAR, Micrium, ExpressLogic, Ashling, AntMicro, Imperas, UltraSoC, ...

Shared between software and hardware: ISA specification, golden model, compliance

#### Hardware

**Open-source cores:** Rocket, BOOM, RI5CY, Ariane, PicoRV32, Piccolo, SCR1, Shakti, SweRV, Hummingbird, ...

**Commercial core providers:** Andes, Bluespec, Cloudbear, Codasip, Cortus, C-Sky, InCore, Nuclei, SiFive, Syntacore, ...

**In-house cores:** NVIDIA, and others

## **OpenROAD: The Future of Open-Source EDA**

Inputs: design.v + .lib + .lef + .sdc + parameters.cfg

- **Flow Setup:** flow parameters; library and techfile preparation; macro wrappers; dont\_use list
- **Logic Synthesis:** logic optimization; technology mapping; buffering and sizing
- **Floorplanning:** IO placement; mixed-size + macro placement; tapcell insertion; PDN generation
- **Placement:** global placement; placement-based optimization; detailed placement
- **CTS:** clock tree synthesis; CTS, hold, and ERC repair; placement legalization
- **Routing:** global routing; antenna check + repair; detailed routing
- **Finishing:** filler cell + BEOL fill insertion; merge wrapped macros; merge GDS

Outputs: result.def + ppa.rpt + drc.rpt + results.gds

**OpenROAD v1.0** shared infrastructure: static timing analysis (OpenSTA), parasitic extraction (OpenRCX), shared data model (OpenDB)

Example design: **OpenTitan SoC** in GF12LP







**Call for new engineering project team proposals**

From: Associate Dean for Undergraduate…; To: ENGRFACULTY-L, ENGRFACULT…; Date: February 22, 2022 at 12:08 PM

Through a generous donor gift creating the **Shen Fund for Social Impact** we have the opportunity to fund multiple new engineering project teams. This program is designed to bring together new student teams under a faculty member's mentorship to address significant social challenges through novel and/or advanced engineering solutions. Falling under the Project Team Umbrella, the program will fund up to three new teams per year, with each supported for a three-year period at \$30K/yr. The teams will also be provided space and support to design and implement these projects.

Proposals may be submitted by either faculty looking to guide a group of students, or by students who will engage with a faculty member to form the teams.

Attached to this e-mail are three documents:

- Shen Fund FAQ Sp22.pdf: More fully describes the nature of the projects and the goals of the program (also copied to the e-mail below).
- Shen Fund Proposal Template Sp22.docx: Short project proposal form.
- Shen Funded Projects Summary\_Sp22.pdf: A summary document of currently funded teams.

The ideal project will likely develop through discussions with Lauren Stulgis (as director of the project teams) and me. Feel free to reach out to us with rough ideas and concepts and we can help to try to develop a viable proposal.

Proposals will be considered as they arrive, with discussions to strengthen each within the program constraints. The initial application is a simple document identifying the primary goals, technical challenges and plans, timeline and budget, and currently engaged personnel.

Proposals must be uploaded directly to Box by email to: Proposa.zeuyhp9wqg5p8teo@u.box.com. The first round of decisions will be made based on submissions received by 11:59pm on Sunday, March 13, 2022.

Again, please feel free to contact me or Lauren Stulgis with any questions or to discuss potential projects.

Prof. Alan Zehnder, Associate Dean for Undergraduate Programs, 177 Rhodes Hall. Phone: (607) 255-9181. Email: eng\_ugdean@cornell.edu

## **C2S2: Cornell Custom Silicon Systems Project Team**

Three-year student-led project team to tapeout a custom chip in SkyWater 130nm to implement a proof-of-concept system for a campus partner

- Open RISC-V ISA
- Open-Source VexRiscv microcontroller
- Open-Source OpenROAD chip flow
- Open PDK for SkyWater 130nm
- OpenMPW + ChipIgnite w/ efabless

100+ applications  $\rightarrow$  25 team members

- Digital & Verification Subteam
- Analog Subteam
- Software Subteam
- System Architecture Subteam






# **PyMTL3 Framework**

Outline of the talk:

- Trends in Open-Source HW
- **PyMTL3 Framework**
- PyMTL3 in Practice
- PyMTL3 in Research
  - JIT-Compiled Simulation
  - Gradually-Typed HDLs
  - Property-Based Random Testing
- A Call to Action




productive hardware design methodology?







- **PyMTL2**: https://github.com/cornell-brg/pymtl
  - released in 2014
  - extensive experience using framework in research & teaching
- **PyMTL3**: https://github.com/pymtl/pymtl3
  - official release in May 2020
  - adoption of new Python3 features
  - significant rewrite to improve productivity & performance
  - cleaner syntax for FL, CL, and RTL modeling
  - completely new Verilog translation support
  - first-class support for method-based interfaces




## **PyMTL3 High-Level Modeling**

```
from collections import deque
from pymtl3 import *

class QueueFL( Component ):

  def construct( s, maxsize ):
    s.q = deque( maxlen=maxsize )

  # enq may only be called while the queue has free space
  @non_blocking(
    lambda s: len(s.q) < s.q.maxlen )
  def enq( s, value ):
    s.q.appendleft( value )

  # deq may only be called while the queue is non-empty
  @non_blocking(
    lambda s: len(s.q) > 0 )
  def deq( s ):
    return s.q.pop()
```

- FL/CL components can use method-based interfaces
- Structural composition via connecting methods
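The guarded-method idea can be illustrated without PyMTL3 at all. The following is a minimal plain-Python sketch, not the PyMTL3 API: the class `GuardedQueue`, the `enq_rdy`/`deq_rdy` methods, and the `transfer` helper are hypothetical names chosen for illustration. It shows a caller checking a readiness guard before invoking a method, which is the contract a non-blocking method interface expresses.

```python
from collections import deque


class GuardedQueue:
    """Plain-Python stand-in for a queue with guarded enq/deq methods."""

    def __init__(self, maxsize):
        self.q = deque(maxlen=maxsize)

    def enq_rdy(self):
        # Guard: enq is legal only while there is free space
        return len(self.q) < self.q.maxlen

    def enq(self, value):
        assert self.enq_rdy(), "enq called while queue is full"
        self.q.appendleft(value)

    def deq_rdy(self):
        # Guard: deq is legal only while the queue is non-empty
        return len(self.q) > 0

    def deq(self):
        assert self.deq_rdy(), "deq called while queue is empty"
        return self.q.pop()


def transfer(src, dst):
    """Move one item from src to dst only when both guards allow it."""
    if src.deq_rdy() and dst.enq_rdy():
        dst.enq(src.deq())
        return True
    return False


a, b = GuardedQueue(2), GuardedQueue(1)
a.enq(10)
a.enq(20)
first = transfer(a, b)   # succeeds: a has data, b has space
second = transfer(a, b)  # refused: b is full
```

Composing two queues then amounts to connecting one queue's `deq` to the other's `enq`, with the guards deciding when a transfer can happen.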



## **PyMTL3 Low-Level Modeling**

```
from pymtl3 import *

class RegIncrRTL( Component ):

  def construct( s, nbits ):
    s.in_ = InPort ( nbits )
    s.out = OutPort( nbits )
    s.tmp = Wire   ( nbits )

    # sequential logic: register the input on the clock edge
    @update_ff
    def seq_logic():
      s.tmp <<= s.in_

    # combinational logic: increment the registered value
    @update
    def comb_logic():
      s.out @= s.tmp + 1
```



- Hardware modules are Python classes derived from Component
- construct method for constructing (elaborating) hardware
- ports and wires for signals
- update blocks for modeling combinational and sequential logic



## What is PyMTL3 for and not (currently) for?

#### PyMTL3 is for ...

- Taking an accelerator design from concept to implementation
- Construction of highly-parameterizable CL models
- Construction of highly-parameterizable RTL design generators
- Rapid design, testing, and exploration of hardware mechanisms
- Interfacing models with other C++ or Verilog frameworks

#### PyMTL3 is not (currently) for ...

- Python high-level synthesis
- Many-core simulations with hundreds of cores
- Full-system simulation with real OS support
- Users needing a complex OOO processor model "out of the box"


# **PyMTL3 in Practice**


## **PyMTL has been used in many chip tapeouts**



## **BRG Test Chip #1 (2016)**

Block diagram: LVDS clock receiver and clock divider feeding the clock tree; host interface (host2chip, chip2host) with control register; RISC processor; HLS-generated sort accelerator; memory arbitration unit; four 4KB SRAM banks.

- RISC processor, 16KB SRAM, HLS-generated accelerator
- 2x2mm, 1.2M-trans, IBM 130nm
- 95% done using PyMTL2



- Four RISC-V RV32IMAF cores with "smart" sharing of L1\$/LLFU
- 1x1.2mm, 6.7M-trans, TSMC 28nm
- 95% done using PyMTL2

## **BRG Test Chip #5 (2022)**





- RISC-V RV32IM core with 32-KB of SRAM
- SPI minion for config; SPI master and GP I/O for peripherals
- 2x2.5mm, TSMC 180nm
- 100% done using PyMTL3

## **Celerity System-on-Chip**

#### Target Workload: High-Performance Embedded Computing

- $5 \times 5$ mm in TSMC 16 nm FFC
- 385 million transistors
- 511 RISC-V cores
  - 5 Linux-capable Rocket cores
  - 496-core tiled manycore
  - 10-core low-voltage array
- 1 BNN accelerator
- 1 synthesizable PLL
- 1 synthesizable LDO Vreg
- 3 clock domains
- 672-pin flip chip BGA package
- 9 months from PDK access to tape-out




#### **PyMTL3 for Undergraduate and Graduate Courses**



**Computer Arch Course** Labs use PyMTL for verification, PyMTL or Verilog for RTL design







**Chip Design Course** Labs use PyMTL for verification, PyMTL or Verilog for RTL design, standard ASIC flow

First teaching tapeout in 10+ years!

- Four student projects
- All use PyMTL for testing
- Two use PyMTL for design






# **PyMTL3 in Research: JIT-Compiled Simulation**

## **Evaluating HDLs, HGFs, and HGSFs**

- Apples-to-apples comparison of simulator performance
- 64-bit radix-four integer iterative divider
- All implementations use the same control/datapath split with the same level of detail
- Modeling and simulation frameworks:
  - Verilog: commercial Verilog simulator, Icarus, Verilator
  - HGF (hardware generation framework): Chisel
  - HGSFs (hardware generation and simulation frameworks): PyMTL, MyHDL, PyRTL, Migen
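For readers unfamiliar with the benchmark design, the sketch below is a functional model of one common way to build a radix-four iterative divider: each iteration consumes two dividend bits and produces one base-4 quotient digit. It is an illustration under that assumption, not the control/datapath implementation used in the comparison; the function name `radix4_divide` is hypothetical.

```python
def radix4_divide(dividend, divisor, nbits=64):
    """Unsigned restoring division that retires two quotient bits per step."""
    assert divisor != 0 and nbits % 2 == 0
    assert 0 <= dividend < (1 << nbits)
    remainder, quotient = 0, 0
    # nbits/2 iterations, most-significant bit pair first
    for shift in range(nbits - 2, -1, -2):
        # Shift in the next two dividend bits
        remainder = (remainder << 2) | ((dividend >> shift) & 0b11)
        # Pick the largest digit in {3, 2, 1, 0} whose multiple still fits
        digit = 0
        for k in (3, 2, 1):
            if remainder >= k * divisor:
                digit = k
                break
        remainder -= digit * divisor
        quotient = (quotient << 2) | digit
    return quotient, remainder


example = radix4_divide(100, 7)
```

A 64-bit divide therefore takes 32 iterations, which is what makes the design a useful simulator benchmark: many cycles of simple, repeated datapath activity.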

## **Productivity/Performance Gap**



Figure panel: (a) Handwritten

- Higher is better
- Log scale (gap is larger than it seems)
- Commercial Verilog simulator is 20× faster than Icarus
- Verilator requires a C++ testbench, only works with synthesizable code, and takes significant time to compile, but is 200× faster than Icarus

Using the CPython interpreter, Python-based HGSFs are much slower than commercial Verilog simulators, and even slower than Icarus!

Using the PyPy JIT compiler, Python-based HGSFs achieve a ~10× speedup but are still significantly slower than a commercial Verilog simulator.

- Hybrid C/C++ co-simulation improves performance but:
  - only works for a synthesizable subset
  - may require the designer to work with C/C++ and Python simultaneously

PyMTL3 achieves impressive simulation performance by co-optimizing the framework and the JIT.
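One framework-side optimization of this kind is static scheduling: instead of an event queue, the simulator orders combinational blocks once, ahead of time, so that every block runs after the blocks that produce its inputs. The sketch below illustrates the idea with Python's standard `graphlib`; the block names, the read/write sets, and the `static_schedule` helper are hypothetical and are not taken from the PyMTL3 code base.

```python
from graphlib import TopologicalSorter

signals = {"x": 0, "y": 0, "z": 0, "out": 0}


def add_one():
    signals["y"] = signals["x"] + 1


def double():
    signals["z"] = signals["y"] * 2


def negate():
    signals["out"] = -signals["z"]


# name -> (function, signals read, signals written), deliberately out of order
blocks = {
    "negate": (negate, {"z"}, {"out"}),
    "double": (double, {"y"}, {"z"}),
    "add_one": (add_one, {"x"}, {"y"}),
}


def static_schedule(blocks):
    """Order blocks so each runs after the writers of the signals it reads."""
    writer = {sig: name for name, (_, _, writes) in blocks.items() for sig in writes}
    deps = {
        name: {writer[sig] for sig in reads if sig in writer}
        for name, (_, reads, _) in blocks.items()
    }
    return list(TopologicalSorter(deps).static_order())


schedule = static_schedule(blocks)


def eval_comb(x):
    # One pass over the precomputed schedule settles all combinational logic
    signals["x"] = x
    for name in schedule:
        blocks[name][0]()
    return signals["out"]


result = eval_comb(3)
```

Because the schedule is fixed, the per-cycle loop has no data-dependent dispatch, which is the kind of regular code a tracing JIT can optimize well.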

## **PyMTL3 Performance**

| Technique            | Divider | 1-Core   | 16-Core | 32-Core |
|----------------------|---------|----------|---------|---------|
| Event-Driven         | 24K CPS | 6.6K CPS | 155 CPS | 66 CPS  |
| **JIT-Aware HGSF**   |         |          |         |         |
| + Static Scheduling  | 13×     | 2.6×     | 1×      | 1.1×    |
| + Schedule Unrolling | 16×     | 24×      | 0.4×    | 0.2×    |
| + Heuristic Toposort | 18×     | 26×      | 0.5×    | 0.3×    |
| + Trace Breaking     | 19×     | 34×      | 2×      | 1.5×    |
| + Consolidation      | 27×     | 34×      | 47×     | 42×     |
| **HGSF-Aware JIT**   |         |          |         |         |
| + RPython Constructs | 96×     | 48×      | 62×     | 61×     |
| + Huge Loop Support  | 96×     | 49×      | 65×     | 67×     |

CPS = simulated cycles per second. The multicore columns model RISC-V RV32IM five-stage pipelined cores only, with no interconnect or caches.
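As a worked example, the speedups can be converted back into absolute simulation rates. The short calculation below assumes that each multiplier in the table is relative to the event-driven baseline in the same column; that reading is an assumption, and the variable names are illustrative.

```python
# Event-driven baseline, in simulated cycles per second (CPS)
baseline_cps = {"Divider": 24_000, "1-Core": 6_600, "16-Core": 155, "32-Core": 66}

# Final row of the table (+ Huge Loop Support), assumed relative to the baseline
final_speedup = {"Divider": 96, "1-Core": 49, "16-Core": 65, "32-Core": 67}

absolute_cps = {
    design: baseline_cps[design] * final_speedup[design] for design in baseline_cps
}

for design, cps in absolute_cps.items():
    print(f"{design:8s} {cps:>10,} CPS")
```

Under that assumption the divider reaches roughly 2.3M CPS, while the 32-core model reaches only a few thousand CPS, a reminder that speedup and absolute throughput tell different stories as designs grow.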




# **PyMTL3 in Research: Gradually-Typed HDLs**

## **Dynamically vs. Statically Typed HDLs**

|                         | Design Productivity | Testing Productivity | Simulation Performance | Static Correctness Guarantee |
|-------------------------|---------------------|----------------------|------------------------|------------------------------|
| Verilog/SystemVerilog   | Low                 | Low                  | High                   | Low                          |
| Bluespec                | Medium              | Low                  | Medium                 | High                         |
| Clash/Chisel/SpinalHDL  | Medium              | Low                  | Medium                 | Medium                       |
| PyRTL/MyHDL/Migen/PyMTL | High                | High                 | Low                    | None                         |
| PyMTL3                  | High                | High                 | Medium                 | Low                          |

Can we achieve the best of both dynamically and statically typed HDLs in a single unified framework?

## **Gradually Typed HDLs**

Code in Reticulated Python, a gradually typed dialect of Python:

```
class Foo:
  bar = 42

# Dynamic
def g(x):
  x.bar = 'hello world'

# Static
def f(x:Object({bar:Int}))->Int:
  g(x)
  return x.bar

f(Foo())
```

The same idea applied to a PyMTL3 component:

```
from pymtl3 import *

T_W = \
  TypeVar( "T_W", bound=Bits )

class RegIncrRTL( Component ):
  def construct( s, W: Type[T_W] ):
    s.in_ = InPort ( W )
    s.out = OutPort( W )
    s.tmp = Wire   ( W )

    @update_ff
    def seq_logic():
      s.tmp <<= s.in_

    @update
    def comb_logic():
      s.out @= s.tmp + 1
```

Figure: IDiv (IDivCtrl, IDivDpath) placed between a test source and a test sink in the test bench; the legend marks dynamically typed components, statically typed components, and the dynamic checks at the boundary between them.

Component Hierarchy in GT-HDL
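The role of the dynamic checks at the boundary can be sketched in ordinary Python. In the example below the checks are written by hand, whereas a gradually typed system would insert them automatically; `check_boundary` is a hypothetical helper, and the example mirrors the `Foo`/`f`/`g` program from the slide rather than any real GT-HDL code.

```python
class Foo:
    bar = 42


def g(x):
    # Dynamically typed code: free to break the assumption that bar is an int
    x.bar = "hello world"


def check_boundary(value, expected, where):
    """Runtime check inserted where dynamic values flow into typed code."""
    if not isinstance(value, expected):
        raise TypeError(
            f"{where}: expected {expected.__name__}, got {type(value).__name__}"
        )
    return value


def f(x) -> int:
    # Statically typed code: checks guard its entry and its return value
    check_boundary(x.bar, int, "f argument x.bar")
    g(x)
    return check_boundary(x.bar, int, "f return value")


try:
    f(Foo())
    error = None
except TypeError as exc:
    error = str(exc)
```

The error is reported at the boundary of the typed function rather than somewhere downstream, which is the debugging benefit the component-hierarchy figure is pointing at.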




# **PyMTL3 in Research: Property-Based Random Testing**

## **Testing HW Design Generators is Challenging**

#### Testing a specific ring network instance requires a number of different test cases



- `test_ring_1pkt_2x2_0_chnl`
- `test_ring_2pkt_2x2_0_chnl`
- `test_ring_2pkt_2x2_0_chnl`
- `test_ring_self_2x2_0_chnl`
- `test_ring_clockwise_2x2_0_chnl`
- `test_ring_neighbor_2x2_0_chnl`
- `test_ring_tornado_2x2_0_chnl`
- `test_ring_backpressure_2x2_0_chnl`

#### Ideal testing technique:

1. Detect error quickly with **small number of test cases**
2. The failing test case has **minimal number of transactions**
3. The bug trace has **simplest transactions**
4. The failing test case has the **simplest design**

| src | dst | payload    |
|-----|-----|------------|
| 0   | 1   | 0xdeadbeef |
| 0   |     | 0x0000003  |
| 1   |     | 0x00010000 |
| 1   |     | 0x00010002 |
| 2   | 1   | 0x00020001 |
| 2   | 3   | 0x00020003 |
| 3   | 2   | 0x00030002 |
| 3   | 0   | 0x00030000 |
| 0   | 1   | 0x00001000 |
| 1   | 2   | 0x10002000 |
| 2   | 3   | 0x20003000 |
| 3   | 0   | 0x3000000  |
| 0   | 3   | 0x00003000 |
| 1   | 0   | 0x1000000  |
| 2   | 1   | 0x20001000 |
| 3   | 2   | 0x30002000 |



A design generator can have many parameters: topology, routing, flow control, channel latency

#### **Software Testing Techniques**

- Complete Random Testing (CRT)
  - Randomly generate input data
  - Detects error quickly
  - Debug complicated test case
- Iterative Deepened Testing (IDT)
  - Gradually increase input complexity
  - Finds bug with simple input
  - Takes many test cases to find bug
- Property-Based Testing (PBT)
  - Search strategies, auto shrinking
  - Detects error quickly
  - Produces minimal failing test case
  - Increasingly state-of-the-art in software testing

```
def gcd( a, b ):
    while b > 0:
        a, b = b, a % b
    return a
```

```
import math
import random

def test_crt():
    for _ in range( 100 ):
        a = random.randint( 1, 128 )
        b = random.randint( 1, 128 )
        assert gcd( a, b ) == math.gcd( a, b )
```
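```
def test_idt():
    for a_max in range( 1, 128 ):
        for b_max in range( 1, 128 ):
            # Assumed reconstruction: draw inputs within the current bounds
            a = random.randint( 1, a_max )
            b = random.randint( 1, b_max )
            assert gcd( a, b ) == math.gcd( a, b )
```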

```
import hypothesis
import hypothesis.strategies

@hypothesis.given(
    a = hypothesis.strategies.integers( 1, 128 ),
    b = hypothesis.strategies.integers( 1, 128 ),
)
def test_pbt( a, b ):
    assert gcd( a, b ) == math.gcd( a, b )
```
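The "auto shrinking" step is what turns a random failure into a minimal one. The toy sketch below shows the idea with only the standard library: a random search finds a failing input, then a greedy loop moves each input toward zero while the property still fails. Hypothesis uses a far more sophisticated shrinker; the buggy saturating adder and all function names here are hypothetical.

```python
import random


def sat_add8_buggy(a, b):
    # Hypothetical 8-bit saturating adder with a bug: it wraps instead
    return (a + b) & 0xFF


def sat_add8_spec(a, b):
    return min(a + b, 255)


def fails(case):
    a, b = case
    return sat_add8_buggy(a, b) != sat_add8_spec(a, b)


def find_failure(rng, trials=1000):
    """Random testing: return the first failing input found, if any."""
    for _ in range(trials):
        case = (rng.randint(0, 255), rng.randint(0, 255))
        if fails(case):
            return case
    return None


def shrink(case):
    """Greedily reduce each input while the property keeps failing."""
    improved = True
    while improved:
        improved = False
        for i in range(len(case)):
            for candidate in (case[i] // 2, case[i] - 1):
                if 0 <= candidate < case[i]:
                    trial = case[:i] + (candidate,) + case[i + 1:]
                    if fails(trial):
                        case = trial
                        improved = True
                        break
    return case


rng = random.Random(0)
found = find_failure(rng)
minimal = shrink(found)
```

Whatever failing pair the random search lands on, the shrunk pair sums to exactly 256, the smallest sum that overflows eight bits, which makes the root cause much easier to read off.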

## PyH2 Creatively Adopts PBT for SW to Test HW

- PyH2 combines PyMTL3, a unified hardware modeling framework, with Hypothesis, a PBT framework for Python software, to create a property-based testing framework for hardware
- PyH2 leverages PBT to explore not just the input values for a HW design but also the parameter values used to configure a HW design generator
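Exploring generator parameters alongside input values can be sketched as follows. This is a standard-library illustration of the idea, not PyH2 itself: `make_ring_hops` is a hypothetical design generator for a ring network's clockwise hop count, seeded with a bug that only appears for some parameter values.

```python
import random


def make_ring_hops(nterminals):
    """Hypothetical design generator: hop-count logic for an n-terminal ring."""
    def hops(src, dst):
        # Bug: masking equals modulo only when nterminals is a power of two
        return (dst - src) & (nterminals - 1)
    return hops


def reference_hops(nterminals, src, dst):
    return (dst - src) % nterminals


def random_design_test(rng, trials=500):
    """Randomize the design parameter and the inputs together."""
    for _ in range(trials):
        n = rng.randint(2, 16)      # design parameter
        src = rng.randrange(n)      # input values
        dst = rng.randrange(n)
        dut = make_ring_hops(n)
        if dut(src, dst) != reference_hops(n, src, dst):
            return n, src, dst
    return None


failing = random_design_test(random.Random(0))
```

A test suite that only instantiated 2x2 or 4-terminal rings would never see this bug, which is why the parameter space belongs in the search alongside the packet values.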

|                                           | Complete Random Testing | Iterative Deepened Testing | PyH2 |
|-------------------------------------------|-------------------------|----------------------------|------|
| Small number of test cases to find bug    | ✓                       | ✗                          | ✓    |
| Small number of transactions in bug trace | ✗                       | ✓                          | ✓    |
| Simple transactions in bug trace          | ✗                       | ✓                          | ✓    |
| Simple design instance for bug trace      | ✗                       | ✓                          | ✓    |


## **PyMTL3** Publications

- Shunning Jiang, et al., "Mamba: Closing the Performance Gap in Productive Hardware Development Frameworks." 55th ACM/IEEE Design Automation Conf. (DAC), June 2018.
- Shunning Jiang, Peitian Pan, Yanghui Ou, et al., "PyMTL3: A Python Framework for Open-Source Hardware Modeling, Generation, Simulation, and Verification." IEEE Micro, 40(4):58-66, Jul/Aug. 2020.
- Shunning Jiang\*, Yanghui Ou\*, Peitian Pan, et al., "PyH2: Using PyMTL3 to Create Productive and Open-Source Hardware Testing Methodologies." IEEE Design & Test, 38(2):53-61, Apr. 2021.
- Shunning Jiang, Yanghui Ou, Peitian Pan, et al., "UMOC: Unified Modular Ordering Constraints to Unify Cycle- and Register-Transfer-Level Modeling." 58th ACM/IEEE Design Automation Conf. (DAC), Dec. 2021.

Figure: first page of the IEEE Micro theme article (Agile and Open-Source Hardware) "PyMTL3: A Python Framework for Open-Source Hardware Modeling, Generation, Simulation, and Verification" by Shunning Jiang, Peitian Pan, Yanghui Ou, and Christopher Batten, Cornell University. Digital Object Identifier 10.1109/MM.2020.2997638; date of publication 25 May 2020.


# A Call to Action



- Open-source hardware needs developers who
  - ... are idealistic
  - ... have lots of free time
  - ... will work for free
- Who might that be?

#### Students!

Academics have a practical and ethical motivation for using, developing, and promoting open-source electronic design automation tools and open-source hardware designs




This work was supported in part by NSF XPS Award #1337240, NSF CRI Award #1512937, NSF SHF Award #1527065, AFOSR YIP Award #FA9550-15-1-0194, DARPA Young Faculty Award #N66001-12-1-4239, DARPA POSH Award #FA8650-18-2-7852, DARPA SDH Award #FA8650-18-2-7863, a Xilinx University Program industry gift, and the Center for Applications Driving Architectures (ADA), one of six centers of JUMP, a Semiconductor Research Corporation program co-sponsored by DARPA, and equipment, tool, and/or physical IP donations from Intel, NVIDIA, Synopsys, and ARM.

Thanks to the core PyMTL developers, Derek Lockhart, Shunning Jiang, Peitian Pan, Yanghui Ou, along with Khalid Al-Hawaj, Moyang Wang, Tuan Ta, Ji Kim, Shreesha Srinath, Berkin Ilbeyi, Dilan Lakhani, Jack Brzozowski, Kyle Infantino, Yixiao Zhang, Jacob Glueck, Aaron Wisner, Gary Zibrat, Christopher Torng, Cheng Tan, Raymond Yang, Kaishuo Cheng, Carl Friedrich Bolz, David MacIver, and Zac Hatfield-Dodds for their help designing, developing, testing, and using PyMTL.

The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation thereon. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the views of any funding agency.