ECE 4750 Testing and Debugging Strategy
==========================================================================

This document discusses the testing and debugging strategy we will be using in the programming assignments for ECE 4750. Testing is the process of checking whether a program behaves correctly. Testing a large design can be hard because bugs may appear anywhere in the design, and multiple bugs may interact. Good practice is to test small parts of the design individually before testing the entire design, since this makes it much easier to find and fix bugs. We will use both unit testing and integration testing.

Unit testing is the process of individually testing a small part or unit of a design, typically a single module. A unit test is typically conducted by creating a testbench, a.k.a. test harness, which is a separate program whose sole purpose is to check that a module returns correct output values for a variety of input values. Each unique set of input values is known as a test vector. Manually examining printed output is cumbersome and error prone; a better test harness only prints a message for incorrect output (see the first sketch below). Integration testing involves testing the composition of various modules and should only be attempted after we have unit tested those modules.

We will be using a mix of black-box and white-box testing. Black-box testing is where your test cases only test the _interface_ of your design. Black-box testing does not _directly_ test any of the internals within your design, although it obviously tests the internals _indirectly_. White-box testing is where your test cases directly test the internals, perhaps by poking into the design using hierarchical signal references. White-box testing is fragile, so we won't really be using it in this course. We also might do what I call "gray-box" testing. This is where you choose specific test vectors that are carefully designed to trigger complex behavior in a specific implementation. Since gray-box tests can be applied to any implementation, they are like black-box tests. Since they attempt to trigger complex implementation-specific behavior, they are like white-box tests.

We will primarily be using directed testing and random testing. Directed testing is where the designer explicitly specifies the inputs and the correct outputs. Directed tests are carefully crafted to achieve good coverage of many different hardware behaviors. Random testing is where the designer randomly generates inputs and then verifies that the module produces the right output. This of course raises the question, "How do we know what the right output is if we are randomly generating the input?" There are two approaches. First, the designer can assert that a property holds on the output. For example, if the module is meant to sort a sequence of values, the random test can assert that the final values are indeed sorted. Second, the designer can use a golden reference implementation. For example, the designer might compare against a functional-level model of the module (see the second sketch below).

We will also use value and delay testing. Value testing focuses on applying different input values and checking the corresponding output values. Delay testing focuses more on changing the delays between _when_ input values are provided and possibly also changing the delays between _when_ output values are accepted. Delay testing is particularly important when working with latency-insensitive stream interfaces to ensure the corresponding val/rdy micro-protocol is implemented correctly. A variant of delay testing involves verifying that the delay of a module is as expected. So for example, we might assert that the module takes no longer than a specific number of cycles to execute a specific transaction (see the third sketch below).
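To make the discussion of test harnesses and test vectors concrete, here is a minimal sketch of a directed unit test written for pytest. The adder function is a hypothetical pure-Python stand-in for a real hardware module; only the structure of the harness matters here.

    # Directed unit test sketch. adder_32b is a hypothetical stand-in
    # for the unit under test.

    def adder_32b( a, b ):
      return ( a + b ) & 0xffffffff

    def test_adder_directed():
      # Each tuple is one test vector: ( in0, in1, expected_out ).
      test_vectors = [
        ( 0x00000000, 0x00000000, 0x00000000 ),
        ( 0x00000001, 0x00000001, 0x00000002 ),
        ( 0xffffffff, 0x00000001, 0x00000000 ), # overflow wraps around
      ]
      for in0, in1, expected in test_vectors:
        actual = adder_32b( in0, in1 )
        # Stay silent for correct outputs; only report incorrect outputs.
        assert actual == expected, \
          f"in0={in0:#010x} in1={in1:#010x}: got {actual:#010x}, expected {expected:#010x}"

Notice that the harness produces no output at all for passing test vectors, which keeps the signal-to-noise ratio high as the number of test vectors grows.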
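Here is a similar sketch of the two random-testing approaches described above, using the sorting example. The insertion sort is again a hypothetical stand-in for the design under test.

    import random

    # Hypothetical stand-in for the design under test: an insertion sort.
    def sort_unit( values ):
      out = []
      for v in values:
        i = 0
        while i < len(out) and out[i] <= v:
          i += 1
        out.insert( i, v )
      return out

    def test_sort_random():
      random.seed( 0x4750 ) # fix the seed so failures are reproducible
      for _ in range( 100 ):
        n      = random.randint( 1, 16 )
        values = [ random.randint( 0, 2**32-1 ) for _ in range( n ) ]
        result = sort_unit( values )
        # Approach 1: assert that a property holds on the output.
        assert all( result[i] <= result[i+1] for i in range( len(result)-1 ) )
        # Approach 2: compare against a golden reference implementation
        # (here Python's built-in sort plays the functional-level model).
        assert result == sorted( values )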
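Finally, here is a sketch of delay testing on a val/rdy stream interface. The cycle-level queue model is a hypothetical stand-in for a real latency-insensitive design; the test replays the same input messages while sweeping the source and sink delays, and it ends with the delay-bound variant that asserts an upper bound on the total number of cycles.

    # Hypothetical cycle-level model of a two-entry queue with val/rdy
    # stream interfaces, standing in for a real latency-insensitive design.
    class QueueModel:

      def __init__( self, capacity=2 ):
        self.buf      = []
        self.capacity = capacity

      def tick( self, in_val, in_msg, out_rdy ):
        # Advance one cycle; return ( in_rdy, out_val, out_msg ).
        in_rdy  = len(self.buf) < self.capacity
        out_val = len(self.buf) > 0
        out_msg = self.buf[0] if out_val else None
        if out_val and out_rdy: # transaction at the output interface
          self.buf.pop(0)
        if in_val and in_rdy:   # transaction at the input interface
          self.buf.append( in_msg )
        return in_rdy, out_val, out_msg

    def run_delay_test( src_delay, sink_delay, msgs, max_cycles=1000 ):
      dut   = QueueModel()
      sent  = 0
      recvd = []
      src_wait, sink_wait = src_delay, sink_delay
      for cycle in range( max_cycles ):
        if len(recvd) == len(msgs):
          return recvd, cycle
        in_val  = ( sent < len(msgs) ) and ( src_wait == 0 )
        out_rdy = ( sink_wait == 0 )
        in_msg  = msgs[sent] if in_val else None
        in_rdy, out_val, out_msg = dut.tick( in_val, in_msg, out_rdy )
        if in_val and in_rdy:   # source delivered a message this cycle
          sent    += 1
          src_wait = src_delay
        elif src_wait > 0:
          src_wait -= 1
        if out_val and out_rdy: # sink accepted a message this cycle
          recvd.append( out_msg )
          sink_wait = sink_delay
        elif sink_wait > 0:
          sink_wait -= 1
      raise AssertionError( "timed out waiting for all messages" )

    def test_queue_delays():
      msgs = list( range(8) )
      # Delay testing: the same values must come through intact and in
      # order no matter how the source and sink delays are varied.
      for src_delay in ( 0, 1, 3 ):
        for sink_delay in ( 0, 1, 3 ):
          recvd, _ = run_delay_test( src_delay, sink_delay, msgs )
          assert recvd == msgs
      # Delay-bound variant: with no extra delays, all eight messages
      # should move through the queue in a small number of cycles.
      recvd, cycles = run_delay_test( 0, 0, msgs )
      assert cycles <= len(msgs) + 2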
Note that we usually mix and match these different kinds of testing. So we can use unit, black-box, directed, value testing or integration, black-box, random, delay testing. Note that ad-hoc testing should _not_ be an important part of your testing strategy. It is neither automatic nor systematic.

Debugging Process
--------------------------------------------------------------------------

Here is our recommended systematic debugging process. Use this process after you have fixed any Verilog syntax errors and you are now getting some kind of incorrect result. REMEMBER: You must use --tb=short or --tb=long to see the error message when using pytest!

Step 1: Run all of the tests to get a high-level view of which test cases are passing and which test cases are failing.

Step 2: Pick one failing test script to focus on, and run just that test script in isolation, maybe with the --verbose flag to get a list of the test cases. Pick the most basic test script that is failing. Always focus first on any test cases that are failing on ProcFL, since a failure on ProcFL means the test case itself is bad.

Step 3: Pick one failing test case to focus on, and run just that test case in isolation using -k (or maybe -x). Pick the most basic test case that is failing. Use -s to see the line trace. Use --tb=long or --tb=short to see the error message. (Example invocations appear at the end of this document.)

Step 4: Look at the line trace and the error message. Determine what the observable error is. Often this will be a stream sink error.

Step 5: Look at the actual test case (in lab 2 this means look at the assembly sequence). Make absolutely sure you know what the test case is testing and that the test case is valid. You have no hope of debugging your design if you do not understand what correct execution you expect to happen! You might want to run the test case on ProcFL to verify that it actually is a valid test case before continuing, although hopefully you spotted any ProcFL failures in step 1.

Step 6: Work _backwards_ from the observable error on the line trace, trying to see what is going wrong from just the line trace. NOTE: you can see the instruction memory request and response _and_ the data memory request and response in the line trace -- you can often spot errors for LW or SW right from the line trace by looking at the data memory request and response messages (incorrect message type? incorrect address? incorrect data being read/written?). Similarly, you can often spot errors in instruction fetch from the line trace. You can also often see errors in control flow (are the wrong instructions being executed or squashed?) or errors in stalling/bypassing logic (is an instruction not stalling when it should?) right from the line trace. You need to work backwards from the observable error to narrow your focus on what part of the design might have a bug (the datapath? the control unit?). Try to narrow your focus to a specific cycle where something is going wrong.

Step 7: Based on the narrowed focus from step 6, make a hypothesis about what might be wrong. Take a quick look at the corresponding code. Check for errors in bitwidth, in signal naming, or in connectivity. If you spot something obvious, skip to step 10. If you cannot spot anything obvious, go to the next step.

Step 8: Use the --dump-vcd option to generate a VCD file. Open the VCD file in gtkwave.
Add the clock, and maybe add the inst_D, inst_X, inst_M, and inst_W fields to the waveform view. Use the narrowed focus from step 6 and the hypothesis from step 7 to zoom in on a specific cycle and a specific part of the design where you can clearly see a specific signal that is incorrect.

Step 9: Work _backwards_ from the signal that is incorrect. Work backwards in the datapath -- keep working backwards component by component. For each component, look at the inputs (all inputs, both data inputs and control signals) and look at the outputs (all outputs, both data outputs and status signals). Check for one of three things: (1) are the inputs incorrect and the outputs incorrect for this component? If so, you need to continue working backwards -- and if the incorrect input is a control signal, then you need to start working backwards into the control unit. (2) Are the inputs correct and the outputs incorrect for this component? If so, then you have narrowed the bug to be inside the component (maybe it is a bug in the ALU? maybe it is a bug in some other module?). (3) Are the inputs correct and the outputs correct for this component? Then you have gone backwards too far, and you need to go forward in the design again to find a signal that is incorrect.

Step 10: Once you find a bug, make a hypothesis about what should happen if you fix the bug. Your hypothesis should not just be "fixing the bug will make the test pass." It should instead be something like "fixing this bug should make this specific signal be 1 instead of 0" or "fixing this bug should make this specific instruction in the line trace stall." Fix the bug and see what happens by looking at the line trace and/or waveform. Don't just see if the design passes the test -- literally check the line trace and/or waveform and see if the behavior confirms your hypothesis. One of four things will happen: (1) the test will pass and the line trace/waveform behavior will match your hypothesis -- bug fixed! (2) The test will fail and the line trace/waveform will not match your hypothesis -- you need to keep working: your bug fix did not do what it was supposed to, and it did not fix the error, so undo the bug fix and go back to step 6. (3) The test will fail but the line trace/waveform _will_ match your hypothesis -- your bug fix did what you expected, but there might be another bug still causing trouble, so you need to keep working; go back to step 6. (4) The test will pass and the line trace/waveform _will not_ match your hypothesis -- you need to keep working: your bug fix did not do what you thought it would even though it caused the test to pass, so there might be something subtle going on; go back to step 6 to figure out why the bug fix did not do what you thought it would.

Note a couple of things about this systematic 10-step process. First, it is a systematic process ... it does not involve randomly trying things. Second, the process uses all of the tools at your disposal: output from pytest, tracebacks, line tracing, and VCD waveforms. You really need to use all of these tools. If you use line tracing but never use VCD waveforms, or you use VCD waveforms but never use line tracing, then you are putting yourself at a disadvantage. Third, the process requires you to think critically and make a hypothesis about what should change -- do not just change something, pass the test, and move on -- change something and see if the line trace and waveforms change in the way you expect.
Otherwise you can actually introduce more bugs even though you think you are fixing things.
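As a concrete illustration, the invocations for steps 1 through 3 and step 8 might look something like the following. The test script name and test case name are hypothetical stand-ins; the flags are the ones discussed above.

    % pytest                                                    # step 1
    % pytest ProcBaseRTL_test.py --verbose                      # step 2
    % pytest ProcBaseRTL_test.py -k test_add -x -s --tb=short   # step 3
    % pytest ProcBaseRTL_test.py -k test_add --dump-vcd         # step 8
    % gtkwave <path-to-generated-vcd-file>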