Are Cycle Accurate Simulations a Waste of Time?
This page contains information on the paper "Are Cycle Accurate Simulations a Waste of Time?", which was presented at the 7th Annual Workshop on Duplicating, Deconstructing, and Debunking (WDDD) on 22 June 2008 in Beijing, China.
Here is the latest version of the paper, updated to have complete SESC results:
- Are Cycle Accurate Simulations a Waste of Time? (wddd08_sim.pdf) by V. M. Weaver and S. A. McKee
Here is the version with incomplete results that was presented at the Workshop:
- Workshop version of the paper, incomplete SESC results (wdd08_workshop.pdf)
Here is the Linux kernel module used to put the R12K machine into the various branch prediction modes:
The perfmon2 tool and patches used for gathering performance counter data can be found here:
Here are the patches needed to generate MIPS traces using Qemu.
They apply against an SVN checkout from 23 April 2008. Some of the patches
have since been merged upstream, and some are no longer necessary. You are
unlikely to have much luck applying them against newer versions of Qemu;
the TCG merge changed the codebase substantially.
Qemu was run with the patches applied and with the command line
"-s 8048048 -cpu R3000" (the R3000 has nothing to do with the type
of CPU being emulated; it was a temporary fix to make sure
the right kind of FPU emulation was used).
The ftruncate part of the patch is needed for sixtrack to finish.
Here is the custom branch simulator used in conjunction with Qemu:
The Dinero IV cache simulator we used was written by Edler and Hill. The options we used for simulation are: "-informat b -l1-isize 32k -l1-dsize 32k -l1-ibsize 64 -l1-dbsize 32 -l2-usize 2M -l2-ubsize 128 -l1-dassoc 2 -l1-iassoc 2 -l2-uassoc 2". The simulator can be obtained here:
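The option string above is easier to audit when the cache hierarchy is written out as data. The following is a minimal Python sketch, not part of the original tool chain, that rebuilds the same command line from a table of parameters. The binary name `dineroIV` and the helper `dinero_command` are assumptions made for illustration; the option names and values are copied from the string quoted above.

```python
import shlex

# Cache parameters copied from the option string quoted above.
# Keys are the Dinero IV option names without the leading dash.
CACHE_OPTIONS = {
    "l1-isize": "32k",   # L1 instruction cache size
    "l1-dsize": "32k",   # L1 data cache size
    "l1-ibsize": "64",   # L1 instruction cache block size (bytes)
    "l1-dbsize": "32",   # L1 data cache block size (bytes)
    "l2-usize": "2M",    # unified L2 cache size
    "l2-ubsize": "128",  # unified L2 block size (bytes)
    "l1-dassoc": "2",    # L1 data cache associativity
    "l1-iassoc": "2",    # L1 instruction cache associativity
    "l2-uassoc": "2",    # unified L2 associativity
}


def dinero_command(binary="dineroIV", informat="b", options=CACHE_OPTIONS):
    """Return the argument list for a Dinero IV run (hypothetical helper)."""
    argv = [binary, "-informat", informat]
    for name, value in options.items():
        argv += [f"-{name}", value]
    return argv


if __name__ == "__main__":
    # Print the command line; feeding it a trace is left to the caller.
    print(shlex.join(dinero_command()))
```

The sketch only assembles the argument list. How the trace reaches the simulator (for example, on standard input) depends on the local setup and is not shown.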
Here are the final SESC configuration files used in the paper (SESC has never been formally validated):
- With 2-bit branch predictor: r12k.2bit.conf
- With Always-taken branch predictor: r12k.taken.conf
- With Static branch predictor: r12k.static.conf
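For readers unfamiliar with the predictor types named above, here is a minimal Python sketch of how a 2-bit and an always-taken predictor can be compared on a branch trace. It is an illustration only: it is not the custom branch simulator used with Qemu, and the table size, the initial counter state, the PC indexing, and all class and function names are assumptions. The static predictor is left out because this page does not say which static heuristic was used.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters indexed by low PC bits."""

    def __init__(self, entries=1024):  # hypothetical table size
        self.entries = entries
        # Counter values 0..3; start at 1, "weakly not taken" (assumption).
        self.table = [1] * entries

    def _index(self, pc):
        # MIPS instructions are 4 bytes, so drop the two low bits.
        return (pc >> 2) % self.entries

    def predict(self, pc):
        # Counter values 2 and 3 predict taken.
        return self.table[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


class AlwaysTakenPredictor:
    """Predicts every branch as taken; keeps no state."""

    def predict(self, pc):
        return True

    def update(self, pc, taken):
        pass


def mispredictions(predictor, trace):
    """Count mispredictions over an iterable of (pc, taken) pairs."""
    misses = 0
    for pc, taken in trace:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses


if __name__ == "__main__":
    # Synthetic loop branch: taken nine times, then falls through; run twice.
    loop = [(0x400100, True)] * 9 + [(0x400100, False)]
    trace = loop * 2
    print("2-bit:", mispredictions(TwoBitPredictor(), trace))
    print("always taken:", mispredictions(AlwaysTakenPredictor(), trace))
```

The trace in the example is synthetic and stands in for a real branch trace; it says nothing about the benchmark results reported in the paper.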
Here are the relative timings of each precompiled "SPEC" benchmark using the three different methods. This will be updated as jobs complete (with our original configuration, some of the benchmarks did not finish):