|
1 | | -# Macsim |
| 1 | +# MacSim |
| 2 | + |
2 | 3 | ## Introduction |
3 | 4 |
|
4 | | -* MacSim is a heterogeneous architecture timing model simulator that is |
5 | | - developed from Georgia Institute of Technology. |
| 5 | +MacSim is a trace-based, cycle-level GPGPU simulator developed by [HPArch](https://sites.gatech.edu/hparch/) at the Georgia Institute of Technology. |
| 6 | + |
6 | 7 | * It simulates x86, ARM64, NVIDIA PTX and Intel GEN GPU instructions and can be configured as |
7 | | - either a trace driven or execution-drive cycle level simulator. It models |
8 | | - detailed mico-architectural behaviors, including pipeline stages, |
| 8 | + either a trace-driven or execution-driven cycle-level simulator. It models |
| 9 | + detailed micro-architectural behaviors, including pipeline stages, |
9 | 10 | multi-threading, and memory systems. |
10 | 11 | * MacSim is capable of simulating a variety of architectures, such as Intel's |
11 | 12 | Sandy Bridge, Skylake (both CPUs and GPUs) and NVIDIA's Fermi. It can simulate homogeneous ISA multicore |
|
14 | 15 | cores) and SMT or MT architectures as well. |
15 | 16 | * Currently interconnection network model (based on IRIS) and power model (based |
16 | 17 | on McPAT) are connected. |
17 | | -* MacSim is also one of the components of SST, so multiple MacSim simulatore |
| 18 | +* MacSim is also one of the components of SST, so multiple MacSim simulators |
18 | 19 | can run concurrently. |
19 | 20 | * The project has been supported by Intel, NSF, Sandia National Lab. |
20 | 21 |
|
| 22 | +## Table of Contents |
| 23 | +- [Note](#note) |
| 24 | +- [Intel GEN GPU Architecture](#intel-gen-gpu-architecture) |
| 25 | +- [Documentation](#documentation) |
| 26 | +- [Installation](#installation) |
| 27 | +- [Quick Start](#quick-start) |
| 28 | +- [Downloading Traces](#downloading-traces) |
| 29 | +- [Generating Your Own Traces](#generating-your-own-traces) |
| 30 | +- [Known Bugs](#known-bugs) |
| 31 | +- [People](#people) |
| 32 | +- [Q & A](#q--a) |
| 33 | +- [Tutorial](#tutorial) |
| 34 | +- [SST+MacSim](#sstmacsim) |
| 35 | + |
21 | 36 | ## Note |
22 | 37 |
|
23 | 38 | * If you're interested in the Intel's integrated GPU model in MacSim, please refer to [intel_gpu](https://github.com/gthparch/macsim/tree/intel_gpu) branch. |
|
38 | 53 |
|
39 | 54 | Please see [MacSim documentation file](https://github.com/gthparch/macsim/blob/master/doc/macsim.pdf) for more detailed descriptions. |
40 | 55 |
|
41 | | - |
42 | | -## Download |
43 | | - |
44 | | -* You can download the latest copy from our git repository. |
45 | | - |
46 | | -``` |
47 | | -git clone -b intel_gpu https://github.com/gthparch/macsim.git |
48 | | -
|
49 | | -download traces |
50 | | -/macsim/tools/download_trace_files.py |
51 | | -``` |
52 | | -## Build |
| 56 | +## Installation |
53 | 57 |
|
54 | 58 | ### Prerequisites |
55 | 59 |
|
56 | | -- **zlib** (development library) must be installed on your system. |
| 60 | +- **zlib** (development library) |
57 | 61 | ```bash |
58 | 62 | # Ubuntu/Debian |
59 | 63 | sudo apt install zlib1g-dev |
60 | 64 | # RHEL/CentOS/Fedora |
61 | 65 | sudo dnf install zlib-devel |
62 | 66 | ``` |
63 | 67 |
|
64 | | -Set up a Python virtual environment and install SCons: |
| 68 | +- **Python >= 3.11** and **SCons** (build tool) |
| 69 | + ```bash |
| 70 | + uv venv |
| 71 | + uv pip install scons |
| 72 | + ``` |
| 73 | + |
| 74 | + Optionally, activate the virtual environment so you can omit `uv run`: |
| 75 | + ```bash |
| 76 | + source .venv/bin/activate |
| 77 | + ``` |
| 78 | + |
| 79 | +### Clone and Build |
65 | 80 |
|
66 | 81 | ```bash |
67 | | -uv venv |
68 | | -uv pip install scons |
| 82 | +git clone https://github.com/gthparch/macsim.git --recursive |
| 83 | +cd macsim |
| 84 | +./build.py --ramulator -j 32 |
| 85 | + |
| 86 | +# Or without activating the virtual environment: |
| 87 | +uv run ./build.py --ramulator -j 32 |
69 | 88 | ``` |
70 | 89 |
|
71 | | -Optionally, activate the virtual environment so you can omit `uv run`: |
| 90 | +For more build options, see `./build.py --help`. |
| 91 | + |
| 92 | +## Quick Start |
| 93 | + |
| 94 | +This section walks you through downloading a trace, setting up the simulation, and running it. |
| 95 | + |
| 96 | +### 1. Download a Sample Trace |
72 | 97 |
|
73 | 98 | ```bash |
74 | | -source .venv/bin/activate |
| 99 | +uv pip install gdown |
| 100 | +gdown -O macsim_traces.tar.gz 1rpAgIMGJnrnXwDSiaM3S7hBysFoVhyO1 |
| 101 | +tar -xzf macsim_traces.tar.gz |
| 102 | +rm macsim_traces.tar.gz |
75 | 103 | ``` |
76 | 104 |
|
77 | | -### Building |
| 105 | +This will extract sample traces from the [Rodinia benchmark suite](https://github.com/yuhc/gpu-rodinia) into a `macsim_traces/` directory. |
| 106 | + |
| 107 | +### 2. Set Up a Run Directory |
| 108 | + |
| 109 | +You need three files in the same directory to run a simulation: |
| 110 | +- `macsim` — the binary executable |
| 111 | +- `params.in` — GPU configuration |
| 112 | +- `trace_file_list` — list of paths to GPU traces |
| 113 | + |
| 114 | +Copy them from the build output: |
78 | 115 |
|
79 | 116 | ```bash |
80 | | -# With virtual environment activated: |
81 | | -./build.py --ramulator |
| 117 | +mkdir run |
| 118 | +cp bin/macsim bin/params.in bin/trace_file_list run/ |
| 119 | +cd run |
| 120 | +``` |
82 | 121 |
|
83 | | -# Or without activating: |
84 | | -uv run ./build.py --ramulator |
| 122 | +### 3. Set Up the Trace Path |
| 123 | + |
| 124 | +Edit `trace_file_list`. The first line is the number of traces, and the second line is the path to the trace: |
| 125 | + |
| 126 | +``` |
| 127 | +1 |
| 128 | +/absolute/path/to/macsim_traces/hotspot/r512h2i2/kernel_config.txt |
85 | 129 | ``` |
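
The `trace_file_list` layout described above (a count line, then the trace path) can also be generated by a small script. This is a minimal sketch, not part of MacSim: the helper names `format_trace_file_list` and `write_trace_file_list` are hypothetical, and listing several traces one path per line is an assumption extrapolated from the single-trace example.

```python
from pathlib import Path


def format_trace_file_list(trace_configs):
    """Return file contents: the trace count, then one absolute path per line.

    Assumption: multiple traces are listed one per line after the count.
    """
    paths = [str(Path(p).resolve()) for p in trace_configs]
    return "\n".join([str(len(paths)), *paths]) + "\n"


def write_trace_file_list(trace_configs, out_path="trace_file_list"):
    """Write the formatted list next to the macsim binary (hypothetical helper)."""
    Path(out_path).write_text(format_trace_file_list(trace_configs))
```

Calling `write_trace_file_list(["../macsim_traces/hotspot/r512h2i2/kernel_config.txt"])` from the run directory would resolve the relative path to an absolute one before writing it.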
86 | 130 |
|
87 | | -For more build options, see `./build.py --help` or [INSTALL](INSTALL). |
| 131 | +### 4. Run |
88 | 132 |
|
89 | | -## People |
| 133 | +```bash |
| 134 | +./macsim |
| 135 | +``` |
| 136 | + |
| 137 | +Simulation results will appear in the current directory. For example, check `general.stat.out` for the total cycle count: |
| 138 | + |
| 139 | +```bash |
| 140 | +grep CYC_COUNT_TOT general.stat.out |
| 141 | +``` |
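
For scripted experiments, the same lookup can be done in Python. The sketch below assumes each line of `general.stat.out` starts with the statistic name followed by whitespace-separated fields, the first integer of which is the value. That layout is an assumption, so check it against your own output file. `read_stat` is a hypothetical helper name.

```python
def read_stat(stat_text, name):
    """Return the first integer field on the line that starts with `name`.

    Assumes 'NAME value ...' lines; returns None if the name is not found.
    """
    for line in stat_text.splitlines():
        fields = line.split()
        if not fields or fields[0] != name:
            continue
        for field in fields[1:]:
            try:
                return int(field)
            except ValueError:
                continue  # skip non-integer fields on the matching line
    return None
```

Typical use would be `read_stat(open("general.stat.out").read(), "CYC_COUNT_TOT")` after a run has finished.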
90 | 142 |
|
91 | | -* Prof. Hyesoon Kim (Project Leader) at Georgia Tech |
92 | | -Hparch research group |
93 | | -(http://hparch.gatech.edu/people.hparch) |
| 143 | +> **Note:** The parameter file must be named `params.in`. The macsim binary looks for this exact filename in the current directory. |
94 | 144 |
|
| 145 | +## Downloading Traces |
95 | 146 |
|
| 147 | +### Publicly Available Traces |
96 | 148 |
|
97 | | -## Q & A |
| 149 | +| Dataset | Download | |
| 150 | +|---------|----------| |
| 151 | +| Rodinia | [Download](https://www.dropbox.com/scl/fi/otaiy3gnmkcrexy66hkez/rodinia_nvbit.tar.gz?rlkey=w2pa56a0ik42zydl0incogc99&st=y3ki6xyy&dl=0) |
| 152 | +| PyTorch | [Download](https://www.dropbox.com/scl/fi/qyqk9yuxaut0f9490k5n3/pytorch_nvbit.tar.gz?rlkey=dgq53t37k38izawacgxdkqxsw&st=fbvchdmw&dl=0) |
| 153 | +| YOLOPv2 | [Download](https://www.dropbox.com/scl/fi/srmp7cp2uw6lup34j4keg/yolopv2.tar.gz?rlkey=s5pg7dhdub7jofit3omy446n3&st=d6dfq6uy&dl=0) | |
| 154 | +| GPT2 | [Download](https://www.dropbox.com/scl/fi/qn72hfwyeo5qq120kyade/gpt2_nvbit.tar.gz?rlkey=pal8q77bwf4iarypfts2osus3&st=cmjslv8o&dl=0) | |
| 155 | +| GEMMA | [Download](https://www.dropbox.com/scl/fi/ewcyrogwv7odc6soi9v6n/gemma_nvbit.tar.gz?rlkey=arifvlad3kj9tcw6ogze7n04m&st=66fbac0t&dl=0) | |
98 | 156 |
|
99 | | -If you have a question, please use github issue ticket. |
| 157 | +## Generating Your Own Traces |
| 158 | + |
| 159 | +> **Warning:** The trace generation tool is experimental — use at your own risk. |
| 160 | +
|
| 161 | +To generate traces for your own CUDA workloads, use the [MacSim Tracer](https://github.com/gthparch/Macsim_tracer). |
| 162 | + |
| 163 | +Simply prepend `CUDA_INJECTION64_PATH` to your original command. For example: |
| 164 | + |
| 165 | +```bash |
| 166 | +CUDA_INJECTION64_PATH=/path/to/main.so python3 your_cuda_program.py |
| 167 | +``` |
100 | 168 |
|
| 169 | +Available environment variables: |
| 170 | + |
| 171 | +| Variable | Description | Default | |
| 172 | +|----------|-------------|---------| |
| 173 | +| `TRACE_PATH` | Path to save trace files | `./` | |
| 174 | +| `KERNEL_BEGIN` | First kernel to trace | `0` | |
| 175 | +| `KERNEL_END` | Last kernel to trace | `UINT32_MAX` | |
| 176 | +| `INSTR_BEGIN` | First instruction to trace per kernel | `0` | |
| 177 | +| `INSTR_END` | Last instruction to trace per kernel | `UINT32_MAX` | |
| 178 | +| `COMPRESSOR_PATH` | Path to the compressor binary | (built with tracer) | |
| 179 | +| `DEBUG_TRACE` | Generate human-readable debug traces | `0` | |
| 180 | +| `OVERWRITE` | Overwrite existing traces | `0` | |
| 181 | +| `TOOL_VERBOSE` | Enable verbose output | `0` | |
| 182 | + |
| 183 | +See the [MacSim Tracer README](https://github.com/gthparch/Macsim_tracer) for full installation and usage instructions. |
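
When tracing from a script, the variables in the table above can be collected into an environment dictionary and passed to the child process. This is a sketch under assumptions: `tracer_env` is a hypothetical helper, `/path/to/main.so` stands for wherever your tracer library was built, and only the variable names come from the table.

```python
import os


def tracer_env(injection_lib, trace_path="./", kernel_begin=0,
               kernel_end=None, base_env=None):
    """Build an environment that enables the tracer for a child process."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_INJECTION64_PATH"] = injection_lib  # path to the tracer's main.so
    env["TRACE_PATH"] = trace_path
    env["KERNEL_BEGIN"] = str(kernel_begin)
    if kernel_end is not None:
        # Leave KERNEL_END unset to keep the tracer's documented default.
        env["KERNEL_END"] = str(kernel_end)
    return env


# Example launch (not executed here):
#   import subprocess
#   subprocess.run(["python3", "your_cuda_program.py"],
#                  env=tracer_env("/path/to/main.so", trace_path="./traces"),
#                  check=True)
```

Environment values must be strings, which is why the numeric kernel bounds are converted before being stored.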
| 184 | + |
| 185 | +## Known Bugs |
| 186 | + |
| 187 | +1. **`src/memory.cc:1043: ASSERT FAILED`** — Happens with FasterTransformer traces + too many cores (40+). **Solution:** Reduce the number of cores. |
| 188 | + |
| 189 | +2. **`src/factory_class.cc:77: ASSERT FAILED`** — Happens when `params.in` file is missing or has a wrong name. **Solution:** Use `params.in` as the config file name. |
| 190 | + |
| 191 | +3. **`src/process_manager.cc:826: ASSERT FAILED ... error opening trace file`** — Too many trace files open simultaneously. **Solution:** Add `ulimit -n 16384` to your `~/.bashrc`. |
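
Before a long run, the open-file limit behind bug 3 can be checked from Python with the standard `resource` module (Unix only). This is a minimal sketch: the helper names are hypothetical, and the 16384 threshold simply mirrors the `ulimit -n` value suggested above.

```python
import resource


def open_file_limit():
    """Return the current soft limit on open file descriptors."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


def limit_is_sufficient(required=16384, soft=None):
    """Return True if the soft limit is unlimited or at least `required`."""
    if soft is None:
        soft = open_file_limit()
    return soft == resource.RLIM_INFINITY or soft >= required
```

If `limit_is_sufficient()` returns `False`, raise the limit in the shell that launches `./macsim` before starting the simulation.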
| 192 | + |
| 193 | +## People |
| 194 | + |
| 195 | +* Prof. Hyesoon Kim (Project Leader) at Georgia Tech |
| 196 | +Hparch research group |
| 197 | +(http://hparch.gatech.edu/people.hparch) |
| 198 | + |
| 199 | +## Q & A |
| 200 | + |
| 201 | +If you have a question, please open a GitHub issue. |
101 | 202 |
|
102 | 203 | ## Tutorial |
103 | 204 |
|
104 | 205 | * We had a tutorial in HPCA-2012. Please visit [here](http://comparch.gatech.edu/hparch/OcelotMacsim_tutorial.html) for the slides. |
105 | 206 | * We had a tutorial in ISCA-2012, Please visit [here](http://comparch.gatech.edu/hparch/isca12_gt.html) for the slides. |
106 | 207 |
|
107 | | - |
108 | 208 | ## SST+MacSim |
109 | 209 |
|
110 | 210 | * Here are two example configurations of SST+MacSim. |
|