Commit 9b28573

Merge pull request #49 from abc99lr/gups_support
Add support for GUPS benchmark
2 parents 7d974c5 + 6124cd9 commit 9b28573

6 files changed

Lines changed: 981 additions & 0 deletions

posts/gups/LICENSE

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
# SPDX-FileCopyrightText: Copyright (c) 2020-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
# 1. Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.
#
# 2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
#
# 3. Neither the name of the copyright holder nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#

posts/gups/LICENSE.gups.cu

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
Copyright (c) 2020-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Copyright (c) 2012 NISHIMURA Ryohei.
Copyright (c) 2012 The University of Tennessee.
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
· Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
· Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer listed in this license in the documentation and/or other materials provided with the distribution.
· Neither the name of the copyright holders nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

This software is provided by the copyright holders and contributors "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose are disclaimed. in no event shall the copyright owner or contributors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage.

posts/gups/Makefile

Lines changed: 95 additions & 0 deletions
@@ -0,0 +1,95 @@
# SPDX-FileCopyrightText: Copyright (c) 2020-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: BSD-3-Clause
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
# 1. Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.
#
# 2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
#
# 3. Neither the name of the copyright holder nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#

# The CUDA compiler.
CUDA_HOME ?= /usr/local/cuda

# The compiler.
CXX = $(CUDA_HOME)/bin/nvcc

# Optimization and Debugging
OPTFLAGS ?= -O3

# Set target GPU CC (only sm_80 and sm_90 are currently supported for STATIC_SHMEM)
GPU_ARCH ?= 80 90

# Default to using compile time NSHMEM
DYNAMIC_SHMEM ?= -DSTATIC_SHMEM

# Source files
SRC_FILES = gups.cu

# Object Files
OBJ_FILES = $(SRC_FILES:.cu=.o)

# CU flags
CU_FLAGS = -std=c++14 -Xcompiler -std=c++14 -lineinfo

CU_FLAGS += $(foreach cc,$(GPU_ARCH), \
    --generate-code arch=compute_$(cc),code=sm_$(cc) )

# CXX flags
CXXFLAGS = $(OPTFLAGS) $(CU_FLAGS) -Xcompiler -Wall $(DYNAMIC_SHMEM)


LINKFLAGS = $(CXXFLAGS)


DEFAULT: gups

all = gups

gups: $(OBJ_FILES)

# Include the dependencies that were created by %.d rule.
#
ifneq ($(MAKECMDGOALS),clean)
-include $(SRC_FILES:.cu=.d)
endif
#

# Prepare file holding dependencies, to be included in this file.
#

%.d: %.cu Makefile
	@set -e; rm -f $@; \
	$(CXX) -DMAKE_DEPEND -M $(CXXFLAGS) $< > $@.$$$$; \
	sed 's,\($*\)\.o[ :]*,\1.o $@ : ,g' < $@.$$$$ > $@; \
	rm -f $@.$$$$

%.o: %.cu Makefile
	$(CXX) $(CXXFLAGS) -c $*.cu

$(all):%:
	$(CXX) $(LINKFLAGS) -o $@ $^

clean:
	rm -f $(OBJ_FILES) *.o *.d gups \
	*.d.[0-9][0-9][0-9][0-9][0-9] *.d.[0-9][0-9][0-9][0-9] \
	*.d.[0-9][0-9][0-9] *.d.[0-9][0-9][0-9][0-9][0-9][0-9] *~

posts/gups/README.md

Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
## GUPS Benchmark

### How to build the benchmark
Build with the Makefile, using the following options:

`GPU_ARCH=xx`, where `xx` is the Compute Capability (CC) of the device(s) being tested (default: 80 90). You can check the CC of a specific GPU using the tables [here](https://developer.nvidia.com/cuda-gpus#compute). The generated executable (called `gups`) supports both global memory GUPS and shared memory GUPS modes. Global memory mode is the default. See the next section for the runtime option that switches between modes.

Notes on shared memory GUPS:
1. Unless dynamic allocation is forced (see below), shared memory GUPS supports only CC 80 and CC 90; for any other CC, the shared memory GUPS code falls back to dynamic allocation mode.
2. To force dynamic shared memory allocation, build with `DYNAMIC_SHMEM=`. This is NOT recommended and will result in incorrect shared memory GUPS numbers, because the kernel becomes instruction bound.

For example, `make GPU_ARCH="70 80" DYNAMIC_SHMEM=` builds the executable `gups`, which supports global memory GUPS and shared memory GUPS with dynamic shared memory allocation, for both CC 70 (e.g., NVIDIA V100 GPU) and CC 80 (e.g., NVIDIA A100 GPU).
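
To make the effect of `GPU_ARCH` concrete, the following Python sketch mirrors the `foreach` expansion in the Makefile and assembles the matching `make` invocation. It is only an illustration: the helper names `gencode_flags` and `make_command` are hypothetical and not part of this commit, and the code builds argument lists without running anything.

```python
def gencode_flags(gpu_arch="80 90"):
    """Mirror the Makefile's foreach: one --generate-code pair per listed CC."""
    flags = []
    for cc in gpu_arch.split():
        flags += ["--generate-code", f"arch=compute_{cc},code=sm_{cc}"]
    return flags


def make_command(gpu_arch="80 90", force_dynamic_shmem=False):
    """Assemble the make invocation described in the README (hypothetical helper)."""
    cmd = ["make", f"GPU_ARCH={gpu_arch}"]
    if force_dynamic_shmem:
        # An empty DYNAMIC_SHMEM overrides the Makefile default of -DSTATIC_SHMEM.
        cmd.append("DYNAMIC_SHMEM=")
    return cmd


if __name__ == "__main__":
    print(gencode_flags("70 80"))
    print(make_command("70 80", force_dynamic_shmem=True))
```

Passing the resulting list to `subprocess.run` from the `posts/gups` directory would be one way to drive the build from a script.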

### How to run the benchmark
Besides GUPS (updates (loop)), the benchmark code supports other random access tests: reads, writes, reads+writes, and updates (no loop).
You can choose the benchmark type with the `-t` runtime option. You may need to tune the accesses-per-element option (`-a`) to achieve the best performance.
Correctness verification is only available for the default test, updates (loop).
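
As a rough picture of what an "updates" test measures and how it can be verified, here is a small CPU-side Python sketch. It is not the code in `gups.cu`: the xorshift generator, the table initialisation, the XOR update rule, and the replay-based check are all assumptions, loosely modelled on RandomAccess-style benchmarks, and every function name is hypothetical.

```python
import time

MASK64 = (1 << 64) - 1


def xorshift64(x):
    """One step of a 64-bit xorshift generator (illustrative choice only)."""
    x ^= (x << 13) & MASK64
    x ^= x >> 7
    x ^= (x << 17) & MASK64
    return x


def apply_updates(table, seed, num_updates):
    """XOR each pseudo-random value into the slot selected by its low bits."""
    index_mask = len(table) - 1  # table length must be a power of two
    r = seed
    for _ in range(num_updates):
        r = xorshift64(r)
        table[r & index_mask] ^= r


def run_updates_loop(log2_size=10, accesses_per_element=8, seed=1):
    """Time one pass of updates, then verify by replaying the same updates."""
    size = 1 << log2_size
    table = list(range(size))  # assumed initial state: table[i] == i
    num_updates = size * accesses_per_element

    start = time.perf_counter()
    apply_updates(table, seed, num_updates)
    elapsed = time.perf_counter() - start

    # XOR is its own inverse, so replaying the identical update stream
    # must restore the initial table; any mismatch counts as an error.
    apply_updates(table, seed, num_updates)
    errors = sum(1 for i, value in enumerate(table) if value != i)
    return num_updates, elapsed, errors


if __name__ == "__main__":
    updates, seconds, errors = run_updates_loop()
    if seconds > 0:
        print(f"GUPS: {updates / seconds / 1e9:.6f}, errors: {errors}")
```

The reported figure is giga-updates per second, that is, the number of updates divided by the elapsed time and by 10^9. A replay check of this kind only works when the update is reversible, which may be one reason a benchmark offers verification for a single test type.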

Use `./gups -h` to list the runtime arguments.
```
Usage:
-n <int> input data size = 2^n [default: 29]
-o <int> occupancy percentage, 100/occupancy how much larger the working set is compared to the requested bytes [default: 100]
-r <int> number of kernel repetitions [default: 1]
-a <int> number of random accesses per input element [default: 32 (r, w) or 8 (u, unl, rw) for gmem, 65536 for shmem]
-t <int> test type (0 - update (u), 1 - read (r), 2 - write (w), 3 - read write (rw), 4 - update no loop (unl)) [default: 0]
-d <int> device ID to use [default: 0]
-s <int> enable input in shared memory instead of global memory for shared memory GUPS benchmark if s>=0. The benchmark will use max available shared memory if s=0 (for ideal GUPS conditions this must be done at compile time, check README.md for build options). This tool does allow setting the shmem data size with = 2^s (for s>0), however this will also result in an instruction bound kernel that fails to reach hardware limitations of GUPS. [default: -1 (disabled)]
```
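
The interaction between `-n` and `-o` is easy to misread, so here is a short Python sketch of one reading of the help text: the requested size is 2^n, and the working set is 100/occupancy times larger. The function name, the integer rounding, and the choice to count in abstract units rather than bytes are assumptions; the exact arithmetic in `gups.cu` is not shown in this diff.

```python
def working_set_size(n=29, occupancy=100):
    """Working set implied by -n and -o, in the same unit as the requested size.

    Assumes requested size = 2**n and working set = requested * 100 / occupancy.
    """
    if not 1 <= occupancy <= 100:
        raise ValueError("occupancy is a percentage in the range 1..100")
    requested = 1 << n
    return requested * 100 // occupancy


if __name__ == "__main__":
    # Defaults: the working set equals the requested size.
    print(working_set_size(29, 100))
    # Half occupancy: the working set is twice the requested size.
    print(working_set_size(29, 50))
```

Under this reading, lowering the occupancy spreads the same number of requested units over a proportionally larger region of memory.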

You can also use the provided Python script to run multiple tests with a single command and get a CSV report. By default, the script runs all the random access tests. Run `python run.py --help` for the usage options.
```
usage: run.py [-h] [--device-id DEVICE_ID]
              [--input-size-begin INPUT_SIZE_BEGIN]
              [--input-size-end INPUT_SIZE_END] [--occupancy OCCUPANCY]
              [--repeats REPEATS]
              [--test {reads,writes,reads_writes,updates,updates_no_loop,all}]
              [--memory-loc {global,shared}]

Benchmark GUPS. Store results in results.csv file.

optional arguments:
  -h, --help            show this help message and exit
  --device-id DEVICE_ID
                        GPU ID to run the test
  --input-size-begin INPUT_SIZE_BEGIN
                        exponent of the input data size begin range, base is 2
                        (input size = 2^n). [Default: 29 for global GUPS,
                        max_shmem for shared GUPS. Global/shared is controlled
                        by --memory-loc
  --input-size-end INPUT_SIZE_END
                        exponent of the input data size end range, base is 2
                        (input size = 2^n). [Default: 29 for global GUPS,
                        max_shmem for shared GUPS. Global/shared is controlled
                        by --memory-loc
  --occupancy OCCUPANCY
                        100/occupancy is how much larger the working set is
                        compared to the requested bytes
  --repeats REPEATS     number of kernel repetitions
  --test {reads,writes,reads_writes,updates,updates_no_loop,all}
                        test to run
  --memory-loc {global,shared}
                        memory buffer in global memory or shared memory
```
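
For scripted sweeps, the options listed above can be assembled programmatically. The Python sketch below builds a `run.py` command line using only the flags shown in the help output; the helper name `build_run_command` and its keyword defaults are hypothetical, and the code does not execute the benchmark.

```python
import shlex

# Choices copied from the run.py help text above.
TESTS = ("reads", "writes", "reads_writes", "updates", "updates_no_loop", "all")
MEMORY_LOCS = ("global", "shared")


def build_run_command(test="all", memory_loc="global", device_id=0,
                      input_size_begin=None, input_size_end=None,
                      occupancy=None, repeats=None):
    """Return an argv list for run.py; options left as None are omitted."""
    if test not in TESTS:
        raise ValueError(f"unknown test: {test}")
    if memory_loc not in MEMORY_LOCS:
        raise ValueError(f"unknown memory location: {memory_loc}")
    cmd = ["python", "run.py", "--device-id", str(device_id),
           "--test", test, "--memory-loc", memory_loc]
    optional = {
        "--input-size-begin": input_size_begin,
        "--input-size-end": input_size_end,
        "--occupancy": occupancy,
        "--repeats": repeats,
    }
    for flag, value in optional.items():
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


if __name__ == "__main__":
    print(shlex.join(build_run_command(test="updates", input_size_begin=27,
                                       input_size_end=29, repeats=3)))
```

The returned list could be handed to `subprocess.run` on a machine where `gups` has been built; according to the help text, results are stored in `results.csv`.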

### LICENSE

`gups.cu` is modified from the `randomaccess.cu` file in the [cuda_randomaccess GitHub repository](https://github.com/nattoheaven/cuda_randomaccess). The LICENSE file of that repository is preserved as `LICENSE.gups.cu`.

`run.py` and `Makefile` are implemented from scratch by NVIDIA. For the license information of these two files, refer to the `LICENSE` file.
