This repository contains the artifact accompanying the paper:
"Extending the SPMD IR for RMA Models and Static Data Race Detection" by Semih Burak, Simon Schwitanski, Felix Tomski, Jens Domke, and Matthias Müller, submitted to EuroMPI 2025 (https://eurompi.org/).
- `spmd-ir`: Contains all source code related to the SPMD IR project itself, implemented as an MLIR dialect and analysis/transformation passes.
- `llvm-project`: Contains the patched LLVM 20 project, required for the SPMD IR, included as a Git submodule.
- `evaluation-benchmark-suites`: Contains the test programs used for evaluation:
  - Micro-benchmark suites: MBB, RMARaceBench
  - Proxy apps: Stencil (MPI and SHMEM versions), MiniWeather, and TeaLeaf
- `evaluation-results`: Contains output data and analysis results obtained from running the benchmarks within the Apptainer environment.
- `reproducibility`: Contains all files related to setting up and using the Apptainer-based experiment environment for reproducing our results.
The evaluation results used to produce the figures and tables in the evaluation chapter (Chapter 5) are provided in the folder evaluation-results.
The results used to produce Table 2 of the paper are based on the result tables in the folder evaluation-results/cqResults/summaries. The files contain the individual results of each tool on each test case as well as a summarized view.
The individual tool outputs (stdout logs, MLIR intermediate files) and used commands are available in evaluation-results/cqResults/MBB and evaluation-results/cqResults/RRB for the benchmarks MBB and RRB, respectively.
The results for the proxy app analysis with the SPMD IR and PARCOACH are located in the folder evaluation-results/proxyResults. The folder contains the log files with the timing information used in Table 3 of the paper.
The following tool versions were used to obtain the results:
- SPMD IR as delivered in this artifact (with patched LLVM 20)
- RMASanitizer 1.10.0
- MUST 1.11.0
- PARCOACH 2.4.2
The following software was used to run the evaluation infrastructure:
- Apptainer 1.4.2
- Python 3.12.3
- Container environment (see reproducibility/apptainer for details)
- Debian 12
- MPI: MPICH 4.0.2 (as shipped with Debian 12)
- OpenSHMEM: Sandia SHMEM 1.5.2
The evaluation was performed on a CLAIX-2023 cluster node with the following configuration:
- 2 x Intel Xeon 8468 Sapphire Rapids (48 cores each), SMT disabled
- 256 GB main memory
- Rocky Linux 9
The evaluation workflow relies on Python scripts which require pandas and numpy. We recommend creating a virtual environment and installing the requirements as follows:
```shell
cd reproducibility
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

Further, building the Apptainer containers requires the LLVM submodule to be checked out. To do so, run
```shell
git submodule update --init --recursive
```

To build the Apptainer images (one image each for SPMD IR, RMASanitizer, MUST, and PARCOACH), run the script build_apptainer_images.sh:
```shell
cd reproducibility
./build_apptainer_images.sh
```

The classification quality benchmarks can be run using the provided shell script run_cq_tests.sh:
```shell
cd reproducibility
./run_cq_tests.sh
```
The script creates a folder cq-results-YYYYMMDD-HHMMSS and runs all the tools on the RRB and MBB test cases. Finally, it parses the results using the script parse_cq_results.sh and generates a summary table in cq-results-YYYYMMDD-HHMMSS/summaries.
The results should be identical to those provided in evaluation-results/cqResults/summaries. Note that the results of PARCOACH-dynamic are flaky (in terms of TP/TN/FP/FN), so results might differ slightly from run to run.
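As a rough illustration of how such a summary can be derived from per-test-case classifications, the sketch below aggregates TP/TN/FP/FN outcomes into precision and recall with pandas. The column names and data here are assumptions for illustration only, not the actual format of the summary files:

```python
# Sketch: aggregating per-test-case classification outcomes into a summary,
# in the spirit of the tables in evaluation-results/cqResults/summaries.
# Column names and values are ASSUMED for illustration, not the real file format.
import pandas as pd

# Hypothetical per-test-case results for one tool on one benchmark suite
results = pd.DataFrame({
    "testcase": ["t1", "t2", "t3", "t4"],
    "outcome": ["TP", "FP", "TN", "FN"],
})

# Count each outcome class, defaulting to 0 if a class never occurs
counts = results["outcome"].value_counts()
tp, fp, fn = counts.get("TP", 0), counts.get("FP", 0), counts.get("FN", 0)

precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0
print(f"precision={precision:.2f} recall={recall:.2f}")
# → precision=0.50 recall=0.50
```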
The script job_classification_quality.sh can be used as a starting point to run the tests in a Slurm-managed cluster.
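For orientation, a minimal Slurm batch script wrapping the classification quality run might look as follows. The partition name, time limit, and task count are placeholders to adapt to your cluster; the actual job_classification_quality.sh shipped with the artifact may differ:

```shell
#!/usr/bin/env bash
#SBATCH --job-name=cq-tests
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4     # placeholder; adjust to the cluster
#SBATCH --time=04:00:00         # placeholder time limit
#SBATCH --partition=default     # placeholder partition name

# Run the classification quality benchmarks from the artifact root
cd reproducibility
./run_cq_tests.sh
```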
The proxy app assessment can be run using the provided shell script run_proxyapps.sh:
```shell
cd reproducibility
./run_proxyapps.sh
```
The results will be stored in a folder named proxy-results-YYYYMMDD-HHMMSS. Note that the reported performance will differ slightly depending on the system the tests are run on.
The script job_proxyapps.sh can be used as a starting point to run the tests in a Slurm-managed cluster.