Commit 0d82318

Merge pull request #6 from OpenBioSim/feature_opencl
Add support for OpenCL platform
2 parents 6fce2dc + 1f3c327 commit 0d82318

19 files changed

Lines changed: 2076 additions & 545 deletions

README.md

Lines changed: 14 additions & 14 deletions
@@ -10,18 +10,19 @@
 [![Conda Version](https://anaconda.org/openbiosim/loch/badges/downloads.svg)](https://anaconda.org/openbiosim/loch)
 [![License: GPL v3](https://img.shields.io/badge/License-GPL_v3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0.en.html)

-CUDA accelerated Grand Canonical Monte Carlo (GCMC) water sampling code. Built
+CUDA/OpenCL accelerated Grand Canonical Monte Carlo (GCMC) water sampling code. Built
 on top of [Sire](https://github.com/OpenBioSim/sire),
 [BioSimSpace](https://github.com/OpenBioSim/biosimspace),
-[OpenMM](https://github.com/openmm/openmm), and
-[PyCUDA](https://documen.tician.de/pycuda/index.html#).
+[OpenMM](https://github.com/openmm/openmm),
+[PyCUDA](https://documen.tician.de/pycuda/index.html#),
+and [PyOpenCL](https://documen.tician.de/pyopencl/).

 ## Installation

 First, create a conda environment with the required dependencies:

 ```
-conda create -f environment.yaml
+conda env create -f environment.yaml
 conda activate loch
 ```

@@ -49,7 +50,7 @@ conda install -c conda-forge -c openbiosim/label/dev loch

 Instead of computing the energy change for each trial insertion/deletion with
 OpenMM, the calculation is performed at the reaction field (RF) level using
-a custom CUDA kernel, allowing multiple candidates to be evaluated
+a custom CUDA/OpenCL kernel, allowing multiple candidates to be evaluated
 simultaneously. Particle mesh Ewald (PME) is handled via the method for
 sampling from an approximate potential (in this case the RF potential)
 introduced [here](https://doi.org/10.1063/1.1563597). Parallelisation of the
@@ -228,8 +229,9 @@ to enhance sampling.
 Once finished, `mu_ex` will contain the computed excess chemical potential in units
 kcal/mol.

-Note that the simulation requires a system with CUDA support. Please set the
-`CUDA_VISIBLE_DEVICES` environment variable accordingly.
+Note that the simulation requires a system with CUDA or OpenCL support. Please
+set the `CUDA_VISIBLE_DEVICES` or `OPENCL_VISIBLE_DEVICES` environment variable
+accordingly.

 The standard volume can be computed as follows:

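The device-selection note added in the hunk above can also be applied from Python rather than the shell. The following is a minimal sketch, assuming the two variable names quoted in the README and assuming they are set before any GPU library is imported. The helper `select_gpu` is hypothetical and is not part of `loch`.

```python
import os

# Environment variable names as quoted in the README note.
_VISIBLE_DEVICES_VAR = {
    "cuda": "CUDA_VISIBLE_DEVICES",
    "opencl": "OPENCL_VISIBLE_DEVICES",
}


def select_gpu(platform: str, index: int = 0) -> str:
    """Hypothetical helper: expose a single device index to one platform.

    Returns the name of the environment variable that was set.
    """
    try:
        var = _VISIBLE_DEVICES_VAR[platform.lower()]
    except KeyError:
        raise ValueError(f"Unsupported platform: {platform!r}") from None
    if index < 0:
        raise ValueError("Device index must be non-negative")
    # Must run before the GPU runtime is initialised to have any effect.
    os.environ[var] = str(index)
    return var


# Example: restrict a CUDA run to the first device.
select_gpu("cuda", 0)
```

Setting the variable inside the script keeps the device choice next to the `--platform` option, at the cost of having to run before any import that initialises the GPU runtime.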
@@ -263,13 +265,11 @@ Free Energy Perturbation (FEP) with GCMC using `loch` is supported via the

 ## Notes

-* Make sure that `nvcc` is in your `PATH`. If you require a different `nvcc` to that
-provided by conda, you can set the `PYCUDA_NVCC` environment variable to point
-to the desired `nvcc` binary, or use the `nvcc` kwarg in the `GCMCSampler` constructor.
-Depending on your setup, you may also need to install the `cuda-nvvm` package from
-`conda-forge`.
-
-* A future version supporting AMD GPUs via PyOpenCL is planned.
+* When using the CUDA platform, make sure that `nvcc` is in your `PATH`. If you require
+a different `nvcc` to that provided by conda, you can set the `PYCUDA_NVCC` environment
+variable to point to the desired `nvcc` binary, or use the `nvcc` kwarg in the
+`GCMCSampler` constructor. Depending on your setup, you may also need to install the
+`cuda-nvvm` package from `conda-forge`.

 * OpenMM-to-Sire roundtrip example:

WHITEPAPER.md

Lines changed: 24 additions & 15 deletions
@@ -1,22 +1,23 @@
-# Loch: CUDA accelerated Grand Canonical Monte Carlo (GCMC) water sampling
+# Loch: GPU accelerated Grand Canonical Monte Carlo (GCMC) water sampling

 ## Introduction

-We present `loch`, a high-performance CUDA-accelerated Python package designed
+We present `loch`, a high-performance GPU-accelerated Python package designed
 for Grand Canonical Monte Carlo (GCMC) water sampling in molecular simulations
 via [OpenMM](https://openmm.org/). To enable parallelisation of insertion and
-deletion attempts, `loch` leverages GPU capabilities using a custom CUDA kernel
-for nonbonded interactions. This allows thousands of GCMC trials to be attempted
-in parallel, significantly enhancing sampling efficiency compared to traditional
-CPU-based implementations that perform sequential attempts via the OpenMM Python
-API. Additionally, electrostatics for GCMC attempts are computed using the
-reaction field (RF) method, with accepted candidates being re-evaluated with a
-correction step based on the difference between reaction field and Particle Mesh
-Ewald (PME) potential energies. The use of an approximate potential for trial
-moves leads to a substantial speed-up in GCMC move evaluation. `loch` has been
-designed to be modular, allowing standalone GCMC sampling, or integration with
-OpenMM-based molecular dynamics simulation code, e.g. as has been done in the
-[SOMD2](https://github.com/openbiosim/somd2) free-energy perturbation engine.
+deletion attempts, `loch` leverages GPU capabilities using a custom CUDA/OpenCL
+kernel for nonbonded interactions. This allows thousands of GCMC trials to be
+attempted in parallel, significantly enhancing sampling efficiency compared to
+traditional CPU-based implementations that perform sequential attempts via the
+OpenMM Python API. Additionally, electrostatics for GCMC attempts are computed
+using the reaction field (RF) method, with accepted candidates being
+re-evaluated with a correction step based on the difference between reaction
+field and Particle Mesh Ewald (PME) potential energies. The use of an
+approximate potential for trial moves leads to a substantial speed-up in GCMC
+move evaluation. `loch` has been designed to be modular, allowing standalone
+GCMC sampling, or integration with OpenMM-based molecular dynamics simulation
+code, e.g. as has been done in the [SOMD2](https://github.com/openbiosim/somd2)
+free-energy perturbation engine.

 ## Parallelisation strategy

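The RF-then-PME correction step described in the introduction above can be illustrated with a short sketch. This is not the `loch` implementation. It assumes the trial has already been accepted at the RF level with the usual GCMC criterion, so the correction accepts with probability min(1, exp(-β[ΔU_PME − ΔU_RF])), which is the standard form for sampling from an approximate potential. The function name, argument names, and the choice of kcal/mol units are assumptions made for illustration.

```python
import math
import random

# Boltzmann constant in kcal/(mol K).
K_B = 0.0019872041


def accept_correction(delta_rf, delta_pme, temperature, rng=random):
    """Decide whether an RF-accepted trial survives the PME correction.

    delta_rf:    energy change of the move under the RF potential (kcal/mol)
    delta_pme:   energy change of the same move under PME (kcal/mol)
    temperature: temperature in kelvin
    rng:         object with a random() method returning a float in [0, 1)
    """
    beta = 1.0 / (K_B * temperature)
    # Only the mismatch between the two potentials enters the correction.
    exponent = -beta * (delta_pme - delta_rf)
    if exponent >= 0.0:
        # PME is at least as favourable as RF predicted: always accept.
        return True
    return rng.random() < math.exp(exponent)
```

When the RF and PME energy changes agree, the exponent is zero and the correction always accepts, so the closer the approximate potential tracks PME, the fewer RF-accepted moves are discarded.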
@@ -52,6 +53,14 @@ each iteration, as more trials need to be evaluated in parallel, and more data
 needs to be transferred to and from the GPU, in which case it might be more
 efficient to simply perform more iterations with a smaller batch size.

+To enable reproducibility across GPU platforms we choose to generate random
+numbers on the host using NumPy's random number generator, then transfer these
+to the GPU kernels where required. This avoids differences in random number
+generation across different GPU architectures and drivers, making testing
+and validation of the implementation significantly easier. In benchmarks we
+have found the NumPy approach to be as performant as using GPU-based random
+numbers for the typical batch sizes employed in `loch`.
+
 ## Sampling from an approximate potential

In order to further accelerate the evaluation of GCMC insertion and deletion
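
The host-side random number scheme added in the hunk above can be sketched with NumPy alone. The sketch assumes a seeded `numpy.random.Generator`. The function name, array shapes, and `float32` dtype are illustrative assumptions, and the transfer of the arrays to the device is omitted.

```python
import numpy as np


def draw_trial_randoms(rng, batch_size):
    """Draw the uniform random numbers needed for one batch of trials.

    Hypothetical layout: three numbers per trial for a position inside
    the sampling region, plus one per trial for the acceptance test.
    """
    positions = rng.random((batch_size, 3), dtype=np.float32)
    acceptance = rng.random(batch_size, dtype=np.float32)
    return positions, acceptance


# Two generators with the same seed give identical batches, regardless
# of which GPU platform later consumes the arrays.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
pos_a, acc_a = draw_trial_randoms(rng_a, 1000)
pos_b, acc_b = draw_trial_randoms(rng_b, 1000)
```

Because the arrays are produced on the host, the same seed yields the same trial sequence on either platform, which is what makes cross-platform comparison of results straightforward.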
@@ -91,7 +100,7 @@ Other than the cost of evaluating GCMC trials using PME, performance is also
 impacted by the cost of updating nonbonded parameters and atomic positions
 in the OpenMM context after each accepted insertion or deletion. (No updates
 are required for trial moves, since these are all evaluated via the custom
-CUDA kernel.) [Recent updates](https://github.com/openmm/openmm/pull/4610)
+CUDA/OpenCL kernel.) [Recent updates](https://github.com/openmm/openmm/pull/4610)
 to OpenMM have helped mitigate the cost of modifying force field parameters,
 allowing updates for only the subset of parameters that have changed within
 a particular force. However, updating atomic positions still requires

environment.yaml

Lines changed: 1 addition & 0 deletions
@@ -8,3 +8,4 @@ dependencies:
 - biosimspace
 - loguru
 - pycuda
+- pyopencl

examples/bpti/bpti.py

Lines changed: 10 additions & 0 deletions
@@ -53,6 +53,14 @@
     choices=["info", "debug", "error"],
     required=False,
 )
+parser.add_argument(
+    "--platform",
+    help="The GPU platform to use",
+    type=str,
+    default="auto",
+    choices=["auto", "cuda", "opencl"],
+    required=False,
+)

 args = parser.parse_args()

@@ -78,6 +86,7 @@
     num_ghost_waters=100,
     bulk_sampling_probability=0,
     log_level=args.log_level,
+    platform=args.platform,
     overwrite=True,
 )

@@ -92,6 +101,7 @@
     pressure=None,
     constraint="h_bonds",
     timestep="2 fs",
+    platform=args.platform,
 )
 d.randomise_velocities()

examples/scytalone/sd.py

Lines changed: 10 additions & 0 deletions
@@ -64,6 +64,14 @@
     choices=["info", "debug", "error"],
     required=False,
 )
+parser.add_argument(
+    "--platform",
+    help="The GPU platform to use",
+    type=str,
+    default="auto",
+    choices=["auto", "cuda", "opencl"],
+    required=False,
+)
 args = parser.parse_args()

 # Store the ligand index.
@@ -90,6 +98,7 @@
     ghost_file=f"ghosts_{lig}.txt",
     log_file=f"gcmc_{lig}.txt",
     log_level=args.log_level,
+    platform=args.platform,
     overwrite=True,
 )

@@ -104,6 +113,7 @@
     pressure=None,
     constraint="h_bonds",
     timestep="2 fs",
+    platform=args.platform,
 )
 d.randomise_velocities()

examples/water/water.py

Lines changed: 11 additions & 0 deletions
@@ -66,6 +66,14 @@
     choices=["info", "debug", "error"],
     required=False,
 )
+parser.add_argument(
+    "--platform",
+    help="The GPU platform to use",
+    type=str,
+    default="auto",
+    choices=["auto", "cuda", "opencl"],
+    required=False,
+)
 args = parser.parse_args()

 # Load the water box.
@@ -91,6 +99,8 @@
     temperature=args.temperature,
     num_ghost_waters=100,
     log_level=args.log_level,
+    platform=args.platform,
+    overwrite=True,
 )

 # Create a dynamics object using the modified GCMC system.
@@ -104,6 +114,7 @@
     pressure=None,
     constraint="h_bonds",
     timestep="2 fs",
+    platform=args.platform,
 )
 d.randomise_velocities()

recipes/loch/template.yaml

Lines changed: 1 addition & 0 deletions
@@ -18,6 +18,7 @@ requirements:
 - loguru
 - pip
 - pycuda # [not macos]
+- pyopencl
 - python
 - setuptools
 - sire

src/loch/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 ######################################################################
 # Loch: GPU accelerated GCMC water sampling engine.
 #
-# Copyright: 2025
+# Copyright: 2025-2026
 #
 # Authors: The OpenBioSim Team <team@openbiosim.org>
 #
