Commit c07b5db

Merge pull request #207 from omlins/ka2doc
Update README.md with new constants and example data allocation for S…
2 parents: 2c5882f + 0c1ab71

1 file changed: README.md (11 additions, 9 deletions)
````diff
@@ -265,8 +265,8 @@ The KernelAbstractions backend keeps the familiar parse-time `@init_parallel_ste
 import CUDA # 1 Import backends to be used by the KernelAbstractions backend
 using ParallelStencil
 @init_parallel_stencil(package=KernelAbstractions, numbertype=Float32) # 2 Initialize KernelAbstractions backend at parse time
-const N = 1024
-const α = 2.5
+const N = 2;
+const α = 1.5;
 
 # --- Kernel definition -------------------------------------------------
 @parallel_indices (i) function saxpy!(Y, α, X) # 3 Define a hardware-agnostic SAXPY kernel a single time
@@ -276,16 +276,18 @@ end
 
 # --- First run on default runtime hardware (CPU) -----------------------
 println("Current runtime hardware target: ", @current_hardware()) # 4 Query current (default) runtime hardware target
-X = @rand(N) # 5 Allocate data on the current target
-Y = @rand(N) # 5 Allocate data on the current target
+X = @fill(3, N) # 5 Allocate data on the current target
+Y = @ones(N)    # ...
 @parallel saxpy!(Y, α, X) # 6 Launch kernel on the current target
+Y # 7 Observe correct results
 
 # --- Reselect runtime hardware to CUDA-capable GPU and run again -------
-@select_hardware(:gpu_cuda) # 7 Switch runtime hardware target to CUDA-capable GPU
-println("Current runtime hardware target: ", @current_hardware()) # 8 Confirm the CUDA-capable GPU runtime hardware target
-X = @rand(N) # 9 Allocate data on the new target
-Y = @rand(N) # 9 Allocate data on the new target
-@parallel saxpy!(Y, α, X) # 10 Launch kernel on the new target without redefining anything
+@select_hardware(:gpu_cuda) # 8 Switch runtime hardware target to CUDA-capable GPU
+println("Current runtime hardware target: ", @current_hardware()) # 9 Confirm the CUDA-capable GPU runtime hardware target
+X = @fill(3, N) # 10 Allocate data on the new target
+Y = @ones(N)    # ...
+@parallel saxpy!(Y, α, X) # 11 Launch kernel on the new target without redefining anything
+Y # 12 Observe correct results
 ```
````
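For context, below is a runnable sketch of the CPU half of the updated example as it plausibly reads after this commit. The `saxpy!` body is not visible in the diff (the README lines between the two hunks are elided), so the standard SAXPY update is assumed here; all other lines are taken from the diff above.

```julia
import CUDA # 1 Import backends to be used by the KernelAbstractions backend
using ParallelStencil
@init_parallel_stencil(package=KernelAbstractions, numbertype=Float32) # 2 Initialize the backend at parse time
const N = 2;
const α = 1.5;

# --- Kernel definition -------------------------------------------------
@parallel_indices (i) function saxpy!(Y, α, X) # 3 Hardware-agnostic SAXPY kernel (body assumed; not shown in the diff)
    Y[i] = α * X[i] + Y[i]
    return
end

# --- First run on default runtime hardware (CPU) -----------------------
println("Current runtime hardware target: ", @current_hardware()) # 4 Query the current (default) target
X = @fill(3, N)  # 5 Allocate on the current target: X == [3.0f0, 3.0f0]
Y = @ones(N)     # ...                               Y == [1.0f0, 1.0f0]
@parallel saxpy!(Y, α, X) # 6 Launch the kernel on the current target
Y # 7 Observe correct results: 1.5 * 3 + 1 == 5.5, so Y == [5.5f0, 5.5f0]
```

This is presumably why the commit shrinks `N` to `2` and swaps `@rand` for `@fill`/`@ones`: with fixed inputs, step 7 (and step 12 after `@select_hardware(:gpu_cuda)`) can be verified by eye on both hardware targets.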
Type `?@select_hardware` and `?@current_hardware` in the [Julia REPL] to see which runtime hardware targets are supported and which symbols to use to select them.
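For instance, the help can be pulled up like this (a sketch; the help text itself is omitted):

```julia-repl
julia> using ParallelStencil

help?> @select_hardware    # enter help mode by typing `?` at the `julia>` prompt

help?> @current_hardware
```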
Note that the KernelAbstractions backend comes with a trade-off: the convenience `Data`/`TData` modules for fixed data types and single-architecture backends are not available, nor are the warp-level primitives in `@parallel_indices` kernels (see [Support for architecture-agnostic low level kernel programming](#support-for-architecture-agnostic-low-level-kernel-programming)); and the hide communication feature, described in the next section, is implemented to have no effect for KernelAbstractions (but it nevertheless executes correctly).
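In practice, the missing `Data`/`TData` modules mainly affect how signatures are written: there is no `Data.Array` alias to annotate with, so functions simply stay generic over whatever array type the current hardware target's allocation macros return. A minimal sketch, assuming the `Float32` initialization and the allocation macros from the example above:

```julia
# No `A::Data.Array` annotation is available with the KernelAbstractions backend;
# a generic signature works for Base arrays and GPU arrays alike.
@parallel_indices (i) function scale!(A, c)
    A[i] = c * A[i]
    return
end

A = @zeros(8)              # Float32 zeros on the current runtime hardware target
@parallel scale!(A, 2.0f0) # same launch syntax as in the README example
```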
