
Commit 1b407a3

Created using Colab
1 parent 0cf6488 commit 1b407a3

1 file changed

Lines changed: 99 additions & 86 deletions

@@ -1,88 +1,101 @@
 {
-"cells": [
-{
-"cell_type": "markdown",
-"id": "80a4975b",
-"metadata": {},
-"source": [
-"# 9.3. `nvmath-python` stateful APIs: Amortizing task preparation costs\n",
-"\n",
-"This example illustrates the use of *stateful* class-form FFT APIs with CuPy ndarrays. Unlike *stateless* functional-form APIs, which are convenience wrappers around respective stateful APIs, the stateful APIs provide the greatest versatility and control. Stateful APIs are particularly useful in workflows that have expensive planning and/or autotuning phases intermixed with multiple executions.\n",
-"\n",
-"We illustrate such a workflow using the example of the fast Fourier transformation:"
-]
+"cells": [
+{
+"cell_type": "markdown",
+"id": "80a4975b",
+"metadata": {
+"id": "80a4975b"
+},
+"source": [
+"# 9.3. `nvmath-python` stateful APIs: Amortizing task preparation costs\n",
+"\n",
+"This example illustrates the use of *stateful* class-form FFT APIs with CuPy ndarrays. Unlike *stateless* functional-form APIs, which are convenience wrappers around respective stateful APIs, the stateful APIs provide the greatest versatility and control. Stateful APIs are particularly useful in workflows that have expensive planning and/or autotuning phases intermixed with multiple executions.\n",
+"\n",
+"We illustrate such a workflow using the example of the fast Fourier transformation:"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"id": "ad02decc",
+"metadata": {
+"id": "ad02decc"
+},
+"outputs": [],
+"source": [
+"import cupy as cp\n",
+"import nvmath\n",
+"\n",
+"shape = 512, 512, 512\n",
+"axes = 0, 1\n",
+"\n",
+"a = cp.ones(shape, dtype=cp.complex64)"
+]
+},
+{
+"cell_type": "markdown",
+"id": "404d935c",
+"metadata": {
+"id": "404d935c"
+},
+"source": [
+"FFT planning may be expensive compared to the computational work. With multiple executions we can amortize the planning cost by separating the planning from execution:"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"id": "0cdf344b",
+"metadata": {
+"id": "0cdf344b"
+},
+"outputs": [],
+"source": [
+"# Create a stateful FFT object 'f'.\n",
+"with nvmath.fft.FFT(a, axes=axes) as f:\n",
+" # Plan the FFT.\n",
+" f.plan()\n",
+"\n",
+" # Execute the FFT.\n",
+" b = f.execute()\n",
+"\n",
+" # Synchronize the default stream\n",
+" cp.cuda.get_current_stream().synchronize()"
+]
+},
+{
+"cell_type": "markdown",
+"id": "0d4e3536",
+"metadata": {
+"id": "0d4e3536"
+},
+"source": [
+"**TODO: Extend the above example to perform multiple batched FFTs inplace**"
+]
+}
+],
+"metadata": {
+"kernelspec": {
+"display_name": "nersc-nvmath",
+"language": "python",
+"name": "python3"
+},
+"language_info": {
+"codemirror_mode": {
+"name": "ipython",
+"version": 3
+},
+"file_extension": ".py",
+"mimetype": "text/x-python",
+"name": "python",
+"nbconvert_exporter": "python",
+"pygments_lexer": "ipython3",
+"version": "3.13.5"
+},
+"colab": {
+"provenance": []
+}
 },
-{
-"cell_type": "code",
-"execution_count": 2,
-"id": "ad02decc",
-"metadata": {},
-"outputs": [],
-"source": [
-"import cupy as cp\n",
-"import nvmath\n",
-"\n",
-"shape = 512, 512, 512\n",
-"axes = 0, 1\n",
-"\n",
-"a = cp.ones(shape, dtype=cp.complex64)"
-]
-},
-{
-"cell_type": "markdown",
-"id": "404d935c",
-"metadata": {},
-"source": [
-"FFT planning may be expensive compared to the computational work. With multiple executions we can amortize the planning cost by separating the planning from execution:"
-]
-},
-{
-"cell_type": "code",
-"execution_count": 3,
-"id": "0cdf344b",
-"metadata": {},
-"outputs": [],
-"source": [
-"# Create a stateful FFT object 'f'.\n",
-"with nvmath.fft.FFT(a, axes=axes) as f:\n",
-" # Plan the FFT.\n",
-" f.plan()\n",
-"\n",
-" # Execute the FFT.\n",
-" b = f.execute()\n",
-"\n",
-" # Synchronize the default stream\n",
-" cp.cuda.get_current_stream().synchronize()"
-]
-},
-{
-"cell_type": "markdown",
-"id": "0d4e3536",
-"metadata": {},
-"source": [
-"**TODO: Extend the above example to perform multiple batched FFTs inplace**"
-]
-}
-],
-"metadata": {
-"kernelspec": {
-"display_name": "nersc-nvmath",
-"language": "python",
-"name": "python3"
-},
-"language_info": {
-"codemirror_mode": {
-"name": "ipython",
-"version": 3
-},
-"file_extension": ".py",
-"mimetype": "text/x-python",
-"name": "python",
-"nbconvert_exporter": "python",
-"pygments_lexer": "ipython3",
-"version": "3.13.5"
-}
-},
-"nbformat": 4,
-"nbformat_minor": 5
-}
+"nbformat": 4,
+"nbformat_minor": 5
+}
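
The notebook in this diff separates planning from execution so that the planning cost is paid once and spread over many executions. A CPU-only toy sketch of that pattern follows, for readers without a GPU at hand. It is not nvmath-python code and does not reflect how nvmath plans an FFT: the class `PlannedDFT` and its `plan_calls` counter are hypothetical names invented for illustration, and the transform is a naive O(n²) DFT rather than a fast Fourier transform. Only the plan-once, execute-many structure is meant to carry over.

```python
import cmath


class PlannedDFT:
    """Toy stateful transform (hypothetical, not nvmath): plan once, execute many times."""

    def __init__(self, n):
        self.n = n
        self.twiddles = None  # filled in by plan()
        self.plan_calls = 0   # counts how often the preparation step ran

    def plan(self):
        # Preparation step: precompute the n complex roots of unity.
        self.twiddles = [cmath.exp(-2j * cmath.pi * k / self.n) for k in range(self.n)]
        self.plan_calls += 1

    def execute(self, x):
        # Execution step: reuse the precomputed table for every input.
        if self.twiddles is None:
            raise RuntimeError("call plan() before execute()")
        if len(x) != self.n:
            raise ValueError("input length does not match the planned size")
        n, w = self.n, self.twiddles
        return [sum(x[j] * w[(j * k) % n] for j in range(n)) for k in range(n)]


# Plan once, then run several inputs of the same size through the same object.
dft = PlannedDFT(8)
dft.plan()
signals = [[1.0] * 8, [float(j) for j in range(8)], [1.0, -1.0] * 4]
spectra = [dft.execute(s) for s in signals]
```

In the notebook's real workflow the same shape applies: `f.plan()` is called once inside the `with nvmath.fft.FFT(...)` block and `f.execute()` can then be called repeatedly.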
