- "This example illustrates the use of *stateful* class-form FFT APIs with CuPy ndarrays. Unlike *stateless* functional-form APIs, which are convenience wrappers around respective stateful APIs, the stateful APIs provide the greatest versatility and control. Stateful APIs are particularly useful in workflows that have expensive planning and/or autotuning phases intermixed with multiple executions.\n",
- "\n",
- "We illustrate such a workflow using the example of the fast Fourier transformation:"
+ "This example illustrates the use of *stateful* class-form FFT APIs with CuPy ndarrays. Unlike *stateless* functional-form APIs, which are convenience wrappers around respective stateful APIs, the stateful APIs provide the greatest versatility and control. Stateful APIs are particularly useful in workflows that have expensive planning and/or autotuning phases intermixed with multiple executions.\n",
+ "\n",
+ "We illustrate such a workflow using the example of the fast Fourier transform:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "ad02decc",
+ "metadata": {
+ "id": "ad02decc"
+ },
+ "outputs": [],
+ "source": [
+ "import cupy as cp\n",
+ "import nvmath\n",
+ "\n",
+ "shape = 512, 512, 512\n",
+ "axes = 0, 1\n",
+ "\n",
+ "a = cp.ones(shape, dtype=cp.complex64)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "404d935c",
+ "metadata": {
+ "id": "404d935c"
+ },
+ "source": [
+ "FFT planning may be expensive compared to the computational work. With multiple executions we can amortize the planning cost by separating the planning from execution:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "0cdf344b",
+ "metadata": {
+ "id": "0cdf344b"
+ },
+ "outputs": [],
+ "source": [
+ "# Create a stateful FFT object 'f'.\n",
+ "with nvmath.fft.FFT(a, axes=axes) as f:\n",
+ " # Plan the FFT.\n",
+ " f.plan()\n",
+ "\n",
+ " # Execute the FFT.\n",
+ " b = f.execute()\n",
+ "\n",
+ " # Synchronize the current stream.\n",
+ " cp.cuda.get_current_stream().synchronize()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "0d4e3536",
+ "metadata": {
+ "id": "0d4e3536"
+ },
+ "source": [
+ "**TODO: Extend the above example to perform multiple batched FFTs in place.**"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "nersc-nvmath",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.13.5"
+ },
+ "colab": {
+ "provenance": []
+ }
  },
- {
- "cell_type": "code",
- "execution_count": 2,
- "id": "ad02decc",
- "metadata": {},
- "outputs": [],
- "source": [
- "import cupy as cp\n",
- "import nvmath\n",
- "\n",
- "shape = 512, 512, 512\n",
- "axes = 0, 1\n",
- "\n",
- "a = cp.ones(shape, dtype=cp.complex64)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "404d935c",
- "metadata": {},
- "source": [
- "FFT planning may be expensive compared to the computational work. With multiple executions we can amortize the planning cost by separating the planning from execution:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 3,
- "id": "0cdf344b",
- "metadata": {},
- "outputs": [],
- "source": [
- "# Create a stateful FFT object 'f'.\n",
- "with nvmath.fft.FFT(a, axes=axes) as f:\n",
- " # Plan the FFT.\n",
- " f.plan()\n",
- "\n",
- " # Execute the FFT.\n",
- " b = f.execute()\n",
- "\n",
- " # Synchronize the default stream\n",
- " cp.cuda.get_current_stream().synchronize()"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "0d4e3536",
- "metadata": {},
- "source": [
- "**TODO: Extend the above example to perform multiple batched FFTs inplace**"
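A note on the TODO cell: with `axes = 0, 1` on a three-dimensional operand, the transform is already batched over the remaining axis, which may be worth spelling out before extending the example. The sketch below uses NumPy on the CPU as a stand-in, so it is not the nvmath API and has no plan object. The small `shape` and the names `batched`, `looped`, `same`, and `dc` are assumptions made for illustration. It checks that one call over axes 0 and 1 matches a loop of 2-D FFTs, then writes the result back into the operand's buffer to mimic an in-place update.

```python
import numpy as np

# Small stand-in for the notebook's 512 x 512 x 512 operand (assumed size).
shape = 8, 8, 4
axes = 0, 1

a = np.ones(shape, dtype=np.complex64)

# One call transforms over axes 0 and 1; the remaining axis (2) is the batch.
batched = np.fft.fftn(a, axes=axes)

# Equivalent loop: one 2-D FFT per batch index along axis 2.
looped = np.stack([np.fft.fft2(a[:, :, k]) for k in range(shape[2])], axis=2)

# Mimic an in-place transform by overwriting the operand's existing buffer.
a[...] = batched

same = np.allclose(batched, looped)
dc = a[0, 0, 0]  # zero-frequency term of the first batch entry
```

For the nvmath version, the library's FFT options appear to include an `inplace` flag, and the stateful object appears to offer `reset_operand` for reusing a plan across executions. Both should be checked against the nvmath-python documentation before they go into the notebook.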