Commit 4327446

committed
Created using Colab
1 parent 7915feb commit 4327446

1 file changed: 173 additions & 122 deletions
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/github/samaid/pyhpc-tutorial/blob/main/notebooks/9_1_nvmath-python_interop.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"id": "7b236cf1",
"metadata": {
"id": "7b236cf1"
},
"source": [
"# 9.1. `nvmath-python`: Interoperability with CPU and GPU tensor libraries\n",
"The goal of this exercise is to demonstrate how easily `nvmath-python` plugs into existing projects that rely on popular CPU or GPU array libraries, such as NumPy, CuPy, and PyTorch, and how easily a new project can use `nvmath-python` alongside these libraries."
]
},
{
"cell_type": "markdown",
"id": "e38c312d",
"metadata": {
"id": "e38c312d"
},
"source": [
"### Pure CuPy implementation\n",
"\n",
"This example demonstrates basic matrix multiplication of CuPy 2D arrays using `matmul`:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "b796dc7e",
"metadata": {
"id": "b796dc7e",
"outputId": "d47076cc-9340-4476-8201-b263dd8a4116",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 460
}
},
"outputs": [
{
"output_type": "error",
"ename": "CUDARuntimeError",
"evalue": "cudaErrorInsufficientDriver: CUDA driver version is insufficient for CUDA runtime version",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mCUDARuntimeError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m/tmp/ipython-input-4003678963.py\u001b[0m in \u001b[0;36m<cell line: 0>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;31m# Prepare sample input data for matrix matmul\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0mn\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mm\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mk\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;36m2000\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m4000\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m5000\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 5\u001b[0;31m \u001b[0ma\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrandom\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrand\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mk\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 6\u001b[0m \u001b[0mb\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrandom\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrand\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mk\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mm\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 7\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.11/dist-packages/cupy/random/_sample.py\u001b[0m in \u001b[0;36mrand\u001b[0;34m(*size, **kwarg)\u001b[0m\n\u001b[1;32m 42\u001b[0m raise TypeError('rand() got unexpected keyword arguments %s'\n\u001b[1;32m 43\u001b[0m % ', '.join(kwarg.keys()))\n\u001b[0;32m---> 44\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mrandom_sample\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0msize\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0msize\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mdtype\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mdtype\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 45\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 46\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.11/dist-packages/cupy/random/_sample.py\u001b[0m in \u001b[0;36mrandom_sample\u001b[0;34m(size, dtype)\u001b[0m\n\u001b[1;32m 153\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 154\u001b[0m \"\"\"\n\u001b[0;32m--> 155\u001b[0;31m \u001b[0mrs\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0m_generator\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget_random_state\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 156\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mrs\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrandom_sample\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0msize\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0msize\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mdtype\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mdtype\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 157\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.11/dist-packages/cupy/random/_generator.py\u001b[0m in \u001b[0;36mget_random_state\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1304\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1305\u001b[0m \"\"\"\n\u001b[0;32m-> 1306\u001b[0;31m \u001b[0mdev\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcuda\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mDevice\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 1307\u001b[0m \u001b[0mrs\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0m_random_states\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mdev\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mid\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1308\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mrs\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32mcupy/cuda/device.pyx\u001b[0m in \u001b[0;36mcupy.cuda.device.Device.__init__\u001b[0;34m()\u001b[0m\n",
"\u001b[0;32mcupy_backends/cuda/api/runtime.pyx\u001b[0m in \u001b[0;36mcupy_backends.cuda.api.runtime.getDevice\u001b[0;34m()\u001b[0m\n",
"\u001b[0;32mcupy_backends/cuda/api/runtime.pyx\u001b[0m in \u001b[0;36mcupy_backends.cuda.api.runtime.check_status\u001b[0;34m()\u001b[0m\n",
"\u001b[0;31mCUDARuntimeError\u001b[0m: cudaErrorInsufficientDriver: CUDA driver version is insufficient for CUDA runtime version"
]
}
],
"source": [
"import cupy as cp\n",
"\n",
"# Prepare sample input data for the matrix multiplication\n",
"n, m, k = 2000, 4000, 5000\n",
"a = cp.random.rand(n, k)\n",
"b = cp.random.rand(k, m)\n",
"\n",
"# Perform matrix multiplication\n",
"result = cp.matmul(a, b)\n",
"\n",
"# Print the result\n",
"print(result)\n",
"\n",
"# Print the CUDA device for each array\n",
"print(a.device)\n",
"print(b.device)\n",
"print(result.device)"
]
},
{
"cell_type": "markdown",
"id": "7528a6f8",
"metadata": {
"id": "7528a6f8"
},
"source": [
"### Using `nvmath-python` alongside CuPy\n",
"\n",
"This is a slight modification of the example above, where the matrix multiplication is done with the corresponding `nvmath-python` implementation.\n",
"\n",
"Note that `nvmath-python` supports multiple frameworks, including CuPy. It uses the framework's memory pool and current stream for seamless integration. The result of each operation is a tensor of the same framework as the inputs, located on the same device as the inputs."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "311ee2e9",
"metadata": {
"id": "311ee2e9"
},
"outputs": [],
"source": [
"# The same matrix multiplication as in the previous example, but using nvmath-python\n",
"import nvmath\n",
"\n",
"# Perform matrix multiplication\n",
"result = nvmath.linalg.advanced.matmul(a, b)\n",
"\n",
"# Print the result\n",
"print(result)\n",
"\n",
"# Print the CUDA device for each array\n",
"print(a.device)\n",
"print(b.device)\n",
"print(result.device)\n"
]
},
{
"cell_type": "markdown",
"id": "85b2ae1b",
"metadata": {
"id": "85b2ae1b"
},
"source": [
"As we can see, the code looks essentially the same. If you measure the performance of the two implementations, it will be nearly identical.\n",
"\n",
"This is because CuPy and `nvmath-python` (as well as PyTorch) all use the CUDA-X Math Libraries as their engine. Which library to choose for this matrix multiplication problem is up to the user.\n",
"\n",
"The next examples demonstrate cases where `nvmath-python` may become essential for reaching peak performance."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bf34d34d",
"metadata": {
"id": "bf34d34d"
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "nersc-nvmath",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.5"
},
"colab": {
"provenance": [],
"include_colab_link": true
}
},
"nbformat": 4,
"nbformat_minor": 5
}
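The notebook's two code cells differ only in the call that performs the multiplication. For readers without a working CUDA driver (the recorded output above shows a `cudaErrorInsufficientDriver` failure in Colab), the same pattern can be sketched on the CPU with NumPy as a stand-in; this is not the notebook's GPU code, only an illustration of the array shapes and the drop-in nature of the call. On a GPU machine, `cp.random.rand` and `nvmath.linalg.advanced.matmul(a, b)` slot into the same places:

```python
import numpy as np

# CPU stand-in for the notebook's GPU example (smaller sizes for speed).
# On a GPU machine, cp.random.rand / nvmath.linalg.advanced.matmul
# are the drop-in replacements, per the cells above.
n, m, k = 200, 400, 500
a = np.random.rand(n, k)
b = np.random.rand(k, m)

# Multiplying an (n, k) array by a (k, m) array yields an (n, m) array,
# exactly as in the CuPy and nvmath-python cells.
result = a @ b
print(result.shape)  # (200, 400)
```

The sizes here are deliberately smaller than the notebook's 2000/4000/5000 so the sketch runs instantly on a CPU.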

0 commit comments
