Commit 912e599

Merge pull request #7 from leimao/iplugin_v3
Add IPlugin V3 and TensorRT 10 Support
2 parents a2b43b9 + 1321cb2 commit 912e599

29 files changed

Lines changed: 1016 additions & 137 deletions

README.md

Lines changed: 22 additions & 6 deletions
@@ -13,15 +13,15 @@ The ONNX model we created is a simple identity neural network that consists of t
 To build the custom Docker image, please run the following command.
 
 ```bash
-$ docker build -f docker/tensorrt.Dockerfile --no-cache --tag=tensorrt:24.02 .
+$ docker build -f docker/tensorrt.Dockerfile --no-cache --tag=tensorrt:24.05 .
 ```
 
 ### Run Docker Container
 
 To run the custom Docker container, please run the following command.
 
 ```bash
-$ docker run -it --rm --gpus device=0 -v $(pwd):/mnt tensorrt:24.02
+$ docker run -it --rm --gpus device=0 -v $(pwd):/mnt tensorrt:24.05
 ```
 
 ### Build Application
@@ -33,7 +33,9 @@ $ cmake -B build
 $ cmake --build build --config Release --parallel
 ```
 
-Under the `build/src` directory, the custom plugin library will be saved as `libidentity_conv.so`, the engine builder will be saved as `build_engine`, and the engine runner will be saved as `run_engine`.
+Under the `build/src/plugins` directory, the custom plugin libraries will be saved as `libidentity_conv_iplugin_v2_io_ext.so` for `IPluginV2IOExt` and `libidentity_conv_iplugin_v3.so` for `IPluginV3`, respectively. The `IPluginV2IOExt` plugin interface has been deprecated since TensorRT 10.0.0 and will be removed in the future. The `IPluginV3` plugin interface is the only recommended interface for custom plugin development.
+
+Under the `build/src/apps` directory, the engine builder will be saved as `build_engine` and the engine runner will be saved as `run_engine`.
 
 ### Build ONNX Model
 
@@ -67,18 +69,32 @@ The ONNX model will be saved as `identity_neural_network.onnx` under the `data`
 
 To build the TensorRT engine from the ONNX model, please run the following command.
 
+#### Build Engine with IPluginV2IOExt
+
 ```bash
-$ ./build/src/build_engine
+$ ./build/src/apps/build_engine data/identity_neural_network.onnx build/src/plugins/IdentityConvIPluginV2IOExt/libidentity_conv_iplugin_v2_io_ext.so data/identity_neural_network_iplugin_v2_io_ext.engine
 ```
 
-The TensorRT engine will be saved as `identity_neural_network.engine` under the `data` directory.
+#### Build Engine with IPluginV3
+
+```bash
+$ ./build/src/apps/build_engine data/identity_neural_network.onnx build/src/plugins/IdentityConvIPluginV3/libidentity_conv_iplugin_v3.so data/identity_neural_network_iplugin_v3.engine
+```
 
 ### Run TensorRT Engine
 
 To run the TensorRT engine, please run the following command.
 
+#### Run Engine with IPluginV2IOExt
+
+```bash
+$ ./build/src/apps/run_engine build/src/plugins/IdentityConvIPluginV2IOExt/libidentity_conv_iplugin_v2_io_ext.so data/identity_neural_network_iplugin_v2_io_ext.engine
+```
+
+#### Run Engine with IPluginV3
+
 ```bash
-$ ./build/src/run_engine
+$ ./build/src/apps/run_engine build/src/plugins/IdentityConvIPluginV3/libidentity_conv_iplugin_v3.so data/identity_neural_network_iplugin_v3.engine
 ```
 
 If the custom plugin implementation and integration are correct, the output of the TensorRT engine should be the same as the input.
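The correctness criterion above can be expressed as a small check. This is a minimal NumPy sketch, not the repository's actual test code: the hypothetical `identity_conv` function stands in for the whole plugin/engine pipeline, which (being an identity network) should behave exactly like it.

```python
import numpy as np

# Hypothetical stand-in for the identity neural network with the IdentityConv
# plugin: the pipeline is an identity, so it returns its input unchanged.
def identity_conv(x: np.ndarray) -> np.ndarray:
    return x.copy()

# Feed random data through and confirm the output equals the input exactly,
# mirroring the correctness check described above. The shape is arbitrary.
input_data = np.random.uniform(low=-10.0, high=10.0,
                               size=(1, 3, 4, 4)).astype(np.float32)
output_data = identity_conv(input_data)
np.testing.assert_equal(input_data, output_data)
```

The repository's Python unit tests apply the same input-equals-output assertion to the real TensorRT engine outputs.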

data/identity_neural_network.onnx

170 Bytes (binary file not shown)

docker/tensorrt.Dockerfile

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
-FROM nvcr.io/nvidia/tensorrt:24.02-py3
+FROM nvcr.io/nvidia/tensorrt:24.05-py3
 
-ARG CMAKE_VERSION=3.28.0
+ARG CMAKE_VERSION=3.29.3
 ARG NUM_JOBS=8
 
 ENV DEBIAN_FRONTEND noninteractive

python/README.md

Lines changed: 13 additions & 3 deletions
@@ -2,17 +2,27 @@
 
 ## Unit Test
 
+Assuming the `IPluginV2IOExt` and `IPluginV3` plugins have been built and the engines that use each of the plugins have been built, the unit tests can be run.
+
 To run the unit test, please run the following command.
 
 ```bash
-python -m unittest test_plugin
-python -m unittest test_engine
+$ python -m unittest test_plugin
+$ python -m unittest test_engine
 ```
 
 ## Run TensorRT Engine
 
 To run the TensorRT engine, please run the following command.
 
+### IPluginV2IOExt
+
+```bash
+$ python main.py --engine_file_path ../data/identity_neural_network_iplugin_v2_io_ext.engine --plugin_lib_file_path ../build/src/plugins/IdentityConvIPluginV2IOExt/libidentity_conv_iplugin_v2_io_ext.so
+```
+
+### IPluginV3
+
 ```bash
-$ python main.py
+$ python main.py --engine_file_path ../data/identity_neural_network_iplugin_v3.engine --plugin_lib_file_path ../build/src/plugins/IdentityConvIPluginV3/libidentity_conv_iplugin_v3.so
 ```

python/common.py

Lines changed: 31 additions & 17 deletions
@@ -1,5 +1,6 @@
 # Slightly modified from
 # https://github.com/NVIDIA/TensorRT/blob/c0c633cc629cc0705f0f69359f531a192e524c0f/samples/python/common.py
+# https://github.com/NVIDIA/TensorRT/blob/ccf119972b50299ba00d35d39f3938296e187f4e/samples/python/common_runtime.py
 
 #
 # SPDX-FileCopyrightText: Copyright (c) 1993-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
@@ -150,8 +151,6 @@ def allocate_buffers(engine: trt.ICudaEngine,
             raise ValueError(f"Binding {binding} has dynamic shape, " +\
                 "but no profile was specified.")
         size = trt.volume(shape)
-        if engine.has_implicit_batch_dimension:
-            size *= engine.max_batch_size
         dtype = np.dtype(trt.nptype(engine.get_tensor_dtype(binding)))
 
         # Allocate host and device buffers
@@ -219,23 +218,38 @@ def _do_inference_base(inputs, outputs, stream, execute_async):
     return [out.host for out in outputs]
 
 
-# This function is generalized for multiple inputs/outputs.
-# inputs and outputs are expected to be lists of HostDeviceMem objects.
-def do_inference(context, bindings, inputs, outputs, stream, batch_size=1):
-
-    def execute_async():
-        context.execute_async(batch_size=batch_size,
-                              bindings=bindings,
-                              stream_handle=stream)
-
-    return _do_inference_base(inputs, outputs, stream, execute_async)
+def _do_inference_base(inputs, outputs, stream, execute_async_func):
+    # Transfer input data to the GPU.
+    kind = cudart.cudaMemcpyKind.cudaMemcpyHostToDevice
+    [
+        cuda_call(
+            cudart.cudaMemcpyAsync(inp.device, inp.host, inp.nbytes, kind,
+                                   stream)) for inp in inputs
+    ]
+    # Run inference.
+    execute_async_func()
+    # Transfer predictions back from the GPU.
+    kind = cudart.cudaMemcpyKind.cudaMemcpyDeviceToHost
+    [
+        cuda_call(
+            cudart.cudaMemcpyAsync(out.host, out.device, out.nbytes, kind,
+                                   stream)) for out in outputs
+    ]
+    # Synchronize the stream
+    cuda_call(cudart.cudaStreamSynchronize(stream))
+    # Return only the host outputs.
+    return [out.host for out in outputs]
 
 
-# This function is generalized for multiple inputs/outputs for full dimension networks.
+# This function is generalized for multiple inputs/outputs.
 # inputs and outputs are expected to be lists of HostDeviceMem objects.
-def do_inference_v2(context, bindings, inputs, outputs, stream):
+def do_inference(context, engine, bindings, inputs, outputs, stream):
 
-    def execute_async():
-        context.execute_async_v2(bindings=bindings, stream_handle=stream)
+    def execute_async_func():
+        context.execute_async_v3(stream_handle=stream)
 
-    return _do_inference_base(inputs, outputs, stream, execute_async)
+    # Setup context tensor address.
+    num_io = engine.num_io_tensors
+    for i in range(num_io):
+        context.set_tensor_address(engine.get_tensor_name(i), bindings[i])
+    return _do_inference_base(inputs, outputs, stream, execute_async_func)
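The refactored `do_inference` follows a copy-in, execute, copy-out, synchronize pattern. Here is a CUDA-free sketch of that control flow, with hypothetical `FakeHostDeviceMem` objects (plain NumPy arrays standing in for the host and pinned device buffers) and a callable standing in for `context.execute_async_v3`; it is an illustration of the pattern, not the real runtime code.

```python
import numpy as np

# Hypothetical stand-in for common.HostDeviceMem: pairs a "host" buffer with a
# "device" buffer (both plain numpy arrays here, since no GPU is involved).
class FakeHostDeviceMem:
    def __init__(self, shape):
        self.host = np.zeros(shape, dtype=np.float32)
        self.device = np.zeros(shape, dtype=np.float32)

def do_inference_sketch(inputs, outputs, execute_async_func):
    # Transfer input data to the "device" (cudaMemcpyAsync in the real code).
    for inp in inputs:
        np.copyto(inp.device, inp.host)
    # Run inference (context.execute_async_v3 in the real code).
    execute_async_func()
    # Transfer predictions back to the "host" (cudaMemcpyAsync again).
    for out in outputs:
        np.copyto(out.host, out.device)
    # (The real code then synchronizes the CUDA stream here.)
    # Return only the host outputs.
    return [out.host for out in outputs]

inp, out = FakeHostDeviceMem((4,)), FakeHostDeviceMem((4,))
inp.host[:] = [1.0, 2.0, 3.0, 4.0]
# The "engine" is an identity: copy the input device buffer to the output one.
results = do_inference_sketch([inp], [out],
                              lambda: np.copyto(out.device, inp.device))
```

In the real `do_inference`, the extra `engine` parameter exists because the v3 execution API no longer takes a bindings list; each I/O tensor address must be registered on the context via `set_tensor_address` before execution.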

python/main.py

Lines changed: 29 additions & 7 deletions
@@ -1,3 +1,4 @@
+import argparse
 import numpy as np
 
 import common
@@ -6,8 +7,26 @@
 
 def main():
 
-    engine_file_path = "../data/identity_neural_network.engine"
-    plugin_lib_file_path = "../build/src/libidentity_conv.so"
+    # Add an argparser to specify the engine file path and plugin library file path.
+    parser = argparse.ArgumentParser(
+        description="Run an engine with Identity Plugin.")
+    parser.add_argument(
+        "--engine_file_path",
+        type=str,
+        default="../data/identity_neural_network_iplugin_v3.engine",
+        help="Path to the engine file.",
+    )
+    parser.add_argument(
+        "--plugin_lib_file_path",
+        type=str,
+        default=
+        "../build/src/plugins/IdentityConvIPluginV3/libidentity_conv_iplugin_v3.so",
+        help="Path to the plugin library file.",
+    )
+
+    args = parser.parse_args()
+    engine_file_path = args.engine_file_path
+    plugin_lib_file_path = args.plugin_lib_file_path
 
     common_runtime.load_plugin_lib(plugin_lib_file_path)
     engine = common_runtime.load_engine(engine_file_path)
@@ -46,11 +65,14 @@ def main():
 
     # Execute the engine.
     context = engine.create_execution_context()
-    common.do_inference_v2(context,
-                           bindings=bindings,
-                           inputs=inputs,
-                           outputs=outputs,
-                           stream=stream)
+    common.do_inference(
+        context=context,
+        engine=engine,
+        inputs=inputs,
+        outputs=outputs,
+        bindings=bindings,
+        stream=stream,
+    )
 
     # Print output tensor data.
     for host_device_buffer in outputs:
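The new command-line interface can be exercised without TensorRT or a GPU. This sketch reproduces the parser added above (defaults copied from the diff) and shows that an empty argument list falls back to the `IPluginV3` paths, while explicit flags override them:

```python
import argparse

# Mirror of the argument parser added in python/main.py; the default paths
# are copied verbatim from the diff above.
parser = argparse.ArgumentParser(
    description="Run an engine with Identity Plugin.")
parser.add_argument(
    "--engine_file_path",
    type=str,
    default="../data/identity_neural_network_iplugin_v3.engine",
    help="Path to the engine file.",
)
parser.add_argument(
    "--plugin_lib_file_path",
    type=str,
    default="../build/src/plugins/IdentityConvIPluginV3/libidentity_conv_iplugin_v3.so",
    help="Path to the plugin library file.",
)

# Parsing an empty argument list yields the IPluginV3 defaults.
args = parser.parse_args([])
# Explicit flags (e.g. the IPluginV2IOExt paths) override the defaults.
v2_args = parser.parse_args(
    ["--engine_file_path",
     "../data/identity_neural_network_iplugin_v2_io_ext.engine"])
```

Defaulting to the `IPluginV3` artifacts is consistent with the README's note that `IPluginV3` is the recommended interface going forward.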

python/test_engine.py

Lines changed: 40 additions & 26 deletions
@@ -5,40 +5,54 @@
 import common_runtime
 
 
-class TestMain(unittest.TestCase):
+def test_engine(engine_file_path: str, plugin_lib_file_path: str):
+
+    common_runtime.load_plugin_lib(plugin_lib_file_path=plugin_lib_file_path)
+    engine = common_runtime.load_engine(engine_file_path=engine_file_path)
+
+    inputs, outputs, bindings, stream = common.allocate_buffers(
+        engine=engine, profile_idx=None)
 
-    def test_engine(self):
+    for host_device_buffer in inputs:
+        data = np.random.uniform(low=-10.0,
+                                 high=10.0,
+                                 size=host_device_buffer.shape).astype(
+                                     host_device_buffer.dtype).flatten()
+        np.copyto(host_device_buffer.host, data)
 
-        engine_file_path = "../data/identity_neural_network.engine"
-        plugin_lib_file_path = "../build/src/libidentity_conv.so"
+    context = engine.create_execution_context()
+    common.do_inference(
+        context=context,
+        engine=engine,
+        inputs=inputs,
+        outputs=outputs,
+        bindings=bindings,
+        stream=stream,
+    )
 
-        common_runtime.load_plugin_lib(
-            plugin_lib_file_path=plugin_lib_file_path)
-        engine = common_runtime.load_engine(engine_file_path=engine_file_path)
+    for input_host_device_buffer, output_host_device_buffer in zip(
+            inputs, outputs):
+        np.testing.assert_equal(input_host_device_buffer.host,
+                                output_host_device_buffer.host)
 
-        inputs, outputs, bindings, stream = common.allocate_buffers(
-            engine=engine, profile_idx=None)
+    common.free_buffers(inputs=inputs, outputs=outputs, stream=stream)
+
+
+class TestMain(unittest.TestCase):
 
-        for host_device_buffer in inputs:
-            data = np.random.uniform(low=-10.0,
-                                     high=10.0,
-                                     size=host_device_buffer.shape).astype(
-                                         host_device_buffer.dtype).flatten()
-            np.copyto(host_device_buffer.host, data)
+    def test_engine_v2(self):
 
-        context = engine.create_execution_context()
-        common.do_inference_v2(context,
-                               bindings=bindings,
-                               inputs=inputs,
-                               outputs=outputs,
-                               stream=stream)
+        engine_file_path = "../data/identity_neural_network_iplugin_v2_io_ext.engine"
+        plugin_lib_file_path = "../build/src/plugins/IdentityConvIPluginV2IOExt/libidentity_conv_iplugin_v2_io_ext.so"
+        test_engine(engine_file_path=engine_file_path,
+                    plugin_lib_file_path=plugin_lib_file_path)
 
-        for input_host_device_buffer, output_host_device_buffer in zip(
-                inputs, outputs):
-            np.testing.assert_equal(input_host_device_buffer.host,
-                                    output_host_device_buffer.host)
+    def test_engine_v3(self):
 
-        common.free_buffers(inputs=inputs, outputs=outputs, stream=stream)
+        engine_file_path = "../data/identity_neural_network_iplugin_v3.engine"
+        plugin_lib_file_path = "../build/src/plugins/IdentityConvIPluginV3/libidentity_conv_iplugin_v3.so"
+        test_engine(engine_file_path=engine_file_path,
+                    plugin_lib_file_path=plugin_lib_file_path)
 
 
 if __name__ == "__main__":
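The refactor in test_engine.py pulls the per-engine test body into a module-level helper parameterized by file paths, so each `unittest` method only supplies its variant's configuration. This is a hypothetical miniature of that pattern (the `check_identity` helper and lambdas stand in for loading and running the two real engines):

```python
import unittest
import numpy as np

# Shared, parameterized check logic lives in a module-level helper, as in the
# refactored test_engine.py; each test method passes in only what varies.
def check_identity(transform):
    data = np.random.uniform(low=-10.0, high=10.0,
                             size=(1, 3, 4, 4)).astype(np.float32)
    np.testing.assert_equal(transform(data), data)

class TestMain(unittest.TestCase):

    def test_variant_a(self):
        # Stands in for the engine built with the IPluginV2IOExt plugin.
        check_identity(lambda x: x.copy())

    def test_variant_b(self):
        # Stands in for the engine built with the IPluginV3 plugin.
        check_identity(lambda x: x + 0.0)

suite = unittest.TestLoader().loadTestsFromTestCase(TestMain)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Keeping the helper outside the `TestCase` also lets other scripts reuse it directly with arbitrary engine and plugin paths.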
