NVIDIA · rgsl888prabhu · Jun 10, 2026 · Jun 10, 2026
@@ -1,4 +1,4 @@
-# NVIDIA cuOpt gRPC server architecture
+# NVIDIA cuOpt gRPC Server Architecture
 
 <!--
   SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

@@ -3,14 +3,14 @@
    SPDX-License-Identifier: Apache-2.0
 
 =======================
-Advanced configuration
+Advanced Configuration
 =======================
 
 This page lists **configuration parameters** first, then **usage** walkthroughs (TLS, Docker, private CA). Complete :doc:`quick-start` first (install, plain TCP server, and minimal example).
 
 For RPC summaries and server behavior, see :doc:`api` and :doc:`grpc-server-architecture`. Example entry points with ``CUOPT_REMOTE_*``: :doc:`examples`. Contributor-only internals: ``cpp/docs/grpc-server-architecture.md`` in the repository.
 
-Configuration parameters
+Configuration Parameters
 ========================
 
 ``cuopt_grpc_server`` (host or explicit container command)
@@ -39,7 +39,7 @@ Run ``cuopt_grpc_server --help`` for the full list. Typical flags (also passable
          --tls-root PATH          Root CA certificate (for client verification)
          --require-client-cert    Require client certificate (mTLS)
 
-NVIDIA cuOpt container (gRPC via entrypoint)
+NVIDIA cuOpt Container (gRPC via Entrypoint)
 --------------------------------------------
 
 These variables apply when the container **entrypoint** builds a ``cuopt_grpc_server`` command (see *Docker: gRPC Server in Container* under Usage). If you pass an explicit command after the image name, this table does not apply.
@@ -66,7 +66,7 @@ These variables apply when the container **entrypoint** builds a ``cuopt_grpc_se
 
 The REST server path in the same image still uses ``CUOPT_SERVER_PORT`` for HTTP in other docs; that is separate from the gRPC defaults above.
 
-Bundled remote client (Python, C API, ``cuopt_cli``)
+Bundled Remote Client (Python, C API, ``cuopt_cli``)
 ----------------------------------------------------
 
 Remote mode is active when **both** ``CUOPT_REMOTE_HOST`` and ``CUOPT_REMOTE_PORT`` are set. A **custom** gRPC client does not read these automatically; it must configure the channel and protos itself (see :doc:`api`).
@@ -119,7 +119,7 @@ Remote mode is active when **both** ``CUOPT_REMOTE_HOST`` and ``CUOPT_REMOTE_POR
 Usage
 =====
 
-Start the server with TLS
+Start the Server with TLS
 --------------------------
 
 The basic plain-TCP setup (no TLS) is covered in :doc:`quick-start`. To start an encrypted server:
@@ -142,7 +142,7 @@ mTLS (mutual TLS):
      --tls-root ca.crt \
      --require-client-cert
 
-How mTLS works
+How mTLS Works
 --------------
 
 With mTLS the server verifies every client, and the client verifies the server. Trust is based on **Certificate Authorities** (CAs), not individual certificate lists:
@@ -151,7 +151,7 @@ With mTLS the server verifies every client, and the client verifies the server.
 * ``--require-client-cert`` makes client verification **mandatory**. Without it, the server may still allow connections without a client cert.
 * On the client, ``CUOPT_TLS_ROOT_CERT`` is the CA that signed the **server** certificate so the client can verify the server.
 
-Restricting access with a private CA
+Restricting Access with a Private CA
 ------------------------------------
 
 To limit which clients can connect, run your own CA and issue client certs only to authorized actors.
@@ -217,7 +217,7 @@ Repeat for each authorized client. Keep ``ca.key`` private; distribute ``ca.crt`
 
 **Revocation:** built-in gRPC TLS does **not** implement CRL or OCSP. To revoke a client, rotate the CA, stop issuing from a compromised CA, or terminate TLS at a reverse proxy (e.g., Envoy) that supports revocation.
 
-Docker: gRPC server in container
+Docker: gRPC Server in Container
 ---------------------------------
 
 The official NVIDIA cuOpt image includes the REST server and ``cuopt_grpc_server``. The entrypoint behaves as follows:
@@ -256,7 +256,7 @@ Bypass the entrypoint:
      nvcr.io/nvidia/cuopt/cuopt:latest \
      cuopt_grpc_server --port 5001 --workers 2
 
-Client environment (examples)
+Client Environment (Examples)
 ------------------------------
 
 **Required** for remote mode (see the *Bundled Remote Client* table for all variables):
@@ -280,7 +280,7 @@ For mTLS, also:
    export CUOPT_TLS_CLIENT_CERT=client.crt
    export CUOPT_TLS_CLIENT_KEY=client.key
 
-Limitations and scope
+Limitations and Scope
 =====================
 
 * **Problem types** — **LP**, **MILP**, and **QP** are supported on the gRPC remote path. **Routing** (VRP, TSP, PDP) is **not** supported yet; use the :doc:`REST self-hosted server <../cuopt-server/index>` for remote routing until a future release adds routing over ``CuOptRemoteService``.
@@ -306,7 +306,7 @@ Troubleshooting
    * - Timeout on large problems
      - Increase solver ``time_limit`` and client/server message limits.
 
-Further reading
+Further Reading
 ===============
 
 * :doc:`quick-start` — Plain TCP quick path.
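
The client-side rules in this page (remote mode needs **both** ``CUOPT_REMOTE_HOST`` and ``CUOPT_REMOTE_PORT``; TLS paths are optional additions) can be sketched as a small helper. This is an illustration only, not the bundled client's implementation: the function name ``remote_settings`` and the returned dictionary keys are hypothetical, and treating an empty string as "unset" is an assumption.

```python
import os
from typing import Mapping, Optional


def remote_settings(env: Mapping[str, str] = os.environ) -> Optional[dict]:
    """Return connection settings when remote mode is active, else None.

    Hypothetical helper: only the environment variable names come from the docs.
    """
    host = env.get("CUOPT_REMOTE_HOST")
    port = env.get("CUOPT_REMOTE_PORT")
    # Remote mode requires BOTH variables; with only one set, nothing is forwarded.
    if not host or not port:
        return None

    settings = {"target": f"{host}:{int(port)}"}

    # Optional TLS material, using the variable names from the exports above.
    optional = (
        ("root_cert", "CUOPT_TLS_ROOT_CERT"),
        ("client_cert", "CUOPT_TLS_CLIENT_CERT"),
        ("client_key", "CUOPT_TLS_CLIENT_KEY"),
    )
    for key, var in optional:
        if env.get(var):
            settings[key] = env[var]
    return settings
```

A custom gRPC client could use a helper like this to decide whether to open a channel at all, since it does not read these variables automatically.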

@@ -3,7 +3,7 @@
    SPDX-License-Identifier: Apache-2.0
 
 ======================
-gRPC API (reference)
+gRPC API (Reference)
 ======================
 
 The **CuOptRemoteService** gRPC API is defined in Protocol Buffers under the ``cuopt.remote`` package. Source files in the repository:
@@ -16,7 +16,7 @@ Most users do **not** call these RPCs directly: the NVIDIA cuOpt **Python** API,
 Service: ``CuOptRemoteService``
 ================================
 
-Asynchronous jobs
+Asynchronous Jobs
 -----------------
 
 .. list-table::
@@ -38,7 +38,7 @@ Asynchronous jobs
    * - ``WaitForCompletion``
      - Block until the job finishes (status only; use ``GetResult`` for the solution).
 
-Chunked upload (large problems)
+Chunked Upload (Large Problems)
 --------------------------------
 
 .. list-table::
@@ -54,7 +54,7 @@ Chunked upload (large problems)
    * - ``FinishChunkedUpload``
      - Finalize the upload and return ``job_id`` (same as ``SubmitJob``).
 
-Chunked download (large results)
+Chunked Download (Large Results)
 --------------------------------
 
 .. list-table::
@@ -70,7 +70,7 @@ Chunked download (large results)
    * - ``FinishChunkedDownload``
      - End the download session and release server state.
 
-Streaming and callbacks
+Streaming and Callbacks
 -----------------------
 
 .. list-table::
@@ -84,14 +84,14 @@ Streaming and callbacks
    * - ``GetIncumbents``
      - MILP incumbent solutions since a given index.
 
-Messages and constraints
+Messages and Constraints
 ========================
 
 * **Problem types** — LP and MILP in the enum; the problem payload can include quadratic objective data for **QP**-style solves where the client API supports it. **Routing** over this gRPC service is **not** available yet; it is planned for an **upcoming** release (use REST for remote routing today).
 * **Solver settings** — Carried as ``PDLPSolverSettings`` or ``MIPSolverSettings`` inside the request or chunked header, aligned with the NVIDIA cuOpt solver options documentation.
 * **Errors** — gRPC status codes carry failures (see comments at the end of ``cuopt_remote_service.proto``).
 
-Further reading
+Further Reading
 ===============
 
 * :doc:`grpc-server-architecture` — Server process model and job lifecycle (overview); :doc:`advanced` for ``cuopt_grpc_server`` flags. Contributor details: ``cpp/docs/grpc-server-architecture.md``.

@@ -21,7 +21,7 @@ Add TLS or tuning variables from :doc:`advanced` if your deployment uses them.
 
    Routing solve over gRPC is not supported. For solving routing problems remotely today, use the HTTP/JSON :doc:`REST self-hosted server <../cuopt-server/index>` and :doc:`Examples <../cuopt-server/examples/index>`.
 
-Where to find examples
+Where to Find Examples
 ======================
 
 Python (LP / QP / MILP)
@@ -43,20 +43,20 @@ C API (LP / QP / MILP)
 
 * :doc:`../cuopt-cli/cli-examples` — ``cuopt_cli`` invocations. With the exports above, the CLI forwards solves to ``cuopt_grpc_server``.
 
-Minimal demos (this section)
+Minimal Demos (This Section)
 ----------------------------
 
 Bundled with the gRPC docs source for a quick copy-paste path (also walked through in :doc:`quick-start`):
 
 * :download:`remote_lp_demo.py <examples/remote_lp_demo.py>`
 * :download:`remote_lp_demo.mps <examples/remote_lp_demo.mps>`
 
-Custom gRPC client
+Custom gRPC Client
 ------------------
 
 Integrations that do **not** use the bundled Python / C / CLI stack should speak ``CuOptRemoteService`` directly. See :doc:`api`, :doc:`grpc-server-architecture`, and ``cpp/docs/grpc-server-architecture.md`` in the repository for protos and server behavior.
 
-More samples
+More Samples
 ============
 
 * `NVIDIA cuOpt examples on GitHub <https://github.com/NVIDIA/cuopt-examples>`_ — set the remote environment on the **client** before running notebooks or scripts.

@@ -1,62 +1,20 @@
-# gRPC server behavior
+# gRPC Server Behavior
 
 NVIDIA cuOpt's **`cuopt_grpc_server`** uses one **main process** (gRPC front end, job tracking, background threads) and **worker processes** that run GPU solves. That layout gives isolation between jobs, optional parallelism when you set multiple workers, and streaming for large problems and logs.
 
 Implementation details (IPC layout, C++ source map, chunked transfer internals) live in the contributor reference: **`cpp/docs/grpc-server-architecture.md`** in the NVIDIA cuOpt repository.
 
 ## Process model
 
-```text
-┌──────────────────────────────────────────────────────────────────────┐
-│                        Main Server Process                           │
-│                                                                      │
-│  ┌─────────────┐  ┌──────────────┐  ┌─────────────────────────────┐  │
-│  │  gRPC       │  │  Job         │  │  Background Threads         │  │
-│  │  Service    │  │  Tracker     │  │  - Result retrieval         │  │
-│  │  Handler    │  │  (job status,│  │  - Incumbent retrieval      │  │
-│  │             │  │   results)   │  │  - Worker monitor           │  │
-│  └─────────────┘  └──────────────┘  └─────────────────────────────┘  │
-│         │                                        ▲                   │
-│         │ shared memory                          │ pipes             │
-│         ▼                                        │                   │
-│  ┌─────────────────────────────────────────────────────────────────┐ │
-│  │                       Shared Memory Queues                      │ │
-│  │                                                                 │ │
-│  │   ┌─────────────────┐        ┌─────────────────────┐            │ │
-│  │   │  Job Queue      │        │  Result Queue       │            │ │
-│  │   │  (MAX_JOBS=100) │        │  (MAX_RESULTS=100)  │            │ │
-│  │   └─────────────────┘        └─────────────────────┘            │ │
-│  └─────────────────────────────────────────────────────────────────┘ │
-└──────────────────────────────────────────────────────────────────────┘
-               │                                        ▲
-               │ fork()                                 │
-               ▼                                        │
-     ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
-     │  Worker 0       │  │  Worker 1       │  │  Worker N       │
-     │  ┌───────────┐  │  │  ┌───────────┐  │  │  ┌───────────┐  │
-     │  │ GPU Solve │  │  │  │ GPU Solve │  │  │  │ GPU Solve │  │
-     │  └───────────┘  │  │  └───────────┘  │  │  └───────────┘  │
-     │  (separate proc)│  │  (separate proc)│  │  (separate proc)│
-     └─────────────────┘  └─────────────────┘  └─────────────────┘
-```
+![gRPC Server Process Model](images/grpc-process-model.png)
 
 ## Job lifecycle (summary)
 
 **Submit** → the server assigns a job ID and queues work. **Process** → a worker pulls the problem, solves on the GPU, and streams the result back. **Retrieve** → the client uses status and result RPCs (including chunked download when needed). See [gRPC API (Reference)](api.rst) for RPC names.
 
 ## Job states
 
-```text
-┌─────────┐  submit   ┌───────────┐  claim   ┌────────────┐
-│ QUEUED  │──────────►│ PROCESSING│─────────►│ COMPLETED  │
-└─────────┘           └───────────┘          └────────────┘
-     │                      │
-     │ cancel               │ error
-     ▼                      ▼
-┌───────────┐          ┌─────────┐
-│ CANCELLED │          │ FAILED  │
-└───────────┘          └─────────┘
-```
+![gRPC Server Job States](images/grpc-job-states.png)
 
 ## Logs, capacity, and workers
 

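Because the ASCII job-state diagram is replaced by an image, a text form of the same transitions may still be useful for readers who cannot view it. The sketch below encodes only the edges drawn in the removed diagram (QUEUED to PROCESSING or CANCELLED, PROCESSING to COMPLETED or FAILED). It is an illustration, not the server's code: the names `JobState`, `ALLOWED`, `can_transition`, and `is_terminal` are hypothetical, and the real server may permit transitions the diagram did not show.

```python
from enum import Enum


class JobState(Enum):
    QUEUED = "QUEUED"
    PROCESSING = "PROCESSING"
    COMPLETED = "COMPLETED"
    CANCELLED = "CANCELLED"
    FAILED = "FAILED"


# Edges taken from the removed ASCII diagram; anything else is treated as invalid here.
ALLOWED = {
    JobState.QUEUED: {JobState.PROCESSING, JobState.CANCELLED},
    JobState.PROCESSING: {JobState.COMPLETED, JobState.FAILED},
    JobState.COMPLETED: set(),
    JobState.CANCELLED: set(),
    JobState.FAILED: set(),
}


def can_transition(src: JobState, dst: JobState) -> bool:
    """True if the diagram draws an edge from src to dst."""
    return dst in ALLOWED[src]


def is_terminal(state: JobState) -> bool:
    """A state with no outgoing edges is terminal."""
    return not ALLOWED[state]
```

A polling client could use `is_terminal` to decide when to stop checking status and fetch the result.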
@@ -3,7 +3,7 @@
    SPDX-License-Identifier: Apache-2.0
 
 ==========================
-gRPC remote execution
+gRPC Remote Execution
 ==========================
 
 **NVIDIA cuOpt gRPC remote execution** runs optimization solves on a remote GPU host. Clients can be the **Python** API, **C API**, ``cuopt_cli``, or a **custom** program that speaks ``CuOptRemoteService`` over gRPC. For Python, the C API, and ``cuopt_cli``, set ``CUOPT_REMOTE_HOST`` and ``CUOPT_REMOTE_PORT`` to forward solves to ``cuopt_grpc_server``.

@@ -3,7 +3,7 @@
    SPDX-License-Identifier: Apache-2.0
 
 ===========
-Quick start
+Quick Start
 ===========
 
 **NVIDIA cuOpt gRPC remote execution** runs LP, MILP, and QP solves on a **GPU host** while your **Python** code, **C API** program, ``cuopt_cli``, or a **custom** client runs elsewhere. When you set ``CUOPT_REMOTE_HOST`` and ``CUOPT_REMOTE_PORT``, the bundled **Python**, **C API**, and ``cuopt_cli`` clients forward ``solve_lp`` / ``solve_mip`` to ``cuopt_grpc_server`` with **no code changes**. **Custom** clients call ``CuOptRemoteService`` directly (see :doc:`api`).
@@ -12,7 +12,7 @@ Quick start
 
    **Problem types (gRPC remote):** **LP**, **MILP**, and **QP** are supported today. **Routing** (VRP, TSP, PDP) over this path is **not** available. For remote routing, use the HTTP/JSON :doc:`REST self-hosted server <../cuopt-server/index>`. This guide does **not** cover the REST server; see :doc:`../cuopt-server/index` for HTTP/JSON.
 
-How remote execution works
+How Remote Execution Works
 ==========================
 
 1. **GPU host** — Run ``cuopt_grpc_server`` (bare metal or in the official container) so it listens on a TCP port (default **5001**).
@@ -35,7 +35,7 @@ Verify the server binary after install:
 
 For the same install selector with **Container** / registry choices (Docker Hub or NGC), see :doc:`../install`.
 
-Run the gRPC server (GPU host)
+Run the gRPC Server (GPU Host)
 ==============================
 
 **Bare metal** — after activating the same environment you used to install NVIDIA cuOpt:
@@ -68,7 +68,7 @@ Or invoke the binary explicitly:
 
    The container image defaults to the Python **REST** server when ``CUOPT_SERVER_TYPE`` is unset and you do not override the command; setting ``CUOPT_SERVER_TYPE=grpc`` selects ``cuopt_grpc_server``. Extra environment variables (``CUOPT_SERVER_PORT``, ``CUOPT_GPU_COUNT``, ``CUOPT_GRPC_ARGS``) and TLS are documented in :doc:`Advanced Configuration <advanced>`.
 
-Point the client at the server
+Point the Client at the Server
 ==============================
 
 On the machine where you run Python, the C API, or ``cuopt_cli`` (use ``127.0.0.1`` if the server is on the same host):
@@ -80,7 +80,7 @@ On the machine where you run Python, the C API, or ``cuopt_cli`` (use ``127.0.0.
 
 Optional TLS and tuning variables are in :doc:`advanced`.
 
-Minimal Python example (LP)
+Minimal Python Example (LP)
 ============================
 
 The script is the same for **local** or **remote** solves: with the exports above, the client library forwards to ``cuopt_grpc_server``; without them, the solve runs locally (where a GPU is available).
@@ -111,7 +111,7 @@ You should see an optimal termination. To solve **locally**, unset the remote va
    unset CUOPT_REMOTE_HOST CUOPT_REMOTE_PORT
    python remote_lp_demo.py
 
-Minimal ``cuopt_cli`` example (LP)
+Minimal ``cuopt_cli`` Example (LP)
 ==================================
 
 The same **LP** is available as MPS. With ``CUOPT_REMOTE_HOST`` and ``CUOPT_REMOTE_PORT`` set as above, ``cuopt_cli`` forwards the solve to the remote server; unset them for a **local** run (GPU on that machine).
@@ -147,7 +147,7 @@ More options (time limits, relaxation): :doc:`../cuopt-cli/quick-start` and :doc
 
 More patterns (MPS variants, custom gRPC): :doc:`examples`.
 
-Next steps
+Next Steps
 ==========
 
 * :doc:`../install` — Top-level install selector (all interfaces), including **Container** pulls.

@@ -43,14 +43,14 @@ Python (cuopt)
    Python Overview <cuopt-python/index.rst>
 
 ====================================
-gRPC remote execution
+gRPC Remote Execution
 ====================================
 .. toctree::
-   :maxdepth: 2
-   :caption: gRPC remote execution
-   :name: gRPC remote execution
+   :maxdepth: 4
+   :caption: gRPC Remote Execution
+   :name: gRPC Remote Execution
 
-   gRPC overview <cuopt-grpc/index.rst>
+   gRPC Overview <cuopt-grpc/index.rst>
 
 ===============================
 Server (cuopt-server)

@@ -11,34 +11,34 @@
   },
   {
     "version": "26.04.00",
-    "url": "https://docs.nvidia.com/cuopt/user-guide/26.04.00/"
+    "url": "https://archive.docs.nvidia.com/cuopt/user-guide/26.04.00/"
   },
   {
     "version": "26.02.00",
-    "url": "https://docs.nvidia.com/cuopt/user-guide/26.02.00/"
+    "url": "https://archive.docs.nvidia.com/cuopt/user-guide/26.02.00/"
   },
   {
     "version": "25.12.00",
-    "url": "https://docs.nvidia.com/cuopt/user-guide/25.12.00/"
+    "url": "https://archive.docs.nvidia.com/cuopt/user-guide/25.12.00/"
   },
   {
     "version": "25.10.00",
-    "url": "https://docs.nvidia.com/cuopt/user-guide/25.10.00/"
+    "url": "https://archive.docs.nvidia.com/cuopt/user-guide/25.10.00/"
   },
   {
     "version": "25.08.00",
-    "url": "https://docs.nvidia.com/cuopt/user-guide/25.08.00/"
+    "url": "https://archive.docs.nvidia.com/cuopt/user-guide/25.08.00/"
   },
   {
     "version": "25.05",
-    "url": "https://docs.nvidia.com/cuopt/user-guide/25.05/"
+    "url": "https://archive.docs.nvidia.com/cuopt/user-guide/25.05/"
   },
   {
     "version": "25.02",
-    "url": "https://docs.nvidia.com/cuopt/user-guide/25.02/"
+    "url": "https://archive.docs.nvidia.com/cuopt/user-guide/25.02/"
   },
   {
     "version": "24.11",
-    "url": "https://docs.nvidia.com/cuopt/user-guide/24.11/"
+    "url": "https://archive.docs.nvidia.com/cuopt/user-guide/24.11/"
   }
 ]
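
Each entry in the version list pairs a `version` with a `url` whose last path segment is normally that same version. A check along these lines could flag entries where the two disagree. This is a sketch under assumptions: the function name `mismatched_entries` is hypothetical, and an alias entry such as a "latest" link, if the file has one, would need an explicit allowlist.

```python
import json


def mismatched_entries(versions_json: str) -> list:
    """Return the versions whose URL does not end with '/<version>/'."""
    entries = json.loads(versions_json)
    bad = []
    for entry in entries:
        # Compare the final path segment of the URL with the declared version.
        last_segment = entry["url"].rstrip("/").rsplit("/", 1)[-1]
        if last_segment != entry["version"]:
            bad.append(entry["version"])
    return bad
```

Running it over the JSON file in CI would return an empty list when every URL matches its version.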