Commit 7ef7125

mdfaheem-intel, pvishwan, AhmedSeemalK, vhpintel, sgurunat
authored
Updated the documentation with Intel AI Accelerator instead of Gaudi since stack supports all Intel accelerators (opea-project#79)
* Update prerequisites.md
* Update and rename gaudi-prerequisites.md to intel-ai-accelerator-prerequisites.md
* Rename Enterprise-Inference-Gaudi-Driver-version.png to Enterprise-Inference-Intel-AI-Accelerator-Driver-version.png
* Rename Enterprise-Inference-Gaudi-Firmware-version.png to Enterprise-Inference-Intel-AI-Accelerator-Firmware-version.png
* Update and rename einf-singlenode-gaudi.yml to einf-singlenode-intel-ai-accelerator.yml
* Update README.md
* Update single-node-deployment.md
* Update cpu-optimization-guide.md
* Update deploy-llm-model-from-hugging-face.md
* Update inventory-design-guide.md
* Update configuring-inference-config-cfg-file.md
* Update multi-node-deployment.md
* Update einf-singlenode-xeon.yml
* Rename Enterprise-Inference-Gaudi-Utilization-Cluster-Observability.png to Enterprise-Inference-Intel-AI-Accelerator-Utilization-Cluster-Observability.png
* Rename Enterprise-Inference-Gaudi-Observability.png to Enterprise-Inference-Intel-AI-Accelerator-Observability.png
* Rename Enterprise-Inference-Gaudi-Habana-version.png to Enterprise-Inference-Intel-AI-Accelerator-Habana-version.png
* Update AI Accelerator documentation

Signed-off-by: psurabh <pradeep.surabhi@intel.com>
Signed-off-by: amberjain1 <amber.jain@intel.com>
Signed-off-by: mdfaheem-intel <mohammad.faheem@intel.com>
Signed-off-by: vivekrsintc <vivek.rs@intel.com>
Signed-off-by: Github Actions <actions@github.com>
Co-authored-by: pvishwan <pramodh.vishwanath@intel.com>
Co-authored-by: AhmedSeemalK <ahmed.seemal@intel.com>
Co-authored-by: vhpintel <vijay.kumar.h.p@intel.com>
Co-authored-by: sgurunat <gurunath.s@intel.com>
Co-authored-by: jaswanth8888 <jaswanth.karani@intel.com>
Co-authored-by: sandeshk-intel <sandesh.kumar.s@intel.com>
Co-authored-by: vinayK34 <vinay3.kumar@intel.com>
1 parent 5df2a50 commit 7ef7125

17 files changed

Lines changed: 68 additions & 67 deletions

docs/README.md

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 # Quick Start
 To set up prerequisites and quickly deploy Intel® AI for Enterprise Inference on a single node, follow the steps in the [**Single Node Deployment Guide**](./single-node-deployment.md). Otherwise, proceed to the section below for all deployment options.

-> 🚀 **New**: Automated Gaudi firmware and driver management! See [Gaudi Prerequisites](./gaudi-prerequisites.md) for automated setup scripts.
+> 🚀 **New**: Automated Intel® AI Accelerator firmware and driver management! See [Intel® AI Accelerator Prerequisites](./intel-ai-accelerator-prerequisites.md) for automated setup scripts.


 # Complete Intel® AI for Enterprise Inference Cluster Setup

docs/configuring-inference-config-cfg-file.md

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@ Make sure to update the values in the inference-config.cfg file according to you
 > - If `deploy_keycloak_apisix` is set to `off`, the `keycloak_client_id`, `keycloak_admin_user`, and `keycloak_admin_password` values will have no effect.
 > - The `hugging_face_token` is the token used for pulling LLM models from Hugging Face.
 > - If `deploy_llm_models` is set to `off`, the `hugging_face_token` value will be ignored.
-> - The `cpu_or_gpu` value specifies whether to deploy models for CPU or Intel Gaudi.
+> - The `cpu_or_gpu` value specifies whether to deploy models for CPU or Intel® AI Accelerator.
 >

 For running behind a corporate proxy, refer to this [guide](./running-behind-proxy.md)
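To make the settings above concrete, here is a hypothetical fragment of an inference-config.cfg file. The key names come from the notes above; the key=value layout and the placeholder values are assumptions for illustration, not defaults taken from the repository:

```
# Hypothetical inference-config.cfg fragment -- placeholder values only
deploy_keycloak_apisix=on
keycloak_client_id=api
deploy_llm_models=on
hugging_face_token=YourHuggingFaceToken
cpu_or_gpu=gpu    # "cpu" for Xeon, "gpu" for Intel® AI Accelerator
```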

docs/cpu-optimization-guide.md

Lines changed: 2 additions & 2 deletions
@@ -49,7 +49,7 @@ resources:

 For single-node Xeon clusters, **Keycloak** and **APISIX** are recommended.

-For Gaudi or large multi-node Xeon clusters, the GenAI Gateway is well-suited.
+For Intel® AI Accelerator or large multi-node Xeon clusters, the GenAI Gateway is well-suited.

 ## Status Verification

@@ -74,4 +74,4 @@ If models aren't performing optimally:
 CPU optimization runs automatically and provides:
 - Dedicated CPU cores for each model
 - Consistent performance
-- Optimal resource utilization
+- Optimal resource utilization

docs/deploy-llm-model-from-hugging-face.md

Lines changed: 1 addition & 1 deletion
@@ -17,6 +17,6 @@ This option allows you to deploy any Hugging Face-hosted LLM on the Inference Cl
 3. When prompted, provide:
 - **Hugging Face Model ID** (e.g., `meta-llama/Meta-Llama-3-8B`)
 - **Model Deployment Name** (e.g., `metallama-8b`)
-- **Tensor Parallel Size** (based on available Gaudi cards)
+- **Tensor Parallel Size** (based on available Intel® AI Accelerator cards)

 > **Note**: This deploys a model that has **not** been pre-validated. Make sure the tensor parallel size is configured correctly. An incorrect value can result in the model being stuck in a "not ready" state.
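The warning above about tensor parallel size can be turned into a small pre-flight check. A minimal sketch, assuming the common vLLM-style constraints (a positive power of two, no larger than the number of accelerator cards on the node); these rules are an assumption here, not taken from this repository:

```python
def valid_tensor_parallel_size(tp_size, available_cards):
    """Sanity-check a tensor parallel size before deployment.

    Assumed constraints (not from this repo): the size must be a
    positive power of two and must not exceed the number of
    accelerator cards visible on the node.
    """
    if tp_size < 1 or tp_size > available_cards:
        return False
    # power-of-two check: a power of two has exactly one bit set
    return tp_size & (tp_size - 1) == 0

# Example with 8 cards on the node:
print(valid_tensor_parallel_size(8, 8))   # True
print(valid_tensor_parallel_size(3, 8))   # False: not a power of two
print(valid_tensor_parallel_size(16, 8))  # False: more than available cards
```

A check like this, run before the deployment script submits the model, would catch the "stuck in not ready" failure mode the note describes.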

docs/examples/single-node/README.md

Lines changed: 7 additions & 6 deletions
@@ -1,16 +1,16 @@
 # Setup Single Node Using Ansible

-These playbooks sets up a single node inference environment on either a Intel® Gaudi or Intel® Xeon node using Ansible. It is designed to be run on the Intel® Gaudi or Intel® Xeon node where the Intel® AI for Enterprise Inference Service will be deployed. The playbooks installs all necessary dependencies, configures the environment, and prepares the system for the Intel® AI for Enterprise Inference Service. If you are going to use Intel® Gaudi, you will need to have the Gaudi drivers and firmware installed on the system before running this playbook, for more information on installing the Gaudi drivers and firmware, refer to the [Gaudi Drivers Installation Guide](https://github.com/opea-project/Enterprise-Inference/blob/main/core/catalog/docs/gaudi/gaudi-prerequisites.md).
+These playbooks set up a single-node inference environment on either an Intel® AI Accelerator or Intel® Xeon node using Ansible. They are designed to be run on the Intel® AI Accelerator or Intel® Xeon node where the Intel® AI for Enterprise Inference Service will be deployed. The playbooks install all necessary dependencies, configure the environment, and prepare the system for the Intel® AI for Enterprise Inference Service. If you are going to use an Intel® AI Accelerator, its drivers and firmware must be installed on the system before running this playbook; for more information, refer to the [Intel® AI Accelerator Drivers Installation Guide](../../intel-ai-accelerator-prerequisites.md).

 Many of the defaults are set up to work out of the box, but you will need to update the **`cluster_ip`** and provide the **`hf_token`** for downloading models from Hugging Face.

 There is also a template directory that contains a set of templates for the various configuration files that are used by the AI Inference Service. These templates are used to generate the final configuration files based on the variables defined in the playbook. Do not modify these files directly.

-Depending on the deployment type or the size of the models used, the playbook may run up to 25 minutes, at the end of the playbook running it will output the results of the installation script. The models will be available sometime after the playbook is done, the models selected by default for the Intel® Gaudi deployment can take up to an hour for all four of them to be available. If you change the models that will be used, the start up time may be different.
+Depending on the deployment type and the size of the models used, the playbook may run for up to 25 minutes; when it finishes, it outputs the results of the installation script. The models become available some time after the playbook completes; the four models selected by default for the Intel® AI Accelerator deployment can take up to an hour to all be available. If you change the models that will be used, the startup time may differ.

 | Deployment Type | Playbook File |
 |------------------|----------------|
-| Gaudi Single Node Playbook | einf-singlenode-gaudi.yml |
+| Intel® AI Accelerator Single Node Playbook | einf-singlenode-intel-ai-accelerator.yml |
 | Xeon Single Node Playbook | einf-singlenode-xeon.yml |


@@ -66,12 +66,13 @@ These settings are all set to `on` by default in the playbook, change these vari

 2. **Run the Playbook**

-   Execute the Gaudi playbook using the following command:
+   Execute the Intel® AI Accelerator playbook using the following command:

 ```bash
 git clone https://github.com/opea-project/Enterprise-Inference.git
 cd Enterprise-Inference/docs/examples/single-node
-sudo ansible-playbook einf-singlenode-gaudi.yml
+sudo ansible-playbook einf-singlenode-intel-ai-accelerator.yml
+
 ```

 Execute the Xeon playbook using the following command:
@@ -154,4 +155,4 @@ curl -k ${BASE_URL}/Meta-Llama-3.1-70B-Instruct/v1/completions -X POST -d '{"mod

 ---

-For more information on how to access the models, refer to the [Accessing Deployed Models](/docs/accessing-deployed-models.md) documentation.
+For more information on how to access the models, refer to the [Accessing Deployed Models](/docs/accessing-deployed-models.md) documentation.
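The curl command in the last hunk above (truncated in the diff) targets an OpenAI-style `/v1/completions` route. A minimal sketch of how such a request body could be built; the field names follow the OpenAI completions convention, which is an assumption here since the original payload is cut off:

```python
import json

def completions_payload(model, prompt, max_tokens=32):
    # OpenAI-style completions body. Field names ("model", "prompt",
    # "max_tokens") are assumed, since the original curl payload is
    # truncated in the diff above.
    return json.dumps({"model": model, "prompt": prompt, "max_tokens": max_tokens})

body = completions_payload("Meta-Llama-3.1-70B-Instruct", "Hello")
print(body)
```

The resulting string would take the place of the `-d '{"mod…'` argument in the curl invocation.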

docs/examples/single-node/einf-singlenode-gaudi.yml renamed to docs/examples/single-node/einf-singlenode-intel-ai-accelerator.yml

Lines changed: 3 additions & 3 deletions
@@ -1,7 +1,7 @@
 # Copyright (C) 2025-2026 Intel Corporation
 # SPDX-License-Identifier: Apache-2.0

-# Ansible Playbook to install and configure the Enterprise Inference Service on a Single Gaudi node running Ubuntu 22.04+
+# Ansible Playbook to install and configure the Enterprise Inference Service on a single Intel® AI Accelerator node running Ubuntu 22.04+
 # Needs to run as root or with sudo privileges
 # Installs version:
 ---
@@ -11,7 +11,7 @@
   gather_facts: true
   vars:
     cluster_url: "api.example.com" # Cluster name, change if you want to use a different DNS name for the service
-    cluster_ip: "127.0.0.1" # Cluster IP, this should be the IP of the Gaudi node that will be used to access the service
+    cluster_ip: "127.0.0.1" # Cluster IP, this should be the IP of the node that will be used to access the service
     ai_user: "ai-inference" # Enterprise Inference Service OS user, change if you want to use a different user
     ssh_key_file: "/home/{{ ai_user }}/.ssh/id_rsa" # Path to your private key, this playbook will create this
     keycloak_client_id: "api" # Keycloak client ID
@@ -20,7 +20,7 @@
     hf_token: "YourHuggingFaceToken" # Hugging Face token for all models, you need to supply your Hugging Face token to download models
     hf_token_falcon3: "YourHuggingFaceToken" # Hugging Face token for Falcon 3, can be the same as hf_token
     models: "2,5,8,9" # Comma-separated list of model IDs, see repo
-    cpu_or_gpu: "gpu" # "cpu" or "gpu", set to "gpu" for Gaudi nodes
+    cpu_or_gpu: "gpu" # "cpu" or "gpu", set to "gpu" for Intel® AI Accelerator nodes
     deploy_kubernetes_fresh: "on"
     deploy_ingress_controller: "on"
     deploy_keycloak_apisix: "on"
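Rather than editing the `vars:` block in place, the same values can normally be overridden on the command line with Ansible's standard `-e`/`--extra-vars` flag. A sketch that only assembles and prints the command (variable names come from the playbook above; the IP and token values are placeholders):

```shell
# Placeholder values -- replace with your node's real IP and HF token.
CLUSTER_IP="192.0.2.10"
HF_TOKEN="hf_placeholder_token"

# Assemble the invocation; -e overrides the playbook's vars: defaults.
CMD="sudo ansible-playbook einf-singlenode-intel-ai-accelerator.yml \
 -e cluster_ip=${CLUSTER_IP} -e hf_token=${HF_TOKEN} -e cpu_or_gpu=gpu"

echo "$CMD"
```

This keeps the playbook file untouched, which is convenient when the same playbook is reused across nodes with different IPs and tokens.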

docs/examples/single-node/einf-singlenode-xeon.yml

Lines changed: 2 additions & 2 deletions
@@ -11,7 +11,7 @@
   gather_facts: true
   vars:
     cluster_url: "api.example.com" # Cluster name, change if you want to use a different DNS name for the service
-    cluster_ip: "127.0.0.1" # Cluster IP, this should be the IP of the Gaudi node that will be used to access the service
+    cluster_ip: "127.0.0.1" # Cluster IP, this should be the IP of the Intel® AI Accelerator node that will be used to access the service
     ai_user: "ai-inference" # Enterprise Inference Service OS user, change if you want to use a different user
     ssh_key_file: "/home/{{ ai_user }}/.ssh/id_rsa" # Path to your private key, this playbook will create this
     keycloak_client_id: "api" # Keycloak client ID
@@ -20,7 +20,7 @@
     hf_token: "YourHuggingFaceToken" # Hugging Face token for all models, you need to supply your Hugging Face token to download models
     hf_token_falcon3: "YourHuggingFaceToken" # Hugging Face token for Falcon 3, can be the same as hf_token
     models: "21" # Comma-separated list of model IDs, see repo
-    cpu_or_gpu: "cpu" # "cpu" or "gpu", set to "gpu" for Gaudi nodes
+    cpu_or_gpu: "cpu" # "cpu" or "gpu", set to "gpu" for Intel® AI Accelerator nodes
     deploy_kubernetes_fresh: "on"
     deploy_ingress_controller: "on"
     deploy_keycloak_apisix: "on"
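Both playbooks take `models` as a comma-separated string of model IDs (`"2,5,8,9"` for the accelerator playbook, `"21"` for Xeon). A tiny helper to parse that variable into a list of integers might look like this; it is a sketch, and which IDs are valid is defined by the repository's model catalog, which is not assumed here:

```python
def parse_model_ids(models):
    """Parse the comma-separated `models` playbook variable into ints,
    skipping empty entries and tolerating stray whitespace."""
    return [int(part) for part in models.split(",") if part.strip()]

print(parse_model_ids("2,5,8,9"))  # [2, 5, 8, 9]
print(parse_model_ids("21"))       # [21]
```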

docs/gaudi-prerequisites.md renamed to docs/intel-ai-accelerator-prerequisites.md

Lines changed: 7 additions & 7 deletions
@@ -1,9 +1,9 @@
-# Gaudi Node Requirements and Setup Guide
+# Intel® AI Accelerator Node Requirements and Setup Guide

-This guide helps verify and automatically install the latest firmware and driver version for **Habana Gaudi** nodes in your Kubernetes or Standalone Environment.
+This guide helps you verify and automatically install the latest firmware and driver versions for **Intel® AI Accelerator** nodes in your Kubernetes or standalone environment.

 # What You Need
-- Intel® Gaudi® cards installed in your system
+- Intel® AI Accelerator cards installed in your system
 - Linux operating system
 - Internet connection
 - Root/sudo privileges
@@ -33,11 +33,11 @@ Firmware [SPI] Version : Preboot version hl-gaudi2-1.20.0-fw-58.0.0-sec-9 (Jan 1
 ```
 ###### For visual assistance, refer to the following snapshot for the firmware version:

-<img src="../docs/pictures/Enterprise-Inference-Gaudi-Firmware-version.png" alt="AI Inference Firmware Snapshot" width="800" height="120"/>
+<img src="../docs/pictures/Enterprise-Inference-Intel-AI-Accelerator-Firmware-version.png" alt="AI Inference Firmware Snapshot" width="800" height="120"/>


 #### Step 2: Check Driver Version
-Use the following commands to check the required driver version installed on your Gaudi nodes:
+Use the following commands to check the required driver version installed on your Intel® AI Accelerator nodes:

 ```bash
 hl-smi
@@ -52,7 +52,7 @@ You'll see something like:
 ```
 ###### For visual assistance, refer to the following snapshot for the driver version:

-<img src="../docs/pictures/Enterprise-Inference-Gaudi-Driver-version.png" alt="AI Inference Driver Snapshot" width="800" height="120"/>
+<img src="../docs/pictures/Enterprise-Inference-Intel-AI-Accelerator-Driver-version.png" alt="AI Inference Driver Snapshot" width="800" height="120"/>

 #### Step 3: Check Runtime Version

@@ -126,7 +126,7 @@ If the numbers don't match, run:
 ```bash
 kubectl rollout restart ds habana-ai-device-plugin-ds -n habana-ai-operator
 ```
-> **For detailed documentation, refer to the official guide:** [Intel® Gaudi® Software Installation Documentation](https://docs.habana.ai/en/latest/Installation_Guide/Driver_Installation.html)
+> **For detailed documentation, refer to the official guide:** [Intel® AI Accelerator Software Installation Documentation](https://docs.habana.ai/en/latest/Installation_Guide/Driver_Installation.html)
 >
 > **For automation script details:** See [Firmware Update Script Documentation](../core/scripts/README.md)
 >
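The `hl-smi` version checks described in this guide can also be scripted. A minimal sketch that extracts the driver version from captured `hl-smi` output; the sample text below is made up for illustration, and real output will differ:

```python
import re

# Made-up excerpt of hl-smi output, for illustration only.
SAMPLE = """\
+-----------------------------------------------------------------------------+
| HL-SMI Version:       hl-1.20.0-fw-58.0.0                                   |
| Driver Version:       1.20.0-bd87f71                                        |
+-----------------------------------------------------------------------------+
"""

def driver_version(text):
    """Return the value of the 'Driver Version' field, or None if absent."""
    m = re.search(r"Driver Version:\s*([\w.\-]+)", text)
    return m.group(1) if m else None

print(driver_version(SAMPLE))  # 1.20.0-bd87f71
```

In practice the input would come from `subprocess.run(["hl-smi"], capture_output=True)` on a node with the driver installed; parsing a captured string keeps the sketch self-contained.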
