# Setup Single Node Using Ansible

These playbooks set up a single-node inference environment on either an Intel® AI Accelerator or Intel® Xeon node using Ansible. They are designed to be run on the Intel® AI Accelerator or Intel® Xeon node where the Intel® AI for Enterprise Inference Service will be deployed. The playbooks install all necessary dependencies, configure the environment, and prepare the system for the Intel® AI for Enterprise Inference Service. If you are going to use an Intel® AI Accelerator, the accelerator drivers and firmware must be installed on the system before running this playbook. For more information, refer to the [Intel® AI Accelerator Drivers Installation Guide](../../intel-ai-accelerator-prerequisites.md).

Many of the defaults are set up to work out of the box, but you will need to update the **`cluster_ip`** and provide the **`hf_token`** for downloading models from Hugging Face.
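
As a sketch, the two required variables might look like this in the playbook's vars section — the variable names come from the text above, but the file layout and the example values are placeholders to replace with your own:

```yaml
# Required variables for the single-node playbooks (example values only).
cluster_ip: "192.168.1.10"       # IP address of this node, reachable by clients
hf_token: "hf_xxxxxxxxxxxx"      # Hugging Face access token (placeholder)
```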

There is also a template directory containing templates for the various configuration files used by the AI Inference Service. These templates are used to generate the final configuration files from the variables defined in the playbook. Do not modify these files directly.

Depending on the deployment type and the size of the models used, the playbook may run for up to 25 minutes; when it finishes, it outputs the results of the installation script. The models become available some time after the playbook completes: with the default model selection for the Intel® AI Accelerator deployment, it can take up to an hour for all four models to be available. If you change the models, the startup time may differ.

| Deployment Type | Playbook File |
|------------------|----------------|
| Intel® AI Accelerator Single Node Playbook | einf-singlenode-intel-ai-accelerator.yml |
| Xeon Single Node Playbook | einf-singlenode-xeon.yml |


2. **Run the Playbook**

   Execute the Intel® AI Accelerator playbook using the following command:

   ```bash
   git clone https://github.com/opea-project/Enterprise-Inference.git
   cd Enterprise-Inference/docs/examples/single-node
   sudo ansible-playbook einf-singlenode-intel-ai-accelerator.yml
   ```
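
   Because the default models can take up to an hour to come online after the playbook finishes, a simple readiness probe saves guesswork. Below is a minimal sketch, assuming the OpenAI-compatible URL layout used by the curl examples later in this document; the base URL and model name passed in are placeholders for your deployment's values:

   ```shell
   # Build the completions URL for a given base URL and model name
   # (mirrors the URL layout used by the curl examples in this guide).
   model_url() {
     printf '%s/%s/v1/completions' "$1" "$2"
   }

   # Poll until the endpoint accepts connections. curl reports status 000
   # when the connection itself fails, so any other code means the service
   # is at least reachable. Interval and retry count are arbitrary choices.
   wait_for_model() {
     local url code
     url=$(model_url "$1" "$2")
     for _ in $(seq 1 60); do
       code=$(curl -ks -o /dev/null -w '%{http_code}' "$url" || true)
       if [ -n "$code" ] && [ "$code" != "000" ]; then
         echo "endpoint responding ($code): $url"
         return 0
       fi
       sleep 60
     done
     echo "timed out waiting for $2" >&2
     return 1
   }
   ```

   Call it as `wait_for_model "https://<your-cluster-ip>" "<model-name>"`, substituting the values from your own deployment.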

   Execute the Xeon playbook using the following command:
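
   By analogy with the accelerator run above, a sketch of the Xeon invocation — assuming the repository has already been cloned in the previous step, and taking the playbook name from the table at the top of this document:

   ```shell
   # Run from Enterprise-Inference/docs/examples/single-node on the target
   # node, using the same checkout as the accelerator playbook above.
   PLAYBOOK=einf-singlenode-xeon.yml   # playbook name from the table above
   echo "sudo ansible-playbook $PLAYBOOK"
   ```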

---

For more information on how to access the models, refer to the [Accessing Deployed Models](/docs/accessing-deployed-models.md) documentation.