An example and explanation of the `inference-config.cfg` file:
```
cluster_url=example.com
cert_file=/path/to/cert/file.pem
key_file=/path/to/key/file.pem
keycloak_client_id=my-client-id
keycloak_admin_user=your-keycloak-admin-user
keycloak_admin_password=changeme
hugging_face_token=your_hugging_face_token
models=
cpu_or_gpu=
deploy_kubernetes_fresh=on
deploy_ingress_controller=on
deploy_keycloak_apisix=on
deploy_genai_gateway=on
deploy_observability=on
deploy_llm_models=on
```
Make sure to update the values in the `inference-config.cfg` file according to your requirements before deploying Intel AI for Enterprise Inference.
- If `deploy_kubernetes_fresh` is set to `on`, a fresh Kubernetes cluster will be initialized as per the deployment configuration.
- If `deploy_ingress_controller` is set to `on`, an ingress controller will be configured to route external traffic to the cluster.
- If `deploy_keycloak_apisix` is set to `on`, Keycloak and APISIX will be deployed for model API authentication.
- If `deploy_genai_gateway` is set to `on`, the GenAI Gateway will be deployed.
- If `deploy_genai_gateway` is set to `on`, you must choose only one authentication method: either enable Keycloak and APISIX (`deploy_keycloak_apisix=on`) for external authentication, or use the GenAI Gateway's built-in authentication by setting `deploy_keycloak_apisix=off`. Do not enable both authentication methods at the same time.
- If `deploy_observability` is set to `on`, the observability stack will be deployed for monitoring.
- If `deploy_llm_models` is set to `on`, the selected models will be deployed for inferencing.
- The `cert_file` and `key_file` paths should be set according to the documented instructions for generating certificates for development and production environments.
- The `models` value corresponds to the pre-validated LLM models listed in the documentation.
- The `keycloak_client_id` is the client ID used for accessing APIs through Keycloak.
- The `keycloak_admin_user` is the administrator username for Keycloak.
- The `keycloak_admin_password` is the administrator password for Keycloak.
- If `deploy_keycloak_apisix` is set to `off`, the `keycloak_client_id`, `keycloak_admin_user`, and `keycloak_admin_password` values have no effect.
- The `hugging_face_token` is the token used for pulling LLM models from Hugging Face.
- If `deploy_llm_models` is set to `off`, the `hugging_face_token` value will be ignored.
- The `cpu_or_gpu` value specifies whether to deploy models for CPU or Intel® AI Accelerator.
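The documented certificate instructions cover both development and production. For a quick development-only setup, a self-signed pair can be generated with `openssl`; the hostname and `/tmp` output paths below are placeholders, so point `cert_file` and `key_file` at wherever you actually write the files:

```shell
# Generate a self-signed certificate/key pair (development only).
# The CN should match your cluster_url value.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -subj "/CN=example.com" \
  -keyout /tmp/key.pem -out /tmp/cert.pem

# Inspect the resulting certificate subject.
openssl x509 -in /tmp/cert.pem -noout -subject
```

For production, use a certificate issued by a trusted CA instead, as described in the certificate documentation.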
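Because the mutually exclusive authentication settings above are easy to get wrong, a quick sanity check is to source the config (its `key=value` lines are valid shell assignments) and test the constraint. This is an illustrative sketch using a sample file under `/tmp`, not part of the official tooling:

```shell
# Write a sample config (in practice, source your real inference-config.cfg).
cat > /tmp/inference-config.cfg <<'EOF'
deploy_genai_gateway=on
deploy_keycloak_apisix=off
EOF

# key=value lines are valid shell assignments, so the file can be sourced.
. /tmp/inference-config.cfg

# The GenAI Gateway and Keycloak/APISIX auth must not both be enabled.
if [ "$deploy_genai_gateway" = "on" ] && [ "$deploy_keycloak_apisix" = "on" ]; then
  echo "error: do not enable both authentication methods" >&2
else
  echo "auth configuration OK"
fi
```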
For running behind a corporate proxy, please refer to this guide.
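The proxy guide is the authoritative reference; as a rough sketch, deployment tooling commonly honors the standard proxy environment variables. The proxy host and port below are placeholders:

```shell
# Standard proxy environment variables; replace host and port with your proxy.
export http_proxy=http://proxy.example.com:912
export https_proxy=http://proxy.example.com:912

# Exclude local addresses and the cluster hostname from proxying.
export no_proxy=localhost,127.0.0.1,example.com
```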