
Commit f8f0652

add CPU support for docker compose files (#38)

* Create Dockerfile.cpu
* Update README.md
* Create docker-compose-cpu.yaml
* Rename docker-compose.yml to docker-compose-gpu.yml

Contribution by https://github.com/alexjyong

Parent: a9199a5

4 files changed: 112 additions and 1 deletion


Dockerfile.cpu (new file, 47 additions)

```dockerfile
FROM ubuntu:22.04

# Set non-interactive frontend
ENV DEBIAN_FRONTEND=noninteractive

# Install Python and other dependencies
RUN apt-get update && apt-get install -y \
    python3.10 \
    python3-pip \
    python3-venv \
    libsndfile1 \
    ffmpeg \
    portaudio19-dev \
    && apt-get clean && rm -rf /var/lib/apt/lists/*

# Create non-root user and set up directories
RUN useradd -m -u 1001 appuser && \
    mkdir -p /app/outputs /app && \
    chown -R appuser:appuser /app

USER appuser
WORKDIR /app

# Copy dependency files
COPY --chown=appuser:appuser requirements.txt ./requirements.txt

# Create and activate virtual environment
RUN python3 -m venv /app/venv
ENV PATH="/app/venv/bin:$PATH"

# Install CPU-only PyTorch and other dependencies
RUN pip3 install --no-cache-dir torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu && \
    pip3 install --no-cache-dir -r requirements.txt

# Copy project files
COPY --chown=appuser:appuser . .

# Set environment variables
ENV PYTHONUNBUFFERED=1 \
    PYTHONPATH=/app \
    USE_GPU=false

# Expose the port
EXPOSE 5005

# Run FastAPI server with uvicorn
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "5005", "--workers", "1"]
```
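The `USE_GPU=false` variable baked into this image is an environment hint for the application. As a hedged sketch of how an app might honour such a flag (the `select_device` helper is hypothetical, not taken from the repository):

```python
import os

def select_device() -> str:
    """Pick a torch device string based on the USE_GPU env var.

    Hypothetical helper illustrating the USE_GPU=false flag set in
    Dockerfile.cpu; the real application code may differ.
    """
    if os.environ.get("USE_GPU", "false").lower() != "true":
        return "cpu"
    try:
        import torch  # the CPU-only wheel is installed in this image
        return "cuda" if torch.cuda.is_available() else "cpu"
    except ImportError:
        return "cpu"

os.environ["USE_GPU"] = "false"
print(select_device())  # → cpu
```

Falling back to `"cpu"` on `ImportError` keeps the helper safe even when no torch wheel is present at all.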

README.md (9 additions, 1 deletion)

````diff
@@ -106,6 +106,7 @@ Orpheus-FastAPI/
 ### 🐳 Docker compose
 
 The docker compose file orchestrates the Orpheus-FastAPI for audio and a llama.cpp inference server for the base model token generation. The GGUF model is downloaded with the model-init service.
+There are two versions: `docker-compose-gpu.yml` for machines with GPU support, and `docker-compose-cpu.yaml` for CPU-only machines.
 
 ```bash
 cp .env.example .env # Create your .env file from the example
@@ -119,8 +120,15 @@ ORPHEUS_MODEL_NAME=Orpheus-3b-French-FT-Q8_0.gguf # Example for French
 ```
 
 Then start the services:
+
+For GPU support run:
+```bash
+docker compose -f docker-compose-gpu.yml up
+```
+
+For CPU support run:
 ```bash
-docker compose up --build
+docker compose -f docker-compose-cpu.yaml up
 ```
 
 The system will automatically download the specified model from Hugging Face before starting the service.
````
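Both compose files read their settings from `.env`. A minimal config sketch of the variables the compose files reference (the values shown are illustrative assumptions, not repository defaults):

```bash
# Illustrative .env fragment; adjust values for your hardware.
ORPHEUS_MODEL_NAME=Orpheus-3b-French-FT-Q8_0.gguf  # GGUF fetched by model-init
ORPHEUS_MAX_TOKENS=8192   # passed to llama.cpp as --ctx-size and --n-predict
LLAMA_CPU_THREADS=6       # CPU threads for llama-cpp-server (defaults to 6)
# UID and GID are read by the model-init service so downloaded files are
# owned by your host user; export them from your shell if unset.
```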

docker-compose-cpu.yaml (new file, 56 additions)

```yaml
services:
  orpheus-fastapi:
    container_name: orpheus-fastapi
    build:
      context: .
      dockerfile: Dockerfile.cpu
    ports:
      - "5005:5005"
    env_file:
      - .env
    environment:
      - ORPHEUS_API_URL=http://llama-cpp-server:5006/v1/completions
    restart: unless-stopped
    depends_on:
      llama-cpp-server:
        condition: service_started

  llama-cpp-server:
    image: ghcr.io/ggml-org/llama.cpp:server
    ports:
      - "5006:5006"
    volumes:
      - ./models:/models
    env_file:
      - .env
    depends_on:
      model-init:
        condition: service_completed_successfully
    restart: unless-stopped
    command: >
      -m /models/${ORPHEUS_MODEL_NAME}
      --host 0.0.0.0
      --port 5006
      --ctx-size ${ORPHEUS_MAX_TOKENS}
      --n-predict ${ORPHEUS_MAX_TOKENS}
      --threads ${LLAMA_CPU_THREADS:-6}
      --threads-batch ${LLAMA_CPU_THREADS:-6}
      --rope-scaling linear
      --no-mmap
      --no-slots
      --no-webui

  model-init:
    image: curlimages/curl:latest
    user: ${UID}:${GID}
    volumes:
      - ./models:/app/models
    working_dir: /app
    command: >
      sh -c '
      if [ ! -f /app/models/${ORPHEUS_MODEL_NAME} ]; then
        echo "Downloading model file..."
        wget -P /app/models https://huggingface.co/lex-au/${ORPHEUS_MODEL_NAME}/resolve/main/${ORPHEUS_MODEL_NAME}
      else
        echo "Model file already exists"
      fi'
    restart: "no"
```
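The model-init service downloads the GGUF only when it is not already present in `./models`, so restarts skip the fetch. A minimal Python sketch of that idempotency check (the `ensure_model` helper and `fetch` callback are illustrative stand-ins for the shell script and wget, not repository code):

```python
import os
import tempfile

def ensure_model(models_dir: str, name: str, fetch) -> str:
    """Fetch `name` into `models_dir` only if it is missing.

    Mirrors the file-existence check in the model-init service; `fetch`
    is a hypothetical downloader callback standing in for wget.
    """
    path = os.path.join(models_dir, name)
    if not os.path.exists(path):
        fetch(path)
        return "downloaded"
    return "already exists"

with tempfile.TemporaryDirectory() as models:
    def fake_fetch(path):
        with open(path, "w") as f:
            f.write("gguf")
    print(ensure_model(models, "model.gguf", fake_fetch))  # → downloaded
    print(ensure_model(models, "model.gguf", fake_fetch))  # → already exists
```

The second call returns without invoking the downloader, which is exactly why `restart: "no"` is safe for the init container.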
