
Reachy Mini — Ollama Chat + Emotion/Dance Demo

"Don't have physical hardware? You can still create your own virtual robot on your desk. This represents a straightforward sim-to-real practice leveraging MuJoCo and AI tools like Faster Whisper, Ollama, and eSpeak/Edge-TTS. While Edge-TTS relies on cloud APIs, eSpeak enables fully offline operation. I developed this on the AMD Strix Halo platform and tested it on an AMD Radeon GPU with Ubuntu. Although untested on other systems, the architecture should facilitate easy porting to macOS and Windows."

Demo

Short summary

  • This repository contains demo apps and controllers for the Reachy Mini simulator and small robot, focused on emotion-driven and dance actions triggered from language model outputs (Ollama). It includes several experimental versions (emo_v1–emo_v8) that explore recorded-move playback, streaming-triggered motions, and TTS integration.

What you'll find

  • emo_v1.py — Baseline high-intensity emotion controller and examples.
  • emo_v2.py — RecordedMoves categorization and selection.
  • emo_v3.py — Streaming LM responses triggering actions early.
  • emo_v4.py — Offline-focused TTS (eSpeak) with lip-sync hooks.
  • emo_v5.py — Edge-TTS integration with WAV save/read/play flow (multi-language support).
  • emo_v6.py — Continuous synchronized actions with cartoon voices and multi-modal expressions.
  • emo_v7.py — ASR → LLM → TTS demo (see docs/EMO_V7_README.md).
  • emo_v8.py — Offline Piper-TTS version (ASR/text chat + Ollama + Piper).

See ./docs for details on each version.
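
All versions share the same basic pattern: scan the language model's reply for an emotion cue, then map that cue onto a recorded move. A hypothetical sketch of that pattern (the keyword table and move names below are illustrative, not the repo's actual API):

import re

# Illustrative mapping from emotion keywords to recorded-move names.
EMOTION_TO_MOVE = {
    "happy": "dance_spin",
    "sad": "slow_nod",
    "surprised": "quick_tilt",
}

def pick_move(llm_reply: str) -> str:
    """Return the first move whose emotion keyword appears in the reply."""
    for emotion, move in EMOTION_TO_MOVE.items():
        if re.search(rf"\b{emotion}\b", llm_reply, re.IGNORECASE):
            return move
    return "idle"  # fall back to a neutral move

print(pick_move("I'm so happy to see you!"))  # -> dance_spin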

Installation prerequisites (Linux / Debian-family)

This project was developed on an AMD Ryzen™ AI Max+ 395 running Ubuntu 24.04. I recommend this hardware for deployment, as it pairs excellently with the Reachy Mini Desktop Robot. Its integrated GPU and CPU deliver the performance needed to run the full pipeline entirely offline.

Follow the AMD ROCm documentation to install the Ryzen software for Linux with ROCm.

Then set up the environment for this application.

  1. System packages
sudo apt update
sudo apt install -y python3 python3-venv python3-pip curl espeak ffmpeg libsndfile1 portaudio19-dev
sudo apt install -y libcairo2-dev libgirepository1.0-dev
sudo apt install -y \
    python3-gi \
    gir1.2-gst-plugins-base-1.0 \
    libgstreamer1.0-0 \
    gstreamer1.0-tools \
    gstreamer1.0-plugins-base \
    gstreamer1.0-plugins-good \
    gstreamer1.0-plugins-bad \
    gstreamer1.0-libav

Notes:

  • espeak (eSpeak) is required for the offline TTS flow used by emo_v4.py.
  • libsndfile1 and portaudio are required for soundfile and sounddevice (used when playing WAVs).
  • ffmpeg is optional but useful if you need to convert audio formats or debug audio files.
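
Once the Python packages from the next step are installed, a quick tone test confirms that PortAudio and the sounddevice binding are wired up correctly (a minimal sketch; it assumes numpy and sounddevice come in via requirements.txt):

import numpy as np
import sounddevice as sd

# Play a 0.5 s, 440 Hz test tone; if it is audible and error-free,
# PortAudio and the sounddevice binding are working.
sr = 44100
t = np.linspace(0, 0.5, int(sr * 0.5), endpoint=False)
sd.play(0.2 * np.sin(2 * np.pi * 440 * t), sr)
sd.wait()
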
  2. Python environment
git clone https://github.com/alexhegit/ReachyMiniChat.git
cd ReachyMiniChat
python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
  3. Reachy Mini Python SDK
  • Install the reachy-mini SDK with MuJoCo support for simulation:
pip install "reachy-mini[mujoco]"
  4. Ollama
# Linux
curl -fsSL https://ollama.com/install.sh | sh
ollama pull qwen3:0.6b
ollama serve
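
Once the server is running, the demos talk to Ollama over its local HTTP API. The sketch below shows a minimal streaming request against that API (the general pattern these scripts build on, not their exact code; it assumes the requests package is available):

import json

import requests

# Stream tokens from the local Ollama server (default port 11434).
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "qwen3:0.6b", "prompt": "Say hi in five words."},
    stream=True,
    timeout=60,
)
for line in resp.iter_lines():
    if not line:
        continue
    chunk = json.loads(line)
    print(chunk.get("response", ""), end="", flush=True)
    if chunk.get("done"):
        break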

Run it

  1. Start the Reachy Mini simulation in terminal 1:

Use export PYGLFW_LIBRARY_VARIANT=x11 if the GUI fails to launch on Wayland, the default display backend on Ubuntu 24.04+.

export PYGLFW_LIBRARY_VARIANT=x11
reachy-mini-daemon --sim

If you have a real Reachy Mini connected, you can run it with:

sudo chmod 666 /dev/ttyACM0 # set the permission; change ttyACM* to match the actual port on your system
reachy-mini-daemon
  2. Quick test commands (terminal 2)
python emo_v1.py --chat

The actions dataset used by the Reachy Mini SDK requires a Hugging Face login to download.

export HF_TOKEN=<your token>
# Set HF_HOME to control where models are cached
export HF_HOME=${HOME}/huggingface_cache
mkdir -p ${HF_HOME}

Then try another test:

python emo_v2.py --test-moves

On the first run it downloads pollen-robotics/reachy-mini-dances-library, then plays the 19 recorded moves one by one in the MuJoCo sim GUI or on the real Reachy Mini robot.
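
If you prefer to pre-fetch the dataset, a short huggingface_hub call should do it (a sketch; it assumes the library is published as a Hugging Face dataset repo, and HF_TOKEN/HF_HOME from the environment are honored automatically):

from huggingface_hub import snapshot_download

# Downloads into the HF_HOME cache and prints the local path.
path = snapshot_download(
    repo_id="pollen-robotics/reachy-mini-dances-library",
    repo_type="dataset",  # assumption: published as a dataset repo
)
print(path)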

Project notes and troubleshooting

  • All emo_v*.py scripts and utils/*.py tools support --help even when optional dependencies are missing, thanks to lazy imports and runtime dependency checks.
  • If you hear noisy or distorted audio, ensure soundfile and sounddevice are installed in the active venv, and that the system libsndfile and PortAudio development packages are present. If you see repeated "Audio system is not initialized." warnings, try setting an explicit output device (see below) or ensure PulseAudio/PipeWire is running.
  • emo_v5.py writes Edge-TTS output to WAV and plays it back using the file's sample rate to avoid playback artifacts. If Edge-TTS reports "No audio was received," try running python emo_v5.py --test-tts and/or use a Chinese default voice for CJK text.
  • emo_v4.py uses espeak --stdout as the primary offline TTS backend; ensure eSpeak is installed.
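
For reference, the core of that offline flow can be reproduced in a few lines (a minimal sketch, not the repo's exact implementation; it assumes soundfile and sounddevice are installed):

import io
import subprocess

import sounddevice as sd
import soundfile as sf

def speak_offline(text: str) -> None:
    """Synthesize text with eSpeak and play the resulting WAV."""
    # espeak --stdout writes a WAV stream to stdout
    wav_bytes = subprocess.run(
        ["espeak", "--stdout", text], capture_output=True, check=True
    ).stdout
    data, sr = sf.read(io.BytesIO(wav_bytes))
    sd.play(data, sr)
    sd.wait()

speak_offline("Hello from Reachy Mini")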

Setting sounddevice default device (optional)

If your system reports PortAudio/device warnings, you can list devices and set a default device index:

python - <<'PY'
import sounddevice as sd
print(sd.query_devices())
# Then in Python or code: sd.default.device = <index>
PY

emo_v7 (ASR → LLM → TTS)

  • emo_v7.py adds a microphone-first pipeline using faster-whisper (CPU) for ASR, then forwards the transcription to Ollama and uses the existing emotion controller + Edge-TTS for speech and actions.
  • See EMO_V7_README.md for usage, requirements, and notes about model choices and VAD improvements.
  • New CLI flag: --gentle — enables gentle_mode which restricts selected recorded moves to a curated gentle set and adjusts motion durations for subtler actions. Example:
python emo_v7.py --asr --gentle
# VAD ASR mode (auto-stop on silence)
python emo_v7_vad.py --asr

# Text chat mode
python emo_v7_vad.py --chat
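
The ASR front end is built on faster-whisper; its essentials look roughly like the sketch below (model size, compute type, and file name are illustrative; see EMO_V7_README.md for the actual settings):

from faster_whisper import WhisperModel

# CPU-only ASR, as in emo_v7.
model = WhisperModel("small", device="cpu", compute_type="int8")
segments, info = model.transcribe("question.wav")
text = " ".join(seg.text.strip() for seg in segments)
print(f"[{info.language}] {text}")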

emo_v8 (Offline Piper-TTS)

  • emo_v8.py replaces Edge-TTS with Piper-TTS for fully offline speech synthesis, while keeping Ollama chat and emotion/action flow.
  • The new dependency is already included in requirements.txt:
    • piper-tts>=1.4.0
  • emo_v8.py also supports --gentle (same behavior as emo_v7/emo_v6) and accepts --piper-model and --piper-config to point to local voice models. Example:
python emo_v8.py --model qwen3:0.6b --piper-model models/zh_CN-huayan-medium.onnx --gentle

Piper voice model download

  • Download .onnx and matching .onnx.json voice files from:
    • Piper release page: https://github.com/rhasspy/piper/releases/tag/v0.0.2
  • Place files under models/ (or any path you pass to --piper-model).
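
To sanity-check a downloaded voice before wiring it into emo_v8.py, you can drive the piper CLI that piper-tts installs (a sketch; the model path matches the usage examples below, and piper reads the input text from stdin):

import subprocess

# Synthesize one utterance to hello.wav; piper reads text on stdin.
subprocess.run(
    ["piper", "--model", "models/en-us-blizzard_lessac-medium.onnx",
     "--output_file", "hello.wav"],
    input="Hello from Reachy Mini".encode("utf-8"),
    check=True,
)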

Usage examples (the default Ollama model is qwen3:0.6b)

# Text chat mode + English (default)
python emo_v8.py --piper-model models/en-us-blizzard_lessac-medium.onnx
python emo_v8.py --model qwen3.5:0.8b --piper-model models/en-us-blizzard_lessac-medium.onnx

# ASR mode + Chinese
python emo_v8.py --asr --piper-model models/zh_CN-huayan-medium.onnx --gentle
python emo_v8.py --asr --model qwen3.5:0.8b --piper-model models/zh_CN-huayan-medium.onnx --gentle

# Text chat + gentle actions + Chinese
python emo_v8.py --piper-model ./models/zh_CN-huayan-medium.onnx --gentle
python emo_v8.py --piper-model ./models/zh_CN-huayan-medium.onnx --gentle --model qwen3.5:0.8b

# Optional: explicit Piper config/speaker
python emo_v8.py --piper-model models/en-us-blizzard_lessac-medium.onnx --piper-config models/en-us-blizzard_lessac-medium.onnx.json --speaker 0

Version History

  • See EMO_README.md for version details and changelog across emo_v* versions.

About

A private conversation robot based on the Hugging Face Reachy Mini.
