NeuroForge is a versatile Python toolkit for forging neural networks, autonomous AI agents, and integrated audio-visual applications. It brings together state-of-the-art machine learning frameworks, large language models (LLMs), and specialized tools for real-time interaction and Retrieval-Augmented Generation (RAG) systems.
- 🤖 AI Agents: Modular framework for creating autonomous agents with persistent memory and tool-calling capabilities.
- 🎙️ Audio Intelligence: Production-grade tools for:
  - ASR (Speech-to-Text): Whisper, SenseVoice, Paraformer, Vosk, etc.
  - TTS (Text-to-Speech): Piper, Edge-TTS, GPT-SoVITS.
  - VAD & Wake Word: Silero VAD, openwakeword, and WebRTC VAD for efficient listening.
- 🧠 Multi-Model Integration: Seamless support for OpenAI, DeepSeek, Qwen (local & cloud), Google Gemini, and AWS Bedrock (Nova).
- 📚 Advanced RAG Systems: Fast and efficient Retrieval-Augmented Generation using FastAPI and modern vector databases.
- 🏗️ Deployment Optimized: First-class support for local execution (Llama.cpp), ONNX, RKNN (Rockchip NPUs), and TensorRT.
- 🖥️ Diverse Interfaces: Interactive Text User Interfaces (TUI), FastAPI/Flask web services, and WeChat mini-program integrations.
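The agent pattern behind the first bullet, persistent memory plus a registry of callable tools, can be sketched in a few lines of plain Python. This is an illustration of the pattern only; the class and method names below are hypothetical, not NeuroForge's actual API.

```python
# Illustrative sketch of a tool-calling agent (hypothetical names, not
# NeuroForge's API): the agent keeps a persistent message history and
# dispatches requests to registered Python functions.
from typing import Callable

class ToolAgent:
    def __init__(self):
        self.memory: list[dict] = []          # persistent conversation history
        self.tools: dict[str, Callable] = {}  # tool name -> callable registry

    def register_tool(self, name: str, fn: Callable) -> None:
        self.tools[name] = fn

    def handle(self, user_text: str) -> str:
        self.memory.append({"role": "user", "content": user_text})
        # A real agent would ask an LLM which tool to invoke; here a trivial
        # "name: arg" convention stands in for that decision.
        name, _, arg = user_text.partition(":")
        if name in self.tools:
            result = str(self.tools[name](arg.strip()))
        else:
            result = f"(no tool matched: {user_text})"
        self.memory.append({"role": "assistant", "content": result})
        return result

agent = ToolAgent()
agent.register_tool("echo", lambda s: s.upper())
print(agent.handle("echo: hello"))  # -> HELLO
```

In a real deployment the dispatch decision comes from the LLM's structured (e.g. JSON) output, and `memory` is backed by a persistence layer rather than a list.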
CCV (Chenguang Computer Vision) is integrated as a git submodule at `neuro_forge/ccv/`. It is a zero-dependency C++ library providing foundational algorithms for computer vision and robotics:
| Module | Capabilities |
|---|---|
| Maths | Matrices, vectors, random number generation |
| Kinematics & Dynamics | Rotation matrices, Hamilton quaternions, Euler angles (12 conventions) |
| Estimation | EKF, Particle Filter, Gauss-Newton, Levenberg-Marquardt, Bundle Adjustment |
| Computer Vision | 2D/3D structures, image processing (Gaussian, pyramids), FAST features |
No third-party dependencies in the core — no OpenCV, Eigen, or PCL — ensuring maximum portability across edge and cloud platforms.
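As a taste of what the Kinematics module computes, here is the standard Hamilton-convention quaternion-to-rotation-matrix formula in plain Python (an illustration of the math, not CCV's C++ API):

```python
import math

def quat_to_rotmat(w, x, y, z):
    """Hamilton (w, x, y, z) unit quaternion -> 3x3 rotation matrix."""
    n = math.sqrt(w*w + x*x + y*y + z*z)
    w, x, y, z = w/n, x/n, y/n, z/n  # normalize defensively
    return [
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ]

# 90-degree rotation about the z-axis: q = (cos 45°, 0, 0, sin 45°),
# which maps the x-axis onto the y-axis.
R = quat_to_rotmat(math.cos(math.pi/4), 0.0, 0.0, math.sin(math.pi/4))
```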
```shell
# Initialize the submodule
git submodule update --init --recursive
# or
make submodules
```

```
neuro_forge/
├── agents/                  # Agent logic, memory management, and identity definitions.
├── apps/                    # Ready-to-use applications:
│   ├── voice_assistant/     # Fully-featured voice assistant (Wake word -> ASR -> LLM -> TTS).
│   ├── web_fastapi_rag/     # Production-ready RAG system.
│   ├── tui/                 # Interactive Text-based User Interfaces.
│   └── wechat_miniprogram/  # WeChat ecosystem integrations.
├── audio/                   # Core audio processing: Recording, playback, ASR, TTS, and VAD.
├── llm/                     # Connectors for various LLM providers (OpenAI, DashScope, Boto3, etc.).
├── lvm/                     # Large Vision Models: Text-to-Image and Text-to-Video capabilities.
├── framework/               # Framework integrations: Torch, ONNX, RKNN, TensorRT, vLLM.
├── memory/                  # Knowledge persistence layers: Chroma, Mem0.
├── text/                    # NLP utilities, tokenizers, and language identification.
└── tools/                   # Extension tools for agents: MCP/ACP protocols and custom skills.
```
| Domain | Technologies |
|---|---|
| Language | Python 3.10+ |
| ML Frameworks | PyTorch, TensorFlow, ONNX Runtime, RKNN, TensorRT |
| Audio | pyaudio, faster-whisper, piper-tts, openwakeword, silero-vad |
| LLM/API | openai, dashscope, google-genai, boto3, fastapi, langchain |
| UI/UX | textual (TUI), flask / fastapi (Web) |
We recommend using `uv` for fast, reliable dependency management.

```shell
uv venv
source .venv/bin/activate
```

You can install the core package or include optional feature groups:

```shell
# Core installation
uv pip install -e .

# Install specific feature groups (e.g., audio and LLM)
uv pip install -e ".[audio,text,lm,torch,langchain]"

# Install ALL dependencies
uv sync
```

Optional groups: `torch`, `tf`, `rknn`, `audio`, `text`, `lm`, `langchain`, `api`, `tui`, `aws`, `cpp`, `benchmark`.
The core definitions for agent behavior and identity:
`AGENTS.md`, `SOUL.md`, `IDENTITY.md`, `USER.md`, `TOOLS.md`, `HEARTBEAT.md`, `BOOTSTRAP.md`, `MEMORY.md`
- Models: LLM, LVM
- Tools: ACP, MCP, Skills
- Memory: Vector databases and persistent history
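The memory layer's core operation, embedding lookup by similarity, can be sketched with a tiny in-memory store. This is an illustration of the concept only; NeuroForge's actual persistence layers (Chroma, Mem0) have their own APIs.

```python
import math

class TinyVectorStore:
    """Minimal in-memory stand-in for a vector database (illustration only)."""

    def __init__(self):
        self._items: list[tuple[list[float], str]] = []

    def add(self, embedding: list[float], text: str) -> None:
        self._items.append((embedding, text))

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0

    def query(self, embedding: list[float], k: int = 1) -> list[str]:
        # Rank stored items by cosine similarity to the query embedding.
        ranked = sorted(self._items,
                        key=lambda it: self._cosine(embedding, it[0]),
                        reverse=True)
        return [text for _, text in ranked[:k]]

store = TinyVectorStore()
store.add([1.0, 0.0], "docs about audio")
store.add([0.0, 1.0], "docs about vision")
print(store.query([0.9, 0.1]))  # -> ['docs about audio']
```

In a RAG pipeline the retrieved texts are then stuffed into the LLM prompt as grounding context.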
Integrated web interfaces for agent interaction and RAG.
A full-loop demonstration of NeuroForge's capabilities.
```mermaid
flowchart LR
    A["Voice Input"]
    subgraph Agent Core
        A0("Audio PreProcess")
        B("ASR (Whisper)")
        C("LLM (Qwen/GPT)")
        D("Action/Tool Call")
        E("TTS (Piper)")
    end
    G0[Hardware/Robot]
    G1[Audio Output]
    A -- "Wake Word" --> A0 --> B -- "Text" --> C
    C -- "JSON/Tools" --> D <--> G0
    C -- "Response" --> E --> G1
```
To run the integrated voice assistant with wake-word detection:
```shell
# Ensure required extras are installed
uv pip install -e ".[torch,audio,text,lm]"

# Run the assistant
PYTHONPATH=. python -m neuro_forge.apps.voice_assistant.voice_assistant_wake
```

This project is licensed under the BSD-3-Clause License. See the LICENSE file for full details.
Developed with ❤️ by Gavin Gao (cggos@outlook.com)
