
NeuroForge 🛠️🚀

NeuroForge is an advanced, versatile Python-based toolkit for building and forging neural networks, autonomous AI agents, and integrated audio-visual applications. It brings together state-of-the-art machine learning frameworks, large language models (LLMs), and specialized tools for real-time interaction and Retrieval-Augmented Generation (RAG) systems.


✨ Key Features

  • 🤖 AI Agents: Modular framework for creating autonomous agents with persistent memory and tool-calling capabilities.
  • 🎙️ Audio Intelligence: Comprehensive production-grade tools for:
    • ASR (Speech-to-Text): Whisper, SenseVoice, Paraformer, Vosk, etc.
    • TTS (Text-to-Speech): Piper, Edge-TTS, GPT-SoVITS.
    • VAD & Wake-word: Silero VAD, openwakeword, and WebrtcVAD for efficient listening.
  • 🧠 Multi-Model Integration: Seamless support for OpenAI, DeepSeek, Qwen (local & cloud), Google Gemini, and AWS Bedrock (Nova).
  • 📚 Advanced RAG Systems: Fast and efficient Retrieval-Augmented Generation using FastAPI and modern vector databases.
  • 🏗️ Deployment Optimized: First-class support for local execution (Llama.cpp), ONNX, RKNN (Rockchip NPUs), and TensorRT.
  • 🖥️ Diverse Interfaces: Interactive Text User Interfaces (TUI), FastAPI/Flask web services, and WeChat mini-program integrations.
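To make the RAG feature concrete, here is a minimal, self-contained sketch of the retrieval step. It uses a toy bag-of-words similarity rather than NeuroForge's actual vector-database connectors (Chroma, etc.), so function names and documents here are illustrative only:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real RAG system uses dense vectors.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Rank documents by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "Piper is a fast local text to speech engine",
    "Whisper transcribes speech to text",
    "RKNN deploys models on Rockchip NPUs",
]
print(retrieve("which tool does speech to text", docs))
```

The retrieved passages are then prepended to the LLM prompt; swapping the toy `embed` for a real embedding model and a vector store is what the `memory/` layer provides.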

👁️ Visual AI

CCV (Chenguang Computer Vision) is integrated as a git submodule at neuro_forge/ccv/.

A zero-dependency C++ library providing foundational algorithms for computer vision and robotics:

| Module | Capabilities |
| --- | --- |
| Maths | Matrices, vectors, random number generation |
| Kinematics & Dynamics | Rotation matrices, Hamilton quaternions, Euler angles (12 conventions) |
| Estimation | EKF, Particle Filter, Gauss-Newton, Levenberg-Marquardt, Bundle Adjustment |
| Computer Vision | 2D/3D structures, image processing (Gaussian, pyramids), FAST features |

No third-party dependencies in the core — no OpenCV, Eigen, or PCL — ensuring maximum portability across edge and cloud platforms.
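As a flavor of the kind of algorithm the Kinematics module covers, here is a dependency-free Python sketch of the standard Hamilton-quaternion-to-rotation-matrix conversion (CCV itself implements this in C++; this sketch is for illustration, not a port of its code):

```python
import math

def quat_to_rot(w: float, x: float, y: float, z: float) -> list[list[float]]:
    # Convert a Hamilton-convention quaternion (w, x, y, z) to a 3x3
    # rotation matrix, normalizing first so any nonzero quaternion works.
    n = math.sqrt(w*w + x*x + y*y + z*z)
    w, x, y, z = w/n, x/n, y/n, z/n
    return [
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ]

# 90-degree rotation about z: q = (cos 45°, 0, 0, sin 45°)
R = quat_to_rot(math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
```

For the example quaternion, `R` is the familiar 90° z-rotation that maps the x axis onto the y axis.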

# Initialize submodule
git submodule update --init --recursive
# or
make submodules

📂 Project Structure

neuro_forge/
├── agents/         # Agent logic, memory management, and identity definitions.
├── apps/           # Ready-to-use applications:
│   ├── voice_assistant/  # Fully-featured voice assistant (Wake word -> ASR -> LLM -> TTS).
│   ├── web_fastapi_rag/  # Production-ready RAG system.
│   ├── tui/              # Interactive Text-based User Interfaces.
│   └── wechat_miniprogram/ # WeChat ecosystem integrations.
├── audio/          # Core audio processing: Recording, playback, ASR, TTS, and VAD.
├── llm/            # Connectors for various LLM providers (OpenAI, DashScope, Boto3, etc.).
├── lvm/            # Large Vision Models: Text-to-Image and Text-to-Video capabilities.
├── framework/      # Framework integrations: Torch, ONNX, RKNN, TensorRT, vLLM.
├── memory/         # Knowledge persistence layers: Chroma, Mem0.
├── text/           # NLP utilities, tokenizers, and language identification.
└── tools/          # Extension tools for agents: MCP/ACP protocols and custom skills.

🛠 Tech Stack

| Domain | Technologies |
| --- | --- |
| Language | Python 3.10+ |
| ML Frameworks | PyTorch, TensorFlow, ONNX Runtime, RKNN, TensorRT |
| Audio | pyaudio, faster-whisper, piper-tts, openwakeword, silero-vad |
| LLM/API | openai, dashscope, google-genai, boto3, fastapi, langchain |
| UI/UX | textual (TUI), flask / fastapi (Web) |

📦 Installation

We highly recommend using uv for fast and reliable dependency management.

1. Setup Environment

uv venv
source .venv/bin/activate

2. Install Dependencies

You can install the core package or include optional feature groups:

# Core installation
uv pip install -e .

# Install specific feature groups (e.g., audio and LLM)
uv pip install -e ".[audio,text,lm,torch,langchain]"

# Install ALL dependencies
uv sync

Optional Groups: torch, tf, rknn, audio, text, lm, langchain, api, tui, aws, cpp, benchmark.
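For reference, extras groups like these are declared under `[project.optional-dependencies]` in `pyproject.toml`. The snippet below is an illustrative sketch of that mechanism, not the repository's actual file (the real file defines the authoritative package lists):

```toml
[project.optional-dependencies]
# Hypothetical contents shown for illustration only.
audio = ["pyaudio", "faster-whisper", "piper-tts", "openwakeword"]
lm = ["openai", "dashscope", "google-genai"]
api = ["fastapi", "uvicorn"]
```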


🎤 Applications

🤖 AI Agents

📄 Bootstrap Files

The core definitions for agent behavior and identity:

  • AGENTS.md, SOUL.md, IDENTITY.md, USER.md, TOOLS.md, HEARTBEAT.md, BOOTSTRAP.md, MEMORY.md
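One simple way such bootstrap files can be assembled into an agent's system prompt is sketched below. This is an illustrative pattern, not NeuroForge's actual loading logic; the file subset and function name are assumptions:

```python
import pathlib
import tempfile

BOOTSTRAP_FILES = ["AGENTS.md", "SOUL.md", "IDENTITY.md"]  # subset for brevity

def build_system_prompt(root: pathlib.Path) -> str:
    # Concatenate whichever bootstrap files exist into one system prompt,
    # tagging each section with its source filename.
    parts = []
    for name in BOOTSTRAP_FILES:
        path = root / name
        if path.exists():
            parts.append(f"# {name}\n{path.read_text()}")
    return "\n\n".join(parts)

# Demo with a temporary directory standing in for the agent's root.
root = pathlib.Path(tempfile.mkdtemp())
(root / "SOUL.md").write_text("Be helpful.")
prompt = build_system_prompt(root)
print(prompt)
```

Missing files are simply skipped, so an agent can ship with only the identity files it needs.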

🧩 Components

  • Models: LLM, LVM
  • Tools: ACP, MCP, Skills
  • Memory: Vector databases and persistent history
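The interaction of these three components can be sketched as a minimal agent loop. The LLM is stubbed out here with a trivial rule; in a real deployment the `ask_llm` call would route to one of the LLM connectors, and `memory` would be backed by Chroma or Mem0 rather than a list (all names below are illustrative):

```python
def ask_llm(prompt: str) -> str:
    # Stub standing in for a real model that decides when to call a tool.
    if "time" in prompt:
        return "CALL get_time"
    return "FINAL I cannot help with that."

TOOLS = {"get_time": lambda: "12:00"}   # registered agent tools
memory: list[str] = []                  # persistent history, in-memory here

def run_agent(user_msg: str) -> str:
    memory.append(f"user: {user_msg}")
    reply = ask_llm(user_msg)
    if reply.startswith("CALL "):
        tool = reply.removeprefix("CALL ")
        result = TOOLS[tool]()          # tool-calling step
        reply = f"The {tool} tool says: {result}"
    memory.append(f"agent: {reply}")
    return reply

print(run_agent("what time is it"))
```

The same loop generalizes to MCP/ACP tools: the model emits a structured call, the runtime executes it, and the result is folded back into the conversation history.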

🌐 Web with FastAPI / Flask

Integrated web interfaces for agent interaction and RAG.

🎙️ Voice Assistant

A full-loop demonstration of NeuroForge's capabilities.

flowchart LR
    A["Voice Input"]

    subgraph Agent Core
        A0("Audio PreProcess")
        B("ASR (Whisper)")
        C("LLM (Qwen/GPT)")
        D("Action/Tool Call")
        E("TTS (Piper)")
    end

    G0[Hardware/Robot]
    G1[Audio Output]

    A -- "Wake Word" --> A0 --> B -- "Text" --> C
    C -- "JSON/Tools" --> D <--> G0
    C -- "Response" --> E --> G1

To run the integrated voice assistant with wake-word detection:

# Ensure required extras are installed
uv pip install -e ".[torch,audio,text,lm]"

# Run the assistant
PYTHONPATH=. python -m neuro_forge.apps.voice_assistant.voice_assistant_wake

📄 License

This project is licensed under the BSD-3-Clause License. See the LICENSE file for full details.

Developed with ❤️ by Gavin Gao (cggos@outlook.com)
