A comprehensive toolkit for training high-quality LoRA models for 3D animated characters, specifically optimized for Pixar-style animation and other 3D CGI content.
- Video Preprocessing: Frame extraction with scene detection and interpolation
- AI-Powered Segmentation: Multi-layer segmentation optimized for 3D content
- Smart Character Clustering: CLIP-based clustering with interactive refinement
- Auto Caption Generation: BLIP2-powered caption generation
- Quality Training Data: Automated dataset preparation with augmentation
- LoRA Training: Integration with Kohya_ss for optimized training
- Evaluation Tools: Comprehensive LoRA quality testing
- CPU-Only Pose Data Preparation: MediaPipe-powered pose detection that runs in parallel with GPU training (60+ FPS on 32-thread CPU)
Unlike 2D anime pipelines, this is specifically designed for:
- Smooth 3D shading and realistic lighting
- Depth-aware segmentation
- Material properties (SSS, specular, etc.)
- Consistent 3D character models
- Motion capture-based animation
See .claude/claude.md for detailed documentation and workflows.
# 1. Extract frames
python scripts/generic/video/universal_frame_extractor.py \
--input movie.mp4 \
--output frames/ \
--mode scene
# 2. Segment characters
python scripts/generic/segmentation/layered_segmentation.py \
--input-dir frames/ \
--output-dir segmented/ \
--extract-characters
# 3. Cluster characters
python scripts/generic/clustering/character_clustering.py \
--input-dir segmented/characters \
--output-dir clustered/
# 4. Prepare training data
python scripts/generic/training/prepare_training_data.py \
--character-dirs clustered/character_* \
--output-dir training_data/ \
--generate-captions
# 5. Train LoRA
conda run -n ai_env python sd-scripts/train_network.py \
--config_file configs/character_training.toml

- scripts/core/pipeline: 3D pipeline CLI (python -m scripts.core.pipeline ...)
- scripts/run_pipeline.py: 2D pipeline CLI (python scripts/run_pipeline.py ...)
- scripts/batch: Batch/nohup launchers (incl. Wan2.1 dataset + training)
- scripts/training: SDXL/Kohya training orchestration & monitoring
- scripts/generic: Reusable tools (segmentation, clustering, inpainting, training utils)
- anime_pipeline/: Packaged pipeline library (2D-focused modules)
- docs/: Documentation and guides
- configs/: Shared configuration
- requirements/: Python dependencies
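The five quick-start steps above can be chained from Python rather than run by hand; a minimal sketch using subprocess, assuming the script paths and flags exactly as shown in the quick start (run from the repo root — the final Kohya training step is left to its own conda environment):

```python
import subprocess

# Ordered quick-start stages; paths and flags mirror the commands above.
STAGES = [
    ["python", "scripts/generic/video/universal_frame_extractor.py",
     "--input", "movie.mp4", "--output", "frames/", "--mode", "scene"],
    ["python", "scripts/generic/segmentation/layered_segmentation.py",
     "--input-dir", "frames/", "--output-dir", "segmented/", "--extract-characters"],
    ["python", "scripts/generic/clustering/character_clustering.py",
     "--input-dir", "segmented/characters", "--output-dir", "clustered/"],
    ["python", "scripts/generic/training/prepare_training_data.py",
     "--character-dirs", "clustered/character_*",
     "--output-dir", "training_data/", "--generate-captions"],
]

def run_pipeline(stages=STAGES, dry_run=False):
    """Run each stage in order, stopping at the first failure."""
    for cmd in stages:
        if dry_run:
            print(" ".join(cmd))
            continue
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure

run_pipeline(dry_run=True)  # print the commands without executing them
```

`check=True` makes a failed stage abort the whole run instead of silently feeding bad data to the next step.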
- Python 3.10+
- CUDA-capable GPU (recommended: RTX 3090 or better)
- 32GB+ RAM
- 500GB+ disk space for datasets
pip install -r requirements/all.txt

- Quick Start: docs/guides/quick_start.md
- Tool Guides: docs/guides/tools/
- 3D Training: docs/3d-training/
- Setup: docs/setup/
Repo-local generated artifacts can get very large over time (especially outputs/ and logs/).
# Safe cleanup of repo-local artifacts (keeps active Wan2.1 training log if running)
bash scripts/maintenance/cleanup_repo_artifacts.sh

All data is stored in a centralized warehouse:
/mnt/data/ai_data/
├── datasets/3d-anime/ # Raw datasets
├── training_data/3d_characters/ # Prepared training data
├── models/lora/3d_characters/ # Trained LoRA models
└── lora_evaluation/ # Evaluation results
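Scripts can resolve warehouse locations from one place instead of hard-coding paths; a small sketch (a hypothetical helper, not an existing repo API — the paths come from the tree above):

```python
from pathlib import Path

# Centralized data warehouse layout (paths from the tree above).
WAREHOUSE = Path("/mnt/data/ai_data")

PATHS = {
    "datasets": WAREHOUSE / "datasets/3d-anime",
    "training_data": WAREHOUSE / "training_data/3d_characters",
    "lora_models": WAREHOUSE / "models/lora/3d_characters",
    "evaluation": WAREHOUSE / "lora_evaluation",
}

def ensure_warehouse(paths=PATHS, create=False):
    """Report (and optionally create) each warehouse directory."""
    status = {}
    for name, path in paths.items():
        if create:
            path.mkdir(parents=True, exist_ok=True)
        status[name] = path.exists()
    return status
```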
ComfyUI is installed at /mnt/c/ai_tools/comfyui/ for visual workflow design and LoRA testing.
Quick Start:
/mnt/c/ai_tools/comfyui/start_comfyui.sh
# Access at http://localhost:8188

Pre-configured Workflows:
- LoRA checkpoint comparison testing
- Multi-character scene composition
- Video generation pipeline
- ControlNet pose control
Features:
- Visual node-based workflow design
- InstantID & IPAdapter for character consistency
- RIFE frame interpolation for video generation
- SAM segmentation integration via Impact Pack
- Python API for automated testing
See docs/guides/comfyui/COMFYUI_INTEGRATION.md for detailed setup and usage.
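For automated testing, ComfyUI exposes an HTTP endpoint for queuing workflows; a minimal stdlib-only sketch, assuming the default port from the start script above and a workflow dict already saved in ComfyUI's API export format (exact response fields may vary by ComfyUI version):

```python
import json
import urllib.request
import uuid

COMFYUI_URL = "http://localhost:8188"  # port from start_comfyui.sh above

def build_prompt_payload(workflow, client_id=None):
    """Wrap an API-format workflow graph for ComfyUI's POST /prompt endpoint."""
    body = {"prompt": workflow, "client_id": client_id or uuid.uuid4().hex}
    return json.dumps(body).encode("utf-8")

def queue_prompt(workflow):
    """Submit a workflow to the local ComfyUI server; returns its JSON response."""
    req = urllib.request.Request(
        f"{COMFYUI_URL}/prompt",
        data=build_prompt_payload(workflow),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Looping `queue_prompt` over a list of LoRA checkpoint paths is one way to drive the checkpoint-comparison workflow from a test script.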
- universal_frame_extractor.py: Extract frames with scene detection
- frame_interpolator.py: Generate intermediate frames
- video_synthesizer.py: Create videos from frames
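Scene-mode extraction boils down to flagging frames whose content jumps relative to the previous frame; a pure-Python toy of that idea using a normalized histogram difference (the real extractor's method and thresholds may differ — frames here are flat lists of grayscale pixel values for illustration):

```python
def histogram(frame, bins=8, max_val=255):
    """Coarse intensity histogram, normalized to sum to 1."""
    counts = [0] * bins
    for px in frame:
        counts[min(px * bins // (max_val + 1), bins - 1)] += 1
    total = len(frame) or 1
    return [c / total for c in counts]

def scene_cuts(frames, threshold=0.5):
    """Indices where the histogram distance to the previous frame exceeds threshold."""
    cuts = []
    prev = None
    for i, frame in enumerate(frames):
        hist = histogram(frame)
        if prev is not None:
            dist = sum(abs(a - b) for a, b in zip(hist, prev))  # L1 distance, max 2.0
            if dist > threshold:
                cuts.append(i)
        prev = hist
    return cuts

# Example: three dark frames, then three bright frames -> one cut at frame 3
frames = [[10, 12, 11]] * 3 + [[240, 250, 245]] * 3
# scene_cuts(frames) -> [3]
```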
- layered_segmentation.py: Multi-layer segmentation (U²-Net, SAM, ISNet)
  - Depth-aware segmentation for 3D content
  - LaMa inpainting for background restoration
- character_clustering.py: CLIP + HDBSCAN clustering
- interactive_character_selector.py: Interactive cluster review
- turbo_character_clustering.py: Fast batch clustering
- prepare_training_data.py: Dataset organization
- generate_captions_blip2.py: Auto-caption generation
- augment_small_clusters.py: Data augmentation
- prepare_pose_lora_data.py: Pose LoRA dataset preparation (CPU/GPU)
- Batch scripts: prepare_all_pose_lora_cpu.sh (automated batch pose data prep, 32-thread optimized)
- test_lora_checkpoints.py: Test LoRA quality
- compare_lora_models.py: Model comparison
- lora_quality_metrics.py: Quality metrics
Adjust for 3D characteristics:
- Lower alpha threshold: 0.15 (vs 0.25 for 2D)
- Lower blur threshold: 80 (vs 100 for 2D)
- Consider using SAM for better 3D detection
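The threshold tweaks above fit naturally into a small per-mode quality gate; a sketch (a hypothetical helper, not a repo API), assuming alpha coverage is the fraction of non-transparent pixels and blur is a Laplacian-variance-style sharpness score where higher means sharper:

```python
# Per-mode thresholds from the guidance above.
THRESHOLDS = {
    "3d": {"alpha": 0.15, "blur": 80},
    "2d": {"alpha": 0.25, "blur": 100},
}

def passes_quality_gate(alpha_coverage, blur_score, mode="3d"):
    """Keep a segmented crop only if it clears both thresholds for its mode."""
    t = THRESHOLDS[mode]
    return alpha_coverage >= t["alpha"] and blur_score >= t["blur"]

# A soft-shaded 3D crop that the stricter 2D settings would reject:
passes_quality_gate(0.18, 90, mode="3d")  # True
passes_quality_gate(0.18, 90, mode="2d")  # False: below both 0.25 and 100
```

The looser 3D thresholds reflect that smooth shading and depth-of-field blur are normal in CGI frames, not defects.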
3D characters are more consistent:
- Lower min_cluster_size: 10-15 (vs 20-25 for 2D)
- Lower min_samples: 2 (vs 3-5 for 2D)
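The clustering step groups crops by embedding similarity and discards groups smaller than min_cluster_size; a pure-Python toy of that idea using greedy cosine-similarity grouping (the real tool uses CLIP embeddings with HDBSCAN and the parameters above — this is only to illustrate the min_cluster_size effect):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def greedy_cluster(embeddings, sim_threshold=0.9, min_cluster_size=2):
    """Assign each embedding to the first cluster whose seed it matches."""
    clusters = []  # each: {"seed": vector, "members": [indices]}
    for i, vec in enumerate(embeddings):
        for c in clusters:
            if cosine(vec, c["seed"]) >= sim_threshold:
                c["members"].append(i)
                break
        else:
            clusters.append({"seed": vec, "members": [i]})
    # Mimic HDBSCAN's min_cluster_size: undersized clusters become noise
    return [c["members"] for c in clusters if len(c["members"]) >= min_cluster_size]

# Two tight groups plus one outlier that falls below min_cluster_size:
embs = [[1, 0], [0.99, 0.05], [0, 1], [0.02, 0.98], [0.7, 0.7]]
# greedy_cluster(embs) -> [[0, 1], [2, 3]]
```

Lowering min_cluster_size for 3D content keeps small but genuine character groups that the 2D defaults would throw away as noise.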
For identity LoRAs, captions must explicitly teach the attributes you want the LoRA to learn (e.g., age/gender).
Guidelines:
- Start captions with the trigger token (e.g., yuwen) and any required identity tags (e.g., 12-year-old boy, child), then style tokens (e.g., pixar style).
- Keep the rest factual and descriptive (face/eyes/hair/clothing/accessories/pose/camera/lighting).
- Avoid redundant repeats of style tokens and the trigger token inside the generated tag list.
Example 3D tags:
"a 3D animated character with smooth shading, pixar style"
"rendered 3d character model, studio lighting, high quality animation"
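The caption guidelines above can be enforced mechanically; a sketch of a hypothetical caption assembler (not a repo API) that front-loads the trigger token, identity tags, and style tokens, then de-duplicates them out of the generated tag list:

```python
def build_caption(trigger, identity_tags, style_tags, generated_tags):
    """Order: trigger, identity, style, then unique descriptive tags."""
    prefix = [trigger, *identity_tags, *style_tags]
    seen = {t.strip().lower() for t in prefix}
    body = []
    for tag in generated_tags:
        key = tag.strip().lower()
        if key not in seen:  # drop repeats of trigger/style inside generated tags
            seen.add(key)
            body.append(tag.strip())
    return ", ".join(prefix + body)

caption = build_caption(
    "yuwen",
    ["12-year-old boy", "child"],
    ["pixar style"],
    ["pixar style", "short black hair", "blue jacket", "soft studio lighting"],
)
# -> "yuwen, 12-year-old boy, child, pixar style, short black hair,
#     blue jacket, soft studio lighting"
```

Note the duplicate "pixar style" emitted by the tagger is dropped, so the style token appears exactly once, at the front where it carries the most weight.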
- Pixar (Toy Story, Incredibles, etc.)
- DreamWorks (Shrek, How to Train Your Dragon)
- Blue Sky Studios (Ice Age, Rio)
- Illumination (Minions, Sing)
- Disney 3D (Frozen, Moana)
- Custom 3D animations
For personal use and research purposes. Models and assets subject to their respective licenses.
v1.0.0 - Initial 3D animation pipeline (2025-11-08)
For detailed usage, see .claude/claude.md or docs/ directory.