This document describes the architecture of the ComicFrames package after the major refactoring.
ComicFrames is now organized into a modular, extensible architecture with proper separation of concerns, caching, configuration management, and model abstraction.
src/comicframes/
├── __init__.py # Main package exports
├── cli.py # Command-line interfaces
├── pdf_processor.py # Legacy PDF processing (backward compatibility)
├── frame_detector.py # Legacy frame detection (backward compatibility)
├── utils.py # Legacy utilities (backward compatibility)
│
├── config/ # Configuration management
│ ├── __init__.py
│ ├── settings.py # Global settings
│ ├── cache_config.py # Cache configuration
│ └── model_config.py # Model registry and configuration
│
├── cache/ # Caching system
│ ├── __init__.py
│ ├── file_cache.py # File-based caching
│ ├── memory_cache.py # In-memory caching
│ └── cache_manager.py # Central cache management
│
├── core/ # Core abstractions
│ ├── __init__.py
│ ├── data_structures.py # Data models (Frame, ComicPage, etc.)
│ ├── base_processor.py # Base class for processors
│ ├── base_model.py # Base class for ML models
│ └── pipeline.py # Processing pipeline
│
├── models/ # Model implementations
│ ├── __init__.py
│ ├── model_factory.py # Factory for creating models
│ ├── frame_detection_model.py # OpenCV frame detection
│ ├── interpolation_models.py # RIFE, FILM interpolation
│ └── object_detection_model.py # YOLO detection
│
└── processing/ # High-level processors
  ├── __init__.py
  ├── pdf_processor.py # PDF processing with new architecture
  ├── frame_processor.py # Frame detection with new architecture
  └── interpolation_processor.py # Frame interpolation
- Settings: Global configuration with environment variable support
- CacheConfig: Cache-specific configuration
- ModelConfig: Model registry and configuration
- ModelRegistry: Central registry of available models
- FileCache: Persistent file-based caching with TTL and size limits
- MemoryCache: Fast in-memory caching with LRU eviction
- CacheManager: Unified interface for all cache types
- Data Structures: Type-safe data models (Frame, ComicPage, BoundingBox, etc.)
- BaseProcessor: Abstract base for all processing operations
- BaseModel: Abstract base for all ML models
- ProcessingPipeline: Chain multiple processors together
- ModelFactory: Create model instances by name or type
- Frame Detection: OpenCV-based contour detection
- Interpolation: RIFE and FILM frame interpolation (TODO)
- Object Detection: YOLO-based detection (TODO)
- PDFProcessor: Extract pages from PDF files
- FrameProcessor: Detect frames in comic pages
- InterpolationProcessor: Generate interpolated frames
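
To make the relationship between BaseProcessor and ProcessingPipeline concrete, here is a minimal sketch of how stages might be chained. The class bodies below are hypothetical stand-ins written for illustration, not the package's actual implementation; only the names `BaseProcessor`, `_process`, `ProcessingPipeline`, `add_stage`, and `process` come from this document. The toy stages `SplitWords` and `CountItems` are invented so the example runs without a PDF.

```python
from abc import ABC, abstractmethod
from typing import Any, List


class BaseProcessor(ABC):
    """Hypothetical stand-in for comicframes.core.BaseProcessor."""

    def process(self, data: Any) -> Any:
        # Public entry point; a real base class might add caching or timing here.
        return self._process(data)

    @abstractmethod
    def _process(self, data: Any) -> Any:
        """Subclasses implement the actual work."""


class ProcessingPipeline:
    """Hypothetical stand-in: runs stages in the order they were added."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._stages: List[BaseProcessor] = []

    def add_stage(self, stage: BaseProcessor) -> "ProcessingPipeline":
        self._stages.append(stage)
        return self  # returning self allows chained add_stage calls (an assumption)

    def process(self, data: Any) -> Any:
        # The output of each stage becomes the input of the next.
        for stage in self._stages:
            data = stage.process(data)
        return data


class SplitWords(BaseProcessor):
    """Toy stage: turns a string into a list of words."""

    def _process(self, data: str) -> List[str]:
        return data.split()


class CountItems(BaseProcessor):
    """Toy stage: counts the items produced by the previous stage."""

    def _process(self, data: List[str]) -> int:
        return len(data)


pipeline = ProcessingPipeline("demo")
pipeline.add_stage(SplitWords()).add_stage(CountItems())
result = pipeline.process("page one page two")
print(result)  # 4
```

The point of the design is that each stage only needs to agree with its neighbours on the shape of the data passed between them, so stages can be swapped or reordered without touching the pipeline itself.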
All existing APIs are preserved for backward compatibility:
# Old way (still works)
from comicframes import pdf_to_images, detect_frames
# New way
from comicframes.processing import PDFProcessor, FrameProcessor

from comicframes import Settings, get_settings
settings = get_settings()
settings.min_frame_width = 100
settings.enable_cache = True

from comicframes import get_cache_manager
cache = get_cache_manager()
cache.set_frame_data("key", data)
data = cache.get_frame_data("key")

from comicframes import ModelFactory
from comicframes.config import ModelType
# Create specific model
detector = ModelFactory.create_model("opencv_contours")
# Create default model for type
detector = ModelFactory.create_default_model(ModelType.FRAME_DETECTION)

from comicframes.processing import PDFProcessor, FrameProcessor
from comicframes.core import ProcessingPipeline
pipeline = ProcessingPipeline("comic_processing")
pipeline.add_stage(PDFProcessor())
pipeline.add_stage(FrameProcessor())
result = pipeline.process("comic.pdf")

The package now provides several CLI commands:
- comicframes-pdf: Convert PDF to images (legacy)
- comicframes-extract: Extract frames from images (legacy)
- comicframes-pipeline: Run complete processing pipeline
- comicframes-cache: Manage cache (stats, clear, cleanup)
- comicframes-models: Manage models (list, info)
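
As an illustration of how a command such as comicframes-cache could dispatch its three actions, the sketch below uses the standard library's `argparse` with one subcommand per action. Whether the real CLI exposes `stats`, `clear`, and `cleanup` as subcommands or as flags is not stated here, so treat the layout, the `build_parser` and `main` helpers, and the returned message as assumptions.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand per cache action (assumed layout)."""
    parser = argparse.ArgumentParser(
        prog="comicframes-cache",
        description="Manage the ComicFrames cache (illustrative sketch).",
    )
    actions = parser.add_subparsers(dest="action", required=True)
    actions.add_parser("stats", help="show cache statistics")
    actions.add_parser("clear", help="remove all cached entries")
    actions.add_parser("cleanup", help="remove expired entries only")
    return parser


def main(argv: Optional[List[str]] = None) -> str:
    """Parse arguments and report which action would run."""
    args = build_parser().parse_args(argv)
    # A real command would call into the cache manager here.
    return f"would run cache action: {args.action}"


print(main(["stats"]))
```

Passing `argv` explicitly keeps `main` easy to test; when it is `None`, `argparse` falls back to the process's command-line arguments.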
- Multi-level caching: Memory, file-based, and processing caches
- TTL-based expiration: Automatic cleanup of stale data
- Size limits: Prevents cache from consuming too much storage
- Cache hit tracking: Monitors cache effectiveness
- Models are loaded only when needed
- Image data is loaded on demand
- Configuration is cached across requests
- Processing time tracking
- Success/failure rates
- Cache hit rates
- Stage-level metrics in pipelines
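
The caching behaviour listed above (TTL-based expiration, size limits, cache hit tracking) can be sketched in a few lines. This is a minimal in-memory illustration under stated assumptions, not the package's MemoryCache or FileCache: the class name `TinyCache`, its parameters, and the injectable `clock` are all hypothetical.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


class TinyCache:
    """Illustrative LRU cache with per-entry TTL and hit tracking."""

    def __init__(
        self,
        max_items: int = 128,
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_items = max_items
        self.ttl_seconds = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._data = OrderedDict()  # key -> (expires_at, value), oldest first
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is not None:
            expires_at, value = entry
            if self._clock() < expires_at:
                self._data.move_to_end(key)  # mark as most recently used
                self.hits += 1
                return value
            del self._data[key]  # expired: drop the stale entry
        self.misses += 1
        return None

    def set(self, key: str, value: Any) -> None:
        self._data[key] = (self._clock() + self.ttl_seconds, value)
        self._data.move_to_end(key)
        while len(self._data) > self.max_items:
            self._data.popitem(last=False)  # evict the least recently used entry

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = TinyCache(max_items=2, ttl_seconds=60.0)
cache.set("page-1", "frames for page 1")
cache.set("page-2", "frames for page 2")
cache.get("page-1")                      # hit; page-1 becomes most recently used
cache.set("page-3", "frames for page 3")  # evicts page-2, the least recently used
print(cache.get("page-2"), cache.hit_rate)
```

A file-based cache would apply the same ideas with a byte-size budget instead of an item count, which is presumably what the storage size limit above refers to.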
from comicframes.core import BaseProcessor

class CustomProcessor(BaseProcessor):
    def _process(self, data):
        # Your processing logic
        return processed_data

from comicframes.core import BaseModel

class CustomModel(BaseModel):
    def _load_model(self):
        # Load your model
        return model

    def _predict(self, input_data):
        # Make predictions
        return predictions

The cache system is extensible to support different backends (Redis, database, etc.).
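
One common way to make a cache extensible to other backends is to have callers depend on a small storage interface. The sketch below assumes such an interface; `CacheBackend`, `DictBackend`, and `FrameCache` are hypothetical names, and only `set_frame_data` and `get_frame_data` mirror methods shown earlier in this document. The package's real extension point may look different.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional


class CacheBackend(ABC):
    """Hypothetical storage interface that every backend would implement."""

    @abstractmethod
    def get(self, key: str) -> Optional[Any]:
        """Return the stored value, or None if the key is absent."""

    @abstractmethod
    def set(self, key: str, value: Any) -> None:
        """Store a value under the given key."""

    @abstractmethod
    def delete(self, key: str) -> None:
        """Remove the key if it exists."""


class DictBackend(CacheBackend):
    """In-process backend; a Redis or database backend would expose the same methods."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}

    def get(self, key: str) -> Optional[Any]:
        return self._store.get(key)

    def set(self, key: str, value: Any) -> None:
        self._store[key] = value

    def delete(self, key: str) -> None:
        self._store.pop(key, None)


class FrameCache:
    """Caller-facing wrapper that depends only on the CacheBackend interface."""

    def __init__(self, backend: CacheBackend) -> None:
        self._backend = backend

    def set_frame_data(self, key: str, data: Any) -> None:
        self._backend.set(f"frames:{key}", data)  # namespaced key (an assumption)

    def get_frame_data(self, key: str) -> Optional[Any]:
        return self._backend.get(f"frames:{key}")


frame_cache = FrameCache(DictBackend())
frame_cache.set_frame_data("page-1", [(0, 0, 100, 200)])
print(frame_cache.get_frame_data("page-1"))
```

Swapping the backend then means passing a different `CacheBackend` implementation to the wrapper, with no change to calling code.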
The refactored architecture is designed to integrate the existing external models, none of which are wired in yet:
- ECCV2022-RIFE: Frame interpolation models (to be integrated)
- frame-interpolation: Google's FILM models (to be integrated)
- Yolo_Model: Object detection (to be integrated)
These will be properly wrapped in the new model abstraction system.
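
A rough sketch of what such a wrapper could look like follows, combining the `_load_model` and `_predict` hooks shown earlier with load-on-first-use. The `BaseModel` body, the `predict` method, and the `is_loaded` property are assumptions made for illustration. `AveragingInterpolator` is a toy placeholder that blends two lists of numbers at their midpoint; it is not RIFE or FILM.

```python
from abc import ABC, abstractmethod
from typing import Any, List, Tuple


class BaseModel(ABC):
    """Hypothetical stand-in for comicframes.core.BaseModel."""

    def __init__(self) -> None:
        self._model: Any = None

    @property
    def is_loaded(self) -> bool:
        return self._model is not None

    def predict(self, input_data: Any) -> Any:
        if self._model is None:
            self._model = self._load_model()  # load on first use only
        return self._predict(input_data)

    @abstractmethod
    def _load_model(self) -> Any:
        """Load and return the underlying model object."""

    @abstractmethod
    def _predict(self, input_data: Any) -> Any:
        """Run the loaded model on the input."""


class AveragingInterpolator(BaseModel):
    """Toy placeholder standing in for an external interpolation model."""

    def _load_model(self) -> Any:
        # A real wrapper would import the external package and load weights here.
        return lambda a, b: [(x + y) / 2 for x, y in zip(a, b)]

    def _predict(self, input_data: Tuple[List[float], List[float]]) -> List[float]:
        first, second = input_data
        return self._model(first, second)


model = AveragingInterpolator()
print(model.is_loaded)  # False: nothing is loaded at construction time
middle = model.predict(([0.0, 10.0], [2.0, 20.0]))
print(model.is_loaded, middle)  # True [1.0, 15.0]
```

Keeping the external dependency inside `_load_model` means that importing the wrapper costs nothing until a prediction is actually requested.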
- Model Integration: Complete RIFE, FILM, and YOLO integrations
- Distributed Processing: Support for distributed/parallel processing
- Model Training: APIs for training custom models
- Web Interface: Web-based UI for processing pipelines
- Database Backend: Optional database storage for metadata
- GPU Acceleration: Automatic GPU utilization when available