This repository contains the deployment configuration for running OminiControl on Replicate. OminiControl is a powerful and versatile subject-driven image generation model that enables precise control over the generated output while maintaining high fidelity.
⚠️ It is not currently deployed on Replicate because the model folder (flux1-dev and omini) is very large, roughly 56 GB.
This implementation is based on the OminiControl project:
- Original Repository: OminiControl
- Paper: "OminiControl: Control Any Elements in Any Images"
- Authors: Yuan Shi*, Jing Shi*, Michael J. Black, Yebin Liu, Yiyi Liao
If you use this model, please cite:
@article{shi2024ominicontrol,
  title={OminiControl: Control Any Elements in Any Images},
  author={Shi, Yuan and Shi, Jing and Black, Michael J and Liu, Yebin and Liao, Yiyi},
  journal={arXiv preprint arXiv:2411.15098},
  year={2024}
}

This deployment uses Cog to package the OminiControl model, allowing you to:
- Generate images with specific subject control
- Choose between 512x512 and 1024x1024 resolutions
- Fine-tune the generation process with custom prompts
- Deploy easily to Replicate or run locally
Prerequisites:

- NVIDIA GPU with CUDA support
- Cog installed
- HuggingFace Account with access token
- Python 3.11+
- Install Cog:
sudo curl -o /usr/local/bin/cog -L "https://github.com/replicate/cog/releases/latest/download/cog_$(uname -s)_$(uname -m)"
sudo chmod +x /usr/local/bin/cog

- Clone this repository:
git clone [your-repo-url]
cd [repo-name]

- Set up your HuggingFace token:
export HF_TOKEN=your_token_here

Project structure:

├── cog.yaml             # Cog configuration file
├── predict.py           # Main prediction script
├── download_weights.py  # Script to download model weights
└── README.md            # This file
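The weight-download script itself is not reproduced in this README. A minimal sketch of what download_weights.py might contain, using `huggingface_hub`; the repo IDs and target directories below are assumptions based on the folder names mentioned above, not verified values:

```python
# download_weights.py -- illustrative sketch only. The HuggingFace repo ids
# and local directory names are assumptions; adjust them to the real weights.
import os

# Model repos to fetch and where to place them locally
REPOS = {
    "black-forest-labs/FLUX.1-dev": "checkpoints/flux1-dev",  # base model (gated)
    "Yuanshi/OminiControl": "checkpoints/omini",              # control weights (assumed id)
}


def download_all() -> None:
    """Download every repo in REPOS into its local directory."""
    # Imported lazily so the module can be inspected without the dependency
    from huggingface_hub import snapshot_download

    token = os.environ.get("HF_TOKEN")  # FLUX.1-dev is gated; a token is required
    for repo_id, local_dir in REPOS.items():
        snapshot_download(repo_id, local_dir=local_dir, token=token)
```

Calling `download_all()` pulls roughly 56 GB of weights, so make sure the disk has room before running it.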
- Download the model weights:
python download_weights.py

- Build the Docker image:
cog build

- Run predictions locally:
cog predict -i image=@path/to/your/image.jpg -i prompt="Your prompt here" -i resolution=512

Here's a complete example of how to use the model locally:
# Build the image
cog build
# Run a prediction
cog predict \
-i image=@examples/cat.jpg \
-i prompt="A cat sitting on a moon surface, with Earth visible in the background" \
-i resolution=512 \
-i num_inference_steps=8

Input parameters:

- image (Path): Input image for conditioning
- prompt (string): Text prompt describing the desired output
- resolution (int): Output resolution, either 512 or 1024 (default: 512)
- num_inference_steps (int): Number of denoising steps (default: 8, range: 1-50)
The implementation includes several optimizations:
- Dynamic loading of LoRA weights based on selected resolution
- Automatic mixed precision inference
- GPU memory cleanup after each prediction
- CPU offloading when possible
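The resolution-dependent LoRA loading can be illustrated with a small helper. The weight file names below are hypothetical placeholders, not the repository's actual file layout:

```python
# Sketch of resolution -> LoRA weight selection. The file names are
# hypothetical; the real deployment may organize its weights differently.
LORA_BY_RESOLUTION = {
    512: "omini/subject_512.safetensors",
    1024: "omini/subject_1024.safetensors",
}


def lora_path_for(resolution: int) -> str:
    """Return the LoRA weight file matching the requested output size."""
    if resolution not in LORA_BY_RESOLUTION:
        raise ValueError(
            f"resolution must be one of {sorted(LORA_BY_RESOLUTION)}, got {resolution}"
        )
    return LORA_BY_RESOLUTION[resolution]
```

Loading only the LoRA set for the requested resolution keeps peak GPU memory lower than keeping both sets resident at once.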
- Push your model:
cog push r8.im/username/model-name

- Your model will be available at
https://replicate.com/username/model-name
If you encounter issues:
- Verify your HuggingFace token is set correctly
- Ensure you have enough GPU memory (at least 12GB recommended)
- Check CUDA compatibility with installed PyTorch version
- Clear GPU memory if you encounter CUDA out of memory errors:
import torch

torch.cuda.empty_cache()  # frees cached CUDA memory held by the allocator

This deployment configuration is provided under the MIT License. However, please note that the original OminiControl model has its own license and usage restrictions; make sure to check and comply with its license terms.
Contributions are welcome! Please feel free to submit a Pull Request.
For any issues or questions, please open an issue in the repository.