Changes from 24 commits
Commits
29 commits
5dd139e
Modernize `requirements*.txt`.
hameerabbasi Jan 4, 2026
0b1c4c1
Move to `pixi` from `pip`.
hameerabbasi Jan 4, 2026
e2b5664
Move to `pyproject.toml` from `pixi.toml`.
hameerabbasi Jan 5, 2026
46d7833
Make everything `pip`-installable.
hameerabbasi Jan 14, 2026
6634864
Require minimum `pixi`.
hameerabbasi Jan 14, 2026
b41658e
Remove unnecessary parts.
hameerabbasi Jan 14, 2026
12b7309
Make it *actually* work without a `pixi.toml`.
hameerabbasi Jan 14, 2026
f7cb8c6
Add back `--extra-index-url`.
hameerabbasi Jan 15, 2026
b9c4bd6
Revert `requirements.txt`
hameerabbasi Jan 15, 2026
0bffc86
Use `xft_*` build of `tk` to avoid font issues on Linux.
hameerabbasi Jan 15, 2026
2c24690
Don't install unnecessary packages.
hameerabbasi Jan 19, 2026
9426b27
Modify format of `requirements*.txt`.
hameerabbasi Jan 19, 2026
00f6d52
Make `OneTrainer` package editable for easier dev.
hameerabbasi Jan 19, 2026
e76e790
Repeat git requirements in `requirements-global.txt`.
hameerabbasi Mar 3, 2026
6fc0455
Update the Linux/macOS installation scripts.
hameerabbasi Mar 3, 2026
0f7a373
Initial Windows 'it works' pixi scripts and bump min pixi version for…
O-J1 Mar 9, 2026
e6b83c3
Tweak platform_env
O-J1 Mar 9, 2026
101cc26
Fix still incorrect platform_env
O-J1 Mar 9, 2026
b897bcc
Fix Tensorboard breaking, swap to shutil.which
O-J1 Mar 9, 2026
fbeec51
Update `README.md` with new install instructions.
hameerabbasi Mar 9, 2026
1c50e81
Further readme tweaks
O-J1 Mar 10, 2026
c040fb9
Add `run-cmd.ps1` and update `README.md` accordingly.
hameerabbasi Mar 10, 2026
afe6b42
Fix run-cmd.ps1 syntax
O-J1 Mar 10, 2026
fbc6991
Minimal Dockerfile.
hameerabbasi Mar 18, 2026
aa28f0b
Update torch version for ROCm.
hameerabbasi Apr 12, 2026
f582cbb
Merge remote-tracking branch 'origin/master' into modernize-install
hameerabbasi Apr 12, 2026
0403b1e
Update lockfile.
hameerabbasi Apr 12, 2026
ea54713
Merge remote-tracking branch 'hameerabbasi/modernize-install' into mo…
hameerabbasi Apr 12, 2026
543d410
Remove lazy update mentions and update dockerfiles.
hameerabbasi Apr 12, 2026
1 change: 1 addition & 0 deletions .dockerignore
Collaborator

Why is this necessary? The Dockerfiles start from a base image and then install OneTrainer in there; the local repository is never used.

Why shouldn't the image have a `.gitignore` file?
Someone might develop inside Docker.

Contributor Author

This was mainly to avoid adding `.pixi`, as that will be generated during the build itself. I can change this if needed, but the overall goal was to keep the Dockerfile as small as possible.

Contributor Author

Ahhh! This is a symlink, so both ignore files are kept in sync, and we don't have to sync both.
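The symlink arrangement described in the comment above can be sketched in a throwaway temp directory (the file contents here are made up for illustration, not the real ignore rules):

```shell
# Sketch: .dockerignore is a symlink to .gitignore, so the two
# ignore lists cannot drift apart. Uses a temp dir, not the repo.
tmp=$(mktemp -d)
cd "$tmp"
printf '.pixi\n' > .gitignore
ln -s .gitignore .dockerignore
cat .dockerignore   # reads .gitignore through the symlink
```

Any edit to `.gitignore` is then automatically visible through `.dockerignore`, with no second file to keep updated.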

1 change: 1 addition & 0 deletions .gitattributes
Original file line number Diff line number Diff line change
@@ -1,2 +1,3 @@
* text=auto eol=lf
*.bat text eol=crlf
pixi.lock merge=binary linguist-language=YAML linguist-generated=true -diff
Collaborator

Is this line why I don't get a diff for the pixi.lock file when I update?

git diff
diff --git a/pixi.lock b/pixi.lock
index fddd349..09c986d 100644
Binary files a/pixi.lock and b/pixi.lock differ

when I make changes, I usually want to see what I did before I commit or push. Currently I can't.
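For reference, the `-diff` attribute only changes how `git diff` renders the file; `git diff --text` still forces a line-by-line diff. A minimal reproduction in a throwaway repository (the lockfile contents are invented for the sketch):

```shell
# Reproduce the `-diff` attribute behavior and the `--text` workaround
# in a temp repository.
tmp=$(mktemp -d)
cd "$tmp"
git init -q repo && cd repo
git config user.email test@example.com
git config user.name test
echo 'pixi.lock -diff' > .gitattributes
printf 'version: 1\n' > pixi.lock
git add -A && git commit -qm init
printf 'version: 2\n' > pixi.lock
git diff pixi.lock          # "Binary files a/pixi.lock and b/pixi.lock differ"
git diff --text pixi.lock   # forces a normal line-by-line diff
```

So local review before committing still works; only the default rendering is suppressed.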

10 changes: 5 additions & 5 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
*~
debug*.ipynb
debug*.py
settings.json
/debug*

# user data
Expand All @@ -27,13 +28,12 @@ secrets.json
.python-version
*.egg-info

# pixi environments
.pixi
pixi.lock
pixi.toml

# misc files
/src
train.bat
debug_report.log
config_diff.txt

# pixi environments
.pixi/*
!.pixi/config.toml
31 changes: 2 additions & 29 deletions LAUNCH-SCRIPTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,48 +13,21 @@

### All of the scripts accept the following *optional* environment variables to customize their behavior:

- `OT_CONDA_CMD`: Sets a custom Conda command or an absolute path to the binary (useful when it isn't in the user's `PATH`). If nothing is provided, we detect and use `CONDA_EXE` if available, which is a variable that's set by Conda itself and always points at the user's installed Conda binary.

- `OT_CONDA_ENV`: Sets the directory name (or an absolute/relative path) of the Conda environment. If a name or relative path is used, it will be relative to the OneTrainer directory. Defaults to `conda_env`.

- `OT_PYTHON_CMD`: Sets the Host's Python executable. It's used for creating the Python Venvs. This can be used to force the usage of a specific Python version's binary (such as `python3.10`) whenever the host has multiple versions installed. However, it's *always* recommended to use Conda or Pyenv instead, rather than relying on the host's unreliable system-wide Python binaries (which might change or be removed with system updates), so we don't recommend changing this option unless you *really* know what you're doing. Defaults to `python`.

- `OT_PYTHON_VENV`: Sets the directory name (or an absolute/relative path) of the Python Venv. If a name or relative path is used, it will be relative to the OneTrainer directory. Defaults to `venv`.

- `OT_PREFER_VENV`: If set to `true`, Conda will be ignored even if it exists on the system, and Python Venv will be used instead. This ensures that people who use `pyenv` (to choose which Python version to run on the host) can easily set up their desired Python Venv environments. Defaults to `false`.

- `OT_LAZY_UPDATES`: If set to `true`, OneTrainer's self-update process will only update the Python environment's dependencies if the OneTrainer source code has been modified since the previous dependency update. This speeds up executions of `update.sh`, and is generally safe, but may miss some updates and important bugfixes for external third-party dependencies. If you use this option, you must set it permanently for *every* script (not just `update.sh`). Defaults to `false`.
Comment thread
hameerabbasi marked this conversation as resolved.

- `OT_CUDA_LOWMEM_MODE`: If set to `true`, it enables aggressive garbage collection in PyTorch to help with low-memory GPUs. Defaults to `false`.

- `OT_PLATFORM_REQUIREMENTS`: Allows you to override which platform-specific "requirements" file you want to install. Defaults to `detect`, which automatically detects whether you have an AMD or NVIDIA GPU. But people with multi-GPU systems can use this setting to force a specific GPU acceleration framework's requirements. Valid values are `requirements-rocm.txt` for AMD, `requirements-cuda.txt` for NVIDIA, and `requirements-default.txt` for non-AMD/NVIDIA systems.
- `OT_PLATFORM`: Allows you to override which platform you want to install for. Defaults to `detect`, which automatically detects whether you have an AMD or NVIDIA GPU, and falls back to CPU on failure. But people with multi-GPU systems can use this setting to force a specific GPU acceleration framework's requirements. Valid values are `rocm` for AMD, `cuda` for NVIDIA, and `default` for non-AMD/NVIDIA systems.

- `OT_SCRIPT_DEBUG`: If set to `true`, it enables additional debug logging in the scripts. Defaults to `false`.


### Examples of how to use the custom environment variables:

- You can provide custom environment variables directly on the command line, as follows: `env OT_PREFER_VENV="true" OT_CUDA_LOWMEM_MODE="true" OT_PLATFORM_REQUIREMENTS="requirements-cuda.txt" ./start-ui.sh`.
- You can provide custom environment variables directly on the command line, as follows: `env OT_PLATFORM="rocm" OT_CUDA_LOWMEM_MODE="true" ./start-ui.sh`.
- You can add them to your user's persistent environment variables, so that they are always active. The process varies depending on your operating system. On Linux, you can place them in `~/.config/environment.d/onetrainer.conf` (on all Systemd-based distros), which is a plaintext file with *one variable per line,* such as `OT_CUDA_LOWMEM_MODE="true"`. Beware that changes to `environment.d` require a *complete system restart* to take effect (there is no command for reloading them live). To verify that your environment has been set persistently, you can then open a terminal window and run `printenv <variable name>` (such as `printenv OT_CUDA_LOWMEM_MODE`) to see if your custom values have taken effect.
- If you're launching OneTrainer from your own, custom scripts, then you can instead `export` the new values (which tells the shell to pass those environment variables onto child processes). For example, by having a line such as `export OT_CUDA_LOWMEM_MODE="true"` before your script calls `./OneTrainer/start-ui.sh`.
- If you're running OneTrainer inside a Docker/Podman container, you can instead use the [ENV](https://docs.docker.com/reference/dockerfile/#env) instruction in your `Dockerfile` / `Containerfile` to set the variables, such as `ENV OT_CUDA_LOWMEM_MODE="true"`.
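The first three approaches above can be sketched as a plain shell session, using one of the documented variables with an example value:

```shell
# Inline on the command line (applies to that one invocation only):
OT_CUDA_LOWMEM_MODE="true" sh -c 'echo "inline: $OT_CUDA_LOWMEM_MODE"'

# Exported from a wrapper script (applies to all child processes):
export OT_CUDA_LOWMEM_MODE="true"
sh -c 'echo "exported: $OT_CUDA_LOWMEM_MODE"'

# Verify what the environment actually contains:
printenv OT_CUDA_LOWMEM_MODE   # prints: true
```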


### Installing the required Python version for OneTrainer:

- If you've received a warning that your system's Python version is incorrect, then your system most likely doesn't have Conda installed, and has instead tried to create a Python Venv with your host's default Python version. If that version is incompatible with OneTrainer, then you'll have to resolve the problem by manually installing a compatible version. Alternatively, your existing Conda environment may simply be outdated.
- Begin by deleting the `venv` sub-directory inside the OneTrainer directory, to erase the invalid Python Venv (which was created with the wrong Python version). If you were using Conda, then you must instead delete the outdated `conda_env` sub-directory.
- Now you'll have to choose which solution you prefer.
- The most beginner-friendly solution is to install [Miniconda](https://docs.anaconda.com/miniconda/) on your system. OneTrainer will then automatically install and manage the correct Python version for you via Conda. You can stop reading here if you're choosing this solution. Everything will work automatically after that.
- Alternatively, if you prefer a more lightweight and advanced solution, then you can use [pyenv](https://github.com/pyenv/pyenv), which allows you to set the exact Python version to use for OneTrainer's directory. If you're on Linux, then read their "[automatic installer](https://github.com/pyenv/pyenv?tab=readme-ov-file#automatic-installer)" section and follow the instructions. If you're on a Mac instead, then read their "[Homebrew](https://github.com/pyenv/pyenv?tab=readme-ov-file#homebrew-in-macos)" section (which is an open-source package manager for Macs).
- After installing pyenv, you will also need to install the [Python build dependencies](https://github.com/pyenv/pyenv/wiki#suggested-build-environment) on your system, since pyenv installs each Python version by compiling them directly from the official source code.
- Restart your shell, and then try the `pyenv doctor` command, which ensures that pyenv is loaded and verifies that your system contains all required dependencies for installing Python.
- Run `pyenv install <python version>` to install whichever Python version is currently required by OneTrainer. You can look at the `OT_CONDA_USE_PYTHON_VERSION` variable at the top of the `lib.include.sh` file in OneTrainer's project directory, to see which Python version is recommended by OneTrainer at the moment.
- Lastly, you must navigate to the OneTrainer directory, and then run `pyenv local <python version>` to force OneTrainer to use that version of Python. Your choice will be stored persistently in the hidden `.python-version` file, and can be changed again in the future by running the command again.
- You can now run `python --version` to verify that the `python` command in OneTrainer is being mapped to the correct Python version by pyenv.
- Everything is now ready for running OneTrainer!


### Running custom script commands:

- Always use `run-cmd.sh` when you want to execute any of OneTrainer's CLI tasks. It automatically validates the chosen target script's name, configures the runtime environment correctly, and then runs the target script with your given command-line arguments.
Expand Down
48 changes: 21 additions & 27 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -34,9 +34,9 @@ OneTrainer is a one-stop solution for all your Diffusion training needs.
## Installation

> [!IMPORTANT]
> Installing OneTrainer requires Python >=3.10 and <3.14.
> Installing OneTrainer manually requires Python >=3.10 and <3.14.
> You can download Python at https://www.python.org/downloads/windows/.
> Then follow the below steps.
> Then follow the below manual steps.

#### Automatic installation

Expand All @@ -47,23 +47,12 @@ OneTrainer is a one-stop solution for all your Diffusion training needs.

#### Manual installation

1. Clone the repository `git clone https://github.com/Nerogar/OneTrainer.git`
2. Navigate into the cloned directory `cd OneTrainer`
3. Set up a virtual environment `python -m venv venv`
4. Activate the new venv:
- Windows: `venv\scripts\activate`
- Linux and Mac: Depends on your shell, activate the venv accordingly
5. Install the requirements `pip install -r requirements.txt`

> [!Tip]
> Some Linux distributions are missing required packages; for instance, on Ubuntu you must install `libGL`:
>
> ```bash
> sudo apt-get update
> sudo apt-get install libgl1
> ```
>
> Additionally, it's been reported that Alpine, Arch, and Xubuntu Linux may be missing `tkinter`. Install it via `apk add py3-tk` on Alpine and `sudo pacman -S tk` on Arch.
1. Install `pixi`: [Guide](https://pixi.prefix.dev/latest/installation/)
2. Clone the repository `git clone https://github.com/Nerogar/OneTrainer.git`
3. Navigate into the cloned directory `cd OneTrainer`
4. Perform the installation: `pixi install --locked -e cuda` (replace `cuda` with `rocm` or `cpu` if needed).
Collaborator

Manual installation with pip should remain possible.
Currently it fails:

INFO: pip is looking at multiple versions of onetrainer to determine which version is compatible with other requirements. This could take a while.
ERROR: Cannot install -r requirements-global.txt (line 1) and diffusers 0.37.0.dev0 (from git+https://github.com/huggingface/diffusers.git@99daaa8#egg=diffusers) because these package versions have conflicting dependencies.

The conflict is caused by:
    The user requested diffusers 0.37.0.dev0 (from git+https://github.com/huggingface/diffusers.git@99daaa8#egg=diffusers)
    onetrainer 0.1.0 depends on diffusers 0.37.0.dev0 (from git+https://github.com/huggingface/diffusers.git@99daaa802da01ef4cff5141f4f3c0329a57fb591)

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts

Contributor Author

I'm assuming you merged branches and forgot to sync the git deps in requirements-global.txt and pyproject.toml. This is duplication, yes, but it's needed to keep pip and Pixi installs working.

Collaborator

I think I just followed these steps: https://github.com/nerogar/OneTrainer#manual-installation
This should still work. If it works for you, I can retry.


**Note:** We don't support ROCm on Windows currently.
Comment thread
dxqb marked this conversation as resolved.

## Updating

Expand All @@ -75,8 +64,7 @@ OneTrainer is a one-stop solution for all your Diffusion training needs.

1. Change into the folder containing the repo: `cd OneTrainer`
2. Pull changes `git pull`
3. Activate the venv `venv/scripts/activate`
4. Re-install all requirements `pip install -r requirements.txt --force-reinstall`
3. Recreate the environment `pixi install --locked -e cuda`

## Usage

Expand All @@ -96,7 +84,7 @@ For a technically focused quick start, see the [Quick Start Guide](docs/QuickSta

### CLI Mode

If you need more control or a headless approach, OT also supports the command-line interface. All commands **need** to be run inside the active venv created during installation.
If you need more control or a headless approach, OT also supports the command-line interface. All commands **need** to be run inside the active pixi environment created during installation.

All functionality is split into different scripts located in the `scripts` directory. This currently includes:

Expand All @@ -111,9 +99,17 @@ All functionality is split into different scripts located in the `scripts` direc
- `generate_masks.py` A utility to automatically create masks for your dataset
- `calculate_loss.py` A utility to calculate the training loss of every image in your dataset

To learn more about the different parameters, execute `<script-name> -h`. For example `python scripts\train.py -h`
To learn more about the different parameters, execute `./run-cmd.sh <script-name> -h`. For example: `./run-cmd.sh scripts/train.py -h`. On Windows, use `./run-cmd.ps1 <script-name> -h`. An example of running a training script on Windows:

```sh
./run-cmd.ps1 train --config-path ./config.json
```

You can also activate a shell with your selected GPU environment (usually `cuda`); see https://pixi.prefix.dev/latest/advanced/pixi_shell/ for details.

If you are on Mac or Linux, you can also read [the launch script documentation](LAUNCH-SCRIPTS.md) for detailed information about how to run OneTrainer and its various scripts on your system. Windows users should refer to `lib.include.ps1`, which mostly mirrors the Linux launch scripts.


If you are on Mac or Linux, you can also read [the launch script documentation](LAUNCH-SCRIPTS.md) for detailed information about how to run OneTrainer and its various scripts on your system.

## Troubleshooting

Expand Down Expand Up @@ -143,9 +139,7 @@ You also **NEED** to **install the required developer dependencies** for your cu
> Be sure to run those commands _without activating your venv or Conda environment_, since [pre-commit](https://pre-commit.com/) is supposed to be installed outside any environment.

```sh
cd OneTrainer
pip install -r requirements-dev.txt
pre-commit install
pixi global install pre-commit
```

Now all of your commits will automatically be verified for common errors and code style issues, so that code reviewers can focus on the architecture of your changes without wasting time on style/formatting issues, thus greatly improving the chances that your pull request will be accepted quickly and effortlessly.
Expand Down
47 changes: 47 additions & 0 deletions docker.init.sh
Comment thread
hameerabbasi marked this conversation as resolved.
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
#!/usr/bin/env bash

# Setup ssh
setup_ssh() {
    if [[ $PUBLIC_KEY ]]; then
        echo "Setting up SSH..."
        mkdir -p ~/.ssh
        echo "$PUBLIC_KEY" >> ~/.ssh/authorized_keys
        chmod -R 700 ~/.ssh

        if [ ! -f /etc/ssh/ssh_host_rsa_key ]; then
            ssh-keygen -t rsa -f /etc/ssh/ssh_host_rsa_key -q -N ''
            echo "RSA key fingerprint:"
            ssh-keygen -lf /etc/ssh/ssh_host_rsa_key.pub
        fi

        if [ ! -f /etc/ssh/ssh_host_dsa_key ]; then
            ssh-keygen -t dsa -f /etc/ssh/ssh_host_dsa_key -q -N ''
            echo "DSA key fingerprint:"
            ssh-keygen -lf /etc/ssh/ssh_host_dsa_key.pub
        fi

        if [ ! -f /etc/ssh/ssh_host_ecdsa_key ]; then
            ssh-keygen -t ecdsa -f /etc/ssh/ssh_host_ecdsa_key -q -N ''
            echo "ECDSA key fingerprint:"
            ssh-keygen -lf /etc/ssh/ssh_host_ecdsa_key.pub
        fi

        if [ ! -f /etc/ssh/ssh_host_ed25519_key ]; then
            ssh-keygen -t ed25519 -f /etc/ssh/ssh_host_ed25519_key -q -N ''
            echo "ED25519 key fingerprint:"
            ssh-keygen -lf /etc/ssh/ssh_host_ed25519_key.pub
        fi

        service ssh start

        echo "SSH host keys:"
        for key in /etc/ssh/*.pub; do
            echo "Key: $key"
            ssh-keygen -lf "$key"
        done
    fi
}

setup_ssh
service ssh start
sleep infinity
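The generate-if-missing pattern used for each host key above can be exercised on its own, pointed at a temp directory instead of `/etc/ssh` (assumes `ssh-keygen` is on `PATH`):

```shell
# Standalone sketch of the generate-if-missing host-key pattern,
# writing to a temp directory rather than /etc/ssh.
tmp=$(mktemp -d)
if [ ! -f "$tmp/ssh_host_ed25519_key" ]; then
    ssh-keygen -t ed25519 -f "$tmp/ssh_host_ed25519_key" -q -N ''
    echo "ED25519 key fingerprint:"
    ssh-keygen -lf "$tmp/ssh_host_ed25519_key.pub"
fi
```

Re-running the snippet is a no-op once the key exists, which is what makes the script safe to run on every container start.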
37 changes: 4 additions & 33 deletions export_debug.bat
Original file line number Diff line number Diff line change
@@ -1,36 +1,7 @@
@echo off

REM Avoid footgun by explicitly navigating to the directory containing the batch file
chcp 65001 >nul
cd /d "%~dp0"

REM Verify that OneTrainer is our current working directory
if not exist "scripts\train_ui.py" (
echo Error: train_ui.py does not exist, you have done something very wrong. Reclone the repository.
goto :end
)

if not defined PYTHON (set PYTHON=python)
if not defined VENV_DIR (set "VENV_DIR=%~dp0venv")

:check_venv
dir "%VENV_DIR%" >NUL 2>NUL
if not errorlevel 1 goto :activate_venv
echo venv not found, please run install.bat first
goto :end

:activate_venv
echo activating venv %VENV_DIR%
set PYTHON="%VENV_DIR%\Scripts\python.exe" -X utf8
echo Using Python %PYTHON%

:launch
echo Generating debug report...
%PYTHON% scripts\generate_debug_report.py
if errorlevel 1 (
echo Error: Debug report generation failed with code %ERRORLEVEL%
) else (
echo Now upload the debug report to your Github issue or post in Discord.
)

:end
powershell -ExecutionPolicy Bypass -File "%~dp0scripts\powershell\export_debug.ps1" %*
set "_EXIT=%ERRORLEVEL%"
pause
exit /b %_EXIT%