Commit 7f17e3c

docs: SDXL LoRA training example (#75)
1 parent 32a16da commit 7f17e3c

1 file changed: README.md (65 additions, 45 deletions)
Dockerfile for [kohya-ss/sd-scripts](https://github.com/kohya-ss/sd-scripts).
- Replace `accelerate launch` with `sudo docker run --rm --gpus all aoirint/sd_scripts`.
- Training command will run in the container by a general user (UID=1000).
### WD14 Captioning (ONNX)

```shell
mkdir -p "./cache/wd14_tagger_model_cache"
sudo chown -R 1000:1000 "./cache/wd14_tagger_model_cache"

# If your cache is broken, remove it and let the tagger re-download the model:
# rm -rf ./cache/wd14_tagger_model_cache/wd14_tagger_model

sudo docker run --rm --gpus all \
  -v "./work:/work" \
  -v "./cache/wd14_tagger_model_cache:/wd14_tagger_model_cache" \
  aoirint/sd_scripts \
  finetune/tag_images_by_wd14_tagger.py \
  --model_dir "/wd14_tagger_model_cache/wd14_tagger_model" \
  --onnx \
  /work/my_dataset-20230715.1/img
```

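As a quick sanity check (not part of sd-scripts itself), you can confirm that the tagger wrote a sibling `.txt` caption for every image. The paths below follow the dataset layout used in this README:

```shell
# List any image in the dataset that is missing its caption file.
for img in ./work/my_dataset-20230715.1/img/*.png; do
  [ -e "$img" ] || continue  # glob matched nothing; no images yet
  txt="${img%.png}.txt"
  [ -f "$txt" ] || echo "missing caption: $txt"
done
```

No output means every image has a caption.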
### Training LoRA-LierLa (U-Net only) with `DreamBooth、キャプション方式` (DreamBooth, caption method) for Animagine XL 4.0 Zero (Stable Diffusion XL)
Create permanent directories to mount into the container.

```shell
mkdir -p "./base_model" "./work" "./cache/huggingface/hub"
sudo chown -R 1000:1000 "./base_model" "./work" "./cache/huggingface/hub"
```

Download `animagineXL40_v4Zero.safetensors` from [Animagine XL 4.0 Zero](https://civitai.com/models/1188071?modelVersionId=1409042).

```shell
wget -O "animagineXL40_v4Zero.safetensors" "https://civitai.com/api/download/models/1409042?type=Model&format=SafeTensor&size=full&fp=fp16"
echo "f15812e65c2ea7f4e19ce37fb2a8445eb65c64da450a508dd9c8f237c73f6bb8  animagineXL40_v4Zero.safetensors" | sha256sum -c -
```
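As a small illustration (not part of the workflow above), this is the checksum-list format `sha256sum -c` expects: `<hash>`, two spaces, `<filename>`, one entry per line. `sample.txt` is a throwaway file used only for the demo:

```shell
# Create a known file, compute its hash, and verify it in check mode.
printf 'hello\n' > sample.txt
hash=$(sha256sum sample.txt | awk '{print $1}')
echo "${hash}  sample.txt" | sha256sum -c -
# prints: sample.txt: OK
```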

Prepare a dataset directory `work/my_dataset-20230715.1` and a config file `work/my_dataset-20230715.1/config.toml` following [train_README](https://github.com/kohya-ss/sd-scripts/blob/206adb643848ff27894f1e72b6987fa66db99378/docs/train_README-ja.md#dreambooth%E3%82%AD%E3%83%A3%E3%83%97%E3%82%B7%E3%83%A7%E3%83%B3%E6%96%B9%E5%BC%8F%E6%AD%A3%E5%89%87%E5%8C%96%E7%94%BB%E5%83%8F%E4%BD%BF%E7%94%A8%E5%8F%AF).

Set file ownership to `UID:GID = 1000:1000` (`sudo chown -R 1000:1000 "./work"`).

You can also use a different directory structure; adjust `config.toml` and the training command accordingly.

- work/my_dataset-20230715.1/
  - config.toml
  - img/
    - 0001.png
    - 0001.txt
    - 0002.png
    - 0002.txt
    - ...
  - reg_img/
    - transparent_1.png
    - transparent_2.png
    - ...
  - sample_prompts.txt
  - output/
  - logs/
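One way to create this skeleton (a sketch; the dataset name is taken from this README, and the caption/config files start empty):

```shell
# Create the dataset directories and empty placeholder files.
mkdir -p ./work/my_dataset-20230715.1/img \
         ./work/my_dataset-20230715.1/reg_img \
         ./work/my_dataset-20230715.1/output \
         ./work/my_dataset-20230715.1/logs
touch ./work/my_dataset-20230715.1/config.toml \
      ./work/my_dataset-20230715.1/sample_prompts.txt
# Then fix ownership for the container user, as above:
# sudo chown -R 1000:1000 ./work
```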

This is an example `sample_prompts.txt`. Use the WD14 tagger output as a reference, and add one line per prompt to generate multiple samples.

```plain
shs 1girl, 1girl, solo, simple background, white background, masterpiece, high score, great score, absurdres --w 1024 --h 1024 --d 42 --s 28 --l 5 --ss euler_a --n lowres, bad anatomy, bad hands, text, error, missing finger, extra digits, fewer digits, cropped, worst quality, low quality, low score, bad score, average score, signature, watermark, username, blurry
```

This is an example `config.toml`.

```toml
[general]
enable_bucket = true

[[datasets]]
resolution = 1024
batch_size = 1

[[datasets.subsets]]
image_dir = '/work/my_dataset-20230715.1/img'
caption_extension = '.txt'
num_repeats = 20

[[datasets.subsets]]
is_reg = true
image_dir = '/work/my_dataset-20230715.1/reg_img'
class_tokens = '1girl'
num_repeats = 1
```
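As a rough guide to what this config implies: each training image is seen `num_repeats` times per epoch, so steps per epoch is about `(train_images * 20 + reg_images * 1) / batch_size` (before aspect-ratio bucketing). The image counts below are assumptions for illustration only:

```shell
# Back-of-the-envelope steps per epoch for assumed image counts.
train_images=20   # assumed number of files in img/
reg_images=20     # assumed number of files in reg_img/
batch_size=1      # from the config above
steps=$(( (train_images * 20 + reg_images * 1) / batch_size ))
echo "$steps steps per epoch"
# prints: 420 steps per epoch
```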

Execute training.

```shell
sudo docker run \
  --rm \
  --gpus all \
  -v "./base_model:/base_model" \
  -v "./work:/work" \
  -v "./cache/huggingface/hub:/home/user/.cache/huggingface/hub" \
  aoirint/sd_scripts \
  --num_cpu_threads_per_process=1 \
  sdxl_train_network.py \
  --seed=42 \
  --pretrained_model_name_or_path="/base_model/animagineXL40_v4Zero.safetensors" \
  --dataset_config="/work/my_dataset-20230715.1/config.toml" \
  --output_dir="/work/my_dataset-20230715.1/output" \
  --output_name="my_dataset-20230715.1" \
  --save_model_as="safetensors" \
  --logging_dir="/work/my_dataset-20230715.1/logs" \
  --prior_loss_weight=1.0 \
  --max_train_epochs=5 \
  --learning_rate=1e-4 \
  --optimizer_type="AdaFactor" \
  --xformers \
  --mixed_precision="fp16" \
  --cache_latents \
  --cache_text_encoder_outputs \
  --gradient_checkpointing \
  --save_every_n_epochs=1 \
  --sample_at_first \
  --sample_every_n_epochs=1 \
  --sample_prompts="/work/my_dataset-20230715.1/sample_prompts.txt" \
  --sample_sampler="euler_a" \
  --network_module="networks.lora" \
  --network_train_unet_only
```
