HunyuanWorld-Voyager-WinBlackwell

A fork with full support for native Windows and RTX 50 Series (Blackwell) GPUs

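About this fork

HunyuanWorld-Voyager, developed by Tencent, is a model that generates 3D-consistent RGB-D video from a single image.

However, the official release supports only Linux with a multi-GPU setup, so users on Windows or on the latest RTX 50 series (Blackwell) GPUs could not run it.

This fork modifies the code so that it runs natively on Windows and on Blackwell GPUs (RTX 5090, 5080, 5070, and others).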

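Comparison with the official release

Item	Official release	This fork
OS	Linux only	Windows 10/11
GPU	Up to sm_90 (through the RTX 40 series)	sm_120 supported (RTX 50 series)
flash-attn	Required	Not required (automatic fallback)
deepspeed	Recommended	Not required (can be skipped)
Inference	torchrun (assumes multiple GPUs)	Optimized for a single GPU
Environment	conda	Python venv
Launch	Command line	Double-click (.bat)
Port management	Manual	Automatic detection and launch
Model path	Fixed Linux path (/root/ckpts)	Windows-compatible
File names	Linux style	Windows-compatible (sanitized)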

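Key features

1. Native Windows support

Runs directly on Windows, without WSL2 or Docker
Clean environment isolation with a Python venv
Windows-specific path and file name issues are already resolved

2. Full support for Blackwell GPUs (RTX 50 series)

sm_120 support through PyTorch Nightly (cu130) + triton-windows
Verified on RTX 5090 / RTX 5080 / RTX 5070
Also verified on RTX PRO 6000 Blackwell

3. No flash-attn required

Removes the dependency on flash-attn, which is difficult to build on Windows
Falls back automatically to PyTorch's native Scaled Dot-Product Attention
Output quality is equivalent; speed is slightly lower (within an acceptable range)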

4. One-click launch

Just double-click run-hwv.bat to start
If the port is already in use, a free port is detected automatically
The browser opens automatically
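
The automatic port selection described above can be sketched roughly as follows. This is a minimal illustration, not the code shipped in run-hwv.bat or app.py: the function name `find_free_port`, the starting port 7860 (the URL this README mentions), and the 20-port search window are all assumptions.

```python
import socket


def find_free_port(start: int = 7860, attempts: int = 20, host: str = "127.0.0.1") -> int:
    """Return the first port in [start, start + attempts) that can be bound on host."""
    for port in range(start, start + attempts):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is in use or not permitted; try the next one
            return port  # the probe socket is closed when the with-block exits
    raise RuntimeError(f"no free port found in {start}..{start + attempts - 1}")
```

The returned port is only a best guess: another process could claim it between the probe and the actual server start, so a real launcher may still want to handle a bind failure.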

5. Single-GPU optimization

Removes torchrun, which assumes multiple GPUs
Runs efficiently on a single GPU (up to 96 GB of VRAM)
Supports CPU offload to reduce VRAM usage

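Requirements

Item	Requirement
OS	Windows 10/11 (64-bit)
GPU	NVIDIA GPU (CUDA-capable, 60 GB of VRAM or more)
Recommended GPU	RTX 5090 (32 GB, with CPU offload), RTX PRO 6000 Blackwell (96 GB)
Architectures	Blackwell (sm_120), Hopper (sm_90), Ada (sm_89), Ampere (sm_86/sm_80)
Driver	NVIDIA Driver 580 or later
Python	Python 3.11 to 3.12
Storage	100 GB or more (the models take about 70 GB)

About VRAM

Minimum: 60 GB (540p generation)
Recommended: 80 GB or more
Comfortable: 96 GB (RTX PRO 6000 Blackwell or similar)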

Quick start

Step 1: Clone the repository

git clone https://github.com/YOUR_USERNAME/HunyuanWorld-Voyager-WinBlackwell.git
cd HunyuanWorld-Voyager-WinBlackwell

Step 2: Create a virtual environment

python -m venv venv
.\venv\Scripts\activate

Step 3: Install PyTorch Nightly

pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu130

Step 4: Install triton-windows

pip install triton-windows

Step 5: Install dependencies

# Filtered list (excludes torch, deepspeed, and the numpy pin)
pip install -r req_filtered.txt
pip install "numpy<2"
pip install transformers==4.39.3
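
A filtered requirements file along the lines of req_filtered.txt could be produced with a small script like the one below. This is only a sketch: the README does not say how the file was actually generated, and the helper name `filter_requirements` and the exact excluded package set are assumptions taken from the comment in Step 5.

```python
import re

# Packages to drop, per the comment in Step 5 (exact names are assumed).
EXCLUDED = {"torch", "deepspeed", "numpy"}


def filter_requirements(lines):
    """Keep requirement lines whose package name is not in EXCLUDED."""
    kept = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        # The package name is everything before the first specifier, extras, or marker character.
        name = re.split(r"[<>=!~\[;\s]", stripped, maxsplit=1)[0].lower()
        if name not in EXCLUDED:
            kept.append(stripped)
    return kept
```

Matching on the exact package name matters here: a prefix match on "torch" would also drop torchvision, which still has to be installed.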

Step 6: Additional dependencies

# MoGe (depth estimation)
pip install --no-deps git+https://github.com/microsoft/MoGe.git
pip install scipy matplotlib trimesh
pip install git+https://github.com/EasternJournalist/utils3d.git
pip install git+https://github.com/EasternJournalist/pipeline.git@866f059d2a05cde05e4a52211ec5051fd5f276d6

# Gradio & xfuser
pip install gradio
pip install xfuser==0.4.2

Step 7: Download the models

huggingface-cli download tencent/HunyuanWorld-Voyager --local-dir ./ckpts

Note: The download is about 70 GB and takes a while to complete.

Step 8: Launch

.\run-hwv.bat

Or run it directly inside the virtual environment:

python app.py

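Usage

Your browser opens http://localhost:7860 automatically (a different port is used if 7860 is already taken)
Upload an image: drag and drop any image
Select a camera direction: forward / backward / left / right / turn_left / turn_right
Click Generate Condition: the condition video is generated (a few seconds)
Enter a prompt: describe the scene in English
Click Generate Video: the final video is generated (about 20 minutes)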

Changes made in this fork

1. utils3d compatibility layer

Maps utils3d.pt (the old API), which MoGe uses, to utils3d.torch (the new API).

venv\Lib\site-packages\utils3d\pt.py:

from utils3d.torch.utils import *
from utils3d.torch.transforms import *
from utils3d.torch.segment_ops import *
from utils3d.torch.mesh import *
from utils3d.torch.maps import *
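
To confirm that the shim is visible to Python, a quick check like the following may help. It is a generic sketch; the helper name `module_available` is hypothetical and not part of this repository or of utils3d.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the dotted module name can be located without fully importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package is missing or has no usable spec.
        return False
```

Inside the activated venv, `module_available("utils3d.pt")` should return True once pt.py is in place.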

2. flash-attn fallback

voyager\modules\attenion.py:

def attention(q, k, v, mode="flash", ...):
    # If flash_attn is unavailable, switch to torch mode automatically
    if mode == "flash" and flash_attn_varlen_func is None:
        mode = "torch"
    # ...
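
For the check above to work, `flash_attn_varlen_func` has to be None when the package is missing. One common way to arrange that is a guarded import, sketched below. The helper `resolve_attention_mode` is hypothetical and only isolates the mode decision; it is not a function in this repository.

```python
try:
    # flash-attn is optional; on Windows it is usually not installed.
    from flash_attn import flash_attn_varlen_func
except ImportError:
    flash_attn_varlen_func = None


def resolve_attention_mode(mode: str = "flash") -> str:
    """Return the mode to use, downgrading "flash" to "torch" when flash-attn is absent."""
    if mode == "flash" and flash_attn_varlen_func is None:
        return "torch"
    return mode
```

With this pattern, callers can keep requesting "flash" everywhere and the downgrade happens in one place.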

3. Windows file name sanitization

sample_image2video.py:

# Windows-safe time format
time_flag = datetime.fromtimestamp(time.time()).strftime("%Y-%m-%d_%H-%M-%S")

# Remove invalid characters
safe_prompt = prompt.replace(':', '').replace('/', '').replace('\\', '')...
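
A more general version of the same idea uses a single regular expression instead of chained replace calls. This is a sketch under assumptions, not the code in sample_image2video.py: the function name `sanitize_for_filename`, the 100-character limit, and the "untitled" fallback are illustrative choices.

```python
import re

# Characters Windows forbids in file names, plus ASCII control characters (including newlines).
_INVALID = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_for_filename(text: str, max_len: int = 100) -> str:
    """Strip characters that are invalid in Windows file names and cap the length."""
    cleaned = _INVALID.sub("", text)[:max_len]
    cleaned = cleaned.strip(" .")  # Windows rejects trailing spaces and dots
    return cleaned or "untitled"
```

For example, `sanitize_for_filename("a: b/c")` returns `"a bc"`.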

4. MODEL_BASE environment variable

run-hwv.bat:

set MODEL_BASE=%~dp0ckpts
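
In the batch file, `%~dp0` expands to the drive and directory of run-hwv.bat, so MODEL_BASE points at the ckpts folder next to it. On the Python side, the variable could be consumed roughly like this. The helper name `resolve_model_base` and the `./ckpts` default are assumptions for illustration, not the repository's actual code.

```python
import os
from pathlib import Path


def resolve_model_base(default: str = "./ckpts") -> Path:
    """Return the checkpoint directory from MODEL_BASE, or a default if it is unset."""
    return Path(os.environ.get("MODEL_BASE", default))
```

Reading the path from the environment avoids a hard-coded Linux location such as /root/ckpts.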

5. Single-GPU support in app.py

Replaced torchrun with subprocess + python and added automatic port detection.
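
The torchrun replacement can be pictured as launching the inference script with the current interpreter. The sketch below shows the general pattern only; the helper names `build_command` and `run_inference` are hypothetical, and the script's real command-line arguments are not reproduced here.

```python
import subprocess
import sys


def build_command(script: str, extra_args: list[str]) -> list[str]:
    """Build a single-process command that uses the current interpreter instead of torchrun."""
    return [sys.executable, script, *extra_args]


def run_inference(script: str, extra_args: list[str]) -> int:
    """Run the script as a child process and return its exit code."""
    completed = subprocess.run(build_command(script, extra_args), check=False)
    return completed.returncode
```

Using `sys.executable` keeps the child process inside the same venv as the Gradio app, which matters when several Python installations are present.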

Troubleshooting

ModuleNotFoundError: No module named 'utils3d.pt'

Check that the compatibility layer utils3d/pt.py has been created correctly.

TypeError: 'NoneType' object is not callable (flash_attn)

Check that the fallback in voyager/modules/attenion.py has been applied.

ValueError: model_path not exists: \root\ckpts...

Launch from run-hwv.bat, or set the MODEL_BASE environment variable.

OSError: [Errno 32] Broken pipe

The output file name contains a colon or a newline. Check the sanitization in sample_image2video.py.

PyTorch + CUDA errors

# Check the PyTorch version
python -c "import torch; print(torch.__version__)"
# The output should contain "dev"

# Check CUDA
python -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)"
# Should print: True 13.0
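
The same checks can be gathered into one script that also reports the GPU architecture. This is a sketch; the helper names `is_nightly`, `arch_name`, and `report` are hypothetical, and the ".dev" test relies on the version-string convention used by PyTorch nightly builds.

```python
def is_nightly(version: str) -> bool:
    """PyTorch nightly builds carry a ".dev" segment in their version string."""
    return ".dev" in version


def arch_name(capability: tuple[int, int]) -> str:
    """Format a (major, minor) compute capability as an sm_XY string."""
    major, minor = capability
    return f"sm_{major}{minor}"


def report() -> None:
    """Print version, CUDA availability, and GPU architecture (requires PyTorch)."""
    import torch  # imported here so the helpers above work without PyTorch

    print("torch:", torch.__version__, "nightly:", is_nightly(torch.__version__))
    print("cuda available:", torch.cuda.is_available(), "cuda:", torch.version.cuda)
    if torch.cuda.is_available():
        print("arch:", arch_name(torch.cuda.get_device_capability(0)))
```

Calling `report()` inside the venv on an RTX 50 series card should show an sm_120 architecture if the setup matches this README.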

Performance

Environment	Inference time	Peak VRAM
RTX PRO 6000 Blackwell (96GB)	about 20 min	about 50 GB
RTX 5090 (32GB) + CPU Offload	about 25-30 min	about 28 GB

Note: measured with 50 steps on a single GPU, without flash-attn.

Acknowledgments

Tencent Hunyuan - the original HunyuanWorld-Voyager
MoGe - depth estimation model
triton-windows - Triton for Windows
utils3d - 3D utilities

Citation

Please cite the original Voyager:

@article{huang2025voyager,
  title={Voyager: Long-Range and World-Consistent Video Diffusion for Explorable 3D Scene Generation},
  author={Huang, Tianyu and Zheng, Wangguandong and Wang, Tengfei and Liu, Yuhao and Wang, Zhenwei and Wu, Junta and Jiang, Jie and Li, Hui and Lau, Rynson WH and Zuo, Wangmeng and Guo, Chunchao},
  journal={arXiv preprint arXiv:2506.04225},
  year={2025}
}

License

This fork follows the license of the original repository.
