Could others confirm how long the demos that generate 10-second clips take to run? On my 3090 it takes 30 minutes, but the GitHub page says it should take only 6 minutes on a 3090. I'm just running the demos with the provided assets, like this:
time python demo.py \
    --one_shot \
    --video_inference \
    --stage1_checkpoint_path 'assets/checkpoints/stage1_state_dict.ckpt' \
    --stage2_checkpoint_path 'assets/checkpoints/stage2_state_dict.ckpt' \
    --saved_path 'assets/samples/RD_Radio30_000/' \
    --hubert_feat_path 'assets/samples/WRA_LamarAlexander_000/WRA_LamarAlexander_000.npy' \
    --wav_path 'assets/samples/WRA_LamarAlexander_000/WRA_LamarAlexander_000.wav' \
    --mp4_original_path 'assets/samples/RD_Radio35_000/RD_Radio35_000.mp4' \
    --denoising_step 20 \
    --saved_name 'one_shot_pred.mp4' \
    --device 'cuda:0'
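
One thing that could explain a gap this large is the run silently falling back to the CPU. Below is a minimal sanity check I can run in the same environment. It assumes the demo is built on PyTorch (the `.ckpt` files and the `cuda:0` device string suggest so, but that is my assumption), and `cuda_report` is just a hypothetical helper name, not something from the repo:

```python
import importlib.util


def cuda_report():
    """Return a dict describing whether PyTorch can see a CUDA device.

    Assumption: the demo uses PyTorch. If torch is not installed,
    the report says so instead of raising.
    """
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False, "cuda_available": False, "device_name": None}

    import torch

    available = torch.cuda.is_available()
    # Only query the device name when a CUDA device is actually visible.
    name = torch.cuda.get_device_name(0) if available else None
    return {"torch_installed": True, "cuda_available": available, "device_name": name}


if __name__ == "__main__":
    print(cuda_report())
```

If `cuda_available` comes back `False`, the 30 minutes would most likely be CPU time and not comparable to the 6-minute figure.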