yowov3-multistreaming

YOWOv3 for spatio-temporal action detection on the UCF101-24 dataset. This repository is an extension of https://github.com/Hope1337/YOWOv3 (paper: https://arxiv.org/pdf/2408.02623).

Environment Setup

Clone this repository:

git clone https://github.com/irfan112/yowov3-multistreaming-inferencing.git

Use Python 3.8 or Python 3.9, and then install the dependencies:

pip install -r requirements.txt
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 \
  --extra-index-url https://download.pytorch.org/whl/cu117
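
After installing, a quick sanity check can confirm that the interpreter matches the recommended Python versions and that the PyTorch packages are discoverable. This is a minimal sketch and not part of the repository; the helper names `python_is_supported` and `missing_packages` are hypothetical.

```python
import sys
from importlib import util

# Python (major, minor) versions recommended above.
SUPPORTED = {(3, 8), (3, 9)}


def python_is_supported(version=None):
    """Return True if (major, minor) is one of the recommended versions."""
    version = sys.version_info if version is None else version
    return (version[0], version[1]) in SUPPORTED


def missing_packages(names=("torch", "torchvision", "torchaudio")):
    """Return the top-level packages from `names` that cannot be located."""
    return [name for name in names if util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python version supported:", python_is_supported())
    print("Missing packages:", missing_packages() or "none")
```

If `missing_packages()` reports any of the three names, re-running the `pip install` commands above inside the active environment is the first thing to try.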

Datasets

UCF101-24

Download from: Google Drive Link

Pretrained Weights & Checkpoints

To train or evaluate YOWO (I3D / ResNet), you need to download the pretrained weights and checkpoints provided here:

Google Drive - YOWO (I3D / ResNet) Checkpoints

After downloading, place the files into the corresponding weights/ or checkpoints/ folder in this repository (create them if they don’t exist).

yowov3-multistreaming-inferencing/
│── weights/
│   ├── yowo_i3d.pth
│   ├── yowo_resnet.pth
│── checkpoints/
│   ├── checkpoint_epoch_XX.pth
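
The layout above can be prepared and checked with a few lines of Python. This is a sketch under the assumption that the downloaded files keep the names shown in the tree; `prepare_model_dirs` is a hypothetical helper, not something the repository provides.

```python
from pathlib import Path

# Weight file names taken from the layout above; adjust if your downloads differ.
EXPECTED_WEIGHTS = ("yowo_i3d.pth", "yowo_resnet.pth")


def prepare_model_dirs(repo_root="."):
    """Create weights/ and checkpoints/ if absent; return missing weight files."""
    root = Path(repo_root)
    weights_dir = root / "weights"
    checkpoints_dir = root / "checkpoints"
    weights_dir.mkdir(parents=True, exist_ok=True)
    checkpoints_dir.mkdir(parents=True, exist_ok=True)
    return [name for name in EXPECTED_WEIGHTS if not (weights_dir / name).is_file()]


if __name__ == "__main__":
    missing = prepare_model_dirs()
    print("Missing weight files:", missing or "none")
```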

Video Source Configuration

You can specify your video inputs in a .env file. These sources can be either local video files or RTSP streams from live cameras.

Example (.env)

# Primary video sources (can be RTSP streams or video files)
VIDEO_SOURCE_1=ucf24/videos/Basketball/v_Basketball_g22_c01.mp4

# Alternative: an RTSP live camera stream (keep only one VIDEO_SOURCE_1 line in the file)
VIDEO_SOURCE_1=rtsp://admin:password@192.168.1.100:554/cam/realmonitor?channel=1&subtype=1
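
To illustrate how entries like these might be consumed, the sketch below parses `.env`-style text, collects the `VIDEO_SOURCE_<n>` values in numeric order, and tells RTSP URLs apart from local files. It does not reproduce how this repository actually loads its configuration. The function names are hypothetical, and the use of additional keys such as `VIDEO_SOURCE_2` is an assumption based on the numbered naming.

```python
import re


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split only once: RTSP URLs contain '=' in their query string.
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip()
    return values


def collect_video_sources(env):
    """Return the VIDEO_SOURCE_<n> values from a mapping, ordered by n."""
    pattern = re.compile(r"^VIDEO_SOURCE_(\d+)$")
    found = []
    for key, value in env.items():
        match = pattern.match(key)
        if match and value:
            found.append((int(match.group(1)), value))
    return [value for _, value in sorted(found)]


def is_rtsp(source):
    """Return True for RTSP URLs, False for anything else (e.g. file paths)."""
    return source.lower().startswith("rtsp://")
```

Because `collect_video_sources` accepts any mapping, it can also be called with `os.environ` once the variables have been exported or loaded by a dotenv tool.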

Once your .env is configured, run YOWOv3 in one of the following modes:

🔹 Multistreaming Mode

python main.py -m multistreaming_live -cf config/cf2/ucf_config.yaml

🔹 Single Live Mode (Original YOWOv3)

python main.py -m live -cf config/cf2/ucf_config.yaml
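
For readers curious about what reading several streams at once can look like, here is a generic one-thread-per-source sketch using only the standard library. It is not the implementation behind `multistreaming_live`; all names (`stream_worker`, `run_streams`, `read_frame`) are hypothetical, and the frame reader is injected as a callable so the pattern stays independent of any video library.

```python
import queue
import threading


def stream_worker(name, read_frame, out_queue, stop_event):
    """Read frames from one source until it ends or stop_event is set."""
    while not stop_event.is_set():
        frame = read_frame()
        if frame is None:  # source exhausted or stream dropped
            break
        try:
            out_queue.put_nowait((name, frame))
        except queue.Full:
            # Consumer is behind: drop this frame to keep latency bounded.
            pass


def run_streams(readers, max_pending=8):
    """Start one daemon thread per source.

    `readers` maps a stream name to a zero-argument callable that returns
    the next frame, or None when the source has ended.
    Returns (frames_queue, stop_event, threads).
    """
    frames = queue.Queue(maxsize=max_pending)
    stop_event = threading.Event()
    threads = []
    for name, read_frame in readers.items():
        thread = threading.Thread(
            target=stream_worker,
            args=(name, read_frame, frames, stop_event),
            daemon=True,
        )
        thread.start()
        threads.append(thread)
    return frames, stop_event, threads
```

With OpenCV, `read_frame` could wrap `cv2.VideoCapture(source).read()` and return the frame when the success flag is true, or `None` otherwise. The bounded queue is a design choice for live input: dropping frames when the model falls behind is usually preferable to letting delay grow.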

Multistreaming Example

Basketball Stream

Diving Stream

Training Info – YOWO (I3D & ResNet-3D Backbones)

YOWO was trained using two different 3D backbones: I3D and ResNet-3D.

Training Loss Comparison: I3D vs ResNet-3D

🔹 Observations

  • I3D Backbone started with a lower loss and converged more smoothly.
  • ResNet-3D Backbone had a higher initial loss but showed consistent improvement and comparable convergence by epoch 7.
  • Both models benefited from gradual learning rate decay.
