Official implementation of GAF-Net: Video-Based Person Re-Identification via Appearance and Gait Recognitions (VISAPP 2024).
Authors: Moncef Boujou, Rabah Iguernaissi, Lionel Nicod, Djamal Merad, Séverine Dubuisson
Affiliations: LIS, CNRS, Aix-Marseille University, France | CERGAM, Aix-Marseille University, France
Video-based person re-identification (Re-ID) is a challenging task that aims to match individuals across cameras based on video sequences. While most existing Re-ID techniques focus solely on appearance information, incorporating gait information can improve person Re-ID systems. In this study, we propose GAF-Net, a novel approach that integrates appearance and gait features for re-identifying individuals: the appearance features are extracted from RGB tracklets, while the gait features are extracted from skeletal pose estimation. These features are then combined into a single representation used to re-identify individuals.
This repository provides:
- Evaluation script to reproduce paper results
- Pre-computed embeddings (appearance + gait)
- GaitGraph training code adapted for iLIDS-VID
- Pose data for training
This repository does not provide:
- Appearance model training (PiT/MGH/OSNet) — use original repos
- End-to-end inference pipeline (video → Re-ID)
- Pre-trained GaitGraph weights (coming soon)
Results on iLIDS-VID:

| Method | Gait | Rank-1 | Rank-5 | Rank-10 | Rank-20 | λ |
|---|---|---|---|---|---|---|
| GAF-Net (PiT) | gait1 | 93.07% | 99.27% | 99.74% | 99.94% | 0.74 |
| GAF-Net (MGH) | gait2 | 90.40% | 98.66% | 98.99% | 99.66% | 0.84 |
| GAF-Net (OSNet) | gait1 | 70.93% | 88.40% | 93.00% | 96.54% | 0.90 |
Gait contribution per appearance backbone (Rank-1 accuracy):

| Backbone | Appearance Only | GAF-Net (+ Gait) | Improvement |
|---|---|---|---|
| PiT | 92.07% | 93.07% | +1.00% |
| MGH | 85.60% | 90.40% | +4.80% |
| OSNet | 59.20% | 70.93% | +11.73% |
```bash
# Clone the repository
git clone https://github.com/Moncef-Bj/GAF-Net-for-Video-Based-Person-Re-Identification.git
cd GAF-Net-for-Video-Based-Person-Re-Identification

# Install dependencies
pip install -r requirements.txt
```

Requirements:

- Python >= 3.8
- PyTorch >= 1.9
- NumPy
- Pandas
- scikit-learn
- torchreid
Download the pre-computed embeddings and pose data from Google Drive:
Download gaf-net-data.zip (395 MB)
Extract the zip in the repository root:
```bash
# Linux/Mac
unzip gaf-net-data.zip -d .
```

```powershell
# Windows (PowerShell)
Expand-Archive -Path gaf-net-data.zip -DestinationPath .
```

This will create:
```
GAF-Net-for-Video-Based-Person-Re-Identification/
├── embeddings/          # Pre-computed embeddings for evaluation
│   └── ilids/
│       ├── pit/         # PiT appearance embeddings (9216-d)
│       ├── mgh/         # MGH appearance embeddings (5120-d)
│       ├── osnet/       # OSNet appearance embeddings (512-d)
│       ├── gait1/       # GaitGraph embeddings for PiT & OSNet (128-d)
│       └── gait2/       # GaitGraph embeddings for MGH (128-d)
└── poses/               # Pose data for training GaitGraph
    ├── splits_pit/      # Poses for PiT & OSNet splits
    └── splits_mgh/      # Poses for MGH splits
```
Repository structure:

```
GAF-Net-for-Video-Based-Person-Re-Identification/
├── README.md
├── requirements.txt
├── evaluate.py              # Main evaluation script (reproduces paper results)
├── embeddings/              # (download from Google Drive)
├── poses/                   # (download from Google Drive)
└── gaitgraph/               # Modified GaitGraph code
    └── src/
        ├── train.py         # Training script
        ├── evaluate.py      # Evaluation & embedding extraction
        ├── common.py        # Configuration & model setup
        ├── losses.py        # SupConLoss
        └── datasets/
            ├── gait.py          # Dataset classes (iLIDS, MARS, CASIA-B)
            ├── augmentation.py  # Data augmentation
            └── graph.py         # COCO skeleton graph
```
```bash
# Download and extract data first (see above)

# Evaluate all backbones
python evaluate.py

# Evaluate a specific backbone
python evaluate.py --backbone pit

# Custom lambda value
python evaluate.py --backbone mgh --lambda_val 0.84
```

Expected output:
```
======================================================================
FINAL SUMMARY
======================================================================
Backbone   Gait    λ      Rank-1   Rank-5   mAP     Paper
----------------------------------------------------------------------
PIT        gait1   0.74   93.07    99.33    95.80   93.07
MGH        gait2   0.84   90.40    98.67    94.01   90.40
OSNET      gait1   0.90   70.93    89.60    79.14   70.93
======================================================================
```
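For reference, below is a minimal sketch of how Rank-k (CMC) scores like these are typically computed from query/gallery embeddings. It follows the standard single-shot Re-ID protocol and is an illustration, not the exact logic of `evaluate.py`:

```python
import numpy as np

def cmc_rank_k(query_feats, query_ids, gallery_feats, gallery_ids, ks=(1, 5, 10, 20)):
    """Single-gallery-shot CMC: rank gallery entries by distance to each
    query and record where the first correct identity appears."""
    gallery_ids = np.asarray(gallery_ids)
    hits = np.zeros(len(ks))
    for qf, qid in zip(query_feats, query_ids):
        dists = np.linalg.norm(gallery_feats - qf, axis=1)  # distance to every gallery item
        order = np.argsort(dists)                           # closest first
        # assumes each query identity appears in the gallery (true for iLIDS-VID)
        first_match = np.where(gallery_ids[order] == qid)[0][0]
        for i, k in enumerate(ks):
            hits[i] += first_match < k
    return hits / len(query_feats)
```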
Troubleshooting:

1. `ModuleNotFoundError: No module named 'torchreid'`

```bash
pip install torchreid
# or
pip install git+https://github.com/KaiyangZhou/deep-person-reid.git
```

2. `FileNotFoundError: embeddings/ilids/...`

Make sure you downloaded and extracted gaf-net-data.zip in the repository root.
3. Results don't match the paper exactly
Ensure you're using the correct lambda values:
- PiT: λ = 0.74
- MGH: λ = 0.84
- OSNet: λ = 0.90
Our fusion formula combines appearance and gait modalities (Equation 4 in the paper):
```
z_i = [normalize(z_appearance), λ · normalize(z_gait)]
```

Where:

- `z_appearance`: appearance embedding from PiT/MGH/OSNet
- `z_gait`: gait embedding from GaitGraph (128-d)
- `λ`: fusion weight (best values per backbone: 0.74 for PiT, 0.84 for MGH, 0.90 for OSNet)
- `normalize()`: L2 normalization
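A minimal NumPy sketch of this fusion step (the helper names and random placeholder inputs are illustrative, not the repository's API):

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """L2-normalize a 1-D embedding vector."""
    return x / (np.linalg.norm(x) + eps)

def fuse(z_appearance, z_gait, lam):
    """Concatenate the L2-normalized appearance and gait embeddings,
    weighting the gait part by the fusion coefficient lambda."""
    return np.concatenate([l2_normalize(z_appearance), lam * l2_normalize(z_gait)])

# Example: PiT appearance (9216-d) + GaitGraph gait (128-d) -> 9344-d fused embedding
z_app = np.random.randn(9216)   # placeholder for a real PiT embedding
z_g = np.random.randn(128)      # placeholder for a real GaitGraph embedding
z = fuse(z_app, z_g, lam=0.74)  # fusion weight reported for PiT
print(z.shape)                  # (9344,)
```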
Pipeline overview:

```
Video Sequence
│
├──────────────────────────────────────┐
│ │
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ Pose Estimator │ │ Appearance │
│ (YOLO-Pose) │ │ (PiT/MGH/OSNet) │
└─────────────────┘ └─────────────────┘
│ │
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ Pose Sequence │ │ Appearance │
│ (T × 17 × 3) │ │ Embedding │
└─────────────────┘ └─────────────────┘
│ │
▼ │
┌─────────────────┐ │
│ GaitGraph │ │
│ (ResGCN) │ │
└─────────────────┘ │
│ │
▼ │
┌─────────────────┐ │
│ Gait Embedding │ │
│ (128-d) │ │
└─────────────────┘ │
│ │
└──────────────┬───────────────────────┘
▼
┌────────────────┐
│ Fusion │
│ [z_app, λ·z_g] │
└────────────────┘
│
▼
┌────────────────┐
│ Final Embedding│
└────────────────┘
```
The pose files in poses/ contain 2D pose estimations in COCO format (17 keypoints):
```
image_name,nose_x,nose_y,nose_conf,left_eye_x,left_eye_y,left_eye_conf,...,right_ankle_x,right_ankle_y,right_ankle_conf
cam1_person001_00319.png,30.04,22.35,0.278,29.14,20.0,0.046,...,20.52,124.65,0.457
```

Format:

- Column 0: Image filename (`cam{1|2}_person{XXX}_{FRAME}.png`)
- Columns 1-51: 17 COCO keypoints × 3 values (x, y, confidence) = 51 values
COCO 17 Keypoints Order:
```
0: nose, 1: left_eye, 2: right_eye, 3: left_ear, 4: right_ear,
5: left_shoulder, 6: right_shoulder, 7: left_elbow, 8: right_elbow,
9: left_wrist, 10: right_wrist, 11: left_hip, 12: right_hip,
13: left_knee, 14: right_knee, 15: left_ankle, 16: right_ankle
```
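As a hedged sketch of working with this layout (the helper below is not part of the repository; it assumes a CSV holding one tracklet's frames in the format above), here is one way to load a pose sequence into the (T × 17 × 3) array that GaitGraph consumes:

```python
import numpy as np
import pandas as pd

def load_pose_sequence(csv_path):
    """Read a pose CSV (image_name + 51 keypoint columns per row) and
    return the frame names plus a (T, 17, 3) array of (x, y, confidence)."""
    df = pd.read_csv(csv_path)                 # header row: image_name,nose_x,nose_y,...
    frames = df.iloc[:, 0].tolist()            # e.g. cam1_person001_00319.png
    keypoints = df.iloc[:, 1:52].to_numpy(dtype=np.float32).reshape(-1, 17, 3)
    return frames, keypoints

# Path taken from the training example below; adjust to your extraction
frames, poses = load_pose_sequence("poses/splits_pit/train_split0.csv")
print(poses.shape)  # (T, 17, 3)
```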
Appearance Embeddings (pit/, mgh/, osnet/):

```
index,feat_0,feat_1,...,feat_N,person_id,camera_id
0,0.123,0.456,...,0.789,43,2
```

Gait Embeddings (gait1/, gait2/):
```
feat_0,feat_1,...,feat_127,person_id,camera_id
0.123,0.456,...,0.789,43,2
```
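A small pandas-based sketch (not the repository's own loader) for reading these CSVs back into a feature matrix plus ID/camera labels:

```python
import numpy as np
import pandas as pd

def load_embeddings(csv_path, has_index=False):
    """Load an embedding CSV into (features, person_ids, camera_ids).
    Appearance files carry a leading index column; gait files do not."""
    df = pd.read_csv(csv_path)
    if has_index:
        df = df.drop(columns=df.columns[0])             # drop the leading index column
    feats = df.iloc[:, :-2].to_numpy(dtype=np.float32)  # all feat_* columns
    return feats, df["person_id"].to_numpy(), df["camera_id"].to_numpy()

# Hypothetical filenames; see the extracted embeddings/ directory for the real ones
app, pids, cams = load_embeddings("embeddings/ilids/pit/split0.csv", has_index=True)
gait, _, _ = load_embeddings("embeddings/ilids/gait1/split0.csv")
```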
Our GaitGraph model follows a two-stage training approach.

Stage 1: Pre-train on CASIA-B
```bash
cd gaitgraph/src
python train.py casia-b /path/to/casia_b_train.csv \
    --valid_data_path /path/to/casia_b_test.csv \
    --batch_size 128 \
    --epochs 1000 \
    --learning_rate 1e-2 \
    --temp 0.01 \
    --sequence_length 60 \
    --network_name resgcn-n39-r8
```

Stage 2: Fine-tune on iLIDS-VID
```bash
# For PiT/OSNet splits
python train.py iLIDS ../poses/splits_pit/train_split0.csv \
    --valid_data_path ../poses/splits_pit/test_split0.csv \
    --batch_size 128 \
    --epochs 300 \
    --learning_rate 1e-5 \
    --temp 0.01 \
    --sequence_length 60 \
    --network_name resgcn-n39-r8 \
    --weight_path /path/to/casia_b_pretrained.pth

# For MGH splits
python train.py iLIDS ../poses/splits_mgh/train_split0.csv \
    --valid_data_path ../poses/splits_mgh/test_split0.csv \
    ...
```

After training, extract embeddings for fusion:
```bash
python evaluate.py iLIDS /path/to/test_split0.csv \
    --weight_path /path/to/ckpt_epoch_best.pth
```

If you find this work useful, please cite our paper:
```bibtex
@inproceedings{boujou2024gafnet,
  title={GAF-Net: Video-Based Person Re-Identification via Appearance and Gait Recognitions},
  author={Boujou, Moncef and Iguernaissi, Rabah and Nicod, Lionel and Merad, Djamal and Dubuisson, S{\'e}verine},
  booktitle={Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISAPP)},
  pages={493--500},
  year={2024},
  organization={SCITEPRESS},
  doi={10.5220/0012364200003660}
}
```

This work builds upon:
- PiT - Multidirection and Multiscale Pyramid in Transformer (Zang et al., IEEE TII 2022)
- MGH - Multi-Granularity Hypergraph (Yan et al., CVPR 2020)
- OSNet - Omni-Scale Network (Zhou et al., ICCV 2019)
- GaitGraph - Graph Convolutional Network for Skeleton-Based Gait Recognition (Teepe et al., ICIP 2021)
This project is licensed under the MIT License - see the LICENSE file for details.
For questions or issues, please open an issue or contact:
- Moncef Boujou - moncef.boujou@univ-amu.fr