🧠 Models Directory

This folder contains all pretrained deep learning models used for action classification via pose estimation and image-based inference in the video action recognition API.


📁 Folder Structure

models/
├── final/
│   ├── best_model.h5          # Combined pose+image model (.h5 format)
│   ├── best_model.keras       # Same model in native Keras format (.keras)
│
├── img/
│   ├── image_model.h5         # Image-based model (.h5 format)
│   ├── image_model.keras      # Image-based model (.keras format)
│   ├── image_model.pb         # Frozen TensorFlow graph (Protocol Buffer)
│
├── pose/
│   ├── pose_model.h5          # Pose-based model (.h5 format)
│   ├── pose_model.keras       # Pose-based model (.keras format)
│   ├── pose_model.pb          # Frozen graph for pose-only model

📌 Model Types & Use Cases

| Folder | Purpose | Input | Model Type | Used In |
|--------|---------|-------|------------|---------|
| final/ | Unified model for pose + image fusion | Sequences of poses + frames | ConvLSTM + CNN + Dense | Poseclassifier.py (mode='both') |
| img/ | Classifies from image sequences | Raw video frames (RGB) | CNN + TimeDistributed | Poseclassifier.py (mode='action') |
| pose/ | Classifies from pose keypoints | Pose landmark vectors | LSTM / BiLSTM | Poseclassifier.py (mode='pose') |
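
The mode-to-folder mapping above could be captured in a small lookup helper. This is only a sketch: `MODEL_FILES`, `resolve_model_path`, and `load_for_mode` are hypothetical names that do not come from Poseclassifier.py, and the choice of the `.keras` file for each mode is an assumption.

```python
from pathlib import Path

# Assumed mapping from the `mode` argument to a model file in this folder.
# The keys come from the table above; picking the .keras variant is a guess.
MODEL_FILES = {
    "both": Path("models/final/best_model.keras"),
    "action": Path("models/img/image_model.keras"),
    "pose": Path("models/pose/pose_model.keras"),
}


def resolve_model_path(mode: str) -> Path:
    """Return the model file for a mode, or raise ValueError if unknown."""
    try:
        return MODEL_FILES[mode]
    except KeyError:
        expected = sorted(MODEL_FILES)
        raise ValueError(f"Unknown mode {mode!r}; expected one of {expected}") from None


def load_for_mode(mode: str):
    """Load the Keras model for a mode. TensorFlow is imported lazily."""
    from tensorflow.keras.models import load_model

    return load_model(str(resolve_model_path(mode)))
```

Keeping the path lookup separate from the loading step means an unknown mode fails fast with a clear message, before TensorFlow is imported.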

⚠️ Model Format Notes

✅ Supported

  • .h5 and .keras work with Keras / TensorFlow.
  • .pb (frozen graph) is for TensorFlow inference (static graphs).

❌ Not Supported

Attempts to convert to:

  • ONNX
  • TFLite
  • CoreML

…have failed, mainly due to:

🔄 The LSTM and ConvLSTM layers introduce dynamic input shapes and internal state, which converters for static formats such as ONNX or TFLite do not handle well.

Why?

  • Dynamic dimensions (for example, the leading None, the unspecified batch size, in an input shape of (None, 30, ...)) and stateful recurrent behavior break static conversion logic.
  • Keras implements ConvLSTM internally with tf.while_loop and nested control flow, which the ONNX and TFLite converters cannot trace reliably.
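
To see which layers in a loaded model are likely to trigger these problems, one option is to scan the layer list by class name. The sketch below is an assumption-laden helper, not part of this repository: `find_recurrent_layers` and `RECURRENT_LAYER_NAMES` are hypothetical, and only top-level layers are inspected (nested sub-models are not walked).

```python
# Layer class names that, per the notes above, tend to cause conversion trouble.
# The exact set is an assumption; extend it to match the layers actually used.
RECURRENT_LAYER_NAMES = {"LSTM", "ConvLSTM2D", "Bidirectional"}


def find_recurrent_layers(model):
    """Return (layer_name, class_name) pairs for top-level recurrent layers."""
    return [
        (layer.name, type(layer).__name__)
        for layer in model.layers
        if type(layer).__name__ in RECURRENT_LAYER_NAMES
    ]
```

For example, `find_recurrent_layers(load_model('models/pose/pose_model.h5'))` would list the recurrent layers of the pose model, if any are present at the top level.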

🔧 Recommendation

If you need these models in ONNX or TFLite in the future:

  1. Try simpler versions using GRU or Flattened TimeDistributed CNNs.

  2. Convert after training by wrapping the model in a tf.function with a concrete (fully specified) input signature and tracing it.

  3. For mobile inference, consider:

    • Edge-friendly models with fixed input shapes.
    • Using only pose vectors (not ConvLSTM) + small feed-forward layers.
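
A minimal sketch of the second recommendation follows, assuming a single-input sequence model. `convert_to_tflite` is a hypothetical helper, and `time_steps` and `feature_dim` are placeholders that must match the real model. Conversion may still fail for the LSTM and ConvLSTM models described above; falling back to `SELECT_TF_OPS` is one thing to try, not a guaranteed fix.

```python
def convert_to_tflite(model, time_steps, feature_dim):
    """Try to convert a Keras sequence model to TFLite with a fixed input shape.

    Returns the serialized TFLite model as bytes. TensorFlow is imported lazily.
    """
    import tensorflow as tf

    # Fix every dimension, including the batch size, so nothing stays dynamic.
    spec = tf.TensorSpec([1, time_steps, feature_dim], tf.float32)

    @tf.function(input_signature=[spec])
    def serve(x):
        return model(x, training=False)

    # Trace once against the fixed signature to obtain a concrete function.
    concrete = serve.get_concrete_function()

    converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete], model)
    # Allow fallback to TensorFlow ops where no TFLite builtin exists.
    converter.target_spec.supported_ops = [
        tf.lite.OpsSet.TFLITE_BUILTINS,
        tf.lite.OpsSet.SELECT_TF_OPS,
    ]
    return converter.convert()
```

A call might look like `convert_to_tflite(model, time_steps=30, feature_dim=66)`, where 30 follows the (None, 30, ...) shape mentioned above and 66 is purely illustrative.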

📂 Usage Reference

| File | Used In |
|------|---------|
| final/*.h5 | Poseclassifier.py, mode='both' |
| img/*.h5 | Poseclassifier.py, mode='action' |
| pose/*.h5 | Poseclassifier.py, mode='pose' |

🛠️ Example Code to Load

from tensorflow.keras.models import load_model

# Load pose model
model = load_model('models/pose/pose_model.h5')

# Load final model (pose + image)
combined_model = load_model('models/final/best_model.keras')
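
To check that a model both loads and runs, a quick smoke test can feed it zero-filled input built from its declared input shape. This is a sketch under assumptions: `concrete_shape` and `smoke_test` are hypothetical helpers, inputs are assumed to be float32, and any unspecified (None) dimension other than the batch is filled with a placeholder size.

```python
def concrete_shape(input_shape, batch_size=1, default_dim=1):
    """Replace None entries of a Keras input shape with concrete sizes."""
    rest = tuple(default_dim if d is None else d for d in input_shape[1:])
    return (batch_size,) + rest


def smoke_test(path):
    """Load a model and run one prediction on zero-filled dummy input."""
    import numpy as np
    from tensorflow.keras.models import load_model

    # compile=False skips restoring the optimizer, which inference does not need.
    model = load_model(path, compile=False)

    shapes = model.input_shape
    if isinstance(shapes, list):  # multi-input model, e.g. pose + image
        inputs = [np.zeros(concrete_shape(s), dtype="float32") for s in shapes]
    else:
        inputs = np.zeros(concrete_shape(shapes), dtype="float32")

    return model.predict(inputs, verbose=0)
```

Calling `smoke_test('models/pose/pose_model.h5')` should return an array of predictions if the file loads and the assumed input dtype is right.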
