Add active hand estimation utility #223

Merged
AmitMY merged 2 commits into master from feat/add_hand_estimate_utility on Jun 7, 2026

@shaltielshmid

Contributor

Summary

  • add pose_format.utils.hand.estimate_active_hand for MediaPipe holistic poses
  • document the short/long clip heuristic and validation provenance
  • include the module in the Sphinx utils toctree for Read the Docs
  • add synthetic tests for long-clip hand landmark emphasis, short-clip motion fallback, and left/right symmetry

Validation

  • uv run python -m py_compile pose_format/utils/hand.py tests/hand_estimation_test.py
  • uv run --with pytest pytest tests/hand_estimation_test.py
  • uv run --with sphinx==6.2.1 --with myst-parser==2.0.0 --with autodocsumm==0.2.11 --with sphinxcontrib-bibtex==2.5.0 --with sphinx-rtd-theme==1.2.2 --with sphinx-needs==1.3.0 --with sphinxcontrib-plantuml==0.25 sphinx-build -b html docs /tmp/pose-docs-build (succeeds with existing optional torch/tensorflow/doc warnings)
  • validation sample via packaged function: CFSW 143/143, mirrored CFSW 143/143, fsboard 100/100, mirrored fsboard 100/100

)

SHORT_CLIP_FRAMES = 50

Collaborator

This utility is very much tailored to MediaPipe.
One option is to use a generic util like hands_components from generic, so it can deal with varying formats.
Another option is for estimate_active_hand to first check whether detect_known_pose_format(pose) == "holistic".
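
A minimal sketch of the second option, assuming the detector returns a format string such as "holistic". The helper name require_holistic and the injected detect_format parameter are hypothetical and exist only so the example runs without the library; in the package, the detector would presumably be the detect_known_pose_format function named above.

```python
from typing import Callable


def require_holistic(pose: object, detect_format: Callable[[object], str]) -> None:
    """Raise NotImplementedError unless the detector reports "holistic".

    `detect_format` is a stand-in (assumption) for the library's format
    detector; it is injected so this sketch stays self-contained.
    """
    fmt = detect_format(pose)
    if fmt != "holistic":
        raise NotImplementedError(
            f"active hand estimation supports only holistic poses, got {fmt!r}"
        )


# Usage with fake detectors: the first call passes silently, the second raises.
require_holistic(object(), lambda pose: "holistic")
try:
    require_holistic(object(), lambda pose: "openpose")
except NotImplementedError as error:
    print(error)
```

Failing loudly on unsupported formats keeps a MediaPipe-specific heuristic from silently returning a guess for poses it was never validated on.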

The heuristic compares torso-normalized wrist geometry, hand landmark confidence, distance from the torso,
and motion. It uses a short/long clip split: short clips use body-relative wrist motion because summed motion is
still stable there, while longer clips emphasize tracked hand landmarks and avoid duration-sensitive body-motion
accumulation.
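
The short/long split described in this docstring can be illustrated with a toy version. This is a sketch under stated assumptions, not the code in pose_format/utils/hand.py: it uses only two of the signals (summed wrist motion and mean hand landmark confidence), the function names are hypothetical, the "LEFT"/"RIGHT" return labels are assumed, and whether a clip of exactly SHORT_CLIP_FRAMES frames counts as short is a guess.

```python
import numpy as np

SHORT_CLIP_FRAMES = 50  # value from the diff; the "<" comparison below is an assumption


def summed_motion(wrist_xy: np.ndarray) -> float:
    """Total frame-to-frame displacement of a (frames, 2) wrist track."""
    if len(wrist_xy) < 2:
        return 0.0
    steps = np.diff(wrist_xy, axis=0)
    return float(np.linalg.norm(steps, axis=1).sum())


def toy_active_hand(left_wrist, right_wrist, left_hand_conf, right_hand_conf) -> str:
    """Pick a hand using motion on short clips and confidence on long clips."""
    n_frames = len(left_wrist)
    if n_frames < SHORT_CLIP_FRAMES:
        # Short clip: summed wrist motion is the deciding signal.
        left_score = summed_motion(left_wrist)
        right_score = summed_motion(right_wrist)
    else:
        # Long clip: summed motion grows with duration, so use hand
        # landmark confidence instead.
        left_score = float(np.mean(left_hand_conf))
        right_score = float(np.mean(right_hand_conf))
    return "LEFT" if left_score > right_score else "RIGHT"


# Synthetic data: the left wrist drifts, but only the right hand is tracked well.
n = 200
t = np.arange(n, dtype=float)
moving = np.stack([0.01 * t, np.zeros(n)], axis=1)
still = np.zeros((n, 2))
low, high = np.full(n, 0.1), np.full(n, 0.9)

print(toy_active_hand(moving, still, low, high))                      # long clip
print(toy_active_hand(moving[:20], still[:20], low[:20], high[:20]))  # short clip
```

On this synthetic input the two branches disagree, which is the situation the split is meant to handle: the same signals are weighted differently depending on clip length.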

Collaborator

So, can we not always use the "long" version?

Contributor Author

The long version ignores body-motion accumulation, which is correct for long fsboard clips but loses short ChicagoFSWild cases where hand landmarks are sparse and body-relative wrist motion is the strongest signal. The short/long split keeps that short-clip motion signal without letting duration-scaled motion dominate long clips.

@@ -0,0 +1,104 @@
import numpy as np

Collaborator

Please also include ~5 pose files in the test assets, show that they work, and show that mirror_horizontal flips handedness correctly
(to prevent future regressions).

Contributor Author

done

@shaltielshmid

Contributor Author

Addressed the review notes:

  • estimate_active_hand now explicitly checks detect_known_pose_format(pose) == "holistic" and raises NotImplementedError for other pose formats.
  • Added five real .pose fixtures from the validation/sample data: 2 LEFT and 3 RIGHT, spanning ChicagoFSWild short clips and fsboard long clips.
  • Added fixture tests that assert the expected handedness and assert mirror_horizontal flips the estimated handedness for every fixture.
  • Removed the standalone Bazel hand target I had added, because this utility now depends on generic.py, which is not modeled as a Bazel target in this package; setuptools packaging still includes the module.
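
The mirror assertion in the fixture tests checks a symmetry property: mirroring a pose should flip the estimated hand. The sketch below shows that property on synthetic wrist tracks. It does not call pose-format; mirror_tracks, busier_hand, and flips_under_mirror are hypothetical stand-ins, and modeling the mirror as "negate x and swap the left/right labels" is an assumption about what mirror_horizontal does.

```python
import numpy as np


def mirror_tracks(left: np.ndarray, right: np.ndarray):
    """Toy horizontal mirror (assumption): negate x and swap left/right labels."""
    flip = np.array([-1.0, 1.0])
    return right * flip, left * flip


def busier_hand(left: np.ndarray, right: np.ndarray) -> str:
    """Toy estimator: the hand whose wrist travels further is the active one."""
    def motion(track: np.ndarray) -> float:
        return float(np.linalg.norm(np.diff(track, axis=0), axis=1).sum())

    return "LEFT" if motion(left) > motion(right) else "RIGHT"


def flips_under_mirror(left: np.ndarray, right: np.ndarray) -> bool:
    """True if mirroring the input flips the estimated hand."""
    opposite = {"LEFT": "RIGHT", "RIGHT": "LEFT"}
    original = busier_hand(left, right)
    mirrored = busier_hand(*mirror_tracks(left, right))
    return mirrored == opposite[original]


# Synthetic random-walk tracks: one clearly busy wrist and one nearly still wrist.
rng = np.random.default_rng(0)
busy = np.cumsum(rng.normal(scale=0.05, size=(30, 2)), axis=0)
quiet = np.cumsum(rng.normal(scale=0.001, size=(30, 2)), axis=0)

assert flips_under_mirror(busy, quiet)
assert flips_under_mirror(quiet, busy)
```

A property check like this complements fixed expected labels: it catches an estimator that is biased toward one side even when every fixture label happens to pass.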

Validation:

  • uv run python -m py_compile pose_format/utils/hand.py tests/hand_estimation_test.py
  • uv run --with pytest --with "mediapipe<0.10.30" pytest tests/hand_estimation_test.py -q -> 14 passed
  • Sphinx build still succeeds and includes pose_format.utils.hand; remaining warnings are existing optional torch/tensorflow/autodoc warnings.
  • Validation sample remains 100%: CFSW 143/143, mirrored CFSW 143/143, fsboard 100/100, mirrored fsboard 100/100.

Why not use the long version always? The long version ignores body-motion accumulation, which is correct for long fsboard clips but loses short ChicagoFSWild cases where hand landmarks are sparse and body-relative wrist motion is the strongest signal. The short/long split keeps that short-clip motion signal without letting duration-scaled motion dominate long clips.

@AmitMY AmitMY merged commit 8165491 into master Jun 7, 2026
8 checks passed