Commit fb98e0e

Merge pull request #280 from clamsproject/277-sampling-mode-docs

Document tfSamplingMode in developer tutorial

2 parents e593b5a + b80cbcd

1 file changed: documentation/tutorial.md

Lines changed: 21 additions & 0 deletions
@@ -226,6 +226,27 @@ def _run_nlp_tool(self, doc, new_view):

First, with `text_value` we get the text from the text document, either from its `location` property or from its `text` property. Second, we apply the tokenizer to the text. And third, we loop over the token offsets in the tokenizer result and create annotations of type `Uri.TOKEN` with an identifier that is automatically generated by the SDK. All that is needed for adding an annotation is the `new_annotation()` method on the view object and the `add_property()` method on the annotation object.
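
The loop described above can be sketched in plain Python. This is an illustration only: `View` and `Annotation` here are minimal stand-ins, not the real mmif/clams classes, and `simple_tokenize` is a toy whitespace tokenizer; only the method names (`new_annotation()`, `add_property()`) and the overall flow come from the text above.

```python
# Illustrative sketch of the token-annotation loop; `View` and
# `Annotation` are minimal stand-ins, NOT the real mmif/clams classes.

class Annotation:
    def __init__(self, at_type, ann_id):
        self.at_type = at_type
        self.id = ann_id
        self.properties = {}

    def add_property(self, name, value):
        self.properties[name] = value

class View:
    def __init__(self):
        self.annotations = []
        self._next_id = 0

    def new_annotation(self, at_type):
        # the real SDK generates the identifier automatically
        ann = Annotation(at_type, f"a_{self._next_id}")
        self._next_id += 1
        self.annotations.append(ann)
        return ann

def simple_tokenize(text):
    # toy whitespace tokenizer returning (start, end) character offsets
    offsets, start = [], None
    for i, ch in enumerate(text + " "):
        if ch.isspace():
            if start is not None:
                offsets.append((start, i))
                start = None
        elif start is None:
            start = i
    return offsets

text = "Hello CLAMS world"
view = View()
for start, end in simple_tokenize(text):
    token = view.new_annotation("Uri.TOKEN")
    token.add_property("start", start)
    token.add_property("end", end)
    token.add_property("word", text[start:end])
```
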
## Working with TimeFrame Annotations

Many CLAMS apps process video by operating on TimeFrame annotations produced by an upstream app (e.g., scene detection or shot segmentation). A TimeFrame can carry structural members (currently called `targets`, a list of TimePoint IDs covering every frame in the segment), a salient subset of those members (currently called `representatives`), or simply `start`/`end` boundaries.
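
For illustration, the properties described above can be pictured as the following dict. This is only an approximate shape, not the exact MMIF serialization; the TimePoint identifiers and the interval values are hypothetical examples.

```python
# Approximate shape of a TimeFrame's properties as described above;
# NOT the exact MMIF serialization. IDs and numbers are hypothetical.
time_frame = {
    "@type": "TimeFrame",
    "properties": {
        # structural members: every TimePoint in the segment
        "targets": ["tp_1", "tp_2", "tp_3", "tp_4", "tp_5"],
        # salient subset of the members
        "representatives": ["tp_3"],
        # or simply interval boundaries
        "start": 30,
        "end": 150,
    },
}
```
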
> **Note**
> The property names `targets` and `representatives` are under review and may be renamed in a future MMIF spec version. See [mmif#238](https://github.com/clamsproject/mmif/issues/238) for the ongoing discussion. The SDK API will be updated accordingly.
### Frame sampling with `tfSamplingMode`
When your app receives TimeFrame annotations, the caller can control which frames your app processes by setting the `tfSamplingMode` runtime parameter. This is a **universal parameter** — automatically available on every CLAMS app without any per-app configuration.

There are three modes:
- `representatives` (default) — use the frames listed in the TimeFrame's `representatives` property. If no representatives exist, the TimeFrame is skipped.
- `single` — pick one frame: the middle representative if available, otherwise the midpoint of the start/end interval.
- `all` — use every frame in `targets` if present, otherwise generate every frame in the start/end interval.
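
The three behaviors can be sketched in plain Python. This is an illustration of the modes as described above, not the SDK's implementation (the real per-mode functions live in `mmif.utils.video_document_helper`); representing a TimeFrame as a dict and treating `end` as inclusive are assumptions made here for the sketch.

```python
# Plain-Python sketch of the three tfSamplingMode behaviors; NOT the
# SDK's implementation. A TimeFrame is modeled as a dict with optional
# "targets", "representatives", "start", "end" keys (an assumption).

def sample_frames(tf, mode="representatives"):
    reps = tf.get("representatives") or []
    targets = tf.get("targets") or []
    if mode == "representatives":
        # default: use representatives; empty result means "skip this TimeFrame"
        return list(reps)
    if mode == "single":
        # middle representative if available, else the interval midpoint
        if reps:
            return [reps[len(reps) // 2]]
        return [(tf["start"] + tf["end"]) // 2]
    if mode == "all":
        # every target if present, else every frame in the interval
        # (end treated as inclusive here, an assumption of this sketch)
        if targets:
            return list(targets)
        return list(range(tf["start"], tf["end"] + 1))
    raise ValueError(f"unknown tfSamplingMode: {mode}")

tf = {"start": 10, "end": 14,
      "targets": [10, 11, 12, 13, 14],
      "representatives": [11, 13]}
```
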
App developers do **not** need to handle this parameter themselves. The SDK intercepts it in `annotate()` and sets a context variable before `_annotate()` runs. Inside `_annotate()`, calls to `vdh.extract_frames_by_mode()` automatically read the active mode and select frames accordingly. The underlying per-mode functions (`_sample_representatives()`, `_sample_single()`, `_sample_all()`) in `mmif.utils.video_document_helper` are also available for apps that need frame numbers without extracting images.
See the `extract_frames_by_mode()` API documentation for details.
## Containerization with Docker
Apps within CLAMS typically run as Flask servers in Docker containers, and after an app has been tested as a local Flask application, it should be containerized. In fact, in some cases we don't even bother running a local Flask server and move straight to the container setup.
