feat(video): share-to-video ingestion + fix user_video duplicates #635

Open
mircealungu wants to merge 2 commits into master from feat/share-to-video-ingestion

@mircealungu

Backend for "share a video to Zeeguu" → interactive viewing. The interactive video reader (/user_video) already works; this adds the missing ingestion path and fixes a prerequisite bug found while testing it. No client is wired yet — the browser-extension and iOS callers come in follow-up PRs.

Why this approach

YouTube blocks caption fetching from datacenter IPs and now requires a PO token, which is why server-side fetching broke (captions had degraded to a manual captions.json). Instead, the client — a real browser tab or iPhone on a residential IP with an authorized YouTube player — extracts the caption track and hands it to us. This is the same pattern the article share-flow already uses (scrapeForUpload.js). Metadata is unaffected (Data API key, not IP-blocked).

Commits

1. fix(video): dedupe + UNIQUE(user_id, video_id) on user_video (prerequisite)

  • The model declared its unique constraint as a bare db.UniqueConstraint(...) outside __table_args__, so no DDL was ever emitted. Concurrent first-open requests (/user_video + /video_opened fire together on reader load) inserted duplicate rows, and .one() then 500'd with MultipleResultsFound — the video became permanently un-openable (observed live on video 2316).
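  • Constraint moved into __table_args__ (also fixes the table_args -> __table_args__ typo, which had silently dropped the collation); find/find_or_create use .first() instead of .one(), so pre-existing duplicates never raise; the migration dedupes existing rows, then adds the unique key.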
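
The race handling described above can be sketched in isolation. This is a minimal illustration using the standard-library sqlite3 module rather than the project's SQLAlchemy models; the schema and the helper names (find, insert_or_requery) are assumptions made for the example, not the code in this PR.

```python
import sqlite3

# Hypothetical stand-in for the user_video table, with the unique key in place.
SCHEMA = """
CREATE TABLE user_video (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id INTEGER NOT NULL,
    video_id INTEGER NOT NULL,
    UNIQUE (user_id, video_id)
);
"""


def find(conn, user_id, video_id):
    # .first()-style lookup: return the lowest id, never raise on duplicates.
    row = conn.execute(
        "SELECT id FROM user_video "
        "WHERE user_id = ? AND video_id = ? ORDER BY id LIMIT 1",
        (user_id, video_id),
    ).fetchone()
    return row[0] if row else None


def insert_or_requery(conn, user_id, video_id):
    # If a concurrent request inserted first, the unique key rejects this
    # INSERT; roll back and return the row the other request created.
    try:
        cur = conn.execute(
            "INSERT INTO user_video (user_id, video_id) VALUES (?, ?)",
            (user_id, video_id),
        )
        conn.commit()
        return cur.lastrowid
    except sqlite3.IntegrityError:
        conn.rollback()
        return find(conn, user_id, video_id)


def find_or_create(conn, user_id, video_id):
    existing = find(conn, user_id, video_id)
    if existing is not None:
        return existing
    return insert_or_requery(conn, user_id, video_id)
```

Without the unique key, the except branch can never run, which matches the observation that the rollback-and-requery handler only fires once the index exists.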

2. feat(video): accept client-extracted captions via /video_upload/create

  • Video.find_or_create(..., captions=, enforce_language=False, enforce_caption_length=False) — ingests supplied captions and skips the crawler-era reject filters for user-shared videos. Defaults unchanged, so the crawler path is untouched.
  • youtube_api: extract_youtube_video_id + normalize_caption_list; fetch_video_info gains provided_captions + enforce_* flags.
  • New POST /video_upload/create.
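
For orientation, here is one way an ID extractor covering the URL shapes named in the commit message (watch?v=, youtu.be, shorts, embed, bare id) could look. It is a sketch, not the helper added in this PR: the function name is reused only for readability, and the 11-character ID pattern is an assumption about current YouTube IDs.

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumption: video ids are 11 characters from [A-Za-z0-9_-].
_VIDEO_ID = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_youtube_video_id(url_or_id):
    candidate = url_or_id.strip()
    # Bare id: accept it as-is.
    if _VIDEO_ID.fullmatch(candidate):
        return candidate
    # Allow scheme-less input such as "youtu.be/<id>".
    if "://" not in candidate:
        candidate = "https://" + candidate
    parsed = urlparse(candidate)
    host = (parsed.hostname or "").lower()
    segments = [s for s in parsed.path.split("/") if s]

    video_id = None
    if host == "youtu.be":
        video_id = segments[0] if segments else None
    elif host == "youtube.com" or host.endswith(".youtube.com"):
        if len(segments) > 1 and segments[0] in ("shorts", "embed"):
            video_id = segments[1]
        else:
            video_id = parse_qs(parsed.query).get("v", [None])[0]

    # Reject anything that does not look like a video id.
    if video_id and _VIDEO_ID.fullmatch(video_id):
        return video_id
    return None
```

Returning None for unrecognized hosts keeps the caller in charge of the error response, which seems a reasonable choice for an endpoint that takes user-supplied URLs.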

Migration

tools/migrations/26-05-26-a--dedupe-and-unique-user-video.sql — run before/with deploy. Dedupes existing user_video rows (keep lowest id) then adds UNIQUE(user_id, video_id).
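
The two steps (dedupe keeping the lowest id, then add the unique key) can be tried out locally with a sketch like the one below. It runs against an in-memory SQLite database, so the SQL dialect, the index name, and the run_migration helper are assumptions for illustration; the statements in the actual migration file are not reproduced here and may differ.

```python
import sqlite3

# Step 1: keep only the lowest id per (user_id, video_id) pair.
# Step 2: add the unique key so new duplicates are rejected.
DEDUPE_AND_CONSTRAIN = """
DELETE FROM user_video
WHERE id NOT IN (
    SELECT MIN(id) FROM user_video GROUP BY user_id, video_id
);
CREATE UNIQUE INDEX user_video_user_id_video_id
    ON user_video (user_id, video_id);
"""


def run_migration(conn):
    # executescript runs both statements in order.
    conn.executescript(DEDUPE_AND_CONSTRAIN)
```

The order matters: creating the unique index first would fail while duplicate rows still exist.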

Testing

  • Helpers unit-smoke-tested (URL parsing across shapes, entity cleanup, empty-segment drop).
  • All changed modules import/compile cleanly.
  • Not yet exercised end-to-end (needs the extension/iOS client) — that's the next PR.
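
The entity cleanup and empty-segment drop mentioned in the first bullet can be illustrated as follows. This is a sketch only: the segment shape (dicts with a "text" key plus timing fields) is an assumption, and collapsing whitespace is an extra step added here for the example, not something the PR states.

```python
import html


def normalize_caption_list(raw_captions):
    cleaned = []
    for segment in raw_captions:
        # Unescape HTML entities such as &amp; and &#39;.
        text = html.unescape(str(segment.get("text", "")))
        # Assumption: collapse runs of whitespace, including newlines.
        text = " ".join(text.split())
        # Drop segments that are empty after cleanup.
        if not text:
            continue
        # Keep the other fields (e.g. timing) untouched.
        cleaned.append({**segment, "text": text})
    return cleaned
```

A quick check of the behaviour:

```python
captions = [
    {"text": "Tom &amp; Jerry", "start": 0.0},
    {"text": "   ", "start": 1.0},
    {"text": "it&#39;s\nfine", "start": 2.0},
]
print(normalize_caption_list(captions))
```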

🤖 Generated with Claude Code

mircealungu and others added 2 commits May 26, 2026 17:34
UserVideo declared its unique constraint as a bare db.UniqueConstraint(...)
expression outside __table_args__, so no DDL was ever emitted. Concurrent
first-open requests (/user_video + /video_opened fire together on reader load)
both INSERTed, producing duplicate rows; UserVideo.find/find_or_create then
500'd with MultipleResultsFound and the video became permanently un-openable.

- Move the constraint into __table_args__ (also fixes the table_args ->
  __table_args__ typo that silently dropped the collation).
- find/find_or_create use .first() not .one() so pre-existing dups never raise;
  the rollback-and-requery race handler now actually fires once the unique
  index exists.
- Migration dedupes existing rows (keep lowest id) then adds the unique key.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Enables "share a video to Zeeguu": the client (browser extension / iOS
WKWebView) extracts captions from YouTube's authorized player and POSTs them,
sidestepping the server-side caption fetch YouTube blocks from datacenter IPs
(and which now requires a PO token). The server fetches only metadata (Data
API key, key-authenticated, not IP-blocked) and creates the Video + Caption
rows; the existing /user_video reader serves them unchanged.

- Video.find_or_create(..., captions=, enforce_language=False,
  enforce_caption_length=False): ingest supplied captions and skip the
  crawler-era reject filters for user-shared videos (the user chose the video).
  Defaults are unchanged, so the crawler path behaves identically.
- youtube_api: extract_youtube_video_id (watch?v= / youtu.be / shorts / embed /
  bare id) and normalize_caption_list (unescape entities, drop empty segments);
  fetch_video_info gains provided_captions + enforce_* flags.
- New POST /video_upload/create endpoint, registered in the blueprint.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

ArchLens detected architectural changes in two views (diff images not included).
