
fix(xiaoyuzhou): migrate from broken SSR scraping to authenticated API (fixes #1023) #1059

Merged: jackwener merged 3 commits into jackwener:main from kagura-agent:fix/xiaoyuzhou-ssr-to-api on Apr 16, 2026

@kagura-agent
Contributor

Summary

Fixes #1023: the Xiaoyuzhou podcast, podcast-episodes, episode, and download commands all fail with 404 errors.

Root Cause

Xiaoyuzhou (小宇宙) switched their website from Next.js SSR to a static export (SSG). The /podcast/<id> and /episode/<id> pages now return HTTP 404, which broke fetchPageProps() in utils.js that relied on scraping <script id="__NEXT_DATA__"> from server-rendered HTML.
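For context, the old scraping path can be sketched roughly as below. This is not the removed fetchPageProps() code: the function names, the regex, and the status handling are assumptions for illustration. It only shows why a 404 response, or a page without a __NEXT_DATA__ script tag, leaves nothing to parse.

```javascript
// Extract the JSON payload of <script id="__NEXT_DATA__"> from an HTML string.
// Returns the parsed object, or null when the tag is absent.
function extractNextData(html) {
  const match = /<script[^>]*\bid="__NEXT_DATA__"[^>]*>([\s\S]*?)<\/script>/.exec(html);
  if (!match) return null;
  return JSON.parse(match[1]);
}

// Hypothetical stand-in for the scraping step. The real helper's
// signature and error handling are not shown in this PR.
function pagePropsFromResponse(status, html) {
  if (status !== 200) {
    // This is the branch the /podcast/<id> and /episode/<id> pages now hit.
    throw new Error(`Page request failed with HTTP ${status}`);
  }
  const data = extractNextData(html);
  if (!data || !data.props || !data.props.pageProps) {
    throw new Error('__NEXT_DATA__ pageProps not found in HTML');
  }
  return data.props.pageProps;
}
```

Because the payload only exists in the page HTML, there is no fallback once the page itself stops being served, which is why the fix moves to the API instead of patching the scraper.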

Fix

Migrated all four affected commands from HTML scraping (fetchPageProps) to the authenticated REST API (requestXiaoyuzhouJson) that already exists in auth.js and is used successfully by transcript.js.

Command            Endpoint
podcast            GET /v1/podcast/get?pid=<id>
podcast-episodes   POST /v1/podcast/listEpisode
episode            GET /v1/episode/get?eid=<id>
download           GET /v1/episode/get?eid=<id>
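The mapping above can be expressed as data plus a small request builder. This is a sketch, not the code in this PR: buildRequest and ENDPOINTS are hypothetical names, and the POST body fields (pid, limit) are an assumption, since the PR does not show the listEpisode payload. The result would be handed to a client such as requestXiaoyuzhouJson, whose signature is also not shown here.

```javascript
// Endpoint table from the PR description, expressed as data.
const ENDPOINTS = {
  'podcast': { method: 'GET', path: '/v1/podcast/get', idParam: 'pid' },
  'podcast-episodes': { method: 'POST', path: '/v1/podcast/listEpisode' },
  'episode': { method: 'GET', path: '/v1/episode/get', idParam: 'eid' },
  'download': { method: 'GET', path: '/v1/episode/get', idParam: 'eid' },
};

// Hypothetical helper: turns a command name and an id into a request descriptor.
function buildRequest(command, id, options = {}) {
  const endpoint = ENDPOINTS[command];
  if (!endpoint) throw new Error(`Unknown command: ${command}`);

  if (endpoint.method === 'GET') {
    // URLSearchParams handles escaping of the id value.
    const query = new URLSearchParams({ [endpoint.idParam]: id });
    return { method: 'GET', path: `${endpoint.path}?${query}` };
  }

  // ASSUMPTION: the body field names are illustrative only.
  // The default of 20 matches the new podcast-episodes default limit.
  return {
    method: 'POST',
    path: endpoint.path,
    body: { pid: id, limit: options.limit ?? 20 },
  };
}
```

Note that episode and download resolve to the same endpoint, so both commands depend on a single response shape.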

Other changes:

  • Strategy changed from PUBLIC to LOCAL (requires ~/.opencli/xiaoyuzhou.json credentials)
  • Removed fetchPageProps() from utils.js (no longer used)
  • Updated tests: download.test.js (mocks migrated), utils.test.js (removed fetchPageProps tests)
  • podcast-episodes: removed "(up to 15, SSR limit)" description, default limit changed from 15 to 20
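Since the LOCAL strategy depends on ~/.opencli/xiaoyuzhou.json, a missing or unreadable file becomes a configuration error rather than a fetch error. A minimal sketch of that check follows. ConfigError and parseCredentials are hypothetical names, and the schema of the credentials file is not shown in this PR, so only "is it a JSON object" is validated. The CONFIG_ERROR code and exit code 78 come from the follow-up commit message.

```javascript
// Hypothetical error type. The project's real error classes are not shown here.
class ConfigError extends Error {
  constructor(message) {
    super(message);
    this.name = 'ConfigError';
    this.code = 'CONFIG_ERROR';
    this.exitCode = 78;
  }
}

// Validate the raw contents of the credentials file.
// `raw` is the file text, or null/undefined when the file does not exist.
function parseCredentials(raw, sourcePath) {
  if (raw == null) {
    throw new ConfigError(`Missing credentials file: ${sourcePath}`);
  }
  let parsed;
  try {
    parsed = JSON.parse(raw);
  } catch {
    throw new ConfigError(`Credentials file is not valid JSON: ${sourcePath}`);
  }
  if (parsed === null || typeof parsed !== 'object' || Array.isArray(parsed)) {
    throw new ConfigError(`Credentials file must contain a JSON object: ${sourcePath}`);
  }
  return parsed;
}
```

Keeping the file read outside this function makes the failure modes easy to unit test without touching the real home directory.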

Testing

vitest run clis/xiaoyuzhou/

 ✓ adapter clis/xiaoyuzhou/utils.test.js (11 tests)
 ✓ adapter clis/xiaoyuzhou/auth.test.js (5 tests)
 ✓ adapter clis/xiaoyuzhou/download.test.js (3 tests)
 ✓ adapter clis/xiaoyuzhou/transcript.test.js (5 tests)

Test Files  4 passed (4)
     Tests  24 passed (24)

kagura-agent and others added 3 commits April 16, 2026 20:38
fixes jackwener#1023)

Xiaoyuzhou removed SSR rendering — /podcast/<id> and /episode/<id> pages
now return 404, breaking fetchPageProps() which scraped __NEXT_DATA__.

Migrate podcast, podcast-episodes, episode, and download commands to use
the existing authenticated API client (requestXiaoyuzhouJson) that
transcript.js already uses successfully.

Changes:
- podcast.js: use /v1/podcast/get API endpoint
- podcast-episodes.js: use /v1/podcast/listEpisode API endpoint
- episode.js: use /v1/episode/get API endpoint
- download.js: use /v1/episode/get API endpoint
- utils.js: remove unused fetchPageProps, keep format helpers
- Update all affected tests (download.test.js, utils.test.js)
- Change strategy from PUBLIC to LOCAL (requires credentials)
@jackwener jackwener merged commit ab44d9f into jackwener:main Apr 16, 2026
11 checks passed
jackwener added a commit that referenced this pull request Apr 17, 2026
PR #1059 migrated xiaoyuzhou from SSR scraping to authenticated API.
The E2E tests run without credentials, producing exit code 78
(CONFIG_ERROR). The existing `isExpectedChineseSiteRestriction` guard
only caught FETCH_ERROR, PARSE_ERROR, and NOT_FOUND — not config
errors from missing auth credentials.
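The guard change described in this commit amounts to widening the set of tolerated error codes. A rough sketch, assuming the guard receives an object with code and exitCode fields: the real isExpectedChineseSiteRestriction signature is not shown on this page, so isExpectedRestrictionSketch and its input shape are hypothetical.

```javascript
// Codes the guard already tolerated, per the commit message.
const PREVIOUSLY_EXPECTED = ['FETCH_ERROR', 'PARSE_ERROR', 'NOT_FOUND'];

// Code added so credential-less E2E runs are not reported as failures.
const EXPECTED_CODES = new Set([...PREVIOUSLY_EXPECTED, 'CONFIG_ERROR']);

// Exit code reported for CONFIG_ERROR in the commit message.
const CONFIG_ERROR_EXIT_CODE = 78;

// Hypothetical guard: returns true when a failed run should be treated
// as an expected environment restriction instead of a test failure.
function isExpectedRestrictionSketch(result) {
  if (result.exitCode === 0) return false; // success is not a restriction
  if (result.exitCode === CONFIG_ERROR_EXIT_CODE) return true;
  return EXPECTED_CODES.has(result.code);
}
```

One trade-off of tolerating CONFIG_ERROR in E2E is that a genuinely broken credentials path would also be skipped, so the unit tests remain the main coverage for the authenticated commands.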
Development

Successfully merging this pull request may close these issues.

xiaoyuzhou: podcast/podcast-episodes commands return 404 (ID format broken)

2 participants