Add Bilibili extractor and transcript extraction support #270

Open
JayMeDotDot wants to merge 1 commit into kepano:main from JayMeDotDot:bilibili-extractor

Conversation

@JayMeDotDot

Related Issue

Closes #268

Changes

  • New Bilibili Extractor: Implemented BilibiliExtractor (src/extractors/bilibili.ts), which calls Bilibili's API to fetch video data and subtitles/transcripts, and groups transcript lines the same way the YouTube extractor does.
  • Extractor Registry: Registered BilibiliExtractor in src/extractor-registry.ts to trigger on bilibili.com URLs.
  • Markdown Rendering: Updated src/markdown.ts to ensure bilibili.com and bilibili.tv iframe embeds are retained in the final generated Markdown.
  • Shared Transcript Logic: Exported transcript parsing constants (e.g., SENTENCE_END, TRANSCRIPT_GROUP_GAP_SECONDS, MID_TEXT_SENTENCE_BOUNDARY) from src/extractors/youtube.ts to reuse them within the Bilibili extractor.
  • Type Definitions: Extracted and centralized the TranscriptResult interface in src/types/extractors.ts.
  • Lockfile Updates: Minor updates to package-lock.json resolving some peer dependency flags.
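
To illustrate the kind of transcript grouping described in the changes above, here is a minimal sketch, not the code from this PR. The `Segment` and `TranscriptGroup` shapes, the `groupSegments` helper, the 4-second gap, and the sentence-end pattern are all assumptions made for the example; the real constants live in src/extractors/youtube.ts and may differ in both value and how they are applied.

```typescript
// Hypothetical shapes for illustration; the PR's TranscriptResult may differ.
interface Segment {
  start: number; // seconds
  end: number;   // seconds
  text: string;
}

interface TranscriptGroup {
  start: number; // start time of the first segment in the group
  text: string;
}

// Assumed values, standing in for TRANSCRIPT_GROUP_GAP_SECONDS and SENTENCE_END.
const GROUP_GAP_SECONDS = 4;
const SENTENCE_END_RE = /[.!?。！？]["'”’)\]]*$/;

// Merge consecutive subtitle segments into paragraph-like groups.
// Assumed rule: start a new group after a long pause, or once the
// current group already ends with sentence-final punctuation.
function groupSegments(
  segments: Segment[],
  gapSeconds: number = GROUP_GAP_SECONDS
): TranscriptGroup[] {
  const groups: TranscriptGroup[] = [];
  let prevEnd = 0;

  for (const seg of segments) {
    const text = seg.text.trim();
    if (text.length === 0) continue; // skip empty subtitle lines

    const last = groups[groups.length - 1];
    const longPause = seg.start - prevEnd > gapSeconds;

    if (last === undefined || longPause || SENTENCE_END_RE.test(last.text)) {
      groups.push({ start: seg.start, text });
    } else {
      // Joining with a space suits Latin scripts; CJK text may prefer no separator.
      last.text += " " + text;
    }
    prevEnd = seg.end;
  }
  return groups;
}

const sample: Segment[] = [
  { start: 0, end: 2, text: "Hello and" },
  { start: 2, end: 4, text: "welcome." },
  { start: 4.5, end: 6, text: "Today we" },
  { start: 12, end: 14, text: "continue" },
];
const grouped = groupSegments(sample);
```

With the sample input, the first two segments merge into one group, the sentence end starts a second group, and the six-second pause before the last segment starts a third.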

Related Doc


Development

Successfully merging this pull request may close these issues.

Feature Request: Support for Bilibili (bilibili.com) URLs
