VRChat Media Gateway is a Windows-first toolkit for turning web, Telegram, Spotify, and local media into VRChat-friendly HLS streams. It includes a Chromium browser extension, a Spicetify bridge for Spotify Desktop, a FastAPI backend, websocket RPC for Spotify control, and an HLS segment engine for live Spotify capture.
- Browser extension in extension/ with:
  - YouTube watch-page button
  - SoundCloud track-page button
  - Telegram Web context-menu export for images, stickers, animated GIF posts, videos, music tracks, voice messages, and post-text variants
  - Spotify Web player buttons
  - Generic image context-menu export on any site, with animated GIFs treated as motion video
  - Generic audio context-menu export on any site when the page exposes a direct audio URL
  - Generic video context-menu export on any site when the page exposes a direct video URL
  - Quick-link popup for URLs, pasted local paths, and dropped/selected local files
  - Long-running popup builds keep running in the extension background and resume when the popup is reopened
  - Settings popup with local/public endpoint handling and browser HLS segment duration control
- Spotify Desktop bridge in spotify_extension/ for Spicetify:
  - VRChat button
  - Clear cache button
  - Settings button
  - Streaming settings section for Spotify HLS prefetch tuning and optional cover/title/artist poster output
  - Restore audio output action
- FastAPI backend in api/
- Websocket RPC endpoint for Spotify control at /api/ws/spotify
- HLS segment engine for live Spotify capture in api/segments_engine/
- nginx reverse proxy and static HLS serving through main.conf
- Optional Cloudflare Tunnel exposure for a public HTTPS URL
| Source | How it works now |
|---|---|
| YouTube | Browser button or direct API call downloads via yt-dlp, uses Node-based JS challenge solving, then converts to HLS |
| SoundCloud | Browser button appears only on track pages, preserves secret/private links when possible, downloads with yt-dlp, then converts to poster-style HLS with artwork, title, performer, and duration when SoundCloud metadata exposes them |
| Telegram images and static stickers | Telegram endpoints classify the Telegram post first, then use Telethon for real Telegram photos and static .webp stickers, with HTML image parsing only as a photo fallback |
| Telegram videos, animated GIFs, and motion stickers | Telegram Web context menu and direct API calls can auto-detect videos, animated GIF posts, video stickers, and animated .tgs stickers, download them through Telethon, then convert them to HLS |
| Telegram music tracks and voice messages | Telegram Web context menu and direct API calls can auto-detect Telegram audio documents and voice notes, download them through Telethon, then convert them to poster-style audio HLS with cover/title/performer when Telegram exposes that metadata |
| Telegram post text | Telegram Web can export VRChat with post text, which renders the post text on a black panel below the media; pasted text-only t.me/... quick links also build a text-only HLS stream automatically |
| Generic images | Right-click any still image on most sites and export it to a static HLS video |
| Generic audio | Right-click a site audio element with a direct media URL and export it to poster-style HLS; embedded tags and cover art are used when the source file exposes them |
| Generic videos | Right-click a site video with a direct media URL and export it to HLS |
| Spotify Web | Injected buttons in open.spotify.com generate /api/stream-spotify links and allow cache clearing |
| Spotify Desktop | Spicetify bridge talks to the backend over websocket, routes Spotify audio into a virtual cable, and writes live HLS segments; optional poster mode shows the current track cover, title, artist, and duration |
| Local media in popup | The browser extension popup can build HLS links from selected or dropped local image/video/audio files, or from pasted absolute local paths when the API can see that file on the same machine; local audio exports also use embedded tags and cover art when available, and large files prefer direct filesystem handles when Chromium exposes them |
| Legacy local files | Manual flow through run convertion.bat using files from input/ |
- The full project is designed around Windows.
- The documented scripts are .bat files.
- Spotify capture uses winappaudiorouter, Spotify.exe, and DirectShow audio capture.
- The Spotify path expects VB-Audio Virtual Cable device names by default.
- YouTube, SoundCloud, image, and Telegram conversion logic is less OS-specific, but the repo setup and helper scripts still assume Windows.
Install these tools and make sure they are available in PATH:
- Python 3.10+
- ffmpeg
- yt-dlp
- node
- nginx
- cloudflared if you want public HTTPS URLs
Optional but required for specific features:
- Telegram API credentials and a logged-in Telegram session for Telegram media
- Spotify Desktop
- Spicetify CLI
- VB-Audio Virtual Cable
Install Python dependencies:
pip install fastapi "uvicorn[standard]" requests python-dotenv telethon "qrcode[pil]" psutil winappaudiorouter pillow rlottie_python

rlottie_python is used for Telegram animated .tgs sticker export.
The normal runtime shape is:
- Browser extension or Spicetify bridge triggers an /api/... endpoint, or a loopback-only /local-api/... endpoint for local file ingestion.
- nginx listens on http://127.0.0.1:8080, proxies /api/* and /api/ws/* to FastAPI on 127.0.0.1:5000, and serves html/streams/.
- FastAPI downloads, uploads, or captures media and writes HLS output into html/streams/<sid>/.
- VRChat receives the public HTTPS URL from Cloudflare Tunnel, or a local URL for testing.
Important port split:
- FastAPI listens on 127.0.0.1:5000
- nginx listens on 127.0.0.1:8080
- The browser extension and Spotify websocket bridge should point at 8080, not 5000
- Exception: the browser extension popup talks to 127.0.0.1:5000/local-api/* directly for local file uploads and local filesystem paths, so those routes never go through nginx or the public tunnel
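The port split above can be sketched as a small routing helper. This is only an illustration of the rule, not code from the repository: the names `local_base_for` and `local_url` are hypothetical, and the only facts taken from the text are the two ports and the path prefixes.

```python
from urllib.parse import urlencode

NGINX_BASE = "http://127.0.0.1:8080"    # nginx: /api/*, /api/ws/*, /streams/*
FASTAPI_BASE = "http://127.0.0.1:5000"  # FastAPI directly: /local-api/* only


def local_base_for(path: str) -> str:
    """Pick the loopback base URL for a request path (hypothetical helper)."""
    if path.startswith("/local-api/"):
        # Loopback-only routes bypass nginx and the public tunnel.
        return FASTAPI_BASE
    if path.startswith(("/api/", "/streams/")):
        return NGINX_BASE
    raise ValueError(f"unexpected path: {path}")


def local_url(path: str, **params: str) -> str:
    """Build a full loopback URL with optional query parameters."""
    query = f"?{urlencode(params)}" if params else ""
    return f"{local_base_for(path)}{path}{query}"
```

For example, `local_url("/api/tunnel")` resolves to port 8080, while any `/local-api/` path resolves to port 5000.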
main.conf already contains the expected layout:
- /api/ -> FastAPI
- /api/ws/ -> websocket proxy for Spotify RPC
- /streams/ -> public HLS output from html/streams/
Start nginx with:
.\run server.bat

FastAPI is in api/. For most features there is no required backend config besides installed binaries.
For Telegram support:
- Copy api/.env.sample to api/.env
- Fill in:

TG_API_ID=...
TG_API_HASH=...
TG_PASSWORD=
TG_SESSION=tg_session.session

- Generate a Telegram session:

python auxillary/get_tg_session.py

The QR login helper writes the session to the path from TG_SESSION. By default that ends up in the repository root as tg_session.session.
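Before running the login helper, it can help to confirm that api/.env is filled in. The following is a minimal standalone sketch, not part of the repository: `parse_env` and `missing_keys` are hypothetical names, and treating TG_PASSWORD as optional is an assumption based on it being left empty in the sample above.

```python
from typing import Dict, List

# Assumption: TG_PASSWORD may stay empty, as in the sample .env above.
REQUIRED_KEYS = ("TG_API_ID", "TG_API_HASH", "TG_SESSION")


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values: Dict[str, str]) -> List[str]:
    """Return required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

A typical use would be `missing_keys(parse_env(Path("api/.env").read_text()))`, which returns an empty list when all three required values are present.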
If Telegram login is acting up, inspect the session with:
python auxillary/telethon_check.py

yt-dlp is invoked with Node-based JS challenge solving. Keep both yt-dlp and Node available in PATH.
If you need age-restricted or authenticated downloads:
- export cookies to cookies.txt in the repository root
- do not commit that file
If YouTube starts failing because of extractor changes, update yt-dlp with:
.\update yt-dlp.bat

Load extension/ as an unpacked Chromium extension.
The popup supports:
- Use public URL (tunnel)
- Use local API for processing
- local host and port
- public tunnel URL auto-detection or manual override
- local file choose/drop for image, video, and audio
- pasted absolute local filesystem paths such as C:\media\clip.mp4 or file:///C:/media/clip.mp4
- Clear all cache button for wiping generated streams and temporary output from the local machine
Use local API for processing means:
- requests are sent to the local stack on 127.0.0.1:8080
- the copied result still uses the public Cloudflare Tunnel URL
That mode is useful when the tunnel is public but you want all downloading and transcoding to happen locally.
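The split between where a build is requested and which link gets copied can be sketched as follows. This is an illustration only: `plan_build` is a hypothetical name, the behavior with the option turned off is an assumption, and the /streams/<sid>/index.m3u8 shape is taken from the stream URLs described elsewhere in this README.

```python
from typing import Tuple

LOCAL_STACK = "http://127.0.0.1:8080"  # nginx in front of FastAPI


def plan_build(public_base: str, sid: str, use_local_api: bool) -> Tuple[str, str]:
    """Return (request_base, copied_link) for one popup build."""
    public_base = public_base.rstrip("/")
    # With "Use local API for processing" on, requests stay on the local stack.
    # Assumption: with it off, requests go to the public base instead.
    request_base = LOCAL_STACK if use_local_api else public_base
    # The link handed to VRChat always uses the public tunnel URL.
    copied_link = f"{public_base}/streams/{sid}/index.m3u8"
    return request_base, copied_link
```

With the option enabled, downloading and transcoding traffic never leaves the machine, while the copied link still points at the tunnel.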
Local media security model:
- local file uploads and local path builds use loopback-only FastAPI routes under /local-api/*
- those routes are intentionally not proxied by nginx and should not be exposed through the tunnel
- pasted local paths only work when FastAPI is running on the same machine and can read that path directly
- file picker and drag-and-drop prefer the browser File System Access handle when Chromium exposes it, and fall back to local upload otherwise
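Both pasted forms, a plain Windows path and a file:/// URL, have to end up as one absolute path the backend can open. The sketch below shows one way to normalize them with the standard library. It is not the project's implementation: `normalize_local_path` is a hypothetical name, and UNC or network paths are deliberately rejected here for simplicity.

```python
from pathlib import PureWindowsPath
from urllib.parse import unquote, urlparse


def normalize_local_path(pasted: str) -> str:
    """Turn a pasted Windows path or file:/// URL into an absolute Windows path."""
    text = pasted.strip().strip('"')
    if text.lower().startswith("file://"):
        path = unquote(urlparse(text).path)
        # file:///C:/media/clip.mp4 parses to "/C:/media/clip.mp4";
        # drop the leading slash in front of the drive letter.
        if len(path) >= 3 and path[0] == "/" and path[2] == ":":
            path = path[1:]
        text = path
    candidate = PureWindowsPath(text)
    if not candidate.is_absolute():
        raise ValueError(f"not an absolute local path: {pasted!r}")
    return str(candidate)
```

PureWindowsPath is used so that the check behaves the same on any OS; relative paths such as clip.mp4 raise ValueError instead of being resolved against an unknown working directory.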
Spotify Desktop streaming is separate from Spotify Web link generation.
Required pieces:
- Spotify Desktop installed
- Spicetify CLI installed and working
- VB-Audio Virtual Cable installed
The defaults in api/config.py expect these device names:
- CABLE Input (VB-Audio Virtual Cable)
- CABLE Output (VB-Audio Virtual Cable)
If your device names differ, change them in api/config.py.
Install the Spicetify bridge:
.\install_stream_bridge.bat

After installation, start Spotify. The bridge will connect to:
ws://127.0.0.1:8080/api/ws/spotify
The Spotify Desktop path works like this:
- Backend asks the Spicetify bridge to load track metadata.
- Bridge controls Spotify playback over websocket RPC.
- Backend routes Spotify output to the configured virtual cable.
- ffmpeg captures from the cable and writes HLS segments.
- When the track ends, the playlist is finalized and converted into a replayable VOD-style result.
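The capture step can be pictured as an ffmpeg invocation that reads the virtual cable through DirectShow and writes an HLS playlist. The sketch below only builds the argument list. The flags are standard ffmpeg options, but the codec, bitrate, segment length, and output path are illustrative assumptions; the actual command used by api/segments_engine/ is not documented here and may differ.

```python
from typing import List


def build_capture_command(device: str, playlist_path: str, segment_time: int = 2) -> List[str]:
    """Build a hypothetical ffmpeg command: DirectShow audio in, HLS out."""
    return [
        "ffmpeg",
        "-f", "dshow",                  # Windows DirectShow capture
        "-i", f"audio={device}",        # recording side of the virtual cable
        "-c:a", "aac",                  # assumption: AAC audio
        "-b:a", "192k",                 # assumption: 192 kbit/s
        "-f", "hls",
        "-hls_time", str(segment_time),  # target segment length in seconds
        "-hls_list_size", "0",          # keep every segment in the playlist
        playlist_path,
    ]
```

Passing the list to `subprocess.run` avoids shell quoting problems with the spaces and parentheses in the device name, for example `build_capture_command("CABLE Output (VB-Audio Virtual Cable)", "html/streams/demo/index.m3u8")`.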
Spicetify settings now also include a Streaming settings button. That section stores:
- prefetch segments count
- prefetch segment duration in seconds
- a read-only total prefetch duration field
- Show track cover/title/artist checkbox for poster-style Spotify output
The VRChat button appends those values to Spotify links as:
/api/stream-spotify?url=<spotify-track-url>&segment_time=<seconds>&prefetch=<count>&show_info=<0|1>
If any of the segment_time, prefetch, or show_info query parameters is omitted, the backend falls back to the defaults from SPOTIFY_HLS_OPTS in api/config.py.
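A link in that shape can be assembled with the standard library. This is a minimal sketch rather than the bridge's own code: `spotify_stream_url` is a hypothetical name, and leaving a parameter as None simply omits it so the backend default applies.

```python
from typing import Dict, Optional
from urllib.parse import urlencode


def spotify_stream_url(
    base: str,
    track_url: str,
    segment_time: Optional[int] = None,
    prefetch: Optional[int] = None,
    show_info: Optional[bool] = None,
) -> str:
    """Build an /api/stream-spotify link, omitting unset optional parameters."""
    params: Dict[str, str] = {"url": track_url}
    if segment_time is not None:
        params["segment_time"] = str(segment_time)
    if prefetch is not None:
        params["prefetch"] = str(prefetch)
    if show_info is not None:
        params["show_info"] = "1" if show_info else "0"
    # urlencode percent-encodes the Spotify URL so it survives as one value.
    return f"{base.rstrip('/')}/api/stream-spotify?{urlencode(params)}"
```

Calling it with only `base` and `track_url` produces a link with a single url parameter, which relies entirely on the backend defaults.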
Recommended full-stack command:
.\run stream server.bat

That starts:
- nginx on :8080
- FastAPI on :5000
- a quick Cloudflare Tunnel with logs written to logs/cloudflared.log
You can also start parts separately:
.\run server.bat
.\run api.bat
.\run tunnel.bat

Notes:
- For the complete feature set, prefer run stream server.bat.
- The Spotify websocket registry is in-memory, so single-process operation is the safe path for Spotify features.
- run api.bat is mainly useful for direct HTTP testing and non-Spotify flows.
Site-specific behavior on the current branch:
- YouTube: injects a VRChat button only on watch pages and survives SPA navigation
- SoundCloud: injects only on real track pages, not artist/profile tabs; exported tracks now build poster-style audio HLS with artwork/title/performer when available
- Telegram Web: adds VRChat and VRChat with post text entries to the message context menu, auto-detects images/stickers/videos/animated GIF posts/music tracks/voice messages, and hides the plain VRChat action on text-only posts
- Spotify Web: adds VRChat and Clear cache buttons near the player controls
- Generic images: adds a VRChat item to the browser image context menu; animated GIF URLs are routed through the video exporter
- Generic audio: adds a VRChat item to the browser audio context menu for direct audio URLs and renders poster-style audio HLS when tags/cover art are available
- Generic videos: adds a VRChat item to the browser video context menu for direct video URLs
- Popup quick-link: can build ready links from remote URLs, selected/dropped local media, and pasted absolute local paths
Managed popup flows now wait for the stream to be ready, keep running if the popup closes, and usually copy the final /streams/<sid>/index.m3u8 URL instead of a still-building /api/stream-* URL.
The browser extension settings popup now also includes a Streaming settings section with:
- segment duration in seconds for non-Spotify builds
That value is appended as segment_time=<seconds> for YouTube, SoundCloud, Telegram, generic image/audio/video, and local media builds. If it is omitted, the backend falls back to the default value from api/config.py HLS_OPTS.
Main HTTP endpoints:
- GET /api/stream-yt?url=<youtube-url>
- GET /api/stream-sc?url=<soundcloud-url>
- GET /api/stream-image?url=<image-url>&duration=300&width=1280&height=720
- GET /api/stream-audio?url=<direct-audio-url>&referer=<page-url>
- GET /api/stream-video?url=<direct-video-url>&referer=<page-url>
- GET /api/stream-tg-media?url=<telegram-post-url>
- GET /api/stream-tg-image?url=<telegram-post-url>
- GET /api/stream-tg-video?url=<telegram-post-url>
- GET /api/stream-tg-audio?url=<telegram-post-url>
- GET /api/tg-post-info?url=<telegram-post-url>
- GET /api/stream-spotify?url=<spotify-track-url>&segment_time=<seconds>&prefetch=<count>&show_info=<0|1>
- POST /api/stream-spotify-clear?url=<spotify-track-url>&segment_time=<seconds>&prefetch=<count>&show_info=<0|1>
- GET /api/tunnel
Local-only endpoints:
- POST /local-api/stream-local-path-build-start
- POST /local-api/stream-local-upload-build-start
- GET /local-api/stream-local-build-status?job_id=<job-id>
- POST /local-api/clear-cache-all
Spotify-specific delivery endpoints:
- GET /api/stream-spotify-playlist/{sid}
- GET /api/stream-spotify-segment/{sid}/{filename}
- WS /api/ws/spotify
Behavior notes:
- Most VOD endpoints build the stream on first request and then serve cached HLS from html/streams/<sid>/
- Non-Spotify VOD endpoints can also accept segment_time=<seconds> and use that value in both HLS generation and cache keys
- Telegram media endpoints also accept with_text=1 to render the Telegram post text on a black panel under the media; text-only Telegram posts can build a text-only HLS output through the same flow
- Telegram sticker posts are supported too: static .webp stickers go through the image path, while video/webm and animated .tgs stickers go through the motion/video path
- Telegram music tracks and voice messages are supported too: they go through the Telegram audio path and build poster-style audio HLS output with cover/title/performer when Telegram exposes that metadata
- SoundCloud, generic audio, and local audio exports also build poster-style audio HLS output with cover/title/performer when the source metadata exposes them
- Spotify is segment-driven and uses the websocket bridge for metadata, seeking, playback start, cache clearing, and audio restoration; show_info=1 enables poster-style cover/title/artist output on the live Spotify stream
- Managed popup flows usually resolve to the final /streams/<sid>/index.m3u8 link after the build is ready
- Local media ingestion uses /local-api/* only on loopback; the final playback URL is still served from /streams/...
- Animated GIFs are treated as motion media and end up on the video/HLS path instead of the still-image path
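Because segment_time is part of the cache key, the same source built with two segment durations ends up as two separate cached streams. The sketch below shows the general idea with a hash-based identifier. It is purely illustrative: how the backend actually derives <sid> is not described in this README, and `stream_id` and its inputs are hypothetical.

```python
import hashlib
from typing import Optional


def stream_id(kind: str, source_url: str, segment_time: Optional[int] = None) -> str:
    """Derive a stable, hypothetical cache key from the build inputs."""
    # Every input that changes the generated HLS output belongs in the key,
    # so a different segment_time never reuses an incompatible cached stream.
    parts = [kind, source_url, f"segment_time={segment_time}"]
    digest = hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()
    return digest[:16]
```

The same inputs always map to the same identifier, which is what lets a second request be served from html/streams/<sid>/ without rebuilding.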
run convertion.bat still works for manual local testing:
- place files in input/
- converted output is written to output/
- the latest stream is copied into html/ for nginx to serve
This path is now the legacy/manual mode. The browser and API-driven flows are the main path.
- run stream server.bat: nginx + API + Cloudflare Tunnel
- run server.bat: nginx only
- run api.bat: FastAPI only
- run tunnel.bat: Cloudflare Tunnel only
- run convertion.bat: local file to HLS conversion
- install_stream_bridge.bat: install Spicetify bridge
- clear_cache.bat: wipe cached output and generated streams
- update yt-dlp.bat: update yt-dlp
- get telegram session.bat: launch Telegram QR login helper

Tunnel URL issues:
- Make sure cloudflared is running
- Make sure logs/cloudflared.log is being written
- The browser extension and Spicetify bridge read the latest https://*.trycloudflare.com URL from that log through /api/tunnel
- If needed, set the public URL manually in the popup/settings UI
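Finding the latest tunnel URL in the log comes down to a pattern match over the file contents. The sketch below is a standalone illustration, not the code behind /api/tunnel: `latest_tunnel_url` is a hypothetical name, and the exact layout of cloudflared log lines is not assumed beyond the URL itself.

```python
import re
from typing import Optional

# Quick-tunnel hostnames are a single label under trycloudflare.com.
TUNNEL_URL = re.compile(r"https://[a-z0-9-]+\.trycloudflare\.com", re.IGNORECASE)


def latest_tunnel_url(log_text: str) -> Optional[str]:
    """Return the last trycloudflare.com URL in the log text, or None."""
    matches = TUNNEL_URL.findall(log_text)
    # A restarted tunnel gets a new hostname, so the last match wins.
    return matches[-1] if matches else None
```

Running it over the contents of logs/cloudflared.log is a quick way to check whether a URL has been written at all before debugging the extension settings.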

YouTube download issues:
- Confirm node is installed and available in PATH
- Update yt-dlp with update yt-dlp.bat
- Add cookies.txt for age-restricted or logged-in content

Telegram issues:
- Verify api/.env
- Regenerate the Telegram session
- Test the session with python auxillary/telethon_check.py
- The image endpoint uses Telethon first and HTML parsing second; if both fail, the post is likely inaccessible from the current session

Spotify Desktop issues:
- Start nginx before opening Spotify so websocket proxying exists on port 8080
- Confirm the Spicetify bridge was installed with install_stream_bridge.bat
- Confirm VB-Cable device names match api/config.py
- If Spotify audio stays routed incorrectly after a failure, use the Restore audio output button in the Spicetify settings modal

SoundCloud issues:
- Use the extension on the actual track page
- The new SoundCloud logic tries to capture the secret/private share URL instead of only the public permalink
- Keep cookies.txt available if the backend needs authenticated access

Local media in the popup:
- Start the FastAPI backend locally; the popup needs direct access to 127.0.0.1:5000 for /local-api/*
- Pasted local paths must be absolute paths on the same machine as the backend
- Large picked or dropped files use filesystem handles when Chromium supports them; otherwise the popup falls back to local upload and that extra local copy can take time
- Only local image, video, and audio formats are accepted
- Animated GIFs are exported as motion video instead of a static image loop

Cache clearing:
- Use the Spotify Clear cache button for per-track resets
- Use the popup Clear all cache button for a loopback-only wipe of generated streams and temporary output
- Use clear_cache.bat to wipe generated media under output/ and html/streams/

api/ FastAPI app, routers, websocket RPC, segment engine
extension/ Browser extension for YouTube, SoundCloud, Telegram, Spotify Web, images, and local media popup export
spotify_extension/ Spicetify bridge for Spotify Desktop
html/streams/ Generated HLS output served by nginx
input/ Manual local-file input for legacy conversion mode
output/ Temporary downloaded media, upload cache, and conversion artifacts
logs/ cloudflared log and runtime logs