Scrape YouTube search results, channels, transcripts, and playlists — no API key needed.
Built on YouTube's internal InnerTube API. Three interfaces: Python SDK, CLI, and REST API.
- Search with filters — type, duration, upload date, sort order, features (4K, HDR, live, CC)
- Channel browsing — videos, shorts, playlists, and in-channel search
- Transcripts — fetch, translate, format (SRT / WebVTT / JSON), save to file
- Playlists — full pagination with position tracking and metadata
- Async-first — every method has an
asyncvariant for concurrent workloads - JSON-ready — every result object has
.to_dict()for instant serialization - Proxy rotation — built-in support for residential proxy lists
- Three interfaces — Python SDK, CLI with Rich tables, REST API with Swagger docs
- Zero config — no API key, no OAuth, no setup
- Lightweight — only
httpxas a core dependency
recommended to install using pip
You can also integrate it into an existing project or use it via a CLI.
# Core SDK only
pip install tubescrape
# With CLI (adds click + rich)
pip install "tubescrape[cli]"
# With REST API server (adds fastapi + uvicorn)
pip install "tubescrape[api]"
# Everything
pip install "tubescrape[all]"Requirements: Python 3.10+
import json
from tubescrape import YouTube
with YouTube() as yt:
# Search YouTube
results = yt.search('python tutorial', max_results=5)
for video in results.videos:
print(f'{video.title} — {video.duration} — {video.view_count}')
# Browse a channel
channel = yt.get_channel_videos('@lexfridman', max_results=10)
for video in channel.videos:
print(f'{video.title} ({video.published_text})')
# Get a transcript and save as subtitles
transcript = yt.get_transcript('dQw4w9WgXcQ')
print(transcript.text)
transcript.save('subtitles.srt')
# Every result serializes to JSON instantly
data = results.to_dict()
print(json.dumps(data, indent=2))Every method accepts plain IDs, full URLs, or @handles — parsed automatically.
| tubescrape | youtube-transcript-api | pytube / pytubefix | yt-dlp | |
|---|---|---|---|---|
| Search videos | Yes | No | No | Limited |
| Channel browse | Yes | No | No | Yes |
| Transcripts | Yes | Yes | No | Yes |
| Playlists | Yes | No | Yes | Yes |
| Async support | Yes | No | No | No |
| Built-in REST API | Yes | No | No | No |
| CLI tool | Yes | No | No | Yes |
| Core dependencies | 1 (httpx) | 1 (requests) | 0 | Many |
| API key needed | No | No | No | No |
results = yt.search('python tutorial', max_results=5)
for video in results.videos:
print(f'{video.title} — {video.url}')
print(f' {video.view_count} | {video.duration} | {video.published_text}')
print(f' Channel: {video.channel} (verified: {video.is_verified})')All filters can be combined in a single call:
results = yt.search(
'podcast interview',
max_results=10,
type='video', # video | channel | playlist | movie
duration='long', # short (<4m) | medium (4-20m) | long (>20m)
upload_date='this_month', # last_hour | today | this_week | this_month | this_year
sort_by='view_count', # relevance | upload_date | view_count | rating
features=['hd', 'subtitles'], # live | 4k | hd | subtitles | cc | creative_commons | 360 | vr180 | 3d | hdr
)All channel methods accept @handle, channel ID (UC...), or full URL.
# Videos (newest first, with pagination)
videos = yt.get_channel_videos('@lexfridman', max_results=10)
all_videos = yt.get_channel_videos('@lexfridman', max_results=0) # fetch ALL
# Shorts
shorts = yt.get_channel_shorts('@lexfridman')
for short in shorts.shorts:
print(f'{short.title} — {short.view_count} — {short.url}')
# Playlists
playlists = yt.get_channel_playlists('@lexfridman')
for pl in playlists.playlists:
print(f'{pl.title} — {pl.video_count} — {pl.url}')
# Search within a channel
results = yt.search_channel('@lexfridman', 'artificial intelligence', max_results=10)# Fetch transcript (auto-detects best language)
transcript = yt.get_transcript('dQw4w9WgXcQ')
print(transcript.text) # full text as a single string
# Choose language (priority order fallback)
transcript = yt.get_transcript('dQw4w9WgXcQ', languages=['de', 'en'])
# Translate to any language
transcript = yt.get_transcript('dQw4w9WgXcQ', translate_to='es')
# Without timestamps (plain text blob)
transcript = yt.get_transcript('dQw4w9WgXcQ', timestamps=False)
# List available languages
languages = yt.list_transcripts('dQw4w9WgXcQ')
for entry in languages:
print(f'{entry.language} ({entry.language_code}) — {"auto" if entry.is_generated else "manual"}')transcript = yt.get_transcript('dQw4w9WgXcQ')
# Format as SRT, WebVTT, JSON, or plain text
srt = YouTube.format_transcript(transcript, fmt='srt')
vtt = YouTube.format_transcript(transcript, fmt='vtt')
# Save to file (format auto-detected from extension)
transcript.save('subtitles.srt')
transcript.save('subtitles.vtt')
transcript.save('transcript.json')
transcript.save('transcript.txt')# Accepts playlist ID or full URL
playlist = yt.get_playlist('PLrAXtmErZgOeiKm4sgNOknGvNjby9efdf')
print(f'{playlist.title} by {playlist.channel} — {len(playlist.videos)} videos')
for entry in playlist.videos:
print(f'#{entry.position} {entry.title} — {entry.duration}')Every result object converts to a plain Python dictionary — ready for JSON, databases, or APIs:
import json
# Works on every result type
results.to_dict() # SearchResult → dict
video.to_dict() # VideoResult → dict
channel.to_dict() # BrowseResult → dict
shorts.to_dict() # ShortsResult → dict
playlists.to_dict() # ChannelPlaylistsResult → dict
playlist.to_dict() # PlaylistResult → dict
transcript.to_dict() # Transcript → dict
# Sparse output: optional fields excluded when empty/default
# is_verified=False → omitted | badges=[] → omitted | None fields → omitted
print(json.dumps(results.to_dict(), indent=2, ensure_ascii=False))Every method has an async variant prefixed with a. Use in FastAPI, Discord bots, or any async application:
import asyncio
from tubescrape import YouTube
async def main():
async with YouTube() as yt:
# All methods have async variants
results = await yt.asearch('python', max_results=5)
transcript = await yt.aget_transcript('dQw4w9WgXcQ')
# Run multiple requests concurrently
r1, r2, r3 = await asyncio.gather(
yt.asearch('python'),
yt.asearch('javascript'),
yt.asearch('rust'),
)
asyncio.run(main())# Single proxy
yt = YouTube(proxy='http://user:pass@proxy.example.com:8080')
# Proxy rotation (round-robin per request)
yt = YouTube(proxies=[
'http://user:pass@proxy1:8080',
'http://user:pass@proxy2:8080',
])
# SOCKS5
yt = YouTube(proxy='socks5://user:pass@proxy:1080')
# Custom timeout and retries
yt = YouTube(proxy='http://proxy:8080', timeout=60.0, max_retries=5)Tip: YouTube blocks cloud IPs aggressively. Use rotating residential proxies (BrightData, SmartProxy, Oxylabs) for production.
Install with pip install "tubescrape[cli]".
tubescrape search "python tutorial" -n 5
tubescrape search "podcast" --type video --duration long --sort-by view_count
tubescrape search "python" --json # JSON output
tubescrape channel @lexfridman # videos (default)
tubescrape channel @lexfridman shorts # shorts
tubescrape channel @lexfridman playlists # playlists
tubescrape channel @lexfridman search "podcast" # search within channel
tubescrape playlist PLrAXtmErZgOeiKm4sgNOknGvNjby9efdf
tubescrape transcript dQw4w9WgXcQ # plain text
tubescrape transcript dQw4w9WgXcQ --format srt # SRT subtitles
tubescrape transcript dQw4w9WgXcQ --translate es # translate
tubescrape transcript dQw4w9WgXcQ --save output.srt # save to file
tubescrape transcript dQw4w9WgXcQ --list-languages # available languagestubescrape --proxy http://user:pass@host:port search "python" # with proxy
export TUBESCRAPE_PROXY="http://user:pass@host:port" # env variableInstall with pip install "tubescrape[api]".
tubescrape serve # starts on localhost:8000
tubescrape serve --host 0.0.0.0 --port 3000Interactive Swagger docs at http://localhost:8000/docs.
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/v1/search?q=python |
Search videos |
| GET | /api/v1/channel/{id}/videos |
Channel videos |
| GET | /api/v1/channel/{id}/shorts |
Channel shorts |
| GET | /api/v1/channel/{id}/playlists |
Channel playlists |
| GET | /api/v1/channel/{id}/search?q=... |
Search within channel |
| GET | /api/v1/playlist/{id} |
Fetch playlist |
| GET | /api/v1/transcript/{video_id} |
Fetch transcript |
| GET | /api/v1/transcript/{video_id}/languages |
List languages |
| GET | /health |
Health check |
curl "http://localhost:8000/api/v1/search?q=python+tutorial&max_results=5"
curl "http://localhost:8000/api/v1/transcript/dQw4w9WgXcQ?format=srt&translate_to=es"All exceptions inherit from YouTubeError:
YouTubeError
├── RequestError
│ ├── RateLimitError # HTTP 429
│ └── BotDetectedError # HTTP 403
├── VideoUnavailableError # private, deleted, region-locked
│ └── AgeRestrictedError
├── TranscriptsDisabledError
├── TranscriptsNotAvailableError
├── TranscriptFetchError
├── TranslationNotAvailableError
├── ChannelNotFoundError
├── PlaylistNotFoundError
├── APIKeyNotFoundError
└── ParsingError
from tubescrape import YouTube, YouTubeError, RateLimitError
try:
results = yt.search('python')
except RateLimitError:
print('Rate limited — use a proxy')
except YouTubeError as e:
print(f'YouTube error: {e}')For detailed examples, all field references, and advanced usage, see the Complete Usage Guide.
This library uses YouTube's undocumented InnerTube API. It may break if YouTube changes their internal API. If it does, please open an issue.
git clone https://github.com/zaidkx37/tubescrape.git
cd tubescrape
pip install -e ".[all,dev]"
pytest # run tests
ruff check src/ # lint
mypy src/tubescrape/ # type checkMIT License. See LICENSE for details.



