Automated monitoring of the top PyPI and npm packages for supply chain compromise. Polls both registries for new releases, diffs each release against its predecessor, and uses an LLM (via Cursor Agent CLI) to classify diffs as benign or malicious. Malicious findings trigger a Slack alert.
Both ecosystems are monitored by default. Use --no-pypi or --no-npm to disable one.
Each ecosystem runs its own polling thread but shares the analysis and alerting pipeline.
┌─── PyPI ──────────────────────┐ ┌─── npm ───────────────────────┐
│ │ │ │
│ changelog_since_serial() │ │ CouchDB _changes feed │
│ │ │ │ │ │
│ ▼ │ │ ▼ │
│ ┌────────────┐ │ │ ┌────────────┐ │
│ │ All PyPI │─┐ │ │ │ All npm │─┐ │
│ │ events │ │ │ │ │ changes │ │ │
│ └────────────┘ ▼ │ │ └────────────┘ ▼ │
│ hugovk ──► Watchlist │ │ download-counts ─► Watchlist │
│ │ │ │ │ │
│ "new release" events only │ │ new versions since last epoch │
└───────────────┬───────────────┘ └───────────────┬───────────────┘
│ │
▼ ▼
┌───────────────────┐ ┌───────────────────┐
│ Download old + new│ │ Download old + new│
│ (sdist + wheel) │ │ (tarball) │
└───────────────────┘ └───────────────────┘
│ │
└─────────────────┬─────────────────┘
▼
┌───────────────┐
│ Unified diff │
│ report (.md) │
└───────┬───────┘
▼
┌───────────────┐ ◄── LLM analysis
│ Cursor Agent │ (read-only)
│ CLI (ask mode)│
└───────┬───────┘
│
verdict?
│
malicious │
▼
┌───────────────┐
│ Slack alert │
└───────────────┘
The LLM analysis is prompted to look for:
- Obfuscated code (base64, exec, eval, XOR, encoded strings)
- Network calls to unexpected hosts
- File system writes to startup/persistence locations
- Process spawning and shell commands
- Steganography or data hiding in media files
- Credential and token exfiltration
- Typosquatting indicators
- Python 3.9+ — install runtime dependencies with
pip install -r requirements.txt(stdlib covers most of the tool;requestsis used for Slack uploads) - Cursor Agent CLI — the standalone
agentbinary, not the IDE
Windows (PowerShell):
irm 'https://cursor.com/install?win32=true' | iexmacOS / Linux:
curl https://cursor.com/install -fsS | bashVerify with:
agent --versionYou must be authenticated with Cursor (agent login or set CURSOR_API_KEY).
Place your Slack bot token in etc/slack.json:
{
"url": "https://hooks.slack.com/services/...",
"bot_token": "xoxb-...",
"channel": "C01XXXXXXXX"
}The bot needs chat:write scope on the target channel. The channel field is the Slack channel ID where alerts are posted.
# One-shot: analyze releases from the last ~10 minutes
python monitor.py --once
# Continuous: monitor top 1000 packages (both ecosystems), poll every 5 min
python monitor.py --top 1000 --interval 300
# Production: monitor top 15000, alert to Slack
python monitor.py --top 15000 --interval 300 --slack
# npm only, top 5000
python monitor.py --no-pypi --npm-top 5000
# PyPI only
python monitor.py --no-npm| File | Purpose |
|---|---|
monitor.py |
Main orchestrator — poll PyPI + npm, diff, analyze, alert (parallel threads) |
pypi_monitor.py |
Standalone PyPI changelog poller (used for exploration) |
package_diff.py |
Download and diff two versions of any PyPI or npm package |
analyze_diff.py |
Send a diff to Cursor Agent CLI, parse verdict |
top_pypi_packages.py |
Fetch and list top N PyPI packages by download count |
slack.py |
Slack API client (SendMessage, PostFile) |
etc/slack.json |
Slack bot credentials |
last_serial.yaml |
Persisted polling state (PyPI serial + npm sequence/epoch) |
logs/ |
Daily log files (monitor_YYYYMMDD.log) |
python monitor.py [OPTIONS]
Options:
--top N Number of top packages to watch per ecosystem (default: 15000)
--interval SECS Poll interval in seconds (default: 300)
--once Single pass over recent events, then exit
--slack Enable Slack alerts for malicious findings
--model MODEL Override LLM model (default: composer-2-fast)
--debug Enable DEBUG logging (includes agent raw output)
PyPI options:
--no-pypi Disable PyPI monitoring
--serial N PyPI changelog serial to start from
npm options:
--no-npm Disable npm monitoring
--npm-top N Top N npm packages to watch (default: same as --top)
--npm-seq N npm replication sequence to start from
PyPI and npm each run in their own polling thread. Polling state (PyPI serial, npm sequence + epoch) is persisted to last_serial.yaml so the monitor resumes where it left off after a restart.
PyPI pipeline:
- Loads the top N packages from the hugovk/top-pypi-packages dataset as a watchlist
- Connects to PyPI's XML-RPC API and gets the current serial number
- Every
--intervalseconds, callschangelog_since_serial()— a single API call that returns all events since the last check - Filters for
"new release"events matching the watchlist - For each new release: downloads old + new versions (sdist and wheel when both exist), diffs, analyzes via LLM, and alerts Slack if malicious
npm pipeline:
- Loads the top N packages from the download-counts dataset (falls back to npm search API)
- Reads the current CouchDB replication sequence from
replicate.npmjs.com - Every
--intervalseconds, fetches the_changesfeed for all registry changes since the last sequence - Filters changed packages against the watchlist and checks for versions published after the last poll epoch
- For each new release: downloads old + new tarballs from the npm registry, diffs, analyzes via LLM, and alerts Slack if malicious
All output is logged to both the console and logs/monitor_YYYYMMDD.log.
# Compare two versions from PyPI
python package_diff.py requests 2.31.0 2.32.0
# Compare two versions from npm
python package_diff.py --npm express 4.18.2 4.19.0
# Save to file
python package_diff.py telnyx 2.0.0 2.1.0 -o telnyx_diff.md
# Compare local archives
python package_diff.py --local old.tar.gz new.tar.gz -n mypackageDownloads are done directly via registry APIs (PyPI JSON API / npm registry), not pip or npm. This means:
- No pip/npm dependency for downloads
- Platform-agnostic — can download and diff Linux-only packages from Windows
- PyPI: prefers wheel (pure-Python when available), falls back to sdist
- npm: downloads tarballs directly from the registry
# Analyze a diff file
python analyze_diff.py telnyx_diff.md
# JSON output
python analyze_diff.py telnyx_diff.md --json
# Use a specific model
python analyze_diff.py telnyx_diff.md --model claude-4-opusRuns Cursor Agent CLI in --mode ask (read-only) with --trust. The agent reads the diff file and returns a structured verdict.
Exit codes: 0 = benign, 1 = malicious, 2 = unknown/error.
# See what's being released right now (last ~10 min)
python pypi_monitor.py --once --top 15000
# Continuous monitoring (console output only, no analysis)
python pypi_monitor.py --top 1000 --interval 120Useful for exploring PyPI release velocity or debugging the changelog API without running the full analysis pipeline.
# Print top 1000 packages
python top_pypi_packages.py# Use as a library
from top_pypi_packages import fetch_top_packages
packages = fetch_top_packages(top_n=500)
# [{"project": "boto3", "download_count": 1577565199}, ...]| Source | What | Rate Limits |
|---|---|---|
| hugovk/top-pypi-packages | Top 15,000 PyPI packages by 30-day downloads (monthly JSON) | None (static file) |
PyPI XML-RPC changelog_since_serial() |
Real-time PyPI event firehose | Deprecated but functional; 1 call per poll is fine |
| PyPI JSON API | Package metadata, version history, download URLs | Generous; used sparingly (1 call per release) |
| download-counts (nice-registry) | Monthly download counts for every npm package (counts.json) |
None (npm tarball) |
npm CouchDB replication _changes feed |
Real-time npm registry change stream | Public; paginated reads |
| npm registry API | Package packuments, tarball downloads | Generous; used sparingly |
The monitor makes 1 API call per poll interval per ecosystem (PyPI changelog / npm _changes), plus 2-3 calls per new release (version history + downloads). This is very lightweight.
When the monitor detects a malicious release, it posts to Slack:
PyPI:
🚨 Supply Chain Alert: telnyx 4.87.2
Rank: #5,481 of top PyPI packages
Verdict: MALICIOUS
PyPI: https://pypi.org/project/telnyx/4.87.2/
Analysis summary (truncated):
The changes to src/telnyx/_client.py implement obfuscated
download-decrypt-execute behavior and module-import side effects.
A _d() function decodes base64 strings, a massive _p blob contains
an exfiltration script that downloads a .wav file from
http://83.142.209.203:8080/ringtone.wav and extracts a hidden
payload via steganography...
npm:
🚨 Supply Chain Alert: axios 0.30.4
Rank: #42 of top npm packages
Verdict: MALICIOUS
npm: https://www.npmjs.com/package/axios/v/0.30.4
Analysis summary (truncated):
1. **Non-standard dependency** — The `dependencies` block includes `plain-crypto-js`. Published axios only depends on `follow-redirects`, `form-data`, and `proxy-from-env`. A fourth package whose name looks like a **`crypto-js`–style typosquat** is a classic sign of a tampered or fake package, not a normal axios release.
- Releases are analyzed sequentially within each ecosystem thread. During high release volume, there will be a processing backlog.
- Cursor Agent CLI required — analysis depends on an active Cursor subscription and the
agentCLI being authenticated. - Sandbox mode (filesystem isolation) is only available on macOS/Linux. On Windows, the agent runs in read-only
askmode but without OS-level sandboxing. - Watchlists are static — loaded once at startup from the hugovk (PyPI) and download-counts (npm) datasets. Restart to refresh.
- npm _changes gap protection — if the saved npm sequence falls more than 10,000 changes behind the registry head, the monitor resets to head to avoid a long catch-up. Releases during the gap are missed.
Logs are written to both stdout and logs/monitor_YYYYMMDD.log. A new file is created each day. Both ecosystems log to the same file, with npm lines prefixed [npm]. Example:
2026-03-27 12:01:15 [INFO] Fetching top 15,000 packages from hugovk dataset...
2026-03-27 12:01:16 [INFO] Watchlist loaded: 15,000 packages (dataset updated 2026-03-01 07:34:08)
2026-03-27 12:01:16 [INFO] Fetching top 15,000 npm packages from download-counts dataset...
2026-03-27 12:01:18 [INFO] npm watchlist loaded: 15,000 packages (download-counts 1.0.52)
2026-03-27 12:01:19 [INFO] [pypi] Starting serial: 35,542,068 (from last_serial.yaml) — polling every 300s
2026-03-27 12:01:19 [INFO] [npm] Starting seq: 42,817,503 (from last_serial.yaml) — polling every 300s
2026-03-27 12:06:18 [INFO] [pypi] 2 new watchlist releases detected (serial 35,542,068 -> 35,542,190)
2026-03-27 12:06:18 [INFO] [pypi] Processing fast-array-utils 1.4 (rank #8,231)...
2026-03-27 12:06:18 [INFO] [pypi] Diffing fast-array-utils 1.3 -> 1.4
2026-03-27 12:06:50 [INFO] [pypi] Analyzing diff for fast-array-utils...
2026-03-27 12:07:35 [INFO] [pypi] Verdict for fast-array-utils 1.4: BENIGN
2026-03-27 12:06:20 [INFO] [npm] 1 new watchlist releases detected (seq -> 42,817,612)
2026-03-27 12:06:20 [INFO] [npm] Processing axios 0.30.4 (rank #42)...
2026-03-27 12:06:21 [INFO] [npm] Diffing axios 0.30.3 -> 0.30.4
2026-03-27 12:07:01 [INFO] [npm] Analyzing diff for axios...
2026-03-27 12:07:45 [INFO] [npm] Verdict for axios 0.30.4: MALICIOUS
This project is licensed under the MIT License. Third-party data sources and notices are summarized in NOTICE.txt.
Contributions are welcome — see CONTRIBUTING.md. This repository follows the Contributor Covenant. Report security issues through SECURITY.md, not public issues.
Questions and discussion: Elastic community Slack.