A Claude Code skill for identifying Chinese contributors from GitHub repositories — for talent recruiting and sourcing.
Given a GitHub repo (or a topic keyword), GitHub Digger:
- Auto-detects repo type — community open source (PR mode) vs enterprise internal research repo (Commit mode)
- Scans contributors — from closed PRs or commit history depending on repo type
- Traces arXiv papers — for research repos, fetches the full paper author list to surface people who don't push code but own the architecture
- Enriches profiles — name, company, location, email from 4 sources (commit metadata, paper footnotes, GitHub profile, personal homepage)
- Scores Chinese identity — high / medium confidence based on name, location, company signals
- Outputs a ready-to-use recruiting list — with personalized outreach templates
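
As a rough illustration of the arXiv tracing step, the sketch below pulls arXiv identifiers out of README text so the author list can be looked up afterwards. The function name `find_arxiv_ids` and the regular expression are assumptions made for this example, not the skill's actual implementation, and the ID in the usage lines is a made-up placeholder.

```python
import re

# Matches new-style arXiv IDs in abs/ or pdf/ links, ignoring a version suffix.
# Hypothetical pattern for illustration; old-style IDs (e.g. cs/0301001) are not handled.
ARXIV_RE = re.compile(r"arxiv\.org/(?:abs|pdf)/(\d{4}\.\d{4,5})(?:v\d+)?", re.IGNORECASE)


def find_arxiv_ids(readme_text: str) -> list[str]:
    """Return unique arXiv IDs in order of first appearance."""
    ids: list[str] = []
    for match in ARXIV_RE.finditer(readme_text):
        arxiv_id = match.group(1)
        if arxiv_id not in ids:
            ids.append(arxiv_id)
    return ids


# Placeholder README snippet; 1234.56789 is not a real paper ID.
sample = "Paper: https://arxiv.org/abs/1234.56789v2 (PDF: https://arxiv.org/pdf/1234.56789.pdf)"
found = find_arxiv_ids(sample)
```

Deduplicating on the bare ID means a README that links both the abstract and the PDF yields a single paper to trace.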
/github-digger ray-project/ray
/github-digger ByteDance-Seed/VideoWorld role: Data Infra Engineer
/github-digger "AI native data infra repos with Chinese contributors"
| Mode | Trigger | Strategy |
|---|---|---|
| Repo Discovery | Topic/keyword input | WebSearch → ranked repo list → drill down |
| PR Mode | Community repo (≥10 closed PRs) | Scan closed PRs |
| Commit Mode | Enterprise/research repo (<10 PRs) | Scan commits + arXiv paper tracing |
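
The mode selection in the table can be sketched as a small function. Only the threshold of 10 closed PRs comes from the table; the names `ScanMode`, `looks_like_repo`, and `choose_mode`, and the `owner/name` heuristic for telling a repo from a topic, are hypothetical and chosen for this example.

```python
from enum import Enum

PR_THRESHOLD = 10  # from the table: >= 10 closed PRs selects PR Mode


class ScanMode(Enum):
    REPO_DISCOVERY = "repo_discovery"
    PR = "pr"
    COMMIT = "commit"


def looks_like_repo(target: str) -> bool:
    """Assumed heuristic: an 'owner/name' string without spaces is a repo."""
    parts = target.strip().split("/")
    return len(parts) == 2 and all(p and " " not in p for p in parts)


def choose_mode(target: str, closed_pr_count: int = 0) -> ScanMode:
    """Pick a scan strategy from the input and the repo's closed PR count."""
    if not looks_like_repo(target):
        return ScanMode.REPO_DISCOVERY  # topic/keyword input
    if closed_pr_count >= PR_THRESHOLD:
        return ScanMode.PR              # community repo: scan closed PRs
    return ScanMode.COMMIT              # enterprise/research repo: scan commits


# The PR counts below are invented for illustration.
mode_a = choose_mode("example-org/community-repo", closed_pr_count=250)
mode_b = choose_mode("example-org/research-repo", closed_pr_count=3)
mode_c = choose_mode("AI native data infra repos")
```

In practice the closed PR count would come from the GitHub API; it is passed in as a plain integer here to keep the sketch self-contained.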
- Commit metadata — highest yield for enterprise repos (e.g. `renzhongwei@bytedance.com`)
- Paper corresponding author footnote — research repos often list email in arXiv PDF
- GitHub profile field — when the email is explicitly set to public
- Personal homepage — scan for `mailto:` links
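
The homepage scan from the last item can be sketched as follows. This is a minimal example that assumes the page HTML has already been fetched; `extract_mailto_addresses` and its regular expression are hypothetical and not taken from the skill itself.

```python
import re
from urllib.parse import unquote

# Capture the address part of a mailto: link, stopping at quotes,
# query strings (?subject=...), tag ends, or whitespace.
MAILTO_RE = re.compile(r"mailto:([^\"'?>\s]+)", re.IGNORECASE)


def extract_mailto_addresses(html: str) -> list[str]:
    """Return unique, lowercased addresses found in mailto: links."""
    addresses: list[str] = []
    for raw in MAILTO_RE.findall(html):
        addr = unquote(raw).strip().lower()  # decode %40 and similar escapes
        if "@" in addr and addr not in addresses:
            addresses.append(addr)
    return addresses


# Placeholder HTML using the reserved example.org domain.
page = (
    '<a href="mailto:Jane.Doe@example.org?subject=hi">Email me</a> '
    "<a href='mailto:jane.doe@example.org'>again</a>"
)
emails = extract_mailto_addresses(page)
```

A regular expression is enough for a sketch like this; a real scan might prefer an HTML parser to avoid matching `mailto:` strings inside scripts or comments.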
Place SKILL.md in your Claude skills directory:
~/.claude/skills/github-digger/SKILL.md
- Commit Mode: auto-switch when PR count < 10 (enterprise repo detection)
- arXiv paper tracing: surfaces full author lists from README arxiv links
- 4-source email strategy: added paper footnote as a new email source
- Repo Discovery Mode: accepts topic/keyword, finds repos before scanning
- Python urllib preferred over WebFetch (avoids enterprise network blocks)
- Output enhancements: scan strategy, email source, `[commit]/[paper]/[both]` tagging
- PR-based contributor scanning
- Chinese identity scoring
- Profile enrichment with 2-source email
| Name | Role |
|---|---|
| @fflashxu | Creator & Maintainer |
| Claude Sonnet 4 | Skill design & implementation |