Merged
301 changes: 148 additions & 153 deletions README.md
@@ -2,10 +2,10 @@

# Smith

**Read-only code investigation for AI coding agents.**
**Let your AI coding agent investigate code anywhere — without cloning, and without write access.**

Search code, grep exact lines, inspect PRs, scan CI logs, and look up issues across
**GitHub**, **GitLab**, **Azure DevOps**, and **YouTrack** from one CLI.
One CLI to search code, grep exact lines, inspect PRs, scan CI logs, and look up
issues across **GitHub**, **GitLab**, **Azure DevOps**, and **YouTrack**.

[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
[![Python 3.12+](https://img.shields.io/badge/python-3.12%2B-blue.svg)](https://www.python.org)
@@ -16,50 +16,50 @@ Search code, grep exact lines, inspect PRs, scan CI logs, and look up issues acr

</div>

Smith gives an agent a safe, token-efficient way to answer questions like:
## What is Smith?

- Where is this setting defined?
- Which repos still use this dependency?
- What changed in this PR?
- Why did this pipeline fail?
- Is there an issue or story that explains this work?
AI coding agents are great at answering questions about the repo in front of
them — and blind to everything else. The answer to "why did CI fail?" or "which
repos still use this dependency?" usually lives in *other* repos, PR threads,
pipeline logs, or an issue tracker.

It does that without cloning every repo, downloading whole files, or switching
between provider-specific tools.
Smith fixes that. It gives your agent **read-only eyes on all your remotes**, so
it can answer questions like:

Think of Smith as a read-only remote investigation layer for AI agents: similar
in spirit to a provider MCP like the GitHub MCP, but optimized for token-efficient
cross-remote search, grep-sized evidence, PR context, CI logs, and issue lookup.
- 🔍 *Where is this setting defined?*
- 📦 *Which repos still use this dependency?*
- 🔀 *What changed in this PR, and what did reviewers say?*
- 🚦 *Why did this pipeline fail?*
- 📋 *Is there an issue or story that explains this work?*

## Why Smith Helps
…and it does so without cloning repos, downloading whole files, or juggling
provider-specific tools.

- **Fewer tokens wasted.** Smith returns search hits and grep context windows
instead of full files.
- **One workflow across providers.** The same verbs work across GitHub, GitLab,
Azure DevOps, and YouTrack.
- **Read-only by contract.** There are no create, update, approve, comment, or
post commands.
- **Local credentials.** Tokens stay in your environment, keychain, or provider
CLI login. Smith is not a hosted proxy.
- **Agent-ready output.** Responses are compact and minimal, optimized for LLM consumption.
### Why teams like it

## Golden Paths
| | |
|---|---|
| 🪶 **Token-efficient** | Returns search hits and grep-sized context windows, not full files. Built for LLM context budgets. |
| 🔄 **One workflow, four providers** | The same verbs (`code search`, `code grep`, `prs`, `pipelines`, `stories`) work on GitHub, GitLab, Azure DevOps, and YouTrack. |
| 🔒 **Read-only by contract** | There are no create, update, approve, comment, or post commands. Smith cannot change anything. |
| 🔑 **Your credentials stay local** | Tokens live in your environment, OS keychain, or provider CLI login. Smith is not a hosted proxy. |

The first two paths are for you: install Smith and connect your remotes. After
that, the examples show the behavior encoded in `skills/smith/SKILL.md` — the
workflow your AI agent should follow once the Smith skill is loaded.
Think of it as a provider MCP (like the GitHub MCP) reimagined for one job:
fast, cross-remote, evidence-first investigation.

### Install and Connect
## Quick Start

#### 1. Install Smith
### 1. Install

Recommended for macOS and Linux:
macOS / Linux (recommended):

```bash
brew install faustodavid/tap/smith
smith config init
```

<details>
<summary>Other install options (standalone installer, Windows)</summary>

Standalone installer:

```bash
@@ -72,155 +72,80 @@ Windows PowerShell:
irm https://raw.githubusercontent.com/faustodavid/smith/main/scripts/install.py | python -
```

`smith config init` syncs the Smith agent skill to `~/.agents/skills/smith`.
The standalone installer does this during install too. The skill stays current
on its own after upgrades (set `SMITH_SKILL_CHECK=0` to opt out); to refresh it
manually run:

```bash
smith skill sync
```

#### 2. Add Remotes

`smith config init` starts a guided terminal onboarding flow. It scans for
existing provider auth, helps add remotes, and can store pasted tokens in the OS
secure credential store. Token values are never written to `config.yaml`.

To create an empty config for manual editing, run `smith config init --manual`.
`smith config init --format json` is also noninteractive for automation and
creates the same empty config shape.
</details>

To change remotes later, run `smith config edit`. If you prefer manual edits:
### 2. Connect your remotes

```bash
smith config path
$EDITOR ~/.config/smith/config.yaml
smith config init
```

Minimal config shape:

```yaml
defaults:
  timeout_seconds: 30
  max_output_chars: 20000

remotes:
  github-public:
    provider: github
    org: acme
    enabled: true

  gitlab-platform:
    provider: gitlab
    org: acme/platform
    enabled: true

  azdo-main:
    provider: azdo
    org: acme
    enabled: true
This starts a guided onboarding flow: it scans for existing provider auth
(`gh`, `glab`, `az`, env vars), helps you add remotes, and stores pasted tokens
in your OS secure credential store — never in the config file.

  youtrack-main:
    provider: youtrack
    host: https://youtrack.acme.com
    enabled: true
```
### 3. Let your agent use it

Set `SMITH_CONFIG=/path/to/config.yaml` when a workspace needs a different
config file.
`smith config init` also installs the **Smith agent skill** to
`~/.agents/skills/smith`, which teaches your AI agent how to investigate:
search broad, grep for exact proof, then corroborate with PRs, pipelines, or
issues. The skill keeps itself current after upgrades (set
`SMITH_SKILL_CHECK=0` to opt out, or run `smith skill sync` to refresh
manually).

Auth options:
That's it — ask your agent a question that spans repos and watch it work.

- `smith config show <remote>` prints the persisted remote fields only. If
`token_env` is absent, Smith may still authenticate through runtime fallback
sources below.
- `token_env` is optional when a provider has an implicit env fallback or CLI
login. Add it when you want Smith to read a specific environment variable or
secure-store entry.
- GitHub: configured env/secure-store token, public GitHub's implicit
`GITHUB_TOKEN`, then host-scoped `gh auth token`. GitHub Enterprise uses
host-scoped `gh` auth unless you configure `token_env`.
- GitLab: configured env/secure-store token, implicit `GITLAB_TOKEN`, then host-scoped
`glab config get token`.
- Azure DevOps: configured `AZURE_DEVOPS_PAT` env/secure-store PAT when
`token_env` is set, otherwise `az login` / Azure DefaultAzureCredential.
- YouTrack: configured env/secure-store token, or implicit `YOUTRACK_TOKEN`
when `token_env` is omitted.

### For Your Agent: Skill-Led Investigation
## What an Investigation Looks Like

The Smith skill teaches your AI agent a broad-to-narrow investigation loop:
search first, avoid full-file reads, extract proof with grep, then corroborate
with PRs, pipelines, or issues when needed.
These are the moves the Smith skill teaches your agent (and they work just as
well typed by hand). Examples assume remotes named `github-public`,
`gitlab-platform`, `azdo-main`, and `youtrack-main`.

#### 3. Search Everywhere, Then Narrow
### Find where something lives

When you ask an agent where something lives or how something is configured, the
skill should make it start broad across all enabled remotes:
Start broad across all enabled remotes, then grep the likely repo for exact
proof:

```bash
# Broad: search every remote at once
smith code search "auth middleware" --take 30
smith code search "python==3.10" --glob "pyproject.toml"
```

After search finds likely repos or paths, the skill should make the agent grep
the likely repository for exact proof:

```bash
# Narrow: grep one repo for the exact lines
smith github-public code grep api-service "timeout_seconds" --path src --context-lines 2
smith gitlab-platform code grep acme/platform/api "resources:" --glob "*.yaml" --context-lines 5
smith azdo-main code grep SRE api-service "TODO" --path src
```

For larger areas, the skill can map a subtree before reading matches:
Map a subtree before reading matches:

```bash
smith github-public code grep api-service ".*" --path src/auth --output-mode files_with_matches
```

#### 4. Inspect PRs and Review Threads

When your request points at a PR, review, or implementation history, the skill
guides the agent toward PR search, details, and review threads:
### Understand a PR

```bash
smith prs search "auth middleware" --status active --exclude-drafts
smith github-public prs list api-service --status active
smith github-public prs get api-service 123
smith github-public prs threads api-service 123
smith github-public prs threads api-service 123 # review comments
```

Provider repo arguments matter. These examples assume remotes named `gh`, `gl`,
`azdo`, and `yt`:

| Provider | Repo argument shape | Example |
|---|---|---|
| GitHub | bare repo name | `smith gh code grep api-service "TODO"` |
| GitLab | full group/project path | `smith gl code grep acme/platform/api "TODO"` |
| Azure DevOps | project then repo | `smith azdo code grep SRE api-service "TODO"` |
| YouTrack | issue IDs and queries only | `smith yt stories get RAD-1055` |

#### 5. Debug a Failing Pipeline
### Debug a failing pipeline

When your request is about a CI failure, the skill tells the agent to list the
run first, then grep logs instead of dumping entire job output:
List the run first, then grep the logs — never dump entire job output:

```bash
smith github-public pipelines list api-service 123456789
smith github-public pipelines grep api-service 123456789 "error|fatal|Traceback" --reverse --context-lines 3
```

Target one job or log when needed:

```bash
# Target one job or log when needed
smith azdo-main pipelines grep SRE 6789 "timeout" --log-id 42 --context-lines 3
```
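
To illustrate why a grep-sized window is cheaper than a full job log, here is a minimal Python sketch of context-window filtering. It is not Smith's implementation: `grep_with_context` and the synthetic log are hypothetical, and only the pattern and the context size of 3 are taken from the commands above.

```python
import re


def grep_with_context(lines, pattern, context_lines=3):
    """Return (line_number, text) pairs for each match plus its surrounding context."""
    regex = re.compile(pattern)
    keep = set()
    for index, line in enumerate(lines):
        if regex.search(line):
            start = max(0, index - context_lines)
            stop = min(len(lines), index + context_lines + 1)
            keep.update(range(start, stop))
    # Line numbers are 1-based, as in most grep-style tools.
    return [(index + 1, lines[index]) for index in sorted(keep)]


# Synthetic 100-line log with a single failure (illustrative data only).
log = [f"step {n} ok" for n in range(1, 101)]
log[49] = "fatal: connection refused"

window = grep_with_context(log, r"error|fatal|Traceback", context_lines=3)
for number, text in window:
    print(f"{number}: {text}")
print(f"{len(window)} of {len(log)} lines kept")
```

In this sketch a single match keeps seven lines out of a hundred, which is the kind of saving the "grep, don't dump" advice is aiming for.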

#### 6. Find Issues and Stories
### Look up issues and stories

When story or issue context is needed, the skill gives the agent one shape for
GitHub, GitLab, Azure DevOps, and YouTrack lookups:
One shape across all four providers:

```bash
smith github-public stories search api-service --query "rate limit"
@@ -229,28 +154,98 @@ smith youtrack-main stories search --query "patch rollout" --state Open
smith youtrack-main stories get RAD-1055
```

### How each provider names a repo

The only per-provider difference to remember is the repo argument shape. These
examples assume remotes named `gh`, `gl`, `azdo`, and `yt`:

| Provider | Repo argument shape | Example |
|---|---|---|
| GitHub | bare repo name | `smith gh code grep api-service "TODO"` |
| GitLab | full group/project path | `smith gl code grep acme/platform/api "TODO"` |
| Azure DevOps | project, then repo | `smith azdo code grep SRE api-service "TODO"` |
| YouTrack | issue IDs and queries only | `smith yt stories get RAD-1055` |
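
The table can be read as a small rule for building positional arguments. The sketch below is a hypothetical helper (`grep_argv` is not part of Smith) that assembles the same `code grep` command lines the table shows; the validation checks are assumptions inferred from the "bare repo name" and "full group/project path" wording.

```python
def grep_argv(provider, remote, pattern, *, repo=None, project=None):
    """Build a `smith <remote> code grep ...` argument list (hypothetical helper)."""
    if provider == "github":
        # GitHub: bare repo name, no org prefix.
        if not repo or "/" in repo:
            raise ValueError("GitHub expects a bare repo name")
        target = [repo]
    elif provider == "gitlab":
        # GitLab: full group/project path.
        if not repo or "/" not in repo:
            raise ValueError("GitLab expects a full group/project path")
        target = [repo]
    elif provider == "azdo":
        # Azure DevOps: project first, then repo.
        if not project or not repo:
            raise ValueError("Azure DevOps expects a project and a repo")
        target = [project, repo]
    elif provider == "youtrack":
        # YouTrack has no code commands in the table, only issue lookups.
        raise ValueError("YouTrack takes issue IDs and queries only")
    else:
        raise ValueError(f"unknown provider: {provider}")
    return ["smith", remote, "code", "grep", *target, pattern]


print(grep_argv("github", "gh", "TODO", repo="api-service"))
print(grep_argv("gitlab", "gl", "TODO", repo="acme/platform/api"))
print(grep_argv("azdo", "azdo", "TODO", project="SRE", repo="api-service"))
```

Returning a list rather than a string keeps the pattern as a single argument, so it can be handed to `subprocess.run` without shell quoting.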

## Supported Providers

| Provider | Code search | Code grep | PRs / MRs | Pipelines | Issues / stories | Discovery |
|---|---:|---:|---:|---:|---:|---:|
| GitHub | Yes | Yes | Yes | Yes | Yes | Orgs, repos |
| GitLab | Yes | Yes | Yes | Yes | Yes | Groups, repos |
| Azure DevOps | Yes | Yes | Yes | Yes | Yes | Orgs, repos |
| YouTrack | - | - | - | - | Yes | - |
| GitHub | ✅ | ✅ | ✅ | ✅ | ✅ | Orgs, repos |
| GitLab | ✅ | ✅ | ✅ | ✅ | ✅ | Groups, repos |
| Azure DevOps | ✅ | ✅ | ✅ | ✅ | ✅ | Orgs, repos |
| YouTrack | — | — | — | — | ✅ | — |

## Configuration

To change remotes later, run `smith config edit`, or edit the file directly:

```bash
smith config path
$EDITOR ~/.config/smith/config.yaml
```

Minimal config shape:

```yaml
defaults:
  timeout_seconds: 30
  max_output_chars: 20000

remotes:
  github-public:
    provider: github
    org: acme
    enabled: true

  gitlab-platform:
    provider: gitlab
    org: acme/platform
    enabled: true

  azdo-main:
    provider: azdo
    org: acme
    enabled: true

  youtrack-main:
    provider: youtrack
    host: https://youtrack.acme.com
    enabled: true
```

Set `SMITH_CONFIG=/path/to/config.yaml` when a workspace needs a different
config file. To create an empty config for manual editing, use
`smith config init --manual` (or `--format json` for noninteractive
automation).

<details>
<summary><strong>How authentication resolves per provider</strong></summary>

- `smith config show <remote>` prints the persisted remote fields only. If
`token_env` is absent, Smith may still authenticate through the runtime
fallbacks below.
- `token_env` is optional when a provider has an implicit env fallback or CLI
login. Add it when you want Smith to read a specific environment variable or
secure-store entry.
- **GitHub:** configured env/secure-store token → public GitHub's implicit
`GITHUB_TOKEN` → host-scoped `gh auth token`. GitHub Enterprise uses
host-scoped `gh` auth unless you configure `token_env`.
- **GitLab:** configured env/secure-store token → implicit `GITLAB_TOKEN` →
host-scoped `glab config get token`.
- **Azure DevOps:** configured `AZURE_DEVOPS_PAT` env/secure-store PAT when
`token_env` is set, otherwise `az login` / Azure DefaultAzureCredential.
- **YouTrack:** configured env/secure-store token, or implicit
`YOUTRACK_TOKEN` when `token_env` is omitted.

</details>
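
As a rough illustration of the GitHub fallback order described above, the following Python sketch walks the same chain. It is not Smith's code: `resolve_github_token`, `gh_cli_token`, and the `host` field on a GitHub remote are assumptions made for the example, the secure-store step is omitted, and falling through when `token_env` is set but empty is a guess rather than documented behavior.

```python
import os
import subprocess


def gh_cli_token(host):
    """Ask the GitHub CLI for a host-scoped token; return None if unavailable."""
    try:
        result = subprocess.run(
            ["gh", "auth", "token", "--hostname", host],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip() or None


def resolve_github_token(remote, env=None, cli_lookup=gh_cli_token):
    """Return (token, source) following the documented order (sketch only)."""
    env = os.environ if env is None else env

    # 1. Explicitly configured environment variable (secure store omitted here).
    token_env = remote.get("token_env")
    if token_env and env.get(token_env):
        return env[token_env], f"env:{token_env}"

    # 2. Implicit GITHUB_TOKEN, for public GitHub only.
    host = remote.get("host", "github.com")  # assumed field name
    if host == "github.com" and env.get("GITHUB_TOKEN"):
        return env["GITHUB_TOKEN"], "env:GITHUB_TOKEN"

    # 3. Host-scoped GitHub CLI login.
    token = cli_lookup(host)
    if token:
        return token, "gh auth token"
    return None, "none"


# Demo with injected inputs so nothing touches the real environment or CLI.
token, source = resolve_github_token(
    {"provider": "github", "org": "acme"},
    env={"GITHUB_TOKEN": "example-token"},
    cli_lookup=lambda host: None,
)
print(source)
```

Passing `env` and `cli_lookup` as parameters keeps the order easy to test without real credentials.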

## Troubleshooting

- `401` or `403`: check `smith config show <remote>` for persisted fields. If
the remote has `token_env`, set that env var or store a token with
`smith config edit`. Otherwise check implicit provider env vars where
supported (`GITHUB_TOKEN`, `GITLAB_TOKEN`, `YOUTRACK_TOKEN`) and refresh
provider login/status as appropriate: `gh auth login`,
`glab auth login --hostname <host>`, or `az login`.
- `429`: lower `--take`, narrow `--path` or `--glob`, or reduce GitHub grep
workers with `GITHUB_GREP_MAX_WORKERS`.
- Truncated output: reduce `--context-lines`, narrow the path, or page with
`--from-line` and `--to-line`.
- Empty results: broaden the search query, then grep candidate repos again.
| Symptom | What to do |
|---|---|
| `401` / `403` | Check `smith config show <remote>`. If the remote has `token_env`, set that env var or store a token with `smith config edit`. Otherwise check implicit env vars (`GITHUB_TOKEN`, `GITLAB_TOKEN`, `YOUTRACK_TOKEN`) and refresh provider login: `gh auth login`, `glab auth login --hostname <host>`, or `az login`. |
| `429` rate limited | Lower `--take`, narrow `--path` or `--glob`, or reduce GitHub grep workers with `GITHUB_GREP_MAX_WORKERS`. |
| Truncated output | Reduce `--context-lines`, narrow the path, or page with `--from-line` and `--to-line`. |
| Empty results | Broaden the search query, then grep candidate repos again. |

## Development
