A curated collection of DevOps skills for Claude Code — the CLI tool by Anthropic. These skills extend Claude with deep domain knowledge for Kubernetes, Terraform, AWS, and other infrastructure tools.
Skills are structured knowledge packs that Claude Code loads on demand. When you ask Claude a Kubernetes question, the k8s skill automatically activates — giving Claude access to troubleshooting workflows, manifest patterns, Helm guides, security references, and diagnostic scripts without you having to provide any context.
| Plugin | Description | Status |
|---|---|---|
| k8s-skills | Kubernetes operations, troubleshooting, and platform engineering | ✅ Available |
| docker-skills | Docker — Dockerfile, Compose, registry, image optimization | ✅ Available |
| terraform-skills | Terraform modules, state management, and IaC patterns | 🚧 Planned |
| aws-skills | AWS services, architecture, and operations | 🚧 Planned |
- Claude Code CLI installed and configured
# 1. Add the CloudDrove marketplace (one-time)
/plugin marketplace add clouddrove/claude-skills
# 2. Install the plugins you need
/plugin install k8s-skills@clouddrove-claude-skills

Skills trigger automatically based on context. Just ask naturally:
> My pod is stuck in CrashLoopBackOff, how do I debug it?
> Write a production-ready Deployment with HPA and PDB
> Which apps are failing in production?
> Compare resource usage between staging and production
> Is everything healthy in the cluster?
Or use slash commands for direct access:
/k8s-skills:k8s-health # Cluster health check
/k8s-skills:k8s-health production # Namespace health check
/k8s-skills:k8s-debug my-pod production # Debug a specific pod
/k8s-skills:k8s-deploy rollback myapp production # Rollback a deployment
The Kubernetes skill covers both day-to-day operations and platform engineering:
| Area | What's Included |
|---|---|
| Troubleshooting | Decision trees for CrashLoopBackOff, OOMKilled, ImagePullBackOff, Pending, networking, storage |
| Manifests | Production-ready YAML for Deployments, Services, Ingress, HPA, PDB, NetworkPolicy, and more |
| Helm | Chart management, values handling, Helmfile, chart authoring, testing |
| Security | RBAC patterns, Pod Security Standards, network policies, secrets management |
| Monitoring | Prometheus alerting rules, ServiceMonitor patterns, Grafana dashboards, USE/RED methods |
| GitOps | ArgoCD and Flux — Application CRDs, sync policies, app-of-apps, image automation, secrets |
| Commands | /k8s-health, /k8s-debug, /k8s-deploy — direct access slash commands |
| Scripts | diagnose.sh, cluster-health.sh, rbac-audit.sh, namespace-setup.sh |
| Examples | Complete Helm chart, multi-environment Helmfile (dev/staging/prod) |
Directory structure
plugins/k8s-skills/
├── commands/
│   ├── k8s-debug.md              # /k8s-debug: pod/deployment diagnosis
│   ├── k8s-deploy.md             # /k8s-deploy: deploy, rollback, restart
│   └── k8s-health.md             # /k8s-health: cluster/namespace health check
├── skills/k8s/
│   ├── SKILL.md                  # Core skill: command reference, troubleshooting tree, workflows
│   ├── references/
│   │   ├── troubleshooting.md    # Detailed debugging workflows for every error state
│   │   ├── manifests.md          # Production-ready YAML templates
│   │   ├── security.md           # RBAC, Pod Security Standards, network policies
│   │   ├── monitoring.md         # Prometheus, Grafana, alerting rules
│   │   ├── helm.md               # Helm operations, chart authoring, Helmfile
│   │   └── gitops.md             # ArgoCD and Flux: sync policies, app-of-apps, secrets
│   ├── scripts/
│   │   ├── diagnose.sh           # Pod diagnostic tool
│   │   ├── cluster-health.sh     # Cluster health overview
│   │   ├── rbac-audit.sh         # RBAC permissions audit
│   │   └── namespace-setup.sh    # Production namespace generator
│   └── examples/
│       ├── helm-chart/           # Complete production Helm chart
│       └── helmfile/             # Multi-environment Helmfile setup
└── README.md
Contributions are welcome, whether you are improving existing skills or adding new ones.
- Fork and clone this repository
- Create your skill directory under `plugins/<plugin-name>/skills/<skill-name>/`
- Write a `SKILL.md` with YAML frontmatter:

      ---
      name: my-skill
      description: "When to trigger this skill and what it does..."
      ---
      # Skill instructions here

- Add reference docs in `references/` and scripts in `scripts/` as needed
- Register the plugin in `.claude-plugin/marketplace.json`
- Submit a pull request
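
The directory and `SKILL.md` steps for adding a skill can be sketched as a small shell scaffold. This is a minimal sketch, not a script shipped with this repository: the function name `scaffold_skill` and the names `my-plugin` and `my-skill` are hypothetical placeholders. Registering the plugin in `.claude-plugin/marketplace.json` is left as a manual step.

```shell
#!/usr/bin/env bash
# Sketch: scaffold a new skill using the layout described in the steps above.
# scaffold_skill, "my-plugin", and "my-skill" are hypothetical placeholders.
set -euo pipefail

scaffold_skill() {
  local plugin_name="$1" skill_name="$2"
  local skill_dir="plugins/${plugin_name}/skills/${skill_name}"

  # Create the skill directory plus the optional references/ and scripts/ folders.
  mkdir -p "${skill_dir}/references" "${skill_dir}/scripts"

  # Write SKILL.md with the YAML frontmatter (name and description).
  cat > "${skill_dir}/SKILL.md" <<EOF
---
name: ${skill_name}
description: "When to trigger this skill and what it does..."
---

# Skill instructions here
EOF

  # Print the created path so callers can chain further steps.
  echo "${skill_dir}"
}

scaffold_skill "my-plugin" "my-skill"
```

Run it from the repository root so the `plugins/` path resolves correctly, then edit the generated `description` to say when the skill should trigger.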
- Fix inaccuracies, add missing patterns, improve troubleshooting workflows
- Add new reference docs for uncovered topics
- Improve script diagnostics and output
- Keep `SKILL.md` lean (under 500 lines); move detailed content to `references/`
- Write in imperative/third-person style for AI consumption
- Include working examples, not pseudo-code
- Scripts should support `--help` and work as black-box tools
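
Two of these guidelines are easy to check mechanically: the 500-line limit and `--help` support. The following is a minimal sketch of a checker that does both. The script name `check-skill.sh` and the function `check_skill` are assumptions for illustration and do not exist in this repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): verifies that a SKILL.md
# stays under the 500-line guideline, and shows one way to handle --help.
set -euo pipefail

MAX_LINES=500

usage() {
  echo "Usage: check-skill.sh [--help] <path-to-SKILL.md>"
  echo "Exits 0 if the file has fewer than ${MAX_LINES} lines, 1 otherwise."
}

check_skill() {
  case "${1:-}" in
    --help|-h) usage; return 0 ;;
    "") usage >&2; return 2 ;;
  esac

  local file="$1" lines
  [ -f "$file" ] || { echo "error: $file not found" >&2; return 2; }

  # Count lines; tr strips the padding some wc implementations add.
  lines=$(wc -l < "$file" | tr -d ' ')
  if [ "$lines" -lt "$MAX_LINES" ]; then
    echo "ok: $file has $lines lines"
  else
    echo "too long: $file has $lines lines (limit: under $MAX_LINES)"
    return 1
  fi
}

# Only run the check when arguments are given.
if [ "$#" -gt 0 ]; then check_skill "$@"; fi
```

Because the result is reported through the exit code as well as the output, the script can be used as a black-box tool in a pre-commit hook or CI job.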
Skills use a three-level progressive loading system:
- Metadata — Skill name and description are always loaded (~100 words). Claude uses this to decide when to activate.
- SKILL.md — Core instructions loaded when the skill triggers. Contains quick references, decision trees, and pointers to deeper docs.
- References — Detailed guides loaded on demand. Only pulled into context when needed for a specific task.
This keeps Claude's context efficient while making deep knowledge available when needed.
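
To make the three levels concrete, here is an illustrative shell sketch that splits a skill into the same three pieces. It only models the idea; it is not how Claude Code implements loading, and the function names `skill_metadata`, `skill_body`, and `skill_references` are hypothetical. It assumes `SKILL.md` starts with a frontmatter block delimited by two `---` lines.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: this models the three loading levels described
# above. It is NOT Claude Code's actual loader; all function names are
# hypothetical.
set -euo pipefail

# Level 1: print only the YAML frontmatter (between the first two '---' lines).
skill_metadata() {
  awk '/^---$/ { n++; next } n == 1 { print } n >= 2 { exit }' "$1"
}

# Level 2: print the body of SKILL.md (everything after the frontmatter).
skill_body() {
  awk 'n >= 2 { print } /^---$/ { n++ }' "$1"
}

# Level 3: list reference docs next to SKILL.md that could be read on demand.
skill_references() {
  local dir
  dir="$(dirname "$1")/references"
  if [ -d "$dir" ]; then
    ls "$dir"
  fi
}

# When given a path, show only the cheapest level: the metadata.
if [ "$#" -gt 0 ]; then
  skill_metadata "$1"
fi
```

The point of the split is cost: the metadata is a few lines, the body is bounded by the 500-line guideline, and reference files are only listed until one is actually needed.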
This project is licensed under the Apache License 2.0 — see the LICENSE file for details.
CloudDrove is a DevOps consultancy helping teams build and scale cloud infrastructure. We build open-source tools for the community.