Nicolas-Richard/k8s-snitch

K8S-SNITCH

A real-time Kubernetes event tracker and vector database indexer for RAG-based chatbot applications that answer questions about cluster change history.

Quick Links

Overview

This project implements a Kubernetes event tracking system that monitors changes to various resources (Pod, ReplicaSet, ConfigMap, Rollout, ExternalSecret), converts these change events into natural language descriptions, and stores them in a vector database for semantic search and historical analysis. The system enables users to ask questions about what happened in their cluster over time, track resource lifecycles, and understand change patterns.

Setup Guide

Usage Guide

Development Guide

Architecture

Components

  • Indexer (src/indexer.py) - Watches Kubernetes resources and generates change events
  • Event Schema (src/event_schema.py) - Defines the event data model for tracking changes
  • Text Converter (src/k8s_to_text.py) - Converts Kubernetes objects to text descriptions
  • Vector DB Client (src/vector_db.py) - Manages connections to the vector database
  • Query Tool (src/query.py) - Command-line tool for searching events with filters
  • RAG System (src/rag.py) - RAG implementation for answering questions about change history
  • RAG CLI Tool (src/query_rag.py) - Command-line tool for natural language queries
  • System Prompts (src/rag_prompts.py) - Event-focused prompt templates for LLM interactions
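
To make the hand-off between these components concrete, here is a minimal sketch of how a change event might be modeled and rendered to the text that gets embedded. The `ChangeEvent` class, its field names, and `to_document_text` are hypothetical illustrations, not the actual contents of src/event_schema.py or src/k8s_to_text.py. The output layout mirrors the example events shown later in this README.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class ChangeEvent:
    """Hypothetical event model; the real schema lives in src/event_schema.py."""
    event_type: str          # e.g. "CONFIG_CHANGED" or "STATUS_CHANGED"
    kind: str                # e.g. "ConfigMap"
    namespace: str
    name: str
    timestamp: datetime
    summary: str
    changes: List[str] = field(default_factory=list)

    @property
    def resource(self) -> str:
        return f"{self.kind} {self.namespace}/{self.name}"


def to_document_text(event: ChangeEvent) -> str:
    """Render an event as the plain text that would be embedded and indexed."""
    when = event.timestamp.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")
    lines = [
        f"{event.event_type} event for {event.resource}",
        f"Occurred at {when}",
        f"Summary: {event.summary}",
        "Changes:",
    ]
    lines.extend(f"  - {change}" for change in event.changes)
    return "\n".join(lines)


event = ChangeEvent(
    event_type="CONFIG_CHANGED",
    kind="ConfigMap",
    namespace="default",
    name="test-app-config",
    timestamp=datetime(2025, 8, 14, 17, 59, 23, tzinfo=timezone.utc),
    summary="ConfigMap default/test-app-config was modified (data changed)",
    changes=["data.random_value changed from '13355' to '13781'"],
)
print(to_document_text(event))
```

Keeping the rendered text separate from the structured fields would let the same event feed both metadata filters (kind, namespace, timestamp) and semantic search over the description.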

Project Structure

.
├── docs/
│   ├── cluster-management.md  # Instructions for managing local kind clusters
│   ├── goals.md
│   ├── project.md          # Project plan and implementation steps
│   ├── rag-implementation.md
│   ├── running-tests.md    # Instructions for running the Python tests
│   ├── timestamp-filters.md  # Needs rework
│   └── vector-db-integration.md
├── kubernetes/
│   ├── argo-cd/            # Argo CD installation files
│   ├── argo-rollouts/      # Argo Rollouts installation files
│   ├── external-secrets/   # External Secrets Operator files
│   └── indexer/            # In recent tests I've been running the indexer script outside the cluster for faster iteration.
│       ├── rbac.yaml
│       └── indexer-deployment
├── docker-compose.yaml     # Docker Compose to bring up the vector DB locally
├── requirements.txt
├── src/
│   ├── constants.py
│   ├── event_schema.py
│   ├── indexer.py          # Main event watcher implementation
│   ├── k8s_to_text.py
│   ├── query.py            # Vector database event query tool
│   ├── query_rag.py        # RAG-based natural language query tool
│   ├── rag.py              # Core RAG implementation for event history
│   ├── rag_prompts.py      # Event-focused prompt templates
│   └── vector_db.py        # Vector database client
├── tests/
│   ├── generate-activity.sh # Test activity generator
│   ├── manifests/          # Test Kubernetes manifests
│   └── test_k8s_to_text.py # Unit tests for k8s_to_text module
└── tools/                  # Utility scripts
    ├── clear-collections.py # Script to clear ChromaDB collections
    ├── show-recent-events.py # Script to display recent events from the database
    ├── test_bedrock.py     # Script to test AWS Bedrock connectivity
    ├── test-chromadb-connection.py # Script to test ChromaDB connection
    └── test_rag_console.py # Script for testing RAG functionality

Supported Resources

  • Pods (v1)
  • ReplicaSets (apps/v1)
  • ConfigMaps (v1)
  • Rollouts (argoproj.io/v1alpha1)
  • ExternalSecrets (external-secrets.io/v1beta1)
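
The list above maps naturally onto a small lookup table. The sketch below is one assumption about how the watched resources could be described in code; the `ResourceSpec` and `WATCHED_RESOURCES` names and the `api_path` helper are hypothetical and not taken from the repository. The groups and versions come from the list above, and the plural names are the standard ones for these kinds.

```python
from typing import Dict, NamedTuple


class ResourceSpec(NamedTuple):
    group: str      # empty string means the core API group
    version: str
    plural: str


# Hypothetical lookup table; groups and versions match the list above.
WATCHED_RESOURCES: Dict[str, ResourceSpec] = {
    "Pod": ResourceSpec("", "v1", "pods"),
    "ReplicaSet": ResourceSpec("apps", "v1", "replicasets"),
    "ConfigMap": ResourceSpec("", "v1", "configmaps"),
    "Rollout": ResourceSpec("argoproj.io", "v1alpha1", "rollouts"),
    "ExternalSecret": ResourceSpec("external-secrets.io", "v1beta1", "externalsecrets"),
}


def api_path(kind: str, namespace: str) -> str:
    """Build the REST path used to list or watch a kind in one namespace."""
    spec = WATCHED_RESOURCES[kind]
    # Core resources live under /api, everything else under /apis/<group>.
    if spec.group:
        prefix = f"/apis/{spec.group}/{spec.version}"
    else:
        prefix = f"/api/{spec.version}"
    return f"{prefix}/namespaces/{namespace}/{spec.plural}"


print(api_path("Pod", "default"))
print(api_path("Rollout", "default"))
```

Rollouts and ExternalSecrets are custom resources, so their CRDs (installed from kubernetes/argo-rollouts/ and kubernetes/external-secrets/) must exist in the cluster before they can be watched.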

Project Details

Prerequisites

  • Python 3.9+ with virtual environment
  • Docker
  • kind (Kubernetes in Docker)
  • kubectl
  • Docker Compose (for running ChromaDB locally)
  • AWS account with Bedrock access (for RAG functionality)

Setup

  1. Create a local Kubernetes cluster:

    kind create cluster --name k8s-change-stream

  2. Install the required Python packages:

    pip install -r requirements.txt

  3. Run the indexer locally for testing:

    python src/indexer.py

For more detailed setup instructions, see Cluster Management.

Using the Makefile (Recommended)

The project includes a Makefile to simplify common operations:

# Build and deploy in one step
make

# Check status and logs
make status

# Run test activity generator
make test-activity

# See all available commands
make help

Local Testing

The indexer will automatically use your local kubeconfig when run outside of a Kubernetes cluster. When deployed to a cluster, it will use in-cluster authentication.
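
The behavior described above can be sketched roughly as follows, assuming the official `kubernetes` Python client is installed. The function names `running_in_cluster` and `load_k8s_config` are hypothetical, and the actual logic in src/indexer.py may differ.

```python
import os
from typing import Mapping


def running_in_cluster(env: Mapping[str, str] = os.environ) -> bool:
    """Heuristic: Kubernetes injects these variables into every pod."""
    return bool(env.get("KUBERNETES_SERVICE_HOST")) and bool(
        env.get("KUBERNETES_SERVICE_PORT")
    )


def load_k8s_config() -> str:
    """Load in-cluster credentials inside a pod, else the local kubeconfig."""
    # Imported lazily so the helper above works without the client installed.
    from kubernetes import config

    if running_in_cluster():
        # Uses the service account token mounted into the pod.
        config.load_incluster_config()
        return "in-cluster"
    # Reads $KUBECONFIG or ~/.kube/config, including the current context.
    config.load_kube_config()
    return "kubeconfig"
```

When running locally against kind, it is worth checking `kubectl config current-context` first, since the kubeconfig path uses whichever context is currently selected.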

For local development, you can also use Docker Compose to run ChromaDB:

docker compose up -d chromadb
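
Once the container is up, a quick connectivity check along these lines can confirm that the database is reachable. This is a sketch, not the contents of tools/test-chromadb-connection.py: the `CHROMA_HOST` and `CHROMA_PORT` variable names are hypothetical, and port 8000 is ChromaDB's default, which may not match the port mapping in docker-compose.yaml.

```python
import os
from typing import Mapping, Tuple


def chroma_endpoint(env: Mapping[str, str] = os.environ) -> Tuple[str, int]:
    """Read the ChromaDB host and port, defaulting to a local instance."""
    # CHROMA_HOST / CHROMA_PORT are hypothetical names used for illustration.
    host = env.get("CHROMA_HOST", "localhost")
    port = int(env.get("CHROMA_PORT", "8000"))
    return host, port


def check_chroma() -> int:
    """Return the server heartbeat, raising if ChromaDB is unreachable."""
    import chromadb  # imported lazily; requires the chromadb package

    host, port = chroma_endpoint()
    client = chromadb.HttpClient(host=host, port=port)
    return client.heartbeat()
```

Calling `check_chroma()` from a Python shell should return a timestamp if the server is running and raise a connection error otherwise.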

Example Events Stored in the Vector Database

[16] Event ID: aa970f11-f49b-405f-8ebb-be5dd9969ff9-144176-CONFIG_CHANGED
Timestamp: 2025-08-14 17:59:23 UTC
Number of diffs: 5
Event Type: CONFIG_CHANGED
Resource: ConfigMap default/test-app-config

--- DOCUMENT TEXT ---
CONFIG_CHANGED event for ConfigMap default/test-app-config
Occurred at 2025-08-14 17:59:23 UTC
Summary: ConfigMap default/test-app-config was modified (data, metadata changed)
Changes:
  - metadata.annotations.kubectl.kubernetes.io/last-applied-configuration changed from '{"apiVersion":"v1","data":{"random_value":"13355","timestamp":"Thu Aug 14 10:59:12 PDT 2025"},"kind":"ConfigMap","metadata":{"annotations":{},"creationTimestamp":null,"name":"test-app-config","namespace":"default"}}
' to '{"apiVersion":"v1","data":{"random_value":"13781","timestamp":"Thu Aug 14 10:59:23 PDT 2025"},"kind":"ConfigMap","metadata":{"annotations":{},"creationTimestamp":null,"name":"test-app-config","namespace":"default"}}
'
  - metadata.resource_version changed from '144137' to '144176'
  - metadata.creation_timestamp changed from '2025-08-14 17:59:00+00:00' to '2025-08-14 17:59:00+00:00'
  - data.random_value changed from '13355' to '13781'
  - data.timestamp changed from 'Thu Aug 14 10:59:12 PDT 2025' to 'Thu Aug 14 10:59:23 PDT 2025'

================================================================================

[17] Event ID: 88477ca2-3cc8-40a6-a283-dc1407ff32f4-144164-STATUS_CHANGED
Timestamp: 2025-08-14 17:59:21 UTC
Number of diffs: 6
Event Type: STATUS_CHANGED
Resource: ReplicaSet default/test-app-5b8b56ffd8

--- DOCUMENT TEXT ---
STATUS_CHANGED event for ReplicaSet default/test-app-5b8b56ffd8
Occurred at 2025-08-14 17:59:21 UTC
Summary: ReplicaSet default/test-app-5b8b56ffd8 was modified (metadata, status changed)
Changes:
  - metadata.resource_version changed from '144161' to '144164'
  - metadata.creation_timestamp changed from '2025-08-14 17:58:58+00:00' to '2025-08-14 17:58:58+00:00'
  - status.replicas changed from '3' to '1'
  - status.ready_replicas changed from '3' to '1'
  - status.available_replicas changed from '3' to '1'
  - status.fully_labeled_replicas changed from '3' to '1'
Owned by: Deployment/test-app
Labels: app=test-app, pod-template-hash=5b8b56ffd8
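
The "Changes:" lines in the examples above read like the output of a field-by-field comparison between the old and new versions of an object. The sketch below shows one way such lines could be produced; `flatten` and `describe_changes` are hypothetical helpers, not the project's actual diff code, and the "added" and "removed" wording is an assumption since only "changed from ... to ..." appears in the examples.

```python
from typing import Any, Dict, List


def flatten(obj: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts into dotted paths, e.g. {'status.replicas': 3}."""
    flat: Dict[str, Any] = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def describe_changes(old: Dict[str, Any], new: Dict[str, Any]) -> List[str]:
    """Compare two object snapshots and describe each differing field."""
    before, after = flatten(old), flatten(new)
    lines: List[str] = []
    for path in sorted(before.keys() | after.keys()):
        if path not in before:
            lines.append(f"{path} added with value '{after[path]}'")
        elif path not in after:
            lines.append(f"{path} removed (was '{before[path]}')")
        elif before[path] != after[path]:
            lines.append(f"{path} changed from '{before[path]}' to '{after[path]}'")
    return lines


old = {"metadata": {"resource_version": "144161"},
       "status": {"replicas": 3, "ready_replicas": 3}}
new = {"metadata": {"resource_version": "144164"},
       "status": {"replicas": 1, "ready_replicas": 1}}
for line in describe_changes(old, new):
    print(f"  - {line}")
```

A comparison like this reports every differing field, including bookkeeping ones such as `metadata.resource_version`, so a real indexer may want to filter those out before embedding to keep the stored text focused on meaningful changes.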

About

AI Kubernetes resource watcher that indexes resource changes (Pods, ConfigMaps, ReplicaSets, Rollouts, ExternalSecrets) in real time and answers questions about them. (Chime hackathon Q3 2025)
