# ⚡ AI Infra Planner

An interactive, highly accurate capacity planning toolkit for AI Engineers, DevOps, System Administrators, and Researchers.

This tool calculates VRAM requirements for Large Language Models (LLMs), sizes RAM and disk overhead for vector databases, and automatically compiles a complete hardware recommendation for your workflow.

[**View Live Demo**](https://digitlib.github.io/aip/)

---

## 🛠️ Core Features

### 1. GPU / VRAM Calculator
Calculate VRAM requirements for a selected LLM workload based on real-world inference mechanics.
* **Production-Grade Math:** Splits memory into a Shared cost (weights + framework overhead) and a Per-User cost.
* **Architecture Aware:** Supports standard Dense models and Mixture of Experts (MoE), and calculates native KV cache reductions for MQA/GQA architectures.
* **Granular Controls:** Tweak weight quantization (FP16, INT8, Q4), KV cache precision, context length, batch size, and concurrency.
* **Hardware Matching:** Automatically filters a built-in database of enterprise and consumer GPUs (NVIDIA H100, RTX 5090, NVIDIA DGX Spark, etc.) to find configurations that fit your VRAM footprint.

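The Shared / Per-User split described above can be sketched with a commonly used approximation. This is not necessarily the formula the tool uses: the type names, the fixed `overheadBytes` term, and the example model dimensions are all illustrative assumptions. The KV cache term uses the KV head count rather than the attention head count, which is where the MQA/GQA reduction comes from. For MoE models, `params` should be the total parameter count, since every expert has to be resident in memory.

```typescript
type WeightQuant = "FP16" | "INT8" | "Q4";

// Approximate bytes per weight. Q4 ignores per-block scale metadata (assumption).
const BYTES_PER_WEIGHT: Record<WeightQuant, number> = { FP16: 2, INT8: 1, Q4: 0.5 };

interface ModelSpec {
  params: number;  // total parameter count
  layers: number;  // transformer layers
  kvHeads: number; // KV heads (fewer than attention heads under GQA, 1 under MQA)
  headDim: number; // dimension of each head
}

interface LlmWorkload {
  quant: WeightQuant;
  kvBytes: number;       // bytes per KV cache element (2 for FP16)
  contextLength: number; // tokens held per user
  concurrency: number;   // simultaneous users
  overheadBytes: number; // framework overhead, assumed to be a fixed constant
}

// Keys and values (factor 2) for every layer, KV head, and head dimension.
function kvBytesPerToken(m: ModelSpec, kvBytes: number): number {
  return 2 * m.layers * m.kvHeads * m.headDim * kvBytes;
}

function estimateVram(m: ModelSpec, w: LlmWorkload) {
  const shared = m.params * BYTES_PER_WEIGHT[w.quant] + w.overheadBytes;
  const perUser = kvBytesPerToken(m, w.kvBytes) * w.contextLength;
  return { shared, perUser, total: shared + perUser * w.concurrency };
}

// Illustrative 8B-parameter GQA model: 32 layers, 8 KV heads, head dimension 128.
const exampleModel: ModelSpec = { params: 8e9, layers: 32, kvHeads: 8, headDim: 128 };
const vramExample = estimateVram(exampleModel, {
  quant: "Q4",
  kvBytes: 2,
  contextLength: 8192,
  concurrency: 4,
  overheadBytes: 1024 ** 3, // 1 GiB, an assumed value
});
console.log((vramExample.total / 1024 ** 3).toFixed(2), "GiB");
```
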
### 2. Vector Database Planner
Size and compare open-source vector databases (Milvus, Weaviate, Qdrant, Chroma, Zvec, pgvector).
* **Graph Overhead Precision:** Calculates the memory overhead of HNSW and other graph indexes.
* **Workload Tuning:** Adjust total vectors, embedding dimensions, target QPS, High Availability (HA) replicas, and vector precision.
* **Instant Comparison:** Visualizes RAM utilization and resource use for each database.
* **Table Comparison:** Lists each database's features and index types.

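As a rough illustration of the sizing above, the sketch below adds raw vector storage to an HNSW link estimate of `M * 2 * 4` bytes per vector (up to `2 * M` neighbours on the base layer, 4-byte ids). This is a widely quoted rule of thumb, not the planner's actual model, and real databases add their own metadata and per-engine overhead. The field names are hypothetical, and `replicas` is assumed to count every copy including the primary.

```typescript
interface VectorWorkload {
  vectors: number;     // total vectors stored
  dimensions: number;  // embedding dimensions
  bytesPerDim: number; // 4 for float32, 2 for float16, 1 for int8
  hnswM: number;       // HNSW max links per node (assumed tuning parameter)
  replicas: number;    // total copies, including the primary (assumption)
}

function estimateVectorRam(w: VectorWorkload) {
  const raw = w.vectors * w.dimensions * w.bytesPerDim; // vector payload
  const graph = w.vectors * w.hnswM * 2 * 4;            // base-layer neighbour ids
  const perReplica = raw + graph;
  return { raw, graph, perReplica, total: perReplica * w.replicas };
}

// 1M float32 vectors of 768 dimensions, M = 16, two copies.
const vectorExample = estimateVectorRam({
  vectors: 1_000_000,
  dimensions: 768,
  bytesPerDim: 4,
  hnswM: 16,
  replicas: 2,
});
console.log((vectorExample.total / 1e9).toFixed(1), "GB");
```
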
### 3. Base Machine Configurator
Build a complete workstation/server spec around your AI workflow requirements.
* **Smart Auto-Sizer:** Automatically populates minimum CPU core counts and system RAM based on your saved LLM and Vector DB requirements.
* **Unified Memory Support:** Handles both dedicated GPU servers and Unified Memory architectures (Apple Mac Studio, NVIDIA DGX Spark).
* **Power & Cooling:** Calculates the estimated system TDP (Thermal Design Power) and recommends a circuit amperage for a 240V supply.
* **Config Export:** Generates a clean `.txt` file with component specs, subtotals, and hardware requirements.

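The power calculation can be illustrated with a minimal sketch: sum the component TDPs, convert to amps at 240V, and pick the smallest breaker that keeps the load at or below 80% of its rating (a common rule for continuous loads). The breaker list, the 80% factor, and the component wattages are assumptions for illustration, not values taken from the tool, and TDP is treated as wall draw, which ignores power supply efficiency.

```typescript
const SUPPLY_VOLTS = 240;
const CONTINUOUS_LOAD_FACTOR = 0.8;                  // assumed derating for continuous loads
const BREAKER_SIZES_AMPS = [15, 20, 30, 40, 50, 60]; // assumed list of standard sizes

// Sum of component TDPs in watts.
function systemTdpWatts(componentWatts: number[]): number {
  return componentWatts.reduce((sum, watts) => sum + watts, 0);
}

// Smallest listed breaker whose derated capacity covers the load, or null if none fits.
function recommendBreakerAmps(totalWatts: number): number | null {
  const loadAmps = totalWatts / SUPPLY_VOLTS;
  const requiredRating = loadAmps / CONTINUOUS_LOAD_FACTOR;
  const match = BREAKER_SIZES_AMPS.find((size) => size >= requiredRating);
  return match === undefined ? null : match;
}

// Hypothetical build: two 700 W GPUs, a 350 W CPU, 250 W for everything else.
const buildWatts = systemTdpWatts([700, 700, 350, 250]);
console.log(buildWatts, "W ->", recommendBreakerAmps(buildWatts), "A circuit");
```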
---

## 🏗️ Architecture & Tech Stack

This project is built for speed, privacy, and simplicity. It runs **100% locally in the browser** using static assets.

* **Framework:** [Astro](https://astro.build/) (Static Site Generation)
* **Logic:** Vanilla TypeScript / JavaScript
* **Styling:** Pure CSS (CSS Grid, Flexbox, CSS Variables)
* **State Management:** Browser `localStorage` (creates a seamless "shopping cart" flow between the three calculators without a backend).

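The "shopping cart" flow can be sketched as a small read-merge-write helper. The storage key, the shape of the saved object, and the function names below are hypothetical and not taken from the project's source. The helpers accept any object with `getItem` and `setItem`, so the browser's `localStorage` can be passed in directly, and the `MemoryStore` class is only a stand-in for running the sketch outside a browser.

```typescript
// Minimal subset of the Web Storage API; the browser's localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical shape of what each calculator saves for the configurator.
interface SavedRequirements {
  vramGiB?: number;
  vectorRamGiB?: number;
}

const STORAGE_KEY = "aip:requirements"; // hypothetical key name

function loadRequirements(store: KeyValueStore): SavedRequirements {
  const raw = store.getItem(STORAGE_KEY);
  if (raw === null) return {};
  try {
    return JSON.parse(raw) as SavedRequirements;
  } catch {
    return {}; // treat corrupted data as an empty cart
  }
}

// Merge one calculator's result into whatever the others already saved.
function saveRequirements(store: KeyValueStore, patch: SavedRequirements): SavedRequirements {
  const merged = { ...loadRequirements(store), ...patch };
  store.setItem(STORAGE_KEY, JSON.stringify(merged));
  return merged;
}

// In-memory stand-in for localStorage, for use outside a browser.
class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();
  getItem(key: string): string | null {
    const value = this.data.get(key);
    return value === undefined ? null : value;
  }
  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
}

const cart = new MemoryStore();
saveRequirements(cart, { vramGiB: 24 });     // saved by the VRAM calculator
saveRequirements(cart, { vectorRamGiB: 6 }); // saved by the vector DB planner
console.log(loadRequirements(cart));         // read by the machine configurator
```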
---

## 🚀 New Models or Hardware

To add new models or hardware components, please open an issue or a PR.