docs(turing): add Assets page and align Knowledge Base docs with actual flow

Alexandre Oliveira · claude · Alexandre Oliveira · commit 08d7b621d288 · 2026-03-20T08:47:52.000-03:00
- New assets.md: dual-panel layout, file table (Name/Size/Type/Modified/AI/Actions),
  upload with drag-and-drop, create folder, download, delete, preview panel
  (images/PDF/video/audio/text), AI training column with tooltip, batch training
  sequence diagram (Tika extraction, 100K truncation, 1024-char chunks, progress polling),
  automatic indexing on upload/delete, embedding metadata table, MinIO configuration
  with Docker Compose snippet
- genai-llm.md: replace Knowledge Base section to reference assets.md and add
  accurate indexing pipeline details (Tika, 100K chars, 1024-char chunks)
- administration-guide.md: update Knowledge Base table row to point to
  Management → Assets with link
- sidebars-turing.ts: add Management category with assets page

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
diff --git a/docs-turing/administration-guide.md b/docs-turing/administration-guide.md
@@ -30,7 +30,7 @@ A brief overview of each administration section:
 | **LLM Instances** | Administration → LLM Instances | Configure connections to Anthropic Claude, OpenAI, Azure OpenAI, Gemini, and Ollama |
 | **MCP Servers** | Administration → MCP Servers | Register external MCP servers (HTTP or stdio) to extend agent tool calling |
 | **AI Agents** | Administration → AI Agents | Compose agents from an LLM Instance + selected tools + MCP Servers |
-| **Knowledge Base** | Administration → Knowledge Base | Upload and organize files in MinIO; files are indexed as vector embeddings for RAG |
+| **Knowledge Base** | Management → Assets | Upload and organize files in MinIO; files are indexed as vector embeddings and queried by AI Agents. See [Assets](./assets.md) |
 
 ---
 
diff --git a/docs-turing/assets.md b/docs-turing/assets.md
@@ -0,0 +1,224 @@
+---
+sidebar_position: 1
+title: Assets
+description: Manage files and train the RAG knowledge base with Viglet Turing ES Assets.
+---
+
+# Assets
+
+The **Assets** section (`/console/asset`) is a file manager with built-in RAG training capabilities. It is available in the **Management** section of the sidebar and is only visible when **MinIO is enabled**.
+
+Assets serves as the Knowledge Base for AI Agents — every file uploaded here can be indexed as vector embeddings and queried by the LLM via tool calling. For the conceptual overview of how this fits into the GenAI architecture, see [Generative AI & LLM Configuration](./genai-llm.md).
+
+:::info MinIO required
+Assets and all RAG Knowledge Base features require MinIO to be configured. See [MinIO Configuration](#minio-configuration) at the bottom of this page.
+:::
+
+---
+
+## Layout
+
+The interface uses a **resizable dual-panel layout**:
+
+- **Left panel** — file and folder listing with the action toolbar
+- **Right panel** — inline preview of the selected file
+
+A **breadcrumb** at the top of the left panel shows the current folder path and allows navigation to any parent level. A **Root** button returns to the top-level folder instantly.
+
+---
+
+## File Table
+
+The file listing displays the following columns:
+
+| Column | Description |
+|---|---|
+| **Name** | File or folder name |
+| **Size** | File size in human-readable format |
+| **Type** | MIME type or folder indicator |
+| **Last Modified** | Date and time of the last modification |
+| **AI** | Training status — a checkmark indicates the file has been indexed as embeddings, with a tooltip showing the training timestamp |
+| **Actions** | Per-row download and delete buttons |
+
+---
+
+## File Management
+
+### Upload Files
+
+Files are uploaded to the **current folder** via drag-and-drop or a file picker. Multiple files can be selected in one operation. Uploads are sent to:
+
+```
+POST /api/asset
+```
+
+After upload, an **asynchronous event automatically triggers individual AI indexing** for each uploaded file — no manual training step is needed for new uploads.
+
+### Create Folder
+
+A dialog prompts for a folder name. Folders can be nested to any depth and are navigated via the breadcrumb.
+
+### Download
+
+Each file has a dedicated download button that preserves the original filename.
+
+### Delete
+
+Files and folders can be deleted via an inline button. A **toast notification** confirms completion. When a file is deleted, its **embeddings are automatically removed from the vector store**.
+
+---
+
+## Preview Panel
+
+Selecting a file opens an inline preview in the right panel without leaving the page. Supported formats:
+
+| Category | Formats |
+|---|---|
+| **Images** | PNG, JPEG, GIF, WebP, SVG, BMP |
+| **PDFs** | Rendered via iframe |
+| **Video** | MP4, WebM, OGG (with player controls) |
+| **Audio** | MP3, OGG, WAV, WebM (with player controls) |
+| **Text** | TXT, CSV, HTML, CSS, JS, JSON, XML |
+
+**Panel actions:**
+
+- **Maximise** — opens fullscreen view (press `Esc` to close)
+- **Download** — downloads the file directly from the preview panel
+- **Close** — collapses the preview panel
+
+The panel footer displays the file size, content type, modification date, and file extension.
+
+---
+
+## AI Training (RAG)
+
+The AI training features are available only when `ragEnabled=true` and both an embedding model and an embedding store are configured in **Administration → Global Settings → RAG Settings**.
+
+### Training Status per File
+
+The **AI column** in the file table shows the indexing state of each file:
+
+- ✅ **Checkmark** — file has been indexed; hover to see the training timestamp
+- *(empty)* — file has not yet been indexed
+
+### Automatic Training on Upload
+
+When a file is uploaded, Turing ES dispatches an **asynchronous event** that indexes the file individually, with no user action required. Similarly, when a file is deleted, its embeddings are automatically purged from the vector store.
+
+### Batch Training
+
+To index all existing files at once — useful after enabling RAG on an existing installation, or after changing the embedding model — use the **"Train AI with Assets"** button.
+
+```mermaid
+sequenceDiagram
+    participant Admin
+    participant UI as Assets UI
+    participant API as Turing ES
+    participant MinIO
+    participant Tika as Apache Tika
+    participant VS as Vector Store
+
+    Admin->>UI: Click "Train AI with Assets"
+    UI->>API: Start batch training
+    loop For each file in MinIO (recursive)
+        API->>MinIO: Download file
+        MinIO-->>API: File bytes
+        API->>Tika: Extract text
+        Tika-->>API: Plain text (truncated at 100,000 chars)
+        API->>API: Split into 1,024-char chunks
+        API->>VS: Create embeddings and store chunks
+        API->>API: Write record to asset_training_record
+    end
+    API-->>UI: Training complete
+```
+
+**Batch training steps for each file:**
+
+1. Download file bytes from MinIO
+2. Extract plain text via **Apache Tika** — supports PDF, DOCX, XLSX, PPTX, HTML, TXT, and images (with OCR)
+3. Truncate text to **100,000 characters**
+4. Split into **chunks of 1,024 characters**
+5. Generate embeddings and store in the configured vector store
+6. Write a record to `asset_training_record` with timestamp
+
+**Progress monitoring** — while the batch is running, the UI polls every **3 seconds** and displays:
+
+```
+X / Y files processed, Z errors
+```
+
+**Training states:** `IDLE` → `RUNNING` → `COMPLETED` / `FAILED`
+
+:::warning Re-training after embedding model change
+If you change the Default Embedding Model in **Administration → Global Settings → RAG Settings**, all existing embeddings become invalid, because vectors produced by different embedding models are not comparable. Run "Train AI with Assets" again to re-index all files with the new model.
+:::
+
+### Embedding Metadata
+
+Each chunk stored in the vector store carries the following metadata:
+
+| Field | Value |
+|---|---|
+| `source` | `"minio-asset"` |
+| `objectName` | Full object path in MinIO |
+| `objectPath` | Folder path within the bucket |
+| `fileName` | Original filename |
+| `contentType` | MIME type of the source file |
+| `size` | File size in bytes |
+
+This metadata is used by AI Agents when returning search results, so the LLM can cite the source file and provide context about where the information came from.
+
+---
+
+## How AI Agents Use the Knowledge Base
+
+Once files are indexed, AI Agents can query the knowledge base through four built-in tools, invoked via tool calling:
+
+| Tool | Description |
+|---|---|
+| `search_knowledge_base` | Semantic similarity search across all indexed chunks |
+| `knowledge_base_stats` | Returns total files, chunks, and storage size |
+| `list_knowledge_base_files` | Lists all indexed files, with optional keyword filter |
+| `get_file_from_knowledge_base` | Retrieves the full indexed content of a specific file |
+
+For details on configuring AI Agents and their tools, see [Generative AI & LLM Configuration: AI Agents](./genai-llm.md#ai-agents).
+
+---
+
+## MinIO Configuration
+
+MinIO must be enabled and configured before Assets becomes available:
+
+```properties
+turing.minio.enabled=true
+turing.minio.endpoint=http://minio:9000
+turing.minio.accessKey=minioadmin
+turing.minio.secretKey=minioadmin
+turing.minio.bucket=turing-assets
+```
+
+The bucket (`turing-assets` by default) is **created automatically on startup** if it does not exist.
+
+With Docker Compose, add the MinIO service alongside Turing ES:
+
+```yaml
+minio:
+  image: minio/minio
+  ports:
+    - "9000:9000"
+    - "9001:9001"
+  environment:
+    MINIO_ROOT_USER: minioadmin
+    MINIO_ROOT_PASSWORD: minioadmin
+  command: server /data --console-address ":9001"
+  volumes:
+    - minio_data:/data  # named volume; declare minio_data under the top-level volumes: key
+```
+
+:::tip
+The MinIO web console is available at `http://localhost:9001` when running locally via Docker Compose. Use it to inspect buckets and verify that files are being stored correctly.
+:::
+
+---
+
+*Previous: [Administration Guide](./administration-guide.md) | Next: [Generative AI & LLM Configuration](./genai-llm.md)*
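The per-file indexing steps documented in assets.md above (truncate the extracted text to 100,000 characters, split it into 1,024-character chunks, attach the embedding metadata) can be sketched in TypeScript as follows. This is only an illustration: every name here (`splitIntoChunks`, `buildMetadata`, `prepareChunks`) is hypothetical and not taken from the Turing ES source. The sketch also assumes plain fixed-width chunks with no overlap and an `objectPath` without a trailing slash, neither of which the documentation states.

```typescript
// Hypothetical sketch of the documented per-file indexing steps.
// Function and type names are illustrative, not the Turing ES API.

const MAX_TEXT_CHARS = 100000; // truncation limit stated in assets.md
const CHUNK_SIZE = 1024;       // chunk length stated in assets.md

// Mirrors the "Embedding Metadata" table in assets.md.
interface ChunkMetadata {
  source: "minio-asset";
  objectName: string;  // full object path in MinIO
  objectPath: string;  // folder path within the bucket
  fileName: string;    // original filename
  contentType: string; // MIME type of the source file
  size: number;        // file size in bytes
}

interface Chunk {
  text: string;
  metadata: ChunkMetadata;
}

// Split text into consecutive chunks of at most `size` characters.
// Assumption: no overlap and no word-boundary handling.
function splitIntoChunks(text: string, size: number = CHUNK_SIZE): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += size) {
    chunks.push(text.slice(i, i + size));
  }
  return chunks;
}

// Derive the metadata fields from an object name such as "manuals/setup/guide.pdf".
function buildMetadata(objectName: string, contentType: string, size: number): ChunkMetadata {
  const lastSlash = objectName.lastIndexOf("/");
  return {
    source: "minio-asset",
    objectName,
    objectPath: lastSlash >= 0 ? objectName.slice(0, lastSlash) : "",
    fileName: objectName.slice(lastSlash + 1),
    contentType,
    size,
  };
}

// Truncate first, then chunk, so at most 100,000 characters are ever embedded.
function prepareChunks(extractedText: string, metadata: ChunkMetadata): Chunk[] {
  const truncated = extractedText.slice(0, MAX_TEXT_CHARS);
  return splitIntoChunks(truncated).map((text) => ({ text, metadata }));
}
```

Under these assumptions a file whose extracted text exceeds the limit always yields 98 chunks: 97 full chunks of 1,024 characters and a final chunk of 672.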
diff --git a/docs-turing/genai-llm.md b/docs-turing/genai-llm.md
@@ -171,15 +171,20 @@ sequenceDiagram
 
 Turing ES retrieves the **top 10** most similar document chunks by default, using a similarity **threshold of 0.7**. Documents with a similarity score below the threshold are excluded from the context, preventing low-relevance content from polluting the prompt.
 
-### Knowledge Base (MinIO)
+### Knowledge Base (Assets)
 
-The Knowledge Base is a collection of files stored in MinIO and indexed as vector embeddings. Administrators manage files through a folder-based UI in the Turing ES admin console — creating folders, uploading documents, and organizing content in a way similar to a file system.
+The Knowledge Base is built from files managed in the **Assets** section (`/console/asset`), a file manager backed by MinIO. Administrators can create folders, upload documents, and browse content via a dual-panel interface. Files are indexed as vector embeddings and can be queried semantically by AI Agents.
 
-When a file is uploaded, the indexing pipeline:
-1. Extracts text content (including OCR for images and PDFs)
-2. Splits the content into chunks
-3. Generates a vector embedding for each chunk using the configured embedding model
-4. Stores the chunks and embeddings in the active embedding store
+Full documentation — including the UI layout, file preview, batch training, automatic indexing on upload/delete, and MinIO configuration — is available on the dedicated [Assets](./assets.md) page.
+
+**Indexing pipeline (per file):**
+
+1. Download file from MinIO
+2. Extract plain text via **Apache Tika** (supports PDF, DOCX, XLSX, PPTX, HTML, TXT, images with OCR)
+3. Truncate to **100,000 characters**
+4. Split into **chunks of 1,024 characters**
+5. Generate embeddings using the configured embedding model
+6. Store chunks in the active embedding store with source metadata
 
 The Knowledge Base is queried by AI Agents using the **RAG / Knowledge Base** tool callings:
 
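The retrieval rule quoted in the genai-llm.md context above (top 10 chunks, similarity threshold of 0.7) can be illustrated with a small TypeScript sketch. The `ScoredChunk` type and `selectContext` function are hypothetical names, not part of Turing ES. The sketch also assumes that a score exactly equal to the threshold is kept, since the documentation only says that scores below it are excluded.

```typescript
// Hypothetical sketch of the documented retrieval defaults.
// Names are illustrative and not taken from the Turing ES source.

interface ScoredChunk {
  text: string;
  score: number; // similarity score; higher means more similar
}

const TOP_K = 10;                 // default number of chunks, per genai-llm.md
const SIMILARITY_THRESHOLD = 0.7; // default threshold, per genai-llm.md

// Drop chunks below the threshold, then keep the best `topK` by score.
// filter() returns a new array, so the caller's input is not reordered.
function selectContext(
  candidates: ScoredChunk[],
  topK: number = TOP_K,
  threshold: number = SIMILARITY_THRESHOLD,
): ScoredChunk[] {
  return candidates
    .filter((c) => c.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

Filtering before slicing means the result can contain fewer than ten chunks when few candidates clear the threshold, which matches the stated goal of keeping low-relevance content out of the prompt.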
diff --git a/sidebars-turing.ts b/sidebars-turing.ts
@@ -21,6 +21,13 @@ const sidebars: SidebarsConfig = {
         "installation-guide",
         "administration-guide",
         "developer-guide",
+        {
+          type: "category",
+          label: "Management",
+          items: [
+            "assets",
+          ],
+        },
         {
           type: "category",
           label: "Generative AI",