# TLDR

**Transcribe an audio file**

```faster-whisper [audio.mp3]```

**Transcribe with a specific model**

```faster-whisper [audio.mp3] --model [large-v3]```

**Transcribe with a language hint**

```faster-whisper [audio.mp3] --language [en]```

**Output as SRT subtitles**

```faster-whisper [audio.mp3] --output_format [srt]```

**Translate to English**

```faster-whisper [audio.mp3] --task [translate]```

**Save output to a directory**

```faster-whisper [audio.mp3] --output_dir [/path/to/output]```

**Transcribe with word-level timestamps**

```faster-whisper [audio.mp3] --word_timestamps [true]```

# SYNOPSIS

**faster-whisper** _audio_ [**--model** _size_] [**--language** _lang_] [**--task** _task_] [_options_]

# PARAMETERS

**--model** _SIZE_
> Model size: tiny, base, small, medium, large-v1, large-v2, large-v3 (default: small).

**--language** _LANG_
> Language code (en, de, fr, etc.); auto-detected if omitted.

**--task** _TASK_
> Task: transcribe or translate.

**--output_format** _FORMAT_
> Output format: txt, vtt, srt, tsv, json, all.

**--output_dir** _DIR_
> Output directory for results.

**--word_timestamps** _BOOL_
> Include word-level timestamps.

**--device** _DEVICE_
> Device: cpu, cuda, auto (default: auto).

**--compute_type** _TYPE_
> Compute type: int8, float16, float32 (default: int8 on CPU).

**--beam_size** _N_
> Beam search size (default: 5).

**--vad_filter** _BOOL_
> Enable the voice activity detection (VAD) filter to skip silence.

**--threads** _N_
> Number of CPU threads to use.

# DESCRIPTION

**faster-whisper** is a reimplementation of OpenAI's Whisper using **CTranslate2**, a fast inference engine for Transformer models. It transcribes up to 4x faster than the original Whisper while using less memory.

The tool supports all Whisper model sizes; larger models are more accurate but slower. The compute type controls precision: int8 is fastest and most memory-efficient, float16 balances speed and accuracy on GPU, and float32 offers the highest precision.

Voice activity detection (VAD) filtering skips silent sections, improving both speed and accuracy. Language detection is automatic, but specifying the language avoids the detection overhead.

Install via pip (`pip install faster-whisper`). CTranslate2 handles model conversion automatically. GPU acceleration requires the CUDA toolkit.

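The CLI is a thin wrapper over the faster-whisper Python library. A minimal sketch of that API, mirroring the flags above (the `transcribe_options` helper and the `audio.mp3` path are illustrative, not part of the library):

```python
# Sketch of the faster-whisper Python API that the CLI wraps.
# Assumes `pip install faster-whisper`; "audio.mp3" is a placeholder path.

def transcribe_options(language=None, task="transcribe", beam_size=5,
                       vad_filter=False, word_timestamps=False):
    """Collect keyword options mirroring the CLI flags above."""
    opts = {"task": task, "beam_size": beam_size,
            "vad_filter": vad_filter, "word_timestamps": word_timestamps}
    if language is not None:  # omit to let the model auto-detect
        opts["language"] = language
    return opts

try:
    from faster_whisper import WhisperModel

    # int8 on CPU trades a little precision for speed and memory savings.
    model = WhisperModel("small", device="cpu", compute_type="int8")
    segments, info = model.transcribe(
        "audio.mp3", **transcribe_options(language="en", vad_filter=True))
    for segment in segments:
        print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
except Exception:
    # Library not installed or no audio file in this environment;
    # the call shape above is what matters.
    pass
```

Note that `transcribe` returns a generator, so transcription only proceeds as segments are consumed.
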
# CAVEATS

Large models require significant memory. GPU use requires the CUDA toolkit. The first run downloads and converts the models. Accuracy varies with audio quality. Speaker diarization is not available from the CLI (only via the Python API).

# HISTORY

**faster-whisper** was created by **Guillaume Klein** (SYSTRAN) in **2023**, using CTranslate2 to optimize Whisper inference. Its speed and memory advantages made it a preferred Whisper implementation for production use, and it is widely adopted in transcription workflows.

# SEE ALSO

[whisper](/man/whisper)(1), [deepspeech](/man/deepspeech)(1), [ffmpeg](/man/ffmpeg)(1)