Cover mode fixes, queue reset on startup, Simple/Custom shared fields, run_local.sh, UI cleanup

E · cursoragent · E · commit fa8d7d92daa0 · 2026-02-08T13:45:54.000+01:00
Co-authored-by: Cursor <cursoragent@cursor.com>
diff --git a/README.md b/README.md
@@ -1,4 +1,6 @@
-# <img height="250" alt="image" src="https://github.com/user-attachments/assets/000f485b-3bb1-48c4-8031-cd941eec6bf7" />
+<img height="250" alt="image" src="https://github.com/user-attachments/assets/000f485b-3bb1-48c4-8031-cd941eec6bf7" />
+
+# AceForge
 
 AceForge is a **local-first AI music workstation for macOS Silicon** powered by **[ACE-Step](https://github.com/ace-step/ACE-Step)**<br>
 
@@ -59,6 +61,24 @@ AceForge is a **local-first AI music workstation for macOS Silicon** powered by
 
 > **Note:** The app bundle does NOT include the large model files. On first run, it will download the ACE-Step models (several GB) automatically. You can monitor the download progress in the Terminal window or in the Server Console panel in the web interface.
 
+### Option 2: Run locally for testing (developers)
+
+To run the app from source without building the `.app` bundle:
+
+1. **One-time setup:** Install dependencies (e.g. run the full build once to create the venv):
+   ```bash
+   ./build_local.sh
+   ```
+   This creates `venv_build/` and installs Python deps. You can cancel after the PyInstaller step if you only want to run locally.
+
+2. **Run the server:**
+   ```bash
+   ./run_local.sh
+   ```
+   This builds the React UI if needed (requires [Bun](https://bun.sh)), then starts the Flask server at **http://127.0.0.1:5056**. Open that URL in your browser to use AceForge.
+
+If you prefer to use your own venv instead of `venv_build`, install deps with `pip install -r requirements_ace_macos.txt` (and the same extra steps as in `build_local.sh` for TTS/ACE-Step), then run `python music_forge_ui.py`.
+
 ## Using AceForge (high-level workflow)
 
 1. Launch AceForge and wait for the UI
diff --git a/api/generate.py b/api/generate.py
diff --git a/cdmf_pipeline_ace_step.py b/cdmf_pipeline_ace_step.py
@@ -1232,7 +1232,7 @@ def text2music_diffusion_process(
         use_erg_lyric=False,
         use_erg_diffusion=False,
         retake_random_generators=None,
-        retake_variance=0.5,
+        retake_variance=0.2,
         add_retake_noise=False,
         guidance_scale_text=0.0,
         guidance_scale_lyric=0.0,
@@ -1929,7 +1929,7 @@ def __call__(
         lora_name_or_path: str = "none",
         lora_weight: float = 1.0,
         retake_seeds: list = None,
-        retake_variance: float = 0.5,
+        retake_variance: float = 0.2,  # ACE-Step-MCP retake/repaint default
         task: str = "text2music",
         repaint_start: int = 0,
         repaint_end: int = 0,
diff --git a/docs/ACE-Step-INFERENCE.md b/docs/ACE-Step-INFERENCE.md
@@ -3,6 +3,11 @@
   Source: https://github.com/ace-step/ACE-Step-1.5/blob/main/docs/en/INFERENCE.md
   Use for: GenerationParams/GenerationConfig, task types (text2music, cover, repaint, etc.),
   reference_audio vs src_audio, audio_cover_strength, and parameter specs.
+
+  AceForge alignment: We align core defaults with the working ACE-Step-MCP reference
+  (https://huggingface.co/spaces/reach-vb/ACE-Step-MCP/blob/main/ui/components.py):
+  ref_audio_strength=0.5 (Audio2Audio/cover reference), retake_variance=0.2 (retake/repaint),
+  and pipeline param names ref_audio_input, src_audio_path, audio2audio_enable.
 -->
 
 # ACE-Step Inference API Documentation
diff --git a/docs/ACEFORGE_API.md b/docs/ACEFORGE_API.md
@@ -121,15 +121,17 @@ ACE-Step text-to-music (and related tasks). Jobs are queued and run one at a tim
 - `songDescription` or `style`: text prompt (caption).
 - `lyrics`: optional lyrics (or "[inst]" for instrumental).
 - `instrumental`: boolean (default true).
-- `duration`: seconds (15–240), or -1/0 for auto-detection (pipeline will randomly select 30–240s).
+- `duration`: seconds (15–240).
 - `inferenceSteps`: int (e.g. 55).
 - `guidanceScale`: float (e.g. 6.0).
 - `seed`: int; if `randomSeed` is true, server may override with random.
-- `taskType`: `"text2music"` | `"retake"` | `"repaint"` | `"extend"` | `"cover"` | `"audio2audio"` | `"lego"` | `"extract"` | `"complete"`. **Lego**, **extract**, and **complete** require the ACE-Step **Base** DiT model (see Preferences and ACE-Step models).
-- `instruction`: optional; for `taskType` **lego** (and extract/complete), task-specific instruction (e.g. `"Generate the guitar track based on the audio context:"`). If omitted for lego, the server builds one from track name/caption.
-- `referenceAudioUrl`, `sourceAudioUrl`: URLs like `/audio/refs/...` or `/audio/<filename>` for reference/cover. For **lego**, **extract**, and **complete**, **sourceAudioUrl** is the backing/source audio (required).
-- `audioCoverStrength` / `ref_audio_strength`: 0–1.
-- `repaintingStart`, `repaintingEnd`: for repaint task.
+- **Task and audio params** (see [ACE-Step Tutorial](https://github.com/ace-step/ACE-Step-1.5/blob/main/docs/en/Tutorial.md#guiding-the-elephant-what-can-you-control)): The API accepts both **ACE-Step official names** (snake_case) and **UI names** (camelCase): `task_type` or `taskType`; `reference_audio` or `referenceAudioUrl` or `reference_audio_path`; `src_audio` or `sourceAudioUrl` or `source_audio_path`; `audio_cover_strength` or `audioCoverStrength` or `ref_audio_strength`.
+- `taskType` / `task_type`: `"text2music"` | `"retake"` | `"repaint"` | `"extend"` | `"cover"` | `"audio2audio"` | `"lego"` | `"extract"` | `"complete"`. **Lego**, **extract**, and **complete** require the ACE-Step **Base** DiT model (see Preferences and ACE-Step models).
+- `instruction`: optional; for **lego** (and extract/complete), task-specific instruction (e.g. `"Generate the guitar track based on the audio context:"`). If omitted for lego, the server builds one from track name/caption.
+- `reference_audio` / `referenceAudioUrl`: path or URL for reference audio (style/timbre). `src_audio` / `sourceAudioUrl`: path or URL for source/backing audio (cover, repaint, lego, etc.). For **lego**, **extract**, and **complete**, source audio is required.
+- `audio_cover_strength` / `audioCoverStrength`: 0–1 (reference/source influence strength). Defaults: **cover/retake** 0.8 (strong source); **audio2audio** 0.5 (matches [ACE-Step-MCP](https://huggingface.co/spaces/reach-vb/ACE-Step-MCP)); **lego** uses `legoBackingInfluence` (default 0.25).
+- `retake_variance` / `retakeVariance`: 0–1, default 0.2 (retake/repaint; aligned with ACE-Step-MCP).
+- `repaintingStart`, `repaintingEnd` / `repaint_start`, `repaint_end`: for repaint task; -1 end = end of audio.
 - `title`: base name for output file.
 - `outputDir` / `output_dir`: optional; else uses app default.
 - `keyScale`, `timeSignature`, `vocalLanguage`, `bpm`: optional.
diff --git a/generate_ace.py b/generate_ace.py
@@ -567,9 +567,9 @@ def _prepare_reference_audio(
     src_audio_path: str | None,
 ) -> tuple[str, bool, Optional[str]]:
     """
-    Normalise the ACE-Step edit / audio2audio mode:
+    Normalise the ACE-Step edit / audio2audio mode (task_type, reference_audio, src_audio per Tutorial/INFERENCE):
 
-      - Task is clamped to one of: text2music / retake / repaint / extend.
+      - Task (task_type) is clamped to one of: text2music / retake / repaint / extend.
       - UI tasks "cover" and "audio2audio" are mapped to "retake" (ACE-Step
         then uses ref_audio_input and sets task to "audio2audio" internally).
       - If Audio2Audio is enabled while task is still 'text2music', we
@@ -830,11 +830,11 @@ def _run_ace_text2music(
     task: str = "text2music",
     repaint_start: float = 0.0,
     repaint_end: float = 0.0,
-    retake_variance: float = 0.5,
+    retake_variance: float = 0.2,  # MCP retake/repaint use 0.2
     src_audio_path: str | None = None,
-    # Audio2Audio + LoRA
+    # Audio2Audio + LoRA (ref_audio_strength 0.5 matches ACE-Step-MCP / pipeline default)
     audio2audio_enable: bool = False,
-    ref_audio_strength: float = 0.7,
+    ref_audio_strength: float = 0.5,
     lora_name_or_path: str | None = None,
     lora_weight: float = 0.75,
     cancel_check: Optional[Callable[[], bool]] = None,
@@ -884,13 +884,7 @@ def _run_ace_text2music(
     if not tags:
         raise ValueError("ACE-Step: tags/prompt cannot be empty.")
 
-    # Allow -1 or 0 for auto-detection (pipeline randomly selects 30-240s); otherwise clamp to minimum 1.0s
-    seconds = float(seconds)
-    if seconds > 0:
-        seconds = max(1.0, seconds)
-    elif seconds < 0 and seconds != -1:
-        # Only -1 and 0 are valid for auto mode; reject other negative values
-        raise ValueError("Duration must be > 0, or -1 or 0 for auto-detection.")
+    seconds = max(1.0, float(seconds))
     steps = max(1, int(steps))
     guidance_scale = float(guidance_scale)
     omega_scale = float(omega_scale)
@@ -1128,9 +1122,9 @@ def generate_track_ace(
     task: str = "text2music",
     repaint_start: float = 0.0,
     repaint_end: float = 0.0,
-    retake_variance: float = 0.5,
+    retake_variance: float = 0.2,  # ACE-Step-MCP retake/repaint default
     audio2audio_enable: bool = False,
-    ref_audio_strength: float = 0.7,
+    ref_audio_strength: float = 0.5,  # ACE-Step-MCP / pipeline default
     src_audio_path: str | None = None,
     lora_name_or_path: str | None = None,
     lora_weight: float = 0.75,
@@ -1180,9 +1174,8 @@ def generate_track_ace(
             )
 
     requested_total = float(target_seconds)
-    # Allow -1 or 0 for auto-detection (pipeline will randomly select 30-240s)
-    if requested_total <= 0 and requested_total not in (-1, 0):
-        raise ValueError("Target length must be > 0, or -1 or 0 for auto-detection.")
+    if requested_total <= 0:
+        raise ValueError("Target length must be > 0.")
 
     fade_in_seconds = max(0.0, float(fade_in_seconds))
     fade_out_seconds = max(0.0, float(fade_out_seconds))
@@ -1231,7 +1224,7 @@ def generate_track_ace(
     if cfg_type not in ("apg", "cfg", "cfg_star"):
         cfg_type = "apg"
 
-    # Normalise edit / Audio2Audio settings before we talk to ACE-Step.
+    # Normalise edit / Audio2Audio (maps to ACE-Step task_type, ref_audio_input, src_audio, audio_cover_strength).
     task, audio2audio_enable, src_audio_path = _prepare_reference_audio(
         task,
         bool(audio2audio_enable),
diff --git a/music_forge_ui.py b/music_forge_ui.py
@@ -395,9 +395,11 @@ def flush(self):
         preferences_bp,
         ace_step_models_bp,
     )
+    from api.generate import reset_generation_queue
     app.register_blueprint(auth_bp, url_prefix="/api/auth")
     app.register_blueprint(songs_bp, url_prefix="/api/songs")
     app.register_blueprint(generate_bp, url_prefix="/api/generate")
+    reset_generation_queue()
     app.register_blueprint(playlists_bp, url_prefix="/api/playlists")
     app.register_blueprint(users_bp, url_prefix="/api/users")
     app.register_blueprint(contact_bp, url_prefix="/api/contact")
diff --git a/run_local.sh b/run_local.sh
@@ -0,0 +1,45 @@
+#!/bin/bash
+# ---------------------------------------------------------------------------
+#  AceForge - Run locally for testing (no .app bundle)
+#  Builds the React UI if needed, then starts the Flask server at http://127.0.0.1:5056
+#
+#  Prerequisites:
+#    - Python 3.11 and dependencies. Easiest: run ./build_local.sh once to create
+#      venv_build and install deps; then use this script. Or create a venv and:
+#      pip install -r requirements_ace_macos.txt (plus TTS/ACE-Step as in build_local.sh).
+#    - Bun (only if ui/dist is missing): https://bun.sh
+#
+#  Optional:
+#    ACEFORGE_SKIP_UI_BUILD=1  - Skip UI build; use existing ui/dist/
+# ---------------------------------------------------------------------------
+
+set -e
+APP_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+cd "$APP_DIR"
+
+# Build UI if needed
+UI_DIR="${APP_DIR}/ui"
+if [ -f "$UI_DIR/package.json" ] && [ -z "${ACEFORGE_SKIP_UI_BUILD}" ]; then
+    if [ ! -f "$UI_DIR/dist/index.html" ]; then
+        if ! command -v bun &> /dev/null; then
+            echo "ERROR: ui/dist missing and Bun not found. Install Bun (https://bun.sh) or run: ./scripts/build_ui.sh"
+            exit 1
+        fi
+        echo "[Run] Building UI..."
+        "${APP_DIR}/scripts/build_ui.sh"
+    fi
+fi
+
+# Prefer venv from full build
+VENV_PY="${APP_DIR}/venv_build/bin/python"
+if [ -x "$VENV_PY" ]; then
+    PY="$VENV_PY"
+    echo "[Run] Using venv_build"
+else
+    PY="python3"
+    echo "[Run] Using system python3 (install deps in a venv if you see ModuleNotFoundError)"
+fi
+
+echo "[Run] Starting AceForge at http://127.0.0.1:5056"
+echo ""
+exec "$PY" music_forge_ui.py
diff --git a/ui/App.tsx b/ui/App.tsx
@@ -626,6 +626,7 @@ export default function App() {
       const genParams = {
         customMode: params.customMode,
         songDescription: params.songDescription,
+        prompt: params.prompt,
         lyrics: params.lyrics,
         style: params.style,
         title: params.title,
@@ -656,6 +657,7 @@ export default function App() {
         repaintingEnd: params.repaintingEnd,
         instruction: params.instruction,
         audioCoverStrength: params.audioCoverStrength,
+        coverBlendFactor: params.coverBlendFactor,
         taskType: params.taskType,
         useAdg: params.useAdg,
         cfgIntervalStart: params.cfgIntervalStart,
diff --git a/ui/components/CreatePanel.tsx b/ui/components/CreatePanel.tsx
diff --git a/ui/services/api.ts b/ui/services/api.ts
diff --git a/ui/types.ts b/ui/types.ts
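
Note on the `docs/ACEFORGE_API.md` hunk: the changes to `api/generate.py` are not shown in this diff, so the following is only a minimal sketch of how the documented name aliases and defaults could be resolved. The alias groups and default values are copied from the hunk. The helper `resolve_params`, the lookup tables, and the precedence order (first listed name wins) are assumptions made for illustration, not the project's actual code. The **lego** default (`legoBackingInfluence`) is left out.

```python
from typing import Any, Dict, Mapping

# Alias groups as listed in docs/ACEFORGE_API.md. The first name in each tuple
# is used as the canonical key; that choice is an assumption.
ALIASES = {
    "task_type": ("task_type", "taskType"),
    "reference_audio": ("reference_audio", "referenceAudioUrl", "reference_audio_path"),
    "src_audio": ("src_audio", "sourceAudioUrl", "source_audio_path"),
    "audio_cover_strength": ("audio_cover_strength", "audioCoverStrength", "ref_audio_strength"),
    "retake_variance": ("retake_variance", "retakeVariance"),
}

# Documented per-task strength defaults (cover/retake 0.8, audio2audio 0.5).
STRENGTH_DEFAULTS = {"cover": 0.8, "retake": 0.8, "audio2audio": 0.5}


def resolve_params(payload: Mapping[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: map accepted aliases to canonical names, then apply defaults."""
    resolved: Dict[str, Any] = {}
    for canonical, names in ALIASES.items():
        for name in names:
            # Assumed precedence: the first alias with a non-None value wins.
            if payload.get(name) is not None:
                resolved[canonical] = payload[name]
                break

    resolved.setdefault("task_type", "text2music")
    resolved.setdefault("retake_variance", 0.2)  # documented retake/repaint default

    task = resolved["task_type"]
    if "audio_cover_strength" not in resolved and task in STRENGTH_DEFAULTS:
        resolved["audio_cover_strength"] = STRENGTH_DEFAULTS[task]
    return resolved


if __name__ == "__main__":
    # A camelCase payload such as the UI might send.
    print(resolve_params({"taskType": "cover", "sourceAudioUrl": "/audio/refs/a.wav"}))
```

Keeping the alias table in one place means a new accepted name is a one-line change, and both the UI (camelCase) and ACE-Step style (snake_case) payloads go through the same defaulting path.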