modelstudioai · XXPermanentXX · Jun 22, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -6,6 +6,13 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and
 
 [中文版](CHANGELOG.zh.md) · [README](README.md) · [Contributing](CONTRIBUTING.md)
 
+## [1.4.1] - 2026-06-22
+
+### Changed
+
+- Video generation now defaults to the upgraded HappyHorse 1.1 model for better quality. The 1.0 models are still available via `--model`.
+- `bl update` now keeps the agent skill in sync across all your agent apps (Claude Code, Cursor, etc.), and refreshes it even when the CLI is already up to date.
+
 ## [1.4.0] - 2026-06-17
 
 ### Added

diff --git a/CHANGELOG.zh.md b/CHANGELOG.zh.md
@@ -6,6 +6,13 @@
 
 [English](CHANGELOG.md) · [README](README.zh.md) · [参与贡献](CONTRIBUTING.zh.md)
 
+## [1.4.1] - 2026-06-22
+
+### 变更
+
+- 视频生成默认升级到 HappyHorse 1.1 模型,画面质量更佳。如需使用 1.0 模型,可通过 `--model` 指定。
+- `bl update` 现在会把 agent skill 同步更新到所有 agent 应用(Claude Code、Cursor 等),即使 CLI 已是最新版本也会刷新 skill。
+
 ## [1.4.0] - 2026-06-17
 
 ### 新增

diff --git a/README.md b/README.md
@@ -27,7 +27,7 @@ Equip your AI Agent out-of-the-box with these capabilities, composable across co
 - **Text chat** — Qwen3.7-max: major gains in agentic coding, frontend coding, and vibe coding
 - **Multimodal (Omni)** — Full omni-modal support across text + image + audio + video
 - **Image generation & editing** — Qwen-Image 2.0: pro text rendering, photorealism, strong semantic adherence, multi-image composition
-- **Video generation & editing** — HappyHorse-1.0 series: text-/image-/reference-to-video and natural-language video editing (up to 9-image reference)
+- **Video generation & editing** — happyhorse-1.1 series: text-/image-/reference-to-video and natural-language video editing (up to 9-image reference)
 - **Speech synthesis & recognition** — CosyVoice streaming TTS, voice cloning from 5–20s samples; FunAudio-ASR covers 30 languages including 7 Chinese dialects and 20+ Mandarin accents
 - **Image & video understanding** — Qwen-VL: long-form video analysis, chart/document parsing, visual reasoning, multilingual OCR
 
@@ -54,7 +54,7 @@ Equip your AI Agent out-of-the-box with these capabilities, composable across co
 A complete **2-minute, 16:9 cinematic short film** — produced end-to-end from a single natural-language sentence, with **zero manual editing**. This showcase demonstrates how an AI Agent can compose a multi-step creative pipeline by orchestrating three primitives:
 
 - **[Qwen Code](https://github.com/QwenLM/qwen-code)** — the agentic coding model that interprets the user's intent and drives the workflow
-- **[Aliyun Model Studio CLI](https://bailian.console.aliyun.com/cli?source_channel=cli_github&)** — invokes **HappyHorse 1.0**, Aliyun Model Studio's text-/image-/reference-to-video generation model
+- **[Aliyun Model Studio CLI](https://bailian.console.aliyun.com/cli?source_channel=cli_github&)** — invokes **HappyHorse 1.1**, Aliyun Model Studio's text-/image-/reference-to-video generation model
 - **[spark-video Skill](https://github.com/JohnKeating1997/spark-video)** — handles scene decomposition, storyboarding, shot continuity, and final stitching
 
 ### The single prompt
@@ -67,7 +67,7 @@ A complete **2-minute, 16:9 cinematic short film** — produced end-to-end from
 
 1. **Qwen Code** parses the request, plans the narrative beats, and decides which tools to call.
 2. The **spark-video Skill** breaks the story into shots, writes per-shot prompts, and enforces visual continuity (characters, lighting, palette, lens language).
-3. **`bl video generate`** dispatches each shot to **HappyHorse 1.0** in parallel.
+3. **`bl video generate`** dispatches each shot to **HappyHorse 1.1** in parallel.
 4. The skill stitches all clips back together into a single 16:9 / ~2-min deliverable.
 
 No timeline scrubbing. No frame-by-frame editing. Just one sentence → one video.

diff --git a/README.zh.md b/README.zh.md
@@ -27,7 +27,7 @@ _专为 AI Agent 打造，每个命令均可作为结构化工具调用。_
 - **文本对话** — Qwen3.7-max：Agentic coding、前端编程、Vibe coding 等能力显著增强
 - **全模态对话** — 文本 + 图像 + 音频 + 视频全模态支持
 - **图像生成与编辑** — Qwen-Image 2.0：专业文字渲染、真实质感、强语义遵循、多图合成
-- **视频生成与编辑** — HappyHorse-1.0 系列，支持文生 / 图生 / 参考生（最多 9 张图参考）/ 自然语言视频编辑
+- **视频生成与编辑** — happyhorse-1.1 系列，支持文生 / 图生 / 参考生（最多 9 张图参考）/ 自然语言视频编辑
 - **语音合成与识别** — CosyVoice 实时流式合成，5-20s 样本即可克隆；FunAudio-ASR 覆盖 30 种语种，含汉语七大方言与 20+ 口音官话
 - **图像与视频理解** — Qwen-VL：长视频解析、复杂图表与文档识别、视觉推理、多语种 OCR
 
@@ -54,7 +54,7 @@ _专为 AI Agent 打造，每个命令均可作为结构化工具调用。_
 一部完整的 **2 分钟、16:9 电影感短片** —— 由一句自然语言端到端生成,**全程零手动剪辑**。这个示例展示了 AI Agent 如何把三个基础能力编排成一条多步创作流水线:
 
 - **[Qwen Code](https://github.com/QwenLM/qwen-code)** —— Agentic coding 模型,解析用户意图、驱动整个工作流
-- **[阿里云百炼 CLI](https://github.com/modelstudioai/cli/)** —— 调用 **HappyHorse 1.0**,百炼的文生/图生/参考生视频模型
+- **[阿里云百炼 CLI](https://github.com/modelstudioai/cli/)** —— 调用 **HappyHorse 1.1**,百炼的文生/图生/参考生视频模型
 - **[spark-video Skill](https://github.com/JohnKeating1997/spark-video)** —— 负责场景拆分、分镜设计、镜头连贯性和最终拼接
 
 ### 唯一的提示词
@@ -65,7 +65,7 @@ _专为 AI Agent 打造，每个命令均可作为结构化工具调用。_
 
 1. **Qwen Code** 解析需求、规划叙事节奏,决定要调用哪些工具。
 2. **spark-video Skill** 把故事拆成镜头、为每个镜头写提示词,并保证视觉连贯性(角色、光线、色调、镜头语言)。
-3. **`bl video generate`** 把每个镜头并行下发给 **HappyHorse 1.0**。
+3. **`bl video generate`** 把每个镜头并行下发给 **HappyHorse 1.1**。
 4. Skill 把所有片段拼成最终的 16:9 / 约 2 分钟成片。
 
 没有时间线拖拽,没有逐帧剪辑。一句话 → 一部短片。

diff --git a/packages/cli/README.md b/packages/cli/README.md
@@ -27,7 +27,7 @@ Equip your AI Agent out-of-the-box with these capabilities, composable across co
 - **Text chat** — Qwen3.7-max: major gains in agentic coding, frontend coding, and vibe coding
 - **Multimodal (Omni)** — Full omni-modal support across text + image + audio + video
 - **Image generation & editing** — Qwen-Image 2.0: pro text rendering, photorealism, strong semantic adherence, multi-image composition
-- **Video generation & editing** — HappyHorse-1.0 series: text-/image-/reference-to-video and natural-language video editing (up to 9-image reference)
+- **Video generation & editing** — happyhorse-1.1 series: text-/image-/reference-to-video and natural-language video editing (up to 9-image reference)
 - **Speech synthesis & recognition** — CosyVoice streaming TTS, voice cloning from 5–20s samples; FunAudio-ASR covers 30 languages including 7 Chinese dialects and 20+ Mandarin accents
 - **Image & video understanding** — Qwen-VL: long-form video analysis, chart/document parsing, visual reasoning, multilingual OCR
 
@@ -54,7 +54,7 @@ Equip your AI Agent out-of-the-box with these capabilities, composable across co
 A complete **2-minute, 16:9 cinematic short film** — produced end-to-end from a single natural-language sentence, with **zero manual editing**. This showcase demonstrates how an AI Agent can compose a multi-step creative pipeline by orchestrating three primitives:
 
 - **[Qwen Code](https://github.com/QwenLM/qwen-code)** — the agentic coding model that interprets the user's intent and drives the workflow
-- **[Aliyun Model Studio CLI](https://bailian.console.aliyun.com/cli?source_channel=cli_github&)** — invokes **HappyHorse 1.0**, Aliyun Model Studio's text-/image-/reference-to-video generation model
+- **[Aliyun Model Studio CLI](https://bailian.console.aliyun.com/cli?source_channel=cli_github&)** — invokes **HappyHorse 1.1**, Aliyun Model Studio's text-/image-/reference-to-video generation model
 - **[spark-video Skill](https://github.com/JohnKeating1997/spark-video)** — handles scene decomposition, storyboarding, shot continuity, and final stitching
 
 ### The single prompt
@@ -67,7 +67,7 @@ A complete **2-minute, 16:9 cinematic short film** — produced end-to-end from
 
 1. **Qwen Code** parses the request, plans the narrative beats, and decides which tools to call.
 2. The **spark-video Skill** breaks the story into shots, writes per-shot prompts, and enforces visual continuity (characters, lighting, palette, lens language).
-3. **`bl video generate`** dispatches each shot to **HappyHorse 1.0** in parallel.
+3. **`bl video generate`** dispatches each shot to **HappyHorse 1.1** in parallel.
 4. The skill stitches all clips back together into a single 16:9 / ~2-min deliverable.
 
 No timeline scrubbing. No frame-by-frame editing. Just one sentence → one video.

diff --git a/packages/cli/README.zh.md b/packages/cli/README.zh.md
@@ -27,7 +27,7 @@ _专为 AI Agent 打造，每个命令均可作为结构化工具调用。_
 - **文本对话** — Qwen3.7-max：Agentic coding、前端编程、Vibe coding 等能力显著增强
 - **全模态对话** — 文本 + 图像 + 音频 + 视频全模态支持
 - **图像生成与编辑** — Qwen-Image 2.0：专业文字渲染、真实质感、强语义遵循、多图合成
-- **视频生成与编辑** — HappyHorse-1.0 系列，支持文生 / 图生 / 参考生（最多 9 张图参考）/ 自然语言视频编辑
+- **视频生成与编辑** — happyhorse-1.1 系列，支持文生 / 图生 / 参考生（最多 9 张图参考）/ 自然语言视频编辑
 - **语音合成与识别** — CosyVoice 实时流式合成，5-20s 样本即可克隆；FunAudio-ASR 覆盖 30 种语种，含汉语七大方言与 20+ 口音官话
 - **图像与视频理解** — Qwen-VL：长视频解析、复杂图表与文档识别、视觉推理、多语种 OCR
 
@@ -54,7 +54,7 @@ _专为 AI Agent 打造，每个命令均可作为结构化工具调用。_
 一部完整的 **2 分钟、16:9 电影感短片** —— 由一句自然语言端到端生成,**全程零手动剪辑**。这个示例展示了 AI Agent 如何把三个基础能力编排成一条多步创作流水线:
 
 - **[Qwen Code](https://github.com/QwenLM/qwen-code)** —— Agentic coding 模型,解析用户意图、驱动整个工作流
-- **[阿里云百炼 CLI](https://github.com/modelstudioai/cli/)** —— 调用 **HappyHorse 1.0**,百炼的文生/图生/参考生视频模型
+- **[阿里云百炼 CLI](https://github.com/modelstudioai/cli/)** —— 调用 **HappyHorse 1.1**,百炼的文生/图生/参考生视频模型
 - **[spark-video Skill](https://github.com/JohnKeating1997/spark-video)** —— 负责场景拆分、分镜设计、镜头连贯性和最终拼接
 
 ### 唯一的提示词
@@ -65,7 +65,7 @@ _专为 AI Agent 打造，每个命令均可作为结构化工具调用。_
 
 1. **Qwen Code** 解析需求、规划叙事节奏,决定要调用哪些工具。
 2. **spark-video Skill** 把故事拆成镜头、为每个镜头写提示词,并保证视觉连贯性(角色、光线、色调、镜头语言)。
-3. **`bl video generate`** 把每个镜头并行下发给 **HappyHorse 1.0**。
+3. **`bl video generate`** 把每个镜头并行下发给 **HappyHorse 1.1**。
 4. Skill 把所有片段拼成最终的 16:9 / 约 2 分钟成片。
 
 没有时间线拖拽,没有逐帧剪辑。一句话 → 一部短片。

diff --git a/packages/cli/package.json b/packages/cli/package.json
@@ -1,6 +1,6 @@
 {
   "name": "bailian-cli",
-  "version": "1.4.0",
+  "version": "1.4.1",
   "description": "CLI for Aliyun Model Studio (DashScope) AI Platform.",
   "keywords": [
     "agent",

diff --git a/packages/cli/src/commands/video/generate.ts b/packages/cli/src/commands/video/generate.ts
@@ -31,12 +31,12 @@ import {
 export default defineCommand({
   name: "video generate",
   description:
-    "Generate a video from text or image (happyhorse-1.0-t2v / happyhorse-1.0-i2v / wan2.6-t2v)",
+    "Generate a video from text or image (happyhorse-1.1-t2v / happyhorse-1.1-i2v / wan2.6-t2v)",
   usage: "bl video generate --prompt <text> [--image <url>] [flags]",
   options: [
     {
       flag: "--model <model>",
-      description: "Model ID (default: happyhorse-1.0-t2v, or happyhorse-1.0-i2v with --image)",
+      description: "Model ID (default: happyhorse-1.1-t2v, or happyhorse-1.1-i2v with --image)",
     },
     { flag: "--prompt <text>", description: "Video description", required: true },
     { flag: "--image <url>", description: "Input image URL for image-to-video generation" },
@@ -98,7 +98,7 @@ export default defineCommand({
     const model =
       (flags.model as string) ||
       config.defaultVideoModel ||
-      ((flags.image as string) ? "happyhorse-1.0-i2v" : "happyhorse-1.0-t2v");
+      ((flags.image as string) ? "happyhorse-1.1-i2v" : "happyhorse-1.1-t2v");
     const format = detectOutputFormat(config.output);
 
     const imageUrl = flags.image as string | undefined;
@@ -118,7 +118,7 @@ export default defineCommand({
       input: {
         prompt: prompt!,
         negative_prompt: (flags.negativePrompt as string) || undefined,
-        // i2v models (happyhorse-1.0-i2v) require input.media with type 'first_frame'
+        // i2v models (happyhorse-1.1-i2v) require input.media with type 'first_frame'
         ...(resolvedImageUrl
           ? { media: [{ type: "first_frame" as const, url: resolvedImageUrl }] }
           : {}),
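
Review note: the hunk above only touches a comment, but the conditional spread it sits next to is what decides whether `input.media` is sent at all. A minimal sketch of that pattern follows, assuming only the fields visible in this diff. The `VideoInput` type and the `buildVideoInput` helper are hypothetical names used for illustration and are not part of the CLI.

```typescript
// Hypothetical types: only the fields visible in the diff are modelled here.
type FirstFrameMedia = { type: "first_frame"; url: string };

interface VideoInput {
  prompt: string;
  negative_prompt?: string;
  media?: FirstFrameMedia[];
}

// Illustrative helper (assumed name), mirroring the object literal in generate.ts.
function buildVideoInput(
  prompt: string,
  negativePrompt?: string,
  resolvedImageUrl?: string,
): VideoInput {
  return {
    prompt,
    negative_prompt: negativePrompt || undefined,
    // Spreading an empty object adds no keys, so `media` is absent for text-to-video.
    ...(resolvedImageUrl
      ? { media: [{ type: "first_frame" as const, url: resolvedImageUrl }] }
      : {}),
  };
}

const t2v = buildVideoInput("sunset over the sea");
const i2v = buildVideoInput(
  "sunset over the sea",
  undefined,
  "https://example.com/placeholder.png",
);

console.log("media" in t2v); // false
console.log(i2v.media?.[0]?.type); // first_frame
```

Because the key is omitted rather than set to `undefined`, the text-to-video dry-run check in the e2e suite (`expect(data.request?.input?.media).toBeUndefined()`) holds regardless of how the request is serialized.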

diff --git a/packages/cli/src/commands/video/ref.ts b/packages/cli/src/commands/video/ref.ts
@@ -30,10 +30,10 @@ import {
 export default defineCommand({
   name: "video ref",
   description:
-    "Reference-to-video generation (happyhorse-1.0-r2v / wan2.6-r2v): multi-subject, multi-shot with voice",
+    "Reference-to-video generation (happyhorse-1.1-r2v / wan2.6-r2v): multi-subject, multi-shot with voice",
   usage: "bl video ref --prompt <text> --image <url>... [--ref-video <url>...] [flags]",
   options: [
-    { flag: "--model <model>", description: "Model ID (default: happyhorse-1.0-r2v)" },
+    { flag: "--model <model>", description: "Model ID (default: happyhorse-1.1-r2v)" },
     {
       flag: "--prompt <text>",
       description: "Video description with reference markers (image1, video1, etc.)",
@@ -126,7 +126,7 @@ export default defineCommand({
     const imageVoices = (flags.imageVoice as string[] | undefined) || [];
     const videoVoices = (flags.videoVoice as string[] | undefined) || [];
 
-    const model = (flags.model as string) || "happyhorse-1.0-r2v";
+    const model = (flags.model as string) || "happyhorse-1.1-r2v";
     const format = detectOutputFormat(config.output);
 
     // --- Resolve file URLs (auto-upload local files) ---

diff --git a/packages/cli/src/pipeline/steps/bl-api.ts b/packages/cli/src/pipeline/steps/bl-api.ts
@@ -391,7 +391,7 @@ export async function videoGenerate(
     });
   }
 
-  const model = input.model || (input.image ? "happyhorse-1.0-i2v" : "happyhorse-1.0-t2v");
+  const model = input.model || (input.image ? "happyhorse-1.1-i2v" : "happyhorse-1.1-t2v");
 
   let resolvedImageUrl: string | undefined;
   if (input.image) {
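
Review note: the default model is now resolved in two places with slightly different fallback chains. `generate.ts` checks `--model`, then `config.defaultVideoModel`, then the image-based default, while this pipeline step skips the config lookup. The sketch below shows one way to express both chains with a single function. The `resolveVideoModel` name, the `ModelSources` shape, and the two constants are assumptions made for illustration and do not exist in the repository.

```typescript
// Hypothetical input shape: where a model ID can come from.
interface ModelSources {
  flagModel?: string; // value of --model (or input.model in the pipeline step)
  configDefault?: string; // config.defaultVideoModel, when the caller has a config
  hasImage: boolean; // true when --image / input.image is provided
}

// Assumed constant names; the string values are the defaults introduced by this diff.
const DEFAULT_T2V = "happyhorse-1.1-t2v";
const DEFAULT_I2V = "happyhorse-1.1-i2v";

// Same precedence as the `||` chain in generate.ts. Empty strings fall through
// to the next source because `||` treats them as falsy.
function resolveVideoModel({ flagModel, configDefault, hasImage }: ModelSources): string {
  return flagModel || configDefault || (hasImage ? DEFAULT_I2V : DEFAULT_T2V);
}

// Pipeline-style call: no config lookup, image present.
console.log(resolveVideoModel({ hasImage: true })); // happyhorse-1.1-i2v

// CLI-style call: a configured default wins over the built-in one.
console.log(resolveVideoModel({ configDefault: "wan2.6-t2v", hasImage: false })); // wan2.6-t2v

// An explicit flag always wins, which is how the 1.0 models stay reachable.
console.log(
  resolveVideoModel({ flagModel: "happyhorse-1.0-t2v", configDefault: "wan2.6-t2v", hasImage: true }),
); // happyhorse-1.0-t2v
```

One consequence worth keeping in mind for the changelog wording: a user whose config already sets `defaultVideoModel` keeps that model after upgrading, since the new 1.1 default only applies when neither the flag nor the config supplies a value.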

diff --git a/packages/cli/tests/e2e/video-download.e2e.test.ts b/packages/cli/tests/e2e/video-download.e2e.test.ts
@@ -91,7 +91,7 @@ describe.skipIf(!isBailianE2EVideoEnabled() || !isDashScopeE2EReady())(
         "video",
         "generate",
         "--model",
-        "happyhorse-1.0-t2v",
+        "happyhorse-1.1-t2v",
         "--duration",
         "3",
         "--prompt",

diff --git a/packages/cli/tests/e2e/video-generate-i2v.e2e.test.ts b/packages/cli/tests/e2e/video-generate-i2v.e2e.test.ts
@@ -37,7 +37,7 @@ describe.skipIf(!isBailianE2EVideoEnabled() || !isDashScopeE2EReady())(
         "video",
         "generate",
         "--model",
-        "happyhorse-1.0-i2v",
+        "happyhorse-1.1-i2v",
         "--image",
         "https://example.com/placeholder.png",
         "--non-interactive",
@@ -53,7 +53,7 @@ describe.skipIf(!isBailianE2EVideoEnabled() || !isDashScopeE2EReady())(
         "generate",
         "--dry-run",
         "--model",
-        "happyhorse-1.0-t2v",
+        "happyhorse-1.1-t2v",
         "--prompt",
         "干跑无图",
         "--non-interactive",
@@ -68,7 +68,7 @@ describe.skipIf(!isBailianE2EVideoEnabled() || !isDashScopeE2EReady())(
       expect(data.request?.input?.media).toBeUndefined();
     });
 
-    test("【happyhorse-1.0-i2v】图片生成视频", async () => {
+    test("【happyhorse-1.1-i2v】图片生成视频", async () => {
       const outDir = makeE2eOutputDir(e2eLabelFromMetaUrl(import.meta.url));
       const png = join(outDir, "e2e-gen.png");
       const gen = await runCli([
@@ -95,7 +95,7 @@ describe.skipIf(!isBailianE2EVideoEnabled() || !isDashScopeE2EReady())(
         "video",
         "generate",
         "--model",
-        "happyhorse-1.0-i2v",
+        "happyhorse-1.1-i2v",
         "--image",
         imagePath,
         "--prompt",

diff --git a/packages/cli/tests/e2e/video-generate-t2v.e2e.test.ts b/packages/cli/tests/e2e/video-generate-t2v.e2e.test.ts
@@ -37,7 +37,7 @@ describe.skipIf(!isBailianE2EVideoEnabled() || !isDashScopeE2EReady())(
         "video",
         "generate",
         "--model",
-        "happyhorse-1.0-t2v",
+        "happyhorse-1.1-t2v",
         "--non-interactive",
       ]);
       expect(exitCode).toBe(0);
@@ -51,7 +51,7 @@ describe.skipIf(!isBailianE2EVideoEnabled() || !isDashScopeE2EReady())(
         "generate",
         "--dry-run",
         "--model",
-        "happyhorse-1.0-t2v",
+        "happyhorse-1.1-t2v",
         "--prompt",
         "干跑校验",
         "--non-interactive",
@@ -62,18 +62,18 @@ describe.skipIf(!isBailianE2EVideoEnabled() || !isDashScopeE2EReady())(
       const data = parseStdoutJson<{ request?: { model?: string; input?: { prompt?: string } } }>(
         stdout,
       );
-      expect(data.request?.model).toBe("happyhorse-1.0-t2v");
+      expect(data.request?.model).toBe("happyhorse-1.1-t2v");
       expect(data.request?.input?.prompt).toBe("干跑校验");
     });
 
-    test("【happyhorse-1.0-t2v】文本生成视频", async () => {
+    test("【happyhorse-1.1-t2v】文本生成视频", async () => {
       const outDir = makeE2eOutputDir(e2eLabelFromMetaUrl(import.meta.url));
       const { stdout, stderr, exitCode } = await runCli([
         ...cliTimeoutPrefix(),
         "video",
         "generate",
         "--model",
-        "happyhorse-1.0-t2v",
+        "happyhorse-1.1-t2v",
         "--prompt",
         "夕阳下海面波光，远景静态镜头",
         "--download",

diff --git a/packages/cli/tests/e2e/video-ref-r2v.e2e.test.ts b/packages/cli/tests/e2e/video-ref-r2v.e2e.test.ts
@@ -37,7 +37,7 @@ describe.skipIf(!isBailianE2EVideoEnabled() || !isDashScopeE2EReady())(
         "video",
         "ref",
         "--model",
-        "happyhorse-1.0-r2v",
+        "happyhorse-1.1-r2v",
         "--image",
         "https://example.com/x.png",
         "--non-interactive",
@@ -52,7 +52,7 @@ describe.skipIf(!isBailianE2EVideoEnabled() || !isDashScopeE2EReady())(
         "video",
         "ref",
         "--model",
-        "happyhorse-1.0-r2v",
+        "happyhorse-1.1-r2v",
         "--prompt",
         "仅有描述无素材",
         "--non-interactive",
@@ -61,7 +61,7 @@ describe.skipIf(!isBailianE2EVideoEnabled() || !isDashScopeE2EReady())(
       expect(stderr).toMatch(/--image|ref-video|At least one|required/i);
     });
 
-    test("【happyhorse-1.0-r2v】视频参考生成", async () => {
+    test("【happyhorse-1.1-r2v】视频参考生成", async () => {
       const outDir = makeE2eOutputDir(e2eLabelFromMetaUrl(import.meta.url));
       const gen = await runCli([
         "image",
@@ -88,7 +88,7 @@ describe.skipIf(!isBailianE2EVideoEnabled() || !isDashScopeE2EReady())(
         "video",
         "ref",
         "--model",
-        "happyhorse-1.0-r2v",
+        "happyhorse-1.1-r2v",
         "--prompt",
         "图1在画面中心轻微晃动",
         "--image",

diff --git a/packages/cli/tests/stress/lib/fixtures.mjs b/packages/cli/tests/stress/lib/fixtures.mjs
@@ -180,7 +180,7 @@ export async function ensurePrerequisites(ctx) {
       "video",
       "generate",
       "--model",
-      "happyhorse-1.0-t2v",
+      "happyhorse-1.1-t2v",
       "--prompt",
       "压测前置短视频：海浪与静态远景，无明显人物。",
       "--duration",

diff --git a/packages/cli/tests/stress/lib/suite-fixtures.mjs b/packages/cli/tests/stress/lib/suite-fixtures.mjs
@@ -132,7 +132,7 @@ export async function generateCombinedFixtures({ suiteRoot, cliPackage }) {
       "video",
       "generate",
       "--model",
-      "happyhorse-1.0-t2v",
+      "happyhorse-1.1-t2v",
       "--prompt",
       "压测前置短视频：海浪与静态远景，无明显人物。",
       "--duration",