README.md: 3 additions & 3 deletions

--- a/README.md
+++ b/README.md
@@ -27,7 +27,7 @@ Equip your AI Agent out-of-the-box with these capabilities, composable across co
 - **Text chat** — Qwen3.7-max: major gains in agentic coding, frontend coding, and vibe coding
 - **Multimodal (Omni)** — Full omni-modal support across text + image + audio + video
 - **Image generation & editing** — Qwen-Image 2.0: pro text rendering, photorealism, strong semantic adherence, multi-image composition
-- **Video generation & editing** — HappyHorse-1.0 series: text-/image-/reference-to-video and natural-language video editing (up to 9-image reference)
+- **Video generation & editing** — happyhorse-1.1 series: text-/image-/reference-to-video and natural-language video editing (up to 9-image reference)
 - **Speech synthesis & recognition** — CosyVoice streaming TTS, voice cloning from 5–20s samples; FunAudio-ASR covers 30 languages including 7 Chinese dialects and 20+ Mandarin accents
 - **Image & video understanding** — Qwen-VL: long-form video analysis, chart/document parsing, visual reasoning, multilingual OCR
 
@@ -54,7 +54,7 @@ Equip your AI Agent out-of-the-box with these capabilities, composable across co
 A complete **2-minute, 16:9 cinematic short film** — produced end-to-end from a single natural-language sentence, with **zero manual editing**. This showcase demonstrates how an AI Agent can compose a multi-step creative pipeline by orchestrating three primitives:
 
 - **[Qwen Code](https://github.com/QwenLM/qwen-code)** — the agentic coding model that interprets the user's intent and drives the workflow
-- **[Aliyun Model Studio CLI](https://bailian.console.aliyun.com/cli?source_channel=cli_github&)** — invokes **HappyHorse 1.0**, Aliyun Model Studio's text-/image-/reference-to-video generation model
+- **[Aliyun Model Studio CLI](https://bailian.console.aliyun.com/cli?source_channel=cli_github&)** — invokes **HappyHorse 1.1**, Aliyun Model Studio's text-/image-/reference-to-video generation model
 - **[spark-video Skill](https://github.com/JohnKeating1997/spark-video)** — handles scene decomposition, storyboarding, shot continuity, and final stitching
 
 ### The single prompt
@@ -67,7 +67,7 @@ A complete **2-minute, 16:9 cinematic short film** — produced end-to-end from
 
 1. **Qwen Code** parses the request, plans the narrative beats, and decides which tools to call.
 2. The **spark-video Skill** breaks the story into shots, writes per-shot prompts, and enforces visual continuity (characters, lighting, palette, lens language).
-3. **`bl video generate`** dispatches each shot to **HappyHorse 1.0** in parallel.
+3. **`bl video generate`** dispatches each shot to **HappyHorse 1.1** in parallel.
 4. The skill stitches all clips back together into a single 16:9 / ~2-min deliverable.
 
 No timeline scrubbing. No frame-by-frame editing. Just one sentence → one video.
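
Note on the four-step pipeline in the hunk above: the plan, parallel dispatch, and ordered stitch pattern can be sketched as follows. This is a minimal illustration under stated assumptions, not the spark-video Skill's or the CLI's actual implementation. `Shot`, `plan_shots`, `generate_clip`, and `render` are hypothetical names, and `generate_clip` is a stub that stands in for one video-generation call rather than invoking `bl video generate`, whose flags are not shown in this diff.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Shot:
    index: int   # position of the shot in the final cut
    prompt: str  # per-shot prompt text


def plan_shots(story: str) -> list[Shot]:
    """Hypothetical stand-in for storyboarding: one shot per sentence."""
    sentences = [s.strip() for s in story.split(".") if s.strip()]
    return [Shot(i, s) for i, s in enumerate(sentences)]


def generate_clip(shot: Shot) -> str:
    """Hypothetical stub for one generation call; returns a clip file name."""
    return f"clip_{shot.index:03d}.mp4"


def render(story: str, max_workers: int = 4) -> list[str]:
    """Dispatch all shots in parallel and return clips in storyboard order."""
    shots = plan_shots(story)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, even when jobs
        # finish out of order, so the list is ready to stitch as-is.
        return list(pool.map(generate_clip, shots))
```

Threads suit this shape because each job would mostly wait on a remote call, and keeping the storyboard index on every shot means the stitching step never has to re-sort the clips.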
packages/cli/README.md: 3 additions & 3 deletions

--- a/packages/cli/README.md
+++ b/packages/cli/README.md
@@ -27,7 +27,7 @@ Equip your AI Agent out-of-the-box with these capabilities, composable across co
 - **Text chat** — Qwen3.7-max: major gains in agentic coding, frontend coding, and vibe coding
 - **Multimodal (Omni)** — Full omni-modal support across text + image + audio + video
 - **Image generation & editing** — Qwen-Image 2.0: pro text rendering, photorealism, strong semantic adherence, multi-image composition
-- **Video generation & editing** — HappyHorse-1.0 series: text-/image-/reference-to-video and natural-language video editing (up to 9-image reference)
+- **Video generation & editing** — happyhorse-1.1 series: text-/image-/reference-to-video and natural-language video editing (up to 9-image reference)
 - **Speech synthesis & recognition** — CosyVoice streaming TTS, voice cloning from 5–20s samples; FunAudio-ASR covers 30 languages including 7 Chinese dialects and 20+ Mandarin accents
 - **Image & video understanding** — Qwen-VL: long-form video analysis, chart/document parsing, visual reasoning, multilingual OCR
 
@@ -54,7 +54,7 @@ Equip your AI Agent out-of-the-box with these capabilities, composable across co
 A complete **2-minute, 16:9 cinematic short film** — produced end-to-end from a single natural-language sentence, with **zero manual editing**. This showcase demonstrates how an AI Agent can compose a multi-step creative pipeline by orchestrating three primitives:
 
 - **[Qwen Code](https://github.com/QwenLM/qwen-code)** — the agentic coding model that interprets the user's intent and drives the workflow
-- **[Aliyun Model Studio CLI](https://bailian.console.aliyun.com/cli?source_channel=cli_github&)** — invokes **HappyHorse 1.0**, Aliyun Model Studio's text-/image-/reference-to-video generation model
+- **[Aliyun Model Studio CLI](https://bailian.console.aliyun.com/cli?source_channel=cli_github&)** — invokes **HappyHorse 1.1**, Aliyun Model Studio's text-/image-/reference-to-video generation model
 - **[spark-video Skill](https://github.com/JohnKeating1997/spark-video)** — handles scene decomposition, storyboarding, shot continuity, and final stitching
 
 ### The single prompt
@@ -67,7 +67,7 @@ A complete **2-minute, 16:9 cinematic short film** — produced end-to-end from
 
 1. **Qwen Code** parses the request, plans the narrative beats, and decides which tools to call.
 2. The **spark-video Skill** breaks the story into shots, writes per-shot prompts, and enforces visual continuity (characters, lighting, palette, lens language).
-3. **`bl video generate`** dispatches each shot to **HappyHorse 1.0** in parallel.
+3. **`bl video generate`** dispatches each shot to **HappyHorse 1.1** in parallel.
 4. The skill stitches all clips back together into a single 16:9 / ~2-min deliverable.
 
 No timeline scrubbing. No frame-by-frame editing. Just one sentence → one video.