Windows desktop tool for the OpenClaw video-processing pipeline. Window recording is only the front-end capture and submission helper.
- Record video for a selected window
- Record system audio for the target process tree
- Optional auto-stop detection
- Mux the final MP4 after recording stops
- Optionally upload the finished recording to Tencent Cloud, trigger OpenClaw, and generate validated course or meeting notes through type-specific skills
| Area | Stack | Notes |
|---|---|---|
| Desktop UI | Python, CustomTkinter | Windows desktop app UI and settings dialogs |
| Desktop packaging | PyInstaller, Inno Setup | Build packaged app and Windows installer |
| Video capture | Windows Graphics Capture, ffmpeg gdigrab | WGC first, fallback to gdigrab |
| Audio capture | WASAPI process loopback | Target-process-tree system audio capture |
| Native helpers | C++, Visual Studio Build Tools, Windows SDK | wgc_capture_helper and wasapi_capture_helper |
| Encoding / mux | ffmpeg, libx264 | Real-time encode and final mux |
| Auto-stop | Python, NumPy | Frame diff, audio activity, timing thresholds |
| Cloud API | FastAPI, python-dotenv | Receives uploads and dispatches ingest jobs |
| Speech-to-text | Whisper | Audio transcription for course and meeting note generation |
| Frame understanding | ffmpeg keyframe sampling, Tesseract OCR | Extracts visual text from video frames when useful |
| OpenClaw integration | OpenClaw skills, hook trigger, ingest script, quality validation | video-summary, ingest, feishu-doc-delete, note validation and post-processing |
| Feishu integration | Feishu Open Platform APIs | Message delivery and doc post-processing |
| Search enhancement | Tavily | Optional external search enrichment for notes |
| Deployment | PowerShell, Bash, systemd, SSH/SCP | Local deploy wrapper + remote installer |
Recommended order:
- Prepare the server base and chatbot first
- Install the desktop app locally
- Deploy `server/` + `server-addon/`
- If using `ssh_tunnel`, open the local tunnel before testing cloud submit
The one-click deployment in this repo assumes the target server already has:
- A cloud server reachable via SSH
- OpenClaw base installed
- Feishu channel configured
- SSH access working from this Windows machine

The current deployment flow is primarily designed for cloud servers. Tencent Cloud, for example, already provides a relatively low-code base deployment path.

Base setup guides:
- OpenClaw on Tencent Cloud: https://cloud.tencent.com/developer/article/2624003
- Feishu channel setup on Tencent Cloud: https://cloud.tencent.com/developer/article/2626151
- Feishu app type and permission checklist: docs/feishu-app-permissions.md
Confirm SSH first:

    ssh <your-server-host-or-ssh-alias>

Then run the project deploy entry:

    powershell -ExecutionPolicy Bypass -File .\server-addon\deploy\deploy_all.ps1

This step will:
- Upload `server/` and `server-addon/`
- Run the standard remote installer
- Use conservative overwrite semantics
- Keep unrelated extra files on the remote host
- Sync managed OpenClaw skills into `~/.openclaw/workspace/skills`
- Download the generated connection file back to local paths:
  - `config/desktop-connection.json`
  - `server-addon/deploy/desktop-connection.json`
Deployment details:
Initialize the local environment:

    powershell -ExecutionPolicy Bypass -File .\scripts\bootstrap.ps1

Run in development mode:

    powershell -ExecutionPolicy Bypass -File .\scripts\run_dev.ps1

Install locally like a normal desktop app:

    powershell -ExecutionPolicy Bypass -File .\scripts\install_desktop.ps1

Optional: add a startup shortcut so the app launches at login.

    powershell -ExecutionPolicy Bypass -File .\scripts\install_desktop.ps1 -AddStartup

Clean uninstall:

    powershell -ExecutionPolicy Bypass -File .\scripts\uninstall_desktop.ps1

Build the packaged app:

    powershell -ExecutionPolicy Bypass -File .\scripts\build.ps1

Build the Windows installer (setup.exe) via Inno Setup:

    powershell -ExecutionPolicy Bypass -File .\scripts\build_installer.ps1

Installer output:

    dist\installer\VideoAssistantDesktopSetup.exe
If you are an end user and do not plan to build locally, download the prebuilt desktop installer from GitHub Releases.
After `deploy_all.ps1` finishes, start the desktop app.

The app auto-discovers `desktop-connection.json` in this order:
- `config/desktop-connection.json`
- `desktop-connection.json`
- `server-addon/deploy/desktop-connection.json`
- `~/Downloads/desktop-connection.json`
- The packaged app executable directory

If a valid file is found and the cloud settings are still incomplete, it is imported automatically.
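
The lookup order above can be sketched in Python as follows. This is a minimal illustration, not the app's actual code: the function names are hypothetical, the relative paths are assumed to be resolved against the project root, and "valid" is assumed to mean only "parses as a JSON object".

```python
from __future__ import annotations

import json
from pathlib import Path

FILENAME = "desktop-connection.json"


def candidate_paths(project_root: Path, home: Path, exe_dir: Path) -> list[Path]:
    """Return the lookup locations in the documented priority order."""
    return [
        project_root / "config" / FILENAME,
        project_root / FILENAME,
        project_root / "server-addon" / "deploy" / FILENAME,
        home / "Downloads" / FILENAME,
        exe_dir / FILENAME,
    ]


def discover_connection(paths: list[Path]) -> tuple[Path, dict] | None:
    """Return the first candidate that exists and parses as a JSON object."""
    for path in paths:
        if not path.is_file():
            continue
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or malformed file: try the next location
        if isinstance(data, dict):
            return path, data
    return None
```

Skipping a malformed file rather than failing lets a later location still succeed, which matches a first-valid-match lookup.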
If auto-discovery fails, open Cloud Settings and either:
- Import `desktop-connection.json`
- Or fill in the fields manually
If the generated connection mode is `ssh_tunnel`, keep this running in a second terminal:

    ssh -N -L 18000:127.0.0.1:8000 <your-server-host-or-ssh-alias>

If your generated local forwarding port is not 18000, replace it with the value from `desktop-connection.json`.
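
Before testing cloud submit, it can help to confirm that the forwarded port is accepting connections. The helper below is a hypothetical sketch (the function name is not part of the project). It checks only that something is listening locally, not that the remote API responds.

```python
import socket


def tunnel_is_listening(port: int = 18000, host: str = "127.0.0.1",
                        timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or otherwise unreachable.
        return False


if __name__ == "__main__":
    print("tunnel up" if tunnel_is_listening() else "tunnel down")
```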
- `ssh_tunnel`: choose this when the desktop app can only reach the server through SSH port forwarding.
- `public_http`: choose this when the server already has a directly reachable domain or public `IP:port`.
- `custom_url`: choose this when your API is published behind a non-standard reverse-proxy path.
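
As a rough illustration of how the three modes could map to a base URL, consider the sketch below. The dictionary keys (`mode`, `local_port`, `host`, `url`) are assumptions made for this example and are not taken from the real `desktop-connection.json` schema. The use of plain `http://` for `public_http` is also an assumption.

```python
def resolve_base_url(conn: dict) -> str:
    """Map a connection description to the base URL the client would call.

    All keys used here are hypothetical; check the generated
    desktop-connection.json for the real field names.
    """
    mode = conn.get("mode")
    if mode == "ssh_tunnel":
        # Traffic goes to the local end of the SSH port forward.
        return f"http://127.0.0.1:{int(conn.get('local_port', 18000))}"
    if mode == "public_http":
        # Host may be a domain or a public IP:port pair.
        return f"http://{conn['host']}".rstrip("/")
    if mode == "custom_url":
        # Full URL, including any reverse-proxy path prefix.
        return str(conn["url"]).rstrip("/")
    raise ValueError(f"unknown connection mode: {mode!r}")
```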
- Start the app
- Click `Refresh Window List`
- Select the target window
- Keep `video_backend: "auto"` for the first test
- Click `Start Recording`
- Interact with the target window and play audio inside it
- Stop recording and inspect the generated MP4 under `output/`
Recommended manual checks:
- Cover the target window with another window and confirm the video is still correct
- Play sound from another app and confirm it does not leak into the audio
- Check that the UI log reports the backend actually used
- If cloud submit is enabled, confirm the app auto-loaded the connection settings
- If using `ssh_tunnel`, confirm the SSH tunnel terminal is still running
- Submit one retained file manually before relying on auto-submit
Set `recording.video_backend` in `config/config.local.yaml`.
- `auto`: prefer WGC, fall back to `gdigrab` on failure
- `wgc`: force the Windows Graphics Capture helper
- `gdigrab`: force ffmpeg `gdigrab`
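
The selection rule can be sketched as below. This is an illustrative outline only: the `starters` callables stand in for whatever launches each backend, and the choice of `OSError` and `RuntimeError` as the failure signals is an assumption, not the project's actual error handling.

```python
from __future__ import annotations

from typing import Callable

Starter = Callable[[], None]  # hypothetical: launches one capture backend


def start_video_backend(setting: str, starters: dict[str, Starter]) -> str:
    """Start capture per `setting` and return the backend actually used."""
    if setting not in ("auto", "wgc", "gdigrab"):
        raise ValueError(f"unsupported video_backend: {setting!r}")
    if setting != "auto":
        # Forced backend: no fallback, errors propagate to the caller.
        starters[setting]()
        return setting
    try:
        starters["wgc"]()
        return "wgc"
    except (OSError, RuntimeError):
        # WGC unavailable or helper failed: fall back to ffmpeg gdigrab.
        starters["gdigrab"]()
        return "gdigrab"
```

Returning the backend name makes it easy to log which path was taken, which is what the manual checklist asks you to verify in the UI log.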
- Prefers `wgc_capture_helper` with real-time output and encoding
- Fallback path is ffmpeg `gdigrab`
- Uses the WASAPI process loopback helper
- Aims to capture system audio from the target process tree only
- Retains the final muxed `mp4` by default
- Can also retain separate audio and video outputs, for debugging or for choosing which file to submit to the cloud
The desktop app can submit recordings to the cloud API; the server then calls OpenClaw and finally sends results back to Feishu.

On the OpenClaw side, the flow is not a simple "receive a video and summarize it" step. It branches into different skill paths based on the `course` or `meeting` type the user selects.
- `course`: generates course notes organized by section and knowledge point, with detailed explanations, code snippets, and external references
- `meeting`: generates meeting minutes focused on topics, decisions, action items, and follow-up
- Whisper transcription: converts audio into text first and uses it as the primary note source
- Keyframes / OCR: extracts keyframes and reads on-screen text when the visual channel carries important information
The current pipeline also performs structural and quality checks on the notes, including section and knowledge-point coverage, explanation depth, code-block presence, external reference enrichment, and whether the selected note template is satisfied.
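
To make the idea of a structural gate concrete, here is a small sketch of three such checks on a Markdown note. The function name, the thresholds, and the assumption that notes are Markdown are all hypothetical. The real validation lives in the OpenClaw skills and may work differently.

```python
import re

FENCE = "`" * 3  # Markdown code-fence marker


def check_course_note(markdown: str, min_sections: int = 2) -> list[str]:
    """Return a list of problems; an empty list means the note passes."""
    problems = []
    # Section coverage: count level 1-3 headings.
    sections = re.findall(r"^#{1,3} +\S", markdown, flags=re.MULTILINE)
    if len(sections) < min_sections:
        problems.append(f"only {len(sections)} section(s), need {min_sections}")
    # Code-block presence: at least one opening and one closing fence.
    if markdown.count(FENCE) < 2:
        problems.append("no fenced code block")
    # External reference enrichment: at least one http(s) link.
    if not re.search(r"https?://\S+", markdown):
        problems.append("no external reference link")
    return problems
```

Returning a list of problems instead of a single boolean lets a gate report every failed check at once.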
- `ingest`: receives cloud-side jobs, triggers the OpenClaw processing chain, waits for results, and performs final gating
- `video-summary`: performs the actual note generation: audio extraction, Whisper transcription, keyframe / OCR enrichment, template-based note generation for `course` / `meeting`, and quality validation before output
- `feishu-doc-delete`: deletes Feishu cloud documents by link or token, as a standalone skill callable from chat
Related docs:
video-assistant-desktop/
├─ src/app/ # desktop application
│ ├─ main.py # app entrypoint
│ ├─ ui/main_window.py # main window, settings dialogs, cloud settings UI
│ ├─ core/recorder.py # recording orchestration, stop, mux, submit
│ ├─ core/video_capture.py # video backend selection and process control
│ ├─ core/cloud_client.py # cloud endpoint modes, submit, probing
│ ├─ core/feishu_client.py # Feishu-side client logic
│ ├─ core/session.py # recording session loop and state
│ ├─ core/auto_stop.py # auto-stop rules
│ ├─ core/audio_meter.py # audio activity sampling
│ ├─ core/window_tools.py # window enumeration and frame sampling
│ └─ assets/ # icons and static assets
├─ config/
│ └─ config.example.yaml # desktop config template
├─ tools/
│ ├─ wgc_capture_helper/ # WGC native helper source and build output
│ └─ wasapi_capture_helper/ # WASAPI native helper source
├─ scripts/ # local dev, build, install scripts
│ ├─ bootstrap.ps1 # bootstrap Python env and dependencies
│ ├─ run_dev.ps1 # run app in development mode
│ ├─ build.ps1 # build packaged desktop app (PyInstaller)
│ ├─ build_installer.ps1 # build Inno Setup installer
│ ├─ build_wgc_helper.ps1 # compile WGC helper
│ ├─ build_wasapi_helper.ps1 # compile WASAPI helper
│ ├─ install_desktop.ps1 # create local shortcuts
│ ├─ uninstall_desktop.ps1 # clean local install traces
│ ├─ generate_app_icon.py # generate app icon assets
│ └─ export_lock.ps1 # export dependency lock file
├─ installer/
│ ├─ VideoAssistantDesktop.iss # Inno Setup installer definition
│ └─ ChineseSimplified.isl # Chinese installer language file
├─ server/ # cloud API service
│ ├─ app.py # upload API, job dispatch, ingest entry
│ ├─ requirements.txt # server Python dependencies
│ └─ README.md # server service notes and debug flow
├─ server-addon/ # OpenClaw add-ons and deployment bundle
│ ├─ openclaw/skills/ # skills synced into OpenClaw
│ │ ├─ video-summary/ # course and meeting note skill
│ │ ├─ ingest/ # cloud trigger and result wait skill
│ │ └─ feishu-doc-delete/ # Feishu document deletion skill
│ └─ deploy/ # standard remote deployment entry
│ ├─ deploy_all.ps1 # upload locally and invoke remote installer
│ ├─ install_video_assistant_openclaw.sh # standard remote installer
│ ├─ deploy-inputs.example # local deployment input reference
│ └─ desktop-connection.json # desktop connection template
├─ docs/ # supplementary docs
│ ├─ cloud-processing-api.md # cloud API notes
│ ├─ testing-and-backends.md # backend testing and recording notes
│ └─ feishu-app-permissions.md # Feishu app type and permission checklist
├─ requirements/ # requirement sets and lock file
├─ Install Video Assistant Desktop.cmd # one-click install entry
├─ Uninstall Video Assistant Desktop.cmd # one-click uninstall entry
└─ README.md # project overview
- Windows
- Python 3.10 or 3.11
- ffmpeg
- Visual Studio Build Tools 2022
- Windows 11 SDK 22621
`bootstrap.ps1` handles the Python virtual environment and dependency installation.

Supported Python versions are currently:
- `3.10`
- `3.11`

Python 3.13 is not yet covered by packaging or runtime validation.
Generated files are written under `output/` by default.

During recording you may see intermediate files such as:
- `*.video.mp4`
- `*.audio.wav`
- `*.audio.stop`
- `*.video.stop`

The final deliverable is:
- `record_YYYYMMDD_HHMMSS.mp4`
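
A file name in the `record_YYYYMMDD_HHMMSS.mp4` pattern can be produced with standard date formatting, as in this sketch. The function name is hypothetical, and whether the app uses the recording start time or local time is an assumption here.

```python
from datetime import datetime
from pathlib import Path


def final_output_path(output_dir: Path, started_at: datetime) -> Path:
    """Build the final MP4 path from the recording timestamp."""
    return output_dir / f"record_{started_at:%Y%m%d_%H%M%S}.mp4"


if __name__ == "__main__":
    print(final_output_path(Path("output"), datetime.now()))
```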
Auto-stop remains in the Python layer.

It mainly uses:
- Sampled frame difference
- Audio activity
- Timing thresholds
- Optional end-signal conditions

Main config section:
- `auto_stop`

Main recording timing config:
- `recording.check_interval_seconds`
- `recording.max_duration_minutes`
- `recording.mux_timeout_seconds`
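
The way these signals can combine is sketched below as a small idle timer. Every name and threshold is hypothetical and does not reflect the real `auto_stop` settings. The frame difference and audio level are assumed to be precomputed scalars (for example, a mean absolute pixel difference and an RMS level), and the duration cap is expressed in seconds here for simplicity.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class IdleStopRule:
    """Stop after sustained inactivity or when a hard duration cap is hit."""

    frame_diff_threshold: float = 0.01   # hypothetical activity threshold
    audio_level_threshold: float = 0.02  # hypothetical activity threshold
    idle_seconds_to_stop: float = 60.0
    max_duration_seconds: float = 7200.0
    idle_since: float | None = None      # time when inactivity began

    def update(self, now: float, started_at: float,
               frame_diff: float, audio_level: float) -> bool:
        """Feed one sample; return True when recording should stop."""
        if now - started_at >= self.max_duration_seconds:
            return True
        active = (frame_diff > self.frame_diff_threshold
                  or audio_level > self.audio_level_threshold)
        if active:
            self.idle_since = None  # any activity resets the idle timer
            return False
        if self.idle_since is None:
            self.idle_since = now
        return now - self.idle_since >= self.idle_seconds_to_stop
```

A caller would invoke `update` once per check interval, so the interval bounds how quickly a stop condition is noticed.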
Ensure ffmpeg is installed and available in `PATH`, or follow the prompts from `bootstrap.ps1` to finish local dependency setup.
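
A quick way to check `PATH` from Python is `shutil.which`, which returns `None` when an executable cannot be found. The helper name below is hypothetical.

```python
import shutil


def missing_tools(names=("ffmpeg",)) -> list:
    """Return the names of required executables that are not on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    print("all tools found" if not missing else f"missing: {missing}")
```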
For audio capture problems, check first:
- Visual Studio Build Tools and the Windows SDK are installed completely
- `scripts/build_wasapi_helper.ps1` completed successfully
- The target process actually produces system audio that can be captured via loopback

For WGC capture problems, check first:
- The current system supports Windows Graphics Capture
- `scripts/build_wgc_helper.ps1` completed successfully
- If it still fails, fall back to `video_backend: "gdigrab"` to validate the rest of the recording pipeline
This project is licensed under the MIT License.