Commit e074d3c

Add bilingual README and MIT license
1 parent f612894 commit e074d3c

2 files changed

Lines changed: 253 additions & 43 deletions

LICENSE

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
1+
MIT License
2+
3+
Copyright (c) 2026 Winlifes
4+
5+
Permission is hereby granted, free of charge, to any person obtaining a copy
6+
of this software and associated documentation files (the "Software"), to deal
7+
in the Software without restriction, including without limitation the rights
8+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9+
copies of the Software, and to permit persons to whom the Software is
10+
furnished to do so, subject to the following conditions:
11+
12+
The above copyright notice and this permission notice shall be included in all
13+
copies or substantial portions of the Software.
14+
15+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21+
SOFTWARE.

README.md

Lines changed: 232 additions & 43 deletions
@@ -1,23 +1,56 @@
11
# SimulateInput
22

3-
Cross-platform desktop and browser automation platform for testing your own websites, desktop applications, installers, and system-level UI flows.
3+
English | [中文](#中文)
44

5-
## Features
5+
SimulateInput is a cross-platform desktop and browser automation platform for testing your own websites, desktop applications, installers, and system-level UI flows.
66

7-
- Window attach, focus, click, drag, type, hotkey, clear text, and screenshot actions
8-
- Multiple locator strategies: UIA/AX/AT-SPI style lookup, visible text, OCR, image matching, and coordinate fallback
9-
- CLI, MCP, and YAML case runner interfaces
10-
- Windows implementation with real smoke-tested execution
11-
- macOS MVP, Linux X11 MVP, and Linux Wayland compatibility layer
12-
- Skill docs for AI-driven automation workflows
7+
It combines direct input execution, multiple locator strategies, CLI and MCP interfaces, and YAML-driven reusable test cases so the same automation core can be used by engineers, CI pipelines, and AI agents.
138

14-
## Project Layout
9+
## Highlights
1510

16-
- `src/simulateinput/` - core engine, drivers, CLI, MCP server, runner, and locators
17-
- `docs/automation-platform-design.md` - architecture and implementation plan
18-
- `docs/cross-platform-installation.md` - platform-specific setup and permissions
19-
- `skills/simulateinput/` - skill definition and MCP/CLI references
20-
- `tests/` - unit tests and smoke case YAML files
11+
- Cross-platform driver architecture for Windows, macOS, Linux X11, and Linux Wayland compatibility
12+
- Multiple locator strategies:
13+
  - structured accessibility lookup
14+
  - visible text lookup
15+
  - OCR-based text lookup
16+
  - image template matching
17+
  - coordinate fallback
18+
- Real input actions:
19+
  - click
20+
  - drag
21+
  - type text
22+
  - press key
23+
  - hotkey
24+
  - clear text
25+
  - screenshot
26+
- MCP server for AI tool calling
27+
- YAML case runner for repeatable automation flows
28+
- Skill definitions and references for AI-assisted execution
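
The README lists the locator strategies but does not say how they are combined. One plausible arrangement, sketched below in Python, tries them in priority order and moves to the next when a strategy finds nothing. The function names and the returned coordinates are hypothetical stand-ins, not SimulateInput's API.

```python
from typing import Callable, Optional, Sequence, Tuple

Point = Tuple[int, int]
# A locator takes a target description and returns screen coordinates,
# or None when it cannot find the target.
Locator = Callable[[str], Optional[Point]]


def locate_with_fallback(
    target: str, locators: Sequence[Tuple[str, Locator]]
) -> Tuple[str, Point]:
    """Try each (name, locator) pair in order and return the first hit."""
    for name, locator in locators:
        point = locator(target)
        if point is not None:
            return name, point
    raise LookupError(f"no locator matched {target!r}")


# Hypothetical stand-ins for the real strategies.
def accessibility_lookup(target: str) -> Optional[Point]:
    return None  # pretend the control tree has no such element


def ocr_lookup(target: str) -> Optional[Point]:
    return (120, 48) if target == "Submit" else None


def coordinate_fallback(target: str) -> Optional[Point]:
    return (0, 0)  # last resort: a fixed, caller-supplied position


chain = [
    ("accessibility", accessibility_lookup),
    ("ocr", ocr_lookup),
    ("coordinate", coordinate_fallback),
]

print(locate_with_fallback("Submit", chain))  # ('ocr', (120, 48))
```

Returning the name of the strategy that matched makes it visible in logs when a case only passes through the coordinate fallback.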
29+
30+
## Current Platform Status
31+
32+
- Windows: primary implementation, real execution and smoke tested
33+
- macOS: MVP driver implemented, requires Accessibility / Automation / Screen Recording permissions
34+
- Linux X11: MVP driver implemented, depends on `wmctrl`, `xdotool`, screenshot helpers, and optional AT-SPI tooling
35+
- Linux Wayland: compatibility layer, helper-tool dependent and not yet full parity
36+
37+
## Repository Structure
38+
39+
- `src/simulateinput/`
40+
  - core engine
41+
  - platform drivers
42+
  - locators
43+
  - CLI
44+
  - MCP server
45+
  - case runner
46+
- `docs/automation-platform-design.md`
47+
  - architecture and implementation plan
48+
- `docs/cross-platform-installation.md`
49+
  - platform setup, dependencies, and permissions
50+
- `skills/simulateinput/`
51+
  - skill definition and CLI / MCP references
52+
- `tests/`
53+
  - unit tests and smoke case YAML files
2154

2255
## Quick Start
2356

@@ -28,7 +61,7 @@ python -m simulateinput.cli.main session start
2861
python -m simulateinput.cli.main mcp tools
2962
```
3063

31-
## Common CLI Flow
64+
## Typical CLI Workflow
3265

3366
```powershell
3467
$env:PYTHONPATH='src'
@@ -49,24 +82,21 @@ $env:PYTHONPATH='src'
4982
python -m simulateinput.cli.main case run tests/e2e/cases/windows-smoke.yaml
5083
```
5184

52-
Example step types:
53-
54-
- `attach_window`
55-
- `locate_text`
56-
- `locate_uia`
57-
- `locate_ocr`
58-
- `locate_image`
59-
- `click_text`
60-
- `click_uia`
61-
- `click_ocr`
62-
- `click_image`
63-
- `click`
64-
- `drag`
65-
- `type_text`
66-
- `press_key`
67-
- `hotkey`
68-
- `clear_text`
69-
- `screenshot`
85+
Example case:
86+
87+
```yaml
88+
name: locator-smoke
89+
profile: lab_default
90+
steps:
91+
  - action: attach_window
92+
    title: Notepad
93+
94+
  - action: locate_text
95+
    text: File
96+
97+
  - action: screenshot
98+
    output: artifacts/locator-smoke.png
99+
```
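
To illustrate how a runner might consume a case like this, the sketch below walks the `steps` list and dispatches each step by its `action` name. It is a minimal sketch under assumptions: the case is written as the dict a YAML parser would produce, and `run_case` and the handlers are hypothetical, not the project's actual runner.

```python
from typing import Any, Callable, Dict, List

# The example case, written as the dict a YAML parser would produce.
case: Dict[str, Any] = {
    "name": "locator-smoke",
    "profile": "lab_default",
    "steps": [
        {"action": "attach_window", "title": "Notepad"},
        {"action": "locate_text", "text": "File"},
        {"action": "screenshot", "output": "artifacts/locator-smoke.png"},
    ],
}

Handler = Callable[..., str]


def run_case(case: Dict[str, Any], handlers: Dict[str, Handler]) -> List[str]:
    """Run each step in order, passing its remaining keys as keyword arguments."""
    log: List[str] = []
    for index, step in enumerate(case["steps"], start=1):
        params = dict(step)  # copy so the case itself is not mutated
        action = params.pop("action")
        if action not in handlers:
            raise ValueError(f"step {index}: unknown action {action!r}")
        log.append(handlers[action](**params))
    return log


# Hypothetical handlers that only describe what a real driver would do.
handlers: Dict[str, Handler] = {
    "attach_window": lambda title: f"attached {title}",
    "locate_text": lambda text: f"located {text}",
    "screenshot": lambda output: f"saved {output}",
}

for line in run_case(case, handlers):
    print(line)
```

Rejecting unknown actions up front, with the step index in the message, keeps a typo in a case file from being silently skipped.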
70100
71101
## MCP
72102
@@ -77,24 +107,183 @@ $env:PYTHONPATH='src'
77107
python -m simulateinput.cli.main mcp serve
78108
```
79109

80-
Current MCP tools include session management, window attach, text/UIA/OCR/image lookup, click actions, keyboard actions, drag, and screenshot capture.
81-
82-
## Platform Status
110+
Current MCP capabilities include:
83111

84-
- `Windows` - primary implementation, real execution and smoke tested
85-
- `macOS` - MVP driver implemented, requires Accessibility / Automation / Screen Recording permissions
86-
- `Linux X11` - MVP driver implemented, depends on `wmctrl`, `xdotool`, and a screenshot helper
87-
- `Linux Wayland` - compatibility layer, helper-tool dependent and not full parity
112+
- session management
113+
- window attach
114+
- structured locators
115+
- OCR and image locators
116+
- click and drag actions
117+
- keyboard actions
118+
- screenshot capture
88119

89-
## Installation Notes
120+
## Installation
90121

91122
See `docs/cross-platform-installation.md` for:
92123

93124
- Python dependencies
94125
- Tesseract OCR setup
95126
- macOS permissions
96-
- Linux X11 and Wayland helper packages
127+
- Linux helper packages
128+
- platform smoke cases
129+
130+
## Documentation
131+
132+
- Architecture: `docs/automation-platform-design.md`
133+
- Installation: `docs/cross-platform-installation.md`
134+
- Skill: `skills/simulateinput/SKILL.md`
135+
- CLI reference: `skills/simulateinput/references/cli-usage.md`
136+
- MCP reference: `skills/simulateinput/references/mcp-tools.md`
97137

98138
## Safety Boundary
99139

100-
This project is intended for automation of your own software, test environments, and explicitly authorized systems. It is not intended for bypassing third-party anti-bot controls or CAPTCHAs.
140+
SimulateInput is intended for automation of your own software, test environments, and explicitly authorized systems.
141+
142+
It is not intended for bypassing third-party anti-bot controls, CAPTCHAs, or unrelated security mechanisms.
143+
144+
---
145+
146+
## 中文
147+
148+
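SimulateInput is a cross-platform desktop and browser test automation platform for your own websites, desktop software, installers, and system-level UI flows.
149+
150+
It integrates real input execution, multiple locator strategies, CLI / MCP interfaces, and reusable YAML test cases into a single automation core that engineers can use directly and that can also be plugged into CI and AI agents.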
151+
152+
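## Core Capabilities
153+
154+
- Cross-platform driver architecture: Windows, macOS, Linux X11, and a Linux Wayland compatibility layer
155+
- Multiple locator strategies:
156+
  - structured accessibility / control-tree lookup
157+
  - visible text lookup
158+
  - OCR text lookup
159+
  - image template lookup
160+
  - coordinate fallback
161+
- Real input actions:
162+
  - click
163+
  - drag
164+
  - text input
165+
  - single-key press
166+
  - key combination (hotkey)
167+
  - clear text
168+
  - screenshot
169+
- MCP server that AI can use through tool calls
170+
- YAML case runner for executing reusable automation test flows
171+
- Skill documents and reference material prepared for AI use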
172+
173+
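## Current Platform Status
174+
175+
- Windows: primary implementation; real execution and smoke testing completed
176+
- macOS: MVP driver completed; the implementation depends on Accessibility / Automation / Screen Recording permissions
177+
- Linux X11: MVP driver completed; depends on `wmctrl`, `xdotool`, a screenshot tool, and an optional AT-SPI environment
178+
- Linux Wayland: currently a compatibility layer that depends on external helpers; capabilities are not yet on par with Windows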
179+
180+
## Repository Structure
181+
182+
- `src/simulateinput/`
183+
  - core engine
184+
  - platform drivers
185+
  - locators
186+
  - CLI
187+
  - MCP server
188+
  - case runner
189+
- `docs/automation-platform-design.md`
190+
  - overall design document
191+
- `docs/cross-platform-installation.md`
192+
  - cross-platform installation, dependencies, and permissions
193+
- `skills/simulateinput/`
194+
  - AI skill definition and CLI / MCP references
195+
- `tests/`
196+
  - unit tests and smoke case YAML
197+
198+
## Quick Start
199+
200+
```powershell
201+
$env:PYTHONPATH='src'
202+
python -m simulateinput.cli.main doctor
203+
python -m simulateinput.cli.main session start
204+
python -m simulateinput.cli.main mcp tools
205+
```
206+
207+
## Common CLI Workflow
208+
209+
```powershell
210+
$env:PYTHONPATH='src'
211+
212+
python -m simulateinput.cli.main session start
213+
python -m simulateinput.cli.main window list --session-id <session_id>
214+
python -m simulateinput.cli.main window attach --session-id <session_id> --window-id <window_id>
215+
216+
python -m simulateinput.cli.main locate uia --session-id <session_id> --name "Submit"
217+
python -m simulateinput.cli.main action click-uia --session-id <session_id> --name "Submit"
218+
python -m simulateinput.cli.main action screenshot --session-id <session_id> --output artifacts/shot.png
219+
```
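
The same sequence can be scripted. The sketch below only builds the argument lists for the commands shown above, so they could be handed to `subprocess.run` with `PYTHONPATH` set to `src`. `cli` is a hypothetical helper, and `S1` / `W1` are placeholder ids, since the output format of `session start` and `window list` is not shown here.

```python
import shlex
from typing import List

# Entry point used throughout the README.
BASE = ["python", "-m", "simulateinput.cli.main"]


def cli(*parts: str, **options: str) -> List[str]:
    """Build an argv list; keyword names become --dashed-flags."""
    argv = BASE + list(parts)
    for key, value in options.items():
        argv += ["--" + key.replace("_", "-"), value]
    return argv


# Placeholder ids; real values would come from earlier command output.
session_id, window_id = "S1", "W1"

commands = [
    cli("session", "start"),
    cli("window", "list", session_id=session_id),
    cli("window", "attach", session_id=session_id, window_id=window_id),
    cli("locate", "uia", session_id=session_id, name="Submit"),
    cli("action", "click-uia", session_id=session_id, name="Submit"),
    cli("action", "screenshot", session_id=session_id, output="artifacts/shot.png"),
]

for argv in commands:
    # shlex.join quotes arguments so each line can be pasted into a POSIX shell.
    print(shlex.join(argv))
```

Building argument lists instead of shell strings avoids quoting problems when a control name contains spaces.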
220+
221+
## Running YAML Cases
222+
223+
```powershell
224+
$env:PYTHONPATH='src'
225+
python -m simulateinput.cli.main case run tests/e2e/cases/windows-smoke.yaml
226+
```
227+
228+
Example:
229+
230+
```yaml
231+
name: locator-smoke
232+
profile: lab_default
233+
steps:
234+
  - action: attach_window
235+
    title: Notepad
236+
237+
  - action: locate_text
238+
    text: File
239+
240+
  - action: screenshot
241+
    output: artifacts/locator-smoke.png
242+
```
243+
244+
## MCP Integration
245+
246+
Start the local MCP server:
247+
248+
```powershell
249+
$env:PYTHONPATH='src'
250+
python -m simulateinput.cli.main mcp serve
251+
```
252+
253+
MCP currently supports:
254+
255+
- session management
256+
- window attach
257+
- structured locators
258+
- OCR / image locators
259+
- click and drag
260+
- keyboard actions
261+
- screenshot
262+
263+
## Installation Notes
264+
265+
See `docs/cross-platform-installation.md`, which covers:
266+
267+
- Python dependencies
268+
- Tesseract OCR installation
269+
- macOS permission setup
270+
- Linux helper tool installation
271+
- platform smoke case notes
272+
273+
## Documentation
274+
275+
- Architecture design: `docs/automation-platform-design.md`
276+
- Installation guide: `docs/cross-platform-installation.md`
277+
- Skill: `skills/simulateinput/SKILL.md`
278+
- CLI reference: `skills/simulateinput/references/cli-usage.md`
279+
- MCP reference: `skills/simulateinput/references/mcp-tools.md`
280+
281+
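## Safety Boundary
282+
283+
SimulateInput should only be used on:
284+
285+
- your own software
286+
- test environments
287+
- explicitly authorized systems
288+
289+
It is not for bypassing third-party anti-automation mechanisms, CAPTCHAs, or unrelated security controls.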
