Commit 4bd5bfc

docs: add advanced system-wide masking guide for contributors
Document ScreenCaptureKit + Virtual Camera approach as a more secure alternative to the overlay method. Includes architecture, code examples, lessons learned from overlay implementation, and suggested contribution order. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent d6862df commit 4bd5bfc

2 files changed

Lines changed: 257 additions & 0 deletions

Lines changed: 136 additions & 0 deletions
@@ -0,0 +1,136 @@
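# System-Wide Masking: Advanced Secure Approach (Community Contribution Direction)

## Current State

System-wide masking currently uses an **Accessibility overlay** (`AXObserver` + `NSPanel`), a visual-only approach:
- ✅ The key is not visible on screen
- ❌ `⌘C` copy still retrieves the original key
- ❌ Screen recording software may capture the original content beneath the overlay (depending on window level order)

## More Secure Direction: ScreenCaptureKit + Virtual Camera

The most secure approach intercepts at the **frame output layer**, so the original key **never appears in any captured or shared output**.
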
### Architecture

```
[macOS Screen]
↓ ScreenCaptureKit
[Captured frame (CGImage/IOSurface)]
↓
[OCR detects API key positions] ← can reuse existing pattern matching
↓
[Draw masking blocks on the frame]
↓
[Output to Virtual Camera / screen share]
```
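
### Use Cases

| Scenario | Overlay approach | ScreenCaptureKit approach |
|----------|------------------|---------------------------|
| OBS livestream | ⚠️ Overlay may fall outside the recorded area | ✅ Output is already masked |
| Google Meet screen share | ⚠️ Depends on window level | ✅ Virtual camera output is already masked |
| Screen recording | ⚠️ Same as above | ✅ The recording contains only the masked version |
| Copy/Paste | ❌ Original key can still be copied | ❌ Also cannot prevent it (not at the frame layer) |
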
### Technical Components

#### 1. ScreenCaptureKit Capture

```swift
import ScreenCaptureKit

// Get the shareable displays
let content = try await SCShareableContent.current
guard let display = content.displays.first else { return }

// Build a filter that captures the entire display
let filter = SCContentFilter(display: display, excludingWindows: [])

// Configure the stream. SCDisplay reports its size in points, so multiply
// by the display scale factor for full-resolution frames on Retina displays.
let config = SCStreamConfiguration()
config.width = display.width
config.height = display.height
config.pixelFormat = kCVPixelFormatType_32BGRA

let stream = SCStream(filter: filter, configuration: config, delegate: self)
try stream.addStreamOutput(self, type: .screen, sampleHandlerQueue: .global())
try await stream.startCapture()
```
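One detail worth checking in the configuration above: `SCDisplay` reports its size in points, while `SCStreamConfiguration` expects pixels. The following is a minimal sketch of that conversion, assuming the scale factor is obtained elsewhere (for example from `NSScreen.backingScaleFactor`). The `pixelSize` helper is hypothetical and not part of the project.

```swift
import Foundation

/// Hypothetical helper: converts a display size in points to a capture size in pixels.
func pixelSize(pointWidth: Int, pointHeight: Int, scale: Double) -> (width: Int, height: Int) {
    let w = Int((Double(pointWidth) * scale).rounded())
    let h = Int((Double(pointHeight) * scale).rounded())
    return (w, h)
}

// Example values only: a 1512x982 point display at 2x scale.
let size = pixelSize(pointWidth: 1512, pointHeight: 982, scale: 2.0)
print(size.width, size.height)
```

The result would then be assigned to `config.width` and `config.height` instead of the raw point values.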
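
#### 2. Detecting Key Positions with OCR

Two options:

**Option A: Vision framework (Apple's native OCR)**
```swift
import Vision

func detectKeys(in image: CGImage) -> [(String, CGRect)] {
    var detected: [(String, CGRect)] = []
    let request = VNRecognizeTextRequest { request, error in
        guard let results = request.results as? [VNRecognizedTextObservation] else { return }
        for observation in results {
            guard let text = observation.topCandidates(1).first?.string else { continue }
            // Reuse MaskingCoordinator's pattern matching.
            // Assumption: the result is a collection of matches; adapt to the real return type.
            let matches = maskingCoordinator.shouldMask(text: text)
            guard !matches.isEmpty else { continue }
            // boundingBox is normalized (0...1, bottom-left origin); scale it to image pixels.
            let rect = VNImageRectForNormalizedRect(observation.boundingBox, image.width, image.height)
            detected.append((text, rect))
        }
    }
    request.recognitionLevel = .fast // real-time processing needs fast mode
    let handler = VNImageRequestHandler(cgImage: image)
    // perform is synchronous, so the completion handler has run by the time it returns.
    try? handler.perform([request])
    return detected
}
```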
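
**Option B: mask at known coordinates (with the AX API)**

No OCR needed: reuse the `AXBoundsForRange` coordinates from the existing `SystemMaskingService` and draw masking blocks directly on the captured frame. This is much faster than OCR.

```swift
// Get the coordinates of already-detected keys from SystemMaskingService
let keyRects = systemMaskingService.getActiveOverlayRects()

// Draw masks on the frame. The context wraps the captured frame's pixel buffer;
// rects in screen points must first be converted to the frame's pixel coordinates.
let context = CGContext(data: ..., width: ..., height: ...)
for rect in keyRects {
    context.setFillColor(CGColor.white)
    context.fill(rect)
    // Optional: draw the masked text
}
```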
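Before the rects can be filled on a captured frame, they need to be expressed in that frame's pixel space. Below is a rough sketch of the mapping, assuming the rects are in screen points and the target is a bitmap `CGContext` with its default bottom-left origin. The `frameRect` helper and its parameters are hypothetical; whether a Y flip is needed depends on which coordinate space `getActiveOverlayRects()` actually returns.

```swift
import Foundation

/// Hypothetical helper: maps a screen rect in points to pixel coordinates
/// of a bitmap context whose origin is at the bottom-left.
func frameRect(for rect: CGRect, scale: CGFloat, frameHeight: CGFloat, topLeftOrigin: Bool) -> CGRect {
    let x = rect.origin.x * scale
    let w = rect.size.width * scale
    let h = rect.size.height * scale
    // Flip Y only when the source rect is measured from the top edge (as AX rects are).
    let y: CGFloat
    if topLeftOrigin {
        y = frameHeight - (rect.origin.y + rect.size.height) * scale
    } else {
        y = rect.origin.y * scale
    }
    return CGRect(x: x, y: y, width: w, height: h)
}

// Example values only: an AX-style rect on a 2x display with a 1964 px tall frame.
let axRect = CGRect(x: 100, y: 50, width: 200, height: 20)
let mapped = frameRect(for: axRect, scale: 2, frameHeight: 1964, topLeftOrigin: true)
print(mapped)
```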
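
#### 3. Virtual Camera Output

Use a [CoreMediaIO DAL Plugin](https://developer.apple.com/documentation/coremediaio) or [OBS Virtual Camera](https://obsproject.com/) to output the processed frames as a virtual camera. Note that Apple deprecated DAL plug-ins in macOS 12.3 in favor of Camera Extensions (also part of CoreMediaIO), so new work should check which mechanism the target macOS versions support.

Third-party options:
- [mac-virtual-camera](https://github.com/pjb/mac-virtual-camera): Swift implementation
- OBS Studio's Virtual Camera API
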
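### Pitfalls Already Hit (To Save You Time)

Lessons learned while building the overlay approach:

1. **AXBoundsForRange is precise**: no need to estimate character widths; the AX API returns exact screen coordinates for multi-line text
2. **Coordinate conversion**: AX uses a top-left origin, while AppKit (and a default `CGContext`) uses a bottom-left origin. Conversion formula: `appKitY = primaryScreenHeight - axY - height`
3. **Multiple displays**: use the primary screen height as the conversion reference
4. **Performance baseline**: pattern matching takes only 0.1-0.3ms; AX query + overlay update takes 2-6ms. The bottleneck of the ScreenCaptureKit approach would be OCR (if used)
5. **A 30ms debounce is sufficient**: the human flicker perception threshold is about 50ms, so a 30ms debounce is fast enough
6. **Scan immediately on app switch**: do not wait for the debounce, or the key flashes for 500ms+ when switching back
7. **Overlay IDs must be unique**: multiple occurrences of the same keyId need different overlay IDs (use keyId + match index)
8. **Reuse NSHostingView**: creating a new `NSHostingView` on every update causes views to accumulate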
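The conversion formula in item 2 can be written as a small pure function. This is only a sketch of the formula as stated; the project's actual `convertAXToAppKit` may have a different signature, and the `axToAppKit` name is an assumption.

```swift
import Foundation

/// Sketch of `appKitY = primaryScreenHeight - axY - height`.
/// The X coordinate and the size are unchanged by the conversion.
func axToAppKit(_ axRect: CGRect, primaryScreenHeight: CGFloat) -> CGRect {
    let appKitY = primaryScreenHeight - axRect.origin.y - axRect.size.height
    return CGRect(x: axRect.origin.x,
                  y: appKitY,
                  width: axRect.size.width,
                  height: axRect.size.height)
}

// Example values only: a text rect 100 pt from the top of a 1080 pt tall primary screen.
let converted = axToAppKit(CGRect(x: 40, y: 100, width: 300, height: 18),
                           primaryScreenHeight: 1080)
print(converted.origin.y)
```

Because the primary screen height is always the reference (item 3), the same function also applies to rects on secondary displays, where `axY` may be negative or exceed the primary height.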
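
### Suggested Implementation Order

1. **OBS plugin first** (easiest): write an OBS Source Plugin that gets key coordinates from DemoSafe Core (via IPC) and draws masking blocks on the OBS output
2. **Virtual Camera next** (moderate): ScreenCaptureKit capture + masks drawn from AX coordinates + DAL Plugin output
3. **OCR last** (most complex): if AX API coordinates are unavailable (some apps do not support them), use Vision framework OCR as a fallback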

### Related Files

| File | Reusable content |
|------|------------------|
| `Services/Accessibility/SystemMaskingService.swift` | AXObserver setup, focused element scanning, AXBoundsForRange coordinate retrieval |
| `Services/Masking/MaskingCoordinator.swift` | Pattern matching engine (`shouldMask()`) |
| `Views/Overlay/SystemOverlayController.swift` | Coordinate conversion (`convertAXToAppKit`) |
Lines changed: 121 additions & 0 deletions
@@ -0,0 +1,121 @@
# System-Wide Masking: Advanced Secure Approach (Community Contribution Guide)

## Current State

The current system-wide masking uses an **Accessibility overlay** (`AXObserver` + `NSPanel`), a visual-only approach:
- ✅ Key is not visible on screen
- ❌ `⌘C` copy still captures the original key
- ❌ Screen recording software may capture the original beneath the overlay (depending on window level order)

## More Secure Direction: ScreenCaptureKit + Virtual Camera

The most secure approach intercepts at the **frame output layer**, ensuring the original key **never appears in any captured or shared output**.

### Architecture

```
[macOS Screen]
↓ ScreenCaptureKit
[Capture frame (CGImage/IOSurface)]
↓
[Detect API key positions] ← Reuse existing pattern matching
↓
[Draw masking blocks on frame]
↓
[Output to Virtual Camera / Screen Share]
```
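The stages in the diagram can be modeled as a small pipeline of replaceable parts. The sketch below only illustrates that separation; `Frame`, `KeyDetector`, `FrameSink`, and `MaskingPipeline` are hypothetical names, and a real frame would wrap a pixel buffer rather than a list of rects.

```swift
import Foundation

// Hypothetical stand-in for a captured frame; it only records which regions were masked.
struct Frame {
    var maskedRegions: [CGRect] = []
}

// Detection stage: AX coordinates or OCR could both sit behind this protocol.
protocol KeyDetector {
    func keyRects(in frame: Frame) -> [CGRect]
}

// Output stage: a virtual camera or screen share target.
protocol FrameSink {
    func send(_ frame: Frame)
}

struct MaskingPipeline {
    let detector: KeyDetector
    let sink: FrameSink

    // Called once per captured frame: detect, mask, then forward.
    func process(_ frame: Frame) {
        var output = frame
        output.maskedRegions = detector.keyRects(in: frame)
        sink.send(output)
    }
}

// Test doubles used to exercise the pipeline without any capture hardware.
struct FixedDetector: KeyDetector {
    let rects: [CGRect]
    func keyRects(in frame: Frame) -> [CGRect] { rects }
}

final class CollectingSink: FrameSink {
    private(set) var frames: [Frame] = []
    func send(_ frame: Frame) { frames.append(frame) }
}

let sink = CollectingSink()
let pipeline = MaskingPipeline(
    detector: FixedDetector(rects: [CGRect(x: 10, y: 10, width: 120, height: 16)]),
    sink: sink
)
pipeline.process(Frame())
print(sink.frames.count)
```

Keeping detection behind a protocol is one way to let the AX-based and OCR-based detectors be swapped without touching the capture or output code.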

### Use Case Comparison

| Scenario | Overlay Approach | ScreenCaptureKit Approach |
|----------|-----------------|--------------------------|
| OBS livestream | ⚠️ Overlay may not be in capture | ✅ Output already masked |
| Google Meet screen share | ⚠️ Depends on window level | ✅ Virtual camera output masked |
| Screen recording | ⚠️ Same issue | ✅ Recording captures masked version |
| Copy/Paste | ❌ Original key still copyable | ❌ Same (not in frame layer) |

### Technical Components

#### 1. ScreenCaptureKit Capture

```swift
import ScreenCaptureKit

// Get the shareable displays
let content = try await SCShareableContent.current
guard let display = content.displays.first else { return }
// Capture the entire display
let filter = SCContentFilter(display: display, excludingWindows: [])

// SCDisplay reports its size in points; multiply by the display scale
// factor for full-resolution frames on Retina displays.
let config = SCStreamConfiguration()
config.width = display.width
config.height = display.height
config.pixelFormat = kCVPixelFormatType_32BGRA

let stream = SCStream(filter: filter, configuration: config, delegate: self)
try stream.addStreamOutput(self, type: .screen, sampleHandlerQueue: .global())
try await stream.startCapture()
```
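
#### 2. Key Position Detection

**Option A: Vision Framework OCR**
```swift
import Vision

func detectKeys(in image: CGImage) -> [(String, CGRect)] {
    var detected: [(String, CGRect)] = []
    let request = VNRecognizeTextRequest { request, error in
        guard let results = request.results as? [VNRecognizedTextObservation] else { return }
        for observation in results {
            guard let text = observation.topCandidates(1).first?.string else { continue }
            // Assumption: shouldMask returns a collection of matches; adapt to the real return type.
            let matches = maskingCoordinator.shouldMask(text: text)
            guard !matches.isEmpty else { continue }
            // boundingBox is normalized (0...1, bottom-left origin); scale it to image pixels.
            let rect = VNImageRectForNormalizedRect(observation.boundingBox, image.width, image.height)
            detected.append((text, rect))
        }
    }
    request.recognitionLevel = .fast // Real-time needs fast mode
    let handler = VNImageRequestHandler(cgImage: image)
    // perform is synchronous, so the completion handler has run by the time it returns.
    try? handler.perform([request])
    return detected
}
```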
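Vision reports `boundingBox` in normalized coordinates with the origin at the bottom-left, so a vertical flip is needed when the mask is drawn in a top-left pixel space. The following sketch shows that arithmetic without depending on Vision itself; the `pixelRect` helper is hypothetical.

```swift
import Foundation

/// Converts a Vision-style normalized rect (0...1, bottom-left origin)
/// to pixel coordinates with a top-left origin.
func pixelRect(fromNormalized box: CGRect, imageWidth: Int, imageHeight: Int) -> CGRect {
    let w = box.size.width * CGFloat(imageWidth)
    let h = box.size.height * CGFloat(imageHeight)
    let x = box.origin.x * CGFloat(imageWidth)
    // Distance from the top edge = 1 - (bottom edge + height) in normalized units.
    let y = (1 - box.origin.y - box.size.height) * CGFloat(imageHeight)
    return CGRect(x: x, y: y, width: w, height: h)
}

// Example values only: a box covering the middle half of the width in a 2000x1000 image.
let rect = pixelRect(fromNormalized: CGRect(x: 0.25, y: 0.5, width: 0.5, height: 0.25),
                     imageWidth: 2000, imageHeight: 1000)
print(rect)
```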

**Option B: AX API Coordinates (Recommended, no OCR needed)**

Reuse the `AXBoundsForRange` coordinates from the existing `SystemMaskingService` to draw masking blocks directly on captured frames. Much faster than OCR.

```swift
// Coordinates of keys already detected by SystemMaskingService
let keyRects = systemMaskingService.getActiveOverlayRects()
// The context wraps the captured frame's pixel buffer; rects in screen points
// must first be converted to the frame's pixel coordinates.
let context = CGContext(data: ..., width: ..., height: ...)
for rect in keyRects {
    context.setFillColor(CGColor.white)
    context.fill(rect)
}
```
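
#### 3. Virtual Camera Output

Use a [CoreMediaIO DAL Plugin](https://developer.apple.com/documentation/coremediaio) or [OBS Virtual Camera](https://obsproject.com/) to expose the processed frames as a virtual camera. Note that Apple deprecated DAL plug-ins in macOS 12.3 in favor of Camera Extensions (also part of CoreMediaIO), so new work should check which mechanism the target macOS versions support.

Third-party options:
- [mac-virtual-camera](https://github.com/pjb/mac-virtual-camera): Swift implementation
- OBS Studio Virtual Camera API
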
### Lessons Learned (From Overlay Implementation)

1. **AXBoundsForRange is precise**: it returns exact screen coordinates for multi-line text, so no character-width estimation is needed
2. **Coordinate conversion**: AX uses a top-left origin, while AppKit (and a default `CGContext`) uses a bottom-left origin: `appKitY = primaryScreenHeight - axY - height`
3. **Multi-monitor**: use the primary screen height as the conversion reference
4. **Performance baseline**: pattern matching takes 0.1-0.3ms, AX query + overlay update takes 2-6ms. The ScreenCaptureKit bottleneck would be OCR (if used)
5. **30ms debounce is sufficient**: the human flicker perception threshold is roughly 50ms
6. **Immediate scan on app switch**: don't wait for the debounce, or the key flashes for 500ms+ when switching back
7. **Unique overlay IDs**: multiple occurrences of the same keyId need distinct overlay IDs (keyId + match index)
8. **Reuse NSHostingView**: creating a new view on every update causes views to accumulate
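Item 7 can be illustrated with a short sketch that derives distinct overlay IDs when the same key appears more than once. The `overlayIDs` function and the `keyId#index` format are assumptions for illustration, and the index here counts occurrences per key, which may differ from how the project numbers matches.

```swift
import Foundation

/// Hypothetical ID scheme: keyId plus the occurrence index of that key within one scan.
func overlayIDs(forMatchedKeyIDs keyIDs: [String]) -> [String] {
    var seen: [String: Int] = [:]
    return keyIDs.map { keyID in
        let index = seen[keyID, default: 0]
        seen[keyID] = index + 1
        return "\(keyID)#\(index)"
    }
}

// Example values only: the "openai" key appears twice in the scanned text.
let ids = overlayIDs(forMatchedKeyIDs: ["openai", "aws", "openai"])
print(ids)
```

With IDs like these, a second occurrence of a key gets its own overlay instead of replacing the first one.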

### Suggested Implementation Order

1. **OBS Plugin first** (easiest): write an OBS Source Plugin that gets key coordinates from DemoSafe Core via IPC and draws masking on the OBS scene
2. **Virtual Camera next** (moderate): ScreenCaptureKit capture + AX coordinates + DAL Plugin output
3. **OCR last** (complex): Vision framework OCR as a fallback when AX API coordinates are unavailable
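For step 1, the plugin needs some message format for receiving key coordinates over IPC. The guide does not specify one, so the following is purely a sketch of what a JSON payload could look like; `MaskRect`, `MaskUpdate`, and the `displayID` field are hypothetical and not the DemoSafe schema.

```swift
import Foundation

// Hypothetical wire format; the real DemoSafe IPC schema is not specified in this guide.
struct MaskRect: Codable, Equatable {
    let x: Double
    let y: Double
    let width: Double
    let height: Double
}

struct MaskUpdate: Codable, Equatable {
    let displayID: UInt32
    let rects: [MaskRect]
}

// Encode on the sending side, decode on the plugin side.
let update = MaskUpdate(displayID: 1,
                        rects: [MaskRect(x: 200, y: 1824, width: 400, height: 40)])
let data = try JSONEncoder().encode(update)
let decoded = try JSONDecoder().decode(MaskUpdate.self, from: data)
print(decoded.rects.count)
```

Sending the full rect list on every update keeps the receiver stateless, which may be simpler than sending incremental changes.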

### Reusable Files

| File | Reusable Content |
|------|-----------------|
| `Services/Accessibility/SystemMaskingService.swift` | AXObserver setup, focused element scanning, AXBoundsForRange |
| `Services/Masking/MaskingCoordinator.swift` | Pattern matching engine (`shouldMask()`) |
| `Views/Overlay/SystemOverlayController.swift` | Coordinate conversion (`convertAXToAppKit`) |
