CLAUDE.md
Lines changed: 116 additions & 5 deletions
@@ -36,6 +36,7 @@ pybreeze/
36
36
├── logging/ # pybreeze_logger
37
37
├── file_process/ # File/directory utilities
38
38
├── json_format/ # JSON processing
39
+
├── network/ # URL validation (SSRF prevention)
39
40
└── manager/package_manager/ # PackageManager class
40
41
```
41
42
@@ -95,25 +96,135 @@ All code must follow secure-by-default principles. Review every change against t
95
96
- Never use `subprocess.Popen(..., shell=True)` — always pass argument lists
96
97
- Never log or display secrets, tokens, passwords, or API keys
97
98
- Use `json.loads()` / `json.dumps()` for serialisation — never pickle
99
+
- Never use `yaml.load()` — always use `yaml.safe_load()`
98
100
- Validate all user input at system boundaries (file dialogs, URL inputs, network data)
101
+
- Handle exceptions without leaking stack traces, file paths, or internal state to the user
99
102
100
103
### Network requests (SSRF prevention)
101
-
- All outbound HTTP requests must go through `diagram_net_utils.safe_download_image()` or equivalent guards
102
-
- Only `http://` and `https://` schemes are allowed — block `file://`, `ftp://`, `data:`, `gopher://`
103
-
- Resolved IP addresses must be checked against private/loopback/link-local ranges (`ipaddress.is_private`, `is_loopback`, `is_link_local`, `is_reserved`)
- Never pass user-supplied URLs directly to `urlopen()` without validation
104
+
- **All** outbound HTTP requests to user-specified URLs must validate the target before connecting:
105
+
1. Only `http://` and `https://` schemes — block `file://`, `ftp://`, `data:`, `gopher://`
106
+
2. Resolve the hostname and check IPs against private/loopback/link-local/reserved ranges (the `ipaddress` address attributes `is_private`, `is_loopback`, `is_link_local`, `is_reserved`)
107
+
3. Enforce connection timeouts (default: 15 s for downloads, 30 s for API calls)
108
+
4. Enforce response size limits where applicable (default: 20 MB for binary downloads)
109
+
- Reference implementation: `diagram_net_utils._validate_url()` and `safe_download_image()`
110
+
- For API-style requests (`requests.get/post`): create or reuse a URL validation helper that performs scheme + IP checks, then call it before every `requests.*` call
111
+
- Disable automatic redirect following (`allow_redirects=False`) or re-validate the redirect target to prevent redirect-based SSRF
112
+
- Never pass user-supplied URLs directly to `urlopen()` or `requests.*` without validation
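The scheme and IP checks from steps 1 and 2 can be sketched with the standard library alone. This is a minimal illustration, not the project's `_validate_url()`: the names `validate_url`, `is_forbidden_ip`, and `UnsafeURLError` are hypothetical.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_SCHEMES = frozenset({"http", "https"})


class UnsafeURLError(ValueError):
    """Hypothetical error type; the project may use its own exception class."""


def is_forbidden_ip(raw_ip: str) -> bool:
    """Return True for private, loopback, link-local, or reserved addresses."""
    # Drop an IPv6 zone suffix such as "%eth0" before parsing.
    ip = ipaddress.ip_address(raw_ip.split("%", 1)[0])
    return ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved


def validate_url(url: str) -> str:
    """Return the URL unchanged if it passes the scheme and IP checks."""
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise UnsafeURLError(f"scheme not allowed: {parts.scheme!r}")
    if not parts.hostname:
        raise UnsafeURLError("URL has no hostname")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        infos = socket.getaddrinfo(parts.hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as err:
        raise UnsafeURLError("hostname could not be resolved") from err
    # Every resolved address must be public; one bad record rejects the URL.
    for info in infos:
        if is_forbidden_ip(info[4][0]):
            raise UnsafeURLError("URL resolves to a non-public address")
    return url
```

A caller would run `validate_url(url)` before each outbound request and again on every redirect target. Validating and then connecting in a separate step still leaves a DNS rebinding window, so this check is a baseline rather than a complete defence.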
113
+
114
+
### Network requests (TLS / SSH)
115
+
- All HTTPS requests must use default TLS verification — never set `verify=False`
116
+
- SSH connections: never use `paramiko.AutoAddPolicy()` or `paramiko.WarningPolicy()`. Both accept unknown host keys (`WarningPolicy` only logs a warning first) and are vulnerable to MITM. Use `InteractiveHostKeyPolicy` from `pybreeze.pybreeze_ui.connect_gui.ssh.ssh_host_key_policy` (via `apply_host_key_policy(client, parent_widget)`), which prompts the user with the SHA256 fingerprint on first connection and persists confirmed keys to `~/.pybreeze/ssh_known_hosts`
117
+
118
+
### Subprocess execution
119
+
- Always pass argument lists to `subprocess.Popen` / `subprocess.run` — never `shell=True`
120
+
- Explicitly set `shell=False` for clarity in new code
121
+
- Never interpolate user input into command strings — pass as separate list elements
122
+
- Set `timeout` on all `subprocess.run()` calls to prevent hangs
123
+
- The IDE intentionally runs user-authored scripts; this is trusted local execution, not arbitrary remote code. Subprocess hardening protects against accidental shell injection, not against malicious local files
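The argument-list, `shell=False`, and `timeout` rules can be combined in one small wrapper. This is a sketch under assumptions: `run_python` and the 30-second timeout are illustrative, not names or values taken from the codebase.

```python
import subprocess
import sys

RUN_TIMEOUT_SECONDS = 30  # assumed value; choose one that suits the task


def run_python(args: list[str]) -> subprocess.CompletedProcess:
    """Run the current interpreter with the given arguments, without a shell."""
    # Each argument is its own list element, so shell metacharacters in
    # user input are passed through literally and never interpreted.
    command = [sys.executable, *args]
    return subprocess.run(
        command,
        shell=False,
        capture_output=True,
        text=True,
        timeout=RUN_TIMEOUT_SECONDS,
        check=False,
    )
```

For example, `run_python(["-c", "import sys; print(sys.argv[1])", "a; echo injected"])` prints the string `a; echo injected` unchanged, because no shell ever parses the `;`.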
124
+
125
+
### JupyterLab integration
126
+
- The embedded JupyterLab server binds to `localhost` only and is intended for local development
127
+
- `--ServerApp.token=` and `--ServerApp.password=` are deliberately empty to enable seamless embedding; this is safe only because the server is localhost-only
128
+
- Do not change `--ServerApp.ip` to `0.0.0.0` or any externally-reachable address
129
+
- `--ServerApp.disable_check_xsrf=True` is required for the embedded QWebEngineView; do not expose the server externally with XSRF disabled
106
130
107
131
### File I/O
108
132
- File read/write paths from user dialogs (`QFileDialog`) are trusted (user-initiated)
109
133
- File paths loaded from saved data (`.diagram.json`) must be validated before access:
110
134
- Local paths: check `path.is_file()` and verify extension is in an allowlist
111
135
- URLs: pass through the same SSRF validation as user-entered URLs
112
136
- Never construct file paths by string concatenation with user input — use `pathlib.Path` with validation
137
+
- When writing to data directories (`.pybreeze/`), create the directory with `os.makedirs(path, exist_ok=True)` and always open files with `encoding="utf-8"`
138
+
- Never follow symlinks from untrusted sources — use `Path.resolve(strict=True)` and verify the resolved path is still within expected boundaries
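The boundary check for saved paths might look like the sketch below. It assumes Python 3.9+ (for `Path.is_relative_to`). The helper name `resolve_inside` and the extension allowlist are hypothetical.

```python
from pathlib import Path

# Assumed allowlist for illustration; the real one depends on the feature.
ALLOWED_EXTENSIONS = frozenset({".png", ".jpg", ".jpeg", ".svg"})


def resolve_inside(base_dir: Path, candidate: str) -> Path:
    """Resolve candidate under base_dir and reject anything that escapes it."""
    base = base_dir.resolve(strict=True)
    # strict=True raises FileNotFoundError when the target does not exist;
    # resolve() also follows symlinks, so the check below sees the real path.
    resolved = (base / candidate).resolve(strict=True)
    if not resolved.is_relative_to(base):
        raise ValueError("path escapes the allowed directory")
    if not resolved.is_file():
        raise ValueError("path is not a regular file")
    if resolved.suffix.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError("file extension is not in the allowlist")
    return resolved
```

An absolute `candidate` replaces `base` when the two are joined, so it is rejected by the same `is_relative_to` check unless it already points inside the base directory.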
113
139
114
140
### Qt / UI
115
141
- `QGraphicsTextItem` with `TextEditorInteraction` must not be enabled by default; use a double-click-to-edit pattern to prevent unintended text selection issues in themed environments
116
142
- Plugin loading (`jeditor_plugins/`) uses auto-discovery — only load `.py` files, skip files starting with `_` or `.`
143
+
- `QWebEngineView.setUrl()` must only load trusted URLs (localhost or user-confirmed external URLs); never load untrusted HTML or URLs without user consent
144
+
- Never call `QWebEngineView.setHtml()` with unsanitised content — this enables XSS within the embedded browser
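The plugin auto-discovery filter described above (only `.py` files, skipping names that start with `_` or `.`) can be expressed in a few lines. This is an illustrative sketch only. `discover_plugin_files` is a hypothetical name, and the actual loader may differ.

```python
from pathlib import Path


def discover_plugin_files(plugin_dir: Path) -> list[Path]:
    """Return loadable plugin files in a stable, sorted order."""
    return sorted(
        path
        for path in plugin_dir.iterdir()
        # Regular .py files only; private and hidden files are skipped.
        if path.is_file()
        and path.suffix == ".py"
        and not path.name.startswith(("_", "."))
    )
```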
145
+
146
+
### Secrets and credentials
147
+
- SSH passwords and private key passphrases are held in memory only during the session — never persist to disk or logs
148
+
- Password fields must use `QLineEdit.EchoMode.Password`
149
+
- API endpoint URLs may contain embedded tokens — treat URL strings with the same care as credentials (do not log full URLs)
150
+
- Environment variables (`PYBREEZE_LOG_MAX_BYTES`, etc.) must never contain secrets; use dedicated secure stores for credentials
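One way to follow the rule about not logging full URLs is to strip the parts most likely to carry credentials before logging. The helper below is a hedged sketch. `redact_url` is a hypothetical name, and it assumes tokens live in the userinfo, query string, or fragment rather than in the path.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url(url: str) -> str:
    """Return the URL without userinfo, query string, or fragment."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if ":" in host:
        # IPv6 literals need their brackets restored.
        host = f"[{host}]"
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, "", ""))
```

If an API embeds tokens in the path itself, the path would need redacting as well; this sketch leaves it intact.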
151
+
152
+
### Dependency security
153
+
- Pin dependencies to exact versions in `requirements.txt` / `dev_requirements.txt`
154
+
- Do not add new dependencies without reviewing their security posture (maintained? known CVEs?)
155
+
- Avoid transitive dependency bloat — prefer stdlib solutions when the alternative is a single-function dependency
156
+
157
+
## Code quality (SonarQube / Codacy compliance)
158
+
159
+
All code must satisfy common static-analysis rules enforced by SonarQube and Codacy. Review each change against the checklist below.
160
+
161
+
### Complexity & size
162
+
- Cyclomatic complexity per function: ≤ 15 (hard cap 20). Break large branches into helpers
163
+
- Cognitive complexity per function: ≤ 15. Flatten nested `if`/`for`/`try` chains with early returns or guard clauses
164
+
- Function length: ≤ 75 lines of code (excluding docstring / blank lines). Extract helpers past that
165
+
- Parameter count: ≤ 7 per function/method. Use a dataclass or typed dict when more are needed
166
+
- Nesting depth: ≤ 4 levels of `if`/`for`/`while`/`try`. Refactor with early returns instead of pyramids
167
+
- File length: ≤ 1000 lines — split modules past that
168
+
- Class `__init__`: keep attribute count reasonable; if a class has > 15 instance attributes, split responsibilities
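As a small illustration of the guard-clause advice, the function below replaces a nested `if` pyramid with early returns. The function and its messages are invented for the example.

```python
def describe_port(value):
    """Classify a port value using guard clauses instead of nested ifs."""
    if value is None:
        return "missing"
    if not isinstance(value, int):
        return "not an integer"
    if not 1 <= value <= 65535:
        return "out of range"
    return "ok"
```

Each check exits immediately, so the nesting depth stays at one level no matter how many conditions are added.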
169
+
170
+
### Exception handling
171
+
- Never use bare `except:` — always specify exception types
172
+
- Avoid catching `Exception` or `BaseException` unless immediately re-raising or logging and re-raising with context
173
+
- Never `pass` silently inside `except` — log the error via `pybreeze_logger` (at minimum `.debug()`) with context
174
+
- Do not `return` / `break` / `continue` inside a `finally` block — it swallows exceptions
175
+
- Custom exceptions must inherit from `ITEException`; never `raise Exception(...)` directly
176
+
- Use `raise ... from err` (or `raise ... from None`) when re-raising to preserve / suppress the chain explicitly
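The rules above can be combined in one short pattern: catch a specific exception, log it, and re-raise a project exception with the chain preserved. In this sketch `ITEException` is redefined locally as a stand-in for the project's base class, the logger is a stand-in for `pybreeze_logger`, and `ConfigLoadError` and `load_config` are hypothetical.

```python
import json
import logging

logger = logging.getLogger("pybreeze_example")  # stand-in for pybreeze_logger


class ITEException(Exception):
    """Stand-in for the project's base exception class."""


class ConfigLoadError(ITEException):
    """Hypothetical custom exception for this example."""


def load_config(text: str) -> dict:
    """Parse a JSON object, converting parse failures to ConfigLoadError."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        # Log with context, then re-raise with the original error chained.
        logger.debug("config is not valid JSON: %s", err)
        raise ConfigLoadError("configuration is not valid JSON") from err
    if not isinstance(data, dict):
        raise ConfigLoadError("configuration must be a JSON object")
    return data
```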
177
+
178
+
### Pythonic correctness
179
+
- Compare with `None` using `is` / `is not`, never `==` / `!=`
180
+
- Type checks use `isinstance(obj, T)`, never `type(obj) == T`
181
+
- Never use mutable default arguments (`def f(x=[])`) — use `None` and initialise inside
182
+
- Prefer f-strings over `%` formatting or `str.format()`
183
+
- Use context managers (`with open(...) as f:`) for every file / socket / lock — never leave resources to GC
184
+
- Use `enumerate()` instead of `range(len(...))` when the index is needed alongside the item
185
+
- Use `dict.get(key, default)` instead of `key in dict and dict[key]` patterns
186
+
- Use set / dict comprehensions when clearer than manual loops; avoid comprehensions with side effects
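Two of these rules are easy to get wrong in practice, so here is a brief sketch of the `None`-default idiom and of `enumerate()`. Both function names are made up for illustration.

```python
def append_item(item, bucket=None):
    """Append to bucket, creating a fresh list when none is supplied."""
    # A literal [] default would be shared across calls; None avoids that.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


def numbered(lines):
    """Prefix each line with its 1-based position."""
    return [f"{index}: {line}" for index, line in enumerate(lines, start=1)]
```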
187
+
188
+
### Naming & style (PEP 8)
189
+
- `snake_case` for functions, methods, variables, module names
190
+
- `PascalCase` for classes
191
+
- `UPPER_SNAKE_CASE` for module-level constants
192
+
- `_leading_underscore` for protected / internal members; never use `__dunder__` for custom attributes
193
+
- No single-letter names except loop indices (`i`, `j`) or conventional math (`x`, `y`)
194
+
- Do not shadow built-ins (`id`, `type`, `list`, `dict`, `input`, `file`, `open`, etc.) — rename the local variable
195
+
196
+
### Duplication & dead code
197
+
- String literal used 3+ times in the same module → extract a module-level constant
198
+
- Identical 6+ line blocks in 2+ places → extract a helper function
199
+
- Remove unused imports, unused parameters, unused local variables, unreachable code after `return` / `raise`
200
+
- No commented-out code blocks — delete them (git history is the archive)
201
+
- No `TODO` / `FIXME` / `XXX` without an accompanying issue reference (`# TODO(#123): ...`)
202
+
203
+
### Logging, printing, assertions
204
+
- Never use `print()` for diagnostics in library / runtime code — use `pybreeze_logger`
205
+
- Use lazy logging (`logger.debug("x=%s", x)`) — avoid eager f-string formatting inside log calls on hot paths
206
+
- Never use `assert` for runtime validation (Python strips assertions with `-O`). Use explicit `if … raise …` instead; `assert` is only for test code
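A minimal sketch of explicit validation paired with lazy logging follows. The logger here is a stand-in for `pybreeze_logger`, and `set_font_size` is a hypothetical function.

```python
import logging

logger = logging.getLogger("pybreeze_example")  # stand-in for pybreeze_logger


def set_font_size(size: int) -> int:
    """Validate and return a font size."""
    # An explicit check still runs under `python -O`; an assert would not.
    if size <= 0:
        raise ValueError(f"font size must be positive, got {size}")
    # %s arguments are only formatted if DEBUG logging is enabled.
    logger.debug("font size set to %s", size)
    return size
```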
207
+
208
+
### Hardcoded values & secrets
209
+
- No hardcoded passwords, tokens, API keys, or secrets — use env vars or a config file excluded from VCS
210
+
- No hardcoded IP addresses or hostnames outside of `localhost` / documented loopback — use config
211
+
- Magic numbers (except 0, 1, -1) should be named constants when repeated or non-obvious
212
+
213
+
### Boolean & return hygiene
214
+
- Replace `if cond: return True else: return False` with `return bool(cond)` or `return cond`
215
+
- Replace `if x == True` / `if x == False` with `if x` / `if not x`
216
+
- A function should have a consistent return type — never mix `return value` and bare `return` (returns `None`) on meaningful paths unless explicitly documented
217
+
- Do not `return value` inside a generator function: the value is attached to `StopIteration` and is silently discarded by `for` loops and `list()`, so use a bare `return` to stop early
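The generator rule is easier to see with a concrete case. In Python 3, `return value` inside a generator is legal, but the value travels on `StopIteration` and ordinary iteration never exposes it. The function name below is invented for the example.

```python
def countdown(start):
    """Yield start, start - 1, ..., 1, then (misleadingly) return a value."""
    while start > 0:
        yield start
        start -= 1
    return "done"  # only reachable through StopIteration.value


# Ordinary iteration drops the returned value entirely.
values = list(countdown(3))

# The value is only visible when StopIteration is caught by hand.
gen = countdown(1)
first = next(gen)
try:
    next(gen)
    returned = None
except StopIteration as stop:
    returned = stop.value
```

Here `values` contains only the yielded numbers, while `returned` holds the string that callers using a plain `for` loop would never see.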
218
+
219
+
### Imports
220
+
- One import per line for `import` statements; grouped `from x import a, b` is fine
- No wildcard imports (`from x import *`) outside of `__init__.py` re-exports
223
+
- No relative imports beyond one level (`from ..pkg import x` OK, `from ...pkg import x` avoid)
224
+
225
+
### Running the linters
226
+
- Before committing any non-trivial change, run `ruff check pybreeze/` locally to catch these rules — `ruff` covers the majority of SonarQube/Codacy Python rules
227
+
- When adding a new rule exception, justify it in a `# noqa: RULE` comment with a short reason — never blanket-disable