README.md: 13 additions & 4 deletions
@@ -68,12 +68,21 @@ This repository focuses strictly on the Python-side automation layer; the render
 ### Fuzz testing

-- `invoke test_fuzz` (or `invoke test-fuzz`) runs `html2pdf4doc/html2pdf4doc_fuzzer.py` against whichever HTML fixture you point it to. The default task configuration targets one of the StrictDoc exports under a path like `tests/fuzz/<fixture_name>/`, but you can edit the task or call the script directly to fuzz any other HTML bundle.
-- Pass `--long` to run 200 iterations instead of the default 20. Failures copy the entire fuzz fixture tree plus the mutated HTML/PDF into `output/<relative_fixture_path>/`, so you can inspect the exact inputs that triggered issues.
+- **Generic helper.** `invoke test_fuzz` (aliases `invoke test-fuzz` / `invoke tf`) runs `html2pdf4doc/html2pdf4doc_fuzzer.py` for any HTML bundle: pass `--input-file=<path/to/file.html>` and `--root-path=<fixture_root>` to control the target. By default it points to the sample folder `tests/fuzz/_your_fixture_slug_/`, which contains `file_to_mutate.html` plus an `_assets/` directory. When you create a new non-StrictDoc fixture, mirror that structure (root folder with the HTML and assets) and provide its paths via `--input-file/--root-path`. You can also call the Python script directly:
+  Shells like `zsh` require a trailing `\` when breaking a command across lines; otherwise keep it on one line. Add `--long` (via Invoke or a direct call) to run 200 iterations instead of 20.
+
+- **StrictDoc helper.** `invoke sdoc_test_fuzz` (aliases `invoke sdoc-test-fuzz` / `invoke sdoc-tf`) assumes the fixture lives under `tests/fuzz/<slug>/strictdoc/docs/` and auto-detects the primary HTML file unless you override it via `--html=name.html`. The helper scans `strictdoc/docs/` for “base” `.html` files (ignoring anything with `.mut.` in the name); if it finds exactly one, that file is used automatically, otherwise you must pass `--html`. The default slug `_your_strictdoc_fixture_slug_` demonstrates the expected layout (`strictdoc/docs/strictdoc_file_to_mutate.html` plus `_assets`). For real StrictDoc exports, drop them into `tests/fuzz/<your_slug>/strictdoc/docs/` and run `invoke sdoc-test-fuzz --fixture=<your_slug> [--html=...] [--long]`. You can still use the generic `invoke test-fuzz --input-file=tests/fuzz/<slug>/strictdoc/docs/<file>.html --root-path=tests/fuzz/<slug>/` if you prefer explicit paths.
+
+- Failures in either helper copy the entire fixture tree and mutated HTML/PDF into `output/<relative_fixture_path>/`, so you can inspect the exact inputs that triggered issues.
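
The auto-detection rule described for the StrictDoc helper can be sketched as follows. This is a minimal illustration, not the task's actual code: the function name `detect_base_html` is hypothetical, and it assumes the scan is non-recursive and that an explicit `--html` value is resolved relative to the docs directory.

```python
from pathlib import Path
from typing import Optional


def detect_base_html(docs_dir: Path, html: Optional[str] = None) -> Path:
    """Pick the HTML file to fuzz from a StrictDoc docs directory (hypothetical helper)."""
    if html is not None:
        # An explicit --html value takes precedence over auto-detection.
        return docs_dir / html
    # "Base" files are .html files whose names do not contain ".mut.".
    candidates = sorted(
        path for path in docs_dir.glob("*.html") if ".mut." not in path.name
    )
    if len(candidates) != 1:
        names = ", ".join(path.name for path in candidates) or "none"
        raise ValueError(
            f"Expected exactly one base HTML file in {docs_dir}, found: {names}. "
            "Pass --html to choose one."
        )
    return candidates[0]
```

Raising an error when zero or several base files are found mirrors the "otherwise you must pass `--html`" behavior described above.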
 ## Fuzz Fixture & Mutation Questions

-- **Fixture selection.** The fuzzer expects two CLI arguments: the path to a single HTML file (`input_file`) and the path to the root directory that contains the file plus all required assets (`root_path`). Out of the box, the `invoke test_fuzz` task points to one of the StrictDoc exports in `tests/fuzz/<fixture_name>/`, but you can drop in any other HTML bundle by adjusting the task arguments or calling the script directly. Each run processes exactly one fixture; to cover several, run the command repeatedly or wrap the fuzzer with a loop. The output folder names include the relative root, so multiple fixtures can coexist.
+- **Fixture selection.** The fuzzer expects two CLI arguments: the path to a single HTML file (`input_file`) and the path to the root directory that contains the file plus all required assets (`root_path`). Out of the box, the `invoke test_fuzz` task points to `tests/fuzz/_your_fixture_slug_/file_to_mutate.html`, and `invoke sdoc_test_fuzz` targets `tests/fuzz/_your_strictdoc_fixture_slug_/strictdoc/docs/strictdoc_file_to_mutate.html`. Swap these paths for your own bundles whenever needed. Each run processes exactly one fixture; to cover several, run the command repeatedly or wrap the fuzzer with a loop. The output folder names include the relative root, so multiple fixtures can coexist.

 - **Mutations and how to change them.** `html2pdf4doc/html2pdf4doc_fuzzer.py` parses the HTML with `lxml`, collects all `<p>` and `<td>` elements, and performs up to 25 iterations where it picks a random node and replaces its `.text` with sentences from `Faker`. The mutator intentionally stays within well-formed DOM changes so we stress realistic content variations rather than intentional corruption. To extend mutation types, modify the `mutate_and_print()` function by changing the XPath, touching attributes, inserting/removing nodes, or composing several mutation strategies. Just keep the DOM valid and serialize back to HTML via `etree.tostring()` before printing.
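
The mutation strategy described above (collect `<p>` and `<td>` elements, pick one at random, replace its `.text`) can be illustrated with a dependency-free sketch. It is not the fuzzer's code: it swaps in the standard-library `xml.etree.ElementTree` for `lxml` and a fixed sentence list for `Faker`, so it only handles well-formed XHTML-style markup. The name `mutate_text_nodes` and the choice to draw the mutation count between 1 and 25 are assumptions.

```python
import random
import xml.etree.ElementTree as ET

# Stand-in for Faker output; the real fuzzer generates sentences with Faker.
SENTENCES = [
    "Short replacement text.",
    "A noticeably longer replacement sentence that may reflow the page layout.",
    "Another sentence.",
]


def mutate_text_nodes(markup: str, rng: random.Random, max_mutations: int = 25) -> str:
    """Replace the text of randomly chosen <p>/<td> elements (hypothetical helper)."""
    root = ET.fromstring(markup)
    # Collect every <p> and <td> element in document order.
    targets = [el for el in root.iter() if el.tag in ("p", "td")]
    if not targets:
        # Nothing to mutate: return the input unchanged.
        return markup
    for _ in range(rng.randint(1, max_mutations)):
        node = rng.choice(targets)
        # Only .text changes, so the element tree keeps its structure.
        node.text = rng.choice(SENTENCES)
    return ET.tostring(root, encoding="unicode")
```

Passing a seeded `random.Random` makes a failing mutation reproducible, which is convenient when inspecting the inputs that triggered an issue.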
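
For the fixture-selection note above, covering several fixtures means one fuzzer run per fixture. A minimal wrapper might look like this; it only uses the `--input-file`, `--root-path`, and `--long` options named in this README, while the function names and any fixture paths you pass in are hypothetical.

```python
import subprocess
from typing import Iterable, List, Tuple


def build_fuzz_commands(
    fixtures: Iterable[Tuple[str, str]], long: bool = False
) -> List[List[str]]:
    """Return one `invoke test-fuzz` command per (input_file, root_path) pair."""
    commands = []
    for input_file, root_path in fixtures:
        command = [
            "invoke",
            "test-fuzz",
            f"--input-file={input_file}",
            f"--root-path={root_path}",
        ]
        if long:
            command.append("--long")
        commands.append(command)
    return commands


def run_all(fixtures: Iterable[Tuple[str, str]], long: bool = False) -> List[int]:
    """Run each fixture in turn and collect the exit codes."""
    return [
        subprocess.run(command, check=False).returncode
        for command in build_fuzz_commands(fixtures, long)
    ]
```

Collecting exit codes instead of stopping at the first failure lets every fixture run, so each failing one leaves its own folder under `output/`.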
@@ -87,7 +96,7 @@ This repository focuses strictly on the Python-side automation layer; the render
 - `invoke build`: rebuild the JS core (`submodules/html2pdf`) and copy the minified bundle into `html2pdf4doc/html2pdf4doc_js/`. Whenever the JS submodule changes, pull it (e.g., `git submodule update --remote --merge` or `cd submodules/html2pdf && git pull`), run this task, and only then rerun tests/release steps so the Python CLI packages the refreshed `html2pdf4doc.min.js`.
 - `invoke lint`: run formatting (`ruff format`), lint (`ruff check`), and type checks (`mypy`) across the Python sources.
 - `invoke test` / `invoke test_integration`: execute the integration suite via `lit`.
-- `invoke test_fuzz [--long]`: run the HTML fuzzing harness for 20 (or 200) iterations.
+- `invoke test_fuzz [--input-file=… --root-path=… --long]` (aka `invoke test-fuzz` / `invoke tf`): run the HTML fuzzing harness for 20 (or 200) iterations. `invoke sdoc_test_fuzz --fixture=<slug> [--html=… --long]` (aka `invoke sdoc-test-fuzz` / `invoke sdoc-tf`) is a shortcut tailored for StrictDoc exports and auto-detects the main HTML file.
 - `invoke clean_itest_artifacts`: scrub generated integration-test outputs via `git clean`.
 - `invoke package` / `invoke release`: build wheels/sdists and optionally upload to (test) PyPI after running `twine check`.