Commit e623599

docs: expand README fuzzing instructions
1 parent 25b142d commit e623599

1 file changed

Lines changed: 13 additions & 4 deletions

README.md

@@ -68,12 +68,21 @@ This repository focuses strictly on the Python-side automation layer; the render
 
 ### Fuzz testing
 
-- `invoke test_fuzz` (or `invoke test-fuzz`) runs `html2pdf4doc/html2pdf4doc_fuzzer.py` against whichever HTML fixture you point it to. The default task configuration targets one of the StrictDoc exports under a path like `tests/fuzz/<fixture_name>/`, but you can edit the task or call the script directly to fuzz any other HTML bundle.
-- Pass `--long` to run 200 iterations instead of the default 20. Failures copy the entire fuzz fixture tree plus the mutated HTML/PDF into `output/<relative_fixture_path>/`, so you can inspect the exact inputs that triggered issues.
+- **Generic helper.** `invoke test_fuzz` (aliases `invoke test-fuzz` / `invoke tf`) runs `html2pdf4doc/html2pdf4doc_fuzzer.py` for any HTML bundle: pass `--input-file=<path/to/file.html>` and `--root-path=<fixture_root>` to control the target. By default it points to the sample folder `tests/fuzz/_your_fixture_slug_/` which contains `file_to_mutate.html` plus an `_assets/` directory. When you create a new non-StrictDoc fixture, mirror that structure (root folder with the HTML and assets) and provide its paths via `--input-file/--root-path`. You can also call the Python script directly:
+```bash
+python html2pdf4doc/html2pdf4doc_fuzzer.py \
+tests/fuzz/<_your_fixture_slug_>/file_to_mutate.html \
+tests/fuzz/<_your_fixture_slug_>/
+```
+Shells like `zsh` require trailing `\` when breaking a command across lines; otherwise keep it on one line. Add `--long` (either via Invoke or direct call) to run 200 iterations instead of 20.
+
+- **StrictDoc helper.** `invoke sdoc_test_fuzz` (aliases `invoke sdoc-test-fuzz` / `invoke sdoc-tf`) assumes the fixture lives under `tests/fuzz/<slug>/strictdoc/docs/` and auto-detects the primary HTML file unless you override it via `--html=name.html`. The helper scans `strictdoc/docs/` for “base” `.html` files (ignoring anything with `.mut.` in the name); if it finds exactly one, that file is used automatically, otherwise you must pass `--html`. The default slug `_your_strictdoc_fixture_slug_` demonstrates the expected layout (`strictdoc/docs/strictdoc_file_to_mutate.html` plus `_assets`). For real StrictDoc exports, drop them into `tests/fuzz/<your_slug>/strictdoc/docs/` and run `invoke sdoc-test-fuzz --fixture=<your_slug> [--html=...] [--long]`. You can still use the generic `invoke test-fuzz --input-file=tests/fuzz/<slug>/strictdoc/docs/<file>.html --root-path=tests/fuzz/<slug>/` if you prefer explicit paths.
+
+- Failures in either helper copy the entire fixture tree and mutated HTML/PDF into `output/<relative_fixture_path>/`, so you can inspect the exact inputs that triggered issues.
 
 ## Fuzz Fixture & Mutation Questions
 
-- **Fixture selection.** The fuzzer expects two CLI arguments: the path to a single HTML file (`input_file`) and the path to the root directory that contains the file plus all required assets (`root_path`). Out of the box, the `invoke test_fuzz` task points to one of the StrictDoc exports in `tests/fuzz/<fixture_name>/`, but you can drop in any other HTML bundle by adjusting the task arguments or calling the script directly. Each run processes exactly one fixture; to cover several, run the command repeatedly or wrap the fuzzer with a loop. The output folder names include the relative root, so multiple fixtures can coexist.
+- **Fixture selection.** The fuzzer expects two CLI arguments: the path to a single HTML file (`input_file`) and the path to the root directory that contains the file plus all required assets (`root_path`). Out of the box, the `invoke test_fuzz` task points to `tests/fuzz/_your_fixture_slug_/file_to_mutate.html`, and `invoke sdoc_test_fuzz` targets `tests/fuzz/_your_strictdoc_fixture_slug_/strictdoc/docs/strictdoc_file_to_mutate.html`. Swap these paths for your own bundles whenever needed. Each run processes exactly one fixture; to cover several, run the command repeatedly or wrap the fuzzer with a loop. The output folder names include the relative root, so multiple fixtures can coexist.
 
 - **Mutations and how to change them.** `html2pdf4doc/html2pdf4doc_fuzzer.py` parses the HTML with `lxml`, collects all `<p>` and `<td>` elements, and performs up to 25 iterations where it picks a random node and replaces its `.text` with sentences from `Faker`. The mutator intentionally stays within well-formed DOM changes so we stress realistic content variations rather than intentional corruption. To extend mutation types, modify `mutate_and_print()` function by changing the XPath, touching attributes, inserting/removing nodes, or composing several mutation strategies. Just keep the DOM valid and serialize back to HTML via `etree.tostring()` before printing.

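Reviewer note: the mutation strategy described in the "Mutations and how to change them" bullet can be illustrated with a minimal sketch. This is not the repository's `mutate_and_print()`; it only mirrors the described idea under stated assumptions. To stay runnable without third-party packages, it swaps in the standard-library `xml.etree.ElementTree` for `lxml` and a fixed word list for `Faker`, and the names `fake_sentence`, `mutate_text_nodes`, and `SAMPLE` are hypothetical.

```python
import random
import xml.etree.ElementTree as ET

# Stand-in vocabulary (assumption): the real fuzzer draws sentences from Faker.
WORDS = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot"]


def fake_sentence(rng: random.Random) -> str:
    """Build a short pseudo-sentence (hypothetical replacement for Faker)."""
    count = rng.randint(3, 8)
    return " ".join(rng.choice(WORDS) for _ in range(count)).capitalize() + "."


def mutate_text_nodes(markup: str, iterations: int = 25, seed: int = 0) -> str:
    """Replace the text of randomly chosen <p>/<td> elements.

    Only the .text of existing elements changes, so the tree stays well formed.
    """
    rng = random.Random(seed)
    root = ET.fromstring(markup)
    # Collect the candidate elements once, before mutating anything.
    candidates = [el for el in root.iter() if el.tag in ("p", "td")]
    if not candidates:
        return markup  # nothing to mutate
    for _ in range(iterations):
        rng.choice(candidates).text = fake_sentence(rng)
    # Serialize the mutated tree back to markup.
    return ET.tostring(root, encoding="unicode")


# Tiny well-formed sample (assumption); real fixtures are full HTML exports.
SAMPLE = (
    "<html><body><p>Intro</p>"
    "<table><tr><td>Cell A</td><td>Cell B</td></tr></table>"
    "</body></html>"
)

if __name__ == "__main__":
    print(mutate_text_nodes(SAMPLE, iterations=5, seed=42))
```

Passing a seed makes a failing mutation reproducible, which may be useful when inspecting the copied fixture tree under `output/`. Whether the real fuzzer seeds its random source is not stated in this diff.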
@@ -87,7 +96,7 @@ This repository focuses strictly on the Python-side automation layer; the render
 - `invoke build`: rebuild the JS core (`submodules/html2pdf`) and copy the minified bundle into `html2pdf4doc/html2pdf4doc_js/`. Whenever the JS submodule changes, pull it (e.g., `git submodule update --remote --merge` or `cd submodules/html2pdf && git pull`), run this task, and only then rerun tests/release steps so the Python CLI packages the refreshed `html2pdf4doc.min.js`.
 - `invoke lint`: run formatting (`ruff format`), lint (`ruff check`), and type checks (`mypy`) across the Python sources.
 - `invoke test` / `invoke test_integration`: execute the integration suite via `lit`.
-- `invoke test_fuzz [--long]`: run the HTML fuzzing harness for 20 (or 200) iterations.
+- `invoke test_fuzz [--input-file=… --root-path=… --long]` (aka `invoke test-fuzz` / `invoke tf`): run the HTML fuzzing harness for 20 (or 200) iterations. `invoke sdoc_test_fuzz --fixture=<slug> [--html=… --long]` (aka `invoke sdoc-test-fuzz` / `invoke sdoc-tf`) is a shortcut tailored for StrictDoc exports and auto-detects the main HTML file.
 - `invoke clean_itest_artifacts`: scrub generated integration-test outputs via `git clean`.
 - `invoke package` / `invoke release`: build wheels/sdists and optionally upload to (test) PyPI after running `twine check`.

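Reviewer note: the README text in this commit describes two behaviours that a short sketch may make easier to review. One is the StrictDoc helper's auto-detection of a single "base" `.html` file (ignoring names containing `.mut.`); the other is covering several fixtures by wrapping the fuzzer in a loop. The code below is an illustration under assumptions, not the task implementation: the function names are hypothetical, and it assumes only the top level of `strictdoc/docs/` is scanned, which the diff does not confirm.

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def find_base_html(docs_dir: Path) -> Optional[Path]:
    """Return the only non-mutated .html file in docs_dir, or None if ambiguous.

    Assumption: only the top level of docs_dir is scanned.
    """
    bases = sorted(p for p in docs_dir.glob("*.html") if ".mut." not in p.name)
    return bases[0] if len(bases) == 1 else None


def build_commands(fuzz_root: Path) -> List[List[str]]:
    """Build one `invoke sdoc-test-fuzz` command per auto-detectable fixture."""
    commands = []
    for slug_dir in sorted(p for p in fuzz_root.iterdir() if p.is_dir()):
        docs_dir = slug_dir / "strictdoc" / "docs"
        # Skip fixtures where the helper would require an explicit --html.
        if docs_dir.is_dir() and find_base_html(docs_dir) is not None:
            commands.append(
                ["invoke", "sdoc-test-fuzz", f"--fixture={slug_dir.name}"]
            )
    return commands


if __name__ == "__main__":
    # Throwaway layout with hypothetical slugs, for demonstration only.
    with tempfile.TemporaryDirectory() as tmp:
        docs = Path(tmp) / "demo_slug" / "strictdoc" / "docs"
        docs.mkdir(parents=True)
        (docs / "index.html").write_text("<html></html>")
        (docs / "index.mut.1.html").write_text("<html></html>")
        for command in build_commands(Path(tmp)):
            print(" ".join(command))
```

Each command list could be handed to `subprocess.run(command, check=True)` from the repository root to run the fixtures one after another. Fixtures with zero or several base files are skipped here because, per the README text, they need an explicit `--html`.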