SLD2reader is a command-line tool for inspecting .sdc SLD2 dictionary
files and exporting their contents into more open, easier-to-process formats.
It is designed for BYO-files workflows: you supply your own SLD2 files,
run the tool locally, and keep the original content under your control.
Use SLD2reader to:
- inspect unknown `SLD2` files
- extract readable data from supported file families
- export entries into generic formats such as plain text, JSONL, structured JSONL, and DSL
- produce machine-readable manifests and validation artifacts
- move data out of opaque source containers and into formats that are easier to preserve, search, transform, and reuse
This project intentionally does not:
- bundle dictionary content
- grant redistribution rights over source files or exports
- provide a GUI
- promise byte-faithful recreation of every original presentation detail
- Python 3.10+
- Your own SLD2 `.sdc` files
- No required third-party Python packages for the core CLI
- Optional: downstream tools such as `pyglossary` if you want to convert DSL into other dictionary ecosystems
- Inspect a file: `python3 sld2reader.py inspect your_dictionary.sdc`
- Export directly readable material where available: `python3 sld2reader.py export-readable your_dictionary.sdc out_readable`
- Export the best currently supported open form: `python3 sld2reader.py export-decoded your_dictionary.sdc out_export`
- See the full CLI surface: `python3 sld2reader.py --help`
SLD2reader can produce several output styles. Choose the format that matches
your workflow:
| Output | Best for | Notes |
|---|---|---|
| Plain text | Quick inspection | Simple, readable exports |
| JSONL | Scripts and indexing | One JSON object per record |
| Structured JSONL | Highest-fidelity machine-readable export | Best choice when you want richer structure |
| DSL | Dictionary-oriented exchange | Useful as a bridge into other conversion tools |
| Manifest JSON | Automation and provenance | Summarizes written outputs and counts |
| Validation/report JSON | Regression and integrity checks | Useful for repeatable local workflows |
- Prefer structured JSONL for the richest open export.
- Prefer flat JSONL for simpler entry-oriented pipelines.
- Prefer DSL when your goal is downstream dictionary conversion.
- Prefer plain text when you are first exploring an unfamiliar file.
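
Because the JSONL outputs hold one JSON object per record, they can be consumed line by line with the standard library alone. The following is a minimal sketch, not part of SLD2reader itself: the helper name `iter_jsonl` is hypothetical, and it assumes the export is UTF-8 encoded and that each record is a JSON object.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_jsonl(path: Path) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a JSONL file.

    Assumption: the file is UTF-8 and every non-empty line is one record.
    """
    with path.open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                # Report the offending line so a bad export is easy to locate.
                raise ValueError(
                    f"{path}:{line_number}: invalid JSON ({exc.msg})"
                ) from exc


# Example use (the path is a placeholder; real filenames depend on the input):
#   for record in iter_jsonl(Path("out_export/your_dictionary.jsonl")):
#       print(sorted(record))
```

Streaming the file this way keeps memory use flat for large dictionaries, which is the main practical reason to prefer JSONL over a single JSON array in scripts and indexers.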
| Command | Purpose |
|---|---|
| `inspect` | Show high-level metadata and section/container information |
| `export-readable` | Write directly readable text-based material where available |
| `export-decoded` | Write the best currently supported open export for the given file |
Some specialized analysis and validation commands also exist for local research
workflows. They are intentionally omitted from this high-level README so the
public documentation stays generic and file-family-neutral. Use
`python3 sld2reader.py --help` when you need the full command list.
The public goal of this tool is to move SLD2 data into formats that are easier
to inspect, preserve, transform, and use elsewhere.
For most users, that means:
- inspect the source file
- run `export-decoded`
- choose JSONL or DSL depending on the next tool in the pipeline
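
The steps above can be scripted when several files need the same treatment. This is a rough sketch under stated assumptions: `build_commands` and `run_pipeline` are hypothetical helper names, `sld2reader.py` is assumed to sit in the current working directory, and the CLI is assumed to signal failure with a non-zero exit status, which this README does not state.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(
    sdc_file: Path,
    out_dir: Path,
    script: Path = Path("sld2reader.py"),  # assumed location of the CLI script
) -> list[list[str]]:
    """Return the inspect and export-decoded invocations as argument lists."""
    return [
        [sys.executable, str(script), "inspect", str(sdc_file)],
        [sys.executable, str(script), "export-decoded", str(sdc_file), str(out_dir)],
    ]


def run_pipeline(sdc_file: Path, out_dir: Path) -> None:
    """Run inspect first, then export-decoded, stopping at the first failure.

    Assumption: a failed command exits non-zero, so check=True raises
    subprocess.CalledProcessError and the export step is skipped.
    """
    for command in build_commands(sdc_file, out_dir):
        subprocess.run(command, check=True)


# Example use (placeholders, matching the names used elsewhere in this README):
#   run_pipeline(Path("your_dictionary.sdc"), Path("out_export"))
```

Passing argument lists rather than a shell string avoids quoting problems when a dictionary filename contains spaces.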
Some decoded export paths use a conservative alias policy:
- `aliases` / `export_aliases` contain display-safe exported aliases
- `raw_search_terms` preserves the broader recovered lookup vocabulary
- `rejected_search_terms` records lookup terms intentionally kept out of DSL keys and other standard dictionary outputs

When available, `--alias-mode legacy` restores the broader legacy alias
behavior for comparison/debugging.
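
To see what the alias policy did to a given entry, the three fields can be compared directly. The sketch below is illustrative only: it assumes each field appears on a decoded JSONL record as a list of strings, which this README does not specify, and `split_search_terms` is a hypothetical helper name.

```python
def split_search_terms(record: dict) -> dict[str, list[str]]:
    """Group a record's lookup terms by how the alias policy treated them.

    Assumption: the alias fields are lists of strings on each record.
    Missing fields are treated as empty lists.
    """
    exported = list(record.get("aliases") or record.get("export_aliases") or [])
    raw = list(record.get("raw_search_terms") or [])
    rejected = list(record.get("rejected_search_terms") or [])

    # Raw terms that were neither exported nor explicitly rejected; useful
    # when checking that the policy accounts for every recovered term.
    accounted = set(exported) | set(rejected)
    unaccounted = [term for term in raw if term not in accounted]

    return {"exported": exported, "rejected": rejected, "unaccounted": unaccounted}
```

Running this over the same file exported with and without `--alias-mode legacy` is one way to compare the two behaviors entry by entry.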
`python3 sld2reader.py inspect your_dictionary.sdc`

`python3 sld2reader.py export-readable your_dictionary.sdc out_readable`

`python3 sld2reader.py export-decoded your_dictionary.sdc out_export`

Then choose:
- `*.jsonl` for flat machine-readable records
- `*_structured_entries.jsonl` for richer structure where available
- `*.dsl` for dictionary-oriented interchange
SLD2reader intentionally focuses on extraction and open intermediate formats.
If you need another dictionary format, the usual workflow is:
- export DSL with `SLD2reader`
- convert that DSL with a downstream tool such as `pyglossary`
Exact filenames depend on the input file and export path, but common artifacts include:
- decoded entry exports (`*.jsonl`)
- structured entry exports (`*_structured_entries.jsonl`)
- DSL exports (`*.dsl`)
- manifest files (`*_decoded_manifest.json`)
- validation/report files (`*_validation_report.json`)
- optional analysis artifacts for deeper research commands
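
A quick way to check what an export run produced is to group the files in the output directory by the filename patterns listed above. This is a small sketch, not a feature of the tool: the helper names and labels are hypothetical, and it assumes all artifacts are written directly into one output directory.

```python
from pathlib import Path

# Most specific suffixes first: "*_structured_entries.jsonl" also ends in
# ".jsonl", so it must be tested before the generic decoded-entry pattern.
ARTIFACT_SUFFIXES = [
    ("_structured_entries.jsonl", "structured entries"),
    ("_decoded_manifest.json", "manifest"),
    ("_validation_report.json", "validation report"),
    (".jsonl", "decoded entries"),
    (".dsl", "DSL"),
]


def classify_artifact(name: str) -> str:
    """Return a label for an export filename, or "other" if none matches."""
    for suffix, label in ARTIFACT_SUFFIXES:
        if name.endswith(suffix):
            return label
    return "other"


def summarize_exports(out_dir: Path) -> dict[str, list[str]]:
    """Map each artifact label to the sorted filenames found in out_dir."""
    summary: dict[str, list[str]] = {}
    for path in sorted(out_dir.iterdir()):
        if path.is_file():
            summary.setdefault(classify_artifact(path.name), []).append(path.name)
    return summary


# Example use (the directory name is a placeholder):
#   for label, names in summarize_exports(Path("out_export")).items():
#       print(label, names)
```

Anything reported under "other" is worth a second look; it may be one of the optional analysis artifacts or a file the patterns above do not cover.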
SLD2reader is a generic SLD2 reader, but practical decoded export coverage
still varies by file family.
The safest assumption is:
- every file should start with `inspect`
- some files may expose readable sections immediately
- some files may support richer decoded export
- some advanced research commands are still tailored to local sample fixtures
| Path | Purpose |
|---|---|
| `sld2reader.py` | Thin CLI entrypoint |
| `sld2reader_lib/` | Internal parsing, decoding, exporting, and validation logic |
| `README.md` | High-level project overview |
| `WORKFLOW.txt` | Practical BYO-files usage walkthrough |
| `FORMAT_NOTES.txt` | Lower-level format and reverse-engineering notes |
| `CONTRIBUTING.md` | Contributor workflow and publishing guidance |
- Decoded export coverage varies by file family.
- Unknown `SLD2` files may be inspectable before they are exportable.
- Some export paths remain experimental.
- Higher-fidelity structured exports are still approximations rather than perfect recreations of original presentation layers.
- The deeper analysis commands are intended for investigation and refinement, not for everyday export workflows.