Cross‑platform digital preservation conversion tools for city and cultural archives.
This repository contains two Bash‑based converters designed for reliable, logged, repeatable format migration on:
- macOS
- Windows 11 (WSL)
- Linux
No ImageMagick is required.
Converts common raster image formats to archival TIFF files and embeds descriptive metadata.
Input formats:
- JPG / JPEG
- PNG
- TIFF
Output format:
- TIFF (.tiff) with XMP metadata
Converters used:
- macOS: ffmpeg
- Windows 11 WSL / Linux: ffmpeg
Metadata tool:
exiftool
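The two-step flow described above (ffmpeg writes the TIFF, exiftool embeds XMP) might look roughly like the sketch below. The function names and the output folder argument are illustrative assumptions, not names taken from the scripts in this repository, and only a single title field is written here.

```bash
# Sketch only: tiff_path_for and convert_image are hypothetical helper names.

# Map e.g. input_images/photo.JPG to <out_dir>/photo.tiff.
tiff_path_for() {
    local src="$1" out_dir="$2" base
    base=$(basename "$src")
    printf '%s/%s.tiff\n' "$out_dir" "${base%.*}"
}

# Convert one image to TIFF, then embed a title as XMP in the new file.
convert_image() {
    local src="$1" out_dir="$2" title="$3" dest
    dest=$(tiff_path_for "$src" "$out_dir")
    mkdir -p "$out_dir" || return 1
    # -n: never overwrite an existing output; -frames:v 1: write a single image.
    ffmpeg -nostdin -loglevel error -n -i "$src" -frames:v 1 "$dest" || return 1
    # -overwrite_original applies to the derivative TIFF, not to the source image.
    exiftool -overwrite_original "-XMP-dc:Title=$title" "$dest"
}

# Example call (the output folder name is an assumption):
#   convert_image input_images/photo.jpg output_images "Town Hall, 1950"
```

The source image is only ever read, which is consistent with the rule that originals are never modified.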
Migrates legacy Word 97–2003 (.doc) files to Word 2007+ (.docx) format.
Converter used:
LibreOffice (headless CLI)
Original .doc files are preserved. .docx files are migration derivatives.
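A headless LibreOffice conversion of a single file could be sketched as follows. The helper names and the output folder argument are assumptions for illustration. The macOS bundle path is the one this README documents, and the fallback assumes soffice is on the PATH.

```bash
# Sketch only: find_soffice and convert_doc are hypothetical helper names.

# Print the soffice executable to use: the macOS app bundle if present,
# otherwise whatever "soffice" resolves to on the PATH.
find_soffice() {
    local mac_path="/Applications/LibreOffice.app/Contents/MacOS/soffice"
    if [ -x "$mac_path" ]; then
        printf '%s\n' "$mac_path"
    else
        command -v soffice
    fi
}

# Convert one .doc file to .docx in out_dir; the source file is only read.
convert_doc() {
    local src="$1" out_dir="$2" soffice_bin
    soffice_bin=$(find_soffice) || { echo "soffice not found" >&2; return 1; }
    mkdir -p "$out_dir" || return 1
    "$soffice_bin" --headless --convert-to docx --outdir "$out_dir" "$src"
}

# Example call (the output folder name is an assumption):
#   convert_doc input_docs/minutes_1998.doc output_docs
```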
macOS requirements:
- ffmpeg
- exiftool
- dialog
- LibreOffice (installed via .dmg)
Install LibreOffice manually, then run the following in a Bash terminal:
brew install ffmpeg exiftool dialog
Verify:
ffmpeg -version
exiftool -ver
dialog --version
LibreOffice CLI path used automatically:
/Applications/LibreOffice.app/Contents/MacOS/soffice
Windows 11 (WSL) / Linux requirements:
- ffmpeg
- exiftool
- dialog
- LibreOffice
Windows Subsystem for Linux (WSL) is a Windows 10 and 11 feature that runs a Linux environment directly within Windows.
To install WSL on Windows 11, run wsl --install in an elevated PowerShell or Command Prompt session. This enables the required Windows features and installs a default Linux distribution, typically Ubuntu.
Inside WSL (Ubuntu/Debian):
sudo apt update
sudo apt install ffmpeg exiftool dialog -y
Verify:
ffmpeg -version
exiftool -ver
dialog --version
soffice --version
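The manual checks above can also be collapsed into one pre-flight test. The following is a minimal sketch, and missing_tools is a hypothetical helper name rather than part of the shipped scripts.

```bash
# Sketch only: print the name of every required tool that is not on the PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example use before a conversion run:
#   missing=$(missing_tools ffmpeg exiftool dialog soffice)
#   [ -z "$missing" ] || { echo "Missing tools: $missing" >&2; exit 1; }
```

On macOS, soffice is usually not on the PATH, so a check like this would need the full application path instead.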
bash afconverter
Put your image files into the input_images/ folder.
You will be prompted for metadata fields (optional):
- Title
- Creator / Photographer
- Description
- City / Location
- Archive / Source
- Rights
- Year
All images in input_images/ are processed.
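Because every field is optional, the metadata step has to skip empty values. One way to do that is sketched below. The mapping from prompt fields to XMP tags is an assumption (the tag names themselves are standard ExifTool XMP tags), and write_xmp and the EXIFTOOL override variable are hypothetical names.

```bash
# Sketch only. Arguments:
#   file title creator description city archive rights year
write_xmp() {
    local file="$1" title="$2" creator="$3" description="$4"
    local city="$5" archive="$6" rights="$7" year="$8"
    # Rebuild the positional parameters as the exiftool argument list.
    set -- -overwrite_original
    [ -n "$title" ]       && set -- "$@" "-XMP-dc:Title=$title"
    [ -n "$creator" ]     && set -- "$@" "-XMP-dc:Creator=$creator"
    [ -n "$description" ] && set -- "$@" "-XMP-dc:Description=$description"
    [ -n "$city" ]        && set -- "$@" "-XMP-photoshop:City=$city"
    [ -n "$archive" ]     && set -- "$@" "-XMP-dc:Source=$archive"
    [ -n "$rights" ]      && set -- "$@" "-XMP-dc:Rights=$rights"
    [ -n "$year" ]        && set -- "$@" "-XMP-dc:Date=$year"
    # Only -overwrite_original is present: no field was filled in.
    [ "$#" -gt 1 ] || return 0
    # EXIFTOOL is a hypothetical override, useful for dry runs.
    "${EXIFTOOL:-exiftool}" "$@" "$file"
}

# Example call with only title, city and year filled in:
#   write_xmp output_images/photo.tiff "Town Hall" "" "" "Springfield" "" "" "1950"
```

The call targets the derivative TIFF, so -overwrite_original never touches the source image.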
Put your .doc files into the input_docs/ folder.
All .doc / .DOC files in input_docs/ are converted.
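Selecting .doc and .DOC files from a single folder, without descending into subfolders, can be sketched like this. list_docs is a hypothetical helper name, and the actual script may select files differently.

```bash
# Sketch only: list regular files ending in .doc or .DOC directly inside dir.
list_docs() {
    local dir="$1" f
    for f in "$dir"/*; do
        [ -f "$f" ] || continue      # skips subfolders and an unmatched glob
        case "$f" in
            *.doc|*.DOC) printf '%s\n' "$f" ;;
        esac
    done
}

# Example use:
#   list_docs input_docs | while IFS= read -r doc; do echo "would convert: $doc"; done
```

Files ending in .docx do not match either pattern, so existing derivatives in the same folder would be left alone.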
Both scripts:
- Append to log files (never overwrite)
- Add timestamps per run
- Record tool paths and conversion actions
Logs:
- logs/image_convert.log
- logs/doc_convert.log
These logs support auditability and provenance tracking.
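Append-only, timestamped logging of this kind needs very little code. The sketch below assumes a hypothetical log_line helper and an ISO-like timestamp format, and the format used by the real scripts may differ.

```bash
# Sketch only: append one timestamped line to a log file, never truncating it.
log_line() {
    local log_file="$1"
    shift
    mkdir -p "$(dirname "$log_file")" || return 1
    # ">>" appends, so entries from earlier runs are preserved.
    printf '%s %s\n' "$(date '+%Y-%m-%dT%H:%M:%S%z')" "$*" >> "$log_file"
}

# Example use at the start of a run:
#   log_line logs/image_convert.log "=== run started ==="
#   log_line logs/image_convert.log "ffmpeg path: $(command -v ffmpeg)"
```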
- Originals are never modified
- Outputs are deterministic and repeatable
- TIFF + XMP metadata is standards‑compliant
- DOCX is used as an access/migration format
These workflows align with common practices in:
- Municipal archives
- Cultural heritage institutions
- Digital preservation pipelines
- Image compression uses the converter's default settings (can be extended)
- No recursive folder processing (by design)
- No checksum generation (can be added)
Possible enhancements:
- SHA‑256 checksums
- Recursive directory support
- CSV conversion reports
- PDF/A generation
- Unified image + document pipeline
- Double‑clickable macOS launcher
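As an illustration of the first enhancement, a portable SHA-256 helper might look like the sketch below. It is not part of the current scripts, sha256_of is a hypothetical name, and it assumes sha256sum (typical on Linux and WSL) or shasum (typical on macOS) is available.

```bash
# Sketch only: print the SHA-256 hex digest of one file.
sha256_of() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | awk '{print $1}'
    else
        shasum -a 256 "$1" | awk '{print $1}'
    fi
}

# Example use: record a fixity value for a derivative (paths are assumptions).
#   echo "$(sha256_of output_images/photo.tiff)  output_images/photo.tiff" >> logs/checksums.sha256
```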



