j-grosse/afconverter

Archivist File Converter by J. Grosse

Cross‑platform digital preservation conversion tools for city and cultural archives.

This repository contains two Bash‑based converters designed for reliable, logged, repeatable format migration on:

  • macOS
  • Windows 11 (WSL)
  • Linux

No ImageMagick is required.


Included Tools

1. Image to TIFF+ Converter

Converts common raster image formats to archival TIFF files and embeds descriptive metadata.

Input formats:

  • JPG / JPEG
  • PNG
  • TIFF

Output format:

  • TIFF (.tiff) with XMP metadata

Converters used:

  • macOS: ffmpeg
  • Windows 11 WSL / Linux: ffmpeg

Metadata tool:

  • exiftool
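
As a rough illustration of how the two tools listed above can be combined, the sketch below converts one image with ffmpeg and then writes metadata with exiftool. The function name `convert_image`, the `FFMPEG` / `EXIFTOOL` override variables, and the chosen flags are assumptions made for this example; they are not taken from the afconverter script itself.

```bash
# Hypothetical helper, not the repository's actual code.
# FFMPEG and EXIFTOOL can be overridden, e.g. with a stub for a dry run.
FFMPEG="${FFMPEG:-ffmpeg}"
EXIFTOOL="${EXIFTOOL:-exiftool}"

# convert_image INPUT OUTPUT [EXIFTOOL_ARGS...]
convert_image() {
  local in="$1" out="$2"
  shift 2
  # -n: never overwrite an existing output; -nostdin: do not read from the terminal
  "$FFMPEG" -nostdin -loglevel error -n -i "$in" "$out" || return 1
  # Metadata is written to the new TIFF only; the source image is not touched.
  if [ "$#" -gt 0 ]; then
    "$EXIFTOOL" -overwrite_original "$@" "$out" || return 1
  fi
}
```

A call might look like `convert_image input_images/photo.jpg photo.tiff -XMP-dc:Title="Town hall"`, where the file names are placeholders.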

2. Word DOC to DOCX Converter

Migrates legacy Word 97–2003 (.doc) files to Word 2007+ (.docx) format.

Converter used:

  • LibreOffice (headless CLI)

Original .doc files are preserved. .docx files are migration derivatives.
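
A headless LibreOffice conversion of a single file could look roughly like the sketch below. The wrapper name `convert_doc` and the `SOFFICE` override variable are hypothetical; `--headless`, `--convert-to` and `--outdir` are standard LibreOffice command-line options. Because the output is written to a separate directory, the original .doc stays as it was.

```bash
# Hypothetical wrapper, not the repository's actual code.
SOFFICE="${SOFFICE:-soffice}"

# convert_doc INPUT.doc OUTDIR
convert_doc() {
  local in="$1" outdir="$2"
  # LibreOffice writes OUTDIR/<basename>.docx and leaves INPUT unchanged.
  "$SOFFICE" --headless --convert-to docx --outdir "$outdir" "$in"
}
```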


Requirements / Installation

macOS

  • ffmpeg
  • exiftool
  • dialog
  • LibreOffice (installed via .dmg)

Install LibreOffice manually from its .dmg, then install the remaining tools with Homebrew in a terminal:

brew install ffmpeg exiftool dialog

Verify:

ffmpeg -version
exiftool -ver
dialog --version

LibreOffice CLI path used automatically:

/Applications/LibreOffice.app/Contents/MacOS/soffice
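
One way such automatic detection can work is sketched below: try the macOS application path first, then fall back to whatever `soffice` is on the PATH (as on Linux or WSL). The function name `find_soffice` and the optional path argument are assumptions for illustration; the script's real lookup logic may differ.

```bash
# find_soffice [MAC_APP_BINARY]
# Prints the path of a usable soffice binary, or returns 1 if none is found.
find_soffice() {
  local mac_path="${1:-/Applications/LibreOffice.app/Contents/MacOS/soffice}"
  if [ -x "$mac_path" ]; then
    printf '%s\n' "$mac_path"
  elif command -v soffice >/dev/null 2>&1; then
    command -v soffice
  else
    return 1
  fi
}
```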

Windows 11 (WSL)

  • ffmpeg
  • exiftool
  • dialog
  • LibreOffice

Windows Subsystem for Linux (WSL) lets you run a Linux environment directly within Windows 10 and 11. To install it on Windows 11, run wsl --install in an elevated PowerShell or Command Prompt session. This enables the required Windows features and installs a default Linux distribution, typically Ubuntu.

Inside WSL (Ubuntu/Debian):

sudo apt update
sudo apt install ffmpeg exiftool dialog libreoffice -y

Verify:

ffmpeg -version
exiftool -ver
dialog --version
soffice --version
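
The manual checks above can also be scripted. The following is a minimal sketch, assuming a helper named `missing_tools` (hypothetical) that reports every required command not found on the PATH.

```bash
# missing_tools TOOL...
# Prints the name of each tool that is not available on the PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example: missing="$(missing_tools ffmpeg exiftool dialog soffice)"
# An empty result means all four tools were found.
```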

Usage

bash afconverter

Image → TIFF Conversion

Put your image files into the input_images/ folder.

You will be prompted for metadata fields (optional):

  • Title
  • Creator / Photographer
  • Description
  • City / Location
  • Archive / Source
  • Rights
  • Year
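
Since every field is optional, empty answers should not produce empty tags. The sketch below shows one possible way to turn the answers into exiftool arguments while skipping blank fields. The function names and the mapping of each prompt to an XMP tag (for example Year to `XMP-dc:Date`, City to `XMP-photoshop:City`) are assumptions for illustration; the tags the script actually writes may differ.

```bash
#!/usr/bin/env bash
# Requires bash (uses arrays). Hypothetical helpers, not the repository's code.
xmp_args=()

# add_field TAG VALUE: append -TAG=VALUE only when VALUE is non-empty
add_field() {
  local tag="$1" value="$2"
  [ -n "$value" ] && xmp_args+=("-$tag=$value")
  return 0
}

# build_xmp_args TITLE CREATOR DESCRIPTION CITY SOURCE RIGHTS YEAR
build_xmp_args() {
  xmp_args=()
  add_field XMP-dc:Title        "$1"
  add_field XMP-dc:Creator      "$2"
  add_field XMP-dc:Description  "$3"
  add_field XMP-photoshop:City  "$4"   # assumed tag for City / Location
  add_field XMP-dc:Source       "$5"   # assumed tag for Archive / Source
  add_field XMP-dc:Rights       "$6"
  add_field XMP-dc:Date         "$7"   # assumed tag for Year
}
```

The resulting array can then be passed on as `exiftool "${xmp_args[@]}" file.tiff`.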

All images in input_images/ are processed.
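
A non-recursive scan of the input folder could be written as in the sketch below. The function name `list_images` is hypothetical, and accepting the short `.tif` extension alongside `.tiff` is an assumption; matching is case-insensitive so that `.JPG` and `.jpg` are treated alike.

```bash
# list_images DIR
# Prints supported image files located directly inside DIR (no subfolders).
list_images() {
  local dir="$1" f
  for f in "$dir"/*; do
    [ -f "$f" ] || continue   # skip subfolders and an unmatched glob
    case "$f" in
      *.[jJ][pP][gG] | *.[jJ][pP][eE][gG] | *.[pP][nN][gG] | \
      *.[tT][iI][fF] | *.[tT][iI][fF][fF])
        printf '%s\n' "$f" ;;
    esac
  done
}
```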


DOC → DOCX Conversion

Put your .doc files into the input_docs/ folder.

All .doc / .DOC files in input_docs/ are converted.


Logging & Provenance

Both scripts:

  • Append to log files (never overwrite)
  • Add timestamps per run
  • Record tool paths and conversion actions

Logs:

  • logs/image_convert.log
  • logs/doc_convert.log

These logs support auditability and provenance tracking.
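
Append-only logging with a timestamp per entry can be sketched as follows. The helper name `log_line` and the UTC ISO 8601 timestamp format are assumptions; the scripts' actual log layout is not shown here.

```bash
# log_line LOGFILE MESSAGE...
# Appends one timestamped line; ">>" never truncates an existing log.
log_line() {
  local logfile="$1"
  shift
  mkdir -p "$(dirname "$logfile")"
  printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" >> "$logfile"
}

# Example: log_line logs/image_convert.log "converted photo.jpg"
```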


Archival Best Practice Notes

  • Originals are never modified
  • Outputs are deterministic and repeatable
  • TIFF + XMP metadata is standards‑compliant
  • DOCX is used as an access/migration format

These workflows align with common practices in:

  • Municipal archives
  • Cultural heritage institutions
  • Digital preservation pipelines

Limitations

  • Image compression uses the converter's default settings (custom options can be added)
  • No recursive folder processing (by design)
  • No checksum generation (can be added)

Future Extensions (Optional)

Possible enhancements:

  • SHA‑256 checksums
  • Recursive directory support
  • CSV conversion reports
  • PDF/A generation
  • Unified image + document pipeline
  • Double‑clickable macOS launcher
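
For the checksum item, a possible starting point is sketched below. It is not part of the current scripts. The function name `sha256_of` is hypothetical; it uses `sha256sum` where available (typical on Linux and WSL) and falls back to `shasum -a 256` (typical on macOS).

```bash
# sha256_of FILE
# Prints the hex SHA-256 digest using whichever tool is installed.
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | cut -d ' ' -f 1
  elif command -v shasum >/dev/null 2>&1; then
    shasum -a 256 "$1" | cut -d ' ' -f 1
  else
    echo "no SHA-256 tool found" >&2
    return 1
  fi
}
```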
