PteRedactyl uses named entity recognition (NER) to identify and anonymise personal information in clinical free text.
Developed by the Data & AI Research (DAIR) Unit at University Hospital Southampton NHSFT for use in clinical research, PteRedactyl wraps around swappable NER models to redact or hide PII in strings or DataFrames.
Features
- Anonymisation of various entities such as names, locations, and phone numbers.
- Support for processing both strings and pandas DataFrames.
- Text highlighting for easy identification of anonymised sections.
- Hide in plain sight (HIPS) replacement.
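The difference between masking and hide in plain sight replacement can be illustrated with a minimal, self-contained sketch. It does not use PteRedactyl's own API: the regex patterns stand in for a real NER model, and every name here (`detect_entities`, `anonymise_sketch`, `PATTERNS`, `SURROGATES`) is hypothetical and chosen for illustration only.

```python
import random
import re
from dataclasses import dataclass


@dataclass
class Entity:
    start: int
    end: int
    label: str


# Assumption: toy regexes standing in for a swappable NER model.
PATTERNS = {
    "PERSON": re.compile(r"\b(?:Mr|Mrs|Ms|Dr)\.? [A-Z][a-z]+\b"),
    "PHONE_NUMBER": re.compile(r"\b0\d{10}\b"),
}

# Assumption: made-up surrogate values used for HIPS replacement.
SURROGATES = {
    "PERSON": ["Dr Smith", "Ms Jones"],
    "PHONE_NUMBER": ["07000000000", "07111111111"],
}


def detect_entities(text: str) -> list[Entity]:
    """Return detected entities sorted by their start offset."""
    found = [
        Entity(m.start(), m.end(), label)
        for label, pattern in PATTERNS.items()
        for m in pattern.finditer(text)
    ]
    return sorted(found, key=lambda e: e.start)


def anonymise_sketch(text: str, hips: bool = False, seed: int = 0) -> str:
    """Replace each entity with a label mask, or with a surrogate if hips=True."""
    rng = random.Random(seed)
    parts, cursor = [], 0
    for ent in detect_entities(text):
        if ent.start < cursor:
            continue  # skip entities overlapping one already replaced
        parts.append(text[cursor:ent.start])
        if hips:
            parts.append(rng.choice(SURROGATES[ent.label]))
        else:
            parts.append(f"<{ent.label}>")
        cursor = ent.end
    parts.append(text[cursor:])
    return "".join(parts)


note = "Dr Patel reviewed the patient; callback on 07912345678."
redacted = anonymise_sketch(note)
hidden = anonymise_sketch(note, hips=True)
print(redacted)
print(hidden)
```

With masking, the reader can see where information was removed; with HIPS, the output stays readable as ordinary text, so any entity the detector missed is harder to tell apart from the surrogates.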
Installation
Execute:
pip install pteredactyl
To install in development mode, we recommend using uv.
- Install uv from the Astral website, or install via PyPI with:
pip install uv
- Clone the PteRedactyl repo:
git clone https://github.com/SETT-Centre-Data-and-AI/pteredactyl.git
- Navigate to the repository (cd ...\pteredactyl\) and execute:
uv sync --group dev
Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
PteRedactyl was developed by Cai Davis and Michael George at University Hospital Southampton NHSFT's Data & AI Research Unit (DAIR), part of the Southampton Emerging Therapies and Technology (SETT) Centre.