SETT-Centre-Data-and-AI/PteRedactyl

PteRedactyl

PteRedactyl uses natural language processing, specifically named entity recognition (NER), to identify and anonymise personal information in clinical free text.

Developed by the Data & AI Research (DAIR) Unit at University Hospital Southampton NHSFT for use in clinical research, PteRedactyl wraps around swappable NER models to redact or hide PII in strings or DataFrames.

Features

  • Anonymisation of various entities such as names, locations, and phone numbers.
  • Support for processing both strings and pandas DataFrames.
  • Text highlighting for easy identification of anonymised sections.
  • Hide in plain sight (HIPS) replacement.
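
To make the difference between redaction and hide in plain sight concrete, here is a minimal, standard-library sketch. It does not use PteRedactyl's API: the `Span` type, the `find_span` helper (a stand-in for an NER model) and the `SURROGATES` table are all hypothetical names invented for illustration. It assumes entity spans have already been detected and only shows the two replacement strategies.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A detected entity: character offsets plus an entity label."""
    start: int
    end: int
    label: str


# Hypothetical surrogate values, for illustration only.
SURROGATES = {
    "PERSON": ["Alex Morgan", "Sam Taylor"],
    "LOCATION": ["Leeds", "Bristol"],
}


def find_span(text: str, value: str, label: str) -> Span:
    """Stand-in for an NER model: locate a known value in the text."""
    start = text.index(value)
    return Span(start, start + len(value), label)


def _replace(text, spans, make_replacement):
    # Work right to left so earlier offsets stay valid after each edit.
    for span in sorted(spans, key=lambda s: s.start, reverse=True):
        text = text[:span.start] + make_replacement(span) + text[span.end:]
    return text


def redact(text, spans):
    """Replace each entity with a visible <LABEL> placeholder."""
    return _replace(text, spans, lambda s: f"<{s.label}>")


def hide_in_plain_sight(text, spans, rng):
    """Replace each entity with a realistic surrogate of the same type."""
    def surrogate(span):
        choices = SURROGATES.get(span.label)
        # Fall back to a placeholder when no surrogate list exists.
        return rng.choice(choices) if choices else f"<{span.label}>"
    return _replace(text, spans, surrogate)


text = "Jane Smith was seen in Southampton."
spans = [
    find_span(text, "Jane Smith", "PERSON"),
    find_span(text, "Southampton", "LOCATION"),
]

redacted = redact(text, spans)  # "<PERSON> was seen in <LOCATION>."
hidden = hide_in_plain_sight(text, spans, random.Random(0))
print(redacted)
print(hidden)
```

Redaction makes it obvious where information was removed. HIPS replacement keeps the text reading naturally, so any entity the detector missed is harder to tell apart from a surrogate.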

⚙️ Installation

Via PyPI

Execute:

pip install pteredactyl

Via GitHub (uv)

To install in development mode, we recommend using uv.

  1. Install uv from the Astral website, or install via PyPI with pip install uv

  2. Clone the PteRedactyl repo:

git clone https://github.com/SETT-Centre-Data-and-AI/pteredactyl.git

  3. Navigate to the repository (cd ...\pteredactyl\) and execute:

uv sync --group dev

📚 Guides

🤝 Contributing

Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.

⚖️ License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. CC BY-NC 4.0

🧑‍🔬 Authors

PteRedactyl was developed by Cai Davis and Michael George at University Hospital Southampton NHSFT's Data & AI Research Unit (DAIR), part of the Southampton Emerging Therapies and Technology (SETT) Centre.

About

A Python module for redacting personally identifiable information (PII) in clinical free text. It builds on Presidio and is designed to be easy to use for those getting started with PII redaction.
