PII detection and redaction validator for Guardrails AI, powered by Tonic Textual.
Uses transformer-based NER supporting 46+ entity types across 50+ languages.
```bash
pip install guardrails-tonic-textual
```

Or via Guardrails Hub:

```bash
guardrails hub install hub://tonic/textual_pii
```

Set the `TONIC_TEXTUAL_API_KEY` environment variable with your API key. See Creating and revoking Textual API keys for setup instructions.
```bash
export TONIC_TEXTUAL_API_KEY="your-key"
```

```python
from guardrails import Guard
from guardrails.hub import TextualPII
# or: from validator import TextualPII

guard = Guard().use(TextualPII(on_fail="fix"))
result = guard.validate("My SSN is 123-45-6789, please help me file my taxes")
print(result.validated_output)
# "My SSN is [US_SSN_...], please help me file my taxes"
```

This works in both directions: scrub user input before it reaches the LLM, or scrub LLM output before it reaches the user.

Raise on PII instead of redacting:
```python
guard = Guard().use(TextualPII(on_fail="exception"))
guard.validate("My SSN is 123-45-6789")  # raises ValidationError
```

Filter to specific entity types:
```python
guard = Guard().use(
    TextualPII(entities=["US_SSN", "CREDIT_CARD", "EMAIL_ADDRESS"], on_fail="fix")
)
```

Wrap an LLM call:
```python
result = guard(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me about John Smith"}],
)
print(result.validated_output)  # PII auto-redacted
```

Control how detected PII is replaced in the `fix_value` via `generator_default` and per-entity `generator_config`. See the Tonic Textual entity type handling docs for details.
```python
guard = Guard().use(
    TextualPII(
        on_fail="fix",
        generator_default="Synthesis",  # Replace with realistic fakes
        generator_config={
            "NAME_GIVEN": "Synthesis",     # Fake names
            "EMAIL_ADDRESS": "Redaction",  # [EMAIL_ADDRESS] labels
            "PHONE_NUMBER": "Off",         # Leave unchanged
        },
    ),
)
```

| Mode | Behavior |
|---|---|
| `Off` | PII is detected but left unchanged in the fix value |
| `Redaction` | PII is replaced with entity type labels (e.g., `[NAME_GIVEN]`) |
| `Synthesis` | PII is replaced with realistic synthetic values |
| `GroupingSynthesis` | Groups related entities and generates new names via LLM |
| `ReplacementSynthesis` | Redacts first, then uses LLM to generate contextual replacements |

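
To make the relationship between `generator_default` and `generator_config` concrete, here is a minimal sketch of how a per-entity override could take precedence over the default. `resolve_mode` is a hypothetical helper written for illustration, not part of the validator, and the `"Redaction"` fallback used when neither setting is given is an assumption.

```python
from typing import Dict, Optional


def resolve_mode(
    entity: str,
    generator_default: Optional[str] = None,
    generator_config: Optional[Dict[str, str]] = None,
    fallback: str = "Redaction",  # assumption: the real default is not stated here
) -> str:
    """Pick the handling mode for one entity type (hypothetical helper)."""
    overrides = generator_config or {}
    # A per-entity entry wins; otherwise use the default, then the fallback.
    return overrides.get(entity, generator_default or fallback)


config = {
    "NAME_GIVEN": "Synthesis",
    "EMAIL_ADDRESS": "Redaction",
    "PHONE_NUMBER": "Off",
}
print(resolve_mode("EMAIL_ADDRESS", "Synthesis", config))  # Redaction
print(resolve_mode("US_SSN", "Synthesis", config))         # Synthesis
```

Entity types without an entry in `generator_config`, such as `US_SSN` here, pick up `generator_default`.
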
Use regex patterns to force-tag or exclude specific values per entity type:
```python
guard = Guard().use(
    TextualPII(
        on_fail="fix",
        # Force-tag text matching these regexes as the given entity type
        label_allow_lists={
            "ORGANIZATION": ["Acme Corp", "Initech"],
            "PHONE_NUMBER": [r"\+1\s?\(\d{3}\)\s?\d{3}-\d{4}"],
        },
        # Exclude values matching these regexes from detection
        label_block_lists={
            "NAME_FAMILY": [r"^Smith$"],
        },
    ),
)
```

Include custom entity types defined in the Tonic Textual UI:
```python
guard = Guard().use(
    TextualPII(
        on_fail="fix",
        custom_entities=["CUSTOM_INTERNAL_ID", "CUSTOM_ACCOUNT_NUMBER"],
    ),
)
```

Use `random_seed` for deterministic synthesis/tokenization across calls:
```python
guard = Guard().use(
    TextualPII(
        on_fail="fix",
        generator_default="Synthesis",
        random_seed=42,
    ),
)
```

For self-hosted Textual instances, provide your deployment URL:
```python
guard = Guard().use(
    TextualPII(
        base_url="https://textual.your-company.com",
        on_fail="fix",
    ),
)
```

Unlike tool-based integrations (where an LLM actively calls a redaction tool), this validator operates as a passive PII firewall. It can be applied in two directions:

Input filtering (scrub user messages before they reach the LLM):

- The user submits a message that may contain PII
- `guard.validate(user_input)` scrubs PII from the message
- The scrubbed text is sent to the LLM, so the original PII never reaches the model

Output filtering (scrub LLM responses before they reach the user):
- The LLM generates a response normally
- The Guard intercepts the response before it reaches the user
- Tonic Textual scans for PII and returns entity positions
- The Guard applies the configured `on_fail` action
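
The last two steps can be sketched in plain Python. This is an illustration under stated assumptions, not the validator's actual code: the `(start, end, label)` span layout, the plain `[LABEL]` placeholder, and the helpers `redact_spans` and `apply_on_fail` are all hypothetical, and `ValidationError` is a local stand-in class.

```python
import logging
from typing import Callable, List, Optional, Tuple

# (start, end, label) with an exclusive end offset; this layout is an assumption.
Span = Tuple[int, int, str]


class ValidationError(Exception):
    """Local stand-in for the error raised when validation fails."""


def redact_spans(text: str, spans: List[Span]) -> str:
    """Build a fix value by replacing each detected span with [LABEL]."""
    # Replace from right to left so earlier offsets stay valid.
    for start, end, label in sorted(spans, key=lambda s: s[0], reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return text


def apply_on_fail(
    on_fail: str,
    original: str,
    fix_value: str,
    reask: Optional[Callable[[str], str]] = None,
) -> str:
    """Dispatch on the configured strategy once PII has been detected."""
    if on_fail == "fix":
        return fix_value
    if on_fail == "exception":
        raise ValidationError("PII detected")
    if on_fail == "noop":
        logging.warning("PII detected; passing text through unchanged")
        return original
    if on_fail == "reask":
        if reask is None:
            raise ValueError("the reask strategy needs a reask callable")
        return reask(original)
    raise ValueError(f"unknown on_fail strategy: {on_fail!r}")


text = "My SSN is 123-45-6789"
fix_value = redact_spans(text, [(10, 21, "US_SSN")])
print(apply_on_fail("fix", text, fix_value))   # My SSN is [US_SSN]
print(apply_on_fail("noop", text, fix_value))  # My SSN is 123-45-6789
```

Replacing spans from the end of the string backwards avoids recomputing offsets after each substitution.
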

The `on_fail` strategies control what happens when PII is detected:

- `"fix"`: Replaces the text with the redacted version (using `fix_value`)
- `"exception"`: Raises a `ValidationError`, blocking the text entirely
- `"noop"`: Logs the PII detection but passes the text through unchanged
- `"reask"`: Re-prompts the LLM, asking it to remove the PII

| Parameter | Type | Default | Description |
|---|---|---|---|
| `entities` | `list[str] \| None` | `None` | PII types to detect (all if `None`) |
| `api_key` | `str \| None` | `None` | API key (falls back to the `TONIC_TEXTUAL_API_KEY` environment variable) |
| `base_url` | `str \| None` | `None` | Self-hosted deployment URL |
| `generator_default` | `str \| None` | `None` | Default handling mode (`Off`, `Redaction`, `Synthesis`, `GroupingSynthesis`, `ReplacementSynthesis`) |
| `generator_config` | `dict[str, str] \| None` | `None` | Per-entity mode overrides |
| `label_allow_lists` | `dict[str, list[str]] \| None` | `None` | Per-entity regex patterns to force-tag as that entity type |
| `label_block_lists` | `dict[str, list[str]] \| None` | `None` | Per-entity regex patterns to exclude from detection |
| `custom_entities` | `list[str] \| None` | `None` | Custom entity types to include (defined in the Textual UI) |
| `random_seed` | `int \| None` | `None` | Seed for reproducible synthesis/tokenization |
| `on_fail` | `str \| callable \| None` | `None` | Failure action (`"fix"`, `"exception"`, `"noop"`, `"reask"`) |
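
Since several of these parameters take free-form strings (mode names, regex patterns), it can help to sanity-check a configuration locally before constructing the validator. The sketch below is a hypothetical pre-flight helper, not something the package provides. It only checks mode names against the five modes listed above and confirms that each pattern compiles with Python's `re` module, whose syntax may differ from what Textual accepts.

```python
import re
from typing import Dict, List, Optional

# The five handling modes named in this README.
KNOWN_MODES = {
    "Off",
    "Redaction",
    "Synthesis",
    "GroupingSynthesis",
    "ReplacementSynthesis",
}


def check_config(
    generator_default: Optional[str] = None,
    generator_config: Optional[Dict[str, str]] = None,
    label_allow_lists: Optional[Dict[str, List[str]]] = None,
    label_block_lists: Optional[Dict[str, List[str]]] = None,
) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    if generator_default is not None and generator_default not in KNOWN_MODES:
        problems.append(f"generator_default: unknown mode {generator_default!r}")
    for entity, mode in (generator_config or {}).items():
        if mode not in KNOWN_MODES:
            problems.append(f"generator_config[{entity}]: unknown mode {mode!r}")
    named_lists = (
        ("label_allow_lists", label_allow_lists),
        ("label_block_lists", label_block_lists),
    )
    for name, lists in named_lists:
        for entity, patterns in (lists or {}).items():
            for pattern in patterns:
                try:
                    re.compile(pattern)
                except re.error as exc:
                    problems.append(f"{name}[{entity}]: invalid regex {pattern!r} ({exc})")
    return problems


print(check_config(
    generator_default="Synthesis",
    generator_config={"PHONE_NUMBER": "Off"},
    label_block_lists={"NAME_FAMILY": [r"^Smith$"]},
))  # []
```

A non-empty result lists each offending setting, so typos such as a misspelled mode name surface before any API call is made.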