A regex backstop that strips Python source from language-model text output.
When a language model is told not to reveal Python implementation in its replies, it usually complies — until someone coaxes it into pasting code anyway. This is the last-line filter that catches those cases. Run it over the model's final text and it removes anything that looks like Python source, then tells you whether it stripped anything (so you can audit-log the event).
It catches three things:
| Pass | What it removes |
|---|---|
| Fenced code | ```python / ```py blocks, and bare ``` blocks whose body has ≥2 Python syntax signals |
| Unfenced declarations | a def/class/async def declaration at any indent, plus its indented body |
| Smuggling | exec(base64.b64decode(...)) / eval(...) payloads hidden in a generic fence |
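To make the three passes concrete, here is a minimal sketch of how such a filter could be structured. It is not the library's actual implementation: the name `strip_python_source_sketch`, the placeholder text, the particular list of syntax signals, and the reading of "generic fence" as an untagged fence are all assumptions made for illustration.

```python
import re
from typing import List, Tuple

PLACEHOLDER = "[python source removed]"  # assumption: the real filter may emit something else
FENCE = "`" * 3  # a Markdown code-fence marker

# A fenced block: opening fence with optional language tag, a body, a closing fence.
FENCE_RE = re.compile(FENCE + r"([A-Za-z0-9_+-]*)[ \t]*\n(.*?)" + FENCE, re.DOTALL)

# Illustrative "Python syntax signals" for untagged fences (not the library's list).
SIGNALS = [
    re.compile(r"^[ \t]*(?:async[ \t]+def|def|class)[ \t]+\w+", re.MULTILINE),
    re.compile(r"^[ \t]*(?:import[ \t]+\w+|from[ \t]+[\w.]+[ \t]+import\b)", re.MULTILINE),
    re.compile(r"^[ \t]*(?:if|elif|for|while|with|try|except)\b[^\n]*:[ \t]*$", re.MULTILINE),
    re.compile(r"^[ \t]*return\b", re.MULTILINE),
    re.compile(r"\bself\.\w+"),
]

# exec(...) / eval(...) calls, e.g. exec(base64.b64decode(...)).
SMUGGLE_RE = re.compile(r"\b(?:exec|eval)[ \t]*\(")

# A single-line def / class / async def header at any indent.
DECL_RE = re.compile(
    r"^([ \t]*)(?:async[ \t]+def|def|class)[ \t]+[A-Za-z_]\w*[ \t]*"
    r"(?:\([^\n]*\))?[ \t]*(?:->[^:\n]+)?:"
)


def _strip_fences(text: str) -> str:
    def replace(match: re.Match) -> str:
        lang, body = match.group(1).lower(), match.group(2)
        if lang in ("python", "py"):
            return PLACEHOLDER
        if lang == "":  # untagged fence: needs evidence before stripping
            hits = sum(1 for signal in SIGNALS if signal.search(body))
            if hits >= 2 or SMUGGLE_RE.search(body):
                return PLACEHOLDER
        return match.group(0)  # SQL, JSON, etc. pass through unchanged

    return FENCE_RE.sub(replace, text)


def _strip_unfenced(text: str) -> str:
    lines = text.split("\n")
    out: List[str] = []
    i = 0
    while i < len(lines):
        match = DECL_RE.match(lines[i])
        if match is None:
            out.append(lines[i])
            i += 1
            continue
        indent = len(match.group(1))
        end = j = i + 1
        # The body is every following line indented deeper than the header;
        # blank lines are only consumed when more body follows them.
        while j < len(lines):
            line = lines[j]
            if line.strip():
                if len(line) - len(line.lstrip()) <= indent:
                    break
                end = j + 1
            j += 1
        out.append(PLACEHOLDER)
        i = end
    return "\n".join(out)


def strip_python_source_sketch(text: str) -> Tuple[str, bool]:
    if not text or not text.strip():
        return text, False  # mirrors the documented empty-input contract
    result = _strip_unfenced(_strip_fences(text))
    return result, result != text


reply = "Sure:\ndef f(x):\n    return x\n\nDone."
print(strip_python_source_sketch(reply))
# ('Sure:\n[python source removed]\n\nDone.', True)
```

The sketch keeps the patterns in module-level constants so that tuning means editing data rather than control flow. It does not handle multi-line signatures or nested fences.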
Source-secrecy is normally enforced only by a system prompt — a single instruction the model can be talked out of. This library makes the policy structural: a deterministic, regex-driven pass that runs after generation and doesn't care how the model was persuaded. It is the third layer of a belt-and-suspenders defense (secrecy instruction → a read tool that hands the model AST summaries instead of raw source → this output filter), and it is the only layer that fails closed regardless of model behavior. It is ~150 lines, zero dependencies, and the heuristics are tuned to strip Python source while leaving prose, SQL, and JSON untouched — the hard part is that selectivity, and it is encoded here as data. See docs/MOAT.md.
    pip install -e .

Zero runtime dependencies: pure standard library (re only), Python 3.9+.
    from python_source_leak_filter import strip_python_source

    filtered, was_stripped = strip_python_source(model_output)
    if was_stripped:
        audit_log("model output contained Python source; redacted")
    send_to_user(filtered)

strip_python_source(text) returns (filtered_text, was_stripped). Empty or whitespace-only input returns (text, False); it never raises.
strip_python_source(text) -> (str, bool): the single public entry point.
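Because the entry point is a plain function returning a `(str, bool)` pair, it is easy to wrap. The following is a sketch of one way to wire the audit-log step using the standard `logging` module. The names `guard_reply`, `request_id`, and the logger name are hypothetical, and the filter is passed in as a parameter so the example runs without the package installed.

```python
import logging
from typing import Callable, Tuple

# Hypothetical logger name; route it to wherever audit events belong in your stack.
audit_logger = logging.getLogger("leak_filter.audit")

# Any callable with the documented contract: text -> (filtered_text, was_stripped).
FilterFn = Callable[[str], Tuple[str, bool]]


def guard_reply(model_output: str, filter_fn: FilterFn, request_id: str) -> str:
    """Run the output filter and record an audit event when it stripped something."""
    filtered, was_stripped = filter_fn(model_output)
    if was_stripped:
        # Log sizes rather than content so the audit trail does not itself leak source.
        audit_logger.warning(
            "python source redacted from model output "
            "(request_id=%s, chars_in=%d, chars_out=%d)",
            request_id,
            len(model_output),
            len(filtered),
        )
    return filtered


# With the package installed, usage would look like:
#   from python_source_leak_filter import strip_python_source
#   safe_text = guard_reply(model_output, strip_python_source, request_id="req-123")
```

Injecting the filter also makes the wrapper simple to unit-test with a stub.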
Full reference: docs/API.md. Tuning it for your stack: docs/FORKING.md.
Two vectors are deliberately not defended here:
- a base64 string sitting in plain prose with no code fence, and
- a reversed/obfuscated source string in plain prose.
Both require the model to spontaneously re-encode source it was never served. The intended upstream defenses (a secrecy system prompt + a read tool that returns AST summaries rather than raw source) are what close them. This filter is the backstop, not the whole wall.
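A small demonstration of why patterns keyed on fences and declarations do not see these two vectors. The detector `looks_like_python` and the sample source string are made up for illustration; they are far simpler than the filter's real heuristics.

```python
import base64
import re

FENCE = "`" * 3  # a Markdown code-fence marker
DECL_RE = re.compile(r"^[ \t]*(?:async[ \t]+def|def|class)[ \t]+\w+", re.MULTILINE)


def looks_like_python(text: str) -> bool:
    """Toy detector: fires on a code fence or a def/class header at line start."""
    return FENCE in text or DECL_RE.search(text) is not None


source = "def secret(): return 42"  # made-up stand-in for protected source

encoded = base64.b64encode(source.encode("utf-8")).decode("ascii")
as_base64 = f"Here is a token you may find useful: {encoded}"
as_reversed = f"Read this backwards: {source[::-1]}"

print(looks_like_python(source))       # True
print(looks_like_python(as_base64))    # False
print(looks_like_python(as_reversed))  # False

# The payload is still fully recoverable by whoever receives the prose.
print(base64.b64decode(encoded).decode("utf-8") == source)  # True
```

Neither encoded form contains a fence or a declaration keyword at the start of a line, which is why these vectors are left to the upstream layers.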
MIT — see LICENSE.
Powerweave Skunkworks is the AI R&D division of Powerweave Software Services — a rapid-innovation lab that turns real-world product feedback into working, reusable, open-source building blocks. Working in parallel to the main engineering backlog, a lean, cross-functional team of product and technology specialists (UX, data, software engineering, and AI) fast-tracks high-priority ideas into validated modules ready for full-scale build-out.
python-source-leak-filter is one such building block — a de-domained, MIT-licensed, dependency-light component extracted from Powerweave's internal R&D and engineered to be forked into any SaaS or enterprise product.
- 🧪 Powerweave Skunkworks on GitHub — https://github.com/skunkworks-powerweave
- 🌐 Powerweave — https://powerweave.com
- 💼 Powerweave on LinkedIn — https://www.linkedin.com/company/powerweave
Powerweave Software Services Pvt. Ltd. is a digital-transformation company founded in 2001 and headquartered in Mumbai, India. With 25+ years of experience, 1,700+ professionals, and 350+ global customers, Powerweave builds platforms, processes, and teams across enterprise eCommerce, AI-powered procurement, Microsoft Dynamics ERP, business services, and sustainability — with a strong focus on cutting-edge AI automation that streamlines workflows, reduces manual errors, and accelerates decision-making. Powerweave is ISO 27001:2013 certified.
Explore Powerweave
- 🌐 Website — https://powerweave.com
- 💼 LinkedIn — https://www.linkedin.com/company/powerweave
- 𝕏 Twitter / X — https://twitter.com/powerweave
- ▶️ YouTube — https://www.youtube.com/channel/UCE1t_rg38z4n5BAg29PDZFA
- 📘 Facebook — https://www.facebook.com/PowerweaveSoftwareSolutions/
- 🛒 Enterprise eCommerce — https://www.powerweave.com/solutions/enterprise-ecommerce/
- 📦 AI-Powered Procurement — https://www.powerweave.com/solutions/procurement/
- 🧮 Microsoft Dynamics ERP — https://www.powerweave.com/solutions/microsoft-dynamics-erp/
- 🌱 Snowkap — Sustainability — https://www.snowkap.com/
- 🎨 Powerweave Studio — https://www.powerweavestudio.com/
- 🧑‍💼 About Us & Leadership — https://www.powerweave.com/about-us/
- 🚀 Careers — https://www.powerweave.com/careers/
Keywords: llm · filter · output · redaction · secrecy · security · regex · Powerweave · Powerweave Skunkworks · AI R&D · open source · MIT · Python · forkable.