PatchSteg — Agent Instructions

3-person hackathon team. Each teammate runs Claude Code. This file is the single source of truth. Read it fully before doing anything.

Environment Setup (WSL)

The venv lives outside the repo at ~/patchseg-venv, on the WSL-native Linux filesystem, so package I/O does not go through the slower Windows-mounted drive.

# First time only — create venv and install deps:
python3 -m venv ~/patchseg-venv
~/patchseg-venv/bin/pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu
~/patchseg-venv/bin/pip install -r requirements.txt

# Every session — activate before running anything:
source ~/patchseg-venv/bin/activate

# Run experiments:
python experiments/cdf_capacity_test.py

Never pip install to system Python — always activate the venv first or use ~/patchseg-venv/bin/python directly.

Running the Demo

source ~/patchseg-venv/bin/activate
python demo/app.py
# Gradio UI at http://localhost:7860

The demo UI includes an "Attack method" radio selector:

  • Original PatchSteg (±ε) — classic perturbation; detectable at high ε (AUC ~0.93)
  • CDF-PatchSteg (SOTA, undetectable) — distribution-preserving; target AUC ≈ 0.5

A static HTML summary for offline presentation is at demo/demo_summary.html.

Running SOTA Attacks

CDF-PatchSteg (Phase 1 — undetectable steganography)

source ~/patchseg-venv/bin/activate
python experiments/cdf_capacity_test.py
# Outputs: accuracy/PSNR/KS-test across K=5,10,20,50 carriers
# Figures: paper/figures/cdf_capacity_curve.png, cdf_detectability.png, cdf_distribution.png

What it does: Replaces the ±ε perturbation with inverse-CDF resampling under a Gaussian model. Each carrier value is replaced by a fresh sample from the upper half (bit=1) or lower half (bit=0) of N(μ_channel, σ). If the message bits are uniformly random and the carrier values really follow N(μ_channel, σ), the marginal distribution of the latent is unchanged, so a detector that relies on those marginal statistics cannot do better than chance (theoretical AUC = 0.5).
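The resampling step described above can be sketched with the standard library alone. This is a minimal illustration, not the code in core/cdf_steganography.py: the names embed_bit, extract_bit, and ks_statistic are hypothetical, the carrier values are synthetic draws from N(0, 1) rather than VAE latents, and the message bits are assumed to be uniformly random.

```python
import random
from statistics import NormalDist


def embed_bit(bit, mu, sigma, rng):
    """Draw a fresh value from the upper (bit=1) or lower (bit=0) half of N(mu, sigma)."""
    u = rng.random() * 0.5                 # uniform in [0, 0.5)
    if bit == 1:
        u += 0.5                           # shift into the upper half
    u = min(max(u, 1e-12), 1.0 - 1e-12)    # inv_cdf needs 0 < u < 1
    return NormalDist(mu, sigma).inv_cdf(u)


def extract_bit(x, mu):
    """Idealized decoder: values at or above the mean carry a 1."""
    return 1 if x >= mu else 0


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        gap = max(gap, abs(i / len(a) - j / len(b)))
    return gap


rng = random.Random(42)
mu, sigma = 0.0, 1.0
bits = [rng.randint(0, 1) for _ in range(2000)]
cover = [rng.gauss(mu, sigma) for _ in bits]              # untouched carriers
stego = [embed_bit(b, mu, sigma, rng) for b in bits]      # resampled carriers
decoded = [extract_bit(x, mu) for x in stego]

print("bit errors:", sum(b != d for b, d in zip(bits, decoded)), flush=True)
print("KS statistic (cover vs stego):", round(ks_statistic(cover, stego), 4), flush=True)
```

Thresholding at μ is the idealized case. In the real pipeline the stego latent is decoded to an image and re-encoded before extraction, so bit errors are possible there even though this sketch produces none.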

Key novelty: It applies post hoc to existing images, unlike Gaussian Shading, PRC, or PSyDUCK, which all require control of the noise seed at generation time.

PCA-guided directions (Phase 2)

python experiments/pca_test.py
# Figures: paper/figures/pca_components.png, pca_accuracy_comparison.png, pca_detectability.png
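
As a rough picture of what a PCA-guided direction is, the sketch below recovers the dominant direction of a set of 4-dimensional vectors by power iteration on their covariance matrix. It rests on assumptions: that PCA is fit over per-position 4-channel latent vectors (consistent with n_components=4, but not confirmed here), that synthetic data with a planted direction can stand in for VAE latents, and that the helper names are hypothetical rather than taken from core/pca_directions.py.

```python
import math
import random


def mean_center(vectors):
    """Subtract the per-dimension mean from every vector."""
    dim = len(vectors[0])
    means = [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]
    return [[v[k] - means[k] for k in range(dim)] for v in vectors]


def covariance(centered):
    """Sample covariance matrix of already-centered vectors."""
    n, dim = len(centered), len(centered[0])
    return [[sum(v[i] * v[j] for v in centered) / (n - 1) for j in range(dim)]
            for i in range(dim)]


def top_component(cov, iters=200):
    """First principal direction via power iteration (returns a unit vector)."""
    dim = len(cov)
    vec = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iters):
        nxt = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return vec


rng = random.Random(42)
true_dir = [0.8, 0.6, 0.0, 0.0]            # planted unit direction (synthetic)
samples = []
for _ in range(500):
    t = rng.gauss(0.0, 3.0)                # large variance along true_dir
    samples.append([t * d + rng.gauss(0.0, 0.1) for d in true_dir])

direction = top_component(covariance(mean_center(samples)))
alignment = abs(sum(a * b for a, b in zip(direction, true_dir)))
print("alignment with planted direction:", round(alignment, 3), flush=True)
```

A perturbation of size ε along such a direction follows the axis of greatest natural variation in the data, which is the intuition behind replacing a fixed direction with a data-driven one.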

Latent steganalysis detector (Phase 3 — defense side)

python experiments/latent_detector_test.py
# Figures: paper/figures/detector_roc_curves.png, detector_cross_method.png, detector_feature_importance.png
# Key finding: Detects original PatchSteg (AUC>0.90) but fails on CDF-PatchSteg (AUC≈0.5)
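
For reading the AUC numbers quoted in this file: AUC is the probability that the detector gives a randomly chosen stego image a higher score than a randomly chosen cover image, so 0.5 means chance and 1.0 means perfect separation. The sketch below computes it by direct pair counting. The function name pairwise_auc and the scores are made up for illustration, and this is not necessarily how experiments/latent_detector_test.py computes it.

```python
def pairwise_auc(cover_scores, stego_scores):
    """Fraction of (stego, cover) pairs ranked correctly; ties count as half."""
    wins = 0.0
    for s in stego_scores:
        for c in cover_scores:
            if s > c:
                wins += 1.0
            elif s == c:
                wins += 0.5
    return wins / (len(stego_scores) * len(cover_scores))


# Hypothetical detector scores, for illustration only.
separable = pairwise_auc([0.1, 0.2], [0.8, 0.9])      # stego always scores higher
overlapping = pairwise_auc([0.1, 0.4], [0.3, 0.9])    # one of four pairs is misranked
uninformative = pairwise_auc([0.5, 0.5], [0.5, 0.5])  # detector cannot tell them apart

print(separable, overlapping, uninformative, flush=True)
```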

Git Workflow (CRITICAL)

git pull --rebase origin main   # ALWAYS before starting work
# ... do work ...
git add <specific files>
git commit -m "descriptive message"
git push origin main            # ALWAYS after finishing a unit of work

  • Never force push. If a push is rejected, run git pull --rebase origin main and then push again.
  • Commit often in small logical units — reduces merge conflicts.
  • Avoid editing the same file in parallel. Check the ownership table below.

File Ownership (Conflict Avoidance)

When two agents edit the same file simultaneously, merges break. Use this table to coordinate. If you need to edit a file owned by someone else, communicate first or make a focused, minimal edit.

| Area | Files | Notes |
|---|---|---|
| Core library | core/*.py | Shared: keep changes minimal and backward-compatible |
| Phase 1 (CDF) | core/cdf_steganography.py, experiments/cdf_capacity_test.py | One owner at a time |
| Phase 2 (PCA) | core/pca_directions.py, experiments/pca_test.py | One owner at a time |
| Phase 3 (Detector) | core/detector.py, experiments/latent_detector_test.py | One owner at a time |
| Phase 4 (Benchmarks) | experiments/baseline_comparison.py, core/metrics.py | Not started |
| Paper | paper/main.tex | HIGH CONFLICT RISK: only one person edits at a time |
| Figures | paper/figures/*.png | Generated by experiments; commit after running |
| Demo | demo/app.py | One owner at a time |
| Experiment log | EXPERIMENT_LOG.md | Append-only: add your results at the bottom |

Project Layout

patchseg/
├── CLAUDE.md              ← YOU ARE HERE (agent instructions)
├── EXPERIMENT_LOG.md      ← Results log (append after each experiment)
├── README.md              ← Public-facing project description
├── requirements.txt       ← Python deps
│
├── core/                  ← Library code (importable modules)
│   ├── __init__.py
│   ├── vae.py             ← StegoVAE: SD VAE encode/decode wrapper
│   ├── steganography.py   ← PatchSteg: original ±epsilon encoding
│   ├── cdf_steganography.py ← CDFPatchSteg: distribution-preserving (Phase 1)
│   ├── pca_directions.py  ← PCADirections + PCAPatchSteg (Phase 2)
│   ├── detector.py        ← LatentStegDetector: 46-feature steganalysis (Phase 3)
│   ├── analysis.py        ← Mechanistic analysis helpers
│   └── metrics.py         ← PSNR, SSIM, bit accuracy
│
├── experiments/           ← Runnable scripts (each standalone)
│   ├── v1: roundtrip_test, carrier_stability, capacity_test,
│   │   robustness_test, detectability_test, mechanistic_analysis,
│   │   extended_experiments, generate_figures, run_all, run_remaining
│   ├── v2: v2_content_science, v2_detection_strength, v2_multimodel,
│   │   v2_robustness_deployment, v2_run_all, v2_serious_dataset
│   └── new: cdf_capacity_test, pca_test, latent_detector_test
│
├── paper/                 ← LaTeX paper
│   ├── main.tex           ← Paper source (section structure in appendix below)
│   ├── build.sh           ← Build PDF: `cd paper && bash build.sh`
│   └── figures/           ← All PNGs (generated by experiment scripts)
│
├── demo/
│   └── app.py             ← Gradio interactive demo
│
└── references/            ← Research papers (PDFs)
    ├── INDEX.md           ← Paper summaries + relevance mapping
    └── *.pdf              ← 7 key papers (see INDEX.md)

Coding Conventions

  • Seeds: seed=42 unless explicitly testing seed variation
  • Printing: Always pass flush=True to print()
  • Matplotlib: Always call matplotlib.use('Agg') before importing pyplot
  • Figures: Save to paper/figures/ as PNG, dpi=150, bbox_inches='tight'
  • Image size: IMG_SIZE = 256 (32x32 latent grid) unless otherwise noted
  • Path setup (every experiment script):
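    import sys, os
    from pathlib import Path
    # Only affects child processes; the reconfigure call below handles this process.
    os.environ['PYTHONUNBUFFERED'] = '1'
    sys.stdout.reconfigure(line_buffering=True)
    # Put the repo root on sys.path so the core package imports from experiments/.
    sys.path.insert(0, str(Path(__file__).resolve().parent.parent))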
  • New dependencies: Add to requirements.txt
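
The conventions above can be collected into a small preamble. This is a sketch rather than an existing module in the repo: figure_path, save_figure, and the constant names other than IMG_SIZE are hypothetical, and the try/except around the matplotlib import is only there so the snippet still runs on a machine without matplotlib.

```python
import random
from pathlib import Path

try:
    import matplotlib
    matplotlib.use("Agg")                  # select the backend before pyplot is imported
    import matplotlib.pyplot as plt
except ImportError:                        # sketch-only fallback, not a repo convention
    plt = None

SEED = 42
IMG_SIZE = 256
LATENT_GRID = IMG_SIZE // 8                # SD VAE downsamples 8x per side: 256 -> 32
SAVEFIG_KWARGS = {"dpi": 150, "bbox_inches": "tight"}


def figure_path(name, figures_dir="paper/figures"):
    """Where a figure called `name` should be written, as a PNG."""
    return Path(figures_dir) / f"{name}.png"


def save_figure(fig, name, figures_dir="paper/figures"):
    """Save a matplotlib figure with the project's dpi and bounding-box settings."""
    path = figure_path(name, figures_dir)
    path.parent.mkdir(parents=True, exist_ok=True)
    fig.savefig(path, **SAVEFIG_KWARGS)
    return path


random.seed(SEED)
print(f"seed={SEED} latent grid={LATENT_GRID}x{LATENT_GRID}", flush=True)
```

An experiment script would call save_figure(fig, "some_plot_name") after plotting, which keeps the dpi and bbox settings in one place.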

Key APIs (Quick Reference)

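# Imports (module paths follow the Project Layout above)
from core.vae import StegoVAE
from core.steganography import PatchSteg
from core.cdf_steganography import CDFPatchSteg
from core.pca_directions import PCADirections, PCAPatchSteg
from core.detector import LatentStegDetector
from core.metrics import compute_psnr, compute_ssim_pil, bit_accuracy

# VAE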
vae = StegoVAE(device='cpu', image_size=256)
latent = vae.encode(pil_image)          # -> [1, 4, 32, 32]
pil_image = vae.decode(latent)          # -> PIL Image
latent, recon = vae.round_trip(image)   # encode then decode

# Original PatchSteg (±epsilon)
steg = PatchSteg(seed=42, epsilon=5.0)
carriers, smap = steg.select_carriers_by_stability(vae, img, n_carriers=20)
latent_mod = steg.encode_message(latent, carriers, bits)
decoded_bits, confs = steg.decode_message(latent_clean, latent_received, carriers)

# CDF-PatchSteg (distribution-preserving) — Phase 1
cdf = CDFPatchSteg(seed=42, sigma=1.0)
carriers, smap = cdf.select_carriers_by_stability(vae, img, n_carriers=20)
latent_mod = cdf.encode_message(latent, carriers, bits)
decoded_bits, confs = cdf.decode_message(vae, stego_pil_image, carriers)  # no clean latent needed!

# PCA directions — Phase 2
pca_dir = PCADirections(n_components=4)
pca_dir.fit_global(vae, image_list)
pca_steg = PCAPatchSteg(pca_dir, seed=42, epsilon=5.0, component=0)
# Same API as PatchSteg after that

# Detector — Phase 3
det = LatentStegDetector()
feats = det.extract_features(vae, pil_image)  # -> 46-dim numpy array
det.fit(X, y)                                  # sklearn pipeline
det.predict_proba(X)

# Metrics
compute_psnr(img1, img2)      # PIL or tensor
compute_ssim_pil(img1, img2)  # PIL only
bit_accuracy(sent, received)  # lists -> float %
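
To make the difference between the two decode signatures concrete, here is a toy version of ±ε encoding on 4-channel patches, decoded by comparing against the clean latent. It is an assumption-heavy sketch, not the logic in core/steganography.py: the seeded random unit direction, the projection-based decoder, the additive noise standing in for the VAE round trip, and every function name are hypothetical.

```python
import math
import random


def unit_direction(seed, dim=4):
    """Seeded random unit vector used as the perturbation direction (assumed)."""
    rng = random.Random(seed)
    vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def encode_patch(patch, bit, direction, epsilon):
    """Shift the patch by +epsilon (bit=1) or -epsilon (bit=0) along the direction."""
    sign = 1.0 if bit == 1 else -1.0
    return [p + sign * epsilon * d for p, d in zip(patch, direction)]


def decode_patch(clean, received, direction):
    """Project the difference onto the direction; sign gives the bit, size the confidence."""
    projection = sum((r - c) * d for c, r, d in zip(clean, received, direction))
    return (1 if projection > 0 else 0), abs(projection)


def bit_accuracy_percent(sent, received):
    """Percentage of positions where the decoded bit matches the sent bit."""
    matches = sum(s == r for s, r in zip(sent, received))
    return 100.0 * matches / len(sent)


rng = random.Random(42)
direction = unit_direction(seed=42)
bits = [1, 0, 1, 1, 0]
clean = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in bits]
stego = [encode_patch(p, b, direction, epsilon=5.0) for p, b in zip(clean, bits)]
# Small additive noise stands in for the VAE decode/encode round trip (assumption).
received = [[v + rng.gauss(0.0, 0.1) for v in p] for p in stego]
decoded = [decode_patch(c, r, direction)[0] for c, r in zip(clean, received)]

print("bit accuracy (%):", bit_accuracy_percent(bits, decoded), flush=True)
```

This decoder needs the clean latent as a reference, which mirrors the three-argument decode_message of PatchSteg. The CDF variant thresholds against the channel mean instead, which is why its decode_message takes only the stego image.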

Evolution Roadmap

Phase 1 (CDF) ──┐
                 ├──> Phase 3 (Detector — needs CDF for attack-defense pair)
Phase 2 (PCA) ──┘
                      │
                      v
                 Phase 4 (Benchmarks — needs all above)

| Phase | Goal | Status | Run |
|---|---|---|---|
| 1 (CDF) | Undetectable steganography (AUC≈0.5) | Code done, experiments pending | python3 experiments/cdf_capacity_test.py |
| 2 (PCA) | Data-driven perturbation directions | Code done, experiments pending | python3 experiments/pca_test.py |
| 3 (Detector) | Latent steganalysis (attack-defense) | Code done, experiments pending | python3 experiments/latent_detector_test.py |
| 4 (Benchmarks) | Compare vs RoSteALS, TrustMark, etc. | Not started | |

Post-Experiment Checklist (every phase)

  1. Run experiment script end-to-end
  2. Verify figures in paper/figures/
  3. Update paper/main.tex with results (fill in table placeholders)
  4. Build PDF: cd paper && bash build.sh
  5. Append results to EXPERIMENT_LOG.md
  6. Commit + push

Paper Section Map

For quick navigation when updating paper/main.tex:

| Section | Label | What Goes There |
|---|---|---|
| Abstract | | Summary of all results |
| Related Work | | Gaussian Shading, PRC, Tree-Ring, RoSteALS, Motwani |
| Method: CDF Encoding | sec:cdf | CDF-PatchSteg algorithm |
| Method: PCA Directions | sec:pca | PCA-guided perturbations |
| CDF Experiments | sec:cdf_experiments | CDF accuracy, KS test, detectability |
| PCA Experiments | sec:pca_experiments | Variance explained, accuracy, detectability |
| Steganalysis | sec:steganalysis | Detector results, cross-method matrix, ROC |
| Conclusion | | Updated with CDF + detector narrative |
| Bibliography | | 4 references (add more as needed) |

References (in references/)

Read references/INDEX.md for the full table. Key papers every agent should know:

| Paper | File | TL;DR |
|---|---|---|
| Gaussian Shading | gaussian_shading_cvpr2024.pdf | CDF partitioning for undetectable watermarks; our Phase 1 inspiration |
| VAE PCA Directions | vae_pca_directions_cvpr2019.pdf | VAEs align with PCA; our Phase 2 theory |
| RoSteALS | rosteals_cvprw2023.pdf | Trained stego on frozen VAE; our main comparison baseline |
| Motwani Collusion | secret_collusion_motwani2024.pdf | LLM covert channels; our threat model |