A production-hardened, three-tier cascading security system for real-time threat detection and alert triage. Combines deterministic watchers (Tier 1), a fine-tuned 3B LLM (Tier 2), and frontier model escalation (Tier 3).
A deployable security monitoring system that runs on commodity hardware (4GB VRAM sufficient for Tier 2). Tier 1 requires zero GPU — it runs entirely on CPU via /proc polling and inotify.
The system was adversarially tested with a full 7-stage intrusion simulation, prompt injection attacks, and evasion techniques. All 7 kill-chain stages and all 5 evasion techniques were detected, and all 3 prompt injection attempts were resisted.
┌─────────────────────────────────────────────────────────────┐
│ TIER 1: Deterministic Watchers (always-on, 0 VRAM, <25% CPU) │
│ │
│ FS Watcher ─── Log Watcher ─── YARA Scanner │
│ Supply Chain ─ Process Watcher ─ Net Watcher ─ Integrity │
│ │
│ Defenses: snapshot-on-detect, homoglyph normalization, │
│ ELF magic detection, watchdog auto-restart │
│ │
│ Alert Queue → SQLite (rate-limited, severity-prioritized) │
└───────────────────────────┬─────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────────┐
│ TIER 2: Fine-Tuned LLM (Qwen2.5-Coder-3B, Q4_K_M, 4GB) │
│ │
│ Input: Structured alert from Tier 1 │
│ Output: benign / suspicious / escalate + confidence (0-1) │
│ │
│ Serves via llama-server (OpenAI-compatible API, port 8085) │
│ Single-turn verdict, 3-6s per alert, no multi-turn │
└───────────────────────────┬─────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────────┐
│ TIER 3: Frontier Model (on-demand, CRITICAL + low conf) │
│ │
│ Full context window analysis for ambiguous escalations │
│ Invoked only when Tier 2 confidence < threshold │
└─────────────────────────────────────────────────────────────┘
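
The hand-off from Tier 2 to Tier 3 can be sketched as a small parse-and-route step. This is a minimal sketch under stated assumptions, not the project's implementation: the JSON reply shape, the `ESCALATION_THRESHOLD` value, the names `parse_verdict` and `needs_tier3`, and the choice to also escalate on an explicit `escalate` label are all hypothetical. Only the three verdict labels and the 0-1 confidence range come from the diagram above.

```python
import json
from typing import NamedTuple

VALID_VERDICTS = {"benign", "suspicious", "escalate"}
ESCALATION_THRESHOLD = 0.7  # assumed value; the actual threshold is not stated


class Verdict(NamedTuple):
    label: str
    confidence: float


def parse_verdict(raw: str) -> Verdict:
    """Parse a Tier 2 reply, assumed to look like
    {"verdict": "benign", "confidence": 0.93}.
    Malformed replies fail closed: they become an escalation with zero confidence."""
    try:
        data = json.loads(raw)
        label = str(data["verdict"]).lower()
        confidence = float(data["confidence"])
    except (ValueError, KeyError, TypeError):
        return Verdict("escalate", 0.0)
    if label not in VALID_VERDICTS or not 0.0 <= confidence <= 1.0:
        return Verdict("escalate", 0.0)
    return Verdict(label, confidence)


def needs_tier3(verdict: Verdict, threshold: float = ESCALATION_THRESHOLD) -> bool:
    """Route to Tier 3 when Tier 2 asks for it or its confidence is below threshold."""
    return verdict.label == "escalate" or verdict.confidence < threshold
```

Failing closed on unparseable output is a design choice worth considering here, since a prompt-injected alert that derails the model's reply format would then be escalated rather than silently dropped.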
| Watcher | Source | Detects |
|---|---|---|
| FS Watcher | inotify on /tmp, /dev/shm, /var/tmp | New executables, suspicious extensions, Unicode homoglyph evasion (GlassWorm-style), ELF binaries regardless of extension |
| Log Watcher | tail on auth.log, syslog | SSH brute-force attempts, sudo abuse, cron modifications, PAM failures |
| YARA Scanner | Triggered by FS events | Webshells, reverse shells, crypto miners, supply chain injections, encoded payloads |
| Supply Chain | pip/npm install hooks | Typosquatting, exec-in-setup.py, obfuscated eval/exec, Fernet-encrypted payloads |
| Process Watcher | /proc poll (5s) | memfd_create fileless execution, deleted binaries, known malicious tools, restricted user activity |
| Net Watcher | /proc/net/tcp poll (10s) | C2 port connections (4444, 8888, 1337...), unexpected listeners, restricted user outbound |
| Integrity Watcher | SHA256 baseline (30s) | Tampering with Sentinel's own code, YARA rules, or database |
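
To illustrate the Net Watcher row, the following is a minimal sketch of how `/proc/net/tcp` text could be parsed for established connections to the listed C2 ports. The function names and the `C2_PORTS` set are assumptions for illustration (the table elides the full port list), and the address decoding assumes a little-endian host.

```python
import socket
import struct
from typing import List, Tuple

C2_PORTS = {4444, 8888, 1337}  # only the ports named in the table; the full list is not shown
TCP_ESTABLISHED = "01"         # kernel state code for ESTABLISHED


def decode_addr(hex_addr: str) -> Tuple[str, int]:
    """Decode '0100007F:115C' into ('127.0.0.1', 4444).
    Assumes a little-endian host, where IPv4 bytes appear reversed."""
    ip_hex, port_hex = hex_addr.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def suspicious_connections(proc_net_tcp: str) -> List[Tuple[str, int]]:
    """Return (remote_ip, remote_port) for established connections to a C2 port."""
    hits = []
    for line in proc_net_tcp.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 4 or fields[3] != TCP_ESTABLISHED:
            continue
        remote_ip, remote_port = decode_addr(fields[2])
        if remote_port in C2_PORTS:
            hits.append((remote_ip, remote_port))
    return hits
```

On a Linux host this could be called as `suspicious_connections(open("/proc/net/tcp").read())`. IPv6 sockets live in `/proc/net/tcp6` and would need separate decoding.
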
Files are copied to a staging directory at inotify trigger time, before alert queuing. This defeats the write-execute-delete race condition where attackers create a file, execute it, and delete it sub-second. YARA scans the snapshot even after the original is gone.
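
The snapshot step described above might look like the following minimal sketch. The function name `snapshot_file`, the staging layout, and the timestamp prefix are assumptions for illustration, not the project's actual code.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def snapshot_file(src: Path, staging_dir: Path) -> Optional[Path]:
    """Copy src into staging_dir right away and return the copy's path.
    Returns None if the file vanished (or was unreadable) before the copy."""
    staging_dir.mkdir(parents=True, exist_ok=True)
    # Nanosecond prefix keeps repeated snapshots of the same name from colliding
    dest = staging_dir / f"{time.time_ns()}_{src.name}"
    try:
        shutil.copy2(src, dest)
    except (FileNotFoundError, PermissionError):
        return None
    return dest
```

A copy made this way narrows the race window rather than closing it completely: a file deleted before the copy starts still yields `None`, which is itself worth alerting on.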
Cyrillic characters (ѕ→s, р→p, о→o) and fullwidth variants are normalized before extension matching. Catches GlassWorm-style evasion where .pу (Cyrillic у) masquerades as .py.
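
As a rough sketch of this normalization, assuming a small hand-written confusables map (the `CONFUSABLES` table and `normalize_name` are hypothetical names, and a real deployment would need far broader coverage):

```python
import unicodedata

# Illustrative subset only: Cyrillic letters that look like Latin ones
CONFUSABLES = str.maketrans({
    "\u0455": "s",  # Cyrillic ѕ
    "\u0440": "p",  # Cyrillic р
    "\u043e": "o",  # Cyrillic о
    "\u0443": "y",  # Cyrillic у
})


def normalize_name(name: str) -> str:
    """Fold a filename to a comparable ASCII-like form before extension matching."""
    # NFKC maps fullwidth forms (e.g. U+FF50 'ｐ') to their ASCII equivalents
    folded = unicodedata.normalize("NFKC", name)
    # Lowercase first so uppercase Cyrillic letters also hit the map
    return folded.lower().translate(CONFUSABLES)
```

For example, `normalize_name("payload.p\u0443")` yields `payload.py`, so the usual `.py` extension rules apply to the disguised name.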
- Cron-based watchdog restarts Tier 1 within 60s if killed
- systemd unit with `Restart=always` and `RestartSec=3`
- Integrity watcher detects code/rule/DB tampering
- Rate limiter prevents alert flooding DoS
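
The integrity check in the list above can be sketched as a SHA256 baseline that is rebuilt once and then compared on each pass. This is a minimal illustration under assumptions: `build_baseline` and `find_tampering` are hypothetical names, and the real watcher's storage and scheduling are not shown in this README.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable, List


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_baseline(paths: Iterable[Path]) -> Dict[str, str]:
    """Record the current hash of every protected file."""
    return {str(p): sha256_of(p) for p in paths}


def find_tampering(baseline: Dict[str, str]) -> List[str]:
    """Return paths that are missing or whose hash no longer matches the baseline."""
    changed = []
    for name, expected in baseline.items():
        path = Path(name)
        if not path.is_file() or sha256_of(path) != expected:
            changed.append(name)
    return changed
```

A baseline kept only in the daemon's memory is lost on restart, and one kept on disk can itself be tampered with, so where the baseline lives is a design decision in its own right.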
| Stage | Attack | Detection |
|---|---|---|
| Initial Access | Webshell drop to /tmp | FS watcher + YARA |
| Persistence | Crontab modification | Log watcher |
| Privilege Escalation | SUID binary in /dev/shm | FS watcher + process watcher |
| Lateral Movement | SSH from www-data | Net watcher + process watcher |
| Data Exfiltration | Outbound to C2 port 4444 | Net watcher (CRITICAL) |
| Supply Chain | Malicious pip package | Supply chain watcher + YARA |
| Stealth | memfd_create fileless exec | Process watcher (fd scan) |
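
The "fd scan" used for the Stealth stage can be illustrated with a short sketch that walks a process's `/proc/<pid>/fd` directory and looks for descriptors backed by `memfd_create`, which the kernel reports with a `/memfd:` link target. The function names and the `proc_root` parameter are assumptions added for illustration and testability.

```python
import os
from typing import List


def is_memfd_target(link_target: str) -> bool:
    """memfd_create files appear as '/memfd:<name> (deleted)' in /proc symlinks."""
    return link_target.startswith("/memfd:")


def memfd_fds(pid: int, proc_root: str = "/proc") -> List[str]:
    """Return the link targets of all memfd-backed descriptors held by pid."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    hits: List[str] = []
    try:
        entries = os.listdir(fd_dir)
    except OSError:  # process already exited, or permission denied
        return hits
    for fd in entries:
        try:
            target = os.readlink(os.path.join(fd_dir, fd))
        except OSError:  # descriptor closed between listdir and readlink
            continue
        if is_memfd_target(target):
            hits.append(target)
    return hits
```

Some legitimate software also uses memfd descriptors, so a hit is probably best treated as one signal among several (for instance combined with an executable mapping) rather than a verdict on its own.
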
| Technique | Result |
|---|---|
| Unicode homoglyph extension (.pу) | Caught |
| Write-execute-delete (<100ms) | Caught (snapshot) |
| Fileless via memfd_create | Caught (fd scan) |
| Prompt injection in alert text | Resisted (3/3) |
| Daemon kill | Auto-restarted (60s) |
| Code tampering | Caught (integrity watcher) |
| Metric | Score |
|---|---|
| Sandbox eval (FP16) | 10/10 (100%) |
| vs Vanilla Qwen2.5-3B | +90pp improvement |
| vs Q4 baseline (no LoRA) | +50pp improvement |
| Confidence calibration | 0.85-0.98 |
# Clone and install
git clone https://github.com/YOUR_ORG/sentinel-hybrid-stack
cd sentinel-hybrid-stack
pip install -e .
# Start Tier 1 (no GPU needed)
python3 -m sentinel.tier1_daemon --watch-dirs /tmp /var/tmp /dev/shm
# Optional: Start Tier 2 (requires GGUF model + 4GB VRAM)
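llama-server -m models/sentinel-qwen2.5-3b-q4_k_m.gguf -c 2048 --port 8085
# Install the systemd unit
sudo cp sentinel-tier1.service /etc/systemd/system/
sudo systemctl enable --now sentinel-tier1
# Install the cron watchdog (appends to the existing crontab instead of replacing it)
(crontab -l 2>/dev/null; echo "* * * * * /path/to/sentinel/watchdog.sh") | crontab -
The Tier 2 model is available on Hugging Face: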
- Model: EchoLabs33/sentinel-qwen2.5-3b-security
- Training corpus: EchoLabs33/sentinel-training-corpus-v0.1 (private)
- Base: Qwen/Qwen2.5-Coder-3B-Instruct + LoRA (r=8, alpha=16)
- Training cost: $78.05 total ($1.47 compute + $76.58 corpus generation)
sentinel/
__init__.py
schema.py # Alert/AlertType/AlertSeverity dataclasses
db.py # SQLite persistence layer
tier1_daemon.py # Main daemon (coordinates all watchers)
tier2_llm.py # LLM classification client
watchers/
fs_watcher.py # Filesystem monitoring (inotify)
log_watcher.py # Log file tail monitoring
yara_scanner.py # YARA rule matching engine
supply_chain_watcher.py # pip/npm supply chain monitor
process_watcher.py # /proc process monitoring
net_watcher.py # /proc/net/tcp network monitoring
integrity_watcher.py # Self-integrity verification
hash_checker.py # SHA256 file hashing
rules/
sentinel.yar # YARA detection rules
watchdog.sh # Cron-based restart watchdog
sentinel-tier1.service # systemd unit file
adversarial_test.py # Adversarial test suite
attack_sim.py # 7-stage intrusion simulation
- Q4 quantization conservatism: GGUF deployment over-classifies some benign activity (apt-get, Docker starts). FP16 model is accurate; quantization shifts toward false positives.
- Synthetic training data: 7,167 Claude-generated scenarios. Real SOC telemetry would improve production calibration.
- Polling intervals: Process (5s) and network (10s) watchers have blind spots for ultra-short-lived events between polls.
- Keyword-dependent gate: Novel attack vocabulary not in YARA rules or `HIGH_RISK_KEYWORDS` will only be caught by Tier 2 behavioral analysis.
- Single-host scope: Currently monitors one machine. No distributed correlation.
- Added process watcher (memfd, deleted binaries, restricted users)
- Added network watcher (C2 ports, new listeners, restricted outbound)
- Added integrity watcher (self-defense against code tampering)
- Added snapshot-on-detect (write-delete race condition defense)
- Added Unicode homoglyph normalization (GlassWorm evasion defense)
- Added ELF magic detection (extension-independent binary detection)
- Added supply chain YARA rules (exec+Fernet, getattr obfuscation)
- Added stealth script YARA rules (socket+dup2, encoded PowerShell)
- Added watchdog auto-restart (cron + systemd)
- Full adversarial testing: 7/7 kill chain, 3/3 prompt injection, 5/5 evasion
- LoRA fine-tuned Tier 2: 10/10 sandbox eval (vs 1/10 vanilla)
- Initial release: SSM + LLM + post-LLM gate
- 3 watchers (FS, Log, YARA)
- Synthetic eval only
Apache-2.0