feat: PE cert signatures, headless browser stealer, COM office martian, ransomware_message fix #571
**New signatures:**
- `pe_cert_suspicious.py` (all): Three PE Authenticode cert signatures:
  - `pe_cert_self_signed` (sev 3): PE signed with self-signed cert (subject == issuer,
    excluding known root CAs). Uses both digital_signers and guest_signers data.
  - `pe_cert_suspicious_issuer` (sev 3): PE signed by unrecognized CA with incomplete
    chain, domain-style subject CN, or short validity window (< 180 days).
  - `pe_cert_invalid_signature` (sev 4): Signature failed cryptographic verification
    (hash mismatch 0x80096010, revoked 0x800B0109, chain can't be built 0x800B010A).
    Distinguishes definitive failures from sandbox trust-store gaps.
- `stealer_headless_browser.py` (all): Detects browser stealers launching browsers
headless with logging suppressed (--headless --disable-logging --log-level=3)
from a suspicious parent directory. Fires when 3+ browsers are launched this way
(multi-browser sweep = high confidence) or when the process tree confirms the
suspicious parent. Catches the credential-extraction phase that follows the
initial browser probe.
- `com_process_activation.py` (all): Detects Office applications (Excel, Word, etc.)
that COM-activated a suspicious process (mshta, powershell, cmd, etc.) via the
DCOM broker — the LethalHTA / OLE embedding attack pattern. Only fires when the
process tree enrichment confirms an actual subprocess was spawned (requires
CAPEv2 behavior.py network_map COM enrichment).
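The multi-browser sweep rule for `stealer_headless_browser.py` described above can be sketched roughly as follows. This is a minimal illustration, not the signature's code: the function names, the flag tuple, and the threshold parameter are assumptions made for the example, and the regex only approximates the browser list mentioned in this PR.

```python
import re

# Hypothetical pattern for illustration; the real signature's regex may differ.
BROWSER_RE = re.compile(r"\\(chrome|brave|msedge|firefox|opera)\.exe", re.IGNORECASE)
# Flags named in the PR description.
SILENT_FLAGS = ("--headless", "--disable-logging", "--log-level=3")


def headless_browsers(command_lines):
    """Return the set of browser names launched headless with logging suppressed."""
    seen = set()
    for cmd in command_lines:
        match = BROWSER_RE.search(cmd)
        if not match:
            continue
        lower = cmd.lower()
        if all(flag in lower for flag in SILENT_FLAGS):
            seen.add(match.group(1).lower())
    return seen


def is_multi_browser_sweep(command_lines, threshold=3):
    """True when at least `threshold` distinct browsers were launched that way."""
    return len(headless_browsers(command_lines)) >= threshold
```

Counting distinct browser names rather than raw launches keeps a single browser that is restarted several times from tripping the sweep condition.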
**Bug fixes:**
- `martians_office.py`: Add COM-logical children check. The existing OS-process-tree
walk misses LethalHTA spawns because mshta's OS parent is svchost, not Excel.
Added `_check_com_martians()` that walks the enriched processtree for nodes with
`com_logical_parent_pid` pointing to an Office process.
- `ransomware_message.py`: Fix `TypeError: can't use a bytes pattern on a string-like
object` in re2. `indicators` were encoded to bytes and joined with `b"|"` producing
a bytes regex, but `buff.lower()` returns a str. Changed to compile a str pattern.
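The bytes/str mismatch behind the `ransomware_message.py` fix can be reproduced with the standard-library `re` module, which raises the same kind of `TypeError` as the one quoted for re2. The sketch below is illustrative only: the indicator list and the `find_indicators` helper are made up for the example and are not taken from the signature.

```python
import re

# Hypothetical indicator list for illustration.
indicators = ["bitcoin", "decrypt", "your files"]

# Shape of the original bug: indicators encoded and joined into a bytes pattern.
bytes_pattern = re.compile(b"|".join(re.escape(i).encode("utf-8") for i in indicators))

# Shape of the fix: a plain str pattern.
STR_PATTERN = re.compile("|".join(re.escape(i) for i in indicators))


def find_indicators(text):
    """Match with the str pattern, decoding bytes input first so types agree."""
    if isinstance(text, (bytes, bytearray)):
        text = bytes(text).decode("utf-8", errors="ignore")
    return set(STR_PATTERN.findall(text.lower()))
```

Normalising the input to `str` in one place means the pattern type and the haystack type cannot drift apart again, whichever of the two the caller happens to supply.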
Code Review
This pull request introduces several new signatures for detecting malicious behavior, including COM-activated process spawning from Office applications, suspicious or invalid PE certificates, and headless browser launches used for credential theft. It also refactors the RansomwareMessage signature and updates MartiansOffice. Review feedback identifies several critical issues in the ransomware detection logic, specifically potential TypeError exceptions in Python 3 when applying string-based regex to raw bytes, and a crash risk during JSON serialization when reporting raw byte buffers. Additionally, the reviewer noted the accidental removal of the MassRansomNoteDrop class, a performance bottleneck in process tree traversal, and a regression caused by removing a filename fallback in the ransomware signature.
    buff = self.get_raw_argument(call, "Buffer")
    if buff and len(buff) >= 128:
        buff_lower = buff.lower()
        matches = set(self.regex.findall(buff_lower))
In Python 3, using a string-compiled regex (self.regex) on a bytes object (buff_lower) will raise a TypeError. Since get_raw_argument returns bytes, you should either decode the buffer or compile the regex as bytes.
Suggested change:

    - matches = set(self.regex.findall(buff_lower))
    + matches = set(self.regex.findall(buff_lower.decode("utf-8", errors="ignore")))
    matches = set(self.regex.findall(buff_lower))
    if len(matches) > 1:
        self.data.append({"ransom_note": filepath})
        self.data.append({"beginning_of_ransom_message": buff})
Appending raw bytes (buff) to self.data will cause a crash during report generation when the results are serialized to JSON. Ensure the data is converted to a string.
Suggested change:

    - self.data.append({"beginning_of_ransom_message": buff})
    + self.data.append({"beginning_of_ransom_message": buff.decode("utf-8", errors="ignore")})
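The serialization problem raised in this comment is easy to confirm in isolation: `json.dumps` rejects `bytes` values with a `TypeError`. A small helper along these lines would make any buffer safe to report. The name `to_reportable` and its `limit` parameter are hypothetical and not part of the PR.

```python
import json


def to_reportable(value, limit=256):
    """Turn a raw buffer into a JSON-safe string.

    `limit` is an illustrative cap on the reported length, not something
    the signature is known to enforce.
    """
    if isinstance(value, (bytes, bytearray)):
        value = bytes(value).decode("utf-8", errors="ignore")
    return str(value)[:limit]


def serialize_entry(key, value):
    """Serialize a single report entry the way a JSON reporter would."""
    return json.dumps({key: to_reportable(value)})
```

With `errors="ignore"`, bytes that are not valid UTF-8 are dropped silently, which suits a preview field but would be the wrong choice if the exact buffer had to be preserved.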
    if isinstance(filedata, str):
        filedata = filedata.encode("utf-8", errors="ignore")
Encoding filedata to bytes here conflicts with the string-based regex used later on line 191, which will cause a TypeError in Python 3. If the regex is string-based, you should ensure filedata is a string.
Suggested change:

    - if isinstance(filedata, str):
    -     filedata = filedata.encode("utf-8", errors="ignore")
    + if isinstance(filedata, (bytes, bytearray)):
    +     filedata = filedata.decode("utf-8", errors="ignore")
    for note_name, dirs in notes.items():
        if len(dirs) >= 5:
            self.data.append({"ransom_note": note_name, "pid": pid, "directories_count": len(dirs)})
    return self.ret

    # Very short validity (< 180 days)
    try:
        from datetime import datetime

    for parent in (
        self.results.get("behavior", {}).get("processes", []) or []
    ):
        if parent.get("process_id") != parent_id:
            continue

    "what happened",
    "what happened",

      def on_call(self, call, process):
    -     filepath = self.get_argument(call, "HandleName") or self.get_argument(call, "FileName")
    +     filepath = self.get_argument(call, "HandleName")
Removing the fallback to FileName is a regression: HandleName is not always populated for NtWriteFile calls. This change may reduce detection coverage for ransom notes written via handles whose name is not resolved in the HandleName field.
Suggested change:

    - filepath = self.get_argument(call, "HandleName")
    + filepath = self.get_argument(call, "HandleName") or self.get_argument(call, "FileName")
Pull request overview
This PR adds several new detection signatures (PE Authenticode certificate anomalies, headless-browser stealer behavior, and Office→COM process activation) and updates existing ransomware/Office signatures to improve detection accuracy.
Changes:
- Added new signatures: `pe_cert_suspicious.py`, `stealer_headless_browser.py`, and `com_process_activation.py`.
- Updated `martians_office.py` to also detect COM-logical children spawned by Office (DCOM broker pattern).
- Refactored `ransomware_message.py` to fix the `re2` bytes/str regex mismatch and adjusted indicator handling.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| modules/signatures/windows/ransomware_message.py | Refactors buffer/regex handling and dropped-file scanning; also removes an additional signature class from the file. |
| modules/signatures/windows/martians_office.py | Adds COM-logical child detection to catch Office-spawned “martians” hidden behind svchost/DCOM. |
| modules/signatures/all/stealer_headless_browser.py | New signature to detect headless+silent browser launches from suspicious parent locations / multi-browser sweeps. |
| modules/signatures/all/pe_cert_suspicious.py | New signatures to flag self-signed, suspicious-issuer, and invalid Authenticode signatures. |
| modules/signatures/all/com_process_activation.py | New signature to detect Office COM-activated subprocesses via enriched process tree metadata. |
Comments suppressed due to low confidence (1)
modules/signatures/windows/ransomware_message.py:200
- This PR removes the `MassRansomNoteDrop` signature entirely, but the PR description only mentions a regex TypeError fix for `ransomware_message.py`. If the removal is unintentional, restore/move the signature; if intentional, please document the behavior change (and consider deprecating instead of deleting to avoid breaking downstream expectations).
    def on_complete(self):
        if not self.ret and "dropped" in self.results:
            for dropped in self.results["dropped"]:
                raw_name = dropped.get("name", "")
                if isinstance(raw_name, list) and len(raw_name) > 0:
                    filename = str(raw_name[0]).lower()
                else:
                    filename = str(raw_name).lower()
                if (
                    filename.endswith((".txt", ".html", ".hta", ".rtf"))
                    or "read_me" in filename
                    or "readme" in filename
                    or "read-me" in filename
                ):
                    filedata = dropped.get("data")
                    if isinstance(filedata, str):
                        filedata = filedata.encode("utf-8", errors="ignore")
                    if filedata and len(filedata) >= 128:
                        filedata_lower = filedata.lower()
                        matches = set(self.regex.findall(filedata_lower))
                        if len(matches) > 1:
                            self.data.append({"ransom_note": filename})
                            self.data.append({"beginning_of_ransom_message": filedata})
                            self.ret = True
                            break
        return self.ret
    if len(buff_str) >= 32:
        buff_lower = buff_str.lower()
        matches = set(self.regex.findall(buff_lower))
        if len(matches) > 1:
            self.data.append({"ransom_note": filepath})
            self.data.append({"beginning_of_ransom_message": buff})

    if len(matches) > 1:
        if self.pid:
            self.mark_call()
        return True
        self.ret = True

    "BTC",
    "ethereum",
    "what happened",
    "what happened",

    BROWSER_RE = re.compile(
        r'\\(?:chrome|brave|msedge|firefox|opera)\.exe',
        re.IGNORECASE
    )

    SUSPICIOUS_PARENT_RE = re.compile(
        r'\\(?:Temp|AppData|ProgramData|Users\\[^\\]+\\(?:AppData|Downloads)|Users\\Public)\\',
        re.IGNORECASE
    )

    for proc in (
        self.results.get("behavior", {}).get("processes", []) or []
    ):
        path = proc.get("module_path", "") or proc.get("process_name", "") or ""
        if not BROWSER_RE.search(path):
            continue
        parent_id = proc.get("parent_id")
        if parent_id is None:
            continue
        # Find parent process
        for parent in (
            self.results.get("behavior", {}).get("processes", []) or []
        ):
            if parent.get("process_id") != parent_id:
                continue
            parent_path = parent.get("module_path", "") or ""
            if SUSPICIOUS_PARENT_RE.search(parent_path) and not LEGITIMATE_LAUNCHERS.search(parent_path):
                suspicious_parent = parent_path
                self.data.append({"suspicious_parent": parent_path})
                break
        if suspicious_parent:

    lower = cmd.lower()
    if not BROWSER_RE.search(cmd):
        continue
    if "--headless" not in lower:

    from lib.cuckoo.common.abstracts import Signature

    def _get_pe(results):

    OFFICE_ACTIVATORS = {
        "excel.exe", "winword.exe", "powerpnt.exe", "outlook.exe",
        "msaccess.exe", "mspub.exe", "visio.exe",
    }

    def run(self):
        # Only report confirmed COM-spawned subprocesses visible in the enriched tree.
        # Requiring com_logical_parent_pid avoids noise from normal JScript/WMI activations.
        def walk(nodes):
            for node in nodes:
                lpid = node.get("com_logical_parent_pid")
                lname = (node.get("com_logical_parent_name") or "").lower()
                if lpid and os.path.basename(lname) in self.OFFICE_ACTIVATORS:
                    self.data.append({
                        "spawned": "%s (pid %s)" % (node.get("name"), node.get("pid")),
                        "logical_parent": "%s (pid %s)" % (
                            node.get("com_logical_parent_name"), lpid),
                        "via": node.get("com_progid") or node.get("com_clsid", ""),
                    })

…op; stealer O(N2) loop
…Name fallback; stealer browser regex + Firefox -headless
Round 2 fixes pushed (cab6740), addressing all reviewer feedback:
- `ransomware_message.py`:
- `stealer_headless_browser.py`: Replaced O(N²) nested loop with
- `pe_cert_suspicious.py`:
- `com_process_activation.py`: Noted. A follow-up can tighten the match to a suspicious-children set (mshta, powershell, cmd, wscript, etc.). Left as a known limitation in this PR.
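The text above does not preserve what replaced the O(N²) nested loop. One common way to remove that kind of quadratic parent search is to index the process list by PID once and then do constant-time lookups. The sketch below assumes that approach and is not the committed code; `suspicious_browser_parents` and its predicate arguments are hypothetical names.

```python
def suspicious_browser_parents(processes, is_browser, is_suspicious):
    """Return parent paths of browser processes whose parent looks suspicious.

    `is_browser` and `is_suspicious` are caller-supplied predicates on a path
    (stand-ins for the signature's regex checks).
    """
    # Build the PID index once: O(N) instead of rescanning the list per process.
    by_pid = {p.get("process_id"): p for p in processes}
    hits = []
    for proc in processes:
        path = proc.get("module_path") or proc.get("process_name") or ""
        if not is_browser(path):
            continue
        parent_id = proc.get("parent_id")
        if parent_id is None:
            continue
        parent = by_pid.get(parent_id)
        if parent is None:
            continue
        parent_path = parent.get("module_path") or ""
        if is_suspicious(parent_path):
            hits.append(parent_path)
    return hits
```

If PIDs are reused within one analysis, a plain dictionary keeps only the last process seen for each PID, so a real implementation might need to key on more than the PID alone.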
**New Signatures**

`pe_cert_suspicious.py` (all)

Three signatures for suspicious Authenticode certificates:
- `pe_cert_self_signed` (severity 3): PE signed with a self-generated certificate. Detects when subject CN == issuer CN, excluding well-known root CAs (DigiCert, Entrust, etc.). Common in malware that generates throwaway signing certs to appear legitimate.
- `pe_cert_suspicious_issuer` (severity 3): PE signed by an unrecognized CA with red flags: single-cert chain (no intermediate CA), domain-style subject CN (e.g. `112bhv.nl`), or validity window < 180 days. Pattern seen in malware using certs from low-trust/compromised issuers.
- `pe_cert_invalid_signature` (severity 4): Signature failed cryptographic verification. Distinguishes definitive failures (hash mismatch `0x80096010`, chain can't be built `0x800B010A`, revoked `0x800B0109`) from sandbox trust-store gaps ("not trusted by trust provider") which are normal in analysis VMs.

Requires CAPEv2 parse_pe.py fix for cryptography ≥ 40.x (companion PR kevoreilly/CAPEv2#3018).
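The self-signed, domain-style CN, and short-validity heuristics described above could look roughly like the following. This is a sketch under stated assumptions: the helper names, the root-CA fragment list, and the domain regex are invented for illustration, and the real signature reads these fields from CAPE's parsed signer data rather than taking them as plain arguments.

```python
import re
from datetime import datetime

# Hypothetical allowlist fragments; the PR names DigiCert and Entrust as examples.
KNOWN_ROOT_CA_FRAGMENTS = ("digicert", "entrust")
# Hypothetical pattern for a domain-style common name such as "112bhv.nl".
DOMAIN_CN_RE = re.compile(r"^(?:[a-z0-9-]+\.)+[a-z]{2,}$", re.IGNORECASE)


def is_self_signed(subject_cn, issuer_cn):
    """Subject CN equals issuer CN, and the name is not a known root CA."""
    if not subject_cn or subject_cn != issuer_cn:
        return False
    lowered = subject_cn.lower()
    return not any(ca in lowered for ca in KNOWN_ROOT_CA_FRAGMENTS)


def has_domain_style_cn(subject_cn):
    """Subject CN looks like a bare domain name."""
    return bool(DOMAIN_CN_RE.match(subject_cn or ""))


def has_short_validity(not_before, not_after, max_days=180):
    """Validity window shorter than `max_days` (180 in the PR description)."""
    return (not_after - not_before).days < max_days
```

Keeping each red flag as its own predicate makes it straightforward to report which condition fired alongside the signature match.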
`stealer_headless_browser.py` (all)

Detects the credential-extraction phase of browser stealers: browsers launched headless with logging suppressed (`--headless --disable-logging --log-level=3`) from a suspicious parent directory (Temp, AppData, ProgramData).

Pattern observed: malware from `%TEMP%` first probes installed browsers with `--disable-gpu about:blank`, then re-launches them headless+silent to access saved passwords, cookies, and session tokens. Fires when 3+ different browser binaries are launched this way (multi-browser sweep) or when the process tree confirms the suspicious parent.

`com_process_activation.py` (all)

Detects Office applications (Excel, Word, Outlook, etc.) that COM-activated a suspicious process (mshta, powershell, cmd, wscript, etc.) via the DCOM broker. The LethalHTA technique embeds HTA/ActiveX objects in Office documents; when activated, Windows launches `mshta.exe -Embedding` via svchost as the COM surrogate, hiding the true parent. This signature only fires when the CAPEv2 process tree enrichment confirms an actual COM subprocess was spawned.

Requires behavior.py COM enrichment (companion PR kevoreilly/CAPEv2#3019).
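A walk over the enriched process tree that also restricts matches to a suspicious-children set might be sketched as below. The field names `com_logical_parent_pid`, `com_logical_parent_name`, `name`, and `pid` come from the code quoted in this PR; the `children` key, the contents of `SUSPICIOUS_CHILDREN`, and the function name are assumptions for illustration only.

```python
import ntpath

OFFICE_ACTIVATORS = {
    "excel.exe", "winword.exe", "powerpnt.exe", "outlook.exe",
    "msaccess.exe", "mspub.exe", "visio.exe",
}
# Hypothetical set, based on the process names listed in the description.
SUSPICIOUS_CHILDREN = {"mshta.exe", "powershell.exe", "cmd.exe", "wscript.exe"}


def find_com_spawns(nodes):
    """Return (child, pid, logical_parent, logical_parent_pid) tuples.

    Assumes each tree node may carry a "children" list of nested nodes.
    """
    hits = []
    stack = list(nodes)
    while stack:
        node = stack.pop()
        stack.extend(node.get("children") or [])
        lpid = node.get("com_logical_parent_pid")
        # ntpath splits on backslashes regardless of the host OS.
        parent = ntpath.basename((node.get("com_logical_parent_name") or "").lower())
        child = ntpath.basename((node.get("name") or "").lower())
        if lpid and parent in OFFICE_ACTIVATORS and child in SUSPICIOUS_CHILDREN:
            hits.append((child, node.get("pid"), parent, lpid))
    return hits
```

Using `ntpath.basename` instead of `os.path.basename` matters when the analysis host runs Linux, where `os.path` does not treat the backslash as a path separator.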
**Bug Fixes**

`martians_office.py`

Add COM-logical children check to the existing Office martians signature. The OS-tree walk was missing LethalHTA spawns because `mshta.exe`'s OS parent is svchost, not Excel. Adds `_check_com_martians()` that walks the enriched processtree for nodes with `com_logical_parent_pid` pointing to an Office process, using the same whitelist as the existing walk.

`ransomware_message.py`

Fix `TypeError: can't use a bytes pattern on a string-like object` in re2. Indicators were encoded to bytes and joined with `b"|"` producing a bytes regex, but `buff.lower()` returns a str. Changed to compile a plain str pattern matching the str input.

🤖 Generated with Claude Code