feat: PE cert signatures, headless browser stealer, COM office martian, ransomware_message fix by wmetcalf · Pull Request #571 · CAPESandbox/community

wmetcalf · 2026-05-12T16:21:04Z

New Signatures

`pe_cert_suspicious.py` (all)

Three signatures for suspicious Authenticode certificates:

pe_cert_self_signed (severity 3) — PE signed with a self-generated certificate. Detects when subject CN == issuer CN, excluding well-known root CAs (DigiCert, Entrust, etc.). Common in malware that generates throwaway signing certs to appear legitimate.
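The subject/issuer comparison described above can be sketched as follows. This is a minimal illustration, not the signature's code: `is_self_signed` and `KNOWN_ROOT_CAS` are hypothetical names, and the real allowlist of root CAs is not shown in this PR description.

```python
# Hypothetical allowlist; the signature's actual root CA list is not shown here.
KNOWN_ROOT_CAS = ("digicert", "entrust")


def is_self_signed(subject_cn, issuer_cn, known_roots=KNOWN_ROOT_CAS):
    """Return True when subject CN equals issuer CN and the CN is not a known root CA."""
    if not subject_cn or not issuer_cn:
        return False
    subject = subject_cn.strip().lower()
    issuer = issuer_cn.strip().lower()
    if subject != issuer:
        return False
    # Root CA certificates are legitimately self-signed, so they are skipped.
    return not any(root in subject for root in known_roots)
```

A case-insensitive comparison is an assumption here; a stricter variant could compare the full distinguished name instead of the CN alone.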

pe_cert_suspicious_issuer (severity 3) — PE signed by an unrecognized CA with red flags: single-cert chain (no intermediate CA), domain-style subject CN (e.g. 112bhv.nl), or validity window < 180 days. Pattern seen in malware using certs from low-trust/compromised issuers.
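The three red flags listed above could be collected along these lines. The function name, its parameters, and the domain regex are assumptions made for illustration; only the 180-day threshold and the flag descriptions come from the description.

```python
import re
from datetime import datetime

# Hypothetical pattern for a domain-style CN such as "112bhv.nl".
DOMAIN_CN_RE = re.compile(r"^[a-z0-9-]+(?:\.[a-z0-9-]+)*\.[a-z]{2,}$", re.IGNORECASE)


def issuer_red_flags(subject_cn, chain_length, not_before, not_after, min_days=180):
    """Return the list of red flags raised by a certificate's metadata."""
    flags = []
    if chain_length <= 1:
        flags.append("single-cert chain")
    if DOMAIN_CN_RE.match(subject_cn or ""):
        flags.append("domain-style subject CN")
    if (not_after - not_before).days < min_days:
        flags.append("short validity window")
    return flags
```

For example, `issuer_red_flags("112bhv.nl", 1, datetime(2026, 1, 1), datetime(2026, 3, 1))` raises all three flags, while a certificate with a company-name CN, a three-certificate chain, and a one-year validity raises none.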

pe_cert_invalid_signature (severity 4) — Signature failed cryptographic verification. Distinguishes definitive failures (hash mismatch 0x80096010, chain can't be built 0x800B010A, revoked 0x800B0109) from sandbox trust-store gaps ("not trusted by trust provider") which are normal in analysis VMs.
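A rough sketch of the split between definitive failures and trust-store gaps follows. It assumes the verifier exposes a numeric error code and a status message, which is not confirmed by this description; the names are hypothetical, and only two of the codes listed above are included.

```python
# Two of the definitive failure codes named in the description above.
DEFINITIVE_FAILURES = {
    0x80096010: "hash mismatch",
    0x800B010A: "certificate chain could not be built",
}


def classify_signature_error(error_code, status_message=""):
    """Return (category, detail) for a failed Authenticode verification."""
    if error_code in DEFINITIVE_FAILURES:
        return "invalid", DEFINITIVE_FAILURES[error_code]
    # Assumption: trust-store gaps are recognisable from the status text.
    if "not trusted by" in (status_message or "").lower():
        return "trust_store_gap", "root not present in the sandbox trust store"
    return "unknown", ""
```

Only the "invalid" category would justify a severity 4 hit; the other two are expected noise in an analysis VM.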

Requires CAPEv2 parse_pe.py fix for cryptography ≥ 40.x (companion PR kevoreilly/CAPEv2#3018).

`stealer_headless_browser.py` (all)

Detects the credential-extraction phase of browser stealers: browsers launched headless with logging suppressed (--headless --disable-logging --log-level=3) from a suspicious parent directory (Temp, AppData, ProgramData).

Pattern observed: malware from %TEMP% first probes installed browsers with --disable-gpu about:blank, then re-launches them headless+silent to access saved passwords, cookies, and session tokens. Fires when 3+ different browser binaries are launched this way (multi-browser sweep) or when the process tree confirms the suspicious parent.
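The multi-browser sweep condition can be sketched as below. This is a simplified illustration operating on a plain list of command lines; `headless_browsers` and `is_multi_browser_sweep` are hypothetical helpers, and the suspicious-parent check is left out.

```python
import re

# Matches a browser binary at the end of a path component.
BROWSER_RE = re.compile(r"\\(chrome|brave|msedge|firefox|opera)\.exe", re.IGNORECASE)
REQUIRED_FLAGS = ("--headless", "--disable-logging", "--log-level=3")


def headless_browsers(command_lines):
    """Return the set of browser names launched with every required flag."""
    seen = set()
    for cmd in command_lines:
        match = BROWSER_RE.search(cmd)
        lower = cmd.lower()
        if match and all(flag in lower for flag in REQUIRED_FLAGS):
            seen.add(match.group(1).lower())
    return seen


def is_multi_browser_sweep(command_lines, threshold=3):
    """True when at least `threshold` distinct browsers were launched this way."""
    return len(headless_browsers(command_lines)) >= threshold
```

Counting distinct browser names rather than launches keeps a single browser restarted several times from reaching the threshold.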

`com_process_activation.py` (all)

Detects Office applications (Excel, Word, Outlook, etc.) that COM-activated a suspicious process (mshta, powershell, cmd, wscript, etc.) via the DCOM broker. The LethalHTA technique embeds HTA/ActiveX objects in Office documents; when activated, Windows launches mshta.exe -Embedding via svchost as the COM surrogate — hiding the true parent. This signature only fires when the CAPEv2 process tree enrichment confirms an actual COM subprocess was spawned.

Requires behavior.py COM enrichment (companion PR kevoreilly/CAPEv2#3019).

Bug Fixes

`martians_office.py`

Add COM-logical children check to the existing Office martians signature. The OS-tree walk was missing LethalHTA spawns because mshta.exe's OS parent is svchost, not Excel. Adds _check_com_martians() that walks the enriched processtree for nodes with com_logical_parent_pid pointing to an Office process — same whitelist as the existing walk.
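The walk described above might look roughly like this. The `com_logical_parent_pid` and `com_logical_parent_name` fields come from the PR; the `children` key, the function name, and the shortened Office set are assumptions for illustration.

```python
OFFICE_NAMES = {"excel.exe", "winword.exe", "powerpnt.exe", "outlook.exe"}


def find_com_children(nodes, office_names=OFFICE_NAMES):
    """Depth-first walk collecting nodes whose COM-logical parent is an Office process."""
    hits = []
    for node in nodes:
        parent_name = (node.get("com_logical_parent_name") or "").lower()
        # Split on the Windows separator so this also works on a Linux host.
        base = parent_name.rsplit("\\", 1)[-1]
        if node.get("com_logical_parent_pid") and base in office_names:
            hits.append((node.get("name"), node.get("pid")))
        # Assumption: child nodes are stored under a "children" key.
        hits.extend(find_com_children(node.get("children", []), office_names))
    return hits
```

In the LethalHTA case, mshta.exe sits under svchost in the OS tree, but its node still carries the Office process as logical parent, so the walk reports it.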

`ransomware_message.py`

Fix TypeError: can't use a bytes pattern on a string-like object in re2. Indicators were encoded to bytes and joined with b"|" producing a bytes regex, but buff.lower() returns a str. Changed to compile a plain str pattern matching the str input.
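The mismatch is easy to reproduce. The sketch below uses the standard `re` module as a stand-in for re2 (an assumption; the stdlib raises the same kind of TypeError), and the indicator list is a made-up sample rather than the signature's real one.

```python
import re

# Sample indicators only; the signature's real list is longer.
indicators = ["your files", "bitcoin", "decrypt"]

# Before: a bytes pattern, which cannot be applied to a str buffer.
bytes_regex = re.compile(b"|".join(i.encode() for i in indicators))
# After: a str pattern, matching the str returned by buff.lower().
str_regex = re.compile("|".join(re.escape(i) for i in indicators))


def count_matches(buff):
    """Return the distinct indicators found in a str buffer."""
    return set(str_regex.findall(buff.lower()))


def bytes_pattern_fails(buff):
    """Return True if applying the bytes pattern to a str raises TypeError."""
    try:
        bytes_regex.findall(buff.lower())
    except TypeError:
        return True
    return False
```

The same rule applies in the other direction: a str pattern applied to a bytes buffer fails too, so pattern and input types have to be kept in step.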

🤖 Generated with Claude Code

**New signatures:**

- `pe_cert_suspicious.py` (all): Three PE Authenticode cert signatures:
  - `pe_cert_self_signed` (sev 3): PE signed with self-signed cert (subject == issuer, excluding known root CAs). Uses both digital_signers and guest_signers data.
  - `pe_cert_suspicious_issuer` (sev 3): PE signed by unrecognized CA with incomplete chain, domain-style subject CN, or short validity window (< 180 days).
  - `pe_cert_invalid_signature` (sev 4): Signature failed cryptographic verification (hash mismatch 0x80096010, revoked 0x800B0109, chain can't be built 0x800B010A). Distinguishes definitive failures from sandbox trust-store gaps.
- `stealer_headless_browser.py` (all): Detects browser stealers launching browsers headless with logging suppressed (--headless --disable-logging --log-level=3) from a suspicious parent directory. Fires when 3+ browsers are launched this way (multi-browser sweep = high confidence) or when the process tree confirms the suspicious parent. Catches the credential-extraction phase that follows the initial browser probe.
- `com_process_activation.py` (all): Detects Office applications (Excel, Word, etc.) that COM-activated a suspicious process (mshta, powershell, cmd, etc.) via the DCOM broker (the LethalHTA / OLE embedding attack pattern). Only fires when the process tree enrichment confirms an actual subprocess was spawned (requires CAPEv2 behavior.py network_map COM enrichment).

**Bug fixes:**

- `martians_office.py`: Add COM-logical children check. The existing OS-process-tree walk misses LethalHTA spawns because mshta's OS parent is svchost, not Excel. Added `_check_com_martians()` that walks the enriched processtree for nodes with `com_logical_parent_pid` pointing to an Office process.
- `ransomware_message.py`: Fix `TypeError: can't use a bytes pattern on a string-like object` in re2. `indicators` were encoded to bytes and joined with `b"|"` producing a bytes regex, but `buff.lower()` returns a str. Changed to compile a str pattern.

gemini-code-assist

Code Review

This pull request introduces several new signatures for detecting malicious behavior, including COM-activated process spawning from Office applications, suspicious or invalid PE certificates, and headless browser launches used for credential theft. It also refactors the RansomwareMessage signature and updates MartiansOffice. Review feedback identifies several critical issues in the ransomware detection logic, specifically potential TypeError exceptions in Python 3 when applying string-based regex to raw bytes, and a crash risk during JSON serialization when reporting raw byte buffers. Additionally, the reviewer noted the accidental removal of the MassRansomNoteDrop class, a performance bottleneck in process tree traversal, and a regression caused by removing a filename fallback in the ransomware signature.

gemini-code-assist · 2026-05-12T16:26:22Z

+        buff = self.get_raw_argument(call, "Buffer")
+        if buff and len(buff) >= 128:
+            buff_lower = buff.lower()
+            matches = set(self.regex.findall(buff_lower))


In Python 3, using a string-compiled regex (self.regex) on a bytes object (buff_lower) will raise a TypeError. Since get_raw_argument returns bytes, you should either decode the buffer or compile the regex as bytes.

Suggested change

-            matches = set(self.regex.findall(buff_lower))
+            matches = set(self.regex.findall(buff_lower.decode("utf-8", errors="ignore")))

gemini-code-assist · 2026-05-12T16:26:22Z

-                matches = set(self.regex.findall(buff_lower))
+            if len(matches) > 1:
+                self.data.append({"ransom_note": filepath})
+                self.data.append({"beginning_of_ransom_message": buff})


Appending raw bytes (buff) to self.data will cause a crash during report generation when the results are serialized to JSON. Ensure the data is converted to a string.

Suggested change

-                self.data.append({"beginning_of_ransom_message": buff})
+                self.data.append({"beginning_of_ransom_message": buff.decode("utf-8", errors="ignore")})
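To illustrate the serialization concern: `json.dumps` rejects bytes values, so a decode step is needed before the value reaches the report. `to_reportable` is a hypothetical helper used only for this sketch.

```python
import json


def to_reportable(value):
    """Decode bytes so the value can be JSON-serialized; pass other values through."""
    if isinstance(value, (bytes, bytearray)):
        return bytes(value).decode("utf-8", errors="ignore")
    return value


# The trailing invalid byte is dropped by errors="ignore".
buff = b"All your files are encrypted\xff"
entry = {"beginning_of_ransom_message": to_reportable(buff)}
serialized = json.dumps(entry)
```

Using `errors="ignore"` silently drops undecodable bytes; `errors="replace"` would keep a visible marker instead, which may be preferable when the note's exact content matters.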

gemini-code-assist · 2026-05-12T16:26:23Z

+                    if isinstance(filedata, str):
+                        filedata = filedata.encode("utf-8", errors="ignore")


Encoding filedata to bytes here conflicts with the string-based regex used later on line 191, which will cause a TypeError in Python 3. If the regex is string-based, you should ensure filedata is a string.

Suggested change

-                    if isinstance(filedata, str):
-                        filedata = filedata.encode("utf-8", errors="ignore")
+                    if isinstance(filedata, (bytes, bytearray)):
+                        filedata = filedata.decode("utf-8", errors="ignore")

gemini-code-assist · 2026-05-12T16:26:23Z

-                for note_name, dirs in notes.items():
-                    if len(dirs) >= 5:
-                        self.data.append({"ransom_note": note_name, "pid": pid, "directories_count": len(dirs)})
        return self.ret


The MassRansomNoteDrop signature class was entirely removed from this file. If this was unintentional, it represents a significant loss in detection capability for ransomware that drops identical notes across many directories.
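For context, the detection at stake can be approximated as grouping written file names by directory. This sketch is not the removed class: `mass_note_drops` is a hypothetical function, and only the threshold of 5 directories is taken from the removed lines above.

```python
import ntpath
from collections import defaultdict


def mass_note_drops(written_paths, min_dirs=5):
    """Return {note_name: directory_count} for names written to many directories."""
    dirs_by_note = defaultdict(set)
    for path in written_paths:
        # ntpath handles Windows separators regardless of the host OS.
        directory, name = ntpath.split(path)
        dirs_by_note[name.lower()].add(directory.lower())
    return {
        name: len(dirs)
        for name, dirs in dirs_by_note.items()
        if len(dirs) >= min_dirs
    }
```

Using a set of directories per name means repeated writes to the same folder do not inflate the count.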

gemini-code-assist · 2026-05-12T16:26:23Z

+
+            # Very short validity (< 180 days)
+            try:
+                from datetime import datetime


Importing datetime inside the run method is inefficient as it will be re-imported every time the signature runs. Move this import to the top of the file.

gemini-code-assist · 2026-05-12T16:26:23Z

+            for parent in (
+                self.results.get("behavior", {}).get("processes", []) or []
+            ):
+                if parent.get("process_id") != parent_id:
+                    continue


This nested loop over the process list results in $O(N^2)$ complexity, which can significantly impact performance for analyses with many processes. Consider pre-indexing the processes by process_id into a dictionary for $O(1)$ lookup.
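A minimal sketch of the suggested pre-indexing, using the `process_id`, `parent_id`, and `module_path` fields visible in the diff. The helper names are hypothetical.

```python
def index_by_pid(processes):
    """Build a process_id -> process dict for constant-time parent lookup."""
    return {
        p.get("process_id"): p
        for p in processes
        if p.get("process_id") is not None
    }


def parent_paths(processes):
    """Return (child_pid, parent_module_path) pairs for processes with a known parent."""
    by_pid = index_by_pid(processes)
    pairs = []
    for proc in processes:
        parent = by_pid.get(proc.get("parent_id"))
        if parent is not None:
            pairs.append((proc.get("process_id"), parent.get("module_path", "")))
    return pairs
```

The index is built once in a single pass, so the overall cost is linear in the number of processes instead of quadratic.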

gemini-code-assist · 2026-05-12T16:26:23Z

+            "what happened",
+            "what happened",


The indicator "what happened" is duplicated in the list.

Suggested change

-            "what happened",
-            "what happened",
+            "what happened",

gemini-code-assist · 2026-05-12T16:26:23Z


    def on_call(self, call, process):
-        filepath = self.get_argument(call, "HandleName") or self.get_argument(call, "FileName")
+        filepath = self.get_argument(call, "HandleName")


Removing the fallback to FileName is a regression: HandleName is not always populated for NtWriteFile calls. This change may reduce detection coverage for ransom notes written via handles whose name is not resolved in the HandleName field.

Suggested change

-        filepath = self.get_argument(call, "HandleName")
+        filepath = self.get_argument(call, "HandleName") or self.get_argument(call, "FileName")

Copilot

Pull request overview

This PR adds several new detection signatures (PE Authenticode certificate anomalies, headless-browser stealer behavior, and Office→COM process activation) and updates existing ransomware/Office signatures to improve detection accuracy.

Changes:

- Added new signatures: pe_cert_suspicious.py, stealer_headless_browser.py, and com_process_activation.py.
- Updated martians_office.py to also detect COM-logical children spawned by Office (DCOM broker pattern).
- Refactored ransomware_message.py to fix the re2 bytes/str regex mismatch and adjusted indicator handling.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.

Show a summary per file

| File | Description |
| --- | --- |
| modules/signatures/windows/ransomware_message.py | Refactors buffer/regex handling and dropped-file scanning; also removes an additional signature class from the file. |
| modules/signatures/windows/martians_office.py | Adds COM-logical child detection to catch Office-spawned “martians” hidden behind svchost/DCOM. |
| modules/signatures/all/stealer_headless_browser.py | New signature to detect headless+silent browser launches from suspicious parent locations / multi-browser sweeps. |
| modules/signatures/all/pe_cert_suspicious.py | New signatures to flag self-signed, suspicious-issuer, and invalid Authenticode signatures. |
| modules/signatures/all/com_process_activation.py | New signature to detect Office COM-activated subprocesses via enriched process tree metadata. |

Comments suppressed due to low confidence (1)

modules/signatures/windows/ransomware_message.py:200

This PR removes the MassRansomNoteDrop signature entirely, but the PR description only mentions a regex TypeError fix for ransomware_message.py. If the removal is unintentional, restore/move the signature; if intentional, please document the behavior change (and consider deprecating instead of deleting to avoid breaking downstream expectations).

    def on_complete(self):
        if not self.ret and "dropped" in self.results:
            for dropped in self.results["dropped"]:

                raw_name = dropped.get("name", "")
                if isinstance(raw_name, list) and len(raw_name) > 0:
                    filename = str(raw_name[0]).lower()
                else:
                    filename = str(raw_name).lower()

                if (
                    filename.endswith((".txt", ".html", ".hta", ".rtf"))
                    or "read_me" in filename
                    or "readme" in filename
                    or "read-me" in filename
                ):
                    filedata = dropped.get("data")

                    if isinstance(filedata, str):
                        filedata = filedata.encode("utf-8", errors="ignore")

                    if filedata and len(filedata) >= 128:
                        filedata_lower = filedata.lower()
                        matches = set(self.regex.findall(filedata_lower))

                        if len(matches) > 1:
                            self.data.append({"ransom_note": filename})
                            self.data.append({"beginning_of_ransom_message": filedata})
                            self.ret = True
                            break

        return self.ret

+        buff = self.get_raw_argument(call, "Buffer")
+        if buff and len(buff) >= 128:
+            buff_lower = buff.lower()
+            matches = set(self.regex.findall(buff_lower))

-            if len(buff_str) >= 32:
-                buff_lower = buff_str.lower()
-                matches = set(self.regex.findall(buff_lower))
+            if len(matches) > 1:
+                self.data.append({"ransom_note": filepath})
+                self.data.append({"beginning_of_ransom_message": buff})

-                if len(matches) > 1:
+                if self.pid:
                    self.mark_call()
-                    return True
+                self.ret = True


+            "BTC",
+            "ethereum",
+            "what happened",
+            "what happened",


+BROWSER_RE = re.compile(
+    r'\\(?:chrome|brave|msedge|firefox|opera)\.exe',
+    re.IGNORECASE
+)
+
+SUSPICIOUS_PARENT_RE = re.compile(
+    r'\\(?:Temp|AppData|ProgramData|Users\\[^\\]+\\(?:AppData|Downloads)|Users\\Public)\\',
+    re.IGNORECASE
+)


+        for proc in (
+            self.results.get("behavior", {}).get("processes", []) or []
+        ):
+            path = proc.get("module_path", "") or proc.get("process_name", "") or ""
+            if not BROWSER_RE.search(path):
+                continue
+            parent_id = proc.get("parent_id")
+            if parent_id is None:
+                continue
+            # Find parent process
+            for parent in (
+                self.results.get("behavior", {}).get("processes", []) or []
+            ):
+                if parent.get("process_id") != parent_id:
+                    continue
+                parent_path = parent.get("module_path", "") or ""
+                if SUSPICIOUS_PARENT_RE.search(parent_path) and not LEGITIMATE_LAUNCHERS.search(parent_path):
+                    suspicious_parent = parent_path
+                    self.data.append({"suspicious_parent": parent_path})
+                    break
+            if suspicious_parent:


+            lower = cmd.lower()
+            if not BROWSER_RE.search(cmd):
+                continue
+            if "--headless" not in lower:


+from lib.cuckoo.common.abstracts import Signature
+
+
+def _get_pe(results):


+    OFFICE_ACTIVATORS = {
+        "excel.exe", "winword.exe", "powerpnt.exe", "outlook.exe",
+        "msaccess.exe", "mspub.exe", "visio.exe",
+    }
+
+    def run(self):
+        # Only report confirmed COM-spawned subprocesses visible in the enriched tree.
+        # Requiring com_logical_parent_pid avoids noise from normal JScript/WMI activations.
+        def walk(nodes):
+            for node in nodes:
+                lpid = node.get("com_logical_parent_pid")
+                lname = (node.get("com_logical_parent_name") or "").lower()
+                if lpid and os.path.basename(lname) in self.OFFICE_ACTIVATORS:
+                    self.data.append({
+                        "spawned": "%s (pid %s)" % (node.get("name"), node.get("pid")),
+                        "logical_parent": "%s (pid %s)" % (
+                            node.get("com_logical_parent_name"), lpid),
+                        "via": node.get("com_progid") or node.get("com_clsid", ""),
+                    })


wmetcalf · 2026-05-13T15:01:33Z

Round 2 fixes pushed (cab6740) — addressing all reviewer feedback:

ransomware_message.py: get_raw_argument returns bytes; now decoded to str before regex matching and before appending to self.data. filedata in on_complete is similarly decoded rather than encoded. Duplicate "what happened" entry removed. FileName fallback restored to on_call. MassRansomNoteDrop restored.

stealer_headless_browser.py: Replaced O(N²) nested loop with proc_by_pid dict for O(1) parent lookup. BROWSER_RE updated to (?<!\w) lookbehind so it matches both bare chrome.exe (process_name field) and full paths. Added Firefox -headless (single-dash) alongside --headless.
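The effect of the lookbehind and the single-dash flag can be checked with a short sketch. `BROWSER_RE` follows the `(?<!\w)` form described above; `HEADLESS_RE` and the two helper functions are assumptions for illustration, not the pushed code.

```python
import re

# (?<!\w) lets the pattern match both "chrome.exe" and "C:\...\chrome.exe",
# while rejecting names that merely end in a browser name.
BROWSER_RE = re.compile(
    r"(?<!\w)(?:chrome|brave|msedge|firefox|opera)\.exe", re.IGNORECASE
)
# Hypothetical pattern accepting both "--headless" and Firefox's "-headless".
HEADLESS_RE = re.compile(r"(?<![\w-])--?headless\b", re.IGNORECASE)


def is_browser(path):
    """True when the path or bare process name refers to a known browser binary."""
    return BROWSER_RE.search(path or "") is not None


def is_headless(cmdline):
    """True when the command line carries a headless flag in either dash style."""
    return HEADLESS_RE.search(cmdline or "") is not None
```

The lookbehind in `HEADLESS_RE` keeps flags such as `--no-headless` from matching, which a plain substring test would let through.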

pe_cert_suspicious.py: datetime import moved to module level. Re: the target.file.pe vs static.pe suggestion — in this codebase PE data (digital_signers, guest_signers) is normalised into a separate files MongoDB collection and transparently merged back into target.file.pe by a mongo_hook denormalize step on every mongo_find_one call. static.pe is always empty in practice. _get_pe() already tries static.pe first as a fallback; the operative data is at target.file.pe after denormalization. Verified on the live system: pe_cert_self_signed fires correctly.
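The lookup order described here amounts to something like the following. This is a guess at the shape of `_get_pe()` based only on the sentence above, written as a standalone `get_pe` with assumed key names `static`, `target`, `file`, and `pe`.

```python
def get_pe(results):
    """Return the PE dict from static.pe if populated, else from target.file.pe."""
    static_pe = (results.get("static") or {}).get("pe")
    if static_pe:
        return static_pe
    # After denormalization the operative data is expected under target.file.pe.
    target_file = (results.get("target") or {}).get("file") or {}
    return target_file.get("pe") or {}
```

Returning an empty dict when neither location is populated lets callers use `.get("digital_signers", [])` without extra None checks.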

com_process_activation.py: Noted — a follow-up can tighten the match to a suspicious-children set (mshta, powershell, cmd, wscript, etc.). Left as a known limitation in this PR.

Copilot AI review requested due to automatic review settings May 12, 2026 16:21

Copilot started reviewing on behalf of wmetcalf May 12, 2026 16:21

gemini-code-assist Bot reviewed May 12, 2026

Copilot AI reviewed May 12, 2026

wmetcalf added 2 commits May 12, 2026 16:59

fix: ransomware_message bytes/str handling + restore MassRansomNoteDrop; stealer O(N2) loop

3d7296f

fix: pe_cert_suspicious static.pe source; ransomware dup entry + FileName fallback; stealer browser regex + Firefox -headless

cab6740

wmetcalf wants to merge 3 commits into CAPESandbox:master from wmetcalf:feat/new-detection-signatures
