Reboot-time deletion + full COM surface cleanup #2

Open
rianbk wants to merge 1 commit into nullifyac:main from rianbk:reboot-deletion-rewrite

rianbk commented May 18, 2026

Fixes #1

Summary

Substantial rewrite to handle the case where Lenovo AI Now's shell-extension DLLs are loaded into other shell-using processes (Explorer, Office, browsers, anything that opens a file dialog) and can't be deleted in-session. The existing approach loses against AutoRestartShell when killing Explorer and can't unload the DLLs from the 17+ other consumers anyway, so most devices end up with 270 residual files and exit 1603 after every remediation cycle.

This PR pivots the remediation to PendingFileRenameOperations + reboot as the primary path. Validated end-to-end on two real Lenovo devices.

Major changes

  • Two-phase model: Phase A cleans what's cleanable in-session and queues locked files via PFRO, writes a sentinel registry value, returns 3010 (Intune-compatible success). Phase B happens after the user's organic reboot — SMSS clears the install dir at next boot before any shell process exists, next detect cycle confirms.
  • Self-relaunch under 64-bit PowerShell. Intune defaults Proactive Remediation scripts to 32-bit, which silently WOW64-redirects HKLM:\SOFTWARE reads to Wow6432Node. The AI Now uninstall key and all three shell-ext CLSIDs live in the 64-bit hive, so a 32-bit script ran blind and cleaned nothing.
  • Full COM registration surface mapped (via direct reg query on a live install): 3 CLSIDs + AppID + TypeLib + 2 ProgIDs + 10 shellex handler subkeys + ShellIconOverlayIdentifiers entry + Shell Extensions\Approved value. Hardcoded list + dynamic .NET Microsoft.Win32.Registry walk as backup (the cmdlet-based walk took 25+ minutes on real devices and timed out under Intune).
  • MSIX/AppX handling. AI Now ships AINowContextWIN11 as a Win11 context-menu MSIX package — Remove-AppxPackage handles the main package but leaves orphaned per-user repository stubs at HKU\<sid>\Software\Classes\Local Settings\...\AppModel\Repository, which this PR scrubs across every user SID plus the HKLM AppxAllUserStore mirrors.
  • Service disable-before-stop (sc config start= disabled → Stop-Service → WaitForStatus('Stopped', 30s) → sc delete) to prevent auto-restart races.
  • Wildcard match for the uninstall DisplayName — the existing ^Lenovo AI Now\b regex silently failed on the actual Lenovo AI Now 1.3 entry, so the uninstall registry key survived every cycle.
  • -LiteralPath for registry paths containing literal * — the existing -Path treated * as a wildcard and expanded across every HKCR subkey, causing multi-minute hangs in the shellex handler loops.
  • Removed dead code (Invoke-Uninstaller, Invoke-MsiUninstall) and the robocopy /MIR fallback (could deadlock against kernel-locked DLLs).
  • Sentinel-based loop suppression in detect so PFRO entries don't bloat between Phase A and reboot.
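
The Phase A queueing step can be sketched roughly as below. This is a minimal illustration, not the PR's code: it assumes entries are queued through the Win32 `MoveFileEx` API with `MOVEFILE_DELAY_UNTIL_REBOOT` (the documented way to append to `PendingFileRenameOperations`), and the names `PfroSketch.NativeMethods` and `Add-PendingDelete` are hypothetical.

```powershell
# P/Invoke wrapper for kernel32!MoveFileEx. Namespace and type name are arbitrary.
Add-Type -Namespace PfroSketch -Name NativeMethods -MemberDefinition @'
[DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Unicode)]
public static extern bool MoveFileEx(string lpExistingFileName, string lpNewFileName, int dwFlags);
'@

$MOVEFILE_DELAY_UNTIL_REBOOT = 0x4

function Add-PendingDelete {
    param([Parameter(Mandatory)][string]$Path)

    # A NULL destination means "delete at next boot". [NullString]::Value is needed
    # because PowerShell would otherwise marshal $null as an empty string.
    $ok = [PfroSketch.NativeMethods]::MoveFileEx(
        $Path, [NullString]::Value, $MOVEFILE_DELAY_UNTIL_REBOOT)

    if (-not $ok) {
        # Best-effort error code; requires administrative rights to succeed at all.
        $err = [System.Runtime.InteropServices.Marshal]::GetLastWin32Error()
        Write-Warning "MoveFileEx failed for '$Path' (Win32 error $err)"
    }
    return $ok
}
```

Under these assumptions, Phase A would call `Add-PendingDelete` for each locked file, then for the directories (deepest first, since a directory can only be removed at boot once it is empty), write the sentinel, and `exit 3010`.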

How it was validated

End-to-end on two real Lenovo devices:

  • Device 1 (fresh-install state, 270 files + locked DLLs): exercised the full PFRO + reboot path. Phase A queued files and wrote the sentinel, user rebooted, install dir cleared, Phase B exited 0.
  • Device 2 (mid-cleanup state from previous failed runs): exercised direct deletion after natural reboot. Cleaned up successfully in 28 seconds.

Plus MDE Advanced Hunting across a 36-device fleet to confirm the scope of DLL loading (18 different processes load OverlayIcon.dll, 5 load AINppShell.dll).

Caveats

  • Tested only against Lenovo AI Now 1.3. Future versions with different CLSID GUIDs would still be caught by the dynamic walk (matched by DLL filename + parent path containing \Lenovo\Lenovo AI), but truly new DLL filenames would need the $lenovoAIDllFileNames list extended.
  • Doesn't handle Lenovo Vantage / Commercial Vantage re-pushing AI Now after removal (out of scope; needs a Vantage policy change).
  • Sibling Lenovo AI-family products (AI Solution, AI Meeting Manager) explicitly out of scope.
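
The dynamic-walk match described in the first caveat could look something like the following. It is a sketch under assumptions: the two DLL names are the ones mentioned in this PR (the real `$lenovoAIDllFileNames` list may contain more), and `Test-LenovoAIDllPath` is a hypothetical helper name.

```powershell
# DLL names taken from the PR text; the actual list in the script may be longer.
$lenovoAIDllFileNames = @('AINppShell.dll', 'OverlayIcon.dll')

function Test-LenovoAIDllPath {
    param([string]$DllPath)

    if ([string]::IsNullOrWhiteSpace($DllPath)) { return $false }

    # Registry values are sometimes quoted; strip quotes and whitespace.
    $clean = $DllPath.Trim().Trim('"')

    # Take everything after the last backslash as the file name.
    $fileName = $clean.Substring($clean.LastIndexOf('\') + 1)

    # -contains and -like are case-insensitive by default.
    $nameMatches = $lenovoAIDllFileNames -contains $fileName
    $pathMatches = $clean -like '*\Lenovo\Lenovo AI*'

    return ($nameMatches -and $pathMatches)
}
```

Requiring both conditions is what keeps a new CLSID GUID detectable while avoiding false positives on an unrelated `OverlayIcon.dll` elsewhere on disk. A new DLL file name would fail the first condition, which is the limitation noted above.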

Test plan

  • Detect on a clean Win11 box (no AI Now) → exit 0
  • Detect on a box with AI Now installed → exit 1, all components surfaced in stdout
  • Remediate on a fresh AI Now install → exit 3010, PFRO populated, sentinel written
  • Reboot → SMSS clears the install dir
  • Re-run detect after reboot → exit 0
  • Remediate on a mid-cleanup device → exit 0 directly
  • Sentinel suppression — second detect during pending-reboot window exits 0
  • Multi-user machine (defaultuser0 + a real user) cleans both profiles
  • Confirm against Lenovo AI Now versions other than 1.3 (only 1.3 available for testing)

🤖 Discovery and authoring assisted with Claude Code.

The previous in-session deletion approach can't succeed against
AINppShell.dll / OverlayIcon.dll because they're loaded into virtually
every shell-using GUI process on the machine -- Explorer, Office apps,
browsers, anything that opens a file dialog. Killing Explorer loses
the race against AutoRestartShell (~1s respawn) and can't unload the
DLL from the other consumers anyway. On real devices this left 270
residual files and exit 1603 after every cycle.

Major architectural changes:

* Pivot to PendingFileRenameOperations + reboot as the primary path
  for the install directory. Phase A cleans what's cleanable in-session,
  queues locked files via PFRO, writes a sentinel, returns 3010
  (Intune-compatible success). Phase B fires after the user's organic
  reboot -- install dir clears at boot via SMSS before any shell process
  exists, next detect cycle confirms.

* Self-relaunch under 64-bit PowerShell. Intune defaults Proactive
  Remediation scripts to 32-bit PowerShell, which silently WOW64-
  redirects HKLM:\SOFTWARE reads to Wow6432Node -- the AI Now uninstall
  key and all three shell-ext CLSIDs are in the 64-bit hive, so the
  32-bit script ran blind and cleaned nothing. Both detect.ps1 and
  remediate.ps1 now relaunch themselves via SysNative if started 32-bit.
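
A self-relaunch guard along these lines would sit at the top of each script. This is a sketch, not the PR's code: it assumes the scripts take no parameters, and `Get-SysNativePowerShellPath` is a hypothetical helper name.

```powershell
function Get-SysNativePowerShellPath {
    # Only a 32-bit process on 64-bit Windows needs the relaunch.
    if ([Environment]::Is64BitOperatingSystem -and -not [Environment]::Is64BitProcess) {
        # SysNative is a virtual alias visible only to 32-bit processes; it bypasses
        # file-system redirection and resolves to the real System32.
        return (Join-Path $env:WINDIR 'SysNative\WindowsPowerShell\v1.0\powershell.exe')
    }
    return $null
}

$ps64 = Get-SysNativePowerShellPath
if ($ps64 -and (Test-Path -LiteralPath $ps64)) {
    & $ps64 -NoProfile -ExecutionPolicy Bypass -File $PSCommandPath
    exit $LASTEXITCODE   # propagate the 64-bit run's exit code back to Intune
}

# From here on the script runs 64-bit (or on a 32-bit OS, where no redirection applies).
```

Propagating `$LASTEXITCODE` matters here, because the detect and remediate exit codes (0, 1, 3010) are what Intune acts on.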

* Full mapping of the COM registration surface (confirmed via reg query
  on a live install): 3 CLSIDs + AppID + TypeLib + 2 ProgIDs + 10 shellex
  handlers across {*, Directory, Drive, Folder, lnkfile} x
  {ContextMenu, DragDrop} + ShellIconOverlayIdentifiers (with the
  whitespace-prefixed subkey name Lenovo uses to game the 15-slot limit)
  + Shell Extensions\Approved value. Hardcoded list + dynamic .NET
  registry-API walk as backup, anchored to filename + parent path
  containing \Lenovo\Lenovo AI.

* MSIX/AppX handling. AI Now 1.3 ships AINowContextWIN11 as a packaged
  Win11 context-menu extension; Remove-AppxPackage handles the main
  package but leaves orphaned per-user repository stubs at
  HKU\<sid>\Software\Classes\Local Settings\...\AppModel\Repository
  which Remove-LenovoAINowAppxRepositoryStubs scrubs explicitly across
  every resolved user SID, plus the HKLM AppxAllUserStore mirrors.

* Disable services before stopping (sc config start= disabled ->
  Stop-Service -> WaitForStatus('Stopped', 30s) -> sc delete) to
  prevent auto-restart races during cleanup.
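
The disable-stop-wait-delete order could be written roughly as follows. `Remove-ServiceSafely` is a hypothetical name and the service name is left as a parameter, since the PR text does not list the services involved.

```powershell
function Remove-ServiceSafely {
    param([Parameter(Mandatory)][string]$Name)

    $svc = Get-Service -Name $Name -ErrorAction SilentlyContinue
    if (-not $svc) { return }   # nothing to do

    # 1. Disable first so nothing can start the service again mid-cleanup.
    #    sc.exe expects "start=" and "disabled" as separate tokens.
    & sc.exe config $Name start= disabled | Out-Null

    # 2. Stop it and wait up to 30 seconds for the Stopped state.
    if ($svc.Status -ne 'Stopped') {
        Stop-Service -Name $Name -Force -ErrorAction SilentlyContinue
        try {
            $svc.WaitForStatus('Stopped', [TimeSpan]::FromSeconds(30))
        } catch {
            Write-Warning "Service '$Name' did not reach Stopped within 30 seconds"
        }
    }

    # 3. Remove the service registration.
    & sc.exe delete $Name | Out-Null
}
```

`sc.exe` is spelled out in full because `sc` alone is an alias for `Set-Content` in Windows PowerShell.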

* Wildcard DisplayName match. The original ^Lenovo AI Now\b regex
  silently failed against the actual "Lenovo AI Now 1.3" entry on
  observed devices, so the Uninstall registry key survived every
  remediation cycle. Both scripts now use the same -like wildcard.

* -LiteralPath for registry paths containing literal '*'. The default
  -Path treats '*' as a wildcard and expanded across every HKCR subkey,
  causing multi-minute hangs in the shellex handler loops.
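
To illustrate the `-LiteralPath` point, the sketch below builds the 5 x 2 handler locations and removes one handler subkey from each. It assumes the standard `shellex\ContextMenuHandlers` and `shellex\DragDropHandlers` key names; the function names and the `$HandlerName` parameter are hypothetical, since the handler subkey name is not given in this message.

```powershell
function Get-ShellexHandlerPaths {
    param([Parameter(Mandatory)][string]$HandlerName)

    $classes  = '*', 'Directory', 'Drive', 'Folder', 'lnkfile'
    $handlers = 'ContextMenuHandlers', 'DragDropHandlers'

    foreach ($c in $classes) {
        foreach ($h in $handlers) {
            "Registry::HKEY_CLASSES_ROOT\$c\shellex\$h\$HandlerName"
        }
    }
}

function Remove-ShellexHandler {
    param([Parameter(Mandatory)][string]$HandlerName)

    foreach ($path in Get-ShellexHandlerPaths -HandlerName $HandlerName) {
        # -LiteralPath keeps the '*' class key from being expanded as a wildcard
        # across every subkey of HKCR.
        if (Test-Path -LiteralPath $path) {
            Remove-Item -LiteralPath $path -Recurse -Force
        }
    }
}
```

With `-Path`, the first two of these ten paths would be treated as patterns and evaluated against every key under HKCR, which is the source of the hang described above.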

* .NET Microsoft.Win32.Registry API for the dynamic CLSID hive walk
  (~200ms vs ~40s via Get-ChildItem cmdlet on a 7k-key hive).
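
A walk over the CLSID hive using the .NET API directly might look like the sketch below. It is illustrative only: it assumes the walk targets `HKLM\SOFTWARE\Classes\CLSID` in the 64-bit view and inspects the default value of each `InprocServer32` subkey; `Find-ClsidByInprocServer` is a hypothetical name.

```powershell
function Find-ClsidByInprocServer {
    param([Parameter(Mandatory)][scriptblock]$Predicate)

    # Open the 64-bit view explicitly so the result does not depend on process bitness.
    $base = [Microsoft.Win32.RegistryKey]::OpenBaseKey(
        [Microsoft.Win32.RegistryHive]::LocalMachine,
        [Microsoft.Win32.RegistryView]::Registry64)
    try {
        $clsidRoot = $base.OpenSubKey('SOFTWARE\Classes\CLSID')
        if ($null -eq $clsidRoot) { return }
        try {
            foreach ($name in $clsidRoot.GetSubKeyNames()) {
                $inproc = $null
                try {
                    $inproc = $clsidRoot.OpenSubKey("$name\InprocServer32")
                    if ($null -eq $inproc) { continue }
                    $dll = [string]$inproc.GetValue('')   # default value = server DLL path
                    if (& $Predicate $dll) { $name }      # emit the matching CLSID
                } catch {
                    # Skip keys that cannot be opened (for example, access denied).
                } finally {
                    if ($inproc) { $inproc.Dispose() }
                }
            }
        } finally {
            $clsidRoot.Dispose()
        }
    } finally {
        $base.Dispose()
    }
}

# Example predicate, anchored to the parent path named in this message:
# Find-ClsidByInprocServer -Predicate { param($p) $p -like '*\Lenovo\Lenovo AI*' }
```

The saving comes from skipping the provider layer: no `PSObject` is built per key, and only one value per CLSID is read.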

* Sentinel + suppression. After Phase A writes
  HKLM:\SOFTWARE\LenovoAINowRemediation\PhaseAComplete, detect.ps1
  short-circuits to exit 0 while the sentinel is fresh (<=7 days) and
  PFRO still references Lenovo entries. Prevents the detect-remediate
  loop bloat that would otherwise happen between Phase A and reboot.
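
The suppression decision can be reduced to a small pure function. This sketch makes two assumptions that this message does not confirm: that the sentinel stores an ISO 8601 timestamp string, and that "Lenovo entries" in PFRO are recognised by the `\Lenovo\Lenovo AI` path fragment. `Test-SuppressDetection` is a hypothetical name.

```powershell
function Test-SuppressDetection {
    param(
        [string]$SentinelTimestamp,        # assumed: ISO 8601 string read from PhaseAComplete
        [string[]]$PendingRenameEntries,   # contents of PendingFileRenameOperations
        [datetime]$Now = (Get-Date),
        [int]$MaxAgeDays = 7
    )

    if ([string]::IsNullOrWhiteSpace($SentinelTimestamp)) { return $false }

    $written = [datetime]::MinValue
    $parsed = [datetime]::TryParse(
        $SentinelTimestamp,
        [cultureinfo]::InvariantCulture,
        [System.Globalization.DateTimeStyles]::None,
        [ref]$written)
    if (-not $parsed) { return $false }

    # Fresh means written no more than $MaxAgeDays ago.
    $fresh = ($Now - $written).TotalDays -le $MaxAgeDays

    # PFRO must still reference at least one Lenovo AI path.
    $lenovoEntries = @($PendingRenameEntries | Where-Object { $_ -like '*\Lenovo\Lenovo AI*' })

    return ($fresh -and ($lenovoEntries.Count -gt 0))
}
```

If either condition fails (the sentinel has expired, or PFRO no longer references Lenovo paths because the reboot happened), detection falls through to the normal checks.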

* Removed dead code (Invoke-Uninstaller, Invoke-MsiUninstall) and the
  robocopy /MIR fallback (could deadlock against kernel-locked DLLs).

* Files renamed to *.tobedeleted before PFRO queue so detect's file
  count doesn't see them between Phase A and reboot.
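
The rename step might be sketched as follows; `Rename-ForDeferredDelete` is a hypothetical name, and the returned path is what would then be queued for deletion at reboot.

```powershell
function Rename-ForDeferredDelete {
    param([Parameter(Mandatory)][string]$Path)

    $newPath = "$Path.tobedeleted"

    # A rename within the same directory typically succeeds even while a DLL is
    # mapped into other processes, whereas deleting it does not.
    Move-Item -LiteralPath $Path -Destination $newPath -Force -ErrorAction Stop

    return $newPath   # queue this path for reboot-time deletion, not the original
}
```

For this to have the intended effect, detect's file count has to exclude the `.tobedeleted` extension.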

Validated end-to-end on two real Lenovo devices: one starting from
full fresh-install state with locked DLLs (exercised PFRO + reboot
path), one in mid-cleanup state from prior partial runs (exercised
direct deletion after natural reboot). Both reached exit 0.

Adds tools/diagnose_clsid_walk.ps1 for measuring registry-walk
performance on a specific device (useful when diagnosing hangs).

Updates README.md to reflect the new architecture, requirements
(64-bit PowerShell setting), exit code semantics, and monitoring
KQL queries for fleet rollout.

Development

Successfully merging this pull request may close these issues.

AINow still appears in winget list after uninstall via script