[REVIEW] container-security: add Windows scheduling and HostProcess identity evidence gates

## Skill Being Reviewed
**Skill name:** `container-security`
**Skill path:** `skills/cloud/container-security/`

## False Positive Analysis

**Benign code that triggers a false positive:**
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: windows-api
  namespace: apps
spec:
  replicas: 2
  selector:
    matchLabels:
      app: windows-api
  template:
    metadata:
      labels:
        app: windows-api
    spec:
      os:
        name: windows
      securityContext:
        windowsOptions:
          runAsUserName: "ContainerUser"
      nodeSelector:
        kubernetes.io/os: windows
      containers:
        - name: api
          image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2022
          ports:
            - containerPort: 80
```

**Why this is a false positive:**

The current Pod Security Standards quick reference in `SKILL.md` is Linux-centric: it lists `allowPrivilegeEscalation`, `seccompProfile`, Linux capabilities, and non-root UID-style controls without an OS-specific branch. For a Windows pod with `spec.os.name: windows`, Kubernetes treats several Linux security context fields differently or rejects them outright. A review that blindly requires Linux-only fields such as seccomp or Linux capabilities would incorrectly mark a valid Windows workload as non-compliant, and a remediation that adds those fields can break admission for Windows pods.

The skill should branch on `spec.os.name` and, for Windows workloads, validate Windows-specific controls such as `windowsOptions.runAsUserName`, HostProcess usage, node scheduling, Windows build compatibility, and gMSA authorization instead of forcing Linux-only hardening.
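As a rough illustration of that branch, the fragment below annotates the benign Deployment's pod template with the controls a reviewer would look for on Windows, and shows (in comments only) the Linux-only fields that should not be demanded. It is a sketch, not a complete policy; the commented-out values are illustrative examples of typical Linux hardening advice.

```yaml
# Pod template spec fragment for a Windows workload (sketch).
spec:
  os:
    name: windows                      # tells PSS evaluation this is a Windows pod
  securityContext:
    windowsOptions:
      runAsUserName: "ContainerUser"   # Windows identity control to verify
    # Linux-only fields a reviewer should NOT require here, for example:
    #   seccompProfile: {type: RuntimeDefault}
    #   runAsUser: 1000
  nodeSelector:
    kubernetes.io/os: windows          # placement evidence, checked separately
  containers:
    - name: api
      image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2022
      # Likewise, do not add a container securityContext containing:
      #   allowPrivilegeEscalation: false
      #   capabilities: {drop: ["ALL"]}
```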

## Coverage Gaps

**Missed variant 1: HostProcess pod identity and scheduling context**
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: windows-node-agent
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: windows-node-agent
  template:
    metadata:
      labels:
        app: windows-node-agent
    spec:
      os:
        name: windows
      hostNetwork: true
      serviceAccountName: windows-node-agent
      securityContext:
        windowsOptions:
          hostProcess: true
          runAsUserName: "NT AUTHORITY\\SYSTEM"
      containers:
        - name: collector
          image: ghcr.io/example/windows-node-agent:v1.0.0
```
**Why it should be caught:**

`cis-benchmarks.md` currently says only to check `windowsOptions.hostProcess: true`. That catches the flag, but it does not require the evidence needed to judge the risk:

- whether `hostProcess` is set at the pod level or container level
- whether `hostNetwork: true` is present as required for HostProcess pods
- which Windows identity is used through `runAsUserName`
- whether `NT AUTHORITY\SYSTEM` is justified, or whether a lower-privilege identity such as `NT AUTHORITY\Local service`, `NT AUTHORITY\NetworkService`, or a local user group on the node would work
- whether the workload is intentionally isolated to Windows nodes through `nodeSelector`, tolerations, or `RuntimeClass`
- whether the service account / RBAC grants match the host-level capability of the workload

HostProcess containers run with host access and have much weaker isolation than ordinary Windows containers. The skill should classify HostProcess findings by identity and scheduling evidence, not just by the presence of the `hostProcess` flag.
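For contrast, a lower-risk shape of the same DaemonSet pod template might look like the sketch below. It assumes the agent can function as `NT AUTHORITY\Local service`, which the source manifest does not establish, and adds explicit Windows placement. A reviewer would still need RBAC evidence for the `windows-node-agent` service account.

```yaml
# DaemonSet .spec.template.spec fragment (sketch, assumptions noted inline).
spec:
  os:
    name: windows
  hostNetwork: true                    # required for HostProcess pods
  serviceAccountName: windows-node-agent
  nodeSelector:
    kubernetes.io/os: windows          # explicit placement evidence
  securityContext:
    windowsOptions:
      hostProcess: true
      # Assumption: the collector does not need SYSTEM-level access.
      runAsUserName: "NT AUTHORITY\\Local service"
  containers:
    - name: collector
      image: ghcr.io/example/windows-node-agent:v1.0.0
```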

**Missed variant 2: `spec.os.name: windows` without effective Windows node placement**
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: windows-maintenance
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          os:
            name: windows
          containers:
            - name: job
              image: mcr.microsoft.com/windows/servercore:ltsc2022
              command: ["powershell.exe", "-File", "maintenance.ps1"]
          restartPolicy: OnFailure
```
**Why it should be caught:**

The Kubernetes scheduler does not use `spec.os.name` to place pods on matching nodes. Windows workloads need explicit placement evidence, such as `nodeSelector: kubernetes.io/os: windows` or a `RuntimeClass` whose scheduling section targets Windows nodes, together with tolerations for any taints on those nodes. A toleration on its own only permits scheduling onto a tainted node; it does not steer the pod there. Without that evidence, the pod can be scheduled onto a non-Windows node and fail to start, or bypass the intended Windows node pool controls.

This is especially important for Helm charts and Kustomize overlays, where a chart may set `spec.os.name: windows` but leave node placement to values. The review should require rendered-manifest evidence or values evidence for Windows placement.
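One form the placement evidence could take is a `RuntimeClass` with a scheduling section, referenced from the pod spec. The sketch below makes several assumptions that depend on the cluster: the RuntimeClass name, the `handler` value, and the `os=windows:NoSchedule` taint are hypothetical, and the build label assumes Windows Server 2022 nodes (build `10.0.20348`) to match the `ltsc2022` image.

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: windows-2022                   # hypothetical name
handler: example-container-runtime-handler   # hypothetical, runtime-specific
scheduling:
  nodeSelector:
    kubernetes.io/os: windows
    node.kubernetes.io/windows-build: "10.0.20348"
  tolerations:
    - key: os                          # assumes nodes are tainted os=windows:NoSchedule
      operator: Equal
      value: windows
      effect: NoSchedule
---
# CronJob .spec.jobTemplate.spec.template.spec fragment using the RuntimeClass.
spec:
  os:
    name: windows
  runtimeClassName: windows-2022
  containers:
    - name: job
      image: mcr.microsoft.com/windows/servercore:ltsc2022
      command: ["powershell.exe", "-File", "maintenance.ps1"]
  restartPolicy: OnFailure
```

A plain `nodeSelector: {kubernetes.io/os: windows}` on the pod spec is the simpler alternative when there is only one Windows build in the cluster.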

**Missed variant 3: gMSA credential spec authorization**
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: domain-integrated-api
spec:
  template:
    spec:
      os:
        name: windows
      serviceAccountName: app-sa
      securityContext:
        windowsOptions:
          gmsaCredentialSpecName: payroll-api-gmsa
          runAsUserName: "DOMAIN\\payroll-api$"
      nodeSelector:
        kubernetes.io/os: windows
      containers:
        - name: api
          image: ghcr.io/example/payroll-api:2.4.1
```
**Why it should be caught:**

Windows pods can use Group Managed Service Accounts (gMSA) for domain access. That creates a separate identity and authorization surface: the cluster needs the `GMSACredentialSpec` CRD, the gMSA mutating/validating admission webhooks, and RBAC that authorizes the pod's service account to use the referenced credential spec. The current skill covers Kubernetes RBAC and secrets generally, but does not prompt reviewers to verify gMSA credential-spec authorization or whether a Windows workload is obtaining domain privileges beyond its intended scope.
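The RBAC evidence a reviewer would look for typically resembles the sketch below: a ClusterRole granting the `use` verb on one named credential spec, bound to the workload's service account. The role and binding names are hypothetical, and the `default` namespace is an assumption because the Deployment above does not set one.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: payroll-api-gmsa-user          # hypothetical name
rules:
  - apiGroups: ["windows.k8s.io"]
    resources: ["gmsacredentialspecs"]
    verbs: ["use"]
    resourceNames: ["payroll-api-gmsa"]   # only this credential spec
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-sa-use-payroll-api-gmsa    # hypothetical name
  namespace: default                   # assumption: Deployment namespace
subjects:
  - kind: ServiceAccount
    name: app-sa
    namespace: default
roleRef:
  kind: ClusterRole
  name: payroll-api-gmsa-user
  apiGroup: rbac.authorization.k8s.io
```

A rule without `resourceNames`, or a binding to a broad group such as all service accounts in a namespace, would be worth flagging because it lets unrelated workloads assume the same domain identity.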

## Edge Cases

- `windowsOptions` can be defined at pod level or container level; container-level values override pod-level values. The review should inspect both regular containers and init containers.
- HostProcess pods are not just "privileged containers for Windows"; they require Windows-specific evidence including `hostNetwork`, `runAsUserName`, Windows node placement, and a justification for the selected Windows account.
- For Windows pods, remediation that adds Linux-only fields such as seccomp, Linux capabilities, or numeric `runAsUser` can break the manifest rather than harden it.
- `spec.os.name` is useful evidence for Pod Security Standards evaluation, but it is not scheduling evidence. Node selector, taints/tolerations, or RuntimeClass scheduling must be checked separately.
- Windows Server build compatibility matters when multiple Windows node versions exist in the same cluster; `node.kubernetes.io/windows-build` or RuntimeClass scheduling evidence can prevent workloads landing on incompatible nodes.
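The pod-level versus container-level override in the first edge case can be illustrated with a small Pod. This is a sketch with hypothetical names and placeholder commands; the point is that the init container ends up with a more privileged identity than the pod-level default suggests.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: windows-override-example       # hypothetical name
spec:
  os:
    name: windows
  nodeSelector:
    kubernetes.io/os: windows
  securityContext:
    windowsOptions:
      runAsUserName: "ContainerUser"   # pod-level default
  initContainers:
    - name: setup
      image: mcr.microsoft.com/windows/servercore:ltsc2022
      command: ["powershell.exe", "-Command", "Write-Output 'setup'"]
      securityContext:
        windowsOptions:
          runAsUserName: "ContainerAdministrator"   # overrides the pod-level value
  containers:
    - name: app
      image: mcr.microsoft.com/windows/servercore:ltsc2022
      command: ["powershell.exe", "-Command", "Start-Sleep -Seconds 3600"]
      # No container-level windowsOptions, so this inherits ContainerUser.
```

A review that reads only `spec.securityContext` would record `ContainerUser` and miss the init container running as `ContainerAdministrator`.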

## Remediation Quality

- [x] Fix resolves the vulnerability
- [x] Fix doesn't introduce new security issues
- [x] Fix doesn't break functionality
- **Issues found:** Existing remediation is strong for Linux Kubernetes hardening, but it should add an explicit Windows branch. For Windows workloads, recommend `runAsUserName`, node placement evidence, gMSA authorization checks, and HostProcess-specific least-privilege guidance. Avoid recommending Linux-only fields for pods with `spec.os.name: windows`.

Suggested remediation additions:

1. Add a "Windows workload branch" to Pod Security Standards evaluation:
   - if `spec.os.name: windows`, do not require Linux-only seccomp/capability/allowPrivilegeEscalation evidence
   - require Windows identity evidence through `windowsOptions.runAsUserName`
   - require node placement evidence via `nodeSelector`, tolerations, or RuntimeClass
2. Expand CIS 5.2.11 from "check for `hostProcess: true`" to a HostProcess evidence matrix:
   - pod-level and container-level `windowsOptions.hostProcess`
   - `hostNetwork: true`
   - `runAsUserName` identity and privilege level
   - service account / RBAC scope
   - dedicated namespace and privileged PSA exception justification
   - Windows node placement and Windows build compatibility
3. Add gMSA checks:
   - `gmsaCredentialSpecName` / `gmsaCredentialSpec` usage
   - gMSA webhook presence
   - RBAC authorizing only approved service accounts to use each credential spec
   - domain privilege review for the selected account
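For the "dedicated namespace and privileged PSA exception" item, the evidence could be a namespace that carries the Pod Security Admission labels explicitly, since HostProcess pods are not admitted under the Baseline or Restricted profiles. The namespace name and the justification annotation key below are hypothetical; only the `pod-security.kubernetes.io/*` labels are standard.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: windows-host-agents            # hypothetical dedicated namespace
  labels:
    pod-security.kubernetes.io/enforce: privileged   # needed for HostProcess pods
    pod-security.kubernetes.io/audit: baseline       # still surface violations in audit logs
    pod-security.kubernetes.io/warn: baseline
  annotations:
    # Hypothetical annotation key for recording the exception rationale.
    example.com/psa-exception-justification: "HostProcess node agent; reviewed by platform security"
```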

## Comparison to Other Tools

| Tool | Catches this? | Notes |
|------|:---:|-------|
| Semgrep | Partial | Custom YAML rules can find `hostProcess`, missing `nodeSelector`, or `gmsaCredentialSpecName`, but cross-field reasoning and OS-specific false-positive handling need custom policy logic. |
| CodeQL | No/Partial | CodeQL is not the natural fit for Kubernetes manifest policy evaluation. |
| Checkov / Trivy / Kubescape | Partial | These can detect many Kubernetes misconfigurations, but Windows-specific PSS branching, HostProcess identity severity, and gMSA authorization usually need policy tuning. |
| Kyverno / OPA Gatekeeper | Yes/Partial | Admission policies can enforce these checks, but the skill should still guide reviewers to collect the right evidence and avoid Linux-only false positives. |

## Overall Assessment

**Strengths:**

- Broad Docker and Kubernetes coverage with concrete CIS / NIST mapping.
- Good discovery patterns for Dockerfiles, manifests, Helm, Kustomize, and RBAC resources.
- Helpful common pitfalls for init containers, Helm overrides, default namespaces, NetworkPolicy behavior, and secrets.

**Needs improvement:**

- Add OS-specific Pod Security Standards handling for Windows pods.
- Expand HostProcess review beyond a single `hostProcess: true` flag.
- Require effective Windows scheduling evidence instead of treating `spec.os.name` as sufficient.
- Add gMSA credential-spec authorization checks for domain-integrated Windows workloads.

**Priority recommendations:**
1. Add a Windows PSS branch that avoids Linux-only false positives and checks `windowsOptions` instead.
2. Add a HostProcess evidence matrix covering identity, `hostNetwork`, RBAC, namespace exception, and Windows node placement.
3. Add Windows placement and gMSA checks to the output format so reviewers record evidence consistently.
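To make the third recommendation concrete, a per-workload evidence record might look something like the following. Every field name here is hypothetical and not taken from the skill's existing output format; the values are filled in from the HostProcess DaemonSet in "Missed variant 1", and `rbac_reviewed` is left `false` because the source does not show the RBAC objects.

```yaml
# Hypothetical evidence record (sketch); schema is illustrative only.
workload: kube-system/DaemonSet/windows-node-agent
os: windows
placement:
  node_selector: null                  # none present in the manifest
  tolerations: []
  runtime_class: null
  verdict: missing
identity:
  run_as_user_name: "NT AUTHORITY\\SYSTEM"
  set_at: pod                          # pod | container
  justification: null
host_process:
  enabled: true
  host_network: true
  service_account: windows-node-agent
  rbac_reviewed: false
gmsa:
  credential_spec_name: null
  rbac_use_binding: null
```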

References:

- Kubernetes Pod Security Standards: https://kubernetes.io/docs/concepts/security/pod-security-standards/
- Kubernetes Windows containers user guide: https://kubernetes.io/docs/concepts/windows/user-guide/
- Kubernetes HostProcess pods: https://kubernetes.io/docs/tasks/configure-pod-container/create-hostprocess-pod/
- Kubernetes RunAsUserName for Windows pods: https://kubernetes.io/docs/tasks/configure-pod-container/configure-runasusername/
- Kubernetes GMSA for Windows pods: https://kubernetes.io/docs/tasks/configure-pod-container/configure-gmsa/

## Bounty Info
- [x] I have read and agree to the [CONTRIBUTING.md](../../CONTRIBUTING.md) bounty terms
- **Preferred payment method:** GitHub Sponsors

