diff --git a/Documentation/security-foundations/README.md b/Documentation/security-foundations/README.md new file mode 100644 index 00000000000..d8586b6e529 --- /dev/null +++ b/Documentation/security-foundations/README.md @@ -0,0 +1,13 @@ +# .NET security framing guides + +This folder contains a set of framing documents guiding the .NET team's security posture. These guides cover how we perform security designs for our components, the contracts these components have with users, and how we assess reported vulnerabilities in these components. They are specific to .NET's unique shape of being a provider of reusable libraries and SDK components rather than a standalone deployed application. + +This folder also contains _prototype_ skills used to generate security design documents / threat models and to assess incoming vulnerability reports. + +## Primary documents + +There are two primary documents located here. These documents serve as the foundation for all other documents present. + +[**Baseline security assumptions**](baseline-security-assumptions.md) covers the implicit contract governing consumers' proper use of library components, expected environmental configuration, and the interaction between components in a system. It's the set of things incorporated by reference into all .NET security documentation, even if that documentation never explicitly says so. + +[**Vulnerability theory**](vulnerability-theory.md) describes .NET's operating definition of what constitutes a theoretical product vulnerability. Because we define vulnerabilities in terms of excess privilege grants, this can be combined with the baseline document above to derive a non-exhaustive list of behaviors that are categorically _not_ vulnerabilities. 
diff --git a/Documentation/security-foundations/baseline-security-assumptions.md b/Documentation/security-foundations/baseline-security-assumptions.md new file mode 100644 index 00000000000..86362478119 --- /dev/null +++ b/Documentation/security-foundations/baseline-security-assumptions.md @@ -0,0 +1,279 @@ +# Baseline security assumptions + +This document frames a set of baseline assumptions that consumers and implementers of .NET components can simply _assume_. These assumptions determine how .NET reasons about a component's authority boundary and the trusted actors that operate within that boundary. They are incorporated by default into all .NET documentation and threat models, and they need not be restated. + +The baseline described here is not intended to immutably bind all .NET components. Components may in some cases need to expand the authority boundary (e.g., fully trust an external entity) or narrow the authority boundary (e.g., treat another same-user process as less privileged). The baseline does not prohibit these modifications. However, because these changes fall outside the baseline, the component author should prominently document these deviations and any obligation they place on consumers or integrators. + +## Audience and how to read this document + +The primary audience for this document is the .NET team, which creates libraries, tools, and SDKs. This document does not itself discuss how to create a threat model; it instead discusses general principles that those threat models or security design documents can rely on as invariants. + +A secondary audience for this document is security researchers who are submitting vulnerability reports against .NET. By documenting how the .NET team thinks about security invariants, we hope researchers can more productively tailor their investigations to areas that do cross true boundaries and do not violate an invariant expected by .NET. 
Vulnerability reports that are predicated on an invariant being violated are closed as won't fix or by design. See ["Vulnerability theory"](vulnerability-theory.md) for .NET's formal definition of what constitutes a vulnerability. + +The first several sections of this document list axioms that all .NET components -- both those written by the .NET team and those written by third-party .NET developers -- can use as foundational underpinnings for their own security analyses. We do not tend to state these in API or product documentation; they are simply _assumed_. + +The latter sections describe various interpretations of these axioms as they apply to components created by the .NET team. These are more high-level consequences than direct axioms, but they should still derive naturally from the axioms, and .NET-authored components can assume these interpretations without explicitly stating them in documentation. + +## Defining "fully trusted" + +Many of the items below use the term **_fully trusted_**. We assume that a fully trusted authority: + +* may legitimately exercise total control over the process's execution flow; +* exercises that authority intentionally and not in error; and +* exercises that authority without adversarial influence. + +Or, more bluntly: a fully trusted authority tells us what to do, and it's not our place to second-guess them. + +> [!NOTE] +> +> This doesn't mean the fully trusted authority _is in fact_ free of misconfiguration or compromise; real systems can and do fail in those ways. It means that, for the purposes of threat assessment, we must _model_ the fully trusted authority as intentional, correct, and uncompromised, because we cannot meaningfully defend against failures in that authority. +> +> Similarly, the assumptions listed here are not necessarily factual claims about the world. For example, nothing prevents a user from making their home directory world-writable. They are instead axioms that bound the model. 
If a baseline assumption is violated, the model is no longer applicable, and **we make no security guarantees whatsoever under such conditions.** Components that must operate in environments where a baseline assumption does not hold must declare that as part of their model, and they must redefine their security goals accordingly. + +> [!IMPORTANT] +> +> Under this definition, **any entity that can legitimately influence execution flow is fully trusted.** This includes both sysadmins and non-sysadmin operators. +> +> This is distinct from the OS's notion of privilege level, which is primarily concerned with preventing non-sysadmin operators from making sweeping changes that impact _other_ accounts. Our models are scoped to the application, not the entire machine, so OS privilege level does not solely determine trust. +> +> Since non-sysadmin operators are often able to control how the application is launched, configured, deployed, or supplied with operational inputs (such as environment variables, command-line arguments, and configuration files), they are considered "fully trusted" under our model. + +## 1. Operating system axioms + +### 1.1. The operating system provides account isolation. + +The operating system enforces account isolation through ACLs, integrity levels, and other access‑control mechanisms that prevent lower‑privilege accounts from modifying or reading resources owned by higher‑privilege accounts. These same controls prevent standard user accounts from modifying resources belonging to other standard user accounts. Higher‑privilege accounts are not restricted from accessing resources owned by lower‑privilege accounts. + +### 1.2. Any entity with equivalent or higher effective privilege is fully trusted. + +Any code running within the same process shares the process's privilege level and is therefore fully trusted. ([Sec. 3.3](#33-in-process-composition-is-not-a-security-boundary) revisits this from an application-composition perspective.) 
Additionally, any other process running under the same user account (if at an equal or higher integrity level) or under a higher-privileged account (such as SYSTEM or root) is fully trusted. + +Note that the inverse is not true. Integrity level by itself does not dictate the full set of privileges available to a process. Similarly, non-SYSTEM accounts may hold administrative-equivalent privileges. One cannot conclude _solely from integrity level or user account_ that a secondary process definitively lies beyond the current process's trust boundary. For related Windows guidance, see [Non-boundaries](https://www.microsoft.com/msrc/windows-security-servicing-criteria). + +> [!NOTE] +> +> This extends to physical access: an operator with physical access to the machine who can exercise control over the operating system itself is therefore fully trusted. + +### 1.3. Diagnostic tools are fully trusted. + +When a diagnostic tool attaches a debugger to the process, inspects its memory, or captures a crash dump, a fully trusted authority (the OS, the host, or an in-process diagnostic component) has already validated that the requestor has sufficient privilege to perform that action. Instructions from the diagnostic tool are therefore fully trusted. + +> [!NOTE] +> +> This analysis is from the perspective _of the debuggee._ A debugger who attaches to a target running under a different authority or who processes a memory dump originating from a different authority **must not** make trust assumptions about the target. + +This extends to the contents of the dumps themselves: an operator in possession of a dump is considered fully trusted. That operator can restore the process from the snapshot and can control execution flow, including potentially exercising some of the same authority (e.g., accessing the same resources) as the original process. 
Because there is no general way for a component to identify and scrub all privilege-granting state from a dump, we must assume that, in the extreme, an operator with access to the dump contents can exercise the same authority as the original process. + +### 1.4. Certain directories provide access control guarantees. + +OS configuration directories and globally installed executable directories are writable only by sysadmins; their contents are therefore fully trusted. + + * On Windows, this includes but is not limited to `%WINDIR%` and `%PROGRAMFILES%`. + * On non-Windows platforms, this includes but is not limited to `/etc`, `/usr/bin`, and `/usr/lib`. + +The OS affords the user's home directory certain access control protections. + + * On Windows, `%USERPROFILE%` can only be accessed (read or write) by the target user; it is inaccessible to other less-privileged accounts. + * On non-Windows platforms, `~/` can only be written to by the target user. + +> [!WARNING] +> +> Some non-Windows platforms _may_ permit world-read access to the user's home directory. Even on platforms that have moved to more restrictive defaults, user accounts created prior to an OS upgrade may retain overly broad permissions. Additionally, some .NET-supported Linux distros continue to grant broad permissions by default. +> +> On these platforms, components **must not** assume confidentiality of the data in the user's home directory unless they have confirmed that the directory is not accessible by unauthorized accounts. + +On Windows only, the .NET API `Path.GetTempPath` and the Win32 API `GetTempPath2` return a directory that is accessible only to the current user. + +> [!WARNING] +> +> The temp directory guarantees above apply **only** to .NET's `Path.GetTempPath` on Windows (backed by `GetTempPath2`).
Callers must not assume equivalent isolation from: +> +> * the Win32 API `GetTempPath`, which may resolve to a shared location such as `C:\Windows\Temp`; +> * any other framework's `GetTempPath`-equivalent API on Windows; or +> * the temp directory on non-Windows platforms. + +### 1.5. The machine is correctly configured. + +The sysadmin configures the operating system and machine-wide settings in accordance with established best practices. This includes system clock accuracy, TLS root certificate stores, DNS resolver configuration, and any other OS-level settings that security-sensitive components may depend on. + +What constitutes "established best practices" is determined by the vendor, the organization, or both; and defining this is outside the scope of this document. We simply assume the responsible party has made a reasonable and informed effort to keep the environment secure. + +### 1.6. The machine and environment are up-to-date. + +The sysadmin keeps the operating system, globally installed applications, and other machine-wide components up-to-date with security patches. + +The operator keeps the execution environment, all dependencies, available tools, and configuration up-to-date with security patches and configured in accordance with established best practices. That is, for anything that is expected to be on the machine prior to the .NET installer or workload being invoked, the operator bears responsibility for maintenance. + +## 2. Environmental axioms + +### 2.1. Application folders are fully trusted. + +An application folder is any directory that contains code or assets that the application _may_ load, not necessarily what it _does_ load. This includes any directory which: + +* contains the base executable image; +* contains plugins or other dynamically-loaded code (see [Sec. 3.3](#33-in-process-composition-is-not-a-security-boundary)); +* contains configuration, resources, or other application assets (see [Sec. 
3.1](#31-application-configuration-is-fully-trusted)); or +* is referenced by `%PATH%` or similar environment variables (see [Sec. 2.2](#22-environment-variables-are-trusted-as-a-control-plane-mechanism) below). + +These directories are trusted because their contents can influence execution flow, and we previously defined that any authority who can influence execution flow is fully trusted. This classification of "application folder" derives from the fact that the authority exercises control over these locations, not that the locations have any special meaning to the OS. Similarly, given that application assets and resources may influence execution flow within the application, their containers are considered application folders under this model. + +> [!IMPORTANT] +> +> This **does not** automatically include `cwd`, and models **must not** implicitly assume `cwd` to be trustworthy. Users typically expect `cwd` to contain data rather than instructions, so treating it as a trusted application folder can silently expand the trust boundary in ways that contradict user expectations. +> +> However, we recognize that we are in the business of providing CLI dev tools (e.g., `dotnet`, `msbuild`), and there are legitimate scenarios where we are expected to treat the contents of `cwd` as control flow instructions or executable code. If a component does need to treat `cwd` as trustworthy, the model must expressly call this out rather than leave it as an unwritten assumption. + +> [!WARNING] +> +> Documentation should avoid encouraging users to run executables directly from `%USERPROFILE%\Downloads`. Doing so treats the _Downloads_ folder as a trusted application folder because the fully trusted user has exercised their authority to execute code from that location, not because the folder is inherently trustworthy. 
This expands the trust boundary to include all contents of the _Downloads_ folder, which may surprise end users and can inadvertently grant an adversary control over execution flow. + +### 2.2. Environment variables are trusted as a control-plane mechanism. + +The environment block can only be set by a host authority who is able to launch the application or influence its execution flow. Therefore, the environment variables contained within the environment block are fully trusted. + +This makes sense intuitively, as environment variables are specifically intended to carry **control-plane instructions**. These include information about the execution environment, configuration values, and dependency locations. + +> [!NOTE] +> +> This justifies why we treat `%PATH%` as fully trusted: it can only be set by a fully trusted authority. While it is certainly possible for the authority to include in `%PATH%` a directory whose contents are not safe to load, [this represents a misconfiguration](https://cwe.mitre.org/data/definitions/427.html) by that authority. And, as mentioned in the introduction, we assume the authority has properly configured the operating environment, as we cannot meaningfully defend against misconfigurations. + +In some cases, such as cgi-bin handlers, environment variables (e.g., `HTTP_*`) are used to carry **opaque data** rather than control-plane instructions. In these cases, the _shape_ of the environment block (the existence of keys, the number of keys, the structural patterns of keys/values) is assumed to be under the control of the fully trusted host authority. Every key/value pair appears because the host intentionally included it. + +The _contents_ of these data-carrying variables might originate from an untrusted actor. 
The host must not introduce such data unless there is an explicit, documented contract with the target application describing: + +* how to distinguish variables containing control-plane instructions from variables carrying opaque data; and +* what semantics apply to each variable that carries opaque data. + +The host cannot impose these restrictions unilaterally; the target application must also agree to this contract. Absent this agreement, the target application may treat all environment variables as control-plane instructions and therefore fully trusted. + +### 2.3. The command line is trusted as a control-plane mechanism. + +This section parallels the environment block discussion above. Command-line arguments, like environment variables, represent control-plane instructions. They describe what specific action to execute, how that action should take place, or what file should be operated on. The command line itself is intentionally constructed by the fully trusted host who is authorized to provide such instructions. + +As with environment variables, the host may wish to provide opaque data via the command line, and this opaque data may have originated from an untrusted actor. The host must not introduce such data unless there is an explicit, documented contract with the target application describing: + +* how to distinguish arguments containing control-plane instructions from arguments carrying opaque data; and +* what semantics apply to each argument that carries opaque data. + +As with the environment block, the host cannot impose this contract unilaterally. The target application must agree to it; and absent such agreement, the target application may treat all arguments as fully trusted control-plane instructions. + +> [!IMPORTANT] +> +> File paths are considered control-plane instructions. For example, if the host passes `--infile somefile.txt`, they are exercising their authority to direct the target application to operate on that file.
It is up to the host to ensure that the filename is meaningful to the operating system and that any special characters are properly escaped so that the target application receives a faithful reproduction of the path without mangling or splitting. +> +> This **does not** necessarily mean that the file _contents_ are trusted. Though the file name is control-plane information and thus must implicitly be trusted, the file itself may have originated from the network or some other untrusted source. It is up to each application to determine whether it is a valid scenario to operate on potentially untrusted data. Those determinations are out of scope of this document. + +## 3. Application axioms + +### 3.1. Application configuration is fully trusted. + +Application configuration is a **control-plane mechanism**. It exists to direct execution flow, select modes, control features, and generally dictate application behavior. Because these decisions belong to a fully-trusted authority, the configuration input is itself fully trusted. The configuration system can assume that the input's shape and contents are intentional and correct, and any downstream consumers can assume that the configuration system is returning correct values. + +A consequence of this is that we assume the operator maintains confidentiality of and practices good hygiene regarding sensitive data provided via configuration. This includes, but is not limited to, connection strings, API keys, and private keys. Improper disclosure could allow an adversary to masquerade as a legitimate authority, and this may undermine some axioms which underpin the component's security design. + +> [!TIP] +> +> Applications may choose to improve usability by detecting overt misconfigurations. This is purely optional and does not impact the fact that configuration remains a trusted control-plane mechanism. A component is not considered insecure simply because it misbehaves when provided incorrect configuration. 
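The optional misconfiguration detection mentioned in the tip above might look like the following minimal Python sketch. The function name and the `port` and `timeout_seconds` keys are hypothetical and chosen only for illustration; nothing here reflects an actual .NET configuration API. The check is a usability aid and deliberately not a security control: it only reports warnings, and the configuration remains trusted control-plane input.

```python
def find_overt_misconfigurations(config):
    """Return human-readable warnings for obviously wrong settings.

    Hypothetical helper: the keys checked here are illustrative only.
    The caller still treats `config` as fully trusted; this function
    exists purely to give the operator earlier, clearer feedback.
    """
    warnings = []

    port = config.get("port")
    if port is not None:
        if not isinstance(port, int) or not (1 <= port <= 65535):
            warnings.append(f"port={port!r} is not a valid TCP port")

    timeout = config.get("timeout_seconds")
    if timeout is not None:
        if not isinstance(timeout, (int, float)) or timeout <= 0:
            warnings.append(f"timeout_seconds={timeout!r} must be a positive number")

    return warnings


if __name__ == "__main__":
    # The operator made a typo; we surface it but do not treat it as an attack.
    for message in find_overt_misconfigurations({"port": 70000, "timeout_seconds": 30}):
        print("warning:", message)
```

Returning warnings rather than raising keeps the decision with the operator, which matches the point that a component is not insecure merely because it misbehaves when given incorrect configuration.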
+ +In hosted multitenant scenarios, individual tenants may legitimately supply configuration-like values: connection strings, web hook endpoints, or per-tenant feature settings. These values originate outside the trust boundary and are therefore **data**, not control-plane instructions. + +When configuration may contain tenant-supplied or otherwise untrusted data, the scenario requires an explicit, documented contract between: + +* the configuration system itself, which must document predictability, isolation, and robustness guarantees in the face of potentially hostile input; and +* the downstream consumer (e.g., a component which uses a tenant-provided connection string), which must treat these values as potentially tainted and handle them accordingly. + +Absent this agreement, the configuration system and all configuration consumers may treat configuration as fully-trusted control-plane input. + +### 3.2. All fully trusted components within the authority boundary are correctly implemented. + +All fully trusted components operating within the same authority boundary are correct implementations of their published contracts and invariants. Our security design documents generally do not attempt to enumerate potential defects in other components within the same boundary, as we cannot meaningfully defend against such defects. If a component's published invariants do not hold, the defect lies within that component. + +### 3.3. In-process composition is not a security boundary. + +Each component within a process is fully trusted with respect to every other component within the process. Any code that can be loaded into the process can fully control execution flow within the process, and components are loaded only as a result of actions by a fully trusted authority. Per [Sec. 3.2](#32-all-fully-trusted-components-within-the-authority-boundary-are-correctly-implemented), all such components are correctly implemented. 
+ +.NET does not provide or enforce any intra-process sandboxing mechanism. + +> [!NOTE] +> +> One component may still provide untrusted ("tainted") data to another component, perhaps because it initially acquired that data from outside the trust boundary. When this happens, it is because the source component is exercising its authority to invoke the target component, and it is doing this intentionally. The interaction between these two components is governed by the published contract of expected invariants and behavioral guarantees. +> +> See [Sec. 4](#4-authority-boundaries-data-provenance-and-taint) for further information. + +### 3.4. Out-of-process composition under the same authority remains fully trusted. + +Components ("helpers") intentionally executed out-of-process under the same authority are fully trusted with respect to the invoking component. A process boundary, by itself, does not introduce a security boundary. As described in [Sec. 1.2](#12-any-entity-with-equivalent-or-higher-effective-privilege-is-fully-trusted), any process running under the same user account at an equal or higher integrity level is fully trusted. Per [Sec. 3.2](#32-all-fully-trusted-components-within-the-authority-boundary-are-correctly-implemented), all such components are correctly implemented. + +As with in-process composition, data exchanged with an out-of-process helper may be untrusted ("tainted"), even though the control-plane information used to invoke the helper is fully trusted. The interaction between the invoker and the helper remains governed by the published contract of invariants and guarantees. + +> [!WARNING] +> +> [Sec. 3.4](#34-out-of-process-composition-under-the-same-authority-remains-fully-trusted) applies only when there is **continuity of authority** between the invoker and the helper. When that continuity is absent, the out-of-process component **must not** be assumed fully trusted, and the model must account for this authority transition. 
+> +> Authority transitions can be overt or subtle. **Overt** transitions are usually recognizable at the launch site: the invoker launches the helper on a different machine, under a different user account, or within [a less-privileged AppContainer](https://learn.microsoft.com/windows/win32/secauthz/implementing-an-appcontainer). +> +> Other transitions are **subtle**: communication mechanisms that rely on an out-of-band rendezvous (for example, connecting to a named pipe or TCP socket, even on localhost) are subject to interference by processes running under different authorities. Endpoints must mutually authenticate before trusting the channel. By contrast, in-band mechanisms like stdin/stdout are established at process creation time and therefore preserve continuity of authority. (The data carried may still be tainted.) +> +> Recall that we make no security guarantees when a baseline assumption is violated. An authority transition does not "violate" Sec. 3.4 in this sense. Sec. 3.4 remains a sound axiom; what changes under an authority transition is only whether this section's precondition holds. It is therefore a logical error for a model to use Sec. 3.4 as the basis for a security claim when continuity of authority is absent. + +## 4. Authority boundaries, data provenance, and taint + +We've avoided concretely defining "authority boundary," as the operational definition more effectively illustrates the overall concept. If you'd like, you can think of the authority boundary as encompassing every component that is _technically capable_ of performing the same set of actions as you (even if they never exercise this capability), including accessing the same resources in the same manner or generating the same externally observable effects. Any entity that cannot directly do this -- but which instead necessarily relies on a component within your authority boundary mediating such operation -- is outside your authority boundary. 
+ +> [!IMPORTANT] +> +> The earlier axioms should not be read as an exhaustive enumeration of everything within your authority boundary. Any real-world system will interact with components whose relationship to the rest of the system does not cleanly fit into one of these buckets. This document does not claim that such components automatically fall within or outside the authority boundary. The true answer of where the component lies requires an analysis of the component's capabilities, informed by the nature of the relationship between the component and the rest of the system. + +Remember: this is from the perspective of the component being modeled. There's not necessarily a symmetry present. For example, user-mode code necessarily must include the OS kernel within its authority boundary, since the kernel can exercise full control over any code running in user mode. From the kernel's perspective, however, user-mode code is outside its authority boundary. The kernel mediates user-mode access to any resource under the kernel's authority. + +> [!TIP] +> +> This roughly parallels the concept of a "control sphere" [as described by the CWE working group](https://cwe.mitre.org/documents/vulnerability_theory/intro.html#chap7). Feel free to refer to the linked document for more information on vulnerability theory. This is not required reading; we do not assume our audience is familiar with the linked document or its foundational theories. + +When interacting with an entity beyond the authority boundary, no component within the authority boundary can assume that the entity is upholding any required invariants or behavioral guarantees. The entity might even be actively hostile and intentionally trying to subvert the guarantees of components operating within the authority boundary. + +When encountering data or commands whose provenance rests beyond the authority boundary, we say the data flow is **tainted**. 
The most obvious cases occur when interacting with a tunnel which punctures the authority boundary: reading from a TCP stream returns tainted bytes to the reader. More subtle cases occur when a trusted component generates a payload but incorporates tainted data within the payload; the generated payload is likewise tainted. For example, if web site registrants can specify their own username at account creation, then the _username_ argument is tainted, and any component which generates a URI like `https://example.com/users/{username}` is generating a tainted URI string, _even if that component properly escapes the username argument._ + +A trusted component may persist this data to a trusted store secured by a trusted channel, but the data remains tainted because its provenance rests beyond the authority boundary. The taint flag is removed only once a component guarantees that the data fulfills all required invariants expected by all other components that will interact with the data. + +Improper tracking of tainted data can lead a component to fail to uphold its contracted invariants or guarantees, potentially undermining the assumptions that other components within the boundary depend on. If one component (the "sender") transmits data to another component (the "receiver") within the boundary, responsibility rests with the sender to ensure that any potentially tainted data conforms to any requirements the receiver has placed on that interaction. The invocation action itself is an implicit assertion by the sender that the data meets the receiver's contract. + +> [!NOTE] +> +> Describing how to track taint flows and how to reason about them in published contracts is out of scope of this document. See ["How to choose a threat modeling framework"](choosing-a-framework.md) and ["How to model taint flows"](modeling-taint-flows.md) for more information. + +## 5. 
The CIA triad + +The preceding sections stated axioms that define trust boundaries and assumptions about behaviors, and they discussed how we shape the authority boundaries and reason about taint flows. This section explores how these concepts apply when assessing .NET components against [the CIA triad](https://github.com/microsoft/Security-101/blob/main/1.1%20The%20CIA%20triad%20and%20other%20key%20concepts.md), a model for reasoning about security properties. These are interpretations of the concepts, not axioms in themselves. + +### 5.1. Confidentiality + +A **confidentiality** failure arises when an entity can observe data it is not authorized to observe. Fully trusted authorities are definitionally authorized to observe any and all data; therefore it is not a confidentiality failure for such an authority to have access to sensitive data. + +**Scope.** Confidentiality concerns are meaningful only at authority boundaries. Capabilities that allow code to inspect memory, data structures, exception information, or configuration do not, by themselves, constitute confidentiality risks under this model. This includes reflective or dynamic mechanisms, which execute under the authority of the invoking component. + +**Responsibility.** Disclosure of information is a control-plane decision. If a component elects to disclose sensitive information outside its authority boundary, any confidentiality failure lies in the component making that disclosure -- not in the component that originally housed the sensitive data, and not in any shared plumbing that merely transports data. + +### 5.2. Integrity + +An **integrity** failure arises when an entity can, without authorization, mutate state or influence execution in a way that violates intended invariants. Fully trusted authorities are definitionally authorized to direct execution flow and to modify application state; therefore it is not an integrity failure for such an authority to do so. 
+ +**Scope.** The mere existence of powerful execution or mutation mechanisms does not imply an integrity risk. Capabilities that allow dynamic code loading, dynamic dispatch, indirect invocation, or state mutation do not, by themselves, constitute integrity risks under this model. Such capabilities execute under the authority of the invoking component. + +**Responsibility.** Integrity failures typically arise when a trusted component exposes a control plane to tainted input, allowing that input to influence control-flow decisions or state transitions that were meant to remain under the fully trusted authority. When this occurs, the integrity failure lies in the component that made the control plane reachable from untrusted input -- not in the mechanism that facilitates or carries out the modification, and not in any shared plumbing that merely transports data. + +Integrity failures are not limited to control-plane exposure within the current authority boundary. They can also arise when unauthorized entities mutate data, even if that data is never used within the current authority's control plane. For example, if a web application serves a tampered file to a client, that client (or its human operator) may improperly treat the contents as having the authority of the trusted web server, resulting in harm to the client or its operator when they act on that data. + +> [!NOTE] +> +> In practice, confidentiality and integrity failures usually manifest differently. Confidentiality failures often occur due to misconfigurations or operational choices that _unintentionally_ expose sensitive data to an overly broad audience, e.g., by enabling verbose diagnostics at a boundary. Integrity failures are often driven by design choices that _intentionally_ expose a control plane to tainted input, typically because the invoking component isn't aware that it's interfacing with a control plane or that the input may be tainted.
These examples are simply commonly observed patterns, not strict definitions. + +### 5.3. Availability + +An **availability** failure arises when an authorized caller cannot interact with a resource at the resource provider's advertised service level. This typically manifests as the resource simply being inaccessible to the requester (e.g., machine unreachable, HTTP 503). It can also mean that specific interactions with the resource degrade (e.g., "account transfers are unavailable at this time"), time out, or otherwise fail to meet the resource provider's SLA. + +**Scope.** Because fully trusted authorities are definitionally authorized to direct execution flow, it is not an availability failure for such an authority to restrict access to resources. Availability claims are meaningful only when the resource provider has established a service level, whether through documentation, SLA, or implicit contract. Absent such a commitment, there is no baseline against which to measure degradation. + +Availability failures typically occur when a limited resource (CPU, threads, memory, storage, network bandwidth, energy, etc.) is exhausted. They can also occur due to externalities: integrity failures (e.g., unauthorized deletion or corruption of state; see [Sec. 5.2](#52-integrity)), operating environment failures (e.g., hardware failures, power outages), or resource exhaustion induced by external actors legitimately authorized to consume those resources. + +**Responsibility.** As library and SDK authors, we cannot mitigate availability failures caused by external sources. The scope of our responsibility is twofold: we must not exacerbate externally-induced failures should they occur, and we must not introduce availability failures ourselves by allowing unauthorized agents to exhaust application resources. ["APIs and calling patterns"](apis-and-calling-patterns.md) further describes .NET's philosophy regarding allowable resource consumption. 
diff --git a/Documentation/security-foundations/vulnerability-theory.md b/Documentation/security-foundations/vulnerability-theory.md new file mode 100644 index 00000000000..485ef9e4cc7 --- /dev/null +++ b/Documentation/security-foundations/vulnerability-theory.md @@ -0,0 +1,82 @@ +# Vulnerability theory + +This document describes .NET's theory of what constitutes a security bug (a "vulnerability"). It largely hews close to the [CWE working group's theory of vulnerabilities](https://cwe.mitre.org/documents/vulnerability_theory/intro.html), but it simplifies concepts for a non-research audience. It also leans slightly on [U.S. FISMA (2014)](https://www.cisa.gov/topics/cyber-threats-and-advisories/federal-information-security-modernization-act) for codifying definitions and impact. The [baseline document](baseline-security-assumptions.md) provides further details on authority boundaries and the CIA properties, which play central roles here. + +## Audience and how to read this document + +The primary audience for this document is the .NET team, which creates libraries, tools, and SDKs. This document provides a formal definition of what constitutes a vulnerability within the .NET product, giving teams a consistent framework for triaging security reports and distinguishing true vulnerabilities from reliability bugs. + +A secondary audience is security researchers submitting vulnerability reports against .NET. These researchers likely have their own real-world experience working with systems which follow a different definition of "vulnerability." This document crystallizes some of the consequences from the [baseline security assumptions](baseline-security-assumptions.md) document, clarifying why .NET's security team sometimes reaches different conclusions than security researchers may be used to, particularly in areas like operator-triggered issues and availability-impacting reports. 
+ +## Definition and discussion + +> A **vulnerability** is any behavior which grants an entity some privilege the system did not intend them to have, where the exercise of that privilege could violate one of the CIA properties. + +That is an information-dense sentence. It's useful to focus on defining or elaborating on some key phrases. + +* **system**: The set of all components, configurations, and other fully trusted operators acting within the context of some [authority boundary](baseline-security-assumptions.md#4-authority-boundaries-data-provenance-and-taint). +* **behavior**: As exhibited by the system, in a manner observable beyond the authority boundary. +* **... which grants an entity some privilege ...**: Requires the existence of an entity operating _outside_ the authority boundary, as entities operating _within_ the authority boundary are [intrinsically fully privileged](baseline-security-assumptions.md#12-any-entity-with-equivalent-or-higher-effective-privilege-is-fully-trusted); they cannot be "granted" further privilege. +* **... did not intend ...**: Requires that the system distinguish (or should have distinguished) fully trusted vs. less-privileged entities. +* **... exercise of that privilege ...**: Requires that the improperly privileged entity actually be able to do something with that privilege. +* **... could violate one of the CIA properties**: If the excess privilege cannot impact at least one [CIA property](baseline-security-assumptions.md#5-the-cia-triad), it is not a vulnerability, [because it would have a base CVSS score of 0.0.](https://www.first.org/cvss/v3.1/specification-document#2-3-Impact-Metrics) + +These are _necessary_ but not _sufficient_ conditions for a behavior to be considered a vulnerability. In practice, security practitioners exercise discretion to classify a behavior by its severity. 
If a behavior is not practically exploitable, or the amount of overprivilege is not meaningful, or the real-world impact to systems and people is inconsequential or easily recoverable, then practitioners might avoid using the term "vulnerability" to describe the behavior. This is not an attempt at dishonesty or hiding unpleasantries. Rather, it's to avoid causing unnecessary panic and inappropriate allocation of limited resources to address low-impact items. + +Other terminology used within this document: + +* **grant set**: The set of all privileges granted to an entity. + +### Intent + +"Intent" here also merits discussion. We necessarily assume that a system intends to follow the specifications, requirements, invariants, and constraints placed on it by its designers, administrators, or regulators. This doesn't guarantee that the system does in fact follow the intended design. That is the very nature of a bug, after all. + +This does, however, mean that intent -- and whether a particular behavior qualifies as a vulnerability -- may change depending on whether you're modeling the entirety of the system, a single component within that system, or some layer in between. Individual components obviously intend to uphold their own invariants and to follow instructions provided to their API. However, individual components cannot infer anything about operator goals or administrator policies which apply to the _composition_ spanning components and which are never directly presented to the component's interface; thus they cannot meaningfully _intend_ to follow such goals or policies. The general takeaway is that policy-related vulnerabilities tend to represent defects not in any individual low-level component, but in the orchestration layer between components. This is explored more in [the robustness principle](the-robustness-principle.md#integrity). 
+ +Similarly, a component cannot meaningfully intend for a privileged operator to be denied the exercise of their authority within the boundary. For example, because it is legal for debuggers attached to the process to access full process memory, components operating in-process cannot meaningfully intend to hide any memory-resident value from the debugger. This is the crux of the argument that -- while it may be best practice to purge secrets from memory when they are no longer needed -- it is not a vulnerability (see [Memory dumps](the-robustness-principle.md#memory-dumps)) for such secrets to remain in memory strictly longer than necessary. + +Nor can a component meaningfully intend to defend against a counterparty violating the component's contract. A component could advertise robustness guarantees against misuse, but then the specific set of conditions it claims robustness against becomes an explicit part of its contract. Outside these conditions, the component is not obligated to behave in any prescribed manner. Components might as a matter of hygiene implement some defense-in-depth protections against misuse, but such defenses are never contractual guarantees, and one cannot realistically intend to defend against any possible manner of misuse. + +Because intent is a necessary prerequisite for a vulnerability, any misbehaviors induced by a contract violation do not meet the definition of a vulnerability. The vulnerability instead resides in whoever induced the contract violation. Some exceptions exist -- the counterparty sits across an authority boundary; the contract is impractically burdensome -- where a component cannot realistically assume anybody else will uphold the contract, and a lack of robustness may in fact represent a true vulnerability. This is discussed more in (LINK TBD). + +### Vulnerability vs. reliability bug + +> Corollary: A behavior -- even if unwanted -- that does not facilitate a privilege grant is **not a vulnerability**. 
Such behaviors may still qualify as reliability bugs. + +As an example, assume that I am managing files in my user directory, and I issue a command to rename `foo.txt` to `bar.txt`, but the file management tool has a defect that inadvertently deletes `foo.txt`. This would obviously be a severe bug that merits a quick fix, but whether it's a _vulnerability_ depends on whether I have privilege to issue file deletion commands. In most systems, there's a common "file modification privileges" grant set that permits overwriting, renaming, or deletion of the file, so my privilege to rename the file also carries the legitimate privilege to delete the file. Unless there's something unique about this system that clearly privileges those two operations separately, while this would be a truly unfortunate bug, it would _not_ be classified as a vulnerability. It grants me no privilege I didn't already have. + +Perhaps, instead of deleting the file, the tool makes a temporary copy of the file in a public location during rename. This could shift the behavior from a reliability bug into the realm of a true vulnerability, as it now privileges other entities (any entity who can access that public location) to read data that should have remained confidential. While the system did not empower _me_ with any privilege I didn't already have -- I can legitimately read my own files, and I can legitimately send a copy of my file to others -- it did empower _some other entity_ with a privilege grant that they should not have had. + +Both scenarios involve defects, including the system facilitating the exercise of authority clearly counter to how the designer intends (or should have intended). The crucial distinction is that only the second scenario involves improperly overprivileging an entity. Therefore only the second scenario qualifies as a vulnerability. 
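The distinction between the two file-management scenarios can be expressed in terms of grant sets. The following Python sketch is illustrative only: the entity names, the privilege labels, and the `excess_grants` helper are hypothetical and do not correspond to any .NET API. It compares each entity's intended grant set with the grant set it effectively holds after the defect, and it reports an entity only when that entity holds a privilege it was never meant to have.

```python
# Hypothetical model; entity names and privilege labels are illustrative assumptions.

def excess_grants(intended, effective):
    """Return, per entity, the privileges held in practice but not intended."""
    excess = {}
    for entity, held in effective.items():
        extra = held - intended.get(entity, frozenset())
        if extra:
            excess[entity] = extra
    return excess

# The owner's "file modification" grant set covers reading, renaming, and deleting.
INTENDED = {
    "owner": frozenset({"read:foo.txt", "rename:foo.txt", "delete:foo.txt"}),
    "other_user": frozenset(),
}

# Scenario 1: the rename inadvertently deletes the file. The owner already held
# the delete privilege, so nobody's effective grant set grew.
scenario_delete = {
    "owner": INTENDED["owner"],
    "other_user": frozenset(),
}

# Scenario 2: the rename leaves a temporary copy in a public location, so another
# entity can now read data that was meant to stay confidential.
scenario_public_copy = {
    "owner": INTENDED["owner"],
    "other_user": frozenset({"read:foo.txt"}),
}

print(excess_grants(INTENDED, scenario_delete))       # {}
print(excess_grants(INTENDED, scenario_public_copy))  # {'other_user': frozenset({'read:foo.txt'})}
```

An empty result corresponds to a reliability bug at most; a non-empty result identifies the subject of an unintended grant, which is a necessary (though not sufficient) condition for a vulnerability.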
+ +### Trigger, subject, and causation + +> [!NOTE] +> +> The definition of vulnerability **does not** require that some less-privileged entity _trigger_ the defect, only that some less-privileged entity be the _subject_ of the overprivileged grant set. +> +> Similarly, the definition **does not** require that the overprivileged entity _actually_ take advantage of their expanded grant set, only that the _potential_ exists for them to do so. + +This borrows ideas from [U.S. NIST](https://csrc.nist.gov/pubs/sp/800/61/r3/final) and [the U.S. CISA governing laws](https://www.law.cornell.edu/uscode/text/6/650#12), which define "incidents" based solely on their outcome (a violation of a CIA property), not on their cause. (The upcoming [Availability](#availability-and-denial-of-service) discussion delineates this idea of "incident" from our idea of "vulnerability.") This definition does not require an adversary to trigger the violation; the violation could just as easily arise from misconfigurations ("all users can access the admin folder") or environmental factors. + +Imagine that a web application issues authentication tickets to users. These tickets are time-limited: they expire after 24 hours. Perhaps there is a flaw in how this system processes dates, and tickets issued immediately before a leap day don't expire until the day immediately following the leap day. (For example: a ticket issued on `2024-02-28T10:00Z` does not expire until `2024-03-01T10:00Z`.) These tickets are valid for 48 hours instead of the intended 24 hours, giving the ticketholders access to the system beyond the restricted time period. This meets the definition of a vulnerability because it expands the grant set in a way the system author did not intend; however, absent other aggravating factors, it would be considered low-severity since the excess grants are adjacent to a legitimate grant the authenticated users already held. + +### Adversary as trigger vs. 
adversary as subject + + If the unintended privilege grant is triggered by an active adversary, there is no requirement that the adversary also be the _subject_ of that grant. Imagine that a web server has a bug which discloses server memory as part of the HTTP response. This memory is confidential; arbitrary remote clients (entities beyond the authority boundary) are not privileged to observe it. Assume this sequence of events: + +1. Entity A sends a malicious request to the server. +2. The server invokes the buggy code path and pulls process memory into an unrelated cache. +3. The server sends that data as part of a response to _unrelated_ Entity B on a different connection. + +This still qualifies as a vulnerability, as A was not authorized to trigger disclosure, and B was not authorized to observe that data. The severity of the vulnerability might depend on several factors, including what data becomes observable and whether A can control which connection the disclosure occurs on. If A can reliably coerce the server to respond with the confidential data back on A's own connection, that could increase the severity. + +### Availability and denial of service + +Unlike the U.S. NIST and U.S. CISA definitions of "incident," our definition of "vulnerability" follows the CWE working group's definition by requiring that an entity gain privileges beyond what the system intended to grant them. This is why "Denial of Service" remains a true vulnerability category, even under a definition where vulnerabilities require improper overprivileging. The system does not intend for less-privileged clients to modify the server's availability characteristics, but a DoS vulnerability allows the client to do just that. + +Take the scenario where Entity A triggers a misbehavior in a web application; the consequence is that the server denies access to Entity B.
This involves overprivileging A: unless A has a privilege corresponding to "remove server access from B's grant set" or "modify the server's SLA," then A has no legitimate authority to deny B access. A is meant to be governed by the server's SLA, not to have the ability to modify it. + +Not every availability failure meets this bar, however. Consider again the [authentication ticket example](#trigger-subject-and-causation), but imagine the bug manifests differently. Say that tickets issued during the leap day itself improperly have a 0-hour validity window: tickets issued on `2024-02-29T10:00Z` expire on `2024-02-29T10:00Z`. The practical consequence of this is that no authorized user can log in to the web site during the leap day itself. The site suffers a 24-hour-long availability outage (an "incident," per NIST and CISA). + +Though both scenarios described in this section involve the denial of a legitimate privilege (an authorized client is not able to access the site), mere _denial_ does not by itself meet our definition of a vulnerability. The key distinction between the two is that only the first example involves overprivileging: A improperly exercised the privilege to alter B's grant set. Similarly, had A crashed the web server, A would have improperly exercised a privilege to alter every other user's effective grant set. These potentially meet our definition of a DoS vulnerability. The authentication ticket example in this section (all authentication tickets expiring immediately) does not involve any entity improperly obtaining an expanded grant set; therefore, it does not meet our definition of a vulnerability.
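The two leap-day ticket bugs can be contrasted with a small Python sketch. Everything in it is a hypothetical illustration: `classify_ticket`, the 24-hour `INTENDED_VALIDITY` constant, and the result labels are assumptions made for this example, not part of any .NET API. The sketch compares a ticket's actual validity window with the intended one; only a window that exceeds the intended duration expands the holder's grant set.

```python
from datetime import datetime, timedelta, timezone

INTENDED_VALIDITY = timedelta(hours=24)  # assumed policy, taken from the example

def classify_ticket(issued, expires):
    """Classify a ticket by comparing its actual validity window to the intended one."""
    actual = expires - issued
    if actual > INTENDED_VALIDITY:
        # The holder keeps access longer than intended: an expanded grant set.
        return "excess grant"
    if actual < INTENDED_VALIDITY:
        # The holder loses access early: a denial, but nobody gains privilege.
        return "denial only"
    return "as intended"

utc = timezone.utc

# Ticket issued just before the leap day remains valid for 48 hours.
print(classify_ticket(datetime(2024, 2, 28, 10, tzinfo=utc),
                      datetime(2024, 3, 1, 10, tzinfo=utc)))   # excess grant

# Ticket issued on the leap day expires at the moment it is issued.
print(classify_ticket(datetime(2024, 2, 29, 10, tzinfo=utc),
                      datetime(2024, 2, 29, 10, tzinfo=utc)))  # denial only
```

Only the first result involves an expanded grant set. The second is a denial without overprivileging, which matches the incident-versus-vulnerability distinction drawn in this section.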