Privacy Review - MathML 4

# Scope 

This review covers the MathML 4 delta from MathML Core. MathML Core's prior PING review ([issue #130](https://github.com/w3cping/privacy-request/issues/130)) is taken as the baseline where applicable; this review focuses on the MathML Full-specific features that Core intentionally omits.

# Overall assessment

**No blocking privacy concerns**. For features shared with MathML Core, the baseline Core review stands. MathML 4’s Privacy and Security Considerations sections (§D.4 and §D.5) are nevertheless insufficient because they largely defer to Core, while MathML Full introduces several features outside Core’s scope — specifically `intent`, the `annotation src` reference mechanism, `mglyph`, broad `href` on all elements, and Content MathML semantic identifiers — each of which warrants explicit treatment. The primary new privacy risk is misuse of `intent` as a hidden alternate content channel, potentially enabling assistive-technology-use detection. All findings appear addressable with targeted additions to the specification text.

---

## Finding 1 — §D.4 and §D.5 must not simply defer to MathML Core

MathML Core’s privacy and security sections address the subset of MathML defined by Core, including Core-specific rendering, layout, font, link, and embedding risks. MathML 4 Full introduces features outside Core’s scope. Simply deferring to MathML Core leaves several of these MathML Full-specific features without explicit privacy or security analysis, including `intent` processing, `annotation src` external references, `mglyph` resource loading, broad `href` support on all elements, and Content MathML semantic identifiers.

*Request:* MathML 4 should include Privacy and Security Considerations sections that explicitly address MathML Full-specific features rather than relying on the Core review alone.

---

## Finding 2 — `href` on all MathML elements reintroduces link-model risks outside Core

MathML Core did not retain MathML3’s broad `href`/`xlink:href` model. MathML 4 Full reinstates `href` on all MathML elements, including invisible and nested elements that may not fit the ordinary HTML link model. The specification should clarify activation behavior and privacy protections for these cases, including visited-link handling.

*Requested addition to §D.5:*

In web contexts, MathML `href` must not create link, navigation, URL-scheme, referrer, script-execution, download, or target-handling capabilities beyond those allowed by the host environment’s ordinary link model. The specification should also clarify safe behavior for `href` on non-rendered elements and for nested MathML links, since these cases may not map cleanly to ordinary visible links.
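As an illustration of the URL-scheme part of this request, the following is a minimal sketch of how a sanitizer or polyfill might refuse `href` values whose scheme falls outside an allowlist. The allowlist contents and the function name `is_href_allowed` are assumptions made for this example; the real set of permitted schemes is whatever the host environment's link model allows.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist for illustration only; a real implementation would
# defer to the host environment's ordinary link model.
ALLOWED_SCHEMES = {"http", "https", "mailto"}


def is_href_allowed(href: str) -> bool:
    """Return True if href is a relative reference or uses an allowlisted scheme."""
    scheme = urlsplit(href.strip()).scheme.lower()
    # An empty scheme means a relative reference (e.g. "#eq1" or "page.html").
    return scheme == "" or scheme in ALLOWED_SCHEMES


print(is_href_allowed("https://example.org/proof"))
print(is_href_allowed("javascript:alert(1)"))
```

A scheme check of this kind covers only one of the listed capabilities. Referrer, download, and target handling would still need to follow the host's ordinary link behavior.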

---

## Finding 3 — AT-use detection via divergent `intent` content (primary new privacy concern)

The W3C Security and Privacy Questionnaire explicitly flags features that allow authors to serve different content to AT users as a privacy concern, because sites can infer AT use from subsequent user behavior. MathML Core does not define `intent` processing, so this risk is entirely new to MathML 4. The `intent` attribute, by design, can influence the accessible speech generated for MathML while leaving visual rendering unchanged. A malicious author could embed behavioral probes or instructions exclusively in `intent` values and observe whether users respond, which would enable disability-related profiling.

Normative author requirements alone are insufficient, because a malicious author can simply ignore them. The stronger protection is UA-level guidance ensuring that no page-observable signal indicates whether `intent` was consumed.

*Requested addition — normative author requirement (§D.4):*

Authors MUST NOT use `intent` to convey hidden instructions, behavioral probes, tracking tokens, or content that materially differs from the visible mathematical expression. `intent` should be used only to disambiguate or improve narration and navigation of the same mathematical content.

*Requested addition — UA-level guidance (§D.4):*

User agents should not expose to page script any signal indicating whether, how, or by whom `intent` was consumed by assistive technology.

---

## Finding 4 — `intent` requires explicit non-observability guidance

MathML Core reserves `intent` and `arg` as valid attributes but does not define their processing behavior. As a result, MathML Core’s privacy review does not cover their privacy implications. MathML 4 should therefore add explicit privacy guidance for `intent`.

*Requested addition to §D.4:*

The `intent` attribute provides an author-supplied semantic layer intended to improve mathematical narration and accessibility. Although `intent` does not directly expose user data, its processing may depend on assistive-technology behavior, locale, speech or braille settings, supported concept dictionaries, fallback behavior, or parsing outcomes. Implementations should ensure that these processing differences are not exposed to page script. In particular, user agents and assistive technologies should not expose generated speech strings, parse errors, supported concept dictionaries, fallback choices, or other AT-specific processing results through DOM APIs, accessibility APIs observable by the page, events, timing, layout, or other page-observable behavior. 

---

## Finding 5 — `intent` literals should be safely handled in speech and braille pipelines

MathML 4’s `intent` attribute is author-controlled input intended to influence how mathematical notation is spoken or otherwise presented by assistive technologies. Because `intent` values may be parsed and forwarded into speech, braille, accessibility, or platform services, the specification should make clear that this data remains untrusted throughout the processing pipeline.

This is not a concern about parsing the MathML `intent` grammar itself. Processors necessarily need to parse `intent` according to the specification. The concern is that literal strings, fallback names, concept names, or other author-provided text derived from `intent` should not be interpreted by downstream systems as commands, SSML, markup, URLs, code, or other executable or control syntax.

Without this clarification, implementations could accidentally create injection-style risks in assistive-technology pipelines, especially if future speech or braille integrations accept richer command languages or markup-like input.

*Requested addition to §D.5:*

The `intent` attribute is author-controlled input. Implementations may parse it according to the MathML `intent` grammar, but any author-provided text derived from `intent` should be treated as data when forwarded to speech, braille, accessibility, or platform services. Such text should not be interpreted as SSML, commands, markup, URLs, scripts, or other control instructions unless explicitly defined and safely constrained.
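To make the "treat as data" requirement concrete, here is a minimal sketch of a hypothetical speech pipeline stage that wraps `intent`-derived text in SSML. The function name `intent_text_to_ssml` and the choice of SSML as the downstream format are assumptions for illustration; the point is only that author-supplied text is escaped before it reaches a markup-aware consumer.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def intent_text_to_ssml(text: str) -> str:
    """Wrap author-supplied text in <speak>, escaping it so it stays plain data."""
    # escape() replaces &, < and > so the text cannot introduce SSML elements.
    return f"<speak>{escape(text)}</speak>"


# An author tries to smuggle an SSML element through an intent-derived string.
hostile = '<break time="10s"/> x squared'
ssml = intent_text_to_ssml(hostile)

# Parsing the result shows a single <speak> element with no child elements:
# the hostile string survives only as character data.
parsed = ET.fromstring(ssml)
print(len(parsed), repr(parsed.text))
```

The same principle applies to any downstream format: escaping or encoding happens at the boundary where the text is handed over, not in the `intent` parser.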

---

## Finding 6 — `intent` processing should not expose user locale or AT preferences

This risk does not arise in MathML Core because `intent` processing is not defined there. MathML 4 introduces author-provided `intent` values that may be interpreted differently depending on language, locale, speech rules, braille rules, or assistive-technology preferences.

Using user-specific settings such as OS locale, speech locale, braille preferences, or installed accessibility dictionaries is not necessarily a privacy problem by itself. These settings may be needed to produce the correct experience for the user. The privacy concern arises if those differences become observable to the page. For example, if a page script can observe different generated accessible names, fallback behavior, parsing errors, timing, layout, or other outputs based on user-specific locale or AT configuration, then `intent` processing could add fingerprinting entropy beyond MathML Core’s baseline.

*Requested addition to §D.4:*

Implementations should use document and element language as the author-controlled input for `intent` interpretation when possible. User-specific locale, speech, braille, or assistive-technology preferences may affect the user’s final accessibility experience, but differences derived from those preferences must not be exposed to page script through generated accessible names, fallback behavior, parsing errors, timing, layout, events, or other observable behavior.

---

## Finding 7 — Clarify fetch behavior for external annotation references

MathML 4 allows `annotation` and `annotation-xml` elements to reference external annotation content using `src`. The specification appears to discuss this mainly for processors that expand, export, or transform annotations, rather than for ordinary visual rendering. However, because `src` is a URL-bearing attribute, MathML 4 should explicitly define when, if ever, these external references may be dereferenced in web contexts.

In particular, MathML 4 should state that user agents must not automatically fetch external annotation references merely for parsing, rendering, accessibility-tree construction, indexing, or passive document inspection. If a processor chooses to expand or export annotation content by fetching an external reference, that fetch should be explicit, should follow the host environment’s normal web security policies, and should not bypass CSP, referrer policy, mixed-content restrictions, credential rules, private-network protections, or user/application mediation.

Without this clarification, external annotations could become an unexpected network-observable surface, especially in tools that process MathML for accessibility, search, conversion, validation, or export.

*Requested addition to §D.5:*

In web contexts, external annotation references via `annotation src` or `annotation-xml src` must not be fetched automatically during parsing, rendering, accessibility-tree construction, or other passive processing. Any processor that expands or exports external annotation content should treat the reference as an explicit resource load subject to the host environment’s normal fetch, CSP, referrer, credentials, mixed-content, and network-isolation policies.
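The distinction between passive processing and an explicit load can be sketched as follows. This hypothetical helper, `external_annotation_refs`, only lists the `src` values found on `annotation` and `annotation-xml` elements and never opens a network connection; any later fetch would be a separate, explicit step subject to the host's policies. The helper name and the sample URL are assumptions for this example.

```python
import xml.etree.ElementTree as ET
from typing import List


def external_annotation_refs(mathml: str) -> List[str]:
    """List src values on annotation elements without dereferencing them."""
    # Note: for untrusted input, a hardened XML parser may be preferable.
    root = ET.fromstring(mathml)
    refs = []
    for el in root.iter():
        # Strip any "{namespace}" prefix to get the local element name.
        local = el.tag.rsplit("}", 1)[-1]
        if local in ("annotation", "annotation-xml") and "src" in el.attrib:
            refs.append(el.attrib["src"])
    return refs


sample = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML"><semantics>'
    "<mi>x</mi>"
    '<annotation encoding="application/x-tex" src="https://example.org/x.tex"/>'
    '<annotation-xml encoding="MathML-Content"><ci>x</ci></annotation-xml>'
    "</semantics></math>"
)
print(external_annotation_refs(sample))
```

A tool built this way can report or audit external references during validation, indexing, or conversion while leaving the decision to fetch to the user or application.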

---

## Finding 8 — `mglyph` adds external image resource loading outside Core

`mglyph` is not in MathML Core. It includes a `src` attribute for external glyph images, and the spec notes that a JavaScript polyfill implements it using `img`. This creates image-like network requests that are not present in Core's baseline.

*Requested addition to §D.5:*

Web implementations and polyfills must treat `mglyph` resource loading like ordinary image loading: subject to CSP, referrer policy, mixed-content blocking, credential rules, and canvas tainting where applicable. User agents should not create additional network observability beyond ordinary image loading behavior.

---

## Finding 9 — Content MathML semantic identifiers should not be resolved automatically

Content MathML is outside MathML Core and introduces semantic identifiers such as `definitionURL`, `cd`, and `csymbol`. These identifiers can refer to external or application-defined semantic definitions. While such references may be useful for specialized tools, MathML 4 should clarify that web user agents must not automatically resolve or dereference them during ordinary parsing, rendering, or accessibility processing.

*Requested addition to §D.4:*

Content MathML semantic identifiers such as `definitionURL`, `cd`, and `csymbol` should be treated as opaque identifiers in web contexts. User agents must not automatically fetch, resolve, or dereference them during parsing, rendering, or accessibility processing unless an application explicitly requests such resolution subject to the host environment’s normal fetch and privacy controls.
