Skip to content

Global substitution notation t[x := t'] breaks instance-binder fields in downstream projects #631

@FrankieNC

Description

@FrankieNC

First of all, thanks for the library! While working on a project that depends on Cslib transitively, I ran into a parse error and I thought it was worth reporting since it affects code that doesn't use Cslib's notation at all.

The problem

The substitution notation in Cslib/Foundations/Syntax/HasSubstitution.lean is currently declared as:

notation t:max "[" x ":=" t' "]" => HasSubstitution.subst t x t'

I believe the problem comes from the combination of two things. Since it is a plain notation rather than a scoped one, it becomes active in every file of every project that imports Cslib, even transitively (the surrounding namespace Cslib only affects the generated parser name, not where the notation is active). And since there is no whitespace constraint, the parser may attach a [ to a preceding term even when the bracket appears on the following line.

In combination, this means that some ordinary Lean code no longer parses once Cslib is in the import graph. For example:

import Cslib.Foundations.Syntax.HasSubstitution
import Mathlib

structure Foo where
  A : Type*
  [A_acg : AddCommGroup A]
-- error: unexpected token ':'; expected ':='

The same structure compiles fine with import Mathlib alone. Writing the field as (A : Type*) also makes it parse, since the closing parenthesis ends the type expression however, I find it a bit odd to have to do this every time.

My understanding of what happens: after parsing the field type Type*, the term parser looks for trailing parsers keyed on the next token. The substitution notation registers one on [ without a whitespace guard, so the [ on the next line matches and the parser commits to reading Type*[… := …]. It then parses A_acg as the inner term and fails on the :, since only := can follow at that point. As there is no backtracking, the intended reading of [A_acg : AddCommGroup A] as an instance-binder field is never considered.

This also reproduces on a bare toolchain, without Cslib:

class HasSubstitution (α : Type u) (β : Type v) (γ : Type w) where
  subst (t : α) (x : β) (s : γ) : α

notation t:max "[" x ":=" s "]" => HasSubstitution.subst t x s

structure Foo where
  A : Type
  [inst : Inhabited A]
-- error: unexpected token ':'; expected ':='

For comparison, core's indexing notation xs[i] has the same shape but is guarded with noWs (Init/GetElem.lean):

syntax:max term noWs "[" withoutPosition(term) "]" : term

so l[0] parses as getElem, but l [0] does not.

Possible fixes

I thought of a few ways this could be addressed:

  • Declaring the notation as scoped. The notation would then only be active after open scoped Cslib, so downstream projects are unaffected unless they opt in. The greedy parsing would remain, but only for files that opt in.

  • Guarding the notation with noWs, as core does for getElem. I tried the following locally:

    syntax:max term noWs "[" term " := " term "]" : term
    macro_rules | `($t[$x := $s]) => `(HasSubstitution.subst $t $x $s)

With this change the structure above parses correctly, t[x := s] still works when written without spaces, and core's indexing is unaffected. There are two caveats. The lambda calculus files sometimes write the notation with spaces (e.g. t [x := sub] in LocallyNameless/Untyped/Properties.lean and FullBeta.lean), so those occurrences would need to be adjusted. And since noWs is not available in the notation command, the declaration has to become a syntax + macro_rules pair, which loses the automatically generated unexpander, so pretty-printing would need a hand-written one.

  • Renaming the notation away from a bare [, as feat: change context notation to ·<[·] #393 did for the context-fill notation (c[t]c<[t]). That notation was affected in exactly the same way — on v4.28.0 it is what turns the error above into expected ':=' or ']' — and the rename resolved it.

I would be happy to make a PR with a suggested fix (if one is needed).

Environment

  • I originally hit this on Cslib v4.28.0 (11b70e5d) with Lean 4.28.0. On that version the error actually reads unexpected token ':'; expected ':=' or ']' — the extra ']' alternative comes from the old c[t] context-fill notation, which was still around at that point.
  • The issue is still present on current main (e3991ff2, Lean 4.31.0-rc2, Mathlib d90090f6), where the structure above fails with expected ':='. There, importing Cslib.Foundations.Syntax.Context by itself causes no problems, so the feat: change context notation to ·<[·] #393 rename fixed the context notation and only the substitution notation remains.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions