Skip to content

Latest commit

 

History

History
399 lines (300 loc) · 10.8 KB

File metadata and controls

399 lines (300 loc) · 10.8 KB

Universal Expression Tree

Lisp-inspired syntax tree, opposed to ESTree:

  • language-agnostic, can be compiled to different targets
  • reflects execution sequence, rather than code layout
  • has minimal overhead, directly maps to operators
  • simplifies manual evaluation and debugging
  • has conventional form and one-liner docs
  • JSON-compatible, sparse arrays

Kinds:

[operator, ...operands]    operation
"name"                     identifier (resolved from context)
[, value]                  literal

An expression tree is one of three forms:

1. Identifier

A string representing a name to be resolved from context.

"x"           → the value of x
"foo"         → the value of foo

2. Literal

A single-element array containing a value.

[, 1]         → the number 1
[, "hello"]   → the string "hello"
[, true]      → the boolean true
[, null]      → null

The empty first slot distinguishes literals from operations. This follows the pattern: what has no operator is data.

3. Operation

An array where the first element is the operator, followed by operands.

[op]                → nullary
[op, a]             → unary prefix
[op, a, null]       → unary postfix
[op, a, b]          → binary
[op, a, b, c]       → ternary
[op, a, b, c, ...]  → n-ary

Operators

Arithmetic

['+', a, b]        a + b
['-', a, b]        a - b
['-', a]           -a (negation)
['+', a]           +a (coercion)
['*', a, b]        a * b
['/', a, b]        a / b
['%', a, b]        a % b
['**', a, b]       a ** b

Comparison

['==', a, b]       a == b
['!=', a, b]       a != b
['===', a, b]      a === b
['!==', a, b]      a !== b
['<', a, b]        a < b
['<=', a, b]       a <= b
['>', a, b]        a > b
['>=', a, b]       a >= b

Logical

['!', a]           !a
['&&', a, b]       a && b
['||', a, b]       a || b
['??', a, b]       a ?? b

Bitwise

['~', a]           ~a
['&', a, b]        a & b
['|', a, b]        a | b
['^', a, b]        a ^ b
['<<', a, b]       a << b
['>>', a, b]       a >> b
['>>>', a, b]      a >>> b

Access

['.', a, 'b']      a.b
['[]', a, b]       a[b]
['()', a, b]       a(b)
['()', a, null]    a()
['?.', a, 'b']     a?.b

Note

Property access uses the literal property name as a string, not an identifier to resolve.

Assignment

['=', a, b]        a = b
['+=', a, b]       a += b
['-=', a, b]       a -= b
['*=', a, b]       a *= b
['/=', a, b]       a /= b
['%=', a, b]       a %= b

Increment

['++', a]          ++a
['++', a, null]    a++
['--', a]          --a
['--', a, null]    a--

The null second operand distinguishes postfix from prefix.

Sequence

[',', a, b, c]     a, b, c
[';', a, b, c]     a; b; c

Sequence operators flatten naturally into n-ary form.

Conditional

['?', a, b, c]     a ? b : c

Keywords

[, true]           true
[, false]          false
[, null]           null
[]                 undefined (empty array for JSON safety)
[, NaN]            NaN (JS only)
[, Infinity]       Infinity (JS only)

Keywords are literals — they have no operator, only a value.

Structures

['[]', [',', a, b]]           [a, b]
['{}', [':', k, v]]           {k: v} (object literal)
['{}', 'a']                   {a} (object shorthand)
['{', body]                   { body } (block statement)

Functions

['=>', params, body]          (params) => body
['...', a]                    ...a (spread)

Templates

['`', [,'a'], 'x', [,'b']]    `a${x}b` (template literal)
['``', 'tag', [,'a'], 'x']    tag`a${x}` (tagged template)

Template literals contain string parts (as literals) interleaved with expression parts (as nodes).

Statements (jessie)

Control flow

['if', cond, then]            if (cond) then
['if', cond, then, else]      if (cond) then else alt
['while', cond, body]         while (cond) body
['for', head, body]           for (...) body
['for', ['of', 'x', iter], body]        for (x of iter) body
['for', ['of', ['const', 'x'], iter], body]  for (const x of iter) body
['for', ['in', 'x', obj], body]         for (x in obj) body
['for', [';', init, cond, step], body]  for (init; cond; step) body
['let', 'x']                  let x
['let', ['=', 'x', val]]      let x = val
['const', ['=', 'x', val]]    const x = val
['break']                     break
['continue']                  continue
['return']                    return
['return', val]               return val

Exceptions (feature/throw.js, feature/try.js)

['throw', val]                                  throw val
['try', body, ['catch', 'e', handler]]          try { body } catch (e) { handler }
['try', body, ['finally', cleanup]]             try { body } finally { cleanup }
['try', body, ['catch', 'e', h], ['finally', c]] try {...} catch {...} finally {...}

Function Declarations (feature/function.js)

['function', 'name', [',', 'a', 'b'], body]  function name(a, b) { body }
['function', '', 'x', body]                  function(x) { body }

Design Principles

1. Operator as First Element

Following McCarthy's original Lisp (1960), the operator occupies position zero. This enables uniform traversal: every node is processed identically regardless of arity.

2. Strings are References or Tokens

An unwrapped string in operand position is interpreted by the operator:

  • Identifier: resolved from context (most operators)
  • Token: used directly as name/data (operator-specific)
"x"               → resolve x from context
[, "x"]           → the literal string "x"
['.', a, 'b']     → 'b' is property name (token)
['//', 'abc', 'g']→ pattern/flags are tokens
['n', '123']      → bigint digits are token
['px', '100']     → unit value is token
['function', 'f', ...] → 'f' is function name (token)

Operators that use tokens: ., ?., //, n, px/units, =>, function, let, const, try (catch param).

3. Empty Slot for Literals

The pattern [, value] uses JavaScript's elision syntax. The empty first position signals: this is data, not code. No operator means no operation.

4. Null Marks Position

Postfix operators use null to mark the absent second operand:

['++', 'a']        → ++a (prefix)
['++', 'a', null]  → a++ (postfix)

This preserves structural consistency without inventing new node types.

5. AST, not CST

This is an Abstract, not Concrete Syntax Tree. The parser normalizes syntax to semantic structure. Delimiters like () {} are stripped when purely syntactic; operators preserve meaning.

Type Purpose Example
CST Preserve source exactly Formatters, refactoring tools
AST Semantic structure Compilers, evaluators

6. Flat Sequences

Associative operators flatten:

a + b + c  →  ['+', 'a', 'b', 'c']    // not ['+', ['+', 'a', 'b'], 'c']

This reflects execution semantics and enables SIMD-style optimization.

7. Location Metadata

Nodes may carry a .loc property indicating source position:

['+', 'a', 'b', loc: 5]    // operator at column 5

This is a non-enumerable property, ignored during serialization.

Extension

The format accommodates domain-specific operators:

['LIKE', 'name', [, '%smith%']]
['BETWEEN', 'x', [, 1], [, 10]]

Custom operators are simply strings in position zero.

Non-JSON Primitives

Literals ([, value]) hold JSON primitives only: number, string, boolean, null.

Non-JSON values use primitive operators with token operands:

Value Form
undefined [] (empty array, JSON round-trip safe)
NaN [, NaN] (JS runtime only, serializes to [null, null])
Infinity [, Infinity] (JS runtime only, serializes to [null, null])
/abc/gi ['//', 'abc', 'gi'] (pattern/flags are tokens)
/abc/ ['//', 'abc']
10n (BigInt) ['n', '10'] (digits are token)
100px ['px', '100'] (unit with value token)
Symbol('x') ['()', 'Symbol', [, 'x']] (function call)
`a${x}b` ['`', [, 'a'], 'x', [, 'b']] (interpolation)

Regex (//), BigInt (n), and Units (px, em, etc.) use tokens because they're syntactic parts of the primitive, like property names in . access.

Template literals must be operations — they contain sub-expressions to evaluate.

Examples

Property access — name is token:

a.b               → ['.', 'a', 'b']
a?.b              → ['?.', 'a', 'b']

Primitives — pattern/digits/unit are tokens:

/abc/gi           → ['//', 'abc', 'gi']
10n               → ['n', '10']
100px             → ['px', '100']

Variables — wrap assignment expression:

let x             → ['let', 'x']
let x = 1         → ['let', ['=', 'x', [, 1]]]
const y = 2       → ['const', ['=', 'y', [, 2]]]

For loops — head is an expression:

for (x of arr) {}         → ['for', ['of', 'x', 'arr'], body]
for (const x of arr) {}   → ['for', ['of', ['const', 'x'], 'arr'], body]
for (let x of arr) {}     → ['for', ['of', ['let', 'x'], 'arr'], body]
for (x in obj) {}         → ['for', ['in', 'x', 'obj'], body]
for (;;) {}               → ['for', [';', null, null, null], body]

Try/catch — keywords preserved as operators:

try { a } catch (e) { b }                 → ['try', 'a', ['catch', 'e', 'b']]
try { a } finally { c }                   → ['try', 'a', ['finally', 'c']]
try { a } catch (e) { b } finally { c }   → ['try', 'a', ['catch', 'e', 'b'], ['finally', 'c']]

Functions — name and params are tokens:

function f(a, b) { c }    → ['function', 'f', [',', 'a', 'b'], 'c']
function(x) { y }         → ['function', '', 'x', 'y']
x => x                    → ['=>', 'x', 'x']
(a, b) => a + b           → ['=>', ['()', [',', 'a', 'b']], ['+', 'a', 'b']]

Serialization

The format is valid JSON when serialized with standard JSON.stringify. Elided array elements serialize as null:

[, 1]    [null, 1]

Implementations SHOULD accept both forms on input.

Inspiration

  • frisk — Porsager. Evaluable arrays as function calls.
  • nisp — Ysmood. JSON-compatible lisp.
  • Pratt parsing — Pratt, 1973. Top Down Operator Precedence. Elegant precedence-driven parsing.
  • S-expressions — McCarthy, 1960. Code as nested lists.
  • JSON — Crockford, 2001. Minimal format, universal adoption.
  • ESTree – opposite example

License

This specification is released under CC0. It belongs to no one and everyone.

🕉