This document provides context for AI coding agents working on the Predicator project.
Predicator is a secure, non-evaluative condition engine for processing end-user boolean predicates in Elixir. It provides a complete compilation pipeline from string expressions to executable instructions without the security risks of dynamic code execution. Supported features include:

- Arithmetic operators (`+`, `-`, `*`, `/`, `%`) with proper precedence
- Comparison operators (`>`, `<`, `>=`, `<=`, `=`, `!=`)
- Logical operators (`AND`, `OR`, `NOT`)
- Date/datetime literals, list literals, and object literals with JavaScript-style syntax
- Membership operators (`in`, `contains`)
- Function calls with built-in system functions
- Nested data structure access using dot notation, plus bracket access for dynamic property and array access
```
Expression String → Lexer → Parser → Compiler → Instructions → Evaluator
                               ↓
                     StringVisitor (decompile)
```
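A minimal end-to-end run through this pipeline via the public API (the expression and context here are illustrative):

```elixir
# One call drives the whole pipeline: lex → parse → compile → evaluate.
{:ok, true} =
  Predicator.evaluate("score > 50 AND active", %{"score" => 85, "active" => true})
```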
```
expression     → logical_or
logical_or     → logical_and ( ("OR" | "or") logical_and )*
logical_and    → logical_not ( ("AND" | "and") logical_not )*
logical_not    → ("NOT" | "not") logical_not | comparison
comparison     → addition ( ( ">" | "<" | ">=" | "<=" | "=" | "==" | "!=" | "===" | "!==" | "in" | "contains" ) addition )?
addition       → multiplication ( ( "+" | "-" ) multiplication )*
multiplication → unary ( ( "*" | "/" | "%" ) unary )*
unary          → ( "-" | "!" ) unary | postfix
postfix        → primary ( "[" expression "]" | "." IDENTIFIER )*
primary        → NUMBER | FLOAT | STRING | BOOLEAN | DATE | DATETIME | IDENTIFIER | duration | relative_date | list | object | function_call | "(" expression ")"
function_call  → FUNCTION_NAME "(" ( expression ( "," expression )* )? ")"
list           → "[" ( expression ( "," expression )* )? "]"
object         → "{" ( object_entry ( "," object_entry )* )? "}"
object_entry   → object_key ":" expression
object_key     → IDENTIFIER | STRING
duration       → NUMBER UNIT+
relative_date  → duration "ago" | duration "from" "now" | "next" duration | "last" duration
```
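Per this grammar, `NOT` binds a whole comparison and `OR` binds loosest, so operator precedence falls out of the production nesting (sketch using the public API):

```elixir
# Parsed as (NOT (score > 50)) OR active, not NOT ((score > 50) OR active).
# NOT (10 > 50) is true, so the whole predicate is true.
Predicator.evaluate("NOT score > 50 OR active", %{"score" => 10, "active" => false})
```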
- Lexer (`lib/predicator/lexer.ex`): Tokenizes expressions with position tracking
- Parser (`lib/predicator/parser.ex`): Recursive descent parser building AST
- Compiler (`lib/predicator/compiler.ex`): Converts AST to executable instructions
- Evaluator (`lib/predicator/evaluator.ex`): Executes instructions against data
- Visitors (`lib/predicator/visitors/`): AST transformation modules
  - StringVisitor: Converts AST back to strings
  - InstructionsVisitor: Converts AST to executable instructions
- Functions (`lib/predicator/functions/`): Function system components
  - SystemFunctions: Built-in system functions (len, upper, abs, max, etc.) provided via `all_functions/0`
- Main API (`lib/predicator.ex`): Public interface with convenience functions
After implementing a new set of functionality:
- ensure the local project is not on the main branch
- identify all code issues by running `mix quality`
- fix those issues
- if necessary, update the CHANGELOG, README, and AGENTS document
- ask me whether to create a git commit
- if so, create a git commit with a title and message
```shell
mix test                # Run all tests
mix test --watch        # Watch mode
mix test.coverage       # Coverage report
mix test.coverage.html  # HTML coverage report
mix quality             # Run all quality checks (format, credo, coverage, dialyzer)
mix quality.check       # Check quality without fixing
mix format              # Format code
mix credo --strict      # Lint with strict mode
mix dialyzer            # Type checking
```

- Overall: 92.2%
- Evaluator: 95.7% (arithmetic with type coercion, unary, and all operations)
- StringVisitor: 97.5% (all formatting options)
- InstructionsVisitor: 95.2% (all AST node types)
- Lexer: 98.4% (all token types including floats and arithmetic)
- Parser: 86.4% (complex expressions with precedence and float support)
- Target: >90% for all components ✅
- No `eval()` or dynamic code execution
- All expressions compiled to safe instruction sequences
- Input validation at lexer/parser level
- Comprehensive error messages with line/column positions
- Graceful error propagation through pipeline stages
- Type-safe error handling with `{:ok, result} | {:error, message, line, col}` tuples
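Callers can branch on the documented result tuples; a sketch (the expression strings are illustrative):

```elixir
# The invalid expression takes the error arm, which carries the
# message and the line/column position from the lexer/parser.
case Predicator.evaluate("score >", %{"score" => 10}) do
  {:ok, result} ->
    result

  {:error, message, line, col} ->
    IO.puts("evaluation failed at #{line}:#{col}: #{message}")
    false
end
```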
- Compile-once, evaluate-many pattern supported
- Efficient instruction-based execution
- Minimal memory allocation during evaluation
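A sketch of the compile-once, evaluate-many pattern. The function names `Predicator.compile/1` and `Predicator.Evaluator.evaluate/2` are assumptions inferred from the module layout above; check the main API in `lib/predicator.ex` for the actual entry points.

```elixir
# Hypothetical: compile the expression once, then reuse the
# instructions against many contexts.
{:ok, instructions} = Predicator.compile("score > 50 AND active")

users = [
  %{"score" => 85, "active" => true},
  %{"score" => 30, "active" => false}
]

Enum.map(users, fn user ->
  Predicator.Evaluator.evaluate(instructions, user)
end)
```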
- Credo complexity warnings suppressed for lexer/parser with explanatory comments
- High complexity is appropriate and necessary for these functions
- Well-tested and contained complexity
lib/predicator/
├── lexer.ex # Tokenization with position tracking
├── parser.ex # Recursive descent parser
├── compiler.ex # AST to instructions conversion
├── evaluator.ex # Instruction execution engine with custom function support
├── visitor.ex # Visitor behavior definition
├── types.ex # Type specifications
├── functions/ # Function system components
│ └── system_functions.ex # Built-in functions (len, upper, abs, etc.)
└── visitors/ # AST transformation modules
├── string_visitor.ex # AST to string decompilation
└── instructions_visitor.ex # AST to instructions conversion
test/predicator/
├── lexer_test.exs
├── parser_test.exs
├── compiler_test.exs
├── evaluator_test.exs
├── object_evaluation_test.exs # Object literal evaluation tests
├── object_edge_cases_test.exs # Object literal edge cases
├── object_integration_test.exs # Object literal integration tests
├── predicator_test.exs # Integration tests
└── visitors/ # Visitor tests
├── string_visitor_test.exs
└── instructions_visitor_test.exs
- Natural-language durations and relative time expressions
- Relative dates: `3d ago`, `2w from now`, `next 1mo`, `last 1y`
- Date/DateTime arithmetic: `#2024-01-10# + 5d`, `#2024-01-15T10:30:00Z# - 2h`
- Grammar updates: `duration` and `relative_date` productions
- Full pipeline support (lexer, parser, compiler, evaluator, string visitor) with tests
- Examples:

```elixir
Predicator.evaluate("created_at > 3d ago", %{"created_at" => ~U[2024-01-20 00:00:00Z]})
Predicator.evaluate("due_at < 2w from now", %{"due_at" => Date.add(Date.utc_today(), 10)})
Predicator.evaluate("#2024-01-10# + 5d = #2024-01-15#", %{})
Predicator.evaluate("#2024-01-15T10:30:00Z# - 2h < #2024-01-15T10:30:00Z#", %{})
```
- Syntax Support: Complete JavaScript-style object literal syntax (`{}`, `{name: "John"}`, `{user: {role: "admin"}}`)
- Lexer Extensions: Added `:lbrace`, `:rbrace`, `:colon` tokens for object parsing
- Parser Grammar: Comprehensive object parsing with proper precedence and error handling
- AST Nodes: New `{:object, entries}` AST node type for object representation
- Stack-based Compilation: Uses `object_new` and `object_set` instructions for efficient evaluation
- Evaluator Support: Object construction and equality comparison with type-safe guards
- String Decompilation: Round-trip formatting preserves original object syntax
- Key Types: Both identifier keys (`name`) and string keys (`"name"`) supported
- Nested Objects: Unlimited nesting depth with proper evaluation order
- Type Safety: Enhanced type matching guards to support maps while preserving Date/DateTime separation
- Comprehensive Testing: 47 new tests covering evaluation, edge cases, and integration scenarios
- Examples:

```elixir
Predicator.evaluate("{name: 'John', age: 30}", %{})                                # Object construction
Predicator.evaluate("{score: 85} = user_data", %{"user_data" => %{"score" => 85}}) # Comparison
Predicator.evaluate("{user: {role: 'admin'}}", %{})                                # Nested objects
```
- Float Literals: Lexer supports floating-point numbers (e.g., `3.14`, `0.5`)
- Numeric Types: Both integers and floats supported in arithmetic operations
- String Concatenation: `+` operator performs string concatenation when at least one operand is a string
- Type Coercion Rules:
  - Number + Number → Numeric addition
  - String + String → String concatenation
  - String + Number → String concatenation (number converted to string)
  - Number + String → String concatenation (number converted to string)
- Examples:

```elixir
Predicator.evaluate("3.14 * 2", %{})           # {:ok, 6.28}
Predicator.evaluate("'Hello' + ' World'", %{}) # {:ok, "Hello World"}
Predicator.evaluate("'Count: ' + 42", %{})     # {:ok, "Count: 42"}
Predicator.evaluate("100 + ' items'", %{})     # {:ok, "100 items"}
```
- Built-in Functions: System functions automatically available in all evaluations
  - String functions: `len(string)`, `upper(string)`, `lower(string)`, `trim(string)`
  - Numeric functions: `abs(number)`, `max(a, b)`, `min(a, b)`
  - Date functions: `year(date)`, `month(date)`, `day(date)`
- Custom Functions: Provided per evaluation via `functions:` option in `evaluate/3`
- Function Format: `%{name => {arity, function}}` where function takes `[args], context` and returns `{:ok, result}` or `{:error, message}`
- Function Merging: Custom functions merged with system functions, allowing overrides
- Thread Safety: No global state - functions scoped to individual evaluation calls
- Examples:

```elixir
custom_functions = %{
  "double" => {1, fn [n], _context -> {:ok, n * 2} end},
  "len" => {1, fn [_], _context -> {:ok, "custom_override"} end} # Override built-in
}

Predicator.evaluate("double(score) > 100", %{"score" => 60}, functions: custom_functions)
Predicator.evaluate("len('anything')", %{}, functions: custom_functions) # Uses override
Predicator.evaluate("len('hello')", %{})                                 # Uses built-in (returns 5)
```
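Because the functions map is plain data in the documented `%{name => {arity, function}}` shape, an entry can also be exercised directly in tests, without going through the evaluator (`double` is a hypothetical custom function):

```elixir
functions = %{
  "double" => {1, fn [n], _context -> {:ok, n * 2} end}
}

# Pull the entry out of the map and call it the way the evaluator would:
# an argument list plus the evaluation context.
{1, double} = functions["double"]
{:ok, 42} = double.([21], %{})
```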
- Full Arithmetic Support: Complete parsing and evaluation pipeline for arithmetic expressions
  - Binary operations: `+` (addition), `-` (subtraction), `*` (multiplication), `/` (division), `%` (modulo)
  - Unary operations: `-` (unary minus), `!` (unary bang/logical NOT)
- Proper Precedence: Mathematical precedence handling (unary → multiplication → addition → comparison)
- Instruction Execution: Stack-based evaluator with 7 new instruction handlers
- Error Handling: Division by zero protection, type checking, comprehensive error messages
- Pattern Matching: Idiomatic Elixir implementation using pattern matching for clean code
- Examples:

```elixir
Predicator.evaluate("2 + 3 * 4", %{})                  # {:ok, 14} - correct precedence
Predicator.evaluate("(10 - 5) / 2", %{})               # {:ok, 2} - parentheses and division
Predicator.evaluate("-score > -100", %{"score" => 85}) # {:ok, true} - unary minus
Predicator.evaluate("total % 2 = 0", %{"total" => 14}) # {:ok, true} - modulo
```
- Syntax: `#2024-01-15#` (date), `#2024-01-15T10:30:00Z#` (datetime)
- Lexer: Added date tokenization with ISO 8601 parsing
- Parser: Extended AST to support date literals
- Evaluator: Date/datetime comparisons and membership operations
- StringVisitor: Round-trip formatting with `#date#` syntax
- Syntax: `[1, 2, 3]`, `["admin", "manager"]`
- Operators: `in` (element in list), `contains` (list contains element)
- Examples: `role in ["admin", "manager"]`, `[1, 2, 3] contains 2`
- Syntax: `{}`, `{name: "John"}`, `{user: {role: "admin", active: true}}`
- Key Types: Identifiers (`name`) and strings (`"name"`) supported as keys
- Nested Objects: Unlimited nesting depth with proper evaluation order
- Stack-based Compilation: Uses `object_new` and `object_set` instructions for efficient evaluation
- Type Safety: Object equality comparisons with proper map type guards
- String Decompilation: Round-trip formatting preserves original syntax
- Examples:

```elixir
Predicator.evaluate("{name: 'John'} = user_data", %{})        # Object comparison
Predicator.evaluate("{score: 85, active: true}", %{})         # Object construction
Predicator.evaluate("user = {profile: {name: 'Alice'}}", %{}) # Nested objects
```
- Case-insensitive: Both `AND`/`and`, `OR`/`or`, `NOT`/`not` supported
- Pattern matching: Refactored evaluator and parser to use pattern matching over case statements
- Plain boolean expressions: Support for `active`, `expired` without `= true`
- Dot Notation: Access deeply nested data structures using `.` syntax
- Bracket Notation: Dynamic property and array access using `[key]` syntax (NEW)
- Mixed Access: Combine both notations like `user.settings['theme']` (NEW)
- Syntax:
  - Dot: `user.profile.name`, `config.database.settings.ssl`
  - Bracket: `user['profile']['name']`, `items[0]`, `scores[index]`
  - Mixed: `user.settings['theme']`, `data['users'][0].name`
- Key Types: Supports string keys, atom keys, integer keys, and mixed types
- Array Indexing: Full array access with bounds checking (`items[0]`, `scores[index]`)
- Dynamic Keys: Variable and expression-based keys (`obj[key]`, `items[i + 1]`)
- Parser: Added postfix parsing for bracket access with recursive chaining
- Evaluator:
  - Enhanced `load_nested_value/2` for dot notation
  - New `access_value/2` for bracket access with comprehensive type handling
- Error Handling: Returns `:undefined` for missing paths, out-of-bounds access, or non-map/non-array intermediate values
- Examples:
  - `user.name.first = "John"` (dot notation)
  - `user['profile']['role'] = "admin"` (bracket notation)
  - `items[0] = "apple"` (array access)
  - `data['users'][index]['name']` (chained bracket access)
  - `user.settings['theme'] = 'dark'` (mixed notation)
- Backwards Compatible: Simple variable names and existing dot notation work exactly as before
- Purpose: SCXML datamodel location expressions for assignment operations (`<assign>` elements)
- API Function: `Predicator.context_location/3` - resolves location paths for assignment targets
- Location Paths: Returns lists like `["user", "name"]`, `["items", 0, "property"]` for navigation
- Validation: Distinguishes assignable locations (l-values) from computed expressions (r-values)
- Error Handling: Structured `LocationError` with detailed error types and context
- Core Module: `Predicator.ContextLocation` with comprehensive location resolution logic
- Error Types:
  - `:not_assignable` - Expression cannot be used as assignment target (literals, functions, etc.)
  - `:invalid_node` - Unknown or unsupported AST node type
  - `:undefined_variable` - Variable referenced in bracket key is not defined
  - `:invalid_key` - Bracket key is not a valid string or integer
  - `:computed_key` - Computed expressions cannot be used as assignment keys
- Examples:

```elixir
Predicator.context_location("user.profile.name", %{})                # {:ok, ["user", "profile", "name"]}
Predicator.context_location("items[0]", %{})                         # {:ok, ["items", 0]}
Predicator.context_location("data['users'][i]['name']", %{"i" => 2}) # {:ok, ["data", "users", 2, "name"]}
Predicator.context_location("len(name)", %{})                        # {:error, %LocationError{type: :not_assignable}}
Predicator.context_location("42", %{})                               # {:error, %LocationError{type: :not_assignable}}
```

- Assignable Locations: Simple identifiers, property access, bracket access, mixed notation
- Non-Assignable: Literals, function calls, arithmetic expressions, comparisons, any computed values
- Mixed Notation Support: `user.settings['theme']`, `data['users'][0].profile` fully supported
- SCXML Integration: Enables safe assignment operations while preventing assignment to computed expressions
- Changed: Complete reimplementation of dot notation parsing from dotted identifiers to proper property access AST
- Breaking: Expressions like `user.email` now parsed as `{:property_access, {:identifier, "user"}, "email"}` instead of `{:identifier, "user.email"}`
- Impact: Context keys with dots like `"user.email"` will no longer match the identifier `user.email` - they are now parsed as property access
- Instructions: Evaluation now generates separate `load` and `access` instructions instead of a single `load` with a dotted name
- Benefit: Enables proper mixed notation like `user.settings['theme']` and SCXML location expressions
- Migration: Use proper nested data structures `%{"user" => %{"email" => "..."}}` instead of flat keys `%{"user.email" => "..."}`
- Lexer Change: Dots removed from valid identifier characters, now parsed as separate tokens
- Parser Enhancement: Added property access grammar `postfix → primary ( "[" expression "]" | "." IDENTIFIER )*`
- New AST Nodes: `{:property_access, left_node, property}` for dot notation parsing
- Evaluator Update: New `access` instruction handler, removed old dotted identifier support from `load_from_context`
- Full Compatibility: All existing expressions without dots work exactly as before
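The migration in practice (sketch using the public API; the email value is illustrative):

```elixir
# Before: a flat key with a dot no longer matches, because user.email
# is now parsed as property access on the identifier `user`.
Predicator.evaluate("user.email = 'a@b.co'", %{"user.email" => "a@b.co"})

# After: nest the data so property access can navigate into it.
Predicator.evaluate("user.email = 'a@b.co'", %{"user" => %{"email" => "a@b.co"}})
```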
- Removed: Global function registry system (`Predicator.Functions.Registry` module)
- Removed: `Predicator.register_function/3`, `Predicator.clear_custom_functions/0`, `Predicator.list_custom_functions/0`
- Changed: Custom functions now passed via `functions:` option in `evaluate/3` calls instead of global registration
- Benefit: Thread-safe, no global state, per-evaluation function scoping
- Migration: Replace registry calls with function maps passed to `evaluate/3`
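Replacing a removed registry call with the per-evaluation option looks like this (sketch; `double` is a hypothetical custom function):

```elixir
# Before (removed): Predicator.register_function/3 registered globally.
# After: scope the function to a single call via the functions: option.
funs = %{"double" => {1, fn [n], _ctx -> {:ok, n * 2} end}}

Predicator.evaluate("double(21) = 42", %{}, functions: funs)
```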
- Changed: Variables containing dots (e.g., `"user.email"`) now parsed as nested access paths
- Impact: Context keys like `"user.profile.name"` will no longer match identifier `user.profile.name`
- Solution: Use proper nested data structures instead of flat keys with dots
- Add token type to `lexer.ex`
- Add parsing logic to `parser.ex`
- Add instruction type to `types.ex`
- Add evaluation logic to `evaluator.ex`
- Add compilation logic to `compiler.ex`
- Add string formatting to `string_visitor.ex`
- Add comprehensive tests
- Update lexer tokenization (see date implementation)
- Update parser grammar and AST types
- Update type specifications in `types.ex`
- Add evaluation support with type checking
- Add string visitor formatting support
- Add tests for all pipeline components
- Use `mix test --trace` for detailed test output
- Check coverage with `mix test.coverage.html`
- Use `mix dialyzer` for type issues
- Run `mix credo explain <issue>` for linting details
- Unit Tests: Each component tested in isolation
- Integration Tests: Full pipeline testing in `predicator_test.exs`
- Property Testing: Comprehensive input validation
- Error Path Testing: All error conditions covered
- Round-trip Testing: AST → String → AST consistency
- Current Test Count: 886 tests (65 doctests + 821 regular tests)
- Documentation: All public functions have `@doc` and `@spec`
- Type Safety: Comprehensive `@type` and `@spec` definitions
- Error Handling: Consistent `{:ok, result} | {:error, ...}` patterns
- Testing: >90% coverage requirement
- Formatting: Automatic with `mix format`
- Linting: Credo strict mode compliance
- Lexer/parser complexity is intentional and appropriate
- String concatenation optimized in StringVisitor
- Instruction execution designed for repeated evaluation
- Memory usage minimized during compilation pipeline
- Credo Complexity: Intentionally suppressed for lexer/parser functions
- Doctest Escaping: Use simple examples without nested quotes
- Coverage Gaps: Focus on error paths and edge cases
- Type Errors: Check `@spec` definitions match implementation
- Elixir ~> 1.11 required
- All dependencies in development/test only
- No runtime dependencies for core functionality
When creating git commit messages:
- be concise but informative, and highlight the functional changes
- no need to mention code quality improvements, as they are expected (unless the functional change is about code quality improvements)
- commit titles should be less than 50 characters and use the simple present tense (active voice)
- commit descriptions should wrap at about 72 characters and also use the simple present tense (active voice)