Commit f5f65cd

feat: Add initial draft of the ProX Programming Language specification.
1 parent 1ebd784 commit f5f65cd

1 file changed

Lines changed: 304 additions & 0 deletions

SPEC.md

# ProX Programming Language Specification

**Version:** 1.0.0-alpha
**Status:** Draft
**Date:** January 2026
**Author:** ProXPL Design Team

---

## 1. Introduction

### 1.1. Purpose
This document defines the formal specification of the **ProX Programming Language** (ProXPL). It serves as the authoritative reference for the language's syntax, semantics, type system, and execution model. This specification is intended for compiler implementers, tool developers, and advanced users who need precise knowledge of the language's behavior.

### 1.2. Scope
This specification describes the core language features, including lexical structure, grammar, type system, memory model, and standard library interfaces. It distinguishes between mandatory language features and implementation-specific behaviors.

### 1.3. Goals
ProXPL is designed with the following key objectives:
- **Performance**: Predictable performance suitable for systems programming and backend development.
- **Clarity**: A clean, readable syntax inspired by Python and JavaScript to reduce cognitive load.
- **Scalability**: Support for large codebases through a robust module system and static typing.
- **Safety**: Compile-time type safety to eliminate common classes of runtime errors.

### 1.4. Non-Goals
- **Undefined Behavior**: Eliminating undefined behavior entirely is not a goal. The language aims to minimize it, but unsafe interfaces (FFI) are provided for system-level access.
- **Dynamic Typing**: ProXPL is fundamentally statically typed. Type inference is supported, but it does not make the language dynamically typed.

---

## 2. Language Overview

ProXPL is a statically typed, general-purpose programming language. It combines the expressive high-level syntax of scripting languages with the performance characteristics of compiled languages.

### 2.1. Design Philosophy
- **Explicit is better than implicit**, except where type inference improves readability without sacrificing safety.
- **Composition over inheritance**, favoring interfaces and traits (planned) over deep class hierarchies.
- **Batteries included**, providing a comprehensive standard library for common tasks.

### 2.2. Intended Use Cases
- Systems tooling and CLI applications.
- High-performance web servers and network services.
- Embedded logic and game scripting.
- Cross-platform application development.

---

## 3. Lexical Structure

ProXPL source code is encoded in UTF-8.

### 3.1. Formatting
The language is free-form. Whitespace (spaces, tabs, newlines) acts as a token separator but is otherwise ignored, except in string literals. Semicolons (`;`) terminate statements. The current grammar requires valid statement termination; how strictly this is enforced may change in future versions.

### 3.2. Identifiers
Identifiers name variables, types, functions, and other entities.
- **Pattern**: `[a-zA-Z_][a-zA-Z0-9_]*`
- **Case Sensitivity**: Identifiers are case-sensitive. `myVar` and `myvar` are distinct.
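
As an illustration only, the identifier rule above can be checked with a few lines of Python. The pattern is copied from this section; the helper name `is_identifier` and the small keyword sample are assumptions made for the sketch, not part of ProXPL.

```python
import re

# Pattern copied from section 3.2; fullmatch anchors it to the whole string.
IDENTIFIER = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")

# A small sample of the reserved words from section 3.3 (not the full table).
SAMPLE_KEYWORDS = {"let", "const", "func", "if", "else", "while", "return"}


def is_identifier(text):
    """Return True if text matches the pattern and is not a sampled keyword."""
    return IDENTIFIER.fullmatch(text) is not None and text not in SAMPLE_KEYWORDS
```

Under this sketch `myVar` and `_x1` are accepted, while `1abc` (leading digit) and `let` (reserved) are rejected. Non-ASCII letters are also rejected, because the pattern only covers ASCII ranges even though source files are UTF-8.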

### 3.3. Keywords
The following tokens are reserved and cannot be used as identifiers:

| Declaration | Control Flow | Types/Values | OOP/Async | Modules |
| :--- | :--- | :--- | :--- | :--- |
| `let` | `if` | `true` | `class` | `use` |
| `const` | `else` | `false` | `new` | `import` |
| `func` | `while` | `null` | `this` | `from` |
| `native` | `for` | `void` | `super` | `as` |
| `extern` | `return` | `int` | `extends` | |
| | `break` | `float` | `interface` | |
| | `continue` | `bool` | `static` | |
| | `switch` | `string` | `async` | |
| | `case` | | `await` | |
| | `default` | | | |
| | `try` | | | |
| | `catch` | | | |
| | `throw` | | | |

### 3.4. Literals

- **Integer Literals**: Decimal (`123`), Hexadecimal (`0x7B`), Binary (`0b1111011`).
- **Floating-Point Literals**: Decimal with fractional part or exponent (`3.14`, `1.0e-5`).
- **Boolean Literals**: `true`, `false`.
- **String Literals**: Enclosed in double quotes (`"hello"`). Supported escape sequences: `\n`, `\t`, `\r`, `\"`, `\\`.
- **Null Literal**: `null`.
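
The integer and string rules above can be sketched in Python as follows. This is a rough illustration, not the reference lexer: the function names are hypothetical, and it assumes lowercase `0x`/`0b` prefixes, no digit separators, and that any escape outside the listed five is an error, none of which the specification states explicitly.

```python
import re

# Assumption: prefixes are lowercase and digits carry no separators.
INT_LITERAL = re.compile(r"0x[0-9a-fA-F]+|0b[01]+|[0-9]+")

# The five escape sequences listed in section 3.4.
ESCAPES = {"n": "\n", "t": "\t", "r": "\r", '"': '"', "\\": "\\"}


def parse_int_literal(text):
    """Convert a decimal, hexadecimal, or binary literal to an int."""
    if INT_LITERAL.fullmatch(text) is None:
        raise ValueError(f"not an integer literal: {text!r}")
    if text.startswith("0x"):
        return int(text[2:], 16)
    if text.startswith("0b"):
        return int(text[2:], 2)
    return int(text, 10)


def decode_string_body(body):
    """Resolve escape sequences in the text between the double quotes."""
    out = []
    i = 0
    while i < len(body):
        if body[i] == "\\":
            if i + 1 >= len(body) or body[i + 1] not in ESCAPES:
                raise ValueError("unknown or incomplete escape sequence")
            out.append(ESCAPES[body[i + 1]])
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)
```

All three integer spellings shown in the list (`123`, `0x7B`, `0b1111011`) denote the same value under this sketch.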

### 3.5. Comments
- **Line Comments**: Begin with `//` and continue to end of line.
- **Block Comments**: Begin with `/*` and end with `*/`. Nesting is not supported.

---

## 4. Syntax and Grammar

The grammar is defined using Extended Backus-Naur Form (EBNF) notation.

### 4.1. Declarations
A program consists of a sequence of declarations.

```ebnf
program     ::= declaration* EOF
declaration ::= funcDecl | varDecl | classDecl | statement
```

#### 4.1.1. Variable Declaration
Variables are declared using `let` (mutable) or `const` (immutable). A `const` declaration must include an initializer.

```ebnf
varDecl ::= "let" IDENTIFIER ( ":" type )? ( "=" expression )? ";"
          | "const" IDENTIFIER ( ":" type )? "=" expression ";"
```

#### 4.1.2. Function Declaration
Functions are declared with the `func` keyword. Type annotations on parameters and on the return value are optional.

```ebnf
funcDecl   ::= "func" IDENTIFIER "(" parameters? ")" ( ":" type )? block
parameters ::= IDENTIFIER ( ":" type )? ( "," IDENTIFIER ( ":" type )? )*
```

### 4.2. Statements
Statements perform actions or direct control flow.

```ebnf
statement ::= exprStmt
            | ifStmt
            | whileStmt
            | forStmt
            | returnStmt
            | block
            | tryStmt
```

- **Block**: `{ statement* }` introduces a new scope.
- **If**: `if (expr) stmt (else stmt)?`
- **While**: `while (expr) stmt`
- **For**: `for (init?; cond?; incr?) stmt`
- **Return**: `return expr? ;`

### 4.3. Expressions
Expressions evaluate to a value. Precedence follows the conventional C/Java ordering; the rules below run from lowest precedence (`assignment`) to highest (`primary`).

```ebnf
expression ::= assignment
assignment ::= IDENTIFIER "=" assignment
             | logicOr
logicOr    ::= logicAnd ( "||" logicAnd )*
logicAnd   ::= equality ( "&&" equality )*
equality   ::= comparison ( ( "!=" | "==" ) comparison )*
comparison ::= term ( ( ">" | ">=" | "<" | "<=" ) term )*
term       ::= factor ( ( "-" | "+" ) factor )*
factor     ::= unary ( ( "/" | "*" | "%" ) unary )*
unary      ::= ( "!" | "-" ) unary | call
call       ::= primary ( "(" arguments? ")" | "." IDENTIFIER )*
primary    ::= NUMBER | STRING | "true" | "false" | "null"
             | "(" expression ")" | IDENTIFIER
             | "[" elements? "]" | "{" members? "}"
```
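
To show how the layered rules above produce precedence and left associativity, here is a small recursive-descent sketch in Python. It is not the ProXPL parser: it covers only the `logicOr` through `unary` levels plus integer and parenthesized primaries, and the names `tokenize`, `Parser`, and `parse` as well as the tuple-shaped tree are assumptions for the example.

```python
import re

# Integers, two-character operators, then single-character operators.
TOKEN = re.compile(r"\s*(\d+|\|\||&&|[!=<>]=|[-+*/%!<>()])")


def tokenize(src):
    src = src.rstrip()
    tokens, pos = [], 0
    while pos < len(src):
        match = TOKEN.match(src, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def binary(self, operand, operators):
        # Implements `rule ::= operand ( op operand )*`, left-associative.
        node = operand()
        while self.peek() in operators:
            op = self.advance()
            node = (op, node, operand())
        return node

    # One method per grammar level; each defers to the next-tighter level.
    def logic_or(self):
        return self.binary(self.logic_and, {"||"})

    def logic_and(self):
        return self.binary(self.equality, {"&&"})

    def equality(self):
        return self.binary(self.comparison, {"!=", "=="})

    def comparison(self):
        return self.binary(self.term, {">", ">=", "<", "<="})

    def term(self):
        return self.binary(self.factor, {"-", "+"})

    def factor(self):
        return self.binary(self.unary, {"/", "*", "%"})

    def unary(self):
        if self.peek() in {"!", "-"}:
            op = self.advance()
            return (op, self.unary())
        return self.primary()

    def primary(self):
        if self.peek() is None:
            raise SyntaxError("unexpected end of input")
        token = self.advance()
        if token.isdigit():
            return int(token)
        if token == "(":
            node = self.logic_or()
            if self.peek() != ")":
                raise SyntaxError("expected ')'")
            self.advance()
            return node
        raise SyntaxError(f"unexpected token {token!r}")


def parse(src):
    parser = Parser(tokenize(src))
    node = parser.logic_or()
    if parser.peek() is not None:
        raise SyntaxError("unexpected trailing tokens")
    return node
```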

---

## 5. Type System

ProXPL employs a **static type system** with extensive **type inference**. Types are checked at compile time to ensure safety.

### 5.1. Primitive Types
The type system distinguishes logical types, though implementation representations may vary (e.g., boxing).

| Type | Description |
| :--- | :--- |
| `int` | Signed 64-bit integer (compile-time constraint). |
| `float` | 64-bit IEEE 754 floating-point number. |
| `bool` | Boolean value (`true` or `false`). |
| `string` | Immutable sequence of UTF-8 characters. |
| `void` | Absence of value (return type only). |
| `any` | Dynamic type (escape hatch from static checking). |

> **Note**: In the current 1.0 runtime, `int` and `float` may both be represented as double-precision floats (NaN tagging), so integers are exact only within 53 bits of precision.
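
The 53-bit limit in the note above follows from the 53-bit significand of an IEEE 754 double. The Python sketch below demonstrates the effect with Python's own `float`, which is a double on common platforms; `MAX_SAFE_INT` and `survives_double` are names invented for the illustration and say nothing about how the ProXPL runtime is written.

```python
# A double carries a 53-bit significand, so every integer with magnitude
# up to 2**53 is exactly representable; above that, gaps appear.
MAX_SAFE_INT = 2**53


def survives_double(n):
    """True if n is unchanged by a round trip through a 64-bit float."""
    return int(float(n)) == n
```

In this sketch `survives_double(MAX_SAFE_INT)` holds while `survives_double(MAX_SAFE_INT + 1)` does not, which is the kind of silent rounding a program relying on full 64-bit `int` range would hit under the current representation.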

### 5.2. Composite Types
- **List**: Mutable, ordered collection of values (`[1, 2, 3]`). Heterogeneous if typed as `List<any>`.
- **Dictionary**: Key-value map (`{"key": "value"}`).
- **Function**: First-class function type.

### 5.3. Object Types
- **Class**: User-defined type containing fields and methods.
- **Interface**: Abstract definition of behavior (method signatures).
- **Instance**: Concrete instance of a Class.

### 5.4. Type Inference
The compiler infers types for variables initialized at declaration.
```javascript
let x = 10; // Inferred as int
let y = x;  // Inferred as int
```
Function parameters without a type annotation default to `any` under the current grammar; strict mode requires explicit types.

---

## 6. Memory Model

### 6.1. Allocation
- **Values**: Primitive types (booleans, numbers, null) are typically passed by value.
- **Objects**: Complex types (strings, lists, instances) are allocated on the heap and accessed via references.

### 6.2. Garbage Collection
ProXPL manages memory automatically using a **Mark-and-Sweep Garbage Collector**.
- **Reachability**: Objects are retained as long as they are reachable from the root set (global variables, stack, upvalues).
- **Cycles**: Because collection is based on reachability rather than reference counts, reference cycles that are no longer reachable are reclaimed.
- **Trigger**: GC is triggered by allocation pressure or by explicit request (`std.gc.collect()`).
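
The reachability and cycle behavior described above can be sketched with a toy mark-and-sweep collector in Python. This is a teaching model under stated assumptions, not the ProXPL collector: `Obj`, `Heap`, `alloc`, and `collect` are hypothetical names, and `roots` is a plain list standing in for globals, stack slots, and upvalues.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []       # outgoing references to other heap objects
        self.marked = False


class Heap:
    def __init__(self):
        self.objects = []    # every object allocated so far
        self.roots = []      # stand-in for globals, stack slots, upvalues

    def alloc(self, name):
        obj = Obj(name)
        self.objects.append(obj)
        return obj

    def collect(self):
        """Run one mark-and-sweep cycle and return the number of objects freed."""
        # Mark phase: flag everything reachable from the roots.
        worklist = list(self.roots)
        while worklist:
            obj = worklist.pop()
            if not obj.marked:
                obj.marked = True
                worklist.extend(obj.refs)
        # Sweep phase: keep marked objects and clear marks for the next cycle.
        survivors = [obj for obj in self.objects if obj.marked]
        freed = len(self.objects) - len(survivors)
        for obj in survivors:
            obj.marked = False
        self.objects = survivors
        return freed
```

If two objects reference each other but neither is reachable from `roots`, the mark phase never visits them and the sweep drops both, which is why cycles need no special handling in this scheme.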

### 6.3. Lifetime
Variables declared in a block (`{}`) exist until the end of that block. Objects created within the block persist for as long as they remain reachable through references.

---

## 7. Execution Model

ProXPL defines a dual-mode execution environment.

### 7.1. Bytecode Interpretation (Default)
The source code is compiled to platform-independent **bytecode**.
- **VM**: A stack-based Virtual Machine executes the bytecode.
- **Chunk**: Bytecode is organized into chunks containing instructions and constants.
- **Performance**: Optimization passes (constant folding, dead code elimination) run during bytecode generation.
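
A minimal sketch of the chunk-plus-stack design described above, written in Python. The opcode names, the `Chunk` layout, and the `run` loop are all assumptions made for illustration; the specification does not define the actual instruction set.

```python
from dataclasses import dataclass, field

# Hypothetical opcodes; the real instruction set is not defined in this document.
OP_CONSTANT, OP_ADD, OP_MULTIPLY, OP_RETURN = range(4)


@dataclass
class Chunk:
    code: list = field(default_factory=list)        # opcodes and operands
    constants: list = field(default_factory=list)   # constant pool

    def emit(self, *items):
        self.code.extend(items)

    def emit_constant(self, value):
        # Store the value in the pool and emit a load of its index.
        self.constants.append(value)
        self.emit(OP_CONSTANT, len(self.constants) - 1)


def run(chunk):
    """Execute a chunk on an operand stack and return the final value."""
    stack = []
    ip = 0  # instruction pointer into chunk.code
    while True:
        op = chunk.code[ip]
        ip += 1
        if op == OP_CONSTANT:
            stack.append(chunk.constants[chunk.code[ip]])
            ip += 1
        elif op == OP_ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == OP_MULTIPLY:
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif op == OP_RETURN:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")
```

For the expression `(1 + 2) * 3`, a compiler in this model would emit the constants `1` and `2`, an add, the constant `3`, a multiply, and a return. A constant-folding pass could instead emit a single constant `9`.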

### 7.2. Native Compilation (AOT)
ProXPL supports Ahead-of-Time (AOT) compilation via an **LLVM Backend**, which:
- Transforms ProXPL intermediate representation (IR) into LLVM IR.
- Compiles to native machine code for maximum performance.
- Eliminates interpretation overhead.

### 7.3. Concurrency
ProXPL integrates **asynchronous execution** directly into the runtime.
- **Async/Await**: Syntax for non-blocking operations.
- **Coroutines**: Built on LLVM Coroutines (or on fibers in the VM), allowing functions to suspend and later resume with their state intact.
- **Event Loop**: A scheduler manages the execution of tasks.
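
The suspend-and-resume model above can be approximated with Python generators and a round-robin queue. This is only an analogy: generators stand in for coroutines or fibers, and `run_tasks` and `worker` are hypothetical names, not the ProXPL scheduler or its API.

```python
from collections import deque


def run_tasks(tasks):
    """Round-robin scheduler: resume each task in turn until all have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run the task up to its next suspension point
        except StopIteration:
            continue            # the task finished, so it is not rescheduled
        ready.append(task)      # still alive: queue it for another turn


def worker(name, steps, log):
    """A task that records one entry per step and suspends after each."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield                   # suspend; local state stays in the generator frame
```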

---

## 8. Error Handling

### 8.1. Compile-Time Errors
The compiler detects:
- Syntax errors.
- Type mismatches (e.g., adding `bool` to `float`).
- Undefined variables or symbols.
- Argument count mismatches.

### 8.2. Runtime Errors
Conditions that cannot be checked statically are reported at runtime in one of two ways:
- **Exceptions**: Structured handling with `try`, `catch`, and `throw` (roadmap: v1.1).
- **Panic**: Unrecoverable system errors (e.g., stack overflow, out of memory) terminate the process.

---

## 9. Standard Library

The versioned Standard Library provides essential capabilities under stability guarantees.

### 9.1. Modules
- **`std.core`**: Fundamental types and assertions.
- **`std.io`**: File and console I/O.
- **`std.math`**: Mathematical functions and constants.
- **`std.sys`**: System interaction (env vars, processes).
- **`std.net`**: Networking capabilities (planned).

### 9.2. Stability
APIs within the `std` namespace follow semantic versioning. Breaking changes occur only in major version increments.

---

## 10. Foreign Function Interface (FFI)

ProXPL allows binding to native system libraries.
- **`extern` keyword**: Defines signatures for external C functions.
- **Dynamic Loading**: Libraries (`.dll`, `.so`) are loaded at runtime.
- **Type Mapping**: ProXPL types are marshaled to C-compatible types (Number → double/int, String → char*).

```javascript
extern "kernel32.dll" "Sleep" func sleep(ms);
```
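
The type mapping listed above can be pictured with Python's `ctypes` module, which performs a comparable conversion to C-compatible values. This is an analogy rather than the ProXPL marshaling code: the `marshal` helper is hypothetical, the choice of a 64-bit C integer is an assumption (the specification only says "double/int"), and no native library is loaded.

```python
import ctypes


def marshal(value):
    """Wrap a Python value in a C-compatible ctypes object."""
    if isinstance(value, bool):
        # The specification does not describe a mapping for booleans.
        raise TypeError("no C mapping defined for bool in this sketch")
    if isinstance(value, int):
        return ctypes.c_int64(value)    # assumption: 64-bit C integer
    if isinstance(value, float):
        return ctypes.c_double(value)   # Number -> double
    if isinstance(value, str):
        # String -> char*: encode to UTF-8 bytes; ctypes adds the NUL terminator.
        return ctypes.c_char_p(value.encode("utf-8"))
    raise TypeError(f"no C mapping for {type(value).__name__}")
```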

---

## 11. Undefined and Implementation-Defined Behavior

### 11.1. Implementation-Defined
- **Integer Size**: `int` values are currently stored as double-precision floats, so only integers within the 53-bit safe range are exact. Future versions may implement strict 64-bit integers.
- **Hash Order**: Iteration order of Dictionaries is implementation-dependent (not guaranteed).

### 11.2. Undefined Behavior
- Modifying a collection while iterating over it.
- Accessing FFI pointers incorrectly.
- Relying on specific GC timing.

---

## 12. Future Evolution

ProXPL evolves through a Request for Comments (RFC) process. Breaking changes are reserved for major releases. The roadmap includes Generic Programming and Pattern Matching for v2.0.

***
*End of Specification*
