# AeThex Language - Technical Specification & Compiler Implementation Guide
## Document Info
- **Status:** Production Reference
- **Version:** 1.0.0
- **Last Updated:** February 20, 2026
- **Target:** AeThex Language Compiler Development
---
## Table of Contents
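1. [Language Specification](#language-specification)
2. [Compiler Architecture](#compiler-architecture)
3. [Compiler Architecture Diagram](#compiler-architecture-diagram)
4. [Implementation Roadmap](#implementation-roadmap)
5. [API Reference](#api-reference)
6. [Configuration Format](#configuration-format)
7. [Error Handling](#error-handling)
8. [Performance Targets](#performance-targets)
9. [Testing Strategy](#testing-strategy)
10. [Module System](#module-system)
11. [Security Considerations](#security-considerations)
12. [Standards & References](#standards--references)
13. [Support & References](#support--references)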
---
## Language Specification
### Lexical Elements
#### Keywords
**Declarations:**
- `reality` - Start reality/namespace declaration
- `journey` - Start journey/function declaration
- `let` - Variable declaration
- `import` - Import libraries/modules
**Control Flow:**
- `when` - Conditional (if)
- `otherwise` - Else clause
- `return` - Exit early from journey
- `platform` - Platform specifier
**Operations:**
- `notify` - Output/logging
- `reveal` - Return value
- `sync` - Data synchronization
- `across` - Sync target platforms
- `new` - Object instantiation
#### Identifiers
- Start with letter or underscore
- Contain letters, numbers, underscores
- Case-sensitive
- Examples: `playerName`, `_private`, `CONSTANT`, `Game1`
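
The identifier rules above fit in a single regular expression. The sketch below assumes "letters" means ASCII letters and that the keywords listed in this spec are reserved; `isValidIdentifier` and `RESERVED_WORDS` are hypothetical names used only for illustration.

```typescript
// Start with a letter or underscore, then letters, digits, or underscores.
const IDENTIFIER_PATTERN = /^[A-Za-z_][A-Za-z0-9_]*$/;

// Assumption: the keywords listed in this spec cannot be used as identifiers.
const RESERVED_WORDS = new Set([
  'reality', 'journey', 'let', 'import',
  'when', 'otherwise', 'return', 'platform',
  'notify', 'reveal', 'sync', 'across', 'new',
]);

function isValidIdentifier(name: string): boolean {
  return IDENTIFIER_PATTERN.test(name) && !RESERVED_WORDS.has(name);
}
```

Because the pattern has no case-insensitive flag, `Game1` and `game1` remain distinct names, which matches the case-sensitivity rule.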
#### Literals
- **String:** `"hello"` or `'hello'`
- **Number:** `123`, `45.67`, `0xFF`
- **Boolean:** Implicit in when conditions
- **Array:** `[value1, value2]` or `[platform1, platform2]`
- **Object:** `{ key: value, key2: value2 }`
#### Comments
- Single line: `# comment to end of line`
- Multi-line: Not supported (use multiple `#`)
### Grammar
#### Reality Declaration
```
REALITY ::= "reality" IDENTIFIER "{" REALITY_BODY "}"
REALITY_BODY ::= (PROPERTY)*
PROPERTY ::= IDENTIFIER ":" (IDENTIFIER | ARRAY | STRING)
ARRAY ::= "[" (IDENTIFIER ("," IDENTIFIER)*)? "]" | "all"
```
**Example:**
```aethex
reality MyGame {
  platforms: [roblox, web]
  type: "multiplayer"
}
```
#### Journey Declaration
```
JOURNEY ::= "journey" IDENTIFIER "(" PARAMS? ")" "{" JOURNEY_BODY "}"
PARAMS ::= IDENTIFIER ("," IDENTIFIER)*
JOURNEY_BODY ::= (STATEMENT)*
STATEMENT ::= WHEN_STMT | LET_STMT | EXPR_STMT | RETURN_STMT
```
**Example:**
```aethex
journey Greet(name) {
  platform: all
  notify "Hello, " + name
}
```
#### When Statement (Conditional)
```
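WHEN_STMT ::= "when" EXPR "{" BODY "}" ("otherwise" "{" BODY "}")?
EXPR ::= COMPARISON | OPERAND
COMPARISON ::= OPERAND ("<" | ">" | "==" | "!=" | "<=" | ">=") OPERAND
OPERAND ::= FUNCTION_CALL | MEMBER_ACCESS | IDENTIFIER | NUMBER | STRING
MEMBER_ACCESS ::= IDENTIFIER ("." IDENTIFIER)+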
```
**Example:**
```aethex
when player.age < 13 {
  notify "Parent consent required"
} otherwise {
  reveal player.data
}
```
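
To illustrate how a `when`/`otherwise` statement could lower to JavaScript, here is a minimal sketch. It assumes the condition and the statements inside each block have already been generated as JavaScript text; `WhenNode` and `generateWhenJs` are hypothetical names and much simpler than the `When` AST node defined later in this document.

```typescript
// Hypothetical, simplified node: all fields hold already-generated JavaScript text.
interface WhenNode {
  condition: string;
  body: string[];
  otherwise?: string[];
}

function generateWhenJs(node: WhenNode): string {
  // Indent each statement by two spaces and end it with a newline
  const block = (statements: string[]): string =>
    statements.map((stmt) => `  ${stmt}\n`).join('');

  let code = `if (${node.condition}) {\n${block(node.body)}}`;
  if (node.otherwise !== undefined) {
    code += ` else {\n${block(node.otherwise)}}`;
  }
  return code + '\n';
}

const js = generateWhenJs({
  condition: 'player.age < 13',
  body: ['console.log("Parent consent required");'],
  otherwise: ['return player.data;'],
});
```

Here `js` holds an `if`/`else` statement whose branches correspond to the `when` and `otherwise` blocks of the example above.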
#### Platform-Specific Code
```
PLATFORM_BLOCK ::= "platform" ":" (IDENTIFIER | "{" PLATFORM_BODY "}")
PLATFORM_BODY ::= ("platform" ":" IDENTIFIER "{" BODY "}")+
```
**Example:**
```aethex
platform: roblox {
  reveal leaderboardGUI
}
platform: web {
  reveal leaderboardHTML
}
```
#### Synchronization
```
SYNC_STMT ::= "sync" IDENTIFIER "across" ARRAY
IMPORT_STMT ::= "import" "{" IMPORT_LIST "}" "from" STRING
IMPORT_LIST ::= IDENTIFIER ("," IDENTIFIER)*
```
**Example:**
```aethex
import { Passport, DataSync } from "@aethex.os/core"
sync player.data across [roblox, web]
```
### Type System
AeThex has implicit typing with these base types:
- **string** - Text values
- **number** - Numeric values (int or float)
- **boolean** - True/false (implicit from conditions)
- **object** - Key-value data
- **array** - Indexed collections
- **any** - Dynamic/unknown types
**Type Checking:**
- Happens at compile-time
- Automatic type inference
- Runtime type validation for critical paths
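
As a rough illustration of automatic type inference, the sketch below maps the text of a literal or simple expression to one of the base types listed above. It is a deliberate simplification: a real implementation would work on AST nodes rather than raw text, and `inferLiteralType` is a hypothetical helper, not part of the compiler API.

```typescript
type AeThexType = 'string' | 'number' | 'boolean' | 'object' | 'array' | 'any';

function inferLiteralType(literal: string): AeThexType {
  const text = literal.trim();
  // Quoted text is a string: "hello" or 'hello'
  if (/^(?:"[^"]*"|'[^']*')$/.test(text)) return 'string';
  // Decimal, float, or hexadecimal literal: 123, 45.67, 0xFF
  if (/^(?:0x[0-9a-fA-F]+|\d+(?:\.\d+)?)$/.test(text)) return 'number';
  if (text.startsWith('[') && text.endsWith(']')) return 'array';
  if (text.startsWith('{') && text.endsWith('}')) return 'object';
  // Booleans are implicit: a comparison yields a boolean
  if (/==|!=|<=|>=|<|>/.test(text)) return 'boolean';
  // Anything else (such as a bare identifier) needs more context
  return 'any';
}
```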
---
## Compiler Architecture
### Stage 1: Lexical Analysis (Lexer)
**Input:** `.aethex` source code (string)
**Output:** Token stream
```typescript
interface Token {
  type: 'KEYWORD' | 'IDENTIFIER' | 'STRING' | 'NUMBER' | 'OPERATOR' | 'PUNCTUATION';
  value: string;
  line: number;
  column: number;
}
```
**Process:**
1. Read source code character by character
2. Recognize patterns (keywords, identifiers, literals, operators)
3. Generate tokens with position information
4. Handle comments (skip `#` lines)
5. Report lexical errors
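
The steps above can be sketched as a small standalone tokenizer. This is a minimal illustration rather than the reference lexer: the operator and punctuation sets are assumptions drawn from the examples in this spec, and it is written as a free function `tokenize` instead of the `Lexer` class so that the block stays self-contained.

```typescript
type TokenType = 'KEYWORD' | 'IDENTIFIER' | 'STRING' | 'NUMBER' | 'OPERATOR' | 'PUNCTUATION';

interface Token {
  type: TokenType;
  value: string;
  line: number;
  column: number;
}

const KEYWORDS = new Set([
  'reality', 'journey', 'let', 'import', 'when', 'otherwise', 'return',
  'platform', 'notify', 'reveal', 'sync', 'across', 'new',
]);

// Ordered rules, each anchored to the start of the remaining input.
// Assumption: operators and punctuation are limited to those used in this spec.
const RULES: Array<[TokenType, RegExp]> = [
  ['NUMBER', /^(?:0x[0-9a-fA-F]+|\d+(?:\.\d+)?)/],
  ['STRING', /^(?:"[^"\n]*"|'[^'\n]*')/],
  ['IDENTIFIER', /^[A-Za-z_][A-Za-z0-9_]*/],
  ['OPERATOR', /^(?:==|!=|<=|>=|[<>+\-*\/=])/],
  ['PUNCTUATION', /^[{}()\[\]:,.]/],
];

function tokenize(source: string): Token[] {
  const tokens: Token[] = [];
  let pos = 0;
  let line = 1;
  let column = 1;

  while (pos < source.length) {
    const ch = source[pos];

    if (ch === '\n') { pos++; line++; column = 1; continue; }
    if (ch === ' ' || ch === '\t' || ch === '\r') { pos++; column++; continue; }
    if (ch === '#') {
      // Comment: skip to the end of the line (the newline is handled above)
      while (pos < source.length && source[pos] !== '\n') pos++;
      continue;
    }

    const rest = source.slice(pos);
    let matched = false;
    for (const [type, pattern] of RULES) {
      const match = pattern.exec(rest);
      if (match === null) continue;
      const value = match[0];
      // Identifiers found in the keyword list become KEYWORD tokens
      const finalType: TokenType =
        type === 'IDENTIFIER' && KEYWORDS.has(value) ? 'KEYWORD' : type;
      tokens.push({ type: finalType, value, line, column });
      pos += value.length;
      column += value.length;
      matched = true;
      break;
    }

    if (!matched) {
      // Lexical error, reported with position information
      throw new Error(`Unexpected character '${ch}' at ${line}:${column}`);
    }
  }
  return tokens;
}
```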
**Key Methods:**
```typescript
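declare class Lexer {
  constructor(source: string);
  tokenize(): Token[];          // Scan the whole source and return every token
  nextToken(): Token;           // Consume and return the next token
  peek(): Token;                // Return the next token without consuming it
  consume(type: string): Token; // Consume the next token; fail if its type differs
}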
```
### Stage 2: Syntax Analysis (Parser)
**Input:** Token stream
**Output:** Abstract Syntax Tree (AST)
```typescript
interface ASTNode {
  type: string;
  [key: string]: any;
}

interface Reality extends ASTNode {
  type: 'Reality';
  name: string;
  platforms: string[];
  properties: Record<string, any>;
}

interface Journey extends ASTNode {
  type: 'Journey';
  name: string;
  params: string[];
  platforms: string[]; // From the journey's `platform:` specifier; read by the generators
  body: Statement[];
}

interface When extends ASTNode {
  type: 'When';
  condition: Expression;
  body: Statement[];
  otherwise?: Statement[];
}
```
**Process:**
1. Parse top-level declarations (reality, journey, import)
2. Parse statements and expressions recursively
3. Build AST respecting language grammar
4. Report syntax errors with line/column info
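
As an illustration of the recursive-descent approach, the sketch below parses only the reality declaration grammar. `RealityParser`, `RealityNode`, and the `check`/`expect` helpers are hypothetical names, and the node shape is simplified relative to the `Reality` interface above.

```typescript
interface Token {
  type: string;
  value: string;
  line: number;
  column: number;
}

interface RealityNode {
  type: 'Reality';
  name: string;
  properties: Record<string, string | string[]>;
}

class RealityParser {
  private pos = 0;
  private readonly tokens: Token[];

  constructor(tokens: Token[]) {
    this.tokens = tokens;
  }

  // REALITY ::= "reality" IDENTIFIER "{" (PROPERTY)* "}"
  parseReality(): RealityNode {
    this.expect('KEYWORD', 'reality');
    const name = this.expect('IDENTIFIER').value;
    this.expect('PUNCTUATION', '{');
    const properties: Record<string, string | string[]> = {};
    while (!this.check('PUNCTUATION', '}')) {
      // PROPERTY ::= IDENTIFIER ":" (IDENTIFIER | ARRAY | STRING)
      const key = this.expect('IDENTIFIER').value;
      this.expect('PUNCTUATION', ':');
      properties[key] = this.parseValue();
    }
    this.expect('PUNCTUATION', '}');
    return { type: 'Reality', name, properties };
  }

  private parseValue(): string | string[] {
    if (this.check('PUNCTUATION', '[')) {
      this.pos++;
      const items: string[] = [];
      if (!this.check('PUNCTUATION', ']')) {
        items.push(this.expect('IDENTIFIER').value);
        while (this.check('PUNCTUATION', ',')) {
          this.pos++;
          items.push(this.expect('IDENTIFIER').value);
        }
      }
      this.expect('PUNCTUATION', ']');
      return items;
    }
    if (this.check('STRING')) return this.expect('STRING').value;
    return this.expect('IDENTIFIER').value;
  }

  // True when the current token has the given type (and value, if provided)
  private check(type: string, value?: string): boolean {
    const tok = this.tokens[this.pos];
    return tok !== undefined && tok.type === type &&
      (value === undefined || tok.value === value);
  }

  // Consume the current token or report a syntax error with its position
  private expect(type: string, value?: string): Token {
    const tok = this.tokens[this.pos];
    if (!this.check(type, value)) {
      const where = tok ? `${tok.line}:${tok.column}` : 'end of input';
      throw new Error(`Expected ${value ?? type} at ${where}`);
    }
    this.pos++;
    return tok;
  }
}
```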
**Key Methods:**
```typescript
declare class Parser {
  constructor(tokens: Token[]);
  parse(): Program;              // Parse every top-level declaration
  parseReality(): Reality;
  parseJourney(): Journey;
  parseStatement(): Statement;
  parseExpression(): Expression;
}
```
### Stage 3: Semantic Analysis
**Input:** AST
**Output:** Validated AST + Symbol Table
**Process:**
1. Check identifiers are defined before use
2. Validate journey parameters and return types
3. Verify platform specifiers are valid
4. Check import statements reference valid modules
5. Validate compliance module usage
**Key Checks:**
- Undefined variables/journeys
- Platform compatibility
- Import validity
- Type consistency
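
Two of these checks, undefined journeys and invalid platforms, can be sketched with a symbol table built from the declared journey names. The types below are a hypothetical, reduced view of the AST. The platform list is an assumption drawn from the examples and config schema in this document, and collecting every name before checking calls assumes journeys may be called before their declaration.

```typescript
// Hypothetical, simplified view of the AST: only what these two checks need.
interface CallSite { name: string; line: number; column: number; }
interface JourneyInfo { name: string; platforms: string[]; calls: CallSite[]; }
interface SemanticIssue { kind: 'UndefinedJourney' | 'InvalidPlatform'; message: string; }

// Assumption: platform names that appear in this spec.
const KNOWN_PLATFORMS = new Set(['roblox', 'web', 'uefn', 'unity', 'all']);

function analyzeJourneys(journeys: JourneyInfo[]): SemanticIssue[] {
  // Symbol table: every declared journey name, collected before checking calls
  const declared = new Set(journeys.map((j) => j.name));
  const issues: SemanticIssue[] = [];

  for (const journey of journeys) {
    for (const platform of journey.platforms) {
      if (!KNOWN_PLATFORMS.has(platform)) {
        issues.push({
          kind: 'InvalidPlatform',
          message: `Unknown platform "${platform}" in journey "${journey.name}"`,
        });
      }
    }
    for (const call of journey.calls) {
      if (!declared.has(call.name)) {
        issues.push({
          kind: 'UndefinedJourney',
          message: `Undefined journey "${call.name}" at ${call.line}:${call.column}`,
        });
      }
    }
  }
  return issues;
}
```

Returning a list of issues instead of throwing on the first one lets the compiler report every semantic problem in a single run.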
### Stage 4: Code Generation
**Input:** Validated AST + Target Platform
**Output:** Target language source code
#### Target Language Mapping
| AeThex | JavaScript | Lua (Roblox) | Verse (UEFN) | C# (Unity) |
|--------|-----------|------------|-------------|-----------|
| journey | function | function | function | method |
| reality | object | table | class | namespace |
| when | if | if | if | if |
| notify | console.log | print | Print | Debug.Log |
| reveal | return | return | return | return |
| let | const | local | var | var |
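
One way to drive code generation from this table is a per-target lookup. The sketch below covers only the JavaScript and Lua columns; `MAPPINGS`, `emitNotify`, and `emitLet` are hypothetical names, and the expression text is assumed to be already translated for the target (Lua, for example, concatenates strings with `..` rather than `+`).

```typescript
type Target = 'javascript' | 'roblox';

interface KeywordMapping {
  notify: string;
  reveal: string;
  let: string;
}

// Values taken from the JavaScript and Lua (Roblox) columns of the mapping table.
const MAPPINGS: Record<Target, KeywordMapping> = {
  javascript: { notify: 'console.log', reveal: 'return', let: 'const' },
  roblox: { notify: 'print', reveal: 'return', let: 'local' },
};

// JavaScript statements end with a semicolon; Lua statements do not need one.
function terminate(target: Target, statement: string): string {
  return target === 'javascript' ? `${statement};` : statement;
}

function emitNotify(target: Target, expr: string): string {
  return terminate(target, `${MAPPINGS[target].notify}(${expr})`);
}

function emitLet(target: Target, name: string, expr: string): string {
  return terminate(target, `${MAPPINGS[target].let} ${name} = ${expr}`);
}
```

Keeping the table as data means a new target mostly needs a new `MAPPINGS` entry, although constructs that differ structurally between languages still require dedicated generator logic.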
#### JavaScript Code Generation
```typescript
class JavaScriptGenerator {
  generate(ast: Program): string {
    let code = '';

    // Generate imports
    for (const imp of ast.imports) {
      code += this.generateImport(imp);
    }

    // Generate realities as objects
    for (const reality of ast.realities) {
      code += this.generateReality(reality);
    }

    // Generate journeys as functions
    for (const journey of ast.journeys) {
      code += this.generateJourney(journey);
    }

    return code;
  }

  private generateJourney(journey: Journey): string {
    // Platform compatibility has already been verified during semantic analysis
    let code = `function ${journey.name}(${journey.params.join(', ')}) {\n`;
    for (const stmt of journey.body) {
      code += this.generateStatement(stmt);
    }
    code += '}\n';
    return code;
  }

  // generateImport, generateReality, and generateStatement follow the same pattern (omitted here)
}
```
#### Lua (Roblox) Code Generation
```typescript
class LuaGenerator {
  generate(ast: Program): string {
    let code = '';

    // Lua-specific imports
    code += 'local AeThexCore = require("@aethex.os/core")\n\n';

    // Emit only journeys that target Roblox (or every platform)
    for (const journey of ast.journeys) {
      if (journey.platforms.includes('roblox') || journey.platforms.includes('all')) {
        code += this.generateRobloxJourney(journey);
      }
    }

    return code;
  }

  private generateRobloxJourney(journey: Journey): string {
    let code = `local function ${journey.name}(${journey.params.join(', ')})\n`;
    // ... Lua generation logic ...
    code += 'end\n'; // Lua function bodies are closed with `end`
    return code;
  }
}
```
### Stage 5: Optimization
**Input:** Generated code
**Output:** Optimized code
**Optimizations:**
1. Dead code elimination
2. Variable inlining
3. String constant pooling
4. Unused import removal
5. PII detection preprocessing
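
As one example, unused import removal can be sketched as a filter over import declarations. This assumes the optimizer still has access to the import list and to the set of identifiers referenced in the program, which the stage description above does not specify; `ImportDecl` and `pruneUnusedImports` are hypothetical names.

```typescript
interface ImportDecl {
  names: string[];
  source: string;
}

function pruneUnusedImports(
  imports: ImportDecl[],
  usedIdentifiers: Set<string>,
): ImportDecl[] {
  return imports
    // Keep only the imported names that are actually referenced
    .map((imp) => ({ ...imp, names: imp.names.filter((n) => usedIdentifiers.has(n)) }))
    // Drop import statements that no longer bring in any name
    .filter((imp) => imp.names.length > 0);
}
```

Dropping a whole import is only safe when the imported module has no side effects, so a production optimizer would need a way to mark modules that must always be loaded.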
### Stage 6: Emission
**Input:** Optimized code
**Output:** File system
```typescript
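import * as fs from 'fs';
import * as path from 'path';

class Emitter {
  emit(code: string, target: string, outputPath: string, fileName: string): void {
    const extension = this.getExtension(target);
    const filePath = path.join(outputPath, `${fileName}.${extension}`);
    fs.writeFileSync(filePath, code);
  }
  // Covers only the .js and .lua outputs shown in this spec
  private getExtension(target: string): string {
    return target === 'roblox' ? 'lua' : 'js';
  }
}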
```
---
## Compiler Architecture Diagram
```
┌─────────────────┐
│   Source Code   │
│ (.aethex file)  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│      Lexer      │ → Tokenize
└────────┬────────┘
         │ Token Stream
         ▼
┌─────────────────┐
│     Parser      │ → Parse to AST
└────────┬────────┘
         │ AST
         ▼
┌─────────────────┐
│    Semantic     │ → Validate
│    Analyzer     │
└────────┬────────┘
         │ Validated AST
         ▼
┌─────────────────┐
│      Code       │ → Generate Target Code
│    Generator    │   (JavaScript, Lua, etc.)
└────────┬────────┘
         │ Target Code
         ▼
┌─────────────────┐
│    Optimizer    │ → Optimize
└────────┬────────┘
         │ Optimized Code
         ▼
┌─────────────────┐
│     Emitter     │ → Write to File
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Output File   │
│   (.js, .lua)   │
└─────────────────┘
```
---
## Implementation Roadmap
### Phase 1: Foundation (Weeks 1-2)
- [ ] Lexer implementation
  - [ ] Token types enumeration
  - [ ] Character scanning
  - [ ] Token recognition
  - [ ] Error reporting
- [ ] Parser basics
  - [ ] Reality declarations
  - [ ] Journey declarations
  - [ ] Simple expressions
### Phase 2: AST & Semantic (Weeks 3-4)
- [ ] Complete AST node types
- [ ] Semantic analyzer
  - [ ] Symbol table management
  - [ ] Type checking
### Phase 3: Code Generation (Weeks 5-6)
- [ ] JavaScript generator
- [ ] Lua (Roblox) generator
- [ ] Basic optimizations
- [ ] File emission
### Phase 4: Features (Weeks 7-8)
- [ ] Platform-specific code blocks
- [ ] Sync statements
- [ ] Import/module system
- [ ] Compliance checks
### Phase 5: CLI & Tools (Weeks 9-10)
- [ ] CLI argument parsing
- [ ] Watch mode
- [ ] Multiple target compilation
- [ ] Error reporting
### Phase 6: Testing & Documentation (Weeks 11-12)
- [ ] Unit tests for each stage
- [ ] Integration tests
- [ ] Documentation
- [ ] Example projects
---
## API Reference
### CLI API
```bash
aethex compile <file> [options]
aethex new <name> [--template <type>]
aethex init [options]
aethex --version
aethex --help
```
### Programmatic API
```typescript
import { AeThexCompiler } from '@aethex.os/cli';

const compiler = new AeThexCompiler({
  targets: ['javascript', 'roblox'],
  srcDir: 'src',
  outDir: 'build'
});

// Compile single file
const result = await compiler.compile('src/main.aethex');

// Compile entire project
const results = await compiler.compileProject();

// Watch mode
compiler.watch('src', (file) => {
  console.log(`Recompiled ${file}`);
});
```
### Compiler Stages API
```typescript
import * as fs from 'fs';

// Manual compilation pipeline
const lexer = new Lexer(sourceCode);
const tokens = lexer.tokenize();
const parser = new Parser(tokens);
const ast = parser.parse();
const analyzer = new SemanticAnalyzer();
const validated = analyzer.analyze(ast);
const generator = new JavaScriptGenerator();
const code = generator.generate(validated);
const optimizer = new Optimizer();
const optimized = optimizer.optimize(code);
fs.writeFileSync('output.js', optimized);
```
---
## Configuration Format
### aethex.config.json Schema
```json
{
  "$schema": "http://aethex.dev/schema/aethex.config.json",
  "name": "string",
  "version": "string",
  "description": "string",
  "targets": ["javascript", "roblox", "uefn", "unity"],
  "srcDir": "string",
  "outDir": "string",
  "entry": "string",
  "stdlib": true,
  "compliance": {
    "coppa": true,
    "ferpa": true,
    "piiDetection": true,
    "auditLogging": true
  },
  "platforms": {
    "javascript": {
      "output": "string"
    },
    "roblox": {
      "output": "string"
    }
  }
}
```
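
A loader would typically validate a parsed config before handing it to the compiler. The sketch below checks a few fields from the schema above. Which fields are required is an assumption (the schema does not say), and `validateConfig` is a hypothetical helper.

```typescript
// Target names taken from the "targets" array in the schema above.
const VALID_TARGETS = ['javascript', 'roblox', 'uefn', 'unity'];

// Returns a list of problems; an empty list means the config passed these checks.
function validateConfig(raw: unknown): string[] {
  if (typeof raw !== 'object' || raw === null || Array.isArray(raw)) {
    return ['config must be a JSON object'];
  }
  const config = raw as Record<string, unknown>;
  const problems: string[] = [];

  // Assumption: these string fields are required
  for (const key of ['name', 'version', 'srcDir', 'outDir']) {
    if (typeof config[key] !== 'string') {
      problems.push(`"${key}" must be a string`);
    }
  }

  const targets = config['targets'];
  if (!Array.isArray(targets) || targets.length === 0) {
    problems.push('"targets" must be a non-empty array');
  } else {
    for (const target of targets) {
      if (typeof target !== 'string' || VALID_TARGETS.indexOf(target) === -1) {
        problems.push(`unsupported target: ${String(target)}`);
      }
    }
  }
  return problems;
}
```

A caller could run this on the result of `JSON.parse` and map any reported problem to the `InvalidConfiguration` or `UnsupportedTarget` errors listed under Error Handling.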
### Environment Variables
```bash
AETHEX_TARGET=javascript # Target compilation platform
AETHEX_OUTPUT_DIR=./build # Output directory
AETHEX_WATCH=true # Enable watch mode
AETHEX_DEBUG=true # Enable debug output
AETHEX_STRICT=true # Strict mode
```
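
These variables could be layered over configured defaults as sketched below. The environment is passed in as a plain record (for example `process.env` under Node.js) so the function stays testable. `EnvOptions` and `optionsFromEnv` are hypothetical names, and treating only the exact string `true` as enabled is an assumption.

```typescript
interface EnvOptions {
  target: string;
  outDir: string;
  watch: boolean;
  debug: boolean;
  strict: boolean;
}

type Env = Record<string, string | undefined>;

function optionsFromEnv(env: Env, defaults: EnvOptions): EnvOptions {
  // Assumption: a flag is enabled only when its value is exactly "true"
  const flag = (value: string | undefined, fallback: boolean): boolean =>
    value === undefined ? fallback : value === 'true';

  return {
    target: env['AETHEX_TARGET'] ?? defaults.target,
    outDir: env['AETHEX_OUTPUT_DIR'] ?? defaults.outDir,
    watch: flag(env['AETHEX_WATCH'], defaults.watch),
    debug: flag(env['AETHEX_DEBUG'], defaults.debug),
    strict: flag(env['AETHEX_STRICT'], defaults.strict),
  };
}
```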
---
## Error Handling
### Error Types
```
SyntaxError
├── UnexpectedToken
├── UnexpectedEndOfFile
├── InvalidExpression
└── MissingClosingBracket
SemanticError
├── UndefinedVariable
├── UndefinedJourney
├── InvalidPlatform
├── InvalidImport
└── TypeMismatch
CompilationError
├── InvalidConfiguration
├── SourceNotFound
├── OutputPermissionDenied
└── UnsupportedTarget
```
### Error Reporting
```typescript
interface CompilationError {
  type: 'SyntaxError' | 'SemanticError' | 'CompilationError';
  message: string;
  line: number;
  column: number;
  source: string;
  code: string;
}
```
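
A formatter for this interface might look like the sketch below. It assumes `source` holds the file name and `code` holds the text of the offending source line, which the interface itself does not specify. `ReportedError` mirrors the fields above and `formatError` is a hypothetical helper.

```typescript
interface ReportedError {
  type: 'SyntaxError' | 'SemanticError' | 'CompilationError';
  message: string;
  line: number;
  column: number; // 1-based
  source: string; // Assumption: file name
  code: string;   // Assumption: text of the offending line
}

function formatError(err: ReportedError): string {
  const gutter = `${err.line} | `;
  // Pad past the gutter, then up to the 1-based column of the error
  const caretPad = ' '.repeat(gutter.length + err.column - 1);
  return [
    `Error: ${err.message}`,
    `  at ${err.source}:${err.line}:${err.column}`,
    `${gutter}${err.code}`,
    `${caretPad}^`,
  ].join('\n');
}
```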
**Example Error Output:**
```
Error: Undefined journey "Gret"
  at journey.aethex:5:6

  5 | when Gret(player) {
    |      ^
  Did you mean "Greet" defined at line 3?
```
---
## Performance Targets
- **Compilation Speed:** < 100ms for typical files
- **Memory Usage:** < 50MB for average projects
- **Output Size:** < 2x source code size (before minification)
- **Watch Mode Latency:** < 50ms file change to recompile
---
## Testing Strategy
### Unit Tests
```typescript
// Lexer tests
describe('Lexer', () => {
  it('should tokenize keywords', () => {
    const lexer = new Lexer('reality MyGame { platforms: all }');
    const tokens = lexer.tokenize();
    expect(tokens[0].type).toBe('KEYWORD');
    expect(tokens[0].value).toBe('reality');
  });
});

// Parser tests
describe('Parser', () => {
  it('should parse reality declarations', () => {
    // Tokenize first so the parser has input to work on
    const tokens = new Lexer('reality MyGame { platforms: all }').tokenize();
    const parser = new Parser(tokens);
    const ast = parser.parse();
    expect(ast.realities).toHaveLength(1);
    expect(ast.realities[0].name).toBe('MyGame');
  });
});
```
### Integration Tests
```typescript
describe('Compiler Integration', () => {
  it('should compile realities with cross-platform sync', async () => {
    const source = `
      import { DataSync } from "@aethex.os/core"
      reality Game { platforms: [roblox, web] }
      journey Save(player) {
        sync player across [roblox, web]
      }
    `;
    const compiler = new AeThexCompiler();
    // compile() is asynchronous (see Programmatic API), so await the result
    const result = await compiler.compile(source);
    expect(result.javascript).toContain('function Save');
    expect(result.lua).toContain('function Save');
  });
});
```
### Compliance Tests
```typescript
// Test compliance
describe('Compliance', () => {
  it('should never allow PII in leaderboard', () => {
    const inputs = [
      '555-1234',       // Phone
      'user@email.com', // Email
      '123-45-6789',    // SSN
    ];
    inputs.forEach(input => {
      const result = SafeInput.validate(input);
      expect(result.valid).toBe(false);
    });
  });
});
```
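
For context, a validator that would satisfy the test above could be as simple as the regex-based sketch below. This is not the actual `SafeInput` implementation, which this document does not show. `validateInput` and the patterns are illustrative assumptions, and pattern matching of this kind produces both false positives and false negatives.

```typescript
interface PiiCheck {
  valid: boolean;
  detected: string[];
}

// Illustrative patterns only; real PII detection needs far broader coverage.
const PII_PATTERNS: Array<{ kind: string; pattern: RegExp }> = [
  { kind: 'email', pattern: /[^\s@]+@[^\s@]+\.[^\s@]+/ },
  { kind: 'ssn', pattern: /\b\d{3}-\d{2}-\d{4}\b/ },
  { kind: 'phone', pattern: /\b(?:\d{3}[-.\s])?\d{3}[-.\s]\d{4}\b/ },
];

function validateInput(input: string): PiiCheck {
  // Collect the kind of every pattern that matches somewhere in the input
  const detected = PII_PATTERNS
    .filter(({ pattern }) => pattern.test(input))
    .map(({ kind }) => kind);
  return { valid: detected.length === 0, detected };
}
```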
---
## Module System
### Package Structure
```
@aethex.os/
├── cli/              # Command line interface
├── core/             # Standard library
│   ├── Passport/
│   ├── DataSync/
│   ├── SafeInput/
│   └── Compliance/
├── roblox/           # Platform-specific
├── web/
└── unity/
```
### Imports
```aethex
# From standard library
import { Passport, DataSync } from "@aethex.os/core"
# From platform packages
import { RemoteEvent, Leaderboard } from "@aethex.os/roblox"
# Local imports
import { helpers } from "./utils"
```
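
During semantic analysis the compiler needs to tell these three import forms apart. A minimal sketch, assuming the package names shown in the package structure above; `classifyImport` is a hypothetical helper.

```typescript
type ImportKind = 'stdlib' | 'platform' | 'local' | 'unknown';

// Platform package names taken from the package structure above.
const PLATFORM_PACKAGES = ['roblox', 'web', 'unity'];

function classifyImport(source: string): ImportKind {
  if (source === '@aethex.os/core') return 'stdlib';
  if (source.startsWith('@aethex.os/')) {
    const pkg = source.slice('@aethex.os/'.length);
    return PLATFORM_PACKAGES.indexOf(pkg) !== -1 ? 'platform' : 'unknown';
  }
  // Relative paths refer to files in the current project
  if (source.startsWith('./') || source.startsWith('../')) return 'local';
  return 'unknown';
}
```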
---
## Security Considerations
1. **Input Validation:** Validate all user input for PII at compile time
2. **Unsafe Operations:** Flag unsafe patterns with compiler warnings
3. **Privilege Escalation:** Separate dev vs production compilation modes
4. **Audit Trails:** Log all compliance checks
5. **Data Privacy:** Scrub sensitive data in error messages
---
## Standards & References
- **ECMAScript:** https://tc39.es/ecma262/
- **Lua:** https://www.lua.org/manual/5.1/ (Roblox's Luau is derived from Lua 5.1)
- **Verse (UEFN):** https://dev.epicgames.com/documentation/en-US/uefn/verse-language-reference
- **C# (.NET):** https://learn.microsoft.com/en-us/dotnet/csharp/
---
## Support & References
- **GitHub:** https://github.com/AeThex-Corporation/AeThexOS
- **npm:** https://www.npmjs.com/package/@aethex.os/cli
- **Documentation:** https://aethex.dev/docs/lang
- **Issues:** https://github.com/AeThex-Corporation/AeThexOS/issues
---
**License:** MIT (Copyright 2025 AeThex)