Compiler Pipeline
The Covenant compiler (covc) transforms source files through ten sequential phases. Each phase has a well-defined input, output, and error class.
The Ten Phases
| # | Phase | Input | Output | Error range |
|---|---|---|---|---|
| 1 | Lexer | UTF-8 source | Token stream | E100–E199 |
| 2 | Parser | Tokens | AST | E200–E299 |
| 3 | Resolver | Unresolved AST | Bound AST | E300–E399 |
| 4 | Typechecker | Bound AST | Typed AST | E400–E499 |
| 5 | Privacy Flow | Typed AST | Domain-annotated AST | E500–E599 |
| 6 | IR Builder | Annotated AST | SSA IR | E600–E699 |
| 7 | Optimizer | SSA IR | Optimized IR | W700–W799 |
| 8 | Backend | Optimized IR | Backend IR | E800–E899 |
| 9 | Codegen | Backend IR | Bytecode + ABI | E800–E899 |
| 10 | Metadata | Bytecode + ABI | .artifact.json | — |
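Because each phase owns a code range, a diagnostic code is enough to identify where a build failed. The following is a minimal sketch of that lookup, not part of covc: the ranges are copied from the table above, while `PHASE_RANGES` and `phase_for` are hypothetical names chosen for illustration.

```python
import re

# Ranges copied from the phase table. Backend and Codegen share E800-E899
# in that table, so a code alone cannot tell those two phases apart.
PHASE_RANGES = [
    ("Lexer", "E", 100, 199),
    ("Parser", "E", 200, 299),
    ("Resolver", "E", 300, 399),
    ("Typechecker", "E", 400, 499),
    ("Privacy Flow", "E", 500, 599),
    ("IR Builder", "E", 600, 699),
    ("Optimizer", "W", 700, 799),
    ("Backend / Codegen", "E", 800, 899),
]


def phase_for(code: str) -> str:
    """Map a diagnostic code such as 'E402' to the phase that owns its range."""
    match = re.fullmatch(r"([EW])(\d{3})", code)
    if match is None:
        raise ValueError(f"not a diagnostic code: {code!r}")
    kind, number = match.group(1), int(match.group(2))
    for name, range_kind, low, high in PHASE_RANGES:
        if kind == range_kind and low <= number <= high:
            return name
    raise ValueError(f"no phase owns {code}")


print(phase_for("E402"))
print(phase_for("W710"))
```

A lookup like this could be useful in editor tooling or CI log filters that group diagnostics by phase.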
Phase 1 — Lexer
Input: Raw UTF-8 source text
Output: Token stream
Error codes: E100–E199
The lexer handles keywords, numeric literals (decimal, hex 0x…, underscore separators), string literals, comments (// and /* */), and FHE type tokens (encrypted<, sealed).
Common errors:
- E101 UnexpectedCharacter — invalid byte in source
- E102 UnterminatedString — missing closing "
- E103 InvalidEscape — unknown escape sequence
- E104 OverflowLiteral — integer literal exceeds any Covenant type
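The numeric-literal handling described above can be sketched as follows. This is an illustration rather than the covc lexer: the exact separator rules (one underscore between digit groups) and the 256-bit ceiling used for the E104 check are assumptions, and `lex_number` and `LexError` are hypothetical names.

```python
import re

# Assumed grammar: hex (0x...) or decimal digits, with single underscores
# allowed only between digit groups.
NUMBER = re.compile(r"0x[0-9a-fA-F]+(?:_[0-9a-fA-F]+)*|[0-9]+(?:_[0-9]+)*")

# Assumption: the widest Covenant integer type is 256 bits.
MAX_BITS = 256


class LexError(Exception):
    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code


def lex_number(text: str, pos: int = 0) -> tuple[int, int]:
    """Return (value, end_position) for the numeric literal starting at pos."""
    match = NUMBER.match(text, pos)
    if match is None:
        raise ValueError(f"no numeric literal at position {pos}")
    digits = match.group(0).replace("_", "")
    value = int(digits, 16) if digits.startswith("0x") else int(digits, 10)
    if value.bit_length() > MAX_BITS:
        raise LexError("E104", "integer literal exceeds the assumed 256-bit maximum")
    return value, match.end()


print(lex_number("1_000_000"))
print(lex_number("0xff_ff"))
```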
Phase 2 — Parser
Input: Token stream
Output: AST (after desugaring)
Error codes: E200–E299
The parser is a hand-written recursive-descent PEG parser. It applies these desugaring rules before emitting the AST:
- token Foo { } → contract Foo extends __ERC20Token { }
- encrypted token Foo { } → contract Foo extends __ERC8227Token { }
- only owner → given caller == owner or revert_with Unauthorized()
- when condition → given condition or revert_with PreconditionFailed()
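The desugaring rules amount to a small node-to-node rewrite. The sketch below shows the idea on a dictionary-based AST. The node shapes (`kind`, `name`, `role`, `condition`, and so on) are invented for this example and are not the compiler's actual AST; only the rewrite targets come from the rules above.

```python
def desugar(node: dict) -> dict:
    """Rewrite one sugared node into its core form; other nodes pass through."""
    kind = node["kind"]
    if kind == "token":
        # token Foo { } and encrypted token Foo { } differ only in the base contract.
        base = "__ERC8227Token" if node.get("encrypted") else "__ERC20Token"
        return {
            "kind": "contract",
            "name": node["name"],
            "extends": base,
            "body": node.get("body", []),
        }
    if kind == "only":
        # only owner  ->  given caller == owner or revert_with Unauthorized()
        return {
            "kind": "given",
            "condition": f"caller == {node['role']}",
            "otherwise": "revert_with Unauthorized()",
        }
    if kind == "when":
        # when condition  ->  given condition or revert_with PreconditionFailed()
        return {
            "kind": "given",
            "condition": node["condition"],
            "otherwise": "revert_with PreconditionFailed()",
        }
    return node


print(desugar({"kind": "token", "name": "Foo", "encrypted": True}))
print(desugar({"kind": "only", "role": "owner"}))
```

This sketch rewrites a single node and does not recurse into the contract body; a full pass would also walk the children.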
Common errors:
- E201 UnexpectedToken
- E202 MissingClosingBrace
- E203 InvalidFieldType
Phase 3 — Resolver
Input: AST with unresolved names
Output: AST with all names bound
Error codes: E300–E399
Builds a scope tree, resolves imports (use covenant::stdlib::*), detects circular dependencies, and binds built-ins (caller, block.timestamp, deployer).
Common errors:
- E301 UndefinedName
- E302 AmbiguousName
- E303 CircularDependency
- E310 PrivateFieldAccess
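Circular-dependency detection of the kind reported as E303 is commonly done with a depth-first search over the import graph. The sketch below uses that standard technique; whether covc does the same is not stated here, and the module names and the `find_cycle` helper are hypothetical.

```python
from typing import Optional

WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished


def find_cycle(imports: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one import cycle as a path (first module repeated at the end), or None."""
    state: dict[str, int] = {}
    path: list[str] = []

    def visit(module: str) -> Optional[list[str]]:
        state[module] = GREY
        path.append(module)
        for dep in imports.get(module, []):
            dep_state = state.get(dep, WHITE)
            if dep_state == GREY:
                # dep is already on the current path, so the path loops back to it.
                return path[path.index(dep):] + [dep]
            if dep_state == WHITE:
                cycle = visit(dep)
                if cycle is not None:
                    return cycle
        path.pop()
        state[module] = BLACK
        return None

    for module in imports:
        if state.get(module, WHITE) == WHITE:
            cycle = visit(module)
            if cycle is not None:
                return cycle
    return None


print(find_cycle({"a": ["b"], "b": ["c"], "c": ["a"]}))
```

Returning the full path rather than a boolean lets the diagnostic name every module in the loop.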
Phase 4 — Typechecker
Input: Resolved AST
Output: Typed AST
Error codes: E400–E499
The typechecker uses Hindley-Milner inference extended with Covenant's domain types: amount (non-negative, unit-tracked), encrypted<T> (FHE ciphertext), pq_key (2,592-byte opaque blob), and address.
Key type rules:
- encrypted<T> cannot flow to plaintext without explicit reveal
- pq_key is immutable once assigned
- Constant amount arithmetic is overflow-checked at compile time
Common errors:
- E401 TypeMismatch
- E402 EncryptedToPlaintext — implicit ciphertext coercion
- E410 AmountOverflow
- E420 UnitMismatch
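The compile-time overflow rule for constant amount arithmetic can be illustrated as below. This is a sketch under assumptions: the 256-bit width of amount is assumed (the text only says amount is non-negative), and `fold_amount` and `CompileError` are hypothetical names.

```python
import operator

# Assumption: amount is an unsigned 256-bit quantity. Only the lower bound
# (non-negative) comes from the text above.
AMOUNT_BITS = 256
AMOUNT_MAX = (1 << AMOUNT_BITS) - 1

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


class CompileError(Exception):
    pass


def fold_amount(op: str, lhs: int, rhs: int) -> int:
    """Evaluate a constant amount expression, rejecting out-of-range results."""
    result = OPS[op](lhs, rhs)
    if not 0 <= result <= AMOUNT_MAX:
        raise CompileError(f"E410 AmountOverflow: {lhs} {op} {rhs} is out of range")
    return result


print(fold_amount("+", 2, 3))
```

Python integers are arbitrary precision, so the exact result is computed first and then range-checked, which avoids any wraparound in the checker itself.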
Phase 5 — Privacy Flow Analyzer
Input: Typed AST
Output: Domain-annotated AST
Error codes: E500–E599
Enforces a three-domain model:
- P domain — public, visible on-chain
- E domain — encrypted FHE ciphertext
- A domain — amnesia-eligible (inside amnesia { } blocks)
E-domain values cannot flow to P-domain outputs without an authorized reveal. Amnesia-eligible fields cannot be read after the ceremony reaches DESTROYED.
Common errors:
- E501 DomainViolation
- E502 RevealWithoutAuth
- E503 AmnesiaEscapeAttempt
- E510 CrossContractDomainLeak
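The two rules stated above (no E-to-P flow without an authorized reveal, no reads of amnesia-eligible fields once the ceremony is DESTROYED) can be written as two predicates. This is a deliberately small sketch: the function names are hypothetical, and treating every other flow as permitted is an assumption, since the text does not describe the remaining combinations.

```python
def flow_allowed(src: str, dst: str, authorized_reveal: bool = False) -> bool:
    """Check one value flow between domains 'P', 'E', and 'A'."""
    if src == "E" and dst == "P":
        # Encrypted values reach public outputs only through an authorized reveal.
        return authorized_reveal
    # Assumption: flows not mentioned in the text are treated as allowed here.
    return True


def read_allowed(domain: str, ceremony_state: str) -> bool:
    """Amnesia-eligible fields become unreadable once the ceremony is DESTROYED."""
    return not (domain == "A" and ceremony_state == "DESTROYED")


print(flow_allowed("E", "P"))
print(flow_allowed("E", "P", authorized_reveal=True))
print(read_allowed("A", "DESTROYED"))
```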
Phase 6 — IR Builder
Input: Annotated typed AST
Output: SSA IR
Error codes: E600–E699
Lowers the AST to Static Single Assignment (SSA) form: each variable is assigned exactly once, and control-flow joins use phi nodes. See Intermediate Representation.
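The single-assignment property is easiest to see on straight-line code, where SSA construction reduces to versioning each assignment. The sketch below covers only that case and does not insert phi nodes, which are needed where branches merge. The instruction tuples and the `name.N` versioning scheme are illustrative and are not the Covenant IR format.

```python
def to_ssa(stmts: list[tuple[str, str, list[str]]]) -> list[tuple[str, str, list[str]]]:
    """Rename straight-line (target, op, args) statements so each target is unique."""
    version: dict[str, int] = {}
    out = []
    for target, op, args in stmts:
        # Operands refer to the latest version of each variable defined so far;
        # constants and not-yet-defined names (such as parameters) pass through.
        new_args = [f"{a}.{version[a]}" if a in version else a for a in args]
        version[target] = version.get(target, 0) + 1
        out.append((f"{target}.{version[target]}", op, new_args))
    return out


program = [
    ("x", "const", ["1"]),
    ("x", "add", ["x", "2"]),
    ("y", "mul", ["x", "x"]),
]
for instruction in to_ssa(program):
    print(instruction)
```

After renaming, the second assignment to x becomes x.2 and reads x.1, so every name has exactly one definition.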
Phase 7 — Optimizer
Input: SSA IR
Output: Optimized SSA IR
Warning codes: W700–W799
The optimizer runs 22 passes. They emit warnings rather than errors, so they cannot fail compilation:
- Constant folding
- Dead code elimination
- Common subexpression elimination
- Guard inlining
- Bounded loop unrolling
- FHE operation batching
- Storage slot packing
- Event deduplication
- Revert string deduplication
- Gas cost canonicalization
- Backend-specific peephole passes (passes 11–22)
Key warnings:
- W701 UnreachableCode
- W702 ConstantField
- W710 FHEDepthExceeded — multiplication depth will require bootstrapping
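W710 concerns multiplicative depth: the longest chain of ciphertext multiplications feeding a single result. A minimal way to compute it over an expression tree is sketched below. The tuple-based expression encoding and the `MAX_DEPTH` threshold are assumptions made for illustration; the actual limit depends on the FHE parameters in use.

```python
# Assumption: a hypothetical depth budget; the real limit is scheme-specific.
MAX_DEPTH = 3


def mul_depth(expr) -> int:
    """Return the multiplicative depth of expr.

    expr is either a leaf name (str) or a tuple (op, left, right),
    where op is 'mul' or any other binary operator such as 'add'.
    """
    if isinstance(expr, str):
        return 0
    op, left, right = expr
    depth = max(mul_depth(left), mul_depth(right))
    # Only multiplications add a level; additions keep the depth of their inputs.
    return depth + 1 if op == "mul" else depth


def needs_bootstrapping(expr) -> bool:
    return mul_depth(expr) > MAX_DEPTH


expr = ("mul", ("mul", "a", "b"), ("add", "c", "d"))
print(mul_depth(expr))
```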
Phase 8 — Backend Selection
Input: Optimized IR
Output: Backend-lowered IR
Error codes: E800–E899
Available backends: evm (default), aster, wasm (experimental V0.8+). See Compiler Targets.
Phase 9 — Codegen
Produces raw EVM bytecode, including the selector dispatcher, the constructor, and the PUSH/JUMPDEST layout. The Aster backend emits Aster VM opcodes instead.
Phase 10 — Metadata Emission
Emits .artifact.json:
{
  "compiler": "covc 0.7.4",
  "contract": "MyContract",
  "abi": [],
  "bytecode": "0x608060...",
  "metadata": {
    "erc_compliance": ["8227"],
    "compliance_profile": "CL2",
    "optimizer": {"passes": 22},
    "audit": {"report": "OMEGA V4", "findings": 41, "resolved": 41}
  }
}
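Since the artifact is plain JSON, downstream tooling can read it with any JSON parser. The sketch below parses the example above from an inline string and pulls out a few fields. The `audit_is_clean` helper is hypothetical, and reading "findings equals resolved" as a clean audit is an interpretation rather than something the format specifies.

```python
import json

# The example artifact from above, inlined so the snippet is self-contained.
# In practice this text would be read from the emitted .artifact.json file.
ARTIFACT_TEXT = """
{
  "compiler": "covc 0.7.4",
  "contract": "MyContract",
  "abi": [],
  "bytecode": "0x608060...",
  "metadata": {
    "erc_compliance": ["8227"],
    "compliance_profile": "CL2",
    "optimizer": {"passes": 22},
    "audit": {"report": "OMEGA V4", "findings": 41, "resolved": 41}
  }
}
"""


def audit_is_clean(artifact: dict) -> bool:
    """Assumed interpretation: every audit finding has been resolved."""
    audit = artifact["metadata"]["audit"]
    return audit["findings"] == audit["resolved"]


artifact = json.loads(ARTIFACT_TEXT)
print(artifact["compiler"], artifact["metadata"]["compliance_profile"])
print(audit_is_clean(artifact))
```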
Running Individual Phases
covenant inspect ast ./src/Contract.cvt # stop after parsing
covenant check ./src/Contract.cvt # stop after typechecking
covenant inspect ir ./src/Contract.cvt # emit optimized IR
covenant build ./src/Contract.cvt # full build
Performance Targets
| Phase | Target (1000-line contract) |
|---|---|
| Lexer + Parser | < 50ms |
| Resolver | < 30ms |
| Typechecker | < 100ms |
| Privacy flow | < 80ms |
| IR + Optimizer | < 250ms |
| Codegen + Metadata | < 140ms |
| Total | < 650ms |
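The per-phase targets sum to the 650 ms total, which makes them easy to use as a budget check. The sketch below compares measured timings against the table. The target values come from the table above; the `over_budget` helper and the sample measurements are made up for illustration and are not real benchmark results.

```python
# Targets in milliseconds for a 1000-line contract, copied from the table above.
TARGETS_MS = {
    "Lexer + Parser": 50,
    "Resolver": 30,
    "Typechecker": 100,
    "Privacy flow": 80,
    "IR + Optimizer": 250,
    "Codegen + Metadata": 140,
}
TOTAL_TARGET_MS = 650


def over_budget(measured_ms: dict[str, float]) -> list[str]:
    """Return the phases whose measured time does not meet the '< target' bound."""
    return sorted(
        name for name, elapsed in measured_ms.items() if elapsed >= TARGETS_MS[name]
    )


# Hypothetical measurements, for illustration only.
sample = {"Resolver": 29.0, "Typechecker": 120.0}
print(over_budget(sample))
print(sum(TARGETS_MS.values()) == TOTAL_TARGET_MS)
```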