Wednesday, March 18, 2026

a sufficiently friendly compiler

license: public domain CC0

 

Design: Interactive, Constraint‑Driven Compiler Collaboration

This doc sketches a compiler system where the programmer, an agent, and the compiler negotiate lowering from high‑level code to low‑level implementation using annotations, dials, and an explicit constraint graph.


1. Goals and non‑goals

  • Goal: Make lowering from HLL → LLL explicit, explainable, and steerable without sacrificing safety.
  • Goal: Treat performance and representation decisions as first‑class, checkable semantics, not opaque heuristics.
  • Goal: Allow interactive refinement of lowering choices with clear knock‑on effects.
  • Non‑goal: Replace all compiler heuristics with manual control. The system should augment, not burden, the programmer.
  • Non‑goal: Require annotations everywhere. Defaults must be reasonable and compositional.

2. Core concepts

2.1 Annotations (hard constraints)

Annotations are semantic contracts attached to code or types. If they cannot be upheld in the lowered program, the compiler must reject the program.

  • Examples:

    • @heap — value must be heap‑allocated.
    • @stack — value must be stack‑allocated.
    • @region("frame") — value must live in a specific region.
    • @noescape — value must not outlive its lexical scope.
    • @pure — function must be side‑effect‑free.
    • @noalias — reference must not alias any other reference.
    • @soa / @aos — layout constraints.
    • @inline(always) — must be inlined (subject to well‑formedness rules).
  • Properties:

    • Hard: Violations are compile‑time errors, not warnings.
    • Compositional: Annotations propagate through the IR and participate in constraint solving.
    • Semantic: They describe what must be true, not how to implement it.

2.2 Dials (soft preferences)

Dials are global or scoped optimization preferences that guide heuristics but do not invalidate programs.

  • Examples:

    • opt.memory = "cache_locality" vs "allocation_count".
    • opt.layout = "prefer_soa" for a module.
    • opt.inlining_aggressiveness = high.
    • opt.vectorization = "prefer_branchless".
    • opt.reg_pressure_budget = medium.
  • Properties:

    • Soft: They influence choices but never cause errors by themselves.
    • Scoped: Can apply to a project, module, function, or region.
    • Negotiable: The agent can propose dial changes to satisfy constraints or improve performance.

2.3 Constraint graph

Lowering is modeled as a constraint satisfaction problem over an IR:

  • Nodes: IR entities (functions, blocks, values, allocations, regions, loops).
  • Constraints:
    • From annotations (hard).
    • From language semantics (hard).
    • From dials and heuristics (soft).
  • Edges: Dependencies between decisions (e.g., “if this escapes, stack allocation is impossible”).

The compiler maintains this graph and uses it to:

  • Check feasibility of annotations.
  • Explore alternative lowerings.
  • Explain knock‑on effects.

3. Architecture

3.1 Pipeline overview

  1. Front‑end:

    • Parse source → AST.
    • Type check, effect check.
    • Attach annotations to AST nodes.
  2. Semantic IR:

    • Lower AST to a high‑level IR with:
      • explicit control flow,
      • explicit effects,
      • explicit allocation sites,
      • explicit regions/scopes.
    • Preserve annotations as IR metadata.
  3. Constraint extraction:

    • Build a constraint graph from:
      • annotations,
      • type/effect system,
      • lifetime/escape analysis,
      • alias analysis,
      • layout rules.
  4. Initial lowering plan:

    • Apply default heuristics + dials to propose:
      • allocation strategies,
      • inlining decisions,
      • layout choices,
      • vectorization/fusion decisions.
  5. Interactive negotiation (optional mode):

    • Expose the plan and constraint graph to the agent + programmer.
    • Allow adjustments to annotations/dials.
    • Re‑solve constraints and update the plan.
  6. Final IR + codegen:

    • Commit to a consistent lowering.
    • Emit low‑level IR / machine code.
    • Optionally emit a “lowering report” for debugging and learning.

4. Error model for annotations

Annotations are part of static semantics. They can fail in well‑defined ways.

4.1 Typical error classes

  • Lifetime violations:

    • Example: @stack on a value that escapes its function.
    • Result: Error with explanation of the escape path.
  • Purity violations:

    • Example: @pure function performs I/O or calls impure code.
    • Result: Error with call chain showing the impure operation.
  • Alias violations:

    • Example: @noalias reference proven to alias another reference.
    • Result: Error with the aliasing path.
  • Layout violations:

    • Example: @packed on a type requiring alignment; @soa on unsupported structure.
    • Result: Error with the conflicting fields/types.
  • Inlining violations:

    • Example: @inline(always) on a recursive function where inlining would not terminate.
    • Result: Error with recursion cycle.
  • Region violations:

    • Example: @region("frame") on a value that must outlive the frame.
    • Result: Error with lifetime mismatch.

4.2 Error reporting shape

  • Core message: Which annotation failed and where.
  • Cause chain: Minimal slice of the constraint graph that explains why.
  • Alternatives: Valid strategies the compiler can suggest:
    • remove or relax the annotation,
    • adjust another annotation,
    • tweak a dial (e.g., enable region allocation).

5. Interactive negotiation flow

This mode is optional but central to the design.

5.1 Baseline flow

  1. Compiler proposes a plan:

    • “Given current code + annotations + dials, here is the lowering.”
  2. Agent summarizes tradeoffs:

    • Example: “Using SoA for Particle improves cache locality but increases register pressure; loop fusion reduces parallelism.”
  3. Programmer adjusts:

    • Add/modify annotations.
    • Change dials (e.g., “don’t fuse loops in this module”).
  4. Compiler re‑solves constraints:

    • Updates the plan.
    • Detects any new annotation conflicts.
  5. Agent highlights knock‑on effects:

    • “Unfusing loops may disable vectorization; here’s the affected loop.”
  6. Programmer accepts or iterates.

5.2 Conflict resolution

When an annotation is impossible:

  • Compiler: Rejects the program and marks the conflicting region.
  • Agent: Explains:
    • the failing annotation,
    • the dependency chain,
    • possible fixes (e.g., remove @stack, add @noescape, introduce a region).

This keeps the system sound while still being negotiable.


6. Example scenario

6.1 Source sketch

@soa
struct Particle {
  position: Vec3,
  velocity: Vec3,
}

@pure
fn update(@noalias particles: &mut [Particle]) {
  for p in particles {
    p.position += p.velocity;
  }
}

6.2 Compiler’s initial plan

  • Use SoA layout for Particle.
  • Inline update into hot call sites.
  • Vectorize the loop.
  • Allocate particles in a region tied to the simulation frame.

6.3 Programmer adds a constraint

@stack
fn simulate_frame() {
  let particles = make_particles(); // large array
  update(&mut particles);
  render(&particles);
}

6.4 Constraint failure

  • @stack on particles conflicts with:
    • its size (too large for stack) or
    • its lifetime (if it escapes) or
    • region strategy (if region is required).

Error:

@stack allocation for particles is impossible.
Reason: particles is passed to render, which may store it beyond simulate_frame.
Options:

  • Remove @stack and allow region/heap allocation.
  • Mark render so that it cannot retain particles (@noescape on parameter).
  • Introduce a frame‑region and use @region("frame") instead of @stack.

The programmer can then refine the design explicitly.


7. Open design questions

  • Annotation granularity:
    How fine‑grained should annotations be (per value, per field, per function, per module)?
  • Default policy:
    How much can the compiler do “well enough” without annotations, while still being explainable?
  • Annotation ergonomics:
    How to avoid annotation bloat while still enabling precise control where needed?
  • Performance modeling:
    How should the system surface performance tradeoffs (e.g., estimated cache misses, allocations, branch mispredicts)?
  • Agent protocol:
    What is the minimal, compositional vocabulary for the agent to explain constraints and tradeoffs?

8. Summary

This design treats compilation as:

  • A constraint‑driven transformation from high‑level semantics to low‑level implementation.
  • A collaboration between programmer, compiler, and agent.
  • A space of explicit, explainable choices, not opaque heuristics.

Annotations are hard, checkable contracts.
Dials are soft, steerable preferences.
The constraint graph is the shared object of reasoning.

 

No comments:

Post a Comment