license: public domain CC0

attn: improved version https://docs.google.com/document/d/19w_P7hPIawRBzRXpBlyuFaFTogdtlFT6ZnqRRJv9vhQ/edit?usp=drivesdk

Project document: Stackless CPS engine for Haskell

1. Vision and goals

Vision:
Build a stackless CPS engine written in Haskell that:

Executes Haskell programs via a Core→CPS IR pipeline.
Has no semantic call–return: only one active frame; all “future” is heap data.
Treats control flow as data: continuations are first-class, mutable, cloneable, serializable.
Enables:
- hot-swap and version routing,
- shadow execution (fork futures),
- replay / time travel,
- semantic drift governance,
- rich introspection and lore.

Key separation:

Host Haskell (GHC): stackful, classic call–return, used to implement the engine and tooling.
Guest Haskell: compiled to a Core-like IR, executed by the CPS engine with stackless semantics.

2. Architecture overview

2.1 High-level pipeline

Haskell source (guest)
GHC front-end
- parse, rename, typecheck
- desugar to GHC Core
- run Core-to-Core optimizations
Core extraction (plugin/backend)
- capture Core AST
- translate to EngineCore (our IR)
CPS engine (host Haskell)
- interpret EngineCore using heap-resident continuations
- manage scheduling, effects, hot-swap, shadow execution, replay

2.2 Components

GHC plugin / custom backend
- Intercepts Core.
- Serializes or passes it directly to the engine.
EngineCore IR
- Small, typed lambda calculus with ADTs and case.
- Designed for CPS interpretation.
CPS runtime
- Machine state (expr, env, cont, meta).
- Step function (small-step interpreter).
- Engine loop (schedule–resume, no semantic stack).
Meta layer
- Versioning, tracing, effect routing, shadow execution hooks.

3. Front-end: GHC integration

3.1 Core interception via plugin

A GHC plugin that captures Core after optimization:

module Plugin (plugin) where

import GhcPlugins

plugin :: Plugin
plugin = defaultPlugin { installCoreToDos = install }

install :: [CommandLineOption] -> [CoreToDo] -> CoreM [CoreToDo]
install _ todos = do
  modGuts <- getModGuts
  let binds = mg_binds modGuts
  liftIO $ writeFile "program.core" (show binds)
  -- Option A: stop compilation here
  pure []

Later iteration: instead of writing to a file, directly call into the engine or serialize a structured format.

3.2 Core → EngineCore translation

Define a Core-like IR:

type Name = String

data EngineExpr
  = EVar Name
  | ELam Name EngineExpr
  | EApp EngineExpr EngineExpr
  | ELet Name EngineExpr EngineExpr
  | EInt Int
  | ECase EngineExpr [Alt]
  | EConstr Name [EngineExpr]
  deriving (Show)

data Alt = Alt ConName [Name] EngineExpr
type ConName = String

Translator skeleton:

ghcCoreToEngine :: GHC.CoreExpr -> EngineExpr
ghcCoreToEngine = go
  where
    go (Var v)      = EVar (occNameString (getOccName v))
    go (Lit (MachInt n)) = EInt (fromIntegral n)
    go (Lam b e)    = ELam (occNameString (getOccName b)) (go e)
    go (App f a)    = EApp (go f) (go a)
    go (Let (NonRec b rhs) body) =
      ELet (occNameString (getOccName b)) (go rhs) (go body)
    go (Case scrut b _ alts) =
      ECase (go scrut) (map goAlt alts)
    go _ = error "unsupported Core form (first pass)"

    goAlt (DataAlt dc, bs, rhs) =
      Alt (occNameString (getOccName dc))
          (map (occNameString . getOccName) bs)
          (go rhs)
    goAlt _ = error "unsupported alt"

4. CPS engine: core runtime model

4.1 Values, environments, continuations

data Value
  = VInt Int
  | VClosure Name EngineExpr Env
  | VConstr ConName [Value]
  deriving (Show)

type Env = [(Name, Value)]

data Frame
  = FAppFun EngineExpr Env
  | FAppArg Value
  | FLet Name EngineExpr Env
  | FCase [Alt] Env
  deriving (Show)

type Cont = [Frame]  -- head = next frame

4.2 Meta state

data VersionId = VersionId String
data TraceId   = TraceId String

data MetaState = MetaState
  { msVersion :: VersionId
  , msTrace   :: TraceId
  -- hooks for logging, lore, etc.
  }

4.3 Machine state

data MachineState = MachineState
  { msExpr :: EngineExpr
  , msEnv  :: Env
  , msCont :: Cont
  , msMeta :: MetaState
  }

Exactly one semantic frame is active: (msExpr, msEnv).
All other “future work” is in msCont on the heap.

5. CPS interpreter loop

5.1 Engine monad

newtype Engine a = Engine { runEngine :: IO a }
  deriving (Functor, Applicative, Monad, MonadIO)

5.2 Step function

step :: MachineState -> Engine MachineState
step st@MachineState{ msExpr = expr, msEnv = env, msCont = cont } =
  case expr of
    EVar x ->
      case lookup x env of
        Just v  -> continueValue v cont (msMeta st)
        Nothing -> error ("unbound variable: " ++ x)

    ELam x body ->
      continueValue (VClosure x body env) cont (msMeta st)

    EApp f a ->
      pure st { msExpr = f
              , msCont = FAppFun a env : cont
              }

    ELet x e body ->
      pure st { msExpr = e
              , msCont = FLet x body env : cont
              }

    EInt n ->
      continueValue (VInt n) cont (msMeta st)

    EConstr c es ->
      -- naive: evaluate args left-to-right
      case es of
        [] -> continueValue (VConstr c []) cont (msMeta st)
        (e0:rest) ->
          pure st { msExpr = e0
                  , msCont = FAppFun (EConstr c rest) env : cont
                  }

    ECase scrut alts ->
      pure st { msExpr = scrut
              , msCont = FCase alts env : cont
              }

5.3 Continuation handling

continueValue :: Value -> Cont -> MetaState -> Engine MachineState
continueValue v [] meta =
  pure $ MachineState (EInt 0) [] [] meta  -- “done” marker

continueValue v (frame:rest) meta =
  case frame of
    FAppFun arg env ->
      pure $ MachineState arg env (FAppArg v : rest) meta

    FAppArg funVal ->
      case funVal of
        VClosure x body cloEnv ->
          let env' = (x, v) : cloEnv
          in pure $ MachineState body env' rest meta
        _ ->
          error "attempt to call non-function"

    FLet x body env ->
      let env' = (x, v) : env
      in pure $ MachineState body env' rest meta

    FCase alts env ->
      matchCase v alts env rest meta

5.4 Case matching

matchCase :: Value -> [Alt] -> Env -> Cont -> MetaState -> Engine MachineState
matchCase v alts env cont meta =
  case v of
    VConstr c vs ->
      case findAlt c alts of
        Just (Alt _ names body) ->
          let env' = zip names vs ++ env
          in pure $ MachineState body env' cont meta
        Nothing ->
          error "no matching constructor"
    _ ->
      error "case on non-constructor"

findAlt :: ConName -> [Alt] -> Maybe Alt
findAlt c = find (\(Alt c' _ _) -> c' == c)

5.5 Run loop

runMachine :: MachineState -> Engine Value
runMachine st0 = loop st0
  where
    loop st = do
      st' <- step st
      case msCont st' of
        [] -> extractResult st'
        _  -> loop st'

extractResult :: MachineState -> Engine Value
extractResult MachineState{ msExpr = expr, msEnv = env } =
  case expr of
    EInt n -> pure (VInt n)
    _      -> error "non-value at termination"

6. Advanced features: where the CPS model shines

Because continuations and machine states are heap data, we can:

6.1 Hot-swap / version routing

Store VersionId in MetaState.
At each step, consult a version router to decide:
- which implementation of primitives to use,
- which function body to dispatch to (e.g., by name + version).

Hook point: inside step or continueValue, before executing a function body.

6.2 Shadow execution

Clone MachineState (deep copy of msExpr, msEnv, msCont, msMeta).
Run:
- stA under version A,
- stB under version B.
Compare results, logs, or effects.

Shadow execution is “fork the future,” not “clone a stack.”

6.3 Replay / time travel

Periodically snapshot MachineState (or log deltas).
To replay:
- restore a snapshot,
- run forward under new code or different version.

6.4 Semantic drift governance

When semantics change, traverse msCont and:
- rewrite frames (e.g., rename functions, adjust environments),
- migrate data representations,
- annotate frames with drift metadata.

Because frames are just data, you can patch history.

7. Implementation roadmap

Phase 1: Minimal engine
- Implement EngineExpr, Value, Env, Frame, Cont, MachineState.
- Implement step, continueValue, runMachine.
- Hand-construct small EngineExpr programs and run them.
Phase 2: GHC integration
- Write a GHC plugin that dumps Core.
- Implement ghcCoreToEngine for a small subset (ints, lambdas, apps, lets).
- Run simple Haskell programs on the engine.
Phase 3: Extend Core coverage
- Add ADTs, pattern matching, recursion, basic primitives.
- Support more Core constructs (case, letrec, etc.).
Phase 4: Meta and effects
- Add MetaState with versioning and tracing.
- Introduce an Effect layer (IO-like operations via explicit effect nodes).
- Implement effect routing in the engine.
Phase 5: Advanced semantics
- Hot-swap: version routing for functions and primitives.
- Shadow execution: state cloning and dual runs.
- Replay: checkpoints and re-runs.
- Drift governance: continuation rewriting tools.
Phase 6: Tooling and ergonomics
- CLI or library to compile Haskell → EngineCore → run.
- Debug views: pretty-print continuations, traces, versions.
- Optional: GHC plugin that directly invokes the engine for main.

This gives you a full, end-to-end story:

Haskell in → GHC front-end → Core → EngineCore → CPS engine → stackless, mutable-history execution.

ADDENDUM

🌟 The “Big Five” features your runtime truly depends on

These are the features that make your hot‑swap / replay / shadow‑execution / version‑routing runtime easy instead of a nightmare of hacks.

1. Semantic closures

Not just syntactic closures — but closures that:

capture lexical environments predictably
can be reified or at least reasoned about
behave deterministically
don’t hide mutable references in surprising ways

This is the foundation of your entire model.

2. Purity (or at least effect isolation)

You don’t need religious purity.
You need semantic purity:

deterministic functions
explicit effects
replayable computations
serializable state transitions
no hidden side effects

This is what makes:

shadow execution
hot‑swap
drift detection
replay
time travel
version routing

…possible without black magic.

3. First‑class continuations (or CPS‑friendly semantics)

Your runtime is fundamentally:

continuation‑driven
checkpoint‑driven
resumable
hot‑swappable
replayable

You don’t need full call/cc, but you do need:

the ability to reify the “rest of the computation”
or to structure the runtime so that the stack is already in the heap

This is why your “no real stack, only semantic frames” idea is so powerful.

4. Structured concurrency / actor‑style isolation

Your runtime wants:

tasks as independent semantic units
message passing
supervision trees
restartability
deterministic scheduling (or at least observable scheduling)

This is what makes:

shadow execution of tasks
version‑routed execution
hot‑swap of individual tasks
replay of task histories

…tractable.

5. A type system that can express effects, capabilities, or at least “what this function touches”

This doesn’t have to be Haskell‑level purity.
It can be:

algebraic effects
effect rows
capabilities
linear types
region types
ownership types
or even a disciplined annotation system

But you need some way to know:

what a function reads
what it writes
what it depends on
what it captures
what it can be safely swapped with

This is essential for:

drift detection
safe hot‑swap
shadow execution
replay correctness
version routing

🌱 Secondary but still important features

These aren’t strictly required, but they make your life dramatically easier.

6. Immutable data structures as the default

Because:

snapshots become trivial
diffs become trivial
replay becomes trivial
shadow execution becomes trivial

7. Pattern matching / algebraic data types

Because your runtime manipulates:

semantic frames
effect logs
versioned closures
task states

…and ADTs make this clean.

8. Tail‑call friendliness

Not for performance —
but because your runtime is fundamentally CPS‑shaped.

9. A runtime that doesn’t hide too much

Languages like Java, Go, and Python hide:

stack frames
scheduling
GC timing
object layout
closure representation

This makes your runtime harder to implement.

Languages like:

Haskell
OCaml
Scheme
Racket
Clojure
Rust (with some work)

…give you more control.

🎯 The crisp answer to your question

If closures were the only requirement, you could port the runtime anywhere.
But closures are only 1 of the 5 essential features.

The full set is:

Semantic closures
Purity / effect isolation
Continuation‑friendly semantics
Structured concurrency
Effect‑expressive type system

These five together make your runtime natural instead of forced.

Sunday, February 8, 2026

cps ftw