ADDENDUM
Keep t2conduit out of the compiler core: it stays optional, replaceable, and composable with other pipeline designs someone might want to build.
Let me think through what actually needs compiler support versus what can live in macros.
What Macros Can Do
A sufficiently powerful macro system, one that rewrites code before compilation the way Clojure's macros or Rust's procedural macros do, can do most of what t2conduit needs:
Fusion is just AST rewriting. If the macro sees a sequence of (conduit/map f) forms where f carries ^:pure metadata, it can rewrite them into a single fused form before any code is emitted. The macro walks the pipeline form, collects adjacent pure stages, and produces a single generator form. This is exactly what Clojure macros do with -> and what core.async does with go — the macro transforms the entire body into something the compiler never sees in its original form.
; What the programmer writes:
(pipeline
  (source/array users)
  (conduit/map normalize-email)   ; ^:pure
  (conduit/filter adult?)         ; ^:pure
  (conduit/tap log-user)          ; no ^:pure
  (conduit/map extract-email)     ; ^:pure
  (sink/to-array))
; What the pipeline macro expands to, with fusion already done:
(let [seg1 (fn* [iter]
             ; fused map + filter: one pass, no intermediate collection
             (loop [xs (seq iter) acc []]
               (if (nil? xs)
                 acc
                 (let [u (normalize-email (first xs))]
                   (recur (next xs)
                          (if (adult? u) (conj acc u) acc))))))
      seg2 (fn* [iter]
             ; impure tap: run for side effects, pass the input through
             (doseq [u iter] (log-user u))
             iter)
      seg3 (fn* [iter]
             (map extract-email iter))]
  (-> users seg1 seg2 seg3 vec))
The compiler sees none of the Source/Conduit/Sink concepts. It just sees functions and function application.
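The grouping decision behind seg1, seg2 and seg3 can be sketched on its own. The following is a minimal illustration in TypeScript (t2lang's compile target), not t2conduit's actual implementation: the `Stage` type and the `segment` function are hypothetical names, and the `pure` flag stands in for the ^:pure metadata the macro would read.

```typescript
// Hypothetical model of a pipeline stage as the macro might see it.
// `pure` stands in for the ^:pure metadata on the stage's function.
type Stage = { op: "map" | "filter" | "tap"; fn: string; pure: boolean };

// Group adjacent pure stages into one fusable segment; each impure
// stage becomes a segment of its own and acts as a fusion barrier.
function segment(stages: Stage[]): Stage[][] {
  const segments: Stage[][] = [];
  let current: Stage[] = [];
  for (const stage of stages) {
    if (stage.pure) {
      current.push(stage);
    } else {
      if (current.length > 0) segments.push(current);
      segments.push([stage]);
      current = [];
    }
  }
  if (current.length > 0) segments.push(current);
  return segments;
}

const stages: Stage[] = [
  { op: "map", fn: "normalize-email", pure: true },
  { op: "filter", fn: "adult?", pure: true },
  { op: "tap", fn: "log-user", pure: false },
  { op: "map", fn: "extract-email", pure: true },
];

const segmentNames = segment(stages).map((seg) => seg.map((s) => s.fn));
// segmentNames: [["normalize-email", "adult?"], ["log-user"], ["extract-email"]]
```

The three resulting groups line up with the three segment functions in the expansion: the impure tap splits the pipeline, and everything on either side of it is free to fuse.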
Purity metadata is already a macro concern. Reading ^:pure off a var's metadata is something a macro does naturally — (:pure (meta (resolve 'normalize-email))). The macro checks this at macro-expansion time and decides whether to fuse.
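As a rough analogue of that lookup, here is a small TypeScript sketch. The `vars` table and `isPure` helper are hypothetical stand-ins for whatever t2lang's expander would expose. Treating an unresolved symbol as impure is an assumption made here, not something the design above specifies; it is simply the choice that can never cause an incorrect fusion.

```typescript
// Hypothetical stand-in for the var table a macro expander would consult.
type VarMeta = { pure?: boolean; async?: boolean };

const vars = new Map<string, VarMeta>([
  ["normalize-email", { pure: true }],
  ["adult?", { pure: true }],
  ["log-user", {}],
]);

// Analogue of (:pure (meta (resolve sym))).
// Assumption: unknown symbols count as impure, so they only ever block fusion.
function isPure(sym: string): boolean {
  return vars.get(sym)?.pure === true;
}

const purity = ["normalize-email", "log-user", "not-defined"].map(isPure);
// purity: [true, false, false]
```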
The sync/async inference is also macro-tractable. The macro walks the pipeline stages, checks whether any resolved var has ^:async metadata, and emits either a (fn* [iter] (for [x iter] ...)) or an (async-fn* [iter] (for-await [x iter] ...)) form accordingly.
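The inference itself is a single pass over the stages. A minimal TypeScript sketch, assuming each stage has already been resolved to a name plus an `async` flag (the `StageMeta` type, `inferMode`, `loopHead`, and the `fetch-profile` stage are all hypothetical):

```typescript
type Mode = "sync" | "async";
type StageMeta = { name: string; async: boolean };

// One async stage is enough to make the whole pipeline async.
function inferMode(stages: StageMeta[]): Mode {
  return stages.some((s) => s.async) ? "async" : "sync";
}

// Pick the loop form to emit. The strings echo the forms named in the text.
function loopHead(mode: Mode): string {
  return mode === "async"
    ? "(async-fn* [iter] (for-await [x iter] ...))"
    : "(fn* [iter] (for [x iter] ...))";
}

const syncMode = inferMode([{ name: "normalize-email", async: false }]);
const asyncMode = inferMode([
  { name: "normalize-email", async: false },
  { name: "fetch-profile", async: true }, // hypothetical async stage
]);
```

Here `syncMode` is "sync" and `asyncMode` is "async": a single async stage promotes the entire pipeline, which is why the check has to see every stage before any code is emitted.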
What Actually Needs Compiler Support
Honestly, very little — and what does need it is general-purpose, not t2conduit-specific:
1. Generator syntax. The macro needs to emit something like (generator [x iter] (yield x)). That generator / yield syntax either needs to be a compiler primitive or itself be another macro that desugars to continuation-passing or state-machine style. Clojure doesn't have generators but t2lang targets TypeScript, which does — so the question is whether function* / yield are exposed as t2lang primitives. If they are (even as thin compiler builtins), the t2conduit macro can emit them. If not, someone needs to add them — but that's a general t2lang feature, not a t2conduit feature.
2. Metadata on function definitions. ^:pure and ^:async need to survive from definition site to macro-expansion time at the call site. This is how Clojure works — var metadata is first-class and readable by macros. If t2lang has this (and it should, being an s-expression language), t2conduit needs nothing special from the compiler.
3. Macro-expansion access to resolved vars. The pipeline macro needs to call resolve or equivalent to look up what normalize-email actually is and read its metadata. This is standard in Lisp macro systems.
That's genuinely it. No special IR. No fusion pass registered in the compiler. No special handling of Source/Conduit/Sink types.
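To make the generator requirement concrete, this is roughly what a fused map + filter segment could look like once it reaches TypeScript. It is a sketch under assumptions: the `User` shape and the bodies of `normalizeEmail` and `isAdult` are invented for illustration, and the real output would depend on how t2lang exposes function* and yield.

```typescript
type User = { email: string; age: number };

// Hypothetical pure stage functions.
const normalizeEmail = (u: User): User => ({
  ...u,
  email: u.email.trim().toLowerCase(),
});
const isAdult = (u: User): boolean => u.age >= 18;

// What a fused map + filter segment could compile to: one generator,
// one pass, no intermediate array between the two stages.
function* fusedSegment(iter: Iterable<User>): Generator<User> {
  for (const x of iter) {
    const u = normalizeEmail(x);
    if (isAdult(u)) yield u;
  }
}

const users: User[] = [
  { email: " Ada@Example.com ", age: 36 },
  { email: "kid@example.com", age: 12 },
];
const emails = Array.from(fusedSegment(users)).map((u) => u.email);
// emails: ["ada@example.com"]
```

Nothing in this output mentions Source, Conduit, or Sink, which is the point: the only language feature it leans on is the general-purpose generator.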
The Practical Split
t2lang compiler provides:
- macro system with AST access
- var metadata (^:pure, ^:async, etc.)
- generator primitives (function*, yield, async function*, await)
- (resolve sym) at macro-expansion time
t2conduit provides (entirely as macros + runtime functions):
- (pipeline ...) macro — does fusion, mode inference, emits generators
- (source/array ...), (source/range ...) etc. — simple runtime functions
- (conduit/map ...), (conduit/filter ...) etc. — annotated runtime functions
- (sink/to-array ...) etc. — simple runtime functions
- Pure.Number, Pure.String etc. — the stdlib, just annotated functions
- The flat-loop optimization — a branch inside the pipeline macro
The pipeline macro is doing real work — it's a small compiler for pipeline forms. But it's a library macro, not a compiler pass. Anyone who doesn't want t2conduit just doesn't require it.
The One Tension
The flat-loop optimization — where an all-pure, all-sync, array-source pipeline lowers to a plain for loop — is harder to guarantee from a macro than from a compiler pass, because the macro can emit the right code but can't prevent the JS engine from doing something unexpected with it. In practice this doesn't matter: you emit a plain for loop and V8 will optimize it fine. But if you wanted to prove the optimization happened, you'd need a compiler pass. Since we're not claiming that, a macro is fine.
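For comparison with the generator form, the flat-loop lowering might emit something like the following. This is an illustrative sketch only; `double` and `isBig` are hypothetical pure stages standing in for whatever the pipeline contains.

```typescript
// Hypothetical pure stage functions, as in an all-pure, all-sync pipeline.
const double = (n: number): number => n * 2;
const isBig = (n: number): boolean => n > 4;

// The flat-loop lowering: index-based iteration over the source array,
// with map and filter inlined into the loop body and no generator involved.
function flatLoop(source: number[]): number[] {
  const out: number[] = [];
  for (let i = 0; i < source.length; i++) {
    const v = double(source[i]);
    if (isBig(v)) out.push(v);
  }
  return out;
}

const result = flatLoop([1, 2, 3, 4]);
// result: [6, 8]
```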
Summary
t2conduit can be almost entirely macros if t2lang gives you three things: a real macro system with resolve and metadata access, generator syntax as primitives, and ^:pure/^:async surviving to macro-expansion time. t2lang should have all three for its own reasons, independent of t2conduit. Nothing about t2conduit needs to be special-cased in the compiler. The pipeline macro is the heart of the system, and it's just a macro: sophisticated, but a macro.