
CNN → NeRF

License: Public Domain (CC0)



Neural LOD Booster (NLB)

A Neural Appearance Layer for Real‑Time Engines

Technical Design Document — Unified Revision (with ObjectField‑Based NeRF Pipeline)


1. Introduction

This document describes a two‑tier neural rendering architecture:

  1. Tier 1 — Mesh‑Conditioned Neural LOD Booster (NLB)
    A practical, shippable system where tiny per‑model CNNs reconstruct high‑fidelity shading from low‑poly meshes.

  2. Tier 2 — Neural Scene Graph Renderer (NSGR)
    A future‑facing extension where a NeRF‑like model renders the scene directly from structured object data (ObjectFields), eliminating the need for meshes on the GPU.

The core principle:

NLB is not a geometry system. It is an appearance system that sits on top of whatever low‑poly geometry representation the engine uses.

Tier 1 uses meshes as the representation.
Tier 2 uses ObjectFields instead.


2. Tier 1 — Mesh‑Conditioned Neural LOD Booster (Baseline System)

(Carried over unchanged from the previous revision of this document; Tier 1 is the practical, shippable system.)


3. Tier 2 — Neural Scene Graph Renderer (NSGR)

A future extension that eliminates meshes from the GPU entirely

Tier 2 extends the Mesh+CNN system into a scene‑level neural renderer that uses a NeRF‑like model conditioned on structured object data.

The key shift:

Tier 1 is mesh‑conditioned neural shading.
Tier 2 is object‑conditioned neural rendering.

Meshes disappear from the rendering path.


3.1 Motivation

Pure NeRF worlds are beautiful but unusable for games because they lack:

  • object identity
  • physics
  • determinism
  • editing
  • consistency

We fix this by inserting a structured semantic layer.


3.2 Object Fields: The Missing Middle Layer

We introduce a universal, engine‑friendly representation:

ObjectField = {
  type,
  position,
  orientation,
  boundingVolume,
  material / styleEmbedding,
  physicalProperties,
  lowPolyProxy (optional, for physics only)
}
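
A minimal sketch of this record as a Python dataclass; the concrete field types, array shapes, and embedding size are illustrative assumptions, not part of the specification:

from dataclasses import dataclass, field
from typing import Optional
import numpy as np

@dataclass
class ObjectField:
    type: str                     # semantic identity, e.g. "cabin"
    position: np.ndarray          # world-space translation, shape (3,)
    orientation: np.ndarray       # rotation quaternion, shape (4,)
    bounding_volume: np.ndarray   # AABB as (min_xyz, max_xyz), shape (2, 3)
    material: str                 # coarse material class, e.g. "wood"
    style_embedding: np.ndarray   # latent appearance code, e.g. shape (64,)
    physical_properties: dict = field(default_factory=dict)  # mass, friction, ...
    low_poly_proxy: Optional[object] = None  # CPU-side proxy mesh, physics only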

This gives:

  • physics engines → colliders, rigid bodies
  • gameplay systems → semantic identity
  • neural renderer → appearance conditioning

3.3 Nuance: Tier 2 does not require meshes on the GPU

This is the crucial distinction.

Tier 1

  • Needs low‑poly meshes on GPU
  • Mesh → G‑buffer → CNN → shading

Tier 2

  • Does not need meshes at all
  • ObjectFields → NeRF‑like renderer → pixels
  • Physics uses bounding volumes, not meshes
  • Rendering is fully neural

Meshes only exist offline (for training) or CPU‑side for physics if needed.


3.4 NeRF as a Conditional Neural Renderer

The NeRF‑like model becomes a large conditional network (an MLP in the classic NeRF formulation) that renders the scene:

\[
f(\mathbf{x}, \mathbf{d} \mid \text{ObjectFields}) \rightarrow (\text{color}, \text{density})
\]

It no longer hallucinates the world.
It renders the structured world you give it.
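
Concretely, the standard volume‑rendering quadrature from the NeRF literature carries over unchanged; only the field f is now conditioned on the ObjectFields:

\[
\hat{C}(\mathbf{r}) = \sum_{i=1}^{N} T_i \left(1 - e^{-\sigma_i \delta_i}\right) \mathbf{c}_i,
\qquad
T_i = \exp\left(-\sum_{j=1}^{i-1} \sigma_j \delta_j\right)
\]

where \((\mathbf{c}_i, \sigma_i) = f(\mathbf{x}_i, \mathbf{d} \mid \text{ObjectFields})\) at the i‑th sample along the ray and \(\delta_i\) is the spacing between adjacent samples.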

This eliminates:

  • view inconsistency
  • geometry drift
  • hallucinations

And preserves:

  • neural shading
  • neural detail
  • neural style
  • neural lighting

3.5 The ObjectField‑Based NeRF Pipeline (Expanded Design)

The ObjectField‑based NeRF pipeline has three major stages:


Stage 1 — Text → ObjectFields (Semantic World Generation)

NeRFs cannot infer objects.
So we introduce a companion model:

Text‑to‑World Object Model (TWOM)

A lightweight generative model that converts high‑level descriptions into structured ObjectFields.

Example:

"small wooden cabin with a stone chimney"
→
[
  {type:"cabin", position:(…), orientation:(…), boundingVolume:(…), material:"wood", styleEmbedding:(…)},
  {type:"chimney", position:(…), orientation:(…), boundingVolume:(…), material:"stone", styleEmbedding:(…)}
]

TWOM can be implemented as:

  • a scene‑graph generator
  • a diffusion‑based object placer
  • a transformer trained on scene descriptions
  • a hybrid symbolic + neural system

Output: A complete list of ObjectFields.
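
The interface below is a hypothetical sketch of TWOM, reusing the ObjectField dataclass from Section 3.2. The stub hard‑codes the cabin example; a real implementation would decode the description with one of the generative models listed above:

from typing import List
import numpy as np

def twom_generate(description: str) -> List[ObjectField]:
    # A real TWOM would run a trained model over the description;
    # this placeholder returns the hand-written cabin example.
    return [
        ObjectField(
            type="cabin",
            position=np.array([0.0, 0.0, 0.0]),
            orientation=np.array([0.0, 0.0, 0.0, 1.0]),
            bounding_volume=np.array([[-3.0, 0.0, -3.0], [3.0, 3.0, 3.0]]),
            material="wood",
            style_embedding=np.zeros(64),  # would come from the model
        ),
        ObjectField(
            type="chimney",
            position=np.array([2.0, 3.0, 0.0]),
            orientation=np.array([0.0, 0.0, 0.0, 1.0]),
            bounding_volume=np.array([[1.5, 3.0, -0.5], [2.5, 5.5, 0.5]]),
            material="stone",
            style_embedding=np.zeros(64),
        ),
    ]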


Stage 2 — ObjectFields → Physics + Gameplay

ObjectFields are fed into the physics and gameplay systems:

Physics Engine

  • Uses boundingVolume for collisions
  • Updates transforms
  • Handles rigid bodies, joints, constraints

Gameplay Systems

  • Use type, material, and semantic ID
  • Attach scripts, AI, interactions

World State

  • Stored as a dynamic list of ObjectFields
  • Updated every frame

This ensures:

  • determinism
  • editability
  • multiplayer sync
  • gameplay consistency
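
A minimal sketch of the per‑frame update, assuming a physics engine that exposes step/get_transform style calls (these names are hypothetical):

def update_world(object_fields, physics, dt):
    # Physics advances rigid bodies using bounding volumes (or the
    # optional low-poly proxies), never render geometry.
    physics.step(dt)

    # Write simulated transforms back into the ObjectFields so gameplay
    # and the neural renderer observe one consistent world state.
    for obj in object_fields:
        obj.position, obj.orientation = physics.get_transform(obj)

    return object_fields  # consumed next by the NeRF-style renderer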

Stage 3 — ObjectFields → NeRF‑Style Renderer

The NeRF‑like renderer consumes ObjectFields as conditioning input.

3.5.1 Conditioning Mechanisms

Each ObjectField provides:

  • a latent style embedding
  • a material embedding
  • a transform
  • a bounding region

The renderer uses these to determine:

  • which objects influence each ray
  • how materials should look
  • how lighting interacts with surfaces
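
One plausible conditioning mechanism, sketched in PyTorch, is to concatenate the per‑object style embedding onto the encoded sample position before the field MLP; the layer sizes and encoding dimensions are assumptions:

import torch
import torch.nn as nn

class ConditionedField(nn.Module):
    """NeRF-style field MLP conditioned on a per-object embedding."""

    def __init__(self, pos_dim=63, embed_dim=64, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(pos_dim + embed_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # 3 color channels + 1 density
        )

    def forward(self, encoded_pos, style_embedding):
        # encoded_pos:     (N, 63) positional encoding of sample points
        # style_embedding: (N, 64) embedding of the ObjectField whose
        #                  bounding volume contains each sample
        h = torch.cat([encoded_pos, style_embedding], dim=-1)
        out = self.mlp(h)
        rgb = torch.sigmoid(out[..., :3])   # color in [0, 1]
        density = torch.relu(out[..., 3:])  # non-negative density
        return rgb, density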

3.5.2 Rendering Process

For each pixel:

  1. Cast a ray
  2. Query relevant ObjectFields
  3. Inject object embeddings into the NeRF network
  4. Evaluate neural radiance + density
  5. Composite results
  6. Output final color
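
A sketch of steps 1–6 for a single pixel, using the ConditionedField above; lookup_object_embeddings and positional_encode are assumed helpers for the bounding‑volume query and the standard frequency encoding:

def render_ray(origin, direction, object_fields, field_net, n_samples=64):
    # 1-2. Sample points along the ray and query which ObjectField's
    # bounding volume contains each point (assumed helper).
    t = torch.linspace(0.1, 10.0, n_samples)
    points = origin + t[:, None] * direction                  # (N, 3)
    embeds = lookup_object_embeddings(points, object_fields)  # (N, 64)

    # 3-4. Inject embeddings and evaluate neural radiance + density.
    rgb, sigma = field_net(positional_encode(points), embeds)

    # 5-6. Composite with the standard alpha-blending quadrature.
    delta = torch.cat([t[1:] - t[:-1], t.new_ones(1)])
    alpha = 1.0 - torch.exp(-sigma.squeeze(-1) * delta)
    trans = torch.cumprod(
        torch.cat([alpha.new_ones(1), 1.0 - alpha + 1e-10]), dim=0
    )[:-1]
    weights = alpha * trans
    return (weights[:, None] * rgb).sum(dim=0)  # final pixel color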

3.5.3 No Meshes Required

The renderer does not need:

  • vertices
  • triangles
  • UVs
  • tangents
  • topology

It only needs:

  • object embeddings
  • transforms
  • bounding volumes

3.6 Why This Architecture Works

This division of labor ensures:

  • Physics works because objects have bounding volumes.
  • Gameplay works because objects have identity and transforms.
  • Rendering is neural because the NeRF consumes ObjectFields.
  • There are no hallucinations because the renderer does not invent geometry.
  • Editing is possible because ObjectFields are explicit and modifiable.

This makes Tier 2 a game‑ready neural rendering architecture, not a black‑box generative scene.


4. Summary

Tier 1 — Mesh‑Conditioned Neural LOD Booster

  • Requires low‑poly meshes on GPU
  • A per‑model CNN reconstructs high‑fidelity shading
  • Works on all hardware
  • Practical and shippable

Tier 2 — Neural Scene Graph Renderer

  • Requires no meshes on GPU
  • NeRF‑like renderer consumes ObjectFields
  • Physics uses bounding volumes
  • Fully neural rendering
  • Eliminates hallucinations
  • Provides scene‑level neural shading
  • Uses TWOM to convert text → ObjectFields

Together, they form a unified neural appearance architecture that complements — not replaces — existing geometry systems like Nanite, voxels, SDFs, splats, and neural fields.
