Overview

This is a high-level overview of how the different parts of Serene's compiler work together. It is meant to be the entry point for developers who wish to understand the internals.

Serene is going to be a self-hosted compiler — its real compiler will be written in Serene itself. To get there we first need to bootstrap, and that is what we are building right now: the stage-0 compiler, a throwaway whose only job is to compile enough of Serene to write the self-hosted compiler in.

Forward-looking. This describes the intended architecture, assuming the in-progress pieces are finished. Several parts — most of the backend, the compiler↔runtime bridge, and large chunks of the type checker — are still under construction today. Read this for the shape of the system, not its current state. Sections that lean on unfinished work are marked (in progress).

The shape of stage 0

The stage-0 compiler is called lscz ("lxsameer's serene compiler"). Its only job is to compile enough of Serene that Serene can be written in itself. It has two halves: a front end that produces well-typed core terms, and a runtime those terms ultimately run on. The bridge between them is codegen plus a small ABI. The following diagram captures the big picture.

flowchart TD
  src(["Serene Source"]) --> parser

  subgraph FE["Front-end · lscz"]
    direction TB
    parser("Parser") --> elab
    subgraph GR["Graph reduction"]
      direction LR
      elab("Elaborator") --> type("TypeChecker") --> core("Typed Core · QTT")
    end
  end

  subgraph RT["Runtime"]
    direction TB
    subgraph JIT["JIT"]
      direction LR
      eval("Evaluate") --> llvm("LLVM Backend")
    end
    llvm --> value[("Values")]
    mm("Memory Manager") <--> value
    ds("Data Structures") <--> value
    fiber("Fiber Subsystem") <--> value
    io("IO Reactor") <--> fiber
    ffi("FFI") <--> value
  end

  core ==> eval
  value -. "read back" .-> type
  llvm --> prog[/"Executable"/]
  ffi --> world(["World"])

  %% Serene palette: purple front-end bands, amber runtime bands, purple hub.
  classDef feBand fill:#f1eaf3,stroke:#7c3a8f,color:#1d141f
  classDef grBand fill:#e4d4ec,stroke:#5e246d,color:#1d141f
  classDef rtBand fill:#fff3d6,stroke:#cf9526,color:#241c10
  classDef jitBand fill:#ffe7b0,stroke:#b5752a,color:#241c10
  classDef hub fill:#5e246d,stroke:#431950,color:#ffffff,stroke-width:2px
  classDef proc fill:#ffffff,stroke:#7c3a8f,color:#1d141f
  classDef term fill:#faf8fb,stroke:#6a5f6e,color:#1d141f

  class FE feBand
  class GR grBand
  class RT rtBand
  class JIT jitBand
  class value hub
  class parser,elab,type,core,eval,llvm,mm,ds,fiber,io,ffi proc
  class src,prog,world term

The front end: `lscz`

The front end is written in Idris2 and is made up of small, total passes that form a graph reduction pipeline. It reads Serene code via Serene.Reader, stores the syntactically correct forms in a graph (Serene.Graph), and runs the nodes through the pipeline on demand.

Forms -> High-level language -> Elaborate -> Well typed core TT -> Type checker -> Core Terms
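
The "on demand" part of the pipeline can be pictured with a small sketch. It is written in C only to match the runtime's language; the real passes live in Idris2, and every name here (`Stage`, `Node`, `demand`, the two placeholder passes) is hypothetical and not part of lscz. The idea shown is simply that a node remembers how far it has been processed, and asking for a later stage runs only the passes it has not been through yet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stage names, loosely following the pipeline above. */
typedef enum { STAGE_FORM = 0, STAGE_ELABORATED, STAGE_CHECKED, STAGE_COUNT } Stage;

typedef struct Node {
    const char *name;  /* illustrative label only */
    Stage reached;     /* furthest stage already computed for this node */
    int passes_run;    /* counts pass invocations so caching is observable */
} Node;

/* A pass moves a node from stage s to stage s + 1. Real passes would
   rewrite the node's payload; these placeholders only record that they ran. */
typedef void (*Pass)(Node *);

static void elaborate(Node *n)  { n->passes_run++; }
static void type_check(Node *n) { n->passes_run++; }

/* passes[s] advances a node from stage s to stage s + 1. */
static const Pass passes[STAGE_COUNT - 1] = { elaborate, type_check };

/* Demand-driven: run only the passes this node has not been through yet. */
void demand(Node *n, Stage want) {
    assert(want < STAGE_COUNT);
    while (n->reached < want) {
        passes[n->reached](n);
        n->reached = (Stage)(n->reached + 1);
    }
}
```

Calling `demand(&node, STAGE_CHECKED)` twice in a row runs each pass once; the second call finds the node already at the requested stage and does nothing.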

It's pretty straightforward on the surface, but each pass has its own level of complexity. You can find out more by reading through the lscz API Reference.

The front-end works with the runtime to compile the core terms to values, and read them back if necessary for type checking or macro expansion.

That being said, lscz is designed to support different backends. For example, it has a simple interpretation backend that does not go through the JIT compiler.

The Runtime

The runtime is a static C library (libserene.runtime.a) that compiled Serene code links against. It owns everything that has to exist at run time and provides support for running programs. It has many components and subsystems, such as:

Object model & data structures. Serene values are immutable. The runtime ships persistent collections — cons lists, a vector-trie seq, and a HAMT-backed map — so "updates" share structure instead of copying.
Memory manager. A pluggable memory manager that supports different implementations. Allocation goes through a block-based arena allocator by default, but it is easy to hook a garbage collector into it as well. (mm)
Concurrency. The runtime has stackful fibers multiplexed over OS threads by an M:N work-stealing scheduler, with an IO reactor for non-blocking IO. All programs that run on the runtime use the fiber subsystem by default.
Execution. A JIT, built on LLVM, runs code at compile time (for the evaluator). An ABI/FFI layer lets Serene call C and vice versa.
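
To make structural sharing and arena allocation concrete, here is a minimal sketch in C. It is not the runtime's actual code: `Arena`, `arena_alloc`, `Cons`, and `cons` are hypothetical names, the arena is a single fixed-size block rather than a chain of blocks, and the list holds plain `int`s instead of Serene values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical single-block bump arena. A real block-based arena would
   chain further blocks when this one fills up. */
typedef struct Arena {
    unsigned char *base;
    size_t used, cap;
} Arena;

/* Returns 1 on success, 0 if the backing block could not be allocated. */
int arena_init(Arena *a, size_t cap) {
    a->base = malloc(cap);
    a->used = 0;
    a->cap = a->base ? cap : 0;
    return a->base != NULL;
}

/* Bump allocation: round the cursor up to the strictest alignment,
   then hand out the next `size` bytes, or NULL if the block is full. */
void *arena_alloc(Arena *a, size_t size) {
    size_t align = _Alignof(max_align_t);
    size_t start = (a->used + align - 1) & ~(align - 1);
    if (start > a->cap || size > a->cap - start) return NULL;
    a->used = start + size;
    return a->base + start;
}

/* Everything allocated from the arena is released at once. */
void arena_free(Arena *a) {
    free(a->base);
    a->base = NULL;
    a->used = a->cap = 0;
}

/* Immutable cons cell; NULL stands for the empty list. */
typedef struct Cons {
    int head;
    const struct Cons *tail;
} Cons;

/* Prepending allocates one new cell and reuses the existing tail as-is.
   Running out of arena space is treated as a programming error here. */
const Cons *cons(Arena *a, int head, const Cons *tail) {
    Cons *c = arena_alloc(a, sizeof *c);
    assert(c != NULL);
    c->head = head;
    c->tail = tail;
    return c;
}

size_t list_length(const Cons *l) {
    size_t n = 0;
    for (; l != NULL; l = l->tail) n++;
    return n;
}
```

Two lists built with `cons(&a, 1, base)` and `cons(&a, 9, base)` point at the same `base` cells, so each "update" costs one cell regardless of how long the shared tail is. Because cells are never mutated, that sharing is safe.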

See the runtime API reference for the concrete types and functions.

Connecting the two halves

The front end produces typed core terms; the runtime knows how to hold values and run code. Two paths bridge them, and they share one value representation:

Compile-time evaluation. Normalization during type checking needs to actually reduce terms. The interpreter (and, for speed, the runtime's JIT) evaluates core terms into runtime values, which flow back into the graph.
Code generation (in progress). The backend (Serene.Compiler.Backend) lowers fully-checked core terms to native code (via the runtime API). Emitted code is just calls into the runtime's object model, data structures, and scheduler.
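
The evaluate-then-read-back loop can be sketched on a toy term language. This is an illustration under heavy assumptions, not Serene's core: the real core terms are far richer than integer literals and addition, and `Term`, `Value`, `eval`, and `read_back` are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Toy core terms: integer literals and addition only. */
typedef enum { TERM_LIT, TERM_ADD } TermTag;

typedef struct Term {
    TermTag tag;
    long lit;                      /* used when tag == TERM_LIT */
    const struct Term *lhs, *rhs;  /* used when tag == TERM_ADD */
} Term;

/* Hypothetical runtime value. It carries no type information:
   by this point the type checker has already done its job. */
typedef struct Value {
    long i;
} Value;

/* Evaluate a term to a runtime value by structural recursion. */
Value eval(const Term *t) {
    switch (t->tag) {
    case TERM_LIT:
        return (Value){ t->lit };
    case TERM_ADD: {
        Value l = eval(t->lhs);
        Value r = eval(t->rhs);
        return (Value){ l.i + r.i };
    }
    }
    assert(!"unreachable: unknown term tag");
    return (Value){ 0 };
}

/* Read a value back into a term in normal form, so the front end
   can continue working with it (for example, comparing two types). */
Term read_back(Value v) {
    return (Term){ .tag = TERM_LIT, .lit = v.i, .lhs = NULL, .rhs = NULL };
}
```

Evaluating the term for `2 + 3` yields a value holding 5, and reading that value back gives the literal term `5`, which is the normal form the type checker would see.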

Thanks to the type checker, what reaches the runtime is ordinary first-order code over the runtime's value model — no types travel to runtime.
