Murk Concepts Guide

This guide explains the mental model behind Murk. It’s written for someone who has run the heat_seeker example and wants to build something of their own.

Every Murk simulation has five components:

  1. A Space — the topology cells live on
  2. Fields — per-cell data stored in arenas
  3. Propagators — stateless operators that update fields each tick
  4. Commands — how actions from outside enter the simulation
  5. Observations — how state gets extracted for agents or renderers

These components are configured once, compiled into a world, and then ticked forward repeatedly. The rest of this guide explains each one.


Spaces & Topologies

A space defines how many cells exist and which cells are neighbors. Murk ships with seven built-in space backends:

| Space | Dims | Neighbors | Parameters | Distance metric |
|---|---|---|---|---|
| Line1D | 1D | 2 | length, edge | Manhattan |
| Ring1D | 1D | 2 (periodic) | length | min(fwd, bwd) |
| Square4 | 2D | 4 (N/S/E/W) | width, height, edge | Manhattan |
| Square8 | 2D | 8 (+ diagonals) | width, height, edge | Chebyshev |
| Hex2D | 2D | 6 (pointy-top) | cols, rows | Cube distance |
| Fcc12 | 3D | 12 (face-centred cubic) | w, h, d, edge | FCC metric |
| ProductSpace | N-D | varies | list of component spaces | L1 sum |
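
The metrics named in the table can be sketched in a few lines of plain Python. This illustrates the standard definitions and is not Murk's own code: the function names are invented for this example, and the hex version assumes the usual conversion from axial (q, r) to cube coordinates. The FCC metric is left out because the table does not define it.

```python
def manhattan(a, b):
    """L1 distance: sum of per-axis absolute differences (Line1D, Square4)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    """L-infinity distance: largest per-axis difference (Square8)."""
    return max(abs(x - y) for x, y in zip(a, b))


def ring_distance(a, b, length):
    """Shorter of the forward and backward walks around a ring (Ring1D)."""
    fwd = (b - a) % length
    return min(fwd, length - fwd)


def hex_cube_distance(a, b):
    """Cube distance between two axial (q, r) hex coordinates (Hex2D)."""
    dq = a[0] - b[0]
    dr = a[1] - b[1]
    # In cube coordinates the third axis is s = -q - r, so ds = -(dq + dr).
    return max(abs(dq), abs(dr), abs(dq + dr))


print(manhattan([0, 0], [2, 3]))          # 5
print(chebyshev([0, 0], [2, 3]))          # 3
print(ring_distance(1, 9, length=10))     # 2
print(hex_cube_distance([0, 0], [2, -1])) # 2
```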

Choosing a space

  • Line1D / Ring1D — 1D cellular automata, queues, pipelines.
  • Square4 — grid worlds, pathfinding, Conway’s Game of Life.
  • Square8 — grid worlds where diagonal movement matters.
  • Hex2D — isotropic 2D movement without diagonal bias.
  • Fcc12 — 3D isotropic lattice (12 equidistant neighbors). Good for volumetric simulations like crystal growth or 3D diffusion.
  • ProductSpace — compose any spaces together (e.g., Hex2D x Line1D for a hex map with a vertical elevation axis).

Edge behaviors

Spaces that have boundaries support three edge behaviors:

| Behavior | At boundary | Example use |
|---|---|---|
| Absorb | Edge cells have fewer neighbors | Bounded arena, finite grid |
| Clamp | Beyond-edge maps to edge cell | Image processing, extrapolation |
| Wrap | Wraps to opposite side (torus) | Pac-Man map, periodic simulation |

Ring1D is always periodic (wrap). Hex2D only supports Absorb.
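
To make the three behaviors concrete, here is a minimal sketch of neighbor lookup on a 1D line. It is a stand-in for illustration only: the string names and both helper functions are hypothetical, and Murk's real edge handling may differ in detail.

```python
def resolve(index, length, edge):
    """Map a possibly out-of-range 1D index to a cell, or None if no cell exists."""
    if 0 <= index < length:
        return index
    if edge == "absorb":
        return None                            # no neighbor beyond the boundary
    if edge == "clamp":
        return min(max(index, 0), length - 1)  # snap to the nearest edge cell
    if edge == "wrap":
        return index % length                  # re-enter from the opposite side
    raise ValueError(f"unknown edge behavior: {edge}")


def neighbors(index, length, edge):
    """Left and right neighbors of a cell, dropping any that do not exist."""
    candidates = (resolve(index - 1, length, edge), resolve(index + 1, length, edge))
    return [c for c in candidates if c is not None]


for edge in ("absorb", "clamp", "wrap"):
    print(edge, neighbors(0, 5, edge))
# absorb [1]
# clamp [0, 1]
# wrap [4, 1]
```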

Coordinates

Every cell has a coordinate — a small vector of i32 values:

  • Line1D / Ring1D: [x]
  • Square4 / Square8: [row, col]
  • Hex2D: [q, r] (axial, pointy-top)
  • Fcc12: [x, y, z] where (x + y + z) % 2 == 0
  • ProductSpace: concatenation of component coordinates

Cells are stored in canonical order (a deterministic traversal of all coordinates). When you read a field as a flat f32 array, element i corresponds to canonical coordinate i. For 2D grids this is row-major order.
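
Assuming row-major order and [row, col] coordinates as described above, the mapping between a coordinate and its position in the flat array can be sketched as follows. The helper names are hypothetical, and `width` is taken to be the number of columns.

```python
def flat_index(row, col, width):
    """Position of cell [row, col] in a row-major flat array."""
    return row * width + col


def coord_of(index, width):
    """Inverse mapping: flat position back to [row, col]."""
    row, col = divmod(index, width)
    return [row, col]


width, height = 4, 3
field = [float(i) for i in range(width * height)]  # stand-in for a flat f32 field

i = flat_index(2, 1, width)
print(i, field[i], coord_of(i, width))  # 9 9.0 [2, 1]
```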

Cell count

The number of cells is determined by the space parameters:

  • Line1D(5) → 5 cells
  • Square4(10, 10) → 100 cells
  • Hex2D(8, 8) → 64 cells
  • Fcc12(4, 4, 4) → 32 cells (about w*h*d / 2, because of the parity constraint)

This matters because every field allocates cell_count * components floats per generation.
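
The counts and the memory cost can be checked with a short sketch. It assumes Fcc12 coordinates run from 0 to w-1, h-1 and d-1 and keeps only sites with an even coordinate sum, as in the Coordinates section; both function names are invented for this example.

```python
from itertools import product


def fcc12_cell_count(w, h, d):
    """Count lattice sites with an even coordinate sum in a w x h x d box."""
    return sum(1 for x, y, z in product(range(w), range(h), range(d))
               if (x + y + z) % 2 == 0)


def field_bytes(cell_count, components, bytes_per_value=4):
    """Size of one generation of one field, with f32 (4-byte) storage."""
    return cell_count * components * bytes_per_value


print(fcc12_cell_count(4, 4, 4))           # 32
print(field_bytes(100, components=1))      # 400 bytes for a Scalar on Square4(10, 10)
print(field_bytes(100, components=3))      # 1200 bytes for a 3-component Vector
```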


Fields & Mutability

Fields are per-cell data arrays. A 100-cell Square4 world with one Scalar field allocates 100 f32 values for that field.

Field types

| Type | Storage per cell | Use case |
|---|---|---|
| Scalar | 1 × f32 | Temperature, density, boolean flags |
| Vector { dims } | dims × f32 | Velocity, color |
| Categorical { n_values } | 1 × f32 (stored as index) | Terrain type, cell state |

Field mutability

Mutability controls how and when memory is allocated for a field. This is the most important performance decision you’ll make.

| Mutability | Allocation pattern | Read baseline | Use when |
|---|---|---|---|
| Static | Once, never again | Always generation 0 | Constants (terrain type, wall mask) |
| PerTick | Fresh buffer every tick | Previous tick’s values | Frequently-updated state (heat, positions) |
| Sparse | New buffer only on write | Shared until mutated | Infrequently-changed state (terrain HP) |

Static fields are allocated once in a shared arena. They’re read-only after initialization — propagators can read them but never write them. Use these for data that never changes (terrain layout, obstacle masks).

PerTick fields get a fresh buffer every tick. If a propagator writes to the field, it fills the new buffer. If nothing writes to the field, the previous tick’s values are copied forward. This is the most common mutability class — use it for anything that changes regularly.

Sparse fields share memory across ticks until something writes to them, at which point a new buffer is allocated (copy-on-write). Use these for data that changes rarely — the arena skips allocation on ticks where the field isn’t modified.
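
The copy-on-write idea behind Sparse fields can be illustrated with a toy model. This is a sketch of the general technique, not Murk's arena: the `SparseBuffer` class and its `advance` method are hypothetical, and a Python list stands in for an arena allocation.

```python
class SparseBuffer:
    """Toy copy-on-write holder: a new list is made only on ticks that write."""

    def __init__(self, values):
        self.values = list(values)
        self.allocations = 1          # the initial buffer

    def advance(self, writes):
        """Finish one tick. `writes` maps cell index -> new value; empty means untouched."""
        if not writes:
            return self.values        # share the existing buffer, no allocation
        fresh = list(self.values)     # copy-on-write: allocate, then apply changes
        for index, value in writes.items():
            fresh[index] = value
        self.values = fresh
        self.allocations += 1
        return self.values


terrain_hp = SparseBuffer([10.0, 10.0, 10.0])
tick0 = terrain_hp.advance({})
tick1 = terrain_hp.advance({})
tick2 = terrain_hp.advance({1: 7.0})
print(tick0 is tick1, tick1 is tick2, terrain_hp.allocations)  # True False 2
```

A PerTick field in this model would take the `fresh = list(...)` path on every tick, which is why it costs an allocation even when little changes.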

Bounds and boundary behavior

Fields can optionally have value bounds (min, max). When a value is written outside those bounds, the BoundaryBehavior determines what happens:

  • Clamp — value is clamped to the nearest bound
  • Reflect — value bounces off the bound
  • Absorb — value is set to the bound
  • Wrap — value wraps to the opposite bound

If you don’t need bounds, just use the defaults.
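
As a rough illustration of how three of these behaviors differ, the sketch below applies Clamp, Reflect and Wrap to a single value. The function names and exact formulas are assumptions made for this example, not Murk's implementation; Absorb is omitted because the list above describes it in the same terms as Clamp.

```python
def clamp(v, lo, hi):
    """Snap an out-of-range value to the nearest bound."""
    return max(lo, min(hi, v))


def wrap(v, lo, hi):
    """Re-enter from the opposite bound, keeping the overshoot."""
    if lo <= v <= hi:
        return v
    return lo + (v - lo) % (hi - lo)


def reflect(v, lo, hi):
    """Bounce off the bound: the overshoot is folded back into the range."""
    if lo <= v <= hi:
        return v
    span = hi - lo
    offset = (v - lo) % (2 * span)   # position within one back-and-forth cycle
    return lo + (span - abs(offset - span))


for v in (12, -3):
    print(v, clamp(v, 0, 10), reflect(v, 0, 10), wrap(v, 0, 10))
# 12 10 8 2
# -3 0 3 7
```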


Propagators

A propagator is a stateless function that runs once per tick. It reads some fields, writes some fields, and that’s it. All simulation logic lives in propagators.

The step signature (Python)

def my_propagator(reads, reads_prev, writes, tick_id, dt, cell_count):
    """
    reads:       list of numpy arrays (fields from current-tick overlay)
    reads_prev:  list of numpy arrays (fields from previous tick, frozen)
    writes:      list of numpy arrays (output buffers to fill)
    tick_id:     int, monotonically increasing tick counter
    dt:          float, simulation timestep in seconds
    cell_count:  int, number of cells in the space
    """
    ...

The step signature (Rust)

// Inside your propagator's impl block
fn step(&self, ctx: &mut StepContext<'_>) -> Result<(), PropagatorError> {
    let prev_heat = ctx.reads_previous().read_field(HEAT_ID)?;
    let space = ctx.space();
    let writer = ctx.writes();
    // ... compute new values, write to output ...
    Ok(())
}

Read modes: Euler vs Jacobi

Every propagator declares which fields it reads. There are two read modes:

  • reads (Euler mode) — sees the in-tick overlay. If a prior propagator in the same tick already wrote to this field, you see those new values. This creates a dependency chain between propagators.

  • reads_previous (Jacobi mode) — sees the frozen tick-start snapshot. Always reads the base generation, regardless of what other propagators have written this tick.

The choice matters for correctness:

  • Diffusion should use reads_previous (Jacobi). Otherwise a cell's update can mix old and already-updated neighbor values, so the result depends on cell visit order.
  • A reward propagator that reads an agent-position field written by a movement propagator should use reads (Euler) to see the already-updated position.
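
The order dependence mentioned for diffusion is easy to reproduce outside Murk. The sketch below runs one diffusion step on a 5-cell ring in two ways: from a frozen copy of the previous values (the Jacobi pattern) and in place, where later cells see neighbors that were already updated. Both functions are illustrative stand-ins, not Murk propagators.

```python
def diffuse_jacobi(prev, rate):
    """Every cell reads only the frozen previous values."""
    n = len(prev)
    return [prev[i] + rate * (prev[(i - 1) % n] + prev[(i + 1) % n] - 2 * prev[i])
            for i in range(n)]


def diffuse_in_place(values, rate, order):
    """Cells are updated one by one, so later cells read already-updated neighbors."""
    buf = list(values)
    n = len(buf)
    for i in order:
        buf[i] += rate * (buf[(i - 1) % n] + buf[(i + 1) % n] - 2 * buf[i])
    return buf


start = [0.0, 0.0, 1.0, 0.0, 0.0]
jacobi = diffuse_jacobi(start, 0.25)
forward = diffuse_in_place(start, 0.25, order=range(5))
backward = diffuse_in_place(start, 0.25, order=reversed(range(5)))

print(jacobi)    # [0.0, 0.25, 0.5, 0.25, 0.0]
print(forward)   # [0.0, 0.25, 0.5625, 0.140625, 0.03515625]
print(backward)  # [0.03515625, 0.140625, 0.5625, 0.25, 0.0]
```

The Jacobi result is symmetric around the hot cell, as the physics requires. The in-place results are skewed, and the skew flips with the visit order.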

Write modes

Each written field has a write mode:

  • WriteMode.Full — the propagator fills every cell. The engine gives you a fresh, zeroed buffer. In debug builds, a coverage guard checks that every cell was written.

  • WriteMode.Incremental — the propagator modifies only some cells. The engine pre-seeds the buffer with the previous tick’s values via memcpy. You only update the cells you need.
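
The difference between the two modes comes down to what the output buffer contains before the propagator runs. A minimal sketch, using plain lists and a hypothetical `prepare_buffer` helper rather than the engine's real buffers:

```python
def prepare_buffer(mode, previous):
    """Return the buffer a propagator would start from under each write mode."""
    if mode == "full":
        return [0.0] * len(previous)   # fresh and zeroed: every cell must be written
    if mode == "incremental":
        return list(previous)          # pre-seeded with last tick's values
    raise ValueError(f"unknown write mode: {mode}")


previous = [1.0, 2.0, 3.0]

full = prepare_buffer("full", previous)
full[0] = 9.0                          # writes one cell and forgets the rest
incremental = prepare_buffer("incremental", previous)
incremental[0] = 9.0                   # writes one cell, the rest carry over

print(full)         # [9.0, 0.0, 0.0]
print(incremental)  # [9.0, 2.0, 3.0]
```

The first result shows the kind of mistake the debug-build coverage guard exists to catch: with a Full write, any cell the propagator skips stays at zero.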

Pipeline validation

Murk validates the propagator pipeline at startup:

  • Write conflicts — two propagators writing the same field is an error (detected and reported with both propagator names).
  • CFL stability — if a propagator declares a max_dt, Murk checks that the configured dt doesn’t exceed it.
  • Undefined fields — reading a field that doesn’t exist is an error.
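
The three checks above can be sketched as a single pass over the pipeline. This is an illustration of the idea only: `PropagatorSpec`, `validate_pipeline` and the error strings are hypothetical, and Murk's actual validation and error types are not shown in this guide.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PropagatorSpec:
    name: str
    reads: list = field(default_factory=list)
    writes: list = field(default_factory=list)
    max_dt: Optional[float] = None


def validate_pipeline(propagators, defined_fields, dt):
    """Return a list of human-readable problems; empty means the pipeline is valid."""
    errors = []
    writer_of = {}                                  # field id -> first writer's name
    for p in propagators:
        for f in p.reads:
            if f not in defined_fields:
                errors.append(f"{p.name}: reads undefined field {f}")
        for f in p.writes:
            if f in writer_of:
                errors.append(f"write conflict on field {f}: {writer_of[f]} and {p.name}")
            else:
                writer_of[f] = p.name
        if p.max_dt is not None and dt > p.max_dt:
            errors.append(f"{p.name}: dt {dt} exceeds max_dt {p.max_dt}")
    return errors


pipeline = [
    PropagatorSpec("diffusion", reads=[0], writes=[0], max_dt=0.2),
    PropagatorSpec("heater", reads=[7], writes=[0]),
]
for problem in validate_pipeline(pipeline, defined_fields={0, 1}, dt=0.5):
    print(problem)
```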

Ordering

Propagators run in the order they’re registered. This ordering, combined with the Euler/Jacobi read declarations, defines the dataflow. The engine precomputes a ReadResolutionPlan that maps each (propagator, field) pair to either the base generation or a prior propagator’s staged output — with zero per-tick routing overhead.
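
One way to picture the precomputed plan is as a dictionary built in a single pass over the registration order. The sketch below is a guess at the shape of such a plan, not the real ReadResolutionPlan: the tuple layout, the `BASE` marker and `build_read_plan` are all invented for illustration, and it assumes a propagator does not list the same field in both read modes.

```python
BASE = "base"   # stands for the frozen tick-start generation


def build_read_plan(pipeline):
    """pipeline: list of (name, reads, reads_previous, writes) tuples in run order."""
    plan = {}
    last_writer = {}                     # field id -> most recent writer this tick
    for name, reads, reads_previous, writes in pipeline:
        for f in reads_previous:
            plan[(name, f)] = BASE       # Jacobi: always the tick-start snapshot
        for f in reads:
            plan[(name, f)] = last_writer.get(f, BASE)  # Euler: staged output if any
        for f in writes:
            last_writer[f] = name
    return plan


pipeline = [
    ("diffusion", [], [0], [0]),   # Jacobi read of field 0, writes field 0
    ("movement", [], [2], [2]),    # Jacobi read of field 2, writes field 2
    ("reward", [2], [0], [3]),     # Euler read of field 2, Jacobi read of field 0
]
plan = build_read_plan(pipeline)
print(plan[("reward", 2)])  # movement
print(plan[("reward", 0)])  # base
```

Because the plan depends only on the registration order and the read declarations, it can be computed once at startup and looked up on every tick.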


Commands & Ingress

Commands are how actions from outside the simulation (agent actions, user input, network messages) enter the tick loop.

Command types

| Command | Purpose |
|---|---|
| SetField(coord, field_id, value) | Write a single cell value |
| Move(entity_id, target_coord) | Move an entity |
| Spawn(coord, field_values) | Create a new entity |
| Despawn(entity_id) | Remove an entity |
| SetParameter(key, value) | Change a global simulation parameter |
| Custom(type_id, data) | User-defined command type |

In the Python API, the most common command is SetField:

cmd = Command.set_field(field_id=1, coord=[5, 3], value=1.0)
receipts, metrics = world.step([cmd])

Receipts

Every command submitted to step() gets a receipt:

receipts, metrics = world.step([cmd])
for r in receipts:
    print(r.accepted, r.applied_tick_id)

A command can be rejected if the ingress queue is full, the command is stale (refers to an old tick), or the world is shutting down.

Command ordering

Commands are applied in this order: priority_class (lower = higher priority), then source_id, then arrival_seq (monotonic counter). System commands (priority 0) run before user commands (priority 1).
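
This ordering is a lexicographic sort on three keys. A small sketch, using a hypothetical `QueuedCommand` record in place of Murk's real command type:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueuedCommand:
    priority_class: int   # 0 = system, 1 = user (lower runs first)
    source_id: int
    arrival_seq: int      # monotonic counter assigned on arrival
    label: str


def application_order(commands):
    """Sort by priority class, then source, then arrival sequence."""
    return sorted(commands, key=lambda c: (c.priority_class, c.source_id, c.arrival_seq))


queue = [
    QueuedCommand(1, 2, 10, "user B, first"),
    QueuedCommand(1, 1, 12, "user A, second"),
    QueuedCommand(0, 9, 13, "system"),
    QueuedCommand(1, 1, 11, "user A, first"),
]
for cmd in application_order(queue):
    print(cmd.label)
# system
# user A, first
# user A, second
# user B, first
```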


Observations

The observation system extracts field data into flat f32 tensors suitable for neural networks.

The pipeline: ObsSpec → ObsPlan → execute

  1. ObsSpec — a list of ObsEntry objects declaring what to observe.
  2. ObsPlan — a compiled plan (precomputed gather indices). Created once, reused every tick.
  3. execute — runs the plan against the current world snapshot, producing a flat f32 array.

# 1. Specify what to observe
obs_entries = [
    ObsEntry(field_id=0, region_type=RegionType.All),
    ObsEntry(field_id=1, region_type=RegionType.AgentDisk, radius=3),
]

# 2. MurkEnv compiles the plan internally
# 3. Each step(), the plan executes and returns obs as a numpy array
obs, reward, terminated, truncated, info = env.step(action)

Region types

| Region | Description | When to use |
|---|---|---|
| All | Every cell in the space | Full observability, small grids |
| AgentDisk(radius) | Cells within radius graph-distance of the agent | Partial observability, foveation |
| AgentRect(half_extent) | Axis-aligned bounding box around agent | Rectangular partial observability |

All is the simplest — you get cell_count floats per entry. Agent-centered regions give partial observability and scale better on large grids.
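
To show why agent-centered regions scale better, here is a sketch of an AgentDisk-style gather on a 4-connected grid, where graph distance equals Manhattan distance. The offsets are computed once and reused, which mirrors the precomputed gather indices mentioned for ObsPlan. The function names and the padding value for cells outside the grid are assumptions made for this example.

```python
def disk_offsets(radius):
    """(d_row, d_col) offsets within `radius` steps on a 4-connected grid."""
    return [(dr, dc)
            for dr in range(-radius, radius + 1)
            for dc in range(-radius, radius + 1)
            if abs(dr) + abs(dc) <= radius]


def gather_disk(field, width, height, agent, offsets, fill=-1.0):
    """Read the cells around `agent`; positions outside the grid get `fill`."""
    row, col = agent
    out = []
    for dr, dc in offsets:
        r, c = row + dr, col + dc
        if 0 <= r < height and 0 <= c < width:
            out.append(field[r * width + c])   # row-major flat index
        else:
            out.append(fill)
    return out


width, height = 4, 4
field = [float(i) for i in range(width * height)]

print(len(disk_offsets(3)))   # 25 values, regardless of grid size
print(gather_disk(field, width, height, agent=(0, 0), offsets=disk_offsets(1)))
# [-1.0, -1.0, 0.0, 1.0, 4.0]
```

The observation size depends only on the radius: 25 values for radius 3 whether the grid has 16 cells or a million.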

Transforms

Transforms are applied to field values during extraction:

  • Identity — raw values, no change
  • Normalize(min, max) — linearly maps [min, max] to [0, 1], clamping values outside the range
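
The Normalize transform as described is a linear rescale followed by a clamp. A minimal sketch, assuming max is strictly greater than min; the function name is hypothetical:

```python
def normalize(value, lo, hi):
    """Map [lo, hi] linearly onto [0, 1] and clamp anything outside."""
    scaled = (value - lo) / (hi - lo)
    return min(1.0, max(0.0, scaled))


print(normalize(15.0, 10.0, 20.0))  # 0.5
print(normalize(25.0, 10.0, 20.0))  # 1.0
print(normalize(5.0, 10.0, 20.0))   # 0.0
```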

Pooling

For large observations, pooling reduces dimensionality:

  • PoolKernel.Mean — average of each window
  • PoolKernel.Max — maximum of each window
  • PoolKernel.Min — minimum of each window
  • PoolKernel.Sum — sum of each window

Pooling is configured per-entry with kernel_size and stride.
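
The effect of kernel_size and stride is easiest to see in one dimension. The sketch below pools a flat list; it is a simplified illustration, since the guide does not say how Murk lays out pooling windows on 2D or 3D spaces, and `pool_1d` is an invented helper.

```python
def pool_1d(values, kernel_size, stride, reduce):
    """Apply `reduce` to each full window of `kernel_size` values, stepping by `stride`."""
    return [reduce(values[i:i + kernel_size])
            for i in range(0, len(values) - kernel_size + 1, stride)]


def mean(window):
    return sum(window) / len(window)


values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

print(pool_1d(values, kernel_size=2, stride=2, reduce=mean))  # [1.5, 3.5, 5.5]
print(pool_1d(values, kernel_size=2, stride=2, reduce=max))   # [2.0, 4.0, 6.0]
print(pool_1d(values, kernel_size=2, stride=2, reduce=min))   # [1.0, 3.0, 5.0]
print(pool_1d(values, kernel_size=2, stride=2, reduce=sum))   # [3.0, 7.0, 11.0]
```

With kernel_size=2 and stride=2 the observation shrinks from six values to three. A stride smaller than the kernel gives overlapping windows and less reduction.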

Observation layout

Entries are concatenated in order. If you observe two fields on a 100-cell grid with region_type=All, you get a 200-element f32 array: the first 100 elements are field 0, the next 100 are field 1.
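
If you need to split the flat observation back into per-entry pieces, tracking the slice of each entry is enough. A small sketch with a hypothetical `concat_entries` helper and plain lists in place of numpy arrays:

```python
def concat_entries(entries):
    """Concatenate per-entry values and remember where each entry landed."""
    flat, slices = [], []
    for values in entries:
        start = len(flat)
        flat.extend(values)
        slices.append(slice(start, len(flat)))
    return flat, slices


field0 = [0.0] * 100   # entry 0: field 0 over all 100 cells
field1 = [1.0] * 100   # entry 1: field 1 over all 100 cells

obs, slices = concat_entries([field0, field1])
print(len(obs))             # 200
print(slices[1])            # slice(100, 200, None)
print(obs[slices[1]][:3])   # [1.0, 1.0, 1.0]
```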


Runtime Modes

Murk has two runtime modes that share the same tick engine but differ in how you interact with it.

LockstepWorld (synchronous)

The standard mode for RL training:

# Python (via MurkEnv)
obs, reward, terminated, truncated, info = env.step(action)

// Rust
let result = world.step_sync(commands)?;
let snapshot = result.snapshot;  // borrows world

Properties:

  • Blocking step() call — you wait for the tick to complete
  • In Rust, &mut self enforces single-threaded access at compile time
  • The snapshot borrows the world, preventing a new step until you’re done reading
  • Deterministic: same seed + same commands = same result, always

This is what MurkEnv and MurkVecEnv use internally.

RealtimeAsyncWorld (asynchronous)

For real-time applications (game servers, live visualizations):

// Commands are submitted without blocking
world.submit_commands(commands)?;

// Observations can be taken concurrently
let result = world.observe(&mut plan)?;

Properties:

  • Background tick thread runs at a configurable rate
  • Multiple observation requests can be served concurrently via a worker pool
  • Epoch-based reclamation ensures snapshots aren’t freed while being read
  • Command channel provides back-pressure when the queue is full

The Python bindings currently only expose LockstepWorld.


Arena & Memory

Murk uses arena-based generational allocation instead of per-object heap allocation. This is what makes it fast and GC-free.

The ping-pong buffer

The engine maintains two segment pools (A and B). On each tick:

  1. One pool is staging (being written by propagators)
  2. The other is published (readable as a snapshot)
  3. After the tick, they swap roles

This means the previous tick’s data is always available for reading while the current tick is being computed.
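
The swap can be modeled in a few lines. This toy `PingPong` class is a sketch of the double-buffering pattern only; it uses two Python lists where the engine uses segment pools, and none of its names come from Murk.

```python
class PingPong:
    """Two buffers that trade the roles of 'published' and 'staging' each tick."""

    def __init__(self, cell_count):
        self.pools = [[0.0] * cell_count, [0.0] * cell_count]
        self.published = 0        # index of the readable pool
        self.generation = 0

    def snapshot(self):
        return self.pools[self.published]

    def staging(self):
        return self.pools[1 - self.published]

    def tick(self, step_fn):
        step_fn(self.snapshot(), self.staging())  # read published, write staging
        self.published = 1 - self.published       # swap roles
        self.generation += 1


def add_one(prev, out):
    for i, value in enumerate(prev):
        out[i] = value + 1.0


world = PingPong(cell_count=3)
world.tick(add_one)
world.tick(add_one)
print(world.generation, world.snapshot(), world.staging())
# 2 [2.0, 2.0, 2.0] [1.0, 1.0, 1.0]
```

After two ticks the published pool holds generation 2 while the other pool still holds generation 1, ready to be overwritten by the next tick.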

How mutability maps to memory

  • Static fields live in a separate shared arena. They’re allocated once and never touched again. No per-tick cost.

  • PerTick fields get a fresh allocation in the staging pool every tick. After publish, the old staging pool (now published) still holds the previous tick’s values — so snapshots and reads_previous work without copying.

  • Sparse fields use a dedicated copy-on-write slab. They share memory across ticks until a propagator writes to them, at which point a new allocation is made. On ticks where nothing writes to a sparse field, there’s zero allocation cost.

Why this matters

  • No garbage collection pauses — arena memory is bulk-freed, not per-object
  • Deterministic memory lifetime — you know exactly when memory is allocated and freed
  • Zero-copy snapshots — reading the previous tick’s data is just a pointer into the published pool

Most users don’t need to think about arenas directly. The practical takeaway is: choose the right FieldMutability for your data, and the arena system handles the rest efficiently.


Putting It Together

Here’s how these concepts compose in a typical simulation:

import murk
from murk import (
    Config, FieldMutability, EdgeBehavior,
    WriteMode, ObsEntry, RegionType,
)

# 1. Space: defines topology
config = Config()
config.set_space_square4(32, 32, EdgeBehavior.Wrap)

# 2. Fields: define per-cell data
config.add_field("temperature", mutability=FieldMutability.PerTick)
config.add_field("terrain", mutability=FieldMutability.Static)
config.add_field("agent_pos", mutability=FieldMutability.PerTick)

# 3. Propagator: defines simulation logic
def diffuse(reads, reads_prev, writes, tick_id, dt, cell_count):
    # reads_prev[0] = previous tick's temperature
    # writes[0] = this tick's temperature output
    ...

murk.add_propagator(
    config,
    name="diffusion",
    step_fn=diffuse,
    reads_previous=[0],              # Jacobi read of field 0
    writes=[(0, WriteMode.Full)],    # Full write to field 0
)

config.set_dt(0.1)
config.set_seed(42)

# 4. Observations: define what the agent sees
obs_entries = [
    ObsEntry(0, region_type=RegionType.All),       # Full temperature grid
    ObsEntry(2, region_type=RegionType.AgentDisk, radius=5),  # Agent's local view
]

# 5. Environment: wraps everything in the Gymnasium interface
env = murk.MurkEnv(config, obs_entries, n_actions=5, seed=42)
obs, info = env.reset()
action = 0  # placeholder; assumes actions are integers in range(n_actions)
obs, reward, terminated, truncated, info = env.step(action)

For a complete working example, see heat_seeker.


Glossary

| Term | Definition |
|---|---|
| Cell | A single location in the space. Has a coordinate and one value per field. |
| Tick | One simulation timestep. All propagators run, then the arena publishes. |
| Generation | Arena version counter. Incremented on each publish. |
| Canonical order | The deterministic traversal of all coordinates (row-major for 2D grids). |
| Snapshot | Read-only view of the world state at a particular generation. |
| ObsPlan | Compiled observation plan. Precomputes gather indices for fast extraction. |
| Ingress | The command queue that feeds actions into the tick loop. |
| Egress | The observation pathway that extracts state out of the simulation. |
| CFL condition | Courant-Friedrichs-Lewy stability constraint: N * D * dt < 1, where N is the neighbor count and D is the diffusion coefficient. |