Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Determinism Catalogue (R-DET-6)

Living document cataloging known sources of non-determinism and the mitigations applied in the Murk simulation framework.

Sources of Non-Determinism

1. HashMap / HashSet Iteration Order

Risk: HashMap and HashSet use randomized hashing by default. Iterating over them produces different orderings across runs.

Mitigation: Banned project-wide via clippy.toml:

disallowed-types = [
    { path = "std::collections::HashMap", reason = "Use IndexMap for deterministic iteration" },
    { path = "std::collections::HashSet", reason = "Use IndexSet for deterministic iteration" },
]

All code uses IndexMap / BTreeMap instead.

Verification: cargo clippy enforces this at CI time.


2. Floating-Point Reassociation

Risk: Compilers may reorder floating-point operations for performance (e.g., -ffast-math), producing different results across builds.

Mitigation:

  • Rust does not enable fast-math by default.
  • All arithmetic uses explicit operation ordering (no auto-vectorization that could reassociate).
  • Build metadata is recorded in the replay header, enabling detection of toolchain differences.

Verification: Replay header stores BuildMetadata.compile_flags and BuildMetadata.toolchain.


3. Sort Stability

Risk: Unstable sorts may produce different orderings for equal elements across implementations or runs.

Mitigation:

  • Command ordering uses priority_class (primary), source_id (secondary), arrival_seq (final tiebreaker) — all fields are distinct.
  • Agent actions are sorted by agent_id before processing.
  • All sorts use stable sort (sort_by_key / sort_by).

Verification: Scenario 2 (multi-source command ordering) exercises 3 sources with 1000 ticks.


4. Thread Scheduling

Risk: In multi-threaded modes, OS thread scheduling is non-deterministic.

Mitigation:

  • Lockstep mode is single-threaded by design. All propagators execute sequentially in pipeline order.
  • RealtimeAsync mode (future) will use epoch-synchronized snapshots and deterministic command ordering at tick boundaries.

Status: N/A for Lockstep (current scope). Future concern for RealtimeAsync.


5. Arena Recycling

Risk: Memory recycling patterns could theoretically affect state if buffer reuse is order-dependent.

Mitigation:

  • PingPong buffer swap is deterministic: generation N always writes to buffer N % 2, reads from (N-1) % 2.
  • Arena allocations are generation-indexed, not address-indexed.
  • Ring buffer recycling is deterministic (circular index modulo ring size).

Verification: Scenario 4 (arena double-buffer recycling) runs 1100 ticks to exercise multiple full ring buffer cycles.


6. RNG Seed

Risk: Different seeds produce different simulation trajectories.

Mitigation:

  • Seed is stored in the replay header (InitDescriptor.seed).
  • Replay reconstruction uses the same seed.
  • config_hash() includes the seed.

Verification: All scenarios use explicit seeds and verify hash equality.


7. Build Metadata Differences

Risk: Different compilers, optimization levels, or target architectures may produce different floating-point results for the same source code.

Mitigation:

  • BuildMetadata is recorded in every replay file: toolchain, target_triple, murk_version, compile_flags.
  • Replay consumers can warn or reject when metadata doesn’t match.

Status: Detection only. Cross-build determinism is not guaranteed and is explicitly documented as a known limitation.


8. Command Serialization Fidelity

Risk: Fields like expires_after_tick and arrival_seq are runtime-only and should not affect determinism if excluded.

Mitigation:

  • expires_after_tick is NOT serialized in replays. On deserialization, it is set to TickId(u64::MAX) (never expires).
  • arrival_seq is NOT serialized. Set to 0 on deserialization. The ingress pipeline assigns fresh arrival sequences.
  • Only payload, priority_class, source_id, source_seq are recorded.

Verification: Proptest round-trip tests verify command serialization preserves all payload data. Integration tests verify replay produces identical snapshots despite sentinel values.


Verified Scenarios

#ScenarioTicksStatus
1Sequential-commit vs Jacobi1000PASS
2Multi-source command ordering1000PASS
3WriteMode::Incremental1000PASS
4Arena double-buffer recycling1100PASS
5Sparse field modification1000PASS
6Tick rollback recovery100PASS
7GlobalParameter mid-episode1000PASS
810+ propagator pipeline1000PASS