Smol Example:
ChatGPT Web:
<After workshopping the specs/requirements for a while...>
Write an extremely detailed implementation doc for the streaming version and the APIs. We are only looking at the geometry RF engine right now. We need to be able to hand the doc you make off to an LLM to code the geometry engine API up from the existing Python code. Make it so. Extremely detailed. Markdown checkboxes `- [ ]` for each discrete task so the AI can mark them off with an X as they complete.
The codex-cli can start working on the output TASKS_GEOENGINE.md.
Implementing component 3 (of 4):
ChatGPT Web:
I pasted in all the TASKS_*.md the AI coder has produced so far, so the web planner has an idea of what was done. (If we didn't have access restrictions, the web version could just access the git repo).
Okay, here are the markdown task files of what we've got done so far. We built the geo engine and orchestrator. I need you to write a very detailed TASKS_IMNET.md file that deals with the OMNeT++/INET part and how it interacts with the other two systems as they exist. Write it as a markdown checklist `- [ ]` so the AI can cross items off as it completes them. Go.
This produces TASKS_ORCHESTRATOR.md
Codex CLI:
Check over TASKS_ORCHESTRATOR.md. It was written without knowledge of the actual current implementation. Let me know if it looks good for ARCHITECTURE.md and if it is compatible with what has been done so far in TASKS_GEO_ENGINE.md. We have only tried to make a geo engine so far. No edits, just check that TASKS_ORCHESTRATOR.md makes sense.
Overall verdict: it is mostly aligned with ARCHITECTURE.md and broadly compatible with the Geo engine work, but the 3 medium/high items above should be clarified before implementation so the orchestrator doesn’t encode wrong assumptions.
This step is super important! I need to make sure the plan we thought of actually makes sense with what is happening on the ground.
Can the misaligned items be easily solved? Is there a mostly obvious right answer to them? If so, tell me the obvious solutions, or show me my options if there is a fork or hard choice.
<More workshopping of differences between web and implementation. The web has the PDF source documents and better web search, and has unlimited usage, so I did planning there>
Codex CLI:
@AGENTS.md Your task is to implement the geometry engine defined in @TASKS_GEO_ENGINE.md. After completing each task, mark its markdown checkbox off with an x (`- [x]`) so there is an external record of what has been done. If plans change, then modify the task list appropriately. The overall high-level architecture of the program is in @ARCHITECTURE.md. Go.
This is where the magic happens! We have thought through our API and have developed a test plan and a development plan. Now the AI can develop the code and test the API via the test suite to ensure it's correct.
Overall Advice:
- Think about your inputs/outputs/dependencies beforehand.
- When the AI screws up, hallucinates, or does something silly, make a note in AGENTS.md telling it to do the right thing instead.
- Force the AI to use as many deterministic static tools as possible:
  - Strict type checking (use Rust instead of C, TypeScript instead of JavaScript, Python with type annotations instead of without)
  - Linters
  - Code format tools
  - Unit/integration tests
- Have the AI write as many tests as possible of what you want the program to do.
- If you want the AI to "one-shot" (i.e., autonomously code something complex for a while without supervision and get a good result), then you need to give it as much test input/output behaviour as possible, so it can keep checking against the "proper" results without your guidance.
Python Environment And Package Policy
Use uv for all Python workflows in this repository.
Rules
- Always run Python commands with `uv run`.
- Always add dependencies with `uv add`.
- Use `uv venv` for virtual environment setup/management.
- Do not use `pip`, `pip3`, `python -m pip`, `virtualenv`, or `python -m venv`.
- Do not install dependencies outside `uv`.
Examples
- Run app: `uv run python main.py`
- Add runtime dependency: `uv add requests`
- Add dev dependency: `uv add --dev pytest`
Test Coverage Policy
Use pytest-cov via uv for coverage checks.
Rules
- If coverage flags are needed and `pytest-cov` is missing, install it with `uv add --dev pytest-cov`.
- For the workspace package, install coverage tooling with `uv add --package geomrf-engine --dev pytest-cov`.
- Run coverage with `uv run pytest ... --cov=...`.
- Because both root and `geomrf-engine` use a `tests` package name, run coverage in two passes and combine reports instead of a single mixed pytest invocation.
Coverage command pattern
- Orchestrator pass: `COVERAGE_FILE=.coverage.orch uv run pytest tests --cov=satsim_orch --cov-report=`
- Geo engine pass: `COVERAGE_FILE=.coverage.geomrf uv run --package geomrf-engine pytest geomrf-engine/tests --cov=geomrf_engine --cov-report=`
- Combine/report: `uv run coverage combine .coverage.orch .coverage.geomrf && uv run coverage report -m`
OMNeT++/INET Environment Policy
Use opp_env for OMNeT++/INET install and environment management in this repository.
Rules
- Use `opp_env` to install/manage OMNeT++ and INET versions.
- Do not rely on ad-hoc/manual OMNeT++ or INET installs for project workflows.
- Keep OMNeT++/INET version selection pinned and reproducible across dev and CI.
Scope boundary
- Python dependency and execution workflows remain `uv`-managed.
- OMNeT++/INET toolchain workflows are `opp_env`-managed.
Legacy Code Reuse Policy
If functionality from old_code/ is needed:
- Do not import or execute from `old_code/` directly.
- Copy the required snippet(s) into a new file under the active codebase.
- Adapt and maintain the copied code in the active module only.
Sandbox / Network Restriction Policy
If a task is blocked by sandbox or network restrictions:
- Stop immediately and do not spend tokens repeatedly trying to bypass restrictions.
- Clearly tell the user that we are in a sandbox-restricted environment.
- Ask the user for permission before attempting any sandbox breakout or elevated access.
# ARCHITECTURE.md — SatSim System Overview
This document gives the **system-level architecture** for SatSim. It is intended to provide a complete “sight picture” for anyone implementing a subproject (e.g., the Geometry/RF Engine) so they understand how their component fits into the larger simulator.
---
## 1) Purpose and guiding idea
SatSim is a **hybrid satellite networking simulator** that combines:
1) A swappable **Geometry/RF/Link-Budget Engine** (physics + propagation + link feasibility)
2) A **packet-level discrete-event simulation lane** (OMNeT++/INET) (scale + protocol behavior)
3) A **real SDN emulation lane** (Mininet/OVS) (controller-in-the-loop + real Linux networking)
4) An **Orchestrator** that provides a single scenario/timebase and keeps all parts consistent
The fundamental design choice is that SatSim is **layered and composable**: we reuse mature simulators/emulators and treat satellite physics as an external service with a stable interface.
---
## 2) Key design decisions (why this looks the way it does)
### 2.1 Why two lanes (simulation vs emulation)
We intentionally run two different lanes because they answer different questions:
- **OMNeT++/INET lane (Discrete-Event Simulation)**
- Best for: scaling up to many nodes, protocol studies, routing and congestion behavior, reproducibility.
- Not best for: running real SDN controllers and real Linux TCP stacks.
- **Mininet/OVS lane (Network Emulation)**
- Best for: real SDN controllers (ONOS/Ryu), real forwarding behavior (OpenFlow/OVS), real apps/traffic tools.
- Not best for: scaling to thousands of nodes with full protocol stacks.
Trying to “pipe packets” between them is possible but usually not worth it early, because it introduces hard time synchronization problems (DES time vs wall-clock time) and packet bridging complexity. Instead we connect both lanes to the same **state oracle** (the Geo/RF engine) through the Orchestrator.
### 2.2 Where the lanes *do* meet today
They meet at:
- **Scenario definition** (same nodes, same constraints, same time window)
- **LinkState/Event timeline** (same “truth” about which links exist and their properties)
- **Metrics and artifacts** (comparable outputs; shared logging/PCAP strategy)
Optionally, they also meet via:
- **Shared SDN decision logic** (same ONOS/Ryu app used to compute routes, then applied in both lanes through adapters)
### 2.3 Future “stacking” (OMNeT++ feeding Mininet)
In the future, OMNeT++ may “feed” Mininet in two practical ways:
1) **Trace-driven replay (recommended future path)**
- OMNeT++ generates a curated set of traces (topology/failure schedules, traffic demands, baseline routing decisions).
- Mininet replays those traces in real-time to validate controller behavior under identical conditions.
2) **Hard co-simulation / packet bridging (advanced, optional)**
- Some nodes simulated in OMNeT++, others emulated in Mininet at the same time.
- Requires strict time coupling and a gateway that transforms/timeshifts packets.
- Not a v1 target.
### 2.4 Locked decisions (2026-02-18)
- **Tick authority:** `StreamLinkDeltas` is the control-plane source of truth for lane updates.
- **Events contract:** event streaming is retained, but aligned to the same requested `dt` and `selector`, and each event carries `tick_index`.
- **Orchestrator behavior:** streaming-driven execution is canonical; any scheduler is pacing-only.
- **Scenario translation:** orchestrator must fail-fast when it cannot produce a valid Geo/RF `ScenarioSpec`.
- **Python tooling:** Python workflows use `uv`; OMNeT++/INET workflows use `opp_env`.
---
## 3) Top-level components
### 3.1 Geometry/RF/Link-Budget Engine (black box, replaceable)
**Role:** The authoritative “physics layer” that translates orbital/propagation reality into network-usable link state.
**Key properties**
- Replaceable implementation (Skyfield + ITU-R today, could be STK import or other later)
- Stable interface (the rest of SatSim depends only on its API)
- Produces time-indexed:
- Link feasibility (up/down)
- Link properties (delay/capacity/loss proxies)
- Discrete events (link up/down, handover, failures if modeled)
**Location:** `geomrf-engine/`
---
### 3.2 Orchestrator (system conductor)
**Role:** Owns the simulation lifecycle and timebase. It is the “brain” that coordinates all lanes.
**Responsibilities**
- Load scenario config → create/initialize the Geo/RF engine scenario
- Choose execution mode:
- OMNeT-only, Mininet-only, or both in parallel
- Drive execution pacing:
- offline apply-fast or real-time apply-paced, while consuming authoritative engine stream ticks
- Consume Geo/RF LinkState stream and distribute it to:
- OMNeT adapter
- Mininet adapter
- logging/metrics
- Collect artifacts (PCAPs, timeseries metrics, configs, run manifests)
- Provide reproducible run IDs and version stamping
**Location:** `orchestrator/`
---
### 3.3 OMNeT++/INET Lane (packet-level discrete-event)
**Role:** Packet-level simulation of protocols, queuing, routing, traffic at scale.
**Responsibilities**
- Build the network node models (routers, hosts, queues) using INET components
- Apply dynamic link updates (delay/capacity/loss/up-down) based on Geo/RF output
- Run deterministic experiments rapidly (sweeps)
- Export artifacts:
- logs + metrics
- optional PCAP outputs (where supported)
**Custom SatSim additions**
- A lightweight **LinkState Adapter Module** that subscribes to orchestrator/Geo output
- A mechanism to apply link changes at simulation timestamps
**Environment and install management**
- Use `opp_env` as the standard way to install/manage OMNeT++ and INET.
- Avoid ad-hoc/manual OMNeT++/INET installs in project workflows.
**Location:** `lanes/omnet/`
---
### 3.4 Mininet/OVS Lane (SDN emulation)
**Role:** Real SDN controller + real forwarding plane under dynamic link conditions.
**Responsibilities**
- Build an emulated topology with Mininet (or Containernet)
- Use OVS as the dataplane switch/router substrate
- Run a real SDN controller (ONOS or Ryu)
- Apply dynamic link shaping based on Geo/RF output:
- `tc/netem` for delay/loss/jitter
- `tbf/htb` for rate control
- interface up/down to emulate link drops
- Generate traffic using real tools:
- iperf3, D-ITG, SIPp, tcpreplay, custom apps
**Location:** `lanes/mininet/`
---
### 3.5 Observability, artifacts, and visualization
**Role:** Make runs inspectable, comparable, and reproducible.
**Artifacts**
- Scenario config snapshot + run manifest (versions, seeds, git SHAs)
- LinkState/Event traces (optional export)
- Metrics time-series (throughput/delay/loss/path changes)
- PCAP captures (Mininet tcpdump; OMNeT if enabled)
**Tools**
- Prometheus + Grafana for dashboards
- Wireshark for PCAP analysis
**Location:** `observability/` and `artifacts/`
---
## 4) System boundaries and data ownership
### 4.1 The Geo/RF engine owns *physics truth*
- It is the source of truth for which links can exist and their physical/network properties.
- Other components must not invent geometry/rf; they only consume the engine’s output.
### 4.2 The Orchestrator owns *time and execution*
- It defines run window requests, pacing mode, and synchronization rules.
- For v1/v1.1, tick production comes from Geo/RF stream output rather than orchestrator-generated ticks.
- It routes updates to the lanes and standardizes artifacts.
### 4.3 Each lane owns *packet/control behavior*
- OMNeT owns packet-level behavior inside DES.
- Mininet owns real SDN and Linux networking behavior.
---
## 5) Core data flows (end-to-end)
### 5.1 Initialization flow
1. User provides `ScenarioConfig` (YAML/JSON).
2. Orchestrator validates config and creates a new run ID.
3. Orchestrator calls Geo/RF engine:
- `CreateScenario` (returns scenario ref)
4. Orchestrator initializes selected lane(s):
- OMNeT: compile/load model, start run
- Mininet: build topology, start controller
5. Orchestrator subscribes to Geo/RF streaming output for LinkState and Events.
### 5.2 Runtime (parallel lane mode)
At each time tick:
1. Geo/RF produces `LinkDeltaBatch` + optional events.
2. Orchestrator receives it and distributes:
- OMNeT adapter: update channel/link state in simulator time
- Mininet adapter: apply tc/netem shaping and link toggles
- (optional) Event recorder: store aligned `EngineEvent` stream for analysis/observability
- Observability: record metrics and store link traces
3. Lanes generate traffic and produce metrics/PCAPs.
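The per-tick fan-out in step 2 can be sketched as follows. This is a minimal illustration, not the orchestrator's actual code: the adapter class and `distribute` helper are hypothetical stand-ins for the real OMNeT/Mininet/observability adapters.

```python
# Sketch of the orchestrator's per-tick fan-out (hypothetical names; the
# real adapters live under lanes/omnet/ and lanes/mininet/).
from dataclasses import dataclass, field


@dataclass
class LinkDeltaBatch:
    """Simplified stand-in for the protobuf LinkDeltaBatch message."""
    tick_index: int
    updates: list = field(default_factory=list)
    removals: list = field(default_factory=list)


class RecordingAdapter:
    """Stand-in for an OMNeT/Mininet/metrics adapter."""
    def __init__(self) -> None:
        self.applied: list[int] = []

    def apply(self, batch: LinkDeltaBatch) -> None:
        self.applied.append(batch.tick_index)


def distribute(batch: LinkDeltaBatch, adapters: list[RecordingAdapter]) -> None:
    # Every consumer sees the same authoritative batch, in tick order.
    for adapter in adapters:
        adapter.apply(batch)


omnet, mininet, metrics = RecordingAdapter(), RecordingAdapter(), RecordingAdapter()
for tick in range(3):
    distribute(LinkDeltaBatch(tick_index=tick), [omnet, mininet, metrics])

# All lanes observed the identical tick sequence from the single source of truth.
assert omnet.applied == mininet.applied == metrics.applied == [0, 1, 2]
```

The point of the single `distribute` call is that neither lane ever computes or invents link state itself; both consume the same authoritative stream.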
### 5.3 Completion flow
1. Orchestrator stops lane processes.
2. Orchestrator closes the Geo/RF scenario.
3. All artifacts are written under the run ID.
---
## 6) Timebase and execution modes
SatSim supports multiple execution modes, controlled by the Orchestrator:
### Mode A — OMNeT-only (offline DES)
- Orchestrator consumes Geo/RF ticks and applies them to OMNeT without wall-clock pacing.
- Highest scalability and repeatability.
### Mode B — Mininet-only (real-time emulation)
- Orchestrator consumes Geo/RF ticks and applies wall-clock pacing while updating Mininet shaping.
- Best for SDN/controller realism and app-level testing.
### Mode C — Parallel (OMNeT + Mininet simultaneously)
- Both lanes consume the same LinkState stream.
- Used to compare “simulated protocol outcomes” vs “real controller outcomes” under the same link dynamics.
### Mode D — Trace-driven replay (future/optional)
- Geo/RF and/or OMNeT exports a trace.
- Mininet replays trace deterministically.
---
## 7) Interfaces between components (high-level)
### 7.1 Geo/RF Engine interface (v1)
- gRPC service, Protobuf messages
- Scenario lifecycle + streaming link deltas/events
- Output is **NetworkView** link properties (up/down, delay, capacity, loss proxy)
- Optional debug scalars for validation (SNR margin, elevation, range)
- Event-stream alignment target: events use same requested window/selector/dt semantics as deltas and expose `tick_index`.
### 7.2 Orchestrator ↔ OMNeT interface
- OMNeT subscribes to orchestrator updates via:
- gRPC client inside a C++ adapter module, OR
- file/trace ingestion for offline runs
- Applies updates to INET channel/link parameters and toggles connectivity
### 7.3 Orchestrator ↔ Mininet interface
- Orchestrator controls Mininet via:
- Python Mininet API calls
- Linux `tc` and interface management commands
- SDN controller is external (ONOS/Ryu), connected in the standard Mininet way
---
## 8) Reproducibility rules
Each run must record:
- ScenarioConfig snapshot
- Seeds
- Engine versions:
- Geo/RF engine version + schema version
- orchestrator version
- OMNeT/INET versions and model git SHA
- `opp_env` environment definition/metadata used for OMNeT/INET
- controller version and app git SHA
- LinkState trace hash (if stored)
- Toolchain/container image tags (if containerized)
---
## 9) Suggested monorepo layout
```
satsim/
  ARCHITECTURE.md
  orchestrator/
    ...
  subprojects/
    geomrf-engine/
      ARCHITECTURE.md   # subproject-specific (the streaming API spec lives here)
      proto/
      src/
      tests/
  lanes/
    omnet/
      models/
      adapter/
      scripts/
    mininet/
      topo/
      driver/
      controllers/
      scripts/
  observability/
    grafana/
    prometheus/
    dashboards/
  artifacts/
    runs/
      <run_id>/
        scenario.yaml
        manifest.json
        linkstate.parquet   (optional)
        metrics/
        pcaps/
        logs/
```
---
## 10) What an implementer of the Geo/RF engine must know
- The Geo/RF engine must be treated as **the physics oracle**.
- Its output must be:
- time-indexed
- sparse (selector-driven)
- stable and deterministic
- expressed in consistent units
- The Orchestrator will use it in both:
- offline sampling (for OMNeT)
- real-time streaming (for Mininet)
- The lanes do not need to know how link budgets are computed—only how to consume the streaming LinkDelta/Event outputs.
---
## 11) Roadmap hooks (explicit future extensions)
- Add a richer PHY view (optional fields) without breaking NetworkView consumers.
- Add trace import/replay for deterministic mininet runs.
- Add “shared SDN decision interface” so ONOS/Ryu path computation can be applied inside OMNeT.
- Add advanced co-simulation only if required (packet bridging).
---
SatSim docs index
Primary design and task documents:
- ARCHITECTURE.md
- TASKS_GEO_ENGINE.md
- TASKS_IMNET.md
- TASKS_ORCHESTRATOR.md
- TASKS_TESTSUITE_GEOENGINE.md

Locked design decisions (2026-02-18):
- Control-plane tick authority is `StreamLinkDeltas` (streaming-driven orchestration).
- Event stream alignment is being standardized with request `dt`/`selector` and event `tick_index`.
- Orchestrator error handling includes `NOT_FOUND`, `INVALID_ARGUMENT`, `FAILED_PRECONDITION`, and `RESOURCE_EXHAUSTED`.
- Scenario translation to Geo/RF `ScenarioSpec` is fail-fast.
- Python workflows are `uv`-managed; OMNeT++/INET workflows are `opp_env`-managed.
Geometry/RF Engine v1/v1.1 Streaming API — Implementation Specification
This document specifies a complete, implementable Geometry/RF Engine API and server for streaming link-state deltas + events. It is written so another LLM can generate the code from existing Python geometry/RF/link-budget code (Skyfield + ITU-R + your models) with minimal guesswork.
Status note:
- v1 baseline is implemented.
- v1.1 alignment updates (for orchestrator compatibility) are now specified below, especially for `StreamEvents` tick alignment.
0) Deliverables
What must exist at the end
- A runnable Python gRPC server that implements:
  - Scenario creation/closure
  - Capabilities/version endpoints
  - Streaming link deltas
  - Streaming events
- A Protobuf schema with:
  - Stable IDs, time semantics, units
  - Selector logic (which links/nodes to compute)
  - Delta semantics (what counts as a “change”)
- A reference Python client demonstrating:
  - Create scenario → stream deltas/events → close scenario
1) Core design constraints
1.1 Contract invariants
- Scenario-scoped: all computation happens inside a `ScenarioRef`.
- Time-indexed: all output is keyed by a timestamp and tick index.
- Selector-driven: never compute “all links” unless explicitly requested.
- Streaming-first: the primary runtime interface is a server→client stream.
- Deterministic: given identical inputs (scenario + seed + engine version), output is replayable.
1.2 What the engine outputs (NetworkView)
For each directed link (src → dst) at each tick, the engine provides:
- `up` (boolean)
- `one_way_delay_s` (float; seconds)
- `capacity_bps` (float; bits per second)
- `loss_rate` (float; [0,1] packet loss proxy OR PER proxy)
- optional debug scalar(s): `snr_margin_db`, `elevation_deg`, `range_m`
Everything else can be exposed later via an optional “debug view”; v1 focuses on network-usable state.
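As a non-normative illustration, the per-link NetworkView fields above map naturally onto a small Python record. The field names mirror the spec; the class itself is an assumption for readability, not part of the API:

```python
# Illustrative Python shape of the per-link NetworkView output (not normative).
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NetworkLinkState:
    up: bool
    one_way_delay_s: float        # seconds
    capacity_bps: float           # bits per second
    loss_rate: float              # [0, 1] packet-loss / PER proxy
    # Optional debug scalars, filled only when requested:
    snr_margin_db: Optional[float] = None
    elevation_deg: Optional[float] = None
    range_m: Optional[float] = None


state = NetworkLinkState(up=True, one_way_delay_s=0.012,
                         capacity_bps=50e6, loss_rate=0.001)
assert state.up and 0.0 <= state.loss_rate <= 1.0
```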
2) Repository layout (recommended)
```
geomrf-engine/
  proto/
    geomrf/v1/geomrf.proto
  src/geomrf_engine/
    __init__.py
    server.py
    config_schema.py
    scenario_store.py
    timebase.py
    selectors.py
    compute/
      __init__.py
      ephemeris.py
      geometry.py
      rf_models.py
      link_budget.py
      adaptation.py
    streaming/
      __init__.py
      delta.py
      events.py
      backpressure.py
    util/
      ids.py
      units.py
      logging.py
      metrics.py
  examples/
    client_stream.py
  tests/
    test_proto_roundtrip.py
    test_delta_thresholds.py
    test_selectors.py
    test_determinism.py
```
3) Implementation tasks checklist
3.1 Project & build system
- Create repo structure as above
- Add `pyproject.toml` with dependencies:
  - `grpcio`, `grpcio-tools`, `protobuf`
  - `pydantic` (scenario validation)
  - `pyyaml` (YAML scenario input)
  - `numpy`, `scipy` (if used)
  - `skyfield`, `sgp4`
  - your ITU-R package(s)
  - `prometheus-client` (optional but recommended)
- Add a `Makefile` or task runner:
  - `uv run python -m grpc_tools.protoc ...` compiles `.proto` to Python
  - `uv run python -m geomrf_engine.server ...` starts the server
  - `uv run pytest` runs tests
3.2 Protobuf + gRPC schema
- Write `proto/geomrf/v1/geomrf.proto` (spec below)
- Generate Python stubs
- Add schema version constants and embed in responses
3.3 Server skeleton
- Implement async gRPC server (`grpc.aio`)
- Wire servicer methods:
  - `GetVersion`
  - `GetCapabilities`
  - `CreateScenario`
  - `CloseScenario`
  - `StreamLinkDeltas`
  - `StreamEvents`
- Add structured logging and request correlation IDs
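A minimal sketch of the servicer shape follows. It is written with plain asyncio stand-ins so it runs without the generated stubs; in the real server these methods would be defined on the servicer base class generated from `geomrf.proto` and registered with a `grpc.aio` server, and the `KeyError` would instead abort the RPC with `NOT_FOUND`.

```python
# Sketch of the servicer logic, decoupled from generated gRPC stubs so it
# can be read (and run) standalone. Dicts stand in for protobuf messages.
import asyncio
import uuid


class GeometryRfEngineServicer:
    def __init__(self) -> None:
        self._scenarios: dict[str, dict] = {}   # scenario_ref -> runtime snapshot

    async def CreateScenario(self, spec: dict) -> str:
        ref = str(uuid.uuid4())                  # UUIDv4 scenario ID per §3.4
        self._scenarios[ref] = {"spec": spec}
        return ref

    async def CloseScenario(self, scenario_ref: str) -> bool:
        return self._scenarios.pop(scenario_ref, None) is not None

    async def StreamLinkDeltas(self, scenario_ref: str, n_ticks: int):
        if scenario_ref not in self._scenarios:
            raise KeyError("NOT_FOUND")          # real server: abort with NOT_FOUND
        for tick in range(n_ticks):              # server->client stream, one batch per tick
            yield {"tick_index": tick, "updates": [], "removals": []}


async def demo() -> list[int]:
    svc = GeometryRfEngineServicer()
    ref = await svc.CreateScenario({"seed": 42})
    ticks = [b["tick_index"] async for b in svc.StreamLinkDeltas(ref, 3)]
    assert await svc.CloseScenario(ref)
    return ticks


assert asyncio.run(demo()) == [0, 1, 2]
```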
3.4 Scenario lifecycle
- Implement scenario validation (Pydantic)
- Implement scenario store (in-memory for v1)
- Implement scenario ID generation (UUIDv4)
- Snapshot `ScenarioSpec` + resolved assets into a `ScenarioRuntime`
3.5 Compute pipeline
- Implement ephemeris loader (TLE list initially)
- Implement geometry evaluation (positions + visibility + elevation + range)
- Implement RF/link budget mapping to `NetworkLinkState`
- Implement adaptation mapping (SNR → capacity/loss) with a default policy
- Implement per-tick evaluation returning sparse link set
3.6 Streaming + deltas/events
- Implement tick loop (timebase)
- Implement delta computation with thresholds
- Implement event emission (link up/down, handover optional)
- Implement backpressure-safe streaming
- Add stream cancellation handling and cleanup
3.7 Tests + examples
- Determinism test (same scenario+seed → identical deltas)
- Selector test (only requested links computed)
- Threshold test (small changes suppressed)
- Example client script (prints updates, counts links)
4) gRPC/Protobuf specification (v1)
4.1 .proto (authoritative spec)
Create `proto/geomrf/v1/geomrf.proto`:
syntax = "proto3";
package geomrf.v1;
import "google/protobuf/timestamp.proto";
import "google/protobuf/duration.proto";
option go_package = "geomrf/v1;geomrfv1"; // harmless for other langs
// ---------------------------
// Service
// ---------------------------
service GeometryRfEngine {
rpc GetVersion(GetVersionRequest) returns (GetVersionResponse);
rpc GetCapabilities(GetCapabilitiesRequest) returns (GetCapabilitiesResponse);
rpc CreateScenario(CreateScenarioRequest) returns (CreateScenarioResponse);
rpc CloseScenario(CloseScenarioRequest) returns (CloseScenarioResponse);
// Primary: stream sparse deltas per tick.
rpc StreamLinkDeltas(StreamLinkDeltasRequest) returns (stream LinkDeltaBatch);
// Primary: stream discrete events (optional separate channel for clean consumers).
rpc StreamEvents(StreamEventsRequest) returns (stream EngineEvent);
}
// ---------------------------
// Version / capabilities
// ---------------------------
message GetVersionRequest {}
message GetVersionResponse {
string engine_name = 1; // e.g., "geomrf-engine"
string engine_version = 2; // semver, e.g., "1.0.0"
string schema_version = 3; // e.g., "geomrf.v1"
string build_git_sha = 4; // optional
}
message GetCapabilitiesRequest {}
message GetCapabilitiesResponse {
string schema_version = 1;
// Limits
uint32 max_links_per_tick = 2;
uint32 max_nodes = 3;
uint32 max_streams_per_scenario = 4;
google.protobuf.Duration min_dt = 5;
google.protobuf.Duration max_dt = 6;
// Supported outputs
bool supports_loss_rate = 10;
bool supports_capacity_bps = 11;
bool supports_delay_s = 12;
bool supports_snr_margin_db = 13;
// Supported selectors/features (advertise so clients can adapt)
bool supports_only_visible = 20;
bool supports_min_elevation_deg = 21;
bool supports_max_degree = 22;
bool supports_link_types = 23; // GS-SAT, SAT-SAT, etc.
}
// ---------------------------
// Scenario lifecycle
// ---------------------------
message CreateScenarioRequest {
ScenarioSpec spec = 1;
}
message CreateScenarioResponse {
string scenario_ref = 1; // UUID string
string schema_version = 2;
}
message CloseScenarioRequest {
string scenario_ref = 1;
}
message CloseScenarioResponse {
bool ok = 1;
}
// ---------------------------
// Scenario specification (v1)
// ---------------------------
message ScenarioSpec {
// Reproducibility
uint64 seed = 1;
// Time model
google.protobuf.Timestamp t0 = 2; // UTC
google.protobuf.Timestamp t1 = 3; // UTC
google.protobuf.Duration default_dt = 4;
// Nodes
repeated NodeSpec nodes = 10;
// Eligibility rules (which links can exist)
LinkPolicy link_policy = 20;
// Mapping PHY -> network outputs (can be simplistic in v1)
AdaptationPolicy adaptation = 30;
// Optional: engine-side caching hints
CacheHints cache_hints = 40;
}
enum NodeRole {
NODE_ROLE_UNSPECIFIED = 0;
SATELLITE = 1;
GROUND_STATION = 2;
USER_TERMINAL = 3;
}
message NodeSpec {
string node_id = 1; // stable ID used everywhere
NodeRole role = 2;
// One of the following depending on role
SatelliteOrbit orbit = 10;
GroundFixedSite fixed_site = 11;
// Radio/terminal model parameters (minimal v1)
TerminalModel terminal = 20;
// Arbitrary tags for selectors/grouping
map<string,string> tags = 30;
}
message SatelliteOrbit {
// v1: only TLE supported. Later: OEM/SP3/etc.
string tle_line1 = 1;
string tle_line2 = 2;
}
message GroundFixedSite {
double lat_deg = 1;
double lon_deg = 2;
double alt_m = 3;
}
message TerminalModel {
// Minimal knobs to compute link budgets consistently.
// Units: dBW, dBi, Hz, K, etc.
double tx_power_dbw = 1;
double tx_gain_dbi = 2; // can be treated as peak gain in v1
double rx_gain_dbi = 3; // can be treated as peak gain in v1
double rx_noise_temp_k = 4;
double bandwidth_hz = 5;
double frequency_hz = 6;
// Optional: simple pointing/antenna pattern loss approximation
double pointing_loss_db = 10; // default constant loss if you don’t model patterns yet
}
enum LinkType {
LINK_TYPE_UNSPECIFIED = 0;
GS_TO_SAT = 1;
SAT_TO_GS = 2;
SAT_TO_SAT = 3;
UT_TO_SAT = 4;
SAT_TO_UT = 5;
}
message LinkPolicy {
// Which link types are allowed at all
repeated LinkType allowed_types = 1;
// Dynamic feasibility thresholds
double min_elevation_deg = 2; // default 0 if unused
bool only_visible = 3; // if true, return only visible/feasible links
// Degree constraints (optional)
uint32 max_out_degree = 10; // 0 means unlimited
uint32 max_in_degree = 11; // 0 means unlimited
// Optional: limit candidates by distance for scalability
double max_range_m = 20; // 0 means unlimited
}
message AdaptationPolicy {
// v1: a simple mapping mode.
// Future: full MCS tables, ACM, coding gains, etc.
enum Mode {
MODE_UNSPECIFIED = 0;
FIXED_RATE = 1; // constant capacity if link is up, else 0
SNR_TO_RATE = 2; // rate from snr_margin (simple piecewise)
SNR_TO_LOSS = 3; // loss from snr_margin (simple logistic)
SNR_TO_BOTH = 4;
}
Mode mode = 1;
// v1 defaults
double fixed_capacity_bps = 2;
double fixed_loss_rate = 3;
// Parameters for simple SNR->rate/loss mappings (implementation defined but deterministic)
double snr_margin_min_db = 10;
double snr_margin_max_db = 11;
}
message CacheHints {
bool precompute_positions = 1;
bool precompute_visibility = 2;
uint32 max_cache_ticks = 3; // 0 = engine default
}
// ---------------------------
// Streaming requests
// ---------------------------
message StreamLinkDeltasRequest {
string scenario_ref = 1;
// Time range for this stream. If empty, use scenario t0..t1.
google.protobuf.Timestamp t_start = 2;
google.protobuf.Timestamp t_end = 3;
// If unset, use scenario default_dt.
google.protobuf.Duration dt = 4;
// Which links to consider/return.
LinkSelector selector = 10;
// Delta emission thresholds
DeltaThresholds thresholds = 20;
// Behavior knobs
bool emit_full_snapshot_first = 30; // recommended true for simpler clients
bool include_debug_fields = 31; // if true, fill debug fields in updates
}
message StreamEventsRequest {
string scenario_ref = 1;
google.protobuf.Timestamp t_start = 2;
google.protobuf.Timestamp t_end = 3;
// If unset, use scenario default_dt. Must satisfy capabilities bounds.
google.protobuf.Duration dt = 4;
EventFilter filter = 10;
// Apply the same selection surface as StreamLinkDeltas for deterministic alignment.
LinkSelector selector = 11;
}
message LinkSelector {
// v1 supports:
// - explicit pairs
// - by link type
// - by node role sets
repeated LinkPair explicit_pairs = 1;
repeated LinkType link_types = 2;
// If non-empty, only consider links where src in set AND dst in set
repeated string src_node_ids = 10;
repeated string dst_node_ids = 11;
// Optional tag filters (exact match)
map<string,string> src_tags = 12;
map<string,string> dst_tags = 13;
// If true, apply scenario LinkPolicy.only_visible behavior
bool only_visible = 20;
// Optional override thresholds (0 uses scenario policy)
double min_elevation_deg = 21;
double max_range_m = 22;
}
message DeltaThresholds {
// Only emit update if absolute change exceeds threshold.
// 0 means "emit on any change" for that field.
double delay_s = 1;
double capacity_bps = 2;
double loss_rate = 3;
double snr_margin_db = 4;
// Emit if link up/down changes always (implicit).
}
// ---------------------------
// Streaming output
// ---------------------------
message LinkDeltaBatch {
string scenario_ref = 1;
string schema_version = 2;
google.protobuf.Timestamp time = 3; // tick time
uint64 tick_index = 4;
// If emit_full_snapshot_first=true, first batch may be a full snapshot.
bool is_full_snapshot = 5;
// Sparse updates (add/update)
repeated LinkUpdate updates = 10;
// Links to remove from active set (no longer selected/visible/allowed)
repeated LinkKey removals = 11;
// Optional: server stats
TickStats stats = 20;
}
message LinkUpdate {
LinkKey key = 1;
// Core NetworkView outputs
bool up = 2;
double one_way_delay_s = 3;
double capacity_bps = 4;
double loss_rate = 5;
// Optional debug fields (filled if include_debug_fields=true)
double snr_margin_db = 10;
double elevation_deg = 11;
double range_m = 12;
// Extension space for later (avoid breaking schema)
map<string,string> extra = 30;
}
message LinkKey {
string src = 1;
string dst = 2;
LinkType type = 3;
}
message LinkPair {
string src = 1;
string dst = 2;
LinkType type = 3;
}
message TickStats {
uint32 links_computed = 1;
uint32 links_emitted = 2;
double compute_ms = 3;
}
// ---------------------------
// Events
// ---------------------------
enum EventType {
EVENT_TYPE_UNSPECIFIED = 0;
LINK_UP = 1;
LINK_DOWN = 2;
HANDOVER_START = 3;
HANDOVER_COMPLETE = 4;
NODE_FAILURE = 5;
NODE_RECOVERY = 6;
}
message EngineEvent {
string scenario_ref = 1;
string schema_version = 2;
EventType type = 3;
google.protobuf.Timestamp time = 4;
uint64 tick_index = 5;
// Which entities are involved (optional depending on event)
string node_id = 10;
LinkKey link = 11;
map<string,string> meta = 20;
}
message EventFilter {
repeated EventType types = 1;
repeated string node_ids = 2;
}
5) Server behavior specification (streaming semantics)
5.1 Timebase rules
- `ScenarioSpec.t0`/`t1` define the canonical simulation window.
- Stream requests may override with `t_start`/`t_end`:
  - If unset → default to the scenario window.
  - Engine must clamp requests to `[t0, t1]` unless explicitly configured otherwise.
- `dt`:
  - If unset → use `ScenarioSpec.default_dt`.
  - Must be within `[Capabilities.min_dt, Capabilities.max_dt]`; otherwise return `INVALID_ARGUMENT`.
5.2 Tick indexing
- Tick `0` corresponds to `t_start`.
- Tick `k` corresponds to `t_start + k*dt`.
- Engine must emit `tick_index` and `time` on every batch.
5.3 Active link set and removals
The stream maintains a client-side “active link table”.
- `updates[]` means: create or replace the link entry keyed by `(src, dst, type)`.
- `removals[]` means: delete that link entry (no longer in the selection, or no longer feasible under policy).
This is required for sparse streams when visibility causes links to appear/disappear.
5.4 First message behavior
If `emit_full_snapshot_first=true`:
- The first emitted `LinkDeltaBatch` at tick 0 must have:
  - `is_full_snapshot=true`
  - `updates[]` containing all currently selected/feasible links
  - `removals[]` empty
This drastically simplifies consumers (no special “initialization” logic).
5.5 Delta emission thresholds
For ticks after the initial snapshot, a link is emitted in `updates[]` if:
- it is newly added, OR
- its `up` changed, OR
- `abs(new.delay - old.delay) > thresholds.delay_s` (if threshold > 0), OR
- `abs(new.capacity - old.capacity) > thresholds.capacity_bps` (if threshold > 0), OR
- `abs(new.loss - old.loss) > thresholds.loss_rate` (if threshold > 0), OR
- (optional debug) changes exceed the debug thresholds, if included.
If a threshold is 0, treat it as “emit on any change”.
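The emission rule above can be sketched as a single predicate. This is an illustrative sketch, not the engine's code: `Thresholds` and the state attributes (`up`, `delay`, `capacity`, `loss`) stand in for the generated proto types.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    # 0 means "emit on any change" for that field
    delay_s: float = 0.0
    capacity_bps: float = 0.0
    loss_rate: float = 0.0


def should_emit(prev, curr, th: Thresholds) -> bool:
    """Return True if `curr` must appear in updates[] given `prev`."""
    if prev is None:            # newly added link: always emit
        return True
    if prev.up != curr.up:      # up/down transitions always emit
        return True

    def changed(new, old, threshold):
        # threshold == 0 → emit on any change at all
        return abs(new - old) > threshold if threshold > 0 else new != old

    return (changed(curr.delay, prev.delay, th.delay_s)
            or changed(curr.capacity, prev.capacity, th.capacity_bps)
            or changed(curr.loss, prev.loss, th.loss_rate))
```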
5.6 Event stream behavior (v1.1 alignment)
StreamEvents emits events in chronological order within [t_start, t_end]:
- Minimum set in v1: `LINK_UP`, `LINK_DOWN`
- Optional: `HANDOVER_START`, `HANDOVER_COMPLETE`, if you can detect “best-sat changed” for a GS/UT.
- Alignment requirements:
  - `StreamEventsRequest.dt` must use the same semantics/rules as delta streams (`default_dt` when unset, capability-validated).
  - `StreamEventsRequest.selector` must use the same link candidate filtering semantics as delta streams.
  - Every emitted event includes `tick_index`, where tick `k` maps to time `t_start + k*dt`.
- Events must be consistent with the LinkDelta stream:
  - If a link transitions from `up=false` to `up=true` at tick k, emit a `LINK_UP` event at that tick’s `time`.
  - If a link transitions from `up=true` to `up=false` at tick k, emit a `LINK_DOWN` event at that tick’s `time`.
5.7 Backpressure and cancellation
- Use `grpc.aio` streaming and `yield` messages.
- If the client is slow, await on send; do not build unbounded queues.
- On cancellation (`context.cancelled()`):
  - stop computation promptly
  - release scenario references held by the stream
  - record a log entry with the reason
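A minimal sketch of this pattern, assuming a `grpc.aio` servicer method: because `yield` awaits the transport send, a slow client naturally backpressures the loop. The helper names (`_num_ticks`, `_compute_batch`, `_release_scenario`) are placeholders for engine code, not real APIs.

```python
import logging


async def StreamLinkDeltas(self, request, context):
    """Server-streaming handler sketch: backpressure via yield, cleanup in finally."""
    try:
        for k in range(self._num_ticks(request)):
            if context.cancelled():        # client went away: stop promptly
                break
            batch = self._compute_batch(request, k)
            # yield awaits the send; no unbounded internal queue is built
            yield batch
    finally:
        # runs on normal completion, cancellation, or error
        self._release_scenario(request.scenario_ref)
        logging.info("stream closed: %s", request.scenario_ref)
```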
6) Internal engine architecture (recommended)
6.1 Modules and responsibilities
scenario_store.py
- Holds `ScenarioRuntime` objects keyed by `scenario_ref`
- Contains:
  - the validated `ScenarioSpec`
  - pre-parsed skyfield satellite objects
  - node dictionaries and role sets
  - cached computed data (positions/visibility per tick, if enabled)
  - an RNG seeded from `ScenarioSpec.seed`
timebase.py
- Converts timestamps to ticks and vice versa
- Handles rounding rules (recommended: tick times are exactly `t_start + k*dt`)
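The rounding rule can be made concrete in two small functions; a sketch with illustrative names, where round-to-nearest keeps the round trip stable despite float noise:

```python
def tick_to_time(t_start: float, dt: float, k: int) -> float:
    """Tick times are exactly t_start + k*dt."""
    return t_start + k * dt


def time_to_tick(t_start: float, dt: float, t: float) -> int:
    """Snap a timestamp to the nearest tick on the grid."""
    return round((t - t_start) / dt)
```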
selectors.py
- Applies `LinkSelector` + `LinkPolicy` to yield candidate link pairs
- Must support:
  - explicit pairs (exact)
  - link types
  - src/dst id filters
  - tag filters
  - only_visible/min_elevation/max_range constraints
compute/ephemeris.py
- Builds skyfield `EarthSatellite` objects from TLE
- Provides `get_sat_ecef(t)` or `get_sat_eci(t)`, depending on your implementation
compute/geometry.py
- Computes:
  - range (m)
  - elevation (deg) from the ground site to the satellite (and vice versa, if needed)
  - visibility boolean: elevation >= min_elev, range <= max_range
compute/link_budget.py
- Computes:
  - FSPL from range + frequency
  - atmospheric attenuation (via ITU-R), optional
  - noise power from bandwidth + noise temperature
  - received power, C/N0, SNR margin, etc.
- Returns a `PhySummary` (internal dataclass)
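The core pieces are standard textbook formulas; a sketch follows, where the `PhySummary` fields and `link_budget` signature are assumptions about the internal dataclass, not the actual engine code:

```python
import math
from dataclasses import dataclass

C = 299_792_458.0        # speed of light, m/s
BOLTZMANN_DBW = -228.6   # 10*log10(k), dBW/K/Hz


def fspl_db(range_m: float, freq_hz: float) -> float:
    """Free-space path loss in dB: 20*log10(4*pi*d*f/c)."""
    return 20 * math.log10(4 * math.pi * range_m * freq_hz / C)


def noise_power_dbw(bandwidth_hz: float, noise_temp_k: float) -> float:
    """Thermal noise power N = k*T*B, in dBW."""
    return (BOLTZMANN_DBW + 10 * math.log10(noise_temp_k)
            + 10 * math.log10(bandwidth_hz))


@dataclass
class PhySummary:          # illustrative fields only
    fspl_db: float
    rx_power_dbw: float
    snr_db: float


def link_budget(eirp_dbw, range_m, freq_hz, bandwidth_hz,
                noise_temp_k, rx_gain_db=0.0) -> PhySummary:
    loss = fspl_db(range_m, freq_hz)
    prx = eirp_dbw - loss + rx_gain_db
    snr = prx - noise_power_dbw(bandwidth_hz, noise_temp_k)
    return PhySummary(loss, prx, snr)
```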
compute/adaptation.py
- Maps `PhySummary` → `NetworkLinkState`:
  - `capacity_bps` and/or `loss_rate`
- For v1, implement a deterministic piecewise mapping:
  - clamp snr_margin_db into [min, max]
  - map linearly to capacity between [0, terminal.bandwidth * eff_max] (or use a fixed rate)
  - map snr_margin_db to loss via a logistic curve or fixed thresholds
streaming/delta.py
- Maintains the per-stream “previous link table”
- Computes `updates` and `removals` each tick
streaming/events.py
- Detects link up/down transitions and yields `EngineEvent`
7) Scenario validation rules (must be enforced)
- `t0 < t1`
- `default_dt > 0`
- Node IDs unique
- Satellites must include valid TLE lines
- Fixed sites must have valid lat/lon ranges
- Terminal model must include:
  - `frequency_hz > 0`
  - `bandwidth_hz > 0`
  - `rx_noise_temp_k > 0`
- `LinkPolicy.allowed_types` must be non-empty, OR default to all valid types for the provided roles
Return gRPC status INVALID_ARGUMENT with a descriptive error message if validation fails.
8) Performance requirements (practical targets)
These are engineering targets; adjust later.
- Tick compute should scale with the number of candidate links, not N² nodes.
- Implement at least one of:
  - pre-filter by link type and role sets
  - max_range cutoff
  - max_degree pruning (keep the best K neighbors by range or SNR)
Recommended optimizations (v1)
- Cache satellite positions per tick if `precompute_positions=true`.
- Cache ground station ECEF once.
- Vectorize range computations where possible (NumPy arrays).
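The vectorization point can be illustrated with NumPy broadcasting; a sketch assuming positions are already in a shared ECEF frame (array shapes are the only contract here):

```python
import numpy as np


def ranges_m(gs_ecef: np.ndarray, sat_ecef: np.ndarray) -> np.ndarray:
    """Pairwise ranges, shape (N_gs, N_sat), via one broadcasted subtraction.

    gs_ecef: (N_gs, 3) meters; sat_ecef: (N_sat, 3) meters.
    """
    diff = gs_ecef[:, None, :] - sat_ecef[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def within_range(gs_ecef, sat_ecef, max_range_m):
    """Boolean mask usable as a max_range cutoff pre-filter."""
    return ranges_m(gs_ecef, sat_ecef) <= max_range_m
```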
9) Determinism requirements
Determinism must include:
- Ordering: always sort link keys before emitting, for stable output
  - sort by `(src, dst, type)`
- RNG: use `numpy.random.Generator(PCG64(seed))` attached to the scenario
- Floating-point rounding: do not over-round, but be consistent in computations (same order of operations)
Test: run the same stream twice and ensure byte-equivalent serialized output (or field-wise equality within tolerance, where appropriate).
10) Error handling (gRPC status codes)
Implement these statuses consistently:
- `NOT_FOUND`: unknown `scenario_ref`
- `INVALID_ARGUMENT`: bad time range, dt, or selector; scenario validation failure
- `RESOURCE_EXHAUSTED`: too many active streams for a scenario, or too many links per tick requested
- `FAILED_PRECONDITION`: scenario closed
- `INTERNAL`: unexpected exceptions (log the stack trace server-side)
Add a stable error message prefix, e.g. `GEOMRF_ERR:<CODE>:<details>`, for easier parsing.
11) Reference streaming algorithm (server-side)
Pseudocode for StreamLinkDeltas
- Resolve the scenario and compute the effective `t_start`/`t_end`/`dt`.
- Build selector state (resolved node sets, tag filters, link types).
- Initialize:
  - `prev_links = {}` (LinkKey → LinkUpdate-like internal struct)
  - `active_keys = set()`
- For tick k from 0..:
  - Compute `t = t_start + k*dt`; stop when `t > t_end`.
  - Determine candidate link pairs from selector + policy.
  - For each candidate link:
    - compute geometry (range/elevation/visibility)
    - if not feasible and only_visible: skip (will cause a removal if previously active)
    - compute the PHY summary
    - compute NetworkLinkState (up/delay/capacity/loss)
    - assemble the internal current map: `curr_links[key] = state`
  - Compute `removals` = keys in prev_links but not in curr_links.
  - Compute `updates`:
    - if first tick and emit_full_snapshot_first: all curr_links become updates
    - else: apply delta thresholds comparing curr vs prev
  - Emit a `LinkDeltaBatch` (emitting every tick even with empty updates/removals is optional, but recommended for simplicity).
  - Update `prev_links = curr_links`.
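The tick loop above compresses into a short generator. This is a simplified sketch: `candidates_at` and `compute_state` are hypothetical callables standing in for the selector and compute pipeline, `updates` carries keys only (real batches carry full states), and the change test here is plain inequality rather than the threshold rule.

```python
def run_stream(candidates_at, compute_state, num_ticks, snapshot_first=True):
    """Yield per-tick delta batches mirroring the LinkDeltaBatch fields."""
    prev = {}
    for k in range(num_ticks):
        curr = {}
        for key in candidates_at(k):
            state = compute_state(key, k)
            if state is not None:      # None models "infeasible under only_visible"
                curr[key] = state
        removals = sorted(set(prev) - set(curr))
        if k == 0 and snapshot_first:
            updates = sorted(curr)     # full snapshot on the first tick
        else:
            updates = sorted(key for key, s in curr.items()
                             if prev.get(key) != s)
        yield {"tick_index": k,
               "is_full_snapshot": k == 0 and snapshot_first,
               "updates": updates,
               "removals": removals}
        prev = curr
```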
Pseudocode for StreamEvents
- Either:
  - derive from the `StreamLinkDeltas` logic (shared per-stream evaluator), OR
  - implement a separate evaluation loop that only checks transitions
- Resolve and validate `t_start`/`t_end`/`dt` exactly as in delta streams.
- Build the selector from the request and apply identical candidate filtering.
- Emit an event when `(prev.up != curr.up)` for any link in the selected set.
- Populate both `time` and `tick_index`.
12) Client expectations (contract for consumers)
A correct consumer must:
- Start with the first `LinkDeltaBatch` (full snapshot)
- Maintain `active_table[LinkKey] = LinkUpdate`
- Apply each tick:
  - delete removals
  - upsert updates
- Use `time` and `tick_index` as the authoritative time
- Optionally also subscribe to events; events are primarily observability data, not control-plane truth
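The consumer contract is a few lines of table maintenance. A sketch using plain dicts, with `(src, dst, type)` tuples standing in for LinkKey messages:

```python
def apply_batch(active_table: dict, batch: dict) -> dict:
    """Apply one LinkDeltaBatch-shaped dict to the client's active link table."""
    if batch.get("is_full_snapshot"):
        active_table.clear()               # a snapshot resets the table
    for key in batch.get("removals", []):
        active_table.pop(key, None)        # delete removals first
    for key, state in batch.get("updates", {}).items():
        active_table[key] = state          # then upsert updates
    return active_table
```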
13) Example client (must be included)
Create `examples/client_stream.py`:
- Connect to the server
- `CreateScenario` from an inline scenario object (or a YAML file)
- Start `StreamLinkDeltas` and print:
  - tick index, number of updates/removals, a sample link
- Optionally start `StreamEvents` concurrently
- Close the scenario at the end
Checklist:
- [ ] Implement `examples/client_stream.py`
- [ ] Add a README usage snippet:
  - start server
  - run client
  - expected output format
14) Minimal “default” adaptation mapping (v1, deterministic)
If your existing code already outputs a usable throughput and PER proxy, use it. If not, implement a deterministic fallback:
v1 fallback policy
- `up = visibility && snr_margin_db > 0` (or >= threshold)
- `delay = range_m / c` (c = 299792458 m/s)
- `capacity_bps`:
  - FIXED_RATE: `fixed_capacity_bps` when up, else 0
  - SNR_TO_RATE:
    - normalize `x = clamp((snr_margin_db - min)/(max - min), 0..1)`
    - `capacity = x * capacity_max`, where `capacity_max = bandwidth_hz * eff_max`
    - choose an `eff_max` constant (e.g., 4 bits/s/Hz) in v1; document it
- `loss_rate`:
  - FIXED: `fixed_loss_rate` when up, else 1
  - SNR_TO_LOSS:
    - logistic: `loss = 1 / (1 + exp(a*(snr_margin_db - b)))` with fixed a, b
    - clamp to [0, 1]
Checklist:
- [ ] Decide the v1 constants (`eff_max`, logistic params, up-threshold)
- [ ] Put them in `adaptation.py` and record them in logs/version
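The fallback policy above translates directly into a few deterministic functions. The constants here (`EFF_MAX`, the logistic `a`/`b`, the clamp window) are example values only; the checklist item is precisely to decide the real ones.

```python
import math

EFF_MAX = 4.0              # bits/s/Hz -- example constant, to be decided
LOG_A, LOG_B = 1.0, 3.0    # example logistic parameters, to be decided


def snr_to_capacity_bps(snr_margin_db, bandwidth_hz, lo=0.0, hi=20.0):
    """SNR_TO_RATE: clamp to [lo, hi], map linearly to [0, bandwidth*eff_max]."""
    x = min(max((snr_margin_db - lo) / (hi - lo), 0.0), 1.0)
    return x * bandwidth_hz * EFF_MAX


def snr_to_loss(snr_margin_db):
    """SNR_TO_LOSS: logistic in SNR margin, clamped to [0, 1]."""
    loss = 1.0 / (1.0 + math.exp(LOG_A * (snr_margin_db - LOG_B)))
    return min(max(loss, 0.0), 1.0)


def delay_s(range_m):
    """One-way propagation delay at the speed of light."""
    return range_m / 299_792_458.0
```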
15) Observability (recommended even in v1)
- gRPC access logs including:
  - scenario_ref, stream type, time range, dt, selector summary
- Prometheus counters (optional but easy):
  - streams active, ticks computed, links computed, mean compute time
- TickStats in the stream payload (already specified)
Checklist:
- [ ] Add `TickStats` computation
- [ ] Add server-side metrics (optional)
- [ ] Add structured logging with correlation IDs
16) Security and robustness (v1 minimum)
- Bind address configurable (`0.0.0.0:50051` default)
- Optional TLS later; v1 can be plaintext for local lab use
- Enforce limits:
  - max nodes
  - max links per tick
  - max active streams per scenario
Checklist:
- [ ] Enforce link/node limits with `RESOURCE_EXHAUSTED`
- [ ] Enforce max concurrent streams per scenario
17) Acceptance criteria (definition of done)
Functional
- Server starts and responds to `GetVersion` and `GetCapabilities`
- `CreateScenario` returns a scenario_ref and validates inputs
- `StreamLinkDeltas` emits:
  - a full snapshot first (when enabled)
  - then sparse deltas/removals per tick
- `StreamEvents` emits link up/down events consistent with deltas
- `CloseScenario` frees scenario resources and blocks further streams
Correctness
- Determinism test passes (same inputs → same outputs)
- Selector tests pass (only requested links emitted)
- Delta threshold tests pass (small changes suppressed)
Usability
- Example client runs end-to-end against the server and prints reasonable output
- README explains how to run locally and how to pass a scenario YAML
18) Optional but high-value extension hooks (safe to leave stubbed)
These can exist as placeholders in code (no API changes needed later):
- “Debug fields” population (`include_debug_fields=true`)
- Additional events (handover start/complete)
- More orbit formats (OEM/SP3) behind a `SatelliteOrbit` oneof later
- Better antenna pattern modeling behind `TerminalModel`
19) v1.1 orchestrator-alignment tasks (new)
- Bump the schema to `geomrf.v1.1` (or an equivalent versioning plan) for the event-alignment fields.
- Update the `StreamEventsRequest` implementation to honor the request `dt` and `selector`.
- Populate `EngineEvent.tick_index` from the same tick-loop semantics as deltas.
- Add tests:
  - events and deltas requested with the same window/selectors produce aligned tick grids
  - an invalid event `dt` returns `INVALID_ARGUMENT`
  - event selector filtering mirrors delta selector behavior
- Keep the backward-compatibility plan explicit (version gate or dual-field behavior) for existing v1 clients.
TASKS_IMNET.md — SatSim IMNET Lane (OMNeT++/INET) Implementation Plan
This is a task-driven implementation plan for the IMNET lane (OMNeT++/INET). It covers:
- Building and running an OMNeT++/INET simulation project under opp_env
- Integrating IMNET into the existing uv-managed SatSim workspace (without mixing concerns)
- Consuming orchestrator-produced LinkState traces (v1: trace-first) to apply dynamic link changes
- Producing artifacts compatible with SatSim run directories / manifests
Policy reminder (already in repo): Python workflows are `uv`-managed; the OMNeT++/INET toolchain is `opp_env`-managed.
-1) Locked v1 decisions for this plan update
- Execution model is post-stream replay:
  - Orchestrator records the OMNeT trace during streaming, then runs OMNeT after stream completion.
- Topology strategy is Strategy 3:
  - a stable NED template + orchestrator-generated `node_map.json` and `link_map.json`.
- Orchestrator ↔ OMNeT runtime interface is typed and orchestrator-owned:
  - no implicit "just pass arbitrary `run_args`" contract for required parameters.
- Canonical trace key fields use `src`, `dst`, `link_type` (not mixed `type` naming).
- Repo split is intentional:
  - OMNeT assets under `lanes/omnet/`; the Python lane adapter/runner under `satsim_orch/lanes/omnet_lane/`.
- The `opp_env` default path uses Nix, but `--nixless-workspace` is a supported fallback mode.
0) Deliverables (what “done” means)
- `lanes/omnet/` exists and contains:
  - a reproducible `opp_env` workspace definition (pinned OMNeT++ + INET versions)
  - an OMNeT++ project (“satsim-imnet”) that compiles and runs headless
  - a LinkState trace ingestion + applier module that updates delay/rate/(optional loss) per tick
  - a minimal demo scenario (2–10 nodes) that:
    - runs via the orchestrator in `--mode omnet`
    - reads the trace written by the orchestrator
    - produces artifacts under the run directory (logs + .vec/.sca; optional pcap)
- One-command dev flow:
  - `uv run satsim run <scenario.yaml> --mode omnet ...` executes IMNET via opp_env and stores artifacts in the standard run folder.
- Existing OMNeT-lane blockers in the current orchestrator code are closed before IMNET C++ work:
  - trace writer removal serialization does not use `__dict__` on slots dataclasses
  - the trace line includes `is_full_snapshot`
  - the OMNeT launch is not silently skipped when `--mode omnet` is selected
1) Repo layout for IMNET lane
- Create the directory structure:
  - `lanes/omnet/`
    - `lanes/omnet/WORKSPACE.md` (how to install + run with opp_env; pinned versions)
    - `lanes/omnet/opp_env/` (workspace init and pinned selection)
    - `lanes/omnet/satsim-imnet/` (the OMNeT++ project)
      - `src/` (C++ modules)
      - `ned/` (NED definitions)
      - `omnetpp.ini` (baseline config; orchestrator may override with -f/-c)
      - `Makefile` (generated by opp_makemake; committed only if desired, otherwise generated)
      - `README.md` (how to build/run inside opp_env)
    - `lanes/omnet/scripts/` (helper wrappers used by the orchestrator)
      - `install.py` (optional convenience; calls `opp_env install ...`)
      - `build.py` (builds the IMNET project inside opp_env)
      - `run.py` (runs IMNET headless inside opp_env; accepts args from the orchestrator)
- Keep orchestrator Python integration in package code:
  - `satsim_orch/lanes/omnet_lane/adapter.py` remains the lane entrypoint
  - `satsim_orch/lanes/omnet_lane/runner.py` owns command construction and process launch
  - `lanes/omnet/WORKSPACE.md` documents how these package paths map to `lanes/omnet/` assets
2) Version pinning and opp_env workspace (reproducible toolchain)
2.1 Pick and pin OMNeT++ + INET versions
- Pin versions (v1 recommendation; can be adjusted later):
  - OMNeT++: `omnetpp-6.3.0`
  - INET: `inet-4.5.4`
- Document the pin in:
  - `lanes/omnet/WORKSPACE.md`
  - orchestrator run manifest fields (already exist; ensure they record these exact strings)
2.2 Initialize an opp_env workspace outside git working trees
Goal: keep installs reproducible while avoiding committing huge toolchains.
- Decide where the opp_env workspace lives:
  - default: `~/.cache/satsim/opp_env/workspace` (outside the git tree)
  - store only small config/metadata in git, not the compiled artifacts
- Add `.gitignore` entries:
  - ignore the optional repo-local workspace path `lanes/omnet/opp_env/workspace/` if used for local experimentation
  - ignore `lanes/omnet/**/out/` (OMNeT outputs)
  - ignore `lanes/omnet/**/results/` (if used)
2.3 Provide canonical commands (must work from the uv venv)
- Ensure `opp_env` is invoked via uv:
  - `uv run opp_env --version`
  - `uv run opp_env list`
- Implement `lanes/omnet/scripts/install.py` to perform:
  - workspace init (idempotent)
  - installation of the pinned INET (which pulls the matching OMNeT++)
  - verification that the installed packages exist
- Add workspace-mode guidance:
  - default mode: Nix-backed `opp_env` workspace
  - fallback mode: `opp_env --nixless-workspace` (document prerequisites and reproducibility caveats)
  - orchestrator preflight must emit a clear error only when the selected workspace mode’s requirements are unmet
3) uv ↔ opp_env integration (clean boundary)
3.1 Keep responsibility boundaries strict
- Confirm and document:
  - `uv` manages Python deps + orchestrator execution
  - `opp_env` manages the OMNeT++/INET toolchain and the shell/run environment
  - the orchestrator calls `opp_env run ...` (or `opp_env shell -c ...`) rather than assuming OMNeT binaries are on PATH
3.2 Add orchestrator-side “IMNET preflight”
(Only if not already present; keep it minimal.)
- In the orchestrator’s OMNeT lane runner:
  - verify `opp_env` is available (`uv run opp_env --version`)
  - verify the required runtime-mode dependencies:
    - Nix-backed mode: verify `nix` is available
    - nixless mode: verify the required toolchain binaries are present
  - verify the IMNET scripts exist (install/build/run)
  - if dependencies are missing:
    - fail with a single actionable message (no partial runs)
- Close the current OMNeT-lane correctness blockers first:
  - fix trace writer removal serialization for `LinkKey` (slots dataclass)
  - include `is_full_snapshot` in the OMNeT trace JSONL tick payload
  - remove the `ini_path`-gated silent skip; in omnet mode the runner must launch or raise an explicit error
3.3 Standardize how orchestrator launches IMNET
- Replace ad-hoc argument passing with a typed runner contract (orchestrator-owned):
  - required fields:
    - `workspace_path`
    - `inet_version` (pinned string)
    - `project_path`
    - `ini_path`
    - `trace_path`
    - `dt_seconds`
    - `outdir`
    - `seed`
  - optional fields:
    - `config_name`
    - `sim_time_limit_s`
    - `extra_args` (non-critical escape hatch only)
- Define one canonical `opp_env` invocation generated by runner code:
  - `opp_env run inet-<PINNED> --init -w <WORKSPACE> --chdir -c "<COMMAND>"`
- `runner.py` converts the typed fields into OMNeT CLI args; callers do not handcraft command strings
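The typed contract could be expressed as a frozen dataclass, so that construction itself validates inputs; a sketch with the field names from the lists above (validation shown is minimal, not exhaustive):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OmnetRunSpec:
    """Orchestrator-owned runner contract; all required fields, no ad-hoc dicts."""
    workspace_path: str
    inet_version: str              # pinned string, e.g. "inet-4.5.4"
    project_path: str
    ini_path: str
    trace_path: str
    dt_seconds: float
    outdir: str
    seed: int
    config_name: Optional[str] = None
    sim_time_limit_s: Optional[float] = None
    extra_args: tuple = ()         # non-critical escape hatch only

    def __post_init__(self):
        if self.dt_seconds <= 0:
            raise ValueError("dt_seconds must be > 0")
        if self.seed < 0:
            raise ValueError("seed must be >= 0")
```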
4) IMNET contract with the other SatSim systems (as they exist)
4.1 Relationship to Geo/RF engine
- IMNET does not talk to Geo/RF directly in v1
- IMNET consumes an orchestrator-produced trace derived from:
  - Geo/RF `StreamLinkDeltas` batches (tick_index + time + updates/removals)
4.2 Relationship to Orchestrator (v1 trace-first)
- Orchestrator responsibilities (assumed existing):
  - create the scenario in the Geo/RF engine
  - consume the LinkDeltaBatch stream
  - write a deterministic LinkState trace file for the OMNeT lane
  - launch the OMNeT lane runner with the correct typed parameters after stream completion (post-stream replay)
- IMNET responsibilities:
  - parse the trace deterministically
  - schedule link updates in simulation time aligned to tick_index
  - apply updates to the simulated links (delay/rate/(optional loss), and “up/down” semantics)
4.3 Trace file format (IMNET must support this)
Pick one and lock it. If the orchestrator already emits a format, IMNET must match it.
- Decide and document the v1 trace format (JSONL, locked):
  - one JSON object per tick line
  - required fields:
    - `tick_index` (uint64)
    - `time` (ISO8601 or unix seconds; used for logging only; simtime comes from tick_index * dt)
    - `is_full_snapshot` (bool)
    - `updates` (list)
    - `removals` (list)
  - each `update` contains:
    - `src` (string)
    - `dst` (string)
    - `link_type` (string enum name)
    - `up` (bool)
    - `one_way_delay_s` (float)
    - `capacity_bps` (float)
    - `loss_rate` (float; optional if IMNET v1 ignores loss)
  - each `removal` contains:
    - `src`, `dst`, `link_type` (the same key fields)
- Determinism requirements:
  - stable ordering of `updates` and `removals` within each tick
  - no mixed key names (`type` vs `link_type`) in v1 output
- Define dt semantics for IMNET:
  - lock the v1 approach: IMNET gets `dt` via a typed runner argument (`dt_seconds`)
  - trace `time` is informational/logging only; tick scheduling uses `tick_index * dt_seconds`
5) OMNeT++/INET project scaffolding (“satsim-imnet”)
5.1 Create a minimal INET-based network model
Goal: minimal, not fancy, but real packets flow and link parameters can change.
- Create `lanes/omnet/satsim-imnet/ned/SatSimNetwork.ned`
  - Use INET `StandardHost` (or `Router`) modules for nodes
  - Include:
    - one traffic source app and one sink app (UDP is fine for v1)
    - an optional intermediate router (for a 2-hop demo)
  - Strategy 3 mapping is required in v1:
    - orchestrator generates `node_map.json` (node_id ↔ module path)
    - orchestrator generates `link_map.json` (LinkKey ↔ mutable channel/shim path)
    - LinkStateApplier consumes the map files in strict mode and fails on unknown keys
- Create `lanes/omnet/satsim-imnet/omnetpp.ini`
  - include INET paths and defaults
  - configure:
    - IP address assignment (INET configurator)
    - app endpoints and start times
    - GUI disabled by default (`Cmdenv`) for orchestrator runs
5.2 Add a LinkState ingestion + application component
- Add `src/LinkTraceReader.{h,cc}`
  - reads the trace file
  - validates ordering (tick_index monotonic)
  - provides an in-memory list of per-tick updates (or a streaming reader)
- Add `src/LinkStateApplier.{h,cc}` as a `cSimpleModule`
  - parameters:
    - `string tracePath`
    - `double dtSeconds`
    - `bool strict` (fail fast on unknown link keys)
    - `bool applyLoss` (optional)
  - behavior:
    - on init:
      - load/validate the trace header/dt
      - build a mapping from LinkKey → simulation link object(s)
      - schedule the first self-message at tick 0
    - on each tick:
      - apply updates/removals for that tick
      - schedule the next tick if present
    - on finish:
      - write summary scalars (num updates applied, unknown keys, etc.)
6) How IMNET represents links (v1 minimal approach)
You need a concrete, implementable mapping from LinkKey → something mutable in OMNeT/INET.
6.1 Choose v1 link representation
Pick one approach and implement it end-to-end:
Option A (preferred v1): mutate OMNeT channels
- Use a channel type that supports:
  - delay updates (`delay`)
  - datarate updates (`datarate`)
  - “up/down” via disabling or forcing drop
- Build a stable topology at startup containing all candidate links
- Map each LinkKey to a channel pointer
- On update:
  - set channel delay
  - set channel datarate
  - if `up=false` or “removal”: disable the channel / set drop mode
Option B: insert a small “LinkShim” module per edge (not selected for the v1 Option A path)
- Create a `LinkShim` `cSimpleModule` that:
  - applies propagation delay (scheduling)
  - applies serialization delay from capacity (packet_bits / capacity_bps)
  - drops packets by loss_rate
  - drops everything when `up=false`
- Connect hosts via `LinkShim` modules instead of relying on channel mutability
v1 recommendation: start with Option A if channel mutability is adequate; fall back to Option B if not.
6.2 Define “up/down/removal” semantics for v1
- Lock the v1 semantics (must match the orchestrator trace writer):
  - `up=false` means the link exists but is currently unavailable (drop all / disable)
  - `removal` means the link is not in the current active set
    - v1 handling recommendation: treat removal as `up=false` (do not delete topology)
    - allow a later `update` to re-enable it
6.3 Unit conversion rules (lock them)
- `one_way_delay_s` → OMNeT `simtime_t` delay
- `capacity_bps` → channel datarate (bps)
- `loss_rate`:
  - if supported natively by the chosen link representation, apply directly
  - otherwise implement drop probability in `LinkShim` or a per-interface dropper
7) Topology generation strategy (v1: small but consistent)
7.1 Keep topology simple in v1
- v1 target: 2–10 nodes, with:
  - at least one satellite-like router node
  - at least one ground-station-like host node
  - at least one dynamic link changing delay/rate/up/down over time
7.2 How the topology is created
Use Strategy 3 for v1:
- Commit a stable NED topology template under `lanes/omnet/satsim-imnet/ned/`
- Orchestrator writes `node_map.json` and `link_map.json` per run into the run artifacts
- IMNET loads the maps at startup and applies all updates by LinkKey lookup through the map
- In strict mode, any unmapped LinkKey is a hard failure
- Generated NED per run (Strategy 2) is explicitly deferred beyond v1
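A hypothetical shape for the Strategy 3 map files and the strict lookup, to make the contract concrete; the module/channel paths and the `"src|dst|link_type"` key encoding are assumptions for illustration, not a fixed wire format (the real NED template defines the actual paths):

```python
# Illustrative map contents the orchestrator might write per run.
node_map = {"gs1": "SatSimNetwork.gs1", "sat1": "SatSimNetwork.sat1"}
link_map = {
    # key: "src|dst|link_type"; value: a mutable channel/shim path (hypothetical)
    "gs1|sat1|GSL": "SatSimNetwork.gs1.pppg$o[0].channel",
}


def link_key(src: str, dst: str, link_type: str) -> str:
    """Canonical key fields, in the locked src/dst/link_type order."""
    return f"{src}|{dst}|{link_type}"


def resolve(link_map: dict, src, dst, link_type, strict=True):
    """Strict mode: any unmapped LinkKey is a hard failure."""
    key = link_key(src, dst, link_type)
    if key not in link_map and strict:
        raise KeyError(f"unmapped LinkKey: {key}")
    return link_map.get(key)
```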
8) Orchestrator ↔ IMNET runtime interface (exact parameters)
8.1 Standardize the runner arguments
- Define a single typed runner entrypoint (called by adapter/runner code) that accepts:
  - `--trace <path>`
  - `--dt <seconds>`
  - `--outdir <path>`
  - `--seed <int>`
  - `--tend <seconds>` (optional; can be inferred from the last tick)
  - `--config <omnet_config_name>` (optional)
- The runner converts these to OMNeT args:
  - `-u Cmdenv`
  - `-n <NEDPATHS including INET and satsim-imnet/ned>`
  - `-l` (load required libraries if needed)
  - `--output-dir=<outdir>`
  - `--seed-set=<seed>` (or the equivalent OMNeT seed setting)
  - `--sim-time-limit=<tend>s` (if used)
  - `tracePath` and `dtSeconds` passed as module parameters
- Runner invocation semantics:
  - if the lane mode is `omnet` or `parallel`, omnet runner invocation is mandatory
  - missing required runner inputs are a hard error, never a silent no-op
8.2 Ensure artifacts land in the SatSim run directory
- Orchestrator passes `outdir = artifacts/<run_id>/omnet/`
- IMNET writes:
  - OMNeT results (.vec/.sca) to `outdir/results/` (or directly under outdir)
  - stdout/stderr to `outdir/logs/omnet.log` (or the orchestrator captures it)
  - a small `imnet_runinfo.json` containing:
    - versions (inet, omnetpp)
    - trace hash
    - config name
    - seed
    - start/end ticks processed
9) Observability for IMNET (minimal but useful)
- Configure OMNeT to export:
  - scalar summary stats (packets sent/received, drops)
  - vector time series for throughput/delay (where feasible)
- Add LinkStateApplier scalars:
  - `ticksProcessed`
  - `updatesApplied`
  - `removalsApplied`
  - `unknownLinkKeys` (must be 0 in strict mode)
- Optional (v1.1): PCAP output (deferred; not enabled in v1)
  - If using INET features that can emit pcap: enable and write into `outdir/pcap/`
  - Otherwise: skip; rely on the Mininet lane for PCAPs
10) Testing and validation (must exist for v1)
10.1 Offline smoke test (no orchestrator)
- Add `lanes/omnet/satsim-imnet/tests/` (or scripts) that:
  - run a short simulation with a tiny trace
  - verify result files are created
  - verify LinkStateApplier processed N ticks
- Provide `lanes/omnet/scripts/smoke.py` that:
  - installs the toolchain (if needed)
  - builds the project
  - runs a headless sim for ~5–20 seconds of simtime
10.2 End-to-end smoke test (with orchestrator)
- Add a minimal `scenarios/omnet_smoke.yaml`:
  - small node set
  - dt ~ 1s
  - short window (e.g., 60s)
  - an explicit link selector to keep the link key set stable
- Add a single command documented in the root README or WORKSPACE.md:
  - `uv run satsim run scenarios/omnet_smoke.yaml --mode omnet`
- Verify artifacts:
  - the orchestrator run folder exists
  - the omnet subfolder contains .vec/.sca
  - logs show LinkStateApplier applying ticks in order
11) Known v1 constraints (explicitly accepted)
- v1 uses trace-first ingestion (no live gRPC inside OMNeT)
- v1 may ignore directional asymmetry:
  - if LinkKey is directional but the chosen link model is bidirectional, document how direction is collapsed (or restrict scenarios accordingly)
- v1 focuses on:
  - correctness of tick alignment
  - correctness of delay/rate updates
  - a reproducible, scripted build/run under opp_env
12) v1.1+ hooks (implemented as opt-in features)
- Live streaming adapter hook inside OMNeT++ (`ingest_mode=live_stream`, runtime trace refresh + `grpcTarget` hook parameter)
- Dynamic topology construction from ScenarioSpec maps (generated NED at run time in `run.py`)
- Proper per-direction link modeling hook (`directional_links=true` creates separate directional channels)
- Integration hook for SDN decision traces (`sdn_trace_path` + per-tick channel enable/disable decisions in LinkStateApplier)
ORCHESTRATOR_IMPLEMENTATION.md — SatSim Orchestrator (Python) Detailed Plan
This document is a task-driven implementation plan for the SatSim Orchestrator. It is intended to be handed to an LLM to implement the orchestrator in Python. It focuses on architecture, module boundaries, data flow, and concrete tasks (not deep RPC wire details).
The orchestrator is responsible for:
- Loading a scenario
- Starting and coordinating subcomponents (Geo/RF engine, OMNeT++ lane, Mininet lane)
- Driving a unified timebase
- Fanning out LinkState/Event updates to lane adapters
- Recording reproducible artifacts (manifest, logs, optional traces, metrics)
Decision lock (2026-02-18)
These ambiguities are now resolved and should be treated as fixed v1 design:
- Authoritative tick source: `StreamLinkDeltas` is the only control-plane tick source. Lanes apply link state from deltas, not from events.
- Event stream alignment (Option B accepted): evolve the Geo/RF event API so `StreamEventsRequest` carries `dt` + `selector`, and `EngineEvent` carries `tick_index`, aligned to the same tick grid as deltas.
- Error contract handling: the orchestrator must handle `NOT_FOUND`, `INVALID_ARGUMENT`, `FAILED_PRECONDITION`, and `RESOURCE_EXHAUSTED` as first-class engine responses.
- Scenario translation strictness: translation to the Geo/RF `ScenarioSpec` is fail-fast and must satisfy all required engine fields/constraints.
- Tooling policy: Python workflows use `uv` (`uv run`, `uv add`); do not rely on `pip`.
1) Repository layout (recommended)
```
satsim/
  orchestrator/
    pyproject.toml
    README.md
    ORCHESTRATOR_IMPLEMENTATION.md
    satsim_orch/
      __init__.py  cli.py  main.py
      config/    __init__.py  schema.py  loader.py  defaults.py  normalize.py
      runtime/   __init__.py  run_manager.py  manifest.py  artifact_store.py  logging.py  versioning.py  process.py
      timebase/  __init__.py  clock.py  modes.py  scheduler.py
      bus/       __init__.py  messages.py  queues.py  fanout.py
      geomrf/    __init__.py  client.py  translate.py  health.py
      lanes/     __init__.py  base.py  registry.py
        mininet_lane/
          __init__.py  adapter.py  topo.py  shaping.py  controller.py  capture.py
        omnet_lane/
          __init__.py  adapter.py  trace_ingest.py  runner.py
      metrics/   __init__.py  prom.py  records.py  exporters.py
      util/      __init__.py  ids.py  units.py  asyncx.py  errors.py
    tests/
      test_config_validation.py
      test_timebase_scheduler.py
      test_bus_fanout.py
      test_run_manifest.py
      test_lane_adapter_contract.py
      test_geomrf_client_smoke.py
  subprojects/
    geomrf-engine/   # separate project; orchestrator consumes it via gRPC
  lanes/
    omnet/
    mininet/
  observability/
  artifacts/
```
2) Orchestrator design summary (targets)
2.1 Orchestrator responsibilities
- Scenario loading & validation
- Run directory + manifest creation
- Geo/RF engine lifecycle (create scenario; start streams; close)
- Lane lifecycle:
  - `prepare()` (build topology / start processes)
  - `apply_tick()` (apply link deltas / events)
  - `finalize()` (stop processes, collect outputs)
- Unified runtime pacing:
  - offline apply-fast
  - real-time apply-paced (wall-clock aligned)
  - parallel lane fanout (the same incoming ticks feed multiple lanes)
- Artifact collection:
  - config snapshot, manifest
  - optional LinkState trace logging
  - metrics export
  - PCAP capture (Mininet lane)
2.2 Key architectural choices
- Python 3.11+ with asyncio
- gRPC async client (`grpc.aio`) for Geo/RF streaming
- In-process async fanout bus using bounded queues (v1)
- Pluggable lane adapters via a registry
- Everything stamped with versions/seeds for reproducibility
## 3) Implementation checklist (extremely detailed)

### 3.1 Project bootstrap and build
- [ ] Create `orchestrator/pyproject.toml`
  - [ ] Define package name (e.g., `satsim-orchestrator`)
  - [ ] Set Python version (>=3.11)
  - [ ] Add dependencies:
    - [ ] `pydantic`
    - [ ] `pyyaml`
    - [ ] `grpcio`, `grpcio-tools`, `protobuf`
    - [ ] `rich` (optional, for CLI UX)
    - [ ] `prometheus-client` (optional)
    - [ ] `aiofiles` (optional, async file writes)
  - [ ] Add dev dependencies:
    - [ ] `pytest`, `pytest-asyncio`
    - [ ] `ruff`/`black`
    - [ ] `mypy` (optional)
- [ ] Add task runner and `uv` commands:
  - [ ] `uv run pytest`
  - [ ] `uv run ruff check .`
  - [ ] `uv run python -m satsim_orch.cli run <scenario.yaml> ...`
### 3.2 CLI and entrypoints
- [ ] Implement `satsim_orch/cli.py` with commands:
  - [ ] `run <scenario.yaml> --mode {omnet|mininet|parallel} --dt 1s --t0 ... --t1 ...`
  - [ ] `validate <scenario.yaml>`
  - [ ] `list-runs`
  - [ ] `show-run <run_id>`
- [ ] Implement `satsim_orch/main.py`
  - [ ] Parse CLI args
  - [ ] Load scenario
  - [ ] Create `RunContext`
  - [ ] Run orchestrator loop
- [ ] Define exit codes and error messages:
  - [ ] Invalid config → exit 2
  - [ ] Missing dependency/lane binary → exit 3
  - [ ] Runtime error → exit 1
## 4) Configuration system

### 4.1 Scenario schema (Pydantic)
- [ ] Implement `config/schema.py` with a canonical `ScenarioConfig`
  - [ ] Global:
    - [ ] `name`
    - [ ] `seed`
    - [ ] `time: {t0, t1, dt, mode}`
    - [ ] `execution: {lane_mode, strict_reproducible, record_trace, record_pcap}`
    - [ ] `paths: {artifacts_root}`
  - [ ] Geo/RF engine connection:
    - [ ] `geomrf: {grpc_target, request_dt, selector_defaults, thresholds_defaults}`
  - [ ] Geo/RF scenario payload:
    - [ ] `geomrf.scenario_spec` maps 1:1 to the Geo/RF `ScenarioSpec` required fields (nodes, terminal, orbit/site, link/adaptation policy)
    - [ ] optional high-level shorthand may exist, but must compile deterministically to a valid `ScenarioSpec`
  - [ ] Lane configs:
    - [ ] `mininet: {controller: {type, addr}, topo: {...}, shaping: {...}}`
    - [ ] `omnet: {project_path, ini_path, run_args, trace_mode}`
- [ ] Add validators:
  - [ ] `t0 < t1`
  - [ ] `dt > 0`
  - [ ] `seed >= 0`
  - [ ] lane configs exist for the chosen mode
  - [ ] fail fast if engine-required scenario fields are missing/invalid
  - [ ] fail fast if `request_dt` is outside engine capabilities (`min_dt`, `max_dt`)
  - [ ] if `mode=mininet`, require Linux + OVS checks (soft validation with warnings)
- [ ] Add defaulting rules in `config/defaults.py`
  - [ ] dt default (e.g., 1s)
  - [ ] thresholds default (delay/capacity/loss)
  - [ ] artifacts root default `./artifacts/runs`
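The validator rules above are small enough to sketch. This is a minimal illustration using plain dataclasses (the real implementation would use Pydantic models per the checklist; field names beyond those listed above are assumptions):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeConfig:
    t0: float  # scenario start (epoch seconds, for illustration)
    t1: float  # scenario end
    dt: float  # tick interval, seconds
    mode: str = "OFFLINE"


@dataclass(frozen=True)
class ScenarioConfig:
    name: str
    seed: int
    time: TimeConfig
    lane_mode: str = "mininet"

    def __post_init__(self) -> None:
        # Validators mirroring the checklist: t0 < t1, dt > 0, seed >= 0.
        if not self.time.t0 < self.time.t1:
            raise ValueError("t0 must be strictly before t1")
        if self.time.dt <= 0:
            raise ValueError("dt must be positive")
        if self.seed < 0:
            raise ValueError("seed must be non-negative")

    @property
    def tick_count(self) -> int:
        # Derived field computed during normalization (section 4.2).
        return int((self.time.t1 - self.time.t0) // self.time.dt)


cfg = ScenarioConfig(name="demo", seed=42, time=TimeConfig(t0=0.0, t1=60.0, dt=1.0))
print(cfg.tick_count)  # → 60
```

In the Pydantic version these checks would live in model validators so that `validate <scenario.yaml>` can surface every violation with a field path.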
### 4.2 Loader and normalization
- [ ] Implement `config/loader.py`
  - [ ] load YAML/JSON
  - [ ] environment-variable expansion (optional)
  - [ ] include/merge support (optional)
- [ ] Implement `config/normalize.py`
  - [ ] produce a normalized config (canonical types, timezone normalization)
  - [ ] compute derived fields (run duration, tick count)
  - [ ] build `GeomrfScenarioSpec` (engine-facing) from `ScenarioConfig`
  - [ ] build `LaneScenarioSpec` (lane-facing) from `ScenarioConfig`
## 5) Run manager and artifacts

### 5.1 Run context and directory structure
- [ ] Implement `runtime/run_manager.py`
  - [ ] Generate `run_id` (timestamp + short random, or UUID)
  - [ ] Create run directory:
    - [ ] `artifacts/runs/<run_id>/`
    - [ ] `logs/`, `metrics/`, `pcaps/`, `traces/`, `manifests/`
  - [ ] Save copies of:
    - [ ] raw scenario file
    - [ ] normalized scenario JSON
- [ ] Implement `runtime/manifest.py`
  - [ ] manifest fields:
    - [ ] run_id, scenario name, timestamps
    - [ ] seeds
    - [ ] component versions (orchestrator, geomrf engine, lanes)
    - [ ] execution mode, dt, tick count
    - [ ] git SHAs if available
    - [ ] host info (OS, Python version) (optional)
- [ ] Implement `runtime/versioning.py`
  - [ ] orchestrator version string
  - [ ] best-effort git SHA discovery
### 5.2 Logging
- [ ] Implement `runtime/logging.py`
  - [ ] structured JSON logs to file
  - [ ] human-readable console logs
  - [ ] include `run_id` and correlation IDs
- [ ] Implement log rotation policy (optional)
- [ ] Implement `util/errors.py` with typed exceptions:
  - [ ] `ScenarioError`, `GeomrfError`, `LaneError`, `TimebaseError`

### 5.3 Artifact store helpers
- [ ] Implement `runtime/artifact_store.py`
  - [ ] `write_text(path, text)`
  - [ ] `write_json(path, obj)`
  - [ ] `append_jsonl(path, obj)`
  - [ ] atomic writes (write temp then rename)
- [ ] Implement trace recording option:
  - [ ] if `record_trace=true`, append received LinkDeltaBatch to JSONL (Parquet later)
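The atomic-write helper described above can be sketched directly; a possible shape for two of the `runtime/artifact_store.py` helpers (function names from the checklist, everything else illustrative):

```python
import json
import os
import tempfile


def write_json_atomic(path: str, obj: object) -> None:
    """Write JSON via a temp file in the same directory, then rename.

    os.replace() is atomic on POSIX, so readers never observe a partial file.
    """
    dirname = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=dirname, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(obj, f, sort_keys=True)
            f.flush()
            os.fsync(f.fileno())  # durable before the rename
        os.replace(tmp, path)     # atomic swap into place
    except BaseException:
        os.unlink(tmp)
        raise


def append_jsonl(path: str, obj: object) -> None:
    # One record per line; sorted keys keep traces byte-reproducible.
    with open(path, "a") as f:
        f.write(json.dumps(obj, sort_keys=True) + "\n")
```

The temp file must live in the destination directory: `os.replace` is only atomic within a single filesystem.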
## 6) Timebase and pacing

### 6.1 Time modes
- [ ] Implement `timebase/modes.py` enum:
  - [ ] `OFFLINE` (apply incoming ticks as fast as possible; no sleeping)
  - [ ] `REALTIME` (apply incoming ticks at wall-clock pace)
  - [ ] `PARALLEL` (lane-selection mode; both lanes consume the same incoming ticks)
- [ ] Implement `timebase/clock.py`
  - [ ] `SimulationTime` type for formatting/validation of incoming stream ticks
  - [ ] conversions and formatting
- [ ] Implement `timebase/scheduler.py`
  - [ ] implement pacing, not tick generation
  - [ ] for REALTIME: sleep until the expected wall-clock time of the next received tick
  - [ ] for OFFLINE: apply each received tick immediately
- [ ] Add drift handling for REALTIME:
  - [ ] if late by more than 1 tick, either skip ticks or catch up (configurable)
  - [ ] default: never skip control-plane ticks; warn if drift accumulates
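A minimal sketch of the REALTIME pacing rule above (sleep until each received tick's expected wall-clock time, apply every tick, warn on drift). The `Tick` stand-in and drift budget are assumptions:

```python
import asyncio
import time
from dataclasses import dataclass


@dataclass
class Tick:
    tick_index: int  # stand-in for the internal TickUpdate


async def pace_realtime(ticks, dt_s, apply, max_drift_ticks=1):
    """Apply already-received ticks at wall-clock pace (REALTIME mode).

    The orchestrator never generates ticks: `ticks` is an async iterator of
    received updates. Sleep until each tick's expected wall-clock time; never
    skip control-plane ticks; warn once drift exceeds the budget.
    """
    start = time.monotonic()
    async for tick in ticks:
        target = start + tick.tick_index * dt_s
        lateness = time.monotonic() - target
        if lateness < 0:
            await asyncio.sleep(-lateness)  # early: sleep to alignment
        elif lateness > max_drift_ticks * dt_s:
            print(f"warning: {lateness:.3f}s behind at tick {tick.tick_index}")
        await apply(tick)


async def _demo():
    async def incoming():
        for i in range(3):
            yield Tick(i)

    applied = []

    async def apply(tick):
        applied.append(tick.tick_index)

    await pace_realtime(incoming(), dt_s=0.01, apply=apply)
    return applied


applied = asyncio.run(_demo())
print(applied)  # → [0, 1, 2]
```

`time.monotonic()` rather than wall-clock time keeps pacing immune to NTP adjustments during a run.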
## 7) Internal bus and message contracts

### 7.1 Canonical internal messages
- [ ] Implement `bus/messages.py` dataclasses:
  - [ ] `TickUpdate`:
    - [ ] run_id, scenario_ref
    - [ ] tick_index, time
    - [ ] link_updates: list
    - [ ] link_removals: list
    - [ ] events: list
    - [ ] stats: compute timing, counts
  - [ ] `RunControl` messages:
    - [ ] start/pause/resume/stop
  - [ ] `LaneStatus` messages:
    - [ ] ready/running/error/stopped
- [ ] Implement `bus/queues.py`
  - [ ] bounded asyncio queues
  - [ ] per-lane queue limits (configurable)
- [ ] Implement `bus/fanout.py`
  - [ ] one producer (Geo/RF stream consumer)
  - [ ] N consumers (lane adapters + recorder)
  - [ ] backpressure policy:
    - [ ] default: block producer when any lane queue is full (strict sync)
    - [ ] option: drop trace-recorder messages only (never drop lane updates)
- [ ] Add message ordering rules:
  - [ ] tick updates delivered in increasing tick_index
  - [ ] within a tick: removals applied before updates by consumers (documented)
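The strict-sync backpressure policy can be illustrated with bounded `asyncio.Queue`s. A sketch of a `bus/fanout.py`-style bus (class and field names are assumptions):

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class TickUpdate:
    tick_index: int
    link_updates: list = field(default_factory=list)
    link_removals: list = field(default_factory=list)


class FanoutBus:
    """One producer, N consumers, bounded queues (strict-sync backpressure).

    publish() awaits every queue's put(), so the producer blocks as soon as
    any lane queue is full — the default policy in the checklist above.
    """

    def __init__(self, consumer_names, maxsize: int = 8):
        self.queues = {name: asyncio.Queue(maxsize=maxsize) for name in consumer_names}

    async def publish(self, tick: TickUpdate) -> None:
        for q in self.queues.values():
            await q.put(tick)  # blocks if this consumer's queue is full

    def subscribe(self, name: str) -> asyncio.Queue:
        return self.queues[name]


async def _demo():
    bus = FanoutBus(["mininet", "omnet"], maxsize=4)
    for i in range(3):
        await bus.publish(TickUpdate(tick_index=i))
    # Both lanes observe the same ticks, in increasing tick_index order.
    mn = [bus.subscribe("mininet").get_nowait().tick_index for _ in range(3)]
    om = [bus.subscribe("omnet").get_nowait().tick_index for _ in range(3)]
    return mn, om


mn, om = asyncio.run(_demo())
print(mn, om)  # → [0, 1, 2] [0, 1, 2]
```

The drop-recorder-only option would replace the recorder queue's `put` with `put_nowait` inside a `try/except QueueFull`.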
## 8) Geo/RF engine client integration

### 8.1 gRPC client
- [ ] Implement `geomrf/client.py`
  - [ ] gRPC channel creation (`grpc.aio.insecure_channel(target)`)
  - [ ] stub creation from generated proto
  - [ ] `get_version()`, `get_capabilities()`
  - [ ] `create_scenario(scenario_spec) -> scenario_ref`
  - [ ] `close_scenario(scenario_ref)`
  - [ ] `stream_link_deltas(request) -> async iterator`
  - [ ] `stream_events(request) -> async iterator`
- [ ] Implement `geomrf/health.py`
  - [ ] connect + health check on startup
  - [ ] gate event-consumer features on engine schema/version support
- [ ] Implement `geomrf/translate.py`
  - [ ] translate the orchestrator `ScenarioConfig` to the Geo/RF `ScenarioSpec` (engine-facing)
  - [ ] enforce deterministic key ordering where needed for reproducible payloads
  - [ ] validate all required proto fields before the RPC call; reject locally on mismatch
  - [ ] translate the Geo/RF `LinkDeltaBatch` into an internal `TickUpdate`
- [ ] Implement robust error mapping:
  - [ ] map `NOT_FOUND`, `INVALID_ARGUMENT`, `FAILED_PRECONDITION`, `RESOURCE_EXHAUSTED` to a typed `GeomrfError`
  - [ ] define retry policy for `UNAVAILABLE`/`DEADLINE_EXCEEDED` (bounded retries + backoff)
  - [ ] include scenario_ref and tick_index in error logs
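One way to sketch the error-mapping and retry policy. Status codes are modeled as plain strings here (in the real client they would come from `grpc.aio.AioRpcError.code()`), and the backoff schedule is illustrative:

```python
class GeomrfError(Exception):
    """Typed engine error carrying the context the checklist wants in logs."""

    def __init__(self, code, detail, scenario_ref=None, tick_index=None):
        super().__init__(f"{code}: {detail}")
        self.code, self.detail = code, detail
        self.scenario_ref, self.tick_index = scenario_ref, tick_index


# First-class, non-retryable engine responses.
FATAL_CODES = {"NOT_FOUND", "INVALID_ARGUMENT", "FAILED_PRECONDITION", "RESOURCE_EXHAUSTED"}
# Transient transport conditions: bounded retries with backoff.
RETRYABLE_CODES = {"UNAVAILABLE", "DEADLINE_EXCEEDED"}


def map_rpc_error(code, detail, scenario_ref=None, tick_index=None):
    """Classify an RPC status: raise GeomrfError for fatal codes,
    return a bounded exponential backoff schedule (seconds) for retryable ones."""
    if code in FATAL_CODES:
        raise GeomrfError(code, detail, scenario_ref, tick_index)
    if code in RETRYABLE_CODES:
        return [0.5 * 2**n for n in range(4)]  # 0.5, 1.0, 2.0, 4.0
    raise GeomrfError("UNKNOWN", detail, scenario_ref, tick_index)


schedule = map_rpc_error("UNAVAILABLE", "engine restarting")
print(schedule)  # → [0.5, 1.0, 2.0, 4.0]
```

Keeping the code sets in one module makes the retry policy auditable and easy to unit-test independently of a live channel.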
### 8.2 Stream consumption tasks
- [ ] Implement the `geomrf` stream consumer coroutine:
  - [ ] starts `StreamLinkDeltas` with `emit_full_snapshot_first=true`
  - [ ] reads batches and pushes `TickUpdate` to the bus producer
- [ ] Implement the event stream consumer coroutine (optional in v1):
  - [ ] call `StreamEvents` with the same `t_start`/`t_end`/`dt`/selector used for deltas
  - [ ] consume `EngineEvent.tick_index` directly (no nearest-tick heuristics)
  - [ ] record events to the trace/metrics channel for observability
  - [ ] if the connected engine does not support the aligned event schema, disable the event consumer and warn once
- [ ] Merge/control strategy:
  - [ ] the lane control path uses `TickUpdate` from `StreamLinkDeltas` only
  - [ ] the event stream is informational and must not mutate lane state
## 9) Lane adapter architecture

### 9.1 Adapter base contract
- [ ] Implement `lanes/base.py`:
  - [ ] `class LaneAdapter(Protocol)` or ABC with:
    - [ ] `name: str`
    - [ ] `async prepare(run_context, scenario_config) -> None`
    - [ ] `async apply_tick(tick: TickUpdate) -> None`
    - [ ] `async finalize(run_context) -> None`
    - [ ] `async health() -> dict` (optional)
- [ ] Implement `lanes/registry.py`
  - [ ] register adapters by name
  - [ ] instantiate chosen adapters based on `lane_mode`
- [ ] Implement `tests/test_lane_adapter_contract.py` for interface compliance
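A sketch of the adapter contract as a `typing.Protocol`, plus a trivial no-op lane of the kind the contract test might use (the `NullLane` and the registry dict are assumptions, not part of the checklist):

```python
import asyncio
from typing import Protocol, runtime_checkable


@runtime_checkable
class LaneAdapter(Protocol):
    name: str

    async def prepare(self, run_context, scenario_config) -> None: ...
    async def apply_tick(self, tick) -> None: ...
    async def finalize(self, run_context) -> None: ...


class NullLane:
    """Trivial adapter used to exercise the contract in tests."""

    name = "null"

    def __init__(self):
        self.applied = []

    async def prepare(self, run_context, scenario_config) -> None:
        pass

    async def apply_tick(self, tick) -> None:
        self.applied.append(tick)

    async def finalize(self, run_context) -> None:
        pass


# lanes/registry.py style: name -> adapter class, selected by lane_mode.
REGISTRY = {"null": NullLane}

lane = REGISTRY["null"]()
print(isinstance(lane, LaneAdapter))  # → True
```

Note that `isinstance` against a `runtime_checkable` Protocol only checks member presence, not signatures, so the contract test should also call each method.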
### 9.2 Mininet lane adapter (detailed tasks)
- [ ] Implement `lanes/mininet_lane/adapter.py`
  - [ ] `prepare()`:
    - [ ] validate Linux prereqs (`ovs-vsctl`, `tc`, `ip`)
    - [ ] start controller (ONOS/Ryu) if configured
    - [ ] create Mininet topology (delegate to `topo.py`)
    - [ ] start the Mininet network
    - [ ] start PCAP capture if enabled (delegate to `capture.py`)
  - [ ] `apply_tick()`:
    - [ ] apply removals (links down) first
    - [ ] apply updates:
      - [ ] for each link: set up/down state
      - [ ] apply delay/loss/rate using the shaping module
  - [ ] `finalize()`:
    - [ ] stop captures
    - [ ] stop Mininet
    - [ ] stop the controller if the orchestrator started it
- [ ] Implement `lanes/mininet_lane/topo.py`
  - [ ] create a Mininet graph from ScenarioConfig node roles
  - [ ] map SatSim node IDs to Mininet host/switch names
  - [ ] define OVS switches and host attachments
  - [ ] decide representation:
    - [ ] v1 recommended: represent satellites as OVS switches; GS/UT as hosts
    - [ ] optionally allow SAT-as-hosts if needed
  - [ ] create links but keep them initially “neutral” (shaping applied per tick)
- [ ] Implement `lanes/mininet_lane/shaping.py`
  - [ ] provide functions:
    - [ ] `set_link_up(link_id)` / `set_link_down(link_id)`
    - [ ] `apply_netem(link_id, delay_ms, loss_pct)`
    - [ ] `apply_rate(link_id, rate_mbps)`
    - [ ] `clear_shaping(link_id)`
  - [ ] implement using:
    - [ ] `tc qdisc replace dev <if> root netem delay ... loss ...`
    - [ ] `tc qdisc ... tbf/htb` for rate
  - [ ] ensure idempotency (repeated calls are safe)
  - [ ] log every applied shaping change with tick_index
- [ ] Implement `lanes/mininet_lane/controller.py`
  - [ ] support controller options:
    - [ ] external controller address (already running)
    - [ ] orchestrator-launched controller container/process (optional v1)
  - [ ] store controller version info in the manifest
- [ ] Implement `lanes/mininet_lane/capture.py`
  - [ ] start tcpdump for relevant interfaces
  - [ ] rotate PCAPs per time window or per run (v1: one PCAP per run)
  - [ ] store the PCAP path in the manifest
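The shaping calls can be sketched as pure command builders (the adapter would execute them with `subprocess` inside the Mininet namespace). Since netem alone cannot rate-limit, the sketch chains a `tbf` root qdisc with a `netem` child; the burst/latency values are illustrative defaults, not tuned numbers:

```python
import shlex


def shaping_cmds(dev, delay_ms, loss_pct, rate_mbps):
    """Commands to (re)apply rate + delay/loss on one interface.

    `replace` (rather than `add`) makes repeated per-tick calls idempotent,
    as the checklist requires.
    """
    return [
        # Root: token-bucket filter for the rate cap.
        ["tc", "qdisc", "replace", "dev", dev, "root", "handle", "1:",
         "tbf", "rate", f"{rate_mbps:g}mbit", "burst", "32kbit", "latency", "400ms"],
        # Child: netem for delay and loss.
        ["tc", "qdisc", "replace", "dev", dev, "parent", "1:1", "handle", "10:",
         "netem", "delay", f"{delay_ms:g}ms", "loss", f"{loss_pct:g}%"],
    ]


cmds = shaping_cmds("s1-eth1", delay_ms=12.5, loss_pct=0.5, rate_mbps=20)
for cmd in cmds:
    print(shlex.join(cmd))
```

Building argv lists (not shell strings) avoids quoting bugs and makes the per-tick shaping log trivially reproducible.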
### 9.3 OMNeT lane adapter (trace-first v1)
- [ ] Implement `lanes/omnet_lane/adapter.py`
  - [ ] v1 assumption: OMNeT consumes a trace file (offline) rather than a live stream
- [ ] Implement `lanes/omnet_lane/trace_ingest.py`
  - [ ] orchestrator writes a LinkState trace file suitable for the OMNeT adapter
  - [ ] define a simple trace format:
    - [ ] JSONL per tick containing updates/removals
    - [ ] or CSV-like with (tick, src, dst, up, delay, rate, loss)
  - [ ] ensure deterministic ordering of entries
- [ ] Implement `lanes/omnet_lane/runner.py`
  - [ ] launch the OMNeT simulation via subprocess:
    - [ ] capture stdout/stderr to run logs
    - [ ] exit-code handling
  - [ ] place outputs into the artifacts directory
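A sketch of the deterministic JSONL trace format (one line per tick, sorted keys, sorted entries); the field names are assumptions consistent with the checklist's CSV-like alternative:

```python
import json


def tick_record(tick_index, updates, removals):
    """Serialize one tick as a single JSONL line.

    Sorted keys and sorted entry lists make the trace byte-identical across
    runs with the same inputs, which the checklist requires.
    """
    rec = {
        "tick": tick_index,
        "removals": sorted(removals),
        "updates": sorted(updates, key=lambda u: (u["src"], u["dst"])),
    }
    return json.dumps(rec, sort_keys=True)


line = tick_record(
    7,
    updates=[{"src": "SAT1", "dst": "GS1", "up": True,
              "delay_ms": 12.5, "rate_mbps": 20.0, "loss_pct": 0.1}],
    removals=["SAT2-GS1"],
)
print(line)
```

The OMNeT adapter then only needs a line-by-line reader; no shared schema library is required between the two sides.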
## 10) Orchestrator main runtime loop

### 10.1 Lifecycle coordination
- [ ] Implement `satsim_orch/main.py` orchestration steps:
  - [ ] Create run context + artifact directories
  - [ ] Log environment + versions
  - [ ] Initialize the Geo/RF client and fetch version/capabilities
  - [ ] Create the Geo/RF scenario
  - [ ] Instantiate chosen lane adapters (mininet/omnet/parallel)
  - [ ] Call `prepare()` for each lane
  - [ ] Start stream consumer tasks
  - [ ] Start the realtime pacing task only when `time.mode=REALTIME`
  - [ ] Await completion conditions:
    - [ ] reached t_end
    - [ ] user stop signal (CTRL+C)
    - [ ] error in any task
  - [ ] Finalize lanes
  - [ ] Close the Geo/RF scenario
  - [ ] Write the final manifest + summary
### 10.2 Streaming-driven execution (locked)
- The Geo/RF stream is the authoritative tick source.
- The orchestrator does not generate ticks; it consumes them and fans out.

Tasks:
- [ ] In the streaming consumer, for each LinkDeltaBatch:
  - [ ] translate to `TickUpdate`
  - [ ] push to the fanout bus

### 10.3 Fanout to lanes
- [ ] For each lane, run a consumer task:
  - [ ] `while True: tick = await queue.get(); await lane.apply_tick(tick)`
  - [ ] handle cancellation and lane errors
- [ ] Implement strict ordering:
  - [ ] do not allow a lane to process tick k+1 before tick k
- [ ] Implement shutdown handshake:
  - [ ] send `RunControl(STOP)` to lanes on exit
  - [ ] drain queues if configured
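The per-lane consumer loop with strict ordering and a shutdown handshake can be sketched as follows; a `None` item stands in for `RunControl(STOP)`, and error propagation is reduced to a callback:

```python
import asyncio


async def lane_consumer(name, queue, apply_tick, on_error):
    """Per-lane consumer: strictly ordered and cancellation-aware.

    Ticks are awaited one at a time, so a lane can never start tick k+1
    before tick k completes. A None item is the STOP sentinel.
    """
    try:
        while True:
            tick = await queue.get()
            if tick is None:          # shutdown handshake
                break
            await apply_tick(tick)
            queue.task_done()
    except asyncio.CancelledError:
        raise                         # let the orchestrator observe cancellation
    except Exception as exc:
        on_error(name, exc)           # trigger coordinated shutdown upstream


async def _demo():
    q = asyncio.Queue(maxsize=4)
    applied = []

    async def apply(tick):
        applied.append(tick)

    task = asyncio.create_task(lane_consumer("mininet", q, apply, lambda n, e: None))
    for i in range(3):
        await q.put(i)
    await q.put(None)                 # STOP sentinel
    await task
    return applied


applied = asyncio.run(_demo())
print(applied)  # → [0, 1, 2]
```

Running one such task per lane, all fed from the fanout bus, gives the parallel mode its identical-tick guarantee for free.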
## 11) Error handling and shutdown

### 11.1 Exception strategy
- [ ] Any uncaught exception in any of the following triggers a coordinated shutdown:
  - the Geo/RF stream consumer
  - any lane consumer
  - any lane adapter method
- [ ] Implement `runtime/process.py`:
  - [ ] subprocess management with kill/terminate escalation
  - [ ] collect exit codes and stderr tails
- [ ] Add SIGINT/SIGTERM handling:
  - [ ] first CTRL+C: graceful stop
  - [ ] second CTRL+C: immediate stop

### 11.2 Cleanup correctness
- [ ] Always attempt `finalize()` on lanes and `close_scenario()` on the Geo/RF engine, even when errors occur.
- [ ] Write the final manifest, including the failure reason.
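A possible shape for the two-stage CTRL+C handling (Unix-only: `loop.add_signal_handler` raises `NotImplementedError` on Windows event loops); the escalation policy mirrors the checklist, and everything beyond it is illustrative:

```python
import asyncio
import signal


def install_signal_handlers(loop, stop_event):
    """First SIGINT/SIGTERM: set the stop event (graceful finalize).
    A second signal: hard-cancel every task for an immediate stop."""
    state = {"hits": 0}

    def on_signal():
        state["hits"] += 1
        if state["hits"] == 1:
            stop_event.set()                      # graceful: drain, finalize lanes
        else:
            for task in asyncio.all_tasks(loop):  # immediate: cancel everything
                task.cancel()

    for sig in (signal.SIGINT, signal.SIGTERM):
        loop.add_signal_handler(sig, on_signal)
    return on_signal


async def _demo():
    loop = asyncio.get_running_loop()
    stop = asyncio.Event()
    on_signal = install_signal_handlers(loop, stop)
    on_signal()  # simulate the first CTRL+C without sending a real signal
    for sig in (signal.SIGINT, signal.SIGTERM):
        loop.remove_signal_handler(sig)
    return stop.is_set()


graceful = asyncio.run(_demo())
print(graceful)  # → True
```

The main loop then awaits `stop_event` alongside its consumer tasks and runs the 11.2 cleanup path before exiting.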
## 12) Metrics and run summaries

### 12.1 Metric recording
- [ ] Implement `metrics/records.py`
  - [ ] standard metric record format for:
    - [ ] tick compute time
    - [ ] links emitted
    - [ ] lane apply times (optional)
- [ ] Implement `metrics/exporters.py`
  - [ ] JSONL writer to `metrics/`
  - [ ] optional Prometheus exporter
- [ ] Implement per-tick timing:
  - [ ] time spent translating batches
  - [ ] time spent applying to each lane

### 12.2 Summary report (v1)
- [ ] Write a `summary.json` at the end of the run:
  - [ ] total ticks, total links emitted, mean compute time, runtime duration
  - [ ] lane success/failure states
  - [ ] artifact paths (pcaps, traces, logs)
## 13) Integration tests (practical, not huge)

### 13.1 Smoke tests
- [ ] `test_geomrf_client_smoke.py`
  - [ ] connect to the Geo/RF engine on localhost
  - [ ] create a tiny scenario (1 GS + 1 SAT)
  - [ ] stream the first 3 ticks and assert non-empty output
- [ ] `test_event_alignment_smoke.py`
  - [ ] request deltas/events with identical `t_start`/`t_end`/`dt`/selector
  - [ ] assert each event has a `tick_index` and maps to an existing/expected delta tick

### 13.2 Bus correctness
- [ ] `test_bus_fanout.py`
  - [ ] ensure ticks are delivered to all lanes in order
  - [ ] ensure backpressure blocks the producer when a lane queue is full

### 13.3 Run manifest correctness
- [ ] `test_run_manifest.py`
  - [ ] run manager writes expected keys
  - [ ] manifest includes versions and config snapshot
## 14) Minimum viable orchestrator (v1) — acceptance criteria
- [ ] Can run `satsim run scenario.yaml --mode mininet`
  - [ ] Geo/RF scenario created
  - [ ] link deltas streamed and applied via `tc`/netem
  - [ ] PCAP recorded (optional)
  - [ ] run artifacts written (logs, manifest)
- [ ] Can run `satsim run scenario.yaml --mode omnet`
  - [ ] Geo/RF stream recorded to a trace
  - [ ] OMNeT launched consuming the trace (or stubbed with a clear TODO if not ready)
  - [ ] run artifacts written
- [ ] Can run `satsim run scenario.yaml --mode parallel`
  - [ ] both lane adapters receive identical tick updates
  - [ ] lane adapters derive control only from link deltas
  - [ ] optional events are captured in artifacts without driving lane state
  - [ ] orchestrator shuts down cleanly on completion or CTRL+C
## 15) Optional but valuable v1.1 tasks (safe additions)
- [ ] Orchestrator exposes its own gRPC stream `StreamTickUpdates` so lanes can subscribe remotely
- [ ] Add a NATS internal-bus option for multi-process fanout
- [ ] Add a replay command: `satsim replay <run_id>` (uses the stored trace)
- [ ] Add a sweep runner: parameter grid search with repeated runs and a consolidated summary
# Geometry/RF Engine Test Suite Plan
This checklist tracks the work to build and verify a comprehensive RPC-focused test suite for `geomrf-engine`.

## 0) Deliverables
- [x] Add a dedicated gRPC service test module that exercises all six RPCs.
- [x] Validate success- and error-path behavior for lifecycle and streaming RPCs.
- [x] Produce an updated coverage report and capture gaps.
- [x] Keep this checklist updated as tasks are completed.

## 1) Baseline and scope
- [x] Confirm the current tests/coverage baseline before adding new RPC tests.
- [x] Confirm the test scenario strategy (deterministic helper scenario; compatible with the 027 overhead-pass-style TLE + GS setup).

## 2) Test infrastructure
- [x] Add an in-process gRPC test harness (ephemeral port, async channel/stub, clean teardown).
- [x] Add shared helpers for creating/closing scenarios from tests.
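The in-process harness might look like this with `grpc.aio`; the `register` callback would attach the generated `add_*Servicer_to_server` binding for the servicer under test (that binding name, and everything beyond the checklist items, is an assumption):

```python
import asyncio
import contextlib

import grpc


@contextlib.asynccontextmanager
async def grpc_harness(register):
    """In-process gRPC harness: ephemeral port, async channel, clean teardown.

    `register` attaches the servicer under test, e.g.
    lambda s: add_GeomrfServicer_to_server(MyServicer(), s)  # name assumed
    """
    server = grpc.aio.server()
    register(server)
    port = server.add_insecure_port("127.0.0.1:0")  # OS-assigned ephemeral port
    await server.start()
    channel = grpc.aio.insecure_channel(f"127.0.0.1:{port}")
    try:
        yield channel
    finally:
        await channel.close()
        await server.stop(grace=None)  # immediate, deterministic teardown


async def _smoke():
    # No servicer registered: just prove start/connect/teardown works.
    async with grpc_harness(lambda s: None) as channel:
        return channel is not None


ok = asyncio.run(_smoke())
print(ok)  # → True
```

Wrapping this in a pytest-asyncio fixture gives every RPC test a fresh server and channel without port collisions between parallel test runs.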
## 3) RPC lifecycle tests
- [x] `GetVersion` returns expected identity/schema metadata.
- [x] `GetCapabilities` returns expected limits and feature flags.
- [x] `CreateScenario` success path returns a `scenario_ref`.
- [x] `CreateScenario` invalid-spec path returns `INVALID_ARGUMENT`.
- [x] `CloseScenario` success path returns `ok=true`.
- [x] `CloseScenario` unknown-scenario path returns `NOT_FOUND`.

## 4) Streaming RPC tests
- [x] `StreamLinkDeltas` success path returns ordered batches with snapshot metadata.
- [x] `StreamLinkDeltas` unknown-scenario path returns `NOT_FOUND`.
- [x] `StreamLinkDeltas` closed-scenario path returns `FAILED_PRECONDITION`.
- [x] `StreamLinkDeltas` invalid time parameters return `INVALID_ARGUMENT`.
- [x] `StreamEvents` success path returns well-formed events for the test scenario.
- [x] `StreamEvents` filtered path validates event-filtering behavior.
- [x] `StreamEvents` unknown-scenario path returns `NOT_FOUND`.
- [x] `StreamEvents` closed-scenario path returns `FAILED_PRECONDITION`.
## 5) Execution and coverage
- [x] Run the full test suite and ensure all tests pass.
- [x] Run coverage scoped to `geomrf_engine`.
- [x] Verify `server.py` and the stream/event modules are covered by tests.
- [x] Document final coverage numbers and remaining gaps.

## 6) Results summary
- Test count: 20 passed.
- Coverage total (`geomrf_engine`): 85% (820 statements, 120 missed).
- Core RPC implementation coverage: `server.py` at 79%, `streaming/events.py` at 96%, `streaming/backpressure.py` at 80%, `util/logging.py` at 92%.
- Remaining notable gaps captured for follow-up: evaluator branch coverage (56%) and delta-threshold branch coverage (71%).
## 7) v1.1 follow-up (event alignment)
- [ ] Add `StreamEvents` alignment tests for request `dt` semantics (`default_dt` fallback + invalid-range rejection).
- [ ] Add selector-parity tests ensuring event selection mirrors `StreamLinkDeltas` selection.
- [ ] Add assertions that every emitted `EngineEvent` carries a `tick_index`.
- [ ] Add a cross-stream alignment test: the same window/dt/selector for events + deltas yields a consistent tick mapping.
- [ ] Extend error-path coverage for the new event request fields.