
000 How I AI.md

Smol Example:

ChatGPT Web:

<After workshopping the specs/requirements for a while...>

Write an extremely detailed implementation doc for the streaming version and the APIs. We are only looking at the geometry/RF engine right now. We need to be able to hand the doc you make off to an LLM to code the geometry engine API up from the existing python code. Make it so. Extremely detailed. Markdown checkboxes for each discrete task - [ ] so the AI can mark them off with an X as they complete

The codex-cli can start working on the output TASKS_GEOENGINE.md.

Implementing component 3 (of 4):

ChatGPT Web:

I pasted in all the TASKS_*.md the AI coder has produced so far, so the web planner has an idea of what was done. (If we didn't have access restrictions, the web version could just access the git repo).

Okay, here are the markdown taskfiles of what we've got done so far. We built the geo engine and orchestrator. I need you to write a very detailed TASKS_IMNET.md file that deals with the OMNeT++/INET part and how it interacts with the other two systems as they exist. Write it as a markdown checklist - [ ] so the AI can cross items off as it writes them. Go.

This produces TASKS_ORCHESTRATOR.md

Codex CLI:

Check over TASKS_ORCHESTRATOR.md. It was written without knowledge of the actual current implementation. Let me know if it looks good for ARCHITECTURE.md and if it is compatible with what has been done so far in TASKS_GEO_ENGINE.md. We have only tried to make a Geo engine so far. No edits, just check that TASKS_ORCHESTRATOR.md makes sense.

  Overall verdict: it is mostly aligned with ARCHITECTURE.md and broadly compatible with the Geo engine work, but the 3 medium/high items above should be clarified before implementation so the orchestrator doesn’t encode wrong assumptions.

This step is super important! I need to make sure the plan we thought of actually makes sense with what is happening on the ground.

Can the misaligned items be easily solved? Is there a mostly obvious right answer to them? If so, tell me the obvious solutions, or show me my options if there's a fork or a hard choice.

<More workshopping of differences between web and implementation. The web has the pdf source documents and better web search, and has unlimited usage, so I did planning there>

Codex CLI:

@AGENTS.md Your task is to implement the geometry engine defined in @TASKS_GEO_ENGINE.md. After completing each task, mark off its markdown checkbox with an x (- [x]) so there is an external record of what has been done. If plans change, then modify the task list appropriately. The overall high level architecture of the program is in @ARCHITECTURE.md. Go.

This is where the magic happens! We have thought through our API and have developed a test plan and a development plan. Now the AI can develop the code and run the test suite against the API to ensure it's correct.


Overall Advice:

  • Think about your inputs/outputs/dependencies beforehand.

  • When the AI screws up, hallucinates, or does something silly, you can make a note in AGENTS.md to do the right thing instead.

  • Force the AI to use as many deterministic static tools as possible:

    • Strict Type Checking (Use Rust instead of C, TypeScript instead of JavaScript, Python with Type Annotations instead of without)
    • Linters
    • Code Format Tools
    • Unit/Integration tests
  • Have the AI write as many tests as possible of what you want the program to do

  • If you want the AI to "one-shot" (i.e., autonomously code something complex for a while without supervision and get a good result), then you need to give it as much test input/output behaviour as possible, so it can keep checking against the "proper" results without your guidance.
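The "one-shot" advice above can be made concrete with a test file that pins down expected input/output behaviour. A minimal sketch — the function and the numbers are illustrative stand-ins, not from this repo:

```python
# Hypothetical example of giving the AI ground truth to check against:
# concrete input/output pairs expressed as plain pytest-style tests.

def snr_margin_db(received_power_dbw: float, noise_power_dbw: float,
                  required_snr_db: float) -> float:
    """Link-budget-style margin: achieved SNR minus required SNR."""
    return (received_power_dbw - noise_power_dbw) - required_snr_db

def test_snr_margin_examples():
    # Each assertion is a "proper" result the AI can re-verify without guidance.
    assert snr_margin_db(-120.0, -130.0, 6.0) == 4.0
    assert snr_margin_db(-125.0, -130.0, 6.0) == -1.0  # negative margin: link infeasible
```

The more of these pairs exist before the autonomous run starts, the less the AI has to guess about what "correct" means.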

AGENTS.md

Python Environment And Package Policy

Use uv for all Python workflows in this repository.

Rules

  • Always run Python commands with uv run.
  • Always add dependencies with uv add.
  • Use uv venv for virtual environment setup/management.
  • Do not use pip, pip3, python -m pip, virtualenv, or python -m venv.
  • Do not install dependencies outside uv.

Examples

  • Run app: uv run python main.py
  • Add runtime dependency: uv add requests
  • Add dev dependency: uv add --dev pytest

Test Coverage Policy

Use pytest-cov via uv for coverage checks.

Rules

  • If coverage flags are needed and pytest-cov is missing, install it with uv add --dev pytest-cov.
  • For the workspace package, install coverage tooling with uv add --package geomrf-engine --dev pytest-cov.
  • Run coverage with uv run pytest ... --cov=....
  • Because both root and geomrf-engine use a tests package name, run coverage in two passes and combine reports instead of a single mixed pytest invocation.

Coverage command pattern

  • Orchestrator pass: COVERAGE_FILE=.coverage.orch uv run pytest tests --cov=satsim_orch --cov-report=
  • Geo engine pass: COVERAGE_FILE=.coverage.geomrf uv run --package geomrf-engine pytest geomrf-engine/tests --cov=geomrf_engine --cov-report=
  • Combine/report: uv run coverage combine .coverage.orch .coverage.geomrf && uv run coverage report -m

OMNeT++/INET Environment Policy

Use opp_env for OMNeT++/INET install and environment management in this repository.

Rules

  • Use opp_env to install/manage OMNeT++ and INET versions.
  • Do not rely on ad-hoc/manual OMNeT++ or INET installs for project workflows.
  • Keep OMNeT++/INET version selection pinned and reproducible across dev and CI.

Scope boundary

  • Python dependency and execution workflows remain uv-managed.
  • OMNeT++/INET toolchain workflows are opp_env-managed.

Legacy Code Reuse Policy

If functionality from old_code/ is needed:

  • Do not import or execute from old_code/ directly.
  • Copy the required snippet(s) into a new file under the active codebase.
  • Adapt and maintain the copied code in the active module only.

Sandbox / Network Restriction Policy

If a task is blocked by sandbox or network restrictions:

  • Stop immediately and do not spend tokens repeatedly trying to bypass restrictions.
  • Clearly tell the user that we are in a sandbox-restricted environment.
  • Ask the user for permission before attempting any sandbox breakout or elevated access.
ARCHITECTURE.md
# ARCHITECTURE.md — SatSim System Overview

This document gives the **system-level architecture** for SatSim. It is intended to provide a complete “sight picture” for anyone implementing a subproject (e.g., the Geometry/RF Engine) so they understand how their component fits into the larger simulator.

---

## 1) Purpose and guiding idea

SatSim is a **hybrid satellite networking simulator** that combines:

1) A swappable **Geometry/RF/Link-Budget Engine** (physics + propagation + link feasibility)
2) A **packet-level discrete-event simulation lane 'OMNeT++/INET'** (scale + protocol behavior)
3) A **real SDN emulation lane 'Mininet/OVS'** (controller-in-the-loop + real Linux networking)
4) An **Orchestrator** that provides a single scenario/timebase and keeps all parts consistent

The fundamental design choice is that SatSim is **layered and composable**: we reuse mature simulators/emulators and treat satellite physics as an external service with a stable interface.

---

## 2) Key design decisions (why this looks the way it does)

### 2.1 Why two lanes (simulation vs emulation)
We intentionally run two different lanes because they answer different questions:

- **OMNeT++/INET lane (Discrete-Event Simulation)**
  - Best for: scaling up to many nodes, protocol studies, routing and congestion behavior, reproducibility.
  - Not best for: running real SDN controllers and real Linux TCP stacks.

- **Mininet/OVS lane (Network Emulation)**
  - Best for: real SDN controllers (ONOS/Ryu), real forwarding behavior (OpenFlow/OVS), real apps/traffic tools.
  - Not best for: scaling to thousands of nodes with full protocol stacks.

Trying to “pipe packets” between them is possible but usually not worth it early, because it introduces hard time synchronization problems (DES time vs wall-clock time) and packet bridging complexity. Instead we connect both lanes to the same **state oracle** (the Geo/RF engine) through the Orchestrator.

### 2.2 Where the lanes *do* meet today
They meet at:
- **Scenario definition** (same nodes, same constraints, same time window)
- **LinkState/Event timeline** (same “truth” about which links exist and their properties)
- **Metrics and artifacts** (comparable outputs; shared logging/PCAP strategy)

Optionally, they also meet via:
- **Shared SDN decision logic** (same ONOS/Ryu app used to compute routes, then applied in both lanes through adapters)

### 2.3 Future “stacking” (OMNeT++ feeding Mininet)
In the future, OMNeT++ may “feed” Mininet in two practical ways:

1) **Trace-driven replay (recommended future path)**
   - OMNeT++ generates a curated set of traces (topology/failure schedules, traffic demands, baseline routing decisions).
   - Mininet replays those traces in real-time to validate controller behavior under identical conditions.

2) **Hard co-simulation / packet bridging (advanced, optional)**
   - Some nodes simulated in OMNeT++, others emulated in Mininet at the same time.
   - Requires strict time coupling and a gateway that transforms/timeshifts packets.
   - Not a v1 target.

### 2.4 Locked decisions (2026-02-18)
- **Tick authority:** `StreamLinkDeltas` is the control-plane source of truth for lane updates.
- **Events contract:** event streaming is retained, but aligned to the same requested `dt` and `selector`, and each event carries `tick_index`.
- **Orchestrator behavior:** streaming-driven execution is canonical; any scheduler is pacing-only.
- **Scenario translation:** orchestrator must fail-fast when it cannot produce a valid Geo/RF `ScenarioSpec`.
- **Python tooling:** Python workflows use `uv`; OMNeT++/INET workflows use `opp_env`.

---

## 3) Top-level components

### 3.1 Geometry/RF/Link-Budget Engine (black box, replaceable)
**Role:** The authoritative “physics layer” that translates orbital/propagation reality into network-usable link state.

**Key properties**
- Replaceable implementation (Skyfield + ITU-R today, could be STK import or other later)
- Stable interface (the rest of SatSim depends only on its API)
- Produces time-indexed:
  - Link feasibility (up/down)
  - Link properties (delay/capacity/loss proxies)
  - Discrete events (link up/down, handover, failures if modeled)

**Location:** `geomrf-engine/`

---

### 3.2 Orchestrator (system conductor)
**Role:** Owns the simulation lifecycle and timebase. It is the “brain” that coordinates all lanes.

**Responsibilities**
- Load scenario config → create/initialize the Geo/RF engine scenario
- Choose execution mode:
  - OMNeT-only, Mininet-only, or both in parallel
- Drive execution pacing:
  - offline apply-fast or real-time apply-paced, while consuming authoritative engine stream ticks
- Consume Geo/RF LinkState stream and distribute it to:
  - OMNeT adapter
  - Mininet adapter
  - logging/metrics
- Collect artifacts (PCAPs, timeseries metrics, configs, run manifests)
- Provide reproducible run IDs and version stamping

**Location:** `orchestrator/`

---

### 3.3 OMNeT++/INET Lane (packet-level discrete-event)
**Role:** Packet-level simulation of protocols, queuing, routing, traffic at scale.

**Responsibilities**
- Build the network node models (routers, hosts, queues) using INET components
- Apply dynamic link updates (delay/capacity/loss/up-down) based on Geo/RF output
- Run deterministic experiments rapidly (sweeps)
- Export artifacts:
  - logs + metrics
  - optional PCAP outputs (where supported)

**Custom SatSim additions**
- A lightweight **LinkState Adapter Module** that subscribes to orchestrator/Geo output
- A mechanism to apply link changes at simulation timestamps

**Environment and install management**
- Use `opp_env` as the standard way to install/manage OMNeT++ and INET.
- Avoid ad-hoc/manual OMNeT++/INET installs in project workflows.

**Location:** `lanes/omnet/`

---

### 3.4 Mininet/OVS Lane (SDN emulation)
**Role:** Real SDN controller + real forwarding plane under dynamic link conditions.

**Responsibilities**
- Build an emulated topology with Mininet (or Containernet)
- Use OVS as the dataplane switch/router substrate
- Run a real SDN controller (ONOS or Ryu)
- Apply dynamic link shaping based on Geo/RF output:
  - `tc/netem` for delay/loss/jitter
  - `tbf/htb` for rate control
  - interface up/down to emulate link drops
- Generate traffic using real tools:
  - iperf3, D-ITG, SIPp, tcpreplay, custom apps

**Location:** `lanes/mininet/`
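
As an illustration of the shaping responsibilities above (not the project's actual adapter), one tick's link properties could be translated into `ip`/`tc` commands like this; the device name and numbers are hypothetical, and netem's `rate` option assumes a reasonably modern kernel (older setups would hang a `tbf`/`htb` qdisc underneath instead):

```python
# Hypothetical helper: map one link's NetworkView state onto Linux shaping
# commands. Interface up/down emulates link drops; netem carries delay,
# loss, and rate.

def shaping_cmds(dev: str, delay_s: float, capacity_bps: float,
                 loss_rate: float, up: bool) -> list[str]:
    if not up:
        return [f"ip link set {dev} down"]  # link drop
    delay_ms = delay_s * 1000.0
    loss_pct = loss_rate * 100.0
    rate_kbit = capacity_bps / 1000.0
    return [
        f"ip link set {dev} up",
        f"tc qdisc replace dev {dev} root netem "
        f"delay {delay_ms:.3f}ms loss {loss_pct:.2f}% rate {rate_kbit:.0f}kbit",
    ]

cmds = shaping_cmds("s1-eth1", delay_s=0.025, capacity_bps=10_000_000,
                    loss_rate=0.01, up=True)
# cmds[1] applies 25 ms delay, 1% loss, and a 10 Mbit/s rate in one qdisc
```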

---

### 3.5 Observability, artifacts, and visualization
**Role:** Make runs inspectable, comparable, and reproducible.

**Artifacts**
- Scenario config snapshot + run manifest (versions, seeds, git SHAs)
- LinkState/Event traces (optional export)
- Metrics time-series (throughput/delay/loss/path changes)
- PCAP captures (Mininet tcpdump; OMNeT if enabled)

**Tools**
- Prometheus + Grafana for dashboards
- Wireshark for PCAP analysis

**Location:** `observability/` and `artifacts/`

---

## 4) System boundaries and data ownership

### 4.1 The Geo/RF engine owns *physics truth*
- It is the source of truth for which links can exist and their physical/network properties.
- Other components must not invent geometry/rf; they only consume the engine’s output.

### 4.2 The Orchestrator owns *time and execution*
- It defines run window requests, pacing mode, and synchronization rules.
- For v1/v1.1, tick production comes from Geo/RF stream output rather than orchestrator-generated ticks.
- It routes updates to the lanes and standardizes artifacts.

### 4.3 Each lane owns *packet/control behavior*
- OMNeT owns packet-level behavior inside DES.
- Mininet owns real SDN and Linux networking behavior.

---

## 5) Core data flows (end-to-end)

### 5.1 Initialization flow
1. User provides `ScenarioConfig` (YAML/JSON).
2. Orchestrator validates config and creates a new run ID.
3. Orchestrator calls Geo/RF engine:
   - `CreateScenario` (returns scenario ref)
4. Orchestrator initializes selected lane(s):
   - OMNeT: compile/load model, start run
   - Mininet: build topology, start controller
5. Orchestrator subscribes to Geo/RF streaming output for LinkState and Events.

### 5.2 Runtime (parallel lane mode)
At each time tick:
1. Geo/RF produces `LinkDeltaBatch` + optional events.
2. Orchestrator receives it and distributes:
   - OMNeT adapter: update channel/link state in simulator time
   - Mininet adapter: apply tc/netem shaping and link toggles
   - (optional) Event recorder: store aligned `EngineEvent` stream for analysis/observability
   - Observability: record metrics and store link traces
3. Lanes generate traffic and produce metrics/PCAPs.
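
The fan-out in step 2 can be sketched as a simple publish-to-sinks loop. This is a hedged illustration — the dataclasses stand in for the generated protobuf messages, and adapter registration is an assumed design, not the mandated one:

```python
# Illustrative orchestrator fan-out: every registered sink (OMNeT adapter,
# Mininet adapter, metrics recorder) sees the same authoritative tick.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class LinkUpdate:          # stand-in for the protobuf LinkUpdate
    src: str
    dst: str
    up: bool
    one_way_delay_s: float
    capacity_bps: float
    loss_rate: float

@dataclass
class LinkDeltaBatch:      # stand-in for the protobuf LinkDeltaBatch
    tick_index: int
    updates: list[LinkUpdate] = field(default_factory=list)

class Orchestrator:
    def __init__(self) -> None:
        self.sinks: list[Callable[[LinkDeltaBatch], None]] = []

    def register(self, sink: Callable[[LinkDeltaBatch], None]) -> None:
        self.sinks.append(sink)

    def on_batch(self, batch: LinkDeltaBatch) -> None:
        for sink in self.sinks:   # distribute one tick to every lane/recorder
            sink(batch)

seen: list[int] = []
orch = Orchestrator()
orch.register(lambda b: seen.append(b.tick_index))  # stand-in OMNeT adapter
orch.register(lambda b: seen.append(b.tick_index))  # stand-in Mininet adapter
orch.on_batch(LinkDeltaBatch(tick_index=7))         # both sinks see tick 7
```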

### 5.3 Completion flow
1. Orchestrator stops lane processes.
2. Orchestrator closes the Geo/RF scenario.
3. All artifacts are written under the run ID.

---

## 6) Timebase and execution modes

SatSim supports multiple execution modes, controlled by the Orchestrator:

### Mode A — OMNeT-only (offline DES)
- Orchestrator consumes Geo/RF ticks and applies them to OMNeT without wall-clock pacing.
- Highest scalability and repeatability.

### Mode B — Mininet-only (real-time emulation)
- Orchestrator consumes Geo/RF ticks and applies wall-clock pacing while updating Mininet shaping.
- Best for SDN/controller realism and app-level testing.

### Mode C — Parallel (OMNeT + Mininet simultaneously)
- Both lanes consume the same LinkState stream.
- Used to compare “simulated protocol outcomes” vs “real controller outcomes” under the same link dynamics.

### Mode D — Trace-driven replay (future/optional)
- Geo/RF and/or OMNeT exports a trace.
- Mininet replays trace deterministically.

---

## 7) Interfaces between components (high-level)

### 7.1 Geo/RF Engine interface (v1)
- gRPC service, Protobuf messages
- Scenario lifecycle + streaming link deltas/events
- Output is **NetworkView** link properties (up/down, delay, capacity, loss proxy)
- Optional debug scalars for validation (SNR margin, elevation, range)
- Event-stream alignment target: events use same requested window/selector/dt semantics as deltas and expose `tick_index`.

### 7.2 Orchestrator ↔ OMNeT interface
- OMNeT subscribes to orchestrator updates via:
  - gRPC client inside a C++ adapter module, OR
  - file/trace ingestion for offline runs
- Applies updates to INET channel/link parameters and toggles connectivity

### 7.3 Orchestrator ↔ Mininet interface
- Orchestrator controls Mininet via:
  - Python Mininet API calls
  - Linux `tc` and interface management commands
- SDN controller is external (ONOS/Ryu), connected in the standard Mininet way

---

## 8) Reproducibility rules

Each run must record:
- ScenarioConfig snapshot
- Seeds
- Engine versions:
  - Geo/RF engine version + schema version
  - orchestrator version
  - OMNeT/INET versions and model git SHA
  - `opp_env` environment definition/metadata used for OMNeT/INET
  - controller version and app git SHA
- LinkState trace hash (if stored)
- Toolchain/container image tags (if containerized)

---

## 9) Suggested monorepo layout

satsim/
  ARCHITECTURE.md
  orchestrator/
    ...
  subprojects/
    geomrf-engine/
      ARCHITECTURE.md      # subproject-specific (the streaming API spec lives here)
      proto/
      src/
      tests/
  lanes/
    omnet/
      models/
      adapter/
      scripts/
    mininet/
      topo/
      driver/
      controllers/
      scripts/
  observability/
    grafana/
    prometheus/
    dashboards/
  artifacts/
    runs/
      <run_id>/
        scenario.yaml
        manifest.json
        linkstate.parquet  # optional
        metrics/
        pcaps/
        logs/


---

## 10) What an implementer of the Geo/RF engine must know

- The Geo/RF engine must be treated as **the physics oracle**.
- Its output must be:
  - time-indexed
  - sparse (selector-driven)
  - stable and deterministic
  - expressed in consistent units
- The Orchestrator will use it in both:
  - offline sampling (for OMNeT)
  - real-time streaming (for Mininet)
- The lanes do not need to know how link budgets are computed—only how to consume the streaming LinkDelta/Event outputs.

---

## 11) Roadmap hooks (explicit future extensions)

- Add a richer PHY view (optional fields) without breaking NetworkView consumers.
- Add trace import/replay for deterministic mininet runs.
- Add “shared SDN decision interface” so ONOS/Ryu path computation can be applied inside OMNeT.
- Add advanced co-simulation only if required (packet bridging).

---
README.md

SatSim docs index

Primary design and task documents:

  • ARCHITECTURE.md
  • TASKS_GEO_ENGINE.md
  • TASKS_IMNET.md
  • TASKS_ORCHESTRATOR.md
  • TASKS_TESTSUITE_GEOENGINE.md

Locked design decisions (2026-02-18):

  • Control-plane tick authority is StreamLinkDeltas (streaming-driven orchestration).
  • Event stream alignment is being standardized with request dt/selector and event tick_index.
  • Orchestrator error handling includes NOT_FOUND, INVALID_ARGUMENT, FAILED_PRECONDITION, and RESOURCE_EXHAUSTED.
  • Scenario translation to Geo/RF ScenarioSpec is fail-fast.
  • Python workflows are uv-managed; OMNeT++/INET workflows are opp_env-managed.
TASKS_GEO_ENGINE.md

Geometry/RF Engine v1/v1.1 Streaming API — Implementation Specification

This document specifies a complete, implementable Geometry/RF Engine API and server for streaming link-state deltas + events. It is written so another LLM can generate the code from existing Python geometry/RF/link-budget code (Skyfield + ITU-R + your models) with minimal guesswork.

Status note:

  • v1 baseline is implemented.
  • v1.1 alignment updates (for orchestrator compatibility) are now specified below, especially for StreamEvents tick alignment.

0) Deliverables

What must exist at the end

  • A runnable Python gRPC server that implements:

    • Scenario creation/closure
    • Capabilities/version endpoints
    • Streaming link deltas
    • Streaming events
  • A Protobuf schema with:

    • Stable IDs, time semantics, units
    • Selector logic (which links/nodes to compute)
    • Delta semantics (what counts as a “change”)
  • A reference Python client demonstrating:

    • Create scenario → stream deltas/events → close scenario

1) Core design constraints

1.1 Contract invariants

  • Scenario-scoped: All computation happens inside a ScenarioRef.
  • Time-indexed: All output is keyed by a timestamp and tick index.
  • Selector-driven: Never compute “all links” unless explicitly requested.
  • Streaming-first: Primary runtime interface is server→client stream.
  • Deterministic: Given identical inputs (scenario + seed + engine version), output is replayable.

1.2 What the engine outputs (NetworkView)

For each directed link (src → dst) at each tick, the engine provides:

  • up (boolean)
  • one_way_delay_s (float; seconds)
  • capacity_bps (float; bits per second)
  • loss_rate (float; [0,1] packet loss proxy OR PER proxy)
  • optional debug scalar(s): snr_margin_db, elevation_deg, range_m

Everything else can be exposed later via an optional “debug view”; v1 focuses on network-usable state.


2) Repository layout (recommended)

geomrf-engine/
  proto/
    geomrf/v1/geomrf.proto
  src/geomrf_engine/
    __init__.py
    server.py
    config_schema.py
    scenario_store.py
    timebase.py
    selectors.py
    compute/
      __init__.py
      ephemeris.py
      geometry.py
      rf_models.py
      link_budget.py
      adaptation.py
    streaming/
      __init__.py
      delta.py
      events.py
      backpressure.py
    util/
      ids.py
      units.py
      logging.py
      metrics.py
  examples/
    client_stream.py
  tests/
    test_proto_roundtrip.py
    test_delta_thresholds.py
    test_selectors.py
    test_determinism.py

3) Implementation tasks checklist

3.1 Project & build system

  • Create repo structure as above

  • Add pyproject.toml with dependencies:

    • grpcio, grpcio-tools, protobuf
    • pydantic (scenario validation)
    • pyyaml (YAML scenario input)
    • numpy, scipy (if used)
    • skyfield, sgp4
    • your ITU-R package(s)
    • prometheus-client (optional but recommended)
  • Add a Makefile or task runner:

    • uv run python -m grpc_tools.protoc ... compiles .proto to Python
    • uv run python -m geomrf_engine.server ... starts server
    • uv run pytest runs tests

3.2 Protobuf + gRPC schema

  • Write proto/geomrf/v1/geomrf.proto (spec below)
  • Generate Python stubs
  • Add schema version constants and embed in responses

3.3 Server skeleton

  • Implement async gRPC server (grpc.aio)

  • Wire servicer methods:

    • GetVersion
    • GetCapabilities
    • CreateScenario
    • CloseScenario
    • StreamLinkDeltas
    • StreamEvents
  • Add structured logging and request correlation IDs

3.4 Scenario lifecycle

  • Implement scenario validation (Pydantic)
  • Implement scenario store (in-memory for v1)
  • Implement scenario ID generation (UUIDv4)
  • Snapshot ScenarioSpec + resolved assets into a ScenarioRuntime

3.5 Compute pipeline

  • Implement ephemeris loader (TLE list initially)
  • Implement geometry evaluation (positions + visibility + elevation + range)
  • Implement RF/link budget mapping to NetworkLinkState
  • Implement adaptation mapping (SNR → capacity/loss) with a default policy
  • Implement per-tick evaluation returning sparse link set

3.6 Streaming + deltas/events

  • Implement tick loop (timebase)
  • Implement delta computation with thresholds
  • Implement event emission (link up/down, handover optional)
  • Implement backpressure-safe streaming
  • Add stream cancellation handling and cleanup
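
The delta-threshold rule (spelled out by `DeltaThresholds` in section 4) can be sketched in a few lines. This is an illustrative function, not the required implementation; it shows the contract: up/down flips always emit, a threshold of 0 means "emit on any change", and otherwise a field emits only when its absolute change exceeds the threshold:

```python
# Illustrative delta-emission check matching DeltaThresholds semantics.
def should_emit(prev: dict, cur: dict, thresholds: dict) -> bool:
    if prev["up"] != cur["up"]:
        return True  # up/down changes always emit (implicit rule)
    for field in ("one_way_delay_s", "capacity_bps", "loss_rate"):
        delta = abs(cur[field] - prev[field])
        thr = thresholds.get(field, 0.0)
        if thr == 0.0:
            if delta != 0.0:     # 0 threshold: any change emits
                return True
        elif delta > thr:        # otherwise: emit only past the threshold
            return True
    return False

prev = {"up": True, "one_way_delay_s": 0.020, "capacity_bps": 1e7, "loss_rate": 0.0}
cur  = {"up": True, "one_way_delay_s": 0.0201, "capacity_bps": 1e7, "loss_rate": 0.0}
# A 0.1 ms delay wobble is suppressed by a 1 ms threshold:
suppressed = not should_emit(prev, cur, {"one_way_delay_s": 0.001})
```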

3.7 Tests + examples

  • Determinism test (same scenario+seed → identical deltas)
  • Selector test (only requested links computed)
  • Threshold test (small changes suppressed)
  • Example client script (prints updates, counts links)
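
The determinism test has a simple shape: run the same scenario with the same seed twice and require identical delta streams. The toy "engine" below just draws seeded pseudo-random values to show the pattern; the real test would call the actual compute pipeline instead:

```python
# Shape of the determinism test: same scenario + seed => identical output.
import random

def toy_delta_stream(seed: int, ticks: int) -> list[tuple[int, float]]:
    rng = random.Random(seed)  # all randomness must flow from the seed
    return [(k, rng.uniform(0.0, 1.0)) for k in range(ticks)]

def test_determinism():
    assert toy_delta_stream(seed=42, ticks=100) == toy_delta_stream(seed=42, ticks=100)
    assert toy_delta_stream(seed=42, ticks=100) != toy_delta_stream(seed=43, ticks=100)
```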

4) gRPC/Protobuf specification (v1)

4.1 .proto (authoritative spec)

Create proto/geomrf/v1/geomrf.proto:

syntax = "proto3";

package geomrf.v1;

import "google/protobuf/timestamp.proto";
import "google/protobuf/duration.proto";

option go_package = "geomrf/v1;geomrfv1"; // harmless for other langs

// ---------------------------
// Service
// ---------------------------
service GeometryRfEngine {
  rpc GetVersion(GetVersionRequest) returns (GetVersionResponse);
  rpc GetCapabilities(GetCapabilitiesRequest) returns (GetCapabilitiesResponse);

  rpc CreateScenario(CreateScenarioRequest) returns (CreateScenarioResponse);
  rpc CloseScenario(CloseScenarioRequest) returns (CloseScenarioResponse);

  // Primary: stream sparse deltas per tick.
  rpc StreamLinkDeltas(StreamLinkDeltasRequest) returns (stream LinkDeltaBatch);

  // Primary: stream discrete events (optional separate channel for clean consumers).
  rpc StreamEvents(StreamEventsRequest) returns (stream EngineEvent);
}

// ---------------------------
// Version / capabilities
// ---------------------------
message GetVersionRequest {}

message GetVersionResponse {
  string engine_name = 1;            // e.g., "geomrf-engine"
  string engine_version = 2;         // semver, e.g., "1.0.0"
  string schema_version = 3;         // e.g., "geomrf.v1"
  string build_git_sha = 4;          // optional
}

message GetCapabilitiesRequest {}

message GetCapabilitiesResponse {
  string schema_version = 1;

  // Limits
  uint32 max_links_per_tick = 2;
  uint32 max_nodes = 3;
  uint32 max_streams_per_scenario = 4;
  google.protobuf.duration min_dt = 5;
  google.protobuf.duration max_dt = 6;

  // Supported outputs
  bool supports_loss_rate = 10;
  bool supports_capacity_bps = 11;
  bool supports_delay_s = 12;
  bool supports_snr_margin_db = 13;

  // Supported selectors/features (advertise so clients can adapt)
  bool supports_only_visible = 20;
  bool supports_min_elevation_deg = 21;
  bool supports_max_degree = 22;
  bool supports_link_types = 23; // GS-SAT, SAT-SAT, etc.
}

// ---------------------------
// Scenario lifecycle
// ---------------------------
message CreateScenarioRequest {
  ScenarioSpec spec = 1;
}

message CreateScenarioResponse {
  string scenario_ref = 1; // UUID string
  string schema_version = 2;
}

message CloseScenarioRequest {
  string scenario_ref = 1;
}

message CloseScenarioResponse {
  bool ok = 1;
}

// ---------------------------
// Scenario specification (v1)
// ---------------------------

message ScenarioSpec {
  // Reproducibility
  uint64 seed = 1;

  // Time model
  google.protobuf.timestamp t0 = 2;      // UTC
  google.protobuf.timestamp t1 = 3;      // UTC
  google.protobuf.duration default_dt = 4;

  // Nodes
  repeated NodeSpec nodes = 10;

  // Eligibility rules (which links can exist)
  LinkPolicy link_policy = 20;

  // Mapping PHY -> network outputs (can be simplistic in v1)
  AdaptationPolicy adaptation = 30;

  // Optional: engine-side caching hints
  CacheHints cache_hints = 40;
}

enum NodeRole {
  NODE_ROLE_UNSPECIFIED = 0;
  SATELLITE = 1;
  GROUND_STATION = 2;
  USER_TERMINAL = 3;
}

message NodeSpec {
  string node_id = 1;     // stable ID used everywhere
  NodeRole role = 2;

  // One of the following depending on role
  SatelliteOrbit orbit = 10;
  GroundFixedSite fixed_site = 11;

  // Radio/terminal model parameters (minimal v1)
  TerminalModel terminal = 20;

  // Arbitrary tags for selectors/grouping
  map<string,string> tags = 30;
}

message SatelliteOrbit {
  // v1: only TLE supported. Later: OEM/SP3/etc.
  string tle_line1 = 1;
  string tle_line2 = 2;
}

message GroundFixedSite {
  double lat_deg = 1;
  double lon_deg = 2;
  double alt_m = 3;
}

message TerminalModel {
  // Minimal knobs to compute link budgets consistently.
  // Units: dBW, dBi, Hz, K, etc.
  double tx_power_dbw = 1;
  double tx_gain_dbi = 2;          // can be treated as peak gain in v1
  double rx_gain_dbi = 3;          // can be treated as peak gain in v1
  double rx_noise_temp_k = 4;
  double bandwidth_hz = 5;
  double frequency_hz = 6;

  // Optional: simple pointing/antenna pattern loss approximation
  double pointing_loss_db = 10;    // default constant loss if you don’t model patterns yet
}

enum LinkType {
  LINK_TYPE_UNSPECIFIED = 0;
  GS_TO_SAT = 1;
  SAT_TO_GS = 2;
  SAT_TO_SAT = 3;
  UT_TO_SAT = 4;
  SAT_TO_UT = 5;
}

message LinkPolicy {
  // Which link types are allowed at all
  repeated LinkType allowed_types = 1;

  // Dynamic feasibility thresholds
  double min_elevation_deg = 2;     // default 0 if unused
  bool only_visible = 3;            // if true, return only visible/feasible links

  // Degree constraints (optional)
  uint32 max_out_degree = 10;       // 0 means unlimited
  uint32 max_in_degree = 11;        // 0 means unlimited

  // Optional: limit candidates by distance for scalability
  double max_range_m = 20;          // 0 means unlimited
}

message AdaptationPolicy {
  // v1: a simple mapping mode.
  // Future: full MCS tables, ACM, coding gains, etc.
  enum Mode {
    MODE_UNSPECIFIED = 0;
    FIXED_RATE = 1;        // constant capacity if link is up, else 0
    SNR_TO_RATE = 2;       // rate from snr_margin (simple piecewise)
    SNR_TO_LOSS = 3;       // loss from snr_margin (simple logistic)
    SNR_TO_BOTH = 4;
  }
  Mode mode = 1;

  // v1 defaults
  double fixed_capacity_bps = 2;
  double fixed_loss_rate = 3;

  // Parameters for simple SNR->rate/loss mappings (implementation defined but deterministic)
  double snr_margin_min_db = 10;
  double snr_margin_max_db = 11;
}

message CacheHints {
  bool precompute_positions = 1;
  bool precompute_visibility = 2;
  uint32 max_cache_ticks = 3; // 0 = engine default
}

// ---------------------------
// Streaming requests
// ---------------------------

message StreamLinkDeltasRequest {
  string scenario_ref = 1;

  // Time range for this stream. If empty, use scenario t0..t1.
  google.protobuf.timestamp t_start = 2;
  google.protobuf.timestamp t_end = 3;

  // If unset, use scenario default_dt.
  google.protobuf.duration dt = 4;

  // Which links to consider/return.
  LinkSelector selector = 10;

  // Delta emission thresholds
  DeltaThresholds thresholds = 20;

  // Behavior knobs
  bool emit_full_snapshot_first = 30;  // recommended true for simpler clients
  bool include_debug_fields = 31;      // if true, fill debug fields in updates
}

message StreamEventsRequest {
  string scenario_ref = 1;
  google.protobuf.timestamp t_start = 2;
  google.protobuf.timestamp t_end = 3;
  // If unset, use scenario default_dt. Must satisfy capabilities bounds.
  google.protobuf.duration dt = 4;

  EventFilter filter = 10;
  // Apply the same selection surface as StreamLinkDeltas for deterministic alignment.
  LinkSelector selector = 11;
}

message LinkSelector {
  // v1 supports:
  // - explicit pairs
  // - by link type
  // - by node role sets
  repeated LinkPair explicit_pairs = 1;
  repeated LinkType link_types = 2;

  // If non-empty, only consider links where src in set AND dst in set
  repeated string src_node_ids = 10;
  repeated string dst_node_ids = 11;

  // Optional tag filters (exact match)
  map<string,string> src_tags = 12;
  map<string,string> dst_tags = 13;

  // If true, apply scenario LinkPolicy.only_visible behavior
  bool only_visible = 20;

  // Optional override thresholds (0 uses scenario policy)
  double min_elevation_deg = 21;
  double max_range_m = 22;
}

message DeltaThresholds {
  // Only emit update if absolute change exceeds threshold.
  // 0 means "emit on any change" for that field.
  double delay_s = 1;
  double capacity_bps = 2;
  double loss_rate = 3;
  double snr_margin_db = 4;

  // Emit if link up/down changes always (implicit).
}

// ---------------------------
// Streaming output
// ---------------------------

message LinkDeltaBatch {
  string scenario_ref = 1;
  string schema_version = 2;

  google.protobuf.timestamp time = 3; // tick time
  uint64 tick_index = 4;

  // If emit_full_snapshot_first=true, first batch may be a full snapshot.
  bool is_full_snapshot = 5;

  // Sparse updates (add/update)
  repeated LinkUpdate updates = 10;

  // Links to remove from active set (no longer selected/visible/allowed)
  repeated LinkKey removals = 11;

  // Optional: server stats
  TickStats stats = 20;
}

message LinkUpdate {
  LinkKey key = 1;

  // Core NetworkView outputs
  bool up = 2;
  double one_way_delay_s = 3;
  double capacity_bps = 4;
  double loss_rate = 5;

  // Optional debug fields (filled if include_debug_fields=true)
  double snr_margin_db = 10;
  double elevation_deg = 11;
  double range_m = 12;

  // Extension space for later (avoid breaking schema)
  map<string,string> extra = 30;
}

message LinkKey {
  string src = 1;
  string dst = 2;
  LinkType type = 3;
}

message LinkPair {
  string src = 1;
  string dst = 2;
  LinkType type = 3;
}

message TickStats {
  uint32 links_computed = 1;
  uint32 links_emitted = 2;
  double compute_ms = 3;
}

// ---------------------------
// Events
// ---------------------------

enum EventType {
  EVENT_TYPE_UNSPECIFIED = 0;
  LINK_UP = 1;
  LINK_DOWN = 2;
  HANDOVER_START = 3;
  HANDOVER_COMPLETE = 4;
  NODE_FAILURE = 5;
  NODE_RECOVERY = 6;
}

message EngineEvent {
  string scenario_ref = 1;
  string schema_version = 2;

  EventType type = 3;
  google.protobuf.Timestamp time = 4;
  uint64 tick_index = 5;

  // Which entities are involved (optional depending on event)
  string node_id = 10;
  LinkKey link = 11;

  map<string,string> meta = 20;
}

message EventFilter {
  repeated EventType types = 1;
  repeated string node_ids = 2;
}

5) Server behavior specification (streaming semantics)

5.1 Timebase rules

  • ScenarioSpec.t0/t1 define the canonical simulation window.

  • Stream requests may override with t_start/t_end:

    • If unset → default to scenario window.
    • Engine must clamp requests to [t0, t1] unless explicitly configured otherwise.
  • dt:

    • If unset → use ScenarioSpec.default_dt.
    • Must be within [Capabilities.min_dt, Capabilities.max_dt]; otherwise return INVALID_ARGUMENT.

5.2 Tick indexing

  • Tick 0 corresponds to t_start.
  • Tick k corresponds to t_start + k*dt.
  • Engine must emit tick_index and time on every batch.
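A minimal sketch of these timebase rules (helper names are illustrative, not from the engine code):

```python
from datetime import datetime, timedelta, timezone

def tick_to_time(t_start: datetime, dt: timedelta, k: int) -> datetime:
    """Tick k corresponds exactly to t_start + k*dt (no rounding drift)."""
    return t_start + k * dt

def num_ticks(t_start: datetime, t_end: datetime, dt: timedelta) -> int:
    """Number of ticks whose time falls within [t_start, t_end]; tick 0 is t_start."""
    return int((t_end - t_start) / dt) + 1

t0 = datetime(2026, 1, 1, tzinfo=timezone.utc)
dt = timedelta(seconds=10)
assert tick_to_time(t0, dt, 3) == t0 + timedelta(seconds=30)
assert num_ticks(t0, t0 + timedelta(seconds=60), dt) == 7  # ticks at 0..60 s
```

Computing tick times as `t_start + k*dt` (rather than accumulating `t += dt`) avoids floating-point drift over long windows.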

5.3 Active link set and removals

The stream protocol is defined against a client-side “active link table”.

  • updates[] means: create or replace link entry keyed by (src,dst,type).
  • removals[] means: delete that link entry (no longer in selection or no longer feasible under policy).

This is required for sparse streams when visibility causes links to appear/disappear.

5.4 First message behavior

If emit_full_snapshot_first=true:

  • The first emitted LinkDeltaBatch at tick 0 must have:

    • is_full_snapshot=true
    • updates[] containing all currently selected/feasible links
    • removals[] empty

This drastically simplifies consumers (no special “initialization” logic).

5.5 Delta emission thresholds

For ticks after the initial snapshot:

  • A link is emitted in updates[] if:

    • it is newly added, OR
    • its up changed, OR
    • abs(new.delay - old.delay) > thresholds.delay_s (if threshold > 0), OR
    • abs(new.capacity - old.capacity) > thresholds.capacity_bps (if threshold > 0), OR
    • abs(new.loss - old.loss) > thresholds.loss_rate (if threshold > 0), OR
    • (optional debug) changes exceed debug thresholds if included.

If a threshold is 0, treat it as “emit on any change”.
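The emission rule can be sketched as a single predicate. Field names here (`delay_s`, `capacity_bps`, `loss_rate`) mirror DeltaThresholds; the state objects are assumed internal structs, not the wire messages:

```python
from types import SimpleNamespace

def should_emit(prev, curr, thresholds) -> bool:
    """Decide whether a link belongs in updates[] on a post-snapshot tick.

    A threshold of 0 means 'emit on any change' for that field.
    """
    if prev is None or prev.up != curr.up:   # newly added link, or up/down transition
        return True
    for name in ("delay_s", "capacity_bps", "loss_rate"):
        thr = getattr(thresholds, name)
        delta = abs(getattr(curr, name) - getattr(prev, name))
        if delta > thr or (thr == 0 and delta != 0):
            return True
    return False

thr = SimpleNamespace(delay_s=0.001, capacity_bps=1e5, loss_rate=0)
a = SimpleNamespace(up=True, delay_s=0.0100, capacity_bps=1e6, loss_rate=0.0)
b = SimpleNamespace(up=True, delay_s=0.0105, capacity_bps=1e6, loss_rate=0.0)
assert should_emit(None, a, thr)       # newly added links always emit
assert not should_emit(a, b, thr)      # 0.5 ms change is under the 1 ms threshold
```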

5.6 Event stream behavior (v1.1 alignment)

StreamEvents emits events in chronological order within [t_start, t_end]:

  • Minimum set in v1:

    • LINK_UP, LINK_DOWN
  • Optional:

    • HANDOVER_START, HANDOVER_COMPLETE if you can detect “best-sat changed” for a GS/UT.
  • Alignment requirements:

    • StreamEventsRequest.dt must use the same semantics/rules as delta streams (default_dt when unset, cap-validated).
    • StreamEventsRequest.selector must use the same link candidate filtering semantics as delta streams.
    • Every emitted event includes tick_index, where tick k = t_start + k*dt.
  • Events should be consistent with LinkDelta stream:

    • If link transitions from up=false to up=true at tick k, emit a LINK_UP event at that tick’s time.

5.7 Backpressure and cancellation

  • Use grpc.aio streaming and yield messages.

  • If the client is slow, await on send; do not build unbounded queues.

  • On cancellation (context.cancelled()):

    • stop computation promptly
    • release scenario references held by the stream
    • record a log entry with reason
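The shape of a cancellation-aware stream loop, sketched here without the grpc dependency (`StreamContext` is a stand-in for `grpc.aio.ServicerContext`; in the real servicer the framework awaits each yielded message, which is what provides backpressure):

```python
import asyncio

class StreamContext:
    """Minimal stand-in for grpc.aio.ServicerContext cancellation checks."""
    def __init__(self):
        self._cancelled = False
    def cancelled(self) -> bool:
        return self._cancelled
    def cancel(self) -> None:
        self._cancelled = True

async def stream_link_deltas(ticks, compute_batch, context, on_release):
    """Yield one batch per tick; stop promptly on cancel and release references."""
    try:
        for k in ticks:
            if context.cancelled():
                break
            yield compute_batch(k)   # grpc.aio awaits the send here (backpressure)
    finally:
        on_release()                 # always drop the stream's scenario reference

async def demo():
    ctx, released, out = StreamContext(), [], []
    async for b in stream_link_deltas(range(5), lambda k: k, ctx,
                                      lambda: released.append(True)):
        out.append(b)
        if b == 2:
            ctx.cancel()             # simulate client cancellation mid-stream
    return out, released

out, released = asyncio.run(demo())
assert out == [0, 1, 2] and released == [True]
```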

6) Internal engine architecture (recommended)

6.1 Modules and responsibilities

scenario_store.py

  • Holds ScenarioRuntime objects keyed by scenario_ref

  • Contains:

    • validated ScenarioSpec
    • pre-parsed skyfield satellite objects
    • node dictionaries and role sets
    • cached computed data (positions/visibility per tick if enabled)
    • RNG seeded from ScenarioSpec.seed

timebase.py

  • Converts timestamps to ticks and vice versa
  • Handles rounding rules (recommend: tick times exactly t_start + k*dt)

selectors.py

  • Applies LinkSelector + LinkPolicy to yield candidate link pairs

  • Must support:

    • explicit pairs (exact)
    • link types
    • src/dst id filters
    • tag filters
    • only_visible/min_elevation/max_range constraints

compute/ephemeris.py

  • Builds skyfield EarthSatellite objects from TLE
  • Provides get_sat_ecef(t) or get_sat_eci(t) depending on your implementation

compute/geometry.py

  • Computes:

    • range (m)
    • elevation (deg) from ground site to satellite (and vice versa if needed)
    • visibility boolean: elevation >= min_elev, range <= max_range
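A sketch of the geometry computation, assuming ECEF inputs and approximating local “up” by the geocentric site direction (the real module would use geodetic up via skyfield; function names are illustrative):

```python
import math

def range_and_elevation(site_ecef, sat_ecef):
    """Slant range (m) and elevation (deg) from a ground site to a satellite.

    Both inputs are ECEF (x, y, z) tuples in metres. Elevation is the angle
    between the line of sight and the local horizontal; 'up' is approximated
    by the geocentric site direction, which is adequate for a sketch.
    """
    los = [s - g for s, g in zip(sat_ecef, site_ecef)]
    rng = math.sqrt(sum(c * c for c in los))
    site_r = math.sqrt(sum(c * c for c in site_ecef))
    up = [c / site_r for c in site_ecef]                  # geocentric up unit vector
    sin_el = sum(l * u for l, u in zip(los, up)) / rng    # LOS projected onto up
    return rng, math.degrees(math.asin(sin_el))

def visible(rng, elev_deg, min_elev_deg, max_range_m):
    """Visibility boolean per the spec: elevation >= min_elev, range <= max_range."""
    return elev_deg >= min_elev_deg and rng <= max_range_m

# Satellite 500 km directly overhead: elevation is 90 degrees.
rng, elev = range_and_elevation((6371e3, 0.0, 0.0), (6871e3, 0.0, 0.0))
assert abs(rng - 500e3) < 1e-6 and abs(elev - 90.0) < 1e-9
```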

compute/link_budget.py

  • Computes:

    • FSPL from range + frequency
    • atmospheric attenuation (via ITU-R), optional
    • noise power from bandwidth + noise temp
    • received power, C/N0, SNR margin, etc.
  • Returns a PhySummary (internal dataclass)
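The core of the budget is standard; a minimal sketch (the real module's contract comes from the existing Python code, and the parameter names here are assumptions):

```python
import math

C = 299_792_458.0          # speed of light, m/s
BOLTZMANN_DBW = -228.6     # Boltzmann constant, dBW/K/Hz

def fspl_db(range_m: float, freq_hz: float) -> float:
    """Free-space path loss in dB."""
    return 20 * math.log10(4 * math.pi * range_m * freq_hz / C)

def snr_db(eirp_dbw, rx_gain_db, range_m, freq_hz, bandwidth_hz,
           noise_temp_k, atmos_loss_db=0.0):
    """C/N in dB: received power minus kTB noise power, both in dBW."""
    rx_dbw = eirp_dbw + rx_gain_db - fspl_db(range_m, freq_hz) - atmos_loss_db
    noise_dbw = (BOLTZMANN_DBW + 10 * math.log10(noise_temp_k)
                 + 10 * math.log10(bandwidth_hz))
    return rx_dbw - noise_dbw

# 1000 km at 12 GHz is roughly 174 dB of free-space loss.
assert abs(fspl_db(1_000_000, 12e9) - 174.03) < 0.1
```

SNR margin would then be this C/N minus the required C/N of the modcod, and the result folded into the internal PhySummary dataclass.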

compute/adaptation.py

  • Maps PhySummary → NetworkLinkState:

    • capacity_bps and/or loss_rate

    • If v1, implement a deterministic piecewise mapping:

      • clamp snr_margin_db into [min,max]
      • map linearly to capacity between [0, terminal.bandwidth * eff_max] (or use fixed)
      • map snr_margin_db to loss via logistic or fixed thresholds

streaming/delta.py

  • Maintains per-stream “previous link table”
  • Computes updates and removals each tick

streaming/events.py

  • Detects link up/down transitions and yields EngineEvent

7) Scenario validation rules (must be enforced)

  • t0 < t1

  • default_dt > 0

  • Node IDs unique

  • Satellites must include valid TLE lines

  • Fixed sites must have valid lat/lon ranges

  • Terminal model must include:

    • frequency_hz > 0
    • bandwidth_hz > 0
    • rx_noise_temp_k > 0
  • LinkPolicy.allowed_types must be non-empty OR default to all valid types for provided roles

Return gRPC status INVALID_ARGUMENT with a descriptive error message if validation fails.


8) Performance requirements (practical targets)

These are engineering targets; adjust later.

  • Tick compute should scale with number of candidate links, not N² nodes.

  • Implement at least one of:

    • pre-filter by link type and role sets
    • max_range cutoff
    • max_degree pruning (keep best K neighbors by range or SNR)

Recommended optimizations (v1)

  • Cache satellite positions per tick if precompute_positions=true.
  • Cache ground station ECEF once.
  • Vectorize range computations where possible (NumPy arrays).

9) Determinism requirements

Determinism must include:

  • Ordering: Always sort link keys before emitting for stable output

    • sort by (src, dst, type)
  • RNG: Use numpy.random.Generator(PCG64(seed)) attached to scenario

  • Floating-point rounding: Do not over-round, but be consistent in computations (same order of operations)

Test: Run the same stream twice and ensure byte-equivalent serialized output (or field-wise equal within tolerance where appropriate).
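The ordering and determinism requirements can be expressed in a couple of lines (these helpers are illustrative):

```python
from collections import namedtuple

LinkKey = namedtuple("LinkKey", "src dst type")

def emission_order(link_keys):
    """Canonical stable output order: sort by (src, dst, type) before emitting."""
    return sorted(link_keys, key=lambda k: (k.src, k.dst, k.type))

def assert_deterministic(run_stream):
    """run_stream() -> list of serialized batches; two runs must be identical."""
    assert run_stream() == run_stream()

keys = {LinkKey("sat2", "gs1", 1), LinkKey("sat1", "gs1", 1)}
assert emission_order(keys)[0].src == "sat1"
assert_deterministic(lambda: [b"batch0", b"batch1"])
```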


10) Error handling (gRPC status codes)

Implement these consistent statuses:

  • NOT_FOUND: unknown scenario_ref
  • INVALID_ARGUMENT: bad time range, dt, selector, scenario validation failure
  • RESOURCE_EXHAUSTED: too many active streams for a scenario; or too many links per tick requested
  • FAILED_PRECONDITION: scenario closed
  • INTERNAL: unexpected exceptions (log stack trace server-side)

Add a stable error message prefix, e.g. GEOMRF_ERR:<CODE>:<details> for easier parsing.


11) Reference streaming algorithm (server-side)

Pseudocode for StreamLinkDeltas

  1. Resolve scenario and compute effective t_start/t_end/dt.

  2. Build selector state (resolved node sets, tag filters, link types).

  3. Initialize:

    • prev_links = {} (LinkKey → LinkUpdate-like internal struct)
    • active_keys = set()
  4. For tick k from 0..:

    • Compute t = t_start + k*dt; stop when t > t_end.

    • Determine candidate link pairs from selector+policy.

    • For each candidate link:

      • compute geometry (range/elev/visibility)
      • if not feasible and only_visible: skip (will cause removal if previously active)
      • compute PHY summary
      • compute NetworkLinkState (up/delay/capacity/loss)
      • assemble internal current map curr_links[key] = state
    • Compute removals = keys in prev_links but not in curr_links

    • Compute updates:

      • if first tick and emit_full_snapshot_first: all curr_links become updates
      • else: apply delta thresholds comparing curr vs prev
    • Emit a LinkDeltaBatch every tick, even when updates/removals are empty (optional, but recommended for consumer simplicity)

    • Update prev_links = curr_links
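The StreamLinkDeltas steps above as a compact Python sketch, where `candidates`, `compute_state`, and `should_emit` are assumed helpers (selector/policy resolution, geometry→PHY→link-state computation, and the delta-threshold predicate, respectively):

```python
def stream_link_deltas(t_start, t_end, dt, candidates, compute_state,
                       should_emit, emit_full_snapshot_first=True):
    """Yield (tick_index, t, updates, removals) per tick of the reference loop."""
    prev_links = {}
    k = 0
    while t_start + k * dt <= t_end:
        t = t_start + k * dt
        curr_links = {}
        for key in candidates(t):
            state = compute_state(key, t)   # geometry -> PHY -> NetworkLinkState
            if state is not None:           # None => infeasible under only_visible
                curr_links[key] = state
        removals = sorted(set(prev_links) - set(curr_links))
        if k == 0 and emit_full_snapshot_first:
            updates = dict(curr_links)      # tick 0: full snapshot
        else:
            updates = {key: s for key, s in curr_links.items()
                       if should_emit(prev_links.get(key), s)}
        yield k, t, updates, removals       # emit every tick, even if empty
        prev_links = curr_links
        k += 1

batches = list(stream_link_deltas(
    0.0, 2.0, 1.0,
    candidates=lambda t: ["gs1-sat1"] if t < 2 else [],
    compute_state=lambda key, t: t,
    should_emit=lambda prev, curr: prev is None or curr != prev))
assert len(batches) == 3
assert batches[2][3] == ["gs1-sat1"]   # link leaves the candidate set -> removal
```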

Pseudocode for StreamEvents

  • Either:

    • derive from StreamLinkDeltas logic (shared per-stream evaluator), OR
    • implement as separate evaluation loop that only checks transitions
  • Resolve and validate t_start/t_end/dt exactly as in delta streams.

  • Build selector from request and apply identical candidate filtering.

  • Emit event when (prev.up != curr.up) for any link in the selected set.

  • Populate both time and tick_index.


12) Client expectations (contract for consumers)

A correct consumer must:

  • Start with the first LinkDeltaBatch (full snapshot)

  • Maintain active_table[LinkKey] = LinkUpdate

  • Apply each tick:

    • delete removals
    • upsert updates
  • Use time and tick_index as authoritative time

  • Optionally also subscribe to events; events are primarily observability data, not control-plane truth
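The consumer contract reduces to one small apply function (dict-based batches stand in for the protobuf messages here):

```python
def apply_batch(active_table: dict, batch: dict) -> dict:
    """Apply one LinkDeltaBatch to the client-side active link table:
    deletions first, then upserts (order matters if a key appears in both)."""
    for key in batch["removals"]:
        active_table.pop(key, None)
    for key, state in batch["updates"].items():
        active_table[key] = state
    return active_table

table = {}
apply_batch(table, {"updates": {("sat1", "gs1"): {"up": True}}, "removals": []})
apply_batch(table, {"updates": {}, "removals": [("sat1", "gs1")]})
assert table == {}
```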


13) Example client (must be included)

Create examples/client_stream.py:

  • Connect to server

  • CreateScenario from an inline scenario object (or YAML file)

  • Start StreamLinkDeltas and print:

    • tick index, number of updates/removals, sample link
  • Optionally start StreamEvents concurrently

  • Close scenario at end

Checklist:

  • Implement examples/client_stream.py

  • Add README usage snippet:

    • start server
    • run client
    • expected output format

14) Minimal “default” adaptation mapping (v1, deterministic)

If your existing code already outputs a usable throughput and PER proxy, use it. If not, implement a deterministic fallback:

v1 fallback policy

  • up = visibility && snr_margin_db > 0 (or >= threshold)

  • delay = range_m / c (c = 299792458 m/s)

  • capacity_bps:

    • FIXED_RATE: fixed_capacity_bps when up else 0

    • SNR_TO_RATE:

      • normalize x = clamp((snr_margin_db - min)/(max-min), 0..1)
      • capacity = x * capacity_max, where capacity_max = bandwidth_hz * eff_max
      • choose eff_max constant (e.g., 4 bits/s/Hz) in v1; document it
  • loss_rate:

    • FIXED: fixed_loss_rate when up else 1

    • SNR_TO_LOSS:

      • logistic: loss = 1 / (1 + exp(a*(snr_margin_db - b))) with fixed a,b
      • clamp to [0,1]
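The fallback policy is small enough to sketch end-to-end; the constants below (eff_max, logistic a/b, SNR normalization bounds) are placeholders for the values the checklist asks you to decide and document:

```python
import math

EFF_MAX_BPS_PER_HZ = 4.0            # v1 constant; document in adaptation.py
LOGISTIC_A, LOGISTIC_B = 1.0, 3.0   # illustrative logistic loss parameters

def clamp(x, lo, hi):
    return max(lo, min(hi, x))

def snr_to_rate(snr_margin_db, bandwidth_hz, snr_min_db=0.0, snr_max_db=20.0):
    """Normalize margin into [0,1], then map linearly to capacity_max."""
    x = clamp((snr_margin_db - snr_min_db) / (snr_max_db - snr_min_db), 0.0, 1.0)
    return x * bandwidth_hz * EFF_MAX_BPS_PER_HZ

def snr_to_loss(snr_margin_db):
    """Logistic loss mapping, clamped to [0,1]."""
    loss = 1.0 / (1.0 + math.exp(LOGISTIC_A * (snr_margin_db - LOGISTIC_B)))
    return clamp(loss, 0.0, 1.0)

def delay_s(range_m):
    """One-way propagation delay at the speed of light."""
    return range_m / 299_792_458.0

assert snr_to_rate(10.0, 1e6) == 2_000_000.0   # halfway up the ramp
assert abs(snr_to_loss(LOGISTIC_B) - 0.5) < 1e-9
```

Because every branch is a pure function of the inputs and fixed constants, the mapping is deterministic by construction, which is what the determinism tests rely on.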

Checklist:

  • Decide v1 constants (eff_max, logistic params, up-threshold)
  • Put them in adaptation.py and record them in logs/version

15) Observability (recommended even in v1)

  • gRPC access logs including:

    • scenario_ref, stream type, time range, dt, selector summary
  • Prometheus counters (optional but easy):

    • streams active, ticks computed, links computed, mean compute time
  • TickStats in stream payload (already specified)

Checklist:

  • Add TickStats computation
  • Add server-side metrics (optional)
  • Add structured logging with correlation IDs

16) Security and robustness (v1 minimum)

  • Bind address configurable (0.0.0.0:50051 default)

  • Optional TLS later; v1 can be plaintext for local lab use

  • Enforce limits:

    • max nodes
    • max links per tick
    • max active streams per scenario

Checklist:

  • Enforce link/node limits with RESOURCE_EXHAUSTED
  • Enforce max concurrent streams per scenario

17) Acceptance criteria (definition of done)

Functional

  • Server starts and responds to GetVersion and GetCapabilities

  • CreateScenario returns a scenario_ref and validates inputs

  • StreamLinkDeltas emits:

    • a full snapshot first (when enabled)
    • then sparse deltas/removals per tick
  • StreamEvents emits link up/down events consistent with deltas

  • CloseScenario frees scenario resources and blocks further streams

Correctness

  • Determinism test passes (same inputs → same outputs)
  • Selector tests pass (only requested links emitted)
  • Delta threshold tests pass (small changes suppressed)

Usability

  • Example client runs end-to-end against server and prints reasonable output
  • README explains how to run locally and how to pass a scenario YAML

18) Optional but high-value extension hooks (safe to leave stubbed)

These can exist as placeholders in code (no API changes needed later):

  • “Debug fields” population (include_debug_fields=true)
  • Additional events (handover start/complete)
  • More orbit formats (OEM/SP3) behind SatelliteOrbit oneof later
  • Better antenna pattern modeling behind TerminalModel

19) v1.1 orchestrator-alignment tasks (new)

  • Bump schema to geomrf.v1.1 (or equivalent versioning plan) for event-alignment fields.
  • Update StreamEventsRequest implementation to honor request dt and selector.
  • Populate EngineEvent.tick_index from the same tick loop semantics as deltas.
  • Add tests:
    • events and deltas requested with same window/selectors produce aligned tick grids
    • invalid event dt returns INVALID_ARGUMENT
    • event selector filtering mirrors delta selector behavior
  • Keep backward compatibility plan explicit (version gate or dual-field behavior) for existing v1 clients.
TASKS_IMNET.md

TASKS_IMNET.md — SatSim IMNET Lane (OMNeT++/INET) Implementation Plan

This is a task-driven implementation plan for the IMNET lane (OMNeT++/INET). It covers:

  • Building and running an OMNeT++/INET simulation project under opp_env
  • Integrating IMNET into the existing uv-managed SatSim workspace (without mixing concerns)
  • Consuming orchestrator-produced LinkState traces (v1: trace-first) to apply dynamic link changes
  • Producing artifacts compatible with SatSim run directories / manifests

Policy reminder (already in repo): Python workflows are uv-managed; OMNeT++/INET toolchain is opp_env-managed.


-1) Locked v1 decisions for this plan update

  • Execution model is post-stream replay:
    • Orchestrator records the OMNeT trace during streaming, then runs OMNeT after stream completion.
  • Topology strategy is Strategy 3:
    • stable NED template + orchestrator-generated node_map.json and link_map.json.
  • Orchestrator ↔ OMNeT runtime interface is typed and orchestrator-owned:
    • no implicit "just pass arbitrary run_args" contract for required parameters.
  • Canonical trace key fields use src, dst, link_type (not mixed type naming).
  • Repo split is intentional:
    • OMNeT assets under lanes/omnet/, Python lane adapter/runner under satsim_orch/lanes/omnet_lane/.
  • opp_env default path uses Nix, but --nixless-workspace is a supported fallback mode.

0) Deliverables (what “done” means)

  • lanes/omnet/ exists and contains:
    • a reproducible opp_env workspace definition (pinned OMNeT++ + INET versions)
    • an OMNeT++ project (“satsim-imnet”) that compiles and runs headless
    • a LinkState trace ingestion + applier module that updates delay/rate/(optional loss) per tick
    • a minimal demo scenario (2–10 nodes) that:
      • runs via orchestrator in --mode omnet
      • reads the trace written by orchestrator
      • produces artifacts under the run directory (logs + .vec/.sca; optional pcap)
  • One-command dev flow:
    • uv run satsim run <scenario.yaml> --mode omnet ... executes IMNET via opp_env and stores artifacts in the standard run folder.
  • Existing OMNeT lane blockers in current orchestrator code are closed before IMNET C++ work:
    • trace writer removal serialization does not use __dict__ on slots dataclasses
    • trace line includes is_full_snapshot
    • OMNeT launch is not silently skipped when --mode omnet is selected

1) Repo layout for IMNET lane

  • Create directory structure:
    • lanes/omnet/
      • lanes/omnet/WORKSPACE.md (how to install + run with opp_env; pinned versions)
      • lanes/omnet/opp_env/ (workspace init and pinned selection)
      • lanes/omnet/satsim-imnet/ (the OMNeT++ project)
        • src/ (C++ modules)
        • ned/ (NED definitions)
        • omnetpp.ini (baseline config; orchestrator may override with -f/-c)
        • Makefile (generated by opp_makemake; committed only if desired, otherwise generated)
        • README.md (how to build/run inside opp_env)
      • lanes/omnet/scripts/ (helper wrappers used by orchestrator)
        • install.py (optional convenience; calls opp_env install ...)
        • build.py (build IMNET project inside opp_env)
        • run.py (run IMNET headless inside opp_env, accepts args from orchestrator)
  • Keep orchestrator Python integration in package code:
    • satsim_orch/lanes/omnet_lane/adapter.py remains the lane entrypoint
    • satsim_orch/lanes/omnet_lane/runner.py owns command construction and process launch
    • lanes/omnet/WORKSPACE.md documents how these package paths map to lanes/omnet/ assets

2) Version pinning and opp_env workspace (reproducible toolchain)

2.1 Pick and pin OMNeT++ + INET versions

  • Pin versions (v1 recommendation; can be adjusted later):
    • OMNeT++: omnetpp-6.3.0
    • INET: inet-4.5.4
  • Document the pin in:
    • lanes/omnet/WORKSPACE.md
    • orchestrator run manifest fields (already exists; ensure it records these exact strings)

2.2 Initialize an opp_env workspace outside git working trees

Goal: keep installs reproducible while avoiding committing huge toolchains.

  • Decide where the opp_env workspace lives:
    • default: ~/.cache/satsim/opp_env/workspace (outside git tree)
    • store only small config/metadata in git, not the compiled artifacts
  • Add .gitignore entries:
    • ignore optional repo-local workspace path lanes/omnet/opp_env/workspace/ if used for local experimentation
    • ignore lanes/omnet/**/out/ (OMNeT outputs)
    • ignore lanes/omnet/**/results/ (if used)

2.3 Provide canonical commands (must work from uv venv)

  • Ensure opp_env is invoked via uv:
    • uv run opp_env --version
    • uv run opp_env list
  • Implement lanes/omnet/scripts/install.py that performs:
    • workspace init (idempotent)
    • install pinned INET (which pulls matching OMNeT++)
    • verify the installed packages exist
  • Add workspace mode guidance:
    • default mode: Nix-backed opp_env workspace
    • fallback mode: opp_env --nixless-workspace (document prerequisites and reproducibility caveats)
    • orchestrator preflight must emit a clear error only when the selected workspace mode requirements are unmet

3) uv ↔ opp_env integration (clean boundary)

3.1 Keep responsibility boundaries strict

  • Confirm and document:
    • uv manages Python deps + orchestrator execution
    • opp_env manages OMNeT++/INET toolchain and the shell/run environment
    • Orchestrator calls opp_env run ... (or opp_env shell -c ...) rather than assuming OMNeT binaries are on PATH

3.2 Add orchestrator-side “IMNET preflight”

(Only if not already present; keep it minimal.)

  • In orchestrator’s OMNeT lane runner:
    • verify opp_env is available (uv run opp_env --version)
    • verify required runtime mode dependencies:
      • Nix-backed mode: verify nix is available
      • nixless mode: verify required toolchain binaries are present
    • verify the IMNET scripts exist (install/build/run)
    • if missing dependencies:
      • fail with a single actionable message (no partial runs)
  • Close current OMNeT-lane correctness blockers first:
    • fix trace writer removal serialization for LinkKey (slots dataclass)
    • include is_full_snapshot in the OMNeT trace JSONL tick payload
    • remove ini_path-gated silent skip; in omnet mode runner must launch or raise an explicit error

3.3 Standardize how orchestrator launches IMNET

  • Replace ad-hoc argument passing with a typed runner contract (orchestrator-owned):
    • required fields:
      • workspace_path
      • inet_version (pinned string)
      • project_path
      • ini_path
      • trace_path
      • dt_seconds
      • outdir
      • seed
    • optional fields:
      • config_name
      • sim_time_limit_s
      • extra_args (non-critical escape hatch only)
  • Define one canonical opp_env invocation generated by runner code:
    • opp_env run inet-<PINNED> --init -w <WORKSPACE> --chdir -c "<COMMAND>"
  • runner.py converts typed fields into OMNeT CLI args; callers do not handcraft command strings
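A sketch of the typed contract and the canonical invocation (dataclass and function names are illustrative; the authoritative version lives in satsim_orch/lanes/omnet_lane/runner.py):

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Tuple

@dataclass(frozen=True)
class OmnetRunSpec:
    """Orchestrator-owned runner contract; no ad-hoc run_args for required fields."""
    workspace_path: Path
    inet_version: str               # pinned string, e.g. "inet-4.5.4"
    project_path: Path
    ini_path: Path
    trace_path: Path
    dt_seconds: float
    outdir: Path
    seed: int
    config_name: Optional[str] = None
    sim_time_limit_s: Optional[float] = None
    extra_args: Tuple[str, ...] = ()   # non-critical escape hatch only

def opp_env_command(spec: OmnetRunSpec, inner_cmd: str) -> list:
    """The one canonical opp_env invocation generated by runner code."""
    return ["opp_env", "run", spec.inet_version, "--init",
            "-w", str(spec.workspace_path), "--chdir", "-c", inner_cmd]

spec = OmnetRunSpec(Path("/ws"), "inet-4.5.4", Path("/proj"),
                    Path("/proj/omnetpp.ini"), Path("/run/trace.jsonl"),
                    1.0, Path("/run/omnet"), 42)
cmd = opp_env_command(spec, "./build-and-run.sh")
assert cmd[:3] == ["opp_env", "run", "inet-4.5.4"]
```

A frozen dataclass makes missing required fields a construction-time error, which is exactly the "hard error, never a silent no-op" behavior the plan requires.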

4) IMNET contract with the other SatSim systems (as they exist)

4.1 Relationship to Geo/RF engine

  • IMNET does not talk to Geo/RF directly in v1
  • IMNET consumes orchestrator-produced trace derived from:
    • Geo/RF StreamLinkDeltas batches (tick_index + time + updates/removals)

4.2 Relationship to Orchestrator (v1 trace-first)

  • Orchestrator responsibilities (assumed existing):
    • create scenario in Geo/RF engine
    • consume LinkDeltaBatch stream
    • write a deterministic LinkState trace file for OMNeT lane
    • launch OMNeT lane runner with correct typed parameters after stream completion (post-stream replay)
  • IMNET responsibilities:
    • parse the trace deterministically
    • schedule link updates in simulation time aligned to tick_index
    • apply updates to the simulated links (delay/rate/(optional loss), and “up/down” semantics)

4.3 Trace file format (IMNET must support this)

Pick one and lock it. If orchestrator already emits a format, IMNET must match it.

  • Decide and document the v1 trace format (JSONL, locked)
    • One JSON object per tick line
    • Fields required:
      • tick_index (uint64)
      • time (ISO8601 or unix seconds; used for logging only; simtime comes from tick_index * dt)
      • is_full_snapshot (bool)
      • updates (list)
      • removals (list)
    • Each update contains:
      • src (string)
      • dst (string)
      • link_type (string enum name)
      • up (bool)
      • one_way_delay_s (float)
      • capacity_bps (float)
      • loss_rate (float; optional if IMNET v1 ignores loss)
    • Each removal contains:
      • src, dst, link_type (same key fields)
    • Determinism requirements:
      • stable ordering of updates and removals within each tick
      • no mixed key names (type vs link_type) in v1 output
  • Define dt semantics for IMNET:
    • Lock v1 approach: IMNET gets dt via typed runner argument (dt_seconds)
    • trace time is informational/logging only; tick scheduling uses tick_index * dt_seconds
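One tick of the locked JSONL format, plus a reader sketch that enforces the monotonicity and canonical-key rules (the field values are made up; the field names follow the spec above):

```python
import json

TICK_LINE = json.dumps({
    "tick_index": 0, "time": "2026-01-01T00:00:00Z", "is_full_snapshot": True,
    "updates": [{"src": "sat1", "dst": "gs1", "link_type": "SAT_TO_GROUND",
                 "up": True, "one_way_delay_s": 0.00234,
                 "capacity_bps": 5.0e7, "loss_rate": 0.0}],
    "removals": [],
})

def read_trace(lines):
    """Parse JSONL tick lines; fail fast on non-monotonic ticks or bad keys."""
    last_tick = -1
    for line in lines:
        tick = json.loads(line)
        if tick["tick_index"] <= last_tick:
            raise ValueError(f"non-monotonic tick_index {tick['tick_index']}")
        last_tick = tick["tick_index"]
        for entry in tick["updates"] + tick["removals"]:
            if not {"src", "dst", "link_type"} <= entry.keys():
                raise ValueError("entry missing canonical key fields")
        yield tick

ticks = list(read_trace([TICK_LINE]))
assert ticks[0]["is_full_snapshot"] is True
```

The C++ LinkTraceReader must apply the same validation; keeping a Python twin like this in orchestrator tests makes round-trip checks cheap.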

5) OMNeT++/INET project scaffolding (“satsim-imnet”)

5.1 Create a minimal INET-based network model

Goal: minimal, not fancy, but real packets flow and link parameters can change.

  • Create lanes/omnet/satsim-imnet/ned/SatSimNetwork.ned
    • Use INET StandardHost (or Router) modules for nodes
    • Include:
      • one traffic source app and one sink app (UDP is fine for v1)
      • optional intermediate router (for a 2-hop demo)
    • Strategy 3 mapping is required in v1:
      • orchestrator generates node_map.json (node_id ↔ module path)
      • orchestrator generates link_map.json (LinkKey ↔ mutable channel/shim path)
      • LinkStateApplier consumes map files in strict mode and fails on unknown keys
  • Create lanes/omnet/satsim-imnet/omnetpp.ini
    • include INET paths and defaults
    • configure:
      • IP address assignment (INET configurator)
      • app endpoints and start times
      • disable GUI by default (Cmdenv) for orchestrator runs

5.2 Add a LinkState ingestion + application component

  • Add src/LinkTraceReader.{h,cc}
    • reads the trace file
    • validates ordering (tick_index monotonic)
    • provides an in-memory list of per-tick updates (or streaming reader)
  • Add src/LinkStateApplier.{h,cc} as a cSimpleModule
    • parameters:
      • string tracePath
      • double dtSeconds
      • bool strict (fail-fast on unknown link keys)
      • bool applyLoss (optional)
    • behavior:
      • on init:
        • load/validate trace header/dt
        • build a mapping from LinkKey → simulation link object(s)
        • schedule first self-message at tick 0
      • on each tick:
        • apply updates/removals for that tick
        • schedule next tick if present
      • on finish:
        • write summary scalars (num updates applied, unknown keys, etc.)

6) How IMNET represents links (v1 minimal approach)

You need a concrete, implementable mapping from LinkKey → something mutable in OMNeT/INET.

6.1 Choose v1 link representation

Pick one approach and implement it end-to-end:

Option A (preferred v1): mutate OMNeT channels

  • Use a channel type that supports:
    • delay updates (delay)
    • datarate updates (datarate)
    • “up/down” via disabling or forcing drop
  • Build a stable topology at startup containing all candidate links
  • Map each LinkKey to a channel pointer
  • On update:
    • set channel delay
    • set channel datarate
    • if up=false or “removal”: disable channel / set drop mode

Option B: insert a small “LinkShim” module per edge

  • Create a LinkShim cSimpleModule that:
    • applies propagation delay (schedule)
    • applies serialization delay from capacity (packet_bits / capacity_bps)
    • drops packets by loss_rate
    • drops everything when up=false
  • Connect hosts via LinkShim modules instead of relying on channel mutability
  • Note: Option B is not selected for the v1 path; it is the documented fallback if channel mutability proves inadequate

v1 recommendation: start with Option A if channel mutability is adequate; fall back to Option B if not.

6.2 Define “up/down/removal” semantics for v1

  • Lock v1 semantics (must match orchestrator trace writer):
    • up=false means the link exists but currently unavailable (drop all / disable)
    • removal means the link is not in the current active set
      • v1 handling recommendation: treat removal as up=false (do not delete topology)
      • allow a later update to re-enable it

6.3 Unit conversion rules (lock them)

  • one_way_delay_s → OMNeT simtime_t delay
  • capacity_bps → channel datarate (bps)
  • loss_rate:
    • if supported natively by chosen link representation, apply directly
    • otherwise implement drop probability in LinkShim or a per-interface dropper

7) Topology generation strategy (v1: small but consistent)

7.1 Keep topology simple in v1

  • v1 target: 2–10 nodes, with:
    • at least one satellite-like router node
    • at least one ground station-like host node
    • at least one dynamic link changing delay/rate/up/down over time

7.2 How the topology is created

Use Strategy 3 for v1:

  • Commit a stable NED topology template under lanes/omnet/satsim-imnet/ned/
  • Orchestrator writes node_map.json and link_map.json per run into the run artifacts
  • IMNET loads maps at startup and applies all updates by LinkKey lookup through the map
  • In strict mode, any unmapped LinkKey is a hard failure
  • Generated NED per run (Strategy 2) is explicitly deferred beyond v1

8) Orchestrator ↔ IMNET runtime interface (exact parameters)

8.1 Standardize the runner arguments

  • Define a single typed runner entrypoint (called by adapter/runner code) that accepts:
    • --trace <path>
    • --dt <seconds>
    • --outdir <path>
    • --seed <int>
    • --tend <seconds> (optional; can be inferred from last tick)
    • --config <omnet_config_name> (optional)
  • Runner converts these to OMNeT args:
    • -u Cmdenv
    • -n <NEDPATHS including INET and satsim-imnet/ned>
    • -l (load required libraries if needed)
    • --output-dir=<outdir>
    • --seed-set=<seed> (or equivalent OMNeT seed setting)
    • --sim-time-limit=<tend>s (if used)
    • pass tracePath and dtSeconds as module parameters
  • Runner invocation semantics:
    • if lane mode is omnet or parallel, omnet runner invocation is mandatory
    • missing required runner inputs is a hard error, never a silent no-op
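A sketch of the typed-fields-to-argv mapping (the flag spellings follow the list above; the `*.applier.*` module-parameter paths are illustrative and depend on the NED template):

```python
def omnet_argv(spec: dict) -> list:
    """Convert typed runner fields into OMNeT CLI args; runner.py owns this."""
    argv = ["-u", "Cmdenv",
            "-n", ":".join(spec["ned_paths"]),          # INET + satsim-imnet/ned
            f"--output-dir={spec['outdir']}",
            f"--seed-set={spec['seed']}"]
    if spec.get("config"):
        argv += ["-c", spec["config"]]
    if spec.get("tend_s") is not None:
        argv.append(f"--sim-time-limit={spec['tend_s']}s")
    # trace path and dt are passed as module parameters, never handcrafted by callers
    argv += [f"--*.applier.tracePath={spec['trace_path']}",
             f"--*.applier.dtSeconds={spec['dt_seconds']}"]
    return argv

argv = omnet_argv({"ned_paths": ["inet/src", "satsim-imnet/ned"],
                   "outdir": "artifacts/r1/omnet", "seed": 7,
                   "trace_path": "trace.jsonl", "dt_seconds": 1.0})
assert "--seed-set=7" in argv
```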

8.2 Ensure artifacts land in the SatSim run directory

  • Orchestrator passes outdir = artifacts/<run_id>/omnet/
  • IMNET writes:
    • OMNeT results (.vec/.sca) to outdir/results/ (or directly under outdir)
    • stdout/stderr to outdir/logs/omnet.log (or orchestrator captures it)
    • a small imnet_runinfo.json containing:
      • versions (inet, omnetpp)
      • trace hash
      • config name
      • seed
      • start/end ticks processed

9) Observability for IMNET (minimal but useful)

  • Configure OMNeT to export:
    • scalar summary stats (packets sent/received, drops)
    • vector time series for throughput/delay (where feasible)
  • Add LinkStateApplier scalars:
    • ticksProcessed
    • updatesApplied
    • removalsApplied
    • unknownLinkKeys (must be 0 in strict mode)
  • Optional (v1.1): PCAP output
    • If using INET features that can emit pcap:
      • enable and write into outdir/pcap/ (deferred; not enabled in v1)
    • Otherwise: skip; rely on Mininet lane for PCAPs

10) Testing and validation (must exist for v1)

10.1 Offline smoke test (no orchestrator)

  • Add lanes/omnet/satsim-imnet/tests/ (or scripts) that:
    • runs a short simulation with a tiny trace
    • verifies results files created
    • verifies LinkStateApplier processed N ticks
  • Provide lanes/omnet/scripts/smoke.py:
    • installs toolchain (if needed)
    • builds project
    • runs headless sim for ~5–20 seconds simtime

10.2 End-to-end smoke test (with orchestrator)

  • Add a minimal scenarios/omnet_smoke.yaml:
    • small node set
    • dt ~ 1s
    • short window (e.g., 60s)
    • explicit link selector to keep link key set stable
  • Add a single command documented in root README or WORKSPACE.md:
    • uv run satsim run scenarios/omnet_smoke.yaml --mode omnet
  • Verify artifacts:
    • orchestrator run folder exists
    • omnet subfolder contains .vec/.sca
    • logs show LinkStateApplier applying ticks in order

11) Known v1 constraints (explicitly accepted)

  • v1 uses trace-first ingestion (no live gRPC inside OMNeT)
  • v1 may ignore directional asymmetry:
    • if LinkKey is directional but the chosen link model is bidirectional, document how direction is collapsed (or restrict scenarios accordingly)
  • v1 focuses on:
    • correctness of tick alignment
    • correctness of delay/rate updates
    • reproducible, scripted build/run under opp_env

12) v1.1+ hooks (implemented as opt-in features)

  • Live streaming adapter hook inside OMNeT++ (ingest_mode=live_stream, runtime trace refresh + grpcTarget hook parameter)
  • Dynamic topology construction from ScenarioSpec maps (generated NED at run-time in run.py)
  • Proper per-direction link modeling hook (directional_links=true creates separate directional channels)
  • Integration hook for SDN decision traces (sdn_trace_path + per-tick channel enable/disable decisions in LinkStateApplier)
TASKS_ORCHESTRATOR.md

ORCHESTRATOR_IMPLEMENTATION.md — SatSim Orchestrator (Python) Detailed Plan

This document is a task-driven implementation plan for the SatSim Orchestrator. It is intended to be handed to an LLM to implement the orchestrator in Python. It focuses on architecture, module boundaries, data flow, and concrete tasks (not deep RPC wire details).

The orchestrator is responsible for:

  • Loading a scenario
  • Starting and coordinating subcomponents (Geo/RF engine, OMNeT++ lane, Mininet lane)
  • Driving a unified timebase
  • Fanning out LinkState/Event updates to lane adapters
  • Recording reproducible artifacts (manifest, logs, optional traces, metrics)

Decision lock (2026-02-18)

These ambiguities are now resolved and should be treated as fixed v1 design:

  • Authoritative tick source: StreamLinkDeltas is the only control-plane tick source. Lanes apply link state from deltas, not from events.
  • Event stream alignment (Option B accepted): evolve Geo/RF event API so StreamEventsRequest carries dt + selector, and EngineEvent carries tick_index, aligned to the same tick grid as deltas.
  • Error contract handling: orchestrator must handle NOT_FOUND, INVALID_ARGUMENT, FAILED_PRECONDITION, and RESOURCE_EXHAUSTED as first-class engine responses.
  • Scenario translation strictness: translation to Geo/RF ScenarioSpec is fail-fast and must satisfy all required engine fields/constraints.
  • Tooling policy: Python workflows use uv (uv run, uv add); do not rely on pip.

1) Repository layout (recommended)


```
satsim/
  orchestrator/
    pyproject.toml
    README.md
    ORCHESTRATOR_IMPLEMENTATION.md
    satsim_orch/
      __init__.py  cli.py  main.py
      config/
        __init__.py  schema.py  loader.py  defaults.py  normalize.py
      runtime/
        __init__.py  run_manager.py  manifest.py  artifact_store.py
        logging.py  versioning.py  process.py
      timebase/
        __init__.py  clock.py  modes.py  scheduler.py
      bus/
        __init__.py  messages.py  queues.py  fanout.py
      geomrf/
        __init__.py  client.py  translate.py  health.py
      lanes/
        __init__.py  base.py  registry.py
        mininet_lane/
          __init__.py  adapter.py  topo.py  shaping.py  controller.py  capture.py
        omnet_lane/
          __init__.py  adapter.py  trace_ingest.py  runner.py
      metrics/
        __init__.py  prom.py  records.py  exporters.py
      util/
        __init__.py  ids.py  units.py  asyncx.py  errors.py
    tests/
      test_config_validation.py  test_timebase_scheduler.py  test_bus_fanout.py
      test_run_manifest.py  test_lane_adapter_contract.py  test_geomrf_client_smoke.py
  subprojects/
    geomrf-engine/   # separate project; orchestrator consumes it via gRPC
  lanes/
    omnet/
    mininet/
  observability/
  artifacts/
```


2) Orchestrator design summary (targets)

2.1 Orchestrator responsibilities

  • Scenario loading & validation
  • Run directory + manifest creation
  • Geo/RF engine lifecycle (create scenario; start streams; close)
  • Lane lifecycle:
    • prepare() (build topology / start processes)
    • apply_tick() (apply link deltas / events)
    • finalize() (stop processes, collect outputs)
  • Unified runtime pacing:
    • offline apply-fast
    • real-time apply-paced (wall-clock aligned)
    • parallel lane fanout (same incoming ticks feed multiple lanes)
  • Artifact collection:
    • config snapshot, manifest
    • optional LinkState trace logging
    • metrics export
    • PCAP capture (Mininet lane)

2.2 Key architectural choices

  • Python 3.11+ with asyncio
  • gRPC async client (grpc.aio) for Geo/RF streaming
  • In-process async fanout bus using bounded queues (v1)
  • Pluggable lane adapters via a registry
  • Everything stamped with versions/seeds for reproducibility

3) Implementation checklist (extremely detailed)

3.1 Project bootstrap and build

  • Create orchestrator/pyproject.toml
    • Define package name (e.g., satsim-orchestrator)
    • Set Python version (>=3.11)
    • Add dependencies:
      • pydantic
      • pyyaml
      • grpcio, grpcio-tools, protobuf
      • rich (optional, for CLI UX)
      • prometheus-client (optional)
      • aiofiles (optional, async file writes)
  • Add dev dependencies:
    • pytest, pytest-asyncio
    • ruff / black
    • mypy (optional)
  • Add task runner and uv commands:
    • uv run pytest
    • uv run ruff check .
    • uv run python -m satsim_orch.cli run <scenario.yaml> ...

3.2 CLI and entrypoints

  • Implement satsim_orch/cli.py with commands:
    • run <scenario.yaml> --mode {omnet|mininet|parallel} --dt 1s --t0 ... --t1 ...
    • validate <scenario.yaml>
    • list-runs
    • show-run <run_id>
  • Implement satsim_orch/main.py
    • Parse CLI args
    • Load scenario
    • Create RunContext
    • Run orchestrator loop
  • Define exit codes and error messages:
    • Invalid config → exit 2
    • Missing dependency/lane binary → exit 3
    • Runtime error → exit 1
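As a sketch, the subcommands and exit-code contract above could be wired with argparse; everything beyond the commands and codes listed in the checklist (flag defaults, constant names) is an assumption, not the final CLI:

```python
# Sketch of satsim_orch/cli.py: subcommand surface plus the exit-code
# contract (2 = invalid config, 3 = missing dependency, 1 = runtime error).
import argparse

EXIT_RUNTIME, EXIT_BAD_CONFIG, EXIT_MISSING_DEP = 1, 2, 3

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="satsim")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run", help="run a scenario")
    run.add_argument("scenario")
    run.add_argument("--mode", choices=["omnet", "mininet", "parallel"], default="mininet")
    run.add_argument("--dt", default="1s")

    validate = sub.add_parser("validate", help="validate a scenario file")
    validate.add_argument("scenario")

    sub.add_parser("list-runs", help="list recorded runs")
    show = sub.add_parser("show-run", help="show one run")
    show.add_argument("run_id")
    return parser
```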

4) Configuration system

4.1 Scenario schema (Pydantic)

  • Implement config/schema.py with a canonical ScenarioConfig
    • Global:
      • name
      • seed
      • time: {t0, t1, dt, mode}
      • execution: {lane_mode, strict_reproducible, record_trace, record_pcap}
      • paths: {artifacts_root}
    • Geo/RF engine connection:
      • geomrf: {grpc_target, request_dt, selector_defaults, thresholds_defaults}
    • Geo/RF scenario payload:
      • geomrf.scenario_spec maps 1:1 to Geo/RF ScenarioSpec required fields (nodes, terminal, orbit/site, link/adaptation policy)
      • optional high-level shorthand may exist, but must compile deterministically to valid ScenarioSpec
    • Lane configs:
      • mininet: {controller: {type, addr}, topo: {...}, shaping: {...}}
      • omnet: {project_path, ini_path, run_args, trace_mode}
  • Add validators:
    • t0 < t1
    • dt > 0
    • seed >= 0
    • lane configs exist for chosen mode
    • fail-fast if engine-required scenario fields are missing/invalid
    • fail-fast if request_dt is outside engine capabilities (min_dt, max_dt)
    • if mode=mininet require Linux + OVS checks (soft validate with warnings)
  • Add defaulting rules in config/defaults.py
    • dt default (e.g., 1s)
    • thresholds default (delay/capacity/loss)
    • artifacts root default ./artifacts/runs
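A dependency-free sketch of these validation rules follows; the real config/schema.py would express them as Pydantic models, and the field names here only mirror the checklist:

```python
# Sketch of the 4.1 validators (t0 < t1, dt > 0, seed >= 0, lane mode known)
# as plain dataclasses; the production schema would be Pydantic.
from dataclasses import dataclass

@dataclass
class TimeConfig:
    t0: float   # scenario start, seconds
    t1: float   # scenario end, seconds
    dt: float   # tick size, seconds
    mode: str   # "offline" | "realtime"

@dataclass
class ScenarioConfig:
    name: str
    seed: int
    time: TimeConfig
    lane_mode: str  # "omnet" | "mininet" | "parallel"

    def validate(self) -> None:
        # fail-fast, mirroring the checklist rules above
        if not self.time.t0 < self.time.t1:
            raise ValueError("t0 must be < t1")
        if self.time.dt <= 0:
            raise ValueError("dt must be > 0")
        if self.seed < 0:
            raise ValueError("seed must be >= 0")
        if self.lane_mode not in {"omnet", "mininet", "parallel"}:
            raise ValueError(f"unknown lane_mode {self.lane_mode!r}")

    @property
    def tick_count(self) -> int:
        # derived field computed during normalization (section 4.2)
        return int((self.time.t1 - self.time.t0) / self.time.dt)
```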

4.2 Loader and normalization

  • Implement config/loader.py
    • load YAML/JSON
    • environment variable expansion (optional)
    • include/merge support (optional)
  • Implement config/normalize.py
    • produce a normalized config (canonical types, timezone normalization)
    • compute derived fields (run duration, tick count)
  • Implement config/normalize.py to build:
    • GeomrfScenarioSpec (engine-facing) from ScenarioConfig
    • LaneScenarioSpec (lane-facing) from ScenarioConfig

5) Run manager and artifacts

5.1 Run context and directory structure

  • Implement runtime/run_manager.py
    • Generate run_id (timestamp + short random, or UUID)
    • Create run directory:
      • artifacts/runs/<run_id>/
      • logs/, metrics/, pcaps/, traces/, manifests/
    • Save copies of:
      • raw scenario file
      • normalized scenario JSON
  • Implement runtime/manifest.py
    • manifest fields:
      • run_id, scenario name, timestamps
      • seeds
      • component versions (orchestrator, geomrf engine, lanes)
      • execution mode, dt, tick count
      • git SHAs if available
      • host info (OS, python version) (optional)
  • Implement runtime/versioning.py
    • orchestrator version string
    • best-effort git SHA discovery

5.2 Logging

  • Implement runtime/logging.py
    • structured JSON logs to file
    • human-readable console logs
    • include run_id and correlation IDs
  • Implement log rotation policy (optional)
  • Implement util/errors.py with typed exceptions:
    • ScenarioError, GeomrfError, LaneError, TimebaseError

5.3 Artifact store helpers

  • Implement runtime/artifact_store.py
    • write_text(path, text)
    • write_json(path, obj)
    • append_jsonl(path, obj)
    • atomic writes (write temp then rename)
  • Implement trace recording option:
    • If record_trace=true, append received LinkDeltaBatch to JSONL/Parquet later
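The atomic-write pattern can be sketched as follows: write to a temp file in the target directory, then `os.replace`, which is atomic on POSIX within one filesystem. Function names match the checklist; everything else is illustrative.

```python
# Sketch of runtime/artifact_store.py helpers: atomic JSON writes plus
# JSONL appends for trace records such as received LinkDeltaBatch objects.
import json
import os
import tempfile
from pathlib import Path

def write_json(path: Path, obj) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(obj, f, indent=2, sort_keys=True)
        os.replace(tmp, path)  # readers never observe a partial file
    except BaseException:
        os.unlink(tmp)
        raise

def append_jsonl(path: Path, obj) -> None:
    # one JSON object per line; sort_keys keeps output deterministic
    with path.open("a") as f:
        f.write(json.dumps(obj, sort_keys=True) + "\n")
```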

6) Timebase and pacing

6.1 Time modes

  • Implement timebase/modes.py enum:
    • OFFLINE (apply incoming ticks as fast as possible; no sleeping)
    • REALTIME (apply incoming ticks at wall-clock pace)
    • PARALLEL (lane selection mode; both lanes consume the same incoming ticks)
  • Implement timebase/clock.py
    • SimulationTime type for formatting/validation of incoming stream ticks
    • conversions and formatting
  • Implement timebase/scheduler.py
    • implement pacing, not tick generation
    • for REALTIME: sleep until expected wall-clock for next received tick
    • for OFFLINE: apply each received tick immediately
  • Add drift handling for REALTIME:
    • if late by > 1 tick, either skip ticks or catch up (configurable)
    • default: never skip control-plane ticks; warn if drift accumulates
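A sketch of the pacing rules above, assuming a `Pacer` class name and monotonic-clock deadlines (both assumptions, not the final API). Note it paces received ticks rather than generating them:

```python
# Sketch of timebase/scheduler.py pacing: OFFLINE applies immediately,
# REALTIME sleeps until the received tick's wall-clock deadline and
# reports drift instead of skipping control-plane ticks.
import asyncio
import time

class Pacer:
    def __init__(self, dt_s: float, realtime: bool, max_drift_ticks: int = 1):
        self.dt_s = dt_s
        self.realtime = realtime
        self.max_drift_ticks = max_drift_ticks
        self._start = None  # wall-clock anchor, set on first tick

    async def wait_for(self, tick_index: int) -> float:
        """Return drift in seconds (positive = late); sleeps in REALTIME mode."""
        if not self.realtime:
            return 0.0
        now = time.monotonic()
        if self._start is None:
            self._start = now
        deadline = self._start + tick_index * self.dt_s
        if now < deadline:
            await asyncio.sleep(deadline - now)
            return 0.0
        drift = now - deadline
        if drift > self.max_drift_ticks * self.dt_s:
            # default policy: never skip ticks, only warn as drift accumulates
            print(f"warning: tick {tick_index} late by {drift:.3f}s")
        return drift
```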

7) Internal bus and message contracts

7.1 Canonical internal messages

  • Implement bus/messages.py dataclasses:
    • TickUpdate:
      • run_id, scenario_ref
      • tick_index, time
      • link_updates: list
      • link_removals: list
      • events: list
      • stats: compute timing, counts
    • RunControl messages:
      • start/pause/resume/stop
    • LaneStatus messages:
      • ready/running/error/stopped
  • Implement bus/queues.py
    • bounded asyncio queues
    • per-lane queue limits (configurable)
  • Implement bus/fanout.py
    • one producer (Geo/RF stream consumer)
    • N consumers (lane adapters + recorder)
    • backpressure policy:
      • default: block producer when any lane queue is full (strict sync)
      • option: drop trace recorder only (never drop lane updates)
  • Add message ordering rules:
    • tick updates delivered in increasing tick_index
    • within a tick: removals applied before updates by consumers (documented)
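The fanout and backpressure policy above can be sketched with bounded asyncio queues (class and method names are illustrative):

```python
# Sketch of bus/fanout.py: one producer, N bounded per-lane queues.
# publish() awaits on a full lane queue (strict sync backpressure), while a
# queue marked droppable (the trace recorder) is skipped instead of stalling.
import asyncio

class Fanout:
    def __init__(self):
        self._queues: dict[str, asyncio.Queue] = {}
        self._droppable: set[str] = set()

    def subscribe(self, name: str, maxsize: int = 16, droppable: bool = False) -> asyncio.Queue:
        q = asyncio.Queue(maxsize=maxsize)
        self._queues[name] = q
        if droppable:
            self._droppable.add(name)
        return q

    async def publish(self, tick) -> None:
        for name, q in self._queues.items():
            if name in self._droppable and q.full():
                continue        # drop for the recorder only, never for lanes
            await q.put(tick)   # blocks producer when a lane queue is full
```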

8) Geo/RF engine client integration

8.1 gRPC client

  • Implement geomrf/client.py
    • gRPC channel creation (grpc.aio.insecure_channel(target))
    • stub creation from generated proto
    • get_version(), get_capabilities()
    • create_scenario(scenario_spec) -> scenario_ref
    • close_scenario(scenario_ref)
    • stream_link_deltas(request) -> async iterator
    • stream_events(request) -> async iterator
  • Implement geomrf/health.py
    • connect + health check on startup
    • gate event-consumer features on engine schema/version support
  • Implement geomrf/translate.py
    • translate orchestrator ScenarioConfig to Geo/RF ScenarioSpec (engine-facing)
    • enforce deterministic key ordering where needed for reproducible payloads
    • validate all required proto fields before RPC call; reject locally on mismatch
    • translate Geo/RF LinkDeltaBatch into internal TickUpdate
  • Implement robust error mapping:
    • map NOT_FOUND, INVALID_ARGUMENT, FAILED_PRECONDITION, RESOURCE_EXHAUSTED to typed GeomrfError
    • define retry policy for UNAVAILABLE/DEADLINE_EXCEEDED (bounded retries + backoff)
    • include scenario_ref and tick_index in error logs
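A dependency-free sketch of the retry/fail split follows. A real geomrf/client.py would catch `grpc.aio.AioRpcError` and inspect `e.code()`; here the status arrives as its canonical name string so the policy itself stays testable:

```python
# Sketch of the 8.1 error policy: fatal engine statuses become typed
# GeomrfError immediately; transient statuses get bounded retries + backoff.
import asyncio

class GeomrfError(Exception):
    def __init__(self, status: str, detail: str = "", scenario_ref=None, tick_index=None):
        # scenario_ref and tick_index are carried for error logs, per checklist
        super().__init__(f"{status}: {detail} (scenario={scenario_ref}, tick={tick_index})")
        self.status = status

FATAL = {"NOT_FOUND", "INVALID_ARGUMENT", "FAILED_PRECONDITION", "RESOURCE_EXHAUSTED"}
RETRYABLE = {"UNAVAILABLE", "DEADLINE_EXCEEDED"}

async def with_retries(call, retries: int = 3, base_delay: float = 0.01):
    """Retry only transient statuses, with exponential backoff; fail fast otherwise."""
    for attempt in range(retries + 1):
        try:
            return await call()
        except GeomrfError as e:
            if e.status not in RETRYABLE or attempt == retries:
                raise
            await asyncio.sleep(base_delay * 2 ** attempt)
```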

8.2 Stream consumption tasks

  • Implement geomrf stream consumer coroutine:
    • starts StreamLinkDeltas with emit_full_snapshot_first=true
    • reads batches and pushes TickUpdate to bus producer
  • Implement event stream consumer coroutine (optional in v1):
    • call StreamEvents with same t_start/t_end/dt/selector used for deltas
    • consume EngineEvent.tick_index directly (no nearest-tick heuristics)
    • record events to trace/metrics channel for observability
    • if connected engine does not support aligned event schema, disable event consumer and warn once
  • Merge/control strategy:
    • lane control path uses TickUpdate from StreamLinkDeltas only
    • event stream is informational and must not mutate lane state

9) Lane adapter architecture

9.1 Adapter base contract

  • Implement lanes/base.py:
    • class LaneAdapter(Protocol) or ABC with:
      • name: str
      • async prepare(run_context, scenario_config) -> None
      • async apply_tick(tick: TickUpdate) -> None
      • async finalize(run_context) -> None
      • async health() -> dict (optional)
  • Implement lanes/registry.py
    • register adapters by name
    • instantiate chosen adapters based on lane_mode
  • Implement tests/test_lane_adapter_contract.py for interface compliance
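The adapter contract and registry might look like the sketch below; Protocol vs ABC is the checklist's own open choice, and the registry helper names are assumptions:

```python
# Sketch of lanes/base.py + lanes/registry.py: the async adapter contract
# from 9.1 as a Protocol, and a registry keyed by lane name.
from typing import Protocol, runtime_checkable

@runtime_checkable
class LaneAdapter(Protocol):
    name: str
    async def prepare(self, run_context, scenario_config) -> None: ...
    async def apply_tick(self, tick) -> None: ...
    async def finalize(self, run_context) -> None: ...

_REGISTRY: dict[str, type] = {}

def register(cls: type) -> type:
    """Class decorator: register an adapter under its `name` attribute."""
    _REGISTRY[cls.name] = cls
    return cls

def adapters_for(lane_mode: str) -> list[type]:
    # parallel mode instantiates both lanes; otherwise exactly one
    names = ["mininet", "omnet"] if lane_mode == "parallel" else [lane_mode]
    return [_REGISTRY[n] for n in names]
```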

9.2 Mininet lane adapter (detailed tasks)

  • Implement lanes/mininet_lane/adapter.py

    • prepare():
      • validate Linux prereqs (ovs-vsctl, tc, ip)
      • start controller (ONOS/Ryu) if configured
      • create Mininet topology (delegate to topo.py)
      • start Mininet network
      • start PCAP capture if enabled (delegate to capture.py)
    • apply_tick():
      • apply removals (links down) first
      • apply updates:
        • for each link: set up/down state
        • apply delay/loss/rate using shaping module
    • finalize():
      • stop captures
      • stop Mininet
      • stop controller if orchestrator started it
  • Implement lanes/mininet_lane/topo.py

    • create a Mininet graph from ScenarioConfig node roles
    • map SatSim node IDs to Mininet host/switch names
    • define OVS switches and host attachments
    • decide representation:
      • v1 recommended: represent satellites as OVS switches; GS/UT as hosts
      • allow optional SAT as hosts if needed
    • create links but keep them initially “neutral” (shaping applied per tick)
  • Implement lanes/mininet_lane/shaping.py

    • provide functions:
      • set_link_up(link_id) / set_link_down(link_id)
      • apply_netem(link_id, delay_ms, loss_pct)
      • apply_rate(link_id, rate_mbps)
      • clear_shaping(link_id)
    • implement using:
      • tc qdisc replace dev <if> root netem delay ... loss ...
      • tc qdisc ... tbf/htb for rate
    • ensure idempotency (repeated calls safe)
    • log every applied shaping change with tick_index
  • Implement lanes/mininet_lane/controller.py

    • support controller options:
      • external controller address (already running)
      • orchestrator-launched controller container/process (optional v1)
    • store controller version info in manifest
  • Implement lanes/mininet_lane/capture.py

    • start tcpdump for relevant interfaces
    • rotate PCAP per time or per run (v1: one PCAP per run)
    • store PCAP path in manifest
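The shaping functions in 9.2 can be sketched as argv builders separated from execution, which keeps the tc invocations unit-testable; interface names like `s1-eth1` are hypothetical Mininet names, and combining netem delay with a tbf/htb rate on the same device needs a qdisc hierarchy that this sketch omits:

```python
# Sketch of lanes/mininet_lane/shaping.py: build tc command argv lists;
# in the adapter, `run` would be subprocess.run with root privileges.
def netem_cmd(ifname: str, delay_ms: float, loss_pct: float) -> list[str]:
    # `replace` rather than `add` keeps repeated per-tick calls idempotent
    return ["tc", "qdisc", "replace", "dev", ifname, "root", "netem",
            "delay", f"{delay_ms:g}ms", "loss", f"{loss_pct:g}%"]

def tbf_cmd(ifname: str, rate_mbps: float) -> list[str]:
    return ["tc", "qdisc", "replace", "dev", ifname, "root", "tbf",
            "rate", f"{rate_mbps:g}mbit", "burst", "32kbit", "latency", "400ms"]

def clear_cmd(ifname: str) -> list[str]:
    return ["tc", "qdisc", "del", "dev", ifname, "root"]

def apply_shaping(run, ifname: str, delay_ms: float, loss_pct: float, tick_index: int):
    run(netem_cmd(ifname, delay_ms, loss_pct))
    # every shaping change is logged with its tick_index, per checklist
    print(f"tick={tick_index} {ifname}: delay={delay_ms}ms loss={loss_pct}%")
```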

9.3 OMNeT lane adapter (trace-first v1)

  • Implement lanes/omnet_lane/adapter.py
    • v1 assumption: OMNeT consumes a trace file (offline) rather than live streaming
  • Implement lanes/omnet_lane/trace_ingest.py
    • orchestrator writes a LinkState trace file suitable for OMNeT adapter
    • define a simple trace format:
      • JSONL per tick containing updates/removals
      • or CSV-like with (tick, src, dst, up, delay, rate, loss)
    • ensure deterministic ordering of entries
  • Implement lanes/omnet_lane/runner.py
    • launch OMNeT simulation via subprocess:
      • capture stdout/stderr to run logs
      • exit code handling
    • place outputs into artifacts directory
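The JSONL trace format above might be sketched like this, with deterministic ordering so two identical runs produce byte-identical traces (field names are illustrative, not a fixed schema):

```python
# Sketch of lanes/omnet_lane/trace_ingest.py output: one JSON object per
# tick, removals and updates sorted, keys sorted, so output is reproducible.
import json

def tick_record(tick_index: int, time_s: float, updates, removals) -> str:
    rec = {
        "tick": tick_index,
        "time_s": time_s,
        "removals": sorted(removals),  # e.g. ["sat2->gs1", ...]
        "updates": sorted(updates, key=lambda u: (u["src"], u["dst"])),
    }
    return json.dumps(rec, sort_keys=True)

def write_trace(path, ticks) -> None:
    # ticks: iterable of (tick_index, time_s, updates, removals)
    with open(path, "w") as f:
        for t in ticks:
            f.write(tick_record(*t) + "\n")
```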

10) Orchestrator main runtime loop

10.1 Lifecycle coordination

  • Implement satsim_orch/main.py orchestration steps:
    • Create run context + artifact directories
    • Log environment + versions
    • Initialize Geo/RF client and fetch version/capabilities
    • Create Geo/RF scenario
    • Instantiate chosen lane adapters (mininet/omnet/parallel)
    • Call prepare() for each lane
    • Start stream consumer tasks
    • Start realtime pacing task only when time.mode=REALTIME
    • Await completion conditions:
      • reached t_end
      • user stop signal (CTRL+C)
      • error in any task
    • Finalize lanes
    • Close Geo/RF scenario
    • Write final manifest + summary

10.2 Streaming-driven execution (locked)

  • Geo/RF stream is the authoritative tick source.
  • Orchestrator does not generate ticks; it consumes them and fans out.

Tasks:

  • In streaming consumer, for each LinkDeltaBatch:
    • translate to TickUpdate
    • push to fanout bus

10.3 Fanout to lanes

  • For each lane, run a consumer task:
    • while True: tick = await queue.get(); await lane.apply_tick(tick)
    • handle cancellation and lane errors
  • Implement strict ordering:
    • do not allow lane to process tick k+1 before tick k
  • Implement shutdown handshake:
    • send RunControl(STOP) to lanes on exit
    • drain queues if configured
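The per-lane consumer task with strict ordering might be sketched as follows (using `None` as a shutdown sentinel is an assumption; the real loop would also handle RunControl messages):

```python
# Sketch of the 10.3 consumer: one task per lane drains that lane's queue in
# order, so tick k+1 is never applied before tick k.
import asyncio

async def lane_consumer(lane, queue: asyncio.Queue) -> None:
    expected = None
    while True:
        tick = await queue.get()
        if tick is None:  # shutdown sentinel pushed during finalize
            break
        if expected is not None and tick.tick_index != expected:
            raise RuntimeError(
                f"out-of-order tick {tick.tick_index}, expected {expected}")
        await lane.apply_tick(tick)
        expected = tick.tick_index + 1
        queue.task_done()
```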

11) Error handling and shutdown

11.1 Exception strategy

  • Any uncaught exception in:

    • the Geo/RF stream consumer
    • any lane consumer
    • any lane adapter method

    triggers a coordinated shutdown.
  • Implement runtime/process.py:

    • subprocess management with kill/terminate escalation
    • collect exit codes and stderr tails
  • Add SIGINT/SIGTERM handling:

    • first CTRL+C: graceful stop
    • second CTRL+C: immediate stop
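The two-stage CTRL+C escalation can be isolated in a small policy object so it is testable without a live event loop (names assumed); `main()` would attach it via `loop.add_signal_handler(signal.SIGINT, policy.handle)`:

```python
# Sketch of the signal policy: first signal requests a graceful stop,
# any further signal forces an immediate exit.
import asyncio

class StopPolicy:
    def __init__(self, stop_event: asyncio.Event):
        self.stop_event = stop_event
        self.count = 0

    def handle(self) -> None:
        self.count += 1
        if self.count == 1:
            self.stop_event.set()  # let the run loop finalize lanes cleanly
        else:
            raise SystemExit(1)    # operator insisted: stop immediately
```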

11.2 Cleanup correctness

  • Always attempt, even when errors occur:
    • finalize() on all lanes
    • close_scenario() on the Geo/RF engine
  • Write final manifest including failure reason.

12) Metrics and run summaries

12.1 Metric recording

  • Implement metrics/records.py
    • standard metric record format for:
      • tick compute time
      • links emitted
      • lane apply times (optional)
  • Implement metrics/exporters.py
    • JSONL writer to metrics/
    • optional Prometheus exporter
  • Implement per-tick timing:
    • time spent translating batches
    • time spent applying to each lane

12.2 Summary report (v1)

  • Write a summary.json at end of run:
    • total ticks, total links emitted, mean compute time, runtime duration
    • lane success/failure states
    • artifact paths (pcaps, traces, logs)

13) Integration tests (practical, not huge)

13.1 Smoke tests

  • test_geomrf_client_smoke.py
    • connect to Geo/RF engine on localhost
    • create a tiny scenario (1 GS + 1 SAT)
    • stream first 3 ticks and assert non-empty output
  • test_event_alignment_smoke.py
    • request deltas/events with identical t_start/t_end/dt/selector
    • assert each event has tick_index and maps to existing/expected delta tick

13.2 Bus correctness

  • test_bus_fanout.py
    • ensure ticks delivered to all lanes in order
    • ensure backpressure blocks producer when lane queue is full

13.3 Run manifest correctness

  • test_run_manifest.py
    • run manager writes expected keys
    • manifest includes versions and config snapshot

14) Minimum viable orchestrator (v1) — acceptance criteria

  • Can run satsim run scenario.yaml --mode mininet

    • Geo/RF scenario created
    • Link deltas streamed and applied via tc/netem
    • PCAP recorded (optional)
    • run artifacts written (logs, manifest)
  • Can run satsim run scenario.yaml --mode omnet

    • Geo/RF stream recorded to trace
    • OMNeT launched consuming trace (or stubbed with clear TODO if not ready)
    • run artifacts written
  • Can run satsim run scenario.yaml --mode parallel

    • both lane adapters receive identical tick updates
    • lane adapters derive control only from link deltas
    • optional events are captured in artifacts without driving lane state
    • orchestrator shuts down cleanly on completion or CTRL+C

15) Optional but valuable v1.1 tasks (safe additions)

  • Orchestrator exposes its own gRPC stream StreamTickUpdates so lanes can subscribe remotely
  • Add NATS internal bus option for multi-process fanout
  • Add replay command: satsim replay <run_id> (use stored trace)
  • Add sweep runner: parameter grid search with repeated runs and consolidated summary

TASKS_TESTSUITE_GEOENGINE.md

Geometry/RF Engine Test Suite Plan

This checklist tracks the work to build and verify a comprehensive RPC-focused test suite for geomrf-engine.

0) Deliverables

  • Add a dedicated gRPC service test module that exercises all six RPCs.
  • Validate success + error-path behavior for lifecycle and streaming RPCs.
  • Produce an updated coverage report and capture gaps.
  • Keep this checklist updated as tasks are completed.

1) Baseline and scope

  • Confirm current tests/coverage baseline before adding new RPC tests.
  • Confirm test scenario strategy (deterministic helper scenario; compatible with 027 overhead-pass style TLE + GS setup).

2) Test infrastructure

  • Add an in-process gRPC test harness (ephemeral port, async channel/stub, clean teardown).
  • Add shared helpers for creating/closing scenarios from tests.

3) RPC lifecycle tests

  • GetVersion returns expected identity/schema metadata.
  • GetCapabilities returns expected limits and feature flags.
  • CreateScenario success path returns scenario_ref.
  • CreateScenario invalid spec path returns INVALID_ARGUMENT.
  • CloseScenario success path returns ok=true.
  • CloseScenario unknown scenario path returns NOT_FOUND.

4) Streaming RPC tests

  • StreamLinkDeltas success path returns ordered batches with snapshot metadata.
  • StreamLinkDeltas unknown scenario path returns NOT_FOUND.
  • StreamLinkDeltas closed scenario path returns FAILED_PRECONDITION.
  • StreamLinkDeltas invalid time parameters return INVALID_ARGUMENT.
  • StreamEvents success path returns well-formed events for the test scenario.
  • StreamEvents filtered path validates event filtering behavior.
  • StreamEvents unknown scenario path returns NOT_FOUND.
  • StreamEvents closed scenario path returns FAILED_PRECONDITION.

5) Execution and coverage

  • Run full test suite and ensure all tests pass.
  • Run coverage scoped to geomrf_engine.
  • Verify server.py and stream/event modules are covered by tests.
  • Document final coverage numbers and remaining gaps.

6) Results summary

  • Test count: 20 passed.
  • Coverage total (geomrf_engine): 85% (820 statements, 120 missed).
  • Core RPC implementation coverage: server.py at 79%, streaming/events.py at 96%, streaming/backpressure.py at 80%, util/logging.py at 92%.
  • Remaining notable gaps captured for follow-up: evaluator branch coverage (56%) and delta-threshold branch coverage (71%).

7) v1.1 follow-up (event alignment)

  • Add StreamEvents alignment tests for request dt semantics (default_dt fallback + invalid-range rejection).
  • Add selector-parity tests ensuring event selection mirrors StreamLinkDeltas selection.
  • Add assertions that every emitted EngineEvent carries tick_index.
  • Add cross-stream alignment test: same window/dt/selector for events+deltas yields consistent tick mapping.
  • Extend error-path coverage for new event request fields.