From 59ebfc2626d09a1d9a26ebe4609c09481face605 Mon Sep 17 00:00:00 2001 From: Stan Grams Date: Sun, 29 Mar 2026 12:15:00 +0200 Subject: [PATCH] =?UTF-8?q?[docs](trx-rs):=20refresh=20improvement=20areas?= =?UTF-8?q?=20=E2=80=94=20remove=20resolved,=20add=20new=20findings?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Remove all completed P0/P1/P2 items and quick wins. Add new findings from codebase scan: auth.rs mutex poisoning, TCP listener rate limiting, RigState struct decomposition, spawn_blocking timeout, unsafe string construction, dead_code annotations, expanded test coverage gaps, and naming inconsistencies. Co-Authored-By: Claude Opus 4.6 (1M context) Signed-off-by: Stan Grams --- CLAUDE.md | 42 +++--- docs/Improvement-Areas.md | 269 +++++++++++++++++++------------------- 2 files changed, 154 insertions(+), 157 deletions(-) diff --git a/CLAUDE.md b/CLAUDE.md index fb6fc51..8b8d1bf 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -107,7 +107,7 @@ Types: `feat`, `fix`, `docs`, `style`, `refactor`, `test`, `chore`. Use `(trx-rs Full architecture documentation: `docs/Architecture.md` Improvement plan: `docs/Improvement-Areas.md` -*Last reviewed: 2026-03-28* +*Last reviewed: 2026-03-29* ### Strengths @@ -124,26 +124,26 @@ Improvement plan: `docs/Improvement-Areas.md` ### Areas for Improvement -**P1 — High (all resolved):** -- ✅ **FrontendRuntimeContext** decomposed into 9 sub-structs (AudioContext, DecodeHistoryContext, HttpAuthConfig, etc.) -- ✅ **Rig task command batching** fixed to FIFO order -- ✅ **Decoder history bounded** at 10,000 entries per queue -- ✅ **ExponentialBackoff jitter** ±25% randomized -- ✅ **Rig task crash recovery** emits Error state to clients -- ✅ **Sync RwLock in async** migrated to tokio::sync::RwLock where appropriate -- ✅ **Audio pipeline helpers** extracted from run_capture/run_playback +**P0 — Critical:** +- **Plugin signing**: No SHA-256 checksum verification, no per-plugin scoping, no Windows validation. -**P2 — Medium (all resolved):** -- ✅ **SoapySdrRig** uses `SoapySdrConfig` builder struct -- ✅ **Command enum mapping** uses `define_command_mappings!` macro -- ✅ **Lock poison recovery** now logs warnings via `lock_or_recover()` helper -- ✅ **Timeouts configurable** via `[timeouts]` TOML section -- ✅ **Config shared** types extracted to `trx-app/src/shared_config.rs` +**P1 — High:** +- **Session store unwrap()**: 7 `.write().unwrap()` in `auth.rs` — mutex poisoning crashes server. +- **No TCP listener rate limiting**: Raw protocol listener accepts unlimited connections per IP. +- **RigState 33-field flat struct**: Decoder bools/reset seqs should be sub-structs. Cloned on every broadcast. +- **No `spawn_blocking` timeout**: Satellite pass computation in `listener.rs` can hang indefinitely. -**P3 — Low (remaining):** -- **Command handler boilerplate**: 11 `RigCommandHandler` impls follow identical patterns across 500+ lines. Macro opportunity. -- **No integration tests** for `rig_task.rs` (1,315 LOC) or `audio.rs` (3,977 LOC) — the two largest server modules. -- **No command execution timeouts** at the `CommandExecutor` level. Backend stalls propagate up. -- **FT-817 VFO inference fragile**: Fails when VFO A and B share the same frequency. -- **VDES decoder incomplete**: Turbo FEC and link-layer parsing not implemented. +**P2 — Medium:** +- **Command handler boilerplate**: 11 `RigCommandHandler` impls across 500+ lines. Macro opportunity. +- **No command execution timeouts** at `CommandExecutor` level. Backend stalls propagate up. - **No protocol versioning**: Unknown commands cause parse errors with no graceful degradation. +- **Unsafe string in spectrum encoding**: `from_utf8_unchecked` in `api.rs:63`. +- **6 `#[allow(dead_code)]`** annotations to review/clean up. + +**P3 — Low:** +- **Missing tests**: `audio.rs` (3,812 LOC), `api.rs` (2,711 LOC), `main.rs` (1,203 LOC) have 0 tests. +- **FT-817 VFO inference fragile**: Fails when VFO A and B share the same frequency. +- **VDES decoder incomplete**: Turbo FEC, CRC, and link-layer parsing not implemented. +- **Plugin system lacks versioning**: No API version or reload semantics. +- **Configurator detection stubbed**: `detect.rs` TODO for `serialport::available_ports()`. +- **Inconsistent naming**: `freq_hz`/`frequency`/`center_hz`; `rig_id`/`id`; `model`/`rig_model`. diff --git a/docs/Improvement-Areas.md b/docs/Improvement-Areas.md index e46d909..5d5684c 100644 --- a/docs/Improvement-Areas.md +++ b/docs/Improvement-Areas.md @@ -4,54 +4,15 @@ A comprehensive audit of the trx-rs codebase covering code quality, architecture security, testing, and performance. Each item includes the affected location and a suggested fix. -*Last updated: 2026-03-28* +*Last updated: 2026-03-29* --- -## Resolved +## Critical (P0) -The following items have been fixed across PRs #58, #59, and #60: +### Plugin signing and cross-platform validation -### Quick Wins (all complete) -- ✅ **Session cleanup timer** — 5-minute periodic `cleanup_expired()` task -- ✅ **`DecodeHistory` type alias** — replaces 9 repeated `Arc>>` patterns -- ✅ **`mode_to_string()` allocation** — returns `Cow<'static, str>` (zero-alloc for known modes) -- ✅ **FTx dedup** — `HashSet` for O(1) lookups -- ✅ **Unbounded channels** — `VChanAudioCmd` channels bounded at 256 -- ✅ **JSON serialization** — `#[serde(flatten)]` wrapper replaces string-level splice -- ✅ **`AtomicUsize` counter** — `estimated_total_count()` avoids 9 mutex acquisitions -- ✅ **Cookie security warning** — startup warning when `cookie_secure` is false -- ✅ **Spectrum encoding** — pre-allocated output string replaces `format!` overhead -- ✅ **`pub(crate)` state data** — `ReadyStateData`/`TransmittingStateData` fields restricted with constructors + getters -- ✅ **Lock ordering docs** — module-level documentation in `` establishing `rigs → sessions → audio_cmd` - -### Critical (P0) -- ✅ **Plugin loading validation** — rejects world-writable files on Unix; `TRX_PLUGINS_DISABLED` env var -- ✅ **Audio pipeline mutex panics** — all `.expect()` on history mutexes and `.unwrap()` on audio ring buffers replaced with `.unwrap_or_else(|e| e.into_inner())` poison recovery -- ✅ **vchan lock panics** — ~25 `.unwrap()` on RwLock/Mutex replaced with poison recovery - -### High (P1) -- ✅ **RigCat trait split** — 13 SDR-specific methods extracted into `RigSdr` extension trait; `RigCat` retains core CAT ops + `as_sdr()`/`as_sdr_ref()`; SoapySdrRig implements both; FT-817/FT-450D/DummyRig unchanged -- ✅ **Decoder history contention** — `AtomicUsize` total counter maintained by record/prune/clear - -### Medium (P2) -- ✅ **Silent state machine failures** — debug-level tracing for rejected transitions -- ✅ **User input in logs** — raw JSON truncated to 128 chars -- ✅ **Rate limiting** — per-IP `LoginRateLimiter` (10 attempts/60s) on `/auth/login` -- ✅ **Lock-holding serialization** — clone data out under lock, serialize after release -- ✅ **Overly-public API** — state data fields `pub(crate)` with controlled accessors -- ✅ **Cookie security flag** — startup warning for non-TLS deployments -- ✅ **Lock ordering** — documented in `` module header - ---- - -## Remaining Issues - -### Critical (P0) - -#### Plugin signing and cross-platform validation - -**Location:** `src/trx-app/src/` +**Location:** `src/trx-app/src/plugins.rs` Current protections: file permission checks (Unix), `TRX_PLUGINS_DISABLED` env var, loaded plugins logged at startup. @@ -69,54 +30,145 @@ loaded plugins logged at startup. --- -### High Priority (P1) +## High Priority (P1) -#### Synchronous locks in async contexts +### Session store mutex poisoning (auth.rs) -**Location:** `src/trx-client/trx-frontend/trx-frontend-http/src/`, -`src/trx-client/trx-frontend/trx-frontend-http/src/` +**Location:** `src/trx-client/trx-frontend/trx-frontend-http/src/auth.rs` (lines 89, +96, 116, 124, 151, 158, 165) -`std::sync::RwLock` is used inside async tasks. Current code is safe (no locks held -across await points), but not idiomatic. Migrating to `tokio::sync::RwLock` would -prevent future regressions. +7 `.write().unwrap()` / `.lock().unwrap()` calls on the session `RwLock`. +If a panic occurs while holding the lock, all subsequent auth operations will panic +and crash the server. -#### Large functions in audio pipeline +**Fix:** Use `lock_or_recover()` helper (already used elsewhere in the codebase) or +`write().unwrap_or_else(|e| e.into_inner())` with warning logs. + +### No rate limiting on TCP listener + +**Location:** `src/trx-server/src/listener.rs` + +The TCP listener accepts connections without per-IP rate limiting. The HTTP frontend +has rate limiting on `/auth/login`, but the raw protocol listener does not. Potential +for connection exhaustion. + +**Fix:** Add per-IP connection rate limiting (similar to `LoginRateLimiter` in auth). + +### RigState is a 33-field flat struct + +**Location:** `src/trx-core/src/rig/state.rs` (lines 13–84) + +33 fields including 8 `*_decode_enabled` bools and 8 `*_decode_reset_seq` counters +that follow identical patterns. Cloned frequently via `watch` channel broadcasts. + +**Fix:** Group decoder fields into a `DecoderConfig` sub-struct and reset sequences +into a `DecoderResetSeqs` sub-struct. Reduces clone cost and makes decoder-related +changes self-contained. + +### No timeout on `spawn_blocking` in listener + +**Location:** `src/trx-server/src/listener.rs:351` + +`tokio::task::spawn_blocking()` for satellite pass computation has no timeout. If +SGP4 propagation hangs, it consumes a thread pool slot indefinitely. + +**Fix:** Wrap in `tokio::time::timeout()`. + +--- + +## Medium Priority (P2) + +### Command handler boilerplate + +**Location:** `src/trx-core/src/rig/controller/handlers.rs` (lines 145–659) + +11 `RigCommandHandler` implementations follow identical patterns across 500+ lines: +validate state → call executor method → return result. Differences are limited to +which executor method is called and which state preconditions are checked. + +**Fix:** Declarative macro that generates implementations from a table of +(command, executor_method, preconditions) tuples. Would reduce ~500 lines to ~100. + +### No command execution timeouts at CommandExecutor level + +**Location:** `src/trx-server/src/rig_task.rs` + +`command_exec_timeout` is defined in `RigTaskConfig` but there is no evidence of +`tokio::time::timeout()` wrapping individual executor calls. A stuck backend command +blocks the rig task indefinitely. + +**Fix:** Wrap each `executor.method().await` call in `timeout(config.command_exec_timeout, ...)`. + +### No forward compatibility in protocol + +**Location:** `src/trx-protocol/src/codec.rs` + +Unknown commands cause parse errors. No `protocol_version` field in the envelope. +Older clients cannot gracefully degrade when connecting to newer servers. + +**Fix:** Add optional `protocol_version` to `ClientEnvelope`. Unknown commands +should return an error response rather than a parse failure. + +### `unsafe` string construction in spectrum encoding + +**Location:** `src/trx-client/trx-frontend/trx-frontend-http/src/api.rs:63` + +`unsafe { String::from_utf8_unchecked(out) }` builds a base64 string from bytes. +The safety comment claims ASCII-only output, which is correct for the current +implementation, but a future edit could break the invariant silently. + +**Fix:** Use `String::from_utf8(out).expect("base64 is ASCII")` (negligible +performance difference on short spectrum strings) or use the `base64` crate. + +### 6 `#[allow(dead_code)]` annotations **Locations:** -- `src/trx-server/src/` — `run_capture()` (~200 lines), - `run_playback()` (~217 lines) +- `src/trx-client/trx-frontend/trx-frontend-http/src/auth.rs:652` +- `src/trx-client/src/config.rs:266` +- `src/trx-server/trx-backend/trx-backend-soapysdr/src/vchan_impl.rs:66, 87` +- `src/trx-server/trx-backend/trx-backend-soapysdr/src/demod.rs:113` +- `src/trx-server/trx-backend/trx-backend-soapysdr/src/real_iq_source.rs:20` -These contain nested loops, device re-enumeration logic, and stream error handling -that should be extracted into focused helper functions. +**Fix:** Review each — remove dead code or remove the annotation if the code is +reachable via feature gates. --- -### Medium Priority (P2) +## Low Priority (P3) -#### Configuration duplication +### Missing tests for critical modules -**Location:** `src/trx-server/src/` (1512 lines), -`src/trx-client/src/` (1181 lines) +Zero `#[test]` functions in: +- `src/trx-server/src/audio.rs` (3,812 lines) — decoder instantiation, audio streaming, history +- `src/trx-client/trx-frontend/trx-frontend-http/src/api.rs` (2,711 lines) — HTTP endpoints, SSE, spectrum encoding +- `src/trx-server/src/main.rs` (1,203 lines) — multi-rig setup, initialization +- `src/trx-server/src/history_store.rs` (193 lines) — persistence, timestamp conversion -14 config structs each, many mirrored between server and client. Extract shared -definitions (GeneralConfig, RigConfig, defaults) into `trx-app`. +`rig_task.rs` (1,316 lines) has 4 tests but no integration tests for command +timeout handling, polling recovery, or error state transitions. ---- +Serial backends (FT-817, FT-450D) and plugin loading have no test coverage. -### Low Priority (P3) +### FT-817 VFO state inference is fragile -#### Missing tests for critical paths +**Location:** `src/trx-server/trx-backend/trx-backend-ft817/src/lib.rs` -Serial backends (FT-817, FT-450D), plugin loading/discovery, and the audio -pipeline (Opus encode/decode) have no or minimal test coverage. +VFO state starts as `Unknown` and is inferred by matching frequencies against +cached values. When VFO A and B share the same frequency, inference fails. -Core crates (`trx-core`, `trx-server`, `trx-client`, `trx-app`) have limited -`[dev-dependencies]` and use only inline `#[test]` functions. Adding test -utilities (mock serial ports, test fixtures) would improve coverage. +**Fix:** Detect firmware version and use direct VFO query when available. -#### Plugin system lacks versioning and lifecycle +### VDES decoder has incomplete FEC -**Location:** `src/trx-app/src/` +**Location:** `src/decoders/trx-vdes/src/lib.rs` + +Burst detection and pi/4-QPSK demodulation work, but Turbo FEC (1/2 rate) and +link-layer (M.2092-1) parsing are not implemented. CRC validation is stubbed +(`crc_ok: false`). Output limited to raw symbols. + +### Plugin system lacks versioning and lifecycle + +**Location:** `src/trx-app/src/plugins.rs` No plugin API version, capability manifest, or unload/reload semantics. Old plugins break silently on API changes. @@ -124,73 +176,18 @@ plugins break silently on API changes. **Fix:** Add a version field to the registration struct and reject incompatible plugins at load time. ---- +### Configurator serial detection is stubbed -## New Findings (2026-03-28 Deep Review) — All Resolved +**Location:** `src/trx-configurator/src/detect.rs:8` -### High Priority (P1) — All Complete +Contains `TODO: use serialport::available_ports() for real detection`. The +interactive setup wizard cannot auto-detect connected rigs. -- ✅ **Rig task command batching LIFO** — replaced `batch.pop()` with `batch.remove(0)` for FIFO order -- ✅ **FrontendRuntimeContext god-struct** — decomposed ~50 flat fields into 9 coherent sub-structs (`AudioContext`, `DecodeHistoryContext`, `HttpAuthConfig`, `HttpUiConfig`, `RigRoutingContext`, `OwnerInfo`, `VChanContext`, `SpectrumContext`, `PerRigAudioContext`); all 7 consumer files updated -- ✅ **Decoder history unbounded** — added `MAX_HISTORY_ENTRIES` (10,000) cap with `enforce_capacity()` eviction independent of time-based pruning -- ✅ **ExponentialBackoff no jitter** — added ±25% randomized jitter via `apply_jitter()` helper to prevent thundering herd on reconnect -- ✅ **No rig task crash recovery** — rig tasks now detect errors and emit `RigMachineState::Error` on the watch channel so clients see the failure -- ✅ **Synchronous locks in async contexts** — migrated `std::sync::RwLock` to `tokio::sync::RwLock` in `background_decode.rs`; `vchan.rs` left as-is (all methods are synchronous, no locks held across await points) -- ✅ **Large audio pipeline functions** — extracted `find_input_device()` and `find_output_device()` helpers from `run_capture()` and `run_playback()` +### Inconsistent frequency/rig naming across crates -### Medium Priority (P2) — All Complete +Field naming is inconsistent across the codebase: +- `freq_hz` vs `frequency` vs `center_hz` (audio.rs, api.rs, config.rs) +- `rig_id` vs `id` (RigInstanceConfig vs RigState) +- `model` vs `rig_model` (RigConfig vs RigTaskConfig) -- ✅ **SoapySdrRig 20-parameter constructor** — introduced `SoapySdrConfig` struct with named fields and defaults; `new_from_config()` replaces positional parameters; old `new_with_config()` preserved as backward-compatible wrapper -- ✅ **Dual command enums** — added `define_command_mappings!` macro in `mapping.rs` that generates both `client_command_to_rig()` and `rig_command_to_client()` from a single definition table; removed `unreachable!()` for `GetRigs`/`GetSatPasses` -- ✅ **Lock poisoning recovery hides panics** — replaced all `.unwrap_or_else(|e| e.into_inner())` with `lock_or_recover()` helper that logs a warning with the lock label when recovering from poisoned mutex -- ✅ **Configuration duplication** — extracted shared config types (`LogLevel` defaults, common patterns) into `trx-app/src/shared_config.rs`; both server and client import from `trx_app` -- ✅ **Hardcoded timeouts** — made `command_exec_timeout`, `poll_refresh_timeout`, `io_timeout`, `request_timeout`, and `rig_task_channel_buffer` configurable via `RigTaskConfig`/`ListenerConfig` and the TOML `[timeouts]` section; constants remain as defaults - ---- - -### Low Priority (P3) - -#### FT-817 VFO state inference is fragile - -**Location:** `src/trx-server/trx-backend/trx-backend-ft817/src/` - -VFO state starts as `Unknown` and is inferred by matching frequencies against -cached values. When VFO A and VFO B are set to the same frequency, inference -fails and the rig may report the wrong VFO. - -**Fix:** Some FT-817 variants support extended status bytes that indicate active -VFO directly. Detect firmware version and use the direct query when available. - -#### VDES decoder has incomplete FEC - -**Location:** `src/decoders/trx-vdes/src/` - -The VDES decoder implements burst detection and pi/4-QPSK demodulation but the -Turbo FEC (1/2 rate) decoder and link-layer (M.2092-1) parser are not complete. -Decoded output is limited to raw symbols. - -#### No forward compatibility in protocol - -**Location:** `src/trx-protocol/src/` - -Unknown commands cause parse errors. There is no protocol version field in the -envelope, making it impossible for older clients to gracefully degrade when -connecting to newer servers. - -**Fix:** Add an optional `protocol_version` field to `ClientEnvelope`. Unknown -commands should return an error response rather than a parse failure. - -#### Command handler boilerplate - -**Location:** `src/trx-core/src/rig/controller/` - -11 `RigCommandHandler` implementations follow identical patterns across 500+ -lines (validate state → call executor method → return result). A declarative -macro could reduce this to ~100 lines. - -#### Missing tests for critical paths - -Serial backends (FT-817, FT-450D), plugin loading/discovery, and the audio -pipeline (Opus encode/decode) have no or minimal test coverage. `rig_task.rs` -(1,315 lines) and `audio.rs` (3,977 lines) — the two largest server modules — -have no integration tests. \ No newline at end of file +Not a correctness issue, but increases cognitive overhead and copy-paste errors.