[docs](trx-rs): reanalyze improvement areas, clear resolved items

Audit codebase against previous improvement list — all P0/P1/P2 items from
the prior review are now resolved or dropped. Restructured document with
resolved items in a collapsed section and identified new areas: decoder task
duplication, missing tests, decode log error handling, api.rs size, and others.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Stan Grams <sjg@haxx.space>
This commit is contained in:
2026-03-29 16:09:34 +02:00
parent d512268526
commit 44e09449dc
2 changed files with 212 additions and 108 deletions
+15 -17
View File
@@ -128,26 +128,24 @@ Improvement plan: `docs/Improvement-Areas.md`
### Areas for Improvement ### Areas for Improvement
**P0 — Critical:** All previous P0 items resolved or dropped. See `docs/Improvement-Areas.md` for
- **Plugin signing**: No SHA-256 checksum verification, no per-plugin scoping, no Windows validation. resolved item details.
**P1 — High:** **P1 — High:**
- **Session store unwrap()**: 7 `.write().unwrap()` in `auth.rs` — mutex poisoning crashes server. - **Decoder task duplication**: 9 decoder tasks in `audio.rs` (3,826 LOC) share ~1,000 lines of near-identical boilerplate. Extract a generic `DecoderTask<D>`.
- **No TCP listener rate limiting**: Raw protocol listener accepts unlimited connections per IP. - **Missing tests**: `audio.rs` (3,826 LOC), `api.rs` (2,831 LOC), `main.rs` (679 LOC) have 0 tests.
- **RigState 33-field flat struct**: Decoder bools/reset seqs should be sub-structs. Cloned on every broadcast. - **No multi-rig integration tests**: State isolation and command routing between rigs untested.
- **No `spawn_blocking` timeout**: Satellite pass computation in `listener.rs` can hang indefinitely.
**P2 — Medium:** **P2 — Medium:**
- **Command handler boilerplate**: 11 `RigCommandHandler` impls across 500+ lines. Macro opportunity. - **Decode log silent failures**: `let _ = flush()` discards errors; rotation failure has no fallback writer.
- **No command execution timeouts** at `CommandExecutor` level. Backend stalls propagate up. - **`api.rs` size**: 2,831 LOC with ~25 endpoint handlers and no logical separation.
- **No protocol versioning**: Unknown commands cause parse errors with no graceful degradation. - **Background decode state complexity**: 8+ nested conditionals in `run()` inner loop (~95 lines).
- **Unsafe string in spectrum encoding**: `from_utf8_unchecked` in `api.rs:63`. - **Actix-web pinned**: `=4.4.1` prevents patch-level security updates.
- **6 `#[allow(dead_code)]`** annotations to review/clean up. - **VDES magic numbers**: Plausibility thresholds (`-35`, `15`) are undocumented inline constants.
**P3 — Low:** **P3 — Low:**
- **Missing tests**: `audio.rs` (3,812 LOC), `api.rs` (2,711 LOC), `main.rs` (1,203 LOC) have 0 tests. - **FT-817 VFO inference fragile**: Defaults to VFO A when both share the same frequency.
- **FT-817 VFO inference fragile**: Fails when VFO A and B share the same frequency. - **String cloning in remote client**: ~105 `.clone()` calls, some in hot poll loops.
- ~~**VDES decoder incomplete**~~: Turbo FEC, CRC-16, and M.2092-1 link-layer parsing now implemented. - **Missing decoder doc comments**: `AisDecoder`, `VdesDecoder`, `RdsDecoder` lack public API docs.
- **Plugin system lacks versioning**: No API version or reload semantics. - **Turbo decoder precondition**: `turbo_decode_soft()` lacks debug assertions on interleaver length.
- **Configurator detection stubbed**: `detect.rs` TODO for `serialport::available_ports()`. - **No decoder tracing spans**: No `info_span!` for measuring per-decoder latency.
- **Inconsistent naming**: `freq_hz`/`frequency`/`center_hz`; `rig_id`/`id`; `model`/`rig_model`.
+197 -91
View File
@@ -8,152 +8,258 @@ a suggested fix.
--- ---
## Critical (P0) ## Resolved Items
### ~~Plugin signing and cross-platform validation~~ — DROPPED <details>
<summary>Click to expand resolved items from previous audits</summary>
### Plugin signing and cross-platform validation — DROPPED
Plugin system has been removed from the codebase. No longer applicable. Plugin system has been removed from the codebase. No longer applicable.
### Session store mutex poisoning (auth.rs) — RESOLVED
**Location:** `src/trx-client/trx-frontend/trx-frontend-http/src/auth.rs`
All 6 `.write().unwrap()` / `.lock().unwrap()` calls replaced with
`.unwrap_or_else(|e| { warn!(...); e.into_inner() })` pattern. Lock poisoning now
logs a warning and recovers the inner data instead of crashing.
### No rate limiting on TCP listener — RESOLVED
**Location:** `src/trx-server/src/listener.rs`
Added `ConnectionTracker` with per-IP connection limiting (default: 10 concurrent
connections per IP). Connections exceeding the limit are rejected with a log warning.
Slots are released when clients disconnect.
### RigState is a 33-field flat struct — RESOLVED
**Location:** `src/trx-core/src/rig/state.rs`
Decoder fields grouped into `DecoderConfig` (8 bools) and `DecoderResetSeqs`
(8 u64 counters). Both use `#[serde(flatten)]` for backward-compatible JSON wire
format. Updated across all consumers.
### No `spawn_blocking` timeout — RESOLVED
**Location:** `src/trx-server/src/listener.rs`
Satellite pass computation wrapped in `tokio::time::timeout(30s, ...)` with
graceful fallback to empty results on timeout or panic.
### Command handler boilerplate — RESOLVED
**Location:** `src/trx-core/src/rig/controller/handlers.rs`
Created `rig_command!` declarative macro. 7 unit commands use the macro; 4 commands
with custom fields/validation remain as explicit impls.
### No command execution timeouts at CommandExecutor level — RESOLVED
**Location:** `src/trx-server/src/rig_task.rs`
`tokio::time::timeout(command_exec_timeout, process_command(...))` wraps all
command execution. Default timeout: 10s, configurable via `RigTaskConfig`.
### No forward compatibility in protocol — RESOLVED
**Location:** `src/trx-protocol/src/types.rs`, `src/trx-protocol/src/codec.rs`
Added optional `protocol_version: Option<u32>` to `ClientEnvelope` and
`ClientResponse`. `parse_envelope()` distinguishes malformed JSON from
unrecognised `cmd` values.
### `unsafe` string construction in spectrum encoding — RESOLVED
**Location:** `src/trx-client/trx-frontend/trx-frontend-http/src/api.rs`
Replaced `unsafe { String::from_utf8_unchecked(out) }` with safe
`String::from_utf8(out).expect(...)`.
### `#[allow(dead_code)]` cleanup — RESOLVED
Reduced from 6 to 4 annotations, all in trx-backend-soapysdr where fields serve
as lifetime anchors (`device`, `iq_tx`) or document reserved capacity
(`fixed_slot_count`, `process_pair`).
### VDES decoder incomplete FEC — RESOLVED
Turbo FEC decoder, CRC-16-CCITT validation, and M.2092-1 link-layer frame parsing
implemented.
### Plugin system lacks versioning — DROPPED
Plugin system removed from the codebase.
### Configurator serial detection stubbed — RESOLVED
Implemented using `tokio_serial::available_ports()` with USB, Bluetooth, PCI, and
Unknown port type descriptions.
### Inconsistent frequency/rig naming — DOCUMENTED AS INTENTIONAL
Field names reflect distinct semantic contexts: `freq_hz` (dial), `center_hz`
(SDR capture center), `cw_center_hz` (CW tone); `rig_id` (config key), `id`
(runtime UUID); `model` (hardware string), `rig_model` (config parameter).
</details>
--- ---
## High Priority (P1) ## High Priority (P1)
### ~~Session store mutex poisoning (auth.rs)~~ — RESOLVED ### Decoder task duplication in audio.rs
**Location:** `src/trx-client/trx-frontend/trx-frontend-http/src/auth.rs` **Location:** `src/trx-server/src/audio.rs` (3,826 LOC)
**Resolution:** All 6 `.write().unwrap()` / `.lock().unwrap()` calls replaced with Nine decoder tasks (APRS, AIS, VDES, CW, FT2, FT4, FT8, WSPR, LRPT) each
`.unwrap_or_else(|e| { warn!(...); e.into_inner() })` pattern. Lock poisoning now implement the same pattern: subscribe to PCM broadcast, watch for state changes
logs a warning and recovers the inner data instead of crashing. (mode/frequency/reset), call `block_in_place()` for synchronous decoding, record
to history, and forward to `decode_tx`. This results in ~1,000 lines of
near-identical boilerplate with 14+ `block_in_place()` calls and 12+
`.resubscribe()` calls.
### ~~No rate limiting on TCP listener~~ — RESOLVED **Risk:** A bug fix or behavior change (e.g., lag handling, error recovery) must
be replicated across all 9 decoders manually.
**Location:** `src/trx-server/src/listener.rs` **Suggestion:** Extract a `DecoderTask<D>` generic that encapsulates the
subscribe → watch → decode → record → forward lifecycle. Each decoder implements
a trait with `process_block()` and `reset()` methods. Estimated reduction: ~500
lines.
**Resolution:** Added `ConnectionTracker` with per-IP connection limiting ### Missing tests for critical modules
(default: 10 concurrent connections per IP). Connections exceeding the limit
are rejected with a log warning. Slots are released when clients disconnect.
### ~~RigState is a 33-field flat struct~~ — RESOLVED **Location:** `src/trx-server/src/audio.rs` (3,826 LOC, 0 tests),
`src/trx-client/trx-frontend/trx-frontend-http/src/api.rs` (2,831 LOC, 0 tests),
`src/trx-client/src/main.rs` (679 LOC, 0 tests)
**Location:** `src/trx-core/src/rig/state.rs` These are among the largest files in the codebase and have zero unit tests.
`history_store.rs` and `auth.rs` now have good coverage; `handlers.rs` has 4
tests. The remaining files require ALSA/hardware mocking infrastructure or HTTP
test harnesses.
**Resolution:** Decoder fields grouped into two sub-structs: **Suggestion:** Start with `api.rs` — actix-web's `test::TestRequest` makes
- `DecoderConfig`: 8 `*_decode_enabled` bool fields endpoint testing feasible without hardware. Extract pure logic from `audio.rs`
- `DecoderResetSeqs`: 8 `*_decode_reset_seq` u64 counters into testable helpers where possible.
Both use `#[serde(flatten)]` to maintain backward-compatible JSON wire format. ### Missing integration tests for multi-rig scenarios
Updated across all consumers: `rig_task.rs`, `audio.rs`, `api.rs`,
`remote_client.rs`, `server.rs` (rigctl, http-json), `codec.rs`.
### ~~No `spawn_blocking` timeout~~ — RESOLVED No tests verify state isolation or command routing between rigs in multi-rig
configurations, despite the codebase supporting per-rig task isolation with
`HashMap<rig_id, RigHandle>` routing.
**Location:** `src/trx-server/src/listener.rs` **Risk:** Cross-rig state pollution on refactors.
**Resolution:** Satellite pass computation wrapped in `tokio::time::timeout(30s, ...)` **Suggestion:** Add integration test covering simultaneous frequency/mode changes
with graceful fallback to empty results on timeout or panic. on two rigs with a dummy backend.
--- ---
## Medium Priority (P2) ## Medium Priority (P2)
### ~~Command handler boilerplate~~ — RESOLVED ### Decode log silent failures
**Location:** `src/trx-core/src/rig/controller/handlers.rs` **Location:** `src/decoders/trx-decode-log/src/lib.rs`
**Resolution:** Created `rig_command!` declarative macro that generates unit-struct - Line 160: `let _ = state.writer.flush();` silently discards flush errors. Disk
command implementations from a concise table of (name, preconditions, execute body). full or permission changes cause silent data loss.
7 unit commands (PowerOn, PowerOff, ToggleVfo, Lock, Unlock, GetTxLimit, - Lines 137150: If file rotation fails (open error), subsequent writes retry the
GetSnapshot) now use the macro. Commands with custom fields/validation (SetFreq, same path indefinitely with no fallback writer or degradation logging.
SetMode, SetPtt, SetTxLimit) remain as explicit impls.
### ~~No command execution timeouts at CommandExecutor level~~ — ALREADY RESOLVED **Suggestion:** Log flush errors via `warn!`. On rotation failure, keep the old
writer and log a degradation warning rather than silently failing.
**Location:** `src/trx-server/src/rig_task.rs` ### `api.rs` file size and organization
`tokio::time::timeout(command_exec_timeout, process_command(...))` already wraps **Location:** `src/trx-client/trx-frontend/trx-frontend-http/src/api.rs` (2,831 LOC)
all command execution (lines 370425). Default timeout: 10s. No further changes
needed.
### ~~No forward compatibility in protocol~~ — RESOLVED Contains ~25+ endpoint handlers spanning decoder history, frequency/mode control,
virtual channel management, spectrum, and SSE streams with no logical separation.
**Location:** `src/trx-protocol/src/types.rs`, `src/trx-protocol/src/codec.rs` **Suggestion:** Consider splitting into `decoder_api.rs`, `vchan_api.rs`,
`rig_api.rs` in a future refactor.
**Resolution:** ### Background decode state complexity
- Added optional `protocol_version: Option<u32>` to both `ClientEnvelope` and
`ClientResponse` (current version: 1, defined as `PROTOCOL_VERSION` constant).
- `parse_envelope()` now distinguishes between truly malformed JSON and valid
JSON with an unrecognised `cmd` value, enabling clearer error messages.
### ~~`unsafe` string construction in spectrum encoding~~ — RESOLVED **Location:** `src/trx-client/trx-frontend/trx-frontend-http/src/background_decode.rs:350444`
**Location:** `src/trx-client/trx-frontend/trx-frontend-http/src/api.rs:63` The `run()` method's inner loop contains 8+ nested conditional branches
(users_connected, scheduler_has_control, scheduled_bookmark_ids, virtual channel
coverage, spectrum availability, offset bounds). Correct but difficult to modify
or extend.
**Resolution:** Replaced `unsafe { String::from_utf8_unchecked(out) }` with **Suggestion:** Extract the decision logic into a pure function returning a
`String::from_utf8(out).expect("base64 output is always valid ASCII")`. `ChannelAction` enum. Improves testability and makes the state machine explicit.
### ~~6 `#[allow(dead_code)]` annotations~~ — RESOLVED ### Actix-web pinned to exact version
**Resolution:** **Location:** `src/trx-client/trx-frontend/trx-frontend-http/Cargo.toml`
- `is_tx_endpoint` in auth.rs: made `pub` and removed annotation (used in tests,
available for TX access control integration). `actix-web = "=4.4.1"` prevents automatic patch-level security updates. Later
- `session_ttl()` in config.rs: removed annotation (public API method). 4.x releases may include security fixes.
- `device` in real_iq_source.rs: annotation kept (lifetime anchor for stream).
- `iq_tx` in vchan_impl.rs: annotation kept (broadcast sender kept alive). **Suggestion:** Relax to `actix-web = "4.4"` to allow patch updates, or
- `fixed_slot_count` in vchan_impl.rs: annotation kept (documents reserved slots). periodically review and bump the pinned version.
- `process_pair` in demod.rs: annotation kept (stereo AGC variant for future use).
### Magic numbers in VDES plausibility scoring
**Location:** `src/decoders/trx-vdes/src/lib.rs:261280`
Plausibility thresholds (`-35`, `15`) are inline magic numbers with no
documentation of the scoring scale or units.
**Suggestion:** Define named constants:
```rust
const PLAUSIBILITY_UNSYNCED_THRESHOLD: i32 = -35;
const PLAUSIBILITY_LOW_CONFIDENCE_THRESHOLD: i32 = 15;
```
--- ---
## Low Priority (P3) ## Low Priority (P3)
### ~~Missing tests for critical modules~~ — PARTIALLY RESOLVED ### FT-817 VFO inference fragile with same frequency
- `history_store.rs`: Added 4 unit tests covering timestamp generation, **Location:** `src/trx-server/trx-backend/trx-backend-ft817/src/lib.rs:233265`
serde round-trip, save/load round-trip, and expiry filtering.
- `audio.rs`, `api.rs`, `main.rs`: Remain without tests (require ALSA/hardware
mocking infrastructure that is beyond the scope of this pass).
- `rig_task.rs`: Existing 4 tests adequate; integration tests deferred.
### ~~FT-817 VFO state inference is fragile~~ — IMPROVED When both VFOs share the same frequency, inference defaults to VFO A. Resolved
after VFO toggle primes both sides. Well-documented in code comments but remains
a known limitation.
**Location:** `src/trx-server/trx-backend/trx-backend-ft817/src/lib.rs` ### Excessive string cloning in remote client
**Resolution:** Improved `update_vfo_freq()` to handle the ambiguous case where **Location:** `src/trx-client/src/remote_client.rs`
both VFOs share the same frequency. When VFO B has a cached frequency that
differs from the current reading, inference correctly assigns to VFO A. When
frequencies match (ambiguous), defaults to VFO A — resolved after VFO toggle
primes both sides. Added detailed comments explaining the inference logic.
### ~~VDES decoder has incomplete FEC~~ — RESOLVED ~105 `.clone()` calls on String fields, many in hot paths during poll loops
(spectrum, state updates). Most are necessary for ownership across async
boundaries, but some could use borrowed references or `Cow<str>`.
**Location:** `src/decoders/trx-vdes/src/` **Suggestion:** Audit hot-path clones in `run_remote_client`, particularly around
spectrum polling loops. Low priority unless profiling shows allocation pressure.
Turbo FEC decoder (dual 8-state RSC with BCJR/MAP iterative decoding, QPP ### Missing doc comments on public decoder structs
interleaver), CRC-16-CCITT validation, and M.2092-1 link-layer frame parsing
(Messages 06) have been implemented. The decode pipeline now attempts turbo
decoding first, falls back to Viterbi, and validates CRC on both paths.
### ~~Plugin system lacks versioning and lifecycle~~ — DROPPED **Location:** `src/decoders/trx-ais/src/lib.rs`, `src/decoders/trx-vdes/src/lib.rs`,
`src/decoders/trx-rds/src/lib.rs`
Plugin system has been removed from the codebase. No longer applicable. Public decoder structs (`AisDecoder`, `VdesDecoder`, `RdsDecoder`) lack doc
comments describing valid sample rates, preconditions, and guarantees.
### ~~Configurator serial detection is stubbed~~ — RESOLVED ### Turbo decoder precondition not asserted
**Location:** `src/trx-configurator/src/detect.rs` **Location:** `src/decoders/trx-vdes/src/turbo.rs:208249`
**Resolution:** Implemented `detect_serial_ports()` using `tokio_serial::available_ports()`. `turbo_decode_soft()` accesses interleaver/deinterleaver vectors without bounds
Returns `(port_name, description)` pairs with USB vendor/product info, Bluetooth, checks. The precondition `interleaver.len() == info_len` is clear from context
PCI, and Unknown port type descriptions. and enforced by the caller, but not formally documented or debug-asserted.
### Inconsistent frequency/rig naming across crates ### No tracing spans for decoder performance
Field naming varies across the codebase (`freq_hz` vs `center_hz`, `rig_id` vs **Location:** `src/trx-server/src/audio.rs`
`id`, `model` vs `rig_model`). Analysis shows these reflect distinct semantic
contexts rather than true inconsistencies:
- `freq_hz`: dial frequency; `center_hz`: SDR capture center; `cw_center_hz`: CW tone
- `rig_id`: stable config key; `id`: runtime UUID
- `model`: hardware model string; `rig_model`: config parameter
**Decision:** Documented as intentional. Renaming would break the wire protocol Decoders use `info!`/`warn!` logs but don't emit tracing spans. No way to
and provide minimal benefit. The `_hz` suffix convention is consistently applied. measure per-decoder latency without sampling logs.
**Suggestion:** Add `tracing::info_span!` around `block_in_place()` calls for
opt-in performance measurement.