
feat(gsp-diagnostics): add information-collapse logging for GSP variants #12

Merged: jdbloom merged 1 commit into `master` from `feature/gsp-diagnostics-onto-master` on Apr 13, 2026
jdbloom commented Apr 13, 2026

Summary

Adds the per-step and per-episode HDF5 fields needed to detect "GSP information collapse" — a suspected failure mode where the GSP prediction network collapses to a near-constant output. See Stelaris `docs/specs/2026-04-12-dispatcher-diagnostic-batch.md` for the hypothesis.

This is a clean port of the work originally done on `feat/learn-every-n-steps` (PR #11) onto current master. It uses master's local `rl_code/src/hdf5_logger.py` and `hdf5_writer` variable name, and incorporates the cardinality fix from PR #11's review (aggregate per-tick mean in `--independent_learning` mode).

Changes

`rl_code/src/env.py`

  • `calculate_gsp_reward` now returns `(reward, label, squared_errors)`. The raw per-robot `(diff - prediction)²` carries the magnitude that the clipped `[-2, 0]` reward hides.
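
To illustrate why the raw squared error is worth returning alongside the reward, here is a minimal sketch. It assumes the reward is the negative absolute prediction error clipped to `[-2, 0]`; the function name `gsp_reward_sketch`, the per-robot reward list, and the exact reward formula are hypothetical and not taken from `env.py`.

```python
def gsp_reward_sketch(diffs, predictions, label):
    """Hypothetical stand-in for calculate_gsp_reward (not the real env.py code)."""
    rewards, squared_errors = [], []
    for diff, pred in zip(diffs, predictions):
        err = diff - pred
        # Assumed reward shape: negative absolute error, clipped to [-2, 0].
        rewards.append(max(-2.0, min(0.0, -abs(err))))
        # The squared error is not clipped, so large misses stay distinguishable.
        squared_errors.append(err ** 2)
    # Same 3-tuple ordering as described in the PR: (reward, label, squared_errors).
    return rewards, label, squared_errors


rewards, label, squared_errors = gsp_reward_sketch([3.0, 5.0], [0.0, 0.0], label=0.5)
```

With these inputs both rewards saturate at the clip floor, while the squared errors still separate the two robots.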

`rl_code/src/hdf5_logger.py`

  • New optional kwargs `gsp_target`, `gsp_squared_error` on `writerow` → 2D `(timesteps × robots)` datasets.
  • New `record_gsp_loss(value)` method → 1D dataset at GSP learning cadence.
  • `write_episode` computes two episode-level summary attrs when both prediction and target are present:
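    • `gsp_output_std`: standard deviation of the GSP output over the episode; the collapse signature is a value approaching 0.
    • `gsp_pred_target_corr`: prediction/target correlation; set to NaN when either standard deviation is below the 1e-12 tolerance, which distinguishes "undefined" from "measured zero".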
  • Uses `np.nanstd` and pair-wise NaN masking so a single physics glitch doesn't poison the summary.
  • Raises `ValueError` if `gsp_target`/`gsp_heading` buffers desync within an episode.
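
The episode-level summary described above could be computed roughly as follows. This is a sketch under stated assumptions, not the code in `hdf5_logger.py`: the function name `summarize_gsp_episode` and the shape-based desync check are hypothetical, and `np.corrcoef` is one plausible way to obtain the correlation.

```python
import numpy as np

STD_TOL = 1e-12  # tolerance value named in the PR description


def summarize_gsp_episode(pred, target):
    """Return (gsp_output_std, gsp_pred_target_corr) for one episode (sketch)."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    # Assumed desync check: the two buffers must have identical shapes.
    if pred.shape != target.shape:
        raise ValueError(
            f"gsp buffers desynced: prediction {pred.shape} vs target {target.shape}"
        )
    # nanstd ignores NaN entries, so one bad sample does not poison the std.
    output_std = float(np.nanstd(pred))
    # Pair-wise mask: keep only positions where both values are finite.
    mask = np.isfinite(pred) & np.isfinite(target)
    p, t = pred[mask], target[mask]
    if p.size < 2 or p.std() < STD_TOL or t.std() < STD_TOL:
        corr = float("nan")  # undefined, as opposed to a measured zero
    else:
        corr = float(np.corrcoef(p, t)[0, 1])
    return output_std, corr
```

A constant prediction buffer yields a standard deviation of 0 and a NaN correlation, which is the collapse signature the attrs are meant to expose.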

`rl_code/Main.py`

  • 3-tuple unpack of `calculate_gsp_reward`; broadcast scalar `label` to per-robot list for the HDF5 schema.
  • Pass new kwargs to `hdf5_writer.writerow`.
  • After each `model.learn()` call, capture `model.last_gsp_loss` (from companion GSP-RL PR #23, already merged to main) and forward to `hdf5_writer.record_gsp_loss`.
  • In `--independent_learning` mode: aggregate per-robot losses to a single scalar per learn tick (mean) so the `gsp_loss` axis length stays `num_learn_steps` regardless of mode.

Tests

  • 6 new `TestGSPSquaredErrorReturn` cases in `test_env/test_gsp_reward.py`; existing tests updated to 3-tuple unpack.
  • `tests/test_diagnostics/test_hdf5_logger_gsp_diagnostics.py` — 9 new tests: per-step datasets, gsp_loss recording, episode attrs, collapse signature detection, degenerate task, NaN poisoning, desynced-buffer raise, backward compat, optional record_gsp_loss.

Backward compatibility

All new kwargs/methods are optional. Existing callers continue to work. Existing `test_hdf5_logger.py` (7 tests) unchanged and still passing.

Test plan

  • 30 targeted tests pass
  • Full RL-CT suite: 111/111 pass on top of current master (excluding pre-existing `test_nan_guards.py` import error unrelated to this PR — it imports `_check_nan` from a stale GSP-RL submodule path)
  • All edits syntax-checked

Companion

`NESTLab/GSP-RL#23` (already merged to main) — provides `Actor.last_gsp_loss` that this PR reads.

🤖 Generated with Claude Code

Adds the per-step and per-episode HDF5 fields needed to detect "GSP
information collapse" — a suspected failure mode where the GSP prediction
network collapses to a near-constant output that carries no information
about the collective state. See Stelaris
docs/specs/2026-04-12-dispatcher-diagnostic-batch.md for the hypothesis.

Changes (all gated on opt-in — backward compatible):

env.py:
- calculate_gsp_reward returns (reward, label, squared_errors). The raw
  per-robot (diff - prediction)^2 carries the magnitude that the clipped
  [-2, 0] reward hides.

rl_code/src/hdf5_logger.py:
- New optional kwargs gsp_target, gsp_squared_error on writerow → 2D
  (timesteps × robots) datasets.
- New record_gsp_loss(value) method → 1D dataset at GSP learning cadence.
- write_episode now computes two episode-level summary attrs when both
  prediction and target buffers are present:
  - gsp_output_std (collapse signature: → 0)
  - gsp_pred_target_corr (collapse signature: → NaN when std is below
    1e-12 tolerance, distinguishing "undefined" from "measured zero")
  Uses np.nanstd and pair-wise NaN masking so a single physics glitch
  doesn't poison the summary; raises ValueError if gsp_target/gsp_heading
  buffers desync within an episode.

Main.py:
- 3-tuple unpack of calculate_gsp_reward; broadcast scalar label to
  per-robot list for the (timesteps × robots) HDF5 schema; pass new
  kwargs to hdf5_writer.writerow.
- After each model.learn() call, capture model.last_gsp_loss (from
  GSP-RL PR #23) and pass to hdf5_writer.record_gsp_loss. In
  --independent_learning mode, aggregate across per-robot models to a
  single scalar per learn tick (mean) so the gsp_loss axis length stays
  num_learn_steps regardless of mode.

Tests:
- 6 new TestGSPSquaredErrorReturn cases in test_env/test_gsp_reward.py;
  existing tests updated to 3-tuple unpack.
- tests/test_diagnostics/test_hdf5_logger_gsp_diagnostics.py — 9 new
  tests covering: per-step datasets, gsp_loss recording, episode attrs,
  collapse signature detection, degenerate task, NaN poisoning,
  desynced-buffer raise, backward compat, optional record_gsp_loss.

Companion: NESTLab/GSP-RL#23 (Actor.last_gsp_loss).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

jdbloom commented Apr 13, 2026

Port-faithfulness review

Verified against the PR #11 approved review fixes. The single commit (f94c4ed) touches exactly the 5 expected files (Main.py, env.py, hdf5_logger.py, test_gsp_reward.py, test_hdf5_logger_gsp_diagnostics.py) with no drift.

hdf5_logger.py — all six invariants preserved:

  • _reset() clears gsp_target, gsp_squared_error, gsp_loss.
  • writerow conditionally appends (non-None guard).
  • record_gsp_loss(value) present, casts to float.
  • write_episode conditionally creates gsp_target / gsp_squared_error 2D datasets and a separate 1D gsp_loss dataset (outside the timestep-indexed block, correct given differing cadence).
  • Desync check raises ValueError with a descriptive message before any aggregation.
  • Summary attrs use np.nanstd, STD_TOL = 1e-12, pair-wise np.isfinite mask, and corr = float("nan") when either std is below tolerance or fewer than 2 finite pairs remain. Exactly the approved semantics.
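
The buffer-side invariants in the list above can be pictured with a small in-memory sketch. It performs no HDF5 I/O; the class name `GspBufferSketch`, the simplified `writerow` signature, and the `check_sync` helper are hypothetical and only mirror the behaviour described in this review.

```python
class GspBufferSketch:
    """Hypothetical in-memory stand-in for the logger's GSP buffers."""

    def __init__(self):
        self._reset()

    def _reset(self):
        # All GSP diagnostic buffers are cleared between episodes.
        self.gsp_heading = []
        self.gsp_target = []
        self.gsp_squared_error = []
        self.gsp_loss = []

    def writerow(self, gsp_heading, gsp_target=None, gsp_squared_error=None):
        self.gsp_heading.append(list(gsp_heading))
        # Non-None guard: callers that omit the new kwargs are unaffected.
        if gsp_target is not None:
            self.gsp_target.append(list(gsp_target))
        if gsp_squared_error is not None:
            self.gsp_squared_error.append(list(gsp_squared_error))

    def record_gsp_loss(self, value):
        # Stored separately because losses arrive at learn cadence, not per step.
        self.gsp_loss.append(float(value))

    def check_sync(self):
        # Run before any aggregation: a partially filled target buffer is an error.
        if self.gsp_target and len(self.gsp_target) != len(self.gsp_heading):
            raise ValueError(
                "gsp_target/gsp_heading desync: "
                f"{len(self.gsp_target)} target rows vs "
                f"{len(self.gsp_heading)} heading rows"
            )
```

Keeping `gsp_loss` in its own list reflects the point made above that it is written as a separate 1D dataset rather than inside the timestep-indexed block.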

Main.py — ported cleanly onto master's API:

  • 3-tuple unpack at the calculate_gsp_reward call.
  • hdf5_writer.writerow(...) (correct master-side variable name) with new gsp_target=gsp_target_per_robot, gsp_squared_error=gsp_squared_error kwargs.
  • Learn-step cardinality fix is correct: the shared-model branch records once per tick via getattr(model, "last_gsp_loss", None); the independent-learning branch runs all per-robot learn() calls first, then collects the non-None last_gsp_loss values and records a single np.mean per tick. This matches the "one entry per tick" contract from the review of PR #11 (feat(gsp-diagnostics): surface label, squared error, and GSP loss per step).
  • gsp_target_per_robot = [float(label)] * Utility.params['num_robots'] — valid broadcast.
  • numpy as np already imported at line 14, so np.mean is fine.

env.py — 3-tuple return with zero-fill in the GSP-disabled branch. Raw unclipped abs(reward)**2 preserved.

Tests — existing test_hdf5_logger.py untouched (confirmed via git log HEAD~1..HEAD --). test_gsp_reward.py updates are purely mechanical 3-tuple unpacks plus the new TestGSPSquaredErrorReturn class; no existing assertion altered. New test_hdf5_logger_gsp_diagnostics.py imports from src.hdf5_logger and uses the same HAS_H5PY gating pattern as the reference file.

New issues introduced by the port: none.

Verdict: ready to merge.

jdbloom merged commit 2adc394 into master on Apr 13, 2026 (3 checks passed).
jdbloom deleted the feature/gsp-diagnostics-onto-master branch on April 13, 2026 at 10:39.