Add agent cli by waltsims · Pull Request #682 · waltsims/k-wave-python

waltsims · 2026-03-27T17:56:30Z

Greptile Summary

This PR introduces an agent-first CLI (kwave) for k-Wave simulations, allowing LLM agents (or humans) to drive the full simulation workflow — session management, phantom generation, source/sensor definition, pre-flight planning, and execution — entirely through structured JSON-in / JSON-out commands. It also adds a thin progress_callback hook into the Python solver so the run command can stream real-time progress events to callers.

Key changes:

New kwave/cli/ package: session.py (file-backed state), schema.py (JSON envelope + error types), main.py (Click group wiring), and six command modules (session, phantom, source, sensor, plan, run).
kspaceFirstOrder and kspace_solver.Simulation.run extended with an optional progress_callback(step, total) — only the Python backend honours it; the cpp backend silently ignores it.
pyproject.toml gains an optional [cli] extra (click>=8.0), a kwave console script entry point, and the new packages added to the wheel target.
Two integration test suites (test_e2e.py, test_1d_ivp.py) do bit-exact comparisons between CLI output and direct Python API calls, giving strong correctness coverage.

Findings:

traceback is imported but unused in schema.py.
_parse_int_tuple / _resolve_scalar_or_path in phantom.py call int() / float() on raw user input without catching ValueError, so invalid inputs bypass the structured CLIError response contract.
Session file writes in session.py._save() are not atomic; an interrupt mid-write leaves a corrupt session.json.
progress_callback is silently ignored when --backend cpp is used; callers receive the started event and then silence until completion.
progress_callback is undocumented in kspaceFirstOrder's docstring.

Confidence Score: 5/5

Safe to merge; all findings are P2 style/polish issues with no impact on correctness of simulation results.

All five findings are P2. Core simulation logic is unchanged and two integration tests verify bit-exact parity between CLI and direct Python API paths. The new CLI is gated behind an optional [cli] extra so it cannot regress existing users.

kwave/cli/session.py (non-atomic writes), kwave/cli/commands/phantom.py (unguarded int/float conversions), kwave/cli/commands/run.py (silent cpp progress gap).

Important Files Changed

Filename	Overview
kwave/cli/session.py	Core session state manager backed by ~/.kwave/session.json; non-atomic writes risk corruption on interrupt, and no file locking guards against concurrent CLI processes.
kwave/cli/schema.py	Defines structured JSON envelope, exit codes, and json_command decorator; unused traceback import, otherwise clean.
kwave/cli/commands/run.py	Emits streaming progress events then a final CLIResponse; progress_callback is silently ignored when --backend cpp is used.
kwave/cli/commands/phantom.py	Phantom load/generate commands; unguarded int()/float() calls on user input will emit raw Python tracebacks instead of structured CLIError JSON on bad values.
kwave/cli/main.py	CLI entry point wiring all command groups; deferred imports below the group definition cleanly avoid circular import issues.
kwave/kspaceFirstOrder.py	Adds progress_callback parameter forwarded to the Python solver; missing docstring entry and silently no-ops for cpp backend.
kwave/solvers/kspace_solver.py	Minimal surgical change: run() now accepts and invokes an optional progress_callback(step, total) after each simulation step.
tests/test_cli/test_e2e.py	Good end-to-end coverage: session lifecycle, phantom generation, sensor definition, plan, and a full run compared bit-for-bit against the direct Python API.
tests/test_cli/test_1d_ivp.py	1D CLI test with heterogeneous medium arrays, custom CFL, and sparse sensor; results compared exactly against the Python API.
pyproject.toml	Adds optional [cli] extra (click>=8.0), kwave console script entry point, and includes kwave.cli packages in wheel build targets.
tests/test_cli/conftest.py	Shared fixtures and invoke() helper that correctly extracts the last top-level JSON object from mixed streaming output.

Sequence Diagram

sequenceDiagram
    participant Agent
    participant CLI as kwave CLI
    participant Session as session.json
    participant Solver as kspace_solver

    Agent->>CLI: kwave session init
    CLI->>Session: write {id, state: defaults}
    CLI-->>Agent: {"status":"ok","result":{"session_id":...}}

    Agent->>CLI: kwave phantom generate --type disc ...
    CLI->>Session: load()
    CLI->>Session: update_many({grid, medium, source})
    CLI-->>Agent: {"status":"ok","result":{grid_size, p0_max, ...}}

    Agent->>CLI: kwave sensor define --mask full-grid
    CLI->>Session: load() then update("sensor", {...})
    CLI-->>Agent: {"status":"ok","result":{mask_type, record}}

    Agent->>CLI: kwave plan
    CLI->>Session: load() + assert_ready()
    CLI->>CLI: make_grid() / make_medium()
    CLI-->>Agent: {"status":"ok","result":{grid, cfl, memory_mb, ...}}

    Agent->>CLI: kwave run
    CLI->>Session: load() + assert_ready()
    CLI->>Solver: kspaceFirstOrder(..., progress_callback)
    loop every 5% progress
        Solver-->>CLI: callback(step, Nt)
        CLI-->>Agent: {"event":"progress","pct":...}
    end
    Solver-->>CLI: result dict
    CLI->>Session: update("result_path", data_dir)
    CLI-->>Agent: {"event":"completed",...}
    CLI-->>Agent: {"status":"ok","result":{"outputs":{p:{path,shape},...}}}

_{Reviews (1): Last reviewed commit: "Skip CLI tests when click is not install..." | Re-trigger Greptile}

Greptile also left 5 inline comments on this PR.

Structured JSON I/O at every step, deterministic exit codes, file-backed session model. Commands: session init/status/reset, phantom generate, sensor define, plan, run. Acceptance test: new_api_ivp_2D.py runs entirely via CLI with exact numerical parity to the Python API. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- phantom load: accepts .npy files for heterogeneous medium properties - source define: loads custom p0 from .npy file - sensor define: supports file-based masks (sparse sensors) - Grid materializer passes full sound speed array to makeTime (not just max) - Custom CFL support via --cfl flag - Test: ivp_1D_simulation.py replicated via CLI with exact parity Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Normalize medium state to unified {field}_scalar/{field}_path schema (phantom generate now uses same format as phantom load) - Remove bogus PPW check from plan (Nyquist-based PPW was a false positive for all valid CFL values) - Remove dead code: unused min_wavelength, unused sys import, unused Optional - Add Session.assert_ready() and Session.update_many() - Rename _completeness() to completeness() (public API) - Extract _invoke helper to tests/test_cli/conftest.py - Shrink e2e test grid from 128x128 to 48x48 - Extract _parse_int_tuple and _resolve_scalar_or_path in phantom.py Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Uses pytest.importorskip so test collection succeeds in CI environments that don't install the cli extra. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

codecov · 2026-03-27T18:01:13Z

Codecov Report

❌ Patch coverage is 88.46154% with 51 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.97%. Comparing base (785c0bd) to head (9ec566e).

Files with missing lines	Patch %	Lines
kwave/cli/session.py	84.89%	9 Missing and 12 partials ⚠️
kwave/cli/commands/phantom.py	80.55%	10 Missing and 4 partials ⚠️
kwave/cli/schema.py	89.23%	6 Missing and 1 partial ⚠️
kwave/cli/__init__.py	33.33%	4 Missing ⚠️
kwave/cli/commands/plan.py	93.54%	1 Missing and 1 partial ⚠️
kwave/cli/commands/run.py	95.12%	1 Missing and 1 partial ⚠️
kwave/cli/commands/session_cmd.py	95.83%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #682      +/-   ##
==========================================
+ Coverage   74.23%   74.97%   +0.74%     
==========================================
  Files          56       66      +10     
  Lines        8012     8453     +441     
  Branches     1566     1603      +37     
==========================================
+ Hits         5948     6338     +390     
- Misses       1441     1473      +32     
- Partials      623      642      +19

Flag	Coverage Δ
3.10	`74.22% <33.33%> (-0.02%)`	⬇️
3.11	`74.22% <33.33%> (-0.02%)`	⬇️
3.12	`74.97% <88.46%> (+0.74%)`	⬆️
3.13	`74.22% <33.33%> (-0.02%)`	⬇️
macos-latest	`74.19% <33.33%> (-0.02%)`	⬇️
ubuntu-latest	`74.19% <33.33%> (-0.02%)`	⬇️
windows-latest	`74.96% <88.46%> (+0.74%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

waltsims · 2026-03-27T18:09:16Z

@greptile-app

greptile-apps · 2026-03-27T18:15:32Z

kwave/cli/schema.py

+
+import json
+import sys
+import traceback


Unused traceback import

traceback is imported but never referenced anywhere in the module. The exception handler currently serializes str(e) into the error suggestion field and never calls traceback.format_exc().

Suggested change

import traceback

from dataclasses import asdict, dataclass, field

greptile-apps · 2026-03-27T18:15:33Z

kwave/cli/commands/phantom.py

+@click.option("--cfl", type=float, default=None, help="CFL number for time step calculation")
+@pass_session
+@json_command("phantom.load")
+def load(sess, grid_size, spacing, sound_speed, density, cfl):
+    """Load medium properties from scalar values or .npy files."""
+    sess.load()
+
+    grid_n = _parse_int_tuple(grid_size)
+    ndim = len(grid_n)
+    grid_spacing = (spacing,) * ndim
+


Unguarded int() and float() conversions produce unformatted errors

_parse_int_tuple and _resolve_scalar_or_path call int() / float() on raw user input without catching ValueError. An invalid value like --grid-size 128,abc or --sound-speed notanumber raises a bare Python traceback rather than a structured ValidationError / CLIError JSON response, breaking the agent-parseable contract that the rest of the CLI upholds.

Consider wrapping these with explicit error handling:

def _parse_int_tuple(s: str) -> tuple[int, ...]: try: return tuple(int(x) for x in s.split(",")) except ValueError: raise ValidationError(CLIError( code="INVALID_GRID_SIZE", field="grid_size", value=s, constraint="must be comma-separated integers, e.g. 128 or 128,128", ))

greptile-apps · 2026-03-27T18:15:34Z

kwave/cli/session.py

+    def _save(self):
+        raw = {
+            "id": self._id,
+            "created_at": self._created_at,
+            "state": self._state,
+        }
+        self.session_file.write_text(json.dumps(raw, indent=2, default=str))


Non-atomic session file write can corrupt state on interrupt

write_text is not atomic — if the process is killed (SIGINT, SIGKILL, OOM) during the write the session file will be partially written and unreadable on the next CLI invocation. Using a write-to-temp-then-rename pattern eliminates this race:

def _save(self): import os raw = { "id": self._id, "created_at": self._created_at, "state": self._state, } tmp = self.session_file.with_suffix(".tmp") tmp.write_text(json.dumps(raw, indent=2, default=str)) os.replace(tmp, self.session_file)

os.replace is atomic on POSIX and Windows (same filesystem).

greptile-apps · 2026-03-27T18:15:35Z

kwave/cli/commands/run.py

+    result = kspaceFirstOrder(
+        kgrid,
+        medium,
+        source,
+        sensor,
+        backend=backend,
+        device=device,
+        quiet=True,
+        progress_callback=progress_callback,
+    )


progress_callback silently ignored for cpp backend

progress_callback is wired into the Python backend path in kspace_solver.py, but when --backend cpp is selected the callback is accepted by kspaceFirstOrder and then never forwarded to the C++ solver. The caller will receive only the {"event": "started", ...} line and then wait silently until the simulation finishes, with no intermediate progress events.

Consider emitting a warning event when backend != "python" and a callback is provided:

if backend != "python": _emit_event({"event": "warning", "detail": "progress_callback is only supported with the python backend"})

greptile-apps · 2026-03-27T18:15:36Z

kwave/kspaceFirstOrder.py

    debug: bool = False,
    num_threads: Optional[int] = None,
    device_num: Optional[int] = None,
+    progress_callback=None,


Missing docstring entry for progress_callback

The new parameter has no documentation in the existing docstring block, which documents all other keyword arguments. Agents or users introspecting help(kspaceFirstOrder) will have no description of the expected signature or when it is invoked.

Add to the docstring Keyword Args section:

progress_callback: Optional callable ``(step: int, total: int) -> None``. Called after each simulation step. Only supported with the ``"python"`` backend; silently ignored for ``"cpp"``.

waltsims and others added 4 commits March 27, 2026 10:35

Skip CLI tests when click is not installed

9ec566e

Uses pytest.importorskip so test collection succeeds in CI environments that don't install the cli extra. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

greptile-apps bot reviewed Mar 27, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add agent cli#682

Add agent cli#682
waltsims wants to merge 4 commits intomasterfrom
add-agent-cli

waltsims commented Mar 27, 2026 •

edited by greptile-apps bot

Loading

Uh oh!

codecov bot commented Mar 27, 2026 •

edited

Loading

Uh oh!

waltsims commented Mar 27, 2026

Uh oh!

greptile-apps bot Mar 27, 2026

Uh oh!

greptile-apps bot Mar 27, 2026

Uh oh!

greptile-apps bot Mar 27, 2026

Uh oh!

greptile-apps bot Mar 27, 2026

Uh oh!

greptile-apps bot Mar 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	import traceback
	from dataclasses import asdict, dataclass, field

Conversation

waltsims commented Mar 27, 2026 • edited by greptile-apps bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

codecov bot commented Mar 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

waltsims commented Mar 27, 2026

Uh oh!

greptile-apps bot Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

waltsims commented Mar 27, 2026 •

edited by greptile-apps bot

Loading

codecov bot commented Mar 27, 2026 •

edited

Loading