Skip to content

ci: replace per-package Rust tests with sharded workspace nextest#3237

Open
QuantumExplorer wants to merge 12 commits intov3.1-devfrom
ci/sharded-rust-tests
Open

ci: replace per-package Rust tests with sharded workspace nextest#3237
QuantumExplorer wants to merge 12 commits intov3.1-devfrom
ci/sharded-rust-tests

Conversation

@QuantumExplorer
Copy link
Member

@QuantumExplorer QuantumExplorer commented Mar 12, 2026

Issue being fixed or feature implemented

Codecov coverage wasn't updating after PR merges because the Tests workflow only triggered on pull_request and schedule (nightly), not on push to base branches. Additionally, each Rust package got its own CI runner that redundantly compiled all shared dependencies (dpp, grovedb, etc.), wasting CI time.

What was done?

  • New .github/workflows/tests-rs-workspace.yml: Sharded workspace test runner using cargo llvm-cov nextest --workspace --partition count:N/4. Tests are distributed across 4 parallel shards via cargo-nextest partitioning. Each shard uploads coverage to Codecov (auto-merged).
  • Modified .github/workflows/tests.yml: Added push trigger on master/v*-dev so Codecov updates immediately on merge (matches GroveDB's approach). Added rs-workspace-tests job calling the new workflow.
  • Modified .github/workflows/tests-rs-package.yml: Removed the test job (103 lines). Per-package lint, formatting, unused deps, structure detection, and check-each-feature jobs remain unchanged.
  • Excluded packages that can't compile natively: wasm-dpp, wasm-dpp2, wasm-sdk, wasm-drive-verify, rs-sdk-ffi, platform-wallet-ffi, strategy-tests, check-features, dash-platform-balance-checker

Before vs After

Before After
Test runners 1 per Rust package (~8 runners) 4 shards for entire workspace
Compilation Each runner compiles shared deps independently (dpp/grovedb compiled ~8x) Each shard compiles the workspace once (4x total)
Test distribution By package (uneven — drive-abci 142s vs dpp 1s) By test count (nextest distributes evenly across shards)
Coverage upload Per-package flags Per-shard flags (Codecov merges automatically)
Coverage on merge Only on nightly schedule Immediate (push trigger)

How Has This Been Tested?

  • Verified all 9 excluded package names exist in workspace metadata
  • Local analysis confirmed test time breakdown: drive-abci 142s tests vs dpp 1s tests (compilation dominates for most packages)
  • CI will validate on this PR's own workflow run

Breaking Changes

None. This only changes CI workflow configuration.

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have added "!" to the title and described breaking changes in the corresponding section if my code contains any
  • I have made corresponding changes to the documentation if needed

For repository code-owners and collaborators only

  • I have assigned this pull request to a milestone

Summary by CodeRabbit

  • Chores
    • Improved continuous integration testing infrastructure with enhanced organization and efficiency.
    • Tests now execute in parallel across multiple shards for faster turnaround.
    • Enhanced code coverage tracking, collection, and reporting capabilities.
    • Added automatic capture and storage of diagnostic artifacts when test failures occur.
    • Optimized workflow triggers and change detection to run only necessary tests based on code changes.

- Add tests-rs-workspace.yml using cargo-nextest with --partition count:N/4
  to split workspace tests across 4 shards with cargo-llvm-cov coverage
- Remove test job from tests-rs-package.yml (lint/format/check jobs remain)
- Add push trigger on master/v*-dev so Codecov updates on merge
- Exclude wasm/ffi/tool packages that can't compile natively

Previously each Rust package got its own CI runner that independently
compiled all shared dependencies. Now the workspace compiles once per
shard and tests are distributed evenly via nextest partitioning.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions github-actions bot added this to the v3.1.0 milestone Mar 12, 2026
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 12, 2026

Warning

Rate limit exceeded

@QuantumExplorer has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 2 minutes and 43 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b9c90379-8ac8-45eb-b8ea-136956315f54

📥 Commits

Reviewing files that changed from the base of the PR and between b502a7b and d573106.

📒 Files selected for processing (1)
  • .github/workflows/tests-rs-workspace.yml
📝 Walkthrough

Walkthrough

Consolidates Rust test execution into a new workspace-level GitHub Actions workflow with sharded partitions and removes the prior in-repo package test job; updates main tests workflow to conditionally invoke the workspace workflow before package-level jobs.

Changes

Cohort / File(s) Summary
Removed package test job
​.github/workflows/tests-rs-package.yml
Deleted the in-repo package test job (≈103 lines): Rust/tooling setup, sccache, librocksdb, coverage collection, core-dump handling, and artifact uploads replaced by a comment pointing to the workspace workflow.
New workspace test workflow
​.github/workflows/tests-rs-workspace.yml
Adds sharded Rust workspace test workflow (matrix partitions 1–4) with checkout, Rust + llvm-tools, sccache (S3), librocksdb install, cargo-llvm-cov + cargo-nextest, per-shard coverage (lcov) upload to Codecov, and failure handling that archives core dumps and binaries as shard-specific artifacts.
Main CI integration
​.github/workflows/tests.yml
Updates CI to add detection for workflow changes and a new rs-workspace-tests job that conditionally triggers the workspace workflow (uses it before rs-packages), gating workspace tests based on changed paths or package changes.

Sequence Diagram(s)

mermaid
sequenceDiagram
participant GitHub as Runner
participant WorkspaceWF as tests-rs-workspace.yml
participant Tooling as Rust/llvm/cargo-nextest
participant Cache as sccache/S3
participant TestExec as cargo-llvm-cov + nextest
participant Codecov as Codecov
participant Artifacts as GitHub Artifacts

GitHub->>WorkspaceWF: trigger matrix partition (1..4)
WorkspaceWF->>Tooling: setup Rust, llvm-tools, install cargo tools
WorkspaceWF->>Cache: configure sccache (S3)
WorkspaceWF->>TestExec: run nextest partition (partition N/4) with coverage
TestExec->>Codecov: upload lcov for shard N
alt tests fail
WorkspaceWF->>Artifacts: archive cores + binaries -> upload core-dumps-shard-N
end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A rabbit raced through CI, quick and spry,
Shards in a basket, tests leapt up high,
Coverage spilled like crumbs on the floor,
Core-dumps zipped, artifacts stored,
Hooray — the workspace hops and multiplies! 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: replacing per-package Rust tests with a sharded workspace approach using nextest, which aligns with all three modified workflow files and the core objective.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch ci/sharded-rust-tests
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

QuantumExplorer and others added 2 commits March 13, 2026 01:51
sccache doesn't help with cargo-llvm-cov instrumented builds since the
instrumentation changes compilation output, preventing cache hits.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cache target/llvm-cov-target (where cargo-llvm-cov puts instrumented
build artifacts) using GitHub Actions cache, matching GroveDB's approach.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
.github/workflows/tests-rs-workspace.yml (1)

39-40: Consider pinning tool versions for reproducibility.

The taiki-e/install-action installs the latest versions of cargo-llvm-cov and cargo-nextest. Pinning specific versions would prevent unexpected CI breakage from upstream changes.

Example version pinning
-      - uses: taiki-e/install-action@cargo-llvm-cov
-      - uses: taiki-e/install-action@cargo-nextest
+      - uses: taiki-e/install-action@v2
+        with:
+          tool: cargo-llvm-cov@0.6.15,cargo-nextest@0.9.85
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/tests-rs-workspace.yml around lines 39 - 40, The workflow
currently uses floating references taiki-e/install-action@cargo-llvm-cov and
taiki-e/install-action@cargo-nextest; change these to pinned versions/tags (for
example a specific release tag or commit SHA) to ensure reproducible CI
runs—update the two uses entries (taiki-e/install-action@cargo-llvm-cov and
taiki-e/install-action@cargo-nextest) to explicit pinned identifiers such as
taiki-e/install-action@vX.Y.Z or taiki-e/install-action@<commit-sha> and ensure
the chosen tags are tested/compatible with the rest of the workflow.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/tests-rs-workspace.yml:
- Around line 4-7: The workflow currently defines an input named partitions but
still hardcodes strategy.matrix.partition to [1,2,3,4], so it ignores caller
input; fix by either removing the partitions input and its description (if you
want to always shard into 4) or implement dynamic matrix generation: add a setup
job that runs on ubuntu (e.g., job name setup with step id set-matrix) which
emits partition-matrix via GITHUB_OUTPUT using a shell like echo "matrix=$(seq 1
${{ inputs.partitions }} | jq -R . | jq -sc .)" >> $GITHUB_OUTPUT, then change
the test job to depend on setup and set strategy.matrix.partition: ${{
fromJson(needs.setup.outputs.partition-matrix) }} so the matrix respects the
partitions input.

---

Nitpick comments:
In @.github/workflows/tests-rs-workspace.yml:
- Around line 39-40: The workflow currently uses floating references
taiki-e/install-action@cargo-llvm-cov and taiki-e/install-action@cargo-nextest;
change these to pinned versions/tags (for example a specific release tag or
commit SHA) to ensure reproducible CI runs—update the two uses entries
(taiki-e/install-action@cargo-llvm-cov and taiki-e/install-action@cargo-nextest)
to explicit pinned identifiers such as taiki-e/install-action@vX.Y.Z or
taiki-e/install-action@<commit-sha> and ensure the chosen tags are
tested/compatible with the rest of the workflow.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 936a548a-81bd-44b3-9570-058c781517fa

📥 Commits

Reviewing files that changed from the base of the PR and between 037c387 and 8422550.

📒 Files selected for processing (3)
  • .github/workflows/tests-rs-package.yml
  • .github/workflows/tests-rs-workspace.yml
  • .github/workflows/tests.yml

The matrix was already hardcoded to [1,2,3,4] so the input was ignored.
Simplify by removing it entirely.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Ensure sharded workspace tests run even when no Rust source files
changed (e.g. workflow-only PRs, push to base branch, nightly schedule).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add rs-workflows-changed path filter so the sharded workspace tests
also run on PRs that only modify CI workflow files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
dash-sdk has crate-type = ["cdylib", "rlib"] and the static rocksdb
library has TLS relocations (R_X86_64_TPOFF32) incompatible with shared
objects. Adding --lib --bins --tests tells cargo to only build test
targets, skipping the cdylib output that triggers the linker error.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
dash-sdk has crate-type = ["cdylib", "rlib"] which causes R_X86_64_TPOFF32
linker errors with static rocksdb when building shared objects under
cargo-llvm-cov. Exclude it alongside the other cdylib crates.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
.github/workflows/tests-rs-workspace.yml (1)

22-25: Minor: Consider using boolean for cache-on-failure.

While the string "false" works, using the native YAML boolean false is more idiomatic.

       - uses: Swatinem/rust-cache@v2
         with:
-          cache-on-failure: "false"
+          cache-on-failure: false
           workspaces: ". -> target/llvm-cov-target"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/tests-rs-workspace.yml around lines 22 - 25, Change the
string value "false" for the rust-cache action to a native YAML boolean false to
be more idiomatic: locate the job step that has uses: Swatinem/rust-cache@v2 and
update the cache-on-failure setting (cache-on-failure) from the quoted "false"
to the unquoted boolean false.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/tests-rs-workspace.yml:
- Around line 47-56: The workflow currently excludes the rs-sdk-ffi package so
its 55+ native integration tests never run; update
.github/workflows/tests-rs-workspace.yml to stop excluding rs-sdk-ffi (remove
the "--exclude rs-sdk-ffi" entry) or add a separate native test job (e.g.,
"native-rs-sdk-ffi-tests") that checks out the repo, sets up Rust, and runs
cargo test for the rs-sdk-ffi package (cargo test --manifest-path
packages/rs-sdk-ffi/Cargo.toml or cargo test -p rs-sdk-ffi) on a native runner
to execute its integration tests; ensure the job runs on the same
matrix/conditions as other Rust native tests and document the change in the
workflow comment.

---

Nitpick comments:
In @.github/workflows/tests-rs-workspace.yml:
- Around line 22-25: Change the string value "false" for the rust-cache action
to a native YAML boolean false to be more idiomatic: locate the job step that
has uses: Swatinem/rust-cache@v2 and update the cache-on-failure setting
(cache-on-failure) from the quoted "false" to the unquoted boolean false.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c301b89d-ff32-4f44-abf0-7ee42e159d58

📥 Commits

Reviewing files that changed from the base of the PR and between 8422550 and b502a7b.

📒 Files selected for processing (2)
  • .github/workflows/tests-rs-workspace.yml
  • .github/workflows/tests.yml

Set cache-on-failure to true so subsequent runs benefit from cached
compilation even if a test failure occurs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…lusions

Replace --workspace with --package for drive, dpp, and drive-abci.
This avoids the cdylib linker issue entirely (no cdylib crates in
the build) and is simpler than maintaining a growing exclusion list.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@codecov
Copy link

codecov bot commented Mar 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.11%. Comparing base (0d10ecd) to head (d573106).
⚠️ Report is 2 commits behind head on v3.1-dev.

Additional details and impacted files
@@              Coverage Diff              @@
##           v3.1-dev    #3237       +/-   ##
=============================================
+ Coverage     55.87%   68.11%   +12.23%     
=============================================
  Files          3173     3266       +93     
  Lines        235215   251726    +16511     
=============================================
+ Hits         131435   171454    +40019     
+ Misses       103780    80272    -23508     
Components Coverage Δ
dpp 58.20% <ø> (+23.85%) ⬆️
drive 77.03% <ø> (+20.57%) ⬆️
drive-abci 78.18% <ø> (-0.01%) ⬇️
sdk 31.13% <18.96%> (-1.00%) ⬇️
dapi-client 39.08% <ø> (+18.21%) ⬆️
platform-version ∅ <ø> (∅)
platform-value 39.35% <ø> (∅)
platform-wallet 60.40% <ø> (∅)
drive-proof-verifier ∅ <ø> (∅)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Comment on lines +57 to +64
- name: Upload coverage
if: always()
uses: codecov/codecov-action@v5
with:
files: lcov.info
flags: rust-shard-${{ matrix.partition }}
token: ${{ secrets.CODECOV_TOKEN }}
fail_ci_if_error: false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good job setting fail_ci_if_error to avoid forked pre from failing :)

Copy link
Member

@PastaPastaPasta PastaPastaPasta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine; I haven't noticed anything that will break forked PRs, nor did I notice anything wrong.

Add dash-sdk (234 tests), platform-value (125), rs-dapi (90),
platform-wallet (53), rs-dapi-client (11), and
platform-serialization (10) to the coverage test runner.

Strip cdylib from dash-sdk's Cargo.toml in CI to avoid the
static rocksdb TLS relocation linker error.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Strip cdylib from their Cargo.toml in CI (same static rocksdb issue).
Adds 328 more tests (260 + 68) to coverage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants