Unify quarantine and outerloop CI with the main test pipeline #16448

Open
radical wants to merge 6 commits into microsoft:main from radical:update-specialized-test-runner

@radical (Member) commented Apr 24, 2026

Description

Migrates the specialized test runner (quarantine/outerloop) from the old MSBuild-based runsheet generation to the new metadata-driven pipeline, and extracts a shared run-tests-core.yml workflow to eliminate duplicated test job definitions.

Key changes

  1. New run-tests-core.yml — reusable workflow containing the 6 dependency-based test bucket jobs + results gate. Shared by both tests.yml and specialized-test-runner.yml to avoid duplicating job definitions.

  2. Refactored specialized-test-runner.yml — now uses the same pipeline as tests.yml: TestEnumerationRunsheetBuilder → build-test-matrix → expand-test-matrix-github → split-test-matrix-by-deps → run-tests-core. Added grep-based project scoping and positive trait filtering (DiscoveryTraitFilter) to build a lean matrix of only relevant test classes.

  3. Simplified tests.yml — replaced 6 inline test job definitions with a single call to run-tests-core.yml.

  4. Updated tests-outerloop.yml / tests-quarantine.yml — added run-tests-core.yml to paths: triggers so workflow changes are validated on PRs.

  5. Deleted old MSBuild targets — removed SpecializedTestRunsheetBuilderBase.targets, OuterloopTestRunsheetBuilder.targets, QuarantinedTestRunsheetBuilder.targets, and cleared the old runsheet generation from AfterSolutionBuild.targets.

  6. split-test-projects-for-ci.ps1 — added IncludeTraitFilter parameter for positive trait filtering in class discovery mode. When set and zero classes are found, the script gracefully skips instead of erroring.

  7. tests/Directory.Build.targets — passes DiscoveryTraitFilter through to the split script as -IncludeTraitFilter.
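
As a rough illustration of the last two pipeline stages named in item 2 (expanding a platform-agnostic matrix into per-OS entries, then splitting by dependency), here is a minimal Python sketch. The real scripts are PowerShell and their schema is not shown in this description, so the field names (`name`, `os`, `needs`), the sample projects, and the bucket keys are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical canonical entries; the real metadata schema may differ.
CANONICAL = [
    {"name": "Project.A.Tests", "os": ["ubuntu-latest", "windows-latest"], "needs": []},
    {"name": "Project.B.Tests", "os": ["ubuntu-latest"], "needs": ["nugets"]},
]


def expand_per_os(entries):
    """Turn each platform-agnostic entry into one entry per OS it lists."""
    return [
        {"name": e["name"], "runs_on": os_name, "needs": e["needs"]}
        for e in entries
        for os_name in e["os"]
    ]


def split_by_deps(expanded):
    """Group expanded entries into buckets keyed by their dependency set."""
    buckets = defaultdict(list)
    for e in expanded:
        key = "+".join(sorted(e["needs"])) or "none"  # hypothetical bucket key
        buckets[key].append(e)
    return dict(buckets)


expanded = expand_per_os(CANONICAL)
buckets = split_by_deps(expanded)
```

In this toy version an entry with no dependencies lands in a `none` bucket; the PR's actual six bucket names are not listed here, so the sketch does not try to reproduce them.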

Tests added

  • IncludeTraitFilterGracefullySkipsWhenNoClassesFound — verifies graceful skip with empty output JSON
  • WithoutIncludeTraitFilterFailsWhenNoClassesFound — verifies error behavior is preserved without the filter
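
The two behaviors these tests pin down can be modeled in a few lines. This is a Python sketch of the decision only, not the PowerShell script itself: the function name and the error message are hypothetical, and only the `testPartitions` key is taken from the commit message in this PR.

```python
import json


def partitions_json(classes, include_trait_filter=None):
    """Return the JSON text to write, or raise when zero classes is an error."""
    if not classes:
        if include_trait_filter:
            # Filter set and nothing matched: skip gracefully with empty output.
            return json.dumps({"testPartitions": []})
        # No filter: zero discovered classes remains an error.
        raise RuntimeError("zero test classes discovered in class mode")
    return json.dumps({"testPartitions": list(classes)})
```

With a filter set, an empty class list yields valid JSON with an empty `testPartitions` array; without one, the same input raises.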

Checklist

  • Is this feature complete?
    • Yes. Ready to ship.
    • No. Follow-up changes expected.
  • Are you including unit tests for the changes and scenario tests if relevant?
    • Yes
    • No
  • Did you add public API?
    • Yes
    • No
  • Does the change make any security assumptions or guarantees?
    • Yes
    • No
  • Does the change require an update in our Aspire docs?
    • Yes
    • No

radical and others added 3 commits April 24, 2026 16:32
Replace the older bespoke test matrix pipeline used by quarantine and
outerloop workflows with the same metadata-driven pipeline as tests.yml:

- TestEnumerationRunsheetBuilder generates .tests-metadata.json per project
  (deterministic, no assembly execution)
- build-test-matrix.ps1 merges into canonical platform-agnostic matrix
- expand-test-matrix-github.ps1 expands to per-OS entries
- split-test-matrix-by-deps.ps1 splits into 6 dependency-based buckets

Key changes:
- specialized-test-runner.yml: Major rewrite to inline enumerate-tests
  steps, derive MSBuild props from attributeName, use 6 split test jobs
  with proper empty-bucket handling in results gate
- split-test-projects-for-ci.ps1: Add -IncludeTraitFilter parameter for
  positive trait filtering in class discovery (enables finding classes
  that ONLY contain quarantined/outerloop tests)
- tests/Directory.Build.targets: Wire DiscoveryTraitFilter MSBuild
  property to the -IncludeTraitFilter script parameter
- Caller workflows simplified (removed testRunnerName, extraRunSheetBuilderArgs)
- Delete old infrastructure: SpecializedTestRunsheetBuilderBase.targets,
  QuarantinedTestRunsheetBuilder/, OuterloopTestRunsheetBuilder/,
  _GenerateTestMatrix target from AfterSolutionBuild.targets

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move the 6 dependency-split test job definitions + results gate into
a new reusable workflow (run-tests-core.yml) called by both tests.yml
and specialized-test-runner.yml. This eliminates duplication of the
test bucket pattern across the two pipelines.

Changes:
- Create .github/workflows/run-tests-core.yml with 6 test bucket
  jobs and an internal results gate that handles empty-bucket skips
- Update specialized-test-runner.yml to call run-tests-core.yml
  instead of defining 6 inline test jobs
- Update tests.yml to call run-tests-core.yml instead of defining
  6 inline test jobs; simplify its results gate accordingly
- Add run-tests-core.yml to path triggers in quarantine/outerloop
  caller workflows

Trade-off: in tests.yml, tests_no_nugets now waits for builds to
complete (~2 min regression) since all buckets share a single
reusable workflow call with unified dependencies. This was accepted
for the maintenance win of having a single test job definition.

Nesting depth: 4 levels (max allowed by GitHub Actions):
  ci.yml -> tests.yml -> run-tests-core.yml -> run-tests.yml
  tests-quarantine.yml -> specialized-test-runner.yml ->
    run-tests-core.yml -> run-tests.yml

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…-ci.ps1

Test the new IncludeTraitFilter parameter added to split-test-projects-for-ci.ps1:
- IncludeTraitFilterGracefullySkipsWhenNoClassesFound: verifies that when
  IncludeTraitFilter is set and no matching classes are found, the script
  succeeds and writes valid JSON with empty testPartitions (graceful skip)
- WithoutIncludeTraitFilterFailsWhenNoClassesFound: verifies the existing
  behavior that zero discovered classes without IncludeTraitFilter is an error

Also updates the RunScript helper to accept the optional includeTraitFilter
parameter.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
github-actions Bot (Contributor) commented Apr 24, 2026

🚀 Dogfood this PR with:

⚠️ WARNING: Do not do this without first carefully reviewing the code of this PR to satisfy yourself it is safe.

curl -fsSL https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 16448

Or

  • Run remotely in PowerShell:
iex "& { $(irm https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 16448"

radical and others added 3 commits April 24, 2026 19:15
…kage

GitHub Actions' pwsh wrapper appends 'exit $LASTEXITCODE' after the user
script. When ignoreTestFailures is true, the Windows test steps skipped the
explicit exit but left $LASTEXITCODE set to the non-zero dotnet test exit
code, causing the step to fail despite the intent to ignore failures.

Also add always() to the 'Verify test results exist' step condition so it
runs even if a prior step failed unexpectedly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
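
The exit-code problem described in the commit above can be modeled without PowerShell. The Python sketch below only simulates the control flow: the wrapper's trailing `exit $LASTEXITCODE` is represented by the final `return`, and the `reset_before_return` flag stands in for whatever the actual fix does (for example clearing the value or exiting 0 explicitly), which is an assumption since the diff is not shown here.

```python
def step_exit_code(test_exit_code: int, ignore_failures: bool, reset_before_return: bool) -> int:
    """Model the exit code of a pwsh step whose wrapper ends with `exit $LASTEXITCODE`."""
    last_exit_code = test_exit_code  # set by the test process
    if last_exit_code != 0 and not ignore_failures:
        return last_exit_code  # the script's own explicit exit
    if reset_before_return:
        last_exit_code = 0  # assumed fix: clear the stale value
    return last_exit_code  # the wrapper's appended `exit $LASTEXITCODE`


# Before the fix: failures are "ignored" but the stale code still fails the step.
before = step_exit_code(1, ignore_failures=True, reset_before_return=False)
# After the fix: the step succeeds when failures are ignored.
after = step_exit_code(1, ignore_failures=True, reset_before_return=True)
```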
The TestEnumerationRunsheetBuilder.targets has explicit skip gates for
Aspire.Templates.Tests and Aspire.Cli.EndToEnd.Tests that require these
props to be set. Without them, these projects are excluded from the
canonical matrix even when the BeforeBuildProps lists them.

Projects with their own SkipTests conditions (e.g., Templates skips
quarantine runs) will still be correctly excluded by the downstream
SkipTests check in the runsheet builder.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The splitting is an implementation detail; the parent workflows provide
the context of what kind of tests are running.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
github-actions Bot (Contributor) commented

Re-running the failed jobs in the CI workflow for this pull request because 1 job was identified as a retry-safe transient failure in the CI run attempt.
GitHub was asked to rerun all failed jobs for that attempt, and the rerun is being tracked in the rerun attempt.
The job links below point to the failed attempt jobs that matched the retry-safe transient failure rules.

@radical radical changed the title from "Migrate specialized test runners to metadata-driven pipeline and extract run-tests-core.yml" to "Unify quarantine and outerloop CI with the main test pipeline" Apr 25, 2026
@radical radical marked this pull request as ready for review April 26, 2026 00:48
Copilot AI review requested due to automatic review settings April 26, 2026 00:48
Copilot AI (Contributor) left a comment

Pull request overview

This PR migrates the quarantine/outerloop “specialized test runner” off the legacy MSBuild runsheet generation and onto the same metadata-driven test matrix pipeline used by the main CI, while extracting a shared reusable workflow to eliminate duplicated test-bucket job definitions.

Changes:

  • Added a new reusable workflow (run-tests-core.yml) that executes the 6 dependency-based test buckets plus a results gate.
  • Refactored specialized-test-runner.yml to generate/split matrices via the canonical pipeline and then call run-tests-core.yml (including a new positive trait discovery mode for class splitting).
  • Simplified tests.yml to call run-tests-core.yml, and removed legacy MSBuild runsheet targets/logic.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tests/Infrastructure.Tests/PowerShellScripts/SplitTestProjectsTests.cs | Adds tests for new IncludeTraitFilter “graceful skip” behavior. |
| tests/Directory.Build.targets | Plumbs DiscoveryTraitFilter through to the split script as -IncludeTraitFilter. |
| eng/scripts/split-test-projects-for-ci.ps1 | Adds IncludeTraitFilter support and “skip instead of fail” behavior when no matching classes are found. |
| eng/SpecializedTestRunsheetBuilderBase.targets | Deletes legacy specialized runsheet generator base targets. |
| eng/QuarantinedTestRunsheetBuilder/QuarantinedTestRunsheetBuilder.targets | Deletes quarantined runsheet builder targets. |
| eng/OuterloopTestRunsheetBuilder/OuterloopTestRunsheetBuilder.targets | Deletes outerloop runsheet builder targets. |
| eng/AfterSolutionBuild.targets | Removes legacy combined runsheet generation target (keeps canonical matrix generation). |
| .github/workflows/tests.yml | Replaces 6 inline bucket jobs with a single call to run-tests-core.yml. |
| .github/workflows/tests-quarantine.yml | Updates paths: triggers and switches to new specialized runner inputs. |
| .github/workflows/tests-outerloop.yml | Updates paths: triggers and switches to new specialized runner inputs. |
| .github/workflows/specialized-test-runner.yml | Rebuilds specialized test execution around canonical matrices + run-tests-core.yml. |
| .github/workflows/run-tests.yml | Ensures ignoreTestFailures exits 0 reliably and always verifies TRX presence when enabled. |
| .github/workflows/run-tests-core.yml | New shared workflow implementing the 6 bucket jobs + results gate. |
Comments suppressed due to low confidence (1)

eng/scripts/split-test-projects-for-ci.ps1:50

  • The script header still states it "Fails fast if zero test classes discovered when in class mode", but with IncludeTraitFilter set the script now intentionally succeeds and writes an empty partitions file. Update the .NOTES (and/or description) to reflect the conditional behavior so callers understand when zero classes is an error vs a skip.
.NOTES
  PowerShell 7+
  Fails fast if ExtractTestPartitions cannot be built or run.
  Fails fast if zero test classes discovered when in class mode.
  Only runs --list-tests when no partitions are found in the assembly.

Comment on lines 225 to +244
- uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
if: steps.check_tests.outputs.tests_run == 'true'
continue-on-error: true
with:
pattern: 'logs-*'
path: ${{ github.workspace }}/artifacts/all-logs
pattern: logs-*-ubuntu-latest
merge-multiple: true
path: ${{ github.workspace }}/testresults/ubuntu-latest

# Organize the .trx files by OS
- name: Organize test results by OS
if: steps.check_tests.outputs.tests_run == 'true'
shell: pwsh
run: |
$logDirectory = "${{ github.workspace }}/artifacts/all-logs"

# Create OS-specific directories
New-Item -ItemType Directory -Path "${{ github.workspace }}/testresults/ubuntu-latest" -Force
New-Item -ItemType Directory -Path "${{ github.workspace }}/testresults/windows-latest" -Force
New-Item -ItemType Directory -Path "${{ github.workspace }}/testresults/macos-latest" -Force

# Find all .trx files
$trxFiles = Get-ChildItem -Path $logDirectory -Filter *.trx -Recurse

# Copy each .trx file to the appropriate OS folder
foreach ($trxFile in $trxFiles) {
if ($trxFile.FullName -match "ubuntu") {
Copy-Item -Path $trxFile.FullName -Destination "${{ github.workspace }}/testresults/ubuntu-latest/" -Force
} elseif ($trxFile.FullName -match "windows") {
Copy-Item -Path $trxFile.FullName -Destination "${{ github.workspace }}/testresults/windows-latest/" -Force
} elseif ($trxFile.FullName -match "macos") {
Copy-Item -Path $trxFile.FullName -Destination "${{ github.workspace }}/testresults/macos-latest/" -Force
}
}
- uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
continue-on-error: true
with:
pattern: logs-*-windows-latest
merge-multiple: true
path: ${{ github.workspace }}/testresults/windows-latest

- uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
continue-on-error: true
with:
pattern: logs-*-macos-latest
merge-multiple: true
path: ${{ github.workspace }}/testresults/macos-latest
Copilot AI commented Apr 26, 2026

The results job only downloads artifacts matching logs-*-ubuntu-latest/windows-latest/macos-latest. Test jobs can run on custom runners (e.g., 8-core-ubuntu-latest, ubuntu-24.04-arm, windows-11-arm), so their logs/.trx files won’t be included in the summary. Consider downloading all logs-* artifacts (or expanding the patterns to include the non-default runner labels used by your matrices) so the combined test summary is complete.
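
To make the concern concrete, the Python sketch below checks a few artifact names against the three patterns from the diff. The artifact names are hypothetical and assume a `logs-<project>-<runner label>` scheme, which this thread does not confirm. Python's `fnmatch` is also only an approximation of the glob matching the download action performs.

```python
from fnmatch import fnmatchcase

# Patterns taken from the diff under review.
PATTERNS = ["logs-*-ubuntu-latest", "logs-*-windows-latest", "logs-*-macos-latest"]

# Hypothetical artifact names, assuming a logs-<project>-<runner label> scheme.
ARTIFACTS = [
    "logs-Example.Tests-ubuntu-latest",
    "logs-Example.Tests-8-core-ubuntu-latest",
    "logs-Example.Tests-ubuntu-24.04-arm",
    "logs-Example.Tests-windows-11-arm",
]


def unmatched(artifacts, patterns):
    """Return the artifact names that none of the patterns would select."""
    return [a for a in artifacts if not any(fnmatchcase(a, p) for p in patterns)]


missing = unmatched(ARTIFACTS, PATTERNS)
```

Under this naming assumption, a name ending in `8-core-ubuntu-latest` would still be picked up by the ubuntu pattern because `*` can absorb the extra segments, while the two `-arm` names match none of the three patterns.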

tests: ${{ fromJson(needs.generate_tests_matrix.outputs.runsheet) }}
if: ${{ github.repository_owner == 'microsoft' && !cancelled() && !failure() }}
uses: ./.github/workflows/run-tests.yml
if: ${{ !cancelled() && !failure() }}
Copilot AI commented Apr 26, 2026

generate_tests_matrix/build_* jobs are gated on github.repository_owner == 'microsoft', but run_tests is not. In forks, generate_tests_matrix will be skipped and the run-tests-core call will receive empty (non-JSON) matrix inputs, likely failing at fromJson evaluation. Add the same repository_owner guard to run_tests (or otherwise ensure it won’t run when matrices aren’t produced).

Suggested change
- if: ${{ !cancelled() && !failure() }}
+ if: ${{ github.repository_owner == 'microsoft' && !cancelled() && !failure() }}
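
The failure mode the reviewer anticipates can be approximated in Python, using `json.loads` as a stand-in for the workflow's `fromJson`. That substitution is an assumption: the two are different implementations, and the comment itself only says the evaluation would "likely" fail. The function names are hypothetical.

```python
import json


def run_tests_should_run(repository_owner: str, cancelled: bool = False, failed: bool = False) -> bool:
    """Mirror the suggested guard: owner check plus not cancelled and not failed."""
    return repository_owner == "microsoft" and not cancelled and not failed


def parse_matrix(raw: str):
    """Stand-in for fromJson: an empty string is not valid JSON."""
    return json.loads(raw)


def matrix_for(repository_owner: str, raw_output: str):
    """Only parse the matrix when the guarded job would actually run."""
    if not run_tests_should_run(repository_owner):
        return None  # job skipped, so the empty output is never parsed
    return parse_matrix(raw_output)
```

In a fork the upstream job is skipped and its output is empty, so without the guard the parse is attempted on an empty string; with the guard the job is skipped before that point.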
