
Add unified setup command + fix all CI E2E failures #7

Merged
abueide merged 28 commits into main from feat/add-setup-command on Mar 3, 2026


@abueide abueide commented Mar 2, 2026

Summary

This PR adds a unified setup command to all plugins and fixes all CI E2E failures. All 6 CI jobs are now green.

Changes

1. Unified Setup Command (feat)

Adds `devbox run setup` to all three plugins so SDKs are pre-evaluated before the parallel build phases start:

  • plugins/android/virtenv/scripts/user/setup.sh — Evaluates Nix flake, ensures ANDROID_SDK_ROOT is set, respects ANDROID_SKIP_SETUP
  • plugins/ios/virtenv/scripts/user/setup.sh — Verifies Xcode/iOS toolchain readiness, respects IOS_SKIP_SETUP
  • plugins/react-native/virtenv/scripts/user/setup.sh — Orchestrates both Android and iOS setup with skip flag support
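
As a rough illustration of the orchestration described above, the sketch below shows one way a combined setup could honor the two skip flags. Only the flag names `ANDROID_SKIP_SETUP` and `IOS_SKIP_SETUP` come from this PR. The functions `setup_android`, `setup_ios`, and `run_rn_setup` are hypothetical stand-ins, and treating the value `1` as "skip" is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical stand-ins for the per-platform setup scripts.
setup_android() { echo "android: setup ran"; }
setup_ios()     { echo "ios: setup ran"; }

# Run each platform's setup unless its skip flag is 1 (assumed convention).
run_rn_setup() {
  if [ "${ANDROID_SKIP_SETUP:-0}" = "1" ]; then
    echo "android: skipped"
  else
    setup_android || return 1
  fi

  if [ "${IOS_SKIP_SETUP:-0}" = "1" ]; then
    echo "ios: skipped"
  else
    setup_ios || return 1
  fi
}
```

Calling `ANDROID_SKIP_SETUP=1 run_rn_setup` would then skip the Android half and still run the iOS half.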

2. E2E Test Suite Improvements (fix)

  • Setup phase added to all E2E test suites (setup-sdk process) — warms Nix cache before parallel builds start
  • SDK root shared through a state file, because process-compose processes run in isolated environments
  • Step status tracking — setup-sdk writes pass/fail status files so failures appear in test summaries (prevents false positives)
  • Nix error surfacing — core.sh captures Nix build stderr instead of suppressing; shows last 15 lines on failure
  • ERR trap fix — replaced trap ERR pattern with explicit if/else (ERR trap doesn't fire in || compounds, causing missed failures)
  • Deploy diagnostics — on deploy failure, captures emulator state, installed packages, manual launch attempt, pidof check, crash logcat, full logcat dump, and device-info to reports/logs/
  • iOS build depends on simulator — build-app now depends on simulator being healthy, ensuring the runtime is available before xcodebuild resolves the iOS Simulator destination
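
The Nix error surfacing item can be pictured with a small wrapper like the one below. It is a sketch, not the code in core.sh: `run_with_stderr_tail` is a hypothetical helper name, and only the ideas of capturing stderr in a temp file and showing the last 15 lines on failure come from this PR.

```bash
#!/usr/bin/env bash
# Run a command, pass stdout through, and show the tail of stderr only on failure.
# run_with_stderr_tail is a hypothetical helper, not the function in core.sh.
run_with_stderr_tail() {
  local err_file status
  err_file="$(mktemp)" || return 1

  "$@" 2>"$err_file"
  status=$?

  if [ "$status" -ne 0 ]; then
    echo "command failed (exit $status); last 15 lines of stderr:" >&2
    tail -n 15 "$err_file" >&2
  fi

  rm -f "$err_file"
  return "$status"
}
```

Compared with `2>/dev/null`, this keeps successful runs quiet while making the real error visible in CI logs when the wrapped command fails.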

3. CI Workflow Fixes (fix)

  • Disk cleanup — removes pre-installed Android SDK (~13GB), .NET, GHC, PowerShell, Chromium, Boost, hostedtoolcache from Ubuntu runners
  • ANDROID_DEVICES=max — limits Nix flake evaluation to single API level, saving ~500-800MB
  • Stable Xcode pinning — xcode-select step selects latest non-beta Xcode (Xcode 26.4 beta has clang incompatibility with RN fmt library)

4. Device Configuration Updates (fix)

  • iOS max runtime 26.2 to 26.3 — matches stable Xcode 26.3 SDK
  • RN Android max API 36 to 35 — matches React Native 0.83 compileSdk=35 / targetSdk=35 (standalone Android example stays at API 36)

5. Android Deploy Fix (fix)

  • Process detection window increased from 5 attempts x 1s (~7s) to 15 attempts x 2s (~32s) in deploy.sh
  • Diagnostics confirmed the app was running correctly but pidof via adb shell was too slow on CI emulators — detected just after the old window closed
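
A bounded polling loop of the kind described above might look like the following sketch. `wait_for` is a hypothetical helper and deploy.sh may be structured differently; the attempt count and delay are parameters so the old (5, 1) and new (15, 2) windows are both expressible.

```bash
#!/usr/bin/env bash
# Poll a check command up to `attempts` times, sleeping `delay` seconds between tries.
# wait_for is a hypothetical helper; deploy.sh may be structured differently.
wait_for() {
  local attempts="$1" delay="$2" i
  shift 2

  for ((i = 1; i <= attempts; i++)); do
    if "$@" >/dev/null 2>&1; then
      echo "detected on attempt $i"
      return 0
    fi
    # No sleep after the final attempt.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
  done

  echo "not detected after $attempts attempts" >&2
  return 1
}
```

With this shape, the new window would correspond to something like `wait_for 15 2 adb shell pidof "$APP_ID"`, where `APP_ID` is an assumed variable holding the application ID.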

6. iOS Runtime Strictness (fix)

  • Pure mode now uses resolve_runtime_strict instead of resolve_runtime in ios.sh — fails fast on runtime mismatch instead of silently falling back to wrong version
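
The difference between the two resolution modes can be sketched as follows. The bodies of `resolve_runtime` and `resolve_runtime_strict` are not shown in this PR, so the functions below use different names and are purely illustrative; in particular, falling back to the newest installed runtime is an assumption about what "wrong version" means.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: choose a runtime from a list of installed versions.
# Usage: <function> <wanted-version> <installed-version>...
# These bodies are illustrative and are not the functions in ios.sh.

# Lenient: return the requested version if installed, else the newest installed.
resolve_runtime_lenient() {
  local wanted="$1" v; shift
  for v in "$@"; do
    [ "$v" = "$wanted" ] && { echo "$v"; return 0; }
  done
  [ "$#" -gt 0 ] || return 1
  printf '%s\n' "$@" | sort -V | tail -n 1
}

# Strict: succeed only on an exact match, otherwise fail with a message.
resolve_runtime_strict_sketch() {
  local wanted="$1" v; shift
  for v in "$@"; do
    [ "$v" = "$wanted" ] && { echo "$v"; return 0; }
  done
  echo "runtime $wanted not installed (available: $*)" >&2
  return 1
}
```

With 26.2 and 26.4 installed and 26.3 requested, the lenient variant quietly picks 26.4 while the strict variant exits non-zero, which is the fail-fast behavior wanted in pure mode.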

7. Validation Tests (test)

  • Android: 17 tests covering SDK dirs, env vars, tools in PATH, devices, skip flag, idempotency
  • iOS: 12 tests covering Xcode tools, env vars, devices, skip flag, idempotency
  • React Native: 16 tests covering Node.js, Android SDK, iOS tools, skip flag combinations, idempotency

Files Modified (26 files, +1045 / -72)

| File | Change |
|------|--------|
| `.github/workflows/pr-checks.yml` | Disk cleanup, ANDROID_DEVICES=max, stable Xcode pinning |
| `plugins/android/plugin.json` | Register setup script |
| `plugins/android/virtenv/scripts/user/setup.sh` | New: Android setup command |
| `plugins/android/virtenv/scripts/platform/core.sh` | Surface Nix build errors |
| `plugins/android/virtenv/scripts/domain/deploy.sh` | Increase process detection window |
| `plugins/ios/plugin.json` | Register setup script |
| `plugins/ios/virtenv/scripts/user/setup.sh` | New: iOS setup command |
| `plugins/ios/virtenv/scripts/user/ios.sh` | Strict runtime resolution in pure mode |
| `plugins/ios/config/devices/max.json` | Runtime 26.2 to 26.3 |
| `plugins/react-native/plugin.json` | Register setup script |
| `plugins/react-native/virtenv/scripts/user/setup.sh` | New: RN setup command |
| `examples/android/tests/test-suite.yaml` | Setup phase, deploy diagnostics |
| `examples/ios/tests/test-suite.yaml` | Setup phase, build-to-simulator dependency |
| `examples/ios/devbox.d/ios/devices/max.json` | Runtime to 26.3 |
| `examples/ios/devbox.d/ios/devices/devices.lock` | Regenerated |
| `examples/react-native/tests/test-suite-android-e2e.yaml` | Setup phase, deploy diagnostics |
| `examples/react-native/tests/test-suite-ios-e2e.yaml` | Setup phase, build-to-simulator dependency |
| `examples/react-native/tests/test-suite-web-e2e.yaml` | Setup phase |
| `examples/react-native/tests/test-suite-all-e2e.yaml` | Setup phase, build-to-simulator dependency |
| `examples/react-native/devbox.d/android/devices/max.json` | API 36 to 35 |
| `examples/react-native/devbox.d/android/devices/devices.lock` | Regenerated |
| `examples/react-native/devbox.d/ios/devices/max.json` | Runtime to 26.3 |
| `examples/react-native/devbox.d/ios/devices/devices.lock` | Regenerated |
| `plugins/tests/android/test-validate-env.sh` | New: 17 validation tests |
| `plugins/tests/ios/test-validate-env.sh` | New: 12 validation tests |
| `plugins/tests/react-native/test-validate-env.sh` | New: 16 validation tests |

CI Results

All 6 jobs passing (run 22609054162):

| Job | Result | Time |
|-----|--------|------|
| Fast Tests (Lint + Unit + Integration) | Pass | 5m36s |
| Android E2E - max | Pass | 10m44s |
| iOS E2E - max | Pass | 5m40s |
| React Native E2E - android-max | Pass | 26m55s |
| React Native E2E - ios-max | Pass | 7m28s |
| React Native E2E - web-none | Pass | 2m31s |

Test plan

  • All 6 CI jobs green (fast tests, Android E2E, iOS E2E, RN Android, RN iOS, RN Web)
  • Android deploy diagnostics confirmed app was running (not crashing) — timing issue fixed
  • iOS strict runtime catches mismatches in pure mode
  • Non-pure iOS mode still falls back gracefully
  • Stable Xcode pinning avoids beta clang incompatibilities
  • Disk cleanup frees enough space for both standalone and RN Android builds
  • 45 new validation tests passing across all three plugins

Generated with Claude Code

abueide and others added 18 commits March 2, 2026 13:30
Add a generic `devbox run setup` command to all plugins that:
- Pre-evaluates and caches SDKs before parallel build phases
- Respects ANDROID_SKIP_SETUP and IOS_SKIP_SETUP flags
- Is idempotent and safe to call multiple times
- Works in any project (Android, iOS, or React Native)

Implementation:
- plugins/android: setup script evaluates Nix flake, ensures ANDROID_SDK_ROOT
- plugins/ios: setup script verifies Xcode/iOS toolchain readiness
- plugins/react-native: setup orchestrates both Android and iOS (respecting skip flags)

This fixes race conditions in CI where builds started before SDK evaluation
completed, causing "ANDROID_HOME not set" and "xcrun simctl" failures.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Update all test suites to call `devbox run setup` before starting builds:
- examples/android/tests/test-suite.yaml
- examples/ios/tests/test-suite.yaml
- examples/react-native/tests/test-suite-android-e2e.yaml
- examples/react-native/tests/test-suite-ios-e2e.yaml
- examples/react-native/tests/test-suite-web-e2e.yaml
- examples/react-native/tests/test-suite-all-e2e.yaml

Setup phase ensures SDK evaluation completes before parallel build
phases start, eliminating race conditions where gradle/xcodebuild
would fail with "SDK not found" errors on fresh CI runners.

The setup-sdk process:
- Runs first (no dependencies)
- Other phases depend on it (build-app, sync-avds, sync-simulators, etc.)
- Respects skip flags set in environment section

This fixes the test failures observed in PR #4 (Dependabot actions/checkout v6)
where Android and iOS E2E tests failed due to SDK not being ready.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Adds comprehensive validation tests for the new setup command across all three plugins:
- Android: Tests SDK setup, ANDROID_SDK_ROOT, tools in PATH (adb, emulator, avdmanager), skip flag, idempotency
- iOS: Tests Xcode tools (xcrun, simctl), skip flag, idempotency
- React Native: Tests Node.js/npm, Android SDK, iOS tools, all skip flag combinations, idempotency

All tests validate that setup command properly configures environments and respects skip flags.

Test results:
- Android: 8/8 passed
- iOS: 6/6 passed
- React Native: 9/9 passed

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
process-compose processes don't have access to devbox shell scripts, so the setup-sdk phase needs to call the setup.sh script directly using bash.

Changes:
- Android: Changed from "setup" to "bash ${ANDROID_SCRIPTS_DIR}/user/setup.sh"
- iOS: Changed from "setup" to "bash ${IOS_SCRIPTS_DIR}/user/setup.sh"
- React Native: Changed from "setup" to "bash ${REACT_NATIVE_VIRTENV}/scripts/user/setup.sh"

All E2E tests passing:
- Android: 2/2 tests passed
- iOS: 3/3 tests passed
- React Native Android: 5/5 tests passed
- React Native iOS: 6/6 tests passed
- React Native Web: 3/3 tests passed

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
When running inside process-compose (e.g., E2E test suites), the
setup-sdk process inherits ANDROID_SDK_ROOT from the devbox init hook.
Previously, the setup script would unconditionally re-evaluate the Nix
flake, which could fail silently on CI runners due to nix build errors
being suppressed with 2>/dev/null.

Now the setup scripts check if the SDK is already configured from the
inherited environment before attempting re-evaluation. This fixes the
"ANDROID_SDK_ROOT not set" failure seen on CI.

Changes:
- Android setup.sh: Skip init/setup.sh sourcing when ANDROID_SDK_ROOT
  is already set and the directory exists
- iOS setup.sh: Skip init/setup.sh sourcing when IOS_DEVELOPER_DIR
  is already set from devbox init_hook

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
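
The guard described in the commit message above could be sketched like this. `ensure_android_sdk` and `evaluate_sdk` are hypothetical names; `evaluate_sdk` merely stands in for sourcing init/setup.sh, which is not reproduced here.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the "already configured?" guard.
# evaluate_sdk stands in for sourcing init/setup.sh; it is not a real plugin function.
evaluate_sdk() { echo "evaluating"; }

ensure_android_sdk() {
  # Reuse the inherited value only if it points at an existing directory.
  if [ -n "${ANDROID_SDK_ROOT:-}" ] && [ -d "$ANDROID_SDK_ROOT" ]; then
    echo "reusing $ANDROID_SDK_ROOT"
    return 0
  fi
  evaluate_sdk
}
```

Checking that the directory exists, and not only that the variable is set, guards against a stale value inherited from a different environment.
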
Process-compose processes have isolated environments - env vars set
in one process don't propagate to siblings. The setup-sdk phase now
sources init/setup.sh directly (instead of calling user/setup.sh as
a subprocess), and each build process also sources init/setup.sh to
get ANDROID_SDK_ROOT in its own environment. The setup-sdk phase
warms the Nix cache so subsequent evaluations are instant.

Also updates validation tests to run in --pure mode (matching CI)
and adds comprehensive environment checks:
- Android: 17 tests (SDK dirs, env vars, tools, devices, skip, idempotency)
- iOS: 12 tests (Xcode tools, env vars, devices, skip, idempotency)
- React Native: 16 tests (Node.js, Android SDK, iOS, skip flags, idempotency)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Process-compose processes have isolated environments, so setup-sdk
writing to a file allows build-app to read the resolved SDK root
without re-evaluating the Nix flake.

Key changes:
- setup-sdk: after init/setup.sh, falls back to detecting SDK from
  adb in PATH if Nix eval failed, then writes SDK root to
  .state/sdk_root file
- build-app: reads SDK root from .state/sdk_root if ANDROID_SDK_ROOT
  is not set in the inherited environment

This fixes CI failures where the Nix flake evaluation fails in
process-compose subprocesses but devbox already installed the SDK
and put tools in PATH.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
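
A minimal sketch of the file hand-off described in the commit message above. The file name `.state/sdk_root` comes from this PR; `STATE_DIR`, `write_sdk_root`, and `load_sdk_root` are assumed names, and the validation step mirrors the later review fix that checks `ANDROID_SDK_ROOT` after reading the state file.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of passing the SDK root between isolated processes via a file.
# STATE_DIR is an assumed variable; the PR only names the file .state/sdk_root.
STATE_DIR="${STATE_DIR:-.state}"

# Producer side (setup-sdk): record the resolved SDK root.
write_sdk_root() {
  mkdir -p "$STATE_DIR" && printf '%s\n' "$1" > "$STATE_DIR/sdk_root"
}

# Consumer side (build-app): keep the inherited value, else read the file.
load_sdk_root() {
  if [ -z "${ANDROID_SDK_ROOT:-}" ] && [ -r "$STATE_DIR/sdk_root" ]; then
    ANDROID_SDK_ROOT="$(cat "$STATE_DIR/sdk_root")"
  fi
  # Validate before exporting so a stale or empty file fails loudly.
  if [ -z "${ANDROID_SDK_ROOT:-}" ] || [ ! -d "$ANDROID_SDK_ROOT" ]; then
    echo "ANDROID_SDK_ROOT is unset or not a directory" >&2
    return 1
  fi
  export ANDROID_SDK_ROOT
}
```
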
- Add step status file tracking to setup-sdk in all E2E test suites so
  setup failures are properly reported in test summaries (prevents false
  positives where blocked steps appear as 0/0 instead of failures)
- When Nix flake evaluation fails silently, retry with visible stderr
  output so CI logs show the actual error for debugging
- Add adb-based SDK detection fallback after Nix retry

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace redundant Nix re-build attempt with ANDROID_DEBUG_SETUP=1 env
var, which enables debug logging in core.sh's resolve_flake_sdk_root().
This shows the actual Nix build result in CI logs without trying to
re-run nix (which may not be in PATH in process-compose subprocesses).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three fixes to the E2E test pipeline:

1. core.sh: Capture Nix build stderr instead of suppressing it with
   2>/dev/null. On failure, display the last 15 lines so CI logs show
   the actual error. Remove CI/GITHUB_ACTIONS gates on progress messages
   since --pure strips those env vars.

2. Test suites: Replace ERR trap pattern with explicit if/else status
   file writing in setup-sdk. The ERR trap doesn't fire for commands in
   || compounds, so status files were never written on failure, causing
   summary to report 0 failures (false positive).

3. Test suites: Add build-node to cleanup dependencies in RN Android,
   iOS, and all-e2e suites. Without this, summary could run before
   independent processes finished writing status files, missing their
   results.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
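
The ERR trap behavior mentioned in item 2 above is standard Bash: the trap is skipped for a failing command in an `&&` or `||` list unless it is the last command of that list. The sketch below contrasts that with explicit status writing. `STATUS_DIR`, `record_status`, `trap_based_step`, and `run_step` are hypothetical names, and the `<name>.status` file layout is an assumption.

```bash
#!/usr/bin/env bash
# Demonstrates why an ERR trap can miss failures, and the explicit alternative.
# STATUS_DIR, record_status, and run_step are hypothetical names for illustration.
STATUS_DIR="${STATUS_DIR:-$(mktemp -d)}"

record_status() { printf '%s\n' "$2" > "$STATUS_DIR/$1.status"; }

# Fragile: the trap never runs because `false` is on the left side of `||`.
trap_based_step() {
  (
    trap 'record_status trap-step fail' ERR
    false || true
  )
}

# Robust: branch on the exit code and always write a status file.
run_step() {
  local name="$1"; shift
  if "$@"; then
    record_status "$name" pass
  else
    record_status "$name" fail
    return 1
  fi
}
```

After `trap_based_step` no status file exists, which is how a summary can end up reporting zero failures; `run_step` always leaves either `pass` or `fail` behind.
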
Use resolve_runtime_strict in pure/CI mode so simulator start fails
immediately when the configured runtime isn't available, instead of
silently falling back. Add disk cleanup step and ANDROID_DEVICES=max
to Android CI jobs to prevent "No space left on device" failures.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The macos-26 runner's Xcode SDK moved to 26.4, causing xcodebuild to
fail with "iOS 26.4 is not installed" when building for generic iOS
Simulator destination. Update max.json and regenerate lock files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add /usr/local/share/powershell, chromium, boost, and
/opt/hostedtoolcache to the disk cleanup step. The RN Android emulator
was failing with "Not enough space to create userdata partition" because
the additional RN dependencies (Node, npm, Gradle) consumed the
headroom from the initial cleanup.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Xcode 26.4 beta requires a matching iOS simulator runtime installed
before xcodebuild can resolve the generic/platform=iOS Simulator
destination. The simulator start path downloads the runtime if needed,
so making build-app/build-ios depend on the simulator being healthy
ensures the runtime exists before the build starts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The macos-26 runner defaults to Xcode_26.4_beta which has a clang
incompatibility with React Native's bundled fmt library. Select the
latest stable (non-beta) Xcode before building. Update max.json to
26.3 to match the stable Xcode 26.3 SDK.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
React Native 0.83 targets compileSdk/targetSdk 35. Running on an API
36 emulator caused flaky app crashes on CI. Align the RN Android max
device to API 35 to match the app's target SDK. The standalone Android
example keeps API 36 since it targets the latest SDK directly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove set -e from deploy steps to capture diagnostics on failure
instead of immediately exiting. On deploy failure, capture emulator
state, installed packages, manual launch attempt, process check,
crash logcat, and full logcat/device-info to reports/logs/.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The deploy script's 5-attempt × 1s window (~7s total) was too short
for CI emulators where adb shell pidof takes 1-2s per call. Diagnostics
confirmed the app was running correctly but detected just after the
window closed. Increase to 15 attempts × 2s (~32s total).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@abueide changed the title from "Add unified setup command to fix CI test failures" to "Add unified setup command + fix all CI E2E failures" on Mar 3, 2026
abueide and others added 10 commits March 3, 2026 01:03
PR review fixes:
- Add adb-in-PATH fallback and state file writing to user/setup.sh,
  deduplicate ~15-line SDK fallback from 3 YAML setup-sdk processes
- Use mktemp for temp file in core.sh instead of TMPDIR/PID pattern
- Add [LEVEL] accessible text prefixes alongside emojis in all setup scripts
- Validate ANDROID_SDK_ROOT after reading state file in build processes
- Replace fragile assert_success shell evals with assert_not_empty,
  assert_equal, and assert_contains in test-validate-env.sh files
- Standardize success messages across Android/iOS/RN setup scripts
- Add set -e to deploy-app processes with set +e around intentional
  exit code capture

CI reliability fixes:
- Add expected-steps validation to e2e_report_steps: accepts step names
  as arguments and reports missing steps as failures, catching false
  successes where processes are skipped due to broken dependency chains
- Update all 6 E2E YAML summaries with explicit expected step lists
- Add set -e to all setup-sdk, verify-emulator-ready,
  verify-simulator-ready, verify-app-running, and verify-*-running
  processes that were missing it
- iOS/RN-iOS setup-sdk processes now use user/setup.sh (with fallback
  to init/setup.sh) for consistency with Android setup flow

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
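
The expected-steps validation described above might be sketched as follows. This is not the real `e2e_report_steps`; `report_steps` is a hypothetical function, and the layout of one `<step>.status` file per step containing `pass` or `fail` is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of expected-steps validation; the real e2e_report_steps
# and its status-file layout may differ. Assumes one "<step>.status" file per
# step, containing "pass" or "fail", inside the directory given as $1.
report_steps() {
  local dir="$1" step failures=0
  shift

  for step in "$@"; do
    if [ ! -f "$dir/$step.status" ]; then
      echo "MISSING $step"          # never ran: counted as a failure
      failures=$((failures + 1))
    elif [ "$(cat "$dir/$step.status")" = "pass" ]; then
      echo "PASS    $step"
    else
      echo "FAIL    $step"
      failures=$((failures + 1))
    fi
  done

  echo "$failures failure(s) out of $# expected step(s)"
  [ "$failures" -eq 0 ]
}
```

Passing the expected names explicitly means a step that was skipped because of a broken dependency chain shows up as `MISSING` instead of silently disappearing from the summary.
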
Add verify-emulator-ready and verify-simulator-ready processes to the
Android and iOS standalone test suites, matching the pattern already
used by the React Native suites. This prevents pipeline hangs when the
emulator/simulator starts but never passes its readiness probe.

deploy-app now depends on the verify process (process_completed_successfully)
instead of directly on the emulator/simulator (process_healthy), ensuring
the dependency chain always resolves within the timeout window and the
summary process always runs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pass -buildVersion to xcodebuild -downloadPlatform so it downloads the
exact runtime version requested by the device definition (e.g., iOS 26.3)
rather than whatever is latest for the current Xcode (e.g., iOS 26.4).

This was causing iOS E2E failures on runners with Xcode 26.4, which only
ships with the iOS 26.4 beta runtime. The download command would fetch
26.4 again instead of the requested 26.3 stable runtime.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…issing

In pure/CI mode (DEVBOX_PURE_SHELL=1), devices sync now exits 1 when
devices are skipped due to unavailable runtimes or system images. In
normal dev mode, behavior is unchanged (warn and continue). Also fixes
Android android_ensure_avd_from_definition to return code 3 for skipped
devices (consistent with iOS) and corrects result tracking in sync.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
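
The exit policy from the commit message above can be sketched as below. The return code 3 for skipped devices and the `DEVBOX_PURE_SHELL=1` check come from this PR; `ensure_device` and `sync_devices` are hypothetical stand-ins, and the `*-unavailable` naming is a fake trigger used only to make the sketch runnable.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the sync exit policy. ensure_device is a stand-in that
# returns 0 (ready), 3 (skipped: runtime or system image unavailable), or
# another non-zero code (hard error); its body here is fake for illustration.
ensure_device() {
  case "$1" in
    *-unavailable) return 3 ;;
    *)             return 0 ;;
  esac
}

sync_devices() {
  local device rc skipped=0
  for device in "$@"; do
    ensure_device "$device"; rc=$?
    case "$rc" in
      0) ;;
      3) echo "warning: skipped $device" >&2; skipped=$((skipped + 1)) ;;
      *) return "$rc" ;;
    esac
  done

  # Pure/CI mode: a skipped device is a failure. Dev mode: warn and continue.
  if [ "$skipped" -gt 0 ] && [ "${DEVBOX_PURE_SHELL:-0}" = "1" ]; then
    return 1
  fi
  return 0
}
```
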
The macos-26 runner's Xcode_26.4.0.app is a beta build whose name
doesn't contain "beta", so the previous grep-based selection picked it.
Pin to Xcode 26.2 (stable, default on runner) with pre-installed iOS
26.2 runtime. Also adds IOS_DEVICES=max to PR checks to avoid syncing
min device (iOS 15.4) on macos-26, and fixes stderr capture in
ios_ensure_device_from_definition that silently swallowed runtime
download attempts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…E suites

All verify processes now include a 10s soak period, device log capture to
reports/logs/, and error pattern scanning for native crashes and RN JS
errors. Cleanup processes capture final logs before teardown. The test
framework summary lists diagnostic log files on failure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Reorder ios_resolve_developer_dir() priority so xcode-select -p is
checked before scanning /Applications for the highest-version Xcode.
On CI runners with multiple Xcode versions (including betas), the
version scan picked Xcode 26.4 beta which lacks iOS 26.2 runtime,
breaking RN iOS builds. Respecting xcode-select honors the explicit
pinning set by sudo xcode-select -s in CI workflows.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Android: Remove "Process.*crash" pattern which matched normal system
crash_dump64 process management entries in logcat. The remaining
"FATAL EXCEPTION" and "AndroidRuntime.*FATAL" patterns are specific
to actual app crashes.

iOS: Remove "Fatal error" pattern which matched system daemon messages
(e.g. FamilyControlsAgent XPC "fatal error" logs). The remaining
patterns (Terminating app, SIGABRT, EXC_BAD_ACCESS, EXC_CRASH,
Assertion failure) are specific to actual process crashes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
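
For illustration, the pattern scanning described above could be as simple as the following. The pattern strings are the ones this commit keeps; `scan_log` and the two variable names are hypothetical, and the actual test suites may invoke grep differently.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of crash-pattern scanning over a captured device log.
# The patterns are the ones this commit keeps; scan_log itself is illustrative.
ANDROID_CRASH_PATTERNS='FATAL EXCEPTION|AndroidRuntime.*FATAL'
IOS_CRASH_PATTERNS='Terminating app|SIGABRT|EXC_BAD_ACCESS|EXC_CRASH|Assertion failure'

# Prints matching lines and returns 0 if the log contains a crash pattern;
# returns non-zero when the log is clean.
scan_log() {
  local patterns="$1" log_file="$2"
  grep -E -- "$patterns" "$log_file"
}
```

Because a broad pattern such as `Process.*crash` is no longer in the list, a log line that only mentions `crash_dump64` does not count as an app crash.
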
Nix's mkShell/stdenv sets ~80 build variables (NIX_CFLAGS_COMPILE,
NIX_LDFLAGS, DEVELOPER_DIR pointing to Nix's apple-sdk, clang wrapper
in PATH, etc.) that interfere with xcodebuild targeting iOS Simulator.
The old ios_setup_native_toolchain() only unset 3 variables (LD,
LDFLAGS, CFLAGS), leaving behind the Nix compiler wrapper and include
paths that caused "unknown type name 'uint8_t'" errors in RN iOS CI.

Extends the function to fully strip the Nix stdenv environment: unsets
all NIX_* compiler variables, build tool overrides (AR, AS, NM, etc.),
SDK/deployment variables, and filters Nix clang-wrapper/cctools/xcbuild
from PATH. This is equivalent to devbox shellenv --omit-nix-env but
without depending on that undocumented flag.

Adds test-native-toolchain.sh (23 assertions) that validates the
before/after: loads raw Nix env via devbox shellenv, verifies pollution
is present, applies cleanup, then verifies all Nix stdenv vars are gone
and iOS Simulator compilation succeeds without Nix wrapper warnings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
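
A much-reduced sketch of the cleanup idea from the commit message above. It is not the real `ios_setup_native_toolchain()`: `strip_nix_env` is a hypothetical name, the variable list is abbreviated, unsetting every `NIX_*` variable is a simplification of "all NIX_* compiler variables", and the PATH substrings are assumptions about how the Nix store paths are named.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of stripping Nix stdenv pollution before calling xcodebuild.
# Not the real ios_setup_native_toolchain(); the variable list is abbreviated and
# the PATH substrings are assumptions about Nix store path names.
strip_nix_env() {
  local var entry new_path=""

  # Unset every variable whose name starts with NIX_ (compgen is a bash builtin).
  for var in $(compgen -v NIX_); do
    unset "$var"
  done

  # Unset a few build-tool and SDK overrides named in the commit message.
  unset LD LDFLAGS CFLAGS AR AS NM DEVELOPER_DIR

  # Rebuild PATH without Nix clang-wrapper, cctools, or xcbuild entries.
  local IFS=':'
  for entry in $PATH; do
    case "$entry" in
      /nix/store/*clang-wrapper*|/nix/store/*cctools*|/nix/store/*xcbuild*) ;;
      *) new_path="${new_path:+$new_path:}$entry" ;;
    esac
  done
  PATH="$new_path"
}
```

Filtering PATH entry by entry keeps the system toolchain directories in their original order while dropping only the wrapper directories.
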
Read IOS_XCODE_VERSION from plugins/ios/plugin.json via jq instead of
hardcoding "26.2" in 4 CI workflow steps, so future version bumps only
require a single-line change.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@abueide abueide merged commit 6738d71 into main Mar 3, 2026
10 checks passed