Add unified setup command + fix all CI E2E failures #7
Merged
Add a generic `devbox run setup` command to all plugins that:
- Pre-evaluates and caches SDKs before parallel build phases
- Respects ANDROID_SKIP_SETUP and IOS_SKIP_SETUP flags
- Is idempotent and safe to call multiple times
- Works in any project (Android, iOS, or React Native)
Implementation:
- plugins/android: setup script evaluates Nix flake, ensures ANDROID_SDK_ROOT
- plugins/ios: setup script verifies Xcode/iOS toolchain readiness
- plugins/react-native: setup orchestrates both Android and iOS (respecting skip flags)
This fixes race conditions in CI where builds started before SDK evaluation completed, causing "ANDROID_HOME not set" and "xcrun simctl" failures.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
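The skip-flag and idempotency behaviour described in the setup-command commit above can be sketched as follows. This is a minimal illustration, not the plugin's actual script: `run_setup`, the marker file, and the convention that a skip flag equals `1` are all assumptions made for the example.

```shell
# Hypothetical sketch of an idempotent setup entry point.
# run_setup PLATFORM SKIP_FLAG_VALUE MARKER_FILE
# Prints what it did: "skipped", "cached", or "evaluated".
run_setup() {
  local platform="$1" skip="$2" marker="$3"

  # Honour the skip flag (for example the value of ANDROID_SKIP_SETUP)
  # before doing any work. Treating "1" as "skip" is an assumption.
  if [ "$skip" = "1" ]; then
    echo "skipped"
    return 0
  fi

  # Idempotency: a marker left by an earlier run means nothing to redo.
  if [ -f "$marker" ]; then
    echo "cached"
    return 0
  fi

  # Placeholder for the expensive step (Nix flake evaluation, Xcode checks).
  mkdir -p "$(dirname "$marker")"
  echo "$platform" > "$marker"
  echo "evaluated"
}
```

Calling `run_setup android "${ANDROID_SKIP_SETUP:-0}" .state/android` twice in a row would do the work once and report `cached` the second time.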
Update all test suites to call `devbox run setup` before starting builds:
- examples/android/tests/test-suite.yaml
- examples/ios/tests/test-suite.yaml
- examples/react-native/tests/test-suite-android-e2e.yaml
- examples/react-native/tests/test-suite-ios-e2e.yaml
- examples/react-native/tests/test-suite-web-e2e.yaml
- examples/react-native/tests/test-suite-all-e2e.yaml
Setup phase ensures SDK evaluation completes before parallel build phases start, eliminating race conditions where gradle/xcodebuild would fail with "SDK not found" errors on fresh CI runners.
The setup-sdk process:
- Runs first (no dependencies)
- Other phases depend on it (build-app, sync-avds, sync-simulators, etc.)
- Respects skip flags set in environment section
This fixes the test failures observed in PR #4 (Dependabot actions/checkout v6) where Android and iOS E2E tests failed due to SDK not being ready.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Adds comprehensive validation tests for the new setup command across all three plugins:
- Android: Tests SDK setup, ANDROID_SDK_ROOT, tools in PATH (adb, emulator, avdmanager), skip flag, idempotency
- iOS: Tests Xcode tools (xcrun, simctl), skip flag, idempotency
- React Native: Tests Node.js/npm, Android SDK, iOS tools, all skip flag combinations, idempotency
All tests validate that setup command properly configures environments and respects skip flags.
Test results:
- Android: 8/8 passed
- iOS: 6/6 passed
- React Native: 9/9 passed
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
process-compose processes don't have access to devbox shell scripts, so the setup-sdk phase needs to call the setup.sh script directly using bash.
Changes:
- Android: Changed from "setup" to "bash ${ANDROID_SCRIPTS_DIR}/user/setup.sh"
- iOS: Changed from "setup" to "bash ${IOS_SCRIPTS_DIR}/user/setup.sh"
- React Native: Changed from "setup" to "bash ${REACT_NATIVE_VIRTENV}/scripts/user/setup.sh"
All E2E tests passing:
- Android: 2/2 tests passed
- iOS: 3/3 tests passed
- React Native Android: 5/5 tests passed
- React Native iOS: 6/6 tests passed
- React Native Web: 3/3 tests passed
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
When running inside process-compose (e.g., E2E test suites), the setup-sdk process inherits ANDROID_SDK_ROOT from the devbox init hook. Previously, the setup script would unconditionally re-evaluate the Nix flake, which could fail silently on CI runners due to nix build errors being suppressed with 2>/dev/null.
Now the setup scripts check if the SDK is already configured from the inherited environment before attempting re-evaluation. This fixes the "ANDROID_SDK_ROOT not set" failure seen on CI.
Changes:
- Android setup.sh: Skip init/setup.sh sourcing when ANDROID_SDK_ROOT is already set and the directory exists
- iOS setup.sh: Skip init/setup.sh sourcing when IOS_DEVELOPER_DIR is already set from devbox init_hook
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
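The "already configured" check in the commit above amounts to a guard in front of the expensive evaluation. A minimal sketch, assuming hypothetical helper names `sdk_already_configured` and `ensure_sdk` (the real setup.sh may be structured differently):

```shell
# Succeeds when ANDROID_SDK_ROOT is set and points at an existing directory.
sdk_already_configured() {
  [ -n "${ANDROID_SDK_ROOT:-}" ] && [ -d "${ANDROID_SDK_ROOT}" ]
}

# Prints "reuse" when the inherited environment is usable, "evaluate" otherwise.
ensure_sdk() {
  if sdk_already_configured; then
    echo "reuse"
  else
    # The real script would source init/setup.sh here to evaluate the flake.
    echo "evaluate"
  fi
}
```

Checking that the directory exists, and not only that the variable is set, guards against a stale path inherited from an earlier environment.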
Process-compose processes have isolated environments - env vars set in one process don't propagate to siblings. The setup-sdk phase now sources init/setup.sh directly (instead of calling user/setup.sh as a subprocess), and each build process also sources init/setup.sh to get ANDROID_SDK_ROOT in its own environment. The setup-sdk phase warms the Nix cache so subsequent evaluations are instant.
Also updates validation tests to run in --pure mode (matching CI) and adds comprehensive environment checks:
- Android: 17 tests (SDK dirs, env vars, tools, devices, skip, idempotency)
- iOS: 12 tests (Xcode tools, env vars, devices, skip, idempotency)
- React Native: 16 tests (Node.js, Android SDK, iOS, skip flags, idempotency)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Process-compose processes have isolated environments, so setup-sdk writing to a file allows build-app to read the resolved SDK root without re-evaluating the Nix flake.
Key changes:
- setup-sdk: after init/setup.sh, falls back to detecting SDK from adb in PATH if Nix eval failed, then writes SDK root to .state/sdk_root file
- build-app: reads SDK root from .state/sdk_root if ANDROID_SDK_ROOT is not set in the inherited environment
This fixes CI failures where the Nix flake evaluation fails in process-compose subprocesses but devbox already installed the SDK and put tools in PATH.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
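The state-file handoff between setup-sdk and build-app described above could look roughly like this. The function names are hypothetical and the sketch omits the adb-based fallback; only the `.state/sdk_root` path and the "prefer the inherited variable" rule come from the commit message.

```shell
# write_sdk_root STATE_FILE SDK_ROOT
# Called by the producer (setup-sdk) once the SDK root is known.
write_sdk_root() {
  local state_file="$1" sdk_root="$2"
  mkdir -p "$(dirname "$state_file")"
  printf '%s\n' "$sdk_root" > "$state_file"
}

# read_sdk_root STATE_FILE
# Called by the consumer (build-app). Prefers an inherited ANDROID_SDK_ROOT,
# falls back to the state file, and fails if neither is available.
read_sdk_root() {
  local state_file="$1"
  if [ -n "${ANDROID_SDK_ROOT:-}" ]; then
    printf '%s\n' "$ANDROID_SDK_ROOT"
  elif [ -f "$state_file" ]; then
    cat "$state_file"
  else
    return 1
  fi
}
```

A build step might then run `ANDROID_SDK_ROOT="$(read_sdk_root .state/sdk_root)" || exit 1` and validate the resulting directory before exporting it.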
- Add step status file tracking to setup-sdk in all E2E test suites so setup failures are properly reported in test summaries (prevents false positives where blocked steps appear as 0/0 instead of failures)
- When Nix flake evaluation fails silently, retry with visible stderr output so CI logs show the actual error for debugging
- Add adb-based SDK detection fallback after Nix retry
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace redundant Nix re-build attempt with ANDROID_DEBUG_SETUP=1 env var, which enables debug logging in core.sh's resolve_flake_sdk_root(). This shows the actual Nix build result in CI logs without trying to re-run nix (which may not be in PATH in process-compose subprocesses). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three fixes to the E2E test pipeline:
1. core.sh: Capture Nix build stderr instead of suppressing it with 2>/dev/null. On failure, display the last 15 lines so CI logs show the actual error. Remove CI/GITHUB_ACTIONS gates on progress messages since --pure strips those env vars.
2. Test suites: Replace ERR trap pattern with explicit if/else status file writing in setup-sdk. The ERR trap doesn't fire for commands in || compounds, so status files were never written on failure, causing summary to report 0 failures (false positive).
3. Test suites: Add build-node to cleanup dependencies in RN Android, iOS, and all-e2e suites. Without this, summary could run before independent processes finished writing status files, missing their results.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
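Fix 2 above relies on a documented bash rule: an ERR trap is not run for a failing command that is part of an `&&` or `||` list, except for the last command in the list. Writing the status file from an explicit if/else avoids depending on the trap. A minimal sketch, where `run_step` and the `pass`/`fail` file contents are assumptions for illustration:

```shell
# run_step NAME STATUS_DIR COMMAND...
# Runs COMMAND and always records the outcome in STATUS_DIR/NAME.status.
run_step() {
  local name="$1" status_dir="$2"
  shift 2
  mkdir -p "$status_dir"
  if "$@"; then
    echo "pass" > "$status_dir/$name.status"
  else
    # Written on every failure, regardless of how the caller chains commands.
    echo "fail" > "$status_dir/$name.status"
    return 1
  fi
}
```

Because the failure branch is ordinary control flow, `run_step setup-sdk reports/status some_command || exit 1` still leaves a status file behind for the summary to read.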
Use resolve_runtime_strict in pure/CI mode so simulator start fails immediately when the configured runtime isn't available, instead of silently falling back. Add disk cleanup step and ANDROID_DEVICES=max to Android CI jobs to prevent "No space left on device" failures. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The macos-26 runner's Xcode SDK moved to 26.4, causing xcodebuild to fail with "iOS 26.4 is not installed" when building for generic iOS Simulator destination. Update max.json and regenerate lock files. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add /usr/local/share/powershell, chromium, boost, and /opt/hostedtoolcache to the disk cleanup step. The RN Android emulator was failing with "Not enough space to create userdata partition" because the additional RN dependencies (Node, npm, Gradle) consumed the headroom from the initial cleanup. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Xcode 26.4 beta requires a matching iOS simulator runtime installed before xcodebuild can resolve the generic/platform=iOS Simulator destination. The simulator start path downloads the runtime if needed, so making build-app/build-ios depend on the simulator being healthy ensures the runtime exists before the build starts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The macos-26 runner defaults to Xcode_26.4_beta which has a clang incompatibility with React Native's bundled fmt library. Select the latest stable (non-beta) Xcode before building. Update max.json to 26.3 to match the stable Xcode 26.3 SDK. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
React Native 0.83 targets compileSdk/targetSdk 35. Running on an API 36 emulator caused flaky app crashes on CI. Align the RN Android max device to API 35 to match the app's target SDK. The standalone Android example keeps API 36 since it targets the latest SDK directly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove set -e from deploy steps to capture diagnostics on failure instead of immediately exiting. On deploy failure, capture emulator state, installed packages, manual launch attempt, process check, crash logcat, and full logcat/device-info to reports/logs/. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The deploy script's 5-attempt × 1s window (~7s total) was too short for CI emulators where adb shell pidof takes 1-2s per call. Diagnostics confirmed the app was running correctly but detected just after the window closed. Increase to 15 attempts × 2s (~32s total). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
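The widened detection window above is a bounded polling loop. A generic sketch, assuming a hypothetical helper `wait_until` (the deploy script's actual loop is not shown in this PR text):

```shell
# wait_until ATTEMPTS INTERVAL_SECONDS COMMAND...
# Retries COMMAND until it succeeds or ATTEMPTS is exhausted.
wait_until() {
  local attempts="$1" interval="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the final one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$interval"
    fi
    i=$((i + 1))
  done
  return 1
}
```

With the new settings this would be invoked as something like `wait_until 15 2 adb shell pidof com.example.app`, where the package name is a placeholder. The total wall time exceeds attempts × interval because each probe itself takes time, which is why the commit quotes ~32s.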
PR review fixes:
- Add adb-in-PATH fallback and state file writing to user/setup.sh, deduplicate ~15-line SDK fallback from 3 YAML setup-sdk processes
- Use mktemp for temp file in core.sh instead of TMPDIR/PID pattern
- Add [LEVEL] accessible text prefixes alongside emojis in all setup scripts
- Validate ANDROID_SDK_ROOT after reading state file in build processes
- Replace fragile assert_success shell evals with assert_not_empty, assert_equal, and assert_contains in test-validate-env.sh files
- Standardize success messages across Android/iOS/RN setup scripts
- Add set -e to deploy-app processes with set +e around intentional exit code capture
CI reliability fixes:
- Add expected-steps validation to e2e_report_steps: accepts step names as arguments and reports missing steps as failures, catching false successes where processes are skipped due to broken dependency chains
- Update all 6 E2E YAML summaries with explicit expected step lists
- Add set -e to all setup-sdk, verify-emulator-ready, verify-simulator-ready, verify-app-running, and verify-*-running processes that were missing it
- iOS/RN-iOS setup-sdk processes now use user/setup.sh (with fallback to init/setup.sh) for consistency with Android setup flow
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
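The expected-steps validation listed above treats a step with no status file as a failure instead of silently omitting it. The following is a sketch of that idea only; `report_steps`, the `.status` suffix, and the `pass` marker are assumptions and do not reproduce the real `e2e_report_steps`.

```shell
# report_steps STATUS_DIR STEP...
# Prints one line per expected step and returns non-zero if any step
# failed or never wrote a status file.
report_steps() {
  local status_dir="$1" step failures=0
  shift
  for step in "$@"; do
    if [ ! -f "$status_dir/$step.status" ]; then
      # A skipped process leaves no file; count that as a failure.
      echo "MISSING $step"
      failures=$((failures + 1))
    elif [ "$(cat "$status_dir/$step.status")" = "pass" ]; then
      echo "PASS $step"
    else
      echo "FAIL $step"
      failures=$((failures + 1))
    fi
  done
  [ "$failures" -eq 0 ]
}
```

Passing the full expected list, for example `report_steps reports/status setup-sdk build-app deploy-app`, is what turns a broken dependency chain into a visible failure.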
Add verify-emulator-ready and verify-simulator-ready processes to the Android and iOS standalone test suites, matching the pattern already used by the React Native suites. This prevents pipeline hangs when the emulator/simulator starts but never passes its readiness probe. deploy-app now depends on the verify process (process_completed_successfully) instead of directly on the emulator/simulator (process_healthy), ensuring the dependency chain always resolves within the timeout window and the summary process always runs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pass -buildVersion to xcodebuild -downloadPlatform so it downloads the exact runtime version requested by the device definition (e.g., iOS 26.3) rather than whatever is latest for the current Xcode (e.g., iOS 26.4). This was causing iOS E2E failures on runners with Xcode 26.4, which only ships with the iOS 26.4 beta runtime. The download command would fetch 26.4 again instead of the requested 26.3 stable runtime. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…issing In pure/CI mode (DEVBOX_PURE_SHELL=1), devices sync now exits 1 when devices are skipped due to unavailable runtimes or system images. In normal dev mode, behavior is unchanged (warn and continue). Also fixes Android android_ensure_avd_from_definition to return code 3 for skipped devices (consistent with iOS) and corrects result tracking in sync. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
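The pure-mode strictness described above can be reduced to a final exit-code decision after the sync loop. A minimal sketch, assuming a hypothetical `finish_sync` helper that receives the number of skipped devices (the `[ERROR]`/`[WARN]` wording is illustrative):

```shell
# finish_sync SKIPPED_COUNT
# In pure/CI mode (DEVBOX_PURE_SHELL=1) any skipped device is an error;
# in normal dev mode it is only a warning.
finish_sync() {
  local skipped="$1"
  if [ "$skipped" -gt 0 ]; then
    if [ "${DEVBOX_PURE_SHELL:-}" = "1" ]; then
      echo "[ERROR] $skipped device(s) skipped" >&2
      return 1
    fi
    echo "[WARN] $skipped device(s) skipped" >&2
  fi
  return 0
}
```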
The macos-26 runner's Xcode_26.4.0.app is a beta build whose name doesn't contain "beta", so the previous grep-based selection picked it. Pin to Xcode 26.2 (stable, default on runner) with pre-installed iOS 26.2 runtime. Also adds IOS_DEVICES=max to PR checks to avoid syncing min device (iOS 15.4) on macos-26, and fixes stderr capture in ios_ensure_device_from_definition that silently swallowed runtime download attempts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…E suites All verify processes now include a 10s soak period, device log capture to reports/logs/, and error pattern scanning for native crashes and RN JS errors. Cleanup processes capture final logs before teardown. The test framework summary lists diagnostic log files on failure. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Reorder ios_resolve_developer_dir() priority so xcode-select -p is checked before scanning /Applications for the highest-version Xcode. On CI runners with multiple Xcode versions (including betas), the version scan picked Xcode 26.4 beta which lacks iOS 26.2 runtime, breaking RN iOS builds. Respecting xcode-select honors the explicit pinning set by sudo xcode-select -s in CI workflows. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Android: Remove "Process.*crash" pattern which matched normal system crash_dump64 process management entries in logcat. The remaining "FATAL EXCEPTION" and "AndroidRuntime.*FATAL" patterns are specific to actual app crashes. iOS: Remove "Fatal error" pattern which matched system daemon messages (e.g. FamilyControlsAgent XPC "fatal error" logs). The remaining patterns (Terminating app, SIGABRT, EXC_BAD_ACCESS, EXC_CRASH, Assertion failure) are specific to actual process crashes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
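The effect of narrowing the Android patterns above can be illustrated with a small scanner. The two patterns are the ones the commit keeps; the helper name `log_has_crash` and the sample log lines in the usage note are made up for the example.

```shell
# Patterns retained for Android after this change (extended regex).
ANDROID_CRASH_PATTERNS='FATAL EXCEPTION|AndroidRuntime.*FATAL'

# log_has_crash LOG_FILE
# Succeeds if the log contains a line matching one of the crash patterns.
log_has_crash() {
  grep -Eq "$ANDROID_CRASH_PATTERNS" "$1"
}
```

A log containing only routine entries such as `I/crash_dump64: ...` no longer matches, while a line like `E/AndroidRuntime: FATAL EXCEPTION: main` still does.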
Nix's mkShell/stdenv sets ~80 build variables (NIX_CFLAGS_COMPILE, NIX_LDFLAGS, DEVELOPER_DIR pointing to Nix's apple-sdk, clang wrapper in PATH, etc.) that interfere with xcodebuild targeting iOS Simulator. The old ios_setup_native_toolchain() only unset 3 variables (LD, LDFLAGS, CFLAGS), leaving behind the Nix compiler wrapper and include paths that caused "unknown type name 'uint8_t'" errors in RN iOS CI. Extends the function to fully strip the Nix stdenv environment: unsets all NIX_* compiler variables, build tool overrides (AR, AS, NM, etc.), SDK/deployment variables, and filters Nix clang-wrapper/cctools/xcbuild from PATH. This is equivalent to devbox shellenv --omit-nix-env but without depending on that undocumented flag. Adds test-native-toolchain.sh (23 assertions) that validates the before/after: loads raw Nix env via devbox shellenv, verifies pollution is present, applies cleanup, then verifies all Nix stdenv vars are gone and iOS Simulator compilation succeeds without Nix wrapper warnings. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
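The two mechanisms described above, unsetting Nix compiler variables and filtering wrapper directories out of PATH, could be sketched as below. Both helper names are hypothetical, and the sketch is far smaller than the real `ios_setup_native_toolchain()`; it only shows the technique.

```shell
# unset_by_prefix PREFIX
# Unsets every exported variable whose name starts with PREFIX.
unset_by_prefix() {
  local prefix="$1" name
  for name in $(env | sed -n "s/^\(${prefix}[A-Za-z0-9_]*\)=.*/\1/p"); do
    unset "$name"
  done
}

# filter_path PATH_VALUE
# Prints PATH_VALUE without entries that mention the Nix clang wrapper,
# cctools, or xcbuild.
filter_path() {
  printf '%s' "$1" | tr ':' '\n' \
    | grep -Ev 'clang-wrapper|cctools|xcbuild' \
    | paste -sd ':' -
}
```

A caller might run `unset_by_prefix NIX_CFLAGS`, `unset_by_prefix NIX_LDFLAGS`, then `PATH="$(filter_path "$PATH")"`. Using explicit prefixes, not a blanket `NIX_`, is a deliberate choice in this sketch so that unrelated Nix settings are left alone.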
Read IOS_XCODE_VERSION from plugins/ios/plugin.json via jq instead of hardcoding "26.2" in 4 CI workflow steps, so future version bumps only require a single-line change. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
This PR adds a unified `setup` command to all plugins and fixes all CI E2E failures. All 6 CI jobs are now green.
Changes
1. Unified Setup Command (feat)
Adds `devbox run setup` to all three plugins for SDK pre-evaluation before parallel build phases:
2. E2E Test Suite Improvements (fix)
3. CI Workflow Fixes (fix)
4. Device Configuration Updates (fix)
5. Android Deploy Fix (fix)
6. iOS Runtime Strictness (fix)
7. Validation Tests (test)
Files Modified (26 files, +1045 / -72)
CI Results
All 6 jobs passing (run 22609054162):
Test plan
Generated with Claude Code