Description
Environment
- Platform: ppc64le (POWER9), Fedora 43, 64KB page size
- Box64: v0.4.1 built from main + PPC64LE dynarec (custom branch)
- Game: Pillars of Eternity (GOG, x86_64 Linux), Unity + Mono runtime
- Build: cmake .. -DPPC64LE=1 -DCMAKE_BUILD_TYPE=RelWithDebInfo
Symptoms
With dynarec enabled, Pillars of Eternity exhibits three issues:
1. Main thread hangs at ~110% CPU after shader warmup
The game loads to about line 174 in Player.log then hangs. The main thread is stuck in R state burning CPU. strace shows ~37,000 rt_sigprocmask syscalls/sec.
Root cause we identified: On 64KB page systems, Mono JIT code lives in RWX anonymous regions. protectDB marks these as PROT_NEVERCLEAN because mixed code and data share the same 64KB host page. This sets always_test=1 on dynarec blocks, so every block entry goes through DBGetBlock → hash validation → protectDB → getProtection(). Each getProtection() call uses LOCK_PROT_READ, which does pthread_sigmask(SIG_BLOCK) + mutex_lock + unlock + pthread_sigmask(SIG_SETMASK), i.e. 2 rt_sigprocmask syscalls per call. pthread_sigmask always enters the kernel (rt_sigprocmask is not served from the vDSO), so this cost is paid on every block entry.
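To make the page-granularity point concrete, here is a minimal sketch of the address arithmetic. The helper names are hypothetical and this is not box64 code; it only assumes 4KB x86_64 guest pages and a 64KB host page.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of box64. */

/* How many guest pages are covered by a single host page. */
static inline size_t guest_pages_per_host_page(size_t host_page, size_t guest_page) {
    return host_page / guest_page;
}

/* Round an address down to the start of the host page that contains it.
 * host_page must be a power of two. */
static inline uintptr_t host_page_base(uintptr_t addr, size_t host_page) {
    return addr & ~((uintptr_t)host_page - 1);
}

/* Non-zero when two guest addresses land in the same host page, meaning the
 * host cannot give them different mprotect() permissions. */
static inline int share_host_page(uintptr_t a, uintptr_t b, size_t host_page) {
    return host_page_base(a, host_page) == host_page_base(b, host_page);
}
```

With a 64KB host page, 16 guest pages share one set of host permissions, so a guest code page and a guest data page that are 60KB apart can still end up in the same host page and cannot be write-protected independently.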
Our fix: Use getProtection_fast() / isprotectedDB_fast() (mutex-only, no signal masking) in hot dynarec paths, and skip the protectDB() call for always_test==1 (NEVERCLEAN) blocks since it's a no-op. This reduced rt_sigprocmask from ~37,000/sec to ~86/sec.
Question: Is this approach safe? We believe so because the only signal handlers touching mutex_prot are synchronous (SIGSEGV/SIGBUS/SIGILL/SIGABRT in my_box64signalhandler), and the rbtree lookup in getProtection_fast won't fault. But we'd welcome your review.
2. Stack overflow crashes with no handler
After fixing the hang, the game progresses further but crashes with "Stack overflow in unmanaged" during UICapitularLabel.Awake() → recursive font processing. The fault address is on a Mono-allocated ~8MB thread stack (not the main [stack]), hitting the guard page.
Root cause: box64's signal handlers lack SA_ONSTACK, and no native sigaltstack is set up per thread. When the stack overflows, the kernel can't deliver SIGSEGV to the handler because there's no stack space left to build the signal frame, so the process is killed without the handler ever running.
Our fix: Added setupNativeAltStack() — allocates 64KB per-thread via real sigaltstack(), called from init_signal_helper(), pthread_routine(), thread_set_emu(), and thread_set_et(). Added SA_ONSTACK to all 4 handler registrations.
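For reference, the general pattern looks roughly like the sketch below. It is not the actual setupNativeAltStack() code: setup_alt_stack(), install_onstack_handler() and demo_handler() are hypothetical names, the stack is obtained with malloc() purely for brevity, and only the 64KB size is taken from the description above.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ALT_STACK_SIZE (64 * 1024) /* 64KB per thread, as described above */

/* Base of the calling thread's alternate stack (hypothetical bookkeeping). */
static _Thread_local void *alt_stack_base = NULL;
static volatile sig_atomic_t handler_ran_on_alt_stack = 0;

/* Register an alternate signal stack for the calling thread. Returns 0 on success. */
int setup_alt_stack(void) {
    stack_t ss;
    memset(&ss, 0, sizeof(ss));
    ss.ss_sp = malloc(ALT_STACK_SIZE);
    if (ss.ss_sp == NULL) return -1;
    ss.ss_size = ALT_STACK_SIZE;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0) {
        free(ss.ss_sp);
        return -1;
    }
    alt_stack_base = ss.ss_sp;
    return 0;
}

/* Install a handler that the kernel runs on the alternate stack (SA_ONSTACK). */
int install_onstack_handler(int signo, void (*handler)(int, siginfo_t *, void *)) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO | SA_ONSTACK;
    return sigaction(signo, &sa, NULL);
}

/* Demo handler: records whether its own stack frame lies inside the alternate stack. */
static void demo_handler(int signo, siginfo_t *info, void *ucontext) {
    (void)signo; (void)info; (void)ucontext;
    char marker;
    uintptr_t here = (uintptr_t)&marker;
    uintptr_t lo = (uintptr_t)alt_stack_base;
    handler_ran_on_alt_stack = (here >= lo && here < lo + ALT_STACK_SIZE);
}
```

Calling setup_alt_stack() once per thread and then install_onstack_handler(SIGUSR1, demo_handler) followed by raise(SIGUSR1) lets the handler confirm it is running on the alternate stack. The alternate stack is a per-thread setting, which is why every thread entry point needs to register its own.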
Question: Does box64 intentionally not use SA_ONSTACK / native sigaltstack? We see that box64 wraps the emulated program's sigaltstack calls (my_sigaltstack in signals.c), but the real kernel alt stack for box64's own handlers was never set up. Is there a reason for this, or is it simply something that wasn't needed before?
3. Remaining stack overflow in Mono thread (seeking guidance)
Even with proper signal delivery, the stack overflow itself persists. The recursive call chain is:
UICapitularLabel.Awake() → RefreshLabels() → UIDynamicFontSize.Guarantee()
→ SetFont() → UILabel.set_font() → ProcessText() → CalculatePrintedSize()
→ Font.RequestCharactersInTexture()
This overflows the ~8MB Mono-managed thread stack. Interestingly:
- Without dynarec (interpreter mode): too slow to reach this code in reasonable time
- With BOX64_SHOWSEGV=1: the logging overhead prevents the crash (game reaches 725+ lines), suggesting timing/ordering affects recursion depth
- The crash is non-deterministic: sometimes it passes the crash point, sometimes not
Question: Have you seen Unity/Mono games exhibit similar stack overflow issues with dynarec on other platforms (aarch64, rv64)? Could the dynarec be subtly miscompiling something in this call chain, causing extra recursion iterations? Or is this likely a genuine game bug that manifests differently under emulation?
Any guidance on where to look would be greatly appreciated. Happy to provide more details, logs, or remote access to the machine for debugging.