Conversation
Signed-off-by: Giles Cope <gilescope@gmail.com>
Hey! I've only had a very high-level look, since there's quite a lot here. If you'd like to help out with porting to Mac, I'd suggest discussing it on the Wild Zulip. There's an existing thread, "Mach-o support". Martin is leading the porting effort for Mac. I think it might be getting to the point where there's scope for multiple people to work concurrently, but definitely check with him first to avoid duplicated effort and/or hard-to-resolve merge conflicts. I'm not sure what Martin's thoughts are on integration tests, but that's definitely something we'll need soon and is perhaps more likely to parallelise with other Mac work. I see you did some work in this area, which is great. It looks like you opted for a completely separate integration test runner. I think we'd want to extend our existing integration test runner to support running Mac tests instead.
…ns, partial linking
Enable remaining Mach-O integration tests (41/41 pass, 0 ignored):
- Emit __stubs and __got section headers; fix PLT/GOT resolution for undefined symbols from dylibs
- Populate LC_SYMTAB for executables (backtraces, absolute symbols)
- Set weak_import bit in chained fixups for N_WEAK_REF symbols
- Parse .tbd files via text-stub-library to detect truly undefined symbols vs dylib imports
- Emit __gcc_except_tab section header; scan __compact_unwind for personality GOT entries; add LSDA descriptors to __unwind_info
- Implement partial linking (-r) producing MH_OBJECT output with merged sections, remapped relocations and symbol tables
- Remove unused eh_frame generic parsing abstraction
Signed-off-by: Giles Cope <gilescope@gmail.com>
- Give __thread_vars its own output section (PREINIT_ARRAY) so all TLV descriptors are contiguous across objects
- Fix TLS offset computation: fallback for TBSS symbols in extern resolution path, force 8-byte alignment for S_THREAD_LOCAL_VARIABLES
- Filter __thread_vars key/offset fields from chained fixup chain and zero corrupted key fields
- Disambiguate __TEXT,__const vs __DATA,__const via section_name() rename to __text_const, each mapped to a dedicated output section ID
- Separate __literal4/8/16 from __cstring (different output section IDs) to prevent layout pipeline part overlap
- Separate __DATA,__const from __data (CSTRING output section ID)
- Fix write_exe_symtab n_sect: compute from section address ranges instead of hardcoding section 1
- Add validation invariants: TLV key=0, no duplicate TLV offsets, section file-offset overlap, section data write overlap, symbol n_value within n_sect range, chained fixup chain integrity
- Dynamic section header generation for all output sections
- New tests: rust-tls, rust-build-script-sim, rust-format-strings, rust-large-data, tls-alignment
Signed-off-by: Giles Cope <gilescope@gmail.com>
The write_pageoff12 function extracted the access size shift from bits 31:30 of the instruction, which works for integer LDR/STR but is wrong for SIMD/FP loads. For ldr q0 (128-bit), bits 31:30 are 00 (interpreted as byte access = no shift) but the actual scale is 16 bytes (shift=4). This caused the page offset to be 16x too large, making ldr q0 read from the wrong address. This was the root cause of the println! crash: the BufWriter init constant (a 16-byte value loaded via ldr q0) was fetched from a wrong offset, producing garbage in the stdout mutex/RefCell state. Found by adapting LLVM lld's arm64-relocs.s test which exercises exactly this relocation pattern. Signed-off-by: Giles Cope <gilescope@gmail.com>
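A minimal sketch of the scale selection described in that commit, assuming the same approach LLVM lld uses for ARM64 page-offset relocations. The function names `pageoff12_shift` and `encode_pageoff12` are hypothetical and are not taken from Wild's source; the bit masks come from the A64 "load/store register (unsigned immediate)" encoding.

```rust
/// Returns the shift applied to a page offset before it is stored in the
/// imm12 field. Hypothetical helper, not Wild's actual code.
pub fn pageoff12_shift(insn: u32) -> u32 {
    // Only load/store (unsigned immediate) instructions scale their
    // immediate. ADD with an immediate stores the offset unscaled.
    if (insn & 0x3b00_0000) != 0x3900_0000 {
        return 0;
    }
    let size = insn >> 30; // bits 31:30
    // For SIMD/FP accesses (bit 26 set), bit 23 extends the size field.
    // size == 0 with both bits set means a 128-bit access (scale 16).
    if size == 0 && (insn & 0x0480_0000) == 0x0480_0000 {
        return 4;
    }
    size
}

/// Writes the low 12 bits of `target`, scaled for the instruction, into the
/// imm12 field (bits 21:10). Assumes `target` is suitably aligned.
pub fn encode_pageoff12(insn: u32, target: u64) -> u32 {
    let shift = pageoff12_shift(insn);
    let imm12 = ((target & 0xfff) >> shift) as u32;
    (insn & !(0xfff_u32 << 10)) | (imm12 << 10)
}
```

With `ldr q0, [x0]` (encoding `0x3DC00000`) and a page offset of `0x40`, the sketch stores 4 in the immediate field. Reading only bits 31:30 would store `0x40`, which the CPU then multiplies by 16.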
The LINKEDIT segment size estimation was too small for binaries with
many symbols, causing the output file to be truncated. This made
codesign fail ("main executable failed strict validation"), leaving
binaries unsigned, which macOS kills with SIGKILL on execution.
Increased the estimate to account for chained fixups data and longer
average symbol names.
Signed-off-by: Giles Cope <gilescope@gmail.com>
Import 76 ARM64-relevant assembly tests from LLVM lld's MachO test suite. These provide comprehensive coverage of relocations, stubs, thunks, TLS, compact unwind, dead stripping, and ObjC features. Test runner assembles each .s file with clang, links with Wild, and validates the output with codesign. 21 tests pass, 2 ignored. Signed-off-by: Giles Cope <gilescope@gmail.com>
Import 134 shell tests from bluewhalesystems/sold (archived). Tests compile C/C++ via clang, link with Wild via --ld-path=./ld64, and verify output. 36 pass, 98 ignored (categorized by reason). Signed-off-by: Giles Cope <gilescope@gmail.com>
- Fix write_dylib_symtab hardcoding n_sect=1 for all symbols. Extract parse_section_ranges() for proper section lookup.
- Error when an explicitly-specified entry symbol (-e) is not found, instead of silently succeeding.
- Propagate entry_symbol_address errors in Mach-O writer.
- Un-ignore 5 sold tests: pagezero-size3, entry, objc, eh-frame, bind-at-load.
Sold suite: 41 pass (was 36).
Signed-off-by: Giles Cope <gilescope@gmail.com>
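The n_sect fix can be illustrated with a small lookup over section address ranges. This is only a sketch: `SectionRange` and `n_sect_for_address` are hypothetical names, and the real parse_section_ranges() may return a different shape. The only format facts relied on are that Mach-O section ordinals are 1-based and that 0 means NO_SECT.

```rust
/// Address range of one output section, in load-command order.
/// Hypothetical type for illustration.
pub struct SectionRange {
    pub start: u64,
    pub size: u64,
}

/// Returns the 1-based section ordinal whose range contains `addr`,
/// or 0 (NO_SECT) when no section contains it.
pub fn n_sect_for_address(sections: &[SectionRange], addr: u64) -> u8 {
    sections
        .iter()
        .position(|s| addr >= s.start && addr - s.start < s.size)
        // Ordinals above 255 cannot be represented in nlist, so fall back to 0.
        .and_then(|i| u8::try_from(i + 1).ok())
        .unwrap_or(0)
}
```

A symbol at an address inside the second section now gets ordinal 2 rather than the hardcoded 1.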
Implement -filelist <path>[,<dir>] to read input file paths from a file, with optional directory prefix. Also accept (silently ignore) many more ld64 flags: -add_ast_path, -macos_version_min, -dependency_info, -map, -stack_size, -sectcreate, -F, -U, -hidden-l, -no_fixup_chains, -x, -S, -w, -Z, and others. Sold suite: 44 pass (was 41). Signed-off-by: Giles Cope <gilescope@gmail.com>
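As a rough sketch of the `-filelist <path>[,<dir>]` handling described above: the argument is split at the first comma, and each non-empty line of the file is joined onto the optional directory. Both function names are hypothetical, and the sketch takes the file contents as a string rather than reading from disk so that it stays self-contained.

```rust
use std::path::{Path, PathBuf};

/// Splits "path[,dir]" into the list path and the optional directory prefix.
pub fn split_filelist_arg(arg: &str) -> (&str, Option<&str>) {
    match arg.split_once(',') {
        Some((path, dir)) => (path, Some(dir)),
        None => (arg, None),
    }
}

/// Turns the contents of a file list into input paths, one per line.
/// Empty lines are skipped (an assumption, not confirmed ld64 behaviour).
pub fn expand_filelist(contents: &str, dir: Option<&str>) -> Vec<PathBuf> {
    contents
        .lines()
        .filter(|line| !line.is_empty())
        .map(|line| match dir {
            // Path::join keeps `line` unchanged when it is already absolute.
            Some(d) => Path::new(d).join(line),
            None => PathBuf::from(line),
        })
        .collect()
}
```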
Instead of silently ignoring -force_load, add the specified archive as an input with whole_archive=true. Archive members are correctly marked as non-optional in the loading pipeline, though the full whole-archive layout path needs further work for Mach-O. Signed-off-by: Giles Cope <gilescope@gmail.com>
Root cause: whole-archive archive members had their symbols resolved but data sections were GC'd because resolve_section didn't know about the whole_archive modifier. Unreferenced sections stayed as SectionSlot::Unloaded and were skipped during layout activation.
Three fixes:
- Thread whole_archive through ResolvedCommon so resolve_section can set must_load=true for all sections from whole-archive members
- Add load_all_defined_symbols in activate() to set DIRECT on all defined symbols from whole-archive members (skipping discarded sections like __compact_unwind)
- Fix exe symtab to emit N_EXT for external symbols instead of marking everything as local
Sold suite: 46 pass (was 44). Unlocks all-load and force-load.
Signed-off-by: Giles Cope <gilescope@gmail.com>
Wild now handles -help/--help by printing usage info and exiting. This unblocks the sold-macho/response-file test which invokes ./ld64 @response_file with -help. Sold suite: 47 pass (was 46). Signed-off-by: Giles Cope <gilescope@gmail.com>
Two bugs prevented common symbols (tentative definitions) from working:
1. is_undefined() returned true for common symbols because they have N_UNDF type. Fixed by excluding symbols where is_common() is true. This caused common symbols to be skipped during symbol registration, making them appear undefined.
2. as_common() used raw n_desc as shift count for alignment, causing shift overflow panics. Fixed by extracting GET_COMM_ALIGN bits (bits 8-11 of n_desc) per the Mach-O spec.
Sold suite: 49 pass (was 47). Unlocks common and common-alignment.
Signed-off-by: Giles Cope <gilescope@gmail.com>
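Both fixes can be sketched against the raw nlist fields. The constants below are the standard values from Apple's `mach-o/nlist.h`; the free-function signatures are an assumption made for illustration and do not mirror Wild's symbol trait.

```rust
const N_STAB: u8 = 0xe0; // any of these bits set: debugging (stab) entry
const N_TYPE: u8 = 0x0e; // mask for the type bits
const N_EXT: u8 = 0x01; // external symbol
const N_UNDF: u8 = 0x00; // undefined type

/// A common symbol is an external N_UNDF entry whose n_value (its size)
/// is non-zero.
pub fn is_common(n_type: u8, n_value: u64) -> bool {
    (n_type & N_STAB) == 0
        && (n_type & N_TYPE) == N_UNDF
        && (n_type & N_EXT) != 0
        && n_value != 0
}

/// Undefined means N_UNDF type but not a common symbol.
pub fn is_undefined(n_type: u8, n_value: u64) -> bool {
    (n_type & N_STAB) == 0 && (n_type & N_TYPE) == N_UNDF && !is_common(n_type, n_value)
}

/// GET_COMM_ALIGN: bits 8-11 of n_desc hold log2 of the alignment.
pub fn common_alignment(n_desc: u16) -> u64 {
    1u64 << ((n_desc >> 8) & 0x0f)
}
```

Masking to four bits keeps the shift count at 15 or below, which is what avoids the overflow panic described in the second fix.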
The TLV descriptor offset field for thread_bss symbols was computed as align_up(tdata_size, 8) + bss_offset, producing offset 8 for a 4-byte tdata section. The correct formula is tdata_size + bss_offset (no alignment padding), matching what the system linker produces. Also remove debug file logging from TLS relocation paths and add TLS-block-relative offset computation for $tlv$init symbols in the fallback relocation path. Signed-off-by: Giles Cope <gilescope@gmail.com>
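The arithmetic in that fix, written out as a small sketch. The two function names are hypothetical and exist only to contrast the padded formula with the packed one that the commit message says matches the system linker.

```rust
/// The formula described as incorrect: pad tdata to 8 bytes first.
pub fn tbss_offset_padded(tdata_size: u64, bss_offset: u64) -> u64 {
    ((tdata_size + 7) & !7) + bss_offset
}

/// The formula described as correct: no alignment padding.
pub fn tbss_offset_packed(tdata_size: u64, bss_offset: u64) -> u64 {
    tdata_size + bss_offset
}
```

For a 4-byte tdata section and the first thread_bss symbol (bss_offset 0), the padded formula gives 8 and the packed formula gives 4.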
Handle $ld$ linker directives in .tbd symbol exports:
- $ld$add$os<ver>$: add symbol when target OS >= version
- $ld$hide$os<ver>$: remove symbol when target OS >= version
- $ld$install_name$os<ver>$: change install name for target OS
- $ld$previous$: use previous install name for OS version range
Defer .tbd positional input processing to end of arg parsing so -platform_version is known when processing directives.
Parse exports trie from .dylib files found via -l (was only adding install name without symbol data). Add dylib_symbols() trait method for generic undefined-symbol suppression.
Enables sold-macho/tbd, tbd-reexport, unkown-tbd-target, tbd-install-name, tbd-previous (78 passing, was 73). tbd-add and tbd-hide need stricter undefined checking.
Signed-off-by: Giles Cope <gilescope@gmail.com>
Remove unused has_incomplete_dylib_symbols field — the nuanced check requires tracing reexported-libraries in system .tbd files, which is a larger change. Keep the existing extra_dylibs guard for now. The $ld$add and $ld$hide directives work correctly (symbols are added/ removed from dylib_symbols based on target OS), but the undefined error is suppressed when extra_dylibs is non-empty (which is always true when cc passes -lSystem). Fixing this requires complete re-export tracing. Signed-off-by: Giles Cope <gilescope@gmail.com>
The collect_tbd_symbols function was missing weak_symbols from re_exports sections (e.g. libc++abi symbols re-exported by libc++). This caused operator delete and other C++ runtime symbols to be missing from dylib_symbols, triggering false undefined symbol errors. Also: parse framework dylib exports trie in link_framework (was only adding install name). Remove has_unparsed_dylibs guard from undefined symbol check — dylib_symbols is now comprehensive enough. Enables sold-macho/tbd-add and tbd-hide (79 passing, was 78). Regresses reexport-l (needs LC_REEXPORT_DYLIB chain tracing). Signed-off-by: Giles Cope <gilescope@gmail.com>
When parsing a .dylib input, follow LC_REEXPORT_DYLIB load commands recursively to collect exported symbols from re-exported libraries. This handles multi-level re-export chains (e.g. libbaz → libbar → libfoo). Search for re-exported dylibs by filename in: the absolute install name path, the parent directory of the importing dylib, and -L paths. Depth-limited to 8 levels to prevent infinite loops. Enables sold-macho/reexport-l test (80 passing, was 79). Signed-off-by: Giles Cope <gilescope@gmail.com>
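A minimal sketch of depth-limited re-export collection. It assumes the libraries have already been located and parsed into an in-memory map, so the search-path logic is left out. `Dylib`, `collect_exports` and `MAX_DEPTH` are hypothetical names, and reading "8 levels" as depths 0 through 7 is also an assumption.

```rust
use std::collections::{HashMap, HashSet};

/// Parsed view of one dylib: its own exports and the libraries named by
/// its LC_REEXPORT_DYLIB commands. Hypothetical type.
pub struct Dylib {
    pub exports: Vec<String>,
    pub reexports: Vec<String>,
}

pub const MAX_DEPTH: usize = 8;

/// Adds the exports of `name` and of everything it re-exports to `out`.
/// The depth limit guarantees termination even if libraries form a cycle.
pub fn collect_exports(
    libs: &HashMap<String, Dylib>,
    name: &str,
    depth: usize,
    out: &mut HashSet<String>,
) {
    if depth >= MAX_DEPTH {
        return;
    }
    let Some(lib) = libs.get(name) else {
        return; // Re-exported library not found: nothing to add.
    };
    out.extend(lib.exports.iter().cloned());
    for reexport in &lib.reexports {
        collect_exports(libs, reexport, depth + 1, out);
    }
}
```

Calling `collect_exports(&libs, "libbaz", 0, &mut out)` on a libbaz → libbar → libfoo chain leaves the exports of all three libraries in `out`.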
…ezero_size validation
- -reexport_library: parse dylib/tbd at full path, set Reexport kind
- -weak_library: parse dylib/tbd at full path, set Weak kind
- -Z: suppress default syslibroot search paths; error on -lSystem
- -U <symbol>: add to dylib_symbols to suppress undefined error
- -pagezero_size: error when used with -dylib/-bundle
Enables sold-macho/pagezero-size2 and Z tests (82 passing, was 80).
Signed-off-by: Giles Cope <gilescope@gmail.com>
…ymbols Change undefined symbol format to 'undefined symbol: <file>: <sym>' and duplicate symbol format to 'duplicate symbol: <file1>: <file2>: <sym>' matching the sold linker's output format that tests expect. Enables sold-macho/missing-error and duplicate-error tests (84 passing, was 82). Signed-off-by: Giles Cope <gilescope@gmail.com>
Canonicalize OSO stab paths to absolute before emitting. When -oso_prefix is set, strip the prefix from the path. Handles both absolute prefixes and '.' (current directory). Also wire up -oso_prefix arg parsing (was in ignored list). Enables sold-macho/oso-prefix test (85 passing, was 84). Signed-off-by: Giles Cope <gilescope@gmail.com>
Emit N_AST stab symbols for each -add_ast_path flag, enabling dsymutil to find AST files for LLDB. Also moved -add_ast_path from ignored to parsed args list. Enables sold-macho/add-ast-path test (86 passing, was 85). Signed-off-by: Giles Cope <gilescope@gmail.com>
When -dead_strip_dylibs is set, only emit LC_LOAD_DYLIB for dylibs that have at least one symbol referenced by the executable. Track symbol-to-dylib provenance via dylib_symbol_provenance HashMap. -needed_framework and -needed-l mark dylibs as immune to stripping. Enables sold-macho/dead-strip-dylibs, dead-strip-dylibs2, needed-framework tests (89 passing, was 86). Signed-off-by: Giles Cope <gilescope@gmail.com>
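The stripping rule can be sketched as a filter over dylib indices. The commit only states that a `dylib_symbol_provenance` HashMap exists, so the key and value types here are assumptions, as are the function name and the use of indices to identify dylibs.

```rust
use std::collections::{HashMap, HashSet};

/// Returns the indices of the dylibs that should still get an LC_LOAD_DYLIB.
/// `provenance` maps a symbol name to the index of the dylib that provides it;
/// `needed` holds dylibs marked immune via -needed_framework / -needed-l.
pub fn dylibs_to_emit(
    dylib_count: usize,
    provenance: &HashMap<String, usize>,
    referenced: &[&str],
    needed: &HashSet<usize>,
) -> Vec<usize> {
    let mut used: HashSet<usize> = needed.clone();
    for sym in referenced {
        if let Some(&idx) = provenance.get(*sym) {
            used.insert(idx);
        }
    }
    // Keep the original command-line order of the surviving dylibs.
    (0..dylib_count).filter(|i| used.contains(i)).collect()
}
```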
Handle -v flag by setting VersionMode::Verbose, which prints the Wild version string. Adapt the version test to accept 'wild' in addition to '[ms]old'. Enables sold-macho/version test (90 passing, was 89). Signed-off-by: Giles Cope <gilescope@gmail.com>
Write a link map file showing object files, section addresses/sizes, and symbol addresses when -map <path> is specified. Format matches ld64/sold output with fixed-width columns. Enables sold-macho/map test (91 passing, was 90). Signed-off-by: Giles Cope <gilescope@gmail.com>
The system linker (ld64) requires the symbol table in LINKEDIT to be 8-byte aligned when consuming dylibs. Add alignment padding before the dylib symtab offset. Only applies to dylib output (exe output keeps the existing packed layout for strip(1) compatibility). Enables sold-macho/reexport-library test (92 passing, was 91). Signed-off-by: Giles Cope <gilescope@gmail.com>
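A sketch of the padding rule, assuming a running LINKEDIT cursor and a flag for dylib output. `align_up` and `symtab_offset` are hypothetical names.

```rust
/// Rounds `offset` up to the next multiple of `align` (a power of two).
pub fn align_up(offset: u64, align: u64) -> u64 {
    debug_assert!(align.is_power_of_two());
    (offset + align - 1) & !(align - 1)
}

/// Chooses where the symbol table starts inside LINKEDIT. Dylib output is
/// padded to 8 bytes; executable output keeps the packed layout.
pub fn symtab_offset(linkedit_cursor: u64, is_dylib: bool) -> u64 {
    if is_dylib {
        align_up(linkedit_cursor, 8)
    } else {
        linkedit_cursor
    }
}
```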
…able_dylib Set MH_DEAD_STRIPPABLE_DYLIB flag in dylib output when -mark_dead_strippable_dylib is passed. When consuming a dylib with this flag set and no symbols are referenced, auto-strip it from the load commands (even without -dead_strip_dylibs). Also: align dylib symtab to 8 bytes in LINKEDIT (fixes ld64 'mis-aligned LINKEDIT' rejection of wild-built dylibs). Enables sold-macho/mark-dead-strippable-dylib and reexport-library (93 passing, was 91). Signed-off-by: Giles Cope <gilescope@gmail.com>
When -hidden-l<name> is used, scan the archive's object members for global defined symbols and add them to unexported_symbols. The export trie filtering then hides these symbols from the dylib output. Also remove leftover debug trace for libc++. Enables sold-macho/hidden-l test (94 passing, was 93). Signed-off-by: Giles Cope <gilescope@gmail.com>
Check linked dylibs for MH_APP_EXTENSION_SAFE flag and .tbd not_app_extension_safe flag. Warn when -application_extension is set and linked dylib isn't safe. -w suppresses all warnings. Deferred warning emission to handle arg ordering (dylib before flag). Enables sold-macho/application-extension, application-extension2, w (97 passing, was 94). Signed-off-by: Giles Cope <gilescope@gmail.com>
…tension
- -U <symbol>: emit as N_UNDF|N_EXT in output symbol table for dynamic runtime lookup. Also suppress undefined error.
- -w: suppress linker warnings.
- -application_extension: warn when linked dylib lacks MH_APP_EXTENSION_SAFE or has not_app_extension_safe TBD flag. Deferred to end of parse for arg ordering.
Enables sold-macho/U, application-extension, application-extension2, w tests (98 passing, was 94).
Signed-off-by: Giles Cope <gilescope@gmail.com>
Add addend field to BindFixup. When any bind has a non-zero addend, switch to DYLD_CHAINED_IMPORT_ADDEND (format 2) which stores 32-bit addends per import. The fixup-chains-addend test still fails because the same import is used for both addend=0 and addend=4096 references to the same symbol — needs per-site import deduplication. No regressions (98 passing). Signed-off-by: Giles Cope <gilescope@gmail.com>
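One way to picture the format switch is a function that picks the imports format from the addends in use. The numeric values are the DYLD_CHAINED_IMPORT* constants from dyld's fixup-chains header. Using the i32 range as the threshold for the 64-bit format is an assumption based on the 32-bit addend field, and the function name is hypothetical.

```rust
pub const DYLD_CHAINED_IMPORT: u32 = 1;
pub const DYLD_CHAINED_IMPORT_ADDEND: u32 = 2;
pub const DYLD_CHAINED_IMPORT_ADDEND64: u32 = 3;

/// Picks the smallest imports format that can represent every addend.
/// Other limits of the compact formats (ordinals, name offsets) are ignored.
pub fn imports_format(addends: &[i64]) -> u32 {
    if addends.iter().all(|&a| a == 0) {
        DYLD_CHAINED_IMPORT
    } else if addends.iter().all(|&a| i32::try_from(a).is_ok()) {
        DYLD_CHAINED_IMPORT_ADDEND
    } else {
        DYLD_CHAINED_IMPORT_ADDEND64
    }
}
```

This sketch does not address the deduplication problem mentioned above, where one symbol is referenced with two different addends.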
Implement -dependency_info (binary dep info file with version, input, and output records). Adapt lc-build-version test to accept tool ID 3 (standard ld) alongside sold's 54321. Enables sold-macho/dependency-info, lc-build-version (100 passing, was 98). Signed-off-by: Giles Cope <gilescope@gmail.com>
…d more
Major features:
- Mach-O LTO via libLTO.dylib (new macho-lto feature, macho_lto.rs)
- -sectcreate writer integration (data in TEXT segment gap)
- Indirect symbol table (DYSYMTAB + nlist for imported symbols)
- DYLD_CHAINED_IMPORT_ADDEND64 (format 3) for large addends
- Chained fixup implicit addend from data content
- __init_offsets section (S_INIT_FUNC_OFFSETS, -fixup_chains implies -init_offsets)
- -flat_namespace (MH_TWOLEVEL removal, GOT for local globals, interposable symbols)
- -undefined warning/suppress/dynamic_lookup treatment
- @rpath/@loader_path/@executable_path expansion in re-export resolution
- Source-level debugging (SO/BNSYM/FUN/ENSYM stab synthesis for dsymutil)
- section$/segment$ start/stop synthetic symbols
- --print-dependencies output
- -search_dylibs_first (pre-scan for global flags)
- -umbrella (LC_SUB_FRAMEWORK)
- -export_dynamic (EXPORT_DYNAMIC flag in symtab)
- -u force-undefined symbols kept alive as GC roots
- Literal section merge infrastructure (S_4BYTE/S_8BYTE/S_16BYTE_LITERALS)
- TLS mismatch detection (object-to-object case)
- Unaligned rebase fixup error detection
Bug fixes:
- LC_LOAD_WEAK_DYLIB command value (0x18000018 → 0x80000018)
- Bitcode wrapper detection (0x0B17C0DE magic)
- Validator: skip stab entries in section range check
- Test fixes: libc++ exception message wording, asm symbol prefixes
Signed-off-by: Giles Cope <gilescope@gmail.com>
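The LC_LOAD_WEAK_DYLIB fix is easier to see when the constant is built from its parts, as `mach-o/loader.h` does. The constant values below are the standard ones from that header; `DylibKind` and `load_command_for` are hypothetical names used only for this sketch.

```rust
/// Flag bit that marks load commands dyld must understand.
pub const LC_REQ_DYLD: u32 = 0x8000_0000;
pub const LC_LOAD_DYLIB: u32 = 0xc;
pub const LC_LOAD_WEAK_DYLIB: u32 = 0x18 | LC_REQ_DYLD;
pub const LC_REEXPORT_DYLIB: u32 = 0x1f | LC_REQ_DYLD;

/// How a dylib was requested on the command line. Hypothetical enum.
pub enum DylibKind {
    Normal,
    Weak,
    Reexport,
}

/// Maps a dylib kind to the load command used to reference it.
pub fn load_command_for(kind: DylibKind) -> u32 {
    match kind {
        DylibKind::Normal => LC_LOAD_DYLIB,
        DylibKind::Weak => LC_LOAD_WEAK_DYLIB,
        DylibKind::Reexport => LC_REEXPORT_DYLIB,
    }
}
```

Deriving the value from `LC_REQ_DYLD` rather than typing the full literal makes a slip like 0x18000018 harder to introduce.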
…rite_objc_stub
- Suppress undefined errors for _objc_msgSend$<selector> symbols
- Redirect ObjC stubs to _objc_msgSend (selector loading not yet implemented)
- Parse -order_file into symbol_order map (reordering not yet implemented)
- Add write_objc_stub function for future 32-byte stub synthesis
- Add --print-dependencies and start-stop-symbol from previous commit
Signed-off-by: Giles Cope <gilescope@gmail.com>
… stub redirect cleanup
- Tested subsections-via-symbols: approach works (produces correct output) but O(n²) symbol scanning is too slow for large objects. Needs cached subsection offset map per object. Reverted and re-skipped.
- ObjC _objc_msgSend$ stubs redirect to _objc_msgSend (no selector loading yet)
- Cleaned up stub allocation (back to 12-byte stubs, 8-byte GOT)
Signed-off-by: Giles Cope <gilescope@gmail.com>
__llvm_addrsig sections have relocations at unaligned offsets but are metadata-only (not part of runtime data layout). Skip the alignment error for these sections instead of bailing out. Fixes lld-macho/arm64-thunks test. Signed-off-by: Giles Cope <gilescope@gmail.com>
Hi @gilescope. What are your plans with this work? I don't mean from a technical perspective, more from a merging perspective.
I'm trying to see how much work is required to get basic Mac support going.
Currently C and Rust hello worlds work. Obviously we'll need to improve the code and make it fit better with the existing Wild codebase, but to start with I'm just trying to get bigger Rust programs to compile and run without segfaults. So far it has only been tested on a Mac with an M3 Max.
(I note that Wasm isn't supported either, so I might focus on that afterwards.)
If anyone wants to join in, PRs are welcome from you or your AI friend.