(Re-)Implement C Backend (this time supports callbacks & typed enums) #261
yuvatia wants to merge 4 commits into ralfbiedert:master
50f33ff to ef15cea
Hi, thanks for the PR, and thanks a lot for looking after the C backend. Yes, AI assisted PRs are generally fine if there is sufficient 'human in the loop'. This PR has a few issues though: Most importantly, the
The overall structure should be vaguely organized like in the C# backend, in particular the The passes should also be configurable (this might include options about naming conventions, etc.), and setting these config options should again follow how the C# pipeline builders do it. Most importantly, new backends must use tera templates like the C# backend does it. The whole Testing should also vaguely follow C#. There should be multiple emission / insta snapshot tests for various reference project parts. Neither tests nor codegen should include C++ constructs (for the
Thank you for your feedback, will address those comments and update the PR in a few days.
ef15cea to ba54839
int len = MultiByteToWideChar(CP_UTF8, 0, path, -1, NULL, 0);
if (len <= 0) return -1;
wchar_t* wpath = (wchar_t*)_alloca(len * sizeof(wchar_t));
MultiByteToWideChar(CP_UTF8, 0, path, -1, wpath, len);
HMODULE lib = LoadLibraryW(wpath);
I'm debating adding a {{load_fn}}W or something that just takes a PCWSTR to avoid these conversions, wdyt?
I updated the PR to address all your comments, and also addressed a few gaps such as missing benchmarks (runnable with
52dc19f to e753a61
Sorry, but this still has most issues of the C backend flow and structure being vastly different from the C# one. I recommend you (as a human) look at the C# backend and get a feeling for it, there's too much detail to write everything down item-by-item. While some internals have been agent created (e.g., overload emission and proc macros), the overall backend flow is 'hand-designed' and should be easy to follow (if it isn't that would be an issue). One bigger thing though I missed the first time, the reliance on
C backend architecture (crates/backend_c/)
-------------------------------------------

The C backend follows the same model+pass+template pattern established by the C# backend. The implementation is split into:

- `lang/` - C language model defining the constructs the backend can emit: `CType` with `CTypeKind` variants (Primitive, Struct, SimpleEnum, TaggedUnion, FnPointer, Callback, Slice/SliceMut, Vec, Utf8String, Option, Result, Opaque, Pointer, Array), `CFunction`, and `CModel` which holds the complete mapped model.
- `pass/model.rs` - Single model pass that transforms the Rust inventory into the C language model. Maps all type kinds, resolves type names (sanitizing Rust names like `Option<Vec2>` into valid C identifiers like `OPTIONVEC2`), performs topological sort for dependency-ordered emission, and builds the function list.
- `pass/output.rs` - Output pass that renders the C model through Tera templates into the final header. Each type kind dispatches to its own template; the final assembly concatenates header guard, type definitions (in topo order), function declarations, dispatch table, platform-specific loaders, and footer.
- `pipeline/` - `CLibrary` with builder pattern (`loader_name`, `ifndef`, `filename`), wires model and output passes together. Templates are packed into a tar archive at build time via `build.rs` and embedded in the binary.
- `templates/` - 14 Tera `.h` template files organized by construct: types (struct, simple_enum, tagged_union, callback, fn_pointer, slice, vec, utf8string, option, result, opaque), function declarations, dispatch table, and loaders (dynamic_win32, dynamic_posix, static).

The dynamic loader uses `MultiByteToWideChar`/`LoadLibraryW` on Windows and `dlopen`/`dlsym` with `memcpy` on POSIX. A `/* internal helpers */` comment separator is emitted before any `interoptopus_*` builtin functions in the dispatch table, loaders, and function declarations.
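To make the TaggedUnion and Callback constructs above more concrete, here is a rough sketch of what such header output could look like and how a C caller might use it. All names (`VEC2`, `SHAPE`, `SHAPE_VISITOR`, `shape_area`, `invoke_visitor`), the variant set, and the destructor signature are hypothetical and not taken from the actual templates.

```c
#include <stddef.h>

/* Hypothetical plain struct, standing in for a mapped Rust struct. */
typedef struct { float x; float y; } VEC2;

/* Tag enum plus struct with an anonymous union (C11), the shape
 * described for Rust enums that carry data. Variants are made up. */
typedef enum { SHAPE_TAG_CIRCLE = 0, SHAPE_TAG_RECT = 1 } SHAPE_TAG;

typedef struct {
    SHAPE_TAG tag;
    union {
        float circle; /* radius */
        VEC2 rect;    /* width and height */
    };
} SHAPE;

/* Callback: fn pointer typedef with a trailing context parameter,
 * wrapped in a struct with three fields. The destructor signature
 * is an assumption for this sketch. */
typedef float (*SHAPE_VISITOR_fn)(SHAPE shape, const void *context);

typedef struct {
    SHAPE_VISITOR_fn callback;
    const void *data;
    void (*destructor)(const void *data);
} SHAPE_VISITOR;

/* Example callback body: switch on the tag, read the matching field.
 * The context, if present, is a float scale factor. */
float shape_area(SHAPE shape, const void *context)
{
    const float *scale = context;
    float k = scale != NULL ? *scale : 1.0f;
    switch (shape.tag) {
    case SHAPE_TAG_CIRCLE: return k * 3.14159f * shape.circle * shape.circle;
    case SHAPE_TAG_RECT:   return k * shape.rect.x * shape.rect.y;
    }
    return 0.0f;
}

/* What the receiving side does: call the fn pointer with its data. */
float invoke_visitor(SHAPE_VISITOR visitor, SHAPE shape)
{
    return visitor.callback(shape, visitor.data);
}
```

A caller would fill `tag` first and then only the union member that belongs to that tag; reading any other member is not meaningful.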
reference_project_c (crates/backend_c/reference_project/)
---------------------------------------------------------

Comprehensive FFI example exercising all supported patterns: structs, tagged union enums, slices, vecs, options, strings, callbacks (with Shape, Slice, Option, and Vec parameters), and a KitchenSink struct that combines all major FFI types. Mirrors the role of `crates/reference_project/` for the C# backend.

Test infrastructure
-------------------

- Insta snapshot tests (6 focused + 1 full): basic struct, simple enum, tagged union, callbacks, pattern types (Slice/Option), and the full reference project header. Each test builds a small inventory, runs the pipeline, and snapshots the generated `.h` output.
- C++20 gtest suite (10 tests) under reference_project/ loads the Rust cdylib and validates all FFI types at runtime. Uses CMake + FetchContent for gtest. The generated header is gitignored.
- C++20 Google Benchmark suite (10 benchmarks) under benches/ measures FFI call overhead for all major patterns: tagged unions, slices, mutable slices, vec lifecycle, option returns, and callbacks with various argument types. Run with `just bench-c`.
- examples/hello_world/: binding generation tests for both C# and C, plus a simple C++20 gtest (2 tests). Renamed `bindings/` to `bindings_csharp/` for clarity alongside `bindings_c/`.
- CMake: copies the Rust cdylib next to the test exe on all platforms (with RPATH=$ORIGIN on Unix), uses `--config Debug` / `-C Debug` for MSVC multi-config generators.
- Justfile: `just test-c` runs both C++ test suites via `_test_c` helper. `just test-dotnet` runs hello_world Xunit test. Both wired into `just ci`. `just bench-c` builds in release and runs the Google Benchmark suite via ctest.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds NamingStyle enum (ScreamingSnake, UpperCamel, Snake, Raw) and NamingConfig with per-category control over type, enum variant, function, parameter, and constant naming. An optional prefix is prepended to types and functions (e.g. mylib_color). Loader templates now use a separate `symbol` field for dlsym/ GetProcAddress lookups so prefixed names don't break dynamic loading. ScreamingSnake properly splits at word boundaries (OptionVec2 → OPTION_VEC2). Callback _fn and tag _TAG suffixes are cased to match their respective naming styles. Tagged union field names are computed in the model pass rather than reverse-engineered in the output pass. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
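The word-boundary rule for ScreamingSnake can be illustrated with a small standalone routine. This is only a sketch of one plausible rule (split where a lowercase letter is followed by an uppercase one, and at the end of an acronym run); the function name `to_screaming_snake` is hypothetical and the backend's actual rule may differ, for example around digits.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not the backend's code: convert an UpperCamel
 * identifier into SCREAMING_SNAKE, inserting '_' at word boundaries.
 * Returns 0 on success, -1 if `out` (capacity `cap`) is too small. */
int to_screaming_snake(const char *in, char *out, size_t cap)
{
    size_t n = 0;
    if (cap == 0) return -1;
    for (size_t i = 0; in[i] != '\0'; ++i) {
        unsigned char c = (unsigned char)in[i];
        unsigned char prev = i > 0 ? (unsigned char)in[i - 1] : 0;
        unsigned char next = (unsigned char)in[i + 1];
        /* Boundary: "nV" in OptionVec2, or the 'E' in FFIError. */
        int boundary = i > 0 && isupper(c) &&
                       (islower(prev) || (isupper(prev) && islower(next)));
        if (boundary) {
            if (n + 1 >= cap) return -1;
            out[n++] = '_';
        }
        if (n + 1 >= cap) return -1; /* keep room for the terminator */
        out[n++] = (char)toupper(c);
    }
    out[n] = '\0';
    return 0;
}
```

With this rule `OptionVec2` becomes `OPTION_VEC2`, matching the example in the commit message, where a plain uppercase conversion would give `OPTIONVEC2`.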
e753a61 to cae05d8
- Handle Rust's `_` parameter name in `c_param_name()`: `sanitize("_")`
returns an empty string which produces invalid C like `Type 0` instead
of `Type param`. Fall back to "param" when the styled name is empty.
- Add `#include <malloc.h>` to the Win32 dynamic loader template so
`_alloca` is declared when compiling with MSVC in C++ mode.
The prefix (e.g. "mylib_") is meant to avoid symbol collisions in the global namespace, but inside the dispatch struct the fields are already scoped — the prefix just adds noise and makes the API painful to use (api.mylib_foo vs api.foo). Keep the prefixed name for top-level function declarations and the original symbol for dlsym/GetProcAddress. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
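A minimal POSIX sketch of this split between field name and lookup symbol, assuming a dispatch struct whose fields keep the unprefixed names while the loader looks up the original exported symbols. `demo_api_t`, `demo_load`, and `load_symbol` are hypothetical names, and libc's `abs`/`labs` stand in for exported Rust functions so the example has something to resolve.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <string.h>

/* Dispatch table: fields use the plain names (api.abs, not
 * api.mylib_abs), since the struct already scopes them. */
typedef struct {
    int (*abs)(int value);
    long (*labs)(long value);
} demo_api_t;

/* Look up `symbol` and copy the address into a function pointer
 * field with memcpy, so no void*-to-function-pointer cast appears. */
static int load_symbol(void *lib, const char *symbol,
                       void *field, size_t field_size)
{
    void *address = dlsym(lib, symbol);
    if (address == NULL || field_size != sizeof address) return -1;
    memcpy(field, &address, sizeof address);
    return 0;
}

/* Returns 0 on success, -1 if the library or any symbol is missing.
 * On success the handle is intentionally kept open, because the
 * table's function pointers point into the loaded library. */
int demo_load(const char *path, demo_api_t *api)
{
    void *lib = dlopen(path, RTLD_NOW);
    if (lib == NULL) return -1;
    if (load_symbol(lib, "abs", &api->abs, sizeof api->abs) != 0 ||
        load_symbol(lib, "labs", &api->labs, sizeof api->labs) != 0) {
        dlclose(lib);
        return -1;
    }
    return 0;
}
```

Passing `NULL` as the path makes `dlopen` return a handle for the running program and its already loaded libraries, which is a convenient way to try the loader without building a shared library first. On older glibc versions the program may need to be linked with `-ldl`.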
At the company I work for, we use `interoptopus` to generate C# bindings for a couple of projects, and we have been very satisfied with the results. Recently, however, we wanted to generate bindings for a project written in C++ and found the existing (now deprecated) C backend to be extremely lacking. In particular, the things that were very apparent were:

These gaps motivated me (to motivate Claude) to enhance the backend. Initially this was based on the (now `_old`) pre-existing C backend, but due to recent structural changes in the repo it was changed to be self-contained.

My intention is to keep this backend relatively maintained (as we plan on using it for production software).
See commit message for detailed overview of the backend.
Some notes
C header generator (crates/backend_c/)
New C header generator built against the 0.16 type system. Organized as three modules: lib.rs (public Generator API with builder pattern), codegen.rs (all emission routines), and topo.rs (deterministic topological sort of types by name before dependency traversal, ensuring stable output across runs).
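The idea of sorting by name before the dependency traversal can be sketched in a few lines. This is an illustration of the general technique (name sort followed by a depth-first post-order walk), not the code in topo.rs; `TypeNode`, `topo_sort`, and the fixed `MAX_DEPS` limit are assumptions made for the example.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_DEPS 4

/* Hypothetical node: a type name plus the names of the types it
 * uses by value (unused slots are NULL). */
typedef struct {
    const char *name;
    const char *deps[MAX_DEPS];
    int visited;
} TypeNode;

static int by_name(const void *a, const void *b)
{
    return strcmp(((const TypeNode *)a)->name, ((const TypeNode *)b)->name);
}

static TypeNode *find(TypeNode *nodes, size_t count, const char *name)
{
    for (size_t i = 0; i < count; ++i)
        if (strcmp(nodes[i].name, name) == 0) return &nodes[i];
    return NULL;
}

/* Post-order visit: emit every dependency before the type itself. */
static void visit(TypeNode *nodes, size_t count, TypeNode *node,
                  const char **out, size_t *len)
{
    if (node->visited) return;
    node->visited = 1;
    for (size_t i = 0; i < MAX_DEPS && node->deps[i] != NULL; ++i) {
        TypeNode *dep = find(nodes, count, node->deps[i]);
        if (dep != NULL) visit(nodes, count, dep, out, len);
    }
    out[(*len)++] = node->name;
}

/* Sorts `nodes` by name in place, then walks them in that order, so
 * the result does not depend on the order the caller supplied.
 * `out` must have room for `count` entries. Returns the number of
 * names written. */
size_t topo_sort(TypeNode *nodes, size_t count, const char **out)
{
    size_t len = 0;
    qsort(nodes, count, sizeof *nodes, by_name);
    for (size_t i = 0; i < count; ++i) nodes[i].visited = 0;
    for (size_t i = 0; i < count; ++i) visit(nodes, count, &nodes[i], out, &len);
    return len;
}
```

Because the roots are visited in name order, two runs over the same set of types yield the same header layout even if the inventory hands the types over in a different order.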
The generator produces a single .h file containing:
- Type definitions: structs, simple enums, and tagged unions. Rust enums with data payloads emit a C11 tag enum + struct with anonymous union (e.g. `SHAPE_TAG` enum + `SHAPE` struct with `tag` and `union` fields). This also covers `ffi::Option<T>` and `ffi::Result<T, E>`, which are internally represented as enums with typed variants.
- Callbacks: each `callback!` type emits a `NAME_fn` function pointer typedef (with an implicit trailing `const void*` context parameter) and a `NAME` struct containing three fields: `callback` (the fn pointer), `data` (context pointer), and `destructor` (optional cleanup). This matches the Rust `#[repr(C)]` layout so callbacks round-trip correctly across the FFI boundary.
- Dispatch table: a `{name}_api_t` struct with one function pointer field per exported function, where `{name}` is caller-specified (e.g. `reference_project_c_api_t`). A `/* internal helpers */` comment separator is emitted before any `interoptopus_*` builtin functions (from `builtins_string!`/`builtins_vec!`) to visually distinguish user APIs from internal helpers.
- Dynamic loader: a cross-platform `{name}_load(path, api)` function. The POSIX implementation uses `dlopen`/`dlsym` with `memcpy` to transfer `void*` into function pointer fields (avoiding the ISO C prohibition on direct `void*`-to-function-pointer casts, which triggers warnings under `-Wpedantic`). The Windows implementation converts the UTF-8 path to UTF-16 via `MultiByteToWideChar(CP_UTF8, ...)` and loads with `LoadLibraryW`/`GetProcAddress`. Both validate every symbol and return -1 on failure.
- Static loader: a `{name}_load_static(api)` function (behind an `#ifdef` guard) that assigns the forward-declared symbols directly, for use when statically linking the Rust library.

reference_project_c (crates/backend_c/reference_project/)
Comprehensive FFI example exercising all supported patterns: structs, tagged union enums, slices, vecs, options, strings, callbacks (with Shape, Slice, Option, and Vec parameters), and a KitchenSink struct that combines all major FFI types (u64, bool, f64, ffi::String, tagged enum, ffi::Option, ffi::Slice, ffi::Option<ffi::String>). Exports a public `inventory()` function consumed by the backend's integration test. Mirrors the role of `crates/reference_project/` for the C# backend.

Test infrastructure

crates/backend_c/tests/: Rust integration test generates the C header from reference_project_c's inventory. C++20 gtest suite (10 tests) under reference_project/ loads the Rust cdylib and validates all FFI types with proper assertions. Uses CMake + FetchContent for gtest. The generated header is gitignored (regenerated by `cargo test`).

examples/hello_world/: added a second binding generation test for C (alongside the existing C# one), plus a simple C++20 gtest (2 tests) that validates the Vec2/my_function roundtrip. Renamed `bindings/` to `bindings_csharp/` for clarity alongside `bindings_c/`. Also fixed the existing C# Xunit test to only reference types in the inventory and updated the target framework to net10.0. Generated headers are gitignored.

Justfile: `just test-c` runs both C++ test suites via a shared `_test_c` helper (cmake configure/build/ctest with RUST_LIB_DIR). `just test-dotnet` now also runs the hello_world Xunit test. Both are wired into `just ci`.