Add fast-path dispatch for address-taken functions in ModuleClone #326

Open
Pavel-Durov wants to merge 9 commits into ykjit:main from Pavel-Durov:opt-indirect-fn-notrace-fastpath
@Pavel-Durov

Previously, functions whose address was taken were excluded from cloning entirely to preserve function pointer identity. This meant all indirect calls always went through the unoptimised body, even at runtime when no tracing was active.

Address-taken functions are now cloned into `__yk_opt_<name>` as normal, but the original symbol is kept. A tracing-state dispatch check is prepended to the original body: when `__yk_thread_tracing_state == 0` (the common case), the function calls the optimised clone; otherwise the original unoptimised body executes.

This preserves function pointer identity while reducing overhead for indirect calls to a single load + branch in the non-tracing path.
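
To make the dispatch shape concrete, here is a rough source-level sketch in C of what an address-taken function might look like after the transformation. This is only an illustration, not the code ModuleClone actually emits: `tracing_state` stands in for `__yk_thread_tracing_state`, `opt_add` stands in for the `__yk_opt_<name>` clone, and `add`, `call_indirect`, and the two counters are hypothetical names added so the chosen path can be observed.

```c
#include <stdint.h>

/* Hypothetical stand-in for __yk_thread_tracing_state; 0 means "not tracing". */
static _Thread_local uint8_t tracing_state = 0;

/* Illustration-only counters recording which body ran. */
static int opt_calls = 0;
static int unopt_calls = 0;

/* Stand-in for the optimised clone (__yk_opt_<name> in this PR). */
static int opt_add(int a, int b) {
    opt_calls++;
    return a + b;
}

/* The original symbol is kept, so its address is unchanged and
 * existing function pointers still compare equal to it. */
static int add(int a, int b) {
    /* Prepended dispatch check: one load and one branch. */
    if (tracing_state == 0) {
        return opt_add(a, b); /* fast path: not tracing */
    }
    /* Original unoptimised body, used while tracing. */
    unopt_calls++;
    return a + b;
}

typedef int (*binop_fn)(int, int);

/* An indirect call site: the callee is only known via a pointer. */
static int call_indirect(binop_fn f, int a, int b) {
    return f(a, b);
}
```

Calling `call_indirect(add, 2, 3)` with `tracing_state == 0` goes through `opt_add`; setting `tracing_state` to a non-zero value makes the same pointer run the original body instead.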

Functions that are still excluded from this optimisation: those containing the control point, and functions annotated `yk_outline` or `yk_indirect_inline`.

@Pavel-Durov
Author

Will post the haste diff when it's complete; it takes ages for ykmicropython.

@Pavel-Durov
Author

Ready for another review :)

@Pavel-Durov
Author

Sorry missed one thing!
