In the last post, I demonstrated how to use JSPI in a simple program written for the wasm32-unknown-unknown platform. Of course, wasm32-unknown-unknown does not support the libc standard library, so it does not work for most real-world code. Emscripten provides the wasm32-unknown-emscripten target, which does support libc.
The main additional problem that comes up when integrating with Emscripten is JavaScript frames. Recall from the last post that to use JavaScript Promise Integration, we wrap an async JavaScript function that we want to use from C in new WebAssembly.Suspending() and we wrap the export in WebAssembly.promising():
const imports = {
  env: {
    awaitFakeFetch: new WebAssembly.Suspending(awaitFakeFetch),
    // ... other imports
  },
};
export const { instance } = await WebAssembly.instantiate(
  readFileSync("myModule.wasm"),
  imports,
);
export const fakePyFunc = WebAssembly.promising(instance.exports.fakePyFunc);
If there are any JavaScript calls on the stack between the call into the WebAssembly function fakePyFunc and the call out to the JavaScript function awaitFakeFetch, the call to awaitFakeFetch() will fail. Before JSPI existed, features in the C runtime could be implemented using JavaScript frames and it would not make any observable difference to the behavior of the program. Now it does.
In Pyodide, JavaScript frames are used for:
- Handling function pointer casts
- C++ exceptions and Rust panics
- libffi/ctypes
- Resolution of late-binding symbols from dynamic libraries
To make JSPI work with these features, there are two possible approaches: either we can replace the JavaScript frame with equivalent WebAssembly functionality, or we can make the JavaScript frame cooperate with the stack switching. In practice, both of these approaches are quite challenging to implement. In this post, we’ll focus on the first approach: replacing the JavaScript call that handles function pointer casts with WebAssembly.
Function pointer casts
I wrote a blog post about this problem in 2021. Python extensions often declare functions with one signature, cast them to a different signature with more arguments, and call them with extra arguments. This is undefined behavior according to the C standard, but works in practice with most major compilers and architectures.
However, WebAssembly is stricter: the call_indirect instruction, which calls a function pointer, traps if the function pointer does not have the expected type. For example, suppose we have code like this:
typedef int (*F)(int, int);

int handler0(int x, int y) {
  logString("Handler 0 fetch x");
  int resx = awaitFakeFetch(x);
  logString(" ... fetch y");
  int resy = awaitFakeFetch(y);
  return resx * resy;
}

int handler1(int x) {
  logString("Handler 1 fetch x");
  int res = awaitFakeFetch(x);
  return res * res;
}

F handlers[] = {(F)handler0, (F)handler1 };

WASM_EXPORT("fakePyFunc")
void fakePyFunc(int func_index, int x, int y) {
  int res = handlers[func_index](x, y);
  logString("Got result:");
  logInt(res);
}
When we call fakePyFunc() with func_index set to 0 it works, but when it is set to 1, we get a crash:
RuntimeError: null function or function signature mismatch
See the complete example here.
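To see the trap in isolation, here is a minimal sketch in JavaScript. It is not the post's example: the module bytes are hand-assembled for illustration, and the names mismatchModuleBytes and callThroughTable are hypothetical. The module holds a two-argument and a one-argument function in a table, and its export always uses call_indirect with the two-argument type.

```javascript
// Hand-assembled illustration module (an assumption, not the post's code):
//   type 0: (i32, i32) -> i32    type 1: (i32) -> i32
//   type 2: (i32, i32, i32) -> i32
//   func 0 (type 0): x * y       func 1 (type 1): x * x
//   func 2, exported as "call": call_indirect (type 0) on table[idx]
const mismatchModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic and version
  // type section: three function types
  0x01, 0x13, 0x03,
  0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,
  0x60, 0x01, 0x7f, 0x01, 0x7f,
  0x60, 0x03, 0x7f, 0x7f, 0x7f, 0x01, 0x7f,
  // function section: func 0, 1, 2 use types 0, 1, 2
  0x03, 0x04, 0x03, 0x00, 0x01, 0x02,
  // table section: one funcref table with two slots
  0x04, 0x04, 0x01, 0x70, 0x00, 0x02,
  // export section: "call" is func 2
  0x07, 0x08, 0x01, 0x04, 0x63, 0x61, 0x6c, 0x6c, 0x00, 0x02,
  // element section: table[0] = func 0, table[1] = func 1
  0x09, 0x08, 0x01, 0x00, 0x41, 0x00, 0x0b, 0x02, 0x00, 0x01,
  // code section
  0x0a, 0x1d, 0x03,
  0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6c, 0x0b, // func 0: x * y
  0x07, 0x00, 0x20, 0x00, 0x20, 0x00, 0x6c, 0x0b, // func 1: x * x
  // func 2: push x, y, idx, then call_indirect with type 0 on table 0
  0x0b, 0x00, 0x20, 0x01, 0x20, 0x02, 0x20, 0x00, 0x11, 0x00, 0x00, 0x0b,
]);

const mismatchInstance = new WebAssembly.Instance(
  new WebAssembly.Module(mismatchModuleBytes),
);
const callThroughTable = mismatchInstance.exports.call;

// Slot 0 really has the two-argument type, so the call succeeds.
const matchedResult = callThroughTable(0, 3, 4);
console.log(matchedResult); // 12

// Slot 1 has the one-argument type, so call_indirect traps.
let trapError = null;
try {
  callThroughTable(1, 3, 4);
} catch (e) {
  trapError = e;
}
console.log(trapError instanceof WebAssembly.RuntimeError); // true
```

The exact error message depends on the engine, so the sketch only checks that the error is a WebAssembly.RuntimeError.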
Calls from JavaScript into WebAssembly ignore extra arguments, so we previously dealt with this situation by adding a JavaScript trampoline:
const imports = {
  env: {
    ...
    fpcastTrampoline(fptr, x, y) {
      return functionPointerTable.get(fptr)(x, y);
    },
    ...
  },
};
and then in C invoking the handler as follows:
fpcastTrampoline(handlers[func_index], x, y);
This fixes the function pointer casts, but when we attempt to switch stacks, it raises:
SuspendError: trying to suspend JS frames
See the complete example here.
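The reason the JavaScript trampoline works at all is that extra arguments are dropped at the JavaScript-to-WebAssembly boundary. Here is a small sketch of that behavior, assuming a hand-assembled module that exports a one-argument function; the names square and jsTrampoline are hypothetical and chosen only for this illustration.

```javascript
// Hand-assembled illustration module (an assumption, not the post's code):
// it exports "square" with type (i32) -> i32, which returns x * x.
const squareModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic and version
  0x01, 0x06, 0x01, 0x60, 0x01, 0x7f, 0x01, 0x7f, // type 0: (i32) -> i32
  0x03, 0x02, 0x01, 0x00, // one function of type 0
  // export section: "square" is func 0
  0x07, 0x0a, 0x01, 0x06, 0x73, 0x71, 0x75, 0x61, 0x72, 0x65, 0x00, 0x00,
  // code section: local.get 0, local.get 0, i32.mul
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x00, 0x6c, 0x0b,
]);

const { square } = new WebAssembly.Instance(
  new WebAssembly.Module(squareModuleBytes),
).exports;

// A trampoline in the style of fpcastTrampoline: it always passes two
// arguments, whatever the real signature of the target is.
function jsTrampoline(target, x, y) {
  return target(x, y);
}

// The second argument is silently dropped at the JS-to-wasm boundary.
console.log(jsTrampoline(square, 7, 99)); // 49
```

The cost of this convenience is the JavaScript frame that jsTrampoline leaves on the stack, which is what prevents suspending.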
Handling function pointer casts without JavaScript
Luckily, wasm-gc has added a ref.test instruction which can be used to check the signature of a function before calling it. We can use this to cast the function to the right signature before making the call. Here is an example wat module:
(module
  (type $zero_args (func (result i32)))
  (type $one_args (func (param i32) (result i32)))
  (type $two_args (func (param i32 i32) (result i32)))
  (import "env" "__indirect_function_table" (table $functable 1 funcref))
  (func (export "countArgs") (param $funcptr i32) (result i32)
    (local $funcref funcref)
    ;; Convert function pointer to function reference using table.get, store the
    ;; result into $funcref local
    local.get $funcptr
    table.get $functable
    local.tee $funcref
    ;; Two args?
    ref.test (ref $two_args)
    if
      i32.const 2
      return
    end
    local.get $funcref
    ;; One arg?
    ref.test (ref $one_args)
    if
      i32.const 1
      return
    end
    local.get $funcref
    ;; Zero arg?
    ref.test (ref $zero_args)
    if
      i32.const 0
      return
    end
    ;; It takes more than two args, or uses a non-i32 type.
    i32.const -1
    return
  )
)
Now we can use this as follows:
typedef int (*F)(int, int);
typedef int (*F1)(int);
typedef int (*F0)(void);

WASM_IMPORT("countArgs")
int countArgs(F func);

int callHandler(F f, int x, int y) {
  int nargs = countArgs(f);
  switch (nargs) {
    case 2:
      logString("Two arguments");
      return f(x, y);
    case 1:
      logString("One argument");
      return ((F1)f)(x);
    case 0:
      logString("Zero arguments");
      return ((F0)f)();
    default:
      logString("Bad handler");
      return -1;
  }
}

WASM_EXPORT("fakePyFunc")
void fakePyFunc(int func_index, int x, int y) {
  int res = callHandler(handlers[func_index], x, y);
  logString("Got result:");
  logInt(res);
}
We can use wasm-as (WebAssembly Assemble) to convert the wat to wasm and use wasm-merge to merge countArgs.wasm with the compiled C code. wasm-as and wasm-merge are part of binaryen. See the full build script here.
(As an aside, we use wasm-merge instead of the normal approach of assembling to an object file and then linking the object file with wasm-ld because we can only use an instruction in an object file if LLVM knows how to generate relocations for the instruction. I have now taught LLVM how to generate relocations for ref.test, but it didn’t know how to do this when I started.)
This approach to handling function pointer casts uses the ref.test instruction, which was added as part of the garbage collection feature (wasm-gc). Safari has only supported wasm-gc since December of 2024. Furthermore, on iOS using wasm-gc causes crashes. However, we do know that every runtime that supports JSPI also supports wasm-gc, so we can use wasm-gc when it is available and otherwise fall back to using a JavaScript trampoline.
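One way to decide between the two paths is to ask the engine whether it accepts a module containing ref.test. The following is a rough sketch of that idea, not the code the Python interpreter actually uses. The probe module is hand-assembled for illustration, and the names refTestProbeBytes, supportsRefTest, and trampolineStrategy are hypothetical.

```javascript
// Hand-assembled probe module (an assumption made for this sketch):
//   type 0: () -> ()
//   type 1: (funcref) -> i32
//   func 0 (type 1): local.get 0, ref.test (ref 0)
// Engines without wasm-gc should reject the 0xfb 0x14 (ref.test) opcode.
const refTestProbeBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic and version
  0x01, 0x09, 0x02, 0x60, 0x00, 0x00, 0x60, 0x01, 0x70, 0x01, 0x7f, // types
  0x03, 0x02, 0x01, 0x01, // one function of type 1
  // code section: local.get 0, ref.test (ref 0), end
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0xfb, 0x14, 0x00, 0x0b,
]);

// WebAssembly.validate returns false instead of throwing when the bytes
// are not a valid module for this engine.
const supportsRefTest = WebAssembly.validate(refTestProbeBytes);

// Prefer the pure WebAssembly path; otherwise keep the JavaScript trampoline.
const trampolineStrategy = supportsRefTest ? "wasm-ref-test" : "js-trampoline";
console.log(trampolineStrategy);
```

A validation check like this would not detect the iOS crashes mentioned above, so a real implementation would need a separate way to rule out that platform.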
You can see the current code used to handle function pointer casts in the Python interpreter here.
A new clang intrinsic
I added a new clang intrinsic called __builtin_wasm_test_function_pointer_signature so that we can test the runtime signature of a function pointer from C. Using this, we can get rid of the .wat file and our callHandler() function looks like:
int callHandler(F f, int x, int y) {
  if (__builtin_wasm_test_function_pointer_signature(f)) {
    logString("Two arguments");
    return f(x, y);
  }
  if (__builtin_wasm_test_function_pointer_signature((F1)f)) {
    logString("One argument");
    return ((F1)f)(x);
  }
  if (__builtin_wasm_test_function_pointer_signature((F0)f)) {
    logString("Zero arguments");
    return ((F0)f)();
  }
  logString("Bad handler");
  return -1;
}
Here’s a pull request that updates Python to use the new intrinsic.
Conclusion
In the future, __builtin_wasm_test_function_pointer_signature() should make dealing with function pointer casting cleaner and faster. It will also allow handling function pointer casts in wasi and wasm32-unknown-unknown whenever the runtime supports wasm-gc, whereas previously it was not possible. It currently requires a very recent version of Emscripten or a development version of clang, together with a quite recent web browser or node version, so it will still be a while before it can be adopted everywhere.
All of this fixes just one of the many sources of JavaScript frames that cause trouble when using JSPI. We solved this problem by replacing functionality implemented in JavaScript with WebAssembly. Next time I’ll discuss the approach of making a JavaScript frame cooperate with JSPI and the difficulties that occur with that approach.