Pyodide v0.26.0 is out, including Python version 3.12, many minor improvements to the foreign function interface and to the build system, and major improvements to stack switching. It also includes a tremendous amount of foward-looking work that is not yet visible to users.
Pyodide at PyCon
We were at PyCon again this year at another WebAssembly summit. The authors of this post met each other in person for the first time. Gyeongjae’s travel was generously paid for by our sponsors, so thank you if you have ever donated to Pyodide. Pyodide creator Mike Droettboom was there too. Both the WebAssembly summit and the rest of PyCon went fantastically well. It was wonderful hearing from the community about all of the things that are working well for them, and also very helpful to learn where we have room to improve.
Left to right: Pyodide maintainer Gyeongjae Choi, Pyodide creator Mike Droettboom, Pyodide maintainer Hood Chatham
Pyodide 0.26 Release
pygame-ce Support
We are thrilled to announce that Pyodide now supports Pygame Community Edition (pygame-ce). While we began supporting SDL-based graphics libraries in Pyodide 0.23.0, this release enhances the stability and compatibility of SDL libraries within Pyodide and adds support for pygame-ce.
We would like to thank the pygame-ce maintainers, especially to Paul m. p. peny (@pmp-p), for helping us integrate pygame-ce into Pyodide. They have been working hard on this project for many years.
This demo page provides examples of pygame-ce running in the browser using Pyodide.
cibuildwheel support
Many scientific Python packages now include building and testing wheels for Pyodide as a part of their continuous integration. We are in the process of adding Pyodide targets to cibuildwheel which automates all of the setup for this.
The pull request implementing Pyodide support in cibuildwheel has been open for 14 months and currently has 3 contributors, 6 reviewers, and over 200 comments, and has been through about five rounds of review. We are going to merge it any day now, which will be an incredible milestone for Pyodide. We could hardly have imagined getting to where we are now when we took over the Pyodide project from Mozilla in April of 2021.
Edit: It was merged on the morning of May 28!
Improvements to Stack Switching Support
Stack switching allows us to make async calls from a synchronous Python function. For example:
py.runPythonAsync(`
from js import fetch
from pyodide.ffi import run_sync
def sync_fetch(url):
resp = run_sync(fetch(url))
text = run_sync(resp.text())
return text
print(sync_fetch("https://example.com"))
`);
We fixed quite a few memory leaks involving stack switching, improved the interaction with other language features, and polished the API. We believe that stack switching is ready for real-world use at this point.
There are a lot of usability improvements left to build. For instance, we could
use it to implement loop.run_until_complete()
, to make pdb work, to make http
clients work, and many more things.
Chrome has started the
JSPI origin trial
so if you want to use stack switching in your web app you can add your domain to
the origin trial. It is also possible to use in node v20 if you pass --experimental-wasm-stack-switching
to node.
Foreign function interface improvements
We started to address several long-standing paper cuts involving conversion of
dictionaries in the foreign function interface. Now toJs
of a dictionary
returns a LiteralMap
. Any key which is a string and a valid JavaScript
identifier can be looked up by direct indexing a.key
, in addition to being
able to look up the keys with .get
as before. Map
methods shadow keys
though, so as the JavaScript Map
class gains more methods, it may cause minor
breakages.
For instance, in order to use JS fetch
method in Python, previously you need
to do something like this:
from js import fetch, Object
from pyodide.ffi import to_js
async def python_fetch(url, options: dict[str, str]):
return await fetch(url, to_js(options, dict_converter=Object.fromEntries))
Here, dict_converter
is a function that converts a Python dict to a JS object.
This was necessary because fetch
expects a JS object as its second argument.
If to_js
is used without dict_converter
, the Python dict will be converted
to a JS Map, which fetch does not expect.
Now, by introducing the LiteralMap
, the above code can be simplified to:
await fetch(url, to_js(options))
without the need to specify dict_converter
(thanks to Andrea Giammarchi for
the idea and implementation of LiteralMap
).
Note that we still need to use to_js
to convert the options
dict to a JS
object, meaning that the following case will still not work. But we hope to
support this in the future (see Future work on the FFI below).
await fetch("example.com", options={"headers": {"a": "b"}});
For a similar purpose of adapting from Python JSON to JavaScript JSON without
doing a conversion we added PyProxy.asJsJson()
. For example the following code
works now:
const jsonStr1 = `{"a":[1,2,3,{"b":7}]}`;
const pyjsonmod = pyodide.pyimport("json");
const pyjson = pyjsonmod.loads(jsonStr1).asJsJson();
const jsonStr2 = JSON.stringify(pyjson);
console.log(jsonStr1 === jsonStr2); // true
In future work, we intend to also add JsProxy.as_py_json()
to do the reverse.
Improvements to sphinx-js
and JS API docs
We use sphinx-js
to render our JavaScript API documentation. It is a fantastic
tool but it was very hard to maintain. We did a major rewrite of part of
sphinx-js
, moving the logic for ingesting the typedoc abstract syntax tree
from Python to JavaScript. In the process of doing this we discovered and fixed
a large number of bugs. It is now dramatically easier to implement new features
and to update dependencies than it ever was before. We were able to fix most of
the remaining complaints we had about the rendering of Pyodide’s JS API docs.
(The content still has a lot of issues though.)
Work in Progress / Roadmap
Package build system
We have made significant progress towards separating our package build system
(pyodide-build
) from the Pyodide runtime. Though this change will affect only
maintainers of Pyodide ports of packages (port maintainers) and not regular end
users, it is a crucial improvement for several reasons. This work is not yet
complete, but we believe it will be done by the end of the year.
Benefits of decoupling pyodide-build
from the Pyodide runtime
- Faster Package Updates:
Currently, pyodide-build
is released in conjunction with the Pyodide runtime
and the Pyodide foreign function interface, with a quarterly release cadence.
This is inconvenient for port maintainers because they have to wait months to
use new features or bug fixes in pyodide-build
. Decoupling pyodide-build
releases from Pyodide runtime releases will allow us to make improvements
available faster.
- Faster and Easier Runtime and FFI Releases:
Because packages can only be updated once a quarter, whenever we feel that we are ready to release a set of changes to the runtime and foreign function interface, we feel the need to try to update as many packages as possible so that people won’t have to wait even longer to get new stable package versions. This significantly delays releases, sometimes by more than a month.
- Decoupling Versions:
pyodide-build
and the Pyodide runtime are tightly coupled, requiring port
maintainers to match their versions to build packages. This prevented port
maintainers who work with older Pyodide runtimes from benefiting from
improvements in pyodide-build
. Decoupling pyodide-build
from Pyodide runtime
releases will give port maintainers the flexibility to use the latest version of
pyodide-build
and support old versions of the Pyodide runtime.
- Capacity for More Packages:
The 250+ packages in the Pyodide distribution are rebuilt with every new commit to Pyodide. This is a heavy burden on our CI system and makes it challenging to support additional large packages. If we had to pay for our CI resources, we would pay over $3,000 per month. CircleCI generously donates much more to us than they claim they are willing to.
By separating pyodide-build
and the package recipes from the Pyodide runtime,
we will be able to build packages independently, allowing many more packages to
exist in the Pyodide ecosystem. It will also make the ecosystem more sustainable
in its consumption of both computational power and maintenance effort.
Load time optimization: Memory snapshots
Pyodide does a lot of work at startup time initializing the Python interpreter and the foreign function interface. This work is done at every startup but has the same outcome every time. Ideally we should be able to save and reuse it.
There are two main approaches to reducing startup work. One approach is build-time partial evaluation. By analyzing the code, we can find expressions that have a consistent effect and bake that effect into the binary so it does not need to be done at runtime.
A second approach is to do the work once and then take a snapshot of the program state and somehow restore it later. In normal hosts, this is very difficult unless the language runtime has a compacting garbage collector (which Python does not have). However, WebAssembly has a fundamentally different security model than normal architectures and has no need for security features like address space layout randomization (ASLR) that ordinarily make snapshots very hard.
Build-time partial evaluation works best when implemented by the upstream maintainers of the cpython interpreter. Memory snapshots are easier to implement downstream of cpython so we are investing first in this approach.
Cloudflare is using Pyodide to support Python in their workers runtime and they care a tremendous amount about startup time. I (Hood Chatham) implemented memory snapshot support for Pyodide downstream in workerd as part of my work for them.
Memory snapshots are a complex feature with tricky interactions with dynamic linking, ctypes, the file system, the foreign function interface, and all systems that make use of entropy. Each one of these systems requires careful design work to support well. We believe that we understand how to handle the interactions with each of these features, but it takes time to implement.
I have upstreamed part of this work. There is no public API yet, and there are a lot of features that do not yet work correctly. Hopefully by the end of the year we will have something that is usable by the public. This should lead to a huge reduction in startup time in many cases.
Future work on the Foreign Function interface
We believe that we have pushed our current approach to the foreign function interface about as far as it can go. We have the following requirements, in roughly decreasing order of importance:
- Expressiveness. We want every language construct of JavaScript to be consumable from Python and conversely every language construct in Python to be consumable in JavaScript.
- Memory management. We want it to be possible and reasonably ergonomic to avoid memory leaks. This is critical to support applications like games that do video processing. The frames from the video are large buffers which we cannot afford to leak.
- Ergonomics. We want the interaction to be ergonomic, particularly when calling from Python into JavaScript.
- Performance. We want the interaction to be fast.
I think we have succeeded pretty completely at expressiveness and memory management. We have also done a great job at ergonomics, but there are corner cases which are very difficult to handle without compromising our more fundamental goals.
To deal with this we need to reconsider some fundamental aspects of our
approach. Compared to ctypes
and most other foreign function interfaces, one
thing that is unusual about Pyodide’s foreign function interface is that it does
not require any metadata about what function is being called and what it means.
This causes difficulties with ergonomics and performance that are very difficult
to fix.
To solve this, we are introducing a way to tell Pyodide what the Python function means. The beginnings of this work are included in Pyodide 0.26 but there are no publically visible changes yet.
Hopefully we’ll write a blog post discussing the design of this system in more detail, since there is too much to say in this post. But here are a couple quick examples.
The following doesn’t work because we forgot to say Response.new
:
Response("", status=404, statusText="Not found")
The following generates an empty response body:
Response.json({"a": [1, 2, 3]}, status=200, statusText="Success")
If we bind a signature to Response
to tell Pyodide about its shape, then these
two calls will work. This looks roughly as follows:
class Response_sig:
def __init__(self, body: Any, /, *, status: int = 200, statusText: str = "Okay"):
pass
@staticmethod
def json(obj: Json, /, *, status: int = 200, statusText: str = "Okay"):
pass
Response = Response.bind(Response_sig)
Acknowledgements
Thanks to Andrea Giammarchi for the LiteralMap
contribution. Thanks to Agriya
Khetarpal, Loïc Estève, and Ralph Gommers for their work helping ensure
scientific Python packages are well supported in Pyodide.
Thanks to Henry Schreiner, Joe Rickerby, Martin Renou, Matthieu Darbois, and Grzegorz Bokota for their help with the cibuildwheel port. Thanks to Henry also for helping us with our compliance with packaging standards.
Additionally, we always appreciate the support and assistance from the Emscripten team.
Thanks to Lukasz Langha, Antonio Cuny, Eric Snow, Kushal Das, Mark Shannon, Russel Keith-Macgee, Jeff Smith, Mike Fiedler, and many others for useful conversations at PyCon and constructive engagement with Pyodide.
Thanks to Nicholas Tollervey and Fabio Pliger for organizing the Pycon Wasm summit.
The following people commited to Pyodide in this release:
Brian “bits” Olsen, Christian Clauss, chrysn, C. Titus Brown, David Contreras, Emil Nikolov, goulashsoup, guangwu, Gyeongjae Choi, Henry Schreiner, Hood Chatham, Ian Thomas, ifduyue, James J Balamuta, Joel Ostblom, Joe Marshall, John Wason, Loïc Estève, Matthias Hochsteger, Matthias Köppe, Myles Scolnick, Philipp Schiele, Pierre Haessig, pyodide-pr-bot, Raymond Berger, Sam Estep, Szabolcs Dombi, Victor Blomqvist, Yuichiro Tachibana (Tsuchiya), Zsolt