Pyodide 0.27 Release

Pyodide v0.27.0 is out. This release was focused on improving the long-term stability of Pyodide.

Welcome Agriya Khetarpal to the Pyodide team

Agriya Khetarpal has joined as a new maintainer. Agriya has been active in the Scientific Python area and is a contributor to NumPy, SciPy, and scikit-learn. He has already significantly strengthened Pyodide’s support for various Scientific Python packages.

Build System Improvements

Decoupling `pyodide-build` from Pyodide runtime

pyodide-build is a tool that builds Python packages to run in Pyodide. Previously, the version of pyodide-build was strongly coupled to the version of Pyodide, meaning that if you wanted to build a package against a specific version of Pyodide, you had to use the corresponding version of pyodide-build. The problem with this approach was that even if we improved the build system, downstream users would have to wait for the next Pyodide release to use it.

In this release, we have separated pyodide-build from the Pyodide runtime. This allows us to develop and release pyodide-build independently of Pyodide. pyodide-build is now developed in pyodide/pyodide-build. You can install it with pip install pyodide-build.

Recent versions of pyodide-build work with Pyodide 0.26 and higher. For example, you can use pyodide-build version 0.29.2 to build packages for use with Pyodide 0.26.4:

pip install pyodide-build==0.29.2
pyodide xbuildenv install 0.26.4
pyodide build <...>

Wheels for new packages

This release includes about twenty new packages, most notably popular data science packages PyArrow, Polars, and DuckDB.

Previously, we built all packages in our CI from within the repository. The increasing number of packages with long build times has been a strain on our CI resources. For instance, PyArrow takes tens of minutes to build. Now, several of these packages are built in a separate repository and managed by the package maintainers. In Pyodide 0.28.0, we are planning to unvendor the packages from the Pyodide runtime entirely. This will pave the way to supporting many more packages.

Update to NumPy 2.0

We have updated NumPy to version 2.0.2. This is a major update that required updates to many downstream packages in the scientific Python ecosystem. Please see Ecosystem compatibility with numpy 2.0 for more information.

Performance improvements to foreign function interface

We’ve long prioritized making the foreign function interface comprehensive and correct over making it fast. At this point we no longer get many bug reports or feature requests for it. However, for some use cases the foreign function interface is inconveniently slow, so we made several improvements.

The first improvement we made was to getattr on a JsProxy. Each JsProxy has a Python dictionary in addition to the JavaScript object that it holds. When someone accesses an attribute on the JsProxy we have to first look it up on the dictionary. If we don’t find the attribute on the dictionary, we then look up the attribute on the JavaScript object. Most attributes are found in the second lookup on the JavaScript object. On this expected codepath, the failed Python lookup raises an AttributeError which we catch. In a successful lookup, between 43% and 76% of the execution time was spent formatting the message for the AttributeError. We now avoid creating the AttributeError in the first place which prevents wasting time formatting an error message that we will throw away.
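The difference between the two lookup strategies can be sketched in pure Python. This is only an analogy: the real lookup is implemented in C inside Pyodide, and the names `FakeJsObject`, `SlowProxy`, `FastProxy`, and `_MISSING` are hypothetical stand-ins invented for illustration.

```python
# Pure-Python analogy only; Pyodide's actual JsProxy lookup is written in C.
_MISSING = object()  # sentinel meaning "not found", cheaper than an exception


class FakeJsObject:
    """Hypothetical stand-in for the JavaScript object held by a proxy."""

    def __init__(self, **props):
        self.props = props


class SlowProxy:
    """A failed dictionary lookup raises AttributeError with a formatted message."""

    def __init__(self, js_obj):
        self._py_dict = {}
        self._js_obj = js_obj

    def _py_lookup(self, name):
        if name in self._py_dict:
            return self._py_dict[name]
        # The message is formatted even though the caller discards it.
        raise AttributeError(
            f"{type(self).__name__!r} object has no attribute {name!r}"
        )

    def get(self, name):
        try:
            return self._py_lookup(name)
        except AttributeError:
            # Expected path: most attributes live on the JavaScript object.
            return self._js_obj.props[name]


class FastProxy(SlowProxy):
    """A failed dictionary lookup returns a sentinel; no exception is created."""

    def _py_lookup(self, name):
        return self._py_dict.get(name, _MISSING)

    def get(self, name):
        value = self._py_lookup(name)
        if value is _MISSING:
            value = self._js_obj.props[name]
        return value
```

Both classes return the same values; the second simply never builds the error message on the common path where the attribute is found on the JavaScript side.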

The second improvement we made was to optimize away temporary bound methods. To execute code like

a.f()

Python first looks up a.f and then calls the result. This is translated into bytecode like:

LOAD a
LOAD_ATTR f
CALL

The function a.f receives a as its first argument, self, so a.f needs to be a special bound method object that knows the correct value for self. However, we only use this object once to call it and then we throw it away. We can avoid allocating and destroying an object when we call a method by calling type(a).f(a) instead. The LOAD_ATTR opcode has a special argument that indicates that the next opcode is going to be CALL, and in that case it calls a function named _PyObject_GetMethod instead of the typical PyObject_GetAttr to perform the attribute lookup. We patched _PyObject_GetMethod to have special handling for JsProxy objects so that we could optimize away the temporary for JS objects too.
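The temporary bound method can be observed from ordinary Python. The sketch below uses a plain Python class `A` (a hypothetical example, not a JsProxy) to show that each attribute access creates a new bound method object and that calling through the type gives the same result without one.

```python
class A:
    def f(self):
        return 42


a = A()

# Each attribute access creates a fresh bound method object.
m1 = a.f
m2 = a.f
fresh_each_time = m1 is not m2

# The bound method only pairs the underlying function with `self`.
same_function = m1.__func__ is A.f
same_self = m1.__self__ is a

# Calling through the type passes `self` explicitly, so no bound
# method object is needed for the call.
direct_result = type(a).f(a)
bound_result = a.f()
```

To see the bytecode on a particular interpreter, `dis.dis("a.f()")` prints it. The opcode names vary by CPython version: 3.11 and earlier use a separate LOAD_METHOD opcode, while 3.12 and later fold it into LOAD_ATTR.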

It would be possible to also improve other opcode sequences like

LOAD a
LOAD_ATTR x
LOAD_ATTR y

but in typical Python code, LOAD_ATTR does need to create a temporary, so it would require a more invasive patch to the Python interpreter.

What’s next?

We plan to upgrade to Python 3.13 in our next release. In addition, we intend to make the following changes:

Package unvendoring

We will unvendor the package recipes from the main Pyodide repository. The package index will gain a separate release process from the Pyodide runtime itself. This should have several significant benefits:

It will reduce CI usage due to rebuilding packages less often.
People who add a package will not have to wait as long until their package is included in our index.

Upstream work in Emscripten and Python

We are working on restoring Emscripten to a tier 3 supported target for Python. As a part of this work, we are upstreaming a large number of fixes to the Emscripten file system. For example, seeking on /dev/null now works (it does nothing), symlink support is much better, stat will work on a file descriptor that doesn’t point to a named file, and many file system calls have more POSIX-compliant error handling. Most of these changes have not made their way into Pyodide yet because we’re still using Emscripten 3.1.58 for ABI compatibility.

Wasm Exception Handling

Emscripten supports two different stack unwinding ABIs for C++ exceptions, Rust panics, and setjmp/longjmp: a legacy ABI based on JavaScript exception handling and a newer ABI based on WebAssembly exception handling. We currently use the JavaScript-based ABI but hope to switch to WebAssembly exception handling. This will lead to faster, smaller code and fewer bugs. It also eliminates a lot of complexity in the stack switching support code, because it is impossible to stack switch through JavaScript frames and by default every C++ try block introduces a JavaScript frame. Making the switch requires upstream work on the Rust compiler and switching to a custom build of the Rust standard library.

Acknowledgements

Thanks to Agriya Khetarpal, Loïc Estève, and Ralf Gommers for their work helping ensure that packages in the Scientific Python ecosystem are well supported in Pyodide.

Thanks to Joe Marshall and George Stagg for their contributions towards PyArrow and Polars support respectively, and to DuckDB maintainers for getting DuckDB to work in Pyodide.

Additionally, we appreciate the continued support from the Emscripten team.

The following people committed to Pyodide in this release:

Agriya Khetarpal, Andrei V. Plamada, Andrew Moon, Bart Broere, Carlo Piovesan, Castedo Ellerman, Chris Pyles, Christian Clauss, Deepak Cherian, Eli Lamb, Em Zhan, Eric Brown, Gyeongjae Choi, Hanno Rein, Henry Schreiner, Hood Chatham, Ian Thomas, JHM Darbyshire, James J Balamuta, James Lamb, Jiefu7, Joe Marshall, Joel Ostblom, Juniper Tyree, Kellen Malek, Kyle Barron, Loïc Estève, Luiz Irber, M Bussonnier, Maarten Breddels, Marco Edward Gorelli, Marianne Corvellec, Muspi Merol, Myles Scolnick, Nick Altmann, Olivier Grisel, Oscar Benjamin, Péter Gyarmati, Phillip Cloud, Riya Sinha, Szabolcs Dombi, Victor Blomqvist, YISH, Yan Wong, Zsolt Dollenstein, airen1986, chrysn, josephrocca, swnf
