If you have been developing in Python for a while, you have probably heard of Cython and how it speeds things up. Cython is an optimizing static compiler for the Python programming language and the Cython programming language, which is a superset of Python. Cython converts your Python code to C and then builds it with a C compiler of your choice. In the Python world, this is commonly called Cythonizing. The speed gain can be significant, but it still depends on how optimized your Python code is.
How to Cythonize Python code?
The first step is to have a C compiler available for the platform and the Python version that we are working with. If we are developing on Linux, we usually do not need to install anything, since most Linux boxes come with the GCC compiler installed. On Windows, there is a recommended set of compilers for specific Python versions available here.
In this guide, we will be using Python 3.7 on Windows 10. The easiest and fastest route for us is to download and install Visual Studio Community 2019. During installation, choose Desktop development with C++, click Install, and that's it! This downloads the tools and SDKs for C and C++ development.
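Before going further, it can help to check whether a compiler is reachable from the command line. The following is only a rough sketch: the list of executable names is an assumption, and an empty result on Windows does not necessarily mean the build will fail, since setuptools can locate Visual Studio by other means than PATH.

```python
import shutil

# Candidate compiler executables; this list is an assumption, not exhaustive.
CANDIDATES = ("cc", "gcc", "clang", "cl")


def find_c_compilers(candidates=CANDIDATES):
    """Return a dict mapping each compiler name found on PATH to its full path."""
    found = {}
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            found[name] = path
    return found


if __name__ == "__main__":
    compilers = find_c_compilers()
    if compilers:
        for name, path in compilers.items():
            print(f"{name}: {path}")
    else:
        print("No C compiler found on PATH")
```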
The next step is to install Cython:

    pip install cython
Now we can start working on our Python module. Let us say we have a Python file named module.py containing the function hello() and we want to Cythonize it.

    #!/usr/bin/env python
    def hello():
        print("Hello world!")
The first step to Cythonizing is to write a standard setup.py containing the definition for ext_modules. We will simply pass our module file name to the cythonize() function. In setuptools, our Cythonized module is called an extension.

    #!/usr/bin/env python
    from setuptools import setup
    from Cython.Build import cythonize

    setup(
        ext_modules=cythonize('module.py')
    )
The last step is to build our extension by executing setup.py. The --inplace argument builds the extension in the same location as the source file.

    python setup.py build_ext --inplace
We will end up with the following files and directories. The build directory contains all the files and objects used by the C compiler. What matters to us are module.c, which is the C equivalent of our Python code, and module.cp37-win_amd64.pyd, which is our compiled extension.

    build/
    module.c
    module.cp37-win_amd64.pyd
    module.py
    setup.py
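The cp37-win_amd64 part of the file name identifies the interpreter and platform the extension was built for, so the name will differ on other setups. As a small sketch using only the standard library, the file name to expect on the current interpreter can be computed like this (the helper name is hypothetical):

```python
from importlib.machinery import EXTENSION_SUFFIXES


def expected_extension_filename(module_name):
    """Return the most specific extension file name for the running interpreter."""
    if not EXTENSION_SUFFIXES:
        raise RuntimeError("This interpreter does not support extension modules")
    # The first suffix is the most specific one, for example
    # ".cp37-win_amd64.pyd" on 64-bit Windows with CPython 3.7.
    return module_name + EXTENSION_SUFFIXES[0]


if __name__ == "__main__":
    print(expected_extension_filename("module"))
```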
To use our compiled module, we simply import it like a normal Python module, for example from a script named example.py.

    #!/usr/bin/env python
    from module import hello

    if __name__ == '__main__':
        hello()

Running the script gives:

    $ python example.py
    Hello world!
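Since the speed gain depends on the code being compiled, it is worth timing the same call before and after building the extension. The sketch below uses the standard timeit module; the work() function is a hypothetical stand-in for a function from your own module, and the repeat and number values are arbitrary.

```python
import timeit


def best_time(func, repeat=5, number=1000):
    """Return the best per-call time in seconds over several timing runs."""
    timings = timeit.repeat(func, repeat=repeat, number=number)
    return min(timings) / number


def work():
    # Hypothetical CPU-bound workload standing in for real module code.
    return sum(i * i for i in range(1000))


if __name__ == "__main__":
    print(f"{best_time(work) * 1e6:.2f} microseconds per call")
```

Run it once against the pure Python module and once after the build, then compare the two numbers.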
How to Cythonize large Python packages?
For this example, we will be using the amortization module that we used in our previous blog posts. Most guides on the internet suggest the following approach, which is wrong and will not compile our code:
    #!/usr/bin/env python
    from setuptools import setup
    from Cython.Build import cythonize

    setup(
        ext_modules=cythonize('amortization/*.py')
    )
The reason for this is that the __init__.py of a package cannot be compiled, at least not with the normal method. There is a somewhat hacky way to do it, but I will not discuss that here. The build fails with a linker error like this:

    LINK : error LNK2001: unresolved external symbol PyInit___init__
    build\temp.win-amd64-3.7\Release\amortization\__init__.cp37-win_amd64.lib : fatal error LNK1120: 1 unresolved externals
    error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.20.27508\\bin\\HostX86\\x64\\link.exe' failed with exit status 1120
To solve this, we need to refactor our code and move all code out of __init__.py. We keep this file empty in the package and do not compile it.

    #!/usr/bin/env python
    from setuptools import setup, Extension
    from Cython.Build import cythonize

    ext_modules = cythonize([
        Extension("amortization.amount", ["amortization/amount.py"]),
        Extension("amortization.schedule", ["amortization/schedule.py"]),
        Extension("amortization.amortize", ["amortization/amortize.py"]),
    ])

    setup(
        ext_modules=ext_modules
    )
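For packages with many modules, listing every Extension by hand gets tedious. One possible approach, sketched below with a hypothetical helper named discover_modules(), is to walk the package directory and skip every __init__.py. The helper itself only uses the standard library; the commented lines show how its output could be fed to setuptools and Cython, assuming both are installed.

```python
from pathlib import Path


def discover_modules(package_dir):
    """Return (dotted_name, source_path) pairs for every module in a package,
    skipping __init__.py files."""
    package_dir = Path(package_dir)
    root = package_dir.parent
    pairs = []
    for source in sorted(package_dir.rglob("*.py")):
        if source.name == "__init__.py":
            continue
        # amortization/amount.py -> amortization.amount
        relative = source.relative_to(root).with_suffix("")
        dotted_name = ".".join(relative.parts)
        pairs.append((dotted_name, source.as_posix()))
    return pairs


# Possible use in setup.py (assumes setuptools and Cython are installed):
#
#     from setuptools import Extension
#     from Cython.Build import cythonize
#     ext_modules = cythonize(
#         [Extension(name, [path]) for name, path in discover_modules("amortization")]
#     )
```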
After running python setup.py build_ext --inplace, we will end up with the following files.

    __init__.py
    amortize.c
    amortize.cp37-win_amd64.pyd
    amortize.py
    amount.c
    amount.cp37-win_amd64.pyd
    amount.py
    schedule.c
    schedule.cp37-win_amd64.pyd
    schedule.py
I temporarily moved the .py files away, except __init__.py, and ran pytest -v to verify that the code works. There is no need to do this, though, since Python imports the compiled modules (.so on Unix and .pyd on Windows) if they are available.

    tests/test_amortization.py::test_amortization_amount PASSED [ 50%]
    tests/test_amortization.py::test_amortization_schedule PASSED [100%]
    ========================== 2 passed in 0.05 seconds ===========================
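Instead of moving files around, we can ask the import system which file it would load for a given module. The following sketch relies on the standard importlib module; is_compiled() is a hypothetical helper name.

```python
import importlib.util
from importlib.machinery import EXTENSION_SUFFIXES


def is_compiled(module_name):
    """Return True if the named module would be loaded from a compiled extension file."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or not spec.origin:
        return False
    # Extension files end with a suffix such as .pyd or .so.
    return spec.origin.endswith(tuple(EXTENSION_SUFFIXES))


if __name__ == "__main__":
    # json is a pure Python package in the standard library.
    print(is_compiled("json"))
```

After the build, calling is_compiled("amortization.amount") from the project directory should return True if the compiled module is the one being picked up.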
How to distribute packages with Cython support on PyPI?
By simply running python setup.py bdist_wheel, you will end up with a binary wheel that can be used only on machines with the same platform and Python version as yours. Note that you should install the wheel package before executing the command. There are two ways to support all platforms and versions:
- Build binary wheels on all target platforms and versions and upload to PyPI
- Upload only the source to PyPI and let the user build it
The first option takes a lot of effort, but we can automate it in a CI/CD pipeline. The fastest route is the second option, as you only need to make minor tweaks to setup.py:

    #!/usr/bin/env python
    from setuptools import setup, Extension

    try:
        from Cython.Build import cythonize
        ext_modules = cythonize([
            Extension("amortization.amount", ["amortization/amount.py"]),
            Extension("amortization.schedule", ["amortization/schedule.py"]),
            Extension("amortization.amortize", ["amortization/amortize.py"]),
        ])
    except ImportError:
        ext_modules = None

    setup(
        ext_modules=ext_modules
    )
To build the source-only package, uninstall Cython first and make sure to remove all *.pyd files in the amortization package, then run python setup.py sdist. The only disadvantage of this option is that the end user must install Cython and a C compiler to get the compiled extensions; without Cython, setup.py falls back to ext_modules = None and the package is installed as pure Python. Try this by doing the following steps:

    # install a C compiler first
    pip install cython
    pip install amortization -v  # Add -v to see what is happening behind the scenes
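The cleanup before running sdist can be scripted. Below is a minimal sketch with a hypothetical helper, remove_build_artifacts(). The default patterns are an assumption (.pyd from this guide plus .so as its Unix counterpart), and the function only reports what it would delete unless dry_run is set to False.

```python
from pathlib import Path

# Compiled extension patterns; an assumption based on the files in this guide.
ARTIFACT_PATTERNS = ("*.pyd", "*.so")


def remove_build_artifacts(package_dir, patterns=ARTIFACT_PATTERNS, dry_run=True):
    """Find compiled extension files under package_dir and optionally delete them.

    Returns the list of matching paths. Nothing is deleted while dry_run is True.
    """
    matches = []
    for pattern in patterns:
        for path in sorted(Path(package_dir).rglob(pattern)):
            matches.append(path)
            if not dry_run:
                path.unlink()
    return matches


if __name__ == "__main__":
    for path in remove_build_artifacts("amortization"):
        print(f"would remove {path}")
```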
To wrap up
Cython increases the speed of a Python module by compiling Python code to C. Although speed is the most common reason developers use Cython, we can also use it for code obfuscation. If we want to protect our code from other people's eyes, we can build it using Cython and distribute it without the source code.