Sublime Blog

Python as an extension language: Not all beer and sunshine

If you're shopping around for an extension language for your project, it's pretty hard to go past Python. Aside from being a very pleasant language to work with, it has a huge selection of libraries, and a user base that's at least as large. Add projects like Boost.Python into that, and you've got a pretty compelling case. However, it's not all beer and sunshine in Python land - at least, not on Windows.

Let's start with the simplest case: ship python25.dll and the standard libraries with your application, and provide some means for users to enter Python code.

Golden, right? Not so much. It'll work fine right up until someone with an international version of Windows tries to install your application to a localized version of "C:\Program Files" that contains a Unicode character. Then everything falls over.

Oh, you thought Python supported Unicode? Well, it does. Sort of. Provided you don't expect to be able to put Unicode paths within sys.path, the list of directories Python files may be loaded from. Ouch!

There are still options though: you can set sys.path to load from the current directory, and just set the current directory appropriately before making any calls into Python.
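As a rough illustration of the current directory trick, here is a minimal sketch written against a modern Python rather than 2.5. The helper name import_from_directory is made up for this example, and it assumes an empty string entry in sys.path is resolved against the current directory at import time.

```python
import importlib
import os
import sys


def import_from_directory(directory, module_name):
    """Import module_name from directory by temporarily changing into it.

    Hypothetical helper for illustration only. The real directory never
    appears in sys.path; only the empty string (meaning "the current
    directory") does.
    """
    saved_cwd = os.getcwd()
    os.chdir(directory)
    try:
        if "" not in sys.path:
            sys.path.insert(0, "")
        # Make sure a freshly written module is noticed by the import system.
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
    finally:
        # Always restore the caller's working directory.
        os.chdir(saved_cwd)
```

Note that a module imported once stays cached in sys.modules, so the directory only matters the first time a given module name is imported.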

Now, let's take a look at loading user-supplied plugins from .py files. You could just place them in the same directory that the application is installed to, but that's not a great solution on Vista, as users generally won't have write access to Program Files. A better location is the Application Data folder, contained within the user's home directory.

This raises some more problems: it's not uncommon for users to have Unicode characters in their username, and hence in their Application Data path. So we can't simply add that to sys.path. We also can't use the current directory trick any longer, as there are now two directories to load things from.
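For what it's worth, finding that folder from Python can be sketched as below. The function name plugin_directory, the "Plugins" subfolder, and the fallback to the home directory are all assumptions made for the example; the only real piece is the APPDATA environment variable that Windows sets for each user.

```python
import os


def plugin_directory(app_name):
    """Return a per-user plugin folder for app_name.

    Hypothetical layout: <Application Data>/<app_name>/Plugins.
    Falls back to the home directory when APPDATA is not set
    (for example, when not running on Windows).
    """
    base = os.environ.get("APPDATA") or os.path.expanduser("~")
    return os.path.join(base, app_name, "Plugins")
```

The returned path is exactly the kind of path that may contain the user's name, which is where the trouble starts.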

For Sublime Text, I've implemented a halfway solution: do some sys.path_hooks mangling to get the modules shipped with the application loading, and use the current directory trick for user-supplied modules (drop me an email if you're interested in the code). It's not pretty, but it does work.
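To give a flavour of what sys.path_hooks mangling can look like, here is a small sketch. It is not the Sublime Text code, and it uses the importlib machinery from modern Python rather than anything available in 2.5. The idea is to put a plain ASCII alias in sys.path and have a hook map that alias to the real directory. The names make_alias_hook and install_alias are invented for the example.

```python
import sys
from importlib.machinery import FileFinder, SourceFileLoader, SOURCE_SUFFIXES


def make_alias_hook(alias, real_directory):
    """Build a path hook that only answers for the given alias entry."""
    def hook(path_entry):
        if path_entry != alias:
            # Not ours: let the other hooks have a go.
            raise ImportError("not handled by the alias hook")
        # Serve plain .py files out of the real directory.
        return FileFinder(real_directory, (SourceFileLoader, SOURCE_SUFFIXES))
    return hook


def install_alias(alias, real_directory):
    """Register the hook and add the ASCII alias to sys.path."""
    sys.path_hooks.insert(0, make_alias_hook(alias, real_directory))
    # Drop any stale cached finder for this alias.
    sys.path_importer_cache.pop(alias, None)
    if alias not in sys.path:
        sys.path.append(alias)
```

After something like install_alias("<app-modules>", install_dir), a normal import statement finds modules in install_dir even though that directory itself never appears in sys.path.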

Aside from current directory wrangling, there's another option I've yet to try: short path names. Windows has a notion of short path names, where every file has its full path name, and a short one (see GetShortPathName) in 8.3 format for archaic programs. The noteworthy part is that 8.3 names use ASCII characters, so they'll be safe to use as Python module paths. There are a few caveats with this approach:

  • Not every file has a short path name. Generation of them may be turned off.
  • Some file systems don't support 8.3 file names: They presumably won't exist on Samba shares, for instance.
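An untested-in-anger sketch of the short path name idea, calling GetShortPathNameW through ctypes. The helper names short_path and ascii_safe_path are made up for the example, and the code returns None off Windows or whenever no usable short name comes back, which covers both caveats above.

```python
import ctypes
import sys


def short_path(long_path):
    """Return the 8.3 form of long_path, or None if it cannot be obtained."""
    if sys.platform != "win32":
        return None
    from ctypes import wintypes

    get_short = ctypes.windll.kernel32.GetShortPathNameW
    get_short.argtypes = [wintypes.LPCWSTR, wintypes.LPWSTR, wintypes.DWORD]
    get_short.restype = wintypes.DWORD

    # First call asks for the required buffer size (0 means failure).
    size = get_short(long_path, None, 0)
    if size == 0:
        return None
    buffer = ctypes.create_unicode_buffer(size)
    if get_short(long_path, buffer, size) == 0:
        return None
    return buffer.value


def ascii_safe_path(path):
    """Return an ASCII-only spelling of path, or None if there isn't one."""
    if path.isascii():
        return path
    candidate = short_path(path)
    # With short name generation turned off, the "short" path may simply be
    # the long one again, so check the result before trusting it.
    if candidate is not None and candidate.isascii():
        return candidate
    return None
```

A caller would add the result to sys.path only when it is not None, and fall back to one of the other tricks otherwise.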

It's unfortunate that Python has this limitation. My understanding is that it is not going to be fixed for Python 3.0, though there has been some work in this direction. Despite this, I still think you'd be mad to use anything else as an extension language.

Jon Skinner