Sublime Blog

Python as an extension language: Not all beer and sunshine

If you’re shopping around for an extension language for your project, it’s pretty hard to go past Python. Aside from being a very pleasant language to work with, it has a huge selection of libraries, and a user base that’s at least as large. Add projects like Boost.Python into that, and you’ve got a pretty compelling case. However, it’s not all beer and sunshine in Python land – at least, not on Windows.

Let’s start with the simplest case: ship python25.dll and python25.zip (containing the standard libraries) with your application, and provide some means for users to enter Python code.

Golden, right? Not so much. It’ll work fine right up until someone with an international version of Windows tries to install your application to a localized version of “C:\Program Files” that contains a Unicode (non-ASCII) character. Then everything falls over.

Oh, you thought Python supported Unicode? Well, it does. Sort of. Provided you don’t expect to be able to put Unicode paths within sys.path, the list of directories Python files may be loaded from. Ouch!

There are still options though: you can set sys.path to load from the current directory, and just set the current directory appropriately before making any calls into Python.
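As a rough illustration of the current directory trick, here is a minimal sketch written against a current Python 3 rather than the embedded 2.5 discussed here. The helper names working_directory and import_from_directory are made up for this example, and it shows the mechanics only: keep the empty string (meaning "the current directory") on sys.path, and change directory just before importing.

```python
import contextlib
import importlib
import os
import sys


@contextlib.contextmanager
def working_directory(path):
    """Temporarily switch the process's current directory to `path`."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Always restore the old directory, even if the import fails.
        os.chdir(previous)


def import_from_directory(module_name, directory):
    """Import `module_name` from `directory` without naming it on sys.path.

    Hypothetical helper: sys.path only ever holds "" (the current
    directory), so the real directory name never appears in it.
    """
    if "" not in sys.path:
        sys.path.insert(0, "")
    with working_directory(directory):
        # Make sure the import system notices files created recently.
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
```

The obvious cost is that the current directory is process-wide state, so anything else in the application that relies on it has to cope with it changing around each import.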

Now, let’s take a look at loading user supplied plugins from .py files. You could just place them in the same directory that the application is installed to, but that’s not a great solution on Vista, as users generally won’t have write access to Program Files. A better location is to place them within the Application Data folder, contained within the user’s home directory.
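A small sketch of locating such a per-user plugin folder from Python follows. On Windows the APPDATA environment variable normally points at the user’s Application Data folder; the function name user_plugin_dir, the "Plugins" subfolder, and the fallback to the home directory are all assumptions made for this example.

```python
import os


def user_plugin_dir(app_name, environ=os.environ):
    """Return a per-user plugin directory for `app_name`.

    Hypothetical layout: <Application Data>/<app_name>/Plugins.
    Falls back to the home directory when APPDATA is not set
    (for example, when not running on Windows).
    """
    base = environ.get("APPDATA") or os.path.expanduser("~")
    return os.path.join(base, app_name, "Plugins")
```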

This raises some more problems: It’s not uncommon for users to have Unicode characters in their username, and hence in their Application Data path. So we can’t simply add that to sys.path. We also can’t use the current directory trick any longer, as there are now two directories to load things from.

For Sublime Text, I’ve implemented a halfway solution: Do some sys.path_hooks mangling to get modules loading from python25.zip, and use the current directory trick for user supplied modules (drop me an email if you’re interested in the code). It’s not pretty, but it does work.
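The post does not show the actual code, so the following is only one plausible shape for this kind of sys.path_hooks mangling, not what Sublime Text does. It is written against a current Python 3, and the SENTINEL value and the install_zip_hook name are invented for the example. The idea is to put an ASCII-only placeholder on sys.path and have a path hook map that placeholder to an importer for the real zip file.

```python
import sys
import zipimport

# Hypothetical ASCII-only placeholder that stands in for the real zip path.
SENTINEL = "<bundled-stdlib>"


def install_zip_hook(real_zip_path):
    """Serve imports for SENTINEL from the zip file at `real_zip_path`."""

    def hook(path_entry):
        # A path hook must raise ImportError for entries it does not handle.
        if path_entry != SENTINEL:
            raise ImportError(path_entry)
        return zipimport.zipimporter(real_zip_path)

    sys.path_hooks.insert(0, hook)
    if SENTINEL not in sys.path:
        sys.path.append(SENTINEL)
    # Drop any stale cached result for the placeholder entry.
    sys.path_importer_cache.pop(SENTINEL, None)
```

With this in place, sys.path itself only ever contains the placeholder; the real location of the zip file is known only to the hook.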

Aside from current directory wrangling, there’s another option I’ve yet to try: short path names. Windows has a notion of short path names, where every file has its full path name and also a short one in 8.3 format (see GetShortPathName) for archaic programs. The noteworthy part is that 8.3 names use ASCII characters, so they’ll be safe to use as Python module paths. There are a few caveats with this approach:

  • Not every file has a short path name. Generation of them may be turned off.
  • Some file systems don’t support 8.3 file names: They presumably won’t exist on Samba shares, for instance.
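For the curious, a sketch of the short path name idea using ctypes to call GetShortPathNameW might look like the following. It is untested against the embedded Python 2.5 setup described here; the helper names short_path, is_ascii and ascii_safe_path are invented for the example, and on platforms other than Windows short_path simply returns its input unchanged. Because of the caveats above, the result is checked for being ASCII rather than assumed to be.

```python
import ctypes
import sys


def short_path(long_path):
    """Return the 8.3 form of `long_path` on Windows, else the path as-is."""
    if sys.platform != "win32":
        return long_path  # short path names only exist on Windows
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    get_short = kernel32.GetShortPathNameW
    get_short.argtypes = [ctypes.c_wchar_p, ctypes.c_wchar_p, ctypes.c_ulong]
    get_short.restype = ctypes.c_ulong
    size = 260
    while True:
        buf = ctypes.create_unicode_buffer(size)
        result = get_short(long_path, buf, size)
        if result == 0:
            raise ctypes.WinError(ctypes.get_last_error())
        if result < size:
            return buf.value
        # Buffer too small: result is the required size, so retry.
        size = result


def is_ascii(text):
    """True if `text` contains only ASCII characters."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


def ascii_safe_path(long_path):
    """Return an ASCII-only spelling of `long_path`, or None if none exists.

    If short name generation is disabled, Windows may hand back the long
    name again, so the result still has to be checked.
    """
    candidate = short_path(long_path)
    return candidate if is_ascii(candidate) else None
```

A caller would fall back to some other approach, such as the current directory trick, whenever ascii_safe_path returns None.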

It’s unfortunate that Python has this limitation. My understanding is that it is not going to be fixed for Python 3.0, though there has been some work in this direction. Despite this, I still think you’d be mad to use anything else as an extension language.


Jon Skinner

14 Comments

  1. Why ship a DLL and some standard libs in a zip file instead of asking the
    user to install Python?
    That way, you gain the benefit of using whatever lib you need.
    And maybe your plugins can be saved within the Python installation directory
    (in site-packages), but I may be wrong.

    I agree with the ’sort of’ unicode support in Python. I don’t know how
    Vista’s Application Data path is encoded. Is it ASCII? If so, you can write
    some functions to convert from and to it easily I think.

    Comment by kib2 — April 8, 2008 @ 10:03 pm

  2. http://mail.python.org/pipermail/python-dev/2006-September/068686.html

    Comment by Ken Faulkner. — April 8, 2008 @ 11:09 pm

  3. This issue is going to be fixed in Python 3.0. One of the major changes in 3.0 is that all strings are Unicode strings, which naturally includes those in sys.path.

    Comment by skymt0 — April 8, 2008 @ 11:21 pm

  4. did you try Ruby? if yes, why did you dismiss the idea?

    Comment by NSaibot — April 8, 2008 @ 11:27 pm

  5. you aren’t familiar with ruby are you?

    Comment by NSaibot2 — April 9, 2008 @ 12:22 am

  6. There currently is a reddit war going on – python vs ruby. One big agenda is Unicode (a topic which totally bores me personally)

    My problem is not so much about the war, but about _OTHER_ blogs that do not allow comments.

    Please dear visitors, support blogs that allow comments and do not waste any time visiting blogs that disallow comments.

    This blog here allows comments so I approve of it!

    Comment by she — April 9, 2008 @ 12:25 am

  7. NSaibot: wanna do mindless advocacy? Get the guy off Windows, don’t give him an inferior language. ;-P

    Comment by Nicola Larosa — April 9, 2008 @ 12:31 am

  8. kib2: There’s a lot to like about that approach, but requiring users to first install Python is asking a fair bit – there’s a lot to be said for being able to download a single install and have it just work.

    skymt0: Do you have any references? It’s not fixed in the current 3.0a4 alpha, and I hazily recall reading that there’s no intention to fix it. The behavior of the current 3.0 build is the same as 2.5: You can put unicode strings in sys.path, they just don’t work.

    As Ken pointed out, there have been some patches to fix it floating around, but I don’t know the current status of any of them.

    Comment by Jon Skinner — April 9, 2008 @ 1:10 am

  9. What about lua?
    http://www.lua.org/
    Build about twenty C files straight into your app (or into a dll if you like that kind of thing) and it’s done, standard libraries and all. Beautiful little language, and dead easy to create bindings with.
    Extension language integration doesn’t get any easier than that…

    Comment by SDC — April 9, 2008 @ 8:44 am

  10. I’m going to have to disagree with your conclusion.

    I honestly think Lua is the best choice for an extension language these days.

    Comment by Christopher Cashell — April 9, 2008 @ 9:11 am

  11. I’ve run into a lot of pain points like that which I covered at Pycon this year. I love Python, but it’s far harder than it should be to embed it cleanly.

    Here’s a paper I published for Pycon about some of the objectives and issues we had:
    http://us.pycon.org/common/2008/talkdata/PyCon2008/020/Case_Study_-_Embedding_Python_into_Counter-Strike_Source.pdf

    Comment by Mattie — April 9, 2008 @ 5:48 pm

  12. @Nicola Larosa: i’m not advocating! i’m merely asking.

    “Despite this, I still think you’d be mad to use anything else as an extension language.” this sentence is the reason i asked the question. i’m using ruby for myself, and i know it also has some unicode issues, just like python has. still it is a very flexible language and fun to work with.

    on the other hand, there is lua. so, basically, my thinking was that jon isn’t/wasn’t aware of those alternatives :)

    Comment by NSaibot — April 9, 2008 @ 8:17 pm

  13. Jon, what do you think about lua as extension language for sublime?

    Comment by mvm — April 9, 2008 @ 8:22 pm

  14. Lua makes a good extension language, with the caveat that its greatest strength (and weakness) is that it’s rather lean in the libraries department.

    Comment by Dan — April 9, 2008 @ 10:11 pm
