Yellow

Last Post Ever

It took me a while to bother to move my old posts over to my new blog, but LiveJournal's addition of obnoxious, mandatory, full-screen popover advertising finally motivated me to do it.

I couldn't find a nice way to re-direct each individual post over to its new home on glyph.twistedmatrix.com, so hopefully the title update is enough.  If you happen to know of some content which links to an article on this blog, please update it to point at the archived copy on that site.  All comments have also been migrated, so you shouldn't be missing anything there.
In related news, in case someone didn't notice, all old posts from jcalderone.livejournal.com have also moved, to as.ynchrono.us.  Goodbye forever, LiveJournal.  I hope you treat your new users better.
Glyph

New Blog

Those of you following me on Blendix already know this, but I have a new blog.

While I may post something here occasionally from now on, it will be a much more personal flavor; my writing about work, about programming, and about technology will move there.

Masters of the planets (both python and twisted) please update your feed sources; you can feel free to drop this livejournal in favor of the new URL.
Divmod

Divmod: Reloaded

Hot on the heels of the Twisted release, Divmod has a new, and hopefully much more comprehensible, site design and layout.

Check it out over at divmod.org.

I've long been ashamed of the default-Trac look and the opaque information layout on Divmod's site, and I'm really happy to have the way we greet the world be spruced up.

This is mostly the work of the unstoppable Duncan McGreggor; this is just his latest work in improving Divmod's communication with our community and our customers — and it won't be his last.

(As with any new site design, the title isn't entirely a joke: your browser's probably cached some stuff it wasn't supposed to, so if you've been visiting our site a lot, re-load for the full effect...)
Glyph

Upgrade now!

In case you haven't heard through some other channel already, Twisted 8 is out.

In addition to numerous fixes and features, this release also includes a new release system for Twisted itself; this (hopefully) means that we won't have another year-long release drought.  We're planning to do another release in less than 3 months.

This means that new Twisted features will be available faster, but it also means that if you're writing some software that uses Twisted, upgrade now!  We try very hard to make sure that each new release is mostly compatible with the one that comes before it, so that your upgrade should be painless.  Especially if you have good unit tests.

However, this compatibility doesn't extend infinitely.  There are at least a few Twisted developers who would really like to drop some of our years of accumulated cruft and break compatibility with older versions.

If you upgrade now, your migration process will be gradually fixing a few deprecation warnings.  If you wait for 3 or 4 more minor releases, upgrading all at once will mean that anything which has changed will start off broken, and your tests might not even run until you've fixed a bunch of things.

Of course, by "will", I mean "should" - we're not perfect, but we'll fix upgrade issues in micro releases if you find them and report them.
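If you want to see which deprecated calls your own code still makes before they turn into breakage, Python's standard warnings module can collect them for you.  What follows is only a minimal sketch: old_api and new_api are hypothetical stand-ins for whatever library calls you actually use, and deprecated_calls is a helper invented for this example, not part of Twisted.

```python
import warnings


def new_api():
    # Hypothetical replacement function.
    return 42


def old_api():
    # Hypothetical deprecated function: it still works, but warns.
    warnings.warn("old_api() is deprecated; use new_api()",
                  DeprecationWarning, stacklevel=2)
    return new_api()


def deprecated_calls(func):
    """Call func and return the deprecation messages it triggered."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones that are normally hidden.
        warnings.simplefilter("always")
        func()
    return [str(w.message) for w in caught
            if issubclass(w.category, DeprecationWarning)]


messages = deprecated_calls(old_api)
```

For a stricter check, running your test suite with "python -W error::DeprecationWarning" turns every such warning into an exception, so nothing slips by unnoticed.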
Glyph

Open Source 3D Massively Multiplayer Game Infrastructure using Twisted

sirgolan wasn't at PyCon, but he totally should have been.

I (and others) have been working on him for years to release this stuff as open source, and he's finally done it!

Go check out the Multiverse 3D trac site, and get the code!

Do it!

Do it now!

(and then start submitting patches to get its persistence layer ported back over to Axiom because I told Mike not to block the release on that but it should totally be using Axiom and he only switched away from it because he didn't understand quite how it worked...)
Alchemy

Against the Alexandria Library Migration Strategy

Some library maintainers, when faced with the impending incompatible changes in Py3K, decide that it's time to burn their library down and start over with a new, incompatible version.  "Python is changing", they say, "so people are going to have to do a bunch of maintenance anyway.  What a great opportunity to force them to do all that maintenance that they should be doing for our library too!"

I'm saying it's wrong.  But don't take my word for it: Guido says it's wrong.  Before it became cool to do it, Martijn Faassen was saying it was wrong.

Guido didn't just blog that it was wrong, though.  He was so concerned that this message get out publicly that he repeated himself in a mailing list message to python-dev to make sure that people who don't read blogs would get the message.

This might strike some library maintainers as unfair.  If Py3K is just breaking compatibility, why can't you?

First of all, even if Py3K were really "just breaking compatibility", there is still the issue of careful timing.  You should read Guido's post and understand Ima Lumberjack's plight; he explains it exactly as I would.  When your users are doing maintenance to upgrade something, they only want to upgrade one thing, so they know what's going wrong when they encounter problems.  And if you're making incompatible changes, they will encounter problems, regardless of how cool and well-documented your new API is.

But, if you look closely, you will find that there's another reason that your library doesn't play by the same rules that Py3K does.  It's because Py3K is actually doing a lot more than just breaking compatibility.

I've been a critic of this effort in the past (and I still occasionally grumble about a thing or two) but the bottom line is that the core Python team is not just willy-nilly breaking stuff.  Let me enumerate the huge amount of work they're doing to make sure that people can have a reasonable migration experience to Python 3:
  1. The Python core team have written and are maintaining a source-to-source translation tool to assist in the transition.  Does your compatibility-breaking project have source-to-source translation, or in fact any tool support for migrating between different versions of the library?
  2. The Python core team are developing a compatible backport of 99% of their features: Python 2.6 is effectively "python 3 lite"; you don't need to upgrade all the way to the incompatible version to get a lot of the new features.  Does your compatibility-breaking project include an (at least mostly) compatible backport of all of your new features to an actively developed, "older" version?
  3. The Python core team is providing long-term support for the previous version so that people can migrate at their own pace and not be left out in the cold.  Is your compatibility-breaking project planning to provide a decade's worth of bugfixes, security patches, and feature backports to your older versions?
  4. The Python core team is providing comprehensive deprecation warnings explaining each new feature, and how you get there from the old feature.  Is your project going to provide documentation like that?
It's also worth noting that Python's dependencies are not going through any kind of compatibility earthquake while they're doing this, so they are not in the same position that you (as a Python application author) are.

So, if you answered "yes" to all four of those questions — you still don't have the same excuse that Python does, because their dependencies are not changing incompatibly.  So don't do it.  But if you were thinking about breaking compatibility at the same time as a dependency, then you probably didn't.

Of course, you can drop deprecated stuff and break compatibility if your user community will tolerate that.  Just don't do it in the same version where you decide to support Py3K.  Users should have the ability to get a compatible version that will work in 2.x and 3.x so that when they translate their own source code, they don't have to learn new methods right at that moment.
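One way a library can add a new name without burning the old one down is a small shim that keeps the old function working and merely warns.  This is a sketch under assumptions, not a recipe from any particular project: the deprecated decorator and the load_settings / loadSettings names are all hypothetical.

```python
import functools
import warnings


def deprecated(replacement):
    """Return a decorator that makes a function warn, then run as before."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                "%s is deprecated; use %s instead" % (func.__name__, replacement),
                DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)
        return wrapper
    return decorator


def load_settings(text):
    # Hypothetical new API: parse "key=value" lines into a dict.
    pairs = (line.split("=", 1) for line in text.splitlines() if "=" in line)
    return dict((key.strip(), value.strip()) for key, value in pairs)


@deprecated("load_settings")
def loadSettings(text):
    # Hypothetical old API, kept so existing callers do not break.
    return load_settings(text)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_result = loadSettings("a = 1\nb = 2")
```

Users who translate their code to 3.x can keep calling the old name, and deal with the warning in a later, separate upgrade.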
Glyph

Back from PyCon

Summary: PyCon 2008 was a great time.  I didn't go to a single talk, except a few of the keynotes; I spent pretty much my entire time cross-pollinating with other projects and plugging the until-recently-secret Twisted Software Foundation - our nickname for the Twisted project's membership of the Software Freedom Conservancy (TSF/SFC).

I briefly addressed the audience (of over one thousand python users) to kick off the TSF announcement, mostly to introduce Duncan.  However, someone managed to snap a picture of me that I think captured the feeling of awe that we've come so far in such a relatively short time.

I spoke to about 400 people at the conference, and I have a lot to follow up on.  I also have a day job, as those of you that I spoke to about Mantissa rather than Twisted know ;-).

If I talked to you about something at the conference, please don't hesitate to send me email reminding me about it.  Ideally, send me email reminding me in one to three weeks.  I have almost 1000 unread emails right now and while I try to be rigorous about using some unique features of Divmod's mail system to make sure I reply to each one, there is a certain volume of communication which no tool can help me cope with.

My first priority is blogging about interesting aspects of the conference in the next few days, before I've forgotten.
Glyph

That Ain't Workin'

I'm a big fan of Nine Inch Nails.  Not quite to the degree of buying every Halo, but I have a number of his albums.  So of course I've been intrigued by his latest offering.  Today, while looking at the Nine Inch Nails "Ghosts I-IV" website, I noticed an interesting bit of information:
We have SOLD OUT of the 2500 Limited Edition Packages.
My great-uncle is fond of a saying.  "It is better to be rich and healthy than to be poor and sick."  Seeing this, I was reminded of it.  It's not quite as catchy, but it's better to be customer-friendly and a huge success than reviled as corrupt and a failure.

The music and movie industries have been telling us for the last few years that digital restrictions are required in order to save their businesses from destruction.  Well, Mr. Reznor has proven them wrong quite dramatically.  Before I continue, let me address an obvious objection up front - I realize he's a superstar.  However, the spokespeople for the RIAA that support their claims are also superstars: Lars Ulrich is hardly a starving artist laboring in obscurity.   I'm not saying everyone can do what he did, only that the people who are already rich in the music industry can continue to be rich without the bullshit that they claim is critical.

The "Limited Edition", for those of you not up on the latest NIN happenings, is a three hundred dollar version of the album, containing a bunch of extras and a signature from Mr. Reznor himself.  The full album, in lossless, non-restricted format, costs five dollars.  There were 2500 copies of the limited edition.

Let me emphasize for those of you who might not be quite as up on the terminology that "lossless" formats (which NIN is selling for $5 here) are the highest quality format that it is possible to distribute over the internet.  Other music producers, out of fear of eating into their CD revenues, have mostly refused to provide digital copies of their music of this quality.

Also, to compare pricing: Apple typically charges 99¢ for a DRM-free song: it's not lossless, but it's 256kbps, which is fairly high quality.  (I would not believe it if someone told me they could hear the difference, but there is a marginal difference in the perception of value here.)  There are 36 songs on "Ghosts".  $5 is roughly 14% of $35.64.

So now that I've established that NIN is selling higher quality goods, in a customer-friendly way, for a fraction of the price of the competition, let's do some math:
  • March 6, today,
  • minus March 4th (the date that the "Ghosts" website became fully operational, according to Wikipedia),
  • is two days; and
  • 2500 copies
  • times 300 dollars
  • equals SEVEN HUNDRED AND FIFTY THOUSAND DOLLARS IN TWO DAYS.
Trent has now proven that if you are a superstar, you can make three quarters of a million dollars in two days, on a ridiculously expensive premium edition alone.  This is to say nothing of the people who bought, and continue to buy, the $75 version, the $10 version, or the $5 version.  This says nothing of the people who are buying it through Amazon.
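For anyone who wants to check my arithmetic, here it is spelled out.  The only inputs are the numbers quoted in this post (36 tracks, 99 cents per track, the $5 download, 2500 limited editions at $300 each); the variable names are just for illustration.

```python
from decimal import Decimal

# Buying all 36 tracks individually at 99 cents each.
per_track_total = Decimal("0.99") * 36

# The $5 download as a fraction of that per-track total.
ratio = Decimal("5") / per_track_total

# 2500 limited edition packages at 300 dollars apiece.
limited_edition_revenue = 2500 * 300

print("per-track total: $%s" % per_track_total)
print("$5 as a share of that: %d%%" % round(ratio * 100))
print("limited edition revenue: $%d" % limited_edition_revenue)
```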

Coincidentally, it also proves that you don't need any RIAA thugs to help you do this, or "market" your work, assuming people already know who you are.  You just need a web server, and a swimming pool big enough to put a million dollars in.
Glyph

Highlighting buried treasure in Twisted

I've previously blogged about twisted.python.modules, but it assumes you know about another API inside Twisted, twisted.python.filepath.  Unfortunately this module is rather under-documented and under-publicized, despite being extremely useful.  Unlike a lot of Twisted, much of the code in twisted.python can be extracted and used by itself, regardless of whether the program in question is networked or even event-driven.  This is especially true of FilePath, which is completely blocking, although sometimes I wish there were at least a version of it that wasn't.

A common sort of filesystem script opens each file in a directory hierarchy under a given path and does something with its contents.  For example, let's write a program that prints out a list of all Python modules (with a .py extension) in a tree which contain shebang lines.

Here's the script using good old os.path:
import sys
import os

def os_shebangs(pathname):
    for dirpath, dirnames, filenames in os.walk(pathname):
        for filename in filenames:
            fullpath = os.path.join(dirpath, filename)
            if (fullpath.endswith(".py") and
                file(fullpath, "rb").readline().startswith("#!")):
                yield fullpath

def os_show_shebangs(pathname):
    for path in os_shebangs(pathname):
        sys.stdout.write("%s: %s\n" % (
                path,
                file(path, "rb").readline()[2:].strip()))

if __name__ == '__main__':
    os_show_shebangs(sys.argv[1])

Pretty normal looking python code; not too much wrong with it.  At 20 lines and 596 characters long, it's not too complex.

Now let's have a look at a similarly idiomatic version using FilePath:
import sys
from twisted.python.filepath import FilePath

def shebangs(path):
    for p in path.walk():
        if (p.basename().endswith(".py") and
            p.open().readline().startswith("#!")):
            yield p

def showShebangs(pathobj):
    for path in shebangs(pathobj):
        sys.stdout.write("%s: %s\n" % (
                path.path,
                path.open().readline()[2:].strip()))

if __name__ == '__main__':
    showShebangs(FilePath(sys.argv[1]))
At 18 lines and 471 characters, it's almost exactly 20% smaller than the version that uses os.path.  However, a small space savings is hardly the most interesting property of this code.  The advantages over the version that uses os.path:
  • It's easier to test.  You can use a fake FilePath object rather than needing to replace the whole "os" module and the "file" builtin.
  • It's easier to read.  You need fewer names; rather than os, os.path, and builtins, the code talks mainly to one object.
  • It's easier to write.  How many of you honestly remembered that "dirpath, dirnames, filenames" is the order of the tuples yielded from os.walk?
  • It's easier to secure.  If you wanted to allow untrusted users to supply input to the os.path version, you need to be very, very careful.  What about "/"?  What about ".."?  With FilePath, you simply supply the input to the 'child' method, and...
    >>> from twisted.python.filepath import FilePath
    >>> fp = FilePath(".")
    >>> x = fp.child("okay")
    >>> y = fp.child("..")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "twisted/python/filepath.py", line 308, in child
        raise InsecurePath("%r is not a child of %s" % (newpath, self.path))
    twisted.python.filepath.InsecurePath: '/home' is not a child of /home/glyph
    >>> z = fp.child("hello/world")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "twisted/python/filepath.py", line 305, in child
        raise InsecurePath("%r contains one or more directory separators" % (path,))
    twisted.python.filepath.InsecurePath: 'hello/world' contains one or more directory separators
  • It's easier to extend.  As of revision 22464 of Twisted (i.e. the next release) you can replace twisted.python.filepath.FilePath with twisted.python.zippath.ZipArchive, and this exact same code can operate on zip files.
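To give a feel for how much care the os.path version needs in order to match the 'child' behavior shown above, here is a rough standard-library sketch.  It is not Twisted's implementation: safe_child and InsecurePathError are names invented for this example, and the check is purely lexical, so it does nothing about symlinks.

```python
import os


class InsecurePathError(Exception):
    """Hypothetical exception for this sketch (Twisted's is InsecurePath)."""


def safe_child(parent, name):
    """Join name onto parent, refusing anything but a direct child."""
    # Reject names containing a directory separator, e.g. "hello/world".
    if os.sep in name or (os.altsep and os.altsep in name):
        raise InsecurePathError("%r contains a directory separator" % (name,))
    parent = os.path.abspath(parent)
    candidate = os.path.abspath(os.path.join(parent, name))
    # Reject names such as "..", "." or "" that do not resolve to a
    # path located directly inside parent.
    if candidate == parent or os.path.dirname(candidate) != parent:
        raise InsecurePathError("%r is not a child of %s" % (candidate, parent))
    return candidate


def is_rejected(parent, name):
    """Return True if safe_child refuses the given name."""
    try:
        safe_child(parent, name)
    except InsecurePathError:
        return True
    return False
```

Every script that accepts untrusted names would have to remember to route them through a helper like this; with FilePath the check comes with the object.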
Not only does FilePath provide these benefits, it has very few dependencies.  Even if you don't like Twisted much, you can use twisted.python.filepath by copying only 3 modules into your project (twisted.python.filepath, twisted.python.win32, and twisted.python.runtime) and twiddling the appropriate imports to be relative.  Since FilePath is only one import for your code, and mostly consists of method calls, it will easily work with Twisted's version or your own.  So, share and enjoy!
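Because the code only makes a handful of method calls, the testing claim is easy to illustrate.  The sketch below repeats the shebangs function from the FilePath version and feeds it a fake object; FakePath is a hypothetical stand-in written for this example, providing only walk, basename, open and a path attribute, and it uses in-memory text rather than real files.

```python
import io


class FakePath(object):
    """Hypothetical in-memory stand-in with only the methods shebangs() uses."""

    def __init__(self, path, content="", children=()):
        self.path = path
        self._content = content
        self._children = list(children)

    def basename(self):
        return self.path.rsplit("/", 1)[-1]

    def open(self):
        return io.StringIO(self._content)

    def walk(self):
        # Yield this path first, then everything beneath it.
        yield self
        for child in self._children:
            for descendant in child.walk():
                yield descendant


def shebangs(path):
    # Same logic as the FilePath version of the script.
    for p in path.walk():
        if (p.basename().endswith(".py") and
                p.open().readline().startswith("#!")):
            yield p


tree = FakePath("/proj", children=[
    FakePath("/proj/run.py", "#!/usr/bin/env python\nprint('hi')\n"),
    FakePath("/proj/lib.py", "import os\n"),
    FakePath("/proj/notes.txt", "#!/bin/sh\n"),
])
found = [p.path for p in shebangs(tree)]
```

No temporary directories and no patching of the "os" module are needed; the test touches nothing outside the objects it builds.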
Glyph

Do Not Want

I do not want a book called "The Ghost Brigades", but someone thought I did, for a minute.  Those of you following this via my Blendix activity page will know what I mean.

One of the interesting things about working on Blendix is that we get to see all the ways in which other services export bad data.  Amazon is particularly weird about it, though.  As we were testing our code we saw a variety of bits of bad data intermittently published, some of which were fusions of the names of random products, some of which were actual products that had nothing to do with the user in question.  I'm a little sad that my first bogus entry was a real product, though.  Some of the incorrect entries we saw during testing were pretty hilarious.

Has anyone else out there used Amazon's various APIs to pull wishlist data and seen similar results?  Is there any way to work around it, or to recognize the bogus data?  Advice is welcome.