Friday, February 20, 2009

Alternatives to django-media-bundler

Apparently I didn't do a good enough job Googling for other projects when I wrote django-media-bundler. If you Google for "django concatenate javascript" or "django minify javascript", there are a number of other projects that do similar things, each with different tradeoffs:
  • django-mediacat: To use this tool, you describe your JS packages in Django models using the django-admin interface. You then configure a URL to point at the view function, and pass it the package names you want as GET arguments. mediacat then caches the resulting file in the database model, and sends the appropriate ETag and Last-Modified headers to make the browser cache the response contents. Last updated: 2008-11-02
  • django-compress: This tool can be configured to automatically regenerate its bundles by checking the file modification times of the original source files. Obviously, if you're using a content distribution network or a cookieless static domain, this feature might not work out of the box, but if you're a small-time operation, this is nice. It also allows versioning the filenames of the bundles, so that new bundles bust the browser cache. You can use the YUI compression tools if available, and django-compress has a templatetag that will source the compressed script if compression is enabled, or the individual source files if it is disabled. Finally, I'd like to point out that this project looks the most mature, as it was started on 2008-04-28, and it has a well-written wiki and not just a README. Last updated: 2008-12-05
  • django-compact: Looks like it's not quite finished yet, but so far it looks like its main feature is the templatetags that will link to individual script sources or the bundle. Last updated: 2009-01-30
  • django-assets: Has Jinja 2 templatetags, and supports automatic regeneration of bundles based on file mtime. Bundles are defined inline in the templates instead of in a central configuration. This seems less good to me, because you want to keep the number of different bundles small, so that the user only has to download one script bundle. Sometimes you want more than one because a particular page has a lot of JS, but usually you want all JS to be cached after the first page load. Last updated: 2009-02-08
  • django-assetpackager: This tool supports cache busting by putting the bundle generation timestamp in the bundle filename. It also has a templatetag that will source individual scripts in debug mode and just the bundle in production mode. Last updated: 2008-06-21
  • Finally, django-media-bundler: While somewhat unrelated to concatenating and minifying JavaScript and CSS, my project supports image spriting, which is a pain to do by hand. It also employs an interesting little heuristic 2-D bin packing algorithm to try to arrange the images into a square-ish rectangle of minimal area. It has templatetags like a couple of the others, but it doesn't have cache busting. That's an important feature I'd like to add. The auto-regeneration I'm not convinced about, because it breaks down when you're not using a simple single-server setup. Also, it means that your templatetag has to do a bunch of file system calls while it's rendering the template, which isn't a terrible idea, but it feels less than perfect. Last updated: 2009-02-15
Having browsed the source trees of each of these projects, in my (biased, of course) opinion the best tools here are django-compress and django-media-bundler. django-compress is mature and has the most JS & CSS bundling features, while the media-bundler is simpler (which can be good) and has image spriting.
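
To make two of the features above concrete, here is a minimal sketch of mtime-based staleness checking and timestamp-based cache busting. It is not taken from any of the projects listed; the function names are hypothetical, minification is left out, and it uses only the standard library.

```python
import os
import time


def concatenate(source_paths, bundle_path):
    """Join the source files into one bundle, in the order given."""
    with open(bundle_path, "w") as bundle:
        for path in source_paths:
            with open(path) as source:
                bundle.write(source.read())
                bundle.write("\n")  # guard against a missing trailing newline


def bundle_is_stale(source_paths, bundle_path):
    """True if the bundle is missing or older than any source file."""
    if not os.path.exists(bundle_path):
        return True
    bundle_mtime = os.path.getmtime(bundle_path)
    return any(os.path.getmtime(p) > bundle_mtime for p in source_paths)


def versioned_name(bundle_name, timestamp=None):
    """Insert a build timestamp before the extension, e.g. site.js -> site.1235088000.js."""
    stamp = int(time.time() if timestamp is None else timestamp)
    root, ext = os.path.splitext(bundle_name)
    return "%s.%d%s" % (root, stamp, ext)
```

Note that `bundle_is_stale` costs one stat call per source file on every check, which is the per-render file system overhead mentioned above, and it only sees files on the local machine.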

Saturday, February 14, 2009

django-media-bundler now supports sprites!

Over the last week I've been working on adding image spriting support to django-media-bundler. For the uninitiated, image spriting is a technique that Google, Yahoo, and other fast web sites use to speed up page load times. What these web sites do is to combine all of their small icon images into one medium size image, and then use CSS background image offsets to display each icon individually from the master. For small icon graphics, the overhead of the HTTP requests dwarfs the size of the actual image, so this speeds things up drastically.

With the help of the Python Imaging Library, I was able to read the images, measure their dimensions, and paste them together into the master image. However, given icons of arbitrary size, it's not clear what the best way to lay out the master image is. Having just taken an advanced algorithms course, I recognized this as a variation of the bin-packing problem. The bin-packing problem is NP-hard, but once you have a name for something, it's a lot easier to Google up some simple heuristic algorithms to solve the problem.
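
As an illustration of the kind of heuristic involved, here is a minimal "shelf" packing sketch. It is not the algorithm django-media-bundler actually uses; the function name and the tallest-first ordering are assumptions chosen for simplicity, and it works on plain (width, height) tuples so it runs without PIL.

```python
import math


def pack_shelves(sizes):
    """Place rectangles on horizontal shelves.

    sizes maps a name to (width, height). Returns (positions, sheet_size),
    where positions maps each name to the (x, y) of its top-left corner.
    """
    if not sizes:
        return {}, (0, 0)

    total_area = sum(w * h for w, h in sizes.values())
    widest = max(w for w, _ in sizes.values())
    # Aim for a roughly square sheet, but never narrower than the widest image.
    target_width = max(widest, int(math.ceil(math.sqrt(total_area))))

    positions = {}
    x = y = shelf_height = sheet_width = 0
    # Tallest first, so each shelf's height is set by its first image.
    ordered = sorted(sizes.items(), key=lambda kv: (-kv[1][1], kv[0]))
    for name, (w, h) in ordered:
        if x > 0 and x + w > target_width:
            # The image doesn't fit: start a new shelf below the current one.
            y += shelf_height
            x = shelf_height = 0
        positions[name] = (x, y)
        x += w
        shelf_height = max(shelf_height, h)
        sheet_width = max(sheet_width, x)
    return positions, (sheet_width, y + shelf_height)
```

With PIL, the returned sheet size could be passed to `Image.new` and each icon pasted at its position with `Image.paste`. A shelf layout wastes the space beside short images on a tall shelf, so it only approximates a minimal-area rectangle.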

Finally, I had to figure out how to get the sprites into the page. I found that in audio-enclave we use images in all sorts of interesting ways that make it difficult to abstract away the spriting behind a template tag. In the end I generated a set of CSS rules with the background image and offsets and decided to let the user figure out how to display the images. Working in the sprites was, for our project, more pain than it was worth, and I had to break a couple of nice CSS abstractions to force a DIV node into an element which had none before.
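
A rough sketch of what generating those CSS rules might look like, assuming each icon's position in the master image and its dimensions are already known. The function name, the `sprite-` class prefix, and the example URL are hypothetical, not django-media-bundler's actual output format.

```python
def sprite_css(positions, sizes, sheet_url, prefix="sprite-"):
    """Build one CSS rule per icon.

    positions maps a name to its (x, y) offset inside the master image;
    sizes maps the same name to its (width, height).
    """
    rules = []
    for name in sorted(positions):
        x, y = positions[name]
        w, h = sizes[name]
        # Negative offsets shift the master image so that only this
        # icon shows through the element's width-by-height window.
        rule = (
            ".%s%s { background: url('%s') no-repeat %dpx %dpx; "
            "width: %dpx; height: %dpx; }"
        ) % (prefix, name, sheet_url, -x, -y, w, h)
        rules.append(rule)
    return "\n".join(rules)


css = sprite_css(
    {"play": (0, 0), "stop": (16, 0)},
    {"play": (16, 16), "stop": (16, 16)},
    "/media/sprites.png",
)
print(css)
```

Because each rule fixes the element's width and height, the icon needs an element of its own to live in, which is why an extra DIV can end up being necessary.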

Anyway, implementation details aside, now audio-enclave has excellent front-end performance! Check out these Firebug net tab screenshots:

Before spriting:

After spriting:

Sunday, February 1, 2009

Why CPython Will Live On

Recently there has been a lot of interest on proggit and Hacker News in creating new language implementations on top of existing VMs like the JVM, the CLR, and BEAM, the Erlang VM. The list of language implementations targeting existing VMs that I can name off the top of my head is long: Clojure, Scala, Jython, JRuby, IronRuby, IronPython, Reia, Ioke, Boo, Fan, F#, and Fortress. I was even working on a small language side-project that eventually had the goal of targeting the JVM. This should all be old news to you if you've been paying attention to PL news, and I think it's a pretty good idea. When languages share runtimes, you end up being able to communicate between them nicely, and everyone can collaborate on writing one high-performance garbage collector and one solid JIT.

However, you can only stretch this principle so far. Reia is implemented on top of BEAM because it wants capabilities that the JVM doesn't have built in, like lightweight processes and good fault-tolerant message passing. So what I want to talk about is that while I think Jython and IronPython are worthwhile ways to get pure Python to play nice with languages on those respective VMs, I still think CPython has a very bright future.

I realized while reading Guido's History of Python blog that one of Python's very early design decisions was to integrate well with existing systems, meaning things written in C. As Guido explains, this was a reaction on his part to his work with the ABC group, which wanted to hide all those scary systems problems away from the programmer and isolate them on some higher-level plane. While this may be good for a learning language, this limits your ability to do interesting things with code that already exists. Python solved that problem by having a relatively simple C API, especially when compared to things like JNI. Going further, the choices to use the GIL and reference counting are decisions that clearly make the life of the C extension module writer easier.

Writing C extension modules isn't exactly peaches and cream, so Greg Ewing came up with Pyrex, which was later forked into Cython. Cython is a "medium"-level Python-like language which gives you access to C primitives and allows you to call out into both Python and C with ease. The Sage Project, a project to repackage and combine Python math software, uses it extensively.

To give you an idea of what this lets you do, let's say you're writing a C++ plugin for an existing crummy Windows application and you want to make your life better by embedding Python. All you have to do is call Py_Initialize() from your plugin, and then you can start interacting with Python code. Cython makes this even easier, because you can write your DLL stubs in Cython as cdef functions and wrap the result up as a DLL. Doing this kind of thing is technically possible with the JVM. However, while Cython is easy, the JNI is a pain, and the JVM has a massive startup time penalty as compared to CPython. I'd like to link to a more in-depth explanation of this technique, but the person I know who is using it hasn't written it up online yet. If and when it does go up I'll link it.
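
A real embedding host would make these calls from C or C++. As a rough illustration from the Python side only, `ctypes.pythonapi` exposes the same C API entry points of the already-running interpreter, so the snippet below calls two of them, `Py_IsInitialized()` and `Py_GetVersion()`. This is not the plugin technique itself, just a way to poke at the C API without a compiler, and it assumes CPython.

```python
import ctypes
import sys

# Handle to the C API of the interpreter running this script (CPython only).
api = ctypes.pythonapi

# Py_IsInitialized() returns nonzero once Py_Initialize() has run. An
# embedding host calls Py_Initialize() itself; here the interpreter
# startup has already done it.
initialized = api.Py_IsInitialized()

# Py_GetVersion() returns a C string, so tell ctypes the return type.
api.Py_GetVersion.restype = ctypes.c_char_p
version = api.Py_GetVersion().decode("ascii")

print(initialized, version)
```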

In conclusion, until the day that C's star sets, CPython will continue to be an incredibly useful tool.