by Brian Neal

20 Mar 10 Get Ready for Django 1.2

Django 1.2 is coming out soon. I had been sitting on a trunk version since last October or November, and I finally decided to update to try out some of the new features. This blog post will summarize my experience and report on any gotchas I ran into.

So What Changed?

First of all, I have to mention the great new site Django Advent. Django Advent was created as a way to publicize the new features and exciting changes coming in version 1.2. I assume this site was inspired by The 14 Days of jQuery, which did a similar thing for the great Javascript library jQuery. If you haven’t already, please visit Django Advent to get a nice overview of the changes. I developed a quick “shopping list” of features I wanted to add to my site after perusing Django Advent.

A more obvious way of finding out what changed is to check Django’s fine development documentation. In particular, check out the Django 1.2 release notes. Note that since 1.2 isn’t out yet, these notes are likely to change, so check back on them from time to time until the final release. These notes don’t tell the complete story. You’ll likely also want to read the Django Deprecation Timeline. The Django documentation is really great for an open-source project (or any project for that matter). I highly recommend spending some time familiarizing yourself with all the information that is available in the docs.

Adventures in Upgrading

So with some trepidation and giddiness, I finally ran the svn up command and pulled down a hot-off-the-press copy of trunk. What happened next? My site under development still ran, and after clicking around randomly I found nothing obviously broken. Yes, I know I need to get some tests written for my site and applications to make this more scientific and repeatable.

Multi-DB Support

One of the first things I did was cut my settings.py over to the new settings format for configuring your databases. That’s right, databases, plural. Although I don’t have an immediate use-case for this feature, I can easily see it becoming useful in the future. In any event, I appreciate the work that went into this, and it should help Django get more accepted in the enterprise world. Please read the Django Advent article on this feature for more background.
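As a rough sketch of what the new format looks like: the single-database settings are replaced by a DATABASES dictionary keyed by alias, where "default" is required. Every value below (engine choice, names, credentials, and the second "archive" alias) is a placeholder invented for illustration, not my real configuration.

```python
# settings.py fragment in the Django 1.2 multi-database style.
# All names, paths and credentials here are hypothetical placeholders.
DATABASES = {
    # The "default" alias is the one Django uses when no database is named.
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "example_site",
        "USER": "example_user",
        "PASSWORD": "change-me",
        "HOST": "",  # empty string means localhost
        "PORT": "",  # empty string means the backend's default port
    },
    # A second, purely illustrative alias to show the "plural" part.
    "archive": {
        "ENGINE": "django.db.backends.sqlite3",
        "NAME": "/path/to/archive.db",
    },
}
```

With more than one alias defined, a queryset can be pointed at a specific database with using(), for example SomeModel.objects.using("archive"), where SomeModel stands in for any model of yours.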

Cached Template Loader

The second change I made was to try out the cached template loader. This loader keeps the compiled templates in memory after they are first loaded, and thus Django doesn’t need to go to the filesystem (often multiple times) and recompile them on every request. Again, read the Django Advent article (and this one) for more explanation of this great new feature.

This was very easy to set up; however, I totally derailed myself when doing so. When I reviewed my settings.py and began to cut it over to the cached template loader, I threw out the “app_directories” loader from my list of template loaders. I didn’t need that, I thought; I have all my templates under a common templates directory (with sub-directories under it for my apps). I then happily confirmed my templates were being cached and went on to another task.

It wasn’t until a few days later that I noticed my admin wasn’t working; it couldn’t find the login template. Huh? And gee, my Admin docs stopped working too. Well after some flailing about, I realized that indeed *I* don’t use the app_directories loader, but several applications I didn’t write do. In particular, I was reminded that yes, the Django admin is, in fact, an application. Ha-ha! Whoops. Okay, I put the app_directories loader back and all was well.

The cached template loader will be useful in production, but I think I’m going to have to turn it off in development. I noticed already that when I change a template and then hit reload on my browser, gee, my change isn’t seen. This is not going to be a problem: since settings.py is just a Python file, I’ll conditionally use the correct loader for my current environment.
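One way the conditional setup could look is sketched below. It assumes that DEBUG is what distinguishes development from production, and the helper name pick_template_loaders is made up for this example. Note that the app_directories loader stays in the list in both cases, since the admin and other third-party applications rely on it.

```python
# settings.py fragment (Django 1.2-era setting names).

DEBUG = True  # assumption: DEBUG separates development from production

# The loaders that actually locate templates. app_directories is kept
# because contrib apps such as the admin ship their own templates.
BASE_TEMPLATE_LOADERS = (
    "django.template.loaders.filesystem.Loader",
    "django.template.loaders.app_directories.Loader",
)


def pick_template_loaders(debug):
    """Return plain loaders in development, cached loaders otherwise."""
    if debug:
        # Template edits show up on the next browser reload.
        return BASE_TEMPLATE_LOADERS
    # The cached loader wraps the real loaders and remembers their results.
    return (("django.template.loaders.cached.Loader", BASE_TEMPLATE_LOADERS),)


TEMPLATE_LOADERS = pick_template_loaders(DEBUG)
```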

New Messages System

Once again, I refer you to the corresponding Django Advent article for an overview. A new messaging system has been put into place, replacing the old functionality that was tied to contrib.auth. I liked how this new system took the lead from the Python logging module for its design. I can imagine situations where it may be useful to filter messages at certain levels, for example.

It was very straightforward to cut over to this new scheme. I appreciated being able to use the tags feature to tack on CSS styling to messages.

Syndication Changes

Many improvements were made to the syndication feed application. In particular, I liked the increased flexibility in the URL routing. It was easy to cut my syndication feed classes over to the new system, and along the way it looks like I was able to gain some additional RSS functionality thanks to the improvements in the base class.

I did run into one snag that was quickly resolved. I was using the cache_page decorator in my URLconf to cache the output of my feed classes. This stopped working after the upgrade. After I reported this problem on the Django Users mailing list, Django core developer Russell Keith-Magee confirmed it was a problem and wrote a ticket on this issue. Within hours it was resolved. Thanks Russell! Someday I hope to understand the root problem, which apparently is a bug in Python itself. I still need to do some more homework on how Python decorators work.

CSRF Updates

And finally, major updates to the CSRF protection system landed. Since I was not using this feature, I skipped over reading the release notes about it. Thus I was surprised when my login stopped working and started throwing CSRF-related errors. It turns out that even though I am not using the CSRF middleware, all of the contrib applications, including admin and auth, have been cut over to use it. Normally this is not a problem, but as the upgrade notes state, if you aren’t using the provided contrib templates and you POST to a contrib view, things will stop working. The solution is to add the {% csrf_token %} tag inside the form in your custom template.

I am probably going to spend some time and cut all my applications over to use the CSRF protection. Django makes it easy to do, so it is really a no-brainer to add a bit of security to my site. It will be tedious to find all the existing forms in my templates to add the {% csrf_token %}, but that is a one-time task. I can easily add them to future forms as they are created.
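The tedious hunt for forms could be partly automated. The script below is only a sketch: it uses regular expressions rather than a real HTML parser, so it can miss unusual markup, and the function names and the .html extension are assumptions made for this example. It reports the opening tag of every POST form that contains no {% csrf_token %} tag.

```python
import os
import re

# Matches a whole <form ...>...</form> element (no nesting expected).
FORM_RE = re.compile(r"<form\b([^>]*)>(.*?)</form>", re.IGNORECASE | re.DOTALL)
# Matches method="post", method='POST' or method=post in the attributes.
POST_RE = re.compile(r"""method\s*=\s*["']?post""", re.IGNORECASE)
# Matches the Django template tag, tolerating extra whitespace.
TOKEN_RE = re.compile(r"\{%\s*csrf_token\s*%\}")


def post_forms_missing_token(template_source):
    """Return the opening tags of POST forms that lack {% csrf_token %}."""
    missing = []
    for match in FORM_RE.finditer(template_source):
        attrs, body = match.group(1), match.group(2)
        if POST_RE.search(attrs) and not TOKEN_RE.search(body):
            # Slice out just the opening <form ...> tag for the report.
            missing.append(template_source[match.start():match.start(2)])
    return missing


def scan_template_dir(root):
    """Yield (path, opening_tag) for every suspicious form under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(".html"):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, encoding="utf-8") as handle:
                for tag in post_forms_missing_token(handle.read()):
                    yield path, tag
```

Calling scan_template_dir() on the project’s templates directory and printing each result gives a checklist to work through by hand.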

The End?

Well, those are all the issues I’ve run into so far, and as you can see, they are pretty minor and/or self-inflicted. I hope this write-up will help other people who are on the fence about upgrading, or at least give them pointers on where to find upgrade information. Again, the Django Advent site combined with Django’s comprehensive documentation makes this upgrade easy. The new features rock and I can’t wait to incorporate them into my site and get some mileage on them. Thanks to all the Django developers and contributors for such a great piece of software.

20 Dec 09 SG101 2.0 Status Report

This is the obligatory “why haven’t you been blogging about your project in a while” post. Yes, I suppose it is time to give a quick update.

Things slowed down dramatically on SG101 2.0 this summer. There was the 2009 SG101 convention vacation and other fun summer things to distract me. I had put up a beta site for feedback but was still missing a forums application. I got the itch to start back up again sometime in the Fall. I decided to start on a forums application myself and to see how it went. If it was too complicated I would look for a third-party solution.

This is one case where I did look at several third-party applications for ideas and to check on their status. There doesn’t seem to be a single recognized forums application in Django-land. There are a number of them, and they range from very simple to moderately complex. Many of them seem unsupported and have obvious problems. So in the end, I decided that since the forums are probably the most important part of my site, it would be best if I wrote it myself so that I could understand it completely. This includes both strengths and weaknesses. I did borrow many ideas from existing applications, and some of my initial momentum came from djangobb. However, I quickly stopped looking at other apps because Django really makes it easy to write complex web applications once you get an idea and try a few things.

My forum app contains most of the functionality of the venerable phpBB-based board I have now. I added a few things like the ability for users to flag posts as spam or abuse (I sure wish I had that now). I am considering making the first few posts of a user require approval to counter spam. But I’m not sure it is worth the effort with the “flag post” feature in-place. I might just wait and see how well that works.

I also decided to save a user’s post read and unread status in the database, instead of using cookies. Too many of my existing users complain that when their cookies expire they lose track of which threads are new. It will cost some database space, no doubt, but it is an often requested feature to fix this issue. I implemented a rolling 7-day window of thread and post read status, and in initial tests it seems to work just fine. It did add significant complexity to the design however, and I’m not looking forward to debugging that logic when a problem occurs.

After finishing the forums, I began working on my lengthy to-do list using my Trac issue tracker. I also spent a great deal of time refactoring some of the original code that I wrote over a year ago. I’ve become so much more proficient with Python and jQuery that this was inevitable. My task list has become quite small, and I am thinking about wiping the existing beta site, putting up a new one over holiday break, and launching an official beta test.

The one area that I am lacking in right now is a good design and layout. A few users have volunteered to help with that, and one in particular is showing me some really nice work. If I can just manage to implement his design we may be on to something. I may also try to reach out to someone who is familiar with Django.

There are a couple of interesting problems I either solved or worked around during this period that I should blog about. I’ll just have to find the time to do that. In particular, I wanted to share how I created an admin dashboard for user-created content that needs admin approval before being published.

I’ve also volunteered to give a “brown-bag” lunchtime talk at my employer on Python. I’ll have to prepare some slides over the holiday break for this.

15 Sep 09 Django Tip: get_object_or_404() and select_related()

I was looking at the SQL my views were generating, and I came across a couple of places where I was using get_object_or_404(), and then later following some foreign keys in the returned object. Something like this:

forum = get_object_or_404(Forum, slug=slug)
if not forum.category.can_access(request.user):
    return HttpResponseForbidden()

The problem is that two SQL queries occur here, one during the get_object_or_404(), and then another in the following if statement, when we access category, a foreign key on the forum object. It would sure be nice to somehow use a select_related() there to avoid the extra SQL query. I did some googling, and found a quick tip on one of the This Week in Django podcast pages. And yes, the documentation confirmed that get_object_or_404() can now take as a first argument either a model, a manager, or a queryset!

So now you can keep using the handy get_object_or_404() idiom, and reduce the number of queries with a slight bit of refactoring:

forum = get_object_or_404(Forum.objects.select_related(), slug=slug)
if not forum.category.can_access(request.user):
    return HttpResponseForbidden()

Very cool!

13 Jun 09 Installing memcached for use with Python and Django

I installed memcached on my production server a while back. It’s supposed to be the way to get fast and efficient caching for your Django-powered website. I remember the process as being somewhat less than satisfying. Tonight I decided to get it running on my development box, which is running Ubuntu 8.04. So I took a lot of notes and present them here for my own future reference. I hope this may help someone. And as you can see, I have a few questions myself that perhaps someone can help me with. Read on…

12 Jun 09 Django and Python Logging

I started working on a simple Django application for accepting and recording PayPal donations. While I was working on the IPN code, it suddenly occurred to me that I really needed a way to log any errors that might occur. After all, the IPN process will be initiated by PayPal completely out of my control (not counting the PayPal sandbox) and without any visual feedback. Thus, I’d like a record of the path through my code to make sure everything is working the way I expected.

I had learned about the brilliant Python logging module some time ago, and had even used it in my IRC bot application. But could you use this with Django?

I did some research and found a couple of blog posts and related Django projects. After studying them I came to the conclusion that indeed the Python logging module is an excellent way to add logging to your web application. And after working with it again, I am very impressed by the functionality that it offers and how easy it is to use. I would have killed to have something like this in my PHP days. So I laced some logging calls throughout my IPN listener code, and I’ll know soon enough if it works correctly. I’ll post more about this later. But for now, I’d like to add some links to the things I found useful related to logging in Django.

First there is this very useful blog post by Simon Willison titled “Debugging Django.” In addition to talking about logging, Simon also has tips on using the Python debugger, asserts, and some useful middleware.

Second, there is a Django application called django-logging. This application seems mainly aimed at getting your logging statements displayed at the bottom of your web pages while debugging a problem. Of course you could also hook in a logger to log to a file, which is more of what I wanted to do. Another useful feature of this application is that you can configure it to automatically log your application’s SQL queries.

And finally I looked at Fairview Computing’s Django request logging, or drlog. This project provides some middleware which adds the ability to add a unique identifier to each log entry to associate it with a given HTTP request. This allows you to easily trace a single request, even while multiple concurrent requests are happening.

Studying Simon’s blog post and the source code for the above two applications was very enlightening. In the end I decided to start with the bare Python logging facility for now, configuring it to write to a file. I was reassured to read that the Python logging module is thread safe. If I start using the logging module heavily, I may use the drlog middleware to help me map log statements to HTTP requests.
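For reference, a bare-bones file-logging setup along these lines might look like the sketch below. It is not my actual IPN listener: the logger names, the log format, the configure_logging and handle_ipn helpers, and the way the IPN parameters are checked are all assumptions made for illustration. Only the standard library logging module is used.

```python
import logging


def configure_logging(log_path, level=logging.DEBUG):
    """Attach a file handler to the (hypothetical) "donations" logger."""
    logger = logging.getLogger("donations")
    handler = logging.FileHandler(log_path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    logger.setLevel(level)
    return logger


def handle_ipn(params):
    """Illustrative stand-in for an IPN listener, laced with log calls."""
    # Child loggers propagate to the "donations" handler configured above.
    logger = logging.getLogger("donations.ipn")
    logger.info("IPN received: txn_id=%s", params.get("txn_id"))

    status = params.get("payment_status")
    if status != "Completed":
        logger.warning("Ignoring IPN with payment_status=%r", status)
        return False

    logger.debug("Recording donation for txn_id=%s", params.get("txn_id"))
    return True
```

In a Django project, configure_logging() would be called once at startup (from settings.py, for example) so that the handler is not attached twice.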

03 May 09 Django-Elsewhere

I just got finished integrating Leah Culver’s django-elsewhere application. Django-elsewhere was formerly Django-PSN (Portable Social Networks) and was originally created for the now defunct social networking site Pownce. This nifty application allows your users to add an arbitrary number of social networks, websites, and instant messengers to their profile. The application even comes with many icons for widely known sites.

In my previous design I had just stuck a few fields in my user profile for websites and a few of the common instant messengers. This was limiting, and I had been thinking about expanding it to a more general solution when I stumbled across this application.

To integrate it with my site, I created a template tag to display a user’s “elsewhere” sites, and I made a view and template to allow a user to edit their sites. This code was based off the example view and template that came with the application. In general the django-elsewhere code quality is quite high. There are still a few print statements in the code base, but that’s all I can find fault with right now.

Thank you django-elsewhere team for the big time saver!

12 Apr 09 Using html5lib to Sanitize User Input

Based on this blog post by Django co-BDFL Jacob Kaplan-Moss, I wanted to try using html5lib to sanitize user input. I’m using Markdown on most of the site. But in one particular place (news items), I am (currently) allowing users to submit HTML news stories with the TinyMCE Javascript editor. This is mainly because my users like to copy and paste content from sites like MySpace, and TinyMCE might be easier for them to use than Markdown. I may revisit this decision, but for now we’ll go with it.

I was using the lxml sanitizer for this purpose. But given the high praise html5lib received from Jacob, and after studying the source code of both, I have greater confidence in html5lib, even if it is an order of magnitude slower. This isn’t going to get used more than a few times a day, so speed isn’t a concern.

Never having used html5lib, or any other HTML/XML parser before, it was a bit confusing to figure out how to use it for this task. After studying the code and the html5lib news group, I came up with the following bit of code I thought I would share. Comments are extremely welcome.

import html5lib
from html5lib import sanitizer, treebuilders, treewalkers, serializer

def sanitizer_factory(*args, **kwargs):
    san = sanitizer.HTMLSanitizer(*args, **kwargs)
    # This isn't available yet
    # san.strip_tokens = True
    return san

def clean_html(buf):
    """Cleans HTML of dangerous tags and content."""
    buf = buf.strip()
    if not buf:
        return buf

    p = html5lib.HTMLParser(tree=treebuilders.getTreeBuilder("dom"),
            tokenizer=sanitizer_factory)
    dom_tree = p.parseFragment(buf)

    walker = treewalkers.getTreeWalker("dom")
    stream = walker(dom_tree)

    s = serializer.htmlserializer.HTMLSerializer(
            omit_optional_tags=False,
            quote_attr_values=True)
    return s.render(stream)

I haven’t tested it extensively yet, but it seems to do the trick. I understand a future version of html5lib will have an option to strip offending tags out completely. Right now they are simply rendered harmless and remain in the output (their angle brackets are escaped as &lt; and &gt;). This is fine, as I can see them in the admin as I review submitted stories.

07 Apr 09 I contributed to Django!

Check out this ticket! Granted, it is just a typo fix to a Python docstring, but you got to start somewhere. :) What a great feeling in any event.

I hope I can contribute more meaningful features and bug fixes in the future.

06 Apr 09 Infrastructure: Trac & Subversion

I’ve been wanting to get some kind of issue tracker up and running for some time now. Trac seems like a great choice. We’ve used it where I work, and the Django project uses it. I even managed to install it on Windows at work. Still, I was kind of dreading trying to get it working on the dedicated server I rent. I finally gathered the strength and tackled this problem this weekend, and it went far easier than I imagined.

Subversion

First of all, I decided I might as well upgrade my Subversion (SVN) server while I was at it. I saw that Subversion 1.6 is out now. However, reading the fine print, I noticed that they seem to have changed their Python bindings in 1.6, and I wasn’t sure if Trac is compatible with this. So without doing any further research I decided to just run the last stable version before that, 1.5.6.

My dedicated server is running Fedora Core 6, which isn’t maintained anymore, so to my knowledge there is no way of getting a binary package for these recent builds of SVN. I need to build from source. I had done this once before, and I even took detailed notes (which I had forgotten about). Building from source is fairly easy, but there is one gotcha on the AMD64 server I run: you need to invoke the configure script with the --enable-shared switch. Luckily I wrote this down the first time I did this. Getting the required dependencies for the source build isn’t too hard. The Subversion folks helpfully package some of the less readily available dependencies, so it is just a matter of grabbing them and untarring them on top of the unpacked source tarball.

Since I wanted to integrate Trac with my Subversion repository, I needed to ensure I built the Python Subversion bindings. I used Yum, the package manager that comes with Fedora, to make sure I had SWIG installed before I ran configure to build SVN. Then it is a simple matter of building the Python SWIG bindings after Subversion proper is built. This is explained very well in the Subversion documentation.

This seemed to go well, although I had a minor heart attack when Apache crashed the first time I tried to restart it with the new SVN in place. Another restart and it was fine. Hmmm. In short order I had upgraded my existing repository and things seemed to be working fine.

A New Subdomain

I then created a new subdomain to host my issue tracker. I rely on the Plesk control panel to do this lifting for me. It came installed with the server, and I rely on it heavily to configure Apache, the mail server, etc. I’m not a hard core server admin, so this is a big help. Although I can see the day when the training wheels can come off as I become more familiar with Linux and these tools. I can sort of see what Plesk is doing by examining the config files it creates and it doesn’t appear to be rocket science. Still, it is a big time saver for me.

Trac

Getting Trac installed requires getting all the dependencies in place first. In most cases, I was able to use Yum to get the dependencies in binary form from the Fedora repository. Despite the fact that Fedora Core 6 is pretty old, the version numbers of the dependencies in the repository were still compatible with the newest version of Trac. The one notable exception was the template engine Trac uses, Genshi. In this case a simple “easy_install Genshi” did the trick. Nice.

I might have been able to easy_install Trac, but the docs say that this only works for Python 2.5 and 2.6. I’m still running 2.4 on the dedicated server. Upgrading my OS is definitely on the long term to-do list, but I must take baby-steps for now. But it was a simple matter of grabbing the Trac tarball, untarring it, and doing the usual “python setup.py install”. It went flawlessly.

Now luckily I had set up Trac at work before, so I already knew what to do. I ran the command-line Trac admin tool to create a project and tied it to my new Subversion repository. Trac comes with a development server, and I ran that after configuring the project. I could then point my browser at my server and see my new Trac project for the first time! Things are cooking at this point.

Mod_WSGI

Of course I can’t use the development server for real work. So the next step was to get Apache to serve my Trac project. I once again chose mod_wsgi as the deployment method, after just recently converting The Madeira site from mod_python to mod_wsgi. The mod_wsgi documentation is excellent, and a wiki page covers integrating Trac and mod_wsgi in great detail. After studying the docs for a short while I had the magic Apache configuration down. I restarted Apache, and once again I was amazed that things were working on the first try. I had been pretty lucky so far. (In fact the most trouble I had that day was trying to change the logo on the Trac site!)

At Last…

I was now ready to configure my Trac project and get my new Subversion repository loaded. I had an existing Subversion repository that I was doing all my work in. However I had checked in some settings files that contained database password information. Shortly after realizing this I just locked the whole repository down. Since then, I have learned the Django settings.py and local_settings.py trick, and have placed the sensitive information in the local_settings.py file (which is not controlled in SVN). Now I can have a public read-only repository again.
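For anyone who hasn’t seen the trick, here is a minimal sketch of it. The setting names follow the pre-1.2 single-database style, and every value shown is a placeholder, not my real configuration. The version-controlled settings.py holds safe defaults and then pulls in overrides from local_settings.py, which lives only on the server and is never checked in.

```python
# settings.py fragment (checked in to Subversion).
# All values are illustrative placeholders.
DEBUG = False
DATABASE_ENGINE = "mysql"
DATABASE_NAME = "example_db"
DATABASE_USER = ""       # real value is supplied by local_settings.py
DATABASE_PASSWORD = ""   # real value is supplied by local_settings.py

# local_settings.py sits next to this file but is NOT under version
# control. Anything it defines overrides the defaults above.
try:
    from local_settings import *  # noqa: F401,F403
except ImportError:
    # No local overrides present; keep the defaults.
    pass
```

The local_settings.py file itself is then just a handful of assignments, such as DATABASE_USER and DATABASE_PASSWORD, plus DEBUG = True on a development box.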

So here it is, ready for beta testing: http://code.surfguitar101.com. Now there isn’t anything stopping me; I have to do the real work of deploying a beta version of SG101 2.0 for testing and feedback.

29 Mar 09 Event Calendar: Time Zone Picker and Updates

I ended up creating a time zone picker for the event calendar. I saw the idea on the web somewhere. The problem is that there are nearly 400 common time zones in the database. Since every time zone is named in the format “area/location”, I created an area select and a location select. That broke up the time zones nicely, although some of the areas still have far too many entries to be completely convenient. I wrote a short Python script that parsed the pytz common time zones and generated a Javascript object literal to contain the select menus contents. Here is a screen shot showing it in action:

[Screen shot: Time Zone Picker]

When you select an area (the left-most control), the location select fills with the appropriate options. When the form is submitted, some Javascript runs to take the two select values, put them together, and populate a hidden time zone input field with the result. So, in the example above, when the form is submitted, the hidden field receives “US/Pacific”. Likewise, when the form is displayed, the hidden field is parsed and the two select controls are set accordingly. This works pretty well, although I think I could have done a better job of modularizing this code in case I need to use it in another place on the site (such as in a user’s profile). I will definitely do this later.
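The generator script mentioned earlier could be sketched roughly as follows. This is a reconstruction for illustration, not the script I actually ran: the function names and the timeZones variable name are made up, and names without a slash (such as “UTC”) are simply skipped here, which is a simplification. The functions take any iterable of zone names, so in practice you would pass in pytz.common_timezones.

```python
import json
from collections import defaultdict


def build_zone_map(zone_names):
    """Group "area/location" names into {area: [location, ...]}."""
    areas = defaultdict(list)
    for name in zone_names:
        # Split on the first slash only, so multi-part names such as
        # "America/Argentina/Buenos_Aires" keep the rest as the location.
        area, slash, location = name.partition("/")
        if not slash:
            continue  # simplification: skip names like "UTC"
        areas[area].append(location)
    return {area: sorted(locations) for area, locations in sorted(areas.items())}


def to_javascript(zone_map, var_name="timeZones"):
    """Render the map as a Javascript object literal assignment."""
    # JSON is a subset of Javascript object literal syntax.
    return "var %s = %s;" % (var_name, json.dumps(zone_map, indent=2, sort_keys=True))
```

The area select is then filled from the object’s keys, and the location select from the array stored under the chosen area.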

I’ve decided to tackle recurring events later, as it seems a bit involved, and as I stated, very few events on the calendar need this capability. So with the time zone picker in place, and the corresponding code on the server side (thanks to pytz), I can now accurately add events to the event calendar without losing local time information.

I also sat down finally and converted The Madeira’s website from mod_python to mod_wsgi. This wouldn’t have been possible without the excellent documentation that mod_wsgi has. I feel this will scale better, and it will allow me to more easily run multiple Python web applications side by side. I am anxious to get a Trac issue tracker running as well as a beta version of the new site.

The rest of the weekend was spent working the “to-do” list for the site in preparation for deploying a beta version. I really do need to get an issue tracker going to capture all the ideas and work I need to complete.
