Databases


25
Sep 11

Quickstart your Django project in 60 seconds

Sane-default boilerplate to avoid repeating yourself each time you start on a new Django project.

Django comes with a lot of batteries included, but it takes some time to set them up. And usually, the initial few steps are always the same. We can refactor these into a boilerplate empty Django project, and use it as a base for new projects, instead of starting from scratch each time.

That’s what I did with DJ Skeletor. It’s an empty, relocatable (i.e. it uses no hardcoded paths) Django project, set up to my liking and with a few Django apps that I use in virtually all projects (impatient? jump directly to the examples).

Django settings

The default Django settings organisation is rather simplistic – it’s just one file. But as soon as you deploy the project somewhere other than the computer you’re writing it on, you’re going to have at least two sets of settings: development and production.

There are several ways to handle this in Django. I prefer the following:

  • The settings module is split into several submodules and lives in the settings/ directory.
  • Settings that are the same for both development and production live in settings/base.py.
  • Development-specific settings are provided by settings/dev.py, and production-specific settings are in settings/prod.py; a symbolic link settings/local.py points to the one that should be used in a specific environment.
  • The settings module imports settings.base and then settings.local.

This allows me to have all of the settings code (even per-environment settings) in the git repository. By using the symlink (instead of switching on the host name or IP address, as some do), I can activate the variant I need without regard to the rest of the environment. This allows me to, e.g., have several variants of the same project running on the same machine.
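Here is a minimal, self-contained sketch of that layout. It builds a throwaway settings/ package in a temporary directory, links local.py to the chosen variant and imports the result. The setting values and the helper names (build_settings_package, load_settings) are illustrative assumptions, not DJ Skeletor’s actual files.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical file contents; a real project would hold full Django settings.
SETTINGS_FILES = {
    "base.py": "DEBUG = False\nTIME_ZONE = 'UTC'\n",
    "dev.py": "DEBUG = True\n",
    "prod.py": "DEBUG = False\n",
    # The package itself layers the active variant on top of the base.
    "__init__.py": "from .base import *\nfrom .local import *\n",
}


def build_settings_package(project_root, variant):
    """Write settings/ under project_root and link local.py to the variant."""
    package_dir = os.path.join(project_root, "settings")
    os.makedirs(package_dir)
    for name, body in SETTINGS_FILES.items():
        with open(os.path.join(package_dir, name), "w") as handle:
            handle.write(body)
    # Same effect as: cd settings && ln -s dev.py local.py
    os.symlink(variant + ".py", os.path.join(package_dir, "local.py"))
    return package_dir


def load_settings(variant):
    """Build a fresh project with the given variant and import its settings."""
    project_root = tempfile.mkdtemp()
    build_settings_package(project_root, variant)
    # Forget any previously imported copy so the new package is picked up.
    for name in list(sys.modules):
        if name == "settings" or name.startswith("settings."):
            del sys.modules[name]
    importlib.invalidate_caches()
    sys.path.insert(0, project_root)
    try:
        return importlib.import_module("settings")
    finally:
        sys.path.remove(project_root)
```

Because local.py is imported last, anything it defines overrides the value from base.py, while settings it doesn’t mention (TIME_ZONE here) fall through from the base.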

Database

A test SQLite database is set up for the development environment. The database file is configured to be test.db in the project root by default.
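One way to keep the database path relocatable is to derive it from the location of the settings file instead of hardcoding it. The sketch below assumes the settings/ layout described above; the helper name sqlite_databases is hypothetical and not necessarily how DJ Skeletor does it.

```python
import os


def sqlite_databases(settings_file, filename="test.db"):
    """Return a DATABASES dict whose SQLite file sits in the project root.

    settings_file is the path of a module inside settings/, so the project
    root is the parent of the directory that holds it.
    """
    settings_dir = os.path.dirname(os.path.abspath(settings_file))
    project_root = os.path.dirname(settings_dir)
    return {
        "default": {
            "ENGINE": "django.db.backends.sqlite3",
            "NAME": os.path.join(project_root, filename),
        }
    }


# In settings/dev.py this would read: DATABASES = sqlite_databases(__file__)
```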

South

South is an awesome Django app for handling database schema migrations (i.e. table/field changes when you modify your models). If you’re using a database with Django, you want to use South as well.

Django Debug Toolbar

Django Debug Toolbar is very handy for inspecting what happens when you request a page from Django. It lists things such as the SQL queries executed (including how long they took and why they were executed), signals, logging, exceptions, request params, etc. I use it all the time for finding and fixing suboptimal database queries.

The app usually defines a whitelist of client IPs for which the toolbar is shown. As I’m not on a static IP, I find it more useful to have the toolbar show all the time in the development environment, and never in production.
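A rough sketch of such a rule: Debug Toolbar has a SHOW_TOOLBAR_CALLBACK option that decides per request whether to display the toolbar. The factory name make_show_toolbar is hypothetical, and depending on the toolbar release the option takes either the callable itself (as here) or a dotted path to it, so check the docs for the version you install.

```python
def make_show_toolbar(debug):
    """Return a callback that ignores the client IP and just follows DEBUG."""
    def show_toolbar(request):
        return bool(debug)
    return show_toolbar


# In settings/dev.py, where DEBUG is True; prod.py simply leaves the toolbar out.
DEBUG = True
DEBUG_TOOLBAR_CONFIG = {
    "SHOW_TOOLBAR_CALLBACK": make_show_toolbar(DEBUG),
}
```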

Sentry

Sentry helps with exception logging and viewing for your Django project. It can handle multiple Django installs where the logs can be managed from a single place, or it can be used on a per-project basis. The latter is how it’s preconfigured in DJ Skeletor.

Besides logging the exceptions, Sentry can also catch normal logs you create with Python’s logging system. This is also preconfigured in DJ Skeletor.
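From the application’s side this is just ordinary use of the logging module. The sketch below uses only the standard library: CollectingHandler is a stand-in for whatever handler the error tracker provides (its real class name is deliberately not shown here), and the logger name and parse_quantity function are made up for the example.

```python
import logging


class CollectingHandler(logging.Handler):
    """Stand-in for an error tracker's handler: keeps records in a list."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def parse_quantity(raw, logger):
    """Parse an integer, logging the failure instead of raising."""
    try:
        return int(raw)
    except ValueError:
        # logger.exception logs at ERROR level and attaches the traceback.
        logger.exception("Could not parse quantity %r", raw)
        return None


handler = CollectingHandler()
logger = logging.getLogger("myproject.orders")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("Order form submitted")
parse_quantity("three", logger)
```

Both the plain info message and the caught exception (with its traceback) reach the handler, which is the behaviour you want to end up in one place.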

Example

Enough talk, let’s see some action. First, we’ll create a virtual environment (you do use virtualenv, right? if not, you should) and install prerequisite packages:

    virtualenv --no-site-packages myenv
    cd myenv
    source bin/activate
    pip install django django-debug-toolbar south django-sentry

Next, we’ll clone the boilerplate project:

    git clone https://github.com/senko/dj-skeletor.git myproject
    cd myproject

Let’s activate the development environment and prepare the database:

    cd settings
    ln -s dev.py local.py
    cd ..
    python manage.py syncdb
    python manage.py migrate sentry

All done, let’s run it!

    python manage.py runserver

See? Piece of cake. With boring initial set-up out of our way, we can focus on building an awesome web site or app.

DJ Skeletor is open source and is available on GitHub. Feel free to use it or base your own boilerplate on it – if you do, please share your thoughts in the comments.

Bonus: HTML boilerplate

If you’re a programmer and couldn’t design if your life depended on it, it’s useful to have the user interface boilerplate handy as well.

If you need a clean, well designed (but definitely not unique or artistic) user interface for your HTML app, I heartily recommend Bootstrap, created and open sourced by the fine folks at Twitter.

If you do need to make a proper, unique design, again there’s no need to start from zero: use HTML5 Boilerplate, which takes care of a myriad of little gotchas for you; and there’s a mobile option as well.


23
Dec 10

MongoDB gotchas for the unaware user

I like MongoDB, mostly because it’s so simple and natural to use from dynamic languages. I’ve used it in two of my projects so far (Encode and Sparrw) and, while I’m really happy with the choice, there were a couple of issues that caught me unaware and cost me a few additional hours of head-scratching or fixing. Some of these things are no-brainers if you have multiple machines and dedicate some of them to the database, but my use cases are low-traffic web apps on a single (virtual) server.

These are all simple and documented things, and are not bugs (well, depending on who you ask). If you’ve read all the docs, you’ve probably read about most of them at some point. So did I, but I didn’t remember them at the right moment and then had to fix things.

Use the 64-bit version. The 32-bit version has a limit of about 2.5GB of data stored. Yeah, it’s probably enough for playing around. But when you start configuring your production (or staging) system, remember to choose the 64-bit flavor, since you can’t just “fix” that later on; you’ll have to reinstall everything.

Have a slave db on another machine. If your MongoDB instance crashes (or gets killed due to OOM, or the whole system crashes), there’s no guarantee about what state your data is in. You can run repair, but this is like running fsck or playing the lottery – you never know what you’re going to get. So you really want to have a slave (or a replica set), and you want that slave to be on another server. This is really cumbersome if one VPS is enough for all your (other) needs, but there’s no avoiding it if you value your data.

MongoDB 1.8 update: From 1.8 onwards, MongoDB supports journaling, making it safe to use on a single server. Journaling is not on by default as of yet, and it’s recommended to use journaling only on a 64-bit version.

Secure it. By default, MongoDB uses no authentication and listens on all network interfaces (this is true for the version you can get directly from their site; various Linux distributions, such as Debian and Ubuntu, have a sane default of binding to 127.0.0.1 only), meaning anyone can access your db from anywhere in the world. If you use it on a publicly visible server, this is a bit of a problem. You could either set up authentication or tell MongoDB to listen only on localhost. I prefer the latter because I’m the only user on my server anyway.
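After restricting the listener (for instance with mongod’s bind_ip option set to 127.0.0.1), it’s worth checking from the outside that the port really is closed. Below is a small standard-library sketch of such a check; accepts_connections and listening_port are hypothetical helper names, and the throwaway listener only stands in for mongod so the example runs without a database.

```python
import socket


def accepts_connections(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def listening_port(bind_host="127.0.0.1"):
    """Open a throwaway listener (a stand-in for mongod) and return it."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((bind_host, 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    return server
```

Run from another machine, accepts_connections with your server’s public address and MongoDB’s default port 27017 should come back False once the database listens on localhost only.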

Always use getLastError. Unless you need lightning speed, it pays to wait a little to be sure the database is ok with your changes, and that there were no errors modifying the data – if nothing else, then to log it in your app so you know something bad happened. Or, if you’re certain you don’t need getLastError(), at least never mix using and not using it on the same collection. MongoDB doesn’t guarantee that commands will be executed in the order given. In my test code, I had an “async” remove() call (i.e. I didn’t wait for it to finish) and was then inserting new entries, and the previous remove() happily removed them (all of them, or some, or none, depending on the race). Those were a very confusing few hours.
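With a current PyMongo, the same idea shows up as acknowledged writes: delete_many() and insert_many() return result objects carrying an acknowledged flag. The sketch below refuses to insert until the remove has been acknowledged. Note the assumptions: replace_all is a hypothetical helper, and FakeCollection is an in-memory stand-in that only mimics the parts of the collection interface used here, so the example runs without a server.

```python
from types import SimpleNamespace


def replace_all(collection, documents):
    """Empty a collection, then insert documents, refusing unacknowledged writes.

    collection is expected to behave like a PyMongo Collection: delete_many()
    and insert_many() return result objects with an `acknowledged` flag.
    """
    deleted = collection.delete_many({})
    if not deleted.acknowledged:
        raise RuntimeError("remove was not acknowledged; refusing to insert")
    inserted = collection.insert_many(documents)
    if not inserted.acknowledged:
        raise RuntimeError("insert was not acknowledged")
    return deleted.deleted_count, len(inserted.inserted_ids)


class FakeCollection:
    """In-memory stand-in so the sketch runs without a MongoDB server."""

    def __init__(self, documents=(), acknowledged=True):
        self.documents = list(documents)
        self.acknowledged = acknowledged

    def delete_many(self, query):
        # The stand-in ignores the query and removes everything.
        count = len(self.documents)
        self.documents = []
        return SimpleNamespace(acknowledged=self.acknowledged, deleted_count=count)

    def insert_many(self, documents):
        self.documents.extend(documents)
        ids = list(range(len(documents)))
        return SimpleNamespace(acknowledged=self.acknowledged, inserted_ids=ids)
```

Against a real database you would pass an actual collection object instead of the stand-in, and keep the write concern the same for every write to that collection.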

There’s a lot of documentation online and a lot more info can be found on various forums, but it’s also good if you can get this information in a condensed form. For this I’ve found MongoDB: The Definitive Guide book and 10gen videos very helpful – for example, the deployment strategies video is great for a start.

I hope these few tips from my experience will help you avoid the mistakes I made while trying to use MongoDB :-)

Edit: there are a lot of useful comments about this on Hacker News as well.