I have become a big fan of Django data migrations
by mark | 4 Jul 2022, 8:45 p.m.
One of the more irritating things about Django is keeping track of what you have done to the database. For things like these blog posts you don't normally care: you create the object, insert it into the database, and render it on demand. But when your database model or some parameter-style data is updated, you want to be able to control programmatically what is going on.
If you read many resources on Django, they dwell at length on a feature called fixtures. This attention is a bit of a red herring. You use fixtures to populate a DB right at the beginning of a project, or to take a dump of live data for test purposes, because of how the Django test framework works. That's it. If you are migrating or updating data that isn't user-generated content, you want to use data migrations. These have several benefits:
- Can be reversed: you can back out changes if you turn out to have messed up
- Can be reviewed and debugged: changes can be made iteratively until they're right
- Can be versioned: stick them in git and you can see what has gone on and why, rather than trying to infer what has gone on from database state (you will never do this).
- Can basically do fixtures: you can point the data migration at some other source and load it into the DB
- Zero downtime: sensible deployment processes will replicate exactly your test and prod DBs (e.g. on Heroku I do git push heroku master, give it a few minutes, and it is live).
The major downside is that they are a bit complicated to set up but once you get the hang of it they're not bad.
So what do you need to do?
Make an empty migration
Easy. At the command prompt utter the incantation
python manage.py makemigrations --empty yourappname
where yourappname is, well, your app name. That's it.
Define functions
In the migration file Django created for you, there is a new Migration class with an operations attribute defined. Make this look like
operations = [migrations.RunPython(forward_func, reverse_func)]
Create a forward function
This is where you create the new objects in your model. Function signature should be
def forward_func(apps, schema_editor):
To get a model, use my_model = apps.get_model("yourappname", "yourmodelname"). Don't import the model and call it directly. The imported class reflects your current code, which may not match the database schema at the point this migration runs; apps.get_model gives you the model as it stood at that point in the migration history. I can't tell you how to create the new data objects; this depends on your use case. Try my_model.objects.create().
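As a rough sketch of what a forward function might look like: the Category model, its slug and name fields, and the seed values below are all made up for illustration, so substitute your own. It uses get_or_create rather than create so that a row which already exists is left alone.

```python
# Hypothetical seed data: the Category model and its slug/name fields
# are assumptions for illustration, not something from a real project.
DEFAULT_CATEGORIES = [
    ("news", "News"),
    ("howto", "How-to"),
]


def forward_func(apps, schema_editor):
    # Ask the migration's app registry for the model instead of
    # importing it from yourappname.models.
    Category = apps.get_model("yourappname", "Category")
    for slug, name in DEFAULT_CATEGORIES:
        # get_or_create leaves an existing row with this slug untouched.
        Category.objects.get_or_create(slug=slug, defaults={"name": name})
```

The function goes in the migration file above the Migration class, so that the RunPython operation can refer to it by name.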
Create a reverse function
This is the reverse of the forward function, and it takes the same two arguments. Identify the objects created in forward_func() and delete() them.
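A matching reverse function could be sketched like this, again assuming a hypothetical Category model whose rows can be identified by a slug field. The list of slugs is an assumption standing in for whatever your forward function created.

```python
# Slugs of the rows the forward function is assumed to have created.
# Both the model and the slug field are hypothetical.
SEEDED_SLUGS = ["news", "howto"]


def reverse_func(apps, schema_editor):
    # Same rule as going forward: fetch the model through the registry.
    Category = apps.get_model("yourappname", "Category")
    # Delete only the rows this migration added, matched by slug,
    # so anything created elsewhere is left in place.
    Category.objects.filter(slug__in=SEEDED_SLUGS).delete()
```

Matching on an explicit list of keys is a deliberate choice here: deleting everything in the table would also remove rows that other migrations or users added later.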
Actually do the migration
At the command line, do python manage.py migrate. This should work. It will give you a numbered success message if it works. If it does not, fix your migration. Make sure to test both forward and backwards!
The migrate command takes you forward. To go back, first note the number in the success message (e.g. 0011). Then do python manage.py migrate yourappname 0010, naming the migration just before it. If it doesn't work, fix your reverse function.
The great thing about this is the ability to move between DB versions in a controlled way. This is just not possible with fixtures as you can't really tell, without agonisingly reconciling the fixture file with the DB row-by-row, whether the fixture is loaded. Data migrations give you the control you need to do updates safely.