Erik Ramsgaard Wognsen

Thoughts & technology

Checks for Your Django Project

The Django web framework has a management command, check, that inspects your code for various problems. For example, it checks that your database CharFields define the max_length attribute and, when run with --deploy, that you have not left DEBUG = True in deployment. This is good for catching mistakes in using the framework before they show up in production. And for your own business logic, you have unit tests. But there are still many other things you can get wrong, and they are easy to forget unless they are incredibly easy to check. So I wrote a script to check every problem I could think of in one go.

First up are the unit tests. I want the script to tell me something only if there is a problem, as is customary for Unix programs. With verbosity 0, and head -n -4 cutting off the last four lines (the test summary), only failed test cases are shown:

./manage.py test -v 0 --noinput 2>&1 | head -n -4
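
Negative line counts for head are a GNU coreutils extension, so head -n -4 may not work on BSD or macOS. A minimal sketch of a portable replacement, assuming awk is available; drop_last is a hypothetical helper name, not part of the original script:

```shell
# Print stdin without its last N lines (default 4), without relying on
# GNU head's negative line counts. N must be at least 1.
# drop_last is a hypothetical helper name used only for illustration.
drop_last() {
    awk -v n="${1:-4}" '
        NR > n { print buf[NR % n] }   # the line n positions back is now safe to emit
        { buf[NR % n] = $0 }           # remember the current line in a ring buffer
    '
}
```

It would be used as ./manage.py test -v 0 --noinput 2>&1 | drop_last 4.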

Alternatively, you can run them with a code coverage check. Here, failed tests are shown and the total code coverage is shown if it is below 50%:

coverage run --branch --source='.' --omit='*/migrations/*,*test*,*settings*' \
    manage.py test -v 0 2>&1 | head -n -4
coverage report --fail-under=50 >/dev/null || { coverage report | sed -n '1p;$p'; echo; }

The first of the “hidden” things that are easy to forget is a change to the data model that still needs to be migrated in the database. Unfortunately, the makemigrations management command doesn’t have a usable exit status, so I work around that by comparing its output in order to print text only when there is a problem:

[ "$(./manage.py makemigrations --dry-run)" != "No changes detected" ] && \
    ./manage.py makemigrations --dry-run && echo
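
Newer Django releases (1.10 and later, as far as I know) add a --check flag to makemigrations that makes it exit with a non-zero status when model changes have no migration. A minimal sketch of the same check built on that flag; check_migrations is a hypothetical function name, and the command to run is passed in as arguments so the function can be tried without a real project:

```shell
# Report unmigrated model changes using the exit status of
# `makemigrations --check --dry-run` instead of comparing output text.
# check_migrations is a hypothetical name; with no arguments it assumes
# ./manage.py in the current directory.
check_migrations() {
    [ "$#" -gt 0 ] || set -- ./manage.py
    if ! out="$("$@" makemigrations --check --dry-run 2>&1)"; then
        echo "=== Unmigrated model changes:"
        # Depending on the Django version, the command may or may not
        # list the pending migrations, so only print what it gave us.
        [ -z "$out" ] || printf '%s\n' "$out"
        echo
    fi
}
```

In the script, the two-line workaround would then become a single call to check_migrations.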

I also run flake8, which wraps the code style checker pep8 (since renamed pycodestyle), the static error checker Pyflakes, and, if you pass the flag --max-complexity=10, the code complexity checker mccabe.

flake8 . || echo

Translation

For a multilingual project, you must remember to scan for changes to translatable strings and also to translate them. The makemessages management command goes over the source and finds all strings marked for translation and makes or updates the translation “portable object”/PO file. So ideally, I could just run that command and see if there are any changes to the PO file.

However, makemessages writes a generation timestamp into the file, so the output differs every time the command is run. On top of that, the command can find the translatable strings in a different order even when their content didn’t change (I think this happens between different versions of Django).

So, I compare the before and after versions of the PO file with the timestamps stripped out, and sorted such that ordering doesn’t matter. This sorting trick will cause false negatives if two strings are swapped in the source code, but this seems unlikely and therefore preferable to getting false positives from makemessages’ inability to search the source in a consistent order.

sed '/POT-Creation-Date/d' locale/da/LC_MESSAGES/django.po > __po1
mv locale/da/LC_MESSAGES/django.po{,_orig}
cp locale/da/LC_MESSAGES/django.po{_orig,}
./manage.py makemessages -v 0 -a --no-location --no-obsolete
sed '/POT-Creation-Date/d' locale/da/LC_MESSAGES/django.po > __po2
mv locale/da/LC_MESSAGES/django.po{_orig,}
diff <(sort __po1) <(sort __po2) >/dev/null || \
    { echo "=== Translation not up to date:" && diff __po1 __po2 || echo; }
rm __po1 __po2
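
The comparison itself can be isolated into a small function, which makes it easy to try on two files by hand. A minimal sketch, assuming the same normalization as above (timestamp line removed, lines sorted); normalized and po_changed are hypothetical names:

```shell
# Strip the generation timestamp and sort, so that neither the timestamp
# nor the order of entries counts as a difference.
normalized() {
    sed '/POT-Creation-Date/d' "$1" | sort
}

# Succeeds (exit status 0) only if the two PO files differ after
# normalization. Both function names are hypothetical.
po_changed() {
    tmp="$(mktemp)"
    normalized "$1" > "$tmp"
    if normalized "$2" | diff "$tmp" - >/dev/null; then
        status=1    # identical after normalization: nothing to report
    else
        status=0    # a real difference remains
    fi
    rm -f "$tmp"
    return "$status"
}
```

Two files that differ only in timestamp or entry order count as unchanged, while an added or edited string is reported.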

I back up the PO file before running makemessages and restore it afterwards. By doing this as mv A B; cp B A; mv B A, rather than just cp A B; mv B A, the file retains its inode. Thus, editors will not falsely see the file as changed outside the editor.
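
The mv/cp/mv sequence can be wrapped in a helper that runs any command against a throwaway copy. This is a sketch only; with_restored_file is a hypothetical name, and the backup suffix _orig follows the script above:

```shell
# Run a command that may rewrite FILE, then put the original back under
# its original inode. with_restored_file is a hypothetical helper.
# Usage: with_restored_file FILE COMMAND [ARGS...]
with_restored_file() {
    file="$1"
    shift
    mv "$file" "${file}_orig"    # the original inode now lives under the backup name
    cp "${file}_orig" "$file"    # the working copy gets a fresh inode
    if "$@"; then                # the command is free to modify the working copy
        status=0
    else
        status=$?
    fi
    mv "${file}_orig" "$file"    # original content and inode are back in place
    return "$status"
}
```

Comparing the output of ls -i on the file before and after a call shows the same inode number, which is what keeps editors from reporting an external change.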

Even if you have already run makemessages, you still have to translate the newly found strings. I check for both untranslated strings and unverified automatic translations. The former appear as msgstr "", but not every msgstr "" means you forgot a translation; some of them just mark the beginning of a string spanning multiple lines. So I do some sed magic to find the real culprits:

sed '$a\\' locale/da/LC_MESSAGES/django.po | tac | \
    sed '/^$/N;/\nmsgstr ""$/,/^msgid/!d' | tac
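
For readers who find the sed and tac pipeline hard to follow, the same idea can be sketched in awk. This is an alternative, not the method used in the script: it assumes entries are separated by blank lines, ignores plural forms (msgstr[0] and so on), and prints the whole entry including its comments. untranslated_entries is a hypothetical name:

```shell
# Print every PO entry whose msgstr is really empty, that is, an empty
# msgstr that is not followed by continuation lines.
# untranslated_entries is a hypothetical name; plural forms are not handled.
untranslated_entries() {
    awk '
        BEGIN { RS = ""; FS = "\n" }   # one entry per record, one line per field
        {
            for (i = 1; i <= NF; i++) {
                # An empty msgstr is only a gap when no continuation
                # line (one starting with a double quote) follows it.
                if ($i == "msgstr \"\"" && (i == NF || $(i + 1) !~ /^"/)) {
                    print $0
                    print ""
                    break
                }
            }
        }
    ' "$1"
}
```

The PO header and multi-line translations both have a quoted line right after msgstr "", so they are skipped.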

While scanning, if makemessages finds a string that is close to something you already translated, it will insert the existing translation and mark it as a “fuzzy” match, so you can check it and remove the fuzzy label once the translation is correct. I find such leftover “fuzzy” labels. The PO file header normally carries a fuzzy flag of its own, which is why the count is compared against 1 and tail skips the first match:

[ "$(grep -c ', fuzzy' locale/da/LC_MESSAGES/django.po)" -gt 1 ] && \
    echo "=== Fuzzy translation:" && \
    grep -A2 ', fuzzy' locale/da/LC_MESSAGES/django.po | tail -n +5 && echo

The Full Script

With all the components of the script done, the whole script looks like this. I call it ok.sh:

#!/bin/bash
coverage run --branch --source='.' --omit='*/migrations/*,*test*,*settings*' \
    manage.py test -v 0 2>&1 | head -n -4
coverage report --fail-under=50 >/dev/null || { coverage report | sed -n '1p;$p'; echo; }

[ "$(./manage.py makemigrations --dry-run)" != "No changes detected" ] && \
    ./manage.py makemigrations --dry-run && echo

flake8 . || echo

sed '/POT-Creation-Date/d' locale/da/LC_MESSAGES/django.po > __po1
mv locale/da/LC_MESSAGES/django.po{,_orig}
cp locale/da/LC_MESSAGES/django.po{_orig,}
./manage.py makemessages -v 0 -a --no-location --no-obsolete
sed '/POT-Creation-Date/d' locale/da/LC_MESSAGES/django.po > __po2
mv locale/da/LC_MESSAGES/django.po{_orig,}
diff <(sort __po1) <(sort __po2) >/dev/null || \
    { echo "=== Translation not up to date:" && diff __po1 __po2 || echo; }
rm __po1 __po2

sed '$a\\' locale/da/LC_MESSAGES/django.po | tac | \
    sed '/^$/N;/\nmsgstr ""$/,/^msgid/!d' | tac

[ "$(grep -c ', fuzzy' locale/da/LC_MESSAGES/django.po)" -gt 1 ] && \
    echo "=== Fuzzy translation:" && \
    grep -A2 ', fuzzy' locale/da/LC_MESSAGES/django.po | tail -n +5 && echo

There is no explicit call to ./manage.py check because it is called implicitly by the other management commands. An ideal run of the script produces no output, but if you made a lot of errors, it could look like this:

$ ./ok.sh 
======================================================================
FAIL: test_basic_addition (docs.tests.SimpleTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/erw/hstareal/docs/tests.py", line 16, in test_basic_addition
    self.assertEqual(1 + 1, 3)
AssertionError: 2 != 3

Migrations for 'requisitions':
  0002_auto_20150605_1757.py:
    - Alter field supplier_postcode on requisition

./requisitions/models.py:123:80: E501 line too long (108 > 79 characters)
./requisitions/admin.py:194:25: F821 undefined name 'messages'
./requisitions/admin.py:441:25: F821 undefined name 'messages'

=== Translation not up to date:
1541c1541,1543
< msgid "postcode"
---
> #, fuzzy
> #| msgid "postcode"
> msgid "postal code"

=== Fuzzy translation:
#, fuzzy
msgid "internal account type"
msgstr "interne kontotyper"

Now, all that is required is to run this one script before committing. A CI server could of course be nice, but for a small project, this might be fine, and cheaper.

One thing that is missing is to make the check for untranslated strings (the sed and tac pipeline in the script) aware of strings that you don’t need to translate in your own project because Django’s own translations already cover them. For example, my 404 page says “Page not found”, which Django already knows how to translate into 70+ languages. The script could also be generalized to process all locales; right now it is hardcoded for Danish.
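
The generalization to all locales could start from a loop over the PO files. The sketch below does this for the fuzzy check only, assuming the usual locale/<language>/LC_MESSAGES/django.po layout; fuzzy_entries and check_all_locales are hypothetical names:

```shell
# Print the fuzzy entries of one PO file, skipping the first match,
# which is assumed to be the fuzzy flag on the PO header.
fuzzy_entries() {
    grep -A2 ', fuzzy' "$1" | tail -n +5
}

# Run the fuzzy check for every locale below the given directory
# (default: locale). Both function names are hypothetical.
check_all_locales() {
    root="${1:-locale}"
    for po in "$root"/*/LC_MESSAGES/django.po; do
        [ -e "$po" ] || continue    # the glob matched nothing
        if [ "$(grep -c ', fuzzy' "$po")" -gt 1 ]; then
            echo "=== Fuzzy translation in $po:"
            fuzzy_entries "$po"
            echo
        fi
    done
}
```

The other translation checks could be moved into functions taking the PO path in the same way and called from the same loop.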
