David Owen’s blog | Entries from April 2011

Perceived shortcomings of style checkers (and a potential solution)

Posted by David Owen on 2011-04-09

Style-checkers check code for certain statically-determinable properties. These properties may affect dynamic behavior (e.g. all references are definitely initialized before use), or they may be purely textual (e.g. code is indented in a particular way).

Teams use style-checkers to ensure some typical dynamic and textual standard in the code base. The idea is that common errors can be prevented, and that a common textual style will aid readability, especially between code from different authors.¹

In reality, style-checking is far from ideal, especially regarding textual consistency. Some section of code may have notably worse readability if wrapped to the project’s standard of 80 columns. Some object allocation may not have a matching deällocation because it is live for the program’s entire run.

No matter the reason, exceptions to the rules always come up, and this is where style-checking becomes really difficult: how do you handle the exceptions? If you do nothing, exceptions will accumulate until the style-checker’s report is overwhelming. At best, the report will be ignored; at worst, it will become demoralizing to developers. In such a situation, style-checking may be worse than no style-checking at all.

To keep the report positively motivating, you must make a change somewhere:

You could alter the code to make it compliant with the style rules. This may directly compromise either the design integrity or actual readability of the code.
You could alter the style rules so that the case is no longer exceptional. But the new rule will probably be more complicated, have more edge cases, and still have other (and new) exceptions. Besides, which is better: writing rules, or writing code?
You could mark the exception in the code with some kind of meta-comment, so that the style-checker ignores it on future runs; but, this adds clutter to the code and runs counter to the goal of readability.
You could mark the exception somewhere else. This keeps the code clean, and the developers in control, but this database would have to be intelligent enough to keep up with code motion as the project develops. The database should also be version-controlled with the source that it annotates.

This last approach would yield a “perfect” solution: it would only ever mark newly-introduced problems in the code. A “conservative,” approximate solution (one that marks every new problem, and might mark a few old) is possible if we make the following assumptions:

Code is under version-control;
The style-checker is run before or as part of the process of committing code to the main branch; and,
Genuine problems, when reported, will actually be addressed.

Given these constraints, we can see that a style-checker could report on errors associated only with code features that are new or changed in the current version, relative to the latest main-line version. If this is the case, it would assume that any style exception in the main-line is there intentionally, and should not be reported on again (it would have been reported when it was previously committed)—it would not need to track exceptions in a separate database. I call this “differential style-checking.”

The code features that the style-checker tracks could be either lines of text, or nodes of parse-trees. The latter might be more accurate, but the former might be easier to implement and good enough in most cases.

Here is a proof-of-concept. It works on only one Python file at a time, and the output is messy, but it demonstrates the essential principles. pylint version 0.19.0, as packaged in Ubuntu 10.04, is the style-checker, with which I have no experience prior to writing this.

#!/bin/bash

 
usage() {
    echo $0 trunk_dir proposed_dir file
    exit 1
}


comment() {
    # $1 the file to comment
    # $2 the line number
    # $3 the comment to append
    head -n $(( $2 - 1 )) $1 >/tmp/tmp.py
    local target_line=$(head -n $2 $1 |tail -n1)
    echo "$target_line #{$3}" >>/tmp/tmp.py
    tail -n +$(( $2 + 1 )) $1 >>/tmp/tmp.py
    mv /tmp/tmp.py $1
}


process() {
    local type
    local line
    local msg
    local old_ifs="$IFS"
    IFS=":"
    while read type line msg
    do
        [ "$msg" ] && comment $1 $line "$type:$msg"
    done
    IFS="$old_ifs"
}


TRUNK=$1; shift
PROPOSED=$1; shift
FILE=$1; shift

[ -d "$TRUNK" -a -d "$PROPOSED" -a "$FILE" ] || usage
[ "$1" ] && usage

STRIPPED=${FILE%.py}
MODULE=${STRIPPED//\//.}

pushd $TRUNK >/dev/null
cp $FILE /tmp/left.py
pylint -rn $MODULE 2>/dev/null |process /tmp/left.py
popd >/dev/null

pushd $PROPOSED >/dev/null
cp $FILE /tmp/right.py
pylint -rn $MODULE 2>/dev/null |process /tmp/right.py
popd >/dev/null

diff -u /tmp/{left,right}.py |egrep '^[+@]'

For a sample run, I will use the Trac project, which I greatly respect. At revision 10114, the PostgreSQL backend received a change, which we will analyze.


cd /tmp
svn co -r10113 http://svn.edgewall.org/repos/trac/branches/0.12-stable 10113
svn co -r10114 http://svn.edgewall.org/repos/trac/branches/0.12-stable 10114
diffstyle 10113 10114 trac/db/postgres_backend.py

The (wrapped) output is:

+++ /tmp/right.py       2011-04-09 09:59:48.274189942 -0600
@@ -237,6 +237,11 @@
+    def update_sequence(self, cursor, table, column='id'): #{C:PostgreSQLConnection.update_sequence: Missing docstring} #{R:PostgreSQLConnection.update_sequence: Method could be a function}
+        cursor.execute("""
+            SELECT setval('%s_%s_seq', (SELECT MAX(id) FROM %s))
+            """ % (table, column, table))
+

We see that, according to pylint’s default checks, two new (potential) problems were introduced: the new method doesn’t have a docstring, and the new method could be a function outside of a class.² Running pylint normally on postgres_backend.py revision 10114 produces 47 warnings. (The entire Trac packages yields 10065.)

pylint -rn trac.db.postgres_backend |egrep '^[A-Z]' |wc -l
pylint -rn trac |egrep '^[A-Z]' |wc -l

End notes

I believe that style-checkers can effectively guard against some types of runtime errors, just as compilers can. I don’t believe that style-checkers can be an effective judge of textual style.
In this case, the style-checker got it wrong. This must be a method, because it is invoked polymorphically; although, it need not be an instance method—it could be a class method.