Fixing dispatch

A fun thing about refactoring code is that after one refactor is finished, the next candidate is easier to see.

In "An abstraction gone wrong", we refactored the state variable of a simple tokenizer from an integer into an object. With that done, another problem is staring us in the face. It's in this code:

if (state == INITIAL) {
    // ...
} else if (state == IN_NUMBER) {
    // ...
} else if (state == IN_STRING) {
    // ...
} else if (state == AFTER_STRING) {
    // ...
} else if (state == ESCAPING) {
    // ...
}

The tip-off here is that we are asking the state object about itself (its attributes, its type, or, as here, its identity) instead of telling it what we want done and letting it handle the details. We'll refactor the code to follow the "tell, don't ask" principle.

This particular code uses a single variable as a condition to select which action to take. Another word for this is dispatch, as in single-dispatch and multiple-dispatch. Most object-oriented languages support dispatch as part of invoking methods: the method that runs is chosen by the class of the object it is called on. To take advantage of this in our code, we first need a method that we can call. We'll call that method eat:

class State {
    State eat(char ch) {
        throw new UnsupportedOperationException();
    }
}
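As a standalone illustration of the method dispatch we are about to lean on, here is a minimal sketch. The names DispatchDemo, Base, Special, and describe are made up for this example and are not part of the tokenizer. The point is only that the call site never checks which object it holds.

```java
// Hypothetical classes for illustration only; none of them belong to the tokenizer.
class DispatchDemo {

    static class Base {
        String name() {
            return "base";
        }
    }

    static class Special extends Base {
        @Override
        String name() {
            return "special";
        }
    }

    // No if-statement here: Java selects the method body from the
    // runtime class of the object that b refers to.
    static String describe(Base b) {
        return b.name();
    }

    public static void main(String[] args) {
        System.out.println(describe(new Base()));    // prints "base"
        System.out.println(describe(new Special())); // prints "special"
    }
}
```

This is the same shape our loop will end up with: one call, with the choice of code made by the object rather than by a cascade of comparisons.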

The first action we'll move into a method will be for the ESCAPING state, because it's so simple:

class Escaping extends State {
    State eat(char ch) {
        image += ch;
        return IN_STRING;
    }
}

We'll also have to hoist the declaration for image out of the loop and into the scope of the tok class, so that the eat method can reach it. ESCAPING also gets instantiated as an object of the new class:

State INITIAL = new State(), IN_NUMBER = new State(), IN_STRING = new State(),
    AFTER_STRING = new State(), ESCAPING = new Escaping();

The ESCAPING branch in the loop becomes a call to the method:

} else if (state == ESCAPING) {
    state = state.eat(ch);
    continue;
}

It isn't really any simpler yet, and our cascading if-statements still do their own dispatch, but things will improve as we keep going.

Moving the next action into a method requires a slight change. Every other action cares about the type of character, so we'll add that as an argument to the eat method and update the Escaping class to match.
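A cut-down sketch of that signature change might look like the following. The wrapper class SignatureSketch and its main method are assumptions added so the fragment runs on its own; ChType, State, Escaping, image, IN_STRING, and ESCAPING mirror the names used in the tokenizer.

```java
// Only the pieces needed to show the new signature; SignatureSketch is a
// hypothetical stand-in for the tok class.
class SignatureSketch {

    enum ChType {
        DIGIT, QUOTE, SPACE, BSLASH, OTHER
    }

    String image = "";

    class State {
        // The character type now travels along with the character itself.
        State eat(char ch, ChType type) {
            throw new UnsupportedOperationException();
        }
    }

    class Escaping extends State {
        // Escaping ignores the type: whatever follows the backslash is
        // appended to the image, and we return to the string state.
        @Override
        State eat(char ch, ChType type) {
            image += ch;
            return IN_STRING;
        }
    }

    // IN_STRING is still a plain State at this point in the refactoring.
    State IN_STRING = new State(), ESCAPING = new Escaping();

    public static void main(String[] args) {
        SignatureSketch sketch = new SignatureSketch();
        State next = sketch.ESCAPING.eat('"', ChType.QUOTE);
        System.out.println(next == sketch.IN_STRING); // prints "true"
    }
}
```

The @Override annotation is optional, but it is handy during a change like this one: if a subclass were left with the old one-argument signature, the compiler would reject it instead of silently treating it as a separate method.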

Now, we're in a position to create our next class:

class AfterString extends State {
    State eat(char ch, ChType type) throws Unexpected {
        if (type != ChType.SPACE) {
            throw new Unexpected(ch, this);
        }
        return INITIAL;
    }
}

The bottom of our dispatch would then look like this:

} else if (state == AFTER_STRING) {
    state = state.eat(ch, type);
    continue;
} else if (state == ESCAPING) {
    state = state.eat(ch, type);
    continue;
}

Because the blocks are identical, we can collapse these two cases into one, and let the language handle the dispatch to the correct method:

} else if (state == AFTER_STRING || state == ESCAPING) {
    state = state.eat(ch, type);
    continue;
}

If we continue like this for every case, we'll end up with the bottom of the loop looking like this:

if (state == INITIAL || state == IN_NUMBER || state == IN_STRING || state == AFTER_STRING
    || state == ESCAPING) {
    state = state.eat(ch, type);
    continue;
}

throw new Unexpected(ch, state);

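At this point, the if-statement is no longer doing much: every state matches it, so the throw after it can never be reached. Each state's eat method now raises the exception itself when it sees an unexpected character, and the base State class's eat method throws if a state never overrides it, so the bottom of the loop can be reduced to a single statement: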

state = state.eat(ch, type);

When we refactored the integer state into an object state, we ensured that we couldn't typo an illegal state.

By refactoring the dispatch into method-dispatch, we've also ensured that we handle every case—we can't accidentally forget a transition and leave the state as-is. If we specifically want such a transition from one state to itself, we have to make it explicit in some method's return value.
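To make the point about explicit self-transitions concrete, here is a reduced sketch. It is a hypothetical two-state machine that only understands digits and spaces, reads from a string rather than a file, collects tokens in a list instead of printing them, and throws IllegalArgumentException where the tokenizer uses Unexpected. All of that is simplification for the example, not the tokenizer's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical cut-down machine: digits and spaces only.
class SelfTransitionSketch {

    final List<String> tokens = new ArrayList<>();
    String image = "";

    class State {
        State eat(char ch) {
            throw new UnsupportedOperationException();
        }
    }

    class Initial extends State {
        @Override
        State eat(char ch) {
            if (Character.isDigit(ch)) {
                image += ch;
                return IN_NUMBER;
            }
            if (ch == ' ') {
                return this; // explicit self-transition: skip the space
            }
            throw new IllegalArgumentException("Unexpected: '" + ch + "'");
        }
    }

    class InNumber extends State {
        @Override
        State eat(char ch) {
            if (Character.isDigit(ch)) {
                image += ch;
                return this; // explicit self-transition: keep reading digits
            }
            if (ch == ' ') {
                tokens.add("INT(" + image + ")");
                image = "";
                return INITIAL;
            }
            throw new IllegalArgumentException("Unexpected: '" + ch + "'");
        }
    }

    State INITIAL = new Initial(), IN_NUMBER = new InNumber();

    List<String> run(String input) {
        State state = INITIAL;
        for (char ch : input.toCharArray()) {
            state = state.eat(ch); // the whole dispatch is this one statement
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints "[INT(12), INT(7)]"
        System.out.println(new SelfTransitionSketch().run("12  7 "));
    }
}
```

Because eat has to return a State, a branch that neither returns nor throws will not compile. Staying in the same state is only possible by writing return this.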

Later, we'll clean up the code a bit, then continue refactoring, including taking care of an anti-pattern in the second level of dispatch.

The code so far

import java.io.FileReader;

class tok {

    class Unexpected extends Exception {
        Unexpected(char ch, State state) {
            super("Unexpected: character '" + ch + "' in state " + state);
        }
    }

    enum ChType {
        DIGIT, QUOTE, SPACE, BSLASH, OTHER
    }

    String image = "";

    class State {
        State eat(char ch, ChType type) throws Unexpected {
            throw new UnsupportedOperationException();
        }
    }

    class Initial extends State {
        State eat(char ch, ChType type) throws Unexpected {
            switch (type) {
            case DIGIT:
                image += ch;
                return IN_NUMBER;
            case QUOTE:
                return IN_STRING;
            case SPACE:
                return this;
            default:
                throw new Unexpected(ch, this);
            }
        }
    }

    class InNumber extends State {
        State eat(char ch, ChType type) throws Unexpected {
            switch (type) {
            case DIGIT:
                image += ch;
                return this;
            case SPACE:
                System.out.println("INT(" + image + ")");
                image = "";
                return INITIAL;
            default:
                throw new Unexpected(ch, this);
            }
        }
    }

    class InString extends State {
        State eat(char ch, ChType type) {
            switch (type) {
            case QUOTE:
                System.out.println("STRING(" + image + ")");
                image = "";
                return AFTER_STRING;
            case BSLASH:
                return ESCAPING;
            default:
                image += ch;
                return this;
            }
        }
    }

    class AfterString extends State {
        State eat(char ch, ChType type) throws Unexpected {
            if (type != ChType.SPACE) {
                throw new Unexpected(ch, this);
            }
            return INITIAL;
        }
    }

    class Escaping extends State {
        State eat(char ch, ChType type) {
            image += ch;
            return IN_STRING;
        }
    }

    State INITIAL = new Initial(), IN_NUMBER = new InNumber(), IN_STRING = new InString(),
        AFTER_STRING = new AfterString(), ESCAPING = new Escaping();

    void tokenize() throws Exception {
        State state = INITIAL;

        // try-with-resources closes the reader even if eat() throws
        try (FileReader reader = new FileReader("demo.txt")) {
            while (true) {
                int cp = reader.read();
                if (cp == -1)
                    break;
                char ch = (char) cp;

                ChType type;
                if (Character.isDigit(ch)) {
                    type = ChType.DIGIT;
                } else if (ch == '"') {
                    type = ChType.QUOTE;
                } else if (ch == ' ' || ch == '\n') {
                    type = ChType.SPACE;
                } else if (ch == '\\') {
                    type = ChType.BSLASH;
                } else {
                    type = ChType.OTHER;
                }

                state = state.eat(ch, type);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        new tok().tokenize();
    }
}
