Abstractions

Posted by David Owen on 2016-12-13

Abstraction allows us to wrap up complication into a simpler form. We can take a set of things, with all of their nuances and details of combination, wrap it all up and call it new thing.

But, abstractions can sometimes get in our way. If they're not designed properly, they may not go together in convenient ways, or they may distance us from other useful parts of the language.

Parameters & names

Some abstractions are provided by languages. For example, the while-statement and structured if-statement are abstractions over the old unstructured if-statements:

# Unstructured
i = 1
top:
if i % 10 == 0: goto star
print(".")
goto work
star:
print("*")
work:
cont = do_work()
i += 1
if cont: goto top
end:

# Structured
i = 1
cont = True
while cont:
    if i % 10 == 0:
        print("*")
    else:
        print(".")
    cont = do_work()
    i += 1

Abstractions can allow parameterization. In the structured if-statement, we can plug in whatever blocks of code we want in the two available places (the then or the else), and the abstraction will make sure that the flow of execution jumps around to the right one.

Abstractions can allow naming as well. Functions usually allow both parameterization and naming, although either could be elided:

def my_func(x):
    return x + 42

def another_func()              # No parameterization!
    return 42

map(lambda x: x + 42, my_list)  # No naming!

Composition

Good abstractions are composable, which means that they can be combined and still work the way you would expect.

def what_to_do(x):
    some_setup()
    if x < 5:
        do_the_thing()
        some_teardown()
    else:
        raise AnError()

Composability is something of a continuum, though. In the previous example, functions and if-statements seemed to compose pretty well. But say you want do_the_thing() to be a parameter instead of hard-coded, but your language doesn't support passing around blocks of code as parameters to other blocks of code. You might be stuck with something like this:

def funky():
    some_setup()
    if x < 5:
        return True
    else:
        raise AnError()

if funky():
    do_the_thing()
    some_teardown()

Not so helpful. Every time we want to respond to funkiness, we must be sure to handle the teardown, too.

But your language may in fact allow blocks of code as parameters, so you can do this:

def funky(good_block, bad_block):
    some_setup()
    if x < 5:
        good_block()
        some_teardown()
    else:
        bad_block()

funky(lambda: do_the_thing(), lambda: raise AnError())

It's workable, but kinda of messy. Ideally, you'd want something more like this:

if_funky:
    do_the_thing()
else:
    raise AnError()

A good macro system (like Lisp, not like C) allows you to do that.

Cuts

It's hard to make good abstractions. Essentially, we have to decide where to cut code and data apart in text, space, and time. It's easy to get it wrong, or to be forced into a suboptimal cut by limitations of the language.

Here's an example of a bad cut:

class dict:

    def contains(self, key):
        h = hash(key)
        l = self.store[h]
        for k, v in l:
            if k == key:
                return True
        return False

    def get(self, key):
        h = hash(key)
        l = self.store[h]
        for k, v in l:
            if k == key:
                return v
        raise NotFound()

if d.contains("the key"):
    print d.get("the key")

We're searching the dictionary twice, just to be good citizens. We could be a little less good:

try:
    print d.get("the key")
except:
    pass

We're now searching only once, but we're generating a bunch of stack-trace data that we don't use. Being required to handle the case of a missing key as an explicit no-op is also textually wasteful.

Here's another alternate, if your language supports returning multiple values (or faking it by returning a single 2-tuple):

class dict:
    def get(self, key):
        h = hash(key)
        l = self.store[h]
        for k, v in l:
            if k == key:
                return True, v
        return False, None

found, val = d.get("the key")
if found:
    print val

Better. But we're still repeating the test for presence in two places: first with if k == key, and second with if found. This is another step:

class dict:
    def with_value(self, key, found_block, not_found_block):
        h = hash(key)
        l = self.store[h]
        for k, v in l:
            if k == key:
                found_block(v)
                return
        not_found_block()

But the use is a bit messier, as we've seen already for this style:

d.with_value("the key", lambda val: print val, lambda: pass)

Better still would be something allowing this:

with v for "the key" in d:
    print v
else:
    pass

And, with the else-block being omitable:

with v for "the key" in d:
    print v

Deciding to cut the abstraction of using a value in a dictionary, by key in different places (contains/get versus multi-return-get versus get-with-blocks versus with-key) affects not only how we think about the concept, but how effectively it can be used in real code. The with-key approach yields the best use overall, with a natural style, and no duplication of code elements.

Distance

Abstractions created by the programmer also pull you further away from the base language. Sometimes this is by creating a new language that's augmented with additional concepts, like with-key. Sometimes, an abstraction can take you far enough away from the language that you can no longer leverage some of the underlying concepts.

In C, we have the example of the qsort() function (defined in stdlib.h). It abstracts the task of sorting. It provides a convenient and memorable name—qsort—and parameterizes a performant algorithm over the data to be sorted and the comparison operator to use.

In providing this abstraction, it takes us and our code away from C's type-checker. qsort takes a void * (pointer to unassumed type) to the data, and a pointer to a function which itself accepts two void *s to its arguments to be compared. It does this so that we can sort lists of integers as well as lists of arbitrary data. But the compiler doesn't verify that the comparator passed to qsort actually matches the data (well, the compiler might have specialized checking in this case, but it won't for a qsort found outside the standard library).

David Owen’s blog

Abstractions

Parameters & names

Composition

Cuts

Distance

Trackbacks

Comments

Add Comment