Pages

25 August 2010

Prog, Constify, and Pointer Single Loop

Hello again.

This summer holiday, in the absence of a job, I've spent a lot of time working on Prog, and recently I actually finished a compiler of sorts. It produces an intermediate format suitable for text-based translation (via the C preprocessor, for instance) into any target language that supports, whether directly or indirectly, a high-level translation from Prog. I'm working on the C++ target, and it's currently possible to compile trivial programs such as Hello World and FizzBuzz, among others.

So I didn't make my January release last year, but I will make it long before next year, and I'm happy about that. After all, I'm nineteen years old, and the longer I wait to release a language, the less impressed people will be by my age! I'm kidding, of course. I really just want the damn thing to be done so I can use it and enjoy it, and so that others can do the same.

On to the longer, more peculiar part of the title. I've been browsing Stack Overflow a lot lately—in fact, perhaps more than I should—and a topic came up recently that I thought was interesting enough to write about here.

The question concerned the possibility of a "constify" operation in C++, that would, for the remainder of the duration of the current scope, cause a variable to be treated as though it had been declared const. The original poster wanted to be able to write something like the following:

std::vector<int> v;
v.push_back(1);
v.push_back(2);
v.push_back(3);
constify v;
v.push_back(4); // This is now a compile-time error.

Now, the real use of such an operation doesn't really show in std::vector, especially since C++0x introduces an initializer_list constructor that makes it trivial to initialise a const std::vector. But to be able to initialise a const object by calling mutating methods on that object? I was intrigued.

Obviously the syntax would have to change somewhat. The simplest route was obviously to declare a local const reference bound to the original variable. My original constify macro looked like this:

#define constify(type, id) \
type const& id##_const(id), & id(id##_const)

The first declaration creates a const reference var_const bound to the constified variable var. It then binds another reference named var to var_const. It has to be done this way because type const& id(id); is erroneous, due to the fact that both mentions of id refer to the local variable, not the existing one.

To use this version of the macro, it had to be wrapped in its own scope, such as a loop body or bare block:

std::vector<int> v;
// ...
{
    constify(std::vector<int>, v)
    // ...
}

Using the macro outside a local scope resulted in a duplicate definition of the constified variable. I didn't like the requirement of wrapping the macro in its own scope, so I sought a more elegant solution. Nobody likes trying to mash a block into a macro invocation, so I decided to try turning the macro into something more like an inbuilt C++ construct, that could be used like so:

constify (var) {
    // var is const for the duration of this scope.
}

The first problem was easily solved: how to have the compiler deduce the type of the constified variable. C++0x introduces the decltype keyword, which allowed me to rewrite constify:

#define constify(id) \
decltype(id) const& id##_const(id), & id(id##_const);

So far so good. (NB: In non-C++0x compilers, there is often a typeof extension with a very similar effect.) But how to allow for the clean syntax? The answer was in what I'll call, for lack of a better term and only because I've never seen it before, the Pointer Single Loop idiom.

Basically, the constify macro had to introduce a for loop, which would provide the scope for the local variables var_const and var, while also executing only once. There is no way to introduce a loop control variable of a type unrelated to that of the constified variable, and no way to produce a local scope surrounding the expansion of the macro without also requiring that the loop body be mashed as a parameter into the invocation of constify.

Meet the Pointer Single Loop idiom. It relies on the fact that, for any variable t of type T, this loop executes only once:

for (T* p = &t; p; p = 0) {}

Further, since it's possible to construct a pointer to any type, and since we already have the name of the variable to be constified, it's trivial to construct the final implementation of constify:

#define constify(id) \
for (decltype(id) const& id##_const(id), \
    & id(id##_const), * constify_index = &id; \
    constify_index; constify_index = 0)

Like the earlier implementations of constify, this accepts an id and binds it to a local, immutable id via id_const. The difference is that it uses a Pointer Single Loop to execute the next statement once in the context of the new local variables. It requires only that the variable constify_index be reserved for the purpose of the loop idiom, and works exactly as expected:

std::vector<int> v;

v.push_back(1);
v.push_back(2);
v.push_back(3);

constify (v) {

    v.push_back(4); // This is a compile-time error.

}

This also works if v happens to be declared const already. In all, it was very easy to write, and I imagine that judicious use of C++0x features alongside an idiom of this sort could result in very sensible, readable extensions to C++, which provide real value to the language while retaining compatibility across standards and compilers. I can totally see Boost going for something like this.

1 comment:

  1. Nice construction... however I personally think that const-correctness isn't indeed that useful (my approach is use `const` when you must, not when you can). This opinion about const is sort of a blasphemy in many C++ circles, so unfortunately I too can imagine this proposal of constify being actually used. The different is that I'm not happy about it... ;-)

    ReplyDelete