27 April 2012

C++ Functional Style Tips

To say that C++ is a complex language would be an understatement—I had to spend years using it before properly understanding all of its many gotchas. Because I have plumbed its depths, I cannot recommend it to anyone for any purpose. If you know how to use the language effectively, then by all means do use it—but if you do not already know it, then there are plenty of other, perfectly good languages out there. Personally, I’m a fan of C, Haskell, and Scheme.

However, the language is still undeniably useful, particularly in game development. So I must try to benefit from the power of C++ while avoiding its failings in every way that I can. Following, then, are some brief guidelines that I have found to work well for me to write C++ code in a more functional style. And functional style makes it easier to grasp a program by reducing the effort required to understand its possible states—so you can focus more on solving problems than fixing bugs.

For the sake of concision, I’ll assume that using namespace std; and using namespace std::placeholders; are in effect. Obviously in real code you ought to import only what’s necessary, and in as small a scope as possible.

♦ ♦ ♦

1. Use const Obsessively

C++ unfortunately has no reasonable way of enforcing referential transparency. You cannot readily make the compiler complain when a function has externally visible side effects. Indeed, I hope C++2x gets a pure keyword. But in the meantime, at least we have the ability to enforce immutability with our good friend const.

In general, everything should be const unless it truly needs to be modified—whether because an in-place algorithm is more efficient than its non-mutating equivalent, or because of failings in the type system and standard library. As an example of the latter, you cannot use standard algorithms to initialise an immutable container, resulting in objects that are unnecessarily non-const.

Declare member functions const if they are logically non-mutating, that is, you would expect that calling the function in and of itself will have no effect on the observable state of the object.  “Observable” means that you can deduce said state through the public interface alone.

struct Point {
    double distance(const Point&) const;
    double distance(double, double) const;
    double x, y;

When taking parameters by lvalue reference, use const when possible.

    double distance(const Point&) const;

When taking parameters by value, declare the function as usual.

    double distance(double, double) const;

Then make the parameters const in the definition, unless of course they are to be modified.

double Point::distance(const double x, const double y) const {

Don’t bother returning by const value—all it does is arbitrarily prevent calling mutating member functions on temporaries—but do return const references where appropriate.

const vector<double>& TreeNode::get_children() const {
    return children;

Make local variables const when possible.

double Point::distance(const double x, const double y) const {
    const auto dx = this->x - x;
    const auto dy = this->y - y;
    return sqrt(dx * dx + dy * dy);

Be sure to use const qualification for pointers where appropriate.

T* x;             // Mutable pointer to mutable T.
const T* x;       // Mutable pointer to immutable T.
T* const x;       // Immutable pointer to mutable T.
const T* const x; // Immutable pointer to immutable T.

This also applies to smart pointers such as unique_ptr (which you should use a lot) and shared_ptr (which you probably don’t need).

shared_ptr<T> x;             // Mutable shared pointer to mutable T.
shared_ptr<const T> x;       // Mutable shared pointer to immutable T.
const shared_ptr<T> x;       // Immutable shared pointer to mutable T.
const shared_ptr<const T> x; // Immutable shared pointer to immutable T.

Related to const is constexpr, which allows you to ensure that if a function can be evaluated at compile time, then it will be. This has less to do with functional style and is more a matter of optimisation, so I mention it only for the sake of completeness.

♦ ♦ ♦

2. Use <functional>, <algorithm>, <numeric>, and <iterator>

The C++ standard library has a number of common algorithms abstracted into iterator-based functions. While iterators have some flaws, they can often help you write code that is more readable than the alternative using explicit loops.

bind() lets you perform partial application of function objects.

const auto succ = bind(plus<int>(), _1, 1);
cout << succ(3) << '\n';

Here, bind(plus<int>(), _1, 1) is equivalent to the section (+1) in Haskell.

There are two versions of transform() to be found in <algorithm>. The first is analogous to the familiar functional map, applying a unary function to each element in a range and sending the results to an output iterator.

const vector<int> a{1, 2, 3, 4, 5};
vector<int> b;
transform(a.begin(), a.end(), back_inserter(b),
    bind(plus<int>(), _1, 1));

The second version is analogous to zipWith, which takes two input ranges, applies a binary function to each pair of elements, and sends the results to an output iterator.

const vector<int> a{1, 2, 3, 4, 5};
const vector<int> b{5, 4, 3, 2, 1};
vector<int> c;
transform(a.begin(), a.end(), b.begin(),
    back_inserter(c), plus<int>());

accumulate(), from the <numeric> header, is by default a sum:

const vector<double> values{3.1, 4.1, 5.9, 2.6, 5.3, 5.8};
const auto sum = accumulate(values.begin(), values.end(), 0.0);

With a binary function, it becomes a left fold:

const auto product = accumulate(values.begin(), values.end(),
    1.0, multiplies<double>());

C++11 introduces lambdas to C++, which are largely syntactic sugar for anonymous classes—the members of the class are the (explicit) closure from the lambda’s lexical environment, and the operator() of the class is the lambda body. When you need to use a non-trivial function in a standard algorithm, it’s clearest to use a lambda rather than trying to compose function objects.

const auto paraboloid_product = accumulate(values.begin(), values.end(),
    2.0, [](double x, double y) { return (x - 1.0) * (y + 1.0); });

On the other hand, sometimes it’s better to use a simple for loop. Case in point: the for_each() algorithm, which conveys no more information than an ordinary range-based for, and can become unreadable when the types are non-trivial.

// Do this.
for (auto& p : symbols)

// Or this!
for (auto p = symbols.begin(); p != symbols.end(); ++p)

// Not this.
for_each(symbols.begin(), symbols.end(),
    [](pair<const string, shared_ptr<Value>>& p)
        { p.second->method(); });

♦ ♦ ♦

3. Use auto and decltype()

The C++11 auto keyword allows you to omit the manifest type of a variable, if a type can be deduced from the variable’s initialiser. Combined with const, this makes C++ variables look an awful lot like single static assignment registers.

const auto v = f(x, y);

Just as auto is used to deduce a declaration from an initialiser, decltype() is used to deduce a type from an expression:

decltype(v[0]) w;

Like sizeof(), decltype() does not evaluate its argument, so it’s safe to use side-effectful functions in a decltype() expression. Like const, you ought to use auto and decltype() wherever possible, specifying manifest types only when necessary. auto lets you avoid repetition at the type level, and decltype() lets you express types as contracts in terms of other types.

♦ ♦ ♦

4. Use Higher-Order Macros

One of the reasons I’m fond of functional and concatenative programming is an increased ability to reduce repetition through factoring. A functional style, and in particular a point-free style, is vastly more amenable to refactoring than imperative code. Unfortunately, C++ often makes it unwieldy to do such refactoring, even with functional style, because it is inherently an imperative language.

Despite its limitations and caveats, the C preprocessor is useful for eliminating repetition. Unlike, say, Lisp macros, which are Turing-complete, CPP is only equivalent in power to a pushdown automaton. Still, macros have one powerful feature that I use regularly: they can take other macros as arguments. Thus you can define a sequence like so:

#define FUNCTIONS(M) \
    M(add, +) \
    M(sub, -) \
    M(mul, *) \
    M(div, /)

And apply macros to every term in the sequence:

#define DECLARE(N, O) \
    int N(int, int);

#define DEFINE(N, O) \
    int N(int x, int y) { return x O y; }


#undef DEFINE
#undef DECLARE

When C++ fails to provide a mechanism of semantic abstraction for a particular problem, higher-order macros at least allow you to abstract syntactically repetitive code. Judicious use of macros to reduce repetition is a Good Thing, even if it might make you feel a bit dirty.

♦ ♦ ♦

I’m sure there’s more I’m forgetting, but that’s the meat of it. Functional style can help you out, and here are some ways to do it. These aren’t rules, just guidelines, and you should always evaluate each problem individually to determine if the usual patterns really fit. Programming is, after all, a very subtle magic. But if there’s any way to make that magic a little more fun, I’ll take it.