## 08 May 2014

### Constification in C++11

Here’s a neat little pattern. Suppose you want to initialise some object using mutating operations, and thereafter make the object const, without any overhead. In C++11 you can do just that with a lambda:

```cpp
const auto v = []{
    vector<int> v;
    for (auto i = 0; i < 10; ++i)
        v.push_back(i * 10);
    return v;
}();
```

This “inline builder pattern” turns out to be a nice way to encapsulate mutation in a smaller scope, thereby making your code a bit easier to reason about. Of course, the drawback is that objects you would initialise this way are the sorts of objects you might prefer to move rather than copy, and const interferes with that; but that is a problem in C++11 generally.
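The same trick has rough analogues elsewhere. Here is a sketch of the idea in JavaScript, using an immediately invoked function plus Object.freeze; note that, unlike const in C++, freezing is shallow and only guards the object itself:

```javascript
// Build the value with local mutation, then publish it frozen.
var tens = Object.freeze((function () {
    var xs = [];
    for (var i = 0; i < 10; ++i)
        xs.push(i * 10);
    return xs;
})());

// tens is [0, 10, ..., 90]; attempts to mutate it now fail
// (throwing in strict mode, silently ignored otherwise).
```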

It goes to show that lambdas make working with const things a bit nicer all around. This pattern also lets you initialise const members in an object:

```cpp
struct X {
    X()
        : x([]{
            auto s = 0;
            auto i = 10;
            while (i) s += --i;
            return s;
        }())
    {}
    const int x;
};
```

Ugly, but it gets the job done. That’s C++ for you.

## The problem and the solution

When designing my Kitten programming language, I wanted to support lexical closures without the need for a tracing garbage collector. The problem of closure is simple: anonymous functions often use local variables from their enclosing scope; in a memory-safe language, those variables should continue to exist if the inner function escapes, after the outer function exits. Take this classic example of function currying in JavaScript:

```javascript
// curriedAdd :: Number -> (Number -> Number)
function curriedAdd(x) {
    return function(y) {
        return x + y;
    };
}

var inc = curriedAdd(1);
```

Here, x must exist after curriedAdd has exited, in order for us to safely and meaningfully call the resulting anonymous function. Since Kitten has no mutable variables, there is no observable difference between capturing a variable by reference and capturing it by value. So the Kitten compiler takes a definition like this:

```
def curriedAdd (Int -> (Int -> Int)):
  ->x
  { x + }

def inc (Int -> Int):
  1 curriedAdd @
```

And rewrites it to copy the value of x into the closure of the anonymous function. In low-level pseudocode, this is what’s happening under the hood:

```
def curriedAdd:
  setlocal0
  $(local0){ closure0 __add_int }
```

Now there are no references to the local variable within the inner function—only the value of x is captured when the function object is constructed on the stack. The runtime cost is minimal, and no garbage is generated.

## The rest of the story

Saying “by value” and “no garbage” is accurate, but not a complete picture. It would be inefficient to always deep-copy all closed-over values, particularly if they were to exceed the size of a machine word.

So while integers, booleans, and floating-point numbers are all copied, heap-allocated objects such as vectors can instead be shared using reference counting. Coming from such languages as C++, I like that this preserves deterministic memory behaviour. And in the absence of mutation and lazy evaluation, cycles are impossible, so reference counting alone suffices to eagerly reclaim all garbage.
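To illustrate the idea only (this is not Kitten's runtime, just a toy sketch in JavaScript), a reference-counted cell might look like this:

```javascript
// A toy reference-counted cell: retain to share, release when done.
// The payload is reclaimed the instant the count reaches zero;
// reclamation is deterministic, with no tracing pauses.
function Cell(value) {
    this.value = value;
    this.count = 1;
}
Cell.prototype.retain = function () {
    ++this.count;
    return this;
};
Cell.prototype.release = function () {
    if (--this.count === 0)
        this.value = null;  // eagerly "free" the payload
};

var shared = new Cell([1, 2, 3]);
var alias  = shared.retain();  // e.g. captured by a closure
shared.release();              // original scope exits: still alive
alias.release();               // last reference gone: reclaimed now
```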

This transformation is equally valid in other languages, as long as we ignore mutation. You can easily rewrite the JavaScript example above to use explicit closure:

```javascript
// curriedAdd :: Number -> (Number -> Number)
function curriedAdd(x1) {
    return function(x2, y) {
        return x2 + y;
    }.bind(this, x1);
}
```

It also works when we add mutation. Here’s a function encapsulating an impure counter from a starting value:

```javascript
// counter :: Number -> (() -> IO Number)
function counter(x) {
    return function() {
        return x++;
    };
}

var c = counter(0);
console.log(c());    // 0
console.log(c());    // 1
console.log(c());    // 2
```

We can fake pointers using Array to make this work without lexical closure:

```javascript
// counter :: Number -> (() -> IO Number)
function counter(x) {
    return function(px) {
        return px[0]++;      // dereference
    }.bind(this, [x]);       // reference
}
```

However, reference cycles are now possible, so a reference-counting collector would need an auxiliary cycle detector to reclaim all garbage.
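For example (a hypothetical pair of cells, using the same fake-pointer trick), it is now easy to tie a knot that plain reference counting can never untie:

```javascript
// Two one-element arrays acting as mutable references.
var a = [null];
var b = [null];
a[0] = b;  // a points to b
b[0] = a;  // b points to a

// Even once the outer variables are gone, each cell still holds
// a reference to the other, so neither count can reach zero.
```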

## Optimisations and future work

Production tracing collectors outperform naïve reference-counting collectors by a wide margin. We can make a reference-counting collector that closes that gap through simple optimisations:

• Reducing the number of bits used in the reference count—most objects have a small number of references.
• Creating objects with a zero reference count, and only retaining when the object escapes—the vast majority of objects die in the scope they were born.
• Eliminating redundant reference manipulation, considering only the net reference change in a basic block—most reference changes do not result in deallocation.
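As a toy illustration of the last point (the names and representation here are mine, not Kitten's), an optimiser might sum the reference changes per variable within a basic block and emit only the nonzero net adjustments:

```javascript
// ops: a block's reference operations, e.g. [["retain", "x"], ...].
// Returns the net count change per variable; matched retain/release
// pairs cancel and need not be emitted at all.
function netRefChanges(ops) {
    var net = {};
    ops.forEach(function (op) {
        var name  = op[1];
        var delta = op[0] === "retain" ? 1 : -1;
        net[name] = (net[name] || 0) + delta;
    });
    return net;
}

// [["retain","x"], ["release","x"], ["retain","y"]]
// nets to { x: 0, y: 1 }: only y needs an actual retain.
```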

I intend to implement such optimisations in the future, as Kitten evolves. My bet is that as a statically typed, high-level language with low-level semantics, Kitten will someday be suitable for games and simulations. There, deterministic memory behaviour is a big win, since low latency is often more important than high throughput. Still, the ultimate direction of the project remains to be seen—I’m just discovering the language as I go. :)

## 01 July 2013

### Static Typing in a Concatenative Language

As you may know, I have for some time been working on a pet language project, Kitten. The name is a pun on concatenative programming, a somewhat overlooked paradigm of stack-based functional programming languages. My February 2012 blarticle on concatenative programming is an okay introduction for those interested. The language is very much in flux, but at the time of this writing, Kitten looks like this:

```
/* C-style comments */

// Top-level code
99 bottles_of_beer

decl bottles_of_beer (Int ->)  // Separate type declarations
def bottles_of_beer:           // Optional indentation-based syntax
  -> x                         // Local variables
  x verse                      // Postfix expressions
  if x 1 > then:
    x-- bottles_of_beer        // Recursion instead of iteration
  else {}

decl verse (Int ->)
def verse:
  -> x
  x wall newline
  x beer newline
  take newline
  x-- wall newline
  newline

decl beer (Int ->)
def beer:
  bottles " of beer" print

decl wall (Int ->)
def wall:
  beer " on the wall" print

decl take (->)
def take:
  "take one down, pass it around" print

decl bottles (Int ->)
def bottles:
  -> x
  if x 0 = then:
    "no more bottles"
  else if x 1 = then:
    "one bottle"
  else:
    x showi " bottles" cat
  print
```

## Motivation

The premier concatenative programming language, Factor, is dynamically typed, as are its cousin Joy and the untyped Forth. There was a statically typed concatenative language, Cat, but that project is no longer maintained.

So it seems there is room in the world for a statically typed functional stack language. Kitten was originally dynamically typed, primarily because I am lazy, but the intent was always to move to static typing. It seems to me that the main reason that dynamic languages are so widely used—often in situations where they are not appropriate—is that they are so easy to implement. That can give implementors a shorter iteration time on new and useful features, as well as more time to work on documentation and tooling.

Dynamic languages are deceptive. You can absolutely write large software in a dynamic language, but at the point when dynamic types become a liability and static types become really valuable, it’s already too late! In the long run, static types ease maintainability, optimisation, and static reasoning for computers and humans alike. And that’s not to mention the great opportunities for refactoring, analysis, and visualisation tools.

But static types, without type inference, are just a bad joke. My main goal when designing Kitten’s type system was inferability. Having used Haskell for a few years, I’ve grown accustomed to the benefits of type inference, at least of the local variety. The meat of a program simply shouldn’t need type annotations for the compiler to tell you not only whether it is type-correct, but also what the type is of anything you care to ask about. Even simple type deduction such as C++’s auto or C#’s var is much better than nothing.

Concatenative languages pose some unique problems for type inference, and I’d like to share what I’ve learned while implementing static types in Kitten.

## Differences with Dynamic Languages

One of the typical features of dynamically typed concatenative languages is homoiconicity, the property that code and data have a common representation. In concatenative languages, this is the quotation, an analog of Lisp’s list. This is a really powerful feature, and a valuable tool in a dynamic language. There’s a reason that, alongside Lisp, concatenative languages are among the most heavily metaprogrammed in existence. But homoiconicity and static types are basically incompatible kinds of awesome. A function constructed dynamically could have a dynamic effect on the stack, which a static type system can’t hope to make sense of.

Kitten therefore has to differentiate quotations into two categories: functions and vectors. There are separate primitives for function and vector operations, such as composition and concatenation—even though internally these can be implemented the same way.

## The Type System

Kitten’s type system is fairly simple. The base types include Booleans (Bool), characters (Char), integers (Int), floating-point numbers (Float), homogeneous vectors ([a]), and functions (a -> b).

All functions operate on stacks. They consume some number of parameters from their input stack and produce some number of return values, potentially with side effects such as I/O. The juxtaposition of functions denotes their composition—sending the output of one to the input of the next. 2 + is a function of type Int -> Int which adds two to its argument; it consists of the composition of two functions 2 and +, which have types -> Int and Int Int -> Int respectively.

However, functions that actually quantify over stacks pose significant challenges to inference. Functions may contain arbitrary code—of which you want to know the type, because they can be dynamically composed and applied, and those operations should be type-safe. So the basic concatenative compose combinator needs a type like the following:
∀r s t u. r (s → t) (t → u) → r (s → u)

That is, for any stack r with two functions on top, one of type (s → t) and one of type (t → u), it gives you back the same stack r with a new function on top, of type (s → u). All of the variables in this type are quantifying over multiple types on the stack. This is the type that Cat uses for its compose combinator. Already this is strictly less powerful than the dynamic equivalent:
1. You can no longer dynamically compose functions with interesting stack effects: {1} and {+} are composable, but {drop} and {drop} are not.
2. The associativity of dynamic composition is broken as a result: {a b} {c} compose does not necessarily equal {a} {b c} compose.
#1 is not an issue in practice, because as fun as arbitrary dynamic effects can be, a static type system is designed to prevent total lunacy, and there’s going to be some collateral damage. So by construction, the type of a dynamic composition must always be known statically.

#2 is more significant, in that the type checker is going to reject some dynamic compositions you might expect to be valid: {drop drop} {} compose is allowed, but {drop} {drop} compose is not.

The problem with quantifying over stacks, however, is that every ordinary function type is inherently polymorphic with respect to the part of the stack that it doesn’t care about. + in this system would have a type like ∀r. r Int Int → r Int. All higher-order functions become higher-rank, and all recursion becomes polymorphic recursion. This makes typechecking significantly more complicated, and type inference undecidable in general.

In light of this, I opted for a simpler approach: model functions as multiple-input, multiple-output, and get rid of such combinators as compose and apply that need to talk about stack types. These can be replaced with a handful of fixed-arity equivalents: to apply unary functions, binary functions, and so on.

The process for composing two function types (a → b) and (c → d) is simple:
1. Pair the values produced by b and consumed by c, and ensure they match.
2. Let x be those values consumed by c, not produced by b.
3. Let y be those values produced by b, not consumed by c.
4. The type of the composed function is x + a → y + d.
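For illustration only (my own toy encoding, not Kitten's: a type is an array of type names, bottom of stack first, and unification is reduced to a plain equality check), those four steps might be sketched like this:

```javascript
// Compose (a -> b) with (c -> d), where each side is an array of
// type names, bottom of stack first.
function composeTypes(a, b, c, d) {
    // 1. Match what b produces against what c consumes, top-down.
    var n = Math.min(b.length, c.length);
    for (var i = 1; i <= n; ++i)
        if (b[b.length - i] !== c[c.length - i])
            throw new Error("cannot compose: " +
                b[b.length - i] + " vs " + c[c.length - i]);
    // 2. x: values c consumes that b does not produce.
    var x = c.slice(0, c.length - n);
    // 3. y: values b produces that c does not consume.
    var y = b.slice(0, b.length - n);
    // 4. The composition has type (x + a -> y + d).
    return { input: x.concat(a), output: y.concat(d) };
}

// Composing "2" (-> Int) with "+" (Int Int -> Int):
// composeTypes([], ["Int"], ["Int", "Int"], ["Int"])
// gives { input: ["Int"], output: ["Int"] }, i.e. (Int -> Int).
```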

## Next Time

In a future post, I’ll talk about some of the more low-level details of Kitten, particularly the design of the upcoming VM. As the language takes shape, I’ll also begin offering tutorials and explanations of how everything comes together to write real-world software. Until then!

## 17 February 2013

### Computers Aren’t Fast Anymore

Hello, doctor. I am only 22 years old, but I think computers used to be faster. Why do I feel this way? Can you help me?

Well, it couldn’t be CPUs. Everyone knows that chips were once getting twofold faster every couple of years. That is slowing down a bit now. But it couldn’t be GPUs, either. They’ve seen impressive leaps in speed too.

Hmm? Are we developers to blame? No, I don’t think so. We’re getting better and better at operating our newfangled machines, to take advantage of their futuristic capabilities. At the very least, the smart folks are writing languages and frameworks to bestow that power on the rest of us.

Of course, the extent to which most developers see this innovation is in taking embarrassingly data-parallel problems and slapping them on a GPU. But that’s something, isn’t it, doctor? Memory’s faster, CPU caches are bigger, and hard disks are faster than they used to be. SSDs are even faster than that.

Tell you about my past? Okay…

At my high school, there were these run-of-the-mill Windows 2000 machines. We programmed on them in VB6. And let me tell you, for all its downsides, VB6 was screaming fast. These crappy amateur applications were downright speedy. Everything was.

When I use a computer now, I don’t feel that way. I don’t feel good about the speed or crisp responsiveness of applications. Not on a desktop, not on a high-end laptop, and especially not on a mobile device. And being that my job includes developing software for mobile devices, I have messed around with a great many of them.

I was deeply concerned by this. So I sat and I thought. Hmm. And it dawned on me: I don’t use real applications anymore.

I write this very psychiatric confessional blog post within a WYSIWYG editor running inside a web browser. (Sorry, doctor, but you’re just a figment of my overactive rhetorical imagination.) In the browser lies the problem. Almost all of the erstwhile proper applications that I use on a daily basis are now web applications.

Wait. Hold up. Give me a minute to breathe. We’re on to something.

All of the (user-facing–type) applications I use. They are running in a web browser. Which runs HTML and JavaScript. In case you haven’t heard, or you’re thinking of a different thing, we’re talking about HTML the document markup language. JavaScript the dynamically typed scripting language.

What. What. This is not okay. This is so beyond not okay that I have difficulty articulating the manner in which it is not okay because I feel like it should be obvious and why aren’t people freaking out? I’m freaking out.

Calm down? Okay, doc, I’ll try.

[Minutes pass, glacially.]

I guess the problem that I have with this is that programs aren’t…programs anymore. They’re not running on my hardware. They’re running umpteen layers removed from my hardware. They exist on this fundamentally different substrate, one that can’t be efficiently implemented on the real machines that we’ve spent so much effort making fast for real programs. It’s insane.

The things that are running natively are running on underpowered hardware, such as mobile games. And still a great number of them rely on internet services to function properly. Like a parasite, no, like an organ. Everything is part of this enormous whole, of which I am somehow not a part, and I wanted the brain and the muscle but you handed me a kidney.

Developers? No, I don’t fault them at all. You’ve got to have fun, and you’ve got to make money, and you’ve got to make programs. That’s just how it is. Everybody but me seems to have accepted a long time ago that it’s okay for an application to take this many milliseconds to display a character on the screen, or that many milliseconds to scroll. Or very many milliseconds stoned on TCP, roundtripping to a server for my data.

And my poor data. It wasn’t safe on the ground, so we sent it to live in the cloud with foster parents. Now I don’t know where it is, but it isn’t here. It’s holed up in my web applications, and it never comes to visit.

So do you think you can help, doctor? Is it just me? Should I adapt to this brave new world and wait for the award-winning silicon implementation of JavaScript running alongside an HTML coprocessor? Can I have my fast back then?

In the meantime, I guess I’ll continue to sacrifice a fraction of a second off my life here and there for the convenience of using a web program.

Rendering to a canvas.

Running in a browser.

Rendering to a screen.

## 24 July 2012

### So I write compilers for a living now.

I’m a regular visitor of Stack Overflow, and have a Stack Overflow Careers profile. A month ago, I saw a job listing for Spaceport.io, a YouWeb-funded company that makes a tool for converting Flash applications to run natively on several different platforms. A key component of their technology is a compiler, written in Haskell, from ActionScript 3 to the Spaceport runtime.

Since I enjoy working in Haskell, and compilers and games are two of my favourite things, I readily applied. After phone and Skype interviews, I flew out to California for a day of in-person interviews. Much as I like flying, it’s no fun to spend more time in the air than at your destination. Lesson learned.

Soon afterward I received an offer, accepted, and in a few short weeks found myself moving across the country from southern New Hampshire to the Bay Area. It’s been a great adventure so far, with plenty of good people, good food, and endless opportunities to practice my Chinese. :)

## Stack Overflow Careers

When it launched, I believed Careers would soon become the go-to place to hire developers. Unfortunately, none of the job offers I actually received were the right thing for me. Most were a mediocre, uninteresting kind of bad, not the hilariously terribad stuff of great blog posts.

But it was well worth the wait. I got a job on the basis of reputation and merit, not just a résumé. And now I can happily say I make my living off compilers, which is what I want to be doing for the foreseeable future. The coolest part is that I can apply much of the theory I learned while doing language research in college—to solve practical problems.

From the company’s side, Careers is a win as well. You get many good developers looking at your job listing, for a downright minuscule fraction of the financial and time cost of a recruiter. When you get an application from a developer, all of the information you could possibly want to know about their skills and history is right there in their public profile.

So, fellow devs, if I’m ever in a position to hire you, it will probably be through Stack Overflow Careers.

Since I started using it professionally, my opinions on Haskell have definitely evolved. It’s a well designed language, to be sure, but till coming to Spaceport I hadn’t really had the opportunity to explore a large project. Here are some of the things I’ve learned while using Haskell in production.

I was skeptical that the language could scale, not only to a large project but to a complex one. It’s not that I was doubtful, exactly. It’s just that languages that perform well for small code don’t necessarily work well for big code, and vice versa. You can’t know unless you test.

After working with a roughly 10,000-line codebase for several weeks, I’m now convinced that Haskell is excellent for coding in the large. With good unit tests in place—we have about four times as much test code as application code—I was able to productively make changes and feature additions starting on day one.

A Haskell codebase seems to remain readily comprehensible well beyond the size where, say, a C++ codebase starts becoming unwieldy; it’s maybe a 2:1 ratio. Functional languages are good at factoring, too, which is basically the essence of abstraction and the key to managing complex code.

Still, you must be diligent about it. Using Haskell does not automatically make you a better programmer, although the people who learn Haskell well do seem to be better programmers for it—if only because Haskell is pretty progressive as languages go.

Proponents of any language often talk about increased productivity. I do feel more productive in Haskell—in terms of the amount of code I have to write in order to add and test a feature—than in most any other language. But I can’t exactly quantify that, so I invite you to try it for yourself.

LOC is funny in that it’s simultaneously an important metric and a completely useless one. Language designers tend to be in agreement, more or less, that syntax matters—that good syntax does wonders for the effectiveness of language as a tool of thought. Naturally, they disagree about what actually constitutes “good syntax”. There might be something more to it, or it may just be a matter of taste that Python looks good to Alice and APL looks good to Bob.

Regardless, Haskell looks good to Jon. It’s pleasant to read—succinct, but not so dense that reading it is a chore. As is the case for any language, tutorial code looks nothing like production code. In Haskell you also see many academic papers where the style is very mathy—one-letter names, excessive abbreviations—and that style doesn’t scale at all beyond 200 lines. You need good names, no matter what.

## Static Typing? More. Gimme.

Haskell’s type system is powerful enough that type signatures can tell you a good amount of useful information as to what a polymorphic function does. The “polymorphic” part is important: specific types naturally have much broader interfaces than simple type variables. You can easily say that a function of type (a → [a] → [a]) has only a few possible sane definitions. (Well, okay, countably infinitely many, but still.)

1. f x xs = x : xs
2. f x xs = xs ++ [x]
3. f x xs = intersperse x xs

Whereas you know considerably less about what a function of type (Double → Double → Double) might do, because the set of floating-point numbers, and functions operating on them, is so large. The type system makes the trade-off explicit: the more domain-specific your functions are, the fewer guarantees you have about them.

In practice what this means is that, say, glancing at the type signature of a poorly-named function can give you a decent clue about what it ought to do. Or if you need a utility function, you can search Hoogle for a plausible type signature and usually find what you need. No other popular language’s API search comes close to matching that kind of utility. Thanks to Hackage, new libraries are very easy to discover and integrate into a project.

Where Haskell really does shine is in building reliable software. Static typing, unit testing with HUnit, and property testing with QuickCheck let you easily make a lot of guarantees that your code does what it’s supposed to. This is especially valuable for compilers, which are essentially (very large) pure functions, and thus very much amenable to such testing.

Testing definitely requires more up-front thought, but the end result is more reliable software—and that’s synonymous with better software, because good software is, for lack of a better term, obedient. Except maybe for AI.

Haskell lets us have a near-complete, robust AS3 compiler in under 10,000 non-blank lines of not-particularly-terse code. And that’s saying something: ActionScript is not what anyone would call a “pretty little thing”. And you don’t even know the half of it.

## Conclusions

• Compilers are fun and I will write them forever.
• Haskell is good and I will use it for a good long time.
• The Bay Area is a cool place to be, and I’m glad I moved here.

For those interested in my language work, an update is coming soon. Till next time.