12 November 2011

Languages versus Implementations

In the world of programming languages, there is a clear distinction between the language itself, and the implementation of the language. There is only one abstract Python, but there are many concrete implementations of Python. Only one Haskell, but several implementations thereof.

However, this is a false dichotomy. Each implementation is necessarily different from the others, mainly because implementing an identical specification in identical fashion is boring and doesn’t solve any problems.

Outside of those implementations we make as a learning process, we programmers typically create new implementations of existing languages in search of better performance or better user experience. I don’t expect anyone to agree with me that changing the performance characteristics of a language is the same as a change to the language, but that is what I believe.

More importantly, the implementors of a language will often try to make their implementation stand out as more productive, more useful to the programmer, by adding extensions. Whether or not these extensions are intended for widespread adoption by other implementations, and whether you call them extensions or pragmas or whatever else, the effect is the same. A program that is accepted by one implementation is either not accepted by another, or it runs differently under the different implementations.

The moment this happens, the two implementations are implementing different languages—with a lot of overlap, certainly, but nevertheless distinct. This divergence happens inadvertently as well: because of bugs, one implementation will likely fail or produce incorrect output for a program that another implementation handles just fine, and vice versa. These are still, by the same reasoning, implementing different languages.

So next time you go to accuse someone of conflating the theoretical and the practical, consider that they may simply be ignoring theory for the sake of practicality.

03 November 2011

What’s My Best Time of Day?

Since I find myself quite suddenly in my last year of college, I now realise that I have access to something unique and oddly exciting: statistics about my college career. I decided to take a look at my grade data for the past few years, and see if I couldn’t discern a few things:

  1. How much did my inability to get up early affect my grades?
  2. What are the hours at which I am most capable of productive work?
  3. How much does what I’m learning affect my grades?
  4. In what seasons am I most productive?
  5. What is the best amount of work per week for me?

The results were really interesting. First, I grouped classes by the time the class started, and calculated the average weighted GPA (0.0–4.0) for each group. The chart below shows that I am neither a morning nor an evening person—midday and midafternoon are when I do my best work, but right after lunch, I’m apparently just as useless as in the morning.

Next, I grouped classes by my subjective impression of how much I learned. This was less interesting: like most people, I do best when I’m learning something, but also fine when the work is easy.

Then I discovered something quite counter to my expectations. When I grouped my grades by season of the year, I had thought winter would be my worst quarter, because of the stress and lack of sunlight. It turns out that my grades tell a different story:

I don’t know why this is, but my hunch was that I work better under stress. Autumn and spring are my favourite seasons, when I feel happiest. To test this hypothesis from a different angle, I grouped quarters by number of hours per week of class time.

Apparently, when given a sane amount of work, I do fine. Past a certain point I can’t keep up, but then, beyond that threshold of stress, my performance improves significantly. Really weird stuff.

I hope to be able to apply this information to make my life easier and perhaps get more things done. If you can obtain hard data about your work habits, you can derive a surprising amount of useful information about how you can improve them. That in turn will make it easier for you to be productive. Try it—you’ll like it.

19 October 2011

Programming Languages Suck at UX

I wish I could say that programming languages are notorious for their terrible usability—unfortunately, very few people seem to have noticed, and those that have noticed don’t seem to be particularly upset about it. Frankly, that pisses me off, so I’m going to rant about it, and hopefully give the problem some of the notoriety it deserves. I’d like to point out that this is by no means an exhaustive list of the (many, many) problems with programming languages!

(If you don’t have the time or desire to read all of the ranting, at least skim the boldfaced points. There’s some good stuff in there.)

The Humanity!

When it comes to any kind of functional design, user experience is not just some tangential concern: it is of the utmost importance. The moment you make a design choice that doesn’t benefit the user, you’re doing it wrong, and your design will be bad. I’ll get more articulate in a minute, but first I want to admonish all language designers. You are wrong and bad.

We’re all guilty of it. No one can design something useful that will satisfy everybody’s expectations. But we can avoid making language design choices that benefit the computer without benefiting the programmer—such choices are nonsensical and absurd.

If you’re not designing for humans, why are you designing at all?

That doesn’t mean we shouldn’t explore new computational paradigms, new ways of reasoning about and optimising programs. What it means is that we should do so in the spirit of making programming easier and more enjoyable, so that developers can focus on the problems that they want to solve, rather than the linguistic obstacles that get in their way.

♦ ♦ ♦

Non-linguistic Languages

I find it baffling that most designers of programming languages give no concern to the language aspect. We are linguistic creatures by nature, and we have strong intuitions about how language is supposed to work.

Larry Wall is a notable (partial) exception to the norm, appropriate considering he studied linguistics back in the day. Perl is what you might call a semi-naturalistic programming language. This has nothing to do with “natural-language programming”, which is largely bunk: natural language itself is not suitable for programming computers any more than Nahuatl is suitable for talking to a Welshman.

A naturalistic language, then, is one that exhibits usability characteristics of natural languages. I’ll wager that everyone’s first languages are either spoken or signed, followed shortly by the written forms (if any) of those languages. A second language can be much easier to learn if it is more similar to your native tongue, and I think this holds for programming languages as well.

We learn new things in terms of what we already know.

One of my favourite things about Perl (well, not Perl 6) is the fact that it has noun classes, inflected with sigils (basically typed dereferencing operators). $x (“the scalar x”) is a completely different thing from @x (“the array x”), which too is a completely different thing from %x (“the hash x”). Agreement in type is analogous to grammatical agreement in number or gender, which is very common in human language.

Different things should look different; related things, related.

Another thing Perl mimics in natural language is implicit reference. $_ is like the pronoun “it”, a default thing, the current subject of discussion, which in many situations can be assumed. Programming languages use explicit reference almost exclusively. In order to perform a series of operations on a value, the programmer must explicitly name that value for every operation. Oddly enough, this is even the case in the highly English-like Inform.

In one of the object-oriented languages of which everyone seems to be so fond, this could be as simple as keeping track of the current object “under discussion” and allowing it to be assumed where an object would otherwise be expected.

Languages should let us elide repetition.

One thing I like about concatenative languages such as Forth and Factor is that there is essentially no named state—you can create variables as a convenience, but at the heart of it, everything is just dataflow.

Computer languages are (characteristically) so wholly unlike and inferior to natural language that it’s almost comical to call them languages at all. You express ideas every day in your native tongue that have no analogue in any programming language in existence. In most languages, you can apply productive rules to derive new terms, whereas basically all programming languages are purely isolating with positional syntax.

Naturalistic programming languages will never be pretty. They are not minimal, or elegant, or simple, but despite all that, they are intuitive and useful. More importantly, they meet users’ expectations about how language is supposed to work.

Humans have expectations. Do not judge them; only exploit them.

All of the “ugly” features of natural languages evolved for specific reasons, and the designs that evolution has wrought can be borrowed to create better designs for our artificial languages.

♦ ♦ ♦

Erroneous Errors

Real-world languages have loads of redundancy, which greatly improves error recovery; a programming language with judicious syntactic redundancy can issue warnings instead of errors for imperfect but understandable input, improving the user’s experience.

If a compiler can deliver detailed diagnostic messages, then it must do so; if there is not enough information to provide a meaningful diagnostic, then it shouldn’t provide any more information than is relevant to the programmer.

If I write int f(X x), where X is an undeclared type, the compiler should not do what GCC does, which is to write the following:

error: expected ‘)’ before ‘x’

This tells me nothing about what is actually wrong, but in an effort to be specific, it gives me misleading specific information. If I were to insert ) before x, I would then get:

error: expected declaration specifiers before ‘x’

Followed by further, largely nonsensical errors that depend on what follows the declaration of f(). It should say either something both specific and helpful, such as:

error: use of undeclared type ‘X’ in parameter list of function ‘f’

Or, if that is not possible, then at least something not so specific that it becomes incorrect:

error: the parameter list of function ‘f’ is invalid

Errors should be useful, or vague enough not to be misleading.

This is not just an implementation issue. Languages are often structured in such a way that it is very difficult to provide meaningful error messages, because constructs are too context-dependent or informationally sparse to obtain a meaningful message from an appropriately narrow and targeted view of the source.

♦ ♦ ♦

Terrible Typography

Most programmers don’t seem to care about legibility. If they did, they would probably be angry that we still use fixed-width fonts for programming. Our programming languages aren’t designed in such a way that they look any good in proportional-width fonts, and monospaced fonts are needed to carry out tabular formatting in editors that don’t support the sanest known way to handle tabulation.

Sigh. At least most of us agree that it’s a good idea to keep code width low. It is a bit sad, though, that we call the rule of thumb the “80-column rule”.

I yearn for the day when we measure code width in ems.

Monospaced fonts arose in the first place due to technical limitations: first, because typewriters could only move a fixed distance per character typed, and second, because it was easier to address characters on a graphical display as a regular grid of fixed-size sprites. Fixed-width fonts are a typographical oddity that survived through tradition and little else.

Programming notation is the way it is primarily because of the fallacy that programming is mathematics. In reality, writing software is also very much like, well, writing.

Programming notation should evoke mathematics and prose alike.

In prose, for example, punctuation merely explicates the structure and organisation of the text, and gives hints as to cadence and prosody. In programming languages, as in mathematics, punctuation abbreviates common operations—but it is also abused to stand in for structures that, in mathematical notation, would ordinarily be indicated with richer formatting.

The formatting problem is a side-effect of the fact that we still live in the Stone Age when it comes to our notational character set. No offense to ASCII, but it was already showing its age in the late 80s when Unicode showed up.

But thanks to the peculiar tenacity of fixed-width typefaces, most programming languages can still be comfortably written on a 50-year-old typewriter. How’s that for backward compatibility?

♦ ♦ ♦

Impossible Input Methods

On account of the limited circa-1960 palette of characters we use in our languages, we’re constantly making notational compromises, approximating glyphs with digraphs and trigraphs. Should “<=” really mean “≤” when it actually looks like “⇐”?

When I tutored computer science, I saw students write => instead of >= many a time, because they expected the conceptual reverse of <= to be the visual reverse of it, as with ≤ and ≥. Similarly, the students frequently confused the left and right sides of assignment—i.e., they would write y = x when they meant x = y—because with “=” there is no visual indication of the direction of assignment, nor of the fact that mutating assignment is not the same as mathematical equality.

Why don’t we use hyphens for hyphenation, minus signs for subtraction, and dashes for ranges, instead of the hyphen-minus for all three? Why do we use straight quotes ("") when curved quotes (“”) can be nested (or contain straight quotes) without escaping? There’s a wealth of time-tested typographical convention in both mathematics and prose that programming language designers simply discard without a second thought. Without a first thought.

Throw away traditions that don’t work; respect those that do.

These compromises would be totally unnecessary thanks to Unicode, but our editors and input methods lag so far behind that it’s still infeasible to comfortably enter many non-ASCII characters without dedicated editor support.

And no such support exists, because a language that uses “->”, which can be written in any editor, is more marketable than one that uses “→” and relies (however little!) on outside tools. On Linux I can use a compose key, which is only a mild inconvenience, but on Windows I’m stuck with Alt codes or Character Map.

We value terseness in a language because it increases the information density of the code, so we can work with more meaning at once, both per screen and per brain. Punctuation symbols are generally the most concise way to express a concept, and some symbolic notation in a language is absolutely good, insofar as it reduces cognitive load and eye movement. But the legibility of a punctuation-heavy language suffers just as greatly as that of one without any punctuation at all.

♦ ♦ ♦

The Fear

Perhaps the biggest problem with programming language design is that, because it is so bad, people are afraid to use tools that can help them. They are afraid of the whitespace sensitivity of Python, because they assume it will violate their expectations, and therefore cause them a hassle. In reality, whitespace sensitivity in Python and Haskell are totally innocuous, and actually quite helpful—because they are designed with some thought behind them.

When you only know the bad, you quietly assume all things are bad.

It’s no wonder people stick so tenaciously to a single language or language family. Their expectations were violated time and again when they were learning how to program, so they assume that learning a new language is always like that—and sadly, it often is. They don’t want their expectations to be violated again, so they stick to what they know. Even if it’s bad, at least it’s familiar.

We need to get rid of that kind of thinking—not just so that language design can move forward, but so that we can quit worrying about languages and get shit done.

15 October 2011

Tabs vs. Spaces? I’ll Do You One Better

Now, I don’t ordinarily like to get involved in holy wars, because arguing about programming serves largely to waste time that could be spent actually programming. But I’m making an exception in this case in order to make, shall we say, a modest proposal.

While doing research for an article, I rediscovered elastic tabstops, which are a fairly sane way of managing indentation and tabulation, especially if you like to use proportional-width fonts for programming. Unfortunately, it’s not as widely implemented as it ought to be, and the reference implementation (a gedit plugin) leaves something to be desired in terms of efficiency.

The search for an Emacs implementation (which, sadly, proved fruitless) led me to all kinds of juicy “discussions” concerning the merits of tabs versus spaces. There are many arguments for both, but summarising and responding to them are beyond the scope of this article.

The peculiar thing to me was that people kept stressing the semantic nature of the tab character versus the space character. In typewriter terms, spaces are for moving the cursor a fixed width, while tabs are for moving until the next tabstop.

On a typewriter and in word-processing programs, you can easily change the global positions of the tab stops, but formatting code is a subtler problem, and I think, as long as we’re sticking to tabs versus spaces, elastic tabstops are the best solution.

But who says we must stick to just tabs and spaces? There are plenty of other ASCII control characters that just sit around doing nothing these days; why don’t we repurpose them? As long as we’re thinking semantically, the ASCII control characters offer some interesting possibilities in the way of code formatting and semantic markup:

  • HT (Horizontal Tabulation) would be used only for formatting of tabular data, in the same manner as elastic tabstops—but never for indentation.
  • VT (Vertical Tabulation) would serve to terminate rows in tabular data, allowing table cells to vary in height, e.g., by containing linefeeds.
  • SI (Shift In) and SO (Shift Out) could be repurposed to mark increases and decreases, respectively, in indentation. This has the benefit of not requiring indentation characters to be repeated at the start of every line, but the redundancy of doing so would improve degradation for tools that rely heavily on indentation, such as Python.
  • BS (Backspace) would appear at the beginning of a line to “dedent” code such as a line or case label in C, which is subordinate to its parent, yet not equal to its siblings.
  • FF (Form Feed) could separate logical sections of code, and thus be used like the “region” markers available in many IDEs. You still see some old source files with this convention from time to time.
  • US, RS, and GS (Unit, Record, and Group Separators) could provide further logical division of code, or semantically mark up data types and their related operations.

And that’s just in the C0 range. Consider the possibilities with some control codes from the C1 set:

  • BPH (Break Permitted Here) would indicate where lines should be wrapped before other text-wrapping methods are applied. This would eliminate the need to manually insert linefeeds to break long lines.
  • NBH (No Break Here) like its cousin above, would provide an otherwise absent means of indicating that no line break is to be inserted at the current position.
  • SSA and ESA (Start and End of Selected Area) could be used by editors to save the current selection, preserving editor state in a central location in the file.
  • HTS and VTS (Horizontal and Vertical Tabulation Set) would cause a tabstop to be set at the current position, overriding the automatic behavior of the elastic tabstops.
  • HTJ (Horizontal Tabulation with Justification) would produce a right-aligned rather than left-aligned field in a table.
  • PLD and PLU (Partial Line Down and Up) could automatically produce rich formatting for subscripts and superscripts.

Imagine just throwing a few simple control characters in your files, and being able to interact with beautifully typeset, richly formatted code, with all styling defined by local CSS so you never have to see code formatted in a way you don’t like.

Control characters have the advantage of being immune to interpretation by a lot of tools. Existing editors and compilers would need only minimal changes to accept code with semantic markup. An appropriately designed editor could even convert code on the fly for backward compatibility with unmodified compilers.

I hope you get that I’m not really serious about this—an earnest proposal for something as broad as “inline language-agnostic semantic code markup” would need to take a lot more into consideration than some old telecom codes.

I think many would agree that the idea has some merit, but few would agree on any particular implementation of it. Then again, who knows? Literate programming has its adherents, and that’s not so very different in spirit.

(Oh, and for the love of all that is holy, don’t anybody get the idea that comments containing HTML or LaTeX or whatever would be a good solution. Just try it and see how far you get.)

08 October 2011

Please Write More Crappy Productivity Articles

Every article about productivity is either crappy or excellent. If it’s excellent, I read it all the way through in order to absorb every detail about how I can get more things done—in doing so, I get nothing done. If, however, the article sucks, then I quit reading it immediately and go work on something I care about.

For me, then, good articles about productivity are usually counter to my productivity, and crappy articles are helpful. I’m so disgusted with them that I can’t help but want to work on something to counteract the time I lost by starting to read them.

Therefore, bloggers, please write more crappy productivity articles, so I can get shit done.

Regexes Parse XML Just Fine, Actually

Despite what you may believe, or may have heard from a particular famous Lovecraftian answer on Stack Overflow, you can actually use regular expressions to parse arbitrary XML. Of course, they’re not regular expressions in the strict mathematical sense, but rather in the sense with which most of us are familiar—Perl regexen.

This relies on a seemingly little-known but tremendously handy feature of Perl regular expressions: (?N) will recursively match an instance at the current position of whatever is in capture group N. This lets you recursively match delimiters, limited to a depth of 50 unless you build yourself a special Perl.

The only bit you can’t do in the regular expression engine itself is assert that the names of two matching tags are equal, because you can’t match backreferences in recursive submatches. This isn’t a problem if you assume your input is well-formed. I’m talking about parsing XML, not checking whether some input actually is XML. Correctness is a Boolean, after all: invalid XML is not XML.

I wrote a quick Perl script to demonstrate this using a number of test strings, both those well formed and those less than perfect. This was in part because I feel like doing “impossible” things lately, and in part because I wanted to brush up on my Perl. The point of it is not to be a good solution, but rather the opposite: there is a certain joy in getting something to work in completely the wrong way.

Anyway, the body of it goes like this:

my $element = qr{ ( $STag ( $CharData? (?: $Reference | $CDSect | $PI
    | $Comment | (?1) )* $CharData? ) $ETag ) }sx;
                 # ^
                 # The morsel that matches $element recursively.

my @stack = '<html><head><title>Title</title></head><body><h1>Lies.</h1>'
    . '<p><i>You</em>, my friend, have been told them.</p></body></html>';

$" = "\n\n";
while (@stack) {
    my $string = shift @stack // '';
    my @groups = $string =~ m/$element/g;
    print "@groups\n\n" if @groups;
    unshift @stack, map {
        s/^<($Name)(?:$S*$Attribute)*$S?>//; my $a = $1;
        s/<\/($Name)$S*>$//;                 my $b = $1;
        $a eq $b ? $_ : undef
    } @groups;

This recursively enumerates all tags and their contents in some rough semblance of hierarchical order:
  • <html>…</html>
  • <head>…</head><body>…</body>
  • <head>…</head>
  • <title>…</title>
  • <body>…</body>
  • <h1>…</h1><p>…</p>
  • <h1>…</h1>
  • Lies.
  • <p>…</p>
  • <i>…</em>, my friend, have been told them.
  • <i>…</em>
  • You
  • <title>…</title>
  • Title
Here for your enjoyment is the script in its entirety, which ought to match all matched tags that are valid according to the XML specification, including weird tag names, comments, <![CDATA[...]]>, and all that other fun stuff.

I’m admittedly unsure of whether the rules involving lookahead and lookbehind actually match the spec, but it was what came to mind. The rules in question are $CData, $CharData, $PITarget, $PI, and $Comment (but they seem to work okay from what little testing I’ve done).


use warnings;
use strict;

my $S = qr{ \x20 | \x09 | \x0D | \x0A }x;

my $NameStartChar = qr{ : | [A-Z] | _ | [a-z] | [\xC0-\xD6] | [\xD8-\xF6]
    | [\xF8-\x{2FF}] | [\x{370}-\x{37D}] | [\x{37F}-\x{1FFF}]
    | [\x{200C}-\x{200D}] | [\x{2070}-\x{218F}] | [\x{2C00}-\x{2FEF}]
    | [\x{3001}-\x{D7FF}] | [\x{F900}-\x{FDCF}] | [\x{FDF0}-\x{FFFD}]
    | [\x{10000}-\x{EFFFF}] }x;

my $NameChar = qr{ $NameStartChar | - | \. | [0-9] | \xB7
    | [\x{0300}-\x{036F}] | [\x{203F}-\x{2040}] }x;

my $Char = qr{ \x09 | \x0A | \x0D | [\x20-\x{D7FF}] | [\x{E000}-\x{FFFD}]
    | [\x{10000}-\x{10FFFF}] }x;

my $Name = qr{ $NameStartChar $NameChar* }x;

my $EntityRef = qr{ & $Name ; }x;

my $CharRef = qr{ % $Name ; }x;

my $Reference = qr{ $EntityRef | $CharRef }x;

my $AttValue = qr{ " (?: [^<&"] | $Reference )* " }x;

my $Attribute = qr{ $Name $S? = $S? $AttValue }x;

my $STag = qr{ < $Name (?: $S* $Attribute )* $S? > }x;

my $ETag = qr{ </ $Name $S? > }x;

my $CDStart = qr{ <!\[CDATA\[ }x;

my $CDEnd = qr{ \]\]> }x;

my $CData = qr{ (?<! $CDEnd ) $Char* (?! $CDEnd ) }x;

my $CharData = qr{ (?<! $CDEnd ) [^<&]* (?! $CDEnd ) }x;

my $CDSect = qr{ $CDStart $CData $CDEnd }x;

my $PITarget = qr{ (?! [Xx][Mm][Ll] ) $Name }x;

my $PI = qr{ <\? $PITarget (?: $S (?: (?<! \?> ) $Char* (?! \?> )))? \?> }x;

my $Comment = qr{ <!-- (?: (?: (?! - ) $Char ) | (?: - (?! - ) $Char ) )*
    --> }x;

my $element = qr{ ( $STag ( $CharData? (?: $Reference | $CDSect | $PI
    | $Comment | (?1) )* $CharData? ) $ETag ) }sx;

my @stack = '<html><head><title>Title</title></head><body><h1>Lies.</h1>'
    . '<p><i>You</em>, my friend, have been told them.</p></body></html>';

$" = "\n\n";
while (@stack) {
    my $string = shift @stack // '';
    my @groups = $string =~ m/$element/g;
    print "@groups\n\n" if @groups;
    unshift @stack, map {
        s/^<($Name)(?:$S*$Attribute)*$S?>//; my $a = $1;
        s/<\/($Name)$S*>$//;                 my $b = $1;
        $a eq $b ? $_ : undef
    } @groups;

I would call this a good (okay, slow and overspecified, but still kinda neat) practical solution to a problem that’s theoretically unsolvable. That, incidentally, will be the theme of many of my upcoming posts, so stay tuned if you’re into it.

06 October 2011

Correctness is a Boolean

The most misleading thing about the way programming is taught is how programs are graded. Students are used to receiving a score that represents the sum of the correct parts of an assignment. You can think of it start as starting with a zero and earning points up to a maximum of 100, or as starting with 100 and losing points from there. As for me, I prefer to simply avoid thinking about it.

But when it comes to software, this kind of grading makes almost no sense whatsoever. Especially in the imperative languages that are generally taught to beginning students of programming, the effect of a statement depends crucially on the effects of the statements preceding it, and not just lexically, but temporally.

An error in a single line of a program is not necessarily local to that line, but more likely has effects that propagate forward in time or outward in an expression to cause further errors. By grading programs subjectively according to how many of the statements seem appropriately designed and sequenced, teachers influence students to treat programs as though their objective correctness is dependent on the mere sum of the subjective correctness of their parts.

Beginning programmers often have difficulty thinking above the level of individual statements in order to develop a high-level understanding of program structure and meaning. Professors don’t need to further their confusion by giving them a skewed representation of what makes a program right.

A program ought to be like a proof: a single flaw, and it’s not correct. “Correct” means that it does everything it ought to, and nothing it oughtn’t. That says nothing about the amount of work required to make an incorrect program correct, because that’s entirely subjective; it could be a difference as small as a single character, or as great as the entire program. The changes needed are not necessarily local to individual statements, nor even comprehensible at the statement level.

That’s why, if I ever end up teaching programming, I’ll grade my assignments on the basis of correctness and subjective style separately, and draw a clear distinction between the two. If your program is correct, you’re guaranteed a passing grade. All the points beyond that you gotta earn by writing elegantly. It’s only fair.

28 September 2011

How to Be Productive

Blindingly Obvious Truths

This is how to be productive. Most of these things are absurdly, staggeringly obvious. That’s kinda my style. Anyway, since not very many of us are the kind that get shit done, we apparently need to be reminded every once in a while. Pin this to your wall—pin it to all of your walls—and look at it to give yourself a mental kick in the ass when you know you should be working.

Get out of bed.

If, like me, you have delayed sleep phase syndrome, then fix it. If it’s hard to get up when the sun comes up, then your sleep schedule is probably not healthy, and it is negatively affecting your productivity. We are a diurnal species. Deal with it.

Be healthy.

It ought to go without saying that you need good diet and exercise habits to be healthy. What you may not realise is that health is happiness, and for many of us, happiness is productivity. (I can’t say all, because surely there are some dark tormented programmers out there whose work is fueled by the darkness and torment in their dark tormented souls of tormentful darkfulness. Or something.)

For instance, when my blood sugar is low, I hate the world—then I eat a sandwich and everything’s fine. If you feel like hell all the time, your body is probably trying to tell you something. Sleep right, eat well, exercise occasionally, spend time with friends, find a mate, and mate with them often.

Work on one thing.

Multitasking is doing many things at once—badly. Innumerable studies have shown that multitaskers are just fooling themselves into believing they’re improving their productivity. You can be a true “jack of all trades” if you work on one thing at a time. Don’t distract yourself from what you’re working on with thoughts of what you’re not working on. It’s an endless cycle of self-subversion that can only be broken with focus.

If I want to write a blarticle, I just do it. If I need to do homework, I just—well, okay, first I complain a lot, but then I just do it. I didn’t post a vlog entry for months, and then I said “no, this will not do” and made something. It was shitty. It doesn’t matter. I did it.

Work till you’re done.

Have a project that needs finishing? Don’t waste your valuable time making small changes. Get your hands dirty and implement that feature you want. No one else is going to do it, and you have the time. It probably won’t even be as hard as you think. If it seems guaranteed to be difficult, you might be fixating on a difficult solution when there is a much simpler one. Reflect.

When you’re done, stop.

You must take time to relax and play. If you work hard, you earn the right to relax hard. You will feel the profound satisfaction that comes from creating something through your own focused efforts, and your sleep will be easy and deep.

Now go, and do.

27 September 2011

Simulation is Abstraction is Power

A more abstract programming language is more powerful. This is one of precious few things that most software developers agree on. C is more powerful than assembly, C++ is more powerful than C, and Lisp is more powerful than most developers realise—or will ever take advantage of. And that’s fine.

But whence cometh abstraction? Sure, you can say that all abstractions are generalisations of some category of operations or structures, but that’s shallow and basically useless. There’s more to it.

Abstraction comes from simulation. Computers run simulations of idealised computation; more abstract languages are better at simulating the ideal. Low-level languages deal with real hardware, while high-level languages simulate ideal hardware. A few examples:

  • References simulate unstructured memory.
  • Garbage collection simulates unbounded memory.
  • Dynamic typing simulates lack of typing.
  • Threads simulate parallel execution.
  • Tail recursion simulates unbounded recursion.

(Depending on how far you want to go with the analogy, recursion simulates a whole bunch of things, including self-reference, fractal structure, and induction.)

Simulation is about removing dependency. A program can depend on manual memory management, but it cannot, by definition, depend on garbage collection. With memory management fully abstracted away, memory management ought to be a nonexistent concept from the perspective of the language.

Your program probably pretends it has infinite memory anyway. Provided your program doesn’t crash, the system does have unbounded memory as far as you know. Most programs that are working properly don’t run out of memory unless they’re deliberately designed to, for lack of a better phrase, fuck shit up.

Similarly, as we all know, threads may actually run on separate cores, and thus truly in parallel—that’s sorta the point. But the thing about a language that offers powerful abstractions for parallelism is that you don’t need to care. It’s taken care of. As far as you need to know, the system can perform an infinite number of computations at once.

Abstractions are (imperfect) simulations of limitlessness. As a language designer, you can (and should) offer pragmatic limitations and safeguards for the benefit of the fallible developer. But to make a more powerful language, you must remove limitations. When you do so, you discover the nature of the underlying simulation—and simulation is abstraction is power.

22 September 2011

Tricky Programming Concepts Aren’t

What Teachers do Wrong

Teachers use subversive language. They say things like “this is a bit tricky” before introducing a concept. It’s intended to remind you to pay attention, but mostly it turns students off—it says “this is hard; I don’t expect you to understand it”.

Frankly, that’s bullshit, and it has appeared—and caused problems—in nearly every class I’ve ever taken. It’s especially visible in mathematics and the sciences, but foreign language, social studies, literature, and even drafting are not immune.

The fine arts, notably, do avoid it: it’s not conceptually difficult to improve your process and observational skills—and no art teacher I’ve ever encountered pretends it will be anything other than hard work and dedication. (Paul Graham got this one right: hacking and painting are very similar.)

I may have an irresponsibly intertwingled worldview, but it seems to me that most things are not conceptually difficult. At all. Even the most high-level concept, explained correctly, can be absorbed and applied by any person of average intelligence. The important thing is the explanation.

Learning should not be about overcoming difficult ideas: it should be about realising how simple those ideas actually are.

I think the main reason that fine art instruction is effective is that, when a student says “I can’t draw/paint/compose/&c.”, the teacher’s natural response is to say “you can—you must; just practice”.

And they do, and they get better. Most people are not going to become master painters or composers, of course, for the same reason that most people won’t become masters of anything. But perhaps more of the people who are capable of mastery would strive for it if we used encouraging, practical teaching methods like we know we ought to.

♦ ♦ ♦

What Programming is About

Programming is fundamentally about doing three things with a problem:

  • Discovery
    First, know what problem you need to solve. Finding interesting problems is half the fun of research. If you don’t mind solving uninteresting problems, you probably aren’t a very interesting programmer, or even very interested in programming.
  • Understanding
    Once you know the problem, you need to understand it deeply. You need to understand its subproblems, and their subproblems, all the way down to the units that make up your problem domain. (This is recursion.)
  • Expression
    When you understand the problem, you can express it. This is not just writing the code you designed earlier, as some “software engineering” evangelists will have you believe. Expression itself contains discovery and understanding, which leads to further expression. (This is tail-recursion. It’s also why the “waterfall” method is flawed.)

This is exactly like the process of essay writing: you discover your thesis, engage in research to understand your topic, then begin the process of expressing yourself in a logically connected essay. While figuring out your expression, you discover and understand further details and incorporate them into further expression.

An essay is ideally just a proof of a thesis; similarly, a program is ideally a solution to an intellectual problem. Both have practical effects in the real world, and at the heart of it, they really are the same thing.

♦ ♦ ♦

“Tricky” Programming Concepts

Two things are consistently taught in such a way that beginning programmers fail to properly understand them. Things that are so stunningly simple and pervasive that it’s baffling that anyone could fail to understand them.

I’m talking of course about pointers and recursion.

What baffles me is how people seem to be so inclined to analogy when the vast majority of programming concepts are stupendously literal.

A pointer points. A reference refers. These are the same thing.

(Don’t the C++ folks in the audience give me any crap about C++ references. They’re not magic, they’re non-nullable immutable pointers—just another reference type.)

The address of a thing is not the thing itself, but it is good enough (and often, in fact, better). I can steal anything in your house, break your windows, burn it down, and all I need to know is where your house is. I can use the words “burn your house down”, and even though I have not actually burnt your house down, you know exactly what I’m referring to.

Reference is a fucking pillar of language. I say “fucking” because it’s fucking important. We use finite-sized symbols to refer to concrete objects and abstract concepts alike, and it takes no more cognitive effort to say the word “apple” than it does to say the word “universe”. (Referring to something is a constant-time operation.)

Most people have no trouble understanding the concept of reference, of course—but then they fail to see why it’s important, least of all in programming. But again, you don’t even need analogies here—the explanations are comically, tautologically simple:

  • A linked list is a list of things that are linked.
  • A tree is a hierarchy of things that are linked.
  • A queue is a queue of things.
  • A stack is a stack of things.

If someone doesn’t understand these things, they’re probably trying too hard. To implement a linked list, just open your eyes and think about what the term means: it is a list, and at each place there must be a thing, and the address of the next thing if there is one.

That’s all.

Recursion is even more important to programming, even less understood by beginners, and yet even more visible in daily life. If you remember one thing, remember this:

References are for referencing, and recursion is for self-referencing.

A mind-squishingly huge number of natural things have fractal structure, from clouds to mountains to plants to coastlines to trees. A branch of a tree looks just like a whole tree, only smaller. Because of this, doing stuff to part of a tree is basically like doing stuff to a whole tree.

You can thus write instructions for doing stuff to entire trees very concisely: to snuggle a whole tree, first snuggle the base of the tree, then snuggle each branch as though it were a whole tree.

This works in less silly examples as well: to sort people by height, pick one, move all the shorter people to his left, and all the taller people to his right, then sort each side. Bam. You’ve just invented quicksort.

♦ ♦ ♦

Don’t Be Fooled

So I think Joel Spolsky got it severely, totally, irresponsibly wrong in his article The Perils of JavaSchools, wherein he basically claims that understanding pointers and recursion is an aptitude, not a skill.

Not only is it the same kind of subversive language that teachers mistakenly use, it betrays a deeply mistaken belief that the concepts of pointers and recursion are somehow inherently difficult.

They are not.

If you think all I’m saying is “programming is not as hard as people make it out to be”, then you’ve missed the point. Programming is hard, like writing and design, because it is intimately related to both. It’s just that the basics of programming are no harder than the basics of any other kind of design.

Pointers and recursion are among the most basic things in the world. The real world, the one in which we live—and that’s why they show up in computing in the first place.

♦ ♦ ♦

My sincerest thanks to all of the kind folks at Hacker News (and now /r/programming!) for reading, upvoting, commenting, and +1ing. Your many praises and criticisms are deeply appreciated, for giving me the motivation and humility to write even better content in the future.

17 September 2011

What’s Wrong with Lisp

Lisp is a brilliant language. You can do things in it that are impossible in other languages, and that’s wicked cool. But though I think there is an upper limit to language expressiveness, I don’t believe Lisp represents that limit. Here’s why.

A quote I see from time to time—and for some reason have been seeing a lot lately—comes from the author of perhaps my favourite book, Le Petit Prince, the marvelous Antoine de Saint-Exupéry:

“A designer knows he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away.”

In short, minimalism = perfection, and Lisp (well, Scheme anyway) favours it: uniform syntax, a few special forms, and then it’s turtles all the way up.

But even though I’d be the first to say that there’s a lot to take away from a thorough study of Lisp, there’s also a lot that could be taken away from Lisp itself, and for that reason I can only recommend it as an intermediate step in a programmer’s journey to linguistic nirvana.

See, this may come as a shock, but Lisp is not the most powerful language. Just as there are countless things that are ugly in most languages but mundane (if not beautiful) in Lisp, so too are there things that are ugly in Lisp and beautiful in other languages. They’re easy to see when you go looking for them.

It all boils down to the ability to concisely represent patterns. When a programmer in language X sees repetitive code, he refactors it using the tools available in X. To the programmer of some better language Y, even the most beautiful solution in X is imperfect, because he sees it as a workaround for the fact that X does not have what Y does. A Band-Aid solution, if you will.

Lisp macros are seen as the final word in abstraction and beauty. See repetitive code, make a macro, repeat. But aren’t macros the same kind of Band-Aid solution that exists in every language? Lisp just has nicer Band-Aids. (Ones with spaceships and unicorns on them.) And make no mistake, there is a language with nicer Band-Aids than Lisp.

To paraphrase Paul Graham: when you look down the spectrum of programming language power, you know you’re looking down; but when you look up, all you see are weird languages. Lisp programmers look down on all other languages—but what languages look weird to them?

I advise Lisp programmers to get off their pedestal and look through the weird, the esoteric, and the insane, to reach the power of what lies beyond. Question your assumptions. Why the oh-so-important parentheses? Simple: nesting! So why do you need nesting? To create scope. Why do you need scope? Variables, of course. So why do you need variables? To store values, obviously. What is the purpose of a value? To model state. Why do you need state? You don’t.

If you model everything in terms of data flow, for example, you no longer need state or any of the other things that come with it. You can write your program solely in terms of function composition. You don’t even need scoping (lambda abstraction) or application at all. You don’t need an absurd Haskellian type system to keep things functionally pure, either. 

The point is that lambda calculus is only the beginning. It is (for lack of a better term) the assembly language of functional programming. A language that directly models the operations of a Turing Machine is plainly rather boring, but what baffles me is that nobody seems to notice that the same is true of one that models lambda calculus. Lisp dresses it up, but only very little. We can do much better.

Let us take more away, to reveal what more is yet to be.

15 September 2011

How to Pronounce Hexadecimal Numbers

In 1999, the IEC introduced kibibyte and friends, because they were pissed off that kilobyte means 1000 bytes, but programmers wanted it to mean 1024 bytes and were insistently using it that way, correctness be damned. Unfortunately, nobody really gave a damn, and even though some things (Ubuntu for instance) show data sizes in KiB, nobody actually says “kibibyte”.

Someone asked recently on the Programmers Stack Exchange how hexadecimal numbers ought to be pronounced. In a valiant effort to pull an IEC and come up with a useful thing that nobody will use, here’s my system of pronunciation for hex numbers. The system is based on English (specifically American English) because it’s the lingua franca of programming. Enjoy.

Hexadecimal Number Name Decimal Equivalent
0x0000 0001 One 160 = 1
0x0000 000a A (ay) = ten 10
0x0000 000b B (bee) = eleven 11
0x0000 000c C (cee) = twelve 12
0x0000 0010 Tex 1 × 161 = 16
0x0000 0020 Twentex† = two tex 2 × 161 = 32
0x0000 0030 Thirtex† = three tex 3 × 161 = 48
0x0000 0040 Fortex† = four tex 4 × 161 = 64
0x0000 0050 Fiftex† = five tex 5 × 161 = 80
0x0000 0060 Sixtex = six tex 6 × 161 = 96
0x0000 0070 Seventex = seven tex 7 × 161 = 112
0x0000 0080 Eightex = eight tex 8 × 161 = 128
0x0000 0090 Ninetex = nine tex 9 × 161 = 144
0x0000 00a0 Tentex = ten tex 10 × 161 = 160
0x0000 00b0 Eleventex = eleven tex 11 × 161 = 176
0x0000 0100 One hundrex 162 = 256
0x0000 1000 One thousax 163 = 4096
0x0001 0000 Tex thousax = one thoutex‡ 164 = 65536
0x0010 0000 One hundrex thousax = tex thoutex 165 = 1048576
0x0100 0000 One milliox = one hundrex thoutex 166 = 16777216
0x1000 0000 One billiox†† = one thousax thoutex 169

† By analogy with “thirty”, “forty”, &c.

†† This system is based on the short scale—one billiox equals one thousax milliox, and one trilliox is one thousax billiox.

‡ I introduced the term “thoutex”, equal to 0x10000, because it’s more obvious that 0x0001 0000 is “one thoutex” rather than “tex thousax”: four-digit grouping makes more sense in hex, but three-digit grouping is the default in English, so I had to come up with something, and the hypothetical “thouten” sounded good in my mind to adapt.

Fractions are formed as in English, by adding -th: one texth, one hundrexth, one thousaxth, one thoutexth, &c.


  • Ten hundrex fourtex thousax, fifteen hundrex twentex and twelvetex thirteen hundrexths
  • Ay hundrex fourtex thousax, eff hundrex twentex and ceetex dee hundrexths
  • Ay milliox, four hundrex thousax, eff hundrex twentex and ceetex dee hundrexths

Happy pronouncing! :)

14 September 2011

Thoughts on Metaprogramming in Very

I'm currently implementing the metaprogramming system in my pet language, Very. The language is inspired by Forth and Lisp, both of which are known for their strong metaprogramming capabilities. Here's how metaprogramming works in Very.

Very is a stack-based, concatenative language: words (the concatenative analogue of functions) manipulate a stack of values to perform computation. Very macros are words that run at compile-time: while words usually manipulate values, macros usually manipulate code. The interpreter reads and parses the source, then repeatedly expands all macro invocations until none remain, before evaluating the result.

Very also has an analogue of Lisp's reader macros: in Lisp, you can specify a callback function that is executed by the source reader when it encounters some predefined character or group of characters. Unfortunately, reader macros must explicitly operate on a stream object, whereas proper macros operate solely on S-expressions.

A Lisp reader macro can function like an ordinary macro by calling read on the stream it receives, to retrieve S-expressions rather than characters, but this interface is awkward, and offers no facility for performing manipulations at the raw token level.

In Very, you can declare that a particular sequence of characters forms a token, which is then treated by the reader as an indivisible unit. Very reader macros are simply ordinary macros, whose name may have been declared as a token. Macros have the ability to manipulate the incoming queue of source at whatever level they choose—there are words for reading (and unreading) terms, raw tokens, and characters. A character is just a number; a string, a list of numbers; and a term, a number or list of terms. It's that simple.

So Very offers better abstractions than Lisp for source manipulation madness: you could implement any language entirely in Very macros, and have it run under the Very interpreter, and doing so would be a lot less painful than in Lisp. The Very web framework will make extensive use of macros to allow arbitrary HTML and CSS to function as ordinary values, and such goodies as automatic validation will come practically for free.

One thing I've been worried about as a result of all this metaprogramming insanity is how to support syntax highlighting of Very code. Languages such as Very that cannot be parsed without being evaluated aren't exactly amenable to tool support, because writing complete tools requires writing a nearly-complete implementation of the language.

I've decided to offer a compile-time facility for semantic annotation, which at the heart of it would tag regions of source with labels. An editor could run Very code through a preprocessing phase to obtain an annotated and escaped version of the source, which would contain enough information to correctly highlight the code.

Semantic annotations could also be used as compiler hints for static typing, parallelisation, and other optimisations. If possible, I'd like to implement annotations as macros—conversely, it may be possible to implement macros by tagging regular words with a compile-time-word annotation. Will update will my progress on that insanity when I come to it.

13 September 2011

This “Trello” Business

So Fog Creek Software has launched a new thingamajig called Trello. Here are my initial thoughts.

Everything needs a purpose in order to survive. (Even if that purpose is to have no purpose. How Zen.) It should be easy to say, concisely, what your story is about, what your software does, and what your product is for. Why did you make it and why do I care?

So what's Trello about? Projects, basically. That's concise enough that it holds my attention, and doesn't immediately shout “useless gimmick” or “totally misguided”. You get a pretty interface (boards) to what is essentially groups (lists) of lists (cards). I usually have an immediate negative reaction to coinage of proprietary slang, and now is no different: I don't know why I'm organising lists of “cards” on a “board”. The physical analogy just isn't doing it for me.

Anyway, by default, you get three top-level groups, presumably for tasks: future (to-do), present (doing), and past (done). It's an intuitive model that people will like, but, well, it's not unique.

And no goal is worth pursuing unless it's unique. Your product will be successful if people want it, and people usually want things that either haven't been done before, or haven't been done right. Project management solutions have been done. Scads of times, in myriad different ways. So the tacit assumption that Trello makes is that its model for project and team organisation is superior to those of its competitors. (Yes, free services have competitors.)

So how is it different? Trello makes it very easy to collect bundles of junk related to a task, and sort them chronologically. You can add comments, checklists, and files, as well as attach coloured “labels” to tasks. I would say this is where it really shines, but it just doesn't excite me.

(For one thing, the interface is still a tad buggy: when adding items to a checklist in Firefox 6.02, the most recent item replaces the previous item, so only one item displays until the card is closed and reopened. Also, I ran into some synchronisation issues when it came to creating, closing, and reopening boards, as well as archiving—because there's no such thing as “delete” in the Trello world—and unarchiving cards. This mainly consisted of duplicate items and things that were supposed to be gone having some trouble taking the hint. Most things seemed to work fine, however, and I expect the wrinkles will get ironed out soon.)

The real problem is that I don't need it. I can make my own to-do list in whatever format I please. (Emacs org-mode is excellent for this, but any document will do.) I can put that to-do list in a Dropbox folder with whatever other files I'm working on, and share it with whomever I please. I can set up any old task tracker and wiki on my server and get essentially all of the features that I'd use in Trello. And if I didn't have a server, I would use one of the many available freemium services.

To my mind, the only clear advantage to Trello is that it centralises, simplifies, and prettifies these things for, well, non-technical users. And yet I somehow doubt that many of them will need digital project-management software...

In short, I think it's a beautiful thing that will perish for lack of an audience. But I definitely hope to be proven wrong.

07 September 2011

My Current Language Project

I was thinking about redoing my website in anticipation of a game I may or may not be making. None of the languages or frameworks out there at the moment really speak to me, though. I've been reading about Factor lately, and I think it's a superb language. I thought about using it, but there were some things I wanted to add, and some things about its philosophy that I didn't agree with. So I decided to make yet another language for web development.

The code-name of the project is Very, because for some reason I've been fond of naming my projects adverbs lately. It's a dynamically typed, concatenative, lazy language with homoiconic syntax, closely related to Forth and Lisp.

Very has very few builtins: the core special forms include only a few stack combinators (dup, pop, swap, quote, apply, compose) and some operations for introducing definitions. Fewer stack combinators are of course possible, but these represent a good balance of minimalism and efficiency. It's the definition special forms that are particularly interesting.

The def special form is nothing special: it creates a word by binding a name to a quotation that is applied when the word is invoked. This gives you general recursive functions. The extern special form gives you the ability to invoke functions through the C FFI, which is how many secondary builtins are defined.

The token special form is more subtle. By default, only parentheses, quoted strings, and whitespace-delimited identifiers (which include numbers) are recognised by the tokeniser. token modifies the lexer to treat a particular sequence of characters as a single token regardless of any adjacent nonwhitespace characters. This allows a program to introduce new special tokens as it is being read.

The macro special form is where things get really interesting. Macros are words that operate on a stack of terms that have been read since the point of invocation of the macro, allowing them to perform arbitrary operations on the source. Macros have access to the full power of the language, and may do anything that ordinary words can.

In addition, macros have the ability to manipulate the queue of terms that have yet to be read, meaning they are not restricted to the usual postfix syntax. Macros can invoke other words, including macros, directly at compile-time or indirectly through macro expansion.

The macro system will be used to provide high-level, user-friendly syntactic forms that correspond directly to efficient underlying operations, so those used to a more traditional infix imperative syntax will have no trouble, but those interested in working closer to the philosophy of the language are still encouraged to do so.

At the moment, I've got most of the basics in place, and am currently working on macros. I'm sure there will be plenty of examples to show off as I get more done. Until then.

06 September 2011

Iterators as a Model of Lazy Computation

As I often do, I'm working on a pet language project, which I'll describe in future posts. One of the key concerns for this language is laziness: computation should be deferred as long as possible. For various reasons I'm writing the implementation in C++, so I was looking for a convenient means of expressing lazy computation in an imperative language.

I'm very fond of the work of Alexander Stepanov, who is largely responsible for the popularity of generic programming, one of the driving principles of the C++ standard library. He advocated type-agnostic algorithms expressed in terms of iterators (and iterator ranges) as a means of generically implementing families of operations on any data structure that can be traversed linearly.

It occurred to me that iterators are, for eager imperative languages, an excellent model of lazy computation. An iterator models a lazy function over a list, which can accept or return any number of values it pleases at a time. Composition of these functions is modelled as iterator adaption: composable iterators are those that are constructable from an iterator range of compatible type.

Fundamentally, an iterator is an object that can be dereferenced and advanced. However, in order to be used properly by generic functions in the standard library, C++ iterators must fulfill a number of other requirements. These details are not immediately relevant to the implementation of algorithms, and readability of algorithms expressed as iterators consequently may suffer in C++.

For imperative programmers, it may require some unusual thinking in order to properly decompose algorithms in terms of iterator adaption. At present, the implementation of my language project is decomposed into the following steps, each of which is modelled by an iterator type that is then aliased for convenience.
  • Convert the input stream to a raw bytes.
  • Buffer the bytes to allow backtracking.
  • Interpret the bytes as UTF-8 and produce UTF-32 code points.
  • Partition those code points into tokens.
  • Parse those tokens into terms.
  • Establish an execution context in which to evaluate those terms.
  • Run the interpreter.
All of these operations are composed and executed by just one line of code:


Only run() is an imperative operation: it's a function that sends the result of std::copy()ing the iterator range to a no-op output iterator called ignore_iterator. Every other identifier here is an aliased name of an iterator type. In order to evaluate an expression, the minimum number of terms needed to constitute a valid expression is read, and consequently also the minimum number of tokens and characters. Waste not.

The good thing about iterators here is that they strongly encourage you to fully encapsulate all of the state that you use, at least as much as an ordinary function. Iterators must conform to very strict requirements in order to be used with the usual algorithms and adapters, and those that break those expectations will simply not work properly. (Unfortunately in C++ that usually means spectacular crashes.)

The type aliases involved in the above sample are also relatively simple:

typedef std::istreambuf_iterator  <char>       characters;
typedef buffer_iterator           <characters> buffered;
typedef utf8::unchecked::iterator <buffered>   utf32;
typedef lex_iterator              <utf32>      tokens;
typedef term_iterator             <tokens>     terms;
typedef context_iterator          <terms>      interpreter;

You'll note the use of the standard C++ std::istreambuf_iterator as well as utf8::unchecked::iterator from the excellent iterator-centric UTF8-CPP library. The Boost.Iterator library contains a number of general-purpose iterators and adapters, which may be useful depending on your particular application.

Note that each operation must refer to the type of the previous operation: this is another reason I'm not terribly fond of C++ for this. Type inference doesn't apply to constructors, so in order to have the compiler infer the type of the composed iterator, you would have to replace the type aliases with template functions that forward to the constructors of the appropriate types. Ugh. C++ requires entirely way too much machinery to accomplish what ought to be simple operations.

It's equally possible to model interactive applications in this manner. Loops could be represented as cyclical iterators adapting linear sequences to circular ones, and loops with exit conditions could be represented helically. I haven't proved it, but it seems apparent that iterators can model any general recursive algorithm, because at their heart they're just lazy functions plus some state. It also occurs to me that you could use this technique to very easily and efficiently compile a lazy functional language into an imperative one.

Anyway, I hope this gives my nonexistent readers some interesting ideas. If anyone knows of an imperative language that would be good for this sort of thing, do let me know. I grow tired of C++.

How College Has Been Worth It for Me

I've just started my fourth year at RIT, which is to be my last as an undergrad. I feel this puts me in a good position to reminisce, so here are some thoughts on my experiences in college.

First, college has not really been worth it to me academically. I've been programming for a very long time for someone my age, because I like it a lot. I always assumed that when I got to college there would be vast depths of further knowledge to explore in that field. It turned out that there weren't—at least not vast ones. Pretty much everything I've learned about programming lately I've learned in the same way I always have: on my own, because not very many people know or care about the kinds of things I'm interested in.

That's not to say that I haven't gained something by being here, but my knowledge has increased in breadth, not in depth. By far the best classes for me have been Creative Writing and the many Chinese language classes I've taken for my minor. I could have studied these things at any liberal arts college and probably learned just as much, but the fact is that I like these professors quite a lot and now I'm biased.

So how have I benefited from my post-secondary adventures? Socially. I made friends, broke up with a girlfriend who wasn't good for me, started vlogging, got a boyfriend, broke up with said boyfriend, and am now in an open relationship with a girl that I like quite a lot and wouldn't mind being involved with for quite a long time. I'm a lot more comfortable in public than I've ever been before.

So if you're wondering whether college is good for you academically as a self-taught individual: it probably isn't, unless (as I have) you study something completely different than what you've taught yourself already.

But you can't neglect the importance of spending time getting to know people who share your interests, people you'll probably get along with, people you never imagined you would get along with. You'll end up with mates, roommates, flatmates, classmates, and sex mates.

And the thing about humans is that we tend to mate for life.

How to Live on Your Language

So let's say you want to make a living off writing the next popular language. Assume "making a living" means the equivalent of reasonable pay at a full-time job (any job—not necessarily a technical one). If you really love what you do, you'll probably accept a little bit less to do it, so let's say $12.50 an hour, or $2000/month.

While working a stable job, you release your first version and set up a facility for donations. You then engage in continuous development and marketing, to a degree proportional to the number of users of your language. Let's estimate conservatively that 1% of your users will donate, and each of them will donate an average of $1/year.

That means that in order to get your $2000/month salary, you need to have 2000 donating users for each of the 12 months in the year. That's 2.4 million users total. Let's now assume that it takes 10 years for a language to become this popular: you must therefore acquire an average of 240 000 users per year, or 20 000 users per month.

If you're working the equivalent of full time (160 hours/month), your promotion strategy and implementation quality must be sufficient to gain an average of 125 users per hour. And that's repeat users, of course: if 20% of the people who try your language become repeat users, you actually need a conversion rate of 625 people/hour.

Even if every one of the people you convince directly convinces four more people to try your language—and for the sake of simplicity, assuming that they don't go on to try to convince others—then you're still back down to the 125 users/hour figure.

Now, this may seem totally unreasonable, but believe it or not it can still work: say your marketing strategy produces roughly linear growth over the 10-year period during which your language is gaining ground, and then plateaus. That means at the beginning you'll be converting an average of 0 users/hour, and 10 years later you'll be gaining 250 users. (Again, hourly. Perspective, here.)

That's an average increase of 25 users per hour per year: at the end of each year, you're converting 25 more people per hour—or 4000 more people per month—than you were at the start of the year.

So let's revisit that 2.4 million users ballpark: is it feasible to gain that many users in 10 years? If we accept the (inherently flawed, but usable nonetheless) statistics offered by Langpop as accurate, we get the following information about the top 7 languages currently trending through Yahoo search. If one result is accepted as representative of one user (I know, bear with me), these numbers indicate the rounded approximate average number of users gained per year since the language first appeared.
  1. C++: 500k
  2. C: 400k
  3. Java: 700k
  4. PHP: 400k
  5. Perl: 150k
  6. C#: 300k
  7. Python: 150k
This puts things back in the realm of possibility: if you make a language that's as popular as, say, Python, then in 20 years you will have enough users to make development and support (and marketing!) of that language into your full-time job.

Make one as popular as C#, and you can do it in 10. Cool!

…Except of course that putting it that way trivialises the vastly unlikely and difficult undertaking that is making a language so popular. But hey, if you've got a good idea, and you can manage to get to the top entirely on your own, without the contributions of any other developers who would take a cut of your donation money, then you're a genius, and you deserve it.

The original version of this article was posted on Programmers.SE and is duplicated here so I remember where I put it.

05 September 2011

Turing Completeness is not Practical Completeness

As a discussion comparing the relative merits of different programming languages grows longer, the probability of an appeal to Turing completeness as a measure of linguistic equivalence approaches 1. It's practically Godwin's law of programming languages.

Appealing as it is, the notion that whatever is possible in one language is possible in all is simply untrue. Turing completeness refers merely to the ability to implement any computable function. There are plenty of esoteric languages that have no or very limited I/O, and plenty of non-esoteric languages that have little to no ability to interface with libraries and other languages.

The fact that languages do differ in their strengths and weaknesses is precisely the main reason we use different languages at all. Different tools are best suited to different situations. The next time someone commits this fallacy, point them to the concept of the Turing tarpit, and if further investigation doesn't prove to them that theoretical equivalence is not the same as practical equivalence, invite them to implement a simple algorithm such as CRC32 in Brainfuck.

29 July 2011

Oxymoronic: Computer Science & Software Engineering

I think I'll kill two birds with this stone. At RIT, there are two main computing majors: Computer Science and Software Engineering. Both of these are misnomers.

See, Computer Science is concerned primarily with programming as it relates to computing theory. Programming, while certainly related to the design of (reasonably) precise things, is by no means scientific. Science is systematic, rigorous, testable, and predictable. Computer hardware and software alike operate under the illusion of determinism and reliability for the sake of pragmatism alone.

Further, computing theory is only related to computing insofar as programmers actually implement the mathematical ideas of the theorists. Research is valueless without eventual application.

Software Engineering, on the other hand, is concerned with programming in the context of project management. It asks what the process is (or ought to be) for creating software in teams of people according to business requirements. Software engineers also operate under an illusion of determinism and reliability, but their misplaced faith is not in computers—instead their money is on programmers. The fool and his money have parted.

So computers are fallible, and programmers are fallible, and it's a wonder any of us ever gets anything done at all. The best programmers know the secret of how things are really done in the software world: communication.

I don't mean meetings, or agile (or worse, Agile), or documentation, or any of dozens of other borderline buzzwords that a manager might throw around. No, I mean the plain and simple fact that programming is writing, and programmers have the unique task among authors of writing in such a way that both computers and humans can readily parse and evaluate their work.

Of course, any fool can write for one or the other. If I write an informal description of an algorithm on paper, it's perfectly human-readable; if I write a formal implementation of an algorithm in heavily optimised machine code, then it's perfectly computer-readable.

And this is why I hate the word code. If you call it “coding” for a computer, then the quality of your writing from a human perspective is probably utter trash. Don't get me wrong, of course: for a first draft, utter trash is perfectly acceptable provided it can still convey the intended meaning.

But for a final draft, a certain amount of editing is necessary. Programmers differs from writing in another key aspect: though neither a program nor a work of prose is ever truly complete (final draft is an oxymoron as well), you must release both sometime; the difference is that a program, like a work of nonfiction, can be updated in editions to maintain freshness and currency.

Programming is like creating a diamond from scratch. When you code, you make the stone. You may then polish what's there and cut what shouldn't be, but both the colour and the clarity of that stone are immutable. No amount of refactoring will fix an idea that was poorly styled or unclear from the outset.

In conclusion, I might add that I insist that programming is not art. I believe this simply because I'm not good at art, but I am good at programming. I'm also good at design and craft, and that's what I believe programming is.

Programming is writing for one audience who seemingly doesn't understand you but is also endlessly critical, and simultaneously for another who seemingly understands you but is also painfully literal. It is writing manuals and dictionaries and phrasebooks to help these two disparate audiences communicate. It's logical, illogical, philological, analogical, anthropological, and frequently epistemological.

I fucking love it.

Oxymoronic: Pharmaceutical Company

There is a magical place where geniuses in white coats carry out research and produce remedies in the everlasting fight against human ailment. Most of them do what they do because they want to help people. The rest of them are businessmen (and women). Businessfolk are in the business of turning the work of the workers into the business of the businessfolk. By “the business of the businessfolk”, I naturally mean “the money of the businessfolk and the investors, and to a certain extent the workers as well, and to a much smaller extent some philanthropy and humanitarian aid to keep up appearances”.

So therein is the oxymoron: while you started your pharmaceutical company with good intentions, trying to reduce human suffering, now you don't want to reduce it at all, because when human suffering decreases, you'll see an equal decline in the sales of your remedies. That represents an equal decline in the funding available for your workers to carry out your research, which represents a loss of research, which allows ailments to propagate unfettered and suffering to increase.

Ordinarily I don't believe in doing things for the sake of doing things, but as a pragmatist, my objective morality compels me to pursue the path of least human suffering, as well as the path of least absurd complexity, which in this case means making a slightly unusual suggestion:

Pharmaceutical companies should not be companies. They should be government services. And before you say that our government is too far in debt to be creating more services, why don't we cut everything that isn't a public service and just see if the simplest concept isn't really the best?

On Brilliance

You are brilliant, but you don't know that the following things are true:

  • Your work is meaningless until proven meaningful; don't complain, work.
  • You do not have to be an insufferable asshole; you probably will anyway.
  • You are not better because you are smarter; you are worse for believing you are.
  • The world is not a meritocracy; your sense of entitlement is undeserved.
  • Intelligence is not a skill; you have to be able to do something.
  • The truth is not self-evident; you do have to explain.
  • You can't stop people from sticking beans up their nose; it won't stop you from trying.
  • You are a fool; wisdom comes only from knowing the depth of your own folly.

06 July 2011

Why Closures?

In a language with reasonably rich partial function application, it seems to me that there's very little need for implicit closures. In pseudocode:

greeter = λname.
    message = 'Hello, ' + name + '!'
    return λε. print(message)


Could easily be rewritten explicitly as something along the lines of:

greeter = λname.
    message = 'Hello, ' + name + '!'
    return λm. print(m) fix message


Where `fix` is an operator accepting a lambda and some parameters and returning a new lambda with the given parameters fixed, that is, λx. f(x) fix y is identical to λε. f(y). With the appropriate consideration given to reference semantics, this would have exactly the same effect; indeed, C++0x lambdas can be written with explicit closures exclusively:

return [&message]() -> void { std::cout << message; }

Supporting implicit capture not only complicates implementation; as with exceptions and globals, implicit capture reduces locality and makes analysis more difficult. Limiting functions to explicit capture requires support for lambdas with multiple parameters, because you cannot write:

sub = λx. λy. (x - y)

Instead you must write:

sub = λx, y. (x - y)

Which lends itself to partial application quite nicely:

neg = sub(0)
ident = sub(?, 0)

It would seem, then, that the primary reason for supporting implicit capture is historical, namely the roots of functional programming in the lambda calculus. And really, since implicit capture is the basis of scope, the fact that a functional language can easily survive without it calls into question whether the concept of scope is necessary or good. Functional programming is all about making state explicit, after all. Why not hold a function accountable not only for values it uses, but also for operations? I intend to play with this idea in the future.