Is this email not displaying correctly? View it in your browser.

LinkedList NYC

Issue #84 - November 3, 2012


Hi Listheads,

Our hearts go out to everyone affected by Hurricane Sandy. We hope you and your loved ones are safe and sound.

Contest results

We are amazed by the inventiveness of the submissions we got for last week's programming contest. We had no idea our readers had so much free time.

We've compiled a list of our favorite submissions. We got some spectacular entries and it was tough choosing a winner. But we're not made of ice cream, so we had to pick just one.

And the ice cream goes to...

Misha Dynin, for the following entry:

1  #ifdef _
2 
3                       _(A)_(B)     _(C)_(D)  _(E)
4                         _(F)     _(G) _(H) _(I)_(J)
5                         _(K)      _(L) _(M)   _(N)
6                         _(O)       _(P) _(Q)_(R)
7                         _(S)         _(T) _(U)
8                       _(V)_(W)         _(X)
9 
10                  _(Y)      _(Z)_(a) _(b)_(c) _(d)_(e)
11                    _(f)      _(g)     _(h)    _(i)
12                    _(j)_(k)  _(l)       _(m)_(n)
13                    _(o)  _(p)_(q)         _(r)
14                    _(s)      _(t)         _(u)
15                  _(v)_(w)    _(x)       _(y)_(z)
16
17 #else
18 #define _(z)_##z)();typedef _##z(*
19 #include <stdio.h>
20 typedef int(*
21 #include __FILE__
22 #define __(_,z)_z _(__ y){return fputs(z,stdout),y?y(0):(_z)_0;}
23 #undef _
24 __)();__(_0,"")__(nl,"\n")__(_," ")
25 #define _(_)__(_,#_)
26 #include __FILE__
27
28 int main() {
29   (L)(i)(n)(k)(e)(d)(_)(L)(i)(s)(t)(N)(Y)(C)(nl);
30
31   return 0;
32 }
33
34 #endif                            /* http://mishadynin.com/ */

We spent much of the afternoon trying to understand this program, and we think we've figured it out. If you want to follow along at home, we suggest running cc -E misha.c, which will run the C preprocessor and expand the macros.

The source is split into two parts by a large #ifdef _. The first half contains an ASCII art LinkedList logo and the second half contains the program's code. When the preprocessor starts running through the source, _ is undefined, so the ASCII art is ignored.

We'll start in the #else branch by unpacking the first macro:

#define _(z)_##z)();typedef _##z(*

This becomes clearer if you add a space:

#define _(z) _##z)();typedef _##z(*

Macros that take arguments don't require a space after the closing parenthesis. For this macro, _(A) would evaluate to _A)();typedef A(*. The ##s let us replace z even when it's in the middle of a word. If we didn't have those, _(A) would evaluate to _z)();typedef _z(*.

Immediately after this, we include stdio.h and then we have the seemingly incomplete line typedef int(*. To understand what this line does we'll need to look at the next line too.

The piece of line 21 that we didn't initially know was the __FILE__ macro. __FILE__ evaluates to the full path of the current file surrounded by quotes. This means, if you have misha.c in /tmp, __FILE__ will be "/tmp/misha.c". With this in mind, we know now that line 21 actually includes misha.c inside of itself!

When we include the file, we start at the top again, but this time _ is defined, so we ignore our code and evaluate our ASCII art. On closer inspection, we find that our art is actually just a bunch of invocations of the _ macro, and combined with our first incomplete typedef produces the following code (reformatted slightly):

typedef int(*_A)();typedef _A(*_B)();typedef _B(*_C)();typedef _C(*_D)();
typedef _D(*_E)();typedef _E(*_F)();typedef _F(*_G)();typedef _G(*_H)();
typedef _H(*_I)();typedef _I(*_J)();typedef _J(*_K)();typedef _K(*_L)();
typedef _L(*_M)();typedef _M(*_N)();typedef _N(*_O)();typedef _O(*_P)();
typedef _P(*_Q)();typedef _Q(*_R)();typedef _R(*_S)();typedef _S(*_T)();
typedef _T(*_U)();typedef _U(*_V)();typedef _V(*_W)();typedef _W(*_X)();
typedef _X(*_Y)();typedef _Y(*_Z)();typedef _Z(*_a)();typedef _a(*_b)();
typedef _b(*_c)();typedef _c(*_d)();typedef _d(*_e)();typedef _e(*_f)();
typedef _f(*_g)();typedef _g(*_h)();typedef _h(*_i)();typedef _i(*_j)();
typedef _j(*_k)();typedef _k(*_l)();typedef _l(*_m)();typedef _m(*_n)();
typedef _n(*_o)();typedef _o(*_p)();typedef _p(*_q)();typedef _q(*_r)();
typedef _r(*_s)();typedef _s(*_t)();typedef _t(*_u)();typedef _u(*_v)();
typedef _v(*_w)();typedef _w(*_x)();typedef _x(*_y)();typedef _y(*_z)();
typedef _z(*
    

You'll notice that the last line is incomplete. It'll be completed later.

These are all typedef'd function pointers. If you're a bit shaky on function pointers, or pointers in general, take a look at the Clockwise/Spiral Rule. To create a type out of a function pointer, you just put typedef in front of the pointer declaration. The name of the pointer becomes the name of the type.

On line 22, we define the __ macro:

#define __(_,z)_z _(__ y){return fputs(z,stdout),y?y(0):(_z)_0;}

which can be rewritten for clarity as follows:

#define __(_,z) _z _(__ y) { return fputs(z,stdout), y ? y(0) : (_z) _0; }

This macro is especially confusing because it uses _ as the name of its first parameter. This does not get expanded to the _ macro. It is treated as a normal macro parameter. This macro defines a function with its name taken from _. The function has one parameter, y, which has type __. This is not the same as the macro __ and doesn't get expanded recursively because this occurrence of __ doesn't have any arguments. The function prints z to standard out and then returns either y(0) or _0 casted to the type _z. We've already seen that _z was typedef'd to a function pointer, but we haven't seen _0 yet. While we don't know what type __ is, seeing y(0) leads us to believe that __ is a function pointer type.

In C, multiple expressions separated by commas evaluate to the value of the last expression, which explains how we can have two expressions in the return statement. The return value of fputs is discarded.

After undefining the _ macro on line 23, we have a relatively dense line 24:

__)();__(_0,"")__(nl,"\n")__(_," ")

Let's unpack that:

__)();
__(_0,"")
__(nl,"\n")
__(_," ")

The first line completes our long string of typedefs from earlier, defining the type __ as a pointer to a function that returns a function pointer of type _z. This confirms our assumption that __ is a function pointer type, which we hypothesized after we saw y treated as a function in the definition of the __ macro.

The next three lines use the __ macro to define three functions, _0, nl, and _, which print the empty string, a newline, and space, respectively.

Next we define one more macro, called _:

#define _(_)__(_,#_)

which cleans up to

#define _(_) __(_,#_)

The new piece here is #_, which evaluates to _ surrounded by double quotes. Thus _(a) becomes __(a,"a") which becomes _z a(__ y){return fputs("a",stdout),y?y(0):(_z)_0;}, a function called a that prints "a" and returns a pointer to a function of type _z.

With _ redefined, we include __FILE__ one more time. Our ASCII art is once again interpreted as a bunch of macros, but this time, we define functions that print each of the letters rather than defining function pointer types.

The last interesting line is line 29:

(L)(i)(n)(k)(e)(d)(_)(L)(i)(s)(t)(N)(Y)(C)(nl);

This doesn't look like valid C at first, but let's make one small change to make it more familiar:

L(i)(n)(k)(e)(d)(_)(L)(i)(s)(t)(N)(Y)(C)(nl);

Here we call the L function, and pass it the i function as an argument. L prints "L", and then returns i(0) (look at the __ macro if you're lost). i prints "i" and then returns _0 casted to a _z. Recall that _0 is the same as L, i and all the other lettered functions, but it prints the empty string instead of a letter.

Because L(i) returned i(0) and i(0) returned a pointer to _0, we're now back inside main with a handle on _0 that we immediately call with n as its argument. You can almost imagine that main now looks like this:

_0(n)(k)(e)(d)(_)(L)(i)(s)(t)(N)(Y)(C)(nl);

L returns a _z, which was defined as typedef _y (*_z)() (i.e., it's a type that is a pointer to a function that returns a _y and takes no arguments). If this is the case though, how are we allowed to call L's return value with one argument? The answer is that the function declaration int foo() is different than int foo(void). The second declares a function foo that takes zero arguments while the first declares a function foo without argument information. Because our typedef of _z doesn't have argument information, our compiler doesn't complain when we try to call a function pointer of type _z with one argument. There is a Stack Overflow answer with more information.

Calling _0(n) prints nothing and then returns n(0), which prints "n" and returns _0 leaving us with the equivalent of

_0(k)(e)(d)(_)(L)(i)(s)(t)(N)(Y)(C)(nl);

This goes on until we get to _0(nl) which prints nothing, calls nl(0) which prints a newline and then returns a final pointer to _0 which we ignore.

Pretty awesome, right? We just have one minor correction, Misha. Line 29 should read:

(L)(i)(n)(k)(e)(d)(L)(i)(s)(t)(_)(N)(Y)(C)(nl);

Peace out, cub scouts.

Nick & Dave

P.S. Please email us if you have any corrections or clarifications for our explanation!


*|IFNOT:ARCHIVE_PAGE|* *|LIST:DESCRIPTION|*

If you don't want to receive the LinkedList anymore, unsubscribe.

*|LIST:ADDRESS|*