Tuesday, November 11, 2008

Recursive type definitions in C

Hi,

It's been a while -- I've been mostly busy with university. Maybe I won't try to follow four classes next year. In other news, we didn't make it (with kmemcheck) for 2.6.28 either. Oh well. We did at least make an impression by discovering two more bugs in 2.6.28-rc.

Now for the topic of this post: Recursive type definitions in C. More specifically, the types I want to write about are not structs, but function pointers. We may use typedefs to define types which refer to function pointers:

typedef void (*funcpointer_t)();

Not strictly necessary to make it a new named type, but it helps readability. This is all well and good; we can declare variables that hold function pointers, declare functions that take function pointers as parameters, or even declare functions that return function pointers (to functions with this specific signature):

funcpointer_t x;

void do_a(funcpointer_t f);

funcpointer_t do_b(void);

Now, what I really wanted to do was to create a parser in C where a variable would hold the next function to call, and where the function itself returned a (function) pointer to the next function to call (i.e. the next state of the parser). It would look something like this:

funcpointer_t init_state(const char *token) {
/* ... */
return &some_other_state;
}

funcpointer_t some_other_state(const char *token) {
/* ... */
return &some_other_state;
}

void parse() {
funcpointer_t state = &init_state;

while (token = read_token())
state = state(token);
}

Okay, so this looks quite good. But how do we define funcpointer_t? Defining structs that have pointers of the same type is quite easy, since the compiler already knows about the type as soon as you start defining it. The most obvious example is the linked list:

struct list_node {
struct list_node *next;
/* ... */
};

Another possibility includes forward declarations, where we simply declare that a type exists, without defining it:

struct a;
struct b {
struct a *x;
};

The same goes for functions. They can refer to themselves, since we must provide their name at the start of their definition, or declare them beforehand. But how can we do this with function pointers?


Vegard