simont: A picture of me in 2016 (Default)
simont ([personal profile] simont) wrote2011-03-17 10:23 am

C abuse of the week

A while back I wanted to write some C code that looped over every element in a collection. But I wanted the code to be parametrisable in the face of different implementations of that collection, because it might have to link against several other pieces of code which might all store the things in question in a different format.

So I thought I'd do this:

    LOOP_OVER_ALL_ELEPHANTS(elephant) {
/* loop body executed once for each elephant */
}

And then each implementation would define the macro LOOP_OVER_ALL_ELEPHANTS appropriately for how it happened to be storing its elephants. If it was an array, the macro would expand to

    for (elephant = array; elephant < array + len; elephant++)

and if it was a linked list, it would expand to

    for (elephant = head; elephant != NULL; elephant = elephant->next)

and I presumed that any more complicated thing like a C++ STL collection would have some kind of similar idiom available.

Unfortunately, the very first implementation I tried to test against turned out to be storing its elephants in a circularly linked list – and you can't iterate over that in a for statement, because the start condition is the same as the end condition. If you write the apparently obvious adaptation of the linear-list code,

    for (elephant = head; elephant != head; elephant = elephant->next)

then the loop terminates after zero iterations and you look foolish.

I solved the problem by rewriting my loop in the form

    LOOP_OVER_ALL_ELEPHANTS(elephant) {
/* loop body executed once for each elephant */
} END_LOOP_OVER_ALL_ELEPHANTS(elephant)

which permitted the porting code to define a pair of macros containing the two ends of a do-while, so as to defer the first loop termination test until after the first run of the loop body. This felt inelegant and cumbersome, and I found myself wishing for a more metaprogrammable language to work in.

But this morning, I do believe I've solved it. You can iterate over a circularly linked list using a construction that can be encapsulated into my original macro, requiring no loop machinery after the single closing brace. It's strictly legal standard C as far as I can see, and it goes like this:

    switch (elephant = head, 0)
while (elephant = elephant->next, elephant != head)
case 0:

Then you can follow that with a single braced block (or even an unbraced statement), and everything works – including continue. So I could have put that lot into my macro, and it would have been fine!

More generally, this construction permits a sort of variant form of the for statement, which does its test after the loop body rather than before. You could almost define it as a macro in its own right:

#define for_after(init, condition, increment) \
switch(init, 0) while (increment, condition) case 0:

Of course that isn't quite syntactically right, because the three clauses of the for are separated by commas rather than semicolons, and can't contain initialisers in the way that a proper for can. Also there's the unusual constraint that the body of such a statement cannot contain case statements intended to bind to switches outside the loop (so you can't combine a loop of this form with Duff's Device).

But it's closer than I thought I'd be able to get!

eta: there's a bug in the switch-based solution above. See my follow-up post for details.

[identity profile] cartesiandaemon.livejournal.com 2011-03-17 09:13 pm (UTC)(link)
I was trying an alternative solution. I think you may be able to fix the problems this approach has with not quite being semantically identical to a "for" block, but I couldn't overcome the restrictions on not being able to declare another variable.

You could possibly wrap up the "compare elephant and head, except the first time, then just return true" logic into a function with a static variable in, but it would mean the code would be definitely non-reentrant. (Unless your preprocessor did something clever with the LINE directive in the macro, but I can't remember, that wasn't in C90 either, right?)

[identity profile] cartesiandaemon.livejournal.com 2011-03-18 10:48 am (UTC)(link)
:)

Thank you. I don't think I quite thought through what i did want. But i think that where I'm going is that if you want to use a function to represent the "compare el against head, except ignore the first and every second match" logic instead of a simple local var, it needs to identify when its part of the same loop and when it isn't, which it might be able to do with a combination of (a) using some thead local storage or thread id to maintain when the next call is from the same thread and (b) using a line directive to determine when its called from different (nested) places in the same file and thread.

Your solution is probably right because it uses syntax to resolve a problem with insufficiently expressive syntax. Whereas mine (if it worked) would be how to solve a problem if the constraints here were logical parts of the program for some reason.

I agree it's horrendous, that's why i recommend using a language with convenient local variables, fewer macros, etc, etc :)