Mar. 9th, 2019

I was just reading a paper on gcd algorithms, and was struck by the following statement appearing in a snippet of pseudocode:

max{u, v} := |u-v| / 2;

in which the obviously intended meaning was ‘Replace the larger of u and v with the value computed on the RHS.’ In other words, the author is implicitly considering the max function to return an lvalue.

It had never occurred to me before that max could return an lvalue! If I'd wanted to express the same concept myself, I'd surely have either pre-sorted my variables, or written some tedious if statements out longhand, or else resorted to the notation of ‘arg max’, in which you replace your set of values to compare with some kind of mapping of indices to values (say, an array or hash), and where max(set) simply returns the largest element of the set, argmax(mapping) returns whatever index maps to the largest value. I wouldn't even have thought of writing it like this.

I think for pseudocode purposes, any of my other options would have been worse, because each one introduces extra bookkeeping or formalism that the reader has to mentally clear out of the way to get at the underlying concept. So I quite liked this notation for its pedagogical value – but at the same time, I recognised that you'd have to translate it back into one of those other less clear idioms as soon as you wanted to implement the algorithm in a real programming language.

Or do you? Now that I think about it, there's no reason at all why a programming language couldn't define a function like min or max to return an lvalue. In fact, in C++, a simple implementation is as easy as adding three & characters to the most obvious version:

// before
template<typename T>
T max(T u, T v) { return u>v ? u : v; }

// after
template<typename T>
T &max(T &u, T &v) { return u>v ? u : v; }

But for maximum flexibility, you'd surely want max to return an lvalue if possible, and fall back to an rvalue if not. You'd like to be able to write max(u,v) = w as the above pseudocode did, but you also still want to be able to write x = max(y,1000) without a compile error.

And that doesn't seem so easy in C++. The obvious approach is to have both of the above template definitions in scope at the same time. Overload resolution will use the by-value version in cases where only that one matches, and you hope it will manage to pick the reference version in cases where both match. But sadly, when I tried it, that hope turned out to be misplaced – the language defines no preference order between those two templates.

I wonder if any other languages allow you to get the lvalue/rvalue flexible version to work!