Found pun [entries|reading|network|archive]
simont

[ userinfo | dreamwidth userinfo ]
[ archive | journal archive ]

Wed 2012-11-21 10:13
Found pun

This morning I noticed a sort of ‘natural pun’ between maths and software. I suspect about three people will get it, but I'll say it anyway because it's too good to lose.

The Schröder-Bernstein theorem states that given injections from a set A to a set B and from B to A, you can find a bijection between A and B. There's a well known construction proving the theorem, which goes like this: draw a graph whose vertices are the elements of A and B, connect a (in A) and b (in B) with a red edge if f(a)=b, and with a blue edge if g(b)=a. This gives every vertex a red-degree and blue-degree at most 1, so each vertex has at most degree 2 and so the graph decomposes into connected components which are chains or loops; furthermore, no chain can be finite (because in one direction you can keep applying f and g forever). Now in most kinds of component you can pair up the elements by simply saying ‘use the blue edges’ (so that on those points, the bijection h we're constructing matches f); the only case where that doesn't work is a one-way infinite chain starting with a red edge, in which case you can just use the red edges instead (so that h matches the inverse of g).

If you actually apply this construction to find a bijection between two sets which are mostly similar but one contains a few elements not in the other, then the resulting bijections tend to be of the form: map everything to itself, unless it's one of the things we want to avoid, in which case do a sort of ‘quoting’ transformation to turn it into something safe – but then we also have to apply the quoting transformation to any input which could have been derived from a bad thing by repeated quoting. For example, if you want to find a bijection between the non-negative real numbers and the positive real numbers (that is, the former including zero and the latter not), you can map x to itself in most cases, and map x to x+1 if x is either zero or anything reachable from zero by repeated adding of 1 (i.e. if x is an integer).

It occurred to me this morning that this is identical to a procedure frequently used in software for reversible quoting. For example, the mbox mail folder format separates emails with lines starting with the word ‘From’ followed by a space. Therefore, if such a line shows up in an email, it needs quoting, and what generally happens is that you put a ‘>’ character before it. But to make that quoting properly reversible, so you can distinguish lines that start ‘From’ from those that already started with ‘>From’, you must also quote any line which begins ‘>From’ or ‘>>From’ and so on – that is, any line which could have been derived from the thing you want to avoid by repeated application of the quoting transform. It's exactly a Schröder-Bernstein construction.

And what makes this amusing is that this reversible scheme for quoting in mbox files is not always used, because not everybody is that rigorous – but it is used by qmail, by Dan Bernstein. Hooray!

[xpost |http://simont.livejournal.com/237443.html]

LinkReply
[identity profile] atheorist.livejournal.comThu 2012-11-22 14:38
By the Curry-Howard Isomorphism, constructive proofs correspond to functions (from correlates of assumptions to correlates of conclusions) - and an alternate proof often corresponds to an alternate function.

The Schroeder-Bernstein construction works for sets in general. But sets of concrete things such as emails are always at most countably infinite. So there's an alternative proof. You can compose the two bijections - quoted emails to integers and then from integers to unquoted emails. This is nonconstructive because I haven't pinned down what those (Godel numbering) bijections actually are, but it might look like a complete mess.

The 103...496th possible email not containing a specific byte sequence ("\nFrom" or whatever) might not have much superficial visual similarity to the 103...496th possible email when that byte sequence is allowed.
Link Reply to this | Thread
[personal profile] simontThu 2012-11-22 14:57
Indeed, there's no shortage of ways to show that two countably infinite sets are bijectable to one another. What strikes me as interesting (apart from the match of surnames) is that one reasonably natural way for a mathematician to do it (applying a theorem which happens to be more general than needed) also gives rise to a method of quoting which is not merely plausible for practical use but actually in use.

Of course, there's no need in general for the transformation from a string to its quoted form to be bijective, or even to be a function (by which I mean 'single-valued'). There are plenty of practical reasons why you might want to have some strings not be legal in the quoting syntax even though they aren't among the set of strings you really needed to avoid, or why you might want to provide multiple equivalent ways to quote the same thing (e.g. \042 vs \x22 vs \" in C string literals). Bijectivity is desirable only because it correlates with space-efficiency, and only as long as it doesn't introduce any other inconveniences. For instance, the qmail-style mbox quoting transform could have been replaced by simply prefixing every non-separator line with a > sign, but that wastes more space for a reason not totally unconnected to its non-bijectivity.

So the Gödel-numbering approach probably would be too inconvenient to be worth the space saving, if it didn't turn out to reduce to anything more easily stated!
Link Reply to this | Parent
[personal profile] lnrWed 2012-11-21 11:23
I'm not sure I'd consider it a pun, but it did make me smile :)
Link Reply to this | Thread
[personal profile] simontWed 2012-11-21 11:25
I think I'm going to call the general technique "Bernstein quoting" from now on, and if software people think I'm referring to Dan Bernstein and mathematicians think I'm referring to Felix Bernstein then that's fine because they'll both be right :-)
Link Reply to this | Parent | Thread
[personal profile] lnrWed 2012-11-21 12:07
I approve!
Link Reply to this | Parent
[personal profile] gerald_duckWed 2012-11-21 12:37
I've been aware for at least two decades of a variety of escaping schemes. There are a lot of subtle and important trade-offs in terms of usefulness to humans, correctness, robustness, compactness, simplicity, etc.

Knowing as I do that this is a tricksy area where care and circumspection are required, it annoys me that so many systems routinely get it wildly wrong.

Which in turn creates a natural pun with Santayana's principle that those who cannot remember the past are condemned to repeat it.
Link Reply to this
navigation
[ go | Previous Entry | Next Entry ]
[ add | to Memories ]