simont, Fri 2018-02-16 12:41
Accidental i18n

I told a silly story in the pub last night which I suddenly realise would make a fun post here as well. It's from a few years ago originally, but I don't think it matters.

You may have heard of that old chestnut in which an alleged Cambridge University researcher allegedly claims that people can still read written text with no problems even if the internal letters of each word are arbitrarily reordered, as long as the first and last letters of each word are still the right ones.

This is nonsense, of course, and it's been debunked before. But a few years ago, Gareth and I were discussing it, and I dashed off a Perl one-liner to do that scrambling transformation. (Perhaps it seemed like a good Perl-golf challenge to waste half an hour on, or something like that.)

I got a draft implementation working quickly enough, although it didn't quite fit on one line:

$ perl -pe 's!(?<=\b[a-z])[a-z]*(?=[a-z]\b)!join"",map{$_->[1]}
sort{$a->[0]<=>$b->[0]}map{[rand,$_]}split//,$&!egi'
But soft, what light through yonder window breaks?
But soft, what lghit tughroh yedonr woindw bkears?

But shortly before the working version, I made a small error of a kind that Perl makes uniquely easy: I briefly got my scalar and list contexts confused, tried omitting the join step, and this happened:

$ perl -pe 's!(?<=\b[a-z])[a-z]*(?=[a-z]\b)!map{$_->[1]}
sort{$a->[0]<=>$b->[0]}map{[rand,$_]}split//,$&!egi'
But soft, what light through yonder window breaks?
B1t s2t, w2t l3t t5h y4r w4w b4s?

Of course – if you don't explicitly use join to turn the list of characters back into a single string, the replacement expression in s///e is evaluated in scalar context, and map in scalar context returns the number of elements it generated rather than the elements themselves. Slap forehead, mutter ‘oh yes, this is Perl’, fix bug.

But I'm glad I made the mistake, because look at what the wrong program is actually doing: it's exactly a tool for abbreviating long words in the style of ‘i18n’ and ‘l10n’. Of course that's not a hard thing to do, but I was very amused to have managed to do it completely by accident!
