simont ([personal profile] simont) wrote 2009-12-04 09:21 am

Better testing through Google

The other day I needed a lot of primes in a hurry, and I judged that it would be quicker just to sit down and write a simple Sieve of Eratosthenes program from memory than to faff about trying to google up one that somebody had already written and that wasn't in some unhelpful language.

So I wrote one; then I disposed of the obvious bugs by checking its output rigorously for numbers up to 10; then I ran it for numbers up to 100 and didn't see anything obviously wrong (though I didn't look that hard). Then I wondered whether there was any good way to be more confident of its correctness.

On a whim, I made it generate all the primes up to 2^32, and fed the output text file to md5sum (with Unix line endings). Then I pasted the resulting checksum into Google – and found a hit! Somebody else had generated the same set of primes, checksummed them in exactly the same way, and posted the MD5 on a web forum which was talking about prime-generating programs so that other people on the forum could use it as a test case. Just the confirmation I wanted.

The silly thing is that if I'd tried to google for things like ‘md5sum of primes up to 2^32’, it wouldn't have been remotely successful. But once you already know what you think the answer is (at least in cases where it's a mess of digits), googling for that will tell you whether anyone else agreed with you.
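For anyone wanting to repeat the check on a smaller scale, here is a minimal sketch in Python of the sieve-then-checksum idea. It is not the program described above: the function names `primes_up_to` and `md5_of_primes` are made up for illustration, and the output format (decimal, one prime per line, each line ended by a single `\n`) is an assumption about what "a text file with Unix line endings" means here. A checksum will only match somebody else's if they formatted their list in exactly the same way.

```python
import hashlib


def primes_up_to(limit):
    """Return all primes <= limit using a plain Sieve of Eratosthenes."""
    if limit < 2:
        return []
    # One byte per candidate: 1 means "still possibly prime".
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Cross off multiples of p, starting at p*p
            # (smaller multiples were handled by smaller primes).
            count = len(range(p * p, limit + 1, p))
            is_prime[p * p :: p] = bytes(count)
        p += 1
    return [n for n in range(2, limit + 1) if is_prime[n]]


def md5_of_primes(limit):
    """MD5 of the primes <= limit, one per line, Unix line endings (assumed format)."""
    digest = hashlib.md5()
    for p in primes_up_to(limit):
        digest.update(b"%d\n" % p)
    return digest.hexdigest()


if __name__ == "__main__":
    # Small limit for demonstration only.
    print(md5_of_primes(100))
```

This version uses one byte per number and builds a full Python list of the primes, so it suits small limits; going all the way to 2^32 this way would need several gigabytes of memory and a good deal of patience.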

[personal profile] pm215 2009-12-04 09:47 am (UTC)
mnementh$ primes 1 > /tmp/zz9.primes
mnementh$ md5sum /tmp/zz9.primes 
037a526651ff4d6babb3b1a23bb83097  /tmp/zz9.primes
...right?

(That's the standard BSD primes(6). 2^32-1 is as high as it will go, so it's only just good enough for this job...)

[identity profile] fivemack.livejournal.com 2009-12-04 11:02 am (UTC)
I tend to find this more infuriating than useful; to google for the result of a two-month computation that you thought was new and exciting and to get a hit does not fill me with the most ecstatic kind of well-fermented joy.

[identity profile] tigerfort.livejournal.com 2009-12-04 11:27 am (UTC)
(If everybody had thought of it before, that's less frustrating, for some reason; the pessimal thing is to be the second.)

I suspect that's because if you're the second, then you've just missed out on a clever idea, whereas if 10^8 other people have thought of it, then it's a human universal that comes to everyone.

[identity profile] mooism.livejournal.com 2009-12-04 01:25 pm (UTC)
Ah, Google and the "If Wikimedia Made Rainbow Tables" sketch.

[personal profile] gerald_duck 2009-12-04 01:45 pm (UTC)
Rapidly finding primes up to 2^32 became a lot easier when computers started having more than 512Mbytes of RAM.