Counting in English
Just now I was sorting through my old projects, zipping up the ones I haven't touched in forever, for archival in the dusty, dark corners of some external storage device. And I came across EnglishCounter. A while back a friend of mine was about to interview for a job; and he was told to be prepared to write a program that "counts from one to one million in English". Yeah, it definitely wasn't a senior software engineering position. Anyway, somehow that translated into "bug Curtis", but ultimately it proved to be a pretty entertaining exercise.
Our make-shift competition was graded on three criteria: (1) speed of execution, (2) number of lines of code and (3) time to code.
He was going to be working in ActionScript, which means that in my world, only Java would likely have a slower execution time (they've really been trying hard to make ActionScript not suck), leaving Perl, C, Objective-C; which ensured that I'd win criteria #1; and I wasn't totally sure how fast Perl would handle this so I dropped it into the bucket with Java, just to be safe. Unfortunately for C, I wasn't sure that I'd be able to get it working faster than he could pound out the ActionScript equivalent (criteria #3), and it'd definitely be more lines of code (criteria #2). So, Objective-C it was.
The end result was this: EnglishCounter.m ... a careful balance of efficiency, time-to-code and number-of-lines.
But, looking it over today, I couldn't deal with it. It was only marginally faster than the ActionScript (partly my fault, partly Objective-C's fault, and entirely inexcusable), and it ended up being about 1/4th the size of the ActionScript program (because he chose to code first and think later), which means that C would have been plenty short, but I probably would have lost #3. So, now I can make up for it.
Behold, I give you the C version: EnglishCounter.c ... a careful balance of ... efficiency. And it has the names of numbers up to vigintillion, but that's like forty-four orders of magnitude greater than what a 64-bit integer can express, so it's a bit over-kill. But whatever.
So, first, I re-wrote EnglishCounter in C, following the same algorithm as the Objective-C version, but dumping the NSString stuff in favor of character arrays and memcpy()'s. Then I dumped all of the strlen()'s in favor of hard-coding the (very finite) number of string lengths in question (ie, the number names). And finally I dumped sprintf() (which is always a bitch in terms of performance) in favor of a Knuth-style base conversion function).
With printing turned off, the execution time (when counting to one million) dropped from ~23 seconds to 0.28 seconds (beautiful!), on my MacBook Air, with gobs of other programs running, in a most unscientific test that is at best only marginally useful for comparing the speeds relative to each other. But, whatever.
--
Update: For your viewing pleasure, I give you the first ten million numbers: TenMillion.bz2 [17.9 MB]. I was going to post the first hundred million numbers, but that worked out to be a four gig file and it was taking forever to compress, so I gave up. If you want those missing 90 million numbers, and you know how to decompress a bzip file, I'll hook you up.
--
Update (04:52): Taking a hint from the stdlibc implementation of strlen(), I rolled my own memcpy() that works with 64 bits at a time and nearly cut the time in half in so doing: 1..1,000,000 in 0.15 seconds now. Okay, I think I'm done for the night.