Optimization (not SEO though)

The software that I use for maintaining the BME site has become quite robust over the past few years. As I'm going through emails, the software opens a browser window and instantly displays a custom gallery of all of the sender's submissions and so on; it's very useful both for getting a reminder of who I'm talking to and for catching double submissions (I do fingerprinting, but that only catches some, since people often send almost identical pictures of the same thing years apart).

Anyway, in order to do this, it has to do a pretty resource-intensive build before starting up. The build involves mostly text searches of very large text arrays, and is the only thing in my current maintenance codebase that I'd consider “slow”… As of yesterday, loading all submissions from 2002 onward took 6:45 to complete. When I initially wrote it the databases were much smaller, so it wasn't a big deal and I didn't bother to write it efficiently, but a 6:45 hole in my work schedule is surprisingly debilitating.

That said, I knew when I wrote it that I'd have to optimize it eventually, so it had been designed for upgrades and wasn't too hard to fix. I added a caching scheme for the searches (so it doesn't repeat searches that it's done recently), and tested it with a few cache sizes, eventually settling on caching the previous five searches; this cut the load time down to 1:53, which is back in bearable territory. By the end of the year I'll have to do one more fix, which is making the database sort itself periodically during the load process to allow for binary searches (the input data is in totally random order).
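For anyone curious what "caching the previous five searches" might look like, here's a minimal sketch in Python. It is not the actual BME maintenance code; the names `RecentSearchCache` and `slow_search`, the sample records, and the least-recently-used eviction rule are all assumptions made for illustration.

```python
from collections import OrderedDict


class RecentSearchCache:
    """Remembers the results of the last few distinct searches (hypothetical helper)."""

    def __init__(self, search_fn, capacity=5):
        self._search_fn = search_fn      # the expensive search being wrapped
        self._capacity = capacity        # five, per the post; tune by experiment
        self._entries = OrderedDict()    # term -> result, oldest entry first
        self.misses = 0                  # how many times the slow search actually ran

    def search(self, term):
        if term in self._entries:
            # Cache hit: mark the term as most recently used and skip the slow search.
            self._entries.move_to_end(term)
            return self._entries[term]
        # Cache miss: run the slow search and remember the result.
        self.misses += 1
        result = self._search_fn(term)
        self._entries[term] = result
        if len(self._entries) > self._capacity:
            # Evict the least recently used term (an assumed policy).
            self._entries.popitem(last=False)
        return result


# Stand-in for the slow text search over a large array (made-up data).
RECORDS = ("alice: tattoo", "bob: piercing", "alice: scar")


def slow_search(term):
    # Linear scan: returns the indexes of every record containing the term.
    return [i for i, text in enumerate(RECORDS) if term in text]


cache = RecentSearchCache(slow_search, capacity=5)
first = cache.search("alice")    # runs the slow search
second = cache.search("alice")   # answered from the cache
```

A cache this small only pays off when the same search terms show up in bursts, which is presumably why a size of five was enough to make a big difference here.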

I used the same sort-and-binary-search technique on the search engine build routines, and it cut the build down from something like six or seven hours to about half an hour (!!!)… I figure it'll cut this load down to under a minute, even with a much larger database. Thinking about optimization is a lot of fun… If it interests you as well, check out the 256 bytes demos archive.
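Here's a rough sketch of the "sort periodically during the load, then binary search" idea, again in Python and again not the real code. The class name `PeriodicallySortedIndex`, the `resort_every` threshold, and the (key, value) layout are assumptions for the sake of the example. New records go into a small unsorted buffer; every so often the buffer is merged into the sorted list, so most lookups are a binary search plus a short scan of whatever hasn't been merged yet.

```python
from bisect import bisect_left


class PeriodicallySortedIndex:
    """Keeps most records sorted so lookups can use binary search (hypothetical)."""

    def __init__(self, resort_every=1000):
        self._sorted = []                  # (key, value) tuples, kept in sorted order
        self._pending = []                 # recent arrivals, still in random order
        self._resort_every = resort_every  # assumed threshold; tune by experiment

    def add(self, key, value):
        self._pending.append((key, value))
        if len(self._pending) >= self._resort_every:
            self._merge()

    def _merge(self):
        # Fold the unsorted buffer into the main list and sort once.
        # Assumes keys and values are mutually comparable (strings here).
        self._sorted.extend(self._pending)
        self._sorted.sort()
        self._pending.clear()

    def find(self, key):
        # Binary search: (key,) sorts just before any (key, value) tuple,
        # so this lands on the first entry with a matching key, if any.
        i = bisect_left(self._sorted, (key,))
        hits = []
        while i < len(self._sorted) and self._sorted[i][0] == key:
            hits.append(self._sorted[i][1])
            i += 1
        # Records that have not been merged yet still need a linear scan.
        hits.extend(v for k, v in self._pending if k == key)
        return hits


index = PeriodicallySortedIndex(resort_every=3)
for name, filename in [("zoe", "z1.jpg"), ("amy", "a1.jpg"),
                       ("zoe", "z2.jpg"), ("amy", "a2.jpg")]:
    index.add(name, filename)

amy_files = index.find("amy")
zoe_files = index.find("zoe")
bob_files = index.find("bob")
```

The trade-off is in the threshold: sorting too often wastes time on the sort itself, while sorting too rarely leaves a long unsorted tail that has to be scanned the slow way.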

