15 March 2010

Be Careful with Default Arguments

CRC16::computeMemory( const char* data, unsigned length, uns16 crc = 0 );
CRC16::computeString( const char* data, uns16 crc = 0 );
This little gem caused quite a headache. Allow me to explain.

EQII's streaming client uses 16-bit CRCs to tell if an asset has changed. All 500,000 individual assets have a 16-bit CRC calculated and stored in the master asset list (manifest). When an asset is downloaded, it is cached on disk and stored with the 16-bit CRC. To save time, the client doesn't calculate the CRC on each asset that it downloads, it just uses the CRC stored in the manifest.

Every time the client runs, we download the manifest and check the CRCs in the manifest against the stored CRCs for our cached assets to see if any changed. If an asset has changed (the CRC is different), we delete the cached copy and request the replacement, even if it's not immediately needed.

When the streaming client launched, everything worked as planned. It worked great! Little did we know that a particularly evil bug was lurking.

I discovered a problem when I wrote a utility to convert old PAK files (the game data shipped on the DVD) to cached streaming assets. NOTHING matched the CRCs in the manifest. Every asset was wrong. I looked over the code several times and everything looked fine. This bug didn't make sense.

And then it hit me.

The code that was building the manifests was doing this:
uns16 crc = CRC16::computeString( data, dataLength );
Talk about /facepalm. The function I meant to call was CRC16::computeMemory(). The function I actually called treated the input like a null-terminated string. This means that the CRC was only calculated up to the first NUL character and the dataLength parameter was actually being treated as a starting CRC value. This was a bug that had to get fixed. Someday, perhaps many years from now, this would be a huge bug that would waste a lot of someone's time to hunt down.

Oh, but the fun doesn't end there. I couldn't just change the function to fix the bug. Doing that would mean that every streaming client user would have to re-download everything that they had already downloaded. Every CRC would change and the naïve client would happily delete everything and start over. To fix this properly would take a highly-synchronized effort to fix and push the manifests while deploying a one-time-only tool that would re-calculate the CRC for all assets that people had already downloaded.

The moral of the story: be careful with default arguments. They can really hurt if misused.