24 October 2009

Keep Designer Data in Files (not Databases)

Whoa! Did I just fall off the deep end?! Designer Data in ... files?! Data... doesn't that belong in a database?! There are those who say that it does, but I disagree, at least when talking about development.

In MMO terms, Designer Data is data that our Design team creates: Quests, Spells, NPCs, Paths, Items, Recipes, Locations, etc. Tons and tons of data that, together with art, is really the lifeblood of the game.

Designer Data is to Designers as Source Code is to Programmers. It's constantly changing and growing. Different people own different parts of it. Everyone on the dev team needs all of it and as up-to-date as possible to do their current tasks. Servers need it in a highly-optimized form to run the game. There are many parallels between Designer Data and Source Code.

Which brings me to my first major point. As files, Designer Data can be stored in Source Control (we use Perforce, and I love it). This gives an amazing amount of incredibly important functionality and basically for free:
  • Revision history - As text (EQII uses XML) files, anyone on the team can go back and see (through the same tool that everyone must use) all previous changes to a file. Anyone can see exactly what changed between each version or between multiple versions and who made the changes. Also, you can go back in time and grab previous file revisions.
  • Changelists - Most source control systems group together file changes. Perforce calls these Changelists. Grouping files together is an effective tool for seeing relative changes based on concept. One changelist might be labeled "Zone 1 population" and contain NPCs, Quests that those NPCs give out, Paths that those NPCs follow, etc. If a Changelist breaks the game, it can be rolled back wholesale (and you can see who made it and go pound them).
  • Integration - Changes are made in branches and we have branches for each distribution of the game. As a group of changes become ready to go Live, they are integrated into the next distribution (Main goes to QA, QA goes to Staged, Staged goes to Live). With Code and Data both in the same system, integration to the next distribution becomes a one-step process.
  • Code/Design in sync - Since Source Code and Design Data are both stored in the same system, they are always in sync. If a Programmer makes changes that affect both (which often happens), he can rest assured that everyone will easily get the latest in both.
  • Sandboxing - Designers are only affecting their own local version of design data until they check it in. They can do testing and make tweaks before checking in something that could potentially break everyone else.
  • Blame - It's so easy to see who made a change (and confront them if necessary).
If you were doing Design development in a database, you would have to develop your own solution around each of these (or use database tools that not everyone would have access to). And since the database is always up-to-date, and everyone is generally working out of the same database, problems could arise from changing formats or unexpected data.

Furthermore, Designer Data in files are free to have complex formats. For instance, in EQII the Designer Data is object oriented. Data definitions specify that a Character inherits from Entity and Entity inherits from a Base type. This also allows us to do things like have a base Predicate type and more complicated types like CharacterHasQuestPredicate that inherits from Predicate but has additional data members that only make sense for CharacterHasQuestPredicate (like the Quest name). In a database, you might have a Predicate table with a type column that is a number. You would have to look up what that number means from somewhere or write special tools that understand the relationship. Additionally, that Predicate table would have to include all of the options that any types could have in a very generic fashion. So you might have columns called "StringParam1" and "NumParam1". In the database then, you might have a row with type set to 12 and StringParam1 set to a quest name (or more likely NumParam1 set to a unique ID for a particular Quest). If you were looking at raw data, what would you rather see:
<object name="CharacterHasQuestPredicate">
  <string field="sQuest">quests/heritage/dwarven_work_boots</string>
Type     IntParam1    IntParam2
12 1243 <null>

The EQII Designer Data tool is very generic as it knows how to read the game's data definition. To actually add a whole new data type (Achievement for instance) actually takes zero changes to the Designer Data tool. It just takes writing a small text file that describes the Data Definition such as this one for our aforementioned CharacterHasQuestPredicate:
<objectdef name="CharacterHasQuestPredicate" inherits="Predicate">
    <fielddef name="sQuest" type="String" require_dir="quests/" default="" />
However, reading XML files isn't the fastest thing to do. Server startup is fairly slow when reading from XML files, but EQII has a utility that converts the myriad XML files into a single file that contains all of the data in a highly optimized binary format (I've actually written about this file before). This file is mapped into virtual memory allowing all server processes running on the same physical machine to share one copy of the file in memory. For development purposes, developers can run their own servers against the XML data (and both internal development servers and rapidly changed external servers such as a Beta server could run against XML as well for fast turn around). Plus, there are enough options for converting XML data to databases that you could still use XML files for development and run production servers against a database.

The only major shortcoming of keeping Designer Data in files as opposed to a database is for searching and updating large amounts of data--things that a database is designed to do.

For searching data, EQII actually has a very workable solution. The EQII team has developed a system called VooDLe (a name playing off of Google and VDL--the internal name for EQII's data library). This is a very simple web solution that syncs to the latest data and indexes it for searching. It also detects file references and generates links. However, it doesn't help if you'd want to find all of the shields that have a block value higher than 300. For this, you can search VooDLe for the field that declares shield block value and quickly scan through the results. (Rarely is searching for a numeric value so cut and dried in EQII though with level scalars and combat scalars that affect everything.)

Updating large amounts of data is actually something that (shouldn't) happen very often. Furthermore, updating large amounts of data is error-prone and should be limited. You might even call this an inherent benefit to having designer data in files. When this is necessary, it's possible with a simple Perl script. If something does go wrong (even later) the Changelist can easily be reverted.

Based on all this, I feel it's more beneficial to a development team to use files for Designer Data during development rather than a database.