30 June 2014

By the Library of Thor!

I recently uploaded to github a C++ template library called Thor, which is something that I've been playing around with. It's really just a way for me to write some crazy template code--a creative outlet if you will. This blog post is meant to show off some of the things that I'm proud of in Thor. The name Thor came in part because of inspiration from Andrei Alexandrescu's template library Loki and his book Modern C++ Design (an excellent advanced C++ book).

Some things in my Thor library are just me implementing things in the way I would if I had full control, and some are a slightly different take on popular concepts. Some things are already outmoded based on offerings in C++11 (especially in terms of atomic support), but are included for posterity.

I apologize that the code samples are images (Blogger currently doesn't do well with template syntax), but generally a link is provided to the code on github.

Containers

The C++ standard library includes several standard, well-known and well-documented containers, however these containers don't often allow for much customization. For instance, std::vector (up until the C++11 standard) had no obvious means to free memory (C++11 adds the shrink_to_fit() function). Although not technically correct, I'll refer to the C++ standard library as STL below for brevity's sake.

thor::basic_string

Thor's string class works very similarly to std::string and std::wstring with a few changes. First, the strings are atomically reference counted, so the underlying string data is shared when possible. Secondly, a template parameter can be specified to pre-reserve (as part of the basic_string instance, not a separate heap allocation) space for the string. The default of zero is a fully dynamic string, but if you need a quick string on the stack and don't want to allocate anything from the heap, it is easily done with the addition of a size template paramter:

Furthermore, in addition the STL constructors for basic_string, additional options are available for printf-style formatting and literal strings (that don't require copying or allocation):

The printf-style formatting is also supported for append, insert, replace and assign variants.

Thor's string also supports conversion between UTF-8 and "wide" (wchar_t). If wchar_t is 32-bits, then the conversion will be between UTF-8 and UTF-32, however the more likely case is that wchar_t is 16-bits, in which case the conversion will be between UTF-8 and UTF-16, complete with support for UTF-16 surrogate pairs.

thor::list

In general node-based STL containers (such as std::list) will allocate a 'dead node' or a terminator node from the heap as soon as they're constructed (even default constructed). Usually this terminator node will include space for the 'T' stored type, although that space will never be used and therefore wastes space. I frown upon the fact that wasted space is allocated from the slower heap for such a simple operation as default constructing a list.

Consider Thor's list class. Instead of heap-allocating a terminator node, the head/tail pointers for the list class use a list_node_base and the address of the list_node_base is used as the terminator. Thus, the terminator (used by the end() iterator) is implemented as such:

This has the advantage of not requiring a heap allocation for the terminator node while still allowing for simple reverse iteration and no need for null-pointer checking at head/tail. However, there are two disadvantages. The first is that any accidental writes to the end() iterator (which is completely invalid anyways) may cause memory corruption. This can be prevented in checked/debug builds with asserts when trying to dereference the end() iterator. The second is that swap semantics require slightly more finesse (although these are more complicated in Thor for another reason explained below). In std::list the swap() function must only exchange pointers and counts between two lists--since the terminator node is allocated on the heap it is not tied to either list and is automatically part of the swap. However, since Thor's list essentially has a local terminator node as part of the instance, care must be taken to fix up the terminators during swap:

Notice that the swap function handles a concept of shareable. This is because of another enhancement of Thor's containers. Essentially, space for a certain number of items can be pre-allocated along with the container, similarly to the string class mentioned above. For instance, thor::list<T,5> specifies that no heap allocations must occur for the first five list entries. Fortunately, thor::list<T,5> inherits from thor::list<T> so that it can be used anywhere where a list is required. However, it also means that the problem of swap() now must handle pre-allocated nodes. It is less efficient to swap() a list with pre-allocated nodes as this requires the nodes be converted to heap allocated nodes.

There are many other extensions to all of the containers, yet they all try to remain true to the C++ standard library specification.

thor::hash_map (et al)

The thor::hashtable class (which forms the basis of hash_map, hash_set and the multi- versions of each) has a noteworthy feature: a policy class that controls how the stored values are organized into buckets. The options are a power-of-two system that is very fast by using bitwise masks instead of divide/mod instructions (although this method can be terrible for pointers as keys as they typically are multiples of 4 or 8 and the bucket strategy loses efficiency leading to collisions), or a more traditional (and slower yet less collision-prone) prime-number strategy.

The C++ standard library hash containers are undefined order containers. This means that the order in which you insert items is not necessarily the same order in which you would iterate over the items. However, thor's hash containers are a fusion of a list and a hashtable, which allows for a defined iteration order as well as amortized O(1) lookup time. This does require a slight change to how iteration starts: the begin() function defaults to 'list' iteration mode but can be changed to 'hash' iteration mode, whereas find() operations default to 'hash' iteration mode.

Embedded Containers

For std::list (or thor::list for that matter), the 'T' type that you're storing in the container is contained within a node that tracks other information for the container. If you had a list of pointers to objects in the heap, then the objects were allocated and the list container must allocate a small node to store the pointer. This is wasteful. The embedded containers can alleviate this by having the node tracking information as part of the object that is being stored. For instance, the 'T' type stored in a thor::embedded_list must contain an embedded_list_link member and the member is given as a template parameter to thor::embedded_list.

Atomic integer/pointer

The thor::atomic_integer system is similar to C++11's atomic wrapper, so while not completely necessary it was a fun exercise to write. The real meat of how it works is based in the platform-specific interlocked system (the Windows version is interlocked_win). This uses template specialization to handle one-, two-, four- and eight-byte integers. By using intrinsics, these atomic_integers actually compile down to very few inline instructions for most operations. Consider how declaring interlocked works on a platform with a 32-bit integer. First, some magic happens based on how the non-specialized interlocked class is declared. There is a second template parameter that is required, but defaults to sizeof(T):

The template parameter T_SIZE defaults to 4 for int, which causes a specialization to be selected:

This specialization uses the correct functions/intrinsics for the numeric type that we're wrapping in atomic_integer. This allows atomic_integer to do the correct operation based on size of the integer parameter:

Memory Alignment

Another place where Thor uses template specialization is with memory alignment. All of the containers do memory allocation through Thor's memory functions. There is a simple class called the align_selector:

This class's only job is to determine if an object can use system-guaranteed alignment or not (most objects can). If the system alignment will suffice, alignment_selector::alignment will be zero. The align_alloc class uses the align_selector as a default template parameter to select the proper specialization for the alignment:

The system default alignment just allocates raw bytes:

But the default version that handles non-default alignments will over-allocate and offset:

Arguably, this version could use _aligned_malloc or memalign or similar, but by using new[] an application may still override new and use their own memory manager if so desired.

There is a lot more to the Thor library. Take a look at it on github and let me know what you think.

15 May 2014

Busting Mines

Behold: Bust A Mine. My first iOS development project.

Not a bad run!

Inspiration

When I set out to (once again) try my hand at iOS development, I didn't know what the future would hold. It was around the time that a former game that I worked on, Clone Wars Adventures, had announced that it would be ending on 31 March 2014. One of my favorite mini-games in CWA was called Mine Buster, a simple game that involved creating a chain reaction for points. It was also a mini-game that would soon cease to exist. And a good mini-game that could use a Spiritual Successor.

Beginnings

I started working on my own rendition in early March while professionally employed as the Director of Technology at Molten Games. Within two weeks of late-night spare time I had a working prototype that functioned similarly to its inspiration. I had heard good things about using Cocos2D 3.0 and it seemed to fit the bill. My prototype was still very rough and could only run on the iOS Simulator that comes with Xcode 5. However, I was happy enough with the progress that I decided to spring for the iOS Developer Program so that I could actually run and debug on my iPhone and iPad. There was a little learning curve to overcome as running code on the device requires creating App Identifiers, Certificates for code signing and Provisioning Profiles for signed code to be allowed to run on the device. Being able to run and debug on my devices was instantly gratifying, but also revealed another issue: the need to deal with different screen sizes (iPad vs. iPhone) and aspect ratios.

Artist Wanted

It was time to start finalizing the art--getting real assets at the necessary sizes and building out the art style. For this, I'd actually need an artist. Fortunately, Molten's UI artist was on-board with the project after seeing the prototype running on my phone. He had free reign with the art style and created a look inspired by the likes of Geometry Wars yet unique in its own right. We worked together to integrate the art into the game and iterate on the style. Unfortunately, Molten Games collapsed around this time, towards the end of March. Priorities shifted to finding a new day job since this project was a journey of Experience and less of Business.

To Test or not to Test

Working with an artist brought to light another need that I'd have to figure out: distributing test builds of my app to other people. For this I began using Test Flight. This provided a great way to bring testers on board and get feedback. The only issue was minor: when a new device was registered, I would have to add the device's unique identifier to the provisioning profile and upload another build of the app. I started acquiring testers with a broad spectrum of devices. I also acquired a few devices of my own: non-Retina iPad and older iPhone. I was also blessed with winning an iPad Mini in a contest! I made sure to also test on the newer 64-bit devices. Through this I actually discovered some 64-bit compatibility issues with Cocos2D 3.0 (in beta at the time) and worked with the developers to solve them.

Sprechen Sie Deutsche?

I can understand exactly one of these.
Having visited Moscow and Shanghai for assisting Planetside 2 localization (and working on games for the past decade or so), I'm constantly reminded of the worldwide gaming community. The English-speaking community might be easier for me to market, but it's a small part of the world. Therefore, I wanted to make sure that my app could be distributed and popular in other locales. I had a very small number of strings (less than 100), so professional localization would be fairly easy. I originally started working with Rev.com but found them difficult and obtuse. I ended up going with e2f and found them slightly more expensive, but very easy to work with. They appreciated all of my contextual notes and turned around six languages (French, Italian, German, Spanish, Russian and Simplified Chinese) in less than 24 hours. In hindsight, I should have included a few more items in the translations, such as future (generic) update notes and anything that I would potentially need in the next few months.

Finishing Touches

This game, since it was an exploration of experience, was going to be free and advertisement-supported. However, given the ease of doing in-app purchases, I would offer an in-app purchase to remove the ads. I also wanted GameKit integration to essentially have free Leaderboards. Fortunately many kind developers have written simple APIs around these pieces of functionality and have shared them for free. Integrating the likes of ABGameKitHelper and MKStoreKit was very easily done. My testers also suggesting graphical features such as using UIMotionEffect to simulate a parallax display on iOS 7+.

A Different Beat

Having some manner of background music was important to me. However, I'm not a composer. I don't really know many composers. So I started looking for free music. Interestingly, though, "free" isn't always free. Some companies let you use their music for free for certain purposes, and an app is not generally one of those free uses. I originally wanted to use some songs from Freeplay Music, but after seeing their licensing requirements and reading about their litigation processes, I decided against it. I ended up using music by Chris Zabriskie who offers his music for free with a Creative Commons Attribution license. I feel that his music meshes well with the peaceful, soothing nature of Bust A Mine's laid-back gameplay.

Pre-Launch Checklist

I've heard of using a test market to pre-launch, so I figured I'd give that a try. I have some friends with family in the Philippines and it seems to be a fairly popular test market. Launching in the Philippines also revealed an interesting fact: iAd isn't available in all parts of the world. Therefore, my advertisement strategy would have to change for a worldwide launch.

3... 2... 1... Liftoff!

Bust A Mine officially launched in a limited capacity on 9 May 2014, about 9 weeks after development started. Considering that it was my first foray into iOS development and Objective-C programming in general, I'm calling that a success. Not a business success, but an experiential success. For the full worldwide launch, I'm waiting for Apple to approve the next version that will integrate Google AdMob as a backup to iAd.

To Infinity and Beyond

The background reverberates with explosive force
I really enjoy iOS development. The Objective-C language allows for easy checking to see if features are available on the device it's running on. Furthermore, the Xcode integration with Apple back-end services is very well done and easy to use. Want GameKit integration? Flip a switch and you've got it.

Several people have asked about an Android port as well. Interestingly, no one has asked about a Windows phone version. In any case, I'm considering a port to Cocos2D-X, a C++ library that supports all major mobile platforms.

I've got plenty of ideas for Bust A Mine for the future. And a few of my game designer friends have offered some feedback and ideas of their own. Plus, I have more ideas for future games as well. Mobile development has become a fun little hobby for me.

Download Bust A Mine and give it a try. Leave a review or rating if you feel so inclined. And let me know what you think in the comments!

04 February 2014

Githubbing

I just created a repository on github.

https://github.com/jkriegshauser/fastjson

For a while I've been writing a json parser (fully RFC-4627 compliant) similar in interface to rapidxml as a fun way to write some simple code that ended up being fairly complex. I mean, there's a good bit of template partial specialization in there, exception/non-exception handling, Unicode support, unit-testing, etc.

It even figures out if you've given it UTF-8, UTF-16 or UTF-32 and whether the endianness matches your machine or not.

Yeah, so that's how I have fun. Enjoy, universe!

21 November 2013

Catching Fire

Well, it happened.

Nearly two months ago I took the plunge and left Sony Online Entertainment to be the Director of Technology at Molten Games. Molten is a small but well-funded and growing start-up in the San Diego area.

I. Am. Having. A. Blast.

We're building a rock-star team and making a great game. I can't wait until I can talk about it, but you're going to have to stay tuned for a bit.


08 January 2013

Getting Ripped

For the past few weeks I've been converting my DVD and Blu-ray Disc (BD) collection to MP4 format so everything can be easily viewed on the Apple TV. Generally, this has been fairly painless (especially for DVDs), but there is one thing that I desire that most people probably don't care about: captions.

Maybe I'm getting old, but for about a decade I've preferred to watch shows with captions. There are basically three types of captions present on DVD and BD media:

  1. Closed Captions - text-based captions that may include hard-of-hearing information.
  2. Bitmap Captions - bitmaps that are overlaid on the screen.
  3. Forced Bitmap Captions - technically a subset of (2). Typically shown for foreign languages.
Bitmap Captions allow the DVD/BD publisher to ensure that captions look the same on every player, whereas text-based Closed Captions can look and perform differently on different players, TVs, etc. And in my experience, Bitmap Captions don't display on the Apple TV whereas Closed Captions will. Also of note is that DVD and BD each use a separate type of Bitmap Captions--VobSub for DVD and PGS for BD.

Now, if Bitmap Captions don't show on the Apple TV, why do I care about them? There are two main reasons: 1). If Closed Captions don't exist and 2). Forced Subtitles.

At this point I should mention that I'm using Handbrake to encode my movies for the Apple TV. Several pages exist as guides to using Handbrake, so I'm not going to cover it in much detail, except for dealing with captions.

If Closed Captions Don't Exist

If a DVD has Bitmap Captions (VobSub) but no Closed Caption, then we need to find a way to add text-based Closed Captions. Fortunately, there are a few options.

The first option is my least favorite. Handbrake has the ability to burn-in Bitmap Captions. This means that captions will be present, but there will be no way to turn them off. For all captions, that's not desirable, but as seen below it can be quite handy for Forced Captions.

The second option is to find a SubRipper (SRT) file that someone has created. Fortunately there are some great crowd-sourced resources for this, like Open Subtitles and my personal favorite, SubScene. These sites allow you to search for a SRT file that someone has created and uploaded. Handbrake can import the SRT file by using the Import SRT button (seen in the picture above). The downside to this method is that timing is everything. There can be Special Editions and Director's Cut versions of the movie you're searching for, so ensuring that you have the correct file will involve using Handbrake's Preview feature or using a player (like VLC Media Player) to preview before encoding. Otherwise you could end up with captions that don't match.

The final option is to create your own SRT file from the Bitmap Caption. This can either be done manually (VERY time consuming) or with the aid of OCR software. This was more of a last resort for me, but I did do it for a few of my DVDs. I used a program called Subresync. It is about as naive as OCR software can be, essentially asking the user the first time it encounters a specific pixel pattern:

Forced Captions

This is where things get a bit trickier. Forced captions were typically not a problem for DVDs as all of my DVDs that used Forced Captions actually had them burned in to the video stream. Not so for my BDs. A good example is Star Trek VI: The Undiscovered Country where the Klingon speech is a Forced Caption track and is not burned in to the video stream. And Forced Captions are something that everyone should want as they are essentially part of the movie.

Also complicating this process is the fact that Handbrake does not yet understand PGS-style Bitmap Captions (the system used for BDs). Therefore, the PGS (or 'Sup') Captions must be converted to VobSub before using Handbrake. This involves the following steps:
  1. I use MakeMKV to get the BD data on the hard disk in MKV format.
  2. Load the .mkv file in the mkvmerge tool from MKVToolNix. This will tell you which tracks contain PGS Caption data:

    In this example, IDs 2, 3, 4 and 5 contain PGS subtitle data. Leave this program open as we'll come back to it if we do indeed find Forced Captions.
  3. Extract the subtitle data using mkvextract on the command line:
    mkvextract tracks Star_Trek_VI...mkv 2:sub2.sup 3:sub3.sup 4:sub4.sup 5:sub5.sup
    This will extract tracks 2-5 into files sub2.sup, sub3.sup, sub4.sup and sub5.sup
  4. Open each extracted PGS Subtitle file using BDSup2Sub to see if it contains Forced Captions. In the Star Trek 6 example, we find that track 3 contains nothing but Forced Captions:
  5. Since we've found Forced Captions, we need to convert them to a format that Handbrake understands. Fortunately, this is what BDSup2Sub was designed to do. File->Export gives the following dialog (make sure that 'Export only forced' is checked):

    Interestingly, VobSub is made up of a .idx text file and a .sub binary file.
  6. Now we need to add the newly created VobSub captions and re-mux into a new .mkv file. Go back to the mkvmerge that was open from step 2. Click the Add button and select the new .idx file that was created in step 5, then press Start Muxing:
  7. When the muxing finishes, you can now load the new .mkv file up in Handbrake and burn-in the Forced Captions:

Getting captions to work correctly on the Apple TV with custom-encoded movies can be a bit tricky, but if you're hard-of-hearing or just used to using captions, hopefully this guide can help you.

02 May 2012

New Domain!

Well, I've finally gotten around to registering joshuakriegshauser.com. It's about time. Which means that this blog will appear at blog.joshuakriegshauser.com.

Oddly enough, kriegshauser.com is considered a premium domain and is listed for about $1600. I honestly hope whoever is holding it never gets that.

05 January 2012

1 in 600,000

For Christmas this year, my wife gave me (among other things) Another Tough Puzzle. This puzzle claims that "with millions of wrong solutions, the chance of solving this puzzle is ONE in 600,000!"

There are 16 triangular pieces and each piece has three connections each being one of four possible shapes. The completed puzzle should be a pyramid with six connections on the edges pointing in and six pointed out.

The puzzle also claims that "it looks easy...but just when you think you've solved it, the last piece proves you wrong." This is very true and very frustrating. I (and others) played around with the puzzle for a while always ending up with one piece that didn't fit.

Then I thought, "I write software for a living; why don't I do that here?" I've enjoyed writing solvers in the past (my Minesweeper solver would actually play the GUI for you on previous Windows versions). I mostly use C++, so I decided to use it here as well. Certainly there are languages more suited to rapid development, but I'm a glutton for punishment.

The first step was to describe the pieces. We noticed pretty quickly that half of the pieces have two outward-facing connectors (outputs) and one inward-facing connector (input) and the other half are opposite (two inputs and one output). There are no pieces with three inputs or outputs.
enum FaceType
{
    eFaceTypeClover = 0,
    eFaceTypeHeart,
    eFaceTypeSpade,
    eFaceTypeDiamond,
};

enum Direction
{
    eDirIn = 0,
    eDirOut,
};

struct Node
{
    FaceType face;
    Direction dir;

    bool fits(const Node& rhs) const
    {
        return face == rhs.face && dir != rhs.dir;
    }
};
struct Piece
{
    Node node[eNumSides];

    Piece(FaceType in, FaceType out1, FaceType out2, bool)
    {
        node[0].face = in;
        node[1].face = out1;
        node[2].face = out2;
        node[0].dir = eDirIn;
        node[1].dir = node[2].dir = eDirOut;
    }
    Piece(FaceType in1, FaceType in2, FaceType out)
    {
        node[0].face = in1;
        node[1].face = in2;
        node[2].face = out;
        node[0].dir = node[1].dir = eDirIn;
        node[2].dir = eDirOut;
    }
};
Using these structures, all of the pieces can be declared:
Piece pieces[eNumPieces] =
{
// one input, two outputs
Piece(eFaceTypeSpade, eFaceTypeDiamond, eFaceTypeHeart),        // 0
Piece(eFaceTypeHeart, eFaceTypeDiamond, eFaceTypeSpade),        // 1
Piece(eFaceTypeDiamond, eFaceTypeClover, eFaceTypeSpade),       // 2
Piece(eFaceTypeClover, eFaceTypeSpade, eFaceTypeHeart),         // 3
Piece(eFaceTypeSpade, eFaceTypeHeart, eFaceTypeClover),         // 4
Piece(eFaceTypeHeart, eFaceTypeSpade, eFaceTypeClover),         // 5
Piece(eFaceTypeSpade, eFaceTypeClover, eFaceTypeDiamond),       // 6
Piece(eFaceTypeDiamond, eFaceTypeClover, eFaceTypeHeart),       // 7

// two inputs, one output
Piece(eFaceTypeClover, eFaceTypeDiamond, eFaceTypeHeart, true), // 8
Piece(eFaceTypeClover, eFaceTypeDiamond, eFaceTypeSpade, true), // 9
Piece(eFaceTypeHeart, eFaceTypeDiamond, eFaceTypeSpade, true),  // a
Piece(eFaceTypeHeart, eFaceTypeClover, eFaceTypeDiamond, true), // b
Piece(eFaceTypeHeart, eFaceTypeClover, eFaceTypeSpade, true),   // c
Piece(eFaceTypeDiamond, eFaceTypeClover, eFaceTypeHeart, true), // d
Piece(eFaceTypeDiamond, eFaceTypeClover, eFaceTypeSpade, true), // e
Piece(eFaceTypeSpade, eFaceTypeHeart, eFaceTypeDiamond, true)   // f
};
Now that the pieces are all set up, the gameboard must be described. The end result should be a pyramid made up of all 16 pieces (see diagram). Therefore, I had to describe how the pieces relate to each other in the gameboard. This will allow the solver algorithm to test that a piece 'fits' if its connectors match the adjacent pieces. In order to do this, each position in the desired puzzle shape is marked with a number and each side of the triangular pieces are numbered as well. The sides are always numbered clockwise starting at the horizontal side so that adjacent pieces in the puzzle have sides matching (for instance, pieces 7 and 8 both share their side number 2).






The array of PieceLoc structures define how each piece relates to adjacent pieces:
PieceLoc locs[eNumPieces] =
{
// Sides:     0               1               2          // Pieces
PieceLoc(eInvalidPiece,  eInvalidPiece,              1), // 0
PieceLoc(            7,              2,              0), // 1
PieceLoc(eInvalidPiece,              1,              3), // 2
PieceLoc(            9,              4,              2), // 3
PieceLoc(eInvalidPiece,              3,              5), // 4
PieceLoc(           11,              6,              4), // 5
PieceLoc(eInvalidPiece,              5,  eInvalidPiece), // 6
PieceLoc(            1,  eInvalidPiece,              8), // 7
PieceLoc(           12,              9,              7), // 8
PieceLoc(            3,              8,             10), // 9
PieceLoc(           14,             11,              9), // a
PieceLoc(            5,             10,  eInvalidPiece), // b
PieceLoc(            8,  eInvalidPiece,             13), // c
PieceLoc(           15,             14,             12), // d
PieceLoc(           10,             13,  eInvalidPiece), // e
PieceLoc(           13,  eInvalidPiece,  eInvalidPiece)  // f
};
The fitting function is recursive. The first time it is called it has all 16 pieces remaining to try and fit. It loops through all remaining pieces rotating them in all possible positions each time checking to see if the piece fits. If the piece fits, it calls itself again this time with 15 pieces remaining. When a pieces is found that fits, it calls itself again with 14 pieces remaining. And so on, and so forth until all pieces have been used. If a piece doesn't fit, it goes on to the next piece. If a solution is found or no more pieces fit, the function returns and tries the next piece.

Since we only have 16 pieces, I use a bitmask to determine which pieces are already used.
void FitRemaining(int remain)
{
    if (remain <= 0)
    {
        ShowSolution();
    }
    else
    {
        int index = eNumPieces-remain;
        for (int i = 0; i < eNumPieces; ++i)
        {
            if ((used & (1 << i)) == 0)
            {
                used |= (1 << i); // Mark in-use
                pieces[index] = i;
                for (int r = 0; r < eNumRotations; ++r)
                {
                    rotation[index] = r;
                    if (validate(index))
                    {
                        FitRemaining(remain - 1);
                    }
                }
                pieces[index] = eInvalidPiece;
                used &= ~(1 << i); // No longer in-use
            }
        }
    }
}

The last part is the validation function. We really only need to validate the most recent piece added. Figuring out which sides are facing each other is the most difficult part, but even that is pretty easy. Because of how the gameboard is organized we already know that side indexes on adjacent pieces always face each other. All we need to do is take the rotation into account and ensure that the shapes match:
bool validate(int piece) const
{
    assert(piece >= 0 && piece < eNumPieces);
    const Piece* cur = Get(piece);
    int rot1 = rotation[piece];
    const PieceLoc& fit = locs[piece];
    for (int f = 0; f < eNumSides; ++f)
    {
        const Piece* p = Get(fit.neighbors[f]);
        if (p)
        {
            int rot2 = rotation[fit.neighbors[f]];
            if (!cur->node[(rot1 + f) % 3].fits(p->node[(rot2 + f) % 3]))
            {
                return false;
            }
        }
    }
    return true;
}

That's all there is to it! I probably spent less time writing this program than manually (and frustratingly) trying to find the solution.

In fact with this program I found all 518,469 solutions to the puzzle. Here's the first one:

If you'd like the full C++ source: Go here.