The many faces of the perspective projection matrix

One of the first things I stumbled upon at the beginning of my adventure with graphics programming was the variety of matrices and view spaces. I remember it took me a while to wrap my head around the different naming conventions (is clip space the same as screen space or…?) and how each projection worked from a theoretical standpoint. With the Internet around it’s so much easier to figure things out, but there’s one thing I remember baffling me: the relation between different forms of the perspective projection matrix.

The most popular (and dare I say: the only?) representation of the projection matrix that you can find in decent graphics and math books today is the API-agnostic form:

[ 2n/(r - l),          0, -(r + l)/(r - l),            0 ]
[          0, 2n/(t - b), -(t + b)/(t - b),            0 ]
[          0,          0,  (f + n)/(f - n), -2fn/(f - n) ]
[          0,          0,                1,            0 ]

where: 
r, l, t, b - respective planes of the view frustum (right, left, top, bottom)
f - far plane
n - near plane

This matrix is the result of transforming a truncated pyramid frustum into the canonical view volume (a unit cube) – a process slightly more complicated than a regular orthographic projection and requiring a bit of math work (which I will skip here, as you can find plentiful reference material on the Internet). Let’s assume for a second that it all makes sense to you, that you understand how each element of the matrix came to be and how the entire thing works (no, really, read the full math derivation and try to understand it – it’ll help!). If you’re a beginner in graphics programming, one of the first example implementations of the perspective projection matrix you encounter will probably use this form instead (assuming we’re talking OpenGL):

[ (1/r)cot(fov/2),          0,                0,            0 ]
[               0, cot(fov/2),                0,            0 ]
[               0,          0, -(f + n)/(f - n), -2fn/(f - n) ]
[               0,          0,               -1,            0 ]

where: 
fov - field of view
r   - aspect ratio
f   - far plane
n   - near plane

(if you happen to see tan() being used instead of cot(), remember that one function is the reciprocal of the other – cot(x) = 1/tan(x) – so the final forms may differ slightly in the trig part)

Wait… what?

There’s very little explanation available out there concerning the relation between these two forms. The math behind them is there, you can still find it and get a grip on how each matrix works – but how do two different representations produce the same output? Also, which is the “better” one that I should use? The key is simply to understand that you arrive at the two forms from different input parameters:

– The first matrix is derived given the n and f planes plus the actual dimensions of the view frustum defined by the r, l, t and b planes (here, both the aspect ratio and the field of view can be extracted from the matrix for the given frustum size).
– The second matrix takes the n and f planes into account but, instead of frustum dimensions, uses the desired view aspect ratio and the desired field of view.

The second form is therefore easier to use and requires less code, making it the most commonplace in real life. This is especially visible in FPS games, where we want smooth and fast control over the player’s fov. Bottom line? Being able to express the same thing in different ways is a powerful tool, but also one that can easily confuse everyone using it 🙂


Oculus Rift DK2 (SDK 0.6.0.1) and OpenGL ES 2.0

Recently I’ve been working on a VR port of Rage of the Gladiator, a game that was originally released for mobile devices and used OpenGL ES 2.0 as its rendering backend. This seemingly simple task soon produced several fun problems stemming from the limitations of this API in relation to “full-fledged” OpenGL. My initial idea was to rewrite the entire renderer, but this approach quickly turned out to be a dead end (suffice to say, the original codebase was slightly convoluted), so I decided to stick with the original implementation. To run an OpenGL ES application on a PC I used the PowerVR SDK, which is an excellent emulation of a mobile rendering environment on a desktop computer.

Once I got the game up and running, I started figuring out how to plug in my existing Oculus code to get proper output both on the device and in the mirroring window. Rendering to the Rift worked pretty much out of the box – it only required changing the depth buffer internal format of each eye buffer to GL_DEPTH_COMPONENT16 (from the “default” GL_DEPTH_COMPONENT24). Creating a proper mirror output was a whole different story and, while not excessively complicated, it did require some workarounds. Here’s a list of things I ran into – something you should consider if you ever decide to use OpenGL ES in your VR application (but why would you, anyway? 🙂 ):

1. Replacement for glBlitFramebuffer()

Starting with Oculus SDK 0.6.0.0, rendering the mirror texture to a window is as easy as getting the system-handled swap texture and performing a blit to the window back buffer:

    // Blit mirror texture to back buffer
    glBindFramebuffer(GL_READ_FRAMEBUFFER, m_mirrorFBO);
    glBindFramebuffer(GL_DRAW_FRAMEBUFFER, 0);
    GLint w = m_mirrorTexture->OGL.Header.TextureSize.w;
    GLint h = m_mirrorTexture->OGL.Header.TextureSize.h;

    // perform the blit
    glBlitFramebuffer(0, h, w, 0, 0, 0, w, h, GL_COLOR_BUFFER_BIT, GL_NEAREST);

    glBindFramebuffer(GL_READ_FRAMEBUFFER, 0);

With OpenGL ES 2.0 you will soon notice that glBlitFramebuffer() is not available. This causes more complications than may seem at first, because now you have to manually render a textured quad which, while not particularly difficult, is still a lot more code to write:

// create VBO for the mirror - call this once before BlitMirror()!
void CreateMirrorVBO()
{
    const float verts[] = { // quad vertices
                            -1.0f, 1.0f, 1.0f, 1.0f, -1.0f, -1.0f, 1.0f, -1.0f,

                            // quad tex coords
                            0.0f, 0.0f, 1.0f, 0.0f, 0.0f, 1.0f, 1.0f, 1.0f,

                            // quad color
                            1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f,
                            1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f
    };

    glGenBuffers(1, &mirrorVBO);
    glBindBuffer(GL_ARRAY_BUFFER, mirrorVBO);
    glBufferData(GL_ARRAY_BUFFER, sizeof(verts), verts, GL_STATIC_DRAW);
}

void BlitMirror()
{
    // bind a simple shader rendering a textured (and optionally colored) quad
    ShaderManager::GetInstance()->UseShaderProgram(MainApp::ST_QUAD_BITMAP);

    // bind the stored window FBO - why stored? See 2.
    glBindFramebuffer(GL_FRAMEBUFFER, platform::Platform::GetFBO());
    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_2D, m_mirrorTexture->OGL.TexId);

    // we need vertex, texcoord and color - used by the shader
    glEnableVertexAttribArray(VERTEX_ARRAY);
    glEnableVertexAttribArray(TEXCOORD_ARRAY);
    glEnableVertexAttribArray(COLOR_ARRAY);

    glBindBuffer(GL_ARRAY_BUFFER, mirrorVBO);
    glVertexAttribPointer(VERTEX_ARRAY, 2, GL_FLOAT, GL_FALSE, 0, (const void*)0);
    glVertexAttribPointer(TEXCOORD_ARRAY, 2, GL_FLOAT, GL_FALSE, 0, (const void*)(8 * sizeof(float)));
    glVertexAttribPointer(COLOR_ARRAY, 3, GL_FLOAT, GL_FALSE, 0, (const void*)(16 * sizeof(float)));

    // set the viewport and render textured quad
    glViewport(0, 0, WINDOW_WIDTH, WINDOW_HEIGHT);
    glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);

    // safety disable
    glBindBuffer(GL_ARRAY_BUFFER, 0);
    glDisableVertexAttribArray(VERTEX_ARRAY);
    glDisableVertexAttribArray(TEXCOORD_ARRAY);
    glDisableVertexAttribArray(COLOR_ARRAY);
}

2. Keeping track of the window/screen FBO

Many complex games today heavily employ rendering to texture for special effects and various other purposes. My experience shows that once programmers start using RTT, calls to glBindFramebuffer() start appearing at an alarming rate all over the code, and switches between rendering to texture and rendering to the actual window often happen more frequently than they should. Performance impact aside, this behavior usually does not produce unwanted results, and it *may* not matter whether we squeeze a render to the window in between various RTTs. Now, consider that the Oculus mirror render is essentially a blit of the separate eye buffers, which are later blitted again to the output window buffer, resulting in the popular distorted image you see in YouTube videos. If the rendering code performs a blit to the window in between RTTs, parts of the final image may be corrupted by weirdly overlaid images.


Notice how a popup is rendered incorrectly behind the lenses due to a mid-RTT render to window.

For this reason it’s important to correctly track which FBO belongs to the window and avoid reverting to it *before* you render the entire scene – glGetIntegerv() (with GL_FRAMEBUFFER_BINDING) is your friend and can save you a lot of grief, especially in more complex drawing sections. While this is not a VR problem per se and can happen in regular application development, it’s definitely easier to run into in this particular case.

3. Remember to disable OpenGL states after you’re done using them

Again, this is not a strictly VR-related issue, but it is one that can manifest itself right away. With VR rendering you have to remember that you essentially draw the entire scene twice – once per eye. This means that the OpenGL state left over after the first render persists during the second one, which may produce some rather baffling results. It took me quite a while to understand why the left eye rendered correctly, the right eye had messed-up textures and the mirror turned out completely black – it turned out the cause was not calling glDisable() for culling, blending and depth testing. A simple fix but a very annoying one to track down 🙂

4. Don’t forget to disable V-Sync

As of today, the PowerVR SDK seems to create all render contexts with V-Sync enabled – and while this may sound surprisingly easy to detect, it did in fact cause me some trouble. What’s worse, the Oculus Rift didn’t seem to mind and showed a constant 75fps in the stats, which only added to the confusion (why oh why does this one single triangle render stutter all the time?). Calling eglSwapInterval(display, 0) will solve that problem for you.

Conclusion

In retrospect, the issues I ran into were a minor annoyance, but they clearly showed how forgetting simple things can cause a whole bunch of problems you would normally never see when performing a single render. The whole experience was also a nice indication that the current Oculus SDK performs well even with limited OpenGL – even if it’s a bit gimmicky when developing for a PC.


Sparse matrices and projection calculations

If you have ever worked on high-performance 3D applications, you know that every cycle counts. One of the issues programmers try to solve is reducing computation time when dealing with matrices and vectors, especially if the calculations are performed very frequently each frame. Here’s a little trick that can save you some memory and cycles when applying the projection. Consider a typical projection matrix P:

[ A, 0, 0, 0 ]
[ 0, B, 0, 0 ]
[ 0, 0, C, D ]
[ 0, 0, E, 0 ]

Projection is essentially just a linear transformation in homogeneous coordinate space. If vertex coordinates are already expressed in view space as a 4-element vector Vv:

Vv = [vx, vy, vz, 1]

then getting the clip space homogeneous coordinates Vh is simply (using the column-vector convention, to match the matrix layout above):

Vh = P * Vv = [ a, b, c, w ]

We then divide a, b and c by w to get the regular 3D coordinates. There’s a small optimization that can be done here, provided that:

1. Most elements of the object’s transformation/modelview matrix are 0 (such a matrix is called sparse). If this is the case, performing the additional projection transformation via a full matrix multiplication is unnecessary and creates computation and memory overhead.
2. The sparse transformation/modelview matrix has its last row equal to [ 0, 0, 0, 1 ].
3. The object’s transformation/modelview matrix is non-projective (we wouldn’t have to bother otherwise!).

Since P is itself sparse, we only need 4 numbers (instead of a whole matrix) to apply the projection, as long as we divide all of them by the same positive constant – here D (if your D comes out negative, flip the signs, as in the note at the end): [ A/D, B/D, C/D, E/D ].

Condition 2 implies that for a proper model -> view vertex transformation we only need three 4-element vectors instead of a 4×4 modelview matrix. We can then get the view-space coordinates of a vertex by performing dot products. Below is simplified vertex shader code showing how the optimization can be applied (in the actual codebase I used it to reduce the size of bone animation matrices, so the full code is slightly more complex):

#version 100

attribute highp vec3 inVertexPos;
uniform   highp vec4 inModelView[3]; // first three rows of modelview matrix
uniform   highp vec4 inProjection;   // projection matrix elements expressed as: [A/D, B/D, C/D, E/D]

void main()
{
    // transform from model space to view space
    highp vec3 viewSpacePos;
    viewSpacePos.x = dot(inModelView[0], vec4(inVertexPos, 1.0));
    viewSpacePos.y = dot(inModelView[1], vec4(inVertexPos, 1.0));
    viewSpacePos.z = dot(inModelView[2], vec4(inVertexPos, 1.0)); 

    // calculations using view space vertex position
    (...)

    // transform from view space to clip space
    gl_Position = viewSpacePos.xyzz * inProjection;
    gl_Position.z += 1.0;
}

Note that depending on how you define your projection matrix in your codebase, it might be necessary to flip signs, i.e.:

(...)

uniform   highp vec4 inProjection; // projection matrix elements expressed as: [-A/D, -B/D, -C/D, -E/D]

void main()
{
    (...)
    gl_Position.z -= 1.0;
}

This neat trick, as well as many others, comes from the PowerVR Performance Recommendations document, which is an excellent resource for mobile graphics developers.


Game Engine Architecture by Jason Gregory

This is the book I had been wanting to read for a long time. With close to 1000 pages of pure content, you get a heavily condensed compendium of good, bad and typical practices in game engine design. What’s great about this book is that even though it reads like something straight out of a university library, all the information is based on the author’s practical experience. This means there’s relatively little “dry” theory; instead you get analysis of real-life applications and how each component may perform on current gaming hardware. The latter was something I found especially interesting, since there are very few articles out there that give you a decent comparison of Xbox or PS4 hardware against a desktop PC. If you’ve never worked in AAA gamedev, you will definitely learn a lot. That said, the book is clearly aimed at people with varying programming and industry experience. If you have already shipped a title, you may find some parts rather obvious. Nevertheless, even prior knowledge of the covered topics didn’t prevent me from catching some interesting quirks, making it worthwhile to go through the entire thing.

The two major chapters of the book focus on rendering and animation, which are usually the most complex systems. Other elements are not neglected in any way, however – coverage of memory allocators, debug tools, profilers, gameplay design and HID gives the reader a clear picture of what an extensive and complex piece of software a game engine truly is. The comparison of internal tools used by major game developers was pretty informative and should give you insight into how to properly design tools of your own. One thing to note is that there are no ready-made code solutions in any chapter – the whole book should be treated primarily as an introduction to each topic. Supplementary literature is provided, so this makes a perfect starting point no matter what you want to focus on in your programming career.

If you want to start out in the game industry, this is definitely the book you want. If you’re already experienced you may not benefit as much, but you might still learn a thing or two. However, if you already own the first edition you might as well hold out: the audio chapter and the slightly updated information on gaming consoles, while informative, are not good enough a reason to spend another $60.


“Hazumi” – a game that came to be against all odds

December 2014 and January 2015 mark the dates when my simple puzzle game “Hazumi” came out for the Nintendo 3DS. It’s been almost 6 months, and only recently have I started feeling the production pressure wear off, so I decided to share my story with other aspiring game developers out there in hopes that they don’t repeat the mistakes I made. You won’t find any marketing tips here, nor advice on how to use social media to promote your work – there are plenty of much better articles on those topics out there. What I could seldom find was a personal confession from developers on how they felt during development and what they had to go through over the course of their work – something just as important for maintaining mental health. I’m writing this in hopes that someone benefits from my experiences.

Introduction
It all started in mid-2012, when I came up with the idea of finally making a game of my own. The company I worked for at the time was kind enough to let its employees pursue their own projects and even help with publishing if necessary, so it seemed like the perfect moment to get on the “indie bandwagon” and do something creative. Since I wanted to start off easy, I decided to remake one of the games from my early childhood: Crillion for the Commodore 64. The premise was very simple: you’re a ball bouncing around an enclosed brick level, and your goal is to destroy all the blocks of the same color as the ball in order to proceed further. The idea felt perfect for the mobile market at the time, and the basic game mechanics were, theoretically, easy to implement. I teamed up with a friend, a graphics designer, and we set out to release the game for the Nintendo 3DS first, as it was the perfect platform for this kind of game. Thus the development process started.

Choosing the tech
Unlike in most cases, choosing the underlying technology was a no-brainer – our company already had a basic rendering, input and audio engine which supported multiplatform development (including iOS, Android and PC), so we decided to use exactly that. And trust me when I say that developing a game for the Nintendo 3DS without having to test it on the devkit every single time makes a lot of difference. The underlying software, while solving a lot of initial multi-platform issues, was not truly fit for our needs, so I decided to develop an engine on top of it – something I would later call “Talon”. For graphical assets we used all the regular tools of the trade. We aimed for a classic 2D look, so no additional software was needed. We set up a common project on Todoist and Google Groups and distributed all the tasks we felt were essential for the prototype. At that point we had everything needed to make a successful game: technology, skills, an idea and enough motivation for both of us. The only thing we lacked was free time.

Initial development
Intensive work started around August 2012. My artist friend was preoccupied with other project obligations at the time, so at first the main focus was on developing a playable prototype and the engine itself for future projects. I was hyped and really excited. For the next few months I worked 8 hours during the day, then went straight home and spent another 4-5 hours hacking away at the game I wanted other people to play as much as I did. The first prototype was finished by November, compilable and playable across ALL major mobile devices, with full input and audio support. It felt like everything was going as smoothly as it possibly could.

And yet, it took nearly 2 more years to push the game out for sale.

Time is always against us – especially when you’re employed full time. Getting the graphics done turned out to be an extremely arduous task. By December we had concepts but no clear sense of style direction. We decided to take small steps, relax a bit and not rush anything for fear of creating a mediocre product. I told the artist to take a break and think over the design while I tinkered more with the prototype, turned it into a full-fledged game and added some extra features we had both talked about: a UI system, level selection, progression and a level editor. With those things in mind, and still fully motivated, I started implementing each component one by one, further extending the engine’s functionality.

Problems
Another year passed – an intensive, hard-working and extremely stressful time when on most days I worked over 12 hours, the only driving force being my determination to finish the game. I implemented a level editor, the engine was practically ready for multi-project development, and it had all the features I wanted it to have, including stereoscopic rendering for the 3DS and support for every existing compressed texture format I could think of. I even took the liberty of experimenting with different gameplay types, a client-server architecture for sharing levels online and even time challenges that could help retain users on the more difficult Android and iOS markets. The idea bag was full to the brim, but luckily cherry-picking went really well and in a fairly short amount of time I narrowed down the features the final game would have on release day. Not everything went so smoothly, though: it was November 2013 – and we still had no graphics whatsoever. This was the moment I realized the fatal mistake I had made. Our “dynamic duo” lacked communication, and we hadn’t even noticed when it happened. We stopped talking about the game, we stopped thinking about what a fun thing we were making and, as a result, we stopped motivating each other and pushing ourselves further. The team broke up, leaving behind a pile of neatly written code and a game ready to be playtested, if only it had had the looks. It came unexpectedly. I didn’t see it coming.

Starting all over
All that time I had been so absorbed in my own work that I lost the bigger picture. Frustration kicked in. Seeing no other option, I decided to take up the challenge of doing the artwork myself – a big mistake in retrospect, since I consider myself “artistically handicapped” and couldn’t possibly match the quality of other games on the Nintendo. Still, I decided to give it a go and, believe it or not, it was quite entertaining at first. Over the next several months I taught myself the basic drawing tools and began experimenting with different art styles, trying to capture “on paper” what I saw in my head… but it just didn’t stick. When showing the game to others I could tell how unappealing it felt, even though the mechanics and controls were found intuitive and enjoyable. Something was missing and I just couldn’t get it right myself, even with tips from other artists (you wouldn’t believe how difficult it is for a programmer to take art advice!). Then I saw the light at the end of the tunnel, when a graphics artist I worked with at my day job said how much he enjoyed the game and that he’d love to help. I gladly accepted, though with slight reluctance after what I had experienced of teamwork up to that point. It soon turned out to be the best decision I could have made. While I took a bit of time off from the excessive workload, Rafal did an amazing job and in just under 2 weeks created artwork I couldn’t even dream of making myself.


My attempts to work out graphics style.


Final artwork.

With this new shot of inspiration we started the finishing work on the game. Level design took another several weeks, interrupted by playtesting and “QA reports” from our friends. This work quickly started feeling rather mundane – coming up with new levels became more and more difficult and balancing the gameplay proved to be a tedious task. Repetition is a productivity killer and tires both body and mind amazingly fast. By the time we got everything polished and done just the way we wanted, it was already nearing the end of the year. But the moment finally came. The game was complete. “Hazumi” hit the US 3DS eShop in December and both Europe and Japan in January 2015. The reviews were positive, people seemed to enjoy the game as much as we did, and it felt like the mission was accomplished.

And yet something was missing. The thrill just wasn’t there anymore, and all the high flames of excitement from almost 2 years earlier were gone. There was no launch party, no celebration. Nothing. I should have felt happy that I eventually *DID* achieve my goal of delivering a complete game. But I didn’t. All the joy was sapped away by the realization of how much of a toll the development time had taken on me. Had it been two fully productive years, I know I would feel different, but the fact that I had neglected to address the problems earlier just hit me way too hard. I was more angry at myself than glad that I had shipped a high-quality product. And frankly, this completely destroyed all the fun of game development.

Moral of the story
So what lessons have I learned from this?
1. Address communication problems as quickly as possible. Don’t neglect it EVER.
2. If you feel that any team member starts feeling burned out – REACT.
3. Working long hours, several months in a row with no breaks is BAD for you, no matter how passionate you are about what you’re doing.
4. Don’t neglect your health and activities other than developing your game. It’s not worth it.
5. Never lose inspiration, and if necessary do whatever you have to do to self-motivate if there’s nobody around to motivate you. If you’re a coder – start painting, drawing or writing poetry. This helped me and will very likely help you too in your endeavours.
6. If you have little to no experience/skill in the area you need for your game (graphics in my case) – first try looking for people who can help. Trying to learn and doing things by yourself might be enlightening but it’s unlikely you’ll get as good results as with an expert in your team.

To sum up, developing “Hazumi” was extremely taxing and it took me a long time to recover and start enjoying games all over again. I was lucky to realize something was wrong soon enough not to completely lose interest in the project, but had I reacted sooner, I would definitely have benefited more. I hope this post helps you even a little – and if it gives you some ideas on how not to lose your mind doing what you love, that’s good enough for me!

Visit Hazumi homepage


Templates and C-style array size

If you deal with templates a lot in your C++ code, then you’re likely familiar with how template type deduction works. It’s an extensive topic which I’m not going to cover in detail here, but while reading this book I found one aspect of it quite useful in my work.

Remember how you sometimes need to know the size of a C-style array? One way to determine it is something like this:

const int cArray[] = { 1, 2, 3, 4, 5 };

std::size_t aSize = sizeof(cArray) / sizeof(cArray[0]);

There’s nothing wrong with doing it this way, but thanks to templates and the way they deduce types we can get the array size in a much cleaner manner. To briefly recap on C-style arrays: even though you can declare a function using this syntax:

// how we may declare a function
void foo(int arr[])
{
}

What we essentially get is an implicit pointer conversion by the compiler:

// what compiler actually sees
void foo(int *arr)
{
}

This poses a problem, since there doesn’t seem to be an easy and obvious way to “extract” the array size from the function parameter. But there is a solution, one that employs templates. If you read up on how template type deduction works, you’ll learn that C-style arrays (and function pointers) are a special case, treated differently depending on how you declare the template function:

– If a template function takes its parameter by value (non-reference, non-pointer), the type deduced for a C-array is a pointer to its first element
– If a template function takes its parameter by reference, the type deduced for a C-array is the actual array type

So in practice, what happens is:

const int cArray[] = { 1, 2, 3, 4, 5 };

template<typename T>
void foo(T arg)
{
}

template<typename T>
void foo2(T& arg)
{
}

foo(cArray);  // T will be deduced as const int*
foo2(cArray); // T will be deduced as const int[5]

The T& case has an interesting implication, since it allows us to directly access the size of the array from within the template function. For this, however, the argument has to be declared slightly differently:

// get C-style array size in a simple, painless way
template<typename T, std::size_t N>
constexpr std::size_t arraySize(T(&)[N]) noexcept
{
    return N;
} 

// Usage example:
const int cArray[] = { 1, 2, 3, 4, 5 };
std::size_t aSize = arraySize(cArray); // aSize is now 5

The constexpr keyword allows us to use the function’s result to directly specify static array sizes, while noexcept gives the compiler a chance for additional optimization. While this solution doesn’t help when dealing with dynamically allocated arrays, it’s fun to know that C++ templates, usually regarded as code obfuscators, can make programming cleaner in certain applications.


Using C unions in high level code

I always considered the C union an underappreciated construct that nobody cares about in high-level programming. Not really meant for persistent storage (specific cases excluded), it’s difficult to see any good use for it, especially for beginner programmers. Unless you deal with compilers or close-to-the-metal development, chances are you have barely used or spotted a union in an application’s runtime code. But unions can come in handy, sometimes in quite unexpected ways. I often forget about their applications, so hopefully this post will help me remember in the future.

So what can we do with them? Consider a situation where we would like to pack one data type into another. Specifically, assume that we want to pack low-precision uint8_t 4D vector coordinates into a single uint32_t variable. The first thing most people think of is taking each coordinate and ORing it into the target using bit shifts. That works fine, but with a union we can write it more cleanly:

    uint32_t packVector(uint8_t x, uint8_t y, uint8_t z, uint8_t w)
    {
        union
        {
            uint32_t m_packed;
            uint8_t  m_unpacked[4];
        } u; // 4 bytes

        u.m_unpacked[0] = x;
        u.m_unpacked[1] = y;
        u.m_unpacked[2] = z;
        u.m_unpacked[3] = w;

        return u.m_packed;
    }

A completely different problem that a union can easily solve is floating-point endian swapping. You’d usually run into this when dealing with different CPU architectures and binary data stored in a file. Depending on the supported platforms you may need to juggle between little and big endian, and in the case of floats this might incur a performance penalty if you use integer endian swapping with “classic” type casting. But those problems disappear if, instead of casting, you use a union:

    // assuming that float is 32 bit for simplicity
    float swapFloat(float value)
    {
        union
        {
            uint32_t m_integer;
            float    m_float;
        } u;

        u.m_float   = value;
        u.m_integer = integerEndianSwap(u.m_integer); // no penalty

        return u.m_float;
    }
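The snippet above leaves integerEndianSwap undefined; a minimal 32-bit byte swap might look like the following (one possible implementation, matching the name used above). Many compilers recognize this pattern and emit a single bswap instruction; GCC and Clang also provide __builtin_bswap32:

```cpp
#include <cstdint>

// A minimal 32-bit byte swap - one possible implementation of the
// integerEndianSwap helper referenced in the snippet above.
uint32_t integerEndianSwap(uint32_t v)
{
    return  (v << 24)
         | ((v & 0x0000FF00u) << 8)
         | ((v & 0x00FF0000u) >> 8)
         |  (v >> 24);
}
```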

See this article for a detailed explanation on why this is generally a better approach.
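One caveat worth knowing: reading a union member other than the one last written is well-defined in C, but formally undefined behavior in C++ (though widely supported in practice). A memcpy-based sketch sidesteps this entirely and is well-defined in both languages; compilers typically optimize the copies away, so there is still no performance penalty (function and helper names here are mine):

```cpp
#include <cstdint>
#include <cstring>

// Stand-in for any 32-bit integer endian-swap routine.
static uint32_t byteSwap32(uint32_t v)
{
    return (v << 24) | ((v & 0x0000FF00u) << 8) | ((v & 0x00FF0000u) >> 8) | (v >> 24);
}

// Same float swap as above, but using memcpy instead of a union;
// well-defined in both C and C++.
float swapFloatMemcpy(float value)
{
    uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits)); // assuming 32-bit float, as above
    bits = byteSwap32(bits);
    std::memcpy(&value, &bits, sizeof(bits));
    return value;
}
```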


Finding an alternative to std::bitset

In one of my games I ran into a seemingly simple problem: saving a puzzle state (i.e. completed/not completed) for each of the 105 available levels. Naturally, the first thing that came to mind was a static bool array with the required number of entries – both easy to maintain and to write to disk without hassle:

bool m_levelComplete[105];  // array of 105 bools (in most cases: 8 * 105 bits)

That would have served my purpose well, but I felt like expanding on the problem a bit. In my particular case, the array would take 105 bytes (or 840 bits). Not too much, but the excessive redundancy felt "dirty", so I decided to tackle the issue and find a different way. The second solution that came to mind was a std::bitset object:

#include <bitset>

std::bitset<105> m_levelComplete;  // better in terms of size but still "not quite right"

A lot better in terms of storage, std::bitset felt like the thing I needed… with one (well, two, as it turned out later) exceptions: it was not trivial to serialize and store to disk (which I didn't like) and it produced erratic behavior on one of the target platforms (Nintendo 3DS). For various reasons I also needed to know the underlying type of the variables that the bits were "stored in", so this particular implementation didn't fit my purpose. Eventually I started implementing a simple bitset/bit vector of my own:

#define BS_NUM_ELEM (size_t)((numBits + sizeof(storageType) * 8 - 1) / (sizeof(storageType) * 8))

template<typename storageType, size_t numBits> class BitSet
{
    storageType m_bits[BS_NUM_ELEM];
};

For starters, I encapsulate an m_bits array of the desired data type. To satisfy the requested number of bits, I calculate the required number of array elements using the BS_NUM_ELEM macro. As you can see, the number of elements will always be rounded up, so we get nicely aligned data. For example: using unsigned long long for the underlying type (and assuming it's 64 bits in size), we get m_bits[2], so essentially 128 available bits. Since this is a regular C-style array it can be easily saved to a file, and the size redundancy stays low (depending on which data type you use, the redundancy can get even lower). Having this very basic structure, I needed a way to set, clear and access individual bits:

#define BS_NUM_ELEM (size_t)((numBits + sizeof(storageType) * 8 - 1) / (sizeof(storageType) * 8))

template<typename storageType, size_t numBits> class BitSet
{
public:
    // set bit value
    void Set(size_t bitNo)
    {
        for (size_t i = 0; i < BS_NUM_ELEM; ++i)
        {
            if (bitNo < (i + 1) * sizeof(storageType) * 8)
            {
                m_bits[i] |= ((storageType)0x01 << (bitNo - i * sizeof(storageType) * 8));
                break;
            }
        }
    }

    // clear bit value
    void Clear(size_t bitNo)
    {
        for (size_t i = 0; i < BS_NUM_ELEM; ++i)
        {
            if (bitNo < (i + 1) * sizeof(storageType) * 8)
            {
                m_bits[i] &= ~((storageType)0x01 << (bitNo - i * sizeof(storageType) * 8));
                break;
            }
        }
    }

    // access bit value
    bool operator[](size_t bitNo)
    {
        for (size_t i = 0; i < BS_NUM_ELEM; ++i)
        {
            if (bitNo < (i + 1) * sizeof(storageType) * 8)
            {
                return ((m_bits[i] >> (bitNo - i * sizeof(storageType) * 8)) & 0x01) != 0;
            }
        }

        return false;
    }

private:
    storageType m_bits[BS_NUM_ELEM];
};

The code should be fairly self-explanatory (I omitted constructors for clarity). First we determine in which stored variable our desired bit resides. Once it's located, we set/clear the proper bit in said variable. Accessing a bit is done easily with operator[] – note how I return a boolean which simply indicates whether the desired bit is non-zero.
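As a side note, the word index and bit offset the loops above search for can also be computed directly with a division and a modulo – a loop-free sketch (helper names are mine, not part of the original class):

```cpp
#include <cstddef>
#include <cstdint>

// Direct computation of where bit 'bitNo' lives - equivalent to what the
// loops in Set/Clear/operator[] search for, but without iterating.
template<typename storageType>
size_t wordIndex(size_t bitNo) { return bitNo / (sizeof(storageType) * 8); }

template<typename storageType>
size_t bitOffset(size_t bitNo) { return bitNo % (sizeof(storageType) * 8); }
```

For example, with a 64-bit unsigned long long, bit 103 lives in word 1 at offset 39.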

// Usage
BitSet<unsigned long long, 105> bv; // create an "aligned" 128 bit set stored in two ULL variables
printf("%zu", sizeof(bv)); // 16 bytes/128 bits
bv.Set(24);  // set bit 24
bv.Set(103); // set bit 103

...

Full template code can be downloaded from: https://github.com/kondrak/cpp_tools/tree/master/bitset


Effective Modern C++ by Scott Meyers

If you’ve been on the C++ bandwagon for a while you’ve probably heard about Scott Meyers and his “Effective…” book series. While I haven’t read every single one of them, the ones I did check out always came packed with highly compressed information on how to become a more productive C++ programmer. “Effective Modern C++” is, thankfully, no exception.

Each of the 42 tips in the book comes with a practical example and a concise explanation of the techniques used in the code. The title mentions C++14, but most of the content focuses primarily on C++11 (with corresponding, often simplified, C++14 examples where applicable). A nice thing about reading the book is that you don’t necessarily have to be (too) familiar with the language constructs introduced by C++11/14, since every chapter gives an extensive explanation of how each of them works in detail (and why it’s a better/worse solution in particular cases). That being said, it’s difficult not to notice the author’s love for templates – almost all code samples use them. Personally, I have nothing against that, but depending on which branch of the industry you work in you might find the tips more or less useful (in game development, many senior devs will tell you how much they loathe templates!). Nevertheless, going through all 300 pages of “Effective Modern C++” was an educating experience, so if you’re serious about moving to C++11 I highly recommend getting it!
