Sunday, July 26, 2009

OpenGL Render To Texture 10x Slower than DirectX

That's my main problem right now. A call to mSceneManager->manualRender() takes roughly 10 - 20 times longer with the OpenGL RenderSystem in Ogre than with DirectX (9).

After getting my app running in DirectX, I used NVIDIA PerfHUD to debug the framerate spikes - only to find that DirectX is vastly outperforming OpenGL! I can't really afford gDEBugger ($800), so I'm stuck with inserting my mini-profiler into the GL_RenderSystem code to figure out which parts are slow. That will be good experience and will help me get more familiar with the Ogre RenderSystem calls.
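For illustration, a mini-profiler like this can be little more than a scoped timer. The following is a minimal sketch under that assumption, not the actual profiler: the names MiniProfiler, ProfileEntry and ScopedTimer are hypothetical, and nothing here is part of Ogre.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Hypothetical names throughout; none of this is Ogre API.
struct ProfileEntry {
    double totalMs = 0.0;  // accumulated wall-clock time in milliseconds
    unsigned calls = 0;    // number of timed scopes recorded
};

class MiniProfiler {
public:
    // Add one timing sample under the given name.
    void record(const std::string& name, double ms) {
        ProfileEntry& entry = mEntries[name];
        entry.totalMs += ms;
        ++entry.calls;
    }

    // Look up the accumulated stats; the name must have been recorded.
    const ProfileEntry& entry(const std::string& name) const {
        auto it = mEntries.find(name);
        assert(it != mEntries.end() && "no samples recorded under this name");
        return it->second;
    }

private:
    std::map<std::string, ProfileEntry> mEntries;
};

// Times the enclosing scope and reports to the profiler on destruction.
class ScopedTimer {
    using Clock = std::chrono::steady_clock;

public:
    ScopedTimer(MiniProfiler& profiler, std::string name)
        : mProfiler(profiler), mName(std::move(name)), mStart(Clock::now()) {}

    ~ScopedTimer() {
        const std::chrono::duration<double, std::milli> elapsed =
            Clock::now() - mStart;
        mProfiler.record(mName, elapsed.count());
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    MiniProfiler& mProfiler;
    std::string mName;
    Clock::time_point mStart;
};
```

Putting a ScopedTimer at the top of a block that wraps a suspect call (mSceneManager->manualRender(), for example) accumulates that call's cost under one name, so the totals can be compared between the two RenderSystems.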

Misc Notes:
- Make sure you play around with your RenderSystem configuration settings. I had set my Display Frequency to 59 and that caused the framerate to be cut in half. Dunno why I had it set to that value.
- With OpenGL you don't need the RenderSystem to issue the _beginFrame() and _endFrame() calls, but you do in DirectX.
- Fixed a bug where writing dynamic textures with pointer math was messing up my textures in DirectX. I switched to array notation and that fixed the issue. I haven't spent much time figuring out why it was causing me so much grief (it worked fine with the OpenGL RenderSystem).
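The cause of that texture bug is still undiagnosed, so the following is only a guess at one common culprit: a locked texture whose rows are padded, so that the row pitch is larger than width times pixel size. Pointer math that assumes tightly packed rows then skews every row after the first. This is a minimal sketch of a pitch-aware copy, assuming 32-bit pixels; LockedPixels and copyPixels are hypothetical stand-ins, not Ogre or Direct3D types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a locked texture region (not an Ogre type).
struct LockedPixels {
    std::uint8_t* data;         // start of the locked memory
    std::size_t width;          // pixels per row
    std::size_t height;         // number of rows
    std::size_t rowPitchBytes;  // bytes from one row start to the next
};

// Copy tightly packed 32-bit pixels into a destination that may have
// padding at the end of each row.
void copyPixels(const LockedPixels& dst, const std::uint32_t* src) {
    const std::size_t rowBytes = dst.width * sizeof(std::uint32_t);
    assert(dst.rowPitchBytes >= rowBytes);  // pitch can exceed the row size

    for (std::size_t y = 0; y < dst.height; ++y) {
        // Advance by the pitch, not by the row size, on the destination.
        std::uint8_t* dstRow = dst.data + y * dst.rowPitchBytes;
        const std::uint32_t* srcRow = src + y * dst.width;
        std::memcpy(dstRow, srcRow, rowBytes);
    }
}
```

If the pitch happens to equal the row size (which may be the case on one RenderSystem and not the other), the naive and pitch-aware versions behave the same, which would explain a bug that only shows up on one API. Note that units matter here: some APIs report the pitch in bytes and others in pixels.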


Next, I plan on profiling the GL_RenderSystem to figure out those RTT issues, and upgrading to the newly released Ogre version 1.6.3!!

Tuesday, July 21, 2009

Optimisation Fail

This is a maintenance/progress blog post so nothing exciting really. I just try to post a few times a month as any developer keeping a journal should.

I got rid of the vertex buffer and texture resource pools and created a mesh pool instead. Each mesh has its own vertex buffer and textures, so the same thing is accomplished and the mesh resource manager is now way simpler. I used to have std::maps with pointers to the meshes, one list for each stage a mesh could be in (needs building, built, visible, cached and cache build) - now I have 3 std::vectors for 3 lists: needs building, built and visible.

The lists are stored in an array like this:

std::vector mMeshLists[3];

The built and visible lists are at indices 0 and 1, and I swap them periodically like this:

mVisibleListIndex = (mBuiltListIndex + 1) % 2;

And the build list is always index 2.
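Put together, the index juggling looks roughly like the sketch below. It is a minimal illustration, not the real resource manager: the Mesh struct, the MeshLists class and its accessor names are hypothetical, and the element type of the vectors (Mesh*) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical placeholder for the real mesh class.
struct Mesh {
    int id;
};

class MeshLists {
public:
    // The build list always lives at index 2.
    static constexpr std::size_t kBuildListIndex = 2;

    std::vector<Mesh*>& builtList()   { return mMeshLists[mBuiltListIndex]; }
    std::vector<Mesh*>& visibleList() { return mMeshLists[mVisibleListIndex]; }
    std::vector<Mesh*>& buildList()   { return mMeshLists[kBuildListIndex]; }

    // Swap the roles of the two lists without copying any elements:
    // only the indices change.
    void swapBuiltAndVisible() {
        mBuiltListIndex = mVisibleListIndex;
        mVisibleListIndex = (mBuiltListIndex + 1) % 2;
        assert(mBuiltListIndex != mVisibleListIndex);
    }

private:
    std::vector<Mesh*> mMeshLists[3];  // element type is an assumption
    std::size_t mBuiltListIndex = 0;
    std::size_t mVisibleListIndex = 1;
};
```

The point of the modulo trick is that the swap costs two integer assignments no matter how many meshes are in the lists.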

Unfortunately, the app is still too jerky. Creating the textures on the graphics card is taking too long, generating the heightmaps is taking too long, and it's not smooth even though I'm using threads for all the non-OpenGL stuff. Most of the time spikes are now happening in Ogre, so I will probably have to add some profiling there to figure out what's going on.

Also, I switched over to DirectX 9 so that I can use NVIDIA's PerfHUD tool to help me analyze how the graphics card and CPU are being utilized. That's been fun.

The only major bug I have at the moment is that the Render To Texture code that generates the terrain texture for each patch isn't working correctly in DirectX. My dynamic heightmap texture is not being created correctly and I don't know why yet. I have been experimenting with different pixel formats, so far to no avail.

I'm almost thinking of doing another, simpler version of the quad tree planet that uses geomipmapping and restricts the planet size to 1025x1025 vertices per side - or maybe 2049x2049. This terrain is taking up too much coding time that should be spent on other important game things!