3D Terrain with Level of Detail

Rendering terrains is an important task for video games, simulations and geographic information systems (GIS). Terrains are, however, large and constantly visible and thus expensive to render. As a result, various level of detail (LOD) algorithms have been developed and published over the last three decades.

In this semester project (i.e. pre-bachelor-thesis project), I mainly researched and compared existing terrain LOD algorithms and evaluated them based on their suitability for implementation, especially with regards to modern GPU hardware. As a result, I developed a small terrain LOD renderer named ATLOD (A Terrain Level of Detail (renderer)) in C++ and OpenGL and implemented a suitable terrain LOD algorithm based on my findings.

**Figure 1:** A 420 km x 420 km terrain rendered in ATLOD.

The implemented algorithm is based mainly on Geometrical MipMapping (GeoMipMapping), but also incorporates certain aspects from GPU-based Geometry Clipmaps and various other approaches. Naive brute-force rendering, which simply renders all vertices onto the screen without any LOD considerations, was also implemented to allow comparing the implemented LOD algorithm with the ground truth.

This project page aims to present the basic ideas and implementation details of the implemented terrain LOD algorithm. The implementation still has some room for improvements, which are mentioned at the end of this page.

The source code of ATLOD is available on the GitHub repository AmarTabakovic/3d-terrain-with-lod and is licensed under the MIT license. The full (cleaned up) project report, which contains an introduction to the basics of terrain rendering, an overview of other terrain LOD algorithms, and some performance benchmarks of ATLOD, can be read here¹.

Basic Ideas

In this section, the basic ideas behind the implemented LOD algorithm are presented.

Review of GeoMipMapping

As previously mentioned, the LOD algorithm is mainly based on GeoMipMapping by de Boer [1]. GeoMipMapping splits the terrain up into blocks of a fixed side length \(blockSize = 2^n+1\), for an \(n \in \mathbb{N}\). Some examples for block sizes are 17, 33, 65, 129, 257, etc. The choice of this sizing is so that the terrain block can be represented in multiple LOD resolutions. For example, a 9 x 9 block can be scaled down to 5 x 5, 3 x 3 and finally 2 x 2. Figure 2 shows an example of a a 5 x 5 block at the maximum LOD level 2, at the LOD level 1 and at the minimum LOD level 0².

**Figure 2:** A 5 x 5 GeoMipMapping block at the maximum LOD level 2, at the LOD level 1 and at the minimum LOD level 0.

The blocks are organized in a quadtree, which is used for frustum culling. The LOD level of a block is calculated using the distance from the block to the viewer and a screen-space error. In order to avoid cracks, vertices which would cause T-junctions are omitted and the border area is rendered as a triangle fan instead, as shown in figure 3.

**Figure 3:** Avoiding cracks between two 5 x 5 GeoMipMapping blocks.

Improvements for Modern Hardware

At the time of GeoMipMapping’s publication, the main method of rendering was immediate mode rendering, which loads each vertex manually to the GPU using functions such as glBegin(), glVertex(), glEnd(), etc. Immediate mode rendering is deprecated nowadays in most graphics APIs, which necessitates an approach which makes use of vertex and index buffers. The GPU-based Geometry Clipmaps algorithm by Hoppe and Asirvatham [3] not only solves this problem, but does so in a very performance- and memory-friendly manner.

The main idea of the predecessor Geometry Clipmaps algorithm by Hoppe and Losasso [2] is that a mesh consisting of nested grid-like rings of increasing size is rendered centered around the camera and that the height values are “shifted” as the camera moves, kind of like dragging a tablecloth across a bumpy surface³. The vertex and index buffers are updated each frame in the basic algorithm, which is improved upon in GPU-based Geometry Clipmaps.

In GPU-based Geometry Clipmaps, instead of updating the buffers at runtime, a set of flat meshes of fixed sizes are stored once on the GPU in a vertex buffer and index buffer. Both buffers are not modified at runtime. The heightmap is stored in a texture image. At runtime, these flat meshes are translated to the camera’s position and scaled up, and in the vertex shader, the \(y\)-coordinates are displaced using the heightmap texture. This requires very little memory to be allocated for the vertices and indices and also does not require the costly reallocation of the buffers at runtime.

Main Idea

The algorithm implemented in this project combines both

the simple organisation of the terrain into blocks and rendering blocks with multiple LOD resolutions from GeoMipMapping,
and the aspects of GPU-based Geometry Clipmaps related to its efficient usage of the GPU,

such that it is both simple to understand and performant at the same time.

The main steps of the GeoMipMapping implementation can be summarised as follows: The terrain has a single global vertex buffer and index buffer. The vertex buffer stores the vertices for a flat grid-like mesh centered arount \((0,0,0)\) of side length \(blockSize\). The index buffer stores the indices of the flat mesh for each LOD resolution and each possible border permutation. The heightmap is stored as a texture on the GPU. The terrain contains a block list, which simply stores per-block information relevant for rendering, such as the block’s world-space center, current LOD level, etc. During rendering, the current LOD level and border permutation are updated for each block. Afterwards, for each block the flat mesh is rendered at the block’s current LOD resolution and border permutation. In the vertex shader, the flat mesh gets translated to its actual world-space center and the \(y\)-coordinate is displaced with the heightmap texture.

These steps will be explained in greater detail in the following sections.

Block Structure

The block is encapsulated as a struct GeoMipMappingBlock, which only contains logical data related to rendering, and not any meshes. It consists of the following fields:

glm::vec3 worldCenter: the center point of the block, used for the LOD calculation. The \(y\)-coordinate is retrieved from the heightmap.
glm::vec3 p1, p2: the two extreme points defining the AABB of the block. The \(y\)-coordinates are set to the minimum and maximum heightmap value inside this particular block.
glm::vec2 translation: the 2D translation vector for translating this block to its actual position in the vertex shader at runtime.
unsigned currentLod: the current LOD level.
unsigned currentBorderBitmap: the current border permutation bitmap.

All blocks are stored in a std::vector<GeoMipMappingBlock>. The blocks are generated when loading the terrain. The terrain width and height are rounded down to the next multiple of \(blockSize\).

Vertex and Index Organisation

The GeoMipMapping implementation is based on a single vertex buffer and a single index buffer.

A vertex consists of its \((x,z)\)-position, which are both 4-byte floating point numbers. The \(y\)-coordinate is implicitly always zero. The vertices are organized in a grid of size \(blockSize \times blockSize\), centered around \((0,0,0)\).

The indices are split into a center area and a border area, and the index buffer stores the indices as follows:

The first part of the index buffer contains the indices for the border area at every LOD level and for every possible border permutation.
The second part of the index buffer contains the indices for the center area at every LOD level.

A border permutation is defined to be a 4-tuple \((l,r,t,b)\) where each element corresponds to left, right, top and bottom, and where each element is set to 1 if the neighboring block on the corresponding side has a lower LOD, 0 otherwise. The organisation of indices into their border permutations serves for the crack prevention as described further above.

The 4-tuple can also be represented as a bitmap of the form lrtb, which is useful for easy manipulation and for easy array indexing. For example, the bitmap of a block whose left and top neighbors have a lower LOD level is 1010. The reason for splitting up the mesh into the border and center areas is to save memory, since the center area is the same for all border permutations. Figure 4 visualizes the index organisation.

**Figure 4:** Organisation of the index buffer.

The start indices and sizes into the index buffer for the index buffer subsets containing the desired center and border areas are stored in the std::vector<unsigned> lists centerStarts, centerSizes, borderStarts and borderSizes. These four lists are filled during the generation of the index buffer and the indices. The code for the index generation is quite involved, therefore I refer to ATLOD’s source code (more specifically GeoMipMapping::loadIndices()) for more information on that.

Important to note is that the indices of the border areas are organized such that the difference in LOD level of any two neighboring blocks must be at most 1. There are \(2^4 = 16\) border permutations per LOD level, which are all shown exemplary in figure 5.

**Figure 5:** All 16 border permutations for a 5 x 5 block at the maximum LOD level 2. Note that the center areas are not displayed.

Heightmaps

Heightmaps store the height values for each \((x,z)\)-position. ATLOD supports 16-bit grayscale PNG heightmaps, which allow for up to 65536 different height values. Heightmaps are stored as a texture image on the GPU. The maximum supported heightmap dimension depends on the GPU’s capabilites and is in most cases either 8k x 8k or 16k x 16k on today’s GPUs. Figure 6 shows the 14k x 14k heightmap used in the preview in figure 1. The digital elevtion model (DEM) was retrieved from OpenTopography [5] and procesed into a heightmap using QGIS.

**Figure 6:** 14k x 14k heightmap of a large portion of Switzerland at 30m resolution. The gray values were scaled from \([g_{min}, g_{max}]\) to \([0, 255]\) for visibility, where \(g_{min}, g_{max}\) are the minimum and maximum gray values in the heightmap respectively.

Rendering

The rendering consists of two main phases:

Updating the block list,
and the actual rendering of each block.

Block List Update

The block list update does the following for each block: as a first step, the LOD level of the block is updated. The LOD is based on the distance between the camera and the center of the block. ATLOD supports two LOD determination modes: the linearly-growing distance and the exponentially-growing distance modes. Both modes use a parameter \(baseDist\). The linearly-growing distance mode simply chooses every \(baseDist\) distance units the next lower LOD level. For instance, the closest blocks have the maximum LOD level and after \(baseDist\) units, the blocks have the next lower LOD level, and so on. The exponentially-growing distance mode works similarly, but doubles the distance every decreasing LOD level. Figure 7 demonstrates both modes.

**Figure 7:** Linearly-growing distance mode (left side) and exponentially-growing distance mode (right side).

The below listing shows the LOD determination:

unsigned GeoMipMapping::determineLodDistance(float squaredDistance, float baseDist, bool doubleEachLevel)
{
    unsigned distancePower = 1;
    for (unsigned i = 0; i < _maxLod - _minLod; i++) {
        if (squaredDistance < distancePower * distancePower * baseDist * baseDist)
            return _maxLod - i;

        if (doubleEachLevel)
            distancePower <<= 1;
        else
            distancePower++;
    }
    return _minLod;
}

After the LOD level update, the border permutation bitmap of the block is updated. Based on the left, right, top and bottom neighboring blocks, the new border permutation bitmap is computed by checking whether the LOD of each of the four neighboring blocks is lower than the current block’s LOD and setting that particular bit to 1 if that is the case, otherwise 0.

Draw Calls

For each block, the intersection of its AABB (defined by p1 and p2) with the view-frustum gets calculated. If the current block does not intersect the view-frustum, then it gets skipped. Otherwise, the flat mesh stored on the vertex and index buffer gets rendered with two draw calls: one for the center area and one for the border area. A special case is when the block has a LOD level of 1 or 0, in which case the block only consists of the border area, resulting in the center area not getting rendered.

As previously mentioned, the start indices and sizes into the index buffer are stored in centerStarts, centerSizes, borderStarts and borderSizes. These four lists are indexed with the current LOD (for centerStarts and centerSizes) and with the current LOD times 16 plus the current border permutation bitmap (for borderStarts and borderSizes). These four lists are used in the two draw calls, as shown below:

unsigned currentIndex = block.currentLod - _minLod;

// First render the center subblocks (only for LOD >= 2 , since
// LOD 0 and 1 do not have a center block)
if (block.currentLod >= 2) {
    glDrawElements(
        GL_TRIANGLE_STRIP, 
        centerSizes[currentIndex], 
        GL_UNSIGNED_INT, 
        (void*)(centerStarts[currentIndex] * sizeof(unsigned)));
}

// Then render the border subblocks
currentIndex = currentIndex * 16 + block.currentBorderBitmap;
glDrawElements(
    GL_TRIANGLE_STRIP,
    borderSizes[currentIndex],
    GL_UNSIGNED_INT,
    (void*)(borderStarts[currentIndex] * sizeof(unsigned)));

Vertex Shader

The vertex shader first translates the \((x,z)\)-vertex with the uniform vec2 variable translation, which is the per-block 2D translation vector for translating the flat mesh to the block’s actual world-space position. Next, it samples the current height value from the heightmap texture and displaces the \(y\)-coordinate of the vertex with that value. After both operations, all vertices of the mesh are at their actual world-space coordinates.

Fragment Shader

The fragment shader performs the following steps:

Application of the overlay texture (if specified and loaded)
Calculation of the Phong shading
Calculation of the distance fog

Overlay Texture

The overlay texturing is straightforward and happens only if an overlay texture was previously loaded, otherwise the color of the terrain is gray.

Phong Shading

The shading performed is a special form of Phong shading based on Andersson’s SIGGRAPH 2007 talk on terrain rendering in the Frostbite engine [4]. The ambient lighting is very straightforward and requires no special description. The diffuse lighting requires normal vectors, which are not given by the vertices (recall that the only vertex attributes are the \(x\) and \(z\)-coordinates). Since the heightmap can be accessed in the fragment shader, the normal vector at the current fragment position can instead be calculated by sampling the vertically and horizontally neighboring height values \(y_{left},y_{right},y_{top},y_{bottom}\) and computing the vertical and horizontal differences \[\notag \begin{align*} dx & = y_{left} - y_{right},\newline dz & = y_{top} - y_{bottom}. \end{align*} \] Using these two values, the normal vector \(\mathbf{n}\) can be calculated with \[\notag \mathbf{n} = \frac{(dx, 2, dz)}{\lVert (dx, 2, dz) \rVert}. \] This normal vector \(\mathbf{n}\) is now used to shade that particular fragment of the terrain with diffuse lighting.

Distance Fog

The final step is the calculation of the distance fog, which is based on the distance between the camera and the fragment position. The formula for the fog factor is given by \[\notag fog = \exp(-d \cdot dist),\] where \(d\) is the user-set density and \(dist\) is the distance between the camera position and the fragment position. The variable \(fog\) is clamped between 0 and 1, and is used as the mixing factor between the color of the terrain and the fog color.

Conclusion

Potential Improvements

The implemented algorithm works decently well on my tested hardware (60 FPS on a 2020 MacBook Air), but could still be optimized even further:

With instanced rendering, the number of draw calls could be reduced dramatically.
The frustum culling can be performed with a quadtree instead of linearly iterating the block list. This would drastically decrease the number of frustum-AABB-intersection calculations and lessen the workload on the CPU. The main difficulty with quadtree-based frustum culling is supporting non-square terrains. One solution would be to set the quadtree side length to the next power of two larger than or equal \( blockSize \cdot \max \{ nBlocksX,nBlocksZ \} \), where \(nBlocksX\) and \(nBlocksZ\) are the number of blocks in the \(x\) and \(z\) direction respectively. The terrain would be situated in the top-left corner of the quadtree and all child nodes which do not contain any terrain at all are set to nullptr.
An improvement for avoiding pops during LOD level changes is vertex morphing, which interpolates the vertices between two LOD levels such that the transition is seamless and continuous.
A minor GPU memory improvement would be to rotate the border area of the flat mesh for each block in addition to translating it. This would require only loading the indices for the 6 border permutations (0,0,0,0), (1,0,0,0), (1,1,0,0), (1,0,1,0), (1,1,1,0), (1,1,1,1), instead of all 16. The idea behind this improvement is the idea that for example a (0,1,0,0) border permutation is the same as a (1,0,0,0) border permutation but with a rotation about the \(y\)-axis of 180 degrees.
There are probably some minor bugs which I have yet to hunt down… ;-)

Final Words

Generally, I am quite satisfied with this project, considering that it is my first actual C++ and OpenGL project and that the topic of efficient terrain rendering was completely new to me before starting this project. I’m quite excited for the bachelor thesis. There are plenty of possibilities which I can branch into, such as developing a streaming/paging based LOD system, integrating/implementing a terrain LOD system for a game engine or a scene-graph library, and more.

Note that since the submission of the report, I have done some minor changes and improvements to ATLOD and the implemented algorithm, which means that the report might not be 100% up-to-date. ↩︎
The definition of the term “LOD” is different between papers and systems. Certain papers and systems refer to LOD level 0 as the highest resolution and the maximum LOD level as the lowest resolution. I chose the (slightly more intuitive) approach of denoting LOD 0 as the lowest resolution and each increasing LOD level corresponding to a higher resolution. ↩︎
I recommend this video which visualizes the basic concept behind Geometry Clipmaps. ↩︎