Optimizing Graphics Pipelines with Meshlets: A Guide to Efficient Geometry Processing

  • 2024-12-09 08:18:07


This article is an excerpt from the book, "
Figure 6.1 – A meshlet subdivision example

These vertices can make up an arbitrary number of triangles, but we usually tune this value according to the hardware we are running on. In Vulkan, the recommended value is 126 (as written in
Figure 6.2 – A meshlet bounding spheres example; some of the larger spheres have been hidden for clarity

Some of you might ask: why not AABBs? AABBs require at least two vec3 of data: one for the center and one for the half-size vector. Another encoding could be to store the minimum and maximum corners. Instead, spheres can be encoded with a single vec4: a vec3 for the center plus the radius.

Given that we might need to process millions of meshlets, each saved byte counts! Spheres can also be more easily tested for frustum and occlusion culling, as we will describe later in the chapter.
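
To make the size and testing argument concrete, here is a minimal C++ sketch. The struct layouts, the Plane type, and the function names are hypothetical and are not taken from the book's code; the sketch also assumes frustum planes with unit-length normals pointing towards the inside of the frustum.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical layouts, for illustration only.
struct BoundingSphere {
    float center[3];
    float radius;
};  // 4 floats: fits in a single vec4

struct Aabb {
    float center[3];
    float half_size[3];
};  // 6 floats: needs two vec3

static_assert(sizeof(BoundingSphere) == 4 * sizeof(float), "sphere packs into one vec4");
static_assert(sizeof(Aabb) == 6 * sizeof(float), "AABB needs two vec3");

// Plane in the form dot(normal, p) + d = 0. Assumption: the normal has unit
// length and points towards the inside of the frustum.
struct Plane {
    float normal[3];
    float d;
};

// Returns true when the sphere lies entirely on the outer side of the plane.
inline bool sphere_outside_plane(const BoundingSphere& s, const Plane& p) {
    const float distance = p.normal[0] * s.center[0] +
                           p.normal[1] * s.center[1] +
                           p.normal[2] * s.center[2] + p.d;
    return distance < -s.radius;
}

// Returns true when the sphere is fully outside at least one of the planes.
inline bool sphere_culled(const BoundingSphere& s, const Plane* planes,
                          std::size_t plane_count) {
    assert(planes != nullptr || plane_count == 0);
    for (std::size_t i = 0; i < plane_count; ++i) {
        if (sphere_outside_plane(s, planes[i])) {
            return true;
        }
    }
    return false;
}
```

The sphere test needs one dot product and one comparison per plane, which is part of why spheres are convenient for culling large numbers of meshlets.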

The next additional piece of data that we’re going to use is the meshlet cone, as shown in the following screenshot:

https://github.com/JarkkoPFC/meshlete) and we encourage you to try both to find the one that best suits your needs.

After loading the data (vertices and indices) for a given mesh, we generate the list of meshlets. First, we determine the maximum number of meshlets that could be generated for our mesh and allocate memory for the vertex and index arrays that will describe the meshlets:

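// Upper bound on the number of meshlets for this mesh.
// max_vertices and max_triangles are the per-meshlet limits defined in the next step.
const sizet max_meshlets = meshopt_buildMeshletsBound(
   indices_accessor.count, max_vertices, max_triangles );

Array<meshopt_Meshlet> local_meshlets;
local_meshlets.init( temp_allocator, max_meshlets,
   max_meshlets );

// Indices into the original vertex buffer (4 bytes each).
Array<u32> meshlet_vertex_indices;
meshlet_vertex_indices.init( temp_allocator, max_meshlets *
   max_vertices, max_meshlets * max_vertices );
// Meshlet-local triangle indices (1 byte each).
Array<u8> meshlet_triangles;
meshlet_triangles.init( temp_allocator, max_meshlets *
   max_triangles * 3, max_meshlets * max_triangles * 3 );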

Notice the types of the vertex index and triangle arrays. We are not modifying the original vertex or index buffer; we only generate lists of indices into the original buffers. Another interesting aspect is that we only need 1 byte to store each triangle index, because these indices are local to the meshlet and a meshlet never holds more than max_vertices vertices. Again, saving memory is very important to keep meshlet processing efficient!
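
As a rough illustration of the memory involved, the following C++ sketch computes the size of the two scratch arrays from the same three quantities used in the allocation above. The struct and function names are hypothetical; only the element sizes (4 bytes per vertex index, 1 byte per triangle index) come from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper type: byte sizes of the two meshlet scratch arrays.
struct MeshletScratchSize {
    std::size_t vertex_index_bytes;    // u32 indices into the original vertex buffer
    std::size_t triangle_index_bytes;  // u8 indices local to each meshlet
};

inline MeshletScratchSize meshlet_scratch_size(std::size_t max_meshlets,
                                               std::size_t max_vertices,
                                               std::size_t max_triangles) {
    // A 1-byte local index can address at most 256 distinct vertices.
    assert(max_vertices >= 1 && max_vertices <= 256);
    MeshletScratchSize size;
    size.vertex_index_bytes = max_meshlets * max_vertices * sizeof(std::uint32_t);
    size.triangle_index_bytes = max_meshlets * max_triangles * 3 * sizeof(std::uint8_t);
    return size;
}
```

For example, with a bound of 1,000 meshlets, 64 vertices, and 124 triangles, the triangle array takes 372,000 bytes; storing the same indices as 32-bit values would take four times as much.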

The next step is to generate our meshlets:

const sizet max_vertices = 64;
const sizet max_triangles = 124;
const f32 cone_weight = 0.0f;

sizet meshlet_count = meshopt_buildMeshlets(
   local_meshlets.data,
   meshlet_vertex_indices.data,
   meshlet_triangles.data,
   indices,
   indices_accessor.count,
   vertices,
   position_buffer_accessor.count,
   sizeof( vec3s ),
   max_vertices,
   max_triangles,
   cone_weight );

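As mentioned in the preceding step, we need to tell the library the maximum number of vertices and triangles that a meshlet can contain. In our case, we use 64 vertices and 124 triangles, which stay within the values recommended for Vulkan; 124 is the largest multiple of 4 that does not exceed 126, and MeshOptimizer has documented a requirement that the triangle limit be divisible by 4. The other parameters include the original vertex and index buffers, and the arrays we have just created that will contain the data for the meshlets.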

Let’s have a better look at the data structure of each meshlet:

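struct meshopt_Meshlet
{
   // Offsets into meshlet_vertex_indices and meshlet_triangles
   unsigned int vertex_offset;
   unsigned int triangle_offset;

   // Number of vertices and triangles used by this meshlet
   unsigned int vertex_count;
   unsigned int triangle_count;
};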

Each meshlet is described by two offsets and two counts, one for the vertex indices and one for the indices of the triangles. Note that these offsets refer to meshlet_vertex_indices and meshlet_triangles that are populated by the library, not the original vertex and index buffers of the mesh.
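
The double indirection described above can be sketched in C++ as follows. The Meshlet struct mirrors the four fields of meshopt_Meshlet, while expand_meshlet is a hypothetical helper written for illustration, not part of MeshOptimizer or of the book's code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Mirrors the four fields of meshopt_Meshlet shown above.
struct Meshlet {
    unsigned int vertex_offset;
    unsigned int triangle_offset;
    unsigned int vertex_count;
    unsigned int triangle_count;
};

// Hypothetical helper: expands one meshlet into indices that point into the
// original vertex buffer, three per triangle.
inline std::vector<std::uint32_t> expand_meshlet(
    const Meshlet& m,
    const std::uint32_t* meshlet_vertex_indices,
    const std::uint8_t* meshlet_triangles) {
    std::vector<std::uint32_t> indices;
    indices.reserve(m.triangle_count * 3);
    for (unsigned int i = 0; i < m.triangle_count * 3; ++i) {
        // 1-byte index, local to this meshlet
        const std::uint8_t local = meshlet_triangles[m.triangle_offset + i];
        assert(local < m.vertex_count);
        // Local index -> index into the original vertex buffer
        indices.push_back(meshlet_vertex_indices[m.vertex_offset + local]);
    }
    return indices;
}
```

A mesh shader performs the same two lookups per vertex, which is why both arrays are uploaded to the GPU alongside the meshlet descriptors.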

Now that we have the meshlet data, we need to upload it to the GPU. To keep the data size to a minimum, we store the positions at full resolution while we compress the normals to 1 byte for each dimension and UV coordinates to half-float for each dimension. In pseudocode, this is as follows:

meshlet_vertex_data.normal = ( normal + 1.0 ) * 127.0;
meshlet_vertex_data.uv_coords = quantize_half( uv_coords );
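
The normal compression in the pseudocode above could look like the following C++ sketch. The function names are hypothetical, and the clamping and rounding steps are assumptions of this sketch; the pseudocode itself only shows the scale and bias.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Maps one normal component from [-1, 1] to an unsigned byte in [0, 254].
// Clamping and round-to-nearest are assumptions added by this sketch.
inline std::uint8_t quantize_normal(float n) {
    assert(!std::isnan(n));
    const float clamped = std::fmin(1.0f, std::fmax(-1.0f, n));
    return static_cast<std::uint8_t>((clamped + 1.0f) * 127.0f + 0.5f);
}

// Inverse mapping, as a shader would apply it after fetching the byte.
inline float dequantize_normal(std::uint8_t q) {
    return static_cast<float>(q) / 127.0f - 1.0f;
}
```

With round-to-nearest, the round-trip error per component stays below 0.5 / 127, roughly 0.004, which is usually acceptable for shading normals.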

The next step is to extract the additional data (bounding sphere and cone) for each meshlet:

for ( u32 m = 0; m < meshlet_count; ++m ) {
   meshopt_Meshlet& local_meshlet = local_meshlets[ m ];

   meshopt_Bounds meshlet_bounds =
      meshopt_computeMeshletBounds(
         meshlet_vertex_indices.data +
            local_meshlet.vertex_offset,
         meshlet_triangles.data +
            local_meshlet.triangle_offset,
         local_meshlet.triangle_count,
         vertices,
         position_buffer_accessor.count,
         sizeof( vec3s ) );

   ...
}

We loop over all the meshlets and we call the MeshOptimizer API that computes the bounds for each meshlet. Let’s see in more detail the structure of the data that is returned:

struct meshopt_Bounds
{
   float center[3];
   float radius;

   float cone_apex[3];
   float cone_axis[3];
   float cone_cutoff;

   signed char cone_axis_s8[3];
   signed char cone_cutoff_s8;
};

The first four floats represent the bounding sphere. Next comes the cone definition, which consists of the cone direction (cone_axis) and the cone angle (cone_cutoff). We do not use the cone_apex value because it makes the back-face culling computation more expensive, although using it can give more accurate culling results.

Once again, notice that quantized values (cone_axis_s8 and cone_cutoff_s8) help us reduce the size of the data required for each meshlet.
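
As an illustration of how these fields can be consumed, here is a C++ sketch of the apex-free back-face test that the MeshOptimizer header documents for these bounds, using the bounding sphere in place of cone_apex. The Vec3 type and the function names are hypothetical, and the x / 127 decoding of the 8-bit values follows the SNORM convention described in the same header; treat this as a sketch rather than the book's shader code.

```cpp
#include <cassert>
#include <cmath>

// Minimal vector type, hypothetical and for illustration only.
struct Vec3 { float x, y, z; };

inline Vec3 sub(Vec3 a, Vec3 b) { return Vec3{a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline float length(Vec3 v) { return std::sqrt(dot(v, v)); }

// Decodes an 8-bit SNORM value such as cone_axis_s8 or cone_cutoff_s8.
inline float decode_snorm8(signed char v) { return static_cast<float>(v) / 127.0f; }

// Returns true when the whole meshlet faces away from the camera.
// Uses the bounding sphere (center, radius) instead of the cone apex.
inline bool cone_backfacing(Vec3 center, float radius, Vec3 cone_axis,
                            float cone_cutoff, Vec3 camera_position) {
    assert(radius >= 0.0f);
    const Vec3 to_center = sub(center, camera_position);
    return dot(to_center, cone_axis) >= cone_cutoff * length(to_center) + radius;
}
```

The radius term makes the test conservative: a larger bounding sphere makes it harder for a meshlet to be rejected, so visible geometry is not culled by mistake.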

Finally, the meshlet data is copied into GPU buffers, where it will be used during the execution of task and mesh shaders.

For each processed mesh, we also save the offset and count of its meshlets to add coarse culling based on the parent mesh: if the mesh is visible, then its meshlets will be added.
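
A CPU-side C++ sketch of this coarse step is shown below. The MeshRange struct, its visible flag, and the gather function are hypothetical; in the renderer this logic would typically run on the GPU, and how mesh visibility is determined is outside the scope of the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-mesh record: a contiguous range in the global meshlet
// array, plus the result of a mesh-level visibility test.
struct MeshRange {
    std::uint32_t meshlet_offset;
    std::uint32_t meshlet_count;
    bool visible;
};

// Collects the global indices of all meshlets that belong to visible meshes.
inline std::vector<std::uint32_t> gather_visible_meshlets(
    const std::vector<MeshRange>& meshes, std::uint32_t total_meshlets) {
    std::vector<std::uint32_t> result;
    for (const MeshRange& mesh : meshes) {
        assert(mesh.meshlet_offset + mesh.meshlet_count <= total_meshlets);
        if (!mesh.visible) {
            continue;  // the whole range is skipped with a single test
        }
        for (std::uint32_t i = 0; i < mesh.meshlet_count; ++i) {
            result.push_back(mesh.meshlet_offset + i);
        }
    }
    return result;
}
```

Only the meshlets that survive this step need the finer per-meshlet sphere and cone tests.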

In this article, we have described what meshlets are and why they are useful to improve the culling of geometry on the GPU.

Conclusion

Meshlets represent a powerful tool for optimizing the rendering of complex geometries. By subdividing meshes into small, efficient chunks and incorporating additional data like bounding spheres and cones, we can achieve finer-grained control over visibility and culling processes. Whether you're leveraging advanced shader technologies or applying these concepts with compute shaders, adopting meshlets can lead to significant performance improvements in your graphics pipeline. With libraries like MeshOptimizer at your disposal, implementing this technique has never been more accessible.

Author Bio

Marco Castorina first became familiar with Vulkan while working as a driver developer at Samsung. Later, he developed a 2D and 3D renderer in Vulkan from scratch for a leading media server company. He recently joined the games graphics performance team at AMD. In his spare time, he keeps up to date with the latest techniques in real-time graphics. He also likes cooking and playing guitar.

Gabriel Sassone is a rendering enthusiast currently working as a principal rendering engineer at The Multiplayer Group. He previously worked at Avalanche Studios, where he first encountered Vulkan and developed the Vulkan layer for the proprietary Apex Engine and its Google Stadia port. He has also worked at ReadyAtDawn, Codemasters, FrameStudios, as well as some non-gaming tech companies. His spare time is filled with music and rendering, gaming, and outdoor activities.
