Nils Arenz continues his vegetation studies in UE4 and speaks about creating dense grass for next-generation video games with minimal loss of performance.
The technique described in this article can be freely tested in January, as Nils’s Interactive Open World Foliage will be among the UE sponsored content of the month.
Introduction
Hello! My name is Nils Arenz. In the last article, I explained how to create realistic-looking vegetation in UE4. That was also the start of my professional career as a Vegetation Artist. Because of my educational background, a BSc in computer graphics, the most important part of making game-ready assets for me is making them as performant as possible. That means spending hours analyzing profiling graphs of different approaches to designing vegetation.
Read the last article here:
What Is It About?
In this article, I would like to talk about the most used kind of vegetation, grass, but you can apply all explanations to any kind of plant. Filling the whole ground with dense grass in Unreal is a complex task. I would like to share the knowledge I gained over the last six months of profiling and designing vegetation, and present the idea of having dense grass in next-generation video games.
This article targets the technical background of creating vegetation for next-gen applications, so before going deep into shader breakdowns, I would like to explain a few important shader problems and the current state of the art to make the article easier to follow. To keep it as short as possible I focused on the most important parts, so you don’t have to read through lots of pages. Keep in mind that all tests were done in Unreal Engine 4. I’m pretty sure other engines handle some of these problems the same way, but without proper testing, I can’t make a proper statement about them.
State of the Art
The most commonly used method to create grass for video games is still packing grass blades next to each other in the albedo map, mapping them onto a plane, and intersecting several copies of this plane to create a 3-dimensional-looking grass cluster. The advantage is that the vertex count is very low. In the picture below you can see the workflow for that.
The problem with this implementation is the heavy usage of transparency. With a densely filled ground, you can imagine the shader has to do a lot of calculations. These calculations split into two problems:
- the actual draw pass (what part of what card is getting rendered)
- the shading (what blade of grass is shading what card/part of the ground)
Now to a few terms that I’ll use a lot in the next part.
Quad Overdraw
Quad overdraw is a pixel-bound problem, which means the more pixels you have (resolution-wise), the more important this problem gets. For many operations, the GPU works on quads (blocks of 2×2 pixels) rather than single pixels. If you have thin objects (like grass), many quads are only partially covered, so the GPU shades a lot of pixels only to discard them. This is why small objects cost you a lot of GPU time, but grass is simply small, and if we are after a natural look, that’s the path we have to take. The more red/white the visualization looks, the worse the performance.
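To make the 2×2 quad idea concrete, here is a minimal sketch in Python. It is not how UE4 measures quad overdraw; it only assumes that every quad containing at least one covered pixel is shaded in full. The function names and the two test shapes are made up for illustration.

```python
def quad_efficiency(mask):
    """Return (covered_pixels, shaded_pixels) for a boolean coverage mask.

    mask is a list of rows; True marks a pixel the geometry actually covers.
    Assumption: any 2x2 quad with at least one covered pixel is shaded in full.
    """
    height, width = len(mask), len(mask[0])
    covered = sum(cell for row in mask for cell in row)
    touched_quads = 0
    for qy in range(0, height, 2):
        for qx in range(0, width, 2):
            if any(
                mask[y][x]
                for y in range(qy, min(qy + 2, height))
                for x in range(qx, min(qx + 2, width))
            ):
                touched_quads += 1
    return covered, touched_quads * 4


def vertical_strip(width, height, column, thickness):
    """Build a mask with a vertical strip starting at `column`."""
    return [
        [column <= x < column + thickness for x in range(width)]
        for _ in range(height)
    ]


# Hypothetical shapes on a 16x16 pixel grid.
thin_strip = vertical_strip(16, 16, column=5, thickness=1)
wide_strip = vertical_strip(16, 16, column=4, thickness=8)

for name, mask in (("1 px strip", thin_strip), ("8 px strip", wide_strip)):
    covered, shaded = quad_efficiency(mask)
    print(f"{name}: {covered} useful of {shaded} shaded pixels")
```

In this toy setup the one-pixel strip uses only half of the pixels the GPU shades for it, while the quad-aligned wide strip wastes none. Real geometry is rarely aligned this neatly, so the waste on thin shapes is usually higher.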
Shader Complexity
In Unreal and other similar CG applications, transparency is very expensive. That’s why you should always keep it as low as possible. In UE4, shader complexity is one of the bottlenecks when it comes to grass or other vegetation. The visualization is easy to understand: the more white areas you have, the worse the performance of the pre- and basepass will be. The color represents the sum of shader instructions the GPU has to calculate for every single pixel for each draw. A modern GPU has about 2000 shader units, each of which can execute roughly one shader instruction per clock tick (around 1500 MHz, depending on the GPU). So the longer the shader code (shader complexity), the longer the render passes take.
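The rule of thumb above can be turned into a rough back-of-the-envelope estimate. The sketch below uses the two figures from the text (about 2000 shader units at about 1500 MHz); the resolution, the instruction count, and the number of overlapping layers are made-up inputs. It is an idealized lower bound that ignores memory access, scheduling, and quad waste.

```python
SHADER_UNITS = 2000        # figure from the text
CLOCK_HZ = 1_500_000_000   # figure from the text (1500 MHz)


def pass_time_ms(width, height, instructions_per_pixel, layers):
    """Idealized time for one pass, in milliseconds.

    Assumes one instruction per shader unit per clock tick and perfect
    utilization, so real timings will be higher.
    """
    total_instructions = width * height * instructions_per_pixel * layers
    instructions_per_second = SHADER_UNITS * CLOCK_HZ
    return total_instructions / instructions_per_second * 1000.0


# Hypothetical inputs: 1080p, a 200-instruction material,
# drawn once versus through 8 overlapping transparent cards.
single_layer = pass_time_ms(1920, 1080, 200, layers=1)
eight_layers = pass_time_ms(1920, 1080, 200, layers=8)
print(f"1 layer:  {single_layer:.3f} ms")
print(f"8 layers: {eight_layers:.3f} ms")
```

The point of the estimate is the scaling: the cost grows linearly with both the instruction count and the number of layers stacked on top of each pixel, which is exactly what overlapping masked cards produce.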
For a good explanation of all GPU-based problems, check out this video:
Test Approach
To reduce shader-based calculations I decided to try modeling each blade of grass on its own. That sounds heavy in terms of vertex count, but don’t underestimate modern GPUs and object instancing. It also gives you the opportunity to use the Pivot Painter 2.0 Tool to animate realistic wind, and it makes the modeling process more accurate and quality-focused. For my assets, I used 1 to 8 polygons per blade of grass.
A few other tips regarding the creation process are located at the end of the article.
Performance Breakdown
In this part of the article, I would like to show the two approaches and break down the render passes to explain the differences between them. For the tests, I’ve built two identical-looking clusters: one using the plane technique, the other one using the "real mesh" implementation. The grass is placed using the Landscape Grass Type node, which generates grass clusters on a ground material. The pictures might not look exactly the same, but this is caused by the procedural placement of the clusters. The density is the same, so you see the same number of clusters in every picture I compare. I tried to optimize the meshes of both designs in the same way; that is, I fitted the plane to the grass as accurately as possible. Here are the technical details of both approaches:
Every game and every ecotype is different, which is why I would like to break down the most important factors and show you the performance differences. For each of these breakdowns, I’ll show pictures and compare the shader times in percent, based on the profiler’s output. To save you from reading through hundreds of shader times in ms, I simplified them as follows:
This means: Using my approach results in 47% more drawcalls but also in an overall performance boost of 81%.
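The article does not spell out the formula behind these percentages, so the following is only one plausible reading, sketched in Python: "more drawcalls" as a relative increase over the baseline, and "performance boost" as a relative speedup computed from frame times. The function names and all input numbers are hypothetical and chosen only to reproduce the two figures from the sentence above.

```python
def percent_more(baseline, candidate):
    """Relative increase of `candidate` over `baseline`, in percent."""
    return (candidate - baseline) / baseline * 100.0


def percent_faster(baseline_ms, candidate_ms):
    """Relative speedup in percent, assuming lower time is better.

    A result of 100 means the candidate renders twice as fast.
    """
    return (baseline_ms / candidate_ms - 1.0) * 100.0


# Hypothetical profiler readings, not measurements from the article.
drawcalls_cards, drawcalls_mesh = 100, 147
overall_cards_ms, overall_mesh_ms = 18.1, 10.0

print(f"drawcalls: {percent_more(drawcalls_cards, drawcalls_mesh):.0f}% more")
print(f"overall:   {percent_faster(overall_cards_ms, overall_mesh_ms):.0f}% faster")
```

Under this reading, values above 100% are possible, which fits a speedup figure better than a "time saved" figure would.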
Here is a small explanation about the renderpasses and what they include.
- Drawcalls: How many drawcalls the GPU has to do for each frame.
- Overall: The total render time including all renderpasses.
- Prepass: The depth-only pass, in which UE renders the scene depth up front so that hidden pixels can be rejected in the later passes. (DBuffer)
- Basepass: Runs through the shaders (materials) to create the GBuffers.
- Shadow: Combination of all shadow-related passes (ShadowDepths, Lights, LightComposition, etc.)
Cull Distance
Density
Different Frustums
Shaded vs. Unshaded
Some engines/devs don’t use shadows for grass. That’s why I also wanted to include a test with unshaded grass clusters. Because the shadow passes on ground foliage are very heavy, many skip them. But even without ground shadows, the individually modeled clusters are about 190% faster in the same scene. This comes from the pre- and basepass, due to the number of calculations the GPU has to do.
Game Scene
In a game, there are way more things going on than just rendering grass. The last test is based on a fully built environment with particle effects, animation, a big landscape, rocks, trees, other structures, and blueprints running in the background. Another goal was to have a really heavy vertex count in the scene. For the grass, a medium-density coverage is used. In the GIF below you can see the scene. Please don’t judge my creativity in this scene.
The results of this test speak against the usage of vegetation billboards, too.
The results might look very dramatic, but keep in mind that there is only a landscape with grass in the scene.
Visual Breakdown
Since I already talked about the visual advantages in the last article (link in the beginning), I’ll just explain the most important parts. For me, it is very satisfying to look at the ground and see that every single blade of grass is modeled on its own and that you don’t see these vegetation billboards. Other plants will also look way better as an actual 3D model than as just a billboard. The next important thing for me is the animation. If you look at a windy meadow, you will see that every blade of grass is moving individually. Modeling as I explained gives you the opportunity to use the Pivot Painter tool for a realistic wind effect. In the GIFs below you can see some game-ready scenes I made with my assets.
Pros & Cons
Pros:
- better performance
- good control over LOD creation
- easier to create more complex, natural plant structures
- better visual quality
Cons:
- consoles and mobile can only visualize a limited number of vertices
- time-consuming creation process
- not possible for every type of vegetation (fern, leaves)
Conclusion
After evaluating all these different breakdown scenes I ask myself: trading vertex count for fewer shader calculations? Definitely worth it, in my opinion. Of course, you have to balance everything properly and plan more time for the creation process, but the benefits in visual quality and performance are a good reason to think about it. Overall I can say the more grass you see (through a larger cull distance or simply a higher density), the more important it gets to optimize grass and other vegetation. Natural environments are highly complex and filled with a huge variety of different plants. Using a higher vertex count to make plants more realistic and more performant at the same time is something I really look forward to in the next years.
Tips & Tricks
After explaining the impact on performance, I would like to give a few tips on how I proceed during my asset creation. It all starts with the scan of the vegetation and which exact plant parts you should choose. Always keep in mind that you want to use the opacity mask as little as possible! When you choose a plant to scan, you have to know whether you can UV this part efficiently. So always think ahead. In the picture below you can see which blades of grass are good and which should be avoided.
The next important thing is to UV the plants super accurately! Make every pixel count. If you do it well, you save shader complexity and quad overdraw.
Using no mipmapping on the opacity map will improve the overall performance. With active mipmapping, the alpha mask gets downsampled depending on your distance (like LOD, but for textures). The downsampling interpolates linearly between the white and black areas of the mask, which increases the shader calculations. In addition, some white areas simply disappear, yet the renderer still draws what is now a fully transparent mesh.
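The effect of mipmapping on a thin masked shape can be sketched in a few lines of Python. This is an illustration under assumptions, not UE4's actual texture pipeline: mips are generated with a plain 2×2 box filter, and a texel counts as visible when its alpha reaches a clip value of 0.3333 (which I believe matches the default Opacity Mask Clip Value of a UE4 material). The 8×8 mask with a one-texel-wide blade is made up.

```python
def downsample(mask):
    """Halve the resolution of a square alpha mask with a 2x2 box filter."""
    size = len(mask) // 2
    return [
        [
            (mask[2 * y][2 * x] + mask[2 * y][2 * x + 1]
             + mask[2 * y + 1][2 * x] + mask[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(size)
        ]
        for y in range(size)
    ]


def visible_texels(mask, clip=0.3333):
    """Count texels whose alpha survives the clip test (alpha >= clip)."""
    return sum(1 for row in mask for alpha in row if alpha >= clip)


# Hypothetical 8x8 opacity mask: a one-texel-wide blade in column 3.
mip0 = [[1.0 if x == 3 else 0.0 for x in range(8)] for _ in range(8)]
mip1 = downsample(mip0)  # blade alpha is averaged down to 0.5
mip2 = downsample(mip1)  # blade alpha is averaged down to 0.25

for level, mip in enumerate((mip0, mip1, mip2)):
    print(f"mip {level}: {visible_texels(mip)} visible texels")
```

At the second mip level the blade has faded below the clip value, so nothing of it remains visible even though the geometry carrying the mask is still submitted and rasterized.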
Another Idea For Avoiding Shader Problems
To avoid shader problems altogether, I can recommend using a pure mesh technique without any opacity mask. But this only works for the simplest blades of grass. The overall performance of this approach is very good because you combine a low vertex count with low shader calculations. The downside of this approach is the lack of realism. In the pictures below you can see the results I had. This allows you to put grass with a culling distance of up to 15000 on a big landscape at more than 140 fps.
What’s Coming Next?
A lot! Here is a small list of my winter projects:
Unity asset store, jungle ground vegetation, full tropical biome vegetation, bushes, and all European trees.
Afterword
And that’s basically the whole magic. I hope you enjoyed reading the article as much as I enjoyed making it! If you have more ideas, questions, or criticism, go ahead and tell me. That’s the best way to improve.
If you want to see all of my assets, check out my Unreal Marketplace Page.