These are some thoughts on what a modern rendering system would require.
Introduction
Since this engine would not be ready for current-gen architecture, it should target next gen instead, namely a microtriangle approach.
Implementation
Components
The rendering interface
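The latest DirectX and Vulkan cover the vast majority of platforms. Apple devices use Metal as their native API, so they would need either a Metal backend or a Vulkan-to-Metal translation layer such as MoltenVK.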
The game layer
Procedural generation of meshes on the client would help reduce client install size. Currently, all meshes are generated by whatever means and then exported as a bag of triangles. This takes up a massive amount of memory, and it would be much more efficient to generate the mesh on the client from a seed. The same applies to textures; a run-time version of Substance could massively reduce the disk space they take up (if textures are still needed at all). Generating the textures on first run, as part of the install process, or on demand are all viable options with trade-offs. The generation is quick enough to be on demand, especially if the anti-aliased shape generation process were improved.
The microtriangle approach builds on the GPEG mantra of 'the polygon is the new pixel' and takes it to its logical conclusion. Basically, the triangles are so small that each one covers about a pixel of screen space, and the pixel color is determined by the color of the triangle. This means textures are repurposed, procedural LOD becomes much more practical, sub-pixel antialiasing becomes practical, the normal map (and ambient occlusion?) is implicit, and the rendering pipeline is much simpler.
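To make the "one triangle per pixel" target a little more concrete, here is a rough sketch of how a procedural LOD level might be chosen from the projected size of an edge. The function names, the default field of view, the screen height, and the halve-until-one-pixel rule are all assumptions for illustration, not part of any existing engine.

```python
import math


def projected_size_px(edge_length, distance, fov_y_deg, screen_height_px):
    """Approximate on-screen length in pixels of an edge that faces the camera."""
    # Pixels covered by one world unit at a distance of one unit, for a
    # symmetric perspective projection with the given vertical field of view.
    focal_px = screen_height_px / (2.0 * math.tan(math.radians(fov_y_deg) / 2.0))
    return edge_length * focal_px / distance


def subdivision_level(edge_length, distance, fov_y_deg=60.0,
                      screen_height_px=2160, max_level=12):
    """How many times to halve an edge so that it projects to about one pixel.

    The defaults (60 degree FOV, 2160 pixel tall screen, 12 levels) are
    assumed values, chosen only for this sketch.
    """
    size = projected_size_px(edge_length, distance, fov_y_deg, screen_height_px)
    if size <= 1.0:
        return 0  # already at or below pixel size, no refinement needed
    return min(max_level, math.ceil(math.log2(size)))


if __name__ == "__main__":
    # The same one-unit edge needs fewer subdivisions the further away it is.
    for distance in (1.0, 10.0, 100.0, 1000.0):
        print(distance, subdivision_level(1.0, distance))
```

Each subdivision level halves the edge length, so the level grows with the logarithm of the projected size. A generator driven by a seed could take this level as an input and emit only as much detail as the current view needs.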
An additional level of hierarchy is required to handle the massive number of triangles and vertices - this would be clusters. Clusters help with culling (an entire cluster can be rejected at once), split the mesh into manageable chunks for streaming, and make it easier for the GPU to issue its own draw calls - a massive performance boon.
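As a minimal sketch of the cluster idea, the code below splits a triangle list into fixed-size clusters, gives each a bounding sphere, and rejects whole clusters against a set of planes. The cluster size of 128, the `Cluster` type, and the naive split by consecutive triangles are assumptions; a real system would more likely group spatially adjacent triangles and run the test on the GPU.

```python
import math
from dataclasses import dataclass

CLUSTER_SIZE = 128  # triangles per cluster; an assumed value for this sketch


@dataclass
class Cluster:
    first_triangle: int   # index of the first triangle in the cluster
    triangle_count: int
    center: tuple         # bounding sphere center (x, y, z)
    radius: float         # bounding sphere radius


def build_clusters(vertices, triangles, cluster_size=CLUSTER_SIZE):
    """Split triangles into consecutive chunks and bound each with a sphere."""
    clusters = []
    for start in range(0, len(triangles), cluster_size):
        chunk = triangles[start:start + cluster_size]
        points = [vertices[i] for tri in chunk for i in tri]
        # Centroid of the referenced vertices; simple, though not the tightest sphere.
        center = tuple(sum(p[axis] for p in points) / len(points) for axis in range(3))
        radius = max(math.dist(center, p) for p in points)
        clusters.append(Cluster(start, len(chunk), center, radius))
    return clusters


def visible_clusters(clusters, planes):
    """Keep clusters whose sphere is not fully behind any plane.

    Each plane is (nx, ny, nz, d) with a unit normal pointing into the visible
    volume, so a point p is inside when dot(n, p) + d >= 0.
    """
    result = []
    for c in clusters:
        inside = all(
            nx * c.center[0] + ny * c.center[1] + nz * c.center[2] + d >= -c.radius
            for (nx, ny, nz, d) in planes
        )
        if inside:
            result.append(c)
    return result
```

One sphere test per plane stands in for `triangle_count` individual triangle tests, which is where the culling saving comes from. The same `first_triangle` and `triangle_count` pair also describes a contiguous range that could be streamed or drawn on its own.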
Let's do a back-of-the-envelope comparison of a 50k triangle textured object against a 1M microtriangle object and a 4M microtriangle object.
Given an uncompressed vertex being:
    float[3] Coordinates;
    float[3][3] Normals;
    float[2][2] TextureCoordinates;
... and a compressed vertex being:
    float[3] Coordinates;
    byte[2][4] PackedNormals;
    float16[2][2] TextureCoordinates;
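The per-vertex sizes behind these two layouts can be checked with Python's `struct` module. This is only a sanity check under the assumption of tightly packed fields with no padding; the variable names are made up for the example.

```python
import struct

# "<" selects little-endian with no padding between fields.
# f = 32-bit float, e = 16-bit float, B = unsigned byte.

# float[3] Coordinates; float[3][3] Normals; float[2][2] TextureCoordinates;
UNCOMPRESSED_VERTEX = struct.Struct("<3f 9f 4f")

# float[3] Coordinates; byte[2][4] PackedNormals; float16[2][2] TextureCoordinates;
COMPRESSED_VERTEX = struct.Struct("<3f 8B 4e")

VERTEX_COUNT = 50_000

uncompressed_bytes = VERTEX_COUNT * UNCOMPRESSED_VERTEX.size
compressed_bytes = VERTEX_COUNT * COMPRESSED_VERTEX.size

if __name__ == "__main__":
    print(UNCOMPRESSED_VERTEX.size, "bytes per uncompressed vertex")
    print(COMPRESSED_VERTEX.size, "bytes per compressed vertex")
    print(uncompressed_bytes / 1e6, "MB uncompressed")
    print(compressed_bytes / 1e6, "MB compressed")
```

That gives 64 bytes against 28 bytes per vertex, which is where the 3.2M and 1.4M vertex figures for 50k vertices come from.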
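For the 50k triangle textured object:
Item | Count | Uncompressed | Compression | Run time size |
Vertices | 50k | 3.2M | Packed normals, float16 UVs | 1.4M |
Indices/Clusters | 150k | 600k | Array of shorts | 300k |
Diffuse map | 4k | 67M | BC1 - 8:1 | 11.2M |
Normal map | 4k | 67M | BC5 - 4:1 | 22.3M |
ORM map | 2k | 16M | BC1 - 8:1 | 2.8M |
Total | | | | 38M |
Texture run time sizes include a full mip chain, which adds about a third on top of the compressed base level.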
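The texture figures in this table can be reproduced with a few lines of arithmetic. The sketch below assumes 4 bytes per texel before compression, the compression ratios quoted in the table, and a full mip chain approximated as 4/3 of the base level; the helper name `texture_bytes` is made up for the example.

```python
MIP_CHAIN_FACTOR = 4.0 / 3.0  # 1 + 1/4 + 1/16 + ... for a full mip chain


def texture_bytes(side, compression_ratio, bytes_per_texel=4, with_mips=True):
    """Approximate run time size in bytes of a square block-compressed texture."""
    base = side * side * bytes_per_texel / compression_ratio
    return base * MIP_CHAIN_FACTOR if with_mips else base


vertices = 50_000 * 28          # compressed vertex, 28 bytes each
indices = 150_000 * 2           # 16-bit indices
diffuse = texture_bytes(4096, compression_ratio=8)   # BC1
normal = texture_bytes(4096, compression_ratio=4)    # BC5
orm = texture_bytes(2048, compression_ratio=8)       # BC1

total = vertices + indices + diffuse + normal + orm

if __name__ == "__main__":
    for name, size in [("diffuse", diffuse), ("normal", normal),
                       ("orm", orm), ("total", total)]:
        print(f"{name}: {size / 1e6:.1f}M")
```

The results land within rounding of the table. They also show that the three textures account for well over 90% of the run time footprint, with geometry as a small remainder.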
Given an uncompressed microtriangle vertex being:
    float[3] Coordinates;
    float[3][3] Normals;
... and an uncompressed triangle being:
    int[3] VertexIndices;
    int RGBA;
    int ORM;
... and a compressed vertex being:
    float[3] Coordinates;
    byte[2][4] PackedNormals;
... and a compressed triangle being:
    int[3] VertexIndices;
    [BC3 diffuse texture]
    [BC1 ORM texture]
For the 1M microtriangle object:
Item | Count | Uncompressed | Compression | Run time size |
Vertices | 1M | 48M | Packed normals | 20M |
Indices/Clusters | 3M | 12M | Array of ints | 12M |
Triangles | 1M | 4M | None | 4M |
Diffuse | 1k | 4M | BC3 - 4:1 | 1.33M |
ORM | 1k | 4M | BC1 - 8:1 | 666k |
Total | | | | 38M |
For the 4M microtriangle object:
Item | Count | Uncompressed | Compression | Run time size |
Vertices | 4M | 192M | Packed normals | 80M |
Indices/Clusters | 12M | 48M | Array of ints | 48M |
Triangles | 4M | 16M | None | 16M |
Diffuse | 2k | 16M | BC3 - 4:1 | 5.33M |
ORM | 2k | 16M | BC1 - 8:1 | 2.67M |
Total | | | | 152M |
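The two microtriangle budgets follow the same pattern, so they can be written as one function of the triangle count and the side length of the per-triangle color textures. This is a sketch under the tables' own assumptions: one vertex per triangle, 20-byte compressed vertices, 32-bit indices, one 32-bit word per triangle, and a 4/3 mip factor. The function name is made up for the example.

```python
MIP_CHAIN_FACTOR = 4.0 / 3.0  # full mip chain adds about a third


def microtriangle_budget(triangle_count, texture_side):
    """Approximate run time size in bytes of a microtriangle object."""
    vertices = triangle_count * 20        # 3 floats + 8 bytes of packed normals
    indices = triangle_count * 3 * 4      # three 32-bit indices per triangle
    triangles = triangle_count * 4        # one 32-bit word per triangle
    texels = texture_side * texture_side
    diffuse = texels * 4 / 4 * MIP_CHAIN_FACTOR   # RGBA8 at BC3, 4:1
    orm = texels * 4 / 8 * MIP_CHAIN_FACTOR       # RGBA8 at BC1, 8:1
    return vertices + indices + triangles + diffuse + orm


small = microtriangle_budget(1_000_000, 1024)
large = microtriangle_budget(4_000_000, 2048)

if __name__ == "__main__":
    print(f"1M triangles: {small / 1e6:.1f}M")
    print(f"4M triangles: {large / 1e6:.1f}M")
    print(f"bytes per triangle: {small / 1_000_000:.1f}")
```

Under these assumptions the cost is about 38 bytes per microtriangle, almost all of it geometry. The 1M object comes out at about 38M, roughly the footprint of the 50k triangle textured object, and the 4M object at roughly 150M.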
This brings up several open questions:
Some blue sky thinking: