Rendering engines often use cubemaps to store some form of lighting. There are a myriad of techniques that use them ranging from static to dynamic, direct and indirect, diffuse and specular lighting and so on. In this post we’ll try to illustrate some of the common steps in capturing, filtering and using cubemaps to achieve some form of lighting.
Let’s start with the cubemap capturing process, using the simple environment (full of awesome programmer art) below. I’ve added a few random colorful objects just to make things a bit more interesting!
Next, let’s define our cubemap’s boundary as well as the capture point. It’s quite common to have the capture point at the center of the cubemap, but it’s not always necessary to do so. In some cases the center of the cubemap might lie inside some scene geometry, so moving it a bit further away would help. You can use Shift + mouse drag to move the capture point.
With the cubemap in place we can start thinking about how to populate it. We need to decide what values to store on each texel and how we’ll calculate them. The choice of content will have implications for the cubemap’s texture format, bit depth and compression method, but for now we’ll focus on illustrating the capturing process.
The simplest capture we can make is to simply record the color value of the scene at each texel location. We can visualise this by firing a ray from the capture point to the center of each texel. Have a go at doing this by moving the mouse cursor through the diagram below.
In a real rendering engine we probably wouldn’t be firing rays from the capture point, but would rather set up 6 cameras (for a 3D cubemap), each pointing at the center of one cube face, with a 90° field of view and a square aspect ratio. We’d then render the scene from each camera and store the results (e.g. render to texture) on the corresponding cube face.
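To make the relationship between texels, faces and directions a bit more concrete, here is a minimal Python sketch of the direction that passes through the center of a given texel. The `FACES` layout (which way is "right" and "up" on each face) and the `texel_direction` helper are assumptions made up for this illustration; every graphics API defines its own face order and orientation.

```python
import math

# Hypothetical face layout: (forward, right, up) axes per cube face.
# Real graphics APIs each define their own order and orientation.
FACES = {
    "+x": ((1, 0, 0), (0, 0, -1), (0, 1, 0)),
    "-x": ((-1, 0, 0), (0, 0, 1), (0, 1, 0)),
    "+y": ((0, 1, 0), (1, 0, 0), (0, 0, -1)),
    "-y": ((0, -1, 0), (1, 0, 0), (0, 0, 1)),
    "+z": ((0, 0, 1), (1, 0, 0), (0, 1, 0)),
    "-z": ((0, 0, -1), (-1, 0, 0), (0, 1, 0)),
}


def texel_direction(face, col, row, size):
    """Unit direction from the capture point through the center of a texel."""
    forward, right, up = FACES[face]
    # With a 90 degree field of view, a face plane at distance 1 spans
    # exactly [-1, 1] in both directions, so map the texel center there.
    u = 2.0 * (col + 0.5) / size - 1.0
    v = 2.0 * (row + 0.5) / size - 1.0
    d = [forward[i] + u * right[i] + v * up[i] for i in range(3)]
    length = math.sqrt(sum(c * c for c in d))
    return tuple(c / length for c in d)
```

Calling `texel_direction("+x", 0, 0, 1)` on a one-texel face returns the face's forward axis, which is also the direction the corresponding capture camera would look along.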
But what is it we’re capturing? In the case of the diagram above when the ray hits an object we just copy that object’s color into the cubemap texel that corresponds to that direction. In that case we have assumed that the entire object has a constant color, i.e. it’s the same across the entire surface and doesn’t vary with the incident vector. This would be the case if all the objects in the scene were purely emissive.
A more realistic model would be one where the color we capture is based on the lighting that affects the object as well as the material properties and shading model (BRDF) of the surface at the point where the ray hits.
Take a look at the example below. The scene is illuminated by a single point light and we’re interested in a single point of the ceiling, P. The surface properties at P indicate that any light arriving there is not reflected equally in every direction (as would be the case for a purely Lambertian surface). Instead there is a predominant reflection vector where the reflection intensity is highest, and it falls off on vectors further away from it. If point P was captured by the two probes shown below, they would each get a different color value since the amount of light reflected along each of the probe rays is different.
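Here is a small Python sketch of that situation, assuming a simple Phong-style lobe around the mirror reflection vector. The positions, the `shininess` exponent and the helper names are all made up for illustration, and the usual cosine and light falloff terms are left out to keep the focus on the view dependence.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def reflect(incoming, normal):
    """Mirror a direction travelling towards the surface about the normal."""
    k = 2.0 * dot(incoming, normal)
    return tuple(i - k * n for i, n in zip(incoming, normal))


def phong_lobe(light_pos, point, normal, probe_pos, shininess):
    """Relative amount of light leaving `point` towards `probe_pos`."""
    to_point = normalize(tuple(p - l for p, l in zip(point, light_pos)))
    mirror = reflect(to_point, normal)
    to_probe = normalize(tuple(q - p for q, p in zip(probe_pos, point)))
    return max(dot(mirror, to_probe), 0.0) ** shininess


# Point P on a ceiling facing downwards, lit from the lower left.
P = (0.0, 2.0, 0.0)
N = (0.0, -1.0, 0.0)
LIGHT = (-1.0, 0.0, 0.0)

probe_a = phong_lobe(LIGHT, P, N, (1.0, 0.0, 0.0), shininess=8)
probe_b = phong_lobe(LIGHT, P, N, (-1.0, 0.0, 0.0), shininess=8)
```

With these made-up positions, probe A sits exactly on the mirror reflection vector and receives the peak of the lobe, while probe B sits well off it and receives far less.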
One implication of this is that the lighting stored in the cubemap is only correct at the capture point. The further away we are from the sampling point, the more incorrect the stored lighting becomes.
So what does that mean when capturing a cubemap using a render-to-texture camera? A first attempt would probably be to use the normal rendering pipeline to render the scene from the probe’s point of view. This means that the usual lighting shaders would be used, which, presumably, would do both diffuse and specular illumination and therefore produce results similar to the ‘raytraced’ examples above. That’s not necessarily a problem but rather something to be aware of.
What’s more important to be aware of is any image post processing the normal rendering pipeline performs. A typical linear lighting pipeline would include some tonemapping and gamma correction that transforms a linear HDR buffer to an sRGB LDR one. That’s fine when it comes to showing images on screen but probably not when doing a cubemap capture. We’d generally want to keep the lighting values in linear color space and in high precision and only do any transforms necessary to take them to whichever storage format we wish to use.
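A quick Python sketch of why this matters. The Reinhard operator is just one example of a tonemapper picked for illustration, and the transfer functions are the standard sRGB ones; the point is that any filtering we do later (and we will do a lot of it) gives different answers on encoded values than on linear ones.

```python
def reinhard(c):
    """Reinhard tonemapping: compresses [0, inf) into [0, 1)."""
    return c / (1.0 + c)


def linear_to_srgb(c):
    """sRGB encoding of a linear value in [0, 1]."""
    if c <= 0.0031308:
        return 12.92 * c
    return 1.055 * c ** (1.0 / 2.4) - 0.055


def srgb_to_linear(s):
    """Inverse of linear_to_srgb."""
    if s <= 0.04045:
        return s / 12.92
    return ((s + 0.055) / 1.055) ** 2.4


# Averaging two texels (as any filtering step would) in linear space...
linear_average = (0.0 + 1.0) / 2.0
# ...versus averaging the sRGB-encoded values and decoding the result.
encoded_average = (linear_to_srgb(0.0) + linear_to_srgb(1.0)) / 2.0
decoded_average = srgb_to_linear(encoded_average)
```

Here `linear_average` is 0.5 while `decoded_average` comes out at roughly 0.214, so filtering the encoded values darkens the result considerably. The tonemapper has a similar problem: `reinhard` squeezes every HDR value below 1, which throws away exactly the bright range that matters most for reflections.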
At this point it should also be obvious that the process described above captures indirect lighting. Lights aren’t rendered directly into the cubemap, but rather the light that bounces off the surfaces of the scene is. We’re capturing both diffusely and specularly bounced lighting. That has nothing to do with how we’re going to use the cubemap (e.g. to provide indirect diffuse or specular illumination), it’s just to clarify that both types of bounced lighting are captured. The contribution from emissive surfaces would also be handled in the same way.
So now we have a cubemap that has been fully populated to contain the incoming radiance along the direction vector corresponding to each texel. In its current form it cannot really be used for any form of indirect illumination, at least not in a PBR pipeline. We need to do some preprocessing first. More specifically we need to apply a bit of intelligent blurring!
Earlier we saw how the BRDF of a shading point tells us how much of the lighting coming from a particular direction is reflected in any other direction. In a sense it tells us how light going into the surface is scattered outwards. We now want to do the reverse: given a particular outgoing direction, we’d like to know how much light we can gather from each direction over the hemisphere.
In the scene below the shading point is illuminated by 3 discrete lights. The BRDF will give us the contribution of each light along the orange vector pointing towards the camera. Summing these up (i.e. integrating over the hemisphere) will give us the total amount of light in the direction of the camera vector.
We’ll use the same principle to gather the contributions from each of the radiance cubemap texels to calculate the total illumination (irradiance) along a given viewing direction. The diagram below illustrates this process. The mouse cursor controls the view direction and the colored lines are the contribution of each radiance texel based on the BRDF. The sum of these contributions represents the irradiance on that surface and we store it on a texel of a separate, irradiance, cubemap.
In a sense this is the same as having a chrome sphere positioned at the capture point and having the camera looking straight down on it along the direction of the mouse vector. The roughness of the sphere will have a significant effect on the reflections: the rougher the surface, the blurrier the reflections.
You can use the controls below to change the BRDF properties. Click the Populate Irradiance button to fully populate the irradiance cubemap, which will show a disc that simulates sampling the entire map. Have a play with the roughness slider using the Phong BRDF to see how the reflections change. Also note how when using the Lambert BRDF the roughness doesn’t make a difference.
It’s worth noting that the above gather operation was done by pretending there was a flat white surface element oriented along the mouse vector, with a camera looking straight down on it, i.e. the normal, N, and the view vector, V, are aligned. The importance of this will be explained further down.
We’re almost there now! We have an irradiance cubemap that was generated using a particular BRDF. If we have an object that is located at the capture point, has the exact same BRDF as the one we used to generate the cubemap and is facing along the camera view vector, then we’re all set! However, chances are at least one of those things won’t be true. The object, or rather the shading point, probably isn’t at the capture point, its BRDF isn’t exactly the same as the one we did the integration with and its orientation probably isn’t along the camera vector. We need to either account for or accept all these differences.
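The gather step can be sketched in a few lines of Python. This is a 2D toy version, loosely mirroring the diagrams, where the radiance "cubemap" is a list of `(angle, value)` pairs around a circle. The `gather` function and the use of a plain cosine-power lobe are assumptions for illustration: an exponent of 1 behaves like the Lambert cosine lobe, while larger exponents behave like a sharper Phong lobe (how roughness maps to that exponent is left out here). Because N and V are aligned, the reflection vector is simply the view direction itself.

```python
import math


def gather(radiance, view_angle, exponent):
    """Lobe-weighted average of radiance texels around view_angle (radians).

    radiance: list of (angle, value) pairs, one per radiance texel.
    exponent: 1 gives a cosine lobe, larger values give a tighter lobe.
    """
    total = 0.0
    weight_sum = 0.0
    for angle, value in radiance:
        # Cosine of the angle between the texel direction and N (= V).
        cos_theta = math.cos(angle - view_angle)
        weight = max(cos_theta, 0.0) ** exponent
        total += weight * value
        weight_sum += weight
    # Normalize so that a uniformly lit environment keeps its brightness.
    return total / weight_sum if weight_sum > 0.0 else 0.0


# 64 radiance texels, all black except a single bright one at angle 0.
texels = [(2.0 * math.pi * i / 64, 10.0 if i == 0 else 0.0) for i in range(64)]

blurry = gather(texels, 0.0, exponent=1)
sharp = gather(texels, 0.0, exponent=64)
```

Looking straight at the bright texel, the tight lobe (`sharp`) keeps much more of its intensity than the wide one (`blurry`), which spreads it over the whole hemisphere. Running `gather` once per output direction fills in the irradiance map.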
We already came across the issue of sampling a cubemap away from the capture point when discussing the capturing process. Sampling away from the capture point suffers from incorrect specular radiance being used, incorrect occlusion and incorrect parallax. Of those, incorrect occlusion is probably the most noticeable (reflections being picked up when they should have been occluded). There is no straightforward solution involving the cubemap alone. Often additional structures are used to capture the occlusion information, and/or alternate cubemaps are used.
Incorrect parallax on the other hand is fairly straightforward to fix in some of the simpler cases and involves simply some bounding box information. We’ll cover this in a future post.
The incorrect specular radiance is something we can probably just live with.
The other issues occur when the BRDF of the surface we need to shade isn’t the same as the BRDF we used for the radiance integration and when the surface orientation is not aligned to the view vector. There are a couple of very good SIGGRAPH papers that describe a split-sum approximation of the lighting integral for image based lighting (Karis13, Lazarov13, Lagarde14). As part of that approximation we integrate the radiance for a particular roughness, and a lookup table provides runtime scale and bias factors based on roughness and viewing angle. In terms of precomputation all we need to do is perform the cubemap integration for a range of roughness values.
One approach would be to generate a unique cubemap per discrete roughness value that we need to support (roughness is typically expressed as a scalar in the [0,1] range). However it’s worth noting that with increasing roughness the cubemap becomes more and more blurry and we therefore need fewer texels to capture it accurately. One convenient approach then is to store the irradiance of each roughness in one mip of a mipmapped cubemap. Mip 0 (the highest resolution) stores the lowest roughness (i.e. the least blurry) and subsequent ones store increasingly higher roughness values. This also has the advantage that we can leverage trilinear filtering to interpolate between mip maps, giving us a continuous range of roughness values.
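At runtime the two halves of the approximation are combined with a multiply and an add. The Python sketch below follows the form used in the Karis notes, where the scale is applied to the specular color (F0) and the bias is added on top. The function name and the sample values are made up, and in practice both `scale` and `bias` would be fetched from the lookup table using roughness and N·V.

```python
def split_sum_specular(prefiltered_radiance, f0, scale, bias):
    """Combine the pre-integrated radiance with the lookup table terms.

    prefiltered_radiance: RGB fetched from the filtered cubemap.
    f0: RGB specular color of the surface at normal incidence.
    scale, bias: the two values read from the lookup table.
    """
    return tuple(
        radiance * (f * scale + bias)
        for radiance, f in zip(prefiltered_radiance, f0)
    )


# Made-up values for a dielectric surface under white lighting.
color = split_sum_specular((1.0, 1.0, 1.0), (0.04, 0.04, 0.04), 0.5, 0.1)
```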
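As a rough Python sketch, picking a mip from a roughness value and blending between the two nearest mips might look like this. The linear roughness-to-mip mapping is an assumption (engines often use a different curve), and `blend_mips` only mimics, on a single value per mip, the between-mips part of what trilinear filtering does in hardware.

```python
def roughness_to_mip(roughness, mip_count):
    """Fractional mip level for a roughness in [0, 1] (linear mapping)."""
    clamped = min(max(roughness, 0.0), 1.0)
    return clamped * (mip_count - 1)


def blend_mips(mip_values, roughness):
    """Interpolate between the two mips that bracket the roughness.

    mip_values: one pre-filtered value per mip, mip 0 being the sharpest.
    """
    level = roughness_to_mip(roughness, len(mip_values))
    lower = int(level)
    upper = min(lower + 1, len(mip_values) - 1)
    t = level - lower
    return mip_values[lower] * (1.0 - t) + mip_values[upper] * t
```

For example, with three mips storing `[1.0, 0.5, 0.0]`, a roughness of 0.25 lands halfway between mip 0 and mip 1 and returns 0.75.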
While the emphasis has been on indirect specular illumination using the cubemap, all the principles described here are equally applicable to diffuse illumination. Computing the irradiance cubemap using a Lambert BRDF (basically a cosine lobe) would result in a cubemap suitable for that. As described in this great paper by Ramamoorthi however, a Spherical Harmonic is a much more compact and efficient way of achieving the same result. One potential exception to this is using the last mip (lowest resolution) of the specular cubemap mip chain to store the diffuse irradiance. Then a single cubemap can be used for both diffuse and specular indirect illumination.
There is still a lot to cover of course but hopefully these diagrams have given you a better understanding of what’s involved in capturing and using a cubemap for indirect environment lighting. Feel free to post questions and topic requests in the comments below!
A Signed Distance Field is a mathematical construct where the distance to a closed surface is computed at a set of positions, with the sign of the distance used to indicate whether the position is inside or outside the surface. The positions are typically chosen to be on a regular grid, and SDFs work well in both 2D and 3D. They were made popular in computer graphics by this SIGGRAPH 2007 paper by Valve. If you haven’t already read it, it’s definitely worth a read!
In this post we’ll investigate using a simple 2D SDF to approximate a shape. It’s by no means the only use of SDF, but it’s one that’s easy to visualize and has practical use in computer graphics.
We’ll start with the surface shown below. It’s a closed surface so that we can tell whether a point is inside or outside its perimeter.
Next, we’ll overlay a regular grid over the surface. As you might expect, the size of the grid is important: the finer it is, the better the approximation will be.
For each grid cell, we can calculate the distance of the cell center to the nearest point of the surface. We’ll represent that with a circle. We’ll also indicate whether the cell center is inside the shape or not by coloring in the circle. Note that we only store one value per cell, calculated at its center, so while a cell may overlap the surface boundary, we only care about its center.
Go ahead and move the shape around (shift + mouse drag) to see how the cell distance values change.
Note that for reasons of clarity I’m only showing the circles that have a diameter of about one cell. The other cells are computed as well and their values will be used later on for the approximation.
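Here is a minimal Python sketch of the grid construction described above. It uses an axis-aligned box as a stand-in for the shape because its signed distance has a well-known closed form; the `box_sdf` and `build_grid` names and the grid dimensions are made up for the example. Distances are negative inside the shape and positive outside.

```python
import math


def box_sdf(px, py, cx, cy, half_w, half_h):
    """Signed distance from (px, py) to an axis-aligned box, negative inside."""
    dx = abs(px - cx) - half_w
    dy = abs(py - cy) - half_h
    # Distance to the boundary for points outside the box...
    outside = math.hypot(max(dx, 0.0), max(dy, 0.0))
    # ...and (negative) distance to the nearest edge for points inside.
    inside = min(max(dx, dy), 0.0)
    return outside + inside


def build_grid(cols, rows, cell_size, sdf):
    """Evaluate the distance function once per cell, at the cell center."""
    return [
        [sdf((col + 0.5) * cell_size, (row + 0.5) * cell_size)
         for col in range(cols)]
        for row in range(rows)
    ]


# An 8x8 grid around a 4x3 box centered at (4, 4).
grid = build_grid(8, 8, 1.0, lambda x, y: box_sdf(x, y, 4.0, 4.0, 2.0, 1.5))
```

Moving the shape just means changing the box center and rebuilding the grid, which is effectively what dragging the shape in the diagram does.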
Ok, so now we have a regular grid where we store the distance to the shape boundary as well as a bit that tells us whether we’re inside or outside the shape. How can we use that information to reconstruct the original shape?
Let’s have a look at the diagram without the original shape.
Imagine we throw a dart onto the grid. If the dart lands exactly at the center of a cell we’ll accurately know our distance to the shape outline and whether we’re inside it or not. That information will only be accurate if it lands exactly at the center because that’s where we evaluated the distance and sign. If the dart lands anywhere else (and chances are that it will) then we’ll need to make an approximation by interpolating between the 4 nearest cells.
Let’s see what that looks like! Move the mouse cursor over the image.
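In code, the interpolation between the 4 nearest cell centers is a plain bilinear blend. The Python sketch below assumes the grid is a list of rows of distances stored at cell centers and that it has at least 2x2 cells; the `sample_sdf` name and the clamping at the grid border are choices made for this example.

```python
def sample_sdf(grid, cell_size, x, y):
    """Bilinearly interpolated distance at an arbitrary point (x, y)."""
    rows, cols = len(grid), len(grid[0])
    # Convert to cell-center coordinates: the center of cell (0, 0) is 0.0.
    gx = min(max(x / cell_size - 0.5, 0.0), cols - 1.0)
    gy = min(max(y / cell_size - 0.5, 0.0), rows - 1.0)
    # Top-left cell of the 2x2 neighbourhood, kept inside the grid.
    col = min(int(gx), cols - 2)
    row = min(int(gy), rows - 2)
    tx = gx - col
    ty = gy - row
    top = grid[row][col] * (1.0 - tx) + grid[row][col + 1] * tx
    bottom = grid[row + 1][col] * (1.0 - tx) + grid[row + 1][col + 1] * tx
    return top * (1.0 - ty) + bottom * ty
```

If the dart lands exactly on a cell center the function returns the stored value; anywhere else it returns a blend of the four surrounding ones.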
We’re nearly there now. We have a way of estimating the distance to the surface at any point inside the grid. All we need to do now is visualize all these estimates somehow.
We could for example paint anywhere where the distance is ≤ 0 to shade the interior of the shape. In the diagram below we’re drawing an approximation of the outline of the surface by painting anywhere where the absolute distance is within a threshold. The threshold value controls the ‘thickness’ of the approximate outline.
Have a play with the controls to see what effect it has on the approximation. Use shift + mouse drag to move the shape within the grid.
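The translation from distance to something visible can be as small as the Python sketch below. The `shade` and `render` helpers and the characters they emit are made up for the example; `render` expects a list of rows of (interpolated) distances, negative inside the shape.

```python
def shade(distance, threshold, fill_interior=False):
    """Pick a character for one sample based on its signed distance."""
    if abs(distance) <= threshold:
        return "#"  # close enough to the surface: part of the outline
    if fill_interior and distance <= 0.0:
        return "+"  # inside the shape
    return "."  # everything else


def render(distances, threshold, fill_interior=False):
    """Turn rows of signed distances into rows of characters."""
    return ["".join(shade(d, threshold, fill_interior) for d in row)
            for row in distances]
```

Increasing `threshold` thickens the outline, and the same distances can be run through a completely different `shade` function without touching the SDF itself.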
SDF vs Textures
So what’s the point of all this?
Let’s compare a low resolution SDF of a surface to a same resolution texture that stores the surface color.
You can see that despite being the same, low, resolution, the SDF representation can provide a very high resolution outline of the shape compared to the very pixelated result of the texture approach. Before jumping to conclusions, we need to understand that we’re comparing apples to oranges! The SDF is a high level representation of the shape surface. The image is a capture of the surface color (and alpha). We may use both representations to achieve visually comparable results, but they’re fundamentally different things. Both representations need to be evaluated per pixel.
In the case of the texture this evaluation gives us the interpolated color values, producing the gradients at the edges of the pixels. The image can store different/arbitrary colors per texel that don’t have to conform to any particular function, but the look of the texture is (generally) fixed (e.g. we cannot easily add an outline). Color and opacity can be stored and handled independently.
In the case of the SDF the interpolation is giving us a smooth approximation of the distance to the surface of the shape. We’re free to translate that into a pixel color value using a function selected at runtime. The same SDF can be used with multiple functions to produce different results. However, unlike in the texture case, the entire SDF has to go through the same translation (i.e. it’s not trivial to produce a different color for an arbitrary part of the surface). This translation is a high resolution operation, so we don’t get any pixelation artifacts. Instead, the effects of the low resolution are seen on the accuracy of the approximation. You can see that the sharp corners of the original shape have disappeared and been replaced with rounded points instead.
Looking at the SDF we’ve created so far it should feel quite intuitive that we would store it in a low resolution texture. Ignoring format considerations for now, we’d store the distance and the sign at each texel. At runtime we can evaluate the entire SDF by sampling that texture in a pixel shader and use that to shade a quad. This is more or less what is described in the Valve paper mentioned at the beginning. The quad will be rendered at a high resolution (e.g. screen resolution) and the interpolation (done for ‘free’ by the GPU) will give us the lovely crisp outlines we saw earlier.
However this is not always what we want (or are able) to do. While 2D SDF are easy to visualize, there is nothing inherently 2D about SDF in general. A 3D SDF can be used to represent a volumetric surface. How do we visualize that? We could render it as a point cloud, evaluating the SDF at each point, but that would get tricky and expensive. What about cases where we don’t care to visualize the SDF? What if it represents a surface we want to do collision detection against?
As it turns out Ray Marching is a good answer to both of these questions and when it comes to ray marching SDF there are a couple of cool tricks we can use. I’ll save the diagrams for this for part 2 of this post, but in the meantime have a read through this blog post and maybe have a look at Claybook to see how cool things can be made using SDF!
Let’s kick things off with an interactive diagram explaining floating point numbers! There are many ways to explain this, but I’ve always found that diagrams help. The Wikipedia page on single-precision floats has all the details on how they work, including links to external tools allowing you to convert decimal to binary and vice versa.
The diagram below is a visualization of the mantissa and exponent parts of a floating point number. The exponent part gives you the large steps, that keep getting bigger. The mantissa part gives a fixed number of equal subdivisions between two exponent steps.
Use the sliders below to control the number of bits used for the exponent and mantissa. The more bits you have for exponent, the higher values you can represent. The more bits you have for mantissa, the more steps you get between exponent values, giving you greater precision.
Each exponent is a power of 2, so every step is 2x bigger than the previous. As the number of subdivisions between two successive exponent values is fixed (mantissa bits), as the exponents get larger, so do the subdivisions between them, which means the precision is reduced.
The exponent bias is subtracted from the exponent, which allows for values less than 2.
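If you prefer code to sliders, here is a small Python sketch of the same idea. It assumes the IEEE-style bias of 2^(exponent bits - 1) - 1 and only handles normalized values: the sign bit, subnormals, infinities and NaNs are all ignored, and the function names are made up for the example.

```python
def decode(exp_bits, man_bits, exponent, mantissa):
    """Value of a normalized float from its stored exponent and mantissa."""
    bias = 2 ** (exp_bits - 1) - 1
    # The exponent picks the power-of-2 step, the mantissa picks one of the
    # equal subdivisions between that step and the next one.
    return 2.0 ** (exponent - bias) * (1.0 + mantissa / 2 ** man_bits)


def step_size(exp_bits, man_bits, exponent):
    """Gap between neighbouring values that share a stored exponent."""
    bias = 2 ** (exp_bits - 1) - 1
    return 2.0 ** (exponent - bias) / 2 ** man_bits
```

With 8 exponent bits and 23 mantissa bits (single precision), `decode(8, 23, 127, 0)` is 1.0 and `decode(8, 23, 126, 0)` is 0.5, and each increment of the stored exponent doubles `step_size`, which is the loss of precision at larger values described above.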