
Applying Sampling Theory To Real-Time Graphics

Original article: https://mynameismjp.wordpress.com/2012/10/21/applying-sampling-theory-to-real-time-graphics/


Computer graphics is a field that constantly deals with discrete sampling and reconstruction of signals, although you might not be aware of it yet. This article focuses on the ways in which sampling theory can be applied to some of the common tasks routinely performed in graphics and 3D rendering.


Image Scaling


The concepts of sampling theory are most easily applicable to graphics in the form of image scaling. An image, or bitmap, is typically the result of sampling a color signal at discrete XY sample points (pixels) that are evenly distributed on a 2D grid. To rescale it to a different number of pixels, we need to calculate a new color value at sample points that are different from the original pixel locations. In the previous article we mentioned that this process is known as resampling, and is also referred to as interpolation. Any graphics programmer should be familiar with the point (also known as nearest-neighbor) and linear (also known as bilinear) interpolation modes supported natively on GPUs, which are used when sampling textures. In case you're not familiar, point filtering simply picks the closest texel to the sample point and uses that value. Bilinear filtering, on the other hand, picks the 4 closest texels and linearly interpolates those values in the X and Y directions based on the location of the sample point relative to the texels. It turns out that these modes are both just implementations of a reconstruction filter, with point interpolation using a box function and linear interpolation using a triangle function. If you look back at the diagrams showing reconstruction with a box function and a triangle function, you can actually see how the reconstructed signal resembles the visual result that you get when performing point and linear sampling. With the box function you end up getting a reconstructed value that's "snapped" to the nearest original sample point, while with a triangle function you end up with straight lines connecting the sample points. If you've used point and linear filtering, you probably also intuitively understand that point filtering inherently results in more aliasing than linear filtering when resizing an image.
For reference, here’s an image showing the same rotated checkerboard pattern being resampled with a box filter and a triangle filter:


An image of a rotated checkerboard pattern being enlarged with a box filter (point filtering) and a triangle filter (bilinear filtering)
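Written out by hand, the two filtering modes look something like the following sketch. The half-texel offset and clamp-to-edge addressing are conventions I'm assuming for the example, not something the article prescribes:

```python
# Point (nearest-neighbor) and bilinear sampling of a tiny 2D image,
# mimicking what GPU texture units do for the two filtering modes.

def sample_point(img, u, v):
    """Nearest-neighbor: snap the sample location to the closest texel."""
    h, w = len(img), len(img[0])
    x = min(int(u * w), w - 1)
    y = min(int(v * h), h - 1)
    return img[y][x]

def sample_bilinear(img, u, v):
    """Bilinear: blend the 2x2 texel neighborhood with a triangle filter."""
    h, w = len(img), len(img[0])
    # Texel centers sit at (i + 0.5) / w, so shift by half a texel first.
    x = u * w - 0.5
    y = v * h - 0.5
    x0, y0 = int(x), int(y)
    fx, fy = x - x0, y - y0
    def tex(i, j):  # clamp-to-edge addressing
        return img[max(0, min(h - 1, j))][max(0, min(w - 1, i))]
    top = tex(x0, y0) * (1 - fx) + tex(x0 + 1, y0) * fx
    bot = tex(x0, y0 + 1) * (1 - fx) + tex(x0 + 1, y0 + 1) * fx
    return top * (1 - fy) + bot * fy

img = [[0.0, 1.0],
       [1.0, 0.0]]
print(sample_point(img, 0.26, 0.26))   # -> 0.0 (snaps to the nearest texel)
print(sample_bilinear(img, 0.5, 0.5))  # -> 0.5 (averages the 2x2 neighborhood)
```

Note how the bilinear result at the center is the average of all four texels, while point sampling just returns whichever texel wins the snap.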

Knowing what we do about aliasing and reconstruction filters, we can now put some mathematical foundation behind what we intuitively knew all along. The box function's frequency domain equivalent (the sinc function) is smoother and wider than the triangle function's frequency domain equivalent (the sinc² function), which results in significantly more postaliasing. Of course, we should note that even though the triangle function might be considered among the "low end" of reconstruction filters in terms of quality, it is still attractive due to its low performance impact. Not only is the triangle function very cheap to evaluate at a given point in terms of ALU instructions, but more importantly the function evaluates to 0 at all distances greater than or equal to 1. This is important for performance, because it means that any pixels that are further than a distance of 1.0 from the resampled pixel location do not have to be considered. Ultimately this means that we only need to fetch a maximum of 4 pixels (in a 2×2 area) for linear filtering, which limits bandwidth usage and cache misses. For point filtering the situation is even better: since the box function hits zero at 0.5 (it has a width of 1.0), we only need to fetch one pixel.


Outside of realtime 3D rendering, it is common to use cubic filters (also known as bicubic filters) as a higher-quality alternative to point and linear filters when scaling images. A cubic filter is not a single filtering function, but rather a family of filters that interpolate using a 3rd-order (cubic) polynomial. The use of such functions in image processing dates back to Hsieh Hou's paper entitled "Cubic Splines for Image Interpolation and Digital Filtering"[1], which proposed using cubic B-splines as the basis for interpolation. Cubic splines are attractive for filtering because they can be used to create functions whose 1st derivative is continuous across the entire domain, which is known as being C1 continuous. Being C1 continuous also implies that the function is C0 continuous, which means that the 0th derivative is also continuous. In other words, the function itself would have no visible discontinuities if you were to plot it. Remember that there is an inverse relationship between rate of change in the spatial domain and the frequency domain, therefore a smooth function without discontinuities is desirable for reducing postaliasing. A second reason that cubic splines are attractive is that the functions can be made to be zero-valued after a certain point, much like a box or triangle function. This means the filter will have a limited width, which is optimal from a performance point of view. Typically cubic filters use functions defined along the [-2, 2] range, which is double the width of a unit triangle function. Finally, a third reason for the attractiveness of cubic filters is that they can be made to produce acceptable results when applied as a separable filter. Separable filters can be applied independently in two passes along the X and Y dimensions, which reduces the number of neighboring pixels that need to be considered when applying the filter and thus improves performance.
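The two-pass idea behind separable filtering can be sketched as follows; the 1D tent kernel and the clamp-at-edge behavior are illustrative assumptions:

```python
# Sketch of separable filtering: apply a 1D kernel along X, then along Y.
# For an N-wide kernel this costs O(2N) taps per pixel instead of O(N^2).

def filter_1d(row, kernel):
    """Convolve one row with a normalized kernel, clamping at the edges."""
    r = len(kernel) // 2
    out = []
    for i in range(len(row)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = max(0, min(len(row) - 1, i + k - r))
            acc += row[j] * w
        out.append(acc)
    return out

def filter_separable(img, kernel):
    # Horizontal pass over each row...
    tmp = [filter_1d(row, kernel) for row in img]
    # ...then a vertical pass over each column of the intermediate result.
    cols = [filter_1d(list(col), kernel) for col in zip(*tmp)]
    return [list(row) for row in zip(*cols)]

tent = [0.25, 0.5, 0.25]  # 1D triangle (tent) kernel, sums to 1
img = [[0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0]]
blurred = filter_separable(img, tent)
# The single bright pixel spreads into a 3x3 footprint whose weights sum to 1.
```

Because the two 1D passes are independent, the same `filter_1d` routine serves both directions, which is exactly why separable filters are cheap.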


In 1988, Don Mitchell and Arun Netravali published a paper entitled Reconstruction Filters in Computer Graphics[2], which narrowed down the set of possible cubic filtering functions into a generalized form dependent on two parameters called B and C. This family produces filtering functions that are always C1 continuous, and are normalized such that the area under the curve is equal to one. The general form they devised is found below:


Generalized form for cubic filtering functions
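In case the image above doesn't reproduce well, the generalized form can be written out directly. This is the Mitchell-Netravali cubic as published, where k is the filter weight as a function of the distance x from the sample point:

```latex
k(x) = \frac{1}{6}
\begin{cases}
(12 - 9B - 6C)\,|x|^3 + (-18 + 12B + 6C)\,|x|^2 + (6 - 2B) & |x| < 1 \\[4pt]
(-B - 6C)\,|x|^3 + (6B + 30C)\,|x|^2 + (-12B - 48C)\,|x| + (8B + 24C) & 1 \le |x| < 2 \\[4pt]
0 & \text{otherwise}
\end{cases}
```

The 1/6 normalization is what makes the area under the curve equal to one, as mentioned above.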

Below you can find graphs of some of the common curves in use by popular image processing software[3], as well as the result of using them to enlarge the rotated checkerboard pattern that we used earlier:


Common cubic filtering functions using Mitchell’s generalized form for cubic filtering. From top-left going clockwise: cubic(1, 0) AKA cubic B-spline, cubic(1/3, 1/3) AKA Mitchell filter, cubic(0, 0.75) AKA Photoshop bicubic filter, and cubic(0, 0.5) AKA Catmull-Rom spline
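For reference, a direct transcription of the generalized form into code might look like this (the function name and parameter order are mine):

```python
# The generalized cubic of Mitchell & Netravali, parameterized by B and C.
# cubic(1, 0) is the cubic B-spline, cubic(1/3, 1/3) the Mitchell filter,
# and cubic(0, 0.5) the Catmull-Rom spline, matching the caption above.

def mitchell(x, b, c):
    x = abs(x)
    if x < 1.0:
        return ((12 - 9*b - 6*c) * x**3
                + (-18 + 12*b + 6*c) * x**2
                + (6 - 2*b)) / 6.0
    if x < 2.0:
        return ((-b - 6*c) * x**3
                + (6*b + 30*c) * x**2
                + (-12*b - 48*c) * x
                + (8*b + 24*c)) / 6.0
    return 0.0  # zero outside [-2, 2], which bounds the filter width

# Catmull-Rom interpolates: it is exactly 1 at x = 0 and 0 at the other
# integer sample positions, so it passes through the original samples.
assert mitchell(0.0, 0.0, 0.5) == 1.0
assert mitchell(1.0, 0.0, 0.5) == 0.0
```

Plugging in B = C = 1/3 gives the Mitchell filter, whose peak value at x = 0 works out to 8/9 rather than 1, which is why it slightly blurs the original samples instead of interpolating them exactly.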


Cubic filters used to enlarge a rotated checkerboard pattern

One critical point touched upon in Mitchell’s paper is that the sinc function isn’t usually desirable for image scaling, since by nature the pixel structure of an image leads to discontinuities, which result in unbounded frequencies. Therefore ideal reconstruction isn’t possible, and ringing artifacts will occur due to the Gibbs phenomenon. Ringing was identified by Schreiber and Troxel[4] as one of four negative artifacts that can occur when using cubic filters, with the other three being aliasing, blurring, and anisotropy effects. Blurring is recognized as the loss of detail due to too much attenuation of higher frequencies, and is often caused by a filter kernel that is too wide. Anisotropic effects are artifacts that occur due to applying the function as a separable filter, where the resulting 2D filtering function doesn’t end up being radially symmetrical.


Mitchell suggested that the purely frequency domain-focused techniques of filter design were insufficient for designing a filter that produces subjectively pleasing results to the human eye, and instead emphasized balancing the 4 previously-mentioned artifacts against the amount of postaliasing in order to design a high-quality filter for image scaling. He also suggested studying human perceptual response to certain artifacts in order to subjectively determine how objectionable they may be. For instance, Earl Brown[5] discovered that ringing from a single negative lobe can actually increase perceived sharpness in an image, and thus can be a desirable effect in certain scenarios. He also pointed out that ringing from multiple negative lobes, such as what you get from a sinc function, will always degrade quality. Here’s an image of our friend Rocko enlarged with a sinc filter, as well as an image of a checkerboard pattern enlarged with the same filter:


Ringing from multiple lobes caused by enlargement with a windowed sinc filter

Ultimately, Mitchell segmented the domain of his B and C parameters into what he called “regions of dominant subjective behavior”. In other words, he determined which values of each parameter resulted in undesirable artifacts. In his paper he included the following chart showing which artifacts were associated with certain ranges of the B and C parameters:


A chart showing the dominant areas of negative artifacts for Mitchell’s generalized cubic function. From “Reconstruction Filters in Computer Graphics” [Mitchell 88]

Based on his analysis, Mitchell determined that (1/3, 1/3) produced the highest-quality results. For that reason, it is common to refer to the resulting function as a “Mitchell filter”. The following images show the results of using non-ideal parameters to enlarge Rocko, as well as the results from using Mitchell’s suggested parameters:


Undesirable artifacts caused by enlargement using cubic filtering. The top left image demonstrates anisotropy effects, the top right image demonstrates excessive blurring, and the bottom left demonstrates excessive ringing. The bottom right image uses a Mitchell filter, representing ideal results for a cubic filter. Note that these images have all been enlarged an extra 2x with point filtering after resizing with the cubic filter, so that the artifacts are easier to see.


Texture Mapping


Real-time 3D rendering via rasterization brings about its own particular issues related to aliasing, as well as specialized solutions for dealing with them. One such issue is aliasing resulting from resampling textures at runtime in order to map them to a triangle’s 2D projection in screen space, which I’ll refer to as texture aliasing. If we take the case of a 2D texture mapped to a quad that is perfectly perpendicular to the camera, texture sampling essentially boils down to a classic image scaling problem: we have a texture with some width and height, the quad is scaled to cover a grid of screen pixels with a different width and height, and the image must be resampled at the pixel locations where pixel shading occurs. We already mentioned in the previous section that 3D hardware is natively capable of applying “linear” filtering with a triangle function. Such filtering is sufficient for avoiding severe aliasing artifacts when upscaling or downscaling, although for downscaling this only holds true when downscaling by a factor <= 2.0. Linear filtering will also prevent aliasing when rotating an image, which is important in the context of 3D graphics since geometry will often be rotated arbitrarily relative to the camera. Like image scaling, rotation is really just a resampling problem and thus the same principles apply. The following image shows how the pixel shader sampling rate changes for a triangle as it’s scaled and rotated:


Pixel sampling rates for a triangle. Pixel shaders are executed at a grid of fixed locations in screen space (represented by the red dots in the image), thus the sampling rate for a texture depends on the position, orientation, and projection of a given triangle. The green triangle represents the larger blue triangle after being scaled and rotated, and thus has a lower sampling rate.


Mipmapping


When downscaling by a factor greater than 2, linear filtering leads to aliasing artifacts due to high-frequency components of the source image leaking into the downsampled version. This manifests as temporal artifacts, where the contents of the texture appear to flicker as a triangle moves relative to the camera. In image processing this problem is commonly dealt with by widening the filter kernel so that its width is equal to the size of the downscaled pixel. So for instance, if downscaling from 100×100 to 25×25, the filter kernel would be greater than or equal in width to a 4×4 square of pixels in the original image. Unfortunately, widening the filter kernel isn’t usually a suitable option for realtime rendering, since the number of memory accesses increases with O(N²) as the filter width increases. Because of this, a technique known as mipmapping is used instead. As any graphics programmer should already know, mipmaps consist of a series of prefiltered versions of a 2D texture that were downsampled with a kernel that’s sufficiently wide to prevent aliasing. Typically these downsampled versions are generated for dimensions that are powers of two, so that each successive mipmap is half the width and height of the previous mipmap. The following image from Wikipedia shows an example of a typical mipmap chain for a texture:


An example of a texture with mipmaps. Each mip level is roughly half the size of the level before it. Image taken from Wikipedia.
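A chain like the one pictured can be built with a simple recursive 2×2 box downsample; this is a minimal sketch, assuming a square power-of-two texture:

```python
# Recursive mip generation with a 2x2 box filter: each level averages 2x2
# blocks of the previous one, which is equivalent to filtering the
# full-resolution image with a progressively wider box.

def next_mip(img):
    h, w = len(img), len(img[0])
    return [[(img[2*y][2*x] + img[2*y][2*x+1] +
              img[2*y+1][2*x] + img[2*y+1][2*x+1]) * 0.25
             for x in range(w // 2)]
            for y in range(h // 2)]

def build_mip_chain(img):
    chain = [img]
    while len(chain[-1]) > 1:  # assumes a square power-of-two texture
        chain.append(next_mip(chain[-1]))
    return chain

checker = [[float((x + y) % 2) for x in range(4)] for y in range(4)]
chain = build_mip_chain(checker)
# Once the effective filter spans a full period of the checkerboard, every
# level collapses to a flat 0.5 gray, which is the alias-free average.
```

This is exactly the "prefiltered" content the hardware falls back on instead of widening the runtime filter kernel.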

A box function is commonly used for generating mipmaps, although it’s possible to use any suitable reconstruction filter when downscaling the source image. The generation is also commonly implemented recursively, so that each mip level is generated from the mip level preceding it. This makes the process computationally cheap, since a simple linear filter can be used at each stage in order to achieve the same results as a wide box filter applied to the highest-resolution image. At runtime the pixel shader selects the appropriate mip level by calculating the gradients of the texture coordinate used for sampling, which it does by comparing the texture coordinate used for one pixel to those used in the neighboring pixels of a 2×2 quad. These gradients, which are equal to the partial derivatives of the texture coordinates with respect to X and Y in screen space, are important because they tell us the relationship between a given 2D image and the rate at which we’ll sample that image in screen space. Smaller gradients mean that the sample points are close together, and thus we’re using a high sampling rate. Larger gradients result from the sample points being further apart, which we can interpret to mean that we’re using a low sampling rate. By examining these gradients we can calculate the highest-resolution mip level that would provide us with an image size less than or equal to our sampling rate. The following image shows a simple example of mip selection:


Using texture coordinate gradients to select a mip level for a 4×4 texture.

In the image, the two red rectangles represent texture-mapped quads of different sizes rasterized to a 2D grid of pixels. For the topmost quad, a value of 0.25 will be computed as the partial derivative of the U texture coordinate with respect to the X dimension, and the same value will be computed as the partial derivative of the V texture coordinate with respect to the Y dimension. The larger of the two gradients is then used to select the appropriate mip level based on the size of the texture. In this case, the length of the gradient will be 0.25, which means that the 0th (4×4) mip level will be selected. For the lower quad the size of the gradient is doubled, which means that the 1st mip level will be selected instead. Quality can be further improved through the use of trilinear filtering, which linearly interpolates between the results of bilinearly sampling the two closest mip levels based on the gradients. Doing so prevents visible seams on a surface at the points where the texture switches to the next mip level.
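The selection logic just described can be sketched as follows. This is a simplified version of what the hardware does (function and parameter names are mine, and real hardware applies additional clamps and LOD biases):

```python
import math

# Mip selection from texture-coordinate gradients. ddx/ddy are the per-pixel
# UV derivatives in screen space, which the hardware derives from the 2x2
# pixel quad; the LOD is log2 of the pixel footprint measured in texels.

def select_mip(ddx, ddy, tex_w, tex_h, num_mips):
    # Footprint of one screen pixel, in texels, along each screen axis.
    px = math.hypot(ddx[0] * tex_w, ddx[1] * tex_h)
    py = math.hypot(ddy[0] * tex_w, ddy[1] * tex_h)
    rho = max(px, py)               # the larger gradient wins
    lod = max(0.0, math.log2(rho))  # rho <= 1 means mip 0 suffices
    return min(int(lod), num_mips - 1)

# The 4x4 texture from the figure: gradients of 0.25 cover one texel per
# pixel, so mip 0 is chosen; doubling the gradients selects mip 1.
print(select_mip((0.25, 0.0), (0.0, 0.25), 4, 4, 3))  # -> 0
print(select_mip((0.5, 0.0), (0.0, 0.5), 4, 4, 3))    # -> 1
```

Trilinear filtering would keep the fractional part of `lod` and blend between the two bracketing levels instead of truncating to an integer.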


One problem that we run into with mipmapping is when an image needs to be downscaled more in one dimension than in the other. This situation is referred to as anisotropy, due to the differing sampling rates with respect to the U and V axes of the texture. This happens all of the time in 3D rendering, particularly when a texture is mapped to a ground plane that’s nearly parallel with the view direction. In such a case the plane will be projected such that the V gradients grow more quickly than the U gradients as distance from the camera increases, which equates to the sampling rate being lower along the V axis. When the gradient is larger for one axis than the other, 3D hardware will use the larger gradient for mip selection since using the smaller gradient would result in aliasing due to undersampling. However this has the undesired effect of over-filtering along the other axis, thus producing a “blurry” result that’s missing details. To help alleviate this problem, graphics hardware supports anisotropic filtering. When this mode is active, the hardware will take up to a certain number of “extra” texture samples along the axis with the larger gradient. This allows the hardware to “reduce” the maximum gradient, and thus use a higher-resolution mip level. The final result is equivalent to using a rectangular reconstruction filter in 2D space as opposed to a box filter. Visually such a filter will produce results such that aliasing is prevented, while details are still perceptible. The following images demonstrate anisotropic filtering on a textured plane:


A textured plane without anisotropic filtering, and the same plane with 16x anisotropic filtering. The light grey grid lines demonstrate the distribution of pixels, and thus the rate of pixel shading in screen space. The red lines show the U and V axes of the texture mapped to the plane. Notice the lack of detail in the grain of the wood in the left image, due to over-filtering of the U axis in the lower-resolution mip levels.
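The sample-count and mip-level arithmetic behind this can be sketched roughly as follows. This follows the scheme of the OpenGL EXT_texture_filter_anisotropic extension; actual hardware behavior differs in the details:

```python
import math

# Sketch of anisotropic mip selection: take extra taps along the axis with
# the larger footprint so that a sharper (higher-resolution) mip can be used.

def aniso_setup(p_max, p_min, max_aniso):
    """p_max/p_min: pixel footprint in texels along the major/minor axis."""
    n = min(math.ceil(p_max / p_min), max_aniso)  # taps along the major axis
    lod = math.log2(max(p_max / n, 1.0))          # reduced effective gradient
    return n, lod

# Isotropic footprint: a single tap, mip chosen from the (equal) gradients.
print(aniso_setup(4.0, 4.0, 16))  # -> (1, 2.0)
# 8:1 anisotropy: 8 taps along the major axis and a 3-levels-sharper mip.
print(aniso_setup(8.0, 1.0, 16))  # -> (8, 0.0)
```

Dividing `p_max` by the tap count is the "reduce the maximum gradient" step described above: the extra taps cover the stretched footprint, so the mip level only has to match the footprint along the minor axis.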


Geometric Aliasing


A second type of aliasing experienced in 3D rendering is known as geometric aliasing. When a 3D scene composed of triangles is rasterized, the visibility of those triangles is sampled at discrete locations typically located at the center of the screen pixels. Triangle visibility is just like any other signal in that there will be aliasing in the reconstructed signal when the sampling rate is inadequate (in this case the sampling rate is determined by the screen resolution). Unfortunately triangular data will always have discontinuities, which means the signal will never be bandlimited and thus no sampling rate can be high enough to prevent aliasing. In practice these artifacts manifest as the familiar jagged lines or “jaggies” commonly seen in games and other applications employing realtime graphics. The following image demonstrates how these aliasing artifacts occur from rasterizing a single triangle:


Geometric aliasing occurring from undersampling the visibility of a triangle. The green, jagged line represents the outline of the triangle as seen on a display where pixels appear as squares of a uniform color.


Although we’ve already established that no sampling rate would allow us to perfectly reconstruct triangle visibility, it is possible to reduce aliasing artifacts with a process known as oversampling. Oversampling essentially boils down to sampling a signal at some rate higher than our intended output, and then using those samples points to reconstruct new sample points at the target sampling rate. In terms of 3D rendering this equates to rendering at some resolution higher than the output resolution, and then downscaling the resulting image to the display size. This process is known as supersampling, and it’s been in use in 3D graphics for a very long time. Unfortunately it’s an expensive option, since it requires not just rasterizing at a higher resolution but also shading pixels at a higher rate. Because of this, an optimized form of supersampling known as multi-sample antialiasing (abbreviated as MSAA) was developed specifically for combating geometric aliasing. We’ll discuss MSAA and geometric aliasing in more detail in the following article.


Shader Aliasing


A third type of aliasing that’s common in modern 3D graphics is known as shader aliasing. Shader aliasing is similar to texture aliasing in that it occurs because the pixel shader sampling rate is fixed in screen space. However, the distinction is that shader aliasing refers to undersampling of signals that are evaluated analytically in the pixel shader using mathematical formulas, as opposed to undersampling of a texture map. The most common and noticeable case of shader aliasing results from applying per-pixel specular lighting with low roughness values (high specular exponents for Phong and Blinn-Phong). Lower roughness values result in narrower lobes, which make the specular response a higher-frequency signal and thus more prone to undersampling. The following image contains plots of the N dot H response of a Blinn-Phong BRDF with varying roughness, demonstrating that it becomes higher frequency for lower roughnesses:


N dot H response of a Blinn-Phong BRDF with various exponents. Note how the response becomes higher-frequency for higher exponents, which correspond to lower roughness values. Image from Real-Time Rendering, 3rd Edition, A K Peters 2008

Shader aliasing is most likely to occur when normal maps are used, since they increase the frequency of the surface normal and consequently cause the specular response to vary rapidly across a surface. HDR rendering and physically-based shading models can compound the problem even further, since they allow for extremely intense specular highlights relative to the diffuse lighting response. This category of aliasing is perhaps the most difficult to solve, and as of yet there are no silver-bullet solutions. MSAA is almost entirely ineffective, since the pixel shading rate is not increased compared to the non-MSAA case. Supersampling is effective, but prohibitively expensive due to the increased shader and bandwidth costs required to shade and fill a larger render target. Emil Persson demonstrated a method of selectively supersampling the specular lighting inside the pixel shader[6], but this too can be expensive if the number of lights is high or if multiple normal maps need to be blended in order to compute the final surface normal.


A potential solution that has been steadily gaining some ground[7][8] is to modify the specular shading function itself based on normal variation. The theory behind this is that microfacet BRDF’s naturally represent micro-level variation along a surface, with the amount of variation being based on a roughness parameter. If we increase the roughness of a material as the normal map details become relatively smaller in screen space, we use the BRDF itself to account for the undersampling of the normal map/specular lighting response. Increasing roughness decreases the frequency of the resulting reflectance, which in turn reduces the appearance of artifacts. The following image contains an example of using this technique, with an image captured with 4x shader supersampling as a reference:

一個已逐漸獲得支持的潛在解決辦法[7][8]是根據法線的變化來修改高光著色函數自身。其背後的原理是，微表面BRDF本來就表示了表面上微觀層面的變化，而變化的多少由粗糙度參數決定。如果當normal map的細節在螢幕空間中變得相對更小時，我們增加材質的粗糙度，就可以利用BRDF自身來彌補normal map/高光光照響應的採樣不足。增加粗糙度會降低最終反射率的頻率，從而減少走樣痕跡的出現。下面的圖像包含了使用這個技術的一個例子，並以一張使用4x shader supersampling截取的圖像作為參考：
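The article doesn't spell out a specific formulation at this point, but one widely cited example of this roughness-adjustment idea is Toksvig's factor, which lowers a Blinn-Phong specular exponent as the averaged (mipmapped) normal gets shorter. A minimal sketch, assuming a Blinn-Phong exponent; the function name is mine, not from the article:

```python
def toksvig_adjusted_power(avg_normal_len, spec_power):
    """Reduce a Blinn-Phong specular exponent based on normal variation.

    avg_normal_len is the length of the averaged (e.g. mipmapped) normal,
    in (0, 1]. A shorter average normal means the underlying normals vary
    more, so the effective roughness should increase (exponent decrease).
    """
    ft = avg_normal_len / (avg_normal_len + spec_power * (1.0 - avg_normal_len))
    return ft * spec_power
```

With a perfectly uniform normal distribution (avg_normal_len == 1) the exponent is unchanged; as normal variation grows, the exponent drops and the highlight widens, just as increasing roughness would.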

Applying Sampling Theory To Real-Time Graphics
Applying Sampling Theory To Real-Time Graphics
Applying Sampling Theory To Real-Time Graphics

The topmost image shows an example of shader aliasing due to undersampling a high-frequency specular BRDF combined with a high-frequency normal map. The middle image shows the same scene with 4x shader supersampling applied. The bottom image shows the results of using a variant of CLEAN mapping to limit the frequency of the specular response.

最上面的圖像顯示了一個shader aliasing的例子，它是由於對一個高頻的高光BRDF結合高頻的normal map採樣不足而出現的。中間的圖像顯示了相同場景下應用4x shader supersampling的結果。最下面的圖像顯示了使用CLEAN mapping的一個變體來限制高光響應頻率的結果。

This approach (and others like it) can be considered to be part of a broader category of antialiasing techniques known as prefiltering. Prefiltering amounts to applying some sort of low-pass filter to a signal before sampling it, with the goal of ensuring that the signal’s bandwidth is less than half of the sampling rate. In a lot of cases this isn’t practical for graphics since we don’t have adequate information about what we’re sampling (for instance, we don’t know which triangle is visible at a pixel until we sample and rasterize the triangle). However in the case of specular aliasing from normal maps, the normal map contents are known ahead of time.

這個方法(或者其他類似的方法)可以被認為屬於一個更廣泛的反走樣技術範疇，稱為prefiltering。Prefiltering相當於在採樣某個信號之前，對它應用某種低通濾波器，目的是保證信號的帶寬小於採樣率的一半。在很多情況下這對圖形學並不可行，因為我們對要採樣的東西沒有足夠的資訊(例如，在對三角形進行採樣和光柵化之前，我們並不知道哪個三角形在某個像素處可見)。然而對於normal maps引起的高光走樣，normal map的內容是可以預先知道的。
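As a toy illustration of prefiltering, the sketch below band-limits a 1D signal with a simple box filter before decimating it. All names here are hypothetical, and a real pipeline would use a better band-limiting kernel than a box; the point is only that filtering before reducing the sample rate suppresses the frequencies that would otherwise alias:

```python
import math

def box_lowpass(signal, radius):
    # Moving-average (box) low-pass filter; a crude band-limiting kernel.
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - radius):i + radius + 1]
        out.append(sum(window) / len(window))
    return out

# A sine near the original Nyquist limit: 28 cycles over 64 samples.
signal = [math.sin(2.0 * math.pi * 28.0 * i / 64.0) for i in range(64)]

naive = signal[::4]                        # decimate without prefiltering
prefiltered = box_lowpass(signal, 2)[::4]  # low-pass first, then decimate
```

After decimating by 4x, the new Nyquist limit is far below 28 cycles: the naive version keeps a full-amplitude aliased tone, while the prefiltered version has the offending frequency strongly attenuated before it can alias.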

Temporal Aliasing

時域走樣

So far, we have discussed graphics in terms of sampling a 2D signal. However we’re often concerned with a third dimension, which is time. Whenever we’re rendering a video stream we’re also sampling in the time domain, since the signal will completely change as time advances. Therefore we must consider sampling along this dimension as well, and how it can produce aliasing.

到目前為止，我們都以採樣2D信號的形式討論圖形學。然而，我們經常需要關心第三個維度，那就是時間。每當我們渲染一個視訊流時，我們也在時域上進行採樣，因為信號會隨著時間的推移而完全改變。因此我們也必須考慮沿這個維度的採樣，以及它會怎樣產生走樣。

In the case of video we are still using discrete samples, where each sample is a complete 2D image representing our scene at some period of time. This sampling is similar to our sampling in the spatial domain: the signal we are sampling has some frequency, and if we undersample that signal, aliasing will occur. One classic example of temporal aliasing is the so-called “wagon-wheel effect”, which refers to the phenomenon where a rotating wheel may appear to rotate more slowly (or even backwards) when viewed in an undersampled video stream. This animated GIF from Wikipedia demonstrates the effect quite nicely:

在視訊的情況下，我們仍然使用離散的採樣，其中每個採樣都是一個完整的2D圖像，表示我們的場景在某個時間的樣子。這種採樣類似於我們在空間域上的採樣：我們所採樣的信號具有某個頻率，如果我們對該信號採樣不足，走樣就會發生。一個關於時域走樣的經典例子是所謂的“wagon-wheel effect”，它指的是這樣一種現象：當在一個採樣不足的視訊流中觀看時，一個旋轉的車輪可能看起來比實際旋轉得更慢(甚至是反向旋轉)。下面來自Wikipedia的GIF很好地展示了這個效果：

Applying Sampling Theory To Real-Time Graphics

A demonstration of the wagon-wheel effect that occurs due to temporal aliasing. In the animation the camera is moving to the right at a constant speed, yet the shapes appear to speed up, slow down, and even switch direction. Image taken from Wikipedia.

一個因為temporal aliasing而發生wagon-wheel effect的示例。在這個動畫中，攝像機正在以恆定的速度向右移動，然而裡面的形狀看起來卻時而加速、時而減速，甚至反轉方向。圖像引用自Wikipedia。
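The aliased rotation rate behind the wagon-wheel effect can be worked out with simple modular arithmetic. A small sketch (the helper is hypothetical, assuming evenly spaced, indistinguishable spokes):

```python
def apparent_spoke_rotation(deg_per_frame, num_spokes):
    # Spokes are indistinguishable under rotations of 360/num_spokes degrees,
    # so the true per-frame rotation aliases into (-period/2, period/2].
    period = 360.0 / num_spokes
    r = deg_per_frame % period
    if r > period / 2.0:
        r -= period
    return r
```

For example, a 4-spoke wheel (period 90°) advancing 80° per frame appears to rotate 10° backwards each frame, and at exactly 90° per frame it appears stationary.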

In games, temporal sampling artifacts usually manifest as “jerky” movements and animations. Increases in framerate correspond to an increase in sampling rate along the time domain, which allows for better sampling of faster-moving content. This is directly analogous to the improvements that are visible from increasing output resolution: more details are visible, and less aliasing is perceptible.

在遊戲中，temporal sampling的人工痕跡通常表現為移動和動畫的突然跳動。提高幀率相當於提高時域上的採樣率，這能對快速移動的內容進行更好的採樣。這與通過提高輸出分辨率所獲得的改善是直接類似的：可以看到更多的細節，可察覺到的走樣也更少。

The most commonly-used anti-aliasing technique for temporal aliasing is motion blur. Motion blur actually refers to an effect visible in photography, which occurs due to the shutter of the camera being open for some non-zero amount of time. This produces a result quite different than what we produce in 3D rendering, where by default we get an image representing one infinitely-small period of time. To accurately simulate the effect, we could supersample in the time domain by rendering more frames than we output and applying a filter to the result. However this is prohibitively expensive just like spatial supersampling, and so approximations are used. The most common approach is to produce a per-pixel velocity buffer for the current frame, and then use that to approximate the result of oversampling with a blur that uses multiple texture samples from nearby pixels. Such an approach can be considered an example of an advanced reconstruction filter that uses information about the rate of change of a signal rather than additional samples in order to reconstruct an approximation of the original signal. Under certain conditions the results can be quite plausible, however in many cases noticeable artifacts can occur due to the lack of additional sample points. Most notably these artifacts will occur where the occlusion of a surface by another surface changes during a frame, since information about the occluded surface is typically not available to the post-process shader performing the reconstruction. The following image shows three screenshots of a model rotating about the camera’s z-axis: the model rendered with no motion blur, the model rendered with Morgan McGuire’s post-process motion blur technique[9] applied using 16 samples per pixel, and finally the model rendered with temporal supersampling enabled using 32 samples per frame:

針對temporal aliasing最常用的反走樣技術是運動模糊。運動模糊實際上指的是攝影中可見的一種效果，它的發生是因為攝像機的快門會打開一段非零的時間(K:想象一下，你將相機的快門時間調高，然後對一個運動的物體拍攝，產生出的效果就是模糊的)。這與我們在3D渲染中產生的結果有很大不同，默認情況下，我們獲得的圖像表示的是一段無窮小的時間(K:而在3D渲染就不一樣了，你在某個時間點截圖的圖像是不可能有任何模糊的，每一個像素都恰當的顯示在當前的採樣時間上，圖像會是非常清晰的)。為了精確地模擬這個效果，我們可以在時域進行supersampling，渲染比輸出更多的幀數並對結果應用一個filter。然而，這和空間上的supersampling一樣，代價過高而不可行，因此會使用近似的方法。最常用的方法是，為當前幀生成一個逐像素速度的buffer，然後利用它，通過一個從鄰近像素獲取多個貼圖採樣的模糊操作，來近似oversampling的結果。這種方法可以被看作一種高級的reconstruction filter的例子：為了重建出原始信號的近似，它使用關於信號變化率的資訊而不是額外的採樣點。在某些條件下，這個結果是十分可信的，然而在很多情況下，由於缺少額外的採樣點，會出現能讓人注意到的人工痕跡。最明顯的是，當一個表面被另一個表面遮擋的關係在一幀內發生變化時(K:如本來是A遮擋B的，這幀中會變成B遮擋A)，這些痕跡就會出現，因為關於被遮擋表面的資訊通常在執行重建的後處理著色器中是不可用的。下面的圖像顯示了一個模型繞攝像機z軸旋轉的三張螢幕截圖：第一張沒有使用運動模糊，第二張使用了Morgan McGuire的後處理運動模糊技術[9]，每個像素使用16個採樣點，第三張啟用了temporal supersampling，每幀使用32個採樣點：

Applying Sampling Theory To Real-Time Graphics

A model rendered without motion blur, the same model rendered with post-process motion blur, and the same model rendered with temporal supersampling.
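The gather-style blur described above can be sketched as follows, assuming a grayscale image stored as a 2D list and a per-pixel velocity in pixels per frame. The names are illustrative, and this is a simplification of the idea rather than McGuire's actual technique, which handles neighboring velocities and depth ordering:

```python
def motion_blur_pixel(image, x, y, velocity, num_samples=8):
    # Approximate the shutter-interval integral by averaging taps spread
    # along the pixel's screen-space velocity, centered on the pixel.
    h, w = len(image), len(image[0])
    vx, vy = velocity
    total = 0.0
    for i in range(num_samples):
        t = (i + 0.5) / num_samples - 0.5   # offsets in [-0.5, 0.5)
        sx = min(w - 1, max(0, int(round(x + vx * t))))
        sy = min(h - 1, max(0, int(round(y + vy * t))))
        total += image[sy][sx]
    return total / num_samples
```

A stationary pixel (zero velocity) is returned unchanged, while a moving pixel smears its neighborhood along the velocity direction; taps that land on a different surface are exactly where the occlusion artifacts described above come from.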

References

[1] Hou, Hsieh S. Cubic Splines for Image Interpolation and Digital Filtering. IEEE Transactions on Acoustics, Speech, and Signal Processing. Vol. 26, Issue 6. December 1978.

[2] Mitchell, Don P. and Netravali, Arun N. Reconstruction Filters in Computer Graphics. SIGGRAPH ’88 Proceedings of the 15th annual conference on Computer graphics and interactive techniques.

[3] http://entropymine.com/imageworsener/bicubic/

[4] Schreiber, William F. Transformation Between Continuous and Discrete Representations of Images: A Perceptual Approach. IEEE Transactions on Pattern Analysis and Machine Intelligence. Volume 7, Issue 2. March 1985.

[5] Brown, Earl F. Television: The Subjective Effects of Filter Ringing Transients. February 1979.

[6] http://www.humus.name/index.php?page=3D&ID=64

[7] http://blog.selfshadow.com/2011/07/22/specular-showdown/

[8] http://advances.realtimerendering.com/s2012/index.html

[9] McGuire, Morgan, Hennessy, Padraic, Bukowski, Michael, and Osman, Brian. A Reconstruction Filter for Plausible Motion Blur. I3D 2012.