Dissecting the Unreal Rendering System (06): UE5 Special, Part 2 (Lumen and Others)

Contents

  • 6.5 Lumen
    • 6.5.1 Lumen technical features
      • 6.5.1.1 Surface Cache
      • 6.5.1.2 Screen Tracing
      • 6.5.1.3 Lumen ray tracing
      • 6.5.1.4 Other Lumen notes
    • 6.5.2 Lumen rendering basics
      • 6.5.2.1 FLumenCard
      • 6.5.2.2 FLumenMeshCards
      • 6.5.2.3 FLumenSceneData
    • 6.5.3 Lumen data building
      • 6.5.3.1 CardRepresentation
      • 6.5.3.2 GCardRepresentationAsyncQueue
      • 6.5.3.3 GenerateCardRepresentationData
    • 6.5.4 Lumen rendering pipeline
    • 6.5.5 Lumen scene update
      • 6.5.5.1 UpdateLumenScene
      • 6.5.5.2 CardsToRender
      • 6.5.5.3 MeshCardCapture
      • 6.5.5.4 RasterizeLumenCards
    • 6.5.6 Lumen scene lighting
      • 6.5.6.1 Voxel Cone Tracing
      • 6.5.6.2 RenderLumenSceneLighting
      • 6.5.6.3 RenderRadiosityForLumenScene
      • 6.5.6.4 CombineLumenSceneLighting
      • 6.5.6.5 RenderDirectLightingForLumenScene
      • 6.5.6.6 PrefilterLumenSceneLighting
      • 6.5.6.7 ComputeLumenSceneVoxelLighting
    • 6.5.7 Lumen indirect lighting
      • 6.5.7.1 RenderDiffuseIndirectAndAmbientOcclusion
      • 6.5.7.2 RenderLumenScreenProbeGather
      • 6.5.7.3 RenderLumenReflections
      • 6.5.7.4 DiffuseIndirectComposite
    • 6.5.8 Lumen summary
  • 6.6 Other rendering techniques
    • 6.6.1 Temporal Super Resolution
    • 6.6.2 Strata
  • 6.7 Summary of this part
  • Special notes
  • References

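Section 6.2.2.2, on Lumen's dynamic global illumination, briefly introduced Lumen's features, including indirect lighting, sky light, emissive lighting, soft and hard shadows, and reflections. This section describes its technical features in more detail.

First, it should be made clear that Lumen combines several techniques rather than relying on a single one. For example, Lumen defaults to software ray tracing against signed distance fields (SDF), but it can reach higher quality on supported graphics cards when hardware ray tracing is enabled.

The main techniques involved in Lumen are listed below.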

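Lumen generates an automatic parameterization of the nearby scene surfaces, called the Surface Cache, which is used to quickly look up the lighting at ray hit points in the scene. Lumen captures the material properties of each mesh from multiple angles. These capture positions are called Cards and are generated offline, per mesh. The console command

r.Lumen.Visualize.CardPlacement 1

visualizes the Lumen Cards:

Top: normal rendered view; bottom: Lumen Card visualization.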

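Nanite accelerates the mesh captures that keep the Surface Cache in sync with the triangle scene. High-poly meshes in particular need Nanite to be captured efficiently.

Once the Surface Cache has been populated with material properties, Lumen computes direct and indirect lighting for these surface positions. The updates are amortized over multiple frames, which gives efficient support for many dynamic lights and multi-bounce global illumination.

Only meshes with simple interiors are supported. Walls, floors, and ceilings should each be a separate mesh rather than being merged into one large mesh.

Lumen traces against the screen first (called screen tracing or screen-space tracing). If the ray finds no hit, or passes behind a surface, Lumen falls back to a more reliable method.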

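The downside of screen tracing is that it greatly limits the art-direction controls that apply only to indirect lighting, such as the Indirect Lighting Scale and Emissive Boost lighting properties.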

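Ray tracing in Lumen uses screen tracing first and only then the other, more expensive tracing options. If screen tracing is disabled for GI and reflections, only the Lumen Scene shows up in them. Screen tracing supports any geometry type and helps hide mismatches between the Lumen Scene and the triangle scene.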
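
The trace order described above can be sketched in a few lines. This is only an illustration under assumptions: TraceResult, TraceFn, and TraceWithFallback are hypothetical names, not engine types, and the two callbacks stand in for the screen trace and for whichever scene trace (distance field or hardware ray tracing) is active.

```cpp
#include <cassert>
#include <functional>
#include <optional>

// Hypothetical result of one trace; not an engine type.
struct TraceResult
{
    float HitDistance = 0.0f;
    bool bFromScreen = false; // true when the screen trace produced the hit
};

// A trace returns a hit, or nothing when it cannot give a reliable answer.
using TraceFn = std::function<std::optional<TraceResult>(float MaxDistance)>;

// Try the cheap screen-space trace first; fall back to the more reliable
// (and more expensive) scene trace only when the screen trace has no answer.
inline std::optional<TraceResult> TraceWithFallback(
    const TraceFn& ScreenTrace, const TraceFn& SceneTrace, float MaxDistance)
{
    assert(MaxDistance > 0.0f);

    if (std::optional<TraceResult> ScreenHit = ScreenTrace(MaxDistance))
    {
        ScreenHit->bFromScreen = true;
        return ScreenHit;
    }
    // An empty result covers both "nothing hit on screen" and
    // "the ray went behind a visible surface, so screen data is unreliable".
    return SceneTrace(MaxDistance);
}
```

Passing the two stages as callbacks keeps the sketch independent of how either trace is implemented; disabling screen tracing then corresponds to a ScreenTrace callback that always returns an empty result.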

Use

r.Lumen.ScreenProbeGather.ScreenTraces 0|1

to turn screen tracing on or off and compare the results:

Top: Lumen screen tracing enabled; bottom: Lumen screen tracing disabled. The difference is most visible in reflections, followed by some of the indirect lighting.

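Lumen supports two ray tracing modes:

1. Software ray tracing. Runs on the widest range of hardware and platforms.

2. Hardware ray tracing. Requires graphics card and operating system support.

  • Software ray tracing

By default, Lumen uses software ray tracing that relies on signed distance fields, which means it can run on any hardware that supports SM5.

Generate Mesh Distance Fields must be enabled in the project settings; UE5 enables it by default.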

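The renderer merges the mesh distance fields into a Global Distance Field to accelerate tracing. By default, Lumen traces against each mesh's own distance field for the first two meters of a ray for accuracy, and against the merged Global Distance Field for the rest of the ray. If a project needs precise control over Lumen's software ray tracing, the tracing method can be chosen with the Software Ray Tracing Mode option in the project settings.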

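Detail Tracing is the default method. It traces against the individual mesh distance fields to achieve high-quality GI (only for the first two meters; the Global Distance Field is used beyond that). Global Tracing traces only against the Global Distance Field, which is faster but costs some image quality.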
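
The two-stage behaviour of Detail Tracing can be illustrated with a small sphere-tracing loop. This is a sketch under assumptions, not the engine's implementation: DistanceFn and SphereTraceHybrid are hypothetical names, the distance queries are reduced to one dimension along the ray, and DetailRange would be 200 for "two meters" if distances are in centimeters.

```cpp
#include <cassert>
#include <functional>

// Signed distance to the nearest surface, queried at distance T along the ray.
// Both callbacks are hypothetical stand-ins for mesh and global distance field lookups.
using DistanceFn = std::function<float(float T)>;

// Sphere-trace a ray: query the detailed per-mesh field for the first
// DetailRange units, then switch to the coarser merged field.
// Returns the hit distance, or a negative value on a miss.
inline float SphereTraceHybrid(
    const DistanceFn& MeshSDF, const DistanceFn& GlobalSDF,
    float DetailRange, float MaxDistance, float SurfaceEpsilon = 0.01f)
{
    assert(DetailRange >= 0.0f && MaxDistance > 0.0f && SurfaceEpsilon > 0.0f);

    float T = 0.0f;
    for (int Step = 0; Step < 256 && T < MaxDistance; ++Step)
    {
        const float Distance = (T < DetailRange) ? MeshSDF(T) : GlobalSDF(T);
        if (Distance < SurfaceEpsilon)
        {
            return T; // close enough to a surface: report a hit
        }
        T += Distance; // safe step: no surface is closer than Distance
    }
    return -1.0f;
}
```

Global Tracing corresponds to calling the same loop with DetailRange set to 0, so that only the merged field is ever queried.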

Mesh distance fields are streamed in and out dynamically as the camera moves through the world. They are packed into an atlas, and the console command

r.DistanceFields.LogAtlasStats 1

prints the atlas statistics.

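Because the quality of Lumen's software ray tracing depends heavily on the mesh distance fields, paying attention to their quality improves Lumen's GI. Both the mesh distance fields and the global distance field can be displayed from the editor's visualization menu.

Top: mesh distance field visualization; bottom: global distance field visualization.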

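However, software ray tracing has a number of limitations, mainly:

  • Geometry limitations:
    • The Lumen Scene only supports Static Meshes, Instanced Static Meshes, and Hierarchical Instanced Static Meshes.
    • Landscape geometry is not supported, so it does not bounce indirect light. Support is planned for the future.
  • Material limitations:
    • World Position Offset (WPO) is not supported.
    • Translucent objects are not supported, and Masked objects are treated as opaque.
    • Distance field data is built from the material properties of the Static Mesh asset, not from per-component material overrides, so changing materials at runtime does not affect Lumen's GI.
  • Workflow limitations:
    • Software ray tracing requires levels to be built from modular pieces. Walls, floors, and ceilings should be separate meshes. Large meshes (such as mountains) are represented poorly and may cause self-occlusion artifacts.
    • Walls should be thicker than 10 cm to avoid light leaking.
    • Distance field resolution depends on the settings used when the Static Mesh was imported; if the data is compressed too aggressively, the resulting distance field will be of low quality.
    • Distance fields cannot represent very thin objects.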
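
Why thin geometry is a problem for a sampled distance field can be shown with a one-dimensional toy example. This is an illustrative sketch only: the function names and the voxel size are assumptions, and the engine's actual distance field resolution and encoding are not modelled here.

```cpp
#include <algorithm>
#include <cassert>

// Signed distance from X to the slab [SlabMin, SlabMax] on one axis (negative inside).
inline float SlabSignedDistance(float X, float SlabMin, float SlabMax)
{
    return std::max(SlabMin - X, X - SlabMax);
}

// Count how many voxel centres of a regular 1D grid fall inside the slab.
// Voxel i has its centre at (i + 0.5) * VoxelSize.
inline int CountInteriorSamples(int NumVoxels, float VoxelSize, float SlabMin, float SlabMax)
{
    assert(NumVoxels > 0 && VoxelSize > 0.0f && SlabMin <= SlabMax);

    int Count = 0;
    for (int Index = 0; Index < NumVoxels; ++Index)
    {
        const float Center = (static_cast<float>(Index) + 0.5f) * VoxelSize;
        if (SlabSignedDistance(Center, SlabMin, SlabMax) < 0.0f)
        {
            ++Count;
        }
    }
    return Count;
}
```

With a hypothetical voxel size of 10 units, a wall spanning 46 to 54 falls between the sample centres at 45 and 55, so no sample ever sees the inside of the wall, while a wall spanning 40 to 60 is captured by two samples. A field with no interior samples cannot reliably block rays, which is one way to picture both the thin-object limitation and the minimum wall thickness.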

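That covers Lumen's software ray tracing. Next comes hardware ray tracing.

  • Hardware ray tracing

Hardware ray tracing supports a wider range of geometry types than software ray tracing; in particular, it supports tracing against skinned meshes. It also scales better to higher image quality: it intersects the actual triangles and can optionally evaluate lighting at the ray hit point instead of reading the lower-quality Surface Cache.

However, setting up the scene for hardware ray tracing is expensive, and it currently does not scale to scenes with more than 100,000 instances. Dynamically deforming meshes (such as skinned meshes) also incur a large cost for updating the ray tracing acceleration structures every frame, and that cost is proportional to the number of skinned triangles.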

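For static meshes that use Nanite, hardware ray tracing can, for the sake of rendering efficiency, only operate on the proxy mesh generated from the Proxy Triangle Percent value in the Nanite settings of the Static Mesh Editor. Toggling the console command

r.Nanite 0|1

shows these proxy meshes:

Top: full-detail triangle mesh; bottom: the corresponding Nanite proxy mesh.

Screen tracing is used to hide the mismatch between the full-detail triangle mesh rendered by Nanite and the proxy mesh that Lumen traces rays against. In some cases, however, the mismatch is too large to hide. The comparison above shows self-shadowing artifacts caused by a Proxy Triangle Percent value that is too small.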

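Lumen only uses hardware ray tracing when the following conditions are met:

  • Use Hardware Ray Tracing when available and Support Hardware Ray Tracing are enabled in the project settings.
  • The project runs on a supported operating system, RHI, and graphics card. Currently only the following platforms support hardware ray tracing:
    • Windows 10 with DirectX 12.
    • PlayStation 5.
    • Xbox Series S / X.
    • The graphics card must be an NVIDIA RTX 2000 series or newer, or an AMD RX 6000 series or newer.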

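The Lumen Scene covers the world near the camera rather than the entire world, which makes large worlds and streaming possible. Lumen relies on Nanite's LOD and multi-view rasterization for fast scene captures that maintain the Surface Cache, and throttles all of these operations to prevent hitches. Lumen does not require Nanite to operate, but in scenes without Nanite its scene capture becomes very slow, especially when assets lack good LOD settings.

Lumen's Surface Cache covers positions up to 200 meters from the camera. Beyond that range, only screen tracing is active for global illumination.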

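In addition, Lumen has other limitations:

  • Lumen global illumination cannot be used together with lightmaps. In the future, Lumen's reflections should be extended to work with global illumination stored in lightmaps, which would further improve rendering quality.
  • Foliage is not yet well supported, because Lumen relies heavily on downsampled rendering and temporal filtering.
  • Lumen's Final Gather can add noticeable noise around moving objects; this is still under active development.
  • Translucent materials do not yet support Lumen reflections.
  • Translucent materials do not get high-quality dynamic GI.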

Below are Lumen's debugging and visualization views:

Top: normal view; middle: Lumen Scene visualization; bottom: Lumen GI visualization.

Besides the visualization options shown above, Lumen has many other visualization console commands:

r.Lumen.RadianceCache.Visualize    
r.Lumen.RadianceCache.VisualizeClipmapIndex
r.Lumen.RadianceCache.VisualizeProbeRadius
r.Lumen.RadianceCache.VisualizeRadiusScale

r.Lumen.ScreenProbeGather.VisualizeTraces    
r.Lumen.ScreenProbeGather.VisualizeTracesFreeze

r.Lumen.Visualize.CardInterpolateInfluenceRadius    
r.Lumen.Visualize.CardPlacement
r.Lumen.Visualize.CardPlacementDistance    
r.Lumen.Visualize.CardPlacementIndex    
r.Lumen.Visualize.CardPlacementOrientation    
r.Lumen.Visualize.ClipmapIndex    
r.Lumen.Visualize.ConeAngle    
r.Lumen.Visualize.ConeStepFactor
r.Lumen.Visualize.GridPixelSize    
r.Lumen.Visualize.HardwareRayTracing
r.Lumen.Visualize.HardwareRayTracing.DeferredMaterial    
r.Lumen.Visualize.HardwareRayTracing.DeferredMaterial.TileDimension
r.Lumen.Visualize.HardwareRayTracing.LightingMode
r.Lumen.Visualize.HardwareRayTracing.MaxTranslucentSkipCount
r.Lumen.Visualize.MaxMeshSDFTraceDistance
r.Lumen.Visualize.MaxTraceDistance    
r.Lumen.Visualize.MinTraceDistance    
r.Lumen.Visualize.Stats    
r.Lumen.Visualize.TraceMeshSDFs    
r.Lumen.Visualize.TraceRadianceCache
r.Lumen.Visualize.VoxelFaceIndex
r.Lumen.Visualize.Voxels
r.Lumen.Visualize.VoxelStepFactor    

ShowFlag.LumenGlobalIllumination
ShowFlag.LumenReflections
ShowFlag.VisualizeLumenIndirectDiffuse
ShowFlag.VisualizeLumenScene
           

There are also many control commands. Some of them are listed below:

r.Lumen.DiffuseIndirect.Allow
r.Lumen.DiffuseIndirect.CardInterpolateInfluenceRadius
r.Lumen.DiffuseIndirect.CardTraceEndDistanceFromCamera    

r.Lumen.DirectLighting    
r.Lumen.DirectLighting.BatchSize    
r.Lumen.DirectLighting.CardUpdateFrequencyScale    

r.Lumen.HardwareRayTracing
r.Lumen.HardwareRayTracing.PullbackBias
r.Lumen.IrradianceFieldGather
r.Lumen.IrradianceFieldGather.ClipmapDistributionBase
r.Lumen.IrradianceFieldGather.ClipmapWorldExtent

r.Lumen.MaxConeSteps
r.Lumen.MaxTraceDistance
r.Lumen.ProbeHierarchy
r.Lumen.ProbeHierarchy.AdditionalSpecularRayThreshold
r.Lumen.ProbeHierarchy.AntiTileAliasing

r.Lumen.RadianceCache.DownsampleDistanceFromCamera
r.Lumen.RadianceCache.ForceFullUpdate    
r.Lumen.RadianceCache.NumFramesToKeepCachedProbes    

r.Lumen.Radiosity    
r.Lumen.Radiosity.CardUpdateFrequencyScale    
r.Lumen.Radiosity.ComputeScatter    
r.Lumen.Radiosity.ConeAngleScale

r.Lumen.Reflections.Allow
r.Lumen.Reflections.DownsampleFactor    
r.Lumen.Reflections.GGXSamplingBias    
r.Lumen.Reflections.HardwareRayTracing
r.Lumen.Reflections.HardwareRayTracing.DeferredMaterial

r.Lumen.Reflections.HierarchicalScreenTraces.UncertainTraceRelativeDepthThreshold
r.Lumen.Reflections.MaxRayIntensity
r.Lumen.Reflections.MaxRoughnessToTrace    
r.Lumen.Reflections.RoughnessFadeLength    
r.Lumen.Reflections.ScreenSpaceReconstruction

r.Lumen.Reflections.ScreenTraces
r.Lumen.Reflections.Temporal
r.Lumen.Reflections.Temporal.DistanceThreshold
r.Lumen.Reflections.Temporal.HistoryWeight
r.Lumen.Reflections.TraceMeshSDFs

r.Lumen.ScreenProbeGather
r.Lumen.ScreenProbeGather.AdaptiveProbeAllocationFraction
r.Lumen.ScreenProbeGather.AdaptiveProbeMinDownsampleFactor
r.Lumen.ScreenProbeGather.DiffuseIntegralMethod
r.Lumen.ScreenProbeGather.DownsampleFactor
r.Lumen.ScreenProbeGather.FixedJitterIndex
r.Lumen.ScreenProbeGather.FullResolutionJitterWidth
r.Lumen.ScreenProbeGather.GatherNumMips
r.Lumen.ScreenProbeGather.GatherOctahedronResolutionScale
r.Lumen.ScreenProbeGather.HardwareRayTracing

r.Lumen.ScreenProbeGather.ImportanceSample.ProbeRadianceHistory
r.Lumen.ScreenProbeGather.MaxRayIntensity
r.Lumen.ScreenProbeGather.OctahedralSolidAngleTextureSize
r.Lumen.ScreenProbeGather.RadianceCache
r.Lumen.ScreenProbeGather.RadianceCache.ClipmapDistributionBase

r.Lumen.ScreenProbeGather.ReferenceMode
r.Lumen.ScreenProbeGather.ScreenSpaceBentNormal
r.Lumen.ScreenProbeGather.ScreenTraces
r.Lumen.ScreenProbeGather.ScreenTraces.HZBTraversal

r.Lumen.ScreenProbeGather.SpatialFilterHalfKernelSize    Experimental
r.Lumen.ScreenProbeGather.SpatialFilterMaxRadianceHitAngle

r.Lumen.ScreenProbeGather.Temporal    
r.Lumen.ScreenProbeGather.Temporal.ClearHistoryEveryFrame    

r.Lumen.ScreenProbeGather.TraceMeshSDFs    
r.Lumen.ScreenProbeGather.TracingOctahedronResolution
r.Lumen.TraceMeshSDFs
r.Lumen.TraceMeshSDFs.Allow    
r.Lumen.TranslucencyVolume.ConeAngleScale    
r.Lumen.TranslucencyVolume.Enable    
r.Lumen.TranslucencyVolume.EndDistanceFromCamera    

r.LumenParallelBeginUpdate
r.LumenScene.CardAtlasAllocatorBinSize    
r.LumenScene.CardAtlasSize    
r.LumenScene.CardCameraDistanceTexelDensityScale
r.LumenScene.CardCaptureMargin

r.LumenScene.ClipmapResolution    
r.LumenScene.ClipmapWorldExtent    
r.LumenScene.ClipmapZResolutionDivisor    
r.LumenScene.DiffuseReflectivityOverride    
r.LumenScene.DistantScene
r.LumenScene.DistantScene.CardResolution    

r.LumenScene.FastCameraMode
r.LumenScene.GlobalDFClipmapExtent    
r.LumenScene.GlobalDFResolution    
r.LumenScene.HeightfieldSlopeThreshold    
r.LumenScene.MaxInstanceAddsPerFrame
r.LumenScene.MeshCardsCullFaces    
r.LumenScene.MeshCardsMaxLOD

r.LumenScene.NaniteMultiViewCapture    
r.LumenScene.NumClipmapLevels    
r.LumenScene.PrimitivesPerPacket
r.LumenScene.RecaptureEveryFrame    
r.LumenScene.Reset
r.LumenScene.UploadCardBufferEveryFrame    
r.LumenScene.VoxelLightingAverageObjectsPerVisBufferTile

r.SSGI.AllowStandaloneLumenProbeHierarchy
r.Water.SingleLayer.LumenReflections
           

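There are over a hundred Lumen-related console commands, which shows just how complex Lumen rendering is.

This section covers the basic concepts and types related to Lumen.

FLumenCard is the Card mentioned in the previous subsection and is the basic building block of FLumenMeshCards.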

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenSceneData.h

// Lumen card type.
class FLumenCard
{
public:
    FLumenCard();
    ~FLumenCard();

    // World-space bounding box.
    FBox WorldBounds;
    // Rotation axes (card local to world).
    FVector LocalToWorldRotationX;
    FVector LocalToWorldRotationY;
    FVector LocalToWorldRotationZ;
    // Position.
    FVector Origin;
    // Extent in local space.
    FVector LocalExtent;
    
    // Whether the card is visible.
    bool bVisible = false;
    // Whether the card belongs to the distant scene.
    bool bDistantScene = false;

    // Atlas allocation info.
    bool bAllocated = false;
    FIntPoint DesiredResolution;
    FIntRect AtlasAllocation;

    // Orientation.
    int32 Orientation = -1;
    // Index in the visible card index buffer.
    int32 IndexInVisibleCardIndexBuffer = -1;
    // Index within the card list of the owning FLumenMeshCards.
    int32 IndexInMeshCards = -1;
    // Index of the owning FLumenMeshCards.
    int32 MeshCardsIndex = -1;
    // Resolution scale.
    float ResolutionScale = 1.0f;

    // Initialization.
    void Initialize(float InResolutionScale, const FMatrix& LocalToWorld, const FLumenCardBuildData& CardBuildData, int32 InIndexInMeshCards, int32 InMeshCardsIndex);

    // Set the transform data.
    void SetTransform(const FMatrix& LocalToWorld, FVector CardLocalCenter, FVector CardLocalExtent, int32 InOrientation);
    void SetTransform(const FMatrix& LocalToWorld, const FVector& LocalOrigin, const FVector& CardToLocalRotationX, const FVector& CardToLocalRotationY, const FVector& CardToLocalRotationZ, const FVector& InLocalExtent);

    // Remove from the atlas (scene).
    void RemoveFromAtlas(FLumenSceneData& LumenSceneData);

    int32 GetNumTexels() const
    {
        return AtlasAllocation.Area();
    }

    inline FVector TransformWorldPositionToCardLocal(FVector WorldPosition) const
    {
        FVector Offset = WorldPosition - Origin;
        return FVector(Offset | LocalToWorldRotationX, Offset | LocalToWorldRotationY, Offset | LocalToWorldRotationZ);
    }

    inline FVector TransformCardLocalPositionToWorld(FVector CardPosition) const
    {
        return Origin + CardPosition.X * LocalToWorldRotationX + CardPosition.Y * LocalToWorldRotationY + CardPosition.Z * LocalToWorldRotationZ;
    }
};
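
The two transform helpers at the end of FLumenCard are inverse operations as long as the three rotation axes are orthonormal. The following standalone sketch shows the round trip. Vec3, SimpleCard, NearlyEqual, and CheckRoundTrip are hypothetical stand-ins written for this example (FVector's operator| is a dot product, which is what Dot reproduces here).

```cpp
#include <cassert>
#include <cmath>

// Minimal stand-in for FVector; only what the example needs.
struct Vec3
{
    float X, Y, Z;
};

inline Vec3 operator-(const Vec3& A, const Vec3& B) { return {A.X - B.X, A.Y - B.Y, A.Z - B.Z}; }
inline Vec3 operator+(const Vec3& A, const Vec3& B) { return {A.X + B.X, A.Y + B.Y, A.Z + B.Z}; }
inline Vec3 operator*(float S, const Vec3& V) { return {S * V.X, S * V.Y, S * V.Z}; }
inline float Dot(const Vec3& A, const Vec3& B) { return A.X * B.X + A.Y * B.Y + A.Z * B.Z; }

inline bool NearlyEqual(const Vec3& A, const Vec3& B, float Tolerance = 1e-4f)
{
    return std::fabs(A.X - B.X) <= Tolerance
        && std::fabs(A.Y - B.Y) <= Tolerance
        && std::fabs(A.Z - B.Z) <= Tolerance;
}

// Hypothetical, trimmed-down card: an origin plus three orthonormal axes in world space.
struct SimpleCard
{
    Vec3 Origin;
    Vec3 AxisX, AxisY, AxisZ;

    // World -> card local: project the offset from the origin onto each axis.
    Vec3 WorldToLocal(const Vec3& World) const
    {
        const Vec3 Offset = World - Origin;
        return {Dot(Offset, AxisX), Dot(Offset, AxisY), Dot(Offset, AxisZ)};
    }

    // Card local -> world: rebuild the position from the same axes.
    Vec3 LocalToWorld(const Vec3& Local) const
    {
        return Origin + Local.X * AxisX + Local.Y * AxisY + Local.Z * AxisZ;
    }
};

// The round trip is only exact (up to float error) for orthonormal axes.
inline void CheckRoundTrip(const SimpleCard& Card, const Vec3& World)
{
    assert(NearlyEqual(Card.LocalToWorld(Card.WorldToLocal(World)), World));
}
```

For a card at (100, 0, 50) whose axes are world Y, Z, and X, the world position (110, 20, 80) maps to the card-local position (20, 30, 10), and transforming that back yields the original world position.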
           

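FLumenMeshCards is the basic element used to compute the Surface Cache and the basic unit that makes up the Lumen Scene. It stores FLumenCard information for up to 6 faces (orientations), and each orientation can hold 0 to N FLumenCard entries (as given by NumCardsPerOrientation).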

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenMeshCards.h

class FLumenMeshCards
{
public:
    // Initialization.
    void Initialize(
        const FMatrix& InLocalToWorld, 
        const FBox& InBounds,
        uint32 InFirstCardIndex,
        uint32 InNumCards,
        uint32 InNumCardsPerOrientation[6],
        uint32 InCardOffsetPerOrientation[6])
    {
        Bounds = InBounds;
        SetTransform(InLocalToWorld);
        FirstCardIndex = InFirstCardIndex;
        NumCards = InNumCards;

        for (uint32 OrientationIndex = 0; OrientationIndex < 6; ++OrientationIndex)
        {
            NumCardsPerOrientation[OrientationIndex] = InNumCardsPerOrientation[OrientationIndex];
            CardOffsetPerOrientation[OrientationIndex] = InCardOffsetPerOrientation[OrientationIndex];
        }
    }

    // Set the transform matrix.
    void SetTransform(const FMatrix& InLocalToWorld)
    {
        LocalToWorld = InLocalToWorld;
    }

    // Local-to-world matrix.
    FMatrix LocalToWorld;
    // Local-space bounds.
    FBox Bounds;

    // Index of the first FLumenCard.
    uint32 FirstCardIndex = 0;
    // Number of FLumenCards.
    uint32 NumCards = 0;
    // Number of FLumenCards for each of the 6 orientations.
    uint32 NumCardsPerOrientation[6];
    // FLumenCard offset for each of the 6 orientations.
    uint32 CardOffsetPerOrientation[6];
};
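
One plausible reading of these fields is that a mesh's cards occupy a contiguous range of a scene-wide card array, grouped by orientation. The sketch below follows that reading. It is an assumption made for illustration, not confirmed engine behaviour: SimpleMeshCards, MakeMeshCards, and CardIndicesForOrientation are hypothetical, and the offsets are assumed to be relative to FirstCardIndex.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

using uint32 = std::uint32_t;

// Hypothetical, trimmed-down version of the bookkeeping fields shown above.
struct SimpleMeshCards
{
    uint32 FirstCardIndex = 0;
    uint32 NumCards = 0;
    std::array<uint32, 6> NumCardsPerOrientation{};
    std::array<uint32, 6> CardOffsetPerOrientation{};
};

// Build the per-orientation offsets as a running (exclusive prefix) sum of the counts.
inline SimpleMeshCards MakeMeshCards(uint32 FirstCardIndex, const std::array<uint32, 6>& Counts)
{
    SimpleMeshCards Result;
    Result.FirstCardIndex = FirstCardIndex;
    Result.NumCardsPerOrientation = Counts;
    for (uint32 Orientation = 0; Orientation < 6; ++Orientation)
    {
        Result.CardOffsetPerOrientation[Orientation] = Result.NumCards;
        Result.NumCards += Counts[Orientation];
    }
    return Result;
}

// Scene-wide card indices for one orientation (0..5).
inline std::vector<uint32> CardIndicesForOrientation(const SimpleMeshCards& MeshCards, uint32 Orientation)
{
    assert(Orientation < 6);

    std::vector<uint32> Indices;
    const uint32 Begin = MeshCards.FirstCardIndex + MeshCards.CardOffsetPerOrientation[Orientation];
    for (uint32 Local = 0; Local < MeshCards.NumCardsPerOrientation[Orientation]; ++Local)
    {
        Indices.push_back(Begin + Local);
    }
    return Indices;
}
```

Storing a count and an offset per orientation lets a lookup skip straight to the cards facing a given direction without scanning all of the mesh's cards.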
           

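FLumenSceneData is the scene representation that Lumen uses to implement global illumination. Instead of Nanite's high-precision meshes, it uses a coarse scene built from FLumenCard and FLumenMeshCards. Its definition and related types are as follows: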

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenSceneData.h

// Lumen primitive instance.
class FLumenPrimitiveInstance
{
public:
    FBox WorldSpaceBoundingBox;
    // Index of the FLumenMeshCards.
    int32 MeshCardsIndex;
    bool bValidMeshCards;
};

// Lumen primitive.
class FLumenPrimitive
{
public:
    // World-space bounding box.
    FBox WorldSpaceBoundingBox;
    // Largest extent of the FLumenMeshCards that belong to this primitive, used for early culling.
    float MaxCardExtent;

    // List of primitive instances.
    TArray<FLumenPrimitiveInstance, TInlineAllocator<1>> Instances;

    // The corresponding primitive in the real scene.
    FPrimitiveSceneInfo* Primitive = nullptr;

    // Whether the instances are merged.
    bool bMergedInstances = false;
    // Card resolution scale.
    float CardResolutionScale = 1.0f;
    // Number of FLumenMeshCards.
    int32 NumMeshCards = 0;

    // Mapping into LumenDFInstanceToDFObjectIndex.
    uint32 LumenDFInstanceOffset = UINT32_MAX;
    int32 LumenNumDFInstances = 0;

    // Get the FLumenMeshCards index.
    int32 GetMeshCardsIndex(int32 InstanceIndex) const
    {
        if (bMergedInstances)
        {
            return Instances[0].MeshCardsIndex;
        }

        if (InstanceIndex < Instances.Num())
        {
            return Instances[InstanceIndex].MeshCardsIndex;
        }

        return -1;
    }
};

// Lumen scene data.
class FLumenSceneData
{
public:
    int32 Generation;

    // Buffers for uploading to the GPU.
    FScatterUploadBuffer CardUploadBuffer;
    FScatterUploadBuffer UploadMeshCardsBuffer;
    FScatterUploadBuffer ByteBufferUploadBuffer;
    FScatterUploadBuffer UploadPrimitiveBuffer;

    FUniqueIndexList CardIndicesToUpdateInBuffer;
    FRWBufferStructured CardBuffer;

    TArray<FBox> PrimitiveModifiedBounds;

    // All Lumen primitives in the Lumen scene.
    TArray<FLumenPrimitive> LumenPrimitives;

    // FLumenMeshCards data.
    FUniqueIndexList MeshCardsIndicesToUpdateInBuffer;
    TSparseSpanArray<FLumenMeshCards> MeshCards;
    TSparseSpanArray<FLumenCard> Cards;
    TArray<int32, TInlineAllocator<8>> DistantCardIndices;
    FRWBufferStructured MeshCardsBuffer;
    FRWByteAddressBuffer DFObjectToMeshCardsIndexBuffer;

    // Mapping from primitive to LumenDFInstance.
    FUniqueIndexList PrimitivesToUpdate;
    FRWByteAddressBuffer PrimitiveToDFLumenInstanceOffsetBuffer;
    uint32 PrimitiveToLumenDFInstanceOffsetBufferSize = 0;

    // Mapping from LumenDFInstance to DFObjectIndex.
    FUniqueIndexList DFObjectIndicesToUpdateInBuffer;
    FUniqueIndexList LumenDFInstancesToUpdate;
    TSparseSpanArray<int32> LumenDFInstanceToDFObjectIndex;
    FRWByteAddressBuffer LumenDFInstanceToDFObjectIndexBuffer;
    uint32 LumenDFInstanceToDFObjectIndexBufferSize = 0;

    // List of visible cards.
    TArray<int32> VisibleCardsIndices;
    TRefCountPtr<FRDGPooledBuffer> VisibleCardsIndexBuffer;

    // --- Data captured from the triangle scene ---
    TRefCountPtr<IPooledRenderTarget> AlbedoAtlas;
    TRefCountPtr<IPooledRenderTarget> NormalAtlas;
    TRefCountPtr<IPooledRenderTarget> EmissiveAtlas;

    // --- Generated data ---
    TRefCountPtr<IPooledRenderTarget> DepthAtlas;
    TRefCountPtr<IPooledRenderTarget> FinalLightingAtlas;
    TRefCountPtr<IPooledRenderTarget> IrradianceAtlas;
    TRefCountPtr<IPooledRenderTarget> IndirectIrradianceAtlas;
    TRefCountPtr<IPooledRenderTarget> RadiosityAtlas;
    TRefCountPtr<IPooledRenderTarget> OpacityAtlas;

    // Other data.
    bool bFinalLightingAtlasContentsValid;
    FIntPoint MaxAtlasSize;
    FBinnedTextureLayout AtlasAllocator;
    int32 NumCardTexels = 0;
    int32 NumMeshCardsToAddToSurfaceCache = 0;

    // Pending primitive add/update/remove data.
    bool bTrackAllPrimitives;
    TSet<FPrimitiveSceneInfo*> PendingAddOperations;
    TSet<FPrimitiveSceneInfo*> PendingUpdateOperations;
    TArray<FLumenPrimitiveRemoveInfo> PendingRemoveOperations;

    FLumenSceneData(EShaderPlatform ShaderPlatform, EWorldType::Type WorldType);
    ~FLumenSceneData();

    // Primitive add/update/remove operations.
    void AddPrimitiveToUpdate(int32 PrimitiveIndex);
    void AddPrimitive(FPrimitiveSceneInfo* InPrimitive);
    void UpdatePrimitive(FPrimitiveSceneInfo* InPrimitive);
    void RemovePrimitive(FPrimitiveSceneInfo* InPrimitive, int32 PrimitiveIndex);

    // Add/remove cards and FLumenMeshCards.
    void AddCardToVisibleCardList(int32 CardIndex);
    void RemoveCardFromVisibleCardList(int32 CardIndex);
    void AddMeshCards(int32 LumenPrimitiveIndex, int32 LumenInstanceIndex);
    void UpdateMeshCards(const FMatrix& LocalToWorld, int32 MeshCardsIndex, const FMeshCardsBuildData& MeshCardsBuildData);
    void RemoveMeshCards(FLumenPrimitive& LumenPrimitive, FLumenPrimitiveInstance& LumenPrimitiveInstance);

    bool HasPendingOperations() const
    {
        return PendingAddOperations.Num() > 0 || PendingUpdateOperations.Num() > 0 || PendingRemoveOperations.Num() > 0;
    }

    void UpdatePrimitiveToDistanceFieldInstanceMapping(FScene& Scene, FRHICommandListImmediate& RHICmdList);

private:
    // Add FLumenMeshCards from build data.
    int32 AddMeshCardsFromBuildData(const FMatrix& LocalToWorld, const FMeshCardsBuildData& MeshCardsBuildData, float ResolutionScale);
};
           

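As shown above, FLumenSceneData stores the FLumenMeshCards together with the primitives (FLumenPrimitive) and primitive instances (FLumenPrimitiveInstance) that are built on top of them. Each FLumenPrimitive references several FLumenMeshCards through its instances and also holds an FPrimitiveSceneInfo pointer, which identifies the real-scene FPrimitiveSceneInfo that it coarsely represents.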

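Before the actual rendering, Lumen builds a lot of data, including the Mesh Distance Fields, the Global Distance Field, and the mesh cards.

When a Lumen project is launched for the first time, a lot of data is built, including the mesh distance fields.

To build the mesh card representation, UE5 split out a MeshCardRepresentation module, whose core concepts and types are as follows: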

// Engine\Source\Runtime\Engine\Public\MeshCardRepresentation.h

// FLumenCard build data.
class FLumenCardBuildData
{
public:
    // Center and extent.
    FVector Center;
    FVector Extent;

    // Orientation order: -X, +X, -Y, +Y, -Z, +Z
    int32 Orientation;
    int32 LODLevel;

    // Swizzle Extent according to the orientation.
    static FVector TransformFaceExtent(FVector Extent, int32 Orientation)
    {
        if (Orientation / 2 == 2) // Orientation: -Z, +Z
        {
            return FVector(Extent.Y, Extent.X, Extent.Z);
        }
        else if (Orientation / 2 == 1) // Orientation: -Y, +Y
        {
            return FVector(Extent.Z, Extent.X, Extent.Y);
        }
        else // (Orientation / 2 == 0), orientation: -X, +X
        {
            return FVector(Extent.Y, Extent.Z, Extent.X);
        }
    }
};

// FLumenMeshCards build data.
class FMeshCardsBuildData
{
public:
    FBox Bounds;
    int32 MaxLODLevel;
    // List of FLumenCard build data.
    TArray<FLumenCardBuildData> CardBuildData;

    (......)
};

// Unique id for each card representation data instance.
class FCardRepresentationDataId
{
public:
    uint32 Value = 0;

    bool IsValid() const
    {
        return Value != 0;
    }

    bool operator==(FCardRepresentationDataId B) const
    {
        return Value == B.Value;
    }

    friend uint32 GetTypeHash(FCardRepresentationDataId DataId)
    {
        return GetTypeHash(DataId.Value);
    }
};

// Payload and output data of the mesh card representation build.
class FCardRepresentationData : public FDeferredCleanupInterface
{
public:
    // Mesh card build data and id.
    FMeshCardsBuildData MeshCardsBuildData;
    FCardRepresentationDataId CardRepresentationDataId;

    (......)

#if WITH_EDITORONLY_DATA
    // Cache the card representation data.
    void CacheDerivedData(const FString& InDDCKey, const ITargetPlatform* TargetPlatform, UStaticMesh* Mesh, UStaticMesh* GenerateSource, bool bGenerateDistanceFieldAsIfTwoSided, FSourceMeshDataForDerivedDataTask* OptionalSourceMeshData);
#endif
};

// Build task worker.
class FAsyncCardRepresentationTaskWorker : public FNonAbandonableTask
{
public:
    (.....)
    
    void DoWork();

private:
    FAsyncCardRepresentationTask& Task;
};

// Data carrier for a build task.
class FAsyncCardRepresentationTask
{
public:
    bool bSuccess = false;

#if WITH_EDITOR
    TArray<FSignedDistanceFieldBuildMaterialData> MaterialBlendModes;
#endif

    FSourceMeshDataForDerivedDataTask SourceMeshData;
    bool bGenerateDistanceFieldAsIfTwoSided = false;
    UStaticMesh* StaticMesh = nullptr;
    UStaticMesh* GenerateSource = nullptr;
    FString DDCKey;
    FCardRepresentationData* GeneratedCardRepresentation;
    TUniquePtr<FAsyncTask<FAsyncCardRepresentationTaskWorker>> AsyncTask = nullptr;
};

// Manages the asynchronous building of mesh card representations.
class FCardRepresentationAsyncQueue : public FGCObject
{
public:
    // Add a new build task.
    ENGINE_API void AddTask(FAsyncCardRepresentationTask* Task);
    
    // Process asynchronous tasks.
    ENGINE_API void ProcessAsyncTasks(bool bLimitExecutionTime = false);
    
    // Cancel builds.
    ENGINE_API void CancelBuild(UStaticMesh* StaticMesh);
    ENGINE_API void CancelAllOutstandingBuilds();

    // Block until builds complete.
    ENGINE_API void BlockUntilBuildComplete(UStaticMesh* StaticMesh, bool bWarnIfBlocked);
    ENGINE_API void BlockUntilAllBuildsComplete();

    (......)
};

// Global build queue.
extern ENGINE_API FCardRepresentationAsyncQueue* GCardRepresentationAsyncQueue;

extern ENGINE_API FString BuildCardRepresentationDerivedDataKey(const FString& InMeshKey);

extern ENGINE_API void BeginCacheMeshCardRepresentation(const ITargetPlatform* TargetPlatform, UStaticMesh* StaticMeshAsset, class FStaticMeshRenderData& RenderData, const FString& DistanceFieldKey, FSourceMeshDataForDerivedDataTask* OptionalSourceMeshData);
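
The orientation index used by FLumenCardBuildData packs an axis and a sign into one integer, and TransformFaceExtent swizzles the extent so that the facing axis always lands in Z. The standalone sketch below spells this out. Vec3, OrientationAxis, OrientationSign, and FaceExtent are hypothetical helpers written for this example; only the orientation order and the swizzle itself come from the listing above.

```cpp
#include <cassert>

// Minimal stand-in for FVector.
struct Vec3
{
    float X, Y, Z;
};

// Orientation order from FLumenCardBuildData: -X, +X, -Y, +Y, -Z, +Z.
// Axis index: 0 = X, 1 = Y, 2 = Z.
inline int OrientationAxis(int Orientation)
{
    assert(Orientation >= 0 && Orientation < 6);
    return Orientation / 2;
}

// Even orientations face the negative direction, odd ones the positive direction.
inline float OrientationSign(int Orientation)
{
    assert(Orientation >= 0 && Orientation < 6);
    return (Orientation % 2 == 0) ? -1.0f : 1.0f;
}

// Same swizzle as TransformFaceExtent: the extent along the facing axis ends up in Z,
// so Z acts as the card's depth while X and Y span the card's face.
inline Vec3 FaceExtent(const Vec3& Extent, int Orientation)
{
    switch (OrientationAxis(Orientation))
    {
        case 2:  return {Extent.Y, Extent.X, Extent.Z}; // -Z, +Z
        case 1:  return {Extent.Z, Extent.X, Extent.Y}; // -Y, +Y
        default: return {Extent.Y, Extent.Z, Extent.X}; // -X, +X
    }
}
```

For an extent of (1, 2, 3), the face extents are (2, 3, 1) for the X orientations, (3, 1, 2) for the Y orientations, and (2, 1, 3) for the Z orientations; in each case the last component is the extent along the axis the card faces.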
           

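To build the data Lumen needs, UE5 declares two global queue variables: GCardRepresentationAsyncQueue and GDistanceFieldAsyncQueue. The former builds the Lumen Card data and the latter builds the distance field data. They are created and updated as follows: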

// Engine\Source\Runtime\Launch\Private\LaunchEngineLoop.cpp

int32 FEngineLoop::PreInitPreStartupScreen(const TCHAR* CmdLine)
{
    (......)
    
    if (!FPlatformProperties::RequiresCookedData())
    {
        (......)
        
        // Create the global async queues.
        GDistanceFieldAsyncQueue = new FDistanceFieldAsyncQueue();
        GCardRepresentationAsyncQueue = new FCardRepresentationAsyncQueue();

        (......)
    }
    
    (......)
}

void FEngineLoop::Tick()
{
    (......)
    
    // Update the global async queues every frame.
    if (GDistanceFieldAsyncQueue)
    {
        QUICK_SCOPE_CYCLE_COUNTER(STAT_FEngineLoop_Tick_GDistanceFieldAsyncQueue);
        GDistanceFieldAsyncQueue->ProcessAsyncTasks();
    }
    if (GCardRepresentationAsyncQueue)
    {
        QUICK_SCOPE_CYCLE_COUNTER(STAT_FEngineLoop_Tick_GCardRepresentationAsyncQueue);
        GCardRepresentationAsyncQueue->ProcessAsyncTasks();
    }
    
    (......)
}
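
The pattern in the listing above (create a queue once, then call ProcessAsyncTasks every tick) can be sketched as a small hand-off queue. This is an assumption-laden miniature, not the engine's implementation: SimpleBuildQueue, NotifyCompleted, and the per-tick limit are hypothetical, and build results are reduced to plain integers.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <vector>

// Hypothetical miniature of an async build queue. Workers hand finished results
// to the queue; the main thread finalizes them during its per-frame tick.
class SimpleBuildQueue
{
public:
    // Main thread: record that a build has been kicked off.
    void AddTask()
    {
        std::lock_guard<std::mutex> Lock(Mutex);
        ++Outstanding;
    }

    // Worker thread: hand over the result of a finished build.
    void NotifyCompleted(int Result)
    {
        std::lock_guard<std::mutex> Lock(Mutex);
        CompletedResults.push_back(Result);
    }

    // Main thread, once per frame: finalize at most MaxPerTick finished builds.
    // Never waits for builds that are still running.
    std::size_t ProcessAsyncTasks(std::size_t MaxPerTick)
    {
        std::size_t NumProcessed = 0;
        while (NumProcessed < MaxPerTick)
        {
            int Result = 0;
            {
                std::lock_guard<std::mutex> Lock(Mutex);
                if (CompletedResults.empty())
                {
                    break;
                }
                Result = CompletedResults.front();
                CompletedResults.pop_front();
                assert(Outstanding > 0);
                --Outstanding;
            }
            // Main-thread-only finalization happens outside the lock.
            FinalizedResults.push_back(Result);
            ++NumProcessed;
        }
        return NumProcessed;
    }

    std::size_t NumOutstanding() const
    {
        std::lock_guard<std::mutex> Lock(Mutex);
        return Outstanding;
    }

    const std::vector<int>& Finalized() const { return FinalizedResults; }

private:
    mutable std::mutex Mutex;
    std::size_t Outstanding = 0;
    std::deque<int> CompletedResults;
    std::vector<int> FinalizedResults;
};
```

Limiting the work done per tick is one way to keep the frame time stable while many builds finish at once; whether and how the engine's bLimitExecutionTime parameter does this is not shown in the listing, so the limit here is purely illustrative.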
           

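Since GDistanceFieldAsyncQueue already existed in UE4, this section skips it and focuses on GCardRepresentationAsyncQueue.

As for when a CardRepresentation is added to the global build queue GCardRepresentationAsyncQueue, the answer can be found in MeshCardRepresentation.cpp: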

FCardRepresentationAsyncQueue* GCardRepresentationAsyncQueue = NULL;

// Begin caching the mesh card representation.
void BeginCacheMeshCardRepresentation(const ITargetPlatform* TargetPlatform, UStaticMesh* StaticMeshAsset, FStaticMeshRenderData& RenderData, const FString& DistanceFieldKey, FSourceMeshDataForDerivedDataTask* OptionalSourceMeshData)
{
    static const auto CVarCards = IConsoleManager::Get().FindTConsoleVariableDataInt(TEXT("r.MeshCardRepresentation"));

    if (CVarCards->GetValueOnAnyThread() != 0)
    {
        FString Key = BuildCardRepresentationDerivedDataKey(DistanceFieldKey);
        if (RenderData.LODResources.IsValidIndex(0))
        {
            // Create the FCardRepresentationData instance.
            if (!RenderData.LODResources[0].CardRepresentationData)
            {
                RenderData.LODResources[0].CardRepresentationData = new FCardRepresentationData();
            }

            const FMeshBuildSettings& BuildSettings = StaticMeshAsset->GetSourceModel(0).BuildSettings;
            UStaticMesh* MeshToGenerateFrom = StaticMeshAsset;

            // Cache the FCardRepresentationData.
            RenderData.LODResources[0].CardRepresentationData->CacheDerivedData(Key, TargetPlatform, StaticMeshAsset, MeshToGenerateFrom, BuildSettings.bGenerateDistanceFieldAsIfTwoSided, OptionalSourceMeshData);
        }
    }
}

// Cache the FCardRepresentationData.
void FCardRepresentationData::CacheDerivedData(const FString& InDDCKey, const ITargetPlatform* TargetPlatform, UStaticMesh* Mesh, UStaticMesh* GenerateSource, bool bGenerateDistanceFieldAsIfTwoSided, FSourceMeshDataForDerivedDataTask* OptionalSourceMeshData)
{
    TArray<uint8> DerivedData;

    (......)
    {
        COOK_STAT(Timer.TrackCyclesOnly());
        
        // Create a new build task (FAsyncCardRepresentationTask).
        FAsyncCardRepresentationTask* NewTask = new FAsyncCardRepresentationTask;
        NewTask->DDCKey = InDDCKey;
        check(Mesh && GenerateSource);
        NewTask->StaticMesh = Mesh;
        NewTask->GenerateSource = GenerateSource;
        NewTask->GeneratedCardRepresentation = new FCardRepresentationData();
        NewTask->bGenerateDistanceFieldAsIfTwoSided = bGenerateDistanceFieldAsIfTwoSided;

        // Collect the material blend modes.
        for (int32 MaterialIndex = 0; MaterialIndex < Mesh->GetStaticMaterials().Num(); MaterialIndex++)
        {
            FSignedDistanceFieldBuildMaterialData MaterialData;
            // Default material blend mode
            MaterialData.BlendMode = BLEND_Opaque;
            MaterialData.bTwoSided = false;

            if (Mesh->GetStaticMaterials()[MaterialIndex].MaterialInterface)
            {
                MaterialData.BlendMode = Mesh->GetStaticMaterials()[MaterialIndex].MaterialInterface->GetBlendMode();
                MaterialData.bTwoSided = Mesh->GetStaticMaterials()[MaterialIndex].MaterialInterface->IsTwoSided();
            }

            NewTask->MaterialBlendModes.Add(MaterialData);
        }

        // Nanite overrides the source static mesh with a coarse representation, so the original data must be loaded before building the mesh SDF.
        if (OptionalSourceMeshData)
        {
            NewTask->SourceMeshData = *OptionalSourceMeshData;
        }
        // For Nanite-enabled meshes, build the vertex positions and triangle indices.
        else if (Mesh->NaniteSettings.bEnabled)
        {
            IMeshBuilderModule& MeshBuilderModule = IMeshBuilderModule::GetForPlatform(TargetPlatform);
            if (!MeshBuilderModule.BuildMeshVertexPositions(Mesh, NewTask->SourceMeshData.TriangleIndices, NewTask->SourceMeshData.VertexPositions))
            {
                UE_LOG(LogStaticMesh, Error, TEXT("Failed to build static mesh. See previous line(s) for details."));
            }
        }

        // Add the task to the global queue GCardRepresentationAsyncQueue.
        GCardRepresentationAsyncQueue->AddTask(NewTask);
    }
}
           

Following the call stack of FCardRepresentationAsyncQueue shows that it eventually reaches FMeshUtilities::GenerateCardRepresentationData, which runs the actual mesh card build logic:

// Engine\Source\Developer\MeshUtilities\Private\MeshCardRepresentationUtilities.cpp

bool FMeshUtilities::GenerateCardRepresentationData(
    FString MeshName,
    const FSourceMeshDataForDerivedDataTask& SourceMeshData,
    const FStaticMeshLODResources& LODModel,
    class FQueuedThreadPool& ThreadPool,
    const TArray<FSignedDistanceFieldBuildMaterialData>& MaterialBlendModes,
    const FBoxSphereBounds& Bounds,
    const FDistanceFieldVolumeData* DistanceFieldVolumeData,
    bool bGenerateAsIfTwoSided,
    FCardRepresentationData& OutData)
{
    // Build the Embree scene.
    FEmbreeScene EmbreeScene;
    MeshRepresentation::SetupEmbreeScene(MeshName,
        SourceMeshData,
        LODModel,
        MaterialBlendModes,
        bGenerateAsIfTwoSided,
        EmbreeScene);

    if (!EmbreeScene.EmbreeScene)
    {
        return false;
    }

    // Set up the build context.
    FGenerateCardMeshContext Context(MeshName, EmbreeScene.EmbreeScene, EmbreeScene.EmbreeDevice, OutData);
    // Build the mesh cards.
    BuildMeshCards(DistanceFieldVolumeData ? DistanceFieldVolumeData->LocalSpaceMeshBounds : Bounds.GetBox(), Context, OutData);

    MeshRepresentation::DeleteEmbreeScene(EmbreeScene);
    
    (......)

    return true;
}
           

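As shown above, the mesh card build uses the third-party Embree library.

About Embree

Embree is an open-source library developed and maintained by Intel. It is a collection of high-performance ray tracing kernels that helps developers improve the performance of photorealistic rendering applications. Its features include advanced hair geometry, motion blur, dynamic scenes, and multi-level instancing.

Embree's implementation has the following characteristics:
  • The kernels are optimized for the latest Intel processors with support for SSE, AVX, AVX2, and AVX-512 instructions.
  • It supports runtime code selection to choose the traversal and build algorithms that best match the CPU's instruction set.
  • It supports applications written with the Intel SPMD Program Compiler (ISPC) and also provides an ISPC interface to the core ray tracing algorithms.
  • It contains algorithms optimized for incoherent workloads (such as Monte Carlo ray tracing) and for coherent workloads (such as primary visibility and hard shadow rays).
In short, Embree is a highly optimized, CPU-based ray tracing kernel library and, as used here, does not rely on GPU hardware acceleration. For that reason, the build time of Lumen's mesh cards depends mainly on CPU performance.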

The core build logic lives in BuildMeshCards:

void BuildMeshCards(const FBox& MeshBounds, const FGenerateCardMeshContext& Context, FCardRepresentationData& OutData)
{
    static const auto CVarMeshCardRepresentationMinSurface = IConsoleManager::Get().FindTConsoleVariableDataFloat(TEXT("r.MeshCardRepresentation.MinSurface"));
    const float MinSurfaceThreshold = CVarMeshCardRepresentationMinSurface->GetValueOnAnyThread();

    // Make sure the generated card bounds are not empty.
    const FVector MeshCardsBoundsCenter = MeshBounds.GetCenter();
    const FVector MeshCardsBoundsExtent = FVector::Max(MeshBounds.GetExtent() + 1.0f, FVector(5.0f));
    const FBox MeshCardsBounds(MeshCardsBoundsCenter - MeshCardsBoundsExtent, MeshCardsBoundsCenter + MeshCardsBoundsExtent);

    // Initialize part of the output data.
    OutData.MeshCardsBuildData.Bounds = MeshCardsBounds;
    OutData.MeshCardsBuildData.MaxLODLevel = 1;
    OutData.MeshCardsBuildData.CardBuildData.Reset();

    // Set up the sampling and voxel data.
    const float SamplesPerWorldUnit = 1.0f / 10.0f;
    const int32 MinSamplesPerAxis = 4;
    const int32 MaxSamplesPerAxis = 64;
    FIntVector VolumeSizeInVoxels;
    VolumeSizeInVoxels.X = FMath::Clamp<int32>(MeshCardsBounds.GetSize().X * SamplesPerWorldUnit, MinSamplesPerAxis, MaxSamplesPerAxis);
    VolumeSizeInVoxels.Y = FMath::Clamp<int32>(MeshCardsBounds.GetSize().Y * SamplesPerWorldUnit, MinSamplesPerAxis, MaxSamplesPerAxis);
    VolumeSizeInVoxels.Z = FMath::Clamp<int32>(MeshCardsBounds.GetSize().Z * SamplesPerWorldUnit, MinSamplesPerAxis, MaxSamplesPerAxis);

    // Size of a single voxel.
    const FVector VoxelExtent = MeshCardsBounds.GetSize() / FVector(VolumeSizeInVoxels);

    // Generate stratified random ray directions over the hemisphere.
    TArray<FVector4> RayDirectionsOverHemisphere;
    {
        FRandomStream RandomStream(0);
        MeshUtilities::GenerateStratifiedUniformHemisphereSamples(64, RandomStream, RayDirectionsOverHemisphere);
    }
    
    // Iterate over the 6 orientations and generate card data for each one.
    for (int32 Orientation = 0; Orientation < 6; ++Orientation)
    {
        // Initialize the heightfield and ray data.
        FIntPoint HeighfieldSize(0, 0);
        FVector RayDirection(0.0f, 0.0f, 0.0f);
        FVector RayOriginFrame = MeshCardsBounds.Min;
        FVector HeighfieldStepX(0.0f, 0.0f, 0.0f);
        FVector HeighfieldStepY(0.0f, 0.0f, 0.0f);
        float MaxRayT = 0.0f;
        int32 MeshSliceNum = 0;

        // Adjust the height field and ray data according to the orientation.
        switch (Orientation / 2)
        {
            case 0: // Orientation: -X, +X
                MaxRayT = MeshCardsBounds.GetSize().X + 0.1f;
                MeshSliceNum = VolumeSizeInVoxels.X;
                HeighfieldSize.X = VolumeSizeInVoxels.Y;
                HeighfieldSize.Y = VolumeSizeInVoxels.Z;
                HeighfieldStepX = FVector(0.0f, MeshCardsBounds.GetSize().Y / HeighfieldSize.X, 0.0f);
                HeighfieldStepY = FVector(0.0f, 0.0f, MeshCardsBounds.GetSize().Z / HeighfieldSize.Y);
                break;

            case 1: // Orientation: -Y, +Y
                MaxRayT = MeshCardsBounds.GetSize().Y + 0.1f;
                MeshSliceNum = VolumeSizeInVoxels.Y;
                HeighfieldSize.X = VolumeSizeInVoxels.X;
                HeighfieldSize.Y = VolumeSizeInVoxels.Z;
                HeighfieldStepX = FVector(MeshCardsBounds.GetSize().X / HeighfieldSize.X, 0.0f, 0.0f);
                HeighfieldStepY = FVector(0.0f, 0.0f, MeshCardsBounds.GetSize().Z / HeighfieldSize.Y);
                break;

            case 2: // Orientation: -Z, +Z
                MaxRayT = MeshCardsBounds.GetSize().Z + 0.1f;
                MeshSliceNum = VolumeSizeInVoxels.Z;
                HeighfieldSize.X = VolumeSizeInVoxels.X;
                HeighfieldSize.Y = VolumeSizeInVoxels.Y;
                HeighfieldStepX = FVector(MeshCardsBounds.GetSize().X / HeighfieldSize.X, 0.0f, 0.0f);
                HeighfieldStepY = FVector(0.0f, MeshCardsBounds.GetSize().Y / HeighfieldSize.Y, 0.0f);
                break;
        }

        // Adjust the ray direction (and ray origin plane) according to the orientation.
        switch (Orientation)
        {
            case 0: 
                RayDirection.X = +1.0f; 
                break;

            case 1: 
                RayDirection.X = -1.0f; 
                RayOriginFrame.X = MeshCardsBounds.Max.X;
                break;

            case 2: 
                RayDirection.Y = +1.0f; 
                break;

            case 3: 
                RayDirection.Y = -1.0f; 
                RayOriginFrame.Y = MeshCardsBounds.Max.Y;
                break;

            case 4: 
                RayDirection.Z = +1.0f; 
                break;

            case 5: 
                RayDirection.Z = -1.0f; 
                RayOriginFrame.Z = MeshCardsBounds.Max.Z;
                break;

            default: 
                check(false);
        };

        TArray<TArray<FSurfacePoint, TInlineAllocator<16>>> HeightfieldLayers;
        HeightfieldLayers.SetNum(HeighfieldSize.X * HeighfieldSize.Y);

        // Fill in the surface point data.
        {
            TRACE_CPUPROFILER_EVENT_SCOPE(FillSurfacePoints);

            TArray<float> Heightfield;
            Heightfield.SetNum(HeighfieldSize.X * HeighfieldSize.Y);
            for (int32 HeighfieldY = 0; HeighfieldY < HeighfieldSize.Y; ++HeighfieldY)
            {
                for (int32 HeighfieldX = 0; HeighfieldX < HeighfieldSize.X; ++HeighfieldX)
                {
                    Heightfield[HeighfieldX + HeighfieldY * HeighfieldSize.X] = -1.0f;
                }
            }

            for (int32 HeighfieldY = 0; HeighfieldY < HeighfieldSize.Y; ++HeighfieldY)
            {
                for (int32 HeighfieldX = 0; HeighfieldX < HeighfieldSize.X; ++HeighfieldX)
                {
                    FVector RayOrigin = RayOriginFrame;
                    RayOrigin += (HeighfieldX + 0.5f) * HeighfieldStepX;
                    RayOrigin += (HeighfieldY + 0.5f) * HeighfieldStepY;

                    float StepTMin = 0.0f;

                    for (int32 StepIndex = 0; StepIndex < 64; ++StepIndex)
                    {
                        FEmbreeRay EmbreeRay;
                        EmbreeRay.ray.org_x = RayOrigin.X;
                        EmbreeRay.ray.org_y = RayOrigin.Y;
                        EmbreeRay.ray.org_z = RayOrigin.Z;
                        EmbreeRay.ray.dir_x = RayDirection.X;
                        EmbreeRay.ray.dir_y = RayDirection.Y;
                        EmbreeRay.ray.dir_z = RayDirection.Z;
                        EmbreeRay.ray.tnear = StepTMin;
                        EmbreeRay.ray.tfar = FLT_MAX;

                        FEmbreeIntersectionContext EmbreeContext;
                        rtcInitIntersectContext(&EmbreeContext);
                        rtcIntersect1(Context.FullMeshEmbreeScene, &EmbreeContext, &EmbreeRay);

                        if (EmbreeRay.hit.geomID != RTC_INVALID_GEOMETRY_ID && EmbreeRay.hit.primID != RTC_INVALID_GEOMETRY_ID)
                        {
                            const FVector SurfacePoint = RayOrigin + RayDirection * EmbreeRay.ray.tfar;
                            const FVector SurfaceNormal = EmbreeRay.GetHitNormal();

                            const float NdotD = FVector::DotProduct(RayDirection, SurfaceNormal);
                            const bool bPassCullTest = EmbreeContext.IsHitTwoSided() || NdotD <= 0.0f;
                            const bool bPassProjectionAngleTest = FMath::Abs(NdotD) >= FMath::Cos(75.0f * (PI / 180.0f));

                            const float MinDistanceBetweenPoints = (MaxRayT / MeshSliceNum);
                            const bool bPassDistanceToAnotherSurfaceTest = EmbreeRay.ray.tnear <= 0.0f || (EmbreeRay.ray.tfar - EmbreeRay.ray.tnear > MinDistanceBetweenPoints);

                            if (bPassCullTest && bPassProjectionAngleTest && bPassDistanceToAnotherSurfaceTest)
                            {
                                const bool bIsInsideMesh = IsSurfacePointInsideMesh(Context.FullMeshEmbreeScene, SurfacePoint, SurfaceNormal, RayDirectionsOverHemisphere);
                                if (!bIsInsideMesh)
                                {
                                    HeightfieldLayers[HeighfieldX + HeighfieldY * HeighfieldSize.X].Add(
                                        { EmbreeRay.ray.tnear, EmbreeRay.ray.tfar }
                                    );
                                }
                            }

                            StepTMin = EmbreeRay.ray.tfar + 0.01f;
                        }
                        else
                        {
                            break;
                        }
                    }
                }
            }
        }

        const int32 MinCardHits = FMath::Floor(HeighfieldSize.X * HeighfieldSize.Y * MinSurfaceThreshold);

        TArray<FPlacedCard, TInlineAllocator<16>> PlacedCards;
        int32 PlacedCardsHits = 0;

        // Place one default card.
        {
            FPlacedCard PlacedCard;
            PlacedCard.SliceMin = 0;
            PlacedCard.SliceMax = MeshSliceNum;
            PlacedCards.Add(PlacedCard);

            PlacedCardsHits = UpdatePlacedCards(PlacedCards, RayOriginFrame, RayDirection, HeighfieldStepX, HeighfieldStepY, HeighfieldSize, MeshSliceNum, MaxRayT, MinCardHits, VoxelExtent, HeightfieldLayers);

            if (PlacedCardsHits < MinCardHits)
            {
                PlacedCards.Reset();
            }
        }

        SerializePlacedCards(PlacedCards, /*LOD level*/ 0, Orientation, MinCardHits, MeshCardsBounds, OutData);

        // Try to place more cards by splitting the existing ones.
        for (uint32 CardPlacementIteration = 0; CardPlacementIteration < 4; ++CardPlacementIteration)
        {
            TArray<FPlacedCard, TInlineAllocator<16>> BestPlacedCards;
            int32 BestPlacedCardHits = PlacedCardsHits;

            for (int32 PlacedCardIndex = 0; PlacedCardIndex < PlacedCards.Num(); ++PlacedCardIndex)
            {
                const FPlacedCard& PlacedCard = PlacedCards[PlacedCardIndex];
                for (int32 SliceIndex = PlacedCard.SliceMin + 2; SliceIndex < PlacedCard.SliceMax; ++SliceIndex)
                {
                    TArray<FPlacedCard, TInlineAllocator<16>> TempPlacedCards(PlacedCards);

                    FPlacedCard NewPlacedCard;
                    NewPlacedCard.SliceMin = SliceIndex;
                    NewPlacedCard.SliceMax = PlacedCard.SliceMax;

                    TempPlacedCards[PlacedCardIndex].SliceMax = SliceIndex - 1;
                    TempPlacedCards.Insert(NewPlacedCard, PlacedCardIndex + 1);

                    const int32 NumHits = UpdatePlacedCards(TempPlacedCards, RayOriginFrame, RayDirection, HeighfieldStepX, HeighfieldStepY, HeighfieldSize, MeshSliceNum, MaxRayT, MinCardHits, VoxelExtent, HeightfieldLayers);

                    if (NumHits > BestPlacedCardHits)
                    {
                        BestPlacedCards = TempPlacedCards;
                        BestPlacedCardHits = NumHits;
                    }
                }
            }

            if (BestPlacedCardHits >= PlacedCardsHits + MinCardHits)
            {
                PlacedCards = BestPlacedCards;
                PlacedCardsHits = BestPlacedCardHits;
            }
        }

        SerializePlacedCards(PlacedCards, /*LOD level*/ 1, Orientation, MinCardHits, MeshCardsBounds, OutData);
    } // for (int32 Orientation = 0; Orientation < 6; ++Orientation)
}
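
A few of the small decisions inside BuildMeshCards are easier to see in isolation. The sketch below is a standalone illustration, not engine code: Vec3 and the three function names are hypothetical, while the constants (one sample per 10 world units, the clamp to [4, 64], the 75 degree angle limit) and the orientation-to-direction mapping are taken from the listing above.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical stand-in for FVector.
struct Vec3
{
    float X;
    float Y;
    float Z;
};

// Voxel count along one axis: one sample per 10 world units,
// truncated to an integer and clamped to [4, 64].
int VoxelsAlongAxis(float SizeInWorldUnits)
{
    const float SamplesPerWorldUnit = 1.0f / 10.0f;
    const int MinSamplesPerAxis = 4;
    const int MaxSamplesPerAxis = 64;
    const int Samples = static_cast<int>(SizeInWorldUnits * SamplesPerWorldUnit);
    return std::max(MinSamplesPerAxis, std::min(MaxSamplesPerAxis, Samples));
}

// Orientations 0..5 map to the ray directions +X, -X, +Y, -Y, +Z, -Z.
Vec3 RayDirectionForOrientation(int Orientation)
{
    const float Sign = (Orientation % 2 == 0) ? 1.0f : -1.0f;
    Vec3 Direction = { 0.0f, 0.0f, 0.0f };
    switch (Orientation / 2)
    {
        case 0: Direction.X = Sign; break;
        case 1: Direction.Y = Sign; break;
        default: Direction.Z = Sign; break;
    }
    return Direction;
}

// A hit is kept only if the angle between the ray and the surface normal
// (or its opposite) is at most 75 degrees, i.e. |N.D| >= cos(75 deg).
bool PassesProjectionAngleTest(float NdotD)
{
    const float Pi = 3.14159265358979f;
    return std::fabs(NdotD) >= std::cos(75.0f * (Pi / 180.0f));
}
```

For example, a mesh that is 255 units wide gets 25 voxels on that axis, while a very small mesh still gets the minimum of 4 and a very large one saturates at 64.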
           

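The code above shows that building the card data relies on height field ray tracing to speed things up, a technique that has been around for many years. Its core idea is to discretize the mesh into equally sized 3D voxels and then, at the chosen resolution, cast one ray from the camera position through each pixel and test it for intersection against those voxels, which renders the silhouette of the height field. The silhouette is the boundary that divides the screen into the region covered by the height field and the region above it: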

[Image]

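A silhouette obtained this way shows obvious aliasing. The paper Ray Tracing Height Fields offers several ways to reconstruct the surface data and reduce that aliasing, including height field planes, linearly approximated planes, triangles, and bilinear surfaces.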
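
To make the idea concrete, here is a minimal, generic sketch of marching a ray against a height field with bilinear reconstruction. It is neither the paper's algorithm nor UE's implementation (the listing above traces the triangle mesh through Embree); HeightField, Ray, SampleBilinear, and MarchHeightField are hypothetical names, and the fixed-step march is a simplification chosen for brevity.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical regular grid of heights, stored row-major.
struct HeightField
{
    int Width = 0;
    int Height = 0;
    std::vector<float> Samples; // Width * Height entries

    // Fetch one sample, clamping coordinates to the grid.
    float At(int X, int Y) const
    {
        X = std::max(0, std::min(Width - 1, X));
        Y = std::max(0, std::min(Height - 1, Y));
        return Samples[X + Y * Width];
    }

    // Bilinear reconstruction: blend the four surrounding samples
    // instead of returning a single flat voxel height.
    float SampleBilinear(float X, float Y) const
    {
        const float FloorX = std::floor(X);
        const float FloorY = std::floor(Y);
        const int X0 = static_cast<int>(FloorX);
        const int Y0 = static_cast<int>(FloorY);
        const float Tx = X - FloorX;
        const float Ty = Y - FloorY;
        const float Row0 = At(X0, Y0) * (1.0f - Tx) + At(X0 + 1, Y0) * Tx;
        const float Row1 = At(X0, Y0 + 1) * (1.0f - Tx) + At(X0 + 1, Y0 + 1) * Tx;
        return Row0 * (1.0f - Ty) + Row1 * Ty;
    }
};

// Hypothetical ray: origin (Ox, Oy, Oz) and direction (Dx, Dy, Dz).
struct Ray
{
    float Ox, Oy, Oz;
    float Dx, Dy, Dz;
};

// March the ray in fixed steps and return the first distance T at which
// the ray is at or below the reconstructed surface, or -1 if it never is.
float MarchHeightField(const HeightField& Field, const Ray& R, float MaxT, float StepT)
{
    const int NumSteps = static_cast<int>(MaxT / StepT);
    for (int Step = 0; Step <= NumSteps; ++Step)
    {
        const float T = Step * StepT;
        const float Px = R.Ox + R.Dx * T;
        const float Py = R.Oy + R.Dy * T;
        const float Pz = R.Oz + R.Dz * T;
        if (Pz <= Field.SampleBilinear(Px, Py))
        {
            return T;
        }
    }
    return -1.0f;
}
```

With bilinear sampling, the height halfway between a sample of 0 and a sample of 2 reads as 1 rather than snapping to either neighbor, which is what smooths the stair-stepped silhouette.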

After the build described above, we get mesh card data like the following:

[Image]

Top: the normal mesh; bottom: visualization of the mesh card data.

Mesh card data has LODs, and the LOD level is selected according to the distance from the camera.

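In addition, the mesh distance field data built by UE5 has been improved: sparse storage raises its precision (left in the figure below), which is clearly better than UE4 (right in the figure below).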

[Image]

Lumen's main rendering flow still lives in FDeferredShadingSceneRenderer::Render:

void FDeferredShadingSceneRenderer::Render(FRDGBuilder& GraphBuilder)
{
    (......)
    
    bool bAnyLumenEnabled = false;
    if (!IsSimpleForwardShadingEnabled(ShaderPlatform))
    {
        (......)

        // Check whether any view has Lumen enabled.
        for (int32 ViewIndex = 0; ViewIndex < Views.Num(); ViewIndex++)
        {
            FViewInfo& View = Views[ViewIndex];
            bAnyLumenEnabled = bAnyLumenEnabled 
                || GetViewPipelineState(View).DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen
                || GetViewPipelineState(View).ReflectionsMethod == EReflectionsMethod::Lumen;
        }

        (......)
    }
    
    (......)
    
    // PrePass.
    RenderPrePass(...);
    
    (......)
    
    // Update the Lumen scene.
    UpdateLumenScene(GraphBuilder);

    // If occlusion culling runs before the BasePass, render the Lumen scene lighting before RenderBasePass.
    // bOcclusionBeforeBasePass is false by default.
    if (bOcclusionBeforeBasePass)
    {
        {
            LLM_SCOPE_BYTAG(Lumen);
            RenderLumenSceneLighting(GraphBuilder, Views[0]);
        }

        ComputeVolumetricFog(GraphBuilder);
    }
    
    (......)
    
    // BasePass.
    RenderBasePass(...);
    
    (......)
    
    // Lumen lighting after the BasePass.
    if (!bOcclusionBeforeBasePass)
    {
        const bool bAfterBasePass = true;
        // Render shadows.
        AllocateVirtualShadowMaps(bAfterBasePass);
        RenderShadowDepthMaps(GraphBuilder, InstanceCullingManager);
        
        {
            LLM_SCOPE_BYTAG(Lumen);
            // Render the Lumen scene lighting.
            RenderLumenSceneLighting(GraphBuilder, Views[0]);
        }

        AddServiceLocalQueuePass(GraphBuilder);
    }
    
    (......)
    
    // Render the Lumen visualization.
    RenderLumenSceneVisualization(GraphBuilder, SceneTextures);
    // Render diffuse indirect lighting and AO.
    RenderDiffuseIndirectAndAmbientOcclusion(GraphBuilder, SceneTextures, LightingChannelsTexture, true);
    
    (......)
}
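
The ordering in the listing above can be condensed into a small standalone sketch. This is only an illustration of the control flow: BuildLumenPassOrder is a hypothetical helper that returns pass names as strings, passes unrelated to Lumen (such as ComputeVolumetricFog and everything elided by the ellipses) are left out, and nothing here calls into the engine.

```cpp
#include <string>
#include <vector>

// Returns the order of the Lumen-related steps from the listing above.
// The only branch is whether scene lighting runs before or after the BasePass.
std::vector<std::string> BuildLumenPassOrder(bool bOcclusionBeforeBasePass)
{
    std::vector<std::string> Passes;
    Passes.push_back("RenderPrePass");
    Passes.push_back("UpdateLumenScene");

    if (bOcclusionBeforeBasePass)
    {
        // Lighting is computed early, before the BasePass.
        Passes.push_back("RenderLumenSceneLighting");
    }

    Passes.push_back("RenderBasePass");

    if (!bOcclusionBeforeBasePass)
    {
        // Default path: shadows first, then Lumen scene lighting.
        Passes.push_back("RenderShadowDepthMaps");
        Passes.push_back("RenderLumenSceneLighting");
    }

    Passes.push_back("RenderLumenSceneVisualization");
    Passes.push_back("RenderDiffuseIndirectAndAmbientOcclusion");
    return Passes;
}
```

Since bOcclusionBeforeBasePass is false by default, the usual order is PrePass, scene update, BasePass, shadow depth maps, scene lighting, and finally the indirect lighting passes.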
           

The red boxes below mark Lumen's execution steps in a RenderDoc frame capture:

[Image]

Lumen's lighting consists mainly of two stages: updating the scene (UpdateLumenScene) and computing the scene lighting (RenderLumenSceneLighting).

Updating the Lumen scene is mainly handled by UpdateLumenScene:

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenSceneRendering.cpp

void FDeferredShadingSceneRenderer::UpdateLumenScene(FRDGBuilder& GraphBuilder)
{
    LLM_SCOPE_BYTAG(Lumen);

    FViewInfo& View = Views[0];
    const FPerViewPipelineState& ViewPipelineState = GetViewPipelineState(View);
    const bool bAnyLumenActive = ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen || ViewPipelineState.ReflectionsMethod == EReflectionsMethod::Lumen;

    if (bAnyLumenActive
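        // Only the main view updates the scene (planar reflections, scene captures, and reflection captures are skipped).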
        && !View.bIsPlanarReflection 
        && !View.bIsSceneCapture
        && !View.bIsReflectionCapture
        && View.ViewState)
    {
        const double StartTime = FPlatformTime::Seconds();

        // Get the Lumen scene and card data.
        FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;
        TArray<FCardRenderData, SceneRenderingAllocator>& CardsToRender = LumenCardRenderer.CardsToRender;

        RDG_EVENT_SCOPE(GraphBuilder, "UpdateLumenScene: %u card captures %.3fM texels", CardsToRender.Num(), LumenCardRenderer.NumCardTexelsToCapture / 1e6f);

        // Update the card scene buffer.
        UpdateCardSceneBuffer(GraphBuilder.RHICmdList, ViewFamily, Scene);

        // The Lumen primitive mapping buffer was updated, so the view uniform buffer has to be recreated.
        Lumen::SetupViewUniformBufferParameters(Scene, *View.CachedViewUniformShaderParameters);
        View.ViewUniformBuffer = TUniformBufferRef<FViewUniformShaderParameters>::CreateUniformBufferImmediate(*View.CachedViewUniformShaderParameters, UniformBuffer_SingleFrame);
        
        LumenCardRenderer.CardIdsToRender.Empty(CardsToRender.Num());

        // Temporary depth buffer used for capturing cards.
        const FRDGTextureDesc DepthStencilAtlasDesc = FRDGTextureDesc::Create2D(LumenSceneData.MaxAtlasSize, PF_DepthStencil, FClearValueBinding::DepthZero, TexCreate_ShaderResource | TexCreate_DepthStencilTargetable | TexCreate_NoFastClear);
        FRDGTextureRef DepthStencilAtlasTexture = GraphBuilder.CreateTexture(DepthStencilAtlasDesc, TEXT("Lumen.DepthStencilAtlas"));

        if (CardsToRender.Num() > 0)
        {
            FRHIBuffer* PrimitiveIdVertexBuffer = nullptr;
            FInstanceCullingResult InstanceCullingResult;
            // Cull the cards; both GPU and non-GPU culling are supported.
#if GPUCULL_TODO
            if (Scene->GPUScene.IsEnabled())
            {
                int32 MaxInstances = 0;
                int32 VisibleMeshDrawCommandsNum = 0;
                int32 NewPassVisibleMeshDrawCommandsNum = 0;

                FInstanceCullingContext InstanceCullingContext(nullptr, TArrayView<const int32>(&View.GPUSceneViewId, 1));

                SetupGPUInstancedDraws(InstanceCullingContext, LumenCardRenderer.MeshDrawCommands, false, MaxInstances, VisibleMeshDrawCommandsNum, NewPassVisibleMeshDrawCommandsNum);
                // Not supposed to do any compaction here.
                ensure(VisibleMeshDrawCommandsNum == LumenCardRenderer.MeshDrawCommands.Num());

                InstanceCullingContext.BuildRenderingCommands(GraphBuilder, Scene->GPUScene, View.DynamicPrimitiveCollector.GetPrimitiveIdRange(), InstanceCullingResult);
            }
            else
#endif // GPUCULL_TODO
            {
                // Prepare primitive Id VB for rendering mesh draw commands.
                if (LumenCardRenderer.MeshDrawPrimitiveIds.Num() > 0)
                {
                    const uint32 PrimitiveIdBufferDataSize = LumenCardRenderer.MeshDrawPrimitiveIds.Num() * sizeof(int32);

                    FPrimitiveIdVertexBufferPoolEntry Entry = GPrimitiveIdVertexBufferPool.Allocate(PrimitiveIdBufferDataSize);
                    PrimitiveIdVertexBuffer = Entry.BufferRHI;

                    void* RESTRICT Data = RHILockBuffer(PrimitiveIdVertexBuffer, 0, PrimitiveIdBufferDataSize, RLM_WriteOnly);
                    FMemory::Memcpy(Data, LumenCardRenderer.MeshDrawPrimitiveIds.GetData(), PrimitiveIdBufferDataSize);
                    RHIUnlockBuffer(PrimitiveIdVertexBuffer);

                    GPrimitiveIdVertexBufferPool.ReturnToFreeList(Entry);
                }
            }
            FRDGTextureRef AlbedoAtlasTexture = GraphBuilder.RegisterExternalTexture(LumenSceneData.AlbedoAtlas);
            FRDGTextureRef NormalAtlasTexture = GraphBuilder.RegisterExternalTexture(LumenSceneData.NormalAtlas);
            FRDGTextureRef EmissiveAtlasTexture = GraphBuilder.RegisterExternalTexture(LumenSceneData.EmissiveAtlas);

            uint32 NumRects = 0;
            FRDGBufferRef RectMinMaxBuffer = nullptr;
            {
                // Upload card ids for the batched draws that operate on the cards to render.
                TArray<FUintVector4, SceneRenderingAllocator> RectMinMaxToRender;
                RectMinMaxToRender.Reserve(CardsToRender.Num());
                for (const FCardRenderData& CardRenderData : CardsToRender)
                {
                    FIntRect AtlasRect = CardRenderData.AtlasAllocation;

                    FUintVector4 Rect;
                    Rect.X = FMath::Max(AtlasRect.Min.X, 0);
                    Rect.Y = FMath::Max(AtlasRect.Min.Y, 0);
                    Rect.Z = FMath::Max(AtlasRect.Max.X, 0);
                    Rect.W = FMath::Max(AtlasRect.Max.Y, 0);
                    RectMinMaxToRender.Add(Rect);
                }

                NumRects = CardsToRender.Num();
                RectMinMaxBuffer = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateUploadDesc(sizeof(FUintVector4), FMath::RoundUpToPowerOfTwo(NumRects)), TEXT("Lumen.RectMinMaxBuffer"));

                FPixelShaderUtils::UploadRectMinMaxBuffer(GraphBuilder, RectMinMaxToRender, RectMinMaxBuffer);

                FRDGBufferSRVRef RectMinMaxBufferSRV = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(RectMinMaxBuffer, PF_R32G32B32A32_UINT));
                ClearLumenCards(GraphBuilder, View, AlbedoAtlasTexture, NormalAtlasTexture, EmissiveAtlasTexture, DepthStencilAtlasTexture, LumenSceneData.MaxAtlasSize, RectMinMaxBufferSRV, NumRects);
            }

            // Snapshot the view info.
            FViewInfo* SharedView = View.CreateSnapshot();
            {
                SharedView->DynamicPrimitiveCollector = FGPUScenePrimitiveCollector(&GetGPUSceneDynamicContext());
                SharedView->StereoPass = eSSP_FULL;
                SharedView->DrawDynamicFlags = EDrawDynamicFlags::ForceLowestLOD;

                // Don't do material texture mip biasing in proxy card rendering
                SharedView->MaterialTextureMipBias = 0;

                TRefCountPtr<IPooledRenderTarget> NullRef;
                FPlatformMemory::Memcpy(&SharedView->PrevViewInfo.HZB, &NullRef, sizeof(SharedView->PrevViewInfo.HZB));

                SharedView->CachedViewUniformShaderParameters = MakeUnique<FViewUniformShaderParameters>();
                SharedView->CachedViewUniformShaderParameters->PrimitiveSceneData = Scene->GPUScene.PrimitiveBuffer.SRV;
                SharedView->CachedViewUniformShaderParameters->InstanceSceneData = Scene->GPUScene.InstanceDataBuffer.SRV;
                SharedView->CachedViewUniformShaderParameters->LightmapSceneData = Scene->GPUScene.LightmapDataBuffer.SRV;
                SharedView->ViewUniformBuffer = TUniformBufferRef<FViewUniformShaderParameters>::CreateUniformBufferImmediate(*SharedView->CachedViewUniformShaderParameters, UniformBuffer_SingleFrame);
            }

            // Set up the scene texture uniform parameters.
            FLumenCardPassUniformParameters* PassUniformParameters = GraphBuilder.AllocParameters<FLumenCardPassUniformParameters>();
            SetupSceneTextureUniformParameters(GraphBuilder, Scene->GetFeatureLevel(), /*SceneTextureSetupMode*/ ESceneTextureSetupMode::None, PassUniformParameters->SceneTextures);

            // Capture the mesh cards.
            {
                FLumenCardPassParameters* PassParameters = GraphBuilder.AllocParameters<FLumenCardPassParameters>();
                PassParameters->View = Scene->UniformBuffers.LumenCardCaptureViewUniformBuffer;
                PassParameters->CardPass = GraphBuilder.CreateUniformBuffer(PassUniformParameters);
                PassParameters->RenderTargets[0] = FRenderTargetBinding(AlbedoAtlasTexture, ERenderTargetLoadAction::ELoad);
                PassParameters->RenderTargets[1] = FRenderTargetBinding(NormalAtlasTexture, ERenderTargetLoadAction::ELoad);
                PassParameters->RenderTargets[2] = FRenderTargetBinding(EmissiveAtlasTexture, ERenderTargetLoadAction::ELoad);
                PassParameters->RenderTargets.DepthStencil = FDepthStencilBinding(DepthStencilAtlasTexture, ERenderTargetLoadAction::ELoad, FExclusiveDepthStencil::DepthWrite_StencilNop);

                InstanceCullingResult.GetDrawParameters(PassParameters->InstanceCullingDrawParams);

                // Mesh card capture pass.
                GraphBuilder.AddPass(
                    RDG_EVENT_NAME("MeshCardCapture"),
                    PassParameters,
                    ERDGPassFlags::Raster,
                    [this, Scene = Scene, PrimitiveIdVertexBuffer, SharedView, &CardsToRender, PassParameters](FRHICommandList& RHICmdList)
                    {
                        QUICK_SCOPE_CYCLE_COUNTER(MeshPass);

                        // For every card to render, prepare its data and submit the draw commands.
                        for (FCardRenderData& CardRenderData : CardsToRender)
                        {
                            if (CardRenderData.NumMeshDrawCommands > 0)
                            {
                                FIntRect AtlasRect = CardRenderData.AtlasAllocation;
                                RHICmdList.SetViewport(AtlasRect.Min.X, AtlasRect.Min.Y, 0.0f, AtlasRect.Max.X, AtlasRect.Max.Y, 1.0f);

                                CardRenderData.PatchView(RHICmdList, Scene, SharedView);
                                Scene->UniformBuffers.LumenCardCaptureViewUniformBuffer.UpdateUniformBufferImmediate(*SharedView->CachedViewUniformShaderParameters);

                                FGraphicsMinimalPipelineStateSet GraphicsMinimalPipelineStateSet;
#if GPUCULL_TODO
                                if (Scene->GPUScene.IsEnabled())
                                {
                                    FRHIBuffer* DrawIndirectArgsBuffer = nullptr;
                                    FRHIBuffer* InstanceIdOffsetBuffer = nullptr;
                                    FInstanceCullingDrawParams& InstanceCullingDrawParams = PassParameters->InstanceCullingDrawParams;
                                    if (InstanceCullingDrawParams.DrawIndirectArgsBuffer != nullptr && InstanceCullingDrawParams.InstanceIdOffsetBuffer != nullptr)
                                    {
                                        DrawIndirectArgsBuffer = InstanceCullingDrawParams.DrawIndirectArgsBuffer->GetRHI();
                                        InstanceIdOffsetBuffer = InstanceCullingDrawParams.InstanceIdOffsetBuffer->GetRHI();
                                    }

                                    // With GPU culling, call the GPU-instanced submit function.
                                    SubmitGPUInstancedMeshDrawCommandsRange(
                                        LumenCardRenderer.MeshDrawCommands,
                                        GraphicsMinimalPipelineStateSet,
                                        CardRenderData.StartMeshDrawCommandIndex,
                                        CardRenderData.NumMeshDrawCommands,
                                        1,
                                        InstanceIdOffsetBuffer,
                                        DrawIndirectArgsBuffer,
                                        RHICmdList);
                                }
                                else
#endif // GPUCULL_TODO
                                {
                                    // Without GPU culling, call the regular submit function.
                                    SubmitMeshDrawCommandsRange(
                                        LumenCardRenderer.MeshDrawCommands,
                                        GraphicsMinimalPipelineStateSet,
                                        PrimitiveIdVertexBuffer,
                                        0,
                                        false,
                                        CardRenderData.StartMeshDrawCommandIndex,
                                        CardRenderData.NumMeshDrawCommands,
                                        1,
                                        RHICmdList);
                                }
                            }
                        }
                    }
                );
            }

            // Record the ids of the cards to render and check whether any Nanite meshes need rendering.
            bool bAnyNaniteMeshes = false;
            for (FCardRenderData& CardRenderData : CardsToRender)
            {
                bAnyNaniteMeshes = bAnyNaniteMeshes || CardRenderData.NaniteInstanceIds.Num() > 0 || CardRenderData.bDistantScene;
                LumenCardRenderer.CardIdsToRender.Add(CardRenderData.CardIndex);
            }

            // Render the Nanite meshes of the Lumen scene.
            if (UseNanite(ShaderPlatform) && ViewFamily.EngineShowFlags.NaniteMeshes && bAnyNaniteMeshes)
            {
                TRACE_CPUPROFILER_EVENT_SCOPE(NaniteMeshPass);
                QUICK_SCOPE_CYCLE_COUNTER(NaniteMeshPass);

                const FIntPoint DepthStencilAtlasSize = DepthStencilAtlasDesc.Extent;
                const FIntRect DepthAtlasRect = FIntRect(0, 0, DepthStencilAtlasSize.X, DepthStencilAtlasSize.Y);
                FRDGBufferSRVRef RectMinMaxBufferSRV = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(RectMinMaxBuffer, PF_R32G32B32A32_UINT));

                // Raster context.
                Nanite::FRasterContext RasterContext = Nanite::InitRasterContext(
                    GraphBuilder,
                    FeatureLevel,
                    DepthStencilAtlasSize,
                    Nanite::EOutputBufferMode::VisBuffer,
                    true,
                    RectMinMaxBufferSRV,
                    NumRects);

                const bool bUpdateStreaming = false;
                const bool bSupportsMultiplePasses = true;
                const bool bForceHWRaster = RasterContext.RasterScheduling == Nanite::ERasterScheduling::HardwareOnly;
                // Not the primary context (to distinguish it from Nanite's main pass).
                const bool bPrimaryContext = false;

                // Culling context.
                Nanite::FCullingContext CullingContext = Nanite::InitCullingContext(
                    GraphBuilder,
                    *Scene,
                    nullptr,
                    FIntRect(),
                    false,
                    bUpdateStreaming,
                    bSupportsMultiplePasses,
                    bForceHWRaster,
                    bPrimaryContext);

                // Multi-view rendering.
                if (GLumenSceneNaniteMultiViewCapture)
                {
                    const uint32 NumCardsToRender = CardsToRender.Num();

                    // The outer while loop splits the cards into batches so that no batch exceeds MAX_VIEWS_PER_CULL_RASTERIZE_PASS views.
                    uint32 NextCardIndex = 0;
                    while(NextCardIndex < NumCardsToRender)
                    {
                        TArray<Nanite::FPackedView, SceneRenderingAllocator> NaniteViews;
                        TArray<Nanite::FInstanceDraw, SceneRenderingAllocator> NaniteInstanceDraws;

                        // Build an FPackedViewParams for each card to render and add the packed view to NaniteViews, until NaniteViews reaches the maximum view count.
                        while(NextCardIndex < NumCardsToRender && NaniteViews.Num() < MAX_VIEWS_PER_CULL_RASTERIZE_PASS)
                        {
                            const FCardRenderData& CardRenderData = CardsToRender[NextCardIndex];

                            if(CardRenderData.NaniteInstanceIds.Num() > 0)
                            {
                                for(uint32 InstanceID : CardRenderData.NaniteInstanceIds)
                                {
                                    NaniteInstanceDraws.Add(Nanite::FInstanceDraw { InstanceID, (uint32)NaniteViews.Num() });
                                }

                                Nanite::FPackedViewParams Params;
                                Params.ViewMatrices = CardRenderData.ViewMatrices;
                                Params.PrevViewMatrices = CardRenderData.ViewMatrices;
                                Params.ViewRect = CardRenderData.AtlasAllocation;
                                Params.RasterContextSize = DepthStencilAtlasSize;
                                Params.LODScaleFactor = CardRenderData.NaniteLODScaleFactor;
                                NaniteViews.Add(Nanite::CreatePackedView(Params));
                            }

                            NextCardIndex++;
                        }

                        // Rasterize the cards.
                        if (NaniteInstanceDraws.Num() > 0)
                        {
                            RDG_EVENT_SCOPE(GraphBuilder, "Nanite::RasterizeLumenCards");

                            Nanite::FRasterState RasterState;
                            Nanite::CullRasterize(
                                GraphBuilder,
                                *Scene,
                                NaniteViews,
                                CullingContext,
                                RasterContext,
                                RasterState,
                                &NaniteInstanceDraws
                            );
                        }
                    }
                }
                else // Single-view rendering
                {
                    RDG_EVENT_SCOPE(GraphBuilder, "RenderLumenCardsWithNanite");

                    // Single-view rendering is brute force: walk all cards to render linearly, build one view per card, and issue one draw per card.
                    for(FCardRenderData& CardRenderData : CardsToRender)
                    {
                        if(CardRenderData.NaniteInstanceIds.Num() > 0)
                        {                        
                            TArray<Nanite::FInstanceDraw, SceneRenderingAllocator> NaniteInstanceDraws;
                            for( uint32 InstanceID : CardRenderData.NaniteInstanceIds )
                            {
                                NaniteInstanceDraws.Add( Nanite::FInstanceDraw { InstanceID, 0u } );
                            }
                        
                            CardRenderData.PatchView(GraphBuilder.RHICmdList, Scene, SharedView);
                            Nanite::FPackedView PackedView = Nanite::CreatePackedViewFromViewInfo(*SharedView, DepthStencilAtlasSize, 0);

                            Nanite::CullRasterize(
                                GraphBuilder,
                                *Scene,
                                { PackedView },
                                CullingContext,
                                RasterContext,
                                Nanite::FRasterState(),
                                &NaniteInstanceDraws
                            );
                        }
                    }
                }

                extern float GLumenDistantSceneMinInstanceBoundsRadius;

                // Render the whole scene for distant cards.
                for (FCardRenderData& CardRenderData : CardsToRender)
                {
                    // bDistantScene marks whether this is a distant-scene card.
                    if (CardRenderData.bDistantScene)
                    {
                        Nanite::FRasterState RasterState;
                        RasterState.bNearClip = false;

                        CardRenderData.PatchView(GraphBuilder.RHICmdList, Scene, SharedView);
                        Nanite::FPackedView PackedView = Nanite::CreatePackedViewFromViewInfo(
                            *SharedView,
                            DepthStencilAtlasSize,
                            /*Flags*/ 0,
                            /*StreamingPriorityCategory*/ 0,
                            GLumenDistantSceneMinInstanceBoundsRadius,
                            Lumen::GetDistanceSceneNaniteLODScaleFactor());

                        Nanite::CullRasterize(
                            GraphBuilder,
                            *Scene,
                            { PackedView },
                            CullingContext,
                            RasterContext,
                            RasterState);
                    }
                }

                // Lumen mesh capture pass.
                Nanite::DrawLumenMeshCapturePass(
                    GraphBuilder,
                    *Scene,
                    SharedView,
                    CardsToRender,
                    CullingContext,
                    RasterContext,
                    PassUniformParameters,
                    RectMinMaxBufferSRV,
                    NumRects,
                    LumenSceneData.MaxAtlasSize,
                    AlbedoAtlasTexture,
                    NormalAtlasTexture,
                    EmissiveAtlasTexture,
                    DepthStencilAtlasTexture
                );
            }

            ConvertToExternalTexture(GraphBuilder, AlbedoAtlasTexture, LumenSceneData.AlbedoAtlas);
            ConvertToExternalTexture(GraphBuilder, NormalAtlasTexture, LumenSceneData.NormalAtlas);
            ConvertToExternalTexture(GraphBuilder, EmissiveAtlasTexture, LumenSceneData.EmissiveAtlas);
        }

        // Upload card data.
        {
            QUICK_SCOPE_CYCLE_COUNTER(UploadCardIndexBuffers);

            // Upload the card index buffer.
            {
                FRDGBufferRef CardIndexBuffer = GraphBuilder.CreateBuffer(
                    FRDGBufferDesc::CreateUploadDesc(sizeof(uint32), FMath::Max(LumenCardRenderer.CardIdsToRender.Num(), 1)),
                    TEXT("Lumen.CardsToRenderIndexBuffer"));

                FLumenCardIdUpload* PassParameters = GraphBuilder.AllocParameters<FLumenCardIdUpload>();
                PassParameters->CardIds = CardIndexBuffer;

                const uint32 CardIdBytes = LumenCardRenderer.CardIdsToRender.GetTypeSize() * LumenCardRenderer.CardIdsToRender.Num();
                const void* CardIdPtr = LumenCardRenderer.CardIdsToRender.GetData();

                GraphBuilder.AddPass(
                    RDG_EVENT_NAME("Upload CardsToRenderIndexBuffer NumIndices=%d", LumenCardRenderer.CardIdsToRender.Num()),
                    PassParameters,
                    ERDGPassFlags::Copy,
                    [PassParameters, CardIdBytes, CardIdPtr](FRHICommandListImmediate& RHICmdList)
                    {
                        if (CardIdBytes > 0)
                        {
                            void* DestCardIdPtr = RHILockBuffer(PassParameters->CardIds->GetRHI(), 0, CardIdBytes, RLM_WriteOnly);
                            FPlatformMemory::Memcpy(DestCardIdPtr, CardIdPtr, CardIdBytes);
                            RHIUnlockBuffer(PassParameters->CardIds->GetRHI());
                        }
                    });

                ConvertToExternalBuffer(GraphBuilder, CardIndexBuffer, LumenCardRenderer.CardsToRenderIndexBuffer);
            }

            // Upload the hash map buffer.
            {
                const uint32 NumHashMapUInt32 = FLumenCardRenderer::NumCardsToRenderHashMapBucketUInt32;
                const uint32 NumHashMapBytes = 4 * NumHashMapUInt32;
                const uint32 NumHashMapBuckets = 32 * NumHashMapUInt32;

                FRDGBufferRef CardHashMapBuffer = GraphBuilder.CreateBuffer(
                    FRDGBufferDesc::CreateUploadDesc(sizeof(uint32), NumHashMapUInt32),
                    TEXT("Lumen.CardsToRenderHashMapBuffer"));

                LumenCardRenderer.CardsToRenderHashMap.Init(0, NumHashMapBuckets);

                for (int32 CardIndex : LumenCardRenderer.CardIdsToRender)
                {
                    LumenCardRenderer.CardsToRenderHashMap[CardIndex % NumHashMapBuckets] = 1;
                }

                FLumenCardIdUpload* PassParameters = GraphBuilder.AllocParameters<FLumenCardIdUpload>();
                PassParameters->CardIds = CardHashMapBuffer;

                const void* HashMapDataPtr = LumenCardRenderer.CardsToRenderHashMap.GetData();

                GraphBuilder.AddPass(
                    RDG_EVENT_NAME("Upload CardsToRenderHashMapBuffer NumUInt32=%d", NumHashMapUInt32),
                    PassParameters,
                    ERDGPassFlags::Copy,
                    [PassParameters, NumHashMapBytes, HashMapDataPtr](FRHICommandListImmediate& RHICmdList)
                    {
                        if (NumHashMapBytes > 0)
                        {
                            void* DestCardIdPtr = RHILockBuffer(PassParameters->CardIds->GetRHI(), 0, NumHashMapBytes, RLM_WriteOnly);
                            FPlatformMemory::Memcpy(DestCardIdPtr, HashMapDataPtr, NumHashMapBytes);
                            RHIUnlockBuffer(PassParameters->CardIds->GetRHI());
                        }
                    });

                ConvertToExternalBuffer(GraphBuilder, CardHashMapBuffer, LumenCardRenderer.CardsToRenderHashMapBuffer);
            }

            // Upload the visible card index buffer.
            {
                FRDGBufferRef VisibleCardsIndexBuffer = GraphBuilder.CreateBuffer(
                    FRDGBufferDesc::CreateUploadDesc(sizeof(uint32), FMath::Max(LumenSceneData.VisibleCardsIndices.Num(), 1)),
                    TEXT("Lumen.VisibleCardsIndexBuffer"));

                FLumenCardIdUpload* PassParameters = GraphBuilder.AllocParameters<FLumenCardIdUpload>();
                PassParameters->CardIds = VisibleCardsIndexBuffer;

                const uint32 CardIdBytes = sizeof(uint32) * LumenSceneData.VisibleCardsIndices.Num();
                const void* CardIdPtr = LumenSceneData.VisibleCardsIndices.GetData();

                GraphBuilder.AddPass(
                    RDG_EVENT_NAME("Upload VisibleCardIndices NumIndices=%d", LumenSceneData.VisibleCardsIndices.Num()),
                    PassParameters,
                    ERDGPassFlags::Copy,
                    [PassParameters, CardIdBytes, CardIdPtr](FRHICommandListImmediate& RHICmdList)
                    {
                        if (CardIdBytes > 0)
                        {
                            void* DestCardIdPtr = RHILockBuffer(PassParameters->CardIds->GetRHI(), 0, CardIdBytes, RLM_WriteOnly);
                            FPlatformMemory::Memcpy(DestCardIdPtr, CardIdPtr, CardIdBytes);
                            RHIUnlockBuffer(PassParameters->CardIds->GetRHI());
                        }
                    });

                ConvertToExternalBuffer(GraphBuilder, VisibleCardsIndexBuffer, LumenSceneData.VisibleCardsIndexBuffer);
            }
        }

        // Prefilter the Lumen scene depth.
        if (LumenCardRenderer.CardIdsToRender.Num() > 0)
        {
            TRDGUniformBufferRef<FLumenCardScene> LumenCardSceneUniformBuffer;
            {
                FLumenCardScene* LumenCardSceneParameters = GraphBuilder.AllocParameters<FLumenCardScene>();
                SetupLumenCardSceneParameters(GraphBuilder, Scene, *LumenCardSceneParameters);
                LumenCardSceneUniformBuffer = GraphBuilder.CreateUniformBuffer(LumenCardSceneParameters);
            }

            PrefilterLumenSceneDepth(GraphBuilder, LumenCardSceneUniformBuffer, DepthStencilAtlasTexture, LumenCardRenderer.CardIdsToRender, View);
        }
    }

    FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;
    LumenSceneData.CardIndicesToUpdateInBuffer.Reset();
    LumenSceneData.MeshCardsIndicesToUpdateInBuffer.Reset();
    LumenSceneData.DFObjectIndicesToUpdateInBuffer.Reset();
}
           

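Updating the Lumen scene involves the following main steps: culling cards, uploading card IDs, caching the view and scene textures, capturing mesh cards, rasterizing the Lumen scene with each card treated as a view, rendering distant cards, drawing the mesh capture pass, and uploading card data and visibility data.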
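One of the upload steps above builds a small membership table from the card indices before copying it to the GPU. The sketch below illustrates that idea under the assumption that the table is a bit array packed into 32-bit words, as the 32-buckets-per-uint32 arithmetic in the listing suggests. `CardBitTable`, `Insert` and `MayContain` are hypothetical names and are not part of the engine.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the cards-to-render hash map: one bit per bucket,
// packed into 32-bit words so the storage can be uploaded as a uint32 buffer.
class CardBitTable
{
public:
    explicit CardBitTable(uint32_t numWords)
        : words_(numWords, 0u), numBuckets_(numWords * 32u)
    {
        assert(numWords > 0);
    }

    // Mark the bucket that a card index maps to (index modulo bucket count).
    void Insert(uint32_t cardIndex)
    {
        const uint32_t bucket = cardIndex % numBuckets_;
        words_[bucket / 32u] |= (1u << (bucket % 32u));
    }

    // True if the bucket is set. Two card indices that share a bucket produce
    // a false positive; an inserted index never produces a false negative.
    bool MayContain(uint32_t cardIndex) const
    {
        const uint32_t bucket = cardIndex % numBuckets_;
        return ((words_[bucket / 32u] >> (bucket % 32u)) & 1u) != 0u;
    }

    // Size of the packed storage, i.e. the number of bytes to upload.
    uint32_t SizeInBytes() const
    {
        return static_cast<uint32_t>(words_.size()) * 4u;
    }

private:
    std::vector<uint32_t> words_;
    uint32_t numBuckets_;
};
```

A table like this costs one bit per bucket rather than one entry per card. The price is that a lookup can only answer "possibly in the set", so it suits a cheap early-out test better than an exact query.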

Because there are too many steps to cover each one in detail, this section focuses on the stages involved in capturing and rasterizing mesh cards.

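To explain these stages, we first need to understand how LumenCardRenderer.CardsToRender is populated. The logic that decides which cards in the Lumen scene need to be captured and rendered lives in BeginUpdateLumenSceneTasks, which runs during the InitView stage: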

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenSceneRendering.cpp

void FDeferredShadingSceneRenderer::BeginUpdateLumenSceneTasks(FRDGBuilder& GraphBuilder)
{
    LLM_SCOPE_BYTAG(Lumen);

    const FViewInfo& MainView = Views[0];
    const bool bAnyLumenActive = ShouldRenderLumenDiffuseGI(Scene, MainView, true)
        || ShouldRenderLumenReflections(MainView, true);

    if (bAnyLumenActive
        && !ViewFamily.EngineShowFlags.HitProxies)
    {
        SCOPED_NAMED_EVENT(FDeferredShadingSceneRenderer_BeginUpdateLumenSceneTasks, FColor::Emerald);
        QUICK_SCOPE_CYCLE_COUNTER(BeginUpdateLumenSceneTasks);
        const double StartTime = FPlatformTime::Seconds();

        FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;
        // Get the list of cards to render and reset the card renderer state.
        TArray<FCardRenderData, SceneRenderingAllocator>& CardsToRender = LumenCardRenderer.CardsToRender;
        LumenCardRenderer.Reset();

        const int32 LocalLumenSceneGeneration = GLumenSceneGeneration;
        const bool bRecaptureLumenSceneOnce = LumenSceneData.Generation != LocalLumenSceneGeneration;
        LumenSceneData.Generation = LocalLumenSceneGeneration;
        const bool bReallocateAtlas = LumenSceneData.MaxAtlasSize != GetDesiredAtlasSize() 
            || (LumenSceneData.RadiosityAtlas && LumenSceneData.RadiosityAtlas->GetDesc().Extent != GetRadiosityAtlasSize(LumenSceneData.MaxAtlasSize))
            || GLumenSceneReset;

        if (GLumenSceneReset != 2)
        {
            GLumenSceneReset = 0;
        }

        LumenSceneData.NumMeshCardsToAddToSurfaceCache = 0;

        // Update dirty cards.
        UpdateDirtyCards(Scene, bReallocateAtlas, bRecaptureLumenSceneOnce);
        // Update the primitive information of the Lumen scene.
        UpdateLumenScenePrimitives(Scene);
        // Update the distant scene.
        UpdateDistantScene(Scene, Views[0]);

        const FVector LumenSceneCameraOrigin = GetLumenSceneViewOrigin(MainView, GetNumLumenVoxelClipmaps() - 1);
        const float MaxCardUpdateDistanceFromCamera = ComputeMaxCardUpdateDistanceFromCamera();

        // Reallocate the card atlas.
        if (bReallocateAtlas)
        {
            LumenSceneData.MaxAtlasSize = GetDesiredAtlasSize();
            // Everything should have been freed before the atlas is recreated.
            ensure(LumenSceneData.NumCardTexels == 0);

            LumenSceneData.AtlasAllocator = FBinnedTextureLayout(LumenSceneData.MaxAtlasSize, GLumenSceneCardAtlasAllocatorBinSize);
        }

        // Per-frame budgets for card captures and card texels. When GLumenSceneRecaptureLumenSceneEveryFrame (console variable r.LumenScene.RecaptureEveryFrame) is non-zero, both budgets are unlimited (INT_MAX).
        const int32 CardCapturesPerFrame = GLumenSceneRecaptureLumenSceneEveryFrame != 0 ? INT_MAX : GetMaxLumenSceneCardCapturesPerFrame();
        const int32 CardTexelsToCapturePerFrame = GLumenSceneRecaptureLumenSceneEveryFrame != 0 ? INT_MAX : GetLumenSceneCardResToCapturePerFrame() * GetLumenSceneCardResToCapturePerFrame();

        if (CardCapturesPerFrame > 0 && CardTexelsToCapturePerFrame > 0)
        {
            QUICK_SCOPE_CYCLE_COUNTER(FillCardsToRender);

            TArray<FLumenSurfaceCacheUpdatePacket, SceneRenderingAllocator> Packets;
            TArray<FMeshCardsAdd, SceneRenderingAllocator> MeshCardsAddsSortedByPriority;

            // Prepare the surface cache update.
            {
                TRACE_CPUPROFILER_EVENT_SCOPE(PrepareSurfaceCacheUpdate);

                const int32 NumPrimitivesPerPacket = FMath::Max(GLumenScenePrimitivesPerPacket, 1);
                const int32 NumPackets = FMath::DivideAndRoundUp(LumenSceneData.LumenPrimitives.Num(), NumPrimitivesPerPacket);

                CardsToRender.Reset(GetMaxLumenSceneCardCapturesPerFrame());
                Packets.Reserve(NumPackets);

                for (int32 PacketIndex = 0; PacketIndex < NumPackets; ++PacketIndex)
                {
                    Packets.Emplace(
                        LumenSceneData.LumenPrimitives,
                        LumenSceneData.MeshCards,
                        LumenSceneData.Cards,
                        LumenSceneCameraOrigin,
                        MaxCardUpdateDistanceFromCamera,
                        PacketIndex * NumPrimitivesPerPacket,
                        NumPrimitivesPerPacket);
                }
            }

            // Run the surface cache update preparation tasks.
            {
                TRACE_CPUPROFILER_EVENT_SCOPE(RunPrepareSurfaceCacheUpdate);
                const bool bExecuteInParallel = FApp::ShouldUseThreadingForPerformance();

                ParallelFor(Packets.Num(),
                    [&Packets](int32 Index)
                    {
                        Packets[Index].AnyThreadTask();
                    },
                    !bExecuteInParallel
                );
            }

            // Gather the results of the tasks above.
            {
                TRACE_CPUPROFILER_EVENT_SCOPE(PacketResults);

                const float CARD_DISTANCE_BUCKET_SIZE = 100.0f;
                uint32 NumMeshCardsAddsPerBucket[MAX_ADD_PRIMITIVE_PRIORITY + 1];

                for (int32 BucketIndex = 0; BucketIndex < UE_ARRAY_COUNT(NumMeshCardsAddsPerBucket); ++BucketIndex)
                {
                    NumMeshCardsAddsPerBucket[BucketIndex] = 0;
                }

                // Count how many cards fall into each bucket
                for (int32 PacketIndex = 0; PacketIndex < Packets.Num(); ++PacketIndex)
                {
                    const FLumenSurfaceCacheUpdatePacket& Packet = Packets[PacketIndex];
                    LumenSceneData.NumMeshCardsToAddToSurfaceCache += Packet.MeshCardsAdds.Num();

                    for (int32 CardIndex = 0; CardIndex < Packet.MeshCardsAdds.Num(); ++CardIndex)
                    {
                        const FMeshCardsAdd& MeshCardsAdd = Packet.MeshCardsAdds[CardIndex];
                        ++NumMeshCardsAddsPerBucket[MeshCardsAdd.Priority];
                    }
                }

                int32 NumMeshCardsInBucketsUpToMaxBucket = 0;
                int32 MaxBucketIndexToAdd = 0;

                // Select the first N buckets for allocation.
                for (int32 BucketIndex = 0; BucketIndex < UE_ARRAY_COUNT(NumMeshCardsAddsPerBucket); ++BucketIndex)
                {
                    NumMeshCardsInBucketsUpToMaxBucket += NumMeshCardsAddsPerBucket[BucketIndex];
                    MaxBucketIndexToAdd = BucketIndex;

                    if (NumMeshCardsInBucketsUpToMaxBucket > CardCapturesPerFrame)
                    {
                        break;
                    }
                }

                MeshCardsAddsSortedByPriority.Reserve(GetMaxLumenSceneCardCapturesPerFrame());

                // Copy the first N buckets to MeshCardsAddsSortedByPriority.
                for (int32 PacketIndex = 0; PacketIndex < Packets.Num(); ++PacketIndex)
                {
                    const FLumenSurfaceCacheUpdatePacket& Packet = Packets[PacketIndex];

                    for (int32 CardIndex = 0; CardIndex < Packet.MeshCardsAdds.Num() && MeshCardsAddsSortedByPriority.Num() < CardCapturesPerFrame; ++CardIndex)
                    {
                        const FMeshCardsAdd& MeshCardsAdd = Packet.MeshCardsAdds[CardIndex];

                        if (MeshCardsAdd.Priority <= MaxBucketIndexToAdd)
                        {
                            MeshCardsAddsSortedByPriority.Add(MeshCardsAdd);
                        }
                    }
                }

                // Remove all mesh cards that are no longer visible.
                for (int32 PacketIndex = 0; PacketIndex < Packets.Num(); ++PacketIndex)
                {
                    const FLumenSurfaceCacheUpdatePacket& Packet = Packets[PacketIndex];

                    for (int32 MeshCardsToRemoveIndex = 0; MeshCardsToRemoveIndex < Packet.MeshCardsRemoves.Num(); ++MeshCardsToRemoveIndex)
                    {
                        const FMeshCardsRemove& MeshCardsRemove = Packet.MeshCardsRemoves[MeshCardsToRemoveIndex];
                        FLumenPrimitive& LumenPrimitive = LumenSceneData.LumenPrimitives[MeshCardsRemove.LumenPrimitiveIndex];
                        FLumenPrimitiveInstance& LumenPrimitiveInstance = LumenPrimitive.Instances[MeshCardsRemove.LumenInstanceIndex];

                        LumenSceneData.RemoveMeshCards(LumenPrimitive, LumenPrimitiveInstance);
                    }
                }
            }

            // Allocate distant-scene cards.
            extern int32 GLumenUpdateDistantSceneCaptures;
            if (GLumenUpdateDistantSceneCaptures)
            {
                for (int32 DistantCardIndex : LumenSceneData.DistantCardIndices)
                {
                    FLumenCard& DistantCard = LumenSceneData.Cards[DistantCardIndex];

                    extern int32 GLumenDistantSceneCardResolution;
                    DistantCard.DesiredResolution = FIntPoint(GLumenDistantSceneCardResolution, GLumenDistantSceneCardResolution);

                    if (!DistantCard.bVisible)
                    {
                        LumenSceneData.AddCardToVisibleCardList(DistantCardIndex);
                        DistantCard.bVisible = true;
                    }

                    DistantCard.RemoveFromAtlas(LumenSceneData);
                    LumenSceneData.CardIndicesToUpdateInBuffer.Add(DistantCardIndex);

                    // Add to the CardsToRender list.
                    CardsToRender.Add(FCardRenderData(
                        DistantCard,
                        nullptr,
                        -1,
                        FeatureLevel,
                        DistantCardIndex));
                }
            }

            // Allocate new cards.
            for (int32 SortedCardIndex = 0; SortedCardIndex < MeshCardsAddsSortedByPriority.Num(); ++SortedCardIndex)
            {
                const FMeshCardsAdd& MeshCardsAdd = MeshCardsAddsSortedByPriority[SortedCardIndex];
                FLumenPrimitive& LumenPrimitive = LumenSceneData.LumenPrimitives[MeshCardsAdd.LumenPrimitiveIndex];
                FLumenPrimitiveInstance& LumenPrimitiveInstance = LumenPrimitive.Instances[MeshCardsAdd.LumenInstanceIndex];

                LumenSceneData.AddMeshCards(MeshCardsAdd.LumenPrimitiveIndex, MeshCardsAdd.LumenInstanceIndex);

                if (LumenPrimitiveInstance.MeshCardsIndex >= 0)
                {
                    // Get the mesh cards of the primitive instance.
                    const FLumenMeshCards& MeshCards = LumenSceneData.MeshCards[LumenPrimitiveInstance.MeshCardsIndex];

                    // Iterate over every card of the mesh cards and add the valid ones to the CardsToRender list.
                    for (uint32 CardIndex = MeshCards.FirstCardIndex; CardIndex < MeshCards.FirstCardIndex + MeshCards.NumCards; ++CardIndex)
                    {
                        FLumenCard& LumenCard = LumenSceneData.Cards[CardIndex];

                        // Compute the card allocation.
                        FCardAllocationOutput CardAllocation;
                        ComputeCardAllocation(LumenCard, LumenSceneCameraOrigin, MaxCardUpdateDistanceFromCamera, CardAllocation);

                        LumenCard.DesiredResolution = CardAllocation.TextureAllocationSize;

                        if (LumenCard.bVisible != CardAllocation.bVisible)
                        {
                            LumenCard.bVisible = CardAllocation.bVisible;
                            if (LumenCard.bVisible)
                            {
                                LumenSceneData.AddCardToVisibleCardList(CardIndex);
                            }
                            else
                            {
                                LumenCard.RemoveFromAtlas(LumenSceneData);
                                LumenSceneData.RemoveCardFromVisibleCardList(CardIndex);
                            }
                            LumenSceneData.CardIndicesToUpdateInBuffer.Add(CardIndex);
                        }

                        // Add the card to CardsToRender only if it is visible and its atlas allocation differs from the desired resolution.
                        if (LumenCard.bVisible && LumenCard.AtlasAllocation.Width() != LumenCard.DesiredResolution.X && LumenCard.AtlasAllocation.Height() != LumenCard.DesiredResolution.Y)
                        {
                            LumenCard.RemoveFromAtlas(LumenSceneData);
                            LumenSceneData.CardIndicesToUpdateInBuffer.Add(CardIndex);

                            // Add to the CardsToRender list.
                            CardsToRender.Add(FCardRenderData(
                                LumenCard,
                                LumenPrimitive.Primitive,
                                LumenPrimitive.bMergedInstances ? -1 : MeshCardsAdd.LumenInstanceIndex,
                                FeatureLevel,
                                CardIndex));

                            LumenCardRenderer.NumCardTexelsToCapture += LumenCard.AtlasAllocation.Area();
                        }
                    } // for

                    // Stop once the card count or the card texel budget is exceeded.
                    if (CardsToRender.Num() >= CardCapturesPerFrame
                        || LumenCardRenderer.NumCardTexelsToCapture >= CardTexelsToCapturePerFrame)
                    {
                        break;
                    }
                }
            }
        }

        // Allocate and update the card atlases.
        AllocateOptionalCardAtlases(GraphBuilder, LumenSceneData, MainView, bReallocateAtlas);
        UpdateLumenCardAtlasAllocation(GraphBuilder, MainView, bReallocateAtlas, bRecaptureLumenSceneOnce);

        // Process the cards to render.
        if (CardsToRender.Num() > 0)
        {
            // Set up the mesh pass.
            {
                QUICK_SCOPE_CYCLE_COUNTER(MeshPassSetup);

                // Make sure all mesh rendering data is prepared before rendering.
                {
                    QUICK_SCOPE_CYCLE_COUNTER(PrepareStaticMeshData);

                    // Set of unique primitives requiring static mesh update
                    TSet<FPrimitiveSceneInfo*> PrimitivesToUpdateStaticMeshes;

                    for (FCardRenderData& CardRenderData : CardsToRender)
                    {
                        FPrimitiveSceneInfo* PrimitiveSceneInfo = CardRenderData.PrimitiveSceneInfo;

                        if (PrimitiveSceneInfo && PrimitiveSceneInfo->Proxy->AffectsDynamicIndirectLighting())
                        {
                            if (PrimitiveSceneInfo->NeedsUniformBufferUpdate())
                            {
                                PrimitiveSceneInfo->UpdateUniformBuffer(GraphBuilder.RHICmdList);
                            }

                            if (PrimitiveSceneInfo->NeedsUpdateStaticMeshes())
                            {
                                PrimitivesToUpdateStaticMeshes.Add(PrimitiveSceneInfo);
                            }
                        }
                    }

                    if (PrimitivesToUpdateStaticMeshes.Num() > 0)
                    {
                        TArray<FPrimitiveSceneInfo*> UpdatedSceneInfos;
                        UpdatedSceneInfos.Reserve(PrimitivesToUpdateStaticMeshes.Num());
                        for (FPrimitiveSceneInfo* PrimitiveSceneInfo : PrimitivesToUpdateStaticMeshes)
                        {
                            UpdatedSceneInfos.Add(PrimitiveSceneInfo);
                        }

                        FPrimitiveSceneInfo::UpdateStaticMeshes(GraphBuilder.RHICmdList, Scene, UpdatedSceneInfos, true);
                    }
                }

                // Add the card capture draws.
                for (FCardRenderData& CardRenderData : CardsToRender)
                {
                    CardRenderData.StartMeshDrawCommandIndex = LumenCardRenderer.MeshDrawCommands.Num();
                    CardRenderData.NumMeshDrawCommands = 0;
                    int32 NumNanitePrimitives = 0;

                    const FLumenCard& Card = LumenSceneData.Cards[CardRenderData.CardIndex];
                    checkSlow(Card.bVisible && Card.bAllocated);

                    // Create or process the FVisibleMeshDrawCommand entries for this card.
                    AddCardCaptureDraws(Scene, 
                        GraphBuilder.RHICmdList, 
                        CardRenderData, 
                        LumenCardRenderer.MeshDrawCommands, 
                        LumenCardRenderer.MeshDrawPrimitiveIds);

                    CardRenderData.NumMeshDrawCommands = LumenCardRenderer.MeshDrawCommands.Num() - CardRenderData.StartMeshDrawCommandIndex;
                }
            }

            (.....)
        }
    }
}
           

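As the code above shows, mesh cards are not recaptured every frame. A card is added to the render list only when it is visible and its atlas allocation does not match its desired resolution. Unless GLumenSceneRecaptureLumenSceneEveryFrame (console variable r.LumenScene.RecaptureEveryFrame) is enabled, which lifts both budgets to INT_MAX, the number of cards and card texels captured per frame is capped, so that a single frame never has to update and draw so many cards that it becomes a performance bottleneck.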
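The bucket-based selection in the listing can be sketched in isolation. The code below is a minimal illustration rather than the engine's implementation: `CaptureRequest` and `SelectWithinBudget` are hypothetical names, and a lower `Priority` value is assumed to mean a more urgent request. It builds a histogram of requests per priority, finds the last bucket needed to reach the budget, and then copies requests from the accepted buckets until the budget is reached.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical request type; lower Priority is assumed to be more urgent.
struct CaptureRequest
{
    int Id;
    int Priority; // expected range: [0, maxPriority]
};

// Select at most `budget` requests without fully sorting them.
std::vector<CaptureRequest> SelectWithinBudget(
    const std::vector<CaptureRequest>& requests, int maxPriority, std::size_t budget)
{
    assert(maxPriority >= 0);

    // 1. Count how many requests fall into each priority bucket.
    std::vector<std::size_t> countPerBucket(static_cast<std::size_t>(maxPriority) + 1, 0);
    for (const CaptureRequest& request : requests)
    {
        assert(request.Priority >= 0 && request.Priority <= maxPriority);
        ++countPerBucket[static_cast<std::size_t>(request.Priority)];
    }

    // 2. Find the last bucket that is still needed to fill the budget.
    std::size_t runningCount = 0;
    int maxBucketToAdd = 0;
    for (int bucket = 0; bucket <= maxPriority; ++bucket)
    {
        runningCount += countPerBucket[static_cast<std::size_t>(bucket)];
        maxBucketToAdd = bucket;
        if (runningCount > budget)
        {
            break;
        }
    }

    // 3. Copy requests from the accepted buckets, in arrival order, up to the budget.
    std::vector<CaptureRequest> selected;
    selected.reserve(budget < requests.size() ? budget : requests.size());
    for (const CaptureRequest& request : requests)
    {
        if (selected.size() >= budget)
        {
            break;
        }
        if (request.Priority <= maxBucketToAdd)
        {
            selected.push_back(request);
        }
    }
    return selected;
}
```

This approach avoids a full sort at the cost of precision. Requests inside the accepted buckets are taken in arrival order, so a less urgent request that arrives early can displace a more urgent one that arrives after the budget is already full.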

Now that we have seen how mesh cards are added to the render list, we can move on to how the cards are actually captured:

// Capture mesh cards.
{
    FLumenCardPassParameters* PassParameters = GraphBuilder.AllocParameters<FLumenCardPassParameters>();
    // Card view information.
    PassParameters->View = Scene->UniformBuffers.LumenCardCaptureViewUniformBuffer;
    PassParameters->CardPass = GraphBuilder.CreateUniformBuffer(PassUniformParameters);
    // Three atlas render targets: base color, normal, and emissive.
    PassParameters->RenderTargets[0] = FRenderTargetBinding(AlbedoAtlasTexture, ERenderTargetLoadAction::ELoad);
    PassParameters->RenderTargets[1] = FRenderTargetBinding(NormalAtlasTexture, ERenderTargetLoadAction::ELoad);
    PassParameters->RenderTargets[2] = FRenderTargetBinding(EmissiveAtlasTexture, ERenderTargetLoadAction::ELoad);
    // Depth-stencil target.
    PassParameters->RenderTargets.DepthStencil = FDepthStencilBinding(DepthStencilAtlasTexture, ERenderTargetLoadAction::ELoad, FExclusiveDepthStencil::DepthWrite_StencilNop);

    InstanceCullingResult.GetDrawParameters(PassParameters->InstanceCullingDrawParams);

    // Mesh card capture pass.
    GraphBuilder.AddPass(
        RDG_EVENT_NAME("MeshCardCapture"),
        PassParameters,
        ERDGPassFlags::Raster,
        [this, Scene = Scene, PrimitiveIdVertexBuffer, SharedView, &CardsToRender, PassParameters](FRHICommandList& RHICmdList)
        {
            QUICK_SCOPE_CYCLE_COUNTER(MeshPass);

            // Prepare the data for every card to render and submit its draw commands.
            for (FCardRenderData& CardRenderData : CardsToRender)
            {
                if (CardRenderData.NumMeshDrawCommands > 0)
                {
                    FIntRect AtlasRect = CardRenderData.AtlasAllocation;
                    // Set the viewport.
                    RHICmdList.SetViewport(AtlasRect.Min.X, AtlasRect.Min.Y, 0.0f, AtlasRect.Max.X, AtlasRect.Max.Y, 1.0f);

                    // Patch the view data.
                    CardRenderData.PatchView(RHICmdList, Scene, SharedView);
                    Scene->UniformBuffers.LumenCardCaptureViewUniformBuffer.UpdateUniformBufferImmediate(*SharedView->CachedViewUniformShaderParameters);

                    FGraphicsMinimalPipelineStateSet GraphicsMinimalPipelineStateSet;
                #if GPUCULL_TODO
                    if (Scene->GPUScene.IsEnabled())
                    {
                        FRHIBuffer* DrawIndirectArgsBuffer = nullptr;
                        FRHIBuffer* InstanceIdOffsetBuffer = nullptr;
                        FInstanceCullingDrawParams& InstanceCullingDrawParams = PassParameters->InstanceCullingDrawParams;
                        if (InstanceCullingDrawParams.DrawIndirectArgsBuffer != nullptr && InstanceCullingDrawParams.InstanceIdOffsetBuffer != nullptr)
                        {
                            DrawIndirectArgsBuffer = InstanceCullingDrawParams.DrawIndirectArgsBuffer->GetRHI();
                            InstanceIdOffsetBuffer = InstanceCullingDrawParams.InstanceIdOffsetBuffer->GetRHI();
                        }

                        // With GPU culling, submit through the GPU-instanced path.
                        SubmitGPUInstancedMeshDrawCommandsRange(
                            LumenCardRenderer.MeshDrawCommands,
                            GraphicsMinimalPipelineStateSet,
                            CardRenderData.StartMeshDrawCommandIndex,
                            CardRenderData.NumMeshDrawCommands,
                            1,
                            InstanceIdOffsetBuffer,
                            DrawIndirectArgsBuffer,
                            RHICmdList);
                    }
                #endif // GPUCULL_TODO
                    (......)
                }
            }
        }
    );
}
           

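In the card drawing stage, each mesh card is rendered at low resolution to obtain projections of the mesh surface attributes from different directions, and the projected attributes are stored in texture atlases. Unlike the traditional rendering pipeline, only three attributes of the Nanite meshes within the card's view are rasterized here: base color, normal, and emissive (see the figure below).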


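Mesh attribute atlases produced by projecting onto mesh cards during the card capture stage. Top: base color atlas; bottom: normal atlas.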

The vertex shader (VS) and pixel shader (PS) used to capture mesh cards are shown below:

// Engine\Shaders\Private\Lumen\LumenCardVertexShader.usf

struct FLumenCardInterpolantsVSToPS
{
};

struct FLumenCardVSToPS
{
    FVertexFactoryInterpolantsVSToPS FactoryInterpolants;
    FLumenCardInterpolantsVSToPS PassInterpolants;
    float4 Position : SV_POSITION;
};

// Main entry point of the mesh card VS.
void Main(
    FVertexFactoryInput Input,
    OPTIONAL_VertexID
    out FLumenCardVSToPS Output
    )
{    
    uint EyeIndex = 0;
    ResolvedView = ResolveView();

    FVertexFactoryIntermediates VFIntermediates = GetVertexFactoryIntermediates(Input);
    float4 WorldPositionExcludingWPO = VertexFactoryGetWorldPosition(Input, VFIntermediates);
    float4 WorldPosition = WorldPositionExcludingWPO;
    float4 ClipSpacePosition;

    float3x3 TangentToLocal = VertexFactoryGetTangentToLocal(Input, VFIntermediates);    
    FMaterialVertexParameters VertexParameters = GetMaterialVertexParameters(Input, VFIntermediates, WorldPosition.xyz, TangentToLocal);

    ISOLATE
    {
        // Apply the material's world position offset.
        WorldPosition.xyz += GetMaterialWorldPositionOffset(VertexParameters);
        // Get the rasterized world position.
        float4 RasterizedWorldPosition = VertexFactoryGetRasterizedWorldPosition(Input, VFIntermediates, WorldPosition);
        // Transform the position to clip space.
        ClipSpacePosition = INVARIANT(mul(RasterizedWorldPosition, ResolvedView.TranslatedWorldToClip));
        Output.Position = INVARIANT(ClipSpacePosition);
    }

    bool bClampToNearPlane = false;// GetPrimitiveData(Input.PrimitiveId).ObjectWorldPositionAndRadius.w < .5f * max();

    if (bClampToNearPlane && Output.Position.z < 0)
    {
        Output.Position.z = 0.01f;
        Output.Position.w = 1.0f;
    }

    Output.FactoryInterpolants = VertexFactoryGetInterpolantsVSToPS(Input, VFIntermediates, VertexParameters);
}


// Engine\Shaders\Private\Lumen\LumenCardPixelShader.usf

struct FLumenCardInterpolantsVSToPS
{
};

// Main entry point of the mesh card PS.
void Main(
    FVertexFactoryInterpolantsVSToPS Interpolants,
    FLumenCardInterpolantsVSToPS PassInterpolants,
    in INPUT_POSITION_QUALIFIERS float4 SvPosition : SV_Position        // after all interpolators
    OPTIONAL_IsFrontFace,
    out float4 OutTarget0 : SV_Target0,
    out float4 OutTarget1 : SV_Target1,
    out float4 OutTarget2 : SV_Target2)
{
    ResolvedView = ResolveView();

    // Get the basic material parameters.
    FMaterialPixelParameters MaterialParameters = GetMaterialPixelParameters(Interpolants, SvPosition);
    FPixelMaterialInputs PixelMaterialInputs;
    
    // Compute the extended material parameters.
    {
        float4 ScreenPosition = SvPositionToResolvedScreenPosition(SvPosition);
        float3 TranslatedWorldPosition = SvPositionToResolvedTranslatedWorld(SvPosition);
        CalcMaterialParametersEx(MaterialParameters, PixelMaterialInputs, SvPosition, ScreenPosition, bIsFrontFace, TranslatedWorldPosition, TranslatedWorldPosition);
    }

    // Evaluate material coverage and clipping.
    GetMaterialCoverageAndClipping(MaterialParameters, PixelMaterialInputs);

    float3 BaseColor = GetMaterialBaseColor(PixelMaterialInputs);
    float  Metallic = GetMaterialMetallic(PixelMaterialInputs);
    float  Specular = GetMaterialSpecular(PixelMaterialInputs);

    float Roughness = GetMaterialRoughness(PixelMaterialInputs);
    float Opacity = GetMaterialOpacity(PixelMaterialInputs);

    float3 DiffuseColor = BaseColor - BaseColor * Metallic;
    float3 SpecularColor = lerp(0.08 * Specular.xxx, BaseColor, Metallic.xxx);

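    // Apply the fully rough environment BRDF approximation to the diffuse and specular colors.
    EnvBRDFApproxFullyRough(DiffuseColor, SpecularColor);

    // Write the outputs: sqrt-encoded diffuse color with opacity, remapped world normal, and emissive.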
    //@todo DynamicGI better encoding for low precision, hemispherical normal encoding
    OutTarget0 = float4(sqrt(DiffuseColor), Opacity);
    OutTarget1 = float4(MaterialParameters.WorldNormal * .5f + .5f, 0);
    OutTarget2 = float4(GetMaterialEmissive(PixelMaterialInputs), 0);
}
           

The VS input is a cuboid in local space, and the VS output is that cuboid in clip space:

[Figure: the card cuboid in local space (VS input) and in clip space (VS output)]

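After the PS has run, data is stored at the corresponding locations in the three render-target atlases: base color, normal, and emissive. It is worth pointing out that the VS and PS logic here is far simpler than that of the traditional BasePass shaders, which is one of the important optimizations that let Lumen run in real time.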
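To make the output encoding concrete, the following sketch mirrors two of the pixel shader's writes on the CPU: the square-root encoding of the diffuse color and the remapping of the world normal from [-1, 1] to [0, 1]. It is an illustration under stated assumptions, not engine code: `CardTexel`, `EncodeCardTexel`, `DecodeDiffuse` and `DecodeNormal` are hypothetical names, the decode functions are simply assumed to be the inverses of the encoding, and the `EnvBRDFApproxFullyRough` adjustment, opacity and emissive are deliberately left out.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

using Float3 = std::array<float, 3>;

// Hypothetical CPU-side mirror of two of the pixel shader outputs.
struct CardTexel
{
    Float3 Albedo; // sqrt-encoded diffuse color (OutTarget0.rgb in the shader)
    Float3 Normal; // world normal remapped to [0, 1] (OutTarget1.rgb in the shader)
};

// Encode as the shader does: DiffuseColor = BaseColor - BaseColor * Metallic,
// then store sqrt(DiffuseColor) and WorldNormal * 0.5 + 0.5.
CardTexel EncodeCardTexel(const Float3& baseColor, float metallic, const Float3& worldNormal)
{
    assert(metallic >= 0.0f && metallic <= 1.0f);
    CardTexel texel{};
    for (std::size_t i = 0; i < 3; ++i)
    {
        // Clamp to zero so rounding can never feed a negative value to sqrt.
        const float diffuse = std::max(0.0f, baseColor[i] - baseColor[i] * metallic);
        texel.Albedo[i] = std::sqrt(diffuse);
        texel.Normal[i] = worldNormal[i] * 0.5f + 0.5f;
    }
    return texel;
}

// Assumed inverse of the sqrt encoding: square the stored value.
Float3 DecodeDiffuse(const CardTexel& texel)
{
    Float3 diffuse{};
    for (std::size_t i = 0; i < 3; ++i)
    {
        diffuse[i] = texel.Albedo[i] * texel.Albedo[i];
    }
    return diffuse;
}

// Assumed inverse of the normal remap: map [0, 1] back to [-1, 1].
Float3 DecodeNormal(const CardTexel& texel)
{
    Float3 normal{};
    for (std::size_t i = 0; i < 3; ++i)
    {
        normal[i] = texel.Normal[i] * 2.0f - 1.0f;
    }
    return normal;
}
```

Storing the square root gives dark colors more of the available precision in a low-bit-depth render target, which fits the shader's own `@todo` note about finding a better encoding for low precision.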

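As an aside, deciding where a new card goes in the atlas is an instance of the bin packing problem; at render time, only the origin and the width and height need to be set on the viewport. The corresponding type is FBinnedTextureLayout, and related types include FTextureLayout and FTextureLayout3d. For example, in the frame capture below the card's viewport is located at (0, 0) with a size of (64, 64), which means the card is rendered into the 64×64 region at the very start of the atlas: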

[Figure: frame capture showing a card viewport at (0, 0) with size (64, 64)]
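As a rough illustration of atlas placement, the sketch below implements a simple shelf packer: rectangles fill a row from left to right, and a new row is started when the current one is full. This is a deliberately simplified stand-in and does not reproduce the algorithm used by FBinnedTextureLayout. `AtlasRect` and `ShelfAtlasAllocator` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Region of the atlas assigned to one card; it maps directly to a viewport.
struct AtlasRect
{
    uint32_t X;
    uint32_t Y;
    uint32_t Width;
    uint32_t Height;
};

// Minimal shelf packer (hypothetical, not the engine's allocator).
class ShelfAtlasAllocator
{
public:
    ShelfAtlasAllocator(uint32_t atlasWidth, uint32_t atlasHeight)
        : width_(atlasWidth), height_(atlasHeight)
    {
        assert(atlasWidth > 0 && atlasHeight > 0);
    }

    // Returns true and fills outRect on success; leaves the state untouched on failure.
    bool Allocate(uint32_t width, uint32_t height, AtlasRect& outRect)
    {
        if (width == 0 || height == 0 || width > width_ || height > height_)
        {
            return false;
        }

        uint32_t x = cursorX_;
        uint32_t y = cursorY_;
        uint32_t shelfHeight = shelfHeight_;

        // Open a new shelf below the current one if the row has no room left.
        if (x + width > width_)
        {
            x = 0;
            y += shelfHeight;
            shelfHeight = 0;
        }
        if (y + height > height_)
        {
            return false; // atlas is full for a rectangle of this size
        }

        outRect = AtlasRect{x, y, width, height};
        cursorX_ = x + width;
        cursorY_ = y;
        shelfHeight_ = (height > shelfHeight) ? height : shelfHeight;
        return true;
    }

private:
    uint32_t width_;
    uint32_t height_;
    uint32_t cursorX_ = 0;     // next free X position on the current shelf
    uint32_t cursorY_ = 0;     // top of the current shelf
    uint32_t shelfHeight_ = 0; // tallest rectangle placed on the current shelf
};
```

With an empty atlas, the first 64×64 request lands at (0, 0), matching the viewport in the frame capture above. A shelf packer never frees or reuses space, so a real allocator needs additional bookkeeping to remove cards from the atlas.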

Incidentally, the draw commands for mesh cards are built in FLumenCardMeshProcessor:

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenSceneRendering.cpp

void FLumenCardMeshProcessor::AddMeshBatch(const FMeshBatch& RESTRICT MeshBatch, uint64 BatchElementMask, const FPrimitiveSceneProxy* RESTRICT PrimitiveSceneProxy, int32 StaticMeshId)
{
    LLM_SCOPE_BYTAG(Lumen);

    if (MeshBatch.bUseForMaterial && DoesPlatformSupportLumenGI(GetFeatureLevelShaderPlatform(FeatureLevel)))
    {
        // Resolve the material.
        const FMaterialRenderProxy* FallbackMaterialRenderProxyPtr = nullptr;
        const FMaterial& Material = MeshBatch.MaterialRenderProxy->GetMaterialWithFallback(FeatureLevel, FallbackMaterialRenderProxyPtr);

        const FMaterialRenderProxy& MaterialRenderProxy = FallbackMaterialRenderProxyPtr ? *FallbackMaterialRenderProxyPtr : *MeshBatch.MaterialRenderProxy;

        // Set up the render state.
        const EBlendMode BlendMode = Material.GetBlendMode();
        const FMaterialShadingModelField ShadingModels = Material.GetShadingModels();
        const bool bIsTranslucent = IsTranslucentBlendMode(BlendMode);
        const FMeshDrawingPolicyOverrideSettings OverrideSettings = ComputeMeshOverrideSettings(MeshBatch);
        const ERasterizerFillMode MeshFillMode = ComputeMeshFillMode(MeshBatch, Material, OverrideSettings);
        const ERasterizerCullMode MeshCullMode = ComputeMeshCullMode(MeshBatch, Material, OverrideSettings);

        if (!bIsTranslucent
            && (PrimitiveSceneProxy && PrimitiveSceneProxy->ShouldRenderInMainPass() && PrimitiveSceneProxy->AffectsDynamicIndirectLighting())
            && ShouldIncludeDomainInMeshPass(Material.GetMaterialDomain()))
        {
            // Select the vertex and pixel shaders.
            const FVertexFactory* VertexFactory = MeshBatch.VertexFactory;
            FVertexFactoryType* VertexFactoryType = VertexFactory->GetType();

            TMeshProcessorShaders<FLumenCardVS, FLumenCardPS> PassShaders;

            PassShaders.VertexShader = Material.GetShader<FLumenCardVS>(VertexFactoryType);
            PassShaders.PixelShader = Material.GetShader<FLumenCardPS>(VertexFactoryType);

            FMeshMaterialShaderElementData ShaderElementData;
            ShaderElementData.InitializeMeshMaterialData(ViewIfDynamicMeshCommand, PrimitiveSceneProxy, MeshBatch, StaticMeshId, false);

            const FMeshDrawCommandSortKey SortKey = CalculateMeshStaticSortKey(PassShaders.VertexShader, PassShaders.PixelShader);

            // Build the draw commands.
            BuildMeshDrawCommands(
                MeshBatch,
                BatchElementMask,
                PrimitiveSceneProxy,
                MaterialRenderProxy,
                Material,
                PassDrawRenderState,
                PassShaders,
                MeshFillMode,
                MeshCullMode,
                SortKey,
                EMeshPassFeatures::Default,
                ShaderElementData);
        }
    }
}
           

The logic for rasterizing Lumen cards is as follows:

if (UseNanite(ShaderPlatform) && ViewFamily.EngineShowFlags.NaniteMeshes && bAnyNaniteMeshes)
{
    (......)

    Nanite::FRasterContext RasterContext = Nanite::InitRasterContext(...);

    (......)

    Nanite::FCullingContext CullingContext = Nanite::InitCullingContext(...);

    if (GLumenSceneNaniteMultiViewCapture) // Multi-view capture mode.
    {
        const uint32 NumCardsToRender = CardsToRender.Num();

        // Split the views into batches so that no batch exceeds the per-pass maximum.
        uint32 NextCardIndex = 0;
        while(NextCardIndex < NumCardsToRender)
        {
            TArray<Nanite::FPackedView, SceneRenderingAllocator> NaniteViews;
            TArray<Nanite::FInstanceDraw, SceneRenderingAllocator> NaniteInstanceDraws;

            while(NextCardIndex < NumCardsToRender && NaniteViews.Num() < MAX_VIEWS_PER_CULL_RASTERIZE_PASS)
            {
                const FCardRenderData& CardRenderData = CardsToRender[NextCardIndex];

                if(CardRenderData.NaniteInstanceIds.Num() > 0)
                {
                    for(uint32 InstanceID : CardRenderData.NaniteInstanceIds)
                    {
                        NaniteInstanceDraws.Add(Nanite::FInstanceDraw { InstanceID, (uint32)NaniteViews.Num() });
                    }

                    Nanite::FPackedViewParams Params;
                    Params.ViewMatrices = CardRenderData.ViewMatrices;
                    Params.PrevViewMatrices = CardRenderData.ViewMatrices;
                    Params.ViewRect = CardRenderData.AtlasAllocation;
                    Params.RasterContextSize = DepthStencilAtlasSize;
                    Params.LODScaleFactor = CardRenderData.NaniteLODScaleFactor;
                    NaniteViews.Add(Nanite::CreatePackedView(Params));
                }

                NextCardIndex++;
            }

            // Cull and rasterize the instances of this batch.
            if (NaniteInstanceDraws.Num() > 0)
            {
                RDG_EVENT_SCOPE(GraphBuilder, "Nanite::RasterizeLumenCards");

                Nanite::FRasterState RasterState;
                Nanite::CullRasterize(
                    GraphBuilder,
                    *Scene,
                    NaniteViews,
                    CullingContext,
                    RasterContext,
                    RasterState,
                    &NaniteInstanceDraws
                );
            }
        }
    }
    else // Single-view mode.
    {
        (......)
    }
    
    extern float GLumenDistantSceneMinInstanceBoundsRadius;

    // Render the distant-scene cards.
    for (FCardRenderData& CardRenderData : CardsToRender)
    {
        if (CardRenderData.bDistantScene)
        {
            (......)
        }
    }

    // Draw the Lumen mesh capture pass.
    Nanite::DrawLumenMeshCapturePass(
        GraphBuilder,
        *Scene,
        SharedView,
        CardsToRender,
        CullingContext,
        RasterContext,
        PassUniformParameters,
        RectMinMaxBufferSRV,
        NumRects,
        LumenSceneData.MaxAtlasSize,
        AlbedoAtlasTexture,
        NormalAtlasTexture,
        EmissiveAtlasTexture,
        DepthStencilAtlasTexture
    );
}
           

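The card rasterization stage is essentially the same as the regular Nanite pipeline.

The rasterization output is the same as well, including visibility, the depth-stencil buffer, triangle IDs, and so on.

The next step is drawing the mesh cards, which is again essentially the same as in Nanite.

The GBuffer output is still the three atlases mentioned above (base color, normal, emissive), with the new data written into their free regions.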
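The view batching in the multi-view capture path above can be sketched in isolation. This is a simplified model under stated assumptions: `CardToRender`, `InstanceDraw`, `ViewBatch`, and `BuildViewBatches` are hypothetical names, and `maxViewsPerPass` stands in for MAX_VIEWS_PER_CULL_RASTERIZE_PASS, whose value is not given in this section.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical, trimmed-down stand-ins for the engine types.
struct CardToRender { std::vector<uint32_t> naniteInstanceIds; };
struct InstanceDraw { uint32_t instanceId; uint32_t viewIndex; };
struct ViewBatch
{
    std::vector<std::size_t> cardIndices; // One view per listed card.
    std::vector<InstanceDraw> draws;      // Instance id paired with its view index.
};

// Splits the cards into batches of at most maxViewsPerPass views.
// Cards without Nanite instances consume no view slot, as in the loop above.
std::vector<ViewBatch> BuildViewBatches(const std::vector<CardToRender>& cards,
                                        std::size_t maxViewsPerPass)
{
    assert(maxViewsPerPass > 0);
    std::vector<ViewBatch> batches;
    std::size_t next = 0;
    while (next < cards.size())
    {
        ViewBatch batch;
        while (next < cards.size() && batch.cardIndices.size() < maxViewsPerPass)
        {
            const CardToRender& card = cards[next];
            if (!card.naniteInstanceIds.empty())
            {
                // The view index is local to the batch, not global.
                const uint32_t viewIndex = static_cast<uint32_t>(batch.cardIndices.size());
                for (uint32_t id : card.naniteInstanceIds)
                {
                    batch.draws.push_back({ id, viewIndex });
                }
                batch.cardIndices.push_back(next);
            }
            ++next;
        }
        if (!batch.draws.empty())
        {
            batches.push_back(std::move(batch)); // One cull/rasterize call per batch.
        }
    }
    return batches;
}
```

Each resulting batch corresponds to one `Nanite::CullRasterize` call in the original loop, so the number of batches gives a rough idea of how many cull and rasterize passes a frame's card captures would trigger.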

Later subsections rely heavily on Voxel Cone Tracing, so this subsection first fills in the background. The material is based on Interactive Indirect Illumination Using Voxel Cone Tracing and Voxel Cone Tracing and Sparse Voxel Octree for Real-time Global Illumination.

The first step of voxel cone tracing is to build a sparse voxel octree (SVO) of the scene geometry. UE5 uses sparse, HLOD-based mesh distance fields for this purpose.

Figure: the Sponza scene after voxelization.

Rendering engines such as UE generally use a hybrid pipeline: direct lighting (primary rays) comes from conventional rasterization, while secondary lighting is gathered with cone tracing.

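Before voxel cone tracing, the geometry is pre-filtered and then traced as if it were a participating medium (volume ray casting can be used). The voxels represent scene geometry as an opacity field plus incoming radiance, so quadrilinearly interpolated samples can approximate the footprint covered by a cone.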

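Tracing a single cone requires a MIP pyramid. The MIP levels are generated with Gaussian weights: the voxel center gets the largest weight, and the farther a point is from the voxel center, the smaller its weight.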

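MIP levels generated with Gaussian weights become blurrier as the level increases, which matches the shape of a cone: the farther a sample is from the cone origin, the larger the footprint it covers and the blurrier the lighting it receives. Given this, the MIP level can be chosen from the distance between the sample point and the cone origin, and a quadrilinear fetch at that level quickly yields the radiance at that point.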
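The relationship between distance, footprint, and MIP level can be sketched as a small cone march. This is a generic illustration of voxel cone tracing, not Lumen's implementation: `VoxelSample`, `VoxelSampler`, and `TraceCone` are hypothetical names, radiance is a single float for brevity, and the sampler callback stands in for the quadrilinear fetch into the prefiltered voxel MIP pyramid.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <functional>

// Hypothetical prefiltered voxel sample: scalar radiance plus opacity in [0, 1].
struct VoxelSample { float radiance; float opacity; };

// Hypothetical callback standing in for a quadrilinear fetch: it receives the
// distance along the cone axis and the (fractional) MIP level to sample.
using VoxelSampler = std::function<VoxelSample(float distance, float mipLevel)>;

// Marches one cone front to back and returns the accumulated radiance.
float TraceCone(const VoxelSampler& sampleVoxels,
                float coneHalfAngle, float voxelSize, float maxDistance)
{
    assert(coneHalfAngle > 0.0f && voxelSize > 0.0f);
    float radiance = 0.0f;
    float alpha = 0.0f;         // Accumulated opacity along the cone.
    float distance = voxelSize; // Start one voxel away to avoid self-sampling.
    while (distance < maxDistance && alpha < 0.99f)
    {
        // The footprint grows linearly with distance, never below one voxel.
        const float diameter =
            std::max(voxelSize, 2.0f * distance * std::tan(coneHalfAngle));
        // Each MIP level doubles the voxel size, hence the log2.
        const float mipLevel = std::log2(diameter / voxelSize);
        const VoxelSample s = sampleVoxels(distance, mipLevel);
        // Front-to-back compositing, as in volume ray casting.
        radiance += (1.0f - alpha) * s.opacity * s.radiance;
        alpha += (1.0f - alpha) * s.opacity;
        distance += 0.5f * diameter; // Step proportionally to the footprint.
    }
    return radiance;
}
```

Because the step length grows with the footprint, a wide cone covers a long distance in few samples, which is why diffuse gathering with a handful of wide cones can be much cheaper than tracing many thin rays.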

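Voxel rendering can be split into three passes. The first is the light pass, which bakes irradiance (reflective shadow maps, RSM). The second is the filter pass, which downsamples radiance through the sparse octree. The third is the camera pass, which gathers irradiance for each visible fragment (pixel).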

Voxel tracing can likewise be used for specular reflections, AO, and soft shadows. Specular reflections are traced in a similar way, except that fewer and narrower cones are generated.

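In fact, with cone tracing, surfaces of different roughness can be traced with different numbers and sizes of cones.

Left: a high-roughness (diffuse) surface needs several cones. Middle: a fairly rough specular reflection needs only one wide cone. Right: a low-roughness specular reflection needs only one narrow cone.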

For AO, a combined approach is used: multi-sample cone tracing for the near field, plus far-field AO, plus offline (precomputed) occlusion.

Soft shadows can be sampled with one cone per pixel, so the softer the shadow, the cheaper it is to compute.

The paper also covers single-pass voxelization, as well as the technique and process for building the sparse octree with compute shaders.

Lumen scene lighting is handled by RenderLumenSceneLighting, whose code is as follows:

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenSceneLighting.cpp

void FDeferredShadingSceneRenderer::RenderLumenSceneLighting(
    FRDGBuilder& GraphBuilder,
    FViewInfo& View)
{
    FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;
    // Check whether Lumen is enabled: it is enough for either the diffuse indirect method or the reflections method to be Lumen.
    const bool bAnyLumenEnabled = GetViewPipelineState(Views[0]).DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen 
        || GetViewPipelineState(Views[0]).ReflectionsMethod == EReflectionsMethod::Lumen;

    if (bAnyLumenEnabled)
    {
        RDG_EVENT_SCOPE(GraphBuilder, "LumenSceneLighting");

        FGlobalShaderMap* GlobalShaderMap = View.ShaderMap;
        FLumenCardTracingInputs TracingInputs(GraphBuilder, Scene, Views[0]);

        if (LumenSceneData.VisibleCardsIndices.Num() > 0)
        {
            FRDGTextureRef RadiosityAtlas = GraphBuilder.RegisterExternalTexture(LumenSceneData.RadiosityAtlas, TEXT("Lumen.RadiosityAtlas"));

            // Render radiosity.
            RenderRadiosityForLumenScene(GraphBuilder, TracingInputs, GlobalShaderMap, RadiosityAtlas);

            ConvertToExternalTexture(GraphBuilder, RadiosityAtlas, LumenSceneData.RadiosityAtlas);

            FLumenCardScatterContext DirectLightingCardScatterContext;
            extern float GLumenSceneCardDirectLightingUpdateFrequencyScale;

            // Build the indirect args and write out the card faces whose direct lighting will be updated this frame.
            DirectLightingCardScatterContext.Init(
                GraphBuilder,
                View,
                LumenSceneData,
                LumenCardRenderer,
                ECullCardsMode::OperateOnSceneForceUpdateForCardsToRender,
                1);

            // Cull the cards to the given shape.
            DirectLightingCardScatterContext.CullCardsToShape(
                GraphBuilder,
                View,
                LumenSceneData,
                LumenCardRenderer,
                TracingInputs.LumenCardSceneUniformBuffer,
                ECullCardsShapeType::None,
                FCullCardsShapeParameters(),
                GLumenSceneCardDirectLightingUpdateFrequencyScale,
                0);

            // Build the scatter indirect args.
            DirectLightingCardScatterContext.BuildScatterIndirectArgs(
                GraphBuilder,
                View);

            extern int32 GLumenSceneRecaptureLumenSceneEveryFrame;

            // Clear the lighting atlases: final lighting, irradiance, and indirect irradiance.
            if (GLumenSceneRecaptureLumenSceneEveryFrame)
            {
                ClearAtlasRDG(GraphBuilder, TracingInputs.FinalLightingAtlas);
                if (Lumen::UseIrradianceAtlas(View))
                {
                    ClearAtlasRDG(GraphBuilder, TracingInputs.IrradianceAtlas);
                }
                if (Lumen::UseIndirectIrradianceAtlas(View))
                {
                    ClearAtlasRDG(GraphBuilder, TracingInputs.IndirectIrradianceAtlas);
                }
            }

            // Combine the scene lighting.
            CombineLumenSceneLighting(
                Scene,
                View,
                GraphBuilder,
                TracingInputs.LumenCardSceneUniformBuffer,
                TracingInputs.FinalLightingAtlas,
                TracingInputs.OpacityAtlas,
                RadiosityAtlas,
                GlobalShaderMap, 
                DirectLightingCardScatterContext);

            // Copy TracingInputs.FinalLightingAtlas into TracingInputs.IndirectIrradianceAtlas.
            if (Lumen::UseIndirectIrradianceAtlas(View))
            {
                CopyLumenCardAtlas(
                    Scene,
                    View,
                    GraphBuilder,
                    TracingInputs.LumenCardSceneUniformBuffer,
                    TracingInputs.FinalLightingAtlas,
                    TracingInputs.IndirectIrradianceAtlas,
                    GlobalShaderMap,
                    DirectLightingCardScatterContext);
            }

            // Render direct lighting for the Lumen scene.
            RenderDirectLightingForLumenScene(
                GraphBuilder,
                TracingInputs.LumenCardSceneUniformBuffer,
                TracingInputs.FinalLightingAtlas,
                TracingInputs.OpacityAtlas,
                GlobalShaderMap,
                DirectLightingCardScatterContext);

            if (Lumen::UseIrradianceAtlas(View))
            {
                CopyLumenCardAtlas(
                    Scene,
                    View,
                    GraphBuilder,
                    TracingInputs.LumenCardSceneUniformBuffer,
                    TracingInputs.FinalLightingAtlas,
                    TracingInputs.IrradianceAtlas,
                    GlobalShaderMap,
                    DirectLightingCardScatterContext);
            }

            FRDGTextureRef AlbedoAtlas = GraphBuilder.RegisterExternalTexture(LumenSceneData.AlbedoAtlas, TEXT("Lumen.AlbedoAtlas"));
            FRDGTextureRef EmissiveAtlas = GraphBuilder.RegisterExternalTexture(LumenSceneData.EmissiveAtlas, TEXT("Lumen.EmissiveAtlas"));
            // Apply the Lumen card albedo.
            ApplyLumenCardAlbedo(
                Scene,
                View,
                GraphBuilder,
                TracingInputs.LumenCardSceneUniformBuffer,
                TracingInputs.FinalLightingAtlas,
                AlbedoAtlas,
                EmissiveAtlas,
                GlobalShaderMap,
                DirectLightingCardScatterContext);

            LumenSceneData.bFinalLightingAtlasContentsValid = true;

            // Prefilter the lighting.
            PrefilterLumenSceneLighting(GraphBuilder, View, TracingInputs, GlobalShaderMap, DirectLightingCardScatterContext);

            ConvertToExternalTexture(GraphBuilder, TracingInputs.FinalLightingAtlas, LumenSceneData.FinalLightingAtlas);
            if (Lumen::UseIrradianceAtlas(View))
            {
                ConvertToExternalTexture(GraphBuilder, TracingInputs.IrradianceAtlas, LumenSceneData.IrradianceAtlas);
            }
            if (Lumen::UseIndirectIrradianceAtlas(View))
            {
                ConvertToExternalTexture(GraphBuilder, TracingInputs.IndirectIrradianceAtlas, LumenSceneData.IndirectIrradianceAtlas);
            }
        }

        // Compute the voxel lighting.
        ComputeLumenSceneVoxelLighting(GraphBuilder, TracingInputs, GlobalShaderMap);

        // Compute the GI volume for translucency.
        ComputeLumenTranslucencyGIVolume(GraphBuilder, TracingInputs, GlobalShaderMap);
    }
}
           

A RenderDoc frame capture shows the flow above at a glance.
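The order in which RenderLumenSceneLighting touches the atlases can be summarized per texel. The sketch below only mirrors the ordering visible in the code above; the arithmetic of each step (starting from radiosity, adding direct lighting, then multiplying by albedo and adding emissive) is an assumption made for illustration and is not confirmed by this section. All type and function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical RGB type and helpers for this sketch.
struct Rgb { float r, g, b; };
inline Rgb Add(Rgb a, Rgb b) { return { a.r + b.r, a.g + b.g, a.b + b.b }; }
inline Rgb Mul(Rgb a, Rgb b) { return { a.r * b.r, a.g * b.g, a.b * b.b }; }

struct CardTexelInputs
{
    Rgb radiosity;      // From RenderRadiosityForLumenScene.
    Rgb directLighting; // From RenderDirectLightingForLumenScene.
    Rgb albedo;         // From the albedo atlas.
    Rgb emissive;       // From the emissive atlas.
};

struct CardTexelLighting
{
    Rgb indirectIrradiance; // Snapshot taken before direct lighting.
    Rgb irradiance;         // Snapshot taken after direct lighting.
    Rgb finalLighting;      // Contents of the final lighting atlas.
};

// Assumed per-texel model of the pass order; the real shaders may differ.
CardTexelLighting ShadeCardTexel(const CardTexelInputs& inputs)
{
    CardTexelLighting out{};
    Rgb lighting = inputs.radiosity;                // CombineLumenSceneLighting (assumed).
    out.indirectIrradiance = lighting;              // First CopyLumenCardAtlas.
    lighting = Add(lighting, inputs.directLighting); // Direct lighting (assumed additive).
    out.irradiance = lighting;                      // Second CopyLumenCardAtlas.
    // ApplyLumenCardAlbedo (assumed form): modulate by albedo, then add emissive.
    out.finalLighting = Add(Mul(lighting, inputs.albedo), inputs.emissive);
    return out;
}
```

The point of the sketch is the ordering: the indirect irradiance copy is made before direct lighting is rendered and the irradiance copy after it, which is why the two optional atlases end up holding different quantities.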

The following subsections analyze some of the main steps.

RenderRadiosityForLumenScene renders the radiosity of the Lumen scene. Its code is as follows:

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenRadiosity.cpp

void FDeferredShadingSceneRenderer::RenderRadiosityForLumenScene(
    FRDGBuilder& GraphBuilder, 
    const FLumenCardTracingInputs& TracingInputs, 
    FGlobalShaderMap* GlobalShaderMap, 
    FRDGTextureRef RadiosityAtlas)
{
    LLM_SCOPE_BYTAG(Lumen);

    const FViewInfo& MainView = Views[0];
    FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;

    extern int32 GLumenSceneRecaptureLumenSceneEveryFrame;

    if (IsRadiosityEnabled() 
        && !GLumenSceneRecaptureLumenSceneEveryFrame
        && LumenSceneData.bFinalLightingAtlasContentsValid
        && TracingInputs.NumClipmapLevels > 0)
    {
        RDG_EVENT_SCOPE(GraphBuilder, "Radiosity");

        FLumenCardScatterContext VisibleCardScatterContext;

        // Build the indirect args and write out the card faces whose radiosity will be updated this frame.
        VisibleCardScatterContext.Init(
            GraphBuilder,
            MainView,
            LumenSceneData,
            LumenCardRenderer,
            ECullCardsMode::OperateOnSceneForceUpdateForCardsToRender);

        VisibleCardScatterContext.CullCardsToShape(
            GraphBuilder,
            MainView,
            LumenSceneData,
            LumenCardRenderer,
            TracingInputs.LumenCardSceneUniformBuffer,
            ECullCardsShapeType::None,
            FCullCardsShapeParameters(),
            GLumenSceneCardRadiosityUpdateFrequencyScale,
            0);

        // Build the scatter indirect args.
        VisibleCardScatterContext.BuildScatterIndirectArgs(
            GraphBuilder,
            MainView);

        // Generate the sample directions.
        RadiosityDirections.GenerateSamples(
            FMath::Clamp(GLumenRadiosityNumTargetCones, 1, (int32)MaxRadiosityConeDirections),
            1,
            GLumenRadiosityNumTargetCones,
            false,
            true /* Cosine distribution */);

        const bool bRenderSkylight = Lumen::ShouldHandleSkyLight(Scene, ViewFamily);

        // Render the radiosity scatter.
        if (GLumenRadiosityComputeTraceBlocksScatter) // Compute shader path
        {
            RenderRadiosityComputeScatter(
                GraphBuilder,
                Scene,
                Views[0],
                bRenderSkylight,
                LumenSceneData,
                RadiosityAtlas,
                TracingInputs,
                VisibleCardScatterContext.Parameters,
                GlobalShaderMap);
        }
        else // Pixel shader path
        {
            FLumenCardRadiosity* PassParameters = GraphBuilder.AllocParameters<FLumenCardRadiosity>();

            PassParameters->RenderTargets[0] = FRenderTargetBinding(RadiosityAtlas, ERenderTargetLoadAction::ENoAction);

            PassParameters->VS.LumenCardScene = TracingInputs.LumenCardSceneUniformBuffer;
            PassParameters->VS.CardScatterParameters = VisibleCardScatterContext.Parameters;
            PassParameters->VS.ScatterInstanceIndex = 0;
            PassParameters->VS.CardUVSamplingOffset = FVector2D::ZeroVector;

            SetupTraceFromTexelParameters(Views[0], TracingInputs, LumenSceneData, PassParameters->PS.TraceFromTexelParameters);

            FLumenCardRadiosityPS::FPermutationDomain PermutationVector;
            PermutationVector.Set<FLumenCardRadiosityPS::FDynamicSkyLight>(bRenderSkylight);
            auto PixelShader = GlobalShaderMap->GetShader<FLumenCardRadiosityPS>(PermutationVector);

            FScene* LocalScene = Scene;
            const int32 RadiosityDownsampleArea = GLumenRadiosityDownsampleFactor * GLumenRadiosityDownsampleFactor;

            // Trace radiosity from the atlas texels.
            GraphBuilder.AddPass(
                RDG_EVENT_NAME("TraceFromAtlasTexels: %u Cones", RadiosityDirections.SampleDirections.Num()),
                PassParameters,
                ERDGPassFlags::Raster,
                [LocalScene, PixelShader, PassParameters, GlobalShaderMap](FRHICommandListImmediate& RHICmdList)
            {
                FIntPoint ViewRect = FIntPoint::DivideAndRoundDown(LocalScene->LumenSceneData->MaxAtlasSize, GLumenRadiosityDownsampleFactor);
                DrawQuadsToAtlas(ViewRect, PixelShader, PassParameters, GlobalShaderMap, TStaticBlendState<>::GetRHI(), RHICmdList);
            });
        }
    }
    else
    {
        ClearAtlasRDG(GraphBuilder, RadiosityAtlas);
    }
}
           

The last stage of the code above computes the radiosity. Normally it takes the compute shader path, RenderRadiosityComputeScatter, which is analyzed next:

void RenderRadiosityComputeScatter(
    FRDGBuilder& GraphBuilder,
    const FScene* Scene,
    const FViewInfo& View,
    bool bRenderSkylight, 
    const FLumenSceneData& LumenSceneData,
    FRDGTextureRef RadiosityAtlas,
    const FLumenCardTracingInputs& TracingInputs,
    const FLumenCardScatterParameters& CardScatterParameters,
    FGlobalShaderMap* GlobalShaderMap)
{
    const bool bUseIrradianceCache = GLumenRadiosityUseIrradianceCache != 0;

    // Build the indirect args for setting up the trace blocks.
    FRDGBufferRef SetupCardTraceBlocksIndirectArgsBuffer = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateIndirectDesc<FRHIDispatchIndirectParameters>(1), TEXT("SetupCardTraceBlocksIndirectArgsBuffer"));
    {
        FRDGBufferUAVRef SetupCardTraceBlocksIndirectArgsBufferUAV = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(SetupCardTraceBlocksIndirectArgsBuffer));

        FPlaceProbeIndirectArgsCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FPlaceProbeIndirectArgsCS::FParameters>();
        PassParameters->RWIndirectArgs = SetupCardTraceBlocksIndirectArgsBufferUAV;
        PassParameters->QuadAllocator = CardScatterParameters.QuadAllocator;

        auto ComputeShader = GlobalShaderMap->GetShader< FPlaceProbeIndirectArgsCS >(0);

        ensure(GSetupCardTraceBlocksGroupSize == GPlaceRadiosityProbeGroupSize);
        const FIntVector GroupSize(1, 1, 1);

        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("SetupCardTraceBlocksIndirectArgsCS"),
            ComputeShader,
            PassParameters,
            GroupSize);
    }

    const int32 TraceBlockMaxSize = 2;
    extern int32 GLumenSceneCardLightingForceFullUpdate;
    const int32 Divisor = TraceBlockMaxSize * GLumenRadiosityDownsampleFactor * (GLumenSceneCardLightingForceFullUpdate ? 1 : GLumenRadiosityTraceBlocksAllocationDivisor);
    const int32 NumTraceBlocksToAllocate = (LumenSceneData.MaxAtlasSize.X / Divisor) 
        * (LumenSceneData.MaxAtlasSize.Y / Divisor);

    FRDGBufferRef CardTraceBlockAllocator = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateBufferDesc(sizeof(uint32), 1), TEXT("CardTraceBlockAllocator"));
    FRDGBufferRef CardTraceBlockData = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateBufferDesc(sizeof(FIntVector4), NumTraceBlocksToAllocate), TEXT("CardTraceBlockData"));
    FRDGBufferUAVRef CardTraceBlockAllocatorUAV = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(CardTraceBlockAllocator, PF_R32_UINT));
    FRDGBufferUAVRef CardTraceBlockDataUAV = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(CardTraceBlockData, PF_R32G32B32A32_UINT));

    FComputeShaderUtils::ClearUAV(GraphBuilder, View.ShaderMap, CardTraceBlockAllocatorUAV, 0);

    // Set up the card trace blocks.
    {
        FSetupCardTraceBlocksCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FSetupCardTraceBlocksCS::FParameters>();
        PassParameters->RWCardTraceBlockAllocator = CardTraceBlockAllocatorUAV;
        PassParameters->RWCardTraceBlockData = CardTraceBlockDataUAV;
        PassParameters->QuadAllocator = CardScatterParameters.QuadAllocator;
        PassParameters->QuadData = CardScatterParameters.QuadData;
        PassParameters->CardBuffer = LumenSceneData.CardBuffer.SRV;
        PassParameters->RadiosityAtlasSize = FIntPoint::DivideAndRoundDown(LumenSceneData.MaxAtlasSize, GLumenRadiosityDownsampleFactor);
        PassParameters->IndirectArgs = SetupCardTraceBlocksIndirectArgsBuffer;

        auto ComputeShader = GlobalShaderMap->GetShader<FSetupCardTraceBlocksCS>();

        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("SetupCardTraceBlocksCS"),
            ComputeShader,
            PassParameters,
            SetupCardTraceBlocksIndirectArgsBuffer,
            0);
    }

    // Build the indirect args for tracing the blocks.
    FRDGBufferRef TraceBlocksIndirectArgsBuffer = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateIndirectDesc<FRHIDispatchIndirectParameters>(1), TEXT("TraceBlocksIndirectArgsBuffer"));
    {
        FRDGBufferUAVRef TraceBlocksIndirectArgsBufferUAV = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(TraceBlocksIndirectArgsBuffer));

        FTraceBlocksIndirectArgsCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FTraceBlocksIndirectArgsCS::FParameters>();
        PassParameters->RWIndirectArgs = TraceBlocksIndirectArgsBufferUAV;
        PassParameters->CardTraceBlockAllocator = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(CardTraceBlockAllocator, PF_R32_UINT));

        FTraceBlocksIndirectArgsCS::FPermutationDomain PermutationVector;
        PermutationVector.Set<FTraceBlocksIndirectArgsCS::FIrradianceCache>(bUseIrradianceCache);
        auto ComputeShader = GlobalShaderMap->GetShader< FTraceBlocksIndirectArgsCS >(PermutationVector);

        const FIntVector GroupSize(1, 1, 1);

        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("TraceBlocksIndirectArgsCS"),
            ComputeShader,
            PassParameters,
            GroupSize);
    }

    LumenRadianceCache::FRadianceCacheInterpolationParameters RadianceCacheParameters;

    // Render the radiance cache used in irradiance cache mode.
    if (bUseIrradianceCache)
    {
        const LumenRadianceCache::FRadianceCacheInputs RadianceCacheInputs = LumenRadiosity::SetupRadianceCacheInputs();

        FRadiosityMarkUsedProbesData MarkUsedProbesData;
        MarkUsedProbesData.Parameters.View = View.ViewUniformBuffer;
        MarkUsedProbesData.Parameters.DepthAtlas = LumenSceneData.DepthAtlas->GetRenderTargetItem().ShaderResourceTexture;
        MarkUsedProbesData.Parameters.CurrentOpacityAtlas = LumenSceneData.OpacityAtlas->GetRenderTargetItem().ShaderResourceTexture;
        MarkUsedProbesData.Parameters.CardTraceBlockAllocator = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(CardTraceBlockAllocator, PF_R32_UINT));
        MarkUsedProbesData.Parameters.CardTraceBlockData = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(CardTraceBlockData, PF_R32G32B32A32_UINT));
        MarkUsedProbesData.Parameters.CardBuffer = LumenSceneData.CardBuffer.SRV;
        MarkUsedProbesData.Parameters.RadiosityAtlasSize = FIntPoint::DivideAndRoundDown(LumenSceneData.MaxAtlasSize, GLumenRadiosityDownsampleFactor);
        MarkUsedProbesData.Parameters.IndirectArgs = TraceBlocksIndirectArgsBuffer;

        RenderRadianceCache(
            GraphBuilder, 
            TracingInputs, 
            RadianceCacheInputs, 
            Scene,
            View, 
            nullptr, 
            nullptr, 
            FMarkUsedRadianceCacheProbes::CreateStatic(&RadianceCacheMarkUsedProbes), 
            &MarkUsedProbesData, 
            View.ViewState->RadiosityRadianceCacheState, 
            RadianceCacheParameters);
    }

    // Trace radiosity for the card texels in the atlas.
    {
        FLumenCardRadiosityTraceBlocksCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FLumenCardRadiosityTraceBlocksCS::FParameters>();
        PassParameters->RWRadiosityAtlas = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(RadiosityAtlas));
        PassParameters->RadianceCacheParameters = RadianceCacheParameters;
        PassParameters->CardTraceBlockAllocator = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(CardTraceBlockAllocator, PF_R32_UINT));
        PassParameters->CardTraceBlockData = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(CardTraceBlockData, PF_R32G32B32A32_UINT));
        PassParameters->ProbeOcclusionNormalBias = GLumenRadiosityIrradianceCacheProbeOcclusionNormalBias;
        PassParameters->IndirectArgs = TraceBlocksIndirectArgsBuffer;

        SetupTraceFromTexelParameters(View, TracingInputs, LumenSceneData, PassParameters->TraceFromTexelParameters);

        FLumenCardRadiosityTraceBlocksCS::FPermutationDomain PermutationVector;
        PermutationVector.Set<FLumenCardRadiosityTraceBlocksCS::FDynamicSkyLight>(bRenderSkylight);
        PermutationVector.Set<FLumenCardRadiosityTraceBlocksCS::FIrradianceCache>(bUseIrradianceCache);
        auto ComputeShader = GlobalShaderMap->GetShader< FLumenCardRadiosityTraceBlocksCS >(PermutationVector);

        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("TraceFromAtlasTexels: %u Cones", RadiosityDirections.SampleDirections.Num()),
            ComputeShader,
            PassParameters,
            TraceBlocksIndirectArgsBuffer,
            0);
    }
}
           

As the code shows, computing radiosity involves quite a few steps, including culling, building the trace arguments, and tracing the atlas texels.
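The sizing of the trace block buffer in RenderRadiosityComputeScatter is plain integer arithmetic and can be checked by hand. The sketch below reproduces that formula; `IntPoint` and the function name are hypothetical, and the atlas size, downsample factor, and allocation divisor used in the usage note are made-up inputs, since this section does not state the engine's defaults.

```cpp
#include <cassert>

// Hypothetical 2D integer size.
struct IntPoint { int x, y; };

// Mirrors the NumTraceBlocksToAllocate computation from the C++ code above.
int NumTraceBlocksToAllocate(IntPoint maxAtlasSize, int radiosityDownsampleFactor,
                             int traceBlocksAllocationDivisor, bool forceFullUpdate)
{
    const int traceBlockMaxSize = 2; // Same constant as in the original function.
    const int divisor = traceBlockMaxSize * radiosityDownsampleFactor
        * (forceFullUpdate ? 1 : traceBlocksAllocationDivisor);
    assert(divisor > 0);
    // Integer division: partial blocks at the atlas edge are not counted.
    return (maxAtlasSize.x / divisor) * (maxAtlasSize.y / divisor);
}
```

For a hypothetical 4096x4096 atlas with a downsample factor of 2 and an allocation divisor of 2, the divisor is 8 and the buffer holds 512 x 512 = 262144 blocks; forcing a full update drops the divisor to 4 and quadruples that to 1048576. At 16 bytes per FIntVector4 entry, the first case is 4 MiB of block data.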

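The final texel tracing stage mainly constructs sample directions and builds one cone per direction to gather the nearby radiosity. Its main inputs are the global distance field atlas, scene depth, scene opacity, scene normals, and VoxelLighting.

Data needed to trace card texels: top left, the global distance field atlas; top right, the scene depth atlas; bottom left, the scene opacity; bottom right, the scene normals.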
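The C++ code earlier requests cone directions with a cosine distribution. A minimal way to produce such directions is Malley's method: distribute points uniformly on the unit disk and project them up onto the hemisphere. The sketch below does that with a golden-angle spiral; it is not the engine's `GenerateSamples`, and `Vec3` and `GenerateCosineConeDirections` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical vector type for this sketch.
struct Vec3 { float x, y, z; };

// Returns numCones unit directions over the +Z hemisphere whose density is
// proportional to cos(theta). Illustrative only; not the engine's generator.
std::vector<Vec3> GenerateCosineConeDirections(int numCones)
{
    assert(numCones > 0);
    const float kPi = 3.14159265358979f;
    const float kGoldenAngle = kPi * (3.0f - std::sqrt(5.0f));
    std::vector<Vec3> directions;
    directions.reserve(static_cast<std::size_t>(numCones));
    for (int i = 0; i < numCones; ++i)
    {
        // Stratified u in (0, 1); a disk radius of sqrt(u) gives uniform area density.
        const float u = (static_cast<float>(i) + 0.5f) / static_cast<float>(numCones);
        const float radius = std::sqrt(u);
        const float phi = static_cast<float>(i) * kGoldenAngle;
        // Projecting the disk point up onto the hemisphere yields a cosine
        // distribution: z = sqrt(1 - radius^2) = sqrt(1 - u).
        directions.push_back({ radius * std::cos(phi),
                               radius * std::sin(phi),
                               std::sqrt(std::max(0.0f, 1.0f - u)) });
    }
    return directions;
}
```

With cosine-distributed directions, each cone carries equal weight in the diffuse integral, so the per-texel result is a plain average of the cone radiances rather than a cosine-weighted sum.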

The output is the scene radiosity atlas.

The corresponding compute shader is as follows:

// Engine\Shaders\Private\Lumen\LumenRadiosity.usf

float ProbeOcclusionNormalBias;
// Holds the per-thread lighting results of the thread group; note that it is groupshared.
groupshared float3 ThreadLighting[THREADGROUP_SIZE];

[numthreads(THREADGROUP_SIZE, 1, 1)]
void LumenCardRadiosityTraceBlocksCS(
    uint3 DispatchThreadId : SV_DispatchThreadID,
    uint3 GroupThreadId : SV_GroupThreadID)
{
#if IRRADIANCE_CACHE // Irradiance cache mode
    uint ThreadIndex = DispatchThreadId.x;

    uint GlobalBlockIndex = ThreadIndex / (CARD_TRACE_BLOCK_SIZE * CARD_TRACE_BLOCK_SIZE);

    if (GlobalBlockIndex < CardTraceBlockAllocator[0])
    {
        // Compute the texel index within the block.
        uint TexelIndexInBlock = ThreadIndex % (CARD_TRACE_BLOCK_SIZE * CARD_TRACE_BLOCK_SIZE);
        uint2 TexelOffsetInBlock = uint2(TexelIndexInBlock % CARD_TRACE_BLOCK_SIZE, TexelIndexInBlock / CARD_TRACE_BLOCK_SIZE);

        // Fetch the trace block data.
        uint4 TraceBlockData = CardTraceBlockData[GlobalBlockIndex];
        uint CardId = TraceBlockData.x;
        uint ProbeIndex = TraceBlockData.y;
        uint BlockIndex = TraceBlockData.z;

        // Fetch the card data.
        FLumenCardData CardData = GetLumenCardData(CardId, CardBuffer);

        float2 CardSizeTexels = abs(CardData.LocalExtent.xy * 2 * CardData.LocalPositionToAtlasUVScale * RadiosityAtlasSize);
        uint2 NumBlocksXY = ((uint2)CardSizeTexels + CARD_TRACE_BLOCK_SIZE - 1) / CARD_TRACE_BLOCK_SIZE;
        uint2 BlockOffset = uint2(BlockIndex % NumBlocksXY.x, BlockIndex / NumBlocksXY.x);
        float2 TexelCoord = BlockOffset * CARD_TRACE_BLOCK_SIZE + TexelOffsetInBlock;

        if (all(TexelCoord < CardSizeTexels))
        {
            // Compute the card UV and the atlas UV.
            float2 CardUV = (TexelCoord + .5f) / (float2)CardSizeTexels;
            float2 CardUVToAtlasScale = GetCardUVToAtlasScale(CardData.LocalPositionToAtlasUVScale, CardData.LocalExtent);
            float2 CardUVToAtlasBias = GetCardUVToAtlasBias(CardUVToAtlasScale, CardData.LocalPositionToAtlasUVBias);
            float2 AtlasUV = CardUV * CardUVToAtlasScale + CardUVToAtlasBias;

            float Opacity = Texture2DSampleLevel(CurrentOpacityAtlas, GlobalBilinearClampedSampler, AtlasUV, 0).x;

            float3 DiffuseLighting = 0;

            // Radiosity is only meaningful where the opacity is greater than 0.
            if (Opacity > 0)
            {
                float Depth = 1.0f - Texture2DSampleLevel(DepthAtlas, GlobalBilinearClampedSampler, AtlasUV, 0).x;

                float3 LocalPosition;
                LocalPosition.xy = (AtlasUV - CardData.LocalPositionToAtlasUVBias) / CardData.LocalPositionToAtlasUVScale;
                LocalPosition.z = -CardData.LocalExtent.z + Depth * 2 * CardData.LocalExtent.z;

                // Compute the world-space position and normal.
                float3 WorldPosition = mul(CardData.WorldToLocalRotation, LocalPosition) + CardData.Origin;
                float3 WorldNormal = normalize(Texture2DSampleLevel(NormalAtlas, GlobalBilinearClampedSampler, AtlasUV, 0).xyz * 2 - 1);
                uint ClipmapIndex = GetRadianceProbeClipmap(WorldPosition);

                // Compute the diffuse lighting. If the clipmap is valid, interpolate it from the probes.
                if (ClipmapIndex < NumRadianceProbeClipmaps)
                {
                    float3 BiasOffset = WorldNormal * ProbeOcclusionNormalBias;
                    // Sample through RadianceProbeIndirectionTexture to compute the diffuse lighting.
                    DiffuseLighting = SampleIrradianceCacheInterpolated(WorldPosition, WorldNormal, BiasOffset, ClipmapIndex);
                }
                else // No valid clipmap: compute the diffuse lighting from the sky light's spherical harmonics.
                {
                    DiffuseLighting = GetSkySHDiffuse(WorldNormal) * View.SkyLightColor.rgb;
                }
            }

            // Store the radiosity.
            uint2 AtlasCoord = uint2(AtlasUV * RadiosityAtlasSize);
            RWRadiosityAtlas[AtlasCoord] = float4(DiffuseLighting * PI, 0);
        }
    }
#else // Without the irradiance cache
    ThreadLighting[GroupThreadId.x] = 0;

    uint ThreadIndex = DispatchThreadId.x;
    uint GlobalBlockIndex = ThreadIndex / (CARD_TRACE_BLOCK_SIZE * CARD_TRACE_BLOCK_SIZE * THREADS_PER_RADIOSITY_TEXEL);
    int2 AtlasCoord = -1;

    if (GlobalBlockIndex < CardTraceBlockAllocator[0])
    {
        uint TexelIndexInBlock = (ThreadIndex / THREADS_PER_RADIOSITY_TEXEL) % (CARD_TRACE_BLOCK_SIZE * CARD_TRACE_BLOCK_SIZE);
        uint2 TexelOffsetInBlock = uint2(TexelIndexInBlock % CARD_TRACE_BLOCK_SIZE, TexelIndexInBlock / CARD_TRACE_BLOCK_SIZE);

        uint4 TraceBlockData = CardTraceBlockData[GlobalBlockIndex];
        uint CardId = TraceBlockData.x;
        uint ProbeIndex = TraceBlockData.y;
        uint BlockIndex = TraceBlockData.z;

        FLumenCardData CardData = GetLumenCardData(CardId, CardBuffer);

        float2 CardSizeTexels = abs(CardData.LocalExtent.xy * 2 * CardData.LocalPositionToAtlasUVScale * RadiosityAtlasSize);
        uint2 NumBlocksXY = ((uint2)CardSizeTexels + CARD_TRACE_BLOCK_SIZE - 1) / CARD_TRACE_BLOCK_SIZE;
        uint2 BlockOffset = uint2(BlockIndex % NumBlocksXY.x, BlockIndex / NumBlocksXY.x);
        float2 TexelCoord = BlockOffset * CARD_TRACE_BLOCK_SIZE + TexelOffsetInBlock;

        if (all(TexelCoord < CardSizeTexels))
        {
            uint TraceThreadIndex = ThreadIndex % THREADS_PER_RADIOSITY_TEXEL;

            float2 CardUV = (TexelCoord + .5f) / (float2)CardSizeTexels;
            float2 CardUVToAtlasScale = GetCardUVToAtlasScale(CardData.LocalPositionToAtlasUVScale, CardData.LocalExtent);
            float2 CardUVToAtlasBias = GetCardUVToAtlasBias(CardUVToAtlasScale, CardData.LocalPositionToAtlasUVBias);
            float2 AtlasUV = CardUV * CardUVToAtlasScale + CardUVToAtlasBias;

            uint NumTracesPerThread = NumCones / THREADS_PER_RADIOSITY_TEXEL;
            uint ConeStartIndex = TraceThreadIndex * NumTracesPerThread;
            AtlasCoord = int2(AtlasUV * RadiosityAtlasSize);
            // Trace radiosity from the card texel.
            float3 Lighting = RadiosityTraceFromTexel(AtlasUV, AtlasCoord, ProbeIndex, CardData, ConeStartIndex, ConeStartIndex + NumTracesPerThread);
            ThreadLighting[GroupThreadId.x] = Lighting;
        }
    }

    // Wait for the other threads in the same thread group to finish.
    GroupMemoryBarrierWithGroupSync();

    uint TraceThreadIndex = ThreadIndex % THREADS_PER_RADIOSITY_TEXEL;

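    // Sum the lighting of the threads that share this texel and store it. TraceThreadIndex == 0 means only the first of each texel's THREADS_PER_RADIOSITY_TEXEL threads does this.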
    if (TraceThreadIndex == 0 && all(AtlasCoord >= 0))
    {
        float3 Lighting = 0;

        for (uint OtherThreadIndex = GroupThreadId.x; OtherThreadIndex < GroupThreadId.x + THREADS_PER_RADIOSITY_TEXEL; OtherThreadIndex += 1)
        {
            Lighting += ThreadLighting[OtherThreadIndex];
        }

        RWRadiosityAtlas[AtlasCoord] = float4(Lighting, 0);
    }
#endif
}
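
The index arithmetic in the non irradiance cache branch above is easier to follow on the CPU. The following is a minimal sketch, not engine code: `kBlockSize` and `kThreadsPerTexel` stand in for `CARD_TRACE_BLOCK_SIZE` and `THREADS_PER_RADIOSITY_TEXEL`, and their values (8 and 4) are assumptions chosen only for illustration. `TraceThreadCoords` and `DecomposeThreadIndex` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Assumed values for illustration only; the real constants are shader defines.
constexpr uint32_t kBlockSize = 8;        // stands in for CARD_TRACE_BLOCK_SIZE
constexpr uint32_t kThreadsPerTexel = 4;  // stands in for THREADS_PER_RADIOSITY_TEXEL

struct TraceThreadCoords {
    uint32_t blockIndex;   // which trace block the thread works on (GlobalBlockIndex)
    uint32_t texelX;       // texel column inside the block
    uint32_t texelY;       // texel row inside the block
    uint32_t traceThread;  // which share of the cones this thread traces (TraceThreadIndex)
};

// Mirrors the divisions and modulos applied to DispatchThreadId.x in the shader.
TraceThreadCoords DecomposeThreadIndex(uint32_t threadIndex) {
    const uint32_t texelsPerBlock = kBlockSize * kBlockSize;
    TraceThreadCoords c;
    c.blockIndex = threadIndex / (texelsPerBlock * kThreadsPerTexel);
    const uint32_t texelInBlock = (threadIndex / kThreadsPerTexel) % texelsPerBlock;
    c.texelX = texelInBlock % kBlockSize;
    c.texelY = texelInBlock / kBlockSize;
    c.traceThread = threadIndex % kThreadsPerTexel;
    return c;
}
```

With these assumed constants, consecutive groups of four threads share one texel, and 256 consecutive threads cover one 8x8 block. Each thread then traces `NumCones / THREADS_PER_RADIOSITY_TEXEL` cones, and the thread with `traceThread == 0` sums the partial results.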
           

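As the code shows, radiosity tracing supports two modes: irradiance cache mode and non irradiance cache mode. Irradiance cache mode obtains the radiosity by sampling and interpolating probes located through the 3D RadianceProbeIndirectionTexture. The non irradiance cache mode traces cones from each card texel in real time and sums their results. The latter relies on RadiosityTraceFromTexel, whose logic is as follows: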

float3 RadiosityTraceFromTexel(float2 AtlasUV, int2 AtlasCoord, uint ProbeIndex, FLumenCardData LumenCardData, uint ConeStartIndex, uint ConeEndIndex)
{
    float Opacity = Texture2DSampleLevel(CurrentOpacityAtlas, GlobalBilinearClampedSampler, AtlasUV, 0).x;

    float3 Lighting = 0;

    if (Opacity > 0)
    {
        float Depth = 1.0f - Texture2DSampleLevel(DepthAtlas, GlobalBilinearClampedSampler, AtlasUV, 0).x;

        // Reconstruct the local position.
        float3 LocalPosition;
        LocalPosition.xy = (AtlasUV - LumenCardData.LocalPositionToAtlasUVBias) / LumenCardData.LocalPositionToAtlasUVScale;
        LocalPosition.z = -LumenCardData.LocalExtent.z + Depth * 2 * LumenCardData.LocalExtent.z;

        // World-space position and normal.
        float3 WorldPosition = mul(LumenCardData.WorldToLocalRotation, LocalPosition) + LumenCardData.Origin;
        float3 WorldNormal = normalize(Texture2DSampleLevel(NormalAtlas, GlobalBilinearClampedSampler, AtlasUV, 0).xyz * 2 - 1);

        //@todo - derive bias from texel world size
        WorldPosition += WorldNormal * SurfaceBias;

        // Distance at which the voxel trace starts.
        float VoxelTraceStartDistance = CalculateVoxelTraceStartDistance(MinTraceDistance, MaxTraceDistance, MaxMeshSDFTraceDistance, false);

        // Iterate over the cones in all directions and accumulate their results.
        for (uint ConeIndex = ConeStartIndex; ConeIndex < ConeEndIndex; ConeIndex++)
        {
            //uint ConeIndex = ConeStartIndex;
            float3x3 TangentBasis = GetTangentBasisFrisvad(WorldNormal);

            // Compute the cone direction.
            #define PRECOMPUTED_SAMPLE_DIRECTIONS 1
            #if PRECOMPUTED_SAMPLE_DIRECTIONS // Precomputed directions.
                float3 LocalConeDirection = RadiosityConeDirections[ConeIndex].xyz;
                float3 WorldConeDirection = mul(LocalConeDirection, TangentBasis);
            #else // Not precomputed: generate the direction from a low-discrepancy sequence.
                uint2 Seed0 = Rand3DPCG16(int3(AtlasCoord + 17, 0)).xy;
                float2 E = Hammersley16(ConeIndex, NumCones, Seed0);
                float2 DiskE = UniformSampleDiskConcentric(E.xy);
                float TangentZ = sqrt(1 - length2(DiskE));
                float3 WorldConeDirection = mul(float3(DiskE, TangentZ), TangentBasis);
            #endif

            //@todo - derive bias from texel world size
            // Sample position.
            float3 SamplePosition = WorldPosition + SurfaceBias * WorldConeDirection;

            // Build the cone trace input.
            FConeTraceInput TraceInput;
            TraceInput.Setup(SamplePosition, WorldConeDirection, DiffuseConeHalfAngle, MinSampleRadius, MinTraceDistance, MaxTraceDistance, StepFactor);
            TraceInput.VoxelStepFactor = VoxelStepFactor;
            TraceInput.VoxelTraceStartDistance = VoxelTraceStartDistance;
            TraceInput.SDFStepFactor = 1;

            // Run the cone trace and store the result.
            FConeTraceResult TraceResult;
            ConeTraceVoxels(TraceInput, TraceResult);

            // Evaluate the sky light radiance for the cone.
            EvaluateSkyRadianceForCone(WorldConeDirection, TraceInput.TanConeAngle, TraceResult);

            // Accumulate the sampled lighting.
            Lighting += TraceResult.Lighting;
        }
    }

    // Scale the accumulated result by PI / NumCones so that energy is conserved.
    Lighting *= PI / (float)NumCones;
    return Lighting;
}
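
The final `PI / NumCones` scale in RadiosityTraceFromTexel is what a cosine-weighted Monte Carlo estimator of irradiance produces: when directions are drawn with density cos(theta) / PI, the integral of L * cos(theta) over the hemisphere is estimated by (PI / N) times the sum of the sampled radiance. The sketch below checks this on the CPU. It is a simplified stand-in under stated assumptions: `RadicalInverseBase2`, `ConcentricDisk`, and `EstimateIrradiance` are hypothetical helpers written for this example, not the engine's `Hammersley16` or `UniformSampleDiskConcentric`, and the radiance functions replace the actual cone trace.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

constexpr double kPi = 3.14159265358979323846;

struct Vec2 { double x, y; };

// Van der Corput radical inverse in base 2 (bit reversal).
double RadicalInverseBase2(uint32_t bits) {
    bits = (bits << 16) | (bits >> 16);
    bits = ((bits & 0x55555555u) << 1) | ((bits & 0xAAAAAAAAu) >> 1);
    bits = ((bits & 0x33333333u) << 2) | ((bits & 0xCCCCCCCCu) >> 2);
    bits = ((bits & 0x0F0F0F0Fu) << 4) | ((bits & 0xF0F0F0F0u) >> 4);
    bits = ((bits & 0x00FF00FFu) << 8) | ((bits & 0xFF00FF00u) >> 8);
    return bits * 2.3283064365386963e-10;  // multiply by 1 / 2^32
}

// Shirley-Chiu concentric mapping from the unit square to the unit disk.
Vec2 ConcentricDisk(Vec2 u) {
    const double a = 2.0 * u.x - 1.0;
    const double b = 2.0 * u.y - 1.0;
    if (a == 0.0 && b == 0.0) return {0.0, 0.0};
    double r, phi;
    if (std::fabs(a) > std::fabs(b)) { r = a; phi = (kPi / 4.0) * (b / a); }
    else                             { r = b; phi = (kPi / 2.0) - (kPi / 4.0) * (a / b); }
    return {r * std::cos(phi), r * std::sin(phi)};
}

// Example radiance fields in tangent space (z is the surface normal).
double ConstantRadiance(double, double, double) { return 1.0; }
double CosineRadiance(double, double, double z) { return z; }

// Cosine-weighted estimate of irradiance: (PI / N) * sum of radiance samples.
double EstimateIrradiance(uint32_t numCones, double (*radiance)(double, double, double)) {
    double sum = 0.0;
    for (uint32_t i = 0; i < numCones; ++i) {
        const Vec2 e = {(i + 0.5) / numCones, RadicalInverseBase2(i)};
        const Vec2 d = ConcentricDisk(e);
        // Lifting a uniform disk sample onto the hemisphere gives a cosine-weighted direction.
        const double z = std::sqrt(std::fmax(0.0, 1.0 - (d.x * d.x + d.y * d.y)));
        sum += radiance(d.x, d.y, z);
    }
    return sum * kPi / numCones;
}
```

For constant unit radiance the estimate equals PI regardless of the sample count, and for radiance equal to cos(theta) it converges to 2 * PI / 3.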
           

The cone tracing entry point used above, ConeTraceVoxels, is the approach described in 6.5.6.1 Voxel Cone Tracing. Its code is as follows:

// Engine\Shaders\Private\Lumen\LumenTracingCommon.ush

void ConeTraceVoxels(
    FConeTraceInput TraceInput,
    inout FConeTraceResult OutResult)
{
    FGlobalSDFTraceResult SDFTraceResult;

    // Trace the SDF ray.
    {
        FGlobalSDFTraceInput SDFTraceInput = SetupGlobalSDFTraceInput(TraceInput.ConeOrigin, TraceInput.ConeDirection, TraceInput.MinTraceDistance, TraceInput.MaxTraceDistance, TraceInput.SDFStepFactor, TraceInput.VoxelStepFactor);
        SDFTraceInput.bExpandSurfaceUsingRayTimeInsteadOfMaxDistance = TraceInput.bExpandSurfaceUsingRayTimeInsteadOfMaxDistance;
        SDFTraceInput.InitialMaxDistance = TraceInput.InitialMaxDistance;

        // Trace the global distance field.
        SDFTraceResult = RayTraceGlobalDistanceField(SDFTraceInput);
    }

    float4 LightingAndAlpha = float4(0, 0, 0, 1);

    // The logic below runs only when the global distance field trace hits.
    if (GlobalSDFTraceResultIsHit(SDFTraceResult))
    {
        float3 SampleWorldPosition = TraceInput.ConeOrigin + TraceInput.ConeDirection * SDFTraceResult.HitTime;

        uint VoxelClipmapIndex = 0;
        float3 VoxelClipmapCenter = ClipmapWorldCenter[VoxelClipmapIndex].xyz;
        float3 VoxelClipmapExtent = ClipmapWorldSamplingExtent[VoxelClipmapIndex].xyz;

        bool bOutsideValidRegion = any(SampleWorldPosition > VoxelClipmapCenter + VoxelClipmapExtent || SampleWorldPosition < VoxelClipmapCenter - VoxelClipmapExtent);

        // Find the first voxel clipmap whose sampling extent contains the hit position.
        while (bOutsideValidRegion && VoxelClipmapIndex + 1 < NumClipmapLevels)
        {
            VoxelClipmapIndex++;
            VoxelClipmapCenter = ClipmapWorldCenter[VoxelClipmapIndex].xyz;
            VoxelClipmapExtent = ClipmapWorldSamplingExtent[VoxelClipmapIndex].xyz;
            bOutsideValidRegion = any(SampleWorldPosition > VoxelClipmapCenter + VoxelClipmapExtent || SampleWorldPosition < VoxelClipmapCenter - VoxelClipmapExtent);
        }

        LightingAndAlpha.xyzw = 0.0f;

        // If the position is inside the valid region, compute the voxel lighting.
        if (!bOutsideValidRegion)
        {
            float3 DistanceFieldGradient = -TraceInput.ConeDirection;

            float3 ClipmapVolumeUV = ComputeGlobalUV(SampleWorldPosition, SDFTraceResult.HitClipmapIndex);
            uint PageIndex = GetGlobalDistanceFieldPage(ClipmapVolumeUV, SDFTraceResult.HitClipmapIndex);

            if (PageIndex < GLOBAL_DISTANCE_FIELD_INVALID_PAGE_ID)
            {
                float3 PageUV = ComputeGlobalDistanceFieldPageUV(ClipmapVolumeUV, PageIndex);
                DistanceFieldGradient = GlobalDistanceFieldPageCentralDiff(PageUV);
            }

            float DistanceFieldGradientLength = length(DistanceFieldGradient);
            float3 SampleNormal = DistanceFieldGradientLength > 0.001 ? DistanceFieldGradient / DistanceFieldGradientLength : -TraceInput.ConeDirection;

            // Sample the VoxelLighting 3D texture to get the lighting.
            float4 StepLighting = SampleVoxelLighting(SampleWorldPosition, -SampleNormal, VoxelClipmapIndex);

            StepLighting.xyz = StepLighting.xyz * (1.0f / max(StepLighting.w, 0.1));

            // Compute the self-occlusion factor.
            float VoxelSelfLightingBias = 1.0f;
            if (TraceInput.bExpandSurfaceUsingRayTimeInsteadOfMaxDistance)
            {
                // For diffuse rays it is better to over-occlude than to leak light.
                VoxelSelfLightingBias = smoothstep(1.5 * ClipmapVoxelSizeAndRadius[VoxelClipmapIndex].w, 2.0 * ClipmapVoxelSizeAndRadius[VoxelClipmapIndex].w, SDFTraceResult.HitTime);
            }

            // Lighting after applying self-occlusion.
            LightingAndAlpha.xyz = StepLighting.xyz * VoxelSelfLightingBias;
        }
    }

    // Fade the lighting based on opacity.
    LightingAndAlpha = FadeOutVoxelConeTraceMinTransparency(LightingAndAlpha);

    // Store the result.
    OutResult = (FConeTraceResult)0;
    #if !VISIBILITY_ONLY_TRACE
        OutResult.Lighting = LightingAndAlpha.rgb;
    #endif
    OutResult.Transparency = LightingAndAlpha.a;
    OutResult.NumSteps = SDFTraceResult.TotalStepsTaken;
    OutResult.OpaqueHitDistance = GlobalSDFTraceResultIsHit(SDFTraceResult) ? SDFTraceResult.HitTime : TraceInput.MaxTraceDistance;
}
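
The clipmap lookup inside ConeTraceVoxels walks from the finest level outward until the hit position falls inside a level's sampling extent. A minimal CPU sketch of that loop follows. `Vec3`, `Clipmap`, `ClipmapPick`, and `PickClipmap` are hypothetical types and names for illustration; in the shader the centers and extents come from `ClipmapWorldCenter` and `ClipmapWorldSamplingExtent`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// One clipmap level: an axis-aligned box given by its center and half extent.
struct Clipmap { Vec3 center; Vec3 extent; };

// True when the point lies outside the level's box on any axis.
bool IsOutside(const Vec3& p, const Clipmap& c) {
    return p.x > c.center.x + c.extent.x || p.x < c.center.x - c.extent.x ||
           p.y > c.center.y + c.extent.y || p.y < c.center.y - c.extent.y ||
           p.z > c.center.z + c.extent.z || p.z < c.center.z - c.extent.z;
}

struct ClipmapPick {
    std::size_t index;  // level that was selected (the last level if none contains the point)
    bool outsideAll;    // true when no level contains the point, so lighting stays zero
};

// Precondition: clipmaps is non-empty and ordered from finest to coarsest.
ClipmapPick PickClipmap(const Vec3& p, const std::vector<Clipmap>& clipmaps) {
    std::size_t index = 0;
    bool outside = IsOutside(p, clipmaps[index]);
    while (outside && index + 1 < clipmaps.size()) {
        ++index;
        outside = IsOutside(p, clipmaps[index]);
    }
    return {index, outside};
}
```

Picking the finest level that contains the point gives the highest voxel resolution available at that position, and the `outsideAll` flag corresponds to `bOutsideValidRegion` remaining true after the loop.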
           

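The cone tracing above samples the VoxelLighting 3D texture, which is also a clipmap. In the data I captured, its dimensions are 64x256x384. Most slices are black, and only a few contain lit pixels, confined to small regions:

(Figure: slices of the VoxelLighting 3D texture.)

CombineLumenSceneLighting combines the lighting. Its logic is as follows: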

void CombineLumenSceneLighting(
    FScene* Scene, 
    FViewInfo& View,
    FRDGBuilder& GraphBuilder,
    TRDGUniformBufferRef<FLumenCardScene> LumenCardSceneUniformBuffer,
    FRDGTextureRef FinalLightingAtlas, 
    FRDGTextureRef OpacityAtlas, 
    FRDGTextureRef RadiosityAtlas, 
    FGlobalShaderMap* GlobalShaderMap,
    const FLumenCardScatterContext& VisibleCardScatterContext)
{
    LLM_SCOPE_BYTAG(Lumen);

    FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;

    {
        FLumenCardLightingEmissive* PassParameters = GraphBuilder.AllocParameters<FLumenCardLightingEmissive>();
        
        extern int32 GLumenRadiosityDownsampleFactor;
        FVector2D CardUVSamplingOffset = FVector2D::ZeroVector;
        if (GLumenRadiosityDownsampleFactor > 1)
        {
            // Offset bilinear samples in order to not sample outside of the lower res radiosity card bounds
            CardUVSamplingOffset.X = (GLumenRadiosityDownsampleFactor * 0.25f) / LumenSceneData.MaxAtlasSize.X;
            CardUVSamplingOffset.Y = (GLumenRadiosityDownsampleFactor * 0.25f) / LumenSceneData.MaxAtlasSize.Y;
        }

        PassParameters->RenderTargets[0] = FRenderTargetBinding(FinalLightingAtlas, ERenderTargetLoadAction::ENoAction);
        PassParameters->VS.LumenCardScene = LumenCardSceneUniformBuffer;
        PassParameters->VS.CardScatterParameters = VisibleCardScatterContext.Parameters;
        PassParameters->VS.ScatterInstanceIndex = 0;
        PassParameters->VS.CardUVSamplingOffset = CardUVSamplingOffset;
        PassParameters->PS.View = View.ViewUniformBuffer;
        PassParameters->PS.LumenCardScene = LumenCardSceneUniformBuffer;
        PassParameters->PS.RadiosityAtlas = RadiosityAtlas;
        PassParameters->PS.OpacityAtlas = OpacityAtlas;

        // Add the lighting combine pass, which uses the traditional rasterization pipeline.
        GraphBuilder.AddPass(
            RDG_EVENT_NAME("LightingCombine"),
            PassParameters,
            ERDGPassFlags::Raster,
            [MaxAtlasSize = Scene->LumenSceneData->MaxAtlasSize, PassParameters, GlobalShaderMap](FRHICommandListImmediate& RHICmdList)
        {
            FLumenCardLightingInitializePS::FPermutationDomain PermutationVector;
            auto PixelShader = GlobalShaderMap->GetShader< FLumenCardLightingInitializePS >(PermutationVector);

            DrawQuadsToAtlas(MaxAtlasSize, PixelShader, PassParameters, GlobalShaderMap, TStaticBlendState<>::GetRHI(), RHICmdList);
        });
    }
}
           

This stage takes the scene radiosity atlas from the previous section as input and writes the radiosity color to SceneFinalLighting.
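
One detail in the function above is the UV offset applied when the radiosity atlas is downsampled. The sketch below restates that computation in isolation. `Vec2f` and `ComputeCardUVSamplingOffset` are hypothetical names for this example; the parameters correspond to `GLumenRadiosityDownsampleFactor` and `LumenSceneData.MaxAtlasSize`.

```cpp
#include <cassert>

struct Vec2f { float x, y; };

// Returns the UV offset used to keep bilinear samples inside the
// lower-resolution radiosity card bounds. With no downsampling the offset is zero.
Vec2f ComputeCardUVSamplingOffset(int downsampleFactor, int atlasWidth, int atlasHeight) {
    Vec2f offset{0.0f, 0.0f};
    if (downsampleFactor > 1) {
        // A quarter of one low-resolution texel, expressed in full-resolution atlas UV units.
        offset.x = (downsampleFactor * 0.25f) / atlasWidth;
        offset.y = (downsampleFactor * 0.25f) / atlasHeight;
    }
    return offset;
}
```

For example, a downsample factor of 2 on a 4096x2048 atlas gives an offset of half a full-resolution texel on each axis.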

RenderDirectLightingForLumenScene computes the direct lighting of the Lumen scene. Its flow resembles that of traditional lighting passes:

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenSceneDirectLighting.cpp

void FDeferredShadingSceneRenderer::RenderDirectLightingForLumenScene(
    FRDGBuilder& GraphBuilder,
    TRDGUniformBufferRef<FLumenCardScene> LumenCardSceneUniformBuffer,
    FRDGTextureRef FinalLightingAtlas,
    FRDGTextureRef OpacityAtlas,
    FGlobalShaderMap* GlobalShaderMap,
    const FLumenCardScatterContext& VisibleCardScatterContext)
{
    LLM_SCOPE_BYTAG(Lumen);

    if (GLumenDirectLighting)
    {
        RDG_EVENT_SCOPE(GraphBuilder, "DirectLighting");
        QUICK_SCOPE_CYCLE_COUNTER(RenderDirectLightingForLumenScene);

        const FViewInfo& MainView = Views[0];
        FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;
        const bool bLumenUseHardwareRayTracedShadow = Lumen::UseHardwareRayTracedShadows(MainView);
        FLumenDirectLightingHardwareRayTracingData LumenDirectLightingHardwareRayTracingData;
        
        if(bLumenUseHardwareRayTracedShadow)
        {
            LumenDirectLightingHardwareRayTracingData.Initialize(GraphBuilder, Scene);
        }

        TArray<const FLightSceneInfo*, TInlineAllocator<64>> GatheredLocalLights;

        // Iterate over all lights in the scene.
        for (TSparseArray<FLightSceneInfoCompact>::TConstIterator LightIt(Scene->Lights); LightIt; ++LightIt)
        {
            const FLightSceneInfoCompact& LightSceneInfoCompact = *LightIt;
            const FLightSceneInfo* LightSceneInfo = LightSceneInfoCompact.LightSceneInfo;

            if (LightSceneInfo->ShouldRenderLightViewIndependent()
                && LightSceneInfo->ShouldRenderLight(MainView, true)
                && LightSceneInfo->Proxy->GetIndirectLightingScale() > 0.0f)
            {
                const ELightComponentType LightType = (ELightComponentType)LightSceneInfo->Proxy->GetLightType();

                // Directional light
                if (LightType == LightType_Directional)
                {
                    // No culling needed; draw directly.

                    FString LightNameWithLevel;
                    FSceneRenderer::GetLightNameForDrawEvent(LightSceneInfo->Proxy, LightNameWithLevel);

                    // Render the direct light into the Lumen cards.
                    RenderDirectLightIntoLumenCards(
                        GraphBuilder,
                        Scene,
                        MainView,
                        ViewFamily.EngineShowFlags,
                        VisibleLightInfos,
                        LumenCardSceneUniformBuffer,
                        FinalLightingAtlas,
                        OpacityAtlas,
                        LightSceneInfo,
                        LightNameWithLevel,
                        VisibleCardScatterContext,
                        0,
                        LumenDirectLightingHardwareRayTracingData,
                        VirtualShadowMapArray);
                }
                else // Local (non-directional) light: collect it into GatheredLocalLights.
                {
                    GatheredLocalLights.Add(LightSceneInfo);
                }
            }
        }

        const int32 LightBatchSize = FMath::Clamp(GLumenDirectLightingBatchSize, 1, 256);

        // Cull and draw the local lights in batches.
        for (int32 LightBatchIndex = 0; LightBatchIndex * LightBatchSize < GatheredLocalLights.Num(); ++LightBatchIndex)
        {
            const int32 FirstLightIndex = LightBatchIndex * LightBatchSize;
            const int32 LastLightIndex = FMath::Min((LightBatchIndex + 1) * LightBatchSize, GatheredLocalLights.Num());

            FLumenCardScatterContext CardScatterContext;

            {
                RDG_EVENT_SCOPE(GraphBuilder, "Cull Cards %d Lights", LastLightIndex - FirstLightIndex);

                // Initialize the context.
                CardScatterContext.Init(
                    GraphBuilder,
                    MainView,
                    LumenSceneData,
                    LumenCardRenderer,
                    ECullCardsMode::OperateOnSceneForceUpdateForCardsToRender,
                    LightBatchSize);

                // Cull the cards against each light's shape.
                for (int32 LightIndex = FirstLightIndex; LightIndex < LastLightIndex; ++LightIndex)
                {
                    const int32 ScatterInstanceIndex = LightIndex - FirstLightIndex;
                    const FLightSceneInfo* LightSceneInfo = GatheredLocalLights[LightIndex];
                    const ELightComponentType LightType = (ELightComponentType)LightSceneInfo->Proxy->GetLightType();
                    const FSphere LightBounds = LightSceneInfo->Proxy->GetBoundingSphere();

                    ECullCardsShapeType ShapeType = ECullCardsShapeType::None;

                    if (LightType == LightType_Point)
                    {
                        ShapeType = ECullCardsShapeType::PointLight;
                    }
                    else if (LightType == LightType_Spot)
                    {
                        ShapeType = ECullCardsShapeType::SpotLight;
                    }
                    else if (LightType == LightType_Rect)
                    {
                        ShapeType = ECullCardsShapeType::RectLight;
                    }
                    else
                    {
                        ensureMsgf(false, TEXT("Need Lumen card culling for new light type"));
                    }

                    FCullCardsShapeParameters ShapeParameters;
                    ShapeParameters.InfluenceSphere = FVector4(LightBounds.Center, LightBounds.W);
                    ShapeParameters.LightPosition = LightSceneInfo->Proxy->GetPosition();
                    ShapeParameters.LightDirection = LightSceneInfo->Proxy->GetDirection();
                    ShapeParameters.LightRadius = LightSceneInfo->Proxy->GetRadius();
                    ShapeParameters.CosConeAngle = FMath::Cos(LightSceneInfo->Proxy->GetOuterConeAngle());
                    ShapeParameters.SinConeAngle = FMath::Sin(LightSceneInfo->Proxy->GetOuterConeAngle());

                    // Cull cards by the light's shape.
                    CardScatterContext.CullCardsToShape(
                        GraphBuilder,
                        MainView,
                        LumenSceneData,
                        LumenCardRenderer,
                        LumenCardSceneUniformBuffer,
                        ShapeType,
                        ShapeParameters,
                        GLumenSceneCardDirectLightingUpdateFrequencyScale,
                        ScatterInstanceIndex);
                }

                // Build the indirect draw arguments for the scatter pass.
                CardScatterContext.BuildScatterIndirectArgs(
                    GraphBuilder,
                    MainView);
            }

            // Draw the local (non-directional) lights.
            {
                RDG_EVENT_SCOPE(GraphBuilder, "Draw %d Lights", LastLightIndex - FirstLightIndex);

                for (int32 LightIndex = FirstLightIndex; LightIndex < LastLightIndex; ++LightIndex)
                {
                    const int32 ScatterInstanceIndex = LightIndex - FirstLightIndex;
                    const FLightSceneInfo* LightSceneInfo = GatheredLocalLights[LightIndex];

                    FString LightNameWithLevel;
                    FSceneRenderer::GetLightNameForDrawEvent(LightSceneInfo->Proxy, LightNameWithLevel);

                    // Draw the local light into the Lumen cards.
                    RenderDirectLightIntoLumenCards(
                        GraphBuilder,
                        Scene,
                        MainView,
                        ViewFamily.EngineShowFlags,
                        VisibleLightInfos,
                        LumenCardSceneUniformBuffer,
                        FinalLightingAtlas,
                        OpacityAtlas,
                        LightSceneInfo,
                        LightNameWithLevel,
                        CardScatterContext,
                        ScatterInstanceIndex,
                        LumenDirectLightingHardwareRayTracingData,
                        VirtualShadowMapArray);
                }
            }
        }
    }
}
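
The batching of local lights in the function above can be isolated as follows. This is a minimal sketch, not engine code: `SplitIntoBatches` is a hypothetical helper, and `requestedBatchSize` stands in for `GLumenDirectLightingBatchSize`. Each returned pair is a half-open range `[first, last)` of indices into the gathered light list.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Splits numLights lights into consecutive half-open ranges [first, last).
// The batch size is clamped to [1, 256], as in the loop above.
std::vector<std::pair<int, int>> SplitIntoBatches(int numLights, int requestedBatchSize) {
    const int batchSize = std::max(1, std::min(requestedBatchSize, 256));
    std::vector<std::pair<int, int>> batches;
    for (int batchIndex = 0; batchIndex * batchSize < numLights; ++batchIndex) {
        const int first = batchIndex * batchSize;
        const int last = std::min((batchIndex + 1) * batchSize, numLights);
        batches.emplace_back(first, last);
    }
    return batches;
}
```

Within a batch, the index `LightIndex - first` plays the role of `ScatterInstanceIndex`, so one card scatter context serves every light of that batch.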
           

Below is the code of RenderDirectLightIntoLumenCards, which draws a single light:

void RenderDirectLightIntoLumenCards(
    FRDGBuilder& GraphBuilder,
    const FScene* Scene,
    const FViewInfo& View,
    const FEngineShowFlags& EngineShowFlags,
    TArray<FVisibleLightInfo, SceneRenderingAllocator>& VisibleLightInfos,
    TRDGUniformBufferRef<FLumenCardScene> LumenCardSceneUniformBuffer,
    FRDGTextureRef FinalLightingAtlas,
    FRDGTextureRef OpacityAtlas,
    const FLightSceneInfo* LightSceneInfo,
    const FString& LightName,
    const FLumenCardScatterContext& CardScatterContext,
    int32 ScatterInstanceIndex,
    FLumenDirectLightingHardwareRayTracingData& LumenDirectLightingHardwareRayTracingData,
    const FVirtualShadowMapArray& VirtualShadowMapArray)
{
    FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;
    const FSphere LightBounds = LightSceneInfo->Proxy->GetBoundingSphere();
    const ELightComponentType LightType = (ELightComponentType)LightSceneInfo->Proxy->GetLightType();
    bool bShadowed = LightSceneInfo->Proxy->CastsDynamicShadow();

    // Convert the light type.
    ELumenLightType LumenLightType = ELumenLightType::MAX;
    {
        switch (LightType)
        {
        case LightType_Directional: LumenLightType = ELumenLightType::Directional;    break;
        case LightType_Point:        LumenLightType = ELumenLightType::Point;        break;
        case LightType_Spot:        LumenLightType = ELumenLightType::Spot;            break;
        case LightType_Rect:        LumenLightType = ELumenLightType::Rect;            break;
        }
        check(LumenLightType != ELumenLightType::MAX);
    }

    // Set up the shadow information.
    FVisibleLightInfo& VisibleLightInfo = VisibleLightInfos[LightSceneInfo->Id];
    FLumenShadowSetup ShadowSetup = GetShadowForLumenDirectLighting(VisibleLightInfo);

    const bool bDynamicallyShadowed = ShadowSetup.DenseShadowMap != nullptr;

    FDistanceFieldObjectBufferParameters ObjectBufferParameters = DistanceField::SetupObjectBufferParameters(Scene->DistanceFieldSceneData);

    FLightTileIntersectionParameters LightTileIntersectionParameters;
    FDistanceFieldCulledObjectBufferParameters CulledObjectBufferParameters;
    FMatrix WorldToMeshSDFShadowValue = FMatrix::Identity;

    const bool bLumenUseHardwareRayTracedShadow = Lumen::UseHardwareRayTracedShadows(View) && bShadowed;
    const bool bTraceMeshSDFs = bShadowed 
        && LumenLightType == ELumenLightType::Directional 
        && DoesPlatformSupportDistanceFieldShadowing(View.GetShaderPlatform())
        && GLumenDirectLightingOffscreenShadowingTraceMeshSDFs != 0
        && Lumen::UseMeshSDFTracing()
        && ObjectBufferParameters.NumSceneObjects > 0;

    // Resolve the virtual shadow map ID.
    int32 VirtualShadowMapId = -1;
    if (bDynamicallyShadowed
        && !bLumenUseHardwareRayTracedShadow
        && GLumenDirectLightingVirtualShadowMap != 0
        && VirtualShadowMapArray.IsAllocated())
    {
        if (LightType == LightType_Directional)
        {
            VirtualShadowMapId = VisibleLightInfo.VirtualShadowMapClipmaps[0]->GetVirtualShadowMap()->ID;
        }
        else if (ShadowSetup.VirtualShadowMap)
        {
            VirtualShadowMapId = ShadowSetup.VirtualShadowMap->VirtualShadowMaps[0]->ID;
        }
    }

    const bool bUseVirtualShadowMap = VirtualShadowMapId >= 0;
    if (!bUseVirtualShadowMap)
    {
        // Fallback to a complete shadow map
        ShadowSetup.VirtualShadowMap = nullptr;
        ShadowSetup.DenseShadowMap = GetShadowForInjectionIntoVolumetricFog(VisibleLightInfo);
    }

    if (bLumenUseHardwareRayTracedShadow)
    {
        RenderHardwareRayTracedShadowIntoLumenCards(
            GraphBuilder, Scene, View, LumenCardSceneUniformBuffer, OpacityAtlas, 
            LightSceneInfo, LightName, CardScatterContext, ScatterInstanceIndex,
            LumenDirectLightingHardwareRayTracingData, bDynamicallyShadowed, LumenLightType);
    }
    else if (bTraceMeshSDFs)
    {
        CullMeshSDFsForLightCards(GraphBuilder, Scene, View, LightSceneInfo, ObjectBufferParameters, WorldToMeshSDFShadowValue, CulledObjectBufferParameters, LightTileIntersectionParameters);
    }

    FLumenCardDirectLighting* PassParameters = GraphBuilder.AllocParameters<FLumenCardDirectLighting>();
    {
        PassParameters->RenderTargets[0] = FRenderTargetBinding(FinalLightingAtlas, ERenderTargetLoadAction::ELoad);
        PassParameters->VS.InfluenceSphere = FVector4(LightBounds.Center, LightBounds.W);
        PassParameters->VS.LumenCardScene = LumenCardSceneUniformBuffer;
        PassParameters->VS.CardScatterParameters = CardScatterContext.Parameters;
        PassParameters->VS.ScatterInstanceIndex = ScatterInstanceIndex;
        PassParameters->VS.CardUVSamplingOffset = FVector2D::ZeroVector;

        // Get the volume shadowing shader parameters.
        GetVolumeShadowingShaderParameters(
            GraphBuilder,
            View,
            LightSceneInfo,
            ShadowSetup.DenseShadowMap,
            0,
            bDynamicallyShadowed,
            PassParameters->PS.VolumeShadowingShaderParameters);

        // Deferred light uniform data.
        FDeferredLightUniformStruct DeferredLightUniforms = GetDeferredLightParameters(View, *LightSceneInfo);

        if (LightSceneInfo->Proxy->IsInverseSquared())
        {
            DeferredLightUniforms.LightParameters.FalloffExponent = 0;
        }

        PassParameters->PS.View = View.ViewUniformBuffer;
        PassParameters->PS.LumenCardScene = LumenCardSceneUniformBuffer;
        PassParameters->PS.OpacityAtlas = OpacityAtlas;
        DeferredLightUniforms.LightParameters.Color *= LightSceneInfo->Proxy->GetIndirectLightingScale();
        PassParameters->PS.DeferredLightUniforms = CreateUniformBufferImmediate(DeferredLightUniforms, UniformBuffer_SingleDraw);
        PassParameters->PS.ForwardLightData = View.ForwardLightingResources->ForwardLightDataUniformBuffer;
        SetupLightFunctionParameters(LightSceneInfo, 1.0f, PassParameters->PS.LightFunctionParameters);

        PassParameters->PS.VirtualShadowMapId = VirtualShadowMapId;
        if (bUseVirtualShadowMap)
        {
            PassParameters->PS.VirtualShadowMapSamplingParameters = VirtualShadowMapArray.GetSamplingParameters(GraphBuilder);
        }
        
        PassParameters->PS.ObjectBufferParameters = ObjectBufferParameters;
        PassParameters->PS.CulledObjectBufferParameters = CulledObjectBufferParameters;
        PassParameters->PS.LightTileIntersectionParameters = LightTileIntersectionParameters;

        FDistanceFieldAtlasParameters DistanceFieldAtlasParameters = DistanceField::SetupAtlasParameters(Scene->DistanceFieldSceneData);

        // Distance field atlas
        PassParameters->PS.DistanceFieldAtlasParameters = DistanceFieldAtlasParameters;
        PassParameters->PS.WorldToShadow = WorldToMeshSDFShadowValue;
        extern float GTwoSidedMeshDistanceBias;
        PassParameters->PS.TwoSidedMeshDistanceBias = GTwoSidedMeshDistanceBias;

        PassParameters->PS.TanLightSourceAngle = FMath::Tan(LightSceneInfo->Proxy->GetLightSourceAngle());
        PassParameters->PS.MaxTraceDistance = GOffscreenShadowingMaxTraceDistance;
        PassParameters->PS.StepFactor = FMath::Clamp(GOffscreenShadowingTraceStepFactor, .1f, 10.0f);
        PassParameters->PS.SurfaceBias = FMath::Clamp(GShadowingSurfaceBias, .01f, 100.0f);
        PassParameters->PS.SlopeScaledSurfaceBias = FMath::Clamp(GShadowingSlopeScaledSurfaceBias, .01f, 100.0f);
        PassParameters->PS.SDFSurfaceBiasScale = FMath::Clamp(GOffscreenShadowingSDFSurfaceBiasScale, .01f, 100.0f);
        PassParameters->PS.VirtualShadowMapSurfaceBias = FMath::Clamp(GLumenDirectLightingVirtualShadowMapBias, .01f, 100.0f);
        PassParameters->PS.ForceOffscreenShadowing = GLumenDirectLightingForceOffscreenShadowing;

        if (bLumenUseHardwareRayTracedShadow)
        {
            PassParameters->PS.ShadowMaskAtlas = LumenDirectLightingHardwareRayTracingData.ShadowMaskAtlas;
        }

        // IES
        {
            FTexture* IESTextureResource = LightSceneInfo->Proxy->GetIESTextureResource();

            if (View.Family->EngineShowFlags.TexturedLightProfiles && IESTextureResource)
            {
                PassParameters->PS.UseIESProfile = 1;
                PassParameters->PS.IESTexture = IESTextureResource->TextureRHI;
            }
            else
            {
                PassParameters->PS.UseIESProfile = 0;
                PassParameters->PS.IESTexture = GWhiteTexture->TextureRHI;
            }

            PassParameters->PS.IESTextureSampler = TStaticSamplerState<SF_Bilinear,AM_Clamp,AM_Clamp,AM_Clamp>::GetRHI();
        }
    }

    FRasterizeToCardsVS::FPermutationDomain VSPermutationVector;
    VSPermutationVector.Set< FRasterizeToCardsVS::FClampToInfluenceSphere >(LightType != LightType_Directional);
    auto VertexShader = View.ShaderMap->GetShader<FRasterizeToCardsVS>(VSPermutationVector);
    const FMaterialRenderProxy* LightFunctionMaterialProxy = LightSceneInfo->Proxy->GetLightFunctionMaterial();
    bool bUseLightFunction = true;

    if (!LightFunctionMaterialProxy
        || !LightFunctionMaterialProxy->GetIncompleteMaterialWithFallback(Scene->GetFeatureLevel()).IsLightFunction()
        || !EngineShowFlags.LightFunctions)
    {
        bUseLightFunction = false;
        LightFunctionMaterialProxy = UMaterial::GetDefaultMaterial(MD_LightFunction)->GetRenderProxy();
    }

    const bool bUseCloudTransmittance = SetupLightCloudTransmittanceParameters(Scene, View, GLumenDirectLightingCloudTransmittance != 0 ? LightSceneInfo : nullptr, PassParameters->PS.LightCloudTransmittanceParameters);

    // Set up the shader permutation.
    FLumenCardDirectLightingPS::FPermutationDomain PermutationVector;
    PermutationVector.Set< FLumenCardDirectLightingPS::FLightType >(LumenLightType);
    PermutationVector.Set< FLumenCardDirectLightingPS::FDynamicallyShadowed >(bDynamicallyShadowed);
    PermutationVector.Set< FLumenCardDirectLightingPS::FShadowed >(bShadowed);
    PermutationVector.Set< FLumenCardDirectLightingPS::FTraceMeshSDFs >(bTraceMeshSDFs);
    PermutationVector.Set< FLumenCardDirectLightingPS::FVirtualShadowMap >(bUseVirtualShadowMap);
    PermutationVector.Set< FLumenCardDirectLightingPS::FLightFunction >(bUseLightFunction);
    PermutationVector.Set< FLumenCardDirectLightingPS::FRayTracingShadowPassCombine>(bLumenUseHardwareRayTracedShadow);
    PermutationVector.Set< FLumenCardDirectLightingPS::FCloudTransmittance >(bUseCloudTransmittance);
    
    PermutationVector = FLumenCardDirectLightingPS::RemapPermutation(PermutationVector);

    const FMaterial& Material = LightFunctionMaterialProxy->GetMaterialWithFallback(Scene->GetFeatureLevel(), LightFunctionMaterialProxy);
    const FMaterialShaderMap* MaterialShaderMap = Material.GetRenderingThreadShaderMap();
    auto PixelShader = MaterialShaderMap->GetShader<FLumenCardDirectLightingPS>(PermutationVector);

    ClearUnusedGraphResources(PixelShader, &PassParameters->PS);

    const uint32 CardIndirectArgOffset = CardScatterContext.GetIndirectArgOffset(ScatterInstanceIndex);

    // Lighting draw pass.
    GraphBuilder.AddPass(
        RDG_EVENT_NAME("%s %s", *LightName, bDynamicallyShadowed ? TEXT("Shadowmap") : TEXT("")),
        PassParameters,
        ERDGPassFlags::Raster,
        [MaxAtlasSize = LumenSceneData.MaxAtlasSize, PassParameters, LightSceneInfo, VertexShader, PixelShader, GlobalShaderMap = View.ShaderMap, LightFunctionMaterialProxy, &Material, &View, CardIndirectArgOffset](FRHICommandListImmediate& RHICmdList)
        {
            DrawQuadsToAtlas(
                MaxAtlasSize,
                VertexShader,
                PixelShader,
                PassParameters,
                GlobalShaderMap,
                TStaticBlendState<CW_RGBA, BO_Add, BF_One, BF_One>::GetRHI(),
                RHICmdList,
                [LightFunctionMaterialProxy, &Material, &View](FRHICommandListImmediate& RHICmdList, TShaderRefBase<FLumenCardDirectLightingPS, FShaderMapPointerTable> Shader, FRHIPixelShader* ShaderRHI, const FLumenCardDirectLightingPS::FParameters& Parameters)
                {
                    Shader->SetParameters(RHICmdList, ShaderRHI, LightFunctionMaterialProxy, Material, View);
                },
                CardIndirectArgOffset);
        });
}
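The blend state passed to DrawQuadsToAtlas above, `TStaticBlendState<CW_RGBA, BO_Add, BF_One, BF_One>`, means each light's result is added to what is already in the atlas instead of overwriting it. A minimal CPU-side sketch of that accumulation follows; the `Rgb` type and `BlendAdditive` are illustrative names introduced here, not engine code.

```cpp
#include <array>

// RGB value stored in one atlas texel (illustrative type, not an engine type).
using Rgb = std::array<float, 3>;

// Equivalent of the blend state <BO_Add, BF_One, BF_One>:
// Result = Src * 1 + Dst * 1.
inline Rgb BlendAdditive(const Rgb& Dst, const Rgb& Src)
{
    return { Dst[0] + Src[0], Dst[1] + Src[1], Dst[2] + Src[2] };
}
```

Calling `BlendAdditive` once per light on the same texel mirrors how every light pass contributes its irradiance to the shared atlas.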
           

A frame capture of the direct lighting pass shows the following flow:

[Figure: frame capture of the direct lighting passes]

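The input textures of the lighting computation vary with the light type. Every light type takes depth, normal, opacity and similar data. Local lights (non-directional) additionally take the distance-field textures and a 16x16x16 Perlin noise 3D texture, whereas directional lights take a 128x128x128 3D volume texture (the figure below shows slice 0 magnified 4x):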

[Figure: slice 0 of the 128x128x128 volume texture, magnified 4x]

The lighting computation produces the following output:

[Figure: output of the direct lighting pass]

The pixel shader used for direct lighting is as follows:

// Engine\Shaders\Private\Lumen\LumenSceneDirectLighting.usf

void LumenCardDirectLightingPS(
    FCardVSToPS CardInterpolants,
    out float4 OutColor : SV_Target0)
{
    float Opacity = Texture2DSampleLevel(OpacityAtlas, GlobalBilinearClampedSampler, CardInterpolants.AtlasCoord, 0).x;
    float3 Irradiance = 0;

    if (Opacity > 0)
    {
        // Build the light data.
        FDeferredLightData LightData;
        {
            LightData.Position = DeferredLightUniforms.Position;
            LightData.InvRadius = DeferredLightUniforms.InvRadius;
            LightData.Color = DeferredLightUniforms.Color;
            LightData.FalloffExponent = DeferredLightUniforms.FalloffExponent;
            LightData.Direction = DeferredLightUniforms.Direction;  
            LightData.Tangent = DeferredLightUniforms.Tangent;
            LightData.SpotAngles = DeferredLightUniforms.SpotAngles;
            LightData.SourceRadius = DeferredLightUniforms.SourceRadius;
            LightData.SourceLength = DeferredLightUniforms.SourceLength;
            LightData.SoftSourceRadius = DeferredLightUniforms.SoftSourceRadius;
            LightData.SpecularScale = DeferredLightUniforms.SpecularScale;
            LightData.ContactShadowLength = abs(DeferredLightUniforms.ContactShadowLength);
            LightData.ContactShadowLengthInWS = DeferredLightUniforms.ContactShadowLength < 0.0f;
            LightData.DistanceFadeMAD = DeferredLightUniforms.DistanceFadeMAD;
            LightData.ShadowMapChannelMask = DeferredLightUniforms.ShadowMapChannelMask;
            LightData.ShadowedBits = DeferredLightUniforms.ShadowedBits;
            LightData.RectLightBarnCosAngle = DeferredLightUniforms.RectLightBarnCosAngle;
            LightData.RectLightBarnLength = DeferredLightUniforms.RectLightBarnLength;

            LightData.bInverseSquared = LightData.FalloffExponent == 0.0f;
            LightData.bRadialLight = LIGHT_TYPE != LIGHT_TYPE_DIRECTIONAL;
            LightData.bSpotLight = LIGHT_TYPE == LIGHT_TYPE_SPOT;
            LightData.bRectLight = LIGHT_TYPE == LIGHT_TYPE_RECT;
        }

        // Fetch the Lumen card data.
        FLumenCardData LumenCardData = GetLumenCardData(CardInterpolants.CardId);

        float Depth = 1.0f - Texture2DSampleLevel(LumenCardScene.DepthAtlas, GlobalBilinearClampedSampler, CardInterpolants.AtlasCoord, 0).x;

        // Reconstruct the position from the atlas UV and the card depth.
        float3 LocalPosition;
        LocalPosition.xy = (CardInterpolants.AtlasCoord - LumenCardData.LocalPositionToAtlasUVBias) / LumenCardData.LocalPositionToAtlasUVScale;
        LocalPosition.z = -LumenCardData.LocalExtent.z + Depth * 2 * LumenCardData.LocalExtent.z;

        float3 WorldPosition = mul(LumenCardData.WorldToLocalRotation, LocalPosition) + LumenCardData.Origin;

        float3 LightColor = DeferredLightUniforms.Color;
        float3 L = LightData.Direction;
        float3 ToLight = L;
    
        // Compute the light attenuation.
#if LIGHT_TYPE == LIGHT_TYPE_DIRECTIONAL
        float CombinedAttenuation = 1;
#else
        float LightMask = 1;
        if (LightData.bRadialLight)
        {
            LightMask = GetLocalLightAttenuation(WorldPosition, LightData, ToLight, L);
        }

        float Attenuation;

        if (LightData.bRectLight)
        {
            FRect Rect = GetRect(ToLight, LightData);
            FRectTexture RectTexture = InitRectTexture(DeferredLightUniforms.SourceTexture);
            Attenuation = IntegrateLight(Rect, RectTexture);
        }
        else
        {
            FCapsuleLight Capsule = GetCapsule(ToLight, LightData);
            Capsule.DistBiasSqr = 0;
            Attenuation = IntegrateLight(Capsule, LightData.bInverseSquared);
        }

        float CombinedAttenuation = Attenuation * LightMask;
#endif

        if (CombinedAttenuation > 0)
        {
            float3 WorldNormal = Texture2DSampleLevel(LumenCardScene.NormalAtlas, GlobalBilinearClampedSampler, CardInterpolants.AtlasCoord, 0).xyz * 2 - 1;

            // Only surfaces facing the light are lit.
            if (dot(WorldNormal, L) > 0)
            {
                float ShadowFactor = 1.0f;

                #if SHADOWED_LIGHT  // Shadowed light
                {
                    // Hardware ray traced shadows
                    #if HARDWARE_RAYTRACING_SHADOW_PASS_COMBINE
                    {
                        float2 AtlasTextureSize = LumenCardScene.AtlasSize;
                        uint2 Pos2D = CardInterpolants.AtlasCoord * AtlasTextureSize.xy - float2(0.5, 0.5) / AtlasTextureSize.xy;
                        ShadowFactor = ShadowMaskAtlas.Load(uint3(Pos2D, 0));
                    }
                    #else // Shadows without hardware ray tracing
                    {
                        bool bShadowFactorComplete = false;
                        bool bVSMValid = false;

                        // Use the virtual shadow map
                        #if VIRTUAL_SHADOW_MAP
                        {
                            // Bias only ray start to maximize chances of hitting an allocated page
                            FVirtualShadowMapSampleResult VirtualShadowMapSample = SampleVirtualShadowMap(VirtualShadowMapId, WorldPosition, VirtualShadowMapSurfaceBias, WorldNormal);

                            bVSMValid = VirtualShadowMapSample.bValid;
                            bShadowFactorComplete = VirtualShadowMapSample.bValid && VirtualShadowMapSample.bOccluded;
                            ShadowFactor = VirtualShadowMapSample.ShadowFactor;
                        }
                        #endif

                        // Compute the shadow strength, ShadowFactor.
                        if (!bShadowFactorComplete)
                        {
                            float3 WorldPositionForShadowing = GetWorldPositionForShadowing(WorldPosition, L, WorldNormal, 1.0f);

                            #if LIGHT_TYPE == LIGHT_TYPE_DIRECTIONAL
                            {
                                #if DYNAMICALLY_SHADOWED
                                    float SceneDepth = dot(WorldPositionForShadowing - View.WorldCameraOrigin, View.ViewForward);

                                    bool bShadowingFromValidUVArea = false;
                                    float NewShadowFactor = ComputeDirectionalLightDynamicShadowing(WorldPositionForShadowing, SceneDepth, bShadowingFromValidUVArea);

                                    float4 PostProjectionPosition = mul(float4(WorldPosition, 1.0), View.WorldToClip);
                                    // CSM's are culled so only query points inside the view are valid
                                    float2 ValidTexelSize = float2(length(ddx(WorldPosition)), length(ddy(WorldPosition))) * 2;
                                    if (bShadowingFromValidUVArea && all(PostProjectionPosition.xy - ValidTexelSize < PostProjectionPosition.w && PostProjectionPosition.xy + ValidTexelSize > -PostProjectionPosition.w))
                                    { 
                                        ShadowFactor *= NewShadowFactor;
                                        bShadowFactorComplete = VIRTUAL_SHADOW_MAP ? bVSMValid : true;
                                    }
                                #endif
                            }
                            #else
                            {
                                bool bShadowingFromValidUVArea = false;
                                float NewShadowFactor = ComputeVolumeShadowing(WorldPositionForShadowing, LightData.bRadialLight && !LightData.bSpotLight, LightData.bSpotLight, bShadowingFromValidUVArea);

                                if (bShadowingFromValidUVArea) 
                                {
                                    ShadowFactor *= NewShadowFactor;
                                    bShadowFactorComplete = VIRTUAL_SHADOW_MAP ? bVSMValid : true;
                                }
                            }
                            #endif
                        }

                        // Handle offscreen shadowing.
                        bool bOffscreenShadowing = !bShadowFactorComplete;
                        if (ForceOffscreenShadowing != 0)
                        {
                            ShadowFactor = 1.0;
                            bOffscreenShadowing = true;
                        }

                        if (bOffscreenShadowing)
                        {
                            ShadowFactor *= TraceOffscreenShadows(WorldPosition, L, ToLight, WorldNormal);
                        }
                    }
                    #endif // End hardware/software shadow selection        
                }
                #endif // End ShadowLight

                // Light function
                #if LIGHT_FUNCTION
                    ShadowFactor *= GetLightFunction(WorldPosition);
                #endif

                // Cloud transmittance
                #if USE_CLOUD_TRANSMITTANCE
                {
                    float OutOpticalDepth = 0.0f;
                    ShadowFactor *= lerp(1.0f, GetCloudVolumetricShadow(WorldPosition, CloudShadowmapWorldToLightClipMatrix, CloudShadowmapFarDepthKm, CloudShadowmapTexture, CloudShadowmapSampler, OutOpticalDepth), CloudShadowmapStrength);
                }
                #endif

                // IES
                if (UseIESProfile > 0)
                {
                    ShadowFactor *= ComputeLightProfileMultiplier(WorldPosition, DeferredLightUniforms.Position, -DeferredLightUniforms.Direction, DeferredLightUniforms.Tangent);
                }

                // Final irradiance
                float NoL = saturate(dot(WorldNormal, L));
                Irradiance = LightColor * (CombinedAttenuation * NoL * ShadowFactor);
                //Irradiance = bShadowFactorValid ? float3(0, 1, 0) : float3(0.2f, 0.0f, 0.0f);
            }
        }
    }
        
    OutColor = float4(Irradiance, 0);
}
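The position reconstruction near the top of the shader can be read in isolation: the xy components invert the local-position-to-atlas-UV mapping, and the z component remaps the depth from [0, 1] to [-LocalExtent.z, +LocalExtent.z]. Below is a minimal CPU-side sketch of those two lines. `CardData` is a hypothetical subset of FLumenCardData, and `CardLocalPosition` is a name introduced for illustration.

```cpp
#include <array>

using Float2 = std::array<float, 2>;
using Float3 = std::array<float, 3>;

// Hypothetical subset of FLumenCardData, holding only what this sketch needs.
struct CardData
{
    Float2 LocalPositionToAtlasUVScale;
    Float2 LocalPositionToAtlasUVBias;
    Float3 LocalExtent; // half-size of the card box in card-local space
};

// Mirrors the shader: AtlasCoord -> local xy, depth in [0, 1] -> local z.
// DepthAtlasValue is the raw sample; the shader uses 1 - sample as the depth.
inline Float3 CardLocalPosition(const CardData& Card, const Float2& AtlasCoord, float DepthAtlasValue)
{
    const float Depth = 1.0f - DepthAtlasValue;
    return {
        (AtlasCoord[0] - Card.LocalPositionToAtlasUVBias[0]) / Card.LocalPositionToAtlasUVScale[0],
        (AtlasCoord[1] - Card.LocalPositionToAtlasUVBias[1]) / Card.LocalPositionToAtlasUVScale[1],
        -Card.LocalExtent[2] + Depth * 2.0f * Card.LocalExtent[2]
    };
}
```

The shader then rotates this local position and adds the card origin to obtain WorldPosition, which feeds attenuation, shadowing and the final `LightColor * (CombinedAttenuation * NoL * ShadowFactor)` term.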
           

The prefiltering step, PrefilterLumenSceneLighting, is similar to the Geometry Prefiltering mentioned in 6.5.6.1 Voxel Cone Tracing:

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenScenePrefilter.cpp

void FDeferredShadingSceneRenderer::PrefilterLumenSceneLighting(
    FRDGBuilder& GraphBuilder,
    const FViewInfo& View,
    FLumenCardTracingInputs& TracingInputs,
    FGlobalShaderMap* GlobalShaderMap,
    const FLumenCardScatterContext& VisibleCardScatterContext)
{
    LLM_SCOPE_BYTAG(Lumen);
    RDG_EVENT_SCOPE(GraphBuilder, "Prefilter");

    FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;

    // Compute the number of mips from the atlas resolution.
    const int32 NumMips = FMath::CeilLogTwo(FMath::Max(LumenSceneData.MaxAtlasSize.X, LumenSceneData.MaxAtlasSize.Y)) + 1;
    {
        FIntPoint SrcSize = LumenSceneData.MaxAtlasSize;
        FIntPoint DestSize = SrcSize / 2;

        // Loop NumMips - 1 times (level 0 is the source texture itself), generating one mip per iteration.
        for (int32 MipIndex = 1; MipIndex < NumMips; MipIndex++)
        {
            SrcSize.X = FMath::Max(SrcSize.X, 1);
            SrcSize.Y = FMath::Max(SrcSize.Y, 1);
            DestSize.X = FMath::Max(DestSize.X, 1);
            DestSize.Y = FMath::Max(DestSize.Y, 1);

            FLumenCardPrefilterLighting* PassParameters = GraphBuilder.AllocParameters<FLumenCardPrefilterLighting>();
            
            // Bind the render targets, at most 3: final lighting atlas, irradiance atlas, indirect irradiance atlas.
            PassParameters->RenderTargets[0] = FRenderTargetBinding(TracingInputs.FinalLightingAtlas, ERenderTargetLoadAction::ENoAction, MipIndex);
            bool bUseIrradianceAtlas = Lumen::UseIrradianceAtlas(View);
            bool bUseIndirectIrradianceAtlas = Lumen::UseIndirectIrradianceAtlas(View);
            if (bUseIrradianceAtlas)
            {
                PassParameters->RenderTargets[1] = FRenderTargetBinding(TracingInputs.IrradianceAtlas, ERenderTargetLoadAction::ENoAction, MipIndex);
                if (bUseIndirectIrradianceAtlas)
                {
                    PassParameters->RenderTargets[2] = FRenderTargetBinding(TracingInputs.IndirectIrradianceAtlas, ERenderTargetLoadAction::ENoAction, MipIndex);
                }
            }
            else if (bUseIndirectIrradianceAtlas)
            {
                PassParameters->RenderTargets[1] = FRenderTargetBinding(TracingInputs.IndirectIrradianceAtlas, ERenderTargetLoadAction::ENoAction, MipIndex);
            }
            PassParameters->VS.LumenCardScene = TracingInputs.LumenCardSceneUniformBuffer;
            PassParameters->VS.CardScatterParameters = VisibleCardScatterContext.Parameters;
            PassParameters->VS.ScatterInstanceIndex = 0;
            PassParameters->VS.CardUVSamplingOffset = FVector2D::ZeroVector;
            PassParameters->PS.View = View.ViewUniformBuffer;
            PassParameters->PS.LumenCardScene = TracingInputs.LumenCardSceneUniformBuffer;
            PassParameters->PS.ParentFinalLightingAtlas = GraphBuilder.CreateSRV(FRDGTextureSRVDesc::CreateForMipLevel(TracingInputs.FinalLightingAtlas, MipIndex - 1));
            // Note that the SRVs are created with CreateForMipLevel.
            if (bUseIrradianceAtlas)
            {
                PassParameters->PS.ParentIrradianceAtlas = GraphBuilder.CreateSRV(FRDGTextureSRVDesc::CreateForMipLevel(TracingInputs.IrradianceAtlas, MipIndex - 1));
            }
            if (bUseIndirectIrradianceAtlas)
            {
                PassParameters->PS.ParentIndirectIrradianceAtlas = GraphBuilder.CreateSRV(FRDGTextureSRVDesc::CreateForMipLevel(TracingInputs.IndirectIrradianceAtlas, MipIndex - 1));
            }
            PassParameters->PS.InvSize = FVector2D(1.0f / SrcSize.X, 1.0f / SrcSize.Y);

            FScene* LocalScene = Scene;

            // Add the prefilter pass.
            GraphBuilder.AddPass(
                RDG_EVENT_NAME("PrefilterMip"),
                PassParameters,
                ERDGPassFlags::Raster,
                [LocalScene, PassParameters, DestSize, GlobalShaderMap, bUseIrradianceAtlas, bUseIndirectIrradianceAtlas](FRHICommandListImmediate& RHICmdList)
            {
                FLumenCardPrefilterLightingPS::FPermutationDomain PermutationVector;
                PermutationVector.Set<FLumenCardPrefilterLightingPS::FUseIrradianceAtlas>(bUseIrradianceAtlas != 0);
                PermutationVector.Set<FLumenCardPrefilterLightingPS::FUseIndirectIrradianceAtlas>(bUseIndirectIrradianceAtlas != 0);
                auto PixelShader = GlobalShaderMap->GetShader< FLumenCardPrefilterLightingPS >(PermutationVector);
                DrawQuadsToAtlas(DestSize, PixelShader, PassParameters, GlobalShaderMap, TStaticBlendState<>::GetRHI(), RHICmdList);
            });

            SrcSize /= 2;
            DestSize /= 2;
        }
    }
}
           

The shader it uses is as follows:

// Engine\Shaders\Private\Lumen\LumenSceneLighting.usf

Texture2D ParentFinalLightingAtlas;
Texture2D ParentIrradianceAtlas;
Texture2D ParentIndirectIrradianceAtlas;

void LumenCardPrefilterLightingPS(
    FCardVSToPS CardInterpolants,
    out float4 OutLighting : SV_Target0,
    out float4 OutColor1 : SV_Target1,
    out float4 OutColor2 : SV_Target2)
{
    // Bilinear filtering alone produces this mip level's color; unlike section 6.5.6.1, no Gaussian weights are used.
    OutLighting = Texture2DSampleLevel(ParentFinalLightingAtlas, GlobalBilinearClampedSampler, CardInterpolants.AtlasCoord, 0);
#if USE_IRRADIANCE_ATLAS
    OutColor1 = Texture2DSampleLevel(ParentIrradianceAtlas, GlobalBilinearClampedSampler, CardInterpolants.AtlasCoord, 0);
    #if USE_INDIRECTIRRADIANCE_ATLAS
        OutColor2 = Texture2DSampleLevel(ParentIndirectIrradianceAtlas, GlobalBilinearClampedSampler, CardInterpolants.AtlasCoord, 0);
    #endif
#elif USE_INDIRECTIRRADIANCE_ATLAS
    OutColor1 = Texture2DSampleLevel(ParentIndirectIrradianceAtlas, GlobalBilinearClampedSampler, CardInterpolants.AtlasCoord, 0);
#endif
}
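When the destination mip is exactly half the size of its parent and the atlas UV is evaluated at the destination texel center, a bilinear fetch from the parent lands midway between four parent texels, so the result is their plain average. The sketch below shows that equivalent 2x2 box filter on the CPU, assuming a single-channel image with even dimensions. `Image` and `DownsampleBox2x2` are illustrative names, not engine code.

```cpp
#include <cstddef>
#include <vector>

// Single-channel image stored row-major (illustrative container, not an engine type).
struct Image
{
    int Width;
    int Height;
    std::vector<float> Texels;

    float At(int X, int Y) const { return Texels[static_cast<std::size_t>(Y) * Width + X]; }
};

// Builds the next mip level by averaging each 2x2 block of the parent.
// Assumes the parent width and height are both even.
inline Image DownsampleBox2x2(const Image& Parent)
{
    const int Width = Parent.Width / 2;
    const int Height = Parent.Height / 2;
    Image Child{ Width, Height, std::vector<float>(static_cast<std::size_t>(Width) * Height) };

    for (int Y = 0; Y < Height; ++Y)
    {
        for (int X = 0; X < Width; ++X)
        {
            const float Sum = Parent.At(2 * X, 2 * Y) + Parent.At(2 * X + 1, 2 * Y)
                            + Parent.At(2 * X, 2 * Y + 1) + Parent.At(2 * X + 1, 2 * Y + 1);
            Child.Texels[static_cast<std::size_t>(Y) * Width + X] = 0.25f * Sum;
        }
    }
    return Child;
}
```

A box filter like this is cheaper than the Gaussian-weighted prefilter described in 6.5.6.1, at the cost of a less smooth falloff between mip levels.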
           

The frame capture shows one PrefilterMip pass for each mip level generated above level 0:

[Figure: frame capture showing the PrefilterMip passes and the atlas mip levels]
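The relationship between the atlas size, the mip count and the number of PrefilterMip passes can be checked with a small sketch that replays the loop in PrefilterLumenSceneLighting on the CPU. The helper names are introduced here for illustration, and `CeilLogTwo` is a simplified stand-in for FMath::CeilLogTwo that is only meant for atlas-sized values.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Smallest n such that 2^n >= Value (simplified stand-in for FMath::CeilLogTwo).
inline int CeilLogTwo(std::uint32_t Value)
{
    int Result = 0;
    while ((1u << Result) < Value)
    {
        ++Result;
    }
    return Result;
}

// Mip count of the atlas, including level 0.
inline int NumAtlasMips(int SizeX, int SizeY)
{
    return CeilLogTwo(static_cast<std::uint32_t>(std::max(SizeX, SizeY))) + 1;
}

// Destination size of every PrefilterMip pass, in order (mip 1, mip 2, ...).
inline std::vector<std::pair<int, int>> PrefilterPassSizes(int SizeX, int SizeY)
{
    std::vector<std::pair<int, int>> Sizes;
    int DestX = SizeX / 2;
    int DestY = SizeY / 2;
    const int NumMips = NumAtlasMips(SizeX, SizeY);

    for (int MipIndex = 1; MipIndex < NumMips; ++MipIndex)
    {
        // Clamp to 1 so a non-square atlas keeps a valid size on its short side.
        DestX = std::max(DestX, 1);
        DestY = std::max(DestY, 1);
        Sizes.emplace_back(DestX, DestY);
        DestX /= 2;
        DestY /= 2;
    }
    return Sizes;
}
```

For a hypothetical 4096x2048 atlas this gives 13 mip levels and 12 passes, starting at 2048x1024 and ending at 1x1.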

The main job of ComputeLumenSceneVoxelLighting is to compute the voxel lighting of the Lumen scene. The code is as follows:

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenVoxelLighting.cpp

void FDeferredShadingSceneRenderer::ComputeLumenSceneVoxelLighting(
    FRDGBuilder& GraphBuilder,
    FLumenCardTracingInputs& TracingInputs,
    FGlobalShaderMap* GlobalShaderMap)
{
    LLM_SCOPE_BYTAG(Lumen);

    const FViewInfo& View = Views[0];

    const int32 ClampedNumClipmapLevels = GetNumLumenVoxelClipmaps();
    const FIntVector ClipmapResolution = GetClipmapResolution();
    bool bForceFullUpdate = GLumenSceneVoxelLightingForceFullUpdate != 0;

    // Set up the voxel lighting 3D texture.
    FRDGTextureRef VoxelLighting = TracingInputs.VoxelLighting;
    {
        FRDGTextureDesc LightingDesc(FRDGTextureDesc::Create3D(
            FIntVector(
                ClipmapResolution.X,
                ClipmapResolution.Y * ClampedNumClipmapLevels,
                ClipmapResolution.Z * GNumVoxelDirections),
            PF_FloatRGBA,
            FClearValueBinding::Black,
            TexCreate_ShaderResource | TexCreate_UAV | TexCreate_3DTiling));

        if (!VoxelLighting || VoxelLighting->Desc != LightingDesc)
        {
            bForceFullUpdate = true;
            VoxelLighting = GraphBuilder.CreateTexture(LightingDesc, TEXT("Lumen.VoxelLighting"));
        }
    }

    // Set up the visibility buffer texture.
    FRDGTextureRef VoxelVisBuffer = View.ViewState->Lumen.VoxelVisBuffer ? GraphBuilder.RegisterExternalTexture(View.ViewState->Lumen.VoxelVisBuffer) : nullptr;
    {
        FRDGTextureDesc VoxelVisBufferDesc(FRDGTextureDesc::Create3D(
            FIntVector(
                ClipmapResolution.X,
                ClipmapResolution.Y * ClampedNumClipmapLevels,
                ClipmapResolution.Z * GNumVoxelDirections),
            PF_R32_UINT,
            FClearValueBinding::Black,
            TexCreate_ShaderResource | TexCreate_UAV | TexCreate_3DTiling));

        if (!VoxelVisBuffer
            || VoxelVisBuffer->Desc.Extent != VoxelVisBufferDesc.Extent
            || VoxelVisBuffer->Desc.Depth != VoxelVisBufferDesc.Depth)
        {
            bForceFullUpdate = true;
            VoxelVisBuffer = GraphBuilder.CreateTexture(VoxelVisBufferDesc, TEXT("Lumen.VoxelVisBuffer"));

            uint32 VisBufferClearValue[4] = { 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF };
            AddClearUAVPass(GraphBuilder, GraphBuilder.CreateUAV(VoxelVisBuffer), VisBufferClearValue);
        }
    }

    // The visibility buffer data is only valid for a specific scene; it must be rebuilt if the scene changes.
    if (View.ViewState->Lumen.VoxelVisBufferCachedScene != Scene)
    {
        bForceFullUpdate = true;
        View.ViewState->Lumen.VoxelVisBufferCachedScene = Scene;
    }

    // Collect the clipmaps that need updating.
    TArray<int32, SceneRenderingAllocator> ClipmapsToUpdate;
    ClipmapsToUpdate.Empty(ClampedNumClipmapLevels);

    for (int32 ClipmapIndex = 0; ClipmapIndex < ClampedNumClipmapLevels; ClipmapIndex++)
    {
        if (bForceFullUpdate || ShouldUpdateVoxelClipmap(ClipmapIndex, ClampedNumClipmapLevels, View.ViewState->GetFrameIndex()))
        {
            ClipmapsToUpdate.Add(ClipmapIndex);
        }
    }

    ensureMsgf(bForceFullUpdate || ClipmapsToUpdate.Num() <= 1, TEXT("Tweak ShouldUpdateVoxelClipmap for better clipmap update distribution"));

    FString ClipmapsToUpdateString;

    for (int32 ToUpdateIndex = 0; ToUpdateIndex < ClipmapsToUpdate.Num(); ++ToUpdateIndex)
    {
        ClipmapsToUpdateString += FString::FromInt(ClipmapsToUpdate[ToUpdateIndex]);
        if (ToUpdateIndex + 1 < ClipmapsToUpdate.Num())
        {
            ClipmapsToUpdateString += TEXT(",");
        }
    }

    RDG_EVENT_SCOPE(GraphBuilder, "VoxelizeCards Clipmaps=[%s]", *ClipmapsToUpdateString);

    // Update and voxelize the visibility buffer.
    if (ClipmapsToUpdate.Num() > 0)
    {
        TracingInputs.VoxelLighting = VoxelLighting;
        TracingInputs.VoxelGridResolution = GetClipmapResolution();
        TracingInputs.NumClipmapLevels = ClampedNumClipmapLevels;

        // Update the visibility buffer.
        UpdateVoxelVisBuffer(GraphBuilder, Scene, View, TracingInputs, VoxelVisBuffer, ClipmapsToUpdate, bForceFullUpdate);
        // Voxelize the visibility buffer.
        VoxelizeVisBuffer(View, Scene, TracingInputs, VoxelLighting, VoxelVisBuffer, ClipmapsToUpdate, GraphBuilder);

        ConvertToExternalTexture(GraphBuilder, VoxelLighting, View.ViewState->Lumen.VoxelLighting);
        View.ViewState->Lumen.VoxelGridResolution = TracingInputs.VoxelGridResolution;
        View.ViewState->Lumen.NumClipmapLevels = TracingInputs.NumClipmapLevels;
    }

    ConvertToExternalTexture(GraphBuilder, VoxelVisBuffer, View.ViewState->Lumen.VoxelVisBuffer);
}
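The ensureMsgf in the function above states the intent of ShouldUpdateVoxelClipmap: outside of a forced full update, at most one clipmap is refreshed per frame. The engine's actual schedule is not shown in this article, so the sketch below is a purely hypothetical schedule that satisfies the same invariant. It picks the clipmap from the number of trailing one-bits in the frame index, so finer clipmaps refresh more often than coarser ones.

```cpp
#include <cstdint>

// Hypothetical schedule, NOT the engine's ShouldUpdateVoxelClipmap.
// Returns the single clipmap to refresh this frame: the count of trailing
// one-bits in FrameIndex, clamped to the last clipmap.
// Assumes NumClipmaps >= 1.
inline int ClipmapToUpdate(std::uint32_t FrameIndex, int NumClipmaps)
{
    int TrailingOnes = 0;
    while (TrailingOnes < NumClipmaps - 1 && ((FrameIndex >> TrailingOnes) & 1u) != 0)
    {
        ++TrailingOnes;
    }
    return TrailingOnes;
}

// Same signature shape as the engine function, built on the schedule above.
inline bool ShouldUpdateClipmapSketch(int ClipmapIndex, int NumClipmaps, std::uint32_t FrameIndex)
{
    return ClipmapToUpdate(FrameIndex, NumClipmaps) == ClipmapIndex;
}
```

With four clipmaps, clipmap 0 refreshes every second frame, clipmap 1 every fourth, and clipmaps 2 and 3 every eighth, so the per-frame cost stays close to that of a single clipmap.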
           

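The code above updates and voxelizes the visibility buffer. Its implementation is not analyzed further here, but the frame capture shows the following passes: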

[Figure: frame capture of the visibility buffer update and voxelization passes]

VoxelTraceCS, the final stage of UpdateVoxelVisBuffer, takes the distance field brick 3D texture as input and writes the VoxelVisBuffer 3D texture:

[Figure: VoxelTraceCS input (distance field brick 3D texture) and output (VoxelVisBuffer 3D texture)]

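VisBufferShading, the final stage of VoxelizeVisBuffer, takes SceneFinalLighting, SceneOpacity, SceneDepth, the distance field brick 3D texture and VoxelVisBuffer as input and outputs the VoxelLighting 3D texture. After this stage, the lighting of the Lumen scene is stored in the voxelized 3D texture: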

[Figure: VoxelLighting 3D texture produced by VisBufferShading]
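The descriptor built in ComputeLumenSceneVoxelLighting packs every clipmap and every voxel direction into one 3D texture of size (X, Y * NumClipmapLevels, Z * GNumVoxelDirections). The sketch below computes that extent and one plausible addressing scheme for it. Both the value 6 for the direction count and the stacking order (clipmaps along Y, directions along Z) are assumptions made for illustration, and the function names are hypothetical.

```cpp
#include <array>

using Int3 = std::array<int, 3>;

// Assumed value of GNumVoxelDirections: one slot per axis direction (+X, -X, +Y, -Y, +Z, -Z).
constexpr int NumVoxelDirections = 6;

// Extent of the packed texture, following the descriptor in ComputeLumenSceneVoxelLighting.
inline Int3 VoxelLightingExtent(const Int3& ClipmapResolution, int NumClipmapLevels)
{
    return {
        ClipmapResolution[0],
        ClipmapResolution[1] * NumClipmapLevels,
        ClipmapResolution[2] * NumVoxelDirections
    };
}

// Assumed addressing: clipmaps are stacked along Y and directions along Z.
inline Int3 VoxelLightingCoord(const Int3& ClipmapResolution, int ClipmapIndex, int DirectionIndex, const Int3& VoxelCoord)
{
    return {
        VoxelCoord[0],
        VoxelCoord[1] + ClipmapIndex * ClipmapResolution[1],
        VoxelCoord[2] + DirectionIndex * ClipmapResolution[2]
    };
}
```

For an example clipmap resolution of 64x64x64 with 4 clipmap levels, the packed texture would be 64x256x384 texels under these assumptions.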

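The indirect lighting stage uses the information Lumen generated earlier to compute the final indirect lighting, which approximates global illumination. Its flow is as follows: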

[Figure: frame capture of the Lumen indirect lighting stages]

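The capture shows several stages: SSGI denoising, screen-space probe gathering, reflections, and indirect lighting compositing. The corresponding source, RenderDiffuseIndirectAndAmbientOcclusion, is as follows: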

// Engine\Source\Runtime\Renderer\Private\IndirectLightRendering.cpp

void FDeferredShadingSceneRenderer::RenderDiffuseIndirectAndAmbientOcclusion(
    FRDGBuilder& GraphBuilder,
    FSceneTextures& SceneTextures,
    FRDGTextureRef LightingChannelsTexture,
    bool bIsVisualizePass)
{
    using namespace HybridIndirectLighting;

    if (ViewFamily.EngineShowFlags.VisualizeLumenIndirectDiffuse != bIsVisualizePass)
    {
        return;
    }

    RDG_EVENT_SCOPE(GraphBuilder, "DiffuseIndirectAndAO");

    FSceneTextureParameters SceneTextureParameters = GetSceneTextureParameters(GraphBuilder, SceneTextures.UniformBuffer);
    FRDGTextureRef SceneColorTexture = SceneTextures.Color.Target;

    const FRDGSystemTextures& SystemTextures = FRDGSystemTextures::Get(GraphBuilder);

    // Each view is processed separately.
    for (FViewInfo& View : Views)
    {
        RDG_GPU_MASK_SCOPE(GraphBuilder, View.GPUMask);

        const FPerViewPipelineState& ViewPipelineState = GetViewPipelineState(View);

        int32 DenoiseMode = CVarDiffuseIndirectDenoiser.GetValueOnRenderThread();

        // Set up the common diffuse indirect parameters.
        FCommonParameters CommonDiffuseParameters;
        SetupCommonDiffuseIndirectParameters(GraphBuilder, SceneTextureParameters, View, /* out */ CommonDiffuseParameters);

        // Update the legacy ray tracing config for the denoiser.
        IScreenSpaceDenoiser::FAmbientOcclusionRayTracingConfig RayTracingConfig;
        {
            RayTracingConfig.RayCountPerPixel = CommonDiffuseParameters.RayCountPerPixel;
            RayTracingConfig.ResolutionFraction = 1.0f / float(CommonDiffuseParameters.DownscaleFactor);
        }

        // Previous frame's scene color.
        ScreenSpaceRayTracing::FPrevSceneColorMip PrevSceneColorMip;
        if ((ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen || ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::SSGI) && View.PrevViewInfo.ScreenSpaceRayTracingInput.IsValid())
        {
            PrevSceneColorMip = ScreenSpaceRayTracing::ReducePrevSceneColorMip(GraphBuilder, SceneTextureParameters, View);
        }

        // Denoiser input and output parameters.
        FSSDSignalTextures DenoiserOutputs;
        IScreenSpaceDenoiser::FDiffuseIndirectInputs DenoiserInputs;
        IScreenSpaceDenoiser::FDiffuseIndirectHarmonic DenoiserSphericalHarmonicInputs;
        FLumenReflectionCompositeParameters LumenReflectionCompositeParameters;
        bool bLumenUseDenoiserComposite = ViewPipelineState.bUseLumenProbeHierarchy;

        // Get the denoiser inputs or outputs for the active indirect lighting method.

        // Lumen probe hierarchy
        if (ViewPipelineState.bUseLumenProbeHierarchy)
        {
            check(ViewPipelineState.DiffuseIndirectDenoiser == IScreenSpaceDenoiser::EMode::Disabled);
            DenoiserOutputs = RenderLumenProbeHierarchy(
                GraphBuilder,
                SceneTextures,
                CommonDiffuseParameters, PrevSceneColorMip,
                View, &View.PrevViewInfo);
        }
        // Screen space global illumination (SSGI)
        else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::SSGI)
        {
            RDG_EVENT_SCOPE(GraphBuilder, "SSGI %dx%d", CommonDiffuseParameters.TracingViewportSize.X, CommonDiffuseParameters.TracingViewportSize.Y);
            DenoiserInputs = ScreenSpaceRayTracing::CastStandaloneDiffuseIndirectRays(
                GraphBuilder, CommonDiffuseParameters, PrevSceneColorMip, View);
        }
        // Ray traced global illumination (RTGI)
        else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::RTGI)
        {
            // TODO: Refactor under the HybridIndirectLighting standard API.
            // TODO: hybrid SSGI / RTGI
            RenderRayTracingGlobalIllumination(GraphBuilder, SceneTextureParameters, View, /* out */ &RayTracingConfig, /* out */ &DenoiserInputs);
        }
        // Lumen global illumination
        else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen)
        {
            check(ViewPipelineState.DiffuseIndirectDenoiser == IScreenSpaceDenoiser::EMode::Disabled);

            FLumenMeshSDFGridParameters MeshSDFGridParameters;

            DenoiserOutputs = RenderLumenScreenProbeGather(
                GraphBuilder, 
                SceneTextures,
                PrevSceneColorMip, 
                LightingChannelsTexture,
                View,
                &View.PrevViewInfo,
                bLumenUseDenoiserComposite,
                MeshSDFGridParameters);

            if (ViewPipelineState.ReflectionsMethod == EReflectionsMethod::Lumen)
            {
                DenoiserOutputs.Textures[2] = RenderLumenReflections(
                    GraphBuilder,
                    View,
                    SceneTextures, 
                    MeshSDFGridParameters,
                    LumenReflectionCompositeParameters);
            }

            if (!DenoiserOutputs.Textures[2])
            {
                DenoiserOutputs.Textures[2] = DenoiserOutputs.Textures[1];
            }
        }

        FRDGTextureRef AmbientOcclusionMask = DenoiserInputs.AmbientOcclusionMask;

        // Denoising.
        if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen)
        {
            // Lumen's global illumination output is already denoised, so nothing needs to be done here.
        }
        else if (ViewPipelineState.DiffuseIndirectDenoiser == IScreenSpaceDenoiser::EMode::Disabled)
        {
            DenoiserOutputs.Textures[0] = DenoiserInputs.Color;
            DenoiserOutputs.Textures[1] = SystemTextures.White;
        }
        else
        {
            const IScreenSpaceDenoiser* DefaultDenoiser = IScreenSpaceDenoiser::GetDefaultDenoiser();
            const IScreenSpaceDenoiser* DenoiserToUse = 
                ViewPipelineState.DiffuseIndirectDenoiser == IScreenSpaceDenoiser::EMode::DefaultDenoiser
                ? DefaultDenoiser : GScreenSpaceDenoiser;

            RDG_EVENT_SCOPE(GraphBuilder, "%s%s(DiffuseIndirect) %dx%d",
                DenoiserToUse != DefaultDenoiser ? TEXT("ThirdParty ") : TEXT(""),
                DenoiserToUse->GetDebugName(),
                View.ViewRect.Width(), View.ViewRect.Height());

            if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::RTGI)
            {
                // Denoise the RTGI result.
                DenoiserOutputs = DenoiserToUse->DenoiseDiffuseIndirect(
                    GraphBuilder,
                    View,
                    &View.PrevViewInfo,
                    SceneTextureParameters,
                    DenoiserInputs,
                    RayTracingConfig);

                AmbientOcclusionMask = DenoiserOutputs.Textures[1];
            }
            else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::SSGI)
            {
                // Denoise the SSGI result.
                DenoiserOutputs = DenoiserToUse->DenoiseScreenSpaceDiffuseIndirect(
                    GraphBuilder,
                    View,
                    &View.PrevViewInfo,
                    SceneTextureParameters,
                    DenoiserInputs,
                    RayTracingConfig);

                AmbientOcclusionMask = DenoiserOutputs.Textures[1];
            }
        }

        // Render AO.
        bool bWritableAmbientOcclusionMask = true;
        if (ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::Disabled)
        {
            ensure(!HasBeenProduced(SceneTextures.ScreenSpaceAO));
            AmbientOcclusionMask = nullptr;
            bWritableAmbientOcclusionMask = false;
        }
        else if (ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::RTAO)
        {
            RenderRayTracingAmbientOcclusion(
                GraphBuilder,
                View,
                SceneTextureParameters,
                &AmbientOcclusionMask);
        }
        else if (ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::SSGI)
        {
            check(AmbientOcclusionMask);
        }
        else if (ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::SSAO)
        {
            // Fetch result of SSAO that was done earlier.
            if (HasBeenProduced(SceneTextures.ScreenSpaceAO))
            {
                AmbientOcclusionMask = SceneTextures.ScreenSpaceAO;
            }
            else
            {
                AmbientOcclusionMask = GetScreenSpaceAOFallback(SystemTextures);
                bWritableAmbientOcclusionMask = false;
            }
        }
        else
        {
            unimplemented();
            bWritableAmbientOcclusionMask = false;
        }

        // Extract the dynamic AO for application of AO beyond RenderDiffuseIndirectAndAmbientOcclusion()
        if (AmbientOcclusionMask && ViewPipelineState.AmbientOcclusionMethod != EAmbientOcclusionMethod::SSAO)
        {
            ensureMsgf(Views.Num() == 1, TEXT("Need to add support for one AO texture per view in FSceneTextures"));
            SceneTextures.ScreenSpaceAO = AmbientOcclusionMask;
        }

        if (HairStrands::HasViewHairStrandsData(View) && (ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::SSGI || ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::SSAO) && bWritableAmbientOcclusionMask)
        {
            RenderHairStrandsAmbientOcclusion(
                GraphBuilder,
                View,
                AmbientOcclusionMask);
        }

        // Apply diffuse indirect lighting and ambient occlusion to the scene color.
        if ((DenoiserOutputs.Textures[0] || AmbientOcclusionMask) && (!bIsVisualizePass || ViewPipelineState.DiffuseIndirectDenoiser != IScreenSpaceDenoiser::EMode::Disabled || ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen)
            && !IsMetalPlatform(ShaderPlatform))
        {
            // The pixel shader used is FDiffuseIndirectCompositePS.
            FDiffuseIndirectCompositePS::FParameters* PassParameters = GraphBuilder.AllocParameters<FDiffuseIndirectCompositePS::FParameters>();
            
            PassParameters->AmbientOcclusionStaticFraction = FMath::Clamp(View.FinalPostProcessSettings.AmbientOcclusionStaticFraction, 0.0f, 1.0f);

            PassParameters->ApplyAOToDynamicDiffuseIndirect = 0.0f;

            if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen)
            {
                PassParameters->ApplyAOToDynamicDiffuseIndirect = 1.0f;
            }

            const FIntPoint BufferExtent = SceneTextureParameters.SceneDepthTexture->Desc.Extent;

            {
                // Placeholder texture for textures pulled in from SSDCommon.ush
                FRDGTextureDesc Desc = FRDGTextureDesc::Create2D(
                    FIntPoint(1),
                    PF_R32_UINT,
                    FClearValueBinding::Black,
                    TexCreate_ShaderResource);
                FRDGTextureRef CompressedMetadataPlaceholder = GraphBuilder.CreateTexture(Desc, TEXT("CompressedMetadataPlaceholder"));

                PassParameters->CompressedMetadata[0] = CompressedMetadataPlaceholder;
                PassParameters->CompressedMetadata[1] = CompressedMetadataPlaceholder;
            }

            PassParameters->BufferUVToOutputPixelPosition = BufferExtent;
            PassParameters->EyeAdaptation = GetEyeAdaptationTexture(GraphBuilder, View);
            PassParameters->LumenReflectionCompositeParameters = LumenReflectionCompositeParameters;

            PassParameters->bVisualizeDiffuseIndirect = bIsVisualizePass;

            PassParameters->DiffuseIndirect = DenoiserOutputs;
            PassParameters->DiffuseIndirectSampler = TStaticSamplerState<SF_Point>::GetRHI();

            PassParameters->PreIntegratedGF = GSystemTextures.PreintegratedGF->GetRenderTargetItem().ShaderResourceTexture;
            PassParameters->PreIntegratedGFSampler = TStaticSamplerState<SF_Bilinear, AM_Clamp, AM_Clamp, AM_Clamp>::GetRHI();

            PassParameters->AmbientOcclusionTexture = AmbientOcclusionMask;
            PassParameters->AmbientOcclusionSampler = TStaticSamplerState<SF_Point>::GetRHI();
            
            if (!PassParameters->AmbientOcclusionTexture || bIsVisualizePass)
            {
                PassParameters->AmbientOcclusionTexture = SystemTextures.White;
            }

            // Set up the denoiser's common shader parameters.
            Denoiser::SetupCommonShaderParameters(
                View, SceneTextureParameters,
                View.ViewRect,
                1.0f / CommonDiffuseParameters.DownscaleFactor,
                /* out */ &PassParameters->DenoiserCommonParameters);
            PassParameters->SceneTextures = SceneTextureParameters;
            PassParameters->ViewUniformBuffer = View.ViewUniformBuffer;

            PassParameters->RenderTargets[0] = FRenderTargetBinding(
                SceneColorTexture, ERenderTargetLoadAction::ELoad);

            {
                FRDGTextureDesc Desc = FRDGTextureDesc::Create2D(
                    SceneColorTexture->Desc.Extent,
                    PF_FloatRGBA,
                    FClearValueBinding::None,
                    TexCreate_ShaderResource | TexCreate_UAV);

                PassParameters->PassDebugOutput = GraphBuilder.CreateUAV(
                    GraphBuilder.CreateTexture(Desc, TEXT("DebugDiffuseIndirectComposite")));
            }

            const TCHAR* DiffuseIndirectSampling = TEXT("Disabled");
            FDiffuseIndirectCompositePS::FPermutationDomain PermutationVector;
            bool bUpscale = false;

            if (DenoiserOutputs.Textures[0])
            {
                if (bLumenUseDenoiserComposite)
                {
                    PermutationVector.Set<FDiffuseIndirectCompositePS::FApplyDiffuseIndirectDim>(2);
                    DiffuseIndirectSampling = TEXT("ProbeHierarchy");
                }
                else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::RTGI)
                {
                    PermutationVector.Set<FDiffuseIndirectCompositePS::FApplyDiffuseIndirectDim>(3);
                    DiffuseIndirectSampling = TEXT("RTGI");
                }
                else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen)
                {
                    PermutationVector.Set<FDiffuseIndirectCompositePS::FApplyDiffuseIndirectDim>(4);
                    DiffuseIndirectSampling = TEXT("ScreenProbeGather");
                }
                else
                {
                    PermutationVector.Set<FDiffuseIndirectCompositePS::FApplyDiffuseIndirectDim>(1);
                    DiffuseIndirectSampling = TEXT("SSGI");
                    bUpscale = DenoiserOutputs.Textures[0]->Desc.Extent != SceneColorTexture->Desc.Extent;
                }

                PermutationVector.Set<FDiffuseIndirectCompositePS::FUpscaleDiffuseIndirectDim>(bUpscale);
            }

            TShaderMapRef<FDiffuseIndirectCompositePS> PixelShader(View.ShaderMap, PermutationVector);
            // Clear shader resource bindings that this shader permutation does not use.
            ClearUnusedGraphResources(PixelShader, PassParameters);

            FRHIBlendState* BlendState = TStaticBlendState<CW_RGBA, BO_Add, BF_One, BF_Source1Color, BO_Add, BF_One, BF_Source1Alpha>::GetRHI();

            if (bIsVisualizePass)
            {
                BlendState = TStaticBlendState<>::GetRHI();
            }

            // Add the diffuse indirect composite pass.
            FPixelShaderUtils::AddFullscreenPass(
                GraphBuilder,
                View.ShaderMap,
                RDG_EVENT_NAME(
                    "DiffuseIndirectComposite(DiffuseIndirect=%s%s%s%s) %dx%d",
                    DiffuseIndirectSampling,
                    PermutationVector.Get<FDiffuseIndirectCompositePS::FUpscaleDiffuseIndirectDim>() ? TEXT(" UpscaleDiffuseIndirect") : TEXT(""),
                    AmbientOcclusionMask ? TEXT(" ApplyAOToSceneColor") : TEXT(""),
                    PassParameters->ApplyAOToDynamicDiffuseIndirect > 0.0f ? TEXT(" ApplyAOToDynamicDiffuseIndirect") : TEXT(""),
                    View.ViewRect.Width(), View.ViewRect.Height()),
                PixelShader,
                PassParameters,
                View.ViewRect,
                BlendState);
        } // if (DenoiserOutputs.Color || bApplySSAO)

        // Apply the ambient cubemaps.
        if (IsAmbientCubemapPassRequired(View) && !bIsVisualizePass && !ViewPipelineState.bUseLumenProbeHierarchy)
        {
            FAmbientCubemapCompositePS::FParameters* PassParameters = GraphBuilder.AllocParameters<FAmbientCubemapCompositePS::FParameters>();
            
            PassParameters->PreIntegratedGF = GSystemTextures.PreintegratedGF->GetRenderTargetItem().ShaderResourceTexture;
            PassParameters->PreIntegratedGFSampler = TStaticSamplerState<SF_Bilinear, AM_Clamp, AM_Clamp, AM_Clamp>::GetRHI();
            
            PassParameters->AmbientOcclusionTexture = AmbientOcclusionMask;
            PassParameters->AmbientOcclusionSampler = TStaticSamplerState<SF_Point>::GetRHI();
            
            if (!PassParameters->AmbientOcclusionTexture)
            {
                PassParameters->AmbientOcclusionTexture = SystemTextures.White;
            }

            PassParameters->SceneTextures = SceneTextureParameters;
            PassParameters->ViewUniformBuffer = View.ViewUniformBuffer;

            PassParameters->RenderTargets[0] = FRenderTargetBinding(
                SceneColorTexture, ERenderTargetLoadAction::ELoad);
        
            TShaderMapRef<FAmbientCubemapCompositePS> PixelShader(View.ShaderMap);
            GraphBuilder.AddPass(
                RDG_EVENT_NAME("AmbientCubemapComposite %dx%d", View.ViewRect.Width(), View.ViewRect.Height()),
                PassParameters,
                ERDGPassFlags::Raster,
                [PassParameters, &View, PixelShader](FRHICommandList& RHICmdList)
            {
                TShaderMapRef<FPostProcessVS> VertexShader(View.ShaderMap);
                
                RHICmdList.SetViewport(View.ViewRect.Min.X, View.ViewRect.Min.Y, 0.0f, View.ViewRect.Max.X, View.ViewRect.Max.Y, 0.0);

                FGraphicsPipelineStateInitializer GraphicsPSOInit;
                RHICmdList.ApplyCachedRenderTargets(GraphicsPSOInit);

                // set the state
                GraphicsPSOInit.BlendState = TStaticBlendState<CW_RGB, BO_Add, BF_One, BF_One, BO_Add, BF_One, BF_One>::GetRHI();
                GraphicsPSOInit.RasterizerState = TStaticRasterizerState<>::GetRHI();
                GraphicsPSOInit.DepthStencilState = TStaticDepthStencilState<false, CF_Always>::GetRHI();

                GraphicsPSOInit.BoundShaderState.VertexDeclarationRHI = GFilterVertexDeclaration.VertexDeclarationRHI;
                GraphicsPSOInit.BoundShaderState.VertexShaderRHI = VertexShader.GetVertexShader();
                GraphicsPSOInit.BoundShaderState.PixelShaderRHI = PixelShader.GetPixelShader();
                GraphicsPSOInit.PrimitiveType = PT_TriangleList;

                SetGraphicsPipelineState(RHICmdList, GraphicsPSOInit);

                uint32 Count = View.FinalPostProcessSettings.ContributingCubemaps.Num();
                for (const FFinalPostProcessSettings::FCubemapEntry& CubemapEntry : View.FinalPostProcessSettings.ContributingCubemaps)
                {
                    FAmbientCubemapCompositePS::FParameters ShaderParameters = *PassParameters;
                    SetupAmbientCubemapParameters(CubemapEntry, &ShaderParameters.AmbientCubemap);
                    SetShaderParameters(RHICmdList, PixelShader, PixelShader.GetPixelShader(), ShaderParameters);
                    
                    DrawPostProcessPass(
                        RHICmdList,
                        0, 0,
                        View.ViewRect.Width(), View.ViewRect.Height(),
                        View.ViewRect.Min.X, View.ViewRect.Min.Y,
                        View.ViewRect.Width(), View.ViewRect.Height(),
                        View.ViewRect.Size(),
                        GetSceneTextureExtent(),
                        VertexShader,
                        View.StereoPass, 
                        false, // TODO.
                        EDRF_UseTriangleOptimization);
                }
            });
        } // if (IsAmbientCubemapPassRequired(View))
    } // for (FViewInfo& View : Views)
}
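
The composite pass above uses the blend state `TStaticBlendState<CW_RGBA, BO_Add, BF_One, BF_Source1Color, ...>`, which is dual-source blending: the first pixel shader output is added, and the second output scales whatever is already in the scene color. The following is a minimal CPU-side sketch of that arithmetic only. `Color3` and `DualSourceBlend` are hypothetical names for illustration, and reading the second output as an occlusion-like factor is an assumption about the shader, not something shown in the code above.

```cpp
#include <cassert>

// Hypothetical RGB triple, used only for this illustration.
struct Color3 { float R, G, B; };

// Dual-source blend with source factor BF_One and destination factor BF_Source1Color:
//   Result = Source0 * 1 + Dest * Source1
// Source0: lighting to add. Source1: per-channel factor applied to the existing scene color.
inline Color3 DualSourceBlend(const Color3& Dest, const Color3& Source0, const Color3& Source1)
{
    return {
        Source0.R + Dest.R * Source1.R,
        Source0.G + Dest.G * Source1.G,
        Source0.B + Dest.B * Source1.B
    };
}

// Usage: a factor of 0.5 halves the existing scene color while 0.25 of red light is added.
inline void DualSourceBlendExample()
{
    const Color3 Out = DualSourceBlend({0.5f, 0.5f, 0.5f}, {0.25f, 0.0f, 0.0f}, {0.5f, 0.5f, 0.5f});
    assert(Out.R == 0.5f && Out.G == 0.25f && Out.B == 0.25f);
}
```

With this setup a single fullscreen pass can both attenuate the existing scene color and add new lighting. In the visualize path the code switches to the default `TStaticBlendState<>`, so the output simply overwrites the target.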
           

RenderLumenScreenProbeGather renders Lumen's screen-space probe gather. Its code is as follows:

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenScreenProbeGather.cpp

FSSDSignalTextures FDeferredShadingSceneRenderer::RenderLumenScreenProbeGather(
    FRDGBuilder& GraphBuilder,
    const FSceneTextures& SceneTextures,
    const ScreenSpaceRayTracing::FPrevSceneColorMip& PrevSceneColorMip,
    FRDGTextureRef LightingChannelsTexture,
    const FViewInfo& View,
    FPreviousViewInfo* PreviousViewInfos,
    bool& bLumenUseDenoiserComposite,
    FLumenMeshSDFGridParameters& MeshSDFGridParameters)
{
    LLM_SCOPE_BYTAG(Lumen);

    // Render the Lumen irradiance field gather instead.
    if (GLumenIrradianceFieldGather != 0)
    {
        bLumenUseDenoiserComposite = false;
        return RenderLumenIrradianceFieldGather(GraphBuilder, SceneTextures, View);
    }

    RDG_EVENT_SCOPE(GraphBuilder, "LumenScreenProbeGather");
    RDG_GPU_STAT_SCOPE(GraphBuilder, LumenScreenProbeGather);

    check(ShouldRenderLumenDiffuseGI(Scene, View, true));
    const FRDGSystemTextures& SystemTextures = FRDGSystemTextures::Get(GraphBuilder);

    if (!LightingChannelsTexture)
    {
        LightingChannelsTexture = SystemTextures.Black;
    }

    // If LumenScreenProbeGather is disabled, return cleared (black) denoiser inputs.
    if (!GLumenScreenProbeGather)
    {
        FSSDSignalTextures ScreenSpaceDenoiserInputs;
        ScreenSpaceDenoiserInputs.Textures[0] = SystemTextures.Black;
        FRDGTextureDesc RoughSpecularIndirectDesc = FRDGTextureDesc::Create2D(SceneTextures.Config.Extent, PF_FloatRGB, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV);
        ScreenSpaceDenoiserInputs.Textures[1] = GraphBuilder.CreateTexture(RoughSpecularIndirectDesc, TEXT("Lumen.ScreenProbeGather.RoughSpecularIndirect"));
        AddClearUAVPass(GraphBuilder, GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenSpaceDenoiserInputs.Textures[1])), FLinearColor::Black);
        bLumenUseDenoiserComposite = false;
        return ScreenSpaceDenoiserInputs;
    }

    // Pull from the uniform buffer to get fallback textures.
    const FSceneTextureParameters SceneTextureParameters = GetSceneTextureParameters(GraphBuilder, SceneTextures.UniformBuffer);

    // Set up the screen probe parameters.
    FScreenProbeParameters ScreenProbeParameters;
    ScreenProbeParameters.ScreenProbeTracingOctahedronResolution = LumenScreenProbeGather::GetTracingOctahedronResolution(View);
    ensureMsgf(ScreenProbeParameters.ScreenProbeTracingOctahedronResolution < (1 << 6) - 1, TEXT("Tracing resolution %u was larger than supported by PackRayInfo()"), ScreenProbeParameters.ScreenProbeTracingOctahedronResolution);
    ScreenProbeParameters.ScreenProbeGatherOctahedronResolution = LumenScreenProbeGather::GetGatherOctahedronResolution(ScreenProbeParameters.ScreenProbeTracingOctahedronResolution);
    ScreenProbeParameters.ScreenProbeGatherOctahedronResolutionWithBorder = ScreenProbeParameters.ScreenProbeGatherOctahedronResolution + 2 * (1 << (GLumenScreenProbeGatherNumMips - 1));
    ScreenProbeParameters.ScreenProbeDownsampleFactor = LumenScreenProbeGather::GetScreenDownsampleFactor(View);

    ScreenProbeParameters.ScreenProbeViewSize = FIntPoint::DivideAndRoundUp(View.ViewRect.Size(), (int32)ScreenProbeParameters.ScreenProbeDownsampleFactor);
    ScreenProbeParameters.ScreenProbeAtlasViewSize = ScreenProbeParameters.ScreenProbeViewSize;
    ScreenProbeParameters.ScreenProbeAtlasViewSize.Y += FMath::TruncToInt(ScreenProbeParameters.ScreenProbeViewSize.Y * GLumenScreenProbeGatherAdaptiveProbeAllocationFraction);

    ScreenProbeParameters.ScreenProbeAtlasBufferSize = FIntPoint::DivideAndRoundUp(SceneTextures.Config.Extent, (int32)ScreenProbeParameters.ScreenProbeDownsampleFactor);
    ScreenProbeParameters.ScreenProbeAtlasBufferSize.Y += FMath::TruncToInt(ScreenProbeParameters.ScreenProbeAtlasBufferSize.Y * GLumenScreenProbeGatherAdaptiveProbeAllocationFraction);

    ScreenProbeParameters.ScreenProbeGatherMaxMip = GLumenScreenProbeGatherNumMips - 1;
    ScreenProbeParameters.RelativeSpeedDifferenceToConsiderLightingMoving = GLumenScreenProbeRelativeSpeedDifferenceToConsiderLightingMoving;
    ScreenProbeParameters.ScreenTraceNoFallbackThicknessScale = Lumen::UseHardwareRayTracedScreenProbeGather() ? 1.0f : GLumenScreenProbeScreenTracesThicknessScaleWhenNoFallback;
    ScreenProbeParameters.NumUniformScreenProbes = ScreenProbeParameters.ScreenProbeViewSize.X * ScreenProbeParameters.ScreenProbeViewSize.Y;
    ScreenProbeParameters.MaxNumAdaptiveProbes = FMath::TruncToInt(ScreenProbeParameters.NumUniformScreenProbes * GLumenScreenProbeGatherAdaptiveProbeAllocationFraction);
    extern int32 GLumenScreenProbeGatherVisualizeTraces;
    ScreenProbeParameters.FixedJitterIndex = GLumenScreenProbeGatherVisualizeTraces == 0 ? GLumenScreenProbeFixedJitterIndex : 6;

    FRDGTextureDesc DownsampledDepthDesc(FRDGTextureDesc::Create2D(ScreenProbeParameters.ScreenProbeAtlasBufferSize, PF_R32_UINT, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
    ScreenProbeParameters.ScreenProbeSceneDepth = GraphBuilder.CreateTexture(DownsampledDepthDesc, TEXT("Lumen.ScreenProbeGather.ScreenProbeSceneDepth"));

    FRDGTextureDesc DownsampledSpeedDesc(FRDGTextureDesc::Create2D(ScreenProbeParameters.ScreenProbeAtlasBufferSize, PF_R16F, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
    ScreenProbeParameters.ScreenProbeWorldSpeed = GraphBuilder.CreateTexture(DownsampledSpeedDesc, TEXT("Lumen.ScreenProbeGather.ScreenProbeWorldSpeed"));

    FBlueNoise BlueNoise;
    InitializeBlueNoise(BlueNoise);
    ScreenProbeParameters.BlueNoise = CreateUniformBufferImmediate(BlueNoise, EUniformBufferUsage::UniformBuffer_SingleDraw);

    ScreenProbeParameters.OctahedralSolidAngleParameters.OctahedralSolidAngleTextureResolutionSq = GLumenOctahedralSolidAngleTextureSize * GLumenOctahedralSolidAngleTextureSize;
    ScreenProbeParameters.OctahedralSolidAngleParameters.OctahedralSolidAngleTexture = InitializeOctahedralSolidAngleTexture(GraphBuilder, View.ShaderMap, GLumenOctahedralSolidAngleTextureSize, View.ViewState->Lumen.ScreenProbeGatherState.OctahedralSolidAngleTextureRT);

    // Downsample scene depth at the uniformly placed probes.
    {
        FScreenProbeDownsampleDepthUniformCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeDownsampleDepthUniformCS::FParameters>();
        PassParameters->RWScreenProbeSceneDepth = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenProbeSceneDepth));
        PassParameters->RWScreenProbeWorldSpeed = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenProbeWorldSpeed));
        PassParameters->View = View.ViewUniformBuffer;
        PassParameters->SceneTexturesStruct = SceneTextures.UniformBuffer;
        PassParameters->SceneTextures = SceneTextureParameters;
        PassParameters->ScreenProbeParameters = ScreenProbeParameters;

        auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeDownsampleDepthUniformCS>(0);

        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("UniformPlacement DownsampleFactor=%u", ScreenProbeParameters.ScreenProbeDownsampleFactor),
            ComputeShader,
            PassParameters,
            FComputeShaderUtils::GetGroupCount(ScreenProbeParameters.ScreenProbeViewSize, FScreenProbeDownsampleDepthUniformCS::GetGroupSize()));
    }

    FRDGBufferRef NumAdaptiveScreenProbes = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateBufferDesc(sizeof(uint32), 1), TEXT("Lumen.ScreenProbeGather.NumAdaptiveScreenProbes"));
    FRDGBufferRef AdaptiveScreenProbeData = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateBufferDesc(sizeof(uint32), FMath::Max<uint32>(ScreenProbeParameters.MaxNumAdaptiveProbes, 1)), TEXT("Lumen.ScreenProbeGather.daptiveScreenProbeData"));

    ScreenProbeParameters.NumAdaptiveScreenProbes = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(NumAdaptiveScreenProbes, PF_R32_UINT));
    ScreenProbeParameters.AdaptiveScreenProbeData = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(AdaptiveScreenProbeData, PF_R32_UINT));

    const FIntPoint ScreenProbeViewportBufferSize = FIntPoint::DivideAndRoundUp(SceneTextures.Config.Extent, (int32)ScreenProbeParameters.ScreenProbeDownsampleFactor);
    FRDGTextureDesc ScreenTileAdaptiveProbeHeaderDesc(FRDGTextureDesc::Create2D(ScreenProbeViewportBufferSize, PF_R32_UINT, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
    FIntPoint ScreenTileAdaptiveProbeIndicesBufferSize = FIntPoint(ScreenProbeViewportBufferSize.X * ScreenProbeParameters.ScreenProbeDownsampleFactor, ScreenProbeViewportBufferSize.Y * ScreenProbeParameters.ScreenProbeDownsampleFactor);
    FRDGTextureDesc ScreenTileAdaptiveProbeIndicesDesc(FRDGTextureDesc::Create2D(ScreenTileAdaptiveProbeIndicesBufferSize, PF_R16_UINT, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
    ScreenProbeParameters.ScreenTileAdaptiveProbeHeader = GraphBuilder.CreateTexture(ScreenTileAdaptiveProbeHeaderDesc, TEXT("Lumen.ScreenProbeGather.ScreenTileAdaptiveProbeHeader"));
    ScreenProbeParameters.ScreenTileAdaptiveProbeIndices = GraphBuilder.CreateTexture(ScreenTileAdaptiveProbeIndicesDesc, TEXT("Lumen.ScreenProbeGather.ScreenTileAdaptiveProbeIndices"));

    FComputeShaderUtils::ClearUAV(GraphBuilder, View.ShaderMap, GraphBuilder.CreateUAV(FRDGBufferUAVDesc(NumAdaptiveScreenProbes, PF_R32_UINT)), 0);
    uint32 ClearValues[4] = {0, 0, 0, 0};
    AddClearUAVPass(GraphBuilder, GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenTileAdaptiveProbeHeader)), ClearValues);

    const uint32 AdaptiveProbeMinDownsampleFactor = FMath::Clamp(GLumenScreenProbeGatherAdaptiveProbeMinDownsampleFactor, 1, 64);

    if (ScreenProbeParameters.MaxNumAdaptiveProbes > 0 && AdaptiveProbeMinDownsampleFactor < ScreenProbeParameters.ScreenProbeDownsampleFactor)
    { 
        // Adaptively place additional probes.
        uint32 PlacementDownsampleFactor = ScreenProbeParameters.ScreenProbeDownsampleFactor;
        do
        {
            PlacementDownsampleFactor /= 2;
            FScreenProbeAdaptivePlacementCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeAdaptivePlacementCS::FParameters>();
            PassParameters->RWScreenProbeSceneDepth = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenProbeSceneDepth));
            PassParameters->RWScreenProbeWorldSpeed = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenProbeWorldSpeed));
            PassParameters->RWNumAdaptiveScreenProbes = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(NumAdaptiveScreenProbes, PF_R32_UINT));
            PassParameters->RWAdaptiveScreenProbeData = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(AdaptiveScreenProbeData, PF_R32_UINT));
            PassParameters->RWScreenTileAdaptiveProbeHeader = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenTileAdaptiveProbeHeader));
            PassParameters->RWScreenTileAdaptiveProbeIndices = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenTileAdaptiveProbeIndices));
            PassParameters->View = View.ViewUniformBuffer;
            PassParameters->SceneTexturesStruct = SceneTextures.UniformBuffer;
            PassParameters->SceneTextures = SceneTextureParameters;
            PassParameters->ScreenProbeParameters = ScreenProbeParameters;
            PassParameters->PlacementDownsampleFactor = PlacementDownsampleFactor;

            auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeAdaptivePlacementCS>(0);

            FComputeShaderUtils::AddPass(
                GraphBuilder,
                RDG_EVENT_NAME("AdaptivePlacement DownsampleFactor=%u", PlacementDownsampleFactor),
                ComputeShader,
                PassParameters,
                FComputeShaderUtils::GetGroupCount(FIntPoint::DivideAndRoundDown(View.ViewRect.Size(), (int32)PlacementDownsampleFactor), FScreenProbeAdaptivePlacementCS::GetGroupSize()));
        }
        while (PlacementDownsampleFactor > AdaptiveProbeMinDownsampleFactor);
    }
    else
    {
        FComputeShaderUtils::ClearUAV(GraphBuilder, View.ShaderMap, GraphBuilder.CreateUAV(FRDGBufferUAVDesc(AdaptiveScreenProbeData, PF_R32_UINT)), 0);
        AddClearUAVPass(GraphBuilder, GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenTileAdaptiveProbeIndices)), ClearValues);
    }

    FRDGBufferRef ScreenProbeIndirectArgs = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateIndirectDesc<FRHIDispatchIndirectParameters>((uint32)EScreenProbeIndirectArgs::Max), TEXT("Lumen.ScreenProbeGather.ScreenProbeIndirectArgs"));

    // Set up the indirect dispatch arguments for the adaptive probes.
    {
        FSetupAdaptiveProbeIndirectArgsCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FSetupAdaptiveProbeIndirectArgsCS::FParameters>();
        PassParameters->RWScreenProbeIndirectArgs = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(ScreenProbeIndirectArgs, PF_R32_UINT));
        PassParameters->ScreenProbeParameters = ScreenProbeParameters;

        auto ComputeShader = View.ShaderMap->GetShader<FSetupAdaptiveProbeIndirectArgsCS>(0);

        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("SetupAdaptiveProbeIndirectArgs"),
            ComputeShader,
            PassParameters,
            FIntVector(1, 1, 1));
    }

    ScreenProbeParameters.ProbeIndirectArgs = ScreenProbeIndirectArgs;

    FLumenCardTracingInputs TracingInputs(GraphBuilder, Scene, View);

    FRDGTextureRef BRDFProbabilityDensityFunction = nullptr;
    FRDGBufferSRVRef BRDFProbabilityDensityFunctionSH = nullptr;
    GenerateBRDF_PDF(GraphBuilder, View, SceneTextures, BRDFProbabilityDensityFunction, BRDFProbabilityDensityFunctionSH, ScreenProbeParameters);

    const LumenRadianceCache::FRadianceCacheInputs RadianceCacheInputs = LumenScreenProbeGatherRadianceCache::SetupRadianceCacheInputs();
    LumenRadianceCache::FRadianceCacheInterpolationParameters RadianceCacheParameters;

    // Radiance cache.
    if (LumenScreenProbeGather::UseRadianceCache(View))
    {
        FScreenGatherMarkUsedProbesData MarkUsedProbesData;
        MarkUsedProbesData.Parameters.View = View.ViewUniformBuffer;
        MarkUsedProbesData.Parameters.SceneTexturesStruct = SceneTextures.UniformBuffer;
        MarkUsedProbesData.Parameters.ScreenProbeParameters = ScreenProbeParameters;
        MarkUsedProbesData.Parameters.VisualizeLumenScene = View.Family->EngineShowFlags.VisualizeLumenScene != 0 ? 1 : 0;
        MarkUsedProbesData.Parameters.RadianceCacheParameters = RadianceCacheParameters;

        // Render the radiance cache.
        RenderRadianceCache(
            GraphBuilder, 
            TracingInputs, 
            RadianceCacheInputs, 
            Scene,
            View, 
            &ScreenProbeParameters, 
            BRDFProbabilityDensityFunctionSH, 
            FMarkUsedRadianceCacheProbes::CreateStatic(&ScreenGatherMarkUsedProbes), 
            &MarkUsedProbesData, 
            View.ViewState->RadianceCacheState, 
            RadianceCacheParameters);
    }

    if (LumenScreenProbeGather::UseImportanceSampling(View))
    {
        // Generate the importance-sampled rays.
        GenerateImportanceSamplingRays(
            GraphBuilder,
            View,
            SceneTextures,
            RadianceCacheParameters,
            BRDFProbabilityDensityFunction,
            BRDFProbabilityDensityFunctionSH,
            ScreenProbeParameters);
    }

    const FIntPoint ScreenProbeTraceBufferSize = ScreenProbeParameters.ScreenProbeAtlasBufferSize * ScreenProbeParameters.ScreenProbeTracingOctahedronResolution;
    FRDGTextureDesc TraceRadianceDesc(FRDGTextureDesc::Create2D(ScreenProbeTraceBufferSize, PF_FloatRGB, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
    ScreenProbeParameters.TraceRadiance = GraphBuilder.CreateTexture(TraceRadianceDesc, TEXT("Lumen.ScreenProbeGather.TraceRadiance"));
    ScreenProbeParameters.RWTraceRadiance = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.TraceRadiance));

    FRDGTextureDesc TraceHitDesc(FRDGTextureDesc::Create2D(ScreenProbeTraceBufferSize, PF_R32_UINT, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
    ScreenProbeParameters.TraceHit = GraphBuilder.CreateTexture(TraceHitDesc, TEXT("Lumen.ScreenProbeGather.TraceHit"));
    ScreenProbeParameters.RWTraceHit = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.TraceHit));

    // Trace the screen probes.
    TraceScreenProbes(
        GraphBuilder, 
        Scene,
        View, 
        GLumenGatherCvars.TraceMeshSDFs != 0 && Lumen::UseMeshSDFTracing(),
        SceneTextures.UniformBuffer,
        PrevSceneColorMip,
        LightingChannelsTexture,
        TracingInputs,
        RadianceCacheParameters,
        ScreenProbeParameters,
        MeshSDFGridParameters);
    
    FScreenProbeGatherParameters GatherParameters;
    // Filter the screen probes.
    FilterScreenProbes(GraphBuilder, View, ScreenProbeParameters, GatherParameters);

    FScreenSpaceBentNormalParameters ScreenSpaceBentNormalParameters;
    ScreenSpaceBentNormalParameters.UseScreenBentNormal = 0;
    ScreenSpaceBentNormalParameters.ScreenBentNormal = SystemTextures.Black;
    ScreenSpaceBentNormalParameters.ScreenDiffuseLighting = SystemTextures.Black;

    // Compute the screen-space bent normal.
    if (LumenScreenProbeGather::UseScreenSpaceBentNormal())
    {
        ScreenSpaceBentNormalParameters = ComputeScreenSpaceBentNormal(GraphBuilder, Scene, View, SceneTextures, LightingChannelsTexture, ScreenProbeParameters);
    }

    FRDGTextureDesc DiffuseIndirectDesc = FRDGTextureDesc::Create2D(SceneTextures.Config.Extent, PF_FloatRGBA, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV);
    FRDGTextureRef DiffuseIndirect = GraphBuilder.CreateTexture(DiffuseIndirectDesc, TEXT("Lumen.ScreenProbeGather.DiffuseIndirect"));

    FRDGTextureDesc RoughSpecularIndirectDesc = FRDGTextureDesc::Create2D(SceneTextures.Config.Extent, PF_FloatRGB, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV);
    FRDGTextureRef RoughSpecularIndirect = GraphBuilder.CreateTexture(RoughSpecularIndirectDesc, TEXT("Lumen.ScreenProbeGather.RoughSpecularIndirect"));

    {
        FScreenProbeIndirectCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeIndirectCS::FParameters>();
        PassParameters->RWDiffuseIndirect = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(DiffuseIndirect));
        PassParameters->RWRoughSpecularIndirect = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(RoughSpecularIndirect));
        PassParameters->GatherParameters = GatherParameters;
        PassParameters->ScreenProbeParameters = ScreenProbeParameters;
        PassParameters->View = View.ViewUniformBuffer;
        PassParameters->SceneTexturesStruct = SceneTextures.UniformBuffer;
        PassParameters->FullResolutionJitterWidth = GLumenScreenProbeFullResolutionJitterWidth;
        extern float GLumenReflectionMaxRoughnessToTrace;
        extern float GLumenReflectionRoughnessFadeLength;
        PassParameters->MaxRoughnessToTrace = GLumenReflectionMaxRoughnessToTrace;
        PassParameters->RoughnessFadeLength = GLumenReflectionRoughnessFadeLength;
        PassParameters->ScreenSpaceBentNormalParameters = ScreenSpaceBentNormalParameters;

        FScreenProbeIndirectCS::FPermutationDomain PermutationVector;
        PermutationVector.Set< FScreenProbeIndirectCS::FDiffuseIntegralMethod >(LumenScreenProbeGather::GetDiffuseIntegralMethod());
        auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeIndirectCS>(PermutationVector);

        // Compute indirect lighting from the screen probes.
        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("ComputeIndirect %ux%u", View.ViewRect.Width(), View.ViewRect.Height()),
            ComputeShader,
            PassParameters,
            FComputeShaderUtils::GetGroupCount(View.ViewRect.Size(), FScreenProbeIndirectCS::GetGroupSize()));
    }

    FSSDSignalTextures DenoiserOutputs;
    DenoiserOutputs.Textures[0] = DiffuseIndirect;
    DenoiserOutputs.Textures[1] = RoughSpecularIndirect;
    bLumenUseDenoiserComposite = false;

    // Temporal filtering of the screen probe gather result.
    if (GLumenScreenProbeTemporalFilter)
    {
        if (GLumenScreenProbeUseHistoryNeighborhoodClamp)
        {
            FRDGTextureRef CompressedDepthTexture;
            FRDGTextureRef CompressedShadingModelTexture;
            {
                FRDGTextureDesc Desc = FRDGTextureDesc::Create2D(
                    SceneTextures.Depth.Resolve->Desc.Extent,
                    PF_R16F,
                    FClearValueBinding::None,                    
                    /* InTargetableFlags = */ TexCreate_ShaderResource | TexCreate_UAV);

                CompressedDepthTexture = GraphBuilder.CreateTexture(Desc, TEXT("Lumen.ScreenProbeGather.CompressedDepth"));

                Desc.Format = PF_R8_UINT;
                CompressedShadingModelTexture = GraphBuilder.CreateTexture(Desc, TEXT("Lumen.ScreenProbeGather.CompressedShadingModelID"));
            }

            {
                FGenerateCompressedGBuffer::FParameters* PassParameters = GraphBuilder.AllocParameters<FGenerateCompressedGBuffer::FParameters>();
                PassParameters->RWCompressedDepthBufferOutput = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(CompressedDepthTexture));
                PassParameters->RWCompressedShadingModelOutput = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(CompressedShadingModelTexture));
                PassParameters->View = View.ViewUniformBuffer;
                PassParameters->SceneTextures = SceneTextureParameters;

                auto ComputeShader = View.ShaderMap->GetShader<FGenerateCompressedGBuffer>(0);

                FComputeShaderUtils::AddPass(
                    GraphBuilder,
                    RDG_EVENT_NAME("GenerateCompressedGBuffer"),
                    ComputeShader,
                    PassParameters,
                    FComputeShaderUtils::GetGroupCount(View.ViewRect.Size(), FGenerateCompressedGBuffer::GetGroupSize()));
            }

            FSSDSignalTextures ScreenSpaceDenoiserInputs;
            ScreenSpaceDenoiserInputs.Textures[0] = DiffuseIndirect;
            ScreenSpaceDenoiserInputs.Textures[1] = RoughSpecularIndirect;

            DenoiserOutputs = IScreenSpaceDenoiser::DenoiseIndirectProbeHierarchy(
                GraphBuilder,
                View, 
                PreviousViewInfos,
                SceneTextureParameters,
                ScreenSpaceDenoiserInputs,
                CompressedDepthTexture,
                CompressedShadingModelTexture);

            bLumenUseDenoiserComposite = true;
        }
        else
        {
            UpdateHistoryScreenProbeGather(
                GraphBuilder,
                View,
                SceneTextures,
                DiffuseIndirect,
                RoughSpecularIndirect);

            DenoiserOutputs.Textures[0] = DiffuseIndirect;
            DenoiserOutputs.Textures[1] = RoughSpecularIndirect;
        }
    }

    return DenoiserOutputs;
}
           

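Combining the source code with a RenderDoc frame capture shows that the screen-space probe gather stage is exceptionally complex. The main steps of the normal flow are: uniform and adaptive probe placement, computing the BRDF, rendering the radiance cache, computing the lighting PDF, generating sample rays, tracing the screen-space probes, compacting the traces, tracing voxels, compositing the trace results, filtering the gathered radiance, processing bent normals, computing indirect lighting, and updating the history data: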

[Figure: RenderDoc capture listing the passes of the screen probe gather stage.]

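Because there are too many steps to cover them all, the rest of this section uses the frame capture to analyze only a few of the important ones.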

  • RadianceCache

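The radiance cache (RadianceCache) is itself a fairly complex series of passes: clearing, marking, updating, and allocating probes, setting up the draw parameters, tracing from the probes, and filtering the probe radiance: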

[Figure: RenderDoc capture listing the RadianceCache passes.]

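The most important RadianceCache pass is tracing from the probes (TraceFromProbes). Its inputs include textures such as the global distance field and VoxelLighting.

Its output is a 4096x4096 radiance probe atlas plus depth: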

[Figure: probe atlas output by TraceFromProbes (zoomed-in detail).]
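
To make the atlas layout concrete, here is a minimal C++ sketch of how a linear probe index maps to the top-left texel of that probe's tile. It mirrors the ProbeAtlasBaseCoord arithmetic in the TraceFromProbesCS listing below. The type and function names are hypothetical, and the 32-texel probe size used in the note after the code is an assumption chosen only because it divides 4096 evenly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type for illustration; not an engine type.
struct UInt2
{
    std::uint32_t x;
    std::uint32_t y;
};

// Top-left texel of a probe's square tile inside the atlas.
// probeResolution:    texels per probe side.
// atlasWidthInProbes: number of probes stored in one atlas row.
inline UInt2 ProbeAtlasBaseCoordSketch(std::uint32_t probeIndex,
                                       std::uint32_t probeResolution,
                                       std::uint32_t atlasWidthInProbes)
{
    assert(atlasWidthInProbes > 0);
    return UInt2{ probeResolution * (probeIndex % atlasWidthInProbes),
                  probeResolution * (probeIndex / atlasWidthInProbes) };
}
```

Under the assumed 32x32 probe size, a 4096-wide atlas holds 128 probes per row, so probe 129 would start at texel (32, 32).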

The compute shader used by TraceFromProbes is as follows:

// Engine\Shaders\Private\Lumen\LumenRadianceCache.usf

groupshared float3 SharedTraceRadiance[THREADGROUP_SIZE][THREADGROUP_SIZE];
groupshared float SharedTraceHitDistance[THREADGROUP_SIZE][THREADGROUP_SIZE];

[numthreads(THREADGROUP_SIZE, THREADGROUP_SIZE, 1)]
void TraceFromProbesCS(
    uint3 GroupId : SV_GroupID,
    uint2 GroupThreadId : SV_GroupThreadID)
{
    uint TraceTileIndex = GroupId.y * TRACE_TILE_GROUP_STRIDE + GroupId.x;

    if (TraceTileIndex < ProbeTraceTileAllocator[0])
    {
        uint2 TraceTileCoord;
        uint TraceTileLevel;
        uint ProbeTraceIndex;
        // Unpack the trace tile info.
        UnpackTraceTileInfo(ProbeTraceTileData[TraceTileIndex], TraceTileCoord, TraceTileLevel, ProbeTraceIndex);

        uint TraceResolution = (RadianceProbeResolution / 2) << TraceTileLevel;
        // Probe texel coordinate.
        uint2 ProbeTexelCoord = TraceTileCoord * THREADGROUP_SIZE + GroupThreadId.xy;


        float3 ProbeWorldCenter;
        uint ClipmapIndex;
        uint ProbeIndex;
        // Fetch the probe's trace data.
        GetProbeTraceData(ProbeTraceIndex, ProbeWorldCenter, ClipmapIndex, ProbeIndex);

        if (all(ProbeTexelCoord < TraceResolution))
        {
            float2 ProbeTexelCenter = float2(0.5, 0.5);
            float2 ProbeUV = (ProbeTexelCoord + ProbeTexelCenter) / float(TraceResolution);
            float3 WorldConeDirection = OctahedralMapToDirection(ProbeUV);

            float FinalMinTraceDistance = max(MinTraceDistance, GetRadianceProbeTMin(ClipmapIndex));
            float FinalMaxTraceDistance = MaxTraceDistance;
            float EffectiveStepFactor = StepFactor;

            // Distribute the sphere's solid angle evenly among all cones instead of basing it on the octahedron distortion.
            float ConeHalfAngle = acosFast(1.0f - 1.0f / (float)(TraceResolution * TraceResolution));

            // Set up the cone trace input.
            FConeTraceInput TraceInput;
            TraceInput.Setup(
                ProbeWorldCenter, WorldConeDirection,
                ConeHalfAngle, MinSampleRadius,
                FinalMinTraceDistance, FinalMaxTraceDistance,
                EffectiveStepFactor);
            TraceInput.VoxelStepFactor = VoxelStepFactor;

            bool bContinueCardTracing = false;

            TraceInput.VoxelTraceStartDistance = CalculateVoxelTraceStartDistance(FinalMinTraceDistance, FinalMaxTraceDistance, MaxMeshSDFTraceDistance, bContinueCardTracing);

            // Run the cone trace for this probe texel.
            FConeTraceResult TraceResult = TraceForProbeTexel(TraceInput);

            // Store the traced lighting.
            SharedTraceRadiance[GroupThreadId.y][GroupThreadId.x] = TraceResult.Lighting;

            // Store the traced hit distance.
            #if RADIANCE_CACHE_STORE_DEPTHS
                SharedTraceHitDistance[GroupThreadId.y][GroupThreadId.x] = TraceResult.OpaqueHitDistance;
            #endif
        }

        GroupMemoryBarrierWithGroupSync();

        uint2 ProbeAtlasBaseCoord = RadianceProbeResolution * uint2(ProbeIndex % ProbeAtlasResolutionInProbes.x, ProbeIndex / ProbeAtlasResolutionInProbes.x);

        // Write the lighting and the hit distance to the probe atlases.
        if (TraceResolution < RadianceProbeResolution)
        {
            uint UpsampleFactor = RadianceProbeResolution / TraceResolution;
            ProbeAtlasBaseCoord += (THREADGROUP_SIZE * TraceTileCoord + GroupThreadId.xy) * UpsampleFactor;

            float3 Lighting = SharedTraceRadiance[GroupThreadId.y][GroupThreadId.x];

            for (uint Y = 0; Y < UpsampleFactor; Y++)
            {
                for (uint X = 0; X < UpsampleFactor; X++)
                {
                    RWRadianceProbeAtlasTexture[ProbeAtlasBaseCoord + uint2(X, Y)] = Lighting;
                }
            }

            #if RADIANCE_CACHE_STORE_DEPTHS
                float HitDistance = min(SharedTraceHitDistance[GroupThreadId.y][GroupThreadId.x], MaxHalfFloat);

                for (uint Y = 0; Y < UpsampleFactor; Y++)
                {
                    for (uint X = 0; X < UpsampleFactor; X++)
                    {
                        RWDepthProbeAtlasTexture[ProbeAtlasBaseCoord + uint2(X, Y)] = HitDistance;
                    }
                }
            #endif
        }
        else
        {
            uint DownsampleFactor = TraceResolution / RadianceProbeResolution;
            uint WriteTileSize = THREADGROUP_SIZE / DownsampleFactor;

            if (all(GroupThreadId.xy < WriteTileSize))
            {
                float3 Lighting = 0;

                for (uint Y = 0; Y < DownsampleFactor; Y++)
                {
                    for (uint X = 0; X < DownsampleFactor; X++)
                    {
                        Lighting += SharedTraceRadiance[GroupThreadId.y * DownsampleFactor + Y][GroupThreadId.x * DownsampleFactor + X];
                    }
                }

                ProbeAtlasBaseCoord += WriteTileSize * TraceTileCoord + GroupThreadId.xy;
                RWRadianceProbeAtlasTexture[ProbeAtlasBaseCoord] = Lighting / (float)(DownsampleFactor * DownsampleFactor);

                #if RADIANCE_CACHE_STORE_DEPTHS
                    float HitDistance = MaxHalfFloat;

                    for (uint Y = 0; Y < DownsampleFactor; Y++)
                    {
                        for (uint X = 0; X < DownsampleFactor; X++)
                        {
                            HitDistance = min(HitDistance, SharedTraceHitDistance[GroupThreadId.y * DownsampleFactor + Y][GroupThreadId.x * DownsampleFactor + X]);
                        }
                    }

                    RWDepthProbeAtlasTexture[ProbeAtlasBaseCoord] = HitDistance;
                #endif
            }
        }
    }
}
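
The ConeHalfAngle line in the shader above can be sanity-checked numerically. A cone with half angle θ subtends 2π(1 − cos θ) steradians, so with cos θ = 1 − 1/N² each cone covers 2π/N² and the N² cones of one probe sum to 2π. The C++ sketch below reproduces that arithmetic. The function names are made up for illustration, and std::acos stands in for the shader's acosFast approximation.

```cpp
#include <cassert>
#include <cmath>

constexpr double kPi = 3.14159265358979323846;

// Half angle the shader assigns to each cone of an N x N octahedral probe:
// acos(1 - 1 / N^2).
inline double ConeHalfAngleSketch(unsigned traceResolution)
{
    assert(traceResolution > 0);
    const double n2 = double(traceResolution) * double(traceResolution);
    return std::acos(1.0 - 1.0 / n2);
}

// Solid angle in steradians subtended by a cone with the given half angle.
inline double ConeSolidAngle(double halfAngle)
{
    return 2.0 * kPi * (1.0 - std::cos(halfAngle));
}
```

For an 8x8 trace resolution this gives a half angle of roughly 10 degrees. Note that a partition of the full 4π sphere into N² equal cones would instead use cos θ = 1 − 2/N²; the listing does not say whether the narrower cones are intentional.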
           

Next, step into TraceForProbeTexel to analyze the trace call stack for a probe texel:

FConeTraceResult TraceForProbeTexel(FConeTraceInput TraceInput)
{
    // Initialize the trace result struct.
    FConeTraceResult TraceResult;
    TraceResult = (FConeTraceResult)0;
    TraceResult.Lighting = 0.0;
    TraceResult.Transparency = 1.0;
    TraceResult.OpaqueHitDistance = TraceInput.MaxTraceDistance;

    // Cone trace the Lumen scene voxels (analyzed below).
    ConeTraceLumenSceneVoxels(TraceInput, TraceResult);

    // Trace the distant scene.
#if TRACE_DISTANT_SCENE
    if (TraceResult.Transparency > .01f)
    {
        FConeTraceResult DistantTraceResult;
        // Cone trace the Lumen distant scene (analyzed below).
        ConeTraceLumenDistantScene(TraceInput, DistantTraceResult);
        TraceResult.Lighting += DistantTraceResult.Lighting * TraceResult.Transparency;
        TraceResult.Transparency *= DistantTraceResult.Transparency;
    }
#endif

    // Apply the sky light.
#if ENABLE_DYNAMIC_SKY_LIGHT
    if (ReflectionStruct.SkyLightParameters.y > 0)
    {
        float SkyAverageBrightness = 1.0f;
        float Roughness = TanConeAngleToRoughness(tan(TraceInput.ConeAngle));

        TraceResult.Lighting = TraceResult.Lighting + GetSkyLightReflection(TraceInput.ConeDirection, Roughness, SkyAverageBrightness) * TraceResult.Transparency;
    }
#endif

    return TraceResult;
}

// Cone trace the Lumen scene voxels.
void ConeTraceLumenSceneVoxels(
    FConeTraceInput TraceInput,
    inout FConeTraceResult OutResult)
{
#if SCENE_TRACE_VOXELS
    if (TraceInput.VoxelTraceStartDistance < TraceInput.MaxTraceDistance)
    {
        FConeTraceInput VoxelTraceInput = TraceInput;
        VoxelTraceInput.MinTraceDistance = TraceInput.VoxelTraceStartDistance;
        FConeTraceResult VoxelTraceResult;
        // Cone trace the voxels (analyzed earlier).
        ConeTraceVoxels(VoxelTraceInput, VoxelTraceResult);

        // Apply the accumulated transparency.
        #if !VISIBILITY_ONLY_TRACE
            OutResult.Lighting += VoxelTraceResult.Lighting * OutResult.Transparency;
        #endif
        OutResult.Transparency *= VoxelTraceResult.Transparency;
        OutResult.NumSteps += VoxelTraceResult.NumSteps;
        OutResult.OpaqueHitDistance = min(OutResult.OpaqueHitDistance, VoxelTraceResult.OpaqueHitDistance);
    }
#endif
}

// Cone trace the Lumen distant scene.
void ConeTraceLumenDistantScene(
    FConeTraceInput TraceInput,
    inout FConeTraceResult OutResult)
{
    float3 debug = 0;
    TraceInput.MaxTraceDistance = LumenCardScene.DistantSceneMaxTraceDistance;
    TraceInput.bBlackOutSteepIntersections = true;

    FCardTraceBlendState CardTraceBlendState;
    CardTraceBlendState.Initialize(TraceInput.MaxTraceDistance);

    if (LumenCardScene.NumDistantCards > 0)
    {
        // Derive the minimum trace distance from the largest clipmap.
        if (NumClipmapLevels > 0)
        {
            float3 VoxelLightingCenter = ClipmapWorldCenter[NumClipmapLevels - 1].xyz;
            float3 VoxelLightingExtent = ClipmapWorldSamplingExtent[NumClipmapLevels - 1].xyz;

            float3 RayEnd = TraceInput.ConeOrigin + TraceInput.ConeDirection * TraceInput.MaxTraceDistance;
            float2 IntersectionTimes = LineBoxIntersect(TraceInput.ConeOrigin, RayEnd, VoxelLightingCenter - VoxelLightingExtent, VoxelLightingCenter + VoxelLightingExtent);

            // If we are starting inside the voxel clipmaps, move the start of the trace past the voxel clipmaps
            if (IntersectionTimes.x < IntersectionTimes.y && IntersectionTimes.x < .001f)
            {
                TraceInput.MinTraceDistance = IntersectionTimes.y * TraceInput.MaxTraceDistance;
            }
        }

        float TraceEndDistance = TraceInput.MinTraceDistance;

        {
            uint ListIndex = 0;
            uint CardIndex = LumenCardScene.DistantCardIndices[ListIndex];

            // Cone trace a single Lumen card (analyzed below).
            ConeTraceSingleLumenCard(
                TraceInput,
                CardIndex,
                debug,
                TraceEndDistance,
                CardTraceBlendState);
        }
    }

    OutResult = (FConeTraceResult)0;

    // Store the results.
    #if !VISIBILITY_ONLY_TRACE
        OutResult.Lighting = CardTraceBlendState.GetFinalLighting();
    #endif
    OutResult.Transparency = CardTraceBlendState.GetTransparency();
    OutResult.NumSteps = CardTraceBlendState.NumSteps;
    OutResult.NumOverlaps = CardTraceBlendState.NumOverlaps;
    OutResult.OpaqueHitDistance = CardTraceBlendState.OpaqueHitDistance;
    OutResult.Debug = debug;
}

// Cone trace a single Lumen card.
void ConeTraceSingleLumenCard(
    FConeTraceInput TraceInput,
    uint CardIndex,
    inout float3 Debug,
    inout float OutTraceEndDistance,
    inout FCardTraceBlendState CardTraceBlendState)
{
    // Fetch the card data.
    FLumenCardData LumenCardData = GetLumenCardData(CardIndex);

    // Transform the cone into the card's local space.
    float3 LocalConeOrigin = mul(TraceInput.ConeOrigin - LumenCardData.Origin, LumenCardData.WorldToLocalRotation);
    float3 LocalConeDirection = mul(TraceInput.ConeDirection, LumenCardData.WorldToLocalRotation);
    float3 LocalTraceEnd = LocalConeOrigin + LocalConeDirection * TraceInput.MaxTraceDistance;

    // Intersection range of the ray with the card's bounding box.
    float2 IntersectionRange = LineBoxIntersect(LocalConeOrigin, LocalTraceEnd, -LumenCardData.LocalExtent, LumenCardData.LocalExtent);
    IntersectionRange.x = max(IntersectionRange.x, TraceInput.MinTraceDistance / TraceInput.MaxTraceDistance);
    OutTraceEndDistance = IntersectionRange.y * TraceInput.MaxTraceDistance;

    if (IntersectionRange.y > IntersectionRange.x
        && LumenCardData.bVisible)
    {
        {
            // Blend state for this card trace.
            FCardTraceBlendState ConeStepBlendState;
            ConeStepBlendState.Initialize(TraceInput.MaxTraceDistance);

            float StepTime = IntersectionRange.x * TraceInput.MaxTraceDistance;
            float3 SamplePosition = LocalConeOrigin + StepTime * LocalConeDirection;
            float TraceEndDistance = IntersectionRange.y * TraceInput.MaxTraceDistance;

            float IntersectionLength = (IntersectionRange.y - IntersectionRange.x) * TraceInput.MaxTraceDistance;
            float MinStepSize = IntersectionLength / (float)LumenCardScene.MaxConeSteps;

            float PreviousStepTime = StepTime;
            float3 PreviousSamplePosition = SamplePosition;
            // Magic value to prevent linear intersection approximation on first step
            float PreviousHeightfieldZ = -2;

            bool bClampedToEnd = false;
            bool bFoundSurface = false;
            bool bRayAboveSurface = false;
            float IntersectionStepTime = 0;
            float2 IntersectionSamplePositionXY = SamplePosition.xy;
            float IntersectionSlope = 0;

            uint NumStepsPerLoop = 4; // Take 4 samples per loop iteration.
            for (uint StepIndex = 0; StepIndex < LumenCardScene.MaxConeSteps && StepTime < TraceEndDistance; StepIndex += NumStepsPerLoop)
            {
                float SampleRadius = max(TraceInput.ConeStartRadius + TraceInput.TanConeAngle * StepTime, TraceInput.MinSampleRadius);
                float StepSize = max(SampleRadius * TraceInput.StepFactor, MinStepSize);
                float TraceClampDistance = TraceEndDistance - StepSize * .0001f;

                float DepthMip;
                float2 DepthValidRegionScale;
                CalculateMip(SampleRadius, LumenCardData, LumenCardData.LocalExtent, LumenCardData.MaxMip, DepthMip, DepthValidRegionScale);

                // The 4 sample positions.
                float3 SamplePosition1 = LocalConeOrigin + min(StepTime + 0 * StepSize, TraceClampDistance) * LocalConeDirection;
                float3 SamplePosition2 = LocalConeOrigin + min(StepTime + 1 * StepSize, TraceClampDistance) * LocalConeDirection;
                float3 SamplePosition3 = LocalConeOrigin + min(StepTime + 2 * StepSize, TraceClampDistance) * LocalConeDirection;
                float3 SamplePosition4 = LocalConeOrigin + min(StepTime + 3 * StepSize, TraceClampDistance) * LocalConeDirection;

                // The 4 depth atlas UVs.
                float2 DepthAtlasUV1 = CalculateAtlasUV(SamplePosition1.xy, DepthValidRegionScale, LumenCardData);
                float2 DepthAtlasUV2 = CalculateAtlasUV(SamplePosition2.xy, DepthValidRegionScale, LumenCardData);
                float2 DepthAtlasUV3 = CalculateAtlasUV(SamplePosition3.xy, DepthValidRegionScale, LumenCardData);
                float2 DepthAtlasUV4 = CalculateAtlasUV(SamplePosition4.xy, DepthValidRegionScale, LumenCardData);

                // The 4 depth samples.
                float Depth1 = Texture2DSampleLevel(DepthAtlas, TRACING_ATLAS_SAMPLER, DepthAtlasUV1, DepthMip).x;
                float Depth2 = Texture2DSampleLevel(DepthAtlas, TRACING_ATLAS_SAMPLER, DepthAtlasUV2, DepthMip).x;
                float Depth3 = Texture2DSampleLevel(DepthAtlas, TRACING_ATLAS_SAMPLER, DepthAtlasUV3, DepthMip).x;
                float Depth4 = Texture2DSampleLevel(DepthAtlas, TRACING_ATLAS_SAMPLER, DepthAtlasUV4, DepthMip).x;

                // The 4 heightfield Z values.
                float HeightfieldZ1 = LumenCardData.LocalExtent.z - Depth1 * 2 * LumenCardData.LocalExtent.z;
                float HeightfieldZ2 = LumenCardData.LocalExtent.z - Depth2 * 2 * LumenCardData.LocalExtent.z;
                float HeightfieldZ3 = LumenCardData.LocalExtent.z - Depth3 * 2 * LumenCardData.LocalExtent.z;
                float HeightfieldZ4 = LumenCardData.LocalExtent.z - Depth4 * 2 * LumenCardData.LocalExtent.z;

                ConeStepBlendState.RegisterStep(NumStepsPerLoop);

                // Whether each sample is below the heightfield.
                bool4 HeightfieldHit = bool4(
                    SamplePosition1.z < HeightfieldZ1,
                    SamplePosition2.z < HeightfieldZ2,
                    SamplePosition3.z < HeightfieldZ3,
                    SamplePosition4.z < HeightfieldZ4);

                bool bRayBelowHeightfield = any(HeightfieldHit);
                bool bRayWasAboveSurface = bRayAboveSurface;

                if (!bRayBelowHeightfield)
                {
                    bRayAboveSurface = true;
                }

                // Traces that start below the heightfield must get above it before they can register a hit.
                if (bRayBelowHeightfield && bRayWasAboveSurface)
                {
                    float HeightfieldZ;
                    if (HeightfieldHit.x)
                    {
                        SamplePosition = SamplePosition1;
                        HeightfieldZ = HeightfieldZ1;
                        StepTime = StepTime + 0 * StepSize;
                    }
                    else if (HeightfieldHit.y)
                    {
                        PreviousSamplePosition = SamplePosition1;
                        PreviousHeightfieldZ = HeightfieldZ1;
                        PreviousStepTime = StepTime + 0 * StepSize;

                        SamplePosition = SamplePosition2;
                        HeightfieldZ = HeightfieldZ2;
                        StepTime = StepTime + 1 * StepSize;
                    }
                    else if (HeightfieldHit.z)
                    {
                        PreviousSamplePosition = SamplePosition2;
                        PreviousHeightfieldZ = HeightfieldZ2;
                        PreviousStepTime = StepTime + 1 * StepSize;

                        SamplePosition = SamplePosition3;
                        HeightfieldZ = HeightfieldZ3;
                        StepTime = StepTime + 2 * StepSize;
                    }
                    else
                    {
                        PreviousSamplePosition = SamplePosition3;
                        PreviousHeightfieldZ = HeightfieldZ3;
                        PreviousStepTime = StepTime + 2 * StepSize;

                        SamplePosition = SamplePosition4;
                        HeightfieldZ = HeightfieldZ4;
                        StepTime = StepTime + 3 * StepSize;
                    }

                    StepTime = min(StepTime, TraceClampDistance);

                    if (PreviousHeightfieldZ != -2)
                    {
                        // Solve for the intersection step time by linear interpolation between the two samples.
                        IntersectionStepTime = PreviousStepTime + ((PreviousSamplePosition.z - PreviousHeightfieldZ) * (StepTime - PreviousStepTime)) / (HeightfieldZ - PreviousHeightfieldZ + PreviousSamplePosition.z - SamplePosition.z);

                        float2 LocalPositionSlopeXY = (SamplePosition.xy - PreviousSamplePosition.xy) / (StepTime - PreviousStepTime);
                        IntersectionSamplePositionXY = LocalPositionSlopeXY * (IntersectionStepTime - PreviousStepTime) + PreviousSamplePosition.xy;

                        IntersectionSlope = abs(PreviousHeightfieldZ - HeightfieldZ) / max(length(PreviousSamplePosition.xy - SamplePosition.xy), .0001f);

                        PreviousHeightfieldZ = -2;
                        // Found the surface.
                        bFoundSurface = true;
                    }
                    break;
                }

                PreviousStepTime = StepTime + 3 * StepSize;
                PreviousSamplePosition = SamplePosition4;
                PreviousHeightfieldZ = HeightfieldZ4;
                StepTime += 4 * StepSize;

                if (StepTime >= TraceEndDistance && !bClampedToEnd)
                {
                    bClampedToEnd = true;
                    // Stop the last step just before the intersection end, since the linear approximation needs to step past the surface to detect a hit, without terminating the loop
                    StepTime = TraceClampDistance;
                }
            }

            // If a surface point was found.
            if (bFoundSurface)
            {
                float IntersectionSampleRadius = TraceInput.ConeStartRadius + TraceInput.TanConeAngle * IntersectionStepTime;

                float MaxMip;
                float2 ValidRegionScale;
                CalculateMip(IntersectionSampleRadius, LumenCardData, LumenCardData.LocalExtent, LumenCardData.MaxMip, MaxMip, ValidRegionScale);

                float2 IntersectionAtlasUV = CalculateAtlasUV(IntersectionSamplePositionXY, ValidRegionScale, LumenCardData);

                float DistanceToSurface = 0;
                float ConeIntersectSurface = saturate(DistanceToSurface / IntersectionSampleRadius);
                float ConeVisibility = ConeIntersectSurface;

                float MaxDistanceFade = 1;

                ConeStepBlendState.RegisterOpaqueHit(IntersectionStepTime);
                OutTraceEndDistance = IntersectionStepTime;

                float Opacity = Texture2DSampleLevel(OpacityAtlas, TRACING_ATLAS_SAMPLER, IntersectionAtlasUV, MaxMip).x;
                float ConeOcclusion = (1.0f - ConeVisibility) * Opacity * MaxDistanceFade;

                #if VISIBILITY_ONLY_TRACE
                    float3 StepLighting = 0;
                #else
                    float3 StepLighting = Texture2DSampleLevel(FinalLightingAtlas, TRACING_ATLAS_SAMPLER, IntersectionAtlasUV, MaxMip).rgb;
                #endif
            
                if (TraceInput.bBlackOutSteepIntersections)
                {
                    // Assume steep parts are covered by other cards, and fade them out.
                    float SlopeFade = 1 - saturate((IntersectionSlope - 5) / 1.0f);
                    StepLighting = lerp(0, StepLighting, SlopeFade);
                    ConeOcclusion = lerp(0, ConeOcclusion, SlopeFade);
                }

                ConeStepBlendState.AddLighting(StepLighting, ConeOcclusion, IntersectionStepTime);
            }

            CardTraceBlendState.AddCardTrace(ConeStepBlendState);
        }
    }
}
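
The heightfield hit in ConeTraceSingleLumenCard is refined with a linear approximation: between the last sample above the surface and the first sample below it, both the ray height and the heightfield height are treated as straight lines, and the crossing point is solved for. The C++ sketch below isolates that algebra together with the steep-slope fade. The struct and function names are hypothetical; only the formulas are taken from the listing above.

```cpp
#include <algorithm>
#include <cassert>

// One ray sample against a card heightfield (hypothetical struct for illustration).
struct HeightfieldSample
{
    float stepTime;     // distance travelled along the ray
    float rayZ;         // ray height in card-local space
    float heightfieldZ; // surface height decoded from the depth atlas
};

// Linear estimate of where the ray crosses the surface between a sample above it
// (prev) and a sample below it (curr). Same algebra as IntersectionStepTime.
inline float IntersectStepTime(const HeightfieldSample& prev, const HeightfieldSample& curr)
{
    const float denom = curr.heightfieldZ - prev.heightfieldZ + prev.rayZ - curr.rayZ;
    assert(denom > 0.0f); // holds when prev is above the surface and curr is below it
    return prev.stepTime
        + (prev.rayZ - prev.heightfieldZ) * (curr.stepTime - prev.stepTime) / denom;
}

// Fade factor for steep intersections: 1 up to slope 5, falling linearly to 0 at slope 6.
inline float SlopeFade(float intersectionSlope)
{
    return 1.0f - std::min(std::max(intersectionSlope - 5.0f, 0.0f), 1.0f);
}
```

For example, a ray that drops from height 1 to height -1 over two units of distance above a flat surface at height 0 crosses it at step time 1, halfway between the two samples.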
           

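As the code above shows, the RadianceCache stage goes through an intricate series of rendering passes. TraceFromProbes alone first cone traces the voxel lighting, then the distant scene cards, and finally adds the sky light contribution.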
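
The three sources are combined front to back: each farther layer is weighted by the transparency left over from the nearer ones. The following C++ sketch shows that accumulation with a single-channel stand-in for FConeTraceResult. The struct and function names are hypothetical, and the sample values are arbitrary.

```cpp
#include <cassert>

// Hypothetical single-channel stand-in for FConeTraceResult.
struct TraceResultSketch
{
    float lighting;     // accumulated radiance
    float transparency; // fraction of the cone that is still unoccluded
};

// Blend a farther layer behind what has been accumulated so far.
inline void CompositeBehind(TraceResultSketch& accum, const TraceResultSketch& layer)
{
    assert(accum.transparency >= 0.0f && accum.transparency <= 1.0f);
    accum.lighting += layer.lighting * accum.transparency;
    accum.transparency *= layer.transparency;
}

// Sky light fills whatever transparency remains and leaves transparency unchanged.
inline void AddSkyLight(TraceResultSketch& accum, float skyRadiance)
{
    accum.lighting += skyRadiance * accum.transparency;
}
```

Starting from lighting 0 and transparency 1, compositing a voxel layer of (1.0, 0.5), a distant layer of (2.0, 0.5), and a sky radiance of 4.0 yields a final lighting of 3.0. In the shader the distant scene is only traced while the remaining transparency exceeds 0.01, which skips work that could no longer change the result visibly.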

  • TraceScreenProbes

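TraceScreenProbes covers the screen traces for the probes, the mesh distance field traces, the voxel lighting traces, and so on. The code is as follows: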

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenScreenProbeTracing.cpp

void TraceScreenProbes(
    FRDGBuilder& GraphBuilder, 
    const FScene* Scene,
    const FViewInfo& View, 
    bool bTraceMeshSDFs,
    TRDGUniformBufferRef<FSceneTextureUniformParameters> SceneTexturesUniformBuffer,
    const ScreenSpaceRayTracing::FPrevSceneColorMip& PrevSceneColor,
    FRDGTextureRef LightingChannelsTexture,
    const FLumenCardTracingInputs& TracingInputs,
    const LumenRadianceCache::FRadianceCacheInterpolationParameters& RadianceCacheParameters,
    FScreenProbeParameters& ScreenProbeParameters,
    FLumenMeshSDFGridParameters& MeshSDFGridParameters)
{
    const FSceneTextureParameters SceneTextures = GetSceneTextureParameters(GraphBuilder, SceneTexturesUniformBuffer);

    // Clear the probe traces.
    {
        FClearTracesCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FClearTracesCS::FParameters>();
        PassParameters->ScreenProbeParameters = ScreenProbeParameters;

        auto ComputeShader = View.ShaderMap->GetShader<FClearTracesCS>(0);

        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("ClearTraces %ux%u", ScreenProbeParameters.ScreenProbeTracingOctahedronResolution, ScreenProbeParameters.ScreenProbeTracingOctahedronResolution),
            ComputeShader,
            PassParameters,
            ScreenProbeParameters.ProbeIndirectArgs,
            (uint32)EScreenProbeIndirectArgs::ThreadPerTrace * sizeof(FRHIDispatchIndirectParameters));
    }

    FLumenIndirectTracingParameters IndirectTracingParameters;
    SetupLumenDiffuseTracingParameters(IndirectTracingParameters);

    const bool bTraceScreen = View.PrevViewInfo.ScreenSpaceRayTracingInput.IsValid() 
        && GLumenScreenProbeGatherScreenTraces != 0
        && !View.Family->EngineShowFlags.VisualizeLumenIndirectDiffuse;

    // Screen-space traces for the probes.
    if (bTraceScreen)
    {
        FScreenProbeTraceScreenTexturesCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeTraceScreenTexturesCS::FParameters>();

        ScreenSpaceRayTracing::SetupCommonScreenSpaceRayParameters(GraphBuilder, SceneTextures, PrevSceneColor, View, /* out */ &PassParameters->ScreenSpaceRayParameters);

        PassParameters->ScreenSpaceRayParameters.CommonDiffuseParameters.SceneTextures = SceneTextures;

        {
            const FVector2D HZBUvFactor(
                float(View.ViewRect.Width()) / float(2 * View.HZBMipmap0Size.X),
                float(View.ViewRect.Height()) / float(2 * View.HZBMipmap0Size.Y));

            const FVector4 ScreenPositionScaleBias = View.GetScreenPositionScaleBias(SceneTextures.SceneDepthTexture->Desc.Extent, View.ViewRect);
            const FVector2D HZBUVToScreenUVScale = FVector2D(1.0f / HZBUvFactor.X, 1.0f / HZBUvFactor.Y) * FVector2D(2.0f, -2.0f) * FVector2D(ScreenPositionScaleBias.X, ScreenPositionScaleBias.Y);
            const FVector2D HZBUVToScreenUVBias = FVector2D(-1.0f, 1.0f) * FVector2D(ScreenPositionScaleBias.X, ScreenPositionScaleBias.Y) + FVector2D(ScreenPositionScaleBias.W, ScreenPositionScaleBias.Z);
            PassParameters->HZBUVToScreenUVScaleBias = FVector4(HZBUVToScreenUVScale, HZBUVToScreenUVBias);
        }

        checkf(View.ClosestHZB, TEXT("Lumen screen tracing: ClosestHZB was not setup, should have been setup by FDeferredShadingSceneRenderer::RenderHzb"));
        PassParameters->ClosestHZBTexture = View.ClosestHZB;
        PassParameters->SceneDepthTexture = SceneTextures.SceneDepthTexture;
        PassParameters->LightingChannelsTexture = LightingChannelsTexture;
        PassParameters->HZBBaseTexelSize = FVector2D(1.0f / View.ClosestHZB->Desc.Extent.X, 1.0f / View.ClosestHZB->Desc.Extent.Y);
        PassParameters->MaxHierarchicalScreenTraceIterations = GLumenScreenProbeGatherHierarchicalScreenTracesMaxIterations;
        PassParameters->UncertainTraceRelativeDepthThreshold = GLumenScreenProbeGatherUncertainTraceRelativeDepthThreshold;
        PassParameters->NumThicknessStepsToDetermineCertainty = GLumenScreenProbeGatherNumThicknessStepsToDetermineCertainty;

        PassParameters->ScreenProbeParameters = ScreenProbeParameters;
        PassParameters->IndirectTracingParameters = IndirectTracingParameters;
        PassParameters->RadianceCacheParameters = RadianceCacheParameters;

        FScreenProbeTraceScreenTexturesCS::FPermutationDomain PermutationVector;
        PermutationVector.Set< FScreenProbeTraceScreenTexturesCS::FRadianceCache >(LumenScreenProbeGather::UseRadianceCache(View));
        PermutationVector.Set< FScreenProbeTraceScreenTexturesCS::FHierarchicalScreenTracing >(GLumenScreenProbeGatherHierarchicalScreenTraces != 0);
        PermutationVector.Set< FScreenProbeTraceScreenTexturesCS::FStructuredImportanceSampling >(LumenScreenProbeGather::UseImportanceSampling(View));
        auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeTraceScreenTexturesCS>(PermutationVector);

        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("TraceScreen"),
            ComputeShader,
            PassParameters,
            ScreenProbeParameters.ProbeIndirectArgs,
            (uint32)EScreenProbeIndirectArgs::ThreadPerTrace * sizeof(FRHIDispatchIndirectParameters));
    }

    // Trace the mesh distance fields.
    if (bTraceMeshSDFs)
    {
        // Hardware ray tracing path.
        if (Lumen::UseHardwareRayTracedScreenProbeGather())
        {
            FCompactedTraceParameters CompactedTraceParameters = CompactTraces(
                GraphBuilder,
                View,
                ScreenProbeParameters,
                WORLD_MAX,
                IndirectTracingParameters.MaxTraceDistance);

            RenderHardwareRayTracingScreenProbe(GraphBuilder,
                Scene,
                SceneTextures,
                ScreenProbeParameters,
                View,
                TracingInputs,
                IndirectTracingParameters,
                RadianceCacheParameters,
                CompactedTraceParameters);
        }
        // Software tracing path.
        else
        {
            CullForCardTracing(
                GraphBuilder,
                Scene, View,
                TracingInputs,
                IndirectTracingParameters,
                /* out */ MeshSDFGridParameters);

            if (MeshSDFGridParameters.TracingParameters.DistanceFieldObjectBuffers.NumSceneObjects > 0)
            {
                FCompactedTraceParameters CompactedTraceParameters = CompactTraces(
                    GraphBuilder,
                    View,
                    ScreenProbeParameters,
                    IndirectTracingParameters.CardTraceEndDistanceFromCamera,
                    IndirectTracingParameters.MaxMeshSDFTraceDistance);

                {
                    FScreenProbeTraceMeshSDFsCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeTraceMeshSDFsCS::FParameters>();
                    GetLumenCardTracingParameters(View, TracingInputs, PassParameters->TracingParameters);
                    PassParameters->MeshSDFGridParameters = MeshSDFGridParameters;
                    PassParameters->ScreenProbeParameters = ScreenProbeParameters;
                    PassParameters->IndirectTracingParameters = IndirectTracingParameters;
                    PassParameters->SceneTexturesStruct = SceneTexturesUniformBuffer;
                    PassParameters->CompactedTraceParameters = CompactedTraceParameters;

                    FScreenProbeTraceMeshSDFsCS::FPermutationDomain PermutationVector;
                    PermutationVector.Set< FScreenProbeTraceMeshSDFsCS::FStructuredImportanceSampling >(LumenScreenProbeGather::UseImportanceSampling(View));
                    auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeTraceMeshSDFsCS>(PermutationVector);

                    FComputeShaderUtils::AddPass(
                        GraphBuilder,
                        RDG_EVENT_NAME("TraceMeshSDFs"),
                        ComputeShader,
                        PassParameters,
                        CompactedTraceParameters.IndirectArgs,
                        0);
                }
            }
        }
    }

    // Compact the traces.
    FCompactedTraceParameters CompactedTraceParameters = CompactTraces(
        GraphBuilder,
        View,
        ScreenProbeParameters,
        WORLD_MAX,
        // Make sure the shader runs on all misses to apply radiance cache + skylight
        IndirectTracingParameters.MaxTraceDistance + 1);

    // Trace the voxel lighting.
    {
        FScreenProbeTraceVoxelsCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeTraceVoxelsCS::FParameters>();
        PassParameters->RadianceCacheParameters = RadianceCacheParameters;
        GetLumenCardTracingParameters(View, TracingInputs, PassParameters->TracingParameters);
        PassParameters->ScreenProbeParameters = ScreenProbeParameters;
        PassParameters->IndirectTracingParameters = IndirectTracingParameters;
        PassParameters->SceneTexturesStruct = SceneTexturesUniformBuffer;
        PassParameters->CompactedTraceParameters = CompactedTraceParameters;

        const bool bRadianceCache = LumenScreenProbeGather::UseRadianceCache(View);

        FScreenProbeTraceVoxelsCS::FPermutationDomain PermutationVector;
        PermutationVector.Set< FScreenProbeTraceVoxelsCS::FDynamicSkyLight >(Lumen::ShouldHandleSkyLight(Scene, *View.Family));
        PermutationVector.Set< FScreenProbeTraceVoxelsCS::FTraceDistantScene >(Scene->LumenSceneData->DistantCardIndices.Num() > 0);
        PermutationVector.Set< FScreenProbeTraceVoxelsCS::FRadianceCache >(bRadianceCache);
        PermutationVector.Set< FScreenProbeTraceVoxelsCS::FStructuredImportanceSampling >(LumenScreenProbeGather::UseImportanceSampling(View));
        auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeTraceVoxelsCS>(PermutationVector);

        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("TraceVoxels"),
            ComputeShader,
            PassParameters,
            CompactedTraceParameters.IndirectArgs,
            0);
    }

    if (GLumenScreenProbeGatherVisualizeTraces)
    {
        SetupVisualizeTraces(GraphBuilder, Scene, View, ScreenProbeParameters);
    }
}
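The CompactTraces call above gathers only the trace texels that still need work into a dense list, so that the following pass can be dispatched indirectly over exactly those texels. The following is a minimal CPU-side sketch of that idea, not engine code: the `TraceTexel` struct, the `CompactTraceTexels` function, and the exact selection criterion (no hit yet, within a camera distance, and below a maximum trace distance) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-texel trace state; the engine keeps this packed in textures.
struct TraceTexel {
    std::uint32_t probeIndex;
    float sceneDepth;   // depth of the probe that owns this texel
    float hitDistance;  // distance already covered by earlier trace passes
    bool hit;           // true if an earlier pass already found a hit
};

// Returns the indices of the texels that still need tracing (assumed criterion).
std::vector<std::uint32_t> CompactTraceTexels(
    const std::vector<TraceTexel>& texels,
    float endDistanceFromCamera,
    float maxTraceDistance)
{
    assert(maxTraceDistance >= 0.0f);
    std::vector<std::uint32_t> compacted;
    for (std::uint32_t i = 0; i < texels.size(); ++i) {
        const TraceTexel& t = texels[i];
        if (!t.hit && t.sceneDepth < endDistanceFromCamera && t.hitDistance < maxTraceDistance) {
            compacted.push_back(i);  // the GPU version would append via an atomic counter
        }
    }
    return compacted;
}
```

Under this reading, passing a very large camera distance and `MaxTraceDistance + 1`, as the call above does, keeps every texel that has not hit anything, which matches the source comment about running the shader on all misses.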
           

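First, analyze TraceScreen with the help of frame-capture data. Its inputs are textures such as BlueNoise, Velocity, depth, probe speed, ray info, HZB, and SSRReducedSceneColor; its outputs are the TraceRadiance texture in R11G11B10 format and the TraceHit texture in R32 format:

Left: TraceRadiance; right: TraceHit.

The compute shader used by TraceScreen is as follows: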

// Engine\Shaders\Private\Lumen\LumenScreenProbeTracing.usf

[numthreads(PROBE_THREADGROUP_SIZE_2D, PROBE_THREADGROUP_SIZE_2D, 1)]
void ScreenProbeTraceScreenTexturesCS(
    uint3 GroupId : SV_GroupID,
    uint3 DispatchThreadId : SV_DispatchThreadID,
    uint3 GroupThreadId : SV_GroupThreadID)
{
#define DEINTERLEAVED_SCREEN_TRACING 1
    // Compute the texture coordinates.
#if DEINTERLEAVED_SCREEN_TRACING
    uint2 AtlasSizeInProbes = uint2(ScreenProbeAtlasViewSize.x, (GetNumScreenProbes() + ScreenProbeAtlasViewSize.x - 1) / ScreenProbeAtlasViewSize.x);
    uint2 ScreenProbeAtlasCoord = DispatchThreadId.xy % AtlasSizeInProbes;
    uint2 TraceTexelCoord = DispatchThreadId.xy / AtlasSizeInProbes;
#else
    uint2 ScreenProbeAtlasCoord = DispatchThreadId.xy / ScreenProbeTracingOctahedronResolution;
    uint2 TraceTexelCoord = DispatchThreadId.xy - ScreenProbeAtlasCoord * ScreenProbeTracingOctahedronResolution;
#endif

    uint ScreenProbeIndex = ScreenProbeAtlasCoord.y * ScreenProbeAtlasViewSize.x + ScreenProbeAtlasCoord.x;

    uint2 ScreenProbeScreenPosition = GetScreenProbeScreenPosition(ScreenProbeIndex);
    uint2 ScreenTileCoord = GetScreenTileCoord(ScreenProbeScreenPosition);

    if (ScreenProbeIndex < GetNumScreenProbes() && all(TraceTexelCoord < ScreenProbeTracingOctahedronResolution))
    {
        float2 ScreenUV = GetScreenUVFromScreenProbePosition(ScreenProbeScreenPosition);
        float SceneDepth = GetScreenProbeDepth(ScreenProbeAtlasCoord);

        if (SceneDepth > 0.0f)
        {
            float3 WorldPosition = GetWorldPositionFromScreenUV(ScreenUV, SceneDepth);

            float2 ProbeUV;
            float ConeHalfAngle;
            // Get the probe tracing UV.
            GetProbeTracingUV(ScreenProbeAtlasCoord, TraceTexelCoord, GetProbeTexelCenter(ScreenTileCoord), 1, ProbeUV, ConeHalfAngle);

            float3 WorldConeDirection = OctahedralMapToDirection(ProbeUV);

            float DepthThresholdScale = HasDistanceFieldRepresentation(ScreenUV) ? 1.0f : ScreenTraceNoFallbackThicknessScale;

            {
                float TraceDistance = MaxTraceDistance;
                bool bCoveredByRadianceCache = false;
                #if RADIANCE_CACHE
                    float ProbeOcclusionDistance = GetRadianceProbeOcclusionDistanceWithInterpolation(WorldPosition, WorldConeDirection, bCoveredByRadianceCache);
                    TraceDistance = min(TraceDistance, ProbeOcclusionDistance);
                #endif


#if HIERARCHICAL_SCREEN_TRACING // Hierarchical screen tracing

                bool bHit;
                bool bUncertain;
                float3 HitUVz;

                // Screen trace.
                TraceScreen(
                    WorldPosition + View.PreViewTranslation,
                    WorldConeDirection,
                    TraceDistance,
                    HZBUvFactorAndInvFactor,
                    MaxHierarchicalScreenTraceIterations, 
                    UncertainTraceRelativeDepthThreshold * DepthThresholdScale,
                    NumThicknessStepsToDetermineCertainty,
                    bHit,
                    bUncertain,
                    HitUVz);
                
                float Level = 1;
                bool bWriteDepthOnMiss = true;
#else // Non-hierarchical screen tracing
    
                uint NumSteps = 16;
                float StartMipLevel = 1.0f;
                float MaxScreenTraceFraction = .2f;

                // With a fixed step count, screen traces only give good quality if the trace distance is limited.
                float MaxWorldTraceDistance = SceneDepth * MaxScreenTraceFraction * 2.0 * GetTanHalfFieldOfView().x;
                TraceDistance = min(TraceDistance, MaxWorldTraceDistance);

                uint2 NoiseCoord = ScreenProbeAtlasCoord * ScreenProbeTracingOctahedronResolution + TraceTexelCoord;
                float StepOffset = InterleavedGradientNoise(NoiseCoord + 0.5f, 0);

                float RayRoughness = .2f;
                StepOffset = StepOffset - .9f;

                FSSRTCastingSettings CastSettings = CreateDefaultCastSettings();
                CastSettings.bStopWhenUncertain = true;

                bool bHit = false;
                float Level;
                float3 HitUVz;
                bool bRayWasClipped;

                // Initialize the screen-space ray from the world-space ray.
                FSSRTRay Ray = InitScreenSpaceRayFromWorldSpace(
                    WorldPosition + View.PreViewTranslation, WorldConeDirection,
                    /* WorldTMax = */ TraceDistance,
                    /* SceneDepth = */ SceneDepth,
                    /* SlopeCompareToleranceScale */ 2.0f * DepthThresholdScale,
                    /* bExtendRayToScreenBorder = */ false,
                    /* out */ bRayWasClipped);

                bool bUncertain;
                float3 DebugOutput;

                // Cast the screen-space ray.
                CastScreenSpaceRay(
                    FurthestHZBTexture, FurthestHZBTextureSampler,
                    StartMipLevel,
                    CastSettings,
                    Ray, RayRoughness, NumSteps, StepOffset,
                    HZBUvFactorAndInvFactor, false,
                    /* out */ DebugOutput,
                    /* out */ HitUVz,
                    /* out */ Level,
                    /* out */ bHit,
                    /* out */ bUncertain);

                // CastScreenSpaceRay skips Mesh SDF tracing in a lot of places where it shouldn't, in particular missing thin occluders due to low NumSteps.  
                bool bWriteDepthOnMiss = !bUncertain;

#endif
                bHit = bHit && !bUncertain;

                uint2 TraceCoord = GetTraceBufferCoord(ScreenProbeAtlasCoord, TraceTexelCoord);
                bool bFastMoving = false;

                // Handle the hit case.
                if (bHit)
                {
                    float2 ReducedColorUV = HitUVz.xy * ColorBufferScaleBias.xy + ColorBufferScaleBias.zw;
                    ReducedColorUV = min(ReducedColorUV, ReducedColorUVMax);

                    float3 Lighting = ColorTexture.SampleLevel(ColorTextureSampler, ReducedColorUV, Level).rgb;
                    
                    #if DEBUG_VISUALIZE_TRACE_TYPES
                        RWTraceRadiance[TraceCoord] = float3(.5f, 0, 0) * View.PreExposure;
                    #else
                        RWTraceRadiance[TraceCoord] = Lighting;
                    #endif

                    float3 HitWorldVelocity;
                    {
                        float2 HitScreenUV = HitUVz.xy;
                        float2 HitScreenPosition = (HitScreenUV.xy - View.ScreenPositionScaleBias.wz) / View.ScreenPositionScaleBias.xy;

                        float HitDeviceZ = HitUVz.z;
                        float HitSceneDepth = ConvertFromDeviceZ(HitUVz.z);
                        float3 HitHistoryScreenPosition = GetHistoryScreenPosition(HitScreenPosition, HitScreenUV, HitDeviceZ);

                        float3 HitTranslatedWorldPosition = mul(float4(HitScreenPosition * HitSceneDepth, HitSceneDepth, 1), View.ScreenToTranslatedWorld).xyz;
                        HitWorldVelocity = HitTranslatedWorldPosition - GetPrevTranslatedWorldPosition(HitHistoryScreenPosition);
                    }

                    float ProbeWorldSpeed = ScreenProbeWorldSpeed.Load(int3(ScreenProbeAtlasCoord, 0)).x;
                    float HitWorldSpeed = length(HitWorldVelocity);

                    bFastMoving = abs(ProbeWorldSpeed - HitWorldSpeed) / max(SceneDepth, 100.0f) > RelativeSpeedDifferenceToConsiderLightingMoving;
                }

                // Store the hit distance on a hit, or when writing depth on a miss is requested.
                if (bHit || bWriteDepthOnMiss)
                {
                    float HitDistance = min(sqrt(ComputeRayHitSqrDistance(WorldPosition + View.PreViewTranslation, HitUVz)), MaxTraceDistance);
                    RWTraceHit[TraceCoord] = EncodeProbeRayDistance(HitDistance, bHit, bFastMoving);
                }
            }
        }
    }
}
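The `DEINTERLEAVED_SCREEN_TRACING` branch at the top of the shader changes how a dispatch thread is mapped to a (probe, trace texel) pair: the modulo picks the probe and the division picks the texel, so neighbouring threads handle the same octahedral texel of neighbouring probes. The sketch below reproduces both mappings on the CPU. `UInt2`, `TraceMapping`, and the two function names are made up for this illustration; the suggestion that the deinterleaved layout is meant to keep neighbouring rays coherent is an assumption, not something the source states.

```cpp
#include <cassert>
#include <cstdint>

struct UInt2 { std::uint32_t x, y; };

struct TraceMapping {
    UInt2 probeAtlasCoord;  // which probe in the atlas
    UInt2 traceTexelCoord;  // which octahedral texel (direction) of that probe
};

// Deinterleaved: neighbouring threads map to neighbouring probes, same texel.
TraceMapping MapDeinterleaved(UInt2 threadId, UInt2 atlasSizeInProbes)
{
    assert(atlasSizeInProbes.x > 0 && atlasSizeInProbes.y > 0);
    return { { threadId.x % atlasSizeInProbes.x, threadId.y % atlasSizeInProbes.y },
             { threadId.x / atlasSizeInProbes.x, threadId.y / atlasSizeInProbes.y } };
}

// Interleaved: neighbouring threads map to neighbouring texels of the same probe.
TraceMapping MapInterleaved(UInt2 threadId, std::uint32_t octahedronResolution)
{
    assert(octahedronResolution > 0);
    const UInt2 probe = { threadId.x / octahedronResolution, threadId.y / octahedronResolution };
    return { probe,
             { threadId.x - probe.x * octahedronResolution,
               threadId.y - probe.y * octahedronResolution } };
}
```

In both layouts the shader still has to reject threads whose probe index or texel coordinate falls outside the valid range, which is what the `ScreenProbeIndex < GetNumScreenProbes() && all(TraceTexelCoord < ScreenProbeTracingOctahedronResolution)` test does.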
           

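ScreenProbeTraceScreenTexturesCS takes one of two screen-tracing paths depending on HIERARCHICAL_SCREEN_TRACING. The frame capture shows HIERARCHICAL_SCREEN_TRACING set to 1, so the shader goes through TraceScreen rather than CastScreenSpaceRay. TraceScreen is analyzed below: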

// Engine\Shaders\Private\Lumen\LumenScreenTracing.ush

// Traces screen space by traversing the HZB; accurate but relatively slow.
void TraceScreen(
    float3 RayTranslatedWorldOrigin, 
    float3 RayWorldDirection,
    float MaxWorldTraceDistance,
    float4 HZBUvFactorAndInvFactor,
    float MaxIterations,
    float UncertainTraceRelativeDepthThreshold,
    float NumThicknessStepsToDetermineCertainty,
    inout bool bHit,
    inout bool bUncertain,
    inout float3 OutScreenUV)
{
    // Compute the screen UV of the ray start.
    float3 RayStartScreenUV;
    {
        float4 RayStartClip = mul(float4(RayTranslatedWorldOrigin, 1.0f), View.TranslatedWorldToClip);
        float3 RayStartScreenPosition = RayStartClip.xyz / max(RayStartClip.w, 1.0f);
        RayStartScreenUV = float3((RayStartScreenPosition.xy * float2(0.5f, -0.5f) + 0.5f) * HZBUvFactorAndInvFactor.xy, RayStartScreenPosition.z);
    }
    
    // Compute the screen UV of the ray end.
    float3 RayEndScreenUV;
    {
        float3 ViewRayDirection = mul(float4(RayWorldDirection, 0.0), View.TranslatedWorldToView).xyz;
        float SceneDepth = mul(float4(RayTranslatedWorldOrigin, 1.0f), View.TranslatedWorldToView).z;
        // Clamp the ray so it ends at the Z == 0 plane, which keeps the end point valid in NDC space.
        float RayEndWorldDistance = ViewRayDirection.z < 0.0 ? min(-0.99f * SceneDepth / ViewRayDirection.z, MaxWorldTraceDistance) : MaxWorldTraceDistance;

        float3 RayWorldEnd = RayTranslatedWorldOrigin + RayWorldDirection * RayEndWorldDistance;
        float4 RayEndClip = mul(float4(RayWorldEnd, 1.0f), View.TranslatedWorldToClip);
        float3 RayEndScreenPosition = RayEndClip.xyz / RayEndClip.w;
        RayEndScreenUV = float3((RayEndScreenPosition.xy * float2(0.5f, -0.5f) + 0.5f) * HZBUvFactorAndInvFactor.xy, RayEndScreenPosition.z);

        float2 ScreenEdgeIntersections = LineBoxIntersect(RayStartScreenUV, RayEndScreenUV, float3(0, 0, 0), float3(HZBUvFactorAndInvFactor.xy, 1));

        // Recompute the end point where the ray leaves the screen.
        RayEndScreenUV = RayStartScreenUV + (RayEndScreenUV - RayStartScreenUV) * ScreenEdgeIntersections.y;
    }

    float BaseMipLevel = HZB_TRACE_INCLUDE_FULL_RES_DEPTH ? -1 : 0;
    float MipLevel = BaseMipLevel;

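    // Step out of the current tile without hit testing to avoid self-occlusion. This is necessary because HZB mip 0 holds the closest depth of each 2x2 block, and the HZB is stored as 16-bit floats.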
    bool bStepOutOfCurrentTile = true;
    if (bStepOutOfCurrentTile)
    {
        float2 HZBTileSize = exp2(MipLevel) * HZBBaseTexelSize;
        float2 BiasedUV = RayStartScreenUV.xy;
        float3 HZBTileMin = float3(floor(BiasedUV.xy / HZBTileSize) * HZBTileSize, 0.0f);
        float3 HZBTileMax = float3(HZBTileMin.xy + HZBTileSize, 1);
        float2 TileIntersections = LineBoxIntersect(RayStartScreenUV, RayEndScreenUV, HZBTileMin, HZBTileMax);

        {
            float3 RayTileHit = RayStartScreenUV + (RayEndScreenUV - RayStartScreenUV) * TileIntersections.y;
            RayStartScreenUV = RayTileHit;
        }
    }

    bHit = false;
    bUncertain = false;

    float RayLength2D = length(RayEndScreenUV.xy - RayStartScreenUV.xy);
    float2 RayDirectionScreenUV = (RayEndScreenUV.xy - RayStartScreenUV.xy) / max(RayLength2D, .0001f);
    float3 RayScreenUV = RayStartScreenUV;
    float NumIterations = 0;
    
    // Stackless HZB traversal.
    while (MipLevel >= BaseMipLevel && NumIterations < MaxIterations)
    {
        float2 HZBTileSize = exp2(MipLevel) * HZBBaseTexelSize;
        // RayScreenUV is on a tile boundary due to bStepOutOfCurrentTile
        // Offset the UV along the ray direction so it always quantizes to the next tile
        float2 BiasedUV = RayScreenUV.xy + .01f * RayDirectionScreenUV.xy * HZBTileSize;
        float3 HZBTileMin = float3(floor(BiasedUV / HZBTileSize) * HZBTileSize, 0.0f);
        float3 HZBTileMax = float3(HZBTileMin.xy + HZBTileSize, 1);
        float2 TileIntersections = LineBoxIntersect(RayStartScreenUV, RayEndScreenUV, HZBTileMin, HZBTileMax);
        float3 RayTileHit = RayStartScreenUV + (RayEndScreenUV - RayStartScreenUV) * TileIntersections.y;

        float TileZ;
        float AvoidSelfIntersectionZScale = 1.0f;

#if HZB_TRACE_INCLUDE_FULL_RES_DEPTH
        if (MipLevel < 0)
        {
            TileZ = SceneDepthTexture.SampleLevel(GlobalPointClampedSampler, BiasedUV * HZBUVToScreenUVScaleBias.xy + HZBUVToScreenUVScaleBias.zw, 0).x;
        }
        else
#endif
        {
            TileZ = ClosestHZBTexture.SampleLevel(GlobalPointClampedSampler, BiasedUV, MipLevel).x;
            // Heuristic to avoid false self-occlusion, because HZB mip 0 holds the closest depth of each 2x2 block and the HZB is stored as 16-bit floats.
            AvoidSelfIntersectionZScale = lerp(.99f, 1.0f, saturate(TileIntersections.y * 10.0f));
        }

        if (RayTileHit.z > TileZ * AvoidSelfIntersectionZScale)
        {
            RayScreenUV = RayTileHit;
            MipLevel++;

            if (TileIntersections.y == 1.0f)
            {
                // The ray did not intersect any HZB tile.
                MipLevel = BaseMipLevel - 1;
            }
        }
        else
        {
            if (abs(MipLevel - BaseMipLevel) < .1f)
            {
                // Snap the hit UV to the texel center for the SceneColor lookup.
                RayScreenUV = float3(.5f * (HZBTileMin.xy + HZBTileMax.xy), RayTileHit.z);
                bHit = true;
                float IntersectionDepth = ConvertFromDeviceZ(TileZ);
                float RayTileEnterZ = RayStartScreenUV.z + (RayEndScreenUV.z - RayStartScreenUV.z) * TileIntersections.x;
                bUncertain = (ConvertFromDeviceZ(RayTileEnterZ) - IntersectionDepth) / max(IntersectionDepth, .00001f) > UncertainTraceRelativeDepthThreshold;
            }

            MipLevel--;
        }

        NumIterations++;
    }

    // Take linear steps along the ray over a given thickness to reject hits behind very thin surfaces (grass, hair, foliage).
    if (bHit && !bUncertain && NumThicknessStepsToDetermineCertainty > 0)
    {
        float ThicknessSearchMipLevel = 0.0f;
        float MipNumTexels = exp2(ThicknessSearchMipLevel);
        float2 HZBTileSize = MipNumTexels * HZBBaseTexelSize;
        float NumSteps = NumThicknessStepsToDetermineCertainty / MipNumTexels;
        float ThicknessSearchEndTime = min(length(RayDirectionScreenUV * HZBTileSize * NumSteps) / length(RayEndScreenUV.xy - RayScreenUV.xy), 1.0f);

        for (float I = 0; I < NumSteps; I++)
        {
            float3 SampleUV = RayScreenUV + (I / NumSteps) * ThicknessSearchEndTime * (RayEndScreenUV - RayScreenUV);

            if (all(SampleUV.xy > 0 && SampleUV.xy < HZBUvFactorAndInvFactor.xy))
            {
                float SampleTileZ = ClosestHZBTexture.SampleLevel(GlobalPointClampedSampler, SampleUV.xy, ThicknessSearchMipLevel).x;

                if (SampleUV.z > SampleTileZ)
                {
                    bUncertain = true;
                }
            }
        }
    }

    OutScreenUV.xy = RayScreenUV.xy * HZBUVToScreenUVScaleBias.xy + HZBUVToScreenUVScaleBias.zw;
    OutScreenUV.z = RayScreenUV.z;
}
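The core loop of TraceScreen climbs to a coarser mip whenever the ray clears the current tile and descends whenever a hit becomes possible. It can be illustrated with a heavily simplified one-dimensional sketch. The assumptions are significant: depth here is linear with smaller values closer to the camera (the engine works in device Z with `ClosestHZBTexture`), the ray marches only in +x, and there is no self-intersection bias, thickness test, or `bUncertain` logic. `BuildClosestPyramid` and `TraceHierarchical` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Builds a closest-depth pyramid: level 0 is the input, and each higher level
// stores the smallest (closest) depth of the two texels below it.
std::vector<std::vector<float>> BuildClosestPyramid(const std::vector<float>& depth)
{
    assert(!depth.empty() && (depth.size() & (depth.size() - 1)) == 0);  // power of two
    std::vector<std::vector<float>> mips;
    mips.push_back(depth);
    while (mips.back().size() > 1) {
        const std::vector<float>& prev = mips.back();
        std::vector<float> next(prev.size() / 2);
        for (std::size_t i = 0; i < next.size(); ++i) {
            next[i] = std::min(prev[2 * i], prev[2 * i + 1]);
        }
        mips.push_back(next);
    }
    return mips;
}

// Marches a ray with depth rayZ(x) = startZ + slopeZ * x across texels [0, n).
// Returns the first texel where the ray reaches the stored depth, or -1 on a miss.
int TraceHierarchical(const std::vector<std::vector<float>>& mips,
                      float startZ, float slopeZ, int maxIterations)
{
    const int numTexels = static_cast<int>(mips[0].size());
    const int maxLevel = static_cast<int>(mips.size()) - 1;
    int texel = 0;  // current ray position, always on a texel boundary
    int level = 0;
    for (int i = 0; i < maxIterations && texel < numTexels; ++i) {
        const int tileSize = 1 << level;
        const int tile = texel / tileSize;
        const int tileEnd = (tile + 1) * tileSize;
        // Deepest point of the ray between its current position and the tile exit.
        const float deepestRayZ = std::max(startZ + slopeZ * static_cast<float>(texel),
                                           startZ + slopeZ * static_cast<float>(tileEnd));
        if (deepestRayZ < mips[level][tile]) {
            texel = tileEnd;                        // ray clears the tile: skip it
            level = std::min(level + 1, maxLevel);  // and try a coarser mip next
        } else if (level == 0) {
            return texel;                           // hit at full resolution
        } else {
            --level;                                // possible hit: refine
        }
    }
    return -1;
}
```

For a mostly empty depth buffer this loop skips exponentially growing tiles, which is the reason the engine can afford a per-ray iteration budget such as `MaxHierarchicalScreenTraceIterations`.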
           

For ray tracing in HZB screen space, see Lingqi Yan's graphics course GAMES202: High-Quality Real-Time Rendering, Lecture 9, Real-Time Global Illumination (Screen Space). Its video walks through the HZB traversal and tracing process in detail, with animations.

  • TraceVoxels

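The inputs to voxel tracing include the global distance field, normals, depth, sky light, blue noise, VoxelLighting, RadianceProbeIndirectTexture, FinalRadianceAtlas, and ray info; the outputs are the R32 TraceHit and the R11G11B10 TraceRadiance:

The TraceHit output texture of TraceVoxels stores the depth of the intersection points. Note that the display range (upper-right corner) has been adjusted.

The TraceRadiance output texture of TraceVoxels stores the radiance at the intersection points.

Next, analyze the compute shader it uses: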

// Engine\Shaders\Private\Lumen\LumenScreenProbeTracing.usf

[numthreads(PROBE_THREADGROUP_SIZE_1D, 1, 1)]
void ScreenProbeTraceVoxelsCS(
    uint3 GroupId : SV_GroupID,
    uint3 DispatchThreadId : SV_DispatchThreadID,
    uint3 GroupThreadId : SV_GroupThreadID)
{
    if (DispatchThreadId.x < CompactedTraceTexelAllocator[0])
    {
        uint ScreenProbeIndex;
        uint2 TraceTexelCoord;
        float TraceHitDistance;
        // Decode the trace texel that needs tracing.
        DecodeTraceTexel(CompactedTraceTexelData[DispatchThreadId.x], ScreenProbeIndex, TraceTexelCoord, TraceHitDistance);

        // Compute the probe's coordinate in the atlas.
        uint2 ScreenProbeAtlasCoord = uint2(ScreenProbeIndex % ScreenProbeAtlasViewSize.x, ScreenProbeIndex / ScreenProbeAtlasViewSize.x);
        // Trace voxel lighting for this probe texel.
        TraceVoxels(ScreenProbeAtlasCoord, TraceTexelCoord, ScreenProbeIndex, TraceHitDistance);
    }
}

void TraceVoxels(
    uint2 ScreenProbeAtlasCoord,
    uint2 TraceTexelCoord,
    uint ScreenProbeIndex,
    float TraceHitDistance)
{
    // Compute the trace coordinates.
    uint2 ScreenProbeScreenPosition = GetScreenProbeScreenPosition(ScreenProbeIndex);
    uint2 ScreenTileCoord = GetScreenTileCoord(ScreenProbeScreenPosition);

    uint2 TraceCoord = GetTraceBufferCoord(ScreenProbeAtlasCoord, TraceTexelCoord);
    
    {
        // Fetch the screen-space data.
        float2 ScreenUV = GetScreenUVFromScreenProbePosition(ScreenProbeScreenPosition);
        float SceneDepth = GetScreenProbeDepth(ScreenProbeAtlasCoord);
        float3 SceneNormal = DecodeNormal(SceneTexturesStruct.GBufferATexture.Load(int3(ScreenUV * View.BufferSizeAndInvSize.xy, 0)).xyz);

        bool bHit = false;

        {
            // Compute the world position.
            float3 WorldPosition = GetWorldPositionFromScreenUV(ScreenUV, SceneDepth);

            float2 ProbeUV;
            float ConeHalfAngle;
            // Get the probe tracing UV.
            GetProbeTracingUV(ScreenProbeAtlasCoord, TraceTexelCoord, GetProbeTexelCenter(ScreenTileCoord), 1, ProbeUV, ConeHalfAngle);

            // Convert the octahedral map UV back to a direction.
            float3 WorldConeDirection = OctahedralMapToDirection(ProbeUV);

            // Sample position.
            float3 SamplePosition = WorldPosition + SurfaceBias * WorldConeDirection;
            SamplePosition += SurfaceBias * SceneNormal;

            float TraceDistance = MaxTraceDistance;
            bool bCoveredByRadianceCache = false;
#if RADIANCE_CACHE
            float ProbeOcclusionDistance = GetRadianceProbeOcclusionDistanceWithInterpolation(WorldPosition, WorldConeDirection, bCoveredByRadianceCache);
            TraceDistance = min(TraceDistance, ProbeOcclusionDistance);
#endif

            // Build the cone trace input.
            FConeTraceInput TraceInput;
            TraceInput.Setup(SamplePosition, WorldConeDirection, ConeHalfAngle, MinSampleRadius, MinTraceDistance, TraceDistance, StepFactor);
            TraceInput.VoxelStepFactor = VoxelStepFactor;
            TraceInput.VoxelTraceStartDistance = max(MinTraceDistance, TraceHitDistance);

            // Build the cone trace output.
            FConeTraceResult TraceResult = (FConeTraceResult)0;
            TraceResult.Lighting = 0;
            TraceResult.Transparency = 1;
            TraceResult.OpaqueHitDistance = TraceInput.MaxTraceDistance;

            // Cone trace the lighting voxels of the Lumen scene.
            ConeTraceLumenSceneVoxels(TraceInput, TraceResult);

            if (TraceResult.Transparency <= .5f)
            {
                // Self-intersection from grazing-angle traces causes noise that the spatial filter cannot remove.
                #define USE_VOXEL_TRACE_HIT_DISTANCE 0
                #if USE_VOXEL_TRACE_HIT_DISTANCE
                    TraceHitDistance = TraceResult.OpaqueHitDistance;
                #else
                    TraceHitDistance = TraceDistance;
                #endif
                bHit = true;
            }

#if RADIANCE_CACHE
            if (bCoveredByRadianceCache)
            {
                if (TraceResult.Transparency > .5f)
                {
                    // Do not store the depth of radiance cache hits.
                    TraceHitDistance = MaxTraceDistance;
                }

                SampleRadianceCacheAndApply(WorldPosition, WorldConeDirection, ConeHalfAngle, float3(0, 0, 0), TraceResult.Lighting, TraceResult.Transparency);
            }
            else
#endif
            {
#if TRACE_DISTANT_SCENE
                // Trace the distant scene.
                if (TraceResult.Transparency > .01f)
                {
                    FConeTraceResult DistantTraceResult;
                    ConeTraceLumenDistantScene(TraceInput, DistantTraceResult);
                    TraceResult.Lighting += DistantTraceResult.Lighting * TraceResult.Transparency;
                    TraceResult.Transparency *= DistantTraceResult.Transparency;
                }
#endif
                // Evaluate the sky light.
                EvaluateSkyRadianceForCone(WorldConeDirection, tan(ConeHalfAngle), TraceResult);

                if (TraceHitDistance >= GetProbeMaxHitDistance())
                {
                    TraceHitDistance = MaxTraceDistance;
                }
            }
            
            #if USE_PREEXPOSURE
                TraceResult.Lighting *= View.PreExposure;
            #endif

            #if DEBUG_VISUALIZE_TRACE_TYPES
                RWTraceRadiance[TraceCoord] = float3(0, 0, .5f) * View.PreExposure;
            #else
                RWTraceRadiance[TraceCoord] = TraceResult.Lighting;
            #endif
        }

        // Store the trace result: hit distance, hit flag, and moving flag are encoded into a 32-bit unsigned integer.
        RWTraceHit[TraceCoord] = EncodeProbeRayDistance(TraceHitDistance, bHit, false);
    }
}
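The last line of TraceVoxels packs the hit distance and two flags into the single R32 TraceHit texel. The engine's actual bit layout in `EncodeProbeRayDistance` is not shown in this article, so the sketch below uses a hypothetical layout purely to illustrate the idea: the two lowest mantissa bits of the float distance are sacrificed to hold the hit and moving flags. `EncodeRayDistanceSketch`, `DecodeRayDistanceSketch`, and `DecodedRay` are invented names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical layout: bits 2..31 hold the float distance with its two lowest
// mantissa bits cleared, bit 0 is the hit flag, bit 1 is the moving flag.
std::uint32_t EncodeRayDistanceSketch(float hitDistance, bool hit, bool moving)
{
    assert(hitDistance >= 0.0f);
    std::uint32_t bits;
    std::memcpy(&bits, &hitDistance, sizeof(bits));  // reinterpret the float bits
    bits &= ~std::uint32_t(3);                       // free the two lowest bits
    bits |= hit ? 1u : 0u;
    bits |= moving ? 2u : 0u;
    return bits;
}

struct DecodedRay {
    float hitDistance;
    bool hit;
    bool moving;
};

DecodedRay DecodeRayDistanceSketch(std::uint32_t encoded)
{
    DecodedRay result;
    result.hit = (encoded & 1u) != 0;
    result.moving = (encoded & 2u) != 0;
    const std::uint32_t bits = encoded & ~std::uint32_t(3);
    std::memcpy(&result.hitDistance, &bits, sizeof(bits));
    return result;
}
```

Whatever the real layout is, the design goal is the same: one 32-bit texel per trace carries everything the later CompositeTraces step needs to know about the ray.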
           
  • CompositeTraces

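CompositeTraces uses the TraceHit, RayInfo, and TraceRadiance generated by the previous steps to produce the ScreenProbeRadiance, ScreenProbeHitDistance, and ScreenProbeTraceMoving textures. Its compute shader lives in LumenScreenProbeFiltering.usf, with ScreenProbeCompositeTracesWithScatterCS as the main entry point; the code is omitted here.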

  • FilterRadianceWithGather

After CompositeTraces, several FilterRadianceWithGather passes run to filter the probe radiance:

Left: ScreenProbeRadiance before filtering; right: ScreenProbeRadiance after several filtering passes.

  • ComputeIndirect

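This stage uses the previously generated screen-space probe data (depth, normals, base color, FilteredScreenProbeRadiance, BentNormal) to compute the final indirect lighting color of the scene.

RenderLumenReflections renders reflections for the smoother, low-roughness surfaces in the Lumen scene. Its flow is similar to RenderLumenScreenProbeGather, but simpler and with fewer steps.

The C++ rendering code involved is as follows: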

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenReflections.cpp

FRDGTextureRef FDeferredShadingSceneRenderer::RenderLumenReflections(
    FRDGBuilder& GraphBuilder, 
    const FViewInfo& View,
    const FSceneTextures& SceneTextures,
    const FLumenMeshSDFGridParameters& MeshSDFGridParameters,
    FLumenReflectionCompositeParameters& OutCompositeParameters)
{
    // Maximum roughness for reflection tracing; surfaces rougher than this are ignored.
    OutCompositeParameters.MaxRoughnessToTrace = GLumenReflectionMaxRoughnessToTrace;
    OutCompositeParameters.InvRoughnessFadeLength = 1.0f / GLumenReflectionRoughnessFadeLength;

    (......)

    {
        (......)

        auto ComputeShader = View.ShaderMap->GetShader<FReflectionGenerateRaysCS>(0);

        // Ray generation pass.
        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("GenerateRaysCS"),
            ComputeShader,
            PassParameters,
            ReflectionTileParameters.TracingIndirectArgs,
            0);
    }

    FLumenCardTracingInputs TracingInputs(GraphBuilder, Scene, View);

    (......)

    // Trace reflections.
    TraceReflections(
        GraphBuilder, 
        Scene,
        View, 
        GLumenReflectionTraceMeshSDFs != 0 && Lumen::UseMeshSDFTracing(),
        SceneTextures,
        TracingInputs,
        ReflectionTracingParameters,
        ReflectionTileParameters,
        MeshSDFGridParameters);
    
    (......)

    {
        FReflectionResolveCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FReflectionResolveCS::FParameters>();
        
        (......)
        
        auto ComputeShader = View.ShaderMap->GetShader<FReflectionResolveCS>(PermutationVector);

        // Resolve reflections.
        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("ReflectionResolve"),
            ComputeShader,
            PassParameters,
            ReflectionTileParameters.ResolveIndirectArgs,
            0);
    }

    (......)

    // Update the history data.
    UpdateHistoryReflections(
        GraphBuilder,
        View,
        SceneTextures,
        ReflectionTileParameters,
        ResolvedSpecularIndirect,
        SpecularIndirect);

    return SpecularIndirect;
}

void TraceReflections(
    FRDGBuilder& GraphBuilder,
    const FScene* Scene,
    const FViewInfo& View,
    bool bTraceMeshSDFs,
    const FSceneTextures& SceneTextures,
    const FLumenCardTracingInputs& TracingInputs,
    const FLumenReflectionTracingParameters& ReflectionTracingParameters,
    const FLumenReflectionTileParameters& ReflectionTileParameters,
    const FLumenMeshSDFGridParameters& InMeshSDFGridParameters)
{
    {
        (......)

        auto ComputeShader = View.ShaderMap->GetShader<FReflectionClearTracesCS>(0);

        // Clear the trace output textures.
        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("ClearTraces"),
            ComputeShader,
            PassParameters,
            ReflectionTileParameters.TracingIndirectArgs,
            0);
    }

    FLumenIndirectTracingParameters IndirectTracingParameters;
    SetupIndirectTracingParametersForReflections(IndirectTracingParameters);

    const FSceneTextureParameters& SceneTextureParameters = GetSceneTextureParameters(GraphBuilder, SceneTextures);

    const bool bScreenTraces = GLumenReflectionScreenTraces != 0;

    if (bScreenTraces)
    {
        FReflectionTraceScreenTexturesCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FReflectionTraceScreenTexturesCS::FParameters>();

        (......)

        FReflectionTraceScreenTexturesCS::FPermutationDomain PermutationVector;
        auto ComputeShader = View.ShaderMap->GetShader<FReflectionTraceScreenTexturesCS>(PermutationVector);

        // Screen trace.
        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("TraceScreen"),
            ComputeShader,
            PassParameters,
            ReflectionTileParameters.TracingIndirectArgs,
            0);
    }
    
    // Mesh distance field tracing.
    if (bTraceMeshSDFs)
    {
        if (Lumen::UseHardwareRayTracedReflections()) // Hardware ray-traced reflections.
        {
            FCompactedReflectionTraceParameters CompactedTraceParameters = CompactTraces(
                GraphBuilder,
                View,
                ReflectionTracingParameters,
                ReflectionTileParameters,
                WORLD_MAX,
                IndirectTracingParameters.MaxTraceDistance);

            RenderLumenHardwareRayTracingReflections(
                GraphBuilder,
                SceneTextureParameters,
                View,
                ReflectionTracingParameters,
                ReflectionTileParameters,
                TracingInputs,
                CompactedTraceParameters,
                IndirectTracingParameters.MaxTraceDistance);
        }
        else
        {
            FLumenMeshSDFGridParameters MeshSDFGridParameters = InMeshSDFGridParameters;
            if (!MeshSDFGridParameters.NumGridCulledMeshSDFObjects)
            {
                CullForCardTracing(
                    GraphBuilder,
                    Scene, View,
                    TracingInputs,
                    IndirectTracingParameters,
                    /* out */ MeshSDFGridParameters);
            }

            if (MeshSDFGridParameters.TracingParameters.DistanceFieldObjectBuffers.NumSceneObjects > 0)
            {
                // Compact traces.
                FCompactedReflectionTraceParameters CompactedTraceParameters = CompactTraces(
                    GraphBuilder,
                    View,
                    ReflectionTracingParameters,
                    ReflectionTileParameters,
                    IndirectTracingParameters.CardTraceEndDistanceFromCamera,
                    IndirectTracingParameters.MaxMeshSDFTraceDistance);

                {
                    (......)
                    
                    auto ComputeShader = View.ShaderMap->GetShader<FReflectionTraceMeshSDFsCS>(PermutationVector);

                    // Trace mesh distance fields.
                    FComputeShaderUtils::AddPass(
                        GraphBuilder,
                        RDG_EVENT_NAME("TraceMeshSDFs"),
                        ComputeShader,
                        PassParameters,
                        CompactedTraceParameters.IndirectArgs,
                        0);
                }
            }
        }
    }

    FCompactedReflectionTraceParameters CompactedTraceParameters = CompactTraces(...);

    {
        (......)
        
        auto ComputeShader = View.ShaderMap->GetShader<FReflectionTraceVoxelsCS>(PermutationVector);

        // Trace voxel lighting.
        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("TraceVoxels"),
            ComputeShader,
            PassParameters,
            CompactedTraceParameters.IndirectArgs,
            0);
    }
}
           

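The most important difference between Lumen's indirect reflection lighting and its indirect diffuse lighting is the number of rays traced and the way they are traced. Lumen reflections require a maximum roughness to trace, GLumenReflectionMaxRoughnessToTrace (default 0.4, changeable through the console command r.Lumen.Reflections.MaxRoughnessToTrace), and the resulting TraceHit and TraceRadiance differ as well.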
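RenderLumenReflections passes two roughness parameters to the composite step: `MaxRoughnessToTrace` and `InvRoughnessFadeLength`. The source does not show how they are consumed, so the following is only a plausible sketch of a roughness fade built from those two values: full weight well below the maximum roughness, zero above it, and a linear ramp in between. The function name `ReflectionTraceWeight`, the formula, and the fade length of 0.2 used in the usage note are assumptions, not confirmed engine behavior.

```cpp
#include <algorithm>
#include <cassert>

// Assumed fade: 1 for smooth surfaces, 0 at or above maxRoughnessToTrace,
// with a linear ramp of width roughnessFadeLength just below the maximum.
float ReflectionTraceWeight(float roughness, float maxRoughnessToTrace, float roughnessFadeLength)
{
    assert(roughnessFadeLength > 0.0f);
    const float invRoughnessFadeLength = 1.0f / roughnessFadeLength;
    const float weight = (maxRoughnessToTrace - roughness) * invRoughnessFadeLength;
    return std::min(std::max(weight, 0.0f), 1.0f);  // saturate
}
```

With the default maximum of 0.4 and an illustrative fade length of 0.2, a surface of roughness 0.1 would receive traced reflections at full weight, while a surface of roughness 0.5 would receive none and would presumably rely on the rough specular result from the screen probe gather.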

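Because reflections and diffuse lighting rely on highly similar techniques, this article does not dig further into the details.

The DiffuseIndirectComposite stage combines the probe results generated by RenderLumenScreenProbeGather (DiffuseIndirect, RoughSpecularIndirect) and the reflection result generated by RenderLumenReflections (SpecularIndirect) with the scene GBuffer and related data to produce the final scene color:

Scene color after combining GI diffuse and specular reflection. (Enlarged 1.5x; color range adjusted.)

The compositing process itself can be found in the pixel shader it uses: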

// Engine\Shaders\Private\DiffuseIndirectComposite.usf

void MainPS(
    float4 SvPosition : SV_POSITION
    , out float4 OutAddColor : SV_Target0
    , out float4 OutMultiplyColor : SV_Target1
)
{
    float2 SceneBufferUV = SvPositionToBufferUV(SvPosition);
    float2 ScreenPosition = SvPositionToScreenPosition(SvPosition).xy;

    // Sample the scene GBuffer.
    FGBufferData GBuffer = GetGBufferDataFromSceneTextures(SceneBufferUV);

    // Sample the AO generated dynamically each frame.
    float DynamicAmbientOcclusion = AmbientOcclusionTexture.SampleLevel(AmbientOcclusionSampler, SceneBufferUV, 0).r;

    // Compute the final AO to apply.
    float AOMask = (GBuffer.ShadingModelID != SHADINGMODELID_UNLIT);
    float FinalAmbientOcclusion = lerp(1.0f, GBuffer.GBufferAO * DynamicAmbientOcclusion, AOMask * AmbientOcclusionStaticFraction);

    float3 TranslatedWorldPosition = mul(float4(ScreenPosition * GBuffer.Depth, GBuffer.Depth, 1), View.ScreenToTranslatedWorld).xyz;

    float3 N = GBuffer.WorldNormal;
    float3 V = normalize(View.TranslatedWorldCameraOrigin - TranslatedWorldPosition);
    float NoV = saturate(dot(N, V));

    // Apply indirect diffuse lighting.
#if DIM_APPLY_DIFFUSE_INDIRECT
    {
        float3 DiffuseIndirectLighting = 0;
        float3 RoughSpecularIndirectLighting = 0;
        float3 SpecularIndirectLighting = 0;

        #if DIM_APPLY_DIFFUSE_INDIRECT == 4
            DiffuseIndirectLighting = DiffuseIndirect_Textures_0.SampleLevel(GlobalPointClampedSampler, SceneBufferUV, 0).rgb;
            RoughSpecularIndirectLighting = DiffuseIndirect_Textures_1.SampleLevel(GlobalPointClampedSampler, SceneBufferUV, 0).rgb;
            SpecularIndirectLighting = DiffuseIndirect_Textures_2.SampleLevel(GlobalPointClampedSampler, SceneBufferUV, 0).rgb;
        #else
        {
            // Sample the denoiser output.
            FSSDKernelConfig KernelConfig = CreateKernelConfig();
                
            #if DEBUG_OUTPUT
            {
                KernelConfig.DebugPixelPosition = uint2(SvPosition.xy);
                KernelConfig.DebugEventCounter = 0;
            }
            #endif

            // Compile time.
            KernelConfig.bSampleKernelCenter = true;
            KernelConfig.BufferLayout = CONFIG_SIGNAL_INPUT_LAYOUT;
            KernelConfig.bUnroll = true;

            #if DIM_UPSCALE_DIFFUSE_INDIRECT
            {
                KernelConfig.SampleSet = SAMPLE_SET_2X2_BILINEAR;
                KernelConfig.BilateralDistanceComputation = SIGNAL_WORLD_FREQUENCY_REF_METADATA_ONLY;
                KernelConfig.WorldBluringDistanceMultiplier = 16.0;
                
                KernelConfig.BilateralSettings[0] = BILATERAL_POSITION_BASED(3);
                
                // SGPRs (scalar general-purpose registers)
                KernelConfig.BufferSizeAndInvSize = View.BufferSizeAndInvSize * float4(0.5, 0.5, 2.0, 2.0);
                KernelConfig.BufferBilinearUVMinMax = View.BufferBilinearUVMinMax;
            }
            #else
            {
                KernelConfig.SampleSet = SAMPLE_SET_1X1;
                KernelConfig.bNormalizeSample = true;
                
                // SGPRs
                KernelConfig.BufferSizeAndInvSize = View.BufferSizeAndInvSize;
                KernelConfig.BufferBilinearUVMinMax = View.BufferBilinearUVMinMax;
            }
            #endif

            // VGPRs (Vector General Purpose Registers)
            KernelConfig.BufferUV = SceneBufferUV; 
            {
                KernelConfig.CompressedRefSceneMetadata = GBufferDataToCompressedSceneMetadata(GBuffer);
                KernelConfig.RefBufferUV = SceneBufferUV;
                KernelConfig.RefSceneMetadataLayout = METADATA_BUFFER_LAYOUT_DISABLED;
            }
            KernelConfig.HammersleySeed = Rand3DPCG16(int3(SvPosition.xy, View.StateFrameIndexMod8)).xy;
                
            FSSDSignalAccumulatorArray UncompressedAccumulators = CreateSignalAccumulatorArray();
            FSSDCompressedSignalAccumulatorArray CompressedAccumulators = CompressAccumulatorArray(
                UncompressedAccumulators, CONFIG_ACCUMULATOR_VGPR_COMPRESSION);

            // Accumulate the convolution kernel.
            AccumulateKernel(
                KernelConfig,
                DiffuseIndirect_Textures_0,
                DiffuseIndirect_Textures_1,
                DiffuseIndirect_Textures_2,
                DiffuseIndirect_Textures_3,
                /* inout */ UncompressedAccumulators,
                /* inout */ CompressedAccumulators);

            // Resolve the sample.
            FSSDSignalSample Sample;
            #if DIM_UPSCALE_DIFFUSE_INDIRECT
                Sample = NormalizeToOneSample(UncompressedAccumulators.Array[0].Moment1);
            #else
                Sample = UncompressedAccumulators.Array[0].Moment1;
            #endif
            
            // When DIM_APPLY_DIFFUSE_INDIRECT is 1 or 3, only diffuse indirect lighting is available.
            #if DIM_APPLY_DIFFUSE_INDIRECT == 1 || DIM_APPLY_DIFFUSE_INDIRECT == 3
            {
                DiffuseIndirectLighting = Sample.SceneColor.rgb;
            }
            // When DIM_APPLY_DIFFUSE_INDIRECT is 2, both diffuse and specular indirect lighting are available.
            #elif DIM_APPLY_DIFFUSE_INDIRECT == 2
            {
                DiffuseIndirectLighting = UncompressedAccumulators.Array[0].Moment1.ColorArray[0];
                SpecularIndirectLighting = UncompressedAccumulators.Array[0].Moment1.ColorArray[1];
            }
            #else
                #error Unimplemented
            #endif
        }
        #endif

        float3 DiffuseColor = bVisualizeDiffuseIndirect ? float3(.18f, .18f, .18f) : GBuffer.DiffuseColor;
        float3 SpecularColor = GBuffer.SpecularColor;

        #if DIM_APPLY_DIFFUSE_INDIRECT == 4
            RemapClearCoatDiffuseAndSpecularColor(GBuffer, NoV, DiffuseColor, SpecularColor);
        #endif

        #if DIM_APPLY_DIFFUSE_INDIRECT == 2 || DIM_APPLY_DIFFUSE_INDIRECT == 4
            float DiffuseIndirectAO = 1;
        #else
            float DiffuseIndirectAO = lerp(1, FinalAmbientOcclusion, ApplyAOToDynamicDiffuseIndirect);
        #endif

        FDirectLighting IndirectLighting;
        if (GBuffer.ShadingModelID == SHADINGMODELID_HAIR)
        {
            IndirectLighting.Diffuse = DiffuseIndirectLighting * GBuffer.BaseColor;
            IndirectLighting.Specular = 0;
        }
        else
        {
            IndirectLighting.Diffuse = DiffuseIndirectLighting * DiffuseColor * DiffuseIndirectAO;
            IndirectLighting.Transmission = 0;

            #if DIM_APPLY_DIFFUSE_INDIRECT == 4
                IndirectLighting.Specular = CombineRoughSpecular(GBuffer, NoV, SpecularIndirectLighting, RoughSpecularIndirectLighting, SpecularColor);
            #else
                IndirectLighting.Specular = SpecularIndirectLighting * EnvBRDF(SpecularColor, GBuffer.Roughness, NoV);
            #endif
        }

        const bool bNeedsSeparateSubsurfaceLightAccumulation = UseSubsurfaceProfile(GBuffer.ShadingModelID);

        if (bNeedsSeparateSubsurfaceLightAccumulation &&
            View.bSubsurfacePostprocessEnabled > 0 && View.bCheckerboardSubsurfaceProfileRendering > 0)
        {
            bool bChecker = CheckerFromSceneColorUV(SceneBufferUV);

            // Adjust for checkerboard. only apply non-diffuse lighting (including emissive) 
            // to the specular component, otherwise lighting is applied twice
            IndirectLighting.Specular *= !bChecker;
        }

        // Accumulate the lighting result.
        FLightAccumulator LightAccumulator = (FLightAccumulator)0;
        LightAccumulator_Add(
            LightAccumulator,
            IndirectLighting.Diffuse + IndirectLighting.Specular,
            IndirectLighting.Diffuse,
            1.0f,
            bNeedsSeparateSubsurfaceLightAccumulation);
        // Fetch the lighting result.
        OutAddColor = LightAccumulator_GetResult(LightAccumulator);
    }
    #else
    {
        OutAddColor = 0;
    }
    #endif

    OutMultiplyColor = FinalAmbientOcclusion;
}
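
To make the final combination step of the shader above concrete, here is a minimal CPU-side C++ sketch of the non-hair branch, for the modes in which `DiffuseIndirectAO` is computed with `lerp`. It is an illustration under assumptions, not engine code: `Float3`, `Lerp`, and `ComposeIndirect` are hypothetical names, and the specular input is assumed to be already scaled by its BRDF term (the shader uses `EnvBRDF` or `CombineRoughSpecular` for that).

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for HLSL's float3; not an engine type.
using Float3 = std::array<float, 3>;

// Same meaning as HLSL lerp(a, b, t).
inline float Lerp(float a, float b, float t) { return a + (b - a) * t; }

// Mirrors the non-hair branch of the shader above:
//   Diffuse = DiffuseIndirectLighting * DiffuseColor * DiffuseIndirectAO
//   Result  = Diffuse + Specular
// aoStrength stands in for ApplyAOToDynamicDiffuseIndirect.
// scaledSpecular is assumed to be pre-multiplied by its BRDF term.
inline Float3 ComposeIndirect(const Float3& diffuseIndirect,
                              const Float3& diffuseColor,
                              const Float3& scaledSpecular,
                              float ambientOcclusion,
                              float aoStrength)
{
    assert(aoStrength >= 0.0f && aoStrength <= 1.0f);
    const float diffuseAO = Lerp(1.0f, ambientOcclusion, aoStrength);

    Float3 result{};
    for (int i = 0; i < 3; ++i)
    {
        result[i] = diffuseIndirect[i] * diffuseColor[i] * diffuseAO + scaledSpecular[i];
    }
    return result;
}
```

With `aoStrength` at 0 the ambient occlusion has no effect on the indirect diffuse term; at 1 it is applied in full.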
           

Lumen involves many intricate steps, but they can be summarized as follows:

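1. Build the MeshCards and LumenCards, and keep them updated.

2. Based on the Card information of the Lumen scene, trace and update the corresponding texels.

3. In the diffuse and specular passes, trace and compute the lighting of screen-space surfaces using several methods.

4. Combine the indirect diffuse and specular results from the previous steps to obtain the final scene color with indirect lighting applied.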

In addition, the tracing process uses several methods that are blended by weight (figure below).

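Hybrid tracing diagram. Red denotes screen traces, green denotes mesh distance field traces, and blue denotes Voxel Lighting traces. Color gradients show the transitions between trace types.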
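
As a rough illustration of how such a color-coded view can be read, the C++ sketch below turns three trace weights into a debug color using the red/green/blue convention of the caption. It is only an assumption about how one might visualize the weights; the function name `VisualizeTraceWeights` and the normalization are hypothetical and are not taken from the Lumen shaders.

```cpp
#include <array>
#include <cassert>

// RGB debug color; a hypothetical helper type.
using Color = std::array<float, 3>;

// Maps trace weights to a color following the caption's convention:
// red = screen trace, green = mesh distance field trace, blue = voxel lighting trace.
// The weights are assumed to be non-negative and not all zero.
inline Color VisualizeTraceWeights(float screenW, float meshSdfW, float voxelW)
{
    assert(screenW >= 0.0f && meshSdfW >= 0.0f && voxelW >= 0.0f);
    const float sum = screenW + meshSdfW + voxelW;
    assert(sum > 0.0f);
    // Normalize so that a pure trace type yields a pure primary color
    // and mixed weights yield the in-between hues seen in the figure.
    return Color{screenW / sum, meshSdfW / sum, voxelW / sum};
}
```

A pixel resolved half by screen traces and half by mesh distance field traces would therefore appear as an even mix of red and green.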

Setting DEBUG_VISUALIZE_TRACE_TYPES to 1 and turning off ShowFlag.DirectLighting on the console command line enables the trace-weight visualization mode:

// Engine\Shaders\Private\Lumen\LumenScreenProbeTracing.usf

#define DEBUG_VISUALIZE_TRACE_TYPES 1 // Enable trace-weight visualization (default: 0)
           

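Overall, Lumen combines several tracing techniques, including SSGI, SDF (Mesh SDF and Global SDF), Lumen Cards, and voxel cone tracing. Along the way it generates many kinds of data (adaptive Screen Space Probes, Irradiance Probes, Surface Cache, Prefiltered Radiance, Voxel Lighting, RSM, Virtual Textures, Clipmaps), computes the indirect diffuse and specular lighting from them, and finally blends the results into the scene color by weight.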

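Lumen's diffuse GI supports both a software and a hardware path. With default settings, the traces involved in the software path are as follows: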

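Trace type | Range | Description
Screen Trace | Whole scene | Equivalent to SSGI; whenever a hit point can be traced, its bounce information is used first.
Voxel Lighting Trace | Within 200 m of the camera | Cone-based ray tracing that samples MIP levels to quickly obtain information at different hit distances.
Detail MeshCard Trace | 2 to 40 m | When sampling MeshCard lighting, occlusion is estimated probabilistically in a VSM-like manner.
Distant MeshCard Trace | 200 to 1000 m | Traces the pre-generated global distance field and no longer uses occlusion estimation.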
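
The ranges listed above can be turned into a small lookup to see which software traces are candidates at a given distance from the camera. The C++ sketch below is only an illustration: the names `TraceRange` and `TracesCovering` are hypothetical, the bounds are treated as inclusive, and the Voxel Lighting range is assumed to start at 0 m. The Screen Trace is always listed because it covers the whole scene, although it only contributes when the ray actually hits something on screen.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One row of the range list above; a hypothetical helper type.
struct TraceRange
{
    const char* name;
    float minMeters;
    float maxMeters;
};

// Returns the software trace types whose listed range contains the given distance.
inline std::vector<std::string> TracesCovering(float distanceMeters)
{
    assert(distanceMeters >= 0.0f);

    // Ranges copied from the list above; inclusive bounds are an assumption.
    static const TraceRange kRanges[] = {
        {"Voxel Lighting Trace", 0.0f, 200.0f},
        {"Detail MeshCard Trace", 2.0f, 40.0f},
        {"Distant MeshCard Trace", 200.0f, 1000.0f},
    };

    // The screen trace covers the whole scene, so it is always a candidate.
    std::vector<std::string> result{"Screen Trace"};
    for (const TraceRange& range : kRanges)
    {
        if (distanceMeters >= range.minMeters && distanceMeters <= range.maxMeters)
        {
            result.push_back(range.name);
        }
    }
    return result;
}
```

At 10 m, for example, the screen, voxel lighting, and detail MeshCard traces all overlap, which is where the weighted blending described earlier comes into play.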

Lumen's specular GI also supports a software and a hardware path; the software path combines SSR with SDF tracing (Mesh SDF and Global SDF).

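Temporal Super Resolution (TSR) is a new-generation temporal anti-aliasing algorithm that replaces the traditional TAA of UE4. It produces high-resolution output from low-resolution input with quality approaching native resolution, shows less ghosting and less flickering on high-frequency detail, and is optimized for platforms such as PS5. It does, however, require a graphics platform that supports SM5.0 or above.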

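TSR is similar in purpose to NVIDIA's DLSS and AMD's FidelityFX Super Resolution (FSR). The difference from DLSS is that DLSS accelerates its deep learning with Tensor Cores, whereas TSR does not depend on them. In other words, TSR does not require an RTX graphics card and can run on devices from other GPU vendors. Because TSR can produce high-resolution output from low-resolution rendering, it improves not only anti-aliasing quality but also rendering performance, and it reduces power consumption.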
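
The performance benefit comes mostly from shading fewer pixels before upscaling. The C++ sketch below works through that arithmetic under simple assumptions: both axes are scaled by the same percentage, and the names `Resolution`, `InputResolution`, `ShadedPixelFraction`, and the `screenPercentage` parameter are illustrative inputs rather than anything defined in this article.

```cpp
#include <cassert>
#include <cmath>

// Width and height in pixels; a hypothetical helper type.
struct Resolution
{
    int width;
    int height;
};

// Resolution actually rendered when each axis is scaled by screenPercentage (0, 100].
inline Resolution InputResolution(Resolution output, float screenPercentage)
{
    assert(screenPercentage > 0.0f && screenPercentage <= 100.0f);
    const float scale = screenPercentage / 100.0f;
    return Resolution{
        static_cast<int>(std::lround(output.width * scale)),
        static_cast<int>(std::lround(output.height * scale))};
}

// Fraction of the output pixel count that is shaded before upscaling.
inline double ShadedPixelFraction(Resolution input, Resolution output)
{
    return (static_cast<double>(input.width) * input.height) /
           (static_cast<double>(output.width) * output.height);
}
```

For a 3840x2160 output at 50 percent per axis, the input is 1920x1080, so only a quarter of the output pixels are shaded each frame; the upscaler reconstructs the rest from history.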

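Unlike UE4, as long as the configuration does not explicitly disable TemporalAA, UE5 enters the temporal upscaler path in the post-processing stage regardless of which anti-aliasing method is selected. That path dispatches to TSR when the Gen5 TAA algorithm is enabled and the platform supports it. The call flow is as follows: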

// Engine\Source\Runtime\Renderer\Private\PostProcess\PostProcessing.cpp

void AddPostProcessingPasses(FRDGBuilder& GraphBuilder, const FViewInfo& View, ...)
{
    (......)
    
    // Temporal anti-aliasing.
    EMainTAAPassConfig TAAConfig = ITemporalUpscaler::GetMainTAAPassConfig(View);
    // The TAA configuration is not disabled.
    if (TAAConfig != EMainTAAPassConfig::Disabled)
    {
        (......)
        
        // Calls FDefaultTemporalUpscaler::AddPasses; see the analysis below.
        UpscalerToUse->AddPasses(
            GraphBuilder,
            View,
            UpscalerPassInputs,
            &SceneColor.Texture,
            &SecondaryViewRect,
            &DownsampledSceneColor.Texture,
            &DownsampledSceneColor.ViewRect);
    }
    
    (......)
}

// Engine\Source\Runtime\Renderer\Private\PostProcess\TemporalAA.cpp

void FDefaultTemporalUpscaler::AddPasses(FRDGBuilder& GraphBuilder, const FViewInfo& View,...) const final
{
    // If Gen5 TAA is enabled and supported, take the TSR path.
    if (CVarTAAAlgorithm.GetValueOnRenderThread() && DoesPlatformSupportGen5TAA(View.GetShaderPlatform()))
    {
        *OutSceneColorHalfResTexture = nullptr;

        return AddTemporalSuperResolutionPasses(
            GraphBuilder,
            View,
            PassInputs,
            OutSceneColorTexture,
            OutSceneColorViewRect);
    }
    (......)
}
           

This leads into AddTemporalSuperResolutionPasses. The TSR rendering process captured with RenderDoc is shown below:

[Figure: RenderDoc capture of the TSR rendering passes.]

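As the capture shows, TSR has many more passes than UE4's TAA. The main stages are: clearing the previous frame's textures, dilating the velocity buffer, rejecting invalid velocity, filtering frequencies, comparing against history, post-filtering the reprojection, dilating the reprojection, and updating history.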

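The most important of these stages is the history update. From the input scene color, depth, dilated velocity, parallax factor, and previous-frame history data (dilated reprojection, reprojection, high frequency, low frequency, metadata, subpixel information), it produces the final anti-aliased scene color and the current frame's history data.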

Left: input scene color; right: scene color output after TSR.

History data output by TSR: low frequency, high frequency, metadata, and subpixel information.

The following analyzes the compute shader used by the history update stage:

// /Engine/Private/TemporalAA/TAAUpdateHistory.usf

[numthreads(TILE_SIZE, TILE_SIZE, 1)]
void MainCS(
    uint2 GroupId : SV_GroupID,
    uint GroupThreadIndex : SV_GroupIndex)
{
    uint GroupWaveIndex = GetGroupWaveIndex(GroupThreadIndex, /* GroupSize = */ TILE_SIZE * TILE_SIZE);

    float4 Debug = 0.0;

    // History pixel position.
    taa_short2 HistoryPixelPos = (
        taa_short2(GroupId) * taa_short2(TILE_SIZE, TILE_SIZE) +
        Map8x8Tile2x2Lane(GroupThreadIndex));

    float2 ViewportUV = (float2(HistoryPixelPos) + 0.5f) * HistoryInfo_ViewportSizeInverse;
    float2 ScreenPos = ViewportUVToScreenPos(ViewportUV);
    
    // Pixel coordinate of the center of output pixel O within the input viewport.
    float2 PPCo = ViewportUV * InputInfo_ViewportSize + InputJitter;

    // Pixel coordinate of the center of the nearest input pixel K.
    float2 PPCk = floor(PPCo) + 0.5;
    
    taa_short2 InputPixelPos = ClampPixelOffset(
        taa_short2(InputPixelPosMin) + taa_short2(PPCo),
        InputPixelPosMin, InputPixelPosMax);

    // Fetch reprojection-related information.
    float2 PrevScreenPos = ScreenPos;
    taa_half ParallaxRejectionMask = taa_half(1.0);
    taa_half LowFrequencyRejection = taa_half(1.0);
    taa_half OutputPixelVelocity = taa_half(0.0);
    #if 1
    {
        float2 EncodedVelocity = DilatedVelocityTexture[InputPixelPos];
        ParallaxRejectionMask = ParallaxRejectionMaskTexture[InputPixelPos];

        float2 ScreenVelocity = DecodeVelocityFromTexture(float4(EncodedVelocity, 0.0, 0.0)).xy;

        PrevScreenPos = ScreenPos - ScreenVelocity;
        OutputPixelVelocity = taa_half(length(ScreenVelocity * HistoryInfo_ViewportSize));

        taa_ushort2 RejectionPixelPos = (taa_ushort2(InputPixelPos) - taa_short2(InputPixelPosMin)) / 2;
        LowFrequencyRejection = HistoryRejectionTexture[RejectionPixelPos];
        
        #if !CONFIG_CLAMP
        {
            ParallaxRejectionMask = taa_half(1.0);
            LowFrequencyRejection = taa_half(1.0);
        }
        #endif
    }
    #endif

    // Determine whether this pixel uses responsive AA.
    bool bIsResponsiveAAPixel = false;
    #if CONFIG_RESPONSIVE_STENCIL
    {
        const uint kResponsiveStencilMask = 1 << 3;
            
        uint SceneStencilRef = InputSceneStencilTexture.Load(int3(InputPixelPos, 0)) STENCIL_COMPONENT_SWIZZLE;

        bIsResponsiveAAPixel = (SceneStencilRef & kResponsiveStencilMask) != 0;
    }
    #endif
    
    // Check whether the history buffer UV lies outside the viewport.
    bool bOffScreen = IsOffScreen(bCameraCut, PrevScreenPos, ParallaxRejectionMask);
    
    taa_half TotalRejection = bOffScreen ? 0.0 : saturate(LowFrequencyRejection * 4.0);


    // Filter the input scene color at the predicted frequency.
    taa_half3 FilteredInputColor;
    taa_half3 InputMinColor;
    taa_half3 InputMaxColor;
    taa_half InputPixelAlignement;
    taa_half ClosestInputLuma4;
    
    ISOLATE
    {
        // Vector from pixel K to pixel O.
        taa_half2 dKO = taa_half2(PPCo - PPCk);

        FilteredInputColor = taa_half(0.0);

        taa_half FilteredInputColorWeight = taa_half(0.0);
        
        #if 0 // shader compiler bug :'(
            taa_half InputToHistoryFactor = taa_half(HistoryInfo_ViewportSize.x * InputInfo_ViewportSizeInverse.x);
            taa_half FinalInputToHistoryFactor = bOffScreen ? taa_half(1.0) : InputToHistoryFactor;
        #else
            float InputToHistoryFactor = float(HistoryInfo_ViewportSize.x * InputInfo_ViewportSizeInverse.x);
            float FinalInputToHistoryFactor = lerp(1.0, InputToHistoryFactor, TotalRejection);
        #endif

        InputMinColor = taa_half(INFINITE_FLOAT);
        InputMaxColor = taa_half(-INFINITE_FLOAT);

        // Generate sample coordinates according to CONFIG_SAMPLES and sample the input scene color.
        UNROLL_N(CONFIG_SAMPLES)
        for (uint SampleId = 0; SampleId < CONFIG_SAMPLES; SampleId++)
        {
            taa_short2 SampleInputPixelPos;
            taa_half2 PixelOffset;
            
            #if CONFIG_SAMPLES == 9
            {
                taa_short2 iPixelOffset = taa_short2(kOffsets3x3[kSquareIndexes3x3[SampleId]]);
                PixelOffset = taa_half2(iPixelOffset);
                
                SampleInputPixelPos = AddAndClampPixelOffset(
                    InputPixelPos,
                    iPixelOffset, iPixelOffset,
                    InputPixelPosMin, InputPixelPosMax);
            }
            #elif CONFIG_SAMPLES == 5 || CONFIG_SAMPLES == 6
            {
                if (SampleId == 5)
                {
                    taa_short2 iPixelOffset;
                    #if CONFIG_COMPILE_FP16
                        iPixelOffset = int16_t2(1, 1) - int16_t2((asuint16(dKO) & uint16_t(0x8000)) >> uint16_t(14));
                        PixelOffset = asfloat16(asuint16(half(1.0)).xx | (asuint16(dKO) & uint16_t(0x8000)));
                    #else
                        iPixelOffset = SignFastInt(dKO);
                        PixelOffset = asfloat(asuint(1.0).xx | (asuint(dKO) & uint(0x80000000)));
                    #endif
                        
                    SampleInputPixelPos = ClampPixelOffset(InputPixelPos, InputPixelPosMin, InputPixelPosMax);
                }
                else
                {
                    taa_short2 iPixelOffset = taa_short2(kOffsets3x3[kPlusIndexes3x3[SampleId]]);
                    PixelOffset = taa_half2(iPixelOffset);
                    
                    SampleInputPixelPos = AddAndClampPixelOffset(
                        InputPixelPos,
                        iPixelOffset, iPixelOffset,
                        InputPixelPosMin, InputPixelPosMax);
                }
            }
            #else
                #error Unknown sample count
            #endif

            taa_half3 InputColor = InputSceneColorTexture[SampleInputPixelPos];

            taa_half2 dPP = PixelOffset - dKO;
            taa_half SampleSpatialWeight = ComputeSampleWeigth(FinalInputToHistoryFactor, dPP, /* MinimalContribution = */ float(0.005));

            taa_half ToneWeight = HdrWeight4(InputColor);

            FilteredInputColor       += (SampleSpatialWeight * ToneWeight) * InputColor;
            FilteredInputColorWeight += (SampleSpatialWeight * ToneWeight);

            if (SampleId == 0)
            {
                ClosestInputLuma4 = Luma4(InputColor);
                InputMinColor = TransformColorForClampingBox(InputColor);
                InputMaxColor = TransformColorForClampingBox(InputColor);
            }
            else
            {
                InputMinColor = min(InputMinColor, TransformColorForClampingBox(InputColor));
                InputMaxColor = max(InputMaxColor, TransformColorForClampingBox(InputColor));
            }
        }
        
        FilteredInputColor *= rcp(FilteredInputColorWeight);

        InputPixelAlignement = ComputeSampleWeigth(InputToHistoryFactor, dKO, /* MinimalContribution = */ float(0.0));
    }
        
    // Spill to LDS to free up VGPRs for sampling the history data.
    #if CONFIG_MANUAL_LDS_SPILL
    ISOLATE
    {
        uint LocalGroupThreadIndex = GetGroupThreadIndex(GroupThreadIndex, GroupWaveIndex);

        SharedArray0[LocalGroupThreadIndex] = taa_half4(FilteredInputColor, LowFrequencyRejection);
        SharedArray1[LocalGroupThreadIndex] = taa_half4(InputMinColor, InputPixelAlignement);
        SharedArray2[LocalGroupThreadIndex] = taa_half4(InputMaxColor, OutputPixelVelocity);
    }
    #endif
    
    // Reproject the history data.
    taa_half3 PrevHistoryMoment1;
    taa_half PrevHistoryValidity;
    
    taa_half3 PrevHistoryMommentMin;
    taa_half3 PrevHistoryMommentMax;

    taa_half3 PrevFallbackColor;
    taa_half PrevFallbackWeight;
    
    taa_subpixel_details PrevSubpixelDetails;

    ISOLATE
    {
        // Reproject the history data.
        taa_half3 RawHistory0 = taa_half(0);
        taa_half3 RawHistory1 = taa_half(0);
        taa_half2 RawHistory2 = taa_half(0);

        taa_half3 RawHistory1Min = INFINITE_FLOAT;
        taa_half3 RawHistory1Max = -INFINITE_FLOAT;

        // Sample the raw history data.
        {
            float2 PrevHistoryBufferUV = (PrevHistoryInfo_ScreenPosToViewportScale * PrevScreenPos + PrevHistoryInfo_ScreenPosToViewportBias) * PrevHistoryInfo_ExtentInverse;
            PrevHistoryBufferUV = clamp(PrevHistoryBufferUV, PrevHistoryInfo_UVViewportBilinearMin, PrevHistoryInfo_UVViewportBilinearMax);

            #if 1
            {
                FCatmullRomSamples Samples = GetBicubic2DCatmullRomSamples(PrevHistoryBufferUV, PrevHistoryInfo_Extent, PrevHistoryInfo_ExtentInverse);

                UNROLL
                for (uint i = 0; i < Samples.Count; i++)
                {
                    float2 SampleUV = clamp(Samples.UV[i], PrevHistoryInfo_UVViewportBilinearMin, PrevHistoryInfo_UVViewportBilinearMax);

                    taa_half3 Sample0 = PrevHistory_Textures_0.SampleLevel(GlobalBilinearClampedSampler, SampleUV, 0);
                    taa_half3 Sample1 = PrevHistory_Textures_1.SampleLevel(GlobalBilinearClampedSampler, SampleUV, 0);
                    taa_half2 Sample2 = PrevHistory_Textures_2.SampleLevel(GlobalBilinearClampedSampler, SampleUV, 0);

                    RawHistory1Min = min(RawHistory1Min, Sample1 * SafeRcp(Sample2.g));
                    RawHistory1Max = max(RawHistory1Max, Sample1 * SafeRcp(Sample2.g));

                    RawHistory0 += Sample0 * taa_half(Samples.Weight[i]);
                    RawHistory1 += Sample1 * taa_half(Samples.Weight[i]);
                    RawHistory2 += Sample2 * taa_half(Samples.Weight[i]);
                }
                RawHistory0 *= taa_half(Samples.FinalMultiplier);
                RawHistory1 *= taa_half(Samples.FinalMultiplier);
                RawHistory2 *= taa_half(Samples.FinalMultiplier);
            }
            #else
            {
                RawHistory0 = PrevHistory_Textures_0.SampleLevel(GlobalBilinearClampedSampler, PrevHistoryBufferUV, 0);
                RawHistory1 = PrevHistory_Textures_1.SampleLevel(GlobalBilinearClampedSampler, PrevHistoryBufferUV, 0);
                RawHistory2 = PrevHistory_Textures_2.SampleLevel(GlobalBilinearClampedSampler, PrevHistoryBufferUV, 0);
            }
            #endif
            
            FSubpixelNeighborhood SubpixelNeighborhood = GatherPrevSubpixelNeighborhood(PrevHistory_Textures_3, PrevHistoryBufferUV);
            {
                PrevSubpixelDetails = 0;
                UNROLL_N(SUB_PIXEL_COUNT)
                for (uint SubpixelId = 0; SubpixelId < SUB_PIXEL_COUNT; SubpixelId++)
                {
                    taa_subpixel_payload SubpixelPayload = GetSubpixelPayload(SubpixelNeighborhood, SubpixelId);
                    PrevSubpixelDetails |= SubpixelPayload << (SUB_PIXEL_BIT_COUNT * SubpixelId);
                }
            }

            RawHistory0 = -min(-RawHistory0, taa_half(0.0));
            RawHistory1 = -min(-RawHistory1, taa_half(0.0));
            RawHistory2 = -min(-RawHistory2, taa_half(0.0));
        }
        
        // Unpack the history data.
        {
            PrevFallbackColor = RawHistory0;
            PrevFallbackWeight = RawHistory2.r;
            
            PrevHistoryMommentMin = RawHistory1Min;
            PrevHistoryMommentMax = RawHistory1Max;

            PrevHistoryMoment1 = RawHistory1;
            PrevHistoryValidity = RawHistory2.g;
        }

        // Correct the history data for the pre-exposure change.
        {
            PrevHistoryMommentMin *= taa_half(HistoryPreExposureCorrection);
            PrevHistoryMommentMax *= taa_half(HistoryPreExposureCorrection);
            PrevHistoryMoment1 *= taa_half(HistoryPreExposureCorrection);
            PrevFallbackColor *= taa_half(HistoryPreExposureCorrection);
        }
    }
    
    // Read the data back from LDS.
    #if CONFIG_MANUAL_LDS_SPILL
    ISOLATE
    {
        uint LocalGroupThreadIndex = GetGroupThreadIndex(GroupThreadIndex, GroupWaveIndex);

        taa_half4 RawLDS0 = SharedArray0[LocalGroupThreadIndex];
        taa_half4 RawLDS1 = SharedArray1[LocalGroupThreadIndex];
        taa_half4 RawLDS2 = SharedArray2[LocalGroupThreadIndex];

        FilteredInputColor = RawLDS0.rgb;
        InputMinColor = RawLDS1.rgb;
        InputMaxColor = RawLDS2.rgb;
        
        LowFrequencyRejection = RawLDS0.a;
        InputPixelAlignement = RawLDS1.a;
        OutputPixelVelocity = RawLDS2.a;
    }
    #endif

    // Reject high-frequency detail if the current low frequency has drifted from the history low frequency.
    #if CONFIG_LOW_FREQUENCY_DRIFT_REJECTION
    {
        taa_half3 PrevHighFrequencyYCoCg = TransformColorForClampingBox(PrevHistoryMoment1 * SafeRcp(PrevHistoryValidity));
        taa_half3 PrevYCoCg = TransformColorForClampingBox(PrevFallbackColor);
        taa_half3 ClampedPrevYCoCg = TransformColorForClampingBox(clamp(PrevFallbackColor, PrevHistoryMommentMin, PrevHistoryMommentMax));

        taa_half HighFrequencyRejection = MeasureRejectionFactor(
            PrevYCoCg, ClampedPrevYCoCg,
            PrevHighFrequencyYCoCg, InputMinColor, InputMaxColor);
        
        PrevHistoryMoment1 *= HighFrequencyRejection;
        PrevHistoryValidity *= HighFrequencyRejection;
    }
    #endif

    // Feed the current frame's input into the predictor for the next frame.
    const taa_half Histeresis = rcp(taa_half(MAX_SAMPLE_COUNT));
    const taa_half PredictionOnlyValidity = Histeresis * taa_half(2.0);
    
    // Clamp the fallback data.
    taa_half LumaMin;
    taa_half LumaMax;
    taa_half3 ClampedFallbackColor;
    taa_half FallbackRejection;
    {
        LumaMin = InputMinColor.x;
        LumaMax = InputMaxColor.x;

        taa_half3 PrevYCoCg = TransformColorForClampingBox(PrevFallbackColor);
        taa_half3 ClampedPrevYCoCg = clamp(PrevYCoCg, InputMinColor, InputMaxColor);
        taa_half3 InputCenterYCoCg = TransformColorForClampingBox(FilteredInputColor);

        ClampedFallbackColor = YCoCgToRGB(ClampedPrevYCoCg);
        
        FallbackRejection = MeasureRejectionFactor(
            PrevYCoCg, ClampedPrevYCoCg,
            InputCenterYCoCg, InputMinColor, InputMaxColor);

        #if !CONFIG_CLAMP
        {
            ClampedFallbackColor = PrevFallbackColor;
            FallbackRejection = taa_half(1.0);
        }
        #endif
    }

    taa_half3 FinalHistoryMoment1;
    taa_half FinalHistoryValidity;
    {
        // Compute how much of the history must be rejected, based on its completeness.
        taa_half PrevHistoryRejectionWeight = LowFrequencyRejection;
            
        FLATTEN
        if (bOffScreen)
        {
            PrevHistoryRejectionWeight = taa_half(0.0);
        }

        taa_half DesiredCurrentContribution = max(Histeresis * InputPixelAlignement, taa_half(0.0));

        // Determine whether the prediction-based rejection is confident enough.
        taa_half RejectionConfidentEnough = taa_half(1); // saturate(RejectionValidity * MAX_SAMPLE_COUNT - 3.0);

        // Compute the validity after rejection.
        taa_half RejectedValidity = (
            min(PrevHistoryValidity, PredictionOnlyValidity - DesiredCurrentContribution) +
            max(PrevHistoryValidity - PredictionOnlyValidity + DesiredCurrentContribution, taa_half(0.0)) * PrevHistoryRejectionWeight);

        RejectedValidity = PrevHistoryValidity * PrevHistoryRejectionWeight;

        // Compute the maximum output validity.
        taa_half OutputValidity = (
            clamp(RejectedValidity + DesiredCurrentContribution, taa_half(0.0), PredictionOnlyValidity) +
            clamp(RejectedValidity + DesiredCurrentContribution * PrevHistoryRejectionWeight * RejectionConfidentEnough - PredictionOnlyValidity, 0.0, 1.0 - PredictionOnlyValidity));

        FLATTEN
        if (bIsResponsiveAAPixel)
        {
            OutputValidity = taa_half(0.0);
        }
        
        taa_half InvPrevHistoryValidity = SafeRcp(PrevHistoryValidity);

        taa_half PrevMomentWeight = max(OutputValidity - DesiredCurrentContribution, taa_half(0.0));
        taa_half CurrentMomentWeight = min(DesiredCurrentContribution, OutputValidity);
        
        {
            taa_half PrevHistoryToneWeight = HdrWeightY(Luma4(PrevHistoryMoment1) * InvPrevHistoryValidity);
            taa_half FilteredInputToneWeight = HdrWeight4(FilteredInputColor);
            
            taa_half BlendPrevHistory = PrevMomentWeight * PrevHistoryToneWeight;
            taa_half BlendFilteredInput = CurrentMomentWeight * FilteredInputToneWeight;

            taa_half CommonWeight = OutputValidity * SafeRcp(BlendPrevHistory + BlendFilteredInput);

            FinalHistoryMoment1 = (
                PrevHistoryMoment1 * (CommonWeight * BlendPrevHistory * InvPrevHistoryValidity) +
                FilteredInputColor * (CommonWeight * BlendFilteredInput));
        }

        // Adjust for the 8-bit quantization of validity to avoid numerical drift.
        taa_half OutputInvValidity = SafeRcp(OutputValidity);
        FinalHistoryValidity = ceil(taa_half(255.0) * OutputValidity) * rcp(taa_half(255.0));
        FinalHistoryMoment1 *= FinalHistoryValidity * OutputInvValidity;
    }

    // Compute the fallback history data.
    taa_half3 FinalFallbackColor;
    taa_half FinalFallbackWeight;
    {
        const taa_half TargetHesteresisCurrentFrameWeight = rcp(taa_half(MAX_FALLBACK_SAMPLE_COUNT));

        taa_half LumaHistory = Luma4(PrevFallbackColor);
        taa_half LumaFiltered = Luma4(FilteredInputColor);

        {
            taa_half OutputBlend = ComputeFallbackContribution(FinalHistoryValidity);
        }

        taa_half BlendFinal;
        #if 1
        {
            taa_half CurrentFrameSampleCount = max(InputPixelAlignement, taa_half(0.005));
            
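            // Use a single sample count to recover very quickly from history rejection, then stabilize immediately so that subpixel frequencies can be used as soon as possible.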
            taa_half PrevFallbackSampleCount;
            FLATTEN
            if (PrevFallbackWeight < taa_half(1.0))
            {
                PrevFallbackSampleCount = PrevFallbackWeight;
            }
            else
            {
                PrevFallbackSampleCount = taa_half(MAX_FALLBACK_SAMPLE_COUNT);
            }

            // Reject the history data based on the low frequency.
            #if 1
            {
                taa_half PrevFallbackRejectionFactor = saturate(LowFrequencyRejection * (CurrentFrameSampleCount + PrevFallbackSampleCount) / PrevFallbackSampleCount);

                PrevFallbackSampleCount *= PrevFallbackRejectionFactor;
            }
            #endif

            BlendFinal = CurrentFrameSampleCount / (CurrentFrameSampleCount + PrevFallbackSampleCount);

            // Increase the blend weight under motion.
            #if 1
            {
                BlendFinal = lerp(BlendFinal, max(taa_half(0.2), BlendFinal), saturate(OutputPixelVelocity * rcp(taa_half(40.0))));
            }
            #endif

            // Anti-flicker.
            #if 1
            {
                taa_half DistToClamp = min( abs(LumaHistory - LumaMin), abs(LumaHistory - LumaMax) ) / max3( LumaHistory, LumaFiltered, taa_half(1e-4) );
                BlendFinal *= taa_half(0.2) + taa_half(0.8) * saturate(taa_half(0.5) * DistToClamp);
            }
            #endif
            
            // Ensure there is at least a small contribution.
            #if 1
            {
                BlendFinal = max( BlendFinal, saturate( taa_half(0.01) * LumaHistory / abs( LumaFiltered - LumaHistory ) ) );
            }
            #endif

            // Responsive pixels blend in 1/4 of the new frame.
            BlendFinal = bIsResponsiveAAPixel ? taa_half(1.0/4.0) : BlendFinal;

            // Reject the history data entirely.
            {
                PrevFallbackSampleCount *= TotalRejection;
                BlendFinal = lerp(1.0, BlendFinal, TotalRejection);
            }

            FinalFallbackWeight = saturate(CurrentFrameSampleCount + PrevFallbackSampleCount);
            
            #if 1
                FinalFallbackWeight = saturate(floor(255.0 * (CurrentFrameSampleCount + PrevFallbackSampleCount)) * rcp(255.0));
            #endif
        }
        #endif

        {
            taa_half FilterWeight = HdrWeight4(FilteredInputColor);
            taa_half ClampedHistoryWeight = HdrWeight4(ClampedFallbackColor);

            taa_half2 Weights = WeightedLerpFactors(ClampedHistoryWeight, FilterWeight, BlendFinal);

            FinalFallbackColor = ClampedFallbackColor * Weights.x + FilteredInputColor * Weights.y;
        }
    }

    // Update the subpixel details.
    taa_subpixel_details FinalSubpixelDetails;
    {
        taa_half2 dKO = taa_half2(PPCo - PPCk);

        bool bUpdate = all(abs(dKO) < 0.5 * (InputInfo_ViewportSize.x * HistoryInfo_ViewportSizeInverse.x));

        FinalSubpixelDetails = PrevSubpixelDetails;

        taa_subpixel_payload ParallaxFactorBits = ParallaxFactorTexture[InputPixelPos] & SUB_PIXEL_PARALLAX_FACTOR_BIT_MASK;

        {
            const uint ParallaxFactorMask = (
                (SUB_PIXEL_PARALLAX_FACTOR_BIT_MASK << (SUB_PIXEL_PARALLAX_FACTOR_BIT_OFFSET + 0 * SUB_PIXEL_BIT_COUNT)) | 
                (SUB_PIXEL_PARALLAX_FACTOR_BIT_MASK << (SUB_PIXEL_PARALLAX_FACTOR_BIT_OFFSET + 1 * SUB_PIXEL_BIT_COUNT)) | 
                (SUB_PIXEL_PARALLAX_FACTOR_BIT_MASK << (SUB_PIXEL_PARALLAX_FACTOR_BIT_OFFSET + 2 * SUB_PIXEL_BIT_COUNT)) | 
                (SUB_PIXEL_PARALLAX_FACTOR_BIT_MASK << (SUB_PIXEL_PARALLAX_FACTOR_BIT_OFFSET + 3 * SUB_PIXEL_BIT_COUNT)) | 
                0x0);
            
            // Reset the parallax factor.
            FLATTEN
            if (bOffScreen)
            {
                FinalSubpixelDetails = FinalSubpixelDetails & ~ParallaxFactorMask;
            }
        }

        FLATTEN
        if (bUpdate)
        {
            bool2 bBool = dKO < 0.0;

            uint SubpixelId = dot(uint2(bBool), uint2(1, SUB_PIXEL_GRID_SIZE));
            uint SubpixelShift = SubpixelId * SUB_PIXEL_BIT_COUNT;

            taa_subpixel_payload SubpixelPayload = (ParallaxFactorBits << SUB_PIXEL_PARALLAX_FACTOR_BIT_OFFSET);

            FinalSubpixelDetails = (FinalSubpixelDetails & (~(SUB_PIXEL_BIT_MASK << SubpixelShift))) | (SubpixelPayload << SubpixelShift);
        }
    }

    // Compute the final output.
    taa_half3 FinalOutputColor;
    taa_half FinalOutputValidity;
    {
        taa_half OutputBlend = ComputeFallbackContribution(FinalHistoryValidity);

        FinalOutputValidity = lerp(taa_half(1.0), saturate(FinalHistoryValidity), OutputBlend);

        taa_half3 NormalizedFinalHistoryMoment1 = taa_half3(FinalHistoryMoment1 * float(SafeRcp(FinalHistoryValidity)));

        taa_half FallbackWeight = HdrWeight4(FinalFallbackColor);
        taa_half Moment1Weight = HdrWeight4(NormalizedFinalHistoryMoment1);

        taa_half2 Weights = WeightedLerpFactors(FallbackWeight, Moment1Weight, OutputBlend);

        #if DEBUG_FALLBACK_BLENDING
            taa_half3 FallbackColor = taa_half3(1, 0.25, 0.25);
            taa_half3 HighFrequencyColor = taa_half3(0.25, 1, 0.25);

            FinalOutputColor = FinalFallbackColor * Weights.x * FallbackColor + NormalizedFinalHistoryMoment1 * Weights.y * HighFrequencyColor;
        #elif DEBUG_LOW_FREQUENCY_REJECTION
            taa_half3 DebugColor = lerp(taa_half3(1, 0.5, 0.5), taa_half3(0.5, 1, 0.5), LowFrequencyRejection);
            
            FinalOutputColor = FinalFallbackColor * Weights.x * DebugColor + NormalizedFinalHistoryMoment1 * Weights.y * DebugColor;
        #else
            FinalOutputColor = FinalFallbackColor * Weights.x + NormalizedFinalHistoryMoment1 * Weights.y;
        #endif
    }

    ISOLATE
    {
        uint LocalGroupThreadIndex = GetGroupThreadIndex(GroupThreadIndex, GroupWaveIndex);

        taa_short2 LocalHistoryPixelPos = (
            taa_short2(GroupId) * taa_short2(TILE_SIZE, TILE_SIZE) +
            Map8x8Tile2x2Lane(LocalGroupThreadIndex));
            
        LocalHistoryPixelPos = InvalidateOutputPixelPos(LocalHistoryPixelPos, HistoryInfo_ViewportMax);

        // Write out the final history data.
        {
            #if CONFIG_ENABLE_STOCASTIC_QUANTIZATION
            {
                uint2 Random = Rand3DPCG16(int3(LocalHistoryPixelPos, View.StateFrameIndexMod8)).xy;
                float2 E = Hammersley16(0, 1, Random);

                FinalHistoryMoment1 = QuantizeForFloatRenderTarget(FinalHistoryMoment1, E.x, HistoryQuantizationError);
                FinalFallbackColor = QuantizeForFloatRenderTarget(FinalFallbackColor, E.x, HistoryQuantizationError);
            }
            #endif

            FinalFallbackColor = -min(-FinalFallbackColor, taa_half(0.0));
            FinalHistoryMoment1 = -min(-FinalHistoryMoment1, taa_half(0.0));
            FinalFallbackColor = min(FinalFallbackColor, taa_half(Max10BitsFloat));
            FinalHistoryMoment1 = min(FinalHistoryMoment1, taa_half(Max10BitsFloat));
            
            HistoryOutput_Textures_0[LocalHistoryPixelPos] = FinalFallbackColor;
            HistoryOutput_Textures_1[LocalHistoryPixelPos] = FinalHistoryMoment1;
            HistoryOutput_Textures_2[LocalHistoryPixelPos] = taa_half2(FinalFallbackWeight, FinalHistoryValidity);
            HistoryOutput_Textures_3[LocalHistoryPixelPos] = FinalSubpixelDetails;

            #if DEBUG_OUTPUT
            {
                DebugOutput[LocalHistoryPixelPos] = Debug;
            }
            #endif
        }

        // Output the final scene color.
        {
            taa_half3 OutputColor = FinalOutputColor;
                
            OutputColor = -min(-OutputColor, taa_half(0.0));
            OutputColor = min(OutputColor, taa_half(Max10BitsFloat));

            SceneColorOutput[LocalHistoryPixelPos] = OutputColor;
        }
    }
}
           

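As the code shows, TSR tracks far more data than traditional TAA: high- and low-frequency signals for both the current frame and the history, parallax factors, reprojection data, and more. It uses this information to reject or recover history data, derives the blend weights for the current frame, and finally computes the anti-aliased scene color and the history data for the next frame.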

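The code above is only the last TSR stage, the one that updates the history. Many earlier steps generate the data this stage consumes; they are not analyzed here and are left for readers to study.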
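
One detail of this final stage worth isolating is the clamp applied before each write: `-min(-x, 0)` followed by `min(x, Max10BitsFloat)`. Here is a C++ sketch of that range clamp, under two assumptions the source does not state. First, the name `Max10BitsFloat` is read as the largest finite value of a 10-bit float channel, and the constant is derived from the standard 5-exponent, 5-mantissa layout rather than taken from the engine. Second, the shader `min` is assumed to return the non-NaN operand, which `std::fmin` mimics here.

```cpp
#include <cmath>

// Largest finite unsigned 10-bit float (5 exponent bits, 5 mantissa bits, bias 15):
// 2^15 * (1 + 31/32) = 64512. Derived here, not read from the engine source.
constexpr float kMax10BitFloatSketch = 32768.0f * (1.0f + 31.0f / 32.0f);

// Hypothetical helper mirroring the two-step clamp in the listing.
float ClampForFloatTargetSketch(float x) {
    // -min(-x, 0) equals max(x, 0) for ordinary values. Because std::fmin
    // returns the non-NaN operand, a NaN input becomes 0 instead of propagating.
    float nonNegative = -std::fmin(-x, 0.0f);
    // Keep the value representable in the narrowest channel of the target.
    return std::fmin(nonNegative, kMax10BitFloatSketch);
}
```

Under these assumptions, negative and NaN inputs become 0, values inside the range pass through unchanged, and anything larger (including infinity) is clamped to 64512.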

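I have only skimmed the Strata code. It appears to resemble UE4's Material Layers, but is mainly applied to material projection, blending and lighting for Nanite geometry. Strata has its own materials, material nodes, shading model, visualization modes and shader modules. In the current EA release, however, it is still experimental and has many limitations. The main files involving Strata are: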

  • Strata.h/cpp
  • StrataMaterial.h/cpp
  • StrataDefinitions.h
  • MaterialExpressionStrata.h
  • Strata.ush
  • BasePassPixelShader.usf
  • DeferredLightPixelShaders.usf
  • Code related to the scene rendering pipeline and lighting.

Interested readers can study the relevant source code themselves.

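This article covered UE5's editor features, Nanite, Lumen and related rendering techniques. Because UE5 changes so much, it cannot cover every technical point. Beyond what is discussed here, a great deal remains untouched, and interested readers will need to explore it in the UE source code themselves.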

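In the UE5 EA stage, both Nanite and Lumen still have many flaws: Nanite supports only static objects; Lumen suffers from noise and light leaking; TSR flickers and blurs; shadow precision is insufficient (see the figure below); and a large number of legacy features are unsupported.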

[Figure: object blurring and shadow artifacts that appear when the camera is close enough to an object.]

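Although UE5 currently has many flaws, it is a sapling bathed in sunshine and rain. With Epic Games' careful cultivation it will, in time, grow into a towering, leafy tree that shelters every industry connected to the Unreal Engine. UE5 really No.1!!!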

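  • Thanks to the authors of all the references. Some images come from the references and the web; they will be removed on request if they infringe.
  • This series is the author's original work and is published only on cnblogs (部落格園). Sharing links to this article is welcome, but reposting without permission is not allowed!
  • The series is to be continued; for the complete table of contents, see the content outline.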

  • Unreal Engine Source
  • Rendering and Graphics
  • Materials
  • Graphics Programming
  • New Rendering Features
  • Lumen Global Illumination and Reflections
  • Lumen Technical Details
  • Behind the scenes of “Lumen in the Land of Nanite” | Unreal Engine 5
  • Unreal Engine 5 Early Access Release Notes
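  • A First Look at Unreal Engine 5 (初探虛幻引擎5)
  • What Do You Think of Unreal Engine 5 Early Access? (如何評價 Unreal Engine 5 Early-Access?)
  • The Optimization Techniques Behind UE5 Nanite and Lumen (UE5 Nanite和Lumen背後的優化技術)
  • Clipmaps in Practice in Open Worlds (Clipmap 在開放世界中的實戰應用)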
  • Family of Graph and Hypergraph Partitioning Software
  • An Analysis of the UE5 Lumen Implementation (UE5 Lumen實作分析)
  • GPU-Driven Rendering Pipeline
  • Optimizing the Graphics Pipeline with Compute
  • DynamicOcclusionWithSignedDistanceFields
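  • UE4 Hardware Ray Tracing Compared with UE5 Lumen (UE4硬體光追對比UE5 Lumen)
  • An Introduction to the Principles of UE5 Lumen (UE5 Lumen原理介紹)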
  • Lumen | Inside Unreal
  • Intel Embree
  • Embree Overview
  • Silhouette Partitioning for Height Field Ray Tracing (Tomas Sakalauskas, Vilnius University)
  • Ray Tracing Height Fields
  • Accelerating the ray tracing of height fields
  • Brief Analysis of UE5 Rendering Pipeline
  • Bin packing problem
  • Interactive Indirect Illumination Using Voxel Cone Tracing
  • Voxel Cone Tracing and Sparse Voxel Octree for Real-time Global Illumination
  • Comparing 3D-Clipmaps and Sparse Voxel Octrees for voxel based conetracing
  • Practical Real-Time Voxel-Based Global Illumination for Current GPUs
  • Lecture 9: Real-Time Global Illumination (Screen Space)
