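Contents
- 11.1 Overview
- 11.2 RDG Basics
- 11.2.1 RDG Basic Types
- 11.2.2 RDG Resources
- 11.2.3 RDG Pass
- 11.2.4 FRDGBuilder
- 11.3 RDG Mechanisms
- 11.3.1 Overview of RDG Mechanisms
- 11.3.2 FRDGBuilder::AddPass
- 11.3.3 FRDGBuilder::Compile
- 11.3.4 FRDGBuilder::Execute
- 11.3.5 Summary of RDG Mechanisms
- 11.4 RDG Development
- 11.4.1 Creating RDG Resources
- 11.4.2 Registering External Resources
- 11.4.3 Extracting Resources
- 11.4.4 Adding Passes
- 11.4.5 Creating an FRDGBuilder
- 11.4.6 RDG Debugging
- 11.5 Summary
- 11.5.1 Questions for Reflection
- Special Notes
- References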
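11.1 Overview
RDG stands for Rendering Dependency Graph. Introduced in UE 4.22, it is a new rendering subsystem: a scheduling system based on a directed acyclic graph (DAG) that performs whole-frame optimization of the render pipeline.
It leverages modern graphics APIs (DirectX 12, Vulkan, and Metal 2) to improve performance through automatic asynchronous compute scheduling and more efficient memory and barrier management.
Legacy graphics APIs (DirectX 11, OpenGL) require the driver to invoke complex heuristics to decide when and how to perform critical scheduling operations on the GPU, such as flushing caches, managing and reusing memory, and performing layout transitions. Because these interfaces are immediate-mode in nature, handling every edge case requires complex bookkeeping and state tracking, which ultimately hurts performance and prevents parallelism.
Modern graphics APIs (DirectX 12, Vulkan, and Metal 2) differ from the legacy ones by shifting the burden of low-level GPU management onto the application. The application can then use its high-level knowledge of the render pipeline to drive scheduling, which improves performance and simplifies the rendering stack.
The idea behind RDG is not to execute passes on the GPU immediately. Instead, it first collects every pass that needs to be rendered, then compiles and executes the graph in dependency order, applying various culling and optimization steps along the way.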
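The whole-frame knowledge held in the dependency graph, combined with the capabilities of modern graphics APIs, lets RDG perform complex scheduling work behind the scenes:
- Automatic scheduling and fencing of asynchronous compute passes.
- Aliasing memory between resources that are active during disjoint intervals of the frame.
- Starting barriers and layout transitions early to avoid pipeline stalls.
In addition, RDG uses the dependency graph to provide rich validation during pass setup, automatically catching problems that affect functionality and performance and thereby improving the development workflow.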
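The deferred "collect first, then cull and execute" idea can be sketched with a toy graph. This is only an illustration under stated assumptions: `ToyPass`, `ToyGraph`, and `MarkOutput` are hypothetical names and not part of UE, resources are plain integer ids, and passes are assumed to be recorded with producers before consumers so that recording order is already a valid execution order.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy types for illustration only; these are NOT UE's RDG classes.
struct ToyPass {
    std::string name;
    std::vector<int> reads;         // resource ids consumed by this pass
    std::vector<int> writes;        // resource ids produced by this pass
    std::function<void()> execute;  // deferred work, run only if the pass survives culling
};

class ToyGraph {
public:
    // Recording a pass only stores it; nothing runs yet.
    void AddPass(ToyPass pass) { passes_.push_back(std::move(pass)); }

    // Declares a resource that must exist after the graph has run.
    void MarkOutput(int resource) { outputs_.push_back(resource); }

    // Walks the passes backwards, keeps those that contribute to a needed
    // resource, then runs the survivors in recording order.
    // Returns the names of the passes that actually executed.
    std::vector<std::string> Execute() {
        std::vector<bool> keep(passes_.size(), false);
        std::vector<int> needed = outputs_;
        for (std::size_t i = passes_.size(); i-- > 0;) {
            const ToyPass& pass = passes_[i];
            bool contributes = false;
            for (int written : pass.writes) {
                for (int wanted : needed) {
                    if (written == wanted) { contributes = true; }
                }
            }
            if (!contributes) { continue; }  // culled: nobody needs its outputs
            keep[i] = true;
            needed.insert(needed.end(), pass.reads.begin(), pass.reads.end());
        }
        std::vector<std::string> executed;
        for (std::size_t i = 0; i < passes_.size(); ++i) {
            if (!keep[i]) { continue; }
            if (passes_[i].execute) { passes_[i].execute(); }
            executed.push_back(passes_[i].name);
        }
        return executed;
    }

private:
    std::vector<ToyPass> passes_;
    std::vector<int> outputs_;
};
```

For example, if a "Depth" pass writes resource 1, an "Unused" pass reads 1 and writes 2, a "Lighting" pass reads 1 and writes 3, and only resource 3 is marked as an output, then `Execute()` runs "Depth" and "Lighting" and skips "Unused". The real RDG additionally handles barriers, memory aliasing, and async compute, none of which this sketch attempts.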
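RDG is not a concept or technique original to UE. As early as GDC 2017, Frostbite had already implemented and deployed its Frame Graph technology. Frame Graph aims to isolate the engine's rendering features (Feature) and upper-level rendering logic (Renderer) from the lower-level resources (Shader, RenderContext, graphics API, and so on), enabling further decoupling and optimization, the most important of which are multithreaded and parallel rendering.
A FrameGraph is a high-level representation of render passes and resources and contains all the information used in a frame. Ordering and dependencies can be specified between passes; the figure below shows one example:
Figure: the ordering and dependency graph of deferred rendering implemented with a frame graph in the Frostbite engine.
It is no exaggeration to say that UE's RDG is a customized implementation built on the Frame Graph idea. By UE 4.26, RDG had been widely adopted: modules including scene rendering, post-processing, and ray tracing use RDG in place of calling RHI commands directly.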
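This article covers the following aspects of UE's RDG:
- The basic concepts and types of RDG.
- How to use RDG.
- The internal mechanisms and principles of RDG.
11.2 RDG Basics
This chapter first describes the main types, concepts, and interfaces involved in RDG.
11.2.1 RDG Basic Types
RDG's basic types and interfaces are mainly concentrated in RenderGraphUtils.h and RenderGraphDefinitions.h. A partial walkthrough follows: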
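// Engine\Source\Runtime\RenderCore\Public\RenderGraphDefinitions.h
// RDG pass flags.
enum class ERDGPassFlags : uint8
{
None = 0, // Used by the AddPass overloads that take no parameters.
Raster = 1 << 0, // The pass uses rasterization on the graphics pipe.
Compute = 1 << 1, // The pass uses compute on the graphics pipe.
AsyncCompute = 1 << 2, // The pass uses compute on the async compute pipe.
Copy = 1 << 3, // The pass uses copy commands on the graphics pipe.
NeverCull = 1 << 4, // Never culled by graph optimization; used for special passes.
SkipRenderPass = 1 << 5, // Skips BeginRenderPass/EndRenderPass and leaves them to the user. Only valid when combined with Raster. Disables pass merging.
UntrackedAccess = 1 << 6, // The pass accesses raw RHI resources which may be registered with RDG, but all resources are kept in their current state. This flag prevents the graph from scheduling split barriers across the pass; any split is deferred until after the pass executes. Resources may not change state during pass execution. Affects barrier performance. Cannot be combined with AsyncCompute.
Readback = Copy | NeverCull, // The pass uses copy commands but writes to a staging resource.
CommandMask = Raster | Compute | AsyncCompute | Copy, // Mask of the flags denoting the kinds of RHI commands submitted to a pass.
ScopeMask = NeverCull | UntrackedAccess // Mask of the flags that can be used by a pass flag scope.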
};
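// Buffer flags.
enum class ERDGBufferFlags : uint8
{
None = 0, // No flags.
MultiFrame = 1 << 0 // Persists across multiple frames.
};
// Texture flags.
enum class ERDGTextureFlags : uint8
{
None = 0,
MultiFrame = 1 << 0, // Persists across multiple frames.
MaintainCompression = 1 << 1, // Prevents metadata decompression on this texture.
};
// UAV flags.
enum class ERDGUnorderedAccessViewFlags : uint8
{
None = 0,
SkipBarrier = 1 << 0 // Skips the barrier.
};
// Parent resource type.
enum class ERDGParentResourceType : uint8
{
Texture,
Buffer,
MAX
};
// View type.
enum class ERDGViewType : uint8
{
TextureUAV, // Texture UAV (for writing data).
TextureSRV, // Texture SRV (for reading data).
BufferUAV, // Buffer UAV (for writing data).
BufferSRV, // Buffer SRV (for reading data).
MAX
};
// Specifies the texture metadata plane when creating a view.
enum class ERDGTextureMetaDataAccess : uint8
{
None = 0, // The primary plane is used with default compression behavior.
CompressedSurface, // The primary plane is used without decompressing it.
Depth, // The depth plane is used with default compression behavior.
Stencil, // The stencil plane is used with default compression behavior.
HTile, // The HTile plane.
FMask, // The FMask plane.
CMask // The CMask plane.
};
// Simple C++ object allocator that tracks and destroys objects allocated with the MemStack allocator.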
class FRDGAllocator final
{
public:
FRDGAllocator();
~FRDGAllocator();
// Allocates raw memory.
FORCEINLINE void* Alloc(uint32 SizeInBytes, uint32 AlignInBytes)
{
return MemStack.Alloc(SizeInBytes, AlignInBytes);
}
// Allocates POD memory without tracking a destructor.
template <typename PODType>
FORCEINLINE PODType* AllocPOD()
{
return reinterpret_cast<PODType*>(Alloc(sizeof(PODType), alignof(PODType)));
}
// Allocates a C++ object with destructor tracking.
template <typename ObjectType, typename... TArgs>
FORCEINLINE ObjectType* AllocObject(TArgs&&... Args)
{
TTrackedAlloc<ObjectType>* TrackedAlloc = new(MemStack) TTrackedAlloc<ObjectType>(Forward<TArgs&&>(Args)...);
check(TrackedAlloc);
TrackedAllocs.Add(TrackedAlloc);
return TrackedAlloc->Get();
}
// Allocates a C++ object without destructor tracking. (Dangerous; use with care.)
template <typename ObjectType, typename... TArgs>
FORCEINLINE ObjectType* AllocNoDestruct(TArgs&&... Args)
{
return new (MemStack) ObjectType(Forward<TArgs&&>(Args)...);
}
// Releases all allocated memory.
void ReleaseAll();
private:
class FTrackedAlloc
{
public:
virtual ~FTrackedAlloc() = default;
};
template <typename ObjectType>
class TTrackedAlloc : public FTrackedAlloc
{
public:
template <typename... TArgs>
FORCEINLINE TTrackedAlloc(TArgs&&... Args) : Object(Forward<TArgs&&>(Args)...) {}
FORCEINLINE ObjectType* Get() { return &Object; }
private:
ObjectType Object;
};
// The allocator.
FMemStackBase MemStack;
// All tracked allocated objects.
TArray<FTrackedAlloc*, SceneRenderingAllocator> TrackedAllocs;
};
// Engine\Source\Runtime\RenderCore\Public\RenderGraphUtils.h
// Clears unused graph resources.
extern RENDERCORE_API void ClearUnusedGraphResourcesImpl(const FShaderParameterBindings& ShaderBindings, ...);
(......)
// Registers an external texture, optionally with a fallback instance.
FRDGTextureRef RegisterExternalTextureWithFallback(FRDGBuilder& GraphBuilder, ...);
inline FRDGTextureRef TryRegisterExternalTexture(FRDGBuilder& GraphBuilder, ...);
inline FRDGBufferRef TryRegisterExternalBuffer(FRDGBuilder& GraphBuilder, ...);
// Utility class for compute shaders.
struct RENDERCORE_API FComputeShaderUtils
{
// The ideal group size is 8x8, which occupies at least one full wave on GCN and two warps on Nvidia.
static constexpr int32 kGolden2DGroupSize = 8;
static FIntVector GetGroupCount(const int32 ThreadCount, const int32 GroupSize);
// Dispatches a compute shader to an RHI command list with its parameters.
template<typename TShaderClass>
static void Dispatch(FRHIComputeCommandList& RHICmdList, const TShaderRef<TShaderClass>& ComputeShader, const typename TShaderClass::FParameters& Parameters, FIntVector GroupCount);
// Dispatches an indirect compute shader to an RHI command list with its parameters.
template<typename TShaderClass>
static void DispatchIndirect(FRHIComputeCommandList& RHICmdList, const TShaderRef<TShaderClass>& ComputeShader, const typename TShaderClass::FParameters& Parameters, FRHIVertexBuffer* IndirectArgsBuffer, uint32 IndirectArgOffset);
// Dispatches a compute shader to the render graph builder with its parameters.
template<typename TShaderClass>
static void AddPass(FRDGBuilder& GraphBuilder,FRDGEventName&& PassName,ERDGPassFlags PassFlags,const TShaderRef<TShaderClass>& ComputeShader,typename TShaderClass::FParameters* Parameters,FIntVector GroupCount);
(......)
// Clears a UAV.
static void ClearUAV(FRDGBuilder& GraphBuilder, FGlobalShaderMap* ShaderMap, FRDGBufferUAVRef UAV, uint32 ClearValue);
static void ClearUAV(FRDGBuilder& GraphBuilder, FGlobalShaderMap* ShaderMap, FRDGBufferUAVRef UAV, FVector4 ClearValue);
};
// Adds a texture copy pass.
void AddCopyTexturePass(FRDGBuilder& GraphBuilder, FRDGTextureRef InputTexture, FRDGTextureRef OutputTexture, const FRHICopyTextureInfo& CopyInfo);
(......)
// Adds a pass that copies to a resolve target.
void AddCopyToResolveTargetPass(FRDGBuilder& GraphBuilder, FRDGTextureRef InputTexture, FRDGTextureRef OutputTexture, const FResolveParams& ResolveParams);
// Passes that clear various kinds of resources.
void AddClearUAVPass(FRDGBuilder& GraphBuilder, FRDGBufferUAVRef BufferUAV, uint32 Value);
void AddClearUAVFloatPass(FRDGBuilder& GraphBuilder, FRDGBufferUAVRef BufferUAV, float Value);
void AddClearUAVPass(FRDGBuilder& GraphBuilder, FRDGTextureUAVRef TextureUAV, const FUintVector4& ClearValues);
void AddClearRenderTargetPass(FRDGBuilder& GraphBuilder, FRDGTextureRef Texture);
void AddClearDepthStencilPass(FRDGBuilder& GraphBuilder,FRDGTextureRef Texture,bool bClearDepth,float Depth,bool bClearStencil,uint8 Stencil);
void AddClearStencilPass(FRDGBuilder& GraphBuilder, FRDGTextureRef Texture);
(......)
// Adds a texture readback pass.
void AddEnqueueCopyPass(FRDGBuilder& GraphBuilder, FRHIGPUTextureReadback* Readback, FRDGTextureRef SourceTexture, FResolveRect Rect = FResolveRect());
// Adds a buffer readback pass.
void AddEnqueueCopyPass(FRDGBuilder& GraphBuilder, FRHIGPUBufferReadback* Readback, FRDGBufferRef SourceBuffer, uint32 NumBytes);
// Creates resources.
FRDGBufferRef CreateStructuredBuffer(FRDGBuilder& GraphBuilder, ...);
FRDGBufferRef CreateVertexBuffer(FRDGBuilder& GraphBuilder, ...);
// Adds a pass that takes no parameters.
template <typename ExecuteLambdaType>
void AddPass(FRDGBuilder& GraphBuilder, FRDGEventName&& Name, ExecuteLambdaType&& ExecuteLambda);
template <typename ExecuteLambdaType>
void AddPass(FRDGBuilder& GraphBuilder, ExecuteLambdaType&& ExecuteLambda);
// Other special passes.
void AddBeginUAVOverlapPass(FRDGBuilder& GraphBuilder);
void AddEndUAVOverlapPass(FRDGBuilder& GraphBuilder);
(......)
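To make the composite values in ERDGPassFlags concrete, here is a small standalone sketch of how such bit flags combine and are tested. It assumes a reduced copy of the enum (`PassFlags`) and hypothetical helpers (`HasAnyFlags`, `HasAllFlags`, `IsCullable`); UE ships its own enum-flag utilities, and this block does not use or reproduce them.

```cpp
#include <cassert>
#include <cstdint>

// Reduced copy of the flag values from the listing above, for illustration only.
enum class PassFlags : std::uint8_t {
    None = 0,
    Raster = 1 << 0,
    Compute = 1 << 1,
    AsyncCompute = 1 << 2,
    Copy = 1 << 3,
    NeverCull = 1 << 4,
    Readback = Copy | NeverCull,                           // composite flag
    CommandMask = Raster | Compute | AsyncCompute | Copy,  // "what kind of commands" bits
};

constexpr PassFlags operator|(PassFlags a, PassFlags b) {
    return static_cast<PassFlags>(static_cast<std::uint8_t>(a) | static_cast<std::uint8_t>(b));
}

constexpr PassFlags operator&(PassFlags a, PassFlags b) {
    return static_cast<PassFlags>(static_cast<std::uint8_t>(a) & static_cast<std::uint8_t>(b));
}

// Hypothetical helper: true if any bit of `mask` is set in `value`.
constexpr bool HasAnyFlags(PassFlags value, PassFlags mask) {
    return (value & mask) != PassFlags::None;
}

// Hypothetical helper: true if every bit of `mask` is set in `value`.
constexpr bool HasAllFlags(PassFlags value, PassFlags mask) {
    return (value & mask) == mask;
}

// Hypothetical helper: a pass may be culled unless it carries NeverCull.
constexpr bool IsCullable(PassFlags value) {
    return !HasAnyFlags(value, PassFlags::NeverCull);
}

// Compile-time checks of the relationships described in the enum comments.
static_assert(HasAllFlags(PassFlags::Readback, PassFlags::Copy), "Readback implies Copy");
static_assert(!IsCullable(PassFlags::Readback), "Readback passes are never culled");
```

Masking with `CommandMask` strips the scheduling hints and leaves only the command-type bits, so `PassFlags::Readback & PassFlags::CommandMask` yields `PassFlags::Copy`.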
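11.2.2 RDG Resources
RDG resources do not use RHI resources directly. Instead, they wrap a reference to an RHI resource, with a dedicated wrapper for each resource type and additional information attached. Some of the RDG resource definitions are as follows:
// Engine\Source\Runtime\RenderCore\Public\RenderGraphResources.h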
class FRDGResource
{
public:
// The copy constructor is deleted.
FRDGResource(const FRDGResource&) = delete;
virtual ~FRDGResource() = default;
//////////////////////////////////////////////////////////////////////////
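// The following interfaces may only be called during RDG pass execution.
// Marks this resource as actually used by a pass, so that unnecessary pass dependencies can be detected.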
#if RDG_ENABLE_DEBUG
virtual void MarkResourceAsUsed();
#else
inline void MarkResourceAsUsed() {}
#endif
// Gets the RHI resource referenced by this RDG resource.
FRHIResource* GetRHI() const
{
ValidateRHIAccess();
return ResourceRHI;
}
//////////////////////////////////////////////////////////////////////////
protected:
FRDGResource(const TCHAR* InName);
// Assigns this resource as a simple passthrough container for an RHI resource.
void SetPassthroughRHI(FRHIResource* InResourceRHI)
{
ResourceRHI = InResourceRHI;
#if RDG_ENABLE_DEBUG
DebugData.bAllowRHIAccess = true;
DebugData.bPassthrough = true;
#endif
}
bool IsPassthrough() const
{
#if RDG_ENABLE_DEBUG
return DebugData.bPassthrough;
#else
return false;
#endif
}
/** Verify that the RHI resource can be accessed at a pass execution. */
void ValidateRHIAccess() const
{
#if RDG_ENABLE_DEBUG
checkf(DebugData.bAllowRHIAccess,
TEXT("Accessing the RHI resource of %s at this time is not allowed. If you hit this check in pass, ")
TEXT("that is due to this resource not being referenced in the parameters of your pass."),
Name);
#endif
}
FRHIResource* GetRHIUnchecked() const
{
return ResourceRHI;
}
// The referenced RHI resource.
FRHIResource* ResourceRHI = nullptr;
private:
// Debug data.
#if RDG_ENABLE_DEBUG
class FDebugData
{
private:
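// Tracks at runtime whether the resource is actually used by the pass lambda, to detect unnecessary resource dependencies on passes.
bool bIsActuallyUsedByPass = false;
// Tracks whether the underlying RHI resource is allowed to be accessed during pass execution.
bool bAllowRHIAccess = false;
// If true, the resource is not attached to any builder and exists as a dummy container for staging code to RDG.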
bool bPassthrough = false;
} DebugData;
#endif
};
class FRDGUniformBuffer : public FRDGResource
{
public:
// Gets the RHI resource.
FRHIUniformBuffer* GetRHI() const
{
return static_cast<FRHIUniformBuffer*>(FRDGResource::GetRHI());
}
(......)
protected:
template <typename TParameterStruct>
explicit FRDGUniformBuffer(TParameterStruct* InParameters, const TCHAR* InName)
: FRDGResource(InName)
, ParameterStruct(InParameters)
, bGlobal(ParameterStruct.HasStaticSlot()) {}
private:
// The parameter struct.
const FRDGParameterStruct ParameterStruct;
// The RHI resource.
TRefCountPtr<FRHIUniformBuffer> UniformBufferRHI;
// The RDG handle.
FRDGUniformBufferHandle Handle;
// Whether the buffer is bound globally or locally.
uint8 bGlobal : 1;
};
// Templated RDG uniform buffer class.
template <typename ParameterStructType>
class TRDGUniformBuffer : public FRDGUniformBuffer
{
public:
const TRDGParameterStruct<ParameterStructType>& GetParameters() const;
TUniformBufferRef<ParameterStructType> GetRHIRef() const;
const ParameterStructType* operator->() const;
(......)
};
// A render graph resource whose allocation lifetime is tracked by the graph. It may have child resources that reference it (e.g. views).
class FRDGParentResource : public FRDGResource
{
public:
// The parent resource type.
const ERDGParentResourceType Type;
bool IsExternal() const;
protected:
FRDGParentResource(const TCHAR* InName, ERDGParentResourceType InType);
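// Whether this is an external resource.
uint8 bExternal : 1;
// Whether this resource has been extracted.
uint8 bExtracted : 1;
// Whether this resource needs acquire / discard.
uint8 bTransient : 1;
// Whether this resource is the last owner of its allocation.
uint8 bLastOwner : 1;
// Whether this resource will be culled.
uint8 bCulled : 1;
// Whether this resource is used by an async compute pass.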
uint8 bUsedByAsyncComputePass : 1;
private:
// Reference count.
uint16 ReferenceCount = 0;
// The initial and final states of the resource as assigned by the user, if known.
ERHIAccess AccessInitial = ERHIAccess::Unknown;
ERHIAccess AccessFinal = ERHIAccess::Unknown;
FRDGPassHandle AcquirePass;
FRDGPassHandle FirstPass;
FRDGPassHandle LastPass;
(......)
};
// Descriptor used to create a render texture.
struct RENDERCORE_API FRDGTextureDesc
{
static FRDGTextureDesc Create2D(...);
static FRDGTextureDesc Create2DArray(...);
static FRDGTextureDesc Create3D(...);
static FRDGTextureDesc CreateCube(...);
static FRDGTextureDesc CreateCubeArray(...);
bool IsTexture2D() const;
bool IsTexture3D() const;
bool IsTextureCube() const;
bool IsTextureArray() const;
bool IsMipChain() const;
bool IsMultisample() const;
FIntVector GetSize() const;
// The subresource layout.
FRDGTextureSubresourceLayout GetSubresourceLayout() const;
bool IsValid() const;
// The clear value.
FClearValueBinding ClearValue;
ETextureDimension Dimension = ETextureDimension::Texture2D;
// Texture creation flags.
ETextureCreateFlags Flags = TexCreate_None;
// Pixel format.
EPixelFormat Format = PF_Unknown;
// Extent of the texture in x and y.
FIntPoint Extent = FIntPoint(1, 1);
// Depth of a 3D texture.
uint16 Depth = 1;
uint16 ArraySize = 1;
// Number of mip levels.
uint8 NumMips = 1;
// Number of samples.
uint8 NumSamples = 1;
};
// Translates a pooled render target descriptor into an RDG texture descriptor.
inline FRDGTextureDesc Translate(const FPooledRenderTargetDesc& InDesc, ERenderTargetTexture InTexture = ERenderTargetTexture::Targetable);
// Translates an RDG texture descriptor into a pooled render target descriptor.
inline FPooledRenderTargetDesc Translate(const FRDGTextureDesc& InDesc);
// A pooled texture.
class RENDERCORE_API FRDGPooledTexture
{
public:
// Descriptor.
const FRDGTextureDesc Desc;
// Reference counting.
uint32 GetRefCount() const;
uint32 AddRef() const;
uint32 Release() const;
private:
FRDGPooledTexture(FRHITexture* InTexture, const FRDGTextureDesc& InDesc, const FUnorderedAccessViewRHIRef& FirstMipUAV);
// Initializes the cached UAVs.
void InitViews(const FUnorderedAccessViewRHIRef& FirstMipUAV);
void Finalize();
void Reset();
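// The corresponding RHI texture.
FRHITexture* Texture = nullptr;
// The RDG texture that currently owns this pooled texture.
FRDGTexture* Owner = nullptr;
// The subresource layout.
FRDGTextureSubresourceLayout Layout;
// The subresource state.
FRDGTextureSubresourceState State;
// UAVs / SRVs cached for the RHI texture.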
TArray<FUnorderedAccessViewRHIRef, TInlineAllocator<1>> MipUAVs;
TArray<TPair<FRHITextureSRVCreateInfo, FShaderResourceViewRHIRef>, TInlineAllocator<1>> SRVs;
FUnorderedAccessViewRHIRef HTileUAV;
FShaderResourceViewRHIRef HTileSRV;
FUnorderedAccessViewRHIRef StencilUAV;
FShaderResourceViewRHIRef FMaskSRV;
FShaderResourceViewRHIRef CMaskSRV;
mutable uint32 RefCount = 0;
};
// RDG texture.
class RENDERCORE_API FRDGTexture final : public FRDGParentResource
{
public:
// Creates a passthrough texture suitable for filling RHI uniform buffers with RDG parameters, for passes not yet ported to RDG.
static FRDGTextureRef GetPassthrough(const TRefCountPtr<IPooledRenderTarget>& PooledRenderTarget);
// Descriptor and flags.
const FRDGTextureDesc Desc;
const ERDGTextureFlags Flags;
//////////////////////////////////////////////////////////////////////////
//! The following methods may only be called during pass execution.
IPooledRenderTarget* GetPooledRenderTarget() const
FRHITexture* GetRHI() const
//////////////////////////////////////////////////////////////////////////
FRDGTextureSubresourceLayout GetSubresourceLayout() const;
FRDGTextureSubresourceRange GetSubresourceRange() const;
FRDGTextureSubresourceRange GetSubresourceRangeSRV() const;
private:
FRDGTexture(const TCHAR* InName, const FRDGTextureDesc& InDesc, ERDGTextureFlags InFlags, ERenderTargetTexture InRenderTargetTexture);
void SetRHI(FPooledRenderTarget* PooledRenderTarget, FRDGTextureRef& OutPreviousOwner);
void Finalize();
FRHITexture* GetRHIUnchecked() const;
bool IsLastOwner() const;
FRDGTextureSubresourceState& GetState();
const ERenderTargetTexture RenderTargetTexture;
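// The layout used to facilitate subresource transitions.
FRDGTextureSubresourceLayout Layout;
// The next texture that will own the PooledTexture allocation during execution.
FRDGTextureHandle NextOwner;
// The handle registered with the builder.
FRDGTextureHandle Handle;
// The pooled texture.
IPooledRenderTarget* PooledRenderTarget = nullptr;
FRDGPooledTexture* PooledTexture = nullptr;
// State pointer cached from the pooled texture.
FRDGTextureSubresourceState* State = nullptr;
// Strictly valid only while a strong reference is held.
TRefCountPtr<IPooledRenderTarget> Allocation;
// Tracks merged subresource states while the graph is being built.
FRDGTextureTransientSubresourceStateIndirect MergeState;
// Tracks the producer pass of each subresource while the graph is being built.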
TRDGTextureSubresourceArray<FRDGPassHandle> LastProducers;
};
// A pooled buffer.
class RENDERCORE_API FRDGPooledBuffer
{
public:
const FRDGBufferDesc Desc;
FRHIUnorderedAccessView* GetOrCreateUAV(FRDGBufferUAVDesc UAVDesc);
FRHIShaderResourceView* GetOrCreateSRV(FRDGBufferSRVDesc SRVDesc);
FRHIVertexBuffer* GetVertexBufferRHI() const;
FRHIIndexBuffer* GetIndexBufferRHI() const;
FRHIStructuredBuffer* GetStructuredBufferRHI() const;
uint32 GetRefCount() const;
uint32 AddRef() const;
uint32 Release() const;
(......)
private:
FRDGPooledBuffer(const FRDGBufferDesc& InDesc);
// Vertex / index / structured buffers.
FVertexBufferRHIRef VertexBuffer;
FIndexBufferRHIRef IndexBuffer;
FStructuredBufferRHIRef StructuredBuffer;
// UAV/SRV.
TMap<FRDGBufferUAVDesc, FUnorderedAccessViewRHIRef, FDefaultSetAllocator, TUAVFuncs<FRDGBufferUAVDesc, FUnorderedAccessViewRHIRef>> UAVs;
TMap<FRDGBufferSRVDesc, FShaderResourceViewRHIRef, FDefaultSetAllocator, TSRVFuncs<FRDGBufferSRVDesc, FShaderResourceViewRHIRef>> SRVs;
void Reset();
void Finalize();
const TCHAR* Name = nullptr;
// The owner.
FRDGBufferRef Owner = nullptr;
FRDGSubresourceState State;
mutable uint32 RefCount = 0;
uint32 LastUsedFrame = 0;
};
// A buffer tracked by the render graph.
class RENDERCORE_API FRDGBuffer final : public FRDGParentResource
{
public:
const FRDGBufferDesc Desc;
const ERDGBufferFlags Flags;
//////////////////////////////////////////////////////////////////////////
//! The following methods may only be called during pass execution.
// Gets the RHI resources.
FRHIVertexBuffer* GetIndirectRHICallBuffer() const;
FRHIVertexBuffer* GetRHIVertexBuffer() const;
FRHIStructuredBuffer* GetRHIStructuredBuffer() const;
//////////////////////////////////////////////////////////////////////////
private:
FRDGBuffer(const TCHAR* InName, const FRDGBufferDesc& InDesc, ERDGBufferFlags InFlags);
// Sets the RHI resource.
void SetRHI(FRDGPooledBuffer* InPooledBuffer, FRDGBufferRef& OutPreviousOwner);
void Finalize();
FRDGSubresourceState& GetState() const;
// The RDG handle.
FRDGBufferHandle Handle;
// The last pass that produced this resource.
FRDGPassHandle LastProducer;
// The next owner.
FRDGBufferHandle NextOwner;
// The assigned pooled buffer.
FRDGPooledBuffer* PooledBuffer = nullptr;
// The subresource state.
FRDGSubresourceState* State = nullptr;
TRefCountPtr<FRDGPooledBuffer> Allocation;
FRDGSubresourceState* MergeState = nullptr;
};
(......)
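The RDG system wraps essentially every RHI resource so that it can further control and manage them, precisely tracking their lifetimes, reference relationships, and debug information. This in turn lets it optimize and cull resources to improve rendering performance.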
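The access-validation idea in FRDGResource (GetRHI is only legal while a pass that declared the resource is executing) can be sketched as follows. This is a simplified illustration, not UE code: `ToyRHIResource`, `ToyGraphResource`, `BeginPassAccess`, and `EndPassAccess` are hypothetical names, and an exception stands in for the `checkf` that the source uses under RDG_ENABLE_DEBUG.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for an underlying RHI resource.
struct ToyRHIResource {
    int id;
};

// Hypothetical wrapper: holds a non-owning pointer to the underlying resource
// and refuses to hand it out unless access has been explicitly enabled.
class ToyGraphResource {
public:
    explicit ToyGraphResource(std::string name) : name_(std::move(name)) {}

    // Non-copyable, like the wrapper in the listing above.
    ToyGraphResource(const ToyGraphResource&) = delete;
    ToyGraphResource& operator=(const ToyGraphResource&) = delete;

    const std::string& GetName() const { return name_; }

    // Returns the underlying resource, or throws if no pass currently has access.
    ToyRHIResource* GetRHI() const {
        if (!allowAccess_) {
            throw std::logic_error("Accessing the RHI resource of " + name_ +
                                   " is not allowed outside a pass that references it.");
        }
        return rhi_;
    }

    // Called by the (hypothetical) graph right before a pass that declared this resource runs.
    void BeginPassAccess(ToyRHIResource* rhi) {
        rhi_ = rhi;
        allowAccess_ = true;
    }

    // Called by the (hypothetical) graph right after that pass finishes.
    void EndPassAccess() { allowAccess_ = false; }

private:
    std::string name_;
    ToyRHIResource* rhi_ = nullptr;
    bool allowAccess_ = false;
};
```

Gating access this way turns "forgot to list the resource in the pass parameters" from a silent ordering bug into an immediate, named failure, which is the purpose the original check message describes.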
11.2.3 RDG Pass
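The RDG pass module involves concepts such as barriers, resource transitions, and RDG passes:
// Engine\Source\Runtime\RHI\Public\RHI.h
// Opaque data structure used to represent a pending resource transition in the RHI.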
struct FRHITransition
{
public:
template <typename T>
inline T* GetPrivateData()
{
uintptr_t Addr = Align(uintptr_t(this + 1), GRHITransitionPrivateData_AlignInBytes);
return reinterpret_cast<T*>(Addr);
}
template <typename T>
inline const T* GetPrivateData() const
{
return const_cast<FRHITransition*>(this)->GetPrivateData<T>();
}
private:
FRHITransition(const FRHITransition&) = delete;
FRHITransition(FRHITransition&&) = delete;
FRHITransition(ERHIPipeline SrcPipelines, ERHIPipeline DstPipelines);
~FRHITransition();
// Gets the total allocation size.
static uint64 GetTotalAllocationSize();
// Gets the alignment in bytes.
static uint64 GetAlignment();
// Marks the begin of the transition.
inline void MarkBegin(ERHIPipeline Pipeline) const
{
int8 Mask = int8(Pipeline);
int8 PreviousValue = FPlatformAtomics::InterlockedAnd(&State, ~Mask);
if (PreviousValue == Mask)
{
Cleanup();
}
}
// Marks the end of the transition.
inline void MarkEnd(ERHIPipeline Pipeline) const
{
int8 Mask = int8(Pipeline) << int32(ERHIPipeline::Num);
int8 PreviousValue = FPlatformAtomics::InterlockedAnd(&State, ~Mask);
if (PreviousValue == Mask)
{
Cleanup();
}
}
// Cleans up the transition, including the RHI transition and its allocated memory.
inline void Cleanup() const;
mutable int8 State;
#if DO_CHECK
mutable ERHIPipeline AllowedSrc;
mutable ERHIPipeline AllowedDst;
#endif
#if ENABLE_RHI_VALIDATION
// Fence.
RHIValidation::FFence* Fence = nullptr;
// Pending begin operations.
RHIValidation::FOperationsList PendingOperationsBegin;
// Pending end operations.
RHIValidation::FOperationsList PendingOperationsEnd;
#endif
};
// Engine\Source\Runtime\RenderCore\Public\RenderGraphPass.h
// RDG barrier batch.
class RENDERCORE_API FRDGBarrierBatch
{
public:
FRDGBarrierBatch(const FRDGBarrierBatch&) = delete;
bool IsSubmitted() const
FString GetName() const;
protected:
FRDGBarrierBatch(const FRDGPass* InPass, const TCHAR* InName);
void SetSubmitted();
ERHIPipeline GetPipeline() const
private:
bool bSubmitted = false;
// Graphics or AsyncCompute.
ERHIPipeline Pipeline;
#if RDG_ENABLE_DEBUG
const FRDGPass* Pass;
const TCHAR* Name;
#endif
};
// Begin barrier batch.
class RENDERCORE_API FRDGBarrierBatchBegin final : public FRDGBarrierBatch
{
public:
FRDGBarrierBatchBegin(const FRDGPass* InPass, const TCHAR* InName, TOptional<ERHIPipeline> InOverridePipelineForEnd = {});
~FRDGBarrierBatchBegin();
// Adds a resource transition to the batch.
void AddTransition(FRDGParentResourceRef Resource, const FRHITransitionInfo& Info);
const FRHITransition* GetTransition() const;
bool IsTransitionValid() const;
void SetUseCrossPipelineFence();
// Submits the barriers / resource transitions.
void Submit(FRHIComputeCommandList& RHICmdList);
private:
TOptional<ERHIPipeline> OverridePipelineToEnd;
bool bUseCrossPipelineFence = false;
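// The transition stored after submission. It is reset to null when the batch is ended.
const FRHITransition* Transition = nullptr;
// Array of asynchronous resource transitions to perform.
TArray<FRHITransitionInfo, TInlineAllocator<1, SceneRenderingAllocator>> Transitions;
#if RDG_ENABLE_DEBUG
// Array of RDG resources matching the Transitions array; for debugging only.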
TArray<FRDGParentResource*, SceneRenderingAllocator> Resources;
#endif
};
// End barrier batch.
class RENDERCORE_API FRDGBarrierBatchEnd final : public FRDGBarrierBatch
{
public:
FRDGBarrierBatchEnd(const FRDGPass* InPass, const TCHAR* InName);
~FRDGBarrierBatchEnd();
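// Reserves memory.
void ReserveMemory(uint32 ExpectedDependencyCount);
// Inserts a dependency on a begin batch. A begin batch can be inserted into more than one end batch.
void AddDependency(FRDGBarrierBatchBegin* BeginBatch);
// Submits the resource transitions.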
void Submit(FRHIComputeCommandList& RHICmdList);
private:
// The dependent begin batches whose transitions this end batch completes.
TArray<FRDGBarrierBatchBegin*, TInlineAllocator<1, SceneRenderingAllocator>> Dependencies;
};
// Base class of RDG passes.
class RENDERCORE_API FRDGPass
{
public:
FRDGPass(FRDGEventName&& InName, FRDGParameterStruct InParameterStruct, ERDGPassFlags InFlags);
FRDGPass(const FRDGPass&) = delete;
virtual ~FRDGPass() = default;
// Pass data accessors.
const TCHAR* GetName() const;
FORCEINLINE const FRDGEventName& GetEventName() const;
FORCEINLINE ERDGPassFlags GetFlags() const;
FORCEINLINE ERHIPipeline GetPipeline() const;
// RDG pass parameters.
FORCEINLINE FRDGParameterStruct GetParameters() const;
FORCEINLINE FRDGPassHandle GetHandle() const;
bool IsMergedRenderPassBegin() const;
bool IsMergedRenderPassEnd() const;
bool SkipRenderPassBegin() const;
bool SkipRenderPassEnd() const;
bool IsAsyncCompute() const;
bool IsAsyncComputeBegin() const;
bool IsAsyncComputeEnd() const;
bool IsGraphicsFork() const;
bool IsGraphicsJoin() const;
// Producer handles.
const FRDGPassHandleArray& GetProducers() const;
// Cross-pipeline producer.
FRDGPassHandle GetCrossPipelineProducer() const;
// Cross-pipeline consumer.
FRDGPassHandle GetCrossPipelineConsumer() const;
// Fork pass.
FRDGPassHandle GetGraphicsForkPass() const;
// Join pass.
FRDGPassHandle GetGraphicsJoinPass() const;
#if RDG_CPU_SCOPES
FRDGCPUScopes GetCPUScopes() const;
#endif
#if RDG_GPU_SCOPES
FRDGGPUScopes GetGPUScopes() const;
#endif
private:
// Prologue barriers.
FRDGBarrierBatchBegin& GetPrologueBarriersToBegin(FRDGAllocator& Allocator);
FRDGBarrierBatchEnd& GetPrologueBarriersToEnd(FRDGAllocator& Allocator);
// Epilogue barriers.
FRDGBarrierBatchBegin& GetEpilogueBarriersToBeginForGraphics(FRDGAllocator& Allocator);
FRDGBarrierBatchBegin& GetEpilogueBarriersToBeginForAsyncCompute(FRDGAllocator& Allocator);
FRDGBarrierBatchBegin& GetEpilogueBarriersToBeginFor(FRDGAllocator& Allocator, ERHIPipeline PipelineForEnd);
//////////////////////////////////////////////////////////////////////////
//! User Methods to Override
// Execution implementation.
virtual void ExecuteImpl(FRHIComputeCommandList& RHICmdList) = 0;
//////////////////////////////////////////////////////////////////////////
// Execute.
void Execute(FRHIComputeCommandList& RHICmdList);
// Pass data.
const FRDGEventName Name;
const FRDGParameterStruct ParameterStruct;
const ERDGPassFlags Flags;
const ERHIPipeline Pipeline;
FRDGPassHandle Handle;
// Pass flags.
union
{
struct
{
uint32 bSkipRenderPassBegin : 1;
uint32 bSkipRenderPassEnd : 1;
uint32 bAsyncComputeBegin : 1;
uint32 bAsyncComputeEnd : 1;
uint32 bAsyncComputeEndExecute : 1;
uint32 bGraphicsFork : 1;
uint32 bGraphicsJoin : 1;
uint32 bUAVAccess : 1;
IF_RDG_ENABLE_DEBUG(uint32 bFirstTextureAllocated : 1);
};
uint32 PackedBits = 0;
};
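// Handle of the latest cross-pipeline producer.
FRDGPassHandle CrossPipelineProducer;
// Handle of the earliest cross-pipeline consumer.
FRDGPassHandle CrossPipelineConsumer;
// (AsyncCompute only) The graphics passes that are the fork / join of the async compute interval this pass belongs to.
FRDGPassHandle GraphicsForkPass;
FRDGPassHandle GraphicsJoinPass;
// The passes that handle the prologue / epilogue barriers meant for this pass.
FRDGPassHandle PrologueBarrierPass;
FRDGPassHandle EpilogueBarrierPass;
// List of producer passes.
FRDGPassHandleArray Producers;
// Texture state.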
struct FTextureState
{
FRDGTextureTransientSubresourceState State;
FRDGTextureTransientSubresourceStateIndirect MergeState;
uint16 ReferenceCount = 0;
};
// Buffer state.
struct FBufferState
{
FRDGSubresourceState State;
FRDGSubresourceState* MergeState = nullptr;
uint16 ReferenceCount = 0;
};
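// Maps textures / buffers to information on how they are used in the pass.
TSortedMap<FRDGTexture*, FTextureState, SceneRenderingAllocator> TextureStates;
TSortedMap<FRDGBuffer*, FBufferState, SceneRenderingAllocator> BufferStates;
// Lists of pass parameters scheduled to begin / end during execution of this pass.
TArray<FRDGPass*, TInlineAllocator<1, SceneRenderingAllocator>> ResourcesToBegin;
TArray<FRDGPass*, TInlineAllocator<1, SceneRenderingAllocator>> ResourcesToEnd;
// Textures to acquire *after* the pass completes and *before* discards. Acquires apply to all allocated textures.
TArray<FRHITexture*, SceneRenderingAllocator> TexturesToAcquire;
// Textures to discard *after* the pass completes and *after* acquires. Discards apply only to textures marked transient, and only when the last alias of the texture uses the automatic discard behavior (to support a cleaner hand-off to the user or back to the pool).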
TArray<FRHITexture*, SceneRenderingAllocator> TexturesToDiscard;
FRDGBarrierBatchBegin* PrologueBarriersToBegin = nullptr;
FRDGBarrierBatchEnd* PrologueBarriersToEnd = nullptr;
FRDGBarrierBatchBegin* EpilogueBarriersToBeginForGraphics = nullptr;
FRDGBarrierBatchBegin* EpilogueBarriersToBeginForAsyncCompute = nullptr;
EAsyncComputeBudget AsyncComputeBudget = EAsyncComputeBudget::EAll_4;
};
// RDG pass that executes a lambda.
template <typename ParameterStructType, typename ExecuteLambdaType>
class TRDGLambdaPass : public FRDGPass
{
(......)
TRDGLambdaPass(FRDGEventName&& InName, const ParameterStructType* InParameterStruct, ERDGPassFlags InPassFlags, ExecuteLambdaType&& InExecuteLambda);
private:
// Execution implementation.
void ExecuteImpl(FRHIComputeCommandList& RHICmdList) override
{
check(!kSupportsRaster || RHICmdList.IsImmediate());
// Invokes the lambda instance.
ExecuteLambda(static_cast<TRHICommandList&>(RHICmdList));
}
// The lambda instance.
ExecuteLambdaType ExecuteLambda;
};
// Lambda pass with empty shader parameters.
template <typename ExecuteLambdaType>
class TRDGEmptyLambdaPass : public TRDGLambdaPass<FEmptyShaderParameters, ExecuteLambdaType>
{
public:
TRDGEmptyLambdaPass(FRDGEventName&& InName, ERDGPassFlags InPassFlags, ExecuteLambdaType&& InExecuteLambda);
private:
FEmptyShaderParameters EmptyShaderParameters;
};
// Sentinel pass used for the prologue / epilogue passes.
class FRDGSentinelPass final : public FRDGPass
{
public:
FRDGSentinelPass(FRDGEventName&& Name);
private:
void ExecuteImpl(FRHIComputeCommandList&) override;
FEmptyShaderParameters EmptyShaderParameters;
};
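As the listing shows, the RDG pass is fairly complex and is one of the core types of the RDG system, involving consumers, producers, transition dependencies, various resource states, and the associated processing. RDG has the following pass types: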
classDiagram-v2
FRDGPass <|-- TRDGLambdaPass
FRDGPass <|-- FRDGSentinelPass
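RDG passes and render passes do not map one-to-one: several RDG passes may be merged into a single render pass, as covered in later sections. The most complex aspects of an RDG pass are multithreading, resource state transitions, and dependency handling. This section leaves them aside; later sections discuss them in detail.
11.2.4 FRDGBuilder
FRDGBuilder is the heart and engine of the RDG system, and also its steward: it collects render passes and parameters, compiles passes and data, resolves resource dependencies, culls and optimizes, and provides the execution interface. Its declaration is as follows: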
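The relationship between FRDGPass and TRDGLambdaPass (an abstract base with a virtual ExecuteImpl, and a template subclass that stores the user lambda for later) can be sketched in isolation. The names below (`ToyCommandList`, `ToyPassBase`, `ToyLambdaPass`, `MakePass`) are hypothetical stand-ins, and the command list is reduced to a vector of strings; this shows only the type-erasure pattern, not UE's implementation.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an RHI command list: just records command strings.
struct ToyCommandList {
    std::vector<std::string> commands;
};

// Abstract pass: the graph stores passes through this interface only.
class ToyPassBase {
public:
    explicit ToyPassBase(std::string name) : name_(std::move(name)) {}
    virtual ~ToyPassBase() = default;

    const std::string& GetName() const { return name_; }

    // Non-virtual entry point, so shared work could wrap ExecuteImpl.
    void Execute(ToyCommandList& cmd) { ExecuteImpl(cmd); }

private:
    virtual void ExecuteImpl(ToyCommandList& cmd) = 0;
    std::string name_;
};

// Concrete pass that owns a lambda and keeps a pointer to its parameter struct.
template <typename ParameterStruct, typename Lambda>
class ToyLambdaPass final : public ToyPassBase {
public:
    ToyLambdaPass(std::string name, const ParameterStruct* params, Lambda&& lambda)
        : ToyPassBase(std::move(name)), params_(params), lambda_(std::move(lambda)) {}

    const ParameterStruct* GetParameters() const { return params_; }

private:
    // Runs the stored lambda; nothing executes at construction time.
    void ExecuteImpl(ToyCommandList& cmd) override { lambda_(cmd); }

    const ParameterStruct* params_;
    Lambda lambda_;
};

// Hypothetical factory that erases the lambda type behind the base class.
template <typename ParameterStruct, typename Lambda>
std::unique_ptr<ToyPassBase> MakePass(std::string name, const ParameterStruct* params, Lambda lambda) {
    return std::make_unique<ToyLambdaPass<ParameterStruct, Lambda>>(
        std::move(name), params, std::move(lambda));
}
```

Because every lambda has a distinct type, the template subclass is what lets a single container of base-class pointers hold arbitrarily many different pass bodies; the lambda runs only when `Execute` is called on the stored pass.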
class RENDERCORE_API FRDGBuilder
{
public:
FRDGBuilder(FRHICommandListImmediate& InRHICmdList, FRDGEventName InName = {}, const char* UnaccountedCSVStat = kDefaultUnaccountedCSVStat);
FRDGBuilder(const FRDGBuilder&) = delete;
// Finds an external texture; returns null if it is not found.
FRDGTextureRef FindExternalTexture(FRHITexture* Texture) const;
FRDGTextureRef FindExternalTexture(IPooledRenderTarget* ExternalPooledTexture, ERenderTargetTexture Texture) const;
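// Registers an external pooled render target with RDG so that the graph can track it. A pooled render target may contain two RHI textures: MSAA and non-MSAA.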
FRDGTextureRef RegisterExternalTexture(
const TRefCountPtr<IPooledRenderTarget>& ExternalPooledTexture,
ERenderTargetTexture Texture = ERenderTargetTexture::ShaderResource,
ERDGTextureFlags Flags = ERDGTextureFlags::None);
FRDGTextureRef RegisterExternalTexture(
const TRefCountPtr<IPooledRenderTarget>& ExternalPooledTexture,
const TCHAR* NameIfNotRegistered,
ERenderTargetTexture RenderTargetTexture = ERenderTargetTexture::ShaderResource,
ERDGTextureFlags Flags = ERDGTextureFlags::None);
// Registers an external buffer with RDG so that the graph can track it.
FRDGBufferRef RegisterExternalBuffer(const TRefCountPtr<FRDGPooledBuffer>& ExternalPooledBuffer, ERDGBufferFlags Flags = ERDGBufferFlags::None);
FRDGBufferRef RegisterExternalBuffer(const TRefCountPtr<FRDGPooledBuffer>& ExternalPooledBuffer, ERDGBufferFlags Flags, ERHIAccess AccessFinal);
FRDGBufferRef RegisterExternalBuffer(
const TRefCountPtr<FRDGPooledBuffer>& ExternalPooledBuffer,
const TCHAR* NameIfNotRegistered,
ERDGBufferFlags Flags = ERDGBufferFlags::None);
// Resource creation interfaces.
FRDGTextureRef CreateTexture(const FRDGTextureDesc& Desc, const TCHAR* Name, ERDGTextureFlags Flags = ERDGTextureFlags::None);
FRDGBufferRef CreateBuffer(const FRDGBufferDesc& Desc, const TCHAR* Name, ERDGBufferFlags Flags = ERDGBufferFlags::None);
FRDGTextureSRVRef CreateSRV(const FRDGTextureSRVDesc& Desc);
FRDGBufferSRVRef CreateSRV(const FRDGBufferSRVDesc& Desc);
FORCEINLINE FRDGBufferSRVRef CreateSRV(FRDGBufferRef Buffer, EPixelFormat Format);
FRDGTextureUAVRef CreateUAV(const FRDGTextureUAVDesc& Desc, ERDGUnorderedAccessViewFlags Flags = ERDGUnorderedAccessViewFlags::None);
FORCEINLINE FRDGTextureUAVRef CreateUAV(FRDGTextureRef Texture, ERDGUnorderedAccessViewFlags Flags = ERDGUnorderedAccessViewFlags::None);
FRDGBufferUAVRef CreateUAV(const FRDGBufferUAVDesc& Desc, ERDGUnorderedAccessViewFlags Flags = ERDGUnorderedAccessViewFlags::None);
FORCEINLINE FRDGBufferUAVRef CreateUAV(FRDGBufferRef Buffer, EPixelFormat Format, ERDGUnorderedAccessViewFlags Flags = ERDGUnorderedAccessViewFlags::None);
template <typename ParameterStructType>
TRDGUniformBufferRef<ParameterStructType> CreateUniformBuffer(ParameterStructType* ParameterStruct);
// Allocates memory whose lifetime is managed by RDG.
void* Alloc(uint32 SizeInBytes, uint32 AlignInBytes);
template <typename PODType>
PODType* AllocPOD();
template <typename ObjectType, typename... TArgs>
ObjectType* AllocObject(TArgs&&... Args);
template <typename ParameterStructType>
ParameterStructType* AllocParameters();
// Adds a pass with parameters and a lambda.
template <typename ParameterStructType, typename ExecuteLambdaType>
FRDGPassRef AddPass(FRDGEventName&& Name, const ParameterStructType* ParameterStruct, ERDGPassFlags Flags, ExecuteLambdaType&& ExecuteLambda);
// Adds a pass with only a lambda and no parameters.
template <typename ExecuteLambdaType>
FRDGPassRef AddPass(FRDGEventName&& Name, ERDGPassFlags Flags, ExecuteLambdaType&& ExecuteLambda);
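// At the end of builder execution, extracts the pooled texture into the given pointer. For RDG-created resources, this extends the lifetime of the GPU resource until execution, at which point the pointer is filled. If AccessFinal is specified, the texture is transitioned to that state; otherwise it is transitioned to kDefaultAccessFinal.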
void QueueTextureExtraction(FRDGTextureRef Texture, TRefCountPtr<IPooledRenderTarget>* OutPooledTexturePtr);
void QueueTextureExtraction(FRDGTextureRef Texture, TRefCountPtr<IPooledRenderTarget>* OutPooledTexturePtr, ERHIAccess AccessFinal);
// At the end of builder execution, extracts the buffer into the given pointer.
void QueueBufferExtraction(FRDGBufferRef Buffer, TRefCountPtr<FRDGPooledBuffer>* OutPooledBufferPtr);
void QueueBufferExtraction(FRDGBufferRef Buffer, TRefCountPtr<FRDGPooledBuffer>* OutPooledBufferPtr, ERHIAccess AccessFinal);
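// Pre-allocates a resource. Only for RDG-created resources: forces immediate allocation of the underlying pooled resource, effectively promoting it to an external resource. This increases memory pressure but allows the pooled resource to be queried with GetPooled{Texture, Buffer}. Mainly useful for incrementally porting code to RDG.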
void PreallocateTexture(FRDGTextureRef Texture);
void PreallocateBuffer(FRDGBufferRef Buffer);
// Gets the underlying resource immediately. Only allowed for registered or pre-allocated resources.
const TRefCountPtr<IPooledRenderTarget>& GetPooledTexture(FRDGTextureRef Texture) const;
const TRefCountPtr<FRDGPooledBuffer>& GetPooledBuffer(FRDGBufferRef Buffer) const;
// Sets the access state the resource should be in after execution.
void SetTextureAccessFinal(FRDGTextureRef Texture, ERHIAccess Access);
void SetBufferAccessFinal(FRDGBufferRef Buffer, ERHIAccess Access);
void RemoveUnusedTextureWarning(FRDGTextureRef Texture);
void RemoveUnusedBufferWarning(FRDGBufferRef Buffer);
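// Executes the queued passes, managing render target setup (RHI render passes), resource transitions, and queued texture extractions.
void Execute();
// Per-frame update of the render graph resource pools.
static void TickPoolElements();
// The RHI command list used by RDG.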
FRHICommandListImmediate& RHICmdList;
private:
static const ERHIAccess kDefaultAccessInitial = ERHIAccess::Unknown;
static const ERHIAccess kDefaultAccessFinal = ERHIAccess::SRVMask;
static const char* const kDefaultUnaccountedCSVStat;
// The async compute command list used by RDG.
FRHIAsyncComputeCommandListImmediate& RHICmdListAsyncCompute;
FRDGAllocator Allocator;
const FRDGEventName BuilderName;
ERDGPassFlags OverridePassFlags(const TCHAR* PassName, ERDGPassFlags Flags, bool bAsyncComputeSupported);
FORCEINLINE FRDGPassHandle GetProloguePassHandle() const;
FORCEINLINE FRDGPassHandle GetEpiloguePassHandle() const;
// Registries of RDG objects.
FRDGPassRegistry Passes;
FRDGTextureRegistry Textures;
FRDGBufferRegistry Buffers;
FRDGViewRegistry Views;
FRDGUniformBufferRegistry UniformBuffers;
// Passes that have been marked as culled.
FRDGPassBitArray PassesToCull;
// Passes that have no parameters.
FRDGPassBitArray PassesWithEmptyParameters;
// Maps external resources to their registered render graph counterparts, for de-duplication.
TSortedMap<FRHITexture*, FRDGTexture*, TInlineAllocator<4, SceneRenderingAllocator>> ExternalTextures;
TSortedMap<const FRDGPooledBuffer*, FRDGBuffer*, TInlineAllocator<4, SceneRenderingAllocator>> ExternalBuffers;
FRDGPass* ProloguePass = nullptr;
FRDGPass* EpiloguePass = nullptr;
// Lists of resources queued for extraction.
TArray<TPair<FRDGTextureRef, TRefCountPtr<IPooledRenderTarget>*>, TInlineAllocator<4, SceneRenderingAllocator>> ExtractedTextures;
TArray<TPair<FRDGBufferRef, TRefCountPtr<FRDGPooledBuffer>*>, TInlineAllocator<4, SceneRenderingAllocator>> ExtractedBuffers;
// Texture state used for intermediate operations; kept here to avoid re-allocation.
FRDGTextureTransientSubresourceStateIndirect ScratchTextureState;
EAsyncComputeBudget AsyncComputeBudgetScope = EAsyncComputeBudget::EAll_4;
// Compiles the graph.
void Compile();
// Cleans up.
void Clear();
// Begins the RHI resource for a uniform buffer, texture, buffer, or view.
void BeginResourceRHI(FRDGUniformBuffer* UniformBuffer);
void BeginResourceRHI(FRDGPassHandle, FRDGTexture* Texture);
void BeginResourceRHI(FRDGPassHandle, FRDGTextureSRV* SRV);
void BeginResourceRHI(FRDGPassHandle, FRDGTextureUAV* UAV);
void BeginResourceRHI(FRDGPassHandle, FRDGBuffer* Buffer);
void BeginResourceRHI(FRDGPassHandle, FRDGBufferSRV* SRV);
void BeginResourceRHI(FRDGPassHandle, FRDGBufferUAV* UAV);
// Ends the RHI resource for a texture or buffer.
void EndResourceRHI(FRDGPassHandle, FRDGTexture* Texture, uint32 ReferenceCount);
void EndResourceRHI(FRDGPassHandle, FRDGBuffer* Buffer, uint32 ReferenceCount);
// Pass interfaces.
void SetupPassInternal(FRDGPass* Pass, FRDGPassHandle PassHandle, ERHIPipeline PassPipeline);
void SetupPass(FRDGPass* Pass);
void SetupEmptyPass(FRDGPass* Pass);
void ExecutePass(FRDGPass* Pass);
// Pass prologue and epilogue.
void ExecutePassPrologue(FRHIComputeCommandList& RHICmdListPass, FRDGPass* Pass);
void ExecutePassEpilogue(FRHIComputeCommandList& RHICmdListPass, FRDGPass* Pass);
// Collects pass resources and barriers.
void CollectPassResources(FRDGPassHandle PassHandle);
void CollectPassBarriers(FRDGPassHandle PassHandle, FRDGPassHandle& LastUntrackedPassHandle);
// Adds a pass dependency.
void AddPassDependency(FRDGPassHandle ProducerHandle, FRDGPassHandle ConsumerHandle);
// Adds epilogue transitions.
void AddEpilogueTransition(FRDGTextureRef Texture, FRDGPassHandle LastUntrackedPassHandle);
void AddEpilogueTransition(FRDGBufferRef Buffer, FRDGPassHandle LastUntrackedPassHandle);
// Adds regular transitions.
void AddTransition(FRDGPassHandle PassHandle, FRDGTextureRef Texture, const FRDGTextureTransientSubresourceStateIndirect& StateAfter, FRDGPassHandle LastUntrackedPassHandle);
void AddTransition(FRDGPassHandle PassHandle, FRDGBufferRef Buffer, FRDGSubresourceState StateAfter, FRDGPassHandle LastUntrackedPassHandle);
void AddTransitionInternal(
FRDGParentResource* Resource,
FRDGSubresourceState StateBefore,
FRDGSubresourceState StateAfter,
FRDGPassHandle LastUntrackedPassHandle,
const FRHITransitionInfo& TransitionInfo);
// Gets the render pass info.
FRHIRenderPassInfo GetRenderPassInfo(const FRDGPass* Pass) const;
// Allocates a subresource state.
FRDGSubresourceState* AllocSubresource(const FRDGSubresourceState& Other);
#if RDG_ENABLE_DEBUG
void VisualizePassOutputs(const FRDGPass* Pass);
void ClobberPassOutputs(const FRDGPass* Pass);
#endif
};
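As the driver of the RDG system, FRDGBuilder stores data, handles state transitions, automatically manages resource lifetimes and barriers, culls unused resources, collects, compiles, and executes passes, and extracts textures and buffers. Its internal execution mechanism is fairly complex and is analyzed in detail in the sections that follow.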
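11.3 RDG Mechanism
This section explains how RDG works: its mechanism, process, and principles, as well as its advantages and characteristics for rendering.
Readers who only want to learn how to use RDG can skip this section and go straight to 11.4 RDG Development.
11.3.1 RDG Mechanism Overview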
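The Rendering Dependency Graph framework sets up lambda scopes, designed as passes, that issue GPU commands to the RHI through deferred execution. Passes are created with FRDGBuilder::AddPass(). Creating a pass requires shader parameters. These can be any shader parameters, but the framework is most interested in render graph resources.
The structure that holds all pass parameters should be allocated with FRDGBuilder::AllocParameters() to ensure a correct lifetime, because execution of the lambda is deferred.
A render graph resource created with FRDGBuilder::CreateTexture() or FRDGBuilder::CreateBuffer() only records the resource descriptor. The graph allocates the resource when it is needed. The render graph tracks the resource's lifetime, and releases and reuses its memory once no remaining pass references it.
All render graph resources used by a pass must be in the pass parameters given to FRDGBuilder::AddPass(), because the render graph needs to know which resources each pass uses.
Resources are only guaranteed to be allocated while the pass executes. They should therefore only be accessed inside the lambda scope of a pass created with FRDGBuilder::AddPass(). Failing to list some of the resources a pass uses can cause problems.
It is important not to reference more graph resources in the parameters than the pass needs, because this artificially extends what the graph knows about the lifetime of those resources. This can increase memory usage or prevent passes from overlapping in execution. One helper is ClearUnusedGraphResources(), which automatically clears resource references that the shader does not use. A warning is emitted if a resource is not used in a pass.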
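The lambda scope of a pass may execute at any time after FRDGBuilder::AddPass(). For debugging purposes, it can execute directly inside AddPass() in immediate mode. When an error occurs during pass execution, immediate mode gives you the call stack of the pass setup, which may contain the source of the error. Immediate mode can be enabled with the command-line switch
-rdgimmediate
or with the console variable
r.RDG.ImmediateMode=1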
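Pooled resource textures (FPooledRenderTarget) produced by legacy code can be used in the render graph through FRDGBuilder::RegisterExternalTexture().
With the pass dependency information, execution may prioritize different hardware goals, such as memory pressure or GPU concurrency of pass execution. The execution order of passes is therefore not guaranteed. The only guarantee is that work on intermediate resources is performed exactly as immediate mode would perform it on the GPU.
Render graph passes should not modify the state of external data structures, because this can lead to edge cases depending on the execution order of the passes. Render graph resources that must survive execution (for example the viewport back buffer or the TAA history) should be extracted with FRDGBuilder::QueueTextureExtraction(). If a pass is detected to be useless for producing any resource scheduled for extraction or for modifying an external texture, it may not be executed at all, and a warning is issued.
Unless there is a strong technical reason (such as stereo rendering, which renders multiple views at once for VR), do not bundle work on different resources into the same pass. Doing so creates more dependencies on the bundle of work, while each individual piece may only need a subset of them. The scheduler might otherwise have been able to overlap part of that work with other GPU work. Bundling can also keep allocated transient resources alive longer, potentially raising the peak memory pressure of the whole frame.
Although AddPass() only expects a lambda scope with deferred execution, that does not mean you have to write one yourself. Most cases are already covered by simpler toolbox helpers such as FComputeShaderUtils and FPixelShaderUtils.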
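The deferred-execution model described above can be illustrated with a very small stand-alone sketch. It is not UE code: MiniGraphBuilder, RecordFrame, and the event strings are hypothetical names invented for this illustration, and the sketch ignores resources, culling, and barriers entirely. It only shows that pass lambdas normally run in Execute(), and run inside AddPass() when immediate mode is on.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of FRDGBuilder; none of these names are UE APIs.
class MiniGraphBuilder
{
public:
    explicit MiniGraphBuilder(bool bInImmediateMode)
        : bImmediateMode(bInImmediateMode) {}

    // Records a pass. The lambda normally runs later, inside Execute().
    // In immediate mode it runs right here, so a failure in the pass body
    // still has the setup call stack above it.
    void AddPass(std::string Name, std::function<void()> ExecuteLambda)
    {
        if (bImmediateMode)
        {
            ExecuteLambda();
            return;
        }
        Passes.push_back(FPass{std::move(Name), std::move(ExecuteLambda)});
    }

    // Runs every recorded pass in submission order, then resets the graph.
    void Execute()
    {
        for (FPass& Pass : Passes)
        {
            Pass.ExecuteLambda();
        }
        Passes.clear();
    }

private:
    struct FPass
    {
        std::string Name;
        std::function<void()> ExecuteLambda;
    };

    bool bImmediateMode;
    std::vector<FPass> Passes;
};

// Records two passes and returns the order in which setup and execution happened.
std::vector<std::string> RecordFrame(bool bImmediateMode)
{
    MiniGraphBuilder Builder(bImmediateMode);
    std::vector<std::string> Events;

    Builder.AddPass("A", [&Events] { Events.push_back("exec A"); });
    Events.push_back("setup A done");
    Builder.AddPass("B", [&Events] { Events.push_back("exec B"); });
    Events.push_back("setup B done");

    Builder.Execute();
    return Events;
}
```

In deferred mode both setup events are recorded before either pass body runs. In immediate mode each body runs before its AddPass() call returns. This is also why anything a lambda captures has to outlive the AddPass() call, which is the reason for allocating parameters with FRDGBuilder::AllocParameters() in the real system.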
11.3.2 FRDGBuilder::AddPass
FRDGBuilder::AddPass adds a pass, consisting of pass parameters and a lambda, to the RDG system. Its logic is as follows:
// Engine\Source\Runtime\RenderCore\Public\RenderGraphBuilder.inl
template <typename ParameterStructType, typename ExecuteLambdaType>
FRDGPassRef FRDGBuilder::AddPass(FRDGEventName&& Name, const ParameterStructType* ParameterStruct, ERDGPassFlags Flags, ExecuteLambdaType&& ExecuteLambda)
{
using LambdaPassType = TRDGLambdaPass<ParameterStructType, ExecuteLambdaType>;
(......)
// Allocate the RDG pass instance.
FRDGPass* Pass = Allocator.AllocObject<LambdaPassType>(
MoveTemp(Name),
ParameterStruct,
OverridePassFlags(Name.GetTCHAR(), Flags, LambdaPassType::kSupportsAsyncCompute),
MoveTemp(ExecuteLambda));
// Add it to the pass list.
Passes.Insert(Pass);
// Set up the pass.
SetupPass(Pass);
return Pass;
}
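The logic of AddPass is fairly simple: it constructs an FRDGPass instance from the arguments, adds it to the pass list, and sets up the pass data. SetupPass is implemented as follows: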
void FRDGBuilder::SetupPass(FRDGPass* Pass)
{
// Get the pass data.
const FRDGParameterStruct PassParameters = Pass->GetParameters();
const FRDGPassHandle PassHandle = Pass->GetHandle();
const ERDGPassFlags PassFlags = Pass->GetFlags();
const ERHIPipeline PassPipeline = Pass->GetPipeline();
bool bPassUAVAccess = false;
// ---- Process texture states ----
Pass->TextureStates.Reserve(PassParameters.GetTextureParameterCount() + (PassParameters.HasRenderTargets() ? (MaxSimultaneousRenderTargets + 1) : 0));
// Enumerate all textures and process the state, data, and references of each one.
EnumerateTextureAccess(PassParameters, PassFlags, [&](FRDGViewRef TextureView, FRDGTextureRef Texture, ERHIAccess Access, FRDGTextureSubresourceRange Range)
{
const FRDGViewHandle NoUAVBarrierHandle = GetHandleIfNoUAVBarrier(TextureView);
const EResourceTransitionFlags TransitionFlags = GetTextureViewTransitionFlags(TextureView, Texture);
auto& PassState = Pass->TextureStates.FindOrAdd(Texture);
PassState.ReferenceCount++;
const bool bWholeTextureRange = Range.IsWholeResource(Texture->GetSubresourceLayout());
bool bWholePassState = IsWholeResource(PassState.State);
// Convert the pass state to subresource dimensionality if we've found a subresource range.
if (!bWholeTextureRange && bWholePassState)
{
InitAsSubresources(PassState.State, Texture->Layout);
bWholePassState = false;
}
const auto AddSubresourceAccess = [&](FRDGSubresourceState& State)
{
State.Access = MakeValidAccess(State.Access | Access);
State.Flags |= TransitionFlags;
State.NoUAVBarrierFilter.AddHandle(NoUAVBarrierHandle);
State.Pipeline = PassPipeline;
};
if (bWholePassState)
{
AddSubresourceAccess(GetWholeResource(PassState.State));
}
else
{
EnumerateSubresourceRange(PassState.State, Texture->Layout, Range, AddSubresourceAccess);
}
bPassUAVAccess |= EnumHasAnyFlags(Access, ERHIAccess::UAVMask);
});
// ---- Process buffer states ----
Pass->BufferStates.Reserve(PassParameters.GetBufferParameterCount());
// Enumerate all buffers and process the state, data, and references of each one.
EnumerateBufferAccess(PassParameters, PassFlags, [&](FRDGViewRef BufferView, FRDGBufferRef Buffer, ERHIAccess Access)
{
const FRDGViewHandle NoUAVBarrierHandle = GetHandleIfNoUAVBarrier(BufferView);
auto& PassState = Pass->BufferStates.FindOrAdd(Buffer);
PassState.ReferenceCount++;
PassState.State.Access = MakeValidAccess(PassState.State.Access | Access);
PassState.State.NoUAVBarrierFilter.AddHandle(NoUAVBarrierHandle);
PassState.State.Pipeline = PassPipeline;
bPassUAVAccess |= EnumHasAnyFlags(Access, ERHIAccess::UAVMask);
});
Pass->bUAVAccess = bPassUAVAccess;
const bool bEmptyParameters = !Pass->TextureStates.Num() && !Pass->BufferStates.Num();
PassesWithEmptyParameters.Add(bEmptyParameters);
// On the graphics pipeline, a pass can begin/end its own resources. Async compute is scheduled during compilation.
if (PassPipeline == ERHIPipeline::Graphics && !bEmptyParameters)
{
Pass->ResourcesToBegin.Add(Pass);
Pass->ResourcesToEnd.Add(Pass);
}
// Internal pass setup.
SetupPassInternal(Pass, PassHandle, PassPipeline);
}
Next, SetupPassInternal:
void FRDGBuilder::SetupPassInternal(FRDGPass* Pass, FRDGPassHandle PassHandle, ERHIPipeline PassPipeline)
{
// Initialize the fork/join and barrier pass handles to the pass's own handle.
Pass->GraphicsJoinPass = PassHandle;
Pass->GraphicsForkPass = PassHandle;
Pass->PrologueBarrierPass = PassHandle;
Pass->EpilogueBarrierPass = PassHandle;
(......)
// In immediate mode, for every pass except the epilogue pass:
if (GRDGImmediateMode && Pass != EpiloguePass)
{
// Simply redirect the merge states to the pass states, since the graph will not be compiled.
// Merge states of textures.
for (auto& TexturePair : Pass->TextureStates)
{
auto& PassState = TexturePair.Value;
const uint32 SubresourceCount = PassState.State.Num();
PassState.MergeState.SetNum(SubresourceCount);
for (uint32 Index = 0; Index < SubresourceCount; ++Index)
{
if (PassState.State[Index].Access != ERHIAccess::Unknown)
{
PassState.MergeState[Index] = &PassState.State[Index];
PassState.MergeState[Index]->SetPass(PassHandle);
}
}
}
// Merge states of buffers.
for (auto& BufferPair : Pass->BufferStates)
{
auto& PassState = BufferPair.Value;
PassState.MergeState = &PassState.State;
PassState.MergeState->SetPass(PassHandle);
}
FRDGPassHandle LastUntrackedPassHandle = GetProloguePassHandle();
// Collect pass resources.
CollectPassResources(PassHandle);
// Collect pass barriers.
CollectPassBarriers(PassHandle, LastUntrackedPassHandle);
// Execute the pass right away.
ExecutePass(Pass);
}
}
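In summary, AddPass constructs an RDG pass instance from its arguments, sets up the texture and buffer states of the pass, and then calls SetupPassInternal to initialize the pass's fork/join and barrier pass handles. In immediate mode it also redirects the texture and buffer merge states to the pass states and executes the pass right away.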
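The per-resource bookkeeping that SetupPass performs can be sketched without any engine types. The following is a minimal illustration, assuming a simplified access bitmask and string resource names. EAccess, FPassResourceState, FPassSetup, FResourceUse, and SetupPassSketch are all hypothetical names, and the validation done by MakeValidAccess, subresource ranges, and UAV barrier filters are deliberately left out.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified access bits standing in for ERHIAccess.
enum EAccess : uint32_t
{
    Access_None = 0,
    Access_SRV  = 1 << 0,
    Access_UAV  = 1 << 1,
    Access_RTV  = 1 << 2
};

// What a pass remembers about one resource it touches.
struct FPassResourceState
{
    uint32_t Access = Access_None;
    uint32_t ReferenceCount = 0;
};

struct FPassSetup
{
    std::map<std::string, FPassResourceState> States;
    bool bUAVAccess = false;
    bool bEmptyParameters = true;
};

// One resource reference found in the pass parameters.
struct FResourceUse
{
    std::string Resource;
    uint32_t Access;
};

// Mirrors the shape of the enumeration loops in SetupPass: find-or-add the
// state of each referenced resource, count the reference, OR in the access
// bits, and remember whether any UAV access was seen.
FPassSetup SetupPassSketch(const std::vector<FResourceUse>& Parameters)
{
    FPassSetup Setup;
    for (const FResourceUse& Use : Parameters)
    {
        FPassResourceState& State = Setup.States[Use.Resource];
        State.ReferenceCount++;
        State.Access |= Use.Access;
        Setup.bUAVAccess |= (Use.Access & Access_UAV) != 0;
    }
    Setup.bEmptyParameters = Setup.States.empty();
    return Setup;
}
```

A resource that appears twice in the parameters ends up with one state entry, a reference count of two, and the union of both access masks. A pass with no resource references is flagged as having empty parameters, which is the condition the compile step later uses to skip it.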
11.3.3 FRDGBuilder::Compile
The compile logic of FRDGBuilder is very complex and performs a lot of processing and optimization, as follows:
void FRDGBuilder::Compile()
{
uint32 RasterPassCount = 0;
uint32 AsyncComputePassCount = 0;
// Per-pass flag bit arrays.
FRDGPassBitArray PassesOnAsyncCompute(false, Passes.Num());
FRDGPassBitArray PassesOnRaster(false, Passes.Num());
FRDGPassBitArray PassesWithUntrackedOutputs(false, Passes.Num());
FRDGPassBitArray PassesToNeverCull(false, Passes.Num());
const FRDGPassHandle ProloguePassHandle = GetProloguePassHandle();
const FRDGPassHandle EpiloguePassHandle = GetEpiloguePassHandle();
const auto IsCrossPipeline = [&](FRDGPassHandle A, FRDGPassHandle B)
{
return PassesOnAsyncCompute[A] != PassesOnAsyncCompute[B];
};
const auto IsSortedBefore = [&](FRDGPassHandle A, FRDGPassHandle B)
{
return A < B;
};
const auto IsSortedAfter = [&](FRDGPassHandle A, FRDGPassHandle B)
{
return A > B;
};
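// Build producer/consumer dependencies across the graph and construct packed bit arrays of metadata, for better cache coherency when searching for passes that meet specific criteria.
// Search roots are also identified for culling. Passes with untracked RHI outputs (e.g. SHADER_PARAMETER_{BUFFER, TEXTURE}_UAV) cannot be culled, nor can any pass that writes to an external resource.
// Resource extraction extends the lifetime to the epilogue pass, which is always a root of the graph. The prologue and epilogue are helper passes and are therefore never culled.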
{
SCOPED_NAMED_EVENT(FRDGBuilder_Compile_Culling_Dependencies, FColor::Emerald);
// Adds a culling dependency.
const auto AddCullingDependency = [&](FRDGPassHandle& ProducerHandle, FRDGPassHandle PassHandle, ERHIAccess Access)
{
if (Access != ERHIAccess::Unknown)
{
if (ProducerHandle.IsValid())
{
// Add the pass dependency.
AddPassDependency(ProducerHandle, PassHandle);
}
// If the access is writable, store this pass as the new producer.
if (IsWritableAccess(Access))
{
ProducerHandle = PassHandle;
}
}
};
// Iterate over all passes and process the texture and buffer states of each one.
for (FRDGPassHandle PassHandle = Passes.Begin(); PassHandle != Passes.End(); ++PassHandle)
{
FRDGPass* Pass = Passes[PassHandle];
bool bUntrackedOutputs = Pass->GetParameters().HasExternalOutputs();
// Process all texture states of the pass.
for (auto& TexturePair : Pass->TextureStates)
{
FRDGTextureRef Texture = TexturePair.Key;
auto& LastProducers = Texture->LastProducers;
auto& PassState = TexturePair.Value.State;
const bool bWholePassState = IsWholeResource(PassState);
const bool bWholeProducers = IsWholeResource(LastProducers);
// The producer array needs to be at least as large as the pass state array.
if (bWholeProducers && !bWholePassState)
{
InitAsSubresources(LastProducers, Texture->Layout);
}
// Add culling dependencies.
for (uint32 Index = 0, Count = LastProducers.Num(); Index < Count; ++Index)
{
AddCullingDependency(LastProducers[Index], PassHandle, PassState[bWholePassState ? 0 : Index].Access);
}
bUntrackedOutputs |= Texture->bExternal;
}
// Process all buffer states of the pass.
for (auto& BufferPair : Pass->BufferStates)
{
FRDGBufferRef Buffer = BufferPair.Key;
AddCullingDependency(Buffer->LastProducer, PassHandle, BufferPair.Value.State.Access);
bUntrackedOutputs |= Buffer->bExternal;
}
// Process the remaining flags and data of the pass.
const ERDGPassFlags PassFlags = Pass->GetFlags();
const bool bAsyncCompute = EnumHasAnyFlags(PassFlags, ERDGPassFlags::AsyncCompute);
const bool bRaster = EnumHasAnyFlags(PassFlags, ERDGPassFlags::Raster);
const bool bNeverCull = EnumHasAnyFlags(PassFlags, ERDGPassFlags::NeverCull);
PassesOnRaster[PassHandle] = bRaster;
PassesOnAsyncCompute[PassHandle] = bAsyncCompute;
PassesToNeverCull[PassHandle] = bNeverCull;
PassesWithUntrackedOutputs[PassHandle] = bUntrackedOutputs;
AsyncComputePassCount += bAsyncCompute ? 1 : 0;
RasterPassCount += bRaster ? 1 : 0;
}
// The prologue and epilogue are marked as having untracked outputs; they handle importing and exporting external resources, respectively.
PassesWithUntrackedOutputs[ProloguePassHandle] = true;
PassesWithUntrackedOutputs[EpiloguePassHandle] = true;
// Add culling dependencies for extracted textures.
for (const auto& Query : ExtractedTextures)
{
FRDGTextureRef Texture = Query.Key;
for (FRDGPassHandle& ProducerHandle : Texture->LastProducers)
{
AddCullingDependency(ProducerHandle, EpiloguePassHandle, Texture->AccessFinal);
}
Texture->ReferenceCount++;
}
// Add culling dependencies for extracted buffers.
for (const auto& Query : ExtractedBuffers)
{
FRDGBufferRef Buffer = Query.Key;
AddCullingDependency(Buffer->LastProducer, EpiloguePassHandle, Buffer->AccessFinal);
Buffer->ReferenceCount++;
}
}
// -------- Pass culling --------
if (GRDGCullPasses)
{
TArray<FRDGPassHandle, TInlineAllocator<32, SceneRenderingAllocator>> PassStack;
// Initialize every pass as culled.
PassesToCull.Init(true, Passes.Num());
// Collect the root passes: those with untracked outputs or flagged as never-cull.
for (FRDGPassHandle PassHandle = Passes.Begin(); PassHandle != Passes.End(); ++PassHandle)
{
if (PassesWithUntrackedOutputs[PassHandle] || PassesToNeverCull[PassHandle])
{
PassStack.Add(PassHandle);
}
}
// Non-recursive, stack-based depth-first traversal that marks every pass reachable from a root as not culled.
while (PassStack.Num())
{
const FRDGPassHandle PassHandle = PassStack.Pop();
if (PassesToCull[PassHandle])
{
PassesToCull[PassHandle] = false;
PassStack.Append(Passes[PassHandle]->Producers);
#if STATS
--GRDGStatPassCullCount;
#endif
}
}
}
else // Pass culling is disabled: initialize every pass as not culled.
{
PassesToCull.Init(false, Passes.Num());
}
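// -------- Pass barriers --------
// Traverse the culled graph and compile barriers for each subresource. Some transitions are redundant, for example read-to-read.
// RDG uses a conservative merging heuristic; choosing not to merge does not necessarily mean a transition is performed.
// These are two distinct steps. Merged states track the interval between the first and last pass. Pass references are also accumulated onto each resource.
// This must happen after culling, because culled passes cannot contribute references.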
{
SCOPED_NAMED_EVENT(FRDGBuilder_Compile_Barriers, FColor::Emerald);
for (FRDGPassHandle PassHandle = Passes.Begin(); PassHandle != Passes.End(); ++PassHandle)
{
// Skip culled passes and passes without parameters.
if (PassesToCull[PassHandle] || PassesWithEmptyParameters[PassHandle])
{
continue;
}
// Merges subresource states.
const auto MergeSubresourceStates = [&](ERDGParentResourceType ResourceType, FRDGSubresourceState*& PassMergeState, FRDGSubresourceState*& ResourceMergeState, const FRDGSubresourceState& PassState)
{
// Skip merging for resources with an unknown access state.
if (PassState.Access == ERHIAccess::Unknown)
{
return;
}
if (!ResourceMergeState || !FRDGSubresourceState::IsMergeAllowed(ResourceType, *ResourceMergeState, PassState))
{
// A cross-pipeline, non-mergeable state change requires a new pass dependency for fencing.
if (ResourceMergeState && ResourceMergeState->Pipeline != PassState.Pipeline)
{
AddPassDependency(ResourceMergeState->LastPass, PassHandle);
}
// Allocate a new pending merge state and assign it to the pass state.
ResourceMergeState = AllocSubresource(PassState);
ResourceMergeState->SetPass(PassHandle);
}
else
{
// Merge the pass state into the merged state.
ResourceMergeState->Access |= PassState.Access;
ResourceMergeState->LastPass = PassHandle;
}
PassMergeState = ResourceMergeState;
};
const bool bAsyncComputePass = PassesOnAsyncCompute[PassHandle];
// Get the pass instance being processed.
FRDGPass* Pass = Passes[PassHandle];
// Process the texture states of the current pass.
for (auto& TexturePair : Pass->TextureStates)
{
FRDGTextureRef Texture = TexturePair.Key;
auto& PassState = TexturePair.Value;
// Accumulate the reference count.
Texture->ReferenceCount += PassState.ReferenceCount;
Texture->bUsedByAsyncComputePass |= bAsyncComputePass;
const bool bWholePassState = IsWholeResource(PassState.State);
const bool bWholeMergeState = IsWholeResource(Texture->MergeState);
// For simplicity, the dimensionality of the merge state and the pass state should match.
if (bWholeMergeState && !bWholePassState)
{
InitAsSubresources(Texture->MergeState, Texture->Layout);
}
else if (!bWholeMergeState && bWholePassState)
{
InitAsWholeResource(Texture->MergeState);
}
const uint32 SubresourceCount = PassState.State.Num();
PassState.MergeState.SetNum(SubresourceCount);
// Merge the subresource states.
for (uint32 Index = 0; Index < SubresourceCount; ++Index)
{
MergeSubresourceStates(ERDGParentResourceType::Texture, PassState.MergeState[Index], Texture->MergeState[Index], PassState.State[Index]);
}
}
// Process the buffer states of the current pass.
for (auto& BufferPair : Pass->BufferStates)
{
FRDGBufferRef Buffer = BufferPair.Key;
auto& PassState = BufferPair.Value;
Buffer->ReferenceCount += PassState.ReferenceCount;
Buffer->bUsedByAsyncComputePass |= bAsyncComputePass;
MergeSubresourceStates(ERDGParentResourceType::Buffer, PassState.MergeState, Buffer->MergeState, PassState.State);
}
}
}
// Process async compute passes.
if (AsyncComputePassCount > 0)
{
SCOPED_NAMED_EVENT(FRDGBuilder_Compile_AsyncCompute, FColor::Emerald);
FRDGPassBitArray PassesWithCrossPipelineProducer(false, Passes.Num());
FRDGPassBitArray PassesWithCrossPipelineConsumer(false, Passes.Num());
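// Traverse the active passes in execution order to find the latest cross-pipeline producer and the earliest cross-pipeline consumer of each pass. This narrows the search space later when building the async compute overlap regions.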
for (FRDGPassHandle PassHandle = Passes.Begin(); PassHandle != Passes.End(); ++PassHandle)
{
if (PassesToCull[PassHandle] || PassesWithEmptyParameters[PassHandle])
{
continue;
}
FRDGPass* Pass = Passes[PassHandle];
// Iterate over the producers and process the producer/consumer relationships.
for (FRDGPassHandle ProducerHandle : Pass->GetProducers())
{
const FRDGPassHandle ConsumerHandle = PassHandle;
if (!IsCrossPipeline(ProducerHandle, ConsumerHandle))
{
continue;
}
FRDGPass* Consumer = Pass;
FRDGPass* Producer = Passes[ProducerHandle];
// Find the earliest consumer on the other pipeline for the producer.
if (Producer->CrossPipelineConsumer.IsNull() || IsSortedBefore(ConsumerHandle, Producer->CrossPipelineConsumer))
{
Producer->CrossPipelineConsumer = PassHandle;
PassesWithCrossPipelineConsumer[ProducerHandle] = true;
}
// Find the latest producer on the other pipeline for the consumer.
if (Consumer->CrossPipelineProducer.IsNull() || IsSortedAfter(ProducerHandle, Consumer->CrossPipelineProducer))
{
Consumer->CrossPipelineProducer = ProducerHandle;
PassesWithCrossPipelineProducer[ConsumerHandle] = true;
}
}
}
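// Establish fork/join overlap regions for async compute, used for fencing and for resource allocation/deallocation. Async compute passes cannot allocate or release their resource references until the fork/join is complete, because the two pipelines run in parallel. All resource lifetimes on async compute are therefore extended to cover the whole async region.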
const auto IsCrossPipelineProducer = [&](FRDGPassHandle A)
{
return PassesWithCrossPipelineConsumer[A];
};
const auto IsCrossPipelineConsumer = [&](FRDGPassHandle A)
{
return PassesWithCrossPipelineProducer[A];
};
// Finds the cross-pipeline producer.
const auto FindCrossPipelineProducer = [&](FRDGPassHandle PassHandle)
{
FRDGPassHandle LatestProducerHandle = ProloguePassHandle;
FRDGPassHandle ConsumerHandle = PassHandle;
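// We want the latest producer on the other pipeline in order to establish a fork point. Since N resources may be consumed from N producer passes, only the last one matters.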
while (ConsumerHandle != Passes.Begin())
{
if (!PassesToCull[ConsumerHandle] && !IsCrossPipeline(ConsumerHandle, PassHandle) && IsCrossPipelineConsumer(ConsumerHandle))
{
const FRDGPass* Consumer = Passes[ConsumerHandle];
if (IsSortedAfter(Consumer->CrossPipelineProducer, LatestProducerHandle))
{
LatestProducerHandle = Consumer->CrossPipelineProducer;
}
}
--ConsumerHandle;
}
return LatestProducerHandle;
};
// Finds the cross-pipeline consumer.
const auto FindCrossPipelineConsumer = [&](FRDGPassHandle PassHandle)
{
check(PassHandle != EpiloguePassHandle);
FRDGPassHandle EarliestConsumerHandle = EpiloguePassHandle;
FRDGPassHandle ProducerHandle = PassHandle;
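// We want the earliest consumer on the other pipeline, because it establishes the join point between the pipelines. Since we may be producing for N consumers on the other pipeline, only the first one to execute matters.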
while (ProducerHandle != Passes.End())
{
if (!PassesToCull[ProducerHandle] && !IsCrossPipeline(ProducerHandle, PassHandle) && IsCrossPipelineProducer(ProducerHandle))
{
const FRDGPass* Producer = Passes[ProducerHandle];
if (IsSortedBefore(Producer->CrossPipelineConsumer, EarliestConsumerHandle))
{
EarliestConsumerHandle = Producer->CrossPipelineConsumer;
}
}
++ProducerHandle;
}
return EarliestConsumerHandle;
};
// Inserts a fork from a graphics pass to an async compute pass.
const auto InsertGraphicsToAsyncComputeFork = [&](FRDGPass* GraphicsPass, FRDGPass* AsyncComputePass)
{
FRDGBarrierBatchBegin& EpilogueBarriersToBeginForAsyncCompute = GraphicsPass->GetEpilogueBarriersToBeginForAsyncCompute(Allocator);
GraphicsPass->bGraphicsFork = 1;
EpilogueBarriersToBeginForAsyncCompute.SetUseCrossPipelineFence();
AsyncComputePass->bAsyncComputeBegin = 1;
AsyncComputePass->GetPrologueBarriersToEnd(Allocator).AddDependency(&EpilogueBarriersToBeginForAsyncCompute);
};
// Inserts a join from an async compute pass to a graphics pass.
const auto InsertAsyncToGraphicsComputeJoin = [&](FRDGPass* AsyncComputePass, FRDGPass* GraphicsPass)
{
FRDGBarrierBatchBegin& EpilogueBarriersToBeginForGraphics = AsyncComputePass->GetEpilogueBarriersToBeginForGraphics(Allocator);
AsyncComputePass->bAsyncComputeEnd = 1;
EpilogueBarriersToBeginForGraphics.SetUseCrossPipelineFence();
GraphicsPass->bGraphicsJoin = 1;
GraphicsPass->GetPrologueBarriersToEnd(Allocator).AddDependency(&EpilogueBarriersToBeginForGraphics);
};
FRDGPass* PrevGraphicsForkPass = nullptr;
FRDGPass* PrevGraphicsJoinPass = nullptr;
FRDGPass* PrevAsyncComputePass = nullptr;
// Iterate over all passes, extend resource lifetimes, and set up the fork and join points between graphics and async compute passes.
for (FRDGPassHandle PassHandle = Passes.Begin(); PassHandle != Passes.End(); ++PassHandle)
{
if (!PassesOnAsyncCompute[PassHandle] || PassesToCull[PassHandle])
{
continue;
}
FRDGPass* AsyncComputePass = Passes[PassHandle];
// Find the fork pass and the join pass.
const FRDGPassHandle GraphicsForkPassHandle = FindCrossPipelineProducer(PassHandle);
const FRDGPassHandle GraphicsJoinPassHandle = FindCrossPipelineConsumer(PassHandle);
AsyncComputePass->GraphicsForkPass = GraphicsForkPassHandle;
AsyncComputePass->GraphicsJoinPass = GraphicsJoinPassHandle;
FRDGPass* GraphicsForkPass = Passes[GraphicsForkPassHandle];
FRDGPass* GraphicsJoinPass = Passes[GraphicsJoinPassHandle];
// Extend the lifetime of resources used on async compute to the fork/join graphics passes.
GraphicsForkPass->ResourcesToBegin.Add(AsyncComputePass);
GraphicsJoinPass->ResourcesToEnd.Add(AsyncComputePass);
// When the fork pass changes, insert a fork from the graphics fork pass to this async compute pass.
if (PrevGraphicsForkPass != GraphicsForkPass)
{
InsertGraphicsToAsyncComputeFork(GraphicsForkPass, AsyncComputePass);
}
// When the join pass changes, insert a join from the previous async compute pass to the previous graphics join pass.
if (PrevGraphicsJoinPass != GraphicsJoinPass && PrevAsyncComputePass)
{
InsertAsyncToGraphicsComputeJoin(PrevAsyncComputePass, PrevGraphicsJoinPass);
}
PrevAsyncComputePass = AsyncComputePass;
PrevGraphicsForkPass = GraphicsForkPass;
PrevGraphicsJoinPass = GraphicsJoinPass;
}
// The last async compute pass in the graph needs to be manually joined back to the epilogue pass.
if (PrevAsyncComputePass)
{
InsertAsyncToGraphicsComputeJoin(PrevAsyncComputePass, EpiloguePass);
PrevAsyncComputePass->bAsyncComputeEndExecute = 1;
}
}
// Iterate over all graphics pipeline passes and merge raster passes with the same render targets into a single RHI render pass.
if (GRDGMergeRenderPasses && RasterPassCount > 0)
{
SCOPED_NAMED_EVENT(FRDGBuilder_Compile_RenderPassMerge, FColor::Emerald);
TArray<FRDGPassHandle, SceneRenderingAllocator> PassesToMerge;
FRDGPass* PrevPass = nullptr;
const FRenderTargetBindingSlots* PrevRenderTargets = nullptr;
const auto CommitMerge = [&]
{
if (PassesToMerge.Num())
{
const FRDGPassHandle FirstPassHandle = PassesToMerge[0];
const FRDGPassHandle LastPassHandle = PassesToMerge.Last();
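// Given an interval of passes to merge into a single render pass, [B, X, X, X, X, E], the begin pass (B) and the end pass (E) call BeginRenderPass and EndRenderPass, respectively.
// In addition, begin handles all prologue barriers for the whole merged interval and end handles all epilogue barriers. This avoids resource transitions inside the render pass and batches the transitions more efficiently.
// This assumes that dependencies between passes of the merge set have already been filtered out during traversal.
// (B) is the first pass in the merge sequence.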
{
FRDGPass* Pass = Passes[FirstPassHandle];
Pass->bSkipRenderPassEnd = 1;
Pass->EpilogueBarrierPass = LastPassHandle;
}
// (X) are the intermediate passes.
for (int32 PassIndex = 1, PassCount = PassesToMerge.Num() - 1; PassIndex < PassCount; ++PassIndex)
{
const FRDGPassHandle PassHandle = PassesToMerge[PassIndex];
FRDGPass* Pass = Passes[PassHandle];
Pass->bSkipRenderPassBegin = 1;
Pass->bSkipRenderPassEnd = 1;
Pass->PrologueBarrierPass = FirstPassHandle;
Pass->EpilogueBarrierPass = LastPassHandle;
}
// (E) is the last pass in the merge sequence.
{
FRDGPass* Pass = Passes[LastPassHandle];
Pass->bSkipRenderPassBegin = 1;
Pass->PrologueBarrierPass = FirstPassHandle;
}
#if STATS
GRDGStatRenderPassMergeCount += PassesToMerge.Num();
#endif
}
PassesToMerge.Reset();
PrevPass = nullptr;
PrevRenderTargets = nullptr;
};
// Iterate over all raster passes and merge passes with the same render targets into one render pass.
for (FRDGPassHandle PassHandle = Passes.Begin(); PassHandle != Passes.End(); ++PassHandle)
{
// Skip culled passes.
if (PassesToCull[PassHandle])
{
continue;
}
// Only raster passes are processed here.
if (PassesOnRaster[PassHandle])
{
FRDGPass* NextPass = Passes[PassHandle];
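// A pass whose render pass is controlled by the user cannot be merged with other passes, and raster passes with UAV access cannot be merged because of potential interdependencies.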
if (EnumHasAnyFlags(NextPass->GetFlags(), ERDGPassFlags::SkipRenderPass) || NextPass->bUAVAccess)
{
CommitMerge();
continue;
}
// A graphics fork pass cannot be merged with a previous raster pass.
if (NextPass->bGraphicsFork)
{
CommitMerge();
}
const FRenderTargetBindingSlots& RenderTargets = NextPass->GetParameters().GetRenderTargets();
if (PrevPass)
{
// Compare the render targets to decide whether the passes can be merged.
if (PrevRenderTargets->CanMergeBefore(RenderTargets)
#if WITH_MGPU
&& PrevPass->GPUMask == NextPass->GPUMask
#endif
)
{
// If so, add the pass to the PassesToMerge list.
if (!PassesToMerge.Num())
{
PassesToMerge.Add(PrevPass->GetHandle());
}
PassesToMerge.Add(PassHandle);
}
else
{
CommitMerge();
}
}
PrevPass = NextPass;
PrevRenderTargets = &RenderTargets;
}
else if (!PassesOnAsyncCompute[PassHandle])
{
// A non-raster pass on the graphics pipeline invalidates the render target merge.
CommitMerge();
}
}
CommitMerge();
}
}
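As the code above shows, RDG compilation is complex and involves many steps, in order: building producer/consumer dependencies, determining per-pass flags such as culling, adjusting resource lifetimes, culling passes, processing resource transitions and barriers, processing the dependencies and references of async compute passes, finding and establishing the fork and join passes, and merging all raster passes that share the same render targets.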
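The first two steps of Compile, building producer dependencies and culling, can be sketched in isolation. The sketch below is a simplification and not the engine code. FAccess, FPassDesc, and ComputeCulledPasses are hypothetical names, resources are plain integer indices, subresources are ignored, and a single bRoot flag stands in for "has untracked outputs or is flagged never-cull".

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One access of a pass to a resource, identified by index (hypothetical type).
struct FAccess
{
    int Resource;
    bool bWrite;
};

struct FPassDesc
{
    std::vector<FAccess> Accesses;
    bool bRoot; // stands in for untracked outputs or a never-cull flag
};

// Returns one flag per pass; true means the pass is culled.
std::vector<bool> ComputeCulledPasses(const std::vector<FPassDesc>& Passes, int ResourceCount)
{
    const int PassCount = static_cast<int>(Passes.size());
    std::vector<std::vector<int>> Producers(PassCount);
    std::vector<int> LastProducer(ResourceCount, -1);

    // Step 1: like AddCullingDependency, every access depends on the last
    // pass that wrote the resource, and a write makes this pass the new
    // last producer.
    for (int Pass = 0; Pass < PassCount; ++Pass)
    {
        for (const FAccess& Access : Passes[Pass].Accesses)
        {
            const int Producer = LastProducer[Access.Resource];
            if (Producer >= 0 && Producer != Pass)
            {
                std::vector<int>& List = Producers[Pass];
                if (std::find(List.begin(), List.end(), Producer) == List.end())
                {
                    List.push_back(Producer);
                }
            }
            if (Access.bWrite)
            {
                LastProducer[Access.Resource] = Pass;
            }
        }
    }

    // Step 2: start with everything culled, then walk producer edges from
    // the roots with an explicit stack and keep whatever is reachable.
    std::vector<bool> PassesToCull(PassCount, true);
    std::vector<int> Stack;
    for (int Pass = 0; Pass < PassCount; ++Pass)
    {
        if (Passes[Pass].bRoot)
        {
            Stack.push_back(Pass);
        }
    }
    while (!Stack.empty())
    {
        const int Pass = Stack.back();
        Stack.pop_back();
        if (PassesToCull[Pass])
        {
            PassesToCull[Pass] = false;
            Stack.insert(Stack.end(), Producers[Pass].begin(), Producers[Pass].end());
        }
    }
    return PassesToCull;
}
```

For example, if pass 0 writes resource 0, pass 1 reads resource 0 and writes resource 1, pass 2 writes resource 2 that nobody consumes, and pass 3 is a root that reads resource 1, then only pass 2 is culled. Marking everything as culled first and un-culling from the roots means a pass survives only if some root transitively depends on it.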
FRDGBuilder::Compile also relies on several important helper functions, which are analyzed one by one below:
// Adds a pass dependency: inserts the producer (ProducerHandle) into the producer list (Producers) of the consumer (ConsumerHandle).
void FRDGBuilder::AddPassDependency(FRDGPassHandle ProducerHandle, FRDGPassHandle ConsumerHandle)
{
FRDGPass* Consumer = Passes[ConsumerHandle];
auto& Producers = Consumer->Producers;
if (Producers.Find(ProducerHandle) == INDEX_NONE)
{
Producers.Add(ProducerHandle);
}
};
// Initializes the array as per-subresource entries.
template <typename ElementType, typename AllocatorType>
inline void InitAsSubresources(TRDGTextureSubresourceArray<ElementType, AllocatorType>& SubresourceArray, const FRDGTextureSubresourceLayout& Layout, const ElementType& Element = {})
{
const uint32 SubresourceCount = Layout.GetSubresourceCount();
SubresourceArray.SetNum(SubresourceCount, false);
for (uint32 SubresourceIndex = 0; SubresourceIndex < SubresourceCount; ++SubresourceIndex)
{
SubresourceArray[SubresourceIndex] = Element;
}
}
// Initializes the array as a single whole-resource entry.
template <typename ElementType, typename AllocatorType>
FORCEINLINE void InitAsWholeResource(TRDGTextureSubresourceArray<ElementType, AllocatorType>& SubresourceArray, const ElementType& Element = {})
{
SubresourceArray.SetNum(1, false);
SubresourceArray[0] = Element;
}
// Allocates a subresource state.
FRDGSubresourceState* FRDGBuilder::AllocSubresource(const FRDGSubresourceState& Other)
{
FRDGSubresourceState* State = Allocator.AllocPOD<FRDGSubresourceState>();
*State = Other;
return State;
}
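The render pass merging step at the end of Compile assigns skip flags over each merged interval [B, X, ..., E]. A minimal sketch of that flag assignment follows, assuming that "same render targets" can be reduced to comparing a single integer key. FRasterPass, MakeRasterPasses, MergeRenderPasses, and CountBeginRenderPassCalls are hypothetical names, and the real conditions (CanMergeBefore, UAV access, fork passes, GPU masks, interleaved non-raster passes) are not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a raster pass; RenderTargetKey replaces the
// real render target comparison.
struct FRasterPass
{
    int RenderTargetKey;
    bool bSkipRenderPassBegin;
    bool bSkipRenderPassEnd;
};

std::vector<FRasterPass> MakeRasterPasses(const std::vector<int>& Keys)
{
    std::vector<FRasterPass> Passes;
    for (int Key : Keys)
    {
        Passes.push_back(FRasterPass{Key, false, false});
    }
    return Passes;
}

// For every run of consecutive passes with the same key: the first pass
// keeps BeginRenderPass, the last keeps EndRenderPass, and passes in
// between skip both. A run of length one keeps both calls.
void MergeRenderPasses(std::vector<FRasterPass>& Passes)
{
    std::size_t First = 0;
    while (First < Passes.size())
    {
        std::size_t Last = First;
        while (Last + 1 < Passes.size() &&
               Passes[Last + 1].RenderTargetKey == Passes[First].RenderTargetKey)
        {
            ++Last;
        }
        for (std::size_t Index = First; Index <= Last; ++Index)
        {
            Passes[Index].bSkipRenderPassBegin = (Index != First);
            Passes[Index].bSkipRenderPassEnd = (Index != Last);
        }
        First = Last + 1;
    }
}

// Number of BeginRenderPass calls that would be issued after merging.
int CountBeginRenderPassCalls(const std::vector<FRasterPass>& Passes)
{
    int Count = 0;
    for (const FRasterPass& Pass : Passes)
    {
        Count += Pass.bSkipRenderPassBegin ? 0 : 1;
    }
    return Count;
}
```

With keys {1, 1, 1, 2}, the first three passes collapse into one render pass and the fourth stays on its own, so four raster passes issue two BeginRenderPass calls. In the engine the same interval also redirects PrologueBarrierPass and EpilogueBarrierPass to the first and last pass, so that no transition happens inside the merged render pass.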
11.3.4 FRDGBuilder::Execute
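With pass collection (AddPass) and graph compilation covered, the remaining step is executing the render graph. This is handled by FRDGBuilder::Execute, which itself calls Compile first: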
void FRDGBuilder::Execute()
{
SCOPED_NAMED_EVENT(FRDGBuilder_Execute, FColor::Emerald);
// Before compiling, create the epilogue pass at the end of the graph.
EpiloguePass = Passes.Allocate<FRDGSentinelPass>(Allocator, RDG_EVENT_NAME("Graph Epilogue"));
SetupEmptyPass(EpiloguePass);
const FRDGPassHandle ProloguePassHandle = GetProloguePassHandle();
const FRDGPassHandle EpiloguePassHandle = GetEpiloguePassHandle();
FRDGPassHandle LastUntrackedPassHandle = ProloguePassHandle;
// Non-immediate mode.
if (!GRDGImmediateMode)
{
// Compile before executing; see section 11.3.3.
Compile();
{
SCOPE_CYCLE_COUNTER(STAT_RDG_CollectResourcesTime);
// Collect pass resources.
for (FRDGPassHandle PassHandle = Passes.Begin(); PassHandle != Passes.End(); ++PassHandle)
{
if (!PassesToCull[PassHandle])
{
CollectPassResources(PassHandle);
}
}
// End the extracted textures at the epilogue pass.
for (const auto& Query : ExtractedTextures)
{
EndResourceRHI(EpiloguePassHandle, Query.Key, 1);
}
// End the extracted buffers at the epilogue pass.
for (const auto& Query : ExtractedBuffers)
{
EndResourceRHI(EpiloguePassHandle, Query.Key, 1);
}
}
// Collect pass barriers.
{
SCOPE_CYCLE_COUNTER(STAT_RDG_CollectBarriersTime);
for (FRDGPassHandle PassHandle = Passes.Begin(); PassHandle != Passes.End(); ++PassHandle)
{
if (!PassesToCull[PassHandle])
{
CollectPassBarriers(PassHandle, LastUntrackedPassHandle);
}
}
}
}
// Iterate over all textures and add an epilogue transition for each one that has an RHI resource.
for (FRDGTextureHandle TextureHandle = Textures.Begin(); TextureHandle != Textures.End(); ++TextureHandle)
{
FRDGTextureRef Texture = Textures[TextureHandle];
if (Texture->GetRHIUnchecked())
{
AddEpilogueTransition(Texture, LastUntrackedPassHandle);
Texture->Finalize();
}
}
// Iterate over all buffers and add an epilogue transition for each one that has an RHI resource.
for (FRDGBufferHandle BufferHandle = Buffers.Begin(); BufferHandle != Buffers.End(); ++BufferHandle)
{
FRDGBufferRef Buffer = Buffers[BufferHandle];
if (Buffer->GetRHIUnchecked())
{
AddEpilogueTransition(Buffer, LastUntrackedPassHandle);
Buffer->Finalize();
}
}
// Execute the passes.
if (!GRDGImmediateMode)
{
QUICK_SCOPE_CYCLE_COUNTER(STAT_FRDGBuilder_Execute_Passes);
for (FRDGPassHandle PassHandle = Passes.Begin(); PassHandle != Passes.End(); ++PassHandle)
{
// Execute the passes that were not culled.
if (!PassesToCull[PassHandle])
{
ExecutePass(Passes[PassHandle]);
}
}
}
else
{
ExecutePass(EpiloguePass);
}
RHICmdList.SetGlobalUniformBuffers({});
#if WITH_MGPU
(......)
#endif
// Perform texture extraction.
for (const auto& Query : ExtractedTextures)
{
*Query.Value = Query.Key->PooledRenderTarget;
}
// Perform buffer extraction.
for (const auto& Query : ExtractedBuffers)
{
*Query.Value = Query.Key->PooledBuffer;
}
// Clean up.
Clear();
}
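The extraction step at the end of Execute is easy to misread: QueueTextureExtraction only stores the address of the caller's pointer, and the pointer is filled after the passes have run. The following stand-alone sketch illustrates that timing under simplifying assumptions. std::shared_ptr stands in for TRefCountPtr, every texture is allocated in Execute regardless of use, and FPooledTexture, FGraphTexture, and MiniExtractingBuilder are hypothetical names rather than engine types.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a pooled GPU resource.
struct FPooledTexture
{
    int Id = 0;
};

// Hypothetical graph-side handle; the pooled resource exists only after Execute().
struct FGraphTexture
{
    int Id = 0;
    std::shared_ptr<FPooledTexture> Pooled;
};

class MiniExtractingBuilder
{
public:
    // Only records a description; nothing is allocated yet.
    FGraphTexture* CreateTexture(int Id)
    {
        Textures.push_back(std::make_unique<FGraphTexture>());
        Textures.back()->Id = Id;
        return Textures.back().get();
    }

    // Remembers where to write the pooled resource once execution is done.
    void QueueTextureExtraction(FGraphTexture* Texture, std::shared_ptr<FPooledTexture>* OutPooledTexturePtr)
    {
        ExtractedTextures.emplace_back(Texture, OutPooledTexturePtr);
    }

    void Execute()
    {
        // Stand-in for pass execution: allocate the underlying resources.
        for (auto& Texture : Textures)
        {
            Texture->Pooled = std::make_shared<FPooledTexture>();
            Texture->Pooled->Id = Texture->Id;
        }
        // Fill the queued output pointers; this keeps the pooled resources
        // alive after the graph itself has been cleared.
        for (auto& Query : ExtractedTextures)
        {
            *Query.second = Query.first->Pooled;
        }
        ExtractedTextures.clear();
        Textures.clear();
    }

private:
    std::vector<std::unique_ptr<FGraphTexture>> Textures;
    std::vector<std::pair<FGraphTexture*, std::shared_ptr<FPooledTexture>*>> ExtractedTextures;
};
```

The output pointer stays null between QueueTextureExtraction and Execute, and it remains valid after the builder has cleared its own graph objects. That is the behavior a resource such as a TAA history buffer relies on to survive into the next frame.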
Execution relies on ExecutePass, the function that runs a single pass. Its logic is as follows:
void FRDGBuilder::ExecutePass(FRDGPass* Pass)
{
QUICK_SCOPE_CYCLE_COUNTER(STAT_FRDGBuilder_ExecutePass);
SCOPED_GPU_MASK(RHICmdList, Pass->GPUMask);
IF_RDG_CPU_SCOPES(CPUScopeStacks.BeginExecutePass(Pass));
// Use GPU scopes.
#if RDG_GPU_SCOPES
const bool bUsePassEventScope = Pass != EpiloguePass && Pass != ProloguePass;
if (bUsePassEventScope)
{
GPUScopeStacks.BeginExecutePass(Pass);
}
#endif
#if WITH_MGPU
if (!bWaitedForTemporalEffect && NameForTemporalEffect != NAME_None)
{
RHICmdList.WaitForTemporalEffect(NameForTemporalEffect);
bWaitedForTemporalEffect = true;
}
#endif
// Pass execution order: 1. prologue -> 2. pass body -> 3. epilogue.
// The whole sequence is recorded on the command list of the pass's pipeline.
FRHIComputeCommandList& RHICmdListPass = (Pass->GetPipeline() == ERHIPipeline::AsyncCompute)
? static_cast<FRHIComputeCommandList&>(RHICmdListAsyncCompute)
: RHICmdList;
// 1. Execute the prologue.
ExecutePassPrologue(RHICmdListPass, Pass);
// 2. Execute the pass body.
Pass->Execute(RHICmdListPass);
// 3. Execute the epilogue.
ExecutePassEpilogue(RHICmdListPass, Pass);
#if RDG_GPU_SCOPES
if (bUsePassEventScope)
{
GPUScopeStacks.EndExecutePass(Pass);
}
#endif
// If this pass ends an async compute region, dispatch the async compute command list immediately.
if (Pass->bAsyncComputeEnd)
{
FRHIAsyncComputeCommandListImmediate::ImmediateDispatch(RHICmdListAsyncCompute);
}
// When GPU debug flushing is enabled and async compute is disabled, submit the commands, flush them to the GPU, and wait until the GPU is idle.
if (GRDGDebugFlushGPU && !GRDGAsyncCompute)
{
RHICmdList.SubmitCommandsAndFlushGPU();
RHICmdList.BlockUntilGPUIdle();
}
}
Executing a pass consists of three main steps: 1. prologue, 2. pass body, 3. epilogue. Their logic is as follows:
// 1. prologue
void FRDGBuilder::ExecutePassPrologue(FRHIComputeCommandList& RHICmdListPass, FRDGPass* Pass)
{
// Submit the prologue barriers to begin.
if (Pass->PrologueBarriersToBegin)
{
Pass->PrologueBarriersToBegin->Submit(RHICmdListPass);
}
// Submit the prologue barriers to end.
if (Pass->PrologueBarriersToEnd)
{
Pass->PrologueBarriersToEnd->Submit(RHICmdListPass);
}
// Uniform buffers are initialized on first use, because the access checks allow GetRHI to be called on RDG resources.
Pass->GetParameters().EnumerateUniformBuffers([&](FRDGUniformBufferRef UniformBuffer)
{
BeginResourceRHI(UniformBuffer);
});
// Set the async compute budget.
if (Pass->GetPipeline() == ERHIPipeline::AsyncCompute)
{
RHICmdListPass.SetAsyncComputeBudget(Pass->AsyncComputeBudget);
}
const ERDGPassFlags PassFlags = Pass->GetFlags();
if (EnumHasAnyFlags(PassFlags, ERDGPassFlags::Raster))
{
if (!EnumHasAnyFlags(PassFlags, ERDGPassFlags::SkipRenderPass) && !Pass->SkipRenderPassBegin())
{
// Call BeginRenderPass on the command list.
static_cast<FRHICommandList&>(RHICmdListPass).BeginRenderPass(Pass->GetParameters().GetRenderPassInfo(), Pass->GetName());
}
}
}
// 2. Pass body
void FRDGPass::Execute(FRHIComputeCommandList& RHICmdList)
{
QUICK_SCOPE_CYCLE_COUNTER(STAT_FRDGPass_Execute);
// Bind the global uniform buffers.
RHICmdList.SetGlobalUniformBuffers(ParameterStruct.GetGlobalUniformBuffers());
// Run the pass implementation.
ExecuteImpl(RHICmdList);
}
void TRDGLambdaPass::ExecuteImpl(FRHIComputeCommandList& RHICmdList) override
{
// Invoke the lambda.
ExecuteLambda(static_cast<TRHICommandList&>(RHICmdList));
}
// 3. Epilogue
void FRDGBuilder::ExecutePassEpilogue(FRHIComputeCommandList& RHICmdListPass, FRDGPass* Pass)
{
QUICK_SCOPE_CYCLE_COUNTER(STAT_FRDGBuilder_ExecutePassEpilogue);
const ERDGPassFlags PassFlags = Pass->GetFlags();
// Call EndRenderPass on the command list.
if (EnumHasAnyFlags(PassFlags, ERDGPassFlags::Raster) && !EnumHasAnyFlags(PassFlags, ERDGPassFlags::SkipRenderPass) && !Pass->SkipRenderPassEnd())
{
static_cast<FRHICommandList&>(RHICmdListPass).EndRenderPass();
}
// Discard transient resources.
for (FRHITexture* Texture : Pass->TexturesToDiscard)
{
RHIDiscardTransientResource(Texture);
}
// Acquire transient resources.
for (FRHITexture* Texture : Pass->TexturesToAcquire)
{
RHIAcquireTransientResource(Texture);
}
const FRDGParameterStruct PassParameters = Pass->GetParameters();
// Submit the epilogue barriers to begin for the graphics pipeline.
if (Pass->EpilogueBarriersToBeginForGraphics)
{
Pass->EpilogueBarriersToBeginForGraphics->Submit(RHICmdListPass);
}
// Submit the epilogue barriers to begin for the async compute pipeline.
if (Pass->EpilogueBarriersToBeginForAsyncCompute)
{
Pass->EpilogueBarriersToBeginForAsyncCompute->Submit(RHICmdListPass);
}
}
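As shown above, Execute first compiles all Passes and then runs each Pass's prologue, body, and epilogue in turn. In effect, the command list's BeginRenderPass, the rendering code, and EndRenderPass are spread across those three steps. The body itself is simple: it invokes the Pass's Lambda and hands it the command list in use.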
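To make the prologue, body, epilogue ordering concrete, here is a minimal standalone sketch. It is not UE code: ToyCommandList, ToyPass, ExecuteToyPass, and RunTwoMergedPasses are hypothetical names invented for illustration, and the command list only records strings. It assumes two raster Passes that were merged, so the first skips EndRenderPass and the second skips BeginRenderPass.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for an RHI command list: it only records calls.
struct ToyCommandList
{
    std::vector<std::string> Log;
};

// Hypothetical stand-in for FRDGPass, reduced to what this sketch needs.
struct ToyPass
{
    bool bRaster = false;
    bool bSkipRenderPassBegin = false; // set when merged with the previous pass
    bool bSkipRenderPassEnd = false;   // set when merged with the next pass
    std::function<void(ToyCommandList&)> Lambda;
};

// Runs one pass in three steps: prologue -> body -> epilogue.
void ExecuteToyPass(ToyCommandList& CmdList, const ToyPass& Pass)
{
    assert(Pass.Lambda && "every pass needs a body");
    // 1. Prologue: begin the render pass unless a merge suppressed it.
    if (Pass.bRaster && !Pass.bSkipRenderPassBegin)
    {
        CmdList.Log.push_back("BeginRenderPass");
    }
    // 2. Body: invoke the lambda with the command list.
    Pass.Lambda(CmdList);
    // 3. Epilogue: end the render pass unless a merge suppressed it.
    if (Pass.bRaster && !Pass.bSkipRenderPassEnd)
    {
        CmdList.Log.push_back("EndRenderPass");
    }
}

// Executes two raster passes that share a single render pass.
std::vector<std::string> RunTwoMergedPasses()
{
    ToyPass First;
    First.bRaster = true;
    First.bSkipRenderPassEnd = true;
    First.Lambda = [](ToyCommandList& C) { C.Log.push_back("DrawA"); };

    ToyPass Second;
    Second.bRaster = true;
    Second.bSkipRenderPassBegin = true;
    Second.Lambda = [](ToyCommandList& C) { C.Log.push_back("DrawB"); };

    ToyCommandList CmdList;
    ExecuteToyPass(CmdList, First);
    ExecuteToyPass(CmdList, Second);
    return CmdList.Log;
}
```

In this toy model the recorded sequence is BeginRenderPass, DrawA, DrawB, EndRenderPass: both bodies run inside a single Begin/End pair, which is the effect of the SkipRenderPassBegin and SkipRenderPassEnd checks in the real prologue and epilogue.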
The last stage of execution is cleanup, analyzed below:
void FRDGBuilder::Clear()
{
// Clear the external resources.
ExternalTextures.Empty();
ExternalBuffers.Empty();
// Clear the queued extractions.
ExtractedTextures.Empty();
ExtractedBuffers.Empty();
// Clear the main graph data.
Passes.Clear();
Views.Clear();
Textures.Clear();
Buffers.Clear();
// Clear the uniform buffers and release the allocator.
UniformBuffers.Clear();
Allocator.ReleaseAll();
}
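11.3.5 RDG Mechanism Summary
UE's RDG runs on the render thread by default. Although it merges RDG Passes that share the same render targets, the merged Passes are still executed serially, not in parallel. Normally a Pass does not submit its commands and wait for the GPU when it finishes, but it does so when the debug GPU flush (GRDGDebugFlushGPU) is enabled and async compute is disabled.
There is no globally unique FRDGBuilder instance. It is usually declared as a local variable that collects, compiles, and executes its Passes within a limited lifetime. Modules that declare FRDGBuilder instances include distance fields, render textures, the scene renderer, scene captures, ray tracing, post-processing, hair, virtual textures, and others.
The lifecycle of an FRDGBuilder can be divided into four phases: collecting Passes, compiling Passes, executing Passes, and cleanup.
The collection phase gathers every Pass (Lambda) from the rendering modules that can generate RHI rendering commands. Collected Passes are not executed immediately; execution is deferred. AddPass first creates an FRDGPass instance and appends it to the Pass list, then calls SetupPass, which mainly handles the states, references, dependencies, and flags of textures and buffers.
The compile phase is more involved and has many steps: building producer/consumer dependencies, determining culling and other Pass flags, adjusting resource lifetimes, culling Passes, handling resource transitions and barriers, handling the dependencies and references of async compute Passes, finding and setting up fork and join Pass nodes, and merging all raster Passes that share the same render targets.
The execute phase runs the compile step first and then executes every eligible Pass according to its result. A single Pass runs its prologue, body, and epilogue in order, which corresponds to the command list's BeginRenderPass, the rendering code in the Pass body (Lambda), and EndRenderPass. Running the body is straightforward: it calls the Pass's Lambda.
Finally, the cleanup phase clears or resets all data and memory held by the FRDGBuilder instance.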
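The Pass culling done in the compile phase can be illustrated with a small standalone sketch. This is a simplified model and not the UE implementation: ToyPassDesc and CullToyPasses are hypothetical names, resources are plain integer ids, and the rule is deliberately conservative (a Pass is kept if it is marked never-cull or writes a resource that a later kept Pass reads or that was queued for extraction).

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical pass description: the resource ids it reads and writes.
struct ToyPassDesc
{
    std::vector<int> Reads;
    std::vector<int> Writes;
    bool bNeverCull = false;
};

// Returns one flag per pass; true means the pass is culled.
std::vector<bool> CullToyPasses(const std::vector<ToyPassDesc>& Passes,
                                const std::set<int>& ExtractedResources)
{
    std::vector<bool> bCulled(Passes.size(), true);
    // Resources whose contents are still needed by something downstream.
    std::set<int> Needed = ExtractedResources;

    // Walk from the last pass to the first so consumers are seen before producers.
    for (std::size_t Index = Passes.size(); Index-- > 0;)
    {
        const ToyPassDesc& Pass = Passes[Index];
        bool bKeep = Pass.bNeverCull;
        for (int Resource : Pass.Writes)
        {
            if (Needed.count(Resource) > 0)
            {
                bKeep = true;
            }
        }
        if (bKeep)
        {
            bCulled[Index] = false;
            // Everything a kept pass reads must be produced by an earlier pass.
            Needed.insert(Pass.Reads.begin(), Pass.Reads.end());
        }
    }
    assert(bCulled.size() == Passes.size());
    return bCulled;
}
```

For example, with Pass 0 writing resource 1, Pass 1 reading 1 and writing 2, Pass 2 writing an unused resource 3, and only resource 2 extracted, the first two Passes are kept and the third is culled.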
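Compared with using RHICommandList directly, FRDGBuilder offers the following features and optimizations throughout its execution:
- Resources referenced by RDG Passes should be allocated or managed by RDG. Even externally registered resources must stay alive for as long as RDG uses them. RDG manages resource lifetimes automatically, extends them across fork and join Passes, and releases and reuses resources once they are no longer referenced.
- Resources are not allocated on request; they are allocated or created when first used.
- RDG has the concept of subresources. With a suitable layout they are packed into larger resource blocks. One subresource can be dispatched to another, views and aliases of subresources can be created automatically, and aliases can be created for resources that future render Passes will produce. This manages allocation, release, and reuse effectively, reduces total memory usage and fragmentation, reduces CPU and GPU I/O, and improves memory efficiency.
- RDG Passes are managed per FRDGBuilder instance, which automatically orders, references, forks, and joins Passes, resolves their resource references and dependencies, and culls unused Passes and resources. RDG also handles dependencies and references between graphics Passes and async compute Passes correctly: it chains them in DAG order and handles their shared resource usage and state transitions.
- RDG can merge the rendering of RDG Passes, provided they use the same render targets. This reduces Begin/EndRenderPass calls at the RHI layer and cuts down RHI rendering commands and resource state transitions.
- RDG automatically handles resource dependencies, barriers, and state transitions between Passes. It discards redundant transitions (such as read-to-read and write-to-write) and can merge and batch resource state transitions, further reducing the number of rendering commands.
- RDG Passes execute on the render thread, serially, without using TaskGraph to run in parallel.
In theory they could be executed in parallel, but this is only a conjecture; whether it is feasible has to be verified in practice.
- RDG provides rich debug modes and information, including an immediate execution mode, which help developers locate problems quickly and reduce the time and difficulty of tracking down bugs.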
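The point about discarding redundant transitions and batching the rest can be sketched in isolation. The code below is an illustrative model only: EToyAccess, ToyTransition, and ToyBarrierBatcher are hypothetical names, resources are integer ids, and a transition is treated as redundant exactly when the requested access equals the current one. Real RDG tracks much richer state, for example the UAV overlap behavior controlled by r.RDG.OverlapUAVs.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical, heavily simplified access states.
enum class EToyAccess { Unknown, Read, Write };

struct ToyTransition
{
    int Resource;
    EToyAccess Before;
    EToyAccess After;
};

class ToyBarrierBatcher
{
public:
    // Records the access a pass needs; queues a transition only on a state change.
    void Require(int Resource, EToyAccess Access)
    {
        assert(Access != EToyAccess::Unknown);
        // operator[] value-initializes a new entry to EToyAccess::Unknown.
        EToyAccess& Current = States[Resource];
        if (Current != Access)
        {
            Pending.push_back({Resource, Current, Access});
            Current = Access;
        }
        // Same state (read-to-read, write-to-write): nothing is queued.
    }

    // Hands back all queued transitions as one batch and empties the queue.
    std::vector<ToyTransition> Flush()
    {
        std::vector<ToyTransition> Batch;
        Batch.swap(Pending);
        return Batch;
    }

private:
    std::map<int, EToyAccess> States;
    std::vector<ToyTransition> Pending;
};
```

Requesting Write then Read then Read on one resource queues two transitions rather than three, and Flush returns everything queued so far as a single batch that could be submitted in one call.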
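Of course, FRDGBuilder also has some drawbacks:
- It adds another layer of concepts and encapsulation to the rendering system, which makes the rendering layer more complex and raises the learning cost.
- It increases development complexity: because execution is deferred, some bugs do not surface immediately.
- The lifetime of some Passes or resources may be extended.
Async compute diagram from Frostbite. The SSAO and SSAO Filter Passes are placed on the async queue and write and read the Raw AO texture. Even though they finish before the sync point, the lifetime of Raw AO is still extended to the sync point.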
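11.4 RDG Development
This chapter explains how to use UE's RDG system.
11.4.1 Creating RDG Resources
Sample code for creating RDG resources (textures, buffers, UAVs, SRVs, and so on):
// ---- Creating an RDG texture ----
// Create the RDG texture descriptor.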
FRDGTextureDesc TextureDesc = Input.Texture->Desc;
TextureDesc.Reset();
TextureDesc.Format = PF_FloatRGBA;
TextureDesc.ClearValue = FClearValueBinding::None;
TextureDesc.Flags &= ~TexCreate_DepthStencilTargetable;
TextureDesc.Flags |= TexCreate_RenderTargetable;
TextureDesc.Extent = OutputViewport.Extent;
// Create the RDG texture.
FRDGTextureRef MyRDGTexture = GraphBuilder.CreateTexture(TextureDesc, TEXT("MyRDGTexture"));
// ---- Creating a UAV for the RDG texture ----
FRDGTextureUAVRef MyRDGTextureUAV = GraphBuilder.CreateUAV(MyRDGTexture);
// ---- Creating an SRV for the RDG texture ----
FRDGTextureSRVRef MyRDGTextureSRV = GraphBuilder.CreateSRV(FRDGTextureSRVDesc::CreateWithPixelFormat(MyRDGTexture, PF_FloatRGBA));
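Before creating a resource such as a texture, create its descriptor. When creating a UAV or SRV, pass in the previously created resource so that it is reused. For an SRV, the resource instance is a parameter of the descriptor: create the descriptor first, then the SRV.
The code above uses texture resources as the example. Buffers are created in a similar way, so no separate example is given.
11.4.2 Registering External Resources
The resources in the previous section are created and managed by RDG, which is also responsible for their lifetime. Can a resource that was not created by RDG be used in RDG? Yes: the FRDGBuilder::RegisterExternalXXX interfaces register an external resource with the RDG system. The following example registers a texture: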
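// Create an RHI resource outside of RDG.
FRHIResourceCreateInfo CreateInfo;
FTexture2DRHIRef MyRHITexture = RHICreateTexture2D(1024, 768, PF_B8G8R8A8, 1, 1, TexCreate_CPUReadback, CreateInfo);
// Register the externally created RHI resource as an RDG resource.
// Note: in UE 4.26 RegisterExternalTexture expects a TRefCountPtr<IPooledRenderTarget>, so a raw RHI texture may first need to be wrapped in a pooled render target.
FRDGTextureRef MyExternalRDGTexture = GraphBuilder.RegisterExternalTexture(MyRHITexture);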
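Note that RDG cannot control or manage the lifetime of externally registered resources. The caller must keep the external resource alive for as long as RDG uses it; otherwise errors or even crashes will follow.
To get the RHI resource instance from an RDG resource, use the following code:
FRHITexture* MyRHITexture = MyRDGTexture->GetRHI();
The following diagram shows the conversion between RHI resources and RDG resources: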
graph LR
A(FRHIResource) -->|FRDGBuilder::RegisterExternalXXX| B(FRDGResource)
B -->|FRDGResource::GetRHI| A
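The code above uses textures as the example; buffers are registered in a similar way.
11.4.3 Extracting Resources
As the previous chapter on RDG mechanisms explained, RDG does not execute Passes immediately after collecting them; it defers execution (and resource creation or allocation is deferred as well). This raises a problem: a rendered resource cannot be assigned to a variable in immediate-mode fashion; the assignment has to fit the deferred execution model. Deferred resource extraction is done through the following interfaces: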
- FRDGBuilder::QueueTextureExtraction
- FRDGBuilder::QueueBufferExtraction
A usage example:
// Create the RDG texture.
FRDGTextureRef MyRDGTexture;
FRDGTextureDesc MyTextureDesc = FRDGTextureDesc::Create2D(OutputExtent, HistoryPixelFormat, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV);
MyRDGTexture = GraphBuilder.CreateTexture(MyTextureDesc, TEXT("MyRDGTexture"), ERDGTextureFlags::MultiFrame);
// Create the UAV and use it as a shader parameter of the pass.
(......)
PassParameters->MyRDGTextureUAV = GraphBuilder.CreateUAV(MyRDGTexture);
(......)
// Add the pass that renders into MyRDGTextureUAV.
FComputeShaderUtils::AddPass(GraphBuilder, RDG_EVENT_NAME("MyCustomPass", ...), ComputeShader, PassParameters, FComputeShaderUtils::GetGroupCount(8, 8));
// Queue the resource extraction.
// OutputRT is filled in when the graph executes, so it must still be alive at that point.
TRefCountPtr<IPooledRenderTarget> OutputRT;
GraphBuilder.QueueTextureExtraction(MyRDGTexture, &OutputRT);
// Work with the extracted OutputRT afterwards.
(......)
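Note, however, that because Passes, resource creation, and extraction are all deferred, the extracted resource can only be handed back for use by the next frame.
Food for thought: if the extracted resource is needed in the current frame, would adding a special parameterless Pass that operates on it work? Why or why not?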
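The deferred nature of extraction is easier to see in a toy model. The sketch below is standalone C++, not the UE API: ToyTexture and ToyGraphBuilder are hypothetical names, and std::shared_ptr stands in for TRefCountPtr<IPooledRenderTarget>. Queuing an extraction only stores the destination pointer; the pointer is written when Execute runs, after every Pass.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical texture payload.
struct ToyTexture
{
    int Value = 0;
};

class ToyGraphBuilder
{
public:
    using FHandle = std::size_t;

    // Returns a handle only; nothing is allocated yet.
    FHandle CreateTexture()
    {
        Textures.push_back(nullptr);
        return Textures.size() - 1;
    }

    // Records a pass that writes to Target; the lambda runs in Execute.
    void AddPass(FHandle Target, std::function<void(ToyTexture&)> Lambda)
    {
        Passes.push_back({Target, std::move(Lambda)});
    }

    // Remembers where to store the texture once the graph has executed.
    void QueueTextureExtraction(FHandle Handle, std::shared_ptr<ToyTexture>* OutTexture)
    {
        assert(OutTexture != nullptr);
        Extractions.push_back({Handle, OutTexture});
    }

    void Execute()
    {
        for (auto& Pass : Passes)
        {
            std::shared_ptr<ToyTexture>& Texture = Textures[Pass.first];
            if (!Texture)
            {
                Texture = std::make_shared<ToyTexture>(); // allocate on first use
            }
            Pass.second(*Texture);
        }
        // Extractions are resolved only after all passes have run.
        for (auto& Query : Extractions)
        {
            *Query.second = Textures[Query.first];
        }
        Passes.clear();
        Extractions.clear();
        Textures.clear();
    }

private:
    std::vector<std::shared_ptr<ToyTexture>> Textures;
    std::vector<std::pair<FHandle, std::function<void(ToyTexture&)>>> Passes;
    std::vector<std::pair<FHandle, std::shared_ptr<ToyTexture>*>> Extractions;
};
```

In this model the output pointer stays null between QueueTextureExtraction and Execute, which is why code written right after the queue call cannot yet use the extracted resource.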
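11.4.4 Adding Passes
The RDG Pass is the unit of execution in RDG. Its dependencies, references, inputs, and outputs are all set up through FRDGBuilder::AddPass. Here is one example:
// Create the shader parameters of the pass.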
FMyPS::FParameters* PassParameters = GraphBuilder.AllocParameters<FMyPS::FParameters>();
PassParameters->InputTexture = InputTexture;
PassParameters->RenderTargets = FRenderTargetBinding(InputTexture, InputTextureLoadAction);
PassParameters->InputSampler = BilinearSampler;
// Set up the shaders.
TShaderMapRef<FScreenPassVS> VertexShader(View.ShaderMap);
TShaderMapRef<FMyPS> PixelShader(View.ShaderMap);
const FScreenPassPipelineState PipelineState(VertexShader, PixelShader, AdditiveBlendState);
// Add the RDG pass.
GraphBuilder.AddPass(
RDG_EVENT_NAME("MyRDGPass"),
PassParameters,
ERDGPassFlags::Raster,
// The pass lambda.
[PixelShader, PassParameters, PipelineState] (FRHICommandListImmediate& RHICmdList)
{
// Set the viewport.
RHICmdList.SetViewport(0, 0, 0.0f, 1024, 768, 1.0f);
// Set the PSO.
SetScreenPassPipelineState(RHICmdList, PipelineState);
// Set the shader parameters.
SetShaderParameters(RHICmdList, PixelShader, PixelShader.GetPixelShader(), *PassParameters);
// Draw the rectangle.
DrawRectangle(RHICmdList, 0, 0, 1024, 768, 0, 0, 1.0f, 1.0f, FIntPoint(1024, 768), FIntPoint(1024, 768), PipelineState.VertexShader, EDRF_Default);
});
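A Pass added to RDG can be a traditional graphics Pass, a compute shader Pass, or a parameterless Pass. RDG Passes and RHI Passes do not map one-to-one: several RDG Passes may be merged and executed as a single RHI Pass. See section 11.3.4 FRDGBuilder::Execute for details.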
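How several RDG Passes can collapse into one RHI render pass is sketched below. This is a simplified standalone model, not the UE algorithm: ToyRasterPass and MergeToyRenderPasses are hypothetical names, a render target is reduced to one integer id, and two Passes are merged only when they are adjacent, both raster, and target the same id.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical pass record with the two merge flags used at execution time.
struct ToyRasterPass
{
    bool bRaster = true;
    int RenderTargetId = 0;
    bool bSkipRenderPassBegin = false;
    bool bSkipRenderPassEnd = false;
};

// Flags consecutive raster passes with the same render target so that they
// share one Begin/End pair. Returns how many render passes remain.
int MergeToyRenderPasses(std::vector<ToyRasterPass>& Passes)
{
    int RenderPassCount = 0;
    for (std::size_t Index = 0; Index < Passes.size(); ++Index)
    {
        ToyRasterPass& Pass = Passes[Index];
        if (!Pass.bRaster)
        {
            continue; // compute and copy passes never open a render pass
        }
        const bool bMergeWithPrevious =
            Index > 0 &&
            Passes[Index - 1].bRaster &&
            Passes[Index - 1].RenderTargetId == Pass.RenderTargetId;
        if (bMergeWithPrevious)
        {
            // The previous pass leaves the render pass open for this one.
            Passes[Index - 1].bSkipRenderPassEnd = true;
            Pass.bSkipRenderPassBegin = true;
        }
        else
        {
            ++RenderPassCount;
        }
    }
    assert(RenderPassCount <= static_cast<int>(Passes.size()));
    return RenderPassCount;
}
```

Given the sequence raster(RT 1), raster(RT 1), compute, raster(RT 1), raster(RT 2), the function reports three render passes: the first two Passes merge, while the compute Pass in between prevents the fourth from joining them.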
11.4.5 Creating an FRDGBuilder
Creating and using an FRDGBuilder takes very little code:
void RenderMyStuff(FRHICommandListImmediate& RHICmdList)
{
// ---- Create a local FRDGBuilder object ----
FRDGBuilder GraphBuilder(RHICmdList, RDG_EVENT_NAME("GraphBuilder_RenderMyStuff"));
(......)
// ---- Add passes ----
GraphBuilder.AddPass(...);
(......)
GraphBuilder.AddPass(...);
(......)
// ---- Queue resource extractions ----
GraphBuilder.QueueTextureExtraction(...);
(......)
// ---- Execute the FRDGBuilder ----
GraphBuilder.Execute();
}
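Note that FRDGBuilder instances are usually local. UE contains a number of FRDGBuilder instances, mainly in fairly independent modules such as the scene renderer, post-processing, and ray tracing.
Running an FRDGBuilder really involves three steps: collecting Passes, compiling Passes, and executing Passes. FRDGBuilder::Execute already covers both compiling and executing, so there is no need to call FRDGBuilder::Compile explicitly.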
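The collect, compile, execute flow can be modeled in a few lines of standalone C++. This is a hedged sketch rather than the engine's implementation: ToyDeferredBuilder is a hypothetical class, its Compile step is an empty placeholder, and the optional immediate mode is only an approximation of what an immediate execution mode does (running each Pass as it is added).

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

class ToyDeferredBuilder
{
public:
    explicit ToyDeferredBuilder(bool bInImmediateMode = false)
        : bImmediateMode(bInImmediateMode)
    {
    }

    // Collect: store the lambda, or run it right away in immediate mode.
    void AddPass(std::function<void()> Lambda)
    {
        assert(!bExecuted && "no passes may be added after Execute");
        if (bImmediateMode)
        {
            Lambda();
            return;
        }
        Passes.push_back(std::move(Lambda));
    }

    // Execute compiles first, so callers never call Compile themselves.
    void Execute()
    {
        assert(!bExecuted && "a builder executes once");
        Compile();
        for (std::function<void()>& Pass : Passes)
        {
            Pass();
        }
        Passes.clear();
        bExecuted = true;
    }

private:
    // Placeholder: culling, barriers, and merging would be decided here.
    void Compile()
    {
    }

    bool bImmediateMode;
    bool bExecuted = false;
    std::vector<std::function<void()>> Passes;
};
```

With the default settings a lambda passed to AddPass has no effect until Execute is called; constructing the builder with true makes the same lambda run inside AddPass, which keeps the caller's stack visible if the lambda crashes.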
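11.4.6 RDG Debugging
The RDG system provides a number of console variables. Their names and descriptions are listed below:
Console variable | Description |
---|---|
r.RDG.AsyncCompute | Controls the async compute policy: 0 - disabled; 1 - enabled for Passes tagged for async compute (default); 2 - enabled for all compute Passes that use the compute command list. |
r.RDG.Breakpoint | Breaks into the debugger when certain conditions are met. 0 - disabled; 1 to 4 - different special debug modes. |
r.RDG.ClobberResources | Clears all render targets and texture/buffer UAVs with the specified clear color at allocation time. Used for debugging. |
r.RDG.CullPasses | Whether RDG culls unused Passes. 0 - disabled; 1 - enabled (default). |
r.RDG.Debug | Allows warnings to be output for inefficiencies found during wiring and execution. |
r.RDG.Debug.FlushGPU | Flushes commands to the GPU after every Pass executes. When set, async compute is disabled (r.RDG.AsyncCompute=0). |
r.RDG.Debug.GraphFilter | Filters certain debug events to a specific graph. |
r.RDG.Debug.PassFilter | Filters certain debug events to a specific Pass. |
r.RDG.Debug.ResourceFilter | Filters certain debug events to a specific resource. |
r.RDG.DumpGraph | Dumps several visualization logs to disk. 0 - disabled; 1 - shows producer/consumer Pass dependencies; 2 - shows resource states and transitions; 3 - shows the overlap of graphics and async compute. |
r.RDG.ExtendResourceLifetimes | RDG extends resource lifetimes to the full length of the graph. Increases memory usage. |
r.RDG.ImmediateMode | Executes each Pass as it is created. Useful for getting the call stack of the wiring code when a crash occurs inside a Pass's Lambda. |
r.RDG.MergeRenderPasses | The graph merges identical, consecutive render passes into a single render pass. 0 - disabled; 1 - enabled (default). |
r.RDG.OverlapUAVs | RDG overlaps UAV work when requested. If disabled, UAV barriers are always inserted. |
r.RDG.TransitionLog | Outputs resource transitions to the console. |
r.RDG.VerboseCSVStats | Controls the verbosity of RDG's CSV profiling stats. 0 - emits one CSV profile for graph execution; 1 - emits a CSV profile for each phase of graph execution. |
Besides the RDG console variables listed above, some commands display useful information about the RDG system at run time.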
vis
Lists all valid textures. Entering it may print information like the following:
VisualizeTexture/Vis <CheckpointName> [<Mode>] [PIP/UV0/UV1/UV2] [BMP] [FRAC/SAT] [FULL]:
Mode (examples):
RGB = RGB in range 0..1 (default)
*8 = RGB * 8
A = alpha channel in range 0..1
R = red channel in range 0..1
G = green channel in range 0..1
B = blue channel in range 0..1
A*16 = Alpha * 16
RGB/2 = RGB / 2
SubResource:
MIP5 = Mip level 5 (0 is default)
INDEX5 = Array Element 5 (0 is default)
InputMapping:
PIP = like UV1 but as picture in picture with normal rendering (default)
UV0 = UV in left top
UV1 = full texture
UV2 = pixel perfect centered
Flags:
BMP = save out bitmap to the screenshots folder (not on console, normalized)
STENCIL = Stencil normally displayed in alpha channel of depth. This option is used for BMP to get a stencil only BMP.
FRAC = use frac() in shader (default)
SAT = use saturate() in shader
FULLLIST = show full list, otherwise we hide some textures in the printout
BYNAME = sort list by name
BYSIZE = show list by size
TextureId:
0 = <off>
LogConsoleResponse: 13 = (2D 1x1 PF_DepthStencil) DepthDummy 1 KB
LogConsoleResponse: 18 = (2D 976x492 PF_FloatRGBA RT) SceneColor 3752 KB
LogConsoleResponse: 19 = (2D 128x32 PF_G16R16) PreintegratedGF 16 KB
LogConsoleResponse: 23 = (2D 64x64 PF_FloatRGBA VRam) LTCMat 32 KB VRamInKB(Start/Size):<NONE>
LogConsoleResponse: 24 = (2D 64x64 PF_G16R16F VRam) LTCAmp 16 KB VRamInKB(Start/Size):<NONE>
LogConsoleResponse: 26 = (2D 976x492 PF_FloatRGBA UAV) SSRTemporalAA 3752 KB
LogConsoleResponse: 27 = (2D 976x492 PF_FloatR11G11B10 RT UAV) SSGITemporalAccumulation0 1876 KB
LogConsoleResponse: 29 = (2D 976x492 PF_R32_UINT RT UAV) DenoiserMetadata0 1876 KB
LogConsoleResponse: 30 = (2D 976x492 PF_FloatRGBA RT UAV VRam) SceneColorDeferred 3752 KB VRamInKB(Start/Size):<NONE>
LogConsoleResponse: 31 = (2D 976x492 PF_DepthStencil VRam) SceneDepthZ 2345 KB VRamInKB(Start/Size):<NONE>
LogConsoleResponse: 37 = (3D 64x64x16 PF_FloatRGBA UAV) HairLUT 512 KB
LogConsoleResponse: 38 = (3D 64x64x16 PF_FloatRGBA UAV) HairLUT 512 KB
LogConsoleResponse: 39 = (2D 64x64 PF_R32_FLOAT UAV) HairCoverageLUT 16 KB
LogConsoleResponse: 47 = (2D 98x64 PF_A16B16G16R16) SSProfiles 49 KB
LogConsoleResponse: 48 = (2D 256x64 PF_FloatRGBA RT) AtmosphereTransmittance 128 KB
LogConsoleResponse: 49 = (2D 64x16 PF_FloatRGBA RT) AtmosphereIrradiance 8 KB
LogConsoleResponse: 50 = (2D 64x16 PF_FloatRGBA RT) AtmosphereDeltaE 8 KB
LogConsoleResponse: 51 = (3D 256x128x2 PF_FloatRGBA RT) AtmosphereInscatter 512 KB
LogConsoleResponse: 52 = (3D 256x128x2 PF_FloatRGBA RT) AtmosphereDeltaSR 512 KB
LogConsoleResponse: 53 = (3D 256x128x2 PF_FloatRGBA RT) AtmosphereDeltaSM 512 KB
LogConsoleResponse: 54 = (3D 256x128x2 PF_FloatRGBA RT) AtmosphereDeltaJ 512 KB
LogConsoleResponse: 55 = (2D 1x1 PF_A32B32G32R32F RT UAV) EyeAdaptation 1 KB
LogConsoleResponse: 56 = (3D 32x32x32 PF_A2B10G10R10 RT VRam) CombineLUTs 128 KB VRamInKB(Start/Size):<NONE>
LogConsoleResponse: 68 = (2D 976x492 PF_R8G8 RT UAV) SSGITemporalAccumulation1 938 KB
LogConsoleResponse: 89 = (2D 976x246 PF_R32_UINT RT UAV) QuadOverdrawBuffer 938 KB
LogConsoleResponse: 91 = (2D 976x492 PF_FloatRGBA RT UAV) LightAccumulation 3752 KB
LogConsoleResponse: 92 = (Cube[2] 128 PF_FloatRGBA) ReflectionEnvs 2048 KB
LogConsoleResponse: 93 = (3D 64x64x64 PF_FloatRGBA RT UAV) TranslucentVolumeDir2 2048 KB
LogConsoleResponse: 95 = (2D 1x1 PF_A32B32G32R32F RT UAV) EyeAdaptation 1 KB
LogConsoleResponse: 96 = (3D 64x64x64 PF_FloatRGBA RT UAV) TranslucentVolume2 2048 KB
LogConsoleResponse: 97 = (3D 64x64x64 PF_FloatRGBA RT UAV) TranslucentVolumeDir1 2048 KB
LogConsoleResponse: 98 = (3D 64x64x64 PF_FloatRGBA RT UAV) TranslucentVolume1 2048 KB
LogConsoleResponse: 99 = (3D 64x64x64 PF_FloatRGBA RT UAV) TranslucentVolumeDir0 2048 KB
LogConsoleResponse: 101 = (3D 64x64x64 PF_FloatRGBA RT UAV) TranslucentVolume0 2048 KB
LogConsoleResponse: 102 = (2D 976x492 PF_G8 RT UAV) ScreenSpaceAO 469 KB
LogConsoleResponse: 106 = (2D 488x246 PF_DepthStencil) SmallDepthZ 1173 KB
LogConsoleResponse: 107 = (2D 1x1 PF_A32B32G32R32F RT UAV) EyeAdaptation 1 KB
LogConsoleResponse: CheckpointName (what was rendered this frame, use <Name>@<Number> to get intermediate versions):
LogConsoleResponse: Pool: 43/112 MB (referenced/allocated)
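11.5 Summary
This article covered the basic concepts, usage, rendering flow, and main mechanisms of UE's RDG, giving readers a general understanding of it. For more technical details and principles, readers will need to study the UE source code themselves. Many details of using RDG are not covered here; the official RDG 101: A Crash Course fills in the gaps.
11.5.1 Questions for Reflection
As usual, here are a few questions to help deepen your understanding of RDG:
- What are the steps of RDG? What does each step do, and what characterizes it?
- How do RDG resources differ from and relate to RHI resources? How are they converted into each other?
- Use RDG to implement custom CS and PS drawing code.
Special Notes
- Thanks to the authors of all the references. Some images come from the references and the web; they will be removed on request.
- This series is the author's original work and is published only on Cnblogs. Sharing the link is welcome, but reposting without permission is not allowed.
- This series is ongoing; see the content outline for the complete table of contents.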
References
- Unreal Engine Source
- Rendering and Graphics
- Materials
- Graphics Programming
- Render Dependency Graph
- FrameGraph: Extensible Rendering Architecture in Frostbite
- RDG 101: A Crash Course
- Frostbite Rendering Architecture and Real-time Procedural Shading & Texturing Techniques