
Dissecting the Unreal Rendering System (08): The Shader System

Contents

  • 8.1 Overview
  • 8.2 Shader Basics
    • 8.2.1 FShader
    • 8.2.2 Shader Parameter
    • 8.2.3 Uniform Buffer
    • 8.2.4 Vertex Factory
    • 8.2.5 Shader Permutation
  • 8.3 Shader Mechanisms
    • 8.3.1 Shader Map
      • 8.3.1.1 FShaderMapBase
      • 8.3.1.2 FGlobalShaderMap
      • 8.3.1.3 FMaterialShaderMap
      • 8.3.1.4 FMeshMaterialShaderMap
    • 8.3.2 Shader Compilation
    • 8.3.3 Cross-Platform Shaders
    • 8.3.4 Shader Caching
  • 8.4 Shader Development
    • 8.4.1 Shader Debugging
    • 8.4.2 Shader Optimization
      • 8.4.2.1 Optimizing Permutations
      • 8.4.2.2 Instruction Optimization
    • 8.4.3 Shader Development Examples
      • 8.4.3.1 Adding a Global Shader
      • 8.4.3.2 Adding a Vertex Factory
  • 8.5 Summary
    • 8.5.1 Questions to Consider
  • Special Notes
  • References

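A shader is a program of logic instructions executed on the GPU. Depending on the execution unit, shaders are divided into vertex shaders, pixel shaders, and compute shaders, as well as geometry shaders, mesh shaders, and so on.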

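To work across platforms and graphics APIs, UE wraps its shaders in many layers of encapsulation and abstraction, which gives rise to a large number of types and concepts. On top of that, further concepts and types such as permutations, PSO, and DDC were added for optimization and to improve code reuse.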

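Many earlier chapters touched on shader concepts, types, and code. This chapter describes the shader system in more depth and breadth. It mainly covers the following aspects of UE:

  • Basic shader concepts.
  • Basic shader types.
  • The implementation layers of shaders.
  • How shaders are used, with examples.
  • How shaders are implemented and the principles behind them.
  • The cross-platform mechanism of shaders.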

Note that the shaders discussed in this chapter include concepts and types on the C++ side as well as on the GPU side.

This section analyzes the basic concepts and types related to shaders and describes their basic relationships and usage.

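FShader is a type that holds compiled shader code together with its parameter bindings. It is one of the most basic, central, and common types in rendering code. It is defined as follows: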

// Engine\Source\Runtime\RenderCore\Public\Shader.h

class RENDERCORE_API FShader
{
public:
    (......)

    // Modifies the compilation environment before compilation is triggered; subclasses can override it.
    static void ModifyCompilationEnvironment(const FShaderPermutationParameters&, FShaderCompilerEnvironment&) {}
    // Whether the given permutation should be compiled; subclasses can override it.
    static bool ShouldCompilePermutation(const FShaderPermutationParameters&) { return true; }
    // Checks whether the compiled result is valid; subclasses can override it.
    static bool ValidateCompiledResult(EShaderPlatform InPlatform, const FShaderParameterMap& InParameterMap, TArray<FString>& OutError) { return true; }

    // Accessors for the hashes of various data.
    const FSHAHash& GetHash() const;
    const FSHAHash& GetVertexFactoryHash() const;
    const FSHAHash& GetOutputHash() const;

    // Stores and validates the compiled result of the shader code.
    void Finalize(const FShaderMapResourceCode* Code);

    // Data accessors.
    inline FShaderType* GetType(const FShaderMapPointerTable& InPointerTable) const { return Type.Get(InPointerTable.ShaderTypes); }
    inline FShaderType* GetType(const FPointerTableBase* InPointerTable) const { return Type.Get(InPointerTable); }
    inline FVertexFactoryType* GetVertexFactoryType(const FShaderMapPointerTable& InPointerTable) const { return VFType.Get(InPointerTable.VFTypes); }
    inline FVertexFactoryType* GetVertexFactoryType(const FPointerTableBase* InPointerTable) const { return VFType.Get(InPointerTable); }
    inline FShaderType* GetTypeUnfrozen() const { return Type.GetUnfrozen(); }
    inline int32 GetResourceIndex() const { checkSlow(ResourceIndex != INDEX_NONE); return ResourceIndex; }
    inline EShaderPlatform GetShaderPlatform() const { return Target.GetPlatform(); }
    inline EShaderFrequency GetFrequency() const { return Target.GetFrequency(); }
    inline const FShaderTarget GetTarget() const { return Target; }
    inline bool IsFrozen() const { return Type.IsFrozen(); }
    inline uint32 GetNumInstructions() const { return NumInstructions; }

#if WITH_EDITORONLY_DATA
    inline uint32 GetNumTextureSamplers() const { return NumTextureSamplers; }
    inline uint32 GetCodeSize() const { return CodeSize; }
    inline void SetNumInstructions(uint32 Value) { NumInstructions = Value; }
#else
    inline uint32 GetNumTextureSamplers() const { return 0u; }
    inline uint32 GetCodeSize() const { return 0u; }
#endif

    // Tries to return the automatically bound uniform buffer matching the given type; returns an unbound parameter if none exists.
    template<typename UniformBufferStructType>
    const TShaderUniformBufferParameter<UniformBufferStructType>& GetUniformBufferParameter() const;
    const FShaderUniformBufferParameter& GetUniformBufferParameter(const FShaderParametersMetadata* SearchStruct) const;
    const FShaderUniformBufferParameter& GetUniformBufferParameter(const FHashedName SearchName) const;
    const FShaderParametersMetadata* FindAutomaticallyBoundUniformBufferStruct(int32 BaseIndex) const;
    static inline const FShaderParametersMetadata* GetRootParametersMetadata();

    (......)

public:
    // Shader parameter bindings.
    LAYOUT_FIELD(FShaderParameterBindings, Bindings);
    // Mapping information for the shader parameter bindings.
    LAYOUT_FIELD(FShaderParameterMapInfo, ParameterMapInfo);

protected:
    LAYOUT_FIELD(TMemoryImageArray<FHashedName>, UniformBufferParameterStructs);
    LAYOUT_FIELD(TMemoryImageArray<FShaderUniformBufferParameter>, UniformBufferParameters);

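    // The following three are editor-only members.
    // Hash of the compiled output and the resulting parameter map, used to find matching resources.
    LAYOUT_FIELD_EDITORONLY(FSHAHash, OutputHash);
    // Hash of the vertex factory source.
    LAYOUT_FIELD_EDITORONLY(FSHAHash, VFSourceHash);
    // Hash of the shader source.
    LAYOUT_FIELD_EDITORONLY(FSHAHash, SourceHash);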

private:
    // Shader type.
    LAYOUT_FIELD(TIndexedPtr<FShaderType>, Type);
    // Vertex factory type.
    LAYOUT_FIELD(TIndexedPtr<FVertexFactoryType>, VFType);
    // Target platform and shader frequency.
    LAYOUT_FIELD(FShaderTarget, Target);
    
    // Index of this shader within FShaderMapResource.
    LAYOUT_FIELD(int32, ResourceIndex);
    // Number of shader instructions.
    LAYOUT_FIELD(uint32, NumInstructions);
    // Number of texture samplers.
    LAYOUT_FIELD_EDITORONLY(uint32, NumTextureSamplers);
    // Size of the shader code.
    LAYOUT_FIELD_EDITORONLY(uint32, CodeSize);
};
           

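From the above, FShader stores the bound parameters, the vertex factory type, the compiled resources, and other data associated with a shader. It also provides hooks for modifying and validating compilation, plus accessors for its data.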
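One detail in the listing is easy to miss: ModifyCompilationEnvironment, ShouldCompilePermutation, and ValidateCompiledResult are static functions, so "overriding" them in a subclass means hiding the name rather than using virtual dispatch. The sketch below is a minimal standalone illustration of that pattern and does not use any engine code. All names ending in Sketch are hypothetical, and the idea that a per-class type record captures the hook as a function pointer is an assumption made for the sake of the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for FShaderPermutationParameters.
struct PermutationParamsSketch
{
    int PermutationId;
    bool bSupportsCompute;
};

// Base class with a permissive static default, mirroring the shape of FShader's hooks.
struct ShaderBaseSketch
{
    static bool ShouldCompilePermutation(const PermutationParamsSketch&) { return true; }
};

// A subclass replaces the hook by declaring a static function with the same name.
// This is name hiding resolved at compile time, not virtual dispatch.
struct MyComputeShaderSketch : ShaderBaseSketch
{
    static bool ShouldCompilePermutation(const PermutationParamsSketch& Params)
    {
        return Params.bSupportsCompute && Params.PermutationId < 2;
    }
};

// A subclass that declares nothing inherits the base default.
struct PlainShaderSketch : ShaderBaseSketch {};

// Per-class record that captures the hook as a plain function pointer (assumed design).
struct ShaderTypeSketch
{
    std::string Name;
    bool (*ShouldCompilePermutationRef)(const PermutationParamsSketch&);
};

template <typename ShaderClass>
ShaderTypeSketch MakeShaderType(const char* Name)
{
    // The address is taken per class, so each record points at the right hook.
    return ShaderTypeSketch{Name, &ShaderClass::ShouldCompilePermutation};
}

// Counts how many permutation ids in [0, NumPermutations) the type asks to compile.
int CountPermutationsToCompile(const ShaderTypeSketch& Type, int NumPermutations, bool bSupportsCompute)
{
    int Count = 0;
    for (int Id = 0; Id < NumPermutations; ++Id)
    {
        const PermutationParamsSketch Params{Id, bSupportsCompute};
        if (Type.ShouldCompilePermutationRef(Params))
        {
            ++Count;
        }
    }
    return Count;
}
```

With this setup, a record built from MyComputeShaderSketch filters permutations, while one built from PlainShaderSketch accepts all of them, without any virtual table being involved.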

FShader is actually a base class. Its subclasses include:

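  • FGlobalShader: a global shader. Each of its subclasses has exactly one instance in memory. It is commonly used for full-screen quad drawing, post-processing, and similar tasks. It is defined as follows: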
    // Engine\Source\Runtime\RenderCore\Public\GlobalShader.h
    
    class FGlobalShader : public FShader
    {
    public:
        (......)
    
        FGlobalShader() : FShader() {}
        FGlobalShader(const ShaderMetaType::CompiledShaderInitializerType& Initializer);
        
        // Sets the view shader parameters.
        template<typename TViewUniformShaderParameters, typename ShaderRHIParamRef, typename TRHICmdList>
        inline void SetParameters(TRHICmdList& RHICmdList, ...);
    };
               
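    Compared with its parent FShader, it adds the SetParameters interface for setting the view uniform buffer.
  • FMaterialShader: a material shader, that is, a shader referenced by the material specified through FMaterialShaderType. It is a subset of the shaders produced when a material blueprint is instantiated. It is defined as follows: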
    // Engine\Source\Runtime\Renderer\Public\MaterialShader.h
    
    class RENDERER_API FMaterialShader : public FShader
    {
    public:
        (......)
    
        FMaterialShader() = default;
        FMaterialShader(const FMaterialShaderType::CompiledShaderInitializerType& Initializer);
    
        // Sets the view uniform buffer parameters.
        template<typename ShaderRHIParamRef>
        void SetViewParameters(FRHICommandList& RHICmdList, ...);
        // Sets pixel shader parameters that depend on the material but not on FMeshBatch.
        template< typename TRHIShader >
        void SetParameters(FRHICommandList& RHICmdList, ...);
        // Gets the shader parameter bindings.
        void GetShaderBindings(const FScene* Scene, ...) const;
    
    private:
        // Whether cached uniform expressions are allowed.
        static int32 bAllowCachedUniformExpressions;
        // The console variable backing bAllowCachedUniformExpressions.
        static FAutoConsoleVariableRef CVarAllowCachedUniformExpressions;
    
    #if !(UE_BUILD_TEST || UE_BUILD_SHIPPING || !WITH_EDITOR)
        // Verifies the validity of the expressions and shader maps.
        void VerifyExpressionAndShaderMaps(const FMaterialRenderProxy* MaterialRenderProxy, const FMaterial& Material, const FUniformExpressionCache* UniformExpressionCache) const;
    #endif
        // Uniform buffers for the material parameter collections.
        LAYOUT_FIELD(TMemoryImageArray<FShaderUniformBufferParameter>, ParameterCollectionUniformBuffers);
        // The material's shader uniform buffer.
        LAYOUT_FIELD(FShaderUniformBufferParameter, MaterialUniformBuffer);
    
        (......)
    };
               

Below are some of the subclasses in the FShader inheritance hierarchy:

FShader
    FGlobalShader
        TMeshPaintVertexShader
        TMeshPaintPixelShader
        FDistanceFieldDownsamplingCS
        FBaseGPUSkinCacheCS
            TGPUSkinCacheCS
        FBaseRecomputeTangentsPerTriangleShader
        FBaseRecomputeTangentsPerVertexShader
        FRadixSortUpsweepCS
        FRadixSortDownsweepCS
        FParticleTileVS
        FBuildMipTreeCS
        FScreenVS
        FScreenPS
        FScreenPSInvertAlpha
        FSimpleElementVS
        FSimpleElementPS
        FStereoLayerVS
        FStereoLayerPS_Base
            FStereoLayerPS
        FUpdateTexture2DSubresouceCS
        FUpdateTexture3DSubresouceCS
        FCopyTexture2DCS
        TCopyDataCS
        FLandscapeLayersVS
        FLandscapeLayersHeightmapPS
        FGenerateMipsCS
        FGenerateMipsVS
        FGenerateMipsPS
        FCopyTextureCS
        FMediaShadersVS
        FRGBConvertPS
        FYUVConvertPS
        FYUY2ConvertPS
        FRGB10toYUVv210ConvertPS
        FInvertAlphaPS
        FSetAlphaOnePS
        FReadTextureExternalPS
        FOculusVertexShader
        FRasterizeToRectsVS
        FResolveVS
        FResolveDepthPS
        FResolveDepth2XPS
        FAmbientOcclusionPS
        FGTAOSpatialFilterCS
        FGTAOTemporalFilterCS
        FDeferredDecalVS
        FDitheredTransitionStencilPS
        FObjectCullVS
        FObjectCullPS
        FDeferredLightPS
        TDeferredLightHairVS
        FFXAAVS
        FFXAAPS
        FMotionBlurShader
        FSubsurfaceShader
        FTonemapVS
        FTonemapPS
        FTonemapCS
        FUpscalePS
        FTAAStandaloneCS
        FSceneCapturePS
        FHZBTestPS
        FOcclusionQueryVS
        FOcclusionQueryPS
        FHZBBuildPS
        FHZBBuildCS
        FDownsampleDepthPS
        FTiledDeferredLightingCS
        FShader_VirtualTextureCompress
        FShader_VirtualTextureCopy
        FPageTableUpdateVS
        FPageTableUpdatePS
        FSlateElementVS
        FSlateElementPS
        (......)
    FMaterialShader
        FDeferredDecalPS
        FLightHeightfieldsPS
        FLightFunctionVS
        FLightFunctionPS
        FPostProcessMaterialShader
        TTranslucentLightingInjectPS
        FVolumetricFogLightFunctionPS
        FMeshMaterialShader
            FLightmapGBufferVS
            FLightmapGBufferPS
            FVLMVoxelizationVS
            FVLMVoxelizationGS
            FVLMVoxelizationPS
            FLandscapeGrassWeightVS
            FLandscapeGrassWeightPS
            FLandscapePhysicalMaterial
            FAnisotropyVS
            FAnisotropyPS
            TBasePassVertexShaderPolicyParamType
                TBasePassVertexShaderBaseType
                    TBasePassVS
            TBasePassPixelShaderPolicyParamType
                TBasePassPixelShaderBaseType
                    TBasePassPS
            FMeshDecalsVS
            FMeshDecalsPS
            TDepthOnlyVS
            TDepthOnlyPS
            FDistortionMeshVS
            FDistortionMeshPS
            FHairMaterialVS
            FHairMaterialPS
            FHairVisibilityVS
            FHairVisibilityPS
            TLightMapDensityVS
            TLightMapDensityPS
            FShadowDepthVS
            FShadowDepthBasePS
                TShadowDepthPS
            FTranslucencyShadowDepthVS
            FTranslucencyShadowDepthPS
            FVelocityVS
            FVelocityPS
            FRenderVolumetricCloudVS
            FVolumetricCloudShadowPS
            FVoxelizeVolumeVS
            FVoxelizeVolumePS
            FShader_VirtualTextureMaterialDraw
            (......)
        FSlateMaterialShaderVS
        FSlateMaterialShaderPS
        (......)
           

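The list above shows only part of the FShader inheritance hierarchy. It includes some shader types already analyzed in earlier chapters, such as FDeferredLightPS, FFXAAPS, FTonemapPS, FUpscalePS, TBasePassPS, and TDepthOnlyPS.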

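FGlobalShader covers shader code for post-processing, lighting, utilities, visualization, landscape, virtual textures, and so on. Its subclasses can be VS, PS, or CS, and every CS is a subclass of FGlobalShader. FMaterialShader mainly covers shader code for meshes, dedicated passes, voxelization, and so on. Its subclasses can be VS, PS, GS, and the like, but never CS.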

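When defining a new subclass of FShader, use the following macros to declare and implement the corresponding code (only some of the common macros are listed):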

// ------ Shader declaration and implementation macros ------

// Declares a shader of the given type (an FShader subclass); it can be Global, Material, MeshMaterial, ...
#define DECLARE_SHADER_TYPE(ShaderClass,ShaderMetaTypeShortcut,...)
// Implements a shader of the given type; it can be Global, Material, MeshMaterial, ...
#define IMPLEMENT_SHADER_TYPE(TemplatePrefix,ShaderClass,SourceFilename,FunctionName,Frequency)

// Declares FGlobalShader and its subclasses.
#define DECLARE_GLOBAL_SHADER(ShaderClass)
// Implements FGlobalShader and its subclasses.
#define IMPLEMENT_GLOBAL_SHADER(ShaderClass,SourceFilename,FunctionName,Frequency)

// Implements a material shader.
#define IMPLEMENT_MATERIAL_SHADER_TYPE(TemplatePrefix,ShaderClass,SourceFilename,FunctionName,Frequency)

// Other, less common macros
(......)

// ------ Example 1 ------

class FDeferredLightPS : public FGlobalShader
{
    // Declares the global shader inside the FDeferredLightPS class.
    DECLARE_SHADER_TYPE(FDeferredLightPS, Global)
    (......)
};
// Implements the FDeferredLightPS shader, associating it with its source file, entry point, and shader frequency.
IMPLEMENT_GLOBAL_SHADER(FDeferredLightPS, "/Engine/Private/DeferredLightPixelShaders.usf", "DeferredLightPixelMain", SF_Pixel);


// ------ Example 2 ------

class FDeferredDecalPS : public FMaterialShader
{
    // Declares the material shader inside the class.
    DECLARE_SHADER_TYPE(FDeferredDecalPS,Material);
    (......)
};
// Implements the FDeferredDecalPS class, associating it with its source file, entry point, and shader frequency.
IMPLEMENT_MATERIAL_SHADER_TYPE(,FDeferredDecalPS,TEXT("/Engine/Private/DeferredDecal.usf"),TEXT("MainPS"),SF_Pixel);
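
To make the declare/implement split more tangible, here is a minimal standalone sketch of how a pair of macros can tie a C++ class to a source file, an entry point, and a frequency, and register it in a global list at static-initialization time. It is only an illustration of the general technique. The macros, types, and the shader path below are all hypothetical (suffixed Sketch), and the engine's real macros do considerably more.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

enum ShaderFrequencySketch { SF_VertexSketch, SF_PixelSketch, SF_ComputeSketch };

// Static description of one shader class (hypothetical, much smaller than FShaderType).
struct ShaderTypeInfoSketch
{
    const char* ClassName;
    const char* SourceFilename;
    const char* EntryPoint;
    ShaderFrequencySketch Frequency;
};

// Global registry. A function-local static avoids static-initialization-order problems.
inline std::vector<const ShaderTypeInfoSketch*>& GetShaderTypeRegistry()
{
    static std::vector<const ShaderTypeInfoSketch*> Registry;
    return Registry;
}

// Helper whose constructor performs the registration.
struct ShaderTypeRegistrarSketch
{
    explicit ShaderTypeRegistrarSketch(const ShaderTypeInfoSketch* Info)
    {
        GetShaderTypeRegistry().push_back(Info);
    }
};

// Goes inside the class body: declares the static type record.
#define DECLARE_SHADER_TYPE_SKETCH(ShaderClass) \
public: \
    static const ShaderTypeInfoSketch StaticType;

// Goes in a source file: defines the record and registers it.
#define IMPLEMENT_SHADER_TYPE_SKETCH(ShaderClass, SourceFilename, FunctionName, Frequency) \
    const ShaderTypeInfoSketch ShaderClass::StaticType = { #ShaderClass, SourceFilename, FunctionName, Frequency }; \
    static ShaderTypeRegistrarSketch ShaderClass##Registrar(&ShaderClass::StaticType);

class FMyTonemapPSSketch
{
    DECLARE_SHADER_TYPE_SKETCH(FMyTonemapPSSketch)
};
IMPLEMENT_SHADER_TYPE_SKETCH(FMyTonemapPSSketch, "/Hypothetical/MyTonemap.usf", "MainPS", SF_PixelSketch)

// Looks up a registered type by class name; returns nullptr if it is not registered.
const ShaderTypeInfoSketch* FindShaderTypeByName(const char* ClassName)
{
    for (const ShaderTypeInfoSketch* Info : GetShaderTypeRegistry())
    {
        if (std::strcmp(Info->ClassName, ClassName) == 0)
        {
            return Info;
        }
    }
    return nullptr;
}
```

The point of the split is that the class body only needs the declaration, while the association with a file and entry point lives in exactly one translation unit.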
           

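Shader parameters are data passed from the C++ side on the CPU to GPU shaders and stored in GPU registers or video memory. The common shader parameter types are defined as follows: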

// Engine\Source\Runtime\RenderCore\Public\ShaderParameters.h

// A shader parameter bound to registers. Its type can be float1/2/3/4, an array, a UAV, etc.
class FShaderParameter
{
    (......)
public:
    // Binds the parameter with the given name.
    void Bind(const FShaderParameterMap& ParameterMap,const TCHAR* ParameterName, EShaderParameterFlags Flags = SPF_Optional);
    // Whether it has been bound by the shader.
    bool IsBound() const;
    // Whether it has been initialized.
    inline bool IsInitialized() const;

    // Data accessors.
    uint32 GetBufferIndex() const;
    uint32 GetBaseIndex() const;
    uint32 GetNumBytes() const;

    (......)
};

// A shader resource binding (texture or sampler).
class FShaderResourceParameter
{
    (......)
public:
    // Binds the parameter with the given name.
    void Bind(const FShaderParameterMap& ParameterMap,const TCHAR* ParameterName,EShaderParameterFlags Flags = SPF_Optional);
    bool IsBound() const;
    inline bool IsInitialized() const;

    uint32 GetBaseIndex() const;
    uint32 GetNumResources() const;

    (......)
};

// A parameter type that binds a UAV or SRV resource.
class FRWShaderParameter
{
    (......)
public:
    // Binds the parameter with the given name.
    void Bind(const FShaderParameterMap& ParameterMap,const TCHAR* BaseName);

    bool IsBound() const;
    bool IsUAVBound() const;
    uint32 GetUAVIndex() const;

    // Sets buffer data on the RHI.
    template<typename TShaderRHIRef, typename TRHICmdList>
    inline void SetBuffer(TRHICmdList& RHICmdList, const TShaderRHIRef& Shader, const FRWBuffer& RWBuffer) const;
    template<typename TShaderRHIRef, typename TRHICmdList>
    inline void SetBuffer(TRHICmdList& RHICmdList, const TShaderRHIRef& Shader, const FRWBufferStructured& RWBuffer) const;

    // Sets texture data on the RHI.
    template<typename TShaderRHIRef, typename TRHICmdList>
    inline void SetTexture(TRHICmdList& RHICmdList, const TShaderRHIRef& Shader, FRHITexture* Texture, FRHIUnorderedAccessView* UAV) const;

    // Unsets the UAV on the RHI.
    template<typename TRHICmdList>
    inline void UnsetUAV(TRHICmdList& RHICmdList, FRHIComputeShader* ComputeShader) const;

    (......)
};

// Creates the shader code declaration of a uniform buffer struct for the given platform.
extern void CreateUniformBufferShaderDeclaration(const TCHAR* Name,const FShaderParametersMetadata& UniformBufferStruct, EShaderPlatform Platform, FString& OutDeclaration);

// A shader uniform buffer parameter.
class FShaderUniformBufferParameter
{
    (......)
public:
    // Modifies the compilation environment.
    static void ModifyCompilationEnvironment(const TCHAR* ParameterName,const FShaderParametersMetadata& Struct,EShaderPlatform Platform,FShaderCompilerEnvironment& OutEnvironment);

    // Binds the shader parameter.
    void Bind(const FShaderParameterMap& ParameterMap,const TCHAR* ParameterName,EShaderParameterFlags Flags = SPF_Optional);

    bool IsBound() const;
    inline bool IsInitialized() const;
    uint32 GetBaseIndex() const;

    (......)
};

// A shader uniform buffer parameter for a specific struct.
template<typename TBufferStruct>
class TShaderUniformBufferParameter : public FShaderUniformBufferParameter
{
public:
    static void ModifyCompilationEnvironment(const TCHAR* ParameterName,EShaderPlatform Platform, FShaderCompilerEnvironment& OutEnvironment);

    (......)
};
           

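As shown above, shader parameters can bind any kind of GPU resource or data, but each class can bind only its own kind of parameter, and the classes are not interchangeable. For example, FRWShaderParameter can bind only a UAV or an SRV. With these types in place, concrete shader parameters can be declared in a C++ shader class using LAYOUT_FIELD and its related macros.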
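The Bind and IsBound pair shared by these classes follows a simple idea: the compiler output contains a map from parameter names to allocations, and binding is a lookup by name. The following standalone sketch illustrates that idea under stated assumptions. The types are hypothetical stand-ins (suffixed Sketch), and throwing an exception for a missing mandatory parameter is my substitute for whatever error reporting the engine actually performs.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>

// Where the compiler placed a parameter (hypothetical layout).
struct ParameterAllocationSketch
{
    std::uint16_t BufferIndex;
    std::uint16_t BaseIndex;
    std::uint16_t NumBytes;
};

// Stand-in for the name-to-allocation map produced by shader compilation.
class ShaderParameterMapSketch
{
public:
    void AddParameterAllocation(const std::string& Name, std::uint16_t BufferIndex,
                                std::uint16_t BaseIndex, std::uint16_t NumBytes)
    {
        Allocations[Name] = ParameterAllocationSketch{BufferIndex, BaseIndex, NumBytes};
    }

    bool FindParameterAllocation(const std::string& Name, ParameterAllocationSketch& Out) const
    {
        const auto It = Allocations.find(Name);
        if (It == Allocations.end())
        {
            return false;
        }
        Out = It->second;
        return true;
    }

private:
    std::map<std::string, ParameterAllocationSketch> Allocations;
};

enum BindFlagsSketch { SPF_OptionalSketch, SPF_MandatorySketch };

// Stand-in for a register-bound shader parameter.
class ShaderParameterSketch
{
public:
    ShaderParameterSketch() : BufferIndex(0), BaseIndex(0), NumBytes(0) {}

    // Looks the name up in the map; an optional parameter that is missing stays unbound.
    void Bind(const ShaderParameterMapSketch& Map, const std::string& Name,
              BindFlagsSketch Flags = SPF_OptionalSketch)
    {
        ParameterAllocationSketch Allocation;
        if (Map.FindParameterAllocation(Name, Allocation))
        {
            BufferIndex = Allocation.BufferIndex;
            BaseIndex = Allocation.BaseIndex;
            NumBytes = Allocation.NumBytes;
        }
        else if (Flags == SPF_MandatorySketch)
        {
            // Assumption: a real engine would report this differently.
            throw std::runtime_error("mandatory shader parameter not found: " + Name);
        }
        else
        {
            NumBytes = 0;
        }
    }

    bool IsBound() const { return NumBytes > 0; }
    std::uint32_t GetBaseIndex() const { return BaseIndex; }
    std::uint32_t GetNumBytes() const { return NumBytes; }

private:
    std::uint16_t BufferIndex;
    std::uint16_t BaseIndex;
    std::uint16_t NumBytes;
};
```

A parameter that the shader compiler optimized away simply ends up unbound, which is why calling code typically checks IsBound before setting a value.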

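LAYOUT_FIELD is a macro that declares the type and name of a shader parameter. Its sibling macros additionally support initial values, bit fields, writer functions, and so on. The related definitions are: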

// Engine\Source\Runtime\Core\Public\Serialization\MemoryLayout.h

// Plain field
#define LAYOUT_FIELD(T, Name, ...)
// With an initial value
#define LAYOUT_FIELD_INITIALIZED(T, Name, Value, ...)
// Mutable, with an initial value
#define LAYOUT_MUTABLE_FIELD_INITIALIZED(T, Name, Value, ...)
// Array
#define LAYOUT_ARRAY(T, Name, NumArray, ...)
// Bit field
#define LAYOUT_MUTABLE_BITFIELD(T, Name, BitFieldSize, ...)
#define LAYOUT_BITFIELD(T, Name, BitFieldSize, ...)
// With a writer function
#define LAYOUT_FIELD_WITH_WRITER(T, Name, Func)
#define LAYOUT_MUTABLE_FIELD_WITH_WRITER(T, Name, Func)
#define LAYOUT_WRITE_MEMORY_IMAGE(Func)
#define LAYOUT_TOSTRING(Func)
           

With LAYOUT_FIELD and its related macros, shader parameters of a given type can be declared in a C++ class. For example:

struct FMyExampleParam
{
    // Declares a non-virtual type.
    DECLARE_TYPE_LAYOUT(FMyExampleParam, NonVirtual);
    
    // Plain fields.
    LAYOUT_FIELD(FShaderParameter, ShaderParam); // Equivalent to: FShaderParameter ShaderParam;
    LAYOUT_FIELD(FShaderResourceParameter, TextureParam); // Equivalent to: FShaderResourceParameter TextureParam;
    LAYOUT_FIELD(FRWShaderParameter, OutputUAV); // Equivalent to: FRWShaderParameter OutputUAV;
    
    // Arrays; the third argument is the maximum element count.
    LAYOUT_ARRAY(FShaderResourceParameter, TextureArray, 5); // Equivalent to: FShaderResourceParameter TextureArray[5];
    LAYOUT_ARRAY(int32, Ids, 64); // Equivalent to: int32 Ids[64];
    
    LAYOUT_FIELD_INITIALIZED(uint32, Size, 0); // Equivalent to: uint32 Size = 0;

    void WriteDataFunc(FMemoryImageWriter& Writer, const TMemoryImagePtr<FOtherExampleParam>& InParameters) const;
    // With a writer function.
    LAYOUT_FIELD_WITH_WRITER(TMemoryImagePtr<FOtherExampleParam>, Parameters, WriteDataFunc);
};
           

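UE's uniform buffers involve several core concepts. At the lowest level is FRHIUniformBuffer in the RHI layer, which wraps the uniform buffer (also called a constant buffer) of the various graphics APIs. It is defined as follows (implementation and debug code removed):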

// Engine\Source\Runtime\RHI\Public\RHIResources.h

class FRHIUniformBuffer : public FRHIResource
{
public:
    // Constructor.
    FRHIUniformBuffer(const FRHIUniformBufferLayout& InLayout);

    // Reference counting.
    uint32 AddRef() const;
    uint32 Release() const;
    
    // Data accessors.
    uint32 GetSize() const;
    const FRHIUniformBufferLayout& GetLayout() const;
    bool IsGlobal() const;

private:
    // Layout of the RHI uniform buffer.
    const FRHIUniformBufferLayout* Layout;
    // Size of the buffer.
    uint32 LayoutConstantBufferSize;
};
           

One level up is TUniformBufferRef, which references the FRHIUniformBuffer above:

// Engine\Source\Runtime\RHI\Public\RHIResources.h

// Defines the reference type for FRHIUniformBuffer.
typedef TRefCountPtr<FRHIUniformBuffer> FUniformBufferRHIRef;


// Engine\Source\Runtime\RenderCore\Public\ShaderParameterMacros.h

// References an FRHIUniformBuffer instance of the given struct type. Note that it inherits from FUniformBufferRHIRef.
template<typename TBufferStruct>
class TUniformBufferRef : public FUniformBufferRHIRef
{
public:
    TUniformBufferRef();

    // Creates a uniform buffer from the given value and returns a typed reference to it. (template)
    static TUniformBufferRef<TBufferStruct> CreateUniformBufferImmediate(const TBufferStruct& Value, EUniformBufferUsage Usage, EUniformBufferValidation Validation = EUniformBufferValidation::ValidateResources);
    // Creates a [local] uniform buffer from the given value and returns a reference to it.
    static FLocalUniformBuffer CreateLocalUniformBuffer(FRHICommandList& RHICmdList, const TBufferStruct& Value, EUniformBufferUsage Usage);

    // Immediately flushes the buffer data to the RHI.
    void UpdateUniformBufferImmediate(const TBufferStruct& Value);

private:
    // Private constructor; only TUniformBuffer and TRDGUniformBuffer may use it.
    TUniformBufferRef(FRHIUniformBuffer* InRHIRef);

    template<typename TBufferStruct2>
    friend class TUniformBuffer;

    friend class TRDGUniformBuffer<TBufferStruct>;
};
           

One more level up are TUniformBuffer and TRDGUniformBuffer, which reference FUniformBufferRHIRef. They are defined as follows:

// Engine\Source\Runtime\RenderCore\Public\UniformBuffer.h

// A render resource that references a uniform buffer.
template<typename TBufferStruct>
class TUniformBuffer : public FRenderResource
{
public:
    // Constructor.
    TUniformBuffer()
        : BufferUsage(UniformBuffer_MultiFrame)
        , Contents(nullptr){}

    // Destructor.
    ~TUniformBuffer()
    {
        if (Contents)
        {
            FMemory::Free(Contents);
        }
    }

    // Sets the contents of the uniform buffer and updates the RHI.
    void SetContents(const TBufferStruct& NewContents)
    {
        SetContentsNoUpdate(NewContents);
        UpdateRHI();
    }
    // Zeroes the contents of the uniform buffer (allocating them first if they do not exist yet).
    void SetContentsToZero()
    {
        if (!Contents)
        {
            Contents = (uint8*)FMemory::Malloc(sizeof(TBufferStruct), SHADER_PARAMETER_STRUCT_ALIGNMENT);
        }
        FMemory::Memzero(Contents, sizeof(TBufferStruct));
        UpdateRHI();
    }

    // Gets the contents.
    const uint8* GetContents() const 
    {
        return Contents;
    }

    // ---- FRenderResource overrides ----
    
    // Initializes the dynamic RHI resource.
    virtual void InitDynamicRHI() override
    {
        check(IsInRenderingThread());
        UniformBufferRHI.SafeRelease();
        if (Contents)
        {
            // Creates the RHI resource from the raw content bytes.
            UniformBufferRHI = CreateUniformBufferImmediate<TBufferStruct>(*((const TBufferStruct*)Contents), BufferUsage);
        }
    }
    // Releases the dynamic RHI resource.
    virtual void ReleaseDynamicRHI() override
    {
        UniformBufferRHI.SafeRelease();
    }

    // Data accessors.
    FRHIUniformBuffer* GetUniformBufferRHI() const
    { 
        return UniformBufferRHI; 
    }
    const TUniformBufferRef<TBufferStruct>& GetUniformBufferRef() const
    {
        return UniformBufferRHI;
    }

    // Buffer usage flag.
    EUniformBufferUsage BufferUsage;

protected:
    // Sets the contents of the uniform buffer without updating the RHI.
    void SetContentsNoUpdate(const TBufferStruct& NewContents)
    {
        if (!Contents)
        {
            Contents = (uint8*)FMemory::Malloc(sizeof(TBufferStruct), SHADER_PARAMETER_STRUCT_ALIGNMENT);
        }
        FMemory::Memcpy(Contents,&NewContents,sizeof(TBufferStruct));
    }

private:
    // The TUniformBufferRef reference.
    TUniformBufferRef<TBufferStruct> UniformBufferRHI;
    // CPU-side contents.
    uint8* Contents;
};


// Engine\Source\Runtime\RenderCore\Public\RenderGraphResources.h

class FRDGUniformBuffer : public FRDGResource
{
public:
    bool IsGlobal() const;
    const FRDGParameterStruct& GetParameters() const;

    //////////////////////////////////////////////////////////////////////////
    // Gets the RHI resource; may only be called while a pass is executing.
    FRHIUniformBuffer* GetRHI() const
    {
        return static_cast<FRHIUniformBuffer*>(FRDGResource::GetRHI());
    }
    //////////////////////////////////////////////////////////////////////////

protected:
    // Constructor.
    template <typename TParameterStruct>
    explicit FRDGUniformBuffer(TParameterStruct* InParameters, const TCHAR* InName)
        : FRDGResource(InName)
        , ParameterStruct(InParameters)
        , bGlobal(ParameterStruct.HasStaticSlot())
    {}

private:
    const FRDGParameterStruct ParameterStruct;
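    // The referenced FRHIUniformBuffer resource.
    // Note that TUniformBufferRef<TBufferStruct> and FUniformBufferRHIRef are equivalent here.
    TRefCountPtr<FRHIUniformBuffer> UniformBufferRHI;
    FRDGUniformBufferHandle Handle;

    // Whether the buffer is bound globally or locally by shaders (initialized from HasStaticSlot()).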
    uint8 bGlobal : 1;

    friend FRDGBuilder;
    friend FRDGUniformBufferRegistry;
    friend FRDGAllocator;
};

// Templated version of FRDGUniformBuffer.
template <typename ParameterStructType>
class TRDGUniformBuffer : public FRDGUniformBuffer
{
public:
    // Data accessors.
    const TRDGParameterStruct<ParameterStructType>& GetParameters() const;
    TUniformBufferRef<ParameterStructType> GetRHIRef() const;
    const ParameterStructType* operator->() const;

private:
    explicit TRDGUniformBuffer(ParameterStructType* InParameters, const TCHAR* InName)
        : FRDGUniformBuffer(InParameters, InName)
    {}

    friend FRDGBuilder;
    friend FRDGUniformBufferRegistry;
    friend FRDGAllocator;
};
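
The three layers described above (an RHI-level buffer, a typed reference to it, and an owner that keeps a CPU-side copy) can be pictured with a small standalone sketch. This is a deliberately simplified model, not engine code: every type is hypothetical (suffixed Sketch), std::shared_ptr stands in for the engine's intrusive reference counting, and the owner updates the existing buffer in place, which may differ from what the engine does when contents change.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <vector>

// Layer 1: stand-in for the RHI-level buffer, an opaque block of bytes "on the GPU".
class RHIUniformBufferSketch
{
public:
    explicit RHIUniformBufferSketch(std::size_t Size) : Bytes(Size, 0) {}
    std::size_t GetSize() const { return Bytes.size(); }
    void Upload(const void* Src, std::size_t Size)
    {
        assert(Size == Bytes.size());
        std::memcpy(Bytes.data(), Src, Size);
    }
    const unsigned char* Data() const { return Bytes.data(); }

private:
    std::vector<unsigned char> Bytes;
};

// Layer 2: a typed, reference-counted handle to the RHI buffer.
// TBufferStruct is assumed to be trivially copyable.
template <typename TBufferStruct>
class UniformBufferRefSketch
{
public:
    static UniformBufferRefSketch Create(const TBufferStruct& Value)
    {
        UniformBufferRefSketch Ref;
        Ref.Buffer = std::make_shared<RHIUniformBufferSketch>(sizeof(TBufferStruct));
        Ref.Update(Value);
        return Ref;
    }
    void Update(const TBufferStruct& Value) { Buffer->Upload(&Value, sizeof(TBufferStruct)); }
    bool IsValid() const { return Buffer != nullptr; }
    long UseCount() const { return Buffer.use_count(); }
    // Debug helper for this sketch only: copies the bytes back into a struct.
    TBufferStruct ReadBack() const
    {
        TBufferStruct Out;
        std::memcpy(&Out, Buffer->Data(), sizeof(Out));
        return Out;
    }

private:
    std::shared_ptr<RHIUniformBufferSketch> Buffer;
};

// Layer 3: an owner that keeps a CPU-side copy and pushes it to the RHI layer.
template <typename TBufferStruct>
class UniformBufferSketch
{
public:
    void SetContents(const TBufferStruct& NewContents)
    {
        Contents.reset(new TBufferStruct(NewContents));
        UpdateRHI();
    }
    const UniformBufferRefSketch<TBufferStruct>& GetRef() const { return Ref; }

private:
    // Simplification: create the buffer on first use, update it in place afterwards.
    void UpdateRHI()
    {
        if (!Ref.IsValid())
        {
            Ref = UniformBufferRefSketch<TBufferStruct>::Create(*Contents);
        }
        else
        {
            Ref.Update(*Contents);
        }
    }

    std::unique_ptr<TBufferStruct> Contents;
    UniformBufferRefSketch<TBufferStruct> Ref;
};

// Hypothetical parameter struct used to exercise the layers.
struct LightParamsSketch
{
    float Intensity;
    float Radius;
};
```

The separation means that rendering code can pass around the cheap typed handle, while only the owner needs to know where the CPU-side copy lives.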
           

Abstracted into a UML inheritance diagram, they look like this:

classDiagram-v2

FRHIResource <|-- FRHIUniformBuffer
FUniformBufferRHIRef <|-- TUniformBufferRef
FRHIUniformBuffer <-- FUniformBufferRHIRef

class FRHIResource{
}
class FRHIUniformBuffer{
    FRHIUniformBufferLayout* Layout
    uint32 LayoutConstantBufferSize
}
class FUniformBufferRHIRef{
    FRHIUniformBuffer* Reference
}
class TUniformBufferRef{
    TUniformBufferRef(FRHIUniformBuffer* InRHIRef)
    CreateUniformBufferImmediate()
    CreateLocalUniformBuffer()
    UpdateUniformBufferImmediate()
}

FRenderResource <|-- TUniformBuffer
TUniformBufferRef <-- TUniformBuffer

class FRenderResource{
}
class TUniformBuffer{
    SetContents()
    GetUniformBufferRHI()
    GetUniformBufferRef()
    uint8* Contents
    EUniformBufferUsage BufferUsage
    TUniformBufferRef<TBufferStruct> UniformBufferRHI
}

FRDGUniformBuffer <|-- TRDGUniformBuffer
FUniformBufferRHIRef <-- FRDGUniformBuffer

class FRDGUniformBuffer{
    FUniformBufferRHIRef UniformBufferRHI
    FRDGUniformBufferHandle Handle
}
class TRDGUniformBuffer{
    GetRHIRef()
}

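A small gripe: the text-based diagram syntax Mermaid does not let you specify the layout, so the auto-generated diagram is not very attractive, and when the UI is scaled up on Windows the text gets cut off. Please bear with it.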

Structs and struct members for the uniform buffer types above can be defined with the SHADER_PARAMETER family of macros, which are defined as follows:

// Engine\Source\Runtime\RenderCore\Public\ShaderParameterMacros.h

// Shader Parameter Struct: begin/end.
#define BEGIN_SHADER_PARAMETER_STRUCT(StructTypeName, PrefixKeywords)
#define END_SHADER_PARAMETER_STRUCT()

// Uniform Buffer Struct: begin/end/implement.
#define BEGIN_UNIFORM_BUFFER_STRUCT(StructTypeName, PrefixKeywords)
#define BEGIN_UNIFORM_BUFFER_STRUCT_WITH_CONSTRUCTOR(StructTypeName, PrefixKeywords)
#define END_UNIFORM_BUFFER_STRUCT()
#define IMPLEMENT_UNIFORM_BUFFER_STRUCT(StructTypeName,ShaderVariableName)
#define IMPLEMENT_UNIFORM_BUFFER_ALIAS_STRUCT(StructTypeName, UniformBufferAlias)
#define IMPLEMENT_STATIC_UNIFORM_BUFFER_STRUCT(StructTypeName,ShaderVariableName,StaticSlotName)
#define IMPLEMENT_STATIC_UNIFORM_BUFFER_SLOT(SlotName)

// Global Shader Parameter Struct: begin/end/implement.
#define BEGIN_GLOBAL_SHADER_PARAMETER_STRUCT
#define BEGIN_GLOBAL_SHADER_PARAMETER_STRUCT_WITH_CONSTRUCTOR
#define END_GLOBAL_SHADER_PARAMETER_STRUCT
#define IMPLEMENT_GLOBAL_SHADER_PARAMETER_STRUCT
#define IMPLEMENT_GLOBAL_SHADER_PARAMETER_ALIAS_STRUCT

// Shader Parameter: single value, array.
#define SHADER_PARAMETER(MemberType, MemberName)
#define SHADER_PARAMETER_EX(MemberType,MemberName,Precision)
#define SHADER_PARAMETER_ARRAY(MemberType,MemberName,ArrayDecl)
#define SHADER_PARAMETER_ARRAY_EX(MemberType,MemberName,ArrayDecl,Precision)

// Shader Parameter: texture, SRV, UAV, sampler, and their arrays.
#define SHADER_PARAMETER_TEXTURE(ShaderType,MemberName)
#define SHADER_PARAMETER_TEXTURE_ARRAY(ShaderType,MemberName, ArrayDecl)
#define SHADER_PARAMETER_SRV(ShaderType,MemberName)
#define SHADER_PARAMETER_UAV(ShaderType,MemberName)
#define SHADER_PARAMETER_SAMPLER(ShaderType,MemberName)
#define SHADER_PARAMETER_SAMPLER_ARRAY(ShaderType,MemberName, ArrayDecl)

// A Shader Parameter Struct nested inside a Shader Parameter Struct.
#define SHADER_PARAMETER_STRUCT(StructType,MemberName)
#define SHADER_PARAMETER_STRUCT_ARRAY(StructType,MemberName, ArrayDecl)
#define SHADER_PARAMETER_STRUCT_INCLUDE(StructType,MemberName)
// References a [global] shader parameter struct.
#define SHADER_PARAMETER_STRUCT_REF(StructType,MemberName)

// Shader Parameters for RDG.
#define SHADER_PARAMETER_RDG_TEXTURE(ShaderType,MemberName)
#define SHADER_PARAMETER_RDG_TEXTURE_SRV(ShaderType,MemberName)
#define SHADER_PARAMETER_RDG_TEXTURE_UAV(ShaderType,MemberName)
#define SHADER_PARAMETER_RDG_TEXTURE_UAV_ARRAY(ShaderType,MemberName, ArrayDecl)
#define SHADER_PARAMETER_RDG_BUFFER(ShaderType,MemberName)
#define SHADER_PARAMETER_RDG_BUFFER_SRV(ShaderType,MemberName)
#define SHADER_PARAMETER_RDG_BUFFER_SRV_ARRAY(ShaderType,MemberName, ArrayDecl)
#define SHADER_PARAMETER_RDG_BUFFER_UAV(ShaderType,MemberName)
#define SHADER_PARAMETER_RDG_BUFFER_UAV_ARRAY(ShaderType,MemberName, ArrayDecl)
#define SHADER_PARAMETER_RDG_UNIFORM_BUFFER(StructType, MemberName)
           

Note that a local (ordinary) Shader Parameter Struct has no implementation macro (there is no IMPLEMENT_SHADER_PARAMETER_STRUCT); only the global one does (IMPLEMENT_GLOBAL_SHADER_PARAMETER_STRUCT).

The following example shows how to use some of the macros above to declare various kinds of shader parameters:

// Defines a global shader parameter struct (in a .h or a .cpp file, though usually a .h).
BEGIN_GLOBAL_SHADER_PARAMETER_STRUCT(FMyShaderParameterStruct, )
    // Ordinary single and array parameters.
    SHADER_PARAMETER(float, Intensity)
    SHADER_PARAMETER_ARRAY(FVector, Vertexes, [8])
    
    // Sampler, texture, SRV, UAV
    SHADER_PARAMETER_SAMPLER(SamplerState, TextureSampler)
    SHADER_PARAMETER_TEXTURE(Texture3D, Texture3d)
    SHADER_PARAMETER_SRV(Buffer<float4>, VertexColorBuffer)
    SHADER_PARAMETER_UAV(RWStructuredBuffer<float4>, OutputTexture)
    
    // Shader parameter structs
    // References a shader parameter struct (only global ones can be referenced).
    SHADER_PARAMETER_STRUCT_REF(FViewUniformShaderParameters, View)
    // Includes a shader parameter struct (local or global).
    SHADER_PARAMETER_STRUCT_INCLUDE(FSceneTextureShaderParameters, SceneTextures)
END_GLOBAL_SHADER_PARAMETER_STRUCT()

// Implements the global shader parameter struct (only in a .cpp file).
IMPLEMENT_GLOBAL_SHADER_PARAMETER_STRUCT(FMyShaderParameterStruct, "MyShaderParameterStruct");
           

The shader parameter struct above is declared and implemented on the C++ side. Passing it correctly to the shader requires some additional C++ code:

// Declare the struct.
FMyShaderParameterStruct MyShaderParameterStruct;

// Create the RHI resource.
// It can be multi-frame (UniformBuffer_MultiFrame): create it once, cache the reference, and call UpdateUniformBufferImmediate whenever the data changes.
// It can also be single-frame (UniformBuffer_SingleFrame), in which case it must be created and filled every frame.
auto MyShaderParameterStructRHI = TUniformBufferRef<FMyShaderParameterStruct>::CreateUniformBufferImmediate(MyShaderParameterStruct, EUniformBufferUsage::UniformBuffer_MultiFrame);

// Update the shader parameter struct.
MyShaderParameterStruct.Intensity = 1.0f;
(......)

// Push the data to the RHI.
MyShaderParameterStructRHI.UpdateUniformBufferImmediate(MyShaderParameterStruct);
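
As a rough mental model of what a BEGIN/END parameter-struct macro pair has to produce, namely a plain C++ struct plus metadata describing each member, here is a standalone sketch that uses the classic X-macro technique. This is explicitly not how the engine's macros are implemented; I am swapping in X-macros because they are short and easy to follow. Every name below (suffixed Sketch, or prefixed MY_ and SKETCH_) is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Metadata recorded for each member (hypothetical, far simpler than real reflection data).
struct MemberInfoSketch
{
    const char* Name;
    std::size_t Offset;
    std::size_t Size;
};

// Single source of truth: the member list, written once.
#define MY_LIGHT_PARAMETERS(FIELD) \
    FIELD(float, Intensity) \
    FIELD(float, Radius) \
    FIELD(int, NumSamples)

// First expansion: turn the list into struct members.
#define SKETCH_DECLARE_MEMBER(Type, Name) Type Name;
struct FMyLightParametersSketch
{
    MY_LIGHT_PARAMETERS(SKETCH_DECLARE_MEMBER)
};

// Second expansion: turn the same list into metadata entries.
#define SKETCH_DESCRIBE_MEMBER(Type, Name) \
    MemberInfoSketch{ #Name, offsetof(FMyLightParametersSketch, Name), sizeof(Type) },

inline const std::vector<MemberInfoSketch>& GetMyLightParametersMembers()
{
    static const std::vector<MemberInfoSketch> Members = {
        MY_LIGHT_PARAMETERS(SKETCH_DESCRIBE_MEMBER)
    };
    return Members;
}

// Finds a member's metadata by name; returns nullptr if there is no such member.
const MemberInfoSketch* FindMember(const char* Name)
{
    for (const MemberInfoSketch& Member : GetMyLightParametersMembers())
    {
        if (std::strcmp(Member.Name, Name) == 0)
        {
            return &Member;
        }
    }
    return nullptr;
}
```

Having both the struct and its member table generated from one list is what allows generic code to copy a C++ struct into a GPU buffer and to match members to shader variables by name.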
           

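As we know, the engine has many kinds of meshes, such as static meshes, skinned skeletal meshes, procedural meshes, and landscape. Materials support these mesh types through the vertex factory, FVertexFactory. In practice, a vertex factory involves many kinds of data and types, including but not limited to:

  • Vertex shaders. The inputs and outputs of a vertex shader rely on the vertex factory to describe the data layout.
  • Vertex factory parameters and RHI resources. This data is passed from the C++ layer to the vertex shader for processing.
  • Vertex buffers and vertex layouts. Through the vertex layout, the input of the vertex buffer can be customized and extended, which makes custom shader code possible.
  • Geometry preprocessing. Vertex buffers, mesh resources, material parameters, and so on can be preprocessed before the actual rendering.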
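[Figure: the vertex factory within the rendering hierarchy]

The vertex factory's relationships within the rendering hierarchy. As the figure shows, the vertex factory is a render-thread object that spans both the CPU and the GPU side.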
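
To give the phrase "vertex layout" something concrete to hold on to, the standalone sketch below describes an interleaved vertex struct as a list of elements, each with a stream index, a byte offset, a type, an attribute index, and a stride. It is an illustration only: the struct, the enum, and the helper functions are hypothetical (suffixed Sketch or prefixed ET_), and the attribute assignment is an arbitrary choice for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum ElementTypeSketch { ET_Float2, ET_Float3, ET_UByte4 };

// Size in bytes of one element of the given type.
std::size_t ElementSizeInBytes(ElementTypeSketch Type)
{
    switch (Type)
    {
    case ET_Float2: return 2 * sizeof(float);
    case ET_Float3: return 3 * sizeof(float);
    case ET_UByte4: return 4;
    }
    return 0;
}

// One entry of a vertex declaration (hypothetical, loosely shaped like a vertex element).
struct VertexElementSketch
{
    std::uint8_t StreamIndex;    // which vertex buffer the data comes from
    std::uint8_t Offset;         // byte offset inside one vertex
    ElementTypeSketch Type;      // data type of the attribute
    std::uint8_t AttributeIndex; // input slot seen by the vertex shader
    std::uint16_t Stride;        // bytes between consecutive vertices
};

// The CPU-side vertex struct stored in an interleaved vertex buffer.
struct MyVertexSketch
{
    float Position[3];
    float UV[2];
    std::uint8_t Color[4];
};

// Builds the declaration that tells the pipeline how to read MyVertexSketch.
std::vector<VertexElementSketch> BuildMyVertexDeclaration()
{
    const std::uint16_t Stride = static_cast<std::uint16_t>(sizeof(MyVertexSketch));
    std::vector<VertexElementSketch> Elements;
    Elements.push_back({0, static_cast<std::uint8_t>(offsetof(MyVertexSketch, Position)), ET_Float3, 0, Stride});
    Elements.push_back({0, static_cast<std::uint8_t>(offsetof(MyVertexSketch, UV)), ET_Float2, 1, Stride});
    Elements.push_back({0, static_cast<std::uint8_t>(offsetof(MyVertexSketch, Color)), ET_UByte4, 2, Stride});
    return Elements;
}

// Sanity check: every element must fit inside one vertex.
bool IsDeclarationValid(const std::vector<VertexElementSketch>& Elements)
{
    for (const VertexElementSketch& Element : Elements)
    {
        if (Element.Offset + ElementSizeInBytes(Element.Type) > Element.Stride)
        {
            return false;
        }
    }
    return true;
}
```

Adding a new per-vertex attribute then amounts to extending the struct and appending one more element, which is the kind of customization the vertex layout bullet refers to.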

FVertexFactory encapsulates the vertex data resources that can be linked to a vertex shader. It and its related types are defined as follows:

// Engine\Source\Runtime\RHI\Public\RHI.h

// A vertex element.
struct FVertexElement
{
    uint8 StreamIndex;      // Stream index
    uint8 Offset;          // Offset
    TEnumAsByte<EVertexElementType> Type; // Type
    uint8 AttributeIndex;// Attribute index
    uint16 Stride;          // Stride
    // Whether the element is consumed by instance index or by vertex index. If 0, the element is repeated for every instance.
    uint16 bUseInstanceIndex;

    FVertexElement();
    FVertexElement(uint8 InStreamIndex, ...);
    
    void operator=(const FVertexElement& Other);
    friend FArchive& operator<<(FArchive& Ar,FVertexElement& Element);
    
    FString ToString() const;
    void FromString(const FString& Src);
    void FromString(const FStringView& Src);
};

// The type of a vertex declaration element list.
typedef TArray<FVertexElement,TFixedAllocator<MaxVertexElementCount> > FVertexDeclarationElementList;


// Engine\Source\Runtime\RHI\Public\RHIResources.h

// The RHI resource of a vertex declaration.
class FRHIVertexDeclaration : public FRHIResource
{
public:
    virtual bool GetInitializer(FVertexDeclarationElementList& Init) { return false; }
};

// A vertex buffer.
class FRHIVertexBuffer : public FRHIResource
{
public:
    FRHIVertexBuffer(uint32 InSize,uint32 InUsage);

    uint32 GetSize() const;
    uint32 GetUsage() const;

protected:
    FRHIVertexBuffer();

    void Swap(FRHIVertexBuffer& Other);
    void ReleaseUnderlyingResource();

private:
    // Size.
    uint32 Size;
    // Buffer usage flags, e.g. BUF_UnorderedAccess.
    uint32 Usage;
};


// Engine\Source\Runtime\RenderCore\Public\VertexFactory.h

// A vertex input stream.
struct FVertexInputStream
{
    // Vertex stream index.
    uint32 StreamIndex : 4;
    // Offset into the VertexBuffer.
    uint32 Offset : 28;
    // The vertex buffer.
    FRHIVertexBuffer* VertexBuffer;

    FVertexInputStream();
    FVertexInputStream(uint32 InStreamIndex, uint32 InOffset, FRHIVertexBuffer* InVertexBuffer);

    inline bool operator==(const FVertexInputStream& rhs) const;
    inline bool operator!=(const FVertexInputStream& rhs) const;
};

// An array of vertex input streams.
typedef TArray<FVertexInputStream, TInlineAllocator<4>> FVertexInputStreamArray;

// Vertex stream usage flags.
enum class EVertexStreamUsage : uint8
{
    Default            = 0 << 0, // Default
    Instancing        = 1 << 0, // Instancing
    Overridden        = 1 << 1, // Overridden
    ManualFetch        = 1 << 2  // Manual fetch
};

// Vertex input stream type.
enum class EVertexInputStreamType : uint8
{
    Default = 0,  // Default
    PositionOnly, // Position only
    PositionAndNormalOnly // Position and normal only
};

// 頂點流元件.
struct FVertexStreamComponent
{
    // 流資料的頂點緩沖區, 如果為null, 則不會有資料從此頂點流被讀取.
    const FVertexBuffer* VertexBuffer = nullptr;

    // vertex buffer的偏移.
    uint32 StreamOffset = 0;
    // 資料的偏移, 相對于頂點緩沖區中每個元素的開頭.
    uint8 Offset = 0;
    // 資料的步長.
    uint8 Stride = 0;
    // 從流讀取的資料類型.
    TEnumAsByte<EVertexElementType> Type = VET_None;
    // 頂點流标記.
    EVertexStreamUsage VertexStreamUsage = EVertexStreamUsage::Default;

    (......)
};

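// Interface for binding the vertex factory parameters used by shaders.
class FVertexFactoryShaderParameters
{
public:
    // Binds parameters to the ParameterMap. The actual logic is provided by subclasses.
    void Bind(const class FShaderParameterMap& ParameterMap) {}

    // Gets the vertex factory's shader bindings and vertex streams. The actual logic is provided by subclasses.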
    void GetElementShaderBindings(
        const class FSceneInterface* Scene,
        const class FSceneView* View,
        const class FMeshMaterialShader* Shader,
        const EVertexInputStreamType InputStreamType,
        ERHIFeatureLevel::Type FeatureLevel,
        const class FVertexFactory* VertexFactory,
        const struct FMeshBatchElement& BatchElement,
        class FMeshDrawSingleShaderBindings& ShaderBindings,
        FVertexInputStreamArray& VertexStreams) const {}

    (......)
};

// Class that represents a vertex factory type.
class FVertexFactoryType
{
public:
    // Type definitions.
    typedef FVertexFactoryShaderParameters* (*ConstructParametersType)(EShaderFrequency ShaderFrequency, const class FShaderParameterMap& ParameterMap);
    typedef const FTypeLayoutDesc* (*GetParameterTypeLayoutType)(EShaderFrequency ShaderFrequency);
    (......)

    // Gets the number of vertex factory types.
    static int32 GetNumVertexFactoryTypes();

    // Gets the global list of vertex factory types.
    static RENDERCORE_API TLinkedList<FVertexFactoryType*>*& GetTypeList();
    // Gets the sorted list of material vertex factory types.
    static RENDERCORE_API const TArray<FVertexFactoryType*>& GetSortedMaterialTypes();
    // Finds an FVertexFactoryType by name.
    static RENDERCORE_API FVertexFactoryType* GetVFByName(const FHashedName& VFName);

    // Initializes the static members of FVertexFactoryType. Must be called before any VF type is created.
    static void Initialize(const TMap<FString, TArray<const TCHAR*> >& ShaderFileToUniformBufferVariables);
    static void Uninitialize();

    // Constructor / destructor.
    RENDERCORE_API FVertexFactoryType(...);
    virtual ~FVertexFactoryType();

    // Data accessors.
    const TCHAR* GetName() const;
    FName GetFName() const;
    const FHashedName& GetHashedName() const;
    const TCHAR* GetShaderFilename() const;

    // Shader parameter interface.
    FVertexFactoryShaderParameters* CreateShaderParameters(...) const;
    const FTypeLayoutDesc* GetShaderParameterLayout(...) const;
    void GetShaderParameterElementShaderBindings(...) const;

    // Flag accessors.
    bool IsUsedWithMaterials() const;
    bool SupportsStaticLighting() const;
    bool SupportsDynamicLighting() const;
    bool SupportsPrecisePrevWorldPos() const;
    bool SupportsPositionOnly() const;
    bool SupportsCachingMeshDrawCommands() const;
    bool SupportsPrimitiveIdStream() const;

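    // Gets the hash.
    friend uint32 GetTypeHash(const FVertexFactoryType* Type);
    // Hash computed from the vertex factory type's source code and its includes.
    const FSHAHash& GetSourceHash(EShaderPlatform ShaderPlatform) const;
    // Whether shaders for the given permutation parameters should be cached.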
    bool ShouldCache(const FVertexFactoryShaderPermutationParameters& Parameters) const;

    void ModifyCompilationEnvironment(...);
    void ValidateCompiledResult(EShaderPlatform Platform, ...);

    bool SupportsTessellationShaders() const;

    // Adds the includes for the referenced uniform buffers.
    void AddReferencedUniformBufferIncludes(...);
    void FlushShaderFileCache(...);
    const TMap<const TCHAR*, FCachedUniformBufferDeclaration>& GetReferencedUniformBufferStructsCache() const;

private:
    static uint32 NumVertexFactories;
    static bool bInitializedSerializationHistory;

    // Data and flags of the vertex factory type.
    const TCHAR* Name;
    const TCHAR* ShaderFilename;
    FName TypeName;
    FHashedName HashedName;
    uint32 bUsedWithMaterials : 1;
    uint32 bSupportsStaticLighting : 1;
    uint32 bSupportsDynamicLighting : 1;
    uint32 bSupportsPrecisePrevWorldPos : 1;
    uint32 bSupportsPositionOnly : 1;
    uint32 bSupportsCachingMeshDrawCommands : 1;
    uint32 bSupportsPrimitiveIdStream : 1;
    ConstructParametersType ConstructParameters;
    GetParameterTypeLayoutType GetParameterTypeLayout;
    GetParameterTypeElementShaderBindingsType GetParameterTypeElementShaderBindings;
    ShouldCacheType ShouldCacheRef;
    ModifyCompilationEnvironmentType ModifyCompilationEnvironmentRef;
    ValidateCompiledResultType ValidateCompiledResultRef;
    SupportsTessellationShadersType SupportsTessellationShadersRef;

    // Link in the global list of vertex factory types.
    TLinkedList<FVertexFactoryType*> GlobalListLink;
    // Cache of the includes for the referenced uniform buffers.
    TMap<const TCHAR*, FCachedUniformBufferDeclaration> ReferencedUniformBufferStructsCache;
    // Tracks which platforms ReferencedUniformBufferStructsCache has cached declarations for.
    bool bCachedUniformBufferStructDeclarations;
};


// ------ Vertex factory helper macros ------

// Implements a vertex factory parameter type.
#define IMPLEMENT_VERTEX_FACTORY_PARAMETER_TYPE(FactoryClass, ShaderFrequency, ParameterClass)

// Declares a vertex factory type.
#define DECLARE_VERTEX_FACTORY_TYPE(FactoryClass)
// Implements a vertex factory type.
#define IMPLEMENT_VERTEX_FACTORY_TYPE(FactoryClass,ShaderFilename,bUsedWithMaterials,bSupportsStaticLighting,bSupportsDynamicLighting,bPrecisePrevWorldPos,bSupportsPositionOnly)
// Implements the virtual table of a vertex factory.
#define IMPLEMENT_VERTEX_FACTORY_VTABLE(FactoryClass)


// Vertex factory.
class FVertexFactory : public FRenderResource
{
public:
    FVertexFactory(ERHIFeatureLevel::Type InFeatureLevel);

    virtual FVertexFactoryType* GetType() const;

    // Gets the vertex data streams.
    void GetStreams(ERHIFeatureLevel::Type InFeatureLevel, EVertexInputStreamType VertexStreamType, FVertexInputStreamArray& OutVertexStreams) const
    {
        // Default vertex stream type.
        if (VertexStreamType == EVertexInputStreamType::Default)
        {
            bool bSupportsVertexFetch = SupportsManualVertexFetch(InFeatureLevel);

            // Build an FVertexInputStream from each stream of the vertex factory and add it to the output list.
            for (int32 StreamIndex = 0;StreamIndex < Streams.Num();StreamIndex++)
            {
                const FVertexStream& Stream = Streams[StreamIndex];

                if (!(EnumHasAnyFlags(EVertexStreamUsage::ManualFetch, Stream.VertexStreamUsage) && bSupportsVertexFetch))
                {
                    if (!Stream.VertexBuffer)
                    {
                        OutVertexStreams.Add(FVertexInputStream(StreamIndex, 0, nullptr));
                    }
                    else
                    {
                        if (EnumHasAnyFlags(EVertexStreamUsage::Overridden, Stream.VertexStreamUsage) && !Stream.VertexBuffer->IsInitialized())
                        {
                            OutVertexStreams.Add(FVertexInputStream(StreamIndex, 0, nullptr));
                        }
                        else
                        {
                            OutVertexStreams.Add(FVertexInputStream(StreamIndex, Stream.Offset, Stream.VertexBuffer->VertexBufferRHI));
                        }
                    }
                }
            }
        }
        // Position-only vertex stream type.
        else if (VertexStreamType == EVertexInputStreamType::PositionOnly)
        {
            // Set the predefined vertex streams.
            for (int32 StreamIndex = 0; StreamIndex < PositionStream.Num(); StreamIndex++)
            {
                const FVertexStream& Stream = PositionStream[StreamIndex];
                OutVertexStreams.Add(FVertexInputStream(StreamIndex, Stream.Offset, Stream.VertexBuffer->VertexBufferRHI));
            }
        }
        // Position-and-normal-only vertex stream type.
        else if (VertexStreamType == EVertexInputStreamType::PositionAndNormalOnly)
        {
            // Set the predefined vertex streams.
            for (int32 StreamIndex = 0; StreamIndex < PositionAndNormalStream.Num(); StreamIndex++)
            {
                const FVertexStream& Stream = PositionAndNormalStream[StreamIndex];
                OutVertexStreams.Add(FVertexInputStream(StreamIndex, Stream.Offset, Stream.VertexBuffer->VertexBufferRHI));
            }
        }
        else
        {
            // NOT_IMPLEMENTED
        }
    }
    
    // Offsets the instance data streams.
    void OffsetInstanceStreams(uint32 InstanceOffset, EVertexInputStreamType VertexStreamType, FVertexInputStreamArray& VertexStreams) const;
    
    static void ModifyCompilationEnvironment(...);
    static void ValidateCompiledResult(...);

    static bool SupportsTessellationShaders();

    // FRenderResource interface: releases the RHI resources.
    virtual void ReleaseRHI();

    // Sets/gets the RHI reference of the vertex declaration.
    FVertexDeclarationRHIRef& GetDeclaration();
    void SetDeclaration(FVertexDeclarationRHIRef& NewDeclaration);

    // Gets the RHI reference of the vertex declaration for the given stream type.
    const FVertexDeclarationRHIRef& GetDeclaration(EVertexInputStreamType InputStreamType) const 
    {
        switch (InputStreamType)
        {
        case EVertexInputStreamType::Default:                return Declaration;
        case EVertexInputStreamType::PositionOnly:            return PositionDeclaration;
        case EVertexInputStreamType::PositionAndNormalOnly:    return PositionAndNormalDeclaration;
        }
        return Declaration;
    }

    // Various flags.
    virtual bool IsGPUSkinned() const;
    virtual bool SupportsPositionOnlyStream() const;
    virtual bool SupportsPositionAndNormalOnlyStream() const;
    virtual bool SupportsNullPixelShader() const;

    // Whether primitives are rendered as camera-facing sprites.
    virtual bool RendersPrimitivesAsCameraFacingSprites() const;

    // Whether a vertex declaration is needed.
    bool NeedsDeclaration() const;
    // Whether manual vertex fetch is supported.
    inline bool SupportsManualVertexFetch(const FStaticFeatureLevel InFeatureLevel) const;
    // Gets the primitive id stream index for the given stream type.
    inline int32 GetPrimitiveIdStreamIndex(EVertexInputStreamType InputStreamType) const;

protected:
    inline void SetPrimitiveIdStreamIndex(EVertexInputStreamType InputStreamType, int32 StreamIndex)
    {
        PrimitiveIdStreamIndex[static_cast<uint8>(InputStreamType)] = StreamIndex;
    }

    // Creates a vertex element for a vertex stream component.
    FVertexElement AccessStreamComponent(const FVertexStreamComponent& Component,uint8 AttributeIndex);
    FVertexElement AccessStreamComponent(const FVertexStreamComponent& Component, uint8 AttributeIndex, EVertexInputStreamType InputStreamType);
    // Initializes the vertex declaration.
    void InitDeclaration(const FVertexDeclarationElementList& Elements, EVertexInputStreamType StreamType = EVertexInputStreamType::Default)
    {
        if (StreamType == EVertexInputStreamType::PositionOnly)
        {
            PositionDeclaration = PipelineStateCache::GetOrCreateVertexDeclaration(Elements);
        }
        else if (StreamType == EVertexInputStreamType::PositionAndNormalOnly)
        {
            PositionAndNormalDeclaration = PipelineStateCache::GetOrCreateVertexDeclaration(Elements);
        }
        else // (StreamType == EVertexInputStreamType::Default)
        {
            // Create the vertex declaration for rendering the factory normally.
            Declaration = PipelineStateCache::GetOrCreateVertexDeclaration(Elements);
        }
    }

    // Vertex stream: the information needed to set up one vertex stream.
    struct FVertexStream
    {
        const FVertexBuffer* VertexBuffer = nullptr;
        uint32 Offset = 0;
        uint16 Stride = 0;
        EVertexStreamUsage VertexStreamUsage = EVertexStreamUsage::Default;
        uint8 Padding = 0;

        friend bool operator==(const FVertexStream& A,const FVertexStream& B);
        FVertexStream();
    };

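    // Vertex streams used to render with this vertex factory.
    TArray<FVertexStream,TInlineAllocator<8> > Streams;

    // A VF (vertex factory) can explicitly set this to false to avoid errors when it has no declaration. Mainly for VFs that fetch data directly from buffers (such as Niagara).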
    bool bNeedsDeclaration = true;
    bool bSupportsManualVertexFetch = false;
    int8 PrimitiveIdStreamIndex[3] = { -1, -1, -1 };

private:
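    // Position-only vertex streams, used to render the vertex factory in depth passes.
    TArray<FVertexStream,TInlineAllocator<2> > PositionStream;
    // Position-and-normal-only vertex streams.
    TArray<FVertexStream, TInlineAllocator<3> > PositionAndNormalStream;

    // RHI vertex declaration used to render the vertex factory normally.
    FVertexDeclarationRHIRef Declaration;

    // RHI resources corresponding to PositionStream and PositionAndNormalStream.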
    FVertexDeclarationRHIRef PositionDeclaration;
    FVertexDeclarationRHIRef PositionAndNormalDeclaration;
};
           

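The listing above shows many of the types involved in Vertex Factory. Several of them are core classes, such as FVertexFactory, FVertexElement, FRHIVertexDeclaration, FRHIVertexBuffer, FVertexFactoryType, FVertexStreamComponent, FVertexInputStream and FVertexFactoryShaderParameters. So how do they relate to each other?

To make the relationships concrete, take FStaticMeshDataType, the data type used by static meshes, as an example: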

FStaticMeshDataType contains several FVertexStreamComponent instances. Each FVertexStreamComponent corresponds to the index of an FVertexElement in the FVertexDeclarationElementList and the index of an FVertexStream in the FVertexInputStreamArray list.
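
The relationship can be modelled in a few lines. The following is a minimal sketch, not engine code: the type names (`ToyVertexFactory`, `StreamComponent`, `Stream`, `Element`) are hypothetical stand-ins, and reusing an identical stream instead of adding a duplicate is an assumption about what `AccessStreamComponent` does, since its body is not shown above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for the engine types (not the real UE definitions).
struct VertexBuffer { int Id; };

struct StreamComponent             // models FVertexStreamComponent
{
    const VertexBuffer* Buffer;    // source buffer
    uint32_t StreamOffset;         // offset of the stream within the buffer
    uint8_t  Offset;               // offset of the attribute within one vertex
    uint8_t  Stride;               // bytes per vertex
};

struct Stream                      // models FVertexFactory::FVertexStream
{
    const VertexBuffer* Buffer;
    uint32_t Offset;
    uint16_t Stride;
    bool operator==(const Stream& Other) const
    {
        return Buffer == Other.Buffer && Offset == Other.Offset && Stride == Other.Stride;
    }
};

struct Element                     // models FVertexElement
{
    uint8_t StreamIndex;           // index into the factory's stream list
    uint8_t Offset;                // attribute offset within one vertex
    uint8_t AttributeIndex;        // n in ATTRIBUTEn on the shader side
};

class ToyVertexFactory
{
public:
    // Registers the component's stream (reusing an identical one if present)
    // and returns the declaration element that refers to it.
    Element AccessStreamComponent(const StreamComponent& Component, uint8_t AttributeIndex)
    {
        const Stream NewStream{Component.Buffer, Component.StreamOffset, Component.Stride};
        std::size_t Index = 0;
        for (; Index < Streams.size(); ++Index)
        {
            if (Streams[Index] == NewStream) break;
        }
        if (Index == Streams.size()) Streams.push_back(NewStream);
        assert(Index < 256 && "stream index must fit in uint8_t");
        return Element{static_cast<uint8_t>(Index), Component.Offset, AttributeIndex};
    }

    std::vector<Stream> Streams;
};
```

In this model, two components that read different attributes from the same interleaved buffer (for example the two tangent-basis components) end up sharing one stream and differ only in `Offset` and `AttributeIndex`.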

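In addition, FVertexFactory is a base class. Its main built-in subclasses are:

  • FGeometryCacheVertexVertexFactory: vertex factory for geometry cache vertices, commonly used for pre-generated mesh types such as cloth and animation.
  • FGPUBaseSkinVertexFactory: parent class for GPU-skinned skeletal meshes. Its subclasses include:
    • TGPUSkinVertexFactory: GPU skinning vertex factory with a configurable bone weight mode.
  • FLocalVertexFactory: local vertex factory, commonly used for static meshes. It has a fairly large number of subclasses:
    • FInstancedStaticMeshVertexFactory: vertex factory for instanced static meshes.
    • FSplineMeshVertexFactory: vertex factory for spline meshes.
    • FGeometryCollectionVertexFactory: vertex factory for geometry collections.
    • FGPUSkinPassthroughVertexFactory: vertex factory for skinned skeletal meshes with Skin Cache enabled.
    • FSingleTriangleMeshVertexFactory: vertex factory for a single-triangle mesh, used for volumetric cloud rendering.
    • ......
  • FParticleVertexFactoryBase: base class of the vertex factories used for particle rendering.
  • FLandscapeVertexFactory: vertex factory used for landscape rendering.

Besides the types above, which derive from FVertexFactory, there are also types that do not derive from FVertexFactory, such as:

  • FGPUBaseSkinAPEXClothVertexFactory: cloth vertex factory.
    • TGPUSkinAPEXClothVertexFactory: cloth vertex factory with a configurable bone weight mode.

Besides FVertexFactory, the other core classes have inheritance hierarchies of their own. For example, the subclasses of FVertexFactoryShaderParameters include:

  • FGeometryCacheVertexFactoryShaderParameters
  • FGPUSkinVertexFactoryShaderParameters
  • FMeshParticleVertexFactoryShaderParameters
  • FParticleSpriteVertexFactoryShaderParameters
  • FGPUSpriteVertexFactoryShaderParametersVS
  • FGPUSpriteVertexFactoryShaderParametersPS
  • FSplineMeshVertexFactoryShaderParameters
  • FLocalVertexFactoryShaderParametersBase
  • FLandscapeVertexFactoryVertexShaderParameters
  • FLandscapeVertexFactoryPixelShaderParameters

In addition, some vertex factories derive their own type from FStaticMeshDataType internally, in order to reuse the data members related to static meshes.

To better illustrate how vertex factories are used, the following takes the most common one, FLocalVertexFactory, together with CableComponent, which uses FLocalVertexFactory, as an example: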

// Engine\Source\Runtime\Engine\Public\LocalVertexFactory.h

class ENGINE_API FLocalVertexFactory : public FVertexFactory
{
public:
    FLocalVertexFactory(ERHIFeatureLevel::Type InFeatureLevel, const char* InDebugName);

    // Data type derived from FStaticMeshDataType.
    struct FDataType : public FStaticMeshDataType
    {
        FRHIShaderResourceView* PreSkinPositionComponentSRV = nullptr;
    };

    // Compilation environment modification and validation.
    static bool ShouldCompilePermutation(const FVertexFactoryShaderPermutationParameters& Parameters);
    static void ModifyCompilationEnvironment(const FVertexFactoryShaderPermutationParameters& Parameters, FShaderCompilerEnvironment& OutEnvironment);
    static void ValidateCompiledResult(const FVertexFactoryType* Type, EShaderPlatform Platform, const FShaderParameterMap& ParameterMap, TArray<FString>& OutErrors);

    // Data updated from the game thread via TSynchronizedResource.
    void SetData(const FDataType& InData);
    // Copies data from another vertex factory.
    void Copy(const FLocalVertexFactory& Other);

    // FRenderResource interface.
    virtual void InitRHI() override;
    virtual void ReleaseRHI() override
    {
        UniformBuffer.SafeRelease();
        FVertexFactory::ReleaseRHI();
    }

    // Vertex color interface.
    void SetColorOverrideStream(FRHICommandList& RHICmdList, const FVertexBuffer* ColorVertexBuffer) const;
    void GetColorOverrideStream(const FVertexBuffer* ColorVertexBuffer, FVertexInputStreamArray& VertexStreams) const;
    
    // Shader parameter and other data accessors.
    inline FRHIShaderResourceView* GetPositionsSRV() const;
    inline FRHIShaderResourceView* GetPreSkinPositionSRV() const;
    inline FRHIShaderResourceView* GetTangentsSRV() const;
    inline FRHIShaderResourceView* GetTextureCoordinatesSRV() const;
    inline FRHIShaderResourceView* GetColorComponentsSRV() const;
    inline const uint32 GetColorIndexMask() const;
    inline const int GetLightMapCoordinateIndex() const;
    inline const int GetNumTexcoords() const;
    FRHIUniformBuffer* GetUniformBuffer() const;
    
    (......)

protected:
    // Data passed in from the game thread. FDataType is a subclass of FStaticMeshDataType.
    FDataType Data;
    // Shader parameters of the local vertex factory.
    TUniformBufferRef<FLocalVertexFactoryUniformShaderParameters> UniformBuffer;
    // Vertex color stream index.
    int32 ColorStreamIndex;

    (......)
};

// Engine\Source\Runtime\Engine\Private\LocalVertexFactory.cpp

void FLocalVertexFactory::InitRHI()
{
    // Whether the GPU scene can be used.
    const bool bCanUseGPUScene = UseGPUScene(GMaxRHIShaderPlatform, GMaxRHIFeatureLevel);

    // Initialize the position streams and position declarations.
    if (Data.PositionComponent.VertexBuffer != Data.TangentBasisComponents[0].VertexBuffer)
    {
        // Adds a vertex declaration.
        auto AddDeclaration = [this, bCanUseGPUScene](EVertexInputStreamType InputStreamType, bool bAddNormal)
        {
            // Vertex stream elements.
            FVertexDeclarationElementList StreamElements;
            StreamElements.Add(AccessStreamComponent(Data.PositionComponent, 0, InputStreamType));

            bAddNormal = bAddNormal && Data.TangentBasisComponents[1].VertexBuffer != NULL;
            if (bAddNormal)
            {
                StreamElements.Add(AccessStreamComponent(Data.TangentBasisComponents[1], 2, InputStreamType));
            }

            const uint8 TypeIndex = static_cast<uint8>(InputStreamType);
            PrimitiveIdStreamIndex[TypeIndex] = -1;
            if (GetType()->SupportsPrimitiveIdStream() && bCanUseGPUScene)
            {
                // When the VF is used for rendering in normal mesh passes, this vertex buffer and offset will be overridden
                StreamElements.Add(AccessStreamComponent(FVertexStreamComponent(&GPrimitiveIdDummy, 0, 0, sizeof(uint32), VET_UInt, EVertexStreamUsage::Instancing), 1, InputStreamType));
                PrimitiveIdStreamIndex[TypeIndex] = StreamElements.Last().StreamIndex;
            }

            // Initialize the declaration.
            InitDeclaration(StreamElements, InputStreamType);
        };

        // Add the PositionOnly and PositionAndNormalOnly vertex declarations; the former does not need the normal.
        AddDeclaration(EVertexInputStreamType::PositionOnly, false);
        AddDeclaration(EVertexInputStreamType::PositionAndNormalOnly, true);
    }

    // Vertex declaration element list.
    FVertexDeclarationElementList Elements;
    
    // Vertex position.
    if(Data.PositionComponent.VertexBuffer != NULL)
    {
        Elements.Add(AccessStreamComponent(Data.PositionComponent,0));
    }

    // Primitive id.
    {
        const uint8 Index = static_cast<uint8>(EVertexInputStreamType::Default);
        PrimitiveIdStreamIndex[Index] = -1;
        if (GetType()->SupportsPrimitiveIdStream() && bCanUseGPUScene)
        {
            // When the VF is used for rendering in normal mesh passes, this vertex buffer and offset will be overridden
            Elements.Add(AccessStreamComponent(FVertexStreamComponent(&GPrimitiveIdDummy, 0, 0, sizeof(uint32), VET_UInt, EVertexStreamUsage::Instancing), 13));
            PrimitiveIdStreamIndex[Index] = Elements.Last().StreamIndex;
        }
    }

    // Tangent and normal. Only these two need vertex streams; the binormal is generated in the shader.
    uint8 TangentBasisAttributes[2] = { 1, 2 };
    for(int32 AxisIndex = 0;AxisIndex < 2;AxisIndex++)
    {
        if(Data.TangentBasisComponents[AxisIndex].VertexBuffer != NULL)
        {
            Elements.Add(AccessStreamComponent(Data.TangentBasisComponents[AxisIndex],TangentBasisAttributes[AxisIndex]));
        }
    }

    if (Data.ColorComponentsSRV == nullptr)
    {
        Data.ColorComponentsSRV = GNullColorVertexBuffer.VertexBufferSRV;
        Data.ColorIndexMask = 0;
    }

    // Vertex color.
    ColorStreamIndex = -1;
    if(Data.ColorComponent.VertexBuffer)
    {
        Elements.Add(AccessStreamComponent(Data.ColorComponent,3));
        ColorStreamIndex = Elements.Last().StreamIndex;
    }
    else
    {
        FVertexStreamComponent NullColorComponent(&GNullColorVertexBuffer, 0, 0, VET_Color, EVertexStreamUsage::ManualFetch);
        Elements.Add(AccessStreamComponent(NullColorComponent, 3));
        ColorStreamIndex = Elements.Last().StreamIndex;
    }

    // Texture coordinates.
    if(Data.TextureCoordinates.Num())
    {
        const int32 BaseTexCoordAttribute = 4;
        for(int32 CoordinateIndex = 0;CoordinateIndex < Data.TextureCoordinates.Num();CoordinateIndex++)
        {
            Elements.Add(AccessStreamComponent(
                Data.TextureCoordinates[CoordinateIndex],
                BaseTexCoordAttribute + CoordinateIndex
                ));
        }

        for (int32 CoordinateIndex = Data.TextureCoordinates.Num(); CoordinateIndex < MAX_STATIC_TEXCOORDS / 2; CoordinateIndex++)
        {
            Elements.Add(AccessStreamComponent(
                Data.TextureCoordinates[Data.TextureCoordinates.Num() - 1],
                BaseTexCoordAttribute + CoordinateIndex
                ));
        }
    }

    // Lightmap coordinates.
    if(Data.LightMapCoordinateComponent.VertexBuffer)
    {
        Elements.Add(AccessStreamComponent(Data.LightMapCoordinateComponent,15));
    }
    else if(Data.TextureCoordinates.Num())
    {
        Elements.Add(AccessStreamComponent(Data.TextureCoordinates[0],15));
    }

    // Initialize the vertex declaration.
    InitDeclaration(Elements);

    const int32 DefaultBaseVertexIndex = 0;
    const int32 DefaultPreSkinBaseVertexIndex = 0;
    if (RHISupportsManualVertexFetch(GMaxRHIShaderPlatform) || bCanUseGPUScene)
    {
        SCOPED_LOADTIMER(FLocalVertexFactory_InitRHI_CreateLocalVFUniformBuffer);
        UniformBuffer = CreateLocalVFUniformBuffer(this, Data.LODLightmapDataIndex, nullptr, DefaultBaseVertexIndex, DefaultPreSkinBaseVertexIndex);
    }
}

// Implement the parameter type of FLocalVertexFactory.
IMPLEMENT_VERTEX_FACTORY_PARAMETER_TYPE(FLocalVertexFactory, SF_Vertex, FLocalVertexFactoryShaderParameters);

// Implement FLocalVertexFactory.
IMPLEMENT_VERTEX_FACTORY_TYPE_EX(FLocalVertexFactory,"/Engine/Private/LocalVertexFactory.ush",true,true,true,true,true,true,true);
           

Next, let's look at how the CableComponent-related types use FLocalVertexFactory:

// Engine\Plugins\Runtime\CableComponent\Source\CableComponent\Private\CableComponent.cpp

class FCableSceneProxy final : public FPrimitiveSceneProxy
{
public:
    FCableSceneProxy(UCableComponent* Component)
        : FPrimitiveSceneProxy(Component)
        , Material(NULL)
        // Construct the vertex factory.
        , VertexFactory(GetScene().GetFeatureLevel(), "FCableSceneProxy")
        (......)
    {
        // Initialize the buffers with the vertex factory.
        VertexBuffers.InitWithDummyData(&VertexFactory, GetRequiredVertexCount());
        (......)
    }

    virtual ~FCableSceneProxy()
    {
        // Release the vertex factory.
        VertexFactory.ReleaseResource();
        (......)
    }

    // Builds the cable mesh.
    void BuildCableMesh(const TArray<FVector>& InPoints, TArray<FDynamicMeshVertex>& OutVertices, TArray<int32>& OutIndices)
    {
        (......)
    }

    // Sets the dynamic data (called on the render thread).
    void SetDynamicData_RenderThread(FCableDynamicData* NewDynamicData)
    {
        // Release the old data.
        if(DynamicData)
        {
            delete DynamicData;
            DynamicData = NULL;
        }
        DynamicData = NewDynamicData;

        // Build vertices from the cable points.
        TArray<FDynamicMeshVertex> Vertices;
        TArray<int32> Indices;
        BuildCableMesh(NewDynamicData->CablePoints, Vertices, Indices);

        // Fill the vertex buffer data.
        for (int i = 0; i < Vertices.Num(); i++)
        {
            const FDynamicMeshVertex& Vertex = Vertices[i];

            VertexBuffers.PositionVertexBuffer.VertexPosition(i) = Vertex.Position;
            VertexBuffers.StaticMeshVertexBuffer.SetVertexTangents(i, Vertex.TangentX.ToFVector(), Vertex.GetTangentY(), Vertex.TangentZ.ToFVector());
            VertexBuffers.StaticMeshVertexBuffer.SetVertexUV(i, 0, Vertex.TextureCoordinate[0]);
            VertexBuffers.ColorVertexBuffer.VertexColor(i) = Vertex.Color;
        }

        // Upload the vertex buffer data to the RHI.
        {
            auto& VertexBuffer = VertexBuffers.PositionVertexBuffer;
            void* VertexBufferData = RHILockVertexBuffer(VertexBuffer.VertexBufferRHI, 0, VertexBuffer.GetNumVertices() * VertexBuffer.GetStride(), RLM_WriteOnly);
            FMemory::Memcpy(VertexBufferData, VertexBuffer.GetVertexData(), VertexBuffer.GetNumVertices() * VertexBuffer.GetStride());
            RHIUnlockVertexBuffer(VertexBuffer.VertexBufferRHI);
        }

        (......)
    }

    virtual void GetDynamicMeshElements(const TArray<const FSceneView*>& Views, const FSceneViewFamily& ViewFamily, uint32 VisibilityMap, FMeshElementCollector& Collector) const override
    {
        (......)

        for (int32 ViewIndex = 0; ViewIndex < Views.Num(); ViewIndex++)
        {
            if (VisibilityMap & (1 << ViewIndex))
            {
                const FSceneView* View = Views[ViewIndex];
                
                // Allocate an FMeshBatch instance.
                FMeshBatch& Mesh = Collector.AllocateMesh();
                // Pass the vertex factory instance to the FMeshBatch instance.
                Mesh.VertexFactory = &VertexFactory;
                
                (......)
                
                Collector.AddMesh(ViewIndex, Mesh);
            }
        }
    }

    (......)

private:
    // Material.
    UMaterialInterface* Material;
    // Vertex and index buffers.
    FStaticMeshVertexBuffers VertexBuffers;
    FCableIndexBuffer IndexBuffer;
    // Vertex factory.
    FLocalVertexFactory VertexFactory;
    // Dynamic data.
    FCableDynamicData* DynamicData;

    (......)
};
           

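As the code above shows, using an existing vertex factory is not complicated. The main steps are initializing it, assigning its data, and passing it to the FMeshBatch instance.

However, whether you use an existing vertex factory or a custom one, the order, types, component counts and slots of its vertex declaration must stay consistent with FVertexFactoryInput on the HLSL side. For example, the vertex declaration order in FLocalVertexFactory::InitRHI is position, tangents, color, texture coordinates and lightmap. With that in mind, let's open the HLSL file that corresponds to FLocalVertexFactory (specified by macros such as IMPLEMENT_VERTEX_FACTORY_TYPE):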

// Engine\Shaders\Private\LocalVertexFactory.ush

// Input struct that corresponds to the local vertex factory.
struct FVertexFactoryInput
{
    // Position.
    float4    Position    : ATTRIBUTE0;

    // Tangents and color.
#if !MANUAL_VERTEX_FETCH
    #if METAL_PROFILE
        float3    TangentX    : ATTRIBUTE1;
        // TangentZ.w contains sign of tangent basis determinant
        float4    TangentZ    : ATTRIBUTE2;

        float4    Color        : ATTRIBUTE3;
    #else
        half3    TangentX    : ATTRIBUTE1;
        // TangentZ.w contains sign of tangent basis determinant
        half4    TangentZ    : ATTRIBUTE2;

        half4    Color        : ATTRIBUTE3;
    #endif
#endif

    // Texture coordinates.
#if NUM_MATERIAL_TEXCOORDS_VERTEX
    #if !MANUAL_VERTEX_FETCH
        #if GPUSKIN_PASS_THROUGH
            // These must match GPUSkinVertexFactory.usf
            float2    TexCoords[NUM_MATERIAL_TEXCOORDS_VERTEX] : ATTRIBUTE4;
            #if NUM_MATERIAL_TEXCOORDS_VERTEX > 4
                #error Too many texture coordinate sets defined on GPUSkin vertex input. Max: 4.
            #endif
        #else
            #if NUM_MATERIAL_TEXCOORDS_VERTEX > 1
                float4    PackedTexCoords4[NUM_MATERIAL_TEXCOORDS_VERTEX/2] : ATTRIBUTE4;
            #endif
            #if NUM_MATERIAL_TEXCOORDS_VERTEX == 1
                float2    PackedTexCoords2 : ATTRIBUTE4;
            #elif NUM_MATERIAL_TEXCOORDS_VERTEX == 3
                float2    PackedTexCoords2 : ATTRIBUTE5;
            #elif NUM_MATERIAL_TEXCOORDS_VERTEX == 5
                float2    PackedTexCoords2 : ATTRIBUTE6;
            #elif NUM_MATERIAL_TEXCOORDS_VERTEX == 7
                float2    PackedTexCoords2 : ATTRIBUTE7;
            #endif
        #endif
    #endif
#elif USE_PARTICLE_SUBUVS && !MANUAL_VERTEX_FETCH
    float2    TexCoords[1] : ATTRIBUTE4;
#endif

    (......)
};
           

From this we can see that the data order of the FVertexFactoryInput struct corresponds one-to-one with the vertex declaration of FLocalVertexFactory.
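
This correspondence can be checked mechanically. The sketch below is a hypothetical helper, not part of UE: `FindMissingAttributes` and its inputs are invented for illustration. It takes the attribute indices passed to `AccessStreamComponent` on the C++ side and the `ATTRIBUTEn` indices read by the shader input struct, and reports the slots the shader expects but the declaration never supplies.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Returns every attribute index that the shader input struct reads
// (n in ATTRIBUTEn) but that the C++ vertex declaration does not declare.
// Both arguments are plain lists of attribute indices (hypothetical inputs).
std::vector<uint8_t> FindMissingAttributes(
    const std::vector<uint8_t>& DeclaredAttributes,
    const std::vector<uint8_t>& ShaderInputAttributes)
{
    std::vector<uint8_t> Missing;
    for (uint8_t Attribute : ShaderInputAttributes)
    {
        const bool bDeclared =
            std::find(DeclaredAttributes.begin(), DeclaredAttributes.end(), Attribute)
            != DeclaredAttributes.end();
        if (!bDeclared)
        {
            Missing.push_back(Attribute);
        }
    }
    return Missing;
}
```

For the listings above, the declared indices would be 0 (position), 1 and 2 (tangent basis), 3 (color) and 4 onward (texture coordinates), matching `ATTRIBUTE0` through `ATTRIBUTE4` and beyond in `FVertexFactoryInput`. The helper only compares slots; element types and component counts would need a separate check.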

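UE's shader code adopts an Uber Shader design, which requires adding many macros to the same shader source file to separate the branches for different passes, features, Feature Levels and quality levels. On the C++ side, to make it easy to extend these macro definitions and to set whether they are enabled and which values they take, UE uses the concept of Shader Permutation.

Each permutation has a unique hash key. The values of the permutation are filled into the HLSL code, which is then compiled into the corresponding shader code. The definitions of the core shader permutation types are analyzed below: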

// Engine\Source\Runtime\RenderCore\Public\ShaderPermutation.h

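// Bool shader permutation.
struct FShaderPermutationBool
{
    using Type = bool;

    // Number of values in this dimension.
    static constexpr int32 PermutationCount = 2;
    // Whether this is a multi-dimensional permutation.
    static constexpr bool IsMultiDimensional = false;
    // Converts the bool to an int value id.
    static int32 ToDimensionValueId(Type E)
    {
        return E ? 1 : 0;
    }
    // Converts to the value used for the define.
    static bool ToDefineValue(Type E)
    {
        return E;
    }
    // Converts a permutation id back to a bool.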
    static Type FromDimensionValueId(int32 PermutationId)
    {
        checkf(PermutationId == 0 || PermutationId == 1, TEXT("Invalid shader permutation dimension id %i."), PermutationId);
        return PermutationId == 1;
    }
};

// Integer shader permutation.
template <typename TType, int32 TDimensionSize, int32 TFirstValue=0>
struct TShaderPermutationInt
{
    using Type = TType;
    static constexpr int32 PermutationCount = TDimensionSize;
    static constexpr bool IsMultiDimensional = false;
    
    // Minimum and maximum values.
    static constexpr Type MinValue = static_cast<Type>(TFirstValue);
    static constexpr Type MaxValue = static_cast<Type>(TFirstValue + TDimensionSize - 1);

    static int32 ToDimensionValueId(Type E);
    static int32 ToDefineValue(Type E);
    static Type FromDimensionValueId(int32 PermutationId);
};

// Sparse integer shader permutation over a variadic list of values.
template <int32... Ts>
struct TShaderPermutationSparseInt
{
    using Type = int32;
    static constexpr int32 PermutationCount = 0;
    static constexpr bool IsMultiDimensional = false;

    static int32 ToDimensionValueId(Type E);
    static Type FromDimensionValueId(int32 PermutationId);
};

// Shader permutation domain with a variable number of dimensions.
template <typename... Ts>
struct TShaderPermutationDomain
{
    using Type = TShaderPermutationDomain<Ts...>;

    static constexpr bool IsMultiDimensional = true;
    static constexpr int32 PermutationCount = 1;

    // Constructors.
    TShaderPermutationDomain<Ts...>() {}
    explicit TShaderPermutationDomain<Ts...>(int32 PermutationId)
    {
        checkf(PermutationId == 0, TEXT("Invalid shader permutation id %i."), PermutationId);
    }

    // Sets the value of a dimension.
    template<class DimensionToSet>
    void Set(typename DimensionToSet::Type)
    {
        static_assert(sizeof(typename DimensionToSet::Type) == 0, "Unknown shader permutation dimension.");
    }
    // Gets the value of a dimension.
    template<class DimensionToGet>
    const typename DimensionToGet::Type Get() const
    {
        static_assert(sizeof(typename DimensionToGet::Type) == 0, "Unknown shader permutation dimension.");
        return DimensionToGet::Type();
    }

    // Modifies the compilation environment.
    void ModifyCompilationEnvironment(FShaderCompilerEnvironment& OutEnvironment) const {}

    // Data conversion.
    static int32 ToDimensionValueId(const Type& PermutationVector)
    {
        return 0;
    }
    int32 ToDimensionValueId() const
    {
        return ToDimensionValueId(*this);
    }
    static Type FromDimensionValueId(const int32 PermutationId)
    {
        return Type(PermutationId);
    }

    bool operator==(const Type& Other) const
    {
        return true;
    }
};


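// The macros below make it easy to declare and set shader permutations when writing the C++ code of a shader.

// Declares a bool shader permutation with the given name.
#define SHADER_PERMUTATION_BOOL(InDefineName)
// Declares an int shader permutation with the given name.
#define SHADER_PERMUTATION_INT(InDefineName, Count)
// Declares an int shader permutation with the given name and range.
#define SHADER_PERMUTATION_RANGE_INT(InDefineName, Start, Count)
// Declares a sparse int shader permutation with the given name.
#define SHADER_PERMUTATION_SPARSE_INT(InDefineName,...)
// Declares an enum class shader permutation with the given name.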
#define SHADER_PERMUTATION_ENUM_CLASS(InDefineName, EnumName)
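
The listing shows only the empty permutation domain (one permutation, id 0); the recursive specialization that combines several dimensions is elided. A plausible way to picture the combination is a mixed-radix number, where each dimension contributes one digit whose base is its `PermutationCount`. The sketch below illustrates that idea under stated assumptions: the function names are hypothetical, and treating the first dimension as the least significant digit is an assumption rather than something the listing confirms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Encodes one value id per dimension into a single permutation id.
// Counts[i] plays the role of PermutationCount for dimension i;
// dimension 0 is treated as the least significant digit (an assumption).
int32_t EncodePermutationId(const std::vector<int32_t>& Counts,
                            const std::vector<int32_t>& ValueIds)
{
    assert(Counts.size() == ValueIds.size());
    int32_t Id = 0;
    int32_t Multiplier = 1;
    for (std::size_t i = 0; i < Counts.size(); ++i)
    {
        assert(ValueIds[i] >= 0 && ValueIds[i] < Counts[i]);
        Id += ValueIds[i] * Multiplier;
        Multiplier *= Counts[i];
    }
    return Id;
}

// Splits a permutation id back into one value id per dimension.
std::vector<int32_t> DecodePermutationId(const std::vector<int32_t>& Counts, int32_t Id)
{
    std::vector<int32_t> ValueIds;
    ValueIds.reserve(Counts.size());
    for (int32_t Count : Counts)
    {
        ValueIds.push_back(Id % Count);
        Id /= Count;
    }
    return ValueIds;
}

// Total number of permutations is the product of all dimension sizes.
int32_t TotalPermutationCount(const std::vector<int32_t>& Counts)
{
    int32_t Total = 1;  // an empty domain has exactly one permutation
    for (int32_t Count : Counts)
    {
        Total *= Count;
    }
    return Total;
}
```

With dimension sizes {3, 2, 2}, for instance one three-valued dimension and two bool dimensions, there are 12 permutations in total, and the value ids {1, 0, 1} map to id 1 + 3 * (0 + 2 * 1) = 7. The product grows quickly as dimensions are added, which is why skipping meaningless combinations at compile time matters.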
           

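Do the templates and macros above look a bit bewildering, leaving you unsure what they are for? No problem. Looking at how FDeferredLightPS uses them shows that shader permutations are actually quite simple:

// Pixel shader for deferred lights.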
class FDeferredLightPS : public FGlobalShader
{
    DECLARE_SHADER_TYPE(FDeferredLightPS, Global)

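    // Declare the shader permutation of each dimension. Note that inheritance is used, and the parent class is the type defined by SHADER_PERMUTATION_xxx.
    // Note that the names passed to the parent (such as LIGHT_SOURCE_SHAPE, USE_SOURCE_TEXTURE, USE_IES_PROFILE, ...) are the macro names used in the HLSL code.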
    class FSourceShapeDim        : SHADER_PERMUTATION_ENUM_CLASS("LIGHT_SOURCE_SHAPE", ELightSourceShape);
    class FSourceTextureDim        : SHADER_PERMUTATION_BOOL("USE_SOURCE_TEXTURE");
    class FIESProfileDim        : SHADER_PERMUTATION_BOOL("USE_IES_PROFILE");
    class FInverseSquaredDim    : SHADER_PERMUTATION_BOOL("INVERSE_SQUARED_FALLOFF");
    class FVisualizeCullingDim    : SHADER_PERMUTATION_BOOL("VISUALIZE_LIGHT_CULLING");
    class FLightingChannelsDim    : SHADER_PERMUTATION_BOOL("USE_LIGHTING_CHANNELS");
    class FTransmissionDim        : SHADER_PERMUTATION_BOOL("USE_TRANSMISSION");
    class FHairLighting            : SHADER_PERMUTATION_INT("USE_HAIR_LIGHTING", 2);
    class FAtmosphereTransmittance : SHADER_PERMUTATION_BOOL("USE_ATMOSPHERE_TRANSMITTANCE");
    class FCloudTransmittance     : SHADER_PERMUTATION_BOOL("USE_CLOUD_TRANSMITTANCE");
    class FAnistropicMaterials     : SHADER_PERMUTATION_BOOL("SUPPORTS_ANISOTROPIC_MATERIALS");

    // Declare the permutation domain, which contains all the dimensions defined above.
    using FPermutationDomain = TShaderPermutationDomain<
        FSourceShapeDim,
        FSourceTextureDim,
        FIESProfileDim,
        FInverseSquaredDim,
        FVisualizeCullingDim,
        FLightingChannelsDim,
        FTransmissionDim,
        FHairLighting,
        FAtmosphereTransmittance,
        FCloudTransmittance,
        FAnistropicMaterials>;

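    // Whether the given shader permutation needs to be compiled.
    static bool ShouldCompilePermutation(const FGlobalShaderPermutationParameters& Parameters)
    {
        // Get the permutation values.
        FPermutationDomain PermutationVector(Parameters.PermutationId);

        // For a directional light, IES profiles and inverse-squared falloff are meaningless, so skip compiling them.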
        if( PermutationVector.Get< FSourceShapeDim >() == ELightSourceShape::Directional && (
            PermutationVector.Get< FIESProfileDim >() ||
            PermutationVector.Get< FInverseSquaredDim >() ) )
        {
            return false;
        }

        // For non-directional lights, atmosphere and cloud transmittance are meaningless, so skip compiling them.
        if (PermutationVector.Get< FSourceShapeDim >() != ELightSourceShape::Directional && (PermutationVector.Get<FAtmosphereTransmittance>() || PermutationVector.Get<FCloudTransmittance>()))
        {
            return false;
        }

        (......)

        return IsFeatureLevelSupported(Parameters.Platform, ERHIFeatureLevel::SM5);
    }

    (......)
};

// Render a light.
void FDeferredShadingSceneRenderer::RenderLight(FRHICommandList& RHICmdList, ...)
{
    (......)

    for (int32 ViewIndex = 0; ViewIndex < Views.Num(); ViewIndex++)
    {
        FViewInfo& View = Views[ViewIndex];
        
        (......)
        
        if (LightSceneInfo->Proxy->GetLightType() == LightType_Directional)
        {
            (......)

            // Declare an instance of FDeferredLightPS's permutation domain.
            FDeferredLightPS::FPermutationDomain PermutationVector;

            // Fill in the permutation values according to the render state.
            PermutationVector.Set< FDeferredLightPS::FSourceShapeDim >( ELightSourceShape::Directional );
            PermutationVector.Set< FDeferredLightPS::FIESProfileDim >( false );
            PermutationVector.Set< FDeferredLightPS::FInverseSquaredDim >( false );
            PermutationVector.Set< FDeferredLightPS::FVisualizeCullingDim >( View.Family->EngineShowFlags.VisualizeLightCulling );
            PermutationVector.Set< FDeferredLightPS::FLightingChannelsDim >( View.bUsesLightingChannels );
            PermutationVector.Set< FDeferredLightPS::FAnistropicMaterials >(ShouldRenderAnisotropyPass());
            PermutationVector.Set< FDeferredLightPS::FTransmissionDim >( bTransmission );
            PermutationVector.Set< FDeferredLightPS::FHairLighting>(0);
            PermutationVector.Set< FDeferredLightPS::FAtmosphereTransmittance >(bAtmospherePerPixelTransmittance);
            PermutationVector.Set< FDeferredLightPS::FCloudTransmittance >(bLight0CloudPerPixelTransmittance || bLight1CloudPerPixelTransmittance);

            // Use the filled permutation to fetch the matching PS instance from the view's ShaderMap.
            TShaderMapRef< FDeferredLightPS > PixelShader( View.ShaderMap, PermutationVector );

            // Fill in the bound shader state of the PSO.
            GraphicsPSOInit.BoundShaderState.VertexDeclarationRHI = GFilterVertexDeclaration.VertexDeclarationRHI;
            GraphicsPSOInit.BoundShaderState.VertexShaderRHI = VertexShader.GetVertexShader();
            GraphicsPSOInit.BoundShaderState.PixelShaderRHI = PixelShader.GetPixelShader();

            SetGraphicsPipelineState(RHICmdList, GraphicsPSOInit);
            PixelShader->SetParameters(RHICmdList, View, LightSceneInfo, ScreenShadowMaskTexture, LightingChannelsTexture, &RenderLightParams);
            
             (......)
        }
            
    (......)
}
           

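As shown above, a shader permutation is essentially a key made up of a variable number of dimensions. During shader compilation, the compiler tries to generate shader code for every distinct permutation, although ShouldCompilePermutation can exclude permutations that are meaningless. All precompiled shaders are stored in the view's ShaderMap. The value of each dimension can be decided at run time, and the resulting permutation domain is then used to look up the matching compiled shader in the view's ShaderMap, after which the shader's data is set up and rendering proceeds.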
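To make the "multi-dimensional key" idea concrete, here is a minimal sketch of how per-dimension values could be flattened into one permutation id and recovered again. It is not engine code: `PermutationDomainSketch` and its members are hypothetical names, and it assumes a plain mixed-radix encoding with one count per dimension.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified stand-in for a permutation domain (not engine code).
// Each dimension has a fixed number of possible values, and a full permutation
// is flattened into a single integer id using a mixed-radix encoding.
template <std::size_t N>
struct PermutationDomainSketch
{
    std::array<int32_t, N> Counts{};  // number of possible values per dimension
    std::array<int32_t, N> Values{};  // currently selected value per dimension

    // Total number of permutations is the product of all dimension counts.
    int32_t PermutationCount() const
    {
        int32_t Total = 1;
        for (int32_t Count : Counts)
        {
            Total *= Count;
        }
        return Total;
    }

    // Flatten the per-dimension values into one id.
    int32_t ToId() const
    {
        int32_t Id = 0;
        int32_t Stride = 1;
        for (std::size_t i = 0; i < N; ++i)
        {
            assert(Values[i] >= 0 && Values[i] < Counts[i]);
            Id += Values[i] * Stride;
            Stride *= Counts[i];
        }
        return Id;
    }

    // Rebuild the per-dimension values from an id.
    static PermutationDomainSketch FromId(const std::array<int32_t, N>& InCounts, int32_t Id)
    {
        PermutationDomainSketch Result;
        Result.Counts = InCounts;
        for (std::size_t i = 0; i < N; ++i)
        {
            Result.Values[i] = Id % InCounts[i];
            Id /= InCounts[i];
        }
        return Result;
    }
};
```

With three dimensions of sizes 4, 2 and 2 there are 16 possible ids, which is also why adding a dimension multiplies the number of shaders to compile and why filtering in ShouldCompilePermutation matters.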

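It is also worth noting that the name passed to each permutation dimension's base class (LIGHT_SOURCE_SHAPE, USE_SOURCE_TEXTURE, USE_IES_PROFILE, ...) is the macro name in the HLSL code. For example, FSourceShapeDim controls LIGHT_SOURCE_SHAPE in HLSL: depending on its value, different code fragments are selected, which yields different versions and branches of the shader code.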
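The link between a dimension and an HLSL macro can be pictured as writing `#define` entries into the compilation environment before the source is preprocessed. The sketch below only illustrates that idea: `DefineMapSketch` and the helper functions are hypothetical and are not the engine's FShaderCompilerEnvironment API.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a compiler environment: macro name -> value.
using DefineMapSketch = std::map<std::string, std::string>;

// A bool dimension writes "0" or "1" under its macro name.
inline void SetBoolDefine(DefineMapSketch& Env, const std::string& Name, bool bValue)
{
    assert(!Name.empty());
    Env[Name] = bValue ? "1" : "0";
}

// An int or enum dimension writes its numeric value.
inline void SetIntDefine(DefineMapSketch& Env, const std::string& Name, int Value)
{
    assert(!Name.empty());
    Env[Name] = std::to_string(Value);
}

// Render the defines as the text that would be prepended to the shader source.
inline std::string BuildDefinePrefix(const DefineMapSketch& Env)
{
    std::string Out;
    for (const auto& KeyValue : Env)
    {
        Out += "#define " + KeyValue.first + " " + KeyValue.second + "\n";
    }
    return Out;
}
```

In the shader source, a block such as `#if USE_IES_PROFILE ... #endif` is then kept or dropped per permutation, so each permutation compiles to its own specialized variant.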

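This chapter analyzes part of the underlying shader machinery, such as how Shader Maps are stored and how shaders are compiled and cached.

A ShaderMap stores compiled shader code. There are three kinds: FGlobalShaderMap, FMaterialShaderMap, and FMeshMaterialShaderMap.

This subsection first covers the base types and concepts related to Shader Maps: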

// Engine\Source\Runtime\Core\Public\Serialization\MemoryImage.h

// Base class of pointer tables.
class FPointerTableBase
{
public:
    virtual ~FPointerTableBase() {}
    virtual int32 AddIndexedPointer(const FTypeLayoutDesc& TypeDesc, void* Ptr) = 0;
    virtual void* GetIndexedPointer(const FTypeLayoutDesc& TypeDesc, uint32 i) const = 0;
};

// Engine\Source\Runtime\RenderCore\Public\Shader.h

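// A class used to serialize, deserialize, compile, and cache one particular shader class. A single FShaderType can manage multiple FShader instances across several dimensions, such as EShaderPlatform or permutation id. The number of permutations of an FShaderType is simply given by GetPermutationCount().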
class FShaderType
{
public:
    // Shader kinds: global, material, mesh material, Niagara, and so on.
    enum class EShaderTypeForDynamicCast : uint32
    {
        Global,
        Material,
        MeshMaterial,
        Niagara,
        OCIO,
        NumShaderTypes,
    };

    (......)

    // Static data accessors.
    static TLinkedList<FShaderType*>*& GetTypeList();
    static FShaderType* GetShaderTypeByName(const TCHAR* Name);
    static TArray<const FShaderType*> GetShaderTypesByFilename(const TCHAR* Filename);
    static TMap<FHashedName, FShaderType*>& GetNameToTypeMap();
    static const TArray<FShaderType*>& GetSortedTypes(EShaderTypeForDynamicCast Type);
    
    static void Initialize(const TMap<FString, TArray<const TCHAR*> >& ShaderFileToUniformBufferVariables);
    static void Uninitialize();

    // Constructor and destructor.
    FShaderType(...);
    virtual ~FShaderType();

    FShader* ConstructForDeserialization() const;
    FShader* ConstructCompiled(const FShader::CompiledShaderInitializerType& Initializer) const;

    bool ShouldCompilePermutation(...) const;
    void ModifyCompilationEnvironment(..) const;
    bool ValidateCompiledResult(...) const;

    // Computes a hash from the shader type's source code and its includes.
    const FSHAHash& GetSourceHash(EShaderPlatform ShaderPlatform) const;
    // Gets the hash of an FShaderType pointer.
    friend uint32 GetTypeHash(FShaderType* Ref);

    // Accessors.
    (......)

    void AddReferencedUniformBufferIncludes(FShaderCompilerEnvironment& OutEnvironment, FString& OutSourceFilePrefix, EShaderPlatform Platform);
    void FlushShaderFileCache(const TMap<FString, TArray<const TCHAR*> >& ShaderFileToUniformBufferVariables);
    void GetShaderStableKeyParts(struct FStableShaderKeyAndValue& SaveKeyVal);

private:
    EShaderTypeForDynamicCast ShaderTypeForDynamicCast;
    const FTypeLayoutDesc* TypeLayout;
    // Name.
    const TCHAR* Name;
    // Type name.
    FName TypeName;
    // Hashed name.
    FHashedName HashedName;
    // Hashed source file name.
    FHashedName HashedSourceFilename;
    // Source file name.
    const TCHAR* SourceFilename;
    // Entry point name.
    const TCHAR* FunctionName;
    // Shader frequency.
    uint32 Frequency;
    uint32 TypeSize;
    // Number of permutations.
    int32 TotalPermutationCount;

    (......)

    // Link in the global list.
    TLinkedList<FShaderType*> GlobalListLink;

protected:
    bool bCachedUniformBufferStructDeclarations;
    // Cache of the referenced uniform buffer includes.
    TMap<const TCHAR*, FCachedUniformBufferDeclaration> ReferencedUniformBufferStructsCache;
};

// Pointer table for shader maps.
class FShaderMapPointerTable : public FPointerTableBase
{
public:
    virtual int32 AddIndexedPointer(const FTypeLayoutDesc& TypeDesc, void* Ptr) override;
    virtual void* GetIndexedPointer(const FTypeLayoutDesc& TypeDesc, uint32 i) const override;

    virtual void SaveToArchive(FArchive& Ar, void* FrozenContent, bool bInlineShaderResources) const;
    virtual void LoadFromArchive(FArchive& Ar, void* FrozenContent, bool bInlineShaderResources, bool bLoadedByCookedMaterial);

    // Shader types.
    TPtrTable<FShaderType> ShaderTypes;
    // Vertex factory types.
    TPtrTable<FVertexFactoryType> VFTypes;
};

// A shader pipeline instance that holds compile-time state.
class FShaderPipeline
{
public:
    explicit FShaderPipeline(const FShaderPipelineType* InType);
    ~FShaderPipeline();

    // Add a shader.
    void AddShader(FShader* Shader, int32 PermutationId);
    // Get the number of shaders.
    inline uint32 GetNumShaders() const;

    // Look up shaders.
    template<typename ShaderType>
    ShaderType* GetShader(const FShaderMapPointerTable& InPtrTable);
    FShader* GetShader(EShaderFrequency Frequency);
    const FShader* GetShader(EShaderFrequency Frequency) const;
    inline TArray<TShaderRef<FShader>> GetShaders(const FShaderMapBase& InShaderMap) const;

    // Validation.
    void Validate(const FShaderPipelineType* InPipelineType) const;
    // Process the compiled shader code.
    void Finalize(const FShaderMapResourceCode* Code);
    
    (......)

    enum EFilter
    {
        EAll,            // All pipelines
        EOnlyShared,    // Only pipelines with shared shaders
        EOnlyUnique,    // Only pipelines with unique shaders
    };

    // Hashed type name.
    LAYOUT_FIELD(FHashedName, TypeName);
    // FShader instances for every shader frequency.
    LAYOUT_ARRAY(TMemoryImagePtr<FShader>, Shaders, SF_NumGraphicsFrequencies);
    // Permutation ids.
    LAYOUT_ARRAY(int32, PermutationIds, SF_NumGraphicsFrequencies);
};

// Content of a shader map.
class FShaderMapContent
{
public:
    struct FProjectShaderPipelineToKey
    {
        inline FHashedName operator()(const FShaderPipeline* InShaderPipeline) 
        { return InShaderPipeline->TypeName; }
    };

    explicit FShaderMapContent(EShaderPlatform InPlatform);
    ~FShaderMapContent();

    EShaderPlatform GetShaderPlatform() const;

    // Validation.
    void Validate(const FShaderMapBase& InShaderMap);

    // Look up shaders.
    template<typename ShaderType>
    ShaderType* GetShader(int32 PermutationId = 0) const;
    template<typename ShaderType>
    ShaderType* GetShader( const typename ShaderType::FPermutationDomain& PermutationVector ) const;
    FShader* GetShader(FShaderType* ShaderType, int32 PermutationId = 0) const;
    FShader* GetShader(const FHashedName& TypeName, int32 PermutationId = 0) const;

    // Check whether the given shader exists.
    bool HasShader(const FHashedName& TypeName, int32 PermutationId) const;
    bool HasShader(const FShaderType* Type, int32 PermutationId) const;

    inline TArrayView<const TMemoryImagePtr<FShader>> GetShaders() const;
    inline TArrayView<const TMemoryImagePtr<FShaderPipeline>> GetShaderPipelines() const;

    // Add or find shaders and pipelines.
    void AddShader(const FHashedName& TypeName, int32 PermutationId, FShader* Shader);
    FShader* FindOrAddShader(const FHashedName& TypeName, int32 PermutationId, FShader* Shader);
    void AddShaderPipeline(FShaderPipeline* Pipeline);
    FShaderPipeline* FindOrAddShaderPipeline(FShaderPipeline* Pipeline);

    // Removal interfaces.
    void RemoveShaderTypePermutaion(const FHashedName& TypeName, int32 PermutationId);
    inline void RemoveShaderTypePermutaion(const FShaderType* Type, int32 PermutationId);
    void RemoveShaderPipelineType(const FShaderPipelineType* ShaderPipelineType);

    // Get the shader list.
    void GetShaderList(const FShaderMapBase& InShaderMap, const FSHAHash& InMaterialShaderMapHash, TMap<FShaderId, TShaderRef<FShader>>& OutShaders) const;
    void GetShaderList(const FShaderMapBase& InShaderMap, TMap<FHashedName, TShaderRef<FShader>>& OutShaders) const;

    // Get the shader pipeline list.
    void GetShaderPipelineList(const FShaderMapBase& InShaderMap, TArray<FShaderPipelineRef>& OutShaderPipelines, FShaderPipeline::EFilter Filter) const;

    (.......)

    // Get the maximum instruction count among the shaders of the given type.
    uint32 GetMaxNumInstructionsForShader(const FShaderMapBase& InShaderMap, FShaderType* ShaderType) const;
    // Store the compiled shader code.
    void Finalize(const FShaderMapResourceCode* Code);
    // Update the hash.
    void UpdateHash(FSHA1& Hasher) const;

protected:
    using FMemoryImageHashTable = THashTable<FMemoryImageAllocator>;

    // Shader hash table.
    LAYOUT_FIELD(FMemoryImageHashTable, ShaderHash);
    // Shader types.
    LAYOUT_FIELD(TMemoryImageArray<FHashedName>, ShaderTypes);
    // Shader permutation ids.
    LAYOUT_FIELD(TMemoryImageArray<int32>, ShaderPermutations);
    // Shader instances.
    LAYOUT_FIELD(TMemoryImageArray<TMemoryImagePtr<FShader>>, Shaders);
    // Shader pipelines.
    LAYOUT_FIELD(TMemoryImageArray<TMemoryImagePtr<FShaderPipeline>>, ShaderPipelines);
    // Platform the shaders were compiled for.
    LAYOUT_FIELD(TEnumAsByte<EShaderPlatform>, Platform);
};

// Base class of shader maps.
class FShaderMapBase
{
public:
    (......)

private:
    const FTypeLayoutDesc& ContentTypeLayout;
    // ShaderMap resource.
    TRefCountPtr<FShaderMapResource> Resource;
    // ShaderMap resource code.
    TRefCountPtr<FShaderMapResourceCode> Code;
    // ShaderMap pointer table.
    FShaderMapPointerTable* PointerTable;
    // ShaderMap content.
    FShaderMapContent* Content;
    // Size of the frozen content.
    uint32 FrozenContentSize;
    // Number of frozen shaders.
    uint32 NumFrozenShaders;
};

// Shader map template. The content type (an FShaderMapContent) and the pointer table type (an FShaderMapPointerTable) must be specified.
template<typename ContentType, typename PointerTableType = FShaderMapPointerTable>
class TShaderMap : public FShaderMapBase
{
public:
    inline const PointerTableType& GetPointerTable();
    inline const ContentType* GetContent() const;
    inline ContentType* GetMutableContent();

    void FinalizeContent()
    {
        ContentType* LocalContent = this->GetMutableContent();
        LocalContent->Finalize(this->GetResourceCode());
        FShaderMapBase::FinalizeContent();
    }

protected:
    TShaderMap();
    virtual FShaderMapPointerTable* CreatePointerTable();
};

// Reference to a shader pipeline.
class FShaderPipelineRef
{
public:
    FShaderPipelineRef();
    FShaderPipelineRef(FShaderPipeline* InPipeline, const FShaderMapBase& InShaderMap);

    (......)

    // Get shaders.
    template<typename ShaderType>
    TShaderRef<ShaderType> GetShader() const;
    TShaderRef<FShader> GetShader(EShaderFrequency Frequency) const;
    inline TArray<TShaderRef<FShader>> GetShaders() const;

    // Accessors for the pipeline, resource, and pointer table.
    inline FShaderPipeline* GetPipeline() const;
    FShaderMapResource* GetResource() const;
    const FShaderMapPointerTable& GetPointerTable() const;

    inline FShaderPipeline* operator->() const;

private:
    FShaderPipeline* ShaderPipeline; // Shader pipeline.
    const FShaderMapBase* ShaderMap; // Shader map.
};
           

Many of the types above are base classes; the concrete logic is implemented by their subclasses.
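Before moving on to the subclasses, it may help to see the core idea of FShaderMapContent in isolation: shaders are stored and looked up by the pair (type name, permutation id). The sketch below is a simplified model under that assumption. `ShaderSketch` and `ShaderMapContentSketch` are hypothetical names, and a `std::map` replaces the engine's frozen memory-image hash table.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a compiled shader.
struct ShaderSketch
{
    std::string DebugName;
};

// Simplified model of a shader map content keyed by (type name, permutation id).
class ShaderMapContentSketch
{
public:
    // Returns the existing shader for the key, or stores and returns the new one.
    ShaderSketch* FindOrAddShader(const std::string& TypeName, int32_t PermutationId,
                                  std::unique_ptr<ShaderSketch> Shader)
    {
        assert(Shader != nullptr);
        const auto Key = std::make_pair(TypeName, PermutationId);
        const auto It = Shaders.find(Key);
        if (It != Shaders.end())
        {
            return It->second.get();
        }
        return Shaders.emplace(Key, std::move(Shader)).first->second.get();
    }

    // Returns nullptr when the permutation was never compiled or added.
    ShaderSketch* GetShader(const std::string& TypeName, int32_t PermutationId = 0) const
    {
        const auto It = Shaders.find(std::make_pair(TypeName, PermutationId));
        return It != Shaders.end() ? It->second.get() : nullptr;
    }

    bool HasShader(const std::string& TypeName, int32_t PermutationId) const
    {
        return GetShader(TypeName, PermutationId) != nullptr;
    }

private:
    std::map<std::pair<std::string, int32_t>, std::unique_ptr<ShaderSketch>> Shaders;
};
```

The find-or-add behaviour mirrors the FindOrAddShader declaration listed above: adding the same key twice keeps the first shader and hands back the existing instance.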

FGlobalShaderMap stores and manages all compiled FGlobalShader code. It and its related types are defined as follows:

// Engine\Source\Runtime\RenderCore\Public\GlobalShader.h

// Shader meta type for the simplest shaders (those not linked to a material or vertex factory). Each such simple shader should have only one instance.
class FGlobalShaderType : public FShaderType
{
    friend class FGlobalShaderTypeCompiler;
public:

    typedef FShader::CompiledShaderInitializerType CompiledShaderInitializerType;

    FGlobalShaderType(...);

    bool ShouldCompilePermutation(EShaderPlatform Platform, int32 PermutationId) const;
    void SetupCompileEnvironment(EShaderPlatform Platform, int32 PermutationId, FShaderCompilerEnvironment& Environment);
};

// Content of a global shader map section.
class FGlobalShaderMapContent : public FShaderMapContent
{
    (......)
public:
    const FHashedName& GetHashedSourceFilename();

private:
    inline FGlobalShaderMapContent(EShaderPlatform InPlatform, const FHashedName& InHashedSourceFilename);

    // Hashed source file name.
    LAYOUT_FIELD(FHashedName, HashedSourceFilename);
};

class FGlobalShaderMapSection : public TShaderMap<FGlobalShaderMapContent, FShaderMapPointerTable>
{
    (......)
    
private:
    inline FGlobalShaderMapSection();
    inline FGlobalShaderMapSection(EShaderPlatform InPlatform, const FHashedName& InHashedSourceFilename);

    TShaderRef<FShader> GetShader(FShaderType* ShaderType, int32 PermutationId = 0) const;
    FShaderPipelineRef GetShaderPipeline(const FShaderPipelineType* PipelineType) const;
};

// Global ShaderMap.
class FGlobalShaderMap
{
public:
    explicit FGlobalShaderMap(EShaderPlatform InPlatform);
    ~FGlobalShaderMap();

    // Get the compiled shader by shader type and permutation id.
    TShaderRef<FShader> GetShader(FShaderType* ShaderType, int32 PermutationId = 0) const;
    // Get the compiled shader by permutation id.
    template<typename ShaderType>
    TShaderRef<ShaderType> GetShader(int32 PermutationId = 0) const
    {
        TShaderRef<FShader> Shader = GetShader(&ShaderType::StaticType, PermutationId);
        return TShaderRef<ShaderType>::Cast(Shader);
    }
    // Get the compiled shader by the shader type's permutation vector.
    template<typename ShaderType>
    TShaderRef<ShaderType> GetShader(const typename ShaderType::FPermutationDomain& PermutationVector) const
    {
        return GetShader<ShaderType>(PermutationVector.ToDimensionValueId());
    }
    
    // Check whether the given shader exists.
    bool HasShader(FShaderType* Type, int32 PermutationId) const
    {
        return GetShader(Type, PermutationId).IsValid();
    }
    
    // Get a shader pipeline.
    FShaderPipelineRef GetShaderPipeline(const FShaderPipelineType* PipelineType) const;

    // Whether the shader pipeline exists.
    bool HasShaderPipeline(const FShaderPipelineType* ShaderPipelineType) const
    {
        return GetShaderPipeline(ShaderPipelineType).IsValid();
    }

    bool IsEmpty() const;
    void Empty();
    void ReleaseAllSections();

    // Find or add a shader.
    FShader* FindOrAddShader(const FShaderType* ShaderType, int32 PermutationId, FShader* Shader);
    // Find or add a shader pipeline.
    FShaderPipeline* FindOrAddShaderPipeline(const FShaderPipelineType* ShaderPipelineType, FShaderPipeline* ShaderPipeline);

    // Removal interfaces.
    void RemoveShaderTypePermutaion(const FShaderType* Type, int32 PermutationId);
    void RemoveShaderPipelineType(const FShaderPipelineType* ShaderPipelineType);

    // ShaderMapSection operations.
    void AddSection(FGlobalShaderMapSection* InSection);
    FGlobalShaderMapSection* FindSection(const FHashedName& HashedShaderFilename);
    FGlobalShaderMapSection* FindOrAddSection(const FShaderType* ShaderType);
    
    // IO interfaces.
    void LoadFromGlobalArchive(FArchive& Ar);
    void SaveToGlobalArchive(FArchive& Ar);

    // Begin creating all shaders.
    void BeginCreateAllShaders();

    (......)

private:
    // Map that stores the FGlobalShaderMapSection instances.
    TMap<FHashedName, FGlobalShaderMapSection*> SectionMap;
    EShaderPlatform Platform;
};

// Array of global ShaderMaps, one per shader platform (SP_NumPlatforms is 49 in the engine version discussed here).
extern RENDERCORE_API FGlobalShaderMap* GGlobalShaderMap[SP_NumPlatforms];
           

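The code above involves many types and concepts, including the ShaderMap's Content, Section, PointerTable, and ShaderType. There is a lot of data and the relationships are complex, but they become much clearer once drawn as a UML diagram: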

FShaderType <|-- FGlobalShaderType
FPointerTableBase <|-- FShaderMapPointerTable
FShaderMapContent <|-- FGlobalShaderMapContent
FShaderMapBase <|-- TShaderMap
TShaderMap <|-- FGlobalShaderMapSection
FShaderPipeline <-- FShaderPipelineRef

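To keep it simple, the class diagram above shows only inheritance. With association, aggregation, and composition relationships added, it looks like this: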

FShader --o FShaderPipeline
class FShaderPipeline{
    FShader Shaders[5]
}
FShaderPipeline <-- FShaderMapContent
FShaderType <-- FShaderMapContent
FShader --o FShaderMapContent
class FShaderMapContent{
    FHashedName ShaderTypes
    FShader Shaders
    FShaderPipeline ShaderPipelines
}
FShaderMapContent <-- FShaderMapBase
FShaderMapPointerTable <-- FShaderMapBase
class FShaderMapBase{
    FShaderMapPointerTable* PointerTable
    FShaderMapContent* Content
}
class FShaderPipelineRef{
    FShaderPipeline* ShaderPipeline
}
class FGlobalShaderMapContent{
    FHashedName HashedSourceFilename
}
FGlobalShaderMapSection --o FGlobalShaderMap
class FGlobalShaderMap{
    TMap<FHashedName, FGlobalShaderMapSection*> SectionMap
}

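With FGlobalShaderMap and the relationships among its core classes covered, let's look at how it is used in actual rendering. First, GlobalShader.h and GlobalShader.cpp declare and define the FGlobalShaderMap instances and the related interfaces: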

// Engine\Source\Runtime\RenderCore\Public\GlobalShader.h

// Declare the externally accessible array of FGlobalShaderMap pointers.
extern RENDERCORE_API FGlobalShaderMap* GGlobalShaderMap[SP_NumPlatforms];

// Get the FGlobalShaderMap of the given shader platform.
extern RENDERCORE_API FGlobalShaderMap* GetGlobalShaderMap(EShaderPlatform Platform);

// Get the FGlobalShaderMap of the given feature level.
inline FGlobalShaderMap* GetGlobalShaderMap(ERHIFeatureLevel::Type FeatureLevel)
{ 
    return GetGlobalShaderMap(GShaderPlatformForFeatureLevel[FeatureLevel]); 
}

// Engine\Source\Runtime\RenderCore\Private\GlobalShader.cpp

// Define the FGlobalShaderMap pointers for all shader platforms.
FGlobalShaderMap* GGlobalShaderMap[SP_NumPlatforms] = {};

// Get the FGlobalShaderMap.
FGlobalShaderMap* GetGlobalShaderMap(EShaderPlatform Platform)
{
    return GGlobalShaderMap[Platform];
}
           

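However, the code above only defines GGlobalShaderMap, and the array initially holds nothing but null pointers. The actual creation happens along the following call chain: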

// Engine\Source\Runtime\Launch\Private\LaunchEngineLoop.cpp

// Engine pre-initialization.
int32 FEngineLoop::PreInitPreStartupScreen(const TCHAR* CmdLine)
{
    (......)
    
    // Whether shader compilation is enabled; it normally is.
    bool bEnableShaderCompile = !FParse::Param(FCommandLine::Get(), TEXT("NoShaderCompile"));
    
    (......)
    
    if (bEnableShaderCompile && !IsRunningDedicatedServer() && !bIsCook)
    {
        (......)
        
        // Compile the GlobalShaderMap.
        CompileGlobalShaderMap(false);
        
        (......)
    }
    
    (......)
}

// Engine\Source\Runtime\Engine\Private\ShaderCompiler\ShaderCompiler.cpp

void CompileGlobalShaderMap(EShaderPlatform Platform, const ITargetPlatform* TargetPlatform, bool bRefreshShaderMap)
{
    (......)

    // Create the GlobalShaderMap for this platform if it does not exist yet.
    if (!GGlobalShaderMap[Platform])
    {
        (......)

        // Create the FGlobalShaderMap for this platform.
        GGlobalShaderMap[Platform] = new FGlobalShaderMap(Platform);

        // Cooked mode.
        if (FPlatformProperties::RequiresCookedData())
        {
            (......)
        }
        // Uncooked mode.
        else
        {
            // Id of the FGlobalShaderMap.
            FGlobalShaderMapId ShaderMapId(Platform);

            const int32 ShaderFilenameNum = ShaderMapId.GetShaderFilenameToDependeciesMap().Num();
            const float ProgressStep = 25.0f / ShaderFilenameNum;

            TArray<uint32> AsyncDDCRequestHandles;
            AsyncDDCRequestHandles.SetNum(ShaderFilenameNum);

            int32 HandleIndex = 0;

            // Issue the DDC requests.
            for (const auto& ShaderFilenameDependencies : ShaderMapId.GetShaderFilenameToDependeciesMap())
            {
                SlowTask.EnterProgressFrame(ProgressStep);

                const FString DataKey = GetGlobalShaderMapKeyString(ShaderMapId, Platform, TargetPlatform, ShaderFilenameDependencies.Value);

                AsyncDDCRequestHandles[HandleIndex] = GetDerivedDataCacheRef().GetAsynchronous(*DataKey, TEXT("GlobalShaderMap"_SV));

                ++HandleIndex;
            }

            // Process the completed DDC requests.
            TArray<uint8> CachedData;
            HandleIndex = 0;
            for (const auto& ShaderFilenameDependencies : ShaderMapId.GetShaderFilenameToDependeciesMap())
            {
                SlowTask.EnterProgressFrame(ProgressStep);
                CachedData.Reset();
                
                GetDerivedDataCacheRef().WaitAsynchronousCompletion(AsyncDDCRequestHandles[HandleIndex]);
                if (GetDerivedDataCacheRef().GetAsynchronousResults(AsyncDDCRequestHandles[HandleIndex], CachedData))
                {
                    FMemoryReader MemoryReader(CachedData);
                    GGlobalShaderMap[Platform]->AddSection(FGlobalShaderMapSection::CreateFromArchive(MemoryReader));
                }
                else
                {
                    // Not found in the DDC; ignore it.
                }

                ++HandleIndex;
            }
        }

        // If any shaders were not loaded, compile them.
        VerifyGlobalShaders(Platform, bLoadedFromCacheFile);

        // Create all shaders.
        if (GCreateShadersOnLoad && Platform == GMaxRHIShaderPlatform)
        {
            GGlobalShaderMap[Platform]->BeginCreateAllShaders();
        }
    }
}
           

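As shown above, the FGlobalShaderMap instance is created during engine pre-initialization, and the engine then tries to read already compiled shader data from the DDC. After that, other modules can access and operate on the FGlobalShaderMap objects normally.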
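The overall pattern in CompileGlobalShaderMap is "take from the cache first, then compile whatever is still missing". The following sketch models just that control flow under simplifying assumptions: the cache is a synchronous in-memory map rather than the asynchronous DDC, and `BlobSketch`, `GlobalShaderMapSketch`, and `LoadOrCompileSections` are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for serialized, compiled shader data.
using BlobSketch = std::string;

// Hypothetical shader map: one section per shader source file.
struct GlobalShaderMapSketch
{
    std::map<std::string, BlobSketch> Sections;
};

// Fills OutMap from the cache where possible and compiles the rest.
// Returns the number of sections that had to be compiled.
inline int LoadOrCompileSections(
    const std::vector<std::string>& SourceFiles,
    const std::map<std::string, BlobSketch>& Cache,
    const std::function<BlobSketch(const std::string&)>& Compile,
    GlobalShaderMapSketch& OutMap)
{
    assert(Compile != nullptr);

    // Pass 1: take whatever the cache already has; misses are ignored for now.
    for (const std::string& File : SourceFiles)
    {
        const auto It = Cache.find(File);
        if (It != Cache.end())
        {
            OutMap.Sections[File] = It->second;
        }
    }

    // Pass 2: compile every section that is still missing
    // (the role VerifyGlobalShaders plays in the listing above).
    int NumCompiled = 0;
    for (const std::string& File : SourceFiles)
    {
        if (OutMap.Sections.count(File) == 0)
        {
            OutMap.Sections[File] = Compile(File);
            ++NumCompiled;
        }
    }
    return NumCompiled;
}
```

Splitting the work into two passes matches the listing: all cache requests are issued up front, and only the misses pay the compilation cost.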

In addition, FViewInfo also holds a pointer to an FGlobalShaderMap, which it likewise obtains through GetGlobalShaderMap:

// Engine\Source\Runtime\Renderer\Private\SceneRendering.h

class FViewInfo : public FSceneView
{
public:
    (......)
    
    FGlobalShaderMap* ShaderMap;
    
    (......)
};

// Engine\Source\Runtime\Renderer\Private\SceneRendering.cpp

void FViewInfo::Init()
{
    (......)

    ShaderMap = GetGlobalShaderMap(FeatureLevel);
    
    (......)
}
           

As a result, most logic in the renderer module can easily reach an FViewInfo instance and therefore the FGlobalShaderMap as well, without having to specify a FeatureLevel.
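The lookup chain "feature level, then shader platform, then per-platform shader map, cached on the view" can be summarized in a few lines. This is only an illustrative model: the enums are reduced to a handful of made-up entries, and `ShaderMapRegistrySketch` and `ViewSketch` are hypothetical stand-ins for the global arrays and FViewInfo.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical, reduced enums; the real engine has many more entries.
enum class FeatureLevelSketch : std::size_t { ES3_1 = 0, SM5 = 1, Num = 2 };
enum class ShaderPlatformSketch : std::size_t { PlatformA = 0, PlatformB = 1, PlatformC = 2, Num = 3 };

constexpr std::size_t NumFeatureLevels = static_cast<std::size_t>(FeatureLevelSketch::Num);
constexpr std::size_t NumPlatforms = static_cast<std::size_t>(ShaderPlatformSketch::Num);

struct GlobalShaderMapSketch
{
    ShaderPlatformSketch Platform;
};

// Stand-in for the two global tables: feature level -> platform, platform -> map.
struct ShaderMapRegistrySketch
{
    std::array<ShaderPlatformSketch, NumFeatureLevels> PlatformForFeatureLevel{};
    std::array<GlobalShaderMapSketch*, NumPlatforms> MapPerPlatform{};  // nullptr until created

    GlobalShaderMapSketch* Get(ShaderPlatformSketch Platform) const
    {
        const std::size_t Index = static_cast<std::size_t>(Platform);
        assert(Index < NumPlatforms);
        return MapPerPlatform[Index];
    }

    GlobalShaderMapSketch* Get(FeatureLevelSketch Level) const
    {
        const std::size_t Index = static_cast<std::size_t>(Level);
        assert(Index < NumFeatureLevels);
        return Get(PlatformForFeatureLevel[Index]);
    }
};

// Stand-in for a view that resolves its shader map once during Init.
struct ViewSketch
{
    GlobalShaderMapSketch* ShaderMap = nullptr;

    void Init(const ShaderMapRegistrySketch& Registry, FeatureLevelSketch Level)
    {
        ShaderMap = Registry.Get(Level);
    }
};
```

Resolving the pointer once in Init means later render code only needs the view, not the feature level, which is the convenience described above.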

FMaterialShaderMap stores and manages a set of FMaterialShader instances. It and its related types are defined as follows:

// Engine\Source\Runtime\Engine\Public\MaterialShared.h

// Content of a material ShaderMap.
class FMaterialShaderMapContent : public FShaderMapContent
{
public:
    (......)

    inline uint32 GetNumShaders() const;
    inline uint32 GetNumShaderPipelines() const;

private:
    struct FProjectMeshShaderMapToKey
    {
        inline const FHashedName& operator()(const FMeshMaterialShaderMap* InShaderMap) { return InShaderMap->GetVertexFactoryTypeName(); }
    };

    // Get/add/remove operations.
    FMeshMaterialShaderMap* GetMeshShaderMap(const FHashedName& VertexFactoryTypeName) const;
    void AddMeshShaderMap(const FVertexFactoryType* VertexFactoryType, FMeshMaterialShaderMap* MeshShaderMap);
    void RemoveMeshShaderMap(const FVertexFactoryType* VertexFactoryType);

    // Ordered mesh shader maps, indexed by VFType->GetId() for fast lookup at run time.
    LAYOUT_FIELD(TMemoryImageArray<TMemoryImagePtr<FMeshMaterialShaderMap>>, OrderedMeshShaderMaps);
    // Material compilation output.
    LAYOUT_FIELD(FMaterialCompilationOutput, MaterialCompilationOutput);
    // Hash of the shader content.
    LAYOUT_FIELD(FSHAHash, ShaderContentHash);

    LAYOUT_FIELD_EDITORONLY(TMemoryImageArray<FMaterialProcessedSource>, ShaderProcessedSource);
    LAYOUT_FIELD_EDITORONLY(FMemoryImageString, FriendlyName);
    LAYOUT_FIELD_EDITORONLY(FMemoryImageString, DebugDescription);
    LAYOUT_FIELD_EDITORONLY(FMemoryImageString, MaterialPath);
};

// Material shader map; its base class is TShaderMap.
class FMaterialShaderMap : public TShaderMap<FMaterialShaderMapContent, FShaderMapPointerTable>, public FDeferredCleanupInterface
{
public:
    using Super = TShaderMap<FMaterialShaderMapContent, FShaderMapPointerTable>;

    // Find the FMaterialShaderMap instance for the given id and platform.
    static TRefCountPtr<FMaterialShaderMap> FindId(const FMaterialShaderMapId& ShaderMapId, EShaderPlatform Platform);

    (......)

    // ShaderMap interface
    // Get shader instances.
    TShaderRef<FShader> GetShader(FShaderType* ShaderType, int32 PermutationId = 0) const;
    template<typename ShaderType> TShaderRef<ShaderType> GetShader(int32 PermutationId = 0) const;
    template<typename ShaderType> TShaderRef<ShaderType> GetShader(const typename ShaderType::FPermutationDomain& PermutationVector) const;

    uint32 GetMaxNumInstructionsForShader(FShaderType* ShaderType) const;

    void FinalizeContent();

    // Compile the shaders of a material and cache them in the shader map.
    void Compile(FMaterial* Material,const FMaterialShaderMapId& ShaderMapId, TRefCountPtr<FShaderCompilerEnvironment> MaterialEnvironment, const FMaterialCompilationOutput& InMaterialCompilationOutput, EShaderPlatform Platform, bool bSynchronousCompile);

    // Check whether any shaders are missing.
    bool IsComplete(const FMaterial* Material, bool bSilent);
    // Try to add the material to an existing compilation task.
    bool TryToAddToExistingCompilationTask(FMaterial* Material);

    // Build the list of shaders in this shader map.
    void GetShaderList(TMap<FShaderId, TShaderRef<FShader>>& OutShaders) const;
    void GetShaderList(TMap<FHashedName, TShaderRef<FShader>>& OutShaders) const;
    void GetShaderPipelineList(TArray<FShaderPipelineRef>& OutShaderPipelines) const;

    uint32 GetShaderNum() const;

    // Register the material shader map in the global map so that it can be used by materials.
    void Register(EShaderPlatform InShaderPlatform);

    // Reference counting.
    void AddRef();
    void Release();

    // Remove all cached entries of the given shader type.
    void FlushShadersByShaderType(const FShaderType* ShaderType);
    void FlushShadersByShaderPipelineType(const FShaderPipelineType* ShaderPipelineType);
    void FlushShadersByVertexFactoryType(const FVertexFactoryType* VertexFactoryType);
    
    static void RemovePendingMaterial(FMaterial* Material);
    static const FMaterialShaderMap* GetShaderMapBeingCompiled(const FMaterial* Material);

    // Accessors.
    FMeshMaterialShaderMap* GetMeshShaderMap(FVertexFactoryType* VertexFactoryType) const;
    FMeshMaterialShaderMap* GetMeshShaderMap(const FHashedName& VertexFactoryTypeName) const;
    const FMaterialShaderMapId& GetShaderMapId() const;
    
    (......)

private:
    // Global material shader maps.
    static TMap<FMaterialShaderMapId,FMaterialShaderMap*> GIdToMaterialShaderMap[SP_NumPlatforms];
    static FCriticalSection GIdToMaterialShaderMapCS;
    // Materials that are currently being compiled.
    static TMap<TRefCountPtr<FMaterialShaderMap>, TArray<FMaterial*> > ShaderMapsBeingCompiled;

    // Shader map id.
    FMaterialShaderMapId ShaderMapId;
    // Id used during compilation.
    uint32 CompilingId;
    // Target platform being compiled for.
    const ITargetPlatform* CompilingTargetPlatform;

    // Reference count.
    mutable int32 NumRefs;

    // Flags.
    bool bDeletedThroughDeferredCleanup;
    uint32 bRegistered : 1;
    uint32 bCompilationFinalized : 1;
    uint32 bCompiledSuccessfully : 1;
    uint32 bIsPersistent : 1;

    (......)
};
           

Unlike FGlobalShaderMap, FMaterialShaderMap is additionally tied to a material and to a vertex factory. The internal contents of a single FMaterialShaderMap look like this:

FMaterialShaderMap
    FLightFunctionPixelShader - FMaterialShaderType
    FLocalVertexFactory - FVertexFactoryType
        TDepthOnlyPS - FMeshMaterialShaderType
        TDepthOnlyVS - FMeshMaterialShaderType
        TBasePassPS - FMeshMaterialShaderType
        TBasePassVS - FMeshMaterialShaderType
        (......)
    FGPUSkinVertexFactory - FVertexFactoryType
        (......)
           

FMaterialShaderMap is bound to a material blueprint because it is a member of FMaterial:

// Engine\Source\Runtime\Engine\Public\MaterialShared.h

class FMaterial
{
public:
    // Get a shader instance of the material.
    TShaderRef<FShader> GetShader(class FMeshMaterialShaderType* ShaderType, FVertexFactoryType* VertexFactoryType, int32 PermutationId, bool bFatalIfMissing = true) const;
    
    (......)
    
private:
    // Material ShaderMap for the game thread.
    TRefCountPtr<FMaterialShaderMap> GameThreadShaderMap;
    // Material ShaderMap for the rendering thread.
    TRefCountPtr<FMaterialShaderMap> RenderingThreadShaderMap;
    
    (......)
};

// Engine\Source\Runtime\Engine\Private\Materials\MaterialShared.cpp

TShaderRef<FShader> FMaterial::GetShader(FMeshMaterialShaderType* ShaderType, FVertexFactoryType* VertexFactoryType, int32 PermutationId, bool bFatalIfMissing) const
{
    // Fetch the shader from RenderingThreadShaderMap.
    const FMeshMaterialShaderMap* MeshShaderMap = RenderingThreadShaderMap->GetMeshShaderMap(VertexFactoryType);
    FShader* Shader = MeshShaderMap ? MeshShaderMap->GetShader(ShaderType, PermutationId) : nullptr;
    
    (......)

    // Return a reference to the FShader.
    return TShaderRef<FShader>(Shader, *RenderingThreadShaderMap);
}
           

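So each FMaterial has an FMaterialShaderMap (one for the game thread and one for the rendering thread). To get a shader of a given type for an FMaterial, it has to be fetched from that FMaterial's FMaterialShaderMap instance, which is what links the two together.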
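The two-level lookup in FMaterial::GetShader (first the mesh shader map for a vertex factory type, then the shader by type and permutation id) can be modeled as nested maps. The sketch below is a simplification for illustration only: all `...Sketch` types are hypothetical, strings replace hashed names, and shaders are stored by value.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a compiled shader.
struct ShaderSketch
{
    std::string Name;
};

// Key: (shader type name, permutation id).
using ShaderKeySketch = std::pair<std::string, int>;

// Shaders that depend on both the material and one vertex factory type.
struct MeshMaterialShaderMapSketch
{
    std::map<ShaderKeySketch, ShaderSketch> Shaders;

    const ShaderSketch* GetShader(const std::string& ShaderType, int PermutationId) const
    {
        assert(PermutationId >= 0);
        const auto It = Shaders.find(ShaderKeySketch(ShaderType, PermutationId));
        return It != Shaders.end() ? &It->second : nullptr;
    }
};

// One material's shader map: material-level shaders plus one mesh map per vertex factory.
struct MaterialShaderMapSketch
{
    std::map<ShaderKeySketch, ShaderSketch> MaterialShaders;            // not tied to a vertex factory
    std::map<std::string, MeshMaterialShaderMapSketch> MeshShaderMaps;  // keyed by vertex factory type name

    const MeshMaterialShaderMapSketch* GetMeshShaderMap(const std::string& VertexFactoryType) const
    {
        const auto It = MeshShaderMaps.find(VertexFactoryType);
        return It != MeshShaderMaps.end() ? &It->second : nullptr;
    }
};

// The material owns its shader map and forwards lookups to it.
struct MaterialSketch
{
    MaterialShaderMapSketch RenderingThreadShaderMap;

    const ShaderSketch* GetShader(const std::string& ShaderType,
                                  const std::string& VertexFactoryType,
                                  int PermutationId) const
    {
        const MeshMaterialShaderMapSketch* MeshMap =
            RenderingThreadShaderMap.GetMeshShaderMap(VertexFactoryType);
        return MeshMap ? MeshMap->GetShader(ShaderType, PermutationId) : nullptr;
    }
};
```

A lookup for a vertex factory type or shader type that was never compiled simply returns a null pointer here, whereas the real GetShader additionally takes a bFatalIfMissing flag.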

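The previous subsections explained that FGlobalShaderMap stores and manages FGlobalShader, and FMaterialShaderMap stores and manages FMaterialShader. Correspondingly, FMeshMaterialShaderMap stores and manages FMeshMaterialShader. It is defined as follows: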

// Engine\Source\Runtime\Engine\Public\MaterialShared.h

class FMeshMaterialShaderMap : public FShaderMapContent
{
public:
    FMeshMaterialShaderMap(EShaderPlatform InPlatform, FVertexFactoryType* InVFType);

    // Begin compiling all shaders for the given material and vertex factory type.
    uint32 BeginCompile(
        uint32 ShaderMapId,
        const FMaterialShaderMapId& InShaderMapId, 
        const FMaterial* Material,
        const FMeshMaterialShaderMapLayout& MeshLayout,
        FShaderCompilerEnvironment* MaterialEnvironment,
        EShaderPlatform Platform,
        TArray<TSharedRef<FShaderCommonCompileJob, ESPMode::ThreadSafe>>& NewJobs,
        FString DebugDescription,
        FString DebugExtension
        );

    void FlushShadersByShaderType(const FShaderType* ShaderType);
    void FlushShadersByShaderPipelineType(const FShaderPipelineType* ShaderPipelineType);

    (......)

private:
    // Vertex factory type name.
    LAYOUT_FIELD(FHashedName, VertexFactoryTypeName);
};
           

An FMeshMaterialShaderMap is normally not created on its own. Instead, it is attached to an FMaterialShaderMapContent and is created and destroyed along with it. See the previous subsection for details and usage.

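This section covers how material blueprints and usf files are compiled into shader code for the target platform. To illustrate how a single shader file is compiled, let's trace how the RecompileShaders command is handled (a global shader is compiled in this case):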

// Engine\Source\Runtime\Engine\Private\ShaderCompiler\ShaderCompiler.cpp

bool RecompileShaders(const TCHAR* Cmd, FOutputDevice& Ar)
{
    (......)

    FString FlagStr(FParse::Token(Cmd, 0));
    if( FlagStr.Len() > 0 )
    {
        // Flush the shader file cache.
        FlushShaderFileCache();
        // Flush pending rendering commands.
        FlushRenderingCommands();

        // Handle the `RecompileShaders Changed` command.
        if( FCString::Stricmp(*FlagStr,TEXT("Changed"))==0)
        {
            (......)
        }
        // Handle the `RecompileShaders Global` command.
        else if( FCString::Stricmp(*FlagStr,TEXT("Global"))==0)
        {
            (......)
        }
        // Handle the `RecompileShaders Material` command.
        else if( FCString::Stricmp(*FlagStr,TEXT("Material"))==0)
        {
            (......)
        }
        // Handle the `RecompileShaders All` command.
        else if( FCString::Stricmp(*FlagStr,TEXT("All"))==0)
        {
            (......)
        }
        // Handle the `RecompileShaders <ShaderPath>` command.
        else
        {
            // Look up the FShaderTypes by file name.
            TArray<const FShaderType*> ShaderTypes = FShaderType::GetShaderTypesByFilename(*FlagStr);
            // Look up the FShaderPipelineTypes by file name.
            TArray<const FShaderPipelineType*> ShaderPipelineTypes = FShaderPipelineType::GetShaderPipelineTypesByFilename(*FlagStr);
            if (ShaderTypes.Num() > 0 || ShaderPipelineTypes.Num() > 0)
            {
                FRecompileShadersTimer TestTimer(TEXT("RecompileShaders SingleShader"));
                
                TArray<const FVertexFactoryType*> FactoryTypes;

                // Iterate over every active feature level and compile for each one.
                UMaterialInterface::IterateOverActiveFeatureLevels([&](ERHIFeatureLevel::Type InFeatureLevel) {
                    auto ShaderPlatform = GShaderPlatformForFeatureLevel[InFeatureLevel];
                    // Begin compiling the shaders for the given ShaderTypes, ShaderPipelineTypes and ShaderPlatform.
                    BeginRecompileGlobalShaders(ShaderTypes, ShaderPipelineTypes, ShaderPlatform);
                    // Finish compiling.
                    FinishRecompileGlobalShaders();
                });
            }
        }

        return 1;
    }

    (......)
}
           

The code above calls the key function BeginRecompileGlobalShaders to start compiling the requested shaders:

void BeginRecompileGlobalShaders(const TArray<const FShaderType*>& OutdatedShaderTypes, const TArray<const FShaderPipelineType*>& OutdatedShaderPipelineTypes, EShaderPlatform ShaderPlatform, const ITargetPlatform* TargetPlatform)
{
    if (!FPlatformProperties::RequiresCookedData())
    {
        // Flush pending accesses to the existing global shaders.
        FlushRenderingCommands();

        // Compile the global shader map.
        CompileGlobalShaderMap(ShaderPlatform, TargetPlatform, false);
        
        // Verify the shaders.
        FGlobalShaderMap* GlobalShaderMap = GetGlobalShaderMap(ShaderPlatform);
        if (OutdatedShaderTypes.Num() > 0 || OutdatedShaderPipelineTypes.Num() > 0)
        {
            VerifyGlobalShaders(ShaderPlatform, false, &OutdatedShaderTypes, &OutdatedShaderPipelineTypes);
        }
    }
}

// Compile a single global shader map.
void CompileGlobalShaderMap(EShaderPlatform Platform, const ITargetPlatform* TargetPlatform, bool bRefreshShaderMap)
{
    (......)

    // Delete the old resources.
    if (bRefreshShaderMap || GGlobalShaderTargetPlatform[Platform] != TargetPlatform)
    {
        delete GGlobalShaderMap[Platform];
        GGlobalShaderMap[Platform] = nullptr;

        GGlobalShaderTargetPlatform[Platform] = TargetPlatform;

        // Make sure we pick up updated shader source files.
        FlushShaderFileCache();
    }

    // Create and compile the shaders.
    if (!GGlobalShaderMap[Platform])
    {
        (......)

        GGlobalShaderMap[Platform] = new FGlobalShaderMap(Platform);

        (......)

        // Check for shaders that are not loaded and compile them if there are any.
        VerifyGlobalShaders(Platform, bLoadedFromCacheFile);

        if (GCreateShadersOnLoad && Platform == GMaxRHIShaderPlatform)
        {
            GGlobalShaderMap[Platform]->BeginCreateAllShaders();
        }
    }
}

// Check for shaders that are not loaded and compile them if there are any.
void VerifyGlobalShaders(EShaderPlatform Platform, bool bLoadedFromCacheFile, const TArray<const FShaderType*>* OutdatedShaderTypes, const TArray<const FShaderPipelineType*>* OutdatedShaderPipelineTypes)
{
    (......)

    // Get the FGlobalShaderMap instance.
    FGlobalShaderMap* GlobalShaderMap = GetGlobalShaderMap(Platform);
    
    (......)

    // All jobs, both single and pipeline.
    TArray<TSharedRef<FShaderCommonCompileJob, ESPMode::ThreadSafe>> GlobalShaderJobs;

    // Add the single jobs first.
    TMap<TShaderTypePermutation<const FShaderType>, FShaderCompileJob*> SharedShaderJobs;

    for (TLinkedList<FShaderType*>::TIterator ShaderTypeIt(FShaderType::GetTypeList()); ShaderTypeIt; ShaderTypeIt.Next())
    {
        FGlobalShaderType* GlobalShaderType = ShaderTypeIt->GetGlobalShaderType();
        if (!GlobalShaderType)
        {
            continue;
        }

        int32 PermutationCountToCompile = 0;
        for (int32 PermutationId = 0; PermutationId < GlobalShaderType->GetPermutationCount(); PermutationId++)
        {
            if (GlobalShaderType->ShouldCompilePermutation(Platform, PermutationId) 
                && (!GlobalShaderMap->HasShader(GlobalShaderType, PermutationId) || (OutdatedShaderTypes && OutdatedShaderTypes->Contains(GlobalShaderType))))
            {
                // Remove the shader type permutation if it is outdated.
                if (OutdatedShaderTypes)
                {
                    GlobalShaderMap->RemoveShaderTypePermutaion(GlobalShaderType, PermutationId);
                }

                // Create a job that compiles this global shader type.
                auto* Job = FGlobalShaderTypeCompiler::BeginCompileShader(GlobalShaderType, PermutationId, Platform, nullptr, GlobalShaderJobs);
                TShaderTypePermutation<const FShaderType> ShaderTypePermutation(GlobalShaderType, PermutationId);
                // Record the job so that pipelines can share it.
                SharedShaderJobs.Add(ShaderTypePermutation, Job);
                PermutationCountToCompile++;
            }
        }

        (......)
    }

    // Handle FShaderPipeline: a shareable pipeline does not need duplicate jobs.
    for (TLinkedList<FShaderPipelineType*>::TIterator ShaderPipelineIt(FShaderPipelineType::GetTypeList()); ShaderPipelineIt; ShaderPipelineIt.Next())
    {
        const FShaderPipelineType* Pipeline = *ShaderPipelineIt;
        if (Pipeline->IsGlobalTypePipeline())
        {
            if (!GlobalShaderMap->HasShaderPipeline(Pipeline) || (OutdatedShaderPipelineTypes && OutdatedShaderPipelineTypes->Contains(Pipeline)))
            {
                auto& StageTypes = Pipeline->GetStages();
                TArray<FGlobalShaderType*> ShaderStages;
                for (int32 Index = 0; Index < StageTypes.Num(); ++Index)
                {
                    FGlobalShaderType* GlobalShaderType = ((FShaderType*)(StageTypes[Index]))->GetGlobalShaderType();
                    if (GlobalShaderType->ShouldCompilePermutation(Platform, kUniqueShaderPermutationId))
                    {
                        ShaderStages.Add(GlobalShaderType);
                    }
                    else
                    {
                        break;
                    }
                }

                // Remove the outdated pipeline type.
                if (OutdatedShaderPipelineTypes)
                {
                    GlobalShaderMap->RemoveShaderPipelineType(Pipeline);
                }

                if (ShaderStages.Num() == StageTypes.Num())
                {
                    (......)

                    if (Pipeline->ShouldOptimizeUnusedOutputs(Platform))
                    {
                        // Make a pipeline job with all the stages
                        FGlobalShaderTypeCompiler::BeginCompileShaderPipeline(Platform, Pipeline, ShaderStages, GlobalShaderJobs);
                    }
                    else
                    {
                        for (const FShaderType* ShaderType : StageTypes)
                        {
                            TShaderTypePermutation<const FShaderType> ShaderTypePermutation(ShaderType, kUniqueShaderPermutationId);

                            FShaderCompileJob** Job = SharedShaderJobs.Find(ShaderTypePermutation);
                            auto* SingleJob = (*Job)->GetSingleShaderJob();
                            auto& SharedPipelinesInJob = SingleJob->SharingPipelines.FindOrAdd(nullptr);
                            // Register the pipeline with the shared single-shader job.
                            SharedPipelinesInJob.Add(Pipeline);
                        }
                    }
                }
            }
        }
    }

    if (GlobalShaderJobs.Num() > 0)
    {
        GetOnGlobalShaderCompilation().Broadcast();
        // Add the compile jobs to GShaderCompilingManager.
        GShaderCompilingManager->AddJobs(GlobalShaderJobs, true, false, "Globals");

        // Some platforms do not support asynchronous shader compilation.
        const bool bAllowAsynchronousGlobalShaderCompiling =
            !IsOpenGLPlatform(GMaxRHIShaderPlatform) && !IsVulkanPlatform(GMaxRHIShaderPlatform) &&
            !IsMetalPlatform(GMaxRHIShaderPlatform) && !IsSwitchPlatform(GMaxRHIShaderPlatform) &&
            GShaderCompilingManager->AllowAsynchronousShaderCompiling();

        if (!bAllowAsynchronousGlobalShaderCompiling)
        {
            TArray<int32> ShaderMapIds;
            ShaderMapIds.Add(GlobalShaderMapId);

            GShaderCompilingManager->FinishCompilation(TEXT("Global"), ShaderMapIds);
        }
    }
}
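
Stripped of engine details, the first loop of VerifyGlobalShaders decides which (shader type, permutation) pairs need a compile job: anything missing from the global shader map, plus every permutation of a type that the caller listed as outdated. The sketch below illustrates only that decision under simplifying assumptions. All names are hypothetical, a `std::set` stands in for the global shader map, and the `ShouldCompilePermutation` filter is left out.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// (shader type name, permutation id); a hypothetical stand-in for TShaderTypePermutation.
using SketchPermutation = std::pair<std::string, int>;

struct SketchShaderType
{
    std::string Name;
    int PermutationCount;
};

// Returns the permutations that need a compile job. Entries of outdated types
// are erased from the map before their job is queued.
inline std::vector<SketchPermutation> CollectJobs(
    const std::vector<SketchShaderType>& Types,
    std::set<SketchPermutation>& ShaderMap,
    const std::set<std::string>& OutdatedTypes)
{
    std::vector<SketchPermutation> Jobs;
    for (const SketchShaderType& Type : Types)
    {
        const bool bOutdated = OutdatedTypes.count(Type.Name) != 0;
        for (int PermutationId = 0; PermutationId < Type.PermutationCount; ++PermutationId)
        {
            const SketchPermutation Key(Type.Name, PermutationId);
            if (ShaderMap.count(Key) == 0 || bOutdated)
            {
                ShaderMap.erase(Key); // no-op when the entry was missing
                Jobs.push_back(Key);
            }
        }
    }
    return Jobs;
}
```

Permutations that are present and not outdated are skipped, which is why a plain restart with an intact cache queues no global shader jobs.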
           

So shader compile jobs are handled by the global object GShaderCompilingManager. Here is the definition of FShaderCompilingManager:

// Engine\Source\Runtime\Engine\Public\ShaderCompiler.h

class FShaderCompilingManager
{
    (......)
    
private:
    //////////////////////////////////////////////////////
    // Properties shared between threads: read or write only while CompileQueueSection is held.
    bool bCompilingDuringGame;
    // Queue of pending compile jobs.
    TArray<TSharedRef<FShaderCommonCompileJob, ESPMode::ThreadSafe>> CompileQueue;
    TMap<int32, FShaderMapCompileResults> ShaderMapJobs;
    int32 NumOutstandingJobs;
    int32 NumExternalJobs;
    FCriticalSection CompileQueueSection;

    //////////////////////////////////////////////////////
    // Main thread state: accessed only by the main thread.
    TMap<int32, FShaderMapFinalizeResults> PendingFinalizeShaderMaps;
    TUniquePtr<FShaderCompileThreadRunnableBase> Thread;

    //////////////////////////////////////////////////////
    // Configuration properties.
    uint32 NumShaderCompilingThreads;
    uint32 NumShaderCompilingThreadsDuringGame;
    int32 MaxShaderJobBatchSize;
    int32 NumSingleThreadedRunsBeforeRetry;
    uint32 ProcessId;
    (......)

public:
    // Accessors and setters.
    bool ShouldDisplayCompilingNotification() const;
    bool AllowAsynchronousShaderCompiling() const;
    bool IsCompiling() const;
    bool HasShaderJobs() const;
    int32 GetNumRemainingJobs() const;
    void SetExternalJobs(int32 NumJobs);

    enum class EDumpShaderDebugInfo : int32
    {
        Never                = 0,
        Always                = 1,
        OnError                = 2,
        OnErrorOrWarning    = 3
    };
    
    (......)

    // Add compile jobs.
    ENGINE_API void AddJobs(TArray<TSharedRef<FShaderCommonCompileJob, ESPMode::ThreadSafe>>& NewJobs, bool bOptimizeForLowLatency, bool bRecreateComponentRenderStateOnCompletion, const FString MaterialBasePath, FString PermutationString = FString(""), bool bSkipResultProcessing = false);
    
    // Cancel compile jobs.
    ENGINE_API void CancelCompilation(const TCHAR* MaterialName, const TArray<int32>& ShaderMapIdsToCancel);
    // Finish compile jobs; blocks the calling thread until the given materials have compiled.
    ENGINE_API void FinishCompilation(const TCHAR* MaterialName, const TArray<int32>& ShaderMapIdsToFinishCompiling);
    // Block until all shader compilation has finished.
    ENGINE_API void FinishAllCompilation();
    // Shut down the compiling manager.
    ENGINE_API void Shutdown();
    // Process finished asynchronous results and attach them to their materials.
    ENGINE_API void ProcessAsyncResults(bool bLimitExecutionTime, bool bBlockOnGlobalShaderCompletion);

    static bool IsShaderCompilerWorkerRunning(FProcHandle & WorkerHandle);
};

// Engine\Source\Runtime\Engine\Private\ShaderCompiler\ShaderCompiler.cpp

void FShaderCompilingManager::AddJobs(TArray<TSharedRef<FShaderCommonCompileJob, ESPMode::ThreadSafe>>& NewJobs, bool bOptimizeForLowLatency, bool bRecreateComponentRenderStateOnCompletion, const FString MaterialBasePath, const FString PermutationString, bool bSkipResultProcessing)
{
    (......)
    
    // Register the jobs with GShaderCompilerStats.
    if(NewJobs.Num())
    {
        FShaderCompileJob* Job = NewJobs[0]->GetSingleShaderJob();
        if(Job) //assume that all jobs are for the same platform
        {
            GShaderCompilerStats->RegisterCompiledShaders(NewJobs.Num(), Job->Input.Target.GetPlatform(), MaterialBasePath, PermutationString);
        }
        else
        {
            GShaderCompilerStats->RegisterCompiledShaders(NewJobs.Num(), SP_NumPlatforms, MaterialBasePath, PermutationString);
        }
    }
    
    // Enqueue into the compile queue.
    if (bOptimizeForLowLatency)
    {
        int32 InsertIndex = 0;

        for (; InsertIndex < CompileQueue.Num(); InsertIndex++)
        {
            if (!CompileQueue[InsertIndex]->bOptimizeForLowLatency)
            {
                break;
            }
        }

        CompileQueue.InsertZeroed(InsertIndex, NewJobs.Num());

        for (int32 JobIndex = 0; JobIndex < NewJobs.Num(); JobIndex++)
        {
            CompileQueue[InsertIndex + JobIndex] = NewJobs[JobIndex];
        }
    }
    else
    {
        CompileQueue.Append(NewJobs);
    }

    // Increase the outstanding job count.
    FPlatformAtomics::InterlockedAdd(&NumOutstandingJobs, NewJobs.Num());

    // Increase the job count of each affected shader map.
    for (int32 JobIndex = 0; JobIndex < NewJobs.Num(); JobIndex++)
    {
        NewJobs[JobIndex]->bOptimizeForLowLatency = bOptimizeForLowLatency;
        FShaderMapCompileResults& ShaderMapInfo = ShaderMapJobs.FindOrAdd(NewJobs[JobIndex]->Id);
        ShaderMapInfo.bRecreateComponentRenderStateOnCompletion = bRecreateComponentRenderStateOnCompletion;
        ShaderMapInfo.bSkipResultProcessing = bSkipResultProcessing;
        auto* PipelineJob = NewJobs[JobIndex]->GetShaderPipelineJob();
        if (PipelineJob)
        {
            ShaderMapInfo.NumJobsQueued += PipelineJob->StageJobs.Num();
        }
        else
        {
            ShaderMapInfo.NumJobsQueued++;
        }
    }
}

void FShaderCompilingManager::FinishCompilation(const TCHAR* MaterialName, const TArray<int32>& ShaderMapIdsToFinishCompiling)
{
    (......)

    TMap<int32, FShaderMapFinalizeResults> CompiledShaderMaps;
    CompiledShaderMaps.Append( PendingFinalizeShaderMaps );
    PendingFinalizeShaderMaps.Empty();
    
    // Block until compilation completes.
    BlockOnShaderMapCompletion(ShaderMapIdsToFinishCompiling, CompiledShaderMaps);

    // Handle potential errors and retry while needed.
    bool bRetry = false;
    do 
    {
        bRetry = HandlePotentialRetryOnError(CompiledShaderMaps);
    } 
    while (bRetry);

    // Process the compiled shader maps.
    ProcessCompiledShaderMaps(CompiledShaderMaps, FLT_MAX);

    (......)
}
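
The queueing rule inside AddJobs is easy to miss in the listing: low-latency jobs are inserted behind the low-latency jobs that are already queued but ahead of every normal job, while normal jobs are appended at the back. A minimal sketch of just that ordering follows. `SketchJob` and this free-function `AddJobs` are hypothetical, and locking, statistics and per-shader-map counters are deliberately omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical job record; only what the ordering rule needs is modelled.
struct SketchJob
{
    int Id;
    bool bOptimizeForLowLatency;
};

// Low-latency jobs go after the low-latency jobs already queued but before all
// normal jobs; normal jobs are appended at the end of the queue.
inline void AddJobs(std::vector<SketchJob>& CompileQueue, std::vector<SketchJob> NewJobs, bool bOptimizeForLowLatency)
{
    for (SketchJob& Job : NewJobs)
    {
        Job.bOptimizeForLowLatency = bOptimizeForLowLatency;
    }

    std::size_t InsertIndex = CompileQueue.size();
    if (bOptimizeForLowLatency)
    {
        InsertIndex = 0;
        while (InsertIndex < CompileQueue.size() && CompileQueue[InsertIndex].bOptimizeForLowLatency)
        {
            ++InsertIndex;
        }
    }
    CompileQueue.insert(CompileQueue.begin() + static_cast<std::ptrdiff_t>(InsertIndex),
                        NewJobs.begin(), NewJobs.end());
}
```

Because new low-latency work lands behind older low-latency work, submission order is preserved within each priority class.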
           

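As shown above, shader compile jobs are ultimately instances of FShaderCommonCompileJob. These instances go into a global queue so that they can be compiled asynchronously across multiple threads. FShaderCommonCompileJob and its related types are defined as follows: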

// Engine\Source\Runtime\Engine\Public\ShaderCompiler.h

// Stores the data common to compiling a shader or a shader pipeline.
class FShaderCommonCompileJob
{
public:
    uint32 Id;
    // Whether compilation has finished.
    bool bFinalized;
    // Whether compilation succeeded.
    bool bSucceeded;
    bool bOptimizeForLowLatency;

    FShaderCommonCompileJob(uint32 InId);
    virtual ~FShaderCommonCompileJob();

    // Accessors.
    virtual FShaderCompileJob* GetSingleShaderJob();
    virtual const FShaderCompileJob* GetSingleShaderJob() const;
    virtual FShaderPipelineCompileJob* GetShaderPipelineJob();
    virtual const FShaderPipelineCompileJob* GetShaderPipelineJob() const;

    // Get a globally unique id for a shader compile job.
    ENGINE_API static uint32 GetNextJobId();

private:
    // Counter for job ids.
    static FThreadSafeCounter JobIdCounter;
};

// All input and output information needed to compile a single shader.
class FShaderCompileJob : public FShaderCommonCompileJob
{
public:
    // Vertex factory type of the shader; may be null.
    FVertexFactoryType* VFType;
    // Shader type.
    FShaderType* ShaderType;
    // Permutation id.
    int32 PermutationId;
    // Compile input and output.
    FShaderCompilerInput Input;
    FShaderCompilerOutput Output;

    // Pipelines that share this job.
    TMap<const FVertexFactoryType*, TArray<const FShaderPipelineType*>> SharingPipelines;

    FShaderCompileJob(uint32 InId, FVertexFactoryType* InVFType, FShaderType* InShaderType, int32 InPermutationId);

    virtual FShaderCompileJob* GetSingleShaderJob() override;
    virtual const FShaderCompileJob* GetSingleShaderJob() const override;
};

// Information needed to compile a shader pipeline.
class FShaderPipelineCompileJob : public FShaderCommonCompileJob
{
public:
    // One job per pipeline stage.
    TArray<TSharedRef<FShaderCommonCompileJob, ESPMode::ThreadSafe>> StageJobs;
    bool bFailedRemovingUnused;

    // The shader pipeline this job belongs to.
    const FShaderPipelineType* ShaderPipeline;

    FShaderPipelineCompileJob(uint32 InId, const FShaderPipelineType* InShaderPipeline, int32 NumStages);

    virtual FShaderPipelineCompileJob* GetShaderPipelineJob() override;
    virtual const FShaderPipelineCompileJob* GetShaderPipelineJob() const override;
};
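
The three job classes rely on a small pattern: the base class exposes virtual accessors that return null by default, and each concrete class overrides only its own accessor to return `this`. Callers such as AddJobs can then tell the two kinds of job apart without a cast. The sketch below reproduces the pattern with hypothetical `Sketch*` types. It assumes, consistent with the overrides shown above, that the base accessors return null, and `CountQueuedJobs` mirrors the `NumJobsQueued` update from AddJobs.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

struct SketchSingleJob;
struct SketchPipelineJob;

// Base class: each accessor returns nullptr unless the concrete type overrides it.
struct SketchCommonJob
{
    explicit SketchCommonJob(std::uint32_t InId) : Id(InId) {}
    virtual ~SketchCommonJob() = default;

    virtual SketchSingleJob* GetSingleShaderJob() { return nullptr; }
    virtual SketchPipelineJob* GetShaderPipelineJob() { return nullptr; }

    std::uint32_t Id;
};

struct SketchSingleJob : SketchCommonJob
{
    using SketchCommonJob::SketchCommonJob;
    SketchSingleJob* GetSingleShaderJob() override { return this; }
};

struct SketchPipelineJob : SketchCommonJob
{
    using SketchCommonJob::SketchCommonJob;
    SketchPipelineJob* GetShaderPipelineJob() override { return this; }

    // One job per pipeline stage.
    std::vector<std::shared_ptr<SketchCommonJob>> StageJobs;
};

// A pipeline job counts once per stage; a single job counts once.
inline int CountQueuedJobs(SketchCommonJob& Job)
{
    if (SketchPipelineJob* PipelineJob = Job.GetShaderPipelineJob())
    {
        return static_cast<int>(PipelineJob->StageJobs.size());
    }
    return 1;
}
```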
           

These jobs are added to FShaderCompilingManager::CompileQueue through functions such as FShaderCompilingManager::AddJobs. They are then pulled and executed mainly by FShaderCompileThreadRunnable::PullTasksFromQueue (a multi-producer, multi-consumer pattern):

// Engine\Source\Runtime\Engine\Private\ShaderCompiler\ShaderCompiler.cpp

int32 FShaderCompileThreadRunnable::PullTasksFromQueue()
{
    int32 NumActiveThreads = 0;
    {
        // Enter the critical section to access the input and output queues.
        FScopeLock Lock(&Manager->CompileQueueSection);

        const int32 NumWorkersToFeed = Manager->bCompilingDuringGame ? Manager->NumShaderCompilingThreadsDuringGame : WorkerInfos.Num();
        // Compute the number of jobs per worker.
        const auto NumJobsPerWorker = (Manager->CompileQueue.Num() / NumWorkersToFeed) + 1;
        
        // Iterate over all WorkerInfos.
        for (int32 WorkerIndex = 0; WorkerIndex < WorkerInfos.Num(); WorkerIndex++)
        {
            FShaderCompileWorkerInfo& CurrentWorkerInfo = *WorkerInfos[WorkerIndex];

            // This worker has no queued jobs, so pull some from the manager's input queue.
            if (CurrentWorkerInfo.QueuedJobs.Num() == 0 && WorkerIndex < NumWorkersToFeed)
            {
                if (Manager->CompileQueue.Num() > 0)
                {
                    bool bAddedLowLatencyTask = false;
                    const auto MaxNumJobs = FMath::Min3(NumJobsPerWorker, Manager->CompileQueue.Num(), Manager->MaxShaderJobBatchSize);
                    
                    int32 JobIndex = 0;
                    // Don't put more than one low latency task into a batch
                    for (; JobIndex < MaxNumJobs && !bAddedLowLatencyTask; JobIndex++)
                    {
                        bAddedLowLatencyTask |= Manager->CompileQueue[JobIndex]->bOptimizeForLowLatency;
                        // Add the job from the manager's CompileQueue to this worker's QueuedJobs.
                        CurrentWorkerInfo.QueuedJobs.Add(Manager->CompileQueue[JobIndex]);
                    }

                    CurrentWorkerInfo.bIssuedTasksToWorker = false;                    
                    CurrentWorkerInfo.bLaunchedWorker = false;
                    CurrentWorkerInfo.StartTime = FPlatformTime::Seconds();
                    NumActiveThreads++;
                    // Remove the jobs just taken from the manager's CompileQueue. The array holds thread-safe TSharedRefs; access to the array itself is guarded by CompileQueueSection.
                    Manager->CompileQueue.RemoveAt(0, JobIndex);
                }
            }
            // This worker already has jobs.
            else
            {
                if (CurrentWorkerInfo.QueuedJobs.Num() > 0)
                {
                    NumActiveThreads++;
                }

                // Add the completed jobs to the output queue (ShaderMapJobs).
                if (CurrentWorkerInfo.bComplete)
                {
                    for (int32 JobIndex = 0; JobIndex < CurrentWorkerInfo.QueuedJobs.Num(); JobIndex++)
                    {
                        FShaderMapCompileResults& ShaderMapResults = Manager->ShaderMapJobs.FindChecked(CurrentWorkerInfo.QueuedJobs[JobIndex]->Id);
                        ShaderMapResults.FinishedJobs.Add(CurrentWorkerInfo.QueuedJobs[JobIndex]);
                        ShaderMapResults.bAllJobsSucceeded = ShaderMapResults.bAllJobsSucceeded && CurrentWorkerInfo.QueuedJobs[JobIndex]->bSucceeded;
                    }
                    
                    (......)
                    
                    // Update NumOutstandingJobs.
                    FPlatformAtomics::InterlockedAdd(&Manager->NumOutstandingJobs, -CurrentWorkerInfo.QueuedJobs.Num());

                    // Reset the job data.
                    CurrentWorkerInfo.bComplete = false;
                    CurrentWorkerInfo.QueuedJobs.Empty();
                }
            }
        }
    }
    return NumActiveThreads;
}
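
The batch size handed to each idle worker follows from three limits: an even share of the queue plus one, the queue length, and `MaxShaderJobBatchSize`. A batch also ends right after its first low-latency job. The sketch below isolates that rule for a single worker. `SketchQueuedJob` and `PullBatch` are hypothetical names, and unlike the engine loop, which computes the per-worker share once for all workers, this version recomputes it on every call.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical job record for this sketch.
struct SketchQueuedJob
{
    int Id;
    bool bOptimizeForLowLatency;
};

// Takes one batch off the front of the queue for a single idle worker.
// NumWorkersToFeed must be greater than zero.
inline std::vector<SketchQueuedJob> PullBatch(std::deque<SketchQueuedJob>& CompileQueue,
                                              int NumWorkersToFeed,
                                              int MaxShaderJobBatchSize)
{
    const int QueueSize = static_cast<int>(CompileQueue.size());
    const int NumJobsPerWorker = QueueSize / NumWorkersToFeed + 1;
    const int MaxNumJobs = std::min({NumJobsPerWorker, QueueSize, MaxShaderJobBatchSize});

    std::vector<SketchQueuedJob> Batch;
    bool bAddedLowLatencyTask = false;
    // Stop right after the first low-latency job so it is not held back by
    // other work in the same batch.
    while (static_cast<int>(Batch.size()) < MaxNumJobs && !bAddedLowLatencyTask)
    {
        bAddedLowLatencyTask = CompileQueue.front().bOptimizeForLowLatency;
        Batch.push_back(CompileQueue.front());
        CompileQueue.pop_front();
    }
    return Batch;
}
```

With ten normal jobs, four workers and a batch limit of eight, the first worker receives three jobs (10 / 4 + 1), leaving seven in the queue.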
           

The worker information CurrentWorkerInfo above has the type FShaderCompileWorkerInfo:

// Information about a shader compile worker.
struct FShaderCompileWorkerInfo
{
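    // Handle of the worker process; may be invalid.
    FProcHandle WorkerProcess;
    // Tracks whether tasks have been issued to the worker.
    bool bIssuedTasksToWorker;
    // Whether the worker has been launched.
    bool bLaunchedWorker;
    // Whether all tasks issued to the worker have been received.
    bool bComplete;
    // Time at which the most recent batch of tasks was started.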
    double StartTime;
    
    // Jobs this worker is responsible for compiling (note the thread-safe shared references).
    TArray<TSharedRef<FShaderCommonCompileJob, ESPMode::ThreadSafe>> QueuedJobs;

    // Constructor.
    FShaderCompileWorkerInfo();
    // Destructor; not virtual.
    ~FShaderCompileWorkerInfo()
    {
        if(WorkerProcess.IsValid())
        {
            FPlatformProcess::TerminateProc(WorkerProcess);
            FPlatformProcess::CloseProc(WorkerProcess);
        }
    }
};
           

That covers most of the shader compilation flow and mechanism. The remaining details are left for readers to explore on their own.

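During development we write only one flavor of HLSL, in UE's own style. So how does UE compile it into shader instructions for the different graphics APIs (see the table below) and feature levels?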

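Graphics API | Shading language | Notes
Direct3D | HLSL (High Level Shading Language) | High-level shading language; Windows platforms only
OpenGL | GLSL (OpenGL Shading Language) | Cross-platform, but its state-machine design is at odds with modern GPU architectures
OpenGL ES | ES GLSL | Dedicated to mobile platforms
Metal | MSL (Metal Shading Language) | Apple systems only
Vulkan | SPIR-V | SPIR-V is an intermediate language; shaders for other platforms can be translated to and from it conveniently and completely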

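SPIR-V is governed by Khronos (also the creator of OpenGL and Vulkan). It is in fact a large ecosystem that includes shading languages, toolchains and runtime libraries: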

An overview of the SPIR-V ecosystem; cross-platform shaders are only one part of it.

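SPIR-V is also the cross-platform shader solution of quite a few commercial engines and renderers today. Does UE use SPIR-V as well, or did it choose something else? This section answers that question and digs into the cross-platform shader solution UE uses.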

Cross-platform shaders usually require the following considerations:

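  • Write once, run on many platforms. This is the basic requirement. Without it there is no cross-platform story at all, and developers face more work and lower productivity.
  • Offline compilation. Most shader compilers support this today.
  • Reflection, to build the metadata the renderer uses at runtime, such as which index a texture is bound to or whether a uniform is actually used.
  • Specific optimizations, such as offline validation, inlining, detecting and removing unused instructions and data, merging and simplifying instructions, and choosing whether offline compilation produces an intermediate language or target machine code.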

Early on, UE considered several approaches to cross-platform shaders:

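  • Hide the differences between shading languages purely behind macros. This might work for simple shading logic, but in practice the languages differ so much that macros can hardly abstract them, so it is not feasible.
  • Compile HLSL with FXC, then translate the bytecode. The results are good, but the fatal drawback is that it cannot support Mac OS, so it was dropped.
  • Use a third-party cross-compiler. At the time (2014), no such compiler supported SM5.0 syntax and compute shaders.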

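Given the situation at the time (around 2014), UE 4.3 took inspiration from glsl-optimizer and built its own tool, HLSLCC (HLSL Cross Compiler), on top of the Mesa GLSL parser and IR. HLSLCC repurposes the parser to parse SM5.0 HLSL rather than GLSL, and implements a Mesa IR to GLSL converter (similar to glsl-optimizer). Mesa supports IR optimization natively, so HLSLCC supports it too.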

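Diagram of the HLSLCC pipeline when targeting GLSL. The shader compiler takes HLSL source as input, runs it through the MCPP preprocessor first, and then HLSLCC turns it into GLSL source plus a parameter table.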

HLSLCC works in the following main steps:

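  • Preprocessing. The code is run through a C-like preprocessor. UE already preprocesses with MCPP before compiling, so this step is skipped.
  • Parsing. Mesa's _mesa_hlsl_parse function parses the HLSL into an abstract syntax tree. The lexer (lexical analysis) and the parser are generated by flex and bison respectively.
  • Compilation. _mesa_ast_to_hir compiles the AST (abstract syntax tree) into Mesa IR. In this stage the compiler performs implicit conversions, resolves function overloads, generates instructions for intrinsic functions, and so on. It also generates the GLSL main entry point: it adds global declarations for the input and output variables to the IR, computes the inputs of the HLSL entry point, calls the HLSL entry point, and writes the outputs to the global output variables.
  • Optimization. Several optimization passes are run over the IR, mainly through do_optimization_pass, including function inlining, dead code elimination, constant propagation, common subexpression elimination and so on.
  • Uniform packing. Global uniforms are packed into arrays and the mapping information is kept, so that the engine can bind parameters to the relevant parts of the uniform arrays.
  • Final optimization. After the uniforms are packed, a second round of optimization is run over the IR to simplify the code generated during packing.
  • Generate GLSL. The final step converts the optimized IR into GLSL source code. Besides emitting the definitions of all structures and uniform buffers and the source itself, it also writes a mapping table into a comment at the top of the file.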
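
To make the uniform packing step more concrete, here is a small sketch of packing loose global uniforms into one float4 array while recording where each one landed, so that a parameter can later be bound by offset. It is an illustration only: the `Sketch*` names are hypothetical, and the packing rule (a uniform never straddles a float4 boundary) is an assumption chosen for simplicity, not necessarily the layout HLSLCC produces.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// One global uniform as declared by the shader (hypothetical representation).
struct SketchUniform
{
    std::string Name;
    int NumComponents; // 1..4 for float..float4
};

// Where a uniform ended up in the packed array, counted in float components.
struct SketchPackedRange
{
    int ComponentOffset;
    int NumComponents;
};

struct SketchPackedUniforms
{
    int NumVec4s = 0;                                  // size of the packed float4 array
    std::map<std::string, SketchPackedRange> Mapping;  // name -> location, kept for binding
};

// Greedy packing: place uniforms in declaration order and skip to the next
// float4 whenever the current one cannot hold the whole uniform.
inline SketchPackedUniforms PackUniforms(const std::vector<SketchUniform>& Uniforms)
{
    SketchPackedUniforms Result;
    int Offset = 0;
    for (const SketchUniform& Uniform : Uniforms)
    {
        const int UsedInRow = Offset % 4;
        if (UsedInRow + Uniform.NumComponents > 4)
        {
            Offset += 4 - UsedInRow; // padding up to the next float4
        }
        Result.Mapping[Uniform.Name] = SketchPackedRange{Offset, Uniform.NumComponents};
        Offset += Uniform.NumComponents;
    }
    Result.NumVec4s = (Offset + 3) / 4;
    return Result;
}
```

The `Mapping` table plays the role of the mapping information mentioned above: the runtime needs only a name-to-offset lookup to write a parameter into the right slice of the array.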

The source code involved lives under Engine\Source\ThirdParty\hlslcc. The core files are:

  • ast.h
  • glcpp-parse.h
  • glsl_parser_extras.h
  • hlsl_parser.h
  • ir_optimization.h

The core functions involved in the compilation stage are:

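Function | Description
apply_type_conversion | Converts a value of one type to another type, if possible. A parameter controls whether the conversion is implicit or explicit.
arithmetic_result_type | This set of functions determines the result type of applying an operation to the input values.
validate_assignment | Determines whether an rvalue can be assigned to an lvalue of a particular type, applying the permitted implicit conversions where necessary.
do_assignment | Assigns an rvalue to an lvalue, if validate_assignment allows it.
ast_expression::hir | Converts an expression node in the AST into a set of IR instructions.
process_initializer | Applies an initializer expression to a variable.
ast_struct_specifier::hir | Builds an aggregate type to represent the declared structure.
ast_cbuffer_declaration::hir | Builds a struct for the constant buffer layout and stores it as a uniform block.
process_mul | Special-case code for handling the HLSL mul intrinsic.
match_function_by_name | Looks up a function signature by name and list of input parameters.
rank_parameter_lists | Compares two parameter lists and assigns a numeric rank indicating how well they match. A helper for overload resolution: the signature with the lowest rank wins, and if any other signature has the same rank as the lowest, the call is declared ambiguous. A rank of zero indicates an exact match.
gen_texture_op | Handles method calls on the built-in HLSL texture and sampler objects.
_mesa_glsl_initialize_functions | Generates the built-in functions for the HLSL intrinsics. Most functions (such as sin and cos) generate IR code to perform the operation, but some (such as transpose and determinant) keep the function call so that the operation is deferred to the driver's GLSL compiler.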
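
The overload resolution rule described for rank_parameter_lists (lowest rank wins, a tie at the lowest rank is ambiguous, zero means exact match) can be sketched as follows. This is not the Mesa or HLSLCC implementation: the `Sketch*` names are hypothetical, types are plain strings, and the rank is assumed to be the number of arguments needing an int to float conversion, which is far simpler than the real ranking.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using SketchTypeList = std::vector<std::string>;

enum class SketchResolve { NoMatch, Ambiguous, Unique };

struct SketchResolution
{
    SketchResolve Status;
    std::size_t Index; // meaningful only when Status == Unique
};

// Returns 0 for an exact match, the number of implicit conversions otherwise,
// or -1 when the lists cannot match. Only int -> float is modelled here.
inline int RankParameterLists(const SketchTypeList& Params, const SketchTypeList& Args)
{
    if (Params.size() != Args.size())
    {
        return -1;
    }
    int Rank = 0;
    for (std::size_t i = 0; i < Params.size(); ++i)
    {
        if (Params[i] == Args[i])
        {
            continue;
        }
        if (Params[i] == "float" && Args[i] == "int")
        {
            ++Rank;
            continue;
        }
        return -1;
    }
    return Rank;
}

// The lowest rank wins; a second signature with the same lowest rank makes the call ambiguous.
inline SketchResolution ResolveOverload(const std::vector<SketchTypeList>& Signatures, const SketchTypeList& Args)
{
    SketchResolution Best{SketchResolve::NoMatch, 0};
    int BestRank = -1;
    for (std::size_t i = 0; i < Signatures.size(); ++i)
    {
        const int Rank = RankParameterLists(Signatures[i], Args);
        if (Rank < 0)
        {
            continue;
        }
        if (BestRank < 0 || Rank < BestRank)
        {
            Best = SketchResolution{SketchResolve::Unique, i};
            BestRank = Rank;
        }
        else if (Rank == BestRank)
        {
            Best.Status = SketchResolve::Ambiguous;
        }
    }
    return Best;
}
```

Given candidates `(float, int)` and `(int, float)`, a call with `(int, int)` ranks both at 1 and is therefore reported as ambiguous.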

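HLSLCC has gone through several iterations from its first version in UE 4.3 to the current 4.26. In UE 4.22, for example, the cross-platform shader pipeline looks like this: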

Cross-platform shader diagram for UE 4.22. Metal SL is translated from Mesa IR, while Vulkan goes through the multi-step chain Mesa IR -> GLSL -> glslang -> SPIR-V.

In UE 4.25 the cross-platform shader pipeline looks like this:

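Cross-platform shader diagram for UE 4.25. The biggest change is the addition of Shader Conductor, which goes through DXC -> SPIR-V and then translates to Metal, Vulkan, DX and other platforms.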

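In short, the biggest change in UE 4.25 is the new Shader Conductor path, which converts shaders to SPIR-V as the basis for translating to Metal, Vulkan and other platforms.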

Shader Conductor is also a third-party library, located under Engine\Source\ThirdParty\ShaderConductor. Its core files are:

  • ShaderConductor.hpp
  • ShaderConductor.cpp
  • Native.h
  • Native.cpp

Shader Conductor also bundles components such as DirectXShaderCompiler, SPIRV-Cross, SPIRV-Headers and SPIRV-Tools.

UE 4.25's approach closely mirrors the cross-platform shader solution of KlayGE by Minmin Gong (龔敏敏, known online as 叛逆者).

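Vulkan brings not only a brand-new API but also a new intermediate shader format, SPIR-V. This is the most important step on the road to unified cross-platform shader compilation. Judging by the trend, more and more engines and renderers will adopt SPIR-V as their preferred cross-platform solution.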

One more small detail: although Direct3D and OpenGL agree on normalized device coordinates, their UV-space coordinates differ.

To keep shader developers from having to notice this difference, UE flips the images so that UV coordinates follow a single convention.

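The consequence is that textures under OpenGL are actually flipped vertically (RenderDoc captures of UE applications running on OpenGL confirm this). This could be fixed by flipping once more late in rendering. Instead, UE renders upside down and folds the flip into the projection matrix.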

As a result, both the normalized device coordinates and the textures appear vertically flipped relative to D3D.
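
Folding a vertical flip into the projection matrix amounts to negating the entries that produce clip-space Y. The sketch below shows the idea on a bare 4x4 matrix. It is not UE's code: the `Sketch*` types and helper names are hypothetical, and the row-vector convention (clip = v * M) is an assumption made here for the illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Row-major 4x4 matrix used with row vectors: clip = v * M (convention assumed for this sketch).
using SketchMatrix = std::array<std::array<float, 4>, 4>;
using SketchVector = std::array<float, 4>;

inline SketchMatrix Identity()
{
    SketchMatrix M{};
    for (std::size_t i = 0; i < 4; ++i)
    {
        M[i][i] = 1.0f;
    }
    return M;
}

inline SketchVector Transform(const SketchVector& V, const SketchMatrix& M)
{
    SketchVector Out{};
    for (std::size_t Col = 0; Col < 4; ++Col)
    {
        for (std::size_t Row = 0; Row < 4; ++Row)
        {
            Out[Col] += V[Row] * M[Row][Col];
        }
    }
    return Out;
}

// Negates the column that produces clip-space Y, so everything rendered with
// the returned matrix comes out vertically flipped.
inline SketchMatrix FlipClipSpaceY(SketchMatrix Projection)
{
    for (std::size_t Row = 0; Row < 4; ++Row)
    {
        Projection[Row][1] = -Projection[Row][1];
    }
    return Projection;
}
```

Because the flip lives in the matrix, it costs nothing per vertex, and applying it twice restores the original orientation.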

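There are two kinds of shader cache. One is the offline data stored in the DDC, commonly used to speed up work in the editor and during development; see 8.3.1.2 FGlobalShaderMap for details. The other is the runtime shader cache. Earlier versions of UE used FShaderCache for this, but UE 4.26 has removed FShaderCache and replaced it with FShaderPipelineCache.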

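FShaderPipelineCache provides a new mechanism for logging, serializing and precompiling pipeline state objects (PSOs). It caches the PSOs and serializes their initializers to disk so that they can be precompiled the next time the game runs, which reduces hitching. FShaderPipelineCache depends on FShaderCodeLibrary, the Share Material Shader Code setting, and the RHI-side PipelineFileCache.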
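
The log, serialize and precompile cycle described above can be sketched in a few lines. This is a toy model with hypothetical names: a PSO is reduced to a 32-bit hash, "compiling" is just inserting into a set, and the file format is one hash per line. The real cache stores full PSO initializers and precompiles in batches across frames.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <set>
#include <sstream>
#include <vector>

// Toy model of a runtime PSO cache; not the engine's FShaderPipelineCache.
class SketchPipelineCache
{
public:
    // Called when a draw needs a PSO. Returns true when the PSO was already
    // available (no hitch) and false when it had to be created on the spot,
    // in which case it is also logged as newly encountered.
    bool RequestPSO(std::uint32_t Hash)
    {
        if (Compiled.count(Hash) != 0)
        {
            return true;
        }
        Compiled.insert(Hash);
        NewlyLogged.push_back(Hash);
        return false;
    }

    // Serializes every PSO hash seen so far, one per line.
    void Save(std::ostream& Out) const
    {
        for (std::uint32_t Hash : Compiled)
        {
            Out << Hash << '\n';
        }
    }

    // Precompiles everything recorded by a previous run; returns how many PSOs were added.
    int LoadAndPrecompile(std::istream& In)
    {
        int Count = 0;
        std::uint32_t Hash = 0;
        while (In >> Hash)
        {
            if (Compiled.insert(Hash).second)
            {
                ++Count;
            }
        }
        return Count;
    }

    const std::vector<std::uint32_t>& GetNewlyLogged() const { return NewlyLogged; }

private:
    std::set<std::uint32_t> Compiled;
    std::vector<std::uint32_t> NewlyLogged;
};
```

On a second run that starts with `LoadAndPrecompile`, only PSOs never seen before return false from `RequestPSO`, which is the hitch reduction the cache is designed to deliver.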

FShaderPipelineCache is defined as follows:

// Engine\Source\Runtime\RenderCore\Public\ShaderPipelineCache.h

class FShaderPipelineCache : public FTickableObjectRenderThread
{
    // Compile job struct.
    struct CompileJob
    {
        FPipelineCacheFileFormatPSO PSO;
        FShaderPipelineCacheArchive* ReadRequests;
    };

public:
    // Initialize FShaderPipelineCache.
    static void Initialize(EShaderPlatform Platform);
    // Shut down FShaderPipelineCache.
    static void Shutdown();
    // Pause/resume batched precompilation.
    static void PauseBatching();
    static void ResumeBatching();
    
    // Batch mode.
    enum class BatchMode
    {
        Background, // Maximum batch size is set by r.ShaderPipelineCache.BackgroundBatchSize.
        Fast, // Maximum batch size is set by r.ShaderPipelineCache.BatchSize.
        Precompile // Maximum batch size is set by r.ShaderPipelineCache.PrecompileBatchSize.
    };
    
    // Setters and getters.
    static void SetBatchMode(BatchMode Mode);
    static uint32 NumPrecompilesRemaining();
    static uint32 NumPrecompilesActive();
    
    static int32 GetGameVersionForPSOFileCache();
    static bool SetGameUsageMaskWithComparison(uint64 Mask, FPSOMaskComparisonFn InComparisonFnPtr);
    static bool IsBatchingPaused();
    
    // Open the pipeline file cache.
    static bool OpenPipelineFileCache(EShaderPlatform Platform);
    static bool OpenPipelineFileCache(FString const& Name, EShaderPlatform Platform);
    
    // Save/close the pipeline file cache.
    static bool SavePipelineFileCache(FPipelineFileCache::SaveMode Mode);
    static void ClosePipelineFileCache();

    // Constructor/destructor.
    FShaderPipelineCache(EShaderPlatform Platform);
    virtual ~FShaderPipelineCache();

    // Tick-related functions.
    bool IsTickable() const;
    // Per-frame tick.
    void Tick( float DeltaTime );
    bool NeedsRenderingResumedForRenderingThreadTick() const;
    
    TStatId GetStatId() const;
    
    enum ELibraryState
    {
        Opened,
        Closed
    };
    
    // Notification of a shader library state change.
    static void ShaderLibraryStateChanged(ELibraryState State, EShaderPlatform Platform, FString const& Name);

    // Precompile context.
    class FShaderCachePrecompileContext
    {
        bool bSlowPrecompileTask;
    public:
        FShaderCachePrecompileContext() : bSlowPrecompileTask(false) {}
        void SetPrecompilationIsSlowTask() { bSlowPrecompileTask = true; }
        bool IsPrecompilationSlowTask() const { return bSlowPrecompileTask; }
    };

    // Delegate accessors.
    static FShaderCachePreOpenDelegate& GetCachePreOpenDelegate();
    static FShaderCacheOpenedDelegate& GetCacheOpenedDelegate();
    static FShaderCacheClosedDelegate& GetCacheClosedDelegate();
    static FShaderPrecompilationBeginDelegate& GetPrecompilationBeginDelegate();
    static FShaderPrecompilationCompleteDelegate& GetPrecompilationCompleteDelegate();

    (......)
    
private:
    // Data used for batched precompilation.
    static FShaderPipelineCache* ShaderPipelineCache;
    TArray<CompileJob> ReadTasks;
    TArray<CompileJob> CompileTasks;
    TArray<FPipelineCachePSOHeader> OrderedCompileTasks;
    TDoubleLinkedList<FPipelineCacheFileFormatPSORead*> FetchTasks;
    TSet<uint32> CompiledHashes;
    
    FString FileName;
    EShaderPlatform CurrentPlatform;
    FGuid CacheFileGuid;
    uint32 BatchSize;
    
    FShaderCachePrecompileContext ShaderCachePrecompileContext;

    FCriticalSection Mutex;
    TArray<FPipelineCachePSOHeader> PreFetchedTasks;
    TArray<CompileJob> ShutdownReadCompileTasks;
    TDoubleLinkedList<FPipelineCacheFileFormatPSORead*> ShutdownFetchTasks;

    TMap<FBlendStateInitializerRHI, FRHIBlendState*> BlendStateCache;
    TMap<FRasterizerStateInitializerRHI, FRHIRasterizerState*> RasterizerStateCache;
    TMap<FDepthStencilStateInitializerRHI, FRHIDepthStencilState*> DepthStencilStateCache;
    
    (......)
};
           

The data recorded for FShaderPipelineCache's batched precompilation is saved under the project's Saved directory with the .upipelinecache extension:

// Engine\Source\Runtime\RHI\Private\PipelineFileCache.cpp

bool FPipelineFileCache::SavePipelineFileCache(FString const& Name, SaveMode Mode)
{
    bool bOk = false;
    
    // PipelineFileCache must be enabled and PSO logging to the file cache must be on.
    if(IsPipelineFileCacheEnabled() && LogPSOtoFileCache())
    {
        if(FileCache)
        {
            // Platform name used in the saved file name.
            FName PlatformName = FileCache->GetPlatformName();
            // Path of the saved file.
            FString Path = FPaths::ProjectSavedDir() / FString::Printf(TEXT("%s_%s.upipelinecache"), *Name, *PlatformName.ToString());
            // Perform the save.
            bOk = FileCache->SavePipelineFileCache(Path, Mode, Stats, NewPSOs, RequestedOrder, NewPSOUsage);
            
            (......)
        }
    }
    
    return bOk;
}
           

Because this shader cache takes effect at runtime, it has to be integrated into UE's runtime modules. It is in fact driven from within FEngineLoop:

int32 FEngineLoop::PreInitPreStartupScreen(const TCHAR* CmdLine)
{
    (......)
    
    {
        bool bUseCodeLibrary = FPlatformProperties::RequiresCookedData() || GAllowCookedDataInEditorBuilds;
        if (bUseCodeLibrary)
        {
            {
                FShaderCodeLibrary::InitForRuntime(GMaxRHIShaderPlatform);
            }

    #if !UE_EDITOR
            // Cooked data only - but also requires the code library - game only
            if (FPlatformProperties::RequiresCookedData())
            {
                // Initialize FShaderPipelineCache.
                FShaderPipelineCache::Initialize(GMaxRHIShaderPlatform);
            }
    #endif //!UE_EDITOR
        }
    }
    
    (......)
}

int32 FEngineLoop::PreInitPostStartupScreen(const TCHAR* CmdLine)
{
    (......)
    
    IInstallBundleManager* BundleManager = IInstallBundleManager::GetPlatformInstallBundleManager();
    if (BundleManager == nullptr || BundleManager->IsNullInterface())
    {
        (......)

        {
            // Open the game library that contains the material shaders.
            FShaderCodeLibrary::OpenLibrary(FApp::GetProjectName(), FPaths::ProjectContentDir());
            for (const FString& RootDir : FPlatformMisc::GetAdditionalRootDirectories())
            {
                FShaderCodeLibrary::OpenLibrary(FApp::GetProjectName(), FPaths::Combine(RootDir, FApp::GetProjectName(), TEXT("Content")));
            }

            // Open the pipeline file cache.
            FShaderPipelineCache::OpenPipelineFileCache(GMaxRHIShaderPlatform);
        }
    }
    
    (......)
}
           

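In addition, GameEngine responds at runtime to commands that resume and pause batched precompilation. Once FShaderPipelineCache's events are ready, the RHI layer can respond to its events and signals. Vulkan's FVulkanPipelineStateCacheManager is one example: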

// Engine\Source\Runtime\VulkanRHI\Private\VulkanPipeline.h

class FVulkanPipelineStateCacheManager
{
    (......)

private:
    // Delegates that track ShaderPipelineCache precompilation.
    void OnShaderPipelineCacheOpened(FString const& Name, EShaderPlatform Platform, uint32 Count, const FGuid& VersionGuid, FShaderPipelineCache::FShaderCachePrecompileContext& ShaderCachePrecompileContext);
    void OnShaderPipelineCachePrecompilationComplete(uint32 Count, double Seconds, const FShaderPipelineCache::FShaderCachePrecompileContext& ShaderCachePrecompileContext);

    (......)
};
           

To enable the Shader Pipeline Cache, tick the following two options in the project settings (enabled by default):

[Screenshot: the two project settings options]

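The following console variables configure the Shader Pipeline Cache:

Console variable Description
r.ShaderPipelineCache.Enabled Enables the Shader Pipeline Cache so that existing data is loaded from disk and precompiled.
r.ShaderPipelineCache.BatchSize / BackgroundBatchSize Sets the batch size used in each batch mode.
r.ShaderPipelineCache.LogPSO Enables PSO logging for the Shader Pipeline Cache.
r.ShaderPipelineCache.SaveAfterPSOsLogged Sets the expected number of logged PSOs; the cache is saved automatically once this count is reached.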

In addition, within GGameIni or GGameUserSettingsIni, the Shader Pipeline Cache stores its information under the section [ShaderPipelineCache.CacheFile].

This chapter covers shader development examples, debugging techniques, and optimization techniques.

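If the project is still in development, it is best to switch the shader compile options to their development settings. This is done by changing the following entries in Engine\Config\ConsoleVariables.ini:

[Screenshot: the shader development entries in ConsoleVariables.ini]

Simply remove the semicolon in front of each console variable. Their meanings are as follows: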

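r.ShaderDevelopmentMode=1 Provides detailed shader compilation logs and the chance to retry on errors.
r.DumpShaderDebugInfo=1 Saves the files of every compiled shader to disk under ProjectName/Saved/ShaderDebugInfo. This includes the source file, the preprocessed version, and a batch file (for compiling the preprocessed version with command-line options equivalent to those the compiler used).
r.DumpShaderDebugShortNames=1 Shortens the saved shader paths.
r.Shaders.Optimize=0 Disables shader optimization, which makes the compiled shader easier to follow when debugging.
r.Shaders.KeepDebugInfo=1 Keeps debug information, which is especially useful with frame capture tools such as RenderDoc.
r.Shaders.SkipCompression=1 Skips shader compression, which saves time when debugging shaders.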

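With the settings above enabled, a RenderDoc capture shows the shader variables and HLSL source in full (otherwise only assembly instructions are shown) and supports single-step debugging. This noticeably speeds up shader development and debugging.

With r.DumpShaderDebugInfo enabled, change any line in one of UE's built-in shaders (for example, add a space in Common.ush) and restart the UE editor. The shaders are recompiled, and once that completes, useful debugging information appears under ProjectName/Saved/ShaderDebugInfo:

[Screenshot: contents of the ShaderDebugInfo directory]

Opening the directory of a specific material shader shows the source file, the preprocessed version, the batch file, and the hash:

[Screenshot: contents of a material shader's debug directory]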

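In addition, after modifying certain shader files (such as BasePassPixelShader.usf), there is no need to restart the UE editor: enter the RecompileShaders command in the console to recompile the specified shader files. The forms of RecompileShaders are as follows: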

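Command Description
RecompileShaders all Recompiles all shaders, including global, material, and mesh material shaders.
RecompileShaders changed Recompiles shaders whose source has changed.
RecompileShaders global Recompiles global shaders whose source has changed.
RecompileShaders material Recompiles material shaders whose source has changed.
RecompileShaders material <MaterialName> Recompiles the material with the specified name.
RecompileShaders <Path> Recompiles the shader source file at the specified path.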

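Before running any of these commands, the changes to the shader files must be saved.

In addition, when building the project for debugging, set the solution configuration of ShaderCompileWorker (Visual Studio: Build -> Configuration Manager) to Debug_Program:

[Screenshot: Visual Studio Configuration Manager with ShaderCompileWorker set to Debug_Program]

ShaderCompileWorker (SCW) can then be launched with a shader debugging command line: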

PathToGeneratedUsfFile -directcompile -format=ShaderFormat -ShaderType -entry=EntryPoint
           
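  • PathToGeneratedUsfFile is the final usf file in the ShaderDebugInfo folder.
  • ShaderFormat is the shader platform format to debug (PCD3D_SM5 in this example).
  • ShaderType is one of vs/ps/gs/hs/ds/cs, corresponding to the vertex, pixel, geometry, hull, domain, and compute shader types respectively.
  • EntryPoint is the name of the entry point function of this shader in the usf file.

For example: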

<ProjectPath>\Saved\ShaderDebugInfo\PCD3D_SM5\M_Egg\LocalVF\BPPSFNoLMPolicy\BasePassPixelShader.usf -directcompile -format=PCD3D_SM5 -ps -entry=Main
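
The command-line template above is easy to get wrong by hand, so a small helper can assemble it. The following is only a sketch: BuildDirectCompileArgs is a hypothetical function (not part of UE or SCW) that concatenates the four pieces described in the list above into the argument string.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of UE): assembles the argument string for a
// ShaderCompileWorker direct compile, following the template
//   PathToGeneratedUsfFile -directcompile -format=ShaderFormat -ShaderType -entry=EntryPoint
std::string BuildDirectCompileArgs(const std::string& UsfPath,
                                   const std::string& ShaderFormat,
                                   const std::string& ShaderType,
                                   const std::string& EntryPoint)
{
    // ShaderType is one of vs/ps/gs/hs/ds/cs and is passed as a bare switch.
    assert(!UsfPath.empty() && !ShaderType.empty());
    return UsfPath + " -directcompile -format=" + ShaderFormat +
           " -" + ShaderType + " -entry=" + EntryPoint;
}
```

Calling it with the usf path, PCD3D_SM5, ps, and Main yields an argument string of the same shape as the example above.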
           

Set a breakpoint on the CompileD3DShader() function in D3DShaderCompiler.cpp and run SCW from the command line to see how the platform compiler is invoked:

// Engine\Source\Developer\Windows\ShaderFormatD3D\Private\D3DShaderCompiler.cpp

void CompileD3DShader(const FShaderCompilerInput& Input, FShaderCompilerOutput& Output, FShaderCompilerDefinitions& AdditionalDefines, const FString& WorkingDirectory, ELanguage Language)
{
    FString PreprocessedShaderSource;
    const bool bIsRayTracingShader = IsRayTracingShader(Input.Target);
    const bool bUseDXC = bIsRayTracingShader
        || Input.Environment.CompilerFlags.Contains(CFLAG_WaveOperations)
        || Input.Environment.CompilerFlags.Contains(CFLAG_ForceDXC);
    const TCHAR* ShaderProfile = GetShaderProfileName(Input.Target, bUseDXC);

    if(!ShaderProfile)
    {
        Output.Errors.Add(FShaderCompilerError(TEXT("Unrecognized shader frequency")));
        return;
    }

    // Set additional defines.
    AdditionalDefines.SetDefine(TEXT("COMPILER_HLSL"), 1);

    if (bUseDXC)
    {
        AdditionalDefines.SetDefine(TEXT("PLATFORM_SUPPORTS_SM6_0_WAVE_OPERATIONS"), 1);
        AdditionalDefines.SetDefine(TEXT("PLATFORM_SUPPORTS_STATIC_SAMPLERS"), 1);
    }

    if (Input.bSkipPreprocessedCache)
    {
        if (!FFileHelper::LoadFileToString(PreprocessedShaderSource, *Input.VirtualSourceFilePath))
        {
            return;
        }

        // Cast away const, since this is a debug-only path.
        CrossCompiler::CreateEnvironmentFromResourceTable(PreprocessedShaderSource, (FShaderCompilerEnvironment&)Input.Environment);
    }
    else
    {
        if (!PreprocessShader(PreprocessedShaderSource, Output, Input, AdditionalDefines))
        {
            return;
        }
    }

    GD3DAllowRemoveUnused = Input.Environment.CompilerFlags.Contains(CFLAG_ForceRemoveUnusedInterpolators) ? 1 : 0;

    FString EntryPointName = Input.EntryPointName;

    Output.bFailedRemovingUnused = false;
    if (GD3DAllowRemoveUnused == 1 && Input.Target.Frequency == SF_Vertex && Input.bCompilingForShaderPipeline)
    {
        // Always add SV_POSITION.
        TArray<FString> UsedOutputs = Input.UsedOutputs;
        UsedOutputs.AddUnique(TEXT("SV_POSITION"));

        // Output-only system-value semantics must never be removed.
        TArray<FString> Exceptions;
        Exceptions.AddUnique(TEXT("SV_ClipDistance"));
        Exceptions.AddUnique(TEXT("SV_ClipDistance0"));
        Exceptions.AddUnique(TEXT("SV_ClipDistance1"));
        Exceptions.AddUnique(TEXT("SV_ClipDistance2"));
        Exceptions.AddUnique(TEXT("SV_ClipDistance3"));
        Exceptions.AddUnique(TEXT("SV_ClipDistance4"));
        Exceptions.AddUnique(TEXT("SV_ClipDistance5"));
        Exceptions.AddUnique(TEXT("SV_ClipDistance6"));
        Exceptions.AddUnique(TEXT("SV_ClipDistance7"));

        Exceptions.AddUnique(TEXT("SV_CullDistance"));
        Exceptions.AddUnique(TEXT("SV_CullDistance0"));
        Exceptions.AddUnique(TEXT("SV_CullDistance1"));
        Exceptions.AddUnique(TEXT("SV_CullDistance2"));
        Exceptions.AddUnique(TEXT("SV_CullDistance3"));
        Exceptions.AddUnique(TEXT("SV_CullDistance4"));
        Exceptions.AddUnique(TEXT("SV_CullDistance5"));
        Exceptions.AddUnique(TEXT("SV_CullDistance6"));
        Exceptions.AddUnique(TEXT("SV_CullDistance7"));
        
        DumpDebugShaderUSF(PreprocessedShaderSource, Input);

        TArray<FString> Errors;
        if (!RemoveUnusedOutputs(PreprocessedShaderSource, UsedOutputs, Exceptions, EntryPointName, Errors))
        {
            DumpDebugShaderUSF(PreprocessedShaderSource, Input);
            UE_LOG(LogD3D11ShaderCompiler, Warning, TEXT("Failed to Remove unused outputs [%s]!"), *Input.DumpDebugInfoPath);
            for (int32 Index = 0; Index < Errors.Num(); ++Index)
            {
                FShaderCompilerError NewError;
                NewError.StrippedErrorMessage = Errors[Index];
                Output.Errors.Add(NewError);
            }
            Output.bFailedRemovingUnused = true;
        }
    }

    FShaderParameterParser ShaderParameterParser;
    if (!ShaderParameterParser.ParseAndMoveShaderParametersToRootConstantBuffer(
        Input, Output, PreprocessedShaderSource,
        IsRayTracingShader(Input.Target) ? TEXT("cbuffer") : nullptr))
    {
        return;
    }

    RemoveUniformBuffersFromSource(Input.Environment, PreprocessedShaderSource);

    uint32 CompileFlags = D3D10_SHADER_ENABLE_BACKWARDS_COMPATIBILITY
        // Unpack uniform matrices as row-major to match the CPU layout.
        | D3D10_SHADER_PACK_MATRIX_ROW_MAJOR;

    if (Input.Environment.CompilerFlags.Contains(CFLAG_Debug)) 
    {
        // Add the debug flags.
        CompileFlags |= D3D10_SHADER_DEBUG | D3D10_SHADER_SKIP_OPTIMIZATION;
    }
    else
    {
        if (Input.Environment.CompilerFlags.Contains(CFLAG_StandardOptimization))
        {
            CompileFlags |= D3D10_SHADER_OPTIMIZATION_LEVEL1;
        }
        else
        {
            CompileFlags |= D3D10_SHADER_OPTIMIZATION_LEVEL3;
        }
    }

    for (int32 FlagIndex = 0; FlagIndex < Input.Environment.CompilerFlags.Num(); FlagIndex++)
    {
        // Accumulate the translated compiler flags.
        CompileFlags |= TranslateCompilerFlagD3D11((ECompilerFlags)Input.Environment.CompilerFlags[FlagIndex]);
    }

    TArray<FString> FilteredErrors;
    if (bUseDXC)
    {
        if (!CompileAndProcessD3DShaderDXC(PreprocessedShaderSource, CompileFlags, Input, EntryPointName, ShaderProfile, Language, false, FilteredErrors, Output))
        {
            if (!FilteredErrors.Num())
            {
                FilteredErrors.Add(TEXT("Compile Failed without errors!"));
            }
        }
        CrossCompiler::FShaderConductorContext::ConvertCompileErrors(MoveTemp(FilteredErrors), Output.Errors);
    }
    else
    {
        // Override the default compiler path with the newer DLL.
        FString CompilerPath = FPaths::EngineDir();
        CompilerPath.Append(TEXT("Binaries/ThirdParty/Windows/DirectX/x64/d3dcompiler_47.dll"));

        if (!CompileAndProcessD3DShaderFXC(PreprocessedShaderSource, CompilerPath, CompileFlags, Input, EntryPointName, ShaderProfile, false, FilteredErrors, Output))
        {
            if (!FilteredErrors.Num())
            {
                FilteredErrors.Add(TEXT("Compile Failed without errors!"));
            }
        }

        // Process the errors.
        for (int32 ErrorIndex = 0; ErrorIndex < FilteredErrors.Num(); ErrorIndex++)
        {
            const FString& CurrentError = FilteredErrors[ErrorIndex];
            FShaderCompilerError NewError;

            // Extract filename and line number from FXC output with format:
            // "d:\UE4\Binaries\BasePassPixelShader(30,7): error X3000: invalid target or usage string"
            int32 FirstParenIndex = CurrentError.Find(TEXT("("));
            int32 LastParenIndex = CurrentError.Find(TEXT("):"));
            if (FirstParenIndex != INDEX_NONE &&
                LastParenIndex != INDEX_NONE &&
                LastParenIndex > FirstParenIndex)
            {
                // Extract and store error message with source filename
                NewError.ErrorVirtualFilePath = CurrentError.Left(FirstParenIndex);
                NewError.ErrorLineString = CurrentError.Mid(FirstParenIndex + 1, LastParenIndex - FirstParenIndex - FCString::Strlen(TEXT("(")));
                NewError.StrippedErrorMessage = CurrentError.Right(CurrentError.Len() - LastParenIndex - FCString::Strlen(TEXT("):")));
            }
            else
            {
                NewError.StrippedErrorMessage = CurrentError;
            }
            Output.Errors.Add(NewError);
        }
    }

    const bool bDirectCompile = FParse::Param(FCommandLine::Get(), TEXT("directcompile"));
    if (bDirectCompile)
    {
        for (const auto& Error : Output.Errors)
        {
            FPlatformMisc::LowLevelOutputDebugStringf(TEXT("%s\n"), *Error.GetErrorStringWithLineMarker());
        }
    }

    ShaderParameterParser.ValidateShaderParameterTypes(Input, Output);

    if (Input.ExtraSettings.bExtractShaderSource)
    {
        Output.OptionalFinalShaderSource = PreprocessedShaderSource;
    }
}
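
The FXC error handling near the end of CompileD3DShader splits each message of the form "file(line,column): message" into three parts. The same logic can be sketched in standalone C++ with std::string. ParsedShaderError and ParseFxcError are hypothetical names introduced for illustration; the field names only loosely mirror FShaderCompilerError.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the three FShaderCompilerError fields filled above.
struct ParsedShaderError
{
    std::string FilePath;    // text before the first "("
    std::string LineString;  // text between "(" and "):", e.g. "30,7"
    std::string Message;     // text after "):"
};

// Splits an FXC-style message such as
//   d:\UE4\Binaries\BasePassPixelShader(30,7): error X3000: ...
// If the pattern is not found, the whole string becomes the message.
ParsedShaderError ParseFxcError(const std::string& Error)
{
    ParsedShaderError Result;
    const std::size_t FirstParen = Error.find('(');
    const std::size_t LastParen = Error.find("):");
    if (FirstParen != std::string::npos &&
        LastParen != std::string::npos &&
        LastParen > FirstParen)
    {
        Result.FilePath = Error.substr(0, FirstParen);
        Result.LineString = Error.substr(FirstParen + 1, LastParen - FirstParen - 1);
        Result.Message = Error.substr(LastParen + 2);
    }
    else
    {
        Result.Message = Error;
    }
    return Result;
}
```

As in the engine code, the search takes the first "(" in the string, so a path that itself contains parentheses would be split in the wrong place. The sketch keeps that behavior rather than hiding it.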
           

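In addition, without tools such as RenderDoc, the data being debugged can be converted into color values within a sensible range so that its correctness can be checked visually, for example:

// Divide the world position by a range value and output it as a color.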
OutColor = frac(WorldPosition / 1000);
           

Combined with the RecompileShaders command, this technique is very handy and efficient.
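
To read such a debug color correctly it helps to know what frac does to the value. The CPU-side sketch below reproduces the mapping in plain C++; Float3, Frac, and DebugColorFromWorldPosition are hypothetical names, and the default range of 1000 is taken from the HLSL line above.

```cpp
#include <cassert>
#include <cmath>

struct Float3 { float X, Y, Z; };

// Same definition as HLSL frac(): the fractional part, always in [0, 1).
float Frac(float V)
{
    return V - std::floor(V);
}

// CPU-side equivalent of: OutColor = frac(WorldPosition / 1000);
// The color repeats every Range world units along each axis.
Float3 DebugColorFromWorldPosition(const Float3& P, float Range = 1000.0f)
{
    assert(Range > 0.0f);
    return { Frac(P.X / Range), Frac(P.Y / Range), Frac(P.Z / Range) };
}
```

Because the result wraps, positions 500 and 1500 produce the same channel value, and negative coordinates map into [0, 1) as well. Both need to be kept in mind when interpreting the picture.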

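Rendering optimization techniques are many and varied, ranging from the system, architecture, and engineering levels down to individual statements. This section focuses on common shader optimization techniques in the UE environment.

Because UE adopts the Uber Shader design, a single shader source file contains a large number of macro definitions, and different values of these macros combine into a very large number of compiled variants. These macros are usually controlled by permutations. Keeping the number of permutations under control reduces the number of shaders compiled and the compile time, and improves runtime efficiency.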

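In the project settings, some options can be unticked to reduce the number of permutations:

[Screenshot: project settings options that affect shader permutations]

Note, however, that unticking an option means the engine disables that feature. Weigh the trade-off against actual needs rather than optimizing for its own sake.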

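In addition, many built-in types in the engine's rendering module provide a ShouldCompilePermutation interface, so that before actually compiling, the compiler can ask the object being compiled whether a given permutation is needed. If it returns false, the compiler skips that permutation, which reduces the number of shaders. Types that support ShouldCompilePermutation include, but are not limited to: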

  • FShader
  • FGlobalShader
  • FMaterialShader
  • FMeshMaterialShader
  • FVertexFactory
  • FLocalVertexFactory
  • FShaderType
  • FGlobalShaderType
  • FMaterialShaderType
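  • Subclasses of the above types

Therefore, when adding subclasses of the above types, it is worth treating ShouldCompilePermutation seriously so that useless shader permutations are culled.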
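
The effect of such a filter on the shader count can be illustrated with a toy model. The sketch below does not use any UE types: FToyPermutation, its three boolean dimensions, and the rule in SkipUnusedCombinations are all made up for illustration. It enumerates the 2^3 permutations and counts how many a ShouldCompilePermutation-style predicate lets through.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical permutation with three boolean dimensions (2^3 = 8 variants).
struct FToyPermutation
{
    bool bUseFog;
    bool bUseLightmap;
    bool bUseSkinning;
};

// Decodes a permutation id into its dimensions, one bit per dimension.
FToyPermutation DecodePermutation(uint32_t Id)
{
    return { (Id & 1u) != 0, (Id & 2u) != 0, (Id & 4u) != 0 };
}

// Counts the permutations that the given filter asks to compile.
int CountCompiledPermutations(const std::function<bool(const FToyPermutation&)>& ShouldCompile)
{
    int Count = 0;
    for (uint32_t Id = 0; Id < 8; ++Id)
    {
        if (ShouldCompile(DecodePermutation(Id)))
        {
            ++Count;
        }
    }
    return Count;
}

// Default behavior: every permutation is compiled.
bool CompileEverything(const FToyPermutation&)
{
    return true;
}

// Assumed project rule: lightmaps are never combined with skinning,
// so those combinations are skipped.
bool SkipUnusedCombinations(const FToyPermutation& P)
{
    return !(P.bUseLightmap && P.bUseSkinning);
}
```

With the default filter all 8 variants are compiled; the stricter rule drops the 2 that combine lightmaps with skinning, leaving 6. Each additional boolean dimension doubles the total, which is why culling early matters.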

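For materials, the Automatically Set Usage in Editor option in the material's property panel can be turned off, which prevents extra usage flags from being added during editing and increasing the number of shader permutations:

[Screenshot: the Automatically Set Usage in Editor option in the material properties]

The gain may not be noticeable, however, and a missing usage flag can stop the material from working correctly (for example, no skeletal mesh skinning support or no blend shape support).

In addition, add Switch nodes with care, since they usually increase the number of permutations as well:

[Screenshot: Switch nodes in the material editor]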

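  • Avoid if and switch branches.
  • Avoid for loops, especially those with a variable iteration count.
  • Reduce the number of texture samples.
  • Avoid clip and discard operations.
  • Reduce calls to complex math functions.
  • Use lower-precision floats. OpenGL ES has three float precisions: highp (typically 32-bit float), mediump (typically 16-bit float), and lowp (the lowest precision; the ES 2.0 minimum is a 10-bit fixed-point format, often described as 8-bit). Many computations do not need high precision and can use a lower one.
  • Avoid redundant computation. Values that are the same for every pixel can be precomputed or passed in from C++: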
    precision mediump float;
    float a = 0.9;
    float b = 0.6;
    
    varying vec4 vColor;
    
    void main()
    {
        gl_FragColor = vColor * a * b; // a * b is evaluated for every pixel, which is redundant. Compute a * b in C++ and pass the result into the shader instead.
    }
               
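  • Defer vector operations.
    highp float f0, f1;
    highp vec4 v0, v1;
    
    v0 = (v1 * f0) * f1; // v1 * f0 yields a vector that is then multiplied by f1: two vector multiplications.
    // Change to:
    v0 = v1 * (f0 * f1); // Multiply the two floats first, so only one vector multiplication is needed.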
               
  • Make full use of vector component masks.
    highp vec4 v0;
    highp vec4 v1;
    highp vec4 v2;
    v2.xz = v0.xz * v1.xz; // Only the xz components are computed, which is cheaper than v2 = v0 * v1.
               
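  • Avoid or reduce temporary variables.
  • Move Pixel Shader computation to the Vertex Shader where possible, for example by replacing per-pixel lighting with per-vertex lighting.
  • Move computation that does not depend on the vertex or pixel to the CPU, then pass the result in through a uniform.
  • Use tiered strategies: different quality levels and platforms use algorithms of different complexity.
  • Vertex inputs should use an interleaved, per-structure layout (array of structures) rather than a separate array per vertex attribute. The per-structure layout improves the GPU cache hit rate.
  • Use Compute Shaders in place of the traditional VS/PS pipeline wherever possible. The CS pipeline is simpler and more direct, lends itself to parallel computation, and combined with the LDS mechanism can improve efficiency noticeably.
  • Render at reduced resolution. Some information does not need full-resolution rendering, such as blurry reflections, SSR, and SSGI.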
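
The "defer vector operations" item in the list above can be checked by counting scalar multiplications. The sketch below is a CPU-side model, not shader code: Vec4, Mul, and the global counter are hypothetical helpers, and it assumes a vector-times-scalar multiply costs four scalar multiplies.

```cpp
#include <cassert>

// Counts every scalar multiplication performed through Mul().
static int GScalarMultiplyCount = 0;

struct Vec4 { float X, Y, Z, W; };

// Scalar * scalar: one multiplication.
float Mul(float A, float B)
{
    ++GScalarMultiplyCount;
    return A * B;
}

// Vector * scalar: four multiplications, one per component.
Vec4 Mul(const Vec4& V, float S)
{
    return { Mul(V.X, S), Mul(V.Y, S), Mul(V.Z, S), Mul(V.W, S) };
}

// v0 = (v1 * f0) * f1: two vector-by-scalar products.
Vec4 ScaleNaive(const Vec4& V, float F0, float F1)
{
    return Mul(Mul(V, F0), F1);
}

// v0 = v1 * (f0 * f1): one scalar product, then one vector-by-scalar product.
Vec4 ScaleDeferred(const Vec4& V, float F0, float F1)
{
    return Mul(V, Mul(F0, F1));
}

// Runs one of the two variants on fixed inputs and returns the multiply count.
int CountMultiplies(Vec4 (*Fn)(const Vec4&, float, float))
{
    GScalarMultiplyCount = 0;
    Fn(Vec4{1.0f, 2.0f, 3.0f, 4.0f}, 0.5f, 4.0f);
    return GScalarMultiplyCount;
}
```

Under this model the naive form costs 8 multiplications and the deferred form 5, with the same result for these inputs. Real GPUs and shader compilers may reorder or fuse operations, so the numbers are only indicative.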

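Working through development examples helps consolidate one's grasp of the UE shader system.

This section adds a brand-new, minimal Global Shader to illustrate the process and steps of adding a shader.

First, add a new shader source file, named MyTest.usf here:

// VS entry point.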
void MainVS(
    in float4 InPosition : ATTRIBUTE0,
    out float4 Output : SV_POSITION)
{
    Output = InPosition;
}


// Color variable, passed in from the C++ side.
float4 MyColor;

// PS entry point.
float4 MainPS() : SV_Target0
{
    return MyColor;
}
           

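Next, add the corresponding VS and PS on the C++ side. Note that this sample follows the older global shader interfaces (ShouldCache, Serialize, and so on), so it may need adapting for recent engine versions: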

#include "GlobalShader.h"

// VS, derived from FGlobalShader.
class FMyVS : public FGlobalShader
{
    DECLARE_EXPORTED_SHADER_TYPE(FMyVS, Global, /*MYMODULE_API*/);

    FMyVS() {}
    FMyVS(const ShaderMetaType::CompiledShaderInitializerType& Initializer)
        : FGlobalShader(Initializer)
    {
    }

    static bool ShouldCache(EShaderPlatform Platform)
    {
        return true;
    }
};

// Implement the VS.
IMPLEMENT_SHADER_TYPE(, FMyVS, TEXT("MyTest"), TEXT("MainVS"), SF_Vertex);


// PS, derived from FGlobalShader.
class FMyPS : public FGlobalShader
{
    DECLARE_EXPORTED_SHADER_TYPE(FMyPS, Global, /*MYMODULE_API*/);

    FShaderParameter MyColorParameter;

    FMyPS() {}
    FMyPS(const ShaderMetaType::CompiledShaderInitializerType& Initializer)
        : FGlobalShader(Initializer)
    {
        // Bind the shader parameter.
        MyColorParameter.Bind(Initializer.ParameterMap, TEXT("MyColor"), SPF_Mandatory);
    }

    static void ModifyCompilationEnvironment(EShaderPlatform Platform, FShaderCompilerEnvironment& OutEnvironment)
    {
        FGlobalShader::ModifyCompilationEnvironment(Platform, OutEnvironment);
        // Add a define.
        OutEnvironment.SetDefine(TEXT("MY_DEFINE"), 1);
    }

    static bool ShouldCache(EShaderPlatform Platform)
    {
        return true;
    }

    // Serialization.
    virtual bool Serialize(FArchive& Ar) override
    {
        bool bShaderHasOutdatedParameters = FGlobalShader::Serialize(Ar);
        Ar << MyColorParameter;
        return bShaderHasOutdatedParameters;
    }

    void SetColor(FRHICommandList& RHICmdList, const FLinearColor& Color)
    {
        // Send the color to the RHI.
        SetShaderValue(RHICmdList, RHICmdList.GetBoundPixelShader(), MyColorParameter, Color);
    }
};

// Implement the PS.
IMPLEMENT_SHADER_TYPE(, FMyPS, TEXT("MyTest"), TEXT("MainPS"), SF_Pixel);
           

Finally, write the rendering code that uses the custom VS and PS above:

void RenderMyTest(FRHICommandList& RHICmdList, ERHIFeatureLevel::Type FeatureLevel, const FLinearColor& Color)
{
    // Get the global shader map.
    auto ShaderMap = GetGlobalShaderMap(FeatureLevel);

    // Get the VS and PS instances.
    TShaderMapRef<FMyVS> MyVS(ShaderMap);
    TShaderMapRef<FMyPS> MyPS(ShaderMap);

    // Bound shader state.
    static FGlobalBoundShaderState MyTestBoundShaderState;
    SetGlobalBoundShaderState(RHICmdList, FeatureLevel, MyTestBoundShaderState, GetVertexDeclarationFVector4(), *MyVS, *MyPS);

    // Set the PS color.
    MyPS->SetColor(RHICmdList, Color);

    // Set the render states.
    RHICmdList.SetRasterizerState(TStaticRasterizerState<FM_Solid, CM_None>::GetRHI());
    RHICmdList.SetBlendState(TStaticBlendState<>::GetRHI());
    RHICmdList.SetDepthStencilState(TStaticDepthStencilState<false, CF_Always>::GetRHI(), 0);

    // Create the vertices of a full-screen quad.
    FVector4 Vertices[4];
    Vertices[0].Set(-1.0f, 1.0f, 0, 1.0f);
    Vertices[1].Set(1.0f, 1.0f, 0, 1.0f);
    Vertices[2].Set(-1.0f, -1.0f, 0, 1.0f);
    Vertices[3].Set(1.0f, -1.0f, 0, 1.0f);

    // Draw the quad.
    DrawPrimitiveUP(RHICmdList, PT_TriangleStrip, 2, Vertices, sizeof(Vertices[0]));
}
           

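Once RenderMyTest is implemented, it can be called from FDeferredShadingSceneRenderer::RenderFinish to hook it into the main rendering flow:

// Console variable, so the effect can be toggled at runtime.
static TAutoConsoleVariable<int32> CVarMyTest(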
    TEXT("r.MyTest"),
    0,
    TEXT("Test My Global Shader, set it to 0 to disable, or to 1, 2 or 3 for fun!"),
    ECVF_RenderThreadSafe
);

void FDeferredShadingSceneRenderer::RenderFinish(FRHICommandListImmediate& RHICmdList)
{
    (......)
    
    // Custom code added here; it draws over what UE rendered earlier.
    int32 MyTestValue = CVarMyTest.GetValueOnAnyThread();
    if (MyTestValue != 0)
    {
        FLinearColor Color(MyTestValue == 1, MyTestValue == 2, MyTestValue == 3, 1);
        RenderMyTest(RHICmdList, FeatureLevel, Color);
    }

    FSceneRenderer::RenderFinish(RHICmdList);
    
    (......)
}
           

The color finally rendered by the logic above is determined by r.MyTest: 0 disables it, 1 shows red, 2 shows green, and 3 shows blue.
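
The mapping from the console variable to the color relies on bool-to-float conversion in the FLinearColor constructor call. A standalone sketch makes the resulting values explicit; FColorRGBA, ColorFromMyTestValue, and IsMyTestEnabled are hypothetical names, not engine types.

```cpp
#include <cassert>

// Hypothetical stand-in for FLinearColor.
struct FColorRGBA { float R, G, B, A; };

// Mirrors: FLinearColor Color(MyTestValue == 1, MyTestValue == 2, MyTestValue == 3, 1);
// Each comparison yields 1.0 when true and 0.0 when false.
FColorRGBA ColorFromMyTestValue(int MyTestValue)
{
    return {
        MyTestValue == 1 ? 1.0f : 0.0f,
        MyTestValue == 2 ? 1.0f : 0.0f,
        MyTestValue == 3 ? 1.0f : 0.0f,
        1.0f
    };
}

// Mirrors the `MyTestValue != 0` check that guards the draw call.
bool IsMyTestEnabled(int MyTestValue)
{
    return MyTestValue != 0;
}
```

One consequence worth noting: any non-zero value other than 1, 2, or 3 still enables the pass, and the quad is then drawn in opaque black.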

The process of adding a new FVertexFactory subclass is as follows:

// FMyVertexFactory.h

// Declare the vertex factory shader parameters.
BEGIN_GLOBAL_SHADER_PARAMETER_STRUCT(FMyVertexFactoryParameters, )
    SHADER_PARAMETER(FVector4, Color)
END_GLOBAL_SHADER_PARAMETER_STRUCT()

// Declare the uniform buffer reference type.
typedef TUniformBufferRef<FMyVertexFactoryParameters> FMyVertexFactoryBufferRef;

// Index buffer.
class FMyMeshIndexBuffer : public FIndexBuffer
{
public:
    FMyMeshIndexBuffer(int32 InNumQuadsPerSide) : NumQuadsPerSide(InNumQuadsPerSide) {}

    void InitRHI() override
    {
        if (NumQuadsPerSide < 256)
        {
            IndexBufferRHI = CreateIndexBuffer<uint16>();
        }
        else
        {
            IndexBufferRHI = CreateIndexBuffer<uint32>();
        }
    }

    int32 GetIndexCount() const { return NumIndices; };

private:
    template <typename IndexType>
    FIndexBufferRHIRef CreateIndexBuffer()
    {
        TResourceArray<IndexType, INDEXBUFFER_ALIGNMENT> Indices;

        // Reserve memory for the vertex indices.
        Indices.Reserve(NumQuadsPerSide * NumQuadsPerSide * 6);

        // Build the index buffer in Morton order for better vertex reuse.
        for (int32 Morton = 0; Morton < NumQuadsPerSide * NumQuadsPerSide; Morton++)
        {
            int32 SquareX = FMath::ReverseMortonCode2(Morton);
            int32 SquareY = FMath::ReverseMortonCode2(Morton >> 1);

            bool ForwardDiagonal = false;

            if (SquareX % 2)
            {
                ForwardDiagonal = !ForwardDiagonal;
            }
            if (SquareY % 2)
            {
                ForwardDiagonal = !ForwardDiagonal;
            }

            int32 Index0 = SquareX + SquareY * (NumQuadsPerSide + 1);
            int32 Index1 = Index0 + 1;
            int32 Index2 = Index0 + (NumQuadsPerSide + 1);
            int32 Index3 = Index2 + 1;

            Indices.Add(Index3);
            Indices.Add(Index1);
            Indices.Add(ForwardDiagonal ? Index2 : Index0);
            Indices.Add(Index0);
            Indices.Add(Index2);
            Indices.Add(ForwardDiagonal ? Index1 : Index3);
        }

        NumIndices = Indices.Num();
        const uint32 Size = Indices.GetResourceDataSize();
        const uint32 Stride = sizeof(IndexType);

        // Create index buffer. Fill buffer with initial data upon creation
        FRHIResourceCreateInfo CreateInfo(&Indices);
        return RHICreateIndexBuffer(Stride, Size, BUF_Static, CreateInfo);
    }

    int32 NumIndices = 0;
    const int32 NumQuadsPerSide = 0;
};

// Vertex buffer.
class FMyMeshVertexBuffer : public FVertexBuffer
{
public:
    FMyMeshVertexBuffer(int32 InNumQuadsPerSide) : NumQuadsPerSide(InNumQuadsPerSide) {}

    virtual void InitRHI() override
    {
        const uint32 NumVertsPerSide = NumQuadsPerSide + 1;
        
        NumVerts = NumVertsPerSide * NumVertsPerSide;

        FRHIResourceCreateInfo CreateInfo;
        void* BufferData = nullptr;
        VertexBufferRHI = RHICreateAndLockVertexBuffer(sizeof(FVector4) * NumVerts, BUF_Static, CreateInfo, BufferData);
        FVector4* DummyContents = (FVector4*)BufferData;

        for (uint32 VertY = 0; VertY < NumVertsPerSide; VertY++)
        {
            FVector4 VertPos;
            VertPos.Y = (float)VertY / NumQuadsPerSide - 0.5f;

            for (uint32 VertX = 0; VertX < NumVertsPerSide; VertX++)
            {
                VertPos.X = (float)VertX / NumQuadsPerSide - 0.5f;

                DummyContents[NumVertsPerSide * VertY + VertX] = VertPos;
            }
        }

        RHIUnlockVertexBuffer(VertexBufferRHI);
    }

    int32 GetVertexCount() const { return NumVerts; }

private:
    int32 NumVerts = 0;
    const int32 NumQuadsPerSide = 0;
};

// Vertex factory.
class FMyVertexFactory : public FVertexFactory
{
    DECLARE_VERTEX_FACTORY_TYPE(FMyVertexFactory);

public:
    using Super = FVertexFactory;

    FMyVertexFactory(ERHIFeatureLevel::Type InFeatureLevel);
    ~FMyVertexFactory();

    virtual void InitRHI() override;
    virtual void ReleaseRHI() override;

    static bool ShouldCompilePermutation(const FVertexFactoryShaderPermutationParameters& Parameters);
    static void ModifyCompilationEnvironment(const FVertexFactoryShaderPermutationParameters& Parameters, FShaderCompilerEnvironment& OutEnvironment);
    static void ValidateCompiledResult(const FVertexFactoryType* Type, EShaderPlatform Platform, const FShaderParameterMap& ParameterMap, TArray<FString>& OutErrors);

    inline const FUniformBufferRHIRef GetMyVertexFactoryUniformBuffer() const { return UniformBuffer; }

private:
    void SetupUniformData();

    // Number of per-instance vertex streams (InstanceData0 and InstanceData1 in the HLSL).
    // Assumed to be 2 here; InitRHI() uses this name, but the excerpt does not define it.
    static constexpr int32 NumAdditionalVertexStreams = 2;

    FMyMeshVertexBuffer* VertexBuffer = nullptr;
    FMyMeshIndexBuffer* IndexBuffer = nullptr;

    FMyVertexFactoryBufferRef UniformBuffer;
};


// FMyVertexFactory.cpp

#include "ShaderParameterUtils.h"

// Implement FMyVertexFactoryParameters; note that its name in the shader is MyVF.
IMPLEMENT_GLOBAL_SHADER_PARAMETER_STRUCT(FMyVertexFactoryParameters, "MyVF");


// Vertex factory shader parameters.
class FMyVertexFactoryShaderParameters : public FVertexFactoryShaderParameters
{
    DECLARE_TYPE_LAYOUT(FMyVertexFactoryShaderParameters, NonVirtual);

public:
    
    void Bind(const FShaderParameterMap& ParameterMap)
    {
    }

    void GetElementShaderBindings(
        const class FSceneInterface* Scene,
        const class FSceneView* View,
        const class FMeshMaterialShader* Shader,
        const EVertexInputStreamType InputStreamType,
        ERHIFeatureLevel::Type FeatureLevel,
        const class FVertexFactory* InVertexFactory,
        const struct FMeshBatchElement& BatchElement,
        class FMeshDrawSingleShaderBindings& ShaderBindings,
        FVertexInputStreamArray& VertexStreams) const
    {
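        // Cast to FMyVertexFactory.
        FMyVertexFactory* VertexFactory = (FMyVertexFactory*)InVertexFactory;

        // Add the uniform buffer binding to the shader bindings table.
        ShaderBindings.Add(Shader->GetUniformBufferParameter<FMyVertexFactoryParameters>(), VertexFactory->GetMyVertexFactoryUniformBuffer());

        // Fill in the vertex streams.
        // NOTE: InstanceDataBuffers and InstanceOffsetValue are not defined in this excerpt;
        // they are assumed to be supplied by the caller (for example through BatchElement).
        if (VertexStreams.Num() > 0)
        {
            // Look up each instance stream by its stream index.
            for (int32 i = 0; i < 2; ++i)
            {
                FVertexInputStream* InstanceInputStream = VertexStreams.FindByPredicate([i](const FVertexInputStream& InStream) { return InStream.StreamIndex == i+1; });
                // Bind the vertex buffer of the instance stream.
                InstanceInputStream->VertexBuffer = InstanceDataBuffers->GetBuffer(i);
            }

            // Apply the instance offset.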
            if (InstanceOffsetValue > 0)
            {
                VertexFactory->OffsetInstanceStreams(InstanceOffsetValue, InputStreamType, VertexStreams);
            }
        }
    }
};

// ----------- Vertex factory implementation -----------

FMyVertexFactory::FMyVertexFactory(ERHIFeatureLevel::Type InFeatureLevel)
    : Super(InFeatureLevel)
{
    VertexBuffer = new FMyMeshVertexBuffer(16);
    IndexBuffer = new FMyMeshIndexBuffer(16);
}

FMyVertexFactory::~FMyVertexFactory()
{
    delete VertexBuffer;
    delete IndexBuffer;
}

void FMyVertexFactory::InitRHI()
{
    Super::InitRHI();

    // Set up the uniform data.
    SetupUniformData();

    VertexBuffer->InitResource();
    IndexBuffer->InitResource();

    // Vertex stream: position.
    FVertexStream PositionVertexStream;
    PositionVertexStream.VertexBuffer = VertexBuffer;
    PositionVertexStream.Stride = sizeof(FVector4);
    PositionVertexStream.Offset = 0;
    PositionVertexStream.VertexStreamUsage = EVertexStreamUsage::Default;

    // Simple per-instance vertex stream; its VertexBuffer is set at bind time.
    FVertexStream InstanceDataVertexStream;
    InstanceDataVertexStream.VertexBuffer = nullptr;
    InstanceDataVertexStream.Stride = sizeof(FVector4);
    InstanceDataVertexStream.Offset = 0;
    InstanceDataVertexStream.VertexStreamUsage = EVertexStreamUsage::Instancing;

    FVertexElement VertexPositionElement(Streams.Add(PositionVertexStream), 0, VET_Float4, 0, PositionVertexStream.Stride, false);

    // Vertex declaration.
    FVertexDeclarationElementList Elements;
    Elements.Add(VertexPositionElement);

    // Add the instance vertex streams.
    for (int32 StreamIdx = 0; StreamIdx < NumAdditionalVertexStreams; ++StreamIdx)
    {
        FVertexElement InstanceElement(Streams.Add(InstanceDataVertexStream), 0, VET_Float4, 8 + StreamIdx, InstanceDataVertexStream.Stride, true);
        Elements.Add(InstanceElement);
    }

    // Initialize the declaration.
    InitDeclaration(Elements);
}

void FMyVertexFactory::ReleaseRHI()
{
    UniformBuffer.SafeRelease();
    
    if (VertexBuffer)
    {
        VertexBuffer->ReleaseResource();
    }

    if (IndexBuffer)
    {
        IndexBuffer->ReleaseResource();
    }

    Super::ReleaseRHI();
}

void FMyVertexFactory::SetupUniformData()
{
    FMyVertexFactoryParameters UniformParams;
    UniformParams.Color = FVector4(1,0,0,1);

    UniformBuffer = FMyVertexFactoryBufferRef::CreateUniformBufferImmediate(UniformParams, UniformBuffer_MultiFrame);
}

bool FMyVertexFactory::ShouldCompilePermutation(const FVertexFactoryShaderPermutationParameters& Parameters)
{
    return true;
}

void FMyVertexFactory::ModifyCompilationEnvironment(const FVertexFactoryShaderPermutationParameters& Parameters, FShaderCompilerEnvironment& OutEnvironment)
{
    OutEnvironment.SetDefine(TEXT("MY_MESH_FACTORY"), 1);
}

void FMyVertexFactory::ValidateCompiledResult(const FVertexFactoryType* Type, EShaderPlatform Platform, const FShaderParameterMap& ParameterMap, TArray<FString>& OutErrors)
{
}
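
The Morton-order index generation in FMyMeshIndexBuffer above can be tried outside the engine. The sketch below assumes FMath::ReverseMortonCode2 behaves like the standard "compact every other bit" routine written out here, and BuildMortonQuadIndices is a hypothetical free function that follows the same loop. The power-of-two precondition is an assumption added by the sketch: with any other size the Morton walk would visit cells outside the grid.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Extracts the even bits of X and packs them together (Morton decode of one axis).
// Assumed to match the behavior of FMath::ReverseMortonCode2.
uint32_t ReverseMortonCode2(uint32_t X)
{
    X &= 0x55555555u;
    X = (X ^ (X >> 1)) & 0x33333333u;
    X = (X ^ (X >> 2)) & 0x0f0f0f0fu;
    X = (X ^ (X >> 4)) & 0x00ff00ffu;
    X = (X ^ (X >> 8)) & 0x0000ffffu;
    return X;
}

// Builds two triangles per quad, visiting the quads in Morton (Z-curve) order.
std::vector<uint32_t> BuildMortonQuadIndices(uint32_t NumQuadsPerSide)
{
    // Sketch-only precondition: the grid size must be a power of two.
    assert(NumQuadsPerSide > 0 && (NumQuadsPerSide & (NumQuadsPerSide - 1)) == 0);

    const uint32_t VertsPerSide = NumQuadsPerSide + 1;
    std::vector<uint32_t> Indices;
    Indices.reserve(NumQuadsPerSide * NumQuadsPerSide * 6);

    for (uint32_t Morton = 0; Morton < NumQuadsPerSide * NumQuadsPerSide; ++Morton)
    {
        const uint32_t SquareX = ReverseMortonCode2(Morton);
        const uint32_t SquareY = ReverseMortonCode2(Morton >> 1);

        // Toggling once for odd X and once for odd Y equals an XOR of the low bits.
        const bool bForwardDiagonal = ((SquareX ^ SquareY) & 1u) != 0;

        const uint32_t Index0 = SquareX + SquareY * VertsPerSide;
        const uint32_t Index1 = Index0 + 1;
        const uint32_t Index2 = Index0 + VertsPerSide;
        const uint32_t Index3 = Index2 + 1;

        Indices.push_back(Index3);
        Indices.push_back(Index1);
        Indices.push_back(bForwardDiagonal ? Index2 : Index0);
        Indices.push_back(Index0);
        Indices.push_back(Index2);
        Indices.push_back(bForwardDiagonal ? Index1 : Index3);
    }
    return Indices;
}
```

For a 2 x 2 grid this produces 24 indices over 9 vertices, and the first quad expands to the triangles (4, 1, 0) and (0, 3, 4). The alternating diagonal gives neighbouring quads a mirrored split.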
           

The C++ side is now complete, but the HLSL side also needs matching code:

#include "/Engine/Private/VertexFactoryCommon.ush"

// Interpolants passed from the VS to the PS.
struct FVertexFactoryInterpolantsVSToPS
{
#if NUM_TEX_COORD_INTERPOLATORS
    float4    TexCoords[(NUM_TEX_COORD_INTERPOLATORS+1)/2] : TEXCOORD0;
#endif

#if VF_USE_PRIMITIVE_SCENE_DATA
    nointerpolation uint PrimitiveId : PRIMITIVE_ID;
#endif

#if INSTANCED_STEREO
    nointerpolation uint EyeIndex : PACKED_EYE_INDEX;
#endif
};

struct FVertexFactoryInput
{
    float4    Position    : ATTRIBUTE0;

    float4 InstanceData0 : ATTRIBUTE8;
    float4 InstanceData1 : ATTRIBUTE9; 

#if VF_USE_PRIMITIVE_SCENE_DATA
    uint PrimitiveId : ATTRIBUTE13;
#endif
};

struct FPositionOnlyVertexFactoryInput
{
    float4    Position    : ATTRIBUTE0;

    float4 InstanceData0 : ATTRIBUTE8;
    float4 InstanceData1 : ATTRIBUTE9; 

#if VF_USE_PRIMITIVE_SCENE_DATA
    uint PrimitiveId : ATTRIBUTE1;
#endif
};

struct FPositionAndNormalOnlyVertexFactoryInput
{
    float4    Position    : ATTRIBUTE0;
    float4    Normal        : ATTRIBUTE2;

    float4 InstanceData0 : ATTRIBUTE8;
    float4 InstanceData1 : ATTRIBUTE9; 

#if VF_USE_PRIMITIVE_SCENE_DATA
    uint PrimitiveId : ATTRIBUTE1;
#endif
};

struct FVertexFactoryIntermediates
{
    float3 OriginalWorldPos;
    
    uint PrimitiveId;
};

uint GetPrimitiveId(FVertexFactoryInterpolantsVSToPS Interpolants)
{
#if VF_USE_PRIMITIVE_SCENE_DATA
    return Interpolants.PrimitiveId;
#else
    return 0;
#endif
}

void SetPrimitiveId(inout FVertexFactoryInterpolantsVSToPS Interpolants, uint PrimitiveId)
{
#if VF_USE_PRIMITIVE_SCENE_DATA
    Interpolants.PrimitiveId = PrimitiveId;
#endif
}

#if NUM_TEX_COORD_INTERPOLATORS
float2 GetUV(FVertexFactoryInterpolantsVSToPS Interpolants, int UVIndex)
{
    float4 UVVector = Interpolants.TexCoords[UVIndex / 2];
    return UVIndex % 2 ? UVVector.zw : UVVector.xy;
}

void SetUV(inout FVertexFactoryInterpolantsVSToPS Interpolants, int UVIndex, float2 InValue)
{
    FLATTEN
    if (UVIndex % 2)
    {
        Interpolants.TexCoords[UVIndex / 2].zw = InValue;
    }
    else
    {
        Interpolants.TexCoords[UVIndex / 2].xy = InValue;
    }
}
#endif

FMaterialPixelParameters GetMaterialPixelParameters(FVertexFactoryInterpolantsVSToPS Interpolants, float4 SvPosition)
{
    // GetMaterialPixelParameters is responsible for fully initializing the result
    FMaterialPixelParameters Result = MakeInitializedMaterialPixelParameters();

#if NUM_TEX_COORD_INTERPOLATORS
    UNROLL
    for (int CoordinateIndex = 0; CoordinateIndex < NUM_TEX_COORD_INTERPOLATORS; CoordinateIndex++)
    {
        Result.TexCoords[CoordinateIndex] = GetUV(Interpolants, CoordinateIndex);
    }
#endif    // NUM_TEX_COORD_INTERPOLATORS

    Result.TwoSidedSign = 1;
    Result.PrimitiveId = GetPrimitiveId(Interpolants);

    return Result;
}

FMaterialVertexParameters GetMaterialVertexParameters(FVertexFactoryInput Input, FVertexFactoryIntermediates Intermediates, float3 WorldPosition, half3x3 TangentToLocal)
{
    FMaterialVertexParameters Result = (FMaterialVertexParameters)0;
    
    Result.WorldPosition = WorldPosition;
    Result.TangentToWorld = float3x3(1,0,0,0,1,0,0,0,1);
    Result.PreSkinnedPosition = Input.Position.xyz;
    Result.PreSkinnedNormal = float3(0,0,1);

#if NUM_MATERIAL_TEXCOORDS_VERTEX
    UNROLL
    for(int CoordinateIndex = 0; CoordinateIndex < NUM_MATERIAL_TEXCOORDS_VERTEX; CoordinateIndex++)
    {
        Result.TexCoords[CoordinateIndex] = Intermediates.OriginalWorldPos.xy;
    }
#endif  //NUM_MATERIAL_TEXCOORDS_VERTEX

    return Result;
}

FVertexFactoryIntermediates GetVertexFactoryIntermediates(FVertexFactoryInput Input)
{
    FVertexFactoryIntermediates Intermediates;

    // Get the packed instance data
    float4 Data0 = Input.InstanceData0;
    float4 Data1 = Input.InstanceData1;

    const float3 Translation = Data0.xyz;
    const float3 Scale = float3(Data1.zw, 1.0f);
    const uint PackedDataChannel = asuint(Data1.x);

    // Lod level is in first 8 bits and ShouldMorph bit is in the 9th bit
    const float LODLevel = (float)(PackedDataChannel & 0xFF);
    const uint ShouldMorph = ((PackedDataChannel >> 8) & 0x1); 

    // Calculate the world pos
    Intermediates.OriginalWorldPos = float3(Input.Position.xy, 0.0f) * Scale + Translation;

#if VF_USE_PRIMITIVE_SCENE_DATA
    Intermediates.PrimitiveId = Input.PrimitiveId;
#else
    Intermediates.PrimitiveId = 0;
#endif

    return Intermediates;
}

half3x3 VertexFactoryGetTangentToLocal(FVertexFactoryInput Input, FVertexFactoryIntermediates Intermediates)
{
    return half3x3(1,0,0,0,1,0,0,0,1);
}

float4 VertexFactoryGetRasterizedWorldPosition(FVertexFactoryInput Input, FVertexFactoryIntermediates Intermediates, float4 InWorldPosition)
{
    return InWorldPosition;
}

float3 VertexFactoryGetPositionForVertexLighting(FVertexFactoryInput Input, FVertexFactoryIntermediates Intermediates, float3 TranslatedWorldPosition)
{
    return TranslatedWorldPosition;
}

FVertexFactoryInterpolantsVSToPS VertexFactoryGetInterpolantsVSToPS(FVertexFactoryInput Input, FVertexFactoryIntermediates Intermediates, FMaterialVertexParameters VertexParameters)
{
    FVertexFactoryInterpolantsVSToPS Interpolants;

    Interpolants = (FVertexFactoryInterpolantsVSToPS)0;

#if NUM_TEX_COORD_INTERPOLATORS
    float2 CustomizedUVs[NUM_TEX_COORD_INTERPOLATORS];
    GetMaterialCustomizedUVs(VertexParameters, CustomizedUVs);
    GetCustomInterpolators(VertexParameters, CustomizedUVs);
    
    UNROLL
    for (int CoordinateIndex = 0; CoordinateIndex < NUM_TEX_COORD_INTERPOLATORS; CoordinateIndex++)
    {
        SetUV(Interpolants, CoordinateIndex, CustomizedUVs[CoordinateIndex]);
    }
#endif

#if INSTANCED_STEREO
    Interpolants.EyeIndex = 0;
#endif

    SetPrimitiveId(Interpolants, Intermediates.PrimitiveId);

    return Interpolants;
}

float4 VertexFactoryGetWorldPosition(FPositionOnlyVertexFactoryInput Input)
{
    return Input.Position;
}

float4 VertexFactoryGetPreviousWorldPosition(FVertexFactoryInput Input, FVertexFactoryIntermediates Intermediates)
{
    float4x4 PreviousLocalToWorldTranslated = GetPrimitiveData(Intermediates.PrimitiveId).PreviousLocalToWorld;
    PreviousLocalToWorldTranslated[3][0] += ResolvedView.PrevPreViewTranslation.x;
    PreviousLocalToWorldTranslated[3][1] += ResolvedView.PrevPreViewTranslation.y;
    PreviousLocalToWorldTranslated[3][2] += ResolvedView.PrevPreViewTranslation.z;

    return mul(Input.Position, PreviousLocalToWorldTranslated);
}

float4 VertexFactoryGetTranslatedPrimitiveVolumeBounds(FVertexFactoryInterpolantsVSToPS Interpolants)
{
    float4 ObjectWorldPositionAndRadius = GetPrimitiveData(GetPrimitiveId(Interpolants)).ObjectWorldPositionAndRadius;
    return float4(ObjectWorldPositionAndRadius.xyz + ResolvedView.PreViewTranslation.xyz, ObjectWorldPositionAndRadius.w);
}

uint VertexFactoryGetPrimitiveId(FVertexFactoryInterpolantsVSToPS Interpolants)
{
    return GetPrimitiveId(Interpolants);
}

float3 VertexFactoryGetWorldNormal(FPositionAndNormalOnlyVertexFactoryInput Input)
{
    return Input.Normal.xyz;
}

float3 VertexFactoryGetWorldNormal(FVertexFactoryInput Input, FVertexFactoryIntermediates Intermediates)
{
    return float3(0.0f, 0.0f, 1.0f);
}
           

As the code above shows, adding a custom FVertexFactory type requires implementing the following structures and functions in HLSL:

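| Interface | Description |
| --- | --- |
| FVertexFactoryInput | Defines the layout of the data fed into the VS; it must match the FVertexFactory type on the C++ side. |
| FVertexFactoryIntermediates | Stores cached intermediate data that is used across several vertex factory functions, such as TangentToLocal. |
| FVertexFactoryInterpolantsVSToPS | The vertex factory data passed from the VS to the PS. |
| VertexFactoryGetWorldPosition | Called from the vertex shader to obtain the world-space vertex position. |
| VertexFactoryGetInterpolantsVSToPS | Converts FVertexFactoryInput into FVertexFactoryInterpolantsVSToPS, computing the data that must be interpolated or passed to the PS before hardware rasterization interpolates it. |
| GetMaterialPixelParameters | Called from the PS; computes and fills the FMaterialPixelParameters struct from FVertexFactoryInterpolantsVSToPS. |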
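The requirement that FVertexFactoryInput match the C++ side also covers how bits are packed into each channel. In the shader above, GetVertexFactoryIntermediates reads Data1.x with asuint and extracts the LOD level from the low 8 bits and the ShouldMorph flag from the 9th bit. The following is a minimal standalone C++ sketch of a CPU-side counterpart under that layout. The names PackInstanceChannel, UnpackInstanceChannel, and FUnpackedChannel are hypothetical and are not engine API; how the engine actually fills its instance buffer is not shown in this article.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not engine API): packs the LOD level into bits 0..7
// and the morph flag into bit 8, then reinterprets the bits as a float so
// the value can be stored in a float4 instance stream.
inline float PackInstanceChannel(std::uint32_t LODLevel, bool bShouldMorph)
{
    assert(LODLevel <= 0xFFu); // only 8 bits are reserved for the LOD level

    const std::uint32_t Packed =
        (LODLevel & 0xFFu) | ((bShouldMorph ? 1u : 0u) << 8);

    float AsFloat;
    static_assert(sizeof(AsFloat) == sizeof(Packed), "float must be 32 bits");
    std::memcpy(&AsFloat, &Packed, sizeof(AsFloat)); // bit copy, no numeric conversion
    return AsFloat;
}

// Hypothetical mirror of the HLSL unpacking logic, handy for checking the layout on the CPU.
struct FUnpackedChannel
{
    float LODLevel;
    std::uint32_t ShouldMorph;
};

inline FUnpackedChannel UnpackInstanceChannel(float Channel)
{
    std::uint32_t Packed;
    std::memcpy(&Packed, &Channel, sizeof(Packed)); // CPU equivalent of asuint()

    FUnpackedChannel Result;
    Result.LODLevel = static_cast<float>(Packed & 0xFFu); // low 8 bits
    Result.ShouldMorph = (Packed >> 8) & 0x1u;            // 9th bit
    return Result;
}
```

Calling UnpackInstanceChannel(PackInstanceChannel(5, true)) round-trips to an LOD level of 5 and a ShouldMorph of 1. The sketch uses memcpy rather than a numeric cast because the shader reinterprets the raw bits; a value-converting cast on either side would silently break the layout.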

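This article covered the basic concepts, types, and mechanisms of UE's shader system. Hopefully, after working through it, readers will no longer find UE shaders unfamiliar and will be able to apply them in real projects.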

As usual, here are a few questions to help you understand and consolidate what you have learned about the UE shader system:

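  • Which subclasses in the FShader inheritance hierarchy are important? What does each do, and how are they alike or different?
  • How are Shader Parameters and Uniform Buffers declared, implemented, applied, and updated on the GPU?
  • How does the Shader Map storage and compilation mechanism work?
  • What approach does UE take to cross-platform shaders? Why was it done that way? Is there a better way?
  • How can shaders be debugged or optimized more effectively?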

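  • Thanks to the authors of all the references. Some images come from the references and the web; they will be removed on request if they infringe.
  • This series is the author's original work and is published only on cnblogs (博客园). Sharing a link to this article is welcome, but reposting without permission is not allowed!
  • The series is ongoing; see the content outline for the complete table of contents.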

  • Unreal Engine Source
  • Rendering and Graphics
  • Materials
  • Graphics Programming
  • Shader Development
  • Debugging the Shader Compile Process
  • Creating a Custom Mesh Component in UE4 | Part 0: Intro
  • Creating a Custom Mesh Component in UE4 | Part 1: An In-depth Explanation of Vertex Factories
  • Creating a Custom Mesh Component in UE4 | Part 2: Implementing the Vertex Factory
  • Unreal Engine 4 Rendering Part 1: Introduction
  • Unreal Engine 4 Rendering Part 5: Shader Permutations
  • 【UE4 Renderer】<03> PipelineBase
  • UE4材質系統源碼分析之材質編譯成HLSL CODE
  • UE4 HLSL 和 Shader 開發指南和技巧
  • Uniform Buffer、FVertexFactory、FVertexFactoryType
  • 遊戲引擎随筆 0x02:Shader 跨平台編譯之路
  • UE4 Shader 編譯以及變種實作
  • 虛幻4渲染程式設計(Shader篇)【第四卷:虛幻4C++層和Shader層的簡單資料通訊】
  • UE4渲染部分2: Shaders和Vertex Data
  • HLSL Cross Compiler
  • AsyncCompute
  • 深入GPU硬體架構及運作機制
  • 移動遊戲性能優化通用技法
  • Adding Global Shaders to Unreal Engine
  • Create a New Global Shader as a Plugin
  • The Industry Open Standard Intermediate Language for Parallel Compute and Graphics
  • 跨平台引擎Shader編譯流程分析
  • 關于Shader的跨平台方案的考慮
  • UE4的着色器跨平台解決方案
  • 跨平台shader編譯的過去、現在和未來
  • BRINGING UNREAL ENGINE 4 TO OPENGL
  • FShaderCache
