PhysX SDK Doc Translation - Task Management

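Translation: rikpan; proofreading: ericating88. Please credit the source when reposting.

Our ability is limited; please point out any errors or omissions.

PhysX version 3.2.1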

Task Management

PxTask is a subsystem for managing compute resources for PhysX and APEX. It manages CPU and GPU compute resources, as well as SPU units on PlayStation3, by distributing Tasks to a user-implemented dispatcher and resolving Task dependencies such that Tasks are run in a given order.

Middleware products typically do not want to create CPU threads for their own use. This is especially true on consoles where execution threads can have significant overhead. In the PxTask model, the computational work is broken into jobs that are submitted to the game's thread pool as they become ready to run.

The following classes comprise the PxTask CPU resource management.

TaskManager

A TaskManager manages inter-task dependencies and dispatches ready tasks to their respective dispatcher. There is a dispatcher for CPU tasks, GPU tasks, and SPU tasks assigned to the TaskManager.

TaskManagers are owned and created by the SDK. Each PxScene will allocate its own TaskManager instance which users can configure with dispatchers through either the PxSceneDesc or directly through the TaskManager interface.

CpuDispatcher

The CpuDispatcher is an abstract class the SDK uses for interfacing with the application's thread pool. Typically, there will be one single CpuDispatcher for the entire application, since there is rarely a need for more than one thread pool. A CpuDispatcher instance may be shared by more than one TaskManager, for example if multiple scenes are being used.

PxTask includes a default CpuDispatcher implementation, but we prefer applications to implement this class themselves so PhysX and APEX can efficiently share CPU resources with the application.

Note

The TaskManager will call CpuDispatcher::submitTask() either from the context of API calls (e.g. scene::simulate()) or from other running tasks, so the function must be thread-safe.

An implementation of the CpuDispatcher interface must call the following two methods on each submitted task for it to be run correctly:

baseTask->run();        // optionally call runProfiled() to wrap with PVD profiling events
baseTask->release();
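
The two requirements above (a thread-safe submitTask() and a run()/release() pair for every task) can be sketched without the SDK. The block below is only an illustration: BaseTask here is a stand-in interface, and SimpleDispatcher, CountingTask and runCountingTasks are hypothetical names that do not come from PhysX.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in for the SDK's task interface (hypothetical; not the PhysX header).
struct BaseTask {
    virtual ~BaseTask() {}
    virtual void run() = 0;
    virtual void release() = 0;
};

// Hypothetical dispatcher: one mutex-protected list drained by worker threads.
class SimpleDispatcher {
public:
    explicit SimpleDispatcher(unsigned nbThreads) {
        assert(nbThreads > 0);
        for (unsigned i = 0; i < nbThreads; ++i)
            mWorkers.emplace_back([this] { workerLoop(); });
    }
    ~SimpleDispatcher() {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mQuit = true;
        }
        mCond.notify_all();
        for (std::thread& t : mWorkers) t.join();
    }
    // Safe to call from any thread, including from inside a running task.
    void submitTask(BaseTask& task) {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mTasks.push_back(&task);
        }
        mCond.notify_one();
    }
private:
    void workerLoop() {
        for (;;) {
            BaseTask* task = nullptr;
            {
                std::unique_lock<std::mutex> lock(mMutex);
                mCond.wait(lock, [this] { return mQuit || !mTasks.empty(); });
                if (mTasks.empty()) return;  // quitting and nothing left to run
                task = mTasks.back();        // newest task first (LIFO)
                mTasks.pop_back();
            }
            task->run();      // both calls are made outside the lock,
            task->release();  // once per submitted task
        }
    }
    std::mutex mMutex;
    std::condition_variable mCond;
    std::vector<BaseTask*> mTasks;
    std::vector<std::thread> mWorkers;
    bool mQuit = false;
};

// Example task: counts calls to run() and release() in a shared counter.
struct CountingTask : BaseTask {
    explicit CountingTask(std::atomic<int>& c) : calls(c) {}
    void run() override { ++calls; }
    void release() override { ++calls; }
    std::atomic<int>& calls;
};

// Submits nbTasks tasks and returns the total number of run()/release() calls.
int runCountingTasks(unsigned nbThreads, unsigned nbTasks) {
    std::atomic<int> calls(0);
    std::vector<CountingTask> tasks(nbTasks, CountingTask(calls));
    {
        SimpleDispatcher dispatcher(nbThreads);
        for (CountingTask& t : tasks) dispatcher.submitTask(t);
    }  // the destructor drains the list and joins the workers
    return calls.load();
}
```

Calling runCountingTasks(4, 100) runs and releases each of the 100 tasks exactly once. A real implementation would forward to the game's existing thread pool rather than own its threads.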
           

The PxExtensions library has default implementations for all dispatcher types. The following code snippets are taken from SampleParticles and SampleBase and show how the default dispatchers are created. mNbThreads, which is passed to PxDefaultCpuDispatcherCreate, defines how many worker threads the CPU dispatcher will have.

Best performance is usually achieved if the number of threads is equal to the available hardware threads of the platform you are running on:

PxSceneDesc sceneDesc(mPhysics->getTolerancesScale());
    [...]
    // create CPU dispatcher with mNbThreads worker threads
    mCpuDispatcher = PxDefaultCpuDispatcherCreate(mNbThreads);
    if(!mCpuDispatcher)
        fatalError("PxDefaultCpuDispatcherCreate failed!");
    sceneDesc.cpuDispatcher = mCpuDispatcher;
#ifdef PX_WINDOWS
    // create GPU dispatcher
    pxtask::CudaContextManagerDesc cudaContextManagerDesc;
    mCudaContextManager = pxtask::createCudaContextManager(cudaContextManagerDesc);
    // guard against a NULL context manager (assumed possible when no CUDA device is present)
    if(mCudaContextManager)
        sceneDesc.gpuDispatcher = mCudaContextManager->getGpuDispatcher();
#endif
    [...]
    mScene = mPhysics->createScene(sceneDesc);
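
One way to pick a value such as mNbThreads is to query the standard library for the number of hardware threads. The helper below is a sketch under that assumption: chooseWorkerCount and its reservedThreads parameter are hypothetical names, not part of PhysX or its samples.

```cpp
#include <cassert>
#include <thread>

// Hypothetical helper: number of worker threads to request from the dispatcher.
// reservedThreads is the number of hardware threads the application keeps for
// itself (for example the main thread or a render thread).
unsigned chooseWorkerCount(unsigned reservedThreads) {
    unsigned hw = std::thread::hardware_concurrency();  // may be 0 if unknown
    if (hw == 0) hw = 1;
    unsigned workers = hw > reservedThreads ? hw - reservedThreads : 1;
    assert(workers >= 1);  // always keep at least one worker
    return workers;
}
```

The result could then be passed as the thread count, for instance chooseWorkerCount(1) to leave one hardware thread for the main loop.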
           

Note

CudaContextManagerDesc now supports appGUID. It only works in release builds. If your application employs PhysX modules that use CUDA, you need to use a GUID so that patches for new architectures can be released for your game. You can obtain a GUID for your application from NVIDIA. The application should log any failure to a file which can be sent to NVIDIA for support.

CpuDispatcher Implementation Guidelines

After the scene's TaskManager has found a ready-to-run task and submitted it to the appropriate dispatcher, it is up to the dispatcher implementation to decide how and when the task will be run.

Often in game scenarios the rigid body simulation is time critical and the goal is to reduce the latency from simulate() to the completion of fetchResults(). The lowest possible latency will be achieved when the PhysX tasks have exclusive access to CPU resources during the update. In reality, PhysX will have to share compute resources with other game tasks. Below are some guidelines to help ensure a balance between throughput and latency when mixing the PhysX update with other work.

Avoid interleaving long-running tasks with PhysX tasks; this will help reduce latency. (Translator's note: do not perform long-running work inside BaseTask::run.)

Avoid assigning worker threads to the same execution core as higher priority threads. If a PhysX task is context switched during execution, the rest of the rigid body pipeline may be stalled, increasing latency. (Translator's note: when a PhysX task is context switched during execution, tasks running on higher-priority threads can hold back tasks running on lower-priority threads. Ideally the threads that run PhysX tasks all share the same high priority.)

PhysX occasionally submits tasks and then immediately waits for them to complete. Because of this, executing tasks in LIFO (stack) order may perform better than FIFO (queue) order.

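PhysX is not a perfectly parallel SDK, so interleaving small to medium granularity tasks will generally result in higher overall throughput. (Translator's note: PhysX starts its multithreaded simulation after simulate(), which returns immediately, and fetchResults() waits for the simulation to complete. Calling fetchResults() right after simulate() results in a relatively long wait, and PhysX is not doing compute-intensive work for all of that time, so some CPU resources sit idle. Inserting small to medium granularity tasks between simulate() and fetchResults() therefore uses the CPU more effectively. The samples insert rendering work there.)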

If your thread pool has per-thread job queues, then queuing tasks on the thread from which they were submitted may result in better CPU cache coherence; however, this is not required.

For more details, see the default CpuDispatcher implementation that comes as part of the PxExtensions package. It uses worker threads that each have their own task queue and steal tasks from the back of other workers' queues (LIFO order) to improve workload distribution.
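
The per-thread queue and LIFO stealing guidelines can be illustrated with a small sketch. It is not the PxExtensions implementation: WorkerQueue and nextTask are hypothetical names, tasks are represented by plain integer ids, and a mutex per queue stands in for whatever synchronization a real pool would use.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical per-worker queue; tasks are plain integer ids in this sketch.
class WorkerQueue {
public:
    void push(int taskId) {
        std::lock_guard<std::mutex> lock(mMutex);
        mTasks.push_back(taskId);
    }
    // Takes the most recently pushed task (LIFO). Returns false when empty.
    bool popBack(int& taskId) {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mTasks.empty()) return false;
        taskId = mTasks.back();
        mTasks.pop_back();
        return true;
    }
private:
    std::mutex mMutex;
    std::vector<int> mTasks;
};

// A worker first looks in its own queue, then steals from the back of the others.
bool nextTask(std::vector<WorkerQueue>& queues, std::size_t self, int& taskId) {
    assert(self < queues.size());
    if (queues[self].popBack(taskId)) return true;       // own work first
    for (std::size_t i = 0; i < queues.size(); ++i)
        if (i != self && queues[i].popBack(taskId)) return true;  // steal newest
    return false;
}
```

With two queues and tasks 1 and 2 pushed onto queue 0, worker 1 steals task 2 (the newest), after which worker 0 picks up task 1 from its own queue.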

BaseTask

BaseTask is the abstract base class for all PxTask task types. All task run() functions will be executed on application threads, so they need to be careful with their stack usage, use as little stack as possible, and never block for any reason.

Task

The Task class is the standard task type. Tasks must be submitted to the TaskManager each simulation step for them to be executed. Tasks may be named at submission time, which allows them to be discoverable. Tasks will be given a reference count of 1 when they are submitted, and the TaskManager::startSimulation() function decrements the reference count of all tasks and dispatches all Tasks whose reference count reaches zero. Before TaskManager::startSimulation() is called, Tasks can set dependencies on each other to control the order in which they are dispatched. Once simulation has started, it is still possible to submit new tasks and add dependencies, but it is up to the programmer to avoid race hazards. You cannot add dependencies to tasks that have already been dispatched, and newly submitted Tasks must have their reference count decremented before they will be allowed to execute. (Translator's note: that is, the reference count has to be brought down to zero manually.)

Synchronization points can also be defined using Task names. The TaskManager will assign the name a TaskID with no Task implementation. When all of the named TaskID's dependencies are met, it will decrement the reference count of all Tasks with that name.

APEX uses the Task class almost exclusively to manage CPU resources. The ApexScene defines a number of named Tasks that the modules use to schedule their own Tasks (e.g. start after LOD calculations are complete, finish before the PhysX scene is stepped).

LightCpuTask

LightCpuTask is another subclass of BaseTask that is explicitly scheduled by the programmer. LightCpuTasks have a reference count of 1 when they are initialized, so their reference count must be decremented before they are dispatched. LightCpuTasks increment their continuation task's reference count when they are initialized, and decrement that reference count when they are released (after completing their run() function).

PhysX 3.0 uses LightCpuTasks almost exclusively to manage CPU resources. For example, each stage of the simulation update may consist of multiple parallel tasks. When each of these tasks has finished execution, it will decrement the reference count on the next task in the update chain. The next task is then automatically dispatched for execution when its reference count reaches zero.
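
The reference-counting scheme described for LightCpuTasks can be modeled in a few lines. This is a single-threaded miniature for illustration only: MiniTask and runChain are hypothetical names, dispatch() stands in for submission to a dispatcher followed by run() and release(), and none of it is PhysX code.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of a task with a reference count and a continuation.
struct MiniTask {
    MiniTask(const std::string& n, std::vector<std::string>& l) : name(n), log(l) {}

    // The continuation gains one reference: it now waits for this task.
    void setContinuation(MiniTask* c) {
        continuation = c;
        if (c) ++c->refCount;
    }
    // Dropping the last reference dispatches the task.
    void removeReference() {
        assert(refCount > 0);
        if (--refCount == 0) dispatch();
    }
    // Stand-in for "submit to the dispatcher, then run() and release()".
    void dispatch() {
        log.push_back(name);                                // the "run()" step
        if (continuation) continuation->removeReference();  // the "release()" step
    }

    std::string name;
    std::vector<std::string>& log;
    int refCount = 1;  // tasks start with one reference
    MiniTask* continuation = nullptr;
};

// Builds the chain a0 -> a1 -> a2 and returns the order in which tasks ran.
std::vector<std::string> runChain(bool releaseInReverse) {
    std::vector<std::string> log;
    MiniTask a0("a0", log), a1("a1", log), a2("a2", log);
    a0.setContinuation(&a1);
    a1.setContinuation(&a2);
    MiniTask* order[3] = { &a0, &a1, &a2 };
    if (releaseInReverse) std::swap(order[0], order[2]);
    for (MiniTask* t : order) t->removeReference();
    return log;
}
```

In this model the tasks run as a0, a1, a2 regardless of the order in which the initial references are removed, because each continuation holds an extra reference until its predecessor has finished.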

Note

Even when using LightCpuTasks exclusively to manage CPU resources, the TaskManager startSimulation() and stopSimulation() calls must be made each simulation step to keep the GpuDispatcher synchronized.

The following code snippets show how the crabs' A.I. in SampleSubmarine is run as a CPU Task. By doing so the Crab A.I. is run as a background Task in parallel with the PhysX simulation update.

For a CPU task that does not need handling of multiple continuations, LightCpuTask can be subclassed.

A LightCpuTask subclass requires that the getName() and run() methods be defined:

class Crab: public ClassType, public physx::pxtask::LightCpuTask, public SampleAllocateable
{
public:
    Crab(SampleSubmarine& sample, const PxVec3& crabPos, RenderMaterial* material);
    ~Crab();
    [...]

    // Implements LightCpuTask
    virtual  const char*    getName() const { return "Crab AI Task"; }
    virtual  void           run();

    [...]
};
           

After PxScene::simulate() has been called and the simulation has started, the application calls removeReference() on each Crab task, which in turn causes it to be submitted to the CpuDispatcher for update. Note that it is also possible to submit tasks to the dispatcher directly (without manipulating reference counts) as follows:

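// bind the reference to the task object itself
// (assumes mCrab is a Crab instance; use *mCrab if it is a pointer)
pxtask::LightCpuTask& task = mCrab;
mCpuDispatcher->submitTask(task);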
           

Once queued for execution by the CpuDispatcher, one of the thread pool's worker threads will eventually call the task's run method. In this example the Crab task will perform raycasts against the scene and update its internal state machine:

void Crab::run()
{
    // run as a separate task/thread
    scanForObstacles();
    updateState();
}
           

It is safe to perform API read calls, such as scene queries, from multiple threads while simulate() is running. However, care must be taken not to overlap API read and write calls from multiple threads. In this case the SDK will issue an error; see Data Access and Buffering for more information.

An example of explicit reference count modification and task dependency setup:

// assume all tasks have a refcount of 1 and are submitted to the task manager
// 3 task chains a0-a2, b0-b2, c0-c2
// b0 shall start after a1
// the a and c chain have no dependencies and shall run in parallel
//
// a0-a1-a2
//      \
//       b0-b1-b2
// c0-c1-c2

// setup the 3 chains
for(PxU32 i = 0; i < 2; i++)
{
    a[i].setContinuation(&a[i+1]);
    b[i].setContinuation(&b[i+1]);
    c[i].setContinuation(&c[i+1]);
}

// b0 shall start after a1
b[0].startAfter(a[1].getTaskID());

// setup is done, now start all tasks by decrementing their refcount by 1
// tasks with refcount == 0 will be submitted to the dispatcher (a0 & c0 will start).
for(PxU32 i = 0; i < 3; i++)
{
    a[i].removeReference();
    b[i].removeReference();
    c[i].removeReference();
}