Using AWE to Allocate Memory on 64-bit Platforms

We have already talked about the Windows AWE mechanism on 32-bit platforms and how SQL Server utilizes it. Today I would like to go over AWE and the related mechanisms on 64-bit platforms.

To some people it comes as a surprise that the AWE mechanism is still present, and can actually be useful, on 64-bit platforms. As you remember, the mechanism consists of two parts: allocating physical memory and mapping it into a given process's virtual address space (VAS). The advantage of the allocation mechanism is that once physical memory is allocated, the operating system can't reclaim it until either the process terminates or the process frees the memory back to the OS. This allows an application to control paging, or even avoid it altogether. The advantage of the mapping/unmapping mechanism is that the same physical page can be mapped into different regions of the VAS. As you can imagine, unmapping is not necessary on 64-bit platforms, since there is enough VAS to accommodate all of the existing physical memory.
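To make the two parts concrete, here is a minimal toy model in Python. It is only a sketch: the class name ToyAwe and its methods are made up for illustration and are not Windows APIs (on Windows the corresponding calls are AllocateUserPhysicalPages, MapUserPhysicalPages and FreeUserPhysicalPages), and the model ignores most of the real restrictions.

```python
class ToyAwe:
    """Toy model of AWE's two steps: allocate physical frames, then map them.

    Hypothetical illustration only; not a wrapper around any Windows API.
    """

    def __init__(self, total_frames):
        self.free_frames = list(range(total_frames))  # frames the "OS" still owns
        self.owned = set()    # frames the process holds until it frees them
        self.mappings = {}    # virtual page number -> frame number

    def allocate(self, count):
        """Hand frames to the process; the 'OS' cannot take them back."""
        if count > len(self.free_frames):
            raise MemoryError("not enough physical frames")
        frames = [self.free_frames.pop() for _ in range(count)]
        self.owned.update(frames)
        return frames

    def map(self, virtual_page, frame):
        """Point a virtual page at a frame the process already owns."""
        if frame not in self.owned:
            raise ValueError("frame was not allocated by this process")
        self.mappings[virtual_page] = frame

    def unmap(self, virtual_page):
        """Remove the mapping; the frame itself stays allocated."""
        self.mappings.pop(virtual_page, None)

    def free(self, frames):
        """Return frames to the 'OS', dropping mappings that still use them."""
        for frame in frames:
            for vpn in [v for v, f in self.mappings.items() if f == frame]:
                del self.mappings[vpn]
            self.owned.discard(frame)
            self.free_frames.append(frame)


# A 32-bit style "window": one virtual page re-pointed at two different frames.
awe = ToyAwe(total_frames=4)
first, second = awe.allocate(2)
awe.map(0, first)
awe.map(0, second)   # same window, different physical frame
awe.free([first, second])
```

On 64-bit the window trick in the last lines is what becomes unnecessary: every frame can simply get its own virtual page and stay mapped.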

From operating system theory: the OS uses a page table entry (PTE) to describe the mapping of a page in the VAS to a physical page. Internally, a physical page is described by a page frame number (PFN). Given a PFN, one can derive complete information about the physical page it represents; for example, the PFN shows which NUMA node the page belongs to. The OS maintains a database of the PFNs it manages. If a page in the VAS is committed, it has a PTE, which may or may not point to a PFN. Conceptually, the page that a PTE represents can either be in memory or not (for example, swapped out to disk). In the former case the PTE is bound to a PFN; in the latter it is not. In turn, once a physical page is bound to a page in the VAS, its PFN points back to the PTE.
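The PTE/PFN relationship described above can be sketched as two small records that point at each other. This is a conceptual model only; the field and function names (Pfn, Pte, bind, page_out) are assumptions for illustration and do not reflect the actual Windows structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Pfn:
    """Toy PFN database entry describing one physical page."""
    frame_number: int
    numa_node: int                 # which NUMA node the physical page lives on
    pte: Optional["Pte"] = None    # back-pointer, set while the frame is bound


@dataclass(eq=False)
class Pte:
    """Toy page table entry for one committed page in the VAS."""
    virtual_page: int
    pfn: Optional[Pfn] = None      # None models "not in memory" (e.g. paged out)

    @property
    def resident(self):
        return self.pfn is not None


def bind(pte, pfn):
    """Bind a virtual page to a physical page; the PFN points back to the PTE."""
    pte.pfn = pfn
    pfn.pte = pte


def page_out(pte):
    """Break the binding in both directions."""
    if pte.pfn is not None:
        pte.pfn.pte = None
        pte.pfn = None


pte = Pte(virtual_page=42)              # committed, but not yet in memory
bind(pte, Pfn(frame_number=7, numa_node=1))
node = pte.pfn.numa_node                # residency is derived from the PFN
page_out(pte)                           # the PTE survives, the binding does not
```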

When the OS commits, frees, pages out or pages in a given PTE, or needs to derive some information about it (for example, its NUMA residency), it has to acquire the process's working set lock to guarantee the stability of the PTE-to-PFN binding. This lock is rather expensive and might hurt the scalability of the process. Later versions of Windows made this lock as light as possible, but avoiding it will still benefit an application's scalability.

When allocating physical pages through the AWE mechanism, we are given a set of PFN entries directly from the PFN database. Remember that you should neither manipulate or modify the entries you get back nor rely on their values. The OS has to take the PFN database lock when allocating PFN entries. Using the AWE map mechanism, you can map the allocated PFN entries into the process's VAS. When the mapping occurs, PTEs are allocated, bound to the PFNs, and marked as locked. In this case the OS needs to acquire the process's working set lock only once. When mapping regular pages, the OS does it on demand and hence has to acquire both the working set lock and the PFN database lock for every page. Since AWE-mapped pages are locked in memory, the OS will ignore their PTEs during the paging process.
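The difference in lock traffic can be made visible with a small counting model. This is a sketch under simplifying assumptions: it assumes the whole batch is allocated in one call and mapped in one call, and the function names are hypothetical. It counts acquisitions only and says nothing about how long each lock is held.

```python
from collections import Counter


def demand_fault(pages):
    """Regular pages: every first touch takes both locks, one page at a time."""
    locks = Counter()
    for _ in range(pages):
        locks["working_set"] += 1
        locks["pfn_database"] += 1
    return locks


def awe_allocate_and_map(pages):
    """AWE: one bulk allocation and one bulk mapping, whatever the page count.

    `pages` is kept as a parameter only to show that the cost does not
    depend on it in this model.
    """
    locks = Counter()
    locks["pfn_database"] += 1   # taken once while the PFN entries are allocated
    locks["working_set"] += 1    # taken once while all PTEs are bound and locked
    return locks


regular = demand_fault(100_000)
locked = awe_allocate_and_map(100_000)
print("demand paging:", dict(regular))
print("AWE bulk map: ", dict(locked))
```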

On 64-bit platforms it is better to refer to such pages as locked pages. Please don't confuse them with pages locked through the VirtualLock API. As described above, locked pages have two major properties: the OS does not consider them for paging, and during allocation they acquire the working set lock and the PFN database lock only once each.

The first property has implications for high-end hardware such as NUMA: it provides explicit memory residency. Remember that the OS commits a page on demand. To allocate physical memory, it uses the node on which the thread touching the memory is running. Later on, the page can be swapped out by the OS. The next time the page is brought back into memory, the OS again allocates a physical page from the node on which the touching thread is running, and that node could be completely different from the original one. Such behavior makes it hard for applications to rely on a page's NUMA residency. Locked pages solve this problem by removing themselves from paging altogether. Moreover, Windows 2003 SP1 introduced a new API, QueryWorkingSetEx, which allows you to query extended information about a PTE's PFN. To find out a page's real residency you should use this API. When pages are locked you need to do it only once; otherwise you have to do it periodically, since the residency of the page can actually change.
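The residency drift is easy to see in a toy simulation. The Page class below is hypothetical and deliberately simplified: it treats the first touch as the moment a locked page gets its frame, whereas with AWE the frame is chosen at allocation time.

```python
class Page:
    """Toy page whose NUMA node is decided by whichever thread faults it in."""

    def __init__(self, locked=False):
        self.locked = locked
        self.node = None   # None: no physical page is bound right now

    def touch(self, thread_node):
        """A touch with no frame bound allocates one on the toucher's node."""
        if self.node is None:
            self.node = thread_node
        return self.node

    def page_out(self):
        """The pager skips locked pages; regular pages lose their frame."""
        if not self.locked:
            self.node = None


regular = Page()
regular.touch(thread_node=0)
regular.page_out()
regular.touch(thread_node=1)   # comes back on a different node

locked = Page(locked=True)
locked.touch(thread_node=0)
locked.page_out()              # ignored by the pager
locked.touch(thread_node=1)    # still on the original node

print("regular page now on node", regular.node)
print("locked page still on node", locked.node)
```

This is why a residency value queried once stays valid for a locked page, while for a regular page it is only a snapshot.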

The second property, taking the working set lock and the PFN database lock only once, enables applications to ramp up faster and scale better.

On NUMA, SQL Server's Buffer Pool marks each allocated page with its node residency, and it leverages QueryWorkingSetEx to accomplish this. Once a page is allocated, it calls the API to find out the page's residency, and it does so only once. Therefore enabling locked pages for SQL Server on a 64-bit platform will improve SQL Server's ramp-up time and will improve performance and scalability over a longer period of time. When running SQL Server with locked pages enabled, you shouldn't worry about overall system performance suffering from memory starvation: SQL Server participates in the OS's paging mechanism by listening to the OS's memory notification APIs and shrinks its working set accordingly.
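Since the OS cannot page locked memory out, the application has to give it back voluntarily. The following toy sketch shows the shape of such a response. ToyBufferPool, its parameters, and the fixed shrink step are assumptions for illustration, not SQL Server's actual policy; on Windows the low-memory signal could come from an API such as CreateMemoryResourceNotification, which this sketch replaces with a plain boolean.

```python
class ToyBufferPool:
    """Toy pool of locked pages that gives memory back under pressure."""

    def __init__(self, pages, min_pages, shrink_step):
        self.pages = pages              # locked pages currently held
        self.min_pages = min_pages      # floor the pool never shrinks below
        self.shrink_step = shrink_step  # pages released per low-memory signal

    def on_memory_notification(self, low_memory):
        """Release up to shrink_step pages when the OS signals low memory."""
        if not low_memory:
            return 0
        released = min(self.shrink_step, self.pages - self.min_pages)
        self.pages -= released
        return released


pool = ToyBufferPool(pages=100, min_pages=40, shrink_step=25)
for signal in (False, True, True, True, True):
    released = pool.on_memory_notification(signal)
    print(f"low_memory={signal}: released {released}, holding {pool.pages}")
```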

To summarize: on 64-bit platforms, locked pages (the AWE mechanism) enable better application scalability and performance, both during ramp-up and over a long period of time. However, keep in mind that an application is still required to implement a way of responding to memory pressure to avoid starving the whole system of memory.