x64 Software Calling Convention II

x64 calling convention [excerpt]

This section describes the standard processes and conventions that one function (the caller) uses to make calls into another function (the callee) in x64 code.

Calling convention defaults

The x64 Application Binary Interface (ABI) uses a four-register fast-call calling convention by default. Space is allocated on the call stack as a shadow store for callees to save those registers. There's a strict one-to-one correspondence between the arguments to a function call and the registers used for those arguments. Any argument that doesn't fit in 8 bytes, or isn't 1, 2, 4, or 8 bytes, must be passed by reference. A single argument is never spread across multiple registers. The x87 register stack is unused, and may be used by the callee, but must be considered volatile across function calls. All floating point operations are done using the 16 XMM registers. Integer arguments are passed in registers RCX, RDX, R8, and R9. Floating point arguments are passed in XMM0L, XMM1L, XMM2L, and XMM3L. 16-byte arguments are passed by reference. Parameter passing is described in detail in Parameter Passing. In addition to these registers, RAX, R10, R11, XMM4, and XMM5 are considered volatile. All other registers are non-volatile. Register usage is documented in detail in Register Usage and Caller/Callee Saved Registers.

Translator's notes: For a fuller definition of an ABI, see the book 程式員的自我修養. The 8-byte rule covers two cases. An argument of 8 bytes or less is usually a fundamental type, whose size is 1, 2, 4, or 8 bytes, and travels in a single register; an aggregate whose size is not one of those values is passed by reference. An argument larger than 8 bytes may be written as pass-by-value in the source code, but the compiler implicitly creates a temporary and passes its address to the callee. The x87 register stack is the one used by instructions such as fld and fst; because it is volatile, a caller cannot expect its contents to survive a function call.
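The shadow store mentioned above can be made concrete with a little arithmetic. The sketch below assumes that every argument slot is 8 bytes wide and that the caller always reserves slots for the four register arguments, even when the function takes fewer; the helper names outgoing_area_bytes and slot_offset are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Assumptions for this sketch: 8-byte slots, four register-backed slots.
constexpr std::size_t kSlotBytes = 8;
constexpr std::size_t kRegisterSlots = 4;

// Hypothetical helper: bytes the caller sets aside for outgoing arguments.
// The four register slots always get shadow space, so the minimum is 32.
constexpr std::size_t outgoing_area_bytes(std::size_t arg_count) {
    return (arg_count < kRegisterSlots ? kRegisterSlots : arg_count) * kSlotBytes;
}

// Hypothetical helper: offset of the 0-based argument `index` from the
// start of that area.
constexpr std::size_t slot_offset(std::size_t index) {
    return index * kSlotBytes;
}

// Checks the sketch against a call with no arguments and one with five.
void check_shadow_space_examples() {
    assert(outgoing_area_bytes(0) == 32);  // shadow space only
    assert(outgoing_area_bytes(5) == 40);  // shadow space + one stack argument
    assert(slot_offset(4) == 32);          // fifth argument sits just above the shadow space
}
```

Under these assumptions a five-integer call reserves 40 bytes, and the fifth argument lands directly above the 32 bytes of shadow space. Real frames also need room for locals and must keep the stack pointer 16-byte aligned, which this sketch does not model.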

Alignment

Most structures are aligned to their natural alignment. The primary exceptions are the stack pointer and malloc or alloca memory, which are aligned to 16 bytes in order to aid performance. Alignment above 16 bytes must be done manually, but since 16 bytes is a common alignment size for XMM operations, this value should work for most code. For more information about structure layout and alignment, see Types and Storage. For information about the stack layout, see x64 stack usage.

Translator's note: natural alignment means that each scalar is aligned to its own size (a 4-byte int on a 4-byte boundary, an 8-byte double on an 8-byte boundary), and a structure takes the alignment of its strictest member.
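A short sketch of the alignment rules, using standard C++11 alignof and alignas. The helper is_aligned and the types Mixed and Wide are made up for this illustration; the 16-byte malloc result is what the text describes for x64 Windows and is only reported here, not asserted, because other platforms may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: true when `p` is a multiple of `alignment` bytes.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Natural alignment: the struct takes the alignment of its strictest member.
struct Mixed {
    char tag;
    int value;
};

// Alignment above 16 bytes has to be requested explicitly.
struct alignas(32) Wide {
    float lanes[8];
};

// Reports whether one malloc result is 16-byte aligned on this platform.
inline bool malloc_is_16_byte_aligned() {
    void* p = std::malloc(64);
    const bool ok = (p != nullptr) && is_aligned(p, 16);
    std::free(p);
    return ok;
}

void check_alignment_examples() {
    static_assert(alignof(Mixed) == alignof(int), "struct follows its strictest member");
    static_assert(alignof(Wide) == 32, "alignas raises the alignment to 32");
    Mixed m{};
    Wide w{};
    assert(is_aligned(&m, alignof(Mixed)));
    assert(is_aligned(&w, 32));
}
```

The alignas specifier is the portable way to ask for the manual over-alignment the paragraph mentions; heap allocations with more than 16-byte alignment need a platform-specific or aligned allocation routine instead.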

Parameter passing

The first four integer arguments are passed in registers. Integer values are passed in left-to-right order in RCX, RDX, R8, and R9, respectively. Arguments five and higher are passed on the stack. All arguments are right-justified in registers, so the callee can ignore the upper bits of the register and access only the portion of the register necessary.

Translator's note: right-justified means that the value occupies the low-order bits of the register. The text does not say how the remaining upper bits are filled, so the callee should not rely on them.

Any floating-point and double-precision arguments in the first four parameters are passed in XMM0 - XMM3, depending on position. The integer registers RCX, RDX, R8, and R9 that would normally be used for those positions are ignored, except in the case of varargs arguments. For details, see Varargs. Similarly, the XMM0 - XMM3 registers are ignored when the corresponding argument is an integer or pointer type.
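The position-based binding described above can be sketched as a lookup: the slot is chosen by the argument's position, and the argument's kind only decides whether the integer or the XMM register of that slot is used. ArgKind and argument_location are hypothetical names for this illustration, and varargs are deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class ArgKind { Integer, Floating };  // pointers count as Integer here

// Hypothetical helper: where the argument at 0-based `position` travels.
// Positions 0..3 map to a register chosen by kind; the rest go on the stack.
inline std::string argument_location(std::size_t position, ArgKind kind) {
    static const char* const kIntRegs[] = {"RCX", "RDX", "R8", "R9"};
    static const char* const kFloatRegs[] = {"XMM0", "XMM1", "XMM2", "XMM3"};
    if (position >= 4) {
        return "stack";
    }
    return kind == ArgKind::Integer ? kIntRegs[position] : kFloatRegs[position];
}

// Mirrors a mixed call such as f(int a, double b, int c, float d).
void check_mixed_call_example() {
    assert(argument_location(0, ArgKind::Integer) == "RCX");
    assert(argument_location(1, ArgKind::Floating) == "XMM1");
    assert(argument_location(2, ArgKind::Integer) == "R8");
    assert(argument_location(3, ArgKind::Floating) == "XMM3");
}
```

Note that the double in position 1 uses XMM1, not XMM0: a slot's unused register is simply skipped rather than handed to the next argument of that kind.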

​​__m128​​ types, arrays, and strings are never passed by immediate value. Instead, a pointer is passed to memory allocated by the caller. Structs and unions of size 8, 16, 32, or 64 bits, and __m64 types, are passed as if they were integers of the same size. Structs or unions of other sizes are passed as a pointer to memory allocated by the caller. For these aggregate types passed as a pointer, including __m128, the caller-allocated temporary memory must be 16-byte aligned.

Translator's notes: "other sizes" covers every size that is not exactly 8, 16, 32, or 64 bits, so a 3-byte struct and a 12-byte struct are both passed by pointer. The 16-byte alignment required for the caller-allocated temporary matches the stack alignment described under Alignment above.
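The size rule for structs and unions reduces to a single predicate. The sketch below is an illustration only: passed_by_value_in_register and the three sample structs are hypothetical, and the sizes in the comments are what the static_assert lines check on the compiler at hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when an aggregate of `size_bytes` is passed as if
// it were an integer of the same size; false when a pointer is passed instead.
constexpr bool passed_by_value_in_register(std::size_t size_bytes) {
    return size_bytes == 1 || size_bytes == 2 || size_bytes == 4 || size_bytes == 8;
}

struct Rgb    { unsigned char r, g, b; };  // 3 bytes
struct Pair   { std::int32_t j, k; };      // 8 bytes
struct Triple { std::int32_t j, k, l; };   // 12 bytes

static_assert(sizeof(Rgb) == 3, "no padding expected between chars");
static_assert(sizeof(Pair) == 8, "two 32-bit members");
static_assert(sizeof(Triple) == 12, "three 32-bit members");

void check_aggregate_examples() {
    assert(!passed_by_value_in_register(sizeof(Rgb)));     // odd size: by pointer
    assert(passed_by_value_in_register(sizeof(Pair)));     // fits a register exactly
    assert(!passed_by_value_in_register(sizeof(Triple)));  // larger than 8: by pointer
}
```

The 3-byte case is the one that is easy to overlook: being smaller than a register is not enough, the size has to be exactly 1, 2, 4, or 8 bytes.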

Intrinsic functions that don't allocate stack space, and don't call other functions, sometimes use other volatile registers to pass additional register arguments. This optimization is made possible by the tight binding between the compiler and the intrinsic function implementation.

The callee is responsible for dumping the register parameters into their shadow space if needed.

The following table summarizes how parameters are passed:

| Parameter type | How passed |
| --- | --- |
| Floating point | First 4 parameters - XMM0 through XMM3. Others passed on stack. |
| Integer | First 4 parameters - RCX, RDX, R8, R9. Others passed on stack. |
| Aggregates (8, 16, 32, or 64 bits) and __m64 | First 4 parameters - RCX, RDX, R8, R9. Others passed on stack. |
| Aggregates (other) | By pointer. First 4 parameters passed as pointers in RCX, RDX, R8, and R9. |
| __m128 | By pointer. First 4 parameters passed as pointers in RCX, RDX, R8, and R9. |

Example of argument passing 1 - all integers

func1(int a, int b, int c, int d, int e);
// a in RCX, b in RDX, c in R8, d in R9, e pushed on stack      

Example of argument passing 2 - all floats

func2(float a, double b, float c, double d, float e);
// a in XMM0, b in XMM1, c in XMM2, d in XMM3, e pushed on stack      

Example of argument passing 3 - mixed ints and floats

func3(int a, double b, int c, float d);
// a in RCX, b in XMM1, c in R8, d in XMM3      

Example of argument passing 4 - __m64, __m128, and aggregates

func4(__m64 a, __m128 b, struct c, float d);
// a in RCX, ptr to b in RDX, ptr to c in R8, d in XMM3

Return values

A scalar return value that can fit into 64 bits is returned through RAX; this includes __m64 types. Non-scalar types including floats, doubles, and vector types such as __m128, __m128i, __m128d are returned in XMM0. The state of unused bits in the value returned in RAX or XMM0 is undefined.

User-defined types can be returned by value from global functions and static member functions. To return a user-defined type by value in RAX, it must have a length of 1, 2, 4, 8, 16, 32, or 64 bits. It must also have no user-defined constructor, destructor, or copy assignment operator; no private or protected non-static data members; no non-static data members of reference type; no base classes; no virtual functions; and no data members that do not also meet these requirements. (This is essentially the definition of a C++03 POD type. Because the definition has changed in the C++11 standard, we don't recommend using std::is_pod for this test.) Otherwise, the caller assumes the responsibility of allocating memory and passing a pointer for the return value as the first argument. Subsequent arguments are then shifted one argument to the right. The same pointer must be returned by the callee in RAX.

Example of return value 1 - 64-bit result

__int64 func1(int a, float b, int c, int d, int e);
// Caller passes a in RCX, b in XMM1, c in R8, d in R9, e pushed on stack,
// callee returns __int64 result in RAX.      

Example of return value 2 - 128-bit result

__m128 func2(float a, double b, int c, __m64 d);
// Caller passes a in XMM0, b in XMM1, c in R8, d in R9,
// callee returns __m128 result in XMM0.      

Example of return value 3 - user type result by pointer

struct Struct1 {
   int j, k, l;    // Struct1 exceeds 64 bits.
};
Struct1 func3(int a, double b, int c, float d);
// Caller allocates memory for Struct1 returned and passes pointer in RCX,
// a in RDX, b in XMM2, c in R9, d pushed on the stack;
// callee returns pointer to Struct1 result in RAX.      

Example of return value 4 - user type result by value

struct Struct2 {
   int j, k;    // Struct2 fits in 64 bits, and meets requirements for return by value.
};
Struct2 func4(int a, double b, int c, float d);
// Caller passes a in RCX, b in XMM1, c in R8, and d in XMM3;
// callee returns Struct2 result by value in RAX.
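Return value examples 3 and 4 differ only in whether a hidden return pointer occupies the first slot. The sketch below models that shift; it is an illustration under stated assumptions, not a description of compiler internals. The names ArgKind, needs_hidden_return_pointer, and argument_locations are hypothetical, and the caller of the sketch must supply the C++03 POD judgement itself, since the text advises against std::is_pod for that test.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class ArgKind { Integer, Floating };

// Hypothetical helper for user-defined return types only: a hidden pointer
// is needed unless the type is C++03-POD-like and 1, 2, 4, or 8 bytes long.
constexpr bool needs_hidden_return_pointer(std::size_t size_bytes, bool is_cxx03_pod) {
    return !(is_cxx03_pod &&
             (size_bytes == 1 || size_bytes == 2 || size_bytes == 4 || size_bytes == 8));
}

// Hypothetical model: locations of the declared arguments. With a hidden
// return pointer, slot 0 (RCX) is taken and every argument moves one slot right.
inline std::vector<std::string> argument_locations(const std::vector<ArgKind>& args,
                                                   bool hidden_return_pointer) {
    static const char* const kIntRegs[] = {"RCX", "RDX", "R8", "R9"};
    static const char* const kFloatRegs[] = {"XMM0", "XMM1", "XMM2", "XMM3"};
    std::vector<std::string> out;
    std::size_t slot = hidden_return_pointer ? 1 : 0;
    for (ArgKind kind : args) {
        if (slot >= 4) {
            out.push_back("stack");
        } else {
            out.push_back(kind == ArgKind::Integer ? kIntRegs[slot] : kFloatRegs[slot]);
        }
        ++slot;
    }
    return out;
}

// Replays examples 3 and 4: (int a, double b, int c, float d).
void check_return_examples() {
    const std::vector<ArgKind> args = {ArgKind::Integer, ArgKind::Floating,
                                       ArgKind::Integer, ArgKind::Floating};
    // Struct1 is 12 bytes, so the result travels through a hidden pointer.
    assert(needs_hidden_return_pointer(12, true));
    assert((argument_locations(args, true) ==
            std::vector<std::string>{"RDX", "XMM2", "R9", "stack"}));
    // Struct2 is 8 bytes and POD-like, so it comes back in RAX.
    assert(!needs_hidden_return_pointer(8, true));
    assert((argument_locations(args, false) ==
            std::vector<std::string>{"RCX", "XMM1", "R8", "XMM3"}));
}
```

The shifted result matches the comment in example 3 (a in RDX, b in XMM2, c in R9, d on the stack), which shows that the hidden pointer consumes a whole slot, including the XMM register that goes with it.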