Reposted from

http://thomas-sanchez.net/computer-sciences/2011/08/15/what-every-c-programmer-should-know-the-hard-part/

What every C++ programmer should know, The hard part

Previously, I explained how C++ handles classes and the inheritance between them. However, I did not cover how virtual is handled.

It adds a lot of complexity. C++ is compiled, and when a binary is linked against a library they have to speak the same language: they have to share the same ABI. The C++ creators had to find a way to make metadata about the manipulated classes available throughout the program's lifetime.

They chose Virtual Tables.

The Virtual Table

When a C++ program is compiled, the binary embeds some information about the classes manipulated by the program. When a class inherits from an interface, the actual implementation of each method must always be reachable. Virtual Tables (VTables) are generated during compilation; they can be seen as arrays of method pointers.

Let’s take an example:

#include <iostream>

struct Interface
{
    Interface() : i(0x424242) {}
    virtual void test_method() = 0;
    virtual ~Interface() {}
    int i;
};

struct Daughter : public Interface
{
    void test_method()
    {
        std::cout << "This is a call to the method" << std::endl;
        std::cout << "This: " << this << std::endl;
    }
};

int main()
{
    Daughter* d = new Daughter;
    Interface* i = d;

    i->test_method();

    std::cout << sizeof(Daughter) << std::endl;
    // First 8 bytes of the object: the VTable pointer.
    std::cout << *((void**)i) << std::endl;
    // Next 8 bytes: the field i (0x424242) plus padding.
    std::cout << ((void**)i)[1] << std::endl;
}

As a reminder, all the tests were run on 64-bit Linux.

The size of a Daughter instance is not 8 bytes, as we could expect, but 16. The memory dump shows that the first field of the class is not the value of our field but a strange value, and our field comes right after it. This 'strange' value is actually a pointer, and in fact it points inside our binary.

nm -C test | grep 400d
0000000000400de0 V vtable for Daughter

I will explain later why there is a difference of a few bytes between the printed pointer and the address of the symbol. So this pointer gives the location of the Daughter VTable. We can now check its content.

As I said, a VTable is a kind of array of method pointers.

Getting a pointer to it is simple:

size_t** vtable = *(size_t***)i;
std::cout << vtable[0] << std::endl;

And if we check the new address printed on the output, we can see that it is actually the pointer to our method.

nm -C test | grep -E 400c6a
0000000000400c6a W Daughter::test_method()

We can play a little bit more to test this further:

typedef void (*VtablePtr) (Daughter*);
VtablePtr ptr = (VtablePtr)vtable[0];
ptr(d);

VTables are built during compilation. When the compiler sees a virtual method in a class, it starts constructing a VTable associated with this class. When another class inherits from it, the derived class automatically gets its own copy of the VTable and a pointer to it. Each entry of the VTable is filled in when the actual definition of the method is encountered, and it is always the last definition (the most derived override) that is kept.

The index of a method in the VTable follows the order in which the virtual methods are declared in the source file. That's why it is very important that every part of a project is compiled with consistent headers. It is always embarrassing when the wrong method gets called in a project without anyone knowing why…

Here is the complete code:

#include <cstddef>
#include <iostream>

struct Interface
{
    Interface() : i(0x424242) {}
    virtual void test_method() = 0;
    virtual ~Interface() {}
    int i;
};

struct Daughter : public Interface
{
    void test_method()
    {
        std::cout << "This is a call to the method" << std::endl;
        std::cout << "This: " << this << std::endl;
    }
};

int main()
{
    Daughter* d = new Daughter;
    Interface* i = d;

    i->test_method();

    std::cout << sizeof(Daughter) << std::endl;
    // First 8 bytes of the object: the VTable pointer.
    std::cout << *((void**)i) << std::endl;
    // Next 8 bytes: the field i (0x424242) plus padding.
    std::cout << ((void**)i)[1] << std::endl;

    // Read the VTable pointer stored at the start of the object.
    size_t** vtable = *(size_t***)i;
    std::cout << vtable[0] << std::endl;

    // Call the first VTable entry by hand, passing d as the this pointer.
    typedef void (*VtablePtr)(Daughter*);
    VtablePtr ptr = (VtablePtr)vtable[0];
    ptr(d);
}

In conclusion, once virtual appears, an instance should be seen like this:

VPTR
Base1
Daughter

And the instance is sizeof(void*)*nb_of_vptr bytes heavier.

Virtual in multiple inheritance

As usual, we are going to start with a trivial piece of code:

#include <iostream>

struct Mother
{
    virtual void mother() = 0;
    virtual ~Mother() {}
    int i;
};

struct Father
{
    virtual void father() = 0;
    virtual ~Father() {}
    int j;
};

struct Daughter : public Mother, public Father
{
    void mother()
    { std::cout << "Mother: " << this << std::endl; }

    void father()
    { std::cout << "Father: " << this << std::endl; }

    int k;
};

int main()
{
    Daughter* d = new Daughter;
    Mother* m = d;
    Father* f = d;

    std::cout << "Daughter: " << (void*)d << std::endl;
    std::cout << "Father : " << (void*)f << std::endl;
    std::cout << sizeof(*d) << std::endl;

    // First word of each subobject: its VTable pointer.
    std::cout << *((void**)d) << std::endl;
    std::cout << *((void**)f) << std::endl;
}

As you can see, the two tables used are different. When objects are manipulated, it is not always (never?) through the concrete type but through the abstract one. With multiple inheritance it can be a Mother or a Father instance, so when a Father is used and the actual implementation is in Daughter, the method must still be reachable. That's why there is another VTable pointer.

However, when an instance of type Daughter is used through a Father pointer, a Daughter method cannot be called directly. Indeed, the instance pointer needs to be adjusted to match a Daughter instance. To solve this problem, there are thunk functions.

If we print the first entry of the VTable and disassemble the code at this location, we get this:

0000000000400cf4 <non-virtual thunk to Daughter::father()>:
  400cf4: 48 83 ef 10    sub $0x10,%rdi
  400cf8: eb 00          jmp 400cfa <Daughter::father()>

These two instructions perform the pointer adjustment by subtracting the size of the Mother class (so that the pointer matches the Daughter instance again). Therefore, with multiple inheritance and virtual methods, a call can easily go through several indirections:

Get the VTable;
Move to the wanted method (apply an offset on the VTable pointer, for example 8 to get the second method);
Call the method;
Adjust the this pointer;
Jump to the actual method definition.

Method Pointer

Yes, method pointers have a cost. Unlike C, where function pointers have no overhead, C++ has to deal with two questions:

From which instance the method is accessed;
Is the method virtual?

The first point requires a pointer adjustment. The second point, well, a lot of things.

First, the size of a method pointer is 16 bytes (against 8 for a function pointer in C). The method pointer is made of three parts:

Offset
Address/index
virtual?

The first one is stored on 8 bytes, and so is the second. The third part is a single bit merged into the second one. If this lowest bit is set, the second part should be read as an index (the position of the method in the VTable), otherwise it is the address of the method.

Therefore, calling through a method pointer requires about 20 asm instructions (in the worst case):

Get the offset to apply on the instance pointer;
Apply it;
Check if we call a virtual member function;
If yes, subtract 1;
Get the VTable;
Get the method address;
Call the method.

Conclusion

In a future article I'll cover the VTable prefix and virtual inheritance, but they are less common in C++ code. In these two articles I tried to shed some light on C++'s internal mechanisms. C++ is a fast language, but it can become much less efficient because of complex class relationships. I am not saying "don't use virtual and method pointers"; I think programmers should be aware of these costs.

I think readability is more important than performance. Yes, you can have a lot of overhead in C++, but it will still be more efficient than a lot of languages. But sometimes you can avoid virtual dispatch. For example, the common way for a beginner (and sometimes for less-beginner C++ programmers) to build an abstraction is to define an interface and, for each implementation, define a new class which inherits from this interface.

Sometimes it is the right thing to do, sometimes not. If you are asked to write a filesystem abstraction for Linux and Windows and you follow the approach described above, you'll write an iFS interface, a WindowsFS and a LinuxFS. It'll work well, but you can do even better: you can write WindowsFS and LinuxFS and define a new type FS according to the platform where the code is compiled. On Linux, we could imagine something like this:

typedef LinuxFS FS;

With code like this, you'll avoid some of the overhead due to the interface. This works well for abstracting platform-specific features, but it does not work for data abstraction, where you'll still need an interface.

Here are some resources:

CRTP
Wikipedia

non-virtual thunk for Virtual Function in multiple inheritance
