Linux ELF in Detail, Part 2 -- Section Header & Section

Previous: ELF in Detail, Part 1 – ELF Header

ELF Section Header & Section

Let's start with the definition of the Section Header:

typedef struct {
	Elf32_Word	sh_name;
	Elf32_Word	sh_type;
	Elf32_Word	sh_flags;
	Elf32_Addr	sh_addr;
	Elf32_Off	sh_offset;
	Elf32_Word	sh_size;
	Elf32_Word	sh_link;
	Elf32_Word	sh_info;
	Elf32_Word	sh_addralign;
	Elf32_Word	sh_entsize;
} Elf32_Shdr;

typedef struct {
	Elf64_Word	sh_name; // 4 B (B for bytes)
	Elf64_Word	sh_type; // 4 B
	Elf64_Xword	sh_flags; // 8 B
	Elf64_Addr	sh_addr; // 8 B
	Elf64_Off	sh_offset; // 8 B
	Elf64_Xword	sh_size; // 8 B
	Elf64_Word	sh_link; // 4 B
	Elf64_Word	sh_info; // 4 B
	Elf64_Xword	sh_addralign; // 8 B
	Elf64_Xword	sh_entsize; // 8 B
} Elf64_Shdr; // total size: 64 B

We focus only on Elf64_Shdr, the definition used on 64-bit systems.

Use readelf to list the Section Headers of program.o.

# -S lists the Section Headers; -W widens the output
$ readelf -SW program.o
There are 14 section headers, starting at offset 0x418:

Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        0000000000000000 000040 000051 00  AX  0   0  1
  [ 2] .rela.text        RELA            0000000000000000 0002d8 0000a8 18   I 12   1  8
  [ 3] .data             PROGBITS        0000000000000000 000098 000010 00  WA  0   0  8
  [ 4] .rela.data        RELA            0000000000000000 000380 000018 18   I 12   3  8
  [ 5] .bss              NOBITS          0000000000000000 0000a8 000000 00  WA  0   0  1
  [ 6] .rodata           PROGBITS        0000000000000000 0000a8 000013 00   A  0   0  1
  [ 7] .comment          PROGBITS        0000000000000000 0000bb 000036 01  MS  0   0  1
  [ 8] .note.GNU-stack   PROGBITS        0000000000000000 0000f1 000000 00      0   0  1
  [ 9] .eh_frame         PROGBITS        0000000000000000 0000f8 000038 00   A  0   0  8
  [10] .rela.eh_frame    RELA            0000000000000000 000398 000018 18   I 12   9  8
  [11] .shstrtab         STRTAB          0000000000000000 0003b0 000066 00      0   0  1
  [12] .symtab           SYMTAB          0000000000000000 000130 000180 18     13  10  8
  [13] .strtab           STRTAB          0000000000000000 0002b0 000024 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)

Let's take another look at the ELF Header:

$ readelf -h program.o
ELF Header:
...
  Start of section headers:          1048 (bytes into file)
...
  Size of section headers:           64 (bytes)  # matches the total size in the Section Header definition
  Number of section headers:         14  # matches the number of entries in the Section Header list
  Section header string table index: 11

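Next, let's look at the contents of these Section Headers at the byte level.

The ELF Header shows that the Section header table starts at offset 1048 bytes (0x418), that each Section Header is 64 bytes, and that there are 14 of them.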

Section Header undefined

$ hexdump -C -s1048 -n64 program.o
00000418  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000458

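The Section Header at index 0 is all zeros. (In some special cases a few of its fields are non-zero; we skip those cases for now.)

This is a very special Section Header: it stands for undefined. Its index (0) also has a special name, SHN_UNDEF. We will see later how it is used.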

Section Header .text

The Section Header at index 1 starts at offset 1048 + 64 = 1112.

$ hexdump -C -s1112 -n64 program.o
00000458  20 00 00 00 01 00 00 00  06 00 00 00 00 00 00 00  | ...............|
00000468  00 00 00 00 00 00 00 00  40 00 00 00 00 00 00 00  |........@.......|
00000478  51 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |Q...............|
00000488  01 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000498

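We use this Section Header as an example to examine each field.

sh_name

Although this field is called sh_name, it actually holds an index: a byte offset into the Section header string table, which is covered later. For now, think of it as a subscript into a character array.

Data type: Elf64_Word (4 bytes)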

$ hexdump -C -s1112 -n4 program.o
00000458  20 00 00 00                                       | ...|
0000045c

Value: “20 00 00 00” read as little endian = 0x20 = 32

sh_type

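The type of the Section Header. Since a Section Header describes a Section, this can also be regarded as the type of the Section. Only the entries inside the red box of the figure matter here; the others are extensions and can be skipped for now.

[Figure: table of sh_type values]

There are many types, each with its own meaning. For example:

SHT_NULL marks the Section Header as inactive: it has no associated Section, and its other fields are undefined. The Section Header at index 0 has this type.
SHT_STRTAB means the Section holds a string table. A string table is simply a character array holding multiple strings, each terminated by ‘\0’. An object file can have several string tables, and later parts show what each of them is used for. The Section Header at index 11 (.shstrtab) has this type.

Data type: Elf64_Word (4 bytes)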

$ hexdump -C -s1116 -n4 program.o
0000045c  01 00 00 00                                       |....|
00000460

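Value: “01 00 00 00” read as little endian = 0x1

According to the figure above, 1 corresponds to SHT_PROGBITS, which means that the format and meaning of this information are defined entirely by the program. The contents can be initialized data, comments, program code and so on.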

sh_flags

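Describes the attributes of the Section. Each bit has its own meaning, and several bits can be combined.

[Figure: table of sh_flags bits]

The most common ones are:

SHF_WRITE: the data of the Section is writable while the process runs
SHF_ALLOC: the Section occupies memory while the process runs
SHF_EXECINSTR: the Section contains executable machine instructions
SHF_INFO_LINK: the sh_info field of this Section Header holds the index of another Section Header (covered in detail later)

Data type: Elf64_Xword (8 bytes)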

$ hexdump -C -s1120 -n8 program.o
00000460  06 00 00 00 00 00 00 00                           |........|
00000468

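Value: “06 00 00 00 00 00 00 00” read as little endian = 0x6 = SHF_ALLOC | SHF_EXECINSTR => this Section holds executable code and must be placed in memory while the process runs.

Now compare this with the “.text” row of the Section Header list: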

$ readelf -SW program.o
...
Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
...
  [ 1] .text             PROGBITS        0000000000000000 000040 000051 00  AX
...  
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)

The “Key to Flags” legend at the bottom of the output shows that the “AX” of “.text” stands for A (alloc) and X (execute).

sh_addr

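The virtual address at which the Section resides in the process memory image, or 0 if the Section does not appear in memory. In a Relocatable file all virtual addresses are 0; only Executable files and Shared object files have addresses computed for the Sections that need them. This is covered later.

Data type: Elf64_Addr (8 bytes)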

$ hexdump -C -s1128 -n8 program.o
00000468  00 00 00 00 00 00 00 00                           |........|
00000470

sh_offset

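The byte offset of the Section within the file.

A Section of type NOBITS (SHT_NOBITS = 8) is special: it occupies no space in the file, yet it still has an offset. The listing below shows that “.bss” and “.rodata” have the same offset, while the size of “.bss” is 0.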

$ readelf -SW program.o
...
Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
...
  [ 5] .bss              NOBITS          0000000000000000 0000a8 000000 00  WA  0   0  1
  [ 6] .rodata           PROGBITS        0000000000000000 0000a8 000013 00   A  0   0  1
...

Data type: Elf64_Off (8 bytes)

$ hexdump -C -s1136 -n8 program.o
00000470  40 00 00 00 00 00 00 00                           |@.......|
00000478

Value: “40 00 00 00 00 00 00 00” read as little endian = 0x40 = 4 * 16 = 64

Because program.o has no Program header table (as the ELF Header shows), the “.text” Section is stored in the file immediately after the 64-byte ELF Header.

sh_size

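The number of bytes the Section occupies in the file.

For a Section of type NOBITS, even a non-zero value here takes up no space in the file. The “.bss” Section has this type. bss stands for “block started by symbol” (an easier mnemonic: Better Save Space). It holds uninitialized data; since the data has no initial value, only the amount of memory it needs has to be recorded, and no storage in the file is required. In contrast, the “.data” Section holds initialized data, so the initial values must be stored in the file in order to be carried into the process memory image.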

Data type: Elf64_Xword (8 bytes)

$ hexdump -C -s1144 -n8 program.o
00000478  51 00 00 00 00 00 00 00                           |Q.......|
00000480

Value: “51 00 00 00 00 00 00 00” read as little endian = 0x51 = 5 * 16 + 1 = 81

sh_link

Holds the index of another Section Header; the exact meaning depends on the type of the Section. This is covered in detail later.

Data type: Elf64_Word (4 bytes)

$ hexdump -C -s1152 -n4 program.o
00000480  00 00 00 00                                       |....|
00000484

sh_info

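Holds extra information; the exact meaning depends on the type of the Section.

sh_type, sh_link and sh_info have to be interpreted together.

In addition, when the SHF_INFO_LINK bit is set in sh_flags, sh_info holds the index of another Section Header. This is covered later.

Data type: Elf64_Word (4 bytes)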

$ hexdump -C -s1156 -n4 program.o
00000484  00 00 00 00                                       |....|
00000488

sh_addralign

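The alignment constraint. 0 or 1 means no constraint. Any other value must be a power of two, and sh_addr % sh_addralign must equal 0. This is covered later.

Data type: Elf64_Xword (8 bytes)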

$ hexdump -C -s1160 -n8 program.o
00000488  01 00 00 00 00 00 00 00                           |........|
00000490

Value: “01 00 00 00 00 00 00 00” read as little endian = 0x1 => no constraint

sh_entsize

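Some Sections contain a table of fixed-size records. In that case sh_entsize is the size of one record, and sh_size / sh_entsize gives the number of records. If the Section does not contain such records, the value is 0. For example, the “.shstrtab” Section that we will see later holds a set of strings of different lengths, so its sh_entsize is 0.

Data type: Elf64_Xword (8 bytes)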

$ hexdump -C -s1168 -n8 program.o
00000490  00 00 00 00 00 00 00 00                           |........|
00000498

Comparing the fields above with the “.text” row of the Section Header list helps to consolidate the picture.

Next we look at the contents of a few Sections.

Section .shstrtab

This Section holds the name strings of all Section Headers.

$ readelf -SW program.o
...
Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
...
  [11] .shstrtab         STRTAB          0000000000000000 0003b0 000066 00      0   0  1
...

We can see:

offset = 0x3b0

size = 0x66 = 6 * 16 + 6 = 102

# Note: the number after -s is hexadecimal here
$ hexdump -C -s0x3b0 -n96 program.o
000003b0  00 2e 73 79 6d 74 61 62  00 2e 73 74 72 74 61 62  |..symtab..strtab|
000003c0  00 2e 73 68 73 74 72 74  61 62 00 2e 72 65 6c 61  |..shstrtab..rela|
000003d0  2e 74 65 78 74 00 2e 72  65 6c 61 2e 64 61 74 61  |.text..rela.data|
000003e0  00 2e 62 73 73 00 2e 72  6f 64 61 74 61 00 2e 63  |..bss..rodata..c|
000003f0  6f 6d 6d 65 6e 74 00 2e  6e 6f 74 65 2e 47 4e 55  |omment..note.GNU|
00000400  2d 73 74 61 63 6b 00 2e  72 65 6c 61 2e 65 68 5f  |-stack..rela.eh_|
00000410

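The right-hand column shows the name strings of the Section Headers, each terminated by ‘\0’. (With -n96 the dump covers only the first 96 of the 102 bytes, so the final “frame\0” of “.rela.eh_frame” is cut off.) hexdump prints every invisible or unprintable character as ‘.’, which unfortunately mixes the real ‘.’ characters with the unprintable ones.

Take the string “..symtab.” as an example: checking it against the hex bytes on the left shows that its first and last characters are “00”, i.e. ‘\0’, and only the second character is a real ‘.’. Do not confuse the two.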

Next, we extract the sh_name of every Section Header and check the results against the dump above.

From the ELF Header and the Section Header list we know:

The Section header table starts at offset 1048 bytes
Each Section Header is 64 bytes
There are 14 Section Headers
The sh_name field of a Section Header is 4 bytes long
The Section header string table starts at offset 0x3b0
The Section header string table is 0x66 bytes long

print_sh_names.c

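#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

/* Size of .shstrtab, copied from the readelf output for program.o. */
#define STRING_SECTION_SIZE 0x66

int main(void) {
	/* These values are also copied from the readelf output. */
	off_t headerTableOffset = 1048;    /* start of the Section header table */
	off_t headerSize = 64;             /* size of one Section Header */
	off_t stringSectionOffset = 0x3b0; /* file offset of .shstrtab */
	int sectionNum = 14;               /* number of Section Headers */
	char content[STRING_SECTION_SIZE]; /* in-memory copy of .shstrtab */
	uint32_t nameOffset;               /* sh_name; assumes a little-endian host */

	int fd = open("program.o", O_RDONLY);
	if (fd == -1)
		exit(EXIT_FAILURE);

	/* Load the whole Section header string table. */
	if (lseek(fd, stringSectionOffset, SEEK_SET) == -1)
		exit(EXIT_FAILURE);
	if (read(fd, content, sizeof content) != (ssize_t)sizeof content)
		exit(EXIT_FAILURE);
	/* A string table must end with '\0', otherwise %s could overrun it. */
	if (content[sizeof content - 1] != '\0')
		exit(EXIT_FAILURE);

	off_t currHeaderOffset = headerTableOffset;
	for (int i = 0; i < sectionNum; ++i) {
		/* sh_name is the first 4 bytes of each Section Header. */
		if (lseek(fd, currHeaderOffset, SEEK_SET) == -1)
			exit(EXIT_FAILURE);
		if (read(fd, &nameOffset, sizeof nameOffset) != (ssize_t)sizeof nameOffset)
			exit(EXIT_FAILURE);

		/* Reject offsets that point outside the string table. */
		if (nameOffset >= sizeof content)
			exit(EXIT_FAILURE);

		printf("[%2d]: off: 0x%02x,  name: %s\n", i, nameOffset, content + nameOffset);

		currHeaderOffset += headerSize;
	}

	if (close(fd) == -1)
		exit(EXIT_FAILURE);

	return 0;
}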

$ ./print_sh_names 
[ 0]: off: 0x00,  name: 
[ 1]: off: 0x20,  name: .text
[ 2]: off: 0x1b,  name: .rela.text
[ 3]: off: 0x2b,  name: .data
[ 4]: off: 0x26,  name: .rela.data
[ 5]: off: 0x31,  name: .bss
[ 6]: off: 0x36,  name: .rodata
[ 7]: off: 0x3e,  name: .comment
[ 8]: off: 0x47,  name: .note.GNU-stack
[ 9]: off: 0x5c,  name: .eh_frame
[10]: off: 0x57,  name: .rela.eh_frame
[11]: off: 0x11,  name: .shstrtab
[12]: off: 0x01,  name: .symtab
[13]: off: 0x09,  name: .strtab

The result matches the output of “readelf -SW program.o”.

Section .text

This Section holds the executable instructions of the program.

$ readelf -SW program.o
...
Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
...
  [ 1] .text             PROGBITS        0000000000000000 000040 000051 00  AX  0   0  1
...

From the above:

offset = 0x40
size = 0x51 = 5 * 16 + 1 = 81
flag = AX => occupies memory while the process runs + contains executable instructions => executable

$ hexdump -s0x40 -n81 -C program.o
00000040  55 48 89 e5 48 83 ec 10  bf 64 00 00 00 e8 00 00  |UH..H....d......|
00000050  00 00 89 c2 8b 05 00 00  00 00 01 d0 89 45 fc 8b  |.............E..|
00000060  05 00 00 00 00 89 c6 bf  00 00 00 00 b8 00 00 00  |................|
00000070  00 e8 00 00 00 00 8b 45  fc 89 c6 bf 00 00 00 00  |.......E........|
00000080  b8 00 00 00 00 e8 00 00  00 00 b8 00 00 00 00 c9  |................|
00000090  c3                                                |.|
00000091

Now use objdump to view the code:

$ objdump -d program.o

program.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <main>:
   0:	55                   	push   %rbp
   1:	48 89 e5             	mov    %rsp,%rbp
   4:	48 83 ec 10          	sub    $0x10,%rsp
   8:	bf 64 00 00 00       	mov    $0x64,%edi
   d:	e8 00 00 00 00       	callq  12 <main+0x12>
  12:	89 c2                	mov    %eax,%edx
  14:	8b 05 00 00 00 00    	mov    0x0(%rip),%eax        # 1a <main+0x1a>
  1a:	01 d0                	add    %edx,%eax
  1c:	89 45 fc             	mov    %eax,-0x4(%rbp)
  1f:	8b 05 00 00 00 00    	mov    0x0(%rip),%eax        # 25 <main+0x25>
  25:	89 c6                	mov    %eax,%esi
  27:	bf 00 00 00 00       	mov    $0x0,%edi
  2c:	b8 00 00 00 00       	mov    $0x0,%eax
  31:	e8 00 00 00 00       	callq  36 <main+0x36>
  36:	8b 45 fc             	mov    -0x4(%rbp),%eax
  39:	89 c6                	mov    %eax,%esi
  3b:	bf 00 00 00 00       	mov    $0x0,%edi
  40:	b8 00 00 00 00       	mov    $0x0,%eax
  45:	e8 00 00 00 00       	callq  4a <main+0x4a>
  4a:	b8 00 00 00 00       	mov    $0x0,%eax
  4f:	c9                   	leaveq 
  50:	c3                   	retq

The two match exactly: these bytes are the machine instructions of our main function, shown here disassembled.

Section .data

This Section holds initialized data.

$ readelf -SW program.o
...
Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
...
  [ 3] .data             PROGBITS        0000000000000000 000098 000010 00  WA  0   0  8
...

From the above:

offset = 0x98
size = 0x10 = 1 * 16 = 16
flag = WA => writable while the process runs + occupies memory while the process runs => readable and writable

$ hexdump -C -s0x98 -n16 program.o
00000098  64 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |d...............|
000000a8

Compare this with the code in program.c:

$ cat program.c
...
static char d = 'd';
char* f = (char*) function;
...

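Only 2 variables are initialized here:

d has the value ‘d’, which is 0x64 in hexadecimal. Since d is a char, it occupies only the first byte of the hexdump output, “64”.
f holds the address of the function function. Because function is defined externally, its address cannot be determined yet, so the initial value is 0. Addresses are 8 bytes on x86-64, so f corresponds to the last 8 bytes of the hexdump output, “00 00 00 00 00 00 00 00”.
Between d and f there are 7 zero bytes (“00 00 00 00 00 00 00”). They are padding that aligns f to an 8-byte boundary; alignment constraints are covered in the Symbol part.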

Section .rodata

This Section holds read-only data.

$ readelf -SW program.o
...
Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
...
  [ 6] .rodata           PROGBITS        0000000000000000 0000a8 000013 00   A  0   0  1
...

From the above:

offset = 0xa8
size = 0x13 = 1 * 16 + 3 = 19
flag = A => occupies memory while the process runs, with no W flag => read-only

$ hexdump -C -s0xa8 -n19 program.o
000000a8  61 3a 20 25 64 0a 00 72  65 73 75 6c 74 3a 20 25  |a: %d..result: %|
000000b8  64 0a 00                                          |d..|
000000bb

It contains 2 strings:

“61 3a 20 25 64 0a 00” => “a: %d\n” (“0x0a” is ‘\n’)

“72 65 73 75 6c 74 3a 20 25 64 0a 00” => “result: %d\n”

Compare this with the code in program.c:

$ cat program.c
...
int main() {
...
	printf("a: %d\n", a);
	printf("result: %d\n", d);
}

This shows that the format strings passed to printf are stored as read-only data.

Section .comment

This Section holds version control information.

$ readelf -SW program.o
...
Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
...
  [ 7] .comment          PROGBITS        0000000000000000 0000bb 000036 01  MS  0   0  1
...

From the above:

offset = 0xbb
size = 0x36 = 3 * 16 + 6 = 54
flag = MS => duplicate strings are to be merged. libfunc.so carries the same comment, and the final program contains only one copy of it.

$ hexdump -C -s0xbb -n54 program.o
000000bb  00 47 43 43 3a 20 28 55  62 75 6e 74 75 20 35 2e  |.GCC: (Ubuntu 5.|
000000cb  34 2e 30 2d 36 75 62 75  6e 74 75 31 7e 31 36 2e  |4.0-6ubuntu1~16.|
000000db  30 34 2e 31 32 29 20 35  2e 34 2e 30 20 32 30 31  |04.12) 5.4.0 201|
000000eb  36 30 36 30 39 00                                 |60609.|
000000f1

Although .text, .data, .rodata and .comment are all of type PROGBITS, each of them carries a different meaning.

Next: ELF in Detail, Part 3 – Symbol Table & Symbol
