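Table of contents
- Compilation steps
- Compiling step by step
- Stopping at a given stage
- gcc -E -S -c
- What is in HelloWorld.i, HelloWorld.s, HelloWorld.o and HelloWorld?
- HelloWorld.i: the preprocessed file
- HelloWorld.s: the assembly file
- HelloWorld.o: the non-executable binary file
- HelloWorld: the executable binary file
- gcc options that may come in handy: -g, -masm
- gcc -masm: choosing the assembly syntax
- gcc -g: adding debug information to the executable
- The disassembly tool objdump
- Cleaning up objdump output on macOS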
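Compilation basics: from hello.c to the hello executable
Compilation steps
The process can be divided into four stages: preprocessing -> compilation -> assembly -> linking.
Preprocessing: pulls in the header files and expands macros.
Compilation: includes preprocessing; translates the C program into an assembly program.
Assembly: includes preprocessing and compilation; translates the assembly program into a linkable binary (an object file).
Linking: includes all of the above; links the object file with other libraries to form the executable program file.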
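Compiling step by step
Preprocessing, source file to preprocessed file: gcc -E HelloWorld.c -o HelloWorld.i
Compilation, preprocessed file to assembly file: gcc -S HelloWorld.i -o HelloWorld.s
Assembly, assembly file to non-executable binary file: gcc -c HelloWorld.s -o HelloWorld.o
Linking, non-executable binary file to executable binary file: gcc HelloWorld.o -o HelloWorld
Note: why can the non-executable binary file not be run? Because it has not yet been through the linker, so references to external symbols such as printf are still unresolved.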
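Stopping at a given stage
Stop after preprocessing (preprocessed file): gcc -E HelloWorld.c -o HelloWorld.i
Stop after compilation (assembly file): gcc -S HelloWorld.c -o HelloWorld.s
Stop after assembly (non-executable binary file): gcc -c HelloWorld.c -o HelloWorld.o
Run all stages (executable binary file): gcc HelloWorld.c -o HelloWorld
[Figure: diagram of the compilation stages]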
gcc -E -S -c
-E Only run the preprocessor
-S Only run preprocess and compilation steps
-c Only run preprocess, compile, and assemble steps
What is in HelloWorld.i, HelloWorld.s, HelloWorld.o and HelloWorld?
The following program, HelloWorld.c, is used as the source file:
#include "stdio.h"
int main(int argc, char const *argv[])
{
int a=1;
int b=2;
int c=3;
printf("Hello World!\n");
return 0;
}
HelloWorld.i: the preprocessed file
# 1 "HelloWorld.c"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 361 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "HelloWorld.c" 2
# 1 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/stdio.h" 1 3 4
# 64 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/stdio.h" 3 4
# 1 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/_stdio.h" 1 3 4
# 68 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/_stdio.h" 3 4
# 1 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/sys/cdefs.h" 1 3 4
# 608 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/sys/cdefs.h" 3 4
# 1 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/sys/_symbol_aliasing.h" 1 3 4
# 609 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/sys/cdefs.h" 2 3 4
# 674 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/sys/cdefs.h" 3 4
# 1 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/sys/_posix_availability.h" 1 3 4
# 675 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/sys/cdefs.h" 2 3 4
# 69 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/_stdio.h" 2 3 4
# 1 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/Availability.h" 1 3 4
# 242 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/Availability.h" 3 4
# 1 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/AvailabilityInternal.h" 1 3 4
# 243 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/Availability.h" 2 3 4
# 70 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/_stdio.h" 2 3 4
# 1 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/_types.h" 1 3 4
# 27 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/_types.h" 3 4
# 1 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/sys/_types.h" 1 3 4
# 33 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/sys/_types.h" 3 4
# 1 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/machine/_types.h" 1 3 4
# 32 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/machine/_types.h" 3 4
# 1 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/i386/_types.h" 1 3 4
# 37 "/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/i386/_types.h" 3 4
... (a lot of output omitted)
__attribute__((__availability__(swift, unavailable, message="Use mkstemp(3) instead.")))
__attribute__((deprecated("This function is provided for compatibility reasons only. Due to security concerns inherent in the design of tempnam(3), it is highly recommended that you use mkstemp(3) instead.")))
char *tempnam(const char *__dir, const char *__prefix) __asm("_" "tempnam" );
int main(int argc, char const *argv[])
{
int a=1;
int b=2;
int c=3;
printf("Hello World!\n");
return 0;
}
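As the output shows, HelloWorld.i contains the contents of the .h header files, inserted ahead of the original source code.
HelloWorld.s: the assembly file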
.section __TEXT,__text,regular,pure_instructions
.build_version macos, 10, 14 sdk_version 10, 14
.globl _main ## -- Begin function main
.p2align 4, 0x90
_main: ## @main
.cfi_startproc
## %bb.0:
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset %rbp, -16
movq %rsp, %rbp
.cfi_def_cfa_register %rbp
subq $32, %rsp
movl $0, -4(%rbp)
movl %edi, -8(%rbp)
movq %rsi, -16(%rbp)
movl $1, -20(%rbp)
movl $2, -24(%rbp)
movl $3, -28(%rbp)
leaq L_.str(%rip), %rdi
movb $0, %al
callq _printf
xorl %ecx, %ecx
movl %eax, -32(%rbp) ## 4-byte Spill
movl %ecx, %eax
addq $32, %rsp
popq %rbp
retq
.cfi_endproc
## -- End function
.section __TEXT,__cstring,cstring_literals
L_.str: ## @.str
.asciz "Hello World!\n"
.subsections_via_symbols
This is assembly code in AT&T syntax.
HelloWorld.o: the non-executable binary file
Offset: 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000: CF FA ED FE 07 00 00 01 03 00 00 00 01 00 00 00 Ozm~............
00000010: 04 00 00 00 08 02 00 00 00 20 00 00 00 00 00 00 ................
00000020: 19 00 00 00 88 01 00 00 00 00 00 00 00 00 00 00 ................
...
00000300: 00 00 00 00 00 00 00 00 07 00 00 00 01 00 00 00 ................
00000310: 00 00 00 00 00 00 00 00 00 5F 6D 61 69 6E 00 5F ........._main._
00000320: 70 72 69 6E 74 66 00 00 printf..
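This file holds the machine instructions that the CPU will eventually read and execute, wrapped in a Mach-O header and followed by a symbol table (the names _main and _printf are visible at the end of the dump).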
HelloWorld: the executable binary file
Offset: 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000: CF FA ED FE 07 00 00 01 03 00 00 80 02 00 00 00 Ozm~............
00000010: 0F 00 00 00 C0 04 00 00 85 00 20 00 00 00 00 00 ....@..... .....
00000020: 19 00 00 00 48 00 00 00 5F 5F 50 41 47 45 5A 45 ....H...__PAGEZE
00000030: 52 4F 00 00 00 00 00 00 00 00 00 00 00 00 00 00 RO..............
....
00001fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00001fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00001ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00002000: 11 22 10 51 00 00 00 00 11 40 64 79 6C 64 5F 73 .".Q.....@dyld_s
00002010: 74 75 62 5F 62 69 6E 64 65 72 00 51 72 00 90 00 tub_binder.Qr...
00002020: 72 10 11 40 5F 70 72 69 6E 74 66 00 90 00 00 00 r..@_printf.....
00002030: 00 01 5F 00 05 00 02 5F 6D 68 5F 65 78 65 63 75 .._...._mh_execu
00002040: 74 65 5F 68 65 61 64 65 72 00 21 6D 61 69 6E 00 te_header.!main.
00002050: 25 02 00 00 00 03 00 C0 1E 00 00 00 00 00 00 00 %......@........
00002060: C0 1E 00 00 00 00 00 00 02 00 00 00 0F 01 10 00 @...............
00002070: 00 00 00 00 01 00 00 00 16 00 00 00 0F 01 00 00 ................
00002080: 40 0F 00 00 01 00 00 00 1C 00 00 00 01 00 00 01 @...............
00002090: 00 00 00 00 00 00 00 00 24 00 00 00 01 00 00 01 ........$.......
000020a0: 00 00 00 00 00 00 00 00 02 00 00 00 03 00 00 00 ................
000020b0: 00 00 00 40 02 00 00 00 20 00 5F 5F 6D 68 5F 65 ...@.... .__mh_e
000020c0: 78 65 63 75 74 65 5F 68 65 61 64 65 72 00 5F 6D xecute_header._m
000020d0: 61 69 6E 00 5F 70 72 69 6E 74 66 00 64 79 6C 64 ain._printf.dyld
000020e0: 5F 73 74 75 62 5F 62 69 6E 64 65 72 00 00 00 00 _stub_binder....
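The dump of HelloWorld.o ends at offset 00000320, whereas the dump of the HelloWorld executable ends at 000020e0.
The executable is clearly much larger than HelloWorld.o. The linker has added the information needed to load the program and bind it to library functions (dyld_stub_binder and _printf both appear in the dump), along with long runs of zero padding, which is why the file is so much bigger.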
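That completes the analysis of the whole path from HelloWorld.c to the HelloWorld executable. It turned out to be quite fun and rewarding.
Next, let's play with disassembly.
gcc options that may come in handy: -g, -masm
I came across these two options, -g and -masm, by accident.
gcc -masm: choosing the assembly syntax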
$ gcc -S -masm=intel HelloWorld.c -o HelloWorld.s
.section __TEXT,__text,regular,pure_instructions
.build_version macos, 10, 14 sdk_version 10, 14
.intel_syntax noprefix
.globl _main ## -- Begin function main
.p2align 4, 0x90
_main: ## @main
.cfi_startproc
## %bb.0:
push rbp
.cfi_def_cfa_offset 16
.cfi_offset rbp, -16
mov rbp, rsp
.cfi_def_cfa_register rbp
sub rsp, 32
mov dword ptr [rbp - 4], 0
mov dword ptr [rbp - 8], edi
mov qword ptr [rbp - 16], rsi
mov dword ptr [rbp - 20], 1
mov dword ptr [rbp - 24], 2
mov dword ptr [rbp - 28], 3
lea rdi, [rip + L_.str]
mov al, 0
call _printf
xor ecx, ecx
mov dword ptr [rbp - 32], eax ## 4-byte Spill
mov eax, ecx
add rsp, 32
pop rbp
ret
.cfi_endproc
## -- End function
.section __TEXT,__cstring,cstring_literals
L_.str: ## @.str
.asciz "Hello World!\n"
.subsections_via_symbols
gcc -g: adding debug information to the executable
softwaredeMacBook-Pro:gcc software$ gcc -c -g HelloWorld.c -o HelloWorld.o
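The disassembly tool objdump
objdump on macOS is not very friendly. I lost two hours to it, so I have summarized the painful journey below for everyone's reference.
The objdump on macOS comes from LLVM, whereas the objdump on other platforms such as Windows and Linux is the GNU one.
LLVM objdump documentation: https://llvm.org/docs/CommandGuide/llvm-objdump.html
GNU objdump documentation: https://sourceware.org/binutils/docs/binutils/objdump.html
First, look at objdump --version on macOS: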
softwaredeMacBook-Pro:~ software$ objdump --version
Apple LLVM version 10.0.1 (clang-1001.0.46.4)
Optimized build.
Default target: x86_64-apple-darwin18.7.0
Host CPU: skylake
Registered Targets:
aarch64 - AArch64 (little endian)
aarch64_be - AArch64 (big endian)
arm - ARM
arm64 - ARM64 (little endian)
armeb - ARM (big endian)
thumb - Thumb
thumbeb - Thumb (big endian)
x86 - 32-bit X86: Pentium-Pro and above
x86-64 - 64-bit X86: EM64T and AMD64
We can now use objdump to disassemble the binary file HelloWorld.o (or HelloWorld) back into assembly code:
softwaredeMacBook-Pro:gcc software$ objdump -d HelloWorld.o
HelloWorld.o: file format Mach-O 64-bit x86-64
Disassembly of section __TEXT,__text:
_main:
0: 55 pushq %rbp
1: 48 89 e5 movq %rsp, %rbp
4: 48 83 ec 20 subq $32, %rsp
8: c7 45 fc 00 00 00 00 movl $0, -4(%rbp)
f: 89 7d f8 movl %edi, -8(%rbp)
12: 48 89 75 f0 movq %rsi, -16(%rbp)
16: c7 45 ec 01 00 00 00 movl $1, -20(%rbp)
1d: c7 45 e8 02 00 00 00 movl $2, -24(%rbp)
24: c7 45 e4 03 00 00 00 movl $3, -28(%rbp)
2b: 48 8d 3d 14 00 00 00 leaq 20(%rip), %rdi
32: b0 00 movb $0, %al
34: e8 00 00 00 00 callq 0 <_main+0x39>
39: 31 c9 xorl %ecx, %ecx
3b: 89 45 e0 movl %eax, -32(%rbp)
3e: 89 c8 movl %ecx, %eax
40: 48 83 c4 20 addq $32, %rsp
44: 5d popq %rbp
45: c3 retq
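Hard on the eyes, isn't it? That is the default output of LLVM's objdump. Having fallen into this pit, let's find a way out of it.
First, an explanation of the columns, from left to right:
_main: the label
0, 1, 4, 8: the address (offset) of each instruction
55: the machine code
pushq: the assembly instruction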
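So I decided to clean up this ugly output. There are two things to fix: 1. switch the assembly syntax to Intel, and 2. get rid of the misaligned columns.
Cleaning up objdump output on macOS
Run the following command: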
objdump -d --no-show-raw-insn -S -x86-asm-syntax=intel hello.o
Output:
hello.o: file format Mach-O 64-bit x86-64
Disassembly of section __TEXT,__text:
_main:
; {
0: push rbp
1: mov rbp, rsp
4: sub rsp, 32
8: mov dword ptr [rbp - 4], 0
f: mov dword ptr [rbp - 8], edi
12: mov qword ptr [rbp - 16], rsi
; int a=1;
16: mov dword ptr [rbp - 20], 1
; int b=2;
1d: mov dword ptr [rbp - 24], 2
; int c=3;
24: mov dword ptr [rbp - 28], 3
; printf("Hello World!\n");
2b: lea rdi, [rip + 20]
32: mov al, 0
34: call 0 <_main+0x39>
39: xor ecx, ecx
; return 0;
3b: mov dword ptr [rbp - 32], eax
3e: mov eax, ecx
40: add rsp, 32
44: pop rbp
45: ret
Much cleaner, isn't it?
The line "hello.o: file format Mach-O 64-bit x86-64" means the binary file is in Mach-O 64-bit format.
Note:
gcc hello.c -g -c -o hello.o
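The -g option is required so that debug information is stored in hello.o. Without it, objdump -S has no source lines to interleave with the disassembly.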
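That wraps up the summary. My thinking feels clearer, and I am one step closer to writing my own operating system.
This is the end of my write-up. If you are interested in writing an operating system yourself, you can visit the column linked below, and we can all learn and improve together: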