
Compilation flow

Pre-processing

  • The preprocessor handles '#' directives:
    • File inclusion (#include), including nested inclusion
    • Conditional compilation (#if, #ifdef) and macro expansion (#define)
  • Use the -E flag (e.g. gcc -E) to stop after pre-processing
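
The three kinds of directive listed above can be seen together in one small file. This is a minimal sketch: the names SQUARE, LOG, VERBOSE and area_of_square are made up for illustration and do not come from any library.

```c
#include <stdio.h>   /* file inclusion: stdio.h in turn includes other headers */

/* Macro expansion: every SQUARE(x) is replaced textually before compilation. */
#define SQUARE(x) ((x) * (x))

/* Conditional compilation: only one of the two LOG definitions survives. */
#ifdef VERBOSE
#define LOG(msg) puts(msg)
#else
#define LOG(msg) ((void)0)
#endif

/* Hypothetical helper, used only for this illustration. */
int area_of_square(int side)
{
    LOG("computing area");
    return SQUARE(side);
}
```

Running gcc -E on this file prints the expanded source, where SQUARE(side) has become ((side) * (side)). Adding -DVERBOSE on the command line selects the other LOG branch.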

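Compilation

  • The compiler proper translates the pre-processed source into assembly code
  • Use the -S flag to stop after compilation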

Assembler

  • The assembler (as) translates assembly code into binary machine code
  • It creates so-called object files (in ELF format on Linux)
  • An assembler creates object code by translating combinations of mnemonics and syntax for operations and addressing modes into their numerical equivalents.
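
To make "translating mnemonics into their numerical equivalents" concrete, here is a toy lookup for four single-byte x86 instructions. It is only a sketch: a real assembler such as as also encodes operands, addressing modes, labels and relocations, and the names opcode_table and encode_mnemonic are invented for this example.

```c
#include <stddef.h>
#include <string.h>

/* One row of a (tiny) mnemonic-to-opcode table. */
struct opcode_entry {
    const char *mnemonic;
    unsigned char opcode;
};

/* Single-byte x86 instructions that take no operands. */
static const struct opcode_entry opcode_table[] = {
    { "nop",  0x90 },
    { "ret",  0xC3 },
    { "hlt",  0xF4 },
    { "int3", 0xCC },
};

/* Returns the opcode byte for a known mnemonic, or -1 if it is not in the table. */
int encode_mnemonic(const char *mnemonic)
{
    size_t count = sizeof opcode_table / sizeof opcode_table[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(opcode_table[i].mnemonic, mnemonic) == 0)
            return opcode_table[i].opcode;
    }
    return -1;
}
```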

Linker

  • The linker (ld) combines the object code with startup code and the required libraries
  • The linking process is really two steps:
    • Combining all object files into one executable file
    • Going through each object file to resolve any symbols
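
The symbol-resolution step can be modelled in a few lines. The sketch below is a deliberately simplified, hypothetical model (struct toy_object and count_unresolved are not part of any real linker): each object file is reduced to a list of symbols it defines and a list it still needs, and a symbol is resolved if any object in the set defines it.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 4

/* Hypothetical, heavily simplified model of one object file. */
struct toy_object {
    const char *name;
    const char *defined[MAX_SYMBOLS];    /* symbols this file provides; unused slots are NULL */
    const char *undefined[MAX_SYMBOLS];  /* symbols this file needs; unused slots are NULL */
};

/* Returns 1 if any object in the set defines the given symbol, else 0. */
static int is_defined(const struct toy_object *objs, size_t n, const char *symbol)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < MAX_SYMBOLS && objs[i].defined[j]; j++)
            if (strcmp(objs[i].defined[j], symbol) == 0)
                return 1;
    return 0;
}

/* Counts references that no object satisfies; 0 means the toy link succeeds. */
size_t count_unresolved(const struct toy_object *objs, size_t n)
{
    size_t unresolved = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < MAX_SYMBOLS && objs[i].undefined[j]; j++)
            if (!is_defined(objs, n, objs[i].undefined[j]))
                unresolved++;
    return unresolved;
}
```

For example, a main.o that needs add and puts, linked only with an add.o that defines add, leaves one unresolved reference (puts). A real linker would report that as an "undefined reference" error unless a library providing puts is also linked.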

Object Files

  • An object file is a file containing object code, meaning machine code in relocatable format that is usually not directly executable.
  • Object code is simply the binary representation of one specific input source file.
  • An object file format is a computer file format used for the storage of object code and related data.
  • Every object file has a section list.

Section

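  • A section contains:
    • name
    • size
    • contents (data)
    • status flags
      • loadable: whether the section needs to be loaded into memory at run time
      • allocatable: a section that has no data of its own (such as .bss) can be given this status, so that the loader reserves a block of memory for it
        • Sections that are neither loadable nor allocatable are generally used for debugging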
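
As a rough illustration of how C definitions map onto sections, the sketch below notes where GCC typically places each item on an ELF target. The placement is an assumption about common defaults and can vary with compiler, flags and platform; all variable and function names are invented for the example.

```c
/* Typical placement by GCC on an ELF target; other toolchains may differ. */

int initialized_counter = 42;         /* .data: loadable, its bytes are stored in the file */
static int zeroed_buffer[1024];       /* .bss: allocatable only, no bytes stored in the file */
const char greeting[] = "hello";      /* .rodata: loadable, read-only data */

/* The machine code of functions goes into .text (loadable). */
int sum_buffer(void)
{
    int total = 0;
    for (int i = 0; i < 1024; i++)
        total += zeroed_buffer[i];    /* every element starts out as zero */
    return total + initialized_counter;
}
```

After compiling with gcc -c, objdump -h on the resulting object file lists each section with its size and flags, where .bss normally shows ALLOC without LOAD or CONTENTS.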

Symbols

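  • Symbols help humans understand programs.
  • A primary job of the compilation process is to remove symbols (from symbol to binary).
  • A symbol maps a human-readable name to an address
  • An object file stores many symbols, collectively called the symbol table
  • Defined symbol: a name that is mapped to a memory address
  • Undefined symbol: a name that is not yet mapped to a memory address
  • The name is usually the name (identifier) of a global variable, a static variable, or a function
  • Generally, when a single C file is compiled into an object file:
    • Defined symbols: global variables, static variables, and functions
    • Undefined symbols: extern variables and external functions
  • Use objdump -t or nm to inspect symbol information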
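
A small file that contains both defined and undefined symbols makes the distinction easier to see. The names shared_total, call_count and record_value are invented for this sketch, and the nm letters in the comments are what GNU nm typically prints for an unoptimized ELF object, so treat them as an expectation and not a guarantee.

```c
#include <stdio.h>

int shared_total = 0;        /* defined, global: nm typically shows 'B' (or 'D' if non-zero) */
static int call_count = 0;   /* defined, local to this file: lowercase 'b' */

/* puts is only declared in <stdio.h>; its definition lives in libc,
   so in this object file it appears as 'U' (undefined). */

int record_value(int value)  /* defined, global function: 'T' */
{
    call_count++;
    shared_total += value;
    puts("value recorded");
    return call_count;
}
```

Compiling with gcc -c and running nm on the object file should list record_value, shared_total and call_count with addresses, and puts with no address. The linker fills in that last one later.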

Static Linking vs. Dynamic Linking

Static Linking (.a file)

  • All the library modules are copied into the final executable image. When the program is loaded, the OS places a single file into memory that contains both the program's code and the library code it references.
  • Static linking is done by the linker as the final step of the build.
  • Statically linked files consume more disk space and memory, because every executable carries its own copy of the library code.
  • With static linking, if a library changes, every program that uses it has to be relinked (after the library is recompiled).
  • Statically linked programs are usually slightly faster than their dynamic counterparts, since no symbols need to be resolved at run time.
  • Since a statically linked file contains every module it needs, library version mismatches do not arise at run time.
  • Statically linked programs have a roughly constant load time.

Dynamic Linking (.so file)

  • With dynamic linking, only the names of the external shared libraries are recorded in the executable. This lets many programs use a single copy of a library module.
  • Dynamic linking is done at run time by the dynamic linker/loader, with help from the OS.
  • Only one copy of the referenced module is stored and it is used by many programs, which saves memory and disk space.
  • When a library changes, only that single module needs to be updated and recompiled.
  • Since the library files are stored separately, there may be compatibility issues (say, one library file is compiled by a new version of the compiler).
  • Load time is variable in dynamically linked programs.
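
The same source file can be packaged either way. The sketch below assumes a hypothetical one-function library saved as mathlib.c, with a separate main.c that calls it. The commands in the comment are the usual GCC and binutils invocations on Linux, shown as a starting point and not as the only way to do it.

```c
/*
 * mathlib.c: hypothetical one-function library.
 *
 * Static library (.a):
 *   gcc -c mathlib.c -o mathlib.o
 *   ar rcs libmathlib.a mathlib.o
 *
 * Shared library (.so):
 *   gcc -fPIC -shared mathlib.c -o libmathlib.so
 *
 * Linking a program against either one:
 *   gcc main.c -L. -lmathlib
 */

/* Returns twice the given value; the body is copied into the executable
   when linked statically, and looked up at run time when linked dynamically. */
int scale_by_two(int value)
{
    return value * 2;
}
```

When both libmathlib.a and libmathlib.so are present, GNU ld picks the shared one by default, and passing -static to gcc asks for the static one. Running ldd on the resulting executable shows which shared libraries it still depends on.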
