Page 118 - C-Language

In general the compiler takes four steps when converting a .c file into an executable:

1. pre-processing - textually expands #include directives and #define macros in your .c file
2. compilation - converts the program into assembly (you can stop the compiler at this step by
adding the -S option)
3. assembly - converts the assembly into machine code
4. linkage - links the object code to external libraries to create an executable

Note also that the name of the compiler we are using is GCC, which stands for both "GNU C
Compiler" and "GNU Compiler Collection", depending on context. Other C compilers exist. On
Unix-like operating systems, the compiler is usually also available under the name cc, for
"C compiler", which is often a symbolic link to some other compiler. On Linux systems, cc is
often an alias for GCC. On macOS (formerly OS X), it points to clang.

The POSIX standard (as of POSIX.1-2008) mandates c99 as the name of a C compiler; it supports
the C99 standard by default. Earlier versions of POSIX mandated c89 as the compiler. POSIX also
mandates that this compiler understand the options -c and -o that we used above.

Note: The -Wall option present in both gcc examples tells the compiler to print warnings about
questionable constructions, which is strongly recommended. It is also a good idea to add other
warning options, e.g. -Wextra.

The Translation Phases

The C 2011 Standard (§5.1.1.2, Translation Phases) specifies that the translation of source code
to a program image (e.g., the executable) occurs in 8 ordered steps.

1. The source file input is mapped to the source character set (if necessary). Trigraphs are
replaced in this step.
2. Continuation lines (lines that end with \) are spliced with the next line.
3. The source code is decomposed into preprocessing tokens and whitespace. Each comment is
replaced by a single space.
4. The preprocessor is applied, which executes directives, expands macros, and applies
pragmas. Each source file pulled in by #include undergoes translation phases 1 through 4
(recursively if necessary). All preprocessor-related directives are then deleted.
5. Source character set values and escape sequences in character constants and string literals
are mapped to the execution character set.
6. String literals adjacent to each other are concatenated.
7. Each preprocessing token is converted into a token. The resulting tokens are analyzed and
translated as a translation unit.
8. External references are resolved, and the program image is formed.

An implementation of a C compiler may combine several steps together, but the resulting image
must still behave as if the above steps had occurred separately in the order listed above.

Read Compilation online: https://riptutorial.com/c/topic/1337/compilation
