Project Stage 1 - `-fdump-tree-all` part 1
Intro
In this post, I will write about my experiment and examination of dump files that are generated by gcc compiler. This post is written by me wiht ChatGPT's help.
Progress
What is dump files?
A dump file is essentially a snapshot of the internal data or transformations at a particular stage of the compiler's processing of a .c
file.
Let's take a look at what exactly is happing when you comiple a '.c file'
1. Input (.c file):
- You provide the .c file (your source code) to the compiler.
2. Compilation Stages:
- The compiler goes through multiple stages, including parsing, syntax tree generation, optimizations, and code generation.
- At each of these stages, the compiler might apply different transformations to make the code faster, smaller, or more efficient.
3. Dump Files:
- With specific flags (like -fdump-tree-all), you tell the compiler to save the output after each stage in a dump file.
- These dump files contain the data and structure of the program at that point in the compilation process.
What they are for and why they are important ?
According to ChatGPT, these dump files allow developers to review what the compiler did at each stage and understand the changes it made.
Let's Dump it
Now, I think it is time to dump the dump files by compiling a simple .c program. However, my gut-feeling tells me that it should not be too simple to see what happens to the code; so I added a for-loop, if-statement and some print statement. To see what happen to comments, I also added some comments in the C file.
Here is my simple C program.
#include <stdio.h>
int main() {
int sum = 0;
// For loop to add numbers from 1 to 5
for (int i = 1; i <= 5; i++) {
sum += i; // Add i to sum
// If statement to check if the number is even or odd
if (i % 2 == 0) {
printf("%d is even\n", i);
} else {
printf("%d is odd\n", i);
}
}
printf("Sum of numbers from 1 to 5 is: %d\n", sum);
return 0;
}
1. Let's first start with `-fdump-tree-all` option.
The result will be:gcc -fdump-tree-all test.c -o test
test test.c.017t.ompexp test.c.052t.profile_estimate test.c.255t.switchlower_O0
test.c test.c.021t.fixup_cfg1 test.c.056t.release_ssa test.c.263t.isel
test.c.005t.original test.c.022t.ssa test.c.057t.local-fnsummary2 test.c.266t.waccess3
test.c.006t.gimple test.c.023t.walloca1 test.c.098t.fixup_cfg3 test.c.267t.optimized
test.c.009t.omplower test.c.026t.waccess1 test.c.105t.adjust_alignment test.c.364t.statistics
test.c.010t.lower test.c.030t.fixup_cfg2 test.c.251t.veclower test.c.365t.earlydebug
test.c.013t.eh test.c.031t.local-fnsummary1 test.c.252t.cplxlower0 test.c.366t.debug
test.c.015t.cfg test.c.032t.einline test.c.253t.bitintlower0
It dumped out a lot of different files. One thing to notice is that all files have similar namings.
They contain 'number + t' in the middle. We can guess these are the steps of the compilation. For example, Let's take a look at `test.c.005t.original` file
;; Function main (null)
;; enabled by -tree-original
{
int sum = 0;
int sum = 0;
{
int i = 1;
int i = 1;
goto <D.4860>;
<D.4859>:;
sum = sum + i;
if (((unsigned int) i & 1) == 0)
{
printf ((const char * restrict) "%d is even\n", i);
}
else
{
printf ((const char * restrict) "%d is odd\n", i);
}
i++ ;
<D.4860>:;
if (i <= 5) goto <D.4859>; else goto <D.4857>;
<D.4857>:;
}
printf ((const char * restrict) "Sum of numbers from 1 to 5 is: %d\n", sum);
return 0;
}
return 0;
It doesn't look very differnt so far. However, now it has a bit assembly-looking-like code added
such as <D.4860>:; or <D.4857>:;Now, let's take a look at the next file `test.c.006t.gimple`
int main ()
{
int D.4865;
{
int sum;
sum = 0;
{
int i;
i = 1;
goto <D.4860>;
<D.4859>:
sum = sum + i;
i.0_1 = (unsigned int) i;
_2 = i.0_1 & 1;
if (_2 == 0) goto <D.4862>; else goto <D.4863>;
<D.4862>:
printf ("%d is even\n", i);
goto <D.4864>;
<D.4863>:
printf ("%d is odd\n", i);
<D.4864>:
i = i + 1;
<D.4860>:
if (i <= 5) goto <D.4859>; else goto <D.4857>;
<D.4857>:
}
printf ("Sum of numbers from 1 to 5 is: %d\n", sum);
D.4865 = 0;
return D.4865;
}
D.4865 = 0;
return D.4865;
}
It looks like the code syntax is getting similar to the assembly languages that we learned.Let's take a look at the next step `test.c.009t.omplower`
;; Function main (main, funcdef_no=0, decl_uid=4853, cgraph_uid=1, symbol_order=0)
int main ()
{
int D.4865;
... # Examtly the Same Here
D.4865 = 0;
return D.4865;
}
Let's keep going with `test.c.010t.lower` file.
;; Function main (main, funcdef_no=0, decl_uid=4853, cgraph_uid=1, symbol_order=0)
int main ()
{
int i;
int sum;
int D.4865;
sum = 0;
i = 1;
goto <D.4860>;
<D.4859>:
sum = sum + i;
i.0_1 = (unsigned int) i;
_2 = i.0_1 & 1;
if (_2 == 0) goto <D.4862>; else goto <D.4863>;
<D.4862>:
printf ("%d is even\n", i);
goto <D.4864>;
<D.4863>:
printf ("%d is odd\n", i);
<D.4864>:
i = i + 1;
<D.4860>:
if (i <= 5) goto <D.4859>; else goto <D.4857>;
<D.4857>:
printf ("Sum of numbers from 1 to 5 is: %d\n", sum);
D.4865 = 0;
goto <D.4866>;
D.4865 = 0;
goto <D.4866>;
<D.4866>:
return D.4865;
}
Therefore we are going to the next file. 'test.c.015t.cfg'
Removing basic block 9
;; 2 loops found
;;
;; Loop 0
;; header 0, latch 1
;; depth 0, outer -1
;; nodes: 0 1 2 3 4 5 6 7 8 9
;;
;; Loop 1
;; header 7, latch 6
;; depth 1, outer 0
;; nodes: 7 6 4 5 3
;; 2 succs { 7 }
;; 3 succs { 4 5 }
;; 4 succs { 6 }
;; 5 succs { 6 }
;; 6 succs { 7 }
;; 7 succs { 3 8 }
;; 8 succs { 9 }
;; 9 succs { 1 }
int main ()
{
int i;
int sum;
int D.4865;
<bb 2> :
sum = 0;
i = 1;
goto <bb 7>; [INV]
<bb 3> :
sum = sum + i;
i.0_1 = (unsigned int) i;
_2 = i.0_1 & 1;
if (_2 == 0)
goto <bb 4>; [INV]
else
goto <bb 5>; [INV]
<bb 4> :
printf ("%d is even\n", i);
goto <bb 6>; [INV]
<bb 5> :
printf ("%d is odd\n", i);
<bb 6> :
i = i + 1;
<bb 7> :
if (i <= 5)
goto <bb 3>; [INV]
else
goto <bb 8>; [INV]
<bb 8> :
printf ("Sum of numbers from 1 to 5 is: %d\n", sum);
D.4865 = 0;
<bb 9> :
<L6>:
return D.4865;
}
`test.c.017t.ompexp` this file also looks related to OpenMP, and it has the exactly the same code except that the following comments are deleted:
Removing basic block 9
;; 2 loops found
;;
;; Loop 0
;; header 0, latch 1
;; depth 0, outer -1
;; nodes: 0 1 2 3 4 5 6 7 8 9
;;
;; Loop 1
;; header 7, latch 6
;; depth 1, outer 0
;; nodes: 7 6 4 5 3
;; 2 succs { 7 }
;; 3 succs { 4 5 }
;; 4 succs { 6 }
;; 5 succs { 6 }
;; 6 succs { 7 }
;; 7 succs { 3 8 }
;; 8 succs { 9 }
;; 9 succs { 1 }
Next, we have `test.c.021t.fixup_cfg1`. This file doesn't have any changes. So I will skip it.Cheer up! A lot to go. `test.c.022t.ssa`
;; Function main (main, funcdef_no=0, decl_uid=4853, cgraph_uid=1, symbol_order=0)
int main ()
{
int i;
int sum;
int D.4865;
unsigned int i.0_1;
unsigned int _2;
int _11;
<bb 2> :
sum_7 = 0;
i_8 = 1;
goto <bb 7>; [INV]
<bb 3> :
sum_12 = sum_3 + i_4;
i.0_1 = (unsigned int) i_4;
_2 = i.0_1 & 1;
if (_2 == 0)
goto <bb 4>; [INV]
else
goto <bb 5>; [INV]
<bb 4> :
printf ("%d is even\n", i_4);
goto <bb 6>; [INV]
<bb 5> :
printf ("%d is odd\n", i_4);
<bb 6> :
i_15 = i_4 + 1;
<bb 7> :
# sum_3 = PHI <sum_7(2), sum_12(6)>
# i_4 = PHI <i_8(2), i_15(6)>
if (i_4 <= 5)
goto <bb 3>; [INV]
else
goto <bb 8>; [INV]
<bb 8> :
printf ("Sum of numbers from 1 to 5 is: %d\n", sum_3);
_11 = 0;
<bb 9> :
<L6>:
return _11;
}
# sum_3 = PHI <sum_7(2), sum_12(6)>
means that the variable sum_3
takes the value sum_7
if control flows from block <bb 2>
and sum_12
if it comes from block <bb 6>.
This mechanism is particularly useful for handling variables updated in loops, where a variable might have different values depending on which iteration or block is reached.i_4
): In SSA form, each variable is assigned only once, so variables are often "versioned" with suffixes to maintain uniqueness. Here, i_4
is one of the SSA representations of i
. Different versions (e.g., i_8
, i_4
, i_15
) denote updates at different stages in the control flow, allowing the compiler to analyze dependencies precisely.Next, `test.c.026t.waccess1`
pointer_query counters:
index cache size: 0
index entries: 0
access cache size: 0
access entries: 0
hits: 0
misses: 0
failures: 0
max_depth: 0
Next, `test.c.030t.fixup_cfg2` hasn't changed anything.
`test.c.031t.local-fnsummary1` This file contained another extra information, but the code itself hasn't changed.
IPA function summary for main/0
global time: 45.000000
self size: 19
global size: 0
min size: 0
self stack: 0
global stack: 0
size:8.000000, time:9.000000
size:2.000000, time:0.000000, executed if:(not inlined)
calls:
printf/1 function body not available
freq:1.00 loop depth: 0 size: 3 time: 12
printf/1 function body not available
freq:1.00 loop depth: 1 size: 3 time: 12
printf/1 function body not available
freq:1.00 loop depth: 1 size: 3 time: 12
Comments
Post a Comment