Project Stage 1 - `-fdump-rtl-all`

Intro

In the last post, we discussed the `-fdump-tree-all` option and the files it generates. In this post, we will run `gcc -fdump-rtl-all test.c -o test` and try to work out what each pass does from the files it dumps.

Progress

Let's start with the `-fdump-rtl-all` option.

$ gcc -fdump-rtl-all test.c -o test
The command produces the following files:
test                        test.c.271r.jump             test.c.310r.mode_sw           test.c.329r.jump2                test.c.359r.shorten
test.c                      test.c.283r.reginfo          test.c.311r.asmcons           test.c.349r.zero_call_used_regs  test.c.360r.nothrow
test.c.268r.expand          test.c.306r.outof_cfglayout  test.c.318r.ira               test.c.350r.alignments           test.c.361r.dwarf2
test.c.269r.vregs           test.c.307r.split1           test.c.319r.reload            test.c.354r.barriers             test.c.362r.final
test.c.270r.into_cfglayout  test.c.309r.dfinit           test.c.326r.pro_and_epilogue  test.c.356r.split5               test.c.363r.dfinish
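
The listing above is printed in columns, so the pass order is hard to read off. Here is a minimal Python sketch that sorts the dump names by their pass number. The helper name `rtl_passes_in_order` is made up for this post, and the name layout (`<source>.<number><kind>.<pass>`, with `r` marking an RTL dump) is an assumption based only on the file names shown here.

```python
import re

# Assumed layout of a dump name: "<source>.<number><kind>.<pass>",
# for example "test.c.268r.expand". The "r" kind marks an RTL dump.
DUMP_NAME = re.compile(
    r"^(?P<source>.+)\.(?P<number>\d+)(?P<kind>[a-z])\.(?P<pass_name>\w+)$"
)


def rtl_passes_in_order(filenames):
    """Return (number, pass_name) pairs for RTL dumps, sorted by number."""
    passes = []
    for name in filenames:
        match = DUMP_NAME.match(name)
        # Skip the binary, the source file, and any non-RTL dump.
        if match and match.group("kind") == "r":
            passes.append((int(match.group("number")), match.group("pass_name")))
    return sorted(passes)


if __name__ == "__main__":
    listing = [
        "test", "test.c", "test.c.271r.jump", "test.c.268r.expand",
        "test.c.269r.vregs", "test.c.270r.into_cfglayout",
    ]
    for number, pass_name in rtl_passes_in_order(listing):
        print(number, pass_name)
```

Sorted this way, `expand`, `vregs` and `into_cfglayout` come first, which is the order this post walks through them.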

Let's take a look at the first RTL file, `test.c.268r.expand`:


;; Function main (main, funcdef_no=0, decl_uid=4853, cgraph_uid=1, symbol_order=0)


;; Generating RTL for gimple basic block 2

;; Generating RTL for gimple basic block 3

;; Generating RTL for gimple basic block 4

;; Generating RTL for gimple basic block 5

;; Generating RTL for gimple basic block 6

;; Generating RTL for gimple basic block 7

;; Generating RTL for gimple basic block 8

;; Generating RTL for gimple basic block 9


try_optimize_cfg iteration 1

Merging block 3 into block 2...
Merged blocks 2 and 3.
Merged 2 and 3 without moving.
Merging block 10 into block 9...
Merged blocks 9 and 10.
Merged 9 and 10 without moving.
Removing jump 54.
Merging block 11 into block 9...
Merged blocks 9 and 11.
Merged 9 and 11 without moving.


try_optimize_cfg iteration 2



;;
;; Full RTL generated for this function:
;;
(note 1 0 3 NOTE_INSN_DELETED)
(note 3 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note 2 3 5 2 NOTE_INSN_FUNCTION_BEG)
(insn 5 2 6 2 (set (mem/c:SI (plus:DI (reg/f:DI 96 virtual-stack-vars)
                (const_int -4 [0xfffffffffffffffc])) [1 sum+0 S4 A32])
        (const_int 0 [0])) "test.c":4:9 -1
     (nil))
(insn 6 5 7 2 (set (reg:SI 105)
        (const_int 1 [0x1])) "test.c":7:14 -1
     (nil))
(insn 7 6 8 2 (set (mem/c:SI (plus:DI (reg/f:DI 96 virtual-stack-vars)
                (const_int -8 [0xfffffffffffffff8])) [1 i+0 S4 A64])
        (reg:SI 105)) "test.c":7:14 -1
     (nil))
(jump_insn 8 7 9 2 (set (pc)
        (label_ref 37)) "test.c":7:5 -1
     (nil)
 -> 37)
(barrier 9 8 39)
(code_label 39 9 10 4 5 (nil) [1 uses])
(note 10 39 11 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(insn 11 10 12 4 (set (reg:SI 107)
        (mem/c:SI (plus:DI (reg/f:DI 96 virtual-stack-vars)
                (const_int -4 [0xfffffffffffffffc])) [1 sum+0 S4 A32])) "test.c":8:13 -1
     (nil))
(insn 12 11 13 4 (set (reg:SI 108)
        (mem/c:SI (plus:DI (reg/f:DI 96 virtual-stack-vars)
                (const_int -8 [0xfffffffffffffff8])) [1 i+0 S4 A64])) "test.c":8:13 -1
     (nil))
(insn 13 12 14 4 (set (reg:SI 106 [ sum_12 ])
        (plus:SI (reg:SI 107)
            (reg:SI 108))) "test.c":8:13 -1
     (nil))
(insn 14 13 15 4 (set (mem/c:SI (plus:DI (reg/f:DI 96 virtual-stack-vars)
                (const_int -4 [0xfffffffffffffffc])) [1 sum+0 S4 A32])
        (reg:SI 106 [ sum_12 ])) "test.c":8:13 -1
     (nil))
(insn 15 14 16 4 (set (reg:SI 101 [ i.0_1 ])
        (mem/c:SI (plus:DI (reg/f:DI 96 virtual-stack-vars)
                (const_int -8 [0xfffffffffffffff8])) [1 i+0 S4 A64])) "test.c":11:19 -1
     (nil))
(insn 16 15 17 4 (set (reg:SI 102 [ _2 ])
        (and:SI (reg:SI 101 [ i.0_1 ])
            (const_int 1 [0x1]))) "test.c":11:19 -1
     (nil))
(insn 17 16 18 4 (set (reg:CC 66 cc)
        (compare:CC (reg:SI 102 [ _2 ])
            (const_int 0 [0]))) "test.c":11:12 -1
     (nil))
(jump_insn 18 17 19 4 (set (pc)
        (if_then_else (ne (reg:CC 66 cc)
                (const_int 0 [0]))
            (label_ref 26)
            (pc))) "test.c":11:12 -1
     (nil)
 -> 26)
(note 19 18 20 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
(insn 20 19 21 5 (set (reg:SI 1 x1)
        (mem/c:SI (plus:DI (reg/f:DI 96 virtual-stack-vars)
                (const_int -8 [0xfffffffffffffff8])) [1 i+0 S4 A64])) "test.c":12:13 -1
     (nil))
(insn 21 20 22 5 (set (reg:DI 109)
        (high:DI (symbol_ref/f:DI ("*.LC0") [flags 0x2]  <var_decl 0xffffabf476c0 *.LC0>))) "test.c":12:13 -1
     (nil))
(insn 22 21 23 5 (set (reg:DI 0 x0)
        (lo_sum:DI (reg:DI 109)
            (symbol_ref/f:DI ("*.LC0") [flags 0x2]  <var_decl 0xffffabf476c0 *.LC0>))) "test.c":12:13 -1
     (expr_list:REG_EQUAL (symbol_ref/f:DI ("*.LC0") [flags 0x2]  <var_decl 0xffffabf476c0 *.LC0>)
        (nil)))
Wow, this is long and daunting, and I didn't even paste the whole file; this is only about half of it. I can't tell what is what, and it looks impossible to understand. According to the GCC docs, it "Dumps after RTL generation".
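
One thing that makes the dump a little less daunting is the run of numbers at the start of each instruction. As far as I can tell, in `(insn 5 2 6 2 ...` they are the instruction's own id, the previous instruction's id, the next instruction's id, and the basic block number. The Python sketch below pulls those fields out; `parse_header` is a hypothetical helper written for this post, and the field order is my reading of the dump rather than something quoted from the GCC docs.

```python
import re

# Matches the start of an RTL instruction line such as "(insn 5 2 6 2 (set ...".
# Assumed field order: kind, own id, previous id, next id, basic block.
# Some entries (e.g. barrier, NOTE_INSN_DELETED) have no basic block number.
INSN_HEADER = re.compile(
    r"^\((insn|jump_insn|call_insn|note|code_label|barrier)"
    r" (\d+) (\d+) (\d+)(?: (\d+))?"
)


def parse_header(line):
    """Return the header fields of an RTL line, or None if it is not a header."""
    match = INSN_HEADER.match(line)
    if match is None:
        return None
    kind, uid, prev_uid, next_uid, block = match.groups()
    return {
        "kind": kind,
        "uid": int(uid),
        "prev": int(prev_uid),
        "next": int(next_uid),
        "bb": int(block) if block is not None else None,
    }


if __name__ == "__main__":
    for line in [
        "(insn 5 2 6 2 (set (mem/c:SI (plus:DI (reg/f:DI 96 virtual-stack-vars)",
        "(barrier 9 8 39)",
        "     (nil))",
    ]:
        print(parse_header(line))
```

Read this way, the instructions form a doubly linked list: insn 5 sits between 2 and 6, and it belongs to basic block 2.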

Let's take a look at the next one. `test.c.269r.vregs`


;; Function main (main, funcdef_no=0, decl_uid=4853, cgraph_uid=1, symbol_order=0)

(note 1 0 3 NOTE_INSN_DELETED)
(note 3 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note 2 3 5 2 NOTE_INSN_FUNCTION_BEG)
(insn 5 2 6 2 (set (mem/c:SI (plus:DI (reg/f:DI 64 sfp)
                (const_int -4 [0xfffffffffffffffc])) [1 sum+0 S4 A32])
        (const_int 0 [0])) "test.c":4:9 69 {*movsi_aarch64}
     (nil))
(insn 6 5 7 2 (set (reg:SI 105)
        (const_int 1 [0x1])) "test.c":7:14 69 {*movsi_aarch64}
     (nil))
(insn 7 6 8 2 (set (mem/c:SI (plus:DI (reg/f:DI 64 sfp)
                (const_int -8 [0xfffffffffffffff8])) [1 i+0 S4 A64])
        (reg:SI 105)) "test.c":7:14 69 {*movsi_aarch64}
     (nil))
(jump_insn 8 7 9 2 (set (pc)
        (label_ref 37)) "test.c":7:5 6 {jump}
     (nil)
 -> 37)
(barrier 9 8 39)
(code_label 39 9 10 4 5 (nil) [1 uses])
(note 10 39 11 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(insn 11 10 12 4 (set (reg:SI 107)
        (mem/c:SI (plus:DI (reg/f:DI 64 sfp)
                (const_int -4 [0xfffffffffffffffc])) [1 sum+0 S4 A32])) "test.c":8:13 69 {*movsi_aarch64}
     (nil))
(insn 12 11 13 4 (set (reg:SI 108)
        (mem/c:SI (plus:DI (reg/f:DI 64 sfp)
                (const_int -8 [0xfffffffffffffff8])) [1 i+0 S4 A64])) "test.c":8:13 69 {*movsi_aarch64}
     (nil))
(insn 13 12 14 4 (set (reg:SI 106 [ sum_12 ])
        (plus:SI (reg:SI 107)
            (reg:SI 108))) "test.c":8:13 119 {*addsi3_aarch64}
     (nil))
(insn 14 13 15 4 (set (mem/c:SI (plus:DI (reg/f:DI 64 sfp)
                (const_int -4 [0xfffffffffffffffc])) [1 sum+0 S4 A32])
        (reg:SI 106 [ sum_12 ])) "test.c":8:13 69 {*movsi_aarch64}
     (nil))
(insn 15 14 16 4 (set (reg:SI 101 [ i.0_1 ])
        (mem/c:SI (plus:DI (reg/f:DI 64 sfp)
                (const_int -8 [0xfffffffffffffff8])) [1 i+0 S4 A64])) "test.c":11:19 69 {*movsi_aarch64}
     (nil))
(insn 16 15 17 4 (set (reg:SI 102 [ _2 ])
        (and:SI (reg:SI 101 [ i.0_1 ])
            (const_int 1 [0x1]))) "test.c":11:19 503 {andsi3}
     (nil))
(insn 17 16 18 4 (set (reg:CC 66 cc)
        (compare:CC (reg:SI 102 [ _2 ])
            (const_int 0 [0]))) "test.c":11:12 404 {cmpsi}
     (nil))
(jump_insn 18 17 19 4 (set (pc)
        (if_then_else (ne (reg:CC 66 cc)
                (const_int 0 [0]))
            (label_ref 26)
            (pc))) "test.c":11:12 19 {condjump}
     (nil)
 -> 26)
(note 19 18 20 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
(insn 20 19 21 5 (set (reg:SI 1 x1)
        (mem/c:SI (plus:DI (reg/f:DI 64 sfp)
                (const_int -8 [0xfffffffffffffff8])) [1 i+0 S4 A64])) "test.c":12:13 69 {*movsi_aarch64}
     (nil))
(insn 21 20 22 5 (set (reg:DI 109)
        (high:DI (symbol_ref/f:DI ("*.LC0") [flags 0x2]  <var_decl 0xffff9f7476c0 *.LC0>))) "test.c":12:13 70 {*movdi_aarch64}
     (nil))
This file is dumped after converting virtual registers to hard registers. We can see the difference between the `.expand` and `.vregs` files as follows:
$ diff test.c.268r.expand test.c.269r.vregs
It shows the following result:
4,42d3
<
< ;; Generating RTL for gimple basic block 2
<
< ;; Generating RTL for gimple basic block 3
<
< ;; Generating RTL for gimple basic block 4
<
< ;; Generating RTL for gimple basic block 5
<
< ;; Generating RTL for gimple basic block 6
<
< ;; Generating RTL for gimple basic block 7
<
< ;; Generating RTL for gimple basic block 8
<
< ;; Generating RTL for gimple basic block 9
<
<
< try_optimize_cfg iteration 1
<
< Merging block 3 into block 2...
< Merged blocks 2 and 3.
< Merged 2 and 3 without moving.
< Merging block 10 into block 9...
< Merged blocks 9 and 10.
< Merged 9 and 10 without moving.
< Removing jump 54.
< Merging block 11 into block 9...
< Merged blocks 9 and 11.
< Merged 9 and 11 without moving.
<
<
< try_optimize_cfg iteration 2
<
<
<
< ;;
< ;; Full RTL generated for this function:
< ;;
46c7
< (insn 5 2 6 2 (set (mem/c:SI (plus:DI (reg/f:DI 96 virtual-stack-vars)
---
> (insn 5 2 6 2 (set (mem/c:SI (plus:DI (reg/f:DI 64 sfp)
48c9
<         (const_int 0 [0])) "test.c":4:9 -1
---
>         (const_int 0 [0])) "test.c":4:9 69 {*movsi_aarch64}
51c12
<         (const_int 1 [0x1])) "test.c":7:14 -1
---
>         (const_int 1 [0x1])) "test.c":7:14 69 {*movsi_aarch64}
53c14
< (insn 7 6 8 2 (set (mem/c:SI (plus:DI (reg/f:DI 96 virtual-stack-vars)
---
> (insn 7 6 8 2 (set (mem/c:SI (plus:DI (reg/f:DI 64 sfp)
55c16
<         (reg:SI 105)) "test.c":7:14 -1
---
>         (reg:SI 105)) "test.c":7:14 69 {*movsi_aarch64}
58c19
<         (label_ref 37)) "test.c":7:5 -1
---
>         (label_ref 37)) "test.c":7:5 6 {jump}
65,66c26,27
<         (mem/c:SI (plus:DI (reg/f:DI 96 virtual-stack-vars)
<                 (const_int -4 [0xfffffffffffffffc])) [1 sum+0 S4 A32])) "test.c":8:13 -1
---
>         (mem/c:SI (plus:DI (reg/f:DI 64 sfp)
>                 (const_int -4 [0xfffffffffffffffc])) [1 sum+0 S4 A32])) "test.c":8:13 69 {*movsi_aarch64}
69,70c30,31
<         (mem/c:SI (plus:DI (reg/f:DI 96 virtual-stack-vars)
<                 (const_int -8 [0xfffffffffffffff8])) [1 i+0 S4 A64])) "test.c":8:13 -1
---
>         (mem/c:SI (plus:DI (reg/f:DI 64 sfp)
>                 (const_int -8 [0xfffffffffffffff8])) [1 i+0 S4 A64])) "test.c":8:13 69 {*movsi_aarch64}
74c35
<             (reg:SI 108))) "test.c":8:13 -1
---
>             (reg:SI 108))) "test.c":8:13 119 {*addsi3_aarch64}
76c37
< (insn 14 13 15 4 (set (mem/c:SI (plus:DI (reg/f:DI 96 virtual-stack-vars)
---
> (insn 14 13 15 4 (set (mem/c:SI (plus:DI (reg/f:DI 64 sfp)
78c39
<         (reg:SI 106 [ sum_12 ])) "test.c":8:13 -1
---
>         (reg:SI 106 [ sum_12 ])) "test.c":8:13 69 {*movsi_aarch64}
81,82c42,43
<         (mem/c:SI (plus:DI (reg/f:DI 96 virtual-stack-vars)
<                 (const_int -8 [0xfffffffffffffff8])) [1 i+0 S4 A64])) "test.c":11:19 -1
---
>         (mem/c:SI (plus:DI (reg/f:DI 64 sfp)
>                 (const_int -8 [0xfffffffffffffff8])) [1 i+0 S4 A64])) "test.c":11:19 69 {*movsi_aarch64}
86c47
<             (const_int 1 [0x1]))) "test.c":11:19 -1
---
>             (const_int 1 [0x1]))) "test.c":11:19 503 {andsi3}
90c51
<             (const_int 0 [0]))) "test.c":11:12 -1
---
>             (const_int 0 [0]))) "test.c":11:12 404 {cmpsi}
96c57
<             (pc))) "test.c":11:12 -1
---
>             (pc))) "test.c":11:12 19 {condjump}
101,102c62,63 
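Now we can see what the `vregs` pass did. It replaced `virtual-stack-vars` (register 96) with `sfp` (register 64). In addition, the `-1` that followed each source location in the `.expand` file is now a pattern number and name, such as `69 {*movsi_aarch64}` or `119 {*addsi3_aarch64}`. The diff also shows that the log lines at the top of the `.expand` file ("Generating RTL for gimple basic block" and `try_optimize_cfg`) do not appear in the `.vregs` file.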
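
To check that across a whole file instead of reading the diff by eye, here is a small Python sketch that counts the pattern names in a dump and the instructions still marked `-1`. The function names are hypothetical, and the regular expressions assume every source location looks like `"test.c":4:9` followed by either `-1` or `<number> {<name>}`, as in the excerpts in this post.

```python
import re
from collections import Counter

# A matched instruction: '"test.c":4:9 69 {*movsi_aarch64}'
RECOGNIZED = re.compile(r'"[^"]+":\d+:\d+ (\d+) \{([^}]+)\}')
# An instruction with no pattern attached yet: '"test.c":4:9 -1'
UNRECOGNIZED = re.compile(r'"[^"]+":\d+:\d+ -1\b')


def pattern_counts(dump_text):
    """Count how often each pattern name appears in the dump text."""
    return Counter(name for _code, name in RECOGNIZED.findall(dump_text))


def unrecognized_count(dump_text):
    """Count instructions whose source location is followed by -1."""
    return len(UNRECOGNIZED.findall(dump_text))


if __name__ == "__main__":
    # Run from the directory that holds the dump files.
    for path in ["test.c.268r.expand", "test.c.269r.vregs"]:
        with open(path) as dump:
            text = dump.read()
        print(path, unrecognized_count(text), dict(pattern_counts(text)))
```

If the reading above is right, the `.expand` file should report only `-1` entries and the `.vregs` file should report named patterns instead.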

Let's check just one more. `test.c.270r.into_cfglayout`

;; Function main (main, funcdef_no=0, decl_uid=4853, cgraph_uid=1, symbol_order=0)

try_optimize_cfg iteration 1

Removing jump 8.
Removing jump 24.


try_optimize_cfg iteration 2



try_optimize_cfg iteration 1

(note 3 0 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note 2 3 5 2 NOTE_INSN_FUNCTION_BEG)
(insn 5 2 6 2 (set (mem/c:SI (plus:DI (reg/f:DI 64 sfp)
                (const_int -4 [0xfffffffffffffffc])) [1 sum+0 S4 A32])
        (const_int 0 [0])) "test.c":4:9 69 {*movsi_aarch64}
     (nil))
(insn 6 5 7 2 (set (reg:SI 105)
        (const_int 1 [0x1])) "test.c":7:14 69 {*movsi_aarch64}
     (nil))
(insn 7 6 39 2 (set (mem/c:SI (plus:DI (reg/f:DI 64 sfp)
                (const_int -8 [0xfffffffffffffff8])) [1 i+0 S4 A64])
        (reg:SI 105)) "test.c":7:14 69 {*movsi_aarch64}
     (nil))
      ; pc falls through to BB 7
(code_label 39 7 10 3 5 (nil) [1 uses])
(note 10 39 11 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(insn 11 10 12 3 (set (reg:SI 107)
        (mem/c:SI (plus:DI (reg/f:DI 64 sfp)
                (const_int -4 [0xfffffffffffffffc])) [1 sum+0 S4 A32])) "test.c":8:13 69 {*movsi_aarch64}
     (nil))
(insn 12 11 13 3 (set (reg:SI 108)
        (mem/c:SI (plus:DI (reg/f:DI 64 sfp)
                (const_int -8 [0xfffffffffffffff8])) [1 i+0 S4 A64])) "test.c":8:13 69 {*movsi_aarch64}
     (nil))
(insn 13 12 14 3 (set (reg:SI 106 [ sum_12 ])
        (plus:SI (reg:SI 107)
            (reg:SI 108))) "test.c":8:13 119 {*addsi3_aarch64}
     (nil))
(insn 14 13 15 3 (set (mem/c:SI (plus:DI (reg/f:DI 64 sfp)
                (const_int -4 [0xfffffffffffffffc])) [1 sum+0 S4 A32])
        (reg:SI 106 [ sum_12 ])) "test.c":8:13 69 {*movsi_aarch64}
     (nil))
(insn 15 14 16 3 (set (reg:SI 101 [ i.0_1 ])
        (mem/c:SI (plus:DI (reg/f:DI 64 sfp)
                (const_int -8 [0xfffffffffffffff8])) [1 i+0 S4 A64])) "test.c":11:19 69 {*movsi_aarch64}
     (nil))
(insn 16 15 17 3 (set (reg:SI 102 [ _2 ])
        (and:SI (reg:SI 101 [ i.0_1 ])
            (const_int 1 [0x1]))) "test.c":11:19 503 {andsi3}
     (nil))
(insn 17 16 18 3 (set (reg:CC 66 cc)
        (compare:CC (reg:SI 102 [ _2 ])
            (const_int 0 [0]))) "test.c":11:12 404 {cmpsi}
     (nil))
(jump_insn 18 17 19 3 (set (pc)
        (if_then_else (ne (reg:CC 66 cc)
                (const_int 0 [0]))
            (label_ref 26)
            (pc))) "test.c":11:12 19 {condjump}
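This file is dumped after converting to cfglayout mode. Compared with the `.vregs` file, `jump_insn 8` and `barrier 9` are gone (insn 7 now links straight to label 39), and the loop body that was in bb 4 is now in bb 3. The first change matches the "Removing jump 8." line in the log at the beginning of the file: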

try_optimize_cfg iteration 1

Removing jump 8.
Removing jump 24.


try_optimize_cfg iteration 2



try_optimize_cfg iteration 1

Conclusion

Since RTL files are very low level, I couldn't understand most of the code or what is going on in each file. However, I could at least see that some optimization is done at the register level.

