disabling loop unrolling in GCC
Oleg Goldshmidt
pub at goldshmidt.org
Mon Dec 21 18:06:16 IST 2009
2009/12/21 Shachar Shemesh <shachar at shemesh.biz>:
> Hi all,
>
> I'm trying, without success, to disable loop unrolling when compiling a
> program with -O3 with gcc (4.4, but I see the same problem with 4.3).
I am actually very surprised that -O3 unrolls loops. It is not
supposed to. The idea of including -funroll-loops in -O3 was raised
quite a few times and was always rejected. Maybe something has changed in
recent years. The documentation certainly does not say that loop unrolling
is enabled with either -O2 or -O3.
I suspect something is the matter with -ftree-loop-optimize. The gcc
documentation says,
`-ftree-loop-optimize'
Perform loop optimizations on trees. This flag is enabled by
default at `-O' and higher.
However, the behaviour depends on which optimization level you use.
E.g., -O2 won't unroll the loop even with -ftree-loop-optimize given explicitly:
$ gcc -c -O2 -ftree-loop-optimize loop.c
$ objdump -S loop.o
loop.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <func>:
0: 31 c0 xor %eax,%eax
2: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
8: 83 c0 01 add $0x1,%eax
b: c7 05 00 00 00 00 00 movl $0x0,0x0(%rip) # 15 <func+0x15>
12: 00 00 00
15: 83 f8 08 cmp $0x8,%eax
18: 75 ee jne 8 <func+0x8>
1a: f3 c3 repz retq
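The loop.c source is not reproduced in this message. As a rough guess only, a file along the following lines would be consistent with the disassembly above: eight iterations, each storing zero to a global. The name `sink` and the `volatile` qualifier are assumptions for illustration (without `volatile` the compiler would be free to collapse the repeated stores into one), so the real loop.c may well differ.

```c
/* Hypothetical reconstruction of loop.c; not the original file.
 * "sink" is an assumed name. It is volatile so that every store
 * in the loop must actually be emitted by the compiler. */
volatile int sink;

void func(void)
{
    int i;

    /* Eight iterations, matching the "cmp $0x8,%eax" in the listing. */
    for (i = 0; i < 8; i++)
        sink = 0;
}
```

Compiling such a file with the commands shown in this message and comparing the objdump output should show whether the store appears once inside a loop or is repeated eight times.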
At -O3, however, adding -fno-tree-loop-optimize does what you want, and the loop is not unrolled:
$ gcc -c -O3 -fno-tree-loop-optimize loop.c
$ objdump -S loop.o
loop.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <func>:
0: 31 c0 xor %eax,%eax
2: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
8: 83 c0 01 add $0x1,%eax
b: c7 05 00 00 00 00 00 movl $0x0,0x0(%rip) # 15 <func+0x15>
12: 00 00 00
15: 83 f8 07 cmp $0x7,%eax
18: 7e ee jle 8 <func+0x8>
1a: f3 c3 repz retq
Or, if you are primarily interested in code size as you indicate, why not -Os?
$ gcc -c -Os loop.c
$ objdump -S loop.o
loop.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <func>:
0: 31 c0 xor %eax,%eax
2: ff c0 inc %eax
4: c7 05 00 00 00 00 00 movl $0x0,0x0(%rip) # e <func+0xe>
b: 00 00 00
e: 83 f8 08 cmp $0x8,%eax
11: 75 ef jne 2 <func+0x2>
13: c3 retq
Hope it helps,
--
Oleg Goldshmidt | pub at goldshmidt.org