<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
</head>
<body dir="ltr" bgcolor="#ffffff" text="#000000">
Hi all,<br>
<br>
I'm trying, without success, to disable loop unrolling when compiling a
program with gcc -O3 (gcc 4.4, though I see the same problem with 4.3).<br>
<br>
The program is the following one:<br>
<blockquote type="cite">volatile int v;<br>
<br>
void func()<br>
{<br>
int i;<br>
<br>
for( i=0; i&lt;8; ++i ) {<br>
v=0;<br>
}<br>
}</blockquote>
I compile it with the following command line:<br>
<br>
gcc -c -O3 test.c<br>
<br>
Running "objdump -S test.o" on the result gives:<br>
<blockquote type="cite">test.o: file format elf64-x86-64<br>
<br>
Disassembly of section .text:<br>
<br>
0000000000000000 &lt;func&gt;:<br>
0: c7 05 00 00 00 00 00 movl $0x0,0x0(%rip) # a &lt;func+0xa&gt;<br>
7: 00 00 00<br>
a: c7 05 00 00 00 00 00 movl $0x0,0x0(%rip) # 14 &lt;func+0x14&gt;<br>
11: 00 00 00<br>
14: c7 05 00 00 00 00 00 movl $0x0,0x0(%rip) # 1e &lt;func+0x1e&gt;<br>
1b: 00 00 00<br>
1e: c7 05 00 00 00 00 00 movl $0x0,0x0(%rip) # 28 &lt;func+0x28&gt;<br>
25: 00 00 00<br>
28: c7 05 00 00 00 00 00 movl $0x0,0x0(%rip) # 32 &lt;func+0x32&gt;<br>
2f: 00 00 00<br>
32: c7 05 00 00 00 00 00 movl $0x0,0x0(%rip) # 3c &lt;func+0x3c&gt;<br>
39: 00 00 00<br>
3c: c7 05 00 00 00 00 00 movl $0x0,0x0(%rip) # 46 &lt;func+0x46&gt;<br>
43: 00 00 00<br>
46: c7 05 00 00 00 00 00 movl $0x0,0x0(%rip) # 50 &lt;func+0x50&gt;<br>
4d: 00 00 00<br>
50: c3 retq<br>
</blockquote>
If I compile with -O2, the results are:<br>
<blockquote type="cite">test.o: file format elf64-x86-64<br>
<br>
Disassembly of section .text:<br>
<br>
0000000000000000 &lt;func&gt;:<br>
0: 31 c0 xor %eax,%eax<br>
2: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)<br>
8: 83 c0 01 add $0x1,%eax<br>
b: c7 05 00 00 00 00 00 movl $0x0,0x0(%rip) # 15 &lt;func+0x15&gt;<br>
12: 00 00 00<br>
15: 83 f8 08 cmp $0x8,%eax<br>
18: 75 ee jne 8 &lt;func+0x8&gt;<br>
1a: f3 c3 repz retq<br>
</blockquote>
The worrying part is that I cannot turn the unrolling off. Adding
"-fno-unroll-loops" and "-fno-peel-loops" has no effect, and neither
does changing the --param values max-unrolled-insns, max-unroll-times
and max-peel-times.<br>
<br>
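One per-function workaround that might be worth trying, sketched here under assumptions that have not been verified: GCC 4.4 documents an "optimize" function attribute, so func() could in principle be pinned to -O2 while the rest of the file is still built with -O3. The OPT_O2 macro name is made up for this example, and whether the attribute actually suppresses the unrolling would have to be confirmed with objdump.<br>

```c
/* Assumption: GCC >= 4.4, which documents the `optimize` function attribute.
   clang and other compilers may not accept it, so it is compiled out there.
   OPT_O2 is a hypothetical helper macro, not something GCC provides. */
#if defined(__GNUC__) && !defined(__clang__)
#define OPT_O2 __attribute__((optimize("O2")))
#else
#define OPT_O2
#endif

volatile int v;

/* Same loop as in the test program; only the attribute is new.
   The intent is that this one function is optimized as if -O2 had been
   given, regardless of the -O level on the command line. */
OPT_O2 void func(void)
{
    int i;

    for (i = 0; i < 8; ++i) {
        v = 0;
    }
}
```

Compile it as before with "gcc -c -O3 test.c" and compare the "objdump -S test.o" output against the -O2 listing. Later GCC manuals recommend this attribute for debugging only, so it is at best a stopgap and does not explain why the command-line flags are ignored.<br>
<br>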
Even more worryingly, the documentation seems to be wrong. It claims
(<a class="moz-txt-link-freetext" href="http://gcc.gnu.org/onlinedocs/gcc-4.4.2/gcc/Optimize-Options.html#index-O3-632">http://gcc.gnu.org/onlinedocs/gcc-4.4.2/gcc/Optimize-Options.html#index-O3-632</a>)
that -O3 is equal to -O2 plus -finline-functions,
-funswitch-loops, -fpredictive-commoning,
-fgcse-after-reload and -ftree-vectorize. Compiling with -O2 plus
those five options does not unroll the loop, however, which suggests
that -O3 differs from -O2 in some other way as well.<br>
<br>
Help?<br>
<br>
Shachar<br>
<pre class="moz-signature" cols="72">--
Shachar Shemesh
Lingnu Open Source Consulting Ltd.
<a class="moz-txt-link-freetext" href="http://www.lingnu.com">http://www.lingnu.com</a>
</pre>
</body>
</html>