<div dir="ltr">Reviving an old thread: here's some research claiming that link order and memory layout can have a large effect on performance. The tool randomizes them to allow a meaningful comparison between two programs.<div>
<br></div><div>A nugget from the slides: the difference between -O2 and -O3 on some LLVM benchmarks was found to be statistical noise.</div><div><br></div><div>(no need to reread the old thread; it's only loosely related)</div>
<div><div><br></div><div><a href="http://plasma.cs.umass.edu/emery/stabilizer">http://plasma.cs.umass.edu/emery/stabilizer</a><br></div></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Feb 2, 2011 at 10:10 AM, Nadav Har'El <span dir="ltr"><<a href="mailto:nyh@math.technion.ac.il" target="_blank">nyh@math.technion.ac.il</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Tue, Feb 01, 2011, Tzafrir Cohen wrote about "Re: New Freecell Solver gcc-4.5.0 vs. LLVM+clang Benchmark":<br>
<div class="">> On Tue, Feb 01, 2011 at 12:49:00PM +0200, Elazar Leibovich wrote:<br>
><br>
> > The program they tested[1] is strictly CPU bound. BTW, standard deviation<br>
> > wouldn't work as well, but it shows (I think) that there's no such thing as<br>
> > "ideal minimal runtime".<br>
> ><br>
> > [1]<br>
> > static int i = 0, j = 0, k = 0;<br>
> > int main() {<br>
> >     int g = 0, inc = 1;<br>
> >     for (; g < 65536; g++) {<br>
> >         i += inc;<br>
> >         j += inc;<br>
> >         k += inc;<br>
> >     }<br>
> >     return 0;<br>
> > }<br>
> ><br>
> > [2] <a href="http://www-plan.cs.colorado.edu/diwan/asplos09.pdf" target="_blank">http://www-plan.cs.colorado.edu/diwan/asplos09.pdf</a><br>
><br>
> That program is CPU-bound, but the time it runs is short enough for the<br>
> size of the environment to actually mean something.<br>
<br>
</div>Right, I think this example demonstrates Oleg's point: there's usually<br>
no real non-determinism, just ignorance of what else is going on outside<br>
your program.<br>
<br>
In this case, this is a program that finishes ridiculously fast - on my<br>
low-end computer, it finishes in 0.3 milliseconds (!). But when you run<br>
this program, things like fork, disk reads, system calls, dynamic linking,<br>
and a whole lot of other crap add much more than 0.3 milliseconds of overhead,<br>
and much of that other crap is non-deterministic - e.g., if the executable<br>
needs to be read from disk and something else is using the disk, you notice<br>
a slowdown.<br>
<br>
But still - my original point was that if you measure this program's runtime<br>
several times, the minimum, not the average, is often more representative of<br>
the "true" runtime (in this case 0.3 milliseconds). I measured this program<br>
with "time", and one time got 30 milliseconds, several times got 2 milliseconds,<br>
and one time got 1 millisecond. Yes, that minimum of 1 millisecond is closer<br>
to the performance of the program than the average of 1, 2 and 30.<br>
<br>
Anyway, you piqued my curiosity, and I'll read that ASPLOS paper :-)<br>
<span class="HOEnZb"><font color="#888888"><br>
--<br>
Nadav Har'El | Wednesday, Feb 2 2011, 28 Shevat 5771<br>
<a href="mailto:nyh@math.technion.ac.il">nyh@math.technion.ac.il</a> |-----------------------------------------<br>
Phone <a href="tel:%2B972-523-790466" value="+972523790466">+972-523-790466</a>, ICQ 13349191 |I saw a book titled "Die Microsoft<br>
<a href="http://nadav.harel.org.il" target="_blank">http://nadav.harel.org.il</a> |Windows". Turns out it was in German...<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
_______________________________________________<br>
Linux-il mailing list<br>
<a href="mailto:Linux-il@cs.huji.ac.il">Linux-il@cs.huji.ac.il</a><br>
<a href="http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il" target="_blank">http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il</a><br>
</div></div></blockquote></div><br></div>