<div dir="ltr">Reviving an old thread: here's some research claiming that link order and memory layout can have a large effect on performance. The tool randomizes them to allow a meaningful comparison between two programs.<div>
<br></div><div>A nugget from the slides: the difference between -O2 and -O3 on some LLVM benchmarks was found to be statistical noise.</div><div><br></div><div>(no need to reread the old thread; it's only loosely related)</div>
<div><div><br></div><div><a href="http://plasma.cs.umass.edu/emery/stabilizer">http://plasma.cs.umass.edu/emery/stabilizer</a><br></div></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Feb 2, 2011 at 10:10 AM, Nadav Har'El <span dir="ltr"><<a href="mailto:nyh@math.technion.ac.il" target="_blank">nyh@math.technion.ac.il</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Tue, Feb 01, 2011, Tzafrir Cohen wrote about "Re: New Freecell Solver gcc-4.5.0 vs. LLVM+clang Benchmark":<br>
<div class="">> On Tue, Feb 01, 2011 at 12:49:00PM +0200, Elazar Leibovich wrote:<br>
><br>
> > The program they tested[1] is strictly CPU bound. BTW, standard deviation<br>
> > wouldn't work as well, but it shows (I think) that there's no such thing as<br>
> > "ideal minimal runtime".<br>
> ><br>
> > [1]<br>
> > static int i = 0, j = 0, k = 0;<br>
> > int main() {<br>
> >     int g = 0, inc = 1;<br>
> >     for (; g < 65536; g++) {<br>
> >         i += inc;<br>
> >         j += inc;<br>
> >         k += inc;<br>
> >     }<br>
> >     return 0;<br>
> > }<br>
> ><br>
> > [2] <a href="http://www-plan.cs.colorado.edu/diwan/asplos09.pdf" target="_blank">http://www-plan.cs.colorado.edu/diwan/asplos09.pdf</a><br>
><br>
> That program is CPU-bound, but the time it runs is short enough for the<br>
> size of the environment to actually mean something.<br>
<br>
</div>Right, I think this example demonstrates Oleg's point: there's usually<br>
no real non-determinism, just ignorance of what else is going on outside<br>
your program.<br>
<br>
In this case, this is a program that finishes ridiculously fast - on my<br>
low-end computer, it finishes in 0.3 milliseconds (!). But when you run<br>
this program, things like fork, disk reads, system calls, dynamic linking,<br>
and a whole lot of other crap add much more than 0.3 milliseconds of overhead,<br>
and much of that other crap is non-deterministic - e.g., if the executable<br>
needs to be read from disk and something else is using the disk, you notice<br>
a slowdown.<br>
<br>
But still - my original point was that if you measure this program's runtime<br>
several times, the minimum, not the average, is often more representative of<br>
the "true" runtime (in this case 0.3 milliseconds). I measured this program<br>
with "time", and one time got 30 milliseconds, several times got 2 milliseconds,<br>
and one time got 1 millisecond. Yes, that minimum of 1 millisecond is closer<br>
to the performance of the program than the average of 1, 2 and 30.<br>
<br>
Anyway, you piqued my curiosity, and I'll read that ASPLOS paper :-)<br>
<span class="HOEnZb"><font color="#888888"><br>
--<br>
Nadav Har'El | Wednesday, Feb 2 2011, 28 Shevat 5771<br>
<a href="mailto:nyh@math.technion.ac.il">nyh@math.technion.ac.il</a> |-----------------------------------------<br>
Phone <a href="tel:%2B972-523-790466" value="+972523790466">+972-523-790466</a>, ICQ 13349191 |I saw a book titled "Die Microsoft<br>
<a href="http://nadav.harel.org.il" target="_blank">http://nadav.harel.org.il</a> |Windows". Turns out it was in German...<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
_______________________________________________<br>
Linux-il mailing list<br>
<a href="mailto:Linux-il@cs.huji.ac.il">Linux-il@cs.huji.ac.il</a><br>
<a href="http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il" target="_blank">http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il</a><br>
</div></div></blockquote></div><br></div>