Results of Using Profile-Guided Optimisations on Freecell Solver

Shlomi Fish shlomif at iglu.org.il
Fri Jun 5 12:56:37 IDT 2009


Hi all,

I attended Shachar's talk yesterday for Herzelinux about Profile-Guided 
Optimisations in gcc, etc. The talk was very interesting, but only a small 
part of it actually covered its stated topic.

In any case, Shachar mentioned that he tried to test the results of Profile 
Guided Optimisations (PGO) on three programs and was able to build only one 
with them. When I returned home I was anxious to try out PGO with Freecell 
Solver.

I had to fix the "Makefile.gnu" that came with the distribution because it had 
some bugs that caused gcc to yell at me (I was missing some flags in the 
linking stage). But I got it working, using the following steps:

On https://svn.berlios.de/svnroot/repos/fc-solve/trunk/fc-solve/source:

1. make -f Makefile.gnu clean

2. $ cat profile-guided-optimisations.bash
#!/bin/bash
make -f Makefile.gnu FREECELL_ONLY=1 \
    CFLAGS='-DFCS_FREECELL_ONLY=1 -O3 -Wall -Werror=implicit-function-declaration -march=pentium4 -fomit-frame-pointer -fprofile-generate -fPIC' \

    # END_LFLAGS='-lm -lgcov -static-libgcc' \
    # CREATE_SHARED='

3. bash profile-guided-optimisations.bash

4. ./freecell-solver-range-parallel-solve 1 32000 500 -l gi | tee dump

(The actual profiling).

5. $ cat profile-guided-optimisations-stage2.bash
#!/bin/bash
cat profile-guided-optimisations.bash |
    perl -pe 's/-fprofile-generate/-fprofile-use/' |
    bash

6. bash profile-guided-optimisations-stage2.bash

7. ./freecell-solver-range-parallel-solve 1 32000 500 -l gi | tee dump

Benchmarking.
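
The two build scripts above differ only in the PGO flag (-fprofile-generate 
for the first build, -fprofile-use for the second). As a rough sketch, and not 
something that is in the Freecell Solver sources, the same thing could be 
done with one helper that takes the stage as an argument instead of rewriting 
the first script with perl. The name "pgo_cflags" is made up for this 
example; the flags themselves are copied from the script in step 2.

```bash
#!/bin/bash
# Sketch only: pgo_cflags is a hypothetical helper, not part of Freecell Solver.
# It prints the CFLAGS used above, with the PGO flag chosen by the stage name.
pgo_cflags() {
    local stage="$1"    # "generate" for the first build, "use" for the second
    case "$stage" in
        generate|use) ;;
        *) echo "usage: pgo_cflags generate|use" >&2; return 1 ;;
    esac
    echo "-DFCS_FREECELL_ONLY=1 -O3 -Wall -Werror=implicit-function-declaration -march=pentium4 -fomit-frame-pointer -fprofile-$stage -fPIC"
}

# Intended use (not run here, because it needs the source tree):
#   make -f Makefile.gnu FREECELL_ONLY=1 CFLAGS="$(pgo_cflags generate)"
#   ./freecell-solver-range-parallel-solve 1 32000 500 -l gi | tee dump
#   make -f Makefile.gnu FREECELL_ONLY=1 CFLAGS="$(pgo_cflags use)"
```

This keeps the flag list in one place, so the two stages cannot drift apart 
if a flag is changed later.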

-----------

Then:

1. With profile guided optimisations, the benchmark took 126.168390989304 
seconds.

2. Without profile guided optimisations (only the other flags), the benchmark 
took 124.147925853729 seconds. So it seems PGO has made things somewhat 
worse (by about 2 seconds, or roughly 1.6%).

3. I should note that a previous benchmark that I ran took:

{{{{{{{{{{
Trunk (r1755) with "./Tatzer -l p4b" after running strip on the 
binaries and regular (non-threaded) range-solver "-l gi": (gcc-4.4.0)

122.316657066345s
}}}}}}}}}}

I don't know what made it slightly slower now.
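
To put the three timings above in proportion, here is a small sketch that 
computes the relative slowdown with awk. The helper name "pct_slower" is 
made up for this example. Note that the earlier r1755 run used a different 
build setup ("./Tatzer -l p4b", stripped binaries), so the second comparison 
is only a rough one.

```bash
#!/bin/bash
# Sketch only: pct_slower is a hypothetical helper name.
# Prints how much slower "cur" is than "old", as a percentage with two decimals.
pct_slower() {
    LC_ALL=C awk -v old="$1" -v cur="$2" \
        'BEGIN { printf "%.2f\n", (cur - old) / old * 100 }'
}

# PGO build vs. the same flags without PGO:
pct_slower 124.147925853729 126.168390989304   # prints 1.63
# Non-PGO build vs. the earlier r1755 benchmark:
pct_slower 122.316657066345 124.147925853729   # prints 1.50
```

Both differences are below 2%, and each figure here comes from a single run, 
so it would take repeated runs to tell how much of this is noise.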

All of the benchmarks were conducted while running only IceWM on top of my 
Mandriva Cooker system.

Regards,

	Shlomi Fish

-- 
-----------------------------------------------------------------
Shlomi Fish       http://www.shlomifish.org/
"The Human Hacking Field Guide" - http://xrl.us/bjn8q

God gave us two eyes and ten fingers so we will type five times as much as we
read.


