Results of Using Profile-Guided Optimisations on Freecell Solver
Shlomi Fish
shlomif at iglu.org.il
Fri Jun 5 12:56:37 IDT 2009
Hi all,
I attended Shachar's talk yesterday for Herzelinux about Profile-Guided
Optimisations in gcc, etc. The talk was very interesting, but its stated
topic took up only a small part of it.
In any case, Shachar mentioned that he tried to test the results of
Profile-Guided Optimisations (PGO) on three programs and was able to build
only one with them. When I returned home I was eager to try out PGO with
Freecell Solver.
I had to fix the "Makefile.gnu" that comes with the distribution, because it
had some bugs that made gcc yell at me (some flags were missing in the linking
stage). But I got it working, with the following steps:
In a checkout of https://svn.berlios.de/svnroot/repos/fc-solve/trunk/fc-solve/source:
1. make -f Makefile.gnu clean
2. $ cat profile-guided-optimisations.bash
#!/bin/bash
make -f Makefile.gnu FREECELL_ONLY=1 \
CFLAGS='-DFCS_FREECELL_ONLY=1 -O3 -Wall -Werror=implicit-function-declaration -march=pentium4 -fomit-frame-pointer -fprofile-generate -fPIC' \
# END_LFLAGS='-lm -lgcov -static-libgcc' \
# CREATE_SHARED='
3. bash profile-guided-optimisations.bash
4. ./freecell-solver-range-parallel-solve 1 32000 500 -l gi | tee dump
(The actual profiling).
5. $ cat profile-guided-optimisations-stage2.bash
#!/bin/bash
cat profile-guided-optimisations.bash |
perl -pe 's/-fprofile-generate/-fprofile-use/' |
bash
6. bash profile-guided-optimisations-stage2.bash
7. ./freecell-solver-range-parallel-solve 1 32000 500 -l gi | tee dump
Benchmarking.
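The two stage scripts above differ only in the -fprofile-* switch, so the same
idea can be sketched as one small set of shell functions. This is only an
illustration, not what is in the repository: the names pgo_cflags and
pgo_build are made up for the example, and the commented-out END_LFLAGS and
CREATE_SHARED settings are left out.

```shell
#!/bin/bash
# Sketch only: pgo_cflags and pgo_build are hypothetical helper names.

# Stage-1 flags, copied from profile-guided-optimisations.bash.
STAGE1_CFLAGS='-DFCS_FREECELL_ONLY=1 -O3 -Wall -Werror=implicit-function-declaration -march=pentium4 -fomit-frame-pointer -fprofile-generate -fPIC'

# Print the CFLAGS for a stage: "generate" (instrumented build) or
# "use" (rebuild that consumes the recorded profile).
pgo_cflags() {
    case "$1" in
        generate)
            printf '%s\n' "$STAGE1_CFLAGS"
            ;;
        use)
            # Same substitution as the perl one-liner in the stage-2 script.
            printf '%s\n' "$STAGE1_CFLAGS" |
                sed 's/-fprofile-generate/-fprofile-use/'
            ;;
        *)
            echo "usage: pgo_cflags generate|use" >&2
            return 1
            ;;
    esac
}

# Run the same make invocation as the stage scripts for the given stage.
pgo_build() {
    pgo_build_flags=$(pgo_cflags "$1") || return 1
    make -f Makefile.gnu FREECELL_ONLY=1 CFLAGS="$pgo_build_flags"
}
```

Usage would be "pgo_build generate", then the profiling run from step 4, then
"pgo_build use" and the benchmark run from step 7. Whether make actually
rebuilds the objects in the second stage depends on Makefile.gnu, which is not
shown here.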
-----------
Then:
1. With profile-guided optimisations, the benchmark took 126.168390989304
seconds.
2. Without profile-guided optimisations (only the other flags), the benchmark
took 124.147925853729 seconds. So it seems PGO has made things somewhat
worse.
3. I should note that a previous benchmark that I ran took:
{{{{{{{{{{
Trunk (r1755) with "./Tatzer -l p4b" after running strip on the
binaries and regular (non-threaded) range-solver "-l gi": (gcc-4.4.0)
122.316657066345s
}}}}}}}}}}
I don't know what made it slightly slower now.
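To put the three timings above in proportion, the relative differences can be
worked out with a few lines of shell and awk. This is just a sketch for the
arithmetic; slowdown_pct is a name made up for the example.

```shell
#!/bin/bash
# Sketch only: slowdown_pct is a hypothetical helper name.

# Print how much slower "new" is than "old", as a percentage with two
# decimals. Both arguments are wall-clock times in seconds.
slowdown_pct() {
    awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%.2f\n", (new - old) / old * 100 }'
}

# PGO build versus the build with only the other flags.
slowdown_pct 124.147925853729 126.168390989304

# Current non-PGO build versus the earlier r1755 benchmark.
slowdown_pct 122.316657066345 124.147925853729
```

With these figures, the PGO build comes out roughly 1.6% slower than the
non-PGO one, and the current non-PGO run roughly 1.5% slower than the earlier
r1755 run.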
All of the benchmarks were conducted while running only IceWM on top of my
Mandriva Cooker system.
Regards,
Shlomi Fish
--
-----------------------------------------------------------------
Shlomi Fish http://www.shlomifish.org/
"The Human Hacking Field Guide" - http://xrl.us/bjn8q
God gave us two eyes and ten fingers so we will type five times as much as we
read.