Why is GNU/Linux so Bloated?
Oleg Goldshmidt
pub at goldshmidt.org
Thu Jun 11 22:22:13 IDT 2009
Shlomi Fish <shlomif at iglu.org.il> writes:
> Hi all!
>
> Based on the gcc-4.4.0 (with -Os) / x86-Linux shared library sizes
> here:
>
> http://tech.groups.yahoo.com/group/fc-solve-discuss/message/998
>
> And the Visual C++/Win32 (also x86) .dll sizes here:
>
> http://tech.groups.yahoo.com/group/fc-solve-discuss/message/999
>
> My question is: why are the Visual C++ generated binaries so much
> smaller than the equivalent Linux ones? Any insights would be
> appreciated.
Shlomi,
The short answer is, I don't know. I didn't even try to figure out
where the apples and the oranges were in your fc-solve-discuss
postings. Since you don't list files and sizes (at least not in any
way I can decipher, being unfamiliar with the project) or specify how
you compile and link (apart from -Os), I don't know if you compare
apples to apples.
I'll assume you are comparing dynamically linked executables on
Linux/gcc and on Windows/cl, and the corresponding .so and .dll
libraries.
I'll wave my hands wildly and offer a couple of guesses that you can
try to investigate. They may be completely off the mark.
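Before that, one mundane check, on the assumption that you compared
raw file sizes: gcc does not strip its output, so an ELF shared
library carries symbol tables (and possibly debug info) that a
release-mode MSVC DLL does not. Comparing stripped sizes, or just the
loadable sections, is fairer. Untested, with libfoo.so standing in
for whatever you actually built:

    $ size libfoo.so                    # text/data/bss only, no symbol tables
    $ strip --strip-unneeded libfoo.so  # drop symbols the dynamic linker ignores
    $ ls -l libfoo.so                   # file size after stripping

If the gap shrinks dramatically, mystery solved.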
1) You probably know that DLLs work differently from Linux shared
   libraries. A DLL contains relocatable code built for a preferred
   base address, to which the loader will try to map the file. If a
   process links against several libraries whose preferred ranges
   collide, all but one must be relocated to other free addresses,
   COW-ed while the addresses are fixed up, independently paged,
   etc. This also means that DLLs are dynamically loaded but not
   necessarily shared (a relocated copy can only be shared between
   processes with the same memory layout). Linux shared libraries
   contain position-independent code (PIC) that uses only addresses
   relative to the program counter, so the same pages really are
   shared by every process that maps them.
   PIC implies address translation tables (the global offset table,
   or GOT, and the procedure linkage table, or PLT) that are filled
   in at load time but allocated at link time, so they take up space
   in the file itself. This may be one source of size overhead. I
   have no idea how significant it is; you need to consult the
   experts.
   There are (or at least used to be) -fPIC and -fpic options to
   GCC. IIRC, -fpic imposes a machine-specific limit on the size of
   the GOT, and the link fails if the table grows too large; -fPIC
   imposes no such limit. The limits were quite small on the
   architectures where they exist at all; on x86, I believe the two
   options are equivalent.
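   If you want to see how much of the file is PIC bookkeeping, you
   can measure it instead of trusting my hand-waving. Untested, with
   libfoo standing in for your actual library:

       $ gcc -Os -fPIC -shared -o libfoo.so foo.c
       $ readelf -S libfoo.so | grep -E '\.got|\.plt|\.rel'  # GOT/PLT/reloc sizes (hex)
       $ gcc -Os -c foo.c -o foo-nopic.o                     # same code, non-PIC
       $ gcc -Os -fPIC -c foo.c -o foo-pic.o
       $ size foo-nopic.o foo-pic.o                          # code-size cost of PIC

   Summing the .got, .plt, and .rel*/.rela* sections gives a rough
   bound on the table overhead, and the size(1) comparison shows what
   -fPIC costs in code size alone.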
2) I suppose that the structure of the code is important. E.g., does
   your optimization include inlining? Inlining replicates code, so
   it can inflate the binary. I am not sure whether -Os suppresses
   inlining entirely.
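   A cheap experiment for this one (assuming your build lets you add
   a flag; again untested): rebuild with inlining suppressed and see
   if anything moves.

       $ gcc -Os -c foo.c -o foo-os.o                    # your current flags
       $ gcc -Os -fno-inline -c foo.c -o foo-noinline.o  # inlining suppressed
       $ size foo-os.o foo-noinline.o

   If the two are close, inlining is not where the bytes go.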
3) Do you use exceptions a lot? IIRC, GCC generates stack unwinding
   information for each function that may throw an exception (unless
   something has changed; you are using a recent version), and this
   information is stored in the executable. I don't know if the MS
   compiler does the same thing.
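   Either way, the GCC-side unwinding data is easy to measure: it
   lives in the .eh_frame section (plus .gcc_except_table for C++).
   If your code is plain C and nothing ever unwinds through it, you
   can also try dropping the tables; libfoo.so is a placeholder
   again:

       $ readelf -S libfoo.so | grep -E 'eh_frame|gcc_except'  # unwind table sizes
       $ gcc -Os -fPIC -shared -fno-asynchronous-unwind-tables \
             -o libfoo.so foo.c                                # omit the tables

   Don't do the latter for C++: exceptions need the tables to work.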
--
Oleg Goldshmidt | pub at goldshmidt.org