Disk I/O as a bottleneck?
is123 at zahav.net.il
Sun May 8 10:31:27 IDT 2011
On Sun, 08 May 2011 09:55:35 +0300
Nadav Har'El <nyh at math.technion.ac.il> wrote:
> On Sun, May 08, 2011, is123 at zahav.net.il wrote about "Re: Disk I/O as a
> bottleneck?":
> > I don't agree with this setup. Regular consumer drives setup with RAID
> > to stripe are going to be much, much faster and have less problems in
> > the long run than single SSDs at this point as well as being a better
> > value until prices change a lot.
>
> Having two hard disks will, at best case, *double* your seek time. This is
> still pretty slow, isn't it? Won't an SSD, even cheap one, have a better
> random access read performance?
Striping works because the controllers can overlap I/O. There's no reason
every request has to seek on both drives; that's the *worst* case and
shouldn't normally happen. In the best case the controller knows the data is
on one drive and the seek cost stays the same. Of course a spinning drive is
always happier with sequential access than random, but striping still beats a
single drive for random access because requests can be serviced in parallel:
your system may be doing other work that needs data on the other drive, and
that's a seek you avoid entirely. The more drives you spread your data
across, the less likely you are to thrash any one of them. As we say, "six
barbers, no waiting."
I don't think any cheap SSD has enough cache to beat a good striped setup of
even two physical disks, and the page sizes usually aren't matched to what
Linux wants. I read an article claiming that SSD designs were mostly tuned
for Windows filesystems and can work against you somewhat on Linux or other
OSes. I don't know how true that is, but it makes you think.
Anyway, as you add more disks you keep increasing performance with a striping
setup; there's no downside unless you also want to mirror, but then again
there is no free lunch. You can have it fast, reliable or cheap, but you
can't have it all. The questions are how much RAID and how many drives you
can buy for the price of a good SSD, how much cache you get with striped RAID
versus a good SSD, and how long everything will last. SSDs still have some
nasty issues: wear-leveling limits, lifetime problems, etc. Enterprise
servers are still using physical drives. Those guys can pay anything and they
buy the "old" stuff, so that must tell us something. Personally I'm staying
with drives that spin until the technology improves and I start reading about
production servers that are all SSD. YMMV ;-)
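If anyone wants rough numbers on the random-read side of that question, here
is a tiny back-of-the-envelope sketch in Python. The seek and rotation
figures are my guesses for a typical 7200rpm consumer drive of this era, not
measurements; SSD vendors quote random-read figures well into the thousands,
so the argument really does come down to price, capacity and lifetime rather
than raw seeks:

# Back-of-the-envelope only -- the seek figure is an assumed average for
# a 7200rpm consumer drive; adjust to taste.
avg_seek_ms   = 8.5                  # assumed average seek
rotational_ms = 60000.0 / 7200 / 2   # half a revolution at 7200rpm, ~4.2 ms
per_read_ms   = avg_seek_ms + rotational_ms

single_disk = 1000.0 / per_read_ms
stripe_2    = 2 * single_disk        # best case: requests spread evenly

print("one spindle : ~%d random reads/s" % single_disk)
print("2-way stripe: ~%d random reads/s" % stripe_2)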
> > Consider not using swap, because swap when in use causes a lot of
> > thrashing and kills performance especially if you only have 1 or 2
> > drives. If you have
>
> Even without any swap space, you can have a lot of thrashing: Clean pages
> - program text (the code), shared libraries, memory-mapped files, and so
> on - are simply "forgotten" when the page cache is needed for other
> things, and when you get back to that other program, suddenly all its
> code and libraries are not in memory, and getting them back requires a
> lot of random-access reads from disk, and (especially if the disk is
> doing other things at the same time) causes the program to appear "stuck"
> or "sluggish".
That's true, but swap only makes it worse, since it's orders of magnitude
slower than RAM, and in my experience Linux doesn't do a very good job of
managing it. If I have 4G of RAM and less than 2G is in use, I still see a
few megabytes sitting in swap. What's the reason for that? I don't want to
start swapping until my RAM is very close to maxed out, and then I want swap
to be cleaned out quickly once RAM is available again. Swapping should be an
absolute last resort on a modern machine. It looks like swap handling hasn't
received much attention since it was designed; things have changed a lot, and
it may be time to stop using it until somebody revisits how it should work.
If you have a monitoring program that shows RAM and swap usage, you will see
a dramatic slowdown in performance when you swap even a little.
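For whoever wants to watch this on their own box, here's a small read-only
Python sketch that prints the RAM/swap picture from /proc along with
vm.swappiness, the knob (0-100, default 60) that controls how eagerly the
kernel swaps:

# Read-only peek at the numbers behind this -- nothing here changes any
# settings. Values in /proc/meminfo are in kB.
def meminfo():
    info = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, rest = line.split(":", 1)
            info[key] = int(rest.split()[0])
    return info

m = meminfo()
with open("/proc/sys/vm/swappiness") as f:
    swappiness = f.read().strip()

ram_used  = m["MemTotal"] - m["MemFree"] - m["Buffers"] - m["Cached"]
swap_used = m["SwapTotal"] - m["SwapFree"]

print("RAM used : %d MB of %d MB" % (ram_used / 1024, m["MemTotal"] / 1024))
print("swap used: %d MB of %d MB" % (swap_used / 1024, m["SwapTotal"] / 1024))
print("vm.swappiness = %s" % swappiness)

Lowering it (sysctl vm.swappiness=10, or echo a value into
/proc/sys/vm/swappiness) makes the kernel prefer dropping page cache over
swapping, which is the usual first step when you see swap being touched while
half your RAM is still free.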
I think the OP said he has a fast CPU and tons of RAM, so if the system feels
sluggish the obvious things to look at are filesystem layout, choice of
filesystems, and turning off swap. If that doesn't work, then maybe look into
another OS... at this point Linux (and the BSDs) still don't do SMP as well
as some other OSes, though there may not be any better choices on Intel
hardware. As you or somebody else pointed out, many apps don't thread enough
to exploit a hyperthreaded quad-core. Maybe the OP should look into running
Solaris, which is known to do well in this area. It's probably going to be a
long time until Linux and Linux apps really exploit multicore well.