CFS

Wed Jun 10 00:34:55 IDT 2009

Hi,

Since the early 1970s we've known how to adjust the priority of a
process based on a variety of factors.  Unix (not Linux) uses the
well-known multi-level feedback queue -- you can read about it here:

http://en.wikipedia.org/wiki/Multilevel_feedback_queue

To make a long story short -- when a process is interactive, it rarely
uses its full quantum; consequently, its priority is raised -- this
gives preference to tasks such as using an editor.
When a process is CPU bound it will use its full quantum; in this
case its priority is lowered.
A characteristic of a multi-level feedback queue is that quanta get
shorter as the priority increases; thus, a CPU bound process gets to
run longer, but less often, whereas an interactive process runs more
often but in shorter intervals.
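
To make the feedback rule concrete, here is a minimal sketch in C.  It
is not taken from any real kernel: the number of levels, the quantum
table and the names (struct task, mlfq_quantum_for, mlfq_feedback) are
all made up for illustration, and a real scheduler would also handle
queues, aging and so on.

```c
#include <assert.h>

/* Hypothetical constants for illustration; real systems choose their own. */
enum { MLFQ_LEVELS = 4 };               /* level 0 = highest priority */

/* Quantum in ticks per level: higher priority means a shorter quantum. */
static const int mlfq_quantum[MLFQ_LEVELS] = { 1, 2, 4, 8 };

struct task {
    int level;                          /* current queue, 0 .. MLFQ_LEVELS-1 */
};

/* Quantum the task receives the next time it is dispatched. */
int mlfq_quantum_for(const struct task *t)
{
    assert(t->level >= 0 && t->level < MLFQ_LEVELS);
    return mlfq_quantum[t->level];
}

/*
 * Feedback step, called when the task stops running.
 * used_ticks is the CPU time consumed during this dispatch.
 * A task that used its full quantum is demoted (lower priority,
 * longer quantum); a task that gave up the CPU early is promoted.
 */
void mlfq_feedback(struct task *t, int used_ticks)
{
    int quantum = mlfq_quantum_for(t);

    if (used_ticks >= quantum) {
        if (t->level < MLFQ_LEVELS - 1)
            t->level++;                 /* behaves like a CPU bound task */
    } else {
        if (t->level > 0)
            t->level--;                 /* behaves like an interactive task */
    }
}
```

With this table, a task that keeps burning its whole quantum drifts
down to level 3 and runs 8 ticks at a time, while an editor that blocks
after a tick or so drifts up to level 0 and gets the CPU first, but
only for 1 tick.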

The Wikipedia entry references an old paper by Kleinrock (of queueing
theory vol. 1 and vol. 2 fame) and Muntz.

The Linux developers claim that with CFS they have as good a scheduler
as FreeBSD -- FreeBSD uses the Unix multilevel feedback queue model.
One of the goals of the multilevel feedback queue is to create a system
that is not jittery -- right now, people running Linux on a modern
processor live in a very different world from the one in which these
algorithms were created.

I ran a system with 60 terminals connected to a 16 MHz 80286 with 4 MB
of memory -- doing order entry, order lookups, product shipping and
inventory -- the computer also ran a dozen printers.

Today, processors are faster and have more memory -- on our desktops
we run quite demanding applications -- video and audio streaming --
while at the same time editing a document, compiling code, etc.  I
wasn't able to determine whether anyone has characterized the
performance of CFS with respect to a variety of workloads -- this kind
of work is done analytically or by simulation (e.g., in the old days,
we used Monte Carlo simulations).
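
For anyone who hasn't seen that kind of simulation, here is a toy
sketch in C.  It is only an illustration under strong assumptions, not
a model of CFS or of any real scheduler: all tasks are ready at time 0,
CPU demands are drawn uniformly from [1, max_demand] ticks, the
scheduler is plain round-robin, context switches are free, and the
function names and the small random number generator are made up for
this example.

```c
#include <assert.h>
#include <stdint.h>

enum { MAX_TASKS = 64 };                /* arbitrary limit for this sketch */

/* Tiny linear congruential generator so that runs are reproducible. */
static uint32_t lcg_next(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state >> 8;                 /* drop the weakest low bits */
}

/*
 * One trial: n tasks, each with a random CPU demand, are served
 * round-robin with the given quantum.  Returns the mean completion
 * time in ticks.
 */
double rr_trial(int n, int max_demand, int quantum, uint32_t *rng)
{
    int remaining[MAX_TASKS];
    long now = 0, total_completion = 0;
    int left = n;

    assert(n > 0 && n <= MAX_TASKS && max_demand > 0 && quantum > 0);
    for (int i = 0; i < n; i++)
        remaining[i] = 1 + (int)(lcg_next(rng) % (uint32_t)max_demand);

    while (left > 0) {
        for (int i = 0; i < n; i++) {
            if (remaining[i] == 0)
                continue;               /* task already finished */
            int run = remaining[i] < quantum ? remaining[i] : quantum;
            now += run;
            remaining[i] -= run;
            if (remaining[i] == 0) {
                total_completion += now;
                left--;
            }
        }
    }
    return (double)total_completion / n;
}

/* Monte Carlo estimate: average the per-trial mean over many trials. */
double rr_monte_carlo(int trials, int n, int max_demand, int quantum,
                      uint32_t seed)
{
    uint32_t rng = seed;
    double sum = 0.0;

    assert(trials > 0);
    for (int t = 0; t < trials; t++)
        sum += rr_trial(n, max_demand, quantum, &rng);
    return sum / trials;
}
```

Calling rr_monte_carlo() with the same workload parameters and
different quantum values gives a rough feel for how the quantum affects
mean completion time; characterizing a real scheduler would need
realistic arrival, I/O and sleep patterns on top of this.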

Shlomo

On Tue, Jun 9, 2009 at 11:40 AM, <linux-il-request at cs.huji.ac.il> wrote:
> Today's Topics:
>
>   1. Re: How to count dropped connections (Amos Shapira)
>   2. Re: How to count dropped connections (Amos Shapira)
>   3. Re: How to count dropped connections (Shachar Shemesh)
>   4. Re: How to count dropped connections (shimi)
>   5. Re: How to count dropped connections (Imri Zvik)
>   6. Re: OT: Bezeqint made me "poof... he's gone" (guy keren)
>   7. Re: A question about CFS (Oleg Goldshmidt)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 9 Jun 2009 21:40:25 +1000
> From: Amos Shapira <amos.shapira at gmail.com>
> Subject: Re: How to count dropped connections
> To: shimi <linux-il at shimi.net>
> Cc: linux-il <linux-il at cs.huji.ac.il>
> Message-ID:
>        <9c2cca270906090440t7e56c349n38c09021fc779799 at mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> 2009/6/9 shimi <linux-il at shimi.net>:
>> Look,
>>
>> Basically, if your HTTP connection handler handles many connections well -
>> your module (backend) processing would become the bottleneck. That's how it
>> usually happens. So if you wrap the requests by a frontend proxy (again, I
>> recommend nginx) - and just put an error log for the relevant vhosts there,
>> every time nginx cannot pass a request to backend processing, would be
>> logged. Then you only need to look at the log, and that's it!
>
> Thanks. I looked at nginx a while ago and it looks like a great
> performance booster. We might be able to use it for some of the
> applications if/when it comes to that. I'll consider it for this
> problem for most applications.
>
> On the other hand:
>
> 1. It means that we'll have another application-level proxy where
> right now we are very happy with LVS's performance, transparency and
> handling of lots of other traffic going on (we also use it for
> internal VIP forwarding among the various components of the system).
> I.e. we'll need yet another technology to maintain in addition to LVS,
> which we are very happy with.
>
> 2. One of the applications does lots of TCP/IP-level connection
> sniffing so it can't be used behind an application-level proxy, it
> must have a direct connection to the browser (LVS works for us since
> it acts like a bridge - doesn't touch anything inside the packet
> except for the destination MAC address).
>
> Your suggestion to check nginx incoming vs. outgoing gave me another
> idea - I'll try to find whether I can get such stats from the LVS
> server, though LVS itself could be dropping connections due to lack of
> space to track all of them too (which closes the loop - how can I tell
> whether nginx' own server doesn't drop incoming connections which
> nginx itself doesn't know about?).
>
> Thanks,
>
> --Amos
>
>
>
> ------------------------------
>
> Message: 2
> Date: Tue, 9 Jun 2009 21:42:08 +1000
> From: Amos Shapira <amos.shapira at gmail.com>
> Subject: Re: How to count dropped connections
> To: Noam Rathaus <noamr at beyondsecurity.com>
> Cc: linux-il <linux-il at cs.huji.ac.il>
> Message-ID:
>        <9c2cca270906090442y5a3aff8ai36f66e56cf2ce7fa at mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> 2009/6/9 Noam Rathaus <noamr at beyondsecurity.com>:
>> Amos,
>>
>> What are you trying to count? I hope I understood you correctly, you want to
>> know how many HTTP requests are being handled, against those that couldn't
>> be handled due to lack of connections.
>
> Yes. "How many connections from customers have reached our servers but
> failed to complete the TCP hand shake and send a request?".
>
>>
>> netstat is a very bad counting devices, unless you are counting packets.
>
> I know. I try to use it as a tool to find counters which might exist
> in the kernel. For instance - it can't tell me which port or IP
> address the connections failed on.
>
>>
>> If you want to count "requests" I would count incoming connection requests
>> (SYN) vs apache log of requests
>>
>> The incoming connections should be counted using tcpdump or similar
>
> I just read during my googl'ing that tcpdump is not reliable - it
> could report packets which haven't been sent or count packets more
> than once. Also it slows down the network for time-stamping.
>
> Maybe a clever iptables rule can count incoming SYN packets on the
> relevant ports (we listen on about 4-5 different ports) and then I can
> compare it against Apache access log for same period.
>
>>
>> while apache log should be easily achievable by grep
>
> If the TCP-level connection is dropped before an HTTP request is
> received then I'm not sure Apache's log will show it (just tried this
> on a Ubuntu desktop, don't know how much it indicates for CentOS 5).
>
> Thanks,
>
> --Amos
>
>
>
> ------------------------------
>
> Message: 3
> Date: Tue, 09 Jun 2009 15:13:43 +0300
> From: Shachar Shemesh <shachar at shemesh.biz>
> Subject: Re: How to count dropped connections
> To: Amos Shapira <amos.shapira at gmail.com>
> Cc: linux-il <linux-il at cs.huji.ac.il>
> Message-ID: <4A2E51F7.4040707 at shemesh.biz>
> Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
>
> Amos Shapira wrote:
>>
>> Maybe a clever iptables rule can count incoming SYN packets on the
>> relevant ports (we listen on about 4-5 different ports) and then I can
>> compare it against Apache access log for same period.
>>
> No need for anything special. Just do "iptables -L -v" to see how many
> hits on each rule. iptables even has a command option that gives you
> the stats and atomically zeroes the counters. All you need in addition
> is grep, and you're almost set.
>>
>>> while apache log should be easily achievable by grep
>>>
>>
>> If the TCP-level connection is dropped before an HTTP request is
>> received then I'm not sure Apache's log will show it (just tried this
>> on a Ubuntu desktop, don't know how much it indicates for CentOS 5).
>>
> Do you count that as a successful connection? It sounds to me like it is
> not, which means that apache not listing it is actually a good thing.
>
> What I would be worried about (not very, mind you) is SYN floods and
> other stuff. Some failed TCP connections should not be counted (SYN is
> invalid, three way handshake did not complete due to client
> considerations, retransmitted SYNs etc.). The only way I can think of to
> find those is a sniffer (I don't know of any tcpdump rules that can
> match those, and I wouldn't trust its performance anyway, so I think a
> dedicated one would work best).
>
> Shachar
>
> --
> Shachar Shemesh
> Lingnu Open Source Consulting Ltd.
> http://www.lingnu.com
>
> ------------------------------
>
> Message: 4
> Date: Tue, 9 Jun 2009 15:22:58 +0300
> From: shimi <linux-il at shimi.net>
> Subject: Re: How to count dropped connections
> To: Amos Shapira <amos.shapira at gmail.com>
> Cc: linux-il <linux-il at cs.huji.ac.il>
> Message-ID:
>        <9eba290f0906090522u2df58d8fg51644a24760ecbd at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> On Tue, Jun 9, 2009 at 2:40 PM, Amos Shapira <amos.shapira at gmail.com> wrote
>
>> 1. It means that we'll have another application-level proxy where
>> right now we are very happy with LVS's performance, transparency and
>> handling of lots of other traffic going on (we also use it for
>> internal VIP forwarding among the various components of the system).
>> I.e. we'll need yet another technology to maintain in addition to LVS,
>> which we are very happy with.
>
>
> Can't help with that, but I will mention that nginx has a load balancing
> feature as well;
>
>
>>
>> 2. One of the applications does lots of TCP/IP-level connection
>> sniffing so it can't be used behind an application-level proxy, it
>> must have a direct connection to the browser (LVS works for us since
>> it acts like a bridge - doesn't touch anything inside the packet
>> except for the destination MAC address).
>
>
> You mean that it connects back to the origin and then run stuff on it? If
> that's the problem, nginx has an option to forward the originating IP
> address via an HTTP header, which you can then use in your application.
>
>
>>
>>
>> Your suggestion to check nginx incoming vs. outgoing gave me another
>> idea - I'll try to find whether I can get such stats from the LVS
>> server, though LVS itself could be dropping connections due to lack of
>> space to track all of them too (which closes the loop - how can I tell
>> whether nginx' own server doesn't drop incoming connections which
>> nginx itself doesn't know about?).
>>
>
> I think that you can't (aside from sniffing). However nginx is designed for
> tens of thousands of simultaneous keep-alive sessions, with a *very* small
> footprint. This is a very high limit. I think that above it, you'd need
> something which is hardware assisted (like F5 BigIP)
>
> From my experience, nginx allowed to reduce the number of running Apache
> threads (apache+mod_php5 was the setup). I don't know how much of your
> content can be served directly by nginx (I dumped Apache completely, because
> nginx can do PHP via FastCGI). How much you'll reduce depends on what volume
> of the requests is going to your proprietary module :) But all static
> content handling can be done by nginx. When number of Apache threads go
> down, your CPU load goes down, and your memory usage goes WAY down. My
> numbers were 10-20 loadavg with 6GB RAM, down to loadavg of 1 with a few
> tens of MBs of RAM... my hardware problem became a bandwidth problem... our
> uplink couldn't sustain what we now been able to push... of course, that's
> because I was pure-PHP - so your case is different.
>
> But really, setting it up is a breeze, and very easy to test. I'll do give
> you a tip, though. nginx has static buffers set up for everything; And it
> tests them and returns an HTTP error if a request is larger than the
> buffers. So if your requests are bigger than very plain access (large
> cookies, file uploads, etc) - be advised to set up the relevant buffers in
> nginx.conf with values larger than the default (the default is VERY light on
> resources...)
>
> -- Shimi
>
> ------------------------------
>
> Message: 5
> Date: Tue, 09 Jun 2009 18:02:19 +0300
> From: Imri Zvik <imriz at inter.net.il>
> Subject: Re: How to count dropped connections
> To: linux-il at cs.huji.ac.il
> Cc: Shachar Shemesh <shachar at shemesh.biz>
> Message-ID: <200906091802.19536.imriz at inter.net.il>
> Content-Type: text/plain; charset=utf-8
>
> On Tuesday 09 June 2009 15:13:43 Shachar Shemesh wrote:
>> > If the TCP-level connection is dropped before an HTTP request is
>> > received then I'm not sure Apache's log will show it (just tried this
>> > on a Ubuntu desktop, don't know how much it indicates for CentOS 5).
>>
>> Do you count that as a successful connection? It sounds to me like it is
>> not, which means that apache not listing it is actually a good thing.
>>
>> What I would be worried about (not very, mind you) is SYN floods and
>> other stuff. Some failed TCP connections should not be counted (SYN is
>> invalid, three way handshake did not complete due to client
>> considerations, retransmitted SYNs etc.). The only way I can think of to
>> find those is a sniffer (I don't know of any tcpdump rules that can
>> match those, and I wouldn't trust its performance anyway, so I think a
>> dedicated one would work best).
>
> How about using iptables to count the TCP packets containing SYNs and
> comparing it to the access_log entries? There are a couple of pitfalls here
> that need to be addressed (like retransmission of SYN packets), but this
> could probably be avoided by using a parsing script, which would eliminate
> the duplicates.
>
>
>
>
>
> ------------------------------
>
> Message: 6
> Date: Tue, 09 Jun 2009 21:31:40 +0300
> From: guy keren <choo at actcom.co.il>
> Subject: Re: OT: Bezeqint made me "poof... he's gone"
> To: Oron Peled <oron at actcom.co.il>
> Cc: linux-il <linux-il at cs.huji.ac.il>
> Message-ID: <4A2EAA8C.7080005 at actcom.co.il>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
>
> looks like www.actcom.co.il doesn't respond to requests any longer.
> users.actcom.co.il does respond to requests.
>
> it looks like this breaks my site as well, since i've used
> www.actcom.co.il in the links between my pages :0
>
> i guess i'll modify the site for a new URL and upload a new version over
> the weekend. in fact, i think i would better buy my own domain finally
> and move the site elsewhere.... i've postponed this for more than a
> decade ;)
>
> --guy
>
> Oron Peled wrote:
>> Good day (cross-posted, check when replying).
>>
>> As a previous customer of Actcom I continued with Bezeqint under
>> the same terms (including a contract renewal ~1 year ago).
>>
>> Few days ago I accidentally discovered that my hosted homepage wasn't
>> accessible -- further tests + ~1 hour on the phone (navigating through
>> Bezeqint support structure) revealed the unbelievable....
>>
>> THE FREAKING BASTARDS PULLED THE PLUG ON THE DOMAINS WITHOUT EVEN TELLING
>> ANYBODY.
>>
>> I'm now in damage control mode (formal faxes to customer support, etc.)
>> Anybody else?
>>
>
>
>
>
> ------------------------------
>
> Message: 7
> Date: Tue, 09 Jun 2009 21:37:49 +0300
> From: Oleg Goldshmidt <pub at goldshmidt.org>
> Subject: Re: A question about CFS
> To: Shachar Shemesh <shachar at shemesh.biz>
> Cc: linux-il <linux-il at cs.huji.ac.il>
> Message-ID: <m3d49djls2.fsf at goldshmidt.org>
> Content-Type: text/plain; charset=us-ascii
>
>
> Shachar Shemesh <shachar at shemesh.biz> writes:
>
>> Hi all,
>> I'm trying to understand Linux's Completely Fair Scheduler better
>
> Hi,
>
> Disclaimer: i've never had a chance to look at the CFS nearly as
> closely as the previous incarnation ["O(1)"]. I'll take a shot, though.
>
>>    1.  How do the different priorities factor in?
>
> My understanding is that CFS is within a single priority level.
>
>>      Obviously, the real time priorities do not (their scheduling
>>      methods are well defined), but how about the nice values? The
>>      "fair clock" is a property of "rq" (run queue)
>>      - does that mean that each nice value has its own queue?
>
> Yes, as in O(1). The implementation of the runqueue has changed in
> CFS.
>
>> If so, how do the relative priorities happen?
>
> I am not sure what you mean by "relative priorities". Do you mean,
> "niceness"? If so, there is a simple mapping of niceness to priority
> in kernel/sched.c (unchanged from O(1)):
>
> /*
>  * Convert user-nice values [ -20 ... 0 ... 19 ]
>  * to static priority [ MAX_RT_PRIO..MAX_PRIO-1 ],
>  * and back.
>  */
> #define NICE_TO_PRIO(nice)      (MAX_RT_PRIO + (nice) + 20)
> #define PRIO_TO_NICE(prio)      ((prio) - MAX_RT_PRIO - 20)
> #define TASK_NICE(p)            PRIO_TO_NICE((p)->static_prio)
>
>
>>    2.  Are those "all processes in the system", or just "ready
>>        processes in the system"?
>
> Only ready or running processes are scheduled - others don't even want
> the CPU.
>
>>      Wikipedia says that CFS treats wait time due to "sorry, no CPU
>>      for you" and wait time due to "I don't need the CPU right now"
>>      the same, so this suggests all processes.
>
> I suspect you are a bit confused here, though it sounds like you've
> made the right guess below. Accounting for sleep time when
> deciding whom to give the CPU the soonest is related to recognizing
> interactive processes: they typically sleep a lot waiting for user
> input and such. In the O(1) scheduler such processes are allocated
> additional time slices in the current epoch, in CFS their "fair share"
> of the CPU is increased accordingly.
>
>>    3.  The articles say that CFS gives extra priority for
>>        interactive processes, but does not mention how. Is this just
>>        a by product of the wall time - wait time calculation (makes
>>        sense), or is there any additional tweaking going on that
>>        does that?
>
> See above.
>
> --
> Oleg Goldshmidt | pub at goldshmidt.org
>
>
>
> ------------------------------
>
>
> End of Linux-il Digest, Vol 6, Issue 9
> **************************************
>