About Multi-cores and Multi-tasking

Shachar Shemesh shachar at shemesh.biz
Wed Apr 21 08:25:04 IDT 2010


Shlomi Fish wrote:
> Hi all!
>
> I once read that in order to truly take advantage of having multiple cores on 
> the same CPU, one needs to use several threads. On the other hand, some 
> people have assumed or implied that if your application splits the work among 
> several processes, it can also take advantage of multiple cores. So my 
> question is: can several distinct processes each execute on their own core?
>
> From my experience with benchmarking http://fc-solve.berlios.de/ , I've 
> noticed that multi-processing was a bit faster than multi-threading on my 
> P4-2.4GHz machine ("hyperthreading"), while multi-threading was faster than 
> multi-processing on my Intel x86-64-based laptop with two cores running in 
> x86-64 mode. It's possible that the multi-tasking in both cases is sub-
> optimal, but I ran the same programs on both computers.
>
> I'd appreciate it if anyone could shed some light on this.
>
> Regards,
>
> 	Shlomi Fish
>   
Let's define a few terms.

Multi-processor - several CPUs on the same machine. These are physically 
distinct CPUs.
Multi-core - several CPUs inside the same chip. The cores are distinct, 
but they do share some of the cache.
Hyperthreading - now there's a misunderstood concept. Ever since the 
first Pentium, even a single core has had the ability to do some parallel 
processing. The original Pentium had two pipelines, in which it could 
(sometimes) execute two consecutive machine instructions at the same time.
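
To illustrate the pipeline point, here's a rough, untested sketch (not
from the benchmark discussed above): summing an array with one
accumulator creates a chain where every addition waits for the previous
one, while two independent accumulators give the CPU two consecutive
instructions it can issue in parallel. Actual timings depend on the CPU
and on compiler flags.

/* Sketch: one dependency chain vs. two independent ones. */
#include <stddef.h>

double sum_one_chain(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];              /* each add waits for the previous one */
    return s;
}

double sum_two_chains(const double *a, size_t n)
{
    double s0 = 0.0, s1 = 0.0;
    size_t i;
    for (i = 0; i + 1 < n; i += 2) {
        s0 += a[i];             /* these two additions are independent, */
        s1 += a[i + 1];         /* so they can go down separate pipelines */
    }
    if (i < n)
        s0 += a[i];
    return s0 + s1;
}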

With hyperthreading, the parallelism extends beyond instruction 
execution. Essentially, you have one CPU with two sets of registers 
and, say, six pipelines. Two are dedicated to the first set of 
registers, two to the second, and two can be assigned to either one, as 
the need arises.
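
One practical consequence: with hyperthreading enabled, each physical
core shows up to the OS as two logical CPUs. A minimal sketch (assuming
Linux/glibc) that just asks how many logical CPUs are online:

/* Sketch: count logical CPUs; hyperthreads are counted separately. */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    if (n < 1) {
        perror("sysconf");
        return 1;
    }
    printf("Logical CPUs online: %ld\n", n);
    return 0;
}

(/proc/cpuinfo is what you'd look at to tell real cores from 
hyperthread siblings.)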

The rest of this mail is pure conjecture.

The more data is shared between the execution threads, the closer 
you want them to run. This way, the cache does not need to be invalidated 
whenever a piece of data changes. This is why, when you run two 
processes, multi-CPU is faster (each process has its own data and 
instruction cache, which results in more total cache), but for a 
multi-threaded app, multi-core is faster (shared L2 cache). When you run 
in threaded mode on a multi-CPU machine, each change to the shared address 
space needs to invalidate the cache the other CPU holds for that same 
area, which is why you see the slowdowns.
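
Here's an untested sketch of that invalidation effect (often called
"false sharing"): two threads each increment their own counter, but when
both counters sit in the same cache line, the line bounces between the
CPUs' caches on every write. Padding each counter to its own cache line
usually makes the run noticeably faster; the numbers will vary a lot
from machine to machine. Compile with something like gcc -O2 -pthread.

/* Sketch: false sharing between two threads. */
#include <pthread.h>
#include <stdio.h>

#define ITERATIONS 100000000UL

struct counter {
    volatile unsigned long value;   /* volatile: force every write to memory */
    /* Uncomment to give each counter its own 64-byte cache line: */
    /* char pad[64 - sizeof(unsigned long)]; */
};

static struct counter counters[2];

static void *worker(void *arg)
{
    struct counter *c = arg;
    for (unsigned long i = 0; i < ITERATIONS; i++)
        c->value++;                 /* each write dirties the shared cache line */
    return NULL;
}

int main(void)
{
    pthread_t t[2];

    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, worker, &counters[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);

    printf("%lu %lu\n", counters[0].value, counters[1].value);
    return 0;
}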

As for hyperthreading - in theory, that should have been the fastest, as 
the two hyperthreads share the L1 cache as well as the L2. There are 
several "implementation" problems with this, however. The most obvious 
one is that it is difficult to know when one logical CPU is truly idle. 
Think about a thread spinning on a spin lock while waiting for the other 
one to complete a task. It does not appear idle, so it grabs both 
assignable pipelines, while the other semi-core has only two pipelines 
to complete the task that is holding the lock in the first place. Another 
problem with hyperthreading is that it stretches the L1 cache thin when 
executing unrelated tasks. My guess is that this is why Intel wound up 
removing it.

If, as Shimi says, Intel is re-introducing it, maybe they think they 
have found reasonable solutions to the above problems.

Shachar

-- 
Shachar Shemesh
Lingnu Open Source Consulting Ltd.
http://www.lingnu.com
