Linux - Kernel: This forum is for all discussion relating to the Linux kernel.
I'm working on a project, and I would like to know how the Linux kernel schedules POSIX threads on a multi-core machine.
What I'm looking for is the algorithm used. Is it safe to assume that, in a program that has spawned 10 threads, the two threads that were least recently run and that are not in a critical section will be executed in parallel for some time, and then the next two threads are picked in round-robin fashion? Is that how it works?
No, you cannot assume anything of the sort - even if you knew the algorithm intimately. Here is the doco from the author from when CFS was introduced. The previous incarnation was time-slice based, and is still in use at many (most?) production sites.
I skimmed over the link you posted - I will read it more thoroughly later today. But let me ask you the following question.
The application I'm working on is responsible for rescheduling the threads of any application. Based on some criteria, it decides "now it's the turn of these two threads (two because of multi-core) to run". It blocks all the other threads so the kernel cannot pick them up.
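The "release two, block the rest" idea above can be sketched as a user-space gate built on a mutex and condition variable. This is only an illustration of the pattern, not the poster's actual code; all the names (gate_t, gate_wait, gate_release_pair, the count of 10 workers) are hypothetical.

```c
/* Sketch: a gate that lets only controller-chosen workers proceed.
 * Workers block in gate_wait() until the controller marks their ID
 * active; everything here is assumed/illustrative. */
#include <pthread.h>
#include <stdbool.h>

#define NWORKERS 10

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool active[NWORKERS];   /* which worker IDs may run right now */
} gate_t;

/* A worker calls this before each unit of work; it sleeps until the
 * controller has marked this worker active. */
void gate_wait(gate_t *g, int id) {
    pthread_mutex_lock(&g->lock);
    while (!g->active[id])
        pthread_cond_wait(&g->cond, &g->lock);
    pthread_mutex_unlock(&g->lock);
}

/* The controller marks exactly the two given workers runnable and
 * wakes everyone; non-chosen workers re-check and go back to sleep. */
void gate_release_pair(gate_t *g, int a, int b) {
    pthread_mutex_lock(&g->lock);
    for (int i = 0; i < NWORKERS; i++)
        g->active[i] = (i == a || i == b);
    pthread_cond_broadcast(&g->cond);
    pthread_mutex_unlock(&g->lock);
}
```

Note that this only stops the blocked threads from being runnable; which CPU the two released threads actually land on is still entirely up to the kernel scheduler.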
It appears to me that the link does not mention how threads can run in parallel, so I don't know how to design my application's "scheduling" policy. On a single-core machine my application would normally keep only one thread running and all the others blocked. Now I should consider keeping two threads active, since I have seen on my machine that threads can actually run in parallel.
There is a separate run queue for each processor, so the decisions documented are made per processor. The scheduler is also aware of hyperthreading and NUMA nodes. Release your threads as you wish; they will be added to the appropriate run queue. If need be you can pin processes to specific processors, but it isn't recommended.
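For completeness, the pinning mentioned above is done with the CPU-affinity API. A minimal sketch, assuming a Linux system with glibc (pthread_setaffinity_np is a non-portable GNU extension, hence the _np suffix); the helper name pin_to_cpu is made up:

```c
/* Pin the calling thread to a single CPU (Linux/glibc only). */
#define _GNU_SOURCE
#include <sched.h>
#include <pthread.h>

/* Returns 0 on success, an errno value on failure. */
int pin_to_cpu(int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);            /* start with an empty CPU mask */
    CPU_SET(cpu, &set);        /* allow only the requested CPU */
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}
```

After a successful call, sched_getcpu() for that thread will report the pinned CPU, because the scheduler may no longer migrate it elsewhere.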
After a bit more reading and research, I realized that nobody mentions how threads are assigned to CPUs. For example, if I have just created 10 runnable threads waiting to be scheduled, which of them is going to be placed on which CPU? Is this done randomly, i.e. on a two-CPU machine does each thread have a 50% chance of landing in the queue of CPU1 and 50% for CPU2? Once they end up on one CPU, can they later migrate to another?
This is important for me. Another example: an application creates two threads. Is it possible that they will both end up on CPU1? That makes no sense if it's true. So the question is... how does the kernel put threads from the same app on different CPUs?
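One way to answer this empirically rather than from documentation is to have each thread report where it is running via sched_getcpu() (Linux-specific). A small sketch, with made-up names; note the placement is the scheduler's choice and can differ from run to run, so the output is not predictable:

```c
/* Observe (not control) thread placement: each thread prints the CPU
 * it is currently executing on. Linux/glibc only. */
#define _GNU_SOURCE
#include <sched.h>
#include <pthread.h>
#include <stdio.h>

static void *report_cpu(void *arg) {
    /* The reported CPU is just a snapshot; the thread may migrate. */
    printf("thread %ld on CPU %d\n", (long)arg, sched_getcpu());
    return NULL;
}

void spawn_and_report(int nthreads) {
    pthread_t t[nthreads];
    for (long i = 0; i < nthreads; i++)
        pthread_create(&t[i], NULL, report_cpu, (void *)i);
    for (int i = 0; i < nthreads; i++)
        pthread_join(t[i], NULL);
}
```

Running this repeatedly shows that two threads of the same process can land on the same CPU or on different ones, depending on load at that moment.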
I have googled it and still haven't found the answer.
I suspect only Ingo knows the real answers. Most of the time it doesn't matter. And yes, tasks will be redispatched to balance the load. It sometimes makes sense to keep tasks on the one CPU if they are using the same data: the (hardware) cache becomes more effective and the performance of both tasks can improve.
So do you have any final advice to give me before we let this thread die? Any general ideas about how to attack this problem? As you can see my task is not trivial...
Actually I don't see the concern.
Manage your threads as you see fit - don't try to second-guess the scheduler. It's been (massively) changed recently, and no doubt will be again. You can't even rely on knowing the processor count - and even if you do get it right, there is no guarantee you can use them all. Virtualization, cgroups, CPU hot-plug... all can change the environment dynamically.
My advice would be to concentrate on making your code multi-processor safe - let the scheduler look after the details of actually dispatching them. There is always other work competing for the processors - kernel, kernel threads, interrupt handlers... not to mention userspace.
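"Multi-processor safe" here means protecting shared state so the result is correct no matter how the scheduler interleaves or parallelizes the threads. A standard minimal example (names counter_t, counter_add, worker are illustrative):

```c
/* A shared counter protected by a mutex: increments from any number of
 * threads, on any CPUs, in any order, always sum correctly. */
#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;
    long value;
} counter_t;

void counter_add(counter_t *c, long n) {
    pthread_mutex_lock(&c->lock);
    c->value += n;             /* read-modify-write is now atomic */
    pthread_mutex_unlock(&c->lock);
}

/* Worker: 100000 increments; without the lock, two workers racing on
 * different CPUs would lose updates and the total would come up short. */
static void *worker(void *arg) {
    for (int i = 0; i < 100000; i++)
        counter_add((counter_t *)arg, 1);
    return NULL;
}
```

Code written this way is correct whether the kernel runs the threads in parallel on separate cores or time-slices them on one, which is exactly why you don't need to know the dispatch details.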