[SCore-users] Gang scheduling details

Shreenivasa Venkataramaiah shreeni at cs.uh.edu
Thu Sep 26 03:17:44 JST 2002


Thanks a lot. As of now I went through your papers about the 
implementation and have got a fair idea about it. Before going into 
further details of the implementation, I would like to do some 
performance measurements of my own so that I can decide on the next 
course of action.

For this I need to know how to disable gang-scheduling on SCore? I am 
planning to run the NAS benchmarks with and without gang scheduling. 
When I say disable gang scheduling, I mean I should be able to run the 
benchmarks as if the cluster was just a collection of independent nodes 
with some version of MPI (preferably MPICH) on them.

Thanks again,
shreeni.

Atsushi HORI wrote:

>Thanks, John,
>
>  
>
>>You'll have to wait for the official response to confirm this, but my
>>interpretation and is that the scheduling on the time-sharing queues in
>>SCore is loosely based on the Distributed Hierarchical Control gang
>>scheduling algorithm from Feitelson.
>>    
>>
>
>A hierarchical control is used in SCore, but it is different from 
>Feitelson's DHC. The control of parallel processes (this is defined 
>as a set of (Unix) processes derived from a parallel program) and job 
>scheduling are decoupled, while in DHC they are merged into one 
>distributed tree.
>
>In the early stage of SCore development, I designed and implemented a 
>distributed scheduler (and control of parallel processes), but soon I 
>found that there was a large amount of scheduling overhead. The 
>reason is simple. Communication is still more than 100 times slower 
>than local memory access. Now centarlized scheduling is implemented 
>in SCore. The control of parallel processes remains distributed from 
>the earliest version.
>
>DHC is using binary tree for both of job scheduling and job control. 
>The binary tree can be optimum for job scheduling, however, not 
>optimum for job control. Once I evaluated SCore implementation, and I 
>found 8-ary or 16-ary distributed tree was the best for controling 
>parallel processes. This is another reason why SCore is NOT based on 
>DHC.
>
>  
>
>>Also I believe there are some papers by Hori et al on the
>>implementation.  And for details of the buddy allocation scheme start
>>with Feitelson.
>>    
>>
>
>Well, binary buddy is a "natural" design, I believe. I chose binary 
>buddy scheme based on the fact that many users want to submit jobs 
>with the numbers of power of 2, as in the Feitelson's papaer (not for 
>DHC), 
>
>Anyway, Shreeni, I could answer how gang-scheudling implementation 
>issues are answered in SCore, if you ask me in more details.
>
>----
>Atsushi HORI
>Swimmy Software, Inc.
>
>_______________________________________________
>SCore-users mailing list
>SCore-users at pccluster.org
>http://www.pccluster.org/mailman/listinfo/score-users
>  
>


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://new1.pccluster.org/pipermail/score-users/attachments/20020925/73ec1fc8/attachment.html>


More information about the SCore-users mailing list