[SCore-users] sc_watch problem

Atsushi HORI hori at swimmy-soft.com
Thu Dec 4 19:03:53 JST 2003


Hi,

>Under heavy (IO ??) load from cluster members the sc_watch seems to
>get some kind of a timeout and it decides to rerun itself. This
>usually happens once or twice per night, when we are running many jobs.

sc_watch periodically communicates with SCore-D, and when a timeout 
happens (there is no answer within a certain time), it assumes 
something is wrong.

Under a heavy load, a timeout can happen because the compute hosts 
are very busy and their response to sc_watch is late.

To avoid this, you can set a longer timeout with the "-t" option, 
followed by the duration in minutes. The default timeout is ten 
minutes.
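
To illustrate why a longer timeout helps, here is a minimal sketch of the 
kind of check described above. It is not the actual sc_watch code; the 
function name has_timed_out, the constant, and the sample timestamps are 
all hypothetical, and only the ten-minute default comes from this message.

```python
from datetime import datetime, timedelta

# Default taken from the message above; everything else is illustrative.
DEFAULT_TIMEOUT_MINUTES = 10


def has_timed_out(last_reply, now, timeout_minutes=DEFAULT_TIMEOUT_MINUTES):
    """Return True if no reply has arrived within the timeout window."""
    return now - last_reply > timedelta(minutes=timeout_minutes)


# A busy compute host answers 12 minutes after the previous reply.
last_reply = datetime(2003, 12, 4, 2, 0)
now = datetime(2003, 12, 4, 2, 12)

default_result = has_timed_out(last_reply, now)                      # 10-minute window
relaxed_result = has_timed_out(last_reply, now, timeout_minutes=30)  # 30-minute window
print(default_result, relaxed_result)
```

With the default window the late reply counts as a failure, while with a 
30-minute window the same delay is tolerated.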

----
Atsushi HORI
Swimmy Software, Inc.



