Monitoring SCore-D

The SCore-D operating system is monitored through broadcasting programs, which collect the information produced by SCore-D and broadcast it to client programs. With this design only a few broadcasting programs connect to SCore-D, rather than many client programs all requesting the same information at the same time. The following diagram illustrates the arrangement:
[Monitoring schematic diagram]
The diagram shows the cluster, running the SCore-D operating system scored(8), on the left-hand side. Broadcast programs are started on one or more server machines. Clients connect to the broadcast programs to get monitoring information, rather than directly to SCore-D.

The broadcast program provided with the SCore software is called scbcast(8). One scbcast server process can be initiated for each monitoring function. Currently, the following monitoring functions are supported: sysmon, syslog and schedmon.

The order in which the processes are started is important. The scbcast servers must be started before SCore-D and before the client monitoring processes. Both SCore-D and the client monitoring processes make TCP/IP connections to scbcast, which broadcasts the monitoring information produced by SCore-D to its clients. Here is an example of initiating the processes:

server1# scbcast sysmon
scbcast started
server1# scout -g pcc
SCOUT: Spawn done.
SCOUT: session started
server1# scored -sysmon server1
SYSLOG: Timeslice is set to 500[ms]
SYSLOG: Cluster[0]: comp0.pccluster.org@0...comp3.pccluster.org@3
SYSLOG:   BIN=linux, CPUGEN=pentium-iii, SMP=1, SPEED=500
SYSLOG:   Network[0]: myrinet/myrinet
SYSLOG: SCore-D network: myrinet/myrinet
SYSLOG: SCore-D server: comp3.pccluster.org:9901
<1> SCore-D: Connected to sysmon server (server1:9904)
The above example shows a broadcast server started on host 'server1' to monitor the sysmon function. scout(1) is started for the cluster environment and then scored is started with the -sysmon option. The argument to the option is the host address of the scbcast server. No TCP/IP port is specified, so the default of 9904 is used. This can be seen in the last line of output from SCore-D shown above. If the broadcast server has not been started before scored, the following warning message is issued:
<1> SCore-D:WARNING Failed to connect to sysmon server (server1:9904)
Status information is now sent from SCore-D to the scbcast server process, and system monitoring can be performed with the sctop(1) command.
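Because scored only issues a warning when the broadcast server cannot be reached, a start-up script may want to check the port before launching it. The following is a minimal sketch and not part of the SCore distribution: wait_for_port is a hypothetical helper name, it assumes bash is installed (it relies on bash's /dev/tcp redirection), and it uses the default sysmon port 9904 seen in the output above.

```shell
# Hypothetical helper (not an SCore command): succeed as soon as HOST:PORT
# accepts a TCP connection, retrying once per second up to TRIES times.
# Usage: wait_for_port HOST PORT [TRIES]
wait_for_port() {
    local host="$1" port="$2" tries="${3:-10}" i=0
    while [ "$i" -lt "$tries" ]; do
        # /dev/tcp/HOST/PORT is a bash feature, so bash is invoked explicitly.
        # The connection is closed again when the child shell exits.
        if bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        if [ "$i" -lt "$tries" ]; then
            sleep 1
        fi
    done
    return 1
}

# Example (commented out so the sketch has no side effects):
# wait_for_port server1 9904 && scored -sysmon server1
```

A refused connection fails immediately, but an unreachable host may block for the system's TCP connect timeout on each attempt, so the total wait can exceed TRIES seconds.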

The following example shows how to monitor SCore-D system status using the sc_syslog(8) command and to output the information to a log file. Start the scbcast server for syslog:

server2# scbcast syslog
scbcast started
server2# 
Execute the sc_syslog command to monitor SCore-D system status. The arguments to sc_syslog are the host running the scbcast server and the name of the log file you wish to create (or append to):
$ sc_syslog server2 /tmp/scored.messages
sc_syslog started.
$ 
Now start scored in a scout environment, with the -syslog option:
server1# scout -g pcc
SCOUT: Spawn done.
SCOUT: session started
server1# scored -syslog server2
SYSLOG: Timeslice is set to 500[ms]
<1> SCore-D: Connected to syslog server (server2:9902)
Notice that, apart from the initial Timeslice message, the SYSLOG messages are not written to stdout but to the file '/tmp/scored.messages', together with timestamps:
$ cat /tmp/scored.messages
16/Feb/2000 15:22:59 <1> SCore-D: Connected to syslog server (server2:9902)
16/Feb/2000 15:22:59 Cluster[0]: comp0.pccluster.org@0...comp3.pccluster.org@3
16/Feb/2000 15:22:59   BIN=linux, CPUGEN=pentium-iii, SMP=1, SPEED=500
16/Feb/2000 15:22:59   Network[0]: myrinet/myrinet
16/Feb/2000 15:22:59 SCore-D network: myrinet/myrinet
16/Feb/2000 15:22:59 SCore-D server: comp3.pccluster.org:9901
$ 
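Since each log line carries a timestamp followed by the original message, the file is easy to post-process with standard tools. As an illustration only, the sketch below lists which broadcast servers SCore-D reported connecting to. The function name scbcast_connections is hypothetical, and the pattern assumes the "Connected to ... server (host:port)" wording shown in the log above.

```shell
# Hypothetical helper: print "FUNCTION HOST:PORT" for every broadcast-server
# connection recorded in an sc_syslog log file.
# Usage: scbcast_connections LOGFILE
scbcast_connections() {
    # -n suppresses default output; the trailing p prints only matching lines.
    sed -n 's/.*SCore-D: Connected to \([a-z]*\) server (\([^)]*\)).*/\1 \2/p' "$1"
}

# Example (commented out; the path is the one used in the session above):
# scbcast_connections /tmp/scored.messages
```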
The diagram above also shows another useful feature of broadcasting: an scbcast server can itself be a client of an scbcast server on another host. This allows broadcasting to be cascaded so that no single server is overloaded with clients, or so that clients can attach to a local broadcast server. The scbcast server on host 'server3' is started as a cascade of 'server2' with the following command:
server3# scbcast syslog -cascade server2
scbcast started (cascade of server2)
server3# 
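When several hosts are to cascade from the same upstream server, the commands follow the same pattern on each of them. The dry-run sketch below only prints the command to run on each host and executes nothing. The function name cascade_plan and the host 'server4' are hypothetical; the -cascade usage is the one shown in the example above.

```shell
# Hypothetical dry-run helper: print the scbcast command to run on each
# downstream host so that it cascades from UPSTREAM.
# Usage: cascade_plan FUNCTION UPSTREAM HOST...
cascade_plan() {
    local func="$1" upstream="$2" host
    shift 2
    for host in "$@"; do
        printf '%s# scbcast %s -cascade %s\n' "$host" "$func" "$upstream"
    done
}

# Prints one line per downstream host, for example:
#   server3# scbcast syslog -cascade server2
cascade_plan syslog server2 server3 server4
```

Clients near 'server3' or 'server4' would then name that host, rather than 'server2', as their scbcast server.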

See also

scout(1), sctop(1), scored(8), scbcast(8), sc_syslog(8)

$Id: scbcast.html,v 1.3 2002/03/07 12:03:44 kameyama Exp $