scrun command. See the scrun(1) man page for detailed descriptions of the resource specification options available. The options require knowledge of the cluster configuration, which can be obtained by looking at the SCore cluster database file, /opt/score/etc/scorehosts.db. The following is an example:
# PM/Myrinet
myrinet   type=myrinet \
          -firmware:file=/opt/score/share/lanai/lanai.mcp \
          -config:file=/opt/score/etc/pm-myrinet.conf

# PM/Ethernet
ethernet  type=ethernet \
          -config:file=/opt/score/etc/pm-ethernet.conf

# PM/Agent/UDP
udp       type=agent -agent=pmaudp \
          -config:file=/opt/score/etc/pm-udp.conf

# PM/SHMEM
shmem0    type=shmem -node=0
shmem1    type=shmem -node=1

# Macro to define a host
#define PCC   msgbserv=(server.pccluster.org:8764) \
              cpugen=pentium-iii speed=500 smp=2 \
              network=myrinet,ethernet,udp,shmem0,shmem1 \
              group=pcc

#
comp0.pccluster.org   PCC
comp1.pccluster.org   PCC
comp2.pccluster.org   PCC
comp3.pccluster.org   PCC

Following are examples of using options on the scrun(1) command running in a scout(1) session. First, start a SCOUT session:
$ scout -g pcc
SCOUT: Spawn done.
SCOUT: session started
$

The group pcc was specified in the PCC host macro definition as group=pcc above.
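If you are unsure which groups are defined on your cluster, one quick way to check is to search the database file shown above for group attributes (an ordinary text search, not an SCore command):

$ grep group= /opt/score/etc/scorehosts.db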
In the Getting Started chapter an example was given to execute the ./hello command on four nodes. Let's review that command:
$ scrun -nodes=4 ./hello
SCORE: connected (jid=100)
<0:0> SCORE: 4 hosts ready.
Hello World! (from node 2)
Hello World! (from node 0)
Hello World! (from node 3)
Hello World! (from node 1)

The option -nodes=4 specifies that the command should be executed on four nodes.
The sequence of events is as follows. scrun first invokes scored(8), the user-level parallel operating system, on the hosts of the cluster specified by the pcc group on the scout command above. The user program is then executed on the invoked scored operating system. If the number of requested nodes is less than the total number of nodes in the partition, then scored allocates nodes from the high-numbered end of the partition, in a way that balances node loads. For instance, if the group specified on the scout invocation has 16 nodes (nodes 0 to 15) and the scrun command above is executed, then nodes 12, 13, 14 and 15 will be allocated to the command. The user program still sees the logical nodes as 0, 1, 2 and 3. There is no way for the user to request allocation of specific nodes.
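As a concrete illustration of this behaviour, suppose a hypothetical group pcc16 (not part of the database above) containing nodes 0 to 15; the four-node request would then be served from the end of the partition:

$ scout -g pcc16            # hypothetical group with nodes 0 to 15
$ scrun -nodes=4 ./hello    # scored allocates nodes 12, 13, 14 and 15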
In the example above, four nodes were specified. If the nodes are part of an SMP cluster, where each host in the cluster has multiple processors (the PCC definition in the database above has smp=2 defined), then it is not guaranteed that all four hosts will execute the program. The number of allocated hosts will be the number of requested nodes divided by the number of processors per SMP host. In the example above, two hosts (each with two processors) will be used. This is equivalent to the command 'scrun -nodes=2x2 ./hello'. To guarantee that all four hosts will be used, the following option has to be specified:
$ scrun -nodes=4x1 ./hello

The x option specifier requests that only one process be created on each of four hosts.
$ scrun -nodes=4x2 ./hello

The program hello will be executed on four hosts, each having two processors; thus eight nodes will be used to execute the program.
Following are more examples of using the scrun(1) command to execute parallel programs.
$ scrun -nodes=4,statistics ./a.out

This will execute a.out on four nodes and, when the program completes, resource usage information will be output to the standard error of the scrun process.
$ scrun -nodes=4,monitor ./a.out

This will execute a.out on four nodes and also attach a real-time X Window user program CPU activity monitor. You must have an X window server running and the DISPLAY environment variable set correctly.
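For example, when displaying the monitor on a workstation, the display might be set before starting the run as follows, assuming a Bourne-style shell (the host name is hypothetical):

$ export DISPLAY=myworkstation.pccluster.org:0.0
$ scrun -nodes=4,monitor ./a.out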
$ scrun -nodes=2,network=myrinet ./a.out

If your cluster has multiple networks, you can specify which network to use for routing messages and data with the network=network_name option. The above example will execute a.out on two nodes using the myrinet network described in the cluster database above.
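Any other network defined for the hosts in the cluster database can be named in the same way; for instance, to use the ethernet entry shown in the database above instead of Myrinet:

$ scrun -nodes=2,network=ethernet ./a.out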
$ scrun -nodes=3 ./a.out

This may execute a.out on three or four nodes, depending on the number of hosts in your cluster. The number of nodes may be rounded up to the next power of two. If you have a cluster with 7 compute hosts, then exactly 3 hosts will be allocated.
$ scrun -nodes=2,cpulimit=100 ./a.out

This will execute a.out on two nodes with a CPU time limit of 100 seconds.
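The options following -nodes in the examples above are comma-separated, so it appears they can be combined in a single invocation; the following sketch is an assumption rather than an example from the manual, so check the scrun(1) man page before relying on it:

$ scrun -nodes=2,cpulimit=100,statistics ./a.out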
$ scrun -nodes=4.linux ./a.out

This will execute a.out on four nodes running a binary type of linux on a heterogeneous cluster.
$ scrun -nodes=4.alphalinux ./a.out

This will execute a.out on four nodes running a binary type of alphalinux on a heterogeneous cluster.
$ scrun -nodes=4.alphalinux.alpha-21264 ./a.out

This will execute a.out on four nodes running a binary type of alphalinux with a CPU type of alpha-21264 on a heterogeneous cluster. alpha-21264 should be one of the CPU types defined in the cluster database with the cpugen=gentype attribute.
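The same syntax can refer to the CPU type defined in the example database earlier in this section; the following sketch assumes the hosts run the linux binary type and are declared with cpugen=pentium-iii, as in the PCC macro:

$ scrun -nodes=4.linux.pentium-iii ./a.out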
$ scrun -nodes=2..pentium-iii.400 ./a.out

This will execute a.out on two nodes with a CPU type of pentium-iii and a speed of 400 on a heterogeneous cluster. 400 is the speed factor of the CPU (usually the clock speed of the processor in MHz). It should be one of the numbers defined in the cluster database with the speed=number attribute.
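For the hosts defined by the PCC macro above, the matching speed factor would be 500, as declared by the speed=500 attribute; a sketch following the same syntax:

$ scrun -nodes=2..pentium-iii.500 ./a.out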
The SCOUT session is completed by issuing exit at the prompt:

$ exit
SCOUT: session done