. This release is a snapshot
of the ongoing research effort on parallel and distributed computing
at the PC Cluster Consortium.
We release this package, hoping that the SCore Cluster System
Software will contribute to research on parallel and distributed
systems throughout the world.
SCore supports the following features:
- Single System Image
With SCore, users need not be aware of whether a system is a cluster
of single- or multi-processor computers or a cluster of clusters.
A parallel application or an ordinary UNIX command can be run simply by
specifying a computer node group of such a cluster.
A UNIX command runs in the SIMD execution style.
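The SIMD execution style can be pictured as launching the same command once
per node of the group. The following is a minimal sketch of that idea only,
not SCore's implementation: every "node" is simulated by a local child
process, and the names run_on_group and NODE_ID are hypothetical, introduced
here purely for illustration.

```python
import os
import subprocess
import sys


def run_on_group(command, nodes):
    """Run the same command once per node and collect each node's output.

    Assumption: every "node" is simulated by a local child process that
    receives its node name in the hypothetical NODE_ID environment variable.
    """
    procs = []
    for node in nodes:
        env = dict(os.environ, NODE_ID=node)
        # Start all copies first so that they run concurrently.
        procs.append((node, subprocess.Popen(
            command, env=env, stdout=subprocess.PIPE, text=True)))
    # Wait for every copy and gather its standard output.
    return {node: proc.communicate()[0].strip() for node, proc in procs}


if __name__ == "__main__":
    # The same command runs on every member of the group.
    cmd = [sys.executable, "-c", "import os; print(os.environ['NODE_ID'])"]
    for node, output in run_on_group(cmd, ["node0", "node1", "node2"]).items():
        print(node, "->", output)
```

The caller names only the group; the loop over its members stays hidden,
which is the point of the single system image.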
- Multiple Network Support
The PMvII high-performance communication library is dedicated to cluster
computing and supports many types of networks.
A program using PMvII can communicate over different types of networks.
PMvII drivers have been implemented for Myrinet, Ethernet, UDP, RHiNET, SCI
and the Shmem shared memory interface.
- Seamless Programming Environment
The Hmake command enables users to compile a program on
heterogeneous computers. It generates binaries for the different underlying
execution environments, such as Intel Pentium, Itanium and Compaq processors.
Using the MPC++ Multi-Threaded Template Library, a program runs in such a
heterogeneous processor environment.
- Heterogeneous Programming Language
Programs written with the MPC++ Multi-Threaded Template Library (MTTL) run
not only in a homogeneous computer environment but also in a
heterogeneous computer environment, without any modification to the code.
- Multiple Programming Paradigms
Unlike other cluster software, SCore not only supports the message
passing paradigm, but also supports the shared memory parallel programming
paradigm and the multi-threaded parallel programming paradigm.
- Parallel Programming Support
  - Real-time process activity monitor
  SCore-D allows us to watch parallel process activity in real-time using the
  Real-Time Load Monitor. Each bar of the Real-Time Load Monitor represents
  the processor utilization on each node.
  - Deadlock detection
  Since SCore-D knows the global status of a parallel process, i.e., each
  process status and communication buffer status, it can detect whether or
  not the parallel process is deadlocked.
  - Automatic debugger attachment
  In most cluster systems, when a process dies, there is no chance to invoke
  a debugger interactively. SCore attaches the gdb(1) debugger to the target
  parallel process when an exception signal is detected.
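The deadlock detection idea, deciding from the global status whether a
parallel process can still make progress, can be sketched as follows. This is
a simplified model under stated assumptions, not SCore-D's actual algorithm:
the ProcessStatus type, its fields, and the rule "all processes blocked and
all buffers empty" are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProcessStatus:
    """Hypothetical per-process snapshot; field names are illustrative."""
    blocked_on_receive: bool   # process is idle, waiting for a message
    pending_messages: int      # messages still in its communication buffer


def is_deadlocked(snapshot):
    """Return True if no process in the job can ever make progress.

    Assumed rule: a job is deadlocked when every process is blocked on a
    receive and every communication buffer is empty, so that no message
    remains that could wake any of them.
    """
    if not snapshot:
        return False
    all_blocked = all(p.blocked_on_receive for p in snapshot)
    buffers_empty = all(p.pending_messages == 0 for p in snapshot)
    return all_blocked and buffers_empty
```

A snapshot in which one process is still running, or in which a message is
still waiting in some buffer, is not reported as deadlocked. This is why
both pieces of the global status are needed.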
- Fault Tolerance
  - Preemptive checkpoint
  The SCore checkpoint facility requires no additional API and does not
  assume any particular parallel programming language. The image of the
  user's parallel process is saved to the local hard disks as a checkpoint
  at a user-specified interval.
  Moreover, the user's process image on the local hard disks is stored
  redundantly, so when one of the disks holding checkpoint data fails,
  the user's process image can be reproduced from the data on the
  other nodes.
  - Parallel process migration
  Using the checkpoint function, a parallel process may migrate to another
  group of computers in SCore.
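One common way to store checkpoint data redundantly is XOR parity. The text
above does not say which scheme SCore uses, so the following is only a
sketch under that assumption. The function names are hypothetical, the
images are assumed to be padded to equal length, and the parity is kept in a
single block rather than being distributed over the nodes.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(images):
    """Compute one parity block over the checkpoint images of all nodes."""
    return xor_blocks(images)


def recover(images, parity, lost):
    """Rebuild the image of node index `lost` from the survivors and parity.

    The entry at position `lost` is ignored, so it may be None.
    """
    survivors = [img for i, img in enumerate(images) if i != lost]
    return xor_blocks(survivors + [parity])
```

For example, with images = [b"node0img", b"node1img", b"node2img"] and
parity = make_parity(images), the call
recover([images[0], None, images[2]], parity, lost=1) rebuilds the second
image from the two surviving images and the parity block.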
- Flexible Job Scheduling
To utilize processor resources and to enable an interactive programming
environment, the SCore-D global operating system multiplexes parallel
processes across the processors in both the space and time domains
simultaneously.
  - Gang scheduling
  Parallel processes are gang-scheduled when multiplexed in the time domain.
  - Batch scheduling
  Batch scheduling is implemented by setting an infinite value for the
  scheduling slice time.
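The relation between gang scheduling and batch scheduling can be illustrated
with a small simulation. This is a sketch under simplifying assumptions, not
the SCore-D scheduler: only the time domain is modeled, every job is assumed
to occupy all of its nodes at once, and the schedule function and its
parameters are hypothetical. The time slice must be greater than zero.

```python
import math
from collections import deque


def schedule(jobs, time_slice):
    """Simulate time-domain multiplexing of parallel jobs.

    jobs: list of (name, run_time) pairs; each job is assumed to occupy
    all of its nodes at once (gang scheduling), so one entry in the
    timeline stands for every process of that job.
    time_slice: length of one scheduling slice; math.inf gives batch mode.
    Returns a timeline of (name, start, end) tuples.
    """
    queue = deque(jobs)
    clock = 0.0
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        ran = min(time_slice, remaining)
        timeline.append((name, clock, clock + ran))
        clock += ran
        if remaining > ran:
            # Slice expired: preempt the whole job and requeue it.
            queue.append((name, remaining - ran))
    return timeline
```

With a finite slice the jobs take turns, and all processes of a job are
switched together. With time_slice=math.inf no slice ever expires, so each
job runs to completion before the next one starts, which is batch behavior.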