catwalk-romio

User-level MPI-IO parallel file system

SYNOPSIS


catwalk -mpi
catwalk -kill
catwalk -mpi -SSH REMOTE-HOST
catwalk -SSH REMOTE-HOST

DESCRIPTION

The Catwalk server process runs on the node where the files to be accessed from the compute nodes are located. It can work with the ssh command or with a third-party batch job scheduler.

CATWALK DAEMON

Let us assume that you are on the head node (front-end or login node) of a cluster and that the data files needed for your MPI program(s) are located only on the local disks of the head node.

First, create a Catwalk environment on your machine by running the catwalk command with the -mpi option and no command following it. The catwalk process then becomes a daemon process and prints messages showing the environment variable settings. Here is an example:

	% catwalk -mpi
	export CATWALK_MPIIO=headnode:50052
	setenv CATWALK_MPIIO headnode:50052
	export CATWALK_DAEMON_PID=4884
	setenv CATWALK_DAEMON_PID 4884
	% export CATWALK_MPIIO=headnode:50052
	% export CATWALK_DAEMON_PID=4884
	% mpirun catwalk my_mpi_prog
	  ..
	% catwalk -kill

Then you copy the output messages and paste them into a shell terminal; this creates a Catwalk environment. Finally, you can run your MPI program(s) so that the files on the head node are accessed by them. You can kill the Catwalk daemon process with the -kill option, as shown in the example above.
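Copying and pasting the settings can also be automated. The following sketch filters the daemon's Bourne-shell (export) lines and applies them with eval; the daemon output is simulated here with the sample values from the example above, whereas a real session would pipe the output of catwalk -mpi itself.

```shell
# Simulated output of `catwalk -mpi` (sample values from the example above);
# in a real session this string would come from the command itself.
daemon_output='export CATWALK_MPIIO=headnode:50052
setenv CATWALK_MPIIO headnode:50052
export CATWALK_DAEMON_PID=4884
setenv CATWALK_DAEMON_PID 4884'

# Keep only the Bourne-shell (export) lines and apply them to this shell.
eval "$(printf '%s\n' "$daemon_output" | grep '^export')"
echo "$CATWALK_MPIIO"   # headnode:50052
```

A csh user would filter the setenv lines instead.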

The Catwalk daemon process enables remote access to any file on the machine. Running the Catwalk daemon process is therefore very DANGEROUS, and it is strongly recommended to use Catwalk-ROMIO with ssh as described below.

SSH

Let us assume that you are using a desktop machine, that you access a cluster via remote login, and that the data files are located on your machine:

	mypc% catwalk -mpi -SSH head
	head% mpirun catwalk my_mpi_prog

When Catwalk-ROMIO is invoked with the -SSH option, Catwalk calls the ssh command with appropriate options so that the TCP connections needed by Catwalk are established through ssh TCP forwarding. In this case, the catwalk process running on your machine is not a daemon process; when the ssh session terminates, the catwalk process terminates as well.

The TCP forwarding port number on the head node can be specified by appending a colon (:) and a port number to the remote host name (see below).

	mypc% catwalk -mpi -SSH head:34567

Next, let us assume that you have to go through a gateway machine to reach the cluster head node. The catwalk command with only the -SSH option inherits the TCP forwarding settings. Note that if you use the -mpi option on the gateway node in the example below, the catwalk process running on the gateway will create a new Catwalk environment, and none of the files on your machine will be accessible.

	mypc% catwalk -mpi -SSH gate
	gate% catwalk -SSH head
	head% mpirun catwalk my_mpi_prog

The -SSH option of Catwalk-ROMIO also accepts ssh options. Here is an example of specifying the -Y option to forward the X11 protocol:

	% catwalk -mpi -SSH -Y remote

3RD-PARTY BATCH SCHEDULER

In most cases, a batch scheduler is used to submit jobs to a cluster, but with some batch schedulers and MPI runtime environments it is hard to pass environment variables to the processes running on the compute nodes. In this case, the Catwalk environment can be specified in the job script.

	% echo $CATWALK_MPIIO
	mymachine:50052
	% emacs jobscript.sh
	...
	mpirun catwalk -server mymachine:50052 my_mpi_prog
	...
	# end of editing job script
	% qsub jobscript.sh

In the above example, it is assumed that you are already in a Catwalk environment. The only thing you have to do is add the -server option followed by the value of the CATWALK_MPIIO environment variable.
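The step above can be sketched as follows. This is a sketch assuming a qsub-style scheduler as in the example; it bakes the current value of CATWALK_MPIIO into the generated job script, since the scheduler may not forward environment variables to the compute nodes. The sample endpoint value is set here only for illustration.

```shell
# Assume a Catwalk daemon is already running; the sample endpoint from the
# example above is set here for illustration only.
CATWALK_MPIIO=mymachine:50052

# Generate a job script with the endpoint hard-coded, since environment
# variables may not reach the compute nodes.
cat > jobscript.sh <<EOF
#!/bin/sh
mpirun catwalk -server ${CATWALK_MPIIO} my_mpi_prog
EOF
```

The generated script can then be submitted with, e.g., qsub jobscript.sh.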

NOTE

Although Catwalk is designed to be independent of the MPI implementation, Catwalk-ROMIO requires the ROMIO implementation within an MPI. Thus Catwalk-ROMIO only works with the MPI included in the SCore package.

SEE ALSO

catwalk(1), scout(1), scrun(1), catwalk(7).

CREDIT

This document is a part of the SCore cluster system software developed at PC Cluster Consortium, Japan. Copyright (C) 2003-2008 PC Cluster Consortium.