A Catwalk environment consists of a Catwalk server process, running on the file server where the target files to be read or written are located, and Catwalk client processes, running on the compute nodes that access the target files. Both the Catwalk server process and the client processes are created by the catwalk command. When a Catwalk server process is created on the server node, it sets an environment variable whose value consists of the server host name and a port number. While this environment variable is set and in effect, the session is said to be in a Catwalk environment.
Unfortunately, the current implementation lets users have only one of the two at a time: either the on-demand file staging system or the MPI-IO parallel file system.
The Catwalk server process must be invoked with the -nhosts option followed by the number of hosts (not the number of processes). A Linux command or user program invoked under a Catwalk client process can then access the files on the file server.
The name of a file to be staged on demand must not include a slash (/) character; that is, it must be a basename. When a file with the same name already exists in the current directory of a compute node, the local file is accessed and no file staging takes place.
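The two conditions above can be sketched as a small check. This is only an illustration of the stated rules, not Catwalk's actual code; the function name is_staging_candidate and its parameters are hypothetical.

```python
import os


def is_staging_candidate(filename, cwd="."):
    """Return True if opening `filename` would be expected to trigger staging.

    Hypothetical helper that only mirrors the two rules in the text.
    """
    # Rule 1: the name must be a plain basename, with no slash in it.
    if "/" in filename:
        return False
    # Rule 2: a file of the same name in the current directory is used
    # as-is, so no staging takes place.
    return not os.path.exists(os.path.join(cwd, filename))
```

Under these assumptions, a name such as sub/input.dat is never staged, and input.dat is staged only while no local copy exists.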
The input files are copied from the server node to the current directory of the compute nodes when the program tries to open files that are absent at that moment. The location of the input files on the server node can be specified with the -path option followed by a colon (:) separated list of directories. The search begins from the leftmost directory in the list.
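The leftmost-first search can be pictured with the following sketch. It assumes the lookup behaves like an ordinary colon-separated search path; the function name find_input_file is hypothetical and this is not taken from the Catwalk sources.

```python
import os


def find_input_file(filename, search_path):
    """Return the first match for `filename` in a colon-separated list.

    Hypothetical helper: directories are tried from left to right and
    the first one that contains the file wins.
    """
    for directory in search_path.split(":"):
        if not directory:
            # Skip empty entries such as those produced by "a::b".
            continue
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    # No directory in the list contains the file.
    return None
```

If the same file name exists in more than one listed directory, the copy in the leftmost directory is the one selected.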
The output files are copied to the server node when the program terminates. The location of the output files on the server node can be specified with the -dir option followed by a directory name. When the -dir option is omitted and the -path option is specified, the leftmost directory in the path list is used as the location of the output files.
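The fallback rule for the output location can be summarized as a short sketch. The function name output_directory and its arguments are hypothetical, and the case where both options are omitted is not described here, so the sketch simply returns None for it.

```python
def output_directory(dir_option=None, path_option=None):
    """Pick the server-side output directory from the option values.

    Hypothetical helper mirroring the rule in the text: the -dir value
    wins; otherwise the leftmost entry of the -path list is used.
    """
    if dir_option:
        return dir_option
    if path_option:
        # The leftmost directory of the colon-separated list.
        return path_option.split(":")[0]
    # Behaviour with neither option is not specified in this text.
    return None
```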
Catwalk assumes that the current directory on each compute node is on a local disk; otherwise the results are not guaranteed. When the current directory is on an NFS-mounted file system, catastrophic behaviour may occur, especially when the number of hosts is large.
When the Catwalk environment is created in MPI-IO mode, any accessible file on the server node can be accessed via the MPI-IO functions called by the program running on the client nodes. The location of the temporary files on the compute nodes can be specified with the -dir option followed by a directory name. The input files are copied to the directory on the compute node, and the output files are copied from the directory to the server node when the MPI_File_close() function is called.
The Catwalk MPI-IO implementation is also called "Catwalk_ROMIO" and can work with the ssh command and a third-party batch job scheduler. See the catwalk-romio(7) man page for more details.
There is no such restriction when the MPI-IO feature of Catwalk is used with the -mpiio option.