[SCore-users-jp] [SCore-users] SCORE_RSH and use of ssh instead of rsh

C. Evangelinos ce107 @ dam.brown.edu
Fri Mar 28 08:36:37 JST 2003


Thanks to the list's suggestions the NIS setup with only /var/scored
local to each compute node works fine. I'm also exporting via NFS
/opt/score to the rest of the nodes (read-only) so they can have full
functionality for compiling, executing etc. Such a setup meant that I
could not use the bininstall way of doing things and ended up doing
quite a few things on my own.

A few comments (mainly SCore but also Omni-related):
1) removing the rpms leaves init scripts behind in /etc/rc.d as well
as the new devices (the latter is not really a problem)
2) It would be nice to have a script that reproduces the effects of
installing the rpms for setting up device and configuration scripts,
local directories etc. for the case of NFS installations like mine
which do not use EIT or the RPMS for the compute nodes. I may end up
writing one myself anyway as I add nodes.
3) I got SCore to work fine (so far) on a system with a Realtek
ethernet card (8139too driver). It cannot handle interrupt reaping
however - the machine becomes highly unstable after a little while,
the system log fills up with
kernel: eth0: Too much work at interrupt, IntrStatus=0x0001.
messages and the machine requires a reboot. With reaping set to off
everything works fine. Performance between such a box and another one
with an Intel eepro100 driven card is so-so: ping-pong latency
(RTT/2) is ~58us, asymptotic ping-pong bandwidth is ~77Mbit/s out of
100 (worse than what LAM gets). BTW it'd be nice if the SCore
documentation reported RTT/2 instead of RTT numbers, as I've seen
people misread the MPICH/PM numbers as latencies twice their actual
value.
4) It would be more graceful to set things up so that if one already
has Java installed, the system doesn't look for things in
/opt/score/java/linux.
Setting OMNI_JAVAVM seems to fix things for the Omni compiler, but
jumpshot ignores JAVA_HOME and JVM even when they are set as
environment variables before calling it.
5) There should be a way to pass back-end specific compiler
optimization flags to the Omni compiler. 
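On the RTT vs. RTT/2 point in 3), a minimal sketch of the conversion I
have in mind. The function names are made up for illustration, and the
116us RTT is a hypothetical input chosen only so that its half matches
the ~58us quoted above:

```python
def one_way_latency_us(rtt_us: float) -> float:
    """Return RTT/2, the one-way latency usually quoted for ping-pong tests."""
    return rtt_us / 2.0


def link_efficiency(measured_mbit: float, nominal_mbit: float) -> float:
    """Fraction of the nominal link rate reached by the measured bandwidth."""
    return measured_mbit / nominal_mbit


if __name__ == "__main__":
    rtt_us = 116.0  # hypothetical round-trip time, not a measured value
    print("one-way latency (us):", one_way_latency_us(rtt_us))
    # 77 Mbit/s measured on a 100 Mbit/s link, as reported in 3)
    print("link efficiency:", round(link_efficiency(77.0, 100.0), 2))
```

A reader who takes a reported RTT for a one-way latency overestimates
the latency by exactly a factor of two, which is the confusion I'd like
the documentation to avoid.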

My main remaining problems are:
a) Integration with SGE - I just got someone to translate the Japanese
instructions, but I'd like to know whether the source code that comes
with SCore (contrib) is a modified version or the original Sun one, as
I want to use SCore with the latest patched version of SGE from Sun
(and I'd prefer, if possible, to avoid recompiling everything and to
use as much of Sun's binary installation as possible).
b) This is the most important problem and related to the title of my
e-mail: 
For various security reasons I cannot use SCore with rsh (beyond
testing). Even with tcp wrappers enabled to limit access I'd prefer to
use ssh instead. SCORE_RSH seems to work with very few SCore binaries
and most importantly cannot work with scout. Is there a quick fix for
that or is rsh hardcoded in too many places in the source code?
Moreover, given the way connections propagate on an SCore cluster,
would running an ssh agent on the machine where scout is entered be
enough to provide for transparent connections, or do ssh-agents need to
run everywhere, with some mechanism for new shells to get the required
environment variables set up automatically?
c) Moreover, if running as an SGE job, what mechanism would SCore use?
Normal rsh, ssh (supposing it's fixed as a replacement) or SGE's rsh?
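To make question b) concrete, here is a rough sketch of the behaviour I
would hope for from every SCore tool that starts remote processes. It is
not SCore's actual code: remote_command and agent_available are
hypothetical helpers, comp0 is a made-up host name, and the assumption
that SCORE_RSH may carry options (such as ssh's -A for agent forwarding)
is mine. SSH_AUTH_SOCK is the variable ssh-agent exports for its
clients.

```python
import os
import shlex


def remote_command(host, command, env=None):
    """Build the argv for running `command` on `host`.

    Assumption: SCORE_RSH names the remote-shell program, optionally
    followed by options; plain rsh is the fallback when it is unset.
    """
    env = os.environ if env is None else env
    shell = shlex.split(env.get("SCORE_RSH", "rsh")) or ["rsh"]
    return shell + [host] + list(command)


def agent_available(env=None):
    """True if this environment points at an ssh-agent socket."""
    env = os.environ if env is None else env
    return bool(env.get("SSH_AUTH_SOCK"))


if __name__ == "__main__":
    # Hypothetical setting: ssh with agent forwarding enabled.
    argv = remote_command("comp0", ["hostname"], env={"SCORE_RSH": "ssh -A"})
    print(argv)
    print("agent reachable from this shell:", agent_available())
```

With forwarding along each hop, a single agent on the machine where
scout is started might be enough, but whether SCore's connection
pattern allows that is exactly what I am asking.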

Thanks everyone for their help,

Constantinos Evangelinos

Center for Fluid Mechanics
Brown University
and
Ocean Engineering Department
MIT


PS> On another mini-cluster with IBM nodes with NetXtreme BCM5703X
Gigabit Ethernet cards (tg3 driver) I get for netpipe's ping-pong an
RTT/2 latency of 68us and an asymptotic bandwidth that is around
535Mbit/s though sometimes one gets an extra 200Mbit/s for no
reason... Avoid...

_______________________________________________
SCore-users mailing list
SCore-users @ pccluster.org
http://www.pccluster.org/mailman/listinfo/score-users


