Network trunking is a technique for increasing communication bandwidth by using multiple Ethernet NICs in parallel (typically 100BASE-T Ethernet). Network trunking requires multiple Ethernet NICs in each PC and Ethernet switches to connect them, and a PM/Ethernet configuration file must be prepared and tested for each Ethernet NIC.
PM/Ethernet manages multiple NICs by unit number, which is defined in pm-ethernet.conf and specified on the etherpmctl command line. Only NICs with the same unit number on the cluster nodes can communicate with each other. Moreover, because PM/Ethernet communicates using Ethernet MAC addresses directly, NICs with the same unit number must be attached to the same Ethernet network, so that each node can reach the other nodes directly by Ethernet address. However, NICs with different unit numbers do not have to be connected to the same Ethernet network, as with the "Beowulf Channel Bonding" technique.
Network Interface Cards:
When you use up to 2 NICs in one PC, a combination of different NIC hardware is acceptable (e.g., tulip + eepro100). When you use more than 2 NICs in one PC, identical NIC hardware is recommended. The following NICs have been tested with network trunking.
Number of NICs | Tested NICs |
2 NICs | DEC Tulip, Intel EEPRO100, 3Com 3C905B, VIA chipset NICs |
3 NICs | DEC Tulip, Intel EEPRO100, 3Com 3C905B |
4 NICs | DEC Tulip, Intel EEPRO100 |
Ethernet Switches:
To use 3 NICs on an 8-node cluster, three 8-port Ethernet switches (or one 16-port switch and one 8-port switch) are needed, and no connections between the switches are required. If you want to connect the cluster to another network, one of the Ethernet switches must have more than 8 ports.
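The switch count in this example follows from simple arithmetic: every unit needs one switch port per node. The sketch below works that out for other cluster sizes. It assumes identical switches and that all NICs of one unit sit on a single switch, and the function names are hypothetical helpers, not part of SCore.

```shell
#!/bin/sh
# Hypothetical helpers (not part of SCore) for sizing trunking hardware.

# total_ports NODES NICS
# Every NIC of every node occupies exactly one switch port.
total_ports() {
    echo $(( $1 * $2 ))
}

# switches_needed NODES NICS PORTS_PER_SWITCH
# Assumption: identical switches, and all NICs of one unit are plugged
# into the same switch, so that the switches need no interconnection.
switches_needed() {
    nodes=$1
    nics=$2
    ports=$3
    if [ "$ports" -lt "$nodes" ]; then
        echo "switch too small: one unit needs $nodes ports" >&2
        return 1
    fi
    units_per_switch=$(( ports / nodes ))
    # Round up: a partly filled switch still has to be bought.
    echo $(( (nics + units_per_switch - 1) / units_per_switch ))
}

total_ports 8 3         # ports used by 3 NICs on 8 nodes
switches_needed 8 3 8   # number of 8-port switches
switches_needed 8 3 16  # number of 16-port switches
```

For the 8-node, 3-NIC case this gives 24 ports, three 8-port switches, or two 16-port switches. The text's "one 16-port plus one 8-port" variant is the same two-switch layout with the second switch only half used.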
If you build a new cluster, using the same motherboard in every node is recommended, because the motherboard affects how Ethernet device numbers (the XX in ethXX) are allocated. If you use different motherboards, check the allocation of Ethernet device numbers carefully.
The configuration files needed for network trunking are the pm-ethernet.conf files, one for each Ethernet device (eth0, eth1, eth2, ...). This document describes sample configuration files for a 4-node cluster with 4 NICs per node.
Compute hosts
    comp0.score.rwcp.or.jp
    comp1.score.rwcp.or.jp
    comp2.score.rwcp.or.jp
    comp3.score.rwcp.or.jp

    # Configuration file for PM/UDP(Agent)
    0 comp0.score.rwcp.or.jp
    1 comp1.score.rwcp.or.jp
    2 comp2.score.rwcp.or.jp
    3 comp3.score.rwcp.or.jp
pm-ethernet.conf files
    # mkpmethernetconf -unit 0 -speed 100 -device eth0 pm-udp.conf pm-ethernet-0.conf
    # cat pm-ethernet-0.conf
    unit 0
    maxnsend 8
    0 00:90:CC:0F:B9:A0 comp0.score.rwcp.or.jp
    1 00:90:CC:0F:B9:A3 comp1.score.rwcp.or.jp
    2 00:20:18:58:AC:DA comp2.score.rwcp.or.jp
    3 00:20:18:58:BC:00 comp3.score.rwcp.or.jp

    # mkpmethernetconf -unit 1 -speed 100 -device eth1 pm-udp.conf pm-ethernet-1.conf
    # cat pm-ethernet-1.conf
    unit 1
    maxnsend 8
    0 00:90:CC:0F:B8:03 comp0.score.rwcp.or.jp
    1 00:90:CC:0F:B9:A9 comp1.score.rwcp.or.jp
    2 00:20:18:58:AC:EE comp2.score.rwcp.or.jp
    3 00:20:18:58:AE:61 comp3.score.rwcp.or.jp

    # mkpmethernetconf -unit 2 -speed 100 -device eth2 pm-udp.conf pm-ethernet-2.conf
    # cat pm-ethernet-2.conf
    unit 2
    maxnsend 8
    0 00:90:CC:0F:B8:25 comp0.score.rwcp.or.jp
    1 00:90:CC:0F:B9:C1 comp1.score.rwcp.or.jp
    2 00:20:18:58:AC:3E comp2.score.rwcp.or.jp
    3 00:20:18:58:AC:8B comp3.score.rwcp.or.jp

    # mkpmethernetconf -unit 3 -speed 100 -device eth3 pm-udp.conf pm-ethernet-3.conf
    # cat pm-ethernet-3.conf
    unit 3
    maxnsend 8
    0 00:90:CC:0F:B8:06 comp0.score.rwcp.or.jp
    1 00:90:CC:0F:B9:AD comp1.score.rwcp.or.jp
    2 00:20:18:58:AC:3C comp2.score.rwcp.or.jp
    3 00:20:18:58:AC:EC comp3.score.rwcp.or.jp

    # cp pm-ethernet-[0123].conf /opt/score/etc
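Since PM/Ethernet addresses nodes by MAC address, a mistyped or duplicated address in one of these files is worth catching before the files are installed. The following is a minimal sketch of such a check, assuming the file layout shown above (host entries of the form "node MAC hostname"). The function name check_pm_ethernet_conf is hypothetical and not part of SCore, and treating a repeated MAC address as an error is an assumption based on each NIC having its own address.

```shell
#!/bin/sh
# check_pm_ethernet_conf FILE...
# Hypothetical helper (not part of SCore). It reports
#   - host entries whose MAC address is not six hex pairs, and
#   - MAC addresses that appear more than once across the given files.
# It returns non-zero if any problem was found.
check_pm_ethernet_conf() {
    awk '
        # Host entries look like: <node number> <MAC> <hostname>
        $1 ~ /^[0-9]+$/ && NF >= 3 {
            mac = toupper($2)
            if (length(mac) != 17 ||
                mac !~ /^([0-9A-F][0-9A-F]:)+[0-9A-F][0-9A-F]$/) {
                printf "%s: bad MAC address %s for %s\n", FILENAME, $2, $3
                bad = 1
            } else if (mac in seen) {
                printf "%s: MAC %s for %s already used in %s\n",
                       FILENAME, $2, $3, seen[mac]
                bad = 1
            } else {
                seen[mac] = FILENAME
            }
        }
        END { exit bad + 0 }
    ' "$@"
}

# Example use before installing the files:
#   check_pm_ethernet_conf pm-ethernet-[0123].conf && \
#       cp pm-ethernet-[0123].conf /opt/score/etc
```

Lines such as "unit 0" and "maxnsend 8" are skipped because their first field is not a node number.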
scorehosts.db file
    ethernet-0      type=ethernet \
                    -config:file=/opt/score/etc/ethernet-0.conf
    ethernet-1      type=ethernet \
                    -config:file=/opt/score/etc/ethernet-1.conf
    ethernet-2      type=ethernet \
                    -config:file=/opt/score/etc/ethernet-2.conf
    ethernet-3      type=ethernet \
                    -config:file=/opt/score/etc/ethernet-3.conf
    ethernet-x2     type=ethernet \
                    -config:file=/opt/score/etc/ethernet-1.conf \
                    -trunk0:file=/opt/score/etc/ethernet-2.conf
    ethernet-x3     type=ethernet \
                    -config:file=/opt/score/etc/ethernet-2.conf \
                    -trunk0:file=/opt/score/etc/ethernet-1.conf \
                    -trunk1:file=/opt/score/etc/ethernet-0.conf
    ethernet-x4     type=ethernet \
                    -config:file=/opt/score/etc/ethernet-3.conf \
                    -trunk0:file=/opt/score/etc/ethernet-0.conf \
                    -trunk1:file=/opt/score/etc/ethernet-1.conf \
                    -trunk2:file=/opt/score/etc/ethernet-2.conf

    # cat /opt/score/etc/scorehosts.db
    /* PM/Ethernet */
    ethernet        type=ethernet \
                    -config:file=/opt/score/etc/pm-ethernet.conf
    ethernet-0      type=ethernet \
                    -config:file=/opt/score/etc/ethernet-0.conf
    ethernet-1      type=ethernet \
                    -config:file=/opt/score/etc/ethernet-1.conf
    ethernet-2      type=ethernet \
                    -config:file=/opt/score/etc/ethernet-2.conf
    ethernet-3      type=ethernet \
                    -config:file=/opt/score/etc/ethernet-3.conf
    ethernet-x2     type=ethernet \
                    -config:file=/opt/score/etc/ethernet-1.conf \
                    -trunk0:file=/opt/score/etc/ethernet-2.conf
    ethernet-x3     type=ethernet \
                    -config:file=/opt/score/etc/ethernet-2.conf \
                    -trunk0:file=/opt/score/etc/ethernet-1.conf \
                    -trunk1:file=/opt/score/etc/ethernet-0.conf
    ethernet-x4     type=ethernet \
                    -config:file=/opt/score/etc/ethernet-3.conf \
                    -trunk0:file=/opt/score/etc/ethernet-0.conf \
                    -trunk1:file=/opt/score/etc/ethernet-1.conf \
                    -trunk2:file=/opt/score/etc/ethernet-2.conf
    #include "/opt/score/etc/ndconf/0"
    #include "/opt/score/etc/ndconf/1"
    #include "/opt/score/etc/ndconf/2"
    #include "/opt/score/etc/ndconf/3"
    #define MSGBSERV msgbserv=(server.score.rwcp.or.jp:8764)
    comp0.score.rwcp.or.jp NODE_0 \
        network=ethernet,ethernet-0,ethernet-1,ethernet-2,ethernet-3,ethernet-x2,ethernet-x3,ethernet-x4 group=_scoreall_,pccall smp=1 MSGBSERV
    comp1.score.rwcp.or.jp NODE_1 \
        network=ethernet,ethernet-0,ethernet-1,ethernet-2,ethernet-3,ethernet-x2,ethernet-x3,ethernet-x4 group=_scoreall_,pccall smp=1 MSGBSERV
    comp2.score.rwcp.or.jp NODE_2 \
        network=ethernet,ethernet-0,ethernet-1,ethernet-2,ethernet-3,ethernet-x2,ethernet-x3,ethernet-x4 group=_scoreall_,pccall smp=1 MSGBSERV
    comp3.score.rwcp.or.jp NODE_3 \
        network=ethernet,ethernet-0,ethernet-1,ethernet-2,ethernet-3,ethernet-x2,ethernet-x3,ethernet-x4 group=_scoreall_,pccall smp=1 MSGBSERV

In this file, the ethernet-0, ethernet-1, ethernet-2 and ethernet-3 networks should be used for test purposes only and should be removed once the following communication tests are finished, because these definitions cause trouble in the SCore-D multi-user environment.
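Because scorehosts.db refers to the per-device configuration files by path, it is easy for those paths to drift from the names of the files actually installed in /opt/score/etc. The sketch below lists every file named by a -config:file= or -trunkN:file= attribute and reports the ones that do not exist. The function names are hypothetical, the script is not part of SCore, and it assumes the paths contain no whitespace and a grep that supports the -o option.

```shell
#!/bin/sh
# Hypothetical helpers (not part of SCore).

# list_network_files DBFILE
# Print each distinct path given as -config:file=... or -trunkN:file=...
list_network_files() {
    grep -o -E -e '-(config|trunk[0-9]+):file=[^[:space:]\\]+' "$1" |
        sed 's/^[^=]*=//' |
        sort -u
}

# check_network_files DBFILE
# Print "missing: PATH" for every referenced file that does not exist,
# and return non-zero if at least one is missing.
check_network_files() {
    status=0
    for f in $(list_network_files "$1"); do
        if [ ! -f "$f" ]; then
            echo "missing: $f"
            status=1
        fi
    done
    return $status
}

# Example use:
#   check_network_files /opt/score/etc/scorehosts.db
```

Running the check after copying the pm-ethernet files, and again after removing the test-only networks, keeps the network definitions and the installed files in step.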
/etc/rc.d/init.d/pm_ethernet file
A sample /etc/rc.d/init.d/pm_ethernet file is as follows:
    #!/bin/sh
    #
    # pm_ethernet:  Starts the PM Ethernet driver
    #
    # Version:      @(#) /etc/rc.d/init.d/pm_ethernet 1.00
    #
    # Author:       Shinji Sumimoto (Real World Computing Partnership)
    # chkconfig: 345 90 18
    # description: PM Ethernet driver
    # probe: true

    IF=eth0
    UNIT=0
    INTERRUPT_REAPING=on

    # Source function library.
    . /etc/rc.d/init.d/functions

    # check module
    module=`modprobe -l pm_ethernet_dev.o`

    # See how we were called.
    case "$1" in
      start)
        echo
        if [ x$module != x ]; then
            modprobe pm_ethernet_dev
        fi
        ifconfig eth1 up
        ifconfig eth2 up
        ifconfig eth3 up
        /sbin/etherpmctl $IF -pm on -ir $INTERRUPT_REAPING -unit $UNIT -sc off
        /sbin/etherpmctl eth1 -pm on -ir $INTERRUPT_REAPING -unit 1 -sc off
        /sbin/etherpmctl eth2 -pm on -ir $INTERRUPT_REAPING -unit 2 -sc off
        /sbin/etherpmctl eth3 -pm on -ir $INTERRUPT_REAPING -unit 3 -sc off
        touch /var/lock/subsys/pm_ethernet
        ;;
      stop)
        echo -n "Stopping PM/Ethernet: "
        if [ x$module != x ]; then
            rmmod pm_ethernet_dev
        fi
        /sbin/etherpmctl $IF -pm off
        /sbin/etherpmctl eth1 -pm off
        /sbin/etherpmctl eth2 -pm off
        /sbin/etherpmctl eth3 -pm off
        ifconfig eth1 down
        ifconfig eth2 down
        ifconfig eth3 down
        echo
        rm -f /var/lock/subsys/pm_ethernet
        ;;
      status)
        if [ x$module != x ]; then
            /sbin/lsmod
        fi
        ;;
      restart)
        $0 stop
        $0 start
        ;;
      *)
        echo "Usage: $0 {start|stop|status|restart}"
        exit 1
    esac
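The sample script repeats one line per device, so it has to be edited in several places when the number of NICs changes. As a sketch of one alternative, the per-device lines can be produced by loops driven by a single count. This assumes, as in the sample, that device ethN carries unit N and that eth0 is already up. NUNITS, RUN, start_units and stop_units are hypothetical names, and the block only prints the commands unless RUN is emptied.

```shell
#!/bin/sh
# Sketch only: loop-driven version of the per-device lines in the
# sample pm_ethernet script. Assumptions: ethN carries unit N, and
# eth0 is already up, so only eth1 and above are brought up or down.
NUNITS=4
INTERRUPT_REAPING=on
RUN=echo    # dry run: print the commands; set RUN= to execute them

start_units() {
    n=1
    while [ "$n" -lt "$NUNITS" ]; do
        $RUN ifconfig "eth$n" up
        n=$((n + 1))
    done
    n=0
    while [ "$n" -lt "$NUNITS" ]; do
        $RUN /sbin/etherpmctl "eth$n" -pm on -ir "$INTERRUPT_REAPING" \
            -unit "$n" -sc off
        n=$((n + 1))
    done
}

stop_units() {
    n=0
    while [ "$n" -lt "$NUNITS" ]; do
        $RUN /sbin/etherpmctl "eth$n" -pm off
        n=$((n + 1))
    done
    n=1
    while [ "$n" -lt "$NUNITS" ]; do
        $RUN ifconfig "eth$n" down
        n=$((n + 1))
    done
}

start_units
```

With NUNITS=4 the dry run prints the same ifconfig and etherpmctl invocations, in the same order, as the start) branch of the sample script.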
Send a HUP signal to scoreboard, and then execute:
    # /etc/rc.d/init.d/pm_ethernet restart
See PM/Ethernet Test Procedure, and use network ethernet-0, ethernet-1, ethernet-2 or ethernet-3 (and then the trunked networks ethernet-x2, ethernet-x3 and ethernet-x4) instead of ethernet.
    # /opt/score/sbin/rcstest comp0.score.rwcp.or.jp ethernet-0 -v
    starting master 0 : pe=4
    starting slave: 2 3 1.
    testing*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*
    .*.*.*.*.*.*.*.*.*.*.*.*.*.*.**.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*
    .*.**.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.**.*.*.*.*.*.*.*.*.*.*.*.
    *.*.*.*.*.*.*.*.*.*.*.*.**.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*
    *.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.**.*.*.*.*.*.*.*.*.*.*.*.*.*
    .*.*.*.*.*.*.*.*.*.*.*comp3( 3) Signal: Interrupted system call(4)
    comp0( 0) Signal: Interrupted system call(4)
    comp1( 1) Signal: Interrupted system call(4)
    comp2( 2) Signal: Interrupted system call(4)

    # /opt/score/sbin/rcstest comp0.score.rwcp.or.jp ethernet-1 -v

    # /opt/score/sbin/rcstest comp0.score.rwcp.or.jp ethernet-2 -v

    # /opt/score/sbin/rcstest comp0.score.rwcp.or.jp ethernet-3 -v

    # /opt/score/sbin/rcstest comp0.score.rwcp.or.jp ethernet-x2 -v
    starting master 0 : pe=4
    starting slave: 2 3 1.
    testing*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*
    .*.*.*.*.*.*.*.*.*.*.*.*.*.*.**.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*
    .*.**.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.**.*.*.*.*.*.*.*.*.*.*.*.
    *.*.*.*.*.*.*.*.*.*.*.*.**.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*
    *.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.**.*.*.*.*.*.*.*.*.*.*.*.*.*
    .*.*.*.*.*.*.*.*.*.*.*comp3( 3) Signal: Interrupted system call(4)
    comp0( 0) Signal: Interrupted system call(4)
    comp1( 1) Signal: Interrupted system call(4)
    comp2( 2) Signal: Interrupted system call(4)

    # /opt/score/sbin/rcstest comp0.score.rwcp.or.jp ethernet-x3 -v

    # /opt/score/sbin/rcstest comp0.score.rwcp.or.jp ethernet-x4 -v
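The seven test runs above can be sequenced from one loop, single-NIC networks first and trunked networks afterwards. The following is only a sketch. The sample output ends with "Interrupted system call", which suggests that rcstest keeps running until it is interrupted, so the sketch assumes this and limits each run with the coreutils timeout command. DURATION, RUN and test_networks are hypothetical names, and as written the block only prints the commands.

```shell
#!/bin/sh
# Sketch only: run rcstest on each network in turn.
# Assumption: rcstest runs until interrupted, so each run is stopped
# with SIGINT after DURATION seconds via timeout(1) from coreutils.
RCSTEST=/opt/score/sbin/rcstest
HOST=comp0.score.rwcp.or.jp
NETWORKS="ethernet-0 ethernet-1 ethernet-2 ethernet-3 ethernet-x2 ethernet-x3 ethernet-x4"
DURATION=60
RUN=echo    # dry run: print the commands; set RUN= to execute them

test_networks() {
    for net in $NETWORKS; do
        $RUN timeout -s INT "$DURATION" "$RCSTEST" "$HOST" "$net" -v
    done
}

test_networks
```

The exit status of an interrupted rcstest run is not described here, so the loop does not try to judge success. The printed progress output of each run still has to be inspected by hand.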