Network Trunking for PM/Ethernet Administrator's Guide


Index

  1. Introduction
  2. Hardware Setup
    1. Hardware Requirements
    2. Hardware Configuration
  3. Software Setup
    1. Making pm-ethernet.conf files
    2. Modifying scorehosts.db file
    3. Modifying /etc/rc.d/init.d/pm_ethernet file
    4. Restarting PM/Ethernet and scoreboard
  4. Test Procedure of Single Network
  5. Test Procedure of Multiple Network
    1. Using rpmtest
    2. Using rcstest
  6. Performance Tuning

1. Introduction

Network trunking is a technique for increasing communication bandwidth by using multiple Ethernet NICs (especially 100Base-T Ethernet) in parallel. Network trunking requires multiple Ethernet NICs in each PC and Ethernet switches to connect them, and a PM/Ethernet configuration file must be prepared and tested for each Ethernet NIC.

PM/Ethernet manages multiple NICs using a unit number, which is defined in pm-ethernet.conf and specified on the etherpmctl command line. Only NICs with the same unit number on different cluster nodes can communicate with each other. Moreover, because PM/Ethernet communication uses Ethernet MAC addresses directly, NICs with the same unit number must be installed on the same Ethernet network, so that each node can reach the other nodes directly by Ethernet address. However, you do not have to connect NICs with different unit numbers to the same Ethernet network, as in the "Beowulf Channel Bonding" technique.

2. Hardware Setup

  1. Hardware Requirements

    Network Interface Cards:

    When using up to 2 NICs on one PC, a combination of different NIC hardware is acceptable (e.g., tulip + eepro100). When using more than 2 NICs on one PC, identical NIC hardware is recommended. The following NICs have been tested with network trunking.

    Number of NICs    Tested NICs
    2 NICs            DEC Tulip, Intel EEPRO100, 3Com 3C905B, VIA VT86C100 Rhine NICs
    3 NICs            DEC Tulip, Intel EEPRO100, 3Com 3C905B
    4 NICs            DEC Tulip, Intel EEPRO100
    Comments: VIA chipset NICs did not work with more than 2 NICs because of a hardware error. 3Com 3C905B NICs worked with 4 NICs, but bandwidth did not increase. Network trunking using the de4x5 driver causes the system to hang.

    Ethernet Switches:

    To use 3 NICs on an 8-node cluster, 3 eight-port Ethernet switches (or 1 sixteen-port switch and 1 eight-port switch) are needed; no connections between the switches are needed. If you want to connect the cluster to another network, one Ethernet switch with more than 8 ports is required.
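
    The switch count follows from simple arithmetic: every NIC on every node occupies one switch port, so the cluster needs (NICs per node) x (nodes) ports in total. A minimal shell sketch of that calculation; the function name plan_ports is hypothetical and not part of SCore:

```shell
#!/bin/sh
# plan_ports NICS NODES
# Prints the switch ports needed for network trunking.
# Hypothetical helper for illustration; not part of SCore.
plan_ports() {
    nics=$1
    nodes=$2
    # Every NIC on every node occupies one switch port.
    total=$((nics * nodes))
    echo "units: $nics"
    echo "ports per unit: $nodes"
    echo "total ports: $total"
}
```

    For 3 NICs and 8 nodes, plan_ports 3 8 reports 24 ports in total, which is what three 8-port switches, or one 16-port and one 8-port switch, provide. An uplink to another network needs one additional port.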

  2. Hardware Configuration

    If you build a new cluster, using the same motherboard in every node is recommended, because the motherboard affects how Ethernet device numbers (the XX in ethXX) are allocated. If you use different motherboards, pay attention to the allocation of Ethernet device numbers.
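
    One way to check how Ethernet device numbers were allocated on a node is to list each ethN device with its MAC address and compare the result across nodes. A minimal sketch, assuming a Linux kernel that exposes /sys/class/net (newer than the systems this guide was written for); list_eth_macs is a hypothetical helper name:

```shell
#!/bin/sh
# list_eth_macs [SYSFS_NET_DIR]
# Prints "device MAC" for every eth* device found.
# Assumes MAC addresses are readable from <dir>/<device>/address,
# as on Linux systems with sysfs. Hypothetical helper, not part of SCore.
list_eth_macs() {
    netdir=${1:-/sys/class/net}
    for dev in "$netdir"/eth*; do
        # Skip when the glob matched nothing or the file is missing.
        [ -r "$dev/address" ] || continue
        printf '%s %s\n' "$(basename "$dev")" "$(cat "$dev/address")"
    done
}
```

    Running list_eth_macs on each node shows which MAC address ended up behind which device name.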

3. Software Setup

The configuration files needed for network trunking are the pm-ethernet.conf files, one for each Ethernet device (eth0, eth1, eth2, ...). This document describes the installation procedure for a 4-node cluster with 4 NICs per node.

  1. The following node list is used in this cluster.

    Compute hosts
    comp0.pccluster.org
    comp1.pccluster.org
    comp2.pccluster.org
    comp3.pccluster.org

  2. Create a configuration file (pm-udp.conf) for PM/UDP (Agent)

    # Configuration file for PM/UDP(Agent)
    0 comp0.pccluster.org
    1 comp1.pccluster.org
    2 comp2.pccluster.org
    3 comp3.pccluster.org

  3. Making pm-ethernet.conf files
    1. Create a configuration file (pm-ethernet-0.conf) for PM/Ethernet for the eth0 device using the following command. If you installed SCore using EIT, this file is the same as /opt/score/etc/pm-ethernet.conf.
      # mkpmethernetconf -unit 0 -speed 100 -device eth0 pm-udp.conf pm-ethernet-0.conf
      # cat pm-ethernet-0.conf
      unit 0
      maxnsend 8
      0 00:90:CC:0F:B9:A0 comp0.pccluster.org
      1 00:90:CC:0F:B9:A3 comp1.pccluster.org
      2 00:20:18:58:AC:DA comp2.pccluster.org
      3 00:20:18:58:BC:00 comp3.pccluster.org

    2. Create a configuration file (pm-ethernet-1.conf) for PM/Ethernet for the eth1 device using the following command:
      # mkpmethernetconf -unit 1 -speed 100 -device eth1 pm-udp.conf pm-ethernet-1.conf
      # cat pm-ethernet-1.conf
      unit 1
      maxnsend 8
      0 00:90:CC:0F:B8:03 comp0.pccluster.org
      1 00:90:CC:0F:B9:A9 comp1.pccluster.org
      2 00:20:18:58:AC:EE comp2.pccluster.org
      3 00:20:18:58:AE:61 comp3.pccluster.org

    3. Create a configuration file (pm-ethernet-2.conf) for PM/Ethernet for the eth2 device using the following command:
      # mkpmethernetconf -unit 2 -speed 100 -device eth2 pm-udp.conf pm-ethernet-2.conf
      # cat pm-ethernet-2.conf
      unit 2
      maxnsend 8
      0 00:90:CC:0F:B8:25 comp0.pccluster.org
      1 00:90:CC:0F:B9:C1 comp1.pccluster.org
      2 00:20:18:58:AC:3E comp2.pccluster.org
      3 00:20:18:58:AC:8B comp3.pccluster.org

    4. Create a configuration file (pm-ethernet-3.conf) for PM/Ethernet for the eth3 device using the following command:
      # mkpmethernetconf -unit 3 -speed 100 -device eth3 pm-udp.conf pm-ethernet-3.conf
      # cat pm-ethernet-3.conf
      unit 3
      maxnsend 8
      0 00:90:CC:0F:B8:06 comp0.pccluster.org
      1 00:90:CC:0F:B9:AD comp1.pccluster.org
      2 00:20:18:58:AC:3C comp2.pccluster.org
      3 00:20:18:58:AC:EC comp3.pccluster.org
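
    The four mkpmethernetconf commands in this step differ only in the unit and device number, so they can be generated in a loop. A minimal sketch that prints the commands rather than running them; it assumes, as in this example, that unit N maps to device ethN. gen_conf_cmds is a hypothetical helper name:

```shell
#!/bin/sh
# gen_conf_cmds COUNT [SPEED]
# Prints one mkpmethernetconf command per NIC, using the same
# options as the commands in this guide. Assumes unit N uses ethN.
# Hypothetical helper; review the output before running it.
gen_conf_cmds() {
    count=$1
    speed=${2:-100}
    n=0
    while [ "$n" -lt "$count" ]; do
        echo "mkpmethernetconf -unit $n -speed $speed -device eth$n pm-udp.conf pm-ethernet-$n.conf"
        n=$((n + 1))
    done
}
```

    Calling gen_conf_cmds 4 prints the four mkpmethernetconf command lines used in this step. Printing first keeps the sketch safe to try on a machine without SCore installed.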

  4. Copy the configuration files (pm-ethernet-[0123].conf) to /opt/score/etc

    # cp pm-ethernet-[0123].conf /opt/score/etc
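
    Before copying, it can be worth checking that no MAC address appears twice across the generated files, since each entry should describe a different NIC. A minimal sketch, assuming the three-column "node MAC hostname" layout of the generated files; find_dup_macs is a hypothetical helper name:

```shell
#!/bin/sh
# find_dup_macs FILE...
# Prints every MAC address that occurs more than once in the given
# pm-ethernet-N.conf files; prints nothing when all are unique.
# Assumes node lines look like "0 00:90:CC:0F:B9:A0 comp0.pccluster.org".
# Hypothetical helper, not part of SCore.
find_dup_macs() {
    # Keep only lines whose second field looks like a MAC address,
    # normalise case, then report repeated values.
    awk '$2 ~ /^[0-9A-Fa-f][0-9A-Fa-f](:[0-9A-Fa-f][0-9A-Fa-f])+$/ { print toupper($2) }' "$@" |
        sort | uniq -d
}
```

    An empty result from find_dup_macs pm-ethernet-[0123].conf means every listed MAC address is unique.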

  5. Modifying scorehosts.db file
    Add the following entries to /opt/score/etc/scorehosts.db, and add the new networks (ethernet-0, ethernet-1, ethernet-2, ethernet-3, ethernet-x2, ethernet-x3, ethernet-x4) to the network attribute of each host.

    ethernet-0 type=ethernet \
    	-config:file=/opt/score/etc/pm-ethernet-0.conf
    ethernet-1 type=ethernet \
    	-config:file=/opt/score/etc/pm-ethernet-1.conf
    ethernet-2 type=ethernet \
    	-config:file=/opt/score/etc/pm-ethernet-2.conf
    ethernet-3 type=ethernet \
    	-config:file=/opt/score/etc/pm-ethernet-3.conf
    ethernet-x2 type=ethernet \
    	-config:file=/opt/score/etc/pm-ethernet-1.conf \
    	-trunk0:file=/opt/score/etc/pm-ethernet-2.conf
    ethernet-x3 type=ethernet \
    	-config:file=/opt/score/etc/pm-ethernet-2.conf \
    	-trunk0:file=/opt/score/etc/pm-ethernet-1.conf \
    	-trunk1:file=/opt/score/etc/pm-ethernet-0.conf
    ethernet-x4 type=ethernet \
    	-config:file=/opt/score/etc/pm-ethernet-3.conf \
    	-trunk0:file=/opt/score/etc/pm-ethernet-0.conf \
    	-trunk1:file=/opt/score/etc/pm-ethernet-1.conf \
    	-trunk2:file=/opt/score/etc/pm-ethernet-2.conf
    

    # cat /opt/score/etc/scorehosts.db
    
    /* PM/Ethernet */
    ethernet        type=ethernet \
                    -config:file=/opt/score/etc/pm-ethernet.conf
    ethernet-0 type=ethernet \
    	-config:file=/opt/score/etc/pm-ethernet-0.conf
    ethernet-1 type=ethernet \
    	-config:file=/opt/score/etc/pm-ethernet-1.conf
    ethernet-2 type=ethernet \
    	-config:file=/opt/score/etc/pm-ethernet-2.conf
    ethernet-3 type=ethernet \
    	-config:file=/opt/score/etc/pm-ethernet-3.conf
    ethernet-x2 type=ethernet \
    	-config:file=/opt/score/etc/pm-ethernet-1.conf \
    	-trunk0:file=/opt/score/etc/pm-ethernet-2.conf
    ethernet-x3 type=ethernet \
    	-config:file=/opt/score/etc/pm-ethernet-2.conf \
    	-trunk0:file=/opt/score/etc/pm-ethernet-1.conf \
    	-trunk1:file=/opt/score/etc/pm-ethernet-0.conf
    ethernet-x4 type=ethernet \
    	-config:file=/opt/score/etc/pm-ethernet-3.conf \
    	-trunk0:file=/opt/score/etc/pm-ethernet-0.conf \
    	-trunk1:file=/opt/score/etc/pm-ethernet-1.conf \
    	-trunk2:file=/opt/score/etc/pm-ethernet-2.conf
    #include "/opt/score/etc/ndconf/0"
    #include "/opt/score/etc/ndconf/1"
    #include "/opt/score/etc/ndconf/2"
    #include "/opt/score/etc/ndconf/3"
    
    #define MSGBSERV        msgbserv=(server.pccluster.org:8764)
    
    comp0.pccluster.org NODE_0 \
     network=ethernet,ethernet-0,ethernet-1,ethernet-2,ethernet-3,ethernet-x2,ethernet-x3,ethernet-x4 group=_scoreall_,pccall smp=1 MSGBSERV
    comp1.pccluster.org NODE_1 \
     network=ethernet,ethernet-0,ethernet-1,ethernet-2,ethernet-3,ethernet-x2,ethernet-x3,ethernet-x4 group=_scoreall_,pccall smp=1 MSGBSERV
    comp2.pccluster.org NODE_2 \
     network=ethernet,ethernet-0,ethernet-1,ethernet-2,ethernet-3,ethernet-x2,ethernet-x3,ethernet-x4 group=_scoreall_,pccall smp=1 MSGBSERV
    comp3.pccluster.org NODE_3 \
     network=ethernet,ethernet-0,ethernet-1,ethernet-2,ethernet-3,ethernet-x2,ethernet-x3,ethernet-x4 group=_scoreall_,pccall smp=1 MSGBSERV
    

    In this file, the ethernet-0, ethernet-1, ethernet-2 and ethernet-3 networks should be used for test purposes only. Remove the networks you do not need once the following communication tests are finished, because these definitions cause trouble in the SCore-D multiuser environment.
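
    A typo in one of the file= paths is easy to miss. A minimal sketch that extracts every -config:file= and -trunkN:file= path from scorehosts.db and reports the ones that do not exist; it assumes the option syntax used in the entries of this step, and missing_conf_files is a hypothetical helper name:

```shell
#!/bin/sh
# missing_conf_files SCOREHOSTS_DB
# Prints each path given as "-config:file=PATH" or "-trunkN:file=PATH"
# that does not exist on this host; prints nothing when all exist.
# Hypothetical helper, not part of SCore.
missing_conf_files() {
    # Split on whitespace, keep the file= options, strip up to "file=".
    tr -s ' \t' '\n\n' < "$1" |
        grep -E '^-(config|trunk[0-9]+):file=' |
        sed 's/^[^=]*=//' |
        sort -u |
        while read -r path; do
            [ -f "$path" ] || echo "$path"
        done
}
```

    Running missing_conf_files /opt/score/etc/scorehosts.db should print nothing once all pm-ethernet-N.conf files have been copied into place.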
    

  6. Modifying /etc/rc.d/init.d/pm_ethernet file

    A sample /etc/rc.d/init.d/pm_ethernet file is as follows:
    #
    # pm_ethernet:  Starts the PM Ethernet driver
    #
    # Version:      @(#) /etc/rc.d/init.d/pm_ethernet 1.00
    #
    # Author:       Shinji Sumimoto (Real World Computing Partnership)
    # chkconfig: 345 90 18
    # description: PM Ethernet driver
    # probe: true
    
    IF=eth0
    UNIT=0
    INTERRUPT_REAPING=on
    
    # Source function library.
    . /etc/rc.d/init.d/functions
    
    # check module
    module=`modprobe -l pm_ethernet_dev.o`
    
    # See how we were called.
    case "$1" in
      start)
            echo
            if [ "x$module" != x ]; then
                modprobe pm_ethernet_dev
            fi
            ifconfig eth1 up  # this depends on your environment
            ifconfig eth2 up  # this depends on your environment
            ifconfig eth3 up  # this depends on your environment
            /sbin/etherpmctl $IF -pm on -ir $INTERRUPT_REAPING -unit $UNIT
            /sbin/etherpmctl eth1 -pm on -ir $INTERRUPT_REAPING -unit 1
            /sbin/etherpmctl eth2 -pm on -ir $INTERRUPT_REAPING -unit 2
            /sbin/etherpmctl eth3 -pm on -ir $INTERRUPT_REAPING -unit 3
            touch /var/lock/subsys/pm_ethernet
            ;;
      stop)
            echo -n "Stopping PM/Ethernet: "
            if [ "x$module" != x ]; then
                rmmod pm_ethernet_dev
            fi
            /sbin/etherpmctl $IF -pm off
            /sbin/etherpmctl eth1 -pm off
            /sbin/etherpmctl eth2 -pm off
            /sbin/etherpmctl eth3 -pm off
            ifconfig eth1 down  # this depends on your environment
            ifconfig eth2 down  # this depends on your environment
            ifconfig eth3 down  # this depends on your environment
            echo
            rm -f /var/lock/subsys/pm_ethernet
            ;;
      status)
            if [ "x$module" != x ]; then
                /sbin/lsmod
            fi
            ;;
      restart)
            $0 stop
            $0 start
            ;;
      *)
            echo "Usage: $0 {start|stop|status|restart}"
            exit 1
    esac
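
    The per-device lines in the start) and stop) branches repeat one pattern, so a site with a different number of NICs may prefer a loop. A minimal sketch that only prints the etherpmctl commands, using the options from the sample script and assuming unit N maps to device ethN; pm_cmds is a hypothetical helper name:

```shell
#!/bin/sh
# pm_cmds on|off COUNT [INTERRUPT_REAPING]
# Prints the etherpmctl command for units 0..COUNT-1.
# Assumes unit N is attached to device ethN, as in the sample script.
# Hypothetical helper; it prints commands instead of running them.
pm_cmds() {
    mode=$1
    count=$2
    ir=${3:-on}
    n=0
    while [ "$n" -lt "$count" ]; do
        if [ "$mode" = on ]; then
            echo "/sbin/etherpmctl eth$n -pm on -ir $ir -unit $n"
        else
            echo "/sbin/etherpmctl eth$n -pm off"
        fi
        n=$((n + 1))
    done
}
```

    With four NICs, pm_cmds on 4 prints the same four etherpmctl lines as the start) branch of the sample script, with $IF, $UNIT and $INTERRUPT_REAPING expanded.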
    

  7. Restarting PM/Ethernet and scoreboard

    To restart scoreboard and reload PM/Ethernet, execute the following commands:
    # /etc/rc.d/init.d/scoreboard restart
    # /etc/rc.d/init.d/pm_ethernet restart

4. Test Procedure of Single Network

  1. Test sequence of the eth1, eth2 and eth3 networks using rpmtest

    See PM/Ethernet Test Procedure, and use network ethernet-1, ethernet-2 or ethernet-3 instead of ethernet.

  2. Test sequence of eth0 network using rcstest

    # /opt/score/sbin/rcstest comp0.pccluster.org ethernet-0 -v
    starting master 0 : pe=4
    starting slave: 2 3 1.

    testing*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*
    .*.*.*.*.*.*.*.*.*.*.*.*.*.*.**.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*
    .*.**.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.**.*.*.*.*.*.*.*.*.*.*.*.
    *.*.*.*.*.*.*.*.*.*.*.*.**.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*
    *.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.**.*.*.*.*.*.*.*.*.*.*.*.*.*
    .*.*.*.*.*.*.*.*.*.*.*comp3( 3) Signal: Interrupted system call(4)
    comp0( 0) Signal: Interrupted system call(4)
    comp1( 1) Signal: Interrupted system call(4)
    comp2( 2) Signal: Interrupted system call(4)

    Use Ctrl-C to quit this test program

  3. Test sequence of eth1 network using rcstest

    # /opt/score/sbin/rcstest comp0.pccluster.org ethernet-1 -v

  4. Test sequence of eth2 network using rcstest

    # /opt/score/sbin/rcstest comp0.pccluster.org ethernet-2 -v

  5. Test sequence of eth3 network using rcstest

    # /opt/score/sbin/rcstest comp0.pccluster.org ethernet-3 -v

5. Test Procedure of Multiple Network

  1. Test sequence of 2 NICs trunking network using rcstest

    # /opt/score/sbin/rcstest comp0.pccluster.org ethernet-x2 -v
    starting master 0 : pe=4
    starting slave: 2 3 1.

    testing*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*
    .*.*.*.*.*.*.*.*.*.*.*.*.*.*.**.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*
    .*.**.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.**.*.*.*.*.*.*.*.*.*.*.*.
    *.*.*.*.*.*.*.*.*.*.*.*.**.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*
    *.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.**.*.*.*.*.*.*.*.*.*.*.*.*.*
    .*.*.*.*.*.*.*.*.*.*.*comp3( 3) Signal: Interrupted system call(4)
    comp0( 0) Signal: Interrupted system call(4)
    comp1( 1) Signal: Interrupted system call(4)
    comp2( 2) Signal: Interrupted system call(4)

    Use Ctrl-C to quit this test program

  2. Test sequence of 3 NICs trunking network using rcstest

    # /opt/score/sbin/rcstest comp0.pccluster.org ethernet-x3 -v

  3. Test sequence of 4 NICs trunking network using rcstest

    # /opt/score/sbin/rcstest comp0.pccluster.org ethernet-x4 -v

6. Performance Tuning

  1. You can tune network trunking communication performance by changing the maxnsend and backoff values in pm-ethernet.conf.
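
    For example, maxnsend can be changed in a copy of a configuration file so that the original stays available for comparison. A minimal sketch, assuming the single "maxnsend N" line that appears in the generated files; set_maxnsend is a hypothetical helper name, and the syntax of the backoff setting is not shown in this guide, so it is left out:

```shell
#!/bin/sh
# set_maxnsend VALUE INPUT_CONF OUTPUT_CONF
# Writes a copy of INPUT_CONF in which the "maxnsend N" line carries
# VALUE. All other lines are copied unchanged.
# Hypothetical helper, not part of SCore.
set_maxnsend() {
    value=$1
    sed "s/^maxnsend[[:space:]][[:space:]]*[0-9][0-9]*/maxnsend $value/" "$2" > "$3"
}
```

    Which value performs best depends on the NICs and switches in use, so compare measurements taken before and after each change.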

PC Cluster Consortium

$Id: ether-trunking.html,v 1.9 2003/01/25 11:50:46 s-sumi Exp $