************************************************************************
* Myricom GM networking software and documentation                     *
* Copyright (c) 2001 by Myricom, Inc.                                  *
* All rights reserved. See the file `COPYING' for copyright notice.    *
************************************************************************

README-linux for gm-1.5

README for linux distribution

Supported platforms:
    Linux 2.2 and 2.4 for ia32, sparc, ppc, alpha.
    Linux 2.4 for ia64 (Itanium).

    - Alpha Linux with more than 2 Gigabytes of memory requires
      Linux 2.4.9 to install GM.
    - On sparc, GM will only compile and run under sparc64 linux
      (Ultrasparc).
    - For PPC, GM has been tested on G3 and G4 platforms using
      Yellow Dog and Black Lab Linux.

Supported interfaces:
    LANai4 with 1Meg or 512K, LANai7, and LANai9

    (If you have LANai4 with 256K, you will need to upgrade your
    interface, or use a previous version of GM (gm-1.2.3 for 256K).
    For installation instructions of an earlier GM version, please
    refer to the README and the OS-specific README files of that
    version. Please also note that Linux 2.4 is not supported on
    earlier GM versions.)

WARNING: When building/linking GM applications, you must do so on a
linux box that matches the OS version of the machine on which you will
be running. You cannot compile on a 2.2.x machine and run the
executable on a 2.4.x machine.

Table of Contents:
-----------------
I.   GM Installation
     a. Configuring, compiling, and loading the GM driver
     b. Running the GM Mapper
     c. Testing the GM installation
II.  Verifying the GM performance
III. Running IP over GM
IV.  Improving IP Performance
V.   Fork() and System() Support
VI.  Sample Scripts to automatically load GM and start the Mapper
VII. Operating-system-specific Caveats
     a. RedHat kernel source or config for Linux 2.2.16-22enterprise
     b. RedHat 7.0, 7.1 and gcc 2.96 compiler
     c. Using Compaq Compilers for Alpha Linux (ccc cxx)
     d. Linux 2.2 - 2.4 for Sparc64
     e. VA Linux
     f. APIC IRQ conflict on Supermicro P4DC6 (dual P4-Xeon)
     g. AGP (nVidia) conflicts
     h. Motherboards with i840 and i860 chipsets
     i. Linux 2.4.x for PowerPC

************************************************************************
If difficulties are encountered, please consult the FAQ
http://www.myri.com/scs/GM_FAQ.html
and direct all technical support questions to help@myri.com.
************************************************************************

===================
I. GM Installation
===================

GM installation is performed in the following three steps.

1. Configure, compile, and load the GM driver:
---------------------------------------------

    gunzip -c gm-1.5.tar.gz | tar xvf -
    cd gm-1.5
    ./configure
    make
    cd binary
    su root
    ./GM_INSTALL

By default, we assume that the header files for your Linux installation
are located in /usr/src/linux. If your Linux installation is not
located in /usr/src/linux, you must configure with the following option:

    ./configure --with-linux=

where the value given after `=' is the directory of the linux kernel
source.

By default, we also assume that you have LANai9 or LANai7 interfaces.
If you have LANai4 (with 1Meg of memory), you will need to configure
with:

    ./configure --disable-new-features

If you have LANai4 with 512K, you will need to configure with:

    ./configure --disable-new-features --with-min-supported-sram=256

and you will only have 4 GM ports available instead of 8.

(As previously noted, if you have LANai4 with 256K, you cannot install
gm-1.5. You will need to upgrade your interface or use a previous
version of GM (gm-1.2.3 for 256K).)

Note: If you have a mixture of hosts with LANai4 and LANai7 (or LANai9)
interfaces that need to talk to each other, you must configure with
--disable-new-features on all of the hosts.

For a complete listing of all options to configure, type:

    ./configure --help

Note: Do not use the configure flag --enable-directcopy. This flag is
not a valid option in GM 1.5. It will be re-enabled in a future
release.
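
The choice of configure flags described above can be summarised in a
small shell helper. This is only a sketch: the function name and the
interface labels (lanai9, lanai7, lanai4-1m, lanai4-512k) are made up
for this example and are not GM identifiers. Only the flags themselves
come from the text above.

```shell
# Hypothetical helper: print the configure flags for one interface type.
# The labels accepted here are names chosen for this sketch only.
gm_configure_flags() {
    case "$1" in
        lanai9|lanai7)
            # Default build, no extra flags needed.
            echo ""
            ;;
        lanai4-1m)
            echo "--disable-new-features"
            ;;
        lanai4-512k)
            # Only 4 GM ports are available with this configuration.
            echo "--disable-new-features --with-min-supported-sram=256"
            ;;
        *)
            # LANai4 with 256K (and anything else) cannot run gm-1.5.
            echo "unsupported interface type: $1" >&2
            return 1
            ;;
    esac
}
```

It could then be used as `./configure $(gm_configure_flags lanai4-1m)`.
In a network that mixes LANai4 with LANai7 or LANai9 hosts, every host
would use the LANai4 flags, as noted above.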

The GM_INSTALL script will unload any existing GM device driver, load
the current device driver and create /dev/gm device i-nodes. It does
not configure the IP device, nor does it set up any scripts to load the
GM driver at boot time.

During the GM_INSTALL phase, GM prints messages to the kernel log
(dmesg). If the running kernel and the kernel header used for
compilation are mismatched, GM will print a warning message to the
kernel log.

Please be sure to read the {GM_HOME}/README and the
{GM_HOME}/README-linux for further details of operating-system-specific
caveats.

Note: If the host is rebooted, you must reload the GM driver (and rerun
the GM mapper). There are sample scripts, contributed by a customer, in
{GM_HOME}/drivers/linux/scripts for loading GM and running the mapper
at reboot.

2. Running the GM Mapper
------------------------

Myrinet is a source-routed network, i.e., each host must know the route
to all other hosts through the switching fabric. The GM mapper
automatically discovers all of the hosts connected to the Myrinet
network, computes a set of deadlock-free minimum-length routes between
the hosts, and distributes appropriate routes to each host on the
connected network.

Loopback and point-to-point network topologies require gm_simpleroute
to be run instead of the GM Mapper. (Refer to the GM README and the FAQ
for details.) For a switch network topology, the GM Mapper must be run
before any communication over Myrinet can be initiated.

Further technical details about the GM mapper can be found in
mt/README.

Depending upon the user's needs, there are three different ways in
which the GM mapper may be used.

MAP_ONCE mapping:
----------------

The first way is by far the most common, and we shall refer to it as
"map_once". In this method, the mapper is run on one host in the
network (any of the hosts). It is rerun if a host (re)boots or a
hostname is changed or after a change of Myrinet topology (swapping of
ports on a switch).
(If the Mapper must be rerun for any of these reasons, it is best to
run it on the same host.)

The command for this method of running the GM mapper is:

    cd {GM_HOME}/binary/sbin/
    su root
    ./mapper map_once.args

STATIC mapping:
--------------

The second way in which the GM mapper may be used is called "static
mapping" or "file mapping". In this method, an active mapper is run
once when ALL of the hosts are up and running the GM driver. This
initial active mapper will generate a map file and a host file. These
files are then copied to all of the hosts in the network, or shared by
NFS. An entry in the boot scripts will allow each host to read the map
file and the host file and update the routing table on its local
Myrinet interface(s). This method is particularly appealing as no human
intervention is needed and no traffic is generated at boot time.

The commands for this method of running the GM mapper are:

    cd {GM_HOME}/binary/sbin/
    su root
    ./mapper static.args

Copy the 3 files created by this command (static.map, static.routes,
and static.hosts) to each {GM_HOME}/binary/sbin/ directory on each host
if the gm tree is not mounted by NFS.

Add the following command to the boot scripts of the host (scripts in
/etc/init.d or /etc/rc.d/init.d).

    cd {GM_HOME}/binary/sbin/
    su root
    ./file_mapper file.args

HA mapping:
-----------

The third way in which the GM mapper may be used is for the users who
have a need for High Availability (HA) in an aggressive computing
environment.

The command for this method of running the GM Mapper is:

    cd {GM_HOME}/binary/sbin/
    su root
    ./mapper active.args &

It will continuously run the GM mapper in the background to detect and
add any new hosts or remove any non-responding hosts, to detect any
change of topology (change of slots in the switch, change of
innerswitch topology), and periodically update the routing tables of
the Myrinet cards (by default, every 30 seconds).

You should note that this mapping method is quite intrusive.
The user is strongly advised to avoid this method of running the GM
mapper if their applications produce heavy network traffic (e.g., MPI
applications), since the GM Mapper uses non-reliable messages that may
be dropped in case of heavy contention, leading to hosts that may be
marked as "non-responding" and removed because they are unreachable.
A few expert customers use this mapping method to satisfy their high
availability constraints for GM applications designed to handle a
dynamic change of configuration (by design, MPI is NOT a fault-tolerant
application).

For the majority of users, the "map_once" GM mapping method is
sufficient. For the users with more production-level constraints, the
"static mapping" is the most adequate method. For fault-tolerant GM
applications, the third method provides the best alternative.

3. Testing the GM Installation
------------------------------

A variety of test scripts are available in {GM_HOME}/binary/bin to test
your GM installation. A README describing each of these tests can be
found in {GM_HOME}/tests/README.

We recommend the following five tests to validate your installation.

    cd {GM_HOME}/binary/bin

1. Test that the Mapper has correctly detected all of the hosts in your
Myrinet network by typing the following command on several of the
hosts:

    ./gm_board_info

Note: In the output of this command, all hosts should be listed in the
routing table of each node. If not all of the hosts are listed, then it
is possible that a cable is not connected, or GM is not properly loaded
on all hosts in the Myrinet network. A green LED should be lit up on
the switch for each connection that is active. If you see

    *** No routes found ***

in the output, this is an indication that the GM Mapper has not been
run. (See the OS-specific README file for details.)

When ./gm_board_info successfully reports a list of hosts, you can then
run ./gm_allsize and ./gm_nway to test the network.

2. Test the basic connectivity of GM, by typing:

    ./gm_allsize --verbose --geometric

on one of the hosts in the Myrinet network.

Note: This loopback test will NOT work in a point-to-point (no switch)
configuration.

3. Test GM bandwidth between two hosts, type (on the first host)

    ./gm_allsize --slave --size=15

and then type the following command (on the second host)

    ./gm_allsize --unidirectional --bandwidth --remote-host= \
        --size=15 --geometric

where the value given to --remote-host is the name of the first host.

These one-way tests are performed by running in slave mode on one
machine and master on the node to be tested. This is done by adding
'--slave' on the command line of the slave machine and '-h' followed by
the name of the machine running in slave mode on the command line of
the master. The name of each host is as specified in the output of
./gm_board_info.

The --size parameter indicates the maximum length of message that will
be sent, where 2^{size} is the value of that length. In this example,
the maximum length of message sent is 2^{15}=32K. The --geometric
parameter reduces the number of message lengths that will be tested.
The default for gm_allsize is to test every length from 1 to 2^max_size
incrementing one byte at a time. These tests take a long time to run,
and generate data files suitable for input to gnuplot.

4. Test GM latency between two hosts, type (on the first host)

    ./gm_allsize --slave --size=15

and then type the following command (on the second host)

    ./gm_allsize --bidirectional --latency --remote-host= \
        --size=15 --geometric

where the value given to --remote-host is the name of the first host.

These one-way tests are performed by running in slave mode on one
machine and master on the node to be tested. This is done by adding
'--slave' on the command line of the slave machine and '-h' followed by
the name of the machine running in slave mode on the command line of
the master. The name of each host is as specified in the output of
./gm_board_info.
The --size parameter indicates the maximum length of message that will
be sent, where 2^{size} is the value of that length. In this example,
the maximum length of message sent is 2^{15}=32K. The --geometric
parameter reduces the number of message lengths that will be tested.
The default for gm_allsize is to test every length from 1 to 2^max_size
incrementing one byte at a time. These tests take a long time to run,
and generate data files suitable for input to gnuplot.

5. Test that the GM messages arrive reliably and in order (e.g., on 3
hosts named host1, host2, and host3) by typing:

    ./gm_nway --fast --verify

This gm_nway command must be run simultaneously on each host, using the
same list of host names in each case. It can be run on any subset of
hosts on the network.

For a list of all possible runtime options for these commands, you can
issue the command with --help as the runtime option, e.g.,
./gm_nway --help.

================================
II. Verifying the GM Performance
================================

We recommend the following test to verify the GM performance.

View the results of the hardware benchmark test of the PCI bus with the
DMA engine of the Myrinet adapter.

    cd {GM_HOME}/binary/bin
    ./gm_debug --no-counters

Note: The output of this command gives the maximum sustained bandwidth
that can be obtained from the PCI bus.

Refer to the section entitled "GM Performance" in the {GM_HOME}/README
for complete details on expected GM performance.

=======================
III. Running IP over GM
=======================

The Linux command to enable IP over GM is as follows:

    /sbin/ifconfig myri0 up

where you must replace 'myri0' with the appropriate name (myri1, myri2,
etc.) if you have more than one Myrinet interface per host.

For more information, please refer to the FAQ
(http://www.myri.com/scs/GM_FAQ.html).

============================
IV. Improving IP performance
============================

To get good NFS bandwidth, use Linux 2.4 instead of Linux 2.2, NFS-v3
over TCP, and the following tuning options. Otherwise, you are latency
dominated and Myrinet IP and Ethernet IP will perform about the same.

- For linux you want to increase the tcp windows:

    echo "262144" > /proc/sys/net/core/rmem_max
    echo "262144" > /proc/sys/net/core/wmem_max
    echo "262144" > /proc/sys/net/core/wmem_default
    echo "262144" > /proc/sys/net/core/rmem_default

- In linux/include/net/tcp.h, replace the value of

    #define MAX_WINDOW 32767

  with the value of your choice (200k~500k might be good).

- Check that /proc/sys/net/ipv4/tcp_window_scaling is enabled with the
  value 1 (as it should be by default).

- Experiment with the buffer sizes of netperf or your favorite net
  tester.

===============================
V. Fork() and System() Support
===============================

By default, GM supports the function vfork(). As the function popen()
calls vfork(), it is also safe to use at any time with GM.

However, GM does not support fork() when a GM port is open, as it is
incompatible with some GM OS manipulations. Since the default behavior
of the function system() is to call fork(), GM overrides this function
to call vfork() or fork() depending on whether the GM port is open or
closed.

Whenever possible, it is HIGHLY recommended that users write their
applications to use vfork() instead of fork().

If you must use fork() and cannot replace it with vfork(), this
limitation can be removed by modifying the
{GM_HOME}/include/gm_enable_fork_system.h file by specifying:

    #define GM_ENABLE_FORK_SYSTEM 1

and then recompiling the GM driver. Be forewarned that this
modification invalidates any support from Myricom, and assumes that you
know the consequences of such a modification. Please contact
help@myri.com for further details.

================================================================
VI. Sample Scripts to automatically load GM and start the Mapper
================================================================

The directory {GM_HOME}/share contains some sample initialization
scripts, contributed by customers, that can be customized to suit your
system to automatically load the gm driver and start the GM Mapper.

=======================================
VII. Operating-system-specific Caveats
=======================================

---------------------------------------------------------------
a. RedHat kernel source or config for Linux 2.2.16-22enterprise
---------------------------------------------------------------

RedHat does not seem to have made available the proper kernel source or
config for linux 2.2.16-22enterprise. You may have to make your own
kernel to get symbols to match.

----------------------------------------
b. RedHat 7.0, 7.1 and gcc 2.96 compiler
----------------------------------------

GM does not compile with the gcc 2.96 compiler that is shipped with
RedHat 7.0 and 7.1.

If you see this error message when compiling GM:

    include/gm_internal.h:28:2: warning: #warning gcc-2.96 can NOT properly compile GM code.
    include/gm_internal.h:29:2: warning: #warning Please use an earlier version of gcc.
    include/gm_internal.h:30:2: warning: #warning Here is one that works:
    include/gm_internal.h:31:2: warning: #warning gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)
    include/gm_internal.h:32:2: warning: #warning For IA32 RedHat7.0 you can use "kgcc"
    include/gm_internal.h:33:2: #error Bad GCC version

then you must use kgcc instead of gcc.
You should be able to do this as follows (under the C shell):

    setenv CC kgcc
    rm -f config.cache     ["-f" in case the file does not exist]
    ./configure

or (under a Bourne shell or Bash):

    rm -f config.cache
    CC=kgcc ./configure

If kgcc is not present by default in your Linux distribution, you can
install one of these packages:

    RH 7.0    kgcc-1.1.2-40
    RH 7.1    compat-glibc-6.2-2.1.3.2
              compat-egcs-6.2-1.1.2.14

For more information see http://www.gnu.org/software/gcc/gcc-2.96.html.

---------------------------------------------------
c. Using Compaq Compilers for Alpha Linux (ccc cxx)
---------------------------------------------------

Under the C shell:

    setenv CC ccc
    setenv CXX cxx
    setenv CXXFLAGS \
      "-g -O2 -inline speed -x cxx -noexceptions -nocxxstd -using_std -w2"
    setenv CFLAGS -gcc_messages
    setenv KCC gcc
    rm -f config.cache
    ./configure

or under a Bourne shell or Bash:

    CC=ccc ; export CC
    CXX=cxx ; export CXX
    CXXFLAGS="-g -O2 -inline speed -x cxx -noexceptions -nocxxstd"
    CXXFLAGS="$CXXFLAGS -using_std -w2" ; export CXXFLAGS
    CFLAGS=-gcc_messages ; export CFLAGS
    KCC=gcc ; export KCC
    rm -f config.cache
    ./configure

------------------------------
d. Linux 2.2 - 2.4 for Sparc64
------------------------------

Running GM on the sparc64-linux arch requires patching the kernel to
get ioctls to work from 32-bit user-space. Two patches are given below.
The first exports the init_mm symbol (it would actually be cleaner to
have it even for other archs); the second makes the kernel aware of gm
ioctls from sparc 32-bit userland. The init_mm patch is not needed
after 2.2.10.
You also need to generate a file `linux/include/gm_ioctl_switch.h'
with:

    perl drivers/linux/sparc32.pl < include/gm_io.h \
        > /usr/src/linux/include/gm_ioctl_switch.h

First patch:

--- linux/kernel/ksyms.c.std    Fri Jun  4 18:14:15 1999
+++ linux/kernel/ksyms.c        Fri Jun  4 18:14:17 1999
@@ -107,6 +107,7 @@
 EXPORT_SYMBOL(update_vm_cache);
 EXPORT_SYMBOL(vmtruncate);
 EXPORT_SYMBOL(find_vma);
+EXPORT_SYMBOL(init_mm);
 EXPORT_SYMBOL(get_unmapped_area);
 
 /* filesystem internal functions */

Second patch:

--- linux/arch/sparc64/kernel/ioctl32.c.std     Fri Mar 17 20:02:23 2000
+++ linux/arch/sparc64/kernel/ioctl32.c Fri Mar 17 20:06:42 2000
@@ -2390,6 +2390,8 @@
 	case AUTOFS_IOC_CATATONIC:
 	case AUTOFS_IOC_PROTOVER:
 	case AUTOFS_IOC_EXPIRE:
+
+#include "gm_ioctl_switch.h"
 
 	/* Raw devices */
 	case _IO(0xac, 0): /* RAW_SETBIND */

-----------
e. VA Linux
-----------

VA Linux might be shipping mismatched .ver files and kernel binaries.
This note is from a customer with a VA linux software install.

#
#We figured out the problem. The configuration that we were trying to
#generate the .ver file from was not the same as the actual kernel
#configuration. So the wrong symbol was being generated by the `make
#dep`.
#
#Since we are using VA Linux Red Hat, there was actually a very simple
#fix. Change directory to /usr/src/linux and execute `make mrproper;
#make restore`. This completely cleaned out the source directory and
#then recreates the configuration based upon the current kernel.
#After that a `make dep` generate the correct .ver files and a `make
#modules` creates the correct modules for the running kernel.

--------------------------------------------------------
f. APIC IRQ conflict on Supermicro P4DC6 (dual P4-Xeon)
--------------------------------------------------------

Here is a description of the behavior that was witnessed.

# We have seen that on the Supermicro P4DC6 (dual P4-Xeon)
# with RH linux 7.1 (2.4 kernel) that the SCSI controller seems to
# get confused on OS boot (hardware probing) when the Myrinet NIC is
# installed.

# This same behavior was experienced on the following machine:
#   IBM Intellistation M Pro 6850
#   1.4 GHz P4 SMP (only one processor installed)
#   RH7.1
#   Kernel 2.4.3 (redhat update)
#   All 7.1 updates (updates.redhat.com)
#
# GM message that IRQ 11 is used, Link lights up, then solid hang
# (no keyboard lights, nothing).

After receiving one of these Supermicro machines on which to test our
theories, one of our developers concluded that:

Linux support for the APIC on this motherboard is broken. So if you
boot linux using the APIC code, it will map a strange IRQ to the
Myrinet board. The solution is to boot Linux with "noapic".

When booting Linux with "noapic", the compatibility code is used and
the IRQs are not re-mapped. Everything is straight from the BIOS. In
this case, the onboard SCSI and the Myrinet NIC get the same interrupt
(it seems that onboard SCSI and all 64-bit PCI slots get the same IRQ).
People not using onboard SCSI are happy because everything is fine for
them. But if they use SCSI, the Linux driver will hang at boot time. I
have checked the code in the Adaptec Linux driver, and it supports
shared IRQ, as does the Myrinet driver.

What we found when we tested the machine here is that the problem is
not dependent on linux: we saw the problem with FreeBSD and Linux, and
even without Myrinet. So, that means that the problem is with the BIOS.
It appears that the Supermicro BIOS does not report the APIC mapping
correctly.

For now we will continue to tell customers to disable APIC, i.e., to
boot Linux with "noapic". By booting with this kernel flag, the APIC
mapping is not used, SCSI and the 64-bit slots will share the IRQ, and
everything works fine.

-------------------------
g. AGP (nVidia) conflicts
-------------------------

Two types of problems were reported.

1. If I load the GM module first, and then load the nVidia module, it
works. But if I load the nVidia module first, GM won't load.

#Our systems consist of ASUS P3V4X motherboards with 733MHz PIII's, 1GB
#RAM, 40GB of disk, nVidia GeForce II boards, Intel Ethernet Pro 100
#NICS, and ~40GB of disk. They are all running clean installs of
#RedHat 7.1 with the latest RedHat patches and a 2.4.7 kernel built with
#4GB high memory and module support (no smp support).
#
#We finally found a work around to get the myrinet to work with the
#nVidia cards in our cluster.
#
#We found that if we load the kernel module before loading nVidia's
#kernel module, then it works fine. After the gm module is loaded, we
#can then load the nVidia module. We did not need to change any of our
#BIOS settings. Once the gm module has been loaded, it can be unloaded
#and reloaded as needed until a reboot occurs.
#
#We are using nVidia's latest driver from their web site (www.nvidia.com).
#
# n03 kernel: GM: pci_rev2: Could NOT map board into kernel (span = 0x1000000)
# n03 kernel: GM: WARNING: drivers/gm_instance.c:4689:gm_instance_init():kernel:
* n03 kernel: GM: Can't map IO memory to system memory

This one is a case of shortage of virtual memory (used for IO-mapping
PCI memory) in the Linux kernel. On configurations with a lot of
physical memory, Linux reserves only 128MB of the address space for
dynamically allocated virtual memory. Unfortunately the nVidia card
seems to eat as much virtual memory as it can (it occupies at least
128MB in PCI memory space), so if you load it before the gm module on
such a configuration, you will get the error reported above.
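
The arithmetic behind this failure can be sketched in a few lines of
shell. This is a simplified model, not how the kernel accounts for the
space: it ignores other users of the reserved area and any per-mapping
overhead. The function name is invented for this sketch. The sizes are
the ones quoted above (a 128MB reserve, at least 128MB for the nVidia
card, and the span of 0x1000000 bytes from the GM log message).

```shell
# Hypothetical helper: succeed if all requested IO mappings fit in the
# reserved virtual address range.
# Usage: fits_in_reserve RESERVE_BYTES SIZE_BYTES...
fits_in_reserve() {
    reserve=$1
    shift
    total=0
    for size in "$@"; do
        total=$((total + size))
    done
    [ "$total" -le "$reserve" ]
}

reserve=$((128 << 20))      # 128MB reserved by the kernel
nvidia=$((128 << 20))       # lower bound quoted for the nVidia card
gm_span=$((0x1000000))      # 16MB, from "span = 0x1000000" in the log

if fits_in_reserve "$reserve" "$nvidia" "$gm_span"; then
    echo "both mappings fit"
else
    echo "not enough reserved space for both mappings"
fi
```

Under these assumptions the two mappings together need at least 144MB,
which is more than the 128MB reserve, so whichever module loads second
is the one that risks failing to map its memory.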

The recommended fix for people with more than 768MB of memory and an
nVidia card is to apply the following patch to their kernel:

--- arch/i386/kernel/setup.c    Thu Aug  2 17:00:46 2001
+++ arch/i386/kernel/setup.c.2  Thu Oct 11 09:00:59 2001
@@ -815,7 +815,7 @@
 /*
  * 128MB for vmalloc and initrd
  */
-#define VMALLOC_RESERVE (unsigned long)(128 << 20)
+#define VMALLOC_RESERVE (unsigned long)(256 << 20)
 #define MAXMEM (unsigned long)(-PAGE_OFFSET-VMALLOC_RESERVE)
 #define MAXMEM_PFN PFN_DOWN(MAXMEM)
 #define MAX_NONPAE_PFN (1 << 20)

They should also make sure that the HIGHMEM option is enabled when
configuring the kernel they use.

If they do not mind losing memory, or just want to do a test, they can
try to boot their current kernel with mem=768m to see if the problem
disappears.

2. Overlapping of prefetch memory for the AGP and PCI bridges.

SGI Visual Workstation 550 machine. AGP cards (nVidia Quadro, ATI
Mach64 PCI graphics card, ATI Rage AGP).

What we see with them is that the prefetchable memory assigned by the
BIOS for the AGP and PCI bridges is overlapping. This looks like a BIOS
problem and we have asked the customer to look into upgrading the BIOS,
or to experiment with the BIOS settings to attempt to get the BIOS to
do the right thing (things to try: toggling the plug-n-play OS setting,
changing the size of the AGP graphics aperture, reinitializing or
re-detecting the PCI space in the configuration space, etc.)
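
Whether two memory windows collide can be checked mechanically. The
sketch below is one way to do it in shell. The function name is
invented for this example, and the ranges are the ones from the
listing reported for this case in this section. The 16M regions are
written out as inclusive start-end ranges (for example, 82000000 with
size 16M becomes 82000000-82ffffff).

```shell
# Hypothetical helper: succeed if two inclusive address ranges overlap.
# Arguments are hex digits without the 0x prefix:
#   ranges_overlap START1 END1 START2 END2
ranges_overlap() {
    s1=$((0x$1)); e1=$((0x$2))
    s2=$((0x$3)); e2=$((0x$4))
    [ "$s1" -le "$e2" ] && [ "$s2" -le "$e1" ]
}

# AGP bridge window against the PCI bridge (Hub B) window.
ranges_overlap 82300000 850fffff 81600000 831fffff \
    && echo "bridge windows overlap"

# AGP bridge window against the Myrinet card region.
ranges_overlap 82300000 850fffff 82000000 82ffffff \
    && echo "AGP bridge window overlaps the Myrinet region"

# Rage card region against the Myrinet card region.
ranges_overlap 84000000 84ffffff 82000000 82ffffff \
    || echo "Rage and Myrinet regions do not overlap"
```

The check relies on the shell doing arithmetic wider than 32 bits,
which is an assumption about the shell in use.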

Specifically, it was seen that:

The memory for the Myrinet card is mapped at exactly the same spot with
the ATI Mach64 PCI graphics card as it is with the ATI Rage AGP
graphics card:

03:01.0 Non-VGA unclassified device: MYRICOM Inc.: Unknown device 8043 (rev 03)
        Region 0: Memory at 82000000 (64-bit, prefetchable) [size=16M]

However, now look at the bridges leading to bus 3 (PCI, where the
Myrinet card is) and bus 1 (AGP) in the ATI Rage AGP config:

00:01.0 PCI bridge: Intel Corporation 82840 840 (Carmel) Chipset AGP Bridge (rev 01) (prog-if 00 [Normal decode])
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
        Prefetchable memory behind bridge: 82300000-850fffff

00:02.0 PCI bridge: Intel Corporation 82840 840 (Carmel) Chipset PCI Bridge (Hub B) (rev 01) (prog-if 00 [Normal decode])
        Bus: primary=00, secondary=02, subordinate=03, sec-latency=0
        Prefetchable memory behind bridge: 81600000-831fffff

See how those prefetchable memory regions overlap? And, more
importantly, see how the prefetchable memory region of the bridge to
the AGP bus overlaps that of the Myrinet card? Note that the only
prefetchable memory on the AGP bus is for the Rage card and that this
memory is a small subset of the region the bridge is claiming:

01:00.0 VGA compatible controller: ATI Technologies Inc 3D Rage IIC AGP (rev 7a) (prog-if 00 [VGA])
        Region 0: Memory at 84000000 (32-bit, prefetchable) [size=16M]

This issue is unresolved.

-------------------------------------------
h. Motherboards with i840 or i860 chipsets
-------------------------------------------

Several customers have reported IRQ issues (see entry for APIC) and
disappointing DMA performance.

A customer received a beta BIOS release from Supermicro that they have
found resolves some IRQ issues for them and increases the performance.
Using this new BIOS, along with the small change to the gm code
(described below), they are now seeing around 300 MBytes/sec.

Here is the change to increase the performance on the 860 (or 840)
chipset.

In our code, in this file:

    {GM_HOME}/drivers/linux/gm/gm_arch.c

you can find the section of code shown below.

If you have an i840 chipset, modify the flag to be

    #define GM_INTEL_840 1

If you have an i860 chipset, make sure that you have BOTH of these two
lines:

    myword = myword & ~0x0018;  /* read prefetch all types */
    myword = myword | 0x0004;   /* try DT depth = 512 bytes */

(and if not, add the second one), then rebuild and reload the driver,
and run

    gm_debug -L

to see if your peak PCI performance is higher.

---------------------------
i. Linux 2.4.x for PowerPC
---------------------------

There is a bug in Linux 2.4.x for PowerPC related to PCI
initialization, and we provide a patch.

This patch applies only to PowerPC running Linux 2.4.x later than a
certain version. The exact version is not known. We know that Myrinet
worked with a 2.4.5 kernel without the patch, and it definitely doesn't
work with 2.4.11 or 2.4.13. These versions of Linux are known to need
the patch.

A brief description of the "bug" is as follows. When probing the bridge
configuration, Linux is confused by a bus where no non-prefetchable
range is enabled and does very buggy things in that case (that probably
rarely ever happens except with Myrinet boards since most PCI devices
advertise their registers as non-prefetchable).

In the kernel log you will see:

>PCI: Probing PCI hardware
>Fixup res 1 (101) of dev 00:10.0: 400 -> 802400
>Unknown bridge resource 0: assuming transparent
>Unknown bridge resource 2: assuming transparent

And when you install GM, you will see a message saying that the Myrinet
card's base address is not set correctly.

>GM: Version 1.5_Linux_beta2 build 1.5_Linux_beta2 nelson@gala Tue Oct 23
>17:05:24 PDT 2001
>GM: Memory available for registration: 16384 pages (64 MBytes)
>GM: WARNING:
>GM: Bad PCI Info:base_address[0] = 0!!! (PCI iobase=0x0)
>[...]
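
The two symptoms quoted above can be looked for in a saved kernel log
with a small shell function. This is only a sketch: the function name
is invented, and the patterns are copied from the messages shown above,
so they would need adjusting if other kernel or GM versions word the
messages differently.

```shell
# Hypothetical helper: read kernel log text on stdin and succeed if
# either of the symptoms quoted above is present.
has_ppc_pci_symptom() {
    grep -E -q \
        'Unknown bridge resource [0-9]+: assuming transparent|Bad PCI Info:base_address\[0\] = 0'
}
```

For example, `dmesg | has_ppc_pci_symptom && echo "kernel may need the
PCI patch"` would print the hint only when one of the messages is found.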

This is the Linux patch to apply to the kernel source.

--- linux/drivers/pci/pci.c.start       Tue Oct  2 10:32:41 2001
+++ linux/drivers/pci/pci.c     Tue Oct  2 10:38:07 2001
@@ -975,11 +975,11 @@
 		res->name = child->name;
 	} else {
 		/*
-		 * Ugh. We don't know enough about this bridge. Just assume
-		 * that it's entirely transparent.
+		 * Ugh. This memory type seems unconfigured
 		 */
-		printk(KERN_ERR "Unknown bridge resource %d: assuming transparent\n", 0);
-		child->resource[0] = child->parent->resource[0];
+		printk(KERN_ERR "bridge resource %d not set\n", 0);
+		res->flags =res->start = res->end = 0;
+		res->name = child->name;
 	}
 
 	res = child->resource[1];
@@ -994,8 +994,9 @@
 		res->name = child->name;
 	} else {
 		/* See comment above. Same thing */
-		printk(KERN_ERR "Unknown bridge resource %d: assuming transparent\n", 1);
-		child->resource[1] = child->parent->resource[1];
+		printk(KERN_ERR "bridge resource %d not set\n", 1);
+		res->flags =res->start = res->end = 0;
+		res->name = child->name;
 	}
 
 	res = child->resource[2];
@@ -1025,8 +1026,9 @@
 		res->name = child->name;
 	} else {
 		/* See comments above */
-		printk(KERN_ERR "Unknown bridge resource %d: assuming transparent\n", 2);
-		child->resource[2] = child->parent->resource[2];
+		printk(KERN_ERR "Bridge resource %d not set\n", 2);
+		res->flags =res->start = res->end = 0;
+		res->name = child->name;
 	}
 }