VORPAL FAQ
- How should I modify my MPICH environment for large VORPAL simulation runs?
- Where can I find a peer-reviewed journal I can cite when referencing VORPAL?
- What additional software do I need to run VORPAL in parallel?
- When I run in parallel, why do I get a prompt asking for my Password?
- Are any other settings required to run VORPAL in parallel on Linux (or the Mac)?
- VORPAL hangs randomly when I run in parallel. What could be the problem?
Q: How can I get help installing VORPAL?
A: Send questions about installing VORPAL to Tech-X Customer Support at support@txcorp.com.
Q: How should I modify my MPICH environment for large VORPAL simulation runs?
A: Running large VORPAL jobs on 2048 or more cores can sometimes cause MPI-related segmentation faults. To prevent this, place the following lines in your *.qsub file before the line that invokes the VORPAL executable:
export MPICH_UNEX_BUFFER_SIZE=200000000
export MPICH_MAX_SHORT_MSG_SIZE=2048
export MPICH_PTL_OTHER_EVENTS=10000
Q: Where should I send my questions related to VORPAL?
A: Address technical questions via email to the VORPAL discussion list.
Please contact Tech-X directly for sales, collaboration, and other questions.
Q: Where can I find a peer-reviewed journal I can cite when referencing VORPAL?
A: C. Nieter and J. R. Cary, "VORPAL: a versatile plasma simulation code", J. Comp. Phys. 196, 448-472 (2004).
Q: Is Mac OS X supported?
A: Yes, we support Leopard (10.5) and Snow Leopard (10.6). Tiger (10.4) is not supported.
Q: Which Windows operating systems are supported?
A: VORPAL runs under Windows 7, Vista, and XP. Both 32-bit and 64-bit versions are available.
Q: Which Linux operating systems are supported?
A: We officially support Fedora Core 7-12, RHEL 5, CentOS 5, and other Red Hat variants such as Scientific Linux. VORPAL is likely to run on other similar Linux operating systems, but it is not guaranteed.
Q: What additional software do I need to run VORPAL in parallel?
A: On Linux and the Mac, our standard parallel build uses openmpi 1.4.1, which is included in the tarball. To run in parallel on Windows, the Microsoft Compute Cluster Pack SDK is required. This package can be downloaded from the Microsoft Download Center at http://www.microsoft.com/downloads/details.aspx?FamilyID=d8462378-2f68-409d-9cb3-02312bc23bfd&displaylang=en
Q: Do you support parallel execution via MPICH2?
A: Yes, we regularly build with MPICH2 1.2.1 and can provide an MPICH2 build on demand.
Q: When I run in parallel, why do I get a prompt asking for my Password?
A: On the Mac and Linux, openmpi uses rsh or ssh to connect to remote nodes (or your local machine, if you have a multi-core machine). You need to configure rsh or ssh so that you can log in to the remote/local node without a password. The intricacies of setting up rsh and ssh are beyond the scope of this document, but a good place to start would be the openmpi FAQ on this topic:
http://www.open-mpi.org/faq/?category=rsh
You may need to consult your sysadmin to properly configure rsh/ssh.
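As a rough illustration of what "log in without a password" means in practice, the sketch below defines a small check for ssh. The function name check_passwordless_ssh and the host name node01 are hypothetical; BatchMode and ConnectTimeout are standard OpenSSH client options. The key setup commands in the comments are one common approach, not the only one, and your site may require something different.

```shell
# Hypothetical helper: returns 0 if ssh can log in to host "$1" without
# prompting. BatchMode=yes makes ssh fail instead of asking for a password.
check_passwordless_ssh() {
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true 2>/dev/null; then
        echo "$1: passwordless login works"
    else
        echo "$1: login would prompt for a password (or the host is unreachable)"
        return 1
    fi
}

# One-time setup if the check fails (run by hand; node01 is a placeholder):
#   ssh-keygen            # create a key pair in ~/.ssh
#   ssh-copy-id node01    # append the public key to node01's authorized_keys
#   check_passwordless_ssh node01
```

Run the check against every node listed for the job, including the local machine if openmpi starts processes there through ssh.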
Q: Are any other settings required to run VORPAL in parallel on Linux (or the Mac)?
A: First, put VORPAL's bin directory in your PATH so that you do not have to type the full path to VORPAL every time you run it. Next, you must have the openmpi bin directory in your PATH. If you use the bash shell, add the following line to your .bashrc file:
export PATH=/path/to/openmpi/bin:/path/to/vorpal/bin:$PATH
Having that line in your .bashrc will ensure that when openmpi invokes rsh/ssh to start a new process on a remote node, your environment will be properly configured.
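A quick way to confirm that the PATH change took effect is sketched below. The function name dir_in_path is hypothetical, and /path/to/openmpi/bin is the same placeholder used in the export line above.

```shell
# Hypothetical helper: returns 0 if directory "$1" is an entry in the
# colon-separated list "$2" (defaults to the current PATH).
dir_in_path() {
    case ":${2:-$PATH}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example use after opening a new shell:
#   dir_in_path /path/to/openmpi/bin || echo "openmpi bin directory is not in PATH"
#   command -v mpirun    # shows which mpirun the shell will actually run
```

Because the openmpi directory is listed first in the export line, command -v mpirun should report the mpirun in that directory rather than a system-wide one.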
Q: VORPAL hangs randomly when I run in parallel. What could be the problem?
A: There is a known bug in openmpi versions 1.2.0 through 1.4.0 that causes intermittent hangs. If you are using one of these versions of openmpi, you will need to upgrade to 1.4.1.
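The version range above can be checked with a few lines of shell. This is only a sketch: the function names are hypothetical, and it assumes the version string has the usual major.minor.patch form.

```shell
# Convert "major.minor.patch" to a single integer so versions can be compared.
# A missing patch field counts as 0.
version_to_int() {
    echo "$1" | awk -F. '{ printf "%d\n", $1 * 10000 + $2 * 100 + $3 }'
}

# Hypothetical helper: returns 0 if version "$1" lies in the affected
# range 1.2.0 through 1.4.0 (inclusive).
openmpi_version_affected() {
    v=$(version_to_int "$1")
    [ "$v" -ge 10200 ] && [ "$v" -le 10400 ]
}

# Example use:
#   openmpi_version_affected 1.3.3 && echo "upgrade to 1.4.1"
```

The installed version is normally reported by mpirun --version; the exact layout of that output can vary between releases, so read the version number from it by eye if you are unsure how to parse it.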
Q: What else could cause VORPAL to hang?
A: Under normal circumstances, VORPAL should not hang. However, if you try to run VORPAL on a problem that exhausts the physical memory of the computer, VORPAL will appear to hang. If you suspect that this is happening, you should closely monitor your system's memory usage. The method for monitoring memory depends on your operating system. Methods for monitoring memory on Linux, Windows, and OS X are discussed below.
To understand how to monitor your system's memory usage, it is important to understand the following terminology: “RAM” refers to the physical memory in your system, whereas “swap” or “swap space” refers to a dedicated portion of the hard disk that can hold copies of some portions of RAM contents. When the operating system is running out of RAM, it will “swap out” a process (or portion of a process) by copying it from RAM to the swap space. After the swap, the operating system can use the freed RAM space for another purpose.
Monitoring Memory Usage on Linux:
Most Linux distributions have a utility named “top” that can be used to monitor the system's resource usage. Run top from a shell window to see all of the processes running on your system. To view only your own processes instead of all the processes running on the system, invoke top with the -u option (i.e., “top -u your_username”).
There are two rows displayed by top that show memory usage for the system. Look for rows that begin with “Mem:” and “Swap:”.
Columns two, three, and four of the Mem: row display the total RAM installed on the system, the amount of RAM being used, and the amount of RAM still available for use, respectively. When the number in column three equals the number in column two (i.e., memory used = total memory), or when column four displays "0k free", all available system memory is being used.
Columns two, three, and four of the Swap: row display information about total amount of swap space available on the system, amount of swap space currently in use, and amount of swap space still available for use, respectively.
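If you would rather log memory figures during a run than watch top interactively, the same information can be read non-interactively. The sketch below is an assumption-laden illustration, not part of VORPAL: the function name swap_used_kb is hypothetical, and it relies on the SwapTotal and SwapFree lines of the Linux /proc/meminfo file.

```shell
# Hypothetical helper: reads /proc/meminfo-style text on stdin and prints
# the amount of swap currently in use, in kB (SwapTotal minus SwapFree).
swap_used_kb() {
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { unused = $2 }
         END { print total - unused }'
}

# Example use on Linux:
#   swap_used_kb < /proc/meminfo
#   top -b -n 1 -u your_username | head -n 15    # one-shot, non-interactive top
```

A swap figure that keeps growing while VORPAL runs suggests that the simulation no longer fits in RAM.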
Monitoring Memory Usage on Windows:
Press Ctrl-Alt-Del on Windows to bring up the Task Manager. At the top of the Task Manager, select the tab labeled “Processes” to see a list of all the processes running on your system. If you click on the “Mem Usage” column header, the list will be sorted by memory usage.
Monitoring Memory Usage on Mac OS X:
Open a Terminal window (/Applications/Utilities/Terminal) and use “top” as described in Monitoring Memory Usage on Linux. Alternatively, for a more user-friendly view of the activity on your system, launch the Activity Monitor (/Applications/Utilities/Activity Monitor).
Regardless of which type of system you are using, if VORPAL consumes most or all of your RAM, performance will be poor and the system may appear to hang. Response time degrades when the system swaps processes because hard disks are thousands of times slower than RAM. Therefore, the more swapping that occurs, the more slowly the system will respond.
To prevent VORPAL from consuming too much memory and causing your machine to swap, perform any or all of the following steps that can be applied to your simulation:
1. Reduce the number of particles.
2. Reduce the size of your grid.
3. Reduce the simulation dimensionality.
Q: When running in parallel, the processes on the head node complete normally, but the processes on the other nodes fail at the end of the run with various MCA errors. What is causing this behavior?
A: We have tracked this problem to a peculiar "feature" of Ubuntu whereby the .bashrc is not sourced during a non-interactive ssh to the nodes. Because the .bashrc is not sourced, the PATH is not updated, and the system mpirun (/usr/bin/mpirun) is being used instead of the desired mpirun which was installed elsewhere. The problem is outlined in the Ubuntu forums:
http://ubuntuforums.org/showthread.php?t=191632
The solution listed in the forums (adding a ~/.bash_profile which contains commands to source the ~/.bashrc) did not work for one of our customers. He solved the problem by adding the PATH change to the /etc/environment file.
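To see whether a node is affected, you can ask a non-interactive ssh session which mpirun it finds, since that is the situation openmpi is in when it starts remote processes. The sketch below is illustrative only: the function name check_remote_mpirun, the host name node01, and the path /path/to/openmpi/bin/mpirun are placeholders.

```shell
# Hypothetical helper: compares the mpirun found by a non-interactive ssh
# session on host "$1" with the expected full path "$2".
check_remote_mpirun() {
    found=$(ssh "$1" 'command -v mpirun' 2>/dev/null)
    if [ "$found" = "$2" ]; then
        echo "$1: OK ($found)"
    else
        echo "$1: expected $2 but found ${found:-nothing}"
        return 1
    fi
}

# Example use (placeholders):
#   check_remote_mpirun node01 /path/to/openmpi/bin/mpirun
```

If the check reports /usr/bin/mpirun, the PATH change is not reaching non-interactive sessions on that node. Note that /etc/environment is typically read as plain NAME=value lines rather than as a shell script, so the full PATH value is usually written out there instead of referring to $PATH.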
