[Simfactory] SimFactory on Ranger
Frank Loeffler
knarf at cct.lsu.edu
Mon Mar 16 15:03:26 CST 2009
Hi,
here is an update.
Contrary to what they wrote earlier, I have now received this:
<quote>
I just tried letting just one of your 2048 jobs run and it once again
triggered the problem on the /share filesystem. Like I said previously,
you cannot have _ANY_ module commands in your .profile_user or .bashrc
files, or else you need to ensure they are called only once when
you login. Every single MPI task will try to run the full set of module
commands in your .profile or .bashrc scripts.
I consulted with a few others here and the primary problem is that you
are using bash as your shell and it does not interpret the difference
between a login interactive shell and non-interactive shell (i.e. using
.login and .cshrc). There is a possible workaround however, you can put
module commands in your .profile_user as long as you put them inside an
if check for an environment variable. In the if block, you then set the
environment variable. Upon login to Ranger, the environment variable is
set and propagated to jobs so that when they start running, each task
reading the .profile_user does not issue the module commands again.
You will need to make the login script changes, log out and then back in
and resubmit all of your jobs. Please let me know when you have one 2048
test job loaded and we can test it tomorrow during the maintenance and
not impact other users on the system (like happened today when your job
caused /share to hang for a while).
</quote>
This is what I have now done:
login3$ more .profile_user
if [ "x" = "x$RANGER_MODULE_WORKAROUND" ]; then
module unload pgi
module unload mvapich
module load intel/10.1
#module load intel/9.1
#module load mvapich/1.0.1
module load mvapich/0.9.9
#module load papi/3.6.0
#module load gcc/4.2.0
fi
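
Note that the listing above tests RANGER_MODULE_WORKAROUND but, as shown, never sets it, whereas the quoted advice is to set the variable inside the if block. A minimal sketch of the complete guard pattern might look like the following. Several parts are assumptions made only so the sketch runs outside Ranger: the value 1, the stub `module` function (a stand-in for the real command), and the wrapper function `read_profile_user` are all hypothetical. Whether the exported variable actually reaches every MPI task depends on the batch system propagating the login environment, as the quoted message says it does.

```shell
#!/bin/sh
# Sketch only: a stub replaces the real `module` command so this runs anywhere.
# It just counts how many times a module command would have been issued.
MODULE_CALLS=0
module() { MODULE_CALLS=$((MODULE_CALLS + 1)); }

# Simulate a fresh login where the guard variable is not yet set.
unset RANGER_MODULE_WORKAROUND

# Hypothetical wrapper standing in for the shell reading .profile_user.
read_profile_user() {
    if [ "x" = "x$RANGER_MODULE_WORKAROUND" ]; then
        module unload pgi
        module unload mvapich
        module load intel/10.1
        module load mvapich/0.9.9
        # Set and export the guard so later reads (and child processes
        # that inherit the environment) skip the module commands.
        export RANGER_MODULE_WORKAROUND=1
    fi
}

read_profile_user   # login: guard unset, module commands run
read_profile_user   # later read: guard set, block is skipped

echo "module commands issued: $MODULE_CALLS"
```

Without the export line, the guard stays empty and every read of the file would issue the module commands again, which is the behaviour the quoted message asks to avoid.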
Let us see how this works for them.
Frank