[Bigjob-users] BigJob - Remote Job Submission Error!

Andre Luckow andre.luckow at gmail.com
Tue Feb 7 00:59:45 CST 2012


Hi Sai,
I don't see any error. Can you send the complete log, please?
Thanks!
Andre

On Feb 6, 2012, at 22:23, Sai Saripalli <ssarip1 at tigers.lsu.edu> wrote:

> Hi All,
> 
> I was able to run BigJob on a remote machine from sierra (e.g. from sierra on india), but not from queenbee. I get the following output when I run example_fg_single.py. I can see the BigJob directory on the remote machine, but not the corresponding subjob directory. These are the parameters in my Python script:
> 
>     queue=None # if None default queue is used
>     project=None # if None default allocation is used 
>     walltime=10
>     processes_per_node=8
>     number_nodes = 1
>     workingdirectory="/N/u/ssarip1/agent"
>     #workingdirectory="/home/ssaripal/bigjob_examples/examples" # working directory for agent
>     userproxy=None # userproxy (not supported yet due to context issue w/ SAGA)
> 
> COORDINATION_URL = "redis://cyder.cct.lsu.edu:2525"
> lrms_url = "pbs-ssh://ssarip1@india.futuregrid.org" # resource url to run the jobs on localhost
> 
> 
> 
> Error:
> (python)[ssaripal@qb3 examples]$ python example_fg_single.py
> 02/06/2012 03:11:30 PM - bigjob - DEBUG - Loading BigJob version: 0.4.33
> 02/06/2012 03:11:30 PM - bigjob - DEBUG - read configfile: /home/ssaripal/.bigjob/python/lib/python2.7/site-packages/BigJob-0.4.33-py2.7.egg/bigjob/../bigjob.conf
> 02/06/2012 03:11:30 PM - bigjob - DEBUG - Using SAGA C++/Python.
> Start Pilot Job/BigJob at: pbs-ssh://ssarip1@india.futuregrid.org
> 02/06/2012 03:11:31 PM - bigjob - DEBUG - init BigJob w/: redis://cyder.cct.lsu.edu:2525
> 02/06/2012 03:11:31 PM - root - DEBUG - ['/home/ssaripal/.bigjob/python/lib/python2.7/site-packages/BigJob-0.4.33-py2.7.egg/coordination/../ext/redis-2.4.9/', '/home/ssaripal/bigjob_examples/examples/../', '/home/ssaripal/bigjob_examples/examples', '/home/ssaripal/.bigjob/python/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg', '/home/ssaripal/.bigjob/python/lib/python2.7/site-packages/pip-1.0.2-py2.7.egg', '/home/ssaripal/.bigjob/python/lib/python2.7/site-packages/BigJob-0.4.33-py2.7.egg', '/home/ssaripal/.bigjob/python/lib/python2.7/site-packages/bliss-0.1.17-py2.7.egg', '/home/ssaripal/.bigjob/python/lib/python2.7/site-packages/redis-2.2.4-py2.7.egg', '/home/ssaripal/.bigjob/python/lib/python2.7/site-packages/virtualenv-1.7-py2.7.egg', '/home/ssaripal/.bigjob/python/lib/python2.7/site-packages/threadpool-1.2.7-py2.7.egg', '/home/ssaripal/.bigjob/python/lib/python2.7/site-packages/uuid-1.30-py2.7.egg', '/home/ssaripal/.bigjob/python/lib/python2.7/site-packages/paramiko_on_pypi-1.7.6-py2.7.egg', '/home/ssaripal/.bigjob/python/lib/python2.7/site-packages/openssh_wrapper-0.2-py2.7.egg', '/home/ssaripal/.bigjob/python/lib/python2.7/site-packages/pycrypto_on_pypi-2.3-py2.7-linux-x86_64.egg', '/project/tg_csa/saga/saga/1.5.3/gcc-3.4.6/lib/python2.7.1/site-packages', '/project/tg_csa/saga/external/python/2.7.1/gcc-3.4.6/lib/python2.7/site-packages', '/home/ssaripal/bigjob_examples/examples/PYTHONPATH=/usr/local/packages/python/2.6.4/gcc-4.3.2.new/bin/python2.6', '/usr/local/packages/saga/1.5.3/saga-bindings-python-0.9.0', '/usr/local/packages/saga/1.5.3/lib/python2.3.4/site-packages', '/home/ssaripal/bigjob_examples/examples', '/home/packages/teragrid/pacman-3.26-r1/src', '/home/ssaripal/.bigjob/python/lib/python27.zip', '/home/ssaripal/.bigjob/python/lib/python2.7', '/home/ssaripal/.bigjob/python/lib/python2.7/plat-linux2', '/home/ssaripal/.bigjob/python/lib/python2.7/lib-tk', '/home/ssaripal/.bigjob/python/lib/python2.7/lib-old', 
> '/home/ssaripal/.bigjob/python/lib/python2.7/lib-dynload', '/project/tg_csa/saga/external/python/2.7.1/gcc-3.4.6/lib/python2.7', '/project/tg_csa/saga/external/python/2.7.1/gcc-3.4.6/lib/python2.7/plat-linux2', '/project/tg_csa/saga/external/python/2.7.1/gcc-3.4.6/lib/python2.7/lib-tk', '/home/ssaripal/.bigjob/python/lib/python2.7/site-packages', '/home/ssaripal/.bigjob/python/lib/python2.7/site-packages/BigJob-0.4.33-py2.7.egg/bigjob']
> 02/06/2012 03:11:31 PM - bigjob - DEBUG - Utilizing Redis Backend
> 02/06/2012 03:11:31 PM - bigjob - DEBUG - Parsing URL: redis://cyder.cct.lsu.edu:2525
> 02/06/2012 03:11:31 PM - bigjob - DEBUG - redis:// cyder.cct.lsu.edu 2525
> 02/06/2012 03:11:31 PM - bigjob - DEBUG - Connect to Redis: cyder.cct.lsu.edu Port: 2525
> 02/06/2012 03:11:31 PM - bigjob - DEBUG - initialized BigJob: bigjob:bj-2426ab10-5107-11e1-857a-0060dd46c5ff
> 02/06/2012 03:11:31 PM - bigjob - DEBUG - create pilot job entry on backend server: bigjob:bj-2426ab10-5107-11e1-857a-0060dd46c5ff:india.futuregrid.org
> 02/06/2012 03:11:31 PM - bigjob - DEBUG - update state of pilot job to: Unknown
> 02/06/2012 03:11:31 PM - bigjob - DEBUG - set pilot state to: Unknown
> 02/06/2012 03:11:31 PM - bigjob - DEBUG - Create remote directory; scheme: ssh://, host: india.futuregrid.org, path: /N/u/ssarip1/agent/bj-2426ab10-5107-11e1-857a-0060dd46c5ff
> 02/06/2012 03:11:31 PM - bigjob - DEBUG - discovered user: ssarip1
> 02/06/2012 03:11:31 PM - paramiko.transport - DEBUG - starting thread (client mode): 0x9f0f0b50L
> 02/06/2012 03:11:31 PM - paramiko.transport - INFO - Connected (version 2.0, client OpenSSH_4.6)
> 02/06/2012 03:11:31 PM - paramiko.transport - DEBUG - kex algos:['diffie-hellman-group-exchange-sha256', 'diffie-hellman-group-exchange-sha1', 'diffie-hellman-group14-sha1', 'diffie-hellman-group1-sha1'] server key:['ssh-rsa', 'ssh-dss'] client encrypt:['aes128-cbc', '3des-cbc', 'blowfish-cbc', 'cast128-cbc', 'arcfour128', 'arcfour256', 'arcfour', 'aes192-cbc', 'aes256-cbc', 'rijndael-cbc@lysator.liu.se', 'aes128-ctr', 'aes192-ctr', 'aes256-ctr'] server encrypt:['aes128-cbc', '3des-cbc', 'blowfish-cbc', 'cast128-cbc', 'arcfour128', 'arcfour256', 'arcfour', 'aes192-cbc', 'aes256-cbc', 'rijndael-cbc@lysator.liu.se', 'aes128-ctr', 'aes192-ctr', 'aes256-ctr'] client mac:['hmac-md5', 'hmac-sha1', 'hmac-ripemd160', 'hmac-ripemd160@openssh.com', 'hmac-sha1-96', 'hmac-md5-96'] server mac:['hmac-md5', 'hmac-sha1', 'hmac-ripemd160', 'hmac-ripemd160@openssh.com', 'hmac-sha1-96', 'hmac-md5-96'] client compress:['none', 'zlib@openssh.com'] server compress:['none', 'zlib@openssh.com'] client lang:[''] server lang:[''] kex follows?False
> 02/06/2012 03:11:31 PM - paramiko.transport - DEBUG - Ciphers agreed: local=aes128-ctr, remote=aes128-ctr
> 02/06/2012 03:11:31 PM - paramiko.transport - DEBUG - using kex diffie-hellman-group1-sha1; server key type ssh-rsa; cipher: local aes128-ctr, remote aes128-ctr; mac: local hmac-sha1, remote hmac-sha1; compression: local none, remote none
> 02/06/2012 03:11:31 PM - paramiko.transport - DEBUG - Switch to new keys ...
> 02/06/2012 03:11:31 PM - paramiko.transport - DEBUG - Trying discovered key b5c71a8a8f0422d09f2b765f546f07af in /home/ssaripal/.ssh/id_rsa
> 02/06/2012 03:11:31 PM - paramiko.transport - DEBUG - userauth is OK
> 02/06/2012 03:11:31 PM - paramiko.transport - INFO - Authentication (publickey) successful!
> 02/06/2012 03:11:31 PM - paramiko.transport - DEBUG - [chan 1] Max packet in: 34816 bytes
> 02/06/2012 03:11:31 PM - paramiko.transport - DEBUG - [chan 1] Max packet out: 32768 bytes
> 02/06/2012 03:11:31 PM - paramiko.transport - INFO - Secsh channel 1 opened.
> 02/06/2012 03:11:31 PM - paramiko.transport - DEBUG - [chan 1] Sesch channel 1 request ok
> 02/06/2012 03:11:31 PM - paramiko.transport.sftp - INFO - [chan 1] Opened sftp connection (server version 3)
> 02/06/2012 03:11:31 PM - paramiko.transport.sftp - DEBUG - [chan 1] mkdir('/N/u/ssarip1/agent/bj-2426ab10-5107-11e1-857a-0060dd46c5ff', 511)
> 02/06/2012 03:11:31 PM - paramiko.transport.sftp - INFO - [chan 1] sftp session closed.
> 02/06/2012 03:11:31 PM - paramiko.transport - DEBUG - [chan 1] EOF sent (1)
> 02/06/2012 03:11:31 PM - paramiko.transport - DEBUG - EOF in transport thread
> 02/06/2012 03:11:31 PM - bigjob - DEBUG - Stage: None to ssh://ssarip1@india.futuregrid.org/N/u/ssarip1/agent/bj-2426ab10-5107-11e1-857a-0060dd46c5ff
> 02/06/2012 03:11:31 PM - bigjob - DEBUG - BJ Working Directory: /N/u/ssarip1/agent/bj-2426ab10-5107-11e1-857a-0060dd46c5ff
> 02/06/2012 03:11:31 PM - bigjob - DEBUG - Adaptor specific modifications: pbs-ssh
> 02/06/2012 03:11:31 PM - bigjob - DEBUG - Escape SSH
> 02/06/2012 03:11:31 PM - bigjob - DEBUG - "import sys
> import os
> import urllib
> import sys
> import time
> import textwrap
> 
> qsub_file_name=\"bigjob_pbs_ssh\"
> 
> qsub_file = open(qsub_file_name, \"w\")
> qsub_file.write(\"#PBS -l nodes=1:ppn=8\")
> qsub_file.write(\"\n\")
> qsub_file.write(\"#PBS -l walltime=0:10:00\")
> qsub_file.write(\"\n\")
> qsub_file.write(\"#PBS -o /N/u/ssarip1/agent/bj-2426ab10-5107-11e1-857a-0060dd46c5ff/stdout-bigjob_agent.txt\")
> qsub_file.write(\"\n\")
> qsub_file.write(\"#PBS -e /N/u/ssarip1/agent/bj-2426ab10-5107-11e1-857a-0060dd46c5ff/stderr-bigjob_agent.txt\")
> qsub_file.write(\"\n\")
> qsub_file.write(\"cd /N/u/ssarip1/agent/bj-2426ab10-5107-11e1-857a-0060dd46c5ff\")
> qsub_file.write(\"\n\")
> qsub_file.write(\"python -c \\\"\" + textwrap.dedent(\"\"\"import sys
> import os
> import urllib
> import sys
> import time
> start_time = time.time()
> home = os.environ.get(\\\\\"HOME\\\\\")
> BIGJOB_AGENT_DIR= os.path.join(home, \\\\\".bigjob\\\\\")
> if not os.path.exists(BIGJOB_AGENT_DIR): os.mkdir (BIGJOB_AGENT_DIR)
> BIGJOB_PYTHON_DIR=BIGJOB_AGENT_DIR+\\\\\"/python/\\\\\"
> BOOTSTRAP_URL=\\\\\"https://raw.github.com/drelu/BigJob/master/bootstrap/bigjob-bootstrap.py\\\\\"
> BOOTSTRAP_FILE=BIGJOB_AGENT_DIR+\\\\\"/bigjob-bootstrap.py\\\\\"
> #ensure that BJ in .bigjob is upfront in sys.path
> sys.path.insert(0, os.getcwd() + \\\\\"/../\\\\\")
> sys.path.insert(0, os.getcwd() + \\\\\"/../../\\\\\")
> p = list()
> for i in sys.path:
>     if i.find(\\\\\".bigjob/python\\\\\")>1:
>           p.insert(0, i)
> for i in p: sys.path.insert(0, i)
> print str(sys.path)
> try: import saga
> except: print \\\\\"SAGA and SAGA Python Bindings not found: BigJob only work w/ non-SAGA backends e.g. Redis, ZMQ.\\\\\";print \\\\\"Python version: \\\\\",  os.system(\\\\\"python -V\\\\\");print \\\\\"Python path: \\\\\" + str(sys.path)
> try: import bigjob.bigjob_agent
> except: print \\\\\"BigJob not installed. Attempting to install it.\\\\\"; opener = urllib.FancyURLopener({}); opener.retrieve(BOOTSTRAP_URL, BOOTSTRAP_FILE); os.system(\\\\\"python \\\\\" + BOOTSTRAP_FILE + \\\\\" \\\\\" + BIGJOB_PYTHON_DIR); activate_this = BIGJOB_PYTHON_DIR+\\\\\"bin/activate_this.py\\\\\"; execfile(activate_this, dict(__file__=activate_this))
> #try to import BJ once again
> import bigjob.bigjob_agent
> # execute bj agent
> args = list()
> args.append(\\\\\"bigjob_agent.py\\\\\")
> args.append(\\\\\"redis://cyder.cct.lsu.edu:2525\\\\\")
> args.append(\\\\\"bigjob:bj-2426ab10-5107-11e1-857a-0060dd46c5ff:india.futuregrid.org\\\\\")
> print \\\\\"Bootstrap time: \\\\\" + str(time.time()-start_time)
> print \\\\\"Starting BigJob Agents with following args: \\\\\" + str(args)
> bigjob_agent = bigjob.bigjob_agent.bigjob_agent(args)
> \"\"\") + \"\\\"\")
> qsub_file.close()
> os.system( \"qsub  \" + qsub_file_name)
> "
> use standard proxy
> 
> 
> 
> Thank you,
> Sai Saripalli
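
For reference, the resource lines in the generated qsub script quoted above ("#PBS -l nodes=1:ppn=8" and "#PBS -l walltime=0:10:00") follow from the script parameters number_nodes=1, processes_per_node=8 and walltime=10. A minimal sketch of that mapping, assuming walltime is given in minutes (which matches the log); pbs_resource_lines is a hypothetical helper for illustration, not BigJob's own code:

```python
def pbs_resource_lines(number_nodes, processes_per_node, walltime_minutes):
    """Build the two '#PBS -l' resource lines seen in the quoted log.

    Hypothetical helper: the name and signature are assumptions made
    for illustration and are not taken from the BigJob source.
    """
    # Assumption: walltime is in minutes, so 10 becomes 0:10:00.
    hours, minutes = divmod(int(walltime_minutes), 60)
    return [
        "#PBS -l nodes=%d:ppn=%d" % (number_nodes, processes_per_node),
        "#PBS -l walltime=%d:%02d:00" % (hours, minutes),
    ]


# Parameters copied from the quoted script.
lines = pbs_resource_lines(number_nodes=1,
                           processes_per_node=8,
                           walltime_minutes=10)
for line in lines:
    print(line)
```

Comparing these lines with the bigjob_pbs_ssh file written on the remote machine may be a quick way to check that the submission parameters arrived as intended.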