[Bigjob-users] (Many) Problems Running BigJob via GRAM

Ole Weidner oweidner at cct.lsu.edu
Sun Jan 22 22:00:49 CST 2012


Hi all,

Mea culpa. I just realized that the pbs-ssh mode still needs the SAGA ssh adaptor to work. I had assumed it would use ssh and/or paramiko directly to avoid the SAGA dependency.

Anyway, after installing the ssh adaptors, the error message changes slightly:

qsub_file.close()
os.system( \"qsub  \" + qsub_file_name)
"
use standard proxy
Submit pilot job to: ssh://qb1.loni.org/
01/22/2012 10:53:53 PM - bigjob - DEBUG - PBS JobID: 
Traceback (most recent call last):
  File "interop.py", line 99, in <module>
    main()
  File "interop.py", line 65, in main
    processes_per_node)
  File "/home/oweidner/software/bigjob_pypi/lib/python2.7/site-packages/BigJob-0.4.33-py2.7.egg/bigjob/bigjob_manager.py", line 249, in start_pilot_job
    self.job.run()
  File "/home/oweidner/software/bigjob_pypi/lib/python2.7/site-packages/BigJob-0.4.33-py2.7.egg/bigjob/pbsssh.py", line 102, in run
    raise Exception("BigJob submission via pbs-ssh:// failed")
Exception: BigJob submission via pbs-ssh:// failed

Now that's one informative error message. On the remote machine there's an agent directory with a file called 'bigjob_pbs_ssh' in it. Nothing else. No logs. 
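
For what it's worth, here is a rough sketch of how a submission step could surface the real failure instead of a bare exception. It assumes the generated script shells out to qsub via os.system, as the echoed script fragment above suggests; I have not checked what pbsssh.py actually does. The helper name submit_and_get_job_id is made up for illustration and is not part of BigJob. It is written for Python 3; on 2.7, subprocess.Popen with communicate() would play the same role.

```python
import subprocess


def submit_and_get_job_id(command):
    """Run a submission command and return the job ID it prints.

    `command` is an argument list such as ["qsub", "job.pbs"] (illustrative).
    Raises RuntimeError carrying the exit code, stdout and stderr when the
    command cannot be started, fails, or prints no job ID.
    """
    try:
        proc = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except OSError as exc:
        # For example, qsub is not on the PATH of the submitting shell.
        raise RuntimeError("could not start %r: %s" % (command, exc))

    job_id = proc.stdout.strip()
    if proc.returncode != 0 or not job_id:
        # An empty job ID is treated as a failure, with the evidence attached.
        raise RuntimeError(
            "submission failed (exit code %d): stdout=%r stderr=%r"
            % (proc.returncode, proc.stdout, proc.stderr)
        )
    return job_id
```

With something along these lines, the empty "PBS JobID:" in the debug output would come with qsub's own stderr attached, which is usually enough to tell a missing qsub from a rejected job script.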

Ideas?

Cheers,
Ole

On Jan 22, 2012, at 9:40 PM, Ole Weidner wrote:

> When I use pbs-ssh://qb1.loni.org (which should trigger BigJob's internal 'pbs-ssh' mechanism?), I get the following error message:
> 
> 01/22/2012 10:37:11 PM - bigjob - DEBUG - BJ Working Directory: /home/oweidner/agent/bj-880fb486-4573-11e1-bf9c-bc305b7ee8dc
> 01/22/2012 10:37:11 PM - bigjob - DEBUG - Adaptor specific modifications: pbs-ssh
> 01/22/2012 10:37:11 PM - bigjob - DEBUG - Escape SSH
> ...
> qsub_file.close()
> os.system( \"qsub  \" + qsub_file_name)
> "
> use standard proxy
> Traceback (most recent call last):
> ...
>  SAGA(BadParameter): condor_job: condor_job_adaptor.hpp(126): Adaptor supports 'condor' and 'condorg' URL schemes, 'ssh' is not supported.
>  SAGA(NoSuccess): default_job: posix_job_service.cpp(70): Could not initialize job service for [ssh://qb1.loni.org/]. Only 'localhost' and engage-submit3.renci.org are supported.
>  SAGA(NoSuccess): globus_gram_job: globus_gram_job_adaptor_service.cpp(46): Could not initialize job service for ssh://qb1.loni.org/. Only gram:// schemes are supported.
>  SAGA(NoSuccess): proxy.cpp(261): No adaptor succeeded in executing constructor for job_service_cpi
> 
> It seems that BigJob still tries to use SAGA even though I specified pbs-ssh?!? 
> 
> Cheers
> Ole
> 
> On Jan 22, 2012, at 9:32 PM, Ole Weidner wrote:
> 
>> When I use gram://qb1.loni.org/jobmanager-pbs instead of fork, I get the following error in the logs:
>> 
>> -bash-3.00$ cat stderr-bigjob_agent.txt
>> File "<string>", line 8
>>   BIGJOB_AGENT_DIR= os.path.join(home, .bigjob)
>>                                        ^
>> SyntaxError: invalid syntax
>> 
>> It seems that some 'escaping' mechanism in BigJob removed the quotation marks ?!?
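
A guess at the mechanism, since I have not traced it in BigJob: if the agent bootstrap code is handed to a POSIX shell wrapped in double quotes, the shell consumes the inner double quotes before Python ever sees them. The sketch below reproduces that with a one-line stand-in for the bootstrap script and shows shlex.quote (pipes.quote on Python 2.7) keeping the string literals intact. SNIPPET, naive_command, quoted_command and run are names invented for this illustration.

```python
import shlex
import subprocess
import sys

# Stand-in for one line of a bootstrap script; not BigJob's actual code.
SNIPPET = 'import os.path; print(os.path.join("home", ".bigjob"))'


def naive_command(code):
    # Wrapping the code in double quotes lets the shell strip the inner ones.
    return '%s -c "%s"' % (shlex.quote(sys.executable), code)


def quoted_command(code):
    # shlex.quote escapes the whole snippet so it reaches Python unchanged.
    return "%s -c %s" % (shlex.quote(sys.executable), shlex.quote(code))


def run(command):
    """Run a command line through the shell; return (exit code, stdout, stderr)."""
    proc = subprocess.run(
        command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout.strip(), proc.stderr
```

Running run(naive_command(SNIPPET)) makes Python receive os.path.join(home, .bigjob) and fail with a SyntaxError, much like the one in stderr-bigjob_agent.txt, while run(quoted_command(SNIPPET)) executes the snippet as written. Whether GRAM's argument handling is the layer that eats the quotes here is an assumption on my part.
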
>> 
>> Cheers
>> Ole
>> 
>> 
>> On Jan 22, 2012, at 8:57 PM, Ole Weidner wrote:
>> 
>>> I'm using gram://qb1.loni.org/jobmanager-fork
>> 
>> _______________________________________________
>> Bigjob-users mailing list
>> Bigjob-users at mail.cct.lsu.edu
>> https://mail.cct.lsu.edu/mailman/listinfo/bigjob-users
> 


