[Bigjob-users] Fwd: BigJob Condor / [BigJob] Unable to control SPMDVariation during pilot_job submission (#1)

Andre Luckow aluckow at cct.lsu.edu
Wed Jan 4 02:36:38 CST 2012


Hi Ole, hi Sharath,
this is regarding the current Condor support in BJ.

My understanding is:

1) We use the URL scheme condorg://<hostname> to trigger the
Condor-specific BJ plugin.
2) We have a special version of the agent,
bigjob/bigjob_agent_condor.py (not sure whether we need this, since
most of the code is the same).
3) When bigjob.start_pilot_job is called, we submit a Condor-G job via
SAGA/Condor, which starts the BJ agent.
4) Sub-Jobs/Work-Units are spawned via the BJ agent; no Condor pool
is actually used.

Is this correct?
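
To make point 1 concrete, here is a minimal sketch of scheme-based
dispatch. It is not the actual BJ code: select_agent, AGENT_BY_SCHEME and
the default agent path are hypothetical names used only for illustration.

```python
from urllib.parse import urlparse

# Hypothetical mapping from URL scheme to agent script. Only the condorg
# scheme and bigjob_agent_condor.py are named in this thread.
AGENT_BY_SCHEME = {
    "condorg": "bigjob/bigjob_agent_condor.py",
}

# Assumed path of the normal agent, for illustration only.
DEFAULT_AGENT = "bigjob/bigjob_agent.py"


def select_agent(resource_url):
    """Return (agent_script, hostname) for the given resource URL."""
    parsed = urlparse(resource_url)
    # Fall back to the normal agent for any scheme without a special plugin.
    agent = AGENT_BY_SCHEME.get(parsed.scheme, DEFAULT_AGENT)
    return agent, parsed.hostname
```

If the two agents really share most of their code, a lookup like this
could also pick a small Condor-specific hook instead of a whole second
agent file.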

@Sharath:
Question 1: Did you check everything needed into the BJ SVN? I see
that the BJ agent is launched using different parameters than the
normal BJ agent:

jd.arguments = ["-a", self.coordination.get_address(), "-b", self.pilot_url]

Looking at bigjob_agent_condor.py, I don't see the place where
these parameters are processed.
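
For reference, this is roughly what I would expect to find on the agent
side. It is only a sketch with argparse, assuming "-a" carries the
coordination address and "-b" the pilot URL (as the jd.arguments line
suggests); parse_agent_args and the attribute names are hypothetical.

```python
import argparse


def parse_agent_args(argv):
    """Parse the two options passed to the agent via jd.arguments."""
    parser = argparse.ArgumentParser(description="BJ agent options (sketch)")
    # Assumption: -a is the address of the coordination service.
    parser.add_argument("-a", dest="coordination_address", required=True,
                        help="address of the coordination service")
    # Assumption: -b is the URL identifying the pilot.
    parser.add_argument("-b", dest="pilot_url", required=True,
                        help="URL of the pilot this agent belongs to")
    return parser.parse_args(argv)
```

The agent's entry point would then call parse_agent_args(sys.argv[1:])
and use the two values when connecting to the coordination service.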

Question 2: Do you have a basic example / documentation of how to use
this? What needs to be in the .ini file of the Condor adaptor, how
does the file staging work,...?

Thanks!

Andre



---------- Forwarded message ----------
From: Ole Weidner
<reply+i-2719914-9f9254f617a8ce4c25241ff10510c37e53c4ba22-222015 at reply.github.com>
Date: Wed, Jan 4, 2012 at 7:31 AM
Subject: [BigJob] Unable to control SPMDVariation during pilot_job
submission (#1)
To: Andre Luckow <andre.luckow at googlemail.com>


When I try to run BigJob via the Condor adaptor, I get the following error:

2012-01-04 01:24:49,684 - bigjob - DEBUG - Submit pilot job to:
condor://localhost/
2012-01-04 01:24:49,691 - bigjob.server - ERROR - Exception:
SAGA(BadParameter): condor_job: Problem launching condor job:
(std::exception caught: SAGA(NotImplemented): condor_job: Condor
adaptor does not support the 'SPMDVariation' attribute.

While this error comes from the condor adaptor (it doesn't support
SPMDVariation), I can't find a way to unset jd.spmd_variation. It
seems that it is set explicitly in bigjob/bigjob_manager.py:239.

Would it be possible to make this an option for the
bigjob.start_pilot_job() method, give it a default value of "None", and
not set it at all in that case?

Is SPMDVariation relevant at all during pilot_job
submission, or is it just set "for completeness"? If it is the latter,
we could remove it completely.
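
A minimal sketch of the optional-parameter idea, assuming a plain
stand-in object in place of the real SAGA job description.
JobDescription and build_pilot_description are hypothetical names; this
is not the code in bigjob_manager.py.

```python
class JobDescription:
    """Stand-in for a SAGA job description (not the real class)."""


def build_pilot_description(executable, arguments, spmd_variation=None):
    """Build a job description, setting spmd_variation only on request."""
    jd = JobDescription()
    jd.executable = executable
    jd.arguments = list(arguments)
    # Leave the attribute untouched when the caller passes None, so that
    # adaptors without SPMDVariation support never see it.
    if spmd_variation is not None:
        jd.spmd_variation = spmd_variation
    return jd
```

With a default of None, existing callers that rely on the attribute
would have to pass it explicitly, while a Condor submission could simply
omit it.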

---
Reply to this email directly or view it on GitHub:
https://github.com/drelu/BigJob/issues/1

