[Saga-devel] Re: your files
Andre Merzky
andre at merzky.net
Sat Jan 24 18:52:50 CST 2009
Quoting [Shantenu Jha] (Jan 23 2009):
>
> Chris, Michael,
>
> Can you also enlighten me, what you meant by, ".... reliance on the advert
> for coordination"? Why do you think the catastrophic(?) loss of workers
> is due to the Advert Service?
To clarify for the list: the mapreduce worker loss I
observed earlier was caused partially by the UUID collision
Hartmut discovered (and which is not yet solved, only worked
around), and partially by boost::process, which failed to
report job state correctly, which caused my adaptor to clean
up prematurely.
These issues should only affect the aws and ssh adaptors, no
other ones. Thus I am not sure if other worker loss causes
remain.
The main issue with the advert service is that it is slow
(IMHO). It is on my table to get the missing Postgres
modules installed on gumbo to get the AS enrolled there. Did
not find time to finish that, sorry... My initial guess
of a race condition in the AS for mapreduce master.workers
seems to be wrong, as far as I can tell.
Hope that helps,
Andre.
> - high load?
> - latency?
> - design flaw?
> i.e., I'd like to know the basis for the claim.
>
> When I was in Hungary I was told that it was trans-atlantic latency, which
> seemed plausible.
>
> Don't get me wrong guys: this is not blame naming or criticising; it's a very
> simple practical issue, e.g., the answer to the question determines, if
> for example, Kate should spend her time installing a second. And it's
> precisely such enhancements that come about when we use things in
> production, as opposed to prototype. It's natural and expected.
>
> Cheers,
> Shantenu
>
> >The freopen call is to capture the saga verbose output of the workers to a
> >file instead of missing it altogether.
> >-Chris
> >
> >On Jan 22, 2009 6:26 PM, "Shantenu Jha" <sjha at cct.lsu.edu> wrote:
> >
> >>Sorry, I have no idea what it means to install postgresql on Linux. Give
> >>me Windows box and I'll...
> >Ok, no problem.
> >
> >Andre: Let Chris/Michael/Kate/Ole/Archit know if you need assistance there.
> >
> >What freopen call?
> >>
> >I'll let Chris answer this. Chris?
> >
> >>>i'm just trying to get over the "identified" bottlenecks. i'm not
> >
> >>I'm not aware of this problem (it's perhaps the best just to CC me on
> >>those mails even if they a...
> >I agree.
> >
> >>Could you elaborate? Chris? Andre? Kate?
> >I think it's the last sentence in what I quoted from Chris, but that
> >is unfortunately devoid of details.
> >
> >Chris? Andre? Kate? :)
> >
> >>Sorry for my ignorance. It's all perhaps because of my lengthy stay in
> >>Germany, seems I just g...
> >Not a problem. Perfectly understandable. Just too much going on for everyone
> >to keep abreast of everything. I just guess we need to try harder and
> >probably use saga-devel as much as possible to keep the group updated as a
> >whole.
> >
> >cheers,
> >shantenu
> >
--
Nothing is ever easy.