[Bigjob-users] Cannot rollback transaction (the error causing the rollback was: Cannot begin transaction.)

Ole Weidner oweidner at cct.lsu.edu
Wed Dec 28 20:25:52 CST 2011


Hi Andre,

I just ran a few long-running experiments with BigJob/Redis, and I can now 'keep alive' a BigJob (even a temporarily 'dormant' one) for what looks like an indefinite amount of time (I ran one for 24+ hours).
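
For illustration, a minimal sketch of one way an otherwise idle backend connection could be kept in use: call a cheap operation on it at a fixed interval from a background thread. This is not BigJob code, and nothing in this thread says BigJob works this way. The `KeepAlive` class and its `ping` and `interval` parameters are hypothetical names chosen for the example.

```python
import threading
import time


class KeepAlive:
    """Hypothetical helper: calls `ping` every `interval` seconds until stopped."""

    def __init__(self, ping, interval=60.0):
        self._ping = ping              # any cheap callable that touches the backend
        self._interval = interval      # seconds between two pings
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self.failures = 0              # number of pings that raised an exception

    def _run(self):
        # Event.wait() returns False on timeout and True once stop() was called,
        # so the loop pings once per interval and exits promptly on stop().
        while not self._stop.wait(self._interval):
            try:
                self._ping()
            except Exception:
                # A failed ping must not kill the thread; count it instead.
                self.failures += 1

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()


if __name__ == "__main__":
    # Stand-in for a real backend call, so the sketch runs without a server.
    keeper = KeepAlive(lambda: print("ping"), interval=0.5)
    keeper.start()
    time.sleep(2)
    keeper.stop()
```

With the redis-py client, `ping` could for example be the `ping` method of a `redis.Redis` instance; any inexpensive status query would serve the same purpose. Whether this would also avoid the advert/PostgreSQL failure quoted below is untested.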

Thanks
Ole

On Dec 17, 2011, at 3:03 AM, Andre Luckow wrote:

> Hi Ole,
> this is not the most common use case, given that BJ was designed as an
> application-level pilot and not a service. Given the nature of the
> failure, I really think it is caused somewhere deep down in the advert
> adaptor.
> 
> But I would suggest that you give the Redis backend a try.
> 
> Best,
> Andre
> 
> On Sat, Dec 17, 2011 at 1:20 AM, Ole Weidner <oweidner at cct.lsu.edu> wrote:
>> Hi Andre,
>> 
>> I'm not sure, but I think this has to do with long-running BigJob instances. In the context of my development work with BigJob, I have a BigJob / pilot job running for hours and hours, not necessarily processing work units.
>> 
>> If I have an active BigJob instance with a 'fork://localhost' pilot, wait, say, an hour, and then query the status of the BigJob, I get either the error below or the whole thing just hangs indefinitely. Has that been observed before?
>> 
>> I suspect that it might have something to do with advert connections that are kept open on the client side but are perhaps closed after some time on the PostgreSQL side.
>> 
>> Cheers
>> Ole
>> 
>> On Dec 16, 2011, at 4:14 PM, Andre Luckow wrote:
>> 
>>> Hi Ole,
>>> I have never seen this error before. I suspect that this is an issue in the advert adaptor/service. Is the directory created? Can the run be completed?
>>> 
>>> Best,
>>> Andre
>>> 
>>> On Dec 16, 2011, at 22:25, Ole Weidner <oweidner at cct.lsu.edu> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> from time to time I get the following error with BigJob:
>>>> 
>>>> 12/16/2011 03:21:35 PM - DEBUG - create advert entry: advert://SAGA:SAGA_client@advert.cct.lsu.edu:8080//bigjob/ef20fda2-282b-11e1-af5c-0016cb924ce9/localhost?
>>>> 12/16/2011 03:21:35 PM - ERROR - SAGA(NoSuccess): default_advert: advert::advert_cpi_impl::create_directory: unexpected error during transaction rollback: Cannot rollback transaction. (the error causing the rollback was: Cannot begin transaction.)
>>>> 
>>>> Any insights on that?
>>>> 
>>>> Cheers
>>>> Ole