[Bigjob-users] Cannot rollback transaction (the error causing the rollback was: Cannot begin transaction.)
Ole Weidner
oweidner at cct.lsu.edu
Wed Dec 28 20:25:52 CST 2011
Hi Andre,
I just ran a few long-running experiments with BigJob/Redis, and I can now 'keep alive' a BigJob (even a temporarily 'dormant' one) apparently indefinitely (the longest run so far was 24+ hours).
Thanks
Ole
On Dec 17, 2011, at 3:03 AM, Andre Luckow wrote:
> Hi Ole,
> this is not the most common use case, given that BJ was designed as
> an application-level pilot and not a service. Given the nature of the
> failure, I really think it is caused somewhere deep down in the advert
> adaptor.
>
> But I would suggest that you give the Redis backend a try.
>
> Best,
> Andre
>
> On Sat, Dec 17, 2011 at 1:20 AM, Ole Weidner <oweidner at cct.lsu.edu> wrote:
>> Hi Andre,
>>
>> I'm not sure, but I think this has to do with long-running BigJob instances. In the context of my development work with BigJob, I have a BigJob / pilot job running for hours and hours without necessarily processing work units.
>>
>> If I have an active BigJob instance with a 'fork://localhost' pilot, wait for, let's say, an hour, and then query the status of the BigJob, I either get the error below or the whole thing just hangs indefinitely. Has that been observed before?
>>
>> I suspect that it has something to do with advert connections that are kept open on the client side but are maybe closed after some time on the PostgreSQL side.
>>
>> Cheers
>> Ole
>>
>> On Dec 16, 2011, at 4:14 PM, Andre Luckow wrote:
>>
>>> Hi Ole,
>>> I have never seen this error before. I suspect that this is an issue in the advert adaptor/service. Is the directory created? Can the run be completed?
>>>
>>> Best,
>>> Andre
>>>
>>> On Dec 16, 2011, at 22:25, Ole Weidner <oweidner at cct.lsu.edu> wrote:
>>>
>>>> Hi,
>>>>
>>>> from time to time I get the following error with BigJob:
>>>>
>>>> 12/16/2011 03:21:35 PM - DEBUG - create advert entry: advert://SAGA:SAGA_client@advert.cct.lsu.edu:8080//bigjob/ef20fda2-282b-11e1-af5c-0016cb924ce9/localhost?
>>>> 12/16/2011 03:21:35 PM - ERROR - SAGA(NoSuccess): default_advert: advert::advert_cpi_impl::create_directory: unexpected error during transaction rollback: Cannot rollback transaction. (the error causing the rollback was: Cannot begin transaction.)
>>>>
>>>> Any insights on that?
>>>>
>>>> Cheers
>>>> Ole
>>
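
The idle-connection hypothesis raised in this thread (a connection held open by the client but dropped by the server after some time) can be illustrated with a minimal sketch. Everything below is hypothetical: FakeConnection, StaleConnectionError and ReconnectingClient are stand-ins invented for illustration and are not part of BigJob, SAGA, or the advert adaptor. The sketch only shows the general pattern of reconnecting once when a stale connection is detected, instead of failing or hanging.

```python
import time


class StaleConnectionError(Exception):
    """Raised by the hypothetical backend when the server has dropped the link."""


class FakeConnection:
    """Hypothetical stand-in for a backend connection.

    It simulates a server that closes connections left idle for longer
    than idle_timeout seconds.
    """

    def __init__(self, idle_timeout, clock=time.monotonic):
        self.idle_timeout = idle_timeout
        self.clock = clock
        self.last_used = clock()

    def query(self, key):
        now = self.clock()
        if now - self.last_used > self.idle_timeout:
            raise StaleConnectionError("connection closed by server")
        self.last_used = now
        return "state-of-" + key


class ReconnectingClient:
    """Wraps a connection factory and reconnects once on a stale connection."""

    def __init__(self, connect):
        self._connect = connect      # zero-argument factory returning a connection
        self._conn = connect()
        self.reconnects = 0

    def query(self, key):
        try:
            return self._conn.query(key)
        except StaleConnectionError:
            # Assumption: the failure comes from an idle timeout, so a fresh
            # connection is expected to succeed. Retry exactly once.
            self._conn = self._connect()
            self.reconnects += 1
            return self._conn.query(key)
```

A caller would pass its own connection factory, for example ReconnectingClient(lambda: FakeConnection(idle_timeout=60)). Retrying only once is a deliberate choice in this sketch: if the fresh connection also fails, the error propagates rather than looping.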