Bug #4325: Workflow Run Manager - deleted runs sometimes reappear after Kepler relaunch - Kepler - Ecoinformatics Redmine

Bug #4325

closed

Workflow Run Manager - deleted runs sometimes reappear after Kepler relaunch

Added by Derik Barseghian over 15 years ago. Updated about 15 years ago.

Status: Resolved
Priority: Normal
Assignee: Derik Barseghian
Category: workflow run manager
Target version: wrp-modules-2.0.0
Start date: 08/20/2009
Due date:
% Done:
Estimated time:
Bugzilla-Id: 4325

Description

To reproduce:
- Launch Kepler.
- Run a workflow twice.
- Export the 2nd run.
- Run the workflow again.
- Delete the third run from the Workflow Run Manager (WRM).
- Close, then relaunch Kepler. The third run is back.

The problem is related to exporting a run.

Updated by Derik Barseghian over 15 years ago

Also: if, instead of deleting the third run, you delete the second run (the one you exported) and then close and relaunch Kepler, you get a FileNotFoundException from CacheManager's getCacheObjectIterator. This is because the file is still listed in cacheContentTable. Watching the DB after a delete operation, I see that the DB row never actually gets deleted.

Updated by Derik Barseghian over 15 years ago

If you:
- Launch Kepler.
- Run a workflow twice.
- Export the 2nd run.
- Delete the 2nd run.
- Close and relaunch Kepler.

Result: the 2nd run is gone and there are no errors.

Updated by Derik Barseghian over 15 years ago

Now I'm unable to reproduce the error using the procedure in comment #2, even though I'd reproduced it a few times before. Maybe something more insidious is happening here...

Updated by Derik Barseghian over 15 years ago

One way to sometimes trigger this seems to be quitting Kepler immediately after clicking OK in the delete runs dialog. You needn't have exported any of the runs. The prepared statements return the number of rows they've "deleted", and the delete method always seems to complete before Kepler shuts down, yet when you connect to the database after Kepler has shut down (connecting to the db file instead of the server), the run rows still exist in workflow_exec.

Updated by Daniel Crawl over 15 years ago

HSQLDB delays writing to the file system after updates to the database occur. According to the docs (http://hsqldb.org/doc/guide/ch09.html#set_write_delay-section), the default delay is 20 seconds, but my hsqldb.script says it's 10 seconds. We should probably decrease this.

Updated by Derik Barseghian over 15 years ago

Dan and I discussed this, and it seems like a likely culprit. I'm going to try changing the write delay to see if it fixes this bug, and whether (and how badly) it hurts performance. We figure a better solution is probably to do a clean shutdown of the server when Kepler quits, if it's the last Kepler instance running. To know whether it's the last Kepler running, we might, e.g., store a numberKeplersRunning variable in the database and check it on shutdown.
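
The "last instance does a clean shutdown" idea could be sketched roughly as below. This is only an illustration under assumptions, not Kepler code: the class name, its methods, and the `sqlExecutor` hook are hypothetical, and the counter is kept in memory as a stand-in for a numberKeplersRunning value stored in the database. The only HSQLDB-specific piece is the `SHUTDOWN` SQL command, which closes the database cleanly.

```java
import java.util.function.Consumer;

/** Hypothetical sketch: shut the DB server down only when the last instance quits. */
public class LastInstanceShutdown {
    // In-memory stand-in for the numberKeplersRunning value that would live in the database.
    private int numberKeplersRunning = 0;

    // Hypothetical hook that runs one SQL statement against the database server.
    private final Consumer<String> sqlExecutor;

    public LastInstanceShutdown(Consumer<String> sqlExecutor) {
        this.sqlExecutor = sqlExecutor;
    }

    /** Call when a Kepler instance starts. */
    public synchronized void instanceStarted() {
        numberKeplersRunning++;
    }

    /**
     * Call when a Kepler instance quits.
     * Returns true if this was the last instance and SHUTDOWN was issued.
     */
    public synchronized boolean instanceStopped() {
        if (numberKeplersRunning > 0) {
            numberKeplersRunning--;
        }
        if (numberKeplersRunning == 0) {
            // HSQLDB's SHUTDOWN command writes out pending changes and closes the database.
            sqlExecutor.accept("SHUTDOWN");
            return true;
        }
        return false;
    }
}
```

A real version would need to read and update the counter in the database within a transaction, and would have to cope with instances that crash without decrementing it.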

Updated by Derik Barseghian over 15 years ago

Decreased write_delay to 100ms in r20931. Needs more testing to check for any performance hit, but so far so good. I can't get runs to reappear no matter how fast I quit after doing a delete (good deal). If this does impact performance, I suspect we'd still want to make write_delay much shorter than what it was (10s), so that the db loses less if Kepler crashes.
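
For reference, a write delay like this can also be set at runtime over JDBC. The sketch below is not the change made in r20931 (how that commit set the value isn't shown here); it assumes the HSQLDB 1.8 statement form `SET WRITE_DELAY <n> MILLIS`, and the class and method names are hypothetical. Newer HSQLDB 2.x releases use `SET FILES WRITE DELAY` instead.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical helper for adjusting HSQLDB's write delay over JDBC. */
public class WriteDelayExample {

    /** Builds the HSQLDB 1.8-style statement for a write delay given in milliseconds. */
    static String writeDelaySql(int millis) {
        if (millis < 0) {
            throw new IllegalArgumentException("millis must be >= 0");
        }
        return "SET WRITE_DELAY " + millis + " MILLIS";
    }

    /** Applies the write delay on an already-open connection to the database. */
    static void applyWriteDelay(Connection conn, int millis) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.execute(writeDelaySql(millis));
        }
    }
}
```

With the 100ms value mentioned above, `writeDelaySql(100)` produces the statement `SET WRITE_DELAY 100 MILLIS`.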
