Workflow Run Manager - deleted runs sometimes reappear after Kepler relaunch
- Launch Kepler
- Run a workflow twice.
- Export the 2nd run.
- Run the workflow again.
- Delete the third run from the WRM.
- Close, then relaunch Kepler. Run #3 is back.
The problem is related to exporting a run.
#1 Updated by Derik Barseghian almost 11 years ago
Also: if, instead of deleting the third run, you delete the second run (the one you exported) and then close and relaunch Kepler, you get a FileNotFoundException from CacheManager's getCacheObjectIterator. This is because the file is still in cacheContentTable. Watching the DB after a delete operation shows that the row never actually gets deleted.
#4 Updated by Derik Barseghian almost 11 years ago
One way to sometimes reproduce this seems to be quitting Kepler immediately after clicking OK in the delete runs dialog. You needn't have exported any of the runs. The preparedStatements return how many rows they've "deleted", and the delete method always seems to complete before Kepler shuts down, yet when you connect to the database after Kepler has shut down (connecting to the db file instead of the server), the run rows still exist in workflow_exec.
#5 Updated by Daniel Crawl almost 11 years ago
HSQL delays writing to the file system after updates occur to the database. According to the docs, http://hsqldb.org/doc/guide/ch09.html#set_write_delay-section, the default delay is 20 seconds, but my hsqldb.script says it's 10 seconds. We should probably decrease this.
#6 Updated by Derik Barseghian almost 11 years ago
Dan and I discussed this, and the write delay seems like a likely culprit. I'm going to try changing the write delay to see whether it fixes this bug, and whether (and how badly) it hurts performance. We figure a better solution is probably to do a clean shutdown of the server when Kepler quits, if it's the last Kepler instance running. To know whether it's the last Kepler running, we might e.g. store a numberKeplersRunning variable in the database and check it on shutdown.
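A rough sketch of the "last instance shuts the server down" idea, to make the bookkeeping concrete. This is not Kepler code. The class and method names are hypothetical, and an in-memory AtomicInteger stands in for the numberKeplersRunning value that would really live in the database. The shutdown action is a placeholder for issuing HSQLDB's SHUTDOWN statement.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: counts running instances and triggers a clean
// database shutdown when the last one quits.
final class LastInstanceShutdownSketch {
    // Stand-in for a numberKeplersRunning value stored in the database (assumption).
    private final AtomicInteger numberKeplersRunning;
    // Placeholder for the real action, e.g. executing "SHUTDOWN" against the server.
    private final Runnable shutdownServer;

    LastInstanceShutdownSketch(AtomicInteger numberKeplersRunning, Runnable shutdownServer) {
        this.numberKeplersRunning = numberKeplersRunning;
        this.shutdownServer = shutdownServer;
    }

    // Call once when an instance starts.
    void onStartup() {
        numberKeplersRunning.incrementAndGet();
    }

    // Call once when an instance quits. Returns true if this was the last
    // running instance and the shutdown action was invoked.
    boolean onQuit() {
        int remaining = numberKeplersRunning.decrementAndGet();
        if (remaining == 0) {
            shutdownServer.run();
            return true;
        }
        return false;
    }
}
```

One caveat with any counter like this: an instance that crashes never decrements it, so the stored value can go stale and would need some way to be reset.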
#7 Updated by Derik Barseghian almost 11 years ago
Decreased write_delay to 100ms in r20931. Needs more testing to check for any performance hit, but so far so good. I can't get runs to reappear no matter how fast I quit after doing a delete (good deal). If this does impact performance, I suspect we'd still want to make write_delay much shorter than what it was, 10s, so that the db loses less if Kepler crashes.
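For reference, a minimal sketch of how a 100ms write delay could be applied over JDBC. It assumes the HSQLDB 1.8-style SET WRITE_DELAY statement described in the docs section linked above (newer HSQLDB versions spell this SET FILES WRITE DELAY). The class and method names are hypothetical, and this is not necessarily how r20931 made the change.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper, not taken from the Kepler source.
final class WriteDelaySketch {
    // Builds the HSQLDB 1.8-style statement for a write delay in milliseconds.
    static String writeDelaySql(int millis) {
        if (millis < 0) {
            throw new IllegalArgumentException("write delay must be non-negative");
        }
        return "SET WRITE_DELAY " + millis + " MILLIS";
    }

    // Applies the delay on an open connection to the HSQLDB database.
    static void applyWriteDelay(Connection conn, int millis) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.execute(writeDelaySql(millis));
        }
    }
}
```

Calling WriteDelaySketch.applyWriteDelay(conn, 100) on a live connection would request the 100ms delay mentioned above.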