<h1><a href="https://projects.ecoinformatics.org/ecoinfo/">Kepler: Issues</a></h1>
<p>Ecoinformatics Redmine, updated 2014-02-28T21:11:23Z</p>
<h2><a href="https://projects.ecoinformatics.org/ecoinfo/issues/6434">Feature #6434 (New): add workflow execution time and other metadata to report designer</a></h2>
<p>2014-02-28T21:11:23Z, Daniel Crawl &lt;danielcrawl@gmail.com&gt;</p>
<p>It would be nice if the report designer had drag-and-drop items for metadata stored in provenance, such as the execution timestamp, who ran the workflow, etc.</p>
<h2><a href="https://projects.ecoinformatics.org/ecoinfo/issues/5641">Bug #5641 (New): once timezones are added to provenance, Workflow Run Manager must utilize them</a></h2>
<p>2012-07-18T20:14:45Z, Derik Barseghian &lt;barseghian@nceas.ucsb.edu&gt;</p>
<p>Once timezones are added to provenance (bug #5640), the Workflow Run Manager needs to utilize them. The WRM should continue to display runs in local time in its GUI.</p>
<h2><a href="https://projects.ecoinformatics.org/ecoinfo/issues/5640">Bug #5640 (New): associate timezones with all timestamps recorded in provenance tables</a></h2>
<p>2012-07-18T20:12:02Z, Derik Barseghian &lt;barseghian@nceas.ucsb.edu&gt;</p>
<p>Right now provenance records timestamps in local time without recording the timezone. This is lossy. E.g., one problem scenario: a user runs a workflow on their laptop in one timezone, moves timezones, and exports the run. The exported run now has the wrong timestamp recorded. Related is that the WRM assumes the local timezone and, during run export, is <strong>adding</strong> the local timezone to the recorded run (separate bug).</p>
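<p>One way to make recorded timestamps lossless is to store each one with its UTC offset in ISO-8601 form, so the original instant survives a timezone move. A minimal sketch using <code>java.time</code> (Java 8+); the <code>record</code> helper is hypothetical, not Kepler's actual provenance schema:</p>

```java
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class TimestampDemo {
    // Hypothetical helper: format an execution timestamp with its zone
    // offset attached, so no information is lost when the machine moves.
    static String record(ZonedDateTime when) {
        return when.format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
    }

    public static void main(String[] args) {
        ZonedDateTime run = ZonedDateTime.of(2012, 7, 18, 13, 14, 45, 0,
                ZoneId.of("America/Los_Angeles"));
        String stored = record(run);            // "2012-07-18T13:14:45-07:00"
        // Parsing it back anywhere in the world yields the same instant.
        OffsetDateTime parsed = OffsetDateTime.parse(stored);
        System.out.println(stored);
        System.out.println(parsed.toInstant()); // 2012-07-18T20:14:45Z
    }
}
```

<p>A plain local-time string like <code>2012-07-18 13:14:45</code> cannot be mapped back to an instant once the machine's timezone changes; the offset-qualified form can.</p>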
<p>Part of this bug is also to deal with a user's existing timestamps. While it's not safe to associate the local timezone with all their existing timestamps, it's the best guess we can make, short of giving the user a way to change them. The user should at least be made aware that this is what's going to happen during the provenance schema upgrade.</p>
<h2><a href="https://projects.ecoinformatics.org/ecoinfo/issues/5594">Bug #5594 (New): use input and output port icons in Items of Interest</a></h2>
<p>2012-04-28T23:29:58Z, Derik Barseghian &lt;barseghian@nceas.ucsb.edu&gt;</p>
<p>Currently actor ports have a "generic file"-ish icon in the Items of Interest panel. From a user's point of view this doesn't make sense, and you can't tell the difference between input and output ports, which would be very useful.</p>
<h2><a href="https://projects.ecoinformatics.org/ecoinfo/issues/5591">Bug #5591 (New): Workflow Run Manager error downloading run for which module dependencies not sat...</a></h2>
<p>2012-04-28T17:45:40Z, Derik Barseghian &lt;barseghian@nceas.ucsb.edu&gt;</p>
<p>The WRM lets you download run-KARs with module dependencies that your current suite doesn't satisfy, and an error is given during download. Download should be allowed (it's not, for the similar situation in the Components area), and the usual prompt about unsatisfied module dependencies should just be given during attempts to open (it is). Look into what's parsing MoML on download and whether that's really necessary.</p>
<h2><a href="https://projects.ecoinformatics.org/ecoinfo/issues/5429">Bug #5429 (New): improve default provenance store performance</a></h2>
<p>2011-06-24T20:13:24Z, Derik Barseghian &lt;barseghian@nceas.ucsb.edu&gt;</p>
<p>Currently there can be some big performance penalties when using Kepler with provenance turned on (by default using hsql). It would be great to improve these.</p>
<p>Unless noted, references to workflow execution times below refer to the REAP GDD workflow set to process 200 days of data:<br /><a class="external" href="https://code.ecoinformatics.org/code/reap/trunk/usecases/terrestrial/workflows/derivedMETProducts/growingDegreeDays.kar">https://code.ecoinformatics.org/code/reap/trunk/usecases/terrestrial/workflows/derivedMETProducts/growingDegreeDays.kar</a></p>
<p>I see/saw a few issues:</p>
<p>-1) At one point I mentioned that Kepler shutdown was taking a very long time. This isn't an issue anymore; shutdown seems near instant.</p>
<p>0) The pre-initialize stage of workflow execution can take a very long time (e.g. up to 15m) and grows longer with each subsequent execution when running with a large provenance store.<br />Dan has fixed this issue, I believe with r27746. Pre-init is now close to instant, or takes just a few seconds.</p>
<p>1) Execution of the workflow with provenance off takes a few seconds. With provenance on, it takes about 4 min to run the first time with an empty provenance store.</p>
<p>2) Subsequent executions of the same workflow take longer to run.<br />E.g., here are the execution times of 9 runs of the workflow on 2 different machines:<br />10.6 MacBook, 2.2 GHz Intel Core 2 Duo with 4 GB RAM:<br />4:01, 4:03, 3:57, 7:43, 8:07, 8:01, 8:33, 8:10, 8:33<br />Ubuntu 10.04, dual 3 GHz with 2 GB RAM:<br />4:03, 4:13, 4:32, 9:13, 12:32, 8:08, 9:54, 9:06, 11:53</p>
<p>3) Startup time can be very long when the prior Kepler invocation ran data/token-intensive workflows. I believe what's happening is that hsql is incorporating the changes in the log file into the .data file; I think something's happening with the .backup file too. The data file slowly grows very large (by a lot more than 200mb), then the log file drops to near 0, and the data file shrinks back down to a size larger than where it started. I think with the default log file max size of 200mb, startup can take on the order of 10-20m. I've tested with a variety of log file sizes. Making it dramatically smaller, e.g. 5mb, dramatically improves startup time, but comes at a huge workflow execution time penalty (~20m to run the wf), so this is an unacceptable fix. The execution penalty starts appearing when the log file max size is set smaller than about 100mb, and with a 100mb log file, startup is still very slow.</p>
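<p>The two knobs discussed here live in the provenance store's HSQLDB properties file. A hypothetical excerpt (property names as documented for HSQLDB 1.8; the values are just the ones experimented with above, not a recommendation):</p>

```properties
# Provenance store database .properties (HSQLDB 1.8), illustrative values.
# Max size of the .log file in MB before a checkpoint merges it into .data;
# smaller values speed up startup but slow down workflow execution.
hsqldb.log_size=200
# Memory cache exponent: up to 3*(2^value) rows of cached tables are held
# in memory. Default is 14 (3*16384 rows); 18 is the maximum.
hsqldb.cache_scale=18
```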
<p>One thing I've found that improves execution time performance is increasing the 'memory cache exponent' setting (hsqldb.cache_scale) from the default of 14 to the max of 18. This setting "Indicates the maximum number of rows of cached tables that are held in memory, calculated as 3*(2^value) (three multiplied by (two to the power value)). The default results in up to 3*16384 rows from all cached tables being held in memory at any time."<br />With a 200mb log file max size, and cache_scale=18, the first run of the workflow takes about 2:17.</p>
<h2><a href="https://projects.ecoinformatics.org/ecoinfo/issues/5425">Bug #5425 (New): error when running -nocache and no .kepler</a></h2>
<p>2011-06-20T17:41:49Z, Daniel Crawl &lt;danielcrawl@gmail.com&gt;</p>
<p>The following occurs when running from the command line, specifying -nocache, and there's no .kepler:</p>
<pre><code>[null] java.sql.SQLException: Table not found in statement [insert into cacheContentTable (name, lsid, date, file, type, classname) values ( ?, ?, ?, ?, ?, ? )]<br /> [null] at org.hsqldb.jdbc.Util.throwError(Unknown Source)<br /> [null] at org.hsqldb.jdbc.jdbcPreparedStatement.&lt;init&gt;(Unknown Source)<br /> [null] at org.hsqldb.jdbc.jdbcConnection.prepareStatement(Unknown Source)<br /> [null] at org.kepler.objectmanager.cache.CacheManager.&lt;init&gt;(CacheManager.java:113)<br /> [null] at org.kepler.objectmanager.cache.CacheManager.getInstance(CacheManager.java:141)<br /> [null] at org.kepler.objectmanager.ObjectManager.getObjectFromCache(ObjectManager.java:398)<br /> [null] at org.kepler.objectmanager.ObjectManager.getObjectRevision(ObjectManager.java:217)<br /> [null] at org.kepler.util.WorkflowRunUtil.putInObjectManager(WorkflowRunUtil.java:46)<br /> [null] at org.kepler.util.WorkflowRun.&lt;init&gt;(WorkflowRun.java:312)<br /> [null] at org.kepler.provenance.sql.SQLQueryV8.getWorkflowRunsForExecutionLSIDs(SQLQueryV8.java:667)<br /> [null] at org.kepler.workflowrunmanager.WorkflowRunManager.workflowRunManagerEventOccurred(WorkflowRunManager.java:1110)<br /> [null] at org.kepler.workflowrunmanager.WorkflowRunManager.fireWorkflowRunManagerEvent(WorkflowRunManager.java:1089)<br /> [null] at org.kepler.workflowrunmanager.WorkflowRunManager.fireWorkflowRunManagerEvent(WorkflowRunManager.java:1073)<br /> [null] at org.kepler.workflowrunmanager.WorkflowRunManagerRecording.executionStart(WorkflowRunManagerRecording.java:107)<br /> [null] at org.kepler.provenance.ProvenanceRecorder.preinitialize(ProvenanceRecorder.java:649)<br /> [null] at ptolemy.actor.CompositeActor.preinitialize(CompositeActor.java:1683)<br /> [null] at ptolemy.actor.Manager.preinitializeAndResolveTypes(Manager.java:928)<br /> [null] at ptolemy.actor.Manager.initialize(Manager.java:635)<br /> [null] at ptolemy.actor.Manager.execute(Manager.java:340)<br /> [null] at 
ptolemy.actor.Manager.run(Manager.java:1109)<br /> [null] at ptolemy.actor.Manager$PtolemyRunThread.run(Manager.java:1639)<br /> [null] java.sql.SQLException: Table not found in statement [select name, lsid, file from cacheContentTable]<br /> [null] at org.hsqldb.jdbc.Util.sqlException(Unknown Source)<br /> [null] at org.hsqldb.jdbc.jdbcStatement.fetchResult(Unknown Source)<br /> [null] at org.hsqldb.jdbc.jdbcStatement.executeQuery(Unknown Source)<br /> [null] at org.kepler.objectmanager.cache.CacheManager.getObject(CacheManager.java:508)<br /> [null] at org.kepler.objectmanager.ObjectManager.getObjectFromCache(ObjectManager.java:398)<br /> [null] at org.kepler.objectmanager.ObjectManager.getObjectRevision(ObjectManager.java:217)<br /> [null] at org.kepler.util.WorkflowRunUtil.putInObjectManager(WorkflowRunUtil.java:46)<br /> [null] at org.kepler.util.WorkflowRun.&lt;init&gt;(WorkflowRun.java:312)<br /> [null] at org.kepler.provenance.sql.SQLQueryV8.getWorkflowRunsForExecutionLSIDs(SQLQueryV8.java:667)<br /> [null] at org.kepler.workflowrunmanager.WorkflowRunManager.workflowRunManagerEventOccurred(WorkflowRunManager.java:1110)<br /> [null] at org.kepler.workflowrunmanager.WorkflowRunManager.fireWorkflowRunManagerEvent(WorkflowRunManager.java:1089)<br /> [null] at org.kepler.workflowrunmanager.WorkflowRunManager.fireWorkflowRunManagerEvent(WorkflowRunManager.java:1073)<br /> [null] at org.kepler.workflowrunmanager.WorkflowRunManagerRecording.executionStart(WorkflowRunManagerRecording.java:107)<br /> [null] at org.kepler.provenance.ProvenanceRecorder.preinitialize(ProvenanceRecorder.java:649)<br /> [null] at ptolemy.actor.CompositeActor.preinitialize(CompositeActor.java:1683)<br /> [null] at ptolemy.actor.Manager.preinitializeAndResolveTypes(Manager.java:928)<br /> [null] at ptolemy.actor.Manager.initialize(Manager.java:635)<br /> [null] at ptolemy.actor.Manager.execute(Manager.java:340)<br /> [null] at ptolemy.actor.Manager.run(Manager.java:1109)<br /> [null] at 
ptolemy.actor.Manager$PtolemyRunThread.run(Manager.java:1639)<br /> [null] org.kepler.objectmanager.cache.CacheException: SQL exception when getting object<br /> [null] at org.kepler.objectmanager.cache.CacheManager.getObject(CacheManager.java:533)<br /> [null] at org.kepler.objectmanager.ObjectManager.getObjectFromCache(ObjectManager.java:398)<br /> [null] at org.kepler.objectmanager.ObjectManager.getObjectRevision(ObjectManager.java:217)<br /> [null] at org.kepler.util.WorkflowRunUtil.putInObjectManager(WorkflowRunUtil.java:46)<br /> [null] at org.kepler.util.WorkflowRun.&lt;init&gt;(WorkflowRun.java:312)<br /> [null] at org.kepler.provenance.sql.SQLQueryV8.getWorkflowRunsForExecutionLSIDs(SQLQueryV8.java:667)<br /> [null] at org.kepler.workflowrunmanager.WorkflowRunManager.workflowRunManagerEventOccurred(WorkflowRunManager.java:1110)<br /> [null] at org.kepler.workflowrunmanager.WorkflowRunManager.fireWorkflowRunManagerEvent(WorkflowRunManager.java:1089)<br /> [null] at org.kepler.workflowrunmanager.WorkflowRunManager.fireWorkflowRunManagerEvent(WorkflowRunManager.java:1073)<br /> [null] at org.kepler.workflowrunmanager.WorkflowRunManagerRecording.executionStart(WorkflowRunManagerRecording.java:107)<br /> [null] at org.kepler.provenance.ProvenanceRecorder.preinitialize(ProvenanceRecorder.java:649)<br /> [null] at ptolemy.actor.CompositeActor.preinitialize(CompositeActor.java:1683)<br /> [null] at ptolemy.actor.Manager.preinitializeAndResolveTypes(Manager.java:928)<br /> [null] at ptolemy.actor.Manager.initialize(Manager.java:635)<br /> [null] at ptolemy.actor.Manager.execute(Manager.java:340)<br /> [null] at ptolemy.actor.Manager.run(Manager.java:1109)<br /> [null] at ptolemy.actor.Manager$PtolemyRunThread.run(Manager.java:1639)<br /> [null] Caused by: java.sql.SQLException: Table not found in statement [select name, lsid, file from cacheContentTable]<br /> [null] at org.hsqldb.jdbc.Util.sqlException(Unknown Source)<br /> [null] at 
org.hsqldb.jdbc.jdbcStatement.fetchResult(Unknown Source)<br /> [null] at org.hsqldb.jdbc.jdbcStatement.executeQuery(Unknown Source)<br /> [null] at org.kepler.objectmanager.cache.CacheManager.getObject(CacheManager.java:508)<br /> [null] ... 16 more</code></pre>
<h2><a href="https://projects.ecoinformatics.org/ecoinfo/issues/5413">Bug #5413 (New): large run deletion via the WRM with hsql provenance store takes a very long time...</a></h2>
<p>2011-05-27T00:18:52Z, Derik Barseghian &lt;barseghian@nceas.ucsb.edu&gt;</p>
<p>Deleting workflow executions that processed a lot of data via the Workflow Run Manager takes far too long, often longer than the execution itself took. Also, there's no GUI feedback; Kepler just appears locked up. A significant part of the time is likely file I/O -- hsql is deleting each token from the port_event table individually, and for each delete it writes a line to the provenanceDB.log file like:<br />DELETE FROM PORT_EVENT WHERE ID=102490</p>
<h2><a href="https://projects.ecoinformatics.org/ecoinfo/issues/5330">Bug #5330 (New): Workflow Run Manager - include provenance trace file in run-kar</a></h2>
<p>2011-03-01T03:05:13Z, Derik Barseghian &lt;barseghian@nceas.ucsb.edu&gt;</p>
<p>It sounds like provenance trace files are going to start being included in the provenance database, per execution.<br />The Workflow Run Manager could then include trace files in run-KARs, creating more comprehensive archives.<br />This probably only really makes sense if we can do something useful with the trace file in the KAR. I'm imagining the Provenance Browser could (can?) be invoked from the trace file directly. Or we could consider importing the data from the trace into the provenance db.</p>
<h2><a href="https://projects.ecoinformatics.org/ecoinfo/issues/5284">Bug #5284 (New): Set uploadToServer to be true at configuration.xml</a></h2>
<p>2011-01-28T17:37:51Z, Jing Tao &lt;tao@nceas.ucsb.edu&gt;</p>
<p>When we set up the Kepler run engine, we always have to change this value from false to true. Why do we set it to false? I think users always want to upload the run KAR file to the repository when they pass the repository name to Kepler.</p>
<h2><a href="https://projects.ecoinformatics.org/ecoinfo/issues/5175">Bug #5175 (New): Fix remaining issues with exporting multiple runs into one KAR</a></h2>
<p>2010-09-13T19:14:27Z, Derik Barseghian &lt;barseghian@nceas.ucsb.edu&gt;</p>
<p>In 2.1, opening a KAR that contains two workflows and report layouts won't necessarily open the report layouts.</p>
<p>The problem here is that the karEntryHandler open methods don't know what tableauFrame an item should be associated with when opening it. I thought I had worked around this in 2.1 by using get/setRankingTableauFrame methods to keep a reference in the WorkflowManager singleton to the tableauFrame most recently dealt with, but this doesn't (always?) work because it needs the items opened in a certain order. When this bug occurs, the workflows are both opened (and the reportDesignerPanels are associated with their respective tableauFrames), and then the rest of the entries are dealt with. What I believe the current 2.1 code needs is for one workflow to be opened, and then its items, and then the next workflow and its items. In detail: ReportLayoutKAREntryHandler.open calls getRankingTableauFrame(), but what this open method sometimes needs is a tableauFrame that's been opened prior to the current "ranking" one.</p>
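<p>The per-workflow opening order described above can be sketched generically (all names here are hypothetical; this is not Kepler's KAREntryHandler API): group KAR entries by the workflow they belong to, then open each workflow immediately followed by its own items, so a handler never needs a "ranking" frame from an earlier workflow:</p>

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KarOpenOrder {
    // A KAR entry tagged with the workflow it belongs to (hypothetical model).
    record Entry(String workflowId, String name) {}

    // Reorder entries so each workflow is immediately followed by its own
    // items, preserving first-seen workflow order; each item is then opened
    // against the frame of the workflow just opened.
    static List<String> openingOrder(List<Entry> entries) {
        Map<String, List<String>> byWorkflow = new LinkedHashMap<>();
        for (Entry e : entries) {
            byWorkflow.computeIfAbsent(e.workflowId(), k -> new ArrayList<>())
                      .add(e.name());
        }
        List<String> order = new ArrayList<>();
        for (Map.Entry<String, List<String>> wf : byWorkflow.entrySet()) {
            order.add("open workflow " + wf.getKey());
            for (String item : wf.getValue()) {
                order.add("open " + item + " in frame of " + wf.getKey());
            }
        }
        return order;
    }

    public static void main(String[] args) {
        List<Entry> kar = List.of(
                new Entry("wfA", "reportLayoutA"),
                new Entry("wfB", "reportLayoutB"),
                new Entry("wfA", "reportInstanceA"));
        System.out.println(openingOrder(kar));
    }
}
```

<p>With this grouping, the frame each item needs is always the most recently opened one, which is exactly the invariant the single "ranking tableauFrame" reference assumes.</p>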
<p>I plan to release the reporting suite with the restriction of only being able to export one run into a KAR, so that two different solutions don't need to be implemented for this problem (since 2.1 and trunk differ), and so that this bug doesn't hold up the release. We can enable this functionality in the next releases, which will be cut from trunk.</p>
<h2><a href="https://projects.ecoinformatics.org/ecoinfo/issues/5095">Bug #5095 (In Progress): test kepler and wrp for memory leaks</a></h2>
<p>2010-07-14T22:56:35Z, Matt Jones &lt;jones@nceas.ucsb.edu&gt;</p>
<p>Oliver Soong reported having difficulties with memory leaks. There are two specific bugs about this, which I have set to block this testing bug. In addition, testing may reveal additional leaks, which should be fixed before 2.1 is released. Here's Oliver's synopsis of the issues:</p>
<p>I think this is limited to the wrp suite, but Kepler’s performance degrades significantly over time. Provenance recording can become prohibitively slow, and there is no native in-Kepler fix. There is a large memory leak somewhere, and many components are quite memory-intensive regardless. Given the intention to record executions and the large number of analyses scientists perform, I suspect any dedicated user of Kepler will quickly encounter data management problems. In my case, I stopped using local repositories and began closing Kepler after running any large workflows.</p>
<h2><a href="https://projects.ecoinformatics.org/ecoinfo/issues/5090">Bug #5090 (New): Report Layout interface is confusing wrt locking and switching to old layouts</a></h2>
<p>2010-07-10T00:15:26Z, Derik Barseghian &lt;barseghian@nceas.ucsb.edu&gt;</p>
<p>I'm unhappy with how the report layout 'lock edit mode' button works. By default it's unlocked, and when you first click on a run, you're warned you'll lose your current report layout (even if your current layout is empty, and even if you're changing to the same report layout). On the warning dialog you may say "don't warn me again", but this is on a per-window basis and can get tedious.</p>
<p>If you click the 'lock edit mode' button, then on every single run you click on, you have to OK a warning message reminding you that you're locked; this gets extremely tedious.</p>
<p>One idea: if there were a distinction in the GUI and code between your "currently under construction" layout (associated with the current window's workflow) vs. historic layouts, plus an easy way to switch between the two and to copy a historic layout into your currently-under-construction layout, the GUI might be less confusing and burdensome, since there would be no need for locking.</p>
<h2><a href="https://projects.ecoinformatics.org/ecoinfo/issues/5086">Bug #5086 (New): Report instance does not show in viewer after kar open</a></h2>
<p>2010-07-09T20:15:33Z, Derik Barseghian &lt;barseghian@nceas.ucsb.edu&gt;</p>
<p>When opening a run KAR that contains a report instance, the report instance viewer does not show the PDF. If a KAR contains a report instance, it also contains a run, so when you open the KAR the run is imported (or not, if you already have it). So you can always get at this report instance by clicking on the run in the Workflow Run Manager, but this is far from ideal -- the report instance should just be shown in the report instance viewer after you open the KAR.</p>
<h2><a href="https://projects.ecoinformatics.org/ecoinfo/issues/4981">Bug #4981 (In Progress): RIO pdfs don't show up in the Component Library</a></h2>
<p>2010-05-06T23:33:07Z, Derik Barseghian &lt;barseghian@nceas.ucsb.edu&gt;</p>
<p>If you create a workflow and a simple report, execute it, and export the run to a local repository, then when you expand the KAR in the Components Library, no item representing the PDF shows up.</p>