Bug #4161

Develop kepler workflow execution engine

Added by Jing Tao about 10 years ago. Updated over 9 years ago.

Status: Resolved
Priority: Immediate
Assignee:
Category: general
Target version:
Start date: 06/15/2009
Due date:
% Done: 0%
Estimated time:
Bugzilla-Id: 4161

Description

Here is an email from Matt:

I think the next stage would be to have you work on
putting together an execution engine that, given a schedule of a
workflow to be run, can be called to retrieve the workflow KAR from
metacat, set up a proper staging directory for execution, and run
kepler, making sure that output and errors are handled properly. On
the output side, this means programmatically calling the run manager
that Derik and Aaron are making to create a kar file that contains the
results, and then publishing that to metacat (Kepler also should have an
API for this when Derik and Aaron are done).
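
The staging and execution step described in the email could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `run_in_staging`, the file name `workflow.kar`, and the log file names are hypothetical, and the command that actually launches Kepler is not given in this bug, so it is passed in as a parameter. Retrieval from metacat and publishing the result KAR are left out.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_staging(command, kar_bytes, kar_name="workflow.kar"):
    """Stage a KAR in a fresh directory, run `command` there, keep its output.

    `command` is the launcher as an argument list (hypothetical here; the
    real Kepler command line is not specified in this bug). The path of the
    staged KAR is appended as the last argument.
    """
    staging = Path(tempfile.mkdtemp(prefix="kepler-run-"))
    kar_path = staging / kar_name
    kar_path.write_bytes(kar_bytes)

    # stdout and stderr are captured separately so that errors can be
    # reported alongside the results instead of being lost.
    result = subprocess.run(
        list(command) + [str(kar_path)],
        cwd=staging,
        capture_output=True,
        text=True,
    )
    (staging / "stdout.log").write_text(result.stdout)
    (staging / "stderr.log").write_text(result.stderr)

    status = "SUCCEEDED" if result.returncode == 0 else "FAILED"
    return status, staging


if __name__ == "__main__":
    # Stand-in command: a Python one-liner that echoes the KAR path.
    fake_kepler = [sys.executable, "-c", "import sys; print('ran', sys.argv[1])"]
    status, staging = run_in_staging(fake_kepler, b"not a real KAR")
    print(status, staging)
```

Using a fresh directory per run keeps the outputs of concurrent or repeated runs apart, and mapping the exit code to a status gives the caller something to report even when the workflow itself fails.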

We decided to implement the execution engine as a web service. The API will be:

1. execute(workflowID)::runID
2. getStatus(runID)::status

status includes:
SCHEDULED
EXECUTING
SUCCEEDED
FAILED
NOTFOUND
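
To make the two calls and the status values concrete, here is a minimal in-memory sketch. It is not the web service itself: the class name `InMemoryEngine`, the `run-N` ID format, and the `mark` helper are assumptions made for illustration, and nothing is actually executed.

```python
import itertools
from enum import Enum


class RunStatus(Enum):
    """The status values listed in this bug."""
    SCHEDULED = "SCHEDULED"
    EXECUTING = "EXECUTING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"
    NOTFOUND = "NOTFOUND"


class InMemoryEngine:
    """Toy stand-in for the service: it tracks runs but runs nothing."""

    def __init__(self):
        self._runs = {}
        self._counter = itertools.count(1)

    def execute(self, workflow_id):
        """Register a run for the workflow and return its run ID."""
        run_id = f"run-{next(self._counter)}"  # hypothetical ID format
        self._runs[run_id] = {"workflow": workflow_id,
                              "status": RunStatus.SCHEDULED}
        return run_id

    def get_status(self, run_id):
        """Return the run's status, or NOTFOUND for an unknown run ID."""
        run = self._runs.get(run_id)
        return run["status"] if run else RunStatus.NOTFOUND

    def mark(self, run_id, status):
        """Hypothetical helper a worker would call to update a run."""
        self._runs[run_id]["status"] = status


if __name__ == "__main__":
    engine = InMemoryEngine()
    run_id = engine.execute("urn:lsid:example.org:ns:42:1")
    print(run_id, engine.get_status(run_id).value)
    print(engine.get_status("no-such-run").value)
```

Returning NOTFOUND as an ordinary status, rather than raising an error, matches the list above, where it appears alongside the other states.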


Related issues

Blocks Kepler - Bug #4169: Build "server-side" installation of Kepler (Resolved, 06/18/2009)

Blocks Kepler - Bug #4180: Kepler command-line execution with cache (Resolved, 06/23/2009)

History

#1 Updated by Jing Tao about 10 years ago

In the execute(workflowID) method, we need to get the KAR file from Metacat first.
As I understand it, the workflowID will be in LSID format. Do we have any mechanism to get the file from a remote repository based on a given LSID?

If we don't have such a mechanism, we could pass the Metacat ID as well. The API would then be:
execute(workflowID, metacatID)::runID

Any suggestions and comments will be appreciated.

#2 Updated by Jing Tao about 10 years ago

Of course, adding metacatID is a short term solution.

#3 Updated by Jing Tao about 10 years ago

Here is Matt's comment:
<matt> hey jing
<matt> i saw your bugzilla note about retrieving KAR files by LSID
<jing> hi, matt
<matt> The LSID authority service should let you get the KAR file using the getData(LSID) method
<matt> however, it might be better to not rely on that
<matt> and instead, recognize that the ID is an LSID, extract out the metacat identifier part, then make a standard call to the ecogrid get method to get the kar
<matt> do you think that would work?
<jing> if we make sure metacat identifier is part of LSID, it should work.
<matt> it is, in that the metacat id is the namespace:object:rev part of the lsid
<matt> so there is a 1:1 mapping
<jing> sounds good.
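
The mapping Matt describes can be sketched as a small parsing function. This assumes a six-part LSID of the form urn:lsid:authority:namespace:object:revision, and it assumes the Metacat identifier is written with "." between the three parts; the chat only says that the metacat id is the namespace:object:rev part, so the separator is left as a parameter. The function name and the example LSID are made up for illustration.

```python
def lsid_to_metacat_id(lsid, separator="."):
    """Extract the namespace, object, and revision parts of an LSID.

    Assumes the form urn:lsid:<authority>:<namespace>:<object>:<revision>.
    The "." separator for the resulting identifier is an assumption.
    """
    parts = lsid.split(":")
    if len(parts) != 6 or [p.lower() for p in parts[:2]] != ["urn", "lsid"]:
        raise ValueError(f"not a six-part LSID: {lsid!r}")
    _authority, namespace, object_id, revision = parts[2:]
    if not (namespace and object_id and revision):
        raise ValueError(f"empty namespace, object, or revision in {lsid!r}")
    return separator.join([namespace, object_id, revision])


if __name__ == "__main__":
    # Made-up LSID; real authorities and namespaces will differ.
    print(lsid_to_metacat_id("urn:lsid:example.org:ns:42:1"))
```

An LSID without a revision is rejected here on purpose, since the 1:1 mapping discussed above relies on all three parts being present.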

#4 Updated by ben leinfelder about 10 years ago

After discussion with Jing and Derik yesterday:
1-we need to make sure we have a server-side version of Kepler that includes the WRP suite.
2-we need to verify that provenance and reporting features work GUI-less
3-we need to make some actions automated: packaging a KAR after execution; uploading that KAR.

Will item 3 be handled by Jing or by the reporting module?

#5 Updated by ben leinfelder about 10 years ago

In the client-side GUI version of Kepler we open/run workflows and reports like so:
1-Open KAR (contains the workflow and the report layout)
2-Run the workflow (which generates the report when completed and writes it to provenance)
3-Export the run as KAR (contains the workflow that was run, the report layout, the report instance (xml) and a pdf of the report instance).

The execution engine needs to open the workflow from the KAR so that the report layout is available. I believe the current plan only involves the workflow LSID and does not address any report layout issues.

I'd like to see an execution method that can take a KAR LSID (that contains a workflow and a report layout). It would then perform the three steps above.

Note that the KAREntryHandlers take care of the opening/initializing steps for reporting to work - so that's good (and already built in)!

#6 Updated by ben leinfelder about 10 years ago

we've had a couple successful "round trip" runs:
-upload a workflow KAR (indus)
-schedule execution (indus)
-execute wf (chico)
-archive results (chico->indus)
-view results (indus)

Now we've got to resolve the LSID-changing problem that is happening when we open a KAR with a workflow (bug #4224). Then we should be able to run another end-to-end test.

#7 Updated by ben leinfelder almost 10 years ago

re-targeting to wrp module, although it may be a 2.0 bug to extend kepler so that it can receive user credentials when running from the command line.

#8 Updated by ben leinfelder over 9 years ago

In terms of the credentials - we can now include them in the configuration xml file that is used by the command line version of Kepler: ConfigNoGUIWithCache.xml

<property name="_domain" class="ptolemy.kernel.util.StringAttribute" value="KNB,DEV" />
<property name="_username" class="ptolemy.kernel.util.StringAttribute" value="uid=kepler,o=unaffiliated,dc=ecoinformatics,dc=org" />
<property name="_password" class="ptolemy.kernel.util.StringAttribute" value="kepler" />

but with the correct user for the TPC group
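
As an illustration of how these three properties could be read back by a script, here is a sketch using Python's standard XML parser. The `<entity>` wrapper element is an assumption added so the fragment parses on its own; the real layout of ConfigNoGUIWithCache.xml is not shown in this bug. Splitting `_domain` on commas is likewise a guess based on the value "KNB,DEV".

```python
import xml.etree.ElementTree as ET

# The three properties from the comment above, wrapped in a hypothetical
# root element so that the fragment is well-formed by itself.
FRAGMENT = """
<entity>
  <property name="_domain" class="ptolemy.kernel.util.StringAttribute" value="KNB,DEV" />
  <property name="_username" class="ptolemy.kernel.util.StringAttribute" value="uid=kepler,o=unaffiliated,dc=ecoinformatics,dc=org" />
  <property name="_password" class="ptolemy.kernel.util.StringAttribute" value="kepler" />
</entity>
"""

REQUIRED = ("_domain", "_username", "_password")


def read_credentials(xml_text):
    """Collect the credential properties from a config fragment."""
    root = ET.fromstring(xml_text.strip())
    props = {p.get("name"): p.get("value") for p in root.iter("property")}
    missing = [name for name in REQUIRED if name not in props]
    if missing:
        raise KeyError(f"missing properties: {', '.join(missing)}")
    return {
        "domains": props["_domain"].split(","),  # assumed comma-separated
        "username": props["_username"],
        "password": props["_password"],
    }


if __name__ == "__main__":
    creds = read_credentials(FRAGMENT)
    print(creds["domains"], creds["username"])
```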

#9 Updated by ben leinfelder over 9 years ago

script-based execution still works. There's a slight problem on the Metacat side for the first MOML DTD...but that's on its way to being resolved.
The report for my test was blank, however...that's a problem.

#10 Updated by ben leinfelder over 9 years ago

stylesheet issue on the Metacat side of things -- now fixed.

#11 Updated by ben leinfelder over 9 years ago

installed on my local system - did a roundtrip:
-upload workflow+report
-schedule
-wait
-view PDF report when complete
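
The "wait" step in a round trip like this amounts to polling the status call until the run reaches a final state. A rough sketch follows, assuming the status names from the bug description are returned as plain strings; the function name, the default interval, and the default timeout are arbitrary choices, and `get_status` stands in for whatever client call reaches the engine.

```python
import time

# States after which the run will not change any more.
TERMINAL = {"SUCCEEDED", "FAILED", "NOTFOUND"}


def wait_for_run(get_status, run_id, interval=5.0, timeout=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll get_status(run_id) until a terminal status or the timeout.

    `sleep` and `clock` are parameters so the loop can be exercised
    without real waiting.
    """
    deadline = clock() + timeout
    while True:
        status = get_status(run_id)
        if status in TERMINAL:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"run {run_id} still {status} after {timeout}s")
        sleep(interval)


if __name__ == "__main__":
    # Canned status sequence in place of a real engine.
    canned = iter(["SCHEDULED", "EXECUTING", "SUCCEEDED"])
    print(wait_for_run(lambda run_id: next(canned), "run-1", interval=0))
```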

There are a few minor Metacat updates to be committed, but otherwise things are still working.

#12 Updated by ben leinfelder over 9 years ago

Need to move the authentication credentials from the old config system (ptolemy-based) to the new configuration.xml system.
Seems as if we will not be passing these params on the commandline, so a purely configuration-based approach will suffice for now.

#13 Updated by Derik Barseghian over 9 years ago

Hey Jing and/or Ben, could you give an update on this bug?

#14 Updated by ben leinfelder over 9 years ago

I know in early february I was able to use the commandline to execute a KAR (after some tweaks) and then Josep was unknowingly helping us make enhancements.
As long as the commandline execution works, I think the webservice-based execution is pretty straightforward.
We should update the dev (metacat) server to have a more recent WRP build of Kepler and make sure the TPC scheduling system is configured to use it.

#15 Updated by Jing Tao over 9 years ago

In early March, I installed an execution engine on my local machine. I finished a full cycle: uploaded a KAR file from Kepler to the server, searched for the KAR file on the web site, and scheduled a job, which executed successfully.

So I think the engine works. But it may still need some improvement.

#16 Updated by Jing Tao over 9 years ago

so far the engine works. I am closing the bug.

#17 Updated by Jing Tao over 9 years ago

closing the bug.

#18 Updated by Redmine Admin over 6 years ago

Original Bugzilla ID was 4161
