Story #6548: Expand ORE model to allow relationships for derived datasets - Metacat - Ecoinformatics Redmine

Added by Lauren Walker almost 11 years ago. Updated over 9 years ago.

Status: Closed
Priority: Normal
Assignee: Lauren Walker
Category: metacat
Target version: 2.5.0
Start date: 05/16/2014
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)

Description

Design a new model for Metacat's OREs in which relationships such as:

- wasGeneratedBy
- derivedFrom
- used

describe datasets that are derived from raw data and metadata. These relationships may span OREs (e.g. an analyst may create data visualizations of another scientist's data and create a new package for those visualizations).
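As a rough illustration of a relationship that spans two packages, the sketch below writes each ORE as a list of plain subject-predicate-object triples. All identifiers (`pkg:raw`, `data:plot.png`, and so on) and the helper `cross_package_sources` are hypothetical, and the PROV-style predicate names are an assumption standing in for the relationship names listed above; this is not Metacat's actual model.

```python
# Triples are (subject, predicate, object). All identifiers are hypothetical.
AGGREGATES = "ore:aggregates"
WAS_DERIVED_FROM = "prov:wasDerivedFrom"  # assumed name for "derivedFrom"
WAS_GENERATED_BY = "prov:wasGeneratedBy"
USED = "prov:used"

# Package published by the original scientist.
raw_package = [
    ("pkg:raw#aggregation", AGGREGATES, "meta:raw.xml"),
    ("pkg:raw#aggregation", AGGREGATES, "data:raw.csv"),
]

# Package published by an analyst who visualized the raw data.
derived_package = [
    ("pkg:viz#aggregation", AGGREGATES, "meta:viz.xml"),
    ("pkg:viz#aggregation", AGGREGATES, "data:plot.png"),
    ("pkg:viz#aggregation", AGGREGATES, "activity:plotting"),
    ("data:plot.png", WAS_GENERATED_BY, "activity:plotting"),
    ("activity:plotting", USED, "data:raw.csv"),
    ("data:plot.png", WAS_DERIVED_FROM, "data:raw.csv"),
]


def aggregated(triples):
    """Return the set of resources aggregated in a package."""
    return {o for _s, p, o in triples if p == AGGREGATES}


def cross_package_sources(derived, source):
    """Return members of `source` that objects in `derived` were derived from."""
    source_members = aggregated(source)
    return sorted(
        {o for _s, p, o in derived if p == WAS_DERIVED_FROM and o in source_members}
    )


print(cross_package_sources(derived_package, raw_package))
```

Here the derived package points at `data:raw.csv` without aggregating it, which is one possible reading of "relationships may span OREs".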

Subtasks 2 (0 open — 2 closed)

Updated by ben leinfelder almost 11 years ago

Target version set to 2.5.0

Looking at the ORE spec, this is certainly acceptable practice. The spec just wants to make sure there are no orphaned nodes, so that everything can be traced back to either the aggregation or one of the aggregated resources within it. I believe we are also allowed to refer to objects that are aggregated by other resource maps, though I don't know whether the derived resource map also needs to state that it "aggregates" the resource it is deriving products from. Either way, I think this will be great.
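The "no orphaned node" idea can be sketched as a reachability check. The code below is a simplified reading of that constraint, not the ORE specification's formal rule: it treats triples as undirected edges and reports any node that cannot be reached from the aggregation. The function name `find_orphans` and all identifiers are hypothetical.

```python
from collections import defaultdict, deque


def find_orphans(triples, aggregation):
    """Return nodes that cannot be traced back to `aggregation`.

    Triples are (subject, predicate, object); edges are treated as undirected.
    """
    neighbours = defaultdict(set)
    for s, _p, o in triples:
        neighbours[s].add(o)
        neighbours[o].add(s)

    # Breadth-first search outward from the aggregation node.
    seen = {aggregation}
    queue = deque([aggregation])
    while queue:
        node = queue.popleft()
        for nxt in neighbours[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)

    return sorted(set(neighbours) - seen)


triples = [
    ("pkg:viz#aggregation", "ore:aggregates", "data:plot.png"),
    # External resource: reachable through an aggregated resource.
    ("data:plot.png", "prov:wasDerivedFrom", "data:raw.csv"),
    # Statement unconnected to anything in the package.
    ("data:stray.csv", "prov:wasDerivedFrom", "data:other.csv"),
]

print(find_orphans(triples, "pkg:viz#aggregation"))
```

Under this reading, a reference to a resource aggregated by another resource map is not an orphan as long as some aggregated resource links to it.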

Another thing to consider is doing ALL the semantic annotation assertions in the OREs. Not that it would be required, but it could be convenient since we already have a good precedent with folks starting to generate OREs for DataONE. My one concern is that the index parser would need to know how to handle the existing ORE packaging assertions as well as any SPARQL-based index processing we would want to do.

Updated by Lauren Walker almost 11 years ago

Status changed from New to In Progress

A page in the metacat docs has been added to describe changes to Metacat's ORE model.

Updated by Lauren Walker over 10 years ago

The model will need a second revision. Matt and I talked today about the "activities" in our model, which represent programs/scripts. There needs to be a place in the model for "runs."

A run would represent a single execution of a program. It would have properties like a start time, end time, parameters used, etc. Parameters may differ from run to run, especially if one or more functions generate random numbers.

A program (e.g. an R script) is not exactly an activity but rather another entity/data object.

If our model could store runs and keep programs and runs as separate concepts, data output would be reproducible, since each run would carry all the information needed to execute the program again with exactly the same parameters.
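A minimal sketch of the program/run split described above, assuming hypothetical class and field names (`Program`, `Run`, `rerun_spec`, `seed`); the ticket does not define a concrete schema. The program is modeled as an entity, and each run records its own times and parameters so the same execution could be repeated.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List


@dataclass(frozen=True)
class Program:
    """A script stored as a data object (an entity, not an activity)."""
    identifier: str


@dataclass
class Run:
    """A single execution of a program."""
    program: Program
    start_time: datetime
    end_time: datetime
    parameters: Dict[str, Any] = field(default_factory=dict)
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)

    def rerun_spec(self) -> Dict[str, Any]:
        """Return what would be needed to execute the program the same way again."""
        return {
            "program": self.program.identifier,
            "parameters": dict(self.parameters),
            "inputs": list(self.inputs),
        }


script = Program(identifier="script:plot.R")

# Two runs of the same program with different random seeds (hypothetical values).
run_a = Run(script, datetime(2014, 6, 1, 9, 0), datetime(2014, 6, 1, 9, 5),
            parameters={"seed": 42}, inputs=["data:raw.csv"], outputs=["data:plot_a.png"])
run_b = Run(script, datetime(2014, 6, 2, 9, 0), datetime(2014, 6, 2, 9, 4),
            parameters={"seed": 7}, inputs=["data:raw.csv"], outputs=["data:plot_b.png"])

print(run_a.rerun_spec())
```

Recording the seed as a run parameter is one way to handle the random-number case: the two runs share a program but remain distinguishable and individually repeatable.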
