Revision 8801

add more sections about extending the annotation model, serializations, and permissions.

View differences:

docs/user/metacat/source/semantic-annotation.rst
145 145
	* ``characteristic_sm`` - indexes the oboe:Characteristic[s] for oboe:Measurement[s] in the datapackage
146 146
	* ``standard_sm`` - indexes the oboe:Standard[s] for oboe:Measurement[s] in the datapackage
147 147

  
148

  
149 148
	
150 149
Example
151 150
_______
......
184 183
but not that we measured the Length of the Tree (it may be that we actually measured the Length of the bird in the tree).
185 184

  
186 185

  
186

  
187
Extending the model
188
___________________
189

  
190
The proposed system for asserting and indexing annotations can easily be extended. For practical reasons, we do want to codify a preferred mechanism for expressing 
191
observation measurements and binding them to their data table attributes. But because the model is essentially just a collection of triples, and the mechanism that indexes those
192
triples is configured with custom SPARQL queries, we can accommodate additional statements about data objects and packages in the future.
193

  
194
One such semantic extension involves a provenance graph for derived data products. For detailed information on that endeavor, see the ore-model-expansion section.
195

  
196
Another area for extension uses ORCIDs to give attribution to the appropriate author/creator. This is expressed in the model using prov:wasAttributedTo and 
197
could be readily indexed into a dynamic SOLR field like ``creator_sm``. But until these ORCIDs are more widely adopted, it may be difficult to provide effective querying based on this field.
198
It would also require authors to actively assert that their ORCIDs are associated with certain data packages and objects, perhaps using tools that we have not yet implemented.
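
As a rough illustration of the attribution extension described above, the following sketch treats the model as plain triples and collects the objects of ``prov:wasAttributedTo`` into a multi-valued ``creator_sm`` field. It is not the Metacat indexer, which is configured with SPARQL queries. The object identifiers and the ``index_creators`` helper are hypothetical, and the ORCID is the fictional sample identifier that ORCID uses in its own documentation.

```python
# PROV-O predicate used to attribute an object to an agent.
PROV_WAS_ATTRIBUTED_TO = "http://www.w3.org/ns/prov#wasAttributedTo"

# Hypothetical identifiers, for illustration only.
DATA_OBJECT = "https://example.org/object/data.1.1"
OTHER_OBJECT = "https://example.org/object/data.2.1"
SAMPLE_ORCID = "https://orcid.org/0000-0002-1825-0097"

# The annotation model as plain (subject, predicate, object) triples.
triples = [
    (DATA_OBJECT, PROV_WAS_ATTRIBUTED_TO, SAMPLE_ORCID),
    (OTHER_OBJECT, "http://purl.org/dc/terms/title", "Unrelated statement"),
]


def index_creators(model, target):
    """Collect values for a multi-valued ``creator_sm`` field for one target."""
    creators = [
        obj
        for subj, pred, obj in model
        if subj == target and pred == PROV_WAS_ATTRIBUTED_TO
    ]
    # De-duplicate and sort so the field content is stable between runs.
    return {"creator_sm": sorted(set(creators))}


doc = index_creators(triples, DATA_OBJECT)
```

An object with no attribution triple yields an empty ``creator_sm`` list, which mirrors the concern above: the field is only useful for querying once authors have actually asserted their ORCIDs.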
199

  
200

  
201
Annotation serializations
202
______________________________
203

  
204
Our initial serialization technique for semantic annotations is to have a distinct file for the model. We have been using RDF/XML, but other syntaxes will likely be supported out of the box 
205
because we are using the Jena library for model parsing.
206

  
207
Other methods for serializing the model we have considered and may support in the future include:
208
	* ``ORE`` - included as additional triples in our current ORE resource map packaging serializations
209
	* ``RDFa`` - annotations embedded directly within the science metadata
210
	* ``triplestore`` - triples written directly to a triple store endpoint using an API
211
	
212

  
213
Annotation permissions
214
______________________________
215

  
216
Because annotations usually assert facts (or opinions) about *other* objects, we will allow these assertions to be indexed only if the rights holder for the RDF model has the same rights holder
217
privileges on the target object.
218
This will prevent both malicious and accidental assertions about objects by other parties who should not be influencing how the object is documented or interpreted. 
219
Unfortunately, this also prevents interested parties who are not rights holders from asserting valuable statements about research data in the system.
220
Ideally, we will accommodate third-party annotations and expose them for use in discovery and integration so long as they are effectively labeled (e.g., "alternative annotation" or "automated annotation").
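
The rule above can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the actual Metacat access check. The function name, its parameters, and the "rights holder annotation" label are hypothetical; only the "alternative annotation" and "automated annotation" labels come from the text.

```python
def classify_annotation(annotation_rights_holder, target_rights_holder,
                        allow_third_party=False, automated=False):
    """Return a label if the annotation may be indexed, otherwise None.

    Hypothetical helper: subjects are compared as plain strings.
    """
    if annotation_rights_holder == target_rights_holder:
        # Current rule: same rights holder on the model and the target.
        return "rights holder annotation"
    if not allow_third_party:
        # Third-party assertions are not indexed under the current rule.
        return None
    # Possible future behavior: index third-party annotations, but label them.
    return "automated annotation" if automated else "alternative annotation"
```

With ``allow_third_party`` left at its default, an annotation whose rights holder differs from the target's is rejected. Enabling it models the labeled third-party case described as the ideal.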
221

  
222

  
223

  
187 224
Sample annotation using OWL
188 225
----------------------------
189 226
Serialization of the example model. Authored in and exported from Protege.
