Revision 8801
Added by ben leinfelder over 10 years ago
docs/user/metacat/source/semantic-annotation.rst | ||
---|---|---|
145 | 145 |
* ``characteristic_sm`` - indexes the oboe:Characteristic[s] for oboe:Measurement[s] in the datapackage |
146 | 146 |
* ``standard_sm`` - indexes the oboe:Standard[s] for oboe:Measurement[s] in the datapackage |
147 | 147 |
|
148 |
|
|
149 | 148 |
|
150 | 149 |
Example |
151 | 150 |
_______ |
... | ... | |
184 | 183 |
but not that we measured the Length of the Tree (it may be that we actually measured the Length of the bird in the tree). |
185 | 184 |
|
186 | 185 |
|
186 |
|
|
187 |
Extending the model |
|
188 |
___________________ |
|
189 |
|
|
190 |
The proposed system for asserting and indexing annotations can easily be extended. For practical reasons, we do want to codify a preferred mechanism for expressing |
|
191 |
observation measurements and binding them to their data table attributes. But because the model is essentially just a collection of triples, and the mechanism that indexes those |
|
192 |
triples is configured with custom SPARQL queries, we can accommodate additional statements about data objects and packages in the future. |
|
193 |
|
|
194 |
One such semantic extension involves a provenance graph for derived data products. For detailed information on that endeavor, see the ore-model-expansion section. |
|
195 |
|
|
196 |
Another area for extension uses ORCIDs to give attribution to the appropriate author/creator. This is expressed in the model using prov:wasAttributedTo and |
|
197 |
could be readily indexed into a dynamic SOLR field like ``creator_sm``. But until these ORCIDs are more widely adopted, it may be difficult to provide effective querying based on this field. |
|
198 |
It would also require authors to actively assert that their ORCIDs are associated with certain data packages and objects, perhaps using tools that we have not yet implemented. |
|
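A minimal sketch of how ``prov:wasAttributedTo`` statements might be collected into a multi-valued ``creator_sm`` field. It stands in for the SPARQL-driven indexer with plain Python tuples, so the function name, the ``urn:example`` identifier, and the placeholder ORCID are assumptions for illustration only, not Metacat's actual indexing code.

```python
# Hypothetical illustration: plain tuples stand in for the RDF model, and a
# simple predicate match stands in for the configured SPARQL query.
PROV_WAS_ATTRIBUTED_TO = "http://www.w3.org/ns/prov#wasAttributedTo"


def index_creators(triples):
    """Map each subject to a de-duplicated, ordered list of attributed agents."""
    fields = {}
    for subject, predicate, obj in triples:
        if predicate != PROV_WAS_ATTRIBUTED_TO:
            continue  # only attribution statements feed creator_sm
        creators = fields.setdefault(subject, {"creator_sm": []})["creator_sm"]
        if obj not in creators:
            creators.append(obj)
    return fields


# Placeholder identifiers, assumed for the example.
triples = [
    ("urn:example:data.1", PROV_WAS_ATTRIBUTED_TO, "http://orcid.org/0000-0002-1825-0097"),
    ("urn:example:data.1", "http://purl.org/dc/terms/title", "Example"),
    ("urn:example:data.1", PROV_WAS_ATTRIBUTED_TO, "http://orcid.org/0000-0002-1825-0097"),
]
solr_fields = index_creators(triples)
```

Because the field is keyed by the annotated object, a query on ``creator_sm`` would only find data that someone has explicitly attributed, which is the adoption concern raised above.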
199 |
|
|
200 |
|
|
201 |
Annotation serializations |
|
202 |
______________________________ |
|
203 |
|
|
204 |
Our initial serialization technique for semantic annotations is to have a distinct file for the model. We have been using RDF/XML, but other syntaxes will likely be supported out of the box |
|
205 |
because we are using the Jena library for model parsing. |
|
206 |
|
|
207 |
Other methods for serializing the model we have considered and may support in the future include: |
|
208 |
* ``ORE`` - included as additional triples in our current ORE resource map packaging serializations |
|
209 |
* ``RDFa`` - annotations embedded directly within the science metadata |
|
210 |
* ``triplestore`` - triples written directly to a triple store endpoint using an API |
|
211 |
|
|
212 |
|
|
213 |
Annotation permissions |
|
214 |
______________________________ |
|
215 |
|
|
216 |
Because annotations usually assert facts (or opinions) about *other* objects, we will allow these assertions to be indexed only if the rights holder for the RDF model has the same rights holder |
|
217 |
privileges on the target object. |
|
218 |
This will prevent both malicious and accidental assertions about objects by other parties who should not be influencing how the object is documented or interpreted. |
|
219 |
Unfortunately, this also prevents interested, non-rights holder parties from asserting valuable statements about research data in the system. |
|
220 |
Ideally, we will accommodate third-party annotations and expose them for use in discovery and integration so long as they are effectively labeled (e.g., "alternative annotation", "automated annotation", etc.). |
|
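The permission rule above can be sketched as a filter applied before indexing. This is a simplified illustration that assumes each object has exactly one rights holder, held in a hypothetical ``rights_holders`` lookup; the function name, identifiers, and dictionary layout are assumptions, not part of Metacat.

```python
def indexable_annotations(annotations, rights_holders, model_id):
    """Keep only annotations whose target shares the model's rights holder."""
    model_owner = rights_holders.get(model_id)
    if model_owner is None:
        return []  # unknown model owner: index nothing
    return [
        annotation
        for annotation in annotations
        if rights_holders.get(annotation["target"]) == model_owner
    ]


# Hypothetical objects: the annotation model and data.1 share an owner.
rights_holders = {"annotation.1": "alice", "data.1": "alice", "data.2": "bob"}
annotations = [
    {"target": "data.1", "characteristic": "Length"},
    {"target": "data.2", "characteristic": "Mass"},
]
accepted = indexable_annotations(annotations, rights_holders, "annotation.1")
```

In this sketch the statement about ``data.2`` is dropped rather than indexed. Supporting labeled third-party annotations would mean keeping such statements and tagging them instead of discarding them.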
221 |
|
|
222 |
|
|
223 |
|
|
187 | 224 |
Sample annotation using OWL |
188 | 225 |
---------------------------- |
189 | 226 |
Serialization of the example model. Authored in and exported from Protege. |
add more sections about extending the annotation model, serializations, and permissions.