Bug #269
closed
resolve packaging issues
0%
Description
There are some contentious issues surrounding the use of packaging (i.e., the
triple element) in EML. Some would prefer inclusion via namespaces directly to
make the schema more explicit. But using triples to associate data and metadata
files is more flexible and allows new types of metadata to be added over time
without changes to the original structure.
One complaint is that the current structure requires multiple files to deliver
all of the metadata. One possible solution is to include an element 'metadata'
with content model 'ANY' as the root element, which can contain all of the other
modules, and they in turn can use namespaces to indicate how validation can be
performed.
Updated by Matt Jones almost 23 years ago
At the KNB annual meeting in 2001 we suggested (concluded) that triples be used to
associate unique, identifiable, retrievable objects. The subject and object
tags of a triple should contain identifiers.
If there are additional digital objects that do not represent unique objects, or
that are not retrievable, then the 'onlineURL' field of EML should be used. In
no case should triples be used for URLs. Will need to consider this more
thoroughly.
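
As a rough illustration of the convention above, the following Python sketch builds a triple whose subject and object hold identifiers and refuses anything that looks like a URL. The element names "triple", "subject", "relationship" and "object" follow the wording of this thread; the helper name make_triple, the identifier values, and the "://" test for spotting URLs are assumptions made for the example, not part of EML.

```python
import xml.etree.ElementTree as ET


def make_triple(subject_id, relationship, object_id):
    """Build a <triple> element linking two identified objects (hypothetical helper)."""
    # Crude, assumed heuristic: anything containing "://" is treated as a URL
    # and rejected, since triples should carry identifiers only.
    for value in (subject_id, object_id):
        if "://" in value:
            raise ValueError(f"triples take identifiers, not URLs: {value!r}")
    triple = ET.Element("triple")
    ET.SubElement(triple, "subject").text = subject_id
    ET.SubElement(triple, "relationship").text = relationship
    ET.SubElement(triple, "object").text = object_id
    return triple


# Identifier values below are made up for illustration.
triple = make_triple(
    "example.1.1", "describes the attributes present in", "example.2.1"
)
xml_text = ET.tostring(triple, encoding="unicode")
print(xml_text)
```

Objects that are not uniquely identified or not retrievable would stay out of the triple and go in the 'onlineURL' field instead, as described above.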
Updated by Matt Jones almost 23 years ago
Changing target milestone for the major EML bugs to Beta7, which is scheduled
for release in early to mid March. There are likely other bugs that need to be
entered and resolved for this Beta7 release as well, so let's generate a complete
list!
Updated by Matt Jones almost 23 years ago
Another issue to consider with regard to packaging is the domain for the
relationship element. Right now many relationships are specified as
"isRelatedTo", which really doesn't provide any information. Values such as
"provides access control rules for" and "describes the attributes present in"
are much more useful because they specify the role that a particular document
plays with respect to another. Even with stronger relationships, the issue of
which document types can be subjects and objects on a particular relationship
needs to be determined.
Updated by Matt Jones almost 23 years ago
Yet another packaging issue: the triple elements allow you to associate any two
objects that have unique identifiers. However, there are no constraints on
those associations, so it is technically possible to 1) associate two objects in
a way that is nonsensical (e.g., eml-attribute associated with eml-party), and
2) leave out relationships that need to be specified (e.g., eml-attribute
associated with eml-entity). The basic problem is that we lack a way of
constraining the packaging model: consider this fantasy example declaration for
a module content model:
<!MODULE eml-entity (eml-attribute*)>.
Would something akin to this help? If we need this, shouldn't we just head back
towards using XML content models directly for all core EML content?
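
To make the two failure modes concrete, here is a minimal Python sketch of what a constraint check over triples might look like if the packaging model were kept. The rule tables ALLOWED and REQUIRED, the Triple type, and the function check_package are all hypothetical; nothing like them exists in EML, and the single rule shown is taken from the eml-attribute/eml-entity example above.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject_type: str   # module type of the subject, e.g. "eml-attribute"
    relationship: str
    object_type: str    # module type of the object, e.g. "eml-entity"


# Hypothetical rule table: permitted (subject, relationship, object) combinations.
ALLOWED = {
    ("eml-attribute", "describes the attributes present in", "eml-entity"),
}

# Hypothetical rule table: object module -> subject module that must point at it.
REQUIRED = {"eml-entity": "eml-attribute"}


def check_package(triples, modules_present):
    """Return a list of problems: disallowed triples and missing required ones."""
    problems = []
    for t in triples:
        if t not in ALLOWED:
            problems.append(f"not allowed: {t.subject_type} -> {t.object_type}")
    for obj_type, subj_type in REQUIRED.items():
        has_link = any(
            t.subject_type == subj_type and t.object_type == obj_type
            for t in triples
        )
        if obj_type in modules_present and not has_link:
            problems.append(f"missing: {subj_type} -> {obj_type}")
    return problems


bad = [Triple("eml-attribute", "isRelatedTo", "eml-party")]
problems = check_package(bad, {"eml-attribute", "eml-entity", "eml-party"})
print(problems)
```

A table like this plays the role of the fantasy MODULE declaration: it constrains the package without nesting the modules inside one another, but it has to be maintained separately from the schemas.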
If we eliminated packages, then all eml-* modules would be incorporated directly
into the XML trees according to the XML content models. This is more precise,
at the cost of flexibility, especially if we want to make revisions to EML (for
example, to add semantic metadata extensions to eml-attribute).
Overall I am torn about how precisely to modify our packaging model. I think we
need to maintain a flexible mechanism for incorporating arbitrary metadata and
data content in a machine-parsable way because new metadata specifications and
new versions of existing specifications will continually evolve, and many will
be relevant to our scientific target audience. How we best accomplish this is
somewhat open at this point. Ideas?
Updated by Matt Jones over 22 years ago
Created a new module "eml.xsd" that allows us to wrap up multiple eml modules
(and other metadata) into a single XML stream. The "eml" element has a content
model of "any" which allows us to nest arbitrary trees under the eml element.
The processContents attribute is set to "strict", which means that it is
obligatory that the processor locate a schema for the namespace for each
subtree, and that the sub-tree be valid with respect to that schema. Thus, one
might include an eml-entity module under the eml element, and use the
eml:eml-entity namespace to validate it.
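
The sketch below shows, in Python, what such a wrapped document could look like and which namespaces a strict processor would have to resolve to schemas. It only parses the wrapper and lists the namespace of each child of "eml"; the Python standard library does not perform XML Schema validation, so that step is not shown. The namespace strings, the eml-attribute child, and the function name child_namespaces are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper document; the namespace names are placeholders.
DOC = """<eml>
  <ent:eml-entity xmlns:ent="eml:eml-entity"/>
  <att:eml-attribute xmlns:att="eml:eml-attribute"/>
</eml>"""


def child_namespaces(xml_text):
    """Return (namespace, local name) for each direct child of the root element."""
    root = ET.fromstring(xml_text)
    result = []
    for child in root:
        # ElementTree represents qualified tags as "{namespace}localname".
        if child.tag.startswith("{"):
            namespace, local = child.tag[1:].split("}", 1)
        else:
            namespace, local = None, child.tag
        result.append((namespace, local))
    return result


namespaces = child_namespaces(DOC)
for namespace, local in namespaces:
    print(f"{local}: needs a schema for namespace {namespace!r}")
```

With processContents set to "strict", each namespace listed here must map to a schema the processor can locate, and each subtree must be valid against it.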
I had considered creating a top-level element that would constrain us to using
some derivative of eml-resource as the first child of "eml", but I determined
that there was no syntax that let me specify that any subclass of Resource was
permissible. Thus, in order that we could create and use additional EML
resource subclasses in the future, I opted to keep the content model for "eml"
as just "any".
This deals with the need to put all of the metadata in one file. However, it
means that we keep all of the content modules at the top level of the tree
(children of "eml"), rather than nesting modules under other modules. I am
prepared to close this bug as RESOLVED FIXED unless someone comments otherwise.
Updated by Matt Jones over 22 years ago
FIXED. Existing "eml" wrapper module to be incorporated into the beta7 release.