Bug #266
closed
revise attribute domain
0%
Description
Attribute metadata describes the domain for the attributes using enumerated and
range domains, but does not currently allow for free-text domains. This could
be fixed using FGDC's unrepresentable domain.
Also, there has been a request to add 'paragraph' and 'citation' elements to the
'source' element to be more specific about the source for a domain.
Updated by Matt Jones about 23 years ago
Here's a proposal for this issue with EML attributeDomain:
Add a new element to "attributeDomain" called "textDomain", defined as an
element whose contents are a regular expression pattern that free-text values
must match. If the "textDomain" element is empty, a pattern of '.*' is implied,
allowing any string (including the empty string) to be valid. Patterns use the
regular expression syntax of the pattern facet in the W3C XML Schema Datatypes
recommendation (section 4.3.4).
New Content model for attribute domain is:
<!ELEMENT attributeDomain ( (enumeratedDomain | textDomain)+ | rangeDomain+ ) >
When more than one instance of these elements is provided (e.g., a textDomain is
repeated), the domains are OR'ed together to allow any of the values. Note
that the whole choice group has become repeatable, so mixtures of enumerated
domains and textDomains are possible, although they are exclusive with
rangeDomains (as is currently the case).
Here's a couple of examples:
Specifies any text value (the empty element implies the pattern '.*'):
<attributeDomain><textDomain/></attributeDomain>
Specifies a sequence of one or more digits:
<attributeDomain><textDomain>[0-9]+</textDomain></attributeDomain>
Specifies a 5-character alphanumeric string whose first two characters are "MP":
<attributeDomain><textDomain>MP[a-zA-Z0-9]{3}</textDomain></attributeDomain>
Many more examples are possible. The most common practice will likely be to
simply provide an empty <textDomain/> element indicating that any text is
permissible.
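To make the matching rule concrete, here is a rough Python sketch of how a consumer might check values against the example patterns above. The function name matches_text_domain is hypothetical, and Python's re module is only an approximation of the XML Schema regular expression dialect; the simple patterns shown here happen to behave the same in both. The sketch uses re.fullmatch because XML Schema patterns must match the whole value, and it treats an empty pattern list as the implied '.*'.

```python
import re

def matches_text_domain(value, patterns):
    """Return True if value fully matches at least one pattern.

    `patterns` is a list of regular expression strings taken from one or
    more textDomain elements. An empty list stands in for an empty
    <textDomain/>, which implies the pattern '.*' (hypothetical helper,
    not part of any EML tooling).
    """
    if not patterns:
        patterns = [".*"]
    # Multiple patterns are OR'ed: one successful full match is enough.
    return any(re.fullmatch(p, value) is not None for p in patterns)

# Empty textDomain: any string, including the empty string, is valid.
print(matches_text_domain("", []))
# One or more digits.
print(matches_text_domain("12345", ["[0-9]+"]))
# "MP" followed by exactly three alphanumeric characters.
print(matches_text_domain("MPa1Z", ["MP[a-zA-Z0-9]{3}"]))
```

A real validator would need an XML Schema-aware regex engine for patterns that use features Python's re does not share (for example, character class subtraction).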
We might also want to complicate it a bit more by making "textDomain" have the
following content model (rather than PCDATA):
<!ELEMENT textDomain (definition, pattern*, source?)>
This would allow us to define what is intended by the value space (e.g., values
represent a US postal code), have multiple patterns that are OR'ed (simplifying
regexp syntax), and define a source like in enumeratedDomains. I think the
definition is worthwhile.
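As an illustration of the extended content model, the sketch below reads a textDomain with a definition, several OR'ed patterns, and a source, then validates values against it. Everything in the sample instance is made up for illustration (the postal code patterns, the source text, and the helper names load_text_domains and is_valid). It also assumes that a textDomain with no pattern children accepts any string, by analogy with the empty-element rule above, and that source holds plain text.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical instance following (definition, pattern*, source?).
SAMPLE = """
<attributeDomain>
  <textDomain>
    <definition>US postal code (ZIP or ZIP+4)</definition>
    <pattern>[0-9]{5}</pattern>
    <pattern>[0-9]{5}-[0-9]{4}</pattern>
    <source>hypothetical source label</source>
  </textDomain>
</attributeDomain>
"""

def load_text_domains(xml_text):
    """Collect definition, patterns, and source from each textDomain."""
    root = ET.fromstring(xml_text.strip())
    domains = []
    for td in root.findall("textDomain"):
        patterns = [p.text or "" for p in td.findall("pattern")]
        domains.append({
            "definition": td.findtext("definition", default=""),
            # Assumption: no pattern children means any string is allowed.
            "patterns": patterns or [".*"],
            "source": td.findtext("source"),  # None when source is omitted
        })
    return domains

def is_valid(value, domains):
    """A value is valid if it fully matches any pattern of any domain."""
    return any(
        re.fullmatch(p, value) is not None
        for d in domains
        for p in d["patterns"]
    )

domains = load_text_domains(SAMPLE)
print(domains[0]["definition"])
print(is_valid("93101", domains), is_valid("9310", domains))
```

Keeping each alternative in its own pattern element avoids one long alternation expression, which is the simplification the proposal is after.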
Updated by Matt Jones about 23 years ago
Completed as the extended proposal describes, including the internal structure
for textDomain that allows a definition, repeating patterns, and a source.
<!ELEMENT textDomain (definition, pattern*, source?)>
DONE.