EML: Issueshttps://projects.ecoinformatics.org/ecoinfo/https://projects.ecoinformatics.org/ecoinfo/ecoinfo/favicon.ico?14691340362002-10-29T19:07:47ZEcoinformatics Redmine
Redmine Bug #659 (Resolved): release tasks for eml-2.0.0rc3https://projects.ecoinformatics.org/ecoinfo/issues/6592002-10-29T19:07:47ZMatt Jonesjones@nceas.ucsb.edu
<p>This is a tracking bug for tasks that need to be done for the eml-2.0.0 release.<br /> Add any housekeeping, non-controversial tasks that need to be done here.</p>
<p>1) Update all namespace names from eml-2.0.0rc2 to eml-2.0.0<br />2) Finalize bug list in README<br />3) Validate that all tests are passed<br />4) Review documentation on id/references rules in the spec</p> Bug #656 (Resolved): physical should be repeatablehttps://projects.ecoinformatics.org/ecoinfo/issues/6562002-10-28T22:07:12ZMatt Jonesjones@nceas.ucsb.edu
<p>Point from Barbara Benson (NTL):</p>
<p>The physical element in EntityGroup should be repeatable to allow individuals to<br />describe an entity that is available in multiple physical formats (e.g., text,<br />Excel, Oracle). See the email to eml-dev with subject "more comments on<br />eml2rc2" for more details.</p> Bug #655 (Resolved): need better model for numeric domains for attributeshttps://projects.ecoinformatics.org/ecoinfo/issues/6552002-10-25T18:02:46ZMatt Jonesjones@nceas.ucsb.edu
<p>We've found another problem with attributeDomain that needs to be fixed for the<br />EML 2 release. Currently, the numericDomain subtype does not indicate which<br />number type is intended, and so some legitimate numeric domains are not<br />expressible in EML. This is a fatal flaw in the model, especially if domain is<br />required and people can't describe their domains. For example, right now, one<br />cannot express a domain that only includes the positive real numbers.</p>
<p>To fix this, I propose that we change the content model of numericDomain to the<br />following:</p>
<p>numericDomain (numberType, (minimum|minimumExclusive)?, (maximum|maximumExclusive)?)<br />numberType (#PCDATA) and is a choice of the following enumeration:<br /> natural, whole, integer, real</p>
<p>One might argue that the distinction between rational and irrational is needed<br />(but I think not), so we might consider adding "rational" and "irrational" to<br />the list (which together make real numbers). But I don't think irrational<br />numbers are relevant because they can't actually be written down except<br />symbolically (e.g., pi). See <a class="external" href="http://www.purplemath.com/modules/numtypes.htm">http://www.purplemath.com/modules/numtypes.htm</a> for<br />a summary of these number types.</p>
<p>Under this new system, someone who wanted to express a positive integral number<br />that was less than or equal to 10 could say:<br /> <numericDomain><br /> <numberType>whole</numberType><br /> <maximum>10</maximum><br /> </numericDomain></p>
<p>Under this new system, someone who wanted to express a positive fractional<br />number that was less than 10 could say:<br /> <numericDomain><br /> <numberType>real</numberType><br /> <minimumExclusive>0</minimumExclusive><br /> <maximumExclusive>10</maximumExclusive><br /> </numericDomain></p>
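<p>For illustration only, the proposed content model could be written in XML Schema roughly as below. This is a sketch, not the agreed schema: the xs:decimal type for the bounds and the missing target namespace are assumptions.</p>
<pre><code>&lt;xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"&gt;
  &lt;xs:element name="numericDomain"&gt;
    &lt;xs:complexType&gt;
      &lt;xs:sequence&gt;
        &lt;!-- required: which kind of number the domain contains --&gt;
        &lt;xs:element name="numberType"&gt;
          &lt;xs:simpleType&gt;
            &lt;xs:restriction base="xs:string"&gt;
              &lt;xs:enumeration value="natural"/&gt;
              &lt;xs:enumeration value="whole"/&gt;
              &lt;xs:enumeration value="integer"/&gt;
              &lt;xs:enumeration value="real"/&gt;
            &lt;/xs:restriction&gt;
          &lt;/xs:simpleType&gt;
        &lt;/xs:element&gt;
        &lt;!-- optional lower bound, inclusive or exclusive (type is an assumption) --&gt;
        &lt;xs:choice minOccurs="0"&gt;
          &lt;xs:element name="minimum" type="xs:decimal"/&gt;
          &lt;xs:element name="minimumExclusive" type="xs:decimal"/&gt;
        &lt;/xs:choice&gt;
        &lt;!-- optional upper bound, inclusive or exclusive (type is an assumption) --&gt;
        &lt;xs:choice minOccurs="0"&gt;
          &lt;xs:element name="maximum" type="xs:decimal"/&gt;
          &lt;xs:element name="maximumExclusive" type="xs:decimal"/&gt;
        &lt;/xs:choice&gt;
      &lt;/xs:sequence&gt;
    &lt;/xs:complexType&gt;
  &lt;/xs:element&gt;
&lt;/xs:schema&gt;</code></pre>
<p>Both examples above would validate against this sketch, since each bound is an optional choice between its inclusive and exclusive form.</p>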
<p>Thanks for the feedback.</p> Bug #654 (Resolved): scope of the unit elementhttps://projects.ecoinformatics.org/ecoinfo/issues/6542002-10-25T17:02:33ZPeter McCartneypeter.mccartney@asu.edu
<p>Discussion of stmml has revealed several issues, one of which is the fact that <br />units, as expressed by stmml, are applicable only to measurable quantities. <br />Many variables that ecologists put in an eml dataset and might intuitively <br />appear to have "units" (geologic age, sex, or species names, for example) do <br />not have units and thus must be declared "dimensionless" or "undefined" <br />because unit is required for all attributes. I don't think it's intuitively <br />apparent to users that these are domains, not units, and that they should be <br />described as such.</p>
<p>In the required element &lt;measurementScale&gt; we class all attributes as nominal, <br />ordinal, interval or ratio. Strictly speaking only interval scales have units, <br />the rest are dimensionless. In practice, there is still some value in knowing <br />the units of the denominator and/or numerator in ratios of two dimensions, so <br />we probably don't want to throw out the baby with the bath water there.</p>
<p>To help clarify this, we might consider merging units within measurementScale <br />so that things may be set required when relevant. An example might be:</p>
<p><measurementScale><br /> <interval><br /> <standardUnit><br /> metersPerSecond<br /> </standardUnit><br /> </interval><br /></measurementScale><br />A variant that does away with embedding custom units in additionalMetadata would be:<br /><measurementScale><br /> <interval><br /> <unit library="http://ecoinformatics.org/emlUnitDictionary.xml"><br /> metersPerSecond<br /> </unit><br /> </interval><br /></measurementScale></p>
<p>This would mean any custom unit definitions would need to be published online.</p>
<p>Content model for measurement scale might look like:<br />element measurementScale(nominal | ordinal | interval | ratio)<br />element nominal <br />element ordinal<br />element interval (unit)<br />element ratio (I'm not sure what would go here - it seems like we're hacking <br />unit definitions in emlUnitDictionary for ratios already but maybe that should <br />be pulled out and we provide a structured ratio definition here that references <br />two (or more?) true dimensions)</p>
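<p>As a rough XML Schema sketch of that content model (an illustration under assumptions, not a settled design): nominal and ordinal are shown as empty elements, the unit follows the variant with a library attribute, and ratio is left as an open placeholder because its content is still undecided.</p>
<pre><code>&lt;xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"&gt;
  &lt;xs:element name="measurementScale"&gt;
    &lt;xs:complexType&gt;
      &lt;!-- exactly one of the four scales --&gt;
      &lt;xs:choice&gt;
        &lt;!-- assumed empty: no unit applies --&gt;
        &lt;xs:element name="nominal"&gt;&lt;xs:complexType/&gt;&lt;/xs:element&gt;
        &lt;xs:element name="ordinal"&gt;&lt;xs:complexType/&gt;&lt;/xs:element&gt;
        &lt;!-- unit becomes required only where it is relevant --&gt;
        &lt;xs:element name="interval"&gt;
          &lt;xs:complexType&gt;
            &lt;xs:sequence&gt;
              &lt;xs:element name="unit"&gt;
                &lt;xs:complexType&gt;
                  &lt;xs:simpleContent&gt;
                    &lt;xs:extension base="xs:string"&gt;
                      &lt;!-- URL of the published unit dictionary --&gt;
                      &lt;xs:attribute name="library" type="xs:anyURI"/&gt;
                    &lt;/xs:extension&gt;
                  &lt;/xs:simpleContent&gt;
                &lt;/xs:complexType&gt;
              &lt;/xs:element&gt;
            &lt;/xs:sequence&gt;
          &lt;/xs:complexType&gt;
        &lt;/xs:element&gt;
        &lt;!-- placeholder: content model for ratio is not yet defined --&gt;
        &lt;xs:element name="ratio"/&gt;
      &lt;/xs:choice&gt;
    &lt;/xs:complexType&gt;
  &lt;/xs:element&gt;
&lt;/xs:schema&gt;</code></pre>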
<p>All attributes would still have a domain element - the existing bug on that <br />still applies.</p> Bug #638 (Resolved): request for id/ref in attributeDomainhttps://projects.ecoinformatics.org/ecoinfo/issues/6382002-10-17T22:00:38ZMatt Jonesjones@nceas.ucsb.edu
<p>Barbara Benson representing NTL LTER site made the following request:</p>
<blockquote>
<p>3) Here is a suggested addition to EML2. When describing an <br />enumerated domain, the same code/definition pairs are used repeatedly in <br />an eml document. Could an id and reference be allowed in the attribute <br />domains?</p>
</blockquote>
<p>My feeling is that this is an excellent idea. I think this would be extremely<br />common, and we should accommodate her request. The instance docs would largely<br />be the same except for the possibility of having to deal with a reference<br />substitution.</p>
<p>Below is an example snippet that Barbara provided as they would like to use this<br />feature at NTL. Note that they reference the "enumeratedDomain" field, whereas<br />I think we should be referencing the "attributeDomain" field instead so that any<br />of the domain types can be shared. It would make domain comparisons among<br />attributes very easy. Here's her example:</p>
<pre><code>&lt;attribute&gt;
  &lt;attributeName&gt;FLAG_AVG_AIR_TEMP&lt;/attributeName&gt;
  &lt;attributeDefinition&gt;data flag for air temperature&lt;/attributeDefinition&gt;
  &lt;unit&gt;
    &lt;customUnit&gt;dimensionless&lt;/customUnit&gt;
  &lt;/unit&gt;
  &lt;measurementScale&gt;nominal&lt;/measurementScale&gt;
  &lt;attributeDomain&gt;
    &lt;enumeratedDomain id="enum.METFLAG" scope="document"&gt;
      &lt;codeList&gt;
        &lt;code&gt;A&lt;/code&gt;
        &lt;definition&gt;Data logger off&lt;/definition&gt;
        &lt;code&gt;A?&lt;/code&gt;
        &lt;definition&gt;Data logger off an indeterminate number of hours&lt;/definition&gt;
        &lt;code&gt;An&lt;/code&gt;
        &lt;definition&gt;Data logger off n hours; daily averages may be in error&lt;/definition&gt;
        &lt;code&gt;B&lt;/code&gt;
        &lt;definition&gt;Data logger off&lt;/definition&gt;
        &lt;code&gt;C&lt;/code&gt;
        &lt;definition&gt;Sensor off; missing value reported&lt;/definition&gt;
        &lt;code&gt;D&lt;/code&gt;
        &lt;definition&gt;Sensor malfunction produced bad values; values set to missing;&lt;/definition&gt;
        &lt;code&gt;E&lt;/code&gt;
        &lt;definition&gt;Sensor noisy; values of uncertain quality&lt;/definition&gt;
        &lt;code&gt;F&lt;/code&gt;
        &lt;definition&gt;Sensor calibration error; correction factors have been applied.&lt;/definition&gt;
        &lt;code&gt;G&lt;/code&gt;
        &lt;definition&gt;Sensor error; estimate made based on hourly averages.&lt;/definition&gt;
        &lt;code&gt;H&lt;/code&gt;
        &lt;definition&gt;Data suspect; values outside of expected range&lt;/definition&gt;
        &lt;code&gt;I&lt;/code&gt;
        &lt;definition&gt;Estimated from combining more than one record for the day&lt;/definition&gt;
        &lt;code&gt;J&lt;/code&gt;
        &lt;definition&gt;Estimated from another met station.&lt;/definition&gt;
        &lt;code&gt;K&lt;/code&gt;
        &lt;definition&gt;Sensor malfunction produced bad values: data of limited use.&lt;/definition&gt;
        &lt;code&gt;L&lt;/code&gt;
        &lt;definition&gt;Non standard routine followed&lt;/definition&gt;
      &lt;/codeList&gt;
    &lt;/enumeratedDomain&gt;
  &lt;/attributeDomain&gt;
&lt;/attribute&gt;
&lt;attribute&gt;
  &lt;attributeName&gt;FLAG_AVG_DEWPOINT_TEMP&lt;/attributeName&gt;
  &lt;attributeDefinition&gt;data flag for dew point temperature&lt;/attributeDefinition&gt;
  &lt;unit&gt;
    &lt;customUnit&gt;dimensionless&lt;/customUnit&gt;
  &lt;/unit&gt;
  &lt;measurementScale&gt;nominal&lt;/measurementScale&gt;
  &lt;attributeDomain&gt;&lt;enumeratedDomain&gt;
    &lt;references&gt;enum.METFLAG&lt;/references&gt;
  &lt;/enumeratedDomain&gt;&lt;/attributeDomain&gt;
&lt;/attribute&gt;</code></pre> Bug #637 (Resolved): attributeDomain should be requiredhttps://projects.ecoinformatics.org/ecoinfo/issues/6372002-10-17T21:48:30ZMatt Jonesjones@nceas.ucsb.edu
<p>The RC2 release shows attribute/attributeDomain as an optional element. This<br />used to be required, and as far as I knew we agreed that it should be required.<br /> It is a problem if it is optional, as people can leave out this truly<br />fundamental part of an attribute definition. Does anybody remember consciously<br />changing this? Can I change it back?</p>
<p>I'm reviewing an EML submission from an LTER site and they have omitted it for<br />all of their numeric attributes, which is clearly a problem! They also<br />consistently omit precision, which is also a problem, but I don't think it can<br />be required because it doesn't apply to nominal data.</p> Bug #634 (Resolved): Documentation of reference elements in the schemashttps://projects.ecoinformatics.org/ecoinfo/issues/6342002-10-17T20:30:13ZDavid Blankmandblankman@lternet.edu
<p>I can't remember what we decided on documenting "references" elements inside the<br />schemas. They are currently not documented.</p> Bug #632 (Resolved): broken link in faqhttps://projects.ecoinformatics.org/ecoinfo/issues/6322002-10-17T18:50:12ZChad Berkleyberkley@nceas.ucsb.edu
<p>There is a broken link in the FAQ. It is in item 4 "formal specification" and<br />it should be linked to /software/eml/eml-2.0.0rc2/index.html but instead it is<br />linked to /software/eml/eml-2.0.0rc2/eml-docbook.html.</p> Bug #629 (Resolved): unit conversion coefficients need checkinghttps://projects.ecoinformatics.org/ecoinfo/issues/6292002-10-14T20:52:09ZChad Berkleyberkley@nceas.ucsb.edu
<p>The multipliers in the eml-unitDictionary.xml file need to be checked for each<br />unit in the file. There is also one unit, nanomolesPerGramPerSecond, that does<br />not have a multiplier but needs to have one.</p> Bug #628 (Resolved): eml-physical has lit: rather than cit: referenceshttps://projects.ecoinformatics.org/ecoinfo/issues/6282002-10-14T16:59:34ZDavid Blankmandblankman@lternet.edu
<p>I just noticed that eml-physical references citations with lit: rather than cit:<br />as is the case with the other modules.</p> Bug #627 (Resolved): links broken in EML FAQhttps://projects.ecoinformatics.org/ecoinfo/issues/6272002-10-14T16:46:17ZMatt Jonesjones@nceas.ucsb.edu
<p>The EML FAQ has two broken links:</p>
<pre><code>1) Pointing to the EML project members list<br /> 2) Pointing to the EML spec itself</code></pre>
<p>Need to fix these.</p> Bug #626 (Resolved): ProcedureStepType schema needs revision to protect sequence of descriptions.https://projects.ecoinformatics.org/ecoinfo/issues/6262002-10-11T19:34:12ZTim Bergsmatbergsma@kbs.msu.edu
<p>As of RC2, the schema for ProcedureStepType is such that protocol replaces, <br />rather than qualifies, description. I think the choice of (description or <br />(citation or protocol)) should be a sequence of description (one), citation <br />(zero to many) and protocol (zero to many); or something to that effect.</p> Bug #625 (Resolved): Cardinality regarding eml-methods should be corrected in two places.https://projects.ecoinformatics.org/ecoinfo/issues/6252002-10-09T19:48:41ZTim Bergsmatbergsma@kbs.msu.edu
<p>eml-dataset has an optional child element methods with the cardinality zero-or-<br />ONE. It should be zero-to-MANY. A dataset may be qualified by more than one <br />corpus of methodological effort, and it cannot be assumed that those efforts <br />form a single logically and temporally continuous sequence of methodStep. <br />Also, methods itself should be a SEQUENCE, not a CHOICE, of methodStep (itself <br />repeatable), sampling (optional), or qualityControl (optional). It is an error <br />to present a choice of optional elements.</p> Bug #624 (Resolved): eml-methods/methodsType needs clarification on choice/sequencehttps://projects.ecoinformatics.org/ecoinfo/issues/6242002-10-09T19:30:49ZDavid Blankmandblankman@lternet.edu
<p>The current model for methodsType is a repeatable choice of methodSteps (1 or<br />more) or sampling (0 or 1) or qualityControl (0 or more). This should be either:</p>
<p>1. a repeatable choice of methodStep or qualityControl or sampling</p>
<p>or</p>
<p>2. Sequence of methodSteps (1 or more), sampling (0 or 1?), qualityControl (0 or<br />more?).</p>
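<p>To make the difference concrete, the two options could be sketched in XML Schema roughly as follows. The type names MethodsTypeOption1 and MethodsTypeOption2 are hypothetical, the child element types are left open, and the cardinalities in option 2 follow the tentative numbers given above.</p>
<pre><code>&lt;xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"&gt;

  &lt;!-- Option 1: repeatable choice; children may appear in any order and number --&gt;
  &lt;xs:complexType name="MethodsTypeOption1"&gt;
    &lt;xs:choice maxOccurs="unbounded"&gt;
      &lt;xs:element name="methodStep"/&gt;
      &lt;xs:element name="qualityControl"/&gt;
      &lt;xs:element name="sampling"/&gt;
    &lt;/xs:choice&gt;
  &lt;/xs:complexType&gt;

  &lt;!-- Option 2: fixed order with per-element cardinality --&gt;
  &lt;xs:complexType name="MethodsTypeOption2"&gt;
    &lt;xs:sequence&gt;
      &lt;xs:element name="methodStep" maxOccurs="unbounded"/&gt;
      &lt;xs:element name="sampling" minOccurs="0"/&gt;
      &lt;xs:element name="qualityControl" minOccurs="0" maxOccurs="unbounded"/&gt;
    &lt;/xs:sequence&gt;
  &lt;/xs:complexType&gt;

&lt;/xs:schema&gt;</code></pre>
<p>Note the trade-off: a repeatable choice cannot by itself require at least one methodStep or limit sampling to a single occurrence, while the sequence enforces both but fixes the element order.</p>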
<p>We need to get some agreement on which model to use.</p>
<p>I recommend that we go with option 1.</p> Bug #471 (Resolved): eml spec overview documenthttps://projects.ecoinformatics.org/ecoinfo/issues/4712002-04-16T05:41:08ZMatt Jonesjones@nceas.ucsb.edu
<p>Need an overview document that gives the background and rationale for eml. This<br />would likely have both normative and non-normative sections. Would include an<br />overview of the structure of EML and the rationale for that structure, and its<br />intended use. Describes packaging and triples in detail. Probably would have<br />a normative appendix that defines the semantics of every field, which could be<br />auto-generated from the XSD source documentation.</p>
<p>Chris -- you started an outline for this. Can you recreate it here or in a<br />document in CVS?</p>
<p>We should have this for the 2.0.0 release but will not likely have it for the<br />beta7 release.</p>