For b) I suggest the following new text (which still may be too long):
Nominal: unordered categories or text strings (e.g., Male)
Ordinal: ordered categories (e.g., Low, High)
Interval: values from a scale with equidistant points (e.g., 12.2 meters)
Ratio: interval scale with a meaningful zero point (e.g., 273 Kelvin)
Date-time: date or time values from the Gregorian calendar (e.g., 2002-10-14)
For the help box:
Help on Choosing a Measurement Scale
------------------------------------
The concept of a measurement scale as defined by Stevens is useful for
classifying data, despite weaknesses in the approach that several
practitioners have pointed out. In particular, the classification tells us
which mathematical operations are appropriate for a given set of data and
which types of metadata are needed to describe it. For example, categorical
data never have a "unit" of measurement.
Here is a brief overview of the measurement scales we have employed in EML.
They are based on Stevens's original typology, with the addition of "Date-Time"
for purely pragmatic reasons (we need to distinguish date-time values in order
to collect certain essential metadata about date and time representation).
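
As a rough illustration of how the classification drives both the allowed
operations and the metadata we need to collect, here is a minimal Python
sketch. The names (Scale, MEANINGFUL_OPERATIONS, REQUIRED_METADATA,
is_meaningful) are hypothetical and are not part of EML. The assumptions
that interval and ratio data need a "unit", and that date-time values
support differences, are simplifications made for this example.

```python
from enum import Enum


class Scale(Enum):
    """Measurement scales from the overview (labels are illustrative)."""
    NOMINAL = "nominal"
    ORDINAL = "ordinal"
    INTERVAL = "interval"
    RATIO = "ratio"
    DATETIME = "date-time"


# Comparisons that are meaningful on each scale, following Stevens's typology.
# The DATETIME entry is an assumption: treated as interval over short spans.
MEANINGFUL_OPERATIONS = {
    Scale.NOMINAL: {"equality"},
    Scale.ORDINAL: {"equality", "order"},
    Scale.INTERVAL: {"equality", "order", "difference"},
    Scale.RATIO: {"equality", "order", "difference", "ratio"},
    Scale.DATETIME: {"equality", "order", "difference"},
}

# Hypothetical metadata requirements for this sketch, not the EML schema.
REQUIRED_METADATA = {
    Scale.NOMINAL: set(),      # categorical data never have a unit
    Scale.ORDINAL: set(),
    Scale.INTERVAL: {"unit"},
    Scale.RATIO: {"unit"},
    Scale.DATETIME: {"format"},
}


def is_meaningful(scale: Scale, operation: str) -> bool:
    """Return True if the named operation makes sense on the given scale."""
    return operation in MEANINGFUL_OPERATIONS[scale]


print(is_meaningful(Scale.INTERVAL, "ratio"))
print(is_meaningful(Scale.RATIO, "ratio"))
```

Each scale in the first table keeps every operation of the scale before it
and adds one more, which mirrors the statements below that ordinal values
are also nominal and interval values are also ordinal.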
NOMINAL
The nominal scale places values into named categories. The different values
within a set are unordered. Some examples of nominal scales include
gender (Male/Female) and marital status (single/married/divorced). Text
fields should be classified as nominal.
ORDINAL
The ordinal scale places values in a set order. All ordinal values are also
nominal. Ordinal data show a particular value's position relative to other
values, such as "low", "medium", and "high". The ordinal scale doesn't
indicate the distance between items.
INTERVAL
The interval scale uses equal-sized units of measurement, so the distance
between adjacent points on the scale is constant. It therefore allows the
comparison of the differences between two values on the scale. With interval
data, the allowable values start from an arbitrary point (not a meaningful
zero), and so there is no concept of 'zero' of the measured quantity.
Consequently, ratios of interval values are not meaningful. For example, one
cannot infer that someone who scores 80 on an ecology test knows twice as
much ecology as someone who scores 40 on the test, or that an object at 40
degrees C has twice the kinetic energy of an object at 20 degrees C. All
interval values are also ordered and therefore are ordinal scale values as
well.
RATIO
The ratio scale is an interval scale with a meaningful zero point: a true
zero that represents an absolute lack of the quantity being measured. Thus,
ratios of values are meaningful.
For example, an object at an elevation of 100 meters above sea level is
twice as high as an object at an elevation of 50 meters above sea level
(where sea level is the zero point). Also, an object at 300 kelvin has three
times the kinetic energy of an object at 100 kelvin (where absolute zero, no
motion, defines the zero point of the Kelvin scale). Interval values can
often be converted to ratio values in order to make ratio comparisons
legitimate. For example, an object at 40 degrees C is at 313.15 kelvin, an
object at 20 degrees C is at 293.15 kelvin, and so the first object has
approximately 1.07 times the kinetic energy of the second (note the wrong
answer, 2, that you would have gotten had you taken the ratio of the values
in Celsius).
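
The Celsius-to-Kelvin comparison can be checked with a few lines of Python.
This is only a sketch of the arithmetic in the paragraph above. The function
name celsius_to_kelvin is made up for this example.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert an interval-scale Celsius value to a ratio-scale Kelvin value."""
    return celsius + 273.15


warm_c, cool_c = 40.0, 20.0

# Ratio taken directly on the interval scale: not physically meaningful.
naive_ratio = warm_c / cool_c

# Ratio taken after converting to a scale with a true zero point.
kelvin_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(f"Celsius ratio: {naive_ratio:.2f}")
print(f"Kelvin ratio:  {kelvin_ratio:.2f}")
```

The difference between the two values is 20 on either scale, which is why
differences are safe on the interval scale while ratios are not.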
DATE-TIME
Date and time values in the Gregorian calendar are awkward to use in
calculations because they have properties of both interval and ratio
scales. They also have some properties that do not conform to the
interval scale because of the adjustments (such as leap years and leap
seconds) that are made to keep the calendar aligned with the Earth's
rotation and its orbit around the Sun.
While the Gregorian calendar has a meaningful zero point, it would be
difficult to say that a value taken at midnight on January 1, 1000 is twice
as old as a value taken at midnight on January 1, 2000, because in practice
the scale has many irregularities in length. However, over short intervals
the scale has equidistant points based on the SI second, and so can be
considered interval for many purposes, especially with respect to
measuring the timing of short-term ecological events. Date and time values
can be represented using several distinct notations, and so we have
distinct metadata needs in terms of specifying the format of the value
representation. Because of these pragmatic issues, we separated Date-time
into its own measurement scale. Examples of date-time values are
'2003-05-05', '1999/10/10', and '2001-10-10T14:23:20.3'.
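
To show why the format has to be recorded as metadata, here is a small
Python sketch that parses the three example values above. Each one needs its
own format string before it can be interpreted. The format codes are those
of Python's datetime.strptime and are used here only for illustration. They
are not a proposal for how EML should express formats.

```python
from datetime import datetime

# Each example value is paired with the format needed to interpret it.
examples = [
    ("2003-05-05", "%Y-%m-%d"),
    ("1999/10/10", "%Y/%m/%d"),
    ("2001-10-10T14:23:20.3", "%Y-%m-%dT%H:%M:%S.%f"),
]

parsed = [datetime.strptime(text, fmt) for text, fmt in examples]

for (text, fmt), value in zip(examples, parsed):
    print(f"{text!r} read with {fmt!r} -> {value.isoformat()}")

# Once parsed, differences between values behave like interval data.
elapsed = parsed[0] - parsed[2]
print(f"Days between the third and first example: {elapsed.days}")
```

Without the accompanying format, a value such as '1999/10/10' cannot be
distinguished reliably from other year, month, and day orderings.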