First NC demonstration data
Project #41 –
1. vegPlotParty
In 41NCVS_DB you will find a table called party
Field name | Data type | Plots table | Field in Plots table
Party | PK | vegPlotParty | Party_ID ?
FullName | text | | ignore
SurName | text | vegPlotParty | surName
GivenName | text | vegPlotParty | givenName
OrganizationAtPart | text | vegPlotParty | OrganizationName (see note 1.1 below)
Unsure | logical | | ignore
AltName | text | | See note 1.2 below
Email | text | email | emailAddress
MailingList | | | ignore
LastYearPart | | | ignore
Notes:
1.1 Note that a person's organization changes during a career. Thus, we cannot have only one organization per person. Instead, we need to eliminate organizationName from vegPlotParty and insert organizationName in plotContributor, projectContributor, communityUsage, taxonInterpretation and citationContributor. In my example data, those person-organization entries marked with # indicate the few cases where I know there should be more than one organization, but where I have not yet fixed the database to take care of the problem.
1.2 Note that I include the field AltName. We seem to have overlooked the possibility of a person having multiple names, such as maiden name and married name. An example in my data is where Felicien Ntawukuliryayo decided his life would be simpler if he were to pick a name at random from the phone book, which he did and became Greg Meyer. How should we model this? I propose we have a new field called PARENT containing the PK of the currently accepted name.
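One possible reading of the PARENT proposal in note 1.2, as a minimal Python sketch. The IDs and the dictionary layout are made up for illustration; only the PARENT idea and the two names come from the note.

```python
# Hypothetical party rows keyed by primary key. PARENT holds the PK of the
# currently accepted name, or None when the row itself is the accepted name.
parties = {
    1: {"givenName": "Greg", "surName": "Meyer", "PARENT": None},
    2: {"givenName": "Felicien", "surName": "Ntawukuliryayo", "PARENT": 1},
}


def accepted_party_id(party_id, rows):
    """Follow PARENT links until reaching the row with the accepted name."""
    seen = set()
    while rows[party_id]["PARENT"] is not None:
        if party_id in seen:
            raise ValueError("cycle in PARENT links")
        seen.add(party_id)
        party_id = rows[party_id]["PARENT"]
    return party_id
```

With this layout, any plot that references either name can be resolved to the same person by calling accepted_party_id first.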
2. roleCode
Notes:
2.1 In my example data, I have roleCodes denormalized, but the ones to expect are “Honcho” & “Assistant”. We need to set the list of legit role codes for the database, but for now translate “Honcho” into “Plot leader” and “Assistant” into “Plot assistant”. Meanwhile, I do not see an entity for roleCode in the URL; perhaps it is time to create one?
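The interim translation in note 2.1 could be held in a small lookup, sketched here in Python. The function name and the choice to reject unknown codes are assumptions, not part of the database design.

```python
# Interim mapping from the role codes in the example data to database roles.
ROLE_TRANSLATION = {
    "Honcho": "Plot leader",
    "Assistant": "Plot assistant",
}


def translate_role(raw_code):
    """Return the database role for a raw role code; fail on unknown codes."""
    try:
        return ROLE_TRANSLATION[raw_code.strip()]
    except KeyError:
        raise ValueError(f"unrecognized role code: {raw_code!r}")
```

Failing loudly on an unknown code is a design choice for the sketch: it surfaces new role codes instead of loading them silently.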
3. citation
For now I have only three references that need to be cited, as follows:
Entry #1
citAuthors: Robert K. Peet, Thomas R. Wentworth & Peter S. White
citTitle: A flexible, multipurpose method of recording vegetation composition and structure
citPubDate: 1998
citSeriesName: Castanea
citIssueIdentification: 63(3)
citPage: 262-274
Entry #2
citAuthors: J. Braun-Blanquet
citTitle: Pflanzensoziologie: Grundzüge der Vegetationskunde.
citPubDate: 1928
citSeriesName: Biologische Studienbücher 7
citPublisher: Springer
citLocation: Berlin
Entry #3
citAuthors: The Association for Biodiversity Information
citTitle: Ecology Access Reporting Tool. International classification of ecological communities. Natural Heritage Central Databases.
citEdition: 2.0
citEditionDate: September 13, 2000
citPage:
citPublisher: The Association for Biodiversity Information
citLocation: Arlington, Virginia
Notes:
3.1 At present we pull the author name out of citation and place it in citationContributor. I think this is a mistake. Librarians and other bibliographically compulsive people want the exact spelling of author name(s) as in the text. We do not allow for this when we reference back to vegPlotParty (unless we want to track all spellings and abbreviations).
3.2 Note that I have adjusted some of the other fields of citation to fit present needs.
4. sampleMethod
sampleMethodName: Carolina Vegetation Survey
sampleMethodDescription: This wonderful but complicated method is reasonably well described in the reference. More detail will be added to the database as time permits.
CITATION_ID: Peet et al. 1998 (FK=1 in citation?)
5. coverMethod
Entry #1: coverType = Braun-Blanquet; citation_FK=Braun-Blanquet 1928 (2)
Entry #2: coverType = Carolina Vegetation Survey; citation_FK=Peet et al. 1998 (1)
6. coverIndex
Below I provide entries for coverIndex for two scales. The fields are CoverMethod_ID (BB=1, CVS=2?), indexCode, lowerLimit, upperLimit, indexDescription.
BB r 0.01 0.05 One or a few individuals
BB + 0.05 0.1 Occasional and < 5%
BB 2 5 25 very abundant & <5%, or 5-25%
BB 3 25 50 25-50%
BB 4 50 75 50-75%
BB 5 75 100 75-100%
CVS 1 + 0.1 Trace
CVS 2 0.1 1 0-1%
CVS 3 1 2 1-2%
CVS 4 2 5 2-5%
CVS 5 5 10 5-10%
CVS 6 10 25 10-25%
CVS 7 25 50 25-50%
CVS 8 50 75 50-75%
CVS 9 75 95 75-95%
CVS 10 95 100 95-100%
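To show how the limits above would be used, here is a rough Python sketch that maps a percent cover to a CVS indexCode. The upper limits are copied from the CVS rows. Treating each upper limit as inclusive, and treating class 1 as "any cover above zero up to 0.1", are assumptions the table does not settle.

```python
from bisect import bisect_left

# (indexCode, upperLimit in percent) for the CVS scale, from the rows above.
CVS_CLASSES = [(1, 0.1), (2, 1), (3, 2), (4, 5), (5, 10),
               (6, 25), (7, 50), (8, 75), (9, 95), (10, 100)]
CVS_UPPER = [upper for _, upper in CVS_CLASSES]


def cvs_index_code(cover_percent):
    """Map a percent cover to a CVS indexCode.

    Assumption: a value exactly on a class boundary goes to the lower
    class (upper limits are inclusive).
    """
    if not 0 < cover_percent <= 100:
        raise ValueError("cover must be greater than 0 and at most 100")
    return CVS_CLASSES[bisect_left(CVS_UPPER, cover_percent)][0]
```

The same approach would work for the BB scale once its rows are complete.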
7. project
Below I provide details for one project. For now you can ignore the other projects listed in the project table of the attached Access database.
projectName: Amphibolite Mountains of Ashe, Watauga & Allegheny Counties, North Carolina
projectDescription: Annual sampling PULSE of the North Carolina Vegetation Survey.
projectStartDate: July 10, 1999
projectEndDate: July 18, 1999
8. stratumType
Below I populate stratumType with 5 entries for the fields stratumName and stratumDescription.
NCVS-1 | Foliage <0.5m high
NCVS-2 | Foliage 0.5-6m high
NCVS-3 | Foliage 6-15m high
NCVS-4 | Foliage 15-35m high
NCVS-5 | Foliage >35m high
9. projectContributor
For this single project I provide two contributors with several new roleCodes (to be added to the roleCode entity). The ID for each contributor will need to be looked up in vegPlotParty.
Robert Peet | Data submitter
Robert Peet | Contact person
Robert Peet | Project coordinator
Michael Lee | Data manager
10. namedPlace (USGSQuad, county, state, country?)
In the Access table File1 you will find fields called Gen_loc, County, & USGSQuad. I have not provided a separate lookup table for you, though I could. Probably you want to port the Gen_loc into namedPlace. I suspect we should provide a full list of USGSQuads and Counties, probably by state. This will call for two new tables in the database, but I think we would be well advised to add them. I could provide these for the Carolinas, but do not currently have them for other states.
To make all this as simple as possible, we include USDA species codes, which uniquely identify taxa as used in the USDA PLANTS database. Thus, for the most part you need only use the species list that will become our database standard.
An unfortunate truth about vegetation data is that many taxa will be recognized that are not in the USDA list. Many are created specifically for the project. For example, in NC we sometimes refer to Carex rfb, where rfb = red fibrous base, which places the sedge in a set of about a dozen that are nearly impossible to tell apart without sex. In other cases we recognize taxa that USDA has not yet, in its infinite wisdom, seen fit to add to its database, and in still other cases we report taxa not yet published. Thus, the real-life data I send you have many taxa that will not be USDA compliant and will not have USDA codes.
To solve this problem we provide an ad hoc USDA code that starts with either the symbol “_” or “>”. All such species must be added to the master database before the plot data can be entered.
The Access table SppTable_small has a number of fields of potential value, but the main point is to allow you to capture those taxa that lack USDA codes and are thus not available in the federal database.
We also have NC codes and an NC_Std to tell the status of the taxon. The NC_Std codes are 1=valid name, 2=unambiguous change, 3=ambiguous change or invalid code, 4=low resolution NC code (below level of species), 5=species with valid infraspecific taxa not recognized, 6=infraspecific taxon, 7=non-vascular, 8=non-organic, 11=genus only.
Where we needed to create USDA codes we placed the NC code behind the “_” character. In some cases the USDA did not divide the taxa as finely as we did, such that resolution was lost in shifting to a USDA code, which is what the > symbol indicates.
For _ species, I recommend that you add all species that have an NC_Std code of 1, 4, 5, 6, 7, 8, or 11. Most of these will have full names written out (to the limit of our knowledge of them).
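The selection rule for ad hoc codes might be sketched in Python as follows. The function names and the sample codes in the usage note are made up for illustration; only the “_” and “>” prefixes and the list of NC_Std values come from the text above.

```python
# NC_Std values recommended above for adding "_" species to the master list.
NC_STD_TO_ADD = {1, 4, 5, 6, 7, 8, 11}


def is_ad_hoc_code(usda_code):
    """True when the code is an ad hoc one, i.e. starts with "_" or ">"."""
    return usda_code.startswith(("_", ">"))


def should_add_to_master(usda_code, nc_std):
    """Apply the recommendation for "_" species: add when NC_Std qualifies."""
    return usda_code.startswith("_") and nc_std in NC_STD_TO_ADD
```

For example, a hypothetical code "_CARFB" with NC_Std 4 would be added, while the same code with NC_Std 2 would not.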
An issue that no one has discussed yet is that some of these low-resolution taxa have meanings that change. We refer to Carex sp #1 for any unidentified species, and we use Carex sp #1 (van eerdeni) to refer always to the same taxon. How do we want to distinguish these in the database?
12. userDefined
I provide a bunch of soil data. Most of these variables will end up being user defined.
Expect two files: 41_text_2000.xls and 41_nutr_2000.xls. The first file will have soil texture data, and the second nutrient data. In both files the primary key is a composite of study-team-plot. Module is also part of the primary key, but is assumed = A. Below are the values to be put in the field userDefinedName of userDefined. Note that the field userDefinedCategory should read “Soil” for all of these (we need to define the universe of acceptable categories someday); you will figure out the userDefinedType (mostly numeric), and I will leave the userDefinedMetadata blank for the present except that you should enter “North Carolina Vegetation Survey Protocol”. Note also that each file will have values for both A & B horizon, and occasionally C horizon. B is often missing. You will need to use the Horizon variable to sort out which variable a field in the xls file needs to populate in definedValue.
[All of the following are to be repeated for both A horizon and B horizon, indicated by a trailing _letter.]
Clay_%
Silt_%
Sand_%
CEC
pH
Organic matter
N
S
P
Ca_ppm
Mg_ppm
K_ppm
Na_ppm
Ca_%
Mg_%
K_%
Na_%
Other_bases_%
H_%
B_ppm
Fe_ppm
Mn_ppm
Cu_ppm
Zn_ppm
Al_ppm
Bulk Density
Base_saturation_%
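As a rough illustration of the "trailing _letter" convention, the Python sketch below expands variable names into userDefined entries for each horizon. The exact name format (variable, underscore, horizon letter) is my reading of the bracketed instruction above, and the dictionary layout is an assumption.

```python
def user_defined_rows(variables, horizons=("A", "B")):
    """Build one userDefined entry per variable and horizon."""
    rows = []
    for variable in variables:
        for horizon in horizons:
            rows.append({
                # Assumed naming: variable name, underscore, horizon letter.
                "userDefinedName": f"{variable}_{horizon}",
                "userDefinedCategory": "Soil",
                "userDefinedMetadata": "North Carolina Vegetation Survey Protocol",
            })
    return rows
```

Calling user_defined_rows(["Clay_%", "pH"]) would yield entries named Clay_%_A, Clay_%_B, pH_A and pH_B.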
13. File1 = plot summary
Finally, real plot data to enter. In Access table File1 we find the following stuff.
Field name | Data type | Plots table | Field in Plots table
Project-Team-Plot | | plot | Compound primary key
Date | | plotObservation | ObsStartDate + obsEndDate
Gen_Loc | | plotPlace | Int of namedPlace & plot
State | | plot | State (FK to list?)
County | | plot | County (FK to list?)
Quad_Name | | plot | USGSquad (FK to list?)
UTM Zone | | | ignore
UTM_E | | | ignore
UTM_N | | | ignore
Datum | | | See note 13.1
Lat | | plot | origLat
long | | plot | origLong
MetersError | | plot | horizPosAccuracy
Ownership | | | ignore
Ares_Herbs | | plot | Times 100 = area (note 13.2)
Ares_Trees | | | See note 13.3
NumIntensives | | | See note 13.4
Depth | | | See note 13.5
NumPhotos | | | ignore
Notes | | | ignore
Notes:
13.1 Note that we need to either provide a field in plot for Datum, or we need to specify Datum. My preference is to require Datum because otherwise the user might overlook its importance. Please add for now, and populate with WGS84 for all my plots.
13.2 Note that area = Ares_Herbs multiplied by 100 (one are is 100 m2).
13.3 This variable will be important for the tree data, but I am going to wait on explaining that until I have more time.
13.4 This tells us how many nested 100 m2 plots will be inside of this larger plot, unless the number is 1, in which case the plot is the plot. The identification of the subplots will follow in subsequent files.
13.5 Depth is to be a user defined field associated with plotObservation.
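The File1 mappings and notes 13.1 and 13.2 could be combined roughly as in the Python sketch below. The input is assumed to be a dictionary keyed by the File1 field names. The output key "datum" is a placeholder for the field that note 13.1 asks to be added, and "area" follows the wording of note 13.2; neither is a confirmed field name.

```python
def plot_fields_from_file1(row):
    """Map one File1 record (dict keyed by File1 field names) to plot fields."""
    return {
        "origLat": row["Lat"],
        "origLong": row["long"],
        "horizPosAccuracy": row["MetersError"],
        # Note 13.2: area in m2 is Ares_Herbs times 100.
        "area": row["Ares_Herbs"] * 100,
        # Note 13.1: proposed Datum field, WGS84 for all of these plots.
        "datum": "WGS84",
    }
```

The coordinate values used to try this out would be made up; only the arithmetic and the fixed datum come from the notes.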
14. File2 = Site attributes
Field name | Data type | Plots table | Field in Plots table
Project-team-plot | | plot | Compound primary key
Elevation | | plot | ElevationValue
Slope | | plot | slopeGradient
Aspect | | plot | slopeAspect
Topo position | | | ignore
Landform class | | | ignore
Bryo/lichen | | | ignore
Decaying wood | | plotObservation | percentWood
Bedrock | | plotObservation | PercentRockGravel (see note 14.1)
Gravel | | | See 14.1
Sand | | plotObservation | percentSoil
Litter | | plotObservation | percentLitter
Water | | plotObservation | percentWater
Soil modules | | | ignore
Notes:
14.1 Note that Bedrock and Gravel should be summed to provide the entry for PercentRockGravel.
You can assume CVS as the sampleMethod referenced in plotObservation.
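A small Python sketch of the ground-cover mapping, including the sum called for in note 14.1. It assumes each File2 record arrives as a dictionary keyed by the File2 field names, with numeric percentages and no missing values.

```python
def ground_cover_fields(row):
    """Map File2 ground-cover columns to plotObservation fields."""
    return {
        "percentWood": row["Decaying wood"],
        # Note 14.1: Bedrock and Gravel are summed into one field.
        "PercentRockGravel": row["Bedrock"] + row["Gravel"],
        "percentSoil": row["Sand"],
        "percentLitter": row["Litter"],
        "percentWater": row["Water"],
    }
```

How to treat blank Bedrock or Gravel cells is left open here; a real loader would have to decide whether a blank means zero or unknown.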
15. File5 = Soil analysis
These were explained under userDefined in 12 above
16. Herb data = 41herb_NCVS_unfold.arc
In many ways this is the key file for this exercise.
Project-Team-Plot defines the primary key.
Module=S (or A) defines the master plot, whereas modules with values between 1 and 10 define subplots. The most common configuration is 4 subplots of 100 m2 inside a master plot of 1000 m2.
US_code defines the species and thus authNameID of taxonObservation.
CumStrataCoverage of TaxonObs is recorded as cv in the table.
C1, C2, C3, C4, C5 are user-defined presence in subquadrats, to be associated with taxonObservation.
S1, S2, S3, S4, S5 are cover by strata for the taxon, as defined in stratumType.
The remainder of the columns can be ignored for the present.
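One way the S1 to S5 columns might be unpacked into per-stratum records, sketched in Python. Pairing S1 with NCVS-1, S2 with NCVS-2, and so on is an assumption based on the stratumType entries in section 8, as is treating an empty cell as "taxon not recorded in that stratum".

```python
def stratum_covers(row):
    """Return (stratumName, cover) pairs for the non-empty S1..S5 columns."""
    pairs = []
    for i in range(1, 6):
        cover = row.get(f"S{i}")
        # Assumption: a missing or empty cell means no record for the stratum.
        if cover not in (None, ""):
            pairs.append((f"NCVS-{i}", cover))
    return pairs
```

Each pair would become one stratum-level cover entry attached to the taxonObservation for that row.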
17. Tree data = 41tree_NCVS.arc
I include the dataset, but will wait a little to explain it. This is going to be complicated.
18. observationContributor
You obtain info for observationContributor from the Access table PlotContrib.
Field name | Data type | Plots table
plotContribution | PK |
Project-Team-Plot | | FK_plotObservation
SurName | | FK_vegPlotParty
GivenName | | FK_vegPlotParty
Role | FK | FK_roleCode
This table records events where a plot observation has been identified as representing a particular community type.
Notes:
20.1 We recognize that at one such event the identifying party might find a plot to have affinities with more than one (here up to 4) community types.
20.2 Project-Team-Plot is a compound primary key based on Project number (PROJECT_ID) and Plot number (= Team+Plot) (PLOT_ID)
20.3 ClassCode refers to a community listed in commConcept, which is referenced through the unique CEGLcode (= ABIcode) associated with an entry in commConcept.
20.4 Reference is a FK to a list of references and tells us which reference we looked in to make the determination, or at least which reference we were following. For this purpose we should add a field to communityAssignment called authority_ID ( = Citation_ID). This differs from CITATION_ID which is used to record a publication in which the plot determination was formally published. Because authority_ID will sometimes reference a transient database, we need a reference date field, which perhaps we call authorityDate.
20.5 Inspection, tabular analysis, and multivariate are +/- and repeat for each of the up to 4 entries in conceptUsage associated with an entry in communityAssignment. Expert System is similar except that the entry in methodOther is “Expert system”. For now we have no occurrence of an entry in this column, but I thought I would warn you that it might happen.
Field name | Data type | Plots table | Field in Plots table
Project-Team-Plot | text | commAssign | OBS_ID
Annotation by | text-FK | commAssignment | PARTY_ID
Annotation date | date-time | commAssignment | startDate
Inspection | binary | conceptUsage | methodInspection
Tabular analysis | binary | conceptUsage | methodTables
Multivariate | binary | conceptUsage | methodMultivariate
Expert System | text | conceptUsage | methodOther
ClassCode1 | Text-FK | conceptUsage | CommunityConcept_ID via ABIcode=CEGLcode
Fit1 | text | conceptUsage | fit
Confidence1 | text | conceptUsage | confidence
Reference1 | text-FK | CommAssignment | Authority_ID see note
Notes1 | text | conceptUsage | classificationNotes
ClassCode2 | Text-FK | As per ClassCode1 |
Fit2 | text | |
Confidence2 | text | |
Reference2 | text-FK | |
Notes2 | text | |
ClassCode3 | Text-FK | As per ClassCode1 |
Fit3 | text | |
Confidence3 | text | |
Reference3 | text-FK | |
Notes3 | text | |
ClassCode4 | Text-FK | As per ClassCode1 |
Fit4 | Text | |
Confidence4 | text | |
Reference4 | text-FK | |
Notes4 | text | |
Comments | Memo | |
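Because each source record carries up to four ClassCode/Fit/Confidence/Reference/Notes groups, loading it means splitting one wide row into several conceptUsage entries. A hedged Python sketch of that split follows. The input is assumed to be a dictionary keyed by the source field names, the output keys "classCode" and "reference" are placeholders, and the method flags are repeated on every entry as described in note 20.5.

```python
def concept_usage_rows(record, max_slots=4):
    """Split one wide source record into one entry per filled ClassCode slot."""
    rows = []
    for n in range(1, max_slots + 1):
        class_code = record.get(f"ClassCode{n}")
        if not class_code:
            continue  # unused slot
        rows.append({
            "classCode": class_code,  # to be resolved via CEGLcode (note 20.3)
            "fit": record.get(f"Fit{n}"),
            "confidence": record.get(f"Confidence{n}"),
            "reference": record.get(f"Reference{n}"),  # see note 20.4
            "classificationNotes": record.get(f"Notes{n}"),
            # Note 20.5: method flags repeat for each entry.
            "methodInspection": record.get("Inspection"),
            "methodTables": record.get("Tabular analysis"),
            "methodMultivariate": record.get("Multivariate"),
        })
    return rows
```

A record with ClassCode1 and ClassCode2 filled in would produce two entries that share the same method flags and differ in class code, fit, confidence, reference and notes.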