Notes for John Harris

First NC demonstration data

Project #41 – Amphibolite Mountains 1999

 

1. vegPlotParty

 

In  41NCVS_DB  you will find a table called party

 

Field name          Data type  Plots table   Field in Plots table
Party               PK         vegPlotParty  Party_ID ?
FullName            text                     ignore
SurName             text       vegPlotParty  surName
GivenName           text       vegPlotParty  givenName
OrganizationAtPart  text       vegPlotParty  OrganizationName (see note 1.1 below)
Unsure              logical                  ignore
AltName             text                     See note 1.2 below
Email               text       email         emailAddress
MailingList                                  ignore
LastYearPart                                 ignore

Notes:

1.1  Note that a person's organization can change during a career.  Thus, we cannot have only one organization per person.  Instead, we need to eliminate organizationName from vegPlotParty and insert organizationName in plotContributor, projectContributor, communityUsage, taxonInterpretation and citationContributor.  In my example data, the person-organization entries marked with # indicate the few cases where I know there should be more than one organization, but where I have not yet fixed the database to take care of the problem.

 

1.2  Note that I include the field AltName. We seem to have overlooked the possibility of a person having multiple names, such as maiden name and married name.  An example in my data is where Felicien Ntawukuliryayo decided his life would be simpler if he were to pick a name at random from the phone book, which he did and became Greg Meyer. How should we model this?  I propose we have a new field code PARENT containing the PK of the currently accepted name. 

 

2. roleCode

 

          Notes:

2.1  In my example data, I have roleCodes denormalized, but the ones to expect are “Honcho” and “Assistant”.  We need to set the list of legitimate role codes for the database, but for now translate “Honcho” into “Plot leader” and “Assistant” into “Plot assistant”.  Meanwhile, I do not see an entity for roleCode in the URL; perhaps it is time to create one?
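The interim translation in note 2.1 could be sketched as a simple lookup. This is only an illustration: the names ROLE_TRANSLATION and translate_role are hypothetical, and passing unknown codes through unchanged is an assumption, not something decided above.

```python
# Interim translation of the denormalized role codes (note 2.1).
# ROLE_TRANSLATION and translate_role are hypothetical names.
ROLE_TRANSLATION = {
    "Honcho": "Plot leader",
    "Assistant": "Plot assistant",
}


def translate_role(role):
    """Return the translated role code.

    Codes not listed in ROLE_TRANSLATION are returned unchanged (assumption).
    """
    cleaned = role.strip()
    return ROLE_TRANSLATION.get(cleaned, cleaned)


if __name__ == "__main__":
    for code in ("Honcho", "Assistant", "Data manager"):
        print(code, "->", translate_role(code))
```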

 

3. citation

 

          For now I have only three references that need to be cited, as follows

 

Entry #1
          citAuthors: Robert K. Peet, Thomas R. Wentworth & Peter S. White
          citTitle: A flexible, multipurpose method of recording vegetation composition and structure
          citPubDate: 1998
          citSeriesName: Castanea
          citIssueIdentification: 63(3)
          citPage: 262-274

Entry #2
          citAuthors: J. Braun-Blanquet
          citTitle: Pflanzensoziologie: Grundzüge der Vegetationskunde.
          citPubDate: 1928
          citSeriesName: Biologische Studienbücher 7
          citPublisher: Springer
          citLocation: Berlin

Entry #3
          citAuthors: The Association for Biodiversity Information
          citTitle: Ecology Access Reporting Tool. International classification of ecological communities. Natural Heritage Central Databases.
          citEdition: 2.0
          citEditionDate: September 13, 2000
          citPage:
          citPublisher: The Association for Biodiversity Information
          citLocation: Arlington, Virginia

 

            Notes:

3.1  At present we pull the author name out of citation and place it in citationContributor.  I think this is a mistake.  Librarians and other bibliographically compulsive people want the exact spelling of the author name(s) as in the text.  We do not allow for this when we reference back to vegPlotParty (unless we want to track all spellings and abbreviations).

3.2  Note that I have adjusted some of the other fields of citation to fit present needs.

 

4. sampleMethod

sampleMethodName: Carolina Vegetation Survey

sampleMethodDescription:  This wonderful but complicated method is reasonably well described in the reference.  More detail will be added to the database as time permits.

CITATION_ID:  Peet et al. 1998 (FK=1 in citation?)

 

5. coverMethod

 

Entry #1: coverType = Braun-Blanquet; citation_FK=Braun-Blanquet 1928 (2)

Entry #2: coverType = Carolina Vegetation Survey; citation_FK=Peet et al 1998 (1)

 

6. coverIndex

 

Below I provide entries for coverIndex for two scales.  The fields are CoverMethod_ID (BB=1, CVS=2?), indexCode, lowerLimit, upperLimit, indexDescription

         

CoverMethod_ID  indexCode  lowerLimit  upperLimit  indexDescription
BB              r          0.01        0.05        One or a few individuals
BB              +          0.05        0.1         Occasional and < 5%
BB              1          0.1         5           Abundant with very low cover, or less abundant with higher cover, always <5%
BB              2          5           25          very abundant & <5%, or 5-25%
BB              3          25          50          25-50%
BB              4          50          75          50-75%
BB              5          75          100         75-100%
CVS             1          +           0.1         Trace
CVS             2          0.1         1           0-1%
CVS             3          1           2           1-2%
CVS             4          2           5           2-5%
CVS             5          5           10          5-10%
CVS             6          10          25          10-25%
CVS             7          25          50          25-50%
CVS             8          50          75          50-75%
CVS             9          75          95          75-95%
CVS             10         95          100         95-100%
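As a rough illustration of how the coverIndex entries might be used, the sketch below holds the CVS scale as a list and looks up the limits for a given indexCode. The names CVS_INDEX and cover_range are hypothetical. The lower limit of CVS code 1 is given only as "+" in the list above, so the value 0.0 used here is an assumption.

```python
# CVS coverIndex rows as (indexCode, lowerLimit, upperLimit), in percent cover.
# The lower limit of code "1" appears only as "+" in the notes; 0.0 is assumed.
CVS_INDEX = [
    ("1", 0.0, 0.1),
    ("2", 0.1, 1.0),
    ("3", 1.0, 2.0),
    ("4", 2.0, 5.0),
    ("5", 5.0, 10.0),
    ("6", 10.0, 25.0),
    ("7", 25.0, 50.0),
    ("8", 50.0, 75.0),
    ("9", 75.0, 95.0),
    ("10", 95.0, 100.0),
]


def cover_range(index_code, index_rows=CVS_INDEX):
    """Return (lowerLimit, upperLimit) for a cover code, or None if unknown."""
    wanted = str(index_code).strip()
    for code, lower, upper in index_rows:
        if code == wanted:
            return (lower, upper)
    return None


if __name__ == "__main__":
    print(cover_range(7))
```

The same function would work for the BB scale if its rows were supplied as index_rows.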

 

7. project

 

Below I provide details for one project.  For now you can ignore the other projects listed in the project table of the attached access database.

 

projectName: Amphibolite Mountains of Ashe, Watauga & Alleghany Counties, North Carolina

projectDescription: Annual sampling PULSE of the North Carolina Vegetation Survey.

projectStartDate: July 10, 1999

projectEndDate: July 18, 1999

 

8. stratumType

 

Below I populate stratumType with 5 entries for the fields stratumName and stratumDescription.

 

NCVS-1      Foliage <0.5m high

NCVS-2      Foliage 0.5-6m high

NCVS-3      Foliage 6-15m high

NCVS-4      Foliage 15-35m high

NCVS-5      Foliage >35m high

 

9. projectContributor

 

For this single project I provide two contributors with several new roleCodes (to be added to the roleCode entity).  The ID for each contributor will need to be looked up in vegPlotParty.

 

Robert Peet                     Data submitter

Robert Peet                     Contact person

Robert Peet                     Project coordinator

Michael Lee                    Data manager

 

10. namedPlace (USGSQuad, county, state, country?)

 

In the access table File1 you will find fields called Gen_loc, County, & USGSQuad.  I have not provided a separate lookup table for you, though I could.  Probably you want to port the Gen_loc into namedPlace.  I suspect we should provide a full list of USGSQuads and Counties, probably by state.  This will call for two new tables in the database, but I think we would be well advised to add them.  I could provide these for the Carolinas, but do not currently have them for other states.

 

11. Plant taxa

 

To make all this as simple as possible, we include USDA species codes, which uniquely identify taxa as used in the USDA PLANTS database.  Thus, for the most part you need only use the species list that will become our database standard.

 

An unfortunate truth about vegetation data is that many taxa will be recognized that are not in the USDA list.  Many are created specifically for the project.  For example, in NC we sometimes refer to Carex rfb, where rfb = red fibrous base, which places the sedge in a set of about a dozen that are nearly impossible to tell apart without sex. In other cases we recognize taxa that USDA has not yet, in its infinite wisdom, seen fit to add to its database, and in still other cases we report taxa not yet published. Thus, the real-life data I send you have many taxa that will not be USDA compliant and will not have USDA codes.  To solve this problem we provide an ad hoc USDA code that starts with either the symbol “_” or “>”.  All such species must be added to the master database before the plot data can be entered.

 

The access table SppTable_small has a number of fields of potential value, but the main point is to allow you to capture those taxa that lack USDA codes and are thus not available in the federal database.

 

We also have NC codes and an NC_Std to tell the status of the taxon. The NC_Std codes are 1=valid name, 2=unambiguous change, 3=ambiguous change or invalid code, 4=low resolution NC code (below level of species), 5=species with valid infraspecific taxa not recognized, 6=infraspecific taxon, 7=non-vascular, 8=non-organic, 11=genus only.

 

Where we needed to create USDA codes we placed the NC code behind the “_” character.  In some cases the USDA did not divide the taxa as finely as we did, so resolution was lost in shifting to a USDA code, which is what the “>” symbol indicates.  For _ species, I recommend that you add all species that have an NC_Std code of 1, 4, 5, 6, 7, 8, or 11. Most of these will have full names written out (to the limit of our knowledge of them).
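The selection rule recommended above for "_" species could be sketched as follows. The dictionary keys usda_code and nc_std are hypothetical stand-ins for the relevant SppTable_small fields, and the sample codes are invented placeholders, not real entries.

```python
# NC_Std values for which "_" taxa are recommended for addition (from the notes).
NC_STD_TO_ADD = {1, 4, 5, 6, 7, 8, 11}


def taxa_to_add(rows):
    """Return the rows whose ad hoc code starts with "_" and whose NC_Std is listed.

    Each row is a dict with the hypothetical keys "usda_code" and "nc_std".
    Codes starting with ">" are deliberately left out, since the
    recommendation above covers only the "_" species.
    """
    return [
        row
        for row in rows
        if row["usda_code"].startswith("_") and row["nc_std"] in NC_STD_TO_ADD
    ]


if __name__ == "__main__":
    sample = [
        {"usda_code": "_AAA", "nc_std": 4},  # placeholder values only
        {"usda_code": "_BBB", "nc_std": 3},
        {"usda_code": ">CCC", "nc_std": 1},
        {"usda_code": "DDD", "nc_std": 1},
    ]
    print([row["usda_code"] for row in taxa_to_add(sample)])
```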

 

An issue that no one has discussed yet is that some of these low-resolution taxa have meanings that change.  We use Carex sp #1 for any unidentified species, and we use Carex sp #1 (van eerdeni) to refer always to the same taxon. How do we want to distinguish these in the database?

 

12. userDefined

 

I provide a bunch of soil data.  Most of these variables will end up being user defined.

 

Expect two files: 41_text_2000.xls and 41_nutr_2000.xls. The first file will have soil texture data, and the second nutrient data.  In both files the primary key is a composite of study-team-plot.  Module is also part of the primary key, but is assumed = A. Below are the values to be put in the field userDefinedName of userDefined. Note that the field userDefinedCategory should read “Soil” for all of these (we need to define the universe of acceptable categories someday); you will figure out the userDefinedType (mostly numeric), and I will leave the userDefinedMetadata blank for the present except that you should enter “North Carolina Vegetation Survey Protocol”. Note also that each file will have values for both A and B horizon, and occasionally C horizon.  B is often missing. You will need to use the Horizon variable to sort out which variable a field in the xls file needs to populate in definedValue.

 

[All of the following are to be repeated for both A horizon and B horizon, indicated by a trailing  _letter.]

Clay_%

Silt_%

Sand_%

 

CEC

pH

Organic matter

N

S

P

Ca_ppm

Mg_ppm

K_ppm

Na_ppm

Ca_%

Mg_%

K_%

Na_%

Other_bases_%

H_%

B_ppm

Fe_ppm

Mn_ppm

Cu_ppm

Zn_ppm

Al_ppm

Bulk Density

Base_saturation_%
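A minimal sketch of how the userDefined entries might be generated from this list is given below. It uses only a few of the variables as a sample. The exact form of the trailing horizon letter (for example Clay_%_A) and the userDefinedType value "numeric" are assumptions, and the function name user_defined_entries is hypothetical.

```python
# A few of the soil variables listed above, used as a sample only.
SAMPLE_VARIABLES = ["Clay_%", "Silt_%", "Sand_%", "pH"]


def user_defined_entries(variables, horizons=("A", "B")):
    """Build one userDefined entry per variable and horizon.

    The name is formed as <variable>_<horizon letter>, which is an assumed
    reading of "trailing _letter". userDefinedType is assumed numeric.
    """
    entries = []
    for horizon in horizons:
        for variable in variables:
            entries.append(
                {
                    "userDefinedName": f"{variable}_{horizon}",
                    "userDefinedCategory": "Soil",
                    "userDefinedType": "numeric",
                    "userDefinedMetadata": "North Carolina Vegetation Survey Protocol",
                }
            )
    return entries


if __name__ == "__main__":
    for entry in user_defined_entries(SAMPLE_VARIABLES):
        print(entry["userDefinedName"])
```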

 

13. File1 = plot summary

 

Finally, real plot data to enter.  In access table File1 we find the following stuff.

 

Field name         Data type  Plots table      Field in Plots table
Project-Team-Plot             plot             Compound primary key
Date                          plotObservation  ObsStartDate + obsEndDate
Gen_Loc                       plotPlace        Int of namedPlace & plot
State                         plot             State (FK to list?)
County                        plot             County (FK to list?)
Quad_Name                     plot             USGSquad (FK to list?)
UTM Zone                                       ignore
UTM_E                                          ignore
UTM_N                                          ignore
Datum                                          see note 13.1
Lat                           plot             origLat
long                          plot             origLong
MetersError                   plot             horizPosAccuracy
Ownership                                      ignore
Ares_Herbs                    plot             Times 100 = area (note 13.2)
Ares_Trees                                     See note 13.3
NumIntensives                                  See note 13.4
Depth                                          See note 13.5
NumPhotos                                      ignore
Notes                                          ignore

 

Notes

13.1  Note that we need either to provide a field in plot for Datum or to specify Datum.  My preference is to require Datum because otherwise the user might overlook its importance.  Please add for now, and populate with WGS84 for all my plots.

13.2  Note that area = Ares_Herbs multiplied by 100.

13.3  This variable will be important for the tree data, but I am going to wait on explaining that until I have more time.

13.4  This tells us how many nested 100 m2 plots will be inside of this larger plot, unless the number is 1, in which case the plot is the plot. The identification of the subplots will follow in subsequent files.

13.5  Depth is to be a user defined field associated with plotObservation.
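The File1 mappings and notes 13.1 and 13.2 could be applied to a single row roughly as sketched here. The function name plot_fields_from_file1 and the output key "datum" are hypothetical (the plot table does not yet have a Datum field), and the sample coordinates are invented.

```python
def plot_fields_from_file1(row):
    """Derive plot fields from one File1 row (a dict keyed by File1 field names)."""
    return {
        "area": row["Ares_Herbs"] * 100,  # note 13.2: area = Ares_Herbs times 100
        "datum": "WGS84",  # note 13.1: fixed value for these plots; key name assumed
        "origLat": row["Lat"],
        "origLong": row["long"],
        "horizPosAccuracy": row["MetersError"],
    }


if __name__ == "__main__":
    # Invented sample values, for illustration only.
    example_row = {"Ares_Herbs": 10, "Lat": 36.4, "long": -81.5, "MetersError": 50}
    print(plot_fields_from_file1(example_row))
```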

 

 

14. File2 = Site attributes

 

Field name         Data type  Plots table      Field in Plots table
Project-team-plot             plot             Compound primary key
Elevation                     plot             ElevationValue
Slope                         plot             slopeGradient
Aspect                        plot             slopeAspect
Topo position                                  ignore
Landform class                                 ignore
Bryo/lichen                                    ignore
Decaying wood                 plotObservation  percentWood
Bedrock                       plotObservation  PercentRockGravel (note 14.1)
Gravel                                         See 14.1
Sand                          plotObservation  percentSoil
Litter                        plotObservation  percentLitter
Water                         plotObservation  percentWater
Soil modules                                   ignore

Notes:

14.1  Note that Bedrock and Gravel should be summed to provide the entry for PercentRockGravel
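The ground-cover mappings from File2, including the sum in note 14.1, might look like this for one row. The function name observation_fields_from_file2 is hypothetical, and the sketch assumes every value is present and numeric.

```python
def observation_fields_from_file2(row):
    """Map the ground-cover columns of one File2 row to plotObservation fields.

    The row is a dict keyed by File2 field names. Missing values are not
    handled here; all values are assumed to be numeric.
    """
    return {
        "percentWood": row["Decaying wood"],
        # Note 14.1: Bedrock and Gravel are summed into PercentRockGravel.
        "PercentRockGravel": row["Bedrock"] + row["Gravel"],
        "percentSoil": row["Sand"],
        "percentLitter": row["Litter"],
        "percentWater": row["Water"],
    }


if __name__ == "__main__":
    # Invented sample values, for illustration only.
    example_row = {
        "Decaying wood": 5,
        "Bedrock": 10,
        "Gravel": 2,
        "Sand": 3,
        "Litter": 80,
        "Water": 0,
    }
    print(observation_fields_from_file2(example_row))
```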

 

You can assume CVS as the sampleMethod referenced in plotObservation

 

15. File5 = Soil analysis

These were explained under userDefined in 12 above

 

16.  Herb data = 41herb_NCVS_unfold.arc

 

In many ways this is the key file for this exercise.

Project-Team-Plot defines the primary key

 

Module=S (or A) defines the master plot, whereas modules with values between 1 and 10 define subplots.  The most common configuration is 4 subplots of 100m2 inside a master plot of 1000m2

 

US_code defines the species and thus authNameID of taxonObservation.

 

CumStrataCoverage of TaxonObs is recorded as cv in the table

 

C1, C2, C3, C4, C5 are user-defined presence values in subquadrats, to be associated with taxonObservation.

 

S1, S2, S3, S4, S5 are cover by strata for the taxon, as defined in stratumType.

 

The remainder of the columns can be ignored for the present.
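The reading of the Module and S1 to S5 columns described above could be sketched as below. The function names are hypothetical, and pairing column S1 with stratum NCVS-1 (and so on through S5 and NCVS-5) is an assumption based on the order of the stratumType entries.

```python
def module_kind(module):
    """Classify a Module value as the master plot or a subplot."""
    value = str(module).strip().upper()
    if value in ("S", "A"):
        return "master plot"
    if value.isdecimal() and 1 <= int(value) <= 10:
        return "subplot"
    raise ValueError(f"unexpected Module value: {module!r}")


def strata_cover(row):
    """Return {stratumName: cover} for the strata columns that hold a value.

    Assumes column S<n> corresponds to stratum NCVS-<n>.
    """
    cover = {}
    for n in range(1, 6):
        value = row.get(f"S{n}")
        if value is not None and value != "":
            cover[f"NCVS-{n}"] = value
    return cover


if __name__ == "__main__":
    print(module_kind("S"), module_kind(3))
    print(strata_cover({"S1": 2, "S2": None, "S3": 5, "S4": "", "S5": None}))
```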

 

 

17.  Tree data = 41tree_NCVS.arc

I include the dataset, but will wait a little to explain it.

          This is going to be complicated

 

18. observationContributor

 

You obtain info for observationContributor from the access table PlotContrib. 

 

Field name         Data type  Plots table
plotContribution   PK
Project-Team-Plot             FK_plotObservation
SurName                       FK_vegPlotParty
GivenName                     FK_vegPlotParty
Role               FK         FK_roleCode

 

 

19. communityUsage & communityAssignment

 

This table records events where a plot observation has been identified as representing a particular community type. 

 

Notes:

19.1  We recognize that at one such event the identifying party might find a plot to have affinities with more than one community type (here up to 4).

19.2  Project-Team-Plot is a compound primary key based on Project number (PROJECT_ID) and Plot number (= Team+Plot) (PLOT_ID).

19.3  ClassCode refers to a community listed in commConcept, which is referenced through the unique CEGLcode (= ABIcode) associated with an entry in commConcept.

19.4  Reference is a FK to a list of references and tells us which reference we looked in to make the determination, or at least which reference we were following.  For this purpose we should add a field to communityAssignment called authority_ID (= Citation_ID). This differs from CITATION_ID, which is used to record a publication in which the plot determination was formally published. Because authority_ID will sometimes reference a transient database, we need a reference date field, which perhaps we call authorityDate.

19.5  Inspection, tabular analysis, and multivariate are +/- and repeat for each of the up to 4 entries in conceptUsage associated with an entry in communityAssignment.  Expert System is similar except that the entry in methodOther is “Expert system”.  For now we have no occurrence of an entry in this column, but I thought I would warn you that it might happen.

 

Field name         Data type  Plots table        Field in Plots table
Project-Team-Plot  text       commAssign         OBS_ID
Annotation by      text-FK    commAssignment     PARTY_ID
Annotation date    date-time  commAssignment     startDate
Inspection         binary     conceptUsage       methodInspection
Tabular analysis   binary     conceptUsage       methodTables
Multivariate       binary     conceptUsage       methodMultivariate
Expert System      text       conceptUsage       methodOther
ClassCode1         text-FK    conceptUsage       CommunityConcept_ID via ABIcode=CEGLcode
Fit1               text       conceptUsage       fit
Confidence1        text       conceptUsage       confidence
Reference1         text-FK    CommAssignment     Authority_ID (see note)
Notes1             text       conceptUsage       classificationNotes
ClassCode2         text-FK    As per ClassCode1
Fit2               text
Confidence2        text
Reference2         text-FK
Notes2             text
ClassCode3         text-FK    As per ClassCode1
Fit3               text
Confidence3        text
Reference3         text-FK
Notes3             text
ClassCode4         text-FK    As per ClassCode1
Fit4               text
Confidence4        text
Reference4         text-FK
Notes4             text
Comments           Memo
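Because each source row carries up to 4 numbered groups of columns, it has to be unfolded into one conceptUsage entry per community type. The sketch below is one possible way to do that. The function name concept_usages and the output keys classCode and authority are hypothetical, and the sample values are invented placeholders.

```python
def concept_usages(row):
    """Return one dict per non-empty ClassCode<n> (n = 1..4) in a wide source row.

    The method flags (Inspection, Tabular analysis, Multivariate) are copied
    to every entry, since they repeat for each community type in the row.
    """
    usages = []
    for n in range(1, 5):
        class_code = row.get(f"ClassCode{n}")
        if not class_code:
            continue  # this slot is unused
        usages.append(
            {
                "classCode": class_code,
                "fit": row.get(f"Fit{n}"),
                "confidence": row.get(f"Confidence{n}"),
                "authority": row.get(f"Reference{n}"),
                "classificationNotes": row.get(f"Notes{n}"),
                "methodInspection": row.get("Inspection"),
                "methodTables": row.get("Tabular analysis"),
                "methodMultivariate": row.get("Multivariate"),
            }
        )
    return usages


if __name__ == "__main__":
    # Invented placeholder values, for illustration only.
    example_row = {
        "Inspection": "+",
        "Tabular analysis": "-",
        "Multivariate": "-",
        "ClassCode1": "CODE-A",
        "Fit1": "good",
        "ClassCode2": "CODE-B",
        "Fit2": "fair",
        "ClassCode3": "",
    }
    for usage in concept_usages(example_row):
        print(usage["classCode"], usage["fit"], usage["methodInspection"])
```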