Firstwrkshnotes » History » Revision 16

Corinna Gries, 03/11/2014 08:14 AM

Workshop Notes

Breakout session 1: Metrics brainstorming

what are you currently using
what would you like to use
how widely is it used
can it be applied to different biological community datasets (sampling approach)
is it already coded (in R)

Metrics

Diversity (all of these are generally in R, mostly in vegan)
1. Jaccard index
2. Simpson's diversity
3. Shannon's index
4. Turnover - different ways to calculate
5. Dominance
6. Evenness
7. Richness
8. Rank abundance shift
9. Proportion of overall diversity
10. Beta diversity
Community metrics/ordination
1. NMDS (vegan)
2. PCA (vegan)
3. Bray-Curtis (vegan)
4. Variance tracking, quantify variability change
5. Position in ordination-space
Spatial
1. patch scale
2. spatial autoregression
3. Endemism
4. Summary of species' positions within their ranges
5. meta community statistics
Mechanistic models
1. MAR (multivariate autoregressive models): needs a driver matrix; autocorrelation is a problem; mostly used for freshwater or marine systems (Eli Holmes has implemented state-space MAR in R; see the MARSS package on CRAN: http://cran.r-project.org/web/packages/MARSS/index.html)
2. MANOVA (vegan? Also, permanova is in vegan)
3. Ecosystem function (e.g. N deposition)
4. interaction population models - interspecific competition (Ben Bolker's book and corresponding package)
5. Economically/legally relevant metrics (e.g. Maximum sustainable yield)
Food webs
1. connectance
2. network analysis
Traits/phylogenetic
1. functional/phylogenetic diversity
2. species aggregation (functional groups, trophic levels)
3. phylogenetic dispersion
4. Native/exotic
5. Phylogeographic history
Temporal indices
1. species turnover
2. rate of return
3. Variance ratio
4. Mean-variance scaling
5. Spectral analysis
6. Regression windows (strucchange)
7. time series models of abundance -- metric would be parameters of model
null models
Comparative analysis of small noise vs large noise systems. What drives differences?
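
Several of the diversity and dissimilarity metrics listed above reduce to a few lines of arithmetic on a vector of species abundances. The following is a minimal Python sketch for illustration only; the workshop expected to use the existing R implementations (mostly in vegan), and the function names and example abundances here are hypothetical. Note that conventions differ between tools: Simpson's diversity is written here in the 1 - sum(p^2) form and Jaccard as a similarity, whereas some packages report the complementary quantity.

```python
import math


def richness(counts):
    """Number of species with nonzero abundance."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H' = -sum(p * ln p) over species that are present."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts):
    """Simpson's diversity in the 1 - sum(p^2) form (assumed convention)."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def pielou_evenness(counts):
    """Evenness as H' / ln(S); defined as 0 when fewer than two species."""
    s = richness(counts)
    return shannon(counts) / math.log(s) if s > 1 else 0.0


def jaccard_similarity(a, b):
    """Shared species divided by species present in either sample."""
    present_a = {i for i, c in enumerate(a) if c > 0}
    present_b = {i for i, c in enumerate(b) if c > 0}
    union = present_a | present_b
    return len(present_a & present_b) / len(union) if union else 1.0


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    denom = sum(a) + sum(b)
    return sum(abs(x - y) for x, y in zip(a, b)) / denom if denom else 0.0


# Hypothetical abundances for two plots; columns are the same four species.
plot_a = [10, 10, 0, 0]
plot_b = [5, 0, 5, 0]

print(richness(plot_a))                      # 2
print(round(shannon(plot_a), 3))             # 0.693 (ln 2)
print(simpson(plot_a))                       # 0.5
print(round(jaccard_similarity(plot_a, plot_b), 3))  # 0.333
print(round(bray_curtis(plot_a, plot_b), 3))         # 0.667
```

Both vectors must list species in the same order, which is one reason a common site-by-species layout matters before any of these metrics can be compared across datasets.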

Coded in R

Richness/diversity metrics: http://cran.r-project.org/web/packages/vegan/index.html
Diversity metrics (alpha, beta, gamma): http://cran.r-project.org/web/packages/vegetarian/index.html
Hubbell metrics: http://cran.r-project.org/web/packages/untb/index.html
Leading indicators, variance, autocorrelation, skew, heteroscedasticity: http://cran.at.r-project.org/web/packages/earlywarnings/index.html
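
As an illustration of what leading indicators such as variance and autocorrelation measure, here is a minimal Python sketch that computes them in a moving window over an abundance time series. It is not the earlywarnings implementation: it skips detrending and any significance testing, and the function names, window length, and example series are assumptions made for this sketch.

```python
from statistics import mean, pvariance


def lag1_autocorrelation(series):
    """Lag-1 autocorrelation: covariance of x[t] and x[t+1] over total variance."""
    m = mean(series)
    denom = sum((x - m) ** 2 for x in series)
    if denom == 0:
        return 0.0  # constant series: autocorrelation is undefined, report 0
    num = sum((series[i] - m) * (series[i + 1] - m)
              for i in range(len(series) - 1))
    return num / denom


def rolling(series, window, stat):
    """Apply stat to every run of `window` consecutive observations."""
    return [stat(series[i:i + window])
            for i in range(len(series) - window + 1)]


# Hypothetical abundance series whose fluctuations grow over time.
abundance = [10, 11, 10, 11, 10, 14, 7, 15, 5, 18]

variances = rolling(abundance, 5, pvariance)
autocorrs = rolling(abundance, 5, lag1_autocorrelation)

# A rising trend in windowed variance is the kind of signal these
# indicators are meant to pick up.
print(variances[0] < variances[-1])  # True
```

The window length trades resolution against noise; with series as short as many community datasets, only a handful of windows are available, which is part of why higher-frequency data came up in the discussion.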

not yet coded:

state-space models and community level resilience
variance components analysis

Breakout Session 2: Identify research questions

Group 1
1. Data set transformation to allow computation of many metrics
2. Time series analysis of community level metrics (consider higher freq data too)(earlywarnings R package)
Group 2
1. New R code for capturing climate variance at seasonal and interannual scales and residuals
2. R model for analyzing more spatial variability (Eric's LTER project)
3. Review of non-stationarity
  1. Variance partitioning
  2. Temporal and spatial variance
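
One simple community-level time series metric that relates to variance partitioning is the variance ratio listed under Temporal indices: the variance of total community abundance divided by the sum of the individual species variances. Values below 1 suggest compensatory dynamics (negative covariance among species), values above 1 suggest synchrony. The Python sketch below assumes this standard definition; the function name, input layout, and example series are hypothetical.

```python
from statistics import pvariance


def variance_ratio(abundances):
    """Variance of summed abundance over the sum of species variances.

    abundances: dict mapping species name to a list of abundances,
    one value per time step, all lists of equal length.
    """
    series = list(abundances.values())
    totals = [sum(step) for step in zip(*series)]
    species_var = sum(pvariance(s) for s in series)
    if species_var == 0:
        raise ValueError("all species are constant; ratio is undefined")
    return pvariance(totals) / species_var


# Two hypothetical two-species communities over four time steps.
compensating = {"sp_a": [1, 3, 1, 3], "sp_b": [3, 1, 3, 1]}
synchronous = {"sp_a": [1, 3, 1, 3], "sp_b": [1, 3, 1, 3]}

print(variance_ratio(compensating))  # 0: total abundance never changes
print(variance_ratio(synchronous))   # 2: species rise and fall together
```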

Discussion and Feedback: Collaboration Approaches

Most Important Limitations

Data
- Lack of coordinated long-term measurements
- Time necessary to find data
- Determine usability of data, e.g. stations within a boundary envelope with at least 2 samples over 2 years
- Time necessary to clean data
- Quality control data and deal with problems
- Data sharing permission issues
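
The usability check mentioned above (stations within a boundary envelope with at least 2 samples over 2 years) could be automated along the following lines. This Python sketch makes assumptions the notes do not spell out: the envelope is a latitude/longitude bounding box, and "2 samples over 2 years" is read as at least two distinct sampling dates falling in at least two distinct calendar years. The record layout, station names, and coordinates are made up for illustration.

```python
from collections import defaultdict
from datetime import date


def usable_stations(samples, bbox, min_samples=2, min_years=2):
    """Return sorted station ids that pass the usability rule.

    samples: iterable of (station, lat, lon, sample_date) tuples.
    bbox: (south, west, north, east) in decimal degrees.
    """
    south, west, north, east = bbox
    dates = defaultdict(set)
    for station, lat, lon, sampled in samples:
        if south <= lat <= north and west <= lon <= east:
            dates[station].add(sampled)
    return sorted(
        station for station, ds in dates.items()
        if len(ds) >= min_samples and len({d.year for d in ds}) >= min_years
    )


# Hypothetical records: S1 qualifies, S2 was sampled twice in one year,
# S3 lies outside the envelope.
samples = [
    ("S1", 34.4, -119.8, date(2012, 6, 1)),
    ("S1", 34.4, -119.8, date(2013, 6, 1)),
    ("S2", 34.5, -119.7, date(2013, 5, 1)),
    ("S2", 34.5, -119.7, date(2013, 8, 1)),
    ("S3", 40.0, -119.8, date(2012, 6, 1)),
    ("S3", 40.0, -119.8, date(2013, 6, 1)),
]

print(usable_stations(samples, bbox=(34.0, -120.5, 35.0, -119.0)))  # ['S1']
```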

Workflows
- Need an incentive to document as you work; this would be different if work were pushed to KNB as it progresses and people got credit for it

Collaboration
- Scattered resources: data and code in different locations, hard to move back and forth, hard to work on the code together, hard to know who's working on which parts of the code
- Workspace integration and accessibility
- Project management/tool integration
- Time investment in learning different tools, training needs
- Github is too technical

Recommendations

Data
- Dataset format: long format with columns for species and count/biomass, plus columns for site (plot, subplot, etc.) and date. Separate table with species names to be able to add functional groups, taxonomic rank, etc. Separate table for site descriptions (manipulations, land use, etc.)
- Gather additional data on biogeochemistry, climate etc.
- Develop standard methods for dealing with outliers, large gaps, species names and spellings
- Develop standards for classifying data points into aggregated
- Create library of cleaned data sets that are massaged into one format
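
To show why the long format recommended above is convenient, the Python sketch below reshapes long records (site, date, species, count) into a site-by-species matrix, which is the layout most diversity and ordination routines expect. The function name, tuple layout, and example values are hypothetical; species absent from a sample are filled with 0, which assumes that absence from the records means the species was not observed.

```python
from collections import defaultdict


def long_to_wide(rows):
    """Reshape long records into a sample-by-species matrix.

    rows: iterable of (site, date, species, count) tuples.
    Returns (species, matrix): species is the sorted list of column
    names, matrix maps (site, date) to a list of counts in that order.
    """
    rows = list(rows)  # the records are traversed twice
    species = sorted({sp for _, _, sp, _ in rows})
    column = {sp: j for j, sp in enumerate(species)}
    matrix = defaultdict(lambda: [0] * len(species))
    for site, sampled, sp, count in rows:
        matrix[(site, sampled)][column[sp]] += count
    return species, dict(matrix)


# Hypothetical long-format records.
rows = [
    ("A", "2013-06-01", "sp_a", 4),
    ("A", "2013-06-01", "sp_b", 1),
    ("B", "2013-06-01", "sp_a", 2),
]

species, matrix = long_to_wide(rows)
print(species)                       # ['sp_a', 'sp_b']
print(matrix[("A", "2013-06-01")])   # [4, 1]
print(matrix[("B", "2013-06-01")])   # [2, 0]
```

Keeping functional groups and site descriptions in separate tables, as recommended, means they can be joined on the species or site key after reshaping without duplicating them on every record.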

Workflows
- Create library of workflows that provide general cleaning routines that can be applied to arbitrary data, possibly interactive with some user input
- Create library of workflows that make reshaping more accessible to people with little coding experience
- Create library of workflows specifically for dealing with taxonomic names.
- Link workflows to publications, e.g., via a website (repository) where scientists can publish citeable workflows (ecologicalworkflows.org, like myexperiment.org, but possibly more agnostic with respect to dependencies/tools that connect to it (package descriptions))
- Make this repository more accessible by keeping the 'ecology' emphasis; make workflows much more visible in existing repositories (KNB, DataONE) by linking to datasets.
- Create library of workflows for training purposes (e.g. Dan Bunker's R tutorial), link to datasets in a repository

Collaboration Tool
- Pair programming: changes how you work; divide and conquer worked well
- Git repository: has been used successfully in this workshop when some people were familiar with it and could bootstrap the use for other people quickly
- Way to replicate or interface with services like (Google OpenRefine, db constraints, taxize, TNRS)
- Develop a 'Redmine' that is more useful for academics; becomes the point for integration of multiple tools; also BaseCamp/Trello, Digital notebook environments
- Run workflows, organize outputs, communicate with collaborators
- Ability to couple models at multiple scales (e.g., spatial or temporal scales), scale up computing as well
- Incorporate writing process, version control for documents (Google docs is not sufficient)
- Incorporate mechanisms to maintain social connection even in absence of face to face meetings

Datasets

small mammal (VCR, SEV)
arthropod data (CAP, KNZ, FCE)
datasets on kelp published in ESA journal
Cedar Creek:
- species composition data accessible at: http://doi.org/10.6073/pasta/50db8bde41c9ea8b32dfbdde8bb0fad2
- climate data accessible at http://doi.org/10.6073/pasta/24eb99ad3102cdcb2f8d02de93dd551e

PISCO intertidal biodiversity surveys
- Methods: http://cbsurveys.ucsc.edu/sampling/images/dataprotocols.pdf
- Point contact data (percent cover, good for sessile/common spp): https://knb.ecoinformatics.org/m/#view/doi:10.6085/AA/pisco_intertidal.50.6
- Quadrat data (percent cover, good for mobile spp): https://knb.ecoinformatics.org/m/#view/doi:10.6085/AA/pisco_intertidal.52.7
- Swath data (extensive, only select rare species like seastars): https://knb.ecoinformatics.org/m/#view/doi:10.6085/AA/pisco_intertidal.51.6

Konza
- climate data (KNZ headquarters): doi:10.6073/pasta/ac19b27f2c28a63890d59ece32f5116b
- Konza species composition (belowground experiment for N addition contrasts): doi:10.6073/pasta/b6653594d336bddf9d5f7f72c7d9200c (Konza only collects cover for N addition treatments every 5 years, so we will abandon this for now)

Detailed notes are on etherpad: https://epad.nceas.ucsb.edu/p/commdyn-20140105


Updated by Corinna Gries almost 11 years ago · 16 revisions
