A number of organizations that participate in projects at NCEAS have data that is currently (or potentially) registered in the KNB data repository, either directly or through replication from their own metacat servers. ESA, UCNRS, OBFS, PISCO and LTER are examples. If NCEAS projects register data that is connected to these other organizations, duplicate records can result. There is also a concern that some of the organizations will miss getting credit for the data, be unable to display the data on their own websites and/or skins, or not have the data package show up in their reports. ESA includes the LSID with the citation, which adds to the problems when a registered data package is "owned" by more than one organization.

ESA is starting to register data sets with its own metacat server and replicate them to the KNB metacat. Here is an example of a duplicate that has been created.

Smith F. . Macroecological database of mammalian body mass. nceas.196.3 (registered earlier)

Smith F. 2006. Macroecological database of mammalian body mass. ESA Data Registry:


Same citation listing in the KNB (this view has no LSID information):
Smith F. 2006. Macroecological database of mammalian body mass. esa.19.3

The NCEAS and ESA registrations point to the same Online Distribution Info location and, in this case, have the same title, so both come up when searching on the title. I know of three groups (two are with the SB LTER) that already have data packages in the KNB and are going to try to submit data papers to the ESA archives. This is likely to create more duplicates.
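As a rough illustration of how such duplicates might be flagged, here is a minimal Python sketch that groups records by normalized title and online distribution location. The record layout, the `normalize` and `find_duplicates` helpers, and the example.org URLs are all assumptions made for illustration; only the two docids and the title come from the example above. This is not how metacat itself searches.

```python
from collections import defaultdict

# Hypothetical records. The docids and title are from the example above;
# the URLs are placeholders, not the real Online Distribution Info values.
records = [
    {"docid": "nceas.196.3",
     "title": "Macroecological database of mammalian body mass",
     "url": "http://example.org/data/mammals"},
    {"docid": "esa.19.3",
     "title": "Macroecological database of mammalian body mass.",
     "url": "http://example.org/data/mammals/"},
    {"docid": "demo.1.1",
     "title": "Some other data set",
     "url": "http://example.org/data/other"},
]


def normalize(text):
    """Lowercase, collapse whitespace, drop trailing periods and slashes."""
    return " ".join(text.lower().split()).rstrip("./ ")


def find_duplicates(records):
    """Return sorted docid groups that share a normalized title and URL."""
    groups = defaultdict(list)
    for rec in records:
        key = (normalize(rec["title"]), normalize(rec["url"]))
        groups[key].append(rec["docid"])
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


duplicates = find_duplicates(records)
```

Matching on both title and distribution location is a deliberately conservative choice here: two packages with the same title but different data locations would not be flagged.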

Currently the organization field contains a specific name, such as Ecological Society of America, Organization of Biological Field Stations, University of California Natural Reserve System, or National Center for Ecological Analysis and Synthesis. We have views or web skins that display those specific organizations' data sets. The field is filled in automatically when people use one of those skins to register data sets.

The question is whether we can avoid duplication and still let the different skins and organizations generate views and reports specific to them. Will there be any problems with having more than one organization that uses metacat associated with a data package? For instance, if UCNRS has an NCEAS postdoc doing research at one of its reserves, can data sets created by this researcher be owned by both organizations?

The EML example at the end of this report shows what the contact section might look like for a document associated with more than one organization. If we encourage data packages owned by more than one organization to list all of them in the EML file, will that help to prevent duplicates? Will it encourage data sharing?

One consideration, and complication, is that only the ESA site creates and displays an LSID (Life Science Identifier), and it has only one-way replication. ESA registrations have been peer reviewed, which can potentially add value to them and make them more easily cited. How does this factor in for data sets that are registered on KNB but are scheduled to be added to the ESA Archives as data papers?

Here is an example of an EML section that lists more than one organization, each of which could be searched on.

<creator id="1169157059583"><organizationName>NCEAS 5600: Vazquez: Null models for specialization and asymmetry in plant-pollinator systems</organizationName></creator>
<creator id="1169157150489"><organizationName>National Center for Ecological Analysis and Synthesis</organizationName></creator>
<creator id="1169157197942"><individualName><givenName>Kevin D.</givenName></individualName>
<organizationName>USGS Channel Islands Field Station Marine Science Institute</organizationName>
<positionName>Research Ecologist</positionName>
<address><deliveryPoint>University of California</deliveryPoint>
<city>Santa Barbara</city></address>
<phone phonetype="voice">(805) 893-8778</phone></creator>
<creator id="1169157176598"><organizationName>University of California Natural Reserve System</organizationName></creator>
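To show how a skin or report could pick up a package that lists several organizations, here is a small Python sketch that reads every `organizationName` under `creator` and checks membership. The `<dataset>` wrapper, the `organizations` and `belongs_to` helper names, and the simplified two-creator fragment are assumptions for illustration only; this uses the standard library XML parser and is not metacat's own query mechanism.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed fragment based on the example above.
# The <dataset> wrapper is assumed so the text parses as one document.
EML_FRAGMENT = """
<dataset>
  <creator id="1169157150489">
    <organizationName>National Center for Ecological Analysis and Synthesis</organizationName>
  </creator>
  <creator id="1169157176598">
    <organizationName>University of California Natural Reserve System</organizationName>
  </creator>
</dataset>
"""


def organizations(eml_text):
    """Return every creator organizationName in document order."""
    root = ET.fromstring(eml_text)
    return [el.text.strip()
            for el in root.findall("./creator/organizationName")
            if el.text]


def belongs_to(eml_text, organization):
    """True if the package lists the given organization as a creator."""
    return organization in organizations(eml_text)


# A UCNRS view and an NCEAS view would both match this single package.
in_ucnrs_view = belongs_to(
    EML_FRAGMENT, "University of California Natural Reserve System")
in_nceas_view = belongs_to(
    EML_FRAGMENT, "National Center for Ecological Analysis and Synthesis")
```

With this kind of matching, one registered package could appear in each organization's view without being registered twice.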

