update pub_date when the length of that field is != 4 (use date_created in this scenario). There were 2 entries that had "193" as the pub_date.
replace new lines in creator with spaces. set blank " " titles and creators to "unknown". use "Baltimore Ecosystem Study LTER" for publisher on all BES objects.
include John Kunze's latest suggestions for improved metadata -- a lot of clean-up, especially on characters in the file. Note UTF-8 encoding of the script.
use resourceMapLocation (resolve url for the ore map) as the datacite_relatedIdentifier_isPartOf property
use lowercase 'metadata' and 'data' for the resourceType
set publisher to the source system when publisher == creator (we want them to be different, even if just for appearances)
only include public (readable) DOIs in the final output
use "lastname, firstname" convention throughout
include more descriptive data file name for title of data records
include publisher given name correctly
use correct EZID account names for the three different nodes. https://redmine.dataone.org/issues/2815
align the final column headers with the datacite schema, as applicable. https://redmine.dataone.org/issues/2815
use DataCite isNewVersionOf/isPreviousVersionOf for revision history
not every EML file has an ORE datapackage descriptor -- join only to those when setting the resourceMapId
correctly use document revision for object format and resource map joins.
use correct children of 'publisher' element
include the resourceMapId for the metadata objects, not just the data files.
updated LDAP dump and corrected missing entries that had been removed from LDAP.
handle null givenNames from the LDAP dump.
make sure we only get the publisher text content (not attribute value)
DOI registration: include more revision history based on the identifier table, not just the generated SM metadata; include ecogrid data urls for revisions (long query in xml_nodes_revisions table)
update creator and publisher using LDAP dump. unfortunately LDAP has shifted over the years and not all identities are still active in LDAP... but we did get quite a few creator names updated! https://redmine.dataone.org/issues/2815
save point - adding more columns for access, data packaging, revision history https://redmine.dataone.org/issues/2815
update the table to indicate which DOI account we are targeting https://redmine.dataone.org/issues/2815
use production cn url for the resolve url
encode '/' and ':' in the DOI used for the resolve URL
include revisions table in the initial temp table population. use the "first" creator listed in the EML (either org or person). use other reasonable default values as needed to fully populate the spreadsheet columns https://redmine.dataone.org/issues/2815
add columns: publisher and pub_date. include default values for all columns - even data files should have title. still a few todos but closer. https://redmine.dataone.org/issues/2815
script to generate DOI registration spreadsheet https://redmine.dataone.org/issues/2815
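
A few of the clean-up rules described in the log entries above (pub_date fallback, new lines in creator, blank titles and creators, lowercase resourceType, DOI encoding for the resolve URL) can be sketched roughly as follows. This is only an illustration in Python, not the actual script: the function name clean_row, the dictionary keys, and the resolve_base parameter are all hypothetical, and the log does not say whether date_created is truncated to a year, so it is copied unchanged here.

```python
from urllib.parse import quote


def clean_row(row: dict, resolve_base: str) -> dict:
    """Apply a handful of the logged clean-up rules to one spreadsheet row.

    All key names are hypothetical stand-ins for the real column names.
    """
    out = dict(row)

    # pub_date is expected to be 4 characters long; otherwise use date_created.
    # Assumption: date_created is copied as-is (no truncation to a year).
    if len(out.get("pub_date") or "") != 4:
        out["pub_date"] = out.get("date_created", "")

    # Replace new lines in creator with spaces.
    creator = (out.get("creator") or "").replace("\r\n", " ").replace("\n", " ")

    # Blank titles and creators are set to "unknown".
    out["creator"] = creator if creator.strip() else "unknown"
    title = out.get("title") or ""
    out["title"] = title if title.strip() else "unknown"

    # resourceType uses lowercase 'metadata' and 'data'.
    out["resource_type"] = (out.get("resource_type") or "").lower()

    # Percent-encode the DOI (including '/' and ':') for the resolve URL.
    doi = out.get("doi", "")
    out["resolve_url"] = resolve_base.rstrip("/") + "/" + quote(doi, safe="")
    return out
```

Passing safe="" to quote is what forces '/' to be encoded as well; by default quote leaves '/' untouched.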