<below is copied from a recent email discussing the bug>
Bug resolved, with two outstanding issues:
- The metadata docid is passed as a parameter from the download links, and embedded in the EML XSLT. The best way to solve this would be an additional column in the `xml_documents` table listing the parent docid for BIN objects. We could then easily query the list, would not need the hacked stylesheet approach, and could determine the names when coming in from other pathways (e.g. direct download without GET parameters). But I didn't want to implement a major change with an imminent release, and this should be discussed before implementation.
- The FGDC special case still needs handling, pending discussion of the doctype used for FGDC documents and a decision that this matters (FGDC data are currently delivered using a special mechanism which provides all data within a zip archive). The work Chris Barteau did isn't currently in the main Metacat servlet; it uses separate code for its FGDC handling, which the majority of the skins do not use. The FGDC documents (currently numbering 11) are also receiving the `doctype` of `metadata`, which is certainly wrong and should be updated.
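As input to that discussion, the parent-docid column from the first outstanding issue could look roughly like the sketch below. It uses Python's sqlite3 with an in-memory database purely for illustration. The `parent_docid` column name, the cut-down table layout, the `EML` doctype placeholder, and the sample docids are all assumptions, not the real Metacat schema.

```python
import sqlite3


def make_parent_demo_db():
    """Build an in-memory stand-in for xml_documents with a hypothetical parent_docid column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE xml_documents (
               docid        TEXT PRIMARY KEY,
               rev          INTEGER,
               docname      TEXT,
               doctype      TEXT,
               parent_docid TEXT  -- hypothetical: docid of the describing metadata document
           )"""
    )
    rows = [
        ("demo.1", 1, "eml", "EML", None),            # metadata document (placeholder doctype)
        ("demo.2", 1, "sites.csv", "BIN", "demo.1"),   # data object described by demo.1
        ("demo.3", 1, "counts.csv", "BIN", "demo.1"),  # data object described by demo.1
        ("demo.4", 1, "other.csv", "BIN", None),       # data object with no recorded parent
    ]
    conn.executemany("INSERT INTO xml_documents VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def data_objects_for(conn, metadata_docid):
    """List (docid, docname) of the BIN objects attached to one metadata document."""
    cur = conn.execute(
        "SELECT docid, docname FROM xml_documents "
        "WHERE doctype = 'BIN' AND parent_docid = ? ORDER BY docid",
        (metadata_docid,),
    )
    return cur.fetchall()


def parent_of(conn, data_docid):
    """Look up the metadata docid for a data object, or None if unknown."""
    row = conn.execute(
        "SELECT parent_docid FROM xml_documents WHERE docid = ?",
        (data_docid,),
    ).fetchone()
    return row[0] if row else None
```

With a column like this, a direct download that arrives without GET parameters could call something like `parent_of` to recover the metadata docid, instead of relying on the value embedded by the stylesheet.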
Issues solved:
1. When documents are created or updated, the objectName element should be copied into the `docname` field within the `xml_documents` table. This requires that clients send the correct filename when delivering the data (so Metacat can correctly set `docname` using the registerDocument function). Jing recently updated Morpho to do the right thing here, as does the Perl Registry client. From now on, all new documents should have the correct docname.
2. Preexisting documents which were created by older releases of Morpho or uploaded in other ways (replication? Not sure how other systems generate `docname`) still had generic docnames. These can easily be seen by looking for documents whose docname is just the docid and revision:
SELECT COUNT(*) FROM xml_documents WHERE docname = docid || '.' || rev AND doctype = 'BIN';
To eliminate these misnomers, I searched for all binaries linked to from metadata documents:
SELECT nodedata, docid FROM xml_nodes WHERE nodetype = 'TEXT' AND nodedata LIKE 'ecogrid%';
These were then fed into a Perl script (src/perl/eml_get_objectnames.pl) which read the related metadata documents, determined the correct filename, and output an SQL script which can be used to update misnomers. This was tested and applied to KNB, so all previously misnamed files are now correct.
3. The download naming logic in Metacat was partially correct: it produced names of the form `docid-docname` but lacked the metadata docid. The name generation was rewritten to match the sensible names logic.
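The cleanup in item 2 can be illustrated end to end with a small sketch. This is not the Perl script itself (src/perl/eml_get_objectnames.pl). It assumes the docid-to-filename mapping has already been extracted from the metadata documents, and it runs against a cut-down, in-memory stand-in for `xml_documents` using Python's sqlite3. The helper names and sample rows are hypothetical, and the exact form of the UPDATE statements the real script emitted is not known here.

```python
import sqlite3


def sql_quote(value):
    """Quote a string as an SQL literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_update_script(object_names):
    """Turn a {docid: filename} mapping into one UPDATE statement per data object."""
    statements = []
    for docid, name in sorted(object_names.items()):
        statements.append(
            "UPDATE xml_documents SET docname = %s WHERE docid = %s AND doctype = 'BIN';"
            % (sql_quote(name), sql_quote(docid))
        )
    return "\n".join(statements)


def count_generic_docnames(conn):
    """Count BIN documents whose docname is just docid.rev (the query from item 2)."""
    cur = conn.execute(
        "SELECT COUNT(*) FROM xml_documents "
        "WHERE docname = docid || '.' || rev AND doctype = 'BIN'"
    )
    return cur.fetchone()[0]


def make_misnomer_demo_db():
    """Build an in-memory table holding two misnamed binaries and one correct one."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE xml_documents (docid TEXT, rev INTEGER, docname TEXT, doctype TEXT)"
    )
    conn.executemany(
        "INSERT INTO xml_documents VALUES (?, ?, ?, ?)",
        [
            ("demo.2", 1, "demo.2.1", "BIN"),    # generic name, needs fixing
            ("demo.3", 2, "demo.3.2", "BIN"),    # generic name, needs fixing
            ("demo.4", 1, "already.csv", "BIN"), # already named correctly
        ],
    )
    return conn


if __name__ == "__main__":
    conn = make_misnomer_demo_db()
    print("generic docnames before:", count_generic_docnames(conn))
    script = build_update_script({"demo.2": "sites.csv", "demo.3": "bob's counts.csv"})
    print(script)
    conn.executescript(script)  # apply the generated statements
    print("generic docnames after:", count_generic_docnames(conn))
```

Generating a script rather than updating rows directly keeps the changes reviewable before they are applied, which matches how the fix was tested and then run against KNB.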