Metacat: Issues (Ecoinformatics Redmine)
https://projects.ecoinformatics.org/ecoinfo/
2017-12-20T07:00:31Z
Bug #7234 (New): Validate SystemMetadata.checksumAlgorithm in the DataONE API calls
https://projects.ecoinformatics.org/ecoinfo/issues/7234
2017-12-20T07:00:31Z
Chris Jones <cjones@nceas.ucsb.edu>
<p>Bryce pointed out that we have many incorrect <code>checksumAlgorithm</code> strings on various MNs. See <a class="external" href="https://github.nceas.ucsb.edu/KNB/arctic-data/issues/283">https://github.nceas.ucsb.edu/KNB/arctic-data/issues/283</a>. The upshot is that <code>SHA-*</code> is the broadly supported syntax.</p>
<p>I checked the strings with:<br /><pre>
package org.dataone.tests;

import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

public class MessageDigestDTest {

    public static void main(String[] args) {
        MessageDigest md = null;
        List<String> algorithms = new ArrayList<String>();
        algorithms.add("MD5");
        algorithms.add("MD-5");
        algorithms.add("SHA1");
        algorithms.add("SHA-1");
        algorithms.add("SHA224");
        algorithms.add("SHA-224");
        algorithms.add("SHA256");
        algorithms.add("SHA-256");
        algorithms.add("SHA384");
        algorithms.add("SHA-384");
        algorithms.add("SHA512");
        algorithms.add("SHA-512");

        for (String algorithm : algorithms) {
            try {
                md = MessageDigest.getInstance(algorithm);
                System.out.println(md.getAlgorithm() + " is recognized.");
            } catch (NoSuchAlgorithmException e) {
                System.out.println(e.getMessage());
            }
        }
    }
}
</pre></p>
<p>and got:<br /><pre>
MD5 is recognized.
MD-5 MessageDigest not available
SHA1 is recognized.
SHA-1 is recognized.
SHA224 MessageDigest not available
SHA-224 is recognized.
SHA256 MessageDigest not available
SHA-256 is recognized.
SHA384 MessageDigest not available
SHA-384 is recognized.
SHA512 MessageDigest not available
SHA-512 is recognized.
</pre></p>
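<p>Given these results, incoming checksumAlgorithm strings could be validated before the system metadata is accepted. A minimal sketch (the helper name <code>validateChecksumAlgorithm</code> is hypothetical; in Metacat the check would live in the node service classes and throw <code>InvalidSystemMetadata</code> rather than <code>IllegalArgumentException</code>):</p>

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ChecksumAlgorithmValidator {

    /**
     * Returns the algorithm string if the JVM recognizes it, otherwise
     * throws. Stand-in for the real check, which would throw
     * InvalidSystemMetadata inside Metacat's service methods.
     */
    public static String validateChecksumAlgorithm(String algorithm) {
        try {
            MessageDigest.getInstance(algorithm);
            return algorithm;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException(
                "Unsupported checksumAlgorithm: " + algorithm, e);
        }
    }

    public static void main(String[] args) {
        // "SHA-256" is recognized; "SHA256" is rejected, per the test above.
        System.out.println(validateChecksumAlgorithm("SHA-256"));
        try {
            validateChecksumAlgorithm("SHA256");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```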
<p>Change the <code>MNodeService</code>, <code>CNodeService</code>, and <code>D1NodeService</code> methods that send or receive <code>SystemMetadata</code> documents so that they validate the given string with <code>MessageDigest.getInstance(algorithm)</code>. If we get a <code>NoSuchAlgorithmException</code>, throw an <code>InvalidSystemMetadata</code> exception for the call.</p>

Bug #7229 (New): Mis-Formatting of Data Package Contents
https://projects.ecoinformatics.org/ecoinfo/issues/7229
2017-11-17T21:19:00Z
Thomas Thelen
<p>In MetacatUI we're getting a slight mis-formatting when displaying data package contents. This can be seen in the attached images. The issue was initially reported in MetacatUI as issue 379.</p>
<p><a class="external" href="https://github.com/NCEAS/metacatui/issues/379">https://github.com/NCEAS/metacatui/issues/379</a></p>
<p>From Bryce,</p>
<p>"The HTML in question is actually produced by Metacat and MetacatUI is just rendering it without modification from Metacat's View Service. ... The fix would involve changing the underlying eml-2 XSLT."</p>

Bug #7228 (New): Error in sorting data-sets based on title in MetaCatUI
https://projects.ecoinformatics.org/ecoinfo/issues/7228
2017-11-15T17:35:09Z
Rushiraj Nenuji
<p>Reference: Issue <a class="external" href="https://github.com/NCEAS/metacatui/issues/350">https://github.com/NCEAS/metacatui/issues/350</a> in MetaCatUI<br /><br />When datasets are sorted by title (a-z), uppercase and lowercase titles are sorted separately rather than together.</p>
<p>Please see the attached image for the sorting results.</p>
<p>Possible Solution: <a class="external" href="https://stackoverflow.com/questions/2053214/how-to-create-a-case-insensitive-copy-of-a-string-field-in-solr">https://stackoverflow.com/questions/2053214/how-to-create-a-case-insensitive-copy-of-a-string-field-in-solr</a></p>
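<p>The usual Solr fix, per the link above, is to add a case-folded copy of the title field and sort on that instead. A hedged sketch of the schema additions (the <code>string_lc</code> and <code>titleSort</code> names are illustrative, not Metacat's actual schema):</p>

```xml
<!-- A string-like type that lowercases the whole value for sorting. -->
<fieldType name="string_lc" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Case-insensitive shadow of the title field, used only for sorting. -->
<field name="titleSort" type="string_lc" indexed="true" stored="false"/>
<copyField source="title" dest="titleSort"/>
```

<p>Queries would then sort with <code>sort=titleSort asc</code> while still displaying the original <code>title</code> field.</p>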
<p>Thanks!</p>
<p>Best,<br />Rushiraj Nenuji.</p>

Bug #7226 (New): Perl Registry hangs on uploads
https://projects.ecoinformatics.org/ecoinfo/issues/7226
2017-11-10T20:56:18Z
Chris Jones <cjones@nceas.ucsb.edu>
<p>Jeanette has reported an issue where the KNB data registry is hanging when a small file is being uploaded (see attached screenshots).</p>
<p>In the JavaScript console, she saw a <code>Bad Gateway</code> error. A quick web search suggests that a <code>Bad Gateway</code> response is usually thrown when Apache calls a subprocess like the Perl interpreter and doesn't get a response back (a timeout), or hits some other non-response error. We need to track down what is happening in the Perl code to induce this. Here's the Slack thread:<br /><pre>
Feeling the "hanging submission page" pain right now. What do I do? Abandon? Wait and hope?
According to my local file these files should only be 831KB so I'm not sure why it thinks they are so huge
chris [1:15 PM]
@jeanette which server is that?
KNB?
jeanette [1:16 PM]
yeah, production
chris [1:16 PM]
ok
jeanette [1:16 PM]
I bailed out after seeing the timeout error
chris [1:16 PM]
that’s an odd one. i’ll look
jeanette [1:16 PM]
but still not the easiest thing to figure out from a user perspective
chris [1:16 PM]
totally
jeanette [1:17 PM]
I think this has happened to me before, and I think it is triggered by
writing metadata ->
add a file ->
accidentally submit before adding all files ->
go back and edit record ->
add more files ->
submit again ->
page hangs forever
I also think I have heard of a similar workflow causing an error for ADC users
</pre></p>

Bug #7224 (New): Document the procedure of registering a new schema in Metacat
https://projects.ecoinformatics.org/ecoinfo/issues/7224
2017-11-07T20:31:45Z
Jing Tao <tao@nceas.ucsb.edu>
<p>Since the 2.8 release, Metacat will NOT automatically download and register new schemas. Until we have an admin page to help administrators register a new schema, we need at least to provide a document that guides the administrator through doing it manually.</p>

Bug #7223 (New): EZID metadata registration doesn't seem to work with SIDs
https://projects.ecoinformatics.org/ecoinfo/issues/7223
2017-10-24T22:11:37Z
Bryce Mecum <mecum@nceas.ucsb.edu>
<p>Earlier today, a DOI was generated using R's <code>generateIdentifier</code> (which calls MNStorage.generateIdentifier()). Then the newly-minted DOI was set as the Series ID of an EML 2.1.1 Object. The DOI was successfully registered with EZID but the EZID metadata was not correctly set on the object, as shown when I logged in. See the attached screenshot.</p>
<p>I expected the EZID metadata to get filled in like it normally does for DOIs that get used as PIDs. I took a quick glance at the relevant part of Metacat and it doesn't look like anything special is done to handle SIDs.</p>

Bug #7217 (New): Report on metadata creation date in metadata quality summaries
https://projects.ecoinformatics.org/ecoinfo/issues/7217
2017-10-11T18:42:38Z
Peter Slaughter <slaughter@nceas.ucsb.edu>
<p>Indexing fields for metadata quality reports do not include the upload date of the metadata they are reporting on. Therefore, summaries that are created (e.g., the mean score for a user over time) currently show the creation time of the quality report, not of the metadata.</p>
<p>Add the field 'mdq.metadata.timestamp' to application-context-mdq.xml to hold the metadata creation or update time.</p>
<p>Each quality suite will be responsible for making this information available in the quality report, so that MDQClient.saveRun can record it.</p>

Bug #7216 (New): MDQClient.saveRun doesn't obsolete existing quality documents
https://projects.ecoinformatics.org/ecoinfo/issues/7216
2017-10-11T17:30:14Z
Peter Slaughter <slaughter@nceas.ucsb.edu>
<p>MDQClient.saveRun is called to upload a newly created quality document, in response to a metadata document being uploaded or updated.</p>
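<p>When saving a new run, the previous quality report could be obsoleted via the DataONE obsoletes/obsoletedBy pattern. A minimal sketch with stand-in types (<code>QualityDoc</code> and <code>linkObsolescence</code> are illustrative placeholders, not the actual MDQClient or SystemMetadata API):</p>

```java
public class ObsolescenceSketch {

    /** Stand-in for the system metadata of a quality report. */
    static class QualityDoc {
        final String pid;
        String obsoletes;    // pid of the doc this one replaces
        String obsoletedBy;  // pid of the doc that replaces this one

        QualityDoc(String pid) { this.pid = pid; }
    }

    /**
     * Link a new quality report to the one it replaces, mirroring what
     * saveRun should do before uploading: the old report is marked
     * obsoletedBy the new pid, and the new report records what it obsoletes.
     */
    public static void linkObsolescence(QualityDoc previous, QualityDoc latest) {
        previous.obsoletedBy = latest.pid;
        latest.obsoletes = previous.pid;
    }

    public static void main(String[] args) {
        QualityDoc run1 = new QualityDoc("quality-run-1");
        QualityDoc run2 = new QualityDoc("quality-run-2");
        linkObsolescence(run1, run2);
        // Statistics can then skip any report with a non-null obsoletedBy.
        System.out.println(run1.obsoletedBy); // quality-run-2
        System.out.println(run2.obsoletes);   // quality-run-1
    }
}
```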
<p>When MDQClient.saveRun is called by MNodeService.update, it does not check if a quality document has already been created for the metadata document. saveRun should check if a previous quality document has been created for the metadata, and obsolete it with the new quality document. This will ensure that quality statistics are accurate, as obsoleted quality reports will not be included in statistical calculations, since they are essentially duplicates.</p>

Bug #7215 (New): Metacat produces an invalid ZIP archive when a package member has an invalid for...
https://projects.ecoinformatics.org/ecoinfo/issues/7215
2017-10-06T19:27:14Z
Bryce Mecum <mecum@nceas.ucsb.edu>
<p>I submitted a new Data Package and went to download it via the Download All button in MetacatUI, which triggers the /packages route in Metacat. I then tried to unzip it but couldn't get any of my zip extraction tools to do it. Then I hex dumped it:</p>
<pre><code class="text syntaxhl">bryce@mbp ~/Downloads> hexdump -C resource_map_urn-uuid-13c3000d-09b1-453b-86d7-e852d147fb81.rdf7b2b468205c3.zip
00000000 3c 3f 78 6d 6c 20 76 65 72 73 69 6f 6e 3d 22 31 |<?xml version="1|
00000010 2e 30 22 20 65 6e 63 6f 64 69 6e 67 3d 22 55 54 |.0" encoding="UT|
00000020 46 2d 38 22 3f 3e 3c 65 72 72 6f 72 20 64 65 74 |F-8"?><error det|
00000030 61 69 6c 43 6f 64 65 3d 22 30 30 30 30 22 20 65 |ailCode="0000" e|
00000040 72 72 6f 72 43 6f 64 65 3d 22 34 30 34 22 20 6e |rrorCode="404" n|
00000050 61 6d 65 3d 22 4e 6f 74 46 6f 75 6e 64 22 3e 0a |ame="NotFound">.|
00000060 20 20 20 20 3c 64 65 73 63 72 69 70 74 69 6f 6e | <description|
00000070 3e 54 68 65 20 66 6f 72 6d 61 74 20 73 70 65 63 |>The format spec|
00000080 69 66 69 65 64 20 62 79 20 4e 41 20 77 61 73 20 |ified by NA was |
00000090 6e 6f 74 20 66 6f 75 6e 64 20 61 66 74 65 72 20 |not found after |
000000a0 72 65 66 72 65 73 68 69 6e 67 20 74 68 65 20 63 |refreshing the c|
000000b0 61 63 68 65 2e 3c 2f 64 65 73 63 72 69 70 74 69 |ache.</descripti|
000000c0 6f 6e 3e 0a 3c 2f 65 72 72 6f 72 3e 0a |on>.</error>.|
000000cd
</code></pre>
<p>Which looks like Metacat wrote out an XML file and called it a .ZIP. After seeing this particular error message, I realized that Metacat was choking on the "NA" formatId on the Data Object I put in this package. Fair enough I guess.</p>
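<p>This failure is easy to detect programmatically, since a valid ZIP starts with the local-file-header magic bytes <code>PK\x03\x04</code> while Metacat's error body starts with <code>&lt;?xml</code>. A small client-side sanity check (a sketch, not part of Metacat; empty archives use a different signature, so this is only a heuristic):</p>

```java
import java.util.Arrays;

public class ZipSanityCheck {

    /** True if the first four bytes are the ZIP local-file-header magic. */
    public static boolean looksLikeZip(byte[] firstBytes) {
        byte[] magic = {0x50, 0x4b, 0x03, 0x04}; // "PK\3\4"
        return firstBytes.length >= 4
            && Arrays.equals(Arrays.copyOf(firstBytes, 4), magic);
    }

    public static void main(String[] args) {
        byte[] zip = {0x50, 0x4b, 0x03, 0x04, 0x14, 0x00};
        byte[] xmlError = "<?xml version=\"1.0\"?>".getBytes();
        System.out.println(looksLikeZip(zip));      // true
        System.out.println(looksLikeZip(xmlError)); // false
    }
}
```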
<p>I think Metacat did two things that were surprising to me:</p>
<p>1. Didn't produce what I asked for even though it reasonably could have (Metacat doesn't need to know the formatID to send me the file)<br />2. Didn't produce any error message (e.g., a non-200 HTTP status) and instead sent me an invalid ZIP file</p>

Bug #7213 (New): Document the EZID landing page template property
https://projects.ecoinformatics.org/ecoinfo/issues/7213
2017-10-05T00:43:02Z
Chris Jones <cjones@nceas.ucsb.edu>
<p>Mike Frenock pointed out some issues with the <code>guid.ezid.uritemplate.metadata</code> property. We need to add this to the documentation. Also, consider whether we should make this configurable to point to an external server instead of the local Metacat server.</p>

Bug #7212 (New): metacat-index missing metadata quality fields
https://projects.ecoinformatics.org/ecoinfo/issues/7212
2017-10-03T18:59:38Z
Peter Slaughter <slaughter@nceas.ucsb.edu>
<p>The Spring context file ./metacat-index/src/main/resources/application-context-mdq.xml doesn't contain a bean definition for the quality check types 'congruency' or 'dataFormats', although these are check types that we should record results for. There is a bean for check type 'other', but this isn't sufficient.</p>

Bug #7210 (New): View service duplicates EML Text content
https://projects.ecoinformatics.org/ecoinfo/issues/7210
2017-09-15T00:13:10Z
Bryce Mecum <mecum@nceas.ucsb.edu>
<p>This abstract<br /><pre><code class="xml syntaxhl">
<span class="nt"><abstract></span>
<span class="nt"><section></span>
<span class="nt"><title></span>Introduction<span class="nt"></title></span>
<span class="nt"><para></span>Between 1958 and 1999, Austin Post led the USGS collection of aerial imagery of North American glaciers. These images are primarily vertical stereo black and white images, although single oblique images, as well as color images have been collected. The glaciers of North America were the subjects, and the digital products made available here serve to document the changes that have occurred to the glaciers over the past 5 decades. The purpose of this project is to preserve the data contained within these film images in a digital format for future analysis of North American glacier change.<span class="nt"></para></span>
<span class="nt"></section></span>
<span class="nt"><section></span>
<span class="nt"><title></span>File Layout<span class="nt"></title></span>
<span class="nt"><para></span>
<span class="nt"><orderedlist></span>
<span class="nt"><listitem></span>
<span class="nt"><para></span>The first level contains an overall data set of image metadata from 1964 - 1997 (nagapData.csv) and an R script (searchData.R) with instructions on how to search and subset the data. fileLayout.pdf shows the file structure and folder contents visually. There are also three kml files with flight path information by decade.<span class="nt"></para></span>
<span class="nt"></listitem></span>
<span class="nt"><listitem></span>
<span class="nt"><para></span>The second level is the year in which the pictures were taken. There are 32 years with images from 1964 – 1997. The majority of these folders are jpegs with notes provided by Austin Post. They also contain a year-specific csv (YYYY.csv) that contains image metadata for the entire year (date, roll numbers, location name, longitude, latitude, altitude, media, and comments). The overall data set (nagapData.csv) is the aggregate of each individual “YYYY.csv” file.<span class="nt"></para></span>
<span class="nt"></listitem></span>
<span class="nt"><listitem></span>
<span class="nt"><para></span>The glacier photos are located at the third level (this level). The folders at this level are distinguished by camera roll number (1, 2, etc.), and image type (thumbnail, jpeg, or tif); some also contain fiducial and oblique image folders. This level primarily contains image files of aerial photos as either thumbnails, jpegs, or tifs. It also includes a csv with image metadata specific to each roll (date, roll numbers, location name, longitude, latitude, altitude, media, and comments), a text file (info.txt) with camera specifications unique to each image, and a text file (histo.txt or matchReport.txt) with color information and scanner specifications unique to each image.<span class="nt"></para></span>
<span class="nt"></listitem></span>
<span class="nt"></orderedlist></span>
<span class="nt"></para></span>
<span class="nt"></section></span>
<span class="nt"></abstract></span>
</code></pre></p>
<p>produces the following HTML:</p>
<pre><code class="xml syntaxhl"><span class="nt"><div</span> <span class="na">class=</span><span class="s">"sectionText"</span><span class="nt">></span>
<span class="nt"><h4</span> <span class="na">class=</span><span class="s">"bold"</span><span class="nt">></span>Introduction<span class="nt"></h4></span>
<span class="nt"><p></span>Between 1958 and 1999, Austin Post led the USGS collection of aerial imagery of North American glaciers. These images are primarily vertical stereo black and white images, although single oblique images, as well as color images have been collected. The glaciers of North America were the subjects, and the digital products made available here serve to document the changes that have occurred to the glaciers over the past 5 decades. The purpose of this project is to preserve the data contained within these film images in a digital format for future analysis of North American glacier change.<span class="nt"></p></span>
<span class="nt"></div></span>
<span class="nt"><div</span> <span class="na">class=</span><span class="s">"sectionText"</span><span class="nt">></span>
<span class="nt"><h4</span> <span class="na">class=</span><span class="s">"bold"</span><span class="nt">></span>File Layout<span class="nt"></h4></span>
<span class="nt"><p></span>The first level contains an overall data set of image metadata from 1964 - 1997 (nagapData.csv) and an R script (searchData.R) with instructions on how to search and subset the data. fileLayout.pdf shows the file structure and folder contents visually. There are also three kml files with flight path information by decade.<span class="nt"></p></span>
<span class="nt"><p></span>The second level is the year in which the pictures were taken. There are 32 years with images from 1964 <span class="ni">&ndash;</span> 1997. The majority of these folders are jpegs with notes provided by Austin Post. They also contain a year-specific csv (YYYY.csv) that contains image metadata for the entire year (date, roll numbers, location name, longitude, latitude, altitude, media, and comments). The overall data set (nagapData.csv) is the aggregate of each individual <span class="ni">&ldquo;</span>YYYY.csv<span class="ni">&rdquo;</span> file.<span class="nt"></p></span>
<span class="nt"><p></span>The glacier photos are located at the third level (this level). The folders at this level are distinguished by camera roll number (1, 2, etc.), and image type (thumbnail, jpeg, or tif); some also contain fiducial and oblique image folders. This level primarily contains image files of aerial photos as either thumbnails, jpegs, or tifs. It also includes a csv with image metadata specific to each roll (date, roll numbers, location name, longitude, latitude, altitude, media, and comments), a text file (info.txt) with camera specifications unique to each image, and a text file (histo.txt or matchReport.txt) with color information and scanner specifications unique to each image.<span class="nt"></p></span>
<span class="nt"><p></span>
The first level contains an overall data set of image metadata from 1964 - 1997 (nagapData.csv) and an R script (searchData.R) with instructions on how to search and subset the data. fileLayout.pdf shows the file structure and folder contents visually. There are also three kml files with flight path information by decade.
The second level is the year in which the pictures were taken. There are 32 years with images from 1964 <span class="ni">&ndash;</span> 1997. The majority of these folders are jpegs with notes provided by Austin Post. They also contain a year-specific csv (YYYY.csv) that contains image metadata for the entire year (date, roll numbers, location name, longitude, latitude, altitude, media, and comments). The overall data set (nagapData.csv) is the aggregate of each individual <span class="ni">&ldquo;</span>YYYY.csv<span class="ni">&rdquo;</span> file.
The glacier photos are located at the third level (this level). The folders at this level are distinguished by camera roll number (1, 2, etc.), and image type (thumbnail, jpeg, or tif); some also contain fiducial and oblique image folders. This level primarily contains image files of aerial photos as either thumbnails, jpegs, or tifs. It also includes a csv with image metadata specific to each roll (date, roll numbers, location name, longitude, latitude, altitude, media, and comments), a text file (info.txt) with camera specifications unique to each image, and a text file (histo.txt or matchReport.txt) with color information and scanner specifications unique to each image.
<span class="nt"></p></span>
<span class="nt"></div></span>
</code></pre>
<p>which, as you can see, duplicates the content of the orderedlist. The content shouldn't be duplicated.</p>

Bug #7199 (New): Upgrade postgresql jdbc jar file on Metacat
https://projects.ecoinformatics.org/ecoinfo/issues/7199
2017-06-02T21:03:49Z
Jing Tao <tao@nceas.ucsb.edu>
<p>Please see details in this ticket:<br /><a class="external" href="https://redmine.dataone.org/issues/8104">https://redmine.dataone.org/issues/8104</a></p>

Bug #7182 (New): Allow partial package downloads when some of the objects are private
https://projects.ecoinformatics.org/ecoinfo/issues/7182
2017-04-13T15:10:57Z
Lauren Walker <walker@nceas.ucsb.edu>
<p>When you try to download a package that has at least one private object, you get a 401 - Unauthorized response. When I am authorized to read at least one object in a package, I would expect to still be able to download the .zip package with those objects.</p>
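<p>The requested behavior amounts to filtering the package down to the readable subset before building the archive. A server-side sketch with a stand-in authorization check (<code>isAuthorized</code> here is a placeholder for Metacat's real access-control check, not its API):</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class PartialPackageSketch {

    /** Placeholder for Metacat's real read-authorization check. */
    public static boolean isAuthorized(String pid, Set<String> readablePids) {
        return readablePids.contains(pid);
    }

    /**
     * Keep only the package members the caller may read; the ZIP would
     * then be built from this subset instead of failing with a 401.
     */
    public static List<String> readableSubset(List<String> packagePids,
                                              Set<String> readablePids) {
        List<String> readable = new ArrayList<>();
        for (String pid : packagePids) {
            if (isAuthorized(pid, readablePids)) {
                readable.add(pid);
            }
        }
        return readable;
    }

    public static void main(String[] args) {
        List<String> pkg = List.of("meta.1", "data.1", "data.private");
        Set<String> readable = Set.of("meta.1", "data.1");
        System.out.println(readableSubset(pkg, readable)); // [meta.1, data.1]
    }
}
```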
<p>It's difficult for MetacatUI to tell in advance when "Download All" will fail, since it would need to check the /isAuthorized/{pid}?action=read result for every single object in the package, which can sometimes be >100 objects. So right now we have an issue where users get a failed package download.</p>

Bug #7181 (New): Verify completeness of unit test MetacatRdfXmlSubprocessorTest
https://projects.ecoinformatics.org/ecoinfo/issues/7181
2017-04-11T23:41:43Z
Peter Slaughter <slaughter@nceas.ucsb.edu>
<p>Verify that all prov relationships that are indexed via src/main/resources/application-context-prov-base.xml are inspected by the unit test MetacatRdfXmlSubprocessorTest.java which reads src/test/resources/rdfxml-example.xml.</p>