Metacat: Issues
https://projects.ecoinformatics.org/ecoinfo/
2017-10-24T22:11:37Z
Ecoinformatics Redmine
Bug #7223 (New): EZID metadata registration doesn't seem to work with SIDs
https://projects.ecoinformatics.org/ecoinfo/issues/7223
2017-10-24T22:11:37Z
Bryce Mecum (mecum@nceas.ucsb.edu)
<p>Earlier today, a DOI was generated using R's `generateIdentifier` (which calls MNStorage.generateIdentifier()). Then the newly-minted DOI was set as the Series ID of an EML 2.1.1 Object. The DOI was successfully registered with EZID but the EZID metadata was not correctly set on the object, as shown when I logged in. See the attached screenshot.</p>
<p>I expected the EZID metadata to get filled in like it normally does for DOIs that get used as PIDs. I took a quick glance at the relevant part of Metacat and it doesn't look like anything special is done to handle SIDs.</p>
Bug #7215 (New): Metacat produces an invalid ZIP archive when a package member has an invalid for...
https://projects.ecoinformatics.org/ecoinfo/issues/7215
2017-10-06T19:27:14Z
Bryce Mecum (mecum@nceas.ucsb.edu)
<p>I submitted a new Data Package and went to download it via the Download All button in MetacatUI, which triggers the /packages route in Metacat. I then tried to unzip it, and none of my zip extraction tools could open it. Then I hex dumped it:</p>
<pre><code class="text syntaxhl">bryce@mbp ~/Downloads> hexdump -C resource_map_urn-uuid-13c3000d-09b1-453b-86d7-e852d147fb81.rdf7b2b468205c3.zip
00000000 3c 3f 78 6d 6c 20 76 65 72 73 69 6f 6e 3d 22 31 |<?xml version="1|
00000010 2e 30 22 20 65 6e 63 6f 64 69 6e 67 3d 22 55 54 |.0" encoding="UT|
00000020 46 2d 38 22 3f 3e 3c 65 72 72 6f 72 20 64 65 74 |F-8"?><error det|
00000030 61 69 6c 43 6f 64 65 3d 22 30 30 30 30 22 20 65 |ailCode="0000" e|
00000040 72 72 6f 72 43 6f 64 65 3d 22 34 30 34 22 20 6e |rrorCode="404" n|
00000050 61 6d 65 3d 22 4e 6f 74 46 6f 75 6e 64 22 3e 0a |ame="NotFound">.|
00000060 20 20 20 20 3c 64 65 73 63 72 69 70 74 69 6f 6e | <description|
00000070 3e 54 68 65 20 66 6f 72 6d 61 74 20 73 70 65 63 |>The format spec|
00000080 69 66 69 65 64 20 62 79 20 4e 41 20 77 61 73 20 |ified by NA was |
00000090 6e 6f 74 20 66 6f 75 6e 64 20 61 66 74 65 72 20 |not found after |
000000a0 72 65 66 72 65 73 68 69 6e 67 20 74 68 65 20 63 |refreshing the c|
000000b0 61 63 68 65 2e 3c 2f 64 65 73 63 72 69 70 74 69 |ache.</descripti|
000000c0 6f 6e 3e 0a 3c 2f 65 72 72 6f 72 3e 0a |on>.</error>.|
000000cd
</code></pre>
<p>It looks like Metacat wrote out an XML error document and gave it a .zip extension. After seeing this particular error message, I realized that Metacat was choking on the "NA" formatId of the Data Object I put in this package. Fair enough, I guess.</p>
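<p>As a client-side guard, the check I did by hand with hexdump can be sketched in a few lines of Python. This is only an illustration: `classify_download` is a hypothetical helper name, and the `bad` payload is just the first bytes of the hex dump above reassembled.</p>

```python
import io
import zipfile


def classify_download(payload: bytes) -> str:
    """Return 'zip', 'xml-error', or 'unknown' for a downloaded package body."""
    # zipfile.is_zipfile accepts a file-like object and looks for the
    # ZIP end-of-central-directory record.
    if zipfile.is_zipfile(io.BytesIO(payload)):
        return "zip"
    # Assumption: an error response starts with an XML declaration or <error.
    head = payload.lstrip()[:64]
    if head.startswith(b"<?xml") or head.startswith(b"<error"):
        return "xml-error"
    return "unknown"


# First bytes of the hex dump above, reassembled.
bad = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<error detailCode="0000" errorCode="404" name="NotFound">\n'
)

# A small valid archive built in memory for comparison.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("hello.txt", "hello")
good = buf.getvalue()

print(classify_download(bad))
print(classify_download(good))
```

<p>A check like this only detects the problem after the fact; it doesn't replace Metacat returning a proper error status.</p>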
<p>I think Metacat did two things that were surprising to me:</p>
<p>1. It didn't produce what I asked for even though it reasonably could have (Metacat doesn't need to know the formatId to send me the file).<br />2. It didn't report an error (e.g., a non-200 HTTP status) and instead sent me an invalid ZIP file.</p>
Bug #7210 (New): View service duplicates EML Text content
https://projects.ecoinformatics.org/ecoinfo/issues/7210
2017-09-15T00:13:10Z
Bryce Mecum (mecum@nceas.ucsb.edu)
<p>This abstract<br /><pre><code class="xml syntaxhl">
<span class="nt"><abstract></span>
<span class="nt"><section></span>
<span class="nt"><title></span>Introduction<span class="nt"></title></span>
<span class="nt"><para></span>Between 1958 and 1999, Austin Post led the USGS collection of aerial imagery of North American glaciers. These images are primarily vertical stereo black and white images, although single oblique images, as well as color images have been collected. The glaciers of North America were the subjects, and the digital products made available here serve to document the changes that have occurred to the glaciers over the past 5 decades. The purpose of this project is to preserve the data contained within these film images in a digital format for future analysis of North American glacier change.<span class="nt"></para></span>
<span class="nt"></section></span>
<span class="nt"><section></span>
<span class="nt"><title></span>File Layout<span class="nt"></title></span>
<span class="nt"><para></span>
<span class="nt"><orderedlist></span>
<span class="nt"><listitem></span>
<span class="nt"><para></span>The first level contains an overall data set of image metadata from 1964 - 1997 (nagapData.csv) and an R script (searchData.R) with instructions on how to search and subset the data. fileLayout.pdf shows the file structure and folder contents visually. There are also three kml files with flight path information by decade.<span class="nt"></para></span>
<span class="nt"></listitem></span>
<span class="nt"><listitem></span>
<span class="nt"><para></span>The second level is the year in which the pictures were taken. There are 32 years with images from 1964 – 1997. The majority of these folders are jpegs with notes provided by Austin Post. They also contain a year-specific csv (YYYY.csv) that contains image metadata for the entire year (date, roll numbers, location name, longitude, latitude, altitude, media, and comments). The overall data set (nagapData.csv) is the aggregate of each individual “YYYY.csv” file.<span class="nt"></para></span>
<span class="nt"></listitem></span>
<span class="nt"><listitem></span>
<span class="nt"><para></span>The glacier photos are located at the third level (this level). The folders at this level are distinguished by camera roll number (1, 2, etc.), and image type (thumbnail, jpeg, or tif); some also contain fiducial and oblique image folders. This level primarily contains image files of aerial photos as either thumbnails, jpegs, or tifs. It also includes a csv with image metadata specific to each roll (date, roll numbers, location name, longitude, latitude, altitude, media, and comments), a text file (info.txt) with camera specifications unique to each image, and a text file (histo.txt or matchReport.txt) with color information and scanner specifications unique to each image.<span class="nt"></para></span>
<span class="nt"></listitem></span>
<span class="nt"></orderedlist></span>
<span class="nt"></para></span>
<span class="nt"></section></span>
<span class="nt"></abstract></span>
</code></pre></p>
<p>produces the following HTML:</p>
<pre><code class="xml syntaxhl"><span class="nt"><div</span> <span class="na">class=</span><span class="s">"sectionText"</span><span class="nt">></span>
<span class="nt"><h4</span> <span class="na">class=</span><span class="s">"bold"</span><span class="nt">></span>Introduction<span class="nt"></h4></span>
<span class="nt"><p></span>Between 1958 and 1999, Austin Post led the USGS collection of aerial imagery of North American glaciers. These images are primarily vertical stereo black and white images, although single oblique images, as well as color images have been collected. The glaciers of North America were the subjects, and the digital products made available here serve to document the changes that have occurred to the glaciers over the past 5 decades. The purpose of this project is to preserve the data contained within these film images in a digital format for future analysis of North American glacier change.<span class="nt"></p></span>
<span class="nt"></div></span>
<span class="nt"><div</span> <span class="na">class=</span><span class="s">"sectionText"</span><span class="nt">></span>
<span class="nt"><h4</span> <span class="na">class=</span><span class="s">"bold"</span><span class="nt">></span>File Layout<span class="nt"></h4></span>
<span class="nt"><p></span>The first level contains an overall data set of image metadata from 1964 - 1997 (nagapData.csv) and an R script (searchData.R) with instructions on how to search and subset the data. fileLayout.pdf shows the file structure and folder contents visually. There are also three kml files with flight path information by decade.<span class="nt"></p></span>
<span class="nt"><p></span>The second level is the year in which the pictures were taken. There are 32 years with images from 1964 <span class="ni">&ndash;</span> 1997. The majority of these folders are jpegs with notes provided by Austin Post. They also contain a year-specific csv (YYYY.csv) that contains image metadata for the entire year (date, roll numbers, location name, longitude, latitude, altitude, media, and comments). The overall data set (nagapData.csv) is the aggregate of each individual <span class="ni">&ldquo;</span>YYYY.csv<span class="ni">&rdquo;</span> file.<span class="nt"></p></span>
<span class="nt"><p></span>The glacier photos are located at the third level (this level). The folders at this level are distinguished by camera roll number (1, 2, etc.), and image type (thumbnail, jpeg, or tif); some also contain fiducial and oblique image folders. This level primarily contains image files of aerial photos as either thumbnails, jpegs, or tifs. It also includes a csv with image metadata specific to each roll (date, roll numbers, location name, longitude, latitude, altitude, media, and comments), a text file (info.txt) with camera specifications unique to each image, and a text file (histo.txt or matchReport.txt) with color information and scanner specifications unique to each image.<span class="nt"></p></span>
<span class="nt"><p></span>
The first level contains an overall data set of image metadata from 1964 - 1997 (nagapData.csv) and an R script (searchData.R) with instructions on how to search and subset the data. fileLayout.pdf shows the file structure and folder contents visually. There are also three kml files with flight path information by decade.
The second level is the year in which the pictures were taken. There are 32 years with images from 1964 <span class="ni">&ndash;</span> 1997. The majority of these folders are jpegs with notes provided by Austin Post. They also contain a year-specific csv (YYYY.csv) that contains image metadata for the entire year (date, roll numbers, location name, longitude, latitude, altitude, media, and comments). The overall data set (nagapData.csv) is the aggregate of each individual <span class="ni">&ldquo;</span>YYYY.csv<span class="ni">&rdquo;</span> file.
The glacier photos are located at the third level (this level). The folders at this level are distinguished by camera roll number (1, 2, etc.), and image type (thumbnail, jpeg, or tif); some also contain fiducial and oblique image folders. This level primarily contains image files of aerial photos as either thumbnails, jpegs, or tifs. It also includes a csv with image metadata specific to each roll (date, roll numbers, location name, longitude, latitude, altitude, media, and comments), a text file (info.txt) with camera specifications unique to each image, and a text file (histo.txt or matchReport.txt) with color information and scanner specifications unique to each image.
<span class="nt"></p></span>
<span class="nt"></div></span>
</code></pre>
<p>which, as you can see, duplicates the content of the orderedlist. The content shouldn't be duplicated.</p>
Bug #7164 (New): View service rendering EML project abstract incorrectly
https://projects.ecoinformatics.org/ecoinfo/issues/7164
2016-12-01T21:38:40Z
Bryce Mecum (mecum@nceas.ucsb.edu)
<p>The view service's XSLT for rendering EML is producing output that doesn't look quite right. See attached screenshot. A quick investigation revealed it's just missing a wrapping div.</p>
Bug #7062 (New): Unable to login to admin interface intermittently, NullPointerException when fail
https://projects.ecoinformatics.org/ecoinfo/issues/7062
2016-07-22T18:42:35Z
Bryce Mecum (mecum@nceas.ucsb.edu)
<p>This has been happening for months. When I log into the Metacat admin panel at /{context}/admin, I enter my LDAP credentials and click "Login". Sometimes this takes ~10 seconds (why?), but sometimes it takes minutes and then returns an error:</p>
<p>LoginAdmin.authenticateUser - Could not log in as: uid=mecum,ou=Account,dc=ecoinformatics,dc=org : Connection to the authentication service failed in AuthSession.authenticate: AuthLdap.authenticate - NullPointerException while authenticating in AuthLdap.authenticate: java.lang.NullPointerException . Please try again</p>
<p>When I try again, the login goes through without error.</p>
Bug #7022 (New): Fatal processing error when updating an object with incorrect sysmeta
https://projects.ecoinformatics.org/ecoinfo/issues/7022
2016-05-06T20:02:36Z
Bryce Mecum (mecum@nceas.ucsb.edu)
<p>Jessica used the R package today to update an object with new bytes via the D1 REST API call for MNStorage.update(). She received this error back from Tomcat:</p>
<pre>
<?xml version="1.0"?>
<error>Fatal processing error.</error>
</pre>
<p>She showed me her input to the call and I found that the sysmeta formatId was EML 2.1.1 instead of text/csv, which was the correct formatId for the object. This was a bug in my code, which she was running, and I've since fixed it.</p>
<p>I expected a more useful error that directly addressed the mismatch in format ID and the file being uploaded.</p>
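<p>To make concrete the kind of message I had in mind, here is a rough Python sketch. It is not how Metacat validates anything: `describe_format_mismatch` is a hypothetical helper, and it only checks that an object declared as EML 2.1.1 is well-formed XML with a root element in the EML 2.1.1 namespace.</p>

```python
import xml.etree.ElementTree as ET

# formatId for EML 2.1.1; it doubles as the XML namespace of the root element.
EML_211 = "eml://ecoinformatics.org/eml-2.1.1"


def describe_format_mismatch(format_id: str, data: bytes):
    """Return a readable problem description, or None if nothing obvious is wrong.

    Hypothetical helper: only objects declared as EML 2.1.1 are inspected.
    """
    if format_id != EML_211:
        return None
    try:
        root = ET.fromstring(data)
    except ET.ParseError as exc:
        return f"formatId is {format_id} but the object is not well-formed XML: {exc}"
    expected_tag = "{" + EML_211 + "}eml"
    if root.tag != expected_tag:
        return f"formatId is {format_id} but the root element is {root.tag}"
    return None


# A CSV body submitted with the EML formatId, as in the failing update.
csv_bytes = b"site,year,count\nA,2015,12\n"
print(describe_format_mismatch(EML_211, csv_bytes))
```

<p>Something along these lines in the response body would have pointed straight at the bad formatId.</p>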
<p>Why did this action cause a "Fatal processing error"? It seems like the sysmeta formatId being set to EML should've triggered the EML validation routine, which should have returned a validation error.</p>
Bug #6994 (New): Bad call to MNStorage.update() via REST API can result in bad state and StackOve...
https://projects.ecoinformatics.org/ecoinfo/issues/6994
2016-03-23T20:27:30Z
Bryce Mecum (mecum@nceas.ucsb.edu)
<p>This all happened on arcticdata.io production over the last couple of days.</p>
<p>I was attempting to update an object and forgot the {PID} part of the REST API URL: PUT /object/{pid}. This resulted in unexpected behavior and an unexpected state.</p>
<p>- The request returned a ServiceError (HTTP Status 500) of "StackOverflowError". This was unexpected.<br />- The sysmeta for the PID I was updating changed: the PID became obsoleted, with obsoletedBy set to the new PID I chose. This was expected.<br />- Calls to /meta and /object for the new PID failed. This was unexpected.</p>
<p>It appears that the new PID was reserved but never assigned sysmeta or object bytes, resulting in an unexpected system state.</p>
<p>I then set about archiving the PID by first removing public read access, which resulted in another StackOverflowError, although public read access was revoked as expected. In the end, I had Chris Jones do an administrative delete on the object.</p>
<p>I see two things here:</p>
<p>1. The requests returned StackOverflowErrors. It seems like a stack overflow shouldn't be possible. The requests returning this error took 10+ seconds to return, which suggests this could be an easy attack vector.<br />2. An invalid REST API call was not rejected immediately (the call where I was missing the {PID} part of the URL).</p>
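<p>For the second point, the early rejection I'd expect could look roughly like this Python sketch. `extract_pid` is a hypothetical helper, not Metacat code, and mapping the raised error to an HTTP 400 response is my assumption about what the API should do.</p>

```python
from urllib.parse import unquote


def extract_pid(method: str, path: str) -> str:
    """Return the PID from an /object/{pid} path, or raise ValueError."""
    prefix = "/object/"
    if not path.startswith(prefix):
        raise ValueError(f"{method} {path}: expected a path of the form /object/{{pid}}")
    # PIDs are URL-encoded in the path, so decode before checking.
    pid = unquote(path[len(prefix):])
    if not pid.strip():
        # Assumption: the caller turns this into an HTTP 400 before any
        # identifier is reserved or any sysmeta is modified.
        raise ValueError(f"{method} {path}: missing {{pid}} path segment")
    return pid


print(extract_pid("PUT", "/object/urn%3Auuid%3A1234"))
```

<p>Failing this early would have left the existing object's sysmeta untouched in my case.</p>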