Bug #5273
closed
docs with inline-data allow invalid XML into Metacat
Description
When you insert a document with inline-data, the data is stripped out of the document before the document is validated. A GET on the document, however, returns the full document as it is read off of the disk. So if you insert a doc whose inline-data contains invalid characters (like unescaped ampersands), Metacat will not recognize that it is invalid, but any attempt to parse the document returned by the GET will fail with a parser error.
We should validate the document before stripping the inline-data out of it.
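A minimal sketch of the proposed ordering, assuming a plain well-formedness check on the complete submitted document is enough to catch the problem. The class and method names (InlineDataCheck, isWellFormed) are hypothetical and are not Metacat code; only standard JAXP calls are used. The idea is to run this check on the full document text, inline-data included, before any stripping happens.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Metacat: checks the complete document,
// inline-data included, before anything is stripped out of it.
public class InlineDataCheck {

    // Returns true if the XML text parses without a fatal error.
    public static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Covers SAXParseException, e.g. for an unescaped ampersand.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException("XML parser unavailable or input unreadable", e);
        }
    }
}
```

With this in place, an insert handler could reject the submission when isWellFormed returns false for the original text, and only then go on to strip the inline-data and store the document. Schema validation of the stripped document would still be a separate step.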
Files
Updated by ben leinfelder over 13 years ago
Is inline data not contained in CDATA? I thought you could put anything in CDATA and have it be ignored by parsers.
Updated by Matt Jones over 13 years ago
People certainly can use CDATA sections within their inline element, in which case escaping would be taken care of. But in this case, the data in the CDR document is not in a CDATA section, has reserved XML characters in it, and Metacat is not properly rejecting it as invalid.
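To illustrate the distinction, here is a small sketch using a simplified, made-up inline element rather than a real EML or CDR document. The class and method names (InlineTextDemo, rootText) are hypothetical. It shows that a standard parser reads the same text from a CDATA section and from escaped character data, while the raw ampersand makes the document unparseable.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: returns the text content of the root element.
public class InlineTextDemo {

    public static String rootText(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Rethrow fatal parse errors rather than printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            // getTextContent() concatenates text and CDATA section nodes.
            return doc.getDocumentElement().getTextContent();
        } catch (SAXException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootText("<inline><![CDATA[a & b]]></inline>"));
        System.out.println(rootText("<inline>a &amp; b</inline>"));
        // The next call throws IllegalArgumentException: raw '&' is not allowed.
        System.out.println(rootText("<inline>a & b</inline>"));
    }
}
```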
Updated by ben leinfelder almost 13 years ago
This file uses an unescaped ampersand (&) in the inline data section.
Updated by ben leinfelder almost 13 years ago
I tried this with the attached inline.xml file that has an invalid unescaped ampersand in the inline section -- Metacat rejected it as invalid.
If there is a specific CDR file that is still causing this issue, let's reopen and figure out how it is slipping by. Otherwise, I believe this is not currently a problem in trunk.