Docs with inline-data allow invalid XML into Metacat
If you insert a document with inline-data, the data is stripped out of the document before it is validated. However, when you do a GET on the document, it is read from disk. So if you insert a doc whose inline-data contains invalid characters (like unescaped ampersands), Metacat will not recognize that it is invalid, but you will get a parser error when you try to parse the document you get back.
We should be validating the document before stripping the inline-data out of it.
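To illustrate why the order matters, here is a minimal Python sketch. It is not Metacat's code: the sample document, the regex-based `strip_inline` helper, and the use of a well-formedness check in place of full schema validation are all assumptions made for the example.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical document: the <inline> payload contains an unescaped ampersand.
DOC = "<dataset><title>t</title><inline>a & b</inline></dataset>"

def is_well_formed(xml_text):
    """Return True if the text parses as XML, False on a parser error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

def strip_inline(xml_text):
    """Crude stand-in for the inline-data stripping step (an assumption)."""
    return re.sub(r"<inline>.*?</inline>", "<inline/>", xml_text, flags=re.DOTALL)

# Order described in the report: strip first, then check. The bad data is gone
# by the time the check runs, so the document is accepted.
strip_then_check = is_well_formed(strip_inline(DOC))

# Proposed order: check the full document first. The unescaped ampersand is
# still present, so the document is rejected.
check_then_strip = is_well_formed(DOC)

print(strip_then_check, check_then_strip)
```

With the strip-first order the check passes even though the stored document cannot be parsed later; checking the full document first catches the problem at insert time.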
#2 Updated by Matt Jones about 11 years ago
People certainly can use CDATA sections within their inline element, in which case escaping would be taken care of. But in this case, the data in the CDR document is not in a CDATA section, has reserved XML characters in it, and Metacat is not properly rejecting it as invalid.
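For reference, a small Python sketch of the three cases: data wrapped in a CDATA section, data with reserved characters escaped, and data left bare. The sample string and the `inline_text` helper are hypothetical, and the bare `<inline>` root element is only a stand-in for a real document.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical inline payload containing reserved XML characters.
raw = "temp & salinity < 5"

def inline_text(doc):
    """Return the root element's text, or None if the document does not parse."""
    try:
        return ET.fromstring(doc).text
    except ET.ParseError:
        return None

# CDATA section: the parser passes the reserved characters through untouched.
cdata_text = inline_text("<inline><![CDATA[" + raw + "]]></inline>")

# Escaped: & and < become &amp; and &lt;, and the parser decodes them again.
escaped_text = inline_text("<inline>" + escape(raw) + "</inline>")

# Bare: the document is not well-formed, so parsing fails.
bare_text = inline_text("<inline>" + raw + "</inline>")

print(cdata_text == raw, escaped_text == raw, bare_text)
```

The first two forms round-trip to the original string; the third is the case described here, which should be rejected on insert.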
#4 Updated by ben leinfelder about 10 years ago
I tried this with the attached inline.xml file that has an invalid unescaped ampersand in the inline section -- Metacat rejected it as invalid.
If there is a specific CDR file that is still causing this issue, let's reopen and figure out how it is slipping by. Otherwise, I believe this is not currently a problem in trunk.