Bug #5273

docs with inline-data allow invalid xml into metacat

Added by Chad Berkley over 11 years ago. Updated over 10 years ago.

If you insert a document with inline-data, the data is stripped out of the document before it is validated. However, when you do a GET on the document, it is read off of the disk, where the inline data is still present. So if you insert a doc whose inline-data contains invalid characters (like unescaped ampersands), metacat will not recognize that it is invalid, but any attempt to parse the retrieved document fails with a parser error.

We should validate the document before stripping the inline-data out of it.
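
A minimal sketch of why the ordering matters, written in Python for brevity and not taken from Metacat's code. The helper names `check_well_formed` and `strip_inline`, the `<inline>` element name, and the sample document are all assumptions for illustration, and the check covers well-formedness only, not schema validation.

```python
import re
import xml.etree.ElementTree as ET


def check_well_formed(doc: str) -> bool:
    """Return True if the document parses as well-formed XML."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


def strip_inline(doc: str) -> str:
    """Hypothetical stand-in for the step that removes inline data.

    Replaces the content of every <inline> element with an empty element
    using plain text matching, so it works even on malformed input.
    """
    return re.sub(r"<inline>.*?</inline>", "<inline/>", doc, flags=re.DOTALL)


# Sample document (assumed): the inline data has an unescaped ampersand.
BAD_DOC = "<dataset><inline>temp & salinity</inline></dataset>"

# Order described in the report: strip first, then check.
# The offending text is gone by the time the parser sees the document.
stripped_first = check_well_formed(strip_inline(BAD_DOC))

# Proposed order: check the full document before stripping anything.
checked_first = check_well_formed(BAD_DOC)

print("strip then check accepts the doc:", stripped_first)
print("check then strip accepts the doc:", checked_first)
```

With the strip-first ordering the bad document is accepted; checking the full document first rejects it before it can reach the disk.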

inline.xml (2.19 KB) ben leinfelder, 10/27/2011 06:58 PM


#1 Updated by ben leinfelder over 11 years ago

Is inline data not contained in CDATA? I thought you could put anything in CDATA and have it be ignored by parsers.

#2 Updated by Matt Jones over 11 years ago

People certainly can use CDATA sections within their inline element, in which case escaping would be taken care of. But in this case, the data in the CDR document is not in a CDATA section, has reserved XML characters in it, and Metacat is not properly rejecting it as invalid.
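
To illustrate the distinction, here is a small Python sketch (not Metacat code). The `<inline>` element name, the sample strings, and the helper `parsed_text` are assumptions. It compares the same data written bare, wrapped in a CDATA section, and entity-escaped.

```python
import xml.etree.ElementTree as ET

# The same inline payload written three ways (sample data, assumed).
RAW = "<inline>a & b < c</inline>"                      # reserved chars, unprotected
CDATA = "<inline><![CDATA[a & b < c]]></inline>"        # protected by a CDATA section
ESCAPED = "<inline>a &amp; b &lt; c</inline>"           # protected by entity escaping


def parsed_text(doc: str):
    """Return the element text if the document is well-formed, else None."""
    try:
        return ET.fromstring(doc).text
    except ET.ParseError:
        return None


for label, doc in (("raw", RAW), ("cdata", CDATA), ("escaped", ESCAPED)):
    print(label, "->", parsed_text(doc))
```

Only the bare form fails to parse; the CDATA and escaped forms both yield the original text, which is why a document without CDATA or escaping should be rejected at insert time.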

#3 Updated by ben leinfelder over 10 years ago

This file uses an unescaped ampersand (&) in the inline data section.

#4 Updated by ben leinfelder over 10 years ago

I tried this with the attached inline.xml file that has an invalid unescaped ampersand in the inline section -- Metacat rejected it as invalid.

If there is a specific CDR file that is still causing this issue, let's reopen and figure out how it is slipping by. Otherwise, I believe this is not currently a problem in trunk.

#5 Updated by Redmine Admin over 9 years ago

Original Bugzilla ID was 5273
