Bug #5273

docs with inline-data allow invalid xml into metacat

Added by Chad Berkley over 8 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: Immediate
Category: metacat
Target version:
Start date: 01/13/2011
Due date:
% Done: 0%
Estimated time:
Bugzilla-Id: 5273

Description

If you insert a document with inline-data, the data is stripped out of the document before it is validated. However, when you do a GET on the document, it is read off the disk with the inline-data included. So if you insert a doc whose inline-data contains invalid characters (like unescaped ampersands), metacat will not recognize that it is invalid, but you will get a parser error when you try to parse the document returned by the GET.

We should validate the document before stripping the inline-data out of it.
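The ordering problem can be illustrated with a small sketch. This is not Metacat's code (Metacat is written in Java); `strip_inline`, `accept_strip_first`, and `accept_check_first` are hypothetical names made up for illustration, the regex is a stand-in for whatever Metacat actually does to remove inline data, and the check here is only XML well-formedness, not schema validation.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the inline-data stripping step.
INLINE_RE = re.compile(r"<inline>.*?</inline>", re.DOTALL)


def is_well_formed(xml_text):
    """Return True if the text parses as XML (well-formedness only)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def strip_inline(xml_text):
    """Replace each <inline>...</inline> element with an empty one."""
    return INLINE_RE.sub("<inline/>", xml_text)


def accept_strip_first(xml_text):
    """Order described in this bug: strip the inline data, then check."""
    return is_well_formed(strip_inline(xml_text))


def accept_check_first(xml_text):
    """Proposed order: check the full document before stripping."""
    return is_well_formed(xml_text)


# Inline data containing an unescaped ampersand.
BAD_DOC = "<doc><inline>salt & pepper</inline></doc>"

print(accept_strip_first(BAD_DOC))  # the bad content is gone before the check
print(accept_check_first(BAD_DOC))  # the bad content is still there to be caught
```

With the strip-first order the unescaped ampersand is removed before the parser ever sees it, so the document is accepted even though the copy on disk cannot be parsed.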

Attachment: inline.xml (2.19 KB), ben leinfelder, 10/27/2011 06:58 PM

History

#1 Updated by ben leinfelder over 8 years ago

Is inline data not contained in CDATA? I thought you could put anything in CDATA and have it be ignored by parsers.

#2 Updated by Matt Jones over 8 years ago

People certainly can use CDATA sections within their inline element, in which case escaping would be taken care of. But in this case, the data in the CDR document is not in a CDATA section, has reserved XML characters in it, and Metacat is not properly rejecting it as invalid.
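To make the distinction concrete, here is a minimal sketch using Python's standard-library parser rather than Metacat's own. The element name `inline` and the sample strings are illustrative assumptions, not taken from the CDR document.

```python
import xml.etree.ElementTree as ET


def parsed_text(xml_text):
    """Return the root element's text, or None if the XML is not well-formed."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError:
        return None


# The same payload written three ways (illustrative samples).
RAW = "<inline>a & b</inline>"                  # unescaped ampersand
IN_CDATA = "<inline><![CDATA[a & b]]></inline>"  # wrapped in a CDATA section
ESCAPED = "<inline>a &amp; b</inline>"           # entity-escaped

for label, doc in (("raw", RAW), ("cdata", IN_CDATA), ("escaped", ESCAPED)):
    print(label, repr(parsed_text(doc)))
```

Only the raw form fails to parse. The CDATA and entity-escaped forms both yield the same text, which is why a CDATA section would have avoided the problem had the document used one.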

#3 Updated by ben leinfelder almost 8 years ago

This file uses an unescaped ampersand (&) in the inline data section.

#4 Updated by ben leinfelder almost 8 years ago

I tried this with the attached inline.xml file that has an invalid unescaped ampersand in the inline section -- Metacat rejected it as invalid.

If there is a specific CDR file that is still causing this issue, let's reopen and figure out how it is slipping by. Otherwise, I believe this is not currently a problem in trunk.

#5 Updated by Redmine Admin over 6 years ago

Original Bugzilla ID was 5273
