Bug #2091
closed
Don't determine document namespace by "schemaLocation" attribute
0%
Description
In the MetacatServlet class, before inserting a document into the db, we
need to determine which parser should be initialized. We get the namespace
information from the "schemaLocation" attribute: if the namespace indicates
an eml200 document, the eml200 parser is initialized. If it is neither
eml200 nor eml201, the generic parser is initialized (which does not
handle writing access rules). When a document has no "schemaLocation"
attribute, the generic parser is used and no access rules are created.
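The parser choice described above can be keyed on the namespace URI rather than on "schemaLocation". The following is only a minimal sketch of that dispatch: `ParserSelector`, `ParserKind` and `select` are hypothetical names, not Metacat's actual classes, and the eml-2.0.1 URI is assumed to follow the same pattern as the eml-2.0.0 URI quoted later in this ticket.

```java
import java.util.Map;

// Hypothetical helper, not part of Metacat: maps a namespace URI to a parser kind.
final class ParserSelector {

    enum ParserKind { EML_200, EML_201, GENERIC }

    // The eml-2.0.1 URI is an assumption modelled on the eml-2.0.0 URI in this ticket.
    private static final Map<String, ParserKind> BY_NAMESPACE = Map.of(
            "eml://ecoinformatics.org/eml-2.0.0", ParserKind.EML_200,
            "eml://ecoinformatics.org/eml-2.0.1", ParserKind.EML_201);

    /** Returns the parser kind for a namespace URI; unknown or missing URIs get GENERIC. */
    static ParserKind select(String namespaceUri) {
        if (namespaceUri == null) {
            return ParserKind.GENERIC; // Map.of rejects null lookups, so handle it first
        }
        return BY_NAMESPACE.getOrDefault(namespaceUri, ParserKind.GENERIC);
    }
}
```

With this shape, a document that lacks "schemaLocation" still reaches the EML parser as long as its root namespace is known.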
Using schemaLocation to determine the namespace is definitely
a bug. There is never any requirement in XML Schema that someone
provides a schemaLocation -- it's only used to suggest a location as a
hint so the .xsd file can be tracked down if the parser doesn't have a
cached copy already. The prefix of the root element should tell you the
namespace, which is declared in the matching xmlns:prefix attribute in the header.
There are routines in the XML parsers for finding this stuff out during
the parse.
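As an illustration of the parser routines mentioned above, here is a minimal sketch using the JDK's namespace-aware SAX parser. It reads the namespace URI of the root element and aborts the parse immediately afterwards. `RootNamespaceSniffer` and `rootNamespace` are hypothetical names chosen for this example; this is not necessarily how the fix was implemented in Metacat.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reports the namespace URI of a document's root element.
final class RootNamespaceSniffer {

    /** Thrown from the handler to stop parsing once the root element is seen. */
    private static final class RootFound extends SAXException {
        final String namespaceUri;

        RootFound(String namespaceUri) {
            super("root element found");
            this.namespaceUri = namespaceUri;
        }
    }

    /** Returns the root element's namespace URI, or "" if it has none. */
    static String rootNamespace(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // let the parser resolve prefixes for us
        try {
            factory.newSAXParser().parse(
                    new InputSource(new StringReader(xml)),
                    new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String localName,
                                                 String qName, Attributes atts)
                                throws SAXException {
                            // The first startElement event is the root element.
                            throw new RootFound(uri);
                        }
                    });
        } catch (RootFound found) {
            return found.namespaceUri;
        }
        return ""; // not reached for well-formed input, which always has a root
    }
}
```

Because the parser resolves the prefix itself, the same call covers both a prefixed root such as `<eml:eml xmlns:eml="...">` and a root that relies on a default `xmlns="..."` declaration.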
Updated by Saurabh Garg over 19 years ago
I discussed this with Jing and Dan and we came up with the following steps to
find out the namespace:
1. Check whether the root element has a prefix (e.g. <eml:eml>). If it does,
go to step 2, otherwise go to step 3.
2. Look for the xmlns:prefix attribute to find the ns
(e.g. xmlns:eml="eml://ecoinformatics.org/eml-2.0.0")
2a. If xmlns:prefix is not found, go to step 3
3. Look for the xmlns attribute to find the ns (e.g.
xmlns="eml://ecoinformatics.org/eml-2.0.0")
4. If no xmlns attribute is found, use the generic schema.
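The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: `NamespaceSteps`, `resolve` and `namespaceOf` are hypothetical names, and the SAX parser is deliberately left non-namespace-aware so that the xmlns declarations show up as ordinary attributes on the root element. An empty result stands for "use the generic schema".

```java
import java.io.StringReader;
import java.util.Optional;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper that follows steps 1-4 literally.
final class NamespaceSteps {

    /** Applies steps 1-4 to the root element's qualified name and attributes. */
    static Optional<String> resolve(String rootQName, Attributes atts) {
        int colon = rootQName.indexOf(':');
        if (colon > 0) {                                         // step 1: root has a prefix
            String prefix = rootQName.substring(0, colon);
            String ns = atts.getValue("xmlns:" + prefix);        // step 2
            if (ns != null) {
                return Optional.of(ns);
            }
            // step 2a: declaration missing, fall through to step 3
        }
        return Optional.ofNullable(atts.getValue("xmlns"));      // step 3; empty means step 4
    }

    /** Carries the result out of the parser once the root element is seen. */
    private static final class RootSeen extends SAXException {
        final Optional<String> namespace;

        RootSeen(Optional<String> namespace) {
            super("root element seen");
            this.namespace = namespace;
        }
    }

    /** Parses only as far as the root element and applies the steps to it. */
    static Optional<String> namespaceOf(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(false); // keep xmlns declarations visible as attributes
        try {
            factory.newSAXParser().parse(
                    new InputSource(new StringReader(xml)),
                    new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String localName,
                                                 String qName, Attributes atts)
                                throws SAXException {
                            throw new RootSeen(resolve(qName, atts));
                        }
                    });
        } catch (RootSeen seen) {
            return seen.namespace;
        }
        return Optional.empty();
    }
}
```

Note that the fallback in step 2a is more lenient than the XML Namespaces rules, under which an undeclared prefix is an error; a namespace-aware parser would reject such a document instead of falling back to the default xmlns.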
Updated by Saurabh Garg about 19 years ago
Changing the target and severity of the bug, as it allows people to
insert invalid EML documents, which in turn causes problems during
replication.
Updated by Saurabh Garg almost 19 years ago
Fixed. Seems to be working fine for now. All the documents replicated to knb1
seem to have doctype set correctly.