How can I use lxml to parse XML that uses prefixes but has no namespace declarations?

I have a bunch of XML files using prefixes but no corresponding namespace declarations.

Like:

...

Or:

...

I know where those prefixes come from. I tried the following, but without success:

    from lxml import etree as ElementTree

    ElementTree.register_namespace("i18n", "http://namespaces.zope.org")
    ElementTree.register_namespace("tal", "http://xml.zope.org/namespaces/tal")

    with open(path) as fp:
        tree = ElementTree.parse(fp)

But lxml still chokes:

    lxml.etree.XMLSyntaxError: Namespace prefix i18n for domain on div is not defined, line 4, column 20

I know I can use ElementTree.XMLParser(recover=True), but I want to keep the prefixes, and that approach does not.

Any ideas?

It is not valid XML: it uses undeclared prefixes, so a namespace-aware XML parser will not accept it.

Your best option (other than fixing the XML files themselves) is to programmatically modify the XML source to add the namespace declarations to the root element, using only the string handling of your language. Before handing the XML to the parser, add xmlns:tal="http://xml.zope.org/namespaces/tal", and so on, to the root element. The XML parser should then handle it without complaint, and there is no need to register the namespaces.
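
The idea can be sketched in Python roughly as follows. This is only an illustration under some assumptions: the helper declare_namespaces, the ROOT_START pattern, and the sample document are made up for this example, the i18n URI is simply copied from the register_namespace call in the question, and the regular expression assumes that no comment or processing instruction before the root element contains something that looks like a start tag.

```python
import re

from lxml import etree

# Prefix -> URI mapping. The tal URI is the one named in the answer;
# the i18n URI is taken as-is from the question's code (an assumption).
NAMESPACES = {
    "tal": "http://xml.zope.org/namespaces/tal",
    "i18n": "http://namespaces.zope.org",
}

# First "<" followed by a name character. This skips "<?xml ...?>",
# "<!-- ... -->" and "<!DOCTYPE ...>", which start with "<?" or "<!".
ROOT_START = re.compile(r"<[A-Za-z_][\w.:-]*")


def declare_namespaces(source, namespaces):
    """Return source with xmlns:prefix declarations added to the root tag."""
    match = ROOT_START.search(source)
    if match is None:
        raise ValueError("no root element found")
    # Simplification: a prefix already declared anywhere in the text is skipped.
    declarations = "".join(
        f' xmlns:{prefix}="{uri}"'
        for prefix, uri in namespaces.items()
        if f"xmlns:{prefix}=" not in source
    )
    # Insert the declarations directly after the root element's name.
    return source[: match.end()] + declarations + source[match.end():]


# Hypothetical sample input with undeclared prefixes.
broken = '<div i18n:domain="plone"><span tal:content="title"/></div>'

fixed = declare_namespaces(broken, NAMESPACES)
root = etree.fromstring(fixed)  # parses without XMLSyntaxError
print(etree.tostring(root).decode())
```

Because the declarations are part of the parsed document, the prefixes are kept when the tree is serialized again. Note that lxml refuses to parse a Python str that carries an XML declaration with an encoding, so for such files the patched text would have to be encoded back to bytes before parsing.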
