xml - Insert/ignore a missing namespace in lxml


I have to parse malformed XML:

    >>> from lxml import etree
    >>> root = etree.fromstring(xml_string)
    XMLSyntaxError: Namespace prefix xlink for href on email is not defined, line 3, column 2446

The xlink prefix is indeed missing among the namespace declarations.

Is there an easy, recommended way to tell lxml to ignore missing namespaces, or to use a supplied one?

Right now, I manually modify xml_string to inject the namespace before parsing, which works but is ugly and not general enough.

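There is no way to tell lxml to insert a missing namespace declaration. One might imagine that

    etree.register_namespace("xlink", "http://www.w3.org/1999/xlink")

could help, but it has no effect here: register_namespace only controls which prefix is used when serializing, not how input is parsed.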

Even if it is "ugly", I think you'll have to continue injecting the namespace before parsing the XML document (perhaps you can automate it if you haven't already).
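One way the injection could be automated is sketched below. The helper name declare_missing_namespaces is hypothetical (it is not part of lxml), and the sketch rests on some simplifying assumptions: the input is a str, the first tag that starts with a letter or underscore is the root element (so no earlier comment contains something that looks like a tag), no attribute value in the root start tag contains ">", and existing declarations are written without whitespace before the "=".

```python
import re


def declare_missing_namespaces(xml_text, namespaces):
    """Return xml_text with every prefix from `namespaces` that the root
    start tag does not already declare added as an xmlns:prefix attribute."""
    # First start tag that is not an XML declaration (<?...), a comment
    # or a DOCTYPE (<!...); assumed to be the root element.
    match = re.search(r"<[A-Za-z_][^\s/>]*", xml_text)
    if match is None:
        raise ValueError("no root element found")
    # Text of the root start tag, assuming no '>' inside attribute values.
    root_tag = xml_text[match.start():xml_text.index(">", match.end())]
    additions = "".join(
        ' xmlns:{}="{}"'.format(prefix, uri)
        for prefix, uri in namespaces.items()
        if "xmlns:{}=".format(prefix) not in root_tag
    )
    # Insert the new declarations right after the root element's name.
    return xml_text[:match.end()] + additions + xml_text[match.end():]


xml_string = '<root><email xlink:href="mailto:someone@example.com"/></root>'
patched = declare_missing_namespaces(
    xml_string, {"xlink": "http://www.w3.org/1999/xlink"}
)
print(patched)
```

The patched string can then be passed to etree.fromstring(patched) as before. Only the root start tag is checked, so a prefix that is already declared there is left alone and no duplicate attribute is produced.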

It is possible to make lxml accept malformed input by using a parser object initialized with recover=True. Example:

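    from lxml import etree

    # The x prefix is used but never declared.
    xml_input = """\
    <root>
      <x:a>abc</x:a>
    </root>"""

    # recover=True makes the parser try to carry on past errors.
    parser = etree.XMLParser(recover=True)
    tree = etree.fromstring(xml_input, parser)
    print(etree.tostring(tree).decode())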

Output:

    <root>
      <a>abc</a>
    </root>

Here the prefix is removed, and I don't think that is what you want. Namespaces are there for a reason; they can't just be tossed away.

