Define Default Namespace (unprefixed) In Lxml
Solution 1:
Use ElementMaker and give it an nsmap that maps None to your default namespace.
#!/usr/bin/env python
# dogeml.py
from lxml.builder import ElementMaker
from lxml import etree

E = ElementMaker(
    nsmap={
        None: "http://wow/"  # <--- This is the special sauce
    }
)

doge = E.doge(
    E.such('markup'),
    E.many('very namespaced', syntax="tricks")
)

options = {
    'pretty_print': True,
    'xml_declaration': True,
    'encoding': 'UTF-8',
}

serialized_bytes = etree.tostring(doge, **options)
print(serialized_bytes.decode(options['encoding']))
As you can see in the output from this script, the default namespace is defined, but the tags do not have a prefix.
<?xml version='1.0' encoding='UTF-8'?>
<doge xmlns="http://wow/">
  <such>markup</such>
  <many syntax="tricks">very namespaced</many>
</doge>
I have tested this code with Python 2.7.6, 3.3.5, and 3.4.0, combined with lxml 3.3.1.
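One caveat that seems worth checking: with only nsmap set, the elements appear to stay unqualified in memory (doge.tag is plain 'doge') and only end up in the namespace once the serialized output is parsed again. The following is a minimal sketch, assuming you also want the in-memory tags qualified. It passes the same URI as ElementMaker's namespace argument; the NS constant and variable names are illustrative.

```python
from lxml import etree
from lxml.builder import ElementMaker

NS = "http://wow/"

# namespace= puts every generated tag into NS;
# nsmap={None: NS} makes NS the default (unprefixed) namespace on output.
E = ElementMaker(namespace=NS, nsmap={None: NS})

doge = E.doge(
    E.such('markup'),
    E.many('very namespaced', syntax="tricks"),
)

# In-memory tags use Clark notation: {namespace-uri}local-name
print(doge.tag)
print([child.tag for child in doge])

# The serialized form still carries no prefixes.
serialized = etree.tostring(doge, pretty_print=True)
print(serialized.decode('ascii'))
```

With this variant, lookups such as doge.find('{http://wow/}such') should work on the freshly built tree, without a serialize-and-reparse round trip.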
Solution 2:
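This XSL transformation strips the prefix from every XHTML element by recreating it in the stylesheet's default namespace. Elements in other namespaces keep their prefix, and their namespace declaration ends up on the element that uses it rather than on the root node (see ml:foo in the output):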
import lxml.etree as ET
# Bytes literal: on Python 3, lxml rejects a str that carries an encoding declaration.
content = b'''\
<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE html>
<h:html xmlns:h="http://www.w3.org/1999/xhtml" xmlns:ml="http://foo">
<h:head>
<h:title>MathJax Test Page</h:title>
<h:script type="text/javascript"><![CDATA[
function test() {
alert(document.getElementsByTagName("p").length);
};
]]></h:script>
</h:head>
<h:body onload="test();">
<h:p>test</h:p>
<ml:foo></ml:foo>
</h:body>
</h:html>
'''
dom = ET.fromstring(content)
xslt = '''\
<xsl:stylesheet version="1.0"
    xmlns="http://www.w3.org/1999/xhtml"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="no"/>
  <!-- identity transform for everything else -->
  <xsl:template match="/|comment()|processing-instruction()|*|@*">
    <xsl:copy>
      <!-- select attributes too, so non-XHTML elements keep theirs -->
      <xsl:apply-templates select="@*|node()" />
    </xsl:copy>
  </xsl:template>
  <!-- remove NS from XHTML elements -->
  <xsl:template match="*[namespace-uri() = 'http://www.w3.org/1999/xhtml']">
    <xsl:element name="{local-name()}">
      <xsl:apply-templates select="@*|node()" />
    </xsl:element>
  </xsl:template>
  <!-- remove NS from XHTML attributes -->
  <xsl:template match="@*[namespace-uri() = 'http://www.w3.org/1999/xhtml']">
    <xsl:attribute name="{local-name()}">
      <xsl:value-of select="." />
    </xsl:attribute>
  </xsl:template>
</xsl:stylesheet>
'''
xslt_doc = ET.fromstring(xslt)
transform = ET.XSLT(xslt_doc)
dom = transform(dom)
# tostring() returns bytes; decode so Python 3 prints text rather than b'...'
print(ET.tostring(dom, pretty_print=True,
                  encoding='utf-8').decode('utf-8'))
yields
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>MathJax Test Page</title>
<script type="text/javascript">
function test() {
alert(document.getElementsByTagName("p").length);
};
</script>
</head>
<body onload="test();">
<p>test</p>
<ml:foo xmlns:ml="http://foo"/>
</body>
</html>
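To check that a transformation like this changes only the prefixes and not the namespaces themselves, you can compare each element's prefix and tag before and after. The sketch below is a condensed variant of the stylesheet, applied to a small made-up input rather than the page above. The names source, stylesheet, before and after are illustrative.

```python
import lxml.etree as ET

# Small illustrative input (not the page from the answer above).
source = ET.fromstring(
    b'<h:html xmlns:h="http://www.w3.org/1999/xhtml">'
    b'<h:body><h:p class="x">test</h:p></h:body>'
    b'</h:html>'
)

stylesheet = ET.fromstring('''\
<xsl:stylesheet version="1.0"
    xmlns="http://www.w3.org/1999/xhtml"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- identity transform for everything that is not an XHTML element -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- recreate XHTML elements without a prefix -->
  <xsl:template match="*[namespace-uri() = 'http://www.w3.org/1999/xhtml']">
    <xsl:element name="{local-name()}">
      <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
''')

result = ET.XSLT(stylesheet)(source).getroot()

# (prefix, Clark-notation tag) for every element, in document order
before = [(el.prefix, el.tag) for el in source.iter()]
after = [(el.prefix, el.tag) for el in result.iter()]

for (old_prefix, tag), (new_prefix, _) in zip(before, after):
    print(tag, old_prefix, '->', new_prefix)
```

The tags, which include the namespace URI, should be identical in both lists; only the prefix column should differ.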
Solution 3:
To expand on @neirbowj's answer, here is the same idea with ET.Element and ET.SubElement, rendering a document that mixes namespaces: the root is explicitly namespaced, while a subelement (channel) is in the default namespace:
import lxml.etree as ET

# I set up but don't use the default namespace:
root = ET.Element('{http://www.w3.org/1999/02/22-rdf-syntax-ns#}RDF', nsmap={None: 'http://purl.org/rss/1.0/'})
# I use the default namespace by including its URL in curly braces:
e = ET.SubElement(root, '{http://purl.org/rss/1.0/}channel')
print(ET.tostring(root, xml_declaration=True, encoding='utf8').decode())
This will print out the following:
<?xml version='1.0' encoding='utf8'?>
<rdf:RDF xmlns="http://purl.org/rss/1.0/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"><channel/></rdf:RDF>
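lxml picks the rdf prefix for the RDF namespace on its own. As far as I can tell, it keeps a small built-in table of conventional prefixes for well-known namespace URIs, and falls back to generated prefixes such as ns0 for URIs it does not know. If I want to choose the prefix myself, I can add it to the nsmap of the root element: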
nsmap = {None: 'http://purl.org/rss/1.0/',
         'doge': 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'}
root = ET.Element('{http://www.w3.org/1999/02/22-rdf-syntax-ns#}RDF', nsmap=nsmap)
e = ET.SubElement(root, '{http://purl.org/rss/1.0/}channel')
print(ET.tostring(root, xml_declaration=True, encoding='utf8').decode())
...and I get this:
<?xml version='1.0' encoding='utf8'?>
<doge:RDF xmlns:doge="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/"><channel/></doge:RDF>
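If the curly-brace tag strings get hard to read, lxml's QName class can build them from a URI and a local name. This is only a sketch of the same document under that assumption. The RDF and RSS constants are names introduced here for illustration, and the prefix is set explicitly to rdf in the nsmap.

```python
import lxml.etree as ET

RDF = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'
RSS = 'http://purl.org/rss/1.0/'

# Declare both namespaces once, on the root:
# RSS as the default (None) and RDF under the 'rdf' prefix.
root = ET.Element(ET.QName(RDF, 'RDF'), nsmap={None: RSS, 'rdf': RDF})
channel = ET.SubElement(root, ET.QName(RSS, 'channel'))

# .prefix shows which prefix an element will be serialized with
# (None means it uses the default namespace).
print(root.prefix, root.tag)
print(channel.prefix, channel.tag)
print(ET.tostring(root).decode('ascii'))
```

Inspecting .prefix and .nsmap this way is also a quick check that a subelement really landed in the default namespace before you serialize anything.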