[lxml-dev] Bug in XPath evaluation - not a bug :)

jholg at gmx.de jholg at gmx.de
Tue Apr 24 11:18:53 CDT 2007


Hi,

> The only solution I see right now is to scan the XML data prior to the
> XPath query in order to map each prefix to its namespace-uri. 
> I do understand now that this is such an exotic use case that it
> wouldn't make much sense to have lxml do these mappings automatically if
> the second argument of .xpath() is omitted.
> The reason I gave this rather lengthy example was to find out if anyone
> reading this has an idea of an alternative solution for my problem
> (applying metadata to specific parts of an XML document without making
> the XPath expressions to address these parts too complex).

Maybe you can take advantage of nsmap? (Don't be confused by the result output below; I'm running this with lxml.objectify.)

>>> from lxml import etree
>>> root = etree.fromstring("""
... <root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
...       xmlns:py="http://codespeak.net/lxml/objectify/pytype"
...       xmlns:other="otherURI"
...       xmlns="myURI"
...       version="v2.0">
...   <a attr1="foo" attr2="bar">1</a>
...   <a py:pytype="float">1.2</a>
...   <a py:pytype="str">1.2</a>
...   <b py:pytype="int">1</b>
...   <b xsi:type="integer">2</b>
...   <b xsi:type="string">2</b>
...   <c>what</c>
...   <c>is</c>
...   <c>this</c>
...   <c>good</c>
...   <c>for?</c>
...   <d/>
...   <e>2006/08/09 13:19:01.000000+02:00</e>
...   <other:e>from another namespace</other:e>
...   <sub1>
...     <sub2>
...        <sub3>
...          <other:x>387.38</other:x>
...        </sub3>
...     </sub2>
...   </sub1>
...   <sub1>
...     <sub2>
...        <sub3>
...          <other:x>387.38</other:x>
...        </sub3>
...     </sub2>
...   </sub1>
...   <sub1>
...     <sub2>
...        <sub3>
...          <other:x>387.38</other:x>
...        </sub3>
...     </sub2>
...   </sub1>
... </root>
... """)
>>> prefixDict = dict(root.nsmap)
>>> del prefixDict[None]
>>> prefixDict[''] = root.nsmap[None]
>>> print etree.XPath('//other:x', prefixDict)(root)
[Decimal("387.38"), Decimal("387.38"), Decimal("387.38")]

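What's not so nice is that nsmap uses None as the key for the default (empty) prefix, whereas XPath seems to expect an empty string in the prefix-to-URI dict.
Also, I'm not sure whether you can simply rely on the root element's nsmap, as I did here: namespaces declared only on descendant elements would not show up in it.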
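
To make the two caveats above concrete, here is a minimal sketch of how the prefix map could be prepared in plain Python. The helper names (merge_nsmaps, xpath_namespaces) and the prefix "my" are made up for illustration and are not part of lxml. The sketch assumes that the None key has to go before the dict is handed to XPath, and that elements in the default namespace are addressed through a prefix of your own choosing, since XPath 1.0 treats unprefixed names as being in no namespace. As far as I know, newer lxml releases reject an empty prefix in the XPath namespace dict, so binding a real prefix is probably the safer route.

```python
def merge_nsmaps(nsmaps):
    """Merge several prefix -> URI mappings into one dict.

    The first binding seen for a prefix wins; later, conflicting
    bindings of the same prefix are ignored.
    """
    merged = {}
    for nsmap in nsmaps:
        for prefix, uri in nsmap.items():
            merged.setdefault(prefix, uri)
    return merged


def xpath_namespaces(nsmap, default_prefix=None):
    """Return a copy of nsmap that can serve as an XPath prefix map.

    Empty prefixes (None or '') are dropped. If default_prefix is given,
    the default namespace is bound to that prefix instead.
    """
    result = {prefix: uri for prefix, uri in nsmap.items() if prefix}
    default_uri = nsmap.get(None)
    if default_prefix and default_uri is not None:
        if result.get(default_prefix, default_uri) != default_uri:
            raise ValueError("prefix %r is already bound" % default_prefix)
        result[default_prefix] = default_uri
    return result


# Stand-in for root.nsmap of the document above (plain dict, no lxml needed).
root_nsmap = {
    None: "myURI",
    "other": "otherURI",
    "xsi": "http://www.w3.org/2001/XMLSchema-instance",
    "py": "http://codespeak.net/lxml/objectify/pytype",
}
print(xpath_namespaces(root_nsmap, default_prefix="my"))

# Hypothetical usage with lxml, collecting declarations from all elements:
#   ns = xpath_namespaces(
#       merge_nsmaps(el.nsmap for el in root.iter('*')),
#       default_prefix='my')
#   root.xpath('//my:sub3/other:x', namespaces=ns)
```

Note that merging is only a heuristic: if the same prefix is bound to different URIs in different parts of the document, a single prefix map cannot represent both, and the first binding silently wins here.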

Holger