[lxml-dev] PATCH: XMLSyntaxError

Stefan Behnel behnel_ml at gkec.informatik.tu-darmstadt.de
Tue Nov 8 01:19:02 CST 2005

Martijn Faassen wrote:
> Stefan Behnel wrote:
>>>> if we change this, why not take the second step and rename "Error" into
>>>> "LxmlException"? Would reduce the chance of import conflicts, at least
>>>> in new code. We may still leave Error in for compatibility, but
>>>> deprecate it.
> I don't want to start namespacing our stuff with prefixes while Python
> has a perfectly good namespacing mechanism.
> So, -1 to prefixing stuff with 'Lxml'.

1) If I read

if somethings_wrong:
  raise Error("...")

my first reaction is "why do they raise a general stdlib exception?" So I look
up Error in the Python docs and, since I do not find it there, I search the
code and find that it is defined somewhere as a module-internal exception.
That's the wrong way round (and it's not the fault of the reader).

2) If I read

except Error:
  print("bla")

my first reaction is "are they catching any error in the world?" Remember that
the respective imports are usually pretty far from this code.

3) Sure, you could obviously go for

except mymod.Error:
  print("bla")

but a) you can't do that inside the module itself (back to 1) and b) it is
most likely not your intention to *actively prevent* users of your module from
doing "from mymod import a,b,c" by giving c the name "Error" ["But I MUST
educate them, they are stupid little idiots that don't understand my code
anyway, so..." - no.]

"from mymod import a,b,c" is so common in Python that module authors should
not actively try to prevent it, especially not by using obfuscated naming.
Remember: The users of your code are adults, too.
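
To make the point concrete, here is a minimal sketch of what I have in mind. It
is only an illustration: "LxmlException" is the name proposed above, and the
class layout and the check() helper are assumptions made up for this example,
not what lxml currently ships.

```python
class LxmlException(Exception):
    """Hypothetical library-wide base class with a self-explaining name."""


class XMLSyntaxError(LxmlException):
    """Hypothetical subclass raised for malformed input."""


# The old generic name stays available as an alias, so existing
# "except Error:" code keeps working while new code uses the clearer name.
Error = LxmlException


def check(document_is_broken):
    # Illustrative helper: the raise site now says where the exception
    # comes from, without the reader having to hunt for its definition.
    if document_is_broken:
        raise XMLSyntaxError("document is not well-formed")
    return "ok"


try:
    check(True)
except LxmlException as exc:
    # Readable even when "from mymod import LxmlException" is far away.
    caught = str(exc)
```

With names like these, "from mymod import LxmlException, XMLSyntaxError" does
not collide with anything else a user is likely to have imported, and old code
that catches Error still catches the same exceptions through the alias.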

Have you noticed that we currently have exactly the same problem with the
"parse" function? Simply because someone chose too general a name for it. It
does not parse anything. It does not even parse any XML. It only /parses/
/files/ that contain /XML/. Three semantic parts. "parses" is in the name,
"XML" is somewhere in the namespace, but where is "files"? So the name was
badly chosen; it should have been something like "parseFile". But too late,
it's part of the API, so we can't give it a better semantic name anymore.

What do we learn from that? Even if we have namespaces, it is sometimes
appropriate to give naming one more thought.

So, +1 for giving semantic names to 'things' right from the beginning (even to
non-public ones).

