
Python < 2.5 and unicode/str comparisons

Comparing str objects to unicode objects should never have been possible, but it does “work”, and you’ve probably seen this behavior in Python 2.5 – 2.7:

Python 2.6.5 (r265:79063, Apr 16 2010, 13:57:41)
>>> u"\xff" == "\xff"
__main__:1: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
False

To do the comparison, Python implicitly decodes the str object using the default encoding (normally ASCII), much as if it had called unicode() on it behind the scenes. If the decode fails, it emits a UnicodeWarning and returns False.
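Here is a rough sketch of that logic written out by hand. It is not the interpreter's actual implementation; the function name `compare_like_py25` and its `encoding` parameter are made up for illustration, and the warning text is my own rather than the exact message Python prints.

```python
import warnings


def compare_like_py25(text, raw, encoding="ascii"):
    """Roughly mimic how Python 2.5-2.7 compares unicode to str.

    `text` is a unicode string, `raw` is a byte string. This helper is
    hypothetical and only illustrates the decode-then-compare idea.
    """
    try:
        decoded = raw.decode(encoding)
    except UnicodeDecodeError:
        # Python 2.5+ warns at this point instead of raising.
        warnings.warn("could not decode byte string; treating as unequal",
                      UnicodeWarning)
        return False
    return text == decoded


# Build the byte string in a way that works on Python 2 and 3 alike.
raw_ff = u"\xff".encode("latin-1")
result = compare_like_py25(u"\xff", raw_ff)  # warns, then gives False
```

Passing `encoding="latin-1"` instead would make the same call return True, which shows that the outcome depends entirely on the encoding used for the implicit decode.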

If you’re still maintaining software that must run on Python 2.4 (or worse), you might run into this old behavior:

Python 2.4.6 (#1, Aug 2 2010, 18:27:11)
>>> u"\xff" == "\xff"
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 0: ordinal not in range(128)

Also, if you’re writing tests that involve this, keep in mind that UnicodeWarning was only added in Python 2.5, so referring to it by name on Python 2.4 raises a NameError.
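One way to keep such tests portable is to avoid the implicit comparison altogether and to look up UnicodeWarning defensively. The following is a minimal sketch under those assumptions; `UNICODE_WARNING` and `unicode_equals_bytes` are names I invented for this example, not anything from the standard library.

```python
# UnicodeWarning exists only on Python 2.5 and later.
try:
    UNICODE_WARNING = UnicodeWarning
except NameError:
    UNICODE_WARNING = None  # Python 2.4 or older


def unicode_equals_bytes(text, raw, encoding="ascii"):
    """Return True only if `raw` decodes cleanly and equals `text`.

    Hypothetical helper: decoding explicitly avoids both the
    UnicodeDecodeError on 2.4 and the UnicodeWarning on 2.5+.
    """
    try:
        return text == raw.decode(encoding)
    except UnicodeDecodeError:
        return False
```

With a helper like this, a test can assert on the return value alone and gets the same answer on every version, and any code that filters warnings can skip that step when `UNICODE_WARNING` is None.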

(After I wrote this, I found that it was documented in What’s New in Python 2.5.)