How to make Python 2.x warn when coercing strings to Unicode?

A very common source of encoding errors is that Python 2 silently coerces a byte string to unicode when you add the two together. This can cause mixed-encoding problems that are difficult to debug.

For example:

import urllib
import webbrowser
name = raw_input("What's your name?\nName: ")
greeting = "Hello, %s" % name
if name == "John":
    greeting += u'(Feliz cumplea\xf1os!)'
webbrowser.open('http://lmgtf\x79.com?q=' + urllib.quote_plus(greeting))

If you enter “John”, you get a mysterious error:

/usr/lib/python2.7/urllib.py:1268: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  return ''.join(map(quoter, s))
Traceback (most recent call last):
  File "feliz.py", line 7, in <module>
    webbrowser.open('http://lmgtf\x79.com?q=' + urllib.quote_plus(greeting))
  File "/usr/lib/python2.7/urllib.py", line 1273, in quote_plus
    s = quote(s, safe + '')
  File "/usr/lib/python2.7/urllib.py", line 1268, in quote
    return ''.join(map(quoter, s))
KeyError: u'\xf1'

The error is particularly hard to track down because it surfaces far from the place where the coercion actually happened.
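To make the failure concrete: the coercion happens on the greeting += line, but nothing complains until urllib looks up the resulting unicode character. Below is a minimal sketch of the explicit alternative, where bytes are decoded once on the way in and encoded once on the way out. It assumes the input is UTF-8, which is only an assumption, since raw_input returns whatever bytes the terminal sends. The helper names to_text and build_query are hypothetical, and the import fallback is there only so the snippet also runs on Python 3.

```python
try:
    from urllib import quote_plus          # Python 2
except ImportError:
    from urllib.parse import quote_plus    # Python 3


def to_text(value, encoding="utf-8"):
    """Return value as a unicode string, decoding byte strings explicitly.

    The UTF-8 default is an assumption about the input, not a given.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def build_query(raw_name):
    """Build the URL-quoted greeting without any implicit str/unicode mixing."""
    # Decode once, at the input boundary, so everything below is unicode.
    name = to_text(raw_name)
    greeting = u"Hello, %s" % name
    if name == u"John":
        greeting += u"(Feliz cumplea\xf1os!)"
    # Encode once, at the output boundary, so quote_plus receives bytes.
    return quote_plus(greeting.encode("utf-8"))
```

With this shape, build_query(b"John") yields a quoted string in which the accented character appears as %C3%B1 instead of raising KeyError. It does not answer the question of how to detect coercion, it only shows what the code looks like once the coercion has been found and removed.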

How can Python be configured to issue a warning or raise an exception immediately when a string is coerced to unicode?

After asking this question, I did some research and found the perfect answer. Armin Ronacher created a wonderful little tool called unicode-nazi. Just install it and run your program like this:

python -Werror -municodenazi myprog.py 

And you get a traceback pointing at the place where the coercion occurred:

Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "SITE-PACKAGES/unicodenazi.py", line 128, in <module>
    main()
  File "SITE-PACKAGES/unicodenazi.py", line 119, in main
    execfile(sys.argv[0], main_mod.__dict__)
  File "myprog.py", line 4, in <module>
    print foo()
  File "myprog.py", line 2, in foo
    return 'bar' + u'baz'
  File "SITE-PACKAGES/unicodenazi.py", line 34, in warning_decode
    stacklevel=2)
UnicodeWarning: Implicit conversion of str to unicode

If you are dealing with Python libraries that trigger implicit coercion themselves, and you cannot catch the exceptions or otherwise work around them, you can leave out -Werror:

python -municodenazi myprog.py

Then you at least see a warning printed on stderr when it happens:

/SITE-PACKAGES/unicodenazi.py:119: UnicodeWarning: Implicit conversion of str to unicode
  execfile(sys.argv[0], main_mod.__dict__)
barbaz
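A possible middle ground between the two invocations, sketched with the standard warnings module only: escalate UnicodeWarning to an exception around your own code and leave it as a plain warning everywhere else. The context manager name unicode_warnings_as_errors is hypothetical and is not part of unicode-nazi. This sketch only changes how an already-issued UnicodeWarning is handled, so the tool that emits the warning still has to be active.

```python
import warnings
from contextlib import contextmanager


@contextmanager
def unicode_warnings_as_errors():
    """Inside the with-block, raise UnicodeWarning instead of printing it."""
    # catch_warnings saves the current filters and restores them on exit,
    # so code outside the block keeps the default "print to stderr" behaviour.
    with warnings.catch_warnings():
        warnings.simplefilter("error", UnicodeWarning)
        yield


def demo():
    """Show that a UnicodeWarning issued inside the block becomes an exception."""
    with unicode_warnings_as_errors():
        try:
            # Stand-in for the warning that a coercion detector would issue.
            warnings.warn("Implicit conversion of str to unicode", UnicodeWarning)
        except UnicodeWarning as exc:
            return "raised: %s" % exc
    return "not raised"
```

On the command line, the interpreter option -W error::UnicodeWarning has a similar effect for the whole process while leaving other warning categories untouched.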
