Python 2.7: test if characters in a string are all Chinese characters
The following code tests whether the characters in a string are all Chinese characters. It works in Python 3 but not in Python 2.7. How do I do this in Python 2.7?

for ch in name:
    if ord(ch) < 0x4e00 or ord(ch) > 0x9fff:
        return False
# byte str (what you probably get from GAE)
In [1]: s = """Chinese (汉语/漢語 Hànyǔ or 中文 Zhōngwén) is a group of related language varieties, several of which are not mutually intelligible,"""

# unicode str
In [2]: us = u"""Chinese (汉语/漢語 Hànyǔ or 中文 Zhōngwén) is a group of related language varieties, several of which are not mutually intelligible,"""

# convert to unicode using str.decode('utf-8')
In [3]: print ''.join(c for c in s.decode('utf-8') if u'\u4e00' <= c <= u'\u9fff')
汉语漢語中文

In [4]: print ''.join(c for c in us if u'\u4e00' <= c <= u'\u9fff')
汉语漢語中文
To make sure all the characters are Chinese, you should do something like this:

all(u'\u4e00' <= c <= u'\u9fff' for c in name.decode('utf-8'))
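The one-line check can be wrapped in a small helper that accepts either a byte string or a unicode string. This is only a sketch: the function name `is_all_chinese` and the `encoding` parameter are made up for illustration, and UTF-8 input is an assumption. It is written so that it should behave the same on Python 2.7 and Python 3.

```python
def is_all_chinese(name, encoding='utf-8'):
    """Return True if every character of name lies in U+4E00..U+9FFF.

    Hypothetical helper; assumes byte strings are encoded as `encoding`.
    """
    # On Python 2.7, bytes is an alias for str, so byte strings are decoded here.
    if isinstance(name, bytes):
        name = name.decode(encoding)
    # Compare unicode characters, not individual bytes.
    return all(u'\u4e00' <= c <= u'\u9fff' for c in name)
```

Note that `all()` returns True for an empty string, so add an explicit emptiness check if that matters. The range U+4E00 to U+9FFF covers only the basic CJK Unified Ideographs block, so characters in the extension blocks would be rejected.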
In a Python application, use unicode internally: decode early and encode late, creating a unicode sandwich.
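A minimal sketch of the unicode sandwich, assuming the input arrives as UTF-8 bytes and the output should be bytes in the same encoding. The name `keep_chinese` is hypothetical; the filter itself is the same range test used above.

```python
def keep_chinese(raw_bytes, encoding='utf-8'):
    """Hypothetical example: bytes in, unicode inside, bytes out."""
    # Decode early: turn the incoming bytes into unicode at the boundary.
    text = raw_bytes.decode(encoding)
    # Work on unicode internally: keep only characters in U+4E00..U+9FFF.
    kept = u''.join(c for c in text if u'\u4e00' <= c <= u'\u9fff')
    # Encode late: convert back to bytes only when the data leaves the program.
    return kept.encode(encoding)
```

Everything between the `decode` and the `encode` call sees only unicode, so character comparisons behave the same way regardless of how the input was encoded.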