Message 140055 - Python tracker

Author	vstinner
Recipients	Devin Jeanpierre, eric.araujo, petri.lehtinen, terry.reedy, vstinner
Date	2011-07-09.09:03:45
SpamBayes Score	3.0752268e-05
Marked as misclassified	No
Message-id	<[email protected]>
In-reply-to

Content
The compiler has a PyCF_SOURCE_IS_UTF8 flag: see the compile() builtin. The parser has a flag to ignore the coding cookie: PyPARSE_IGNORE_COOKIE.
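
As a rough illustration of what ignoring the coding cookie means, the sketch below passes the same source to compile() once as text and once as UTF-8 bytes. It assumes CPython 3 behaviour, where text input is encoded to UTF-8 internally and the cookie is ignored, while bytes input is decoded according to the cookie. The sample source and the names ns_text and ns_bytes are made up for the example.

```python
# Source text with a deliberately misleading coding cookie.
source = "# -*- coding: latin-1 -*-\nx = '\u00e9'\n"

# Text input: compile() handles the UTF-8 encoding itself and
# does not apply the latin-1 cookie to the string contents.
ns_text = {}
exec(compile(source, "<text>", "exec"), ns_text)

# Bytes input: the cookie is honoured, so the two UTF-8 bytes of
# U+00E9 are decoded as two separate latin-1 characters.
ns_bytes = {}
exec(compile(source.encode("utf-8"), "<bytes>", "exec"), ns_bytes)

print(ascii(ns_text["x"]))
print(ascii(ns_bytes["x"]))
```

The text path keeps the single character U+00E9, while the bytes path produces the two-character mojibake, which is the kind of mismatch the flags are meant to avoid.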

Patching tokenize to support Unicode is simple: use the PyCF_SOURCE_IS_UTF8 and/or PyPARSE_IGNORE_COOKIE flags and encode the strings to UTF-8.
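
The "encode to UTF-8" half of that idea can be sketched at the Python level with the existing bytes-based tokenize.tokenize() API. This is only an approximation, not the proposed patch: tokenize_text is a hypothetical helper, and unlike a real ignore-cookie flag, this version would still honour a coding cookie found in the text.

```python
import io
import tokenize


def tokenize_text(source):
    """Tokenize a str by encoding it to UTF-8 first (hypothetical helper)."""
    data = source.encode("utf-8")
    # tokenize.tokenize() expects a readline callable that returns bytes.
    return list(tokenize.tokenize(io.BytesIO(data).readline))


tokens = tokenize_text("x = '\u00e9'\n")
for tok in tokens:
    print(tokenize.tok_name[tok.type], ascii(tok.string))
```

The first token reports the detected encoding, and the string token comes back as decoded text rather than raw bytes.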

Rewriting the parser to work directly on Unicode is much more complex, and I don't think that we need that.

History
Date	User	Action	Args
2011-07-09 09:03:46	vstinner	set	recipients: + vstinner, terry.reedy, Devin Jeanpierre, eric.araujo, petri.lehtinen
2011-07-09 09:03:46	vstinner	set	messageid: <[email protected]>
2011-07-09 09:03:45	vstinner	link	issue12486 messages
2011-07-09 09:03:45	vstinner	create