Saturday, 9 September 2017

unicode - How to disable implicit decoding ("upgrading") in Perl?

Quoting the Perl Unicode FAQ, "What if I don't decode?":




Whenever your encoded, binary string is used together with a text
string, Perl will assume that your binary string was encoded with
ISO-8859-1, also known as latin-1. If it wasn't latin-1, then your
data is unpleasantly converted. For example, if it was UTF-8, the
individual bytes of multibyte characters are seen as separate
characters, and then again converted to UTF-8. Such double encoding
can be compared to double HTML encoding (&amp;gt;), or double URI
encoding (%253E).



This silent implicit decoding is known as "upgrading".
That may sound positive, but it's best to avoid it.
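
The double encoding described in the quote can be reproduced with a short script. This is only a minimal sketch: the variable names and the hex_bytes helper are made up for illustration, and the sample strings ("é" as UTF-8 bytes, plus a U+263A text character) are assumptions chosen to make the effect visible.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_bytes {
    my ($s) = @_;
    return join ' ', map { sprintf '%02X', ord } split //, $s;
}

# An encoded (binary) string: the two UTF-8 bytes for "é" (U+00E9).
my $bytes = "\xC3\xA9";

# A text string holding a character above 0xFF.
my $text = "\x{263A}";

# Mixing the two: each byte of $bytes is treated as a Latin-1 character,
# so the result has three characters: U+263A, U+00C3, U+00A9.
my $mixed = $text . $bytes;

# Encoding the mixed string as UTF-8 encodes the original bytes a second time.
my $double = encode('UTF-8', $mixed);

# Decoding the binary string first gives the intended result.
my $correct = encode('UTF-8', $text . decode('UTF-8', $bytes));

print "double-encoded: ", hex_bytes($double),  "\n";  # E2 98 BA C3 83 C2 A9
print "correct:        ", hex_bytes($correct), "\n";  # E2 98 BA C3 A9
```

Note that the concatenation itself produces no warning or error under strict and warnings, which is what makes this kind of bug easy to miss.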




Disabling this implicit decoding would force the programmer to use decode()/encode() properly and help prevent bugs.




Is it possible to turn off implicit decoding? Ideally, using a binary string together with a text string would result in an error.
