Feature: detect mixups between two single-byte encodings · Issue #18 · LuminosoInsight/python-ftfy · GitHub

Original link: https://github.com/LuminosoInsight/python-ftfy/issues/18

There is apparently a fair amount of Spanish text out there that contains a mix-up between Windows-1252 and MacRoman before being encoded in UTF-8. Because Latin-1 for Windows-1252 is the only single-byte mixup we detect, we assume that'...
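A mix-up of this kind can be reproduced with Python's standard codecs. The sketch below is only an illustration, not ftfy's implementation or API: the helper names `simulate_mixup` and `undo_mixup` are hypothetical, and the direction of the mix-up (bytes written as Windows-1252 but read as MacRoman) is an assumption, since the excerpt does not say which way it goes.

```python
def simulate_mixup(text, written_as, read_as):
    """Encode text with one single-byte encoding, then decode it with another."""
    return text.encode(written_as).decode(read_as)


def undo_mixup(garbled, written_as, read_as):
    """Reverse the mix-up by re-encoding with the wrong codec and decoding with the right one."""
    return garbled.encode(read_as).decode(written_as)


original = "canción"

# Assumed direction: the bytes were Windows-1252 but were interpreted as MacRoman.
garbled = simulate_mixup(original, "cp1252", "mac_roman")
repaired = undo_mixup(garbled, "cp1252", "mac_roman")

print(garbled)   # the accented letter comes out as a different character
print(repaired)  # the round trip restores the original text
```

Because both encodings map every byte used here to some character, the garbled text is still valid text, which is why such a mix-up has to be detected heuristically rather than by a decoding error.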
