> I wonder if, for sanity's sake, we should just encode all strings as UTF-8. I
> can't think of any place where we'd want just ascii.
It's definitely a better *default* choice than ASCII, which is what's implicitly used now. Plain ASCII is already valid UTF-8, and text in any other encoding that contains non-ASCII characters is exceedingly unlikely to pass a simple UTF-8 decoding check. So defaulting to UTF-8 accepts more input, gives access to full Unicode, and yet isn't likely to lead to misinterpretation of data.
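
To make the "simple decoding check" concrete, here is a rough sketch in Python. The helper name `is_valid_utf8` and the sample strings are made up for illustration; this is not code from our tree, just a demonstration that Latin-1 bytes with an accented character fail a strict UTF-8 decode while ASCII and real UTF-8 pass.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError:
        return False
    return True


# Pure ASCII is a subset of UTF-8, so it always passes.
ascii_bytes = "plain text".encode("ascii")

# The same word encoded two ways: once as UTF-8, once as Latin-1.
utf8_bytes = "café".encode("utf-8")
latin1_bytes = "café".encode("latin-1")

for label, raw in [("ascii", ascii_bytes), ("utf-8", utf8_bytes), ("latin-1", latin1_bytes)]:
    print(label, is_valid_utf8(raw))
```

The Latin-1 case fails because the lone byte 0xE9 looks like the start of a multi-byte UTF-8 sequence with no continuation bytes after it, which is why other 8-bit encodings so rarely slip through by accident.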
Jeroen