[Catalyst] charset not needed for Catalyst::Action::REST?
Tomas Doran
bobtfish at bobtfish.net
Sat Feb 25 09:05:15 GMT 2012
On 24 Feb 2012, at 15:50, Bill Moseley wrote:
> When using Catalyst::Action::REST the content-type response never includes a charset. JSON seems to be handled correctly in code -- JSON strings are always UTF-8. Does that mean there is no need to specify a charset on responses?
Theoretically, you don't need to, but I think we should. Specifically, I've heard reports of encoding issues when talking to some other stacks, which were fixed by us doing this explicitly.
> And what if a JSON request comes in with a non-UTF8 charset? Should that be ignored? It's application/json, not text/json so maybe there are no encoding issues?
I thought that JSON was always UTF-8, but I read the spec recently, and whilst it's always Unicode, it can also be encoded as UTF-16 or UTF-32:
JSON text SHALL be encoded in Unicode. The default encoding is UTF-8.
Since the first two characters of a JSON text will always be ASCII
characters [RFC0020], it is possible to determine whether an octet
stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
at the pattern of nulls in the first four octets.
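A minimal sketch of the null-pattern check the spec describes, using only core Perl and Encode. detect_json_encoding() is a hypothetical helper written for illustration, not something Catalyst::Action::REST or any JSON module provides, and it assumes the input has no byte order mark:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Guess the encoding of a JSON octet string from the pattern of NUL
# octets in its first four bytes.
# detect_json_encoding() is a hypothetical helper, not part of any module.
sub detect_json_encoding {
    my ($octets) = @_;

    # Too short to tell; fall back to the default encoding.
    return 'UTF-8' if length($octets) < 4;

    # Build a 4-character mask: "0" for a NUL octet, "x" for anything else.
    my $mask = join '',
        map { $_ eq "\0" ? '0' : 'x' }
        split //, substr($octets, 0, 4);

    my %by_mask = (
        '000x' => 'UTF-32BE',
        '0x0x' => 'UTF-16BE',
        'x000' => 'UTF-32LE',
        'x0x0' => 'UTF-16LE',
    );
    return $by_mask{$mask} // 'UTF-8';
}

# Pure-ASCII JSON text (the \u escape is literal inside single quotes).
my $json = '["\u263a"]';

for my $enc (qw(UTF-8 UTF-16LE UTF-16BE UTF-32LE UTF-32BE)) {
    my $octets = encode($enc, $json);
    print "$enc detected as ", detect_json_encoding($octets), "\n";
}
```

Once the encoding is known, the octets could be handed to Encode's decode() before parsing.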
> What about other serializations? YAML is UTF-8 or UTF-16. Does that mean the charset needs to be included in the response? And again, if a request comes in with UTF-16 does it need to be decoded or does that happen in YAML::Syck?
>
The latter, but yes I think the charset should also be included.
> Even text/html doesn't include a charset in the "serialized" response.
I would think that text/html should still be handled by Catalyst::Plugin::Unicode::Encoding, if that is loaded?
> Does there need to be an additional decoding and encoding layer when using Catalyst::Action::REST?
I'm of the opinion there shouldn't need to be.
> Should I force a charset on all responses?
I think we should fix this, at least for JSON and YAML, where the right thing to do is entirely clear.
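As a rough illustration of what "forcing a charset" could look like, here is a small sketch in plain Perl. with_charset() is a hypothetical helper, not an existing Catalyst or Catalyst::Action::REST function, and the utf-8 default is an assumption that the serializer really emits UTF-8 octets:

```perl
use strict;
use warnings;

# Hypothetical helper: append a charset parameter to a Content-Type value
# unless one is already present. Defaults to utf-8 (an assumption).
sub with_charset {
    my ($content_type, $charset) = @_;
    $charset //= 'utf-8';

    # Leave the value alone if a charset parameter already exists.
    return $content_type if $content_type =~ /;\s*charset\s*=/i;

    return "$content_type; charset=$charset";
}

print with_charset('application/json'), "\n";
print with_charset('text/x-yaml'), "\n";
print with_charset('text/html; charset=ISO-8859-1'), "\n";
```

In a Catalyst application the resulting string could be passed to $c->response->content_type.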
> BTW -- doesn't seem like YAML survives a round trip like JSON does:
> <snip>
> But YAML drops the utf8 flag:
>
> $ perl -MYAML::Syck -MEncode -wle 'print length(YAML::Syck::Load(YAML::Syck::Dump( ["\x{263A}"]) )->[0])'
> 3
Eugh. This works as expected with YAML and YAML::XS. I vote that we stop using YAML::Syck, as it's less maintained (and clearly has encoding issues).
Anyone have strong reasons for not doing this?
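For anyone wondering where the 3 in the one-liner above comes from: U+263A is one character but three octets in UTF-8. The sketch below uses only core Encode and assumes the round trip handed back those UTF-8 octets rather than a character string; it does not call YAML::Syck itself.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8 decode_utf8);

my $char   = "\x{263A}";            # one character: WHITE SMILING FACE
my $octets = encode_utf8($char);    # its UTF-8 encoding: E2 98 BA

print length($char),   "\n";        # counts characters
print length($octets), "\n";        # counts octets

# If a round trip returns octets instead of characters, decoding them
# as UTF-8 recovers the original string.
my $recovered = decode_utf8($octets);
print $recovered eq $char ? "recovered\n" : "mismatch\n";
```

This prints 1, 3 and "recovered".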
Cheers
t0m