[Catalyst] tips for troubleshooting/QAing Unicode (was Re: Passing
UTF-8 arg in URL to DBIC search)
Darren Duncan
darren at darrenduncan.net
Mon Sep 29 03:16:07 BST 2008
Lee Aylward wrote:
> Great timing on this as I am currently struggling with some unicode text
> not displaying correctly in an application I am working on. Per your
> suggestion I put the Japanese text at the top of my template. All of a
> sudden the browsers started displaying that and other non-ascii characters
> correctly. The second I take away the Japanese text it goes back to just
> showing question marks. I am seeing this behavior in both the test
> server and Apache.
>
> I have looked at the Content-Type header and it is definitely serving it
> as utf-8, so I am at a bit of a loss. There are no databases involved
> here, but I am displaying information from IMDB::Film. Is there anything
> in the actual HTML that needs to be set?
That seems strange. I wonder if something in your template handler or
other part of your app is trying to DWIM for you and is getting it wrong.
Are your source files actually UTF-8, both the prior and new versions? Are
you explicitly declaring that in one place and not another? I wouldn't
expect the addition of Japanese text to suddenly make the other characters
look correct by itself unless there's some DWIM going on. I suspect you
made some other change between the two versions as well, such as saving the
source file in a different encoding.
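One rough way to test the "are your source files actually UTF-8" question
without relying on an editor is to check whether the raw bytes decode cleanly.
The following is only a sketch in Python, not anything from Catalyst or your
template handler; the function names `looks_like_utf8` and `has_non_ascii` are
made up for illustration, and the check is a heuristic (a pure-ASCII file
passes trivially, so it only tells you something when non-ASCII bytes are
present).

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if the bytes decode without error as UTF-8 (heuristic)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def has_non_ascii(data: bytes) -> bool:
    """Return True if any byte is outside the 7-bit ASCII range."""
    return any(b > 0x7F for b in data)


# The same word "saved" in two encodings, standing in for two file versions.
utf8_bytes = "café".encode("utf-8")
latin1_bytes = "café".encode("latin-1")

for label, data in [("utf-8 copy", utf8_bytes), ("latin-1 copy", latin1_bytes)]:
    print(label, "non-ASCII:", has_non_ascii(data),
          "valid UTF-8:", looks_like_utf8(data))
```

To apply this to a real file you would read it in binary mode, for example
`open(path, "rb").read()`, and pass the result to the same functions.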
The reason I use a Japanese text example is that the vast majority of my
normal program text fits in the ASCII repertoire; only user data might
contain non-ASCII characters, and most user data doesn't. Japanese
characters are known not to have a one-byte interpretation, and they stand
out clearly from Latin letters at a glance.
So in your own situation, the text you already have that doesn't display
right, if it is literal text in your source code, can stand in for my
Japanese test example when checking whether things look right. So check
what encoding your text editor reports for the older, incorrect version of
the file.
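The point about Japanese characters having no one-byte interpretation can be
shown with a few lines of Python. This is only an illustrative sketch: the
helper name `utf8_length` is hypothetical, and the last step merely
demonstrates one way a question mark can appear (forcing a character into a
single-byte encoding with replacement); it is not a diagnosis of what your
application is doing.

```python
def utf8_length(ch: str) -> int:
    """Number of bytes the given text occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


# ASCII letter, accented Latin letter, Japanese character.
for ch in ["a", "é", "日"]:
    print(repr(ch), utf8_length(ch), ch.encode("utf-8").hex())

# Forcing a Japanese character into Latin-1 with errors="replace"
# substitutes a literal question mark, because Latin-1 has no byte for it.
forced = "日".encode("latin-1", errors="replace")
print(forced)
```

An accented Latin letter can survive a round trip through Latin-1, so it can
hide an encoding mistake that a Japanese character exposes immediately.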
-- Darren Duncan