[Catalyst] tips for troubleshooting/QAing Unicode (was Re: Passing UTF-8 arg in URL to DBIC search)

Darren Duncan darren at darrenduncan.net
Mon Sep 29 03:16:07 BST 2008

Lee Aylward wrote:
> Great timing on this as I am currently struggling with some unicode text
> not displaying correctly in an application I am working on. Per your
> suggestion I put the Japanese text at the top of my template. All of a
> sudden the browsers started displaying that and other non-ascii characters
> correctly. The second I take away the Japanese text it goes back to just
> showing question marks. I am seeing this behavior in both the test
> server and Apache.
> I have looked at the Content-Type header and it is definitely serving it
> as utf-8, so I am at a bit of a loss. There are no databases involved
> here, but I am displaying information from IMDB::Film. Is there anything
> in the actual HTML that needs to be set?

That seems strange.  I wonder if something in your template handler or 
other part of your app is trying to DWIM for you and is getting it wrong. 
Are your source files actually UTF-8, both the prior and new versions?  Are 
you explicitly declaring that in one place and not another?  I wouldn't 
expect the addition of Japanese text to suddenly make the other characters 
look correct by itself unless there's some DWIM going on.  I suspect you 
made some other change between the two versions as well, such as saving the 
source file in a different encoding.
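
One way to check the "are your source files actually UTF-8" question 
without relying on what an editor claims is to try decoding the raw bytes. 
The following is only a minimal sketch, written in Python purely for 
illustration rather than in the Perl your app uses; the function name 
describe_encoding is hypothetical, and a file that decodes cleanly is not 
proof that it was intended as UTF-8, only that it is consistent with it.

```python
def describe_encoding(data: bytes) -> str:
    """Classify raw bytes as 'ascii', 'utf-8', or 'not utf-8'.

    Hypothetical helper for illustration only.
    """
    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "not utf-8"
    # Pure ASCII is also valid UTF-8, so report it separately: such a file
    # tells you nothing about how non-ASCII text would be handled.
    if all(byte < 0x80 for byte in data):
        return "ascii"
    return "utf-8"


if __name__ == "__main__":
    import sys

    # Usage: python check_encoding.py old_template new_template
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            print(path, describe_encoding(handle.read()))
```

Running it over both the prior and new versions of a template would show 
whether the encoding of the file itself changed between them.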

Note that the reason I use a Japanese text example is that the vast 
majority of my normal program text fits in the ASCII repertoire, and 
only user data might contain other Unicode characters, though most user 
data doesn't.  Japanese characters have no one-byte representation in 
UTF-8 (each takes several bytes), and they stand out clearly from Latin 
letters at a glance.  So in your own situation, the text you already have 
that doesn't display right, if it is literal text in your source code, 
can stand in for my Japanese test example to see if things look right.  
So check what your text editor reports as the encoding of your 
older/incorrect file version.
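
To illustrate why Japanese text makes a good probe, here is a small 
sketch, again in Python only for illustration, with a hypothetical helper 
name.  It shows how many UTF-8 bytes each character needs, and one way 
(among several possible ones, not a diagnosis of your app) that question 
marks can appear: forcing text through an encoding that cannot represent 
it.

```python
def utf8_byte_lengths(text: str) -> list[int]:
    """Return the number of UTF-8 bytes used by each character of text."""
    return [len(ch.encode("utf-8")) for ch in text]


sample = "日本語"

# Each of these characters needs three bytes in UTF-8, so the text cannot
# survive a step that assumes one byte per character.
lengths = utf8_byte_lengths(sample)

# Encoding into a repertoire that lacks the characters, with replacement
# enabled, substitutes one question mark per character.
replaced = sample.encode("ascii", errors="replace")

# Decoding the UTF-8 bytes as Latin-1 never fails, but yields nine
# unrelated characters instead of the original three.
misread = sample.encode("utf-8").decode("latin-1")

if __name__ == "__main__":
    print(lengths)
    print(replaced)
    print(len(misread))
```

Plain ASCII text passes through all three steps unchanged, which is why 
it cannot reveal this kind of problem.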

-- Darren Duncan
