[Html-widget] UTF-8 and escaping
Adam Sjøgren
adsj at novozymes.com
Tue Jun 20 16:43:15 CEST 2006
Hi.
It looks like HTML::Widget, by way of HTML::Element, escapes the
individual bytes of UTF-8 multibyte characters. Example:
$ cat utf_8-1.07.pl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Widget;
my $v='Frække frølår';
my $w=HTML::Widget->new('widget')->method('get')->action('/');
my $e=$w->element('Textarea', 'mytext')->value($v);
print $w->process->as_xml;
$ ./utf_8-1.07.pl
<form action="/" id="widget" method="get"><fieldset><textarea class="textarea" cols="40" id="widget_mytext" name="mytext" rows="20">Fr&#195;&#166;kke fr&#195;&#184;l&#195;&#165;r</textarea></fieldset></form>
$
(LANG is set to en_DK.UTF-8, so the locale is UTF-8.)
Is that supposed to happen, and is it possible to override it?
The escaping is done in HTML::Element::_xml_escape, in a way that does
not take multibyte character encodings into account. It isn't called
as a method, so it isn't easily overridable, as far as I can see.
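To illustrate the difference between escaping bytes and escaping characters, here is a minimal sketch. The numeric_escape helper is hypothetical: it only imitates a per-character numeric escape and is not the code of HTML::Element::_xml_escape. It assumes the value arrives as raw UTF-8 bytes, which is what a string literal holds when the script has no "use utf8".

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in for a per-character escape: replaces everything
# outside printable ASCII with a decimal numeric entity.
# (It ignores <, > and &, which a real XML escape must also handle.)
sub numeric_escape {
    my ($text) = @_;
    $text =~ s/([^\x20-\x7E])/'&#' . ord($1) . ';'/ge;
    return $text;
}

# The UTF-8 encoding of 'Frække' written out as raw bytes.
my $bytes = "Fr\xC3\xA6kke";

# The same text decoded into Perl characters.
my $chars = decode('UTF-8', $bytes);

print numeric_escape($bytes), "\n";   # one entity per byte
print numeric_escape($chars), "\n";   # one entity per character
```

The first line printed is Fr&#195;&#166;kke and the second is Fr&#230;kke. This suggests, without having been verified against HTML::Widget itself, that handing the widget decoded characters (via "use utf8" or Encode::decode) might yield one entity per character instead of one per byte.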
How are people coping with this? I'm afraid I'm overlooking something
:*)
(For now I'm using a <%filter> section in my Mason autohandler that
translates &#NNN; back, but that is treating the symptom rather
than curing the cause...)
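A workaround of the kind described above could look roughly like this. It is a sketch only: the unescape_high_bytes helper is a hypothetical name, and it assumes that every numeric entity in the range 128..255 is an escaped UTF-8 byte, which will not hold if the output also contains genuine Latin-1 entities.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn numeric entities for the byte values 128..255
# back into raw bytes, leaving all other entities untouched.
sub unescape_high_bytes {
    my ($html) = @_;
    $html =~ s/&#(\d+);/($1 >= 128 && $1 <= 255) ? chr($1) : "&#$1;"/ge;
    return $html;
}

# Escaped UTF-8 bytes come back as the original byte sequence.
my $fixed = unescape_high_bytes('Fr&#195;&#166;kke fr&#195;&#184;l&#195;&#165;r');
print $fixed, "\n";
```

In a Mason <%filter> section the same substitution would presumably be applied to $_, which holds the component output.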
Best regards,
Adam
--
Adam Sjøgren
adsj at novozymes.com