[Html-widget] UTF-8 and escaping

Ash Berlin ash at cpan.org
Tue Jun 20 16:54:13 CEST 2006


Adam Sjøgren wrote:
>   Hi.
>
>
> It looks like HTML::Widget, by way of HTML::Element, escapes the
> individual bytes of utf-8 multibyte characters; example:
>
>  $ cat utf_8-1.07.pl 
>  #!/usr/bin/perl
>
>  use strict;
>  use warnings;
>
>  use HTML::Widget;
>
>  my $v='Frække frølår';
>
>  my $w=HTML::Widget->new('widget')->method('get')->action('/');
>  my $e=$w->element('Textarea', 'mytext')->value($v);
>
>  print $w->process->as_xml;
>  $ ./utf_8-1.07.pl 
>  <form action="/" id="widget" method="get"><fieldset><textarea class="textarea" cols="40" id="widget_mytext" name="mytext" rows="20">Fr&#195;&#166;kke fr&#195;&#184;l&#195;&#165;r</textarea></fieldset></form>
>  $ 
>
> (LANG is set to LANG=en_DK.UTF-8, so the locale is UTF-8).
>
> Is that supposed to happen; possible to override?
>
> The escaping is done, in a way that does not take multibyte character
> sets into account, in HTML::Element::_xml_escape - which isn't called
> as a method, so it isn't easily overridable, as far as I can see.
>
> How are people coping with this? I'm afraid I'm overlooking something
> :*)
>
>
> (For now I'm using a <%filter>-section in my Mason-autohandler that
> translates &#NNN; back, but that is patching up the symptom rather
> than a cure...)
>
>
>   Best regards,
>
>    Adam
>
>   
If I remember rightly, omega from IRC had issues with UTF-8, but I'm not
sure whether he ever found a solution. I will ask when I next see him.
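
In the meantime, the output quoted above looks like what you get when a
per-character escaper is handed a string of undecoded UTF-8 bytes, so here
is a minimal sketch of that effect. It assumes the escaping works one
character at a time on whatever string it is given. The escape_non_ascii
function is a hypothetical stand-in written for this mail, not the actual
HTML::Element::_xml_escape code, and I have not verified that passing a
decoded string to HTML::Widget changes its output in the same way.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in for a per-character numeric escaper. This is NOT
# the real HTML::Element code; it only illustrates the effect. It replaces
# every character outside printable ASCII with a numeric reference and
# leaves &, < and > alone to keep the example short.
sub escape_non_ascii {
    my ($text) = @_;
    $text =~ s/([^\x20-\x7E])/'&#' . ord($1) . ';'/ge;
    return $text;
}

# The UTF-8 encoded bytes of 'Frække frølår', which is what a script
# without "use utf8" holds for that string literal.
my $bytes = "Fr\xC3\xA6kke fr\xC3\xB8l\xC3\xA5r";

# Decoding turns the byte string into a string of characters.
my $chars = decode('UTF-8', $bytes);

print escape_non_ascii($bytes), "\n";   # each byte escaped separately
print escape_non_ascii($chars), "\n";   # each character escaped once
```

The first line printed matches the Fr&#195;&#166;kke output in the
original report; the second has one reference per character
(Fr&#230;kke fr&#248;l&#229;r). If the same holds inside HTML::Widget,
adding "use utf8;" to the script, or decoding the value before passing it
to ->value, might be worth a try.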

Ash




