[Html-widget] UTF-8 and escaping
Bernhard Graf
html-widget at augensalat.de
Tue Jun 20 17:03:02 CEST 2006
On Tuesday 20 June 2006 16:43, Adam Sjøgren wrote:
> It looks like HTML::Widget, by way of HTML::Element, escapes the
> individual bytes of utf-8 multibyte characters; example:
>
> $ cat utf_8-1.07.pl
> #!/usr/bin/perl
>
> use strict;
> use warnings;
>
> use HTML::Widget;
>
> my $v='Frække frølår';
>
> my $w=HTML::Widget->new('widget')->method('get')->action('/');
> my $e=$w->element('Textarea', 'mytext')->value($v);
>
> print $w->process->as_xml;
> $ ./utf_8-1.07.pl
> <form action="/" id="widget" method="get"><fieldset><textarea
> class="textarea" cols="40" id="widget_mytext" name="mytext"
> rows="20">Fr&#195;&#166;kke
> fr&#195;&#184;l&#195;&#165;r</textarea></fieldset></form> $
>
> (LANG is set to LANG=en_DK.UTF-8, so the locale is UTF-8).
>
> Is that supposed to happen; possible to override?
>
> The escaping is done, in a way that does not take multibyte character
> sets into account, in HTML::Element::_xml_escape - which isn't called
> as a method, so it isn't easily overridable, as far as I can see.
>
> How are people coping with this? I'm afraid I'm overlooking something
>
> :*)
My solution is to patch HTML::Tree. I might be wrong, but I regard
HTML::Element::_xml_escape() as broken, as do others who have filed
reports at http://rt.cpan.org/Public/Dist/Display.html?Name=HTML-Tree.
Patch attached.
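
For readers who cannot fetch the attachment, here is a minimal sketch of
the underlying problem. It is not the attached patch and not the actual
code of HTML::Element::_xml_escape(); escape_non_ascii() is a
hypothetical stand-in that assumes the escaping replaces every
non-ASCII element of the string with a numeric reference, as Adam's
description suggests. It shows that the result depends on whether the
string holds raw UTF-8 bytes or decoded Perl characters.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in for the escaping step (an assumption, not the
# HTML::Element code): replace anything outside printable ASCII with a
# decimal numeric character reference.
sub escape_non_ascii {
    my ($text) = @_;
    $text =~ s/([^\x20-\x7E])/'&#' . ord($1) . ';'/ge;
    return $text;
}

# Raw UTF-8 bytes, as a literal would be read without "use utf8".
my $bytes = "Fr\xC3\xA6kke fr\xC3\xB8l\xC3\xA5r";

# Each byte of a multibyte sequence is escaped on its own.
print escape_non_ascii($bytes), "\n";
# prints: Fr&#195;&#166;kke fr&#195;&#184;l&#195;&#165;r

# Decoded first, each letter is a single character with one code point.
my $chars = decode('UTF-8', $bytes);
print escape_non_ascii($chars), "\n";
# prints: Fr&#230;kke fr&#248;l&#229;r
```

Under these assumptions, decoding the value before handing it to the
widget might be a workaround that needs no patch; whether it is
sufficient for HTML::Widget itself is not verified here.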
--
Bernhard Graf
-------------- next part --------------
A non-text attachment was scrubbed...
Name: HTML-Tree.patch
Type: text/x-diff
Size: 450 bytes
Desc: not available
Url : http://lists.rawmode.org/pipermail/html-widget/attachments/20060620/0b24d780/attachment.bin