[Xml-compile] XML::Compile::SOAP 2.02 woes (2)

Gert Doering gert at space.net
Wed Mar 18 14:58:29 GMT 2009


Hi,

and here's Issue #2...

With XML::Compile::SOAP 2.02, input strings that are not already UTF-8
are not promoted to UTF-8 properly (in fact, not at all).  This is similar
to what we had in the past, and what got fixed in 0.71.

My input document is ISO8859-1 encoded, and perl knows this (BSD system,
no UTF-8 anywhere here, perl 5.8.9).  In the test case, the umlauts are 
just typed into the code.

Checking with utf8::is_utf8($string) and utf8::upgrade($string), and
just printing the string to a (verified!) ISO8859-capable terminal 
confirms that perl *knows* what form of characters are in this string.
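
For reference, a minimal sketch of the kind of check meant here.  The
string is a made-up stand-in for the real test data, and the umlauts are
written as \x escapes so that the example does not depend on the encoding
of the source file (in my test case they are typed in directly):

```perl
use strict;
use warnings;

# Hypothetical test string: three umlauts given as Latin-1 code points.
my $string = "C0815-4711-\xE4\xF6\xFC";

# Internal UTF-8 flag is off for a plain single-byte string.
my $before = utf8::is_utf8($string) ? 1 : 0;

# upgrade() only changes perl's internal storage, not the characters.
utf8::upgrade($string);
my $after = utf8::is_utf8($string) ? 1 : 0;

printf "flag before=%d after=%d length=%d\n",
    $before, $after, length $string;
```

The character count stays the same before and after the upgrade; only
the internal representation changes.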

Still, when sending them over the wire, they are encoded as ISO8859-1,
but declared as UTF-8:

-------------------- quote -----------------
$ tcpdump -A -s0 'tcp port 80'
POST /bin/TDe/ticket_callback.pl?SelfTest HTTP/1.1
TE: deflate,gzip;q=0.3
Keep-Alive: 300
Connection: Keep-Alive, TE
Host: partnerweb.lie.space.net
User-Agent: libwww-perl/5.825
Content-Length: 496
Content-Type: text/xml; charset="utf-8"
SOAPAction: "urn:Test/OpSelfTest"
X-LWP-Version: 5.825
X-XML-Compile-SOAP-Version: 2.02
X-XML-Compile-Version: 1.02
X-XML-LibXML-Version: 1.69

<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"><SOAP-ENV:Header><s:AuthenticationInfo xmlns:s="urn:TDeModify"><s:userName>user</s:userName><s:password>pass</s:password></s:AuthenticationInfo></SOAP-ENV:Header><SOAP-ENV:Body><s:OpSelfTest xmlns:s="urn:TDeModify"><s:WEBINT_CUSTTICKETID>200711250087</s:WEBINT_CUSTTICKETID><s:WEBINT_TICKETID>C0815-4711-...</s:WEBINT_TICKETID></s:OpSelfTest></SOAP-ENV:Body></SOAP-ENV:Envelope>
-------------------- quote -----------------

(You can't really see the 'umlaut' characters here, but you can see that
there are 3 '.' characters after C0815-4711-... - those 3 dots are the
3 umlauts in the source text.  Proper UTF-8 encoding should have 6 'things'
here, printable or not).
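
The 3-versus-6 byte count can be reproduced with the core Encode module.
This is only a sketch to illustrate the arithmetic, not what
XML::Compile::SOAP does internally; the variable names are made up:

```perl
use strict;
use warnings;
use Encode qw(encode);

# The three umlauts (a-, o-, u-umlaut) as characters.
my $umlauts = "\xE4\xF6\xFC";

# Serialise the same characters in both encodings.
my $latin1 = encode('iso-8859-1', $umlauts);
my $utf8   = encode('UTF-8',      $umlauts);

printf "ISO-8859-1: %d bytes (%s)\n", length $latin1, unpack('H*', $latin1);
printf "UTF-8:      %d bytes (%s)\n", length $utf8,   unpack('H*', $utf8);
```

Each umlaut is one byte in ISO8859-1 but two bytes in UTF-8, so a
correctly encoded request would carry six bytes at that position.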

It can be "properly" seen in a hex dump (tcpdump ... | od -c ):

...
0002020    s   :   W   E   B   I   N   T   _   T   I   C   K   E   T   I
0002040    D   >   C   0   8   1   5   -   4   7   1   1   -   ä   ö   ü
0002060    <   /   s   :   W   E   B   I   N   T   _   T   I   C   K   E
...

- so these are indeed single-byte encoded characters.


Given that perl knows the correct encoding of the string, and the whole
SOAP stuff is UTF-8 based, I'm a bit surprised at this...

The .wsdl file is the same as for the last test (just modified to use
http instead of https, to be able to tcpdump, but the effect is the
same with https).

The test case is slightly modified, now with umlauts and a few $utf8
tests.

Gert Doering
        -- NetMaster
-- 
Total number of prefixes smaller than registry allocations:  128645

SpaceNet AG                        Vorstand: Sebastian v. Bomhard
Joseph-Dollinger-Bogen 14          Aufsichtsratsvors.: A. Grundner-Culemann
D-80807 Muenchen                   HRB: 136055 (AG Muenchen)
Tel: +49 (89) 32356-444            USt-IdNr.: DE813185279


