[Xml-compile] Re: Null characters in strings
Patrick Powell
papowell at astart.com
Thu Dec 12 00:06:31 GMT 2013
On 12/10/13 14:43, Mark Overmeer wrote:
> * Patrick Powell (papowell at astart.com) [131210 17:17]:
>> I ran into a very odd problem today. Due to a cut and paste update
>> of a database field, a Null (0x00) value was put into a string.
> oops
>
Yeah. I spent quite a while trying to figure this one out.
>> I ran into some funny problems. First, one toolset generated a
>> &#x00; for the null. However, this resulted
>> in an error when sent to an XML::SOAP server: illegal XMLchar
>> value 0 (I do not have the exact error message).
> Who did complain? The server? XML::Compile has a light task: it
> does not look at the content of the strings at all: if anyone
> did encode \x0 into &#x00;, then that is libxml2.
The server (implemented using XML::Compile) complained
and generated a text message response. The error appears to have
originated from the libxml2 library, as you suggested.
>> I then used an XML::Compile based client and looked at the SOAP
>> message that was being sent. It did not appear to have the null
>> value in the string.
> Well, never trust editors in this case. Did you use 'vim' with the
> 'l' command?
I actually used Wireshark and looked at the message going out at the
TCP/IP level. There was no '&#x00;' or similar text in the message, but
there were NULLs in the MySQL text that I used to generate the message.
>> Question: Does XML::Compile or its support libraries/facilities
>> delete NULL characters before doing string encoding (i.e.,
>> translating & to &amp;)? Does it do this for other characters?
> libxml2 wrapped by XML::LibXML
Well, apparently libxml2 has nicely removed the 'illegal' control
characters from outgoing XML, but it refuses to handle them in incoming
XML. I suspect that NULL (0x00) is treated as a 'truly evil' character
because it is the end-of-string indicator in C, and I can envision all
sorts of havoc if it were not deleted during the decoding phase of the
XML translation. Since libxml2 is written in C, this would make a lot
of sense.
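
To illustrate both points, here is a minimal sketch in Python. It is an
illustration only: it uses Python's standard-library parser (expat based),
not libxml2 or XML::Compile, and the helper name parses() is made up for
this example. It shows that a conforming XML 1.0 parser refuses a character
reference to 0x00, and that a C-style view of a buffer stops at the first
NULL byte.

```python
import ctypes
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if the stdlib (expat-based) parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A character reference to U+0000 is not allowed in XML 1.0,
# so the parser is expected to reject this document.
nul_reference_ok = parses("<field>abc&#x00;def</field>")

# An ordinary escaped ampersand is accepted.
amp_ok = parses("<field>abc&amp;def</field>")

# C-style strings end at the first NULL byte: .value is the
# "C string" view, .raw is the complete underlying buffer.
buf = ctypes.create_string_buffer(b"abc\x00def")
c_view = buf.value
raw_view = buf.raw

print(nul_reference_ok, amp_ok, c_view, raw_view)
```

Whether libxml2 reports the rejected reference in the same way is an
assumption on my part; the sketch only demonstrates the general behaviour.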
>
>> Question: Is there a magic option/flag for XML::Compile so that it
>> will ACCEPT strings with encoded null characters in them?
> No, X::C is agnostic about charsets
> You have to look at the libxml2 lists for your answers... sorry.
Yes. You would have to go into the guts of libxml2 to change this. I
think the correct fix is to have the senders of the data clean up their
databases and remove the junk characters.
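
For the clean-up itself, something along these lines could be run over
the text fields before they are turned into XML. This is a rough sketch
in Python, not what was actually deployed; the function names
strip_xml_illegal() and find_xml_illegal() are hypothetical. The
character ranges follow the "Char" production of the XML 1.0
specification.

```python
import re

# Matches any character NOT permitted by the XML 1.0 "Char" production:
# tab, LF, CR, U+0020-U+D7FF, U+E000-U+FFFD and U+10000-U+10FFFF are legal.
_XML10_ILLEGAL = re.compile(
    "[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_xml_illegal(text):
    """Return text with all characters that XML 1.0 forbids removed."""
    return _XML10_ILLEGAL.sub("", text)


def find_xml_illegal(text):
    """Return (offset, code point in hex) for each forbidden character."""
    return [(m.start(), hex(ord(m.group())))
            for m in _XML10_ILLEGAL.finditer(text)]


# Example: a field that picked up a NULL during a cut and paste.
dirty = "Main St\x00reet"
print(find_xml_illegal(dirty))
print(strip_xml_illegal(dirty))
```

Reporting the offsets first, rather than silently stripping, makes it
easier to go back and repair the offending database rows at the source.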