[Catalyst] CSV / UTF-8 / Unicode

Mike Whitaker mike at altrion.org
Thu Jul 4 09:35:20 GMT 2013


On 4 Jul 2013, at 10:30, Mike Whitaker <mike at altrion.org> wrote:
> On 4 Jul 2013, at 09:56, Craig Chant <craig at homeloanpartnership.com> wrote:
>> 
>> Yes, it's NVARCHAR(max), which I understood is MS's data type for uNicode VARiable CHARacters. Looking at some sample column data via the Windows SQL Management GUI, it appears to display OK.
> 
> It probably isn't UTF-8, though. UTF-8 is only one of the possible encodings in which you can store Unicode code points.

Try, on the off-chance I've read the spec right:

- not bothering with mysql_enable_utf8 in the DBI connect args (it's a DBD::mysql option, so it won't help with SQL Server)
- passing all data from DBI through decode("UTF-16", ...) or decode("UCS-2", ...) - MS's docs aren't that clear which!
- re-encoding it as UTF-8 on the way out.
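
Something like the following is what I have in mind. It's only a sketch, and it assumes the driver hands back raw little-endian UTF-16 bytes with no BOM, which I haven't verified for your setup. The helper name utf16le_to_utf8 is made up for illustration. Note that Encode's plain "UTF-16" expects a BOM at the start of the data and croaks without one, hence "UTF-16LE" here.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not part of DBI or Catalyst): takes raw bytes
# ASSUMED to be UTF-16LE without a BOM and returns UTF-8 bytes.
sub utf16le_to_utf8 {
    my ($raw) = @_;
    return undef unless defined $raw;          # pass NULL columns through
    my $chars = decode('UTF-16LE', $raw);      # bytes -> Perl character string
    return encode('UTF-8', $chars);            # characters -> UTF-8 bytes
}

# Stand-in for a column value fetched via DBI: "cafe" with an
# e-acute (U+00E9), encoded the way we assume the driver returns it.
my $raw  = encode('UTF-16LE', "caf\x{e9}");
my $utf8 = utf16le_to_utf8($raw);
```

If the decoded text comes out as garbage, the assumption about the byte order (or about the data being UTF-16 at all) is probably wrong, and it's worth dumping the raw bytes to see what the driver is actually returning.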




