[Catalyst] [OT] protecting against attacks with multilingual input

Joel Bernstein joel at fysh.org
Tue Dec 5 12:52:20 GMT 2006


On Tue, Dec 05, 2006 at 01:10:35PM +0100, Daniel McBrearty wrote:
> How does one do this?
> 
> If you have a text input field which can be in *any* language, which
> will get stored in the db, how do you protect against script
> injection?
> 
> If it's just English, I normally only accept characters from a given
> list (something like /[A-Za-z0-9]/ , plus whitespace and punctuation).
> But if the input can be in any language .... ??

Isn't there any way you could require the input to be associated with a
particular language? Perl supports locale definitions which modify, for
example, the set of 'word' characters matched by the \w regular
expression escape. If you could dynamically switch to the correct
locale for your input text, then you could trivially sanitize strings
with s/\W//g (widening the pattern if you want to keep whitespace and
some punctuation).
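
A minimal sketch of that idea, with one substitution: rather than
switching locales, it relies on Perl's Unicode rules (the /u regex
modifier, Perl 5.14 or later), under which \w matches letters and
digits from any script. The helper name strip_non_word is hypothetical,
and the sketch assumes the input has already been decoded from bytes
into a character string.

```perl
use strict;
use warnings;
use utf8;    # this source file contains UTF-8 string literals

# Hypothetical helper: remove every non-word character using Unicode rules.
# Assumes $text is a decoded character string, not raw bytes.
sub strip_non_word {
    my ($text) = @_;
    (my $clean = $text) =~ s/\W//gu;
    return $clean;
}

binmode STDOUT, ':encoding(UTF-8)';
print strip_non_word("<script>alert('x')</script>"), "\n";   # scriptalertxscript
print strip_non_word("Привет, мир!"), "\n";                  # Приветмир
```

Because \W also matches spaces, the words run together; s/[^\w\s]//gu
would be the variant that keeps whitespace.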

If you really have to accept input in any language without knowing
what language it is, then perhaps you should take the opposite approach
and test for the presence of certain characters that are common in
program code but rare in ordinary text. Of course, English might suffer
without the dollar-sign and semicolon, and you may decide that this is
overly restrictive on your users...
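
Roughly what I have in mind, as a sketch only: the character list below
is an assumption picked for illustration (apart from the dollar-sign and
semicolon mentioned above), and suspicious_chars is a hypothetical
helper name. It reports which blacklisted characters appear, so the
caller can decide whether to reject the input or strip them.

```perl
use strict;
use warnings;

# Assumed blacklist: characters common in markup and program code.
my $SUSPECT = qr/[<>{}&;"'`\\\$]/;

# Hypothetical helper: return each blacklisted character found in $text,
# once each, in order of first appearance (empty list if none).
sub suspicious_chars {
    my ($text) = @_;
    my %seen;
    return grep { !$seen{$_}++ } $text =~ /$SUSPECT/g;
}

my @found = suspicious_chars('<b>$x;</b>');
print @found ? "rejected: @found\n" : "ok\n";
```

Since the test only looks at a handful of ASCII characters, it behaves
the same whatever language the surrounding text is written in.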

/joel