Wrox Programmer Forums

May 29th, 2010, 03:38 AM
i18n friendly user input sanitization

I've been using a pretty draconian whitelist for sanitizing user input.

Code:
[^.@a-zA-Z0-9 ]
However, this really only covers American English (maybe British too, but I'm not sure). It certainly won't work for Spanish, French, or German, much less Arabic, Hebrew, or anything further off the beaten path like the East Asian character sets. Has anyone done work on input validation for any languages other than English? I found virtually nothing on the topic beyond the arguments over blacklist vs. whitelist. There's little on implementation, and nothing I've found on implementation for an internationalized situation.
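One possible direction, offered only as a sketch and not as an established answer to the question: whitelist by Unicode general category instead of by literal ranges, so that letters from any script pass while punctuation and symbols are still dropped. The example below uses Python's standard `unicodedata` module purely for illustration; the function names and the `EXTRA_ALLOWED` set are hypothetical. In .NET regular expressions the comparable category escapes are `\p{L}`, `\p{M}`, and `\p{Nd}`.

```python
import re
import unicodedata

# The ASCII-only whitelist from the post: anything not listed is stripped.
ASCII_DISALLOWED = re.compile(r"[^.@a-zA-Z0-9 ]")

# Hypothetical set of extra characters to allow besides letters, marks, digits.
EXTRA_ALLOWED = {".", "@", " "}


def sanitize_ascii(text: str) -> str:
    """Strip everything outside the original ASCII whitelist."""
    return ASCII_DISALLOWED.sub("", text)


def sanitize_unicode(text: str) -> str:
    """Keep letters (L*), combining marks (M*), decimal digits (Nd),
    plus the explicitly allowed characters; drop everything else."""
    # Normalize first so composed and decomposed forms are treated alike.
    text = unicodedata.normalize("NFC", text)
    kept = []
    for ch in text:
        cat = unicodedata.category(ch)
        # Note: Nd also admits non-ASCII digits such as Arabic-Indic ones.
        if ch in EXTRA_ALLOWED or cat[0] in ("L", "M") or cat == "Nd":
            kept.append(ch)
    return "".join(kept)


if __name__ == "__main__":
    sample = "Jos\u00e9 <b>"
    print(sanitize_ascii(sample))    # the accented letter is lost
    print(sanitize_unicode(sample))  # the accented letter survives
```

This is still a whitelist: angle brackets, quotes, semicolons, and operators fall outside the allowed categories. It does not replace context-specific output encoding.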

I can include most of the Western European languages, and "almost" have the characters for the Chinese Pinyin romanization, just by including part of the Latin-1 Supplement code points...

Code:
[^.@a-zA-Z0-9À-ÖØ-öø-ÿ  ]
It would be easier to use the single range, À-ÿ, but that would include both × and ÷. Do those characters raise any potential security concerns? Can you get away with them? I would think (and feel free to correct me) that most letters should be safe but punctuation and mathematical operators are more open to abuse.
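As a rough check on the range question, the sketch below (Python, standard library only; the helper names are hypothetical) lists which code points in U+00C0 to U+00FF are not classified as letters, and applies a pattern that mirrors the split-range whitelist above. It only shows how Unicode classifies × and ÷; whether they are actually risky depends on where the sanitized value ends up.

```python
import re
import unicodedata

# Mirrors the split-range whitelist: Latin-1 letters without U+00D7 and U+00F7.
LATIN1_DISALLOWED = re.compile(
    r"[^.@a-zA-Z0-9\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF ]"
)


def non_letters_in_range(start: int, end: int) -> list:
    """Return the characters in [start, end] whose category is not a letter."""
    return [
        chr(cp)
        for cp in range(start, end + 1)
        if not unicodedata.category(chr(cp)).startswith("L")
    ]


def sanitize_latin1(text: str) -> str:
    """Strip everything outside the split-range Latin-1 whitelist."""
    return LATIN1_DISALLOWED.sub("", text)


if __name__ == "__main__":
    # Characters that the single range U+00C0-U+00FF would add beyond letters.
    for ch in non_letters_in_range(0xC0, 0xFF):
        print(f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))
    print(sanitize_latin1("\u00d1and\u00fa 3\u00d74\u00f72"))
```

The only two non-letters in that span are the multiplication and division signs, both in the math symbol category (Sm), so the two gaps in the pattern are exactly what separates "letters only" from the single range.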
__________________
-------------------------

Whatever you can do or dream you can, begin it. Boldness has genius, power and magic in it. Begin it now.
-Johann von Goethe

When Two Hearts Race... Both Win.
-Dove Chocolate Wrapper

Chroniclemaster1, Founder of www.EarthChronicle.com
A Growing History of our Planet, by our Planet, for our Planet.
 


All times are GMT -4.

