Preventing XSS in JavaScript strings

Escaping user input in your HTML is essential for preventing the world's #1 vulnerability.

When you're embedding user input into JavaScript, a simple htmlspecialchars won't cut it. You'll also need to escape other things, like \n (line endings) and \ (backslashes). Google Doctype has a good list of characters that need proper escaping to prevent users from breaking your JavaScript.
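To make the problem concrete, here is a minimal sketch of what htmlspecialchars leaves behind. The input value is a made-up example, not taken from a real attack; the point is only that the newline and the backslash pass through unchanged.

```php
<?php
// Hypothetical input: a value containing a newline and a trailing backslash.
$userInput = "line one\nline two \\";

// htmlspecialchars() only rewrites &, <, > and quotes (both kinds with ENT_QUOTES).
$escaped = htmlspecialchars($userInput, ENT_QUOTES, 'UTF-8');

// The newline and the trailing backslash survive untouched.
var_dump(strpos($escaped, "\n") !== false); // bool(true)
var_dump(substr($escaped, -1) === '\\');    // bool(true)

// The resulting script has a raw line break inside the string literal,
// and the trailing backslash escapes the closing quote.
$script = "var myString = '" . $escaped . "';";
echo $script, "\n";
```

Either of those two characters is enough to turn the assignment into a syntax error, and a carefully chosen input can do worse than that.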

However, when I asked whether a simple string replacement would be good enough, the members of the Web security mailing list gave me a different answer.

When escaping or filtering output using a blacklist (such as the one published on Google Doctype), browser and Unicode escaping bugs are not taken into consideration. A new vulnerability might appear in the future, which would immediately open a hole in your app. For this reason it's wiser to go with a much more defensive whitelist approach, essentially only letting through the things you know are safe.
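A rough sketch of the whitelist idea follows. The function name jsStringWhitelist and the exact set of allowed characters are my own assumptions for illustration, and it assumes PHP 7.2+ with mbstring (for mb_ord). Everything outside the whitelist is written as a hex escape, so no character has to be known as dangerous in advance.

```php
<?php
// Hypothetical helper (not taken from any library): escape a UTF-8 string for
// use inside a JavaScript string literal. The caller adds the surrounding quotes.
function jsStringWhitelist(string $input): string
{
    // Split into Unicode code points; the 'u' flag makes the pattern UTF-8 aware.
    $chars = preg_split('//u', $input, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    $out = '';
    foreach ($chars as $char) {
        // Whitelist (an assumption): ASCII letters, digits and a little punctuation.
        if (preg_match('/\A[A-Za-z0-9 .,_-]\z/', $char)) {
            $out .= $char;
            continue;
        }

        $code = mb_ord($char, 'UTF-8');
        if ($code < 0x100) {
            $out .= sprintf('\\x%02X', $code);
        } elseif ($code <= 0xFFFF) {
            $out .= sprintf('\\u%04X', $code);
        } else {
            // Code points above the BMP become a UTF-16 surrogate pair.
            $code -= 0x10000;
            $out .= sprintf('\\u%04X\\u%04X', 0xD800 | ($code >> 10), 0xDC00 | ($code & 0x3FF));
        }
    }
    return $out;
}

echo "var myString = '", jsStringWhitelist("</script>\n'"), "';\n";
// Prints: var myString = '\x3C\x2Fscript\x3E\x0A\x27';
```

Note that the closing script tag, the newline and the quote all end up as harmless escapes without any of them being listed explicitly.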

Introducing Reform

Reform is a tool that does exactly this. Reform allows you to escape your data for a JavaScript, XML, HTML or VBScript (yes, it still exists) context. It provides libraries for Java, .NET, PHP, Perl, Python, JavaScript and ASP. Pretty cool!

One thing I dislike is that it only considers a really small set of Unicode codepoints safe. Especially when dealing with non-Latin languages, this is going to add a great deal to the bandwidth usage and hurt the legibility of your source code. One would think there have to be more ranges that can be considered 'safe'.
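To get a feel for the overhead, here is a small sketch. It uses PHP's built-in json_encode purely as a stand-in for an ASCII-only escaper (it is not Reform, and Reform's exact output may differ); by default json_encode writes every non-ASCII character as a \uXXXX sequence. The sample text is an arbitrary choice, and the "\u{...}" syntax assumes PHP 7 or later.

```php
<?php
// Hypothetical sample: three Japanese characters (U+65E5, U+672C, U+8A9E),
// each of which takes 3 bytes in UTF-8.
$text = "\u{65E5}\u{672C}\u{8A9E}";

// Stand-in escaper: every non-ASCII character becomes a 6-byte \uXXXX sequence.
$escaped = json_encode($text);

echo strlen($text), " bytes raw\n";        // 9
echo strlen($escaped), " bytes escaped\n"; // 20, including the two surrounding quotes
echo $escaped, "\n";                       // "\u65e5\u672c\u8a9e"
```

The escaped form is roughly twice the size here, and nobody reading the page source can tell what the string says anymore.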

PHP example:

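<?php
  // Assuming the Reform class is included and $userInput holds untrusted data.

  // JsString() returns the escaped value already wrapped in single quotes.
  echo '<script type="text/javascript"> var myString = ', Reform::JsString($userInput), '; </script>';

?>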

I made a couple of changes in the PHP version, specifically:

  • Prepended the 'static' keyword to every method to make it work in PHP5's strict mode.
  • Removed the UTF-8 checks, since I'm in a controlled environment where mbstring is installed and the internal encoding is UTF-8.
  • Added a parameter to Reform::JsString that stops it from automatically putting the string between quotes (').