For the lulz (Building Secure Websites)

June 28, 2011 | Technical

With all of the #antisec love going around, we felt it was a good time to discuss some of the key principles in writing secure web code. Today’s topic is unsanitised input.

A great piece of philosophy for designing secure systems is that any piece of information that comes from an external source is inherently untrustworthy. This applies quite strongly to the web – every detail in a web request comes from a user somewhere, and can easily be forged and specially crafted to manipulate poorly written code.

This is particularly notable in languages like PHP, which have a low barrier to entry for writing code but don’t often provide good frameworks or assistance to help you avoid being caught out by these problems.

Validate All Input

The first rule is to validate all input. You can’t trust anything that comes in from the request to conform to a certain format.

Just because your pretty HTML form only allows certain values to be sent back doesn’t mean that some nasty cracker somewhere can’t craft a request to your webserver that ignores those limits.

The way we deal with this is to clean these parameters, always. If we expect the parameter to be an integer, we ensure that the parameter only contains an integer, and treat everything else as an error. If it’s a string, we have to assume we don’t know how long that string is, or what sort of characters it may contain.
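As a rough sketch of what that cleaning might look like in PHP, the snippet below uses the built-in filter_var and preg_match functions. The helper names clean_int and clean_title, the length limit and the allowed character set are all assumptions made for illustration; pick rules that match what each of your own parameters should actually contain.

```php
<?php
// Hypothetical helper: returns $raw as an integer, or null if it is not a
// plain integer within [$min, $max]. Anything else is treated as an error.
function clean_int($raw, $min = 1, $max = PHP_INT_MAX)
{
    $value = filter_var($raw, FILTER_VALIDATE_INT, array(
        'options' => array('min_range' => $min, 'max_range' => $max),
    ));
    return $value === false ? null : $value;
}

// Hypothetical helper: accepts only strings of bounded length made up of
// letters, digits, spaces, hyphens and underscores (an assumed whitelist).
function clean_title($raw, $maxLength = 100)
{
    if (!is_string($raw) || strlen($raw) > $maxLength) {
        return null;
    }
    return preg_match('/\A[A-Za-z0-9 _-]+\z/', $raw) === 1 ? $raw : null;
}

var_dump(clean_int('42'));        // int(42)
var_dump(clean_int('42; DROP'));  // NULL
```

In a page this might be called as clean_int(isset($_REQUEST['id']) ? $_REQUEST['id'] : null), with a null result handled as a bad request rather than quietly replaced with a default.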

Now, I know some of you are thinking “Ah, but we can avoid all of this by using JavaScript to validate our forms!”. That doesn’t work, as the JavaScript has no bearing on requests sent by our crafty cracker. It can only influence requests that come from a user using a web browser in the way you intended.

Tread with Care

With strings in particular, filtering the contents is not always an option – in these cases, we need to ensure that when we use the string elsewhere, its contents do not interfere with the other operation.

One of the most common places where this gets fouled up is in SQL queries.

A common (erroneous) construct you’ll find used is:

[php] $query = sprintf('SELECT * FROM articles WHERE id="%s"', $_REQUEST['id']);
$myresults = mysql_query($query);

// do stuff with $myresults…
[/php]

The problem with this is that the id parameter is joined verbatim with the rest of the SQL query. If it contains characters with special significance in SQL, such as the " (double-quote) character, they will be interpreted as such.

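If Little Johnny Cracker were to send our PHP page a request with id set to: [sql]"; DELETE FROM articles; -- [/sql] and the database interface ran both statements, the next thing we know, our DB server no longer has anything in its articles table! (mysql_query itself refuses to run more than one statement per call, but the same flaw still lets an attacker rewrite the WHERE clause, for example with [sql]" OR ""="[/sql], and many other database interfaces will happily run stacked statements.)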
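To see why, it can help to print the query that the vulnerable construct would build. This is only an illustrative sketch: $id stands in for the request parameter, and nothing is sent to a database.

```php
<?php
// The value an attacker might send as the "id" parameter
// (the SQL comment marker is two hyphens followed by a space).
$id = '"; DELETE FROM articles; -- ';

// Same string-building as the vulnerable example, with no escaping.
$query = sprintf('SELECT * FROM articles WHERE id="%s"', $id);

echo $query, "\n";
// SELECT * FROM articles WHERE id=""; DELETE FROM articles; -- "
```

The attacker’s double-quote closes the string early, the semicolon ends the SELECT, and the trailing comment marker swallows the quote that our own code appended.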

[Comic image. Copyright © Randall Munroe - xkcd.com]

We can protect against this by modifying the string so that characters that are special to SQL are appropriately “escaped”. In this specific instance, there is a mysql_real_escape_string function to do this for us, and most other database libraries and languages offer their own ways to perform this important task.

If we amend our example above, the code becomes:
[php] $query = sprintf('SELECT * FROM articles WHERE id="%s"',
    mysql_real_escape_string($_REQUEST['id']));
$myresults = mysql_query($query);

// do stuff with $myresults…
[/php]

This ensures that the contents of id will be interpreted correctly as part of the string being compared to id, and that any special characters in it are escaped so they can’t be interpreted as part of the SQL command.

But this seems all too hard/too manual!

Now, you’re all bound to say, “There must be an easier way!”, and there usually is.

There are plenty of libraries and wrappers available which are aware of the problems involved in preparing safe SQL queries, and which provide simple, convenient parameter escaping to ensure that user input can be safely combined with your queries.

In PHP, there is the PEAR MDB library which can do the above safely. An example:
[php] $sth = $db->prepare('SELECT * FROM articles WHERE id=?');
$myresults = $db->execute($sth, $_REQUEST['id']);

// do stuff with $myresults…
[/php]

Other languages, such as Ruby, Python and Perl, all have their own safe and convenient methods for doing this too.
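For a version of the placeholder approach that can be run on its own, here is a minimal sketch using PDO, the database abstraction layer bundled with PHP. It is not the PEAR code above: it assumes the pdo_sqlite extension is available and uses an in-memory SQLite database in place of a real server. The find_articles helper and the table layout are made up for the example.

```php
<?php
// Assumption: pdo_sqlite is installed. The in-memory database only exists
// for the lifetime of this script.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db->exec('CREATE TABLE articles (id TEXT PRIMARY KEY, title TEXT)');
$db->exec("INSERT INTO articles (id, title) VALUES ('1', 'Hello')");

// Hypothetical helper: look up articles by id using a bound parameter,
// so the value is never spliced into the SQL text.
function find_articles(PDO $db, $id)
{
    $sth = $db->prepare('SELECT * FROM articles WHERE id = ?');
    $sth->execute(array($id));
    return $sth->fetchAll(PDO::FETCH_ASSOC);
}

$hostile = '"; DELETE FROM articles; -- ';

var_dump(count(find_articles($db, '1')));       // int(1)
var_dump(count(find_articles($db, $hostile)));  // int(0)

// The table is untouched: the hostile value was only ever compared as data.
var_dump((int) $db->query('SELECT COUNT(*) FROM articles')->fetchColumn()); // int(1)
```

Because the query text and the parameter travel separately, there is no quoting step to forget. This matters all the more on current PHP, where the mysql_* functions used earlier were removed in PHP 7.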

In Summary

Don’t trust the input into your web code. Make sure you never mix raw input with strings you’re handing off elsewhere, such as via exec, eval or to a database, without first escaping it for that destination. And finally, use convenience methods for correcting/filtering user input when possible to make your life easy.