In this third of four posts on XSS, you will learn how you can stop XSS attacks cold in your ASP.NET apps. "What? Me? Stop them? But those are vicious, deadly attacks by professionals and I am just a newb!" Oh, cut it out. Quit your whining and understand that if you don't stop them, they will get your data or attack your company. Find out how to stop the attacks and make it so.
So, in previous posts, we saw what XSS is and how the attacks are executed. Keep in mind the nasty things like img tags, whose "src" attribute the browser acts on without any thought to whether or not a real image is being returned. If a hacker gets script into that src attribute (older browsers will happily run a javascript: URI there), or gets a script tag pointing at an offsite file onto the page, the script will be fetched and executed in your user's browser.
Ow! So what can be done? Is all lost? No, of course not...settle down. First of all, the bad img tag needs to be on your site somehow. YOU won't put one there, so how else can it get there? Well, if you allow comments on your press releases or on photos on your site, or you allow users to supply input that gets stored or processed for redisplay on your site (comments, posts, discussion groups, user content, profiles, etc.), then you have to block XSS from happening.
Whitelist Data Entry
Go to those points of data entry into your system, and don't forget things like URLs that you process. Consider whitelisting all input. What is whitelisting? Well, you know that a blacklist is something you don't want to be on because it keeps you out of fun places. Anyone not on the list is allowed in -- but in our case, what if you forget to add the next new method of attack that comes out? And did you catch every way that the attack could be encoded to hide it and beat your filters? A whitelist is the list of acceptable input for a point of entry. Everything else is disallowed -- this protects you now AND in the future. And the maitre d' doesn't accept bribes.
Think about your site's contact form that asks for name and address info. Why accept 4000 characters in the city field? Why accept special characters in a zip/postal code field? Whitelist acceptable input for those fields.
- Limit your input
- Strongly type input where possible
- Do NOT rely on input constraints to save you
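To make the list above concrete, here is a minimal sketch of server-side whitelisting. It is written in Python only to stay short and runnable; it is a stand-in, not ASP.NET code. The field names, length limits, and patterns (WHITELIST, is_allowed, parse_quantity) are made up for illustration and are not taken from any real form.

```python
import re

# Hypothetical whitelist rules for a contact form. The field names,
# length limits, and patterns are illustrative assumptions.
WHITELIST = {
    "city": re.compile(r"[A-Za-z .'-]{1,40}"),
    "zip": re.compile(r"[0-9]{5}(-[0-9]{4})?"),
}


def is_allowed(field: str, value: str) -> bool:
    """Return True only if the whole value matches the field's whitelist."""
    pattern = WHITELIST.get(field)
    if pattern is None:
        return False  # unknown fields are rejected, not waved through
    return pattern.fullmatch(value) is not None


def parse_quantity(value: str) -> int:
    """Strongly type the input: anything that is not an integer from 1 to 100 raises."""
    number = int(value)  # raises ValueError for "<script>" or "12abc"
    if not 1 <= number <= 100:
        raise ValueError("quantity out of range")
    return number


print(is_allowed("zip", "90210"))     # a plain five-digit code passes
print(is_allowed("zip", "<script>"))  # anything else is refused
```

Note that the rules describe what is accepted, so a new attack encoding does not need a new rule. As the last bullet says, this still is not enough on its own.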
Do NOT Rely on Input Validation/Whitelisting Only
That last point is important. Very important. In many cases you use JavaScript to constrain and validate input on the client side. Whoops...I just turned off scripting in my browser and now I am entering a script tag into the address field! So why bother constraining input at all? Good question. The thing to keep in mind is that input validation should be viewed as a help to your users. Constraining your input is good to stop casual hackers from trying some things, but anyone really trying to get in will not find your input validation a hindrance. So you should set up validators to help someone realize that your address field is only 30 characters when they entered 35, or that a valid phone number does/doesn't accept dashes. That sort of thing.
What To Do If The Whitelist Idea Isn't Possible
Another reason to not rely on input validation is that in some cases your whitelist is going to be too broad to stop some things from happening and/or you can't whitelist easily. Let's say you have a piece of code that looks at the querystring and spits some things out to the screen based on it:
www.site.com/showregiongraph.aspx?region=South
for example.
This page is going to get data based on the region querystring variable, and then use that variable as the title of the graph of data. Sure, not a best-practice website right there, but you sometimes have to work with other people's code, dontcha? So the code grabs the region variable and pulls data, showing a message "No data found" if there is an error or if there is no data found. The logic then takes the region variable and puts it in a label control on the screen. Easy enough! Weeeelllll, let's put some script in the querystring and see what we get!
www.site.com/showregiongraph.aspx?region=<script>alert('hi')</script>
Clearly, this "region" will not return any data and will probably return an error of some sort. But what about when we take the variable and put it into a label control? Can you be sure it won't be put to the browser "as is" and executed as script?
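Here is a small sketch of what "as is" means. It is Python standing in for the page logic, not the actual ASP.NET page, and the markup string is a hypothetical stand-in for the label control. The only things taken from the post are the URL and the "No data found" message.

```python
from urllib.parse import urlsplit, parse_qs

# The URL from the example above, with a scheme added so it parses.
url = "http://www.site.com/showregiongraph.aspx?region=<script>alert('hi')</script>"

# Pull the region variable out of the querystring.
region = parse_qs(urlsplit(url).query)["region"][0]

# Vulnerable on purpose: the raw value is dropped straight into the markup.
page = "<h1>Sales for " + region + "</h1><p>No data found</p>"

print(page)
```

The printed markup contains a live script tag, so a browser receiving it would run the alert even though the query itself found nothing.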
What if the web page takes in a comment about the website and upon postback says "Thank you for your feedback. The following has been sent to our staff: your comment here."? And you want to let people put HTML in the field so they can bold their nasty words and up the font size when they are yelling at you. What if someone puts the above script tag in that field? You get the idea. People can and will misuse your forms to put script in there. Will the script attack you? Maybe. It could be setting up shop to use your server as a member of a botnet...or opening a gate to distribute malware to attack Lands' End. Who knows? Anyway, whitelisting won't help you in some cases.
Encode Your Output To The Screen
One of the best ways (and one of the few) to protect your web pages is to encode all output to the browser. Assume it has been tampered with or is user-generated (we never trust users) and put the proper encoding on all of it. Hey, let the input come in raw and nasty, process it normally, but then encode it before you put it back on the screen. Like with the region example above, who cares if you try to query the database for a region named after a script? (It DOES matter, but we will cover this when we talk about SQL Injection.) So the database returns no records. Great! But if your page takes the input and does a Response.Write() or uses some other method of putting the raw input on the screen, you need to encode it.
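A minimal sketch of "encode it before you put it back on the screen", again in Python as a stand-in. Here html.escape plays the role an HTML encoder would play in your page, and render_title is a hypothetical helper, not something from the post.

```python
import html


def render_title(region: str) -> str:
    """Encode the untrusted value at the moment it is written into HTML."""
    return "<h1>Sales for " + html.escape(region) + "</h1>"


# The nasty region from the querystring example comes out as harmless text.
print(render_title("<script>alert('hi')</script>"))
# <h1>Sales for &lt;script&gt;alert(&#x27;hi&#x27;)&lt;/script&gt;</h1>
```

The input was never cleaned up or rejected. It was simply encoded on the way out, which is the whole point.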
To encode it, you may think to use HttpUtility.HtmlEncode() as mentioned in Microsoft's Patterns and Practices on preventing XSS attacks. This works great for data going into HTML. But what if the data is going into an img tag, an attribute of another tag, a URL, or is being inserted into JavaScript? HTML encoding may not be what you want.
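To show why one encoder does not fit every destination, here is a rough sketch with one function per context. It uses Python's standard library as a stand-in, and the function names (for_html, for_url_component, for_js_string) are hypothetical. It does not reproduce any particular library's behavior.

```python
import html
import json
from urllib.parse import quote


def for_html(value: str) -> str:
    """For element content and for attribute values that are wrapped in quotes."""
    return html.escape(value, quote=True)


def for_url_component(value: str) -> str:
    """For a single querystring value or path segment, not a whole URL."""
    return quote(value, safe="")


def for_js_string(value: str) -> str:
    """For a string literal inside an inline script block."""
    literal = json.dumps(value)  # adds the surrounding double quotes
    # Also escape characters that could close the script block or start markup.
    return (literal.replace("<", "\\u003c")
                   .replace(">", "\\u003e")
                   .replace("&", "\\u0026"))


nasty = "</script><script>alert('hi')</script>"
print(for_html(nasty))
print(for_url_component(nasty))
print(for_js_string(nasty))
```

Each call prints a different encoding of the same nasty string, and none of the three outputs is safe to use in the other two places.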
I highly recommend you use Microsoft's new Anti-XSS Library, which gives you more granular control through separate methods for encoding data headed to JavaScript, a URL, HTML, etc. You just put the DLL in your bin folder and use the methods in your code.
Whatever you use, encoded script won't run. Period. It doesn't matter what the pesky script would have done; once it is encoded for the place it is going, it becomes plain old text that is output as is to the browser rather than executed.
ASP.NET's Default Protection Against XSS
The nice thing for all of us is that ASP.NET 1.1+ has built-in protection against XSS by looking for script tags or other dangerous things. If someone tries to put script in there, including ways where they try to obfuscate what they are doing, the ASP.NET engine sees it and gives you an error message like the following:
A potentially dangerous Request.QueryString value was detected from the client (txt="<SCRIPT SRC=http://h...").
This is turned on by default. Great, right? Well, we lovely developers hit a message like that and wonder "How can I turn that off so I can allow b tags in my content?" We see, "Oh! It is just a page-level attribute!" and turn it off. Sigh. The attribute is ValidateRequest and it is set to True by default. Sure, set it to False and then tags are allowed through...but hey, wake up! Tags are allowed through! You've just given hackers an open door. Find other ways to allow b tags. Which is better, a user complaining they can't bold something or a user complaining their social security number was stolen off your site? Your call.
Remember that the default protection is only on the request, so you still need to encode the response.
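One possible "other way" to allow b tags is to encode everything on output and then switch back on only the exact tags you have decided to allow. The sketch below is Python as a stand-in, and ALLOWED_TAGS and render_comment are hypothetical names. How the tags reach your server in the first place is outside this sketch.

```python
import html

ALLOWED_TAGS = ("b", "i")  # hypothetical list of harmless formatting tags


def render_comment(comment: str) -> str:
    """Encode the whole comment, then re-enable only bare allowed tags."""
    encoded = html.escape(comment)
    for tag in ALLOWED_TAGS:
        # Only the exact forms <b> and </b> come back; a tag carrying
        # attributes, such as <b onmouseover=...>, stays encoded.
        encoded = encoded.replace(f"&lt;{tag}&gt;", f"<{tag}>")
        encoded = encoded.replace(f"&lt;/{tag}&gt;", f"</{tag}>")
    return encoded


print(render_comment("<b>bad</b><script>alert(1)</script>"))
# <b>bad</b>&lt;script&gt;alert(1)&lt;/script&gt;
```

The bold survives and the script tag does not, because the default is "encoded" and the allowed tags are the exception.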
Also
Another reason to concentrate on output to the browser is that the browser is where these scripts run. You do have to consider how your data is being used by other people/systems, however. Web input can go to an internal reporting system, and that can be just as big a problem.
One thing you might not have considered is error logging. Be careful! Don't throw the raw data put into your fields into your event log without encoding it! Think about it -- you drop the script into the event log and there is some internal PHP page your system admins have put in place to view the errors coming from your site. BAM...script runs in THEIR browser and it might be silent and behind the scenes.
Your site is fully protected, your input is validated, your output is encoded, you don't allow scripts to affect your processing, etc. But you dropped the script into a data store (event log) that can be used by other groups who don't have all that security in place. In my opinion, your system admins need to be better about security...they don't have an excuse either. Just like you. But don't give someone a lit bomb either.
You are the one in control of the data so you are responsible for it...and you are the one that should be held responsible for the script running on their machines. You handed them a lit bomb. Don't blame them for not disarming it.
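A minimal sketch of defusing the bomb before you hand it over: encode the raw input at the moment it is written to the log. Python's logging module stands in for the event log here, and the logger name and the log_bad_input helper are hypothetical.

```python
import html
import io
import logging


def log_bad_input(logger: logging.Logger, field: str, raw_value: str) -> None:
    """Log rejected input with markup encoded, so a browser-based viewer shows text."""
    logger.warning("Rejected input for %s: %s", field, html.escape(raw_value))


# An in-memory stream stands in for the real log destination.
stream = io.StringIO()
logger = logging.getLogger("site.errors")  # hypothetical logger name
logger.addHandler(logging.StreamHandler(stream))
logger.setLevel(logging.WARNING)
logger.propagate = False

log_bad_input(logger, "address", "<script>alert('hi')</script>")
print(stream.getvalue())
```

Whoever reads the log still sees exactly what was attempted, only as text. If the log also feeds tools that are not browsers, you may prefer to store the raw value and encode in the viewer, but that only works when you control the viewer.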
Summary
That is really all there is to it. Blur your eyes on all my explanation and it boils down to three things:
- Validate and whitelist all input, but don't rely on these alone.
- Encode all output to the screen, plus to any data store that might have consumers of the data that don't/can't encode their output.
- Don't turn off the default ASP.NET protection of ValidateRequest unless you plan on writing the code you need to whitelist and protect your input sources.
And doing all of these is easy. Use good validators from ASP.NET's toolbox, and use the Microsoft Anti-XSS Library to encode your output.
The next and last post on XSS will give you links and ideas on where to learn the details of how you perform the actual protection in your code.