ASP.NET MVC: Prevent XSS with automatic HTML encoding

There’s an interesting (and sometimes heated) debate on the ASP.NET MVC forums about HTML encoding.

It started with a proposal for a helper method to HTML-encode strings as soon as they are received from the visitor, so they’d be stored HTML-encoded in the database. That way, you don’t have to HTML-encode them for display to prevent cross-site scripting. If that was the default behaviour for the UpdateFrom() method, the idea of encoding for storage would no doubt be widely adopted.

Almost everyone else on the forum, though, has a strong preference for not encoding anything until the moment of display. There are some obvious benefits to this approach – you don’t have to remember which strings were pre-encoded (according to their origin), and you don’t have to un-encode them when outputting to any non-HTML format. But it does mean you have to remember to encode things wherever you output them.

Sadly the two methods are incompatible, and you will have to choose one side or the other. I am very definitely in the encode-when-displaying camp.

Another solution

What I’d really like is to change the default behaviour of ASPX’s <%= … %> syntax so that it HTML-encodes the result by default. That’s what you want 95% of the time, so why should you keep writing <%= HttpUtility.HtmlEncode(…) %> all the time?

                          Current reality                         In my ideal world
Output unencoded string   <%= value %>                            <%= (RawHtml) value %>
Output encoded string     <%= HttpUtility.HtmlEncode(value) %>    <%= … %>

This would give us the best of both worlds. You wouldn’t need to remember to HTML-encode your strings (since that happens by default), so there’d be no need to store things pre-encoded in the database and then worry about double-escaping, sharing data with external systems, unencoding for output to non-HTML format and all that other nonsense.
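To make the "ideal world" column concrete, here is a minimal sketch of the runtime rule it implies: encode every value unless it has been explicitly marked as raw. The names SafeOutput and Encode, and the exact shape of RawHtml shown here, are assumptions for illustration and not necessarily what the demo project uses.

```csharp
using System;
using System.Web;

// Hypothetical marker type: wraps markup that is already safe to emit.
public sealed class RawHtml
{
    private readonly string value;
    public RawHtml(string value) { this.value = value; }
    public override string ToString() { return value; }
}

public static class SafeOutput
{
    // The call an encode-by-default <%= ... %> would route every value through.
    public static string Encode(object value)
    {
        if (value == null) return string.Empty;
        if (value is RawHtml) return value.ToString();   // explicit opt-out
        return HttpUtility.HtmlEncode(value.ToString()); // default: encode
    }
}

public static class Program
{
    public static void Main()
    {
        // Untrusted input is encoded without the view author doing anything.
        Console.WriteLine(SafeOutput.Encode("<script>alert(1)</script>"));

        // Markup that was deliberately wrapped passes through untouched.
        Console.WriteLine(SafeOutput.Encode(new RawHtml("<b>bold</b>")));
    }
}
```

The point of the design is that the unsafe path is the one that needs extra typing, so forgetting something leaves you with over-encoded text rather than an XSS hole.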

Spike implementation

It’s a great credit to the ASP.NET architecture that we can actually implement that change of behaviour ourselves, and with not much code either. The idea is to intercept the code generation phase that happens when an ASPX file is compiled.

You can specify your own compiler implementation by editing this section of the web.config:

<system.codedom>
   <compilers>
      <compiler language="c#;cs;csharp" type="Microsoft.CSharp.CSharpCodeProvider .. etc" extension=".cs" warningLevel="4" />
   </compilers>
</system.codedom>

… and, helpfully, you can subclass CSharpCodeProvider, override the GenerateCodeFromStatement() method, and redirect all the <%= … %> evaluations through a suitable helper function.
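As a rough sketch of that override (not the actual SafeEncodingHelper source), the provider below wraps the argument of any single-argument Write(...) call whose argument is a code snippet. It assumes that this is the shape the page compiler produces for <%= … %>, and the helper name SafeOutput.Encode is hypothetical. The Main method builds such a statement by hand so the rewrite can be seen outside ASP.NET; on .NET Framework the CodeDom types live in System.dll, while newer runtimes need the System.CodeDom package.

```csharp
using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using Microsoft.CSharp;

// Sketch only: rewrites Write(<snippet>) into Write(SafeOutput.Encode(<snippet>)).
public class EncodingCodeProvider : CSharpCodeProvider
{
    // Hypothetical helper that the generated page code would call.
    private const string Helper = "SafeOutput.Encode";

    public override void GenerateCodeFromStatement(
        CodeStatement statement, TextWriter writer, CodeGeneratorOptions options)
    {
        var expressionStatement = statement as CodeExpressionStatement;
        var call = expressionStatement == null
            ? null
            : expressionStatement.Expression as CodeMethodInvokeExpression;

        // Assumption: <%= expr %> arrives as a one-argument Write call
        // whose argument is a raw code snippet.
        if (call != null && call.Method.MethodName == "Write" && call.Parameters.Count == 1)
        {
            var snippet = call.Parameters[0] as CodeSnippetExpression;
            if (snippet != null && !string.IsNullOrEmpty(snippet.Value))
                snippet.Value = Helper + "(" + snippet.Value + ")";
        }

        // Everything else is generated exactly as the stock provider would.
        base.GenerateCodeFromStatement(statement, writer, options);
    }
}

public static class Program
{
    public static void Main()
    {
        // Hand-built stand-in for the statement assumed to represent <%= value %>.
        var call = new CodeMethodInvokeExpression(
            new CodeArgumentReferenceExpression("writer"), "Write",
            new CodeSnippetExpression("value"));

        var output = new StringWriter();
        new EncodingCodeProvider().GenerateCodeFromStatement(
            new CodeExpressionStatement(call), output, new CodeGeneratorOptions());

        // Prints the generated C# statement with the argument wrapped in the helper.
        Console.WriteLine(output.ToString().Trim());
    }
}
```

Note that this sketch mutates the snippet in place, so generating the same statement object twice would wrap it twice. A more careful version would guard against that.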

Demonstration

You can download a demonstration project to see this in action. To install the behaviour into your own project, follow these steps:

1. Download the SafeEncodingHelper assembly (or build it yourself – the demo project includes sources), and add a reference to it in your project.

2. In your web.config, edit the system.codedom.compilers element, to look like this:

<compiler language="c#;cs;csharp" type="SafeEncodingHelper.SafeEncodingCSharpCodeProvider, SafeEncodingHelper" extension=".cs" warningLevel="4">
	<providerOption value="v3.5" name="CompilerVersion" />
	<providerOption value="false" name="WarnAsError" />
</compiler>

3. Also in web.config, under pages/namespaces, add a reference to the SafeEncodingHelper namespace:

<namespaces>
	<add namespace="System.Web.Mvc" />
	<add namespace="System.Linq" />
	<add namespace="SafeEncodingHelper" />
</namespaces>

 

That’s all! You will now find that <%=…%> encodes its output, or you can get unencoded output by casting your value to the RawHtml type, i.e. <%= (RawHtml)myValue %>.

What about MVCToolkit?

You might be thinking that this is going to break the MVC toolkit, since you use it to build HTML controls with a syntax like this:

<%= Html.TextBox("myinput", "It's nice") %>

You might, reasonably, expect this now to render a bunch of useless HTML-encoded nonsense. There’s a neat solution, though – the MVC toolkit could return values of the RawHtml type, which is merely a wrapper around System.String that adds no functionality. The SafeEncodingHelper compiler recognises this type specially and bypasses the HTML encoding for it. So, you can keep your clean syntax for any methods that you specifically want to render unencoded HTML.

Also, if someone isn’t using SafeEncodingHelper, no problem! The RawHtml type has a .ToString() method that simply returns the underlying value, so the MVC toolkit methods would still work just as well.
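The two paragraphs above could look roughly like this in code. It is a sketch under assumptions: the explicit conversion operator is one way to get the (RawHtml)myValue cast syntax, and HtmlSketch.TextBox is a hypothetical stand-in for a toolkit helper, not the real MVC toolkit method.

```csharp
using System;
using System.Web;

// Thin wrapper around a string; its only job is to mark markup as already safe.
public sealed class RawHtml
{
    private readonly string value;
    public RawHtml(string value) { this.value = value ?? string.Empty; }

    // Enables the (RawHtml)myValue cast syntax on plain strings.
    public static explicit operator RawHtml(string value) { return new RawHtml(value); }

    // Without the custom compiler, <%= ... %> falls back to ToString(),
    // so the markup still renders as before.
    public override string ToString() { return value; }
}

public static class HtmlSketch
{
    // Hypothetical helper: encodes the attribute values itself, then hands
    // back the finished markup as RawHtml so it is not encoded a second time.
    public static RawHtml TextBox(string name, string value)
    {
        string markup = string.Format(
            "<input type=\"text\" name=\"{0}\" value=\"{1}\" />",
            HttpUtility.HtmlAttributeEncode(name),
            HttpUtility.HtmlAttributeEncode(value));
        return new RawHtml(markup);
    }
}

public static class Program
{
    public static void Main()
    {
        RawHtml box = HtmlSketch.TextBox("myinput", "say \"hi\"");
        Console.WriteLine(box);                         // rendered via ToString()
        Console.WriteLine((RawHtml)"<em>trusted</em>"); // explicit cast from string
    }
}
```

The helper stays responsible for encoding whatever user data it embeds; RawHtml only signals that the result as a whole should not be encoded again.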

The demonstration project contains an alternative MVC toolkit that behaves this way. Actually, it only has a single facility (TextBox), but it’s enough to give you the idea.

Should I really use this then?

Firstly, this code comes with no warranties at all. Use it if you want, but beware – I just cooked it up on impulse and there may be any number of special cases I haven’t accounted for. It’s a proof of concept, that’s all.

Unless Microsoft chooses to support the RawHtml type in their MVC toolkit and related methods, you would have to remember to cast all MVC toolkit output to RawHtml, or write your own wrapper methods or something. Not much fun, sorry.


22 Responses to ASP.NET MVC: Prevent XSS with automatic HTML encoding

  1. >>Rob Conery of Microsoft is advising that we should HTML-encode all strings as soon as we receive them from the visitor, and store them HTML-encoded in the database<<

    Steve this is not what I’m saying at all. As I’ve mentioned I need the most secure solution possible – and the discussion is ongoing.

  2. Steve

    Rob, I’m sorry if I misrepresented your position. It did seem to me that you were advocating that strategy for quite a while, as anyone can see if they read the thread. Thanks for reconsidering now. I know you’re on our side!

  3. Pingback: Thoughts on awareness of security vulnerabilities & full disclosure » DamienG

  4. Cool thing. cleans up a lot of HttpUtility.HtmlEncode calls

    I’m just implementing a similar approach on AspView.

    I’m also adding a helper method ‘RawHtml’ to allow

    which is (imo) a bit more expressive than

    In WebForms I’d add this method to my custom BasePage

  5. I can see the confusion :) . There’s a difference, I think, that I didn’t make clear. I wasn’t advocating that you must do this (put encoded data in your DB), I was advocating that if you want unencoded data, you ask for it explicitly. In this way you are 1) aware of the choice and 2) don’t find out the hard way.

    People are saying that we’re “training people to do dumb things” – I don’t buy that argument. On my blog I suggested that if this supposition is true, then Microsoft is also training people to write bad HTML :) .

    Encoded data is only nasty (in terms of search) if people input HTML. 99% of the time (like with this comment, which gets stored encoded in your DB :) :):) it doesn’t matter since people aren’t entering HTML.

    All in all – the solution you offer here is a good one, and the ASP.NET team is reading…

  6. Pingback: Wöchentliche Rundablage: ASP.NET MVC, ASP.NET 3.5, ADO.NET Data Services "Astoria", ASP.NET AJAX, Silverlight… | Code-Inside Blog

  7. Microsoft will not support the Raw Html

  8. Steve

    Thanks Hip-Hop. Would you like to expand on this point?

  9. ag

    Steve, this is a nice solution for MVC toolkit.
    Can we use the Helper Class and web.config file for VS 2005 projects?
    I downloaded the project and it doesn’t compile on Visual
    Studio 2005.

  10. Steve

    Ag – this code is mainly of use in ASP.NET MVC projects, and you can only develop those in VS 2008.

  11. Pingback: ASP.NET MVC: Preventing XSS attacks at Mark Needham

  12. Pingback: ASP.NET MVC: Pre-compiling views when using SafeEncodingCSharpCodeProvider at Mark Needham

  13. dkl

    Looks great, but try to add some unreachable code to the aspx page:

    I’m getting Compilation Error with description “CS0162: Warning as Error: Unreachable code detected”. But I did not change WarnAsError settings, and if I change the compiler type back to default provider, it works:-(

    Seems like compiler option WarnAsError is enabled when you supply your own compiler type.

  14. Somehow got to this blog entry and the thread. As an MVC user it’s great to see how the contribution of you and a few others helped Microsoft avoid making some very poor choices.

    One thing I wanted to ask, are you planning on publishing a new version of the book to go along with the new version of MVC?

  15. Steve

    Thanks Colin! It’s really good to see that the ASP.NET team has made a big change in .NET 4 to deal with the most common XSS issues automatically.

    Yes, I am currently working on an ASP.NET MVC 2.0 edition of my Apress book.

  16. Great work Steve, this is a really nice solution. I wish Microsoft did this by default when they did ASP.NET MVC.

  17. Istvan

    Really nice trick.

    The RawHtml string, however, must contain HTML-encoded strings too in the text-node parts of the HTML, which can’t be enforced or checked, so the HtmlHelper methods have to be busy encoding the text parts of their output, which is still suboptimal.

    Furthermore, there are a lot of web sites generating HTML fragments from user input (for example, transforming link text (http://www.example.com) to HTML anchors in a blog comment post), in which case they require the RawHtml wrapper too to avoid encoding, but this way the rest of the string also avoids automatic encoding.

    I would prefer a solution where code-generated HTML fragments are some HTML DOM structure fragment, which makes further, unambiguous transformation safe and possible, with an ultimate text (HTML markup) conversion during which the text nodes’ contents are encoded automatically.

  18. Ken

    Awesome, this is what we needed!

    What additions/modifications would you have to make so that content from is also encoded automatically?

  19. Ken

    Woops, didn’t notice that the tags didn’t show up.

    How can we make it work for databinding syntax <%# %>?

  20. Do you have any updates to this functionality?
