Donald's Bacon Bytes

bytes of information as tasty as bacon


Topic: Development


At work we recently went through a security audit of our application. Even though the application is behind a firewall, is internal only and requires authentication, we still needed to make sure it was protected. One of the findings was that it was vulnerable to XSS. There are currently two places in the application that allow a user to create, save and view HTML. If an attacker got into a user’s computer, they could post malicious HTML to the application, and when another user viewed the page containing that HTML, that other user could get infected.

As a side note (which is relevant to the code below), this application is built on .NET using C#.

We couldn’t just take away the ability to post HTML either; it was a widely used feature. We could search for <script> tags, but there are many variations and attributes someone could use to get past our search, and we could miss something. Besides the script tag, someone could post a link to a website that would infect a user, and we cannot block links altogether either. So, what can you do about this? Here are the first ideas I came up with:

  1. Not allow HTML
  2. Use something like Markdown and convert that to allowed HTML
  3. Parse the HTML and compare against a whitelist

Like I said, #1 was not an option for us. We could do #2, but many of our users are not very technical, so this would be difficult for them to use. We could do #2 and provide some kind of WYSIWYG editor. This is probably a good option, but we also wanted to let people who know HTML write HTML without having to learn a new markup language. So, we opted for #3.

When I started researching this I came across something from Microsoft called the AntiXssLibrary. Everything I read suggested this was exactly what I was looking for. There was a function called GetSafeHtmlFragment which promised to do a lot of what I wanted. After playing around with this library I noticed that it wasn’t doing what it should. I thought I was doing something wrong, so I kept trying different things.

Frustrated, I turned to the web some more. After further research, it turned out that Microsoft had broken the functionality in that method, and there was no word on when (or if) it would be fixed.

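The next idea was to use the HtmlEncode method of HttpUtility and then replace the encoded values of allowed tags with the actual tags. So, something like this:

// Encode everything first so that no markup survives by default.
string encodedHtml = HttpUtility.HtmlEncode(htmlText);

StringBuilder sb = new StringBuilder(encodedHtml);
// Then call sb.Replace(...) once per allowed tag to swap its encoded form back to the real tag.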



And so on. I liked this idea because it meant I was only allowing a whitelist set of HTML tags. As I started down this path some more, it got rather complicated. What about attributes? What about bad code inside of attributes? Well, maybe regular expressions could help with that!
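For reference, a minimal sketch of that encode-then-restore approach might look like the following. The class name NaiveWhitelist and the list of allowed tags are made up for illustration, and WebUtility.HtmlEncode from System.Net stands in for HttpUtility.HtmlEncode so the snippet compiles outside a web project. It only handles tags without attributes, which is exactly the limitation described above.

```csharp
using System;
using System.Net;
using System.Text;

public static class NaiveWhitelist
{
    // Hypothetical allow-list: simple, attribute-free tags only.
    private static readonly string[] AllowedTags = { "b", "i", "p", "ul", "li" };

    public static string Sanitize(string htmlText)
    {
        // Encode everything first so that no markup survives by default.
        var sb = new StringBuilder(WebUtility.HtmlEncode(htmlText));

        // Restore only the exact encoded forms of the allowed tags.
        foreach (string tag in AllowedTags)
        {
            sb.Replace("&lt;" + tag + "&gt;", "<" + tag + ">");
            sb.Replace("&lt;/" + tag + "&gt;", "</" + tag + ">");
        }

        return sb.ToString();
    }
}
```

With this sketch, NaiveWhitelist.Sanitize("<b>hi</b><script>alert(1)</script>") keeps the bold tags and leaves the script tag encoded. A tag such as <b onclick="..."> also stays encoded, because only the bare, lower-case form of each allowed tag is restored.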

My next test code looked like this:

// Match an HTML-encoded <table ...> opening tag, including its quoted attributes.
Regex reg = new Regex("&lt;table\\s(((.+)=&quot;((?:.(?!&quot;(?:\\S+)=|&quot;&gt;))*.?)&quot;)|((.+)=&#39;((?:.(?!&#39;(?:\\S+)=|&#39;&gt;))*.?)&#39;))*(&gt;)?");

Match m = reg.Match(data);
if (m.Success)
{
    string matchVal = m.Value;
    // Strip the leading "&lt;" and the trailing "&gt;" (4 characters each).
    string matchValReplaced = matchVal.Substring(4, matchVal.Length - 8);
    string decoded = System.Web.HttpUtility.HtmlDecode(matchValReplaced);
    data = data.Replace(matchVal, "<" + decoded + ">");
}

Again, that seemed to work. I could create multiple regular expressions for the valid tags (or roll it into one expression or something like that). It still seemed rather complicated and like it could be prone to error or could be a hassle to maintain.

Before settling on this solution I wanted to check the web some more. I then came across a blog article by eksith, who was in the same boat. They needed to solve the same problem for the same reasons, and they solved it with much better code than what I had hacked together!

This didn’t require much modification for our use. I did modify a few things, though. I added a few more entries to the ValidHtmlTags dictionary, as well as some other attributes that our users were using. I also still had to solve the issue of not allowing users to link outside of our website. To do that, I created a variable containing a list of strings that were valid prefixes for the href attribute.

Then I added this (after line 143 of the original code):

if (a.Name == "href")
{
    a.Value = a.Value.ToLower();
    // Keep the link only if it starts with one of the allowed (lower-case) base URLs.
    if (!ValidBaseUrls.Any(s => a.Value.StartsWith(s)))
    {
        a.Value = "#";
    }
}

If a user added an anchor tag that linked to somewhere other than our valid list, that href attribute would get replaced with a pound sign.
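As a rough, self-contained illustration of that check, here is a sketch with the list included. The class name HrefFilter and the base URL are hypothetical; the real list would hold whatever prefixes your own site uses. Entries are kept lower-case because the value is lower-cased before comparing, and each entry ends with a slash so that a look-alike host such as intranet.example.com.evil.example does not slip through a prefix match.

```csharp
using System;
using System.Linq;

public static class HrefFilter
{
    // Hypothetical allow-list of lower-case base URLs, each ending in "/".
    private static readonly string[] ValidBaseUrls =
    {
        "https://intranet.example.com/"
    };

    public static string Filter(string href)
    {
        if (string.IsNullOrWhiteSpace(href))
        {
            return "#";
        }

        string value = href.Trim().ToLowerInvariant();

        // Allow the link only if it starts with one of the known base URLs.
        bool allowed = ValidBaseUrls.Any(
            b => value.StartsWith(b, StringComparison.Ordinal));

        return allowed ? value : "#";
    }
}
```

Anything that does not start with an allowed prefix, including javascript: URLs, comes back as "#".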

One of the cool things about this class is that everything other than the allowed HTML tags and attributes will get encoded. That way, when you display them on your page, only allowed tags will get rendered as HTML.

Custom Error Page and HttpRequestValidationException

Topic: Development


I was starting to pull my hair out at work today. I have a website that is running in IIS 7.5 and uses custom 500 error pages. It is running with Integrated Pipeline and the web.config looks like this:

<customErrors mode="On" defaultRedirect="Error500.aspx" redirectMode="ResponseRewrite" xdt:Transform="Replace">
  <error statusCode="404" redirect="Error404.aspx"/>
  <error statusCode="500" redirect="Error500.aspx"/>
</customErrors>

<httpErrors existingResponse="PassThrough">
  <remove statusCode="500" subStatusCode="-1" />
  <remove statusCode="403" subStatusCode="-1" />
  <remove statusCode="404" subStatusCode="-1" />
  <error statusCode="404" prefixLanguageFilePath="" path="/Error404.aspx" responseMode="ExecuteURL" />
  <error statusCode="403" prefixLanguageFilePath="" path="/Error404.aspx" responseMode="ExecuteURL" />
  <error statusCode="500" prefixLanguageFilePath="" path="/Error500.aspx" responseMode="ExecuteURL" />
</httpErrors>

Everything was working fine. An error would happen, the status code of 500 would get set and the user would be sent to my custom error page. Then something weird came up. A security audit found that adding <!--> to the query string would throw an application error with a 500 status and show server information. My custom error page was not getting fired. This was a bit strange.

Turns out, two things needed to be done to catch this.

First, I had a base class that ALL my pages derived from. In this base class I had to override the OnError method so that I could catch the error as it happened. My code was already using Server.Transfer to send users to my error pages whenever it needed to raise an error, so from OnError I called the same error-handling method, which in turn calls Server.Transfer.

Unfortunately, that in itself wasn’t enough.

When I started investigating further, I noticed an overload of Server.Transfer that I hadn’t noticed before. It takes a second parameter called preserveForm. Since .NET was complaining about the query string having dangerous values, a plain Server.Transfer would transfer the request to another page and still carry those dangerous values. By default, preserveForm is true, and when it is true the GET/POST values are maintained. Setting it to false cleared out the bad query string and allowed my custom error page to function properly.
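Put together, the two changes might look roughly like the sketch below. The class name BasePage is hypothetical, and the Error500.aspx path is taken from the web.config shown earlier; the real code goes through its own error-handling method rather than calling Server.Transfer inline. It needs a Web Forms project that references System.Web, so treat it as an outline rather than a drop-in implementation.

```csharp
using System;
using System.Web.UI;

// Hypothetical base class that every page in the site derives from.
public class BasePage : Page
{
    protected override void OnError(EventArgs e)
    {
        base.OnError(e);

        // The exception that triggered the error,
        // e.g. an HttpRequestValidationException.
        Exception ex = Server.GetLastError();
        // Log ex here if needed.

        // Mark the error as handled so it is not reported again.
        Server.ClearError();

        // preserveForm: false drops the original query string and form
        // values, so the error page is not handed the dangerous input.
        Server.Transfer("~/Error500.aspx", false);
    }
}
```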

Hopefully that helps someone else out there.