Content Thieves
by Tiberius OsBurn | Published 10/06/2002 | Security, Web Development
Tiberius OsBurn

Tiberius OsBurn is a Senior Developer/System Analyst for The Gallup Organization (http://www.gallup.com). He recently completed a huge data warehousing project that archived data and documents from 1935 to the present - all coded in C#, SQL Server and ASP.NET.

Tiberius has extensive experience in VB, VB.NET, C#, SQL Server, ASP.NET and various other web technologies. Be sure to visit his site for his latest articles of interest to .NET developers.

http://tiberi.us

 


Article source code: contentthieves.zip

Someone's been stealing your content.

Really.

It's easy to do, too. I'm talking about all the fancy jpgs, gifs, docs and pdfs on your site. Guess what? If I can hit them with a URL, they're mine.

Don't like it? Too bad.

If you have a default page, I can set up a spider to snake out all of your content in a couple of minutes. Google has been doing it for quite a while – they finger your site, snatch out all of your graphics and your entire HTML.

Ok, so now you're saying "so what?" – the web is open to anyone, isn't it? Well, yes and no. Sometimes, you don't want to give someone access to your precious pdf file or your latest story crafted in Microsoft Word. You could use straight ACL security to guard these files, but not all of your end-users are going to be using IE – thus, the ACL authentication isn't going to work for these unfortunate few. And guess what? The authentication process used by ASP.NET only applies to those resources (.aspx, .ascx, .vb, .cs) that are mapped to aspnet_isapi.dll. Out of the box, ASP.NET authentication doesn't protect .doc, .zip, or .pdf files.

Don't even try to sneak by with a file named 903890xx0s9ki49.pdf. It's still not secure, and you're being foolish in assuming that obscurity is security. A good spider or a good hacker is going to be able to sniff your HTTP traffic and bust your 'obscure secure' security.

So, what's the solution? Enter the HttpHandler.

HttpHandler Overview

The HttpHandler is a slick API that allows developers to intercept requests and write responses without all the overhead of full-blown page handling. If you've done any C++ ISAPI work in the past, you'll know where I'm going here.

Each and every HTTP request coming down the pipe in ASP.NET is handled by classes that implement the IHttpHandler interface. In our case, what we're looking to do is intercept the request for a .doc or .pdf file and we'll actually create a custom HttpHandler to determine if an end user has the proper credentials to try and access the file.

By now, everyone has seen examples of how you can trap custom extensions via HttpHandlers – one example out there is filtering out a request for a default.time page. Simply put, the HttpHandler does its job, and you context.Response.Write out the current time on the server.

Yawn.

Yet another example of useless code that no one will use.

Let's take a look at some real code that you can use in your E-commerce site or just to secure access to files without resorting to ACL, FTP or forced browser requirements.

Modifying Web.Config

First things first: We need to add an XML tag to our web.config file in the folder we want to 'secure'.

<system.web>
  <httpHandlers>
    <add verb="*" path="*.doc" type="pdfIntercept.pdfHandler, pdfInterceptX" />
  </httpHandlers>
</system.web>

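What we're doing here is telling the ASP.NET runtime that we'd like our handler class to intercept requests for paths that end with .doc. Note that element names in web.config are case-sensitive, and the httpHandlers section belongs inside system.web.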

The verb attribute is used when you want to restrict requests via 'POST' or 'GET' or 'HEAD'. You'll just want to stick with a '*' – this will allow all of the above.

The path attribute tells the HTTP runtime which request paths the handler applies to. In this case, we're telling the HTTP runtime that we're interested in any resource that ends with .doc.

The type attribute lists out the .NET class that you've created to handle the request. It's the fully qualified class name followed by the assembly name. [NAMESPACE].[CLASS], [ASSEMBLY NAME].
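
If you want to guard more than one file type, one approach is to add an entry per extension. The fragment below is only a sketch: it reuses the handler class and assembly names from the example above, and the .pdf entry is an addition for illustration. Keep in mind that the handler shown in this article always sends back a Word document, so it would have to pick the file and content type per request before a second entry like this made sense.

```xml
<configuration>
  <system.web>
    <httpHandlers>
      <!-- One <add> per extension; each extension must also be mapped in IIS -->
      <add verb="*" path="*.doc" type="pdfIntercept.pdfHandler, pdfInterceptX" />
      <!-- Illustrative second entry, not part of the original example -->
      <add verb="*" path="*.pdf" type="pdfIntercept.pdfHandler, pdfInterceptX" />
    </httpHandlers>
  </system.web>
</configuration>
```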

Adding a Custom Extension

Since we want ASP.NET to handle extensions it doesn't normally see (.doc, .pdf), we'll need to map these extensions in IIS. Don't make the mistake of thinking that just because a .doc or .pdf file opens in your browser that you're good to go. You'll need to get what's executing server side vs. client side straight in your head. Once you've done that, work through the following steps:

Fire up your IIS Manager and then choose Properties / Edit... You'll need to navigate to the Home Directory tab and then click on the 'Configuration' button.

Next, you'll have to add your 'custom' extension (.pdf, .doc) to the application configuration screen. Click the Add button and fill in the needed information, pointing the executable at the same aspnet_isapi.dll that the .aspx extension is already mapped to. It's a fairly straightforward process, so you shouldn't have too much trouble.

After you've got this critical step handled, you're ready to move to the next step – coding your HttpHandler.

Got Code?

Damn right we have the code.

Here's the scoop – I'm not going to write your authentication for you – I don't know how you do your authentication, and I'm sure you don't want me to know how you do it. I'll give you all the other pieces you need to intercept the request and then stream it out if your credentials are valid.

Let's take a look at the class I've created to intercept a request for a .doc file.

using System;
using System.Web;
using System.IO;

namespace pdfIntercept
{

  public class pdfHandler : IHttpHandler
  {

    //Notice ProcessRequest is the only method
    //exposed by the IHttpHandler
    public void ProcessRequest(HttpContext context)
    {
      try
      {
        string strString = "yes";
        HttpRequest oRequest = context.Request;
        HttpResponse oResponse = context.Response;

        //ADD YOUR CUSTOM AUTHENTICATION HERE
        //and set strString to anything other than "yes"
        //when the caller is not allowed to have the file

        if (strString == "yes")
        {

          //Since they've made it this far, they've been validated
          //by your system…
          //We'll fire up a FileStream object
          FileStream MyFileStream;
          long FileSize;

          //Map the path to the .doc file
          //You might need to parse out the Request path to figure out
          //what resource they're actually requesting…
          string strMapPath = context.Server.MapPath("book1.doc");
          //Open read-only so the worker process doesn't need write access
          MyFileStream = new FileStream(strMapPath, FileMode.Open,
            FileAccess.Read, FileShare.Read);
          FileSize = MyFileStream.Length;

          //Allocate size for our buffer array
          byte[] Buffer = new byte[(int)FileSize];
          MyFileStream.Read(Buffer, 0, (int)FileSize);
          MyFileStream.Close();

          //Do buffer cleanup
          context.Response.Buffer = true;
          context.Response.Clear();

          //Add the appropriate headers
          context.Response.AddHeader("content-disposition", 
            "attachment; filename=x.doc");

          //Add the right content type (application/msword for .doc files)
          context.Response.ContentType = "application/msword";

          //Stream it out via a Binary Write
          context.Response.BinaryWrite(Buffer);
        }
        else
        {

          //It's a bogus request and they weren't validated.
          context.Response.Write("<b>DENIED</b>");
        }
      }
      catch (System.Exception err)
      {
        //Record the failure instead of silently discarding it
        System.Diagnostics.Trace.WriteLine(err.ToString());
      }
    }

    //By calling IsReusable, an HTTP factory can query a handler to 
    //determine whether the same instance can be used to service 
    //multiple requests 
    public bool IsReusable
    {
      get
      {
        return false;
      }
    }
  }

}
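
The handler above pulls the whole file into one byte array and trusts a single Read call to fill it. For large files, a chunked copy is a common alternative. What follows is a minimal sketch, not part of the original download: StreamCopier and CopyInChunks are made-up names, and the buffer size is whatever you pass in.

```csharp
using System;
using System.IO;

public static class StreamCopier
{
    // Copies everything from source to destination in fixed-size chunks
    // and returns the number of bytes copied.
    public static long CopyInChunks(Stream source, Stream destination, int bufferSize)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (destination == null) throw new ArgumentNullException("destination");
        if (bufferSize <= 0) throw new ArgumentOutOfRangeException("bufferSize");

        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;

        // Read returns 0 only at end of stream; it may return fewer
        // bytes than requested, so always use the returned count.
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

Inside ProcessRequest you would open the file in a using block and pass context.Response.OutputStream as the destination, so only one chunk sits in memory at a time.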

If you don't want the complexity of spinning your own binary stream, you can always cheat and call the Response.WriteFile method...

        if (strString == "yes")
        {
          context.Response.Buffer = true;
          context.Response.Clear();
          context.Response.AddHeader("content-disposition", 
            "attachment; filename=x.doc");
          context.Response.ContentType = "application/msword";
          context.Response.WriteFile(context.Server.MapPath("book1.doc"));
        }
        else
        {
          context.Response.Write("<b>DENIED</b>");
        }
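
Both versions hard-code the file they send back. As the comment in the first listing hints, a real handler has to work out which resource was requested. Here's one way that could look, assuming the protected files live together in a single folder and the code targets .NET 2.0 or later. ProtectedFileResolver, Resolve and protectedFolder are hypothetical names, not something from the download.

```csharp
using System;
using System.IO;

public static class ProtectedFileResolver
{
    // Returns the physical path for the requested file, or null if the
    // request does not name a plain file inside protectedFolder.
    public static string Resolve(string requestPath, string protectedFolder)
    {
        if (string.IsNullOrEmpty(requestPath)) return null;

        // Keep only the last segment so "../" tricks can't climb out
        // of the protected folder. Both separators are checked.
        int lastSlash = requestPath.LastIndexOfAny(new char[] { '/', '\\' });
        string fileName = requestPath.Substring(lastSlash + 1);

        if (fileName.Length == 0 || fileName == "." || fileName == "..") return null;
        if (fileName.IndexOfAny(Path.GetInvalidFileNameChars()) >= 0) return null;

        return Path.Combine(protectedFolder, fileName);
    }
}
```

In the handler you might call Resolve(context.Request.Path, yourFolder) and treat a null result, or a path where File.Exists is false, the same as a denied request. If the files stay exactly where the URL points, context.Request.PhysicalPath already gives you the mapped path without any parsing.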

To recap what's going on in the code:

  1. The HttpHandler intercepts a request for a .doc file.
  2. You perform some sort of custom validation against a database or XML file to determine if the request is a valid one.
  3. If the request is valid, you'll binary stream the file out to the end user via the browser.
  4. If the request is invalid, you'll context.Response.Write them a message, in this case, "DENIED".

The HttpHandler is a powerful tool that you can use to really retake control over access to your content.

This implementation can be used to restrict downloads to paying customers on E-commerce sites or just to 'lock down' some of the resources on your site. It really helps when you've exposed a URL to the world, someone has shared your resource without asking, and you only want to provide the .doc, .pdf or .zip file to paying or valid users. There's nothing more frustrating than spending time and effort and having your stuff ripped off.

Stick it to those rotten spiders and thieves by implementing strategies that keep your content safe on the web and out of their sticky hands.

'Nuff said.
