Getting HTML From a Webpage

So let’s say you are using a SharePoint / Commerce Server website, and you’re authenticated through a cookie token. Let’s say you have a page set up whose output you need, possibly for generating e-mail, uploading somewhere, whatever. There are several ways to get that HTML:

  1. Create a hidden iFrame, and use a server callback to send the contents back to you.
  2. Use WebClient to go get it.
  3. Roll your own HTTP client.
  4. Something else I didn’t think of off the top of my head.

Let’s be honest: (1) is ugly as sin. And if we try (2), implemented as below:

    WebClient wc = new WebClient();
    string html  = wc.DownloadString("");

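It doesn’t work. Why not? Because WebClient creates an entirely new request, as if from a faux browser, and that request carries none of the current user’s state. No cookies, no authentication… nothing. I ran across a very quick and dirty way to solve this: subclass WebClient and attach the incoming request’s cookies to every request it makes.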

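    using System;
    using System.Net;
    using System.Web;

    public class MasqueradeWebClient : WebClient
    {
        // Cookies attached to every request this client makes.
        private readonly CookieContainer m_container = new CookieContainer();

        // Copy an ASP.NET HttpCookie (System.Web) into a System.Net.Cookie
        // and store it in the container.
        public void AddCookie( HttpCookie httpCookie )
        {
            Cookie newCookie = new Cookie();

            // Use the current request's host as the cookie domain.
            newCookie.Domain   = HttpContext.Current.Request.Url.Host;
            newCookie.Expires  = httpCookie.Expires;
            newCookie.Name     = httpCookie.Name;
            newCookie.Path     = httpCookie.Path;
            newCookie.Secure   = httpCookie.Secure;
            newCookie.Value    = httpCookie.Value;

            m_container.Add( newCookie );
        }

        // Hand the cookie container to each outgoing HTTP request.
        protected override WebRequest GetWebRequest(Uri address)
        {
            WebRequest request = base.GetWebRequest(address);
            HttpWebRequest httpRequest = request as HttpWebRequest;
            if (httpRequest != null)
                httpRequest.CookieContainer = m_container;
            return request;
        }
    }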

Using the MasqueradeWebClient, we can now do something like this:

    MasqueradeWebClient mwc = new MasqueradeWebClient();

    for (int i = 0; i < Context.Request.Cookies.Count; i++)
        mwc.AddCookie( Context.Request.Cookies.Get(i) );

    string html  = mwc.DownloadString("");

And voilà! Our HTML downloads as expected, because the request now carries the current user’s cookies and we are treated as a cookie-authenticated user!

