EPiServer friendly URLs for paginated pages (and why the asp:LinkButton must die)

Recently I got a request from a customer to do some search engine optimization on an old EPiServer site we maintain. One of the optimizations was to fix the paging on their product page. The products are fetched from an external data source and are not stored as pages in EPiServer, so the normal EPiServer paging controls cannot be used. Imagine a normal paging control like this:

Prev 1, 2 , 3, …, 99 Next

A fairly normal approach to this would have been to use a query parameter for handling the paging. Like this:

<a href="/products/?page=4">4</a>

With this approach each page has a unique entry point. Query strings are not optimal for SEO, but asp:LinkButtons are even worse, and whoever created the site had decided to use them. LinkButtons are very convenient to work with in ASP.NET, but the HTML they generate is not very SEO-friendly. This is the HTML that was generated:

1 2 3 Next

We can see that:

  • LinkButtons generate JavaScript that runs when the link is clicked, instead of a normal link
  • Paging is handled with a postback, so the pages no longer have unique entry points.

This is disastrous from an SEO perspective.

So let's fix it. The first step is to replace the LinkButtons with normal links (asp:HyperLink), and then we are going to do some URL rewrite magic to create SEO-friendly URLs. The code-behind was changed to assign links like /products/?page=4.
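To make the code-behind change concrete, here is a minimal sketch of how the link text and URL for each page could be built. The class and method names (PagingLinks, BuildPageUrl, BuildLinks) are made up for this illustration and are not from the original site; the sketch is plain C# with no ASP.NET dependency.

```csharp
using System;
using System.Collections.Generic;

public static class PagingLinks
{
    // Hypothetical helper: builds the query-string URL for one page,
    // e.g. BuildPageUrl("/products/", 4) gives "/products/?page=4".
    public static string BuildPageUrl(string basePath, int page)
    {
        if (page < 1) throw new ArgumentOutOfRangeException("page");
        return basePath + "?page=" + page;
    }

    // Returns the link text (key) and URL (value) for pages 1..pageCount.
    public static List<KeyValuePair<string, string>> BuildLinks(string basePath, int pageCount)
    {
        var links = new List<KeyValuePair<string, string>>();
        for (int page = 1; page <= pageCount; page++)
        {
            links.Add(new KeyValuePair<string, string>(
                page.ToString(), BuildPageUrl(basePath, page)));
        }
        return links;
    }
}
```

In the code-behind, each pair would then be assigned to the Text and NavigateUrl properties of an asp:HyperLink, so the browser and the search engine both see an ordinary link.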

Now, to get URLs like /products/page/4/, we can create a custom URL rewrite module in EPiServer. First we create a new class, Rewrite, that inherits from EPiServer.Web.FriendlyUrlRewriteProvider. In this class we override three methods (don’t ask me why they have such confusing names; someone at EPiServer must have been under the influence of something when naming them):

  • ConvertToInternalInternal – Used to convert from /products/page/4/ to an internal EPiServer page URL with the page number as a query parameter
  • ConvertToExternalInternal – Used to convert from an internal EPiServer URL (/PageType.aspx?lotsofqueryparameters=values) to an external one (/products/page/4/)
  • ConvertToInternal – Needed to work around the URL rewrite caching behaviour in EPiServer; more detail on this later.

This is the class:

using System.Text;
using System.Text.RegularExpressions;
using EPiServer;
using EPiServer.Web;

namespace Utils
{
  public class Rewrite : FriendlyUrlRewriteProvider
  {
    // The regexp to match a paged URL, e.g. /products/page/4/
    private const string _regexpPaging = @"(.+)/page/([0-9]+)/$";

    protected override bool ConvertToInternalInternal(
      UrlBuilder url, ref object internalObject){...}

    protected override bool ConvertToExternalInternal(
      UrlBuilder url, object internalObject, Encoding toEncoding){...}

    public override bool ConvertToInternal(
      UrlBuilder url, out object internalObject) {...}
  }
}
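To see what the paging pattern actually captures, here is a small standalone sketch that runs the same regular expression against a few paths. It does not touch EPiServer at all; the class name PagingPatternDemo and the method TrySplit are made up for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PagingPatternDemo
{
    // Same pattern as _regexpPaging in the Rewrite class.
    private const string PagingPattern = @"(.+)/page/([0-9]+)/$";

    // Returns true and splits the path when it ends with /page/{number}/.
    // basePath gets the part before /page/ plus a trailing slash,
    // page gets the digits. Both are null when there is no match.
    public static bool TrySplit(string path, out string basePath, out string page)
    {
        Match match = Regex.Match(path, PagingPattern);
        basePath = match.Success ? match.Groups[1].Value + "/" : null;
        page = match.Success ? match.Groups[2].Value : null;
        return match.Success;
    }

    public static void Main()
    {
        string[] paths = { "/products/page/4/", "/products/", "/products/page/4" };
        foreach (string path in paths)
        {
            string basePath, page;
            Console.WriteLine(TrySplit(path, out basePath, out page)
                ? path + " -> " + basePath + " (page " + page + ")"
                : path + " -> no match");
        }
    }
}
```

Note that the pattern requires the trailing slash, so /products/page/4 does not match, and it requires at least one character before /page/, so a bare /page/4/ does not match either.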

The first method we implement is ConvertToInternalInternal, which converts the external URL to an internal EPiServer URL.

protected override bool ConvertToInternalInternal(UrlBuilder url, ref object internalObject)
{
  if (url == null)
  {
    return false;
  }
  // Check whether the URL ends with /page/{Id}/
  Match match = Regex.Match(url.Path, _regexpPaging);

  // If we have a match, remove the /page/{Id}/ from the end of the URL
  // and add the id to the internal URL as the querystring dqcPagingId
  // (it should have a unique name so it does not clash with some other querystring)
  if (match.Success)
  {
    url.Path = match.Groups[1].Value + "/";
    url.QueryCollection["dqcPagingId"] = match.Groups[2].Value;

    // Now that /page/{Id}/ is removed from the URL and the querystring is added,
    // we can let EPiServer do its normal URL rewriting.
    base.ConvertToInternalInternal(url, ref internalObject);
    return true;
  }

  // Not a paged URL: let EPiServer handle it as usual.
  return base.ConvertToInternalInternal(url, ref internalObject);
}
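Once the rewrite has run, the page's code-behind sees the page number as an ordinary querystring value. A minimal sketch of reading it defensively could look like this; the class PagingQuery and the method GetPageNumber are made up for this illustration, and falling back to page 1 is my assumption, not something the original site necessarily did.

```csharp
using System;
using System.Collections.Specialized;

public static class PagingQuery
{
    // Hypothetical helper: returns the requested page number, or 1 when
    // dqcPagingId is missing, is not a valid integer, or is less than 1.
    public static int GetPageNumber(NameValueCollection query)
    {
        int page;
        string raw = query["dqcPagingId"];
        return int.TryParse(raw, out page) && page >= 1 ? page : 1;
    }
}
```

In a Web Forms page this would be called as PagingQuery.GetPageNumber(Request.QueryString), since Request.QueryString is a NameValueCollection.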

While not necessary, we should override ConvertToExternalInternal as well. This lets EPiServer automatically convert internal URLs containing the dqcPagingId querystring to an external URL ending with /page/{Id}/.

protected override bool ConvertToExternalInternal(UrlBuilder url, object internalObject, Encoding toEncoding)
{
  // First let EPiServer convert the internal URL to an external one. This gives us a URL like
  // /Products/?dqcPagingId=5 (if it is a paged page)
  bool isRewritten = base.ConvertToExternalInternal(url, internalObject, toEncoding);

  // Check whether the query string contains dqcPagingId.
  // If it does, we add page/{Id}/ to the path and remove the query string.
  // Looking the key up avoids matching other parameters that merely contain the name.
  string pagingId = url.QueryCollection["dqcPagingId"];
  if (!string.IsNullOrEmpty(pagingId))
  {
    url.Path = string.Concat(url.Path, "page/", pagingId, "/");
    url.QueryCollection.Remove("dqcPagingId");
  }

  return isRewritten;
}

Now one might think we are done, but there is one more method we need to implement. EPiServer caches URL rewrites from external URLs to internal URLs, but it only caches the querystrings used by EPiServer itself, not the dqcPagingId we added. This gives a rather unexpected result. The first time the page is loaded everything works: from the code-behind we can access dqcPagingId and show the requested page. If you wait 10 seconds (the default cache time) or more and reload the page, it also works. But if you reload the page sooner, the rewrite is served from the cache and dqcPagingId is not set when the page loads. This took me quite some time to figure out.

The solution to the problem is to override ConvertToInternal to bypass the default caching:

public override bool ConvertToInternal(UrlBuilder url, out object internalObject)
{
  // If the URL ends with /page/{Id}/, bypass the cache by calling ConvertToInternalInternal directly.
  // A more optimal solution would be to perform some kind of caching here
  if (url != null && Regex.IsMatch(url.Path, _regexpPaging))
  {
    internalObject = null;
    ConvertToInternalInternal(url, ref internalObject);
    return true;
  }

  // Else, it is ok to use the cached result
  return base.ConvertToInternal(url, out internalObject);
}

The last step is to add our Rewrite module to the web.config:

  
    
  

Now we should be all done! Enjoy!

Peter

3 thoughts on “EPiServer friendly URLs for paginated pages (and why the asp:LinkButton must die)”

  1. Hi peter,

    I had a perfectly working custom rewriter until I upgraded to 6R2. Are you familiar with any known issues or changes in the architecture concerning urlrewriting in 6R2?

    Sandor
