Wednesday, October 7, 2015

SharePoint: detect when the crawl account is accessing your page


We often add code to our SharePoint pages that writes an entry to a list whenever a particular page is accessed. This is usually done to count page visits or to record the most recently visited page or document per user. However, you may not want to add entries for the search account when it crawls your content source, as this can cause serious performance issues. We ran into exactly that: we had multiple site collections, and in each of them our custom document set home pages implemented the same logic of updating a list whenever the page was accessed.

To fix this, I had to detect whether the page was being accessed by a normal user or by the SharePoint crawl account. The best way I found to do this is to check the "User-Agent" request header. When the crawler accesses a page, it sends the following string as the user agent in the request headers:

MS Search 6.0 Robot
User-Agent: Mozilla/4.0 (compatible; MSIE 4.01; Windows NT; MS Search 6.0 Robot)

I was able to tell a normal user from the SharePoint crawler simply by checking the following value:

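string userAgent = HttpContext.Current.Request.UserAgent;
// UserAgent is null when the request carries no User-Agent header
if (userAgent == null || !userAgent.Contains("MS Search 6.0 Robot"))
        {
          //Add your logic to add/update the list
        }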

This works for both SharePoint 2010 and 2013, since the crawler's User-Agent value is the same in both versions.
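
If several pages need the same check, it may be convenient to move it into a small helper. The following is only a sketch: the class name CrawlerDetection and the method name IsSearchCrawler are hypothetical, not part of SharePoint. It also assumes the crawler still sends its default User-Agent string shown above. The helper takes the User-Agent as a plain string, so it can be tested without an HTTP context, and it treats a missing User-Agent as a normal request.

```csharp
using System;

// Hypothetical helper; the class and method names are illustrative only.
public static class CrawlerDetection
{
    // Token the SharePoint crawler includes in its default User-Agent header.
    private const string CrawlerToken = "MS Search 6.0 Robot";

    // Returns true when the supplied User-Agent string contains the crawler token.
    // A null or empty value is treated as a normal (non-crawler) request.
    public static bool IsSearchCrawler(string userAgent)
    {
        if (string.IsNullOrEmpty(userAgent))
        {
            return false;
        }

        // Case-insensitive match, in case the header casing differs.
        return userAgent.IndexOf(CrawlerToken, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

In the page code, the list update would then be wrapped in a check such as if (!CrawlerDetection.IsSearchCrawler(HttpContext.Current.Request.UserAgent)) { ... }.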

