IIS7 Rewrite Issue with Googlebot - iis-7

I recently implemented a new handler that serves images on my site.
The old handler was called spotSnap.ashx and the new one is called photo.ashx.
I set up a rewrite rule in IIS7 as follows:
<rule name="Redirect spotsnap" patternSyntax="Wildcard" stopProcessing="true">
<match url="spotsnap.ashx" />
<action type="Redirect" url="photo.ashx" redirectType="Permanent" />
</rule>
The rule appears to work correctly.
However, Googlebot is generating hundreds of errors like this every day:
System.Web.HttpException (0x80004005): A potentially dangerous Request.Path value was detected from the client (:).
at System.Web.HttpRequest.ValidateInputIfRequiredByConfig()
at System.Web.HttpApplication.PipelineStepManager.ValidateHelper(HttpContext context)
These server variables look odd to me (I substituted the actual website directory path for obvious reasons):
PATH_INFO /http:/photo.ashx
PATH_TRANSLATED c:\path\to\website\http:\photo.ashx
URL /http:/photo.ashx

Related

Error when reprocessing a rewritten url in IIS URL Rewrite 2

I'm trying to create a system to serve images and their resized versions from GridFS using MVC3 and IIS URL Rewrite 2. After testing, I realized that serving images directly from the file system is 10x faster than serving them using GridFS file streams. So I decided to keep the originals in GridFS and create a copy of the original file and its resized versions on the server's local file system, using a combination of URL Rewrite 2 and ASP.NET handlers.
Here are the rewrite rules I use for serving the original and the resized version:
<rule name="Serve Resized Image" stopProcessing="true">
<match url="images/([a-z]+)/[a-f0-9]+/[a-f0-9]+/[a-f0-9]+/([a-f0-9]+)-([a-f0-9]+)-([a-f0-9]+)-([0-9]+)\.(.+)" />
<conditions logicalGrouping="MatchAll" trackAllCaptures="false">
<add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
</conditions>
<action type="Rewrite" url="/Handlers/ImageResizer.ashx?Uri={REQUEST_URI}&amp;Type={R:1}&amp;Id={R:2}&amp;Width={R:3}&amp;Height={R:4}&amp;ResizeType={R:5}&amp;Extension={R:6}" appendQueryString="false" logRewrittenUrl="true" />
</rule>
<rule name="Serve Original Image" stopProcessing="true">
<match url="images/([a-z]+)/[a-f0-9]+/[a-f0-9]+/[a-f0-9]+/([a-f0-9]+)\.(.+)" />
<conditions logicalGrouping="MatchAll" trackAllCaptures="false">
<add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
</conditions>
<action type="Rewrite" url="/Handlers/Images.ashx?Uri={REQUEST_URI}&amp;Type={R:1}&amp;Id={R:2}&amp;Extension={R:3}" appendQueryString="false" logRewrittenUrl="true" />
</rule>
As you can see, the rewrite engine checks whether the file exists on the file system and, if not, rewrites the URL and sends the request to the handler. The handler serves the stream and writes the file to the file system. On the next request, the file is served directly from the file system. I've separated the files into folders by splitting their 24-character IDs (MongoDB ObjectId as a string) to avoid having hundreds of thousands of images in the same folder.
Here is a sample original image request:
http://localhost/images/test/50115c53/1f37e409/4c7ab27d/50115c531f37e4094c7ab27d.jpg
This and the resized versions work without any problems.
Since this URL is too long and has duplicates in it, I've decided to use the rewrite engine again to shorten the URL and generate the folder names automatically. Here is the rule, which I put at the top:
<rule name="Short Path for Images">
<match url="images/([a-z]+)/([a-f0-9]{8})([a-f0-9]{8})([a-f0-9]{8})(.+)" />
<action type="Rewrite" url="images/{R:1}/{R:2}/{R:3}/{R:4}/{R:2}{R:3}{R:4}{R:5}" appendQueryString="false" logRewrittenUrl="true"></action>
</rule>
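To make the back-references concrete, here is the same rule again with comments tracing the shortened form of the sample request shown earlier through it. The capture values are worked out by hand from the pattern; the indentation and the self-closing action tag are only cosmetic.

```xml
<!-- Input path: images/test/50115c531f37e4094c7ab27d.jpg
     {R:1} = test       (type segment)
     {R:2} = 50115c53   (first 8 hex characters of the ID)
     {R:3} = 1f37e409   (next 8)
     {R:4} = 4c7ab27d   (last 8)
     {R:5} = .jpg       (everything after the 24-character ID)
     Rewritten path:
     images/test/50115c53/1f37e409/4c7ab27d/50115c531f37e4094c7ab27d.jpg -->
<rule name="Short Path for Images">
  <match url="images/([a-z]+)/([a-f0-9]{8})([a-f0-9]{8})([a-f0-9]{8})(.+)" />
  <action type="Rewrite" url="images/{R:1}/{R:2}/{R:3}/{R:4}/{R:2}{R:3}{R:4}{R:5}" appendQueryString="false" logRewrittenUrl="true" />
</rule>
```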
When I request an image using this rule, for example with the following URL:
http://localhost/images/test/50115c531f37e4094c7ab27d.jpg
it only serves the image if the image is already on the file system; otherwise I get the following error:
HTTP Error 500.50 - URL Rewrite Module Error.
The page cannot be displayed because an internal server error has occurred.
I've checked IIS Log File Entry for the request. It doesn't show any details except:
2012-08-02 14:44:51 127.0.0.1 GET /images/test/50115c531f37e4094c7ab27d.jpg - 80 - 127.0.0.1 Mozilla/5.0+(Windows+NT+6.1;+WOW64)+AppleWebKit/537.1+(KHTML,+like+Gecko)+Chrome/21.0.1180.60+Safari/537.1 500 50 161 37
On the other hand, successful requests log the rewritten URLs, like:
GET /Handlers/ImageResizer.ashx Uri=/images/test/50115c53/1f37e409/4c7ab27d/50115c531f37e4094c7ab27d-1f4-1f4-2.jpg&Type=test&Id=50115c531f37e4094c7ab27d&Width=1f4&Height=1f4&ResizeType=2&Extension=jpg
Elmah and the EventLog don't show anything either. I added a filesystem logger to the top of my controller method, and it doesn't log these particular problematic requests.
Can anyone suggest a workaround to get this working?
Edit: After RuslanY's suggestion about Failed Request Tracing, I've managed to identify the error:
ModuleName: RewriteModule
Notification: 1
HttpStatus: 500
HttpReason: URL Rewrite Module Error.
HttpSubStatus: 50
ErrorCode: 2147942561
ConfigExceptionInfo:
Notification: BEGIN_REQUEST
ErrorCode: The specified path is invalid. (0x800700a1)
The entire trace result can be seen here (IE only)
Unfortunately, this still doesn't lead me to a solution, since the second rule (and therefore the shortening rule) works when the file exists on the file system.
As an alternative to using URL Rewrite for this check, why not use Application Request Routing (ARR) with disk-based caching? Dynamic images will be generated, and ARR's caching infrastructure will save the generated images to disk. It's less messy, and I've used ARR with great success in production scenarios. The disk cache persists between IIS restarts and can live as long as you say (the default is to use the cache information from the response, but you can override this to be longer).
Application Request Routing

Page Load fires twice after url rewriting

I have a page (DetaliiProdus.aspx) where I've applied these URL rewriting rules:
<rewrite>
<rules>
<rule name="DetaildProductSub1" stopProcessing="true">
<match url="produse/([_0-9a-z-]+)/([0-9]+)/([0-9]+)/([_0-9a-z-]+)" />
<action type="Rewrite" url="Site/DetaliiProdus.aspx?c={R:1}&amp;p={R:2}&amp;s1={R:3}" appendQueryString="false"/>
</rule>
<rule name="DetaildProductSub1Sub2" stopProcessing="true">
<match url="produse/([_0-9a-z-]+)/([0-9]+)/([0-9]+)/([0-9]+)/([_0-9a-z-]+)" />
<action type="Rewrite" url="Site/DetaliiProdus.aspx?c={R:1}&amp;p={R:2}&amp;s1={R:3}&amp;s2={R:4}" appendQueryString="false"/>
</rule>
</rules>
</rewrite>
If I go to the page directly using the action URL (e.g. "/Site/DetaliiProdus.aspx?c=m1&p=868&s1=60&s2=140"), Page Load fires once and everything works.
If I go to the page using the URL rewrite rule (e.g. "/produse/m1/868/60/140/Biserica%20in%20asediu"), the Page Load method fires multiple times (3 times).
Can you give me any clue why this happens? I've already spent over 3 hours on this.
Look for additional resources that are being loaded and passed through your routing rules to your handler page. If .js, .css, or similar files are loaded for the page, the page load can fire multiple times as those requests are rewritten. When this occurs, you can inspect the HttpContext object to see which resource was requested and abort processing if necessary.
It is preferable to filter requests earlier in the pipeline, but if your page load is being called multiple times, consider filtering in the page load (or page init, etc.).
I also recently had my Page_Load fire 3 times with IIS URL Rewriting enabled.
To solve this problem add the following condition to your rewrite rule:
<rule name="Some rule">
...
<conditions logicalGrouping="MatchAny">
<add input="{URL}" pattern="^.*\.(dxr|ashx|axd|css|gif|png|jpg|jpeg|js|flv|f4v)$" negate="true" />
</conditions>
</rule>
As stated here: IIS url rewrite - css and js incorrectly being rewritten
Files with the above extensions will not be rewritten, thereby solving the slow load time and the multiple Page_Load events.
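Applied to one of the rules from the question, the condition might look like the sketch below. The match pattern and action are copied from the question; the trimmed extension list is my own choice for illustration, so adjust it to the static file types your pages actually load.

```xml
<rule name="DetaildProductSub1Sub2" stopProcessing="true">
  <match url="produse/([_0-9a-z-]+)/([0-9]+)/([0-9]+)/([0-9]+)/([_0-9a-z-]+)" />
  <conditions>
    <!-- Skip the rewrite when the requested path ends in a static-file extension
         (illustrative list, not exhaustive) -->
    <add input="{URL}" pattern="\.(axd|ashx|css|js|gif|png|jpg|jpeg)$" negate="true" />
  </conditions>
  <!-- {R:n} still refers to the captures of the match element above -->
  <action type="Rewrite" url="Site/DetaliiProdus.aspx?c={R:1}&amp;p={R:2}&amp;s1={R:3}&amp;s2={R:4}" appendQueryString="false" />
</rule>
```

With this in place, a relative stylesheet or script request that happens to resolve under /produse/... is served normally instead of being rewritten to DetaliiProdus.aspx.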

Use IIS7 URL Rewriting to redirect all requests

I have a .NET site that I am taking down and I plan on redirecting all requests to www.mysite.com using a rewrite rule in the web.config. Should be a simple task, but it's not. I've removed all the content from the filesystem except a single Default.aspx page and a default web.config with the following rewrite rule:
<rule name="Redirect All" stopProcessing="true">
<match url="^(www\.)?mysite\.com(/.+)$" />
<action type="Redirect" url="www.mysite.com" appendQueryString="false" />
</rule>
If I request www.mysite.com/garbage.aspx the server is still trying to look up garbage.aspx, or any other url I provide just as if the rewrite rule was not there.
Very frustrating. Ideas?
Do you want to redirect ALL requests to your new site? If so, this should do it:
<match url=".*" />
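For context, the match pattern is only tested against the URL path; the host name is not part of it, which is why the original pattern never matched. A complete rule along these lines might look like the sketch below. It assumes the site being retired answers on a different host name than www.mysite.com; the host condition is my addition to guard against a redirect loop, and the action URL includes the scheme because a value without one is treated as a relative path.

```xml
<rule name="Redirect All" stopProcessing="true">
  <!-- Matches every path, including the empty path of the site root -->
  <match url=".*" />
  <conditions>
    <!-- Assumption: only redirect when the request is not already for the target host -->
    <add input="{HTTP_HOST}" pattern="^www\.mysite\.com$" negate="true" />
  </conditions>
  <action type="Redirect" url="http://www.mysite.com/" appendQueryString="false" redirectType="Permanent" />
</rule>
```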

Canonical Hostname with URLRewrite 2.0 behind load balancer

I have two IIS7 web servers behind a load-balancer. The URL Rewrite 2.0 module is installed on both servers and the following rewrite rule applied to both instances:
<rule name="Enforce canonical hostname" stopProcessing="true">
<match url="(.*)" />
<conditions>
<add input="{HTTP_HOST}" negate="true" pattern="^www\.mydomain\.com$" />
</conditions>
<action type="Redirect" url="http://www.mydomain.com/{R:1}" redirectType="Permanent" />
</rule>
When I try to navigate to http://mydomain.com, my web browser hangs indefinitely. I suspect the load-balancer is affecting the way URL Rewrite works, but I can't be certain.
We ended up using the following technique:
http://www.mcanerin.com/en/articles/301-redirect-iis.asp
The key was to add the $S$Q to the end of the domain name.
My guess is:
1. The load balancer forwards your request to your child servers.
2. When the request reaches a child server, it is redirected according to your URL redirect rule, so the redirected request arrives at the load balancer again.
3. The same procedure (steps 1-2) is followed again.
Thus your request loops again and again and your browser hangs.
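If the loop guess above is right, one thing to try is to stop redirecting "anything that is not www" and instead redirect only the bare domain explicitly. This is a sketch under the assumption that the load balancer forwards requests with a Host header other than www.mydomain.com (for example an internal name or IP address), in which case the negated condition from the question would match every request.

```xml
<rule name="Enforce canonical hostname" stopProcessing="true">
  <match url="(.*)" />
  <conditions>
    <!-- Redirect only when the request was made for the bare domain -->
    <add input="{HTTP_HOST}" pattern="^mydomain\.com$" />
  </conditions>
  <action type="Redirect" url="http://www.mydomain.com/{R:1}" redirectType="Permanent" />
</rule>
```

Requests that arrive with any other Host value are then left alone rather than redirected again.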

IIS UrlRewriter Outbound rules ignored if PreCondition is set

I have a very simple outbound UrlRewriter rule that rewrites URLs it finds in the body of the HTTP response stream:
<rewrite>
<outboundRules>
<rule name="Scripted"
preCondition="IsHtml"
patternSyntax="ECMAScript"
stopProcessing="false">
<match filterByTags="None" pattern="http://someurl.com" />
<action type="Rewrite" value="http://anotherurl.com" />
</rule>
<preConditions>
<preCondition name="IsHtml" patternSyntax="Wildcard">
<add input="{RESPONSE_CONTENT_TYPE}" pattern="^text/html" />
</preCondition>
</preConditions>
</outboundRules>
</rewrite>
The problem is that as soon as I turn on the preCondition no rewriting takes place.
I need to be able to use a pre-condition because the page is an ASP.NET page and uses ASP.NET script resources e.g. <script src="ScriptResource.axd?d=...." type="text/javascript" />.
By default script resources are gzip compressed and I want to keep them that way. Without the content type precondition the URL rewriter RewriteModule throws a 500.52 error - "Outbound rewrite rules cannot be applied when the content of the HTTP response is encoded ("gzip")."
Using Fiddler I can see that Content-Type: text/html; charset=utf-8 is being sent in the response header, but UrlRewriter seems unable to match this.
Why is this happening?
This is because the server variable HTTP_ACCEPT_ENCODING has not been added to the allowed server variables list. Add it there (you can search for how to do this in IIS).
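Separately from the server-variable suggestion above, the precondition in the question is worth a second look: it declares patternSyntax="Wildcard" but uses the regular-expression pattern ^text/html. Under wildcard syntax the ^ is an ordinary character, so a content type of text/html; charset=utf-8 would not match. A minimal sketch of the precondition using the default regular-expression syntax instead:

```xml
<preConditions>
  <!-- No patternSyntax attribute: the pattern is evaluated as a regular expression -->
  <preCondition name="IsHtml">
    <add input="{RESPONSE_CONTENT_TYPE}" pattern="^text/html" />
  </preCondition>
</preConditions>
```

If wildcard syntax is preferred, the equivalent pattern would be text/html* with patternSyntax="Wildcard" kept as is.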
