IIS 7 URL Rewrite for 404 and Sitefinity

We have a new Sitefinity site that is replacing our marketing site. The switchover happened last Friday, and we uncovered a problem today: there is content (PDFs, JPGs) on the old site that can no longer be accessed and that did not make it into the content migration plan. On top of that, management has removed rollback as an option.
So, the solution I have come up with is to use IIS 7's URL Rewrite module to point to a new URL that hosts the old site so that the content can be accessed. This is the XML in my web.config that I have come up with:
<rewrite>
  <rules>
    <rule name="RedirectFileNotFound" stopProcessing="true">
      <match url=".*" />
      <conditions logicalGrouping="MatchAll">
        <add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
        <add input="{REQUEST_FILENAME}" matchType="IsDirectory" negate="true" />
        <add input="{URL}" negate="false" pattern="/\.*$" />
      </conditions>
      <action type="Redirect" url="http://www.oldsite.com{REQUEST_URI}" appendQueryString="true" />
    </rule>
  </rules>
</rewrite>
It attempts to test whether the URL resolves to a file or folder, and makes sure that we are requesting something with an extension. If the rules pass, it redirects to the same location on the old site. Ideally, this would mean that anything that previously linked to the old site could be left alone.
The problem is, nothing gets redirected.
By fiddling with the rules, I have verified that the module is operational, i.e. I can set it up to rewrite everything, and it works. But these rules do not work.
My theory is that since Sitefinity uses database storage, it somehow short-circuits the "IsFile" match type. Complete guess, but I'm kind of at a loss at this point.
How do I use URL rewriting to redirect on 404s in this manner?

I am not sure how the rewriter is implemented, but those rules seem to be too general. Sitefinity uses the routing engine and registers a series of routes that it handles. By definition, those routes are interpreted sequentially, so if a more general rule exists before a more specific one, the latter will not work.
I suspect what may be happening is that the Sitefinity rules already handle the request before the rewriter gets a chance to redirect it. What I can advise is to either implement more specific rewrite/redirect rules, or just handle the whole issue using a different approach. What was the reason your old files were inaccessible after the migration? Can you give a specific URL that fails to return the file, so we can work with a scenario?
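One way to make the rule more specific, as suggested above, is to restrict it to the static file types that were left behind. The fragment below is only a sketch: the extension list is an assumption based on the PDFs and JPGs mentioned in the question, the rule name is made up, and www.oldsite.com is the placeholder host from the question. It also drops the `/\.*$` condition, which as written matches a trailing slash followed by zero or more dots rather than a file extension. It would still redirect any PDF or JPG URL that the new site serves dynamically rather than from disk.

```xml
<rewrite>
  <rules>
    <!-- Sketch: only consider requests whose path ends in a known static extension. -->
    <rule name="RedirectMissingStaticFiles" stopProcessing="true">
      <match url="^.+\.(pdf|jpe?g)$" />
      <conditions logicalGrouping="MatchAll">
        <!-- Redirect only when nothing exists on disk at the requested path. -->
        <add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
      </conditions>
      <!-- {REQUEST_URI} already carries the query string, so it is not appended again. -->
      <!-- A temporary redirect avoids browser caching while the rule is being tuned. -->
      <action type="Redirect" url="http://www.oldsite.com{REQUEST_URI}" appendQueryString="false" redirectType="Found" />
    </rule>
  </rules>
</rewrite>
```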

This is just a shot in the dark, but do you have "file system fallback" enabled in the Sitefinity advanced settings for libraries? Perhaps the module is intercepting the request and not letting it proceed to the file system.

Thank you guys for your help, but it turned out to be a problem with dynamically served content in general.
Assume that all requests are actually handled by a Default.aspx page. This isn't the way that Sitefinity works, but it is the way that DotNetNuke works, and it illustrates the problem nicely.
The URL Rewrite IsFile and IsDirectory match types check for the physical existence of files. In this case, only Default.aspx physically exists. All the other dynamically served content is generated later in the request cycle and has no physical existence whatsoever.
Because of that, the IsFile check always fails, so the negated condition always passes and the redirect rule always executes.
The solution I went with was to let IIS and .NET handle the 404s themselves, which properly respects generated content, and to route them to a custom error page, 404redirection.aspx. Code on that page redirects to my old site, where the content is likely to be hosted. That site then has additional 404 handling that routes back to the 404NotFound.aspx page, so requests for files that don't exist in either system make a round trip and look like they never went anywhere. A nice side effect is that pages that aren't found on the old server display our new, pretty, rebranded 404 page on the new server.
Simply put, rather than attempting to pre-empt the content generation and error handling, I took a more "go with the flow" approach, and then diverted the flow at a more opportune time.
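A minimal sketch of the error-page wiring described above, assuming the 404 is routed through the IIS `httpErrors` section. The answer does not say whether `httpErrors` or ASP.NET `customErrors` was used, so the section choice and the attribute values here are assumptions; only the page name 404redirection.aspx comes from the answer.

```xml
<configuration>
  <system.webServer>
    <!-- Assumption: replace the default 404 response with the custom page. -->
    <httpErrors errorMode="Custom" existingResponse="Replace">
      <remove statusCode="404" subStatusCode="-1" />
      <!-- ExecuteURL runs the page on the server, so its code can issue the redirect to the old site. -->
      <error statusCode="404" path="/404redirection.aspx" responseMode="ExecuteURL" />
    </httpErrors>
  </system.webServer>
</configuration>
```

With this in place the rewrite module is not involved at all: the 404 is only raised after the application has had its chance to generate the content.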

Related

Default page returns 404 only when using search bot user agent

I have created a website using ASP.NET web pages (not MVC, not web forms).
If I access the default page by mydomain.com in a browser it shows the default page (index.cshtml) fine. However, search engines are seeing a 404 page and if I change my user agent to Googlebot or Bingbot I get a 404 error too.
This only affects the default page - if I use mydomain.com/index.cshtml I don't get the 404 page.
There is no user agent detection in my code.
I have watched the headers and there are no redirects, just an immediate 404 response, and only when using a bot user agent.
Is there some built-in user agent detection that affects default pages in ASP.NET web pages? Or could my hosting company be doing something (Arvixe)?
I can add code if it helps (but not sure what code I would add), or link to the web site.
I found the cause of the problem.
Apparently Arvixe websites have been hacked. The hackers inserted some code in web.config that displayed a different URL in place of the home page for bots only...
<rewrite>
  <rules>
    <rule name="1" patternSyntax="ECMAScript" stopProcessing="true">
      <match url="^$" ignoreCase="true" negate="false" />
      <conditions logicalGrouping="MatchAny" trackAllCaptures="false">
        <add input="{HTTP_USER_AGENT}" pattern="Googlebot|Yahoo|MSNBot|bingbot" />
      </conditions>
      <action type="Rewrite" url="bot.asp" />
    </rule>
  </rules>
</rewrite>
I did see a title/description for sports jerseys in Bing search results for my website, which is why I was investigating this.
From a search, it appears this has affected lots of Arvixe customers. Most will probably never know, as they are unlikely to view their website with a search bot user agent.
It looks like Arvixe were aware of the hacking and have already put a stop to this by removing the spam file (bot.asp or bot.php), but they have not fixed the web.config. If you have shared hosting with Arvixe, you should check for this now.
You should also check your Google Search Console/Analytics accounts for added owners/users, as some have reported this too, although you would have received an email warning of this.
I changed all my Arvixe passwords, but I doubt they got individual account passwords; they probably hacked at the server level.

Blocking referral spam traffic in asp.net without modifying the web.config

I'm using Google Analytics and I'm using filters to remove referral spam. In my web.config file, I also use this:
<rule name="buy-cheap-online.info" patternSyntax="Wildcard" stopProcessing="true">
  <match url="*" />
  <conditions>
    <add input="{HTTP_REFERER}" pattern="*.buy-cheap-online.info" />
  </conditions>
  <action type="AbortRequest" />
</rule>
I have dozens of these rules and I want to add more. There's this file on GitHub that includes a list of spammers: https://github.com/piwik/referrer-spam-blacklist/blob/master/spammers.txt
I could just keep adding rules to the web.config but it seems messy. What's another way to block referral spam traffic in asp.net so that all the sites in the text file can be blocked and if the file changes I can easily add new sites by reuploading the text file?
Note: I'm not asking for code to be written for me; I just want to know what other options I have.
That's right: adding more rules would be messy and, even worse, useless. Most of the spam in GA never reaches your site. There is no interaction at all, so a server-side solution like the web.config won't have any effect.
We can divide the spam into 2 main categories:
Ghost spam never interacts with your page, so server-side solutions like the web.config or .htaccess file won't have any effect and will only clutter your config files.
Some people still hesitate because they think creating filters just hides the problem instead of blocking it. But there is nothing to block; it is just someone creating fake records in GA reports.
Crawler spam, as the name implies, does access your website and can be blocked this way, but there are only a few crawlers compared with the ghosts.
To give you an idea, there are around 8 active crawlers, while there are more than 100 ghosts and the number increases each week. This is because the ghost method is easier for spammers to implement.
The best way to get rid of all ghosts is with a single filter based on your valid hostnames.
You can find more information about ghost spam and the solution here:
https://stackoverflow.com/a/28354319/3197362
https://moz.com/ugc/stop-ghost-spam-in-google-analytics-with-one-filter
Hope it helps.
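For the crawler spam that does reach the server, one option that avoids a rule per domain is a URL Rewrite rewrite map kept in its own file. This is a sketch under assumptions: the map name `SpamReferrers`, the file name `spamReferrers.config`, and the rule name are hypothetical, and the text list would still need to be converted into `<add>` entries. Unlike the wildcard rule in the question, the lookup is an exact match on the referrer host (minus a leading "www."), and it does nothing against ghost spam.

```xml
<!-- web.config, inside <system.webServer> -->
<rewrite>
  <!-- The map lives in a separate file so it can be re-uploaded on its own. -->
  <rewriteMaps configSource="spamReferrers.config" />
  <rules>
    <rule name="BlockSpamReferrers" stopProcessing="true">
      <match url=".*" />
      <conditions logicalGrouping="MatchAll">
        <!-- Capture the referrer host without the scheme or a leading "www." -->
        <add input="{HTTP_REFERER}" pattern="^https?://(?:www\.)?([^/:]+)" />
        <!-- Look the captured host up in the map; listed hosts map to "1". -->
        <add input="{SpamReferrers:{C:1}}" pattern="^1$" />
      </conditions>
      <action type="AbortRequest" />
    </rule>
  </rules>
</rewrite>

<!-- spamReferrers.config (hypothetical file name) -->
<rewriteMaps>
  <rewriteMap name="SpamReferrers" defaultValue="">
    <add key="buy-cheap-online.info" value="1" />
  </rewriteMap>
</rewriteMaps>
```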

IIS localhost is redirecting to the live server instead of showing local aspx files

I need to redesign a site's front end to make it responsive. The site is based on C# and ASPX. I am familiar with local PHP development using WAMP, so for this case I installed Visual Studio Web Express because of its IIS server features for testing local development.
The client sent me the folder from their FTP to work on, so I have everything ready. The problem I am facing is that when I right-click the website folder in Solution Explorer to view it in the browser, localhost redirects to the actual live site running on the server instead of taking me to localhost with the local files, so I cannot see my modifications and changes.
I am unable to figure out the problem. Maybe there is something I need to change or replace in the web.config file. I am working in an ASP environment for the first time, so your help and guidance in this regard will be very much appreciated.
Many thanks.
I had this same problem just now as well as two or three times before and haven't pinned down a clear cut resolution, though I've tried removing redirects, checking the Global.asax, commenting out default document files in the Web.config, etc. I'm mainly working in Chrome on Windows 7, and after I cleared Chrome's cache localhost started working for me again. I don't know if the answer is as simple as that or if the solution comes from some combination of things. All I know is nothing else seemed to work, then clearing the browser cache did.
PROBLEM
I was facing this problem because I wrote this rule inside the web.config file.
<rules>
  <rule name="CanonicalHostNameRule1">
    <match url="(.*)" />
    <conditions>
      <add input="{HTTP_HOST}" pattern="^www\.abc\.com$" negate="true" />
    </conditions>
    <action type="Redirect" url="http://www.abc.com/{R:1}" />
  </rule>
</rules>
Because of this, when I debug the solution, localhost is replaced with www.abc.com.
SOLUTION
Just remove this rule if it is present.
WHY USE RULE
These rules are used when you want to customize the URL. You can read further about URL Rewrite here
Redirecting to the live site from Visual Studio or localhost can be caused by a faulty URL rewrite rule in the site's web.config. Even if the rule is deleted, the browser may have cached the redirect if it was a permanent one, so fixing the faulty rule and then clearing the browser cache can be a viable solution.
See article
https://theludditedeveloper.wordpress.com/2016/01/06/iis-url-rewrite-gotcha-2/
For me it was because the web.config had a rewrite rule added for SSL purposes. However, once I removed it, I was still being redirected to the live site. I wasted a couple of hours searching the code without anything making sense. It then turned out that the cache was the issue, as the browser was still doing the redirection. So I cleared the cache and the problem was solved.
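If the canonical host rule is still wanted in production, a variant like the following keeps it from firing during local debugging. This is a sketch, not the original author's rule: the extra localhost condition and the `redirectType="Found"` setting are assumptions added for illustration, and www.abc.com is the placeholder host from the answer above. A temporary redirect is used because, as noted above, browsers may cache a permanent one.

```xml
<rewrite>
  <rules>
    <rule name="CanonicalHostNameRule1" stopProcessing="true">
      <match url="(.*)" />
      <conditions logicalGrouping="MatchAll">
        <!-- Redirect only when the host is not already the canonical one... -->
        <add input="{HTTP_HOST}" pattern="^www\.abc\.com$" negate="true" />
        <!-- ...and not a local debugging host such as localhost or localhost:12345. -->
        <add input="{HTTP_HOST}" pattern="^localhost(:\d+)?$" negate="true" />
      </conditions>
      <!-- "Found" is a temporary redirect, so browsers should not keep it after the rule changes. -->
      <action type="Redirect" url="http://www.abc.com/{R:1}" redirectType="Found" />
    </rule>
  </rules>
</rewrite>
```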

ASP.NET HttpContext.RemapHandler to remap to ASP Classic

We are upgrading an ASP Classic website (actually, a virtual directory under a larger website) to ASP.NET 3.5. There will be some legacy directories that remain ASP Classic. Other than that, every .asp file will be replaced by an .aspx file in the same location in the directory hierarchy. We would like not to break old links coming into the site from elsewhere. The website is hosted on IIS 6 (and we have no control over this).
My idea was, in IIS, to replace the usual handler for .asp files, asp.dll, with aspnet_isapi.dll. First question: If I do that, will requests for .asp files then be routed through any custom HTTP modules I create and register in web.config?
Then I would create an HTTP module hooked into BeginRequest that would test whether a request's path (before any querystring) ends in .asp. If so, it would check whether the physical file exists. If not, then I'll use HttpContext.RewritePath to append the "x" to the ".asp". Otherwise, if the .asp file DOES exist, I'll use HttpContext.RemapHandler to switch the handler back to asp.dll so that the file will be processed as the ASP Classic file that it is.
Second question: Will this work? Third question: What do I use as the argument to the RemapHandler method? How do I acquire a reference to an instance of the ASP Classic handler? (If I knew the answer to the third question, I'd have just tried all this on my own!)
UPDATE: OK, I did try it out myself, except that I renamed the remaining .asp files so that their extension is .aspc (ASP Classic), and in IIS I assigned the old asp.dll as their handler. Then, instead of checking whether the .asp file requested exists and remapping to the ASP Classic handler if so, I checked instead whether a file in the corresponding physical location except with the extension .aspc exists. If so, I rewrite the URL to append the "c". This worked! Therefore, the answer to my first question, above, is "yes", and the answer to my second question is "yes, pretty much, except that the part about remapping the handler is unknown". But it would be preferable not to have to change the extensions on all my legacy .asp files, so I am left with one question: Will the original RemapHandler approach work and, if so, what is its argument?
Did you know that you could handle this quite easily using the web.config file and rewrite rules in IIS?
You need to activate the URL Rewrite 2 module described here.
This is a really nice feature of IIS for solving routing problems; see here also for some examples.
Looking at your case, I would do something along the lines of this:
<rule name="execute classic asp if file exists" stopProcessing="true">
  <match url="(\w+\.asp)$" />
  <conditions>
    <!-- {R:1} already ends in ".asp", so appending "c" checks for the renamed .aspc file -->
    <add input="C:\Path\To\Your\WebApp\{R:1}c" matchType="IsFile" />
  </conditions>
  <action type="Rewrite" url="{R:1}c" appendQueryString="true" />
</rule>
<rule name="execute dotnet otherwise" stopProcessing="true">
  <match url="(\w+\.asp)$" />
  <action type="Rewrite" url="{R:1}x" appendQueryString="true" />
</rule>
I'm not sure if all the regexes here would work for you, but it's meant as a starting point for experimenting. The reason for explicitly writing C:\Path\To\Your\WebApp\ is that I didn't find a way to get the base path of the web app as a parameter.
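One way to avoid the hard-coded base path is the `{REQUEST_FILENAME}` server variable, which holds the physical path that the requested URL maps to. The variant below is a sketch under assumptions: it follows the setup from the question's update, where legacy files were renamed to .aspc and mapped to the classic ASP handler, and it presumes the URL Rewrite module is available, which requires IIS 7 or later (so it would not apply as-is to the IIS 6 host in the question). The rule names are made up.

```xml
<rewrite>
  <rules>
    <rule name="classic asp if .aspc file exists" stopProcessing="true">
      <!-- Capture the whole path so requests in subdirectories keep their folder. -->
      <match url="^(.+\.asp)$" />
      <conditions>
        <!-- Physical path of the requested .asp file plus "c" gives the .aspc file. -->
        <add input="{REQUEST_FILENAME}c" matchType="IsFile" />
      </conditions>
      <action type="Rewrite" url="{R:1}c" appendQueryString="true" />
    </rule>
    <rule name="asp.net otherwise" stopProcessing="true">
      <match url="^(.+\.asp)$" />
      <!-- No legacy file found: hand the request to the .aspx replacement. -->
      <action type="Rewrite" url="{R:1}x" appendQueryString="true" />
    </rule>
  </rules>
</rewrite>
```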

IIS7 URL Rewriting Outbound rules

I can't seem to get my head around these rewrite rules for some reason and I was hoping you guys could help. What I want is an outbound rule that will rewrite paths for link, img, script, and input tags.
I want to change this: http://www.mysite.com/appname/css/file.css
To this: http://cdn.mysite.com/css/file.css
So, basically I need to swap the host name and drop the app name from the URL. I've got the pre-condition filters to *.aspx files set already, but the rest seems like Greek to me.
EDIT for clarity
The appname in the URL above is an application in IIS. It's a placeholder for whatever appname happens to be in use. It could be any of over 50 different apps with our current setup. There will ALWAYS be an appname. Perhaps that will make the rule even easier.
The hostname, in this case www.mysite.com, can also vary slightly in terms of the subdomain. It might be www1.mysite.com, www2, etc. Also, just realized that I need to maintain the SSL if there.
So, I guess when it comes down to it, I really just need to take the URL, minus the appname, and append it to the new domain, while respecting the protocol that was used.
Original URL: http(s)://{host}/{appname}/{URL}
Output: http(s)://cdn.mysite.com/{URL}
Assuming your website domain is always the same, this rule should do:
<rule name="CdnRule" preCondition="OnlyAspx">
  <match filterByTags="Img, Input, Link, Script" pattern="^(.+)://.+?\.(.+?)/.+?/(.*)" />
  <action type="Rewrite" value="{R:1}://cdn.{R:2}/{R:3}" />
</rule>
<preConditions>
  <preCondition name="OnlyAspx">
    <add input="{PATH_INFO}" pattern=".+\.aspx$" />
  </preCondition>
</preConditions>
EDIT: changed according to the clarified question.
I assume the subdomain (www, www2, ...) is always there and has to be ignored in the target URL.
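For context, outbound rules and their preconditions sit inside an `<outboundRules>` element of the `<rewrite>` section. The sketch below shows that placement with a variant of the rule that hard-codes the target host from the question (cdn.mysite.com) and only touches links on www, www1, www2 and so on. The rule name and the tightened pattern are assumptions, not part of the answer above, and like the original it only matches absolute URLs.

```xml
<system.webServer>
  <rewrite>
    <outboundRules>
      <rule name="CdnRuleFixedHost" preCondition="OnlyAspx">
        <!-- R:1 = scheme (http or https), R:2 = path after the app name -->
        <match filterByTags="Img, Input, Link, Script"
               pattern="^(https?)://www\d*\.mysite\.com/[^/]+/(.*)$" />
        <!-- Keep the scheme, swap the host, drop the app name segment. -->
        <action type="Rewrite" value="{R:1}://cdn.mysite.com/{R:2}" />
      </rule>
      <preConditions>
        <preCondition name="OnlyAspx">
          <add input="{PATH_INFO}" pattern=".+\.aspx$" />
        </preCondition>
      </preConditions>
    </outboundRules>
  </rewrite>
</system.webServer>
```

With this pattern, http://www.mysite.com/appname/css/file.css becomes http://cdn.mysite.com/css/file.css, and an https link keeps its scheme.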

Resources