What is the significance of the _sm_nck=1 querystring?

What is the significance of the querystring:
_sm_nck=1
that I see appended to a lot of page requests? What adds it and why?
We are seeing the occasional page request coming into our site with that parameter added, usually as a second request for the same page from the same IP address a few seconds later. Also if I 'Google' that parameter I see a high number of search results have it appended too.

I think it has something to do with Zscaler not having the site in its database of sites; see the image below. I did not add the _sm_nck parameter myself.

Related

Why are there so many duplicate words in the URL?

Like this one - why are there so many duplicate words in the URL?
http://medicine.uiowa.edu/biochemistry/proteomics/biochemistry/biochemistry/biochemistry/biochemistry/biochemistry/node/451
Even when I add more /biochemistry segments, it still works! Can anyone explain?
I used Chrome's Network Inspector, but all browsers have this capability. When a request is made to https://medicine.uiowa.edu/biochemistry/, the response code is a nice 200. If you hit https://medicine.uiowa.edu/biochemistry/proteomics/, you'll see that you get a 301, meaning that the resource has moved permanently, and you can see that you've been redirected back to /biochemistry.
You may also get a 304, which tells the browser that the content has not changed and it can reuse its cached copy without the server retransmitting anything. Indeed, it appears you can add any number of /proteomics or /biochemistry segments to the URL and it will end up in the same place. My guess is that whoever set up the web server rules used a flawed regular expression for routing.
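If you want to see the same thing outside the browser, here is a minimal sketch using .NET's HttpClient (my own illustration, not part of the original answer) that disables automatic redirects so the 301 and its Location header are visible:
using System;
using System.Net.Http;
using System.Threading.Tasks;

class RedirectProbe
{
    static async Task Main()
    {
        // Turn off automatic redirect following so we can observe the 301 itself.
        var handler = new HttpClientHandler { AllowAutoRedirect = false };
        using var client = new HttpClient(handler);

        var response = await client.GetAsync(
            "https://medicine.uiowa.edu/biochemistry/proteomics/");

        // Expect 301 (MovedPermanently) with a Location header pointing back
        // to /biochemistry, matching what the Network Inspector shows.
        Console.WriteLine((int)response.StatusCode + " " + response.StatusCode);
        Console.WriteLine(response.Headers.Location);
    }
}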

Redirect a page to itself - what's the correct HTTP status code to use?

This is a question about web application architecture rather than coding per se; however, I still think it belongs here, as it's in the problem domain of most web developers:
My problem: I have a page whose content is not yet complete (only partial content). I don't want to just return a 200 response, because I want it to be clear that the content on the page is only temporary, and that a visitor (Google) should return at a later date to retrieve the correct page.
I'm not sure if there is a status code in the HTTP specification that would be useful here.
I'm thinking about using a 302 redirect to the same URI, but I'm not sure if Google will see this as gaming (I don't see why it should - no one would 302 to the same URI on a permanent basis, as the page content would be pretty much disregarded).
That's exactly what I want: for the page to be accessible, but for Google to disregard the page, remember the URL, and come back later to index it.
I don't want to use a meta 'noindex' tag with a 200 response, as I fear this will stop the page being reindexed when the correct content is ready.
206 is the Partial Content status code, but that's not what you are doing here: it's for responses to range requests (partial transfers of a resource). What you have here is an "under construction" type page, where only the content of the page is going to change, not the URI. So the right thing to do is just return a 200 and let Google index it.
If you don't want it indexed yet because it is not ready for the public, then add a meta noindex tag as you say. Google still downloads the page and parses it to find the noindex, but does not index it. Remove the noindex when you are ready and it will start indexing. You can even prompt this by submitting a new sitemap.xml file with your page in it.
Google re-indexes insanely quickly these days, so don't worry too much about temporarily blocking a page with a meta tag.
Google will re-index the page automatically when the content changes, or you can force an update from Webmaster Tools.
Alternatively, you could have the page 302 to an alternate address with your partially completed content until such time as the page is 'finished'. Then copy the final content into your original page and take off the 302.
Error codes are reserved for error conditions, and there is indeed no such error as "this page is not in its final version". What you might want instead is to specify that the page becomes obsolete and invalid at some later time. For example, the following code means the page becomes obsolete instantly:
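The snippet this answer refers to is not included in this excerpt; a minimal reconstruction of the idea in ASP.NET terms (my assumption about what was meant) is to send an Expires header that is already in the past:
// Reconstruction (assumption): mark the response as already expired so caches
// and crawlers treat it as stale, roughly "Expires: <a date in the past>".
Response.Cache.SetCacheability(HttpCacheability.NoCache);
Response.Cache.SetExpires(DateTime.UtcNow.AddYears(-1));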

Abusing HTTP POST

I'm currently reading Bloch's Effective Java (2nd Edition), and he makes a point of stating, in bold, that overusing POST in web applications is inherently bad. Unfortunately, he doesn't specify why.
This startled me, because when I do any web development, all I ever use are POSTs! I have always steered clear of GET for security reasons and because it felt more professional (long, unsightly URLs always bother me for some reason).
Are there performance differences between GET and POST? Can anyone elaborate on why overusing POST is bad? My understanding - and my preliminary searches - seem to indicate that the two are handled very similarly by the web server. Thanks in advance!
You should use HTTP as it's supposed to be used.
GET should be used for idempotent, read-only queries (e.g. viewing an item, searching for a product).
POST should be used for create, update or delete requests (e.g. deleting an item, updating a profile).
GET lets the user refresh the page, bookmark it, and send the URL to someone; POST doesn't allow any of that. A useful pattern is post/redirect/get (a.k.a. redirect after post), sketched below.
Note that, except for long search forms, GET URLs should be short. They should usually look like http://www.foo.com/app/product/view?productId=1245, or even http://www.foo.com/app/product/view/1245.
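The post/redirect/get pattern mentioned above looks roughly like this in ASP.NET MVC (a sketch; the controller, the ProfileModel type and the LoadProfile/SaveProfile helpers are invented for illustration):
using System.Web.Mvc;

public class ProfileController : Controller
{
    // GET: safe to refresh, bookmark, and share a link to.
    [HttpGet]
    public ActionResult Edit(int id)
    {
        return View(LoadProfile(id));           // LoadProfile is a placeholder
    }

    // POST: perform the update, then redirect so a refresh re-issues a GET
    // instead of resubmitting the form.
    [HttpPost]
    public ActionResult Edit(ProfileModel model)
    {
        SaveProfile(model);                     // SaveProfile is a placeholder
        return RedirectToAction("Edit", new { id = model.Id });
    }
}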
You should almost always use GET when requesting content. Only use POST when you are either:
Transmitting sensitive information which should not appear in the URL bar, or
Changing state on the server (adding/changing/deleting things; although recently some web applications use POST to change, PUT to add and DELETE to delete).
Here's the difference: If you want to give the link to the page to a friend, or save it somewhere, or even only add it to your bookmarks, you need the full URL of the page. Just like your address bar should say http://stackoverflow.com/questions/7810876/abusing-http-post at the moment. You can Ctrl-C that. You can save that. Enter that link again, you're back at this page.
Now when you use any method other than GET, there is simply no URL to copy. It's as if your browser said you were at http://stackoverflow.com/question. You can't copy that. You can't bookmark that. Besides, if you try to reload the page, your browser will ask whether you want to send the data again, which is confusing for the non-tech-savvy users of your page, and annoying for everyone else.
However, you should use POST/PUT when transferring data. URLs can only be so long; you can't transmit an entire blog post in a URL. Also, if you reload such a page (one where the data went in the URL), you'll almost certainly double-post, because the warning described above does not appear.
GET and POST are very different. Choose the right one for the job.
If you are using POST for security reasons, I should mention another security factor here: you need to make sure the data from a form submission is sent encrypted (i.e. over HTTPS), even if you are using POST.
As for the difference between GET and POST: GET is used to retrieve data. You want to get data from a page and act upon it, and that is the end of it.
POST, on the other hand, is used to post data to the application. I am talking about transactions here (complete create, update or delete operations).
If you have a sensitive application that takes, say, an ID to delete a user, you would not want to use GET for it, because a witty user could cause mayhem simply by changing the ID at the end of the URL and deleting random users (see the sketch below).
POST allows larger payloads and can also carry file uploads; GET has a limited size.
Performance-wise, there is hardly any tradeoff between GET and POST.
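To make the "delete a user by ID" point concrete: restrict the action to POST and validate an anti-forgery token, so the deletion cannot be triggered just by editing a URL. A sketch in ASP.NET MVC (the controller and the userRepository field are invented for illustration):
using System.Web.Mvc;

public class UsersController : Controller
{
    // GET /Users/Delete/42 never reaches this action; only a POSTed form
    // carrying a valid anti-forgery token can trigger the deletion.
    [HttpPost]
    [ValidateAntiForgeryToken]
    public ActionResult Delete(int id)
    {
        userRepository.Delete(id);              // userRepository is a placeholder
        return RedirectToAction("Index");
    }
}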

Erratic requests arriving from the client when using a custom view engine in ASP.NET MVC

I have spent about 7 hours trying to figure this out but gotten nowhere.
This is what my Fiddler trace looks like.
I have two routes, shown below, that are registered for this page.
[route name="DummyResultsWithMarketStateNames" url="DummyResults/state-{statename}/market-{marketname}/page-{page}/{action}"
controller="DummyResults" action="Show"/]
[route name="DummyResultsWithMarketId" url="DummyResults/market-{marketid}/page-{page}/{action}"
controller="DummyResults" action="Show"/]
For this URL, the first route matches and it goes to the right action. However, the client sends another request a second later in which it removes the last parameter, 'page-1', and replaces it with 'none'. I've traced for XHRs and there are none. I'm not sure if this is an issue with the MVC framework itself, but how would that translate into a request from the client? Also, I'm getting different behavior with different browsers (the IE trace is shown above). Has anyone encountered such strange behavior? I'd be happy to provide more info if you'd like.
UPDATE:
I set up the site on IIS and eliminated all image, CSS and script requests. I still end up with multiple requests. The original DummyResults page seems to be working now after I removed the .htc files. However, I have another page (screenshot below) that is not co-operating. Should I add IgnoreRoutes for certain extensions? This is driving me nuts! Pardon the 'bleep' on the image (IP reasons). PS: I set up another site for serving all static resources.
Q: Should I add IgnoreRoutes for certain extensions?
A: Of course! By default the WCF extension "*.svc" is ignored. The first thing I add in a new project, for instance, is the ignore rule for favicon.ico.
RouteTable.Routes.IgnoreRoute("*.svc");
RouteTable.Routes.IgnoreRoute("{resource}.axd/{*pathInfo}");
RouteTable.Routes.IgnoreRoute("favicon.ico");

Smart way to disallow users from going to a site page directly

A site has hundreds of pages, following a certain sitemap. A user can navigate to page2.aspx from page1.aspx, but if the user goes to page2.aspx directly, say through a bookmarked URL, the user should be redirected to page1.aspx.
Edit: I don't want to go in and add code to every page that needs to fulfill this requirement.
Note: This is not a cross-page postback scenario.
You might consider something that is based off WorkFlow, such as this: http://blogs.msdn.com/mwinkle/archive/2007/06/07/introducing-the-pageflow-sample.aspx
The WCSF team also included a pageflow application block that you can use as a standalone add-on to your application.
I guess you could check the referrer, and if there isn't one, or it isn't page1.aspx, then redirect back to page1.aspx.
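In ASP.NET Web Forms that check is only a few lines in a page (or base page) code-behind; a rough sketch of the idea:
// Minimal sketch: redirect to page1.aspx unless the request was navigated from it.
var referrer = Request.UrlReferrer;
if (referrer == null ||
    !referrer.AbsolutePath.EndsWith("page1.aspx", StringComparison.OrdinalIgnoreCase))
{
    Response.Redirect("~/page1.aspx");
}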
As another answerer mentioned, you could use the Referrer header, but that can be faked by the client.
Since you don't want to modify each page, you could do something with an IHttpModule. Assuming you have some way of describing the valid page navigations, you could do something like the following early in the request pipeline (session state isn't available until after BeginRequest, so hook a slightly later event; see the sketch after this list):
Check the session for a list of valid pages (using a default list on the first visit if none is in the session).
If this request is for an invalid page, redirect to the place the user should be.
Based on this request, set up the list of valid pages (and the redirect page) in the session so it's ready for the next request.
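A rough sketch of such a module; the session key, the default page list, and the NextValidPages helper are all placeholders for whatever describes your sitemap:
using System;
using System.Collections.Generic;
using System.Web;

public class PageFlowModule : IHttpModule
{
    public void Init(HttpApplication context)
    {
        // Session state is not available until after BeginRequest,
        // so hook an event that runs once it has been acquired.
        context.PostAcquireRequestState += OnPostAcquireRequestState;
    }

    private void OnPostAcquireRequestState(object sender, EventArgs e)
    {
        var app = (HttpApplication)sender;
        if (app.Context.Session == null) return;        // e.g. static resources

        // Pages the user may currently visit; default to the entry page on first visit.
        var valid = app.Context.Session["ValidPages"] as HashSet<string>
                    ?? new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "~/page1.aspx" };

        var page = app.Request.AppRelativeCurrentExecutionFilePath;
        if (!valid.Contains(page))
        {
            app.Response.Redirect("~/page1.aspx");      // send the user back to the flow's start
            return;
        }

        // Record which pages are reachable from here for the next request.
        // NextValidPages is a placeholder for your sitemap/workflow lookup.
        app.Context.Session["ValidPages"] = NextValidPages(page);
    }

    private static HashSet<string> NextValidPages(string currentPage)
    {
        return new HashSet<string>(StringComparer.OrdinalIgnoreCase) { currentPage, "~/page1.aspx" };
    }

    public void Dispose() { }
}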
I recently worked with real code that checked whether the referrer was blank and used that as a step in authorization. The idea was that users wouldn't be able to fake a referrer; but you don't need a custom browser to fake one. Users can bookmark your page on Delicious, and then delicious.com is the referrer (and not blank).
I've had real arguments about how sophisticated a user needs to be to pull off certain hacks - i.e. if users don't know how to set the referrer, then you can trust it. While it's true that your users are unlikely to write a custom browser, there are already Firefox add-ons to set headers, referrers, etc., and they're easy to use.
Josh has the best answer: on page2 you should check the page hit log and see if the user has recently visited page1.
I like a lot of the answers above (specifically the workflow one).
Another option is creating each page as a user control and having page1.aspx control which user control gets loaded. This has the advantage of storing your workflow in a single place instead of on each page.
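A sketch of that approach in the page1.aspx code-behind; the control paths, the session key, and the stepPlaceholder control are invented for illustration:
using System;
using System.Web.UI;

public partial class Page1 : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // Decide which step of the flow to render; the whole workflow lives in this one page.
        var currentStep = (string)(Session["CurrentStep"] ?? "Step1");

        // ~/Controls/Step1.ascx etc. are hypothetical control paths.
        Control step = LoadControl("~/Controls/" + currentStep + ".ascx");
        stepPlaceholder.Controls.Add(step);     // stepPlaceholder: a PlaceHolder on page1.aspx
    }
}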
However, I don't think there's a magic bullet out there. It sounds like this security problem is an afterthought, or possibly reported as a bug, and you have been tasked with fixing it quickly and efficiently.
I would start weighing the answers here by their associated cost in hours. I suspect the quickest solution will be to check referrer addresses on each page. Although hackable, it is obscure, and if that risk is acceptable to you, it may be the appropriate solution.
