Say that I require a querystring parameter, for example "itemid". If that querystring is for some reason missing, should I give the user an error page with a 200 status or a "404 Not Found"?
I would favour 404 but I'm not really sure.
Maybe you should give a "400 Bad Request".
The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications.
See http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html for more possibilities.
And, like Chris Simpson said, give a "404 Not Found" when no item is found for the given item id.
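To make the distinction concrete, here is a minimal sketch (Flask is used purely for illustration; the route and item store are made up): 400 when the itemid parameter is absent, 404 when it is present but matches nothing.

```python
from flask import Flask, request, abort

app = Flask(__name__)
ITEMS = {"100": "Blue widget"}  # hypothetical data store

@app.route("/browse")
def browse():
    itemid = request.args.get("itemid")
    if itemid is None:
        # Malformed request: the required parameter is missing entirely.
        abort(400, description="Missing required querystring parameter: itemid")
    item = ITEMS.get(itemid)
    if item is None:
        # Well-formed request, but no such resource exists.
        abort(404, description=f"No item found for itemid {itemid}")
    return item  # 200 OK
```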
You could also check popular RESTful APIs to see how others have handled the problem; Twitter, for example.
Good question. Personally I would use a 200 and provide a user-friendly error explaining the problem, but it really depends on the circumstances.
Would you also show the 404 if they did provide the itemid but the particular item did not exist?
From a usability standpoint, I'd say neither.
You should display a page that tells the user what's wrong and gives them an opportunity to fix it.
If the link is coming from another page on your site (or another site), then show a page that tells them that the requested item wasn't found and redirects them to an appropriate page, maybe one that lets them browse items?
If the user is typing the querystring themselves, then I'd have to ask why, since the URI isn't typically user-friendly.
You should give the user a 200 only when the HTTP request was answered with an appropriate response, even if that response is just a simple HTML page saying that a parameter is missing.
The 404 code is for when the user agent requests a resource that does not exist.
Check this list for further info
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
Give a 500 error, probably a 501. This will allow a browser or code to handle it through existing onError mechanisms, instead of having to listen for it in some custom way.
The way I see this working, you should return a 200 - because it should be a valid resource.
Let's say one of your URLs is widgets.com/browse.php?itemid=100 - and that URL displays a specific item in your catalog.
Now, a user enters widgets.com/browse.php - what do we expect the action to be? To list all of the items in your catalog, of course (or at least a paginated list).
Rethink your URL structure and how directory levels and parameters relate to one another.
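As a rough sketch of that idea (Flask again, names invented for illustration): the same route returns a single item when itemid is supplied and a paginated listing when it is not, so the bare URL is still a valid resource.

```python
from flask import Flask, request, abort, jsonify

app = Flask(__name__)
ITEMS = {str(i): f"Item {i}" for i in range(1, 251)}  # hypothetical catalog

@app.route("/browse")
def browse():
    itemid = request.args.get("itemid")
    if itemid is not None:
        item = ITEMS.get(itemid)
        if item is None:
            abort(404)  # an id was given but matches nothing
        return jsonify(item=item)
    # No itemid: the bare URL lists the catalog instead of erroring out.
    page = int(request.args.get("page", 1))
    per_page = 20
    start = (page - 1) * per_page
    ids = sorted(ITEMS, key=int)[start:start + per_page]
    return jsonify(page=page, items=[ITEMS[i] for i in ids])
```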
Is it possible for me to scrape the data from the pop-up that appears after clicking the link? The website is https://ngodarpan.gov.in/index.php/home/statewise_ngo/61/35
Of course it's possible, it's just a table with pagination.
But you'd better check the legal part before scraping a website, moreover on a governmental one.
Yes, you have to follow exactly what the browser does. See the network behaviour in your browser's developer tools.
First, you have to send a request to https://ngodarpan.gov.in/index.php/ajaxcontroller/get_csrf in order to get a token like this: {"csrf_token":"0d1c59184c7df788dc4b8759f6da40c6"}
After that, send another POST request to https://ngodarpan.gov.in/index.php/ajaxcontroller/show_ngo_info. As parameters you have to send csrf_test_name, which equals the csrf_token, and id, which is taken from the onclick attribute of each link.
You will get JSON as the response; just parse it as you need.
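A rough sketch of that two-step flow with Python's requests library (the id value below is a placeholder for whatever appears in a link's onclick attribute):

```python
import requests

BASE = "https://ngodarpan.gov.in/index.php"

with requests.Session() as session:
    # Step 1: fetch a fresh CSRF token, e.g. {"csrf_token": "..."}.
    token = session.get(f"{BASE}/ajaxcontroller/get_csrf").json()["csrf_token"]

    # Step 2: POST the token plus the NGO id taken from the link's onclick attribute.
    resp = session.post(
        f"{BASE}/ajaxcontroller/show_ngo_info",
        data={"csrf_test_name": token, "id": "id_from_onclick"},  # placeholder id
    )

    # Step 3: parse the JSON response as needed.
    print(resp.json())
```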
Like this one: why are there so many duplicated words in the URL?
http://medicine.uiowa.edu/biochemistry/proteomics/biochemistry/biochemistry/biochemistry/biochemistry/biochemistry/node/451
Even when I add more /biochemistry segments, it still works! Can anyone explain?
I used Chrome's Network Inspector, but all browsers have this capability. When a request is made to https://medicine.uiowa.edu/biochemistry/, the response code is a nice 200. If you hit https://medicine.uiowa.edu/biochemistry/proteomics/, you'll see that you get a 301, meaning that this link has been moved permanently, and you can see that you've been redirected to just /biochemistry again.
You may also get a 304, which tells the browser that the content has not changed and it can reuse its cached copy without the server retransmitting anything. Indeed, it appears you can add any number of /proteomics or /biochemistry segments to the URL and it will go to the same place. My guess is that whoever set up the web server rules used a flawed regular expression for routing.
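That guess can be illustrated with a toy routing pattern (a hypothetical reconstruction, not the site's actual rule): a repetition group over the section names will happily match any number of duplicated segments.

```python
import re

# Hypothetical, overly permissive routing rule: any mix of known section names,
# repeated any number of times, before the final /node/<id> part.
route = re.compile(r"^/(?:(?:biochemistry|proteomics)/)+node/(\d+)$")

for path in [
    "/biochemistry/node/451",
    "/biochemistry/proteomics/node/451",
    "/biochemistry/proteomics/biochemistry/biochemistry/node/451",
]:
    match = route.match(path)
    print(path, "->", f"node {match.group(1)}" if match else "no match")
```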
Should a POST request render HTML or redirect?
I hate it when you're on a page, hit refresh, and the browser tells you that you're going to post the data again.
Yes. It should send an entity or redirect!
(Sorry, the old programming jokes have to come out sometimes).
It really depends on whether you can meaningfully give something to GET that makes sense standing on its own.
Example: I buy something, I get a page saying "thank you, yadda yadda order number, receipt, yadda".
That should be a 303 See Other redirect, so that I GET a page with that info. I can bookmark it for later, refreshing just refreshes the GET. Happy days.
There are times, though, when it only makes sense to render an immediate response, and if they refresh, to repeat the actual operation, and bookmarking is meaningless. That should not be a redirect.
For the most part, aim to have as few of the latter as possible anyway. It is, though, most useful if you have to return the user to the form because something failed; nobody wants a bookmark of a failed form, they want to fix what needs fixing and get on with it.
Note, many server-side systems (ASP etc) use 302 when you redirect from a POST, which strictly should mean that it POSTs again, but just about no browser does. Instead, be clearer:
If you want to redirect the POST again, so the POST goes to a different URI - well, don't, that has other issues - but if you really, really have to, then use 307.
If you want to follow up a POST with a GET to something explaining the result, then use 303. It unambiguously means "now do a GET".
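A minimal Post/Redirect/Get sketch (Flask, with hypothetical routes): the POST does its work and answers with a 303, so the browser follows up with a GET that is safe to refresh and bookmark.

```python
from flask import Flask, redirect, request, url_for

app = Flask(__name__)
ORDERS = {}  # hypothetical order store

@app.route("/checkout", methods=["POST"])
def checkout():
    order_id = str(len(ORDERS) + 1)
    ORDERS[order_id] = dict(request.form)
    # 303 See Other: "now do a GET" on the receipt page.
    return redirect(url_for("receipt", order_id=order_id), code=303)

@app.route("/receipt/<order_id>")
def receipt(order_id):
    order = ORDERS.get(order_id)
    return f"Thank you! Order number {order_id}: {order}"
```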
Yes, it is good practice to redirect after POST to avoid this.
This is a question about web application architecture rather than coding per se; however, I still think it belongs here as it's in the problem domain of most web developers:
My problem: I have a page on which the content is not complete (only partial content). I don't want to just return a 200 response because I want it to be clear that the content on the page is only temporary, and that a visitor (Google) should return at a later date to retrieve the correct page.
I'm not sure if there is a status code in the http specification that would be useful here.
I'm thinking about using a 302 redirect to the same URI, but I'm not sure if google will see this as gaming (I don't see why it should - no-one would 302 to the same URI on a permanent basis as the page content would be pretty much disregarded).
That's exactly what I want: For the page to be accessible - but for google to disregard the page, remember the URL and come back later to index it.
I don't want to use a meta 'no-index' tag with a 200 response as I fear this will stop the page being reindexed when the correct content is ready.
206 is the partial-content status code, but that's not what you are doing here. That's for multi-part documents. What you have here is an "under construction" type page, but only the content of the page is going to change, not the URI. So the right thing to do is just return a 200 and let Google index it.
If you don't want it indexed yet because it is not ready for the public yet then add a meta no-index like you say. Google still downloads the page and parses it to find the no-index but does not index it. Remove the no-index when you are ready and it will start indexing. You can even prompt this by submitting a new sitemap.xml file with your page in it.
Google re-indexes insanely quickly these days so don't worry too much about temp blocking a page with a meta tag.
Google will re-index the page when the content changes automatically. Or you can force an update somewhere in the webmaster tools.
Alternatively, you could have the page 302 to an alternate address with your partially completed content until such time as the page is 'finished'. Then copy the final content into your original page and take off the 302.
Error codes are reserved for error conditions; there is no such error as "this page is not in its final version". What you might want is to specify that this page becomes obsolete and invalidated at some later time. For example, the following code means the page becomes obsolete instantly:
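The snippet the answer refers to is not shown above; as an assumption about the intent, one way to mark an otherwise normal 200 response as instantly stale is an Expires date in the past plus no-cache directives (Flask used only for illustration):

```python
from flask import Flask, make_response

app = Flask(__name__)

@app.route("/draft-page")
def draft_page():
    resp = make_response("Partial content; check back later.")
    # Assumed intent: serve the page normally (200) but mark it as already
    # expired, so caches and crawlers treat it as immediately stale.
    resp.headers["Expires"] = "Thu, 01 Jan 1970 00:00:00 GMT"
    resp.headers["Cache-Control"] = "no-cache, must-revalidate"
    return resp
```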
A site has hundreds of pages, following a certain sitemap. A user can navigate to page2.aspx from page1.aspx. But if the user goes to page2.aspx directly, say through a bookmarked URL, the user should be redirected to page1.aspx.
Edit: I don't want to go in and add code to every page that needs to fulfill this need.
Note: This is not a cross-page postback scenario.
You might consider something based on Workflow, such as this: http://blogs.msdn.com/mwinkle/archive/2007/06/07/introducing-the-pageflow-sample.aspx
The WCSF team also included a pageflow application block that you can use as a standalone add-on to your application.
I guess you could check the referrer, and if there isn't one, or it isn't page1.aspx, then you could redirect back to page1.aspx.
As another answerer mentioned, you could use the Referrer header, but that can be faked by the client.
Since you don't want to modify each page, you could do something with an IHttpModule. Assuming you have some way of describing the valid page navigations, you could do something like this in the BeginRequest handler (a rough sketch follows the list):
Check the session for a list of valid pages (using a default list for first visit if none are in the session).
If this request is for an invalid page, redirect to the place the user should be.
Based on this request, set up the list of valid pages and redirect page in the session so it's ready for the next request.
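Here is a rough sketch of the same idea, expressed as generic Python WSGI middleware rather than the ASP.NET IHttpModule the answer describes; the page names, navigation map, and session handling are assumptions made purely for illustration.

```python
# Assumes some other middleware (e.g. Beaker) has already placed a persistent
# session dict in the WSGI environ; everything below is illustrative only.
VALID_NEXT = {
    None: {"/page1.aspx"},                             # first visit: only page1
    "/page1.aspx": {"/page1.aspx", "/page2.aspx"},
    "/page2.aspx": {"/page2.aspx", "/page3.aspx"},
}

class PageFlowMiddleware:
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        session = environ.get("beaker.session", {})
        current = environ.get("PATH_INFO")
        allowed = VALID_NEXT.get(session.get("last_page"), {"/page1.aspx"})
        if current not in allowed:
            # Invalid jump (e.g. a bookmarked page2.aspx): send the user back.
            start_response("302 Found", [("Location", "/page1.aspx")])
            return [b""]
        session["last_page"] = current                 # remember for next request
        return self.app(environ, start_response)
```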
I recently worked with real code that checked whether the referrer was blank and used that as a step in authorization. The idea was that users wouldn't be able to fake a referrer, but you don't need a custom browser to fake one. Users can bookmark your page to Delicious, and then delicious.com is the referrer (and not blank).
I've had real arguments about how sophisticated a user needs to be to do certain hacks, i.e. if users don't know how to set the referrer, then you can trust it. While it's true that your users are unlikely to write a custom browser, there are already Firefox add-ons to set headers, referrers, etc., and they're easy to use.
Josh has the best answer: on page2 you should check the page hit log and see if the user has recently visited page1.
I like a lot of the answers above (specifically the workflow one).
Another option, is creating each page as a usercontrol and having page1.aspx control what usercontrol gets loaded. This has the advantage of storing your workflow in a single place instead of on each page.
However, I don't think there's a magic bullet out there. It sounds like this security problem is an afterthought, or possibly reported as a bug, and you have been tasked with fixing it quickly and efficiently.
I would start weighing the answers here against their associated cost in hours. I suspect the quickest solution will be to check referrer addresses on each page. Although hackable, it is obscure, and if that risk is acceptable to you it may be the appropriate solution.