Is it a good idea to check the HTTP request query string and indicate an error once there's an unexpected parameter? - asp.net

My ASP.NET application will have to handle HTTP GET requests that will have the following URL format:
http://mySite/getStuff?id="actualIdHere"
Currently the requirement is to validate that there are no parameters in the query string except id, and to indicate an error such as "unknown parameter P passed" otherwise.
Is such a requirement a good idea? Will it interfere with some obviously valid uses of the application that I haven't thought of?

It would be better to just validate the presence of id.
Validating unknown parameters doesn't serve much of a purpose; they will just be ignored.
Just edited my answer here:
There are also tracking solutions out there that will add to your query string.
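A minimal sketch of the "validate only id" approach, written in Python rather than ASP.NET purely for illustration. The function name `extract_id` and the example tracking parameters are assumptions, not part of the original question.

```python
from urllib.parse import urlsplit, parse_qs


def extract_id(url):
    """Return the required 'id' parameter and ignore any other parameters."""
    query = urlsplit(url).query
    params = parse_qs(query)  # blank values are dropped by default
    values = params.get("id")
    if not values or len(values) != 1:
        raise ValueError("exactly one 'id' parameter is required")
    return values[0]
```

With this shape, a request such as `getStuff?id=42&utm_source=newsletter` still succeeds, while a request without id is rejected.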

One that comes to mind is web analytics.
If your application is going to be a public web site, you will want to implement some tracking of your traffic (e.g. Google Analytics).
If you want to run a marketing campaign to draw traffic to your site, you will likely need to add a few parameters (specific to the tracking system you're using) to your query string to measure the effectiveness of the campaign.

It depends on your target audience.
It is not good practice for public websites where SEO matters. For example, if you use Google Analytics or Google Ads, a user who comes to your site from search results may arrive with an extra parameter in the URL, such as gclid.
However in more protected websites it is fine.

It might affect forward compatibility. For example, if you have separate client applications or websites that call these URLs, and future versions of those clients might pass additional parameters to getStuff (such as a sort order or a backlink), hard requirements on the parameters could make it harder to roll out new versions smoothly (i.e. you cannot roll out new clients until the server is updated).
This is in addition to the traffic-tracking parameters that public websites might receive as additional input, as the other answers mention.

Related

HTTP session tracking through base URL "resource"?

A little background: We're currently trying to specify an HTTP API between a couple of vendors so that different products can easily interoperate. We're not writing any "server" software yet, nor any client, but just laying out the basics of the API so that every party can start prototyping and then we can refine it. So the typical use case is that the API is called by (thin) HTTP layers inside a given application, not from within the browser.
Communication doesn't really make sense without having session state here, so we were looking into how to track sessions typically.
Thing is, we want to keep the implementation of the API as easy as possible with as little burden as possible on any used HTTP library.
Someone proposed to manage session basically through "URL rewriting", but a little more explicit:
POST .../service/session { ... }
=> reply with 201 Created and session URL location .../service/session/{session-uuid}
subsequent requests use .../service/session/{session-uuid}/whatever
to end the session the client does DELETE .../service/session/{session-uuid}
Looking around the web, initial searches indicate this is somewhat untypical.
Is this a valid approach? Specific drawbacks or pros?
The pros we identified: (please debunk where appropriate)
Simple on the implementation, no cookie or header tracking etc. required
Orthogonal to client authentication mechanism - if authentication is appropriate, we could easily pass the URLs to a second app that could continue to use the session (valid use case in our case)
Should be safe, as we're going https exclusively for this.
Since PHPSESSID was mentioned, I stumbled upon this other question, where it is mentioned that the "session in URL" approach may be more vulnerable to session fixation attacks.
However, see the second point above: we plan to implement (or rather specify) authentication/authorization orthogonally to this session concept, so passing around the "session" URL might even be a feature, and we think we're quite fine with having the session appear in the URL.
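The session-as-resource lifecycle described above (POST to create, a per-session URL for subsequent requests, DELETE to end) can be sketched as an in-memory model. This is only an illustration under stated assumptions: the class name `SessionStore`, the base path `/service/session` (the question elides the real prefix) and the returned payloads are hypothetical, and no real HTTP server is involved.

```python
import uuid


class SessionStore:
    """In-memory stand-in for the server side of the proposed API."""

    def __init__(self, base="/service/session"):  # hypothetical base path
        self.base = base
        self.sessions = {}

    def create(self, payload):
        # POST <base> -> 201 Created plus the new session URL
        session_id = str(uuid.uuid4())
        self.sessions[session_id] = dict(payload)
        return 201, f"{self.base}/{session_id}"

    def request(self, session_url, action):
        # Any call to <base>/{session-uuid}/<action>
        session_id = session_url.rsplit("/", 1)[-1]
        if session_id not in self.sessions:
            return 404, None
        return 200, {"action": action, "state": self.sessions[session_id]}

    def delete(self, session_url):
        # DELETE <base>/{session-uuid} ends the session
        session_id = session_url.rsplit("/", 1)[-1]
        if self.sessions.pop(session_id, None) is None:
            return 404
        return 204
```

Note that anyone holding the session URL can use the session in this model, which is exactly why the authentication layer has to be handled separately.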

Is query string approach reliable?

I am looking for effective ways to bypass the cache whenever necessary. While searching, I found this link.
From the referenced post I gathered that the query string approach may not work when Squid-like proxies are used. I have not tested this.
I see that Stack Overflow itself uses the query string approach; below is a screenshot of this, captured before logging in to the site.
I would like to know whether the query string approach is a reliable way to push CSS and JS updates whenever a new software build is released.
It's reliable browser side, meaning that since the URL is different (because there's a different query parameter), it will fetch a new copy.
Server side it depends on your server. Some caching proxies may ignore query parameters for the purpose of determining URL equality. AWS CloudFront for example does so by default. That's always a configurable setting. Since, presumably, you are in charge of the server, you can configure it as needed.
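As a rough illustration of the query string approach, the sketch below appends a short content hash to an asset URL so that the URL changes whenever the file changes. Hashing the content (instead of using a build number) and the parameter name `v` are assumptions made for this example; the helper `versioned_url` is hypothetical.

```python
import hashlib


def versioned_url(path, content):
    """Append a short content hash as a query parameter to an asset URL."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    separator = "&" if "?" in path else "?"
    return f"{path}{separator}v={digest}"
```

Whether an intermediate cache treats the two URLs as different resources still depends on how that cache is configured, as noted above.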

Is it dangerous if a web resource POSTs to itself?

While reading some articles about writing web servers using Twisted, I came across this page that includes the following statement:
While it's convenient for this example, it's often not a good idea to
make a resource that POSTs to itself; this isn't about Twisted Web,
but the nature of HTTP in general; if you do this, make sure you
understand the possible negative consequences.
In the example discussed in the article, the resource is a web resource retrieved using a GET request.
My question is, what are the possible negative consequences that can arrive from having a resource POST to itself? I am only concerned about the aspects related to the HTTP protocol, so please ignore the fact that I mentioned about Twisted.
The POST verb is used for making a new resource in a collection.
This means that POSTing to a resource has no direct meaning (POST endpoints should always be collections, not resources).
If you want to update your resource, you should PUT to it.
Sometimes, you do not know if you want to update or create the resource (maybe you've created it locally and want to create-or-update it). I think that in that case, the PUT verb is more appropriate because POST really means "I want to create something new".
There's nothing inherently wrong with a page POSTing back to itself. In fact, many widely used frameworks (ASP.NET, etc.) use that method to handle various events that happen on the client: some data is posted back to the same page, where the server processes it and sends a new response.

When should one use GET instead of POST in a web application?

It seems that sticking to POST is the way to go because it results in clean looking URLs. GET seems to create long confusing URLs. POST is also better in terms of security. Good for protecting passwords in forms. In fact I hear that many developers only use POST for forms. I have also heard that many developers never really use GET at all.
So why and in what situation would one use GET if POST has these 2 advantages?
What benefit does GET have over POST?
You are correct; however, it can be better to use GET for search pages and the like, where you WANT the URLs to be obvious and discoverable. If you look at Google (or any search page), it puts the query at the end of the URL, as in www.google.com/?q=my+search, so people can link directly to the search.
You actually use GET much more than you think. Simply returning the web page is a GET request. There are also POST, PUT, DELETE, HEAD, OPTIONS and these are all used in RESTful programming interfaces.
GET vs. POST has no implications for security; both are insecure unless you use HTTPS (HTTP over SSL).
Check the manual. I'm surprised that nobody has pointed out that GET and POST are semantically different and intended for quite different purposes.
While it may appear in a lot of cases that there is no functional difference between the two approaches, until you've tested every browser, proxy and server combination you won't be able to rely on that being consistent in every case. For example, mobile devices and proxies often cache aggressively even when they are asked not to (but I've never come across one that incorrectly caches a POST response).
The protocol does not allow for anything other than simple, scalar datatypes as parameters in a GET - e.g. you can only send a file using POST or PUT.
There are also implementation constraints - last time I checked, the size of a URL was limited to around 2k in MSIE.
Finally, as you've noted, there's the issue of data visibility - you may not want to allow users to bookmark a URL containing their credit card number / password.
POST is the way to go because it results in clean looking URLs
That rather defeats the purpose of what a URL is all about. Read RFC 1630 - The Need For a Universal Syntax.
Sometimes you want your web application to be discoverable, in the sense that users can more or less guess what the URL for a certain operation should be. It gives a nicer user experience, and for this you would use GET and base your URLs on some sort of RESTful specification like http://microformats.org/wiki/rest/urls
If by 'web application' you mean 'website', as a developer you don't really have any choice. It's not you as a developer that makes the GET or POST requests, it's your user. They make the requests via their web browser.
When you request a web page by typing its URL into the address bar of the browser (or clicking a link, etc), the browser issues a GET request.
When you submit a form using a button, the browser typically makes a POST request (provided the form's method is POST).
In a GET request, additional data is sent in the query string. For example, the URL www.mysite.com?user=david&password=fish sends the two bits of data 'user' and 'password'.
In a POST request, the values in the form's controls (e.g. text boxes etc) are sent. This isn't visible in the address bar, but it's completely visible to anyone viewing your web traffic.
Both GET and POST are completely insecure unless SSL is used (e.g. web addresses beginning https).
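To make the difference concrete, here is a small sketch using Python's standard library that builds (but does not send) a GET and a POST request carrying the same two values. The host name is the one from the example above; the variable names are illustrative only.

```python
from urllib.parse import urlencode
from urllib.request import Request

data = {"user": "david", "password": "fish"}
encoded = urlencode(data)

# GET: the data travels in the query string of the URL
get_request = Request("http://www.mysite.com/?" + encoded)

# POST: the same data travels in the request body instead
post_request = Request("http://www.mysite.com/", data=encoded.encode("ascii"))

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url, post_request.data)
```

In the POST case the URL stays clean, but the body is still sent as plain text unless the connection itself is encrypted.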

Stop Direct Page Calls to Ajax Pages

Is there a "clever" way of stopping direct page calls in ASP.NET? (Page functionality, not the page itself)
By clever, I mean not having to add hashes between pages to stop AJAX pages from being called directly. In a nutshell, this is about stopping users from accessing the AJAX pages unless the request comes from one of your website's pages in a legitimate way. I understand that nothing is impossible to break; I am simply interested in seeing what other interesting methods there are.
If not, is there any way that one could do it without using sessions/cookies?
Have a look at this question: Differentiating Between an AJAX Call / Browser Request
The best answer from the above question is to check for a requested-by or custom header.
Ultimately, your web server is receiving requests (including headers) of what the client sends you - all data that can be spoofed. If a user is determined, then any request can look like an AJAX request.
I can't think of an elegant method to prevent this (there are inelegant and probably non-perfect methods whereby you provide a hash of some sort of request counter between ajax and non-ajax requests).
Can I ask why your application is so sensitive to "ajax" pages being called directly? Could you design around this?
You can check the request headers to see if the call was initiated by AJAX. Usually, you should find that x-requested-with has the value XMLHttpRequest. Or, in the case of ASP.NET AJAX, check whether ScriptManager.IsInAsyncPostBack == true. However, I'm not sure about preventing the request in the first place.
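A minimal sketch of the header check, in Python for illustration only. It assumes the request headers are available as a plain dictionary; the helper name `looks_like_ajax` is hypothetical. As the other answers point out, a client can set this header itself, so the check is a hint and not a security measure.

```python
def looks_like_ajax(headers):
    """Return True if the request carries X-Requested-With: XMLHttpRequest."""
    # Header names are case-insensitive, so normalize them before the lookup
    normalized = {name.lower(): value for name, value in headers.items()}
    return normalized.get("x-requested-with", "").lower() == "xmlhttprequest"
```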
Have you looked into header authentication? If you only want your app to be able to make ajax calls to certain pages, you can require authentication for those pages...not sure if that helps you or not?
Basic Access Authentication
or the more secure
Digest Access Authentication
Another option would be to append some sort of identifier to your URL query string in your application before requesting the page, and have some sort of authentication method on the server side.
I don't think there is a way to do it without using a session. Even if you use an Http header, it is trivial for someone to create a request with the exact same headers.
Using session with ASP.NET Ajax requests is easy. You may run into some problems, like session expiration, but you should be able to find a solution.
With sessions you will be able to guarantee that only logged-in users can access the Ajax services. When servicing an Ajax request simply test that there is a valid session associated with it. Of course a logged-in user will be able to access the service directly. There is nothing you can do to avoid this.
If you are concerned that a logged-in user may try to contact the service directly in order to steal data, you can add a rate limit to the service. For example, do not allow users to access the service more often than once per minute (or whatever rate the application needs in order to work properly).
See what Google and Amazon are doing for their web services. They allow you to contact them directly (even providing APIs to do this), but they impose limits on how many requests you can make.
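The per-user time limit suggested above could look roughly like the following sketch. It is an in-memory illustration under assumptions: the class name `RateLimiter`, the 60-second default and the injectable `clock` parameter are all hypothetical, and a real deployment would need shared storage if it runs on more than one server.

```python
import time


class RateLimiter:
    """Allow each user at most one request per min_interval seconds."""

    def __init__(self, min_interval=60.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock  # injectable so the limiter can be tested
        self.last_seen = {}

    def allow(self, user_id):
        now = self.clock()
        last = self.last_seen.get(user_id)
        if last is not None and now - last < self.min_interval:
            return False  # too soon since the last accepted request
        self.last_seen[user_id] = now
        return True
```

The service would call `allow` with the identity taken from the session before doing any work, and reject the request when it returns False.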
I do this in PHP by declaring a variable in a file that's included everywhere, and then check if that variable is set in the ajax call file.
This way, you can't directly call the file ever because that variable will never have been defined.
This is the "non-trivial" way, hence it's not too elegant.
The only real idea I can think of is to keep track of every link (as in, everything does a postback and then a Response.Redirect). This way you could keep a static List<> or similar of IP addresses (and possibly a browser ID and such) recording which pages that visitor is currently allowed to access, along with a timeout to keep them from going straight to a page 3 days from now.
I recommend rethinking your design to be sure that this is really needed though. And also note IPs and such can be spoofed.
Also if you follow this route be sure to read up about when static variables get disposed and such. You wouldn't want one of those annoying "your session has expired" messages when they have been using the site for 10 minutes.
