CSRF protection while making use of server-side caching

Situation
There is a site at examp.le that costs a lot of CPU/RAM to generate, and a leaner examp.le/backend that performs various tasks to read, write and serve user-specific data for authenticated requests. A lot of resources could be saved by utilizing a server-side cache for the examp.le site (but not for examp.le/backend) and just asynchronously grabbing all user-specific data from the backend once the page arrives at the client. (Total loading time may even be lower, despite the need for an additional request.)
Threat model
CSRF attacks. Assuming (maybe foolishly) that examp.le is reliably safeguarded against XSS code injection, we still need to consider scripts on a malicious site, exploit.me, that cause the victim's browser to run a request against examp.le/backend with their authorization cookies included automagically, causing the server to perform some kind of data mutation on behalf of the user.
Solution / problem with that
As far as I understand, the commonly used countermeasure is to include another token in the generated examp.le page. The server can verify this token is linked to the current user's session and will only accept requests that can provide it. But I assume caching won't work very well if we are baking a random token into every response from examp.le...?
So then...
I see two possible solutions: One would be some sort of "hybrid caching" where each response to examp.le is still programmatically generated, but that program just merges small dynamic parts into some cached output. This wouldn't work with caching systems that operate at the higher layers of the server stack, let alone a CDN, but it still might have its merits. I don't know if there are standard ways or libraries to do this, or more specifically if there are solutions for WordPress (which happens to be the culprit in my case).
The other (preferred) solution would be to get an initial anti-CSRF token directly from examp.le/backend. But I'm not quite clear on the implications of that. If the script on exploit.me could somehow obtain that token, the whole mechanism would make no sense to begin with. The way I understand it, if we leave exploitable browser bugs and security holes out of the picture and consider only requests coming from a non-obscure browser visiting exploit.me, then the HTTP_ORIGIN header can be absolutely trusted to be tamper-proof. Is that correct? But then that raises the question: wouldn't we get mostly the same amount of security in this scenario by only checking the authentication cookie and the Origin header, without throwing tokens back and forth?
I'm sorry if this question feels a bit all over the place, but I'm partly still in the process of getting the whole picture clear ;-)

First of all: Cross-Site Scripting (XSS) and Cross-Site Request Forgery (CSRF) are two different categories of attacks. I assume you meant to tackle the CSRF problem only.
Second of all: it's crucial to understand what CSRF is about. Consider the following.
A POST request to examp.le/backend changes some kind of crucial data.
The request to examp.le/backend is protected by authentication mechanisms, which generate valid session cookies.
I want to attack you. I do it by sending you a link to a page I have forged at cats.com/best_cats_evr.
If you are logged in to examp.le in one browser tab and you open cats.com/best_cats_evr in another, the code will be executed.
The code on the site cats.com/best_cats_evr will send a POST request to examp.le/backend. The cookies will be attached, as there is no reason why they should not be. You will perform a change on examp.le/backend without knowing it.
So, having said that, how can we prevent such attacks?
The CSRF case is very well known to the community and it makes little sense for me to write everything down myself. Please check the OWASP CSRF Prevention Cheat Sheet, as it is one of the best pages you can find in this topic.
And yes, checking the origin would help in this scenario. But checking the origin will not help if I find an XSS vulnerability in examp.le/somewhere_else and use it against you.
What would also help would be not using POST requests (simple form POSTs are sent cross-origin without a CORS preflight) but e.g. PUT, where CORS should help... But this quickly turns out to be too much rocket science for the dev team to handle, and sticking to good old anti-CSRF tokens (supported by default in every framework) should help.
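To make the token idea concrete, here is a minimal sketch of what the check on examp.le/backend could look like. It assumes an ASP.NET backend and that a random token was stored in the user's session when the session was created; the session key "CsrfToken", the "X-CSRF-Token" header and the allowed origin are illustrative names only, not anything the original answer prescribes (a WordPress backend would use its own nonce mechanism).

```csharp
// Minimal sketch of a synchronizer-token check on examp.le/backend.
// Assumes ASP.NET and that a random token was stored in the user's session
// when the session was created; "CsrfToken", "X-CSRF-Token" and the allowed
// origin are illustrative names.
using System;
using System.Web;
using System.Web.SessionState;

public class BackendHandler : IHttpHandler, IRequiresSessionState
{
    public bool IsReusable { get { return false; } }

    public void ProcessRequest(HttpContext context)
    {
        // 1) Reject cross-origin requests whenever an Origin header is present.
        string origin = context.Request.Headers["Origin"];
        if (origin != null &&
            !origin.Equals("https://examp.le", StringComparison.OrdinalIgnoreCase))
        {
            context.Response.StatusCode = 403;
            return;
        }

        // 2) Compare the token sent by the client with the one stored server-side.
        //    (A constant-time comparison would be preferable in production.)
        string expected = context.Session["CsrfToken"] as string;
        string provided = context.Request.Headers["X-CSRF-Token"]
                          ?? context.Request.Form["csrf_token"];
        if (expected == null || provided == null || expected != provided)
        {
            context.Response.StatusCode = 403;
            return;
        }

        // ... perform the state-changing operation for the authenticated user ...
    }
}
```

The Origin check in step 1 covers most modern browsers; the token comparison in step 2 is the primary defence recommended by the OWASP cheat sheet referenced above.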

Related

HTTP session tracking through base URL "resource"?

A little background: We're currently trying to specify an HTTP API between a couple of vendors so that different products can easily interoperate. We're not writing any "server" software yet, nor any client; we're just laying out the basics of the API so that every party can start prototyping and then we can refine it. So the typical use case for this API would be being used by (thin) HTTP layers inside a given application, not from within the browser.
Communication doesn't really make sense without having session state here, so we were looking into how to track sessions typically.
Thing is, we want to keep the implementation of the API as easy as possible with as little burden as possible on any used HTTP library.
Someone proposed to manage the session basically through "URL rewriting", but a little more explicitly (a rough client-side sketch follows the steps below):
POST .../service/session { ... }
=> reply with 201 Created and session URL location .../service/session/{session-uuid}
subsequent requests use .../service/session/{session-uuid}/whatever
to end the session the client does DELETE .../service/session/{session-uuid}
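As a rough sketch of what a client of this flow could look like, here is a minimal .NET HttpClient example; the base address and the empty JSON payload are placeholders, and the endpoint paths simply follow the proposal above.

```csharp
// Sketch of a client exercising the proposed session-in-URL flow.
// The base address and the empty JSON payload are placeholders.
using System;
using System.Net.Http;
using System.Threading.Tasks;

class SessionApiDemo
{
    static async Task Main()
    {
        using (var http = new HttpClient { BaseAddress = new Uri("https://api.example.com/") })
        {
            // POST .../service/session  =>  201 Created, Location: .../service/session/{uuid}
            var created = await http.PostAsync("service/session",
                                               new StringContent("{}"));
            Uri sessionUrl = created.Headers.Location;

            // Subsequent requests are rooted at the session URL.
            var reply = await http.GetAsync(sessionUrl + "/whatever");
            Console.WriteLine((int)reply.StatusCode);

            // DELETE .../service/session/{uuid} ends the session.
            await http.DeleteAsync(sessionUrl);
        }
    }
}
```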
Looking around the web, initial searches indicate this is somewhat untypical.
Is this a valid approach? Specific drawbacks or pros?
The pros we identified: (please debunk where appropriate)
Simple on the implementation, no cookie or header tracking etc. required
Orthogonal to client authentication mechanism - if authentication is appropriate, we could easily pass the URLs to a second app that could continue to use the session (valid use case in our case)
Should be safe, as we're going https exclusively for this.
Since PHPSESSID was mentioned, I stumbled upon this other question, where it is mentioned that the "session in URL" approach may be more vulnerable to session fixation attacks.
However, see the 2nd bullet above: We plan to implement/specify authentication/authorization orthogonally to this session concept, so passing around the "session" URL might even be a feature, so we think we're quite fine with having the session appear in the URL.

Is it possible using ASP.NET to globally block all cookies (including 3rd-party ones) that are dropped when someone is on my site?

The context of this is around the much hyped EU Privacy law which makes it illegal for a site to drop any "non-essential" cookies unless the user has "opted in" to this.
My specific challenge is due to the complexity of the site and the variety of different ways cookies are being dropped - particularly where governed by a CMS that has allowed marketeers to run riot and embed all sorts of content in different places - mostly around 3rd-party cookies where there is embedded JavaScript, img pixels, iframes, etc. (I'm speculating these all allow the dropping of 3rd-party cookies, having briefly browsed key areas of the site using a Firefox plugin - I haven't checked the mechanisms of each yet).
So, I've been trying to think whether in ASP.NET there would be a way to globally intercept and block all cookies that get dropped by my site should I need to, and also extend this to check whether they are essential or not, and if not, whether the user has already agreed to having cookies dropped (which would probably consist of a master YES cookie).
There are several things I am unclear about. First - would it be possible to use Response.Filter or Response.Cookies as a pipeline step to strip out any cookies that have already been dropped? Secondly - would it be possible for this to intercept any kind of cookie whatsoever, or is it going to be impossible to catch some of the 3rd-party ones if they result from browser requests going from the client to the 3rd-party server directly?
The closest thing I could find that resembles my question is this but that looks like a sitewide solution - not user specific.
A reverse proxy with URL rewriting could probably do this for you, if you spend the time tracking down the resources, implement the heavy hammer of allow/disallow cookies, and rewrite 3rd-party URLs to go through your reverse proxy. Then you can hijack and modify their Set-Cookie responses. In addition, if they set cookies on the client through JavaScript, those would be on your server/domain, so you would have control over whether they are forwarded or not.
This is not a simple solution but it should be possible and could be implemented without changing the application or the user experience.
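Complementing the reverse-proxy idea: for the cookies your own ASP.NET application sets (the Response.Cookies part of the question), a rough sketch of an IHttpModule that drops non-essential first-party cookies unless a consent cookie is present might look like the following. The cookie names and the consent check are illustrative assumptions, and this cannot touch 3rd-party cookies set directly by the browser against other domains.

```csharp
// Sketch of an IHttpModule (registered in web.config) that strips
// non-essential first-party cookies unless the user has opted in.
// The cookie names below are illustrative assumptions.
using System;
using System.Linq;
using System.Web;

public class CookieConsentModule : IHttpModule
{
    private static readonly string[] Essential = { "ASP.NET_SessionId", "cookie_consent" };

    public void Init(HttpApplication app)
    {
        // Runs just before headers are flushed, after the app has queued its cookies.
        app.PreSendRequestHeaders += (sender, e) =>
        {
            var ctx = ((HttpApplication)sender).Context;
            if (ctx.Request.Cookies["cookie_consent"] != null)
                return;                                   // user opted in, leave everything

            // Rebuild the outgoing cookie collection with only the essential ones.
            var keep = ctx.Response.Cookies.AllKeys
                .Where(k => Essential.Contains(k, StringComparer.OrdinalIgnoreCase))
                .Select(k => ctx.Response.Cookies[k])
                .ToList();
            ctx.Response.Cookies.Clear();
            foreach (var cookie in keep)
                ctx.Response.Cookies.Add(cookie);
        };
    }

    public void Dispose() { }
}
```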

Why do I need to perform server side validation?

Thanks to everyone who commented or posted an answer! I've kept my original question and update below for completeness.
[Feb 16, 2011 - Update 2] As some people point out - my question should have been: Given a standard asp.net 4 form, if I don't have any server side validation, what types of malicious attacks am I susceptible to?
Here is my take away on this issue.
If data isn't sensitive (comments on a page) - from an asp.net security standpoint, following standard best practices (SqlParameters, request validation enabled, etc) will protect you from malicious attacks.
For sensitive data/applications - it's up to you to decide what type of server side validation is appropriate for your application. You need to think through the end-to-end solution (web services, other systems, etc). You can view a number of suggestions below - whitelist validation, etc.
If you are using ajax (xhr requests) to post user input you need to reproduce the protection from the other bullets in your code on the server. Again, lots of solutions below – like ensuring that the data does not contain any html/code, etc. (side note: the .net framework requestValidationMode="4.0" does afford some protection in this regard - but I can't speak to how complete a solution it is)
Please feel free to continue to comment...if any of the above is incorrect please let me know. Thanks!
[Feb 3, 2011 - Update 1] I want to thank everyone for their answers! Perhaps I should ask the reverse question:
Assume a simple asp.net 4.0 web form (formview + datasource with request validation enabled) that allows logged in users to post comments to a public page (comments stored in sql server db table). What type of data validation or cleansing should I perform on the new "comments" on the server side?
[Jan 19, 2011 - Original Question] Our asp.net 4 website has a few forms where users can submit data and we use jquery validate on the client side. Users have to be logged in with a valid account to access these forms.
I understand that our client side validation rules could easily be bypassed and clients could post data without required fields, etc. This doesn’t concern me very much - users have to be logged in and I don’t consider our data very “sensitive” nor would I say any of our validation is “critical”. The input data is written to the database using SqlParameters (to defend against sql injection) and we depend on asp.net request validation to defend against potentially dangerous html input.
Is it really worth our time to rewrite the various jquery validation rules on the server? Specifically how could a malicious user compromise our server or what specific attacks could we be open to?
I apologize as it appears that this question has been discussed a few times on this site – but I have yet to find an answer that cites specific risks or issues with not performing server side validation. Thanks in advance
Hypothetical situation:
Let's say you have a zip code field. On the client-side you validate that it must be in a "00000" or "00000-0000" pattern. Since you're allowing a hyphen, you decide to store the field as a varchar in the database.
So, some evil user comes along and decides to bypass all of your client-side validation and submit something that's not in the correct format and makes it past the request validation.
OK, no big deal... you're encoding it before displaying it back to the user later anyway.
But what else are you doing with that zip code? Are you submitting it to web service for some sort of lookup? Are you uploading it to a GPS device? Will it ever be interpreted by something else in the future? Does your zipcode field now contain some JSON or something else weird?
Or something like this: http://www.businessinsider.com/livingsocial-server-flaw-2011-1
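To make the hypothetical concrete, a minimal server-side re-check of that zip code pattern (purely illustrative, shown in C#) could be as simple as:

```csharp
// Server-side re-check of the "00000" / "00000-0000" zip pattern, so a request
// that skipped the client-side validation is still rejected before storage.
using System.Text.RegularExpressions;

public static class ZipValidator
{
    private static readonly Regex ZipPattern =
        new Regex(@"^\d{5}(-\d{4})?$", RegexOptions.Compiled);

    public static bool IsValid(string zip)
    {
        return !string.IsNullOrEmpty(zip) && ZipPattern.IsMatch(zip);
    }
}
```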
Security is a dependability attribute that is defined as the probability that the system resists an attack, or in other words the probability that a fault is not maliciously activated.
In order to implement security, you must perform a threat analysis. Complex computer systems are subject to deeper analyses (think about an aircraft's or a control tower's equipment) as they become more critical and threats put business or human lives at risk.
You can perform your own threat analysis by asking yourself: what happens if a user bypasses validation?
Two groups of answers, by examples:
Group 1 (critical)
The user can buy articles paying less than their price
Information about other users can be revealed to the user
The user obtains privileges he/she is not supposed to have
Group 2 (non critical)
The user is shown inconsistent data on the next page
Processing continues, but the inconsistency leads to an error that requires human intervention
The user's data (but only that user's, not others') gets compromised
A strange error page is returned to the user, with lots of technical information that cannot be used anyway
In the first case, you must definitely fix your validation problem, because you could lose money after an attack, or lose the trust of your public (think about forging Facebook URLs and showing someone's photos even if you are not mutually friends).
In the second case, if you are sure that an inconsistent field doesn't put your business or the data at risk, you may still decide not to fix it.
The real problem is
How do you prove that any inconsistent data sent to your website is never going to have any consequence on the system that may pose a threat?
That's why you lose less time fixing your validation than reasoning about whether you can skip it.
Honestly, users don't care what you consider "sensitive" or "critical" data. Those criteria are up to them to decide.
I know that if I was a user of your application and I saw my data change without me directly doing something to cause the change...I would close my account up as fast as possible. It would be readily apparent that your system wasn't secure and none of my data was safe.
Keep in mind that you're forcing people to log in so you at least have their passwords somewhere. Whether or not they are easily accessed, a breach is a breach and I have lost my trust.
So... while you may not consider an input injection attack important, your users will, and that is why you should still do server-side input validation.
Your data may not be worth much, that's fine by me.
BUT, attackers could inject CSRF "cross-site request forgery" attack code into your application; users of your site may have their data at other sites compromised. Yes, it would require those 'other sites' to have bugs, but that happens. Yes, it would require that users not use the 'logout' buttons on those sites, but not enough people use them. Think of all the tasty data your users have stored at other web sites. You wouldn't want something bad to happen to your users.
Attackers could inject HTML that invites users to download and install 'plugins necessary for viewing this content' -- plugins that are keyloggers, or search hard drives for credit card numbers or tax filings. Maybe a plugin to become spambots or porn hosts. Your users trust your site to not recommend plugins that are owned by the Yakuza, right? They might not feel friendly if your site recommends installing evil things.
Depending upon what kinds of bugs invalid data might trigger, you might find yourself a spambot or a porn host. It heavily depends on how defensively you have coded other aspects of your application. Too many applications blindly trust input data.
And the best part: your users aren't human. Your users are browsers, which might be executing attacks supplied by other sites that didn't bother to perform good input validation and output sanitizing. Your users are viruses or worms that happen to find you by chance or by design. You might trust the individuals, but how far do you trust their computers? Me, not very far.
Please write applications to be as secure as you can -- you may put a large button on the front page to drop all users' data if you want -- but please don't intentionally write insecure programs.
This is an excellent and brave question. The short (and possibly brave) answer is: you don't. If you are aware of all the security vulnerabilities and you still don't believe it's necessary, then that's your choice.
It really depends on who your users are, who the site is exposed to (in terms of intranet or internet) and how easy it is to obtain an account. You say that your data is not sensitive yet you still require users to log in. How bad would it be if an unauthorised user were to access the system by hopping on another user's machine whilst they were elsewhere?
Bear in mind that relying on the request validation to look for malicious input can never be proved to be 100% safe so security is usually done at multiple levels with a fair bit of redundancy.
However it has to be your choice and you are doing the right thing to find out the consequences of leaving this out.
I believe that you need to validate both on the client side and on the server side, and here's why.
On the client side, you are often saving the user from submitting data that is obviously wrong. They have not filled in a required field. They have put letters in a field that is only supposed to contain numbers. They have provided a date in the future when only a date in the past will do (such as date of birth). And so on. By preventing these kinds of mistakes on the client side, you are avoiding user frustration, and also reducing the number of unnecessary hits to your web server.
On the server side, you should generally repeat all of the validation that you did on the client side. That is because, as you have observed, clever users can get around client-side validation and submit invalid data. In addition, there is some validation that is inefficient or impossible to do on the client side. Sometimes, you check that the data entry adheres to business rules. You might check it against existing data in the database. If you just let users enter anything (especially omitting required fields), the website won't function properly for them.
Check out the Tamper Data extension for Firefox. You can feed the server anything you want very easily.
Anyone performing HTTP POSTs to your server via your web site (with jQuery validation) can also perform HTTP POSTs via some other means that bypasses the jQuery validation. For example, I could use System.Net.HttpWebRequest to POST some data to your server, with the appropriate cookies, that injects malicious content into the form fields. I'd have to set up the __EVENTVALIDATION and __VIEWSTATE fields correctly, but if I succeeded, I'd be bypassing the validation.
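A hedged sketch of that point, with placeholder URL, cookie and field values (nothing here is specific to any real site); it just shows that the request can be assembled entirely outside the browser:

```csharp
// Sketch: any client can craft its own POST, ignoring your jQuery rules.
// URL, cookie name/value and field contents are placeholders.
using System;
using System.IO;
using System.Net;
using System.Text;

class RawPostDemo
{
    static void Main()
    {
        var request = (HttpWebRequest)WebRequest.Create("https://example.com/comments.aspx");
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";

        // Attach a valid authentication cookie (the attacker's own account will do).
        request.CookieContainer = new CookieContainer();
        request.CookieContainer.Add(
            new Cookie(".ASPXAUTH", "attackers-own-session-value", "/", "example.com"));

        // Whatever the form would have sent, but with values jQuery would have rejected.
        byte[] body = Encoding.UTF8.GetBytes("Comment=value+that+fails+the+client-side+rules");
        using (Stream s = request.GetRequestStream())
            s.Write(body, 0, body.Length);

        using (var response = (HttpWebResponse)request.GetResponse())
            Console.WriteLine((int)response.StatusCode);
    }
}
```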
If you don't have server-side data validation, then you are effectively not validating the inputs at all. The jQuery validation is nice for user experience but not a real line of defense.
This is especially so with inputs like a free-form comments field. You definitely want to ensure that the field does not contain HTML or other malicious script. As an extra measure of defense, you should also escape the comment content when it is displayed in your web app with a library like AntiXss (see http://wpl.codeplex.com/).
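As a small illustration of output encoding (the answer suggests the AntiXss library; the plain framework HttpUtility.HtmlEncode is used here only to show the idea):

```csharp
// Output encoding at display time; the stored value stays as the user typed it.
// (The answer suggests the AntiXss library; HttpUtility.HtmlEncode from System.Web
// is shown here only to illustrate the idea.)
using System;
using System.Web;

class EncodeDemo
{
    static void Main()
    {
        string comment = "<b>nice</b> post <script>...</script>";
        Console.WriteLine(HttpUtility.HtmlEncode(comment));
        // Prints the markup with <, > and & escaped, so a browser renders it as text.
    }
}
```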
In terms of client-side vs. server-side validation, my opinion is that client-side validation is just to make sure the form is filled in correctly; a user could tamper with the form and bypass the checks you do in JavaScript.
On the server side you can actually make sure that you really want to store this data, validate it in a more in-depth manner, and check related database tables to ensure that your database stays consistent with whatever data set you get from the client. I would even say that the server side is more important than the client side, in terms of not showing the user what you look for in the form and how you validate the data.
To summarize, I recommend validation on both sides, but if I had to choose between the two I would recommend server-side validation, even though that could mean your server ends up performing additional validations that client-side checks could have filtered out earlier.
To answer your second question:
You need to use a whitelist to keep malicious input out of the incoming comments.
The .NET Framework request validation does a very good job of stopping XSS payloads in incoming POST requests. It may not, however, prevent other malicious or mischievous HTML from getting into the comments (image tags, hyperlinks, etc.).
So, if possible, I would set up whitelist validation on the server side for allowed characters. A regex should cover this just fine. You should allow A-Za-z0-9, whitespace, and a few punctuation marks. If the regex fails to match, return an error message to the user and stop the transaction. Regarding SQL injection: I would allow apostrophes through in this case (unless you like terrible grammar in your comments), but put code comments around your parameterized SQL queries to the effect of: "This is the only protection against SQL injection, so be careful when modifying." You should also lock down the permissions of the database account used by the web process (read/write only, not database owner permissions). What I wouldn't do is try to do blacklist validation on the input, as that is very time-consuming to do correctly (see RSnake's XSS Cheat Sheet at http://ha.ckers.org/xss.html for an idea of the number of things you would need to prevent just for XSS).
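A minimal sketch of such a whitelist check, with an example character set and length cap that you would tune to your own requirements:

```csharp
// Whitelist check for the comment field: letters, digits, whitespace and a few
// punctuation marks, with a length cap. The exact character set and length are
// examples to be tuned, not a recommendation.
using System.Text.RegularExpressions;

public static class CommentValidator
{
    private static readonly Regex Allowed =
        new Regex(@"^[A-Za-z0-9\s.,!?'""()-]{1,500}$", RegexOptions.Compiled);

    public static bool IsValid(string comment)
    {
        return comment != null && Allowed.IsMatch(comment);
    }
}
```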
Between the .NET framework and your own whitelist validation you should be safe from HTML-based attacks such as XSS and CSRF*. SQL injection will be prevented by using parameterized queries. If the comment data touches any other assets you may need to put more controls in place, but those cover the attacks relevant to the basic data submission form you've outlined.
Also, I wouldn't try to "cleanse" the data at all. It is very difficult to do properly, and users (as was mentioned above) hate it when their data is modified without their permission. It is more secure and more usable to give users a clear error message when your data validation fails. If you put their comment back on the page for them to edit, HTML-encode the output so you aren't vulnerable to a reflected XSS attack.
And as always, OWASP.org (http://www.owasp.org) is a good reference for all things webappsec related. Check out their Top Ten and Development Guide projects.
*CSRF may not be a direct concern of yours, as fraudulent posts to your site may not matter to you, but preventing XSS has the side benefit of keeping CSRF payloads targeting other sites from being hosted from your site.

Stop Direct Page Calls to Ajax Pages

Is there a "clever" way of stopping direct page calls in ASP.NET? (Page functionality, not the page itself)
By clever, I mean not having to add hashes between pages to stop AJAX pages being called directly. In a nutshell, this is stopping users from accessing the Ajax pages without the request coming from one of your website's pages in a legitimate way. I understand that nothing is impossible to break; I am simply interested in seeing what other interesting methods there are.
If not, is there any way that one could do it without using sessions/cookies?
Have a look at this question: Differentiating Between an AJAX Call / Browser Request
The best answer from the above question is to check for a requested-by or custom header.
Ultimately, your web server only receives the requests (including headers) that the client sends you - all data that can be spoofed. If a user is determined, then any request can look like an AJAX request.
I can't think of an elegant method to prevent this (there are inelegant and probably non-perfect methods whereby you provide a hash of some sort of request counter between ajax and non-ajax requests).
Can I ask why your application is so sensitive to "ajax" pages being called directly? Could you design around this?
You can check the request headers to see if the call is initiated by AJAX. Usually, you should find that X-Requested-With has the value XMLHttpRequest. Or in the case of ASP.NET AJAX, check whether ScriptManager.IsInAsyncPostBack == true. However, I'm not sure about preventing the request in the first place.
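A small sketch of that header check; as noted elsewhere in this thread, the header is trivially spoofable, so treat it as a convenience filter rather than a security boundary:

```csharp
// Flag requests that do not carry the header jQuery / ASP.NET AJAX adds.
// The header is trivially spoofable, so this is a convenience filter only.
using System;
using System.Web;

public static class AjaxCheck
{
    public static bool LooksLikeAjax(HttpRequest request)
    {
        return string.Equals(request.Headers["X-Requested-With"],
                             "XMLHttpRequest",
                             StringComparison.OrdinalIgnoreCase);
    }
}
```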
Have you looked into header authentication? If you only want your app to be able to make ajax calls to certain pages, you can require authentication for those pages...not sure if that helps you or not?
Basic Access Authentication
or the more secure
Digest Access Authentication
Another option would be to append some sort of identifier to your URL query string in your application before requesting the page, and have some sort of authentication method on the server side.
I don't think there is a way to do it without using a session. Even if you use an Http header, it is trivial for someone to create a request with the exact same headers.
Using session with ASP.NET Ajax requests is easy. You may run into some problems, like session expiration, but you should be able to find a solution.
With sessions you will be able to guarantee that only logged-in users can access the Ajax services. When servicing an Ajax request simply test that there is a valid session associated with it. Of course a logged-in user will be able to access the service directly. There is nothing you can do to avoid this.
If you are concerned that a logged-in user may try to contact the service directly in order to steal data, you can add a time limit to the service. For example, do not allow users to access the service more often than once per minute (or whatever rate is needed for the application to work properly).
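A rough sketch of that throttling idea using the ASP.NET session; the session key and the one-minute window are arbitrary example values:

```csharp
// Per-session throttle: reject calls that arrive less than a minute after the
// previous one. The session key and the window are arbitrary example values.
using System;
using System.Web;

public static class AjaxThrottle
{
    public static bool Allow(HttpContext context)
    {
        var last = context.Session["LastAjaxCall"] as DateTime?;
        if (last.HasValue && DateTime.UtcNow - last.Value < TimeSpan.FromMinutes(1))
            return false;                       // too soon - reject or defer

        context.Session["LastAjaxCall"] = DateTime.UtcNow;
        return true;
    }
}
```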
See what Google and Amazon are doing for their web services. They allow you to contact them directly (even providing APIs to do this), but they impose limits on how many requests you can make.
I do this in PHP by declaring a variable in a file that's included everywhere, and then checking whether that variable is set in the Ajax call file.
This way, you can't directly call the file ever because that variable will never have been defined.
This is the "non-trivial" way, hence it's not too elegant.
The only real idea I can think of is to keep track of every link (as in, everything does a postback and then a Response.Redirect). In this way you could keep a static List<> or something of IP addresses (and possibly browser ID and such) that says which pages are allowed to be accessed at the moment by that visitor, along with a timeout, to keep them from going straight to a page 3 days from now.
I recommend rethinking your design to be sure that this is really needed though. And also note IPs and such can be spoofed.
Also if you follow this route be sure to read up about when static variables get disposed and such. You wouldn't want one of those annoying "your session has expired" messages when they have been using the site for 10 minutes.

Why shouldn't data be modified on an HTTP GET request?

I know that using non-GET methods (POST, PUT, DELETE) to modify server data is The Right Way to do things. I can find multiple resources claiming that GET requests should not change resources on the server.
However, if a client were to come up to me today and say "I don't care what The Right Way to do things is, it's easier for us to use your API if we can just call URLs and get some XML back - we don't want to have to build HTTP requests and POST/PUT XML," what business-conducive reasons could I give to convince them otherwise?
Are there caching implications? Security issues? I'm kind of looking for more than just "it doesn't make sense semantically" or "it makes things ambiguous."
Edit:
Thanks for the answers so far regarding prefetching. I'm not as concerned with prefetching, since this is mostly about internal network API use and not visitable HTML pages that would have links that could be prefetched by a browser.
Prefetch: A lot of web browsers use prefetching, which means they will load a page before you click on the link, anticipating that you will click it later.
Bots: There are several bots that scan and index the internet for information. They will only issue GET requests. You don't want to delete something from a GET request for this reason.
Caching: GET HTTP requests should not change state, and they should be idempotent. Idempotent means that issuing a request once or issuing it multiple times gives the same result; on top of that, a GET is expected to have no side effects at all. For this reason GET HTTP requests are tightly tied to caching.
HTTP standard says so: The HTTP standard says what each HTTP method is for. Several programs are built to use the HTTP standard, and they assume that you will use it the way you are supposed to. So you will get undefined behavior from a slew of random programs if you don't follow it.
How about Google finding a link to that page with all the GET parameters in the URL and revisiting it every now and then? That could lead to a disaster.
There's a funny article about this on The Daily WTF.
GETs can be forced on a user and result in Cross-site Request Forgery (CSRF). For instance, if you have a logout function at http://example.com/logout.php, which changes the server state of the user, a malicious person could place an image tag on any site that uses the above URL as its source: http://example.com/logout.php. Loading this code would cause the user to get logged out. Not a big deal in the example given, but if that was a command to transfer funds out of an account, it would be a big deal.
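A minimal sketch of the usual fix for that example: make the state change reachable only via POST (an img tag can only issue a GET), ideally combined with the anti-CSRF tokens discussed earlier. The handler below is illustrative, not taken from the original answer:

```csharp
// Illustrative handler: the state change (logout here) is only reachable via POST,
// so an <img src="..."> tag, which can only issue a GET, can no longer trigger it.
using System.Web;
using System.Web.SessionState;

public class LogoutHandler : IHttpHandler, IRequiresSessionState
{
    public bool IsReusable { get { return false; } }

    public void ProcessRequest(HttpContext context)
    {
        if (context.Request.HttpMethod != "POST")
        {
            context.Response.StatusCode = 405;   // Method Not Allowed
            return;
        }

        context.Session.Abandon();               // the actual state change
        context.Response.Redirect("/");
    }
}
```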
Good reasons to do it the right way...
They are industry standard, well documented, and easy to secure. While you fully support making life as easy as possible for the client, you don't want to implement something that's easier in the short term in preference to something that's not quite so easy for them but offers long-term benefits.
One of my favourite quotes
Quick and Dirty... long after the Quick has departed, the Dirty remains.
For you this one is a "A stitch in time saves nine" ;)
Security:
CSRF is so much easier in GET requests.
Using POST won't protect you by itself, but GET allows easier exploitation and mass exploitation via forums and other places which accept image tags.
Depending on what you do on the server side, using GET can help an attacker launch a DoS (Denial of Service). An attacker can spam thousands of websites with your expensive GET request in an image tag, and every single visitor of those websites will carry out this expensive GET request against your web server, costing you lots of CPU cycles.
I'm aware that some pages are heavy anyway and this is always a risk, but it's a bigger risk if you add 10 big records in every single GET request.
Security for one. What happens if a web crawler comes across a delete link, or a user is tricked into clicking a hyperlink? A user should know what they're doing before they actually do it.
I'm kind of looking for more than just "it doesn't make sense semantically" or "it makes things ambiguous."
...
I don't care what The Right Way to do things is, it's easier for us
Tell them to think of the worst API they've ever used. Can they not imagine how that was caused by a quick hack that got extended?
It will be easier (and cheaper) in 2 months if you start with something that makes sense semantically. We call it the "Right Way" because it makes things easier, not because we want to torture you.
