HTTP redirects caused by public hotspots

When a user connects to a public hotspot for the first time, they often get back a welcome page with a login form instead of the page they requested. The same thing can happen when we request a page from code, which effectively corrupts our data. We would expect this kind of captive portal to always answer with a "302 redirect", but hard evidence that it always does is difficult to come by, and we suspect some users are receiving altered content with a 200 status from hotspots in the wild.
Does anyone know what the correct behaviour of a hotspot is, and how to keep its pages out of our data flow?
The same concern applies to proxies that downsample images over 3G connections and the like. We have resorted to hashing our files and running an integrity check to throw out altered data, but this feels like a very heavyweight solution.
Note: we're happy with the data being proxied, we just want it left unaltered.
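For reference, the hashing approach mentioned above can be quite small. Here is a minimal Python sketch (the URL handling and the idea of publishing an expected SHA-256 digest alongside each file are assumptions for illustration): any response whose digest does not match the expected value is thrown away as altered.

    import hashlib
    import urllib.request

    def fetch_verified(url: str, expected_sha256: str, timeout: float = 10.0) -> bytes:
        """Download a resource and reject it if a portal or proxy altered it."""
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read()
        digest = hashlib.sha256(body).hexdigest()
        if digest != expected_sha256:
            # The payload was modified in transit (captive-portal page,
            # image-recompressing proxy, ...) -- treat it as missing data.
            raise ValueError(f"integrity check failed for {url}: got {digest}")
        return body

The expected digests would have to be distributed over a channel you trust, for example baked into the application or fetched from a signed manifest.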

The correct status code for this case would be 511 Network Authentication Required (http://greenbytes.de/tech/webdav/rfc6585.html#status-511).
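Independent of whether a given hotspot actually sends 511, a common client-side defence is to probe a URL you control whose response is known in advance, and treat anything else as interference. A rough Python sketch; the probe URL and expected body are invented for the example, and plain HTTP is used on purpose so that rewriting is observable:

    import urllib.error
    import urllib.request

    PROBE_URL = "http://example.com/connectivity-probe"  # hypothetical endpoint you control
    EXPECTED_BODY = b"ok"                                 # its known, fixed response body

    def behind_captive_portal() -> bool:
        """True if a portal or proxy is intercepting or rewriting responses."""
        try:
            with urllib.request.urlopen(PROBE_URL, timeout=5) as resp:
                # An untouched path returns our endpoint's exact body
                # (redirects to a portal login page are followed and fail this check).
                return resp.read() != EXPECTED_BODY
        except urllib.error.HTTPError:
            # Includes 511 Network Authentication Required (RFC 6585); any error
            # status from our own probe endpoint also counts as interference.
            return True
        except OSError:
            return True  # no connectivity at all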

Related

Drop a connection without sending any response in axum

Is it possible to not send any response to an incoming request using axum?
I am trying to implement a blacklist, and while it's certainly possible to just return a 404 (you might make an argument for a different error code), if an address has earned a spot on the blacklist, I'd rather not devote the server resources or courtesy (minimal as they may be) to actually spitting out a response; I'd rather just drop the request on the floor, as it were.
The documentation mentions that errors that bubble up to the hyper layer will cause hyper to close the connection without sending a response; it goes on to explain that this is why every single service/handler/etc. in axum must return an Infallible error type. While I agree that closing the connection is (in the docs' terminology) "generally not desirable", I believe that in this particular circumstance it is specifically desirable to drop the incoming request, and hope this is possible, but can't figure out how to do it.
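This isn't an axum answer, but as a language-agnostic illustration of the behaviour being asked for, here is a small asyncio sketch in Python: blacklisted peers have their connection closed before any HTTP response is written. The blacklist contents and port are made up for the example.

    import asyncio

    BLACKLIST = {"203.0.113.7"}  # example addresses only

    async def handle(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
        peer_ip = writer.get_extra_info("peername")[0]
        if peer_ip in BLACKLIST:
            writer.close()                   # drop the connection: no status line, no body
            await writer.wait_closed()
            return
        await reader.readuntil(b"\r\n\r\n")  # read (and ignore) the request head
        writer.write(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
        await writer.drain()
        writer.close()
        await writer.wait_closed()

    async def main() -> None:
        server = await asyncio.start_server(handle, "0.0.0.0", 8080)
        async with server:
            await server.serve_forever()

    if __name__ == "__main__":
        asyncio.run(main())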

Is there any reason a website should return appropriate HTTP status codes?

I'm working on a small web application, and I'm trying to decide if I should make the effort to emit semantically appropriate HTTP status codes from within the application.
I mean, it makes sense for the web server itself to emit proper response codes. 500 Internal Server Error for a misconfigured Apache or 404 Not Found for a missing index.php or whatever all make sense, since there's nothing else the server can really do.
It also makes sense to manipulate the browser with 303 See Other or other HTTP mechanisms which actually produce behavior.
But if all that happened is a missing GET parameter, for example, is there any reason to go out of my way to return 400 Bad Request? Or how about 404 Not Found, if my application is handling all the routing by itself? From what I can tell there isn't any behavior associated with either of those error codes.
My general opinion: provide codes if the code provides actionable data for the user.
If all you're doing is presenting content, then in most cases I think it's less important. If YouTube fails to load a video, I mostly care about the fact that I can't watch my video. That it failed with a 418 status might be intellectually interesting, but it doesn't really provide me with any helpful information (even assuming a non-silly failure code).
On the other hand, if you're allowing some kind of user interaction with a server, then the codes become much more important. I might actually care about why my request failed, because I'm now in a position to do something about it.
However, there are some codes that are actionable. 410 Gone for example: If my request failed for that reason, but I just got back a generic "Stuff Broke" message, I'd probably repeat the request a bunch of times, get nowhere, and give up in frustration. Knowing that the thing I'm looking for doesn't exist is a pretty useful thing for me to know.
I think it's very important for a web service to respond with appropriate codes, as sometimes the developer using the service might not know what's wrong or why the app stopped working unless they check the status code.
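To make the discussion above concrete, here is a minimal WSGI sketch in Python; the route, parameter name, and data store are invented for the example. It returns 400 when a required GET parameter is missing, 404 when the application's own routing finds no such resource, and 200 otherwise.

    from urllib.parse import parse_qs
    from wsgiref.simple_server import make_server

    ARTICLES = {"1": "Hello, world"}  # stand-in data store

    def app(environ, start_response):
        params = parse_qs(environ.get("QUERY_STRING", ""))
        article_id = params.get("id", [None])[0]
        if article_id is None:
            # Missing required GET parameter: the request itself is malformed.
            start_response("400 Bad Request", [("Content-Type", "text/plain")])
            return [b"missing 'id' parameter"]
        body = ARTICLES.get(article_id)
        if body is None:
            # The application handles its own routing, but the resource still does not exist.
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"no such article"]
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [body.encode()]

    if __name__ == "__main__":
        make_server("", 8000, app).serve_forever()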

Why is using a HTTP GET to update state on the server in a RESTful call incorrect?

OK, I already know all the on-paper reasons why I should not use an HTTP GET to update the state of something on the server in a RESTful call, returning possibly different data each time. I know this is wrong for the following 'on paper' reasons:
HTTP GET calls should be idempotent
N > 0 calls should always GET the same data back
Violates HTTP spec
HTTP GET call is typically read-only
And I am sure there are more reasons. But I need a concrete simple example for justification other than "Well, that violates the HTTP Spec!". ...or at least I am hoping for one. I have also already read the following which are more along the lines of the list above: Does it violate the RESTful when I write stuff to the server on a GET call? &
HTTP POST with URL query parameters -- good idea or not?
For example, can someone justify the above and explain why it is wrong/bad practice/incorrect to use an HTTP GET, say, with the following RESTful call:
"MyRESTService/GetCurrentRecords?UpdateRecordID=5&AddToTotalAmount=10"
I know it's wrong, but hopefully it will help provide an example to answer my original question. So the above would update recordID = 5 with AddToTotalAmount = 10 and then return the updated records. I know a POST should be used, but let's say I did use a GET.
To answer my original question: how exactly does, or can, this cause an actual problem? Other than all the violations from the bullet list above, how can using an HTTP GET to do the above cause a real issue? Too often I end up justifying things with "because the doc said so", but I'd like a concrete justification and a better understanding of this one.
Thanks!
The practical case where you will have a problem is that HTTP GETs are often retried automatically by the HTTP implementation if a request fails. So in real life the server can receive the same GET multiple times. If your update is idempotent there is no problem, but if it is not (and yours is not, since AddToTotalAmount adds to a running total), you can end up with multiple undesired updates.
HTTP POST is not retried automatically in this way, so you would not have this problem.
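To make the retry problem concrete, here is a tiny sketch (numbers invented) contrasting an idempotent update with the add-to-total update from the question when the "same" request is replayed:

    total = 100

    def set_total(value: int) -> None:      # idempotent: same result however often it runs
        global total
        total = value

    def add_to_total(amount: int) -> None:  # not idempotent: every replay changes the result
        global total
        total += amount

    # A client, proxy, or prefetcher silently replays the same GET twice:
    for _ in range(2):
        add_to_total(10)
    print(total)  # 120 -- the caller intended a single +10 but got +20

    total = 100
    for _ in range(2):
        set_total(110)
    print(total)  # 110 -- replaying the idempotent request is harmless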
If some form of search engine spiders your site it could change your data unintentionally.
This happened in the past with Google's Web Accelerator, whose link prefetching caused people to lose data because they had implemented delete operations as GETs.
Here is an important reason that GETs should be idempotent and not be used for updating state on the server, with regard to cross-site request forgery (CSRF) attacks. From the book Professional ASP.NET MVC 3:
Idempotent GETs
Big word, for sure — but it's a simple concept. If an operation is idempotent, it can be executed multiple times without changing the result. In general, a good rule of thumb is that you can prevent a whole class of CSRF attacks by only changing things in your DB or on your site by using POST. This means Registration, Logout, Login, and so forth. At the very least, this limits the confused deputy attacks somewhat.
One more problem: if the GET method is used, the data is sent in the URL itself. In the web server's logs, that data gets saved along with the request path. Now suppose someone has access to, or reads, those log files: your data (user IDs, passwords, keywords, tokens, etc.) is revealed. This is dangerous and has to be taken care of.
Headers and bodies are not written to the server's log file, but the request path is. So with POST, where the data is sent in the body rather than the request path, your data stays out of the logs.
I think reading this resource could help you distinguish between a message API and a resource API: http://www.servicedesignpatterns.com/WebServiceAPIStyles

Is it okay to set a cookie with a HTTP GET request?

This might be a bit of an ethical question, but I'm having quite a discussion in the office about the following issue:
Is it okay to set a cookie with an HTTP GET request? Because whenever an HTTP request changes something in the application, you should use a POST request. HTTP GET should only be used to retrieve data identified by the Request-URI.
In this case, the application doesn't change, but because the cookie is altered, the user might get a different experience when the page loads again, meaning that the HTTP GET request changed the application behaviour (nothing changed server-side though).
Get request reference
The discussion started because we want to use a normal anchor element to set a cookie.
The problem with GETs, especially if they are on an a tag, is when they get spidered by the likes of Google.
In your case, you'd needlessly be creating cookies that will, more than likely, never get used.
I'd also argue that the GET rule is not really about changing the application, but more about changing data. I appreciate the subtle distinction with the cookie (i.e. you are not changing data on YOUR system), but generally it's a good rule to have, and irrespective of where the data is stored, GET shouldn't really be used to change it.
The user can always have a different experience when issuing another GET request: you would not expect an (imagined) time service, "GET /time/current", to always return the same data.
Also, nothing says you are not allowed to change server-side state in response to a GET request; it's perfectly 'legal' to increment a page hit counter, for example, even if you store it in the database.
Consider section 9.1.1, Safe Methods, of the HTTP/1.1 specification (RFC 2616):
Naturally, it is not possible to ensure that the server does not generate side-effects as a result of performing a GET request; in fact, some dynamic resources consider that a feature. The important distinction here is that the user did not request the side-effects, so therefore cannot be held accountable for them.
Also, I would say it is perfectly acceptable to change or set a cookie in response to a GET request, because you are just returning some data.
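For what it's worth, the kind of benign GET side effect described in this thread fits in a few lines. A sketch using Python's standard library (the cookie name and counter are invented): the GET response both sets a cookie and bumps a hit counter, without changing the resource itself.

    from http.server import BaseHTTPRequestHandler, HTTPServer

    hits = 0  # server-side state touched by a GET: harmless, but a side effect nonetheless

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            global hits
            hits += 1
            body = f"you are visitor number {hits}\n".encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            # Setting a cookie on a GET response: the resource is unchanged,
            # only the client's future experience is.
            self.send_header("Set-Cookie", "seen_welcome=1; Path=/; Max-Age=86400")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("", 8000), Handler).serve_forever()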

Which HTTP status codes do you actually use when developing web applications? [closed]

The HTTP/1.1 specification (RFC 2616) defines a number of status codes that an HTTP server can return to signal certain conditions. Some of those codes can be utilized by web applications (and frameworks). Which of those codes are the most useful in practice, in both classic and asynchronous (XHR) responses, and in what situations do you use each of them?
Which codes should be avoided, e.g. should applications mess with the 5xx range at all? What are your conventions when returning HTTP codes in REST web services? Do you ever use redirects other than 302?
The ones I'm using (that I could find with a quick grep 'Status:' anyway):
200 Successfully retrieved a resource without affecting it
201 Sent whenever a form submission puts something significant into the database (forum post, user account, etc.), creating a new resource
204 Sent with empty body, for example after a DELETE
304 HTTP caching. I've found this one very hard to get right, since it has to account for users changing display settings and so on. The best idea I've come up with for that is using a hash of the user's preferences as the ETag (see the sketch after this answer). It doesn't help that most browsers have unpredictable and inconsistent behaviour here...
400 Used for bad form submissions that fail some validation check.
403 Used whenever someone is somewhere they shouldn't be (though I try to avoid that by not displaying links to stuff the users shouldn't access).
404 Apart from the normal webserver ones I use these when the URL contains invalid ID numbers. I suppose it'd be a good idea to check in this case whether a higher valid ID exists and send a 410 instead...
429 When a user's requests are too frequent (rate limiting)
500: I tend to put these in catch{ } blocks where the only option is to give up, to make sure something meaningful is sent to the browser.
I realise I could get away with simply letting the server send "200" for everything, but they save a lot of pain when users are seeing (or causing) errors and not telling you about them. I've already got functions to display access-denied messages and so on, so it's not much work to add these anyway.
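Here is a rough sketch of the ETag idea from the 304 entry above, in Python (the preference fields and content are invented): the validator is derived from both the content and the user's display settings, so changing either one invalidates the cached copy.

    import hashlib

    def make_etag(content: bytes, user_prefs: dict) -> str:
        # Fold the preferences that affect rendering into the validator,
        # so a settings change busts the client's cached copy.
        pref_blob = repr(sorted(user_prefs.items())).encode()
        return '"' + hashlib.sha256(content + pref_blob).hexdigest() + '"'

    def respond(content: bytes, user_prefs: dict, if_none_match):
        etag = make_etag(content, user_prefs)
        if if_none_match == etag:
            return 304, {"ETag": etag}, b""   # the client's copy is still valid
        return 200, {"ETag": etag}, content

    # First request gets 200 with an ETag; the second echoes it back and gets 304.
    status, headers, _ = respond(b"<html>page</html>", {"theme": "dark"}, None)
    status2, _, _ = respond(b"<html>page</html>", {"theme": "dark"}, headers["ETag"])
    print(status, status2)  # 200 304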
418: I'm a teapot
From http://www.ietf.org/rfc/rfc2324.txt Hyper Text Coffee Pot Control Protocol (HTCPCP/1.0)
Don't forget about 503 - Service Unavailable. This one is essential for site downtime. Especially where Search Engines are concerned.
Say you're taking the site down for a few hours for maintenance or upgrade work. Directing all requests to a friendly page that returns a 503 code tells spiders to "try again later".
If you simply display a "Temporarily Down" page but still return 200 OK, the spider may index your error pages or, worse, replace existing indexing with this "new" content.
This could seriously impact your SEO rankings, especially if you're a large, popular site.
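A minimal sketch of that maintenance-mode behaviour using Python's standard library; the Retry-After value is an arbitrary example:

    from http.server import BaseHTTPRequestHandler, HTTPServer

    class MaintenanceHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = b"We're down for maintenance, back shortly."
            self.send_response(503)                  # tells crawlers: temporary, try again
            self.send_header("Retry-After", "3600")  # hint: come back in an hour
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("", 8080), MaintenanceHandler).serve_forever()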
303 See Other is a must for PRG (Post/Redirect/Get), which you should be using now if you aren't already.
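For readers who haven't met PRG before, the pattern is: handle the POST, then answer with 303 pointing at a page to GET, so a browser refresh re-requests the result page instead of re-submitting the form. A sketch with an invented result route:

    from http.server import BaseHTTPRequestHandler, HTTPServer

    class PRGHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            # Process the submission, then redirect with 303 See Other so the
            # browser's next request (and any refresh) is a GET, not a re-POST.
            self.rfile.read(int(self.headers.get("Content-Length", 0)))
            self.send_response(303)
            self.send_header("Location", "/thanks")  # invented result page
            self.send_header("Content-Length", "0")
            self.end_headers()

        def do_GET(self):
            body = b"Submission received."           # simplified: every path shows this
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("", 8080), PRGHandler).serve_forever()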
Here are the most common response codes, from my experience:
The response codes in the 1xx-2xx range are typically handled automatically by the underlying webserver (e.g. Apache, IIS), so you don't need to worry about those.
Codes 301 and 302 are typically used for redirects, and 304 is used a lot when the client or proxy already contains a valid copy of the data and does not need a new version from the server (see the RFC for details on exactly how this works).
Code 400 is typically used to indicate that the client sent bad or unexpected data that caused a problem in the server.
Code 403 means access to the resource is forbidden; authentication challenges use 401 instead, and both are often handled somewhat automatically by the server configuration.
Code 404 is the error code for a page not found.
Code 500 indicates an error condition in the server that is not necessarily caused by data sent from the client. For example, database connection failures, programming errors, or other unhandled exceptions.
Code 502 is typically seen if you are proxying from a webserver (such as Apache) to an app server (such as Tomcat) in the backend, and the connection cannot be made to the app server.
For asynchronous calls (i.e. AJAX/JSON responses) it's usually safest to always return a 200 response. Even if there was an error on the server while processing the request, it's best to include the error in the JSON object and let the client deal with it that way. The reason is that, historically, not all web browsers allowed script access to the response body for non-200 response codes.
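One way to follow that advice, whether or not you agree with it, is a small JSON envelope so the client inspects a flag instead of the status code. A sketch with invented field names:

    import json

    def xhr_envelope(payload=None, error=None) -> str:
        # Always paired with HTTP 200; the client checks "ok"/"error"
        # rather than the transport status code.
        return json.dumps({"ok": error is None, "data": payload, "error": error})

    print(xhr_envelope(payload={"id": 5}))
    print(xhr_envelope(error="database connection failed"))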
I tend to use 405 Method Not Allowed when somebody tries to GET a URL that can only be POSTed. Does anyone else do it the same way?
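If you do this, it is worth also sending the Allow header, which the spec requires alongside a 405. A WSGI sketch (the route and allowed method are invented):

    from wsgiref.simple_server import make_server

    def app(environ, start_response):
        if environ["REQUEST_METHOD"] != "POST":
            # A 405 must advertise which methods *are* allowed on this resource.
            start_response("405 Method Not Allowed",
                           [("Allow", "POST"), ("Content-Type", "text/plain")])
            return [b"use POST"]
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"created"]

    if __name__ == "__main__":
        make_server("", 8001, app).serve_forever()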
In the Aida/Web framework we use just:
200 Ok
302 Redirect
404 Not Found
500 Internal Server Error
I basically use all of them, where appropriate. The spec itself defines each code and the circumstances in which they should be used.
When building a RESTful web application, I don't recommend picking and choosing status codes and restricting oneself to a subset of the full range. That is, unless one is building a web application for a specific HTTP client; in which case one isn't really building a web application, one is building an application for that specific client.
At my firm, we have some Flex clients. They can't properly handle status codes other than 200, so we have them send a special parameter with their requests, which tells our servers to always send a 200, even when it's not the proper response.
I've had nightmares about the number 500.
500 Internal Server Error is returned in Aida/Web when your web application raises an exception. Because this is a Smalltalk application, an exception window is opened on the server (if you run it headful) while the user gets a 500 and a short stack trace.
On the server you can then open a full debugger and try to solve the problem while the server keeps running, of course. Another nice thing is that the exception window, with the full stack, waits for you until you come around.
Conclusion: 500 is a blessing, not a nightmare!
I'm using 409 instead of 400 for bad user data (i.e. form submission).
The spec for 409 talks mostly about version conflicts, but it mentions that information on how to fix the issue should be sent in the response. Perfect for malformed-email or wrong-password messages.
400 only addresses syntax issues, which to me sounds like the request doesn't make sense at all, rather than merely failing some validation regex.
I use webmachine, which automagically generates proper error codes. But there are cases when I need to supply my own. I have found it helpful for development and debugging to return 666 in those cases, so I can easily tell which ones come from my code, and which from webmachine. Besides, I get a chuckle out of it when it comes up. You do have to remember to change this before deployment for real.

Resources