Get the final destination after WP_Http redirects (WordPress) - wordpress

I'm making some requests to an API from WordPress, and the API uses SSL connections if they're turned on in the API settings. I'd like to determine whether SSL is turned on or off without having to ask the user whether it is enabled on their account. The API does a good job at redirecting, meaning:
If I access http://api/endpoint and SSL is turned on, I'm redirected to https://api/endpoint
If I access https://api/endpoint and SSL is turned off, I'm redirected to http://api/endpoint
Now what I'd like to do is see whether a redirect happened or not and record that to my options so that the other requests are fired to the correct URL without any redirections.
So my question is: is there a way to determine the final destination after firing a WP_Http->request() when the request is being redirected?
I can't see any info about that in the response arrays, I only get to see the final response but I have no idea what URL that came from. What I can do is set the redirection parameter to 0 and catch the max redirects allowed error, but that's not bullet-proof, since I still don't know whether the redirect happened from http to https or simply another page under http.
I hope this all makes sense, let me know if you have any ideas.
Thanks!
~ K

Check $response['headers']; they may contain a 'location' key.
It all depends on the HTTP library you are using.
See the class-http.php file (WP 3.0.1):
cURL: line 1393, the http_api_curl action. The curl handle is available directly there, so you can catch anything.
fopen: check lines 887-888 and the $http_response_header variable.
Also, try overriding the processHeaders function, as it has access to the raw HTTP headers.

The WP_Http class processes the headers and removes all but the last one. So you could do what jetdog described above: check the original URL and compare it to the returned $response['headers']['location']. If it is different, then you know it redirected.
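The comparison described above can be sketched in a language-neutral way. This is a minimal Python illustration, not WP_Http code: `classify_redirect` is a hypothetical helper, and it assumes you already have the requested URL and the value of the `location` header (or nothing, if the header is absent). It separates a pure http/https switch from a redirect to some other page, which is the distinction the question asks about.

```python
from urllib.parse import urljoin, urlsplit


def classify_redirect(requested_url, location):
    """Return 'none', 'scheme-change', or 'other' for a redirect target.

    `location` is the Location header value, or None when the response
    carried no such header. (Hypothetical helper for illustration.)
    """
    if not location:
        return "none"
    # Location may be relative, so resolve it against the requested URL.
    final_url = urljoin(requested_url, location)
    if final_url == requested_url:
        return "none"
    before, after = urlsplit(requested_url), urlsplit(final_url)
    same_target = (
        (before.netloc.lower(), before.path, before.query)
        == (after.netloc.lower(), after.path, after.query)
    )
    # Only the scheme differs: the API switched between http and https.
    if before.scheme != after.scheme and same_target:
        return "scheme-change"
    return "other"


print(classify_redirect("http://api/endpoint", "https://api/endpoint"))
print(classify_redirect("http://api/endpoint", "/login"))
```

A 'scheme-change' result is the case worth recording in the options; 'other' means the redirect went to a different page and says nothing about the SSL setting.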

Related

Nginx not logging all access in access.log (missing data of redirected requests)

In Nginx I have a redirection of all incoming http traffic to the same url but with https.
When I check the access log I only see the 301 response, but not the request that follows it, which could return a 200, a 404, or anything else.
How can I see that information in the logs of Nginx?
All I want is to see what happens after a client gets redirected, because the redirection may work while the target URL does not. As of now I can only find out what works by trying it myself, and that doesn't guarantee that someone else won't get a different response later.
The follow-up requests should be in the same access.log file. The only caveat is that they will not directly follow the 301 entries.
The 301 response is returned to the browser, which then decides whether or not to follow the proposed URL, and this does not happen immediately. So after the initial 301 log record, the following request may be logged after 10, 100, or even 1000 unrelated log messages. It all depends on the traffic and on how many log entries a single page generates.
I'll post a separate answer to this, because I think someone may find it useful.
The problem is not with Nginx or its logging. If you access the problematic URL from a browser, you can see that the request is properly recorded. The follow-up request is missing only when the original request comes from an external application, a scanner, or other software that doesn't behave like a web browser and doesn't follow the redirection.
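To check which 301 entries were never followed up, the log can be scanned rather than read by eye. The sketch below is an illustration under stated assumptions: the log uses nginx's default "combined" format, the http and https server blocks write to the same access.log, and the redirect keeps the request URI unchanged. `unfollowed_redirects` and the sample lines are made up for the example.

```python
import re

# Assumes nginx's default "combined" log format; adjust the pattern otherwise.
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+)[^"]*" (\d{3}) ')


def unfollowed_redirects(lines):
    """Return (client_ip, uri) pairs whose 301 was never followed up."""
    pending = []  # 301s still waiting for a follow-up request, in order seen
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not look like combined-format entries
        ip, _method, uri, status = match.groups()
        key = (ip, uri)
        if status == "301":
            if key not in pending:
                pending.append(key)
        elif key in pending:
            # The same client asked for the same URI again and got a real answer.
            pending.remove(key)
    return pending


sample = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /page HTTP/1.1" 301 178 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Oct/2023:13:55:37 +0000] "GET /page HTTP/1.1" 301 178 "-" "scanner"',
    '203.0.113.5 - - [10/Oct/2023:13:55:39 +0000] "GET /page HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]
print(unfollowed_redirects(sample))
```

Matching on client IP and URI is a heuristic: clients behind a shared address, or redirects that rewrite the path, would need a different key.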

When to add http(s):// to website address

I'm trying to create a web browser using Cocoa and Swift. I have an NSTextField where the user can enter the website he wants to open and a WebView where the page requested is displayed. So far, to improve the user experience, I'm checking if the website entered by the user starts with http:// and add it if it doesn't. Well, it works for most of the cases but not every time, for example when the user wants to open a local web page or something like about:blank. How can I check if adding http:// is necessary and if I should rather add https:// instead of http://?
You need to be more precise in your categorization of what the user typed in.
Here are some examples and expected reactions:
www.google.com: should be translated into http://www.google.com
ftp://www.foo.com: Should not be modified. Same goes to file:// (local)
Barack Obama: Should probably run a search engine
about:settings: Should open an internal page
So after you figure out these rules with all their exceptions, you can use a regex to find out what should be done.
As for HTTP vs. HTTPS - if the site enforces HTTPS, you'll typically get a redirect response (301 Moved Permanently, or a 307 Internal Redirect that the browser generates itself for HSTS sites) if you go to the HTTP link. So for example, if you try to navigate to http://www.facebook.com, you'll receive a 307 that will redirect you to https://www.facebook.com. In other words, it's up to the site to tell the browser that it has HTTPS (unless of course you navigated to HTTPS to begin with).
A simple and fairly accurate approach would be to look for the presence of a different scheme. If the string starts with [SomeText]: before any slashes are encountered, it is likely intended to indicate a different scheme such as about:, mailto:, file: or ftp:.
If you do not see a non-http scheme, try resolving the URL as an HTTP URL by prepending http://.
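The rules from both answers can be combined into one small classifier. The sketch below is written in Python to keep the logic language-neutral (the same regex works with Swift's regular-expression support). `classify_input` and its three result labels are hypothetical names, and the "contains a space or has no dot means search" rule is an assumption added for the illustration.

```python
import re

# A URI scheme is a letter followed by letters, digits, "+", "-" or ".",
# and then a colon. None of these is a slash, so a match at the start of
# the string means the colon comes before any slash.
SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*:")


def classify_input(text):
    """Return (kind, value) where kind is 'as-is', 'search', or 'url'."""
    text = text.strip()
    if SCHEME_RE.match(text):
        # http:, https:, ftp:, file:, about:, mailto: ... leave untouched.
        return ("as-is", text)
    if " " in text or "." not in text:
        # Assumption: free text without a dot is meant as a search query.
        return ("search", text)
    return ("url", "http://" + text)


for example in ["www.google.com", "ftp://www.foo.com", "Barack Obama", "about:settings"]:
    print(example, "->", classify_input(example))
```

One known weakness of the scheme test: input such as localhost:8080 also matches the pattern and is left as-is, so a real browser would need an extra rule for host:port input.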

What happens if a 302 URI can't be found?

If I make an HTTP request to get index.html on http://www.example.com but that URL has a 302 re-direct in place that points to http://www.foo.com/index.html, what happens if the redirect target (http://www.foo.com/index.html) isn't available? Will the user agent try the original URL (http://www.example.com/index.html) or just return an error?
Background to the question: I manage a legacy site that supports a few existing customers but doesn't allow new sign-ups. Pretty much all the pages are redirected (using 302s rather than 301s for some unknown reason...) to a newer site. This includes the sign-up page. On one of the pages that isn't redirected there is still a link to the sign-up page, which itself links through to a third-party payment page (i.e. on another domain). Last week our current site went down for a couple of hours and in that period someone successfully signed up through the old site. The only way I can imagine this happened is that if a 302 doesn't find its intended URL, some (all?) user agents bypass the redirect and then go to the originally requested URL.
By the way, I'm aware there are many better ways to handle the particular situation we're in with the two sites. We're on it! This is just one of those weird situations I want to get to the bottom of.
You should receive a 404 Not Found status code.
Since HTTP is a stateless protocol, there is no real connection between two requests of a user agent. The redirection status codes are just a way for servers to politely tell their clients that the resource they were looking for is somewhere else now. The clients, however, are in no way obliged to actually request the resource from that other URL.
Oh, the signup page is at that URL now? Well then I don't want it anymore... I'll go and look at some kittens instead.
Moreover, even if the client decides to request the new URL (which it usually does ^^), this can be considered a completely new communication between server and client. Neither server nor client should remember that there was a previous request which resulted in a redirection status code. Instead, the current request should be treated as if it were the first (and only) request. And what happens when you request a URL that cannot be found? You get a 404 Not Found status code.
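The behaviour described above can be modelled without any network access. In this minimal sketch, `follow_redirects`, `fake_fetch`, and the `FAKE_SITE` table are all made up for illustration. The missing redirect target is modelled as a 404, as in the answer; a host that is completely unreachable would surface as a connection error instead. Either way, nothing in the loop goes back to the original URL.

```python
REDIRECT_CODES = (301, 302, 303, 307, 308)

# Hypothetical site table: URL -> (status, Location header or None).
FAKE_SITE = {
    "http://www.example.com/index.html": (302, "http://www.foo.com/index.html"),
    # http://www.foo.com/index.html is deliberately missing.
}


def fake_fetch(url):
    """Stand-in for one HTTP request; unknown URLs answer 404."""
    return FAKE_SITE.get(url, (404, None))


def follow_redirects(url, fetch, max_redirects=5):
    """Follow redirects; return (final_status, list_of_urls_requested)."""
    visited = [url]
    for _ in range(max_redirects + 1):
        status, location = fetch(url)
        if status not in REDIRECT_CODES or location is None:
            # Each request stands alone: the final answer is returned as-is.
            return status, visited
        url = location
        visited.append(url)
    raise RuntimeError("too many redirects")


print(follow_redirects("http://www.example.com/index.html", fake_fetch))
```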

How do I generate a 403 error when someone tries to access a particular page

I may be barking up completely the wrong tree here but what I would like to do is protect my .js pages by having them return a 403 Forbidden http error status page if someone tries to access them directly via http. I use them to support my index.html page but would like for them to remain hidden.
The helpdesk guys at my ISP basically say they don't know if it's possible but it may be something you could do with a web.config file (which is not something I have used before).
Any help at all would be gratefully received - I am a bit out of my comfort zone with this one
I would like to […] protect my .js pages by having them return a 403 Forbidden http error status page if someone tries to access them directly via http.
Please note that if you include some resource, for example a script via the <script>-tag in HTML or an image via the <img>-tag, the browser does nothing else than simply run another HTTP request to get that resource. The whole communication already happens over HTTP.
While a browser may include additional details in its HTTP request when requesting additional resources, like the Referer-header, it definitely is not required to do so. So if you look out for the Referer-header, be advised that you may lock out other valid clients which do not send the Referer-header in their requests.
Also note that this will not give you any protection whatsoever. One can simply construct HTTP headers when requesting things, so “faking” requests your server would allow (because it thinks they are correct) is not a problem at all. And even without that; every resource you tell the client to use to make your website work will be downloaded by the client. And after that, the client can do whatever he wants with it. It can cache them on the hard disk, or allow the user to quickly look at it without having to run another request.
So if you want to do this for protecting your code, then just forget about it, and make it easier for everyone by just not adding a non-optimal protection. Code you put on the web can be made difficult to read, but if you want the user to see the end result, then you also give out your code in the same step.
In PHP you can do this with:
header("HTTP/1.0 403 Forbidden");

How to work around POST being changed to GET on 302 redirect?

Some parts of my website are only accessible via HTTPS (not whole website - security vs performance compromise) and that HTTPS is enforced with a 302 redirect on requests to the secure part if they are sent over plain HTTP.
The problem is that in all major browsers, if you do a 302 redirect on a POST, it will automatically be switched to a GET (afaik this should only happen on 303, but nobody seems to care). An additional issue is that all POST data is lost.
So what are my options here other than accepting POSTs to secure site over HTTP and redirecting afterwards or changing loads of code to make sure all posts to secure part of website go over HTTPS from the beginning?
You are right, this is the only reliable way. The POST request should go over an HTTPS connection from the very beginning. Moreover, it is recommended that the form leading to such a POST is also loaded over HTTPS. Usually the first form served once you have the HTTPS connection is a login form. All browsers apply different security restrictions to pages loaded over HTTP and over HTTPS, so this lowers the risk of a malicious script executing in a context that holds sensitive data.
I think that's what 307 is for. RFC2616 does say:
If the 307 status code is received in response to a request other
than GET or HEAD, the user agent MUST NOT automatically redirect the
request unless it can be confirmed by the user, since this might
change the conditions under which the request was issued.
but it says the same thing about 302 and we know what happens there.
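The difference between the status codes can be summarised as a small lookup. This sketch models the behaviour commonly implemented by browsers, not the RFC 2616 text quoted above: 307 and 308 keep the method, 303 switches to GET, and 301/302 turn a POST into a GET. `method_after_redirect` is a hypothetical helper for illustration.

```python
def method_after_redirect(status, method):
    """Return the HTTP method a typical browser uses for the redirected request."""
    method = method.upper()
    if status in (307, 308):
        # Method and body are preserved.
        return method
    if status == 303:
        # "See Other": everything except HEAD is retried as GET.
        return "HEAD" if method == "HEAD" else "GET"
    if status in (301, 302):
        # Long-standing browser behaviour: POST is rewritten to GET.
        return "GET" if method == "POST" else method
    raise ValueError("not a redirect status handled here: %d" % status)


for code in (301, 302, 303, 307, 308):
    print(code, "POST ->", method_after_redirect(code, "POST"))
```

Even where the method is preserved, the original POST has already been sent in full before the redirect arrives.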
Unfortunately, you have a bigger problem than browsers not dealing with response codes the way the RFCs say, and that has to do with how HTTP works. Simplified, the process looks like this:
1. The browser sends the request
2. The browser indicates it has sent the entire request
3. The server sends the response
Presumably your users are sending some sensitive information in their post and this is why you want them to use encryption. However, if you send a redirect response (step 3) to the user's unencrypted POST (step 1), the user has already sent all of the sensitive information out unencrypted.
It could be that you don't consider the information the user sends that sensitive, and only consider the response that you send to be sensitive. However, this turns out not to make sense. Sensitive information should be available only to certain individuals, and the information used to authenticate the user is necessarily part of the request, which means your response is now available to anyone. So, if the response is sensitive, the request is sensitive as well.
It seems that you are going to want to change lots of code to make sure all secure posts use HTTPS (you probably should have written them that way in the first place). You might also want to reconsider your decision to only host some of your website on HTTPS. Are you sure your infrastructure can't handle using all HTTPS connections? I suspect that it can. If not, it's probably time for an upgrade.
