Can I safely drop "http://" and "www" from URLs in QR codes? - http

I would like to encode some links for QR codes.
The shorter the link the better, because a shorter URL reduces the number of dots in the QR code, which makes it a lot easier to scan.
If I remove "http://www." from the start of my URLs (qoomerang.com/xxxx), the link works fine on my computer. But are standards these days such that I can safely remove them from the QR code aswell - i.e. will the text still be recognised as a website by all smartphones?

www is just a subdomain, so whether it's safe to drop it depends on the web server configuration. If the server is configured to serve the page only on the www subdomain, you will need to keep it.
(Refer to: https://superuser.com/questions/60006/what-is-the-purpose-of-the-www-subdomain for more details)
http:// refers to the protocol and should be retained as this is the only reliable way of identifying a web address and the method to fetch it. Some devices try to find URLs that do not contain http:// but you should not rely on this. Furthermore, the device would not know for certain whether it should use HTTP or HTTP over TLS (https://) to download the link.
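If you do decide to drop the prefix from the printed link, it is worth checking first that the bare host actually answers. A quick sanity check in Python (the host and path are just the example from the question, and urlopen raises on network errors and HTTP error statuses):

# Confirm both the bare host and the www host serve the page before
# printing the shortened URL in a QR code. URL values are illustrative.
from urllib.request import urlopen

for url in ("http://qoomerang.com/xxxx", "http://www.qoomerang.com/xxxx"):
    try:
        print(url, "->", urlopen(url, timeout=10).getcode())
    except OSError as exc:
        print(url, "failed:", exc)

This only tells you the server configuration is forgiving; whether a given phone's scanner treats the scheme-less text as a link is still up to the device.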

Related

Securing HTTP referer

I develop software which stores files in directories with random names to prevent unauthorized users from downloading them.
The first precaution is to store them on a separate top-level domain (to prevent cookie theft).
The second danger is HTTP referer which may reveal the name of the secret directory.
My experiments with the Chrome browser show that the HTTP referer is sent only when I click a link in my (secret) file. So the trouble is limited to files which may contain links (in Chrome, HTML and PDF). Can I rely on this behaviour (the referer not being sent when the next page is opened not from a link in the current (secret) page but by some other method, such as entering the URL directly) for all browsers?
So the problem was limited only to HTML and PDF files. But it is not a complete security solution.
I suspect that we can fully solve this problem by adding Content-Disposition: attachment when serving all our secret files. Will it prevent the HTTP referer?
Also note that I am going to use HTTPS so that a man-in-the-middle cannot download our secret files.
You can use the Referrer-Policy header to try to control referer behaviour. Note, though, that this relies on clients implementing it.
Instead of trying to conceal the file location, may I suggest you implement proper authentication and authorization handling?
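As an illustration of both headers mentioned in this thread (Referrer-Policy here, Content-Disposition: attachment in the question), a minimal sketch of a download view. Django is only an assumption for the example; any server or framework can set these headers, and the path is illustrative:

import os
from django.http import FileResponse

SECRET_ROOT = "/srv/secret"  # illustrative location of the hidden files

def secret_file(request, name):
    # as_attachment=True sets Content-Disposition: attachment.
    response = FileResponse(open(os.path.join(SECRET_ROOT, name), "rb"), as_attachment=True)
    # Ask browsers that honour it not to send a Referer from this resource.
    response["Referrer-Policy"] = "no-referrer"
    return response

A real view must still validate name against path traversal, and as noted above, hiding the path is no substitute for authentication.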
I agree that Referrer-Policy is your best first step, but as DaSourcerer notes, it is not universally implemented by the browsers you may need to support.
A fully server-side solution is as follows:
User connects to .../<secret>
Server generates a one-time token and redirects to .../<token>
Server provides the document and invalidates the token
Now the referer will point to .../<token>, which is no longer valid. This has usability trade-offs, however:
Reloading the page will not work (though you may be able to address this with a cookie or session)
Users cannot share the URL from the URL bar, since it is no longer valid (in some cases that could be a minor benefit)
You may be able to get the same basic benefits without the usability trade-offs by doing the same thing with an IFRAME rather than redirecting. I'm not certain how IFRAME influences Referer.
This entire solution is basically just Referer masking done proactively. If you can rewrite the links in the document, then you could instead use Referer masking on the way out. (i.e. rewrite all the links so that they point to https://yoursite.com/redirect/....) Since you mention PDF, I'm assuming that this would be challenging (or that you otherwise do not want to rewrite the document).
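A minimal sketch of that one-time-token flow. Flask is used here purely for brevity (the question does not name a framework), the paths are illustrative, and a real deployment would want a shared token store with expiry plus validation of the secret name:

import os
import secrets
from flask import Flask, abort, redirect, send_file

app = Flask(__name__)
tokens = {}  # token -> file path; use a shared store (e.g. Redis) in production

@app.route("/files/<secret>")
def start(secret):
    token = secrets.token_urlsafe(32)                      # one-time token
    tokens[token] = os.path.join("/srv/private", secret)   # file the token unlocks
    return redirect("/t/" + token)                         # client lands on the token URL

@app.route("/t/<token>")
def deliver(token):
    path = tokens.pop(token, None)                         # invalidate on first use
    if path is None:
        abort(404)                                         # already used or never issued
    return send_file(path)                                 # any Referer now shows /t/<token>

Reloading /t/<token> now returns 404, which is exactly the usability trade-off mentioned above; a short-lived session cookie set in start() could be used to re-issue a token on reload.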

When to add http(s):// to website address

I'm trying to create a web browser using Cocoa and Swift. I have an NSTextField where the user can enter the website they want to open and a WebView where the requested page is displayed. So far, to improve the user experience, I check whether the website entered by the user starts with http:// and add it if it doesn't. This works in most cases but not all, for example when the user wants to open a local web page or something like about:blank. How can I check whether adding http:// is necessary, and whether I should add https:// rather than http://?
You need to be more precise in your categorization of what the user typed in.
Here are some examples and expected reactions:
www.google.com: should be translated into http://www.google.com
ftp://www.foo.com: Should not be modified. Same goes to file:// (local)
Barack Obama: Should probably run a search engine query
about:settings: Should open an internal page
So after you figure out these rules with all their exceptions, you can use a regex to find out what should be done.
As for HTTP vs. HTTPS - if the site supports HTTPS and redirects plain HTTP to it, you'll get a redirect response (301 Moved Permanently, 307 Temporary Redirect, etc.) when you request the HTTP link. For example, if you navigate to http://www.facebook.com, you'll receive a redirect to https://www.facebook.com. In other words, it's up to the site to tell the browser that it has HTTPS (unless of course you navigated to HTTPS to begin with).
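If you want to probe that behaviour from code, a small sketch (Python's urllib follows redirects by default; the host is just the example above, and some sites may reject a script user agent):

# Fetch over plain HTTP and print where the redirects ended up.
from urllib.request import urlopen

response = urlopen("http://www.facebook.com/")
print(response.geturl())  # e.g. https://www.facebook.com/ if the site upgraded the request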
A simple and fairly accurate approach would simply be to look for the presence of a different scheme. If the string starts with [SomeText]: before any slashes are encountered, it is likely intended to indicate a different scheme such as about:, mailto:, file: or ftp:.
If you do not see a non-http scheme, try resolving the URL as an HTTP URL by prepending http://.
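A rough sketch of that classification logic (Python only to keep it short; the Swift version would follow the same steps, and the search-engine URL is an assumption, not part of the answer):

import re
from urllib.parse import quote_plus

def normalize(user_input: str) -> str:
    text = user_input.strip()
    # A scheme ("sometext:") before any slash: leave the input alone
    # (about:, mailto:, file:, ftp:, https:, ...).
    if re.match(r"^[A-Za-z][A-Za-z0-9+.\-]*:", text):
        return text
    # Looks like a host (and optional path): treat it as a web address.
    if re.match(r"^\S+\.\S{2,}(/.*)?$", text):
        return "http://" + text
    # Anything else is probably a search query (engine URL is illustrative).
    return "https://www.google.com/search?q=" + quote_plus(text)

print(normalize("www.google.com"))  # http://www.google.com
print(normalize("about:blank"))     # about:blank
print(normalize("Barack Obama"))    # https://www.google.com/search?q=Barack+Obama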

Intercept and use local files in http requests

I'm trying to find a tool that will allow non-programmers to test files on a live server.
For example, they could modify an image on their computer, reload a webpage, then see the results of their work immediately.
I've tried finding a tool for this, because it seems obvious enough that someone must have thought of it, but a lot of the software I see doesn't quite fit. A tool called Fiddler does this (they call it autoresponding), but it's Windows-only. I could change the hosts file to redirect to a local instance of nginx or something, but that seems difficult to maintain when all I really want is a simple tool that will do something like this...
http://someserver.com/css/(.*) -> /home/user/localcss/$1
Does anybody have any recommendations?
Edit: Redirect clarification
Fiddler has this feature; just click the AutoResponder tab and map URLs to local files. Thousands of people do this every day.
See also video #5 here: http://www.fiddlerbook.com/fiddler/help/video/default.asp
I found Charles Proxy very useful for this
http://www.charlesproxy.com/documentation/tools/map-local/
Max's PAC solution was a life-saver, so I'm providing more details (can't yet upvote).
To use a local version of, say, css files, create a file 'proxy.pac', which contains this function:
function FindProxyForURL(url, host)
{
    // Use a regex to match requests ending with '.css'
    // and route them to the local server.
    var regexpr = /\.css$/;
    if (regexpr.test(url))
    {
        return "PROXY localhost:80"; // adjust the port to match your local server
    }
    // Otherwise connect directly:
    return "DIRECT";
}
Save 'proxy.pac' and point your browser to this file. In Firefox this is in Options > Advanced > Connection > Settings > Automatic Proxy Configuration URL
For best practice, also add a MIME type to your web server: map '.pac' to type 'application/x-ns-proxy-autoconfig'.
All requests to .css files will now be routed to localhost. Don't forget to ensure the file structure is the same on the proxy server.
In the case of CSS, it may well be easier to override it with the browser's user stylesheet; in Firefox that is chrome/userContent.css. See http://kb.mozillazine.org/UserContent.css
It's been a while since I asked this question, and I have a good technique that wasn't suggested.
PAC files are supported by all major browsers and let you write a script that routes any individual request to a proxy server. So, for example, a local proxy server can serve a PAC file, the PAC file can route whitelisted URLs back to that proxy, and the proxy can then return the local versions of those files. It can even support HTTPS.
Beware of one gotcha - Internet Explorer. It helpfully "caches" the results of this script incorrectly, so that if one URL on a domain is proxied, all URLs at that domain will be proxied. This feature can be disabled, however.
You can do this with the modify response rule in Requestly.
Using the local file option you can specify any file to be used as the response for the intercepted request.
According to their documentation it also supports hot reloading, i.e., as long as the file path remains the same, the rule will pick up the changes that you made.
As for the dynamic URL matching, they have support for regex and wildcards in their source filters
Note: This is currently only available in their desktop app.
If you want to implement this using their Chrome extension, which is what I personally did, you can use the Redirect rule paired with a mock server. Here is a page explaining this
You can set up a mock server / mock file endpoint within Requestly instead of using something like nginx or a local server to do so. But this works only for text-based content, not images.
This would also bypass any setup on the tester's local machine. They would only need to install the extension. All you would have to do is send them the endpoint for your mock server and the redirect rule that you created.
Actually you can't do this, because browsers don't allow pages loaded over http:// to access files on the local machine (just think about it for a moment... what would happen if, for example, a malicious webpage could load private files from your computer?).
Some browsers (e.g. Safari) allow files loaded over file:// to access other file:// files, others don't, but no browser allows http:// to access file://.
Firefox has a feature called "Signed scripts", which are scripts digitally signed with a trusted certificate. They can ask the user to grant them access to the local hard drive. Look at this: http://www.mozilla.org/projects/security/components/signed-scripts.html
Do you mean the Fiddler Web Proxy (www.fiddler2.com)? There is a commercial Java-based alternative named Charles Web Proxy that may fit your needs.

Is there any downside for using a leading double slash to inherit the protocol in a URL? i.e. src="//domain.example"

I have a stylesheet that loads images from an external domain and I need it to load from https:// from secure order pages and http:// from other pages, based on the current URL. I found that starting the URL with a double slash inherits the current protocol. Do all browsers support this technique?
HTML ex:
<img src="//cdn.domain.example/logo.png" />
CSS ex:
.class { background: url(//cdn.domain.example/logo.png); }
If the browser supports RFC 1808 Section 4, RFC 2396 Section 5.2, or RFC 3986 Section 5.2, then it will indeed use the page URL's scheme for references that begin with "//".
When used on a <link> or @import, IE7/IE8 will download the file twice; see http://paulirish.com/2010/the-protocol-relative-url/
Update from 2014:
Now that SSL is encouraged for everyone and doesn’t have performance concerns, this technique is now an anti-pattern. If the asset you need is available on SSL, then always use the https:// asset.
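For a concrete look at the resolution rule cited above, Python's urllib implements the RFC 3986 section 5.2 algorithm, so you can see a network-path reference ("//host/...") inherit the base page's scheme (hostnames are illustrative):

from urllib.parse import urljoin

# The "//" reference picks up whatever scheme the containing page was served with.
print(urljoin("https://shop.example/order", "//cdn.domain.example/logo.png"))
# -> https://cdn.domain.example/logo.png
print(urljoin("http://shop.example/page", "//cdn.domain.example/logo.png"))
# -> http://cdn.domain.example/logo.png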
One downside occurs if your URLs are viewed outside the context of a web page. For example, an email message sitting in an email client (say, Outlook) effectively has no URL, and when you're viewing a message containing a protocol-relative URL, there is no obvious protocol context at all (the message itself is independent of the protocol used to fetch it, whether it's POP3, IMAP, Exchange, uucp or whatever) so the URL has no protocol to be relative to. I've not investigated compatibility with email clients to see what they do when presented with a missing protocol handler - I'm guessing that most will take a guess at http. Apple Mail refuses to let you enter a URL without a protocol. It's analogous to the way that relative URLs do not work in email because of a similarly missing context.
Similar problems could occur in other non-HTTP contexts such as in tweets, SMS messages, Word documents etc.
The more general explanation is that anonymous protocol URLs cannot work in isolation; there must be a relevant context. In a typical web page it's thus fine to pull in a script library that way, but any external links should always specify a protocol. I did try one simple test: //stackoverflow.com maps to file:///stackoverflow.com in all browsers I tried it in, so they really don't work by themselves.
The reason for using protocol-relative links could be to provide portable web pages. If the outer page is not transported encrypted (http), why should the linked scripts be? That used to be seen as an unnecessary performance loss. If, on the other hand, the outer page is transported encrypted (https), then the linked content should be encrypted too; if the page is encrypted but the linked content is not, IE issues a Mixed Content warning, since an attacker could manipulate the scripts on the way. See http://ie.microsoft.com/testdrive/Browser/MixedContent/Default.html?o=1 for a longer discussion.
The HTTPS Everywhere campaign from the EFF suggests to use https whenever possible. We have the server capacity these days to serve web pages always encrypted.
Just for completeness. This was mentioned in another thread:
The "two forward slashes" are a common shorthand for "whatever protocol is being used right now"
if (plain http environment) {
use 'http://example.com/my-resource.js'
} else {
use 'https://example.com/my-resource.js'
}
Please check the full thread.
It seems to be a pretty common technique now. There is no downside; it only helps to unify the protocol for all assets on the page, so it should be used wherever possible.

Is it safe to redirect to the same URL?

I have URLs of the form http://domain/image/⟨uuid⟩/42x42/some_name.png. The Web server (nginx) is configured to look for a file /some/path/image/⟨uuid⟩/thumbnail_42x42.png, and if it does not exist, it sends the URL to the backend (Django via mod_wsgi) which then generates the thumbnail. Then the backend emits a 302 redirect to exactly the same URL that was requested by the client, with the idea that upon this second request the server will notice the thumbnail file and send it directly.
The question is, will this work with all the browsers? So far testing has shown no problems, but can I be sure all the user agents will interpret this as intended?
Update: Let me clarify the intent. Currently this works as follows:
The client requests a thumbnail of an image.
The server sees the file does not exist, so it forwards the request to the backend.
The backend creates the thumbnail and returns 302.
The backend releases all the resources, letting the server share the newly generated file to current and subsequent clients.
Having the backend serve the newly created image is worse for two reasons:
Two ways of serving the same data must be created;
The server is much better at serving static content. What if the client has an extremely slow link? The backend is neither particularly fast nor memory-efficient, and keeping it in memory while spoon-feeding such a client can be wasteful.
So I keep the backend working for the minimum amount of time.
Update²: I’d really appreciate some RFC references or opinions of someone with experience with lots of browsers. All those affirmative answers are pleasant but they look somewhat groundless.
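For reference, the backend half of the flow described above might look roughly like this. Django matches the question's setup, but the URL parameters, paths and use of Pillow for the actual resizing are all illustrative stand-ins:

import os
from django.http import HttpResponseRedirect
from PIL import Image  # stands in for whatever actually builds the thumbnail

IMAGE_ROOT = "/some/path/image"  # the root nginx already serves from

def thumbnail(request, uuid, width, height, name):
    source = os.path.join(IMAGE_ROOT, uuid, "original.png")  # illustrative source file
    target = os.path.join(IMAGE_ROOT, uuid, f"thumbnail_{width}x{height}.png")
    if not os.path.exists(target):
        img = Image.open(source)
        img.thumbnail((int(width), int(height)))
        img.save(target)
    # 302 back to the very same URL; nginx now finds the file and serves it itself.
    return HttpResponseRedirect(request.get_full_path())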
If it doesn't work, the client is broken; most clients will follow redirects up to some maximum count. So yes, it should be fine unless your backend fails to generate the thumbnail for some reason (in which case the client would loop until it hits that limit).
You could instead change the URLs to be http://domain/djangoapp/generate_thumbnail, and have that return the thumbnail with the proper content-type and so on.
Yes, it's fine to re-direct to the same URI as you were at previously.
