User info in URI without password - http

I know that URI supports the following syntax:
http://[user]:[password]@[domain.tld]
When there is no password or if the password is empty, is there a colon?
In other words, should I accept this:
http://[user]:@[domain.tld]
Or this:
http://[user]@[domain.tld]
Or are they both valid?

The current URI standard (STD 66) is RFC 3986, and the relevant section is 3.2.1. User Information.
There it’s defined that the userinfo subcomponent (which gets followed by @) can contain any combination of
the character :,
percent-encoded characters, and
characters from the sets unreserved and sub-delims.
So this means that both of your examples are valid.
However, note that the format user:password is deprecated. The RFC nevertheless gives a recommendation for how applications should handle such URIs: everything after the first : character should not be displayed by applications, unless
the data after the colon is the empty string (indicating no password).
So according to this recommendation, the userinfo subcomponent user: indicates that there is the username "user" and no password.
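The distinction between "no colon" and "colon with nothing after it" can be observed with a URI parser. The following is a minimal sketch using Python's standard `urllib.parse`; it shows how that particular library reports the two forms, not what RFC 3986 mandates, and the host `example.com` is a placeholder.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit


def describe_userinfo(uri: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (username, password) as reported by urllib.parse.

    The password is None when the userinfo has no colon, and the empty
    string when a colon is present but nothing follows it.
    """
    parts = urlsplit(uri)
    return parts.username, parts.password


# Both forms parse; only the reported password differs.
print(describe_userinfo("http://user:@example.com"))  # colon, empty password
print(describe_userinfo("http://user@example.com"))   # no colon at all
```

A consumer that wants to treat both forms the same way can normalise `None` and `""` to a single "no password" value.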

This is mostly a matter of convenience, and both are valid. I would go with http://[user]@[domain.tld] (and prompt for a password) because it is simple and unambiguous. It gives the user no reason to wonder whether something has to be added after the :


Firestore Security Rules - check if field is a valid email address

How can I verify if an incoming field is a valid e-mail? Is there a way to use string-functions or anything in Firestore security rules?
Example:
Let's say I have a Create-Request with a field called "email". In my Firestore security rules, I would like to check if the email is a valid email address:
contains '@'
ends with either .xx or .xxx (a typical country or top-level domain ending)
has a '.' before the last three or two letters of the email
the '.' does not follow directly after the '@' - at least two letters have to be in-between
So that e.g. example@emailprovider.com gets accepted, but not example@.com.
I know that this check is quite extensive, so I would also like to know whether it makes sense to put such a validation into security rules.
You can use rules.String.matches.
See
https://firebase.google.com/docs/reference/rules/rules.String#matches
https://github.com/google/re2/wiki/Syntax
How to validate an email address using a regular expression?
Performs a regular expression match on the whole string.
A regular expression using Google RE2 syntax.
If the field must only ever hold an email address, then it is necessary to validate it as one.
I found an example of a regex (and adjusted it a bit):
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,5}$
The source of this is at the bottom of the following page:
https://firebase.google.com/docs/reference/security/database/regex
You should also take into account the note as well:
Note: THIS WILL REJECT SOME VALID EMAILS. Validating email address
in regular expressions is difficult in general. See this site for
more depth on the subject.
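To see what such a pattern accepts and rejects before putting it into security rules, it can be tried out locally. This is a rough sketch in Python, assuming the separator in the address is `@`; Python's `re` module is not RE2, but this pattern only uses constructs both engines share. The function name `looks_like_email` is made up for the illustration.

```python
import re

# Same pattern as the rule, with a single backslash because this is a
# Python raw string rather than a string literal inside security rules.
EMAIL_PATTERN = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,5}")


def looks_like_email(value: str) -> bool:
    """Whole-string match, mirroring the 'matches the whole string' behaviour."""
    return EMAIL_PATTERN.fullmatch(value) is not None


for candidate in ["example@emailprovider.com", "example@.com", "no-at-sign.com"]:
    print(candidate, looks_like_email(candidate))
```

As the note says, a pattern like this will still reject some valid addresses, so it is better suited as a sanity check than as proof that an address exists.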

How to determine if a URI is escaped?

I am using apache commons HTTPClient to download web resources. The URI for these resources come from third parties, I do not generate them.
The commons httpclient requires a URI object to be given to the GetMethod object.
The URI constructor takes a string (for the uri) and a boolean specifying if it is escaped or not.
Currently, I am doing the following to determine if the original url I am given is already escaped...
boolean isEscaped = URIUtil.getPathQuery(originalUrl).contains("%");
m.setURI(new URI(originalUrl, isEscaped));
Is this the correct way to determine if a uri is already escaped?
Update...
Well, according to Wikipedia ( http://en.wikipedia.org/wiki/Percent-encoding ), percent is a reserved character and should always be encoded. I am quoting verbatim here:
Percent-encoding the percent character[edit] Because the percent ("%")
character serves as the indicator for percent-encoded octets, it must
be percent-encoded as "%25" for that octet to be used as data within a
URI.
Doesn't this mean that you can never have a naked '%' character in a valid URI?
Also, the uri(s) come from various sources so I cannot be sure if they are escaped or unescaped.
This wouldn't work. It's possible the un-encoded string has a % in it already.
ex:
https://www.google.com/#q=like%25&safe=off
is the url for a google search for like%. In unescaped form it would be https://www.google.com/#q=like%&safe=off
Your consumers should let you know if the URI is escaped or not.
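The ambiguity can be made concrete. The sketch below (Python, with a hypothetical helper name `has_bare_percent`) checks only one thing: whether a string contains a `%` that is not followed by two hex digits. Such a string cannot be a correctly escaped URI. The reverse does not hold: a string that passes the check may still be unescaped data that happens to contain something like `%25`.

```python
import re
from urllib.parse import unquote

# A "%" not followed by two hex digits cannot be part of a valid escape.
_BARE_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")


def has_bare_percent(text: str) -> bool:
    """True if the text is definitely not validly percent-encoded."""
    return _BARE_PERCENT.search(text) is not None


print(has_bare_percent("q=like%&safe=off"))    # bare "%": cannot be escaped form
print(has_bare_percent("q=like%25&safe=off"))  # passes, but is still ambiguous
print(unquote("q=like%25"))                    # decodes to a literal "%"
```

So the check can rule the escaped interpretation out, but it can never confirm it, which is why the source of the URI has to say which form it is sending.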

JavaCC match token group

I ended up writing a parser for a small subset of SQL.
The grammar has a lot of regular tokens (SELECT, CREATE, ...) and a few more general (e.g. S_GEN_IDENTIFIER matches [A-Z_.\d]|\"(~[\n, \r, \"])*\").
The problem is, "SELECT col AS type ..." doesn't get parsed since instead of <S_GEN_IDENTIFIER> "type" column alias is matched as <T_TYPE>.
I had the idea of replacing the token with a rule of the same name and checking whether the token of interest lies within some token range (something like [<T_AS> - <T_KEEP_DUPLICATES>]). Unfortunately, it turned out that the syntax for tokens and rules differs, so I can't do it.
I could just copy-paste all tokens inside the new rule but I don't want to do it for obvious reasons.
Is there any way to check if token lies within the range of predefined tokens?
Perhaps you could treat "type" as an unreserved keyword. Then you can follow the advice of question 4.19 of the FAQ
http://www.engr.mun.ca/~theo/JavaCC-FAQ/javacc-faq-moz.htm#tth_sEc4.19

Parsing a HTTP Basic authentication with an email containing a colon character ( ':' )

I'm using the Authorization header with the Basic type for authentication.
I'm following the HTTP Basic authentication specification, which states that the credentials should have the form userIdentifier:password, encoded in base64.
We are using an email as the user identifier and, according to the email format specification, the colon (':') character is permitted.
The colon (':') is also a valid character in the password.
Knowing this, I'm looking for a creative way to parse the credentials part of the header, which uses a colon (':') as the separator between userID and password.
In this case it's simple -> francis@gmail.com:myPassword
This is where it gets complicated -> francis@gmail.com:80:myPasswordWith:Inside
francis@gmail.com:80 is a valid email according to the email format specification, even though this is not used very often. So how do I know where to split?
We have made the decision not to accept an email containing a ':'. But we want to notify the user that their email is not valid, so how can we ensure that we are splitting the string at the right place?
Hope I asked my question in a clear manner, don't hesitate to ask for more details
Thank you
Don’t notify the user that the email is invalid. Split according to the RFC 2617 rules (everything after the first colon is the password), then try to authenticate, fail, and return a generic “authentication failure” message.
A situation where john@example.org:80 has password secret and john@example.org has password 80:secret at the same time seems unrealistic.
If you require your users to register, you probably do it with some other mechanism (forms?) where you can easily separate the username and tell that it is invalid.
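Splitting at the first colon can be sketched as follows. This is an illustration in Python, not a hardened implementation: the function name `parse_basic_credentials` is invented here, and decoding the credentials as UTF-8 is an assumption, since the charset is not discussed above.

```python
import base64
from typing import Tuple


def parse_basic_credentials(header_value: str) -> Tuple[str, str]:
    """Split 'Basic <base64>' into (user_id, password) at the FIRST colon."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        raise ValueError("not a Basic Authorization header")
    # Assumption: credentials are UTF-8 text once base64-decoded.
    decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    user_id, colon, password = decoded.partition(":")
    if not colon:
        raise ValueError("credentials contain no colon separator")
    return user_id, password


raw = "francis@gmail.com:80:myPasswordWith:Inside"
header = "Basic " + base64.b64encode(raw.encode("utf-8")).decode("ascii")
print(parse_basic_credentials(header))
```

With this split, everything after the first colon belongs to the password, so the lookup for a user registered as `francis@gmail.com:80` simply fails and the caller gets the generic authentication failure.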

What are the risks of allowing quote characters as part of a URL parameter?

I need to allow the user to submit queries as follows;
/search/"my search string"
but it's failing because of request validation, as outlined in the following 2 questions:
How to include quote characters as a route parameter? Getting "Illegal characters in path" message
How to modify request validation?
I'm currently trying to figure out how to disable request validation for the quote character, but I'd like to know the risks before I actually put the site live with this disabled. I will not disable request validation unless I can disable it for the quote character only, so I do intend to keep disallowing every other character that's currently not allowed.
According to the URI generic syntax specification (RFC 2396), the double-quote character is explicitly excluded and must be escaped (i.e. %22). See section 2.4.3. The reason given in the spec:
The angle-bracket "<" and ">" and double-quote (") characters are excluded because they are often used as the delimiters around URI in text documents and protocol fields.
You can see easily why this is the case -- imagine trying to create a link in HTML to your URL:
<a href="http://somesite/search/"my search string""/>
That would fail HTML parsing (and also breaks SO's syntax highlighting). You also would have trouble doing basic things with the URL like emailing it to someone (the email client wouldn't parse the URL correctly), posting it on a message board, sending it in an instant message, etc.
For what it's worth, spaces are also explicitly excluded (same section of the RFC explains why).
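For illustration, this is roughly what escaping the search term looks like on the side that builds the link. It is a sketch in Python using the standard `urllib.parse.quote`; the `/search/` prefix comes from the question, and the helper name `search_path` is invented. Whether the server framework's request validation accepts the escaped form is a separate matter that this does not address.

```python
from urllib.parse import quote, unquote


def search_path(term: str) -> str:
    """Build the path with the term percent-encoded.

    safe="" so that a "/" inside the search term is escaped as well.
    """
    return "/search/" + quote(term, safe="")


path = search_path('"my search string"')
print(path)                               # /search/%22my%20search%20string%22
print(unquote(path[len("/search/"):]))    # original term, quotes included
```

The escaped path contains neither quotes nor spaces, so it can sit inside an HTML attribute or an email without breaking the surrounding delimiters.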
