JavaCC match token group

I ended up writing a parser for a small subset of SQL.
The grammar has a lot of regular tokens (SELECT, CREATE, ...) and a few more general (e.g. S_GEN_IDENTIFIER matches [A-Z_.\d]|\"(~[\n, \r, \"])*\").
The problem is, "SELECT col AS type ..." doesn't get parsed because the column alias "type" is matched as <T_TYPE> instead of <S_GEN_IDENTIFIER>.
I had an idea to replace the token with a rule of the same name and check whether the token of interest lies within some token range (something like [<T_AS> - <T_KEEP_DUPLICATES>]). Unfortunately, it turned out that the syntax for tokens and rules differs, so I can't do it.
I could just copy-paste all tokens inside the new rule but I don't want to do it for obvious reasons.
Is there any way to check if token lies within the range of predefined tokens?

Perhaps you could treat "type" as an unreserved keyword. Then you can follow the advice of question 4.19 of the JavaCC FAQ:
http://www.engr.mun.ca/~theo/JavaCC-FAQ/javacc-faq-moz.htm#tth_sEc4.19

Related

Remove http:// or https:// and Trailing / in NetSuite Saved Search

Let me preface this by stating very clearly that I am not a developer and I'm new to NetSuite formulas.
I have a NetSuite saved search that includes the Web Address (field id: {url}).
I need to remove everything except the main part of the domain (end result should look like abc.com).
I have attempted to use REPLACE({url}, 'http://[,' ']) unsuccessfully.
I have also attempted various LTRIM, RTRIM, TRIM formulas without luck.
I found some information on using REGEXP_SUBSTR, but wasn't successful there either.
I was able to accomplish my goal in Excel using Excel string functions MID, LEN, and RIGHT, but that doesn't seem to translate in NetSuite.
I'd love some assistance.

REGEXP_SUBSTR({url}, '//(.)+') --> get substring starting with //
REPLACE({text}, '/') --> replace / with nothing
The final formula is:
REPLACE(REGEXP_SUBSTR({url}, '//(.)+'), '/')

Jala's answer doesn't seem to work for URLs such as https://stdun7.wixsite.com/stdunstansparish where it returns stdun7.wixsite.comstdunstansparish
In your saved search, create a Formula (Text) field with the following formula:
REGEXP_REPLACE({url},'(^http[s]?://)([a-zA-Z0-9.-]+)(/?.*)', '\2')
I'll break down the arguments for the REGEXP_REPLACE function and how it all works...
First argument - {url}, the field containing the URL information to parse
Second argument - the regexp string
Third argument - the replacement string
The regexp string uses parentheses to denote capture groups, each covering a portion of the regular expression.
The first capture group captures the protocol portion of the URL.
The second capture group captures the next part, all permissible hostname characters until the end of the string, or until a '/'
The third capture group captures the remaining portion of the string.
The replace string is used to prepare the return value of the REGEXP_REPLACE function. Since the entire URL is matched by the regexp, the entire string will be replaced by this expression, which references the second capture group (i.e. the hostname).
Since you say you're new to NetSuite formulas, I'll note that those functions are based on Oracle PL/SQL, so if you want additional info or examples of how they work beyond what NetSuite provides, it's sometimes instructive to just google things like "pl/sql REGEXP_SUBSTR" to get additional documentation on how they work.
Another good resource is regex101.com, a helpful site to test regular expressions in advance....
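
As a rough cross-check outside NetSuite, the same capture-group idea can be sketched in Python. This assumes the pattern has a `+` after the hostname character class and a `.*` in the third group, so that the whole URL is matched; `extract_host` is a hypothetical helper name, and Python's `re` module is not identical to Oracle's regex engine, so treat this as an illustration rather than a statement of NetSuite's behaviour.

```python
import re

# Assumed pattern: (protocol)(hostname characters)(optional slash and the rest).
URL_PATTERN = re.compile(r'(^http[s]?://)([a-zA-Z0-9.-]+)(/?.*)')

def extract_host(url: str) -> str:
    """Replace the whole matched URL with capture group 2 (the hostname)."""
    return URL_PATTERN.sub(r'\2', url)

print(extract_host('https://stdun7.wixsite.com/stdunstansparish'))  # stdun7.wixsite.com
print(extract_host('http://abc.com/'))                              # abc.com
```

Because the third group swallows everything after the hostname, the path no longer leaks into the result the way it does when only the '/' characters are stripped.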

jsonapi.org correct way to use pagination using the page query string

In the jsonapi documentation on pagination, it says the following:
For example, a page-based strategy might use query parameters such as
page[number] and page[size]
How would I represent this in the query string? http://localhost:4200/people?page[number]=1&page[size]=25, I don't think using a map link structure is a valid query string. Only the page parameter is reserved according to the documentation.

I don't think using a map link structure is a valid query string.
You're right technically, and that's why the spec has the note that says:
Note: The example query parameters above use unencoded [ and ] characters simply for readability. In practice, these characters must be percent-encoded, per the requirements in RFC 3986.
So, page[size] is really page%5Bsize%5D which is a valid query parameter name.
Only the page parameter is reserved according to the documentation.
When the spec text says that only page is reserved, it actually means that any page[......] style query parameter is reserved. (I can tell you that for sure as one of the spec's editors.) But it should say so more explicitly, so I'll open an issue for it.
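
To see what the percent-encoded form looks like in practice, here is a minimal sketch using Python's standard `urllib.parse`. The page number and size are simply the values from the question; most HTTP clients do this encoding for you.

```python
from urllib.parse import urlencode, parse_qs

# Parameter names as written in the spec, with unencoded brackets.
params = {'page[number]': 1, 'page[size]': 25}

# urlencode percent-encodes the brackets in the parameter names.
query = urlencode(params)
print(query)  # page%5Bnumber%5D=1&page%5Bsize%5D=25

# Parsing the query string restores the readable names.
decoded = parse_qs(query)
print(decoded)  # {'page[number]': ['1'], 'page[size]': ['25']}
```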

User info in URI without password

I know that URI supports the following syntax:
http://[user]:[password]@[domain.tld]
When there is no password or if the password is empty, is there a colon?
In other words, should I accept this:
http://[user]:@[domain.tld]
Or this:
http://[user]@[domain.tld]
Or are they both valid?

The current URI standard (STD 66) is RFC 3986, and the relevant section is 3.2.1. User Information.
There it’s defined that the userinfo subcomponent (which gets followed by @) can contain any combination of
the character :,
percent-encoded characters, and
characters from the sets unreserved and sub-delims.
So this means that both of your examples are valid.
However, note that the format user:password is deprecated. Still, the RFC gives recommendations on how applications should handle such URIs, i.e., everything after the first : character should not be displayed by applications, unless
the data after the colon is the empty string (indicating no password).
So according to this recommendation, the userinfo subcomponent user: indicates that there is the username "user" and no password.
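
As an illustration of how one parser treats the two forms, here is a small sketch with Python's standard `urllib.parse`, using the standard `@` delimiter between userinfo and host. The host `example.com` is a placeholder, and other libraries may report the missing password differently.

```python
from urllib.parse import urlsplit

with_colon = urlsplit('http://user:@example.com')
without_colon = urlsplit('http://user@example.com')

# Both forms parse; only the reported password differs.
print(with_colon.username, repr(with_colon.password))        # user ''
print(without_colon.username, repr(without_colon.password))  # user None
print(with_colon.hostname == without_colon.hostname)         # True
```

Here the trailing colon yields an empty-string password, while the form without a colon yields no password at all, so an application accepting both may want to normalise the two cases.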

This is more a matter of convenience, and both are valid. I would go with http://[user]@[domain.tld] (and prompt for a password) because it's simple and unambiguous. It gives the user no reason to wonder whether anything has to be added after the :

Empty URI query string parameters: "a=&b=" versus "a&b"

Should the following URLs be considered functionally equivalent?
http://example.com/foo?a=&b=
http://example.com/foo?a&b
This came about when a user of a Drupal module I wrote which parses apart and then rewrites URIs noticed that the code sometimes causes the query string parts to change in unexpected ways due to how some of the underlying PHP functions behave. For example:
parse_str("a&b", $values); print http_build_query($values);
a=&b=
Is this something I should bother worrying about?
Edit so SO stops complaining that this question is similar to another one: The question is whether it's safe to assume that "no value for X" and "empty value for X" are equivalent, not whether the "no value" style is syntactically correct (which it is).

RFC 3986 Uniform Resource Identifier (URI): Generic Syntax doesn't have anything to say about the structure of the query string aside from how characters like ? should be dealt with. So strictly speaking, your two example URLs are different. Of course, the application which receives those query strings may treat them as functionally equivalent, but this isn't something you can determine from the URL alone.
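
The PHP round trip from the question has a close analogue in Python's standard library, sketched below. It only shows that this particular parser (with `keep_blank_values=True`) treats the two forms as equivalent; it says nothing about how a given receiving application behaves.

```python
from urllib.parse import parse_qsl, urlencode

# keep_blank_values=True keeps parameters that have no value.
no_value = parse_qsl('a&b', keep_blank_values=True)
empty_value = parse_qsl('a=&b=', keep_blank_values=True)

print(no_value)                 # [('a', ''), ('b', '')]
print(no_value == empty_value)  # True

# Rebuilding the query string always emits the "=" form.
print(urlencode(no_value))      # a=&b=
```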

As per RFC 6570, empty query parameters are allowed. Please refer to section 3.2.9.
Example Template Expansion
{&x,y,empty} &x=1024&y=768&empty=

Parameter separator in URLs, the case of misused question mark

What I don't really understand is the benefit of using '?' instead of '&' in urls:
It makes nobody's life easier if we use a different character as the first separator character.
Can you come up with a reasonable explanation?
EDIT: after more research I found that "&" can be a part of file name (terms&conditions.html) so "?" is a good separator. But still I think using "?" for separators makes lives easier (from url generators and parsers point of view):
Is there any advantage in using "&" which is not clear at the first glance?

From the point of view of the URI spec (RFC 3986), the only separator here is "?". The format of the query is opaque; the ampersands are just something that HTML happens to use for form submissions.

The answer's pretty much in this article - http://www.skorks.com/2010/05/what-every-developer-should-know-about-urls/ . To highlight it, here goes :
Query is the preferred way to send some parameters to a resource on
the server. These are key=value pairs and are separated from the rest
of the URL by a ? (question mark) character and are normally separated
from each other by & (ampersand) characters. What you may not know is
the fact that it is legal to separate them from each other by the ;
(semi-colon) character as well. The following URLs are equivalent:
http://www.blah.com/some/crazy/path.html?param1=foo&param2=bar
http://www.blah.com/some/crazy/path.html?param1=foo;param2

RFC 3986 (https://www.ietf.org/rfc/rfc3986.txt) defines general and sub delimiters: '?' is a general delimiter, while '&' and ';' are sub-delimiters. The spec is pretty clear about that.
In this case, any later '?' characters would be treated as part of the query. If the query parser follows the spec strictly, it would then pass the whole query on to the app-destination. The app-destination could choose to further process the query string in a manner that treats the ? as a delimiter between name-value pairs; that is up to the app's designers.
My guess is that this often 'just works' because code that splits query strings and the original uri uses all delimiters for matching: 1) first query is split on '?' then 2) query string is parsed using char match list that includes '?' (convenience only).... This could be occurring in ubiquitous parsing libraries already.
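
For comparison, here is a short sketch of how one spec-following parser, Python's standard `urllib.parse`, handles a second "?". The URL is a made-up example; the point is only that the later "?" stays inside the query and ends up in a parameter value.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical URL that uses "?" where "&" would normally appear.
parts = urlsplit('http://example.com/path?a=1?b=2')

print(parts.path)   # /path
print(parts.query)  # a=1?b=2

# The query parser splits on "&" only, so "?b=2" becomes part of a's value.
pairs = parse_qsl(parts.query)
print(pairs)        # [('a', '1?b=2')]
```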