What are accepted character sets in name fields in PassengerDetailsRQ?
I can't find relevant information about character sets in https://developer.sabre.com/docs/soap_apis/management/itinerary/Passenger_Details/resources/
For example, umlaut characters complicate the issue and leave me wondering whether I need to force Latin characters only.
From my testing, "-", ".", and numbers are not allowed; the request fails with a ".FRMT.NOT ENT BGNG WITH" message.
I'm interested in the complete list of accepted character sets because I'm not sure which symbols can be entered.

Restricting names to plain, unaccented Latin letters would be correct: no hyphens, numbers, umlauts, accent marks, apostrophes, or special characters like ñ are accepted.
Since PassengerDetails follows the same mandates as native Sabre, it might be useful to check the native Sabre help site, called Format Finder. For this situation in particular, you may check here. (To log in there, you can use the same credentials you use for creating sessions.)
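If you want to fail fast before building the request, a client-side pre-check is easy to add. Below is a minimal C# sketch assuming the plain-Latin-letters rule above; the NameFieldIsValid helper and the allowance for spaces are my own assumptions, not Sabre documentation, so verify the exact rule in Format Finder.

using System.Text.RegularExpressions;

static class NameValidation
{
    // Assumption: Sabre name fields accept only unaccented Latin letters;
    // a space is allowed here for multi-part names (verify in Format Finder).
    static readonly Regex AllowedChars = new Regex("^[A-Za-z ]+$");

    public static bool NameFieldIsValid(string name) =>
        !string.IsNullOrEmpty(name) && AllowedChars.IsMatch(name);
}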

Related

How to read the multiple CSV files generated at one go dynamically in Oracle 11g?

We want to read multiple CSV files, generated in one go, dynamically through Oracle PL/SQL or Oracle Proc (for one of our requirements), and we are looking for some pseudo-code snippets or logic to build this.
We searched for this but had no luck. The requirement has to be met purely through Oracle; no Java is involved here.
I dealt with this problem in the past, and what I did was write a (quite easy) parsing function similar to split. It accepts two variables, a string and a separator, and returns an array of strings.
You then load the whole file into a text variable (declared big enough to hold the whole file) and invoke the split function (with EOL as the separator) to split the buffer into lines.
Then, for each line, invoke the parser again using the comma as the separator.
Though the parser is simple, you need to take into account the possible conditions (e.g. bypassing blanks that are not part of a string, single/double quote management, etc.).
Unfortunately, I left the company at which the parser was developed; otherwise I would have posted the source here.
Hope this helps you.
UPDATE: Added some PSEUDO-CODE
For the Parser:
This mechanism is based on a state-machine concept.
Define a variable that will reflect the state of the parsing; possible values being: BEFORE_VALUE, AFTER_VALUE, IN_STRING, IN_SEPARATOR, IN_BLANK; initially, you will be in state BEFORE_VALUE;
Examine each character of the received string and, based on the character and the current state, decide which action to take and which state to move to next;
It is up to you to decide what to do with blanks that precede a value, like the one before ccc in aaa,bbb, ccc,ddd (in my case, I ignored them).
Whenever you start or continue a value, you append the character to a temporary variable;
Once you finished a value, you add the collected sub-string (stored in the temporary variable) to the array of strings.
The state machine mechanism is needed to properly handle situations such as a comma that is part of a quoted string value (which makes it impossible to simply search for commas and chop the whole string at each one).
Another point to take into account is empty values, which are represented by two consecutive commas (i.e. in your state machine, finding a comma while your state is IN_SEPARATOR means that you just passed an empty value).
Note that exactly the same mechanism can be used for splitting the initial buffer into lines and then each line into fields (the only differences are the input string, the separator, and the delimiter); a sketch follows below.
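As an illustration, here is a minimal C# sketch of such a state machine (the same logic ports to PL/SQL; the state names and the lack of doubled-quote handling are simplifications of my own):

using System.Collections.Generic;
using System.Text;

static class CsvParser
{
    enum State { BeforeValue, InValue, InString, AfterString }

    // Splits one buffer into fields on the given separator, honoring double quotes.
    public static List<string> Split(string input, char separator)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        var state = State.BeforeValue;

        foreach (char c in input)
        {
            switch (state)
            {
                case State.BeforeValue:                        // blanks before a value are ignored
                    if (c == '"') state = State.InString;
                    else if (c == separator) fields.Add("");   // two separators in a row: empty value
                    else if (!char.IsWhiteSpace(c)) { current.Append(c); state = State.InValue; }
                    break;
                case State.InValue:
                    if (c == separator) { fields.Add(current.ToString()); current.Clear(); state = State.BeforeValue; }
                    else current.Append(c);
                    break;
                case State.InString:                           // separators inside quotes are data
                    if (c == '"') state = State.AfterString;
                    else current.Append(c);
                    break;
                case State.AfterString:                        // blanks after the closing quote are ignored
                    if (c == separator) { fields.Add(current.ToString()); current.Clear(); state = State.BeforeValue; }
                    break;
            }
        }
        fields.Add(current.ToString());                        // flush the last value
        return fields;
    }
}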
For the File handling process:
Load the file into a local buffer (big enough; preferably a CLOB),
Split the file into records (using the above function) and then loop through the received records,
For each record, invoke the parser with the correct parameters (i.e. the record string, the delimiter, and ',' as separator),
The parser will return you the fields contained in the record with which you can proceed and do whatever you need to do.
Well, I hope this helps you to implement the needed code (it looks complex, but it is not; just code it slowly, taking into account the possible conditions you may encounter when running the state machine).
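And a matching driver under the same assumptions (data.csv is a placeholder path; in PL/SQL the buffer would be a CLOB filled via UTL_FILE or DBMS_LOB):

using System;
using System.Collections.Generic;
using System.IO;

class CsvDriver
{
    static void Main()
    {
        // Load the whole file into one buffer, then reuse the same state
        // machine twice: first with EOL as separator, then with the comma.
        string buffer = File.ReadAllText("data.csv");
        foreach (string record in CsvParser.Split(buffer, '\n'))
        {
            string line = record.TrimEnd('\r');
            if (line.Length == 0) continue;                    // skip blank lines
            List<string> fields = CsvParser.Split(line, ',');
            Console.WriteLine(string.Join(" | ", fields));
        }
    }
}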

Constructing a Windows Search query

We have a website in which we will be using the Windows Search feature to allow users to search the pages and documents of the site. I would like to make the search as intuitive as possible, given that most users are already familiar with Google-style search syntax. However, using Windows Search seems to present two problems.
If I use the FREETEXT() predicate, then users can enter certain google-style syntax options, such as double quotes for exact phrase matching or use the minus sign to exclude a certain word. These are features I consider necessary. However, the FREETEXT() predicate seems to demand that every search term appear somewhere in the page / document in order for it to be returned in the results.
If I use the CONTAINS() predicate, then users can enter search terms using boolean operators, and they can execute wildcard searches using the * character. However, all search terms must be joined by one of the logical operators or enclosed in double quotation marks.
What I would like is a combination of the two. Users should be able to search for exact phrases using double quotation marks and exclude words using the minus sign, but also have anything not enclosed in quotation marks be subject to wildcard matching (e.g. searching for civ would return documents containing the words civil or civility or civilization).
How could I go about implementing this?
I followed some of the instructions at http://www.codeproject.com/Articles/21142/How-to-Use-Windows-Vista-Search-API-from-a-WPF-App to create the Interop.SearchAPI.dll assembly for .NET. I then used the ISearchQueryHelper.GenerateSQLFromUserQuery() method to build the SQL command.
The generated SQL uses the CONTAINS() predicate, but it builds the CONTAINS() predicate numerous times with different combinations of the search terms, including wildcards. This allows the user to enter exact phrases using double quotation marks, exclude words using the minus sign, and get automatic wildcard matching, as I described in the original question.
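For reference, the call sequence looks roughly like this in C#. The Microsoft.Search.Interop namespace comes from the generated interop assembly and may be named differently depending on how you ran tlbimp; the query string is just an example.

using System;
using Microsoft.Search.Interop;   // from the generated Interop.SearchAPI.dll

class SearchSqlDemo
{
    static void Main()
    {
        // Ask the Windows Search service for a query helper on the system index.
        CSearchManager manager = new CSearchManager();
        ISearchCatalogManager catalog = manager.GetCatalog("SystemIndex");
        ISearchQueryHelper helper = catalog.GetQueryHelper();
        helper.QuerySelectColumns = "System.ItemUrl, System.ItemName";

        // Google-style input: an exact phrase, an excluded word, and a bare term.
        string sql = helper.GenerateSQLFromUserQuery("\"civil rights\" -war civ");
        Console.WriteLine(sql);   // a SELECT built from several CONTAINS() clauses
    }
}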

How to check if user input data is in other than English language?

I am using the Facebook API in my app to do user authentication and then save the user data into the DB. I use the same (i.e. Facebook) username for my app if it exists; otherwise I create the username from the user's name. The problem is that some users don't have their display name in English. How can I check for such input on the server side?
My app is written in ASP.NET.
You can use regular expressions to check that the characters are only a, b, c...z or A, B, C...Z:

using System.Text.RegularExpressions;

Regex rgx = new Regex("^[a-zA-Z]+$");
if (rgx.IsMatch(inputData))
{
    // input data uses only the English alphabet; take appropriate action...
}
else
{
    // input data is not limited to the English alphabet; take appropriate action...
}
It may be overkill for this task, but the correct way to detect the input language is to use something like the Extended Linguistic Services API or a service like the Free Language Detection API.
In your case I suggest saving user names in an appropriate encoding (like UTF-8 or UTF-16, which should be fine for user names on Facebook).
Your problem isn't that the usernames are in a foreign language, but rather that you are trying to store data into a database without using the appropriate character encoding (the only reason I've ever seen those ??? is when the character encoding was at least one level too low for the problem at hand).
At a minimum you should be using UTF-8, but you probably want to use UTF-16 (or even UTF-32 if you're being really conservative). I also recommend this mandatory reading.
Determining whether a username is in English is impossible; there are too many possible variants on proper nouns to provide any reliable metric, and then there are transplanted names and the like. You can try to detect whether there are non-ASCII characters (I believe /[^ -~]/ should match all of them: space is the lowest "typeable" character in ASCII and ~ is the highest), but then you are compensating for the Unicode problem instead of letting the computer handle it gracefully.
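For instance, the /[^ -~]/ idea translates to .NET like this (a sketch of the ASCII check only, not a language detector):

using System;
using System.Text.RegularExpressions;

class AsciiCheck
{
    // [^ -~] matches any character outside printable ASCII
    // (space is the lowest "typeable" character, '~' the highest).
    static readonly Regex NonAscii = new Regex("[^ -~]");

    static void Main()
    {
        Console.WriteLine(NonAscii.IsMatch("John Smith"));   // False
        Console.WriteLine(NonAscii.IsMatch("Jöhn"));         // True
    }
}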

What explains Firefox and Safari's different treatment of user-supplied URIs containing more than one # symbol? Which is 'right'?

In Firefox 4.0.1 paste the following into the address bar
http://www.w3.org/#one#two
Note that the browser navigates to the w3.org front page and the address bar still reads
http://www.w3.org/#one#two
In Safari 5.0.4 do the same. Note the browser also navigates, but the address bar text is modified to read
http://www.w3.org/#one%23two
Note that the first appearance of the hash in the string is not altered, but the second is modified to the encoded (aka 'escaped') form %23.
It seems reasonable to assume that Safari is trying to convert the user-supplied URI to a link that meets its idea of a valid URI. Firefox does not make a conversion in this case.
I would like to account for the difference in behavior.
The document at http://www.ecma-international.org/publications/standards/Ecma-262.htm is one reference to what form a valid URI takes. In section 15.1.3.1 it states the following with respect to unescaping of URIs by browsers.
The character “#” is not decoded from escape sequences even though it is not a reserved URI character.
What this arguably implies is that the rule refers to # symbols throughout the URI string, not just the first occurrence.
In conclusion, my question is:
Do both forms of the link meet the latest standard for valid URIs?
If they are both valid, which browser behavior is most appropriate?
RFC 3986 (the definition of what URIs, and thus URLs, look like and what the parts mean) does not allow two unencoded # characters in one URL, at least in my reading. That makes the question boil down to:
Is it better to forward the user error to the web application (where the designer might have made the same mistake),
or is it better to transform the user input into something closely-related, but valid?
Also note that the RFC clearly lists # as a reserved character, so the ECMA standard is wrong in the passage you quoted above.
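For what it's worth, .NET's Uri class follows the same reading: everything after the first # is the fragment, and a literal # inside it is expected to arrive percent-encoded (the exact leniency may vary by framework version):

using System;

class FragmentDemo
{
    static void Main()
    {
        // Per RFC 3986, the fragment starts at the first '#'.
        Console.WriteLine(new Uri("http://www.w3.org/#one%23two").Fragment); // #one%23two
        // An unencoded second '#' is kept as-is inside the fragment here,
        // mirroring the Firefox behavior described above.
        Console.WriteLine(new Uri("http://www.w3.org/#one#two").Fragment);   // #one#two
        Console.WriteLine(Uri.EscapeDataString("#"));                        // %23
    }
}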

Creating links to ontology nodes

Let's say that, abstracting from any particular language, we have some ontology made of triples (e.g. subject (S) - predicate (P) - object (O)).
Now if I want, for some reason, to annotate any of these triples (nodes), then I'd like to keep links to them that I can use in web documents.
Here are some conditions:
1) Such link must be in a form of one line of text
2) Such link should be easily parseable both by machine and person
3) Sections of such links should be delimited
4) Such link must be easy to grep, which IMO means they should be wrapped in some distinct letters or characters to make them easy to regex from any web or other document
5) Such link can be used in URL pathnames or query strings, thus has to comply with URL syntax
6) Characters used in such link must not be reserved for URL pathnames, query strings or hashes (e.g. not "/", ";" "?", "#")
My ideas so far were as follows:
a) Start and end such link with some distinct, constant set of letters, e.g. STK_....._OVRFLW
b) Separate sections with dashes "-", e.g. Subject-Predicate-Object
So it would look like:
STK_S1234-P123-O1234_OVRFLW
Do you have better ideas?
I'm with #msalvadores on this one - this seems to be a classic use of semantic web / linked data (albeit in a rather complex form), and your example seems to be more related to URI design than anything else.
# is dealt with extensively in the semantic web literature, and there are JavaScript libraries for querying RDF through SPARQL - it just makes more sense to stick with the standard.
To link to a triple, the standard method is to use reification - essentially naming the triple (to keep with the triple model, this ends up creating four triples, but I would consider it the "correct" method in this situation). There is also the "named graph" method, which isn't a standard but probably has more widespread adoption.
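Concretely, standard RDF reification names the triple with four statements. In Turtle-style notation (with _:stmt and the S/P/O identifiers as placeholders):

_:stmt rdf:type      rdf:Statement .
_:stmt rdf:subject   :S1234 .
_:stmt rdf:predicate :P123 .
_:stmt rdf:object    :O1234 .

The statement node (or, better, a real URI minted for it) is then what your web documents link to.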
The link will be 1 line of text
It will be easily machine-parsable; to make it human-parsable, it might be necessary to give some thought to URI design.
Delimitation is, once again, down to URI design.
Easy grepping - URI design.
URL syntax - tick.
No "/", ";", "?", "#" - I would try to incorporate the identifier into a URL instead of pushing those characters out.
I would consider www.stackoverflow.com/statement/S1234_P123_O123, where S1234 etc. are unique labels (I don't necessarily agree with human-readable URIs, but I guess they'll have to stay until humans no longer have to read URIs). The beautiful thing is that it should dereference and give a nice human- and machine-readable representation.
