PHPExcel Named Ranges not working with special characters in MS Office - phpexcel

I created dependent dropdown lists as shown here.
The problem is that the named ranges won't work with special characters such as spaces, brackets, and hyphens (underscores work!) in MS Office, although they work fine in LibreOffice.
$objPHPExcel->addNamedRange(
    new PHPExcel_NamedRange(
        'New Zealand',
        $objPHPExcel->getSheetByName('Worksheet 1'),
        'A1:A2'
    )
);
The name New Zealand contains a space, and it causes errors in Excel.
I need it to work in MS Office as well.

Quoting from the MS Excel documentation (and repeating what Tim Williams has said in the comments):
The following is a list of syntax rules that you need to be aware of when you create and edit names.
Valid characters: The first character of a name must be a letter, an underscore character (_), or a backslash (\). Remaining characters in the name can be letters, numbers, periods, and underscore characters.
Note: You cannot use the uppercase and lowercase characters "C", "c", "R", or "r" as a defined name, because they are all used as a shorthand for selecting a row or column for the currently selected cell when you enter them in a Name or Go To text box.
Cell references disallowed: Names cannot be the same as a cell reference, such as Z$100 or R1C1.
Spaces are not valid: Spaces are not allowed as part of a name. Use the underscore character (_) and period (.) as word separators, such as, Sales_Tax or First.Quarter.
Name length: A name can contain up to 255 characters.
Case sensitivity: Names can contain uppercase and lowercase letters. Excel does not distinguish between uppercase and lowercase characters in names. For example, if you created the name Sales and then create another name called SALES in the same workbook, Excel prompts you to choose a unique name.
A name containing a space character may work in LibreOffice, but it will not work in MS Office. The only way to get this working in MS Office is to change the name so that it follows the MS Office naming rules.
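As a rough illustration of the quoted rules, the sketch below turns an arbitrary label into a name that should satisfy them. It is written in Python purely for illustration; the helper name to_excel_name is hypothetical and not part of PHPExcel, and the two cell-reference patterns are approximations rather than Excel's exact logic. The returned string is what would be passed as the range name in place of 'New Zealand'.

```python
import re

# Approximate patterns for names that look like cell references (assumption:
# A1-style with up to three column letters, and R1C1-style such as R1C1 or RC).
_A1_REF = re.compile(r"^\$?[A-Za-z]{1,3}\$?\d+$")
_R1C1_REF = re.compile(r"^[Rr]\d*[Cc]\d*$")


def to_excel_name(label: str) -> str:
    """Return a version of label that follows the quoted naming rules."""
    # Replace every character that is not a letter, digit, underscore or period.
    name = re.sub(r"[^\w.]", "_", label)
    # The first character must be a letter or an underscore.
    if not name or not (name[0].isalpha() or name[0] == "_"):
        name = "_" + name
    # "C", "c", "R", "r" and anything resembling a cell reference are not allowed.
    if name in {"C", "c", "R", "r"} or _A1_REF.match(name) or _R1C1_REF.match(name):
        name = "_" + name
    # Names are limited to 255 characters.
    return name[:255]


print(to_excel_name("New Zealand"))  # New_Zealand
```

Because Excel treats names case-insensitively, two labels that differ only in case (or only in the replaced characters) would still collide after this conversion.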

Related

Dealing with quotation marks in a quote-surrounded string

Take this CSV file:
ID,NAME,VALUE
1,Blah,100
2,"Has space",200
3,"Ends with quotes"",300
4,""Surrounded with quotes"",300
It loads just fine in most statistical programs (R, SAS, etc.) but in Excel the third row is misinterpreted because it has two quotation marks. Escaping the last quote as \" will also not work in Excel. The only way I have found so far is to replace the one double quote with two double quotes:
ID,NAME,VALUE
1,Blah,100
2,"Has space",200
3,"Ends with quotes""",300
4,"""Surrounded with quotes""",300
But that would render the file completely useless for all other programs (R, SAS, etc.)
Is there a way to format the CSV file where strings can begin or end with the same characters as that used to surround them, such that it would work in Excel as well as commonly used statistical software?
Your second representation is the normal way to generate a CSV file, so it should be easy to work with in any software. See the RFC 4180 specification: https://www.ietf.org/rfc/rfc4180.txt
So your second example represents this data:
Obs  id  name                      value
1    1   Blah                      100
2    2   Has space                 200
3    3   Ends with quotes"         300
4    4   "Surrounded with quotes"  300
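To check that the doubled-quote form round-trips, here is a small sketch using Python's standard csv module (chosen only as a neutral example; R and SAS have their own readers). It writes the four observations and reads them back. The variable names are illustrative.

```python
import csv
import io

rows = [
    ["ID", "NAME", "VALUE"],
    ["1", "Blah", "100"],
    ["2", "Has space", "200"],
    ["3", 'Ends with quotes"', "300"],
    ["4", '"Surrounded with quotes"', "300"],
]

# Write the rows; the writer quotes fields containing a quote and doubles it.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
text = buffer.getvalue()
print(text)

# Read the text back and confirm that the original values are recovered.
parsed = list(csv.reader(io.StringIO(text)))
print(parsed == rows)  # True
```

The writer only adds quotes where they are needed, so "Has space" is written unquoted, while the two values containing quotation marks come out in the doubled form shown in the second file.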
If you want to represent it as a delimited file where none of the values are allowed to contain the delimiter (in other words, NOT as a standard CSV file), then it would look like:
id,name,value
1,Blah,100
2,Has space,200
3,Ends with quotes",300
4,"Surrounded with quotes",300
But if you want to allow the values to contain the delimiter, then you need some way to distinguish embedded delimiters from real delimiters. So the standard forces values that contain the delimiter to be quoted. But once you do that, you also need to add quotes around fields that contain the quote character itself (and double the embedded quotes) to avoid making an ambiguous file. For example, the quotes in the 4th observation in your first file look like optional quotes around a value instead of part of the value.
Many programs try to handle ambiguous situations. For example SAS does not allow values to contain embedded line breaks so you will always get four observations with your first example file.
But Excel allows end-of-line characters to be embedded inside quoted values. So in your original file, the second field of the third observation is read as if you had started to put quotes around this value:
Ends with quotes",300
4,"Surrounded with quotes",300
So instead of four complete observations with three field values each, there are only three observations, and the last observation has only two field values.
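Python's csv module also accepts line breaks inside quoted fields, so it can serve as a stand-in to see this effect. This is only an illustration; Excel's own parser may differ in details.

```python
import csv
import io

# The first file from the question, with embedded quotes that are not doubled
# consistently.
ambiguous = (
    'ID,NAME,VALUE\n'
    '1,Blah,100\n'
    '2,"Has space",200\n'
    '3,"Ends with quotes"",300\n'
    '4,""Surrounded with quotes"",300'
)

# Parse with the default (non-strict) dialect and show the field count per record.
records = list(csv.reader(io.StringIO(ambiguous)))
for record in records:
    print(len(record), record)
```

With the default dialect this should yield the header plus three records, where the last record has only two fields and its second field swallows the rest of the file, including the line break.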
This is caused by the fact that the escape character for " in Excel is "" (see: Escaping quotes and delimiters in CSV files with Excel).
A quick and simple workaround that comes to mind in R is to first read the content of the CSV with readLines, then replace each doubled (escaped) double quote with a single double quote, and then parse the result with read.table:
read.table(
  text = gsub(pattern = "\"\"", "\"", readLines("data.csv")),
  sep = ",",
  header = TRUE
)

How to extract characters from a string based on the text surrounding them in R

(Edited to highlight the language I'm using.) I'm using R, and I have many large lists of character strings that share a similar format. I am interested in the characters directly in front of a series of characters that appears consistently in each string, but not at a consistent position. For instance:
a <- "aabbccddeeff"
b <- "aabbddff"
c <- "aabbffgghhii"
d <- "bbffgghhii"
I am interested in extracting the two characters directly preceding the "ff" in each character string. I can't find any reasonable solution apart from breaking each character string down using grepl() and then processing them each independently, which seems like an inefficient way to do it.
You can match those two characters and capture them with sub and the right regular expression.
Strings = c("aabbccddeeff",
            "aabbddff",
            "aabbffgghhii",
            "bbffgghhii")
sub(".*(\\w\\w)ff.*", "\\1", Strings)
[1] "ee" "dd" "bb" "bb"
Explanation: this replaces the entire string with the two characters before the "ff". If there are multiple occurrences of "ff" in the string, this expression takes the two characters before the last "ff".
How this works: the three arguments to sub are:
1. A pattern to search for
2. What it will be replaced with
3. The strings to apply it to.
Most of the work is in the pattern: .*(\\w\\w)ff.*
The ff part of the pattern should be obvious: we are targeting things near the specific string ff. What comes right before it is (\\w\\w). \w refers to a "word character", meaning any letter a-z or A-Z, any digit 0-9, or the underscore character _. We want two characters, so we have \\w\\w. Enclosing \\w\\w in parentheses turns this pattern of two characters into a "capture group", a string that is saved for later use. Since this is the first (and only) capture group in the expression, those two characters can be referred to as \1.
We want only those two characters, so to blow away everything before and after, we put .* at the front and back. . matches any character and * means "zero or more times", so .* means zero or more copies of any character. The pattern therefore breaks the string into four parts: everything before the two characters, the two characters before "ff", "ff" itself, and everything after the "ff". This covers the entire string.
sub replaces the part that was matched (everything) with whatever the replacement says, in this case "\\1". That is how you write a string that evaluates to \1, the reference to the capture group holding the two characters we want. We write it that way because a backslash "escapes" whatever comes after it. We actually want the character \, so we write \\ to indicate \, and "\\1" evaluates to \1. So everything in the string is replaced by the targeted two characters. This is applied to every string in the vector Strings.
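For comparison, the same capture-group idea can be sketched in Python with re.sub. This is not part of the R answer; it is only meant to show that the pattern carries over, with raw strings removing the need to double the backslashes.

```python
import re

strings = ["aabbccddeeff", "aabbddff", "aabbffgghhii", "bbffgghhii"]

# Raw strings keep the backslashes literal, so \w and \1 are written once.
pattern = re.compile(r".*(\w\w)ff.*")

# Replace each whole string with the first capture group.
before_ff = [pattern.sub(r"\1", s) for s in strings]
print(before_ff)  # ['ee', 'dd', 'bb', 'bb']
```

As with the R version, a string that contains no "ff" preceded by two word characters does not match and is returned unchanged, so it may be worth checking for a match first.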

Creating a password regex

Right now I need to duplicate a password expression validator for a website. The password is only required to be 8-25 characters long (alphabetic characters only). I thought this was weird and had been using this regex
(?!^[0-9]*$)(?!^[a-zA-Z]*$)^([a-zA-Z0-9]{8,25})
but it has to be optional to have a capital letter, special characters and/or numbers throughout the password. I'm not particularly good at building regexes with optional characters. Any help would be appreciated.
I am using asp.net's RegularExpressionValidator.
This pattern should work:
^[a-zA-Z]{8,25}$
It matches a string consisting of 8 to 25 Latin letters.
If you want to allow numbers as well, this pattern should work:
^[a-zA-Z0-9]{8,25}$
It matches a string consisting of 8 to 25 Latin letters or decimal digits.
If you want to allow special characters as well, this pattern should work:
^[a-zA-Z0-9$#!]{8,25}$
It matches a string consisting of 8 to 25 Latin letters, decimal digits, or the symbols $, # or ! (of course you can add to this set fairly easily).
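The three patterns can be tried out quickly outside ASP.NET. The sketch below uses Python's re module as a stand-in for the .NET engine, on the assumption that these simple patterns use only syntax common to both. The helper is_valid and the constant names are illustrative.

```python
import re

LETTERS_ONLY = r"^[a-zA-Z]{8,25}$"
LETTERS_DIGITS = r"^[a-zA-Z0-9]{8,25}$"
LETTERS_DIGITS_SYMBOLS = r"^[a-zA-Z0-9$#!]{8,25}$"


def is_valid(pattern: str, password: str) -> bool:
    """Return True if the whole password matches the pattern."""
    return re.fullmatch(pattern, password) is not None


for candidate in ["abcdefgh", "abcdefg1", "abcdef1!", "abc"]:
    print(
        candidate,
        is_valid(LETTERS_ONLY, candidate),
        is_valid(LETTERS_DIGITS, candidate),
        is_valid(LETTERS_DIGITS_SYMBOLS, candidate),
    )
```

Each pattern accepts everything the previous one accepts, plus the extra characters added to the character class.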
Your current regex won't work because it accepts special characters from the 9th character onward (in fact, it accepts anything after the first 8 characters, even a 26th character, because there is no end-of-string anchor).
You probably want something like this:
^(?=.*[a-z])[A-Za-z0-9]{8,25}$
This first makes sure there is at least one lowercase letter (you mentioned that uppercase letters and digits are optional, so this makes lowercase mandatory) and then allows only letters and digits.
EDIT: To allow any special characters, you can use this:
^(?=.*[a-z]).{8,25}$
My understanding of your problem is that the password's first requirement is that it has to contain lowercase alphabet characters. The option now is that it can also contain other characters. If this isn't right, let me know.
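A short check of the two lookahead patterns, again using Python's re module as a stand-in for the .NET engine (an assumption; the lookahead syntax used here is common to both). The variable names are illustrative.

```python
import re

# At least one lowercase letter; only letters and digits allowed.
ALNUM_WITH_LOWER = re.compile(r"^(?=.*[a-z])[A-Za-z0-9]{8,25}$")
# At least one lowercase letter; any characters allowed.
ANY_WITH_LOWER = re.compile(r"^(?=.*[a-z]).{8,25}$")

for candidate in ["ABCDEFg1", "ABCDEFG1", "abcdef1!", "ABCDEF1!"]:
    print(
        candidate,
        bool(ALNUM_WITH_LOWER.fullmatch(candidate)),
        bool(ANY_WITH_LOWER.fullmatch(candidate)),
    )
```

The lookahead (?=.*[a-z]) only checks that a lowercase letter exists somewhere; the part after it decides which characters and how many are allowed.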
regex101 demo

regular expression validate if 5th or 6th character "_"

I am facing a problem with a regular expression in ASP.NET.
I need to validate that the 5th or 6th character is "-".
For example, the string
3000-4567, 3000-4568 is comma-separated, and each item also contains a hyphen. I just need to check that each comma-separated item has "-" as its 5th or 6th character.
The regular expression currently used in the system is
^((\s*\d{4,4}\s*[,]){1,3}?)?(\s*\d{4,4})*$
Currently it validates 3000,4567.
I've made two slight changes to your regex:
'^((\s*\d{4,5}\s*[\-]){1,3}?)?(\s*\d{4,4})*$'
I changed the quantifier of the first numeric group to {4,5} to allow 5-digit numbers (which I guess is what you want, since the dash can be the sixth character) and changed the separator to a dash. Notice the backslash that escapes it, since inside square brackets the dash is a special character (though you probably don't need the brackets there).
As an alternative, consider splitting the string on instances of - and then validating the resulting chunks. That should be much easier.
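The splitting approach could look roughly like the sketch below, shown in Python for brevity rather than in ASP.NET. The function name is hypothetical, and the digit counts (4 or 5 digits before the dash, 4 after) are an assumption taken from the quantifiers in the regex above.

```python
def is_valid_range_list(value: str) -> bool:
    """Check that every comma-separated item looks like 3000-4567."""
    for item in value.split(","):
        parts = item.strip().split("-")
        # Each item must contain exactly one dash.
        if len(parts) != 2:
            return False
        left, right = parts
        # 4 or 5 digits before the dash puts it at the 5th or 6th position.
        if not (left.isascii() and left.isdigit() and len(left) in (4, 5)):
            return False
        # Assumption: exactly 4 digits after the dash.
        if not (right.isascii() and right.isdigit() and len(right) == 4):
            return False
    return True


print(is_valid_range_list("3000-4567, 3000-4568"))  # True
print(is_valid_range_list("3000,4567"))             # False
```

Splitting first keeps each check small and makes it easy to report which item failed, which a single large regex does not.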

SQLite: which character can be ignored with FTS match in one word

I need to find a special character that, when placed in the middle of a word, SQLite FTS MATCH ignores as if it did not exist, e.g.:
Text Body: book's
If my match string is 'books', I need to get "book's" as a result.
Using either the porter or the simple tokenizer is fine.
I tried many characters, such as book!s, book?s, book|s, book,s, book:s…, but a MATCH search for 'books' returned none of them.
I don't understand why.
I am using contentless FTS4 tables and external content FTS4 tables. Many words in my text body contain such characters, which should be ignored when searching.
I cannot change the match query because I do not know where the special character is within the word. Also, I need the original word length to stay equal to the length of the word in the FTS index so that I can use matchinfo() or snippet(); as such, I cannot remove these characters from the text body.
The default tokenizers do not ignore punctuation characters but treat them as word separators.
So the text body or match string book's will end up as two words, book and s.
These will never match a single word like books.
To ignore characters like ', you have to install your own custom tokenizer.
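The splitting behaviour can be observed with a small experiment. The sketch below uses Python's sqlite3 module and assumes the underlying SQLite library was built with FTS4 support; the table name docs is illustrative, and a plain FTS4 table is used instead of a contentless or external content one to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: the SQLite build behind sqlite3 has FTS3/FTS4 enabled.
conn.execute("CREATE VIRTUAL TABLE docs USING fts4(body, tokenize=simple)")
conn.execute("INSERT INTO docs (body) VALUES (?)", ("book's",))


def search(term: str) -> list:
    """Return the bodies of all rows matching the full-text query."""
    rows = conn.execute("SELECT body FROM docs WHERE docs MATCH ?", (term,))
    return [row[0] for row in rows]


print(search("books"))  # no rows: the index holds "book" and "s", not "books"
print(search("book"))   # finds book's
print(search("s"))      # finds book's
```

This only demonstrates the default behaviour; making 'books' match would still require a custom tokenizer, as described above.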

Resources