R error: regular expression is invalid in this locale - r

I am trying to gather all instances of "Walloni\xeb" within a data-frame column in order to remove "\" using the grep function. However, I'm getting the following error message as shown below:
grep("Walloni\xeb", InvoAndinfo2$Regio)
Error in grep("Walloni\xeb", InvoAndinfo2$Regio) :
regular expression is invalid in this locale
Does anyone know what to do to resolve this?

The backslash is a special character in regular expressions. If you want to look for a string that contains a backslash, you should escape it by adding another backslash in front of it.
Try:
grep("Walloni\\xeb", InvoAndinfo2$Regio)
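One more thing that may be worth checking, offered as a sketch rather than a definitive diagnosis: inside an R string literal, "\xeb" is a single byte (ë in Latin-1), not a backslash followed by letters. The vector regio below is hypothetical and stands in for InvoAndinfo2$Regio. The sketch assumes the column holds Latin-1 bytes and either converts them to UTF-8 before matching or compares raw bytes.

```r
# Hypothetical stand-in for InvoAndinfo2$Regio, assumed to hold Latin-1 bytes.
regio <- c("Walloni\xeb", "Vlaanderen", "Brussel")
Encoding(regio) <- "latin1"

# Option 1: convert to UTF-8 so the data is valid in a UTF-8 locale,
# then match the literal text ("\u00eb" is the Unicode escape for e-diaeresis).
regio_utf8 <- iconv(regio, from = "latin1", to = "UTF-8")
hits <- grep("Walloni\u00eb", regio_utf8, fixed = TRUE)

# Option 2: leave the data alone and compare byte by byte.
hits_bytes <- grep("Walloni\xeb", regio, fixed = TRUE, useBytes = TRUE)

hits
hits_bytes
```

Both calls use fixed = TRUE because the goal is to find a literal piece of text, so no regular expression needs to be compiled at all.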

How to pass a chr variable into r"(...)"?

I've seen that since 4.0.0, R supports raw strings using the syntax r"(...)". Thus, I could do:
r"(C:\THIS\IS\MY\PATH\TO\FILE.CSV)"
#> [1] "C:\\THIS\\IS\\MY\\PATH\\TO\\FILE.CSV"
While this is great, I can't figure out how to make this work with a variable, or better yet with a function. See this comment which I believe is asking the same question.
This one can't even be evaluated:
construct_path <- function(my_path) {
r"my_path"
}
Error: malformed raw string literal at line 2
}
Error: unexpected '}' in "}"
Nor this attempt:
construct_path_2 <- function(my_path) {
paste0(r, my_path)
}
construct_path_2("(C:\THIS\IS\MY\PATH\TO\FILE.CSV)")
Error: '\T' is an unrecognized escape in character string starting ""(C:\T"
Desired output
# pseudo-code
my_path <- "C:\THIS\IS\MY\PATH\TO\FILE.CSV"
construct_path(my_path)
#> [1] "C:\\THIS\\IS\\MY\\PATH\\TO\\FILE.CSV"
EDIT
In light of @KU99's comment, I want to add context to the problem. I'm writing an R script to be run from the command line using Windows's CMD and Rscript. I want to let the user who runs my script provide an argument saying where the script's output should be written. Since Windows's CMD accepts paths in the format C:\THIS\IS\MY\PATH\TO, I want to be consistent with that format for the input to my R script. So ultimately I want to take that path input and convert it to a format that is easy to work with inside R. I thought that r"()" could be a proper solution.
I think you're getting confused about what the string literal syntax does. It just says "don't try to escape any of the following characters". For external inputs like text input or files, none of this matters.
For example, if you run this code
path <- readline("> enter path: ")
You will get this prompt:
> enter path:
and if you type in your (unescaped) path:
> enter path: C:\Windows\Dir
You get no error, and your variable is stored appropriately:
path
#> [1] "C:\\Windows\\Dir"
This is not in any special format that R uses; it is plain text. The backslashes are printed this way to avoid ambiguity, but they are "really" just single backslashes, as you can see by doing
cat(path)
#> C:\Windows\Dir
The string literal syntax is only useful for shortening what you need to type. There would be no point in trying to get it to do anything else, and we need to remember that it is a feature of the R interpreter - it is not a function nor is there any way to get R to use the string literal syntax dynamically in the way you are attempting. Even if you could, it would be a long way for a shortcut.
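For the command-line use case described in the question, a minimal sketch might look like the following. The helper name to_forward_slashes is hypothetical, and the raw string stands in for a path that would really arrive from outside R (for example via commandArgs(trailingOnly = TRUE)). Raw strings need R 4.0.0 or later.

```r
# Hypothetical helper: swap backslashes for forward slashes, which R also
# accepts in Windows paths. With fixed = TRUE the pattern "\\" is one literal
# backslash, not a regular expression.
to_forward_slashes <- function(path) gsub("\\", "/", path, fixed = TRUE)

# In a script run with Rscript the path would come from the command line, e.g.
#   args <- commandArgs(trailingOnly = TRUE); out_dir <- args[[1]]
# Here a raw string stands in for that external input.
out_dir <- r"(C:\THIS\IS\MY\PATH\TO)"

# The raw literal and the escaped literal are two spellings of the same string.
same_string <- identical(out_dir, "C:\\THIS\\IS\\MY\\PATH\\TO")

fwd <- to_forward_slashes(out_dir)
cat(fwd, "\n")
```

The conversion is optional: a path read from the command line is already a valid string, and the helper only changes the separator for readability.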

Can't Assign Value of Excel cell To variable in Robot Framework [duplicate]

I'm writing a test case in Robot Framework. I'm getting the response as the JSON string below:
{"responseTimeStamp":"1970-01-01T05:30:00",
"statusCode":"200",
"statusMsg":"200",
"_object":{"id":"TS82",
"name":"newgroup",
"desc":"ttesteste",
"parentGroups":[],
"childGroups":[],
"devices":null,
"mos":null,
"groupConfigRules":{
"version":null,
"ruleContents":null
},
"applications":null,"type":0
}
}
From that I want to take "_object" using:
${reqresstr}  =  ${response['_object']}
... but I am getting the error "No keyword with name '=' found".
If I try the following:
${reqresstr}=  ${response['_object']}
... I'm getting the error "Keyword name cannot be empty." I tried removing the '=' but still get the same error.
How can I extract '_object' from that json string?
When using the "=" for variable assignment with the space-separated format, you must make sure you have no more than a single space before the "=". Your first example shows that you've got more than one space on either side of the "=". You must have only a single space before the = and two or more after, or robot will think the spaces are a separator between a keyword and argument.
For the "keyword must not be empty" error, the first cell after a variable name must be a keyword. Unlike traditional programming languages, you cannot directly assign a string to a variable.
To set a variable to a string you need to use the Set Variable keyword (or one of its variations such as Set Test Variable). For example:
${reqresstr}=    Set Variable    ${response['_object']}
${reqresstr}=  '${response["_object"]}'
Wrap it inside quotes and put two spaces after the =.
There is a syntax error in your command. Make sure there is a space between ${reqresstr} and =.
Using your example above:
${reqresstr} = ${response['_object']}

R customize error message when string contains unrecognized escape

I would like to give a more informative error message when users of my R functions supply a string with an unrecognized escape
my_string <- "sql\sql"
# Error: '\s' is an unrecognized escape in character string starting ""sql\s"
Something like this would be ideal.
my_string <- "sql\sql"
# Error: my_string contains an unrecognized escape. Try sql\\sql with double backslashes instead.
I have tried an if statement that looks for single backslashes
if (stringr::str_detect("sql\sql", "\")) stop("my error message")
but I get the same error.
Almost all of my users are Windows users running R 3.3 and up.
Code execution in R happens in two phases. First, R takes the raw string you enter and parses that into commands that can be run; then, R actually runs those commands. The parsing step makes sure what you've written actually makes sense as code. If it doesn't make any sense, then R can't even turn it into anything it can attempt to run.
The error message you are getting about the unrecognized escape sequence is happening at the parsing stage. That means R isn't even attempting to execute the command; it simply can't understand what you are saying. There is no way to catch an error like this in code because there's no user code running at that point.
So if you are counting on your users writing code like my_string <- "something", then they need to write valid code. They can't change how strings are encoded or what the assignment operator looks like or how variables can be named. They also can't type !my_string! <=== %something% because R can't parse that either. R can't parse my_string <- "sql\sql" but it can parse my_string <- "sql\\sql" (backslashes must be escaped in string literals). If they are not savvy users, you might want to consider providing an alternative interface that can sanitize user input before trying to run it as code. Maybe make a shiny front end or have users pass arguments to your scripts via command line parameters.
If you're capturing your user input correctly, then for a string input of \, R will store it in my_string and print it as \\.
readline()
\
[1] "\\"
readline()
sql\sql
[1] "sql\\sql"
That means internally in R:
my_string <- "sql\\sql"
However
cat(my_string)
sql\sql
To check the input, you need to escape each escape, because you're looking for \\
stringr::str_detect(my_string, "\\\\")
Which returns TRUE if the input string is sql\sql. So the full line is:
if (stringr::str_detect("sql\\sql", "\\\\")) stop("my error message")
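A base R variant of the same check, as a sketch: with fixed = TRUE the pattern is treated as literal text, so only the string-literal level of escaping is needed. The helper name has_backslash is hypothetical.

```r
# Hypothetical helper: TRUE when x contains a literal backslash.
# fixed = TRUE turns off regex interpretation, so "\\" is one backslash.
has_backslash <- function(x) grepl("\\", x, fixed = TRUE)

my_string <- "sql\\sql"        # typed with two backslashes, stored with one
n_chars <- nchar(my_string)    # the backslash counts as a single character

with_slash <- has_backslash(my_string)
without_slash <- has_backslash("sqlsql")

# The regex form from the answer gives the same result for this input.
regex_form <- grepl("\\\\", my_string)
```

This only works on strings that already exist in R. It cannot rescue a literal such as "sql\sql" typed into the console, because that fails at the parsing stage before any function is called.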

Warning on regex string in Python

I am writing a small function to strip all the weird characters from a string, e.g. #$& will each be replaced with " ".
The chars I am trying to remove are the following, defined into a string:
xChars = r"#$%()'^*\;:/|+_.–°ªº"
However, I keep getting the warning:
Anomalous backslash in string: '\;'. String constant might be missing an r prefix
When I use the r prefix, e.g. r"\", Python seems to skip some of the special characters I want to replace. It doesn't produce an error; it just treats those characters as acceptable and leaves them in.
Any ideas on how to fix this?
Normally a backslash starts an escape sequence, so the linter cannot tell whether you meant a literal backslash. Try doubling the backslash in a regular (non-raw) string to escape the backslash itself, like: xChars = "#$%()'^*\\;:/|+_.–°ªº"

Nesting more than two types of quotes in R

I would like to know how to accommodate more than two types of quotes on the same line in R. Let's say that I want to print:
'first-quote-type1 "first-quote-type2 "second-quote-type2 'sencond-quote-type1
Using one quote in the beginning and one in the end we have:
print("'first-quote-type1 "first-quote-type2 "second-quote-type2 'sencond-quote-type1")
Error: unexpected symbol in "print("'first-quote-type1 "first"
I tried using triple quotes, as Python requires in these cases:
print(''''first-quote-type1 "first-quote-type2 "second-quote-type2 'sencond-quote-type1''')
print("""'first-quote-type1 "first-quote-type2 "second-quote-type2 'sencond-quote-type1""")
However, I also got a similar error. Any idea how to make this syntax work in R?
To use a quote within a quoted string, escape the quote character with a backslash:
print("the man said \"hello\"")
However, print() in R always displays the escape characters.
To show the string without the escapes, use cat() instead:
cat("the man said \"hello\"")
This prints:
the man said "hello"
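As a further sketch, assuming R 4.0.0 or later is available: a raw string literal r"(...)" can hold both quote types without any escaping, as long as the text does not contain the closing sequence )". The variable names are hypothetical, and the typo in the question's sample text is corrected here.

```r
# Raw string (R >= 4.0.0): both quote types appear unescaped.
raw_version <- r"('first-quote-type1 "first-quote-type2 "second-quote-type2 'second-quote-type1)"

# Same text written as an ordinary literal with escaped double quotes.
escaped_version <- "'first-quote-type1 \"first-quote-type2 \"second-quote-type2 'second-quote-type1"

# Both spellings produce the same string.
same_text <- identical(raw_version, escaped_version)

# cat() shows the text without escape characters.
cat(raw_version, "\n")
```

On older versions of R the escaped form is the only option, since the raw string syntax is not recognized by the parser there.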
