R customize error message when string contains unrecognized escape

I would like to give a more informative error message when users of my R functions supply a string with an unrecognized escape:
my_string <- "sql\sql"
# Error: '\s' is an unrecognized escape in character string starting ""sql\s"
Something like this would be ideal.
my_string <- "sql\sql"
# Error: my_string contains an unrecognized escape. Try sql\\sql with double backslashes instead.
I have tried an if statement that looks for single backslashes
if (stringr::str_detect("sql\sql", "\")) stop("my error message")
but I get the same error.
Almost all of my users are Windows users running R 3.3 and up.

Code execution in R happens in two phases. First, R takes the raw string you enter and parses that into commands that can be run; then, R actually runs those commands. The parsing step makes sure what you've written actually makes sense as code. If it doesn't make any sense, then R can't even turn it into anything it can attempt to run.
The error message you are getting about the unrecognized escape sequence happens at the parsing stage. That means R isn't even attempting to execute the command; it simply can't understand what you wrote. There is no way to catch an error like this in code, because no user code is running at that point.
So if you are counting on your users writing code like my_string <- "something", then they need to write valid code. They can't change how strings are encoded or what the assignment operator looks like or how variables can be named. They also can't type !my_string! <=== %something% because R can't parse that either. R can't parse my_string <- "sql\sql" but it can parse my_string <- "sql\\sql" (backslashes must be escaped in string literals). If they are not savvy users, you might want to consider providing an alternative interface that can sanitize user input before trying to run it as code. Maybe make a shiny front end or have users pass arguments to your scripts via command line parameters.
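The distinction between the two phases can be sketched as follows. This is only an illustration, and check_code_text is a hypothetical helper name, not something from the question: when code arrives as text (for example from a front end), parse(text = ...) runs while your own code is already executing, so tryCatch can intercept the failure. The same is not possible for code typed directly at the prompt.

```r
# Hypothetical helper (an assumption, not part of the original answer):
# parse user-supplied *text* and turn a parse failure into a friendlier,
# catchable message. Nothing is evaluated, only parsed.
check_code_text <- function(code_text) {
  tryCatch(
    {
      parse(text = code_text)
      "ok"
    },
    error = function(e) {
      paste("Could not parse input:", conditionMessage(e))
    }
  )
}

# In this literal, \\ stands for ONE backslash, so the text being parsed is
#   my_string <- "sql\sql"
bad <- 'my_string <- "sql\\sql"'

# Here the parsed text is  my_string <- "sql\\sql"  which is valid R.
good <- 'my_string <- "sql\\\\sql"'

print(check_code_text(bad))   # a "Could not parse input: ..." message
print(check_code_text(good))  # "ok"
```

The custom message is only reachable because the parse happens inside a running function; a literal typed into the console never gets that far.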

If you're capturing your user input correctly, then for a string input of \, R stores a single backslash in my_string, which it prints as \\.
readline()
\
[1] "\\"
readline()
sql\sql
[1] "sql\\sql"
That means internally in R:
my_string <- "sql\\sql"
However
cat(my_string)
sql\sql
To check the input with a regular expression, you need to escape the backslash twice, once for the R string literal and once for the regex, so the pattern is written as \\\\:
stringr::str_detect(my_string, "\\\\")
Which returns TRUE if the input string is sql\sql. So the full line is:
if (stringr::str_detect("sql\\sql", "\\\\")) stop("my error message")
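A base R variant of the same check, as a minimal sketch: check_no_backslash is a hypothetical name, and fixed = TRUE is used so the pattern is a literal backslash rather than a regex. Note that this only inspects values that have already reached R (for example via readline()); it cannot intercept the parse-time error from the question.

```r
# Hypothetical validator (an assumption for illustration): stop with a
# custom message if the value that reached R contains a backslash.
check_no_backslash <- function(x) {
  # fixed = TRUE matches a literal backslash, so no regex escaping is needed
  if (grepl("\\", x, fixed = TRUE)) {
    stop("Input contains a backslash: ", x, call. = FALSE)
  }
  invisible(x)
}

user_input <- "sql\\sql"   # what readline() returns after typing sql\sql
print(nchar(user_input))   # 7: the backslash is a single character

# Capture the custom error message instead of aborting the script
msg <- tryCatch(
  check_no_backslash(user_input),
  error = function(e) conditionMessage(e)
)
cat(msg, "\n")             # Input contains a backslash: sql\sql
```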

Related

How to pass a chr variable into r"(...)"?

I've seen that since 4.0.0, R supports raw strings using the syntax r"(...)". Thus, I could do:
r"(C:\THIS\IS\MY\PATH\TO\FILE.CSV)"
#> [1] "C:\\THIS\\IS\\MY\\PATH\\TO\\FILE.CSV"
While this is great, I can't figure out how to make this work with a variable, or better yet with a function. See this comment which I believe is asking the same question.
This one can't even be evaluated:
construct_path <- function(my_path) {
r"my_path"
}
Error: malformed raw string literal at line 2
}
Error: unexpected '}' in "}"
Nor this attempt:
construct_path_2 <- function(my_path) {
paste0(r, my_path)
}
construct_path_2("(C:\THIS\IS\MY\PATH\TO\FILE.CSV)")
Error: '\T' is an unrecognized escape in character string starting ""(C:\T"
Desired output
# pseudo-code
my_path <- "C:\THIS\IS\MY\PATH\TO\FILE.CSV"
construct_path(my_path)
#> [1] "C:\\THIS\\IS\\MY\\PATH\\TO\\FILE.CSV"
EDIT
In light of @KU99's comment, I want to add the context to the problem. I'm writing an R script to be run from the command line using Windows CMD and Rscript. I want to let the user who executes my R script provide an argument saying where the script's output should be written. And since Windows CMD accepts paths in the format C:\THIS\IS\MY\PATH\TO, I want to be consistent with that format as the input to my R script. So ultimately I want to take that path input and convert it to a path format that is easy to work with inside R. I thought that the r"()" thing could be a proper solution.
I think you're getting confused about what the raw string literal syntax does. It just tells the parser not to treat backslashes inside the literal as the start of an escape sequence. For external inputs like text input or files, none of this matters.
For example, if you run this code
path <- readline("> enter path: ")
You will get this prompt:
> enter path:
and if you type in your (unescaped) path:
> enter path: C:\Windows\Dir
You get no error, and your variable is stored appropriately:
path
#> [1] "C:\\Windows\\Dir"
This is not in any special format that R uses, it is plain text. The backslashes are printed in this way to avoid ambiguity but they are "really" just single backslashes, as you can see by doing
cat(path)
#> C:\Windows\Dir
The raw string literal syntax is only useful for shortening what you need to type in source code. There would be no point in trying to get it to do anything else. Remember that it is a feature of the R parser: it is not a function, and there is no way to get R to apply it dynamically to the contents of a variable in the way you are attempting. Even if you could, it would be a long way round for a shortcut.
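A short sketch of both points, assuming R 4.0.0 or later for the raw string literal. The helper name to_r_path is hypothetical, and the args vector merely stands in for what commandArgs(trailingOnly = TRUE) would return when the script is launched with a path argument.

```r
# Two spellings of the same 14-character string (raw literals need R >= 4.0.0)
raw_form     <- r"(C:\Windows\Dir)"
escaped_form <- "C:\\Windows\\Dir"
print(identical(raw_form, escaped_form))  # TRUE
print(nchar(raw_form))                    # 14

# Hypothetical helper: convert a Windows-style path to forward slashes.
# chartr() swaps single characters, so no regex escaping is involved.
to_r_path <- function(path) chartr("\\", "/", path)

# Stand-in for commandArgs(trailingOnly = TRUE) in a script started as
#   Rscript my_script.R C:\Windows\Dir
# The argument arrives as plain text with single backslashes.
args <- escaped_form
out_dir <- to_r_path(args[1])
print(out_dir)                            # "C:/Windows/Dir"
```

Because the command-line argument is never parsed as R source, no raw string syntax is needed on the way in.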

How to parse @{TEST TAGS} into only the Tags, eliminating current formatting?

Situation.. I have two tags defined, then I try to output them to the console. What comes out seems to be similar to an array, but I'd like to remove the formatting and just have the actual words outputted.
Here's what I currently have:
[Tags]    ready    ver10
Log To Console    \n@{TEST TAGS}
And the result is
['ready', 'ver10']
So, how would I chuck the [', the ', ' and the '], thus only retaining the words ready and ver10?
Note: I was getting [u'ready', u'ver10'] - but once I got some advice to make sure I was running Python3 RobotFramework - after uninstalling robotframework via pip, and now only having robotframework installed via pip3, the u has vanished. That's great!
There are several ways to do it. For example, you could use a loop, or you could convert the list to a string before calling log to console.
Using a loop.
Since the data is a list, it's easy to iterate over the list:
FOR    ${tag}    IN    @{Test Tags}
    log to console    ${tag}
END
Converting to a string
You can use the evaluate keyword to convert the list to a string of values separated by a newline. Note: you have to use two backslashes in the call to evaluate since both robot and python use the backslash as an escape character. So, the first backslash escapes the second so that python will see \n and convert it to a newline.
${tags}=    evaluate    "\\n".join($test_tags)
log to console    \n${tags}

substitute single backslash in R

I have read some questions and answers on this topic in stack overflow but still don't know how to solve this problem:
My purpose is to transform directory strings copied from Windows Explorer into a form that R recognizes, e.g. C:\Users\Public needs to be transformed to C:/Users/Public; basically, each single backslash should be replaced with a forward slash. However, R can't store the original string "C:\Users\Public" because \U and \P are treated as escape sequences.
dirTransformer <- function(str){
str.trns <- gsub("\\", "/", str)
return(str.trns)
}
str <- "C:\Users\Public"
dirTransformer(str)
> Error: '\U' used without hex digits in character string starting ""C:\U"
What I am actually writing is a GUI, where the end effect is, the user types or pastes the directory into a entry field, pushes a button and then the program will process it automatically.
Would someone please suggest to me how to solve this problem?
When you need a backslash in a string literal in R, you need to write a double backslash. Also, when you use gsub("\\", "/", str), the first argument is parsed as a regex, and it is not valid because it contains only a single literal backslash, which must escape something. Instead, you can make gsub treat the pattern as plain text with fixed=TRUE.
However, you might want to use normalizePath, see this SO thread.
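The fixed=TRUE suggestion can be sketched like this. The name dir_transformer is illustrative, and the input literal is written with escaped backslashes only because it appears in source code; text coming from a GUI entry field would already contain single backslashes.

```r
# With fixed = TRUE the pattern "\\" is one literal backslash, not a regex
dir_transformer <- function(path) {
  gsub("\\", "/", path, fixed = TRUE)
}

win_path <- "C:\\Users\\Public"   # the text C:\Users\Public
cat(win_path, "\n")               # C:\Users\Public
print(dir_transformer(win_path))  # "C:/Users/Public"
```

On Windows, normalizePath(path, winslash = "/", mustWork = FALSE) is another option; it is left out of the runnable sketch because its result depends on the platform and the file system.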
dirTransformer <- function(str){
str.trns <- gsub("\\\\", "/", str)
return(str.trns)
}
str <- readline()
C:\Users\Public
dirTransformer(str)
I'm not sure how you intend the user to input the path into the GUI, but when using readline() and then typing C:\Users\Public unquoted, R reads that in as:
> str
[1] "C:\\Users\\Public"
We then want to replace "\\" with "/", but to escape the "\\" we need "\\\\" in the gsub.
I can't be sure how the input from the user is going to be read into R in your GUI, but R will most likely handle the backslashes in the string the same way it does in the readline() example. The string literal you're trying to create, "C:\Users\Public", wouldn't normally arise.

R error: regular expression is invalid in this locale

I am trying to gather all instances of "Walloni\xeb" within a data-frame column in order to remove "\" using the grep function. However, I'm getting the following error message as shown below:
grep("Walloni\xeb", InvoAndinfo2$Regio)
Error in grep("Walloni\xeb", InvoAndinfo2$Regio) :
regular expression is invalid in this locale
Does anyone know what to do to resolve this?
The backslash is a special character in regular expressions. If you want to look for a string that contains a backslash, you should escape it by adding another backslash in front of it.
Try:
grep("Walloni\\xeb", InvoAndinfo2$Regio)
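A separate possibility, offered only as an assumption about the asker's data: if the column holds Latin-1 text and the session uses a UTF-8 locale, then "\xeb" in the pattern is a raw byte rather than a backslash followed by letters, and the strings may need converting before matching. The sketch below uses a made-up vector in place of InvoAndinfo2$Regio.

```r
# Made-up stand-in for InvoAndinfo2$Regio (an assumption for illustration);
# \xeb is the Latin-1 byte for an e with diaeresis.
regio <- c("Walloni\xeb", "Vlaanderen")

# Convert from Latin-1 to UTF-8 so the strings are valid in a UTF-8 session
regio_utf8 <- iconv(regio, from = "latin1", to = "UTF-8")

# Match literally, writing the character as a Unicode escape
hits <- grep("Walloni\u00eb", regio_utf8, fixed = TRUE)
print(hits)  # 1
```

Whether this applies depends on how the data frame was read in; checking Encoding() on the column is a reasonable first step.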

Paste "25 \%" in R for further processing in LaTeX

I want a character variable in R that takes the value of, let's say, "a" and appends " \%", to produce a % sign later in LaTeX.
Usually I'd do something like:
a <- 5
paste(a,"\%")
but this fails.
Error: '\%' is an unrecognized escape in character string starting "\%"
Any ideas? A workaround would be to define another command giving the %-sign in LaTeX, but I'd prefer a solution within R.
As in many other languages, certain characters in R strings take on a different meaning when preceded by a backslash. One example is \n, which means newline instead of n. When you write \%, R tries to interpret it as an escape sequence and fails because no such escape exists. Escape the backslash itself so that it is just a backslash:
paste(a, "\\%")
You can read on escape sequences here.
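To see that the doubled backslash is only the printed representation, here is a small sketch comparing print() with cat(). The variable name label is illustrative.

```r
a <- 5
label <- paste(a, "\\%")  # one backslash in the value, two in the source

print(label)              # [1] "5 \\%"  (escaped representation)
cat(label, "\n")          # 5 \%         (the actual characters)
print(nchar(label))       # 4: "5", space, backslash, "%"
```

Writing the value with cat() or writeLines() is what ends up in a .tex file, so LaTeX sees a single backslash before the percent sign.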
You can also look at the latexTranslate function from the Hmisc package, which will escape special characters from strings to make them LaTeX-compatible :
R> latexTranslate("You want to give me 100$ ? I agree 100% !")
[1] "You want to give me 100\\$ ? I agree 100\\% !"
