Escaping backslash (\) in string or paths in R

Windows copies path with backslash \, which R does not accept. So, I wanted to write a function which would convert \ to /. For example:
chartr0 <- function(foo) chartr('\','\\/',foo)
Then use chartr0 as...
source(chartr0('E:\RStuff\test.r'))
But chartr0 is not working. I guess, I am unable to escape /. I guess escaping / may be important in many other occasions.
Also, is it possible to avoid calling chartr0 every time and instead convert all paths automatically, for example by creating an environment in R that calls chartr0, or through some temporary setting such as options?

From R 4.0.0 you can use r"(...)" to write a path as raw string constant, which avoids the need for escaping:
r"(E:\RStuff\test.r)"
# [1] "E:\\RStuff\\test.r"
There is a new syntax for specifying raw character constants similar to the one used in C++: r"(...)" with ... any character sequence not containing the sequence )". This makes it easier to write strings that contain backslashes or both single and double quotes. For more details see ?Quotes.
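A minimal sketch of how the raw string could feed the conversion the question asks for, assuming R 4.0.0 or later; the variable names win_path and fwd_path are made up for illustration:

```r
# Raw string syntax needs R >= 4.0.0.
win_path <- r"(E:\RStuff\test.r)"

# The string contains single backslashes; print() displays them doubled,
# cat() writes the characters as they are.
print(win_path)
cat(win_path, "\n")

# Swap every backslash for a forward slash.
fwd_path <- chartr("\\", "/", win_path)
print(fwd_path)
```

If the path itself could contain the sequence )", ?Quotes describes alternative delimiters such as r"[...]".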

Your fundamental problem is that R will signal an error condition as soon as it sees a single back-slash before any character other than a few lower-case letters, backslashes themselves, quotes or some conventions for entering octal, hex or Unicode sequences. That is because the interpreter sees the back-slash as a message to "escape" the usual translation of characters and do something else. If you want a single back-slash in your character element you need to type 2 backslashes. That will create one backslash:
nchar("\\")
#[1] 1
The "Character vectors" section of An Introduction to R says:
"Character strings are entered using either matching double (") or single (') quotes, but are printed using double quotes (or sometimes without quotes). They use C-style escape sequences, using \ as the escape character, so \\ is entered and printed as \\, and inside double quotes " is entered as \". Other useful escape sequences are \n, newline, \t, tab and \b, backspace—see ?Quotes for a full list."
?Quotes
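The two claims in this answer can be checked from code. This is only a sketch: bad_source is a hypothetical name, and the exact wording of the parse error may differ between R versions.

```r
# Each escape sequence below becomes exactly one character.
escapes <- c(newline = "\n", tab = "\t", backslash = "\\", dquote = "\"")
escape_lengths <- nchar(escapes)
print(escape_lengths)

# An unrecognized escape such as \R is rejected while the code is parsed.
# bad_source holds the source text "E:\RStuff" (quotes included, one backslash).
bad_source <- '"E:\\RStuff"'
parse_msg <- tryCatch(
  {
    parse(text = bad_source)
    "parsed without error"
  },
  error = function(e) conditionMessage(e)
)
print(parse_msg)
```

Because the failure happens in the parser, no function wrapped around the literal (chartr, gsub, or anything else) gets a chance to repair it.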

chartr0 <- function(foo) chartr('\\','/',foo)
chartr0('E:\\RStuff\\test.r')
You cannot write E:\Rxxxx inside a quoted string, because R reads \R as an escape sequence, and an unrecognized one, so parsing fails.

The problem is that every single forward slash and backslash in your code is escaped incorrectly, resulting in either an invalid string or the wrong string being used. You need to read up on which characters need to be escaped and how. Take a look at the list of escape sequences in the link below. Anything not listed there (such as the forward slash) is treated literally and does not require any escaping.
http://cran.r-project.org/doc/manuals/R-lang.html#Literal-constants
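The point that the forward slash needs no escaping can be checked directly. A small sketch; chartr0 follows the corrected definition from this thread, and the sample paths are made up:

```r
# A forward slash is an ordinary character: no escaping needed.
nchar("/")      # 1

# A backslash must be written as "\\" in source, and is still one character.
nchar("\\")     # 1

# So the translation table is: old = one backslash, new = one forward slash.
chartr0 <- function(path) chartr("\\", "/", path)

# chartr() is vectorised over its last argument, so several paths work at once.
paths <- c("E:\\RStuff\\test.r", "C:\\Users\\Public")
converted <- chartr0(paths)
print(converted)
```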

Related

Warning on regex string in Python

So, I am doing a small function to strip all the weird chars from a string, eg. #$& will be replaced just for a " "
The chars I am trying to remove are the following, defined into a string:
xChars = "#$%()'^*\;:/|+_.–°ªº"
However I keep getting the warning:
Anomalous backslash in string: '\;'. String constant might be missing an r prefix
However, when I used the r prefix, e.g. r"\", Python rules out some of the special chars I want to replace. It doesn't produce an error; it just seems to treat those chars as fine and skips them.
Any ideas on how to fix this ?
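Normally a backslash escapes the character that follows it, so the checker can't tell whether you meant a literal backslash or an escape sequence. Try a double backslash in a normal (non-raw) string to escape the backslash itself, like: xChars = "#$%()'^*\\;:/|+_.–°ªº". In a raw string a single backslash is already literal, so doubling it there would put two backslashes into the string.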

substitute single backslash in R

I have read some questions and answers on this topic in stack overflow but still don't know how to solve this problem:
My purpose is to transform directory strings copied from Windows Explorer into a form that R recognizes, e.g. C:\Users\Public needs to become C:/Users/Public; basically, each single backslash should be replaced with a forward slash. However, R can't store the original string "C:\Users\Public" because \U and \P are treated as escape sequences.
dirTransformer <- function(str){
  str.trns <- gsub("\\", "/", str)
  return(str.trns)
}
str <- "C:\Users\Public"
dirTransformer(str)
> Error: '\U' used without hex digits in character string starting ""C:\U"
What I am actually writing is a GUI, where the end effect is, the user types or pastes the directory into a entry field, pushes a button and then the program will process it automatically.
Would someone please suggest to me how to solve this problem?
When you need to use a backslash in the string in R, you need to put double backslash. Also, when you use gsub("\\", "/", str), the first argument is parsed as a regex, and it is not valid as it only contains a single literal backslash that must escape something. In fact, you need to make gsub treat it as a plain text with fixed=TRUE.
However, you might want to use normalizePath, see this SO thread.
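To make the difference concrete, here is a hedged sketch comparing the regex and fixed-text forms; win_path and the result names are illustrative only:

```r
# Source "C:\\Users\\Public" is the 15-character string C:\Users\Public.
win_path <- "C:\\Users\\Public"

# Plain-text match: the pattern "\\" is one backslash, taken literally.
fixed_result <- gsub("\\", "/", win_path, fixed = TRUE)

# Regex match: the pattern "\\\\" is two backslashes, which the regex
# engine reads as one escaped (literal) backslash.
regex_result <- gsub("\\\\", "/", win_path)

# The one-backslash pattern is not a valid regex, so this call fails.
regex_failed <- tryCatch(
  {
    suppressWarnings(gsub("\\", "/", win_path))
    FALSE
  },
  error = function(e) TRUE
)

print(c(fixed_result, regex_result))
print(regex_failed)
```

On Windows, normalizePath(win_path, winslash = "/", mustWork = FALSE) is another option; its output depends on the platform and on whether the path exists, so it is left out of the sketch.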
dirTransformer <- function(str){
  # "\\\\" is the regex for one literal backslash
  str.trns <- gsub("\\\\", "/", str)
  return(str.trns)
}
str <- readline()
C:\Users\Public
dirTransformer(str)
I'm not sure how you intend the user to input the path into the GUI, but when using readline() and then typing C:\Users\Public unquoted, R reads that in as:
> str
[1] "C:\\Users\\Public"
We then want to replace "\\" with "/", but to escape the "\\" we need "\\\\" in the gsub.
I can't be sure how the input from the user is going to be read into R in your GUI, but R will most likely store the backslashes as literal characters, as it does in the readline example, and print each one as \\. The string literal you're trying to write, "C:\Users\Public", wouldn't normally arise from user input.
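The same behaviour can be reproduced without typing at a prompt. In this sketch a text connection stands in for the GUI entry field (an assumption, since the actual toolkit is not stated), and the raw string needs R 4.0.0 or later:

```r
# Stand-in for text the user pasted into an entry field.
field <- textConnection(r"(C:\Users\Public)")
typed <- readLines(field)
close(field)

# Text read at run time is not processed for escapes: each backslash is
# stored as a single character.
nchar(typed)    # 15

# print() shows the backslashes doubled, but that is display only.
print(typed)

# The regex replacement from the answer works on the value as read.
converted <- gsub("\\\\", "/", typed)
print(converted)
```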

Why do URL parameters use %-encoding instead of a simple escape character

For example, in Unix, a backslash (\) is a common escape character. So to escape a full stop (.) in a regular expression, one does this:
\.
But with % encoding URL parameters, we have an escape character, %, and a control code, so an ampersand (&) doesn't become:
%&
Instead, it becomes:
%26
Any reason why? Seems to just make things more complicated, on the face of it, when we could just have one escape character and a mechanism to escape itself where necessary:
%%
Then it'd be:
simpler to remember; we just need to know which characters to escape, not which to escape and what to escape them to
encoding-agnostic, as we wouldn't be sending an ASCII or Unicode representation explicitly, we'd just be sending them in the encoding the rest of the URL is going in
easy to write an encoder: s/[!\*'();:#&=+$,/?#\[\] "%-\.<>\\^_`{|}~]/%&/g (untested!)
better because we could switch to using \ as an escape character, and life would be simpler and it'd be summer all year long
I might be getting carried away now. Someone shoot me down? :)
EDIT: replaced two uses of "delimiter" with "escape character".
Percent encoding happens not only to escape delimiters, but also so that you can transport bytes that are not allowed inside URIs (such as control characters or non-ASCII characters).
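Where %26 comes from can be shown in a few lines. This is a sketch in R using utils::URLencode and utils::URLdecode; the sample strings are made up:

```r
# "&" is the byte 0x26 in ASCII, which is where "%26" comes from.
amp_hex <- toupper(as.character(charToRaw("&")))
amp_encoded <- paste0("%", amp_hex)
print(amp_encoded)

# With reserved = TRUE, URLencode() also encodes reserved characters
# such as & and =.
encoded <- utils::URLencode("a&b=c", reserved = TRUE)
decoded <- utils::URLdecode(encoded)
print(c(encoded, decoded))

# A space is not allowed in a URI at all, so it is encoded by byte value too.
space_encoded <- utils::URLencode("a b", reserved = TRUE)
print(space_encoded)
```

Because the encoded form is built from byte values, the result contains only characters that are permitted in a URI, which a self-escaping scheme like %& would not guarantee.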
I guess it's because the URL specification, and specifically the HTTP part of it, only allows certain characters, so to escape the others one must replace them with characters that are allowed. Also, some allowed characters, such as & and ?, have special meanings, so replacing them with a control code seems the only way to solve it.
If you find it hard to recognize them, bookmark this page:
http://www.w3schools.com/tags/ref_urlencode.asp

Are double "" and single '' quotes (always) interchangeable in R?

This is perhaps rather a minor question...
but just a moment ago I was looking through some code I had written and noticed that I tend to just use ="something" and ='something_else' completely interchangeably, often in the same function.
So my question is: Is there R code in which using one or other (single or double quotes) has different behaviour? Or are they totally synonymous?
According to http://stat.ethz.ch/R-manual/R-patched/library/base/html/Quotes.html, "[s]ingle and double quotes delimit character constants. They can be used interchangeably but double quotes are preferred (and character constants are printed using double quotes), so single quotes are normally only used to delimit character constants containing double quotes."
Just out of curiosity, there is a further explanation on the R-help mailing list of why double quotes are preferred in R:
To avoid confusion for those who are accustomed to programming in the C family of languages (C, C++, Java), where there is a difference in the meaning of single quotes and double quotes. A C programmer reads 'a' as a single character and "a" as a character string consisting of the letter 'a' followed by a null character to terminate the string.
In R there is no character data type, there are only character strings. For consistency with other languages it helps if character strings are delimited by double quotes. The single quote version in R is for convenience.
(Since) On most keyboards you don't need to use the shift key to type a single quote but you do need the shift for a double quote.
> print(""hi"")
Error: unexpected symbol in "print(""hi"
> print("'hi'")
[1] "'hi'"
> print("hi")
[1] "hi"
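A quick check of the interchangeability claim, as a sketch (variable names are illustrative). The only practical difference is which quote character has to be escaped inside the string:

```r
# The delimiter is not part of the value, so these are the same string.
same_plain <- identical("something", 'something')

# Single quotes avoid escaping embedded double quotes, and vice versa.
with_single <- 'He said "hi"'
with_double <- "He said \"hi\""
same_quoted <- identical(with_single, with_double)

apostrophe_a <- "it's"
apostrophe_b <- 'it\'s'
same_apostrophe <- identical(apostrophe_a, apostrophe_b)

print(c(same_plain, same_quoted, same_apostrophe))
```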

Where can I find documentation on escape characters like "\"

I'd like to gain a better understanding of escape character sequences in R. I've tried searching for things like ?'\' (but the backslash escapes the closing quote) and ?'\\'.
I'd like to avoid this kind of behaviour with cat(). For example:
cat("\")
+
Versus:
cat("\\")
\
The help page you are looking for is ?Quotes (with the capital Q). String literal syntax is also described (less clearly IMHO) at http://cran.r-project.org/doc/manuals/R-lang.html#Literal-constants.
The backslash escape works very nearly the same as it does in C and all the other languages that borrowed backslash escapes from C -- \n inserts a newline, \\ inserts a single backslash, \" in a double quoted string prevents the " from ending the string, etc.
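That also explains the + prompt after cat("\"): the \" is read as an escaped double quote, so the string never ends and the console waits for more input. A small sketch of what cat() and print() emit for a one-backslash string, using capture.output() only to inspect the text:

```r
one_backslash <- "\\"   # two characters in source, one in the string

# cat() writes the characters themselves.
cat_text <- capture.output(cat(one_backslash))

# print() writes the string in its escaped, re-readable form.
print_text <- capture.output(print(one_backslash))

nchar(one_backslash)   # 1
nchar(cat_text)        # 1
print_text
```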
