I have a function with an argument that is a link to a file. My problem is, that even though I specify that I want to have a string here, a part of it seems to be recognized as a date. This results in a part of my string being replaced by "t-". How do I prevent this from happening?
smfunc <- function(link=as.character("T:\11-10-2017 - Folder\filename.csv"))
{
link
}
smfunc()
[1] "T:\t-10-2017 - Folder\filename.csv"
How do I prevent this from happening?
Easy: this does not happen (that would be terrible). The problem is different: you forgot to escape the backslashes:
smfunc = function (link = "T:\\11-10-2017 - Folder\\filename.csv") {
link
}
Without the escaped backslashes, '\11' is interpreted as a numeric character code (with value 11oct = 9dec, which is equivalent to the tab character '\t').
'\f', by pure chance, is a valid escape sequence equivalent to the “form feed” character. This is not the same as '\\f', i.e. a literal backslash followed by an “f”, and which is what you need.
Using as.character, incidentally, is redundant here: your value is already a character vector.
Related
Using the gsub function, I am replacing certain string of texts that contain the Unicode Character 'START OF GUARDED AREA' (U+0096).
gsub("A string with the following character –","A string without the character")
My code works, but if I close my script and reopen it, that character in my code is replaced by a normal dash.
To work around this problem, I was thinking to replace the actual character by a function. I came across function intToUtf8(), which I thought would return my character in question if I use it as follows:
intToUtf8(150)
However, when typing this in my console, it returns "\u0096"
Question 1: why is the character replaced in my script?
Question 2: why isn't my console returning the character '–'?
Many thanks in advance for your precious help!
I'm working with the following code:
Y_Columns <- c("Y.1.1")
paste('{"ImportId":"', Y_Columns, '"}', sep = "")
The paste function produces the following output:
"{\"ImportId\":\"Y.1.1\"}"
How do I get the paste function to omit the \? Such that, the output is:
"{"ImportId":"Y.1.1"}"
Thank you for your help.
Note: I did do a search on SO to see if there were any Q's that asked "what is an escape character in R". But I didn't review all the 160 answers, only the first 20.
This is one way of demonstrating what I wrote in my comment:
out <- paste('{"ImportId":"', Y_Columns, '"}', sep = "")
out
#[1] "{\"ImportId\":\"Y.1.1\"}"
?print
print(out,quote=FALSE)
#[1] {"ImportId":"Y.1.1"}
Both R and regex patterns use escape characters to allow special characters to be displayed in print output or input. (And sometimes regex patterns need to have doubled escapes.) R has a few characters that need to be "escaped" in certain situation. You illustrated one such situation: including double-quote character inside a result that will be printed with surrounding double-quotes. If you were intending to include any single quotes inside a character value that was delimited by single quotes at the time of creation, they would have needed to be escaped as well.
out2 <- '\'quoted\''
nchar(out2)
#[1] 8 ... note that neither the surround single-quotes nor the backslashes get counted
> out2
[1] "'quoted'" ... and the default output quote-char is a double-quote.
Here's a good Q&A to review:How to replace '+' using gsub() function in R
It has two answers, both useful: one shows how to double escape a special character and the other shows how to use teh fixed argument to get around that requirement.
And another potentially useful Q&A on the topic of handling Windows paths:
File path issues in R using Windows ("Hex digits in character string" error)
And some further useful reading suggestions: Look at the series of help pages that start with capital letters. (Since I can never remember which one has which nugget of essential information, I tried ?Syntax first and it has a "See Also" list of essential reading: Arithmetic, Comparison, Control, Extract, Logic, NumericConstants, Paren, Quotes, Reserved. and I then realized what I wanted to refer you to was most likely ?Quotes where all the R-specific escape sequence letters should be listed.
I have a variable named full.path.
And I am checking if the string contained in it is having certain special character or not.
From my code below, I am trying to grep some special character. As the characters are not there, still the output that I get is true.
Could someone explain and help. Thanks in advance.
full.path <- "/home/xyz"
#This returns TRUE :(
grepl("[?.,;:'-_+=()!##$%^&*|~`{}]", full.path)
By plugging this regex into https://regexr.com/ I was able to spot the issue: if you have - in a character class, you will create a range. The range from ' to _ happens to include uppercase letters, so you get spurious matches.
To avoid this behaviour, you can put - first in the character class, which is how you signal you want to actually match - and not a range:
> grepl("[-?.,;:'_+=()!##$%^&*|~`{}]", full.path)
[1] FALSE
I have read some questions and answers on this topic in stack overflow but still don't know how to solve this problem:
My purpose is to transform the file directory strings in windows explorer to the form which is recognizable in R, e.g. C:\Users\Public needs to be transformed to C:/Users/Public, basically the single back slash should be substituted with the forward slash. However the R couldn't store the original string "C:\Users\Public" because the \U and \P are deemed to be escape character.
dirTransformer <- function(str){
str.trns <- gsub("\\", "/", str)
return(str.trns)
}
str <- "C:\Users\Public"
dirTransformer(str)
> Error: '\U' used without hex digits in character string starting ""C:\U"
What I am actually writing is a GUI, where the end effect is, the user types or pastes the directory into a entry field, pushes a button and then the program will process it automatically.
Would someone please suggest to me how to solve this problem?
When you need to use a backslash in the string in R, you need to put double backslash. Also, when you use gsub("\\", "/", str), the first argument is parsed as a regex, and it is not valid as it only contains a single literal backslash that must escape something. In fact, you need to make gsub treat it as a plain text with fixed=TRUE.
However, you might want to use normalizePath, see this SO thread.
dirTransformer <- function(str){
str.trns <- gsub("\\\\", "/", str)
return(str.trns)
}
str <- readline()
C:\Users\Public
dirTransformer(str)
I'm not sure how you intend the user to input the path into the GUI, but when using readline() and then typing C:\Users\Public unquoted, R reads that in as:
> str
[1] "C:\\Users\\Public"
We then want to replace "\\" with "/", but to escape the "\\" we need "\\\\" in the gsub.
I can't be sure how the input from the user is going to be read into R in your GUI, but R will most likely escape the \s in the string like it does when using the readline example. the string you're trying to create "C:\Users\Public" wouldn't normally happen.
I'm trying to create a RegEx Validator that checks the file extension in the FileUpload input against a list of allowed extensions (which are user specified). The following is as far as I have got, but I'm struggling with the syntax of the backward slash (\) that appears in the file path. Obviously the below is incorrect because it just escapes the (]) which causes an error. I would be really grateful for any help here. There seems to be a lot of examples out there, but none seem to work when I try them.
[a-zA-Z_-s0-9:\]+(.pdf|.PDF)$
To include a backslash in a character class, you need to use a specific escape sequence (\b):
[a-zA-Z_\s0-9:\b]+(\.pdf|\.PDF)$
Note that this might be a bit confusing, because outside of character classes, \b represents a word boundary. I also assumed, that -s was a typo and should have represented a white space. (otherwise it shouldn't compile, I think)
EDIT: You also need to escape the dots. Otherwise they will be meta character for any character but line breaks.
another EDIT: If you actually DO want to allow hyphens in filenames, you need to put the hyphen at the end of the character class. Like this:
[a-zA-Z_\s0-9:\b-]+(\.pdf|\.PDF)$
You probably want to use something like
[a-zA-Z_0-9\s:\\-]+\.[pP][dD][fF]$
which is same as
[\w\s:\\-]+\.[pP][dD][fF]$
because \w = [a-zA-Z0-9_]
Be sure character - to put as very first or very last item in the [...] list, otherwise it has special meaning for range or characters, such as a-z.
Also \ character has to be escaped by another slash, even inside of [...].