How to automatically handle strings/paths with backslashes? [duplicate]

This question already has answers here:
How to escape backslashes in R string
(3 answers)
Efficiently convert backslash to forward slash in R
(11 answers)
Closed 3 years ago.
I often want to read in csv files and I get the path by using shift + right click and then clicking "copy path".
I paste this path into my code. See an example below:
read_csv("C:\Users\me\data\file.csv")
Obviously this doesn't work because of the backslashes. My current solution is to escape each one, so that my code looks like this:
read_csv("C:\\Users\\me\\data\\file.csv")
It works, but it's annoying and occasionally I'll get errors because I missed one of the backslashes.
I wanted to create a function that automatically adds the extra slashes
fix_path <- function(string) str_replace(string, "\\\\", "\\\\\\\\")
but R won't recognize the string in the first place until the backslashes are taken care of.
Is there another way to deal with this? Python has the option of adding an "r" before strings to note that the backslashes should be treated just as regular backslashes, is there anything similar in R? To be clear, I know that I can escape the backslashes, but I am looking for a way to do it automatically.

You can use this hack. Suppose you have copied your path as described; then you could use
scan("clipboard", "character", quiet = TRUE)
scan reads the text copied to the clipboard and takes care of the backslashes. Then copy what scan returns and paste it into your code.
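On R 4.0.0 or later, a raw string literal is another option that avoids the clipboard round trip. The following is a minimal sketch reusing the example path from the question; the variable names are illustrative only.

```r
# Raw string literal (R >= 4.0.0): backslashes inside r"(...)" are taken literally.
path_raw <- r"(C:\Users\me\data\file.csv)"

# The same path written the traditional way, with each backslash escaped.
path_escaped <- "C:\\Users\\me\\data\\file.csv"

# Both literals produce the same string.
identical(path_raw, path_escaped)

# Optionally convert to forward slashes, which R also accepts on Windows.
# fixed = TRUE treats "\\" as a plain backslash rather than a regex.
path_fwd <- gsub("\\", "/", path_raw, fixed = TRUE)
path_fwd
```

The pasted path goes between `r"(` and `)"` unchanged, so there is no backslash to miss.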

Related

In R, how to change working directory more easily? [duplicate]

This question already has answers here:
Escaping backslash (\) in string or paths in R
(4 answers)
Closed 1 year ago.
In R, how can I change the working directory more easily?
Currently, if we use 'setwd', we have to add many '\' characters, which is tedious.
Is there an easier way to do this? (Just like Python lets you add an 'r' prefix.)
setwd('C:\Users\Administrator\Desktop\myfolder') # doesn't work
setwd('C:\\Users\\Administrator\\Desktop\\myfolder') # works, but you have to add many '\'
You could use a raw string (prefix r, available since R 4.0.0) and wrap the text in parentheses:
> r"(C:\Users\Administrator\Desktop\myfolder)"
[1] "C:\\Users\\Administrator\\Desktop\\myfolder"
>
And now:
setwd(r"(C:\Users\Administrator\Desktop\myfolder)")
Or read the path from the clipboard with readClipboard() (Windows only), which returns the string with its backslashes intact:
setwd(readClipboard())

Remove multiple instances with a regex expression, but not the text in between instances [duplicate]

This question already has answers here:
Regular expression to stop at first match
(9 answers)
Closed 1 year ago.
In long passages using bookdown, I have inserted numerous images. Having combined the passages into a single character string (in a data frame) I want to remove the markdown text associated with inserting images, but not any text in between those inserted images. Here is a toy example.
text.string <- "writing ![Stairway scene](/media/ClothesFairLady.jpg) writing to keep ![Second scene](/media/attire.jpg) more writing"
str_remove_all(string = text.string, pattern = "!\\[.+\\)")
[1] "writing more writing"
The regex doesn't stop at the first closing parenthesis; it continues until the last one and deletes the "writing to keep" in between.
I tried to apply String manipulation in R: remove specific pattern in multiple places without removing text in between instances of the pattern, which uses gsubfn and gsub but was unable to get the solutions to work.
Please point me in the right direction to solve this problem of a regex removal of designated strings, but not the characters in between the strings. I would prefer a stringr solution, but whatever works. Thank you
You have to use the following regex
"!\\[[^\\)]+\\)"
alternatively you can also use this:
"!\\[.*?\\)"
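The first pattern uses a negated character class and the second a lazy quantifier. Both stop at the first closing parenthesis instead of matching greedily up to the last one, which is the key to your question.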
I think you could use the following solution too:
gsub("!\\[[^][]*\\]\\([^()]*\\)", "", text.string)
[1] "writing writing to keep more writing"
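To make the difference concrete, here is a small base R sketch comparing the greedy pattern from the question with the lazy and negated-class variants. The variable names `greedy`, `lazy`, and `tidy` are illustrative, and the trailing `\\s*` in the last pattern is an added assumption to drop the space each removed image leaves behind.

```r
text.string <- "writing ![Stairway scene](/media/ClothesFairLady.jpg) writing to keep ![Second scene](/media/attire.jpg) more writing"

# Greedy: .+ runs to the LAST ")" and swallows the text in between.
greedy <- gsub("!\\[.+\\)", "", text.string, perl = TRUE)

# Lazy: .*? stops at the first ")" after each "![".
lazy <- gsub("!\\[.*?\\)", "", text.string, perl = TRUE)

# Negated classes, plus \\s* to also remove the whitespace after each image.
tidy <- gsub("!\\[[^]]*\\]\\([^)]*\\)\\s*", "", text.string, perl = TRUE)

greedy
lazy
tidy
```

The same patterns should carry over to `stringr::str_remove_all()`, since only the regex changes, not the function.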

Replace latex with r strings using gsub [duplicate]

This question already has an answer here:
"'\w' is an unrecognized escape" in grep
(1 answer)
Closed 1 year ago.
I would like to find and replace tabular instances with tabularx. I tried gsub, but it seems to lead me into a world of escaping pain. Following other questions and answers, I found fixed=TRUE, which is the best I have so far. The code snippet below almost works, but \B is unrecognized. If I escape it twice I get \BEGIN as output!
texText <- '\begin{tabular}{rl}\begin{tabular}{rll}'
texText <- gsub("\begin{tabular}{rl}", "\BEGIN{tabular}{rll}", texText, fixed=TRUE)
I'm using BEGIN as my test to see what is happening. This is before I get to tackling the question of what goes on in the brackets {rl} {ll} {rrl} etc. Ideally I'm looking for a regex that would output:
\begin{tabularx}{rX}\begin{tabularx}{rlX}
That is the final column is replaced by X.
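Try using proper escaping:
texText <- "\\begin{tabular}{rl}\\begin{tabular}{rll}"
output <- gsub("\\\\begin\\{tabular\\}", "\\\\begin{tabularx}", texText)
output
[1] "\\begin{tabularx}{rl}\\begin{tabularx}{rll}"
A literal backslash must be written as two backslashes in an R string, and as four in a regex pattern or replacement, because the regex engine unescapes it once more. Metacharacters such as { and } need two backslashes in the pattern. Note that a single "\b" in an R string is a backspace character, not a backslash followed by b.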
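The question also asks for the final column specifier to become X. One possible sketch, assuming the column specification consists only of lowercase letters (as in the {rl} and {rll} examples), captures everything except the last letter and appends X:

```r
texText <- "\\begin{tabular}{rl}\\begin{tabular}{rll}"

# ([a-z]*) captures all column letters except the last one;
# [a-z] matches the last column, which is replaced by X.
output <- gsub(
  "\\\\begin\\{tabular\\}\\{([a-z]*)[a-z]\\}",
  "\\\\begin{tabularx}{\\1X}",
  texText
)

# cat() shows the actual characters, without R's print-time escaping.
cat(output, "\n")
```

Column specifications that contain other characters, such as | or p{3cm}, would need a broader pattern than this one.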

how to get the last part of strings with different lengths ended by ".nc" [duplicate]

This question already has answers here:
Get filename without extension in R
(9 answers)
Find file name from full file path
(4 answers)
Closed 3 years ago.
I have several download links (i.e., strings), and each string has different length.
For example let's say these fake links are my strings:
My_Link1 <- "http://esgf-data2.diasjp.net/pr/gn/v20190711/pr_day_MRI-AGCM3-2-H_highresSST_gn_20100101-20141231.nc"
My_Link2 <- "http://esgf-data2.diasjp.net/gn/v20190711/pr_-present_r1i1p1f1_gn_19500101-19591231.nc"
My goals:
A) I want to have only the last part of each string ended by .nc , and get these results:
pr_day_MRI-AGCM3-2-H_highresSST_gn_20100101-20141231.nc
pr_-present_r1i1p1f1_gn_19500101-19591231.nc
B) I want to have only the last part of each string before .nc , and get these results:
pr_day_MRI-AGCM3-2-H_highresSST_gn_20100101-20141231
pr_-present_r1i1p1f1_gn_19500101-19591231
I tried to find a way on the net, but I failed. It seems this can be done in Python as documented here:
How to get everything after last slash in a URL?
Does anyone know the same method in R?
Thanks so much for your time.
A shortcut to get last part of the string would be to use basename
basename(My_Link1)
#[1] "pr_day_MRI-AGCM3-2-H_highresSST_gn_20100101-20141231.nc"
For the second question, to remove the trailing ".nc" we could use sub, anchoring the pattern at the end of the string:
sub("\\.nc$", "", basename(My_Link1))
#[1] "pr_day_MRI-AGCM3-2-H_highresSST_gn_20100101-20141231"
With some regex, here is another way to get the last part (the file name):
sub(".*/", "", My_Link1)
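As a vectorised alternative for goal B, `tools::file_path_sans_ext()` (from the tools package that ships with R) strips the extension without a hand-written regex. A short sketch using the two links from the question; `links`, `with_ext`, and `without_ext` are illustrative names:

```r
My_Link1 <- "http://esgf-data2.diasjp.net/pr/gn/v20190711/pr_day_MRI-AGCM3-2-H_highresSST_gn_20100101-20141231.nc"
My_Link2 <- "http://esgf-data2.diasjp.net/gn/v20190711/pr_-present_r1i1p1f1_gn_19500101-19591231.nc"
links <- c(My_Link1, My_Link2)

# Goal A: everything after the last slash.
with_ext <- basename(links)

# Goal B: the same names with the file extension removed.
without_ext <- tools::file_path_sans_ext(with_ext)

with_ext
without_ext
```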

How to put \' in my string using paste0 function [duplicate]

This question already has answers here:
How to escape backslashes in R string
(3 answers)
Closed 5 years ago.
I have an array:
t <- c("IMCR01","IMFA02","IMFA03")
I want to make it look like this:
"\'IMCR01\'","\'IMFA02\'","\'IMFA03\'"
I tried different ways like:
paste0("\'",t,"\'")
paste0("\\'",t,"\\'")
paste0("\\\\'",t,"\\\\'")
But none of them is correct. Any other functions are OK as well.
Actually your second attempt is correct:
paste0("\\'",t,"\\'")
If you want to tell paste to use a literal backslash, you need to escape it once (but not twice, as you would need within a regex pattern). This would output the following to the console in R:
[1] "\\'IMCR01\\'" "\\'IMFA02\\'" "\\'IMFA03\\'"
The trick here is that R escapes the backslash in its console output. If you were instead to write the result to a text file, you would see only a single backslash, as you wanted:
write(paste0("\\'", t, "\\'"), file = "/path/to/your/file.txt")
But why does R escape the backslash when printing to its own console? print() shows a string the way it would have to be typed as a literal, so that, for example, a backslash followed by n can be told apart from a real newline. Hence the need for escaping is still there.
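The difference between the printed representation and the actual characters can be checked without writing a file. A minimal sketch; the names `ids` and `quoted` are illustrative (used instead of `t`, which masks base R's transpose function):

```r
ids <- c("IMCR01", "IMFA02", "IMFA03")
quoted <- paste0("\\'", ids, "\\'")

# print() shows the escaped form, with each backslash doubled.
print(quoted)

# writeLines() emits the raw characters: one backslash per quote.
writeLines(quoted)

# Each element is 10 characters long: 6 letters/digits plus 2 x (backslash + quote).
nchar(quoted)
```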
