How to replace backslash \ in R with gsub?

I would like to amend some .tex files from within R.
I read the file with readLines() but I cannot replace the following text.
tex <- "$\\times$"
new_tex <- gsub("$\\times$", "\\ $\\times$", tex)
new_tex
It seems that it cannot find the $\\times$
But even if it does, is it possible to write \ without escaping them?
Thank you in advance!

gsub uses regular expressions by default unless you set fixed=TRUE.
In a regular expression, $ matches the end of the string, which is why the pattern never matches.
This should work instead:
new_tex <- gsub("$\\times$", "\\ $\\times$", tex,fixed=TRUE)
About the backslash: no, in an ordinary string literal you can't write a backslash without escaping it. Otherwise it would be impossible for the R interpreter to distinguish between a tab \t and a "backslash + t".
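One caveat worth adding, assuming R 4.0.0 or later: raw string literals of the form r"(...)" let you type backslashes as they are. The sketch below compares the escaped and the raw spelling, and shows that the doubled backslash in the source code is a single character in the string itself. The variable name raw_tex is only illustrative.

```r
# The escaped literal: two backslashes in the source, one in the string.
tex <- "$\\times$"
nchar(tex)       # counts the characters actually stored in the string
cat(tex, "\n")   # cat() shows the string without R's escape notation

# Raw string literal (assumes R >= 4.0.0): no escaping inside r"( ... )".
raw_tex <- r"($\times$)"
identical(tex, raw_tex)
```

print() and the console always show the escaped form, so seeing "\\" in the output does not mean the string holds two backslashes.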

Without fixed = TRUE:
gsub("\\$\\\\times\\$", "\\\\ $\\\\times\\$", tex)
[1] "\\ $\\times$"
Unfortunately you need a lot of backslashes because you need to escape pretty much everything.
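As a quick check, the regex call and the fixed = TRUE call from the answers above can be run side by side. This is only a sketch; with_regex and with_fixed are names chosen for illustration.

```r
tex <- "$\\times$"

# Regex version: every metacharacter and every backslash is escaped.
with_regex <- gsub("\\$\\\\times\\$", "\\\\ $\\\\times\\$", tex)

# Literal version: fixed = TRUE switches off regex interpretation,
# so pattern and replacement are taken character for character.
with_fixed <- gsub("$\\times$", "\\ $\\times$", tex, fixed = TRUE)

identical(with_regex, with_fixed)
cat(with_fixed, "\n")
```

Both calls produce the same string, so fixed = TRUE is usually the more readable choice when no pattern matching is needed.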

Related

How to extract specific characters in R String

I have a file name string:
directoryLocation<-"\Users\me\Dropbox\Work\"
How can I find all the "\" and replace them with "/"? In other languages, you can loop through the string and then replace character by character, but I don't think you can do that in R.
I tried
substr(directoryLocation,1,1)
but it is highly optimized to this case...how can it be more general?
Thanks
gsub is the general tool for this, but as others have noted you need a confusing four backslashes to account for the escapes: you need to escape for both R text and the regexp engine simultaneously.
An alternative, if using Windows, is to use normalizePath and setting the winslash parameter:
normalizePath(directoryLocation,winslash="/",mustWork=FALSE)
[1] "C:/Users/me/Dropbox/Work/"
Though this may perform additional work on expanding relative paths to absolute ones (seen here by prepending with C:).
In theory this would do what you want
gsub("\\\", "/", directoryLocation)
however...
R> directoryLocation<-"\\Users\\me\\Dropbox\\Work\\"
R> directoryLocation
[1] "\\Users\\me\\Dropbox\\Work\\"
R> gsub("\\\\", "/", directoryLocation)
[1] "/Users/me/Dropbox/Work/"
At least on Windows, you need to escape all of the backslashes, but gsub is what you want.
gsub("\\\\","/","\\Users\\me\\Dropbox\\Work\\")
[1] "/Users/me/Dropbox/Work/"
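For comparison, here is a small sketch of three base R ways to do the same replacement. The four-backslash regex is the one from the answers above; fixed = TRUE and chartr() avoid the regex layer, so a single escaped backslash is enough. The variable names are illustrative.

```r
directoryLocation <- "\\Users\\me\\Dropbox\\Work\\"

# Regex: "\\\\" in R source is the regex \\, which matches one backslash.
via_regex <- gsub("\\\\", "/", directoryLocation)

# Literal matching: only the R-level escape is needed.
via_fixed <- gsub("\\", "/", directoryLocation, fixed = TRUE)

# Character translation: maps each backslash to a forward slash.
via_chartr <- chartr("\\", "/", directoryLocation)

c(via_regex, via_fixed, via_chartr)
```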

Changing "/" into "\" in R

I need to change "/" into "\" in my R code. I have something like this:
tmp <- paste(getwd(),"tmp.xls",sep="/")
so my tmp is c:/Study/tmp.xls
and I want it to be: c:\Study\tmp.xls
Is it possible to change it in R?
Update as per comments.
If this is simply to save the file, then as @sgibb suggested, you are better off using file.path():
file.path(getwd(), "tmp.xls")
Update 2: You want double back-slashes.
tmp is a string and if you want to have an actual backslash you need to escape it -- with a backslash.
However, when R interprets the double backslashes (for example, when looking for a file with the path indicated by the string), it will treat each seemingly double backslash as one.
Take a look at what happens when you output the string with cat()
cat("c:\\Study\\tmp.xls")
c:\Study\tmp.xls
The second backslash has "disappeared".
Original Answer:
In R, \ is an escape character, so if you want to print it literally, you need to escape the escape character: \\. This is what you want to put in your paste statement.
You can also use .Platform$file.sep as your sep argument, which will make your code much more portable.
tmp <- paste(getwd(),"tmp.xls",sep=.Platform$file.sep)
If you already have a string you would like to replace, you can use
gsub("/", "\\", tmp, fixed=TRUE)
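A short sketch of the replacement above, using the path from the question as a fixed string (an assumption, since getwd() differs per machine). It shows that print() displays the escaped form while cat() displays the characters actually stored.

```r
tmp <- "c:/Study/tmp.xls"   # stand-in for paste(getwd(), "tmp.xls", sep = "/")

# fixed = TRUE: the replacement "\\" is one literal backslash.
win <- gsub("/", "\\", tmp, fixed = TRUE)

print(win)        # escaped notation, backslashes appear doubled
cat(win, "\n")    # the real content, single backslashes

# Same length before and after: each "/" became exactly one "\".
nchar(win) == nchar(tmp)
```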

Paste "25 \%" in R for further processing in LaTeX

I want a character variable in R taking the value from, let's say, "a", and adding " \%", to create a %-sign later in LaTeX.
Usually I'd do something like:
a <- 5
paste(a,"\%")
but this fails.
Error: '\%' is an unrecognized escape in character string starting "\%"
Any ideas? A workaround would be to define another command giving the %-sign in LaTeX, but I'd prefer a solution within R.
As in many other languages, certain characters in strings have a different meaning when they're escaped. One example is \n, which means newline instead of n. When you write \%, R tries to interpret % as a special character and fails. Escape the backslash instead, so that it is just a backslash:
paste(a, "\\%")
You can read more about escape sequences in the R help page ?Quotes.
You can also look at the latexTranslate function from the Hmisc package, which will escape special characters from strings to make them LaTeX-compatible :
R> latexTranslate("You want to give me 100$ ? I agree 100% !")
[1] "You want to give me 100\\$ ? I agree 100\\% !"
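If installing Hmisc is not an option, a base R sketch along the following lines may be enough for the percent sign alone. The helper escape_percent is hypothetical, not part of any package, and it handles only "%".

```r
a <- 5

# "\\%" in R source is the two characters \% in the string.
pct <- paste(a, "\\%")
cat(pct, "\n")   # what LaTeX will receive

# Hypothetical helper: escape every "%" in existing text for LaTeX.
escape_percent <- function(x) gsub("%", "\\%", x, fixed = TRUE)
escaped <- escape_percent("I agree 100% !")
cat(escaped, "\n")
```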

File path issues in R using Windows ("Hex digits in character string" error)

I run R on Windows, and have a csv file on the Desktop. I load it as follows,
x<-read.csv("C:\Users\surfcat\Desktop\2006_dissimilarity.csv",header=TRUE)
but the R gives the following error message
Error: '\U' used without hex digits in character string starting "C:\U"
So what's the correct way to load this file? I am using Vista.
Replace all the \ with \\.
R is trying to escape the next character, in this case the U, so to insert a \ you need to write an escaped \, which is \\.
Please do not mark this response as correct as smitec has already answered correctly. I'm including a convenience function I keep in my .First library that makes converting a windows path to the format that works in R (the methods described by Sacha Epskamp). Simply copy the path to your clipboard (ctrl + c) and then run the function as pathPrep(). No need for an argument. The path is printed to your console correctly and written to your clipboard for easy pasting to a script. Hope this is helpful.
pathPrep <- function(path = "clipboard") {
  y <- if (path == "clipboard") {
    readClipboard()
  } else {
    cat("Please enter the path:\n\n")
    readline()
  }
  x <- chartr("\\", "/", y)
  writeClipboard(x)
  return(x)
}
Solution
Try this: x <- read.csv("C:/Users/surfcat/Desktop/2006_dissimilarity.csv", header=TRUE)
Explanation
R is not able to understand normal Windows paths correctly because "\" has a special meaning: it is used as an escape character that gives the following character a special meaning (\n for newline, \t for tab, \r for carriage return, ...).
Because \U starts a Unicode escape that must be followed by hex digits, R complains. Just replace the "\" with "/", or use an additional "\" to escape the "\" from its special meaning, and everything works smoothly.
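The options described above can be sketched as follows. No file is read here; the code only builds the path strings and checks that they describe the same location. The variable names are illustrative.

```r
# Forward slashes work on Windows and need no escaping.
p_forward <- "C:/Users/surfcat/Desktop/2006_dissimilarity.csv"

# Doubled backslashes: each "\\" in the source is one backslash in the string.
p_escaped <- "C:\\Users\\surfcat\\Desktop\\2006_dissimilarity.csv"

# file.path() joins components with "/" on every platform.
p_built <- file.path("C:", "Users", "surfcat", "Desktop",
                     "2006_dissimilarity.csv")

# Converting the backslash form gives the forward-slash form.
identical(chartr("\\", "/", p_escaped), p_forward)
identical(p_built, p_forward)
```

Any of these strings could then be passed to read.csv(..., header = TRUE).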
Alternative
On Windows, I think the best way to improve your workflow with Windows-specific paths in R is to use a tool such as AutoHotkey, which allows custom hotkeys:
define a hotkey, e.g. Ctrl-Shift-V
assign it a procedure that replaces backslashes in your clipboard with slashes
whenever you want to paste a path into R, use Ctrl-Shift-V instead of Ctrl-V
Et voilà
AutoHotkey code snippet:
^+v::
StringReplace, clipboard, clipboard, \, /, All
SendInput, %clipboard%
My Solution is to define an RStudio snippet as follows:
snippet pp
"`r gsub("\\\\", "\\\\\\\\\\\\\\\\", readClipboard())`"
This snippet converts backslashes \ into double backslashes \\. The following version will work if you prefer to convert backslashes to forward slashes /.
snippet pp
"`r gsub("\\\\", "/", readClipboard())`"
Once your preferred snippet is defined, paste a path from the clipboard by typing p-p-TAB-ENTER (that is pp and then the tab key and then enter) and the path will be magically inserted with R friendly delimiters.
Replace backslashes \ with forward slashes / when running on a Windows machine.
I know this is really old, but if you are copying and pasting anyway, you can just use:
read.csv(readClipboard())
readClipboard() returns the path with its backslashes intact, so no manual escaping is needed. Just make sure the ".csv" is included in what you copy, or append it like this:
read.csv(paste0(readClipboard(),'.csv'))
And if you really want to minimize your typing you can use some functions:
setWD <- function() {
  setwd(readClipboard())
}
readCSV <- function() {
  return(readr::read_csv(paste0(readClipboard(), '.csv')))
}
# copy directory path
setWD()
# copy file name
df <- readCSV()
Replacing backslash with forward slash worked for me on Windows.
The best way to deal with this in case of txt file which contains data for text mining (speech, newsletter, etc.) is to replace "\" with "/".
Example:
file<-Corpus(DirSource("C:/Users/PRATEEK/Desktop/training tool/Text Analytics/text_file_main"))
I think that R is reading the '\' in the string as an escape character. For example \n creates a new line within a string, \t creates a new tab within the string.
'\\' will work because R will recognize this as a normal backslash.
readClipboard() works directly too. Copy the path into your clipboard
C:\Users\surfcat\Desktop\2006_dissimilarity.csv
Then
readClipboard()
appears as
[1] "C:\\Users\\surfcat\\Desktop\\2006_dissimilarity.csv"
A simple way is to use python.
in python terminal type
r"C:\Users\surfcat\Desktop\2006_dissimilarity.csv"
and you'll get back
'C:\\Users\\surfcat\\Desktop\\2006_dissimilarity.csv'

How do I strip dollar signs ($) from data/ escape special characters in R?

I've been using gsub("toreplace","replacement", myvector) to clean out data in R. While this works for commas and the like, removing "$" has no effect. So if I do gsub("$","",myvector) all the dollar signs remain in place.
I think this is because $ is a special character in R. I tried escaping it "\$" but that yields the same result (no effect). And I couldn't find a resource on escaping special characters in R.
Obviously I should do this in preprocessing. But I was wondering if anyone out there knew how to either a) escape special characters in R b) get rid of pesky $ in R directly. For science.
You have to escape it twice, first for R, second for the regex.
gsub('\\$', '', c("a$a", "bb$"))
[1] "aa" "bb"
See ?Quotes for details on quoting and escaping.
Use fixed = TRUE:
gsub('$', '', c("a$a", "bb$"), fixed = TRUE)
Then you don't need to worry about any special characters. In stringr, this is implemented a little differently:
library(stringr)
str_replace_all(c("$100","ta$ty"), fixed("$"), "")
Thanks to DiggyF and James for the examples!
Escaping characters can be a pain sometimes, but putting the character in square brackets (making it a character class) avoids the need for it:
> gsub("[$]","",c("$100","ta$ty"))
[1] "100" "taty"
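To see why the unescaped pattern has no effect, and that the three fixes above agree, here is a small sketch. Replacing "$" with "!" makes the end-of-string match visible. The variable names are illustrative.

```r
x <- c("$100", "ta$ty")

# Unescaped "$" is the end-of-string anchor, so nothing is removed...
unchanged <- gsub("$", "", x)
# ...and a visible replacement is appended at the end instead.
marked <- gsub("$", "!", x)

# Three equivalent ways to match a literal dollar sign.
escaped <- gsub("\\$", "", x)               # escape for R, then for the regex
literal <- gsub("$", "", x, fixed = TRUE)   # no regex at all
bracket <- gsub("[$]", "", x)               # character class

rbind(unchanged, marked, escaped, literal, bracket)
```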
If you have $ followed by a number in a set of data columns (e.g. $400,000), there is an easier way that worked like a charm for me.
library(dplyr)  # provides %>% and mutate_at()
library(readr)  # provides parse_number()
data %>%
  mutate_at(5:6, parse_number)
where 5:6 are the data column numbers.
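Without extra packages, a similar result can be sketched in base R by stripping dollar signs and thousands separators with a character class and then converting. This assumes the values use "," only as a thousands separator; the sample vector amounts is made up for illustration.

```r
amounts <- c("$400,000", "$1,250.50")   # made-up sample values

# "[$,]" matches a literal dollar sign or a comma; no escaping needed.
cleaned <- gsub("[$,]", "", amounts)
values <- as.numeric(cleaned)
values
```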
