Create a string with a special character in R

I am having trouble creating a string that contains a special character. I have asked a similar question and read the answers to related questions, but I have not found a solution.
I want to build a character string that contains a special character (a backslash). I have been trying with cat, but I know it only prints; it does not store the string in a variable in R.
This is the result I want:
> cat("C:\\Users\\ppp\\ddd\\")
C:\Users\ppp\ddd\
and I have been trying with paste and collapse but without success:
> x = c("C:","Users","ppp","ddd")
> t <- paste0(x, collapse = '\n')
> t
[1] "C:\nUsers\nppp\nddd"

Are you sure you don't want
x = c("C:","Users","ppp","ddd")
t <- paste0(x, collapse = '/')
t
[1] "C:/Users/ppp/ddd"
R uses this format for setting working directories.
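As a small sketch of the same idea, base R's file.path() joins components with a forward slash by default, so the pieces do not need to be pasted by hand. The variable name p is only for illustration.

```r
x <- c("C:", "Users", "ppp", "ddd")

# file.path() takes the components as separate arguments,
# so do.call() is used to spread the vector over them
p <- do.call(file.path, as.list(x))
print(p)
```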
You can also do:
x = c("C:","Users","ppp","ddd")
t <- paste0(x, collapse = '\\')
t
[1] "C:\\Users\\ppp\\ddd"
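Although this result looks wrong, the doubled backslashes are only how R prints the string; each pair is a single backslash in the string itself. If you pass the string to a shell() command in R so that Windows interprets it, for example, it will be interpreted correctly.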
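To check what the string really holds, compare print(), which shows the escaped form, with cat() and nchar(), which work on the actual characters. This is a minimal sketch; the names p and n, and the trailing backslash (taken from the desired output in the question), are illustrative choices.

```r
x <- c("C:", "Users", "ppp", "ddd")

# join with a single backslash (written "\\" in R source) and add a trailing one
p <- paste0(paste(x, collapse = "\\"), "\\")

print(p)       # escaped form, as it would have to be typed in source code
cat(p, "\n")   # the actual characters, with single backslashes
n <- nchar(p)  # counts each backslash once
print(n)
```

So the value can be stored in a variable with paste0; only the printed representation differs.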

Not an answer, but
t <- paste0(x, collapse = '/')
"C:/Users/ppp/ddd" seems to work on Windows.

Related

String extraction with regular expression in R

I am trying to extract a bunch of information from filenames using regular expressions in R. When I match the pattern, str_view() shows me the correct set of strings. Yet when I try to sub those out and extract the remaining portion, it doesn't work. I also tried str_extract(), but it isn't working either. What am I doing wrong?
fname <- "TC2L6C_2020-08-14_1516_6C-ASG_29_00020.tab"
fext <- tools::file_path_sans_ext(fname)
stringr::str_view(fext, ".*-ASG_\\d+_", match = TRUE)
P_num <- gsub(".*-ASG_\\d{2}_", "", fext)
P_num <- stringr::str_extract(fname, "(?<=-ASG_\\d+)([^_])*(?=\\.tab)")
Using trimws from base R
trimws(fname, whitespace = ".*_|\\..*")
[1] "00020"
data
fname <- "TC2L6C_2020-08-14_1516_6C-ASG_29_00020.tab"
Here is a simple approach using sub:
fname <- "TC2L6C_2020-08-14_1516_6C-ASG_29_00020.tab"
output <- sub("^.*-ASG_\\d+_(.*)\\.tab$", "\\1", fname)
output
[1] "00020"
Above we use a capture group to isolate the portion of the filename, sans extension, which you want to match.
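If you would rather extract the group than replace everything around it, a base R sketch with regexec() and regmatches() could look like the following. The pattern is an assumption based on the single filename in the question; other filenames may need a different one.

```r
fname <- "TC2L6C_2020-08-14_1516_6C-ASG_29_00020.tab"

# regexec() records the position of the full match and of each capture group
m <- regexec("-ASG_\\d+_([^_.]+)\\.tab$", fname)

# regmatches() turns those positions into strings:
# the full match comes first, then the capture groups
parts <- regmatches(fname, m)[[1]]
P_num <- parts[2]
print(P_num)
```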

Interchangeable simulating and writing data to a file

I'm experimenting with R and trying to alternate between simulating data and writing it to a file. I tried many variants, for example:
connection<-file("file.txt", open="w")
for (i in 1:2){
X<-runif(3,0,1)
writeLines(as.character(X), con=connection, sep="\n")
}
close(connection)
But what I get is
0.442033957922831
0.0713443560525775
0.950616024667397
0.0807233764789999
0.186026858631521
0.658676357707009
instead of something like
0.442033957922831 0.0713443560525775 0.950616024667397
0.0807233764789999 0.186026858631521 0.658676357707009
Could you explain what I'm doing wrong?
We can paste the elements of 'X' into a single string and then use sep='\n'. Otherwise writeLines treats each element as its own line and adds a newline after every number.
connection<-file("file.txt", open="w")
for (i in 1:2){
X<-runif(3,0,1)
writeLines(paste(X, collapse=" "), con=connection, sep="\n")
}
close(connection)
Instead of writing line by line in a for loop, we can create the string once and write it to the text file in one go.
We can use replicate to repeat the runif code n times, paste the numbers row-wise, and paste them again collapsing with a new line character.
temp <- paste0(apply(t(replicate(2, runif(3,0,1))), 1, paste, collapse = ' '),
collapse = '\n')
connection <- file("file.txt")
writeLines(temp, connection)
close(connection)
where temp is a character vector of length one that looks like this:
temp
#[1] "0.406911700032651 0.416268902365118 0.698520892066881\n0.96398281189613 0.834513065638021 0.655840792460367"
which appears in the text file as:
cat(temp)
#0.406911700032651 0.416268902365118 0.698520892066881
#0.96398281189613 0.834513065638021 0.655840792460367
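To check the layout, one option is to write to a temporary file and read it back. The sketch below uses the loop version with paste(X, collapse = " "); the temporary file and the names out and lines are assumptions made for the example.

```r
out <- tempfile(fileext = ".txt")  # hypothetical output file

con <- file(out, open = "w")
for (i in 1:2) {
  X <- runif(3, 0, 1)
  # one string per iteration, so one line per iteration
  writeLines(paste(X, collapse = " "), con = con)
}
close(con)

# read the file back and count lines and values per line
lines <- readLines(out)
print(length(lines))
print(lengths(strsplit(lines, " ")))
```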

Remove comma which is a thousands separator in R

I need to import a bunch of .csv files into R. I do this using the following code:
Dataset <- read.csv(paste0("./CSV/State_level/",file,".csv"),header = F,sep = ";",dec = "," , stringsAsFactors = FALSE)
The input is a .csv file that uses "," as the decimal separator. Unfortunately, there are quite a few entries like this: 20,012,054.
This should really be 20012,054. Such entries sometimes lead to NAs, but usually the whole df is imported as character rather than numeric, which is what I'd like to have.
How do I get rid of the first "," (reading from left to right), and only if the number has more than 3 figures in front of the decimal comma?
Here is a sample of how the data looks in the .csv-file:
A data.frame might look like this:
df<-data.frame(a=c(0.5,0.84,12.25,"20,125,25"), b=c("1,111,054",0.57,105.25,0.15))
I used "." as decimal separator in this case to make it a number, which in the .csv is a ",", but this is not the issue for numbers in the format: 123,45.
Thank you for your ideas & help!
We can use sub to get rid of the first ,
df[] <- lapply(df, function(x) sub(",(?=.*,)", "", x, perl = TRUE))
Just to show that it leaves the , alone when the string contains only a single ,
sub(",(?=.*,)", "", c("0,5", "20,125,25"), perl = TRUE)
#[1] "0,5" "20125,25"
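To get from these strings to numeric values, the remaining decimal comma still has to become a dot. The sketch below also swaps sub for gsub so that numbers with more than one thousands separator are handled. That is an extension beyond the question, and the last test value is made up for illustration.

```r
x <- c("0,5", "20,125,25", "1,111,054", "1,234,567,89")

# drop every comma that still has another comma somewhere to its right,
# which leaves only the last (decimal) comma
no_thousands <- gsub(",(?=.*,)", "", x, perl = TRUE)

# turn the one remaining comma into a dot and convert to numeric
num <- as.numeric(sub(",", ".", no_thousands, fixed = TRUE))
print(num)
```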

Concatenate text using paste to call a vector in r

I'm very new to R, so I may still be thinking in spreadsheets. I'd like to loop over a vector of names (list), pass each one to a function (effect), and add a bit of text to the front and end of the name ("data$" and ".time0" or ".time1") so that it references a specific vector of a dataframe I already have loaded (i.e., data$variable.time0 and data$variable.time1).
paste just gives me the character string "data$variable.time0" or "data$variable.time1", rather than referencing the vector of the dataframe I want. Can I convert this to a reference somehow?
effect <- function(i){
time0 <- paste("data$",i,".time0", sep = "")
time1 <- paste("data$",i,".time1", sep = "")
#code continues but not relevant here
}
for (i in list){
effect(i)
}
You can use eval(parse(text = "...")) to evaluate a character string as R code.
Try
time0 <- eval(parse(text = paste("data$",i,".time0", sep = "")))
within your loop.
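An alternative that avoids evaluating text is to index the data frame with [[ ]], which accepts the column name as a string. The sketch below assumes a small made-up data frame, a vector of names called vars, and a stand-in computation inside effect; none of these come from the question.

```r
# made-up data with the naming scheme described in the question
data <- data.frame(height.time0 = c(1, 2), height.time1 = c(3, 4))
vars <- c("height")

effect <- function(i) {
  # [[ ]] looks the column up by the name built with paste0()
  time0 <- data[[paste0(i, ".time0")]]
  time1 <- data[[paste0(i, ".time1")]]
  mean(time1 - time0)  # placeholder for the real computation
}

res <- sapply(vars, effect)
print(res)
```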

Is there a way to make R strings verbatim (not escaped)?

Typical example:
path <- "C:/test/path" # great
path <- "C:\\test\\path" # also great
path <- "C:\test\path"
Error: '\p' is an unrecognized escape in character string starting ""C:\test\p"
(of course - \t is actually an escape character.)
Is there any mark that can be used to treat the string as verbatim? Or can it be coded?
It would be really useful when copy/pasting path names in Windows...
R 4.0.0 introduces raw strings:
dir <- r"(c:\Program files\R)"
https://stat.ethz.ch/R-manual/R-devel/library/base/html/Quotes.html
https://blog.revolutionanalytics.com/2020/04/r-400-is-released.html
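A short sketch of how this behaves, assuming R 4.0.0 or later. The path is the one from the question, and the dash delimiters in the second form are optional.

```r
# raw string: backslashes are taken literally (needs R >= 4.0.0)
path <- r"(C:\test\path)"

# the same value written with ordinary escapes
same <- identical(path, "C:\\test\\path")
print(same)

# dashes can be added to the delimiters if the text itself contains )"
path2 <- r"---(C:\test\path)---"
print(identical(path, path2))
```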
You can use scan (but only in an interactive session, not in a sourced script),
like this:
path=scan(what="",allowEscapes=F,nlines=1)
C:\test\path
print(path)
Then select everything (Ctrl+A) and run it (Ctrl+Enter) to get the result.
This does not work inside a function or a sourced script, however:
{
path=scan(what="character",allowEscapes=F,nlines=1)
C:\test\path
print(path)
}
throws an error.
Maybe readline() or scan(what = "character"); both work in the terminal, but not in a script or function:
1.readline():
> path <- readline()
C:\test\path #paste your path, ENTER
> path
[1] "C:\\test\\path"
2.scan(what = "character"):
> path = scan(what = "character")
1: C:\test\path #paste, ENTER
2: #ENTER
#Read 1 item
> path
[1] "C:\\test\\path"
EDIT:
Try this:
1.Define a function getWindowsPath():
> getWindowsPath <- function() #define function
{
return(scan(file = "clipboard", what = "character"))
}
2.Copy windows path using CTRL+C:
#CTRL+C: C:\test\path
> getWindowsPath()
#Read 1 item
[1] "C:\\test\\path"
If you are copying and pasting in Windows, you can set up a file connection to the clipboard. Then you can use scan to read from it, with allowEscapes turned off. However, Windows allows spaces in file paths, and scan splits its input on whitespace, so you have to wrap the result in paste0 with collapse set to an empty string.
x = file(description = "clipboard")
y = paste0(scan(file = x, what = "character", allowEscapes = F), collapse = "")
Unfortunately, this only works for the path currently in the clipboard, so if you are copying and pasting lots of paths into an R script, this is not a solution. A workaround in that situation would be to paste each path into a separate text file and save it. Then, in your main script, you could run the following
y = paste0(scan(file = "path1.txt", what = "character", allowEscapes = F), collapse = "")
You would probably need one saved file for each path.
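For the text-file workaround, readLines() may be simpler than scan(): it returns each line unchanged, so backslashes are not treated as escapes and spaces inside the path are kept. Note that collapsing scan's pieces with an empty string joins them without putting the spaces back. The sketch below creates a temporary file to stand in for a saved path file; the file and the example path are assumptions.

```r
# hypothetical file standing in for a text file a path was pasted into
f <- tempfile(fileext = ".txt")
writeLines("C:\\Program Files\\R", f)  # the file holds C:\Program Files\R

# readLines() returns the line as-is: no escape processing, spaces kept
p <- readLines(f, n = 1, warn = FALSE)

# optionally convert to forward slashes
p_fwd <- chartr("\\", "/", p)
print(p_fwd)
```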
