Naming output files in R [duplicate] - r

I'm working in R and would like to export a txt file whose name includes the value of a particular variable. I read about the paste command, and it works perfectly here:
write.table(mydata,file=paste(cn,"data.txt"))
where cn is the value to put at the beginning of the file name data.txt. I would also like to write this file automatically into an output folder where I keep all my other results. I tried something like this:
write.table(mydata,file=paste(cn,"./output/data.txt"))
But it doesn't work. Any suggestions?

paste() simply concatenates its arguments into one string, using a space as the default separator. In your attempt, cn ends up in front of the directory, so the result is not a path inside output. Put the folder first and drop the separator:
write.table(mydata, file = paste("./output/", cn, "data.txt", sep = ""))
or use paste0(...), which is equivalent to paste(..., sep = ""):
write.table(mydata, file = paste0("./output/", cn, "data.txt"))
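As a small sketch of the same idea, base R's file.path() can join the folder and the file name, and dir.create() can make sure the folder exists before writing. The values of cn and mydata below are made-up stand-ins, and the example writes under tempdir() rather than ./output; both are assumptions for illustration only.

```r
# Hypothetical stand-ins for the asker's objects
cn <- "run01"
mydata <- data.frame(x = 1:3, y = c("a", "b", "c"))

# What the failing attempt builds: a space is inserted and cn comes first
bad <- paste(cn, "./output/data.txt")   # "run01 ./output/data.txt"

# Assumption: write below tempdir() so the example leaves no files behind
out_dir <- file.path(tempdir(), "output")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# Folder first, then the file name built from cn
out_file <- file.path(out_dir, paste0(cn, "data.txt"))
write.table(mydata, file = out_file)
```

file.path() inserts the path separator itself, so there is no need to remember a trailing slash on the folder name.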

Related

Importing csv file in R and remove character added in numeric header [duplicate]

I'm importing a csv file into R using read.csv(). The file is read, but an extra X character is added to each header because the headers are numeric. I want the headers without the X. How can I remove this extra character?
The extra 'X' is added to the headers because the original column names are not syntactically valid variable names. A valid name has to start with a letter, or with a dot that is not followed by a digit. By default, read.csv() checks the names of the variables in the data frame to ensure they are valid. You can switch this feature off with
read.csv(..., check.names = FALSE)
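You can also rename the columns after importing. If the only thing you want to do is delete the first character of each name, you can do this (let's call your data set df). Note that it strips the first character of every column name, including names that did not get an X prefix: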
colnames(df) <- substring(colnames(df),2)
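The two approaches above can be compared on a tiny example. The CSV text and the column names here are invented for illustration; the regular expression is one possible way to strip only an X that precedes a digit, not something taken from the answers.

```r
# Hypothetical CSV content with numeric column headers
csv_text <- "2019,2020\n1,2\n3,4"

# Default behaviour: names are made syntactically valid, so an X is prepended
default_names <- names(read.csv(text = csv_text))                  # "X2019" "X2020"

# With check.names = FALSE the headers are kept as they appear in the file
raw_names <- names(read.csv(text = csv_text, check.names = FALSE)) # "2019" "2020"

# Strip only a leading X that is followed by a digit, leaving other names alone
cleaned <- sub("^X(?=[0-9])", "", c("X2019", "X2020", "Xylem"), perl = TRUE)
```

Columns with non-syntactic names such as 2019 then have to be accessed with backticks or as strings, for example df[["2019"]].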

how to get the last part of strings with different lengths ended by ".nc" [duplicate]

I have several download links (i.e., strings), and each string has different length.
For example let's say these fake links are my strings:
My_Link1 <- "http://esgf-data2.diasjp.net/pr/gn/v20190711/pr_day_MRI-AGCM3-2-H_highresSST_gn_20100101-20141231.nc"
My_Link2 <- "http://esgf-data2.diasjp.net/gn/v20190711/pr_-present_r1i1p1f1_gn_19500101-19591231.nc"
My goals:
A) I want only the last part of each string, ending in .nc, to get these results:
pr_day_MRI-AGCM3-2-H_highresSST_gn_20100101-20141231.nc
pr_-present_r1i1p1f1_gn_19500101-19591231.nc
B) I want only the last part of each string without the .nc extension, to get these results:
pr_day_MRI-AGCM3-2-H_highresSST_gn_20100101-20141231
pr_-present_r1i1p1f1_gn_19500101-19591231
I tried to find a way on the net, but I failed. It seems this can be done in Python as documented here:
How to get everything after last slash in a URL?
Does anyone know the same method in R?
Thanks so much for your time.
A shortcut to get the last part of the string is basename():
basename(My_Link1)
#[1] "pr_day_MRI-AGCM3-2-H_highresSST_gn_20100101-20141231.nc"
For the second goal, remove the trailing ".nc" with sub(), anchoring the pattern to the end of the string with $:
sub("\\.nc$", "", basename(My_Link1))
#[1] "pr_day_MRI-AGCM3-2-H_highresSST_gn_20100101-20141231"
With a regex, here is another way to get the last part, by deleting everything up to and including the final slash:
sub(".*/", "", My_Link1)
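Both goals can also be handled for several links at once, since basename() is vectorised. The sketch below additionally uses tools::file_path_sans_ext() from the tools package that ships with R; it is offered as an alternative to the sub() call, not as the answerer's method.

```r
# The two example links from the question
My_Link1 <- "http://esgf-data2.diasjp.net/pr/gn/v20190711/pr_day_MRI-AGCM3-2-H_highresSST_gn_20100101-20141231.nc"
My_Link2 <- "http://esgf-data2.diasjp.net/gn/v20190711/pr_-present_r1i1p1f1_gn_19500101-19591231.nc"
links <- c(My_Link1, My_Link2)

# Goal A: everything after the last slash
file_names <- basename(links)

# Goal B: the same names with the extension removed
stems <- tools::file_path_sans_ext(file_names)
```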

Remove Hex Code from String in R [duplicate]

I have converted a .doc document to .txt, and I have some weird formatting that I cannot remove (from looking at other posts, I think it is in Hex code, but I'm not sure).
My data set is a data frame with two columns, one identifying a speaker and the second column identifying the comments. Some strings now have weird characters. For instance, one string originally said (minus the quotes):
"Why don't we start with a basic overview?"
But when I read it in R after converting it to a .txt, it now reads:
"Why don<92>t we start with a basic overview?"
I've tried:
df$comments <- gsub("<92>", "", df$comments)
However, this doesn't change anything. Furthermore, whenever I do any other substitution within a cell (for instance, changing "start" to "begin"), it changes that special character into a series of weird ? characters surrounded by boxes.
Any help would be very appreciated!
EDIT:
I read my text in like this:
df <- read_delim("file.txt", "\n", escape_double = F, col_names = F, trim_ws = T)
It has 2 columns; the first is speaker and the second is comments.
I found the answer here: R remove special characters from data frame
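This code worked; it deletes every character that is not a digit, an ASCII letter, a slash, an apostrophe or a space: gsub("[^0-9A-Za-z///' ]", "", a)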
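A possible alternative that keeps the apostrophe instead of deleting it: when R prints <92>, the string holds the single byte 0x92 rather than the four characters "<92>", which is why the first gsub() attempt matched nothing. In the Windows-1252 encoding that byte is the right single quotation mark. The sketch below assumes the converted .txt file really is Windows-1252 encoded, which the question does not confirm.

```r
# Bytes of "don<92>t" as they might appear in a Windows-1252 file
x <- rawToChar(as.raw(c(0x64, 0x6f, 0x6e, 0x92, 0x74)))

# Assumption: the source encoding is Windows-1252 (CP1252)
fixed <- iconv(x, from = "CP1252", to = "UTF-8")

# Optionally replace the curly apostrophe (U+2019) with a plain one
plain <- gsub("\u2019", "'", fixed, fixed = TRUE)
```

Applied to the data frame from the question, the same two calls would be run on df$comments.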

R - how to write a function to read a CSV file [duplicate]

I have CSV files named "001", "002",..."100" stored in the working directory. I need to write a function to read any of these files. I tried the function below, but it doesn't work.
func = function(ID)
{
inp = read.csv("ID.csv")
}
I think this is because "ID.csv" is a character whereas ID is a numeric variable, but I am not sure. Can someone please explain the reason and suggest the right code?
Sounds like you sort of understand the problem. "ID.csv" is a string literal, so R is literally looking for a file named ID.csv. If I were you, I would pass ID as a string matching the file names (i.e. "001" instead of 1). Then try this, which builds the file name from the value of ID and returns the data frame:
func <- function(ID)
{
  # e.g. ID = "001" gives the file name "001.csv"
  read.csv(paste(ID, ".csv", sep = ""))
}
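If the ID should be passed as a number instead, sprintf() can zero-pad it to three digits. This is only a sketch: the function name read_by_id and its dir argument are hypothetical, and the demo file is written to tempdir() purely so the example runs on its own.

```r
# Hypothetical helper: accepts 7, 7L or "007" and reads "007.csv" from dir
read_by_id <- function(id, dir = ".") {
  file_name <- sprintf("%03d.csv", as.integer(id))
  read.csv(file.path(dir, file_name))
}

# Demo with a throwaway file (assumption: tempdir() stands in for the working directory)
demo_dir <- tempdir()
write.csv(data.frame(a = 1:2), file.path(demo_dir, "007.csv"), row.names = FALSE)
inp <- read_by_id(7, dir = demo_dir)
```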

File picking using pattern in R [duplicate]

I have a directory containing multiple files named 001.csv, 002.csv and so on. I want to pick only the files whose numbers I pass as an argument to a function.
For example:
myFiles<-function(x=1:30){
# I should pick only the files from 001.csv to 030.csv.
}
I tried pattern matching, but I am not sure how to build a pattern from another variable that holds a vector. I also tried using paste to build the full file path, but it gave me the file name 1.csv rather than 001.csv:
tt<-function(dirname,type,nums=1:30){
filenames<-list.files(dirname)
c<-nums
myVector<-0
for(i in 1:length(c)){
myVector[i]<-paste(dirname,"/",c[i],".csv",sep="")
#print(myVector[i])
}
}
One way to get the correct names is to left-pad the numbers with zeros using formatC, e.g.
paste0(formatC(1:30, width = 3, format = "d", flag = "0"), ".csv")
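Building on the padded names, a minimal sketch of the requested function might look like this. The name pick_files and the demo directory are hypothetical; the function returns full paths only for the requested files that actually exist in the directory.

```r
# Hypothetical helper: full paths of the requested numbered CSV files that exist
pick_files <- function(dirname, nums = 1:30) {
  wanted <- paste0(formatC(nums, width = 3, format = "d", flag = "0"), ".csv")
  present <- list.files(dirname, pattern = "\\.csv$")
  file.path(dirname, intersect(wanted, present))
}

# Demo in a throwaway directory (assumption: these file names are made up)
demo_dir <- file.path(tempdir(), "pick_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("001.csv", "002.csv", "031.csv")))

picked <- pick_files(demo_dir, 1:30)   # 031.csv is outside the requested range
```

The resulting vector can be passed to lapply(picked, read.csv) to read all selected files.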
