I am trying to source multiple functions that differ only by a number in the name.
For example: func1, func2.
I tried using "func_1", and "func_2", as well as putting the number first, "1func" and "2func". No matter how I index the function names, the source function just reads in one function that it calls "func" - which is not what I want.
I have tried using for-loops and sapply:
for-loop:
func.list <- list.files(path="/some_path",pattern="some pattern",full.names=TRUE)
for(i in 1:length(func.list)){
source(func.list[i])
}
sapply:
sapply(func.list,FUN=source)
I am going to be writing multiple versions of a data correction function, and would really like to be able to index them, because giving each a concise but specific name would be difficult and would not allow me to selectively source just the function files from their directory.
In my code, func.list gives the output (I have replaced the actual directory because of privacy/contractual issues):
[1] "mypath/1resp.correction.R"
[2] "mypath/2resp.correction.R"
Then when I source func.list with either the for-loop or sapply code (listed above), R only loads one function named resp.correction, with the code body from "2resp.correction.R".
The argument to source is a file name, not a function name. So you cannot be fancy here: you need to provide the exact filenames.
It sounds like your two files contain the definitions of a function with the same name (resp.correction) in both files, so yes, as you source one file after the other, the function is overwritten in your global environment.
You could, inside your loop, reassign the function to a different name:
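func.list <- list.files(path="/some_path",pattern="some pattern",full.names=TRUE)
for(i in seq_along(func.list)) {
  # each file (re)defines resp.correction
  source(func.list[i], local = TRUE)
  # keep a copy under an indexed name before the next file overwrites it
  assign(paste0("resp.correction", i), resp.correction, envir = .GlobalEnv)
}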
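An alternative to copying the function after each source() call is to source every file into its own environment and keep the resulting functions in a named list. This is only a minimal sketch, assuming each file defines a function called resp.correction; the two temporary files below are stand-ins created purely for illustration:

```r
# Stand-in files for illustration only; each defines resp.correction differently.
tmp_dir <- tempfile("funcs")
dir.create(tmp_dir)
writeLines("resp.correction <- function(x) x + 1",
           file.path(tmp_dir, "1resp.correction.R"))
writeLines("resp.correction <- function(x) x * 2",
           file.path(tmp_dir, "2resp.correction.R"))

func.list <- list.files(path = tmp_dir, pattern = "resp\\.correction\\.R$",
                        full.names = TRUE)

# Source each file into a fresh environment so the definitions cannot
# overwrite one another, then pull the function out of that environment.
corrections <- lapply(func.list, function(f) {
  env <- new.env()
  source(f, local = env)
  get("resp.correction", envir = env, inherits = FALSE)
})
names(corrections) <- basename(func.list)

corrections[["1resp.correction.R"]](10)  # version from the first file
corrections[["2resp.correction.R"]](10)  # version from the second file
```

Keeping the versions in a list avoids creating numbered objects in the global environment, and the list can be indexed by file name or by position.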
I have a list of names (e.g: authors) and a pdf file which includes those names. I need to calculate how many times those authors are mentioned in the pdf file.
Let's say my table of authors is named "author" and the pdf file's name is "pdf" (I have already converted this pdf file and stored it in R using pdf_text).
I've tried the following:
author$count <- 0
author$count <- for (i in author$name) { sum(str_count(pdf, i))}
But it didn't work. When I printed author$count, the results were NULL. Is there a way to fix this?
Unlike most other constructs in R, for does not return a useful value (it always evaluates to NULL), so its result cannot be assigned. In most situations one of the vector mapping functions (lapply, vapply etc.) is better suited to the task.
In your case, vapply does the trick:
author$count <- vapply(author$name, \(i) sum(str_count(pdf, i)), integer(1L))
(If you’re using a version of R older than 4.1, you need to replace \(i) with function (i).)
Note that you do not need to assign 0 to author$count beforehand. That value would be overwritten anyway.
A note on vapply vs. sapply
vapply ensures that the result of the function call actually conforms to the expected format (here: integer(1L), i.e. every element is a single integer). sapply doesn’t do this, which makes using sapply risky in non-interactive code, since it won’t notify you if there’s an error with the data. purrr::map_* behaves similarly to vapply.
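The difference shows up as soon as the function returns something of unexpected length. A small sketch with made-up inputs (positions_of_a is a hypothetical helper, not part of the original question):

```r
# Hypothetical helper: returns the positions of "a" in a string.
# It yields one value for "abc" but two values for "aba".
positions_of_a <- function(s) which(strsplit(s, "")[[1]] == "a")

ok  <- c("abc", "bca")
bad <- c("abc", "aba")

sapply(ok, positions_of_a)        # a named integer vector
res <- sapply(bad, positions_of_a)
is.list(res)                      # sapply silently switched to a list

# vapply refuses the malformed result instead of changing the return type.
err <- tryCatch(
  vapply(bad, positions_of_a, integer(1L)),
  error = function(e) conditionMessage(e)
)
err
```

With sapply the change of return type would only surface later, wherever the result is used; vapply fails at the call that produced it.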
Alternatively, do the assignment inside the loop, iterating over the index sequence so that each result can be stored in its own position:
for(i in seq_along(author$name)) {
author$count[i] <- sum(str_count(pdf, author$name[i]))
}
I am basically altering an existing long, rather complicated code, to include new parameters and I want to add a for loop in an existing sapply function, but the for loop uses an argument from sapply.
(see code below)
ld is a character string of the list of directories, so in the current code it is applying a function to each directory one by one (I just used .... to simplify it).
Now I use list.dirs to introduce td, which is a list of the subdirectories inside each ld (it works, I checked for a single folder). I am trying to make a for loop which loops through the tds for a given ld.
E.g. once working on ld[1], I want it to identify the subfolders in this folder only, save them as a character string and apply another function (in the for loop) to each of these subfolders.
Then it moves to the second folder in the ld strings and applies the function to each subfolder there and so on.
sapply(file.path(ld), function(folder){
....
td = list.dirs(ld, recursive = FALSE)
for (j in td){
.........
}
})
I get the error:
Error in file(file, "rt") : invalid 'description' argument
I put a couple of print functions to see what happens and I see that within sapply:
td = list.dirs(ld, recursive = FALSE)
lists all the subfolders in all of the ld folders, not only the current one.
Also, immediately after I introduce the for loop, I printed out td[j], which gives NA.
Now is the time to say I never figured out how to use the apply family of functions, which is probably why I can't get this to work: I assume they work like a regular loop, but this isn't the case. It would probably be easier to write a nested for loop, but before I do that I am trying to make minimal changes to the current code in case I mess it up, since I am very much a beginner in programming.
I hope this is clear; any help would be appreciated!
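Judging only from the snippet above, two details may explain the symptoms: list.dirs(ld, ...) is called on the whole vector ld rather than on the current element folder, and for (j in td) puts the path itself into j, so td[j] indexes by a name that does not exist and yields NA. A minimal sketch under those assumptions, using a throwaway directory tree and a placeholder for the per-subfolder work:

```r
# Throwaway directory tree for illustration: two top-level folders,
# each with its own subfolders.
root <- tempfile("root")
for (p in c("A/a1", "A/a2", "B/b1")) {
  dir.create(file.path(root, p), recursive = TRUE)
}
ld <- file.path(root, c("A", "B"))

result <- lapply(ld, function(folder) {
  # Use the current element (folder), not the whole vector (ld).
  td <- list.dirs(folder, recursive = FALSE)
  for (j in td) {
    # j already holds the subfolder path; use j directly, not td[j].
    message("processing ", j)
  }
  basename(td)  # placeholder return value: the subfolder names
})
names(result) <- basename(ld)
result
```

The apply-family functions call the supplied function once per element, so inside the function only the argument (folder here) refers to the current element.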
I am trying to write a program to open a large number of files and run them through a function I made called "sort". Every one of my file names starts with "sa1"; however, the characters after that vary by file. I was hoping to do something along the lines of this:
for(x in c("Put","Characters","which","Vary","by","File","here")){
sa1+x <- read.csv("filepath/sa1+x",header= FALSE)
sa1+x=sort(sa1+x)
return(sa1+x)
}
In this case, say that x was 88. It would open the file sa188, name that dataframe sa188, and then run it through the function sort. I don't think that writing sa1+x is the correct way to bind together two values, but I don't know another way.
You need to use a list to contain the data in each csv file, and loop over the filenames using paste0.
file_suffixes <- c("put","characters","which","vary","by","file","here")
list_data <- list()
sorted_data <- list()
filename <- "filepath/sa1"
for (x in seq_along(file_suffixes)) {
  # build the full file name, e.g. "filepath/sa1put"
  list_data[[x]] <- read.csv(paste0(filename, file_suffixes[x]), header=FALSE)
  # sort here is the asker's own function, not base R's sort
  sorted_data[[x]] <- sort(list_data[[x]])
}
I am not sure why you use return in that loop. If you're writing a function, you should be returning the sorted_data list which contains all your post-sorting data.
Note: you shouldn't call your function sort because there is already a base R function called sort.
Additional note: you can use dir() and regex parsing to find all the files which start with "sa1" and loop over all of them, thus freeing you from having to specify the file_suffixes.
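That last suggestion could look roughly like the following. This is a sketch only: the temporary directory and the file names after "sa1" are made up for illustration, and list.files is used, which is the same function as dir.

```r
# Throwaway files for illustration; the names after "sa1" are made up.
data_dir <- tempfile("filepath")
dir.create(data_dir)
for (suffix in c("88", "xyz")) {
  write.table(data.frame(v = c(3, 1, 2)),
              file.path(data_dir, paste0("sa1", suffix)),
              sep = ",", col.names = FALSE, row.names = FALSE)
}
invisible(file.create(file.path(data_dir, "other")))  # should not be picked up

# "^sa1" matches file names that start with sa1.
files <- list.files(data_dir, pattern = "^sa1", full.names = TRUE)

# One data frame per file, named after the file it came from.
list_data <- lapply(files, read.csv, header = FALSE)
names(list_data) <- basename(files)
names(list_data)
```

Naming the list elements after the files keeps the link between each data frame and its source without creating one variable per file.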
I have created a list of objects in my work environment
data <- c("variable1", "variable2", "variable3")
I would like to save the files to different directories, with the variable name as the directory, so I did this to give me a list of file names to pass to the save function via lapply:
paste0(data,"/",data,".rda")
lapply(data,FUN=save,file = paste0(data,"/",data,".rda"))
I get the error
Error in FUN(X[[i]], ...) : object ‘X[[i]]’ not found
I'm not sure what I'm doing wrong here.
Do you have a list of objects, or a list of names of objects? You say you have the former, but the code you give is for the latter.
Also, if you only have one object per file, then it's better to use the saveRDS function (and readRDS to load it).
lapply(data, function(x) saveRDS(get(x), paste0(x, "/", x, ".rds")))
If you have to use save:
lapply(data, function(x) save(list=x, file=paste0(x, "/", x, ".rda")))
Several things going on here.
First, you need not use lapply when you don't care about the return value of the function called at each iteration. It offers nothing in this case.
Second, and more importantly, what you are doing is writing objects to files with names derived from their variable names in R. That's an anti-pattern.
Instead, create a list of the objects, and use a for loop to do the work. We need to use saveRDS for this (thanks Hong Ooi), because l[[n]] is also not the name of an object in the environment.
l <- list(variable1 = variable1, variable2 = variable2, variable3 = variable3)
for (n in names(l)) {
  # saveRDS writes a single object, so use the .rds extension
  fname <- paste0(n, '/', n, '.rds')
  saveRDS(l[[n]], file = fname)
}
It would be better to just save the entire list, but then all the data would be in one file in one directory.
As for what's actually wrong with your code:
You pass the same value for file to all invocations of save, and you don't intend to do so. This value is a vector, but what you want is that each iteration gets one element from this vector.
The way lapply computes the value to pass to the function confuses save. In particular, save does this:
names <- as.character(substitute(list(...)))[-1L]
That results in something like the following, which is not the name of an object in the environment.
c("variable1", "variable2", "variable3")[[1]]
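The first point, one file path per object, can be sketched with Map, which walks over the names and the paths in parallel. The objects and the temporary output directory below are made up for illustration, and envir is passed explicitly so that save looks the names up where the objects were created:

```r
# Made-up objects standing in for the real data.
variable1 <- 1:3
variable2 <- letters
variable3 <- c(TRUE, FALSE)
data <- c("variable1", "variable2", "variable3")

# Temporary output location; one directory per object.
out_root <- tempfile("out")
dirs <- file.path(out_root, data)
for (d in dirs) dir.create(d, recursive = TRUE)
files <- file.path(dirs, paste0(data, ".rda"))

# Environment in which the objects live (the current one).
obj_env <- environment()

# Map hands save() one name and one path per call.
invisible(Map(function(nm, f) save(list = nm, file = f, envir = obj_env),
              data, files))

file.exists(files)
```

Because save receives the name through its list argument, it no longer tries to deparse the expression X[[i]] that lapply would pass.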
I'm having a problem with the below function:
ab<-matrix(c(1:20),nrow=4)
rownames(ab)<-c("a","b","c","d")
cd<-c("a","c")
test<-function(x,y,ID_Tag){
for(i in y) {
M_scaled<-t(scale(t(x),center=T))
a<-quantile(M_scaled[match(i,rownames(x)),])
assign(paste0("Probes_",ID_Tag,"_quan_",i),a)
}
}
test(ab,cd,"C1")
x is the dataframe/matrix
y is the string I need to search for in rownames(x)
ID_Tag is the number I use to distinguish my samples from each other.
The function runs, but the objects it is supposed to create do not exist afterwards.
Hope somebody can help me
When you use assign within a function it will make the assignment to a variable that is accessible within that function only (i.e. it's like using <-). To get around this, you need to specify the envir argument in assign to be either the global environment globalenv() or the parent frame of the function. So try changing your assign statement to
assign(..., envir = parent.frame())
or
assign(..., envir = globalenv())
depending on what you want exactly (in the example you provided they are equivalent). Have a look at ?parent.frame for more info on these. Another possibility is to specify the pos argument in assign, check ?assign.
As an aside, assigning global objects from within a function can lead to various problems in general. I find it better practice in your example to return a list of objects created in the for loop rather than use assign.
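A minimal sketch of that list-returning approach, reusing the example data from the question; test_list is a hypothetical name for the rewritten function:

```r
ab <- matrix(1:20, nrow = 4)
rownames(ab) <- c("a", "b", "c", "d")
cd <- c("a", "c")

# Variant of test() that returns a named list instead of calling assign().
test_list <- function(x, y, ID_Tag) {
  # Scale each row once, outside the loop.
  M_scaled <- t(scale(t(x), center = TRUE))
  out <- lapply(y, function(i) quantile(M_scaled[match(i, rownames(x)), ]))
  names(out) <- paste0("Probes_", ID_Tag, "_quan_", y)
  out
}

res <- test_list(ab, cd, "C1")
names(res)
res[["Probes_C1_quan_a"]]
```

The caller then decides where the results live, and the names that assign would have created become the names of the list elements.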