I want to create a function and use sapply to pass a number of input variables through it. The trick: the "variables" are actually vectors. I include an example below, where I would like to transpose the vectors a, b, and d, without having to manually write each command individually. I include the x <- part, while leaving it blank, because this is one of the main points of my confusion. Were I creating a normal function, I would simply create a vector of all the variables I want to pass through form5. However, if I create a vector from vectors, I'll just have a longer vector. So to be clear, I'd like sapply to return a matrix or dataframe with all 3 transposed vectors.
a <- c(1:10)
b <- c(11:20)
d <- c(21:30)
x <-
form5 <- function(x){
t(x)
}
sapply(x, form5)
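One possible way to fill in the blank, offered as a sketch rather than a definitive answer: collect the vectors in a list instead of with c(), so each vector stays a separate element. The list names a, b, and d are only there to label the output. Note that sapply simplifies the results column-wise, so the effect of t() is lost inside sapply; binding the vectors as rows is shown as an alternative.

```r
a <- 1:10
b <- 11:20
d <- 21:30

# A list keeps the three vectors separate; c(a, b, d) would flatten them.
x <- list(a = a, b = b, d = d)

form5 <- function(x) {
  t(x)
}

# sapply simplifies the three length-10 results into a 10 x 3 matrix.
res <- sapply(x, form5)
dim(res)

# To get one row per vector instead (3 x 10), bind the vectors as rows.
res_rows <- do.call(rbind, x)
dim(res_rows)
```

If a data frame is preferred, as.data.frame(res) converts the matrix without changing its shape.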
I have a function, MyFun(a,b,c,d), that returns a list of data frames and plots. The arguments, a, b, c, d, are just character strings that represent “start date”, “end date”, “current version #”, and “previous version #”. For brevity, I am going to call the arguments by the names a, b,c,d.
I want to run MyFun(a, b, c, d) with unique sets of arguments and store all the output into a list. To do so, I created a list of lists:
arg_set1 <- list(a1,b1,c1,d1)
arg_set2 <- list(a2,b2,c2,d2)
arg_set3 <- list(a3,b3,c3,d3)
arg_sets <- list(arg_set1, arg_set2, arg_set3)
This is the part I’m uncertain of: I am attempting to use lapply to get a list of outputs from MyFun(a,b,c,d), using the 3 lists in arg_sets as my input: output <- lapply(arg_sets, MyFun)
To my understanding, the above lapply statement does not work because lapply has no way of knowing that the arguments it should pass to MyFun are contained in arg_sets[1], arg_sets[2], arg_sets[3]. As an alternative, I have also tried to pass my input arguments to lapply as a data frame, with columns a, b, c, d and each row holding a unique set of parameters I want lapply to pass to MyFun(a,b,c,d). However, I ran into essentially the same issue as with the list of input arguments: I am unable to tell lapply to pass each row of the input data frame as a set of arguments to MyFun(a,b,c,d).
Any advice would be much appreciated!
Use do.call
output <- lapply(arg_sets, function(x) do.call(MyFun, x))
See this simple example,
my_fun <- function(x, y , z) {
x + y + z
}
arg_sets <- list(a = as.list(1:3), b= as.list(4:6))
lapply(arg_sets, function(x) do.call(my_fun, x))
#$a
#[1] 6
#$b
#[1] 15
If you create a vector of arguments instead of a list, you can change the above call as follows:
arg_set1 <- c(a1,b1,c1,d1)
arg_set2 <- c(a2,b2,c2,d2)
arg_set3 <- c(a3,b3,c3,d3)
arg_sets <- list(arg_set1, arg_set2, arg_set3)
lapply(arg_sets, function(x) do.call(MyFun, as.list(x)))
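For the data frame variant mentioned in the question (one row per argument set), a sketch using Map() may help. Map() walks several vectors in parallel and calls the function once per position, which amounts to one call per row. The function my_fun4 and the dates and version strings below are hypothetical stand-ins for MyFun and its real inputs.

```r
# Hypothetical stand-in for MyFun: takes four strings, returns one string.
my_fun4 <- function(a, b, c, d) {
  paste(a, b, c, d, sep = "|")
}

# One row per argument set; the values are made up for illustration.
args_df <- data.frame(
  a = c("2020-01-01", "2020-02-01"),
  b = c("2020-01-31", "2020-02-29"),
  c = c("v2", "v3"),
  d = c("v1", "v2"),
  stringsAsFactors = FALSE
)

# Map() calls my_fun4 once per row, taking one element from each column.
output <- Map(my_fun4, args_df$a, args_df$b, args_df$c, args_df$d)
output[[1]]
```

Like lapply, Map() always returns a list, so it also suits functions that return data frames or plots.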
df is a frequency table, where the values in a were reported as many times as recorded in column x,y,z. I'm trying to convert the frequency table to the original data, so I use the rep() function.
How do I loop the rep() function to give me the original data for x, y, z without having to repeat the function several times like I did below?
Also, can I input the result into a data frame, bearing in mind that the output will have different column lengths:
a <- (1:10)
x <- (6:15)
y <- (11:20)
z <- (16:25)
df <- data.frame(a,x,y,z)
df
rep(df[,1], df[,2])
rep(df[,1], df[,3])
rep(df[,1], df[,4])
If you don't want to repeat the rep() call for every column, you can use one of the apply functions. Note that you cannot store the result in a data.frame because the vectors have different lengths, but you can store it in a list and access the elements in much the same way as the columns of a data.frame. Something like this works:
df2<-sapply(df[,2:4],function(x) rep(df[,1],x))
This sapply call says: for each column in df[,2:4], call rep(df[,1], x), where x is that column (df[,2], df[,3], or df[,4]). Because the three results have different lengths, sapply cannot simplify them and returns a named list.
The below code just makes sure the apply function is giving the same result as your original way.
identical(df2$x,rep(df[,1], df[,2]))
# [1] TRUE
identical(df2$y,rep(df[,1], df[,3]))
# [1] TRUE
identical(df2$z,rep(df[,1], df[,4]))
# [1] TRUE
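A related sketch, not part of the answer above: sapply returns a list here only because the three results differ in length. If the lengths happened to match, it would simplify them to a matrix and the $ access would stop working. Using lapply instead always returns a list. The name expanded is just an illustrative choice.

```r
a <- 1:10
df <- data.frame(a, x = 6:15, y = 11:20, z = 16:25)

# lapply always returns a list, one element per count column.
expanded <- lapply(df[, 2:4], function(counts) rep(df$a, counts))

# Each element has as many entries as the sum of its count column.
lengths(expanded)
#   x   y   z
# 105 155 205
```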
EDIT:
If you want it as a data.frame object you can do this:
res<-as.data.frame(sapply(df2, '[', seq(max(sapply(df2, length)))))
Note this introduces NAs into your data.frame so be careful!
I am looking for a best practice to store multiple vector results of an evaluation performed at several different values. Currently, my working code does this:
q <- 55
value <- c(0.95, 0.99, 0.995)
a <- rep(0,q) # Just initialize the vector
b <- rep(0,q) # Just initialize the vector
df <- list() # one data frame per value level
for(j in seq_along(value)){
for(i in 1:q){
a[i]<-rnorm(1, i, value[j]) # just as an example function
b[i]<-rnorm(1, i, value[j]) # just as an example function
}
df[[j]] <- data.frame(a,b)
}
I am trying to find the best way to store the individual a and b for each value level, to be able to iterate through the variable "value" later for graphing, and to have the value of the variable "value" and/or a description of it available.
I'm not exactly sure what you're trying to do, so let me know if this is what you're looking for.
q = 55
value <- c(sd95=0.95, sd99=0.99, sd995=0.995)
a = sapply(value, function(v) {
rnorm(q, 1:q, v)
})
In the code above, we avoid the inner loop by vectorizing. For example, rnorm(55, 1:55, 0.95) will give you 55 random normal deviates, the first drawn from a distribution with mean=1, the second from a distribution with mean=2, etc. Also, you don't need to initialize a.
sapply takes the place of the outer loop. It applies a function to each value in value and returns the three vectors of random draws as the columns of a matrix a (wrap the call in as.data.frame() if you need a data frame). I've added names to the values in value and sapply uses those as the column names of a. (You could also make value a list, rather than a vector with named elements, with value <- list(sd95=0.95, sd99=0.99, sd995=0.995), and the code will otherwise run the same.)
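As a sketch of how the result can be inspected and reused later (the seed and the name a_df are illustrative choices, not part of the answer above):

```r
set.seed(42)  # only so the sketch is reproducible
q <- 55
value <- c(sd95 = 0.95, sd99 = 0.99, sd995 = 0.995)

# One column of q draws per standard deviation in value.
a <- sapply(value, function(v) rnorm(q, 1:q, v))
dim(a)        # 55 rows (one per mean), 3 columns (one per sd)
colnames(a)   # the names given to value

# Convert to a data frame if one is needed downstream.
a_df <- as.data.frame(a)

# Iterate over the levels later, e.g. for graphing; the sd itself
# stays available through the named vector value.
for (nm in names(value)) {
  cat(nm, "uses sd =", value[[nm]], "; column mean =", mean(a_df[[nm]]), "\n")
}
```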
You can create multiple matrices (one per value of q) and store them in a list as follows:
q <- list(a=10, b=20)
value <- list(sd95=0.95, sd99=0.99, sd995=0.995)
df.list = sapply(q, function(i) {
sapply(value, function(v) {
rnorm(i, 1:i, v)
})
})
This time we have two different values for q and we wrap the sapply code from above inside another call to sapply. The inner sapply does the same thing as before, but now it gets the value of q from the outer sapply (using the dummy variable i). We're creating two matrices, one called a and the other called b. a has 10 rows and b has 20 (due to the values we set in q). Because the two matrices have different dimensions, the outer sapply cannot simplify them any further, so both are returned in a list called df.list.
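A variant sketch, assuming actual data frames are wanted in the list: using lapply for the outer call guarantees a list regardless of the sizes in q, and as.data.frame converts each inner result. The seed is only there for reproducibility.

```r
set.seed(1)
q <- list(a = 10, b = 20)
value <- c(sd95 = 0.95, sd99 = 0.99, sd995 = 0.995)

# Outer lapply: one list element per entry of q.
# Inner sapply: one column per standard deviation.
df.list <- lapply(q, function(i) {
  as.data.frame(sapply(value, function(v) rnorm(i, 1:i, v)))
})

sapply(df.list, nrow)   # number of rows of each data frame
names(df.list$a)        # columns are named after the elements of value
```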
I followed the steps in the post "Is there a way to define a subsequent set of data.frame in R?" to assign data to a whole set of data frames.
What I did was store all my "symbols" in a vector x <- c("A","B","C",...), then use the above method to get a set of data frames named A, B, C, etc.
My question is: how can I access these data.frame iteratively by another for loop, then assign to vectors such as:
x <- c("A","B","C") ## A, B, C are names of data.frame
y <- c("a","b","c") ## a, b, c are names of vectors
for (i in x){ y[i] <- x[i][,2]}
Thanks a lot in advance!!
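A possible sketch, assuming the data frames A, B, and C already exist in the current environment (the toy data below is made up for illustration): mget() looks objects up by name and returns them in a named list, and lapply() then pulls the second column out of each one. Keeping the results in a list named by y avoids creating loose variables.

```r
# Toy data frames standing in for the real ones.
A <- data.frame(id = 1:3, val = c(10, 20, 30))
B <- data.frame(id = 1:3, val = c(40, 50, 60))
C <- data.frame(id = 1:3, val = c(70, 80, 90))

x <- c("A", "B", "C")  # names of the data frames
y <- c("a", "b", "c")  # names for the extracted vectors

# Fetch the data frames by name, then take column 2 of each.
dfs <- mget(x, envir = environment())
cols <- lapply(dfs, function(d) d[, 2])
names(cols) <- y

cols$a   # second column of A
```

If separate variables a, b, and c are really needed, list2env(cols, envir = environment()) would create them, although working with the list directly is usually easier to loop over.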
I have a dataframe of variable names and weightings.
Example:
Names <- c("a","b","c")
Weightings <- c(1,2,3)
df <- cbind(Names,Weightings)
I need to generate a function from this data as such.
myfun <- function(x,data){
data[x,"a"]*1+data[x,"b"]*2+data[x,"c"]*3}
I have another dataframe named data where the column names match a, b, and c and I will apply myfun to this data over all rows.
The issue I have is that the size of the Names and Weightings vector can vary. I could be working with 5 names and Weightings but I want it to generate the new function "myfun" as such.
Newnames <- c("a","b","c","d","e")
NewWeightings <- c(1,2,3,4,5)
myfun <- function(x,data){
data[x,"a"]*1+data[x,"b"]*2+data[x,"c"]*3+data[x,"d"]*4+data[x,"e"]*5}
Is there an easy way to automate the creation of this function, so that I could give someone the code and a .csv file of column names and weightings and they could generate their new function?
What about a strategy like this? We use a function to make a function:
getMyFunction <- function(columns, weights) {
stopifnot(length(columns)==length(weights))
function(x, data) {
# matrix product: each selected row times the weights, summed
as.vector(as.matrix(data[x, columns, drop=FALSE]) %*% weights)
}
}
Basically the matrix product %*% takes care of both the multiplication and the addition, and we specify a vector of columns all at once. (A plain data[x, columns] * weights would not do here: * recycles the weights down each column rather than across each row, so the result is wrong as soon as more than one row is selected.)
Then we build a function like
Names <- c("a","b","c")
Weightings <- c(1,2,3)
myFun <- getMyFunction(Names, Weightings)
and we can use it with
dd<-data.frame(a=c(1,1), b=c(1,2), c=c(1,3))
myFun(1,dd)
# [1] 6
myFun(2,dd)
# [1] 14
myFun(1:2,dd)
# [1]  6 14
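To connect this with the .csv file mentioned in the question, here is a sketch that reads the names and weights from CSV text and applies the weighted sum to every row. The helper make_weighted_sum and the column headers Names and Weightings are assumptions for illustration; with a real file, read.csv("weights.csv") (a hypothetical path) would replace the text = argument.

```r
# Hypothetical helper: builds a function that returns the weighted
# sum of the given columns for every row of a data frame.
make_weighted_sum <- function(columns, weights) {
  stopifnot(length(columns) == length(weights))
  function(data) {
    as.vector(as.matrix(data[, columns, drop = FALSE]) %*% weights)
  }
}

# Inline CSV text stands in for the file of names and weightings.
spec <- read.csv(text = "Names,Weightings\na,1\nb,2\nc,3",
                 stringsAsFactors = FALSE)

weighted_sum <- make_weighted_sum(spec$Names, spec$Weightings)

dd <- data.frame(a = c(1, 1), b = c(1, 2), c = c(1, 3))
weighted_sum(dd)
# [1]  6 14
```

Changing the CSV to five names and five weights would produce the five-term version without touching the code.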