I have a function of this form:
foo<-function(x,y){
if(length(y)==1){
return(x*y)
}
else{
return(x-y[1]*y[2])
}
}
and for the y argument I pass either a number or a vector of numbers:
> #test function:
> foo(1,2)
[1] 2
> foo(1,c(1,2))
[1] -1
Now I want to apply this function with mapply, but I run into problems when I try to pass a vector for the y argument:
df<-data.frame(
"a"<-floor(runif(6, 1,10)),
"b"<-floor(runif(6, 18,80)),
"c"<-floor(runif(6, 1,80)),
"d"<-floor(runif(6, 100,800)),
"e"<-floor(runif(6, 1000,4000)),
"f"<-floor(runif(6, 1,10)),
"g"<-floor(runif(6, 5,80))
)
names(df)=c("a","b","c","d","e","f","g")
The following works fine:
> mapply(FUN=foo,df["a"],df["b"])
but I run into trouble when I try the following:
> mapply(FUN=foo,df["a"],cbind(df["b"],df["c"]))
I'd be very grateful for tips on how to better handle an argument of varying length, or on how to pass such an argument to mapply!
There are a lot of possible fixes here. Fundamentally, you need to turn the second input to mapply into a list in which each element holds two values. One way to achieve that is:
tmp <- as.data.frame(t(df[c('b', 'c')]))
result <- mapply(FUN=foo,df["a"], tmp)
since a data frame is a list. This runs the function on all combinations of df["a"] and tmp. The elements you want lie along the diagonal (the first element of df['a'] with the first element of tmp, and so on), so the final answer is
diag(result)
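If the intermediate matrix of all combinations is unwanted, one alternative is to hand mapply one length-2 vector per row. This is only a minimal sketch, assuming foo as defined in the question and a small made-up data frame; the name ys is introduced here purely for illustration.

```r
# foo as defined in the question
foo <- function(x, y) {
  if (length(y) == 1) {
    return(x * y)
  } else {
    return(x - y[1] * y[2])
  }
}

# small illustrative data (made-up values, not the question's random data)
df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30), c = c(2, 3, 4))

# ys (hypothetical name) is a list holding one c(b, c) pair per row
ys <- Map(c, df$b, df$c)

# mapply pairs a[i] with ys[[i]], so no diag() call is needed
res <- mapply(foo, df$a, ys)
res
```

Each call receives a scalar x and a length-2 y, so the result is a plain vector with one value per row.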
BTW, when you are inside a function call such as data.frame, use = instead of <- to name the arguments. You also do not need the quotes around the letters (they are being ignored). So your call to data.frame should look like
df<-data.frame(
a = floor(runif(6, 1,10)),
b = floor(runif(6, 18,80)),
c = floor(runif(6, 1,80)),
d = floor(runif(6, 100,800)),
e = floor(runif(6, 1000,4000)),
f = floor(runif(6, 1,10)),
g = floor(runif(6, 5,80))
)
which lets you avoid having to name the columns of the data frame after you define it.
Update without diagonal call
f1 <- function(x) {
if(length(x) ==2 ) x[1] * x[2]
else x[1] - x[2]*x[3]
}
apply(df[,c("a","b", "c")], 1, f1)
Related
I'm trying to write a function to pass to reduce that concatenates a list of vectors into a single vector, running from the very first entry to the very last entry.
For example,
reduce(list(1:10, 11:20, 21:100), r_cat, .init = NULL)
should return a vector equal to
1:100
This is what I have so far.
r_cat = function(x, y) {
out <- y[[1]]
for(i in seq(2, length(y))) {
out <- x(out, y[[i]])
}
out
}
Any thoughts?
No need to write a new function, unlist solves your problem:
List <- list(1:10, 11:20, 21:100)
unlist(List)
If you want to use Reduce from base R, you can pass it c:
Reduce("c", List)
You can also get the same result by plugging c into reduce from purrr:
library(purrr)
reduce(List, c)
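For reference, the function handed to Reduce only has to combine two values: the result so far and the next list element. The sketch below assumes a hypothetical two-argument r_cat written that way (it is not the asker's version) and checks it against unlist.

```r
List <- list(1:10, 11:20, 21:100)

# hypothetical binary combiner: acc is the result so far, nxt the next element
r_cat <- function(acc, nxt) c(acc, nxt)

out <- Reduce(r_cat, List)

# both approaches give the same vector as 1:100
identical(out, 1:100)
identical(unlist(List), 1:100)
```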
I have a data frame:
df <- data.frame( a = 1:5, b = 1:5, c = 1:5, d = as.factor(1:5))
I want to write a function that takes as its arguments one of the columns a, b or c and one of the factor levels of column d, and returns only the values of that column for which column d has the given level.
I tried the following code:
fun1 <- function(x,y) {
u <- x[data$d == "y"]
return(u)
}
and I keep getting back numeric(0) as the output of the function. When I try similar code outside of the function() environment, it appears to work fine. Any help would be appreciated.
Probably a duplicate but I don't know how I would find it in the haystack of items with tags: data.frame, indexing, columns, values. Best practice is to pass the "data" as well as the search terms. (Calling the object df1 rather than df.)
fun1 <- function(dfrm, col,val) {
u <- dfrm[dfrm$d == val , col]
return(u)
}
fun1(df1, 'b', 3)
#[1] 3
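To make the difference from the original attempt concrete, here is a small sketch that builds df1 from the question's data and runs both comparisons. It assumes the original failure came from the quoted "y", which is compared as the literal string "y" rather than the value of the argument y.

```r
df1 <- data.frame(a = 1:5, b = 1:5, c = 1:5, d = as.factor(1:5))

# data, column name and search value are all passed in explicitly
fun1 <- function(dfrm, col, val) {
  dfrm[dfrm$d == val, col]
}

fun1(df1, "b", 3)

# comparing against the literal string "y" matches no level of d
any(df1$d == "y")
```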
I have a very large dataframe...
v.l.df <- data.frame(seq(0, 10, 0.0001),seq(0, 10, 0.0001),seq(0, 10, 0.0001))
...and a function with some if statements and calculations...
a.f <- function(cell_value,action){
if(action == 1){
cell_value * 1
}
else if(action == 2){
cell_value * 5
}
}
I now want to apply this function to the first two columns of my v.l.df row by row and build the sums of the returns. The new columns should thus contain (pseudo code):
new_col_1 new_col_2
a.f(v.l.df[1,1],1) + a.f(v.l.df[1,2],1) a.f(v.l.df[1,1],2) + a.f(v.l.df[1,2],2)
a.f(v.l.df[2,1],1) + a.f(v.l.df[2,2],1) a.f(v.l.df[2,1],2) + a.f(v.l.df[2,2],2)
...
How can this be achieved? I am struggling with the multiple arguments when using apply and with the sum of the values returned from the function.
EDIT: Changed the example function. It should now return the following:
> a.f(2,1)
[1] 2
> a.f(2,2)
[1] 10
I'd do this in a couple of steps. You can reduce to fewer steps, but I prefer to keep it more readable:
First, apply a.f to all cells of the first two columns of v.l.df twice, once with action=1 and once with action=2 (to pass additional arguments inside apply, just put them after FUN):
action.1 = apply(v.l.df[,1:2], c(1,2), FUN = a.f, action=1)
action.2 = apply(v.l.df[,1:2], c(1,2), FUN = a.f, action=2)
Then apply rowSums to both action.1 and action.2 and store the results in the same data.frame:
v.l.df$new.1 = rowSums(action.1) #or v.l.df$new.1 = apply(action.1,1,sum)
v.l.df$new.2 = rowSums(action.2) #or v.l.df$new.2 = apply(action.2,1,sum)
I believe your result is achieved by:
v.l.df$new_col_1 <- a.f(v.l.df$V1, 1) + a.f(v.l.df$V2, 1)
v.l.df$new_col_2 <- a.f(v.l.df$V1, 2) + a.f(v.l.df$V2, 2)
Assuming your first two columns are named V1 and V2 respectively.
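The column-wise version works because a.f only tests the scalar action, while the multiplication itself is vectorised over cell_value. A minimal sketch on a tiny made-up data frame (the name small and its values are assumptions for illustration):

```r
# a.f as defined in the question
a.f <- function(cell_value, action) {
  if (action == 1) {
    cell_value * 1
  } else if (action == 2) {
    cell_value * 5
  }
}

# small illustrative data frame with the assumed column names V1 and V2
small <- data.frame(V1 = c(1, 2, 3), V2 = c(10, 20, 30))

# whole columns are passed at once; no row-by-row loop is needed
small$new_col_1 <- a.f(small$V1, 1) + a.f(small$V2, 1)
small$new_col_2 <- a.f(small$V1, 2) + a.f(small$V2, 2)
small
```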
You may also define another function
a.f.2 <- function(val1, val2, method) {
a.f(val1, method) + a.f(val2, method)
}
And apply it as follows
v.l.df$new_col_1 <- a.f.2(v.l.df$V1, v.l.df$V2, 1)
v.l.df$new_col_2 <- a.f.2(v.l.df$V1, v.l.df$V2, 2)
You can write this summary function with a ... argument, so that it takes an arbitrary number of inputs. The example below expects (and does not check for) columns of a data frame.
a.f.n<- function(method,...){
rowSums(sapply(...,a.f,method))
}
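Then apply this as follows (select however many columns you need; the example v.l.df has only three columns, so use 1:2 there):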
v.l.df$new_col_1 <- a.f.n(v.l.df[,1:1000], method=1)
v.l.df$new_col_2 <- a.f.n(v.l.df[,1:1000], method=2)
I am not sure how efficient this will be, but it is compact. :-)
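As a quick check of the ... version, the sketch below runs a.f.n on a two-column made-up data frame (small is a hypothetical name) and compares it with the explicit sum of the two columns.

```r
# a.f as defined in the question
a.f <- function(cell_value, action) {
  if (action == 1) cell_value * 1
  else if (action == 2) cell_value * 5
}

# a.f.n as defined above: sapply applies a.f to each column, rowSums adds them
a.f.n <- function(method, ...) {
  rowSums(sapply(..., a.f, method))
}

small <- data.frame(V1 = c(1, 2, 3), V2 = c(10, 20, 30))

res1 <- a.f.n(small[, 1:2], method = 1)
res2 <- a.f.n(small[, 1:2], method = 2)
res2
```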
I have a list of filtering functions f1,f2,f3,f4,.... which take a matrix m and a number of options as input and return a subset of the rows of the matrix as output. Now I would like to define, in an orderly way, some meta-filtering settings metaf1, metaf2, metaf3,... which specify the sequential application of a given number of filtering functions, e.g. first f2 and then f3, using given options for each. I would like to store these filtering settings in a list of, say, class "metafiltering", and then have another function apply the filtering steps specified in a given metafiltering object. The idea is that filtering settings can then be stored and applied in an orderly way. How would I achieve this in the most elegant way in R? Or are there perhaps other convenient methods to achieve something like this?
EDIT: to give an example, say I have matrix
m=replicate(10, rnorm(20))
and filtering functions (these are just examples, obviously mine are more complicated :-) )
f1=function(m,opt1,opt2) {
return(m[(m[,2]>opt1)&(m[,1]>opt2),])
}
f2=function(m,opt1) {
return(m[(m[,3]>opt1),])
}
And I have defined the following metafiltering settings of specific class which would specify two functions which would have to be applied sequentially to matrix m
metafilterfuncs=list(fun1=f1(opt1=0.1,opt2=0.2),fun2=f2(opt1=0.5))
class("metafilterfuncs")="metafiltering"
The question I have then is how I could apply the filtering steps of an arbitrary metafiltering function object to given matrix m using the specified functions and settings?
You can do something like this:
You define a sort of function pipeline in which you give each function a priority.
pipelines <- c(f1=100,f2=300,f3=200)
I define 3 dummy functions here for test:
f1 <- function(m,a) m + a
f2 <- function(m,b) m + b
f3 <- function(m,c) m + c
For each function, you store its arguments in another list:
args <- list(f1=c(a=1),f2=c(b=2),f3=c(c=3))
Then you apply your functions in order of priority:
m <- matrix(1:2,ncol=2)
for (func in names(pipelines[order(pipelines)]))
{
m <- do.call(func,list(m,args[[func]]))
}
pryr has a function, compose, like what you need, but it doesn't quite cut it. The compose function requires the functions to be given one by one, not in a list, and it cannot take arguments. It's also oddly placed in that package. A similar function can be found in plyr, namely each. But this function does not apply functions sequentially, but individually and outputs a named vector (list?).
agstudy provided a solution above, but it suffers from a problem: it can only take scalar arguments because it gives the arguments in a named vector. The solution to this is to use a named list instead. So, here's an improved function to replace the one in pryr.
compose2 = function(x, funcs, args, msg_intermediate = FALSE) {
  if (length(funcs) != length(args)) stop("length of functions and arguments must match")
  for (i in seq_along(funcs)) {
    # wrap x in list() so that a vector or matrix is passed as one argument
    x = do.call(what = funcs[[i]], args = c(list(x), args[[i]]))
    if ((i != length(funcs)) && msg_intermediate) message(x)
  }
  x
}
msg_intermediate is a nice debugging argument that messages the intermediate results, so one can more easily understand what happens.
Test it:
adder = function(x, n) x + n
compose2(0,
funcs = list(adder, adder, adder),
args = list(list(n = 1), list(n = 2), list(n = 3)),
msg_intermediate = TRUE
)
Outputs:
1
3
[1] 6
This is what you get when you take 0, then add 1 (=1), then add 2 (=3), then add 3 (=6).
The args argument for compose2 takes a list of lists, so that one can supply non-scalar function arguments. Here's an example:
add_div = function(x, n, d) (x + n) / d
compose2(0,
funcs = list(add_div, add_div, add_div),
args = list(list(n = 1, d = 1), list(n = 2, d = 2), list(n = 3, d = 3)),
msg_intermediate = TRUE
)
Output:
1
1.5
[1] 1.5
Which is what you get when you take 0, add 1, divide by 1 (=1), then take 1, add 2 then divide by 2 (=1.5), then take 1.5, add 3 and then divide by 3 (=1.5).
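Tying this back to the original question, the same do.call idea can be applied to matrix filters stored in a classed list. The following is only a sketch under stated assumptions: apply_steps, the fun/args layout of each step and metaf1 are hypothetical names chosen here, and drop = FALSE is added to the example filters so that a single surviving row stays a matrix.

```r
# example filters from the question, with drop = FALSE added
f1 <- function(m, opt1, opt2) m[(m[, 2] > opt1) & (m[, 1] > opt2), , drop = FALSE]
f2 <- function(m, opt1) m[m[, 3] > opt1, , drop = FALSE]

# hypothetical settings object: an ordered list of (function, options) steps
metaf1 <- structure(
  list(
    list(fun = f1, args = list(opt1 = 0.1, opt2 = 0.2)),
    list(fun = f2, args = list(opt1 = 0.5))
  ),
  class = "metafiltering"
)

# hypothetical helper: run each step on the output of the previous one
apply_steps <- function(m, steps) {
  for (s in steps) {
    # list(m) keeps the whole matrix as a single first argument
    m <- do.call(s$fun, c(list(m), s$args))
  }
  m
}

set.seed(42)
m <- replicate(10, rnorm(20))
out <- apply_steps(m, metaf1)
dim(out)
```

Storing the function together with its options in each step keeps one settings object self-describing, so several such objects can be kept side by side and applied to the same matrix.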
I would like to use a function from the apply family (in R) to apply a function of two arguments to two matrices. I assume this is possible. Am I correct? Otherwise, it would seem that I have to put the two matrices into one, and redefine my function in terms of the new matrix.
Here's an example of what I'd like to do:
a <- matrix(1:6,nrow = 3,ncol = 2)
b <- matrix(7:12,nrow = 3,ncol = 2)
foo <- function(vec1,vec2){
d <- sample(vec1,1)
f <- sample(vec2,1)
result <- c(d,f)
return(result)
}
I would like to apply foo to a and b.
(Strictly answering the question, not pointing you to a better approach for your particular use here....)
mapply is the function from the *apply family of functions for applying a function while looping through multiple arguments.
So what you want to do here is turn each of your matrices into a list of vectors that hold its rows or columns (you did not specify which). There are many ways to do that; I like to use the following function:
split.array.along <- function(X, MARGIN) {
require(abind)
lapply(seq_len(dim(X)[MARGIN]), asub, x = X, dims = MARGIN)
}
Then all you have to do is run:
mapply(foo, split.array.along(a, 1),
split.array.along(b, 1))
Like sapply, mapply tries to put your output into an array if possible. If instead you prefer the output to be a list, add SIMPLIFY = FALSE to the mapply call, or equivalently, use the Map function:
Map(foo, split.array.along(a, 1),
split.array.along(b, 1))
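If you would rather not depend on abind, the row lists can also be built in base R. The sketch below assumes a hypothetical helper split_rows and uses a deterministic stand-in for foo (pair_first, also hypothetical) so that the result is reproducible; in R 3.6.0 and later, asplit(a, 1) is a built-in way to get the rows as a list.

```r
a <- matrix(1:6, nrow = 3, ncol = 2)
b <- matrix(7:12, nrow = 3, ncol = 2)

# hypothetical helper: one list element per row of the matrix
split_rows <- function(X) lapply(seq_len(nrow(X)), function(i) X[i, ])

# deterministic stand-in for foo: take the first entry of each row
pair_first <- function(vec1, vec2) c(vec1[1], vec2[1])

# mapply walks both row lists in parallel; each call returns 2 values
res <- mapply(pair_first, split_rows(a), split_rows(b))
res
```

Because every call returns a vector of length 2, mapply simplifies the result to a matrix with one column per row pair.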
You could adjust foo to take one argument (a single matrix), and use apply in the function body.
Then you can use lapply on foo to sample from each column of each matrix.
> a <- matrix(1:6,nrow = 3,ncol = 2)
> b <- matrix(7:12,nrow = 3,ncol = 2)
> foo <- function(x){
apply(x, 2, function(z) sample(z, 1))
}
> lapply(list(a, b), foo)
## [[1]]
## [1] 1 6
## [[2]]
## [1] 8 12