do.call(), multiple parameters - r

I have a function with many arguments:
fun(A,B,C,D,E)
I want to fix A, B, C, D at the values a, b, c, d and pass the values 1:7 to E.
I tried to use do.call() as below, but it doesn't work:
a <- do.call(function(x) fun(A = a, B = b, C = c, D = d, E = x), list(1:7))
I turned to lapply, and it works:
a <- lapply(c(1:7), function(x) fun(A = a, B = b, C = c, D = d, E = x))
Following Joshua Ulrich's answer, when I try
a <- do.call(fun, list(A = a, B = b, C = c, D = d, E = list(1:7)))
it says
(list) object cannot be coerced to type 'double'
So I guess fun needs a double value for E, but do.call() passes the whole list at once instead of the values one by one.
I don't want to use lapply because it returns a list of lists. To pick out one element I have to use [[ ]], which accepts only a single index, so I cannot select several elements with a vector, e.g. [[a]] with a <- c(1:7).
How can I make do.call() work?

That should be:
a <- do.call(fun, list(A = a, B = b, C = c, D = d, E = list(1:7)))
And I have a feeling you want E to be a vector, not a list, in which case it should be:
a <- do.call(fun, list(A = a, B = b, C = c, D = d, E = 1:7))
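The difference between the two calls can be sketched with a stand-in function. The real fun is not shown in the question, so the fun below, the object fixed, and the numeric values are assumptions made only for illustration. do.call() makes a single call with the whole argument list, while lapply() makes one call per value of E; a list returned by lapply() can still be subset with a vector by using single brackets.

```r
# Hypothetical stand-in: the asker's real `fun` is not shown.
fun <- function(A, B, C, D, E) A + B + C + D + E

# Fixed arguments (illustrative values only).
fixed <- list(A = 1, B = 2, C = 3, D = 4)

# One single call, with E passed as the whole vector 1:7.
all_at_once <- do.call(fun, c(fixed, list(E = 1:7)))

# Seven separate calls, one per value of E; the result is a list.
one_by_one <- lapply(1:7, function(x) do.call(fun, c(fixed, list(E = x))))

# Single brackets accept a vector of indices and return a sub-list.
picked <- one_by_one[c(1, 3, 7)]
```

If every call returns a single number, sapply() in place of lapply() would return a plain vector rather than a list.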

Related

applying a created function in a list with data frames

I would like to calculate the coefficient of variation for a list composed of several data frames. However, when I apply my function to the list of data frames, I get this error:
coef_var = lapply(dists_log, cvs)
Error in is.data.frame(x) :
'list' object cannot be coerced to type 'double'
Here is what I did:
dists_log <- list(A = data.frame(A = rnorm(30), B = rnorm(30), C = rnorm(30), D = rnorm(30)),
                  B = data.frame(A = rnorm(30), B = rnorm(30), C = rnorm(30), D = rnorm(30)),
                  C = data.frame(A = rnorm(30), B = rnorm(30), C = rnorm(30), D = rnorm(30)),
                  D = data.frame(A = rnorm(30), B = rnorm(30), C = rnorm(30), D = rnorm(30)))
# function to calculate the coefficient of variation
cvs <- function(dist) {
  cv <- sd(dist, na.rm = TRUE) / mean(dist, na.rm = TRUE) * 100
  return(cv)
}
Then I run:
coef_var = lapply(dists_log, cvs)
and get the error message above.
Can someone help me with this error?
sd and mean need a vector as input, not a data.frame, so a nested lapply is required: the outer lapply loops over the data.frames in the list and the inner one applies cvs to each column. Here the result is assigned back to the columns (x[] keeps the data.frame structure) and the data.frame is returned:
lapply(dists_log, function(x) {x[] <- lapply(x, cvs); x})
If a single value per column is expected, return a named vector for each data.frame instead:
lapply(dists_log, function(x) unlist(lapply(x, cvs)))
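As a variation on the nested lapply above, the result can be collected into a single matrix. This is only a sketch: the two-column data frames and the seed are made up for the example, and vapply/sapply are swapped in for the lapply calls in the answer.

```r
set.seed(1)  # arbitrary seed, only to make the example reproducible

# Small made-up input: a list of two data frames with two columns each.
dists_log <- list(A = data.frame(A = rnorm(30), B = rnorm(30)),
                  B = data.frame(A = rnorm(30), B = rnorm(30)))

# Coefficient of variation of one numeric vector, as in the question.
cvs <- function(dist) sd(dist, na.rm = TRUE) / mean(dist, na.rm = TRUE) * 100

# Inner vapply: one number per column. Outer sapply: one matrix column per data frame.
coef_var <- sapply(dists_log, function(x) vapply(x, cvs, numeric(1)))
```

Rows of coef_var correspond to the columns of the data frames and columns to the list elements, so coef_var["B", "A"] is the value for column B of data frame A.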

rev() in r and how to apply it to a list using loops

I have a list, say {a,b,c,d,...}, in which each element is a data.table whose column order I need to reverse. However, I want to rev() every column except the first, as it is an ID. I tried using a loop to do it, but it returned
Error in `[<-.data.table`(`*tmp*`, , -1, value = list(code_a = c("a", :
Item 1 of column numbers in j is -1 which is outside range [1,ncol=4]. Use column names instead in j to add new columns.
Example:
library(data.table)
a <- c("a","b","c","d","e","f")
b <- 1:6
c <- c("F","E","D","C","B","A")
d <- 10:15
dt1 <- data.table("ID" = b, "code_a" = a)
dt2 <- data.table("ID" = b, "code_c" = c)
dt3 <- data.table("ID" = b, "code_d" = d)
dt <- list(dt1,dt2,dt3)
rev_dt <- rev(dt)
merged_list <- list()
rev_merged_list <- list()
rev_merged_list <- Reduce(merge, rev_dt, accumulate = TRUE)
merged_list <- rev_merged_list
merged_list <- rev(merged_list)
for(z in 1:length(dt)){
merged_list[[z]][,-1] = rev(merged_list[[z]][,-1])
}
More Information:
The for loop here is supposed to do the following:
- for z from 1 to the length of dt
- take element z of merged_list (with double square brackets), which should be a data.table
- excluding its first column
- and assign to it the rev of the same element z, again with the first column excluded
Does this logic hold for the above loop? I am unsure what is wrong!
Expected Output:
output_ <- list()
a_ <- data.table("ID" = b, "code_a" = a, "code_c" = c, "code_d" = d)
b_ <- data.table("ID" = b, "code_c" = c, "code_d" = d)
c_ <- data.table("ID" = b, "code_d" = d)
output_[[1]] <- a_
output_[[2]] <- b_
output_[[3]] <- c_
output_
I was told yesterday that I can specify a right-hand merge in the merge above. However, to do so I need to specify by = "ID" in the merge, and I am unsure what the x and y values are when merging multiple sets of data.
I am also under the impression that lapply() can do the same thing instead of a loop, but I am unsure how I might achieve that in this case. Thanks~
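As the error message indicates, the assignment fails on the negative column index in j. Instead, we can use setcolorder, which reorders the columns of each data.table by reference: keep the first name and reverse the rest.
for(i in seq_along(merged_list)){
  setcolorder(merged_list[[i]],
              c(names(merged_list[[i]])[1], rev(names(merged_list[[i]])[-1])))
}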
all.equal(merged_list, output_, check.attributes = FALSE)
#[1] TRUE
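Since the question also asks about lapply(), the same reordering can be sketched without a loop. To keep the example runnable without extra packages it uses plain data.frames with made-up values in place of the merged data.tables; the helper name reorder_keep_first is hypothetical.

```r
# Made-up stand-ins for the merged tables, with columns in reversed order.
ids <- 1:3
merged_list <- list(
  data.frame(ID = ids, code_d = 10:12, code_c = c("F", "E", "D"), code_a = c("a", "b", "c")),
  data.frame(ID = ids, code_d = 10:12, code_c = c("F", "E", "D")),
  data.frame(ID = ids, code_d = 10:12)
)

# Hypothetical helper: keep column 1 in place and reverse the remaining columns.
reorder_keep_first <- function(d) d[c(1, rev(seq_along(d)[-1]))]

# lapply returns a new list; the original data frames are not modified.
reordered <- lapply(merged_list, reorder_keep_first)
```

Unlike setcolorder, which changes each data.table in place, this builds a new list of reordered copies.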

Automatically add any variables that exist in one data.frame but missing in other data.frames in R

Suppose I have a reference data.frame called a. How can I automatically add any variables that exist in a but are missing in other data.frames b and d?
NOTE: My goal is to make a function out of this such that any number of data.frames, and any number of variables can be completed based on a single reference data.frame.
a <- data.frame(x = 2:3, y = 4:5, z = c(T, F)) ## reference data.frame
b <- data.frame(x = 6:7) ## Add y and z here
d <- data.frame(x = 7:8) ## Add y and z here
Supposing all the data.frames involved share the same number of rows, you can simply:
toadd<-setdiff(colnames(a),colnames(b))
b[toadd]<-a[toadd]
Wrapping the above in a function:
f <- function(refdf, ...) {
  listdf <- list(...)
  res <- lapply(listdf, function(x) {
    toadd <- setdiff(names(refdf), names(x))
    x[toadd] <- refdf[toadd]
    x
  })
  c(list(refdf), res)
}
Then try for instance:
f(a,b)
f(a,b,d)
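The approach above copies the reference values and relies on equal row counts. When the row counts differ, one option is to add the missing columns filled with NA of the matching type. The sketch below is an assumption-laden variant, not the answer's method; the name complete_cols and the three-row b are made up for illustration.

```r
# Hypothetical helper: add columns present in refdf but missing in each
# data.frame, filled with NA of the same type as the reference column.
complete_cols <- function(refdf, ...) {
  lapply(list(...), function(x) {
    for (nm in setdiff(names(refdf), names(x))) {
      # Indexing with NA_integer_ yields a single NA of the column's type.
      x[[nm]] <- rep(refdf[[nm]][NA_integer_], nrow(x))
    }
    x
  })
}

a <- data.frame(x = 2:3, y = 4:5, z = c(TRUE, FALSE))  # reference data.frame
b <- data.frame(x = 6:8)                               # three rows, unlike a
out <- complete_cols(a, b)
```

Here out[[1]] has columns x, y, z, with y an integer NA column and z a logical NA column.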
Alternatively, right-join the reference data.frame to each provided data.frame, which appends the missing variables (filled with NA where no rows match):
add_empty_vecs <- function(refdf, ...){
  # Store the names of the other data.frames: df_names => character vector
  df_names <- vapply(as.list(substitute(list(...)))[-1L], deparse, character(1))
  # Right join the reference data.frame to each provided data.frame
  # and return the results as a named list
  setNames(lapply(list(...), function(y){
    merge(refdf, y, by = intersect(names(refdf), names(y)), all.y = TRUE)
  }), df_names)
}
# Apply function to df b only:
add_empty_vecs(a, b)
# Apply function to both df b & df d:
add_empty_vecs(a, b, d)
# Apply function to all b, d, e:
add_empty_vecs(a, b, d, e)
Data:
a <- data.frame(x = 2:3, y = 4:5, z = c(T, F)) ## reference data.frame
b <- data.frame(x = 6:7) ## Add y and z here
d <- data.frame(x = 7:8) ## Add y and z here
e <- data.frame(x = 9:10)

Split a tensor in torch

Given an input tensor of size n x 2A x B x C, how can I split it into two tensors, each of size n x A x B x C? Essentially, n is the batch size.
In PyTorch you can use torch.split, which here returns a tuple of two tensors:
torch.split(input_tensor, split_size_or_sections=A, dim=1)
In Lua Torch, I think you could do something like the following, selecting the index range of each half along the second dimension:
tensor_a = torch.Tensor(n, 2*A, B, C)
-- Initialize tensor_a with the data
tensor_b = tensor_a[{{}, {1, A}, {}, {}}]
tensor_c = tensor_a[{{}, {A+1, 2*A}, {}, {}}]

Priority/Decision Based Choice of Row

I have a data.frame that has a number of duplicate rows, akin to something like this:
con <- textConnection(Lines <- "
First, Last, Address, Address 2, Email, Custom1, Custom2, Custom3
A, B, C, D, F#G.com,1,2,3
A, B, C, D, F#G.com,1,2,2
A, B, C, D, F#G.com,1,2,1
")
x <- read.csv(con)
close(con)
Now, I de-duplicate in the following manner:
x <- x[!duplicated(x[,c("Email")]),]
Could you recommend a method for prioritizing the rows that contain Custom3=1 when doing so? Or is there a better mechanism for de-duplication?
duplicated() keeps the first occurrence of each value, so try sorting before finding duplicates so that the preferred row comes first:
x <- x[order(x[,c("Custom3")]),]
x <- x[!duplicated(x[,c("Email")]),]
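Sorting by Custom3 works when 1 is the smallest value. A slightly more general sketch, assuming made-up data and the hypothetical names x_sorted and dedup, sorts on whether Custom3 equals 1 so that those rows come first regardless of the other values.

```r
# Made-up data: three duplicates of one address plus one unique address.
x <- data.frame(Email   = c("F#G.com", "F#G.com", "F#G.com", "H#I.com"),
                Custom3 = c(3, 2, 1, 5))

# FALSE sorts before TRUE, so rows with Custom3 == 1 come first.
x_sorted <- x[order(x$Custom3 != 1), ]

# duplicated() flags every occurrence after the first, so the first row
# per Email (the prioritized one) is kept.
dedup <- x_sorted[!duplicated(x_sorted$Email), ]
```

An address with no Custom3 == 1 row still keeps one row, namely the first one in the sorted order.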
