Need to fix a problem with seq function in R

I have to create a function with 2 arguments (a, b) that will return the first b multiples of 3 that are less than or equal to a. Here is what I am doing:
f <- function(a, b){
v = seq(from = 0, to= a, by = 3, length.out = b)
return(v)
}
It says that seq() has too many arguments, and I understand why. If I remove the 'from', there would be some cases where the vector wouldn't start with zero. How could I fix the problem?
Thank you

seq() accepts at most three of from=, to=, by= and length.out=, so you cannot supply all four. You can get the same effect with head(seq(...)):
seq(from = 0, to = 20, by = 3, length.out = 4)
# Error in seq.default(from = 0, to = 20, by = 3, length.out = 4) :
# too many arguments
seq(from = 0, to = 20, by = 3)
# [1] 0 3 6 9 12 15 18
head(seq(from = 0, to = 20, by = 3), n = 4)
# [1] 0 3 6 9
which for your function should be:
f <- function(a, b){
head(seq(from = 0, to = a, by = 3), n = b)
}
f(20, 4)
# [1] 0 3 6 9

How about the following?
f <- function(a, b) {
if (3 * b > a) return(seq(0L, a, by = 3L))
return(seq(0, by = 3L, length.out = b))
}
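Both versions above assume that a is non-negative: seq(from = 0, to = a, by = 3) stops with a "wrong sign in 'by' argument" error when a < 0. If that case matters, here is a minimal sketch that computes the length directly instead of calling seq(). The name first_multiples and the step argument are my own, not part of the question, and returning an empty vector for a < 0 or b <= 0 is an assumption about what you want.

```r
# Hypothetical helper: the first `b` multiples of `step` (starting at 0)
# that are less than or equal to `a`.
first_multiples <- function(a, b, step = 3) {
  # Assumption: no valid multiples when a is negative or b is not positive.
  if (a < 0 || b <= 0) return(numeric(0))
  # a %/% step + 1 multiples of `step` lie in [0, a]; keep at most b of them.
  n <- min(b, a %/% step + 1)
  step * (seq_len(n) - 1)
}

first_multiples(20, 4)   # 0 3 6 9
first_multiples(5, 4)    # 0 3
first_multiples(-1, 4)   # numeric(0)
```

For non-negative a this should agree with the head(seq(...)) answer; it only differs in how it treats the edge cases.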

Search a vector in a specific element of a nested list in R

To explain what I want to do exactly, I will use the following example:
a = list(x = 5, y = c(11, 12, 13))
b = list(x = 4.7, y = c(112, 5, 2))
c = list(x = 77, y = c(5, 1, 1))
d = list(x = 5, y = c(22, 11, 43))
test_list = list(a, b, c, d)
I have a nested list: test_list. I would like to search for the value 5 only in element x of each entry in test_list, and return the indices of the matching entries, e.g., here c(1, 4).
Thanks a lot.
I would try with lapply like here:
a = list(x = 5, y = c(11, 12, 13))
b = list(x = 4.7, y = c(112, 5, 2))
c = list(x = 77, y = c(5, 1, 1))
d = list(x = 5, y = c(22, 11, 43))
test_list = list(a, b, c, d)
which(unlist(lapply(test_list, function(x) {
x$x == 5
})))
This extracts x from each element, compares it with 5, unlists the results, and then uses which() to return the positions that are TRUE.
Try:
which(vapply(test_list, function(x) x[["x"]] == 5, logical(1)))
Similarly, using purrr:
library(purrr)
which(map_lgl(test_list, ~ pluck(., "x") == 5))
[1] 1 4
As 'x' is of length 1 in each list element, it may be better to do the comparison at once after extracting the element
which(sapply(test_list, `[[`, 'x')==5)
#[1] 1 4
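The answers above all rely on every element having an x of length 1. If that is not guaranteed, a small variation with %in% may be more forgiving. This is only a sketch: the last two list entries are hypothetical additions of mine to show the edge cases, they are not in the question's data.

```r
test_list <- list(
  list(x = 5,   y = c(11, 12, 13)),
  list(x = 4.7, y = c(112, 5, 2)),
  list(x = 77,  y = c(5, 1, 1)),
  list(x = 5,   y = c(22, 11, 43)),
  list(y = 5),               # hypothetical: no x at all
  list(x = c(1, 5), y = 0)   # hypothetical: x longer than 1
)

# 5 %in% NULL is FALSE and 5 %in% c(1, 5) is TRUE, so each element
# always yields exactly one logical value, as vapply() requires.
hits <- which(vapply(test_list, function(el) 5 %in% el[["x"]], logical(1)))
hits
# [1] 1 4 6
```

With x[["x"]] == 5 the fifth element would give logical(0) and vapply() would stop with an error, so which form you want depends on whether a missing x should be an error or simply a non-match.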

Assign multiple results from function when grouping

I have this code which does exactly what I want but I have to call my function three times for every group which seems very inefficient.
library(data.table)
myRegr = function(x, y) {
regr = lm.fit(cbind(1, x), y)
coefs = regr$coef
k = coefs[[2]]
m = coefs[[1]]
r2 = 1 - var(regr$residuals) / var(y)
return (c(k = k, m = m, r2 = r2))
}
dt = data.table(a = c(0, 0, 0, 1, 1, 1),
x = c(12, 21, 15, 34, 32, 31),
y = c(3, 1, 6, 4, 2, 8))
result = dt[,list(minX = min(x),
minY = min(y),
k = myRegr(x, y)["k"],
m = myRegr(x, y)["m"],
r2 = myRegr(x, y)["r2"]
),
by = list(a)]
print(result)
Outputs:
a minX minY k m r2
0 12 1 -0.3095238 8.285714 0.3176692
1 31 2 -1.0000000 37.000000 0.2500000
Any idea how I can rewrite this to only call the function once?
UPDATE:
My example didn't cover the complete problem as I have a fourth column which I'm selecting on, here is a better example:
library(data.table)
myRegr = function(x, y) {
regr = lm.fit(cbind(1, x), y)
coefs = regr$coef
k = coefs[[2]]
m = coefs[[1]]
r2 = 1 - var(regr$residuals) / var(y)
return (c(k = k, m = m, r2 = r2))
}
df = data.frame(a = c(0, 0, 0, 1, 1, 1),
x = c(12, 21, 15, 34, 32, 31),
y = c(3, 1, 6, 4, 2, 8),
time = as.POSIXct(c("2019-01-01 08:12:00", "2019-01-01 08:13:00", "2019-01-01 08:14:00", "2019-01-01 08:12:00", "2019-01-01 08:13:00", "2019-01-01 08:14:00")))
dt = data.table(df)
result = dt[, list(firstX = x[time == min(time)],
firstY = y[time == min(time)],
k = myRegr(x, y)["k"],
m = myRegr(x, y)["m"],
r2 = myRegr(x, y)["r2"]
),
by = a]
print(result)
Outputs:
a firstX firstY k m r2
0 12 3 -0.3095238 8.285714 0.3176692
1 34 4 -1.0000000 37.000000 0.2500000
Tried wrapping it all in a function but it actually slowed things down:
library(data.table)
myRegrList = function(group) {
firstX = group[,x[time == min(time)]]
firstY = group[,y[time == min(time)]]
regr = lm.fit(cbind(1, group$x), group$y)
coefs = regr$coef
k = coefs[[2]]
m = coefs[[1]]
r2 = 1 - var(regr$residuals) / var(group$y)
return (list(firstX = firstX, firstY = firstY, k = k, m = m, r2 = r2))
}
result = dt[, myRegrList(.SD), by = a]
print(result)
If you make your function return a list you only need to call
dt[, myRegr(x, y), by = a]
# a k m r2
#1: 0 -0.3095238 8.285714 0.3176692
#2: 1 -1.0000000 37.000000 0.2500000
With
myRegr = function(x, y) {
regr = lm.fit(cbind(1, x), y)
coefs = regr$coef
k = coefs[[2]]
m = coefs[[1]]
r2 = 1 - var(regr$residuals) / var(y)
return (list(# minX = min(x),
# minY = min(y),
k = k,
m = m,
r2 = r2))
}
update
You might subset for x and y values and then join with the result of your function
result <- dt[dt[, .I[which.min(time)], by = a]$V1, .(a, x, y)]
result <- result[dt[, myRegr(x, y), by = a], on = .(a)]
result
# a x y k m r2
#1: 0 12 3 -0.3095238 8.285714 0.3176692
#2: 1 34 4 -1.0000000 37.000000 0.2500000
You can modify your function to return a vector and dcast the final result:
library(data.table)
myRegr = function(x, y) {
regr <- lm.fit(cbind(1, x), y)
c(
regr$coef[[1]],
regr$coef[[2]],
1 - var(regr$residuals) / var(y)
)
}
result <- df[, .(minX = min(x), minY = min(y), myRegr(x, y), c("m", "k", "r2")), a]
dcast(result, a + minX + minY ~ V4, value.var = "V3")
This solution is not perfect, as I have to create V4 (by adding the c("m", "k", "r2") vector). There should be a better way to do this (perhaps without dcast at all). Maybe more experienced data.table users could advise on this?
Data:
df <- data.table(
a = c(0, 0, 0, 1, 1, 1),
x = c(12, 21, 15, 34, 32, 31),
y = c(3, 1, 6, 4, 2, 8)
)
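One possible way to avoid both the helper column and the dcast step is to let j return a single list, built by concatenating the per-group summaries with the function result converted via as.list(). This is a sketch using the data and the vector-returning myRegr from the question; I have only spelled out regr$coefficients in full instead of relying on partial matching of $coef.

```r
library(data.table)

# Same idea as the question's helper: returns a named numeric vector.
myRegr <- function(x, y) {
  regr <- lm.fit(cbind(1, x), y)
  coefs <- regr$coefficients
  c(k = coefs[[2]],
    m = coefs[[1]],
    r2 = 1 - var(regr$residuals) / var(y))
}

dt <- data.table(a = c(0, 0, 0, 1, 1, 1),
                 x = c(12, 21, 15, 34, 32, 31),
                 y = c(3, 1, 6, 4, 2, 8))

# c() joins the two lists, so j yields one list of five named columns
# and myRegr() is evaluated only once per group.
result <- dt[, c(list(minX = min(x), minY = min(y)),
                 as.list(myRegr(x, y))),
             by = a]
print(result)
```

The same pattern should carry over to the updated example by swapping the min() entries for firstX = x[which.min(time)] and firstY = y[which.min(time)], though I have not timed it against the join approach.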

Function not finding an argument when vectorized in R?

I'm wondering why my vectorized function below works fine when I use cii(peta = c(.3, .4), N = 120, df1 = 3, df2 = 116), BUT when I use cii(F.value = c(30, 40), N = 120, df1 = 3, df2 = 116) the function complains that "peta" is missing.
P.S. I have made it clear in my function that when F.value is NA, the function should use peta, and otherwise use F.value. So why am I getting the "peta" is missing error?
cii <- function(peta, F.value = NA, N, df1, df2, conf.level = .9){ # Upper-Level FUNCTION
ci <- Vectorize(function(peta, F.value, N, df1, df2, conf.level){ # Lower-Level FUNCTION
options(warn = -1)
q = ifelse(is.na(F.value), (-peta * df2) / ((peta * df1) - df1), F.value)
alpha = (1 - conf.level)/2
f <- function (ncp, alpha, q, df1, df2) {
abs(suppressWarnings(pf(q = q, df1 = df1, df2 = df2, ncp, lower.tail = FALSE)) - alpha)}
I <- sapply(c(alpha, 1-alpha), function(x) optimize(f, interval = c(-30, 30), alpha = x, q = q, df1 = df1, df2 = df2)[[1]])
round(data.frame(lower = I[1], upper = I[2], conf.level = conf.level, F.value = q), 6)
})
data.frame(t(ci(peta = peta, F.value = F.value, N = N, df1 = df1, df2 = df2, conf.level = conf.level)))
}
### TWO EXAMPLES OF USE: ###
cii(F.value = c(30, 40), N = 120, df1 = 3, df2 = 116) # Gives Error!!!
cii(peta = c(.3, .4), N = 120, df1 = 3, df2 = 116) # Works Fine !!!
Your problem is that cii() expects there to be an input for the peta argument, and you are not providing one. You have a few options:
1. Call cii() with peta = NA
cii(peta = NA, F.value = c(30, 40), N = 120, df1 = 3, df2 = 116)
# Output:
lower upper conf.level F.value
1 29.99996 29.99996 0.9 30
2 29.99996 29.99996 0.9 40
2. Re-write cii() to assign a default value to peta
e.g.
cii <- function(peta = c(.3, .4), F.value = NA, N, df1, df2, conf.level = .9){ ... }
cii(F.value = c(30, 40), N = 120, df1 = 3, df2 = 116)
# Output
lower upper conf.level F.value
1 29.99996 29.99996 0.9 30
2 29.99996 29.99996 0.9 40
As for this case:
cii(peta = c(.3, .4), N = 120, df1 = 3, df2 = 116)
You don't get an error because in your function you've already assigned a default value of NA to F.value.
BTW, although Marcus's solution works, your error only occurs because you are using Vectorize.
You can call a function with a single argument if the second argument is not required for evaluation
myfun <- function(a, b) {
ifelse(a==1, print(a), print(b))
if (a==1) { print(a) } else { print(b) }
}
myfun(1)
# [1] 1
# [1] 1
myfun(2)
# Error in print(b) : argument "b" is missing, with no default
This is also true for a nested(?) (internal) function
myfun <- function(a, b) {
internalfun <- function(a, b) { ifelse(a==3, print(a), print(b)) }
if (a == 1) { print(a) } else { print(b) }
internalfun(a = 3)
}
myfun(1)
# [1] 1
# [1] 3
# [1] 3
# `3` is printed twice: once by print(a) inside ifelse(), and once more because the value returned by ifelse() is auto-printed at top level
This still works when you explicitly name the arguments when calling the nested function
myfun <- function(a, b) {
internalfun <- function(a, b) { ifelse(a==3, print(a), print(b)) }
if (a == 1) { print(a) } else { print(b) }
internalfun(a = 3, b = b)
}
myfun(1)
# [1] 1
# [1] 3
# [1] 3
But it fails when using Vectorize
myfun <- function(a, b) {
internalfun <- Vectorize(function(a, b) { ifelse(a==3, print(a), print(b)) })
if (a == 1) { print(a) } else { print(b) }
internalfun(a = 3, b = b)
}
myfun(1)
# [1] 1
# Error in FUN(X[[i]], ...) : argument "b" is missing, with no default
My guess is that when vectorizing your function, Vectorize has to 'evaluate' your arguments, which is leading to the error
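To make that guess concrete: as far as I can tell, the wrapper returned by Vectorize evaluates every argument it was called with before handing them to mapply, so R's usual lazy evaluation no longer protects an unused missing argument. Below is a minimal sketch of the contrast and of one possible workaround (NA defaults in the outer function). The names lazy, outer_strict and outer_safe are made up for illustration.

```r
# Plain closure: `b` is a promise that is never forced when a == 1.
lazy <- function(a, b) if (a == 1) "b never touched" else b
lazy(1)
# [1] "b never touched"

# Vectorized inner function: all arguments are evaluated up front,
# so a missing `b` in the outer call fails even though it is unused.
outer_strict <- function(a, b) {
  inner <- Vectorize(function(a, b) if (a == 1) "b never touched" else b)
  inner(a = a, b = b)
}
msg <- tryCatch(outer_strict(1), error = function(e) conditionMessage(e))
msg   # the "missing, with no default" error message

# Possible workaround: give both outer arguments an NA default, so there
# is always something to evaluate, and test for NA inside.
outer_safe <- function(a = NA, b = NA) {
  inner <- Vectorize(function(a, b) if (is.na(b)) a * 2 else b)
  inner(a = a, b = b)
}
outer_safe(a = c(1, 2))
# [1] 2 4
outer_safe(b = c(30, 40))
# [1] 30 40
```

Applied to cii(), that would mean declaring peta = NA alongside F.value = NA, which is essentially option 1 from the other answer moved into the function signature.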

Subset list, keep names

I have a list out like this:
u <- list(a = list(b = 1, c = 2),
x = list(k = list(ka = 1, kb = 3),
l = list(la = 1, lb = 4)))
v <- list(a = list(b = 1, c = 2),
x = list(m = list(ma = 5, mb = 8),
n = list(na = 5, nb = 8)))
w <- list(a = list(b = 1, c = 2),
x = list(o = list(oa = 4, ob = 1),
p = list(pa = 8, pb = 0)))
out <- list(u, v, w)
I would like to create another list whose elements are k, l, m, n, o, p, with the names of the list elements preserved. I found a solution, but it looks sub-optimal:
x <- lapply(out, function(y) y[['x']])
o <- list()
for (a in x) {
o <- c(o, a)
}
> str(o, max.level = 1)
List of 6
$ k:List of 2
$ l:List of 2
$ m:List of 2
$ n:List of 2
$ o:List of 2
$ p:List of 2
Is there a better way?
The loop could be replaced with unlist:
res <- unlist( lapply(out,"[[","x"), recursive=FALSE)
identical(res,o)
# [1] TRUE
My lapply is the same as in the OP; it's just a shortcut.
As @akrun suggested, you could more closely mirror the OP's loop with
do.call("c", lapply(out, '[[', 'x'))
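One caveat worth knowing: both one-liners keep the names k, l, m, ... only because out itself is unnamed. If the outer list had names, they would be pasted in front of the inner ones. A small sketch with a cut-down, hypothetical named version of the data (the outer names u and v are my addition):

```r
# Hypothetical variant of `out` whose top-level elements are named.
out_named <- list(
  u = list(x = list(k = list(ka = 1, kb = 3), l = list(la = 1, lb = 4))),
  v = list(x = list(m = list(ma = 5, mb = 8), n = list(na = 5, nb = 8)))
)

xs <- lapply(out_named, `[[`, "x")

# The outer names get prefixed onto the inner ones ...
prefixed <- names(unlist(xs, recursive = FALSE))
prefixed
# [1] "u.k" "u.l" "v.m" "v.n"

# ... so drop them first if only the inner names should survive.
res <- unlist(unname(xs), recursive = FALSE)
names(res)
# [1] "k" "l" "m" "n"
```

The same unname() step works in front of do.call("c", ...).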

Create a list from matrix in R

I have two lists v and w, and I would like to turn the matrix M back into a list z with the same structure. How can I do this in R?
v = list(a = c(1, 5), b = 2, c= 3)
w = list( a= c(2, 10), b = 4, c = 6)
M = as.matrix(unlist( v) * unlist(w))
> M
[,1]
a1 2
a2 50
b 8
c 18
z = list(a = c(2, 50), b = 8, c = 18)
Do it like this:
mapply(`*`, v, w)
Maybe you want z <- lapply(seq_along(v), function(i) v[[i]] * w[[i]])? Add names(z) <- names(v) to keep the names.
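Two further options, offered as a sketch rather than the definitive answer. Map() is mapply() with SIMPLIFY = FALSE, so it always returns a list (mapply() would collapse to a plain vector if every element happened to have length 1) and it takes the names from v. If you really need to start from the matrix M, relist() from the utils package can reshape the flat values using v as a template; I strip the row names a1, a2, ... first with unname().

```r
v <- list(a = c(1, 5), b = 2, c = 3)
w <- list(a = c(2, 10), b = 4, c = 6)

# Element-wise product, always a list, names taken from v.
z <- Map(`*`, v, w)
str(z)

# Starting from the matrix instead: use v as the skeleton for the values.
M <- as.matrix(unlist(v) * unlist(w))
z_from_M <- relist(unname(M[, 1]), skeleton = v)
str(z_from_M)
```

Both should give the list(a = c(2, 50), b = 8, c = 18) shown in the question.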
