I'm trying to subset a number of rows from each matrix in a list using R.
I have two lists: one holds matrices with n rows and p columns, and the second holds the number of rows that I need to keep from each matrix.
mat <- list(a = matrix(rnorm(8*4),8), b = matrix(rnorm(15*4),15), c = matrix(rnorm(7*4),7))
rw <- list(a = 6, b = 7, c = 4)
Both lists have the same names. In the above example, I would like to retain the first 6 rows for element a, the first 7 rows for b, and the first 4 rows for c.
How would you do that in R?
One solution with Map:
Map(function(x, y) x[1:y, ], mat, rw)
# $a
# [,1] [,2] [,3] [,4]
# [1,] 1.3331549 -0.6985623 -1.1842788 -0.1496880
# [2,] 0.2096395 -0.2901906 0.4210395 0.9116542
# [3,] 0.1763317 1.3858205 -1.1567526 -1.1794618
# [4,] 1.3596395 0.5815012 -0.3681799 -0.6569447
# [5,] 0.2251352 0.2331387 -1.2509844 -1.1346729
# [6,] 0.6796729 1.1274772 0.3992489 0.2305927
#
# $b
# [,1] [,2] [,3] [,4]
# [1,] 0.30700748 -1.2173855 -0.3377885 -0.6748974
# [2,] 1.09506443 -0.6142685 -1.1301122 -0.7792081
# [3,] -0.61049306 -1.3414474 0.9771373 1.0191636
# [4,] 0.66687294 -0.5269721 0.9971987 -0.6514121
# [5,] 0.54623236 0.9020964 0.3252700 -0.3925129
# [6,] -0.04848903 -0.5204047 0.3344675 -0.3232105
# [7,] -0.56502719 -0.3743275 2.1760364 -0.2941956
#
# $c
# [,1] [,2] [,3] [,4]
# [1,] -0.3225609 -0.40126955 -1.787255 -1.5005721
# [2,] 0.3474430 -1.16657015 1.106033 0.3114282
# [3,] 0.4099467 -0.04353555 0.838330 0.3282246
# [4,] -1.4648740 0.51279791 0.198768 -0.3394502
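A slightly more defensive variant, sketched under two assumptions that go beyond the question: the lists might not be in the same order, and a requested count might be 0 or 1. The helper name keep_rows is hypothetical and the seed is only there to make the example reproducible.

```r
set.seed(1)
mat <- list(a = matrix(rnorm(8 * 4), 8),
            b = matrix(rnorm(15 * 4), 15),
            c = matrix(rnorm(7 * 4), 7))
rw <- list(c = 4, a = 6, b = 7)  # deliberately in a different order

# keep_rows is a hypothetical helper, not part of the original answer.
# seq_len(n) is empty when n is 0 (1:0 would give c(1, 0)), and
# drop = FALSE keeps the result a matrix even when only one row is kept.
keep_rows <- function(x, n) x[seq_len(min(n, nrow(x))), , drop = FALSE]

# Index rw by names(mat) so elements are paired by name, not by position.
res <- Map(keep_rows, mat, rw[names(mat)])
sapply(res, nrow)
```

For the data in the question, where both lists are already in the same order and every count is at least 2, this should give the same result as x[1:y, ].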
I have a list of matrices and a list of vectors, and I want to divide each column of each matrix by the corresponding element of the matching vector.
For example, given
set.seed(230)
data <- list(cbind(c(NA, rnorm(6)),c(rnorm(6),NA)), cbind(runif(7), runif(7)))
divisors <- list(c(0.5,2), c(3,4))
I'm looking for a vectorized function that produces output that looks the same as
for (i in seq_along(data)) {
  for (j in seq_len(ncol(data[[i]]))) {
    data[[i]][, j] <- data[[i]][, j] / divisors[[i]][j]
  }
}
i.e.
[[1]]
[,1] [,2]
[1,] NA 0.28265752
[2,] -0.46967014 -0.07132588
[3,] 0.20253439 -0.37432527
[4,] 0.65736410 0.06630705
[5,] 0.72349294 0.67202129
[6,] 0.88532648 -0.80892508
[7,] 0.08162027 NA
[[2]]
[,1] [,2]
[1,] 0.26597435 0.18120979
[2,] 0.31213250 0.16493883
[3,] 0.19250804 0.14104145
[4,] 0.21196882 0.10172964
[5,] 0.10389773 0.04979742
[6,] 0.02754329 0.15064043
[7,] 0.25771766 0.23042586
The closest I have been able to come is
Map(`/`, data, divisors)
But that divides rows (rather than columns) of the matrix by the vector. Any help appreciated.
Transpose your matrices before and after:
lapply(Map(`/`, lapply(data, t), divisors), t)
# [[1]]
# [,1] [,2]
# [1,] NA 0.28265752
# [2,] -0.46967014 -0.07132588
# [3,] 0.20253439 -0.37432527
# [4,] 0.65736410 0.06630705
# [5,] 0.72349294 0.67202129
# [6,] 0.88532648 -0.80892508
# [7,] 0.08162027 NA
#
# [[2]]
# [,1] [,2]
# [1,] 0.26597435 0.18120979
# [2,] 0.31213250 0.16493883
# [3,] 0.19250804 0.14104145
# [4,] 0.21196882 0.10172964
# [5,] 0.10389773 0.04979742
# [6,] 0.02754329 0.15064043
# [7,] 0.25771766 0.23042586
I prefer the transpose approach above, but another option is to expand your divisor vectors into matrices of the same dimensions as the matrices in data:
div_mat <- Map(matrix, data = divisors, nrow = sapply(data, nrow), ncol = sapply(data, ncol), byrow = TRUE)
Map("/", data, div_mat)
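A third possibility, offered as a sketch rather than as part of either answer above, is base R's sweep(), which applies a vector along a chosen margin of a matrix. The names res_sweep and res_t are illustrative only; the data are rebuilt as in the question so the block runs on its own.

```r
set.seed(230)
data <- list(cbind(c(NA, rnorm(6)), c(rnorm(6), NA)),
             cbind(runif(7), runif(7)))
divisors <- list(c(0.5, 2), c(3, 4))

# sweep(m, 2, d, "/") divides column j of m by d[j]
res_sweep <- Map(function(m, d) sweep(m, 2, d, "/"), data, divisors)

# Cross-check against the transpose approach
res_t <- lapply(Map(`/`, lapply(data, t), divisors), t)
all.equal(res_sweep, res_t)
```

sweep() states the margin explicitly, which may read more clearly than a double transpose when the matrices are large or the code is revisited later.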
Let's say I have this list of matrices:
c1 <- matrix(rnorm(10),5,2)
c2 <- c1+(rnorm(10))
c3 <- c1+(rnorm(10))
c4 <- c1+(rnorm(10))
c5 <- c1+(rnorm(10))
c6 <- c1+(rnorm(10))
clist <- list(c1,c2,c3,c4,c5,c6)
[[1]]
[,1] [,2]
[1,] -0.1591251 0.36887661
[2,] 0.4200732 -1.21884880
[3,] -0.6763903 -0.02593779
[4,] 0.1658612 -0.65441390
[5,] -1.4652644 -0.10981210
[[2]]
[,1] [,2]
[1,] -1.5475582 1.33232706
[2,] 0.9781123 -0.70260202
[3,] -1.1577471 2.04805617
[4,] 0.4535016 -1.08563438
[5,] -3.0072380 0.06337565
[[3]]
[,1] [,2]
[1,] 0.5475332 -0.7793278
[2,] -1.8806731 -1.1158255
[3,] -0.4837955 -0.8165737
[4,] -1.4951387 -0.2655842
[5,] -0.1487497 -0.4243752
[[4]]
[,1] [,2]
[1,] -1.270525331 0.5796936
[2,] 1.309900315 -2.4646281
[3,] -2.313890536 1.5281795
[4,] 0.003287924 -2.3560008
[5,] -1.903412482 -2.6763855
[[5]]
[,1] [,2]
[1,] -0.4553650 0.06665067
[2,] -0.4382334 -0.91694728
[3,] -1.8101902 0.29204456
[4,] 0.6602221 -0.45068171
[5,] -1.3796827 0.51264234
[[6]]
[,1] [,2]
[1,] -1.0130324 1.4233890
[2,] 0.9672156 -0.9425755
[3,] -2.5090911 -0.5489537
[4,] 0.7705731 1.0351301
[5,] -0.0414573 -1.8325651
I want to merge c1+c2, c3+c4, c5+c6 and keep them in a list. I could do this manually with the following code:
cm1 <- do.call("rbind", clist[1:2])
cm2 <- do.call("rbind", clist[3:4])
cm3 <- do.call("rbind", clist[5:6])
cmlist <- list(cm1, cm2, cm3)
But because my actual data will be much larger, this method would be very time consuming. Is there a much quicker way to do it?
Try this:
cmlist1 <- lapply(seq(1, length(clist), by = 2),
                  function(x) do.call("rbind", clist[x:(x + 1)]))
How about the following?
mapply(rbind,
       clist[seq(1, length(clist), by = 2)],
       clist[seq(2, length(clist), by = 2)],
       SIMPLIFY = FALSE)
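We create an integer grouping variable g (equal to c(1, 1, 2, 2, ...)), split the list by it, and rbind the elements of each component together. Note that gl() returns a factor with n levels; converting it with as.integer() avoids empty groups for the unused levels (in R >= 4.1.0, c() applied to a factor returns a factor rather than integer codes):
n <- length(clist)
g <- as.integer(gl(n, 2, n))
lapply(split(clist, g), "do.call", what = "rbind")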
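As a quick sanity check, the grouping idea can be compared against the manual version from the question. This is a minimal sketch: the seed and the way clist is built are assumptions made only to keep the block self-contained, and ceiling(seq_along(clist) / 2) is used here as one more way to produce the ids 1, 1, 2, 2, 3, 3.

```r
set.seed(42)
c1 <- matrix(rnorm(10), 5, 2)
clist <- lapply(1:6, function(i) c1 + rnorm(10))  # six 5 x 2 matrices

# Manual version from the question, used here as the reference result
manual <- list(do.call(rbind, clist[1:2]),
               do.call(rbind, clist[3:4]),
               do.call(rbind, clist[5:6]))

# Integer group ids: 1, 1, 2, 2, 3, 3
g <- ceiling(seq_along(clist) / 2)
by_split <- lapply(split(clist, g), function(grp) do.call(rbind, grp))

# split() names its output "1", "2", "3"; drop the names before comparing
identical(unname(by_split), manual)
```

Changing the divisor in ceiling(seq_along(clist) / 2) from 2 to k would group the matrices k at a time, which the pairwise mapply approach does not generalise to as directly.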
I'm having a bit of trouble understanding how to use calls in R. I want to take an object created by a function and use it as an argument to another function, modifying some of the arguments to the original function along the way. I've looked at Hadley Wickham's page on expressions, but it doesn't quite seem to tell me how to do what I want to do.
Here is a partially-working example of the sort of thing that I want to do. First, fake data:
library(MASS)
N <- 1000
p <- 10
A <- matrix(rnorm(p^2), p)
X <- mvrnorm(N, rep(0, p), t(A) %*% A)
B <- rnorm(p)
y <- X %*% B + rnorm(N)
Next, a function to do ridge regression. It is a function of X, y, and the ridge penalty L. It returns the coefs and the call:
pols <- function(X, y, L){
cl <- match.call()
beta <- solve(t(X) %*% X + diag(rep(L, p))) %*% t(X) %*% y
return(list(beta = beta, cl = cl))
}
> pols(X, y, 1)
$beta
[,1]
[1,] -0.02622669
[2,] -1.96523722
[3,] 0.36375563
[4,] -1.14192468
[5,] -0.14436051
[6,] -0.29700918
[7,] -0.81543748
[8,] -0.17699934
[9,] -0.01342649
[10,] 0.58862577
$cl
pols(X = X, y = y, L = 1)
Now, how do I use the call to drive the following function? It takes a pols object and a vector of different values of L, and uses them to re-call pols:
Lvec <- 1:10
tryLs <- function(pols, Lvec){
  for (i in Lvec){
    # 1. Extract the args from the call in pols
    # 2. Modify the argument `L` based on Lvec
    # 3. Run `pols` with old arguments, but `L` modified according to `i`
  }
}
How do I make this last function work?
To clarify, the workflow I'm envisioning is something like:
obj <- pols(X, y, 0)
Lvec <- 1:10
output <- tryLs(obj, Lvec)
I'm going to make a few guesses/assumptions here.
(1) When you say "a pols object", you mean an object returned by the pols function. I've modified pols() below so that it returns an object of class "pols". This isn't strictly necessary, but it might be useful in the future if you want to do fancier things (e.g. implement custom print or plot methods for these objects).
Setup:
library(MASS)
N <- 1000
p <- 10
A <- matrix(rnorm(p^2), p)
X <- mvrnorm(N, rep(0, p), t(A) %*% A)
B <- rnorm(p)
y <- X %*% B + rnorm(N)
I'm also modifying pols so that the element containing the call is called call: this makes the objects automatically work with R's default update method.
pols <- function(X, y, L){
cl <- match.call()
beta <- solve(t(X) %*% X + diag(rep(L, p))) %*% t(X) %*% y
r <- list(beta = beta, call = cl)
class(r) <- "pols"
return(r)
}
In order to have a pols object we have to run pols() once and save the result:
pols1 <- pols(X,y,0)
Now here's your function. My second assumption is that you only want the $beta values returned ...
tryLs <- function(pols,Lvec) {
sapply(Lvec,
function(L) update(pols,L=L)$beta)
}
Lvec <- 1:10
tryLs(pols1,Lvec)
If you wanted to do this at a slightly more nuts-and-bolts level (rather than using update) you would do something along the lines of
pols$call$L <- new_L_value
new_result <- eval(pols$call,parent.frame())
If you look at update.default() you'll see that's more or less what it does (it is using the information from match.call(), implicitly ...)
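To make the nuts-and-bolts route concrete, here is a minimal sketch. Several things in it are assumptions rather than the original code: the small simulated data set, the use of ncol(X) instead of the global p inside pols, and the names tryLs_call and env.

```r
set.seed(1)
N <- 100
p <- 3
X <- matrix(rnorm(N * p), N)
y <- X %*% rnorm(p) + rnorm(N)

# Same idea as pols() above, but sized from X rather than from a global p
pols <- function(X, y, L) {
  cl <- match.call()
  beta <- solve(crossprod(X) + diag(L, ncol(X)), crossprod(X, y))
  structure(list(beta = beta, call = cl), class = "pols")
}

# tryLs_call is a hypothetical name for the hand-rolled version.
# env is where the stored call gets re-evaluated; it has to be an
# environment in which X, y and pols can be found.
tryLs_call <- function(obj, Lvec, env = parent.frame()) {
  force(env)
  sapply(Lvec, function(L) {
    cl <- obj$call   # copy of the stored call
    cl$L <- L        # overwrite only the L argument
    eval(cl, env)$beta
  })
}

obj <- pols(X, y, 0)
Lvec <- 1:3
betas <- tryLs_call(obj, Lvec)
dim(betas)  # one column of coefficients per value of L
```

Because the stored call refers to X and y by name, the result depends on those names still pointing at the same data when the call is re-evaluated; that caveat applies to update() as well.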
If I guess correctly as to what you need, I would use partial from the pryr package. This allows you to create a function with a number of the arguments already set:
library(pryr)
preset_pols = partial(pols, X = preset_X, y = preset_y)
preset_pols(L = 1)
Calling preset_pols will now always use the data specified in preset_X and preset_y.
In my opinion there is no need for the for loop, lapply would do just fine here:
list_of_results = lapply(Lvec, preset_pols)
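If pryr is not available, roughly the same effect can be had with a plain closure in base R. This is a sketch under stated assumptions: make_preset is a hypothetical helper, the data are simulated just for the example, and pols is restated with ncol(X) in place of the global p so the block runs on its own.

```r
set.seed(1)
X <- matrix(rnorm(50 * 3), 50)
y <- X %*% c(1, -2, 0.5) + rnorm(50)

pols <- function(X, y, L) {
  cl <- match.call()
  beta <- solve(crossprod(X) + diag(L, ncol(X)), crossprod(X, y))
  list(beta = beta, cl = cl)
}

# make_preset is a hypothetical helper: it returns a function of L only,
# with X and y captured in the closure.
make_preset <- function(X, y) {
  force(X)
  force(y)
  function(L) pols(X, y, L)
}

preset_pols <- make_preset(X, y)
Lvec <- 1:10
list_of_results <- lapply(Lvec, preset_pols)
length(list_of_results)
```

The closure fixes the data at the moment make_preset is called, so later changes to X or y in the workspace do not affect preset_pols.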
Another option is to skip the stored call and simply re-run pols for each value in Lvec, passing the function itself as the first argument:
Lvec <- 1:10
tryLs <- function(pols, Lvec){
  for (i in Lvec){
    print(paste("Result for", i))
    # Compute and print the result ($beta and $cl) once per value of L
    print(pols(X, y, i))
  }
}
tryLs(pols, Lvec)
[1] "Result for 1"
$beta
[,1]
[1,] 0.03317113
[2,] -0.37399461
[3,] -1.35395755
[4,] 0.09850883
[5,] -0.14503628
[6,] -1.97204600
[7,] -0.56459244
[8,] -1.10422047
[9,] -0.92047748
[10,] 1.76236287
$cl
pols(X = X, y = y, L = i)
[1] "Result for 2"
$beta
[,1]
[1,] -0.01014376
[2,] -0.32064189
[3,] -1.29381243
[4,] 0.10695047
[5,] -0.24791384
[6,] -1.83662948
[7,] -0.55615073
[8,] -1.12204424
[9,] -0.96717380
[10,] 1.79084625
$cl
pols(X = X, y = y, L = i)
[1] "Result for 3"
$beta
[,1]
[1,] -0.04097765
[2,] -0.28237279
[3,] -1.25064282
[4,] 0.11286963
[5,] -0.32135783
[6,] -1.74000917
[7,] -0.55025764
[8,] -1.13481390
[9,] -1.00038377
[10,] 1.81099139
$cl
pols(X = X, y = y, L = i)
[1] "Result for 4"
$beta
[,1]
[1,] -0.06401718
[2,] -0.25352501
[3,] -1.21807596
[4,] 0.11721395
[5,] -0.37641945
[6,] -1.66761823
[7,] -0.54595545
[8,] -1.14442668
[9,] -1.02517135
[10,] 1.82592968
$cl
pols(X = X, y = y, L = i)
[1] "Result for 5"
$beta
[,1]
[1,] -0.08186374
[2,] -0.23095555
[3,] -1.19257456
[4,] 0.12050945
[5,] -0.41923287
[6,] -1.61137106
[7,] -0.54271257
[8,] -1.15193566
[9,] -1.04434740
[10,] 1.83739926
$cl
pols(X = X, y = y, L = i)
[1] "Result for 6"
$beta
[,1]
[1,] -0.09607715
[2,] -0.21277987
[3,] -1.17201761
[4,] 0.12307151
[5,] -0.45347618
[6,] -1.56641949
[7,] -0.54021027
[8,] -1.15797228
[9,] -1.05959733
[10,] 1.84644233
$cl
pols(X = X, y = y, L = i)
[1] "Result for 7"
$beta
[,1]
[1,] -0.1076495
[2,] -0.1977993
[3,] -1.1550561
[4,] 0.1251007
[5,] -0.4814888
[6,] -1.5296799
[7,] -0.5382458
[8,] -1.1629381
[9,] -1.0719931
[10,] 1.8537217
$cl
pols(X = X, y = y, L = i)
[1] "Result for 8"
$beta
[,1]
[1,] -0.1172419
[2,] -0.1852151
[3,] -1.1407910
[4,] 0.1267308
[5,] -0.5048296
[6,] -1.4990974
[7,] -0.5366841
[8,] -1.1671009
[9,] -1.0822491
[10,] 1.8596792
$cl
pols(X = X, y = y, L = i)
[1] "Result for 9"
$beta
[,1]
[1,] -0.1253119
[2,] -0.1744744
[3,] -1.1286001
[4,] 0.1280542
[5,] -0.5245776
[6,] -1.4732498
[7,] -0.5354316
[8,] -1.1706458
[9,] -1.0908596
[10,] 1.8646205
$cl
pols(X = X, y = y, L = i)
[1] "Result for 10"
$beta
[,1]
[1,] -0.1321862
[2,] -0.1651825
[3,] -1.1180392
[4,] 0.1291370
[5,] -0.5415033
[6,] -1.4511217
[7,] -0.5344217
[8,] -1.1737051
[9,] -1.0981778
[10,] 1.8687639
$cl
pols(X = X, y = y, L = i)