Indexing for loops in lists consisting of matrices - r

I am struggling to index the piece of code below correctly. Perhaps the setup of my data is not ideal either.
First, I want to compare the values in two columns of abcd, b and d,
with each set of generated RandomNum rows.
If the row value in RandomNum is greater than the corresponding b value in abcd,
then the result should be 0; otherwise it should be 1.
The outcome would be a list with two matrices, one comparing column b
and one comparing column (or vector) d against each set of row values (treated as vectors) in RandomNum, using the criteria described above.
The outcome list would have 2 matrices with dimensions of nv * rp and would be saved as list UDRandomNum.
Then, I want to take a difference between the outcome value 0 or 1 in UDRandomNum row or column of 3 and each time deduct it from columns b and d in abcd. So the outcome saved in Differences would again have two matrices, each one comparing set of vectors of three against the same column b and d from abcd. I hope it makes sense.
set.seed(101)
a <- c(0.1,0.2,0.3)
b <- c(0.8,0.2,0.5)
c <- c(0.4,0.9,1.0)
d <- c(0.7,0.9,0.2)
ab <- cbind(a,b)
cd <- cbind(c,d)
abcd <- list(ab,cd)
rp <- 100
nv <- length(a)
RandomNum <- vector("list",length(a))
# Draw random values between 0 and 1 from the uniform distribution
for (i in 1:length(a)) {
RandomNum[[i]] <- t(replicate(rp, runif(nv, min=0,max=1)))
}
The problematic piece starts here:
UDRandomNum <- vector("list",length(abcd))
for (i in 1:length(abcd)) {
UDRandomNum[[i]] <- RandomNum[[i]][i] <
abcd[[i]][,2][col(RandomNum[[i]])]+0
}
# Later, I want to take the difference between the outcomes (1 or 0)
Differences <- vector("list",length(abcd))
for (i in 1:length(abcd)) {
Differences[[i]] <- abs(sweep(UDRandomNum,2,abcd[,2]))
}
So in my RandomNum I have an outcome that starts with:
[[1]]
[,1] [,2] [,3]
[1,] 0.076929106 0.42883794 0.502711454
[2,] 0.254765247 0.57422550 0.578616861
[3,] 0.270195792 0.30920944 0.094095268
[4,] 0.512404975 0.97536980 0.082336057
...
I compare it with b: 0.8,0.2,0.5 (if the value in RandomNum is > b then we get 0, else we get 1)
UDRandomNum:
[[1]]
[,1] [,2] [,3]
[1,] 1 0 0
[2,] 1 0 0
[3,] 1 0 1
[4,] 1 0 1
...
And finally I would deduct the values in UDRandomNum from b, so the outcome in Differences would look like the following (I typed the equations for convenience, but each entry should be the number resulting from that equation):
[[1]]
[,1] [,2] [,3]
[1,] abs(1-0.076929106) abs(0-0.42883794) abs(0-0.502711454)
[2,] abs(1-0.254765247) abs(0-0.57422550) abs(0-0.578616861)
[3,] abs(1-0.270195792) abs(0-0.30920944) abs(1-0.094095268)
[4,] abs(1-0.512404975) abs(0-0.97536980) abs(1-0.082336057)
...

UDRandomNum <- vector("list",length(abcd))
for (i in 1:length(abcd)) {
UDRandomNum[[i]] <- RandomNum[[i]] <
matrix(abcd[[i]][,2][col(RandomNum[[i]])], ncol=3)
}
Differences <- vector("list",length(abcd))
for (i in 1:length(abcd)) {
Differences[[i]] <- abs(UDRandomNum[[i]] - RandomNum[[i]])
}
Explanation
For the UDRandomNum list, expand the b and d columns of abcd into matrices that match the dimensions of RandomNum, so that the comparison is done element by element. For the Differences list, R automatically coerces the TRUE and FALSE results to 1 and 0 when you subtract the two matrices UDRandomNum and RandomNum.
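As a minimal sketch of that idea (the names R, B, UD, UD2 and Diff are made up for illustration, and only four rows are drawn to keep things small), the column b can be expanded with col() and compared element by element. sweep() with "<" is shown as an alternative that should give the same matrix:

```r
set.seed(101)
b  <- c(0.8, 0.2, 0.5)
rp <- 4                                    # small number of replicates, for illustration only
R  <- matrix(runif(rp * length(b)), nrow = rp)

# Expand b so that every entry in column j of B equals b[j]
B <- matrix(b[col(R)], nrow = nrow(R))

# Adding 0 turns the logical TRUE/FALSE matrix into 1/0
UD   <- (R < B) + 0
Diff <- abs(UD - R)

# Alternative: let sweep() apply the comparison column by column
UD2 <- sweep(R, 2, b, "<") + 0
```

Wrapping the last three steps in a loop over the list elements gives one result matrix per column of interest.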

Related

Inside the loop, write a line of code that populates the matrix M with the numbers 1 through 20

I keep ending up with a matrix populated entirely by 20s. The loop iterates over the numbers and through the indices of the matrix M, but it overwrites every cell each time, whereas I am looking for a 10x2 matrix with only unique values.
n = 20;
M = matrix(NA, ncol = 2, nrow = 10);
a = 1
b = 1
for (i in 1:n){
for (r in 1:nrow(M))
for (c in 1:ncol(M))
i -> M[r,c]
print(M)
}
M
I would suggest that if the outer for-loop is indexed by the increasing values to be entered as elements of the matrix, then the value of i should be used to decide which position each value goes to. (If the values were not sequential, you could use the result of seq_along(your_non_consecutive_variable) as the index for the loop and as the way to pick the value to be entered into the matrix.) You CANNOT work with a single value set in the outer loop and then repeat the assignment of that value multiple times in two nested inner loops.
n = 20;
M = matrix(NA, ncol = 2, nrow = 10);
a = 1
b = 1
for (i in 1:n){
if( i <= 10){ M[i, 1] <- i} else
{ M[i-10, 2] <- i}}
M
#---------
[,1] [,2]
[1,] 1 11
[2,] 2 12
[3,] 3 13
[4,] 4 14
[5,] 5 15
[6,] 6 16
[7,] 7 17
[8,] 8 18
[9,] 9 19
[10,] 10 20
That said this is only to be used as an exercise in understanding for-loops. A more R-ish way of putting values into a matrix would be:
var <- sample(1:20)
M <- matrix(var, 10, 2)
The values in var get assigned to rows 1:10 in the first column and then rows 1:10 in the second column. R handles its matrix indexing in a column major fashion. This is important to understand when working with the results of sapply operations.
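A small sketch of the column-major behaviour described above, using the ordered values 1:20 instead of a random sample so the fill order is easy to see (M_col and M_row are illustrative names):

```r
vals <- 1:20

# Default: fill down column 1 first, then column 2
M_col <- matrix(vals, nrow = 10, ncol = 2)

# byrow = TRUE fills across each row instead
M_row <- matrix(vals, nrow = 10, ncol = 2, byrow = TRUE)

M_col[1, ]   # 1 11
M_row[1, ]   # 1 2

# A single index walks the matrix in the same column-major order
M_col[11]    # 11, i.e. the element at row 1, column 2
```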

Replacing pair of element of symmetric matrix with NA

I have a positive definite symmetric matrix. Pasting the matrix generated using the following code:
library(clusterGeneration)  # provides genPositiveDefMat()
set.seed(123)
m <- genPositiveDefMat(
dim = 3,
covMethod = "unifcorrmat",
rangeVar = c(0,1) )
x <- as.matrix(m$Sigma)
diag(x) <- 1
x
#Output
[,1] [,2] [,3]
[1,] 1.0000000 -0.2432303 -0.4110525
[2,] -0.2432303 1.0000000 -0.1046602
[3,] -0.4110525 -0.1046602 1.0000000
Now, I want to run the matrix through iterations and in each iteration I want to replace the symmetric pair with NA. For example,
Iteration 1:
x[1,2] = x[2,1] <- NA
Iteration2:
x[1,3] = x[3,1] <- NA
and so on....
My idea was to check using a for loop
Prototype:
for( r in 1:nrow(x)
for( c in 1:ncol(x)
if x[r,c]=x[c,r]<-NA
else
x[r,c]
The issue with my code is for row 1 and column 1, the values are equal hence it sets to 0 (which is wrong). Also, the moment it is not NA it comes out of the loop.
Appreciate any help here.
Thanks
If you need the replacement done iteratively, you can use the indexes of the values selected by upper.tri(x)/lower.tri(x) to do the replacements pair by pair (mat below stands for the matrix x from the question). That will allow you to pass the results to a function before/after each replacement, e.g.:
idx <- which(lower.tri(mat), arr.ind=TRUE)
sel <- cbind(
replace(mat, , seq_along(mat))[ idx ],
replace(mat, , seq_along(mat))[ idx[,2:1] ]
)
# [,1] [,2]
#[1,] 2 4 ##each row represents the lower/upper pair
#[2,] 3 7
#[3,] 6 8
for( i in seq_len(nrow(sel)) ) {
mat[ sel[i,] ] <- NA
print(mean(mat, na.rm=TRUE))
}
#[1] 0.2812249
#[1] 0.5581359
#[1] 1
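A somewhat more direct variant of the same idea, sketched here with a small made-up symmetric matrix (the values and the name pairs are illustrative only): which(..., arr.ind = TRUE) returns one (row, col) pair per upper-triangle cell, and both halves of each pair are set to NA in one step.

```r
# Small symmetric example matrix (values are made up for illustration)
x <- matrix(c( 1.0, -0.2, -0.4,
              -0.2,  1.0, -0.1,
              -0.4, -0.1,  1.0), nrow = 3, byrow = TRUE)

# One row per upper-triangle cell: (row, col) with row < col
pairs <- which(upper.tri(x), arr.ind = TRUE)

for (k in seq_len(nrow(pairs))) {
  i <- pairs[k, "row"]
  j <- pairs[k, "col"]
  x[i, j] <- x[j, i] <- NA        # blank out both halves of the pair
  print(mean(x, na.rm = TRUE))    # any per-iteration computation goes here
}
```

Because only cells with row < col are visited, the diagonal is never touched.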

Learning R - What is this Function Doing?

I am learning R and reading the book Guide to programming algorithms in r.
The book gives an example function:
# MATRIX-VECTOR MULTIPLICATION
matvecmult = function(A,x){
m = nrow(A)
n = ncol(A)
y = matrix(0,nrow=m)
for (i in 1:m){
sumvalue = 0
for (j in 1:n){
sumvalue = sumvalue + A[i,j]*x[j]
}
y[i] = sumvalue
}
return(y)
}
How do I call this function in the R console? And what exactly is passed into this function as A and x?
The function takes an argument A, which should be a matrix, and x, which should be a numeric vector with as many elements as A has columns (that is, values per row).
If
A <- matrix(c(1,2,3,4,5,6), nrow = 2, ncol = 3)
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
then you have 3 values (number of columns, ncol) per row, thus x needs to be something like
x <- c(4,5,6)
The function itself iterates over all rows, and in each row, each value is multiplied by a value from x: the value in the first column is multiplied by the first value in x, the value in A's second column is multiplied by the second value in x, and so on. This is repeated for each row, and the sum for each row is returned by the function.
matvecmult(A, x)
[,1]
[1,] 49 # 1*4 + 3*5 + 5*6
[2,] 64 # 2*4 + 4*5 + 6*6
To run this function, you first have to define it (for example by sourcing the file that contains it) and then run these three lines in order:
A <- matrix(c(1,2,3,4,5,6), nrow = 2, ncol = 3)
x <- c(4,5,6)
matvecmult(A, x)
This function is designed to return the product of a matrix A with a vector x; i.e. the result will be the matrix product A x (where, as is usual in R, the vector is treated as a column vector). An example should make things clear.
# define a matrix
mymatrix <- matrix(sample(12), nrow = 4)
# see what the matrix looks like
mymatrix
# [,1] [,2] [,3]
# [1,] 2 10 9
# [2,] 3 1 12
# [3,] 11 7 5
# [4,] 8 4 6
# define a vector where multiplication of our matrix times the vector will be defined
vec3 <- c(-1,0,1)
# apply the function to our matrix and vector
result <- matvecmult(mymatrix, vec3)
result
# [,1]
# [1,] 7
# [2,] 9
# [3,] -6
# [4,] -2
class(result)
# [1] "matrix"   (R >= 4.0.0 prints "matrix" "array")
So matvecmult(mymatrix, vec3) is how you would call this function, and the result is an n by 1 matrix, where n is the number of rows in the matrix argument.
You can also get some insight by playing around and seeing what happens when you pass something other than a matrix-vector pair where the product is defined. In some cases, you will get an error; sometimes you get nonsense; and sometimes you get something you might not expect just from the function name. See what happens when you call matvecmult(mymatrix, mymatrix).
The function calculates the product of a matrix and a column vector. It assumes that the number of columns of the matrix equals the number of elements in the vector.
It stores the number of columns of A in n and the number of rows in m.
It then initializes a matrix of m rows with all values set to 0.
It iterates along the rows of A and multiplies each value in each row by the corresponding value in x.
The sum for each row is then stored in y, and finally the function returns the single-column matrix y.
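To check the descriptions above, the function can be compared with R's built-in matrix product operator %*%. The function is repeated here only so that the snippet runs on its own. A and x are the example values used earlier in this thread:

```r
# The function from the question, repeated so this snippet is self-contained
matvecmult <- function(A, x) {
  m <- nrow(A)
  n <- ncol(A)
  y <- matrix(0, nrow = m)
  for (i in 1:m) {
    sumvalue <- 0
    for (j in 1:n) {
      sumvalue <- sumvalue + A[i, j] * x[j]
    }
    y[i] <- sumvalue
  }
  y
}

A <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3)
x <- c(4, 5, 6)

matvecmult(A, x)   # 49 and 64, as a 2 x 1 matrix
A %*% x            # built-in matrix product, same values
```

In everyday code %*% is the usual choice. The explicit loops are mainly useful for seeing what the product computes.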

How to generate a matrices A) each row has a single value of one; B) rows sum to one

This is a two-part problem: the first is to create an NXN square matrix for which only one random element in each row is 1, the other items must be zero. (i.e. the sum of elements in each row is 1).
The second is to create an NXN square matrix for which the sum of items in each row is 1, but each element follows a distribution e.g. normal distribution.
Related questions include (Create a matrix with conditional sum in each row -R)
Matlab seems to do what I want automatically (Why this thing happens with random matrix such that all rows sum up to 1?), but I am looking for a solution in r.
Here is what I tried:
# PART 1
N <- 50
x <- matrix(0,N,N)
lapply(1:N, function(y){
x[y,sample(N,1)]<- 1
})
(I get zeroes still)
# PART 2
N <- 50
x <- matrix(0,N,N)
lapply(1:N, function(y){
x[y,]<- rnorm(N)
})
(It needs scaling)
Here's another loop-less solution that uses the two column addressing facility using the "[<-" function. This creates a two-column index matrix whose first column is simply an ascending series that assigns the row locations, and whose second column (the one responsible for picking the column positions) is a random integer value. (It's a vectorized version of Matthew's "easiest method", and I suspect would be faster since there is only one call to sample.):
M <- matrix(0,N,N)
M[ cbind(1:N, sample(1:N, N, rep=TRUE))] <- 1
> rowSums(M)
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
If you didn't specify rep=TRUE, then colSums(M) would have all been ones as well, but that was not what you requested. It does mean the rank of your resultant matrix may be less than N. If you left out the rep=TRUE the matrix would be full rank.
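A short sketch of the two-column index technique, with a small N and a fixed seed chosen only for illustration (idx and P are made-up names). It also shows the variant without replacement mentioned above:

```r
set.seed(1)
N <- 5
M <- matrix(0, N, N)

# Row i of idx is a (row, column) address into M
idx <- cbind(1:N, sample(N, N, replace = TRUE))
M[idx] <- 1
rowSums(M)          # every row sums to 1

# Without replacement each column is used exactly once: a permutation matrix
P <- matrix(0, N, N)
P[cbind(1:N, sample(N))] <- 1
colSums(P)          # every column sums to 1 as well
```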
Here you see why lapply doesn't always replace a loop. You're trying to iterate through the rows of x and modify the matrix, but what you're modifying is a copy of the x from the global environment.
The easiest fix is to use a for loop:
for (y in 1:N) {
x[y,sample(N,1)]<- 1
}
The apply family of functions should be used for its return value, rather than for programming functions with side effects.
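The difference can be seen in a small sketch (N = 5 and the names rows and x2 are chosen only for illustration). The assignment inside the anonymous function changes a local copy, so the outer x stays all zeros, whereas returning each row and binding the rows together gives the intended matrix:

```r
set.seed(1)
N <- 5
x <- matrix(0, N, N)

# The assignment inside the function changes a local copy of x only
invisible(lapply(1:N, function(y) {
  x[y, sample(N, 1)] <- 1
}))
sum(x)   # still 0: the outer x was never modified

# Returning each row and binding them avoids side effects
rows <- lapply(1:N, function(y) {
  r <- numeric(N)
  r[sample(N, 1)] <- 1
  r
})
x2 <- do.call(rbind, rows)
rowSums(x2)   # one 1 per row
```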
A way to do this is to return the rows, then rbind them into a matrix. The second example is shown here, as this more closely resembles an apply:
do.call(rbind, lapply((1:N), function(i) rnorm(N)))
However, this is more readable:
matrix(rnorm(N*N), N, N)
Now to scale this to have row sums equal to 1. You use the fact that a matrix is column-oriented and that vectors are recycled, meaning that you can divide a matrix M by rowSums(M). Using a more reasonable N=5:
m <- matrix(rnorm(N*N), N, N)
m/rowSums(m)
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.1788692 0.5398464 0.24980924 -0.01282655 0.04430168
## [2,] 0.4176512 0.2564463 0.11553143 0.35432975 -0.14395871
## [3,] 0.3480568 0.7634421 -0.38433940 0.34175983 -0.06891932
## [4,] 1.1807180 -0.0192272 0.16500179 -0.31201400 -0.01447859
## [5,] 1.1601173 -0.1279919 -0.07447043 0.20865963 -0.16631458
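The recycling argument can be verified with a short sketch (the seed and the names scaled and scaled2 are illustrative). sweep() performs the same division with the margin stated explicitly:

```r
set.seed(1)
N <- 5
m <- matrix(rnorm(N * N), N, N)

# rowSums(m) has length N and is recycled down each column,
# so entry [i, j] is divided by the sum of row i
scaled <- m / rowSums(m)
rowSums(scaled)                    # all (numerically) equal to 1

# Same operation with the margin made explicit
scaled2 <- sweep(m, 1, rowSums(m), "/")
all.equal(scaled, scaled2)
```

Note that with normally distributed entries the scaled values are not restricted to the interval from 0 to 1, as the output above also shows.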
No-loop solution :)
n <- 5
# on which column in each row insert 1s
s <- sample(n,n,TRUE)
# indexes for each row
w <- seq(1,n*n,by=n)-1
index <- s+w
# vector of 0s
vec <- integer(n*n)
# put 1s
vec[index] <- 1
# voila :)
matrix(vec, n, byrow = TRUE)
[,1] [,2] [,3] [,4] [,5]
[1,] 1 0 0 0 0
[2,] 0 0 0 1 0
[3,] 0 0 0 0 1
[4,] 1 0 0 0 0
[5,] 1 0 0 0 0

R: How to do this matrix operation without loops or more efficient?

I'm trying to perform the following operation on a matrix: multiply columns 2, 3 and 4 by the first column, keeping the first column's value; then multiply columns 3 and 4 by the second column, keeping the second column's value; and finally multiply column 4 by the third column. I want to do this without using a "for" loop, ideally with functions like sapply or mapply. Does anyone have an idea how to do it?
Example with one line:
a[1,1]*(a[1,2], a[1,3], a[1,4]) = 2 4 4 4
a[1,1] a[1,2]*(a[1,3], a[1,4]) = 2 4 16 16 #keep a[1,1] a[1,2]
a[1,1] a[1,2] a[1,3] a[1,3]*(a[1,4]) = 2 4 16 256 # #keep a[1,1] a[1,2] a[1,3]
Input:
> a<- matrix(2,4,4) # or any else matrix like a<- matrix(c(1,8,10,1,4,1),3,3)
> a
[,1] [,2] [,3] [,4]
[1,] 2 2 2 2
[2,] 2 2 2 2
[3,] 2 2 2 2
[4,] 2 2 2 2
Output:
> a
[,1] [,2] [,3] [,4]
[1,] 2 4 16 256
[2,] 2 4 16 256
[3,] 2 4 16 256
[4,] 2 4 16 256
EDIT: LOOP VERSION
a<- matrix(2,4,4);
ai<-a[,1,drop=F];
b<- matrix(numeric(0),nrow(a),ncol(a)-1);
i<- 1;
for ( i in 1:(ncol(a)-1)){
a<- a[,1]*a[,-1,drop=F];
b[,i]<- a[,1];
}
b<- cbind(ai[,1],b);
b
If I understand correctly, what you are trying to do is, starting with a matrix A with N columns, perform the following steps:
Step 1. Multiply columns 2 through N of A by column 1 of A. Call the resulting matrix A1.
Step 2. Multiply columns 3 through N of A1 by column 2 of A1. Call the resulting matrix A2.
...
Step (N-1). Multiply column N of A(N-2) by column (N-1) of A(N-2). This is the desired result.
If this is indeed what you are trying to do, you need to either write a double for loop (which you want to avoid, as you say) or come up with some iterative method of performing the above steps.
The double for way would look something like this
DoubleFor <- function(m) {
res <- m
for(i in 1:(ncol(res)-1)) {
for(j in (i+1):ncol(res)) {
res[, j] <- res[, i] * res[, j]
}
}
res
}
Using R's vectorized operations, you can avoid the inner for loop
SingleFor <- function(m) {
res <- m
for(i in 1:(ncol(res)-1))
res[, (i+1):ncol(res)] <- res[, i] * res[, (i+1):ncol(res)]
res
}
When it comes to iterating a procedure, you may want to define a recursive function, or use Reduce. The recursive function would be something like
RecursiveFun <- function(m, i = 1) {
if (i == ncol(m)) return(m)
n <- ncol(m)
m[, (i+1):n] <- m[, (i+1):n] * m[, i]
Recall(m, i + 1) # Thanks to @batiste for suggesting using Recall()!
}
while Reduce would use a similar function without the recursion (which is provided by Reduce)
ReduceFun <- function(m) {
Reduce(function(i, m) {
n <- ncol(m)
m[, (i+1):n] <- m[, (i+1):n] * m[, i]
m
}, c((ncol(m)-1):1, list(m)), right = T)
}
These will all produce the same result, e.g. testing on your matrix
a <- matrix(c(1, 8, 10, 1, 4, 1), 3, 3)
DoubleFor(a)
# [,1] [,2] [,3]
# [1,] 1 1 1
# [2,] 8 32 2048
# [3,] 10 10 1000
all(DoubleFor(a) == SingleFor(a) & SingleFor(a) == RecursiveFun(a) &
RecursiveFun(a) == ReduceFun(a))
# [1] TRUE
Just out of curiosity, I did a quick speed comparison, but I don't think any one of the above will be significantly faster than the others for your size of matrices, so I would just go with the one you think is more readable.
a <- matrix(rnorm(1e6), ncol = 1e3)
system.time(DoubleFor(a))
# user system elapsed
# 22.158 0.012 22.220
system.time(SingleFor(a))
# user system elapsed
# 27.349 0.004 27.415
system.time(RecursiveFun(a))
# user system elapsed
# 25.150 1.336 26.534
system.time(ReduceFun(a))
# user system elapsed
# 26.574 0.004 26.626
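One further possibility, offered only as a sketch and not taken from the functions above: following the steps listed earlier, each finished column equals the original column multiplied by the product of all finished columns to its left, so a single pass that carries a running product per row should give the same result. The name RunningProduct is made up for this illustration.

```r
RunningProduct <- function(m) {
  res <- m
  p <- rep(1, nrow(m))            # per-row product of the columns finished so far
  for (j in seq_len(ncol(m))) {
    res[, j] <- m[, j] * p        # finish column j
    p <- p * res[, j]             # fold it into the running product
  }
  res
}

a <- matrix(2, 4, 4)
RunningProduct(a)
# every row is 2 4 16 256

# Same values as the 3 x 3 test matrix above, written out in full
a2 <- matrix(c(1, 8, 10, 1, 4, 1, 1, 8, 10), 3, 3)
RunningProduct(a2)
# rows: 1 1 1 / 8 32 2048 / 10 10 1000
```

This touches each column once instead of updating all remaining columns at every step. No timing claim is made for it here.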
