If I have a matrix:
x <- matrix(c(0), ncol=2, nrow=2)
x
[,1] [,2]
[1,] 0 0
[2,] 0 0
and I want to change x[,2][2] to a 1 instead of a 0 and save that in a new matrix y so that the output would be:
y
[,1] [,2]
[1,] 0 0
[2,] 0 1
how do I do this in R?
If you don't want to modify the original matrix, you could reverse the order of the operations that you want to perform, meaning that you first store a copy of the matrix x in a new variable y and then manipulate the entries of the matrix y.
y <- x
y[2,2] <- 1
Otherwise, if you really want to change x and afterwards store a copy of the modified matrix in y, the changes are straightforward: x[2,2] <- 1 followed by y <- x.
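As a small sketch of the copy-then-modify approach (the variable names follow the question; nothing else is assumed), the original matrix can be checked after the assignment to confirm it is untouched:

```r
x <- matrix(0, nrow = 2, ncol = 2)

y <- x          # y starts as an independent copy of x
y[2, 2] <- 1    # only y is modified

x[2, 2]         # x still holds 0 here
y[2, 2]         # y holds 1 here
```

R copies on modification, so assigning into y never changes x.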
I am struggling to index the piece of code below correctly. Perhaps the setup of my data is not ideal either.
First, I want to compare the values in two columns of abcd, b and d,
with each set of generated RandomNum rows.
If the row value in RandomNum is greater than the corresponding b value in abcd,
the result should be 0; otherwise it should be 1.
The outcome would be a list with two matrices, one comparing column b
and one comparing column (or vector) d against each set of row values (treated as vectors) in RandomNum, using the criteria described above.
The outcome list would have 2 matrices with dimensions of nv * rp and would be saved as the list UDRandomNum.
Then I want to take the difference between the outcome values (0 or 1) in each row of three in UDRandomNum and columns b and d in abcd. The result, saved in Differences, would again contain two matrices, each comparing a set of vectors of length three against the same column b or d from abcd. I hope that makes sense.
set.seed(101)
a <- c(0.1,0.2,0.3)
b <- c(0.8,0.2,0.5)
c <- c(0.4,0.9,1.0)
d <- c(0.7,0.9,0.2)
ab <- cbind(a,b)
cd <- cbind(c,d)
abcd <- list(ab,cd)
rp <- 100
nv <- length(a)
RandomNum <- vector("list",length(a))
# Draw random values between 0 and 1 from the uniform distribution
for (i in 1:length(a)) {
RandomNum[[i]] <- t(replicate(rp, runif(nv, min=0,max=1)))
}
The problematic piece starts here:
UDRandomNum <- vector("list",length(abcd))
for (i in 1:length(abcd)) {
UDRandomNum[[i]] <- RandomNum[[i]][i] <
abcd[[i]][,2][col(RandomNum[[i]])]+0
}
# Later, I want to take the difference between the outcomes (1 or 0)
Differences <- vector("list",length(abcd))
for (i in 1:length(abcd)) {
Differences[[i]] <- abs(sweep(UDRandomNum,2,abcd[,2]))
}
In my RandomNum, the outcome starts with:
[[1]]
[,1] [,2] [,3]
[1,] 0.076929106 0.42883794 0.502711454
[2,] 0.254765247 0.57422550 0.578616861
[3,] 0.270195792 0.30920944 0.094095268
[4,] 0.512404975 0.97536980 0.082336057
...
I compare it with b: 0.8,0.2,0.5 (if the value in RandomNum is > b then we get 0, else we get 1)
UDRandomNum:
[[1]]
[,1] [,2] [,3]
[1,] 1 0 0
[2,] 1 0 0
[3,] 1 0 1
[4,] 1 0 1
...
Finally, I would deduct the values in UDRandomNum from b, so the outcome in Differences would look like the following (I typed the expressions for convenience, but each entry should be the number the expression evaluates to):
[[1]]
[,1] [,2] [,3]
[1,] abs(1-0.076929106) abs(0-0.42883794) abs(0-0.502711454)
[2,] abs(1-0.254765247) abs(0-0.57422550) abs(0-0.578616861)
[3,] abs(1-0.270195792) abs(0-0.30920944) abs(1-0.094095268)
[4,] abs(1-0.512404975) abs(0-0.97536980) abs(1-0.082336057)
...
# Compare each column of RandomNum[[i]] with the matching entry of column 2 (b or d)
UDRandomNum <- vector("list", length(abcd))
for (i in seq_along(abcd)) {
  UDRandomNum[[i]] <- RandomNum[[i]] <
    matrix(abcd[[i]][, 2][col(RandomNum[[i]])], ncol = 3)
}
# Subtracting coerces TRUE/FALSE to 1/0
Differences <- vector("list", length(abcd))
for (i in seq_along(abcd)) {
  Differences[[i]] <- abs(UDRandomNum[[i]] - RandomNum[[i]])
}
Explanation
For the UDRandomNum list, expand the b and d columns of abcd into matrices that match the dimensions of RandomNum, so that the comparison is done element by element. For the Differences list, R automatically coerces the TRUE and FALSE results to 1 and 0 when you subtract RandomNum from UDRandomNum.
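To see the two ideas in isolation, here is a minimal sketch on a hypothetical 2 x 2 matrix (m, b, thr, ud and diffs are illustrative names, not part of the question's data). Indexing a vector with col(m) repeats each threshold down its column, and adding 0 or subtracting turns the logical result into 0/1:

```r
m <- matrix(c(0.1, 0.9, 0.3, 0.6), nrow = 2)   # hypothetical data, filled column by column
b <- c(0.5, 0.2)                               # one threshold per column of m

# b[col(m)] picks b[1] for every cell in column 1, b[2] for column 2, ...
thr <- matrix(b[col(m)], ncol = ncol(m))

ud <- (m < thr) + 0        # logical comparison coerced to 0/1
diffs <- abs(ud - m)       # absolute difference between the 0/1 outcome and the value
```

The same pattern scales to the 100 x 3 matrices in the question, since thr always has the dimensions of m.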
I am learning R and reading the book Guide to programming algorithms in r.
The book gives an example function:
# MATRIX-VECTOR MULTIPLICATION
matvecmult = function(A,x){
  m = nrow(A)
  n = ncol(A)
  y = matrix(0,nrow=m)
  for (i in 1:m){
    sumvalue = 0
    for (j in 1:n){
      sumvalue = sumvalue + A[i,j]*x[j]
    }
    y[i] = sumvalue
  }
  return(y)
}
How do I call this function in the R console? And what exactly is passed into this function as A and x?
The function takes an argument A, which should be a matrix, and x, which should be a numeric vector with as many elements as there are values per row in A (that is, ncol(A)).
If
A <- matrix(c(1,2,3,4,5,6), nrow = 2, ncol = 3)
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
then you have 3 values (number of columns, ncol) per row, thus x needs to be something like
x <- c(4,5,6)
The function itself iterates over all rows, and in each row, each value is multiplied by a value from x: the value in the first column is multiplied by the first value in x, the value in A's second column is multiplied by the second value in x, and so on. This is repeated for each row, and the sum for each row is returned by the function.
matvecmult(A, x)
[,1]
[1,] 49 # 1*4 + 3*5 + 5*6
[2,] 64 # 2*4 + 4*5 + 6*6
To run this function, you first have to define it (for example by sourcing the file that contains it) and then run these three lines in order:
A <- matrix(c(1,2,3,4,5,6), nrow = 2, ncol = 3)
x <- c(4,5,6)
matvecmult(A, x)
This function is designed to return the product of a matrix A with a vector x; i.e. the result will be the matrix product A x (where, as is usual in R, the vector is treated as a column vector). An example should make things clear.
# define a matrix
mymatrix <- matrix(sample(12), nrow = 4)
# see what the matrix looks like
mymatrix
# [,1] [,2] [,3]
# [1,] 2 10 9
# [2,] 3 1 12
# [3,] 11 7 5
# [4,] 8 4 6
# define a vector where multiplication of our matrix times the vector will be defined
vec3 <- c(-1,0,1)
# apply the function to our matrix and vector
result <- matvecmult(mymatrix, vec3)
result
# [,1]
# [1,] 7
# [2,] 9
# [3,] -6
# [4,] -2
class(result)
# [1] "matrix" "array"   (just "matrix" before R 4.0.0)
So matvecmult(mymatrix, vec3) is how you would call this function, and the result is an n by 1 matrix, where n is the number of rows in the matrix argument.
You can also get some insight by playing around and seeing what happens when you pass something other than a matrix-vector pair where the product is defined. In some cases, you will get an error; sometimes you get nonsense; and sometimes you get something you might not expect just from the function name. See what happens when you call matvecmult(mymatrix, mymatrix).
The function calculates the product of a matrix and a column vector. It assumes that the number of columns of the matrix equals the number of elements in the vector.
It stores the number of columns of A in n and the number of rows in m.
It then initializes a single-column matrix y with m rows and all values set to 0.
It iterates along the rows of A and multiplies each value in a row by the corresponding value in x, summing the products.
Each sum is then stored in y, and finally the function returns the single-column matrix y.
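One way to convince yourself that the steps above really compute a matrix product is to compare the result with R's built-in %*% operator. This is only a sketch: the function body is the book's example restated with <- assignments, and A and x are the small inputs used earlier in this thread.

```r
# The book's function, restated so that this snippet runs on its own
matvecmult <- function(A, x) {
  m <- nrow(A)
  n <- ncol(A)
  y <- matrix(0, nrow = m)          # m x 1 result, initialised to 0
  for (i in 1:m) {
    sumvalue <- 0
    for (j in 1:n) {
      sumvalue <- sumvalue + A[i, j] * x[j]
    }
    y[i] <- sumvalue
  }
  y
}

A <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3)
x <- c(4, 5, 6)

loop_result    <- matvecmult(A, x)
builtin_result <- A %*% x           # built-in matrix product

all.equal(loop_result, builtin_result)
```

In everyday code you would normally use %*% directly; the explicit loops are mainly useful for learning what the operator does.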
This is a two-part problem: the first is to create an NXN square matrix for which only one random element in each row is 1, the other items must be zero. (i.e. the sum of elements in each row is 1).
The second is to create an NXN square matrix for which the sum of items in each row is 1, but each element follows a distribution e.g. normal distribution.
Related questions include (Create a matrix with conditional sum in each row -R)
Matlab seems to do what I want automatically (Why this thing happens with random matrix such that all rows sum up to 1?), but I am looking for a solution in r.
Here is what I tried:
# PART 1
N <- 50
x <- matrix(0,N,N)
lapply(1:N, function(y){
x[y,sample(N,1)]<- 1
})
(I get zeroes still)
# PART 2
N <- 50
x <- matrix(0,N,N)
lapply(1:N, function(y){
x[y,]<- rnorm(N)
})
(It needs scaling)
Here's another loop-less solution that uses the two-column matrix indexing supported by the "[<-" function. It creates a two-column index matrix whose first column is simply an ascending series giving the row locations, and whose second column (the one responsible for picking the column positions) holds random integers. (It's a vectorized version of Matthew's "easiest method", and I suspect it would be faster since there is only one call to sample.):
M <- matrix(0,N,N)
M[ cbind(1:N, sample(1:N, N, rep=TRUE))] <- 1
> rowSums(M)
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
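Without rep=TRUE (which R partially matches to sample's replace argument), each column would be chosen exactly once, so colSums(M) would all be ones as well, but that was not what you requested. Sampling with replacement does mean that columns can repeat, so the rank of the resulting matrix may be less than N; without replacement the matrix would be a permutation matrix and therefore full rank.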
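For comparison, a minimal sketch of the without-replacement variant (P is an illustrative name, and N = 5 is chosen only to keep the output small). Here sample(N) returns a random permutation of 1..N, so every column is used exactly once:

```r
N <- 5
P <- matrix(0, N, N)

# Row i gets its 1 in column perm[i]; perm is a permutation, so no column repeats
perm <- sample(N)
P[cbind(1:N, perm)] <- 1

rowSums(P)      # all ones
colSums(P)      # all ones as well
qr(P)$rank      # N, i.e. full rank
```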
Here you see why lapply doesn't always replace a loop. You're trying to iterate through the rows of x and modify the matrix, but what you're modifying is a copy of the x from the global environment.
The easiest fix is to use a for loop:
for (y in 1:N) {
x[y,sample(N,1)]<- 1
}
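The difference between the two approaches can be checked directly. In this sketch (N = 4 and the name after_lapply are only for illustration), the assignment inside the anonymous function creates a local copy of x, so the outer x is still all zeros after lapply, whereas the for loop modifies x in place:

```r
N <- 4
x <- matrix(0, N, N)

# The assignment inside the function acts on a local copy of x
invisible(lapply(1:N, function(y) {
  x[y, sample(N, 1)] <- 1
}))
after_lapply <- sum(x)     # 0: the outer x was never touched

# A for loop runs in the calling environment and modifies x itself
for (y in 1:N) {
  x[y, sample(N, 1)] <- 1
}
rowSums(x)                 # one 1 per row
```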
The apply family should be used for its return value, rather than for calling functions with side effects.
A way to do this is to return the rows, then rbind them into a matrix. The second example is shown here, as this more closely resembles an apply:
do.call(rbind, lapply((1:N), function(i) rnorm(N)))
However, this is more readable:
matrix(rnorm(N*N), N, N)
Now to scale this to have row sums equal to 1. You use the fact that a matrix is column-oriented and that vectors are recycled, meaning that you can divide a matrix M by rowSums(M). Using a more reasonable N=5:
m <- matrix(rnorm(N*N), N, N)
m/rowSums(m)
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.1788692 0.5398464 0.24980924 -0.01282655 0.04430168
## [2,] 0.4176512 0.2564463 0.11553143 0.35432975 -0.14395871
## [3,] 0.3480568 0.7634421 -0.38433940 0.34175983 -0.06891932
## [4,] 1.1807180 -0.0192272 0.16500179 -0.31201400 -0.01447859
## [5,] 1.1601173 -0.1279919 -0.07447043 0.20865963 -0.16631458
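As the output shows, scaling normal draws can produce negative entries and entries larger than 1, because a row sum may be small or negative. If the rows are meant to behave like probabilities, one option is to draw from a positive distribution instead. The sketch below swaps in runif for rnorm; that substitution is an assumption on my part, not something the question asked for:

```r
set.seed(1)
N <- 5

m <- matrix(runif(N * N), N, N)   # strictly positive entries
p <- m / rowSums(m)               # element k of rowSums(m) lines up with row k

rowSums(p)                        # each row sums to 1 (up to floating point)
range(p)                          # all entries lie between 0 and 1
```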
No-loop solution :)
n <- 5
# on which column in each row insert 1s
s <- sample(n,n,TRUE)
# indexes for each row
w <- seq(1,n*n,by=n)-1
index <- s+w
# vector of 0s
vec <- integer(n*n)
# put 1s
vec[index] <- 1
# voila :)
matrix(vec,n,byrow = TRUE)
[,1] [,2] [,3] [,4] [,5]
[1,] 1 0 0 0 0
[2,] 0 0 0 1 0
[3,] 0 0 0 0 1
[4,] 1 0 0 0 0
[5,] 1 0 0 0 0
I have a two-part question about diag in R and MATLAB.
1) Is there an existing way to access the different diagonals of a matrix in R, similar to how it is done in MATLAB (see http://www.mathworks.com/help/techdoc/ref/diag.html)?
2) If there is no such function, how can my code be improved so that it behaves like R's diag, where
diag(x = 1, nrow, ncol) # returns the values of the diagonal
diag(x) <- value # inserts values on the diagonal
At present my code returns the elements on the k-th diagonal, but how can it be written so that, when called in the second form above, it lets me insert values on that diagonal? Currently I use diag.ind to get the indices and then use those indices to insert the values on the k-th diagonal.
Here is the code (it uses rbind.fill.matrix from the plyr package):
'diag.ind'<-function(x,k=0){
if(k=='') k=0
x<-as.matrix(x)
if(dim(x)[2]==dim(x)[1]){
stp_pt_r<-dim(x)[1]
stp_pt_c<-dim(x)[2]
}
if(ncol(x)> dim(x)[1]){
stp_pt_r<-dim(x)[1]
stp_pt_c<-stp_pt_r + 1
}
if(ncol(x)< dim(x)[1]){
stp_pt_c<-dim(x)[2]
stp_pt_r<-stp_pt_c+1
}
if(k==0){
r<-as.matrix(seq(1,stp_pt_r,by=1))
c<-as.matrix(seq(1,stp_pt_c,by=1))
ind.r<- cbind(r,c)
}
if(k>0){
r<-t(as.matrix(seq(1,stp_pt_r,by=1)))
c<-t(as.matrix(seq((1+k),stp_pt_c,by=1)))
ind<-t(rbind.fill.matrix(r,c))
ind.r<-ind[!is.na(ind[,2]),]
}
if(k<0){
k<-abs(k)
r<-t(as.matrix(seq((1+k),stp_pt_r,by=1)))
c<-t(as.matrix(seq(1,stp_pt_c,by=1)))
ind<-t(rbind.fill.matrix(r,c))
ind.r<-ind[!is.na(ind[,1]),]
}
diag.x<-x[ind.r]
output<-list(diag.x=diag.x, diag.ind=ind.r)
return(output)
}
This is kind of clunky and I feel like I must be reinventing the wheel. Thanks in advance for any insight!
After your reply to Andrie this may satisfy:
exdiag <- function(mat, off) {mat[row(mat)+off == col(mat)]}
x <- matrix(1:16, ncol=4)
exdiag(x,1)
#[1] 5 10 15
I was thinking you wanted a function that can assign or return a diagonal, sub-diagonal or super-diagonal matrix. This is the constructor function:
subdiag <- function(vec, size, offset=0){
  M <- matrix(0, size, size)
  M[row(M)-offset == col(M)] <- vec
  return(M)
}
> subdiag(1, 5, 1)
[,1] [,2] [,3] [,4] [,5]
[1,] 0 0 0 0 0
[2,] 1 0 0 0 0
[3,] 0 1 0 0 0
[4,] 0 0 1 0 0
[5,] 0 0 0 1 0
Called with only two arguments you would get a diagonal matrix. You can construct super-diagonal matrices with negative offsets. If this is what you wanted for the constructor, then it should not be too hard to construct a similar subdiag<- function to go along with it.
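A replacement function along those lines might look like the sketch below. It pairs with the exdiag extractor shown earlier rather than with subdiag, and the default off = 0 is my own addition. In R, a function named `name<-` whose last argument is called value is what makes the syntax name(x, ...) <- value work:

```r
# Extractor: elements on the diagonal shifted by `off` (0 = main diagonal)
exdiag <- function(mat, off = 0) mat[row(mat) + off == col(mat)]

# Replacement counterpart; the last argument must be named `value`
`exdiag<-` <- function(mat, off = 0, value) {
  mat[row(mat) + off == col(mat)] <- value
  mat
}

x <- matrix(1:16, ncol = 4)
exdiag(x, 1) <- 0      # overwrite the first super-diagonal
exdiag(x, 1)           # now all zeros
exdiag(x)              # main diagonal is unchanged
```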
In MATLAB, to assign the values x to the diagonal of A:
n = size(A,1);
A(1:n+1:end) = x
Look up linear indexing.
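The same linear-indexing idea carries over to R, which also stores matrices column by column. As a sketch for a square matrix (A and n are illustrative names), the main diagonal sits at positions 1, n+2, 2n+3, ..., so a step of n + 1 walks along it:

```r
A <- matrix(0, 4, 4)
n <- nrow(A)

# Linear (column-major) positions of the main diagonal of an n x n matrix
idx <- seq(1, n * n, by = n + 1)
A[idx] <- 1:4

diag(A)    # returns the values just assigned
```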
Although, that might not be what you asked.
I'm trying to write a function to determine the Euclidean distance between x (one point) and y (a set of n points).
How should I pass y to the function? Until now, I have used a matrix like this:
[,1] [,2] [,3]
[1,] 0 2 1
[2,] 1 1 1
Which would pass the points (0,2,1) and (1,1,1) to that function.
However, when I pass x as a normal (column) vector, the two variables don't match inside the function.
I either have to transpose x or y, or store the vector of vectors some other way.
My question: what is the standard way to store more than one vector in R (my matrix y)?
Is it just my y transposed, or maybe a list or a data frame?
There is no standard way, so you should pick the most efficient one. That in turn depends on what this vector of vectors looks like right after creation (it is better to avoid any unnecessary conversion) and on the speed of the function itself.
I believe that a data.frame with columns x, y and z would be a pretty good choice; the distance function is then quite simple and fast:
d<-function(x,y) sqrt((y$x-x[1])^2+(y$y-x[2])^2+(y$z-x[3])^2)
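A short usage sketch, using the two points from the question stored as a data frame (pts is an illustrative name, and the query point c(0, 2, 2) is chosen arbitrarily):

```r
d <- function(x, y) sqrt((y$x - x[1])^2 + (y$y - x[2])^2 + (y$z - x[3])^2)

# The points (0,2,1) and (1,1,1), one row per point
pts <- data.frame(x = c(0, 1), y = c(2, 1), z = c(1, 1))

dists <- d(c(0, 2, 2), pts)   # one distance per row of pts
dists
```

Because the columns are whole vectors, the arithmetic is vectorized over all points at once.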
The apply function with the margin argument = 1 seems the most obvious:
> x
[,1] [,2] [,3]
[1,] 0 2 1
[2,] 1 1 1
> apply(x , 1, function(z) crossprod(z, 1:length(z) ) )
[1] 7 6
> 2*2+1*3
[1] 7
> 1*1+2*1+3*1
[1] 6
So if you want distances, the square root of the cross-product of the differences from a chosen point seems to work:
> apply(x , 1, function(z) sqrt(sum(crossprod(z -c(0,2,2), z-c(0,2,2) ) ) ) )
[1] 1.000000 1.732051
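If the points stay in a matrix with one point per row, a vectorized alternative to apply is possible. This is a sketch using sweep, whose default operation is subtraction, together with rowSums (p and dists are illustrative names):

```r
x <- rbind(c(0, 2, 1), c(1, 1, 1))   # one point per row
p <- c(0, 2, 2)                      # the reference point

# sweep subtracts p[j] from column j; rowSums adds the squared differences per point
dists <- sqrt(rowSums(sweep(x, 2, p)^2))
dists
```

This gives the same two distances as the apply call above, without an explicit per-row function.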