Matrix of expected values from matrix of observed using a loop - r

I am trying to figure out how to use a for loop to create a matrix of expected values. It should be able to handle a matrix of any size. This is all I've been able to come up with so far.
for(i in 1:obsv){
  for(j in 1:obsv){
    obsv[i,j]<-(sum(obsv[i,])*sum(obsv[,j]))/sum(obsv)
  }
}
##obsv is the name of the matrix of observed values

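Your loop has two problems. First, 1:obsv is not a valid index range for a matrix; you need to loop over 1:nrow(obsv) and 1:ncol(obsv). Second, it overwrites obsv while still reading row and column sums from it, so later cells would be computed from already modified values; write the results to a copy instead.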
I will use a fake matrix, since you haven't posted an example dataset.
obsv <- matrix(1:25, ncol = 5)
obsv2 <- obsv # modify a copy
for(i in 1:nrow(obsv)){
  for(j in 1:ncol(obsv)){
    obsv2[i, j] <- sum(obsv[i, ])*sum(obsv[, j])/sum(obsv)
  }
}
Now, the above code can be greatly simplified. A one-liner will do it.
obsv3 <- rowSums(obsv) %*% t(colSums(obsv))/sum(obsv)
identical(obsv2, obsv3)
#[1] TRUE
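The same table can also be written with outer() and cross-checked against the expected counts that chisq.test() computes internally. This is a minimal sketch using the fake 5 x 5 matrix from above; suppressWarnings() is only there because some expected counts in this toy matrix are small.

```r
obsv <- matrix(1:25, ncol = 5)

# Expected count for cell (i, j) = row total i * column total j / grand total
expected_outer <- outer(rowSums(obsv), colSums(obsv)) / sum(obsv)

# chisq.test() stores the same table in its `expected` component
expected_chisq <- suppressWarnings(chisq.test(obsv))$expected

# Compare with a numerical tolerance rather than bit-for-bit
all.equal(expected_outer, expected_chisq)
```

all.equal() is used here instead of identical() because it tolerates tiny floating-point differences, which is usually the safer comparison for computed doubles.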

Related

Using for loop to append vectors of variable length

I am trying to create a vector or list of values based on the output of a function performed on individual elements of a column.
library(hpoPlot)
xyz_hpo <- c("HP:0003698", "HP:0007082", "HP:0006956")
getallancs <- function(hpo_col) {
  for (i in 1:length(hpo_col)) {
    anc <- get.ancestors(hpo.terms, hpo_col[i])
    output <- list()
    output[[length(anc) + 1]] <- append(output, anc)
  }
  return(anc)
}
all_ancs <- getallancs(xyz_hpo)
get.ancestors outputs a character vector of variable length depending on each term. How can I loop through hpo_col adding the length of each ancs vector to the output vector?
Welcome to Stack Overflow :) Great job on providing a minimal reproducible example!
As mentioned in the comments, you need to move the output <- list() outside of your for loop, and return it after the loop. At present it is being reset for each iteration of the loop, which is not what you want. I also think you want to return a vector rather than a list, so I have changed the type of output.
Also, in your original question, you say that you want to return the length of each anc vector in the loop, so I have changed the function to output the length of each iteration, rather than the whole vector.
getallancs <- function(hpo_col) {
  output <- numeric()  # created once, before the loop
  for (i in seq_along(hpo_col)) {  # seq_along is safe for zero-length input
    anc <- get.ancestors(hpo.terms, hpo_col[i])
    output <- append(output, length(anc))
  }
  return(output)
}
If you are only doing this for a few cases, such as your example, this approach will be fine. However, growing a vector inside a loop is typically quite slow in R, and it's better to vectorise this style of calculation if possible. This matters most when you are running it over a large number of elements, where computation will take more than a few seconds.
For example, one way the function above could be vectorised is like so:
all_ancs <- sapply(xyz_hpo, function(x) length(get.ancestors(hpo.terms, x)))
If in fact you did mean to output the whole vector of anc, not just the lengths, the original function would look like this:
getallancs <- function(hpo_col) {
  output <- character()  # created once, before the loop
  for (i in seq_along(hpo_col)) {  # seq_along is safe for zero-length input
    anc <- get.ancestors(hpo.terms, hpo_col[i])
    output <- c(output, anc)
  }
  return(output)
}
Or a vectorised version could be
all_ancs <- unlist(lapply(xyz_hpo, function(x) get.ancestors(hpo.terms, x)))
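Since hpoPlot may not be installed, here is a self-contained sketch of the same pattern with a made-up lookup table standing in for get.ancestors (fake_tree and get_anc are hypothetical names, not part of hpoPlot). It collects the results once with lapply and then derives both the lengths and the flattened vector from that list.

```r
# Hypothetical stand-in for get.ancestors(hpo.terms, term)
fake_tree <- list(A = c("A", "root"),
                  B = c("B", "A", "root"),
                  C = c("C", "root"))
get_anc <- function(term) fake_tree[[term]]

terms <- c("A", "B", "C")

# One call per term, results kept in a named list
anc_list <- lapply(terms, get_anc)
names(anc_list) <- terms

# Length of each ancestor vector
anc_lengths <- lengths(anc_list)

# All ancestors flattened into one character vector
anc_all <- unlist(anc_list, use.names = FALSE)
```

Keeping the intermediate list means the lookup runs only once per term, even if you end up needing both the lengths and the full vector.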
Hope that helps. If it solves your problem, please mark this as the answer.

In R, is it possible to use a pair, tuple or equivalent in a matrix?

I am trying to create a matrix of coordinates (indexes) from which I randomly pick one using the sample function. I then use it to select a cell in another matrix. What is the best way to do this? The trouble is how to store these integers in the matrix so that they are easy to separate. Right now I have them stored as strings with a comma, which I then split. Someone suggested I use a pair, or a string, but I cannot seem to get these to work with a matrix. Thanks!
EDIT: What I currently have looks like this (changed a little to make sense out of context):
probs <- matrix(c(0,0,0.6,0,0,
0,0.7,1,0.7,0,
0.6,1,0,1,0.6,
0,0.7,1,0.7,0,
0,0,0.6,0,0),5,5)
cordsMat <- matrix("",5,5)
for (x in 1:5){
  for (y in 1:5){
    cordsMat[x,y] = paste(x,y,sep=",")
  }
}
cords <- sample(cordsMat,1,,probs)
cordsVec <- unlist(strsplit(cords,split = ","))
cordX <- as.numeric(cordsVec[1])
cordY <- as.numeric(cordsVec[2])
otherMat[cordX,cordY]
It sort of works, but I would also be interested in a better way, as this will get repeated a lot.
If you want to set the probabilities, you can pass them to sample through its prob argument. Store the coordinates as the rows of a two-column matrix and sample a row number:
# creating the matrix
matrix(sample(rep(1:6, 15:20), 25), 5) -> other.mat
# set the probs vec
probs <- c(0,0,0.6,0,0,
0,0.7,1,0.7,0,
0.6,1,0,1,0.6,
0,0.7,1,0.7,0,
0,0,0.6,0,0)
# the coordinates matrix
mat <- as.matrix(expand.grid(1:nrow(other.mat),1:ncol(other.mat)))
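# sampling a row number randomly (mat has 25 rows but 50 elements,
# so sample from the row numbers, not from mat itself)
sample(nrow(mat), 1, prob=probs) -> rand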
# getting the value
other.mat[mat[rand,1], mat[rand,2]]
[1] 6
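Another option, sketched here under the assumption that probs has the same dimensions as the matrix being sampled from, is to skip the coordinate matrix entirely: sample a single linear index and let arrayInd() convert it to a row and column when needed. other_mat is a stand-in for your real matrix.

```r
set.seed(42)  # only to make the sketch reproducible

probs <- matrix(c(0,   0,   0.6, 0,   0,
                  0,   0.7, 1,   0.7, 0,
                  0.6, 1,   0,   1,   0.6,
                  0,   0.7, 1,   0.7, 0,
                  0,   0,   0.6, 0,   0), 5, 5)
other_mat <- matrix(1:25, 5, 5)  # stand-in data

# Sample one cell position, weighted by the matching entry of probs
idx <- sample(length(probs), 1, prob = as.vector(probs))

# Row and column of that cell, as a 1 x 2 matrix
rc <- arrayInd(idx, dim(probs))

# A matrix can be indexed directly by the linear index
value <- other_mat[idx]
```

other_mat[rc] returns the same element as other_mat[idx], so the row and column are only needed if you want to report them.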

How to iterate/loop through multiple numbered variables in R

I'm new to programming in R, so I don't even know if this is feasible. I have 50 matrices (50,000 rows by 10 columns) I'm trying to populate for a Monte Carlo simulation. I created all matrices in a loop and they're called mCMatrix1, mCMatrix2 etc.
I want to populate the matrices in a loop, something to this effect:
for (i in 50){
  for (j in 50000){
    num <- mu + tR %*% rnorm(10) # returns a 10 row, 1 column matrix
    mCMatrixC"i"[]= num[,1] # basically rotates the matrix to fill in the first row
  }
}
where I can somehow tell the program that it needs to populate mCMatrix1, then mCMatrix2, all the way to the 50th matrix. For STATA users, I remember you could loop through variables with v = forval(range of values), mCMatrix`v'. (It's been a while since I've used STATA, so the syntax probably isn't right, but it was something to that effect.)
You can build a list of matrices for easier access and access it using the following. I am not sure about the matrix operation you do in the loop so I have chosen a random matrix as an example.
> list_matrices = list()
> for (i in 1:10) { list_matrices[[i]] = matrix(rnorm(9), nrow=3)}
> list_matrices[[1]]
            [,1]       [,2]        [,3]
[1,] -0.09855292  0.2665513  0.72873888
[2,] -0.03005994 -0.4834303 -1.12356622
[3,]  0.98443875  0.5895932  0.07072777
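Applied to the simulation in the question, the list approach might look like the sketch below. mu and tR are not given in the question, so zero means and an identity matrix are used as placeholders, and the sizes are shrunk so it runs instantly.

```r
set.seed(1)  # only to make the sketch reproducible

n_mat <- 3   # 50 in the question
n_row <- 5   # 50000 in the question
n_col <- 10

mu <- rep(0, n_col)  # placeholder mean vector (assumption)
tR <- diag(n_col)    # placeholder transform matrix (assumption)

mc_list <- lapply(seq_len(n_mat), function(k) {
  # Draw all random numbers for one matrix at once: one column per simulated row
  z <- matrix(rnorm(n_col * n_row), nrow = n_col)
  # mu is recycled down each column; transpose so each draw becomes a row
  t(mu + tR %*% z)
})

dim(mc_list[[2]])  # dimensions of the second simulated matrix
```

Drawing the whole block of random numbers in one call replaces the inner loop over rows, and mc_list[[k]] plays the role of mCMatrixk.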
If the core issue is to generate new (numbered) variable names and assign values to them, then I think you can use this approach:
for(i in 1:3)
{
  n<- sprintf("matr%d",i)
  print(n)
  assign(x=n,value = i)
}
matr1
matr2
matr3
R is built around lists and data frames, which is a little different from other languages. Your easiest method is to create a list of matrix names and iterate through the list.
Rawr's approach is the simplest and probably most effective.
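Then you simply access it by mlist[[n]], n being the index of the matrix you want (mlist[n] with single brackets would return a one-element list instead of the matrix itself).
If you want a complete data frame approach, it's a little more complicated, but it gives a data table with indices rather than a list of matrices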
library(dplyr)
yourData <- data.frame()
for (k in 1:50) {
  yourData <- yourData %>%
    rbind((as.data.frame(matrix(rnorm(50000 * 10), nrow=50000, ncol=10))) %>%
            mutate(Run = k))
}
That way you could access it as
yourData %>% filter(Run == n)

Filling matrix in R without for loops

I am a beginner in R and I know the way I have done this is wrong and slow. I would like to fill a matrix and I need to compute each term. I have tried two for loops; here is the code. Do you know a better way to do it?
KernelGaussianMatrix <- function(x,delta){
  Mat = matrix(0,nrow=length(x),ncol=length(x))
  for (i in 1:length(x)){
    for (j in 1:length(x)){
      Mat[i,j] = KernelGaussian(x[i],x[j],delta)
    }
  }
  return(Mat)
}
Thx
You want to use the function outer, as in:
Mat <- outer(x,x,KernelGaussian,delta)
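Note that any arguments after the third argument to outer are passed on as additional arguments to the function given as the third argument. Also note that outer calls that function once with two long vectors rather than once per pair, so KernelGaussian must be vectorised over its first two arguments.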
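As a quick check that outer reproduces the double loop, here is a small sketch. The question does not show KernelGaussian, so the definition below is a hypothetical Gaussian kernel used only for illustration.

```r
# Hypothetical kernel; the real KernelGaussian is not shown in the question
KernelGaussian <- function(a, b, delta) exp(-(a - b)^2 / (2 * delta^2))

x <- c(0, 0.5, 2)
delta <- 1

# outer() evaluates the kernel on every (x[i], x[j]) pair in one call;
# delta is forwarded as the kernel's third argument
mat_outer <- outer(x, x, KernelGaussian, delta)

# Double loop from the question, for comparison
mat_loop <- matrix(0, nrow = length(x), ncol = length(x))
for (i in seq_along(x)) {
  for (j in seq_along(x)) {
    mat_loop[i, j] <- KernelGaussian(x[i], x[j], delta)
  }
}

all.equal(mat_outer, mat_loop)
```

If your real kernel only handles scalars, wrapping it in Vectorize() before passing it to outer is one workaround, at the cost of losing most of the speed gain.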
If a for loop is required to generate the values, then your method is fine.
If the values are already in an array values you can try mat = matrix(values, nrow=n, ncol=p) or something similar.

Using a for loop to append to an empty object in R

This may seem like a novice question, but I'm struggling to understand why this doesn't work.
answer = c()
for(i in 1:8){
  answer = c()
  knn.pred <- knn(data.frame(train_week$Lag2), data.frame(test_week$Lag2), train_week$Direction, k=i)
  test <- mean(knn.pred == test_week$Direction)
  append(answer, test)
}
I want the results for 1-8 in a vector called answer. It should loop through 8 times, so ideally a vector with 8 numbers would be my output. When I run the for loop, I only get the final answer, meaning it isn't appending. Any help would be appreciated. Sorry for the novice question, I'm really trying to learn R.
First of all, please include a reproducible example in your question next time. See How to make a great R reproducible example?.
Second, you set answer to c() in the first line of your loop, so this happens in each iteration.
Third, append, just like almost all functions in R, does not modify its argument in place, but it returns a new object. So the correct code is:
answer = c()
for (i in 1:8){
  knn.pred <- knn(data.frame(train_week$Lag2), data.frame(test_week$Lag2),
                  train_week$Direction, k = i)
  test <- mean(knn.pred == test_week$Direction)
  answer <- append(answer, test)
}
While this wasn't the question, I can't help noting that this is a very inefficient way of creating vectors and lists, because each call to append copies the vector built so far. It is an anti-pattern. If you know the length of the result vector, then allocate it up front and set its elements. For example:
answer = numeric(8)
for (i in 1:8){
  knn.pred <- knn(data.frame(train_week$Lag2), data.frame(test_week$Lag2),
                  train_week$Direction, k = i)
  test <- mean(knn.pred == test_week$Direction)
  answer[i] <- test
}
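The same preallocate-and-fill idea can be expressed with vapply, which builds the result vector for you and checks that each iteration returns a single number. The sketch below uses a made-up accuracy_for_k function in place of the knn call, since the question's data is not available.

```r
# Hypothetical stand-in for "fit knn with k neighbours and return the accuracy"
accuracy_for_k <- function(k) 1 / k

# One numeric value per k, collected into a vector of length 8
answer <- vapply(1:8, accuracy_for_k, numeric(1))
answer
```

With the real data, the body of accuracy_for_k would contain the knn call and the mean comparison from the loop above.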
You are overwriting answer inside the for loop. Try removing that line. Also, append doesn't act on its arguments directly; it returns the modified vector. So you need to assign it.
answer <- c()
for(i in 1:8){
  knn.pred <- knn(data.frame(train_week$Lag2), data.frame(test_week$Lag2), train_week$Direction, k=i)
  test <- mean(knn.pred == test_week$Direction)
  answer <- append(answer, test)
}
