R fill several empty matrices with loop

I would like to fill several empty matrices using a loop.
First I created the empty matrices, which works fine:
for(q in (15:30)){
assign(paste0("P",q), matrix(, nrow = q, ncol = q+1))}
But when I try to fill these matrices using my formula, I get a dimension error:
for(c in (1:q+1)){
for(i in (1:q)){assign(paste0("P",q)[i,c],
((((((q-c) + 1 -(q-c+1- i))/q)^.69)/(((((q-c) + 1 - (q-c+1-i))/q)^.69+(((1 - ((q-c) + 1 -(q-c+1-i))/q))^.69))^(1/.69))) - (((((q-c)-(q-c+1-i))/q)^.69)/(((((q-c) - (q-c+1-i))/q)^.69+(((1 - ((q-c)-(q-c+1-i))/q))^.69))^(1/.69)))))}}}
However, the same loop works when I use it for a single matrix, e.g.:
t <- 20
c <- 1
i <- 1
for(c in (1:t+1)){
for(i in (1:t)){P20[i,c]<-( (((((t-c) + 1 -(t-c+1-i))/t)^.69)/
(((((t-c) + 1 - (t-c+1-i))/t)^.69+(((1 - ((t-c) + 1 -(t-c+1-i))/t))^.69))^(1/.69))) -
(((((t-c)-(t-c+1-i))/t)^.69)/(((((t-c) - (t-c+1-i))/t)^.69+(((1 - ((t-c)-(t-c+1-i))/t))^.69))^(1/.69))))}}
The formula gives the probability weights according to Cumulative Prospect Theory, if anyone is interested.
Does anyone have an idea how I can make this more elegant? Would it be better to write a user-defined function?

If you are happy to have the resulting matrices in a list, under the same names you were assigning to, you could do something like:
l = lapply(15:30, function(q){
  t = q
  # expand.grid varies its first column (the row index i) fastest,
  # so the values arrive in column-major order and must be filled column-wise
  matrix(apply(expand.grid(1:q,1:(q+1)),1,
    function(x){
      i = x[1]
      c = x[2]
      ( (((((t-c) + 1 -(t-c+1-i))/t)^.69)/
      (((((t-c) + 1 - (t-c+1-i))/t)^.69+(((1 - ((t-c) + 1 -(t-c+1-i))/t))^.69))^(1/.69))) -
      (((((t-c)-(t-c+1-i))/t)^.69)/(((((t-c) - (t-c+1-i))/t)^.69+(((1 - ((t-c)-(t-c+1-i))/t))^.69))^(1/.69))))
    }),nrow = q, ncol = q+1, byrow = FALSE)
})
names(l) = paste0("P",15:30)
I have used bits like t=q and i=x[1]; c=x[2] so that I could just copy and paste your formula for the probability.
Here lapply loops over the row counts given in your question, and expand.grid generates the index pairs for every cell of the resulting matrix. To each pair we apply a function that, given row i and column c, calculates the probability according to your formula. The values are then cast to a matrix so that the result has the appropriate structure.
You end up with a list l of matrices with components called "P15", "P16", ...
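On the question of a user-defined function: algebraically, (q-c) + 1 - (q-c+1-i) reduces to i and (q-c) - (q-c+1-i) reduces to i - 1, so each entry is w(i/q) - w((i-1)/q) for a weighting function w and does not depend on the column c. A minimal sketch under that observation follows. The names w, make_P and gamma are made up for illustration, and filling every column (including the first, which the loop over 1:q+1 never reaches) is an assumption about what is wanted.

```r
# Probability weighting function with curvature gamma (0.69 in the question)
w <- function(p, gamma = 0.69) {
  p^gamma / (p^gamma + (1 - p)^gamma)^(1 / gamma)
}

# Build the q x (q+1) matrix; every column holds the same q weights
make_P <- function(q, gamma = 0.69) {
  i <- seq_len(q)
  weights <- w(i / q, gamma) - w((i - 1) / q, gamma)
  # weights has length q, so it is recycled once per column
  matrix(weights, nrow = q, ncol = q + 1)
}

P <- lapply(15:30, make_P)
names(P) <- paste0("P", 15:30)
```

Because the differences telescope, the weights in each column sum to 1, which makes a quick sanity check with colSums(P$P20).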

Related

Replacing the values of Subsetted Matrix with an another matrix in R

I have an all-zero sparse matrix K1 with dimensions 9x3. I want to replace certain values of this matrix with another matrix. Also, instead of hard-coded numeric indices, I index with variables to make it more dynamic. The code is as follows -
n <- 3
library(Matrix)
K1 <- Matrix(0, n*n, n*(n-1)/2, sparse = TRUE)
for (i in 1:(n - 1)) {
K1[2 + (i - 1)*(n + 1):i*n,
1 + (i - 1)*(n - i/2):i*(n - i)*(i + 1)/2] <- diag(n - i)
}
However, it shows the error -
Error in replCmat4(x, i1 = if (iMi) 0:(di[1] - 1L) else .ind.prep2(i, :
too many replacement values
Sometimes I get this error as well -
Error in intI(i, n = di[margin], dn = dn[[margin]], give.dn = FALSE) :
index larger than maximal 9
But when I run similar code in MATLAB, it runs perfectly. MATLAB code -
n = 3
K1 = sparse(n*n,n*(n-1)/2);
for i = 1:n-1
K1(2+(i-1)*(n+1):i*n,1+(i-1)*(n-i/2):i*n-i*(i+1)/2) = eye(n-i);
end
And the output which MATLAB gives is -
K1 =
(2,1) 1.00
(3,2) 1.00
(6,3) 1.00
Thus, above is my desired output as well.
Can someone tell me what is going wrong when I try to execute the same in R?
I appreciate the help. Thanks.
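Wrap each index expression in parentheses; otherwise R and MATLAB interpret them differently. In R the colon operator binds more tightly than *, / and binary + and -, whereas in MATLAB it binds more loosely than all of them.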
K1[(2+(i-1)*(n+1)):(i*n), (1+(i-1)*(n-i/2)):(i*(n-i)*(i+1)/2)]
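To see the precedence difference in isolation, here is a small sketch. It uses a dense base R matrix instead of a sparse Matrix object so that it runs without extra packages (an assumption made only for illustration), and it takes the column upper bound from the MATLAB expression i*n - i*(i+1)/2.

```r
n <- 3

# In R, ":" is evaluated before "-", so these two differ
a <- 1:n - 1      # (1:n) - 1, i.e. 0 1 2
b <- 1:(n - 1)    # 1 2

K1 <- matrix(0, n * n, n * (n - 1) / 2)
for (i in 1:(n - 1)) {
  # Parenthesise both ends of each range explicitly
  rows <- (2 + (i - 1) * (n + 1)):(i * n)
  cols <- (1 + (i - 1) * (n - i / 2)):(i * n - i * (i + 1) / 2)
  K1[rows, cols] <- diag(n - i)
}

# Positions of the non-zero entries
nz <- which(K1 != 0, arr.ind = TRUE)
```

Here nz lists the (row, column) pairs of the ones, which can be compared with the MATLAB output in the question.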

Avoiding a loop when populating data frames in R

I have an empty data frame T_modelled with 2784 columns and 150 rows.
T_modelled <- data.frame(matrix(ncol = 2784, nrow = 150))
names(T_modelled) <- paste0("t=", t_sec_ERT)
rownames(T_modelled) <- paste0("z=", seq(from = 0.1, to = 15, by = 0.1))
where
t_sec_ERT <- seq(from = -23349600, to = 6706800, by = 10800)
z <- seq(from = 0.1, to = 15, by = 0.1)
I filled T_modelled by column with a nested for loop, based on a formula:
for (i in 1:ncol(T_modelled)) {
col_tmp <- colnames(T_modelled)[i]
for (j in 1:nrow(T_modelled)) {
z_tmp <- z[j]-0.1
T_tmp <- MANSRT+As*e^(-z_tmp*(omega/(2*K))^0.5)*sin(omega*t_sec_ERT[i]-((omega/(2*K))^0.5)*z_tmp)
T_modelled[j ,col_tmp] <- T_tmp
}
}
where
MANSRT <- -2.051185
As <- 11.59375
omega <- (2*pi)/(347.875*24*60*60)
c <- 790
k <- 0.00219
pb <- 2600
K <- (k*1000)/(c*pb)
e <- exp(1)
I do get the desired results but I keep thinking there must be a more efficient way of filling that data frame. The loop is quite slow and looks cumbersome to me. I guess there is an opportunity to take advantage of R's vectorized way of calculating. I just cannot see myself how to incorporate the formula in an easier way to fill T_modelled.
Anyone got any ideas how to get the same result in a faster, more "R-like" manner?
I believe this does it.
Run this first instruction right after creating T_modelled; it is needed to test that the results are equal.
Tm <- T_modelled
Now run your code then run the code below.
z_tmp <- z - 0.1
for (i in 1:ncol(Tm)) {
T_tmp <- MANSRT + As*exp(-z_tmp*(omega/(2*K))^0.5)*sin(omega*t_sec_ERT[i]-((omega/(2*K))^0.5)*z_tmp)
Tm[ , i] <- T_tmp
}
all.equal(T_modelled, Tm)
#[1] TRUE
You don't need the inner loop, that's the only difference.
(I also used exp directly but that is of secondary importance.)
Much like the solution to your previous question, which you accepted, consider simply using sapply to iterate through the vector t_sec_ERT, which has the same length as the number of columns in your desired data frame. But first subtract 0.1 from every element of z. Also, there's no need to create an empty data frame beforehand.
z_adj <- z - 0.1
T_modelled2 <- data.frame(sapply(t_sec_ERT, function(ert)
MANSRT+As*e^(-z_adj*(omega/(2*K))^0.5)*sin(omega*ert-((omega/(2*K))^0.5)*z_adj)))
colnames(T_modelled2) <- paste0("t=", t_sec_ERT)
rownames(T_modelled2) <- paste0("z=", z)
all.equal(T_modelled, T_modelled2)
# [1] TRUE
Rui is of course correct, I just want to suggest a way of reasoning when writing a loop like this.
You have two numeric vectors. Functions for numerics in R are usually vectorized. By which I mean you can do stuff like this
x <- c(1, 6, 3)
sum(x)
not needing something like this
x_ <- 0
for (i in x) {
x_ <- i + x_
}
x_
That is, there is no need for explicit looping in R. Of course looping takes place nonetheless; it just happens in the underlying C, Fortran etc. code, where it can be done more efficiently. This is usually what we mean when we call a function vectorized: looping takes place "under the hood", as it were. The output of Vectorize() thus isn't strictly vectorized by this definition.
When you have two numeric vectors you want to loop over, first check whether the constituent functions are vectorized, usually by reading the docs.
If they are, construct the central vectorized compound function and start testing it with one vector and one scalar. In your case it would be something like this (testing with just the first element of t_sec_ERT).
z_tmp <- z - 0.1
i <- 1
T_tmp <- MANSRT + As *
exp(-z_tmp*(omega/(2*K))^0.5) *
sin(omega*t_sec_ERT[i] - ((omega/(2*K))^0.5)*z_tmp)
Looks OK. Then you start looping over the elements of t_sec_ERT.
T_tmp <- matrix(nrow=length(z), ncol=length(t_sec_ERT))
for (i in 1:length(t_sec_ERT)) {
T_tmp[, i] <- MANSRT + As *
exp(-z_tmp*(omega/(2*K))^0.5) *
sin(omega*t_sec_ERT[i] - ((omega/(2*K))^0.5)*z_tmp)
}
Or you can do it with sapply() which is often neater.
f <- function(x) {
MANSRT + As *
exp(-z_tmp*(omega/(2*K))^0.5) *
sin(omega*x - ((omega/(2*K))^0.5)*z_tmp)
}
T_tmp <- sapply(t_sec_ERT, f)
I would prefer to put the data in a long format, with all combinations of z and t_sec_ERT as two columns, in order to take advantage of vectorization. Although I usually prefer tidyr for switching between long and wide formats, I've tried to keep this as a base solution:
t_sec_ERT <- seq(from = -23349600, to = 6706800, by = 10800)
z <- seq(from = 0.1, to = 15, by = 0.1)
v <- expand.grid(t_sec_ERT, z)
names(v) <- c("t_sec_ERT", "z")
v$z_tmp <- v$z-0.1
v$T_tmp <- MANSRT+As*e^(-v$z_tmp*(omega/(2*K))^0.5)*sin(omega*v$t_sec_ERT-((omega/(2*K))^0.5)*v$z_tmp)
T_modelled <- data.frame(matrix(v$T_tmp, nrow = length(z), ncol = length(t_sec_ERT), byrow = TRUE))
names(T_modelled) <- paste0("t=", t_sec_ERT)
rownames(T_modelled) <- paste0("z=", seq(from = 0.1, to = 15, by = 0.1))
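Another option, shown here only as a sketch, removes the loop entirely by building the sine argument for all (z, t) pairs with outer(). The names d, z_adj, T_mat and T_modelled3 are introduced for illustration; the constants are copied from the question.

```r
MANSRT <- -2.051185
As <- 11.59375
omega <- (2 * pi) / (347.875 * 24 * 60 * 60)
K <- (0.00219 * 1000) / (790 * 2600)

t_sec_ERT <- seq(from = -23349600, to = 6706800, by = 10800)
z <- seq(from = 0.1, to = 15, by = 0.1)

z_adj <- z - 0.1
d <- sqrt(omega / (2 * K))   # damping term shared by exp() and sin()

# outer() gives a length(z) x length(t_sec_ERT) matrix of sine arguments;
# exp(-z_adj * d) has one value per row and is recycled down each column
T_mat <- MANSRT + As * exp(-z_adj * d) *
  sin(outer(-z_adj * d, omega * t_sec_ERT, "+"))

dimnames(T_mat) <- list(paste0("z=", z), paste0("t=", t_sec_ERT))
T_modelled3 <- as.data.frame(T_mat)
```

Keeping the result as the matrix T_mat is usually the cheaper choice for purely numeric data; the conversion to a data frame is only there to match the shape used in the question.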

Multiply unique pairs of values in a vector and sum the result

I want to multiply and then sum the unique pairs of a vector, excluding pairs made of the same element, such that for c(1:4):
(1*2) + (1*3) + (1*4) + (2*3) + (2*4) + (3*4) == 35
The following code works for the example above:
x <- c(1:4)
bar <- NULL
for( i in 1:length(x)) { bar <- c( bar, i * c((i+1) : length(x)))}
sum(bar[ 1 : (length(bar) - 2)])
However, my actual data is a vector of rational numbers, not integers, so the (i+1) portion of the loop will not work. Is there a way to look at the next element of the set after i, e.g. j, so that I could write i * c(j : length(x))?
I understand that for loops are usually not the most efficient approach, but I could not think of how to accomplish this via apply etc. Examples of that would be welcome, too. Thanks for your help.
An alternative to a loop would be to use combn and multiply the combinations using the FUN argument. Then sum the result:
sum(combn(x = 1:4, m = 2, FUN = function(x) x[1] * x[2]))
# [1] 35
Even better to use prod in FUN, as suggested by @bgoldst:
sum(combn(x = 1:4, m = 2, FUN = prod))
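For long vectors, a closed form avoids enumerating the pairs at all: since sum(x)^2 equals sum(x^2) plus twice the sum over all unique pairs, the pair sum is (sum(x)^2 - sum(x^2)) / 2. A short sketch, where pair_sum and the sample vector y are made-up names and values for illustration:

```r
# Sum of products over all unique pairs, without forming the pairs
pair_sum <- function(x) {
  (sum(x)^2 - sum(x^2)) / 2
}

pair_sum(1:4)   # the example from the question

# Works the same way for non-integer data
y <- c(1.5, 2.25, 0.5, 4)
same <- isTRUE(all.equal(pair_sum(y), sum(combn(y, 2, FUN = prod))))
```

This runs in time linear in length(x), whereas combn has to visit every pair.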

Create a function that takes in a vector and returns a matrix in R

I am trying to create a function that will take in a vector k and return a matrix with dimensions length(distMat[1,]) by length(k). distMat is a huge matrix and indSpam is a long vector. In my case, length(distMat[1,]) is 2412. When I pass k as a vector of length one, I get a vector of length 2412. I want to be able to pass k as a vector of length two and get a 2412x2 matrix. I am trying to use a while loop to go through the elements of k, but it only returns a vector of length 2412. What am I doing wrong?
predNeighbor = function(k, distMat, indSpam){
counter = 1
while (counter<(length(k)+1))
{
preMatrix = apply(distMat, 1, order)
orderedMatrix = t(preMatrix)
truncate = orderedMatrix[,1:k[counter]]
checking = indSpam[truncate]
checking2 = matrix(checking, ncol = k[counter])
number = apply(checking2, 1, sum)
return(number[1:length(distMat[1,])] > (k[counter]/2))
counter = counter + 1
}
}
You wrote: "I am trying to create a function that will take in a vector k and return to me a matrix with dimensions length(distMat[1,]) by length(k)."
Here's a function that does this.
foo <- function(k, distMat) {
return(matrix(0, nrow = length(distMat[1, ]), ncol = length(k)))
}
If you have other requirements, please describe them in words.
Based on your comment, I think I understand your goal better. You have a function that returns a vector of length k and you want to save its output as rows in a matrix. This is a pretty common task. Let's do a simple example where k starts out as 1:10, and say we want to add some noise to it with a function noise_and_rank() and see how the rank changes.
In the case where the input to the function is always the same, replicate() works very well. It will automatically put everything in a matrix:
k <- 1:10
noise_and_rank <- function(k) {
rank(k + runif(length(k), min = -2, max = 2))
}
results <- replicate(n = 8, expr = {noise_and_rank(k)})
In the case where you want to iterate, i.e., the output from one step is the input for the next, a for loop is good, and we just pre-allocate a matrix of 0's to fill in one column/row at a time:
k <- 1:10
n.sim <- 8
results <- matrix(0, nrow = length(k), ncol = n.sim)
results[, 1] <- k
for(i in 2:n.sim) {
results[, i] <- noise_and_rank(results[, i - 1])
}
What your original question seems to be about is how to do the pre-allocation. If the input is always the same, using replicate() means you don't need to worry about it. If the input is different each time, pre-allocate using matrix(); you don't need to write any special function.
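As for why the function in the question only ever returns one vector: return() sits inside the while loop, so the function exits during the first pass and the later elements of k are never reached. One possible rewrite, offered as a sketch rather than the intended implementation, computes the ordering once and uses sapply over k. The helper name votes_for_k is made up, and the result has one row per row of distMat, which assumes distMat is square as a distance matrix normally is.

```r
predNeighbor <- function(k, distMat, indSpam) {
  # Row i lists the column indices of distMat[i, ] from nearest to farthest
  ordered <- t(apply(distMat, 1, order))

  # Majority vote among the kk nearest neighbours of every row
  votes_for_k <- function(kk) {
    nearest <- ordered[, seq_len(kk), drop = FALSE]
    spam_flags <- matrix(indSpam[nearest], ncol = kk)
    rowSums(spam_flags) > kk / 2
  }

  # One column per element of k
  sapply(k, votes_for_k)
}

# Tiny illustrative data set (hypothetical values)
distMat <- as.matrix(dist(c(0, 1, 2, 10)))
indSpam <- c(TRUE, TRUE, FALSE, FALSE)
res <- predNeighbor(c(1, 3), distMat, indSpam)
```

Computing the ordering outside the per-k helper also avoids repeating the expensive apply(distMat, 1, order) for every element of k.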

R double loop with where

I have the following data frame and variables:
u0 <- c(1,1,1,1,1)
df <- data.frame (u0)
a = .793
b = 2.426
r = 0.243
q = 1
w = 2
j = 1
z = .314
Using the following loop I do some calculations and put the results in the first row of my data frame.
while (j<5){
df[q,w] <- df[q, w-1] * (r+j-1)*(b+j-1)*(z) / ((a+b+j-2)*j)
j = j + 1
w = w + 1
}
Now I want to create another loop to do the same calculations for all rows of my data frame (i.e. I need the 'q' variable to vary). I would be thankful for any help.
You could either do this by putting your while loop inside of a for loop that goes over q, but a more R-tastic way would be to simply define q <- 1:5, and leave the rest of your code as-is. Then df will fill up entirely. I take it in this example you want all rows to be identical?
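A minimal sketch of the vectorised variant, assuming the starting column u0 and the constants from the question. The while loop is written as a for loop over j here, which is a stylistic substitution and not part of the original suggestion.

```r
u0 <- c(1, 1, 1, 1, 1)
df <- data.frame(u0)
a <- 0.793
b <- 2.426
r <- 0.243
z <- 0.314

q <- seq_len(nrow(df))   # all rows at once instead of a single row index
w <- 2                   # first column to be filled
for (j in 1:4) {
  # Each step fills column w for every row from column w - 1
  df[q, w] <- df[q, w - 1] * (r + j - 1) * (b + j - 1) * z / ((a + b + j - 2) * j)
  w <- w + 1
}
```

Since every row starts from the same value in u0, all rows of df end up identical, as the answer suspects.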
Can't you just put it in a for loop?
df <- data.frame (d1=u0, d2=u0+1, d3=u0+2, d4=u0+4, d5=u0+5)
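for (q in 2:5) {
  # Reset the counters for every row; otherwise the while loop only runs for the first q
  j <- 1
  w <- 2
  while (j < 5) {
    df[q, w] <- df[q, w-1] * (r+j-1)*(b+j-1)*(z) / ((a+b+j-2)*j)
    j <- j + 1
    w <- w + 1
  }
}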
You may want to check the algorithm. It doesn't seem to be doing anything very interesting.
