How to skip an error in a loop - r

I want to skip an error (if there is any) in a loop and continue with the next iteration. I want to compute 100 inverse matrices of a 2 by 2 matrix with elements randomly sampled from {0, 1, 2}. It is possible to get a singular matrix, for example:
1 0
2 0
Here is my code:
set.seed(1)
count <- 1
inverses <- vector(mode = "list", 100)
repeat {
x <- matrix(sample(0:2, 4, replace = T), 2, 2)
inverses[[count]] <- solve(x)
count <- count + 1
if (count > 100) break
}
At the third iteration, the matrix is singular and the code stops running with an error message. In practice, I would like to bypass this error and continue with the next iteration. I know I need to use try or tryCatch, but I don't know how to use them. Similar questions have been asked here, but they are all really complicated and the answers are far beyond my understanding. If someone could give me complete code specifically for this question, I would really appreciate it.

This would put NULLs into inverses for the singular matrices:
inverses[[count]] <- tryCatch(solve(x), error=function(e) NULL)
If the first expression in a call to tryCatch raises an error, it executes and returns the value of the function supplied to its error argument. The function supplied to the error arg has to take the error itself as an argument (here I call it e), but you don't have to do anything with it.
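You could then drop the NULL entries with inverses[!sapply(inverses, is.null)]. (Calling is.null(inverses) directly tests the list as a whole and returns a single FALSE, so the test has to be applied element by element.)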
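Put together with the loop from the question, this could look like the following minimal sketch. One detail is an addition to the line above: assigning NULL with inverses[[i]] <- NULL removes the element instead of storing a NULL, so the sketch wraps the result in list() and assigns with single brackets. The names n, ok, and invertible are only illustrative.

```r
set.seed(1)
n <- 100
inverses <- vector(mode = "list", length = n)

for (i in seq_len(n)) {
  x <- matrix(sample(0:2, 4, replace = TRUE), 2, 2)
  # tryCatch() returns NULL when solve() fails; list() keeps that NULL
  # as a placeholder instead of deleting element i
  inverses[i] <- list(tryCatch(solve(x), error = function(e) NULL))
}

# TRUE for the iterations where the matrix could be inverted
ok <- !vapply(inverses, is.null, logical(1))
invertible <- inverses[ok]
length(invertible)
```

The list keeps its length of 100, so ok also tells you which iterations produced a singular matrix.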
Alternatively, you could use the lower level try. The choice is really a matter of taste.
count <- 0
repeat {
if (count == 100) break
count <- count + 1
x <- matrix(sample(0:2, 4, replace = T), 2, 2)
x.inv <- try(solve(x), silent=TRUE)
if ('try-error' %in% class(x.inv)) next
else inverses[[count]] <- x.inv
}
If your expression generates an error, try returns an object with class try-error. It will print the message to screen if silent=FALSE. In this case, if x.inv has class try-error, we call next to stop the execution of the current iteration and move to the next one, otherwise we add x.inv to inverses.
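To see what try actually hands back, here is a small sketch using the singular matrix from the question. The names res, failed, and msg are made up for the illustration, and inherits() is used as an equivalent test to the %in% class() check.

```r
# The singular matrix from the question (rows: 1 0 / 2 0)
x <- matrix(c(1, 2, 0, 0), 2, 2)
res <- try(solve(x), silent = TRUE)

# TRUE when solve() raised an error
failed <- inherits(res, "try-error")

# The original error object is kept in the "condition" attribute
msg <- if (failed) conditionMessage(attr(res, "condition")) else NA_character_
print(failed)
print(msg)
```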
Edit:
You could avoid using the repeat loop with replicate and lapply.
matrices <- replicate(100, matrix(sample(0:2, 4, replace=T), 2, 2), simplify=FALSE)
inverses <- lapply(matrices, function(mat) if (det(mat) != 0) solve(mat))
It's interesting to note that the second argument to replicate is treated as an expression, meaning it gets executed afresh for each replicate. This means you can use replicate to make a list of any number of random objects that are generated from the same expression.

Instead of using tryCatch you could simply calculate the determinant of the matrix with the function det. A matrix is singular if and only if the determinant is zero.
Hence, you could test whether the determinant is different from zero and calculate the inverse only if the test is positive:
set.seed(1)
count <- 1
inverses <- vector(mode = "list", 100)
repeat {
x <- matrix(sample(0:2, 4, replace = T), 2, 2)
# if (det(x)) inverses[[count]] <- solve(x)
# a more robust replacement for the above line (see comment):
if (is.finite(determinant(x)$modulus)) inverses[[count]] <- solve(x)
count <- count + 1
if (count > 100) break
}
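As a quick check of the two tests used in the loop, here is a sketch on the singular example matrix from the question. By default determinant() returns the logarithm of the modulus, which is -Inf for a zero determinant, so is.finite() is FALSE for a singular matrix. The names singular and regular are illustrative.

```r
singular <- matrix(c(1, 2, 0, 0), 2, 2)  # rows: 1 0 / 2 0
regular  <- matrix(c(1, 0, 0, 2), 2, 2)  # rows: 1 0 / 0 2

det(singular)

# determinant() works on the log scale unless logarithm = FALSE is given
log_mod_singular <- as.numeric(determinant(singular)$modulus)
log_mod_regular  <- as.numeric(determinant(regular)$modulus)

is.finite(log_mod_singular)  # FALSE, so solve() would be skipped
is.finite(log_mod_regular)   # TRUE, so solve() would be called
```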
Update:
It is, however, possible to avoid generating singular matrices. The determinant of a 2-by-2 matrix mat is defined as mat[1] * mat[4] - mat[3] * mat[2]. You could use this knowledge for sampling random numbers. Just do not sample numbers which will produce a singular matrix. This, of course, depends on the numbers sampled before.
set.seed(1)
count <- 1
inverses <- vector(mode = "list", 100)
set <- 0:2 # the set of numbers to sample from
repeat {
# sample the first value
x <- sample(set, 1)
# if the first value is zero, the second and third one are not allowed to be zero.
# (if/else is needed here: ifelse() would return only one element of the set)
new_set <- if (x == 0) setdiff(set, 0) else set
# sample the second and third value
x <- c(x, sample(new_set, 2, replace = T))
# calculate which 4th number would result in a singular matrix
not_allowed <- abs(-x[3] * x[2] / x[1])
# remove this number from the set
new_set <- setdiff(0:2, not_allowed)
# sample the fourth value and build the matrix
x <- matrix(c(x, sample(new_set, 1)), 2, 2)
inverses[[count]] <- solve(x)
count <- count + 1
if (count > 100) break
}
This procedure guarantees that all generated matrices have an inverse.
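A different way to reach the same guarantee, sketched here as an alternative and not as the method above, is rejection sampling: draw a matrix and redraw whenever its determinant is zero. The helper name draw_invertible is hypothetical.

```r
set.seed(1)

# Draw 2-by-2 matrices from `set` until one with a non-zero determinant appears
draw_invertible <- function(set = 0:2) {
  repeat {
    x <- matrix(sample(set, 4, replace = TRUE), 2, 2)
    # Integer arithmetic, so the comparison with zero is exact
    if (x[1] * x[4] - x[3] * x[2] != 0) return(x)
  }
}

inverses <- lapply(seq_len(100), function(i) solve(draw_invertible()))
length(inverses)
```

Unlike the step-by-step sampling, this draws uniformly from all invertible matrices over the set, at the cost of some discarded draws.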

try is just a way of telling R: "If you commit an error inside the following parentheses, then skip it and move on."
So if you're worried that x <- matrix(sample(0:2, 4, replace = T), 2, 2) might give you an error, then all you have to do is:
try(x <- matrix(sample(0:2, 4, replace = T), 2, 2))
However, keep in mind then that x will be undefined if you do this and it ends up not being able to compute the answer. That could cause a problem when you get to solve(x) - so you can either define x before try or just "try" the whole thing:
try(
{
x <- matrix(sample(0:2, 4, replace = T), 2, 2)
inverses[[count]] <- solve(x)
}
)

The documentation for try explains your problem pretty well. I suggest you go through it completely.
Edit: The documentation example looked pretty straightforward and very similar to the op's question. Thanks for the suggestion though. Here goes the answer following the example in the documentation page:
# `idx` is used as a dummy variable here just to illustrate that
# all 100 entries are indeed calculated. You can remove it.
set.seed(1)
mat_inv <- function(idx) {
print(idx)
x <- matrix(sample(0:2, 4, replace = T), nrow = 2)
solve(x)
}
inverses <- lapply(1:100, function(idx) try(mat_inv(idx), TRUE))

Related

R - Replacing a variable with its value in every iteration of a loop

Somewhat new to R (coming from SQL), trying to write a quick loop to generate a series of functions that perform a task a specified number of times (i.e. for function 2, do something 2 times, function 3 -> 3 times, etc.).
My issue is arising in that I'm using the iterative variable (in the below code, 'k') as a part of the lower-level function - as such, when I go to evaluate ANY of the subsequently generated functions, it returns the value of the function as of the last value of k (here, 4), no matter if I call function 2, 3, or 4.
My question is therefore how can I substitute the value of 'k' for the variable 'k' when I'm generating the lower level functions? For example, on the first iteration, when k = 2, I want to substitute '2' for every occurrence of 'k' in the lower level function, such that when function 2 is run later, it sees the value of '2', rather than the last value of 'k'?
Code below (note that the 'x' value the function will evaluate is a 1 column, variable row matrix):
x <- as.matrix(11:20)
for (k in 2:4) {
actvfun <- NULL
actvfun <- function(x) {
actv <- NULL
actvmtx <- NULL
actvmtx <- as.matrix(x)
for (j in 2:(k+1)) {
actv <- rep(NA, length(x))
for (i in j:length(x)) {
actv[i] <- x[i - (j - 1)]
}
actvmtx <- as.matrix(cbind(actvmtx, actv))
}
assign(paste("lag0", k, "av", sep=""), actvmtx)
return(apply(get(paste("lag0", k, "av", sep="")), 1, mean, na.rm=T))
assign(paste("LAGTEST0", k, "AV", sep=""), apply(actvmtx, 1, mean, na.rm=T))
}
assign(paste("v5LAGTEST0", k, ".av", sep=""), actvfun)
}
v5LAGTEST02.av(x)
v5LAGTEST03.av(x)
The last two items are the checks I was running - currently both return the result using k = 4, rather than their respective values of 2 and 3.
Any help is greatly appreciated - I know loops are somewhat frowned upon in R (as is 'assign', but I'm not sure how else to achieve the desired result of variable function names), so I'm certainly open to new suggestions!
Thanks,
Nate
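One common way to get this behaviour is a function factory: a function that takes k and returns the worker function, so that each worker keeps its own copy of k instead of looking up the global k when it is called. The sketch below assumes the intent of the code above is the row-wise mean of x and its lags 1 to k; the names make_lag_mean and lag_funs are made up for the illustration.

```r
make_lag_mean <- function(k) {
  force(k)  # evaluate k now, so the returned function keeps this value
  function(x) {
    x <- as.numeric(x)
    n <- length(x)
    # Column 1 is x itself; column lag + 1 is x shifted down by `lag` rows
    lagged <- sapply(0:k, function(lag) c(rep(NA_real_, lag), head(x, n - lag)))
    rowMeans(lagged, na.rm = TRUE)
  }
}

# A named list of functions instead of assign()-ed global names
lag_funs <- lapply(2:4, make_lag_mean)
names(lag_funs) <- paste0("lag0", 2:4, "av")

x <- as.matrix(11:20)
lag_funs$lag02av(x)
lag_funs$lag03av(x)
```

Keeping the functions in a list also avoids building variable names with paste() and assign().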

R: Compute on previous elements of an ordered vector [closed]

Given an ordered vector vec <- c(1, 4, 6, 3, 2, 7), I want to compute for each element i of vec the weighted average of the previous elements where the weight is the inverse of the distance from the element i.
The function should proceed as follows.
For the first element 1, it should return NA (no previous element).
For the second element 4, it should return 1.
For the third element 6, it should return weighted.mean(x = c(1,4), w = c(1,2)).
For the fourth element 3, it should return weighted.mean(x = c(1,4,6), w = c(1,2,3)).
The resulting vector result, with length(result) == length(vec), should be c(NA, 1, 3, 4.5, 3.9, 3.266667).
UPDATE:
To be clear, I mean without using a loop such as this one:
result <- numeric()
for (i in 1:length(vec)) {
if (i == 1) {
result <-
c(result, NA)
} else {
previous_elements <- vec[1:(i-1)]
result <-
c(result,
weighted.mean(x = previous_elements, w = 1:length(previous_elements)))
}
}
Here's a naive implementation. Create a function that does what you say; the only 'clever' thing is to use the function seq_len() instead of 1:i to generate the indexes
fun = function(i, vec)
weighted.mean(head(vec, i - 1), w=seq_len(i - 1))
and then use it in sapply
sapply(seq_along(vec), fun, vec)
This is good enough -- NaN as the first element, rather than NA, but that's easily corrected after the fact (or conceptually accepted as the right answer). It's also better than your solution, but still 'using a loop' -- the management of the result vector is done by sapply(), rather than in your loop where you have to manage it yourself. And in particular your 'copy and append' approach is very bad performance-wise, making a copy of the existing result each time through the loop. It's better to pre-allocate a result vector of the appropriate length result = numeric(length(vec)) and then fill it result[[i]] = ..., and better still to just let sapply() do the right thing for you!
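The pre-allocation advice can be sketched as follows, using the vec from the question. Starting the loop at the second element also leaves the first entry as NA instead of NaN.

```r
vec <- c(1, 4, 6, 3, 2, 7)

# Allocate the full result once; the first entry stays NA
result <- rep(NA_real_, length(vec))

for (i in seq_along(vec)[-1]) {
  prev <- seq_len(i - 1)
  # Fill in place instead of copying and appending
  result[[i]] <- weighted.mean(vec[prev], w = prev)
}
result
```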
The problem is that the naive implementation scales quadratically -- you make a pass along vec to process each element, and then for each element you make a second pass to calculate the weighted mean, so there are n (n - 1) / 2 calculations. So...
Take a look at weighted.mean
> stats:::weighted.mean.default
function (x, w, ..., na.rm = FALSE)
{
## SNIP -- edited for brevity
w <- as.double(w)
if (na.rm) {
i <- !is.na(x)
w <- w[i]
x <- x[i]
}
sum((x * w)[w != 0])/sum(w)
}
and use cumsum() instead of sum() to get the cumulative weights, rather than the individual weights, i.e., return a vector as long as x, where the ith element is the weighted mean up to that point
cumweighted.mean <- function(x, w) {
## handle NA values?
w <- as.numeric(w) # to avoid integer overflow
cumsum(x * w)[w != 0] / cumsum(w)
}
You'd like something a little different
myweighted.mean <- function(x)
c(NA, cumweighted.mean(head(x, -1), head(seq_along(x), - 1)))
This makes a single pass through the data, so scales linearly (at least in theory).
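As a sanity check, the sketch below compares the cumulative version with the naive sapply() version on the vec from the question. The names are illustrative, and the [w != 0] subsetting is left out on the assumption that all weights are non-zero, as they are here.

```r
vec <- c(1, 4, 6, 3, 2, 7)

# Naive version: one weighted.mean() call per element
naive <- sapply(seq_along(vec), function(i)
  weighted.mean(head(vec, i - 1), w = seq_len(i - 1)))

# Cumulative version: a single pass with cumsum()
cum_weighted_mean <- function(x, w) {
  w <- as.numeric(w)  # avoid integer overflow in the running sums
  cumsum(x * w) / cumsum(w)
}
prev <- head(vec, -1)
fast <- c(NA, cum_weighted_mean(prev, seq_along(prev)))

# Skip the first element (NaN in one version, NA in the other)
all.equal(naive[-1], fast[-1])
```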

Simulate 5000 samples of size 5 from a normal distribution with mean 5 and standard deviation 3

I am trying to simulate 5000 samples of size 5 from a normal distribution with mean 5 and standard deviation 3. I want to then compute the mean of each sample and make a histogram of the sample means
My current code is not giving me an error but I don't think it's right:
nrSamples = 5000
e <- list(mode="vector",length=nrSamples)
for (i in 1:nrSamples) {
e[[i]] <- rnorm(n = 5, mean = 5, sd = 3)
}
sample_means <- matrix(NA, 5000,1)
for (i in 1:5000){
sample_means[i] <- mean(e[[i]])
}
Any idea on how to tackle this? I am very very new to R!
You don't need a list in this case. It is a common mistake of new R users to use lists excessively.
observations <- matrix(rnorm(25000, mean=5, sd=3), 5000, 5)
means <- rowMeans(observations)
Now means is a vector of 5000 elements.
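A self-contained version of this approach, including the histogram, might look like the sketch below. The seed and the number of breaks are arbitrary choices, and plot = FALSE only computes the bins; drop it (or call plot(h)) to draw the histogram.

```r
set.seed(42)
n_samples <- 5000
sample_size <- 5

# One row per sample, one column per observation
observations <- matrix(rnorm(n_samples * sample_size, mean = 5, sd = 3),
                       nrow = n_samples, ncol = sample_size)
sample_means <- rowMeans(observations)

# The means should centre on 5 with a spread close to 3 / sqrt(5)
mean(sample_means)
sd(sample_means)

h <- hist(sample_means, breaks = 40, plot = FALSE)
```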
You can actually do this without for loops. replicate can be used to create the 5000 samples. Then use sapply to return the mean of each sample. Wrap the sapply call in hist() to get the histogram of means.
dat = replicate(5000, rnorm(5,5,3), simplify=FALSE)
hist(sapply(dat, mean))
Or, if you want to save the means:
sample.means = sapply(dat,mean)
hist(sample.means)
I think your code is giving valid results. list(mode="vector",length=nrSamples) isn't doing what I think you intended (run it in the console and see what happens), but it works out because the first two list elements get overwritten in the loop.
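The difference between the two calls is easy to see in a short sketch; the names e_wrong and e_right are only for illustration.

```r
nrSamples <- 5000

# list() treats its arguments as elements: a list of two named items
e_wrong <- list(mode = "vector", length = nrSamples)
str(e_wrong)

# vector("list", n) creates an empty list with n slots
e_right <- vector("list", nrSamples)
length(e_right)
```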
Although there's no need to use loops here, just for illustration here are two modified versions of your code using loops:
# 1. Store random samples in a list
e <- vector("list", nrSamples)
for (i in 1:nrSamples) {
e[[i]] <- rnorm(n = 5, mean = 5, sd = 3)
}
sample_means = rep(NA, nrSamples)
for (i in 1:nrSamples){
sample_means[i] <- mean(e[[i]])
}
# 2. Store random samples in a matrix
e <- matrix(rep(NA, 5000*5), nrow=5)
for (i in 1:nrSamples) {
e[,i] <- rnorm(n = 5, mean = 5, sd = 3)
}
sample_means = rep(NA, nrSamples)
for (i in 1:nrSamples){
sample_means[i] <- mean(e[, i])
}
Your code is fine (see below), but I would suggest you try the following:
yourlist <- lapply(1:nrSamples, function(x) rnorm(n=5, mean = 5, sd = 3 ))
yourmeans <- sapply(yourlist, mean)
Here, for each element of the sequence 1, 2, 3, ... nrSamples that I supply as the first argument, lapply executes a function with the given element of the sequence as argument (i.e. x). The function that I have supplied does not depend on x, however, so it is just replicated 5000 times, and the output is stored in a list (this is what lapply does). It is an easy way to avoid loops in situations like these. Needless to say, you could also just run
yourmeans <- sapply(1:nrSamples, function(x) mean(rnorm(n=5, mean = 5, sd = 3)))
Apart from the means, the latter does not store your results though, which may not be what you want. Also note that I call sapply to return a vector, which you can then use to plot your histogram, using e.g. hist(yourmeans).
To show that your code is fine, consider the following:
set.seed(42)
nrSamples = 5000
e <- list(mode="vector",length=nrSamples)
for (i in 1:nrSamples) {
e[[i]] <- rnorm(n = 5, mean = 5, sd = 3)
}
sample_means <- matrix(NA, 5000,1)
for (i in 1:5000){
sample_means[i] <- mean(e[[i]])
}
set.seed(42)
yourlist <- lapply(1:nrSamples, function(x) rnorm(n=5, mean = 5, sd = 3 ))
yourmeans <- sapply(yourlist, mean)
all.equal(as.vector(sample_means), yourmeans)
[1] TRUE
Here, I set the seed to the random number generator to make sure that the random numbers are the same. As you see, your code works fine, though as others have pointed out, loops can easily be avoided.

Create a function that takes in a vector and returns a matrix in R

I am trying to create a function that will take in a vector k and return to me a matrix with dimensions length(distMat[1,]) by length(k). distMat is a huge matrix and indSpam is a long vector. In particular to my situation, length(distMat[1,]) is 2412. When I enter in k as a vector of length one, I get a vector of length 2412. I want to be able to enter in k as a vector of length two and get a matrix of 2412x2. I am trying to use a while loop to let it go through the length of k, but it only returns to me a vector of length 2412. What am I doing wrong?
predNeighbor = function(k, distMat, indSpam){
counter = 1
while (counter<(length(k)+1))
{
preMatrix = apply(distMat, 1, order)
orderedMatrix = t(preMatrix)
truncate = orderedMatrix[,1:k[counter]]
checking = indSpam[truncate]
checking2 = matrix(checking, ncol = k[counter])
number = apply(checking2, 1, sum)
return(number[1:length(distMat[1,])] > (k[counter]/2))
counter = counter + 1
}
}
I am trying to create a function that will take in a vector k and return to me a matrix with dimensions length(distMat[1,]) by length(k)
Here's a function that does this.
foo <- function(k, distMat) {
return(matrix(0, nrow = length(distMat[1, ]), ncol = length(k)))
}
If you have other requirements, please describe them in words.
Based on your comment, I think I understand your goal better. You have a function that returns a vector of length k and you want to save its output as rows in a matrix. This is a pretty common task. Let's do a simple example where k starts out as 1:10, and say we want to add some noise to it with a function noise_and_rank() and see how the rank changes.
In the case where the input to the function is always the same, replicate() works very well. It will automatically put everything in a matrix
k <- 1:10
noise_and_rank <- function(k) {
rank(k + runif(length(k), min = -2, max = 2))
}
results <- replicate(n = 8, expr = {noise_and_rank(k)})
In the case where you want to iterate, i.e., the output from the one go is the input for the next, a for loop is good, and we just pre-allocate a matrix with 0's, to fill in one column/row at a time
k <- 1:10
n.sim <- 8
results <- matrix(0, nrow = length(k), ncol = n.sim)
results[, 1] <- k
for(i in 2:n.sim) {
results[, i] <- noise_and_rank(results[, i - 1])
}
What your original question seems to be about is how to do the pre-allocation. If the input is always the same, using replicate() means you don't have to worry about it. If the input is different each time, then pre-allocate using matrix(); you don't need to write any special function.
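Applied to the function in the question, the same idea can be sketched as follows. The return() inside the while loop exits on the first pass, which is why only one column comes back; here sapply() collects one column per value of k instead. The toy data (pts, a 0/1 indSpam) are assumptions made up to keep the example runnable.

```r
predNeighbor <- function(k, distMat, indSpam) {
  # Order each row once: row i lists the neighbours of i, nearest first
  ordered <- t(apply(distMat, 1, order))
  # One logical column per value of k
  sapply(k, function(kk) {
    nearest <- ordered[, seq_len(kk), drop = FALSE]
    votes <- matrix(indSpam[nearest], ncol = kk)
    rowSums(votes) > kk / 2
  })
}

# Toy data: four points on a line, the first two flagged as spam
pts <- c(0, 1, 10, 11)
distMat <- as.matrix(dist(pts))
indSpam <- c(1, 1, 0, 0)

res <- predNeighbor(c(1, 3), distMat, indSpam)
dim(res)
```

As in the original, each point counts itself as its own nearest neighbour because the diagonal of distMat is zero.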

Creating a loop that calculates the rolling mean of a vector for different rolling mean lengths

I am trying to create a "for loop" setup that is going to calculate different rolling means of a return series, where I use rolling means ranging from the last 2 observations to the last 16 observations, i.e. k ∈ [2, 16]. I've been trying to use a function like this, where "rollmean" is a function from zoo. This produces the warning:
Warning message:
In roll[i] <- rollmean(x, i) :
number of items to replace is not a multiple of replacement length
Can someone please help me?
rollk <- function(x, kfrom= 2, kto=16){
roll <- as.list(kto-kfrom+1)
for (i in kfrom:kto){
roll[i]<- rollmean(x, i)
return(roll)
}}
I suppose you want
# library(zoo)
rollk <- function(x, kfrom = 2, kto = 16){
roll <- list()
ft <- kfrom:kto
for (i in seq_along(ft)){
roll[[i]]<- rollmean(x, ft[i])
}
return(roll)
}
There are several problems in your function:
You need [[ to access a single list element, not [.
You want a list of length length(kfrom:kto). Now, i starts at 1, not at kfrom.
Now, roll is returned after the for loop. Hence, the function returns a single list containing all values.
A shorter equivalent of the function above:
rollk2 <- function(x, kfrom = 2, kto = 16)
lapply(seq(kfrom, kto), function(i) na.omit(stats::filter(x, 1 / rep(i, i))))
It does not require loading additional packages.
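A small usage sketch of the base R version on made-up data (x <- 1:20 and the window range 2 to 4 are assumptions for the example). filter() is called as stats::filter() so that the result does not change if a package such as dplyr masks filter().

```r
rollk2 <- function(x, kfrom = 2, kto = 16)
  lapply(seq(kfrom, kto), function(i) na.omit(stats::filter(x, rep(1 / i, i))))

x <- 1:20
res <- rollk2(x, kfrom = 2, kto = 4)

# One element per window length; a window of k leaves length(x) - k + 1 means
lengths(res)

# Rolling means for the window of length 2
first <- as.numeric(res[[1]])
first
```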
Try this:
library(zoo)
lapply(2:16, rollmean, x = x)

Resources