I've looked through previous help threads and haven't found something that has helped me with this specific problem. I know that a for loop would be a better way to generate the same data, but I'm interested in making this work with a repeat loop (mostly just as an exercise) and am struggling with the solution.
So I'm looping to create 3 iterations of 100 rnorm observations, changing the means each time from 5, to 25, to 45.
i <- 1
repeat{
x <- rnorm(100, mean = j, sd = 3)
j <- 5*i
i <- i + 4
if (j > 45) break
cat(x, "\n",j, "\n")
}
All of my tinkering to get a combined saved output for each iteration (for a total of 300 values) has failed. Help!
You can use lapply to get this:
lapply(c(5,25,45), function(x){
rnorm(100, mean = x, sd = 3)
})
This will give you a list with 3 elements, each containing 100 observations drawn from the respective normal distribution.
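If you want the 300 values in a single object, the list can be flattened or bound into columns afterwards. A minimal sketch; the names `means`, `samples`, `all_values` and `as_columns` are only illustrative and do not come from the question:

```r
set.seed(1)  # only so that this sketch is reproducible

means <- c(5, 25, 45)
samples <- lapply(means, function(m) rnorm(100, mean = m, sd = 3))

# One numeric vector holding all 300 values, in the order the samples were drawn
all_values <- unlist(samples)

# Alternatively, a 100 x 3 matrix with one column per mean
as_columns <- do.call(cbind, samples)

length(all_values)
dim(as_columns)
```

Which of the two shapes is more useful depends on whether you need to know later which mean each value came from.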
It depends on what data structure you want.
For lists it would be:
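r <- list()
repeat{
# ... compute j and x as in your loop, and break once j > 45 ...
# append the current sample and its mean as a new list element
r[[length(r)+1]] <- list(x, j)
}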
Then r[[1]][[1]] is x from the first iteration and r[[1]][[2]] is the corresponding j.
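Combined with the loop from the question, the list approach might look like the sketch below. It assumes that j should be computed before the sample is drawn and that the break check should come before the sample is stored; `combined` is a name introduced here for illustration.

```r
r <- list()
i <- 1
repeat {
  j <- 5 * i                          # mean for this iteration: 5, 25, 45
  if (j > 45) break                   # stop before drawing a fourth sample
  x <- rnorm(100, mean = j, sd = 3)
  r[[length(r) + 1]] <- list(x, j)    # store the sample together with its mean
  i <- i + 4
}

# Pull the first element (the sample) out of every entry and flatten to 300 values
combined <- unlist(lapply(r, `[[`, 1))
length(combined)
```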
Since you know how many observations you want to store, you can pre-allocate a matrix of that size, and store the data in it as it's generated.
# preallocate the space for the values you want to store
x <- matrix(nrow=100, ncol=3)
# save the three means in a vector
j_vals <- c(5,25,45)
# if you really need a repeat loop you can do it like so:
i <- 1
repeat {
# save the random sample in a column of the matrix x
x[,i] <- rnorm(100, mean = j_vals[i], sd = 3)
# print the random sample to the console (you can omit this)
cat(x[,i], "\n",j_vals[i], "\n")
i <- i+1
if (i > 3) break
}
You should get out a matrix x with the random samples stored in the columns. You can access each column like x[,1], x[,2] etc.
I am trying to multiply the values stored in a list containing 1,000 values with another list containing ages. Ultimately, I want to store 1,000 rows to a dataframe.
I wonder whether it's better to use lapply or a for loop here.
list 1
lambdaSamples1 <- lapply(
floor(runif(numSamples, min = 1, max = nrow(mcmcMatrix))),
function(x) mcmcMatrix[x, lambdas[[1]]])
The output is a list of 1,000 different values.
list 2
ager1= 14:29
What I want to do is
for (i in 1: numSamples) {
assign(paste0("newRow1_", i), 1-exp(-lambdaSamples1[[i]]*ager1))
}
Now I have 1,000 rows of values that I want to store in a predetermined data frame, outDf_1 (nrow = 1000, ncol = length(ager1)).
I tried
for (i in 1:numSamples) {
outDf_1[i,] <- newRow1_i
}
I want to store newRow1_1, ..., newRow1_1000 in the 1,000 rows of the outDf_1 data frame.
Should I approach this a different way?
I think you're overcomplicating this a bit. Many operations in R are vectorized, so you shouldn't need lapply or for loops for this. You didn't give us any data to work with, but the code below should do what you want in a more straightforward and faster way.
lambdaSamples1 <- mcmcMatrix[sample(nrow(mcmcMatrix), numSamples, replace=T),
lambdas[[1]]]
outDF_1 <- 1 - exp(-lambdaSamples1 %*% t(ager1))
Just note that this makes outDF_1 a matrix, not a data frame.
To do this for multiple ages, you could use a loop to save your resulting matrices in a list:
outDF <- list()
x <- 5
for (i in seq_len(x)) {
lambdaSamples <- mcmcMatrix[sample(nrow(mcmcMatrix), numSamples, replace=T),
lambdas[[1]]]
outDF[[i]] <- 1 - exp(-lambdaSamples %*% t(ager[[i]]))
}
Here, ager1, ..., agerx are expected to be stored in a list (ager).
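Since no data was posted, here is a self-contained sketch of the same idea using made-up rates. `lambda_draws` is a hypothetical stand-in for the sampled column of mcmcMatrix, and `outer()` is swapped in for the `%*% t()` product; it also shows one way to turn the resulting matrix into a data frame if that is what you need.

```r
set.seed(42)  # only so that this sketch is reproducible

numSamples <- 1000
ager1 <- 14:29

# Hypothetical stand-in for mcmcMatrix[sampled_rows, lambdas[[1]]]
lambda_draws <- rexp(numSamples, rate = 10)

# outer() builds the numSamples x length(ager1) grid of lambda * age in one call
out_mat <- 1 - exp(-outer(lambda_draws, ager1))

# Convert to a data frame with one column per age
outDf_1 <- as.data.frame(out_mat)
names(outDf_1) <- paste0("age_", ager1)

dim(outDf_1)
```

Row i of the result matches what `1 - exp(-lambda_draws[i] * ager1)` would give, so no `assign()` calls or intermediate `newRow1_i` objects are needed.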
I currently have a dataset that contains nothing but a bunch of random numbers ranging from -1 billion to +1 billion. I want to write a function that pulls 5 random numbers from the dataset 100,000 times and then counts how many of the numbers in each draw are below 0. Can I use the mapply function for this, or would another function be better?
The dataset is called numbers and has 2 columns; the column I want to pull the numbers from is called listofnumbers.
Currently I have a for loop, but it seems to take forever to run. The code is below.
```
n=5
m=100000
base_table <- as_tibble (0)
for (j in 1:m)
{
for (i in 1:n)
{
base_table[i,j] <- as_tibble(sample_n(numbers, 1) %>% pull(listofnumbers))
}
}
```
Can anyone help?
Here is a reduced size example:
r <- -100:100
n <- 10
collect <- 1000
output <- replicate(collect, sum(sample(r, n) < 0))
hist(output)
You would simply replace r, n and collect with your own values, i.e. r = numbers$listofnumbers, n = 5, collect = 100000.
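If replicate() is still too slow at 100,000 repetitions, one alternative is to draw everything in a single call and count per column. This is only a sketch: `pool` is a stand-in for numbers$listofnumbers, and unlike the replicate() version above it samples with replacement, which makes little practical difference when the pool is much larger than 5.

```r
set.seed(123)  # only so that this sketch is reproducible

pool <- -100:100   # stand-in for numbers$listofnumbers
n <- 5             # numbers per draw
m <- 100000        # number of draws

# One column per draw: n * m values sampled in a single call
draws <- matrix(sample(pool, n * m, replace = TRUE), nrow = n)

# Number of negative values in each draw
neg_counts <- colSums(draws < 0)

table(neg_counts)
```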
dataset <- matrix(rnorm(100), 20, 5)
My dataset is a matrix of 100 returns, of 5 assets over 20 days.
I want to calculate the average return for each asset over rows 1:10 and over rows 11:20.
Then I want to store the computed returns in two vectors, which are in turn stored in a list.
The following list should hold the two vectors of returns computed over rows 1:10 and 11:20.
returns <- vector(mode="list", 2)
I have implemented a for-loop, as reported below, to calculate the mean of returns only between 1:10.
assets <- 5
r <- rep(0, assets) # this vector should include the returns over 1:10
for(i in 1:assets){
r[i] <- mean(dataset[1:10,i])
}
returns[[1]] <- r
How can I adapt this for loop so that it also calculates the mean of the returns over rows 11:20?
I have tried to "index" the rows of the dataset, in the following way.
time <- c(1, 10, 11, 20)
and then implement a double for loop, but the lengths are different. In this case I also have difficulty managing the vector "r", because I now need two vectors instead of only one as before.
for(j in 1:length(time)){
for(i in 1:assets){
r[i] <- mean(dataset[1:10,i])
}}
returns[[1]] <- r
You don't even need a for loop. You can use colMeans:
returns <- vector(mode="list", 2)
returns[[1]] <- colMeans(dataset[1:10,])
returns[[2]] <- colMeans(dataset[11:20,])
Using a for loop, and starting from a freshly initialized returns list, your solution could be something like the following:
for(i in 1:assets){
returns[[1]] <- c(returns[[1]], mean(dataset[1:10,i]))
returns[[2]] <- c(returns[[2]], mean(dataset[11:20,i]))
}
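If you later need more than two row ranges, the colMeans idea generalizes by keeping the ranges in a list and applying over it. A small sketch; `windows` is a name made up for this example:

```r
set.seed(1)  # only so that this sketch is reproducible

dataset <- matrix(rnorm(100), 20, 5)

# Each element is one block of rows to average over
windows <- list(1:10, 11:20)

# One vector of per-asset means per window, collected in a list
returns <- lapply(windows, function(rows) colMeans(dataset[rows, , drop = FALSE]))

returns[[1]]  # means over rows 1:10
returns[[2]]  # means over rows 11:20
```

Adding a third window is then just a matter of appending another range to `windows`.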
I intend to fill a matrix I created that has 1000 rows and 2 columns. Here B is 1000.
resampled_ests <- matrix(NA, nrow = B, ncol = 2)
colnames(resampled_ests) <- c("Intercept_Est", "Slope_Est")
I want to fill it using a for loop looping from 1 to 1000.
ds <- diamonds[resampled_values[b,],]
Here, each ds (there should be 1000 versions of it in the for loop) is a data frame with 2 columns and 2000 rows, and I would like to use the lm() function to get the beta coefficients from the two columns of data.
for (b in 1:B) {
# Write code that fills in the matrix resampled_ests with coefficient estimates.
ds <- diamonds[resampled_values[b,],]
lm2 <- lm(ds$price~ds$carat, data = ds)
rowx <- coefficients(lm2)
resampled_ests <- rbind(rowx)
}
However, after I run the loop, resampled_ests, which is supposed to be a matrix with 1000 rows, contains only 1 row (one pair of coefficients). When I test the code outside the loop by replacing b with specific numbers, I get different, correct results. Inside the for loop, though, the pairs of coefficients don't seem to be row-bound together. Can someone explain why the result matrix resampled_ests only shows one row of data?
rbind(x) returns x because you're not binding it to anything. If you want to build a matrix row by row, you need something like
resampled_ests <- rbind(resampled_ests, rowx)
This also means you need to initialize resampled_ests before the loop.
Since you're doing that anyway, I would just create a 1000 x 2 matrix of zeros and fill in one row per iteration. Something like:
resampled_ests <- matrix(rep(0, 2*B), nrow=B)
for (b in 1:B) {
ds <- diamonds[resampled_values[b,],]
lm2 <- lm(price ~ carat, data = ds)
rowx <- coefficients(lm2)
resampled_ests[b,] <- rowx
}
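Here is a self-contained version of the same pattern that runs without your data. It is only a sketch: the built-in mtcars data set stands in for diamonds, the rows are resampled inside the loop instead of being read from resampled_values, and B is reduced to 200 to keep it quick.

```r
set.seed(2024)  # only so that this sketch is reproducible

B <- 200
n_obs <- nrow(mtcars)

# Preallocate the result with named columns; NA makes unfilled rows easy to spot
resampled_ests <- matrix(NA_real_, nrow = B, ncol = 2,
                         dimnames = list(NULL, c("Intercept_Est", "Slope_Est")))

for (b in seq_len(B)) {
  rows <- sample(n_obs, n_obs, replace = TRUE)   # bootstrap row indices
  fit <- lm(mpg ~ wt, data = mtcars[rows, ])     # refit on the resampled rows
  resampled_ests[b, ] <- coef(fit)               # fill row b, do not overwrite the matrix
}

head(resampled_ests, 3)
```

The key difference from the original loop is the indexed assignment `resampled_ests[b, ] <- ...`, which writes into one row and leaves the rest of the matrix intact.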
I have a vector on which I want to do block resampling to get, say, 1000 samples of the same size as the vector, and then save all these samples in a list.
This is the code that performs normal resampling, i.e. randomly draws one observation per time, and saves the result in a list:
myvector <- c(1:200)
mylist <- list()
for(i in 1:1000){
mylist[[i]] <- sample(myvector, length(myvector), replace=TRUE)
}
I need code that does exactly the same thing, except that instead of drawing single observations it draws blocks of observations (let's use blocks of size 5).
I know there are packages that perform bootstrap operations, but I don't need statistics or confidence intervals or anything, just all the samples in a list. Both overlapping and non-overlapping blocks are ok, so the code for just one of the two procedures is enough. Of course, if you are so kind to give me the code for both it's appreciated. Thanks to anybody who can help me with this.
Not sure how you're wanting to store the final structure.
The following takes a block dimension, samples your vector by that block size (e.g. 200 element vector with block size 5 gives 40 observations of randomly sampled elements) and adds those blocks to an index of the final list. Using your example, the final result is a list with 1000 entries; each entry containing 40 randomly sampled observations.
myvector <- c(1:200)
rm(.Random.seed, envir=globalenv())
block_dimension <- 5
res = list()
for(i in 1:1000) {
name <- paste('sample_', i, sep='')
rep_num <- length(myvector) / block_dimension
all_blocks <- replicate(rep_num, sample(myvector, block_dimension))
tmp <- split(all_blocks, ceiling(seq_along(all_blocks)/block_dimension))
res[[name]] <- tmp
}
You can inspect the first 6 sampled blocks of the first entry with head(res$sample_1).
How about the following? Note that you can use lapply, which should be slightly faster than filling the list in a for loop in this case.
As reference, here is the case where you sample individual observations.
# Sample individual observations
set.seed(2017);
mylist <- lapply(1:1000, function(x) sample(myvector, length(myvector), replace = TRUE));
Next we sample blocks of 5 observations.
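# Sample (possibly overlapping) blocks of n consecutive observations
n <- 5
set.seed(2017)
mylist <- lapply(1:1000, function(x) {
# draw the start index of each block; the last valid start is length(myvector) - n + 1
idx <- sample(1:(length(myvector) - n + 1), length(myvector) / n, replace = TRUE)
# expand each start index into n consecutive indices, keeping each block together
idx <- c(t(sapply(0:(n - 1), function(i) idx + i)))
myvector[idx]
})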
One solution, assuming blocks consist of contiguous elements of myvector, is to pre-define the blocks in rows of a data frame with start/end columns (e.g. blocks <- data.frame(start=seq(1,96,5),end=seq(5,100,5))). Create a set of sample indexes (with replacement) from [1:number of blocks] and concatenate values indexing from myvector using the start/end values from the defined blocks. You can add randomization within blocks as well, if you need to. This gives you control over the block contents, overlap, size, etc.
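The description above can be sketched roughly as follows, here for non-overlapping blocks of 5 over the 200-element vector from the question. The helper `resample_blocks` and the variable `block_size` are names invented for this sketch.

```r
set.seed(7)  # only so that this sketch is reproducible

myvector <- 1:200
block_size <- 5

# Pre-define the blocks: one row per block with its start and end index
blocks <- data.frame(start = seq(1, length(myvector), by = block_size))
blocks$end <- blocks$start + block_size - 1

# Hypothetical helper: sample block rows with replacement, then expand to indices
resample_blocks <- function(v, blocks) {
  picked <- sample(nrow(blocks), nrow(blocks), replace = TRUE)
  idx <- unlist(Map(seq, blocks$start[picked], blocks$end[picked]))
  v[idx]
}

mylist <- lapply(1:1000, function(k) resample_blocks(myvector, blocks))

length(mylist)       # number of resampled vectors
length(mylist[[1]])  # each has the same length as myvector
```

Changing how `blocks` is built (for example, using start indices 1, 2, 3, ... for overlapping blocks) changes the block scheme without touching the resampling step.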
I found a way to perform the task with non-overlapping blocks:
myvector <- c(1:200)
n <- 5
mymatrix <- matrix(myvector, nrow = length(myvector)/n, byrow = TRUE)
mylist <- list()
for(i in 1:1000){
mylist[[i]] <- as.vector(t(mymatrix[sample(nrow(mymatrix), size = length(myvector)/n, replace = TRUE),]))
}