Double sampling method in R

My initial code for double sampling is the following; it performs only one repetition.
# Data
# Data
samples <- matrix(NA, nrow = 12, ncol = 2000)
for (i in 1:12) {
  samples[i, ] <- rbinom(2000, 1, prob = 0.05)
}
# Double Sampling Plan
accept <- rep(0, 12)
for (i in 1:12) {
  if (sum(samples[i, 1:80]) <= 5) {
    accept[i] <- 1
  } else if (sum(samples[i, 1:80]) <= 8 & sum(samples[i, 1:80]) > 5) {
    if (sum(samples[i, 1:160]) <= 12) {
      accept[i] <- 1
    }
  }
}
sum(accept)
Since the data are generated randomly from a Bernoulli distribution, the results will not be the same every time you run the code.
I want 100 repetitions of this double sample.
My solution:
nm <- double(100)
for (j in 1:100) {
  # Data
  samples <- matrix(NA, nrow = 12, ncol = 2000)
  for (i in 1:12) {
    samples[i, ] <- rbinom(2000, 1, prob = 0.05)
  }
  # Double Sampling Plan
  accept <- rep(0, 12)
  for (i in 1:12) {
    if (sum(samples[i, 1:80]) <= 5) {
      accept[i] <- 1
    } else if (sum(samples[i, 1:80]) <= 8 & sum(samples[i, 1:80]) > 5) {
      if (sum(samples[i, 1:160]) <= 12) {
        accept[i] <- 1
      }
    }
  }
  nm[j] <- sum(accept)
}
mean(nm)
What do you think?

If we follow @Onyambu's suggestion, we can embed one simulation inside a function and call it in a loop like this:
one_double_sampling <- function() {
  # Data
  samples <- matrix(NA, nrow = 12, ncol = 2000)
  for (i in 1:12) {
    samples[i, ] <- rbinom(2000, 1, prob = 0.05)
  }
  # Double Sampling Plan
  accept <- rep(0, 12)
  for (i in 1:12) {
    if (sum(samples[i, 1:80]) <= 5) {
      accept[i] <- 1
    } else if (sum(samples[i, 1:80]) <= 8 & sum(samples[i, 1:80]) > 5) {
      if (sum(samples[i, 1:160]) <= 12) {
        accept[i] <- 1
      }
    }
  }
  return(sum(accept))
}
set.seed(123)
# number of repetitions
n <- 100
# store the results
res <- rep(0, n)
for (i in 1:n) {
  res[i] <- one_double_sampling()
}
# mean
mean(res)
Definitely your code is correct. For people interested in the double sampling method, I advise you to see this.
Edit 1
In one line of code, based on @Onyambu's advice:
mean(replicate(n, one_double_sampling()))
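As a further simplification (my own sketch, not from the answer above): since only the defect counts in the first 80 items and the next 80 items matter, each repetition can be drawn directly as binomial counts, keeping the same plan parameters (n1 = n2 = 80, acceptance numbers 5, 8 and 12).
# Vectorized sketch of one repetition: d1 and d2 are the defect counts in the
# first and second samples of 80 items for each of the 12 lots.
one_double_sampling_vec <- function(lots = 12, p = 0.05) {
  d1 <- rbinom(lots, size = 80, prob = p)
  d2 <- rbinom(lots, size = 80, prob = p)
  accept <- d1 <= 5 | (d1 > 5 & d1 <= 8 & d1 + d2 <= 12)
  sum(accept)
}
set.seed(123)
mean(replicate(100, one_double_sampling_vec()))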

Related

How to restart iteration in R loop?

I made a loop to generate a Markov chain. If the proposal does not satisfy a condition, I want to restart the iteration with a new proposal. Is there a way to do this? My current code is shown below for reference. Currently, it sets the current chain value to the previous one, but I don't want that; I want it to just restart the same "i". So if i = 2 and the condition in line 4 is not satisfied, I want it to stay at i = 2 until it is satisfied. Thanks in advance.
ABC_MCMC <- function(n){
  for (i in 2:n){
    prop <- rnorm(1, mean = chain[i-1], sd = 1)
    if (ABC(prop) == T & prop >= 0){
      h_ratio <- (dgamma(prop, shape = prior_alpha, rate = prior_beta)/dgamma(chain[i-1], shape = prior_alpha, rate = prior_beta))*
        (dnorm(x = chain[i-1], mean = prop, sd = 1)/dnorm(x = prop, mean = chain[i-1], sd = 1))
      u <- runif(1)
      if (min(1, h_ratio) > u) {chain[i] = prop} else {chain[i] = chain[i-1]}
    } else {chain[i] = chain[i-1]}
  }
  return(chain <<- chain)
}
This is more of a comment than an answer, but I'm posting it as an answer to keep the code formatting.
Replace the code inside the for loop with the code below.
while (TRUE) {
  prop <- rnorm(1, mean = chain[i - 1L], sd = 1)
  if (ABC(prop) && prop >= 0) {
    h_ratio <- (dgamma(prop, shape = prior_alpha, rate = prior_beta)/dgamma(chain[i - 1], shape = prior_alpha, rate = prior_beta)) *
      (dnorm(x = chain[i - 1], mean = prop, sd = 1)/dnorm(x = prop, mean = chain[i - 1], sd = 1))
    u <- runif(1)
    if (min(1, h_ratio) > u) {chain[i] <- prop} else {chain[i] <- chain[i - 1]}
    break
  } else {chain[i] <- chain[i - 1]}
}
Edit
The function below seems to be what is asked for.
ABC_MCMC <- function(n){
  for (i in 2:n){
    # loops until condition (ABC(prop) & prop >= 0) is met
    while (TRUE) {
      prop <- rnorm(1, mean = chain[i - 1], sd = 1)
      if (ABC(prop) & prop >= 0) {
        h_ratio <- (dgamma(prop, shape = prior_alpha, rate = prior_beta)/dgamma(chain[i - 1L], shape = prior_alpha, rate = prior_beta)) *
          (dnorm(chain[i - 1L], prop, 1)/dnorm(prop, chain[i - 1L], 1))
        u <- runif(1)
        if (min(1, h_ratio) > u) {
          chain[i] <- prop
        } else {
          chain[i] <- chain[i - 1L]
        }
        break
      }
    }
  }
  # function return value
  chain
}
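A hypothetical usage sketch (ABC(), prior_alpha, prior_beta and the starting value of chain are placeholders standing in for the question's actual setup, which is not shown in the thread):
prior_alpha <- 2
prior_beta <- 1
ABC <- function(x) x < 10   # toy stand-in for the real ABC acceptance check
chain <- 1                  # starting value; ABC_MCMC reads it from the calling environment
set.seed(42)
out <- ABC_MCMC(500)
length(out)                 # 500 values, with chain[1] unchanged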

Trying to call a function 1000 times with replicate

simulate <- function(bankroll, bet, max_iterations) { # gambler's ruin function
  iteration <- 1
  while (bankroll > 0 & iteration < max_iterations) {
    win <- sample(0:1, size = 1) # either lose or win
    if (win) {
      bankroll <- bankroll + bet
    } else {
      bankroll <- bankroll - bet
    }
    iteration <- iteration + 1
  }
  return(list(c(iteration, bankroll)))
}
simulate(1000, 100, 100)
I am trying to call the above function 1000 times and collect the results into a vector. Then I need to count how many of the runs took fewer than 100 iterations. Any suggestions? Thanks.
If you ignore the early stop when the bankroll hits zero, this can simply be reduced to:
simulate <- function(bankroll, bet, max_iterations){
  bankroll + bet * sum(sample(c(-1, 1), max_iterations, TRUE))
}
Then you could do:
sum(replicate(1000, simulate(1000, 100, 100))<100)
This is a simple version using base R:
simulate <- function(bankroll, bet, max_iterations) { # gambler's ruin function
  iteration <- 1
  while (bankroll > 0 & iteration < max_iterations) {
    win <- sample(0:1, size = 1) # either lose or win
    if (win) {
      bankroll <- bankroll + bet
    } else {
      bankroll <- bankroll - bet
    }
    iteration <- iteration + 1
  }
  return(bankroll)
}
calls <- replicate(n = 1000, expr = simulate(1000, 100, 100))
calls_less_100 <- calls < 100
result <- length(calls[calls_less_100])
result
probability <- result/length(calls)
probability
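If you do want to keep the ruin condition and the iteration count from the original function, here is a vectorized sketch (my own, not from either answer): the bankroll path is a cumulative sum of +/- bet, and a run stops the first time the path reaches zero.
simulate_path <- function(bankroll, bet, max_iterations) {
  # at most max_iterations - 1 bets are placed, as in the original while loop
  path <- bankroll + bet * cumsum(sample(c(-1, 1), max_iterations - 1, TRUE))
  ruin <- which(path <= 0)[1]   # first step at which the bankroll hits zero
  if (is.na(ruin)) {
    c(iteration = max_iterations, bankroll = path[max_iterations - 1])
  } else {
    c(iteration = ruin + 1, bankroll = path[ruin])
  }
}
set.seed(1)
runs <- replicate(1000, simulate_path(1000, 100, 100))
sum(runs["iteration", ] < 100)   # runs that ended before 100 iterations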

How to speed up R loop operations over 3 Dimensional Array of variable size?

I have an R script that I wrote which simulates how long particles need to equilibrate in a room. The room is represented by a three-dimensional array of size max_size. Currently the script works as intended, but its run time is so long that it is nearly unusable. I have tried adjusting my algorithm to reduce the number of times the array needs to be looped over, but it is still many orders of magnitude slower than when I wrote this code in Fortran previously. How can I adjust this script to reduce the runtime?
Below is my code.
#!/usr/bin/env Rscript
options(warn=1)
max_size <- 10
use_partition = 1
cube <- array(dim=c(max_size, max_size, max_size))
cube[,,]=0.0
diffusion_coefficient <- 0.175
room_dimension <- 5
mol_speed <- 250 # speed of molecules based on 100 g/mol at RT
timestep <- (room_dimension/mol_speed)/max_size # h in seconds
dist_between <- room_dimension/max_size
DTerm <- diffusion_coefficient*timestep/(dist_between*dist_between)
cube[1,1,1] <- 1.0e21 # initial mass of particles
time <- 0 # keep up with accumulated system time
ratio <- 0
part_height <- floor(max_size*0.75)
center <- floor(max_size/2)
# the partition is defined as:
# 1 space (index) "wide" on the X axis
# spans 75% (rounded) of the "height" (Y axis)
# spans 100% of the "depth" (z axis)
# partition spaces are indicated with the value -69.0
# partition spaces cannot be diffused from nor into
if (use_partition == 1) {
  for (j in 1:part_height) {
    cube[center, j, ] <- -69.0 # flag partition spaces with -69.0
  }
}
repeat { # R replacement for a do-while loop
  for (i in 1:nrow(cube)) {
    for (j in 1:ncol(cube)) {
      for (k in 1:max_size) {
        if (cube[i, j, k] != -69.0) {
          for (off in seq(from = -1, to = 1, by = 2)) {
            if (i + off >= 1 && i + off <= max_size) {
              change <- (cube[i, j, k] - cube[i + off, j, k]) * DTerm
              cube[i, j, k] <- cube[i, j, k] - change
              cube[i + off, j, k] <- cube[i + off, j, k] + change
            }
            if (j + off >= 1 && j + off <= max_size) {
              change <- (cube[i, j, k] - cube[i, j + off, k]) * DTerm
              cube[i, j, k] <- cube[i, j, k] - change
              cube[i, j + off, k] <- cube[i, j + off, k] + change
            }
            if (k + off >= 1 && k + off <= max_size) {
              change <- (cube[i, j, k] - cube[i, j, k + off]) * DTerm
              cube[i, j, k] <- cube[i, j, k] - change
              cube[i, j, k + off] <- cube[i, j, k + off] + change
            }
          }
        }
      }
    }
  }
  time <- time + timestep
  # check mass for consistency
  sumval <- 0.0
  maxval <- cube[1, 1, 1]
  minval <- cube[1, 1, 1]
  for (i in 1:nrow(cube)) {
    for (j in 1:ncol(cube)) {
      for (k in 1:max_size) {
        if (cube[i, j, k] != -69.0) {
          if (cube[i, j, k] > maxval) {
            maxval <- cube[i, j, k]
          }
          if (cube[i, j, k] < minval) {
            minval <- cube[i, j, k]
          }
          sumval <- sumval + cube[i, j, k]
        }
      }
    }
  }
  ratio <- minval / maxval
  cat(time, " ", ratio, " ", sumval, "\n")
  if (ratio >= 0.99) {
    break
  }
}
cat("Box equilibrated in ", time, " seconds of simulated time.\n")

implement matrix determinant in R

I was asked to implement a function that calculates the determinant of an n x n matrix using Laplace expansion. This involves recursion. I developed this:
minor <- function(A, i, j) {
  return(A[c(1:(i-1), (i+1):dim(A)[1]), c(1:(j-1), (j+1):dim(A)[2])])
}
determinantRec <- function(X, k) {
  if (dim(X)[1] == 1 && dim(X)[2] == 1) return(X[1][1])
  else {
    s = 0
    for (i in 1:dim(X)[2]) {
      s = s + X[k][i]*(-1)^(k+i)*determinantRec(minor(X, k, i), k)
    }
    return(s)
  }
}
where k in determinantRec(X, k) indicates which row to expand along.
My problem is that when I run determinantRec(matrix(c(1,2,3,4), nrow = 2, ncol = 2), 1), this error appears:
C stack usage 7970628 is too close to the limit
What is wrong with my code?
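As an aside (my annotation, reusing the minor() defined above): the recursion never terminates because for i = 1 the index 1:(i-1) evaluates to c(1, 0), the 0 is silently dropped, and minor() returns a matrix of the same size; X[k][i] is also vector indexing rather than X[k, i].
A <- matrix(c(1, 2, 3, 4), nrow = 2)
1:(1 - 1)            # c(1, 0), not an empty index
dim(minor(A, 1, 1))  # still 2 x 2, so the recursion never shrinks the matrix
A[1][2]              # NA: X[k][i] is vector indexing, not the same as X[k, i]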
@julia, there is one simple typo in your code. Just remove the '*' at the end of the definition of s, and don't indent the recursive call.
determinantRek <- function(X, k) {
  if (dim(X)[1] == 1 && dim(X)[2] == 1)
    return(X[1,1])
  if (dim(X)[1] == 2 && dim(X)[2] == 2)
    return(X[1,1]*X[2,2] - X[1,2]*X[2,1])
  else
    s = 0
  for (i in 1:dim(X)[2]) {
    s = s + X[k,i]*(-1)^(k+i)
    determinantRek(X[-k,-i], k)
  }
  return(s)
}
I did it this way and it works just fine, although it is very slow compared to the det function in base R:
laplace_expansion <- function(mat){
  det1 <- function(mat){
    mat[1]*mat[4] - mat[2]*mat[3]
  }
  determinant <- 0
  for (j in 1:ncol(mat)){
    mat1 <- mat[-1, -j]
    if (nrow(mat1) == 2){
      determinant <- determinant + mat[1,j]*(-1)^(1+j)*det1(mat1)
    } else {
      val <- mat[1,j]*(-1)^(1+j)
      if (val != 0){
        determinant <- determinant + val*laplace_expansion(mat1)
      }
    }
  }
  return(determinant)
}
This is my approach; I think it's cleaner.
deter <- function(X) {
  stopifnot(is.matrix(X))
  stopifnot(identical(ncol(X), nrow(X)))
  if (all(dim(X) == c(1, 1))) return(as.numeric(X))
  i <- 1:nrow(X)
  out <- purrr::map_dbl(i, function(i){
    X[i, 1] * (-1)^(i + 1) * deter(X[-i, -1, drop = FALSE])
  })
  return(sum(out))
}
Thank you @ArtemSokolov and @MrFlick for pointing out the cause of the problem; that was it. I also discovered that this code did not calculate the determinant of a 2x2 matrix properly. After all that, it looks like this:
determinantRek <- function(X, k) {
  if (dim(X)[1] == 1 && dim(X)[2] == 1)
    return(X[1,1])
  if (dim(X)[1] == 2 && dim(X)[2] == 2)
    return(X[1,1]*X[2,2] - X[1,2]*X[2,1])
  else
    s = 0
  for (i in 1:dim(X)[2]) {
    s = s + X[k,i]*(-1)^(k+i)*
      determinantRek(X[-k,-i], k)
  }
  return(s)
}
Debugging with browser() was also helpful :)
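A quick sanity check of this final version against base R's det() (the 3x3 matrix here is just an arbitrary example):
A <- matrix(c(2, 1, 3, 0, 4, 5, 1, 2, 6), nrow = 3)
determinantRek(A, 1)
det(A)   # the two results should agree up to floating-point error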

R: Using GraphNEL, term frequency of extracted keywords

I'm running the code below to extract key phrases from a raw data file. While I am able to do this successfully, I am not able to get the frequency or count of the extracted keywords, which would help me understand the ranking of keyword occurrences, since I am using graphNEL. Is there any way I can get the key phrase count? TIA.
ConstructTextGraph <- function(n)
{
  word_graph <- new("graphNEL")
  i <- 1
  while (i < length(words)) {
    if (IsSelectedWord(words[i])) {
      links <- GetWordLinks(i, n)
      if (links[1] != "") {
        cat(i, " ", words[i], " - ", paste(c(links), collapse = " "), "\n")
        if (length(which(nodes(word_graph) == words[i])) == 0) {
          word_graph <- addNode(words[i], word_graph)
        }
        for (j in 1:length(links)) {
          if (length(which(nodes(word_graph) == links[j])) == 0) {
            word_graph <- addNode(links[j], word_graph)
            word_graph <- addEdge(words[i], links[j], word_graph, 1)
          } else {
            if (length(which(edges(word_graph, links[j])[[1]] == words[i])) > 0) {
              prev_edge_weight <- as.numeric(edgeData(word_graph, words[i], links[j], "weight"))
              edgeData(word_graph, words[i], links[j], "weight") <- prev_edge_weight + 1
            } else {
              word_graph <- addEdge(words[i], links[j], word_graph, 1)
            }
          }
        }
      }
    }
    i <- i + 1
  }
  word_graph
}
Please let me know if more information is needed.
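If what is needed is simply how often each extracted keyword occurs in the text, one option (a sketch, assuming words and IsSelectedWord() are the same objects used in the script above) is to tabulate the selected words directly rather than reading counts off the graph; the per-edge weights accumulated with edgeData() in ConstructTextGraph() can be used in a similar way if co-occurrence counts are wanted instead of raw term frequency.
# Hedged sketch: raw term frequency of the selected words, using only base R.
keep <- vapply(words, IsSelectedWord, logical(1))
keyword_freq <- sort(table(words[keep]), decreasing = TRUE)
head(keyword_freq)   # most frequent keywords first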
