Replacing nested while loops with vectorization/optimization - r

I am trying to code a lineup simulator that runs absorbing Markov chains many times, in the hope of approaching the "true" average number of runs a lineup would produce. I have looked at about 25 posts on here explaining various kinds of vectorization and optimization, but I was not able to work any of them into my code. So I thought I would ask whether there are any tips. Thank you!
PHLoops <- function(P1,P2,P3,P4,P5,P6,P7,P8,P9,PPH,R,start=1){
s <- start; path <- NULL; runs <- 0; zz=1;inn=1;gam=1;atbat=1
while (gam<163){
while(inn<10){
s <- start; path <- NULL;
while(s<25){
if(zz==1){
P=P1
}
if(zz==2){
P=P2
}
if(zz==3){
P=P3
}
if(zz==4){
P=P4
}
if(zz==5){
P=P5
}
if(zz==6){
P=P6
}
if(zz==7){
x3 <- sample(1:100, 1)
if((atbat>=24)&(table(P7==mat9)[1]>=625)&(atbat<=30)&(x3<=6)){
P=P7
}else if((atbat>=25)&(table(P7==mat9)[1]>624)&(atbat<=30)&(x3>=7)){
P=PPH
}else if(((atbat>=27)&(table(P7==mat9)[1]>624))){
P=PPH
}else{
P=P7
}
}
if(zz==8){
x3 <- sample(1:100, 1)
if((atbat>=25)&(table(P8==mat9)[1]>624)&(atbat<=30)&(x3<=3)){
P=P8
}else if((atbat>=26)&(table(P8==mat9)[1]>=625)&(atbat<=32)&(x3>=4)){
P=PPH
}else if(((atbat>=27)&(table(P8==mat9)[1]>624))){
P=PPH
}else{
P=P8
}
}
if(zz==9){
if(((atbat>=27)&(table(P9==mat9)[1]>624))){
P=PPH
}
else{
P=P9
}
}
s.new <- sample(1:25,1,prob=P[s,])
path <- c(path,s.new)
runs <- runs+R[s,s.new]
s <- s.new
zz=ifelse(zz==9,1,zz+1)
atbat=atbat+1
}
inn=inn+1
runs
}
gam=gam+1
atbat=1
inn=1
runs
}
runs
}
I then run this function using:
RUNS8 <- replicate(1000,PHLoops(mat2,
mat4,mat5,mat7,mat1,mat8,mat6,mat9,mat3,mat10,R))
Here mat1 through mat10 are 25x25 transition matrices, one per player. R is a matrix that tells the function how many runs the team scores when moving from one state to another. zz is a variable that tracks where we are in the batting order. The 7-9 spots are handled differently because the pitcher does not bat every time his spot comes up. After roughly two full passes through the order the pitcher no longer bats and PPH (the pinch hitter) bats instead. The pitcher also bats more often the closer he is to the top of the order (that is what x3 simulates).
Any help would be great and I will continue to read the forums for further tips.
Thank you!
UPDATE (2-11)
Made the changes that people suggested below and the code now looks like this:
simulate.half.inning9test <- function(P1,P2,P3,P4,P5,P6,P7,P8,P9,PPH,R,start=1){
s <- start; path <- NULL; runs <- 0; zz=1;inn=1;gam=1;atbat=1
p7=(table(P7==mat9)[1]>624)
p8=(table(P8==mat9)[1]>624)
p9=(table(P9==mat9)[1]>624)
order=list(P1,P2,P3,P4,P5,P6,P7,P8,P9)
while (gam<163){
x3 <- sample(1:100, 1)
while(inn<10){
s <- start; path <- NULL;
while(s<25){
P=order[[zz]]
if(zz==7){
if((atbat>=24)&&(p7==TRUE)&&(atbat<=30)&&(x3<=6)){
P=P7
}else if((atbat>=25)&&(p7==TRUE)&&(atbat<=30)&&(x3>=7)){
P=PPH
}else if(((atbat>=27)&&(p7==TRUE))){
P=PPH
}else{
P=P7
}
}
if(zz==8){
if((atbat>=25)&&(p8==TRUE)&&(atbat<=30)&&(x3<=3)){
P=P8
}else if((atbat>=26)&&(p8==TRUE)&&(atbat<=32)&&(x3>=4)){
P=PPH
}else if(((atbat>=27)&&(p8==TRUE))){
P=PPH
}else{
P=P8
}
}
if(zz==9){
if(((atbat>=27)&&(p9==TRUE))){
P=PPH
}
else{
P=P9
}
}
s.new <- sample(1:25,1,prob=P[s,])
runs <- runs+R[s,s.new]
s <- s.new
zz=ifelse(zz==9,1,zz+1)
atbat=atbat+1
}
inn=inn+1
runs
}
gam=gam+1
atbat=1
inn=1
runs
}
runs
}
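For readers following along, here is a minimal sketch of the inner loop on its own: one half-inning of an absorbing chain, with the batter's matrix looked up from a list and the batting order wrapped with modular arithmetic instead of ifelse. The 3-state matrices P_a, P_b and R_toy are made-up stand-ins for the 25x25 matrices in the question, and simulate_half_inning is a hypothetical name, not the asker's function.

```r
# Toy absorbing chain: states 1 and 2 are transient, state 3 is absorbing
# (it stands in for the "three outs" state, number 25 in the question).
P_a <- matrix(c(0.5, 0.3, 0.2,
                0.2, 0.4, 0.4,
                0.0, 0.0, 1.0), nrow = 3, byrow = TRUE)
P_b <- matrix(c(0.3, 0.3, 0.4,
                0.1, 0.3, 0.6,
                0.0, 0.0, 1.0), nrow = 3, byrow = TRUE)
# Runs scored on each transition (hypothetical values).
R_toy <- matrix(c(0, 1, 0,
                  1, 0, 0,
                  0, 0, 0), nrow = 3, byrow = TRUE)

simulate_half_inning <- function(lineup, R, batter = 1, start = 1) {
  absorbing <- nrow(R)          # last state ends the half-inning
  s <- start
  runs <- 0
  while (s < absorbing) {
    P <- lineup[[batter]]       # list lookup replaces the chain of if() tests
    s_new <- sample.int(absorbing, 1, prob = P[s, ])
    runs <- runs + R[s, s_new]
    s <- s_new
    batter <- batter %% length(lineup) + 1  # wrap around the batting order
  }
  list(runs = runs, next_batter = batter)
}

set.seed(1)
res <- simulate_half_inning(list(P_a, P_b), R_toy)
```

Returning next_batter lets the caller carry the batting order over to the next inning, which the question does through the zz variable.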

Related

Three functions in R

IS <- function(N,K,sigma,t,r,S_0,a,b,tol){
funct_1 <- function(x){
return((S_0*(exp(-0.5*(sigma^2)*t+sigma*sqrt(t)*x))*(sigma*sqrt(t)-x))+
(exp(-r*t))*K*x)
}
bisection_method <- function(a, b, tol, f = funct_1){
if (f(a)*f(b) > 0){
print("No root found.")
} else
while ((b - a)/2.0 > tol){
midpt= (a + b)/2.0
if (f(midpt) == 0){
return(midpt)
} else if (f(a)*f(midpt) < 0){
b = midpt
} else
a = midpt
}
return(midpt)
}
}
The function above returns nothing. My goal is to pass in the values of N, K, sigma, t, r, S_0, a and b and get midpt back. I have searched a lot but could not find anything that makes sense. My main difficulty is this: once all the inputs are supplied, funct_1 should define the expression, and that expression then needs to be passed to the next function, bisection_method, along with the values of a and b, to finally obtain midpt. Any suggestions are really appreciated. Please let me know if anything is unclear.
Your main function doesn't return anything. It just creates the auxiliary functions and then does nothing with them. That's why you're getting no output.
Try returning the result of the bisection method, called with the appropriate parameters, from your main function instead. I also edited it so that you get NULL when no root is found.
IS <- function(N,K,sigma,t,r,S_0,a,b,tol){
funct_1 <- function(x){
return((S_0*(exp(-0.5*(sigma^2)*t+sigma*sqrt(t)*x))*(sigma*sqrt(t)-x))+
(exp(-r*t))*K*x)
}
bisection_method <- function(a, b, tol, f = funct_1){
if (f(a)*f(b) > 0){
print("No root found."); return(NULL)
} else
while ((b - a)/2.0 > tol){
midpt= (a + b)/2.0
if (f(midpt) == 0){
return(midpt)
} else if (f(a)*f(midpt) < 0){
b = midpt
} else
a = midpt
}
return(midpt)
}
return(bisection_method(a,b,tol,funct_1))
}
Figured out some parameter combination that makes sense:
IS(1,1,1,4,5,1,.1,9,10^-4)
[1] 2.000023
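As a cross-check, and only as a sketch, the same root can be looked for with base R's uniroot instead of a hand-written bisection. The function f below repeats the expression from funct_1 with the parameter values from the call above as defaults. The tighter tolerance of 1e-8 is my own choice, not something from the question.

```r
# Same expression as funct_1, with the parameters from the call above
# supplied as default arguments.
f <- function(x, K = 1, sigma = 1, t = 4, r = 5, S_0 = 1) {
  S_0 * exp(-0.5 * sigma^2 * t + sigma * sqrt(t) * x) * (sigma * sqrt(t) - x) +
    exp(-r * t) * K * x
}

# uniroot needs f to change sign over the interval, just like bisection.
sol <- uniroot(f, interval = c(0.1, 9), tol = 1e-8)
sol$root
```

The root should land close to 2, in line with the bisection result, since the first term vanishes at x = sigma*sqrt(t) = 2 and the second term is tiny for these parameters.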

How to restart a loop with eval with timeout in R?

while (!exists("j")) {
i <- 1
repeat {
tryCatch(expr = {
print(i)
raw.result <- evalWithTimeout(Sys.sleep(i), timeout = 3)
if (i == 1) {
j <- i
} else {
j <- c(j, i)
}
i <- i + 1
}, TimeoutException = function(ex) {
rm("j")
})
}
}
The code above gets stuck at i=4 and keeps executing the function for i=4; however, I want it to restart from i=1 whenever there is an error.
Can someone please point out what I am doing wrong?
In your code, the TimeoutException handler is unable to find j, because the handler is evaluated in a different environment. Even if it could find it, nothing would change: tryCatch stops the error from breaking the repeat loop, so repeat simply continues with the current i. You could explicitly break out of the repeat, but in that case you have deleted j, so your while loop will stop.
I am not quite sure why you need the while loop here.
Here is a modification of your code that works the way you want.
First, explicitly set i <- 1, then reset it in the handler with i <<- 1 (note the <<-, since i lives one environment above the tryCatch handler).
i <- 1
repeat {
tryCatch(
expr = {
print(i)
raw.result <- R.utils::evalWithTimeout(Sys.sleep(i), timeout = 3)
if (i == 1) {
j <- i
} else {
j <- c(j, i)
}
i <- i + 1
},
TimeoutException = function(ex) {
i <<- 1
}
)
}
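The scoping point can be seen without R.utils at all. The sketch below swaps the real timeout for a simulated error raised with stop(), so it only illustrates how the handler has to use <<- to reach variables in the enclosing function. run_with_restart and its arguments are hypothetical names made up for this example.

```r
run_with_restart <- function(fail_at = 3, max_failures = 1, n = 4) {
  i <- 1
  failures <- 0
  done <- numeric(0)
  repeat {
    tryCatch({
      # Simulated failure standing in for a TimeoutException.
      if (i == fail_at && failures < max_failures) stop("simulated timeout")
      done <- c(done, i)   # expr runs in the function's frame, so <- is enough
      i <- i + 1
    }, error = function(e) {
      # The handler is its own function: plain <- would create local copies.
      failures <<- failures + 1
      done <<- numeric(0)
      i <<- 1              # restart from the beginning
    })
    if (i > n) break       # explicit exit, unlike the endless repeat above
  }
  list(done = done, failures = failures)
}

res <- run_with_restart()
```

With the defaults, the loop fails once at i = 3, restarts from 1, and then runs through to n.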

R: Using GraphNEL, term frequency of extracted keywords

I'm running the code below to extract key phrases from a raw data file. While I am able to do that successfully, I am not able to get the frequency or count of the extracted keywords, which would help me rank the keywords by occurrence, since I am using graphNEL. Is there any way I can get the key phrase count? TIA.
ConstructTextGraph <- function(n)
{
word_graph <- new("graphNEL")
i <- 1
while (i < length(words) ) {
if ( IsSelectedWord(words[i]) ) {
links <- GetWordLinks(i,n)
if (links[1] != "") {
cat(i," ",words[i]," - ",paste(c(links),collapse=" "),"\n")
if ( length(which(nodes(word_graph)==words[i]))==0 ) {
word_graph <- addNode(words[i],word_graph)
}
for (j in 1:length(links)) {
if ( length(which(nodes(word_graph)==links[j]))==0 ) {
word_graph <- addNode(links[j],word_graph)
word_graph <- addEdge(words[i],links[j],word_graph,1)
}
else {
if ( length(which(edges(word_graph,links[j])[[1]]==words[i]))>0 ) {
prev_edge_weight <- as.numeric(edgeData(word_graph,words[i],links[j],"weight"))
edgeData(word_graph,words[i],links[j],"weight") <- prev_edge_weight+1
}
else {
word_graph <- addEdge(words[i],links[j],word_graph,1)
}
}
}
}
}
i <- i+1
}
word_graph
}
Please let me know if more information is needed.
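One possible route, assuming words is a plain character vector of tokens as the code above suggests, is to count occurrences with base R's table independently of the graph. The vector words_toy below is made-up data, and the sketch does not touch graphNEL at all.

```r
# Made-up token vector standing in for `words` from the question.
words_toy <- c("graph", "rank", "graph", "text", "rank", "graph")

# Count each distinct token and sort from most to least frequent.
freq <- sort(table(words_toy), decreasing = TRUE)

# Look up the count for the keywords of interest (hypothetical selection).
keywords <- c("graph", "rank")
freq[keywords]
```

The counts could then be joined to the extracted key phrases by name.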

Loop for cricket code is not successful

I need to calculate the number of 4's and 6's a batsman has scored in a given cricket database.
The following is my code:
runs_batsman = ballbyball[,c("Batter","RunsBatsman")]
fours = data.frame(bat_name)
numoffours = function(name)
{counter = 0
for(i in nrow(ballbyball))
{
if((identical(toString(name),toString(runs_batsman[i,1]))))
{
if(runs_batsman[i,2]== 4)
{
counter = (counter + 1)
}
}
}
return(counter)
}
summary = function(dataset){
fours[,"numfours"]=0
for (j in nrow(bat_name)){
fours[j,2] = lapply(bat_name[j,1], numoffours)
}
return(fours)
}
I am not getting any values for any of the batsmen. bat_name is the data frame with all batsmen names. Please help me.
I am very new to R, so please explain in as much detail as possible.
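One likely cause, given only what is shown: for (i in nrow(ballbyball)) iterates over a single value, the row count, so only the last row is ever checked. seq_len(nrow(ballbyball)) would visit every row. A loop-free alternative is sketched below on made-up data. The column names Batter and RunsBatsman come from the question, while ballbyball_toy and boundaries are hypothetical names.

```r
# Made-up ball-by-ball data with the two columns used in the question.
ballbyball_toy <- data.frame(
  Batter = c("A", "B", "A", "A", "B"),
  RunsBatsman = c(4, 6, 4, 1, 4),
  stringsAsFactors = FALSE
)

# A logical vector sums to the number of TRUE values, so summing the
# comparison within each batter gives that batter's count.
fours <- tapply(ballbyball_toy$RunsBatsman == 4, ballbyball_toy$Batter, sum)
sixes <- tapply(ballbyball_toy$RunsBatsman == 6, ballbyball_toy$Batter, sum)

boundaries <- data.frame(
  Batter = names(fours),
  fours = as.vector(fours),
  sixes = as.vector(sixes)
)
```

Here tapply splits the comparison by batter, so no explicit loop over rows or names is needed.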

R: Optimise spike pruning function

Since I have not found an R package for analysis of electrophysiological data, I have used a function for spike pruning from my group:
prune.spikes <- function(spikes, min.isi) {
# copy spike matrix
prunedspikes <- spikes
# initialise index of last spike: infinitely before the first one.
for (i in 1:ncol(spikes)) {
last <- -Inf
for (j in 1:nrow(spikes)) {
if (spikes[j, i] == 1) {
if (j - last < min.isi) {
prunedspikes[j, i] <- 0; # remove the spike
}
else {
last <- j
}
}
}
}
return(prunedspikes)
}
The function takes a spike vector or matrix of 0 and 1 values and removes any 1 that occurs less than min.isi samples after the last retained spike.
Because of the two nested loops it takes ages to run. In order to optimise it I have come up with this solution (removes one loop):
prune.cols <- function(spikes, min.isi) {
prunedspikes <- apply(spikes, 2, FUN = prune.rows, min.isi = min.isi)
return(prunedspikes)
}
prune.rows <- function(spikes, min.isi) {
prunedspikes <- spikes
last <- -Inf
for (i in 1:length(spikes)) {
if (spikes[i] == 1) {
if (i - last < min.isi) {
prunedspikes[i] <- 0; # remove the spike
}
else {
last <- i
}
}
}
return(prunedspikes)
}
Calling prune.cols on a large data set is noticeably faster than the original version (about 60 times). One loop remains, though. So far I could not come up with a nice and simple solution. How can the function be improved even further?
As @Khashaa proposed, I implemented the function with the help of Rcpp:
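#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericMatrix prunespikes(NumericMatrix spikes, double minisi) {
// clone() makes a deep copy; plain assignment would alias (and modify) the input
NumericMatrix prunedspikes = clone(spikes);
int ncol = spikes.ncol();
int nrow = spikes.nrow();
for (int i = 0; i < ncol; i++) {
// find the first spike in this column without running past the last row
int last = 0;
while (last < nrow && spikes(last, i) == 0) {
last++;
}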
for (int j = last + 1; j < nrow; j++) {
if (spikes(j, i) == 1) {
if (j - last < minisi) {
prunedspikes(j, i) = 0;
} else {
last = j;
}
}
}
}
return prunedspikes;
}
If the speed difference is not a problem yet, it may be better to keep the loop instead of using Rcpp.
According to Hadley Wickham's article "Loops that should be left as is", keeping this loop is not a bad idea, since it falls into the recursive relationship case: each step depends on the result of the previous one.
Once speed becomes the bottleneck, resorting to Rcpp or this page (also suggested by the article) may be the solution.
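A middle ground that stays in plain R, offered as a sketch and not benchmarked here, is to loop only over the positions that actually contain a spike. The recursive dependency on the last retained spike is kept, but the zeros are skipped. prune_sparse is a hypothetical name. It handles a single vector and could be applied per column with apply as in prune.cols.

```r
prune_sparse <- function(x, min.isi) {
  idx <- which(x == 1)   # positions of all spikes, in increasing order
  last <- -Inf           # index of the last retained spike
  for (j in idx) {
    if (j - last < min.isi) {
      x[j] <- 0          # too close to the previous retained spike: remove
    } else {
      last <- j          # keep this spike and measure from it from now on
    }
  }
  x
}

spikes_toy <- c(1, 1, 0, 1, 0, 0, 1)
pruned <- prune_sparse(spikes_toy, min.isi = 3)
```

How much this helps depends on how sparse the spike trains are, since the loop length is the number of spikes and no longer the number of samples.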
