Loop for cricket code is not successful - r

I need to calculate the number of 4's and 6's a batsman has scored in a given cricket database.
The following is my code:
runs_batsman = ballbyball[,c("Batter","RunsBatsman")]
fours = data.frame(bat_name)
numoffours = function(name)
{counter = 0
for(i in nrow(ballbyball))
{
if((identical(toString(name),toString(runs_batsman[i,1]))))
{
if(runs_batsman[i,2]== 4)
{
counter = (counter + 1)
}
}
}
return(counter)
}
summary = function(dataset){
fours[,"numfours"]=0
for (j in nrow(bat_name)){
fours[j,2] = lapply(bat_name[j,1], numoffours)
}
return(fours)
}
I am not getting any values for any of the batsmen. bat_name is the data frame containing all batsmen's names. Please help me.
I am very new to R, so please explain in as much detail as possible.
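One likely cause, sketched below under assumptions: `for (i in nrow(x))` loops over the single number `nrow(x)`, so only the last row is ever visited, whereas `seq_len(nrow(x))` visits every row. The code above also never calls `summary()` (a name that masks the base R function), so `fours` may never be filled in. The toy `ballbyball` data and the helper name `count_runs` are hypothetical stand-ins, not the asker's real data.

```r
# Toy stand-in for the real data (hypothetical values)
ballbyball <- data.frame(
  Batter = c("A", "B", "A", "A", "B"),
  RunsBatsman = c(4, 6, 4, 1, 4),
  stringsAsFactors = FALSE
)

# Count how often batsman `name` scored exactly `value` runs off a ball.
# seq_len(nrow(...)) yields 1, 2, ..., n; nrow(...) alone yields only n.
count_runs <- function(name, value) {
  counter <- 0
  for (i in seq_len(nrow(ballbyball))) {
    if (ballbyball$Batter[i] == name && ballbyball$RunsBatsman[i] == value) {
      counter <- counter + 1
    }
  }
  counter
}

bat_names <- unique(ballbyball$Batter)
fours <- data.frame(
  Batter = bat_names,
  numfours = unname(vapply(bat_names, count_runs, numeric(1), value = 4)),
  numsixes = unname(vapply(bat_names, count_runs, numeric(1), value = 6))
)
print(fours)

# The same count for one batsman without an explicit loop
sum(ballbyball$Batter == "A" & ballbyball$RunsBatsman == 4)
```

The loop version stays close to the original code; the last line shows that a logical comparison plus `sum()` gives the same count directly.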

Related

Double sampling method in R

My initial code for double sampling is the following. I did only one sample.
# Data
samples<-matrix(NA,nrow = 12, ncol = 2000)
for (i in 1:12) {
samples[i,]<- rbinom(2000,1,prob = 0.05)
}
# Double Sampling Plan
accept<-rep(0,12)
for (i in 1:12) {
if (sum(samples[i,1:80])<=5){
accept[i]<-1
} else if (sum(samples[i,1:80])<=8) { # reached only when the first-sample count is > 5
if (sum(samples[i,1:160])<=12) {
accept[i]<-1
}
}
}
sum(accept)
Since the data are drawn randomly from a Bernoulli distribution, the results differ every time the code is run.
I want 100 repetitions of this double sample.
My solution:
nm=double(100)
for (j in 1:100){
# Data
samples<-matrix(NA,nrow = 12, ncol = 2000)
for (i in 1:12) {
samples[i,]<- rbinom(2000,1,prob = 0.05)
}
# Double Sampling Plan
accept<-rep(0,12)
for (i in 1:12) {
if (sum(samples[i,1:80])<=5){
accept[i]<-1
} else if (sum(samples[i,1:80])<=8) { # reached only when the first-sample count is > 5
if (sum(samples[i,1:160])<=12) {
accept[i]<-1
}
}
}
nm[j]=sum(accept)
}
mean(nm)
What do you think?
Following @Onyambu's suggestion, we can embed one simulation in a function and call it in a loop like this:
one_double_sampling <- function(){
# Data
samples<-matrix(NA,nrow = 12, ncol = 2000)
for (i in 1:12) {
samples[i,]<- rbinom(2000,1,prob = 0.05)
}
# Double Sampling Plan
accept<-rep(0,12)
for (i in 1:12) {
if (sum(samples[i, 1:80])<=5){
accept[i]<-1
} else if (sum(samples[i,1:80])<=8) { # reached only when the first-sample count is > 5
if (sum(samples[i,1:160])<=12) {
accept[i]<-1
}
}
}
return(sum(accept))
}
set.seed(123)
# number of sample
n <- 100
# stock the result
res <- rep(0, n)
for(i in 1:n){
res[i] <- one_double_sampling()
}
# mean
mean(res)
Your code is definitely correct. For anyone interested in the double sampling method, I advise you to see this.
Edit 1
As a one-liner, based on Onyambu's advice:
mean(replicate(n, one_double_sampling()))
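As a possible shortcut, the sum of 80 independent Bernoulli(0.05) draws is a Binomial(80, 0.05) count, so the defect counts can be drawn directly instead of building the full 12 x 2000 matrix. The sketch below assumes the intended plan is: accept if the first sample has at most 5 defects; if it has 6 to 8, accept when both samples together have at most 12. The function name and its parameters are hypothetical.

```r
# One run of the double sampling plan over `lots` lots, using defect counts only
one_double_sampling_vec <- function(lots = 12, n1 = 80, n2 = 80, p = 0.05) {
  d1 <- rbinom(lots, n1, p)  # defects in the first sample (items 1 to 80)
  d2 <- rbinom(lots, n2, p)  # defects in the second sample (items 81 to 160)
  accept <- d1 <= 5 | (d1 <= 8 & d1 + d2 <= 12)
  sum(accept)
}

set.seed(123)
res <- replicate(100, one_double_sampling_vec())
mean(res)
```

Because the random numbers are consumed differently, the result will not match the matrix-based version draw for draw, only in distribution.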

Manipulate the full row of a tibble with 2 loops

First, this is a homework question.
It's easy to manipulate the full row with:
testMan[2,] = apply(testMan[2,], 2, function(x) 100)
But we have to do this in a loop and it must be a function with a parameter.
manipulateRow = function(rowNumber){
i = 1;
for(row in testMan){
#print(i)
if(i == rowNumber){
for(price in row){
price = 100
}
break;
}
i = i + 1;
}
}
test = manipulateRow(2);
The goal is to replace every value in the 2nd row with 100.
There are more than 600 columns, so we have to do it with a loop.
We are working with tibbles and the tidyverse.
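A few things seem to stand in the way of the attempt above: `for (row in testMan)` iterates over the columns of a data frame or tibble, not its rows; assigning to the loop variable `price` changes only a local copy; and the function returns nothing, while R passes arguments by value. A minimal sketch of one way around this follows, using a small base data frame as a hypothetical stand-in for `testMan` (the same indexing should also work on a tibble). It uses a single loop over columns, so it may need adapting if the assignment strictly requires two loops.

```r
# Toy stand-in for the real tibble (hypothetical values)
testMan <- data.frame(a = 1:3, b = 4:6, c = 7:9)

# Set every column of one row to `value`, returning the modified copy
manipulateRow <- function(data, rowNumber, value = 100) {
  for (col in seq_along(data)) {
    data[[col]][rowNumber] <- value  # modify by index, not via the loop variable
  }
  data  # return the result; the caller's object is not changed in place
}

testMan <- manipulateRow(testMan, 2)
print(testMan)
```

Passing the data in as an argument and reassigning the result keeps the function free of hidden dependencies on a global `testMan`.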

Data table filtering does not work

I'm trying to create a data table and do some calculations on it for my assignment. However, after I create and manipulate the data, filtering does not work for some values. For instance, filtering column 's' for the value 0.7 returns no rows, while filtering for 0.9 works as it should. It's weird.
Thanks for any help.
library(data.table)
p<-as.data.table(cbind(0:6, c(0,0.15,0.33,0.37,0.40,0.42,0.43)))
states<-seq(from=0, to=30,by=0.1)
actions<-seq(from=0,to=6)
actions<-as.data.table(actions)
actions[,hunt_share:=numeric()]
for(i in 1:7)
{
if(i==1)
actions[i]$hunt_share<-0
else
actions[i]$hunt_share<-floor(164/(i-1)*10)/10
}
u_star<-data.table(t=integer(),s=numeric(),a=integer(),value=numeric())
r<-data.table(t=integer(),s_bar=numeric(),s=numeric(),a=integer(),value=numeric())
trans_prob<-data.table(t=integer(),s_bar=numeric(),s=numeric(),a=integer(),value=numeric())
str(trans_prob)
####### transitional probabilities
horizon<-5
for(time in 1:horizon)
{
for(i in states)
{
for(a in actions$actions)
{
if((a==0))
{
tmp<-data.table(t(c(time,max(i-6,0),i,a,1)))
colnames(tmp)<-colnames(trans_prob)
trans_prob<-rbind(trans_prob,tmp,fill=T)
}
if((a>0)&(i>=0.5))
{
tmp<-data.table(t(c(time,min(30,i+actions[actions==a]$hunt_share-6.5),i,a,p[V1==a]$V2)))
colnames(tmp)<-colnames(trans_prob)
trans_prob<-rbind(trans_prob,tmp,fill=T)
tmp<-data.table(t(c(time,max(i-6.5,0),i,a,1-p[V1==a]$V2)))
colnames(tmp)<-colnames(trans_prob)
trans_prob<-rbind(trans_prob,tmp,fill=T)
}
}
}
print(time)
}
####### transitional probabilities
##Bug?
trans_prob[s==0.9]
trans_prob[s==0.7]
##Bug?
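This is probably not a data.table bug but floating-point rounding: `seq(from = 0, to = 30, by = 0.1)` builds each value as a multiple of 0.1, and some of those multiples are not bit-for-bit equal to the decimal literal typed in the filter. The sketch below shows the effect in base R and two possible workarounds; the helper name `near` and the tolerance are assumptions.

```r
states <- seq(from = 0, to = 30, by = 0.1)

# Exact comparison can fail even though the printed value looks like 0.7
# (FALSE on typical IEEE 754 platforms)
states[8] == 0.7
# Comparison with a tolerance succeeds
isTRUE(all.equal(states[8], 0.7))

# Workaround 1: compare within a small tolerance
near <- function(x, target, tol = 1e-8) abs(x - target) < tol
which(near(states, 0.7))

# Workaround 2: build the grid from integers, so each value is the
# closest double to the decimal it represents
states_exact <- (0:300) / 10
states_exact[8] == 0.7

# In data.table syntax, the filter would then read:
# trans_prob[abs(s - 0.7) < 1e-8]
```

Storing the states as integers (tenths) and dividing only for display is another option that avoids the comparison problem altogether.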

R: Optimise spike pruning function

Since I have not found an R package for analysis of electrophysiological data, I have used a function for spike pruning from my group:
prune.spikes <- function(spikes, min.isi) {
# copy spike matrix
prunedspikes <- spikes
# initialise index of last spike: infinitely before the first one.
for (i in 1:ncol(spikes)) {
last <- -Inf
for (j in 1:nrow(spikes)) {
if (spikes[j, i] == 1) {
if (j - last < min.isi) {
prunedspikes[j, i] <- 0; # remove the spike
}
else {
last <- j
}
}
}
}
return(prunedspikes)
}
The function takes a spike vector or matrix consisting of 0 and 1 values and removes any 1 that occurs less than min.isi samples after the last retained spike.
Because of the two nested loops it takes ages to run. In order to optimise it I have come up with this solution (removes one loop):
prune.cols <- function(spikes, min.isi) {
prunedspikes <- apply(spikes, 2, FUN = prune.rows, min.isi = min.isi)
return(prunedspikes)
}
prune.rows <- function(spikes, min.isi) {
prunedspikes <- spikes
last <- -Inf
for (i in 1:length(spikes)) {
if (spikes[i] == 1) {
if (i - last < min.isi) {
prunedspikes[i] <- 0; # remove the spike
}
else {
last <- i
}
}
}
return(prunedspikes)
}
Calling prune.cols on a large data set is noticeably faster than the original version (~60 times). One loop remains, though, and so far I could not come up with a nice and simple solution. How can the function be improved even further?
As @Khashaa proposed, I implemented the function with the help of Rcpp:
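#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericMatrix prunespikes(NumericMatrix spikes, double minisi) {
  // clone() makes a deep copy; plain assignment would alias and modify the input
  NumericMatrix prunedspikes = clone(spikes);
  int ncol = spikes.ncol();
  int nrow = spikes.nrow();
  for (int i = 0; i < ncol; i++) {
    // find the first spike in this column, without running past the last row
    int last = 0;
    while (last < nrow && spikes(last, i) == 0) {
      last++;
    }
    for (int j = last + 1; j < nrow; j++) {
      if (spikes(j, i) == 1) {
        if (j - last < minisi) {
          prunedspikes(j, i) = 0; // remove the spike
        } else {
          last = j;
        }
      }
    }
  }
  return prunedspikes;
}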
If the speed difference is not a problem yet, it may be better to keep the loop instead of using Rcpp.
According to Hadley Wickham's article "Loops that should be left as is", keeping this loop is not a bad idea, since it falls into the recursive relationship case.
Once speed becomes the bottleneck, resorting to Rcpp or this page (also suggested by the article) may be the solution.
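If staying in plain R is preferred, one modest option is to loop only over the positions that actually contain a spike instead of over every sample. This keeps the recursive "last retained spike" logic and simply skips the zeros, which may help when spikes are sparse. The function names below are hypothetical, and this is a sketch rather than a benchmarked replacement.

```r
# Prune one spike vector, visiting only the indices where a spike occurs
prune_sparse <- function(spikes, min.isi) {
  last <- -Inf
  for (j in which(spikes == 1)) {
    if (j - last < min.isi) {
      spikes[j] <- 0  # too close to the last retained spike: remove it
    } else {
      last <- j       # keep the spike and remember its position
    }
  }
  spikes
}

# Column-wise wrapper for a spike matrix
prune_sparse_cols <- function(spikes, min.isi) {
  apply(spikes, 2, prune_sparse, min.isi = min.isi)
}

x <- c(1, 1, 0, 0, 1, 0, 0, 0, 0, 1)
prune_sparse(x, min.isi = 3)
```

In the small example, the spike at position 2 is removed because it follows the spike at position 1 by fewer than 3 samples, while the spikes at positions 5 and 10 are kept.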

Replacing nested while loops with vectorization/optimization

I am trying to code a lineup simulator that runs absorbing Markov chains many times, in the hope of approaching the "true" average number of runs a lineup would produce. I have looked at about 25 posts on here explaining various kinds of vectorization and optimization, but I was not able to work any of them into my code, so I thought I would ask whether there were any tips. Thank you!
PHLoops <- function(P1,P2,P3,P4,P5,P6,P7,P8,P9,PPH,R,start=1){
s <- start; path <- NULL; runs <- 0; zz=1;inn=1;gam=1;atbat=1
while (gam<163){
while(inn<10){
s <- start; path <- NULL;
while(s<25){
if(zz==1){
P=P1
}
if(zz==2){
P=P2
}
if(zz==3){
P=P3
}
if(zz==4){
P=P4
}
if(zz==5){
P=P5
}
if(zz==6){
P=P6
}
if(zz==7){
x3 <- sample(1:100, 1)
if((atbat>=24)&(table(P7==mat9)[1]>=625)&(atbat<=30)&(x3<=6)){
P=P7
}else if((atbat>=25)&(table(P7==mat9)[1]>624)&(atbat<=30)&(x3>=7)){
P=PPH
}else if(((atbat>=27)&(table(P7==mat9)[1]>624))){
P=PPH
}else{
P=P7
}
}
if(zz==8){
x3 <- sample(1:100, 1)
if((atbat>=25)&(table(P8==mat9)[1]>624)&(atbat<=30)&(x3<=3)){
P=P8
}else if((atbat>=26)&(table(P8==mat9)[1]>=625)&(atbat<=32)&(x3>=4)){
P=PPH
}else if(((atbat>=27)&(table(P8==mat9)[1]>624))){
P=PPH
}else{
P=P8
}
}
if(zz==9){
if(((atbat>=27)&(table(P9==mat9)[1]>624))){
P=PPH
}
else{
P=P9
}
}
s.new <- sample(1:25,1,prob=P[s,])
path <- c(path,s.new)
runs <- runs+R[s,s.new]
s <- s.new
zz=ifelse(zz==9,1,zz+1)
atbat=atbat+1
}
inn=inn+1
runs
}
gam=gam+1
atbat=1
inn=1
runs
}
runs
}
I then run this function using:
RUNS8 <- replicate(1000,PHLoops(mat2,
mat4,mat5,mat7,mat1,mat8,mat6,mat9,mat3,mat10,R))
Here mat1 to mat10 are 25x25 transition matrices, one for each player. R is a matrix that tells the function how many runs a team scores when going from one state to another. zz is a variable that tracks where we are in the batting order. The 7-9 spots are handled differently because the pitcher does not bat every time his spot comes up: after 2 full passes through the order, the pitcher no longer bats and PPH (the pinch hitter) bats instead. The pitcher also bats more often the closer he is to the top of the order (that is what x3 is simulating).
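The repeated test `table(P7==mat9)[1]>624` appears to check whether the matrix in a given slot is the pitcher's matrix: if all 625 cells match, the table has a single TRUE entry with count 625. That reading is an assumption. If it is right, the check can be computed once per call instead of once per at-bat, and `identical()` expresses it more directly. The sketch below compares the two on toy matrices; note that the table form also returns TRUE when all 625 cells differ, which is presumably unintended.

```r
# Toy 25x25 matrices (hypothetical values, not real transition matrices)
mat9 <- matrix(1 / 25, nrow = 25, ncol = 25)
other <- mat9
other[1, ] <- c(1, rep(0, 24))  # differs from mat9 in 25 cells
all_diff <- diag(25)            # differs from mat9 in all 625 cells

# The test as written in the question
table_test <- function(P, pitcher) unname(table(P == pitcher)[1] > 624)

table_test(mat9, mat9)      # TRUE: same matrix
table_test(other, mat9)     # FALSE: only 25 cells differ
table_test(all_diff, mat9)  # TRUE: every cell differs, an edge case

# A more direct check for "this slot holds the pitcher's matrix"
identical(mat9, mat9)
identical(other, mat9)
identical(all_diff, mat9)
```

Computing the flag once before the loops, as the updated code does with p7, p8 and p9, avoids rebuilding a table on every at-bat.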
Any help would be great and I will continue to read the forums for further tips.
Thank you!
UPDATE (2-11)
Made the changes that people suggested below and the code now looks like this:
simulate.half.inning9test <- function(P1,P2,P3,P4,P5,P6,P7,P8,P9,PPH,R,start=1){
s <- start; path <- NULL; runs <- 0; zz=1;inn=1;gam=1;atbat=1
p7=(table(P7==mat9)[1]>624)
p8=(table(P8==mat9)[1]>624)
p9=(table(P9==mat9)[1]>624)
order=list(P1,P2,P3,P4,P5,P6,P7,P8,P9)
while (gam<163){
x3 <- sample(1:100, 1)
while(inn<10){
s <- start; path <- NULL;
while(s<25){
P=order[[zz]]
if(zz==7){
if((atbat>=24)&&(p7==TRUE)&&(atbat<=30)&&(x3<=6)){
P=P7
}else if((atbat>=25)&&(p7==TRUE)&&(atbat<=30)&&(x3>=7)){
P=PPH
}else if(((atbat>=27)&&(p7==TRUE))){
P=PPH
}else{
P=P7
}
}
if(zz==8){
if((atbat>=25)&&(p8==TRUE)&&(atbat<=30)&&(x3<=3)){
P=P8
}else if((atbat>=26)&&(p8==TRUE)&&(atbat<=32)&&(x3>=4)){
P=PPH
}else if(((atbat>=27)&&(p8==TRUE))){
P=PPH
}else{
P=P8
}
}
if(zz==9){
if(((atbat>=27)&&(p9==TRUE))){
P=PPH
}
else{
P=P9
}
}
s.new <- sample(1:25,1,prob=P[s,])
runs <- runs+R[s,s.new]
s <- s.new
zz=ifelse(zz==9,1,zz+1)
atbat=atbat+1
}
inn=inn+1
runs
}
gam=gam+1
atbat=1
inn=1
runs
}
runs
}
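Because each state depends on the previous one, the innermost loop is hard to vectorize, but the nesting can be made easier to read and profile by isolating one half-inning in its own function. The sketch below is a simplified illustration under assumptions: it ignores the pinch-hitter rules, treats the last state as the absorbing one, and uses a toy 3-state chain. The function name and its arguments are hypothetical.

```r
# Simulate one half-inning: walk the chain until the absorbing state is reached.
# `order` is a list of transition matrices, one per lineup slot;
# `zz` is the slot of the first batter in this half-inning.
simulate_half_inning <- function(order, R, start = 1, zz = 1) {
  absorbing <- nrow(R)
  s <- start
  runs <- 0
  while (s < absorbing) {
    P <- order[[zz]]
    s.new <- sample.int(absorbing, 1, prob = P[s, ])
    runs <- runs + R[s, s.new]
    s <- s.new
    zz <- zz %% length(order) + 1  # move to the next batter, wrapping around
  }
  list(runs = runs, next_batter = zz)
}

# Toy deterministic chain: state 1 -> 2 -> 3 (absorbing), one run on 1 -> 2
P <- matrix(c(0, 1, 0,
              0, 0, 1,
              0, 0, 1), nrow = 3, byrow = TRUE)
R <- matrix(0, nrow = 3, ncol = 3)
R[1, 2] <- 1
out <- simulate_half_inning(list(P, P), R)
out
```

A game is then nine calls that pass `next_batter` along, and a season is a `for` loop over games, so the counters no longer need manual resetting between loops.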
