What is the depth of this recursive function?

What is the depth of the following recursive function?
T(n) = r + T(n - r)
where r < n; at each step, if 2r > n, then r is replaced by n - r;
the stop condition is n = r.
Example:
n = 24, r = 9
2r < n: 2*9 = 18 < 24, so r is fixed (r = 9)
T(24) = 9 + T(24 - 9) = 9 + T(15)
----------
2r > n: 2*9 = 18 > 15, so r = 15 - 9 = 6
T(15) = 6 + T(15 - 6) = 6 + T(9)
----------
2r > n: 2*6 = 12 > 9, so r = 9 - 6 = 3
T(9) = 3 + T(9 - 3) = 3 + T(6)
----------
2r = n: 2*3 = 6, so r is fixed (r = 3)
T(6) = 3 + T(6 - 3) = 3 + T(3)
----------
Stop condition: r equals n (r = n = 3)
The time complexity is O(n).
This problem can be analyzed with a recursion tree, but that approach is not generic for all values of n and r.
I am searching for a formula that gives the depth of this recursive equation.
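Lacking a closed form, the depth can at least be computed by simulating the recurrence directly. A minimal R sketch (the function name depth is illustrative, not part of the question):

# count the recursion levels of T(n) = r + T(n - r),
# replacing r with n - r whenever 2r > n
depth <- function(n, r) {
  if (n == r) return(0)        # stop condition: n = r
  if (2 * r > n) r <- n - r    # adjust r as in the definition
  1 + depth(n - r, r)          # one level deeper on T(n - r)
}
depth(24, 9)  # 4, matching the four expansion steps in the example above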

Related

Calculating the posterior probability (using JAGS)

I have created the following model in R (seen below). I now wish to calculate the posterior probability that the relative risk in each area exceeds 1.2. Does anyone know how I can do this? I was thinking an ifelse call in my function would work, but I wasn't able to get it working. Would this be correct? Something like:
P = ifelse(alpha > 1.2) # wasn't able to get it working, though
Ohio_data$SMR = Ohio_data$Obs/Ohio_data$Exp
Obs = Ohio_data$Obs
Exp = Ohio_data$Exp
# 1. define JAGS model as a function
jags.mod = function(){
  # priors
  alpha ~ dgamma(1, 1)
  beta0 ~ dunif(-100, 100)
  for(i in 1:88){
    theta[i] ~ dgamma(alpha, alpha)
    Obs[i] ~ dpois(mu[i])
    log(mu[i]) <- log(Exp[i]) + beta0 + log(theta[i])
    rr[i] = exp(beta0) * theta[i]
  }
}
# 2. prepare data
cancer_data = list('Obs', 'Exp')
obs_inits1 = list('beta0' = -10, 'alpha' = 1)
obs_inits2 = list('beta0' = 10, 'alpha' = 2)
obs_inits = list(obs_inits1, obs_inits2)
# parameters to monitor
params_jags = c('rr', 'alpha', 'beta0')
# 3. fit the model
jags.mod.fit.bomb = jags(data = cancer_data, inits = obs_inits,
                         parameters.to.save = params_jags,
                         n.chains = 2, n.burnin = 4000, n.iter = 10000,
                         model.file = jags.mod)
Data snippet:
Ohio_data
X Obs Exp SMR
1 1 14 15.678357 0.8929507
2 2 56 62.786481 0.8919117
3 3 26 26.953383 0.9646284
4 4 57 59.448398 0.9588147
5 5 21 25.710943 0.8167728
6 6 22 24.764319 0.8883749
7 7 67 52.437394 1.2777141
8 8 18 19.082278 0.9432836
9 9 149 129.573779 1.1499240
10 10 9 14.767335 0.6094532
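For what it's worth: ifelse() takes three arguments (test, yes, no), which is why the one-argument call above fails, but no ifelse is needed here. Assuming the model is fit with R2jags (whose jags() signature matches the call above), the posterior probability that each area's relative risk exceeds 1.2 is just the proportion of MCMC draws of rr[i] above 1.2. A sketch:

# a sketch, assuming the R2jags fit above
rr.draws <- jags.mod.fit.bomb$BUGSoutput$sims.list$rr  # matrix: draws x 88 areas
prob.exceed <- colMeans(rr.draws > 1.2)                # Pr(rr[i] > 1.2 | data) per area
head(prob.exceed)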

Find a solution to the equations in R

I have to write code for these equations to find μ_0 and σ_0. [The equations were given as an image.]
Here, Φ[.] is the cumulative standard normal distribution. The given values are σ = 2, E[M] = 10, and p = Pr[8 ≤ M ≤ 12] = 2/3.
My results should be μ_0 ≈ 0.28 and σ_0 ≈ 0.21, but I think something is wrong with my functions. Can you please help me?
sigma <- 2
E_M <- 10
Pr <- 2/3
a <- 8
b <- 12
# From the first equation, log(E[M]) = mu_0 + 1/2 sigma^2 + 1/2 sigma_0^2,
# so with sigma = 2 and E[M] = 10: mu_0 = 0.303 - 1/2 sigma_0^2
fun <- function(sigma_0) {
  pnorm((log(b) - 2 - 0.303 + 1/2 * sigma_0^2) / sigma_0,
        mean = 0.303 - 1/2 * sigma_0^2, sd = sigma_0) -
    pnorm((log(a) - 2 - 0.303 + 1/2 * sigma_0^2) / sigma_0,
          mean = 0.303 - 1/2 * sigma_0^2, sd = sigma_0) - Pr
}
sigma_0 <- seq(0.1, 2, 0.05)
uniroot(fun, lower = 0.1, upper = 2)
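The likely bug: the quantiles are already standardized by hand (the division by sigma_0), yet mean and sd are also passed to pnorm, so the standardization is applied twice; pnorm(x, mean, sd) already computes Φ((x - mean)/sd). A sketch of the fix, standardizing only once and keeping the question's mu_0 = 0.303 - 1/2 sigma_0^2 relation (the literal 2 is read as sigma^2/2, consistent with the comment above; fun2 is a made-up name):

# standardize once, then use pnorm's default standard normal
fun2 <- function(sigma_0) {
  mu_0 <- 0.303 - 1/2 * sigma_0^2   # from the first equation
  pnorm((log(b) - sigma^2/2 - mu_0) / sigma_0) -
    pnorm((log(a) - sigma^2/2 - mu_0) / sigma_0) - Pr
}
sigma_0 <- uniroot(fun2, lower = 0.1, upper = 2)$root  # approx. 0.21
mu_0 <- 0.303 - 1/2 * sigma_0^2                        # approx. 0.28

This reproduces the expected μ_0 ≈ 0.28 and σ_0 ≈ 0.21.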

Solve Equation in R for L

I have the following equation and would like R to solve for L.
Any thoughts?
Average = 370.4
m = 2
p = 0.2
n = 5
#L = ?
log10(Average) = 0.379933834 - 0.107509315*m + 0.104445717*p + 0.016517169*n - 0.025566689*L + 0.014393465*m*p + 0.001601271*m*n - 0.014250365*n*L + 0.002523518*m^2 + 0.237090759*L^2
Your equation is quadratic in L, so the quadratic formula works. Alternatively, you can solve numerically using uniroot:
Average = 370.4
m = 2
p = 0.2
n = 5
#L = ?
f0 <- function(L) {
  0.379933834 - 0.107509315*m + 0.104445717*p + 0.016517169*n - 0.025566689*L +
    0.014393465*m*p + 0.001601271*m*n - 0.014250365*n*L + 0.002523518*m^2 +
    0.237090759*L^2 - log10(Average)
}
# solve numerically using uniroot
(nroots <- c(uniroot(f0, c(0, 10))$root, uniroot(f0, c(-10, 0))$root))
#> [1] 3.304099 -2.895724
# solve analytically using the quadratic formula
a <- 0.237090759
b <- -0.025566689 - 0.014250365*n
c <- 0.379933834 - 0.107509315*m + 0.104445717*p + 0.016517169*n + 0.014393465*m*p + 0.001601271*m*n + 0.002523518*m^2 - log10(Average)
(aroots <- (-b + c(1, -1)*sqrt(b^2 - 4*a*c))/(2*a))
#> [1] 3.304084 -2.895724
# check the solutions
f0(c(nroots, aroots))
#> [1] 2.255707e-05 -5.932209e-08 4.440892e-16 4.440892e-16

Get terminal nodes of data in tree library

I am trying to build a regression or classification tree from some test data. My goal is to know how many terminal nodes/leaves my tree has and in which terminal node new data ends up.
I am using the tree library because it can output the node each data point lands in, via predict(tree.model, data=df, type="where").
I created some sample data and tried this, but it seems that predict does not output only terminal nodes. When I run my code, predict(...) has the factor levels 3 5 6 8 9, but the tree looks like
1) root 700 969.900 1 ( 0.487143 0.512857 )
2) B < 0.339751 346 104.300 0 ( 0.965318 0.034682 )
4) A < 0.747861 331 13.600 0 ( 0.996979 0.003021 ) *
5) A > 0.747861 15 17.400 1 ( 0.266667 0.733333 )
10) B < 0.139725 5 5.004 0 ( 0.800000 0.200000 ) *
11) B > 0.139725 10 0.000 1 ( 0.000000 1.000000 ) *
3) B > 0.339751 354 68.790 1 ( 0.019774 0.980226 )
6) A < 0.157866 8 6.028 0 ( 0.875000 0.125000 ) *
7) A > 0.157866 346 0.000 1 ( 0.000000 1.000000 ) *
(the "*" marks the terminal nodes).
Is there a way to get only terminal nodes, preferably within the tree library?
Here is my full example code; the major part is just creating the sample data.
library(ggplot2)
library(hrbrthemes)
#generating some data to test######################################
set.seed(42)
#category A
x1s = rchisq(500, 5, ncp = 0)
y1s = 1/x1s +0.1*rchisq(500, 8, ncp = 0)
x1s = (x1s-min(x1s))/max(x1s)
y1s = (y1s-min(y1s))/max(y1s)
#category B
x2s = 15-rchisq(500, 5, ncp = 0)
y2s = 5-(2.5 -1/400*(x2s-15)^2 +0.1*rchisq(500, 8, ncp = 0))
x2s = (x2s-min(x2s))/max(x2s)
y2s = (y2s-min(y2s))/max(y2s)
xs = c(x1s, x2s)
ys = c(y1s, y2s)
type = c(0*(1:500), 0*(1:500)+1)
df = data.frame(type, xs, ys)
names(df) = c("category","A","B")
df$category = factor(df$category)
#plot the generated data##########################################
ggplot(df, aes(x=A, y=B, color=category)) + geom_point(shape=1)
# separate into training and test data
alpha = 0.7
inTrain = sample(1:nrow(df), alpha*nrow(df))
train.set = df[inTrain,]
test.set = df[-inTrain, ]
####################################################################
#use tree to predict category
library(tree)
tree.model = tree(category ~ A + B, data = train.set)
factor(predict(tree.model, data = test.set, type="where"))
tree.model
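A sketch of how the "where" output can be interpreted, assuming the tree.model fit above (note that predict's argument is named newdata, not data, so data = test.set is silently ignored): type = "where" returns row indices into tree.model$frame, and the rownames of that frame are the node numbers from the printout, so the levels 3 5 6 8 9 appear to correspond exactly to the starred leaves 4, 10, 11, 6 and 7.

# a sketch, assuming the tree.model fit above
leaf.rows <- predict(tree.model, newdata = test.set, type = "where")
node.ids  <- rownames(tree.model$frame)[leaf.rows]   # frame rownames are node numbers
is.leaf   <- tree.model$frame$var == "<leaf>"        # "<leaf>" marks terminal rows
table(factor(node.ids, levels = rownames(tree.model$frame)[is.leaf]))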

Why is the dot product of a normalized vector always data size - 1?

I don't understand why the dot product of a normalized (scaled) vector with itself is always the data size minus 1.
a <- scale(rnorm(100))
crossprod(a)
# equal = 100 - 1 = 99
b <- scale(runif(50))
crossprod(b)
# equal = 50 - 1 = 49
c <- scale(rchisq(30, 5))
crossprod(c)
# equal = 30 - 1 = 29
I would like to understand the mathematics behind this.
Not in LaTeX, but a proof may help you understand.
Your values are scaled, so each entry is z_i = [x_i - mean(X)] / sd(X).
crossprod computes the sum of squares of the entries: Sum_i z_i^2 = Sum_i ([x_i - mean(X)] / sd(X))^2.
The variance (squared sd) is var(X) = sd(X)^2 = 1/(n-1) * Sum_i [x_i - mean(X)]^2, so Sum_i [x_i - mean(X)]^2 = (n-1) * sd(X)^2.
Therefore crossprod = 1/sd(X)^2 * Sum_i [x_i - mean(X)]^2 = 1/sd(X)^2 * (n-1) * sd(X)^2 = n - 1.
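A quick numeric check of the identity (illustrative; x and z are new names, not from the question):

# scale() subtracts the mean and divides by the sample sd
x <- rnorm(100)
z <- (x - mean(x)) / sd(x)
sum(z^2)                        # n - 1 = 99
sum((x - mean(x))^2) / var(x)   # same thing, since var(x) = sum((x - mean(x))^2) / (n - 1)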
