Using the beta.select function in R (prior estimate)

I am trying to formulate priors using total counts and a beta distribution.
I have written the following:
quantile(df$row, probs=c(0.00001, 0.5, 0.99999))
quantile1 <- list(p=0.5, x=8)
quantile2 <- list(p=0.99999, x=10)
quantile3 <- list(p=0.00001, x=1)
library("LearnBayes")
findBeta <- function(quantile1,quantile2,quantile3)
quantile1_p <- quantile1[[1]]; quantile1_q <- quantile1[[2]]
quantile2_p <- quantile2[[1]]; quantile2_q <- quantile2[[2]]
quantile3_p <- quantile3[[1]]; quantile3_q <- quantile3[[2]]
priorA <- beta.select(list(p=0.5, x=8), list(p=0.99999, x=10))
When I try to calculate priorA with the beta.select function, I get the following error:
Error in if (p0 < p) m.hi = m0 else m.lo = m0 :
missing value where TRUE/FALSE needed
In addition: Warning message:
In pbeta(x, K * m0, K * (1 - m0)) : NaNs produced
I just can't get rid of the error and do not know how to approach it any more. Urgently need help.

I am guessing (completely out of thin air) that you are dealing with percentages, in which case you want to use x/100:
beta.select(list(p=0.5, x=.08), list(p=0.9, x=.10))
# [1] 28.02 318.74
Either way, while it would be nice for beta.select to throw a more appropriate error message (or rather, to have an error check in there), the root of the issue is that your x's are out of bounds. (As @Didzis noted, the interval for a beta distribution is [0, 1].)
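Since beta.select does not validate its inputs, one option is to check them yourself before calling it. The following is a minimal sketch in base R, not part of LearnBayes: check_beta_quantile is a hypothetical helper name, and dividing by 100 assumes the values really are percentages. The last line uses qbeta to sanity-check the shape parameters reported above, whose quantiles should come out close to 0.08 and 0.10.

```r
# Hypothetical helper (not part of LearnBayes): stop early if a quantile
# specification cannot describe a beta distribution.
check_beta_quantile <- function(q) {
  stopifnot(is.list(q), all(c("p", "x") %in% names(q)))
  if (q$p <= 0 || q$p >= 1) stop("p must lie strictly between 0 and 1")
  if (q$x <= 0 || q$x >= 1) stop("x must lie strictly between 0 and 1; got ", q$x)
  invisible(q)
}

# The original specification fails the check because x = 8 is outside (0, 1).
bad <- try(check_beta_quantile(list(p = 0.5, x = 8)), silent = TRUE)
inherits(bad, "try-error")

# Assumption: the values are percentages, so divide by 100 before use.
q1 <- check_beta_quantile(list(p = 0.5, x = 8 / 100))
q2 <- check_beta_quantile(list(p = 0.9, x = 10 / 100))

# Sanity check of the shape parameters reported above, using base R only:
# the 0.5 and 0.9 quantiles of Beta(28.02, 318.74).
fitted_q <- qbeta(c(q1$p, q2$p), shape1 = 28.02, shape2 = 318.74)
fitted_q
```

Once both specifications pass the check, they can be handed to beta.select(q1, q2) as in the call above.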

Related

Having trouble with making K Nearest Neighbors work in R Studio

I'm trying to use the knn function in R, but I keep getting this error message when I try to run it.
> knn(Taxi_train,Taxi_test,cl,k=100)
Error in knn(Taxi_train, Taxi_test, cl, k = 100) :
NA/NaN/Inf in foreign function call (arg 6)
In addition: Warning messages:
1: In knn(Taxi_train, Taxi_test, cl, k = 100) : NAs introduced by coercion
2: In knn(Taxi_train, Taxi_test, cl, k = 100) : NAs introduced by coercion
I don't know what exactly is wrong with my code so I need some help to get it working.
I tried making sure that all the variables are numeric but that didn't change anything. It may also be an issue with my cl factor in the knn equation.
Here is what my code is currently:
date<-chicago_taxi$date
class(date)
Date <- as.Date(date)
class(Date)
Julian <- yday(Date)
class(Julian)
head(Julian)
chicago_taxi <- cbind(chicago_taxi,Julian)
chicago_taxi$seconds <- as.numeric(chicago_taxi$seconds)
set.seed(7777)
train_set <- sample(1:13081,10400,replace = FALSE)
Taxi_train <- chicago_taxi[train_set,]
Taxi_test <- chicago_taxi[-train_set,]
cl <- Taxi_train$payment_type
scale(chicago_taxi$miles)
scale(chicago_taxi$seconds)
scale(chicago_taxi$Julian)
knn(Taxi_train,Taxi_test,cl,k=100)

error optim birnbaum sanders distribution fnpar

I don't know why I get an error message when I'm trying to use optim in R.
I have the following data:
x <- c(6.0401209, 7.2888217, 0.4868070,
1.1699703, 51.5998419, 11.8766734,
2.3873264, 16.9583702, 21.6142835,
0.3133089, 3.4178360, 4.4367427,
2.0205100, 10.5798884, 0.4890031,
1.6734176, 10.2809820, 6.4705424,
5.6801965, 0.9438700)
And the following log-likelihood function:
log.lik.bs <- function(gamma, betha, z){
n <- length(z)
- n * log(gamma) - n * log(2*sqrt(2*2*pi)) - sum(log(z)) + sum(log(sqrt(z/betha)) + sqrt(betha/z)) - (1/2*gamma^2) * sum((sqrt(z/betha) - sqrt(betha/z))^2)
}
What I'm trying to do is the following:
optim(c(2, 6), log.lik.bs, control=list(fnscale=-1), x=x, method="BFGS")$par
But I get an error message:
Error in fn(par, ...) :
unused argument (x = c(6.04012089885228, 7.28882174812723, 0.486806990614708, 1.1699703323488, 51.5998418613029, 11.8766733963947, 2.38732637900487, 16.9583701851951, 21.6142834611592, 0.313308870127425, 3.41783600439905, 4.43674270859797, 2.02051001746263, 10.5798883747597, 0.489003100259996, 1.67341757119939, 10.2809820486722, 6.4705423816332, 5.68019649178721, 0.943869996033357))
It is not quite clear which parameters you are trying to optimize. I assume you want to optimize log.lik.bs with respect to gamma and betha for given z with initial values 2 and 6. In that case you have two errors in your code:
1. log.lik.bs expects an argument named z, but you are providing an argument named x. That is the error you are getting. Fix: z = x
2. When using optim, your target function must take the parameters as a single vector argument. From ?optim:
   "A function to be minimized (or maximized), with first argument the vector of parameters over which minimization is to take place. It should return a scalar result."
Combining this I get:
x <- c(6.0401209, 7.2888217, 0.4868070,
1.1699703, 51.5998419, 11.8766734,
2.3873264, 16.9583702, 21.6142835,
0.3133089, 3.4178360, 4.4367427,
2.0205100, 10.5798884, 0.4890031,
1.6734176, 10.2809820, 6.4705424,
5.6801965, 0.9438700)
log.lik.bs <- function(x, z){
gamma <- x[1]
betha <- x[2]
n <- length(z)
# Operators end each line so that R parses the three lines as one expression
- n * log(gamma) - n * log(2*sqrt(2*2*pi)) - sum(log(z)) +
sum(log(sqrt(z/betha)) + sqrt(betha/z)) -
(1/2*gamma^2) * sum((sqrt(z/betha) - sqrt(betha/z))^2)
}
optim(c(2, 6), log.lik.bs, control=list(fnscale=-1), z=x, method="BFGS")$par
Unfortunately this still throws an error:
Error in optim(c(2, 6), log.lik.bs, control = list(fnscale = -1), z = x, :
non-finite finite-difference value [1]
In addition, there are several warnings that NaNs were produced by sqrt and log. So my interpretation of your question might be wrong. After all, the function goes to infinity as gamma goes to zero.
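One way to keep the optimizer away from gamma <= 0 is to bound the parameters. The sketch below is only an illustration under a stated assumption: it uses the standard two-parameter Birnbaum-Saunders log-likelihood (derived from the CDF pnorm((sqrt(z/beta) - sqrt(beta/z)) / alpha)), which is not the same expression as the poster's log.lik.bs. The name bs_loglik and the lower bounds of 1e-6 are my own choices.

```r
x <- c(6.0401209, 7.2888217, 0.4868070, 1.1699703, 51.5998419,
       11.8766734, 2.3873264, 16.9583702, 21.6142835, 0.3133089,
       3.4178360, 4.4367427, 2.0205100, 10.5798884, 0.4890031,
       1.6734176, 10.2809820, 6.4705424, 5.6801965, 0.9438700)

# Hypothetical helper: log-likelihood of the standard Birnbaum-Saunders
# distribution. par is the parameter vector c(alpha, beta), as optim requires.
bs_loglik <- function(par, z) {
  alpha <- par[1]
  beta  <- par[2]
  # Standardised value whose distribution is N(0, 1) under the model
  w <- (sqrt(z / beta) - sqrt(beta / z)) / alpha
  # Derivative dw/dz, needed for the change of variables
  dwdz <- (sqrt(beta / z) + (beta / z)^1.5) / (2 * alpha * beta)
  sum(dnorm(w, log = TRUE) + log(dwdz))
}

start <- c(2, 6)

# L-BFGS-B accepts box constraints, so both parameters stay positive.
# fnscale = -1 turns the minimiser into a maximiser.
fit <- optim(start, bs_loglik, z = x, method = "L-BFGS-B",
             lower = c(1e-6, 1e-6), control = list(fnscale = -1))
fit$par
```

An alternative to box constraints is to optimize over log(alpha) and log(beta) and exponentiate inside the function, which also works with method="BFGS".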

R - numerical errors with analytical gradient?

I've got the following code:
theta=0.05
n=1000
m=200
r=rnorm(2000)
#ER check function
nu=Vectorize(function(a,tau){return(abs(tau-(a<0))*a^2)})
#Selecting 10 lowest sum values (lowest10 function returns indices)
lowest10=function(x){
values=sort(x)[1:min(10,length(x))]
indices=match(values,x)
return(indices)
}
sym.expectile=function(beta,e,abs.r){return(beta[1]+beta[2]*e+beta[3]*abs.r)}
ERsum=function(beta,tau,start,end){
y=r[(start+1):end]
X1=rep(1,n-1)
X3=abs(r[start:(end-1)])
X2=c()
X2[1]=e.sym.optimal[start-m]
for (i in 2:(n-1)){
X2[i]=sym.expectile(beta,X2[i-1],X3[i-1])
}
X=matrix(c(X1,X2,X3),ncol=3)
res=y-X%*%beta
sum.nu=mean(nu(res,tau))
return(sum.nu)
}
ERsum.gr=function(beta,tau,start,end){
y=r[(start+1):end]
X1=rep(1,n-1)
X3=abs(r[start:(end-1)])
X2=c()
X2[1]=e.sym.optimal[start-m]
for (i in 2:(n-1)){
X2[i]=sym.expectile(beta,X2[i-1],X3[i-1])
}
X=matrix(c(X1,X2,X3),ncol=3)
partial.beta0=c()
for (i in 1:(n-1)){partial.beta0[i]=-(1-beta[2]^(i))/(1-beta[2])}
gr.beta0=2/T*sum(abs(tau-(y<X%*%beta))*(y-X%*%beta)*partial.beta0)/1000
partial.beta1=c()
partial.beta1[1]=-X2[1]
for (i in 2:(n-1)){partial.beta1[i]=partial.beta1[i-1]*beta[2]-X2[i]}
gr.beta1=2/T*sum(abs(tau-(y<X%*%beta))*(y-X%*%beta)*partial.beta1)/1000
partial.beta2=c()
partial.beta2[1]=-X3[1]
for (i in 2:(n-1)){partial.beta2[i]=partial.beta2[i-1]*beta[2]-X3[i]}
gr.beta2=2/T*sum(abs(tau-(y<X%*%beta))*(y-X%*%beta)*partial.beta2)/1000
c(gr.beta0,gr.beta1,gr.beta2)
}
beta=matrix(nrow=1e4,ncol=3)
beta[,1]=runif(1e4,-1,0)#beta0
beta[,2]=runif(1e4,0,1)#beta1
beta[,3]=runif(1e4,-1,0)#beta2
e.sym.optimal=c()
tau.found.sym.optim=0.02234724
library('expectreg')
e.sym.optimal[1]=expectile(r[1:m],tau.found.sym.optim)
ERsums.sym=c()
for (i in 1:nrow(beta)){
ERsums.sym[i]=ERsum(beta[i,],tau.found.sym.optim,m+1,m+n)
}
initialbeta.esym=beta[lowest10(ERsums.sym),]
intermedietebeta.esym=matrix(ncol=3,nrow=10)
for (i in 1:10){
intermedietebeta.esym[i,]=optim(initialbeta.esym[i,],ERsum,
gr=ERsum.gr,tau=tau.found.sym.optim,
start=m+1,end=m+n,
method="BFGS")$par
}
I tried to replace the optim function with optimx, but got the following error:
Error: Gradient function might be wrong - check it!
To check if my gradient is ok I tried to evaluate values of gradient function using function grad from numDeriv and directly calling my ERsum.gr function. For the sample vector
beta
[1] -0.8256490 0.7146256 -0.4945032
I obtained the following results:
>grad(function(beta) ERsum(c(beta[1],beta[2],beta[3]),tau.found.sym.optim,m+1,m+n),beta)
[1] -0.6703170 2.8812666 -0.5573101
> ERsum.gr2(beta,tau.found.sym.optim,m+1,m+n)
[1] -0.6696467 2.8783853 -0.5567527
So here is my question: could these differences be pure numerical error, caused by rounding in partial.beta0, partial.beta1 and partial.beta2, which are just the components of the sum representing the gradient? I think so, because if my analytical formula for the gradient were missing something, the discrepancies would probably be much larger, but how can I be sure? If this is the case, is there any other way to obtain more accurate gradient values?
You've got further problems down the line even if you solve the question of whether that is really a proper gradient, which I see as too complex to tackle. If you take out the gr argument and try to run with only optimx instead of optim, you get:
Error in intermedietebeta.esym[i, ] <- optimx(initialbeta.esym[i, ], ERsum, :
number of items to replace is not a multiple of replacement length
This probably relates to the fact that optimx does not return the same structure as is returned by optim:
> optimx(initialbeta.esym[i,],ERsum,
+ tau=tau.found.sym.optim,
+ start=m+1,end=m+n,
+ method="BFGS")$par
NULL
> optimx(initialbeta.esym[i,],ERsum,
+ tau=tau.found.sym.optim,
+ start=m+1,end=m+n,
+ method="BFGS") # leave out `$par`
p1 p2 p3 value fevals gevals niter convcode kkt1 kkt2 xtimes
BFGS -1.0325 0.2978319 0.04921863 0.09326904 102 100 NA 1 TRUE FALSE 3.366
If you disagree with the decision to allow a default gradient estimate, then you need to narrow down your debugging to the function that throws the error:
Error: Gradient function might be wrong - check it!
> traceback()
3: stop("Gradient function might be wrong - check it! \n", call. = FALSE)
2: optimx.check(par, optcfg$ufn, optcfg$ugr, optcfg$uhess, lower,
upper, hessian, optcfg$ctrl, have.bounds = optcfg$have.bounds,
usenumDeriv = optcfg$usenumDeriv, ...)
1: optimx(initialbeta.esym[i, ], ERsum, gr = ERsum.gr, tau = tau.found.sym.optim,
start = m + 1, end = m + n, method = "BFGS")
And look at the documentation (there was no help page) and code for optimx:::optimx.check. This is the section of code that does the checking:
if (!is.null(ugr) && !usenumDeriv) {
gname <- deparse(substitute(ugr))
if (ctrl$trace > 0)
cat("Analytic gradient from function ", gname,
"\n\n")
fval <- ufn(par, ...)
gn <- grad(func = ufn, x = par, ...)
ga <- ugr(par, ...)
teps <- (.Machine$double.eps)^(1/3)
if (max(abs(gn - ga))/(1 + abs(fval)) >= teps) {
stop("Gradient function might be wrong - check it! \n",
call. = FALSE)
optchk$grbad <- TRUE
}
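
To see how strict this test is, the criterion from the excerpt above can be applied by hand to the two gradients quoted in the question. This is a sketch in base R: gradient_flagged is a hypothetical helper that mirrors the excerpt's comparison, and since the question does not report the value of ERsum at that beta, fval = 0.1 is only a placeholder assumption.

```r
# Values copied from the question: the numerical gradient from numDeriv's grad
# and the poster's analytic gradient, both evaluated at the same beta.
gn <- c(-0.6703170, 2.8812666, -0.5573101)
ga <- c(-0.6696467, 2.8783853, -0.5567527)

# Tolerance used in the excerpt: cube root of machine epsilon.
teps <- (.Machine$double.eps)^(1/3)

# Largest absolute disagreement between the two gradients.
max_diff <- max(abs(gn - ga))

# Hypothetical helper mirroring the excerpt's test; TRUE means optimx would stop.
gradient_flagged <- function(gn, ga, fval, tol = teps) {
  max(abs(gn - ga)) / (1 + abs(fval)) >= tol
}

# fval is not given in the question; 0.1 is a placeholder assumption.
gradient_flagged(gn, ga, fval = 0.1)

# |fval| would have to exceed this value for the check to pass.
fval_needed <- max_diff / teps - 1
fval_needed
```

The largest difference is roughly 0.003 while the tolerance is roughly 6e-6, so with these numbers the check only passes if the objective value is in the hundreds. A discrepancy of this size is therefore treated by optimx as a possible error in the gradient rather than as rounding noise.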

Error message with objects in 'lsoda' in R

Very new user here. I am trying to use lsoda to solve differential equations stratified into two layers (as denoted by the for(s in 1:2) loop).
When running this full code, I keep getting the error message
object 'N' not found
no matter where or how I try to define N.
Can anyone help spot the error or advise on what I'm doing wrong? Thanks in advance.
R code:
library(deSolve)
Dyn <- function(t, var,par) {
with(as.list(c(par, var)), {
for(s in 1:2){
#Derivatives
dX[s] <- mu*N[s] - sigma*X[s] - (c[s]*beta*(InD[s] +ID[s]+ IdT[s])/N[s])*X[s] - mu*X[s]
dXint[s] <- sigma*X[s] - (1-omega)*(c[s]*beta*(InD[s] +ID[s]+ IdT[s])/N[s])*Xint[s] - mu*Xprep[s]
dInD[s] <- (c[s]*beta*(InD[s] +ID[s]+ IdT[s])/N[s])*X[s] - psi*InD[s]- mu*InD[s]
dID[s] <- (1-omega)*(c[s]*beta*(InD[s] +ID[s]+ IdT[s]) /N[s])*Xint[s]+ psi*InD[s]- mu*ID[s]
N[s] <- X[s]+Xint[s]+InD[s]+ID[s]
diffs <- c(dX[s], dXint[s], dInD[s], dID[s], N[s])}
return(list(diffs))
})}
#Defining parameter and initial values
par <- c(mu=0.033, sigma=0.29, beta=0.40, c=c(2, 30), Ctot=1773600, N=c(332550, 36950), psi=0.022, omega=0.44)
init <- c(X=c(332550,36950), Xint=c(0,0), InD=c(1,1), ID=c(0,0))
t <- seq(0, 30, by=0.1)
#Numerical solution#
Hom.sol <- lsoda(init, t, Dyn,par)
I think you are mixing up parameters and variables. N seems to be defined as a parameter in par with dimension 2. However, in your model definition you are updating N with dimension 1.
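A related point that may explain the exact message: when named vectors are combined with c(), R flattens them and appends an index to each name, so the parameter vector contains N1 and N2 but nothing called N. The sketch below shows this in base R with a shortened parameter set; par_vec, par_list and births are illustrative names, not part of the original code.

```r
# c() flattens named vectors and appends an index to each name.
par_vec <- c(mu = 0.033, c = c(2, 30), N = c(332550, 36950))
names(par_vec)

# So with(as.list(par_vec), ...) defines N1 and N2, but no object called N.
"N" %in% names(as.list(par_vec))

# A list keeps each vector intact under a single name.
par_list <- list(mu = 0.033, c = c(2, 30), N = c(332550, 36950))
births <- with(par_list, mu * N)   # vectorised over both strata

# The same flattening applies to the state vector, so the strata have to be
# picked out by their flattened names (or by position) inside the model function.
init <- c(X = c(332550, 36950), Xint = c(0, 0), InD = c(1, 1), ID = c(0, 0))
X <- unname(init[c("X1", "X2")])
X
```

As far as I know, deSolve documents its parms argument as a vector or list, so passing par_list instead of par should be possible, though I have not run the full model here. If N is meant to be the current population size, it is probably cleaner to compute it from the state variables inside Dyn rather than returning it as if it were a derivative.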

do a nonlinear least square fit in r

I have two vectors:
y <- c(0.044924, 0.00564, 0.003848, 0.002385, 0.001448, 0.001138,
0.001025, 0.000983, 0.00079, 0.000765, 0.000721, 0.00061, 0.000606,
0.000699, 0.000883, 0.001069, 0.001226, 0.001433, 0.00162, 0.001685,
0.001604, 0.001674, 0.001706, 0.001683, 0.001505, 0.001497, 0.001416,
0.001449, 0.001494, 0.001544, 0.00142, 0.001458, 0.001544, 0.001279,
0.00159, 0.001756, 0.001749, 0.001909, 0.001885, 0.002063, 0.002265,
0.002137, 0.002391, 0.002619, 0.002733, 0.002957, 0.003244, 0.003407,
0.003563, 0.003889, 0.004312, 0.004459, 0.004946, 0.005248, 0.005302,
0.00574, 0.006141, 0.006977, 0.007386, 0.007843, 0.008473, 0.008949,
0.010164, 0.010625, 0.011279, 0.01191, 0.012762, 0.014539, 0.01477)
x <- 0:68
I am trying to use the non-linear least squares function to fit the data but I keep getting the error:
Error in nlsModel(formula, mf, start, wts) : singular gradient matrix at initial parameter estimates
My code is:
a=0.00012
b=0.08436
k=0.21108
fit = nls(y ~ (a*b*exp(b*x)*k)/((k*b)+(a*(exp(b*x)-1))), start=list(a=a,b=b,k=k))
The parameters I have entered are values that I know are close to the expected ones. Does anyone know what I am doing wrong here?
I have tried various initial values for the parameters a, b and k, but I always get some kind of error.
Use optim() instead. You have to make a function which takes a,b and k as input (collected as a vector), and which returns the squared error as a result:
func <-function(pars) {
a <- pars["a"]
b <- pars["b"]
k <- pars["k"]
fitted <- (a*b*exp(b*x)*k)/((k*b)+(a*(exp(b*x)-1)))
sum((y-fitted)^2)
}
Then we run optim() using the initial values:
result <- optim(c(a=0.00012, b=0.08436, k=0.21108), func)
To test the resulting fit:
plot(x, y)
a <- result$par["a"]
b <- result$par["b"]
k <- result$par["k"]
lines(x, (a*b*exp(b*x)*k)/((k*b)+(a*(exp(b*x)-1))), col = "blue")
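The model formula appears twice above, once in the objective and once in the plotting call, which makes it easy for the two to drift apart. A possible variation is to write the model once and reuse it. This is a sketch only: model_fun and sse are hypothetical names, and to keep the block self-contained it fits simulated stand-in data generated from the question's starting values, not the poster's y.

```r
# Model from the question, written once so the objective and the fitted
# curve share the same formula.
model_fun <- function(pars, x) {
  a <- pars[["a"]]
  b <- pars[["b"]]
  k <- pars[["k"]]
  (a * b * exp(b * x) * k) / ((k * b) + a * (exp(b * x) - 1))
}

# Sum of squared errors between observations and the model.
sse <- function(pars, x, y) sum((y - model_fun(pars, x))^2)

# Simulated stand-in data (an assumption for illustration, NOT the poster's y).
x_sim <- 0:68
true_pars <- c(a = 0.00012, b = 0.08436, k = 0.21108)
set.seed(1)
y_sim <- model_fun(true_pars, x_sim) * exp(rnorm(length(x_sim), sd = 0.05))

# Start away from the generating values and let optim (Nelder-Mead) improve them.
start <- true_pars * c(2, 0.8, 1.5)
fit <- optim(start, sse, x = x_sim, y = y_sim)
fit$par

# To inspect the fit:
# plot(x_sim, y_sim)
# lines(x_sim, model_fun(fit$par, x_sim), col = "blue")
```

With the real data, replace x_sim and y_sim by x and y. Because the three parameters differ by orders of magnitude, setting control = list(parscale = ...) in optim may also help, but that is untested here.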
