I wrote a function to calculate the critical depth of water in a circular channel, given the flow (Q) and the diameter (Dia):
D_Critic <- function(Q, Dia) {
  g = 9.81
  Diff = 1
  Phi = 0.01
  while (Diff >= 0.001) {
    A = 16 * Q * sqrt((2 / g) * sin(Phi / 2))
    B = Dia^5 / 2 * (Phi - sin(Phi))^3 / 2
    Diff = A - B
    Phi = Phi + 0.001
    Yc = Dia / 2 * (1 - cos(Phi / 2))
  }
  return(Yc)
}
Now I want to use within() to bind Yc to the data frame DQ, but it returns only the first calculated Yc, along with many repeated warnings:
Q<-c(2.5975,2.5900,2.4183,2.3077)
D<-c(1,1,1,1)
DQ<-data.frame(Q,D)
> D_Q<-within(DQ,Yc<-D_Critic( Q/2, D))
There were 50 or more warnings (use warnings() to see the first 50)
> D_Q
Q D Yc
1 2.5975 1 0.52609
2 2.5900 1 0.52609
3 2.4183 1 0.52609
4 2.3077 1 0.52609
> warnings()
Warning messages:
1: In while (Diff >= 0.001) { ... :
the condition has length > 1 and only the first element will be used
The while statement takes only one boolean value, e.g. Diff >= 0.001, where Diff must be a single number. The first time you go through the loop this is the case, as Diff equals 1. In the second iteration, however, Diff becomes A - B, where A and B are both vectors of length 4.
So, when your code reaches the second iteration, while generates a warning, as it is not sure how to deal with a vector of booleans. The choice it makes is to simply use the first element of the boolean vector and discard the rest.
You need to consider what Diff actually is. It is probably a single number, such as Diff = sum(A - B) or Diff = sum((A - B)^2). That would produce a single Diff value and get rid of your warnings. What Diff should be exactly depends on the theory you are working with; your textbook should list this.
It was resolved with a workaround, computing Yc one element at a time:
Yc <- matrix(NA, length(DQ$Q), 1)
for (i in 1:length(DQ$Q)) {
  Yc[i, 1] <- D_Critic(DQ$Q[i] / 2, DQ$D[i])
}
> Yc
[,1]
[1,] 0.5260900
[2,] 0.5255907
[3,] 0.5163489
[4,] 0.5098512
DQ<-cbind(DQ,Yc)
> DQ
Q D Yc
1 2.5975 1 0.5260900
2 2.5900 1 0.5255907
3 2.4183 1 0.5163489
4 2.3077 1 0.5098512
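For reference, since D_Critic only handles scalar inputs, the same per-row result can be obtained without the explicit loop. A minimal sketch using mapply(), assuming the DQ data frame from above:
# mapply() calls D_Critic once per (Q, D) pair and returns a vector
DQ$Yc <- mapply(D_Critic, DQ$Q / 2, DQ$D)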
While working on an Rcpp program, I used the sample() function, which gave me the following error: "NAs not allowed in probability." I traced this issue to the fact that the probability vector I used had NA values in it, but I have no idea how they got there. Below is some R code that reproduces the problem:
n.0=20
n.1=20
n.reps=1
beta0.vals=rep(seq(-.3,.1,,n.0),n.reps)
beta1.vals=rep(seq(-7,0,,n.1),n.reps)
beta.grd=as.matrix(expand.grid(beta0.vals,beta1.vals))
n.rnd=200
beta.rnd.grd=cbind(runif(n.rnd,min(beta0.vals),max(beta0.vals)),runif(n.rnd,min(beta1.vals),max(beta1.vals)))
beta.grd=rbind(beta.grd,beta.rnd.grd)
N = 22670
count = 0
for(i in 1:dim(beta.grd)[1]){ # iterate through 600 possible beta values in beta grid
  beta.ind = 0 # indicator for current pair of beta values
  for(j in 1:N){ # iterate through all possible Nsums
    logit = beta.grd[i,1]/N*(j - .1*N)^2 + beta.grd[i,2]
    phi01 = exp(logit)/(1 + exp(logit))
    if(is.na(phi01)){
      count = count + 1
    }
  }
}
cat("Total number of invalid probabilities: ", count)
Here, $\beta_0 \in (-0.3, 0.1), \beta_1 \in (-7, 0), N = 22670, N_\text{sum} \in (1, N)$. Note that $N$ and $N_\text{sum}$ are integers, whereas the beta values may not be.
Since mathematically $\phi_{01} \in (0,1)$, I'm assuming the NAs arise because R does not like extremely small values. I am also receiving an overwhelming number of NA values, more NAs than numbers. Why would I be getting NAs in this code?
Add print(logit) next to count = count + 1 and you will find lots of logit values greater than 1000. exp(1000) is Inf, so you divide Inf by Inf, which gives you NaN, and NaN is NA:
> exp(500)
[1] 1.403592e+217
> Inf/Inf
[1] NaN
> is.na(NaN)
[1] TRUE
So your problem is not numbers that are too small but numbers that are too large, coming out of the evaluation of exp(x) for x larger than roughly 710:
> exp(709)
[1] 8.218407e+307
> exp(710)
[1] Inf
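The exact cutoff follows from the largest double R can represent; a quick check using the documented .Machine constants:
log(.Machine$double.xmax)
## [1] 709.7827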
Bernhard's answer correctly identifies the problem:
If logit is large, exp(logit) = Inf.
Here is a solution:
count = 0 # reset the counter before re-running the loop
for(i in 1:dim(beta.grd)[1]){ # iterate through 600 possible beta values in beta grid
  beta.ind = 0 # indicator for current pair of beta values
  for(j in 1:N){ # iterate through all possible Nsums
    logit = beta.grd[i,1]/N*(j - .1*N)^2 + beta.grd[i,2]
    ## This one isn't great because exp(logit) can be very large
    # phi01 = exp(logit)/(1 + exp(logit))
    ## So, we say instead
    ## phi01 = 1 / (1 + exp(-logit))
    phi01 = plogis(logit)
    if(is.na(phi01)){
      count = count + 1
    }
  }
}
cat("Total number of invalid probabilities: ", count)
# Total number of invalid probabilities: 0
We can use the more stable form 1 / (1 + exp(-logit)) (to convince yourself of the equivalence, multiply your expression by exp(-logit) / exp(-logit)). Luckily, either way, R has a built-in function plogis() that calculates these probabilities quickly and accurately.
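Concretely, for finite values of logit the algebra behind the equivalence is
$$\frac{e^{x}}{1+e^{x}} = \frac{e^{x}}{1+e^{x}} \cdot \frac{e^{-x}}{e^{-x}} = \frac{1}{e^{-x}+1},$$
and the right-hand form degrades gracefully: for large positive x, exp(-x) underflows to 0 and the result is 1; for large negative x, exp(-x) overflows to Inf and 1/Inf is 0, the correct limit, instead of Inf/Inf giving NaN.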
You can see from the help file (?plogis) that this function evaluates the expression I gave, but you can also double-check to assure yourself:
x = rnorm(1000)
y = 1 / (1 + exp(-x))
z = plogis(x)
all.equal(y, z)
[1] TRUE
I'm coding in R, and I want to replace all values in a given vector that are over 0.5 with 1, and any value under 0.5 with -1. I know how to do this with two sequential calls to "replace", such as:
my_nums <- c(0.2,0.8,0.3,0.4)
my_nums_adj_temp <- replace(my_nums,my_nums>0.5,1)
my_nums_adj <- replace(my_nums_adj_temp,my_nums<0.5,-1)
And so "my_nums" goes from
> my_nums
[1] 0.2 0.8 0.3 0.4
to
> my_nums_adj
[1] -1 1 -1 -1
But is there a way to do this with just a single call to "replace"?
sign(my_nums - .5) + .5 * (my_nums == .5)
## [1] -1 1 -1 -1
This returns the same values as the code in the question. However, if you really want to return 0 for an input component that equals .5 (rather than .5, which is what the code in the question returns), omit the second term; if you want some other value in that case, replace the second .5 with that value.
If you want to return 1 when the input is greater than or equal to .5 and -1 when it is less, you can use this instead. For each component of the input, only one of the two terms can be non-zero, so the result is 1 or -1 for each component:
(my_nums >= .5) - (my_nums < .5)
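As an aside, if the requirement to use replace is loose, ifelse() expresses the mapping in a single call; note it maps an input of exactly 0.5 to 1, unlike the code in the question, which leaves such a value unchanged:
ifelse(my_nums >= 0.5, 1, -1)
## [1] -1 1 -1 -1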
I'm trying to use constrOptim to optimize the sum of squared errors from a linear multiple regression. The main equation is D = Beta1*Xa + Beta2*Xb + Beta3*Xc + Beta4*Xd, with D, Xa, Xb, Xc, Xd read from an imported .csv file; the betas are the coefficients I want to find by minimizing the squared errors.
So far I have imported the .csv file into R, named each column Ds, Xa, Xb, Xc, Xd, and created the objective function
objfunction = function(Beta1, Beta2, Beta3, Beta4) (sum(Ds) - sum(Beta1*Xa + Beta2*Xb + Beta3*Xc + Beta4*Xd))^2  # "sum(E^2)"
I also created the matrix C and vector d to configure the constraints that should restrict the betas to values <= 0. I don't know how to find the feasible region, although I've used initial values that made the function work.
Here is the code:
> Tabela= read.table("Simulacao.csv", header=T, sep= ";")
> Tabela
D A B C D.1
1 -1 1 -1 0 0
2 4 0 0 1 -1
3 4 1 0 -1 0
4 0 0 1 0 -1
5 -2 1 0 0 -1
> Ds= Tabela[,1]
> Xa= Tabela[,2]
> Xb= Tabela[,3]
> Xc= Tabela[,4]
> Xd= Tabela[,5]
> simulaf= function(x1,x2,x3,x4) {
+ Ds= Tabela[,1]
+ Xa= Tabela[,2]
+ Xb= Tabela[,3]
+ Xc= Tabela[,4]
+ Xd= Tabela[,5]
+ J=sum(Ds)
+ H=sum(x1*Xa+x2*Xb+x3*Xc+x4*Xd)
+ sx=(J-H)^2
+ return(sx)
+ }
> s= function(x) {simulaf(x[1],x[2],x[3],x[4])}
> d= c(0,0,0,0)
> C= matrix(c(-1,0,0,0,0,-1,0,0,0,0,-1,0,0,0,0,-1),nrow=4,ncol=4,byrow=T)
> constrOptim(c(-1,-1,-1,-1),s,NULL,C,d)
$par
[1] -0.2608199 -0.8981110 -1.1095961 -1.9274866
The result I expect should be:
$par
[1] -0.125 0 -0.5 -0.875
After researching this, my conclusion is that it could be bad initial values, a parameterization problem (I don't understand why that would be needed), or simply that I have programmed it incorrectly.
What do I need to do to fix this?
The formula for the sum of squared errors is
sum((y - yhat)^2)
and not
(sum(y) - sum(yhat))^2
where yhat is the predicted value.
Also, if your only constraints are that the estimated betas should be negative (which is a bit weird, usually you want them to be positive but never mind), then you don't need constrOptim. Regular optim(method="L-BFGS-B") or nlminb will work with so-called box constraints.
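A minimal sketch of both fixes together, assuming the column assignments Ds, Xa, Xb, Xc, Xd from the question (the names sse and fit are mine):
# Corrected objective: residuals are squared per observation, then summed
sse <- function(beta) {
  pred <- beta[1]*Xa + beta[2]*Xb + beta[3]*Xc + beta[4]*Xd
  sum((Ds - pred)^2)
}
# Box-constrained fit: each beta is bounded above by 0
fit <- optim(par = c(-1, -1, -1, -1), fn = sse,
             method = "L-BFGS-B", upper = rep(0, 4))
fit$par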
Suppose the following system of equations Ax = b with:
> A <- matrix(c(2,0,-1,0,0,2,2,1,-1,2,0,0,0,1,0,0), ncol = 4)
> A
[,1] [,2] [,3] [,4]
[1,] 2 0 -1 0
[2,] 0 2 2 1
[3,] -1 2 0 0
[4,] 0 1 0 0
> b <- c(-2,5,0,0)
Solving these equations with solve() yields:
> x <- solve(A,b)
> x
[1] 6.66e-16 4.44e-16 2.00e+00 1.00e+00
This is just an example, but A and b can be of any form.
I need to detect whether any component of x is 0. The first two components should actually be 0, but they are both larger than the machine epsilon .Machine$double.eps = 2.22e-16, which makes them very small, but not equal to zero.
I think I understand that this is caused by rounding errors in floating-point arithmetic inside solve(). What I need to know is whether it is possible (from a practical point of view) to determine an upper bound on these errors, so the 0s can be detected. For example, instead of
> x == 0
[1] FALSE FALSE FALSE FALSE
one would use something like this:
> x > -1e-15 & x < 1e-15
[1] TRUE TRUE FALSE FALSE
Giving more insight into this problem would be appreciated.
One way to approach this is to check whether we can find a better solution to the linear system if we assume those components to be zero. For that we would want to solve A[,3:4] %*% y = b, since A %*% c(0,0,x[3],x[4]) = A[,3:4] %*% c(x[3],x[4]). This is an overdetermined system, so we can't use solve to find a solution. We can, however, use qr.solve:
> x.new = c(0,0,qr.solve(A[,3:4],b))
It remains to check if this solution is really better:
> norm(A%*%x.new - b) < norm(A%*%x - b)
[1] TRUE
Thus we have a good reason to suspect that x[1]==x[2]==0.
In this simple example it is obviously possible to guess the true solution by looking at the approximate solution:
> x.true = c(0,0,2,1)
> norm(A%*%x.true - b)
[1] 0
This is however not very helpful in the general case.
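In practice, a pragmatic alternative is to compare against a tolerance scaled to the inputs rather than against exact zero; the scaling factor below is an assumption, not a rigorous bound:
# Treat entries as zero when they are tiny relative to A and b
tol <- 100 * .Machine$double.eps * max(abs(A)) * max(abs(b), 1)
abs(x) < tol
## [1]  TRUE  TRUE FALSE FALSE
# Or round away the numeric fuzz for display
zapsmall(x)
## [1] 0 0 2 1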
I am learning about Q-learning and found a Wikipedia post and this website.
Following the tutorials and pseudocode, I wrote this much in R:
#q-learning example
#http://mnemstudio.org/path-finding-q-learning-tutorial.htm
#https://en.wikipedia.org/wiki/Q-learning
set.seed(2016)
iter = 100
dimension = 5
alpha = 0.1 # learning rate
gamma = 0.8 # exploration/discount factor
# n x n matrix
Q = matrix(rep(0, len = dimension*dimension), nrow = dimension)
Q
# R: -1 is fire pit, 0 safe path and 100 goal state
R = matrix(sample(-1:0, dimension*dimension, replace = T, prob = c(1,2)), nrow = dimension)
R[dimension, dimension] = 100
R # reward matrix
for(i in 1:iter){
  row = sample(1:dimension, 1)
  col = sample(1:dimension, 1)
  I = Q[row, col] # randomly choosing initial state
  Q[row, col] = Q[row, col] + alpha*(R[row, col] + gamma*max(Qdash - Q[row, col])
  # equation from wikipedia
}
But I have a problem with max(Qdash - Q[row,col]), which according to the website should be Max[Q(next state, all actions)]. How do I programmatically search all actions for the next state?
The second problem is this pseudocode:
Do While the goal state hasn't been reached.
  Select one among all possible actions for the current state.
  Using this possible action, consider going to the next state.
  Get maximum Q value for this next state based on all possible actions.
  Compute: Q(state, action) = R(state, action) + Gamma * Max[Q(next state, all actions)]
  Set the next state as the current state.
End Do
Is it this?
while(Q < 100){
  Q[row, col] = Q[row, col] + alpha*(R[row, col] + gamma*max(Qdash - Q[row, col])
}
This post is by no means a complete implementation of Q-learning in R. It is an attempt to answer the OP with regards to the description of the algorithm in the website linked in the post and in Wikipedia.
The assumption here is that the reward matrix R is as described in the website. Namely that it encodes reward values for possible actions as non-negative numbers, and -1's in the matrix represent null values (i.e., where there is no possible action to transition to that state).
With this setup, an R implementation of the Q update is:
Q[cs,ns] <- Q[cs,ns] + alpha*(R[cs,ns] + gamma*max(Q[ns, which(R[ns,] > -1)]) - Q[cs,ns])
where
cs is the current state at the current point in the path.
ns is the new state based on a (randomly) chosen action at the current state. This action is chosen from the collection of possible actions at the current state (i.e., for which R[cs,] > -1). Since the state transition itself is deterministic here, the action is the transition to the new state.
For this action resulting in ns, we want to add its maximum (future) value over all possible actions that can be taken at ns. This is the so-called Max[Q(next state, all actions)] term in the linked website and the "estimate of optimal future value" in Wikipedia. To compute this, we want to maximize over the ns-th row of Q but consider only columns of Q for which columns of R at the corresponding ns-th row are valid actions (i.e., for which R[ns,] > -1). Therefore, this is:
max(Q[ns, which(R[ns,] > -1)])
An interpretation of this value is a one-step look ahead value or an estimate of the cost-to-go in dynamic programming.
The equation in the linked website is the special case in which alpha, the learning rate, is 1. We can view the equation in Wikipedia as:
Q[cs,ns] <- (1-alpha)*Q[cs,ns] + alpha*(R[cs,ns] + gamma*max(Q[ns, which(R[ns,] > -1)]))
where alpha "interpolates" between the old value Q[cs,ns] and the learned value R[cs,ns] + gamma*max(Q[ns, which(R[ns,] > -1)]). As noted in Wikipedia, "in fully deterministic environments, a learning rate of alpha = 1 is optimal".
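In the usual notation, this interpolation is the standard Q-learning update:
$$Q(s,a) \leftarrow (1-\alpha)\,Q(s,a) + \alpha\Bigl(R(s,a) + \gamma \max_{a'} Q(s',a')\Bigr)$$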
Putting it all together into a function:
q.learn <- function(R, N, alpha, gamma, tgt.state) {
  ## initialize Q to be zero matrix, same size as R
  Q <- matrix(rep(0, length(R)), nrow = nrow(R))
  ## loop over episodes
  for (i in 1:N) {
    ## for each episode, choose an initial state at random
    cs <- sample(1:nrow(R), 1)
    ## iterate until we get to the tgt.state
    while (1) {
      ## choose next state from possible actions at current state
      ## Note: if only one possible action, then choose it;
      ## otherwise, choose one at random
      next.states <- which(R[cs,] > -1)
      if (length(next.states) == 1)
        ns <- next.states
      else
        ns <- sample(next.states, 1)
      ## this is the update
      Q[cs,ns] <- Q[cs,ns] + alpha*(R[cs,ns] + gamma*max(Q[ns, which(R[ns,] > -1)]) - Q[cs,ns])
      ## break out of while loop if target state is reached
      ## otherwise, set next.state as current.state and repeat
      if (ns == tgt.state) break
      cs <- ns
    }
  }
  ## return resulting Q normalized by max value
  return(100*Q/max(Q))
}
where the input parameters are:
R is the rewards matrix as defined in the blog
N is the number of episodes to iterate
alpha is the learning rate
gamma is the discount factor
tgt.state is the target state of the problem.
Using the example in the linked website as a test:
N <- 1000
alpha <- 1
gamma <- 0.8
tgt.state <- 6
R <- matrix(c(-1,-1,-1,-1,0,-1,-1,-1,-1,0,-1,0,-1,-1,-1,0,-1,-1,-1,0,0,-1,0,-1,0,-1,-1,0,-1,0,-1,100,-1,-1,100,100),nrow=6)
print(R)
## [,1] [,2] [,3] [,4] [,5] [,6]
##[1,] -1 -1 -1 -1 0 -1
##[2,] -1 -1 -1 0 -1 100
##[3,] -1 -1 -1 0 -1 -1
##[4,] -1 0 0 -1 0 -1
##[5,] 0 -1 -1 0 -1 100
##[6,] -1 0 -1 -1 0 100
Q <- q.learn(R, N, alpha, gamma, tgt.state)
print(Q)
## [,1] [,2] [,3] [,4] [,5] [,6]
##[1,] 0 0 0.0 0 80 0.00000
##[2,] 0 0 0.0 64 0 100.00000
##[3,] 0 0 0.0 64 0 0.00000
##[4,] 0 80 51.2 0 80 0.00000
##[5,] 64 0 0.0 64 0 100.00000
##[6,] 0 80 0.0 0 80 99.99994
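To turn the learned Q into behavior, one can read off the greedy policy row by row. A small sketch (the greedy.policy helper is mine, not part of the original answer):
# for each state, pick the valid action with the highest Q value
greedy.policy <- function(Q, R) {
  sapply(1:nrow(Q), function(s) {
    valid <- which(R[s, ] > -1)
    valid[which.max(Q[s, valid])]
  })
}
greedy.policy(Q, R)
## with the Q above: state 1 -> 5, state 5 -> 6, ending at the goal state 6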