I get the error message "argument is missing, with no default" when using the gradient function. It seems that the arguments are not passed on to the other functions: pi(1,3) works, but gradient(pi, 1,3) results in the error message "Error in s(p1, p2) : argument "p2" is missing, with no default". Can anyone help explain why this happens and how to fix it? Thanks. See the code below.
rm(list = ls())
n = 2000
# 1 for T, 2 for S, 0 for T, 1 for S
v1 = 8
v2 = 10
mc1 = 1
mc2 = 2
tc = 2 # travel cost
s = function(p1, p2) { # share for two markets
u1 = function(x) v1- p1 - tc * x
u2 = function(x) v2 - p2 - tc * (1 - x)
udiff = function(x) u1(x) - u2(x) # x prefer 1, (1-x) prefer 2
# previous if ensures a root in uniroot function
xbar = ifelse(u1(0) < u2(0), 0, ifelse(u1(1) > u2(1), 1,
uniroot(udiff, interval = c(0, 1))$root))
# in case utility negative
x1 = ifelse(u1(0) < 0, 0, ifelse(u1(1) >= 0, 1, uniroot(u1, interval = c(0, 1))$root))
x2 = ifelse(u2(1) < 0, 0, ifelse(u2(0) >= 0, 1, 1-uniroot(u2, interval = c(0, 1))$root))
s = c(min(xbar, x1), min(1 - xbar, x2))
}
pi = function(p1, p2) {
pi1 = (p1 - mc1) * s(p1, p2)[1]
pi2 = (p2 - mc2) * s(p1, p2)[2]
return(c(pi1, pi2))
}
g = function(p1, p2) diag(gradient(pi, p1,p2))
gradient(pi, 1,3)
If you type ?pracma::gradient, you will see that
Usage: gradient(F, h1 = 1, h2 = 1)
and
F: vector of function values, or a matrix of values of a function of two variables.
In your case, pi is the function itself, not the numeric values that gradient expects as input.
I don't know exactly what you want your gradient or g function to compute. One way to make your code run might be something like the following:
g = function(p1, p2) diag(gradient(pi(p1,p2), p1,p2))
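If the goal is the derivative of each firm's profit with respect to its own price, one option is to differentiate the function numerically instead of passing it to gradient. The following is a minimal base-R sketch under that assumption, not the original poster's method: num_jacobian is a hypothetical helper that uses central differences, and toy_pi is a stand-in for pi that takes both prices as a single vector.

```r
# Hypothetical helper: central-difference Jacobian of a vector-valued f at x.
num_jacobian <- function(f, x, h = 1e-6) {
  fx <- f(x)
  J <- matrix(NA_real_, nrow = length(fx), ncol = length(x))
  for (j in seq_along(x)) {
    step <- replace(numeric(length(x)), j, h)   # h in position j, 0 elsewhere
    J[, j] <- (f(x + step) - f(x - step)) / (2 * h)
  }
  J
}

# Stand-in for pi(): takes the prices as one vector c(p1, p2).
toy_pi <- function(p) c(p[1]^2 * p[2], p[1] + p[2]^2)

J <- num_jacobian(toy_pi, c(1, 3))
own_price_derivs <- diag(J)   # d pi1 / d p1 and d pi2 / d p2
```

With the pi from the question, a wrapper such as function(p) pi(p[1], p[2]) would take the place of toy_pi.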
developers!
I have encountered an error message
Error in if (obs <= ei) 2 * pv else 2 * (1 - pv) : missing value where
TRUE/FALSE needed
that stops me from getting a value from the Moran.I function in the ape package. Here is what I did:
library(ape)
nrstp <- data.frame(
X = c(300226.9, 300224.6, 300226.4, 300226.1, 300224.0, 300226.4, 300225.7, 300226.4, 300226.1, 300226.4, 300226.3, 300226.3, 300227.1),
Y = c(5057949, 5057952, 5057950, 5057950, 5057956, 5057950, 5057950, 5057950, 5057950, 5057950, 5057950, 5057950, 5057949),
V3 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0))
nrstp = data.frame(nrstp)
dist = as.matrix(dist(cbind(nrstp$X, nrstp$Y)))
invdist = 1/dist
invdist[is.infinite(invdist)] <- 0
moranI = Moran.I(nrstp$V3, invdist)
The intention of this code is to calculate Moran's I from a series of points to check for spatial autocorrelation. So far, this seems to be the only working function for Moran's I in R. After some testing (I have thousands of groups of points), this error seems to happen only when the input vector contains a single repeated value (I tried numbers other than 0, and it still raises this error).
Could someone help me improve this code? Or are there better suggestions for calculating Moran's I or testing spatial autocorrelation from linestrings (each point group consists of the origin point of one linestring and the closest points from other linestrings within a 10-meter buffer of that origin point)?
Thank you in advance for any help!
The control flow construct if (condition) ... requires that the value of condition is not NA.
In your case, obs <= ei results in NA. That's why the error message "missing value where TRUE/FALSE needed" is generated.
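A small illustration of that rule, using only base R. The value -0.5 is an arbitrary stand-in for ei, and the isTRUE() guard at the end is just one possible way to handle an NA condition:

```r
obs <- 0 / 0          # NaN, as produced by a 0/0 division
cmp <- obs <= -0.5    # any comparison involving NaN gives NA
is.na(cmp)

# if() refuses an NA condition; tryCatch() captures the error message.
msg <- tryCatch(if (cmp) "low" else "high",
                error = function(e) conditionMessage(e))

# isTRUE() turns a possibly-NA test into a definite TRUE/FALSE.
safe <- if (isTRUE(cmp)) "low" else "not low"
```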
To understand how obs <= ei results in NA, you can check the details inside Moran.I function:
Moran.I
function (x, weight, scaled = FALSE, na.rm = FALSE, alternative = "two.sided")
{
if (dim(weight)[1] != dim(weight)[2])
stop("'weight' must be a square matrix")
n <- length(x)
if (dim(weight)[1] != n)
stop("'weight' must have as many rows as observations in 'x'")
ei <- -1/(n - 1)
nas <- is.na(x)
if (any(nas)) {
if (na.rm) {
x <- x[!nas]
n <- length(x)
weight <- weight[!nas, !nas]
}
else {
warning("'x' has missing values: maybe you wanted to set na.rm = TRUE?")
return(list(observed = NA, expected = ei, sd = NA,
p.value = NA))
}
}
ROWSUM <- rowSums(weight)
ROWSUM[ROWSUM == 0] <- 1
weight <- weight/ROWSUM
s <- sum(weight)
m <- mean(x)
y <- x - m
cv <- sum(weight * y %o% y)
v <- sum(y^2)
obs <- (n/s) * (cv/v)
if (scaled) {
i.max <- (n/s) * (sd(rowSums(weight) * y)/sqrt(v/(n -
1)))
obs <- obs/i.max
}
S1 <- 0.5 * sum((weight + t(weight))^2)
S2 <- sum((apply(weight, 1, sum) + apply(weight, 2, sum))^2)
s.sq <- s^2
k <- (sum(y^4)/n)/(v/n)^2
sdi <- sqrt((n * ((n^2 - 3 * n + 3) * S1 - n * S2 + 3 * s.sq) -
k * (n * (n - 1) * S1 - 2 * n * S2 + 6 * s.sq))/((n -
1) * (n - 2) * (n - 3) * s.sq) - 1/((n - 1)^2))
alternative <- match.arg(alternative, c("two.sided",
"less", "greater"))
pv <- pnorm(obs, mean = ei, sd = sdi)
if (alternative == "two.sided")
pv <- if (obs <= ei)
2 * pv
else 2 * (1 - pv)
if (alternative == "greater")
pv <- 1 - pv
list(observed = obs, expected = ei, sd = sdi, p.value = pv)
}
<bytecode: 0x000001cd5e0715d0>
<environment: namespace:ape>
By assigning x = nrstp$V3 and weight = invdist, you will get mean(x) = 0. This results in y=0, cv = 0, v=0, and finally obs = NaN. Consequently,
obs <= ei
[1] NA
To overcome the problem, you need to ensure that neither obs nor ei is NA. Here obs is NaN because v = sum(y^2) is zero, which happens whenever all values of x are identical (not only when they are all zero). However, because I know nothing about this particular topic, I'm not sure how such constant groups should be handled.
The problem is that all your x values are the same. If you look at the code posted by Abdur Rohman, the statistic is calculated as
m <- mean(x)
y <- x - m
cv <- sum(weight * y %o% y)
v <- sum(y^2)
obs <- (n/s) * (cv/v)
If all x are the same, then m <- mean(x) is that same value, so y, cv and v are all 0.
For obs you then divide cv/v, which is 0/0 = NaN.
So at least one value of x has to differ from the others.
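The same arithmetic can be reproduced without ape. The sketch below copies only the lines of Moran.I that compute obs (as printed above) into a hypothetical helper, moran_obs, and adds a hypothetical wrapper, moran_or_na, that skips constant groups. Whether returning NA is the right treatment for such groups is an assumption, and the four points are made-up data.

```r
# Hypothetical helper: the observed-statistic part of ape::Moran.I,
# copied from the function body shown above.
moran_obs <- function(x, weight) {
  n <- length(x)
  ROWSUM <- rowSums(weight)
  ROWSUM[ROWSUM == 0] <- 1
  weight <- weight / ROWSUM
  s <- sum(weight)
  y <- x - mean(x)
  cv <- sum(weight * y %o% y)
  v <- sum(y^2)
  (n / s) * (cv / v)
}

# Small inverse-distance weight matrix for four made-up points on a line.
d <- as.matrix(dist(c(0, 1, 3, 6)))
w <- 1 / d
diag(w) <- 0

obs_const <- moran_obs(rep(5, 4), w)       # constant input: 0/0, so NaN
obs_vary  <- moran_obs(c(1, 2, 4, 8), w)   # varying input: a finite number

# Hypothetical guard: return NA instead of computing the statistic
# when all values are identical.
moran_or_na <- function(x, weight) {
  if (length(unique(x)) < 2) return(NA_real_)
  moran_obs(x, weight)
}
```

Applied to thousands of groups, a guard like this would let the loop continue and leave an NA for every group with no variation.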
I have a Bayesian MCMC in R, and I have the code below:
RWM = function(Niter,Y,X){
p = ncol(X)
alpha = 0.7
beta = matrix(0,ncol=1,nrow=3)
beta = as.matrix(beta)
sig_p = 0
mu_p = beta
C = diag(p)
R = t(chol(C))
lpi = posterior(beta,Y,X)
OUT = matrix(NA, ncol=3, nrow=Niter)
for (j in 1:Niter){
rr = rnorm(p)
beta_p = beta + exp(sig_p) * as.vector(R%*%rr)
lpi_p = posterior(beta_p,Y,X)
A = exp(lpi_p-lpi)
Acc = min(1,A)
if (runif(1)<=Acc){
beta = beta_p
lpi = lpi_p
}
OUT[j,] = beta
sig_p = sig_p + (1/j^alpha)*(Acc -0.3)
mu_p = mu_p + (1/j)*(as.matrix(beta) - mu_p)
bmu = as.matrix(beta - mu_p)
C = C + (1/j)*(as.matrix(t(bmu)%*%bmu) - C)
}
return(OUT)
}
It looks like the vector beta should update, and its three elements should differ because of the rnorm draws. However, this is not the case. The 3 columns of the output, one for each element, are exactly the same within each row. I have stepped through this function in the console several times, and in no case did the elements of beta appear to be the same.
For example: beta = [1, 2, 3] but the output = [1, 1, 1]
The MCMC iterates and does not get stuck, as the histogram shows a wide range of values in the output. It is just the sampled betas that are giving me the issue.
I'm just not understanding what is wrong with my code that prevents my vector beta from being added directly to the matrix OUT.
Given an lm object, I need to create a function based on its variables, which are represented as a character vector. I have tried to use a combination of eval and expr to create a function f that is then used in obj, which in turn is minimised with nlm.
library(tidyverse)
df <- drop_na(airquality)
model <- lm(Ozone~. - Temp, data = df, x=TRUE, y=TRUE)
base_vars <- all.vars(formula(model)[-2])
k <- length(base_vars)
f <- function(base_df, x, y, parms) {
with(base_df, parms[1] +
eval(expr(paste(paste(paste0('parms[', 2:(k+1), ']'), base_vars, sep = '*'), collapse = '+'))) +
log(parms[k+2] * (x - parms[k+3] ^ 2)))
}
obj <- function(parms, y, x) mean((residuals(model) - f(df, x, y, parms))^2)
fit <- with(data, nlm(obj, c(0, 0, 0, 0, 0, 0, 0), y = e, x = x))
But calling f(model$x, df$Temp, model$y, c(0, 0, 0, 0, 0, 0, 0)) results in the following error:
Error in eval(substitute(expr), data, enclos = parent.frame()) :
numeric 'envir' arg not of length one
4.
eval(substitute(expr), data, enclos = parent.frame())
3.
with.default(base_df, parms[1] + eval(expr(paste(paste(paste0("parms[",
2:(k + 1), "]"), base_vars, sep = "*"), collapse = "+"))) +
log(parms[k + 2] * (x - parms[k + 3]^2)))
2.
with(base_df, parms[1] + eval(expr(paste(paste(paste0("parms[",
2:(k + 1), "]"), base_vars, sep = "*"), collapse = "+"))) +
log(parms[k + 2] * (x - parms[k + 3]^2)))
1.
f(model$x, df$Temp, model$y, c(0, 0, 0, 0, 0, 0, 0))
I believe there might be a conflict between the eval environment and the environment implied by the with function, but I can't figure out why. Any ideas on how I can create a custom function f for varying models?
The expected result of f(model$x, df$Temp, model$y, c(0, 0, 0, 0, 0, 0, 0)) would be equivalent to:
with(base_df, parms[1]+parms[2]*Solar.R+parms[3]*Wind+parms[4]*Temp+parms[5]*Month+
parms[6]*Day+log(parms[7] * (Temp - parms[8] ^ 2)))
but for a different model it could be something like:
with(base_df,
parms[1]+parms[2]*var1+parms[3]*var2+log(parms[4]*(var3-parms[5]^2)))
so the number of variables and parameters is different with every call.
R supports computing on the language, but it should not be your first option. If you do it, it should never involve text processing of code. You don't have a case here where you need to compute on the language. I have no idea how you thought your attempt would work but I don't know the expr function and I refuse to install package tidyverse and its ginormous dependency tree.
Also, you generally should avoid with outside of interactive use. But with is not the problem here.
Here is how I would do this:
df <- airquality[complete.cases(airquality),]
model <- lm(Ozone~. - Temp, data = df)
f <- function(base_df, x, parms) {
m <- model.matrix(model, data = base_df)
k <- ncol(m)
stopifnot(length(parms) == (k + 2L))
#I use exp(parms[k+1]) to ensure a positive value within the log
m %*% parms[seq_len(k)] + log(exp(parms[k + 1L]) * (x - parms[k + 2L] ^ 2))
}
obj <- function(parms, y, x, base_df) mean((residuals(model) - f(base_df, x, parms))^2)
#some x:
x <- rpois(nrow(df), 10)
fit <- nlm(obj, c(0, 0, 0, 0, 0, 0, 0), x = x, base_df = df)
#works
You don't seem to use y and thus I removed it from the code.
Note how I create the design matrix for the linear part (using model.matrix) and use matrix multiplication with the parameters. You also need to ensure that log doesn't return Inf/-Inf/NaN.
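To see why the design-matrix approach covers the linear part, the following base-R check (a sketch using the same airquality model) multiplies the model.matrix result by the fitted coefficients and compares it with fitted():

```r
df <- airquality[complete.cases(airquality), ]
model <- lm(Ozone ~ . - Temp, data = df)

# Intercept column plus one column per remaining predictor.
m <- model.matrix(model, data = df)
linear_part <- drop(m %*% coef(model))

# The matrix product reproduces the fitted values of the linear model.
same <- all.equal(unname(linear_part), unname(fitted(model)))
```

Replacing coef(model) with a free parameter vector of the same length gives the m %*% parms[seq_len(k)] term used in f.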
I think @Roland gave a good answer covering your actual problem. I am isolating what I think you were specifically asking based on the question title, with no comment on whether it is a good idea or not. It probably isn't in this use case.
But what you were looking for more than likely is eval_tidy() from rlang. I left the :: function notation in just so it's obvious what package is being used here.
Note that I fixed a couple of things that seemed to be errors in the code. I am also using all ones instead of zeros in parms for testing, because of the log.
library(rlang)
library(tidyr)
# dropped y since it was an unused argument
f <- function(base_df, x, parms) {
# set an expression to evaluate using parse_expr()
.f <- rlang::parse_expr(paste(paste(paste0('parms[', 2:(k+1), ']'),
base_vars, sep = '*'), collapse = '+'))
# use eval_tidy() with the data mask
y_part1 <- rlang::eval_tidy(.f, data = base_df)
y_part2 <- log(parms[k + 2] * (x - parms[k + 3] ^ 2))
parms[1] + y_part1 + y_part2
}
# using your code
df <- tidyr::drop_na(airquality)
model <- lm(Ozone~. - Temp, data = df, x=TRUE, y=TRUE)
base_vars <- all.vars(formula(model)[-2])
k <- length(base_vars)
# changed to all ones, I think this is what you wanted for length
parms <- rep(1, k + 3)
method_1 <- f(df, df$Temp, parms)
method_2 <- with(df, parms[1]+parms[2]*Solar.R+parms[3]*Wind+parms[4]*Temp+parms[5]*Month+
parms[6]*Day+log(parms[7] * (Temp - parms[8] ^ 2)))
all.equal(method_1, method_2)
# [1] TRUE
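For comparison, the same kind of expression can be evaluated in base R with str2lang() and eval(), passing the data frame as envir. This is only a sketch with a hypothetical helper, eval_linear_terms, and a two-variable example; it carries the same caveats about building code from text.

```r
# Hypothetical helper: builds "parms[2] * v1 + parms[3] * v2 + ..." as text,
# parses it, and evaluates it with the data frame's columns in scope.
eval_linear_terms <- function(base_df, parms, vars) {
  txt <- paste(paste0("parms[", seq_along(vars) + 1, "]"), vars,
               sep = " * ", collapse = " + ")
  # enclos = environment() lets eval() find `parms` in this function's frame.
  eval(str2lang(txt), envir = base_df, enclos = environment())
}

df <- airquality[complete.cases(airquality), ]
parms <- c(1, 2, 3)
res <- eval_linear_terms(df, parms, c("Solar.R", "Wind"))
```

Here res equals 2 * Solar.R + 3 * Wind computed row by row; the intercept parms[1] and the log term would be added outside the helper, as in f above.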
With the goal of turning the following into a function, I was wondering how I can write the following double integral in R code (with $\bar{x} = \mu$):
Assuming pi0 and pi1 implement your functions $\pi_0$ and $\pi_1$ in a vectorized way, a possible solution is:
integral <- function(n, mu, s, pi0, pi1) {
C <- (2 * pi)^(-n/2)
C * integrate(f = function(sigmavec) sapply(sigmavec, function(sigma) {
integrate(f = function(delta) {
exp(-n/2 * ((mu / sigma - delta)^2 + (s / sigma)^2)) * pi1(delta)
}, lower = -Inf, upper = Inf)$value
}) * pi0(sigmavec) / (sigmavec^n), lower = 0, upper = Inf)$value
}
# Tests
integral(n = 1, mu = 0, s = 1, pi0 = dnorm, pi1 = dnorm)
# [1] 0.0473819
integral(n = 1, mu = 0, s = 1, pi0 = function(sigma) 1/sigma, pi1 = dcauchy)
# [1] 0.2615783
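The formula itself did not survive in the question. Reading the integrand off the code above, and treating this as a reconstruction rather than the asker's original notation, the quantity that integral() computes is:

$$
I = (2\pi)^{-n/2} \int_0^{\infty} \frac{\pi_0(\sigma)}{\sigma^{n}} \left[ \int_{-\infty}^{\infty} \exp\!\left\{ -\frac{n}{2} \left[ \left( \frac{\mu}{\sigma} - \delta \right)^2 + \left( \frac{s}{\sigma} \right)^2 \right] \right\} \pi_1(\delta) \, d\delta \right] d\sigma
$$

The inner integral over $\delta$ corresponds to the inner integrate() call, and the sapply() over sigmavec is needed because integrate() passes a whole vector of $\sigma$ values to the outer integrand at once.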
Not sure if this question is on topic, but I am happy to answer.
Maybe you should ask a more general question: how do you compute an integral with a computer program (code)? There are at least two ways:
1. Use numerical integration, such as the Monte Carlo method.
2. Use a symbolic toolbox to solve the problem analytically and plug in the numerical values.
Examples for $\int_0^1 x^2\,dx$:
f<-function(x){
x^2
}
curve(f,0,1)
# method 1
integrate(f,lower=0,upper = 1)
# method 2
library(Ryacas)
x <- Sym("x")
f <- function(x) {
x^2
}
f2=yacas(yacas(Integrate(f(x), x)))
f2
x <- 1
Eval(f2)
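The Monte Carlo method mentioned above is not shown in the code, so here is a minimal base-R sketch of it for the same integral. The seed and the number of draws are arbitrary choices made for illustration.

```r
set.seed(1)                          # arbitrary seed, for reproducibility only
f <- function(x) x^2
n_draws <- 1e5
u <- runif(n_draws)                  # uniform draws on [0, 1]
mc_estimate <- mean(f(u))            # Monte Carlo estimate of the integral
mc_se <- sd(f(u)) / sqrt(n_draws)    # its estimated standard error

# Deterministic quadrature for comparison; the exact value is 1/3.
quad <- integrate(f, lower = 0, upper = 1)$value
```

The Monte Carlo estimate only gets close to 1/3 up to its standard error, while integrate() is accurate to many digits for a smooth one-dimensional integrand like this one.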
For each of k variables in a data frame data, I want to take the maximum of 0 and the variable's value, and then sum these maxima across the variables. In Stata, I did it as follows:
gen sum=max(0,x)+max(0,y)+max(0,z)+...+max(0,k)
In R I used following approach:
data$sum<-ifelse(data$x<0,0,data$x*1)+ifelse(data$y<0,0,data$y*1)+ifelse(data$z<0,0,data$z*1)+...+ifelse(data$k<0,0,data$k*1)
I was wondering whether there is an alternative, more efficient approach in R to do the same thing.
Try this:
mycols <- c("x", "y", "z", "k")
data$sum <- rowSums(data[mycols] * (data[mycols] > 0))
Check with some sample data:
data <- data.frame(x = runif(10) - 0.5,
y = runif(10) - 0.5,
z = runif(10) - 0.5,
k = runif(10) - 0.5)
identical(rowSums(data[mycols] * (data[mycols] > 0)), # mine
ifelse(data$x < 0, 0, data$x * 1) + # yours
ifelse(data$y < 0, 0, data$y * 1) +
ifelse(data$z < 0, 0, data$z * 1) +
ifelse(data$k < 0, 0, data$k * 1))
# [1] TRUE
Alternatives to flodel's excellent solution, noting the first looks quite a bit like your Stata code.
with( data, # terrible name for an R object, BTW
pmax(x, 0) + pmax(y, 0) + pmax(z, 0) +pmax(k,0) )
rowSums( apply(data[-5], 2, pmax, 0) )
The second one is probably slower, but it is in the running for this R-golf competition. Also a matrix math solution:
as.matrix( (data[,1:4] > 0 )* data[, 1:4]) %*% rep(1, 4 )
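One more variant in the spirit of the pmax version, sketched here with made-up sample data, avoids writing out each column by applying pmax to a vector of column names and adding the results with Reduce:

```r
set.seed(42)   # arbitrary seed so the sample data are reproducible
data <- data.frame(x = runif(10) - 0.5,
                   y = runif(10) - 0.5,
                   z = runif(10) - 0.5,
                   k = runif(10) - 0.5)
mycols <- c("x", "y", "z", "k")

# pmax(col, 0) for every selected column, then add the columns elementwise.
total <- Reduce(`+`, lapply(data[mycols], pmax, 0))

# Same result as the rowSums approach (apart from the row names it carries).
check <- all.equal(total, unname(rowSums(data[mycols] * (data[mycols] > 0))))
```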
Not the question, but writing out every variable in Stata is likely to be tedious and error-prone. There is likely to be scope for a loop here:
gen sum = 0
quietly foreach v of var varlist {
replace sum = sum + `v' if inrange(`v', 0, .)
}
where you must work out what the varlist should be.