Insert equation using nlm in R
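I need to estimate parameters by a non-linear fitting procedure. In particular, I'm trying to fit the following equation (written out here as it is coded in my_fun below):
v = theta1 - (theta1 - vin) * exp(-(theta2/10000) * (t - theta3))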
I thought that nlm could be a good solution, using:
#example data
library(lubridate)  # provides dseconds()
df <- data.frame(var = cumsum(sort(rnorm(100, mean=20, sd=4))),
                 time = seq(strptime("2018-1-1 0:0:0","%Y-%m-%d %H:%M:%S"), by = dseconds(200), length.out=100))
#write function
my_fun <- function(v, vin, t, theta){
  fy <- v ~ (theta[1]-(theta[1]- vin)*exp((-theta[2]/10000)*(t-theta[3])))
  ssq <- sum((v-fy)^2)
  return(ssq)
}
#run nlm
th.start <- c(7000, 1000, 10)
my_fit <- nlm(f=my_fun, vin=400, v = df$var,
              t=df$time, p=th.start)
However I got the error: Error in v - fy : non-numeric argument to binary operator. I'm sure that it's a basic thing but I'm struggling to understand the problem.
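A note on where the error comes from: inside my_fun, the line starting with `fy <- v ~ ...` builds a formula object rather than a numeric vector, so `v - fy` has a non-numeric operand. The sketch below is one possible rewrite, not a verified fit. It drops the `v ~`, lists the parameter vector first as the nlm documentation describes, and assumes time is supplied as elapsed seconds in a plain numeric vector (the names t_sec, v_obs and ssq_start are introduced here for illustration). Arithmetic such as multiplication is not defined for date-time objects, so the POSIXct column would likely need a similar conversion.

```r
# Sketch only: a numeric sum-of-squares objective for nlm().
set.seed(1)
t_sec <- seq(0, by = 200, length.out = 100)            # assumed: elapsed seconds
v_obs <- cumsum(sort(rnorm(100, mean = 20, sd = 4)))   # same recipe as df$var

my_fun <- function(theta, v, vin, t) {
  # Plain numeric expression; no `v ~` here, so fy is a numeric vector
  fy <- theta[1] - (theta[1] - vin) * exp((-theta[2] / 10000) * (t - theta[3]))
  sum((v - fy)^2)
}

th.start <- c(7000, 1000, 10)

# Check that the objective is a finite number at the starting values
ssq_start <- my_fun(th.start, v = v_obs, vin = 400, t = t_sec)

# Extra named arguments are passed through nlm() to my_fun
my_fit <- nlm(f = my_fun, p = th.start, v = v_obs, vin = 400, t = t_sec)
my_fit$estimate
```

Whether nlm converges to sensible estimates depends on the data and the starting values, so my_fit$code is worth checking before trusting the result.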
Related
Fixing missing data- how to transform table into ts object that works with KalmanRun?
I'm working with data from SteamCharts on a game, Warframe (https://steamdb.info/app/230410/graphs/). Edit: the data is a .csv downloadable near "Steam charts for every day".
I'm modelling this time series data, but the package I'm using requires no missing values. To resolve this, I'm using arima to predict the missing values (instructions from the link reproduced below):
https://stats.stackexchange.com/questions/104565/how-to-use-auto-arima-to-impute-missing-values
require(forecast)
# sample series
x0 <- x <- log(AirPassengers)
y <- x
# set some missing values
x[c(10,60:71,100,130)] <- NA
# fit model
fit <- auto.arima(x)
# Kalman filter
kr <- KalmanRun(x, fit$model)
# impute missing values Z %*% alpha at each missing observation
id.na <- which(is.na(x))
for (i in id.na)
  y[i] <- fit$model$Z %*% kr$states[i,]
# alternative to the explicit loop above
sapply(id.na, FUN = function(x, Z, alpha) Z %*% alpha[x,], Z = fit$model$Z, alpha = kr$states)
As of now, I've managed to convert the date strings to a DateTime object in my data frame:
df <- read.csv(file="chart.csv", header=TRUE, sep=",")
df = df %>% select(DateTime, Players)
df_temp[['DateTime']] <- as.Date(strptime(df[['DateTime']], format='%Y-%m-%d %H:%M:%S'))
Get an xts object of my data (I believe arima only works with ts though):
df = xts(df$Players, df$DateTime)
df = ts(df)
The arima model fits, but when I try to use KalmanRun, I get the following error:
Error in KalmanRun(x, fit$model) : invalid argument type
I believe there's an issue in how I'm converting it to a time series object, but I don't know how to resolve it. Any help would be greatly appreciated. Thanks!
R: Matrix multiplication error - related to GLM
I've been trying to build some custom code for logistic regression (I cannot use glm for this purpose; happy to explain why). Below is the initial R code to provide the data set I'm working with:
## Load the datasets
data("titanic_train")
data("titanic_test")
## Combining Training and Testing dataset
complete_data <- rbind(titanic_train, titanic_test)
library(dplyr)
titanic_test$Survived <- 2
complete_data <- rbind(titanic_train, titanic_test)
complete_data$Embarked[complete_data$Embarked==""] <- "S"
complete_data$Age[is.na(complete_data$Age)] <- median(complete_data$Age,na.rm=T)
complete_data <- as.data.frame(complete_data)
titanic_data <- select(complete_data,-c(Cabin, PassengerId, Ticket, Name))
titanic_data <- titanic_data[!titanic_data$Survived == "2", ]
titanic_model <- model <- glm(Survived ~.,family=binomial(link='logit'),data=titanic_data)
y <- titanic_data$Survived
x <- as.data.frame(cbind(rep(1, dim(titanic_data) [1]),titanic_data[,-2]))
x <- as.matrix(as.numeric(x))
beta <- as.numeric(rep(0, dim(x)[2]))
beta <- as.matrix(beta)
The issue I'm having is that I would like to compute the matrix product of beta (a p x 1 matrix) and x (an n x p matrix). I have tried the following:
beta * x
x %*% beta
However, these give the following errors:
Error in FUN(left, right) : non-numeric argument to binary operator
Error in x %*% beta : requires numeric/complex matrix/vector arguments
I'd imagine this is due to the fact that I've got non-numeric fields in the data matrix x. As a bit of background, calculating the linear predictor will allow me to progress with my custom code for fitting a logistic regression model. I would appreciate some help. Thank you!
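One common way to get an all-numeric design matrix is model.matrix(), which dummy-codes factor columns and adds the intercept column. The sketch below uses a small made-up data frame (toy, with hypothetical values) in place of titanic_data, so it only illustrates the shape of the computation. Note also that `*` multiplies element by element, so the matrix product needs `%*%`.

```r
# Toy stand-in for titanic_data; all values are made up for illustration.
toy <- data.frame(
  Survived = c(0, 1, 1, 0, 1, 0),
  Sex      = factor(c("male", "female", "female", "male", "female", "male")),
  Age      = c(22, 38, 26, 35, 27, 54),
  Embarked = factor(c("S", "C", "S", "S", "Q", "C"))
)

# model.matrix() returns a numeric matrix: intercept plus dummy-coded factors
x <- model.matrix(Survived ~ ., data = toy)

# One coefficient per column of x, stored as a p x 1 matrix
beta <- matrix(0, nrow = ncol(x), ncol = 1)

# Linear predictor: (n x p) %*% (p x 1) gives an n x 1 matrix
eta <- x %*% beta
dim(eta)
```

With the real data, the same call on titanic_data should produce one numeric column per coefficient that glm estimates, assuming the character columns are treated as factors.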
R : Clustered standard errors in fractional probit model
I need to estimate a fractional (response taking values between 0 and 1) model with R. I also want to cluster the standard errors. I have found several examples on SO and elsewhere, and I built this function based on my findings:
require(sandwich)
require(lmtest)
clrobustse <- function(fit, cl){
  M <- length(unique(cl))
  N <- length(cl)
  K <- fit$rank
  dfc <- (M/(M - 1))*((N - 1)/(N - K))
  uj <- apply(estfun(fit), 2, function(x) tapply(x, cl, sum))
  vcovCL <- dfc*sandwich(fit, meat = crossprod(uj)/N)
  coeftest(fit, vcovCL)
}
I estimate my model like this:
fit <- glm(dep ~ exp1 + exp2 + exp3, data = df, fam = quasibinomial("probit"))
clrobustse(fit, df$cluster)
Everything runs and I get results. However, I suspect that something is not right, as the non-clustered version:
coeftest(fit)
gives the exact same standard errors. I checked what Stata reports, and it displays different clustered errors. I suspect that I have misspecified the function clrobustse, but I just don't know how. Any ideas about what could be going wrong here?
R Harmonic Prediction Failing - newdata structure
I am forecasting a time series using harmonic regression created as such (packages used: tseries, forecast, TSA, plyr):
airp <- AirPassengers
TIME <- 1:length(airp)
SIN <- COS <- matrix(nrow = length(TIME), ncol = 6,0)
for (i in 1:6){
  SIN[,i] <- sin(2*pi*i*TIME/12)
  COS[,i] <- cos(2*pi*i*TIME/12)
}
SIN <- SIN[,-6]
decomp.seasonal <- decompose(airp)$seasonal
seasonalfit <- lm(airp ~ SIN + COS)
The fitting works just fine. The problem occurs when forecasting.
TIME.NEW <- seq(length(TIME)+1, length(TIME)+12, by=1)
SINNEW <- COSNEW <- matrix(nrow=length(TIME.NEW), ncol = 6, 0)
for (i in 1:6) {
  SINNEW[,i] <- sin(2*pi*i*TIME.NEW/12)
  COSNEW[,i] <- cos(2*pi*i*TIME.NEW/12)
}
SINNEW <- SINNEW[,-6]
prediction.harmonic.dataframe <- data.frame(TIME = TIME.NEW, SIN = SINNEW, COS = COSNEW)
seasonal.predictions <- predict(seasonalfit, newdata = prediction.harmonic.dataframe)
This causes the warning:
Warning message: 'newdata' had 12 rows but variables found have 144 rows
I went through and found that the names were SIN.1, SIN.2, et cetera, instead of SIN1 and SIN2, so I manually changed those and it still didn't work. I also manually removed SIN.6 because, for some reason, it was still there. Help?
Edit: I have gone through the similar posts as well, and the answers in those questions did not fix my problem.
Trying to predict with a data.frame after fitting an lm model with variables not inside a data.frame (especially matrices) is not fun. It's better if you always fit your model from data in a data.frame. For example, if you did
seasonalfit <- lm(airp ~ ., data.frame(airp=airp,SIN=SIN,COS=COS))
then your predict would work. Alternatively, you can try to cram matrices into data.frames, but this is generally a bad idea. You would do
prediction.harmonic.dataframe <- data.frame(TIME = TIME.NEW, SIN = I(SINNEW), COS = I(COSNEW))
The I() (or AsIs) function will keep them as matrices.
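To make the first suggestion concrete, here is a minimal end-to-end sketch in base R. The helper name harmonics is hypothetical and introduced only for this example. It builds the five sine and six cosine columns for any vector of time indices, so the training and prediction data frames get the same column names (SIN.1 and so on).

```r
airp <- as.numeric(AirPassengers)
n <- length(airp)

# Hypothetical helper: harmonic regressors for the given time indices.
# The sixth sine term is left out because sin(pi * t) is zero at integer t.
harmonics <- function(time) {
  SIN <- sapply(1:5, function(i) sin(2 * pi * i * time / 12))
  COS <- sapply(1:6, function(i) cos(2 * pi * i * time / 12))
  data.frame(SIN = SIN, COS = COS)
}

# Fit from a data.frame so predict() can match columns by name
train <- data.frame(airp = airp, harmonics(1:n))
seasonalfit <- lm(airp ~ ., data = train)

# Build the 12 future rows with the same helper, then predict
newd <- harmonics((n + 1):(n + 12))
seasonal.predictions <- predict(seasonalfit, newdata = newd)
length(seasonal.predictions)
```

Because both data frames come from the same helper, there is no need to rename columns by hand before calling predict.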
R nls2 "invalid model formula" fitting gamma
Working in R 3.1.3 and Rstudio. I want to fit gamma distributions that include a location parameter to data in order to 'shift' the x values to a new origin. I am trying to use nls2 with the following code: library(nls2) theVals <- data.frame(c(26.76,24.3,34.63,38.05,25.56,21.98,20.62,34,26.75,27.79,28.4,33.31,29.26,18.65,22.77,25.72,25.86,25.32,24.08,27.68,26.2,26.16,25.34,26.91,22.6,23.94,23.3,22.34,41.25,24.83,21.66,30.47,26.53,27.74,29.41,25.65,36.05,18.29,27.2,22.99,25.8,21.9,25.27,30.29,22.72,26.49,18.75,33.57,20.87,21.82,20.73,28.59,19.64,33.21,28.94,27.98,22.2,25.95,30.64,26.56,32.11,26.05,20.66,28.64,22.4,22.4,31.91,21.82,26.82,20.77,24.12,28.83,23.07,26.5,21.14,27.29,19.61,25.28,28.6,27.16,22.46,18.19,22.35,23.79,26.32,26.5,27.39,23.29,25.79,26.35,26.38,24.98,20,37.15,25.61,21.39,21.63,24.12,24.4,27.72,42.74,25.33,17.79,21.33,38.65,25.22,28.39,21.61,23.38,25.25,24.88,23.34,26.26,21.96,22.18,24.78,21.15,24.65,21.23,31.9,28.66,27.66,18.08,22.99,22.46,21.69,28.21,29.8,25.72,27.09,20.02,21.26,21.34,27.18,25.48,20.51,20.96,20.07,20.89,27.56,24.43,21.35,24.3,28.1,26.53,29.03,30.08,19.19,21.27,26.18,23.79,36.52,24.81,26.36,24.44,20.99,19.84,23.32,18.21,26.6,21.48,23.21,29.93,23.4,30.9,23.58,21.58,18.38,25.13,23.03,22.73,24.42,22.89,43.44,23.47,27.09,29.96,23.94,28.51,25.74,28.54,30.41,22.7,29.19,25.66,23.89,21.9,36.26,22.61,19.68,27.85,28.83,28.6,22.68,19.07,20.22,24.35,19.09,37.66,22.55,24.25,22.61,26.09,24.42,26.11,32.15,25.78,21.94,23.93,30.19,23.53,26.49,30.48,25.02,28.14,23.43,20.22,17.57,21.68,36.07,24.92,32.48,32.04,25.86,26.69,22.41,26.4,22.72,28.32,22.82,32.73,28.08,29.16,36.18,21.61,23.9,28.8,23.24,24.89,22.17,27.7,34.75,26.74,29.62,17.46,20.06,22.23,22.09,24.05,22.37,24.98,33.26,30.95,26.24,22.16,30.97,27.22,23.81,42.16,28.2,28.37,26.1,26.28,27.44,20.52,35.02,21.43,23.14,18.37,28.86,25.18,28.15,19.97,24.2,25.91,28.92,23.95,19.48,28.57,21.77,23.46,37.51,22.13,37.18,21.83,23.8,18.93,27.43,26.51,25.64,22.15,22.27,29.21,24.45,18.81,22.62,25.16,24.62,30.53,
28.77,27.11,22.07,28.95,26.54,39.23,31.9,33,29.93,24.37,26.4,21.33,25.37,25.9,21.25,19.06,25.69,26.44,26.09,23.24,27.04,20.09,28.73,37.06,32.45,22.93,22.7,24.82,31.23,23.25,22.94,20.47,25.7,23.92,34.71,26.5,20.28,21.78,26.54,30.34,21.97,27.38,27.64,34.08,22.05,27.21,20.11,25.79,33.22,31.24,29.93,21.81,30.68,32.46,30.45,22.62,28.83,33.95,27.12,45.51,25.23,29.61,29.09))
colnames(theVals) <- c("theGamma")
fo <- theGamma ~ dgamma(theX-location, shape=theShape, scale=theScale )
startList <- list(location=5, theShape=3, theScale=3)
theGamma=NULL
theX <- 0:50
mo1 <- nls2(fo, start=startList, data=theVals)
I get an error "invalid model formula in ExtractVars". Curiously, dgamma works fine:
location<- 5
theShape <- 3
theScale <- 3
dgamma(theX-location, shape=theShape, scale=theScale )
I have searched Stack Overflow and other sites, but can't find an answer to this one. Any ideas?
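Leaving the nls2 error aside, a distribution with a location shift is often fitted to raw observations by maximum likelihood instead of least squares. The sketch below shows that swapped-in approach using base R's optim. It is not the nls2 method from the question and does not explain the ExtractVars message. The helper negloglik is hypothetical, only the first 20 values from the data above are used to keep the example short, and the starting values are the ones in startList.

```r
# First 20 observations from the question's data, for brevity
x <- c(26.76, 24.3, 34.63, 38.05, 25.56, 21.98, 20.62, 34, 26.75, 27.79,
       28.4, 33.31, 29.26, 18.65, 22.77, 25.72, 25.86, 25.32, 24.08, 27.68)

# Hypothetical helper: negative log-likelihood of a gamma shifted by `location`
negloglik <- function(par, x) {
  location <- par[1]
  shape    <- par[2]
  scale    <- par[3]
  # Outside the valid region, return Inf so the optimiser moves away
  if (shape <= 0 || scale <= 0 || any(x - location <= 0)) return(Inf)
  -sum(dgamma(x - location, shape = shape, scale = scale, log = TRUE))
}

# Nelder-Mead (optim's default) tolerates Inf away from the starting point
fit <- optim(c(location = 5, shape = 3, scale = 3), negloglik, x = x)
fit$par
```

Three-parameter gamma likelihoods can be badly behaved, so the estimates are worth checking against a histogram of the data and against other starting values.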