R - Error: argument 1 is not a vector when bootstrapping

I'm attempting to bootstrap my data to get 2000 measurements based on the linear regression and Theil regression (mblm function w/ repeated=FALSE).
My bootstrap R code works perfectly for the normal regression (from what I can tell), given below:
> fitfunc <- function(formula, data, index) {
+ d<- data[index,]
+ f<- lm(formula,data=d)
+ return(coef(f))
+ }
> boot(dataframe, fitfunc, R=2000, formula=`Index A`~`Measurement B`)
But I get an error when attempting the Theil estimator bootstrap:
> fitfuncTheil <- function(formula,data,index) {
+ d<- data[index,]
+ f<- mblm(formula, data=d, repeated=FALSE)
+ return(coef(f))
+ }
> boot(dataframe, fitfuncTheil, R=2000, formula=`Index A`~`Measurement B`)
Error in order(x) : argument 1 is not a vector
In addition: Warning message:
In is.na(x) :
The error message seems basic but I cannot figure out why this would work in one case but not the other.

Once I removed the space from the column names (referenced in the formula field), the issue was resolved.
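A minimal sketch of that fix, assuming a stand-in data frame with the same two column names (the values are made up for illustration). It only shows the renaming step; the final call is left commented out because it needs the boot and mblm packages and the `fitfuncTheil` function from the question.

```r
# Hypothetical stand-in for `dataframe`, keeping the names with spaces
set.seed(1)
dataframe <- data.frame(`Index A` = rnorm(20),
                        `Measurement B` = rnorm(20),
                        check.names = FALSE)

# Replace the spaces so the column names are syntactic
names(dataframe) <- gsub(" ", "_", names(dataframe), fixed = TRUE)
print(names(dataframe))

# The formula can then use plain names without backticks:
# library(boot); library(mblm)
# boot(dataframe, fitfuncTheil, R = 2000, formula = Index_A ~ Measurement_B)
```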

Related

Translating Stata xtmelogit to R glmer (lme4) package

I am trying to reproduce published Stata code exactly in R. As a first step I exported the dataset from Stata and imported it into R, but my code still produces errors.
What am I doing wrong?
Original Stata Code:
xtmelogit redpref1 c.incomedif c.incomedif#c.forpop forpop i.year if (brncntr==1) || country:
My Approach in R was:
dataset <- dataset%>%
filter(brncntr==1) %>%
mutate(c.incomedif = factor(incomedif))%>%
mutate(c.forpop = factor(forpop)) %>%
mutate(i.year = as.integer(year))
library(lme4)
logit <- glmer(redpref1~ c.incomedif+ i.year|country,family=binomial,rueda4, nAGQ=0L)
summary(logit)
This approach generates an error as follows:
Error in if (ctrl$npt > (2 * n + 1)) warning("Setting npt > 2 * length(par) + 1 is not recommended.") :
missing value where TRUE/FALSE needed
In addition: Warning message:
In (n + 1L) * (n + 2L) : NAs produced by integer overflow
>
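One possible reading of the Stata command, offered as a sketch rather than a verified replication: in Stata factor-variable notation the `c.` prefix marks a continuous variable and `i.` a categorical one, and `|| country:` requests a random intercept for country. Under that reading the R formula could look like the one below. The `glmer` call is commented out because it assumes lme4 is installed and that `dataset` holds the imported data.

```r
# Continuous incomedif and forpop, their interaction, year as a factor,
# and a random intercept per country
model_formula <- redpref1 ~ incomedif + incomedif:forpop + forpop +
  factor(year) + (1 | country)

# Variables the formula expects to find in the data
print(all.vars(model_formula))

# Assuming lme4 and the imported data are available:
# library(lme4)
# fit <- glmer(model_formula,
#              data = subset(dataset, brncntr == 1),
#              family = binomial)
# summary(fit)
```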

How to implement your own nonlinear function in nlmer in R?

I am trying to implement a new nonlinear function for use with the nlmer function in the lme4 package, but I'm not sure what the problem is. This is the first time I've used nlmer, and I'm following all the instructions I've found on the internet. The first error is about my dataframe.
data <- read.csv(paste("C:/Users/oguz/Desktop/Runs4SiteModels/db/", "DB4NLSiteModel", Periods[i],".txt", sep=""), sep = "", header = TRUE)
psa_rock <- data$PSAr
nparams <- c("c")
nonl_fn <- deriv(~ log(( psa_rock + c)/c),
namevec = c("c"),
function.arg=c("c", psa_rock))
fm <- nlmer(log(data$PSAm) ~ nonl_fn(c, psa_rock) ~ 1 + data$M1 + data$M3 + data$M85 + data$Nflag + data$Rflag + data$FDepth +
data$Dist1 + data$Dist3 + data$VN + (exp(-1*exp(2*log(data$Vs)- 11)) * log((data$PSAr + c) / c) ) +
(1|data$EQID) + (1|data$STID), data=data, start=c(c=0.1))
When I run this code, I'm getting the following error:
Error in model.frame.default(data = data, drop.unused.levels = TRUE, formula = log(data$PSAm) ~ :
invalid type (list) for variable 'data'
I wasn't getting this error with the lmer function (without the nonlinear function, of course). That's why I think my problem is not with my dataframe.
The other issue I keep thinking about is this part of the fixed effects:
(exp(-1*exp(2*log(data$Vs)- 11)) * log((data$PSAr + c) / c) )
As you can see, my nonlinear function also appears in my fixed-effects formula, and I'm not sure how to implement that. I hope my approach is correct, but because of the first problem I haven't been able to test it.
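One piece that can be checked independently of nlmer is the `deriv` call. Its `function.arg` argument takes the argument names as strings, so passing the unquoted `psa_rock` vector mixes the data values into the list of names. The sketch below, using made-up input values, builds the function with both names quoted and confirms that it returns the value together with a gradient with respect to `c`. It does not address the `model.frame` error.

```r
# Build the nonlinear function with both argument names given as strings
nonl_fn <- deriv(~ log((psa_rock + c) / c),
                 namevec = "c",
                 function.arg = c("psa_rock", "c"))

# Hypothetical inputs, for illustration only
val <- nonl_fn(psa_rock = c(0.1, 0.5), c = 0.1)

print(as.numeric(val))        # function values
print(attr(val, "gradient"))  # derivative with respect to c
```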

R Model returning the error: Too many open devices

I've been working on the creation of a training model in R for MS Azure. When I initially set up the model it all worked fine. Now it's continuously returning the below:
{"error":{"code":"LibraryExecutionError","message":"Module execution encountered an internal library error.","details":[{"code":"FailedToEvaluateRScript","target":"Score Model (RPackage)","message":"The following error occurred during evaluation of R script: R_tryEval: return error: Error in png(file = \"3e25ea05d5bc49d683f4471ff40780bcrViz%03d.png\", bg = \"transparent\") : \n too many open devices\n"}]}}
I haven't changed anything, and have looked around online only to find references to other issues. My code is as follows:
Trainer R Script
# Modify Datatype, factor Level, Replace NA to 0
x <- dataset
for (i in seq_along(x)) {
  if (class(x[[i]]) == "character") {
    #Convert Type
    x[[i]] <- type.convert(x[[i]])
    #Apply Levels
    # levels(x[[i]]) <- levels(cols_modeled[, names(x)[i]]) # linked with levels in model
  }
  if (is.numeric(x[[i]]) && is.na(x[[i]]) ){
    #print("*** Updating NA to 0")
    x[[i]] <- 0
  }
}
df1 <- x
rm(x)
set.seed(1234)
model <- svm(Paid ~ ., data= df1, type= "C")
Scorer R Script
library(e1071)
scores <- data.frame( predicted_result = predict(model, dataset))
Has anyone come across this before?
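Separately from the device error, and possibly unrelated to it, the NA handling in the trainer script may not do what its comment says. `is.na(x[[i]])` returns one value per row, which is not a valid single condition for `&&`, and `x[[i]] <- 0` overwrites the whole column. A minimal sketch of an element-wise replacement, using a made-up stand-in for `dataset`:

```r
# Hypothetical stand-in for the `dataset` input
x <- data.frame(Paid = c("yes", "no", "yes"),
                amount = c(10, NA, 3),
                stringsAsFactors = FALSE)

for (i in seq_along(x)) {
  if (is.character(x[[i]])) {
    # Convert character columns to factors
    x[[i]] <- type.convert(x[[i]], as.is = FALSE)
  }
  if (is.numeric(x[[i]])) {
    # Replace only the missing entries, keeping the other values
    x[[i]][is.na(x[[i]])] <- 0
  }
}
print(x)
```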

number of rows in use has changed: remove missing values?

I have been trying to do stepwise selection on my variables with R. This is my code:
library(lattice)#to get the matrix plot, assuming this package is already installed
library(ftsa) #to get the out-of sample performance metrics, assuming this package is already installed
library(car)
mydata=read.csv("C:/Users/jgozal1/Desktop/Multivariate Project/Raw data/FINAL_alldata_norowsunder90_subgroups.csv")
names(mydata)
str(mydata)
mydata$country_name=NULL
mydata$country_code=NULL
mydata$year=NULL
mydata$Unemployment.female....of.female.labor.force...modeled.ILO.estimate.=NULL
mydata$Unemployment.male....of.male.labor.force...modeled.ILO.estimate.=NULL
mydata$Life.expectancy.at.birth.male..years.= NULL
mydata$Life.expectancy.at.birth.female..years. = NULL
str(mydata)
Full_model=lm(mydata$Fertility.rate.total..births.per.woman. + mydata$Immunization.DPT....of.children.ages.12.23.months. + mydata$Immunization.measles....of.children.ages.12.23.months. + mydata$Life.expectancy.at.birth.total..years. + mydata$Mortality.rate.under.5..per.1000.live.births. + mydata$Improved.sanitation.facilities....of.population.with.access. ~ mydata$Primary.completion.rate.female....of.relevant.age.group. + mydata$School.enrollment.primary....gross. + mydata$School.enrollment.secondary....gross. + mydata$School.enrollment.tertiary....gross. + mydata$Internet.users..per.100.people. + mydata$Primary.completion.rate.male....of.relevant.age.group. + mydata$Mobile.cellular.subscriptions..per.100.people. + mydata$Foreign.direct.investment.net.inflows..BoP.current.US.. + mydata$Unemployment.total....of.total.labor.force...modeled.ILO.estimate., data= mydata)
summary(Full_model) #this provides the summary of the model
Reduced_model=lm(mydata$Fertility.rate.total..births.per.woman. + mydata$Immunization.DPT....of.children.ages.12.23.months. + mydata$Immunization.measles....of.children.ages.12.23.months. + mydata$Life.expectancy.at.birth.total..years. + mydata$Mortality.rate.under.5..per.1000.live.births. + mydata$Improved.sanitation.facilities....of.population.with.access. ~1,data= mydata)
step(Reduced_model,scope=list(lower=Reduced_model, upper=Full_model), direction="forward", data=mydata)
step(Full_model, direction="backward", data=mydata)
step(Reduced_model,scope=list(lower=Reduced_model, upper=Full_model), direction="both", data=mydata)
This is the link to the dataset that I am using: http://speedy.sh/YNXxj/FINAL-alldata-norowsunder90-subgroups.csv
After setting the scope for my stepwise I get this error:
Error in step(Reduced_model, scope = list(lower = Reduced_model, upper = Full_model), :
number of rows in use has changed: remove missing values?
In addition: Warning messages:
1: In add1.lm(fit, scope$add, scale = scale, trace = trace, k = k, :
using the 548/734 rows from a combined fit
2: In add1.lm(fit, scope$add, scale = scale, trace = trace, k = k, :
using the 548/734 rows from a combined fit
I have looked at other posts with the same error, and the solution is usually to omit the NAs from the data used, but that hasn't solved my problem: I am still getting exactly the same error.
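The warning "using the 548/734 rows from a combined fit" suggests that the candidate models are being fit to different subsets of rows. One possible reason omitting NAs had no effect is that the formulas refer to `mydata$...` directly, so the columns are taken from `mydata` whatever is passed as `data`. A minimal sketch on made-up data, assuming bare column names in the formulas and a data frame reduced to complete rows before either model is fit:

```r
# Hypothetical toy data with missing values in one candidate predictor
set.seed(42)
toy <- data.frame(y = rnorm(50), x1 = rnorm(50), x2 = rnorm(50))
toy$x2[c(3, 17)] <- NA

# Keep only rows that are complete for every variable in the largest model
toy_complete <- na.omit(toy[, c("y", "x1", "x2")])

# Bare column names, so both models use exactly the rows of `toy_complete`
reduced <- lm(y ~ 1, data = toy_complete)
selected <- step(reduced,
                 scope = list(lower = ~ 1, upper = ~ x1 + x2),
                 direction = "forward",
                 trace = 0)
print(nobs(selected))
```

Because every candidate model sees the same rows, `step` has no row-count mismatch to complain about.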

Using Beta.Select function in R (prior estimate)

I am trying to formulate priors using total counts and a beta distribution.
I have written the following:
quantile(df$row, probs=c(0.00001, 0.5, 0.99999))
quantile1 <- list(p=0.5, x=8)
quantile2 <- list(p=0.99999, x=10)
quantile3 <- list(p=0.00001, x=1)
library("LearnBayes")
findBeta <- function(quantile1,quantile2,quantile3)
quantile1_p <- quantile1[[1]]; quantile1_q <- quantile1[[2]]
quantile2_p <- quantile2[[1]]; quantile2_q <- quantile2[[2]]
quantile3_p <- quantile3[[1]]; quantile3_q <- quantile3[[2]]
priorA <- beta.select(list(p=0.5, x=8), list(p=0.99999, x=10))
When I try to calculate priorA using the beta.select function, I get the following error:
Error in if (p0 < p) m.hi = m0 else m.lo = m0 :
missing value where TRUE/FALSE needed
In addition: Warning message:
In pbeta(x, K * m0, K * (1 - m0)) : NaNs produced
I just can't get rid of the error and do not know how to approach it any more. Urgently need help.
I am guessing (completely out of thin air) that you are dealing with percentages. In which case you want to use x/100
beta.select(list(p=0.5, x=.08), list(p=0.9, x=.10))
# [1] 28.02 318.74
Either way, while it would be nice for beta.select to throw a more appropriate error message (or rather, to have an error check in there), the root of the issue is that your x's are out of bounds. (As @Didzis noted, the interval for a beta distribution is [0, 1].)
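A small sketch of the kind of check described above, using only base R. The helper `check_beta_quantile` is hypothetical and not part of LearnBayes. The last lines plug the two parameters reported above into `qbeta` to see whether they reproduce the requested quantiles.

```r
# Hypothetical guard: a quantile spec for a beta prior needs p and x in (0, 1)
check_beta_quantile <- function(q) {
  stopifnot(q$p > 0, q$p < 1, q$x > 0, q$x < 1)
  invisible(q)
}

# x = 8 lies outside (0, 1), so the guard stops with an error
bad <- try(check_beta_quantile(list(p = 0.5, x = 8)), silent = TRUE)
print(inherits(bad, "try-error"))

# Rescaled to proportions, the spec passes
check_beta_quantile(list(p = 0.5, x = 0.08))

# Shape parameters as reported in the answer above
shape1 <- 28.02
shape2 <- 318.74
# Should be close to the requested 0.08 (median) and 0.10 (90th percentile)
print(qbeta(c(0.5, 0.9), shape1, shape2))
```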
