Help fitting a poisson glmer (getting an error message) - r

I am trying to fit a Poisson glmer model in R, to determine if 4 experimental
treatments affected the rate at which plants developed new branches over time.
New branches were counted after 35, 70 and 83 days and data were organised as follows:
treatment replicate.plant time branches
a ID4 35 0
a ID4 70 1
a ID4 83 1
a ID12 35 1
a ID12 70 3
a ID12 83 8
Loading the package lme4, I ran the following model:
mod<-glmer(branches ~ treatment + (1|time),
family=poisson,
data=dataset)
but I obtain the following error message:
Error in get(name, envir = asNamespace(pkg), inherits = FALSE) :
object '.setDummyField' not found
Can anyone please give me an indication of why I am getting this error and what it means?
Any advice on how to make this model run will be greatly appreciated.

This is a known issue, see here: https://github.com/lme4/lme4/issues/54
The problem seems to be limited to R version 3.0.0. You should update to a more recent version.
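The fix requires no change to the model itself; a quick base-R check confirms whether you are on the affected version:

```r
# The '.setDummyField' error comes from a reference-class change in
# R 3.0.0 itself, not from the model formula or data.
getRversion()

if (getRversion() == "3.0.0") {
  message("This lme4 bug is specific to R 3.0.0 -- upgrade R to fix it")
}

# On a newer R, the original call should run unchanged:
# mod <- glmer(branches ~ treatment + (1 | time),
#              family = poisson, data = dataset)
```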

can't set type = "class" in predict.rpart

I am trying to use decision tree classification on my dataset which contains 2 features and 1 dependent variable that looks like:
Age Salary Purchased(Y/N)
26 43000 0
17 57000 0
19 76000 0
27 58000 0
27 84000 0
32 150000 1
25 33000 0
If I use
classifier = rpart(formula = Purchased ~ ., data = training_set)
I get a result like
2 4 5 9
0.03296703 0.03296703 0.03296703 0.03296703
I need the most likely class, not the probability. But when I use
y_pred = predict(classifier, newdata = test_set[-3], type = 'class')
I get
Error in predict.rpart(classifier, newdata = test_set[-3], type =
"class") : Invalid prediction for "rpart" object
Can you help me with that?
For anyone with the same problem who does not find the above solutions helpful: I had the same problem with the predict function in the 'rpart' package, and it turned out another loaded package also provided a predict function. Uninstalling that package ('Harvest.tree' in my case) fixed it, so function masking is worth checking.
A second way to reach a function that is masked by another package is to specify the namespace explicitly. Note that predict() is a generic that lives in the stats package, so the call is stats::predict(...), which dispatches to rpart's predict.rpart method. That way you can keep all packages loaded and do not have to find out which package is causing the masking.
Got the solution. I should have encoded the dependent variable as a factor:
dataset$Purchased = factor(dataset$Purchased, levels = c(0, 1))
After adding this line everything works fine.
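To make the accepted fix concrete, here is a minimal self-contained sketch; the column names follow the question, and the tiny data set is made up beyond the rows shown:

```r
library(rpart)

# Toy data shaped like the question's
dataset <- data.frame(
  Age       = c(26, 17, 19, 27, 27, 32, 25),
  Salary    = c(43000, 57000, 76000, 58000, 84000, 150000, 33000),
  Purchased = c(0, 0, 0, 0, 0, 1, 0)
)

# The key step: a factor response makes rpart fit a classification
# tree, which is what type = "class" requires.
dataset$Purchased <- factor(dataset$Purchased, levels = c(0, 1))

classifier <- rpart(Purchased ~ ., data = dataset,
                    control = rpart.control(minsplit = 2))
y_pred <- predict(classifier, newdata = dataset, type = "class")
```

With a numeric response, rpart silently fits a regression tree, and predict.rpart then rejects type = "class" with exactly the error from the question.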

R - RandomForest with two Outcome Variables

Fairly new to using randomForest statistical package here.
I'm trying to run a model with 2 response variables and 7 predictor variables, but I can't seem to because of the lengths of the response variables and/or the nature of fitting the model with 2 response variables.
Let's assume this is my data and model:
> table(data$y1)
0 1 2 3 4
23 43 75 47 21
> length(data$y1)
[1] 209
> table(data$y2)
0 2 3 4
104 30 46 29
> length(data$y2)
[1] 209
m1<-randomForest(cbind(y1,y2)~a+b+c+d+e+f+g, data, mtry=7, importance=TRUE)
When I run this model, I receive this error:
Error in randomForest.default(m, y, ...) :
length of response must be the same as predictors
I did some troubleshooting and found that cbind() simply stacks the values of the two response variables, doubling the original length and possibly causing the error above. For example,
length(cbind(y1, y2))
[1] 418
sapply(data, length)
  a   b   c   d   e   f   g  y1  y2
209 209 209 209 209 209 209 209 209
I then tried to solve this by running randomForest individually on each response variable and then applying combine() to the two regression models, but came across these issues:
m2<-randomForest(y1~a+b+c+d+e+f+g, data, mtry=7, importance=TRUE)
m3<-randomForest(y2~a+b+c+d+e+f+g, data, mtry=7, importance=TRUE)
combine(m2,m3)
Warning message:
In randomForest.default(m, y, ...) :
The response has five or fewer unique values. Are you sure you want to do regression?
I then decided to treat the randomForest models as classification models and applied as.factor() to both response variables before running randomForest, but then came across this new issue:
m4<-randomForest(as.factor(y1)~a+b+c+d+e+f+g, data, mtry=7, importance=TRUE)
m5<-randomForest(as.factor(y2)~a+b+c+d+e+f+g, data, mtry=7, importance=TRUE)
combine(m4,m5)
Error in rf$votes + ifelse(is.na(rflist[[i]]$votes), 0, rflist[[i]]$votes) :
non-conformable arrays
My guess is that I can't combine() classification models.
I hope this question about running a multivariate random forest makes sense. Let me know if there are further questions; I can also go back and make adjustments.
Combine your columns outside the randomForest formula:
data[["y3"]] <- paste0(data$y1, data$y2)
randomForest(y3~a+b+c+d+e+f+g, data, mtry=7, importance=TRUE)
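A hedged sketch of that workaround with simulated data (sizes match the question; the predictors here are made-up numeric columns). Converting the pasted column to a factor makes the forest an explicit classification model:

```r
set.seed(1)
# Toy data with the same shape as the question: two discrete responses
# and a couple of numeric predictors standing in for a..g.
data <- data.frame(
  y1 = sample(0:4, 209, replace = TRUE),
  y2 = sample(c(0, 2, 3, 4), 209, replace = TRUE),
  a = rnorm(209), b = rnorm(209)
)

# Combine the two responses into one label per row,
# e.g. y1 = 2, y2 = 3 -> "2.3"
data[["y3"]] <- factor(paste0(data$y1, ".", data$y2))

# Single classification forest on the combined label (wrapped so the
# sketch still runs where randomForest is not installed):
if (requireNamespace("randomForest", quietly = TRUE)) {
  m <- randomForest::randomForest(y3 ~ a + b, data = data,
                                  importance = TRUE)
}
```

One caveat: the combined factor treats every (y1, y2) pair as an unrelated class, so any ordering in the original responses is lost.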

Warning message 'newdata' had 1 row but variables found have 16 rows in R

I am supposed to use the predict function to predict when fjbjor is 5.5. I always get this warning message, and I have tried many ways, but it keeps appearing. Can anyone see what I am doing wrong here?
This is my code
fit.lm <- lm(fjbjor~amagn, data=bjor)
summary(fit.lm)
new.bjor<- data.frame(fjbjor=5.5)
predict(fit.lm,new.bjor)
and this comes out
1 2 3 4 5 6 7 8 9 10 11
5.981287 2.864521 9.988559 5.758661 4.645530 2.419269 4.645530 5.313409 6.871792 3.309773 4.200278
12 13 14 15 16
3.755026 5.981287 5.536035 1.974016 3.755026
Warning message: 'newdata' had 1 row but variables found have 16 rows
If anyone can see what is wrong I would be really thankful for the help.
Your model is fjbjor ~ amagn, where fjbjor is response and amagn is covariate. Then your newdata is data.frame(fjbjor=5.5).
newdata should be used to provide covariates rather than response. predict will only retain columns of covariates in newdata. For your specified newdata, this will be NULL. As a result, predict will use the internal model frame for prediction, which returns you fitted values.
The warning message is fairly clear. predict determines the expected number of predictions from nrow(newdata), which is 1. But then what I described above happened so 16 fitted values are returned. Such mismatch produces the warning.
Looks like the model you really want is: amagn ~ fjbjor.
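A minimal base-R sketch with made-up numbers showing both the warning and the fix:

```r
# Made-up data standing in for 'bjor'
bjor <- data.frame(amagn  = c(1, 2, 3, 4, 5),
                   fjbjor = c(2.1, 3.9, 6.2, 8.1, 9.8))

fit.lm <- lm(fjbjor ~ amagn, data = bjor)

# Wrong: newdata names the response, so predict() finds no covariates,
# falls back on the training data, and returns all fitted values
# (plus the "'newdata' had 1 row" warning).
# predict(fit.lm, data.frame(fjbjor = 5.5))

# If the goal is to predict from fjbjor = 5.5, flip the model:
fit.rev <- lm(amagn ~ fjbjor, data = bjor)
pred <- predict(fit.rev, data.frame(fjbjor = 5.5))  # one prediction
```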

r rms error using validate

I'm building a linear model with ols() from the rms package:
model<-ols(nallSmells ~ rcs(size, 5) + rcs(minor,5)+rcs(change_churn,3)
+rcs(review_rate,0), data=quality,x=T, y=T)
When I want to validate my model using:
validate(model,B=100)
I get the following error:
Error in lsfit(x, y) : only 0 cases, but 2 variables
In addition: Warning message:
In lsfit(x, y) : 1164 missing values deleted
But if I decrease B, e.g. B=10, it works. Why can't I use more bootstrap repetitions? I also noticed that the seed has an effect when I use this method.
Can someone give me some advice?
UPDATE:
I'm using rcs(review_rate,0) because I want to assign 0 knots to this predictor, according to my DOF budget. I noticed that the problem is with the data in review_rate. Even if I omit the parameter in rcs() and just put the name of the predictor, I get errors. This is the frequency of the data in review_rate:
count(quality$review_rate)
          x freq
1 0.8571429    1
2 0.9483871    1
3 0.9789474    1
4 0.9887640    1
5 0.9940476    1
6 1.0000000 1159
I wonder if there is a relationship with the values of this vector, because when I built the OLS model I get the following warning:
Warning message:
In rcspline.eval(x, nk = nknots, inclx = TRUE, pc = pc, fractied = fractied) :
5 knots requested with 6 unique values of x. knots set to 4 interior values.
The values of the other predictors are positive reals, but if I omit the review_rate predictor I don't get any warning or error.
Thanks for your support.
I add the link for a sample of 100 of my data for replication
https://www.dropbox.com/s/oks2ztcse3l8567/examplestackoverflow.csv?dl=0
X represents the dependent variable and Y4 the predictor that is giving me problems.
require(rms)
Data <- read.csv("examplestackoverflow.csv")
testmodel <- ols(X ~ rcs(Y1) + rcs(Y2) + rcs(Y3) + rcs(Y4), data = Data, x = T, y = T)
validate(testmodel,B=1000)
Kind regards,
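For what it's worth, the frequency table above already suggests the cause: review_rate is almost constant (1159 of 1164 values are exactly 1), so many of the bootstrap resamples drawn by validate() contain too few distinct values for rcs()/lsfit() to work with. A base-R sketch of that effect, reconstructing a vector from the counts in the question:

```r
set.seed(42)
# Five rare values plus 1159 copies of 1.0, as in the question's table
review_rate <- c(0.8571429, 0.9483871, 0.9789474, 0.9887640, 0.9940476,
                 rep(1, 1159))

# Count bootstrap resamples (as validate() would draw them) that end up
# with fewer than 3 distinct values -- too few for a spline basis.
degenerate <- replicate(100, {
  length(unique(sample(review_rate, replace = TRUE))) < 3
})
sum(degenerate)  # typically a handful out of 100, and more as B grows
```

This is consistent with B = 100 failing while B = 10 sometimes succeeds, and with the seed mattering: whether a degenerate resample is drawn is a matter of chance.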

error message when performing Gamma glmer in R- PIRLS step-halvings failed to reduce deviance in pwrssUpdate

I am trying to perform a glmer in R using the Gamma error family. I get the error message:
"Error: (maxstephalfit) PIRLS step-halvings failed to reduce deviance in pwrssUpdate"
My response variable is flower mass. My fixed effects are base mass, F1 treatment, and fertilisation method. My random effects are line, and maternal ID nested within line.
When I perform the same analysis using an integer response (e.g. flower number), this error does not occur.
Here is a sample of my data:
LINE MATERNAL_ID F1TREAT SO FLWR_MASS BASE_MASS
17 81 stress s 2.7514 9.488
5 41 control o 0.3042 1.809
37 89 control o 2.3749 6.694
5 41 stress s 3.6140 9.729
9 5 control s 0.5020 7.929
13 7 stress s 0.4914 0.969
35 88 stress s 0.4418 1.840
1 57 control o 2.1531 6.673
13 7 stress s 3.0191 7.131
Here is the code I am using:
library(lme4)
m <- glmer(data=mydata,
FLWR_MASS~BASE_MASS*F1TREAT*SO+(1 |LINE/MATERNAL_ID),family=Gamma)
(I am using r 3.0.3 for windows)
#HongOoi answered this question in the comments, but I will repeat it here for anyone else having this issue. He suggested changing
family=Gamma
to
family=Gamma(link=log)
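For completeness, a base-R look at what that change does; the model call itself is repeated only as a comment, since it needs the poster's data:

```r
# Gamma's default link is "inverse", which can push the PIRLS
# iterations toward invalid (non-positive) mean values; the log link
# keeps the mean positive and is usually more stable.
Gamma()$link          # "inverse"
fam <- Gamma(link = log)
fam$link              # "log"

# The corrected call (sketch; assumes the poster's mydata):
# library(lme4)
# m <- glmer(FLWR_MASS ~ BASE_MASS * F1TREAT * SO
#              + (1 | LINE/MATERNAL_ID),
#            data = mydata, family = Gamma(link = log))
```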
