R: variable lengths differ (found for '...')

I am fitting a multinomial logistic regression. Here is the relevant code:
ml$Category2 <- relevel(ml$Category, ref = "NON-CRIMINAL")
test <- multinom(Category2 ~ DayOfWeek +PdDistrict+liquor+district+MoonPhase,data = ml)
b<- predict(test, type="class")
The problem appears when I try to build a classification table. I run this code:
factcrime<- factor(b, levels=levels(ml$Category))
ctab<- xtabs(~ Category +factcrime, data=ml)
addmargins(ctab)
Then I received the error message:
Error in model.frame.default(formula = ~Category + factcrime, data =
ml) : variable lengths differ (found for 'factcrime')
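
One likely cause, assuming ml contains missing values in some of the predictors: the model drops incomplete rows when fitting, so predict() without newdata returns fewer values than nrow(ml), and xtabs() then sees two variables of different lengths. The sketch below reproduces the same effect with lm() on made-up data (the data frame d and its columns are hypothetical, and lm() stands in for multinom() so that only base R is needed).

```r
# Hypothetical data: 10 rows, with one missing predictor value.
set.seed(1)
d <- data.frame(y = rnorm(10), x = c(rnorm(9), NA))

# The default na.action drops the incomplete row at fit time.
fit <- lm(y ~ x, data = d)
n_default <- length(predict(fit))        # one value per complete row only

# Option 1: predict on the original data; incomplete rows give NA.
p_all <- predict(fit, newdata = d)
n_newdata <- length(p_all)               # one value per row of d

# Option 2: ask the fit to pad its predictions with NA.
fit2 <- lm(y ~ x, data = d, na.action = na.exclude)
n_exclude <- length(predict(fit2))       # one value per row of d

c(n_default, n_newdata, n_exclude, nrow(d))
```

If this is what happens in the question, calling predict(test, newdata = ml, type = "class"), or tabulating only the rows the model actually used, should make the lengths agree.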

Related

How to solve the error "Variable lengths differ (found for 'column name')"

So this is the code we used,
#Applying Normalization
n<-function(b){
(b-min(b))/(max(b)-min(b))
}
f2<-as.data.frame(lapply(f1[2:19],n))
f3<-cbind(f2,f1$fips)
f<-cbind(f3,f1$score)
#Partitioning Data into Training & Validation Data sets
set.seed(1234) # we set seed so that the value of sample remains same throughout the code
DroughtData <- f[sample(nrow(f), 5000), ]
pd <- sample(2,nrow(DroughtData), replace = TRUE, prob = c(0.8,0.2))
train <- DroughtData[pd == 1,]
validate <- DroughtData[pd == 2,]
regtree <- rpart(formula= f1$score~f1$fips+PRECTOT+QV2M+T2M+T2MDEW+T2MWET+TS++WS50M_MAX+WS50M_MIN+WS50M_RANGE
,data = train, control = rpart.control(maxdepth = 3))
I'm getting an error when fitting a regression model to the dataset. The error is as follows:
Error in model.frame.default(formula = f1$score ~ f1$fips + PRECTOT + :
variable lengths differ (found for 'PRECTOT')
Can anyone help me resolve this error?
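
The formula above mixes f1$score and f1$fips, which are taken from the full data frame f1, with bare names such as PRECTOT, which are looked up in train. Because train is a subset, the two groups have different lengths. A minimal sketch of the problem and the fix, using made-up data and lm() in place of rpart() so it runs with base R only (the names full, score and PRECTOT here are illustrative stand-ins):

```r
# Hypothetical stand-in for f1: 100 rows.
set.seed(1234)
full  <- data.frame(score = rnorm(100), PRECTOT = rnorm(100))
train <- full[sample(nrow(full), 80), ]

# Fails: full$score has 100 values, PRECTOT comes from train (80 rows).
res <- try(lm(full$score ~ PRECTOT, data = train), silent = TRUE)
failed <- inherits(res, "try-error")

# Works: every variable is a bare column name resolved inside train.
fit <- lm(score ~ PRECTOT, data = train)
n_used <- nrow(model.frame(fit))

c(failed = failed, n_used = n_used)
```

In the question's code this would probably also mean naming the columns explicitly when binding them, for example cbind(f2, fips = f1$fips, score = f1$score), and then writing score ~ fips + PRECTOT + ... with data = train.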

Error in ConfusionMatrix: "data" and "reference" should be factors with the same levels

I am trying to do k-fold cross-validation with logistic regression, but I ran into errors when computing the confusion matrix. I cannot figure out how to solve this. Any help is greatly appreciated.
In addition:
Warning message:
In predict.lm(object, newdata, se.fit, scale = 1, type = if (type == :prediction from a rank-deficient fit may be misleading
Here is my code:
for(i in 1:10){
fold_val <- tele.df[folds[[i]],]
fold_train <- tele.df[-folds[[i]],]
tele.default.lr <- glm(Churn ~ ., data = fold_train, family = "binomial")
tele.default.lr.pred.train <- predict(tele.default.lr, fold_train, type = "response")
print(confusionMatrix(tele.default.lr.pred.train, as.factor(fold_train$Churn)))
tele.default.lr.pred.valid <- predict(tele.default.lr, fold_val, type = "response")
print(confusionMatrix(tele.default.lr.pred.valid, as.factor(fold_val$Churn)))
}
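
In the loop above, predict(..., type = "response") returns numeric probabilities, while confusionMatrix() expects a factor of predicted classes with the same levels as the reference. A minimal sketch of the conversion, on made-up data (the data frame df, the "Yes"/"No" labels and the 0.5 cutoff are assumptions for illustration):

```r
# Hypothetical churn data with a two-level factor outcome.
set.seed(42)
df <- data.frame(x = rnorm(200))
df$Churn <- factor(ifelse(df$x + rnorm(200) > 0, "Yes", "No"))

fit  <- glm(Churn ~ x, data = df, family = binomial)
prob <- predict(fit, df, type = "response")  # probability of the second level, "Yes"

# Turn probabilities into a factor that shares the reference's levels.
pred <- factor(ifelse(prob > 0.5, "Yes", "No"), levels = levels(df$Churn))
same_levels <- identical(levels(pred), levels(df$Churn))

tab <- table(Predicted = pred, Actual = df$Churn)
tab
# With the caret package installed: caret::confusionMatrix(pred, df$Churn)
```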

'same level factors' error in confusion matrix

I keep getting an error from the confusion matrix, and I get an error from the table function too.
advertising <- read.csv('C:/Users/matpo/Desktop/advertising_1.csv', stringsAsFactors = TRUE)
LogMod <- glm(Clicked.on.Ad ~ Daily.Time.Spent.on.Site + Age + Area.Income, data=advertising, family=binomial(link="logit"))
predicted <- predict(LogMod, advertising, type="response")
#matrix below
glm.pred <- ifelse(predicted > 0.8, "click", "not_click")
table(glm.pred, Clicked.on.Ad)
confusionMatrix(glm.pred, advertising)
The error I get:
Error in table(glm.pred, Clicked.on.Ad) :
object 'Clicked.on.Ad' not found
> confusionMatrix(glm.pred, advertising)
Error: `data` and `reference` should be factors with the same levels.
Could you please point out where I am making a mistake?
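
Two things stand out in the code above: table() cannot find Clicked.on.Ad because it is a column of advertising rather than a free-standing object, and confusionMatrix() is given the whole data frame as the reference instead of that one column. The sketch below shows one way to line the two up, assuming Clicked.on.Ad is coded 0/1 (the data here are simulated, and Age is the only predictor kept for brevity):

```r
# Hypothetical stand-in for the advertising data.
set.seed(7)
advertising <- data.frame(Age = round(runif(150, 18, 60)))
advertising$Clicked.on.Ad <- rbinom(150, 1, plogis((advertising$Age - 40) / 5))

LogMod <- glm(Clicked.on.Ad ~ Age, data = advertising,
              family = binomial(link = "logit"))
predicted <- predict(LogMod, advertising, type = "response")

# Use the same labels and level order for predictions and reference.
reference <- factor(advertising$Clicked.on.Ad, levels = c(0, 1))
glm.pred  <- factor(as.integer(predicted > 0.8), levels = c(0, 1))

# Refer to the column through the data frame, not by its bare name.
tab <- table(Predicted = glm.pred, Actual = reference)
tab
# With the caret package installed: caret::confusionMatrix(glm.pred, reference)
```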

Variables length differ on Step function r

I fitted a model using the lmer() function (it works well). I have 11 explanatory variables. Three of them, when present in the model, cause the step() function (from package lmerTest) to return the error: "Variables length differ (found on "...")" where "..." is the formula call.
I don't have any NA values in the data: there are 600 rows and all three of the problematic variables (H, I, J) are factors.
My code is:
library(purrr) ## for rdunif()
library(lmerTest)
data2 = as.data.frame(matrix(c(rdunif(600*7,1,5),
rdunif(600*3,0,1),
rdunif(600,1,9),
rep(c("a","b"),300)),
nrow = 600), byrow = FALSE)
names(data2) = c("A","B","C","D", "E","F","G","H","I","J","Z","M")
data2[,7:10] = lapply(data2[,7:10],factor)
data2[,c(1:6,11)] = lapply(data2[,c(1:6,11)],as.numeric)
mod1 = lmer(Z ~ A+B+C+D+E+F+G+
#H+
#I+
#J+
(1|M),data2)
step.mod1 = lmerTest::step(mod1) #it works
#
mod2 = lmer(Z ~ A+B+C+D+E+F+G+H+
#I+
#J+
(1|M),data2)
step.mod2 = lmerTest::step(mod2) #it does not work and returns: Variables length differ (found on "A+B+C+D+E+F+G+")
mod3 = lmer(Z ~ A+B+C+D+E+F+G+H+I+J+
(1|M),data2)
step.mod3 = lmerTest::step(mod3) #it does not work and returns: Variables length differ (found on "A+B+C+D+E+F+G+H+I+")
I know that this error is common when there are NAs, but what is the error in this case? How can I fix it?

Can the boxTidwell function handle binary outcome variables?

I initially wanted to run a boxTidwell() (found in the "car" package) analysis on my prospective Logistic Regression model (BinaryOutcomeVar ~ ContinuousPredVar + ContinuousPredVar^2 + ContinuousPredVar^3). I ran into issues:
Error in x - xbar : non-numeric argument to binary operator
In addition: Warning message:
In mean.default(x) : argument is not numeric or logical: returning NA
So, I created a reproducible example to demonstrate the error:
Doesn't work:
boxTidwell(formula = Treatment ~ uptake, other.x = ~ poly(x = colnames(CO2)[c(1,2,4)], degree = 2), data = CO2)
boxTidwell(y = CO2$Treatment, x = CO2$uptake)
Works:
boxTidwell(formula = prestige ~ income + education, other.x = ~ poly(x = women , degree = 2), data = Prestige)
I've been goofing around with the other.x parameter and am guessing that's the issue.
Question
Does anyone know (1) whether the boxTidwell() function works with binary outcome variables, and (2) the logic behind the other.x argument? I can't get my dummy example to work either.

After further searching, it looks like the car:::boxTidwell can't handle the binary outcome variable in the formula, but it can be hand coded:
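require(MASS)
require(car)
d1 <- read.csv("path for your csv file", sep = ",", header = TRUE)
# Replace the two placeholder column names with your own.
# x must be strictly positive because log(x) is used below.
x <- d1$explanatory_variable_name
y <- d1$dependent_variable_name
# Baseline fit with glm()
m1res <- glm(y ~ x, family = binomial(link = "logit"))
coeff1 <- coefficients(summary(m1res))
# Add the x*log(x) term; its p-value tests linearity in the logit
lnx <- x * log(x)
m2res <- glm(y ~ x + lnx, family = binomial(link = "logit"))
coeff2 <- coefficients(summary(m2res))
alpha0 <- 1.0
alpha <- alpha0    # defined up front so it exists even if the loop never runs
pvalue <- coeff2[3, 4]
pvalue
beta1 <- coeff1[2, 1]
beta2 <- coeff2[3, 1]
iter <- 0
err <- 1
max_iter <- 100    # cap iterations in case the power estimate does not settle
# Update the power alpha until the x*log(x) term is no longer significant
while (pvalue < 0.1 && iter < max_iter) {
  alpha <- (beta2 / beta1) + alpha0
  err <- abs(alpha - alpha0)
  alpha0 <- alpha
  mx <- x^alpha
  m1res <- glm(y ~ mx, family = binomial(link = "logit"))
  coeff1 <- coefficients(summary(m1res))
  mlnx <- mx * log(x)
  m2res <- glm(y ~ mx + mlnx, family = binomial(link = "logit"))
  coeff2 <- coefficients(summary(m2res))
  pvalue <- coeff2[3, 4]
  beta1 <- coeff1[2, 1]
  beta2 <- coeff2[3, 1]
  iter <- iter + 1
}
# PRINT THE POWER TO CONSOLE
alpha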
The above code is taken from:
https://sites.google.com/site/ayyalaprem/box-tidwelltransform
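
The first step of the hand-coded procedure above, adding an x*log(x) term and checking its p-value, can be tried directly on the CO2 data from the question. This is only a sketch under the assumption that Treatment is recoded to 0/1 with "chilled" as the event; it is not a replacement for car::boxTidwell().

```r
# CO2 ships with base R (datasets package); uptake is strictly positive.
data(CO2)
y <- as.integer(CO2$Treatment == "chilled")   # assumed coding: chilled = 1
x <- CO2$uptake

# Logistic fit with the extra x*log(x) term.
fit <- glm(y ~ x + I(x * log(x)), family = binomial(link = "logit"))

# p-value of the x*log(x) coefficient (third row, fourth column).
p_xlogx <- coef(summary(fit))[3, 4]
p_xlogx
```

A small p-value would suggest that uptake is not linear in the logit, which is the case where the iterative power search above becomes relevant.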
