I am trying to plot my svm model.
library(foreign)
library(e1071)
x <- read.arff("contact-lenses.arff")
#alt: x <- read.arff("http://storm.cis.fordham.edu/~gweiss/data-mining/weka-data/contact-lenses.arff")
model <- svm(`contact-lenses` ~ . , data = x, type = "C-classification", kernel = "linear")
The contact-lenses ARFF file is one of the built-in data files in Weka.
However, I now run into an error when trying to plot the model.
plot(model, x)
Error in plot.svm(model, x) : missing formula.
The problem is that your model has multiple covariates. plot() will only work without a formula if your data= argument has exactly three columns (one of which is the response). For example, in the ?plot.svm help page, you can call
data(cats, package = "MASS")
m1 <- svm(Sex~., data = cats)
plot(m1, cats)
So, since you can only show two dimensions on a plot, you need to specify which variables to use for x and y when there is more than one pair to choose from:
cplus <- cats
cplus$Oth <- rnorm(nrow(cplus))
m2 <- svm(Sex ~ ., data = cplus)
plot(m2, cplus)             # error: missing formula
plot(m2, cplus, Bwt ~ Hwt)  # OK
plot(m2, cplus, Hwt ~ Oth)  # OK
So that's why you're getting the "Missing Formula" error.
There is another catch as well: plot.svm will only plot continuous variables along the x and y axes, and the contact-lenses data.frame has only categorical variables. As far as I can tell, plot.svm simply does not support this. You'll have to decide how you want to summarize that information in your own visualization.
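If you still want a visual summary, one option is to tabulate the model's fitted classes against a pair of the categorical predictors and plot that table, for example as a mosaic plot. This is only a sketch, not something plot.svm provides; the column names are assumed from the Weka contact-lenses file (tear-prod-rate, astigmatism), so adjust them if your copy differs.
# Sketch: summarize predicted classes against two categorical predictors
pred <- predict(model, x)
tab <- table(tear = x$`tear-prod-rate`, astig = x$astigmatism, predicted = pred)
mosaicplot(tab, color = TRUE,
           main = "SVM predicted class by tear production and astigmatism")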
Preface - I really hope this makes sense!
I ran a linear mixed-effects model using an imputed dataset (FYI, the data is a mids object imputed with mice). The model has a three-way interaction between 3 continuous variables. I am now trying to plot the interaction using the interactions::interact_plot function. However, I'm receiving an error when I run the plot code, which I believe is due to the fact that the model came from a mids object and not a data frame. Does anyone know how to address this error, or is there a better way to get the plot I'm trying to get?
Thanks very much in advance!
MIDmod1 <- with(data = df.mids, expr = lmer(GC ~ Age + Sex + Edu + Stress*Time*HLI + (1|ID)))
summary(pool(MIDmod1))
interact_plot(
  model = MIDmod1,
  pred = Time,
  modx = Stress,
  mod2 = HLI,
  data = df.mids,
  interval = TRUE,
  y.label = 'Global cognition composite score',
  modx.labels = c('Low Baseline Stress (-1SD)', 'Moderate Baseline Stress (Mean)', 'High Baseline Stress (+1SD)'),
  mod2.labels = c('Low HLI (-1SD)', 'Moderate HLI (Mean)', 'High HLI (+1SD)'),
  legend.main = '') + ylim(-2, 2)
Error:
Error in rep(1, times = nrow(data)) : invalid 'times' argument
Note - I also get an error if I don't include the data argument (optional argument for this function).
Error in formula.default(object, env = baseenv()) : invalid formula
BTW - I am able to generate the plot when the model comes from a data frame - an example of what this should look like is included here: 1
Sorry, but it won’t be that easy. A multiple-imputation object definitely requires special treatment, and none of the many R packages that can plot interactions are likely to work out of the box.
Here’s a minimal example, adapted from the multiple imputation vignette of the marginaleffects package. (Disclaimer: I am the author.)
library(mice)
library(lme4)
library(ggplot2)
library(marginaleffects)
# insert missing data in an existing dataset and impute
iris_miss <- iris
iris_miss$Sepal.Width[sample(1:nrow(iris), 20)] <- NA
iris_mice <- mice(iris_miss, m = 20, printFlag = FALSE, .Random.seed = 1024)
iris_mice <- complete(iris_mice, "all")
# fit a model on one imputed dataset and use the `plot_predictions()` function
# with the `draw=FALSE` argument to extract the data that we want to plot
fit <- function(dat) {
  mod <- lmer(Sepal.Width ~ Petal.Width * Petal.Length + (1 | Species), data = dat)
  out <- plot_predictions(mod, condition = list("Petal.Width", "Petal.Length" = "threenum"), draw = FALSE)
  # `mice` requires a unique row identifier called "term"
  out$term <- out$rowid
  class(out) <- c("custom", class(out))
  return(out)
}
# `tidy.custom()` is called by `mice` when combining datasets, but the output of fit()
# already has the right structure and column names, so this method just returns its input
tidy.custom <- function(x, ...) return(x)
# Fit on each imputation
mod_mice <- lapply(iris_mice, fit)
# Pool
mod_pool <- pool(mod_mice)$pooled
# Merge back some of the covariates
datplot <- data.frame(mod_pool, mod_mice[[1]][, c("Petal.Width", "Petal.Length")])
# Plot
ggplot(datplot, aes(Petal.Width, estimate, color = Petal.Length)) +
  geom_line() +
  theme_minimal()
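Applied to your own setup, the same recipe would look roughly like the sketch below. This is untested and simply reuses the object and variable names from your question (df.mids, GC, Age, Sex, Edu, Stress, Time, HLI, ID); the "threenum" shortcut asks plot_predictions() for mean ± 1 SD values of each moderator, analogous to interact_plot's default levels.
# Sketch only: adapting the recipe above to the model in the question
imp_list <- complete(df.mids, "all")
fit_gc <- function(dat) {
  mod <- lmer(GC ~ Age + Sex + Edu + Stress * Time * HLI + (1 | ID), data = dat)
  out <- plot_predictions(mod,
                          condition = list("Time", "Stress" = "threenum", "HLI" = "threenum"),
                          draw = FALSE)
  out$term <- out$rowid
  class(out) <- c("custom", class(out))
  out
}
mods <- lapply(imp_list, fit_gc)
datplot2 <- data.frame(pool(mods)$pooled, mods[[1]][, c("Time", "Stress", "HLI")])
ggplot(datplot2, aes(Time, estimate, color = factor(Stress))) +
  geom_line() +
  facet_wrap(~ HLI)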
I'm trying to fit raw data with an asymptotic function of the form $$f(x) = a + (b-a)(1-\exp(-c x))$$ using R. To do so I used the following code:
library(drc)
library(aomisc)   # provides DRC.asymReg()
library(ggpubr)   # loads ggplot2
rawData <- import("path/StackTestData.tsv")   # import() presumably from the rio package
# executing the regression
X <- rawData$x
Y <- rawData$y
model <- drm(Y ~ X, fct = DRC.asymReg())
# creating the regression function
f_0_ <- model$coefficients[1] #value for y if x=0
steepness <- model$coefficients[2]
plateau <- model$coefficients[3]
eq <- function(x){f_0_+(plateau-f_0_)*(1-exp(-steepness*x))}
# plotting the regression function together with the raw data
ggplot(rawData, aes(x = x, y = y)) +
  geom_line(col = "red") +
  stat_function(fun = eq, col = "blue") +
  ylim(10, 12.5)
In some cases I get a proper regression function. However, with the attached data I don't: the fitted curve shows no relationship to the raw data whatsoever, as shown in the figure below. Can you suggest a better way to perform the asymptotic regression, or do you know where the error lies?
Best Max
R 4.1.2 was used with RStudio 1.4.1106. For ggplot the package ggpubr was loaded; for DRC.asymReg() the packages aomisc and drc were loaded.
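For the record, one thing that may be worth trying here (just a sketch, not a verified fix for this particular data set) is base R's nls() with the self-starting SSasymp() model: it fits the same curve shape without hand-picked starting values and avoids any ambiguity about the order of the drm coefficients.
# Sketch: the same asymptotic curve fitted with base R's self-starting model.
# SSasymp parameterizes f(x) = Asym + (R0 - Asym) * exp(-exp(lrc) * x),
# i.e. a = R0 (value at x = 0), b = Asym (plateau), c = exp(lrc) (steepness).
fit_nls <- nls(y ~ SSasymp(x, Asym, R0, lrc), data = rawData)
ggplot(rawData, aes(x = x, y = y)) +
  geom_point() +
  geom_line(aes(y = predict(fit_nls)), col = "blue")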
I have two glmer models with two covariates each that I'm trying to plot into a single figure.
MWE:
## generalized linear mixed model
library(lme4)   # provides glmer() and the cbpp data
cbpp$response <- sample(c(0, 1), replace = TRUE, size = nrow(cbpp))
gm1 <- glmer(response ~ size + incidence + (1 | herd),
             data = cbpp, family = binomial)
cbpp$obs <- 1:nrow(cbpp)
gm2 <- glmer(response ~ size + incidence + (1 | herd) + (1 | obs),
             family = binomial, data = cbpp)
I am trying to plot the predicted values against each covariate for each model. I found the sjPlot library and the plot_model function, which can plot these predictions when using type = "pred". Calling this function individually on each model works perfectly and yields two separate figures like this for each model:
However, I'm not familiar with R and I am having a hard time trying to put the 4 plots in the same figure.
The plot_model function has a grid parameter, which only works for models with a Poisson distribution. For gm1 and gm2, I get the following error when I call plot_model(gm1, type = "pred", grid = TRUE):
Error in if (attr(x, "logistic", exact = TRUE) == "1" && attr(x, "is.trial", : missing value where TRUE/FALSE needed
Anyway, I would not be able to plot the two models in one figure using this, so I tried three different approaches. First, I saw the plot_models function, which takes multiple models as input. When I try to pass the two models as arguments, calling plot_models(gm1, gm2), I get the following error:
Error: $ operator not defined for this S4 class
Second, I tried using the par function, setting mfrow, and then calling plot_model again, without success. I don't get any error, but the plots keep showing up as individual figures.
Third, I tried using the gridExtra library. Calling
p1 <- plot_model(gm1, type = "pred")
p2 <- plot_model(gm2, type = "pred")
grid.arrange(p1, p2)
results in the following error:
Error in gList(list(ppt = list(data = list(x = c(-2, -1, 0, 1, 2, 3, 4, : only 'grobs' allowed in "gList"
Does anyone have an insight on this?
EDIT
This seems to work:
pp1 <- plot_model(gm1, type = "pred")
pp2 <- plot_model(gm2, type = "pred")
plot_grid(c(pp1, pp2))
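plot_grid() here comes from sjPlot and accepts a list of ggplot objects. That also explains the earlier failures: plot_model(type = "pred") appears to return a list of plots (one per covariate) rather than a single plot when the model has several terms, so grid.arrange(p1, p2) receives lists instead of grobs, and par(mfrow = ...) has no effect on ggplot/grid graphics at all. If you prefer gridExtra, passing the flattened list through its grobs argument should also work; a minimal sketch:
library(gridExtra)
grid.arrange(grobs = c(pp1, pp2), ncol = 2)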
I am trying to create a 2D plot using SVM in library(kernlab), but it appears the plot function
is only appropriate for binary classification. I would like to be able to plot 3 (or more) groups, as in the example below.
My data is structured just like the iris data, so I will use it to illustrate.
After fitting the model:
fit.ksvm <- ksvm(Species~., data=iris, kernel= "rbfdot", prob.model=TRUE)
fit.ksvm
I use the plot function for ksvm:
plot(fit.ksvm, data=iris)
and get the following message:
> plot(fit.ksvm, data=iris)
Error in .local(x, ...) :
plot function only supports binary classification
When I try similar analyses using a two-way classification, the plot is produced. So, I think the issue is multiple groups. Can anyone think of a way to create a two-dimensional "heat-map" similar to the one below, but using an SVM classification model with three (or more?) classes?
two-way SVM classification
x <- rbind(matrix(rnorm(120), , 2), matrix(rnorm(120, mean = 3), , 2))
y <- matrix(c(rep(1, 60), rep(-1, 60)))
svp <- ksvm(x, y, type = "C-svc")
plot(svp, data = x)
You could use the e1071 library instead:
library(e1071)
m <- svm(Species~., data = iris)
plot(m, iris, Petal.Width ~ Petal.Length, slice = list(Sepal.Width = 3, Sepal.Length = 4))
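If you specifically want to stay with kernlab and get a filled decision-region plot for three or more classes, a generic alternative is to predict the fitted model over a grid of two features and draw the regions yourself. The following is only a sketch (a two-predictor model, plotted with ggplot2), not kernlab's own plot method:
# Sketch: multi-class decision regions for a ksvm fit on two features
library(kernlab)
library(ggplot2)
fit2 <- ksvm(Species ~ Petal.Length + Petal.Width, data = iris,
             kernel = "rbfdot", prob.model = TRUE)
grd <- expand.grid(
  Petal.Length = seq(min(iris$Petal.Length), max(iris$Petal.Length), length.out = 200),
  Petal.Width  = seq(min(iris$Petal.Width),  max(iris$Petal.Width),  length.out = 200))
grd$Species <- predict(fit2, grd)
ggplot(iris, aes(Petal.Length, Petal.Width)) +
  geom_tile(data = grd, aes(fill = Species), alpha = 0.3) +
  geom_point(aes(color = Species)) +
  theme_minimal()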
I have data in R that I want to test on various models. I have split the data into two sets: 80% training and 20% testing. Now I want to fit a linear model on the training set and predict on the testing set.
I have done this so far:
temp <- lm(cityMpg ~ peakRpm + horsePower + wheelBase, data = train)
temp_test <- predict(temp, test)
plot(temp_test)
Here, I get the scatter plot. Now I just want a line in this scatter plot.
When I use abline(temp_test), I get an error.
I want the line to be added automatically; I do not want to specify the coordinates myself.
The error is:
Error in int_abline(a = a, b = b, h = h, v = v, untf = untf, ...) :
invalid a=, b= specification
This is a bit tricky for a multi-dimensional model.
Get some data (you neglected to include a reproducible example: see http://tinyurl.com/reproducible-000 ...)
library(foreign)
dat <- read.arff(url("http://www.cs.umb.edu/~rickb/files/UCI/autos.arff"))
Split into training and test data sets:
train <- dat[1:150,]
test <- dat[151:nrow(dat),]
The variable names are a bit awkward for R (the dashes are interpreted as minus operators, so we have to use back-quotes to protect the names):
fit <- lm(`city-mpg` ~ `peak-rpm`+horsepower+`wheel-base`,data=train)
temp_test <- predict(fit,test)
Plot the predictions vs peak RPM:
par(las=1,bty="l") ## cosmetic
plot(test[["peak-rpm"]],temp_test,xlab="peak rpm",ylab="predicted")
In order to add the line, we have to adjust the intercept according to some baseline values of the other parameters: we'll use the mean (another alternative is to center all the predictor variables before fitting the model):
cf <- coef(fit)
abline(a = cf["(Intercept)"] +
         mean(test$horsepower) * cf["horsepower"] +
         mean(test$`wheel-base`) * cf["`wheel-base`"],
       b = cf["`peak-rpm`"])
Another way to do this is to use predict():
newdat <- with(test,
               data.frame(horsepower = mean(horsepower),
                          "wheel-base" = mean(`wheel-base`),
                          "peak-rpm" = seq(min(`peak-rpm`),
                                           max(`peak-rpm`),
                                           length = 41),
                          check.names = FALSE))
newdat["city-mpg"] <- predict(fit, newdat)
with(newdat, lines(`peak-rpm`, `city-mpg`, col = 4))
(41 points is silly for a straight line -- we could have used just 2 -- but will work well if you want to plot something curved, like confidence intervals or a nonlinear fit.)
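For example, pointwise confidence bands could be added along the same newdat grid; a short sketch using predict.lm's standard interval = "confidence" option:
# Sketch: add pointwise confidence bands along the same peak-rpm grid
ci <- predict(fit, newdat, interval = "confidence")
with(newdat, {
  lines(`peak-rpm`, ci[, "lwr"], col = 4, lty = 2)
  lines(`peak-rpm`, ci[, "upr"], col = 4, lty = 2)
})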
Alternatively you could just fit the marginal model, but the actual fitted line is somewhat different (it will only be the same if all the predictors are orthogonal to each other):
fit2 <- lm(`city-mpg` ~ `peak-rpm`,data=train)
abline(fit2,col="red")