Support vector machine plotting - r

I was trying to plot an SVM classification, but I ran into a problem that I have no idea how to fix. I looked at the documentation and some videos but am still stuck. Here is my code:
library(ISLR)
library(e1071)
svm.oj2 <- svm(Purchase~.,data=OJ,kernel='linear',cost=1,scale = F)
plot(svm.oj2,data=OJ)
Here is the error:
Error in plot.svm(svm.oj2, data = OJ) : missing formula.
Really appreciate any help

I think this is what you are trying to do:
library(ISLR)
library(e1071)
svm.oj2 <- svm(Purchase~.,data=OJ,kernel='linear',cost=1,scale = F)
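# Scatterplot matrix of OJ; rows that are support vectors get colour 2, all others colour 1
plot(OJ, col = seq_len(nrow(OJ)) %in% svm.oj2$index + 1)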
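The `missing formula` error itself comes from `plot.svm`, which needs a formula naming the two variables to draw whenever the model has more than two predictors. A minimal sketch of that call, following the example in the e1071 documentation and using the built-in iris data as a stand-in for OJ (the slice values are arbitrary illustrative constants):

```r
library(e1071)

# Fit an SVM on all four iris predictors (more than two, as in the OJ model)
fit <- svm(Species ~ ., data = iris, kernel = "linear", cost = 1)

# The formula picks the two plotted dimensions; `slice` holds the
# remaining predictors at fixed values (arbitrary choices here)
plot(fit, iris, Petal.Width ~ Petal.Length,
     slice = list(Sepal.Width = 3, Sepal.Length = 4))
```

For OJ, the same pattern would name two of its numeric predictors in the formula. Whether it runs unchanged with OJ's factor predictor has not been checked here.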

Related

Why is set.seed not working for mcp R package?

I am doing a simple change point analysis using the mcp R package, but my results still vary each time I rerun my code even after including set.seed. Appreciate any help on this! (Below is my code, thank you!)
library(rjags)
library(mcp)
model = list(y~1+x,~0+x,~0+x,~0+x,~0+x,~0+x)
set.seed(42)
fit_mcp = mcp(model, data=hosp_df)
summary(fit_mcp)
plot(fit_mcp)
newdata = data.frame(x = c(2023:2040))
prediction <- fitted(fit_mcp, newdata = newdata)

Problem with running emmeans (error in assign '.Last.ref_grid')

I have been having trouble running the emmeans function (from the emmeans package) whenever I try to follow up a two-way between-groups ANOVA with estimated marginal means.
A simple example:
library(emmeans)
library(tidyverse)
df <- tibble(fct1 = factor(rep(1:3, 10)),
             fct2 = factor(rep(2:1, 15)),
             DV = rnorm(30, 100, 15))
model1 <- lm(DV ~ fct1 * fct2, df)
emmeans(model1, "fct1", by = "fct2")
Returns:
Error in assign(".Last.ref_grid", object, inherits = TRUE) :
cannot change value of locked binding for '.Last.ref_grid'
No matter what data I run it on, always the same error shows up.
Thank you for any help!

This should stop it:
emm_options(save.ref_grid = FALSE)
This will keep it from saving the most recently created reference grid (or trying to, in your case). However, it may be worth trying to understand why this is happening. If you do:
.Last.ref_grid
you should see what was last saved, which might be a clue. Then try deleting it.
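The error message is what base R reports whenever code assigns to a locked binding. A minimal base R sketch reproducing that mechanism in a scratch environment (the environment `e` and the stored values are hypothetical; where the locked `.Last.ref_grid` lives in the asker's session is not known from the question):

```r
# Scratch environment standing in for wherever .Last.ref_grid is stored
e <- new.env()
assign(".Last.ref_grid", "old grid", envir = e)
lockBinding(".Last.ref_grid", e)

# Assigning to a locked binding fails with the message from the question
msg <- tryCatch(
  {
    assign(".Last.ref_grid", "new grid", envir = e)
    "no error"
  },
  error = function(err) conditionMessage(err)
)
print(msg)

# Unlocking and removing the object lets later assignments succeed
unlockBinding(".Last.ref_grid", e)
rm(list = ".Last.ref_grid", envir = e)
assign(".Last.ref_grid", "new grid", envir = e)
```

In a real session, `bindingIsLocked()` on the environment that holds the object would be one way to confirm whether this is the cause.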

Error in predict.randomForest

I was hoping someone could help me with an issue I am having with the predict function of the randomForest package in R. I keep getting the same error when I try to predict on my test data.
Here's my code so far:
extractFeatures <- function(RCdata) {
  features <- c(4, 9:13, 17:20)
  fea <- RCdata[, features]
  fea$Week <- as.factor(fea$Week)
  fea$Age_Range <- as.factor(fea$Age_Range)
  fea$Race <- as.factor(fea$Race)
  fea$Referral_Source <- as.factor(fea$Referral_Source)
  fea$Referral_Source_Category <- as.factor(fea$Referral_Source_Category)
  fea$Rehire <- as.factor(fea$Rehire)
  fea$CLFPR_.HS <- as.factor(fea$CLFPR_.HS)
  fea$CLFPR_HS <- as.factor(fea$CLFPR_HS)
  fea$Job_Openings <- as.factor(fea$Job_Openings)
  fea$Turnover <- as.factor(fea$Turnover)
  return(fea)
}
gp <- runif(nrow(RCdata))
RCdata <- RCdata[order(gp), ]
train <- RCdata[1:4600, ]
test <- RCdata[4601:6149, ]
rf <- randomForest(extractFeatures(train), suppressWarnings(as.factor(train$disposition_category)), ntree=100, importance=TRUE)
testpredict <- predict(rf, extractFeatures(test))
"Error in predict.randomForest(rf, extractFeatures(test)) :
Type of predictors in new data do not match that of the training data."
I have tried adding in the following line to the code, and still receive the same error:
testpredict <- predict(rf, extractFeatures(test), type="prob")
I traced the error to the training data having a level or two that is not found in the test data. I then tried a suggestion I found online to set the levels of the test data to those of the training data, but I keep getting NULL for the fields I am using in both the training and test sets.
levels(test$Referral)
NULL
I can see the levels when I use the function, however.
levels(as.factor(test$Referral))
So then I tried the same suggestion I found online with adjusting the levels of the test to equal that of the training data using the following function and received an error:
levels(as.factor(test$Referral)) -> levels(as.factor(train$Referral))
Error in `levels<-.factor`(`*tmp*`, value = c(... :
number of levels differs
I am sure there is something simple I am missing (I am still very new to R), so any insight you can provide would be unbelievably helpful. Thanks!
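Two details in the attempts above can be checked in base R. `levels()` returns NULL for any column that is not already a factor (for example a character column), and assigning to `levels(as.factor(...))` cannot update the data frame. One common approach, sketched here with small hypothetical `train` and `test` data frames and a hypothetical `Referral` column, is to rebuild the test column as a factor using the training levels:

```r
# Hypothetical stand-ins for the real train/test data
train <- data.frame(Referral = c("Agency", "Online", "Walk-in", "Online"),
                    stringsAsFactors = FALSE)
test  <- data.frame(Referral = c("Online", "Agency"),
                    stringsAsFactors = FALSE)

levels(test$Referral)            # NULL: the column is character, not factor

# Convert the training column once, then reuse its levels for the test column
train$Referral <- factor(train$Referral)
test$Referral  <- factor(test$Referral, levels = levels(train$Referral))

levels(test$Referral)            # now the same three levels as in train
identical(levels(test$Referral), levels(train$Referral))
```

Values that occur only in the test data become NA under this conversion, so they would need separate handling.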

bic.glm predict error: "newdata is missing variables"

I've spent a lot of time trying to solve this error and searching for solutions without any luck, and I thank you in advance for your help.
I'm trying to create predicted values from the coefficients created via BMA. Whenever I run my predict function, I am getting a "newdata is missing variables" error. All variables included in the original model are present in the new dataframe, so I'm not quite sure what the problem is. I'm working with a fairly large dataset with many independent variables. I'm fairly new to R, so I apologize if this is an obvious question!
y<-df$y
x<-df
x$y<-NULL
bic.glm<-bic.glm(x, y, strict=FALSE, OR=20, glm.family="binomial", factortype=TRUE)
predict(bic.glm.bwt, x)
I've also tried it this way:
bic.glm<-bic.glm(y~., data=df, strict=FALSE, OR=20, glm.family="binomial", factortype=TRUE)
predict(bic.glm, x)
And also with creating a new data frame...
bic.glm<-bic.glm(y~., data=df, strict=FALSE, OR=20, glm.family="binomial", factortype=TRUE)
newdata<-x
predict(bic.glm, newdata=x)
Each time I receive the same error message:
Error in predict.bic.glm(bic.glm, newdata=x) :
newdata is missing variables
Any help is very much appreciated!

First, it is bad practice to give the object on the LHS the same name as the function you are calling. You may be masking the function bic.glm from further use.
That minor comment aside... I just encountered the same error. After some digging, it seems that predict.bic.glm checks the names vs. the mle matrix in the bic.glm object. The problem is that somewhere in bic.glm, if factors are used, those names get a '.x' or just '.' appended at the end. Therefore, whenever you use factors you will get this error.
I communicated this to package maintainers. Meanwhile, you can work around the bug by renaming the column names of the mle object, like this (using your example):
fittedBMA <- bic.glm(y~., data=df, glm.family="binomial")
colnames(fittedBMA$mle) <- colnames(model.matrix(y~., data=df)) ### this is the workaround
predict(fittedBMA, newdata=x) ### should work now, if x has the same variables as df
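The workaround relies on `model.matrix()` producing one column name per coefficient. A small base R sketch of how those names differ from the data frame's own column names once a factor is involved (the data frame `d` is made up for illustration, and this shows only the standard `model.matrix()` expansion, not BMA's internal naming):

```r
# Made-up data with one factor and one numeric predictor
d <- data.frame(
  y    = c(0, 1, 0, 1, 1, 0),
  race = factor(c(1, 2, 3, 1, 2, 3)),
  age  = c(20, 25, 30, 35, 40, 45)
)

# Coefficient-level names: the factor is expanded into dummy columns
mm_names <- colnames(model.matrix(y ~ ., data = d))
print(mm_names)   # "(Intercept)" "race2" "race3" "age"

# Names that have no direct counterpart among the data frame's columns
setdiff(mm_names, names(d))
```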
Okay, so first look at the Usage section in the CRAN documentation for BMA::bic.glm.
This example from the documentation is instructive for a data.frame:
Example 2 (binomial)
library(MASS)
library(BMA)
data(birthwt)
y <- birthwt$low
x <- data.frame(birthwt[,-1])
x$race <- as.factor(x$race)
x$ht <- (x$ht>=1)+0
x <- x[,-9]
x$smoke <- as.factor(x$smoke)
x$ptl <- as.factor(x$ptl)
x$ht <- as.factor(x$ht)
x$ui <- as.factor(x$ui)
bic.glm.bwT <- bic.glm(x, y, strict = FALSE, OR = 20,
                       glm.family = "binomial",
                       factor.type = TRUE)
predict(bic.glm.bwT, newdata = x)
bic.glm.bwF <- bic.glm(x, y, strict = FALSE, OR = 20,
                       glm.family = "binomial",
                       factor.type = FALSE)
predict(bic.glm.bwF, newdata = x)

R code to find slope in Tableau

I tried to write a formula to get the slope using the R and Tableau integration.
The formula in my calculated field is reported as valid. However, when I try to plot it, I get an error. The formula I am using is as follows:
SCRIPT_REAL("mydata <- data.frame(cbind(yy = .arg1, xx = .arg2)); fit <- lm(yy ~ xx,new data=mydata); fit$coeff[[2]]",(avg([Revenue Growth])),(avg([WTI]) ))
The error I receive is:
Any help would be appreciated.
Thanks.

Try this:
SCRIPT_REAL(
'mydata <- data.frame(yy=.arg1, xx=.arg2);
fit <- lm(yy~xx, mydata)$coefficients[2]',
AVG([Revenue Growth]),
AVG([WTI])
)
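The R part of the script can be checked outside Tableau first. A minimal sketch, with hypothetical vectors standing in for the `.arg1` and `.arg2` values that Tableau would pass in:

```r
# Hypothetical inputs; in Tableau these arrive as .arg1 and .arg2
arg1 <- c(3, 5, 7, 9, 11)   # plays the role of AVG([Revenue Growth])
arg2 <- c(1, 2, 3, 4, 5)    # plays the role of AVG([WTI])

mydata <- data.frame(yy = arg1, xx = arg2)
fit <- lm(yy ~ xx, data = mydata)

# The slope is the coefficient on xx (the second coefficient)
slope <- unname(coef(fit)["xx"])
print(slope)   # 2 for these inputs, since yy = 2 * xx + 1 exactly
```

If this runs cleanly in plain R, any remaining error is more likely in the Tableau expression around the script than in the R code.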
