I am following the Titanic tutorial on DataCamp. After building the decision tree, plotting it fails with the error:
there is no package called ‘rpart.plot’
Any idea how to fix this?
# libraries
library(rattle)
library(rpart)
library(RColorBrewer)
# decision tree
my_tree_two <- rpart(Survived ~ Pclass + Sex + Age + SibSp + Parch + Fare + Embarked,
                     data = train,
                     method = 'class')
fancyRpartPlot(my_tree_two)
Please try installing "rpart.plot" and then loading it:
install.packages("rpart.plot")
library("rpart.plot")
The rpart.plot package should first be installed, then loaded in order to be used.
install.packages("rpart.plot")
library(rpart.plot)
Installing rpart does not install rpart.plot; it is a separate package and must be installed with the commands above.
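As a minimal sketch of the install-then-load step above, the installation can be made conditional so it only runs when the package is actually missing. This assumes a working CRAN mirror and write access to the library path, which may not hold on a hosted interpreter.

```r
# Install rpart.plot only if it is not already available, then attach it.
# Assumes a reachable CRAN mirror; on a locked-down hosted session
# install.packages() may fail.
if (!requireNamespace("rpart.plot", quietly = TRUE)) {
  install.packages("rpart.plot")
}
library(rpart.plot)

# TRUE once the package is attached to the search path
rpart_plot_attached <- "package:rpart.plot" %in% search()
```

Once rpart.plot is attached, fancyRpartPlot() from rattle should be able to find it.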
DataCamp's interpreter is sometimes buggy and doesn't behave as expected. If the package isn't available there, you won't be able to install it.
Consider reporting it to DataCamp.
I have the following packages loaded, and I have tried reinstalling them in case they had not downloaded properly. That was not the issue. I'm working on the latest versions of R and RStudio.
library(Matrix)
library(lme4)
library(lmerTest)
library(emmeans)
library(stats)
library(fitdistrplus)
library(tidyverse)
library(buildmer)
library(performance)
library(see)
library(sjPlot)
After importing and transforming my data a little (e.g. recoding 1s and 2s to more meaningful labels), I ran the following code, which has always worked for me in the past.
modelRT4 <- lmer(RT4 ~ condition_number +
                   (1 + condition_number | participant) +
                   (1 + condition_number | item_number),
                 data = alldata_Pred_RT, REML = TRUE)
However, I now get this error:
Error in diag(Lambdat) : object 'R_sparse_diag_get' not found
For context, RT4 is a reaction-time measure in seconds; the other variables are self-explanatory.
I'm having no problems getting descriptives or visualising the data using violin plots and box plots.
Any ideas why this might be and what can be done to rectify this? I can show more code if needed.
The full code can be found on my GitHub page: https://github.com/E-LeLuan/ASC_small/blob/master/Tidy_RT_data/Prediction/Prediction_tidy_script.R
I am trying to create a node-link diagram (decision tree) using parsnip and tidymodels. I am building a decision tree model for the StackOverflow dataset using the tidymodels package with rpart as the model engine. The model should predict whether a developer will work remotely (variable remote) based on the number of years of programming experience (years_coded_job), degree of career satisfaction (career_satisfaction), whether the job title is "Data Scientist" (data_scientist, yes/no), and the size of the employing company (company_size_number).
My pipeline
library(tidyverse)
library(tidymodels)
library(rpart.plot)
library(rpart)
library(rattle)
so <- read_rds(here::here("stackoverflow.rds"))
fit <- rpart(remote ~ years_coded_job + career_satisfaction + data_scientist + company_size_number,
             data = so,
             control = rpart.control(minsplit = 20, minbucket = 2))
fancyRpartPlot(fit, sub = "")
The plot I obtain
I want to know whether this is the correct approach for determining the predictors. Since I am not building the model with tidymodels here, is this the right way?
If you are going to use tidymodels and parsnip to fit your model, it's better to use that actual fitted model for any visualizations like this. You can get the underlying engine object from a parsnip model using $fit.
library(tidyverse)
library(tidymodels)
library(rattle)
#> Loading required package: bitops
#> Rattle: A free graphical interface for data science with R.
#> Version 5.4.0 Copyright (c) 2006-2020 Togaware Pty Ltd.
#> Type 'rattle()' to shake, rattle, and roll your data.
data(kyphosis, package = "rpart")
tree_fit <- decision_tree(min_n = 20) %>%
  set_engine("rpart") %>%
  set_mode("classification") %>%
  fit(Kyphosis ~ Age + Number + Start,
      data = kyphosis)
fancyRpartPlot(tree_fit$fit, sub = "")
Created on 2021-05-25 by the reprex package (v2.0.0)
For some kinds of visualizations, you will need to use repair_call().
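A minimal sketch of the repair_call() step, assuming the same kyphosis model as above and plotting with rpart.plot() rather than fancyRpartPlot(). The name tree_fit_repaired is only illustrative. The idea is that parsnip fits the engine through a generated call, so plotting functions that look up the training data through that call may not find it. repair_call() rewrites the stored call to point at the data you pass in.

```r
library(tidymodels)
library(rpart.plot)

data(kyphosis, package = "rpart")

# Fit the tree through parsnip, as in the answer above
tree_fit <- decision_tree(min_n = 20) %>%
  set_engine("rpart") %>%
  set_mode("classification") %>%
  fit(Kyphosis ~ Age + Number + Start, data = kyphosis)

# Rewrite the call stored in the engine object so it refers to `kyphosis`
# (tree_fit_repaired is a hypothetical name used for this sketch)
tree_fit_repaired <- repair_call(tree_fit, data = kyphosis)

# Plot the underlying rpart object
rpart.plot(tree_fit_repaired$fit)
```

Whether this step is needed depends on the plotting function; fancyRpartPlot() worked in the example above without it.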
I have tried, without success, to produce diagnostic plots for glmmTMB models using the DHARMa package. Example 1.1 in this vignette gives:
owls_nb1 <- glmmTMB(SiblingNegotiation ~ FoodTreatment*SexParent +
                      (1|Nest)+offset(log(BroodSize)),
                    contrasts=list(FoodTreatment="contr.sum",
                                   SexParent="contr.sum"),
                    family = nbinom1,
                    zi = ~1,
                    data=Owls)
plot(owls_nb1_simres <- simulateResiduals(owls_nb1) )
# Error in on.exit(add = TRUE, { : invalid 'add' argument
The same happens with:
if (!require(RCurl)) install.packages('RCurl'); library(RCurl)
unicorns <- read.csv(text= RCurl::getURL("https://raw.githubusercontent.com/marcoplebani85/datasets/master/unicorns.csv"))
# simulated data, obviously
unicorns_glmmTMB <- glmmTMB(Herd_size_n ~ food.quantity
                            + (1 + food.quantity | Locality)
                            + (1 + food.quantity | Year_Month),
                            family="poisson",
                            data=unicorns)
plot(simulateResiduals(unicorns_glmmTMB))
# Error in on.exit(add = TRUE, { : invalid 'add' argument
If I run the same model in lme4::glmer:
unicorns_glmer <- glmer(Herd_size_n ~ food.quantity
                        + (1 + food.quantity | Locality)
                        + (1 + food.quantity | Year_Month),
                        family="poisson",
                        data=unicorns)
...and "feed" it to:
plot(simulateResiduals(unicorns_glmer))
I obtain diagnostic plots without issues (by the way, I am aware that the model unicorns_glmer is suboptimal and can be improved).
I'm using:
glmmTMB version 1.0.2.9000 freshly installed from github;
DHARMa version 0.4.1;
R version 3.6.3;
MacOS Sierra version 10.12.6.
Has anyone encountered the same problem? Does anyone know how to solve it?
EDIT: my question was originally on how packages performance and DHARMa handle glmmTMB objects. For the sake of focus and clarity I removed the references to package performance, thus making this question specific to glmmTMB and DHARMa.
It looks like this is a bug that was present in R <= 4.0.1. From the R NEWS file for version 4.0.2:
on.exit() now correctly matches named arguments, thanks to PR#17815 (including patch) by Brodie Gaslam.
I have attempted to fix the glmmTMB code so it works around the bug.
You could try
remotes::install_github("glmmTMB/glmmTMB/glmmTMB@on_exit_order")
and see if that helps (provided nothing goes wrong, this branch should be merged into master shortly ...)
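Before installing a development branch, it may be worth checking whether the running R version falls in the affected range. This is a minimal sketch based only on the NEWS entry quoted above, which places the on.exit() fix in R 4.0.2. The variable name affected_by_on_exit_bug is illustrative.

```r
# The on.exit() argument-matching fix is listed in the R 4.0.2 NEWS,
# so any earlier version may show the "invalid 'add' argument" error.
affected_by_on_exit_bug <- getRversion() < "4.0.2"

if (affected_by_on_exit_bug) {
  message("R ", getRversion(), " predates the on.exit() fix; ",
          "consider upgrading R or using the patched glmmTMB branch.")
} else {
  message("R ", getRversion(), " already includes the on.exit() fix.")
}
```

Upgrading R to 4.0.2 or later would be the other way around the problem, if that is an option on your system.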
I have discovered some heteroscedasticity in my model that I would like to compensate for with more robust standard errors. I have tried to use the Huber-White robust standard errors from the merDeriv package in R, but I believe these only work for a GLMM with a binomial distribution. Is there a way I could achieve the same thing for a negative binomial distribution?
Model:
library(lme4)
model <- glmer.nb(Jobs ~ 1 + Month + Year + (1|Region), data = df)
Huber-White robust standard errors:
library(merDeriv)
bread.glmerMod(model)
Error:
Error in vcov.lmerMod(object, full = full) : estfun.lmerMod() only works for lmer() models.
Thank you for any help!
This looks like a bug in the package, as far as I can tell (the bread.glmerMod function was calling estfun.lmerMod rather than estfun.glmerMod; there's a broader question here about the design of the generic functions, but never mind ...)
You should be able to install a fixed version from my fork via remotes::install_github("bbolker/merDeriv"), then reload the package and try again.
Alternately, download the tarball, change vcov.lmerMod to vcov.glmerMod in the last line of R/bread.glmerMod.R, and re-install the package ...
Try something like this:
library(lme4)
model <- glmer.nb(Jobs ~ 1 + Month + Year + (1|Region), data = df)
library(sandwich)  # provides vcovHC()
cov_m1 <- vcovHC(model, type = "HC1", sandwich = TRUE)
se <- sqrt(diag(cov_m1))
(I can't confirm whether this works, since there isn't a reproducible example.)
I am a newbie to R.
I am using R version 3.0.1.
I have installed the rpart package using
install.packages("rpart")
I selected USA (CA 1) in the pop-up box, and it installed successfully.
> p_data = read.csv(file="/home/sudeep/Desktop/mysql.tsv",sep="\t",dec=".",header=TRUE)
> dtree <- rpart(paid ~ .,data = p_data, method="class")
Error: could not find function "rpart"
So what am I doing wrong?
You don't actually need to install rpart manually; it is a recommended package that ships with the standard R distribution. However, you do need to load it into your R session with a call to library:
library(rpart)
rpart(paid ~ ., data=....)
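To illustrate the difference between installed and loaded, here is a minimal sketch that uses the kyphosis data bundled with rpart in place of the asker's file. The names attached_before, attached_after and fit are illustrative only.

```r
# Installed is not the same as attached: rpart() is only visible
# after library(rpart) puts the package on the search path.
attached_before <- "package:rpart" %in% search()

library(rpart)

attached_after <- "package:rpart" %in% search()

# kyphosis ships with rpart, so this runs without any external file
fit <- rpart(Kyphosis ~ Age + Number + Start,
             data = kyphosis, method = "class")
print(fit)
```

Calling rpart::rpart(...) with the package prefix would also work without library(), as long as the package is installed.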