I'm unable to call the function randomForest.plot() when loading a randomForest object through an RData file.
library("randomForest")
load("rf.RData")
plot(rf)
I get the error:
Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x), :
'data' must be of a vector type, was 'NULL'
I get the same error when I call randomForest:::plot.randomForest(rf).
Other function calls on rf work just fine.
EDIT:
See output of str(rf)
str(rf)
List of 15
$ call : language randomForest(x = data[, match("feat1", names(data)):match("feat_n", names(data))], y = data[, match("my_y", n| __truncated__ ...
$ type : chr "regression"
$ predicted : Named num [1:723012] -1141 -1767 -1577 NA -1399 ...
..- attr(*, "names")= chr [1:723012] "1" "2" "3" "4" ...
$ oob.times : int [1:723012] 3 4 6 3 2 3 2 6 7 5 ...
$ importance : num [1:150, 1:2] 6172 928 6367 5754 1013 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:150] "feat1" "feat2" "feat3" "feat4" ...
.. ..$ : chr [1:2] "%IncMSE" "IncNodePurity"
$ importanceSD : Named num [1:150] 400.9 96.7 500.1 428.9 194.8 ...
..- attr(*, "names")= chr [1:150] "feat1" "feat2" "feat3" "feat4" ...
$ localImportance: NULL
$ proximity : NULL
$ ntree : num 60
$ mtry : num 10
$ forest :List of 11
..$ ndbigtree : int [1:60] 392021 392219 392563 392845 393321 392853 392157 392709 393223 392679 ...
..$ nodestatus : num [1:393623, 1:60] -3 -3 -3 -3 -3 -3 -3 -3 -3 -3 ...
..$ leftDaughter : num [1:393623, 1:60] 2 4 6 8 10 12 14 16 18 20 ...
..$ rightDaughter: num [1:393623, 1:60] 3 5 7 9 11 13 15 17 19 21 ...
..$ nodepred : num [1:393623, 1:60] -8.15 -31.38 5.62 -59.87 -16.06 ...
..$ bestvar : num [1:393623, 1:60] 118 57 82 77 65 148 39 39 12 77 ...
..$ xbestsplit : num [1:393623, 1:60] 1.08e+02 -8.26e+08 -2.50 8.55e+03 1.20e+04 ...
..$ ncat : Named int [1:150] 1 1 1 1 1 1 1 1 1 1 ...
.. ..- attr(*, "names")= chr [1:150] "feat1" "feat2" "feat3" "feat4" ...
..$ nrnodes : int 393623
..$ ntree : num 60
..$ xlevels :List of 150
.. ..$ feat1 : num 0
.. ..$ feat2 : num 0
.. ..$ feat3 : num 0
.. ..$ feat4 : num 0
.. ..$ featn : num 0
.. .. [list output truncated]
$ coefs : NULL
$ y : num [1:723012] -1885 -1918 -1585 -1838 -2035 ...
$ test : NULL
$ inbag : NULL
- attr(*, "class")= chr "randomForest"
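One thing that stands out in the str() output above is that the list has no mse element, which a fitted regression forest normally carries. The sketch below uses a hypothetical stand-in list (rf_mock, not a real fit) to show that coercing a missing element to a matrix produces the same message as the reported error. Whether a missing mse is the actual cause here is an assumption; it could be checked on the real object with is.null(rf$mse).

```r
# Hypothetical stand-in for the loaded object: same type and ntree as in the
# str() output and, like that output, no `mse` element.
rf_mock <- structure(list(type = "regression", ntree = 60),
                     class = "randomForest")

# A list element that does not exist is returned as NULL.
mse_missing <- is.null(rf_mock$mse)

# Coercing NULL to a matrix fails; capture the message instead of stopping.
msg <- tryCatch(as.matrix(rf_mock$mse),
                error = function(e) conditionMessage(e))

print(mse_missing)
print(msg)
```

If mse does turn out to be NULL on the real object, the forest was probably saved without its per-tree error history, and the error curve cannot be drawn from that file alone.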
I have the following object, which contains many nested lists, and I need to convert all of them to data frames in R:
glimpse(pickle_data)
List of 32
$ 2020-02-01:List of 11
..$ model :List of 6
.. ..$ : num [1:88, 1:100] 0.00487 0.13977 -0.07648 0.18417 -0.1105 ...
.. ..$ : num [1:25, 1:100] -0.186 0.0703 0.1479 0.0321 0.1185 ...
.. ..$ : num [1:100(1d)] 0.0119 0.0457 0.023 0.0295 0.0115 ...
.. ..$ : num [1:25, 1:132] -0.024 0.0756 -0.0724 -0.1112 -0.1974 ...
.. ..$ : num [1:33, 1:132] 0.1904 0.1275 0.0684 0.1707 0.017 ...
.. ..$ : num [1:132(1d)] 0.0434 0.0636 0.0444 0.0329 0.0393 ...
..$ X_train : num [1:2494, 1:13, 1:88] 0.0676 0.0697 0.0717 0.0753 0.0783 ...
..$ X_test : num [1:3180, 1:13, 1:88] 0.0676 0.0697 0.0717 0.0753 0.0783 ...
..$ df_input : feat__price__mean_22_days ... feat__us_area_harvested__slope_66_days
ds ...
2010-01-01 NaN ... NaN
2010-01-04 NaN ... NaN
2010-01-05 NaN ... NaN
2010-01-06 NaN ... NaN
2010-01-07 NaN ... NaN
... ... ... ...
2022-09-13 0.699482 ... 0.157974
2022-09-14 0.705994 ... 0.163528
2022-09-15 0.713729 ... 0.171177
2022-09-16 0.722944 ... 0.176913
2022-09-19 0.728798 ... 0.184181
[3317 rows x 88 columns]
..$ index_test :DatetimeIndex(['2010-07-13', '2010-07-14', '2010-07-15', '2010-07-16',
'2010-07-19', '2010-07-20', '2010-07-21', '2010-07-22',
'2010-07-23', '2010-07-26',
...
'2022-09-06', '2022-09-07', '2022-09-08', '2022-09-09',
'2022-09-12', '2022-09-13', '2022-09-14', '2022-09-15',
'2022-09-16', '2022-09-19'],
dtype='datetime64[ns]', name='ds', length=3180, freq=None)
..$ window : int 12
..$ shift_coeff : int 4
..$ target_frequency : int 6
..$ nb_of_months_to_predict : int 9
..$ spine_params :List of 3
.. ..$ month_duration: int 22
.. ..$ week_duration : int 5
.. ..$ min_periods : int 7
..$ neural_network_hyperparams:List of 10
.. ..$ number_of_neurons_first_layer: int 25
.. ..$ l1_regularization : int 0
.. ..$ l2_regularization : int 0
.. ..$ decrease_reg_deep_layers : logi TRUE
.. ..$ decrease_reg_factor : int 10
.. ..$ dropout_rate : num 0.1
.. ..$ use_dropout : logi TRUE
.. ..$ initial_learning_rate : num 0.000151
.. ..$ output_layer_cell : chr "lstm"
.. ..$ gradient_initialization : int 42
$ 2020-03-01:List of 11
..$ model :List of 6
Is it possible to extract a data frame from the first list and from every list nested inside it, and so on until the end of this object, using R?
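A minimal sketch of one way to walk such a structure, assuming the elements of interest are ordinary R matrices. The toy object and the helper name to_df are made up for illustration. Elements that are not 2-d matrices (scalars, 1-d or 3-d arrays, objects imported from Python) are returned unchanged and would need their own handling.

```r
# Hypothetical miniature of the structure: dates -> named elements -> matrices.
toy <- list(
  `2020-02-01` = list(
    model  = list(matrix(1:6, nrow = 2), matrix(1:4, nrow = 2)),
    window = 12L
  ),
  `2020-03-01` = list(
    model  = list(matrix(6:1, nrow = 2), matrix(4:1, nrow = 2)),
    window = 12L
  )
)

# Recursively convert 2-d matrices to data frames, descend into plain lists,
# and return everything else unchanged.
to_df <- function(x) {
  if (is.matrix(x)) {
    as.data.frame(x)
  } else if (is.list(x) && !is.data.frame(x)) {
    lapply(x, to_df)
  } else {
    x
  }
}

converted <- to_df(toy)
class(converted[["2020-02-01"]]$model[[1]])
```

The result keeps the original nesting and names, so each data frame can still be reached by date and element name.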
I have fitted many Bayesian models using the MCMCglmm package in R, like this one:
model=MCMCglmm(scale(lifespan)~scale(weight)*scale(littersize),
random=~idv(DNA1)+idv(DNA2),
data=df,
family="gaussian",
prior=prior1,
thin=50,
burnin=5000,
nitt=50000,
verbose=F)
summary(model)
post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept) 11.23327 8.368 13.73756 6228 <2e-04 ***
weight -1.63770 -2.059 -1.23457 6600 <2e-04 ***
littersize 0.40960 0.024 0.80305 6600 0.0415 *
weight:littersize -0.33411 -0.635 -0.04406 5912 0.0248 *
I would like to plot the resulting interaction (weight:littersize) with the ggeffects or sjPlot package, like this:
plot_model(model,
type = "int",
terms = c("scale(lifespan)", "scale(weight)", "scale(littersize)"),
mdrt.values = "meansd",
ppd = TRUE)
But I get the following output:
`scale(weight)` was not found in model terms. Maybe misspelled?
`scale(littersize)` was not found in model terms. Maybe misspelled?
Error in terms.default(model) : no terms component nor attribute
In addition: Warning messages:
1: Some model terms could not be found in model data. You probably need to load the data into the environment.
2: Some model terms could not be found in model data. You probably need to load the data into the environment.
The data is already loaded. I tried writing the terms without the scale() wrapper, and also changed the model so that its terms matched, but I still get this error message. I am also open to plotting this interaction with a different package.
My model str(model) is:
>str(model)
List of 20
$ Sol : 'mcmc' num [1:6600, 1:4] -0.814 1.215 -2.119 -0.125 -1.648 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:4] "(Intercept)" "scale(weight)" "scale(littersize)" "scale(weight):scale(littersize)"
..- attr(*, "mcpar")= num [1:3] 7e+04 4e+05 5e+01
$ Lambda : NULL
$ VCV : 'mcmc' num [1:6600, 1:3] 1.094 0.693 1.58 0.645 1.161 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:3] "phylo." "haplo." "units"
..- attr(*, "mcpar")= num [1:3] 7e+04 4e+05 5e+01
$ CP : NULL
$ Liab : NULL
$ Fixed :List of 3
..$ formula:Class 'formula' language scale(lifespan) ~ scale(weight) * scale(littersize)
.. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
..$ nfl : int 4
..$ nll : num 0
$ Random :List of 5
..$ formula:Class 'formula' language ~idv(phylo) + idv(haplo)
.. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
..$ nfl : num [1:2] 1 1
..$ nrl : int [1:2] 92 92
..$ nat : num [1:2] 0 0
..$ nrt : int [1:2] 1 1
$ Residual :List of 6
..$ formula :Class 'formula' language ~units
.. .. ..- attr(*, ".Environment")=<environment: 0x0000025ba05f8938>
..$ nfl : num 1
..$ nrl : int 92
..$ nrt : int 1
..$ family : chr "gaussian"
..$ original.family: chr "gaussian"
$ Deviance : 'mcmc' num [1:6600] -262.6 -137.3 -203.6 -83.6 -29.1 ...
..- attr(*, "mcpar")= num [1:3] 7e+04 4e+05 5e+01
$ DIC : num -158
$ X :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
.. ..# i : int [1:368] 0 1 2 3 4 5 6 7 8 9 ...
.. ..# p : int [1:5] 0 92 184 276 368
.. ..# Dim : int [1:2] 92 4
.. ..# Dimnames:List of 2
.. .. ..$ : chr [1:92] "1.1" "2.1" "3.1" "4.1" ...
.. .. ..$ : chr [1:4] "(Intercept)" "scale(weight)" "scale(littersize)" "scale(weight):scale(littersize)"
.. ..# x : num [1:368] 1 1 1 1 1 1 1 1 1 1 ...
.. ..# factors : list()
$ Z :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
.. ..# i : int [1:16928] 0 1 2 3 4 5 6 7 8 9 ...
.. ..# p : int [1:185] 0 92 184 276 368 460 552 644 736 828 ...
.. ..# Dim : int [1:2] 92 184
.. ..# Dimnames:List of 2
.. .. ..$ : NULL
.. .. ..$ : chr [1:184] "phylo1.NA.1" "phylo2.NA.1" "phylo3.NA.1" "phylo4.NA.1" ...
.. ..# x : num [1:16928] 0.4726 0.0869 0.1053 0.087 0.1349 ...
.. ..# factors : list()
$ ZR :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
.. ..# i : int [1:92] 0 1 2 3 4 5 6 7 8 9 ...
.. ..# p : int [1:93] 0 1 2 3 4 5 6 7 8 9 ...
.. ..# Dim : int [1:2] 92 92
.. ..# Dimnames:List of 2
.. .. ..$ : NULL
.. .. ..$ : chr [1:92] "units.1" "units.2" "units.3" "units.4" ...
.. ..# x : num [1:92] 1 1 1 1 1 1 1 1 1 1 ...
.. ..# factors : list()
$ XL : NULL
$ ginverse : NULL
$ error.term : int [1:92] 1 1 1 1 1 1 1 1 1 1 ...
$ family : chr [1:92] "gaussian" "gaussian" "gaussian" "gaussian" ...
$ Tune : num [1, 1] 1
..- attr(*, "dimnames")=List of 2
.. ..$ : chr "1"
.. ..$ : chr "1"
$ meta : logi FALSE
$ y.additional: num [1:92, 1:2] 0 0 0 0 0 0 0 0 0 0 ...
- attr(*, "class")= chr "MCMCglmm"
Thank you.
Try scaling your variables before fitting the model, i.e.
df$lifespan <- as.vector(scale(df$lifespan))
Or better, use effectsize::standardize(), which does not create a matrix for a one-dimensional vector when scaling your variables:
df <- effectsize::standardize(df, select = c("lifespan", "weight", "littersize"))
Then you can call your model like this:
model <- MCMCglmm(lifespan ~ weight * littersize,
random=~idv(DNA1)+idv(DNA2),
data=df,
family="gaussian",
prior=prior1,
thin=50,
burnin=5000,
nitt=50000,
verbose=F)
Does this work?
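To see why the pre-scaling step matters, here is a small base R illustration with a made-up vector x: scale() returns a one-column matrix with extra attributes, whereas wrapping it in as.vector() gives back a plain numeric vector.

```r
x <- c(2, 4, 6, 8)           # made-up values, for illustration only

scaled <- scale(x)           # centred and divided by the standard deviation
is.matrix(scaled)            # a one-column matrix, not a plain vector
dim(scaled)
names(attributes(scaled))    # includes the centring and scaling constants

scaled_vec <- as.vector(scaled)
is.matrix(scaled_vec)        # the matrix structure is gone
attributes(scaled_vec)       # and so are the extra attributes
```

Storing the plain vector in the data frame means the model formula can refer to the variable by its bare name instead of a scale() call.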
Melt not working when upgrading from Reshape to Reshape 2
I have a large list of values. Here is the summary (lots of columns):
List of 46
$ Date: Date[1:9], format: "2011-03-04" ...
$ 1 : num [1:9] 20278 19493 20587 24679 55708 ...
$ 2 : num [1:9] 24029 25317 25103 28871 79423 ...
$ 3 : num [1:9] 6657 7025 6603 8105 17883 ...
$ 4 : num [1:9] 29684 27555 28956 31504 73638 ...
$ 5 : num [1:9] 9572 8759 9947 11173 22341 ...
$ 6 : num [1:9] 18935 20168 22963 24387 58640 ...
$ 7 : num [1:9] 8299 8297 10484 10211 19277 ...
$ 8 : num [1:9] 14365 13691 13906 17149 38364 ...
$ 9 : num [1:9] 10333 10899 9708 11297 24100 ...
$ 10 : num [1:9] 33647 33455 35327 49031 128927 ...
$ 11 : num [1:9] 15090 16105 16343 18624 53809 ...
$ 12 : num [1:9] 17971 16408 15911 18350 44048 ...
$ 13 : num [1:9] 36820 44024 52026 62491 142186 ...
$ 14 : num [1:9] 27036 33240 39248 53035 148606 ...
$ 15 : num [1:9] 11490 11704 12587 17840 50201 ...
$ 16 : num [1:9] 11016 11768 13711 13323 21258 ...
$ 17 : num [1:9] 19792 18734 20477 30433 66028 ...
$ 18 : num [1:9] 19920 20316 21285 29360 88008 ...
$ 19 : num [1:9] 17046 19281 19610 30376 80302 ...
$ 20 : num [1:9] 32886 38971 44672 53278 141423 ...
$ 21 : num [1:9] 11324 13211 13123 15510 32014 ...
$ 22 : num [1:9] 21416 23530 25978 37096 94035 ...
$ 23 : num [1:9] 29527 33310 32701 42628 112442 ...
$ 24 : num [1:9] 19479 19181 20525 25210 69559 ...
$ 25 : num [1:9] 20727 20620 22190 29052 59528 ...
$ 26 : num [1:9] 16056 15122 15240 17327 39292 ...
$ 27 : num [1:9] 19020 28919 29659 43806 94475 ...
$ 28 : num [1:9] 19041 15803 15940 20319 49065 ...
$ 29 : num [1:9] 15775 15080 17841 21492 49891 ...
$ 30 : num [1:9] 9554 10395 9605 11513 13558 ...
$ 31 : num [1:9] 15322 16603 16348 17228 32973 ...
$ 32 : num [1:9] 19752 21591 21272 24639 52204 ...
$ 33 : num [1:9] 2017 2109 1944 1899 2224 ...
$ 34 : num [1:9] 18797 18496 17514 20066 39702 ...
$ 35 : num [1:9] 14306 13489 14507 18560 51028 ...
$ 36 : num [1:9] 2247 2558 2232 2401 2931 ...
$ 37 : num [1:9] 10971 10779 10272 11788 17386 ...
$ 38 : num [1:9] 6241 6414 6024 6291 8257 ...
$ 39 : num [1:9] 16933 18888 20160 25847 60786 ...
$ 40 : num [1:9] 18254 17638 17956 20265 43778 ...
$ 41 : num [1:9] 18249 19955 20016 25647 53012 ...
$ 42 : num [1:9] 9917 10655 10194 10354 15472 ...
$ 43 : num [1:9] 6561 6903 6941 6174 14034 ...
$ 44 : num [1:9] 5857 5968 6283 7645 9861 ...
$ 45 : num [1:9] 17185 18197 19508 26187 67014 ...
- attr(*, "row.names")= int [1:9] 1 2 3 4 5 6 7 8 9
- attr(*, "idvars")= chr "Date"
- attr(*, "rdimnames")=List of 2
..$ :'data.frame': 9 obs. of 1 variable:
.. ..$ Date: Date[1:9], format: "2011-03-04" ...
..$ :'data.frame': 45 obs. of 1 variable:
.. ..$ Store: num [1:45] 1 2 3 4 5 6 7 8 9 10 ...
With the original reshape library I am able to melt it down without issue:
'data.frame': 405 obs. of 3 variables:
$ Date : Date, format: "2011-03-04" ...
$ value: num 20278 19493 20587 24679 55708 ...
$ Store: num 1 1 1 1 1 1 1 1 1 2 ...
However, when I try to use melt from Reshape2, I get the following warning and error:
attributes are not identical across measure variables; they will be dropped
Error: `by` must be supplied when `x` and `y` have no common variables.
What changed between versions? Any suggestions for fixing this? I'm stuck using reshape2 for this. Thanks!
I have created a multiply imputed (MI) data set with the mice package, using 7 imputations:
imputeddata <- mice(distress_tibmi, m=7)
the structure of my data is now:
..$ id : num [1:342] 4 8 10 11 23 32 40 47 48 56 ...
..$ diagnosis : Factor w/ 2 levels "psychosis","bpd": 1 1 1 1 1 1 1 1 1 1 ...
..$ gender : Factor w/ 2 levels "female","male": 1 2 2 2 2 1 1 1 1 1 ...
..$ distress.time : Factor w/ 2 levels "baseline","post": 1 1 1 1 1 1 1 1 1 1 ...
..$ distress.score: num [1:342] -2.436 -1.242 0.251 -1.54 0.549 ...
..$ depression : num [1:342] 0.332 0.542 1.172 -0.298 1.172 ...
..$ anxiety : num [1:342] -1.898 -0.687 0.87 -0.687 1.043 ...
..$ choice : num [1:342] 6.73 2.18 2 6.45 3.55 ...
$ imp :List of 8
..$ id :'data.frame': 0 obs. of 7 variables:
.. ..$ 1: logi(0)
.. ..$ 2: logi(0)
.. ..$ 3: logi(0)
.. ..$ 4: logi(0)
.. ..$ 5: logi(0)
.. ..$ 6: logi(0)
.. ..$ 7: logi(0)
..$ diagnosis :'data.frame': 0 obs. of 7 variables:
.. ..$ 1: logi(0)
.. ..$ 2: logi(0)
.. ..$ 3: logi(0)
.. ..$ 4: logi(0)
.. ..$ 5: logi(0)
.. ..$ 6: logi(0)
.. ..$ 7: logi(0)
..$ gender :'data.frame': 0 obs. of 7 variables:
.. ..$ 1: logi(0)
.. ..$ 2: logi(0)
.. ..$ 3: logi(0)
.. ..$ 4: logi(0)
.. ..$ 5: logi(0)
.. ..$ 6: logi(0)
.. ..$ 7: logi(0)
..$ distress.time :'data.frame': 0 obs. of 7 variables:
.. ..$ 1: logi(0)
.. ..$ 2: logi(0)
.. ..$ 3: logi(0)
.. ..$ 4: logi(0)
.. ..$ 5: logi(0)
.. ..$ 6: logi(0)
.. ..$ 7: logi(0)
..$ distress.score:'data.frame': 59 obs. of 7 variables:
.. ..$ 1: num [1:59] -0.6808 -0.6448 -1.658 -0.0293 -0.3463 ...
.. ..$ 2: num [1:59] 1.2736 0.2507 -0.0478 -0.6448 1.2736 ...
.. ..$ 3: num [1:59] -0.681 0.848 -1.658 1.274 0.251 ...
.. ..$ 4: num [1:59] -1.3322 -0.0478 -0.6808 -0.355 -2.4358 ...
.. ..$ 5: num [1:59] -1.3322 -0.355 -4.8239 -0.6448 -0.0293 ...
.. ..$ 6: num [1:59] -1.3322 0.5493 -0.0293 -2.6352 0.8478 ...
.. ..$ 7: num [1:59] 0.5493 0.2507 1.1463 -0.0478 1.2736 ...
..$ depression :'data.frame': 24 obs. of 7 variables:
.. ..$ 1: num [1:24] -0.0882 -0.5084 -1.2966 0.542 -2.1891 ...
.. ..$ 2: num [1:24] 0.332 0.255 1.592 0.752 0.945 ...
.. ..$ 3: num [1:24] -2.159 0.332 -0.262 0.962 1.382 ...
.. ..$ 4: num [1:24] -0.2621 -0.0897 -1.7689 1.1172 0.7724 ...
.. ..$ 5: num [1:24] 0.122 -2.159 -2.399 1.462 -2.189 ...
.. ..$ 6: num [1:24] -0.298 -0.434 -0.607 1.172 0.962 ...
.. ..$ 7: num [1:24] 0.6 1.29 1.635 0.542 0.428 ...
..$ anxiety :'data.frame': 10 obs. of 7 variables:
.. ..$ 1: num [1:10] 0.909 -1.379 1.389 -1.268 -0.598 ...
.. ..$ 2: num [1:10] 1.0433 -1.3789 -0.0955 -0.7655 -0.598 ...
.. ..$ 3: num [1:10] 1.0771 -1.8979 -0.0955 -0.5138 0.0052 ...
.. ..$ 4: num [1:10] -0.598 -1.603 0.9095 -2.608 -0.0955 ...
.. ..$ 5: num [1:10] 0.742 0.2395 -1.7249 -2.1055 -0.0955 ...
.. ..$ 6: num [1:10] 1.412 -0.86 1.389 -2.608 0.575 ...
.. ..$ 7: num [1:10] 1.245 -1.033 0.909 0.909 -1.033 ...
..$ choice :'data.frame': 22 obs. of 7 variables:
.. ..$ 1: num [1:22] 4.55 3.91 7.09 4.27 3.55 ...
.. ..$ 2: num [1:22] 8.09 5.09 5.36 4.91 4.45 ...
.. ..$ 3: num [1:22] 4.27 7.09 3.91 3.91 7.09 ...
.. ..$ 4: num [1:22] 5.82 6.27 7 6.82 4.73 ...
.. ..$ 5: num [1:22] 6.18 5.36 5.36 3.18 3.18 ...
.. ..$ 6: num [1:22] 6.18 6.73 4.73 4.73 5 ...
.. ..$ 7: num [1:22] 5.45 7.09 7.45 3.18 4.91 ...
$ m : num 7
$ where : logi [1:342, 1:8] FALSE FALSE FALSE FALSE FALSE FALSE ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:342] "1" "2" "3" "4" ...
.. ..$ : chr [1:8] "id" "diagnosis" "gender" "distress.time" ...
$ blocks :List of 8
..$ id : chr "id"
..$ diagnosis : chr "diagnosis"
..$ gender : chr "gender"
..$ distress.time : chr "distress.time"
..$ distress.score: chr "distress.score"
..$ depression : chr "depression"
..$ anxiety : chr "anxiety"
..$ choice : chr "choice"
..- attr(*, "calltype")= Named chr [1:8] "type" "type" "type" "type" ...
.. ..- attr(*, "names")= chr [1:8] "id" "diagnosis" "gender" "distress.time" ...
$ call : language mice(data = distress_tibmi, m = 7)
$ nmis : Named int [1:8] 0 0 0 0 59 24 10 22
..- attr(*, "names")= chr [1:8] "id" "diagnosis" "gender" "distress.time" ...
$ method : Named chr [1:8] "" "" "" "" ...
..- attr(*, "names")= chr [1:8] "id" "diagnosis" "gender" "distress.time" ...
$ predictorMatrix: num [1:8, 1:8] 0 1 1 1 1 1 1 1 1 0 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:8] "id" "diagnosis" "gender" "distress.time" ...
.. ..$ : chr [1:8] "id" "diagnosis" "gender" "distress.time" ...
$ visitSequence : chr [1:8] "id" "diagnosis" "gender" "distress.time" ...
$ formulas :List of 8
..$ id :Class 'formula' language id ~ 0 + diagnosis + gender + distress.time + distress.score + depression + ...
.. .. ..- attr(*, ".Environment")=<environment: 0x7ff907cd9d00>
..$ diagnosis :Class 'formula' language diagnosis ~ 0 + id + gender + distress.time + distress.score + depression + ...
.. .. ..- attr(*, ".Environment")=<environment: 0x7ff907cd9d00>
..$ gender :Class 'formula' language gender ~ 0 + id + diagnosis + distress.time + distress.score + depression + ...
.. .. ..- attr(*, ".Environment")=<environment: 0x7ff907cd9d00>
..$ distress.time :Class 'formula' language distress.time ~ 0 + id + diagnosis + gender + distress.score + depression + ...
.. .. ..- attr(*, ".Environment")=<environment: 0x7ff907cd9d00>
..$ distress.score:Class 'formula' language distress.score ~ 0 + id + diagnosis + gender + distress.time + depression + ...
.. .. ..- attr(*, ".Environment")=<environment: 0x7ff907cd9d00>
..$ depression :Class 'formula' language depression ~ 0 + id + diagnosis + gender + distress.time + distress.score + ...
.. .. ..- attr(*, ".Environment")=<environment: 0x7ff907cd9d00>
..$ anxiety :Class 'formula' language anxiety ~ 0 + id + diagnosis + gender + distress.time + distress.score + ...
.. .. ..- attr(*, ".Environment")=<environment: 0x7ff907cd9d00>
..$ choice :Class 'formula' language choice ~ 0 + id + diagnosis + gender + distress.time + distress.score + ...
.. .. ..- attr(*, ".Environment")=<environment: 0x7ff907cd9d00>
$ post : Named chr [1:8] "" "" "" "" ...
..- attr(*, "names")= chr [1:8] "id" "diagnosis" "gender" "distress.time" ...
$ blots :List of 8
..$ id : list()
..$ diagnosis : list()
..$ gender : list()
..$ distress.time : list()
..$ distress.score: list()
..$ depression : list()
..$ anxiety : list()
..$ choice : list()
$ seed : logi NA
$ iteration : num 5
$ lastSeedValue : int [1:626] 10403 331 -1243825859 461242975 2057104913 -837414599 -54045022 1529270132 -105270003 -1459771035 ...
$ chainMean : num [1:8, 1:5, 1:7] NaN NaN NaN NaN -0.727 ...
..- attr(*, "dimnames")=List of 3
.. ..$ : chr [1:8] "id" "diagnosis" "gender" "distress.time" ...
.. ..$ : chr [1:5] "1" "2" "3" "4" ...
.. ..$ : chr [1:7] "Chain 1" "Chain 2" "Chain 3" "Chain 4" ...
$ chainVar : num [1:8, 1:5, 1:7] NA NA NA NA 2.26 ...
..- attr(*, "dimnames")=List of 3
.. ..$ : chr [1:8] "id" "diagnosis" "gender" "distress.time" ...
.. ..$ : chr [1:5] "1" "2" "3" "4" ...
.. ..$ : chr [1:7] "Chain 1" "Chain 2" "Chain 3" "Chain 4" ...
$ loggedEvents : NULL
$ version :Classes 'package_version', 'numeric_version' hidden list of 1
..$ : int [1:3] 3 9 0
$ date : Date[1:1], format: ...
- attr(*, "class")= chr "mids"
id diagnosis gender
Min. : 1.00 psychosis:250 female:196
1st Qu.: 76.75 bpd : 92 male :146
Median :198.00
Mean :215.66
3rd Qu.:337.00
Max. :514.00
distress.time distress.score depression
baseline:171 Min. :-4.8239 Min. :-2.39920
post :171 1st Qu.:-0.6808 1st Qu.:-0.76410
Median :-0.0293 Median : 0.08280
Mean :-0.3083 Mean :-0.06085
3rd Qu.: 0.6221 3rd Qu.: 0.77240
Max. : 1.2736 Max. : 1.80690
NA's :59 NA's :24
anxiety choice
Min. :-2.6080 Min. :0.0909
1st Qu.:-0.9330 1st Qu.:2.4545
Median :-0.0955 Median :4.0454
Mean :-0.1397 Mean :3.8903
3rd Qu.: 0.8702 3rd Qu.:5.1136
Max. : 1.7471 Max. :8.0909
NA's :10 NA's :22
1 2 3 4 5 6 7
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
21 -0.6808 1.2736 -0.6808 -1.3322 -1.3322 -1.3322 0.5493
34 -0.6448 0.2507 0.8478 -0.0478 -0.3550 0.5493 0.2507
48 -1.6580 -0.0478 -1.6580 -0.6808 -4.8239 -0.0293 1.1463
141 -0.0293 -0.6448 1.2736 -0.3550 -0.6448 -2.6352 -0.0478
143 -0.3463 1.2736 0.2507 -2.4358 -0.0293 0.8478 1.2736
180 1.1463 -1.0065 -2.3094 -3.6124 -0.6448 -1.5403 -1.0065
181 -0.0293 -0.6808 -0.6808 -3.9381 -0.3463 -1.3322 0.2964
182 1.2736 -0.3463 0.9479 -0.0478 0.9479 -0.3463 1.1463
197 -0.3550 -0.0293 -0.6808 -0.3550 -1.3322 -4.8239 -0.6448
208 0.6221 0.2507 -0.6808 -0.3550 -0.6448 0.6221 -0.6448
1-10 of 59 rows
I created a lm with the imputed data set and summarised it using pool()
distressmodel <- with(data = imputeddata, exp = lm(distress.score ~ distress.time * diagnosis))
summary(mice::pool(distressmodel), conf.int = TRUE, conf.level = 0.95 )
However, I now want the type 3 F values for the model, but this code is not working:
car::Anova(mice::pool(distressmodel), type = 3)
it produces this error message:
Error in UseMethod("vcov") : no applicable method for 'vcov' applied to an object of class "c('mipo', 'data.frame')"
I also want to get the marginal effects of the model (eg see effects from only one level of the grouping variable which is diagnosis) which I have done successfully in my complete case analysis, but this code:
summary(margins(distressmodel, data = subset(imputeddata, diagnosis == "bpd", type = "response")))
produces this error
Error in subset_datlist(datlist = x, subset = subset, select = select, : object 'diagnosis' not found
Does anyone have any advice on alterations to the code, or a way to get car::Anova() or margins() to work with an MI data set (preferably with the ability to pool the results)?
The with(data, exp) procedure can be used to apply statistical test/models to multiple imputation outputs (mipo) only if they allow extracting the estimates with the coef method and a variance-covariance matrix with vcov. The latter seems not to work for the function car::Anova that you used.
Fortunately, there is the miceadds package, which offers procedures to conduct and pool additional statistical tests. miceadds::mi.anova seems to do exactly what you want:
miceadds::mi.anova(imputeddata, distress.score ~ distress.time * diagnosis, type=3)
I am not sure, however, about the marginal effects. In general, you can do a bit more coding and apply any statistical procedure to each imputed sample separately. Then you can pool it using the pool.scalar function. This method also gives you within-imputation, between-imputation, and total variance estimates for your pooled statistic. (And with that you can conduct a basic t-test for difference from 0, if you want.)
This approach relies on normal distribution of statistics – or on them being transformable to a normally distributed metric. (Stef van Buuren gives a list of statistics that can easily be transformed, pooled, and back-transformed here, see Table 5.2) So it should be possible for the marginal means you want, right?
I do not know the margins function you use (what package is it from?). But, if you want to get the marginal means and pool them yourself, this is the approach:
# transform your mids into a long-format data frame
imputed_l <- mice::complete(imputeddata, action = "long")
nimp <- imputeddata$m  # number of imputations, stored in the mids object
# create vectors to contain the marginal means and their SEs from all imputations
mm_all <- vector("numeric", nimp)
mmse_all <- mm_all
# get marginal means and SEs for all imputations
# (the two Expression_* calls are placeholders for your own code)
for (i in seq_len(nimp)) {
  mm_all[i] <- Expression_producing_marginal_mean(..., data = subset(imputed_l, .imp == i))
  mmse_all[i] <- Expression_producing_SE(..., data = subset(imputed_l, .imp == i))
}
# pool them (the U argument should be variances, so square the SEs)
mm_pool <- mice::pool.scalar(Q = mm_all, U = mmse_all^2, n = nrow(imputed_l) / nimp)
mm_pool$qbar  # marginal mean aggregated across imputations
sqrt(mm_pool$t)  # SE of marginal mean (based on within- and between-imputation variance)
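For intuition about what the pooling step returns, the sketch below applies Rubin's rules by hand in base R. The estimates and standard errors are made-up numbers for illustration only, not results from the data above; the names qbar, ubar, b and t_var mirror the components described for the pooled object.

```r
# Made-up estimates and standard errors from m = 7 imputations (illustration only).
q  <- c(0.52, 0.48, 0.55, 0.50, 0.47, 0.53, 0.51)
se <- c(0.10, 0.11, 0.09, 0.10, 0.12, 0.10, 0.11)
m  <- length(q)

qbar  <- mean(q)                 # pooled estimate: mean of the m estimates
ubar  <- mean(se^2)              # within-imputation variance: mean squared SE
b     <- var(q)                  # between-imputation variance of the estimates
t_var <- ubar + (1 + 1 / m) * b  # total variance

c(estimate = qbar, se = sqrt(t_var))
```

The total variance is always at least as large as the within-imputation variance, which is how the extra uncertainty due to the missing data enters the pooled standard error.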
I have a nested list density_subset_list
It contains 6 lists, which each contain another 3 lists of density data. e.g.
dsl <- A(all_density, p1_density,p2_density), B(all_density, p1_density,p2_density)
etc.
I would like the overall y range.
Here is my attempt.
for (i in 1:length(INTlist)){
y <- unlist(lapply(density_subset_list[[i]], function(d) range(d$y)))
yall <- c(y, yall)
}
range(yall)
It doesn't seem to be working.
Any help is appreciated
Thanks
str(density_subset_list)
List of 6
$ STRexp :List of 3
..$ all:List of 7
.. ..$ x : num [1:512] -0.712 -0.708 -0.705 -0.702 -0.698 ...
.. ..$ y : num [1:512] 2.17e-14 3.64e-14 5.99e-14 9.64e-14 1.62e-13 ...
.. ..$ bw : num 0.047
.. ..$ n : int 1127
.. ..$ call : language density.default(x = x$corr, from = min(Sa14_scoreCorr$corr), to = max(Sa14_scoreCorr$corr), na.rm = T)
.. ..$ data.name: chr "x$corr"
.. ..$ has.na : logi FALSE
.. ..- attr(*, "class")= chr "density"
..$ Kan:List of 7
.. ..$ x : num [1:512] -0.712 -0.708 -0.705 -0.702 -0.698 ...
.. ..$ y : num [1:512] 2.60e-08 3.42e-08 4.50e-08 5.88e-08 7.62e-08 ...
.. ..$ bw : num 0.0649
.. ..$ n : int 287
.. ..$ call : language density.default(x = x$corr, from = min(Sa14_scoreCorr$corr), to = max(Sa14_scoreCorr$corr), na.rm = T)
.. ..$ data.name: chr "x$corr"
.. ..$ has.na : logi FALSE
.. ..- attr(*, "class")= chr "density"
..$ Cm :List of 7
.. ..$ x : num [1:512] -0.712 -0.708 -0.705 -0.702 -0.698 ...
.. ..$ y : num [1:512] 3.88e-08 4.79e-08 5.94e-08 7.38e-08 9.10e-08 ...
You're very close, but a for loop isn't needed:
set.seed(100)
dat <- list(STRexp=list(list(), list(), list()),
all=list(y=sample(100, 50)),
Kan=list(y=sample(10,3)),
Cm=list(y=sample(1000, 100)))
str(dat)
## List of 4
## $ STRexp:List of 3
## ..$ : list()
## ..$ : list()
## ..$ : list()
## $ all :List of 1
## ..$ y: int [1:50] 31 26 55 6 45 46 77 35 51 16 ...
## $ Kan :List of 1
## ..$ y: int [1:3] 4 2 9
## $ Cm :List of 1
## ..$ y: int [1:100] 275 591 253 124 229 595 211 461 642 952 ...
# this will get the range of each "y" then get the overall range
range(unlist(lapply(names(dat)[-1], function(x) range(dat[[x]]$y))))
## [1] 2 991
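The same idea extends to the two-level nesting described in the question (groups that each hold several density objects). This is a sketch on simulated data: the group name INTexp, the helper make_group and the sample sizes are assumptions made for illustration.

```r
set.seed(1)

# Hypothetical stand-in: 2 groups, each holding 3 density objects.
make_group <- function() {
  list(all = density(rnorm(100)),
       Kan = density(rnorm(30)),
       Cm  = density(rnorm(50)))
}
dsl <- list(STRexp = make_group(), INTexp = make_group())

# Inner lapply: range of y for each density in a group.
# Outer lapply: repeat for every group; unlist() flattens all levels.
y_range <- range(unlist(lapply(dsl, function(grp) {
  lapply(grp, function(d) range(d$y))
})))
y_range
```

Because unlist() flattens recursively, no accumulator variable is needed, which avoids the uninitialised yall in the original loop.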