I am using the hudsonia dataset from popbio for a reprex. Assume I have 2 lists of projection matrices for which I want to calculate the stochastic growth rate sensitivity.
What I want R to do is to apply stoch.sens to each nested list and give me a new nested list of sensitivities and elasticities for each list of projection matrices.
library(popbio)
data("hudsonia")
library(purrr)
library(tidyverse)
hudson <- list(hudsonia[1:2], hudsonia[3:4])
I set.seed(500), but it doesn't help. These three versions all do the job, as demonstrated at https://rdrr.io/cran/popbio/man/stoch.sens.html, but they give different results from one another, even though each version's own output looks internally consistent. How do I know which one is the most accurate?
scenario1_stoch.sens1 <- hudson %>%
map(., ~{stoch.sens(.)})
scenario1_stoch.sens1[[1]]
$sensitivities
seed seedlings tiny small medium large
[1,] 0.03305056 2.771649e-05 0.0001287404 0.0001431969 0.0001538398 0.0001839186
[2,] 13.00203595 1.134659e-02 0.0452484230 0.0484751986 0.0478620603 0.0884144846
[3,] 26.78972167 2.334817e-02 0.0933100332 0.1000829077 0.0989421815 0.1816309641
[4,] 46.35258286 4.059941e-02 0.1627644750 0.1744453985 0.1728116142 0.3133614288
[5,] 60.17717120 5.218833e-02 0.2105849414 0.2261332506 0.2239088481 0.4059634764
[6,] 73.30117860 6.334906e-02 0.2570030445 0.2762777810 0.2737179892 0.4930947167
$elasticities
seed seedlings tiny small medium large
[1,] 0.016508753 0.00000000 0.0005893993 0.001738768 0.0034331970 0.009230784
[2,] 0.005200814 0.00000000 0.0001764688 0.000494447 0.0008998067 0.003739933
[3,] 0.000000000 0.01114408 0.0552075615 0.017941807 0.0063520008 0.000000000
[4,] 0.000000000 0.00000000 0.0359152049 0.077326821 0.0301908444 0.026247911
[5,] 0.000000000 0.00000000 0.0000000000 0.066648140 0.1172797856 0.034134910
[6,] 0.000000000 0.00000000 0.0000000000 0.007905494 0.0624854738 0.409207593
scenario1_stoch.sens2 <- map(hudson, ~{stoch.sens(.x)})
scenario1_stoch.sens2[[1]]
$sensitivities
seed seedlings tiny small medium large
[1,] 0.03403483 2.845839e-05 0.0001207616 0.0001396471 0.0001501461 0.0001875122
[2,] 13.00850282 1.128956e-02 0.0517530409 0.0564579344 0.0569689775 0.0809149273
[3,] 26.69896759 2.310674e-02 0.1057631968 0.1155037098 0.1166128499 0.1659382881
[4,] 45.93156321 4.011413e-02 0.1806746089 0.1973586368 0.1990428759 0.2871106860
[5,] 57.80866176 5.033736e-02 0.2282652316 0.2491555124 0.2513725389 0.3608284037
[6,] 69.50472834 6.057715e-02 0.2735822612 0.2987184500 0.3012894262 0.4344459633
$elasticities
seed seedlings tiny small medium large
[1,] 0.017000397 0.00000000 0.0005528708 0.0016956653 0.003350765 0.009411144
[2,] 0.005203401 0.00000000 0.0002018369 0.0005758709 0.001071017 0.003422701
[3,] 0.000000000 0.01102885 0.0632803733 0.0202711514 0.007591172 0.000000000
[4,] 0.000000000 0.00000000 0.0392556049 0.0875884490 0.034936656 0.029256857
[5,] 0.000000000 0.00000000 0.0000000000 0.0739450382 0.132775014 0.037177426
[6,] 0.000000000 0.00000000 0.0000000000 0.0093164903 0.066822493 0.344268758
scenario1_stoch.sens3 <- stoch.sens(hudsonia[1:2])
scenario1_stoch.sens3
$sensitivities
seed seedlings tiny small medium large
[1,] 0.03084729 0.0000258864 0.0001033805 0.0001115822 0.0001068774 0.0002138166
[2,] 12.76976503 0.0111371273 0.0541500215 0.0584835187 0.0602080777 0.0763170039
[3,] 26.19161727 0.0227584944 0.1105507047 0.1193949501 0.1227492926 0.1570462498
[4,] 45.34229432 0.0395745196 0.1895626453 0.2051304050 0.2107298137 0.2728195593
[5,] 56.73178410 0.0493258466 0.2361536206 0.2552820940 0.2616984321 0.3432671780
[6,] 68.02535325 0.0591084382 0.2809327390 0.3038881720 0.3110907601 0.4133270712
$elasticities
seed seedlings tiny small medium large
[1,] 0.015408220 0.00000000 0.0004732968 0.0013548868 0.002385150 0.010731348
[2,] 0.005107906 0.00000000 0.0002111851 0.0005965319 0.001131912 0.003228209
[3,] 0.000000000 0.01086263 0.0670680736 0.0204936687 0.008178951 0.000000000
[4,] 0.000000000 0.00000000 0.0401334404 0.0911770528 0.037510314 0.029048719
[5,] 0.000000000 0.00000000 0.0000000000 0.0762261032 0.139968147 0.036758280
[6,] 0.000000000 0.00000000 0.0000000000 0.0100422826 0.066522076 0.325381617
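If the three variants are written correctly, the differences above most likely come from the random sequence of matrices that stoch.sens() simulates internally, so the seed has to be reset immediately before each call rather than once near the top of the script. A minimal reproducibility check (a sketch; run1 and run2 are hypothetical names, and it assumes the variation is purely RNG-driven):
# Reset the seed directly before each run, then compare the two runs
set.seed(500)
run1 <- map(hudson, ~ stoch.sens(.x))
set.seed(500)
run2 <- map(hudson, ~ stoch.sens(.x))
all.equal(run1, run2)  # TRUE if the only difference is the simulated matrix sequence
If the two runs agree, none of the three variants is more accurate than the others; they simply sampled different stochastic sequences. Averaging over several seeds, or lengthening the simulation via the tlimit argument if your popbio version exposes it, is the usual way to tighten the estimate.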
Goal
Create a LASSO model using MLR3
Use nested CV, with inner CV or bootstraps for hyperparameter (lambda) determination and outer CV for model performance evaluation (instead of doing just one train-test split), and find the standard deviation of the LASSO regression coefficients across the different model instances.
Do a prediction on a testing data set not available yet.
Issues
I am unsure whether the nested CV approach as described is implemented correctly in my code below.
I am unsure whether alpha is set correctly (it should be fixed at alpha = 1 only).
I do not know how to access the LASSO lambda coefficients when using resampling in mlr3 (importance() in mlr3learners does not yet support LASSO).
I don't know how to apply a resulting model to the testing set, which is not available yet, in mlr3.
Code
library(readr)
library(mlr3)
library(mlr3learners)
library(mlr3pipelines)
library(reprex)
# Data ------
# Prepared according to the Blog post by Julia Silge
# https://juliasilge.com/blog/lasso-the-office/
urlfile = 'https://raw.githubusercontent.com/shudras/office_data/master/office_data.csv'
data = read_csv(url(urlfile))[-1]
#> Warning: Missing column names filled in: 'X1' [1]
#> Parsed with column specification:
#> cols(
#> .default = col_double()
#> )
#> See spec(...) for full column specifications.
# Add a factor to data
data$factor = as.factor(c(rep('a', 20), rep('b', 50), rep('c', 30), rep('a', 6), rep('c', 10), rep('b', 20)))
# Task creation
task =
TaskRegr$new(
id = 'office',
backend = data,
target = 'imdb_rating'
)
# Model creation
graph =
po('scale') %>>%
po('encode') %>>% # make factors numeric
# How to normalize predictors, leaving target unchanged?
lrn('regr.cv_glmnet', # 10-fold CV for inner loop. Is alpha permanently set to 1?
id = 'rp', alpha = 1, family = 'gaussian'
)
graph_learner = GraphLearner$new(graph)
# Execution (actual modeling)
result =
resample(
task,
graph_learner,
rsmp('cv', folds = 5) # 5-fold for outer CV
)
#> INFO [13:21:53.485] Applying learner 'scale.encode.regr.cv_glmnet' on task 'office' (iter 3/5)
#> INFO [13:21:54.937] Applying learner 'scale.encode.regr.cv_glmnet' on task 'office' (iter 2/5)
#> INFO [13:21:55.242] Applying learner 'scale.encode.regr.cv_glmnet' on task 'office' (iter 1/5)
#> INFO [13:21:55.500] Applying learner 'scale.encode.regr.cv_glmnet' on task 'office' (iter 4/5)
#> INFO [13:21:55.831] Applying learner 'scale.encode.regr.cv_glmnet' on task 'office' (iter 5/5)
# How to access results, i.e. lambda coefficients,
# and compare them (why is there no variable importance for glmnet?)
# Access prediction
result$prediction()
#> <PredictionRegr> for 136 observations:
#> row_id truth response
#> 2 8.3 8.373798
#> 6 8.7 8.455151
#> 9 8.4 8.358964
#> ---
#> 116 9.7 8.457607
#> 119 8.2 8.130352
#> 128 7.8 8.224150
Created on 2020-06-11 by the reprex package (v0.3.0)
Edit 1 (LASSO coefficients)
According to a comment from missuse, LASSO coefficients can be accessed through result$data$learner[[1]]$model$rp$model$glmnet.fit$beta. Additionally, I found that store_models = TRUE needs to be set in resample() to store the models and, in turn, access the coefficients.
Despite setting alpha = 1, I obtained multiple sets of LASSO coefficients. I would like the 'best' LASSO coefficients (stemming e.g. from lambda = lambda.min or lambda.1se). What do the different s1, s2, s3, ... mean? Are these different lambdas?
The different coefficients indeed seem to stem from different lambda values, denoted s1, s2, s3, ... (the number is the index). I suppose the 'best' coefficients can be accessed by first finding the index of the 'best' lambda, index_lambda.1se = which(ft$lambda == ft$lambda.1se)[[1]]; index_lambda.min = which(ft$lambda == ft$lambda.min)[[1]], and then picking the corresponding column of coefficients. A more concise approach to finding the 'best' coefficients is given in the comments by missuse.
library(readr)
library(mlr3)
library(mlr3learners)
library(mlr3pipelines)
library(reprex)
urlfile = 'https://raw.githubusercontent.com/shudras/office_data/master/office_data.csv'
data = read_csv(url(urlfile))[-1]
# Add a factor to data
data$factor = as.factor(c(rep('a', 20), rep('b', 50), rep('c', 30), rep('a', 6), rep('c', 10), rep('b', 20)))
# Task creation
task =
TaskRegr$new(
id = 'office',
backend = data,
target = 'imdb_rating'
)
# Model creation
graph =
po('scale') %>>%
po('encode') %>>% # make factors numeric
# How to normalize predictors, leaving target unchanged?
lrn('regr.cv_glmnet', # 10-fold CV for inner loop. Is alpha permanently set to 1?
id = 'rp', alpha = 1, family = 'gaussian'
)
graph$keep_results = TRUE
graph_learner = GraphLearner$new(graph)
# Execution (actual modeling)
result =
resample(
task,
graph_learner,
rsmp('cv', folds = 5), # 5-fold for outer CV
store_models = TRUE # Store the models, needed to access the coefficients
)
# LASSO coefficients
# Why more than one coefficient per predictor?
# What are s1, s2 etc.? Shouldn't 'lrn' fix alpha = 1?
# How to obtain the best coefficients (for lambda.1se or lambda.min) if there are multiple?
as.matrix(result$data$learner[[1]]$model$rp$model$glmnet.fit$beta)
#> s0 s1 s2 s3 s4 s5
#> andy 0 0.000000000 0.00000000 0.00000000 0.000000000 0.00000000
#> angela 0 0.000000000 0.00000000 0.00000000 0.000000000 0.00000000
#> b_j_novak 0 0.000000000 0.00000000 0.00000000 0.000000000 0.00000000
#> brent_forrester 0 0.000000000 0.00000000 0.00000000 0.000000000 0.00000000
#> darryl 0 0.000000000 0.00000000 0.00000000 0.000000000 0.00000000
#> dwight 0 0.000000000 0.00000000 0.00000000 0.000000000 0.00000000
#> episode 0 0.000000000 0.00000000 0.00000000 0.010297763 0.02170423
#> erin 0 0.000000000 0.00000000 0.00000000 0.000000000 0.00000000
#> gene_stupnitsky 0 0.000000000 0.00000000 0.00000000 0.000000000 0.00000000
#> greg_daniels 0 0.000000000 0.00000000 0.00000000 0.001845101 0.01309437
#> jan 0 0.000000000 0.00000000 0.00000000 0.005663699 0.01357832
#> jeffrey_blitz 0 0.000000000 0.00000000 0.00000000 0.000000000 0.00000000
#> jennifer_celotta 0 0.000000000 0.00000000 0.00000000 0.000000000 0.00000000
#> jim 0 0.006331732 0.01761548 0.02789682 0.036853510 0.04590513
#> justin_spitzer 0 0.000000000 0.00000000 0.00000000 0.000000000 0.00000000
#> [...]
#> s6 s7 s8 s9 s10
#> andy 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> angela 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> b_j_novak 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> brent_forrester 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> darryl 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> dwight 0.002554576 0.007006995 0.011336058 0.01526851 0.01887180
#> episode 0.031963475 0.040864492 0.047487987 0.05356482 0.05910066
#> erin 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> gene_stupnitsky 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> greg_daniels 0.023040791 0.031866343 0.040170917 0.04779004 0.05472702
#> jan 0.021030152 0.028094541 0.035062678 0.04143812 0.04725379
#> jeffrey_blitz 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> jennifer_celotta 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> jim 0.053013058 0.058503984 0.062897112 0.06683734 0.07041964
#> justin_spitzer 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> kelly 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> ken_kwapis 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> kevin 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> lee_eisenberg 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> michael 0.057190859 0.062963830 0.068766981 0.07394472 0.07865977
#> mindy_kaling 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> oscar 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> pam 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> paul_feig 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> paul_lieberstein 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> phyllis 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> randall_einhorn 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> ryan 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> season 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> toby 0.000000000 0.000000000 0.005637169 0.01202893 0.01785309
#> factor.a 0.000000000 -0.003390125 -0.022365768 -0.03947047 -0.05505681
#> factor.b 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> factor.c 0.000000000 0.000000000 0.000000000 0.00000000 0.00000000
#> s11 s12 s13 s14
#> andy 0.000000000 0.000000000 0.000000000 0.0000000000
#> angela 0.000000000 0.000000000 0.000000000 0.0000000000
#> b_j_novak 0.000000000 0.000000000 0.000000000 0.0000000000
#> brent_forrester 0.000000000 0.000000000 0.000000000 0.0000000000
#> darryl 0.000000000 0.000000000 0.000000000 0.0017042281
#> dwight 0.022170870 0.025326337 0.027880703 0.0303865693
#> episode 0.064126846 0.069018240 0.074399623 0.0794693480
#> [...]
Created on 2020-06-15 by the reprex package (v0.3.0)
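A more direct route than indexing into glmnet.fit$beta is to call coef() on the stored cv.glmnet fit with s = "lambda.min" or s = "lambda.1se". A sketch, assuming the object path shown in the question (store_models = TRUE, and result$data$learner[[i]]$model$rp$model being the underlying cv.glmnet object):
# Coefficients at the CV-selected lambda for the first outer fold
cvfit <- result$data$learner[[1]]$model$rp$model
coef(cvfit, s = "lambda.min")  # lambda minimising the inner-CV error
coef(cvfit, s = "lambda.1se")  # sparser fit within one SE of the minimum
cvfit$lambda.min; cvfit$lambda.1se  # the selected lambda values themselves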
Edit 2 (optional follow up question)
Nested CV provides an evaluation of the discrepancy amongst multiple models. The discrepancy can be expressed as an error (e.g. RMSE) obtained from the outer CV. While that error may be small, individual LASSO coefficients (the importance of predictors) from the models (instantiated by the outer CV) may vary considerably.
Does mlr3 provide functionality describing the consistency of the quantitative importance of predictor variables, i.e. the RMSE of LASSO coefficients amongst the models created by the outer CV? Or should a custom function be created, retrieving the LASSO coefficients using result$data$learner[[i]]$model$rp$model$glmnet.fit$beta (suggested by missuse) with i = 1, 2, 3, 4, 5 being the folds of the outer CV, and then taking the RMSE of the matching coefficients?
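As far as I know mlr3 has no built-in helper for this, so a small custom function seems the simplest route. A sketch, again assuming store_models = TRUE and the object path used above; it collects the lambda.1se coefficients from every outer fold and summarises their spread per predictor:
# One column of coefficients per outer fold (row names come from the first fold)
fold_coefs <- sapply(result$data$learner, function(lrn) {
  cvfit <- lrn$model$rp$model
  as.matrix(coef(cvfit, s = "lambda.1se"))[, 1]
})
# Spread of each coefficient across the outer-CV models
apply(fold_coefs, 1, sd)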
I have a list in the following format:
[[825]][[4]]
Each of the 4 inner list elements is an array of a different size and dimension:
[[1]]
[1] 0.02918644 0.03239657 0.03560670 0.03881683 0.04202696 0.04523709 0.04844722 0.05165735
[9] 0.05486748 0.05807761 0.06128774 0.06449787 0.06770800 0.07091813 0.07412827 0.07733840
[17] 0.08054853 0.08375866 0.08696879 0.09017892
[[2]]
[1] 0.7581078 0.7587820 0.7608009 0.7641538 0.7688234 0.7747857 0.7820113 0.7904655 0.8001093
[10] 0.8109003 0.8244816 0.8444896 0.8706241 0.9023530 0.9391094 0.9803280 1.0254709 1.0740433
[19] 1.1256013 1.1797536
[[3]]
[,1] [,2] [,3]
[1,] 0.4177711 0.34606863 2.361603e-01
[2,] 0.4345125 0.35491274 2.105747e-01
[3,] 0.4512540 0.36375685 1.849892e-01
[4,] 0.4679954 0.37260096 1.594036e-01
[5,] 0.4847369 0.38144507 1.338180e-01
[6,] 0.5014783 0.39028918 1.082325e-01
[7,] 0.5182198 0.39913329 8.264693e-02
[8,] 0.5349612 0.40797740 5.706137e-02
[9,] 0.5517027 0.41682150 3.147581e-02
[10,] 0.5684441 0.42566561 5.890257e-03
[11,] 0.6059978 0.39400216 0.000000e+00
[12,] 0.6497759 0.35022414 0.000000e+00
[13,] 0.6935539 0.30644612 0.000000e+00
[14,] 0.7373319 0.26266811 -2.408519e-18
[15,] 0.7811099 0.21889009 -6.394265e-19
[16,] 0.8248879 0.17511207 1.129666e-18
[17,] 0.8686659 0.13133405 2.898758e-18
[18,] 0.9124440 0.08755604 4.667850e-18
[19,] 0.9562220 0.04377802 6.436942e-18
[20,] 1.0000000 0.00000000 0.000000e+00
[[4]]
[,1]
[1,] 0.03849906
[2,] 0.04269549
[3,] 0.04680160
[4,] 0.05079714
[5,] 0.05466400
[6,] 0.05838658
[7,] 0.06195207
[8,] 0.06535055
[9,] 0.06857498
[10,] 0.07162115
[11,] 0.07433489
[12,] 0.07637498
[13,] 0.07776951
[14,] 0.07859245
[15,] 0.07893464
[16,] 0.07889032
[17,] 0.07854784
[18,] 0.07798443
[19,] 0.07726429
[20,] 0.07643877
I want to have 4 new lists, each with 825 elements:
[[4]][[825]]
For example, all the [[1]]'s, [[2]]'s etc. from the list of 825 should be combined.
What's the best way to do this? I've been trying to figure it out with some sort of apply call.
First create an example list of lists:
big.lst <- lapply(1:825, function(x) rep(list(rnorm(10)), 4))
#check lengths
length(big.lst)
#[1] 825
unique(lengths(big.lst))
#[1] 4
Then lapply a subset over the big list. I chose 1:4 to create four new groups, but you can generalize with 1:length(big.lst[[1]]) since each sublist has the same length:
newlst <- lapply(1:4, function(x) lapply(big.lst, '[[', x))
#verify answer
length(newlst)
#[1] 4
unique(lengths(newlst))
#[1] 825
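If purrr is an option, transpose() does the same regrouping in one call; it turns a list of 825 length-4 lists into a list of 4 length-825 lists (a sketch using the big.lst example above):
library(purrr)
newlst2 <- transpose(big.lst)
length(newlst2)
#[1] 4
unique(lengths(newlst2))
#[1] 825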
I have a matrix with 26 columns. The values in each row sum up to 1:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
[1,] 0.02105263 0.01052632 0.01052632 0.04210526 0.01052632 0.06315789 0.03157895 0.1789474 0.07368421 0.07368421 0.02105263
[2,] 0.00000000 0.01176471 0.01176471 0.00000000 0.01176471 0.18823529 0.09411765 0.1764706 0.15294118 0.07058824 0.01176471
[3,] 0.00000000 0.00000000 0.02941176 0.01470588 0.04411765 0.11764706 0.05882353 0.2058824 0.07352941 0.08823529 0.00000000
[,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22]
[1,] 0.04210526 0.04210526 0.05263158 0 0.03157895 0.02105263 0.00000000 0.04210526 0.01052632 0.05263158 0.02105263
[2,] 0.00000000 0.01176471 0.00000000 0 0.03529412 0.01176471 0.04705882 0.04705882 0.02352941 0.01176471 0.00000000
[3,] 0.02941176 0.02941176 0.02941176 0 0.05882353 0.01470588 0.02941176 0.02941176 0.02941176 0.01470588 0.00000000
[,23] [,24] [,25] [,26]
[1,] 0.06315789 0.03157895 0.03157895 0.02105263
[2,] 0.05882353 0.02352941 0.00000000 0.00000000
[3,] 0.02941176 0.01470588 0.00000000 0.05882353
I would like to alter the values to make up some new data. This would mean randomly changing every value in a row to a value within a range of +/- 5%, while the row still sums to 1.
So, for example, a value that is currently 0.18 should in the new data be somewhere between 0.171 and 0.189 (minus and plus 5%).
Alternatively, the new value could just be drawn from a normal distribution centred on the original value, as long as it does not differ too much from it, perhaps allowing somewhat more absolute change for large values like 0.18 than for smaller ones.
If the value is 0, it would be good to randomly decide whether it should stay at 0 or increase by somewhere between 5% and 10% (taking something like 0.0001 as the initial value).
Is there an easy way to do this?
Well, the first thing you want to do is be able to generate new numbers. That can be done with rnorm(), which you supply with a mean and standard deviation. The mean should be zero and the sd somewhere around 0.02 or so; that makes the vast majority of the generated offsets fall within 0.05 of the original number.
After that you want to re-scale back to a rowsum of 1, which is easily achieved by dividing all values by the sum of the whole row.
> (a <- 1:10)
[1] 1 2 3 4 5 6 7 8 9 10
> (a <- a / sum(a))
[1] 0.01818182 0.03636364 0.05454545 0.07272727 0.09090909 0.10909091 0.12727273 0.14545455 0.16363636 0.18181818
> (a <- a + rnorm(10, 0, 0.02))
[1] 0.01293189 0.06799608 0.03552480 0.08015437 0.07834294 0.07845255 0.11692691 0.13262836 0.15728399 0.16228330
> sum(a)
[1] 0.9225252
> sum(a / sum(a))
[1] 1
> a <- a / sum(a)
I'll leave it to you to figure out how to eliminate negative numbers and the 5% or 10% increase. But those are the tools you need.
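Putting that together for the whole matrix: perturb every entry with rnorm(), clamp anything that went negative to zero, and rescale each row so it sums to 1 again. A sketch (perturb_rows is a hypothetical helper; data is assumed to be the 26-column matrix from the question):
perturb_rows <- function(m, sd = 0.02) {
  out <- m + rnorm(length(m), mean = 0, sd = sd)  # jitter every entry
  out[out < 0] <- 0                               # no negative proportions
  out / rowSums(out)                              # every row sums to 1 again
}
new_data <- perturb_rows(data)
rowSums(new_data)  # all 1
The strict +/- 5% bound and the special handling of zeros from the question are left out here and would need the extra steps described above.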
Let your dataset be a matrix called data; then data * matrix(runif(prod(dim(data)), 0.95, 1.05), nrow = nrow(data)) will give you data in which every value has been changed by at most +/- 5%.
If you don't want negative values you can wrap it all in an abs() since if a value can shift by 5% and be negative, the absolute value will always be within 5% of the original value still.
If you want to start with no 0 values, then step one is data[data <= 0] <- 0.001.
I fit a model using the following:
library(splines)
mymodel <- glm(LS ~ bs(LA, df = 8) + bs(IN, df = 7),
               family = binomial, data = mydata, na.action = na.omit)
No problem, I have the model fit; now I am trying to extract the knot points used. I followed a post on extracting knot points using attr and str, but that was for a model that was just a spline. I think the knots are somewhere in the structure of the terms component.
When I call str(mymodel$terms) there is a ..- attr(*, "variables") entry. I am having trouble going further with attr, but I am relatively certain that this is basically what I need to do. Any guidance on getting the knots is appreciated.
You can use
eval(attr(mymodel$terms, "predvars"), mydata)
which evaluates the language object contained in the predvars attribute of the terms component of the fitted model (evaluating it within mydata so that LS, LA and IN can be found).
Here is an example with a silly fitted model
mod <- glm(rnorm(length(women$height)) ~ bs(women$height, df = 5))
from which we can evaluate the required part of the terms component of mod
> eval(attr(mod$terms, "predvars"))
[[1]]
[1] -1.20088330 -0.46267556 -0.04791518 -1.42748340 2.32896914 0.07858849
[7] 2.16635328 -0.78670562 -1.68737883 0.71389437 -0.64123154 -0.04891306
[13] -0.07260125 0.71263717 -2.63426761
[[2]]
1 2 3 4 5
[1,] 0.000000e+00 0.000000000 0.000000000 0.000000e+00 0.000000000
[2,] 4.534439e-01 0.059857872 0.001639942 0.000000e+00 0.000000000
[3,] 5.969388e-01 0.203352770 0.013119534 0.000000e+00 0.000000000
[4,] 5.338010e-01 0.376366618 0.044278426 0.000000e+00 0.000000000
[5,] 3.673469e-01 0.524781341 0.104956268 0.000000e+00 0.000000000
[6,] 2.001640e-01 0.595025510 0.204719388 9.110787e-05 0.000000000
[7,] 9.110787e-02 0.566326531 0.336734694 5.830904e-03 0.000000000
[8,] 3.125000e-02 0.468750000 0.468750000 3.125000e-02 0.000000000
[9,] 5.830904e-03 0.336734694 0.566326531 9.110787e-02 0.000000000
[10,] 9.110787e-05 0.204719388 0.595025510 2.001640e-01 0.000000000
[11,] 0.000000e+00 0.104956268 0.524781341 3.673469e-01 0.002915452
[12,] 0.000000e+00 0.044278426 0.376366618 5.338010e-01 0.045553936
[13,] 0.000000e+00 0.013119534 0.203352770 5.969388e-01 0.186588921
[14,] 0.000000e+00 0.001639942 0.059857872 4.534439e-01 0.485058309
[15,] 0.000000e+00 0.000000000 0.000000000 0.000000e+00 1.000000000
attr(,"degree")
[1] 3
attr(,"knots")
33.33333% 66.66667%
62.66667 67.33333
attr(,"Boundary.knots")
[1] 58 72
attr(,"intercept")
[1] FALSE
attr(,"class")
[1] "bs" "basis" "matrix"
In the resulting list, the first and second components are the response and predictor data respectively. A number of attributes are attached to the second component, the bs data; you need to extract those.
ll <- eval(attr(mod$terms, "predvars"))
attr(ll[[2]], "knots")
attr(ll[[2]], "Boundary.knots")
> attr(ll[[2]], "knots")
33.33333% 66.66667%
62.66667 67.33333
> attr(ll[[2]], "Boundary.knots")
[1] 58 72
You're on the right track. mymodel$terms contains information about the terms in the model, and attr(mymodel$terms, "predvars") is a language object: a list() call holding the response and the predictors, with the computed values of the knots filled in.
To get them out:
x <- attr(mymodel$terms, "predvars")
# x[[1]] is the symbol `list` and x[[2]] is the response LS,
# so the first bs() term is x[[3]]:
x[[3]] # bs(LA, degree = 3, knots = <vector>, Boundary.knots = <vector>, intercept = FALSE)
x[[3]]$knots
x[[3]]$Boundary.knots
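The same pattern gives the knots for the second spline term, and evaluating predvars inside the model data reproduces the fitted basis matrices with the knots attached as attributes (a sketch, assuming mydata contains LS, LA and IN as in the question):
x[[4]]$knots            # interior knots used for bs(IN, df = 7)
x[[4]]$Boundary.knots
ll <- eval(x, mydata)   # list: response, bs(LA) basis, bs(IN) basis
attr(ll[[2]], "knots")  # same LA knots, recovered as in the first answer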
Possible duplicate: Increase the width of matrix printout
I have a big matrix named S, so R displays it split across multiple blocks of columns. This is hard to read; is there a way to display the whole matrix and not let the columns print on separate lines, like this:
> S
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 1.0000000 0.5393599 0.5276449 0.33449680 0.2925090 0.60927180 0.2925090
[2,] 0.5393599 1.0000000 0.7826238 0.43412157 0.6507914 0.51639778 0.5423261
[3,] 0.5276449 0.7826238 1.0000000 0.41602515 0.5457052 0.50518149 0.5457052
[4,] 0.3344968 0.4341216 0.4160251 1.00000000 0.2690691 0.08006408 0.3363364
[5,] 0.2925090 0.6507914 0.5457052 0.26906912 1.0000000 0.42008403 0.5294118
[6,] 0.6092718 0.5163978 0.5051815 0.08006408 0.4200840 1.00000000 0.4900980
[7,] 0.2925090 0.5423261 0.5457052 0.33633640 0.5294118 0.49009803 1.0000000
[8,] 0.4029115 0.5378529 0.6013378 0.44474959 0.6482037 0.46291005 0.5185630
[9,] 0.3636364 0.5393599 0.5276449 0.16724840 0.6581452 0.52223297 0.4387635
[10,] 0.2727273 0.4045199 0.4522670 0.25087260 0.5850179 0.43519414 0.6581452
[11,] 0.4351941 0.6454972 0.5773503 0.32025631 0.4900980 0.41666667 0.4200840
[12,] 0.3636364 0.6741999 0.5276449 0.33449680 0.5850179 0.34815531 0.4387635
[13,] 0.1906925 0.2828427 0.1581139 0.43852901 0.3834825 0.18257419 0.3834825
[,8] [,9] [,10] [,11] [,12] [,13]
[1,] 0.4029115 0.3636364 0.2727273 0.4351941 0.3636364 0.1906925
[2,] 0.5378529 0.5393599 0.4045199 0.6454972 0.6741999 0.2828427
[3,] 0.6013378 0.5276449 0.4522670 0.5773503 0.5276449 0.1581139
[4,] 0.4447496 0.1672484 0.2508726 0.3202563 0.3344968 0.4385290
[5,] 0.6482037 0.6581452 0.5850179 0.4900980 0.5850179 0.3834825
[6,] 0.4629100 0.5222330 0.4351941 0.4166667 0.3481553 0.1825742
[7,] 0.5185630 0.4387635 0.6581452 0.4200840 0.4387635 0.3834825
[8,] 1.0000000 0.6446584 0.5640761 0.4629100 0.5640761 0.4225771
[9,] 0.6446584 1.0000000 0.6363636 0.3481553 0.4545455 0.2860388
[10,] 0.5640761 0.6363636 1.0000000 0.2611165 0.3636364 0.4767313
[11,] 0.4629100 0.3481553 0.2611165 1.0000000 0.5222330 0.2738613
[12,] 0.5640761 0.4545455 0.3636364 0.5222330 1.0000000 0.1906925
[13,] 0.4225771 0.2860388 0.4767313 0.2738613 0.1906925 1.0000000
In addition to the print(..., digits) method, you can also change the width at which printing wraps:
options(width = 150)
Make your window wide...
And, depending on the number of digits that matter, try...
print(S, digits = 3)
but you really need to come up with better ways of examining correlation matrices that don't depend on such things.
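For that last point, one option is to scan only the strong off-diagonal entries rather than printing everything. A sketch (the 0.6 cutoff is arbitrary):
idx <- which(abs(S) > 0.6 & upper.tri(S), arr.ind = TRUE)  # strong correlations, upper triangle only
data.frame(row = idx[, 1], col = idx[, 2], r = S[idx])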