extract certain elements from lists of lists - r

I have a list that contains outputs from multiple correlation tests
dput(head(corr[1:2]))
list(structure(list(statistic = c(S = 1486), parameter = NULL,
p.value = 0.219369570345178, estimate = c(rho = 0.265810276679842),
null.value = c(rho = 0), alternative = "two.sided", method = "Spearman's rank correlation rho",
data.name = "x$theta.x and x$theta.y"), class = "htest"),
structure(list(statistic = c(S = 1852), parameter = NULL,
p.value = 0.699151237307271, estimate = c(rho = 0.0849802371541502),
null.value = c(rho = 0), alternative = "two.sided", method = "Spearman's rank correlation rho",
data.name = "x$theta.x and x$theta.y"), class = "htest"))
I would like to extract p.value and estimate into a separate data frame. For each element I can do it like this:
> corr[[1]][3]
$p.value
[1] 0.2193696
> corr[[1]][4]
$estimate
rho
0.2658103
But I did not have any success in trying to extract those values from the entire list at once.

We can also use the extract function from the magrittr package for this purpose:
library(purrr)
corr %>% map_dfr(magrittr::extract, c("estimate", "p.value"))
# A tibble: 2 x 2
estimate p.value
<dbl> <dbl>
1 0.266 0.219
2 0.0850 0.699

We could do
do.call(rbind, lapply(corr, \(x) data.frame(x[3:4])))
p.value estimate
rho 0.2193696 0.26581028
rho1 0.6991512 0.08498024

You can use [ to extract specific elements from each list item.
as.data.frame(t(sapply(corr, `[`, c(3, 4))))
# p.value estimate
#1 0.219 0.266
#2 0.699 0.085
Moreover, using broom::tidy might be simpler.
purrr::map_df(corr, broom::tidy)
# estimate statistic p.value method alternative
# <dbl> <dbl> <dbl> <chr> <chr>
#1 0.266 1486 0.219 Spearman's rank correlation rho two.sided
#2 0.0850 1852 0.699 Spearman's rank correlation rho two.sided
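A name-based alternative in base R, as a minimal sketch. Indexing by position (3 and 4) depends on the order of elements inside each htest object, whereas looking the elements up by name does not. The corr list below is rebuilt from the dput() output in the question, trimmed to the fields that are used; res is a hypothetical name.

```r
# Two htest objects rebuilt from the question's dput() output (trimmed)
corr <- list(
  structure(list(statistic = c(S = 1486),
                 p.value = 0.219369570345178,
                 estimate = c(rho = 0.265810276679842)),
            class = "htest"),
  structure(list(statistic = c(S = 1852),
                 p.value = 0.699151237307271,
                 estimate = c(rho = 0.0849802371541502)),
            class = "htest")
)

# vapply() guarantees one numeric value per test, so the columns are plain
# numeric vectors rather than list columns
res <- data.frame(
  p.value  = vapply(corr, function(x) x[["p.value"]], numeric(1)),
  estimate = vapply(corr, function(x) unname(x[["estimate"]]), numeric(1))
)
res
```

Because each column is built with vapply(), the call fails early if any test returns a missing or multi-valued element instead of silently producing a ragged result.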

Related

How to extract lower and upper bounds from confidence level in R from t test function?

I used the following code to retrieve a confidence level for my data:
out <- t.test(my_data$my_col, conf.level = 0.95)
out
This returns something like:
data: my_data$my_column
t = 30, df = 20, p-value < 2.1e-14
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
62.23191 80.11201
sample estimates:
mean of x
75.10457
I've tried doing:
out[4][1]
But this returns:
$conf.int
[1] 62.23191 80.11201
attr(,"conf.level")
[1] 0.95
How do I specifically get the lower bound and upper bound from this respectively? (i.e. how do I extract 62.23191 and 80.11201 as variables?)
The output from t.test() is a list. The confidence interval is stored as a vector within the $conf.int list element.
To access the individual confidence limits, use out$conf.int[1] for the lower bound and out$conf.int[2] for the upper bound.
Example:
out <- t.test(1:10, y=c(7:20))
out$conf.int
#[1] -11.052802 -4.947198
#attr(,"conf.level")
#[1] 0.95
out$conf.int[1]
#[1] -11.0528
out$conf.int[2]
#[1] -4.947198
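To keep the bounds as separate variables, a small sketch along the same lines (lower, upper and conf_level are hypothetical names). Subsetting with [ drops the conf.level attribute, so it is read separately with attr().

```r
out <- t.test(1:10, y = 7:20)

ci <- out$conf.int                    # numeric vector of length 2 plus an attribute
lower <- ci[1]                        # lower confidence limit
upper <- ci[2]                        # upper confidence limit
conf_level <- attr(ci, "conf.level")  # confidence level used by t.test()

c(lower = lower, upper = upper, conf_level = conf_level)
```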
You can also use the broom package. I am not sure whether it is more efficient than the accepted answer, but it is another option.
library(broom)
tidy(t.test(1:10, y = c(7:20)))
estimate estimate1 estimate2 statistic p.value parameter conf.low conf.high method alternative
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 -8 5.5 13.5 -5.43 0.0000186 22.0 -11.1 -4.95 Welch Two Sample t-test two.sided

How to create dataframe using results (output) of sens.slope function?

I have an Excel file with multiple sheets. I imported them into R and applied the Mann-Kendall trend test with the function sens.slope(). The results of this function are of class htest, but I want to put them in a table.
I installed the required packages and imported each sheet of the dataset.
require(readxl)
require(trend)
tmin1 <- read_excel("C:/TEZ/ANALİZ/future_projection/2051-2100/model 3-3/average_tmin_3_3_end.xlsx", sheet = "acipayam")
tmin2 <- read_excel("C:/TEZ/ANALİZ/future_projection/2051-2100/model 3-3/average_tmin_3_3_end.xlsx", sheet = "adana")
...
tmin57 <- read_excel("C:/TEZ/ANALİZ/future_projection/2051-2100/model 3-3/average_tmin_3_3_end.xlsx", sheet = "yumurtalik")
Then I specified the columns for the trend test.
x1<-tmin1$`13`
x2<-tmin1$`14`
x3<-tmin1$`15`
x4<-tmin1$`16`
x5<-tmin1$`17`
...
x281<-tmin57$`13`
x282<-tmin57$`14`
x283<-tmin57$`15`
x284<-tmin57$`16`
x285<-tmin57$`17`
And applied the function.
sens.slope(x1)
sens.slope(x2)
sens.slope(x3)
....
sens.slope(x285)
The result looks like this.
> sens.slope(x1)
Sen's slope
data: x1
z = 4.6116, n = 49, p-value = 3.996e-06
alternative hypothesis: true z is not equal to 0
95 percent confidence interval:
0.03241168 0.08101651
sample estimates:
Sen's slope
0.05689083
> sens.slope(x2)
Sen's slope
data: x2
z = 6.8011, n = 49, p-value = 1.039e-11
alternative hypothesis: true z is not equal to 0
95 percent confidence interval:
0.05632911 0.08373755
sample estimates:
Sen's slope
0.07032428
...
How can I put these values in a single table and write them to an Excel file? (The values I need are named statistic and estimates in the function's output.)
There is a package broom precisely for this:
library(tidyverse)
library(trend)
sens.slope(runif(1000)) %>%
broom::tidy()
# A tibble: 1 x 7
statistic p.value parameter conf.low conf.high method alternative
<dbl> <dbl> <int> <dbl> <dbl> <chr> <chr>
1 0.548 0.584 1000 -0.0000442 0.0000801 Sen's slope two.sided
And if you have many data frames, bind them all into one list and loop it over with map_df:
A = tibble(Value = runif(1000))
B = tibble(Value = runif(1000))
C = tibble(Value = runif(1000))
D = tibble(Value = runif(1000))
list(A,B,C,D) %>%
map_df(~.x %>%
pull(1) %>%
sens.slope() %>%
broom::tidy())
# A tibble: 4 x 7
statistic p.value parameter conf.low conf.high method alternative
<dbl> <dbl> <int> <dbl> <dbl> <chr> <chr>
1 -0.376 0.707 1000 -0.0000732 0.0000502 Sen's slope two.sided
2 -2.30 0.0215 1000 -0.000138 -0.0000110 Sen's slope two.sided
3 -1.30 0.194 1000 -0.000104 0.0000209 Sen's slope two.sided
4 0.674 0.500 1000 -0.0000410 0.0000848 Sen's slope two.sided
Edit: Just realised that broom::tidy in this case doesn't provide the estimate (haven't encountered this before), here is the solution without using broom:
A = tibble(Value = runif(1000))
B = tibble(Value = runif(1000))
C = tibble(Value = runif(1000))
D = tibble(Value = runif(1000))
list(A,B,C,D) %>%
purrr::map_df(.,~{
Test = sens.slope(.x %>% pull(1))
Test = tibble(Estimate = Test["estimates"] %>% unlist,
Statistic = Test["statistic"] %>% unlist)
}
)
# A tibble: 4 x 2
Estimate Statistic
<dbl> <dbl>
1 -0.0000495 -1.55
2 -0.00000491 -0.155
3 0.0000242 0.755
4 -0.0000301 -0.921
Try using lists instead of keeping so many objects in the global environment.
Since you already have them, you can combine them in a list, apply sens.slope to each one, extract statistic and estimates from the results, and build the data frame.
library(trend)
output <- data.frame(t(sapply(mget(paste0('x', 1:285)), function(y)
{temp <- sens.slope(y);c(temp$statistic, temp$estimates)})))
You can now write this dataframe as csv using write.csv.
write.csv(output, 'output.csv', row.names = FALSE)
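The list-based workflow can be sketched without creating x1 ... x285 at all. This is only an illustration under stated assumptions: htest_row is a hypothetical helper, the series list is simulated data, and t.test() stands in for trend::sens.slope() so that the example runs in base R. The helper checks both element names because, as noted in the question, sens.slope() stores its estimate under estimates, while most base tests use estimate.

```r
# Hypothetical helper: one data frame row per "htest" object
htest_row <- function(h) {
  # sens.slope() uses `estimates`; most base R tests use `estimate`
  est <- if (!is.null(h[["estimates"]])) h[["estimates"]] else h[["estimate"]]
  data.frame(
    statistic = unname(h[["statistic"]])[1],
    estimate  = unname(est)[1],
    p.value   = h[["p.value"]]
  )
}

# Simulated stand-in for the columns read from the Excel sheets
set.seed(1)
series <- list(a = rnorm(30), b = rnorm(30, mean = 1), c = rnorm(30, mean = 2))

# Replace t.test with trend::sens.slope to run the actual trend test
results <- lapply(series, function(v) htest_row(t.test(v)))

output <- do.call(rbind, results)
output$series <- names(series)  # keep track of which series each row belongs to
rownames(output) <- NULL

out_file <- tempfile(fileext = ".csv")
write.csv(output, out_file, row.names = FALSE)
output
```

Storing the series name in its own column means row.names = FALSE can be used when writing the file without losing the link between each row and its series.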

Using metafor::rma with broom::tidy?

I'm a complete R novice and would like to do the following:
library(metafor)
library(broomExtra)
df <-
escalc(
measure = "RR",
ai = tpos,
bi = tneg,
ci = cpos,
di = cneg,
data = dat.bcg
)
meta_analysis <- rma(yi, vi, data = df, method = "EB")
meta_analysis
tidy(meta_analysis)
Why does tidy(meta_analysis) always give me NULL?
You can use broomExtra::tidy_parameters function if there is no tidier in broom:
library(metafor)
#> Loading required package: Matrix
#> Loading 'metafor' package (version 2.1-0). For an overview
df <-
escalc(
measure = "RR",
ai = tpos,
bi = tneg,
ci = cpos,
di = cneg,
data = dat.bcg
)
meta_analysis <- rma(yi, vi, data = df, method = "EB")
broomExtra::tidy_parameters(meta_analysis)
#> # A tibble: 1 x 8
#> term type estimate std.error statistic p.value conf.low conf.high
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 overall summary -0.715 0.181 -3.95 0.0000774 -1.07 -0.360
Checked the documentation (?tidy). Seems there is no tidy method for an object of class rma. From the docs of broomExtra::tidy:
Checks if a tidy method exits for a given object, either in broom or
in broom.mixed. If it does, it turn an object into a tidy tibble, if
not, return a NULL. In case of data frames, a tibble data frame is
returned.
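The kind of check described in that documentation can be illustrated in base R. This is a rough sketch, not broomExtra's actual implementation: has_s3_method is a hypothetical helper, and summary() is used as the generic so that no extra packages are needed. The same idea applies to tidy() once broom is loaded.

```r
# Hypothetical helper: is there an S3 method of `generic` for any class of `object`?
has_s3_method <- function(generic, object) {
  found <- vapply(
    class(object),
    function(cl) !is.null(utils::getS3method(generic, cl, optional = TRUE)),
    logical(1)
  )
  any(found)
}

fit <- lm(mpg ~ wt, data = mtcars)
has_s3_method("summary", fit)   # TRUE: summary.lm exists

fake <- structure(list(), class = "my_unknown_model")
has_s3_method("summary", fake)  # FALSE: no method for this class
```

When no class-specific method is found, a wrapper can either fall back to another tidier or return NULL, which is the behaviour the quoted documentation describes.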

arguments provided as a list not getting evaluated properly

I am working on a custom function whose goal is to run a function (..f) for all combinations of grouping variables grouping.var provides for a given dataframe and then tidy those results into a dataframe using broom package.
Here is a custom function I've written. Note that ... are supplied to ..f, while additional arguments for broom::tidy method are supplied via tidy.args list.
# setup
set.seed(123)
library(tidyverse)
options(pillar.sigfig = 8)
# custom function
grouped_tidy <- function(data,
grouping.vars,
..f,
...,
tidy.args = list()) {
# check how many variables were entered for grouping variable vector
grouping.vars <-
as.list(rlang::quo_squash(rlang::enquo(grouping.vars)))
grouping.vars <-
if (length(grouping.vars) == 1) {
grouping.vars
} else {
grouping.vars[-1]
}
# quote all argument to `..f`
dots <- rlang::enquos(...)
# running the grouped analysis
df_results <- data %>%
dplyr::group_by(.data = ., !!!grouping.vars, .drop = TRUE) %>%
dplyr::group_map(
.tbl = .,
.f = ~ broom::tidy(
x = rlang::exec(.fn = ..f, !!!dots, data = .x),
unlist(tidy.args)
))
# return the final dataframe with results
return(df_results)
}
As shown by examples below, although this function works, I am doubtful the tidy.args list is getting evaluated properly because irrespective of what conf.level I choose, I always get the same results to the 4th decimal place.
95% CI
# using the function to get 95% CI
grouped_tidy(
data = ggplot2::diamonds,
grouping.vars = c(cut),
..f = stats::lm,
formula = price ~ carat - 1,
tidy.args = list(conf.int = TRUE, conf.level = 0.95)
)
#> # A tibble: 5 x 8
#> # Groups: cut [5]
#> cut term estimate std.error statistic p.value conf.low conf.high
#> <ord> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Fair carat 4510.7919 42.614474 105.85117 0 4427.2062 4594.3776
#> 2 Good carat 5260.8494 27.036670 194.58200 0 5207.8454 5313.8534
#> 3 Very Good carat 5672.5054 18.675939 303.73334 0 5635.8976 5709.1132
#> 4 Premium carat 5807.1392 16.836474 344.91422 0 5774.1374 5840.1410
#> 5 Ideal carat 5819.4837 15.178657 383.39911 0 5789.7324 5849.2350
99% CI
# using the function to get 99% CI
grouped_tidy(
data = ggplot2::diamonds,
grouping.vars = c(cut),
..f = stats::lm,
formula = price ~ carat - 1,
tidy.args = list(conf.int = TRUE, conf.level = 0.99)
)
#> # A tibble: 5 x 8
#> # Groups: cut [5]
#> cut term estimate std.error statistic p.value conf.low conf.high
#> <ord> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Fair carat 4510.7919 42.614474 105.85117 0 4427.2062 4594.3776
#> 2 Good carat 5260.8494 27.036670 194.58200 0 5207.8454 5313.8534
#> 3 Very Good carat 5672.5054 18.675939 303.73334 0 5635.8976 5709.1132
#> 4 Premium carat 5807.1392 16.836474 344.91422 0 5774.1374 5840.1410
#> 5 Ideal carat 5819.4837 15.178657 383.39911 0 5789.7324 5849.2350
Any idea on how I can change the function so that the list of arguments will be evaluated properly by broom::tidy?
set.seed(123)
library(tidyverse)
options(pillar.sigfig = 8)
grouped_tidy <- function(data,
grouping.vars,
..f,
...,
tidy.args = list()) {
# functions passed to group_map must accept
# .x and .y arguments, where .x is the data
tidy_group <- function(.x, .y) {
# presumes ..f won't explode if called with these args
model <- ..f(..., data = .x)
# mild variation on do.call to call function with
# list of arguments
rlang::exec(broom::tidy, model, !!!tidy.args)
}
data %>%
group_by(!!!grouping.vars, .drop = TRUE) %>%
group_map(tidy_group) %>%
ungroup() # don't get bitten by groups downstream
}
grouped_tidy(
data = ggplot2::diamonds,
# wrap grouping columns in vars() like in scoped dplyr verbs
grouping.vars = vars(cut),
..f = stats::lm,
formula = price ~ carat - 1,
tidy.args = list(conf.int = TRUE, conf.level = 0.95)
)
#> # A tibble: 5 x 8
#> cut term estimate std.error statistic p.value conf.low conf.high
#> <ord> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Fair carat 4510.7919 42.614474 105.85117 0 4427.2062 4594.3776
#> 2 Good carat 5260.8494 27.036670 194.58200 0 5207.8454 5313.8534
#> 3 Very Good carat 5672.5054 18.675939 303.73334 0 5635.8976 5709.1132
#> 4 Premium carat 5807.1392 16.836474 344.91422 0 5774.1374 5840.1410
#> 5 Ideal carat 5819.4837 15.178657 383.39911 0 5789.7324 5849.2350
Created on 2019-02-23 by the reprex package (v0.2.1)
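Why unlist(tidy.args) has no effect can be shown with a toy function. This is a simplified sketch: toy_tidy is a hypothetical stand-in whose second and third arguments mimic conf.int and conf.level, and base do.call() plays the role that rlang::exec() with !!! plays in the answer above.

```r
# Toy stand-in for a tidier: it just reports the arguments it received
toy_tidy <- function(x, conf.int = FALSE, conf.level = 0.95) {
  list(conf.int = conf.int, conf.level = conf.level)
}

tidy.args <- list(conf.int = TRUE, conf.level = 0.99)

# unlist() yields ONE numeric vector, c(conf.int = 1, conf.level = 0.99),
# which is matched positionally to `conf.int`; conf.level keeps its default
wrong <- toy_tidy("model", unlist(tidy.args))
wrong$conf.level  # 0.95

# do.call() spreads the list so each element is matched by its name
right <- do.call(toy_tidy, c(list("model"), tidy.args))
right$conf.level  # 0.99
```

In the question's function the same thing likely happens: the whole vector lands in a single argument, so conf.level never changes from its default and the 95% and 99% calls return identical intervals.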

How to pull the coefficient values from a logistic regression into a dataframe in R? [duplicate]

I fit a logistic regression model in R for a dataset. I can see the coefficients for each predictor via summary(model_fit), but now I need to store them in a data frame. Below are the values as shown by summary.
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.387e+00 2.734e+00 -1.605 0.1086
GDP_PER_CAP -6.888e-05 3.870e-05 -1.780 0.0751 .
CO2_PER_CAP 1.816e-01 7.255e-02 2.503 0.0123 *
PERC_ACCESS_ELECTRICITY -5.973e-03 1.291e-02 -0.463 0.6437
ATMS_PER_1E5 -5.749e-03 8.181e-03 -0.703 0.4822
PERC_INTERNET_USERS -2.146e-02 2.106e-02 -1.019 0.3083
SCIENTIFIC_ARTICLES_PER_YR 3.319e-05 1.650e-05 2.011 0.0443 *
PERC_FEMALE_SECONDARY_EDU 1.559e-01 6.428e-02 2.426 0.0153 *
PERC_FEMALE_LABOR_FORCE -1.265e-02 1.470e-02 -0.860 0.3896
PERC_FEMALE_PARLIAMENT -4.802e-02 2.087e-02 -2.301 0.0214 *
dataframe <- dataframe0 %>%
mutate(EQUAL_PAY = relevel(factor(EQUAL_PAY), "YES"))
set.seed(1)
trn_index = createDataPartition(y = dataframe$EQUAL_PAY, p = 0.80, list = FALSE)
trn_equalpay = dataframe[trn_index, ]
tst_equalpay = dataframe[-trn_index, ]
equalpay_lgr = train(EQUAL_PAY ~ .-EQUAL_WORK -COUNTRY, method = "glm",
family = binomial(link = "logit"), data = trn_equalpay,
trControl = trainControl(method = 'cv', number = 10))
???? coefficients <- summary(equalpay_lgr)
You should definitely check out the broom package, which does lots of stuff like this. You can find an introduction to that package here.
For your question, the solution is tidy from broom. Using the example from the link above:
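library(broom)
lmfit <- lm(mpg ~ wt, mtcars)  # assumed to be the model from the linked example; it matches the output below
tidy(lmfit)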
## # A tibble: 2 x 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 37.3 1.88 19.9 8.24e-19
## 2 wt -5.34 0.559 -9.56 1.29e-10
As you can see, tidy returns a tibble (a type of data frame) that contains a column estimate. This is the coefficient you're looking for.
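If you would rather not add a dependency, the coefficient table can also be taken straight from the model summary in base R. A minimal sketch, using a glm on mtcars as a stand-in for the question's model; coef_df is a hypothetical name. For a caret train object, the fitted glm is, as far as I know, stored in the finalModel element, so summary(equalpay_lgr$finalModel) would be the starting point there.

```r
# Stand-in logistic regression; replace with your own fitted glm
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial(link = "logit"))

# coef(summary(fit)) is a numeric matrix with one row per term
coef_mat <- coef(summary(fit))

# Move the term names into a column and keep the original column labels
coef_df <- data.frame(term = rownames(coef_mat), coef_mat, check.names = FALSE)
rownames(coef_df) <- NULL
coef_df
```

The Estimate column holds the coefficients on the log-odds scale, the same values that summary() prints.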
