iGraph figures chopped off in R markdown - r

I have some code that generates layouts for the following minimum spanning tree "cell_dtree":
> cell_dtree
IGRAPH 951dfd5 D--- 720 719 --
+ edges from 951dfd5:
[1] 400-> 1 1-> 2 38-> 3 3-> 4 197-> 5 10-> 6 10-> 7 13-> 8 1-> 9 28-> 10 225-> 11 362-> 12 30-> 13
[14] 20-> 14 148-> 15 3-> 16 13-> 17 160-> 18 435-> 19 1-> 20 60-> 21 38-> 22 9-> 23 68-> 24 9-> 25 178-> 26
[27] 21-> 27 1-> 28 60-> 29 2-> 30 1-> 31 2-> 32 1-> 33 352-> 34 21-> 35 20-> 36 1-> 37 1-> 38 554-> 39
[40] 3-> 40 554-> 41 333-> 42 352-> 43 126-> 44 1-> 45 69-> 46 227-> 47 160-> 48 1-> 49 1-> 50 37-> 51 708-> 52
[53] 705-> 53 185-> 54 307-> 55 48-> 56 667-> 57 563-> 58 428-> 59 519-> 60 428-> 61 707-> 62 1-> 63 707-> 64 707-> 65
[66] 707-> 66 214-> 67 20-> 68 68-> 69 37-> 70 453-> 71 57-> 72 148-> 73 345-> 74 69-> 75 148-> 76 80-> 77 79-> 78
[79] 9-> 79 70-> 80 68-> 81 148-> 82 23-> 83 345-> 84 454-> 85 345-> 86 36-> 87 345-> 88 36-> 89 311-> 90 148-> 91
[92] 13-> 92 345-> 93 13-> 94 350-> 95 326-> 96 79-> 97 666-> 98 539-> 99 430->100 554->101 213->102 20->103 38->104
[105] 21->105 172->106 112->107 1->108 20->109 453->110 80->111 703->112 20->113 9->114 79->115 1->116 47->117
+ ... omitted several edges
Here is the R Markdown code to create the plot.
```{r, results='asis', fig.width = 7, fig.height = 7}
set.seed(1)
l2 <- igraph::layout.lgl(cell_dtree)
l2 <- igraph::layout.norm(l2, ymin = -2, ymax = 2, xmin = -2, xmax = 2)
plot(cell_dtree,
rescale = F,
layout = l2 * 1,
edge.arrow.width=0.1,
vertex.label=NA,
vertex.size=1,
vertex.label=NA,
edge.width=0.5,
edge.arrow.size=0.5,
edge.arrow.width=0.7)
```
When this document goes through knitr, the plot looks cut off: only a rectangular region in the center is displayed in the knitr HTML output.
To troubleshoot the issue, I generated an Erdos-Renyi graph with igraph, using the same number of vertices (720) as the graph above and the same plotting parameters:
```{r, results='asis', out.width = '100%', out.height = '100%', fig.width = 7, fig.height = 7}
er <- igraph::sample_gnm(n=720, m=40)
plot(er, vertex.size=6, vertex.label=NA)
set.seed(1)
l2 <- igraph::layout.lgl(er)
l2 <- igraph::layout.norm(l2, ymin = -2, ymax = 2, xmin = -2, xmax = 2)
plot(er,
rescale = F,
layout = l2 * 1,
vertex.label=NA,
vertex.size=1,
vertex.label=NA,
edge.width=0.5,
edge.arrow.size=0.5,
edge.arrow.width=0.7)
```
However, the resulting image does exactly what I want it to do: fill the figure box efficiently without those big white borders. Obviously not all the vertices are displayed here, but the scaling shows that the layout is using all available space for the image.
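One possible explanation, offered as an assumption rather than something confirmed in this thread: with rescale = FALSE, plot.igraph keeps its default xlim = c(-1, 1) and ylim = c(-1, 1), so a layout normalised to [-2, 2] extends beyond the plotting window and only the central part is drawn. The sketch below uses a hypothetical helper, layout_limits(), and a random stand-in matrix for l2 to compute axis limits that enclose the whole layout.

```r
# Hypothetical helper: axis limits that enclose every layout coordinate,
# widened by a small padding fraction on each side.
layout_limits <- function(layout, pad = 0.05) {
  stopifnot(is.matrix(layout), ncol(layout) >= 2)
  widen <- function(r) r + c(-1, 1) * pad * diff(r)
  list(xlim = widen(range(layout[, 1])),
       ylim = widen(range(layout[, 2])))
}

# Stand-in for a layout normalised to [-2, 2], like l2 in the question.
set.seed(1)
l2 <- cbind(runif(720, -2, 2), runif(720, -2, 2))
lims <- layout_limits(l2)

# With igraph loaded, the limits would be passed alongside rescale = FALSE:
# plot(cell_dtree, layout = l2, rescale = FALSE,
#      xlim = lims$xlim, ylim = lims$ylim,
#      vertex.label = NA, vertex.size = 1)
```

If this is indeed the cause, normalising the layout to [-1, 1] or leaving rescale at its default of TRUE would avoid the need for explicit limits.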

Related

How do you get the posterior estimates from a stanfit object just as you would from a brmsfit object in R?

I am fairly new to R and Stan, and I would like to code my own model in Stan. The problem is that I don't know how to obtain the estimate__ values that conditional_effects(brmsfit) produces when using library(brms).
Here is an example of what I would like to obtain:
library(rstan)
library(brms)
N <- 10
y <- rnorm(10)
x <- rnorm(10)
df <- data.frame(x, y)
fit <- brm(y ~ x, data = df)
data <- conditional_effects(fit)
print(data[["x"]])
Which gives this output:
x y cond__ effect1__ estimate__ se__
1 -1.777412243 0.1417486 1 -1.777412243 0.08445399 0.5013894
2 -1.747889444 0.1417486 1 -1.747889444 0.08592914 0.4919022
3 -1.718366646 0.1417486 1 -1.718366646 0.08487412 0.4840257
4 -1.688843847 0.1417486 1 -1.688843847 0.08477227 0.4744689
5 -1.659321048 0.1417486 1 -1.659321048 0.08637019 0.4671830
6 -1.629798249 0.1417486 1 -1.629798249 0.08853233 0.4612196
7 -1.600275450 0.1417486 1 -1.600275450 0.08993511 0.4566040
8 -1.570752651 0.1417486 1 -1.570752651 0.08987979 0.4501722
9 -1.541229852 0.1417486 1 -1.541229852 0.09079337 0.4415650
10 -1.511707053 0.1417486 1 -1.511707053 0.09349952 0.4356073
11 -1.482184255 0.1417486 1 -1.482184255 0.09382594 0.4292237
12 -1.452661456 0.1417486 1 -1.452661456 0.09406637 0.4229115
13 -1.423138657 0.1417486 1 -1.423138657 0.09537000 0.4165933
14 -1.393615858 0.1417486 1 -1.393615858 0.09626168 0.4126735
15 -1.364093059 0.1417486 1 -1.364093059 0.09754818 0.4060894
16 -1.334570260 0.1417486 1 -1.334570260 0.09737763 0.3992320
17 -1.305047461 0.1417486 1 -1.305047461 0.09646332 0.3929951
18 -1.275524662 0.1417486 1 -1.275524662 0.09713718 0.3870211
19 -1.246001864 0.1417486 1 -1.246001864 0.09915170 0.3806628
20 -1.216479065 0.1417486 1 -1.216479065 0.10046754 0.3738948
21 -1.186956266 0.1417486 1 -1.186956266 0.10192677 0.3675363
22 -1.157433467 0.1417486 1 -1.157433467 0.10329695 0.3613282
23 -1.127910668 0.1417486 1 -1.127910668 0.10518868 0.3533583
24 -1.098387869 0.1417486 1 -1.098387869 0.10533191 0.3484098
25 -1.068865070 0.1417486 1 -1.068865070 0.10582833 0.3442075
26 -1.039342271 0.1417486 1 -1.039342271 0.10864510 0.3370518
27 -1.009819473 0.1417486 1 -1.009819473 0.10830692 0.3325785
28 -0.980296674 0.1417486 1 -0.980296674 0.11107417 0.3288747
29 -0.950773875 0.1417486 1 -0.950773875 0.11229667 0.3249769
30 -0.921251076 0.1417486 1 -0.921251076 0.11420108 0.3216303
31 -0.891728277 0.1417486 1 -0.891728277 0.11533604 0.3160908
32 -0.862205478 0.1417486 1 -0.862205478 0.11671013 0.3099456
33 -0.832682679 0.1417486 1 -0.832682679 0.11934724 0.3059504
34 -0.803159880 0.1417486 1 -0.803159880 0.12031792 0.3035792
35 -0.773637082 0.1417486 1 -0.773637082 0.12114301 0.2985330
36 -0.744114283 0.1417486 1 -0.744114283 0.12149371 0.2949334
37 -0.714591484 0.1417486 1 -0.714591484 0.12259197 0.2915398
38 -0.685068685 0.1417486 1 -0.685068685 0.12308763 0.2905327
39 -0.655545886 0.1417486 1 -0.655545886 0.12409683 0.2861451
40 -0.626023087 0.1417486 1 -0.626023087 0.12621634 0.2834400
41 -0.596500288 0.1417486 1 -0.596500288 0.12898609 0.2838938
42 -0.566977489 0.1417486 1 -0.566977489 0.12925969 0.2802667
43 -0.537454691 0.1417486 1 -0.537454691 0.13050938 0.2782553
44 -0.507931892 0.1417486 1 -0.507931892 0.12968382 0.2765127
45 -0.478409093 0.1417486 1 -0.478409093 0.13252478 0.2735946
46 -0.448886294 0.1417486 1 -0.448886294 0.13414535 0.2727640
47 -0.419363495 0.1417486 1 -0.419363495 0.13453109 0.2710725
48 -0.389840696 0.1417486 1 -0.389840696 0.13526957 0.2683500
49 -0.360317897 0.1417486 1 -0.360317897 0.13675913 0.2665745
50 -0.330795098 0.1417486 1 -0.330795098 0.13987067 0.2658021
51 -0.301272300 0.1417486 1 -0.301272300 0.14111051 0.2668740
52 -0.271749501 0.1417486 1 -0.271749501 0.14382292 0.2680711
53 -0.242226702 0.1417486 1 -0.242226702 0.14531118 0.2662193
54 -0.212703903 0.1417486 1 -0.212703903 0.14656473 0.2670958
55 -0.183181104 0.1417486 1 -0.183181104 0.14689102 0.2677249
56 -0.153658305 0.1417486 1 -0.153658305 0.14749250 0.2698547
57 -0.124135506 0.1417486 1 -0.124135506 0.14880275 0.2711767
58 -0.094612707 0.1417486 1 -0.094612707 0.15072864 0.2719037
59 -0.065089909 0.1417486 1 -0.065089909 0.15257772 0.2720895
60 -0.035567110 0.1417486 1 -0.035567110 0.15434018 0.2753563
61 -0.006044311 0.1417486 1 -0.006044311 0.15556588 0.2783308
62 0.023478488 0.1417486 1 0.023478488 0.15481341 0.2802336
63 0.053001287 0.1417486 1 0.053001287 0.15349716 0.2833364
64 0.082524086 0.1417486 1 0.082524086 0.15432904 0.2868926
65 0.112046885 0.1417486 1 0.112046885 0.15637411 0.2921039
66 0.141569684 0.1417486 1 0.141569684 0.15793097 0.2979247
67 0.171092482 0.1417486 1 0.171092482 0.15952338 0.3022751
68 0.200615281 0.1417486 1 0.200615281 0.15997047 0.3048768
69 0.230138080 0.1417486 1 0.230138080 0.16327957 0.3087545
70 0.259660879 0.1417486 1 0.259660879 0.16372900 0.3125599
71 0.289183678 0.1417486 1 0.289183678 0.16395417 0.3185642
72 0.318706477 0.1417486 1 0.318706477 0.16414444 0.3240570
73 0.348229276 0.1417486 1 0.348229276 0.16570600 0.3273931
74 0.377752075 0.1417486 1 0.377752075 0.16556032 0.3316680
75 0.407274873 0.1417486 1 0.407274873 0.16815162 0.3391713
76 0.436797672 0.1417486 1 0.436797672 0.16817144 0.3465403
77 0.466320471 0.1417486 1 0.466320471 0.16790241 0.3514764
78 0.495843270 0.1417486 1 0.495843270 0.16941330 0.3590708
79 0.525366069 0.1417486 1 0.525366069 0.17068468 0.3662851
80 0.554888868 0.1417486 1 0.554888868 0.17238535 0.3738123
81 0.584411667 0.1417486 1 0.584411667 0.17358253 0.3796033
82 0.613934466 0.1417486 1 0.613934466 0.17521059 0.3869863
83 0.643457264 0.1417486 1 0.643457264 0.17617046 0.3939509
84 0.672980063 0.1417486 1 0.672980063 0.17710931 0.3967577
85 0.702502862 0.1417486 1 0.702502862 0.17816611 0.4026686
86 0.732025661 0.1417486 1 0.732025661 0.17998354 0.4094216
87 0.761548460 0.1417486 1 0.761548460 0.18085939 0.4165644
88 0.791071259 0.1417486 1 0.791071259 0.18114271 0.4198687
89 0.820594058 0.1417486 1 0.820594058 0.18294576 0.4255245
90 0.850116857 0.1417486 1 0.850116857 0.18446785 0.4333511
91 0.879639655 0.1417486 1 0.879639655 0.18498697 0.4407155
92 0.909162454 0.1417486 1 0.909162454 0.18729221 0.4472631
93 0.938685253 0.1417486 1 0.938685253 0.18952720 0.4529227
94 0.968208052 0.1417486 1 0.968208052 0.19203126 0.4579841
95 0.997730851 0.1417486 1 0.997730851 0.19408999 0.4671136
96 1.027253650 0.1417486 1 1.027253650 0.19551024 0.4751111
97 1.056776449 0.1417486 1 1.056776449 0.19700981 0.4804208
98 1.086299247 0.1417486 1 1.086299247 0.19756573 0.4850098
99 1.115822046 0.1417486 1 1.115822046 0.20044626 0.4915511
100 1.145344845 0.1417486 1 1.145344845 0.20250046 0.4996890
lower__ upper__
1 -1.0567858 1.1982199
2 -1.0438136 1.1831539
3 -1.0228641 1.1707170
4 -1.0072313 1.1596104
5 -0.9864567 1.1438521
6 -0.9689320 1.1282532
7 -0.9505741 1.1173943
8 -0.9357609 1.0983966
9 -0.9230198 1.0859565
10 -0.9104617 1.0757511
11 -0.8874429 1.0631791
12 -0.8687644 1.0467475
13 -0.8513190 1.0348922
14 -0.8290140 1.0236083
15 -0.8126063 1.0166800
16 -0.7975146 1.0011153
17 -0.7869631 0.9873863
18 -0.7760327 0.9721754
19 -0.7551183 0.9585837
20 -0.7427828 0.9479480
21 -0.7269582 0.9405559
22 -0.7072756 0.9284436
23 -0.6975987 0.9161489
24 -0.6884648 0.9040642
25 -0.6684576 0.8923201
26 -0.6535668 0.8811996
27 -0.6517693 0.8714208
28 -0.6394743 0.8652541
29 -0.6235719 0.8542377
30 -0.6127188 0.8433206
31 -0.6017256 0.8346912
32 -0.5845027 0.8192662
33 -0.5701008 0.8098853
34 -0.5596900 0.7982326
35 -0.5473666 0.7980605
36 -0.5340069 0.7908127
37 -0.5239994 0.7826979
38 -0.5124559 0.7811926
39 -0.4986325 0.7786670
40 -0.5044564 0.7745791
41 -0.4940340 0.7699341
42 -0.4871297 0.7698303
43 -0.4808839 0.7678166
44 -0.4790951 0.7662335
45 -0.4711604 0.7576184
46 -0.4690302 0.7577330
47 -0.4675442 0.7567887
48 -0.4673520 0.7554134
49 -0.4649256 0.7499373
50 -0.4600178 0.7494690
51 -0.4500426 0.7500552
52 -0.4475863 0.7505488
53 -0.4437339 0.7513191
54 -0.4429276 0.7564214
55 -0.4427087 0.7578937
56 -0.4451014 0.7613821
57 -0.4418548 0.7706546
58 -0.4377409 0.7787030
59 -0.4397108 0.7882644
60 -0.4462651 0.8026011
61 -0.4538979 0.8069187
62 -0.4542826 0.8163290
63 -0.4557042 0.8285206
64 -0.4572005 0.8335650
65 -0.4638491 0.8413812
66 -0.4681885 0.8539095
67 -0.4775714 0.8633141
68 -0.4888333 0.8698490
69 -0.4952363 0.8791527
70 -0.4975383 0.8833882
71 -0.5088667 0.8863114
72 -0.5197474 0.8951534
73 -0.5316745 0.9085101
74 -0.5409388 0.9207023
75 -0.5572803 0.9282691
76 -0.5643576 0.9357900
77 -0.5751774 0.9517092
78 -0.5855919 0.9625510
79 -0.5995727 0.9781417
80 -0.6115650 0.9946185
81 -0.6198287 1.0071916
82 -0.6297608 1.0208370
83 -0.6447637 1.0357034
84 -0.6511860 1.0506364
85 -0.6659993 1.0608813
86 -0.6794852 1.0702993
87 -0.6893830 1.0801824
88 -0.7040491 1.1026626
89 -0.7183266 1.1196308
90 -0.7387399 1.1401544
91 -0.7541057 1.1561184
92 -0.7608552 1.1701851
93 -0.7783620 1.1855296
94 -0.7920760 1.2014060
95 -0.8063188 1.2157463
96 -0.8224106 1.2307841
97 -0.8377605 1.2484814
98 -0.8530954 1.2580503
99 -0.8684646 1.2731355
100 -0.8840083 1.2891893
From this I can easily plot the estimate__ column against x to obtain my linear regression.
Now assume I want to do the same with my own Stan code, using the stan() function:
library(rstan)
N <- 10
y <- rnorm(10)
x <- rnorm(10)
df <- data.frame(x, y)
fit <- stan('stan_test.stan', data = list(y = y, x = x, N = N))
print(fit)
Which yields the output:
Inference for Stan model: stan_test.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
alpha -0.35 0.01 0.43 -1.23 -0.62 -0.35 -0.09 0.50 2185 1
beta -0.26 0.01 0.57 -1.41 -0.60 -0.25 0.08 0.86 2075 1
sigma 1.26 0.01 0.41 0.74 0.99 1.17 1.43 2.27 1824 1
lp__ -6.19 0.04 1.50 -10.18 -6.87 -5.79 -5.07 -4.48 1282 1
Samples were drawn using NUTS(diag_e) at Fri Jun 03 10:08:50 2022.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
How would I obtain the same estimate__ column as well as the lower__ and upper__ columns?
Note, I know I can easily plot it using the intercept and slope means, but I would like to plot more complex models that can't be plotted as easily as such -- this is just a simple example.
My understanding is that brms estimates conditional effects by applying the model formula to a range of values for the variable you're interested in, with other variables set to appropriate baseline values. In order to do this, brms has to generate the new dataset, apply the model to it, and summarize appropriately. To my knowledge, rstan doesn't have built-in functions that do this; this means that, when we move from brms to rstan, we have to do these steps ourselves.
Here's one way to do it. I've done the first two steps (generate a new dataset and apply the model to it) within Stan, although it would be possible to use R instead.
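For comparison, the R-only route mentioned above could look roughly like the following. This is a minimal sketch, not what brms does internally: the posterior draws are simulated stand-ins (with a real stanfit object they could be taken from rstan::extract(fit)), and the names alpha_draws, beta_draws and cond_effects are illustrative.

```r
set.seed(123)
# Observed predictor (simulated here for illustration).
x <- rnorm(10)

# Stand-ins for posterior draws of the intercept and slope. They are drawn
# independently, so unlike real draws they ignore posterior correlation.
n_draws <- 4000
alpha_draws <- rnorm(n_draws, mean = -0.35, sd = 0.43)
beta_draws  <- rnorm(n_draws, mean = -0.26, sd = 0.57)

# 1. New dataset: 100 evenly spaced values spanning the observed range of x.
x_cond <- seq(min(x), max(x), length.out = 100)

# 2. Apply the model: one row per draw, one column per value of x_cond.
#    alpha_draws is recycled down the columns, so row i uses alpha_draws[i].
y_cond <- alpha_draws + outer(beta_draws, x_cond)

# 3. Summarize over draws for each value of x_cond.
cond_effects <- data.frame(
  x          = x_cond,
  estimate__ = apply(y_cond, 2, median),
  lower__    = apply(y_cond, 2, quantile, probs = 0.025),
  upper__    = apply(y_cond, 2, quantile, probs = 0.975)
)
```

For more complex models, only step 2 changes: the formula there has to mirror whatever the Stan model block specifies.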
Generate the new dataset
I've added a transformed data block to the basic Stan program. It finds the min and max observed values of x and creates a vector of 100 evenly spaced points between those two values. If you have more than one predictor for which you want to estimate conditional effects, you'll need to create a separate vector for each one.
data {
int<lower=0> N;
vector[N] x;
vector[N] y;
}
transformed data {
// How many values of the continuous variable will we use to estimate
// conditional effects?
// 100, to match the default behavior of conditional_effects.
int n_cond_points = 100;
vector[n_cond_points] x_cond_internal;
// Space the values evenly between the min and max observed values.
// n_cond_points values span (n_cond_points - 1) equal gaps.
real point_diff = (max(x) - min(x)) / (n_cond_points - 1);
for(i in 1:n_cond_points) {
if(i == 1) {
x_cond_internal[i] = min(x);
} else if(i == n_cond_points) {
x_cond_internal[i] = max(x);
} else {
x_cond_internal[i] = x_cond_internal[i - 1] + point_diff;
}
}
}
Apply the model to the new dataset
I used the generated quantities block to apply the model to the new dataset. Three things are noteworthy here:
It's not possible to extract values from the transformed data block out of a stanfit object. As suggested by this answer, I've copied the new dataset into a variable in the generated quantities block so we can get it out of the stanfit object. (It will be the same across all draws.)
We have to specify the model by hand, in the same way it was specified in the model block. If the model changes there, it must be changed by hand in the same way in the generated quantities block.
If you have more than one predictor, you'll need to iterate over each predictor separately. In addition, while you're estimating conditional effects for one predictor, you'll need to choose appropriate baseline values for the other predictors (0, mean, baseline category, or whatever is appropriate for your dataset).
parameters {
real alpha;
real beta;
real<lower=0> sigma;
}
model {
y ~ normal(alpha + (beta * x), sigma);
}
generated quantities {
// We can't extract transformed data from the stanfit object, so we copy the
// values of x_cond here.
vector[n_cond_points] x_cond = x_cond_internal;
// Estimated value of y for each value of x.
// Note that we have to specify the formula from the model block again; if
// that formula changes, this one must be changed by hand to match.
vector[n_cond_points] y_cond;
for(i in 1:n_cond_points) {
y_cond[i] = alpha + (beta * x_cond[i]);
}
}
Summarize the estimates
When we fit this Stan model, we get one estimate of y_cond per value of x_cond per draw, which is exactly what we want. We can summarize over draws in R:
library(tidyverse)
library(tidybayes)
fit2 <- stan('stan_test.stan', data = list(y = y, x = x, N = N))
cond.effects.df = spread_draws(fit2, x_cond[i], y_cond[i]) %>%
ungroup() %>%
dplyr::select(.draw, i, x = x_cond, y_cond) %>%
group_by(i, x) %>%
summarise(estimate__ = median(y_cond),
lower__ = quantile(y_cond, 0.025),
upper__ = quantile(y_cond, 0.975),
.groups = "keep") %>%
ungroup()
Comparing the two methods
The results of this procedure look pretty much the same as the output of brms. Here's what I got:
theme_set(theme_bw())
bind_rows(
data[["x"]] %>%
mutate(i = row_number(),
method = "brms"),
cond.effects.df %>%
mutate(method = "by hand")
) %>%
ggplot(aes(x = x, color = method, fill = method, group = method)) +
geom_line(aes(y = estimate__)) +
geom_ribbon(aes(ymin = lower__, ymax = upper__), color = NA, alpha = 0.2)

How to use characters in variables summing in R?

I have a data frame. Here is a small example:
a <- rnorm(100, 5, 2)
b <- rnorm(100, 10, 3)
c <- rnorm(100, 15, 4)
df <- data.frame(a, b, c)
And I have a character variable vect <- "c('a','b')"
When I try to calculate the sum of the variables using the command
df$d <- df[vect]
which should be equivalent to
df$d <- df[c('a','b')]
I get an error instead:
Error in `[.data.frame`(df, vect) : undefined columns selected
Your assumption that
vect <- "c('a','b')"
df$d <- df[vect]
is equivalent to
df$d <- df[c('a','b')]
is incorrect.
As @Karthik points out, you should remove the outer quotation marks in the assignment to vect, so that it is a character vector rather than a single string.
However, from your question it sounds like you then want to sum the elements specified in vect and assign the result to d. To do this you need to change your code slightly:
vect <- c('a','b')
df$d <- apply(X = df[vect], MARGIN = 1, FUN = sum)
This computes the sum across the columns of df specified by vect. MARGIN = 1 specifies that sum is applied row-wise rather than column-wise.
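If only the row sums are needed, base R's rowSums() is a vectorised alternative to apply(). A small sketch using the same kind of made-up data as the question:

```r
set.seed(1)
df <- data.frame(a = rnorm(100, 5, 2),
                 b = rnorm(100, 10, 3),
                 c = rnorm(100, 15, 4))

vect <- c("a", "b")          # a character vector of column names, not a string

# Sum the selected columns within each row
df$d <- rowSums(df[vect])
head(df)
```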
EDIT:
As @ThomasIsCoding points out below, if for some reason vect has to be a string, you can parse the string into an R expression using str2lang and then evaluate it:
vect <- "c('a','b')"
parsed_vect <- eval(str2lang(vect))
df$d <- apply(X = df[parsed_vect], MARGIN = 1, FUN = sum)
Perhaps you can try
> df[eval(str2lang(vect))]
a b
1 8.1588519 9.0617818
2 3.9361214 13.2752377
3 5.5370983 8.8739725
4 8.4542050 8.5704234
5 3.9044461 13.2642793
6 5.6679639 12.9529061
7 4.0183808 6.4746806
8 3.6415608 11.0308990
9 4.5237453 7.3255129
10 6.9379168 9.4594150
11 5.1557935 11.6776181
12 2.3829337 3.5170335
13 4.3556430 7.9706624
14 7.3274615 8.1852829
15 -0.5650641 2.8109197
16 7.1742283 6.8161200
17 3.3412044 11.6298940
18 2.5388981 10.1289533
19 3.8845686 14.1517643
20 2.4431608 6.8374837
21 4.8731053 12.7258259
22 6.9534912 6.5069513
23 4.4394807 14.5320225
24 2.0427553 12.1786148
25 7.1563978 11.9671603
26 2.4231207 6.1801862
27 6.5830372 0.9814878
28 2.5443326 9.8774632
29 1.1260322 9.4804636
30 4.0078436 12.9909014
31 9.3599808 12.2178596
32 3.5362245 8.6758910
33 4.6462337 8.6647953
34 2.0698037 7.2750532
35 7.0727970 8.9386798
36 4.8465248 8.0565347
37 5.6084462 7.5676308
38 6.7617479 9.5357666
39 5.2138482 13.6822924
40 3.6259103 13.8659939
41 5.8586547 6.5087016
42 4.3490281 9.5367522
43 7.5130701 8.1699117
44 3.7933813 9.3241308
45 4.9466813 9.4432584
46 -0.3730035 6.4695187
47 2.0646458 10.6511916
48 4.6027309 4.9207746
49 5.9919348 7.1946723
50 6.0148330 13.4702419
51 5.5354452 9.0193366
52 5.2621651 12.8856488
53 6.8580210 6.3526151
54 8.0812166 14.4659778
55 3.6039030 5.9857886
56 9.8548553 15.9081336
57 3.3675037 14.7207681
58 3.9935336 14.3186175
59 3.4308085 10.6024579
60 3.9609624 6.6595521
61 4.2358603 10.6600581
62 5.1791856 9.3241118
63 4.6976289 13.2833055
64 5.1868906 7.1323826
65 3.1810915 12.8402472
66 6.0258287 9.3805249
67 5.3768112 6.3805096
68 5.7072092 7.1130150
69 6.5789349 8.0092541
70 5.3175820 17.3377234
71 9.7706112 10.8648956
72 5.2332127 12.3418373
73 4.7626124 13.8816910
74 3.9395911 6.5270785
75 6.4394724 10.6344965
76 2.6803695 10.4501753
77 3.5577834 8.2323369
78 5.8431140 7.7932460
79 2.8596818 8.9581837
80 2.7365174 10.2902512
81 4.7560973 6.4555758
82 4.6519084 8.9786777
83 4.9467471 11.2818536
84 5.6167284 5.2641380
85 9.4700525 2.9904731
86 4.7392906 11.3572521
87 3.1221908 6.3881556
88 5.6949432 7.4518023
89 5.1435241 10.8912283
90 2.1628966 10.5080671
91 3.6380837 15.0594135
92 5.3434709 7.4034042
93 -0.1298439 0.4832707
94 7.8759390 2.7411723
95 2.0898649 9.7687250
96 4.2131549 9.3175228
97 5.0648105 11.3943350
98 7.7225193 11.4180456
99 3.1018895 12.8890257
100 4.4166832 10.4901303

Calculate mean value for each row with interval

I need to calculate the mean value for each row (the mean of each interval). Here is a basic example (maybe someone has an even better idea of how to do it):
M_31_mb <- (15 : -15) # creating a vector value --> small
M_31 <- cut(M_31_mb, 128) # getting 128 groups from the small vector
#M_1_mb <- (1500 : -1500)#creating a vector value
#M_1 <- cut(M_1_mb,128)# getting 128 groups from the vector
I need to get the mean value for each row/group out of the 128 intervals created in M_1 (actually I do not even need those intervals, I just need their means), and I cannot figure out how to do it.
I had a look at the cut2 function from the Hmisc library, but unfortunately there is no option to set the number of intervals into which the vector is cut (there is, however, an option to get the mean value of the created intervals: levels.mean).
I would appreciate any help! Thanks!
Additional Info:
The cut2 function works well for bigger vectors (M_1_mb); however, when my vector is small (M_31_mb), I get a warning message:
Warning message:
In min(xx[xx > upper]) : no non-missing arguments to min; returning Inf
and only 31 groups are created:
M_31_mb <- (15 : -15) # smaller vector
M_31 <- table(cut2(M_31_mb,g=128,levels.mean = TRUE))
where g is the number of quantile groups.
like this?
aggregate(M_1_mb,by=list(M_1),mean)
EDIT: Result
Group.1 x
1 (-1.5e+03,-1.48e+03] -1488.5
2 (-1.48e+03,-1.45e+03] -1465.0
3 (-1.45e+03,-1.43e+03] -1441.5
4 (-1.43e+03,-1.41e+03] -1418.0
5 (-1.41e+03,-1.38e+03] -1394.5
6 (-1.38e+03,-1.36e+03] -1371.0
7 (-1.36e+03,-1.34e+03] -1347.5
8 (-1.34e+03,-1.31e+03] -1324.0
9 (-1.31e+03,-1.29e+03] -1301.0
10 (-1.29e+03,-1.27e+03] -1277.5
11 (-1.27e+03,-1.24e+03] -1254.0
12 (-1.24e+03,-1.22e+03] -1230.5
13 (-1.22e+03,-1.2e+03] -1207.0
14 (-1.2e+03,-1.17e+03] -1183.5
15 (-1.17e+03,-1.15e+03] -1160.0
16 (-1.15e+03,-1.12e+03] -1136.5
17 (-1.12e+03,-1.1e+03] -1113.0
18 (-1.1e+03,-1.08e+03] -1090.0
19 (-1.08e+03,-1.05e+03] -1066.5
20 (-1.05e+03,-1.03e+03] -1043.0
21 (-1.03e+03,-1.01e+03] -1019.5
22 (-1.01e+03,-984] -996.0
23 (-984,-961] -972.5
24 (-961,-938] -949.0
25 (-938,-914] -926.0
26 (-914,-891] -902.5
27 (-891,-867] -879.0
28 (-867,-844] -855.5
29 (-844,-820] -832.0
30 (-820,-797] -808.5
31 (-797,-773] -785.0
32 (-773,-750] -761.5
33 (-750,-727] -738.0
34 (-727,-703] -715.0
35 (-703,-680] -691.5
36 (-680,-656] -668.0
37 (-656,-633] -644.5
38 (-633,-609] -621.0
39 (-609,-586] -597.5
40 (-586,-562] -574.0
41 (-562,-539] -551.0
42 (-539,-516] -527.5
43 (-516,-492] -504.0
44 (-492,-469] -480.5
45 (-469,-445] -457.0
46 (-445,-422] -433.5
47 (-422,-398] -410.0
48 (-398,-375] -386.5
49 (-375,-352] -363.0
50 (-352,-328] -340.0
51 (-328,-305] -316.5
52 (-305,-281] -293.0
53 (-281,-258] -269.5
54 (-258,-234] -246.0
55 (-234,-211] -222.5
56 (-211,-188] -199.0
57 (-188,-164] -176.0
58 (-164,-141] -152.5
59 (-141,-117] -129.0
60 (-117,-93.8] -105.5
61 (-93.8,-70.3] -82.0
62 (-70.3,-46.9] -58.5
63 (-46.9,-23.4] -35.0
64 (-23.4,0] -11.5
65 (0,23.4] 12.0
66 (23.4,46.9] 35.0
67 (46.9,70.3] 58.5
68 (70.3,93.8] 82.0
69 (93.8,117] 105.5
70 (117,141] 129.0
71 (141,164] 152.5
72 (164,188] 176.0
73 (188,211] 199.0
74 (211,234] 222.5
75 (234,258] 246.0
76 (258,281] 269.5
77 (281,305] 293.0
78 (305,328] 316.5
79 (328,352] 340.0
80 (352,375] 363.5
81 (375,398] 387.0
82 (398,422] 410.0
83 (422,445] 433.5
84 (445,469] 457.0
85 (469,492] 480.5
86 (492,516] 504.0
87 (516,539] 527.5
88 (539,562] 551.0
89 (562,586] 574.0
90 (586,609] 597.5
91 (609,633] 621.0
92 (633,656] 644.5
93 (656,680] 668.0
94 (680,703] 691.5
95 (703,727] 715.0
96 (727,750] 738.5
97 (750,773] 762.0
98 (773,797] 785.0
99 (797,820] 808.5
100 (820,844] 832.0
101 (844,867] 855.5
102 (867,891] 879.0
103 (891,914] 902.5
104 (914,938] 926.0
105 (938,961] 949.0
106 (961,984] 972.5
107 (984,1.01e+03] 996.0
108 (1.01e+03,1.03e+03] 1019.5
109 (1.03e+03,1.05e+03] 1043.0
110 (1.05e+03,1.08e+03] 1066.5
111 (1.08e+03,1.1e+03] 1090.0
112 (1.1e+03,1.12e+03] 1113.5
113 (1.12e+03,1.15e+03] 1137.0
114 (1.15e+03,1.17e+03] 1160.0
115 (1.17e+03,1.2e+03] 1183.5
116 (1.2e+03,1.22e+03] 1207.0
117 (1.22e+03,1.24e+03] 1230.5
118 (1.24e+03,1.27e+03] 1254.0
119 (1.27e+03,1.29e+03] 1277.5
120 (1.29e+03,1.31e+03] 1301.0
121 (1.31e+03,1.34e+03] 1324.0
122 (1.34e+03,1.36e+03] 1347.5
123 (1.36e+03,1.38e+03] 1371.0
124 (1.38e+03,1.41e+03] 1394.5
125 (1.41e+03,1.43e+03] 1418.0
126 (1.43e+03,1.45e+03] 1441.5
127 (1.45e+03,1.48e+03] 1465.0
128 (1.48e+03,1.5e+03] 1488.5
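A self-contained sketch of the same idea, using tapply() instead of aggregate() so that the result is a plain named vector. Variable names other than M_1_mb and M_1 are illustrative.

```r
M_1_mb <- 1500:-1500            # the larger vector from the question
M_1 <- cut(M_1_mb, 128)         # factor assigning each value to one of 128 intervals

# Mean of the values falling in each interval, named by interval label
interval_means <- tapply(M_1_mb, M_1, mean)
head(interval_means, 3)

# For a short vector such as 15:-15 most of the 128 intervals are empty,
# and tapply() reports NA for them.
small_means <- tapply(15:-15, cut(15:-15, 128), mean)
sum(!is.na(small_means))        # number of non-empty intervals
```

Dropping the NA entries with small_means[!is.na(small_means)] leaves only the intervals that actually contain data.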

Loop Linear Regression

As a beginner in R I have a (probably) simple question.
I have a linear regression with this specification:
X1 = X1_t-h + X2_t-h
where h is equal to 1, 2, 3, 4, 5.
For example, when h=1 I run this code:
Modelo11 <- dynlm(X1 ~ L(X1,1) + L(X2, 1)-1, data = GDP)
It's a simple regression.
I want to implement a function that gives me the five linear regressions (h = 1, 2, 3, 4 and 5) with and without HAC (heteroscedasticity and autocorrelation consistent) estimation:
I tried this, and it didn't work:
for(h in 1:5){
Modelo1[h] <- dynlm(GDPTrimestralemT ~ L(SpreademT,h) + L(GDPTrimestralemT, h)-1, data = MatrizDadosUS)
coeftest(Modelo1[h], df = Inf, vcov = parzenHAC)
return(list(summary(Modelo1[h])))
}
One of the error messages is:
number of items to replace is not a multiple of replacement length
This is my data.frame:
GDP <- data.frame(data )
GDP
X1 X2
1 0.542952690 0.226341364
2 0.102328393 0.743360185
3 0.166345969 0.186533485
4 1.406733422 1.392420181
5 -0.469811005 -0.114609464
6 -0.509268267 0.687555461
7 1.470439930 0.298655018
8 1.046456428 -1.056387597
9 -0.492462197 -0.530284962
10 -0.516065519 0.645957530
11 0.624638996 1.044731264
12 0.213616470 -1.652979785
13 0.669747432 1.398602289
14 0.552089131 -0.821013792
15 0.452715216 1.420094663
16 -0.892063248 -1.436600779
17 1.429284965 0.559738610
18 0.853740565 -0.898976767
19 0.741864168 1.352012831
20 0.171494650 1.704764705
21 0.422326351 -0.267064235
22 -1.261643503 -2.090694608
23 -1.321086283 -0.273954212
24 0.365226000 1.965167113
25 -0.080888690 -0.594498893
26 -0.183293801 -0.483053404
27 -1.033792032 0.586491772
28 0.718322432 1.776210145
29 -2.822693790 -0.731509917
30 -1.251740437 -1.918124078
31 1.184256949 -0.016548037
32 2.255202675 0.303438286
33 -0.930446147 0.803126180
34 -1.691383225 -0.157839283
35 -1.081643279 -0.006652717
36 1.034162006 -1.970063305
37 -0.716827488 0.306792930
38 0.098471514 0.338333164
39 0.343536547 0.389775011
40 1.442117465 -0.668885360
41 0.095131066 -0.298356861
42 0.222524607 0.291485267
43 -0.499969717 1.308312472
44 0.588162304 0.026539575
45 0.581215173 0.167710855
46 0.629343124 -0.052835206
47 0.811618963 0.716913172
48 1.463610069 -0.356369304
49 -2.000576321 1.226446201
50 1.278233553 0.313606888
51 -0.700373666 0.770273988
52 -1.206455648 0.344628878
53 0.024602262 1.001621886
54 0.858933385 -0.865771777
55 -1.592291995 -0.384908852
56 -0.833758365 -1.184682199
57 -0.281305858 2.070391729
58 -0.122848757 -0.308397782
59 -0.661013984 1.590741535
60 1.887869805 -1.240283364
61 -0.313677463 -1.393252994
62 1.142864110 -1.150916732
63 -0.633380499 -0.223923970
64 -0.158729527 -1.245647224
65 0.928619010 -1.050636078
66 0.424317087 0.593892028
67 1.108704956 -1.792833100
68 -1.338231248 1.138684394
69 -0.647492569 0.181495183
70 0.295906675 -0.101823172
71 -0.079827607 0.825158278
72 0.050353111 -0.448453121
73 0.129068772 0.205619797
74 -0.221450137 0.051349511
75 -1.300967949 1.639063824
76 -0.861963677 1.273104220
77 -1.691001610 0.746514122
78 0.365888734 -0.055308006
79 1.297349754 1.146102001
80 -0.652382297 -1.095031447
81 0.165682952 -0.012926971
82 0.127996446 0.510673745
83 0.338743162 -3.141650682
84 -0.266916587 -2.483389321
85 0.148135154 -1.239997153
86 1.256591385 0.051984536
87 -0.646281986 0.468210275
88 0.180472423 0.393014848
89 0.231892902 -0.545305005
90 -0.709986273 0.104969765
91 1.231712844 -1.703489840
92 0.435378714 0.876505107
93 -1.880394798 -0.885893722
94 1.083580732 0.117560662
95 -0.499072654 -1.039222894
96 1.850756855 -1.308752222
97 1.653952857 0.440405804
98 -1.057618294 -1.611779530
99 -0.021821282 -0.807071503
100 0.682923562 -2.358596342
101 -1.132293845 -1.488806929
102 0.319237353 0.706203968
103 -2.393105781 -1.562111727
104 0.188653972 -0.637073832
105 0.667003685 0.047694037
106 -0.534018861 1.366826933
107 -2.240330371 -0.071797320
108 -0.220633546 1.612879694
109 -0.022442941 1.172582601
110 -1.542418139 0.635161458
111 -0.684128812 -0.334973482
112 0.688849615 0.056557966
113 0.848602803 0.785297518
114 -0.874157558 -0.434518305
115 -0.404999060 -0.078893114
116 0.735896917 1.637873669
117 -0.174398836 0.542952690
118 0.222418628 0.102328393
119 0.419461884 0.166345969
120 -0.042602368 1.406733422
121 2.135670836 -0.469811005
122 1.197644287 -0.509268267
123 0.395951293 1.470439930
124 0.141327444 1.046456428
125 0.691575897 -0.492462197
126 -0.490708151 -0.516065519
127 -0.358903359 0.624638996
128 -0.227550909 0.213616470
129 -0.766692832 0.669747432
130 -0.001690915 0.552089131
131 -1.786701123 0.452715216
132 -1.251495762 -0.892063248
133 1.123462446 1.429284965
134 0.237862653 0.853740565
Thanks.
Your variable Modelo1 is a vector, which cannot store lm objects. If Modelo1 is a list, it should work.
library(dynlm)
df <- data.frame(rnorm(50), rnorm(50))
names(df) <- c("a", "b")
# Use a list so that each element can hold a whole fitted model
models <- list()
for (h in 1:5) {
  models[[h]] <- dynlm(a ~ L(a, h) + L(b, h) - 1, data = df)
}
To get the summary you have to access the single list elements. For example:
summary(models[[1]])
*edit in response to Richard Scriven's comment
The most efficient way to get all summaries would be:
lapply(models, summary)
This applies the summary function to each element of the list and returns a list with the results.
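The same list-based pattern can be sketched without dynlm, using base lm() on manually lagged columns. This is an illustration under stated assumptions, not the original setup: the data are simulated, lag_vec() is a hypothetical helper, and the HAC step (lmtest::coeftest with a HAC covariance estimator) is left out to keep the example free of extra packages.

```r
set.seed(42)
# Simulated stand-in for the GDP data frame (hypothetical values).
gdp <- data.frame(X1 = rnorm(60), X2 = rnorm(60))

# Hypothetical helper: shift a vector back by h periods, padding with NA.
lag_vec <- function(x, h) c(rep(NA, h), head(x, -h))

# A list can hold one complete fitted model per element.
models <- vector("list", 5)
for (h in 1:5) {
  d <- data.frame(y      = gdp$X1,
                  x1_lag = lag_vec(gdp$X1, h),
                  x2_lag = lag_vec(gdp$X2, h))
  # "- 1" drops the intercept, as in the original specification;
  # lm() omits the first h rows because their lagged values are NA.
  models[[h]] <- lm(y ~ x1_lag + x2_lag - 1, data = d)
}

# One summary per lag length
summaries <- lapply(models, summary)
```

Note the double brackets in models[[h]]: assigning with single brackets tries to spread the components of the fitted model over one slot, which is what produces the "number of items to replace" message.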

how do I select points in a dataset above x% contour of a density map?

I have a matrix of data (see below) and I am trying to turn it into a density contour map (Can1 and Can2 variables), maybe with ks or sm packages.
My question is how do I select those points in the dataset which lie above (say) 80% contour of the density map?
Thanks
ID Can1 Can2
4 -12.3235137 -1.0788867664
1 -12.2949912 -0.9321009837
5 -12.2835123 -1.0164225574
2 -12.2571822 -0.7094457036
3 -12.2713779 -0.9908419863
10 -12.9870438 -1.0936405526
6 -12.7167605 -1.4620772026
7 -12.8193776 -1.0911349785
8 -12.9781963 -1.1762698594
9 -12.7983478 -1.3453369581
13 -14.0389948 0.2855210115
11 -14.0015922 0.1467552738
15 -14.0723604 0.0244576488
14 -14.0743560 0.1417245145
12 -13.9898266 0.0005437008
20 -6.5881994 0.5124980991
17 -6.1812321 0.6789584579
16 -6.4704200 0.5942317307
18 -6.6960456 0.5720874622
19 -6.1159788 0.5960966790
22 -2.4794887 2.5493267897
24 -2.4918040 2.7823374576
21 -2.5145044 2.5877290160
23 -2.5048371 2.4916280770
25 -2.5018765 2.8536302559
29 -0.1781852 2.0805229401
26 -0.1581308 2.0151355747
28 -0.2118605 1.9658284615
27 -0.4184119 2.0540218901
30 -0.2994573 2.0205573385
35 2.6254869 1.3858705991
31 2.3146430 1.3510499304
33 2.5346138 1.2524229847
34 2.3741699 1.3842499455
32 2.6008389 1.3446707509
37 3.0920503 1.5807032840
38 3.1559727 1.4924092104
36 3.1593556 1.5803284343
39 3.0801444 1.6031732981
40 3.2562384 1.5810975265
43 4.8414364 2.1539254215
41 4.7938193 2.1613978258
44 4.7919209 2.2151527426
42 4.9830802 2.2374622446
45 4.7629268 2.4217335005
46 5.5631728 0.9986762598
50 5.5250403 1.0549399894
48 5.5833619 1.1368625963
47 5.5660312 1.1881215490
49 5.6224256 1.1634998303
53 5.5536366 0.2513665533
54 5.5276808 0.2685455911
51 5.7103045 0.2193839293
52 5.6014729 0.2353172964
55 5.5959034 0.2447836618
56 5.1542133 0.6070006863
59 5.0043394 0.4518710615
58 5.2314146 0.5656457888
60 5.1318728 0.4771275341
57 5.3599822 0.4918185651
61 7.0235173 -0.2669136870
63 7.0216315 -0.0097862523
64 7.0521253 -0.2457722410
62 7.0150637 -0.1456269078
65 7.0729018 -0.3573952321
69 5.8115406 -1.4652084167
67 5.7624475 -1.4147564126
68 5.8692888 -1.4695783153
70 5.9088094 -1.4927034632
66 5.8400205 -1.4817447808
71 4.8586107 -1.3111515744
73 4.7198564 -1.2891991780
72 4.9153659 -1.4499710448
74 4.7653488 -1.2839433419
75 4.7754971 -1.4655359108
77 3.8955675 -7.0922887151
78 3.8338151 -7.1595858283
80 3.7255063 -7.2147373050
79 3.7367055 -7.3468877516
76 4.0166957 -7.1952570639
Calculate the 80% point. One way: y <- x[x > 0.8 * max(x)], where x holds the density values (I'm assuming you wanted 80% of the maximum level, not the 80th percentile).
Then plot y.
After a bit of searching I think it can be achieved using the kde2d function from the MASS package.
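A rough sketch of how kde2d could be used for this, on simulated stand-in data rather than the Can1/Can2 values above. Because "above the 80% contour" is ambiguous, two interpretations are shown; the density at each point is approximated by the value of the grid cell it falls in, which is an assumption of this sketch and not the only way to do the lookup.

```r
library(MASS)  # provides kde2d()

set.seed(1)
# Hypothetical stand-in for the Can1/Can2 columns
pts <- data.frame(Can1 = rnorm(200), Can2 = rnorm(200))

# 2-D kernel density estimate on a 100 x 100 grid
dens <- kde2d(pts$Can1, pts$Can2, n = 100)

# Approximate density at each point from the grid cell it falls in
ix <- findInterval(pts$Can1, dens$x, all.inside = TRUE)
iy <- findInterval(pts$Can2, dens$y, all.inside = TRUE)
pts$dens <- dens$z[cbind(ix, iy)]

# Interpretation 1: density at least 80% of the peak density
above_peak <- pts[pts$dens >= 0.8 * max(dens$z), ]

# Interpretation 2: the 80% of points with the highest density
above_quantile <- pts[pts$dens >= quantile(pts$dens, 0.2), ]
```

A finer grid (larger n) or interpolating between grid cells would make the per-point density more accurate.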
