I am using this code to plot my data frame. The statistic variable contains two kinds of values: "mean" and "sd".
ggplot(NDVIdf_forplot, aes(x = statistic, y= value, group = ID)) + geom_line()
If I use that code, the graph includes both the "mean" and "sd" categories. I want to use only the observations in the "mean" class of the statistic variable, and later use the "sd" class to draw geom_errorbar.
I used this code before but did not manage to create what I want:
ggplot(NDVIdf_forplot,aes(x=mean,y=value))+geom_errorbar(aes(ymin=NDVI_mean-NDVI_sd, ymax=NDVI_mean+NDVI_sd), width=0.1)+geom_line()+geom_point()
edit ---
The data I want to plot look like this (I'm showing only the top rows). The idea is to use NDVI_mean to create the lines and NDVI_sd to create the error bars on the same graph.
> NDVIdf_forplot
ID statistic value
1 1 NDVI_mean 0.052957208
2 2 NDVI_mean 0.044501794
3 3 NDVI_mean 0.077902512
4 4 NDVI_mean 0.141576609
5 5 NDVI_mean 0.653835647
6 6 NDVI_mean 0.716164870
7 7 NDVI_mean 0.386612348
8 8 NDVI_mean 0.486527816
9 9 NDVI_mean 0.226190208
10 10 NDVI_mean 0.573239754
11 1 NDVI_sd 0.008259909
12 2 NDVI_sd 0.015453091
13 3 NDVI_sd 0.099944407
14 4 NDVI_sd 0.091479545
15 5 NDVI_sd 0.223150965
16 6 NDVI_sd 0.074045394
17 7 NDVI_sd 0.058177949
18 8 NDVI_sd 0.109762451
19 9 NDVI_sd 0.019822312
20 10 NDVI_sd 0.104795771
21 1 NDVI_mean.1 0.081417705
22 2 NDVI_mean.1 0.036114126
23 3 NDVI_mean.1 0.037729680
24 4 NDVI_mean.1 0.016398037
25 5 NDVI_mean.1 0.052672604
26 6 NDVI_mean.1 0.024580946
27 7 NDVI_mean.1 0.064811390
28 8 NDVI_mean.1 0.119724256
29 9 NDVI_mean.1 0.078961665
30 10 NDVI_mean.1 0.041025489
31 1 NDVI_sd.1 0.016093458
32 2 NDVI_sd.1 0.027927592
33 3 NDVI_sd.1 0.046937888
34 4 NDVI_sd.1 0.011805721
35 5 NDVI_sd.1 0.026467984
36 6 NDVI_sd.1 0.028896611
37 7 NDVI_sd.1 0.016313583
38 8 NDVI_sd.1 0.066647683
39 9 NDVI_sd.1 0.022800589
40 10 NDVI_sd.1 0.015085673
41 1 NDVI_mean.2 0.063375514
42 2 NDVI_mean.2 0.086191853
43 3 NDVI_mean.2 0.092580942
44 4 NDVI_mean.2 0.144053635
45 5 NDVI_mean.2 0.696155509
46 6 NDVI_mean.2 0.252707792
47 7 NDVI_mean.2 0.144636380
48 8 NDVI_mean.2 0.757321462
49 9 NDVI_mean.2 0.689617575
50 10 NDVI_mean.2 0.179591653
51 1 NDVI_sd.2 0.010017152
52 2 NDVI_sd.2 0.023206464
53 3 NDVI_sd.2 0.106580902
54 4 NDVI_sd.2 0.097440674
55 5 NDVI_sd.2 0.231063744
56 6 NDVI_sd.2 0.043961963
57 7 NDVI_sd.2 0.010335935
58 8 NDVI_sd.2 0.061841114
59 9 NDVI_sd.2 0.048363788
60 10 NDVI_sd.2 0.111704779
61 1 NDVI_mean.3 0.048932939
62 2 NDVI_mean.3 0.110942174
63 3 NDVI_mean.3 0.080362752
64 4 NDVI_mean.3 0.132868790
65 5 NDVI_mean.3 0.682639604
66 6 NDVI_mean.3 0.503766225
67 7 NDVI_mean.3 0.120794820
68 8 NDVI_mean.3 0.777808416
69 9 NDVI_mean.3 0.755741184
70 10 NDVI_mean.3 0.058089687
71 1 NDVI_sd.3 0.009048781
72 2 NDVI_sd.3 0.029528930
73 3 NDVI_sd.3 0.098454753
74 4 NDVI_sd.3 0.089512544
75 5 NDVI_sd.3 0.241257647
76 6 NDVI_sd.3 0.114466677
77 7 NDVI_sd.3 0.013347437
78 8 NDVI_sd.3 0.066441491
79 9 NDVI_sd.3 0.065787691
80 10 NDVI_sd.3 0.013351357
This image shows the plot I get so far. As you can see, both NDVI_mean and NDVI_sd are drawn as lines, which is not what I want: NDVI_sd should only be used to produce geom_errorbar.
Code:
# Transform data
# Here we make table with three columns (ID Mean SD)
pd <- reshape2::dcast(NDVIdf_forplot, ID ~ statistic, value.var = "value")
# Plot data using ggplot2
library(ggplot2)
ggplot(pd, aes(ID, NDVI_mean)) +
  geom_point() +
  geom_line() +
  geom_errorbar(aes(ymin = NDVI_mean - NDVI_sd,
                    ymax = NDVI_mean + NDVI_sd))
Result plot:
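For readers without reshape2, the same idea can be sketched in base R: keep the "mean" rows and the "sd" rows separately, then join them on ID so each row carries both values. This is only an illustrative sketch using the first three IDs from the question; the column names value_mean, value_sd, ymin and ymax are my own choices, not something from the original post.

```r
# First three IDs from the question, in the original long layout
long <- data.frame(
  ID = rep(1:3, times = 2),
  statistic = rep(c("NDVI_mean", "NDVI_sd"), each = 3),
  value = c(0.052957208, 0.044501794, 0.077902512,
            0.008259909, 0.015453091, 0.099944407)
)

# Keep only one class of the statistic variable at a time
means <- subset(long, statistic == "NDVI_mean", select = c(ID, value))
sds   <- subset(long, statistic == "NDVI_sd",   select = c(ID, value))

# Join on ID; the clashing "value" columns get the suffixes below
wide <- merge(means, sds, by = "ID", suffixes = c("_mean", "_sd"))

# Error-bar limits, ready for geom_errorbar(aes(ymin = ymin, ymax = ymax))
wide$ymin <- wide$value_mean - wide$value_sd
wide$ymax <- wide$value_mean + wide$value_sd
print(wide)
```

The resulting `wide` data frame can be passed to ggplot() in the same way as `pd` above, with value_mean on the y axis.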
This question already has answers here:
How to fill with different colors between two lines? (originally: fill geom_polygon with different colors above and below y = 0 (or any other value)?)
I have this df
x acc
1 1902-01-01 0.782887804
2 1903-01-01 -0.003144199
3 1904-01-01 0.100006276
4 1905-01-01 0.326173392
5 1906-01-01 1.285114692
6 1907-01-01 2.844399973
7 1920-01-01 -0.300232190
8 1921-01-01 1.464389342
9 1922-01-01 0.142638653
10 1923-01-01 -0.020162385
11 1924-01-01 0.361928571
12 1925-01-01 0.616325588
13 1926-01-01 -0.108206003
14 1927-01-01 -0.318441954
15 1928-01-01 -0.267884586
16 1929-01-01 -0.022473777
17 1930-01-01 -0.294452983
18 1931-01-01 -0.654927109
19 1932-01-01 -0.263508341
20 1933-01-01 0.622530992
21 1934-01-01 1.009666043
22 1935-01-01 0.675484421
23 1936-01-01 1.209162008
24 1937-01-01 1.655280986
25 1948-01-01 2.080021785
26 1949-01-01 0.854572563
27 1950-01-01 0.997540963
28 1951-01-01 1.000244163
29 1952-01-01 0.958322941
30 1953-01-01 0.816259474
31 1954-01-01 0.814488644
32 1955-01-01 1.233694537
33 1958-01-01 0.460120970
34 1959-01-01 0.344201474
35 1960-01-01 1.601430139
36 1961-01-01 0.387850967
37 1962-01-01 -0.385954401
38 1963-01-01 0.699355708
39 1964-01-01 0.084519926
40 1965-01-01 0.708964572
41 1966-01-01 1.456280443
42 1967-01-01 1.479412638
43 1968-01-01 1.199000726
44 1969-01-01 0.282942042
45 1970-01-01 -0.181724504
46 1971-01-01 0.012170186
47 1972-01-01 -0.095891043
48 1973-01-01 -0.075384446
49 1974-01-01 -0.156668145
50 1975-01-01 -0.303023258
51 1976-01-01 -0.516027310
52 1977-01-01 -0.826791524
53 1980-01-01 -0.947112221
54 1981-01-01 -1.634878300
55 1982-01-01 -1.955298323
56 1987-01-01 -1.854447550
57 1988-01-01 -1.458955443
58 1989-01-01 -1.256102245
59 1990-01-01 -0.864108585
60 1991-01-01 -1.293373024
61 1992-01-01 -1.049530431
62 1993-01-01 -1.002526230
63 1994-01-01 -0.868783614
64 1995-01-01 -1.081858981
65 1996-01-01 -1.302103374
66 1997-01-01 -1.288048194
67 1998-01-01 -1.455750340
68 1999-01-01 -1.015467069
69 2000-01-01 -0.682789640
70 2001-01-01 -0.811058004
71 2002-01-01 -0.972374057
72 2003-01-01 -0.536505225
73 2004-01-01 -0.518686263
74 2005-01-01 -0.976298621
75 2006-01-01 -0.946429713
I would like to plot the data like this:
where the x axis shows column x of df and the y axis shows column acc.
Is it possible to do this with ggplot?
I tried this code:
ggplot(df,aes(x=x,y=acc))+
geom_linerange(data =df , aes(colour = ifelse(acc <0, "blue", "red")),ymin=min(df),ymax=max(cdf))
but the result is this:
How can I do it?
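Is this what you want? I'm not sure. Note that the helper columns have to be created before the ggplot() call.
library(ggplot2)
library(lubridate)  # provides year()
df$x <- year(as.Date(df$x))
df$ystart <- 0
df$col <- ifelse(df$acc >= 0, "blue", "red")
ggplot(data = df, mapping = aes(x, acc)) +
  geom_segment(mapping = aes(x = x, y = ystart, xend = x, yend = acc, color = col)) +
  scale_color_identity()  # draw the literal colour names stored in col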
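If lubridate is not available, the helper columns can also be prepared with base R alone. The following is a small sketch on four rows taken from the question; the column name `year` is my own choice, and the colour assignment simply mirrors the answer above.

```r
# Four rows copied from the question's data
df <- data.frame(
  x = c("1902-01-01", "1903-01-01", "1904-01-01", "1927-01-01"),
  acc = c(0.782887804, -0.003144199, 0.100006276, -0.318441954)
)

# Extract the year without lubridate
df$year <- as.integer(format(as.Date(df$x), "%Y"))

# Segments start at zero and are coloured by the sign of acc
df$ystart <- 0
df$col <- ifelse(df$acc >= 0, "blue", "red")
print(df)
```

These columns plug into geom_segment() exactly as in the answer, with `year` taking the place of the overwritten `x`.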
I am trying to move from long format data to wide format in order to do some correlation analyses.
But dcast seems to create two rows for the first subject and splits the data across those two rows, filling the resulting empty cells with NA.
The first 2 subjects were being duplicated when I was using alphanumeric subject codes. I switched to numeric subject numbers, and that got it down to only the first subject being duplicated.
the first few lines of the long format data frame:
Subject Age Gender R_PTA L_PTA BE_PTA Avg_PTA L_Aided_SII R_Aided_SII Best_Aided_SII L_Unaided_SII R_Unaided_SII Best_Unaided_SII L_SII_Diff R_SII_Diff
1 1 74 M 48.33 53.33 48.33 50.83 31 42 42 14 25 25 17 17
2 2 77 F 36.67 36.67 36.67 36.67 73 67 73 44 43 44 29 24
3 3 72 F 45.00 41.67 41.67 43.33 42 34 42 35 28 35 7 6
4 4 66 F 36.67 36.67 36.67 36.67 66 76 76 44 44 44 22 32
5 5 38 F 41.67 46.67 41.67 44.17 48 58 58 23 29 29 25 29
6 6 65 M 35.00 43.33 35.00 39.17 46 60 60 32 46 46 14 14
Best_SII_Diff rSII MoCA_Vis MoCA_Nam MoCA_Attn MoCA_Lang MoCA_Abst MoCA_Del_Rec MoCA_Ori MoCA_Tot PNT Semantic Aided PNT_Prop PNT_Prop_Mod
1 17 -0.4231157 5 3 6 2 2 2 6 26 0.971 0.029 Unaided 0.971 0.983
2 29 1.2739255 3 3 5 0 2 2 5 20 0.954 0.046 Unaided 0.960 0.966
3 7 -1.2777889 4 2 5 2 2 5 6 26 0.966 0.034 Unaided 0.960 0.982
4 32 1.5959701 5 3 6 3 2 5 6 30 0.983 0.017 Unaided 0.983 0.994
5 29 0.9492167 4 2 6 3 1 3 6 25 0.983 0.017 Unaided 0.983 0.994
6 14 -0.2936395 4 2 6 2 2 2 6 24 0.989 0.011 Unaided 0.989 0.994
PNT_S_Wt PNT_P_Wt
1 0.046 0.041
2 0.073 0.033
3 0.045 0.074
4 0.049 0.057
5 0.049 0.057
6 0.049 0.057
Creating varlist:
varlist <- list(colnames(subset(PNT_Data_All2, ,c(18:27,29:33))))
My dcast command:
Data_Wide <- dcast(as.data.table(PNT_Data_All2),Subject + Age + Gender + R_PTA + L_PTA + BE_PTA + Avg_PTA + L_Aided_SII + R_Aided_SII + Best_Aided_SII + L_Unaided_SII + R_Unaided_SII + Best_Unaided_SII + L_SII_Diff + R_SII_Diff + Best_SII_Diff + rSII ~ Aided, value.var=varlist)
The resulting first few lines of the wide format:
Subject Age Gender R_PTA L_PTA BE_PTA Avg_PTA L_Aided_SII R_Aided_SII Best_Aided_SII L_Unaided_SII R_Unaided_SII Best_Unaided_SII L_SII_Diff R_SII_Diff
1: 1 74 M 48.33 53.33 48.33 50.83 31 42 42 14 25 25 17 17
2: 1 74 M 48.33 53.33 48.33 50.83 31 42 42 14 25 25 17 17
3: 2 77 F 36.67 36.67 36.67 36.67 73 67 73 44 43 44 29 24
4: 3 72 F 45.00 41.67 41.67 43.33 42 34 42 35 28 35 7 6
5: 4 66 F 36.67 36.67 36.67 36.67 66 76 76 44 44 44 22 32
6: 5 38 F 41.67 46.67 41.67 44.17 48 58 58 23 29 29 25 29
Notice that Subject 1 has 2 entries. All of the other subjects seem correct.
Is this a problem with my command/arguments? A bug in dcast?
Edit 1: Through the process of elimination, the extra entries only appear when I include the "rSII" variable. This is a variable that is calculated from a previous step in the script:
PNT_Data_All$rSII <- stdres(lm(Best_Aided_SII ~ Best_Unaided_SII, data=PNT_Data_All))
PNT_Data_All <- PNT_Data_All[, colnames(PNT_Data_All)[c(1:17,34,18:33)]]
Is there something about that calculated variable that would mess up dcast for some subjects?
Edit 2 to add my workaround:
I ended up rounding the calculated variable to 3 digits after the decimal and that solved the problem. Everything is casting correctly now with no duplicates.
PNT_Data_All$rSII <- format(round(stdres(lm(Best_Aided_SII ~ Best_Unaided_SII, data=PNT_Data_All)),3),nsmall=3)
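One plausible explanation for why rounding helps (not confirmed in the post) is floating-point noise: if the calculated rSII values for a subject's rows differ in their last bits, they count as different keys on the left-hand side of the cast formula, even though they print identically. The sketch below only illustrates that general effect with two well-known doubles in base R; it does not reproduce the poster's data or call dcast.

```r
# Two numbers that are equal on paper but not as doubles
keys <- c(0.1 + 0.2, 0.3)
print(keys)  # both are displayed as 0.3 at the default precision

# Exact comparison treats them as two distinct values
n_exact <- length(unique(keys))

# After rounding they collapse to a single key
n_rounded <- length(unique(round(keys, 3)))

cat("distinct keys before rounding:", n_exact, "\n")
cat("distinct keys after rounding: ", n_rounded, "\n")
```

Note that the workaround in the post wraps round() in format(), which turns rSII into character strings; round() alone keeps the column numeric, which is usually what a later correlation analysis needs.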
My question is similar to "Fill region between two loess-smoothed lines in R with ggplot".
But I have two groups.
g1<-ggplot(NVIQ_predict,aes(cogn.age, predict, color=as.factor(NVIQ_predict$group)))+
geom_smooth(aes(x = cogn.age, y = upper,group=group),se=F)+
geom_line(aes(linetype = group), size = 0.8)+
geom_smooth(aes(x = cogn.age, y = lower,group=group),se=F)
I want to fill red and blue for each group.
I tried:
gg1 <- ggplot_build(g1)
df2 <- data.frame(x = gg1$data[[1]]$x,
ymin = gg1$data[[1]]$y,
ymax = gg1$data[[3]]$y)
g1 + geom_ribbon(data = df2, aes(x = x, ymin = ymin, ymax = ymax),fill = "grey", alpha = 0.4)
But it gave me the error: Aesthetics must either be length one, or the same length as the data
I get the same error every time my geom_ribbon() data and ggplot() data differ.
Can somebody help me with it? Thank you so much!
My data looks like:
> NVIQ_predict
cogn.age predict upper lower group
1 7 39.04942 86.68497 18.00000 1
2 8 38.34993 82.29627 18.00000 1
3 10 37.05174 74.31657 18.00000 1
4 11 36.45297 70.72421 18.00000 1
5 12 35.88770 67.39555 18.00000 1
6 13 35.35587 64.32920 18.00000 1
7 14 34.85738 61.52322 18.00000 1
8 16 33.95991 56.68024 18.00000 1
9 17 33.56057 54.63537 18.00000 1
10 18 33.19388 52.83504 18.00000 1
11 19 32.85958 51.27380 18.00000 1
12 20 32.55752 49.94791 18.00000 1
13 21 32.28766 48.85631 18.00000 1
14 24 31.67593 47.09206 18.00000 1
15 25 31.53239 46.91136 18.00000 1
16 28 31.28740 48.01764 18.00000 1
17 32 31.36627 50.55201 18.00000 1
18 35 31.73386 53.19630 18.00000 1
19 36 31.91487 54.22624 18.00000 1
20 37 32.13026 55.25721 18.00000 1
21 38 32.38237 56.26713 18.00000 1
22 40 32.98499 58.36229 18.00000 1
23 44 34.59044 62.80187 18.00000 1
24 45 35.06804 64.01951 18.00000 1
25 46 35.57110 65.31888 18.00000 1
26 47 36.09880 66.64696 17.93800 1
27 48 36.72294 67.60053 17.97550 1
28 49 37.39182 68.49995 18.03062 1
29 50 38.10376 69.35728 18.10675 1
30 51 38.85760 70.17693 18.18661 1
31 52 39.65347 70.95875 18.27524 1
32 53 40.49156 71.70261 18.38020 1
33 54 41.35332 72.44006 17.90682 1
34 59 46.37849 74.91802 18.63206 1
35 60 47.53897 75.66218 19.64432 1
36 61 48.74697 76.43933 20.82346 1
37 63 51.30607 78.02426 23.73535 1
38 71 63.43129 86.05467 40.43482 1
39 72 65.15618 87.44794 42.72704 1
40 73 66.92714 88.95324 45.01966 1
41 84 89.42079 114.27939 68.03834 1
42 85 91.73831 117.44007 69.83676 1
43 7 33.69504 54.03695 15.74588 2
44 8 34.99931 53.96500 18.00533 2
45 10 37.61963 54.05684 22.43516 2
46 11 38.93493 54.21969 24.60049 2
47 12 40.25315 54.45963 26.73027 2
48 13 41.57397 54.77581 28.82348 2
49 14 42.89710 55.16727 30.87982 2
50 16 45.54954 56.17193 34.88453 2
51 17 46.87877 56.78325 36.83632 2
52 18 48.21025 57.46656 38.75807 2
53 19 49.54461 58.22266 40.65330 2
54 20 50.88313 59.05509 42.52505 2
55 21 52.22789 59.97318 44.36944 2
56 24 56.24397 63.21832 49.26963 2
57 25 57.55394 64.33850 50.76938 2
58 28 61.45282 68.05043 54.85522 2
59 32 66.44875 72.85234 60.04517 2
60 35 69.96560 76.06171 63.86949 2
61 36 71.09268 77.06821 65.11714 2
62 37 72.19743 78.04559 66.34927 2
63 38 73.28041 78.99518 67.56565 2
64 40 75.37861 80.81593 69.94129 2
65 44 79.29028 84.20275 74.37780 2
66 45 80.20272 85.00888 75.39656 2
67 46 81.08645 85.80180 76.37110 2
68 47 81.93696 86.57689 77.29704 2
69 48 82.75920 87.34100 78.17739 2
70 49 83.55055 88.09165 79.00945 2
71 50 84.30962 88.82357 79.79567 2
72 51 85.03743 89.53669 80.53817 2
73 52 85.73757 90.23223 81.24291 2
74 53 86.41419 90.91607 81.91232 2
75 54 87.05716 91.58632 82.52800 2
76 59 89.75923 94.58218 84.93629 2
77 60 90.18557 95.05573 85.31541 2
78 61 90.58166 95.51469 85.64864 2
79 63 91.27115 96.31107 86.23124 2
80 71 92.40983 98.35031 86.46934 2
81 72 92.36362 98.52258 86.20465 2
82 73 92.27734 98.67161 85.88308 2
83 84 88.66150 98.84699 78.47602 2
84 85 88.08846 98.73625 77.44067 2
Following Gregor's suggestion, I tried inherit.aes = FALSE and the error is gone, but my plot now looks like this:
We've got all the info we need. Now we just need to, ahem, connect the dots ;-)
First the input data:
NVIQ_predict <- read.table(text = "
id cogn.age predict upper lower group
1 7 39.04942 86.68497 18.00000 1
2 8 38.34993 82.29627 18.00000 1
3 10 37.05174 74.31657 18.00000 1
4 11 36.45297 70.72421 18.00000 1
5 12 35.88770 67.39555 18.00000 1
6 13 35.35587 64.32920 18.00000 1
7 14 34.85738 61.52322 18.00000 1
8 16 33.95991 56.68024 18.00000 1
9 17 33.56057 54.63537 18.00000 1
10 18 33.19388 52.83504 18.00000 1
11 19 32.85958 51.27380 18.00000 1
12 20 32.55752 49.94791 18.00000 1
13 21 32.28766 48.85631 18.00000 1
14 24 31.67593 47.09206 18.00000 1
15 25 31.53239 46.91136 18.00000 1
16 28 31.28740 48.01764 18.00000 1
17 32 31.36627 50.55201 18.00000 1
18 35 31.73386 53.19630 18.00000 1
19 36 31.91487 54.22624 18.00000 1
20 37 32.13026 55.25721 18.00000 1
21 38 32.38237 56.26713 18.00000 1
22 40 32.98499 58.36229 18.00000 1
23 44 34.59044 62.80187 18.00000 1
24 45 35.06804 64.01951 18.00000 1
25 46 35.57110 65.31888 18.00000 1
26 47 36.09880 66.64696 17.93800 1
27 48 36.72294 67.60053 17.97550 1
28 49 37.39182 68.49995 18.03062 1
29 50 38.10376 69.35728 18.10675 1
30 51 38.85760 70.17693 18.18661 1
31 52 39.65347 70.95875 18.27524 1
32 53 40.49156 71.70261 18.38020 1
33 54 41.35332 72.44006 17.90682 1
34 59 46.37849 74.91802 18.63206 1
35 60 47.53897 75.66218 19.64432 1
36 61 48.74697 76.43933 20.82346 1
37 63 51.30607 78.02426 23.73535 1
38 71 63.43129 86.05467 40.43482 1
39 72 65.15618 87.44794 42.72704 1
40 73 66.92714 88.95324 45.01966 1
41 84 89.42079 114.27939 68.03834 1
42 85 91.73831 117.44007 69.83676 1
43 7 33.69504 54.03695 15.74588 2
44 8 34.99931 53.96500 18.00533 2
45 10 37.61963 54.05684 22.43516 2
46 11 38.93493 54.21969 24.60049 2
47 12 40.25315 54.45963 26.73027 2
48 13 41.57397 54.77581 28.82348 2
49 14 42.89710 55.16727 30.87982 2
50 16 45.54954 56.17193 34.88453 2
51 17 46.87877 56.78325 36.83632 2
52 18 48.21025 57.46656 38.75807 2
53 19 49.54461 58.22266 40.65330 2
54 20 50.88313 59.05509 42.52505 2
55 21 52.22789 59.97318 44.36944 2
56 24 56.24397 63.21832 49.26963 2
57 25 57.55394 64.33850 50.76938 2
58 28 61.45282 68.05043 54.85522 2
59 32 66.44875 72.85234 60.04517 2
60 35 69.96560 76.06171 63.86949 2
61 36 71.09268 77.06821 65.11714 2
62 37 72.19743 78.04559 66.34927 2
63 38 73.28041 78.99518 67.56565 2
64 40 75.37861 80.81593 69.94129 2
65 44 79.29028 84.20275 74.37780 2
66 45 80.20272 85.00888 75.39656 2
67 46 81.08645 85.80180 76.37110 2
68 47 81.93696 86.57689 77.29704 2
69 48 82.75920 87.34100 78.17739 2
70 49 83.55055 88.09165 79.00945 2
71 50 84.30962 88.82357 79.79567 2
72 51 85.03743 89.53669 80.53817 2
73 52 85.73757 90.23223 81.24291 2
74 53 86.41419 90.91607 81.91232 2
75 54 87.05716 91.58632 82.52800 2
76 59 89.75923 94.58218 84.93629 2
77 60 90.18557 95.05573 85.31541 2
78 61 90.58166 95.51469 85.64864 2
79 63 91.27115 96.31107 86.23124 2
80 71 92.40983 98.35031 86.46934 2
81 72 92.36362 98.52258 86.20465 2
82 73 92.27734 98.67161 85.88308 2
83 84 88.66150 98.84699 78.47602 2
84 85 88.08846 98.73625 77.44067 2", header = TRUE)
NVIQ_predict$id <- NULL
Make sure the group column is a factor variable, so we can use it as a line type.
NVIQ_predict$group <- as.factor(NVIQ_predict$group)
Then build the plot.
library(ggplot2)
g1 <- ggplot(NVIQ_predict, aes(cogn.age, predict, color=group)) +
  geom_smooth(aes(x = cogn.age, y = upper, group=group), method = loess, se = FALSE) +
  geom_smooth(aes(x = cogn.age, y = lower, group=group), method = loess, se = FALSE) +
  geom_line(aes(linetype = group), size = 0.8)
Finally, extract the (x,ymin) and (x,ymax) coordinates of the curves for group 1 as well as group 2. These pairs have identical x-coordinates, so connecting those points mimics shading the areas between both curves. This was explained in Fill region between two loess-smoothed lines in R with ggplot. The only difference here is that we need to be a bit more careful to select and connect the points that belong to the correct curves...
gp <- ggplot_build(g1)
d1 <- gp$data[[1]]
d2 <- gp$data[[2]]
df1 <- data.frame(x = d1[d1$group == 1,]$x,
                  ymin = d2[d2$group == 1,]$y,
                  ymax = d1[d1$group == 1,]$y)
df2 <- data.frame(x = d1[d1$group == 2,]$x,
                  ymin = d2[d2$group == 2,]$y,
                  ymax = d1[d1$group == 2,]$y)
g1 + geom_ribbon(data = df1, aes(x = x, ymin = ymin, ymax = ymax), inherit.aes = FALSE, fill = "grey", alpha = 0.4) +
  geom_ribbon(data = df2, aes(x = x, ymin = ymin, ymax = ymax), inherit.aes = FALSE, fill = "grey", alpha = 0.4)
The result looks like this:
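An alternative that avoids digging into ggplot_build() is to smooth the upper and lower columns yourself and hand the result to geom_ribbon(). The sketch below is a swapped-in technique, not the method used in the answer above: it fits stats::loess() per group on made-up data, evaluates both fits on a shared grid, and maps fill to group so each group can get its own colour. The toy data, the helper name `smooth_band` and the 50-point grid are all assumptions for illustration, and the curves are not guaranteed to match geom_smooth() exactly.

```r
# Made-up data with the same column names as the question
set.seed(1)
ages <- seq(7, 85, by = 3)
toy <- data.frame(
  cogn.age = rep(ages, times = 2),
  group = factor(rep(1:2, each = length(ages)))
)
toy$lower <- 20 + 0.5 * toy$cogn.age + rnorm(nrow(toy))
toy$upper <- toy$lower + 15 + rnorm(nrow(toy))

# Hypothetical helper: loess-smooth lower and upper on a common x grid
smooth_band <- function(d, n = 50) {
  grid <- data.frame(
    cogn.age = seq(min(d$cogn.age), max(d$cogn.age), length.out = n)
  )
  data.frame(
    x = grid$cogn.age,
    ymin = predict(loess(lower ~ cogn.age, data = d), newdata = grid),
    ymax = predict(loess(upper ~ cogn.age, data = d), newdata = grid),
    group = d$group[1]
  )
}

# One band per group, stacked into a single data frame
ribbon <- do.call(rbind, lapply(split(toy, toy$group), smooth_band))

# Plot only if ggplot2 is installed; one fill colour per group
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(ribbon, aes(x = x, ymin = ymin, ymax = ymax, fill = group)) +
    geom_ribbon(alpha = 0.4) +
    scale_fill_manual(values = c("1" = "red", "2" = "blue"))
  print(p)
}
```

Because ymin and ymax share the same x values by construction, there is no need to match up points from two separately built layers.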
This shouldn't be too hard, but I always have issues when trying to run calculations on a column in a data frame that depend on the value of another column in the same data frame. Here is my data.frame:
stream reach length.km length.m total.sa pools.sa
1 Stream Reach_Code 109 109 1 1
2 Brooks BRK_001 17 14 108 13
3 Brooks BRK_002 15 12 99 9
4 Brooks BRK_003 24 21 94 95
5 Brooks BRK_004 32 29 97 33
6 Brooks BRK_005 27 24 92 79
7 Brooks BRK_006 26 23 95 6
8 Brooks BRK_007 16 13 77 15
9 Brooks BRK_008 29 26 84 26
10 Brooks BRK_009 18 15 87 46
11 Brooks BRK_010 23 20 88 47
12 Brooks BRK_011 22 19 91 40
13 Brooks BRK_012 30 27 98 37
14 Brooks BRK_013 25 22 93 29
19 Buncombe_Hollow BNH_0001 7 4 75 65
20 Buncombe_Hollow BNH_0002 8 5 66 21
21 Buncombe_Hollow BNH_0003 9 6 68 53
22 Buncombe_Hollow BNH_0004 19 16 81 11
23 Buncombe_Hollow BNH_0005 6 3 65 27
24 Buncombe_Hollow BNH_0006 13 10 63 23
25 Buncombe_Hollow BNH_0007 12 9 71 57
I would like to calculate the mean of a column (let's say length.m) where stream = Brooks, and then do the same for stream = Buncombe_Hollow. I actually have 17 different stream names and plan to calculate the mean of some column for each stream. I will then store these means as a vector and bind them to another vector of the stream names, so the end result is something like this:
stream truevalue
1 Brooks 0.9440620
2 Siouxon 0.5858527
3 Speelyai 0.5839844
Thanks!
Try using aggregate:
# Generate some data to use
someDf <- data.frame(stream = rep(c("Brooks", "Buncombe_Hollow"), each = 10),
length.m = rpois(20, 4))
# Calculate the means with aggregate
with(someDf, aggregate(list(truevalue = length.m), list(stream = stream), mean))
The reason for the "list" bits is to name the columns explicitly in the (data frame) output.
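The formula interface of aggregate() gives the same kind of result with a little less typing. Here is a minimal sketch on six rows copied from the question (the length.m values of BRK_001 to BRK_003 and BNH_0001 to BNH_0003). Row 1 of the posted data frame (Stream / Reach_Code) looks like a header line that was read in as data, so I have left it out; that reading is my assumption.

```r
# Six rows from the question: three Brooks reaches, three Buncombe_Hollow
streams <- data.frame(
  stream = c("Brooks", "Brooks", "Brooks",
             "Buncombe_Hollow", "Buncombe_Hollow", "Buncombe_Hollow"),
  length.m = c(14, 12, 21, 4, 5, 6)
)

# Mean of length.m within each stream
result <- aggregate(length.m ~ stream, data = streams, FUN = mean)

# Rename the summary column to match the desired output
names(result)[names(result) == "length.m"] <- "truevalue"
print(result)
```

The result is already a two-column data frame of stream names and means, so there is no need to build and bind separate vectors.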
Start using the dplyr package. It makes such calculations quick as well as very easy to write:
library(dplyr)
result <- data %>% group_by(stream) %>% summarize(truevalue = mean(length.m))
I am conducting a network meta-analysis in R with two packages, gemtc and rjags. However, when I run
Model <- mtc.model(network, linearModel = "fixed")
R always returns:
Error in `[.data.frame`(data, sel1 | sel2, columns, drop = FALSE) :
  undefined columns selected
In addition: Warning messages:
1: In mtc.model(network, linearModel = "fixed") :
  Likelihood can not be inferred. Defaulting to normal.
2: In mtc.model(network, linearModel = "fixed") :
  Link can not be inferred. Defaulting to identity
How can I fix this problem? Thanks!
I am attaching my code and data here:
SAE <- read.csv(file.choose(),head=T, sep=",")
head(SAE)
network <- mtc.network(data.ab=SAE)
summary(network)
plot(network)
model.fe <- mtc.model (network, linearModel="fixed")
plot(model.fe)
summary(model.fe)
cat(model.fe$code)
model.fe$data
# run this model
result.fe <- mtc.run(model.fe, n.adapt=0, n.iter=50)
plot(result.fe)
gelman.diag(result.fe)
result.fe <- mtc.run(model.fe, n.adapt=1000, n.iter=5000)
plot(result.fe)
gelman.diag(result.fe)
The following is my data, SAE:
study treatment responder sample.size
1 1 3 0 76
2 1 30 2 72
3 2 3 99 1389
4 2 23 132 1383
5 3 1 6 352
6 3 30 2 178
7 4 2 6 106
8 4 30 3 95
9 5 3 49 393
10 5 25 18 198
11 6 1 20 65
12 6 22 10 26
13 7 1 1 76
14 7 30 3 76
15 8 3 7 441
16 8 26 1 220
17 9 2 1 47
18 9 30 0 41
19 10 3 10 156
20 10 30 9 150
21 11 1 4 85
22 11 25 5 85
23 11 30 4 84
24 12 3 6 152
25 12 30 5 160
26 13 18 4 158
27 13 21 8 158
28 14 1 3 110
29 14 30 2 111
30 15 3 3 83
31 15 30 1 92
32 16 1 3 124
33 16 22 6 123
34 16 30 4 125
35 17 3 236 1553
36 17 23 254 1546
37 18 6 5 398
38 18 7 6 403
39 19 1 64 588
40 19 22 73 584
How about reading the manual ?mtc.model. It clearly states the following:
Required columns [responders, sampleSize]
So your responder variable should be responders and your sample.size variable should be sampleSize.
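The renaming itself is plain base R. Here is a short sketch on the first four rows of the posted data; it only touches the column names and does not call gemtc.

```r
# First four rows of the posted SAE data, with the original column names
SAE <- data.frame(
  study = c(1, 1, 2, 2),
  treatment = c(3, 30, 3, 23),
  responder = c(0, 2, 99, 132),
  sample.size = c(76, 72, 1389, 1383)
)

# Rename to the column names quoted from the manual above
names(SAE)[names(SAE) == "responder"] <- "responders"
names(SAE)[names(SAE) == "sample.size"] <- "sampleSize"
print(names(SAE))
```

After this, the data frame can be passed to mtc.network(data.ab = SAE) as in the question.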
Next, your plot(network) should help you determine that some comparisons cannot be made. In your data, there are separate subgroups of trials that were never compared with each other. Treatments 18 and 21 were not compared with any of the others (and in the rows shown, treatments 6 and 7 from study 18 also appear only with each other, so check those too). Therefore you can only do a meta-analysis of 21 and 18 or a network meta-analysis of the rest.
network <- mtc.network(data.ab=SAE[!SAE$treatment %in% c(21, 18), ])
model.fe <- mtc.model(network, linearModel="fixed")
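If the network plot is hard to read, the same check can be sketched in base R by grouping treatments that are linked through shared studies. This is only an illustration and does not use gemtc; the study and treatment columns are copied from the 40 rows posted in the question, and the label-merging loop is my own simple approach. On those rows it reports three groups: the main network, treatments 6 and 7, and treatments 18 and 21.

```r
# Study and treatment columns copied from the posted data (40 arms)
ab <- data.frame(
  study = rep(1:19, times = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
                              3, 2, 2, 2, 2, 3, 2, 2, 2)),
  treatment = c(3, 30, 3, 23, 1, 30, 2, 30, 3, 25, 1, 22, 1, 30, 3, 26,
                2, 30, 3, 30, 1, 25, 30, 3, 30, 18, 21, 1, 30, 3, 30,
                1, 22, 30, 3, 23, 6, 7, 1, 22)
)

# Every treatment starts in its own group, labelled by its own number
treatments <- sort(unique(ab$treatment))
label <- setNames(treatments, treatments)

# Merge the groups of all treatments that share a study, until stable
repeat {
  old <- label
  for (arms in split(as.character(ab$treatment), ab$study)) {
    label[label %in% label[arms]] <- min(label[arms])
  }
  if (identical(label, old)) break
}

# Treatments grouped by connected component
components <- split(treatments, label)
print(components)
```

Any group other than the largest one has no path to the rest of the network, so its treatments cannot be compared with the others.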