How can I get all model estimates automatically in R?

I know I can calculate a model's estimates by hand, but I'm sure there's a way to get the estimates for all categorical levels automatically. Since I'm working with lmer models, the solution should work for them. Note: I don't want to predict new data; I just want to get all the estimates automatically.
an example:
> model <- lmer(Score ~ Proficiency_c * testType + (1|ID), data = myData, REML = F)
> summary(model)
Fixed effects:
                            Estimate Std. Error       df t value Pr(>|t|)
(Intercept)                   4.8376     0.2803 156.9206  17.259  < 2e-16 ***
Proficiency_c                -1.3381     0.4405 156.9206  -3.038  0.00279 **
testTypeTestB                 0.2088     0.3269 126.0000   0.639  0.52421
testTypeTestC                 0.4638     0.3269 126.0000   1.418  0.15853
Proficiency_c:testTypeTestB   0.5008     0.5138 126.0000   0.975  0.33157
Proficiency_c:testTypeTestC   0.2357     0.5138 126.0000   0.459  0.64727
---
> contrasts(myData$testType)
      TestB TestC
TestA     0     0
TestB     1     0
TestC     0     1
By hand, I would do the following (with x1 = Proficiency_c = 1):
## estimate for Test A:
y = b0 + b1x1 + b2x2 + b3x3 + b4(x1 * x2) + b5(x1 * x3)
y = b0 + b1 * 1 + 0 + 0 + 0
y = b0 + b1
y = 3.49
## estimate for Test B:
y = b0 + b1x1 + b2x2 + b3x3 + b4(x1 * x2) + b5(x1 * x3)
y = b0 + b1 * 1 + b2 * 1 + 0 + b4(1 * 1) + 0
y = b0 + b2 + (b1 + b4)x1
y = 4.20
## estimate for Test C:
y = b0 + b1x1 + b2x2 + b3x3 + b4(x1 * x2) + b5(x1 * x3)
y = b0 + b1 * 1 + b2 * 0 + b3 * 1 + 0 + b5 (1 * 1)
y = b0 + b3 + (b1 + b5)x1
y = 4.19
edited question
I usually work with people who don't know how to derive a model's estimates themselves, so I usually have to calculate them all by hand. Is there a way to get the estimated y for each categorical level (as I did by hand above) without doing it manually? Again, for now, I don't want to predict new values. Thanks in advance!
data:
dput(myData)
structure(list(ID = c("p1", "p1", "p1", "p2", "p2", "p2", "p3",
"p3", "p3", "p4", "p4", "p4", "p5", "p5", "p5", "p6", "p6", "p6",
"p7", "p7", "p7", "p8", "p8", "p8", "p9", "p9", "p9", "p10",
"p10", "p10", "p11", "p11", "p11", "p12", "p12", "p12", "p13",
"p13", "p13", "p14", "p14", "p14", "p15", "p15", "p15", "p16",
"p16", "p16", "p17", "p17", "p17", "p18", "p18", "p18", "p19",
"p19", "p19", "p20", "p20", "p20", "p21", "p21", "p21", "p22",
"p22", "p22", "p23", "p23", "p23", "p24", "p24", "p24", "p25",
"p25", "p25", "p26", "p26", "p26", "p27", "p27", "p27", "p28",
"p28", "p28", "p29", "p29", "p29", "p30", "p30", "p30", "p31",
"p31", "p31", "p32", "p32", "p32", "p33", "p33", "p33", "p34",
"p34", "p34", "p35", "p35", "p35", "p36", "p36", "p36", "p37",
"p37", "p37", "p38", "p38", "p38", "p39", "p39", "p39", "p40",
"p40", "p40", "p41", "p41", "p41", "p42", "p42", "p42", "p43",
"p43", "p43", "p44", "p44", "p44", "p45", "p45", "p45", "p46",
"p46", "p46", "p47", "p47", "p47", "p48", "p48", "p48", "p49",
"p49", "p49", "p50", "p50", "p50", "p51", "p51", "p51", "p52",
"p52", "p52", "p53", "p53", "p53", "p54", "p54", "p54", "p55",
"p55", "p55", "p56", "p56", "p56", "p57", "p57", "p57", "p58",
"p58", "p58", "p59", "p59", "p59", "p60", "p60", "p60", "p61",
"p61", "p61", "p62", "p62", "p62", "p63", "p63", "p63"), Score = c(5.33,
5.05, 5.15, 5.82, 2.29, 7.54, 4.46, 2.43, 1.53, 8.97, 7.69, 7.21,
6.76, 8.41, 3.77, 3.33, 11.57, 7.69, 2.15, 3.84, 3.29, 3.36,
6.66, 5.6, 4.23, 4.41, 3.07, 2.29, 4.9, 4.46, 3.22, 1.72, 2.08,
4.47, 2.4, 2.54, 2.73, 6.57, 7.31, 4.46, 9.27, 4.31, 4.54, 6.32,
8.97, 3.44, 4.68, 9.7, 2.15, 5.68, 5.26, 9.3, 5.68, 8.97, 4.65,
4.13, 4.57, 11.22, 11.39, 7.52, 3.94, 4.47, 3.52, 5, 8, 5.81,
2.96, 4.05, 2.22, 4.41, 5.64, 4.79, 2.43, 2.5, 4.16, 7.57, 9.21,
2.59, 3.12, 3.84, 7.76, 8.77, 5.08, 7.81, 4.49, 2.17, 7.4, 5.81,
4.9, 3.19, 3.2, 2.72, 3.67, 4.42, 3.57, 1.02, 4.42, 2.45, 5.88,
7.84, 4.93, 9.61, 3.75, 1.8, 3.47, 0.65, 1.39, 2.9, 6.36, 2.77,
2.67, 6.89, 6.74, 6.81, 1.94, 3.22, 3.12, 4.08, 5.31, 11.23,
4.1, 4.28, 3.89, 2.98, 3.52, 3.64, 3.63, 5.08, 4.9, 6.66, 7.56,
3.14, 5.26, 1.03, 4.58, 2.9, 2.5, 3.57, 4, 7.54, 3.5, 5.19, 2.56,
2.38, 1.4, 3.97, 2, 8.69, 5.33, 6.42, 3.62, 2.59, 4.63, 4.85,
6.87, 5.55, 3.14, 2.29, 4.68, 7.76, 3.53, 8.88, 3.44, 8, 5.15,
6.77, 12.28, 6.25, 4.91, 7.01, 7.4, 5.21, 3, 4.87, 7.5, 5.47,
8.97, 7.89, 7.54, 9.25, 7.24, 5.37, 6.41, 2.94, 5.47, 7.14, 5.4,
5.06, 6.32), Proficiency_c = c(0.44, 0.44, 0.44, 0.69, 0.69,
0.69, 1.24, 1.24, 1.24, -0.16, -0.16, -0.16, 1.14, 1.14, 1.14,
0.69, 0.69, 0.69, -0.26, -0.26, -0.26, 0.94, 0.94, 0.94, -0.26,
-0.26, -0.26, 1.04, 1.04, 1.04, 0.39, 0.39, 0.39, -0.06, -0.06,
-0.06, -0.41, -0.41, -0.41, 0.54, 0.54, 0.54, -0.51, -0.51, -0.51,
-0.81, -0.81, -0.81, 0.14, 0.14, 0.14, -0.31, -0.31, -0.31, 0.44,
0.44, 0.44, -0.11, -0.11, -0.11, -0.21, -0.21, -0.21, -0.51,
-0.51, -0.51, 0.24, 0.24, 0.24, 0.59, 0.59, 0.59, -0.21, -0.21,
-0.21, -0.66, -0.66, -0.66, -0.06, -0.06, -0.06, -1.01, -1.01,
-1.01, -0.26, -0.26, -0.26, 0.19, 0.19, 0.19, 0.84, 0.84, 0.84,
-0.11, -0.11, -0.11, 0.04, 0.04, 0.04, 0.04, 0.04, 0.04, 0.79,
0.79, 0.79, 1.09, 1.09, 1.09, -0.76, -0.76, -0.76, 0.14, 0.14,
0.14, 0.64, 0.64, 0.64, 0.49, 0.49, 0.49, -0.71, -0.71, -0.71,
-0.31, -0.31, -0.31, -0.11, -0.11, -0.11, -0.61, -0.61, -0.61,
0.19, 0.19, 0.19, -0.36, -0.36, -0.36, -0.31, -0.31, -0.31, -1.01,
-1.01, -1.01, 1.19, 1.19, 1.19, -0.96, -0.96, -0.96, 0.99, 0.99,
0.99, 0.74, 0.74, 0.74, 0.24, 0.24, 0.24, -0.06, -0.06, -0.06,
-0.31, -0.31, -0.31, -0.66, -0.66, -0.66, -0.96, -0.96, -0.96,
0.89, 0.89, 0.89, -0.96, -0.96, -0.96, -1.01, -1.01, -1.01, -0.66,
-0.66, -0.66, -0.71, -0.71, -0.71, -0.36, -0.36, -0.36), testType = structure(c(1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), .Label = c("TestA",
"TestB", "TestC"), class = "factor")), row.names = c(NA, -189L
), class = c("tbl_df", "tbl", "data.frame"))

I'm not sure why you're calculating predictions at a reference proficiency of 1 (0 would be the default), but maybe you're looking for emmeans?
library(emmeans)
emmeans(model, ~testType, at = list(Proficiency_c=1))
The at = argument is the way to specify in emmeans that we want to calculate marginal means with the non-focal parameters (Proficiency_c in this case) set to a value other than the default [typically the mean of a numeric covariate]. See vignette("basics", package = "emmeans") (emmeans has many high-quality vignettes). It's specified as a list because we may have multiple non-focal parameters to set.
Results:
NOTE: Results may be misleading due to involvement in interactions
 testType emmean    SE  df lower.CL upper.CL
 TestA      3.50 0.529 162     2.45     4.54
 TestB      4.21 0.529 162     3.16     5.25
 TestC      4.20 0.529 162     3.15     5.24
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
If you're looking for the estimated slope within each test type, use emtrends:
emtrends(model, ~testType, "Proficiency_c")
 testType Proficiency_c.trend    SE  df lower.CL upper.CL
 TestA                 -1.338 0.448 162    -2.22  -0.4541
 TestB                 -0.837 0.448 162    -1.72   0.0467
 TestC                 -1.102 0.448 162    -1.99  -0.2185
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
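As a cross-check, the point estimates above can be reproduced in base R from the fixed-effect table in the question alone. This is only a sketch: the coefficient vector b is typed in from the posted summary(model) output (so it carries that output's rounding), and the names grid, X, estimates and slopes are illustrative, not part of emmeans.

```r
# Fixed-effect estimates copied from the posted summary(model) output
b <- c("(Intercept)"                 =  4.8376,
       "Proficiency_c"               = -1.3381,
       "testTypeTestB"               =  0.2088,
       "testTypeTestC"               =  0.4638,
       "Proficiency_c:testTypeTestB" =  0.5008,
       "Proficiency_c:testTypeTestC" =  0.2357)

# Reference grid: every test type at Proficiency_c = 1
grid <- expand.grid(Proficiency_c = 1,
                    testType = factor(c("TestA", "TestB", "TestC")))

# Design matrix under the default treatment contrasts (TestA is the reference)
X <- model.matrix(~ Proficiency_c * testType, data = grid)

# Estimate for each level = design row times coefficient vector
estimates <- setNames(as.vector(X %*% b), levels(grid$testType))
round(estimates, 2)

# Slope of Proficiency_c within each test type (what emtrends reports)
slopes <- c(TestA = b[["Proficiency_c"]],
            TestB = b[["Proficiency_c"]] + b[["Proficiency_c:testTypeTestB"]],
            TestC = b[["Proficiency_c"]] + b[["Proficiency_c:testTypeTestC"]])
round(slopes, 3)
```

With the fitted model at hand, fixef(model) from lme4 could take the place of the hard-coded vector b. This only reproduces the point estimates; the standard errors and confidence limits still need emmeans.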

Related

How to show the results of a Tukey test with boxplots showing CLD letters

I have collected data on 216 individuals. In each individual I measured the concentration of the same 7 substances, stored in columns Sub1 to Sub7. The concentrations may differ between individuals from different Locations. I am interested in how finely these individuals can be classified into groups based on their concentrations of these substances. I am also interested in how the substances may be correlated with each other, as the concentration of some may affect the concentration of others.
Each individual in my data set is represented by a unique ID number. Three nested grouping variables (Location, State, and Region) can be used to separate the individuals: multiple Locations are in each State, and multiple States are part of a larger Region. For instance, the individuals in the Locations APNG, BLEA, and NEAR are all in FL, while the individuals in the Locations CACT, OYLE, and PIY are all in GA. The states FL and GA are both in Region A.
I used this code to run the ANOVAs:
library(tidyverse)
library(multcomp)
library(multcompView)
tests <- list()
Groups <- 1:3      # columns holding the grouping variables (Region, State, Location)
Variables <- 6:12  # columns holding Sub1 to Sub7
for (i in Groups) {
  Group <- as.factor(data[[i]])
  for (j in Variables) {
    test_name <- paste0(names(data)[j], "_by_", names(data)[i])
    Response <- data[[j]]
    sublist <- list()
    sublist$aov <- aov(Response ~ Group)
    sublist$tukey <- TukeyHSD(sublist$aov)
    sublist$multcomp <- multcompLetters(extract_p(sublist$tukey$Group))
    tests[[test_name]] <- sublist
  }
}
# I can access the ANOVA results like this:
lapply(tests, function(x) summary(x$aov))
# and the compact letter display results like this:
lapply(tests, function(x) x$multcomp)
Using the object tests, how can I create boxplots of the TukeyHSD results that show the CLD letters, and save the plots to a PDF?
This page, r-graph-gallery.com/84-tukey-test.html, explains how to do this, but I cannot get it to work with the tests object.
Here is my data:
> dput(data)
structure(list(Region = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L), .Label = c("A", "B", "C", "D", "E"), class = "factor"),
State = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 10L, 10L, 10L,
10L, 10L, 10L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 10L, 10L,
10L, 10L, 10L, 10L, 10L, 10L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
7L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 6L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 12L, 12L, 12L, 12L, 12L,
12L, 12L, 12L, 12L, 12L), .Label = c("DE", "FL", "GA", "MA",
"MD", "ME", "NC", "NH", "NY", "SC", "VA", "VT"), class = "factor"),
Location = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 12L,
12L, 12L, 12L, 12L, 12L, 12L, 12L, 14L, 14L, 14L, 14L, 14L,
14L, 14L, 14L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L,
16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 17L, 17L,
17L, 17L, 17L, 17L, 17L, 20L, 20L, 20L, 20L, 20L, 20L, 22L,
22L, 22L, 22L, 22L, 22L, 22L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 15L, 15L, 15L, 15L, 15L,
15L, 15L, 15L, 15L, 15L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L,
11L, 11L, 11L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L,
13L, 13L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L,
19L, 19L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L, 21L
), .Label = c("APNG", "BATO", "BLEA", "CACT", "CHAG", "CHOG",
"COTR", "DTU", "HAB", "LOP", "MASV", "NEAR", "NGUP", "OYLE",
"PIRT", "PIY", "PKE", "PONO", "PPP", "ROG", "VONG", "YENQ"
), class = "factor"), Sex = structure(c(1L, 1L, 1L, 2L, 1L,
1L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 2L,
2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L,
1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L,
1L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L,
2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L,
2L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 2L,
1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 1L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 2L,
2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L,
1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 2L,
2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L,
1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 2L,
2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L,
2L), .Label = c("F", "M"), class = "factor"), ID = 1:216,
Sub1 = c(0.03, 0.03, 0.03, 0.04, 0.04, 0.03, 0.03, 0.03,
0.03, 0.03, 0.04, 0.03, 0.04, 0.03, 0.03, 0.03, 0.02, 0.04,
0.03, 0.03, 0.03, 0.02, 0.04, 0.04, 0.02, 0.03, 0.02, 0.03,
0.05, 0.03, 0.03, 0.03, 0.03, 0.03, 0.03, 0.03, 0.03, 0.03,
0.03, 0.03, 0.04, 0.03, 0.04, 0.06, 0.03, 0.03, 0.03, 0.03,
0.02, 0.03, 0.03, 0.03, 0.04, 0.03, 0.02, 0.02, 0.04, 0.03,
0.04, 0.03, 0.03, 0.03, 0.05, 0.03, 0.03, 0.04, 0.03, 0.02,
0.04, 0.02, 0.03, 0.02, 0.02, 0.04, 0.03, 0.02, 0.03, 0.03,
0.05, 0.04, 0.03, 0.02, 0.03, 0.05, 0.02, 0.04, 0.03, 0.05,
0.03, 0.04, 0.02, 0.03, 0.02, 0.03, 0.03, 0.03, 0.02, 0.05,
0.03, 0.03, 0.04, 0.02, 0.02, 0.04, 0.05, 0.03, 0.03, 0.02,
2.03, 2.03, 2.03, 2.04, 2.04, 2.03, 2.03, 2.03, 2.03, 2.03,
2.04, 2.03, 2.04, 2.03, 2.03, 2.03, 2.02, 2.04, 2.03, 2.03,
2.03, 2.02, 2.04, 2.04, 2.02, 2.03, 2.02, 2.03, 2.05, 2.03,
2.03, 2.03, 2.03, 2.03, 2.03, 2.03, 2.03, 2.03, 2.03, 2.03,
2.04, 2.03, 2.04, 2.06, 2.03, 2.03, 2.03, 2.03, 2.02, 2.03,
2.03, 2.03, 2.04, 2.03, 2.02, 2.02, 2.04, 2.03, 2.04, 2.03,
2.03, 2.03, 2.05, 2.03, 2.03, 2.04, 2.03, 2.02, 2.04, 2.02,
2.03, 2.02, 2.02, 2.04, 2.03, 2.02, 2.03, 2.03, 2.05, 2.04,
2.03, 2.02, 2.03, 2.05, 2.02, 2.04, 2.03, 2.05, 2.03, 2.04,
2.02, 2.03, 2.02, 2.03, 2.03, 2.03, 2.02, 2.05, 2.03, 2.03,
2.04, 2.02, 2.02, 2.04, 2.05, 2.03, 2.03, 2.02), Sub2 = c(0.69,
1.28, 1.27, 2.25, 1.05, 1.76, 1.57, 1.09, 0.68, 1.35, 0.85,
1.55, 0.12, 0, 0.58, 1.13, 0.1, 1.9, 0.54, 1.48, 0.8, 0.52,
1.76, 1.77, 1.24, 0.63, 0.63, 0.57, 0.63, 0.53, 1.32, 1.79,
1.16, 1.11, 1.1, 1.92, 1.06, 1.18, 0.43, 0.67, 0.75, 2.37,
3.93, 0.3, 2.8, 1.25, 0.9, 1.32, 0.5, 0.4, 0.72, 0.34, 0.12,
0.89, 0.69, 1.13, 1.22, 0.88, 4.13, 1.27, 0.62, 2.9, 2.42,
0.9, 0.4, 1.29, 1.61, 0.3, 1.47, 0.36, 1.27, 0.84, 1.81,
0.18, 0.47, 1.01, 0.85, 0.59, 1.73, 0.72, 0.5, 0.83, 0.9,
0.81, 0.59, 2.84, 2.24, 2.68, 1.18, 1.36, 0.84, 1.79, 1.01,
0.34, 0.41, 2.22, 0.51, 0.42, 1.26, 2.26, 1.79, 1.43, 1.3,
1.8, 2.21, 1.65, 2.39, 0.31, 2.69, 3.28, 3.27, 4.25, 3.05,
3.76, 3.57, 3.09, 2.68, 3.35, 2.85, 3.55, 2.12, 2, 2.58,
3.13, 2.1, 3.9, 2.54, 3.48, 2.8, 2.52, 3.76, 3.77, 3.24,
2.63, 2.63, 2.57, 2.63, 2.53, 3.32, 3.79, 3.16, 3.11, 3.1,
3.92, 3.06, 3.18, 2.43, 2.67, 2.75, 4.37, 5.93, 2.3, 4.8,
3.25, 2.9, 3.32, 2.5, 2.4, 2.72, 2.34, 2.12, 2.89, 2.69,
3.13, 3.22, 2.88, 6.13, 3.27, 2.62, 4.9, 4.42, 2.9, 2.4,
3.29, 3.61, 2.3, 3.47, 2.36, 3.27, 2.84, 3.81, 2.18, 2.47,
3.01, 2.85, 2.59, 3.73, 2.72, 2.5, 2.83, 2.9, 2.81, 2.59,
4.84, 4.24, 4.68, 3.18, 3.36, 2.84, 3.79, 3.01, 2.34, 2.41,
4.22, 2.51, 2.42, 3.26, 4.26, 3.79, 3.43, 3.3, 3.8, 4.21,
3.65, 4.39, 2.31), Sub3 = c(1.32, 0.19, 0.27, 0.73, 0.41,
0.37, 0.89, 1.35, 0.49, 1.32, 0.69, 0, 0.57, 0.24, 0.23,
0.71, 0, 0, 0, 0.58, 0.32, 1.1, 0.45, 0.61, 0.38, 0.3, 0.01,
0.06, 0.48, 0.62, 0.64, 1.96, 0.61, 0.43, 0.25, 0.34, 0.17,
0.57, 0.1, 0.6, 1.07, 0.44, 0.12, 0.55, 0.08, 0.56, 0.59,
0.66, 0.44, 0.58, 0.75, 0.99, 0.77, 0.57, 0.35, 0.18, 0.16,
0.31, 0.04, 0.17, 0.46, 0.19, 0.8, 0.61, 1.14, 0.3, 0.08,
0.25, 0.78, 1.07, 0.38, 0.17, 0.42, 0.48, 0.55, 0.74, 2.98,
1.96, 0.51, 0.63, 0, 0.52, 0.32, 0.23, 0.31, 0.09, 0.06,
0.26, 0.23, 0.58, 1.49, 0.46, 0.33, 0.37, 1.16, 0.91, 0.41,
0.72, 0.2, 0.84, 0.71, 0.56, 0.34, 0.68, 0.81, 0.52, 0.78,
0.19, 3.32, 2.19, 2.27, 2.73, 2.41, 2.37, 2.89, 3.35, 2.49,
3.32, 2.69, 2, 2.57, 2.24, 2.23, 2.71, 2, 2, 2, 2.58, 2.32,
3.1, 2.45, 2.61, 2.38, 2.3, 2.01, 2.06, 2.48, 2.62, 2.64,
3.96, 2.61, 2.43, 2.25, 2.34, 2.17, 2.57, 2.1, 2.6, 3.07,
2.44, 2.12, 2.55, 2.08, 2.56, 2.59, 2.66, 2.44, 2.58, 2.75,
2.99, 2.77, 2.57, 2.35, 2.18, 2.16, 2.31, 2.04, 2.17, 2.46,
2.19, 2.8, 2.61, 3.14, 2.3, 2.08, 2.25, 2.78, 3.07, 2.38,
2.17, 2.42, 2.48, 2.55, 2.74, 4.98, 3.96, 2.51, 2.63, 2,
2.52, 2.32, 2.23, 2.31, 2.09, 2.06, 2.26, 2.23, 2.58, 3.49,
2.46, 2.33, 2.37, 3.16, 2.91, 2.41, 2.72, 2.2, 2.84, 2.71,
2.56, 2.34, 2.68, 2.81, 2.52, 2.78, 2.19), Sub4 = c(0.63,
0.05, 0.2, 0.41, 0.43, 0.54, 0.26, 0.78, 0.13, 0.8, 0.47,
0.65, 0, 0.22, 0.45, 0.85, 0.47, 0, 0.62, 0.59, 0.14, 0.8,
0.9, 0.88, 0.56, 0.56, 0.47, 0.24, 0.62, 1.77, 0.56, 0.99,
0.21, 0.9, 0.62, 0.58, 0.41, 0.97, 0.2, 0.9, 0.68, 0.52,
0.14, 1.27, 0.63, 0.51, 0.12, 0.61, 0.31, 0.43, 0.62, 1.18,
0.95, 0.59, 0.39, 0.26, 0.53, 0.77, 0.4, 0.39, 0, 0.19, 0.82,
1.1, 0.46, 0.25, 0.29, 0.2, 2.01, 0.36, 0.62, 0.54, 0.48,
0.87, 0.66, 1.46, 2.59, 1.37, 1.28, 0.99, 0.71, 0.32, 0.64,
0.66, 0.47, 0.48, 0.38, 0.67, 0.18, 1.02, 0.54, 0.53, 0.25,
0.43, 1.02, 0.58, 0.58, 0.48, 0.2, 0.7, 0.38, 0.28, 0.65,
1.21, 1.03, 0.38, 0.6, 0.44, 2.63, 2.05, 2.2, 2.41, 2.43,
2.54, 2.26, 2.78, 2.13, 2.8, 2.47, 2.65, 2, 2.22, 2.45, 2.85,
2.47, 2, 2.62, 2.59, 2.14, 2.8, 2.9, 2.88, 2.56, 2.56, 2.47,
2.24, 2.62, 3.77, 2.56, 2.99, 2.21, 2.9, 2.62, 2.58, 2.41,
2.97, 2.2, 2.9, 2.68, 2.52, 2.14, 3.27, 2.63, 2.51, 2.12,
2.61, 2.31, 2.43, 2.62, 3.18, 2.95, 2.59, 2.39, 2.26, 2.53,
2.77, 2.4, 2.39, 2, 2.19, 2.82, 3.1, 2.46, 2.25, 2.29, 2.2,
4.01, 2.36, 2.62, 2.54, 2.48, 2.87, 2.66, 3.46, 4.59, 3.37,
3.28, 2.99, 2.71, 2.32, 2.64, 2.66, 2.47, 2.48, 2.38, 2.67,
2.18, 3.02, 2.54, 2.53, 2.25, 2.43, 3.02, 2.58, 2.58, 2.48,
2.2, 2.7, 2.38, 2.28, 2.65, 3.21, 3.03, 2.38, 2.6, 2.44),
Sub5 = c(1.14, 1.38, 1.5, 1.43, 1.65, 1.34, 1.29, 1.72, 1.32,
1.17, 1.19, 1.35, 1.34, 1.06, 1.24, 1.33, 1.2, 1.31, 1.29,
1.37, 1.42, 1.08, 1.77, 1.32, 1.2, 1.14, 1.48, 0.98, 1.33,
1.65, 1.24, 1.43, 1.41, 1.2, 1.42, 1.09, 1.04, 1.57, 0.78,
1.37, 0.99, 1.4, 1.13, 1.34, 1.35, 1.23, 0.93, 0.94, 1.02,
1.16, 1.08, 0.96, 1.33, 1.19, 1.25, 1.44, 1.62, 1.27, 1.4,
1.4, 1.29, 1.53, 1.43, 1.33, 1.25, 1.82, 1.45, 1.36, 1.38,
1.34, 1.29, 1.86, 1.15, 1.31, 1.21, 1.23, 1.42, 1.57, 1.23,
0.99, 1.33, 1.74, 1.03, 1.33, 1.41, 1.01, 0.97, 1.46, 1.55,
1.04, 1.22, 1.19, 1.74, 1.64, 1.35, 1.34, 1.21, 1.55, 1.31,
1.5, 1.45, 1.21, 0.83, 1.17, 1.25, 1.54, 1.5, 1.11, 3.14,
3.38, 3.5, 3.43, 3.65, 3.34, 3.29, 3.72, 3.32, 3.17, 3.19,
3.35, 3.34, 3.06, 3.24, 3.33, 3.2, 3.31, 3.29, 3.37, 3.42,
3.08, 3.77, 3.32, 3.2, 3.14, 3.48, 2.98, 3.33, 3.65, 3.24,
3.43, 3.41, 3.2, 3.42, 3.09, 3.04, 3.57, 2.78, 3.37, 2.99,
3.4, 3.13, 3.34, 3.35, 3.23, 2.93, 2.94, 3.02, 3.16, 3.08,
2.96, 3.33, 3.19, 3.25, 3.44, 3.62, 3.27, 3.4, 3.4, 3.29,
3.53, 3.43, 3.33, 3.25, 3.82, 3.45, 3.36, 3.38, 3.34, 3.29,
3.86, 3.15, 3.31, 3.21, 3.23, 3.42, 3.57, 3.23, 2.99, 3.33,
3.74, 3.03, 3.33, 3.41, 3.01, 2.97, 3.46, 3.55, 3.04, 3.22,
3.19, 3.74, 3.64, 3.35, 3.34, 3.21, 3.55, 3.31, 3.5, 3.45,
3.21, 2.83, 3.17, 3.25, 3.54, 3.5, 3.11), Sub6 = c(0.2, 0.15,
0.16, 0.14, 0.19, 0.12, 0.14, 0.35, 0.29, 0.25, 0.06, 0.16,
0.18, 0.65, 0.18, 0.12, 0.42, 0.09, 0.13, 0.12, 0.22, 0.49,
0.18, 0.11, 0.29, 0.16, 0.18, 0.15, 0.46, 0.19, 0.15, 0.19,
0.1, 0.09, 0.11, 0.14, 0.1, 0.31, 0.53, 0.32, 0.23, 0.18,
0.14, 0.38, 0.19, 0.1, 0.14, 0.08, 0.21, 0.13, 0.08, 0.08,
0.26, 0.14, 0.17, 0.09, 0.09, 0.22, 0.26, 0.09, 0.3, 0.16,
0.17, 0.09, 0.12, 0.17, 0.14, 0.34, 0.12, 0.21, 0.1, 0.27,
0.11, 0.13, 0.15, 0.17, 0.21, 0.16, 0.12, 0.36, 0.16, 0.17,
0.27, 0.32, 0.15, 0.13, 0.14, 0.15, 0.1, 0.26, 0.25, 0.08,
0.25, 0.19, 0.38, 0.08, 0.64, 0.71, 0.1, 0.18, 0.12, 0.13,
0.1, 1.17, 0.14, 0.19, 0.14, 0.24, 2.2, 2.15, 2.16, 2.14,
2.19, 2.12, 2.14, 2.35, 2.29, 2.25, 2.06, 2.16, 2.18, 2.65,
2.18, 2.12, 2.42, 2.09, 2.13, 2.12, 2.22, 2.49, 2.18, 2.11,
2.29, 2.16, 2.18, 2.15, 2.46, 2.19, 2.15, 2.19, 2.1, 2.09,
2.11, 2.14, 2.1, 2.31, 2.53, 2.32, 2.23, 2.18, 2.14, 2.38,
2.19, 2.1, 2.14, 2.08, 2.21, 2.13, 2.08, 2.08, 2.26, 2.14,
2.17, 2.09, 2.09, 2.22, 2.26, 2.09, 2.3, 2.16, 2.17, 2.09,
2.12, 2.17, 2.14, 2.34, 2.12, 2.21, 2.1, 2.27, 2.11, 2.13,
2.15, 2.17, 2.21, 2.16, 2.12, 2.36, 2.16, 2.17, 2.27, 2.32,
2.15, 2.13, 2.14, 2.15, 2.1, 2.26, 2.25, 2.08, 2.25, 2.19,
2.38, 2.08, 2.64, 2.71, 2.1, 2.18, 2.12, 2.13, 2.1, 3.17,
2.14, 2.19, 2.14, 2.24), Sub7 = c(0.01, 0, 0, 0.01, 0, 0,
0.01, 0.01, 0.02, 0.03, 0.01, 0, 0.03, 0, 0.02, 0, 0, 0,
0.01, 0.03, 0.03, 0.02, 0.02, 0.02, 0.01, 0.01, 0.01, 0,
0, 0.05, 0.02, 0.04, 0.02, 0, 0.02, 0.02, 0.02, 0.04, 0.01,
0.02, 0.04, 0.02, 0.01, 0.01, 0.01, 0.01, 0.03, 0.02, 0,
0.02, 0.05, 0.14, 0, 0.01, 0, 0.01, 0.01, 0, 0.01, 0.02,
0.01, 0.02, 0.01, 0.03, 0.05, 0.06, 0.03, 0.02, 0.11, 0.05,
0.02, 0.02, 0, 0.01, 0, 0.01, 0.06, 0.04, 0.02, 0.02, 0,
0.02, 0.01, 0.02, 0.01, 0, 0.01, 0.01, 0.02, 0.01, 0.02,
0.01, 0, 0.01, 0.06, 0.01, 0.02, 0.01, 0.01, 0.03, 0.02,
0.03, 0.03, 0.02, 0.09, 0, 0.19, 0.02, 2.01, 2, 2, 2.01,
2, 2, 2.01, 2.01, 2.02, 2.03, 2.01, 2, 2.03, 2, 2.02, 2,
2, 2, 2.01, 2.03, 2.03, 2.02, 2.02, 2.02, 2.01, 2.01, 2.01,
2, 2, 2.05, 2.02, 2.04, 2.02, 2, 2.02, 2.02, 2.02, 2.04,
2.01, 2.02, 2.04, 2.02, 2.01, 2.01, 2.01, 2.01, 2.03, 2.02,
2, 2.02, 2.05, 2.14, 2, 2.01, 2, 2.01, 2.01, 2, 2.01, 2.02,
2.01, 2.02, 2.01, 2.03, 2.05, 2.06, 2.03, 2.02, 2.11, 2.05,
2.02, 2.02, 2, 2.01, 2, 2.01, 2.06, 2.04, 2.02, 2.02, 2,
2.02, 2.01, 2.02, 2.01, 2, 2.01, 2.01, 2.02, 2.01, 2.02,
2.01, 2, 2.01, 2.06, 2.01, 2.02, 2.01, 2.01, 2.03, 2.02,
2.03, 2.03, 2.02, 2.09, 2, 2.19, 2.02)), class = "data.frame", row.names = c(NA,
-216L))
I think the issue with your tests object is that it holds too much information to plot directly.
Here, I focus only on the Region column, but you can apply the same workflow to the other categorical columns of your dataset.
1) We need the label (letters) associated with each region for each substance, so, recycling your loop, I did this:
library(multcomp)
library(multcompView)
Labels_box <- NULL
Group <- as.factor(data[, "Region"])
for (j in 6:12) {
  # One-way ANOVA and Tukey HSD for the substance in column j
  Response <- data[, j]
  TUKEY <- TukeyHSD(aov(Response ~ Group))
  MultComp <- multcompLetters(extract_p(TUKEY$Group))
  # One row per region with its compact letter display
  Region <- names(MultComp$Letters)
  Labels <- MultComp$Letters
  df <- data.frame(Region, Labels)
  df$Substance <- colnames(data)[j]
  # rbind() with NULL returns df, so the first pass needs no special case
  Labels_box <- rbind(Labels_box, df)
}
Now, the dataset Labels_box should look like:
head(Labels_box)
Region Labels Substance
B B a Sub1
C C b Sub1
D D b Sub1
E E b Sub1
A A a Sub1
B1 B a Sub2
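To see what a single pass of that loop produces, here is a minimal sketch on R's built-in PlantGrowth data rather than the poster's data. The names fit, tukey, letters_out and cld are illustrative; the calls follow the same TukeyHSD, extract_p and multcompLetters chain used in the loop.

```r
library(multcompView)

# One-way ANOVA on a built-in dataset (weight by treatment group)
fit   <- aov(weight ~ group, data = PlantGrowth)
tukey <- TukeyHSD(fit)

# tukey$group is a matrix of pairwise comparisons; extract_p() pulls out
# the adjusted p-values, named by the pair of levels being compared
letters_out <- multcompLetters(extract_p(tukey$group))

# One row per group level with its compact letter display
cld <- data.frame(group  = names(letters_out$Letters),
                  Labels = unname(letters_out$Letters))
cld
```

Levels that share a letter are not significantly different at the default 0.05 threshold. The rows come back in the order multcompLetters derives from the comparison names, which is why the labels are joined to the data by name later instead of being matched by position.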
2) Next, to place the letters above each boxplot, we have to define a y position for each label. So we calculate the maximum value of each region for each substance (plus a small offset of 0.2) using dplyr and tidyr:
library(tidyverse)
Max_Val <- data %>% pivot_longer(., cols = starts_with("Sub"), names_to = "Substance", values_to = "Value") %>%
group_by(Region, Substance) %>% summarise(MAX = max(Value)+0.2)
# A tibble: 6 x 3
# Groups: Region [1]
Region Substance MAX
<fct> <chr> <dbl>
1 A Sub1 0.26
2 A Sub2 4.13
3 A Sub3 1.55
4 A Sub4 2.21
5 A Sub5 2.06
6 A Sub6 0.85
And we combine both Labels_box and Max_Val datasets using left_join:
Labels_box <- left_join(Labels_box, Max_Val, by = c("Region" = "Region", "Substance" = "Substance"))
Region Labels Substance MAX
1 B a Sub1 0.25
2 C b Sub1 2.25
3 D b Sub1 2.26
4 E b Sub1 2.25
5 A a Sub1 0.26
6 B a Sub2 4.33
3) Finally, we need to reshape the values of all substances in your data into long format to match the grammar used by ggplot. For that, we can reuse the pivot_longer call from step 2):
library(tidyverse)
data_box <- data %>% pivot_longer(., cols = starts_with("Sub"), names_to = "Substance", values_to = "Value")
# A tibble: 6 x 7
Region State Location Sex ID Substance Value
<fct> <fct> <fct> <fct> <int> <chr> <dbl>
1 A FL APNG F 1 Sub1 0.03
2 A FL APNG F 1 Sub2 0.69
3 A FL APNG F 1 Sub3 1.32
4 A FL APNG F 1 Sub4 0.63
5 A FL APNG F 1 Sub5 1.14
6 A FL APNG F 1 Sub6 0.2
We are almost ready, but to give each box a fill colour matching the group identified by the Tukey test, we need to add the labels to data_box.
For that, we can do a left_join:
data_box <- left_join(data_box,Labels_box, by = c("Region" = "Region", "Substance" = "Substance"))
# A tibble: 6 x 9
Region State Location Sex ID Substance Value Labels MAX
<fct> <fct> <fct> <fct> <int> <chr> <dbl> <fct> <dbl>
1 A FL APNG F 1 Sub1 0.03 a 0.26
2 A FL APNG F 1 Sub2 0.69 a 4.13
3 A FL APNG F 1 Sub3 1.32 a 1.55
4 A FL APNG F 1 Sub4 0.63 a 2.21
5 A FL APNG F 1 Sub5 1.14 a 2.06
6 A FL APNG F 1 Sub6 0.2 a 0.85
4) Now, we are ready to plot everything:
library(ggplot2)
ggplot(data_box, aes(x = Region, y = Value, fill = Labels))+
geom_boxplot()+
geom_text(data = Labels_box,aes( x = Region, y = MAX, label = Labels))+
facet_grid(.~Substance, scales = "free")
And you get this:
Does this look like what you need?
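The question also asked about getting the plots into a PDF, which the steps above do not cover. One possible approach, sketched here on the built-in mtcars data, is to open a pdf() device, print one plot per grouping variable, and close the device. The grouping columns cyl and gear stand in for Region, State and Location, and the file name is a placeholder.

```r
library(ggplot2)

# Placeholder output path; replace with a path of your choice
out_file <- file.path(tempdir(), "boxplots.pdf")

# Stand-ins for the grouping columns (Region, State, Location in the question)
group_vars <- c("cyl", "gear")

pdf(out_file, width = 8, height = 5)  # each printed plot becomes one page
for (g in group_vars) {
  p <- ggplot(mtcars, aes(x = factor(.data[[g]]), y = mpg)) +
    geom_boxplot() +
    labs(x = g)
  print(p)  # inside a loop, ggplot objects must be printed explicitly
}
dev.off()

file.exists(out_file)
```

For a single plot, ggsave(out_file, plot = p, width = 8, height = 5) would do the same job without opening the device by hand.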

dplyr rounding and averages

I have a couple of possible approaches, but I am not sure which is best or how to code either of them fully.
I have some ocean data from different locations. Each sampling is described as an event, which is how I tell the samplings apart, so I would like to group_by() event. However, my sampling is too fine, and I would like to get the average value (in this case turbidity) for every 0.5 m of depth. So perhaps I should round depth to the nearest 0.5 and then average the rest of the variables ("time", "pres"), with "station" and "event" remaining as ID factors.
So I am thinking of something like:
df2 <- df %>% group_by(event)%>%
mutate(vars(depth),funs(round(.,5))%>%
mutate_if(is.numeric, mean)
^ but that is not correct
Another option is to reduce the data to one row per second by averaging all the numeric values, including depth, within each second, but again I am not sure how best to do that.
Expected output:
Here is some dummy data:
df <- structure(list(datetime = structure(c(1556215607, 1556215607,
1556215607, 1556215607, 1556215607, 1556215607, 1556215607, 1556215608,
1556215608, 1556215608, 1556215608, 1556215608, 1556215608, 1556215609,
1556215609, 1556215609, 1556215609, 1556215609, 1556215609, 1556215610,
1556215610, 1556215610, 1556215610, 1556215610, 1556215610, 1556215611,
1556215611, 1556215611, 1556215611, 1556215611, 1556215611, 1556215612,
1556215612, 1556215612, 1556215612, 1556215612, 1556215612, 1556215613,
1556215613, 1556215613, 1556215613, 1556215613, 1556215613, 1556215614,
1556215614, 1556215614, 1556215614, 1556215614, 1556215614, 1556215615,
1556216764, 1556216765, 1556216765, 1556216765, 1556216765, 1556216765,
1556216765, 1556216766, 1556216766, 1556216766, 1556216766, 1556216766,
1556216766, 1556216767, 1556216767, 1556216767, 1556216767, 1556216767,
1556216767, 1556216768, 1556216768, 1556216768, 1556216768, 1556216768,
1556216768, 1556216769, 1556216769, 1556216769, 1556216769, 1556216769,
1556216769, 1556216770, 1556216770, 1556216770, 1556216770, 1556216770,
1556216770, 1556216771, 1556216771, 1556216771, 1556216771, 1556216771,
1556216771, 1556216772, 1556216772, 1556216772, 1556216772, 1556216772,
1556216772, 1556216772, 1556216773), class = c("POSIXct", "POSIXt"
), tzone = "UTC"), depth = c(0.48, 2.34, 2.36, 2.35, 2.35, 2.35,
2.37, 2.35, 2.34, 2.35, 2.34, 2.34, 2.35, 2.35, 2.35, 2.35, 2.35,
2.35, 2.34, 2.34, 2.32, 2.32, 2.3, 2.3, 2.31, 2.32, 2.32, 2.32,
2.35, 2.34, 2.34, 2.35, 2.33, 2.34, 2.33, 2.32, 2.31, 2.31, 2.31,
2.33, 2.34, 2.35, 2.35, 2.36, 2.36, 2.36, 2.36, 2.36, 2.35, 2.35,
1.76, 1.76, 1.76, 1.76, 1.77, 1.76, 1.76, 1.77, 1.76, 1.76, 1.77,
1.79, 1.78, 1.78, 1.8, 1.78, 1.76, 1.77, 1.76, 1.78, 1.83, 1.97,
2.11, 2.31, 2.48, 2.62, 2.77, 2.92, 3.06, 3.19, 3.35, 3.49, 3.66,
3.8, 3.94, 4.09, 4.24, 4.38, 4.54, 4.68, 4.82, 4.95, 5.1, 5.23,
5.38, 5.5, 5.65, 5.79, 5.95, 6.08, 6.27), press = c(0.48, 2.36,
2.38, 2.37, 2.37, 2.37, 2.39, 2.37, 2.36, 2.37, 2.36, 2.36, 2.37,
2.37, 2.37, 2.37, 2.37, 2.37, 2.36, 2.36, 2.34, 2.34, 2.32, 2.32,
2.33, 2.34, 2.34, 2.34, 2.37, 2.36, 2.36, 2.37, 2.35, 2.36, 2.35,
2.34, 2.33, 2.33, 2.33, 2.35, 2.36, 2.37, 2.37, 2.38, 2.38, 2.38,
2.38, 2.38, 2.37, 2.37, 1.78, 1.78, 1.78, 1.78, 1.79, 1.78, 1.78,
1.79, 1.78, 1.77, 1.79, 1.81, 1.8, 1.8, 1.82, 1.8, 1.78, 1.79,
1.78, 1.8, 1.85, 1.99, 2.13, 2.33, 2.5, 2.64, 2.79, 2.94, 3.09,
3.22, 3.38, 3.52, 3.69, 3.83, 3.97, 4.12, 4.28, 4.42, 4.58, 4.72,
4.86, 4.99, 5.14, 5.27, 5.43, 5.55, 5.7, 5.84, 6, 6.13, 6.32),
event = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2), station = structure(c(1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("BI-1", "BI-2",
"BI-3", "BI-4", "BI-5", "BI-6", "BI-8", "BI-9"), class = "factor")), class = "data.frame",
row.names = c(1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L,
16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L,
29L, 30L, 31L, 32L, 33L, 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L,
42L, 43L, 44L, 45L, 46L, 47L, 48L, 49L, 50L, 1600L, 1601L, 1602L,
1603L, 1604L, 1605L, 1606L, 1607L, 1608L, 1609L, 1610L, 1611L,
1612L, 1613L, 1614L, 1615L, 1616L, 1617L, 1618L, 1619L, 1620L,
1621L, 1622L, 1623L, 1624L, 1625L, 1626L, 1627L, 1628L, 1629L,
1630L, 1631L, 1632L, 1633L, 1634L, 1635L, 1636L, 1637L, 1638L,
1639L, 1640L, 1641L, 1642L, 1643L, 1644L, 1645L, 1646L, 1647L,
1648L, 1649L, 1650L))
Any help appreciated.
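One way the first option (rounding depth to the nearest 0.5 m, then averaging) could be written is sketched below. It runs on a small hand-made data frame called toy, not the posted df; the column names follow the posted data, depth_bin and binned are illustrative names, and across() assumes dplyr 1.0 or later.

```r
library(dplyr)

# Hypothetical miniature of the posted data (same column names, fewer rows)
toy <- data.frame(
  event   = c(1, 1, 1, 1, 2, 2, 2, 2),
  station = factor(c("BI-1", "BI-1", "BI-1", "BI-1",
                     "BI-2", "BI-2", "BI-2", "BI-2")),
  depth   = c(0.48, 2.34, 2.36, 2.35, 1.76, 1.97, 2.11, 2.31),
  press   = c(0.48, 2.36, 2.38, 2.37, 1.78, 1.99, 2.13, 2.33)
)

binned <- toy %>%
  # Nearest multiple of 0.5: scale up, round to an integer, scale back down
  mutate(depth_bin = round(depth / 0.5) * 0.5) %>%
  # station and event stay as identifiers; depth_bin defines the 0.5 m layers
  group_by(event, station, depth_bin) %>%
  # Average every remaining numeric column within each layer
  summarise(across(where(is.numeric), mean), .groups = "drop")

binned
```

A POSIXct column such as datetime is not picked up by where(is.numeric), so it would need its own term in summarise(), for example datetime = mean(datetime). Note that round(depth, 5) in the attempt above rounds to five decimal places, not to the nearest 0.5.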

Rotate a faceted, grouped bar plot

**UPDATED BELOW
I have created a plot that I need to be horizontal, but coord_flip() leaves the facets at the bottom instead of placing the nested groups on the left.
The data:
srvc_data <- structure(list(dept = structure(c(3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), .Label = c("Distribution Centre Services",
"IT", "Marketing", "Merchandise & Inventory", "Operations and Communication"
), class = "factor"), label = c("test5", "test7", "test3", "test10",
"test4", "test6", "test2", "test1", "test11", "test12", "test9",
"test8", "test18", "test19", "test15", "test17", "test13", "test16",
"test20", "test14", "test22", "test21", "test25", "test23", "test24",
"test27", "test26", "test28", "test29", "test31", "test33", "test30",
"test32", "test38", "test36", "test37", "test43", "test34", "test35",
"test40", "test39", "test42", "test41", "test5", "test7", "test3",
"test10", "test4", "test6", "test2", "test1", "test11", "test12",
"test9", "test8", "test18", "test19", "test15", "test17", "test13",
"test16", "test20", "test14", "test22", "test21", "test25", "test23",
"test24", "test27", "test26", "test28", "test29", "test31", "test33",
"test30", "test32", "test38", "test36", "test37", "test43", "test34",
"test35", "test40", "test39", "test42", "test41"), Gap = c(-0.07,
-0.13, -0.15, -0.16, -0.16, -0.21, -0.22, -0.24, -0.24, -0.25,
-0.3, -0.3, -0.18, -0.19, -0.24, -0.29, -0.3, -0.34, -0.36, -0.41,
-0.46, -0.63, -0.16, -0.18, -0.21, -0.22, -0.27, -0.29, -0.31,
-0.31, -0.35, -0.39, -0.42, -0.15, -0.15, -0.2, -0.21, -0.22,
-0.27, -0.29, -0.29, -0.31, -0.36, -0.07, -0.13, -0.15, -0.16,
-0.16, -0.21, -0.22, -0.24, -0.24, -0.25, -0.3, -0.3, -0.18,
-0.19, -0.24, -0.29, -0.3, -0.34, -0.36, -0.41, -0.46, -0.63,
-0.16, -0.18, -0.21, -0.22, -0.27, -0.29, -0.31, -0.31, -0.35,
-0.39, -0.42, -0.15, -0.15, -0.2, -0.21, -0.22, -0.27, -0.29,
-0.29, -0.31, -0.36), impeff = structure(c(1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L), .Label = c("Importance", "Effectiveness"), class = "factor"),
score = c(0.87, 0.79, 0.78, 0.82, 0.81, 0.81, 0.92, 0.92,
0.78, 0.81, 0.86, 0.91, 0.79, 0.79, 0.87, 0.93, 0.9, 0.9,
0.82, 0.95, 0.91, 0.95, 0.77, 0.79, 0.82, 0.8, 0.83, 0.9,
0.91, 0.94, 0.89, 0.94, 0.91, 0.82, 0.74, 0.78, 0.81, 0.83,
0.85, 0.82, 0.81, 0.8, 0.83, 0.8, 0.66, 0.63, 0.66, 0.65,
0.6, 0.7, 0.68, 0.54, 0.56, 0.56, 0.61, 0.61, 0.6, 0.63,
0.64, 0.6, 0.56, 0.46, 0.54, 0.45, 0.32, 0.61, 0.61, 0.61,
0.58, 0.56, 0.61, 0.6, 0.63, 0.54, 0.55, 0.49, 0.67, 0.59,
0.58, 0.6, 0.61, 0.58, 0.53, 0.52, 0.49, 0.47)), row.names = c(NA,
-86L), .Names = c("dept", "label", "Gap", "impeff", "score"), class = "data.frame")
And the code:
ggplot(data = srvc_data, aes(x = label, y = score)) +
  geom_bar(aes(fill = impeff), stat = "identity", position = "dodge", width = 1) +
  facet_grid(~dept, switch = "x", scales = "free", space = "free")
  # + coord_flip()  # disabled: with the flip, the facet strips stay along the bottom
The plot (without the flip) looks like the one below. I need it horizontal, with the facet categories on the far left. How does coord_flip() work, and why doesn't it also move the facet strips? Please ignore the crammed formatting!
**UPDATE
Thanks to @neilfws I have fixed the plot by switching the order of the faceting variables.
ggplot(data = srvc_data, aes(x = label, y = score)) +
geom_bar( aes(fill = impeff),stat = "identity", position = "dodge",width = 1) +
facet_grid(dept~., switch = "y", scales = "free_y", space = "free") +
coord_flip()
Now I have the correctly oriented plot, but each facet reserves a lot of empty space for labels that do not occur in it. Within the facet_grid() call, neither scales = "free" nor drop = TRUE helps. Any ideas? Plot below for reference.
If you use coord_flip(), you also need to reverse the faceting formula (dept ~ . instead of ~dept) to place the facets on the side, and set switch = "y" to move the strips to the y-axis. Does this get you close to what you want?
ggplot(srvc_data, aes(label, score)) +
  geom_bar(aes(fill = impeff), stat = "identity", position = "dodge", width = 1) +
  facet_grid(dept ~ ., switch = "y", scales = "free", space = "free") +
  coord_flip()
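A possible alternative that avoids coord_flip() altogether is sketched below. It assumes ggplot2 3.3.0 or later, where bar geoms pick up a horizontal orientation when the discrete variable is mapped to y. The toy data frame and its scores are made up for illustration; only the column names follow srvc_data. Freeing only the y scale and the y spacing is meant to let each facet keep just its own labels.

```r
library(ggplot2)

# Toy stand-in for srvc_data: same column names, invented values
toy <- data.frame(
  dept   = rep(c("IT", "Marketing"), times = c(4, 6)),
  label  = rep(c("test1", "test2", "test3", "test4", "test5"), each = 2),
  impeff = rep(c("Importance", "Effectiveness"), times = 5),
  score  = c(0.77, 0.61, 0.79, 0.61, 0.87, 0.80, 0.79, 0.66, 0.78, 0.63)
)

# Map the category to y and the value to x, so no flip is needed
p <- ggplot(toy, aes(x = score, y = label, fill = impeff)) +
  geom_col(position = "dodge", width = 1) +
  # one row of panels per department, strips on the left,
  # each panel showing only its own labels with proportional height
  facet_grid(dept ~ ., switch = "y", scales = "free_y", space = "free_y")

p
```

If this behaves as expected on the real data, swapping toy for srvc_data should be the only change needed.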

Random effects model in R - error

I am running an econometric model with panel data in R. I am using the plm package, and the pooled and fixed effects models work fine. But I get an error when I try to fit a random effects model, and I don't know how to fix it.
Here is my whole dataset and code:
auto <- structure(list(Country = structure(c(1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 6L, 6L,
6L, 6L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 10L,
10L, 10L, 10L, 11L, 11L, 11L, 11L), .Label = c("Bahrain", "Cuba",
"China", "Kuwait", "Lao PDR", "Qatar", "Saudi Arabia", "Swaziland",
"Syria", "United Arab Emirates", "Vietnam"), class = "factor"),
Year = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L,
3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L,
2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L,
1L, 2L, 3L, 4L), .Label = c("1971", "1981", "1991", "2001"
), class = "factor"), AVG_GR_. = c(2.44, -2.93, 1.77, -1.04,
3.17, 3.5, -1.59, 5.13, 4.29, 7.51, 9.42, 9.83, -7.39, -5.52,
10.72, -0.14, 1.77, 3.38, 3.68, 5.33, -1.55, -5.72, 4.64,
1.5, 6.06, -5.25, 0.54, 2.28, 6.99, 2.82, 0.82, 1.12, 6.72,
-2, 3.09, 2.15, -1.06, -4.88, 0.2, -6.04, 1.61, 3.21, 5.88,
6.24), GDP_PC = c(17444.65, 19550.76, 15970.05, 18212.71,
2067.93, 3127.98, 3221.25, 3081.73, 153.5, 231.14, 491.26,
1207.52, 70184.35, 23911.92, 9559.35, 27681.03, 162.06, 212.46,
261.98, 386.38, 72617.74, 55370.39, 31970, 51090.02, 13752.55,
21124.79, 12891.51, 12446.49, 881.75, 1595.82, 1995.8, 2191.36,
738.63, 1349.2, 1057.84, 1380.2, 88377.72, 75348.77, 43306.13,
45038.43, 164.15, 194.45, 267.17, 481.92), POP_. = c(5.39,
3.26, 3.03, 6.49, 1.22, 0.75, 0.5, 0.13, 1.91, 1.71, 0.95,
0.6, 6.22, 4.16, -0.66, 4.61, 1.93, 2.7, 2.42, 1.73, 7.44,
7.9, 2.23, 11.57, 5.43, 5.12, 2.2, 3.08, 3.07, 3.64, 2.12,
1.16, 3.45, 3.35, 2.77, 2.78, 15.96, 5.94, 5.3, 10.95, 2.29,
2.3, 1.62, 0.97), CONSUMP_. = c(64.21, 52.81, 51.47, 40.51,
54.58, 54.96, 62.74, 54.02, 51.72, 51.01, 45.63, 39, 27.44,
48.61, 49.76, 35.74, 90.19, 90.65, 89.15, 70.38, 21.33, 26.27,
26.84, 16.81, 22.96, 46.85, 44.2, 31.61, 54.77, 74.9, 80.42,
79.36, 67.09, 69.71, 69.92, 61.26, 15.28, 33.07, 46.79, 59.97,
90, 89.89, 73.9, 65.33), GOV_CON_. = c(11.1, 19.55, 19.21,
14.27, 31.67, 31.66, 29.47, 34.91, 12.99, 14.11, 14.53, 14.1,
12.04, 23.7, 48.98, 18.45, 8.05, 8.29, 7.21, 8.96, 20.47,
36.49, 31.09, 14.5, 16.02, 30.12, 26.94, 22.53, 19.07, 17.11,
17.65, 14.76, 19.93, 19.6, 12.75, 12.67, 10.87, 19.27, 16.99,
7.66, 6.73, 6.85, 7.46, 6.19), CAP_FORM_. = c(34.15, 32.51,
24.24, 26.56, 25.94, 25.49, 10.76, 10.7, 34.57, 35.19, 37.79,
42.21, 13.55, 18.68, 17.9, 17.28, 7.57, 10.24, 16.68, 30.28,
22.49, 18.37, 26.13, 36.58, 22.59, 22.7, 20.49, 23.68, 30.77,
21.42, 17.65, 14.55, 25.34, 20.68, 22.53, 23.48, 29.93, 26.28,
27.29, 22.63, 14.45, 14.46, 25.22, 36.44), NAT_RES_. = c(27.42,
20.18, 17.52, 23.34, 1.81, 1.87, 2.5, 3.42, 41.09, 38.83,
40.09, 17.91, 66.53, 41.25, 35.94, 48.41, 5.28, 4.2, 3.01,
10.15, 63.5, 40.84, 39.7, 54.17, 57.89, 31.24, 32.74, 42.77,
6.47, 3.64, 2.25, 1.32, 9.55, 9.14, 14.19, 22.92, 51.04,
37.08, 27.99, 31.36, 3.95, 4.17, 8.39, 13.57), TRADE = c(1.69,
1.48, 1.37, 1.34, 0.77, 0.76, 0.33, 0.34, 0.11, 0.21, 0.35,
0.58, 1.03, 0.99, 1.09, 0.9, 0.15, 0.23, 0.63, 0.57, 0.95,
0.82, 0.85, 0.91, 0.89, 0.76, 0.66, 0.8, 1.47, 1.54, 1.42,
1.62, 0.51, 0.44, 0.66, 0.71, 1.1, 0.97, 1.37, 1.23, 0.62,
0.62, 0.86, 1.43), INFL_. = c(13.26, 3.24, 1.64, 5.65, 5.22,
0.11, 5.49, 2.44, 1.17, 5.72, 6.85, 4.2, 31.52, -0.47, 3.25,
7.29, 43.86, 56.9, 32.37, 7.95, 20.84, -1.59, 3.18, 8.65,
26.67, -1.16, 2.4, 5.73, 10.71, 11.36, 10.97, 8.04, 11.62,
17.43, 6.74, 6.78, 28.31, 1.25, 2.03, 6.94, 7.05, 156.6,
18.99, 9.45), LIFE_EXP = c(67.39, 71.47, 73.66, 75.55, 72.28,
74.46, 75.6, 77.81, 65.7, 68.43, 70.64, 73.99, 68.17, 71.25,
72.92, 73.79, 47.79, 51.39, 58.38, 64.68, 71.16, 74.31, 76.18,
77.53, 58.65, 66.77, 71.16, 74.03, 51.33, 57.45, 54.96, 46.81,
63.01, 68.42, 72.03, 74.56, 65.49, 70.19, 73.24, 75.66, 62.69,
69.09, 72.28, 74.66), EDU_T = c(0.68, 1.59, 2.63, 3.14, 0.75,
1.46, 2.81, 3.84, 0.37, 0.62, 1.08, 1.71, 1.41, 2.71, 3.53,
3.54, 0.16, 0.35, 0.65, 1, 1.61, 2.11, 2.5, 3.06, 1.06, 1.44,
2.13, 2.66, 0.35, 0.74, 1.07, 0.91, 0.34, 0.74, 1.27, 1.3,
1.14, 1.65, 2.61, 3.85, 0.67, 1.21, 0.67, 1.54)), .Names = c("Country",
"Year", "AVG_GR_.", "GDP_PC", "POP_.", "CONSUMP_.", "GOV_CON_.",
"CAP_FORM_.", "NAT_RES_.", "TRADE", "INFL_.", "LIFE_EXP", "EDU_T"
), row.names = c(1L, 2L, 3L, 4L, 9L, 10L, 11L, 12L, 5L, 6L, 7L,
8L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 25L, 26L, 27L, 28L,
29L, 30L, 31L, 32L, 33L, 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L,
42L, 43L, 44L, 45L, 46L, 47L, 48L), class = c("plm.dim", "data.frame"
))
Y <- cbind(auto$AVG_GR_.)
X <- cbind(auto$GDP_PC, auto$POP_., auto$CONSUMP_., auto$GOV_CON_.,
auto$CAP_FORM_., auto$NAT_RES_., auto$TRADE, auto$INFL_.,
auto$LIFE_EXP, auto$EDU_T)
pdata <- plm.data(auto, c("Country", "Year"))
random <- plm(Y~X, data=pdata, model="random")
Everything is OK until the last line, where I get this error:
Error in if (sigma2$id < 0) stop(paste("the estimated variance of the", :
missing value where TRUE/FALSE needed
Thanks for your help :)
I came here looking for help myself, but I solved your problem. The first column holds automatically filled-in row names. You need to delete the first column.
This worked:
> pdata <- pdata[,2:13];
> random <- plm(Y~X, data=pdata, model="random")
Just replace the last line of your code with the two lines above.
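For comparison, here is a minimal sketch of the more usual plm idiom: declare the panel index with pdata.frame() and name the columns in the formula, instead of passing cbind() matrices built outside the data argument. It uses the Grunfeld data that ships with plm as a stand-in, so it is not a verified fix for the error above; whether a random effects model can be estimated on the question's data with all ten regressors is not checked here.

```r
library(plm)

# Grunfeld ships with plm; it stands in for the question's data
data("Grunfeld", package = "plm")

# Declare the panel structure explicitly: individual index first, then time
pgrun <- pdata.frame(Grunfeld, index = c("firm", "year"))

# Name the columns in the formula so plm() looks them up in `data`
re_fit <- plm(inv ~ value + capital, data = pgrun, model = "random")

summary(re_fit)
```

With the question's data, the analogous call would use index = c("Country", "Year") and a formula such as AVG_GR_. ~ GDP_PC + POP_. (extended with the remaining columns).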

Creating ROC curve with GGPLOT

I have the following data:
df <- structure(list(TPR = c(0.02, 0.04, 0.06, 0.08, 0.1, 0.12, 0.14,
0.16, 0.18, 0.2, 0.22, 0.24, 0.26, 0.28, 0.3, 0.32, 0.34, 0.36,
0.38, 0.4, 0.42, 0.44, 0.46, 0.48, 0.5, 0.52, 0.54, 0.56, 0.58,
0.6, 0.62, 0.64, 0.64, 0.64, 0.66, 0.68, 0.7, 0.72, 0.74, 0.76,
0.78, 0.8, 0.8, 0.82, 0.82, 0.84, 0.84, 0.84, 0.86, 0.86, 0.86,
0.86, 0.88, 0.88, 0.9, 0.92, 0.92, 0.92, 0.92, 0.94, 0.94, 0.96,
0.96, 0.96, 0.96, 0.96, 0.96, 0.98, 0.98, 0.98, 0.98, 0.98, 0.98,
0.98, 0.98, 0.98, 0.98, 0.98, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.02, 0.04, 0.06, 0.08, 0.1,
0.12, 0.14, 0.16, 0.18, 0.2, 0.22, 0.24, 0.24, 0.26, 0.28, 0.3,
0.32, 0.34, 0.36, 0.38, 0.4, 0.42, 0.42, 0.42, 0.44, 0.46, 0.48,
0.5, 0.52, 0.54, 0.56, 0.58, 0.6, 0.6, 0.6, 0.6, 0.62, 0.62,
0.62, 0.64, 0.66, 0.66, 0.68, 0.68, 0.68, 0.7, 0.72, 0.74, 0.76,
0.78, 0.8, 0.8, 0.8, 0.82, 0.82, 0.84, 0.84, 0.84, 0.86, 0.86,
0.86, 0.86, 0.86, 0.88, 0.88, 0.88, 0.9, 0.9, 0.9, 0.9, 0.9,
0.9, 0.9, 0.92, 0.94, 0.96, 0.96, 0.96, 0.96, 0.96, 0.96, 0.96,
0.98, 0.98, 0.98, 0.98, 0.98, 0.98, 0.98, 0.98, 0.98, 1, 1, 1,
1, 1, 1, 1, 1, 1, 0.02, 0.04, 0.06, 0.08, 0.1, 0.1, 0.1, 0.12,
0.14, 0.16, 0.18, 0.2, 0.22, 0.24, 0.24, 0.26, 0.28, 0.28, 0.3,
0.32, 0.34, 0.36, 0.38, 0.4, 0.42, 0.42, 0.42, 0.42, 0.44, 0.44,
0.44, 0.46, 0.48, 0.48, 0.5, 0.52, 0.54, 0.56, 0.58, 0.58, 0.6,
0.62, 0.62, 0.62, 0.64, 0.66, 0.68, 0.68, 0.7, 0.72, 0.72, 0.72,
0.72, 0.74, 0.74, 0.74, 0.76, 0.76, 0.78, 0.78, 0.8, 0.82, 0.84,
0.84, 0.84, 0.86, 0.88, 0.88, 0.9, 0.9, 0.92, 0.92, 0.92, 0.92,
0.92, 0.92, 0.92, 0.92, 0.94, 0.94, 0.96, 0.96, 0.96, 0.96, 0.98,
0.98, 0.98, 0.98, 0.98, 0.98, 0.98, 0.98, 0.98, 0.98, 0.98, 0.98,
1, 1, 1, 1, 0.02, 0.04, 0.06, 0.06, 0.06, 0.08, 0.08, 0.1, 0.12,
0.14, 0.16, 0.16, 0.18, 0.2, 0.22, 0.24, 0.26, 0.28, 0.28, 0.3,
0.32, 0.32, 0.34, 0.34, 0.36, 0.38, 0.4, 0.42, 0.42, 0.44, 0.46,
0.46, 0.46, 0.48, 0.48, 0.5, 0.52, 0.54, 0.56, 0.56, 0.58, 0.6,
0.62, 0.64, 0.64, 0.64, 0.64, 0.64, 0.66, 0.68, 0.68, 0.7, 0.7,
0.7, 0.7, 0.7, 0.72, 0.74, 0.76, 0.76, 0.78, 0.78, 0.78, 0.8,
0.8, 0.82, 0.82, 0.84, 0.86, 0.86, 0.86, 0.86, 0.88, 0.9, 0.92,
0.92, 0.92, 0.92, 0.92, 0.92, 0.92, 0.92, 0.92, 0.94, 0.94, 0.94,
0.94, 0.94, 0.94, 0.96, 0.98, 0.98, 0.98, 0.98, 1, 1, 1, 1, 1,
1), FPR = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.02, 0.04, 0.04,
0.04, 0.04, 0.04, 0.04, 0.04, 0.04, 0.04, 0.06, 0.06, 0.08, 0.08,
0.1, 0.12, 0.12, 0.14, 0.16, 0.18, 0.18, 0.2, 0.2, 0.2, 0.22,
0.24, 0.26, 0.26, 0.28, 0.28, 0.3, 0.32, 0.34, 0.36, 0.38, 0.38,
0.4, 0.42, 0.44, 0.46, 0.48, 0.5, 0.52, 0.54, 0.56, 0.58, 0.58,
0.6, 0.62, 0.64, 0.66, 0.68, 0.7, 0.72, 0.74, 0.76, 0.78, 0.8,
0.82, 0.84, 0.86, 0.88, 0.9, 0.92, 0.94, 0.96, 0.98, 1, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02,
0.02, 0.02, 0.02, 0.02, 0.04, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06,
0.06, 0.06, 0.06, 0.06, 0.08, 0.1, 0.12, 0.12, 0.14, 0.16, 0.16,
0.16, 0.18, 0.18, 0.2, 0.22, 0.22, 0.22, 0.22, 0.22, 0.22, 0.22,
0.24, 0.26, 0.26, 0.28, 0.28, 0.3, 0.32, 0.32, 0.34, 0.36, 0.38,
0.4, 0.4, 0.42, 0.44, 0.44, 0.46, 0.48, 0.5, 0.52, 0.54, 0.56,
0.56, 0.56, 0.56, 0.58, 0.6, 0.62, 0.64, 0.66, 0.68, 0.68, 0.7,
0.72, 0.74, 0.76, 0.78, 0.8, 0.82, 0.84, 0.84, 0.86, 0.88, 0.9,
0.92, 0.94, 0.96, 0.98, 1, 0, 0, 0, 0, 0, 0.02, 0.04, 0.04, 0.04,
0.04, 0.04, 0.04, 0.04, 0.04, 0.06, 0.06, 0.06, 0.08, 0.08, 0.08,
0.08, 0.08, 0.08, 0.08, 0.08, 0.1, 0.12, 0.14, 0.14, 0.16, 0.18,
0.18, 0.18, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.22, 0.22, 0.22, 0.24,
0.26, 0.26, 0.26, 0.26, 0.28, 0.28, 0.28, 0.3, 0.32, 0.34, 0.34,
0.36, 0.38, 0.38, 0.4, 0.4, 0.42, 0.42, 0.42, 0.42, 0.44, 0.46,
0.46, 0.46, 0.48, 0.48, 0.5, 0.5, 0.52, 0.54, 0.56, 0.58, 0.6,
0.62, 0.64, 0.64, 0.66, 0.66, 0.68, 0.7, 0.72, 0.72, 0.74, 0.76,
0.78, 0.8, 0.82, 0.84, 0.86, 0.88, 0.9, 0.92, 0.94, 0.94, 0.96,
0.98, 1, 0, 0, 0, 0.02, 0.04, 0.04, 0.06, 0.06, 0.06, 0.06, 0.06,
0.08, 0.08, 0.08, 0.08, 0.08, 0.08, 0.08, 0.1, 0.1, 0.1, 0.12,
0.12, 0.14, 0.14, 0.14, 0.14, 0.14, 0.16, 0.16, 0.16, 0.18, 0.2,
0.2, 0.22, 0.22, 0.22, 0.22, 0.22, 0.24, 0.24, 0.24, 0.24, 0.24,
0.26, 0.28, 0.3, 0.32, 0.32, 0.32, 0.34, 0.34, 0.36, 0.38, 0.4,
0.42, 0.42, 0.42, 0.42, 0.44, 0.44, 0.46, 0.48, 0.48, 0.5, 0.5,
0.52, 0.52, 0.52, 0.54, 0.56, 0.58, 0.58, 0.58, 0.58, 0.6, 0.62,
0.64, 0.66, 0.68, 0.7, 0.72, 0.74, 0.74, 0.76, 0.78, 0.8, 0.82,
0.84, 0.84, 0.84, 0.86, 0.88, 0.9, 0.9, 0.92, 0.94, 0.96, 0.98,
1), GeneSet = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("Distort = 1", "Distort = 1.5",
"Distort = 2", "Distort = 2.5"), class = "factor")), .Names = c("TPR",
"FPR", "GeneSet"), row.names = c(NA, -400L), class = "data.frame")
But why does the following code fail to create the desired plot?
library(ggplot2)
library(RColorBrewer)
p <- qplot(FPR, TPR, data = df, geom = "blank", main = "ROC curve", xlab = "False Positive Rate (1-Specificity)", ylab = "True Positive Rate (Sensitivity)" )
p <- p + geom_line(aes(x = FPR, y = TPR, data = data, colour = GeneSet), size = 2, alpha = 0.7) + scale_colour_manual(values=colors)
p
I got this error message:
Don't know how to automatically pick scale for object of type data.frame. Defaulting to continuous
Error: Aesthetics must either be length one, or the same length as the dataProblems:data
The desired plot is this:
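You don't need both geom = "blank" and geom_line(); geom_line() alone is enough. The error itself comes from data = data sitting inside aes(), where it is treated as an aesthetic; the data argument belongs outside aes(). The colors can't be reproduced because the variable colors isn't provided in the question.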
ggplot(df, aes(FPR, TPR, color = GeneSet)) +
  geom_line(size = 2, alpha = 0.7) +
  labs(title = "ROC curve",
       x = "False Positive Rate (1-Specificity)",
       y = "True Positive Rate (Sensitivity)")
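If a manual palette is wanted, as in the question's scale_colour_manual(values = colors), a sketch along these lines could work. The roc_colors vector is hypothetical (the question never shows colors), and the toy data frame only mimics the column names of df with made-up values.

```r
library(ggplot2)
library(RColorBrewer)

# Toy stand-in for the question's df: same column names, invented values
toy <- data.frame(
  FPR     = rep(c(0, 0.25, 0.5, 0.75, 1), times = 2),
  TPR     = c(0, 0.60, 0.80, 0.95, 1,
              0, 0.40, 0.65, 0.85, 1),
  GeneSet = factor(rep(c("Distort = 1", "Distort = 1.5"), each = 5))
)

# Hypothetical palette, one color per GeneSet level
# (brewer.pal() needs n >= 3, so take the first two of three)
roc_colors <- brewer.pal(3, "Set1")[1:2]

p <- ggplot(toy, aes(FPR, TPR, colour = GeneSet)) +
  # size matches the answer above; newer ggplot2 versions prefer linewidth
  geom_line(size = 2, alpha = 0.7) +
  scale_colour_manual(values = roc_colors) +
  labs(title = "ROC curve",
       x = "False Positive Rate (1-Specificity)",
       y = "True Positive Rate (Sensitivity)")

p
```

For the full data, brewer.pal(4, "Set1") would supply one color for each of the four GeneSet levels.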

Resources