Error when running poisson regression with a binary outcome - r

I am trying to run a poisson regression to predict a common binary outcome.
This is my first attempt at using dput - if I have used it inappropriately, please let me know so I can correct it.
Example data:
df <- structure(list(id = 1:30, sex = structure(c(1L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L,
2L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 1L), .Label = c("Female", "Male"
), class = "factor"), migStat = structure(c(1L, 2L, 1L, 1L, 1L,
1L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L,
1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L), .Label = c("Australian-born",
"Migrant"), class = "factor"), mhAreaBi = structure(c(1L, 1L,
1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L,
1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L), .Label = c("Metropolitan",
"Regional"), class = "factor"), empStatBi = structure(c(2L, 2L,
1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L,
2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Student / employed",
"Unemployed"), class = "factor"), pensBenBi = structure(c(1L,
2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L,
1L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L), .Label = c("No benefit",
"In receipt of pension benefit"), class = "factor"), maritStatBi = structure(c(2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L), .Label = c("Married (including de facto)",
"Not married"), class = "factor"), cto = structure(c(1L, 2L,
2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L,
2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L), .Label = c("No",
"Yes"), class = "factor")), .Names = c("id", "sex", "migStat",
"mhAreaBi", "empStatBi", "pensBenBi", "maritStatBi", "cto"), row.names = c(NA,
-30L), class = "data.frame")
When running the regression using glm in R, I receive an error:
fit <- glm(cto ~ sex + migStat + mhAreaBi + empStatBi + pensBenBi + maritStatBi, df, family = poisson)
Error in if (any(y < 0)) stop("negative values not allowed for the 'Poisson' family") :
missing value where TRUE/FALSE needed
In addition: Warning message:
In Ops.factor(y, 0) : ‘<’ not meaningful for factors
The same error has been explained briefly in this thread:
Because the "<" operator is not defined for factors the result that is
passed to if is of length 0. Setting the factor variable on the RHS
and using the integer values on the LHS succeeds.
The error does not appear when I convert the outcome to an integer; however, this:
seems to defeat the purpose of predicting a binary outcome (unless a numeric variable with range 0-1 is treated the same as a factor variable with two levels); and
does not seem necessary (at least according to this post, which uses geeglm from geepack to predict a binary outcome [unfortunately, I receive the same error when I adapt the code to my own dataset])
Questions:
Could I receive further explanation of the error?
If I convert my outcome to an integer with range 0-1, will glm treat it the same as a binary variable? If not, is there an approach better suited to running a regression for a common binary outcome?

I think the best option here is:
df$cto_binary <- as.numeric(df$cto == "Yes")
fit <- glm(cto_binary ~ sex + migStat + mhAreaBi + empStatBi + pensBenBi + maritStatBi,
df, family = poisson)
As this way you explicitly show in your code what will be a 1/success in your binary outcome and don't get tripped up by things like the ordering of factor levels. Note that in R as.numeric(c(FALSE, TRUE)) gives c(0, 1), so you always know what you're going to get from a logical comparison.
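On the first question, the error can be reproduced without glm at all. The sketch below uses a short toy factor (not the full data) and assumes the family check is the `any(y < 0)` call shown in the error message. As far as I can tell, the comparison returns NA for every element, so `any()` returns NA and `if (NA)` fails. A 0/1 numeric recoding passes the same check.

```r
# Toy outcome with the same two levels as cto (illustrative values only)
cto <- factor(c("No", "Yes", "Yes", "No"))

# "<" is not defined for unordered factors: R warns and returns NA for each element
cmp <- suppressWarnings(cto < 0)

# any() over an all-NA vector is NA; if (NA) then raises
# "missing value where TRUE/FALSE needed"
any_neg <- any(cmp)

# The explicit 0/1 recoding from the answer above is numeric, so the check works
cto_binary <- as.numeric(cto == "Yes")
any(cto_binary < 0)
```

With family = poisson the response has to be numeric, which is why the factor fails there; a 0/1 numeric column carries the same information as the two-level factor.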

Related

Form groups using block random assignment on two covariates

I often have groups of people who differ in their nationality and their status. They have to work in groups, and I would like to use block random assignment to create groups of a maximum of 5 individuals. Each group should have at least one person who is "foreign" and one who is "female". I have found the library randomizr which is supposedly able to do block random assignments, but my code does not work as intended.
An example dataset would be:
structure(list(Student = c("Susan", "Ciara", "Carl",
"Paula", "Emil", "Tammy", "Logan", "Anna", "Victor",
"Felix", "Federica", "Jesus", "Jens", "Samira", "Berit", "Yi",
"Lea", "Gordon", "Boris", "Silvester", "Celine", "Thomas", "Eduardo",
"RoY", "Marlene", "Amelie", "Claudius", "Herbert", "Cynthia", "Melanie",
"Leander", "Leona", "Tobias", "Leander", "Peter",
"Lilly", "Roxy", "Joachim"), Nationality = structure(c(2L, 2L,
1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L,
1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 1L,
1L, 1L, 2L, 2L), levels = c("Non-foreign", "Foreign"), class = "factor"),
Gender = structure(c(1L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
2L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L,
1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L), levels = c("female",
"male"), class = "factor")), class = "data.frame", row.names = c(NA,
-38L))
UPDATE: I have carefully read the vignette for the randomizr package again. I found that it is possible to create blocks from more than one covariate. I am now looking to see whether I can assign these blocks to the students to get block-randomized groups. I still need to test whether the code below works as intended.
library(randomizr)
# `data` is the data frame shown above
blocks <- with(data, paste(Nationality, Gender, sep = "_"))
Z <- block_ra(blocks = blocks, num_arms = 6)
table(data$Student, Z)
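One way to test an assignment against the stated requirements is to summarise each group directly. The sketch below is base R only. The data frame `students`, the helper `check_groups`, and the stand-in assignment vector `Z` are all made up for illustration; in practice `Z` would be the vector returned by block_ra(). Note also that 38 students split into 6 arms gives groups of 6 or 7, so a maximum size of 5 needs at least 8 groups.

```r
# Toy stand-in for the student data; only the two covariates matter here
set.seed(1)
students <- data.frame(
  Nationality = factor(sample(c("Non-foreign", "Foreign"), 38, replace = TRUE)),
  Gender      = factor(sample(c("female", "male"), 38, replace = TRUE))
)

# Stand-in for the assignment vector (hypothetical; replace with block_ra() output)
Z <- factor(rep_len(paste0("T", 1:8), nrow(students)))

# Hypothetical helper: one row per group with its size and whether it
# contains at least one foreign student and at least one female student
check_groups <- function(df, groups) {
  data.frame(
    group       = levels(groups),
    size        = as.vector(table(groups)),
    has_foreign = as.vector(tapply(df$Nationality == "Foreign", groups, any)),
    has_female  = as.vector(tapply(df$Gender == "female", groups, any))
  )
}

res <- check_groups(students, Z)
res
```

Any row with size above 5, or with has_foreign or has_female equal to FALSE, marks a group that violates the constraints.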

Error: Argument 20 matches multiple formal arguments in venn.diagram function

I'm very new to the venn.diagram() function, and am trying to create a simple venn diagram. Here is the data I am using:
structure(list(Transmitter = c("1657", "1657", "1658", "1659",
"1659", "1660", "1660", "1661", "1662", "1663", "1663", "1664",
"1664", "1666", "1667", "1667", "1668", "1668", "1669", "1670",
"1671", "1671", "1672", "1672", "1673", "1673", "1674", "1674",
"1675", "1675", "1676", "1676", "1678", "1679", "1679", "1680",
"1681", "1681", "1682", "1682", "1683", "1684", "1685", "1686",
"1686", "9782", "9782", "24166", "24166", "24167", "24168", "24169",
"24170", "24171", "24172", "24173", "24174", "24175", "24175",
"24176", "24177", "24178", "24179", "24179", "24180", "24181",
"24182", "24183", "24184", "24184", "24185", "24186", "24187",
"24188", "24189", "24190", "24191", "24192", "24193", "24194",
"24194", "24195", "24195", "24196", "24197", "24198", "24198",
"24199", "24199", "24200", "24201", "24203", "24204", "24204",
"24206", "24207", "24209", "24210", "24211", "24212", "24212",
"24213", "24214", "24215", "24216", "24216", "24217", "24218",
"24219", "30759", "30760", "30761", "30761", "30761", "30762",
"30763", "30764", "30765", "30765", "30765", "30766", "30766",
"30766", "30767", "30767", "30768", "30768", "30768", "30769",
"30769", "30769", "30770", "30771", "30772", "30772", "30772",
"30773", "30773", "30773", "30774", "30774", "30775", "30775",
"30776", "30776", "30777", "30777", "30777", "30778", "30778",
"30779", "30780", "30780", "30780", "30781", "30782", "30782",
"30783", "30784", "30785", "30786", "30787", "30788", "30788"
), Direction = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 2L,
1L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L,
2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L), .Label = c("Marine",
"River"), class = "factor")), row.names = c(NA, -164L), class = "data.frame")
I want to create a venn diagram with a circle for each direction. Inside each circle is a number indicating the number of transmitters that are considered 'river', 'marine' or both.
This is some code I modified from a website:
install.packages('VennDiagram')
library(VennDiagram)
venn.diagram(
x = list(
lasts2WOFD %>% filter(Direction == 'Marine') %>% select(Transmitter) %>% unlist() ,
lasts2WOFD %>% filter(Direction == 'River') %>% select(Transmitter) %>% unlist()
),
category.names = c("Marine" , "Fresh"),
filename = 'VennDiagram',
output = TRUE ,
imagetype="png" ,
height = 480 ,
width = 480 ,
resolution = 300,
compression = "lzw",
lwd = 1,
col=c("#440154ff", '#21908dff'),
fill = c(alpha("#440154ff",0.3), alpha('#21908dff',0.3)),
cex = 0.5,
fontfamily = "sans",
cat.cex = 0.3,
cat.default.pos = "outer",
cat.pos = c(-27, 27),
cat.dist = c(0.055, 0.055),
cat.fontfamily = "sans",
cat.col = c("#440154ff", '#21908dff'),
rotation = 1
)
When run, I get this error:
Error in VennDiagram::draw.pairwise.venn(area1 = length(x[[1]]), area2 = length(x[[2]]), :
argument 20 matches multiple formal arguments
Regarding your question, I had a look at the source code of VennDiagram and I saw that rotation is part of venn.diagram, but not of draw.pairwise.venn. The parameter gets passed but cannot be used. Simply remove rotation=1 and it should work.
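The wording "matches multiple formal arguments" is what R reports when a supplied argument name is a prefix of more than one formal argument. The sketch below reproduces that with a stand-in function. `draw_stub` is hypothetical, and I am assuming draw.pairwise.venn has several formals starting with "rotation" (for example rotation.degree and rotation.centre; check args(draw.pairwise.venn) in your installed version).

```r
# Hypothetical stand-in with two formals that share the prefix "rotation"
draw_stub <- function(rotation.degree = 0, rotation.centre = c(0.5, 0.5)) {
  rotation.degree
}

# "rotation" partially matches both formals, so R cannot decide which one is meant
msg <- tryCatch(draw_stub(rotation = 1),
                error = function(e) conditionMessage(e))
msg

# Spelling out the full argument name removes the ambiguity
ok <- draw_stub(rotation.degree = 90)
```

This is consistent with the advice above: dropping rotation = 1, or using a full argument name that the two-set function accepts, avoids the ambiguous match.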
I understand this does not answer your question, but I just wanted to let you know that you can get the diagram with other packages. My nVennR package can do that in a couple of steps. If your object is called lasts2WOFD,
library(nVennR)
myV <- plotVenn(list(River=subset(lasts2WOFD, Direction == "River")$Transmitter, Marine=subset(lasts2WOFD, Direction == "Marine")$Transmitter))
The result would be:
You can control the output as explained in the vignette. You can also export a vectorial svg file that you can edit afterwards.

Error in Anova.III.lm(mod, error, singular.ok = singular.ok, ...) : there are aliased coefficients in the model

For my experiment, I have 3 independent variables: trial type, sex and gaming experience (all of which are categorical).
I have one dependent variable: proportion of correct trials (which is continuous).
When I tried running a 3-way ANOVA, the assumptions were not met, and so I used an aligned-rank transformation ANOVA.
library(ARTool)
m1 <- art(Proportioncorrect ~ Videogamefrequency + Biologicalsex + Trialtype + Videogamefrequency:Biologicalsex + Videogamefrequency:Trialtype + Biologicalsex:Trialtype + Biologicalsex:Trialtype:Videogamefrequency, data = Gaming)
The model gave me the error:
Error in Anova.III.lm(mod, error, singular.ok = singular.ok, ...) :
there are aliased coefficients in the model
Could anyone give me a helping hand?
My data is here:
structure(list(ID = c("P_200214123342", "P_200224092247", "P_200219163622",
"P_200220130332", "P_200219091823", "P_200225184226", "P_200219123120",
"P_200219175102", "P_200214103155", "P_200219111605", "P_200217101213",
"P_200219102411", "P_200221101028", "P_200220145557", "P_200225171612",
"P_200224092247", "P_200219163622", "P_200220130332", "P_200214123342",
"P_200219091823", "P_200225184226", "P_200219123120", "P_200219175102",
"P_200214103155", "P_200219111605", "P_200217101213", "P_200219102411",
"P_200221101028", "P_200220145557", "P_200225171612"), Trialtype = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Beaconed",
"Probe"), class = "factor"), Proportioncorrect = c(0.729727660699102,
1.33933990048532, 0.729727660699102, 1.075862200454, 0.578378233982015,
1.16808048521424, 1.33933990048532, 1.13531397797248, 1.28700221758657,
1.13531397797248, 1.28700221758657, 1.13531397797248, 1.28700221758657,
1.28700221758657, 1.20358829695229, 0.297711691252463, 0.160690652951911,
0.147197653346961, 0.0667161517509908, 0.080085580033659, 0.160690652951911,
0.133731586046578, 0.214985569478799, 0.160690652951911, 0.269932799291976,
0.339836905918588, 0.242365851038963, 0.214985569478799, 0.677268408841807,
1.20358829695229), Videogamefrequency = structure(c(2L, 1L, 1L,
1L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 2L,
2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L), .Label = c("Monthly",
"Never", "Weekly", "Yearly"), class = "factor"), Biologicalsex = structure(c(1L,
1L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L,
2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L), .Label = c("Female",
"Male"), class = "factor")), row.names = c(NA, -30L), class = "data.frame")
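A common cause of aliased coefficients in a full-factorial model is a combination of factor levels with no observations. The sketch below is a possible diagnostic, not a confirmed diagnosis. It rebuilds only the design columns from the factor codes in the dput() output above (the name `Gaming_design` is made up) and cross-tabulates gaming frequency against sex.

```r
# Factor codes copied from the dput() output above
vg  <- c(2,1,1,1,2,2,3,3,3,3,4,4,4,4,4, 1,1,1,2,2,2,3,3,3,3,4,4,4,4,4)
sex <- c(1,1,1,2,1,1,1,2,2,2,1,1,2,2,2, 1,1,2,1,1,1,1,2,2,2,1,1,2,2,2)

Gaming_design <- data.frame(
  Videogamefrequency = factor(vg, labels = c("Monthly", "Never", "Weekly", "Yearly")),
  Biologicalsex      = factor(sex, labels = c("Female", "Male")),
  Trialtype          = factor(rep(c("Beaconed", "Probe"), each = 15))
)

# Count observations in each frequency-by-sex cell
cell_counts <- xtabs(~ Videogamefrequency + Biologicalsex, data = Gaming_design)
cell_counts

# Row and column indices of any empty cells
empty <- which(cell_counts == 0, arr.ind = TRUE)
empty
```

If a cell is empty, the interaction terms involving it cannot be estimated, so one option is to drop the affected interactions or merge sparse levels before fitting.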

Stacked barplot using ggplot2 - data visualisation

I have very little experience with R and am trying to make a stacked barplot using ggplot2.
I have 2 groups - control and experimental, and 2 choices - red and green. I'm not sure how to organise my data.
There were 80 animals in my trial (control n=40, experimental n=40) and they were given the choice of red and green substrate, I noted which substrate they chose, and that's the data I'm trying to plot.
I would essentially want 'Experimental' and 'Control' on the x-axis, and the number of choices on the y-axis (e.g. Control, Red n=20; Control, Green n=12; etc.).
Any help would be appreciated!
Edited to add:
This is the graph it's outputting
This is the code I'm using (including suggested adjustments):
df <- data.frame(group = rep(c("control", "experimental"), each = 40),
substrate = sample (c("red","green"), 80, TRUE))
ggplot(df, aes(x = group, y = substrate, fill = substrate)) +
geom_bar(stat = "identity") +
scale_fill_manual(values = c("red", "green"))
This is the output:
structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("control", "experimental"
), class = "factor"), substrate = structure(c(1L, 2L, 1L, 2L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 1L,
2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L,
2L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L,
1L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 2L), .Label = c("green",
"red"), class = "factor")), class = "data.frame", row.names = c(NA,
-80L))
Output from dput(behaviour), the original data frame:
structure(list(group = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("Control", "Experimental"
), class = "factor"), substrate = structure(c(1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("Green",
"Red"), class = "factor")), class = "data.frame", row.names = c(NA,
-80L))
Your data:
behaviour=structure(list(group = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("Control", "Experimental"
), class = "factor"), substrate = structure(c(1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("Green",
"Red"), class = "factor")), class = "data.frame", row.names = c(NA,
-80L))
We can tabulate your data:
table(behaviour$group,behaviour$substrate)
               Green Red
  Control         10  30
  Experimental    27  13
By default geom_bar() uses stat = "count", so you map only x and let it count the rows for you; do not also supply y. In your case, map fill to substrate and geom_bar() will stack the counts:
ggplot(behaviour,aes(x=group,fill=substrate))+
geom_bar() + scale_fill_manual(values=c("#29c7ac","#c02739"))
You could have your data like this, with one row for each observation (i.e. each animal), with the group and the substrate recorded for each:
df <- data.frame(group = rep(c("control", "experimental"), each = 40),
substrate = rep(c("green", "red", "green", "red"), c(10, 30, 27, 13)))
Now define your plot using ggplot, specifying group as your x axis and the computed count as your y axis. In ggplot2 3.3.0 and later this is written after_stat(count); the older ..count.. spelling is deprecated. Use geom_bar to get the stacked bars you are looking for, and finally use scale_fill_manual to set the colours:
library(ggplot2)
ggplot(df, aes(x = group, y = after_stat(count), fill = substrate)) +
geom_bar(colour = "black") +
scale_fill_manual(values = c("green", "red"))
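If you would rather supply the bar heights yourself, which is what stat = "identity" expects, you can tabulate first and plot the counts with geom_col(). This is a minimal sketch of that alternative; the object name `counts` and the column name `n` are my own choices.

```r
library(ggplot2)

# One row per animal, as in the answer above
df <- data.frame(group = rep(c("control", "experimental"), each = 40),
                 substrate = rep(c("green", "red", "green", "red"), c(10, 30, 27, 13)))

# One row per group/substrate combination, with the count in column n
counts <- as.data.frame(table(group = df$group, substrate = df$substrate),
                        responseName = "n")

# geom_col() draws bars whose heights are the values mapped to y
p <- ggplot(counts, aes(x = group, y = n, fill = substrate)) +
  geom_col(colour = "black") +
  scale_fill_manual(values = c(green = "green", red = "red"))
p
```

The key difference from the code in the question is that y is a numeric count here, not the substrate factor itself.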

Boxplot with two levels and multiple data.frames

I have 4 data.frames with two factor levels in each data.frame. df1 is reproduced below. Please duplicate df1 to produce df2...df4.
How can I produce boxplots with ggplot2 such that my final figure looks very similar to the figure below? The seasons in the figure represent the data frame names, present and future represent the level names, and the legend represents heavy, heavier, heaviest in the data reproduced here.
Ignore the dotted horizontal red line.
df1= structure(list(id = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("NN", "SS"), class = "factor"),
heavy = c(0.136230125, 0.136281211, 0.136038018, 0.135392862,
0.137088902, 0.136028293, 0.13640057, 0.135317058, 0.13688615,
0.136448994, 0.137089424, 0.136810847, 0.135865471, 0.136130096,
0.136361327, 0.137796714, 0.136052839, 0.135892646, 0.13544437,
0.136452363, 0.135367421, 0.135617509, 0.138202559, 0.135396942,
0.135930092, 0.135661805, 0.135666, 0.135860128, 0.137648687,
0.136057353, 0.136057731, 0.135162399, 0.136080113, 0.135285036,
0.136204839, 0.138058091, 0.137215664, 0.135696637, 0.135863902,
0.135733243, 0.138274445, 0.136632122, 0.137787919, 0.135033093,
0.136926798, 0.136766413, 0.13690947, 0.135203152, 0.138370968,
0.136862356, 0.136083112, 0.138212845, 0.135964773, 0.13583601,
0.134923731, 0.135828965, 0.136272539, 0.138127602, 0.137028323,
0.136526836, 0.136407397, 0.137025373, 0.138358757, 0.137858521,
0.135464076, 0.136302506, 0.135528362, 0.137540677, 0.136455865,
0.138470144, 0.137227895, 0.136296955, 0.136792631, 0.135875782,
0.13815733, 0.136383864, 0.136696618, 0.13857652, 0.136700903,
0.136743873, 0.136033619, 0.135970522, 0.135816385, 0.136003984,
0.136583925, 0.136768202, 0.136292002, 0.136316737, 0.136540075,
0.136051218, 0.135924119, 0.136736303, 0.136946894, 0.136266073,
0.136263692, 0.136399301, 0.13611577, 0.135857095, 0.136769488,
0.136072466, 0.135564224, 0.136496131, 0.137659507, 0.136704681,
0.136542173, 0.136777403, 0.135771538, 0.13665463, 0.136984748,
0.137717859, 0.138195237, 0.136232227, 0.135956814), heavier = c(0.227332679,
0.227200132, 0.227299118, 0.227289816, 0.22724478, 0.227082442,
0.227861315, 0.227055561, 0.227112284, 0.228651438, 0.228158412,
0.228789678, 0.227188949, 0.228850198, 0.227246991, 0.227359368,
0.227359531, 0.227310607, 0.229490445, 0.227295226, 0.227958185,
0.228104958, 0.227254823, 0.22715392, 0.228062515, 0.227509559,
0.227143662, 0.230048719, 0.227860836, 0.228467792, 0.227263728,
0.227222794, 0.227165592, 0.227140611, 0.228424335, 0.227356425,
0.227243374, 0.228936267, 0.227320467, 0.22738371, 0.227694891,
0.227270428, 0.227751798, 0.228803279, 0.227330453, 0.229679261,
0.228999206, 0.227227604, 0.227247085, 0.227198567, 0.229234921,
0.227211613, 0.23007234, 0.226793036, 0.226474338, 0.226654333,
0.229964991, 0.22880328, 0.22700099, 0.226640822, 0.227522393,
0.227463578, 0.227832692, 0.227293936, 0.230154101, 0.229813709,
0.22761097, 0.227445308, 0.228669159, 0.22660539, 0.229017398,
0.230421347, 0.227041103, 0.227583471, 0.229547568, 0.22676335,
0.226737661, 0.229922588, 0.226907188, 0.227102239, 0.226469073,
0.230680908, 0.227763879, 0.226882448, 0.226741993, 0.226693024,
0.22671415, 0.226773662, 0.227795194, 0.226983096, 0.226647946,
0.226799552, 0.226759218, 0.22692942, 0.226601519, 0.227098192,
0.226886889, 0.226959012, 0.226552119, 0.226809761, 0.226786285,
0.226709252, 0.226834015, 0.228033943, 0.226693494, 0.22748613,
0.227608804, 0.22685023, 0.226586619, 0.227718907, 0.228890098,
0.226701909, 0.230919944), heaviest = c(0.316870607, 0.316772978,
0.316851707, 0.317017543, 0.316673994, 0.317224709, 0.319234458,
0.31861305, 0.319804304, 0.318605816, 0.316930034, 0.31688398,
0.316789552, 0.320783976, 0.317094325, 0.31809319, 0.317134565,
0.318173976, 0.317213167, 0.317084404, 0.321712205, 0.317128056,
0.316866913, 0.3170489, 0.31712423, 0.31684494, 0.319497635,
0.316932301, 0.316864646, 0.317279005, 0.316887692, 0.317134437,
0.316792589, 0.320894499, 0.319883014, 0.316924639, 0.316575642,
0.31686389, 0.316985994, 0.321566256, 0.316683995, 0.320299883,
0.317308965, 0.318151948, 0.316479828, 0.319857732, 0.317171909,
0.322137849, 0.316526917, 0.316870364, 0.322205784, 0.317055758,
0.320329144, 0.318015397, 0.318719989, 0.317910658, 0.317292016,
0.321348723, 0.319915048, 0.317160762, 0.318773245, 0.319627925,
0.31869767, 0.322422407, 0.32082693, 0.318034899, 0.318760783,
0.318325502, 0.320739086, 0.317216142, 0.32284544, 0.319466593,
0.318740499, 0.317489944, 0.319064923, 0.322014928, 0.317353897,
0.318904583, 0.317931141, 0.323295254, 0.318924712, 0.318965677,
0.317700019, 0.31793468, 0.317699508, 0.317168657, 0.318903983,
0.317493401, 0.317511406, 0.317483897, 0.31748495, 0.317776804,
0.318893431, 0.317663608, 0.316978585, 0.317473467, 0.317500429,
0.317144259, 0.317330826, 0.317610353, 0.317881476, 0.31707787,
0.317728374, 0.317452137, 0.31938939, 0.317199373, 0.31898747,
0.318878952, 0.317987024, 0.318951952, 0.318419561, 0.319568088,
0.321165413)), .Names = c("id", "heavy", "heavier", "heaviest"
), class = "data.frame", row.names = c(NA, -113L))
## create some data.frames: this results in a list of four dfs
createDF <- quote(data.frame(id=sample(c("NN", "SS"), 100, replace=TRUE),
heavy=runif(100),
heavier=runif(100),
heaviest=runif(100)))
dfs <- lapply(1:4, function(i) eval(createDF))
## join and shape them
library(reshape2)
library(ggplot2)
dat <- do.call(rbind, dfs)
dat$dfid <- paste("df", rep(1:4, times=sapply(dfs, nrow)))
dat <- melt(dat, id.vars=c("id", "dfid"))
ggplot(dat, aes(id, value, group=interaction(variable, id), fill=variable)) +
geom_boxplot() +
facet_grid(~dfid)
Something like this?
df1$season<- 'winter'
df2$season<- 'spring'
df3$season<- 'summer'
df4$season<- 'fall'
# the key column (weightCat) and the value column (weight) need distinct names
df1.m <- melt(df1, id.vars=c('id', 'season'), variable.name='weightCat', value.name='weight')
df2.m <- melt(df2, id.vars=c('id', 'season'), variable.name='weightCat', value.name='weight')
df3.m <- melt(df3, id.vars=c('id', 'season'), variable.name='weightCat', value.name='weight')
df4.m <- melt(df4, id.vars=c('id', 'season'), variable.name='weightCat', value.name='weight')
df.all <- rbind(df1.m, df2.m, df3.m, df4.m)
ggplot(df.all, aes(x=id, y=weight, fill=weightCat)) + geom_boxplot() + facet_grid(. ~ season)
