multiple ggplot2 in 1 data frame - r

df
primer timepoints mean sde
Acan 0 1.000000e+00 0.000000e+00
Acan 20 9.547922e-01 1.729115e-01
Acan 40 1.936454e+00 9.934593e-01
Acan 60 1.261360e+00 2.232165e-01
Acan 120 2.219807e+00 5.915425e-01
Acan 240 2.540490e+00 5.651534e-01
Acan 360 1.518923e+00 1.522455e-01
Actb 0 1.000000e+00 0.000000e+00
Actb 20 1.061931e+00 4.362860e-02
Actb 40 8.835103e-01 1.196449e-01
Actb 60 8.889279e-01 1.401378e-01
Actb 120 1.001135e+00 7.770563e-02
Actb 240 8.551348e-01 1.884853e-01
Actb 360 7.343955e-01 1.824412e-01
All primers sit in one data frame, but I want to make a separate ggplot2 scatter plot for each unique primer (mean on the y axis, timepoints on the x axis). Could I get lapply to work here?
Ideally I would lapply a function over the primers and get back a list of plots.
Here's the ggplot code I've been using in my attempts to loop this:
plot_gg <- function(x){
ggplot(df,aes(x=timepoints,mean)) +
geom_point() +
geom_line() +
scale_x_continuous(name='x axis') +
scale_y_continuous(name='y axis') +
geom_errorbar(aes(ymin=mean-sde,ymax=mean+sde),width=2) +
opts(title = primer)
}
desired_list <- lapply(unique(df$primer),plot_gg,df)
This is pretty wrong, but I'm not sure whether I should first subset the df by each individual primer, or whether it would be easier to do with ggplot in the structure the data is already in.
If you could point me in the right direction, that would be great.

I think the missing pieces are redefining the arguments to geom_errorbar and using facet_wrap. If you specify the number of columns and rows in the layout of facet_wrap you can get multiple pages. Another way to print multiple pages is with the grid::grid.newpage() function.
ggplot(df, aes(x = timepoints, y = mean, ymin = mean - sde,
ymax = mean + sde)) +
geom_errorbar() + geom_point() + geom_line() +
facet_wrap(~ primer) +
xlab('x axis') + ylab('y axis') + ggtitle("primer")
For the multi-page request added in the comment below, and using @Thierry's edits:
pdf("twopage.pdf", onefile=TRUE)
for ( i in unique(df$primer) ) {
g <- ggplot(df[df$primer == i, ], aes(x = timepoints, y = mean, ymin = mean - sde,
ymax = mean + sde)) +
geom_errorbar() + geom_point() + geom_line() +
facet_wrap(~ primer, ncol=1, nrow=1) +
xlab('x axis') + ylab('y axis') + ggtitle("primer")
print(g) ; cat(paste("printing", i, "\n"))}
dev.off()
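To get the list of plots the question asks for, one option is to split the data frame by primer and lapply a plotting function over the pieces. This is only a minimal sketch: the df below is a hypothetical stand-in holding a few rounded rows of the question's data, and labs() is used for the title because opts() is no longer available in current ggplot2.

```r
library(ggplot2)

# Hypothetical stand-in for the question's df (a few rounded rows only)
df <- data.frame(
  primer     = rep(c("Acan", "Actb"), each = 3),
  timepoints = rep(c(0, 20, 40), times = 2),
  mean       = c(1.000, 0.955, 1.936, 1.000, 1.062, 0.884),
  sde        = c(0.000, 0.173, 0.993, 0.000, 0.044, 0.120)
)

# Takes the rows for ONE primer and returns one plot
plot_gg <- function(d) {
  ggplot(d, aes(x = timepoints, y = mean)) +
    geom_point() +
    geom_line() +
    geom_errorbar(aes(ymin = mean - sde, ymax = mean + sde), width = 2) +
    labs(x = "x axis", y = "y axis",
         title = as.character(unique(d$primer)))
}

# split() gives one data frame per primer; lapply() returns a named list of plots
plot_list <- lapply(split(df, df$primer), plot_gg)
names(plot_list)
```

Each element can then be printed on its own, e.g. print(plot_list[["Acan"]]), or the whole list can be printed inside a pdf() device as in the loop above.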

Related

Adding names to all X axis values using ggplot2 in R

The head of data frame is as follows:
Age number
21 4
22 4
23 5
24 6
25 11
26 10
I am trying to plot the frequency chart using ggplot using the following code
ggplot(data=x2, aes(x=Age, y=number)) +
geom_bar(stat="identity", fill="steelblue")+
geom_text(aes(label=number), vjust=-0.3, size=3.5)+
theme_minimal()+ labs(x = "Age", y = "Number of users")+
ggtitle("Frequency of Age")
and I get the output but not all the values on the X Axis are visible. I am sorry as this might be a very silly question but I am very new to R.
You can use scale_x_continuous to set the axis breaks. With such a large number of axis labels, this probably works better if the orientation is flipped. Even then, it's still quite crowded.
library(tidyverse)
# Fake data
set.seed(2)
x2 = tibble(Age=sample(20:70, 1000, replace=TRUE)) %>%
group_by(Age) %>%
summarise(number=n())
ggplot(data=x2, aes(x=Age, y=number)) +
geom_bar(stat="identity", fill="steelblue")+
geom_text(aes(label=number, y=0.5*number), size=3, colour="white")+
theme_minimal() +
labs(x = "Age", y = "Number of users")+
ggtitle("Frequency of Age") +
coord_flip() +
scale_x_continuous(breaks=min(x2$Age):max(x2$Age), expand=c(0,0.1)) +
scale_y_continuous(expand=c(0,0.2))
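If there are only a handful of ages, another option is to treat Age as a discrete variable, since a discrete axis labels every level. A small sketch, assuming a hypothetical x2 built from the six rows shown in the question:

```r
library(ggplot2)

# Hypothetical data: just the head of x2 shown in the question
x2 <- data.frame(Age = 21:26, number = c(4, 4, 5, 6, 11, 10))

# factor(Age) makes the x axis discrete, so every age gets its own label
p <- ggplot(x2, aes(x = factor(Age), y = number)) +
  geom_col(fill = "steelblue") +  # geom_col() is shorthand for geom_bar(stat = "identity")
  geom_text(aes(label = number), vjust = -0.3, size = 3.5) +
  theme_minimal() +
  labs(x = "Age", y = "Number of users", title = "Frequency of Age")
p
```

With many distinct ages the labels will still crowd, so the flipped layout above is probably the better fit for the full data.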

In ggplot2, geom_text() labels are misplaced below my data points (as pictured). How to overlay them onto points?

I'm using ggplot2 to create a simple dot plot of -1 to +1 correlation values using the following R code:
ggplot(dataframe, aes(x = exit)) +
geom_point(aes(y= row.names(dataframe))) +
geom_text(aes(y=exit, label=samplesize))
The y-axis has text labels, and I believe those text labels may be the reason that my geom_text() data point labels are squished down into the bottom of the plot as pictured here:
How can I change my plotting so that the data point labels appear on the dots themselves?
I understand that you would like to have the samplesize appear above each data point in the plot. Here is a sample plot with a sample data frame that does this:
EDIT: Per note by Gregor, changed the geom_text() call to utilize aes() when referencing the data. Thanks for the heads up!
top10_rank <- read.table(header = TRUE, text = "
   String Number
4       h      0
1       a      1
11      w      1
3       z      3
7       z      3
2       b      4
8       q      5
6       k      6
9       r      9
5       x     10
10      l     11
")
x<-ggplot(data=top10_rank, aes(x = Number,
y = String)) + geom_point(size=3) + scale_y_discrete(limits=top10_rank$String)
x + geom_text(data=top10_rank, size=5, color = 'blue',
aes(x = Number,label = Number), hjust=0, vjust=0)
Not sure if this is what you wanted though.
Your problem is simply that you switched the y variables:
# your code
ggplot(dataframe, aes(x = exit)) +
geom_point(aes(y = row.names(dataframe))) + # here y is the row names
geom_text(aes(y =exit, label = samplesize)) # here y is the exit column
Since you want the same y-values for both you can define this in the initial ggplot() call and not worry about repeating it later
# working version
ggplot(dataframe, aes(x = exit, y = row.names(dataframe))) +
geom_point() +
geom_text(aes(label = samplesize))
Using row names is a little fragile. It's safer and more robust to create an actual data column with what you want for the y values:
# nicer code
dataframe$y = row.names(dataframe)
ggplot(dataframe, aes(x = exit, y = y)) +
geom_point() +
geom_text(aes(label = samplesize))
Having done this, you probably don't want the labels right on top of the points, maybe a little offset would be better:
# best of all?
ggplot(dataframe, aes(x = exit, y = y)) +
geom_point() +
geom_text(aes(x = exit + .05, label = samplesize), vjust = 0)
In the last case, you'll have to play with the adjustment to the x aesthetic. What looks right will depend on the dimensions of your final plot.
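An alternative to adding an offset inside aes() is the nudge_x argument of geom_text(), which shifts the labels by a fixed amount in data units. A minimal sketch, assuming a made-up dataframe, since the question does not show its data:

```r
library(ggplot2)

# Hypothetical data: row names become the y labels, as in the question
dataframe <- data.frame(
  exit       = c(-0.6, -0.1, 0.3, 0.8),
  samplesize = c(12, 40, 25, 31),
  row.names  = c("site A", "site B", "site C", "site D")
)
dataframe$y <- row.names(dataframe)

p <- ggplot(dataframe, aes(x = exit, y = y)) +
  geom_point() +
  # nudge_x moves each label 0.05 x-units to the right of its point;
  # hjust = 0 left-aligns the text at that position
  geom_text(aes(label = samplesize), nudge_x = 0.05, hjust = 0)
p
```

As with the manual offset, the right nudge value depends on the x range and the size of the final plot.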

geom_lines not linking what they should with error bars plot in ggplot

I have the following dataset ready to plot an error bars and lines graph
> growth
treatment class variable N value sd se ci
1 elevated Dominant RBAI2012 18 0.014127713 0.009739951 0.002295728 0.004843564
2 elevated Dominant RBAI2013 18 0.021869978 0.013578741 0.003200540 0.006752549
3 elevated Codominant RBAI2012 40 0.011564725 0.013718591 0.002169100 0.004387418
4 elevated Codominant RBAI2013 41 0.011471512 0.011091167 0.001732149 0.003500804
5 elevated Subordinate RBAI2012 24 0.004419784 0.009286883 0.001895677 0.003921507
6 elevated Subordinate RBAI2013 24 0.004397105 0.008704831 0.001776866 0.003675728
7 ambient Dominant RBAI2012 13 0.025836265 0.011880315 0.003295007 0.007179203
8 ambient Dominant RBAI2013 13 0.025992636 0.015162901 0.004205432 0.009162850
9 ambient Codominant RBAI2012 26 0.018067329 0.011830940 0.002320238 0.004778620
10 ambient Codominant RBAI2013 26 0.015595275 0.012467140 0.002445007 0.005035587
11 ambient Subordinate RBAI2012 33 0.006073904 0.008287442 0.001442658 0.002938599
12 ambient Subordinate RBAI2013 35 0.003239033 0.006846507 0.001157271 0.002351857
I've tried the following code, which results in this plot:
p <- ggplot(growth,aes(class,value,colour=treatment,group=variable))
pd<-position_dodge(.9)
# se= standard error; ci=confidence interval
p + geom_errorbar(aes(ymin=value-se,ymax=value+se),width=.1,position=pd,colour="black") + geom_point(position=pd,size=4) + geom_line(position=pd) +
theme_bw() + theme(legend.position=c(1,1),legend.justification=c(1,1))
The lines should link the points of the same color within each x-axis category, but clearly they don't. Could you please help me draw the lines properly (e.g. blue with blue and red with red within the "Dominant" class, and different lines for the "Codominant" class)?
Also, do you know how to include in the x-labels the variables I am grouping by (i.e. "RBAI2012", "RBAI2013")?
Many thanks
To also distinguish between the different levels of 'variable', you may introduce a fourth aesthetic: shape. First define a new grouping variable, a combination of 'treatment' and 'variable', which has four levels. Map group, colour and shape to this variable. Then use scale_colour_manual and scale_shape_manual to set two colours, which correspond to the two levels of 'treatment'. Similarly, define two shapes for the two levels of 'variable'.
growth$grp <- paste0(growth$treatment, growth$variable)
pd <- position_dodge(.9) # same dodge as defined in the question
ggplot(data = growth, aes(x = class, y = value, group = grp,
colour = grp, shape = grp)) +
geom_point(size = 4, position = pd) +
geom_line(position = pd) +
geom_errorbar(aes(ymin = value - se, ymax = value + se), colour = "black",
position = pd, width = 0.1) +
scale_colour_manual(name = "Treatment:Variable",
values = c("red", "red","blue", "blue")) +
scale_shape_manual(name = "Treatment:Variable",
values = c(19, 17, 19, 17)) +
theme_bw() +
theme(legend.position = c(1,1), legend.justification = c(1,1))
One option is using a facet plot like so:
p <- ggplot(growth, aes(x = class, y = value, group = treatment, color = treatment))
p + geom_point(size = 4) + facet_grid(. ~ variable) + geom_errorbar(aes(ymin=value-se,ymax=value+se),width=.1,colour="black") + geom_line()
If you want it on one graph, another option is defining a new variable that combines treatment and variable:
growth$treatment_variable <- paste(growth$treatment, growth$variable)
p <- ggplot(growth, aes(x = class, y = value, group = treatment_variable, colour = treatment_variable))
pd<-position_dodge(.2)
p + geom_point(size = 4, position=pd) + geom_errorbar(aes(ymin=value-se, ymax=value+se), width=.1, position=pd, colour="black") + geom_line(position=pd)
You have too many grouping variables (variable and treatment) and including them in a single plot may be a bit confusing. You might want to use faceting, like this:
p <- ggplot(growth,aes(class,value,colour=treatment,group=treatment))
pd<-position_dodge(.9)
p +
geom_errorbar(aes(ymin=value-se,ymax=value+se),width=.1,position=pd,colour="black") +
geom_point(position=pd,size=4) + geom_line(position=pd) +
theme_bw() + theme(legend.position=c(1,1),legend.justification=c(1,1)) +
facet_grid(variable~treatment)
It is possible to do this, but you need to hack it since you're essentially plotting a geom_line() on different groupings (variable + treatment) than with the geom_point() and geom_errorbar() calls.
You need to use ggplot_build() to get back the rendered data and draw a geom_line(), based on the existing points data, grouped by colour:
p <- ggplot(growth) # move the aes() into the individual charts
pd<-position_dodge(.9) # leave dodge as is
se<-0.01 # faked this
p <- p +
geom_point(aes(x=factor(class),y=value,colour=treatment,group=variable),position=pd,size=4) +
theme_bw() + theme(legend.position=c(1,1),legend.justification=c(1,1)) +
geom_errorbar(aes(x=factor(class),ymin=value-se,ymax=value+se,colour=treatment,group=variable),position=pd,width=.1,colour="black")
b<-ggplot_build(p)$data[[1]] # get the ggplot rendered data for the first layer (the points)
p + geom_line(data=b,aes(x,y,group=colour), color=b$colour) # plot the lines
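To see what ggplot_build() hands back in the hack above, it can help to inspect the first layer's data on a tiny example. A sketch with hypothetical toy data (not the growth data from the question):

```r
library(ggplot2)

# Hypothetical toy data with two classes and two treatments
toy <- data.frame(
  class     = rep(c("A", "B"), each = 2),
  treatment = rep(c("ambient", "elevated"), times = 2),
  value     = c(1, 2, 3, 4)
)

p <- ggplot(toy, aes(x = class, y = value, colour = treatment)) +
  geom_point(position = position_dodge(0.9))

# Data for the first (and only) layer after scales and dodging are applied:
# x holds the dodged numeric positions, colour the resolved colour values
built <- ggplot_build(p)$data[[1]]
built[, c("x", "y", "colour", "group")]
```

Because x and colour in this table are already the final drawn values, grouping a geom_line() by colour connects exactly the points that share a colour.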

Errorbars look like pointrange (ggplot2)

I have the following data frame:
> df <- read.table("throughputOverallSummary.txt", header = TRUE)
> df
ExperimentID clients connections msgSize Mean Deviation Error
1 77 100 50 1999 142.56427 8.368127 0.4710121
2 78 200 50 1999 284.22705 13.575943 0.3832827
3 79 400 50 1999 477.48997 44.820831 0.7538666
4 80 600 50 1999 486.87102 49.916391 0.8240869
5 81 800 50 1999 488.84899 51.422070 0.8462216
6 82 10 50 1999 15.23667 1.995150 1.0498722
7 83 50 50 1999 71.94000 5.197893 0.5793057
and some code that processes the dataframe df above:
msg_1999 = subset(df, df$msgSize == 1999)
if (nrow(msg_1999) > 0) {
limits = aes(ymax = msg_1999$Mean + msg_1999$Deviation, ymin = msg_1999$Mean -
msg_1999$Deviation)
ggplot(data = msg_1999, aes(clients, Mean, color = as.factor(connections), group =
as.factor(connections))) +
geom_point() + geom_line() +
geom_errorbar(limits, width = 0.25) +
xlab("Number of Clients") +
ylab("Throughput (in messages/second)") +
labs(title = "Message size 1999 bytes", color = "Connections")
ggsave(file = "throughputMessageSize1999.png")
}
My problem is that the error bars in the plot look like pointrange. The horizontal bars at the upper and lower end of the error bars are missing.
Ideally, the error bars should have looked something like this:
Why do errorbars from my code look different?
The width parameter has the same scale as x. You have given width = 0.25, while the x axis ranges from 0 to 800, so a bar of width 0.25 is not going to be visible on this graph. If you don't set the width value, then something reasonably sensible is guessed.
ggplot(data = df, aes(clients, Mean, color = as.factor(connections), group =
as.factor(connections))) +
geom_point() + geom_line() +
geom_errorbar(aes(ymax = Mean + Deviation, ymin=Mean-Deviation)) +
xlab("Number of Clients") +
ylab("Throughput (in messages/second)") +
labs(title = "Message size 1999 bytes", color = "Connections")
Note that if you want to predefine your mapping argument, you should still specify the variables as you would within a call to geom_xxxx. aes (and ggplot) does some fancy footwork to ensure that this will be evaluated within the correct environment at the time of plotting.
Thus the following will work:
limits <- aes(ymax = Mean + Deviation, ymin=Mean-Deviation)
ggplot(data = df, aes(clients, Mean, color = as.factor(connections), group =
as.factor(connections))) +
geom_point() + geom_line() +
geom_errorbar(limits) +
xlab("Number of Clients") +
ylab("Throughput (in messages/second)") +
labs(title = "Message size 1999 bytes", color = "Connections")
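If you would rather set the width yourself than rely on the default, one way is to express it as a fraction of the x range, since width is in data units. A sketch using the values from the question; the 2% factor is only an assumed rule of thumb, not something ggplot2 prescribes:

```r
library(ggplot2)

# The relevant columns of the question's data
df <- data.frame(
  clients     = c(100, 200, 400, 600, 800, 10, 50),
  connections = 50,
  Mean        = c(142.56427, 284.22705, 477.48997, 486.87102,
                  488.84899, 15.23667, 71.94000),
  Deviation   = c(8.368127, 13.575943, 44.820831, 49.916391,
                  51.422070, 1.995150, 5.197893)
)

# Assumed rule of thumb: make the caps about 2% of the x range wide
cap_width <- 0.02 * diff(range(df$clients))

p <- ggplot(df, aes(clients, Mean, colour = factor(connections),
                    group = factor(connections))) +
  geom_point() + geom_line() +
  geom_errorbar(aes(ymin = Mean - Deviation, ymax = Mean + Deviation),
                width = cap_width) +
  labs(x = "Number of Clients", y = "Throughput (in messages/second)",
       title = "Message size 1999 bytes", colour = "Connections")
p
```

Here the x values span 10 to 800, so cap_width comes out at 15.8 data units, which is wide enough for the horizontal caps to show.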

R ggplot barplot; Fill based on two separate variables

A picture says more than a thousand words. As you can see, my fill is based on the variable variable.
Within each bar, however, there are multiple data entities (black borders), since the discrete variable Complexity makes them unique. I am looking for something that makes each section of a bar more distinguishable than it currently is, preferably something like shading.
Here's an example (not the same dataset, since the original was imported):
dat <- read.table(text = "Complexity Method Sens Spec MMC
1 L Alpha 50 20 10
2 M Alpha 40 30 80
3 H Alpha 10 10 5
4 L Beta 70 50 60
5 M Beta 49 10 80
6 H Beta 90 17 48
7 L Gamma 19 5 93
8 M Gamma 18 39 4
9 H Gamma 10 84 74", sep = "", header=T)
library(ggplot2)
library(reshape)
short.m <- melt(dat)
ggplot(short.m, aes(x=Method, y= value/100 , fill=variable)) +
geom_bar(stat="identity",position="dodge", colour="black") +
coord_flip()
This is far from perfect, but hopefully a step in the right direction, as it's dodged by variable, but still manages to represent Complexity in some way:
ggplot(short.m, aes(x=Method, y=value/100, group=variable, fill=variable, alpha=Complexity)) +
geom_bar(stat="identity",position="dodge", colour="black") +
scale_alpha_manual(values=c(0.1, 0.5, 1)) +
coord_flip()
Adding alpha=Complexity might work:
ggplot(short.m, aes(x=Method, y= value/100 , fill=variable, alpha=Complexity)) +
geom_bar(stat="identity",position="dodge", colour="black") + coord_flip()
You might need to separate your Method and variable factors. Here are two ways to do that:
Use facet_wrap():
ggplot(short.m, aes(x=variable, y=value/100, fill=Complexity)) +
facet_wrap(~ Method) + geom_bar(stat="identity", position="stack", colour="black") +
scale_alpha_manual(values=c(0.1, 0.5, 1)) + coord_flip()
Use both on the x-axis:
ggplot(short.m, aes(x=interaction(Method, variable), y=value/100, group=Method, fill=variable, alpha=Complexity)) +
geom_bar(stat="identity", position="stack", colour="black") +
scale_alpha_manual(values=c(0.1, 0.5, 1)) + coord_flip()
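One thing to watch with scale_alpha_manual() is that unnamed values are matched to the levels in their default order, which for L/M/H is alphabetical (H, L, M). A sketch that names the values explicitly and gives every variable/Complexity pair its own dodged bar; the short.m below is a hypothetical hand-built copy of the melted data (Sens and Spec only), and grouping by interaction() is my assumption about the desired layout:

```r
library(ggplot2)

# Hypothetical long-format data shaped like short.m (Sens and Spec only)
short.m <- data.frame(
  Complexity = factor(rep(c("L", "M", "H"), times = 6),
                      levels = c("L", "M", "H")),
  Method     = rep(rep(c("Alpha", "Beta", "Gamma"), each = 3), times = 2),
  variable   = rep(c("Sens", "Spec"), each = 9),
  value      = c(50, 40, 10, 70, 49, 90, 19, 18, 10,
                 20, 30, 10, 50, 10, 17,  5, 39, 84)
)

p <- ggplot(short.m, aes(x = Method, y = value / 100, fill = variable,
                         alpha = Complexity,
                         group = interaction(variable, Complexity))) +
  geom_col(position = "dodge", colour = "black") +
  # Named values tie each alpha level to a Complexity level explicitly
  scale_alpha_manual(values = c(L = 0.3, M = 0.6, H = 1)) +
  coord_flip()
p
```

Fill then still encodes the variable, while the shading within each colour runs from light (L) to solid (H).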

Resources