confusion with overriding ggplot data in a plot layer - r

Below is a short script that draws a bar plot with a line plot added as a second layer. I have added comments indicating what works and what doesn't. While my problem is solved, I don't understand why I had the problem or how it got solved. If you can explain it or suggest the right way to do this, that would be nice.
library(ggplot2)
library(reshape2)  # provides melt()
factors <- c("A", "B", "C", "D", "B", "A", "C", "B", "D", "D")
data <- data.frame(n = 1:10, a = runif(10, 1, 5), b = runif(10, 1, 5), c = runif(10, 1, 5))
gg_data <- melt(data, id.vars = "n", variable.name = "var")
gg_data$alp <- rep(factors, 3)
gg_data1 <- melt(data.frame(n = 1:10, a = runif(10, 2, 3), b = runif(10, 4, 5), c = runif(10, 3, 4)), id.vars = "n", variable.name = "var")
# this does not work
ggplot(data = gg_data, aes(x = n, y = value, fill = alp)) + geom_bar(stat = "identity") + facet_grid(var ~ ., scales = "free_y") + geom_line(data = gg_data1, aes(x = n, y = value))
# this gives a weird output
gg_data1$alp <- rep(factors, 3)
ggplot(data = gg_data, aes(x = n, y = value, fill = alp)) + geom_bar(stat = "identity") + facet_grid(var ~ ., scales = "free_y") + geom_line(data = gg_data1, aes(x = n, y = value))
# this works the way I want it to, don't know why
gg_data1$alp <- "A"
ggplot(data = gg_data, aes(x = n, y = value, fill = alp)) + geom_bar(stat = "identity") + facet_grid(var ~ ., scales = "free_y") + geom_line(data = gg_data1, aes(x = n, y = value))

Your plots combine information from the two datasets. Because fill = alp is set in the top-level ggplot() call, every layer inherits it, including geom_line(). In the first attempt gg_data1 has no alp column, so the line layer fails; in the other two attempts the line layer uses alp to group the lines.
The easiest way to see this is to consider this new data.frame:
gg1 <- gg_data1[, c("n", "var", "value")]  # drop alp if it was added above
names(gg1) <- c("n1", "var1", "value1")
gg_combine <- cbind(gg_data, gg1)
Your 2nd graph is equivalent to:
ggplot(data = gg_combine, aes(x = n, y = value, fill = alp)) +
  geom_bar(stat = "identity") +
  geom_line(aes(x = n1, y = value1, colour = alp)) +
  facet_grid(var ~ ., scales = "free_y")
What this says is: group everything by alp and draw each group separately. geom_line() draws one line per group, so within each facet you get a separate line for each value of alp, which is why the lines look broken up. Adding colour = alp makes the grouping visible.
In your last plot every row of gg_data1 has alp = "A", so all the line data falls into a single group and you get one continuous line per facet. In effect, only the bars are grouped by alp and the lines ignore the grouping. This is equivalent to:
ggplot(data = gg_combine, aes(x = n, y = value)) +
  geom_bar(aes(fill = alp), stat = "identity") +
  geom_line(aes(x = n1, y = value1)) +
  facet_grid(var ~ ., scales = "free_y")
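Another option, sketched here with small made-up data frames (bars and lines are hypothetical stand-ins for gg_data and gg_data1), is to keep fill = alp in ggplot() but tell the line layer not to inherit it. With inherit.aes = FALSE the layer uses only the aesthetics given in its own aes(), so the second data frame does not need an alp column at all.

```r
library(ggplot2)

# Small stand-in data (hypothetical values, not the random data from the question)
bars <- data.frame(n = rep(1:3, 2),
                   var = rep(c("a", "b"), each = 3),
                   value = c(1, 2, 3, 3, 2, 1),
                   alp = c("A", "B", "A", "B", "A", "B"))
lines <- data.frame(n = rep(1:3, 2),
                    var = rep(c("a", "b"), each = 3),
                    value = c(2, 2, 2, 1, 1, 1))  # note: no alp column

p <- ggplot(bars, aes(x = n, y = value, fill = alp)) +
  geom_bar(stat = "identity") +
  # inherit.aes = FALSE: ignore the plot-level mapping (including fill = alp)
  # and use only the x and y given here
  geom_line(data = lines, aes(x = n, y = value), inherit.aes = FALSE) +
  facet_grid(var ~ ., scales = "free_y")
```

Printing p should give coloured bars with one line per facet, since nothing discrete is left in the line layer to split it into groups.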
Hope this helps.

Related

Why pies are flat in geom_scatterpie in R?

Why are the pies flat?
library(ggplot2)
library(scatterpie)  # provides geom_scatterpie()
df <- data.frame(
  Day = 1:6,
  Var1 = c(172, 186, 191, 201, 205, 208),
  Var2 = c(109, 483, 64010, 161992, 801775, 2505264),
  A = c(10, 2, 3, 4.5, 16.5, 39.6), B = c(10, 3, 0, 1.4, 4.8, 11.9),
  C = c(2, 5, 2, 0.1, 0.5, 1.2), D = c(0, 0, 0, 0, 0.1, 0.2))
ggplot() +
  geom_scatterpie(data = df, aes(x = Var1, y = Var2, group = Var1), cols = c("A", "B", "C", "D"))
I have tried using coord_fixed() and it does not work either.
The problem seems to be the scales of the x- and y-axes. If you rescale both to have zero mean and unit variance, the plot works. So, one thing you could do is plot the rescaled values, but transform the axis labels back to the original scale. To do this, you would have to do the following:
Make the data:
df <- data.frame(
  Day = 1:6,
  Var1 = c(172, 186, 191, 201, 205, 208),
  Var2 = c(109, 483, 64010, 161992, 801775, 2505264),
  A = c(10, 2, 3, 4.5, 16.5, 39.6), B = c(10, 3, 0, 1.4, 4.8, 11.9),
  C = c(2, 5, 2, 0.1, 0.5, 1.2), D = c(0, 0, 0, 0, 0.1, 0.2))
Rescale the variables:
library(dplyr)  # provides %>% and mutate()
df <- df %>%
  mutate(x = c(scale(Var1)),
         y = c(scale(Var2)))
Find the linear map that transforms the rescaled values back into their original values. Then, you can use the coefficients from the model to make a function that will transform the rescaled values back into the original ones.
m1 <- lm(Var1 ~ x, data=df)
m2 <- lm(Var2 ~ y, data=df)
trans_x <- function(x)round(coef(m1)[1] + coef(m1)[2]*x)
trans_y <- function(x)round(coef(m2)[1] + coef(m2)[2]*x)
Make the plot, passing the transformation functions as the labels argument of the scale_[xy]_continuous() functions:
ggplot() +
  geom_scatterpie(data = df, aes(x = x, y = y), cols = c("A", "B", "C", "D")) +
  scale_x_continuous(labels = trans_x) +
  scale_y_continuous(labels = trans_y) +
  coord_fixed()
There may be an easier way than this, but it wasn't apparent to me.
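One possibly simpler route, sketched here as an alternative to the lm() fits above: scale() records the mean and standard deviation it used in the attributes "scaled:center" and "scaled:scale", so the back-transformation can be written directly. The names sx, sy, back_x and back_y are made up for this sketch.

```r
Var1 <- c(172, 186, 191, 201, 205, 208)
Var2 <- c(109, 483, 64010, 161992, 801775, 2505264)

# Keep the scale() results so their attributes stay available
sx <- scale(Var1)
sy <- scale(Var2)

# Undo the standardisation: original = scaled * sd + mean
back_x <- function(x) round(x * attr(sx, "scaled:scale") + attr(sx, "scaled:center"))
back_y <- function(y) round(y * attr(sy, "scaled:scale") + attr(sy, "scaled:center"))
```

These functions can then be passed as labels = back_x and labels = back_y to scale_x_continuous() and scale_y_continuous(), in place of trans_x and trans_y.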
The range on the y-axis is so large it's compressing the disks to lines. Change the y-axis to a log scale, and you can see the shapes. Adding coord_fixed() to keep the pies circular:
ggplot() +
  geom_scatterpie(data = df, aes(x = Var1, y = Var2, group = Var1), cols = c("A", "B", "C", "D")) +
  scale_y_log10() +
  coord_fixed()

For looping x-axis in ggplot

I would like to create multiple histograms (ggplot) using a for loop. The problem is that the x-axis label of every plot stays the same: "value". Do you know how to change the x-axis label on each iteration of the loop?
My dataframe for example:
df <- data.frame(variable = c("A", "A", "B", "B", "C", "C"), value = c(1, 2, 4, 5, 2, 3))
So I would expect three plots with the x-axis labels "A", "B" and "C".
My code:
for (i in unique(df$variable)) {
  d <- subset(df, variable == i)
  print(ggplot(d, aes(x = value)) + geom_histogram())
}
You can use purrr::imap() to set a different x-axis label for each plot after splitting the data by variable.
library(ggplot2)
library(dplyr)  # provides %>%
list_plot <- df %>%
  split(.$variable) %>%
  purrr::imap(~ ggplot(.x, aes(x = value)) +
                geom_histogram() + xlab(.y))
Have you also considered using facets? The x-axis stays the same and A, B and C become the facet titles.
ggplot(df, aes(x = value)) + geom_histogram() + facet_wrap(~variable)
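If you would rather keep the original for loop, a minimal sketch is to add xlab(i) inside it, so each plot takes the current value of the loop variable as its x-axis label. The list name plots and the bins = 5 setting are choices made for this sketch, not part of the question.

```r
library(ggplot2)

df <- data.frame(variable = c("A", "A", "B", "B", "C", "C"),
                 value = c(1, 2, 4, 5, 2, 3))

plots <- list()
for (i in unique(df$variable)) {
  d <- subset(df, variable == i)
  # xlab(i) replaces the default label "value" with the current group name
  plots[[i]] <- ggplot(d, aes(x = value)) +
    geom_histogram(bins = 5) +
    xlab(i)
}
```

Storing the plots in a list instead of printing them straight away lets you print them later, for example with for (p in plots) print(p).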

Adding additional points to ggplot Boxplot

I have built a simple boxplot using ggplot, and I am trying to add an additional theoretical data-point - 'theoretical' in the sense that it did not form part of the original boxplot, but is linked to another dataset I would like to make a comparison to...
Here is my boxplot at present with some dummy data.
# libraries
library(ggplot2)
library(dplyr)    # provides %>%
library(viridis)  # provides scale_fill_viridis()
# create a dataset
data <- data.frame(
  name = c(rep("A", 10), rep("B", 10), rep("B", 10), rep("C", 10), rep("D", 10)),
  value = c(rnorm(10, 10, 3), rnorm(10, 10, 1), rnorm(10, 4, 2), rnorm(10, 6, 2), rnorm(10, 8, 4))
)
# Plot
data %>%
  ggplot(aes(x = name, y = value, fill = name)) +
  geom_boxplot() +
  scale_fill_viridis(discrete = TRUE, alpha = 0.5) +
  geom_jitter(position = position_jitter(0.2), color = "black", size = 2.0, alpha = 0.9, pch = 21)
If I had the below array, where each value represents a theoretical data-point for each condition from a different distribution, how would I include that data-point on the above plot (with a different plot character)?
A_new <- c(5)
B_new <- c(6)
C_new <- c(10)
D_new <- c(7)
new_vals <- c(A_new, B_new, C_new, D_new)
You can do this by saving the original ggplot object in a variable and then adding additional layers with + later on.
x <- data %>%
  ggplot(aes(x = name, y = value, fill = name)) +
  geom_boxplot() +
  geom_jitter(position = position_jitter(0.2), color = "black", size = 2.0, alpha = 0.9, pch = 21)
new_data <- data.frame(name = c("A", "B", "C", "D"), value = new_vals)
# height = 0 keeps the new points at their exact y values (jitter is otherwise applied vertically too)
x + geom_jitter(data = new_data, aes(x = name, y = value, fill = name), position = position_jitter(width = 0.2, height = 0), color = "blue", size = 1.5, pch = 20)
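Since there is only one theoretical value per group, a variant worth considering is geom_point() instead of geom_jitter(), so each point sits exactly on its group and value. The sketch below uses hypothetical stand-in data and an arbitrary shape and colour; shape 23 is a filled diamond, which is easy to tell apart from the round data points.

```r
library(ggplot2)

set.seed(1)
# Stand-in for the boxplot data (hypothetical values)
data <- data.frame(name = rep(c("A", "B", "C", "D"), each = 10),
                   value = rnorm(40, mean = 8, sd = 2))
# One theoretical value per group, as in the question
new_data <- data.frame(name = c("A", "B", "C", "D"),
                       value = c(5, 6, 10, 7))

p <- ggplot(data, aes(x = name, y = value, fill = name)) +
  geom_boxplot() +
  # geom_point() applies no jitter, so the y values are plotted exactly;
  # inherit.aes = FALSE keeps the fixed red fill from clashing with fill = name
  geom_point(data = new_data, aes(x = name, y = value), inherit.aes = FALSE,
             shape = 23, size = 4, fill = "red", colour = "black")
```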

Ignore sign of numbers with ggplot2

I have the following df:
gene = c("a", "b", "c", "d")
fc = c(-1, -2, 1, 2)
df = data.frame(gene, fc)
I am using the following code for plotting:
ggplot(df, aes(gene, fc)) + geom_point(size=df$fc) + theme_minimal()
How can I ignore the sign of the values in "fc" while plotting?
Thanks
You can use the absolute value function abs() to ignore the negative sign. For example:
ggplot(df, aes(gene, fc)) +
  geom_point(aes(size = abs(fc))) +
  theme_minimal()
Just make sure to put properties that you want to map to data inside aes() at all times. Rarely should you ever see a $ in ggplot code.
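If the direction of the change still matters, one possible extension (an assumption about what you might want, not something the question asks for) is to map the sign to colour while the size shows the magnitude. The legend titles here are arbitrary.

```r
library(ggplot2)

df <- data.frame(gene = c("a", "b", "c", "d"),
                 fc = c(-1, -2, 1, 2))

p <- ggplot(df, aes(gene, fc)) +
  # size encodes the magnitude, colour encodes whether the value is negative
  geom_point(aes(size = abs(fc), colour = fc < 0)) +
  labs(size = "|fc|", colour = "negative") +
  theme_minimal()
```

Because both mappings are inside aes(), ggplot2 builds the size and colour legends itself.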

Add multi-stack axes label to plot

I have a dataset, named “data”:
library(plyr)  # provides ddply()
library(ggplot2)
df <- ddply(data, c("Treatment", "Concentration"), summarise, mean = mean(Inhibition), sd = sd(Inhibition), n = length(Inhibition), se = sd / sqrt(n))
# the summarised data frame has a mean column, not Inhibition
p <- ggplot(df, aes(x = Treatment, y = mean))
p1 <- p + geom_bar(stat = "identity", position = "dodge") +
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se), position = "dodge", width = 0.2)
and I got the following graph:
I want x-axis to be like the picture below:
How would I do this?
This is best achieved using a facet within ggplot. As you haven’t included a reusable dataset, I have made one here:
df <- data.frame(Group = c("A", "A", "A", "A", "B"),
                 SubGroup = letters[1:5],
                 value = 1:5)
See the facet_grid line below, which has a few additional options specified. You can read more about the added arguments in the facet_grid() documentation.
library(ggplot2)
ggplot(df, aes(x = SubGroup, y = value)) +
  geom_bar(stat = "identity", position = "dodge") +
  facet_grid(. ~ Group, scales = "free_x", space = "free", switch = "x") +
  theme(strip.placement = "outside")
For your data, you will need to split the drug and dose into two separate columns first, like my example.
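As a rough sketch of that split: the example below assumes the drug and dose are packed into a single label separated by an underscore, such as "DrugA_10". That format, the data frame raw and the column names Drug and Dose are all hypothetical, since the real data was not posted.

```r
library(ggplot2)

# Hypothetical input: drug and dose combined in one label, separated by "_"
raw <- data.frame(Treatment = c("DrugA_10", "DrugA_50", "DrugB_10"),
                  mean = c(12, 30, 8),
                  stringsAsFactors = FALSE)

# Split each label at the underscore and bind the pieces into a 2-column matrix
parts <- do.call(rbind, strsplit(raw$Treatment, "_", fixed = TRUE))
raw$Drug <- parts[, 1]
raw$Dose <- parts[, 2]

# Dose on the x-axis, drug as the outer (facet) label underneath
p <- ggplot(raw, aes(x = Dose, y = mean)) +
  geom_bar(stat = "identity") +
  facet_grid(. ~ Drug, scales = "free_x", space = "free", switch = "x") +
  theme(strip.placement = "outside")
```

If the two parts are already in separate columns, the splitting step can be skipped and those columns used directly in aes() and facet_grid().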
