I am trying to make an extremely single heatmap of percentages using ggplot2 which ideally will just be two single thin columns. I tried the following code, believing that the width option in aes would solve the problem.
p_prev_tg <- ggplot(tg_melt, aes(x = variable , y = OTU, fill = value,
width=.3)) + geom_tile() +
scale_fill_gradientn(colours = hm.palette2(10)) +
xlab(NULL) + ylab(NULL) +
theme(axis.text=element_text(size=7))
p_prev_tg
Unfortunately, this returns a plot with lots of empty space as shown. The plot I would like is those two bars side by side, how can I do this in ggplot?
thanks
What about this solution ?
set.seed(1234)
tg_melt <- data.frame(variable=rep(c("Prevalence_T","Prevalence_NT"), each=10),
OTU=rep(paste0("OTU_",1:10),2),
value=rnorm(20))
library(RColorBrewer)
library(ggplot2)
hm.palette2 <- colorRampPalette(rev(brewer.pal(11, 'Spectral')))
p_prev_tg <- ggplot(tg_melt, aes(x = as.numeric(variable), y = OTU, fill = value)) +
geom_tile() +
scale_fill_gradientn(colours = hm.palette2(10)) +
xlab(NULL) + ylab(NULL) +
theme(axis.text=element_text(size=7)) +
scale_x_continuous(breaks=c(1,2),
limits=c(0,3),
labels=levels(tg_melt$variable))+
theme_bw()
p_prev_tg
Related
I am trying to plot data using ggplot2 in R.
The datapoints occur for each 2^i-th x-value (4, 8, 16, 32,...). For that reason, I want to scale my x-Axis by log_2 so that my datapoints are spread out evenly. Currently most of the datapoints are clustered on the left side, making my plot hard to read (see first image).
I used the following command to get this image:
ggplot(summary, aes(x=xData, y=yData, colour=groups)) +
geom_errorbar(aes(ymin=yData-se, ymax=yData+se), width=2000, position=pd) +
geom_line(position=pd) +
geom_point(size=3, position=pd)
However trying to scale my x-axis with log2_trans yields the second image, which is not what I expected and does not follow my data.
Code used:
ggplot(summary, aes(x=settings.numPoints, y=benchmark.costs.average, colour=solver.name)) +
geom_errorbar(aes(ymin=benchmark.costs.average-se, ymax=benchmark.costs.average+se), width=2000, position=pd) +
geom_line(position=pd) +
geom_point(size=3, position=pd) +
scale_x_continuous(trans = log2_trans(),
breaks = trans_breaks("log2", function(x) 2^x),
labels = trans_format("log2", math_format(2^.x)))
Using scale_x_continuous(trans = log2_trans()) only doesn't help either.
EDIT:
Attached the data for reproducing the results:
https://pastebin.com/N1W0z11x
EDIT 2:
I have used the function pd <- position_dodge(1000) to avoid overlapping of my error bars, which caused the problem.
Removing the position=pd statements solved the issue
Here is a way you could format your x-axis:
# Generate dummy data
x <- 2^seq(1, 10)
df <- data.frame(
x = c(x, x, x),
y = c(0.5*x, x, 1.5*x),
z = rep(letters[seq_len(3)], each = length(x))
)
The plot of this would look like this:
ggplot(df, aes(x, y, colour = z)) +
geom_point() +
geom_line()
Adjusting the x-axis would work like so:
ggplot(df, aes(x, y, colour = z)) +
geom_point() +
geom_line() +
scale_x_continuous(
trans = "log2",
labels = scales::math_format(2^.x, format = log2)
)
The labels argument is just so you have labels in the format 2^x, you could change that to whatever you like.
I have used the function pd <- position_dodge(1000) to avoid overlapping of my error bars, which caused the problem.
Adjusting the amount of position dodge and the with of the error bars according to the new scaling solved the problem.
pd <- position_dodge(0.2) # move them .2 to the left and right
ggplot(summary, aes(x=settings.numPoints, y=benchmark.costs.average, colour=algorithm)) +
geom_errorbar(aes(ymin=benchmark.costs.average-se, ymax=benchmark.costs.average+se), width=0.4, position=pd) +
geom_line(position=pd) +
geom_point(size=3, position=pd) +
scale_x_continuous(
trans = "log2",
labels = scales::math_format(2^.x, format = log2)
)
Adding scale_y_continuous(trans="log2") yields the results I was looking for:
I've been trying to plot two histograms by using the fill aesthetic and a specific column with two levels. However, instead of displaying both desired histograms, my code displays one histogram with the whole data and another only for the second classification. I don't know if there is a problem in my syntax neither if this is some kind of tricky issue.
library(tidyverse)
db1 <- data.frame(type=rep("A",100),val=rnorm(n=100,mean=50,sd=10))
db2 <- data.frame(type=rep("B",150),val=rnorm(n=150,mean=50,sd=10))
dbf <- bind_rows(db1,db2)
P1 <- ggplot(db1, aes(x=val)) + geom_histogram()
P2 <- ggplot(db2, aes(x=val)) + geom_histogram()
PF <- ggplot(dbf, aes(x=val)) + geom_histogram()
I want to get this, P1 and P2
ggplot(db1, aes(x=val)) + geom_histogram(fill="red", alpha=0.5) + geom_histogram(data=db2, aes(x=val),fill="green", alpha=0.5)
What I want
But the code I think should work, P1 and P2 with the fill aesthetic for column val
ggplot(dbf, aes(x=val)) + geom_histogram(aes(fill=type), alpha=0.5)
My code
Produces the combination of PF and P2
ggplot(dbf, aes(x=val)) + geom_histogram(fill="red", alpha=0.5) + geom_histogram(data=db2, aes(x=val),fill="green", alpha=0.5)
What I get
Any help or idea will be highly appreciated!
All you need is to pass position = "identity" to your geom_histogram function.
library(tidyverse)
library(ggplot2)
db1 <- data.frame(type=rep("A",100),val=rnorm(n=100,mean=50,sd=10))
db2 <- data.frame(type=rep("B",150),val=rnorm(n=150,mean=50,sd=10))
dbf <- bind_rows(db1,db2)
ggplot(dbf, aes(x=val, fill = type)) + geom_histogram(alpha=0.5, position = "identity")
Is your goal to show the overlap via the color combination? I'm not sure how to force geom_histogram to show the overlap, but geom_density does do what you want. You can play with the bandwidth (bw) to show more or less detail.
dbf %>% ggplot() +
aes(x = val, fill = type) +
geom_density(alpha = .5, bw = .5) +
scale_fill_manual(values = c("red","green"))
I have the following code in R which is modified from here, which plots a crosstab table:
#load ggplot2
library(ggplot2)
# Set up the vectors
xaxis <- c("A", "B")
yaxis <- c("A","B")
# Create the data frame
df <- expand.grid(xaxis, yaxis)
df$value <- c(120,5,30,200)
#Plot the Data
g <- <- ggplot(df, aes(Var1, Var2)) + geom_point(aes(size = value), colour = "lightblue") + theme_bw() + xlab("") + ylab("")
g + scale_size_continuous(range=c(10,30)) + geom_text(aes(label = value))
It produces the right figure, which is great, but I was hoping to custom colour the four dots, ideally so that the top left and bottom right are both one colour and the top right and bottom left are another.
I have tried to use:
+ scale_color_manual(values=c("blue","red","blue","red"))
but that doesn't seem to work. Any ideas?
I would suggest that you colour by a vector in your data frame, as you don't have a column that gives you this, you can either create one, or make a rule based on existing columns (which I have done below):
g <- ggplot(df, aes(Var1, Var2)) + geom_point(aes(size = value, colour = (Var2!=Var1))) + theme_bw() + xlab("") + ylab("")
g + scale_size_continuous(range=c(10,30)) + geom_text(aes(label = value))
The important part is: colour = (Var2!=Var1), note that i put this inside the aesthetic (aes) for the geom_point
Edit: if you wish to remove the legend (you annotate the chart with totals, so I guess you don't really need it), you can add: g + theme(legend.position="none") to remove it
I am trying to align or combine a map and a boxplot, both of which have latitude as their y-axis. I've found a number of ways to align plots, but it seems like having a map involved complicates things. I'd like to have the two plots align seamlessly, with the map on the left and boxplot on the right.
Here's example code for the two plots:
library(ggplot2)
library(maps)
library(gridExtra)
##Plot base map
s_map <- map_data('state',region=c('south carolina','georgia','florida'))
p <- ggplot() + coord_fixed()
base_world <- p+geom_polygon(data=s_map, aes(x=long, y=lat, group=group), color="white", fill="grey72")
yscale <- c(24:34)
map1 <- base_world +
coord_map(xlim = c(-83, -79.5),ylim = c(25, 34)) +
xlab("") +
ylab("Latitude") +
scale_y_discrete(labels=yscale)
##plot boxplot (seasonal movements of an animal)
df <- data.frame("month"=month.abb,
"min"=c(26,26,26,28,28,29,29,29,28,28,26,26),
"lci"=c(27,27,27,29,29,30,30,30,29,29,27,27),
"med"=c(28,28,28,29,29,31,31,31,29,29,28,28),
"uci"=c(29,29,29,30,30,32,32,32,30,30,29,29),
"max"=c(30,30,30,31,31,33,33,33,31,31,30,30),
"order"=c(1:12))
boxplot1 <- ggplot(df, aes(x=factor(order), ymin = min, lower = lci, middle = med, upper = uci, ymax = max)) +
geom_boxplot(stat = "identity") +
ggtitle("Latitude by Month") +
xlab("Month") +
ylab("Latitude") +
ylim(24,33)
grid.arrange(map1,boxplot1,ncol=2)
How do I force the y-axes and plot heights to be the same, with gridlines aligned?
Thanks in advance for your thoughts on this - I'm still pretty new with R.
EDITED:
I've tried probably a dozen fragments of code I found, and nothing much seemed effective at all, so I didn't think listing them would be helpful. I'm mostly looking for a real place to start.
Here's one example I found for forcing plot heights to be the same, but it didn't seem to do anything:
# build the plots
map2 <- ggplot_gtable(ggplot_build(map1))
boxplot2 <- ggplot_gtable(ggplot_build(boxplot1))
# copy the plot height from p1 to p2
boxplot2$heights <- map2
grid.arrange(map2,boxplot2,ncol=2,widths=c(1,5))
This might work for you, I changed three things:
added ggtitle("") to the map
added theme(plot.margin()) to both plots to change the space around the plots (you can play with the values)
added scale_y_continuous(limits=c(24,33), breaks=seq(24,33,by=1), expand=c(0,0)) to the plot in order to create same axis scale as for the map
map1 <- base_world +
coord_map(xlim = c(-83, -79.5),ylim = c(25, 34)) +
xlab("") +
ylab("Latitude") +
scale_y_discrete(labels=yscale) +
ggtitle("") +
theme(plot.margin=unit(c(0.5,-1.5,0.5,-8), "cm"))
boxplot1 <- ggplot(df, aes(x=factor(order), ymin = min, lower = lci, middle = med, upper = uci, ymax = max)) +
geom_boxplot(stat = "identity") +
ggtitle("Latitude by Month") +
xlab("Month") +
ylab("Latitude") +
scale_y_continuous(limits=c(24,33), breaks=seq(24,33,by=1), expand=c(0,0)) +
labs(y=NULL) +
theme(plot.margin=unit(c(0.5,0.5,0.5,-6), "cm"))
grid.arrange(map1,boxplot1,ncol=2)
I have a empirical PDF + CDF combo I'd like to plot on the same panel. distro.df has columns pdf, cdf, and day. I'd like the pdf values to be plotted as bars, and the cdf as lines. This does the trick for making the plot:
p <- ggplot(distro.df, aes(x=day))
p <- p + geom_bar(aes(y=pdf/max(pdf)), stat="identity", width=0.95, fill=fillCol)
p <- p + geom_line(aes(y=cdf))
p <- p + xlab("Day") + ylab("")
p <- p + theme_bw() + theme_update(panel.background = element_blank(), panel.border=element_blank())
However, I'm having trouble getting a legend to appear. I'd like a line for the cdf and a filled block for the pdf. I've tried various contortions with guides, but can't seem to get anything to appear.
Suggestions?
EDIT:
Per #Henrik's request: to make a suitable distro.df object:
df <- data.frame(day=0:10)
df$pdf <- runif(length(df$day))
df$pdf <- df$pdf / sum(df$pdf)
df$cdf <- cumsum(df$pdf)
Then the above to make the plot, then invoke p to see the plot.
This generally involves moving fill into aes and using it in both the geom_bar and geom_line layers. In this case, you also need to add show_guide = TRUE to geom_line.
Once you have that, you just need to set the fill colors in scale_fill_manual so CDF doesn't have a fill color and use override.aes to do the same thing for the lines. I didn't know what your fill color was, so I just used red.
ggplot(df, aes(x=day)) +
geom_bar(aes(y=pdf/max(pdf), fill = "PDF"), stat="identity", width=0.95) +
geom_line(aes(y=cdf, fill = "CDF"), show_guide = TRUE) +
xlab("Day") + ylab("") +
theme_bw() +
theme_update(panel.background = element_blank(),
panel.border=element_blank()) +
scale_fill_manual(values = c(NA, "red"),
breaks = c("PDF", "CDF"),
name = element_blank(),
guide = guide_legend(override.aes = list(linetype = c(0,1))))
I'd still like a solution to the above (and will checkout #aosmith's answer), but I am currently going with a slightly different approach to eliminate the need to solve the problem:
p <- ggplot(distro.df, aes(x=days, color=pdf, fill=pdf))
p <- p + geom_bar(aes(y=pdf/max(pdf)), stat="identity", width=0.95)
p <- p + geom_line(aes(y=cdf), color="black")
p <- p + xlab("Day") + ylab("CDF")
p <- p + theme_bw() + theme_update(panel.background = element_blank(), panel.border=element_blank())
p
This also has the advantage of displaying some of the previously missing information, namely the PDF values.