I'm trying to plot bar graphs in ggplot2 and running into an issue.
Starting with these variables:
PalList <- c(9, 9009, 906609, 99000099)
PalList1 <- as_tibble(PalList)
Index <- c(1,2,3,4)
PalPlotList <- cbind(Index, PalList)
PPL <- as_tibble(PalPlotList)
and having loaded the tidyverse with library(tidyverse), I tried plotting like this:
PPL %>%
ggplot(aes(x=PalList)) +
geom_bar()
It doesn't matter whether I'm accessing PPL or PalList, I'm still ending up with this (axes and labels may change, but not the chart area):
Even this still gave a blank plot, only now in classic styling:
ggplot(PalList1, aes(value)) +
geom_bar() +
theme_classic()
If I try barplot(PalList), I get an expected result. But I want the control of ggplot. Any suggestions on how to fix this?
One option is to specify both x and y in aes(), draw the bars with geom_bar(stat = 'identity'), and relabel the x-axis ticks:
library(ggplot2)
ggplot(PPL, aes(x = Index, y = PalList)) +
geom_bar(stat = 'identity') +
scale_x_continuous(breaks = Index, labels = PalList)
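As a side note, here is a small variation on the same idea (a sketch of mine, not part of the answer above). By default geom_bar() counts rows per x value, so PalList is never used as a bar height. geom_col() is shorthand for geom_bar(stat = "identity"). Wrapping x in factor() assumes you want evenly spaced bars labelled with the values themselves:

```r
library(ggplot2)

# Same values as in the question, built directly as a data frame
PPL <- data.frame(Index = 1:4, PalList = c(9, 9009, 906609, 99000099))

# geom_col() uses the y values as bar heights,
# exactly like geom_bar(stat = "identity")
p <- ggplot(PPL, aes(x = factor(PalList), y = PalList)) +
  geom_col() +
  labs(x = "PalList")

print(p)
```

With values this far apart the first three bars will still look tiny next to the last one; that is the data, not a ggplot problem.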
I'm working with a dataset of 5k finish times that looks a little bit like this:
"15:34"
"14:23"
"17:34"
and so on, there's a lot, but they're all formatted like that. I'm able to convert all of them to POSIXct, and store them in a data frame to make using ggplot2 easier, but for the life of me, I cannot get ggplot to change colors. The fill command doesn't work, the graph just remains grey.
I've tried just referencing the POSIXct object I made, but ggplot throws an error and tells me it doesn't work well with POSIXct. The only way I've been able to display a histogram is by storing it in a dataframe.
The code I'm currently using looks like:
#make the data frame
df <- data.frame(
finish_times = times_list)
#set the limits on the x axis as datetime objects
lim <- as.POSIXct(strptime(c('2018-3-18 14:15', '2018-3-18 20:00'), format = "%Y-%m-%d %M:%S"))
#making the plot
ggplot(data = df, aes(x = finish_times)) +
geom_histogram(fill = 'red') + #this just doesn't work
stat_bin(bins = 30) +
scale_x_datetime(labels = date_format("%M:%S"),
breaks = date_breaks("1 min"),
limits = lim) +
labs(title = "2017 5k finishers",
x='Finish Times',
y= 'Counts')
I've crawled through a lot of ggplot and R documentation, and I'm not sure what I'm missing. I appreciate any help, thanks.
stat_bin(bins = 30) adds a second layer of bars on top of the one drawn by geom_histogram(). That second layer uses the default grey fill, so it hides the red bars underneath. Generally, each geom has an associated default stat, and you can draw the layer using either one of the two, but when you add both you end up with two layers and problems like this. There are several solutions. Here's an example.
ggplot(diamonds, aes(x = carat)) + geom_histogram(fill = "red") + stat_bin(bins = 30)
Produces a plot with gray fill
ggplot(diamonds, aes(x = carat)) + geom_histogram(fill = "red", bins = 30)
Produces a plot with red fill
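To see what is going on, and as an alternative if you prefer to keep the stat_* form, here is a short sketch (the object names p_both and p_stat are mine). stat_bin() draws bars by default, so it can take the fill itself:

```r
library(ggplot2)

# Both calls together create two separate layers
p_both <- ggplot(diamonds, aes(x = carat)) +
  geom_histogram(fill = "red") +
  stat_bin(bins = 30)
length(p_both$layers)  # 2: the red bars plus a default-styled layer on top

# A single stat_bin() layer that carries the fill itself
p_stat <- ggplot(diamonds, aes(x = carat)) +
  stat_bin(bins = 30, fill = "red")

print(p_stat)
```

Either form works; the point is to end up with one layer, not two.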
I'm trying to do a pyramid-"like" plot in R and I think I am close. I know there are functions such as plotrix's pyramid.plot but what I want to do isn't a real pyramid plot. In a pyramid plot, there are text labels down the middle that line up with the bars on the left and on the right. Instead, what I'd like to do is have two columns of text with bars coming away from them.
I'm using ggplot (but I guess I don't have to) and the multiplot function. A minimal example would be something like this:
mtcars$`car name` <- rownames(mtcars)
obj_a <- ggplot (mtcars, aes (x=`car name`, y=mpg))
obj_a <- obj_a + geom_bar (position = position_dodge(), stat="identity")
obj_a <- obj_a + coord_flip ()
obj_a <- obj_a + xlab ("")
USArrests$`states` <- rownames(USArrests)
obj_b <- ggplot (USArrests, aes (x=`states`, y=UrbanPop))
obj_b <- obj_b + geom_bar (position = position_dodge(), stat="identity")
obj_b <- obj_b + coord_flip ()
obj_b <- obj_b + xlab ("")
multiplot (obj_a, obj_b, cols=2)
Which looks like this:
I guess what I'd like is just to flip the left half so that each row has (from left-to-right): left bar, car model, state name, right bar. (The graph I'm making will have the same number of rows in both halves so it won't look so cramped.) However, the point is, there are two columns of text, not one.
Of course, since both halves are independent of each other, my real problem is I don't know how to make the left half. (A bar plot with bars going in the opposite direction.) But I thought I'd also explain what I'm trying to do...
Thank you in advance!
You can make the mpg values in obj_a negative and position the car names axis on the opposite side:
ggplot (mtcars, aes (x=`car name`, y=-mpg)) + # y takes on negative values
geom_bar (position = position_dodge(), stat = "identity") +
coord_flip () +
scale_x_discrete(name = "", position = "top") + # x axis (before coord_flip) on opposite side
scale_y_continuous(name = "mpg",
breaks = seq(0, -30, by = -10), # y axis values (before coord_flip)
labels = seq(0, 30, by = 10)) # show non-negative values
It seems that pyramid.plot already does what you need. Using their example:
library(plotrix) # provides pyramid.plot() and color.gradient()
xy.pop<-c(3.2,3.5,3.6,3.6,3.5,3.5,3.9,3.7,3.9,3.5,3.2,2.8,2.2,1.8,
1.5,1.3,0.7,0.4)
xx.pop<-c(3.2,3.4,3.5,3.5,3.5,3.7,4,3.8,3.9,3.6,3.2,2.5,2,1.7,1.5,
1.3,1,0.8)
agelabels<-c("0-4","5-9","10-14","15-19","20-24","25-29","30-34",
"35-39","40-44","45-49","50-54","55-59","60-64","65-69","70-74",
"75-79","80-84","85+")
mcol<-color.gradient(c(0,0,0.5,1),c(0,0,0.5,1),c(1,1,0.5,1),18)
fcol<-color.gradient(c(1,1,0.5,1),c(0.5,0.5,0.5,1),c(0.5,0.5,0.5,1),18)
par(mar=pyramid.plot(xy.pop,xx.pop,labels=agelabels,
main="Australian population pyramid 2002",lxcol=mcol,rxcol=fcol,
gap=0.5,show.values=TRUE))
# three column matrices
avtemp<-c(seq(11,2,by=-1),rep(2:6,each=2),seq(11,2,by=-1))
malecook<-matrix(avtemp+sample(-2:2,30,TRUE),ncol=3)
femalecook<-matrix(avtemp+sample(-2:2,30,TRUE),ncol=3)
# *** Make agegrps a two column data frame with the labels ***
# group by age
agegrps<-data.frame(c("0","11","21","31","41","51",
"61-70","71-80","81-90","91+"),
c("10","20","30","40","50","60",
"70","80","90","91"))
oldmar<-pyramid.plot(malecook,femalecook,labels=agegrps,
unit="Bowls per month",lxcol=c("#ff0000","#eeee88","#0000ff"),
rxcol=c("#ff0000","#eeee88","#0000ff"),laxlab=c(0,10,20,30),
raxlab=c(0,10,20,30),top.labels=c("Males","Age","Females"),gap=4,
do.first="plot_bg(\"#eedd55\")")
# put a box around it
box()
# give it a title
mtext("Porridge temperature by age and sex of bear",3,2,cex=1.5)
# stick in a legend
legend(par("usr")[1],11,c("Too hot","Just right","Too cold"),
fill=c("#ff0000","#eeee88","#0000ff"))
# don't forget to restore the margins and background
par(mar=oldmar,bg="transparent")
Result:
Instead of negating the variable, you could just add scale_y_reverse(); the y scale becomes the horizontal axis after the flip.
This way you don't have to set axis labels manually.
You will still need to change the x axis label position, as suggested in the answer by user Z.Lin.
E.g.
library(ggplot2)
mtcars$`car name` <- rownames(mtcars)
ggplot (mtcars, aes (x=`car name`, y=mpg)) +
geom_bar (position = position_dodge(), stat="identity") +
scale_y_reverse () +
scale_x_discrete(name = "", position = "top") +
coord_flip ()
Created on 2020-04-09 by the reprex package (v0.3.0)
EDIT: An even simpler solution is not to use coord_flip() at all, but rather specify the desired mapping directly right away. I would now recommend this approach, having encountered issues with how coord_flip() behaves under certain more complex scenarios.
I give the code for this below; the plot should still look the same.
library(ggplot2)
mtcars$`car name` <- rownames(mtcars)
ggplot (mtcars, aes(x = mpg, y = `car name`)) +
geom_bar(position = position_dodge(), stat = "identity") +
scale_x_reverse() +
scale_y_discrete(name = "", position = "right")
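For completeness, here is a rough sketch of how the two halves might be put together. It reuses grid.arrange() from gridExtra, and taking the first 10 rows of each dataset is only my assumption to get the same number of rows on both sides; the names cars, st, left and right are mine:

```r
library(ggplot2)
library(gridExtra)

# Assumption: both halves have the same number of rows, so take 10 from each
cars <- head(mtcars, 10)
cars$`car name` <- rownames(cars)
st <- head(USArrests, 10)
st$states <- rownames(st)

# Left half: bars grow to the left, labels sit on the right-hand side
left <- ggplot(cars, aes(x = mpg, y = `car name`)) +
  geom_bar(stat = "identity") +
  scale_x_reverse() +
  scale_y_discrete(name = "", position = "right")

# Right half: ordinary horizontal bars, labels on the left
right <- ggplot(st, aes(x = UrbanPop, y = states)) +
  geom_bar(stat = "identity") +
  scale_y_discrete(name = "")

# Draw the halves side by side; grid.arrange() invisibly returns the layout
g <- grid.arrange(left, right, ncol = 2)
```

Note that each half sorts its own labels, so which car ends up next to which state depends on the ordering of the two label columns.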
I have a set of code that produces multiple plots using facet_wrap:
ggplot(summ,aes(x=depth,y=expr,colour=bank,group=bank)) +
geom_errorbar(aes(ymin=expr-se,ymax=expr+se),lwd=0.4,width=0.3,position=pd) +
geom_line(aes(group=bank,linetype=bank),position=pd) +
geom_point(aes(group=bank,pch=bank),position=pd,size=2.5) +
scale_colour_manual(values=c("coral","cyan3", "blue")) +
facet_wrap(~gene,scales="free_y") +
theme_bw()
With the reference datasets, this code produces figures like this:
I am trying to accomplish two goals here:
1. Keep the auto scaling of the y axis, but make sure only 1 decimal place is displayed across all the plots. I have tried creating a new column of the rounded expr values, but it causes the error bars to not line up properly.
2. I would like to wrap the titles. I have tried changing the font size as in "Change plot title sizes in a facet_wrap multiplot", but some of the gene names are too long and will end up being too small to read if I cram them on a single line. Is there a way to wrap the text, using code within the facet_wrap statement?
This probably cannot serve as a definitive answer, but here are some pointers regarding your questions:
Formatting the y-axis scale labels.
First, let's try the direct solution using the format function. Here we format all y-axis labels to show 1 decimal place, after rounding the values with round.
formatter <- function(...){
function(x) format(round(x, 1), ...)
}
mtcars2 <- mtcars
sp <- ggplot(mtcars2, aes(x = mpg, y = qsec)) + geom_point() + facet_wrap(~cyl, scales = "free_y")
sp <- sp + scale_y_continuous(labels = formatter(nsmall = 1))
The problem is that this approach isn't always practical. Take the leftmost plot in your figure, for example: with this formatting, every y-axis label would be rounded to -0.3, which is not what you want.
Another solution is to change the breaks of each plot to a set of rounded values. But, taking the leftmost plot of your figure as an example again, it would end up with just one labelled break, -0.3.
Yet another solution is to format the labels in scientific notation. For simplicity, you can modify the formatter function as follows:
formatter <- function(...){
function(x) format(x, ..., scientific = TRUE, digits = 2)
}
Now all the plots' y-axes have a uniform format. My suggestion, though, is to show the labels with 2 decimal places after rounding.
Wrap facet titles
This can be done using the labeller argument of facet_wrap.
# Modify cyl into factors
mtcars2$cyl <- c("Four Cylinder", "Six Cylinder", "Eight Cylinder")[match(mtcars2$cyl, c(4,6,8))]
# Redraw the graph
sp <- ggplot(mtcars2, aes(x = mpg, y = qsec)) + geom_point() +
facet_wrap(~cyl, scales = "free_y", labeller = labeller(cyl = label_wrap_gen(width = 10)))
sp <- sp + scale_y_continuous(labels = formatter(nsmall = 2))
Note that the wrapping function only breaks labels at spaces. So, in your case, you might need to modify the values of your facetting variable so that they contain spaces.
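If the labels have no spaces, one way around it is to create a display column first. The sketch below uses made-up gene names with underscores (the data frame d and the column gene_label are hypothetical, not from your data) and replaces the underscores with spaces so that label_wrap_gen() has somewhere to break:

```r
library(ggplot2)

# Hypothetical facet labels that use underscores instead of spaces
d <- data.frame(
  x = 1:4,
  y = c(2, 4, 3, 5),
  gene = rep(c("heat_shock_protein_70", "carbonic_anhydrase"), each = 2)
)

# Replace underscores with spaces so the labels can be wrapped
d$gene_label <- gsub("_", " ", d$gene)

p <- ggplot(d, aes(x = x, y = y)) +
  geom_point() +
  facet_wrap(~gene_label, scales = "free_y",
             labeller = label_wrap_gen(width = 12))

print(p)
```

The width of 12 is arbitrary; pick whatever fits your strip size.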
This only solves the first part of the question. You can create a function to format your axis labels and pass it to scale_y_continuous.
df <- data.frame(x=rnorm(11), y1=seq(2, 3, 0.1) + 10, y2=rnorm(11))
library(ggplot2)
library(reshape2)
df <- melt(df, 'x')
# Before
ggplot(df, aes(x=x, y=value)) + geom_point() +
facet_wrap(~ variable, scales="free")
# label function
f <- function(x){
format(round(x, 1), nsmall=1)
}
# After
ggplot(df, aes(x=x, y=value)) + geom_point() +
facet_wrap(~ variable, scales="free") +
scale_y_continuous(labels=f)
scale_*_continuous(..., labels = function(x) sprintf("%0.0f", x)) worked in my case.
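Note that "%0.0f" prints no decimals at all. For the one decimal place asked for in the question, the same idea would use "%.1f". A minimal sketch on mtcars (the helper name one_decimal is mine):

```r
library(ggplot2)

# Label function: always show exactly one decimal place
one_decimal <- function(x) sprintf("%.1f", x)

one_decimal(c(10, 10.52, 11))  # "10.0" "10.5" "11.0"

p <- ggplot(mtcars, aes(x = mpg, y = qsec)) +
  geom_point() +
  facet_wrap(~cyl, scales = "free_y") +
  scale_y_continuous(labels = one_decimal)

print(p)
```

This only changes how the break values are printed; the data and the breaks themselves are untouched, so error bars and points stay where they are.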
Is there any way to line up the points of a line plot with the bars of a bar graph using ggplot when they have the same x-axis? Here is the sample data I'm trying to do it with.
library(ggplot2)
library(gridExtra)
data=data.frame(x=rep(1:27, each=5), y = rep(1:5, times = 27))
yes <- ggplot(data, aes(x = x, y = y))
yes <- yes + geom_point() + geom_line()
other_data = data.frame(x = 1:27, y = 50:76 )
no <- ggplot(other_data, aes(x=x, y=y))
no <- no + geom_bar(stat = "identity")
grid.arrange(no, yes)
Here is the output:
The first point of the line plot is to the left of the first bar, and the last point of the line plot is to the right of the last bar.
Thank you for your time.
Extending @Stibu's post a little: to align the plots, use gtable (or see the answers to your earlier question).
library(ggplot2)
library(gtable)
data=data.frame(x=rep(1:27, each=5), y = rep(1:5, times = 27))
yes <- ggplot(data, aes(x = x, y = y))
yes <- yes + geom_point() + geom_line() +
scale_x_continuous(limits = c(0,28), expand = c(0,0))
other_data = data.frame(x = 1:27, y = 50:76 )
no <- ggplot(other_data, aes(x=x, y=y))
no <- no + geom_bar(stat = "identity") +
scale_x_continuous(limits = c(0,28), expand = c(0,0))
gYes = ggplotGrob(yes) # get the ggplot grobs
gNo = ggplotGrob(no)
plot(rbind(gNo, gYes, size = "first")) # Arrange and plot the grobs
Edit: to change the heights of the plots:
g = rbind(gNo, gYes, size = "first") # Combine the plots
panels <- g$layout$t[grepl("panel", g$layout$name)] # Get the positions for plot panels
g$heights[panels] <- unit(c(0.7, 0.3), "null") # Replace heights with your relative heights
plot(g)
I can think of (at least) two ways to align the x-axes in the two plots:
The two axes do not align because in the bar plot, the geoms cover the x-axis from about 0.5 to 27.5, while in the other plot, the data only range from 1 to 27. The reason is that the bars have a width and the points don't. You can force the axes to align by explicitly specifying an x-axis range. Using the definitions from your plot, this can be achieved by
yes <- yes + scale_x_continuous(limits=c(0,28))
no <- no + scale_x_continuous(limits=c(0,28))
grid.arrange(no, yes)
limits sets the range of the x-axis. Note, though, that the alignment is still not quite perfect: the y-axis labels take up a little more space in the upper plot, because the numbers have two digits. The plot looks as follows:
The other solution is a bit more complicated but it has the advantage that the x-axis is drawn only once and that ggplot makes sure that the alignment is perfect. It makes use of faceting and the trick described in this answer. First, the data must be combined into a single data frame by
all <- rbind(data.frame(other_data,type="other"),data.frame(data,type="data"))
and then the plot can be created as follows:
ggplot(all,aes(x=x,y=y)) + facet_grid(type~.,scales = "free_y") +
geom_bar(data=subset(all,type=="other"),stat="identity") +
geom_point(data=subset(all,type=="data")) +
geom_line(data=subset(all,type=="data"))
The trick is to let the facets be constructed by the variable type which was used before to label the two data sets. But then each geom only gets the subset of the data that should be drawn with that specific geom. In facet_grid, I also used scales = "free_y" because the two y-axes should be independent. This plot looks as follows:
You can change the labels of the facets by using other names when you define the data frame all. If you want to remove them altogether, then add the following to your plot:
+ theme(strip.background = element_blank(), strip.text = element_blank())
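If you want to check the claim from the start of this answer that the bars reach past the first and last points, you can inspect the computed layer data. This is just a quick sketch using the data from the question; the names bars, pts and bar_ld are mine:

```r
library(ggplot2)

data <- data.frame(x = rep(1:27, each = 5), y = rep(1:5, times = 27))
other_data <- data.frame(x = 1:27, y = 50:76)

bars <- ggplot(other_data, aes(x = x, y = y)) + geom_bar(stat = "identity")
pts  <- ggplot(data, aes(x = x, y = y)) + geom_point()

# Horizontal extent of the drawn bars: goes below 1 and above 27
bar_ld <- layer_data(bars)
range(c(bar_ld$xmin, bar_ld$xmax))

# Horizontal extent of the points: 1 to 27
range(layer_data(pts)$x)
```

That difference in extent is why the default x-axes of the two plots differ.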