I've created custom two-level x-axis entries that tend to work pretty well. The only problem is that when my y-axis variable, proportion, is close to one, these axis entries spill onto the chart area. When I use vjust to manually alter their vertical position, part of each entry is hidden by the chart boundary.
Any suggestions for how to make chart boundaries that dynamically adjust to accommodate large y-axis values and the full text of each entry (without running onto the chart)?
Have a look at the following example:
library(ggplot2)
GroupType <- rep(c("American","European"),2)
Treatment <- c(rep("Smurf",2),rep("OompaLoompa",2))
Proportion <- rep(1,length(GroupType))
PopulationTotal <- rep(2,length(GroupType))
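# data.frame() keeps Proportion numeric; cbind() would coerce every column to character
sampleData <- data.frame(GroupType,Treatment,Proportion,PopulationTotal)
hist_cut <- ggplot(sampleData, aes(x=GroupType, y=Proportion, fill=Treatment))
# stat="identity" belongs in geom_bar(), not aes(); breaks = NULL hides the default axis labels
chartCall<-expression(print(hist_cut + geom_bar(stat="identity", position="dodge") + scale_x_discrete(breaks = NULL) +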
geom_text(aes(label = paste(as.character(GroupType),"\n[N=",PopulationTotal,"]",sep=""),y=-0.02),size=4) + labs(x="",y="",fill="")
))
dev.new(width = 860, height = 450)
eval(chartCall)
Any thoughts about how I can fix the sloppy x-axis text?
Many thanks in advance,
Aaron
Unfortunately you have to manage the y axis yourself - there's currently no way for ggplot2 to figure out how much extra space you need because the physical space required depends on the size of the plot. Use, e.g., expand_limits(y = -0.1) to budget a little extra space for the text.
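A minimal sketch of that suggestion, rebuilt on the sample data from the question. The `labelData` helper frame and the label offset `y = -0.05` are assumptions made for illustration, and the right amount of extra space still depends on the device size and font size:

```r
library(ggplot2)

# Toy data mirroring the question: every proportion equals 1
sampleData <- data.frame(
  GroupType = rep(c("American", "European"), 2),
  Treatment = rep(c("Smurf", "OompaLoompa"), each = 2),
  Proportion = 1,
  PopulationTotal = 2
)

# One label per group (hypothetical helper frame), so each label is drawn once
labelData <- unique(sampleData[, c("GroupType", "PopulationTotal")])
labelData$label <- paste0(labelData$GroupType, "\n[N=", labelData$PopulationTotal, "]")

p <- ggplot(sampleData, aes(x = GroupType, y = Proportion)) +
  geom_col(aes(fill = Treatment), position = "dodge") +
  geom_text(data = labelData, aes(label = label, y = -0.05), size = 4) +
  scale_x_discrete(breaks = NULL) +   # hide the default axis labels
  expand_limits(y = -0.1) +           # reserve room below zero for the two-line labels
  labs(x = "", y = "", fill = "")
```

Printing `p` draws the chart; if the labels still touch the bars at a given plot size, a more negative value in `expand_limits()` (and in the label's `y`) gives them more room.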
I have a plot that is a simple barplot of number of each type of an event. I need the labels of the plot to be under the plot as some of the events have very long names and were squashing the plot sideways. I tried to move the labels underneath the plot but it now gets squashed upwards when there are lots of event types. Is there a way of having a static plot size (i.e. for the bar graph) so that long legends don't squash the plot?
My code:
ggplot(counts_df, aes(x = Var2, y = value, fill = Var1)) +
geom_bar(stat = "identity") +
theme(legend.position = "bottom") +
theme(legend.direction = "vertical") +
theme(axis.text.x = element_text(angle = -90))
The result:
I think this is because the image size must be static so the plot gets sacrificed for the axis. The same thing happens when I put a legend beneath the plot.
There are several ways to avoid overplotting of labels or squeezing of the plot area, or to improve readability in general. Which of the proposed solutions is most suitable depends on the lengths of the labels, the number of bars, and a number of other factors, so you will probably have to play around.
Dummy data
Unfortunately, the OP hasn't included a reproducible example, so we have to make up our own data:
V1 <- c("Long label", "Longer label", "An even longer label",
"A very, very long label", "An extremely long label",
"Long, longer, longest label of all possible labels",
"Another label", "Short", "Not so short label")
df <- data.frame(V1, V2 = nchar(V1))
yaxis_label <- "A rather long axis label of character counts"
"Standard" bar chart
Labels on the x-axis are printed upright, overplotting each other:
library(ggplot2) # version 2.2.0+
p <- ggplot(df, aes(V1, V2)) + geom_col() + xlab(NULL) +
ylab(yaxis_label)
p
Note that the recently added geom_col() instead of geom_bar(stat="identity") is being used.
OP's approach: rotate labels
Labels on the x-axis are rotated by 90 degrees, squeezing the plot area:
p + theme(axis.text.x = element_text(angle = 90))
Horizontal bar chart
All labels (including the y-axis label) are printed upright, improving readability but still squeezing the plot area (but to a lesser extent as the chart is in landscape format):
p + coord_flip()
Vertical bar chart with labels wrapped
Labels are printed upright and wrapped onto several lines, which avoids overplotting and squeezes the plot area less. You may have to play around with the width parameter to stringr::str_wrap.
q <- p + aes(stringr::str_wrap(V1, 15), V2) + xlab(NULL) +
ylab(yaxis_label)
q
Horizontal bar chart with labels wrapped
My favorite approach: all labels are printed upright, improving readability, and squeezing of the plot area is reduced. Again, you may have to play around with the width parameter to stringr::str_wrap to control the number of lines the labels are split into.
q + coord_flip()
Addendum: Abbreviate labels using scale_x_discrete()
For the sake of completeness, it should be mentioned that ggplot2 is able to abbreviate labels. In this case, I find the result disappointing.
p + scale_x_discrete(labels = abbreviate)
To clarify, what this question appears to be asking about is how to specify the panel size in ggplot2.
I believe that the correct answer to this question is 'you just can't do that'.
As of the present time, there does not seem to be any parameter that can be set in any ggplot2 function that would achieve this. If there was one, I think it would most likely be in the form of height and width arguments to an element_rect call within a call to theme (which is how we make other changes to the panel, e.g. altering its background colour), but there's nothing resembling those in the docs for element_rect so my best guess is that specifying the panel size is impossible:
https://ggplot2.tidyverse.org/reference/element.html
The following reference is old but I can't find anything more up to date that positively confirms whether or not this is the case:
https://groups.google.com/forum/#!topic/ggplot2/nbhph_arQ7E
In that discussion, someone asks whether it is possible to specify the panel size, and Hadley says 'Not yet, but it's on my to do list'. That was nine years ago; I guess it's still on his to do list!
One more solution in addition to those above - use staggered labels. These can be used with text wrapping to get a fairly readable result:
p + scale_x_discrete(guide = ggplot2::guide_axis(n.dodge = 2),
labels = function(x) stringr::str_wrap(x, width = 20))
(Using the plot p from @Uwe's answer)
I found the other methods didn't quite give me what I wanted, so I made this function to add a couple of dots after long names:
library(magrittr) # provides the %>% pipe used below
tidy_name <- function(name, n_char) {
ifelse(nchar(name) > (n_char - 2),
{substr(name, 1, n_char) %>% paste0(., "..")},
name)
}
vec <- c("short", "medium string", "very long string which will be shortened")
vec %>% tidy_name(20)
# [1] "short" "medium string" "very long string whi.."
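As an alternative sketch (not the answerer's method), stringr ships a ready-made truncation helper, `str_trunc()`, whose `width` argument counts the ellipsis as part of the result. A width of 22 with a two-dot ellipsis is assumed here so the output matches the example above:

```r
library(stringr)

vec <- c("short", "medium string", "very long string which will be shortened")

# width = 22 includes the two-character ellipsis, so 20 characters of text are kept
truncated <- str_trunc(vec, width = 22, side = "right", ellipsis = "..")
truncated
```

The same call can be passed as a labelling function to a discrete scale, for example `scale_x_discrete(labels = function(x) stringr::str_trunc(x, 22, ellipsis = ".."))`.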
I am new to R and am having some problems plotting with ggplot, so I need some help.
In the above diagram, if any of my bars has a high value (in this case, the green one with a value of 447), the bar label and the plot title overlap. The values here are normalised / scaled such that the y-axis values are always between 0 and 100, though the label might indicate a different number (the label is the actual count of occurrences, whereas the scaling is based on percentages).
I would like to know how to avoid the overlap with the plot title in all cases where the bar heights are very close to 100.
The ggplot function I am using is as below.
my_plot<-ggplot(data_frame,
aes(x=as.factor(X_VAR),y=GROUP_VALUE,fill=GROUP_VAR)) +
geom_bar(stat="identity",position="dodge") +
geom_text(aes(label = BAR_COUNT, y=GROUP_VALUE, ymax=GROUP_VALUE, vjust = -1), position=position_dodge(width=1), size = 4) +
theme(axis.text.y=element_blank(),axis.text.x=element_text(size=12),legend.position = "right",legend.title=element_blank()) + ylab("Y-axis label") +
scale_fill_discrete(breaks=c("GRP_PERCENTAGE", "NORMALIZED_COUNT"),
labels=c("Percentage", "Count of Jobs")) +
ggtitle("Distribution based on Text Analysis 2nd Level Sub-Category") +
theme(plot.title = element_text(lineheight=1, face="bold"))
Here is the ggsave command, in case it is causing the problem, with the dpi, height and width values.
ggsave(my_plot,file=paste(paste(variable_name,"my_plot",sep="_"),".png",sep = ""),dpi=72, height=6.75,width=9)
Can anyone please suggest what needs to be done to get this right?
Many Thanks
As Axeman suggests, ylim is useful. Have a look at the documentation here:
http://docs.ggplot2.org/0.9.3/xylim.html
In your code:
my_plot + ylim(0,110)
Also, I find this intro to axis quite useful:
http://www.cookbook-r.com/Graphs/Axes_(ggplot2)/
Good luck!
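A hedged variation on the `ylim(0, 110)` suggestion above: instead of hard-coding the upper limit, the scale can be asked for proportional headroom with `expansion()` (available in ggplot2 3.3.0 and later). The data frame `d` below is made up for illustration and only borrows the column names from the question:

```r
library(ggplot2)

# Hypothetical data: bar heights scaled to 0-100, labels carry the raw counts
d <- data.frame(
  X_VAR = factor(c("a", "b", "c")),
  GROUP_VALUE = c(40, 75, 100),
  BAR_COUNT = c(120, 260, 447)
)

p <- ggplot(d, aes(x = X_VAR, y = GROUP_VALUE)) +
  geom_col() +
  geom_text(aes(label = BAR_COUNT), vjust = -0.5, size = 4) +
  # no padding below zero, 10% of the data range above the tallest bar
  scale_y_continuous(expand = expansion(mult = c(0, 0.10))) +
  ggtitle("Bars with headroom for the count labels")
```

With bars topping out at 100 this yields the same 0 to 110 range as `ylim(0, 110)`, but it keeps working if the scaling of the bars changes.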
I would like to use customized linetypes in ggplot. If that is impossible (which I believe to be true), then I am looking for a smart hack to plot arrowlike symbols above, or below, my line.
Some background:
I want to plot some water quality data and compare it to the standard (set by the European Water Framework Directive) in a red line. Here's some reproducible data and my plot:
df <- data.frame(datum <- seq.Date(as.Date("2014-01-01"),
as.Date("2014-12-31"),by = "week"),y=rnorm(53,mean=100,sd=40))
(plot1 <-
ggplot(df, aes(x=datum,y=y)) +
geom_line() +
geom_point() +
theme_classic()+
geom_hline(aes(yintercept=70),colour="red"))
However, in this plot it is completely unclear whether the standard is a maximum value (as it would be for, e.g., chloride) or a minimum value (as it would be for oxygen). So I would like to make this clear by adding small pointers/arrows pointing up or down. The best way would be to customize the linetype so that it consists of these arrows, but I couldn't find a way.
Q1: Is this at all possible, defining custom linetypes?
All I could think of was adding extra points below the line:
extrapoints <- data.frame(datum2 <- seq.Date(as.Date("2014-01-01"),
as.Date("2014-12-31"),by = "week"),y2=68)
plot1 + geom_point(data=extrapoints, aes(x=datum2,y=y2),
shape=">",size=5,colour="red",rotate=90)
However, I can't seem to rotate these symbols so that they point downward. Furthermore, this requires calculating the right spacing along x and the distance to the line (y) every time, which is rather inconvenient.
Q2: Is there any way to achieve this, preferably as automated as possible?
I'm not sure what is requested, but it sounds as though you want arrows that point up or down depending on whether the y-value is greater or less than some expected value. If that's the case, then this does it using geom_segment:
require(grid) # as noted by ?geom_segment
(plot1 <-
ggplot(df, aes(x=datum,y=y)) + geom_line()+
geom_segment(data = data.frame(datum = df$datum, y = 70, up = df$y > 70), # name the date column so aes(x = datum) finds it in this layer
aes(xend = datum , yend =70 + c(-1,1)[1+up]*5), #select up/down based on 'up'
arrow = arrow(length = unit(0.1,"cm"))
) + # adjust units to modify size or arrow-heads
geom_point() +
theme_classic()+
geom_hline(aes(yintercept=70),colour="red"))
If I'm wrong about what was desired and you only wanted a bunch of down arrows, then just take out the stuff about creating and using "up" and use a minus-sign.
I'm having trouble creating a figure with ggplot2. I am using geom_dotplot with center stacking to display my data which are discrete values for 4 categories.
For aesthetic reasons I want to customize the positions of the dots so that:
1. the empty space between dots along the y axis is reduced (i.e. the dots are 1 value large)
2. the distributions fit and don't overlap
I've adjusted the binwidth and dotsize to achieve aesthetic goal 1, but that requires me to fiddle with the ylim() parameter to make sure that the groups fit in the plot. This results in a plot with more white space and few numbers on the y axis.
Question: Can anyone explain a way to resize the empty space on this plot?
My code is below:
plot <- ggplot(figdata, aes(y=Counts, x=category, col=strain)) +
geom_dotplot(aes(fill=strain), dotsize=1, binwidth=.7,
binaxis= "y",stackdir ="centerwhole", stackratio=.7) +
ylim(18,59)
plot + scale_color_manual(values=c("#E69F00", "#56B4E9")) +
geom_errorbar(stat="hline", yintercept="mean",
aes( ymax=..y..,ymin=..y.., group = category, width = 0.5),
color="black")
Which produces:
EDIT: Incorporating jitter will allow all the data to fit, but I don't want to add noise to this data and would prefer to show it as discrete data.
Adjusting the binwidth and dotsize to 0.3 as suggested below also fits all the data; however, it leaves too much white space.
I think that I might have to transform my data so that the values are in steps smaller than 1, in order to get everything to fit horizontally with dot sizes large enough to reduce white space.
I think the easiest way is using coord_cartesian:
plot + scale_color_manual(values=c("#E69F00", "#56B4E9")) +
geom_errorbar(stat="hline", yintercept="mean",
aes( ymax=..y..,ymin=..y.., group = category, width = 0.5),
color="black") +
coord_cartesian(ylim=c(17,40))
Which gives me this plot (with fake data that are not as neatly distributed as yours):
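For context on why coord_cartesian is used here rather than ylim: `ylim()` sets scale limits, so observations outside the range are dropped before the dots are binned, whereas `coord_cartesian()` only zooms the view and keeps every observation. A small sketch with made-up data (the frame `fake` is an assumption standing in for `figdata`):

```r
library(ggplot2)

set.seed(1)
# Made-up stand-in for figdata: 40 discrete counts between 18 and 59
fake <- data.frame(
  category = rep(c("A", "B"), each = 20),
  Counts = round(runif(40, min = 18, max = 59))
)

base <- ggplot(fake, aes(x = category, y = Counts)) +
  geom_dotplot(binaxis = "y", stackdir = "centerwhole", binwidth = 1)

# Zooms the view only: all 40 observations are still binned and stacked
zoomed <- base + coord_cartesian(ylim = c(17, 40))

# Changes the scale: observations outside 17-40 are removed (with a warning)
clipped <- base + ylim(17, 40)
```

Printing `zoomed` and `clipped` side by side shows the difference: dots near the upper edge stay intact in the first plot and are simply missing from the second.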
Context: R/ggplot2.
Is there an automated way (or even a manual way) to put the legend labels inside the plot, like the energies here (Co, 4, 6, 10, ...), instead of having them in a regular legend box next to the plot?
Source: Radiation Oncology Physics: A Handbook for Teachers and Students, EB. Podgorsak
So this seems close. I'd characterize this as "semi-automatic": there's definitely some tweaking needed, but most of the work is done for you...
The tricky bit is not placing the text labels (geom_text(...)), but creating the breaks in the plotted curves. This is done with geom_rect(...), where the width of the rectangles is set to the maximum label width, as determined using strwidth(...).
# create sample data
df <- data.frame(x=rep(seq(0,20,.01),5),k=rep(1:5,each=2001))
df$y <- with(df,x*exp(-x/k))
library(ggplot2)
eps.x <- max(strwidth(df$k)) # maximum width of legend label
eps.y <- eps.x*diff(range(df$y))/diff(range(df$x))
ggplot(df,aes(x,y))+
geom_line(aes(group=factor(k)))+
geom_rect(data=df[df$x==5,],
aes(xmax=x+eps.x, xmin=x-eps.x, ymax=y+eps.y, ymin=y-eps.y),
fill="white", colour=NA)+
geom_text(data=df[df$x==5,],aes(x,y,label=k))+
theme_bw()
If you want to color the lines too:
ggplot(df,aes(x,y))+
geom_line(aes(color=factor(k)))+
geom_rect(data=df[df$x==5,],
aes(xmax=x+eps.x, xmin=x-eps.x, ymax=y+eps.y, ymin=y-eps.y),
fill="white", colour=NA)+
geom_text(data=df[df$x==5,],aes(x,y,label=k), colour="black")+
scale_color_discrete(guide="none")+
theme_bw()