ggplot2 data labels outside margins

I'm making many figures in ggplot2 using a for loop, but my data labels are extending beyond the plot margin. I've tried using expand, but it only works for some figures. When I try to use par(mar) I get this error message:
Error: Don't know how to add o to a plot.
I also tried just using ggsave to save as a really wide file, but 1) that looks odd and 2) that won't work for making so many different figures.
Does anyone know of any other workarounds? Ideally a way to have the inner plot margins automatically set per figure based on the length of the bars + data labels. Below is the code I'm using and an example figure (you can see the bar for 'x' is outside the margin). Thank you in advance!
library(ggplot2)
library(scales)  # provides comma()

for (i in each) {
  # Build one plot per value of Each
  temp_plot = ggplot(data = subset(Data, Each == i)) +
    geom_bar(stat = "identity",
             aes(x = reorder(Letter, +Number), y = Number, fill = factor(Category))) +
    xlab("Letters") +
    ggtitle("Title", subtitle = "Subtitle") +
    coord_flip() +
    theme_classic() +
    theme(plot.title = element_text(hjust = 0.5, size = 16),
          plot.subtitle = element_text(hjust = 0.5)) +
    scale_fill_manual(values = c("#00358E", "#00AFD7"),
                      name = "Category",
                      labels = c("This", "That")) +
    geom_text(family = "Verdana", size = 3,
              aes(label = Number2, x = reorder(Letter, +Number), y = Number),
              position = position_dodge(width = 0.8), hjust = -0.001) +
    scale_y_continuous(labels = comma, expand = c(0.01, 0)) +
    scale_x_discrete(labels = letters)
  ggsave(filename = paste0("Example", i, ".jpeg"), plot = temp_plot)
}

I figured out a simple solution: + ylim(0, 130000)

scale_y_continuous(expand = expansion(mult = c(0, .1)) )
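Because expansion(mult = ...) is relative to the data range, it adapts to each figure in a loop, whereas a fixed ylim() has to be chosen by hand. A minimal sketch of the idea, assuming a hypothetical toy data frame in place of one subset of Data (the 10% padding is a guess and may need raising for long labels, since label width does not scale with the data):

```r
library(ggplot2)

# Hypothetical toy data standing in for one subset of `Data`
toy <- data.frame(
  Letter = c("x", "y", "z"),
  Number = c(120000, 45000, 8000)
)

p <- ggplot(toy, aes(x = reorder(Letter, Number), y = Number)) +
  geom_col(fill = "#00358E") +
  # hjust below 0 starts each label just past the end of its bar
  geom_text(aes(label = Number), hjust = -0.1, size = 3) +
  # no padding at the base of the bars, 10% of the range past the longest bar
  scale_y_continuous(expand = expansion(mult = c(0, 0.1))) +
  coord_flip() +
  theme_classic()

p
```

The first element of mult controls the padding at the low end of the axis and the second the padding at the high end, so only the side where the labels sit is widened.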

Related

How to change the Y range for ggplot (geom_col) in R?

I am trying to create 2 ggplot bar graphs for text analysis to compare frequencies as percentages from the dictionary "loughran". Here is my code for one of the graphs. How can I edit my y range so that both graphs start at 0% and end at 100%? This way, it would be much easier to see the differences.
ggplot(loughran_nc) +
aes(x = fct_reorder(sentiment, perc), y = perc)+
geom_col()+
ylab("Percentage") +
xlab("Sentiment")+
ggtitle("Sentiment Analysis: Non-Complaints Loughran dictionary")+
theme(plot.title = element_text(hjust = 0.5))
You can set limits within coord_cartesian().
Some quick data:
library(tidyverse)
loughran_nc <- data.frame(sentiment = c("words","for","some","data"),perc=c(40,60,20,80))
Then your plot + 1 line:
ggplot(loughran_nc) +
aes(x = fct_reorder(sentiment, perc), y = perc)+
geom_col()+
ylab("Percentage") +
xlab("Sentiment")+
ggtitle("Sentiment Analysis: Non-Complaints Loughran dictionary")+
theme(plot.title = element_text(hjust = 0.5)) +
coord_cartesian(ylim = c(0,100))
An alternative to coord_cartesian() is to use scale_y_continuous() or ylim().
scale_y_continuous() lets you specify all sorts of attributes of the y axis: limits, breaks, name, etc. (see ?scale_y_continuous). For your example, you can add scale_y_continuous(limits = c(0, 100)) to your code.
ylim() is simpler; adding ylim(c(0, 100)) would do the same job.
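For the data in this question all three options give the same picture, because every value lies between 0 and 100. They differ once a value falls outside the limits: coord_cartesian() only zooms, while limits set on the scale treat out-of-range values as missing. A small sketch with hypothetical data (the toy data frame is an assumption, not from the question):

```r
library(ggplot2)

# Hypothetical data: bar "c" exceeds the intended 0-100 range
toy <- data.frame(sentiment = c("a", "b", "c"), perc = c(40, 80, 120))

base <- ggplot(toy, aes(x = sentiment, y = perc)) + geom_col()

# Zoom only: bar "c" is kept and simply cut off at 100
zoomed <- base + coord_cartesian(ylim = c(0, 100))

# Scale limits: the value 120 becomes NA and bar "c" is not drawn
censored <- base + scale_y_continuous(limits = c(0, 100))

# Count the bars that still have a usable height after building each plot
sum(!is.na(layer_data(zoomed)$y))
sum(!is.na(layer_data(censored)$y))
```

For bar charts this usually makes coord_cartesian() the safer choice when some values might exceed the chosen range.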

Change axis breaks/limits of ggplot with geom_col

I am having problems changing the axis ticks in a barplot. I am fairly new to ggplot, so the answer might be very obvious.
Here is some data (yes it is strange, but designed to mimic the original dataset I have, which I am not allowed to share):
lab='this is just a very long example text and it will be longer and longer and longer and longer and longer and longer and longer and longer and longer and end'
number=1:20
n=unlist(lapply(number,paste,value=lab))
a=round(runif(n=20,min=-48000,max=-40000))
b=round(runif(n=20,min=-48000,max=-40000))
c=round(runif(n=20,min=-48000,max=-40000))
d=data.frame(cbind(n,a,b,c))
df=pivot_longer(d,cols=c('a','b','c'))
l1=round(as.numeric(min(df$value))/1000 )*1000+1000
l2=round(as.numeric(max(df$value))/1000 )*1000-1000
lim=seq(from=l1,to=l2,by=-1000)
colScale <- scale_fill_manual(name = "n",values = c(rainbow(nrow(df)/3)))
from which I create a barplot
p1=ggplot(df, aes(name, value, fill = as.factor(n))) +
geom_col(position = "dodge",colour='black') +
#scale_y_continuous(breaks = lim , labels = as.character(lim)) +
coord_flip() +
theme_bw() +
theme(axis.text.x=element_text(angle=90),axis.title.x=element_text(face='bold')) +
theme(axis.text.y=element_text(angle=90,size=15)) +
theme(legend.title=element_blank()) +
labs(x = "",y="test") +
colScale +
guides(fill=guide_legend(ncol=1)) +
ggtitle('something') +
theme(plot.title = element_text(hjust = 0.5,size=20))
which is this
that is basically working as I wanted, but the scaling of the x-axis is very unpleasant. What I want instead is an axis, where the breaks and labels are equal to the vector 'lim'. What I understood was that it should be possible to do this by scaling the respective axis as in the commented line. But when I'm trying this I get the error 'Discrete value supplied to continuous scale'. I tried to change the scale to 'scale_y_discrete' but then the ticks disappear completely. I tried everything I could find but nothing worked, so what is wrong?
Based on the answers I changed the plot definition to:
p1=ggplot(df, aes(name, as.numeric(value), fill = as.factor(n))) +
geom_col(position = "dodge",colour='black') +
scale_y_continuous(breaks = lim , labels = as.character(lim)) +
coord_flip() +
theme_bw() +
theme(axis.text.x=element_text(angle=90),axis.title.x=element_text(face='bold')) +
theme(axis.text.y=element_text(angle=90,size=15)) +
theme(legend.title=element_blank()) +
labs(x = "",y="test") +
colScale +
guides(fill=guide_legend(ncol=1)) +
ggtitle('something') +
theme(plot.title = element_text(hjust = 0.5,size=20))
which produced this plot
now I am able to change the axis ticks, but the plot looks nothing like the first one. My goal is to keep the look, meaning showing only the top part of the bars.
I'd suggest converting value with as.numeric (preferably before ggplot, but you can do it within, like below) and using coord_cartesian to specify the "view window". You also might find it simpler to specify your axes in the order you want them, rather than using coord_flip, which is mostly unnecessary since ggplot2 3.3.0.
ggplot(df, aes(as.numeric(value), name, fill = as.factor(n))) +
geom_col(position = "dodge",colour='black') +
scale_x_continuous(breaks = lim , labels = as.character(lim)) +
coord_cartesian(xlim = c(min(as.numeric(df$value)), max(as.numeric(df$value))))
# Theming after this up to you
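The "Discrete value supplied to continuous scale" error comes from data.frame(cbind(n, a, b, c)): cbind() builds a character matrix because n is character, so value ends up as text. A sketch of the "convert before ggplot" route, assuming a shortened version of the example data (the labels, seed and sizes are illustrative):

```r
library(ggplot2)
library(tidyr)

set.seed(1)  # assumption: fixed seed so the sketch is repeatable

# Build the data frame directly; no cbind(), so a and b stay numeric
d <- data.frame(
  n = paste("item", 1:5),
  a = round(runif(5, min = -48000, max = -40000)),
  b = round(runif(5, min = -48000, max = -40000))
)
df <- pivot_longer(d, cols = c("a", "b"))
stopifnot(is.numeric(df$value))

# Breaks every 1000 units across the observed range
lim <- seq(floor(min(df$value) / 1000) * 1000,
           ceiling(max(df$value) / 1000) * 1000, by = 1000)

p <- ggplot(df, aes(value, name, fill = n)) +
  geom_col(position = "dodge", colour = "black") +
  scale_x_continuous(breaks = lim) +
  # zoom only: bars still start at 0, but only their tips are visible
  coord_cartesian(xlim = range(df$value))

p
```

Using coord_cartesian() rather than scale limits matters here: the bars run from 0 to roughly -45000, so limits on the scale would remove them entirely instead of cropping the view.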

How do I use facetting correctly in ggplot geom_tile, while keeping the aspect ratio intact?

I am trying to create a 'likeliness plot' intended to quickly show an item's likeliness vs other items in a table.
A quick example:
'property_data.csv' file to use:
"","Country","Town","Property","Property_value"
"1","UK","London","Road_quality","Bad"
"2","UK","London","Air_quality","Very bad"
"3","UK","London","House_quality","Average"
"4","UK","London","Library_quality","Good"
"5","UK","London","Pool_quality","Average"
"6","UK","London","Park_quality","Bad"
"7","UK","London","River_quality","Very good"
"8","UK","London","Water_quality","Decent"
"9","UK","London","School_quality","Bad"
"10","UK","Liverpool","Road_quality","Bad"
"11","UK","Liverpool","Air_quality","Very bad"
"12","UK","Liverpool","House_quality","Average"
"13","UK","Liverpool","Library_quality","Good"
"14","UK","Liverpool","Pool_quality","Average"
"15","UK","Liverpool","Park_quality","Bad"
"16","UK","Liverpool","River_quality","Very good"
"17","UK","Liverpool","Water_quality","Decent"
"18","UK","Liverpool","School_quality","Bad"
"19","USA","New York","Road_quality","Bad"
"20","USA","New York","Air_quality","Very bad"
"21","USA","New York","House_quality","Average"
"22","USA","New York","Library_quality","Good"
"23","USA","New York","Pool_quality","Average"
"24","USA","New York","Park_quality","Bad"
"25","USA","New York","River_quality","Very good"
"26","USA","New York","Water_quality","Decent"
"27","USA","New York","School_quality","Bad"
Code:
prop <- read.csv('property_data.csv')
Property_col_vector <- c("NA" = "#e6194b",
"Very bad" = "#e6194B",
"Bad" = "#ffe119",
"Average" = "#bfef45",
"Decent" = "#3cb44b",
"Good" = "#42d4f4",
"Very good" = "#4363d8")
plot_likeliness <- function(town_property_table){
g <- ggplot(town_property_table, aes(Property, Town)) +
geom_tile(aes(fill = Property_value, width=.9, height=.9)) +
theme_classic() +
theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust=0.5),
strip.text.y = element_text(angle = 0)) +
scale_fill_manual(values = Property_col_vector) +
coord_fixed()
return(g)
}
summary_town_plot <- plot_likeliness(prop)
Output:
This is looking great!
Now I've created a plot that looks nice because I used the coord_fixed() function, but now I want to create the same plot, facetted by Country.
To do this I created the following function:
plot_likeliness_facetted <- function(town_property_table){
g <- ggplot(town_property_table, aes(Property, Town)) +
geom_tile(aes(fill = Property_value, width=.9, height=.9)) +
theme_classic() +
theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust=0.5),
strip.text.y = element_text(angle = 0)) +
scale_fill_manual(values = Property_col_vector) +
facet_grid(Country ~ .,
scale = 'free_y')
return(g)
}
facetted_town_plot <- plot_likeliness_facetted(prop)
facetted_town_plot
Result:
However, now my tiles are stretched, and if I try to use '+ coord_fixed()' I get the error:
Error: coord_fixed doesn't support free scales
How can I get the plot to facet, but maintain the aspect ratio ? Please note that I'm plotting these in a series, so hardcoding the heights of the plot with manual values is not a solution I'm after, I need something that dynamically scales with the amount of values in the table.
Many thanks for any help!
Edit: Although the same question was asked in slightly different context elsewhere, it had multiple answers with none marked as solving the question.
theme(aspect.ratio = 1) and space = 'free' seem to work.
plot_likeliness_facetted <- function(town_property_table){
g <- ggplot(town_property_table, aes(Property, Town)) +
geom_tile(aes(fill = Property_value, width=.9, height=.9)) +
theme_classic() +
theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust=0.5),
strip.text.y = element_text(angle = 0), aspect.ratio = 1) +
scale_fill_manual(values = Property_col_vector) +
facet_grid(Country ~ .,
scale = 'free_y', space = 'free')
return(g)
}
This might not be a perfect answer, but I'm going to give it a spin anyway. Basically, it is going to be difficult to do this with base ggplot because, as you mentioned, coord_fixed() and theme(aspect.ratio = ...) don't play nicely with facets.
The first solution I'll propose is to use gtables to programmatically set the width of the panels to match the number of variables on your x-axis:
plot_likeliness_gtable <- function(town_property_table){
g <- ggplot(town_property_table, aes(Property, Town)) +
geom_tile(aes(fill = Property_value, width=.9, height=.9)) +
theme_classic() +
theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust=0.5),
strip.text.y = element_text(angle = 0)) +
scale_fill_manual(values = Property_col_vector) +
facet_grid(Country ~ .,
scale = 'free_y', space = "free_y")
# Here be the gtable bits
gt <- ggplotGrob(g)
# Find out where the panel is stored in the x-direction
panel_x <- unique(gt$layout$l[grepl("panel", gt$layout$name)])[1]
# Set that width based on the number of x-axis variables, plus 0.2 because
# of the expand arguments in the scales
gt$widths[panel_x] <- unit(nlevels(droplevels(town_property_table$Property)) + 0.2, "null")
# Respect needs to be true to have 'null' units match in x- and y-direction
gt$respect <- TRUE
return(gt)
}
Which would work in the following way:
library(grid)
x <- plot_likeliness_gtable(prop)
grid.newpage(); grid.draw(x)
And gives this plot:
This all works reasonably well but at this point, it would probably be good to discuss some of the drawbacks of having gtables instead of ggplot objects. First, you can't edit it anymore with ggplot, so you can't add another + geom_myfavouriteshape() or anything of the sort. You could still edit parts of the plot in gtable/grid though. Second, it has the quirky grid.newpage(); grid.draw() syntax, which needs the grid library. Third, we're kind of relying on the ggplot facetting to set the y-direction panel heights correctly (2.2 and 1.2 null-units in your example) while this might not be appropriate in all cases. On the upside, you're still defining dimensions in flexible null-units, so it'll scale pretty well with whatever plotting device you're using.
The second solution I'll propose could be a bit hacky for many a taste, but it takes away the first two drawbacks of using gtables. Some time ago, I had similar issues with the weird panel size behaviour when facetting, so I wrote these functions to set panel sizes. The essence of what it does is to copy the panel drawing function from whatever plot you're making and wrap it inside a new function that sets the panel sizes to some pre-defined numbers. It has to be called after any facetting function though. It would work like this:
plot_likeliness_forcedsizes <- function(town_property_table){
g <- ggplot(town_property_table, aes(Property, Town)) +
geom_tile(aes(fill = Property_value, width=.9, height=.9)) +
theme_classic() +
theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust=0.5),
strip.text.y = element_text(angle = 0)) +
scale_fill_manual(values = Property_col_vector) +
facet_grid(Country ~ .,
scale = 'free_y', space = "free_y") +
force_panelsizes(cols = nlevels(droplevels(town_property_table$Property)) + 0.2,
respect = TRUE)
return(g)
}
myplot <- plot_likeliness_forcedsizes(prop)
myplot
It still relies on ggplot setting the y-direction heights correctly though, but you could override these within force_panelsizes() if things go awry.
Hope this helped, good luck!
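force_panelsizes() is not part of ggplot2 itself. A function of that name is exported by the ggh4x package; assuming that is the one meant above, the following sketch overrides both the row heights and the column width explicitly, so it no longer relies on ggplot2 getting the heights right. The toy data frame is a hypothetical cut-down version of property_data.csv.

```r
library(ggplot2)
library(ggh4x)  # assumption: the package that provides force_panelsizes()

# Hypothetical cut-down version of the property table
toy <- data.frame(
  Country  = c("UK", "UK", "USA", "UK", "UK", "USA"),
  Town     = c("London", "Liverpool", "New York",
               "London", "Liverpool", "New York"),
  Property = rep(c("Road_quality", "Air_quality"), each = 3),
  Property_value = rep(c("Bad", "Very bad"), each = 3)
)

# A discrete axis with k levels spans k + 0.2 units by default
# (0.6 units of padding on each side of the range 1..k)
n_cols <- length(unique(toy$Property)) + 0.2
n_rows <- tapply(toy$Town, toy$Country, function(x) length(unique(x))) + 0.2

p <- ggplot(toy, aes(Property, Town)) +
  geom_tile(aes(fill = Property_value), width = 0.9, height = 0.9) +
  facet_grid(Country ~ ., scales = "free_y", space = "free_y") +
  # plain numbers are relative sizes; respect = TRUE makes one unit
  # the same length horizontally and vertically, giving square tiles
  force_panelsizes(rows = as.numeric(n_rows), cols = n_cols, respect = TRUE)

p
```

Because the sizes are computed from the data, the same function body can be reused for tables with different numbers of towns and properties.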

How to put histogram with ggplot in the beginning of axes (0,0)?

I want to create a histogram, but I am having trouble getting it to start at the origin of the axes (0,0). Currently it is shifted to the right, which does not look good. I expected expand_limits(x = 0, y = 0) to solve this. I know this may have been answered already, but none of the solutions I found worked. I'd appreciate it if you could point out where the problem is. Here is my code:
ggplot(data=dataset, aes(x= dataset$count)) +
geom_histogram(binwidth = 3,
col="blue",
fill="darkblue") +
labs(title="Retweets Distribution") +
labs(x="Retweet number") +
theme(plot.title = element_text(hjust = 0.5)) +
scale_x_continuous(limits = c(0,250)) +
scale_y_continuous(limits = c(0,250)) + expand_limits(x = 0, y = 0)
And the plot:
Also the summary of count column:
Plots automatically have padding between the data and the edge of the plot area. So even if you set the axes to start at 0, there will still be a gap between the plotted data and the axis lines.
As you have not provided a dataset, here is a reproducible example of how to fix it. You can change the expand option within scale_x_continuous() to remove this padding:
ggplot(diamonds, aes(carat)) +
geom_histogram() +
scale_x_continuous(expand = c(0,0))
In your case, you will have to use scale_x_continuous(limits = c(0,250), expand=c(0,0))
If you then wish to shift the whole graph left, simply alter the limits.
E.g.
scale_x_continuous(limits = c(20,250), expand=c(0,0))
See the package documentation for more details: http://ggplot2.tidyverse.org/reference/scale_continuous.html
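expand_limits(x = 0, y = 0) only makes sure 0 is included in the axis range; it does not remove the padding, which is why it had no visible effect in the question. As a variation on the answer above, the sketch below uses expansion() to drop the padding only on the lower side of each axis, and avoids scale limits, which can drop histogram bins that extend past them. The binwidth and padding values are illustrative assumptions.

```r
library(ggplot2)

p <- ggplot(diamonds, aes(carat)) +
  # boundary = 0 aligns bin edges with 0, so no bin straddles the origin
  geom_histogram(binwidth = 0.1, boundary = 0) +
  # no padding at the low end of either axis, a little at the high end
  scale_x_continuous(expand = expansion(mult = c(0, 0.02))) +
  scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
  # make sure both axes actually include 0
  expand_limits(x = 0, y = 0)

p
```

With this combination the axes meet at (0,0) and the bars sit directly on the x-axis, while the tallest bar keeps some headroom.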

Automatically resizing legend for a plot made using ggplot2 such that the entire legend lies within the boundary of the layer

I have some data here
I read the data into a data frame and then plot this data with this following code,
# Reading data from a .csv file into a data frame
df <- read.table("newcsv_file.csv",header=T,sep="\t" )
# Now melting the data frame prior to plotting
df_mlt <- melt(df, id=names(df)[1], measure=names(df)[c(2, 6, 11,16,21,26,31,36,41,46,51,106,111,116,121,126,131,136,141,146,151)], variable = "cols")
# plotting the data
plt_fit <- ggplot(df_mlt, aes(x=x,y= value, color=cols)) +
geom_point(size=2) +
geom_smooth(method = "lm", se = FALSE) +
scale_y_log10(breaks = trans_breaks("log10", function(x) 10^x), labels = trans_format("log10", math_format(10^.x))) +
annotation_logticks(sides = "rl") +
theme_bw() +
theme(legend.text=element_text(size=12), legend.title=element_text(size=12))+
theme(axis.text=element_text(size=14)) +
theme(axis.title=element_text(size=14,face="bold")) +
labs(x = "x", y = "y") +
scale_color_discrete(name = "values", labels = c("0","-0.1","-0.2","-0.3","-0.4","-0.5","-0.6","-0.7","-0.8","-0.9","-1","+0.1","+0.2","+0.3","+0.4","+0.5","+0.6","+0.7","+0.8","+0.9","+1")) +
guides(colour = guide_legend(override.aes = list(size=3),nrow=2,title.position = 'top',title.hjust=0.5,legend.direction = "horizontal")) +
theme(legend.position = 'bottom', legend.margin=unit(1,"cm"),legend.background = element_rect(fill ='gray94')) +
theme(plot.margin=unit(c(0,2,0,0),"mm"))
The resulting plot looks like this, the problem here is that the right most edge of the legend is cropped.
I use +theme(legend.margin=unit(1,"cm")) but this does not seem sufficient. Could someone please let me know what I can change to display the full legend properly in the plot?
Thanks.
The code is fine. The problem is the size of your plot window. Try making it wider and you'll see the whole legend.
Also,
ggsave("plot_fit.pdf", plt_fit)
will create a pdf where the full legend is displayed.
Changing the width and height of the plot with the following code
ggsave(file="new_png_file.png",width=22,height=21,units=c("cm"), dpi=600)
yields a plot such as this:
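A self-contained sketch of the same idea, using the built-in mpg data rather than the CSV from the question: the legend is wrapped over several rows and the output size is passed to ggsave() explicitly, so the result does not depend on the size of the plot window. The file name, row count and dimensions are illustrative assumptions.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy, colour = manufacturer)) +
  geom_point(size = 2) +
  theme_bw() +
  theme(legend.position = "bottom") +
  # wrap the legend keys over three rows so the legend is narrower
  guides(colour = guide_legend(nrow = 3, title.position = "top"))

# Write to a temporary file with an explicit size in centimetres
out <- file.path(tempdir(), "legend_demo.png")
ggsave(out, plot = p, width = 22, height = 21, units = "cm", dpi = 150)
```

If the legend is still clipped, increasing width or nrow is usually enough; passing plot = p explicitly also avoids saving whatever plot happened to be displayed last.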
