I'd like to save a histogram as .pdf, but when I do, not all x-axis labels are visible. Is there a way to automatically adjust the size of the plot so that all labels fit nicely and can be read? Thanks a lot in advance for your help!!
# Example data
dd <- iris
dd$Species <- as.character(dd$Species)
dd$Species[dd$Species=="setosa"] <- "setosa and more text that should also fit in the pdf"
dd$Species[dd$Species=="versicolor"] <- "versicolor and more text that should also fit in the pdf"
dd$Species[dd$Species=="virginica"] <- "virginica and more text that should also fit in the pdf"
dd$Species <- as.factor(dd$Species)
# Plotting & saving as .pdf
windows()
plot(dd$Species)
dev.copy(pdf, file="%/test.pdf") # % is the directory in my computer
dev.off()
If the problem is that you need to give more information in each label by using more text, you can break the labels over several lines with "\n":
dd <- iris
dd$Species <- as.character(dd$Species)
dd$Species[dd$Species=="setosa"] <- "setosa is the name \n of iris-more text"
dd$Species[dd$Species=="versicolor"] <- "versicolor is the name \n of iris-more text "
dd$Species[dd$Species=="virginica"] <- "virginica is the name \n of iris-more text"
dd$Species <- as.factor(dd$Species)
plot(dd$Species)
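Instead of typing the line breaks by hand, the labels could also be wrapped automatically. The following is only a sketch: the wrap width of 20 characters, the enlarged bottom margin and the temporary output file are my own assumptions, not something from the question.

```r
# Sketch: wrap long factor levels automatically with base R's strwrap().
# Assumptions: 20-character wrap width, temporary output file.
dd <- iris
long_labels <- paste(levels(dd$Species),
                     "and more text that should also fit in the pdf")

# strwrap() splits each label into short pieces; paste() rejoins them with "\n"
wrapped <- vapply(
  long_labels,
  function(s) paste(strwrap(s, width = 20), collapse = "\n"),
  character(1),
  USE.NAMES = FALSE
)

# Level order is unchanged, so the wrapped labels can be assigned directly
levels(dd$Species) <- wrapped

out_file <- tempfile(fileext = ".pdf")  # hypothetical output location
pdf(out_file)
par(mar = c(7, 4, 2, 1))  # larger bottom margin for multi-line labels
plot(dd$Species)
invisible(dev.off())
```

A smaller width gives more, shorter lines per label, so the bottom margin may need to grow with it.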
When you print to a PDF, what determines whether the labels fit is the size of the PDF device rather than the plot itself. In other words, if you use
pdf("filename.pdf", width = W)
plot(dd$Species)
dev.off()
and the width parameter W is large enough, you should get a pdf in which the bars are wide enough so that all labels are visible.
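If you would rather not guess W, one rough approach is to measure the labels first. This is a heuristic sketch, not an exact rule: the 1.4 spacing factor, the 1.5 in margin allowance and the temporary file name are assumptions of mine.

```r
# Sketch: derive a PDF width from the widest axis label.
dd <- iris
levels(dd$Species) <- paste(levels(dd$Species),
                            "and more text that should also fit in the pdf")

# pdf(NULL) opens a device that writes no file but can still be queried
# for text sizes; plot.new() is called so strwidth() has a plot to refer to
pdf(NULL)
plot.new()
label_in <- max(strwidth(levels(dd$Species), units = "inches"))
invisible(dev.off())

# Heuristic (assumption): one label width per bar with 40% slack,
# plus about 1.5 in for the default left and right margins
W <- nlevels(dd$Species) * label_in * 1.4 + 1.5

out_file <- tempfile(fileext = ".pdf")  # hypothetical output location
pdf(out_file, width = W, height = 5)
plot(dd$Species)
invisible(dev.off())
```

Base graphics silently drops axis labels that would overlap, so if a label is still missing, increasing the slack factor is the first thing to try.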
This, however, may not be aesthetically pleasing, in which case you may want to try using ggplot2. This will enable you to play around with the labels more easily. For example, you can rotate all labels by an angle so that they all fit nicely
library(ggplot2)
ggplot(dd, aes(Species)) +
theme(axis.text.x = element_text(angle = 90)) +
geom_bar()
ggsave("filename.pdf")
You can also adjust the size of the font of the labels, or use a legend (which may be a neater way of listing all your labels in order, if there are too many - you can also colour each label differently if you use fill = Species in aes). You can find out how to set these parameters by typing ?theme, and ggplot2 also has excellent documentation with lots of examples at http://docs.ggplot2.org.
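As an illustration of the legend idea, here is a minimal sketch that moves the long labels from the axis into a legend. The legend placement and the output file are assumptions chosen for the example.

```r
# Sketch: show long category names in a legend instead of on the x axis.
library(ggplot2)

dd <- iris
levels(dd$Species) <- paste(levels(dd$Species),
                            "and more text that should also fit in the pdf")

p_legend <- ggplot(dd, aes(Species, fill = Species)) +
  geom_bar() +
  theme(
    axis.text.x      = element_blank(),  # hide the crowded axis labels
    axis.ticks.x     = element_blank(),
    legend.position  = "bottom",         # assumption: legend below the panel
    legend.direction = "vertical"
  )

out_file <- tempfile(fileext = ".pdf")   # hypothetical output location
ggsave(out_file, plot = p_legend, width = 7, height = 5)
```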
I'm having trouble with the binwidth of my col graphics.
I'm trying to show the highest/lowest suicide rate by country on the same page using shiny.
But the country names are overlapping one another, as you can see below:
How can I adjust this?
There are quite a few possibilities for resizing and adjusting things so that your long axis labels "fit" on a column plot (or any other ggplot for that matter). I'll go through some options here.
First of all... a sample dataset, since we did not get a suitable reprex from your question.
df <- data.frame(
x=c('Text1', 'Text2', 'Long Text Here', 'Really Really Long Text Label Here', 'Text5', 'Text6', 'Text7'),
y=c(sample(1:20, 7, replace=TRUE)))
df$x <- factor(df$x, levels=df$x) # making sure ggplot doesn't alphabetically sort!
library(ggplot2)
p <- ggplot(df, aes(x,y)) +
geom_col(aes(fill=x), show.legend = FALSE)
p
There are your overlapping labels. Now for some options:
Option #1: Resize the plot
One very simple way of "solving" the problem is to realize that R handles graphics... kind of funny. The look of a particular plot depends on the resolution and aspect ratio of the graphics device. Not only that, but text does not scale the same way as the other plot elements. This means that you can fix the problem by forcing a different aspect ratio.
# this was used to create the above plot
ggsave('original.png', width=8, height=5)
# changing the aspect ratio produces the plot below on my graphics device
ggsave('resized.png', width=12, height=5)
Option #2: Change the Text Size
The other option is to make your text size for the axis labels smaller. The result is really similar to just resizing the plot.
p + theme(axis.text.x=element_text(size=6))
Option #3: Angle the Text
One really good option is to angle your text using the theme() element again. Note that when you do this you want to change the default alignment of your labels. Set hjust=1 so that the text is "right aligned". If you are setting your angle to 90°, you will also want to set your vjust=0.5 to make the text aligned with the tick mark vertically. Here I'll show you a 45° angled text option:
p + theme(axis.text.x=element_text(angle=45, hjust=1))
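For completeness, the 90° variant described above might look like the sketch below. The small data frame is a stand-in so the snippet runs on its own.

```r
# Sketch: vertical axis labels, right-aligned and centred on the tick marks.
library(ggplot2)

df <- data.frame(
  x = c("Text1", "Long Text Here", "Really Really Long Text Label Here"),
  y = c(5, 12, 9)  # stand-in values
)
df$x <- factor(df$x, levels = df$x)

p <- ggplot(df, aes(x, y)) +
  geom_col(aes(fill = x), show.legend = FALSE)

# hjust = 1 pushes the text end up against the axis;
# vjust = 0.5 centres each label on its tick
p90 <- p + theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
p90
```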
Option #4: Wrap Text Labels
One of my go-to favorite options here is to wrap the text label. There are a few ways to do this, but I prefer using wrap_format() from the scales package and a scale_* function. Note, the number given to wrap_format(X) indicates that wrapping should happen after X number of characters in the label.
library(scales)
p + scale_x_discrete(labels=wrap_format(22))
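To see what the wrapping does before putting it on a plot, you can call the function returned by wrap_format() directly on a character vector. A small sketch with two of the labels from the sample data:

```r
# Sketch: inspect the output of scales::wrap_format() on its own.
library(scales)

wrap22 <- wrap_format(22)  # returns a function that wraps its input
wrapped <- wrap22(c("Text1", "Really Really Long Text Label Here"))

# Short labels come back unchanged; long ones gain "\n" line breaks
cat(wrapped, sep = "\n---\n")
```

Newer versions of scales also offer label_wrap(), which does the same job under the package's current naming scheme.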
Option #5: Combine all Above
The best way to fix your problem is to use a combination of all techniques above to get the chart to look the way you believe looks most satisfying. This will depend on how many columns you have in your shiny plot and how you generate that plot (user input or always the same, etc). So it's up to you here.
p + scale_x_discrete(name=NULL, labels=wrap_format(22)) +
scale_y_continuous(expand=expansion(mult=c(0,0.15))) +
theme_classic() + #important to put this before overwriting individual theme elements!
theme(
axis.text.x=element_text(angle=40, hjust=1, size=15),
axis.text.y=element_text(size=15),
panel.grid.major.y=element_line(color='gray75', linetype=2))
You can flip your graph so that the bars run horizontally, which is easier for readers because the labels are then written horizontally along the vertical axis. Add this to your plot to flip it:
coord_flip()
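A complete, self-contained sketch of the flipped plot (the data frame is made up for illustration):

```r
# Sketch: horizontal bars leave room for long category labels.
library(ggplot2)

df <- data.frame(
  x = c("Text1", "Long Text Here", "Really Really Long Text Label Here"),
  y = c(5, 12, 9)  # stand-in values
)
df$x <- factor(df$x, levels = df$x)

p_flipped <- ggplot(df, aes(x, y)) +
  geom_col() +
  coord_flip()  # categories now run down the vertical axis
p_flipped
```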
I made a time series plot of several random walks and by now I understand how to extract a certain part of it and how to change the ticks from years to months. But even after long testing I don't get how to manipulate the x-axis in my graph properly.
Right now, it displays 50 year-steps and only every second white vertical grid line is labelled (why? In every tutorial I watch all lines are labelled instead). What I want to achieve is to change the scaling, so less space is used horizontally (i.e. reduce the space between all the ticks on the x-axis), so the first tick would be at 2000, the second (not the third as is currently the case) at 2050, and so on. I think this should be somehow achievable with breaks, but I can't figure it out. Finally the plot starts and ends too early on the left and on the right, but I believe I can handle that.
Here is the plot:
set.seed(21)
n <- 2500
x <- matrix(replicate(20,cumsum(sample(c(-1, 1), n, TRUE))),nrow = 2500,ncol=20)
aa <- x
rnames <- seq(as.Date("2010-01-01"), length=dim(aa)[1], by="1 month") - 1
rownames(aa) <- format(as.POSIXlt(rnames, format = "%Y-%m-%d"), format = "%d.%m.%Y")
colnames(aa) <- paste0("aa", seq_len(ncol(aa)))  # one name per random walk
library("ggplot2")
library("reshape2")
library("scales")
aa <- melt(aa, id.vars = rownames(aa))
names(aa) <- c("time","id","value")
aa$time <- as.Date(aa$time, "%d.%m.%Y")
ggplot(aa, aes(x=time,y=value,colour=id,group=id)) +
geom_line()
By default, ggplot adds a minor grid line (that is, a grid line without a tick mark or tick label) between each major grid line. To include only major grid lines add scale_x_date(minor_breaks=NULL). (If you're not seeing minor grid lines in the tutorial videos you've watched, my guess is that they are there, but difficult or impossible to see due to insufficient resolution and/or small size of the video image.)
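If you also want to control exactly where the labelled ticks fall (the question mentions steps of 50 years), one option is to pass explicit breaks to scale_x_date(). The sketch below uses a single random walk as stand-in data, and the break range from 2000 to 2250 is an assumption picked to cover it.

```r
# Sketch: explicit major breaks every 50 years, no minor grid lines.
library(ggplot2)

set.seed(21)
walk <- data.frame(
  time  = seq(as.Date("2010-01-01"), by = "1 month", length.out = 2500),
  value = cumsum(sample(c(-1, 1), 2500, replace = TRUE))
)

p_breaks <- ggplot(walk, aes(time, value)) +
  geom_line() +
  scale_x_date(
    # assumption: break range chosen to span the simulated dates
    breaks       = seq(as.Date("2000-01-01"), as.Date("2250-01-01"),
                       by = "50 years"),
    date_labels  = "%Y",
    minor_breaks = NULL
  )
p_breaks
```

Breaks that fall outside the data range are simply not drawn, so a generous sequence does no harm.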
To reduce the physical distance between tick marks, you would need to change the aspect ratio of the plot. For example, if you want the vertical extent of the plot to be, say 3", then you would need to shrink the horizontal extent until you get a small enough distance between tick marks. First, let's create a plot:
p1 <- ggplot(aa, aes(x=time,y=value,colour=id,group=id)) +
geom_line(show.legend=FALSE) +
scale_x_date(minor_breaks=NULL)
Here are two examples of rendering the plot:
UPDATE: To answer the comment: For the plots above, I used grid.arrange to create the plot layout and then saved it as a png from the RStudio plot window. I used the widths argument to make one plot thinner than the other.
library(gridExtra)
grid.arrange(p1, p1, widths=c(0.6,0.4), ncol=2)
However, you can adjust the size precisely in many different ways, depending on what format you desire. For example:
# PNG format
png("wide.png", 500,500)
p1
dev.off()
png("narrow.png", 300,500)
p1
dev.off()
# PDF format
pdf("wide.pdf", 5, 5)
p1
dev.off()
pdf("narrow.pdf", 3, 5)
p1
dev.off()
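One caveat with the device approach: typing p1 on its own line only draws the plot at the top level of an interactive session or script run line by line. Inside a loop or function you need an explicit print(). A sketch, using a stand-in plot and temporary files as assumptions:

```r
# Sketch: save the same plot at several widths from inside a loop.
library(ggplot2)

p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point()  # stand-in for the plot above

widths_in <- c(wide = 5, narrow = 3)  # device widths in inches
files <- character(0)

for (nm in names(widths_in)) {
  f <- file.path(tempdir(), paste0(nm, ".pdf"))  # hypothetical file names
  pdf(f, width = widths_in[[nm]], height = 5)
  print(p1)  # auto-printing does not happen inside loops or functions
  invisible(dev.off())
  files <- c(files, f)
}
```

ggsave(filename, plot = p1, width = 3, height = 5) is an alternative that opens and closes the device for you.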
The R package wordcloud has a very useful function called wordlayout. It takes the initial positions of words and their respective sizes and rearranges them so that they do not overlap. I would like to use the results of this function to do a geom_text plot in ggplot.
I came up with the following example but soon realized that there seems to be a big difference between cex (wordlayout) and size (geom_text), since words in the graphics package appear much larger.
Here is my sample code. Plot 1 is the original wordcloud plot, which has no overlaps:
library(wordcloud)
library(tm)
library(ggplot2)
samplesize=100
textdf <- data.frame(label=sample(stopwords("en"),samplesize,replace=TRUE),x=sample(c(1:1000),samplesize,replace=TRUE),y=sample(c(1:1000),samplesize,replace=TRUE),size=sample(c(1:5),samplesize,replace=TRUE))
#plot1
plot.new()
pdf(file="plot1.pdf")
textplot(textdf$x,textdf$y,textdf$label,textdf$size)
dev.off()
#plot2
ggplot(textdf,aes(x,y))+geom_text(aes(label = label, size = size))
ggsave("plot2.pdf")
#plot3
new_pos <- wordlayout(x=textdf$x,y=textdf$y,words=textdf$label,cex=textdf$size)
textdf$x <- new_pos[,1]
textdf$y <- new_pos[,2]
ggplot(textdf,aes(x,y))+geom_text(aes(label = label, size = size))
ggsave("plot3.pdf")
#plot4
textdf$x <- new_pos[,1]+0.5*new_pos[,3]#this is the way the wordcloud package rearranges the positions. I took this out of the textplot function
textdf$y <- new_pos[,2]+0.5*new_pos[,4]
ggplot(textdf,aes(x,y))+geom_text(aes(label = label, size = size))
ggsave("plot4.pdf")
Is there a way to overcome this cex/size difference and reuse wordlayout for ggplots?
cex stands for character expansion and is the factor by which text is magnified relative to the default, specified by cin - set on my installation to 0.15 in by 0.2 in: see ?par for more details.
@hadley explains that ggplot2 sizes are measured in mm. Therefore cex=1 would correspond to size=3.81 or size=5.08, depending on whether it is being scaled by the width or the height (0.15 in = 3.81 mm, 0.2 in = 5.08 mm). Of course, font selection may cause differences.
In addition, to use absolute sizes, you need to put the size specification outside aes(); otherwise ggplot2 treats it as a variable to map and chooses the scale itself, e.g.:
ggplot(textdf,aes(x,y))+geom_text(aes(label = label),size = textdf$size*3.81)
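Rather than hard-coding 3.81, the conversion factor can be read from the device. This is only a sketch of the arithmetic above: the helper cex_to_size() is a name I made up, and whether width or height is the better match depends on the font.

```r
# Sketch: convert base-graphics cex values to ggplot2 text sizes (mm).
# par("cin") gives the default character size (width, height) in inches.
pdf(NULL)  # throwaway device, nothing is written to disk
cin_in <- par("cin")
invisible(dev.off())

cin_mm <- cin_in * 25.4  # inches to millimetres

# Hypothetical helper: scale by character height (default) or width
cex_to_size <- function(cex, by = c("height", "width")) {
  by <- match.arg(by)
  cex * if (by == "height") cin_mm[2] else cin_mm[1]
}

cex_to_size(1:5, by = "width")
```

The result can then be passed as size = cex_to_size(textdf$size, by = "width") outside aes().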
Sadly I think you're going to find the short answer is no! I think the package handles the text vector mapping differently from ggplot2, so you can tinker with size and font face/family, etc. but will struggle to replicate exactly what the package is doing.
I tried a few things:
1) Try to plot the grobs from textdata using annotation_custom
require(plyr)
require(grid)
# FIRST TRY PLOT INDIVIDUAL TEXT GROBS
qplot(0:1000,0:1000,geom="blank") +
alply(textdf,1,function(x){
annotation_custom(textGrob(label=x$label,0,0,c("center","center"),gp=gpar(cex=x$size)),x$x,x$x,x$y,x$y)
})
2) Run the wordlayout() function, which should readjust the text positions, although it is difficult to tell which font size it assumes (this similarly doesn't work)
# THEN USE wordcloud() TO GET CO-ORDS
plot.new()
w <- wordlayout(textdf$x,textdf$y,words=textdf$label,cex=textdf$size,xlim=c(min(textdf$x),max(textdf$x)),ylim=c(min(textdf$y),max(textdf$y)))
plotdata<-cbind(data.frame(rownames(w)),w)
colnames(plotdata)=c("word","x","y","w","h")
# PLOT WORDCLOUD DATA
qplot(0:1000,0:1000,geom="blank") +
alply(plotdata,1,function(x){
annotation_custom(textGrob(label=x$word,0,0,c("center","center"),gp=gpar(cex=x$h*40)),x$x,x$x,x$y,x$y)
})
Here's a cheat if you just want to overplot other ggplot functions on top of it (although the co-ords don't seem to match up exactly between the data and the plot). It basically images the wordcloud, removes the margins, and under-plots it at the same scale:
# make a png file of just the panel
plot.new()
png(filename="bgplot.png")
par(mar=c(0.01,0.01,0.01,0.01))
textplot(textdf$x,textdf$y,textdf$label,textdf$size,xaxt="n",yaxt="n",xlab="",ylab="",asp=1)
dev.off()
# library to get PNG file
require(png)
# then plot it behind the panel
qplot(0:1000,0:1000,geom="blank") +
annotation_custom(rasterGrob(readPNG("bgplot.png"),0,0,1,1,just=c("left","bottom")),0,1000,0,1000) +
coord_fixed(1,c(0,1000),c(0,1000))
I wanted to ask for any general ideas about drawing this kind of plot in R, which compares, for example, the overlaps of different methods listed along the horizontal and vertical sides of the plot. Any sample code or pointers would be appreciated.
Many thanks
A ggplot2-example:
# data generation
df <- matrix(runif(25), nrow = 5)
# bring data to long format
require(reshape2)
dfm <- melt(df)
# plot
require(ggplot2)
ggplot(dfm, aes(x = Var1, y = Var2)) +
geom_tile(aes(fill = value)) +
geom_text(aes(label = round(value, 2)))
The corrplot package and corrplot function in that package will create plots similar to what you show above, that may do what you want or give you a starting point.
If you want more control, you could plot the colors using the image function and then use the text function to add the numbers. You can either make the margins large enough to place the text labels there (see the axis function for the common way to add text labels in the margin), or leave enough space inside the plot (maybe using rasterImage instead of image) and use text to do the labelling. Look at the xpd argument to par if you want to add the lines, and at the grconvertX and grconvertY functions to help with the coordinates of the line segments.
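A minimal base-graphics sketch of the image-plus-text idea follows. The method names, the random values and the output file are all made up for illustration.

```r
# Sketch: coloured grid with the values printed in each cell (base R only).
set.seed(1)
methods <- paste("Method", 1:5)  # hypothetical method names
m <- matrix(runif(25), nrow = 5, dimnames = list(methods, methods))

out_file <- tempfile(fileext = ".pdf")  # hypothetical output location
pdf(out_file, width = 6, height = 6)
par(mar = c(6, 6, 2, 1))  # wide margins for the method names

# image() draws m[i, j] at x = i, y = j
image(1:5, 1:5, m, axes = FALSE, xlab = "", ylab = "")
axis(1, at = 1:5, labels = rownames(m), las = 2)
axis(2, at = 1:5, labels = colnames(m), las = 2)

# row(m) and col(m) give matching x and y positions for every cell
text(row(m), col(m), labels = round(m, 2))
box()
invisible(dev.off())
```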
When using the pdf() function in R to save a plot to an external file, we can specify width and/or height to adjust the size of the plot. However, there are situations where we produce multiple plots (say, using par(mfrow=c(2,4))). In this situation, it's difficult to determine the best width and height for the PDF file so that all plots are displayed properly. Is there a way to let R automatically "fit the plots" in the PDF file? I searched the arguments of pdf() and tried some, but the results were not satisfactory. Thank you very much!
How about something using ggplot?
require(ggplot2)
# Bogus data
x <- rnorm(10000)
y <- as.factor(round(rnorm(10000, mean=10, sd=2), 0))
df <- data.frame(vals=x, factors=y)
myplot <- ggplot(data=df, aes(x=vals)) +
geom_density() +
facet_wrap(~ factors)
ggsave(filename="~/foo.pdf", plot=myplot, width=8, height=10, units="in")
EDIT: Should you need to print over multiple pages, see this question.
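If you want to stay with base graphics and par(mfrow=...), I am not aware of an automatic "fit" option in pdf(), but a simple workaround is to derive the page size from the layout. A sketch, assuming about 3 in per panel (an arbitrary choice) and a temporary output file:

```r
# Sketch: size the PDF page from the panel grid instead of guessing.
n_row <- 2
n_col <- 4
panel_in <- 3  # assumption: inches allotted to each panel

out_file <- tempfile(fileext = ".pdf")  # hypothetical output location
pdf(out_file, width = n_col * panel_in, height = n_row * panel_in)
par(mfrow = c(n_row, n_col))

set.seed(1)
for (i in seq_len(n_row * n_col)) {
  hist(rnorm(100), main = paste("Panel", i), xlab = "")
}
invisible(dev.off())
```

Changing the layout then only requires changing n_row and n_col, and every panel keeps the same size.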