ggplot, setting maximum number of x axis tickmarks - r

Hopefully a trivial one. The code below is deliberately simple (and quite crude) and is only meant to illustrate the issue. What is the best way to set a maximum number of x-axis tick marks here?
library(ggplot2)
library(data.table)
data <- data.table(year=c(2009,2009,2009,2009,2010,2010,2010,2010,
2011,2011,2011,2011,2012,2012,2012,2012,2013,2013,2013,2013,
2014,2014,2014,2014,2015,2015,2015,2015),year_quart = c("2009-Q1","2009-Q2",
"2009-Q3","2009-Q4","2010-Q1","2010-Q2","2010-Q3","2010-Q4","2011-Q1","2011-Q2",
"2011-Q3","2011-Q4","2012-Q1","2012-Q2","2012-Q3","2012-Q4",
"2013-Q1","2013-Q2","2013-Q3","2013-Q4","2014-Q1","2014-Q2","2014-Q3",
"2014-Q4","2015-Q1","2015-Q2","2015-Q3","2015-Q4"),region = c("EU","EU",
"EU","EU","EU","EU","EU","EU","EU","EU","EU","EU","EU","EU","EU","EU","EU",
"EU","EU","EU","EU","EU","EU","EU","EU","EU","EU","EU"),value = c(390,621,
442,113,586,571,391,432,758,897,696,160,189,567,621,922,402,185,609,812,549,
783,211,974,723,584,745,609))
plot1 <- ggplot(data, aes(factor(year_quart),value, xmin="2009-Q1", xmax="2009-Q4")) +
geom_line(aes(group=region),size=0.4) +
labs(x = "year", y = "value", title = "Title") +
scale_x_discrete(
breaks = unique(data$year_quart),
labels = unique(data$year_quart),
limits = c("2009-Q1","2009-Q2","2009-Q3","2009-Q4","2010-Q1","2010-Q2")
)
So with this code I get a plot which looks OKish. However if I swap
limits=c("2009-Q1","2009-Q2","2009-Q3","2009-Q4","2010-Q1","2010-Q2")
with
limits=c("2009-Q1","2009-Q2","2009-Q3","2009-Q4","2010-Q1","2010-Q2","2010-Q3", "2010-Q4","2011-Q1","2011-Q2","2011-Q3","2011-Q4","2012-Q1","2012-Q2","2012-Q3",
"2012-Q4","2013-Q1","2013-Q2","2013-Q3","2013-Q4","2014-Q1","2014-Q2","2014-Q3",
"2014-Q4","2015-Q1","2015-Q2","2015-Q3","2015-Q4"))
I generate far too many tickmarks to be viewed clearly. So what I would ideally like is, for a certain year/quarter range, specific code that generates a maximum number of (clearly viewable) tickmarks depending on this range.
many thanks in advance!

If our goal is to make the tick labels more readable, we can always rotate them using the axis.text.x argument in theme:
ggplot(data, aes(factor(year_quart),value, xmin="2009-Q1", xmax="2009-Q4")) +
geom_line(aes(group=region),size=0.4) +
labs(x = "year", y = "value", title = "Title") +
scale_x_discrete(
breaks = unique(data$year_quart),
labels = unique(data$year_quart),
limits=c("2009-Q1","2009-Q2","2009-Q3","2009-Q4","2010-Q1","2010-Q2","2010-Q3", "2010-Q4","2011-Q1","2011-Q2","2011-Q3","2011-Q4","2012-Q1","2012-Q2","2012-Q3",
"2012-Q4","2013-Q1","2013-Q2","2013-Q3","2013-Q4","2014-Q1","2014-Q2","2014-Q3",
"2014-Q4","2015-Q1","2015-Q2","2015-Q3","2015-Q4")
) +
theme(axis.text.x = element_text(angle = 45))
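If the aim is instead to cap the number of labels, one option is to pass a function to breaks in scale_x_discrete() that keeps only every n-th level. A minimal sketch, assuming a hypothetical helper thin_breaks() and a cap of 8 labels (neither comes from the question), with toy values in place of the original data:

```r
library(ggplot2)

# Hypothetical helper: keep at most `max_ticks` evenly spaced labels.
thin_breaks <- function(levels, max_ticks = 8) {
  step <- max(1, ceiling(length(levels) / max_ticks))
  # Keep the 1st, (1 + step)-th, (1 + 2 * step)-th, ... level.
  levels[(seq_along(levels) - 1) %% step == 0]
}

# Toy stand-in for the question's data: 28 quarters, one region.
quarters <- paste0(rep(2009:2015, each = 4), "-Q", 1:4)
df <- data.frame(year_quart = quarters, region = "EU",
                 value = seq_along(quarters))

p <- ggplot(df, aes(year_quart, value, group = region)) +
  geom_line() +
  # `breaks` may be a function of the scale limits.
  scale_x_discrete(breaks = function(lims) thin_breaks(lims, max_ticks = 8))
```

Because the function receives whatever limits are in effect, the same code thins the labels for a short range and a long range alike.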

Related

Adding space *just* on right side of x-axis, color based on relative position, specify labels

I have a time series graph of 49 countries, and I'd like to do three things: (1) prevent the country label name from being cut off, (2) specify so that the coloring is based on the position in the graph rather than alphabetically, and (3) specify which countries I would like to label (49 labels in one graph is too many).
library(ggplot2)
library(directlabels)
library(zoo)
library(RColorBrewer)
library(viridis)
colourCount = length(unique(df$newCol))
getPalette = colorRampPalette(brewer.pal(11, "Paired"))
## Yearly Incorporation Rates
ggplot(df,aes(x=year2, y=total_count_th, group = newCol, color = newCol)) +
geom_line() +
geom_dl(aes(label = newCol),
method= list(dl.trans(x = x + 0.1),
"last.points", cex = 0.8)) +
scale_color_manual(values = getPalette(colourCount)) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
legend.position = "none") +
labs(title = "Title",
x = "Year",
y = "Count")
This code works -- there are 49 lines, and each of them is labelled. But it just so happens that all the countries with the highest y-values have the same/similar colors (red/orange). So is there a way to specify the colors dynamically (maybe with scale_color_identity)? And how do I add space just on the right side of the labels? I found the expand = expand_scale, but it added space on both sides (though I did read that in the new version, it should be possible to do so.)
I am also fine defining a list of 49 manually-defined colors rather than using the color ramp.
One way to do it is to limit the x axis by adding something like
coord_cartesian(xlim = c(1,44), expand = TRUE)
In this case, I had 41 years of observations on the axis, so by specifying 44, I added space to the x-axis.
Thank you to @JonSpring for the help and getting me to the right answer!
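For the other two points, a possible approach is sketched below. It assumes ggplot2 3.3.0 or later, where expansion() supersedes expand_scale() and accepts separate lower and upper values, and it orders the factor levels by each series' final value so that colours follow position on the right-hand side. The data frame is invented; only the column names are borrowed from the question.

```r
library(ggplot2)

# Toy data standing in for the asker's `df` (values are invented).
df <- data.frame(
  year2 = rep(1:5, times = 3),
  total_count_th = c(1:5, (1:5) * 3, (1:5) * 2),
  newCol = rep(c("A", "B", "C"), each = 5)
)

# Order the factor by each series' value in the last year, highest first,
# so that the colour scale follows vertical position rather than the alphabet.
last_rows <- df[df$year2 == max(df$year2), ]
last_vals <- setNames(last_rows$total_count_th, last_rows$newCol)
df$newCol <- factor(df$newCol, levels = names(sort(last_vals, decreasing = TRUE)))

p <- ggplot(df, aes(year2, total_count_th, colour = newCol)) +
  geom_line() +
  # No padding on the left, 20% of the data range on the right.
  scale_x_continuous(expand = expansion(mult = c(0, 0.2))) +
  scale_colour_viridis_d() +
  theme(legend.position = "none")
```

Any discrete palette can replace scale_colour_viridis_d(); the point is that neighbouring lines get neighbouring colours once the levels are ordered by value.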

How to plot multiple boxplots with numeric x values properly in ggplot2?

I am trying to get a boxplot with 3 different tools in each dataset size like the one below:
ggplot(data1, aes(x = dataset, y = time, color = tool)) + geom_boxplot() +
labs(x = 'Datasets', y = 'Seconds', title = 'Time') +
scale_y_log10() + theme_bw()
But I need to put the x-axis on a log scale, which means converting each dataset label to a number. Even without the log transform, the numeric version (data2) no longer plots correctly:
ggplot(data2, aes(x = dataset, y = time, color = tool)) + geom_boxplot() +
labs(x = 'Datasets', y = 'Seconds', title = 'Time') +
scale_y_log10() + theme_bw()
I checked boxplot parameters and grouping parameters of aes, but could not resolve my problem. At first, I thought this problem is caused by scaling to log, but removing those elements did not resolve the problem.
What am I missing exactly? Thanks...
Files are in this link. "data2" is the numericized version of "data1".
Your question was a tough cookie, but I learned something new from it!
Just using group = dataset is not sufficient because you also have the tool variable to look out for. After digging around a bit, I found this post which made use of the interaction() function.
This is the trick that was missing. You want to use group because you are not using a factor for the x values, but you need to include tool in the separation of your data (hence using interaction() which will compute the possible crosses between the 2 variables).
# This is for pretty-printing the axis labels
my_labs <- function(x){
paste0(x/1000, "k")
}
levs <- unique(data2$dataset)
ggplot(data2, aes(x = dataset, y = time, color = tool,
group = interaction(dataset, tool))) +
geom_boxplot() + labs(x = 'Datasets', y = 'Seconds', title = 'Time') +
scale_x_log10(breaks = levs, labels = my_labs) + # define a log scale with your axis ticks
scale_y_log10() + theme_bw()
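To see what interaction() contributes here, it can help to look at the grouping factor on its own. A small base R sketch with invented values (the real data2 is not shown in the question):

```r
# Invented values: two dataset sizes crossed with two tools.
dataset <- c(1000, 1000, 5000, 5000)
tool <- c("A", "B", "A", "B")

# One level per (dataset, tool) combination, so each box gets its own group.
groups <- interaction(dataset, tool)
as.character(groups)
nlevels(groups)
```

With group = dataset alone there would be only two groups here, which is why the boxes for different tools were merged.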

Delete ticks between breaks in ggplot

I'd like to create a line graph with irregular break intervals on the x-axis. When I define the breaks as in the code example, I get additional unlabeled breaks, which always seem to be exactly in the middle of two defined breaks and thus are irregular too (see image in link).
test_frame <- data.frame("v1"=1:3,"v2"=3:1)
library(ggplot2)
ggplot(data = test_frame, aes(x=v1, y=v2, group=1))+geom_line()+
scale_x_continuous(breaks = c(2.74,2.43,1.19))
Graph with additional breaks:
Is there any way to get rid of these vertical lines, so that there are only lines at the defined break position? I'd be grateful for any suggestions.
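Set minor_breaks = NULL in the scale:
scale_x_continuous(breaks = c(2.74,2.43,1.19), minor_breaks = NULL)
Alternatively, keep the scale as it is and hide the minor grid lines through the theme: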
ggplot(data = test_frame, aes(x = v1, y = v2, group = 1)) +
geom_line() +
scale_x_continuous(breaks = c(2.74, 2.43, 1.19)) +
theme(panel.grid.minor = element_blank()) # or panel.grid.minor.x to keep horizontal lines

coord_flip() mixing up axis labels?

I am trying to build a horizontal bar chart.
library(ggplot2)
library(plyr)
salary <- read.csv('September 15 2015 Salary Information - Alphabetical.csv', na.strings = '')
head(salary)
salary$X <- NULL
salary$X.1 <- NULL
salary$Club <- as.factor(salary$Club)
levels(salary$Club)
salary$Base.Salary <- gsub(',', '', salary$Base.Salary)
salary$Base.Salary <- as.numeric(as.character(salary$Base.Salary))
salary$Base.Salary <- salary$Base.Salary / 1000000
salary <- ddply(salary, .(Club), transform, pos = cumsum(Base.Salary) - (0.5 * Base.Salary))
ggplot(salary, aes(x = Club, y = Base.Salary, fill = Base.Salary)) +
geom_bar(stat = 'identity') +
ylab('Base Salary in millions of dollars') +
theme(axis.title.y = element_blank()) +
coord_flip() +
geom_text(data = subset(salary, Base.Salary > 2), aes(label = Last.Name, y = pos))
(credits to this thread: Showing data values on stacked bar chart in ggplot2 for the text position calculation)
and the resulting plot is this:
I was thoroughly confused for a while, because I was using xlab to specify the label, and theme(axis.title.y = element_blank()) to hide the y label. However, this didn't work, and I got it to work by changing it to ylab. This seems rather confusing, is it intended?
"This seems rather confusing, is it intended?"
Yes.
Rather than using theme() to hide the y label, I think
labs(x = "My x label",
y = "")
is more straightforward.
When you flip x and y, they take their labels with them. If this weren't the case, a graph compared with and without coordinate flip would have incorrect axis labels in one of the two cases - which seems confusing and inconsistent. As-is, the labels will be correct always (with and without coord_flip).
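Theming, on the other hand, is applied after the fact: theme elements such as axis.title.y refer to the axes as drawn, so after coord_flip() axis.title.y controls the vertical axis, which now carries the x label.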
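A small sketch of this, using an invented three-row data frame in place of the salary file:

```r
library(ggplot2)

# Invented data; the column names are placeholders, not the asker's.
df <- data.frame(club = c("A", "B", "C"), salary = c(1.2, 3.4, 2.1))

p <- ggplot(df, aes(x = club, y = salary)) +
  geom_col() +
  coord_flip() +
  # Labels follow the aesthetics: y is still the salary, even though it is
  # now drawn horizontally. x = NULL drops the club axis title.
  labs(x = NULL, y = "Base salary (millions of dollars)")
```

Removing coord_flip() from this plot leaves both labels correct, which is the consistency the answer describes.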

How to manage x axis ticks when handling dynamic x axis through shiny (inside ggplot)

The following is my plot function.
It is the code from my actual app rather than a reproducible example, because I only want to understand the concept of handling things here.
print(ggplot(subset(gg1,!is.na(var)), aes_string(x = "Day", y = var, group = "Mi")) +
geom_point(aes(color = factor(Mi)), size = 5, alpha = 0.7) +
#scale_x_continuous(breaks=pretty_breaks(n=10)) + #geom_smooth(stat= "smooth" , alpha = I(0.4), method="loess",color="grey", formula = y ~ x)
scale_color_manual("Mesocosm", values = c('#FF0000', '#00FF00', '#0000FF', '#FFFF00', '#FF00FF', '#808080', '#800000' , '#008000', '#008080')) +
scale_y_continuous(breaks=pretty_breaks(n=10)) +
theme_bw() +
geom_line(data = (ggl), size = 0.5) +
theme (legend.position = "right", legend.title=element_text(size=14),
panel.border = element_rect(colour = "black"),strip.background = element_rect(fill="#CCCCFF"),
strip.text.x = element_text(size=14, face="bold"),axis.text.y = element_text(colour="grey20",size=13,face="bold"),
axis.text.x = element_text(colour="grey20",size=13,face="bold"),
axis.title.x = element_text(colour="grey20",size=20,face="bold"),
axis.title.y = element_text(colour="grey20",size=20,face="bold")) +
xlim(input$slider[1],input$slider[2]) +
scale_x_continuous(breaks=pretty_breaks(n=10)) )
I want to split the x axis into finer intervals to accommodate more ticks. I can do this using scale_x_continuous as shown in the above example. The result is fine and I get the ticks as I wanted.
What are ticks? A similar question can be found here: [Pretty Breaks][1]
But in the above implementation the dynamic x axis stops working.
Dynamic x axis: changing the slider bar end points should make the x axis adjust automatically.
Next:
if I reverse the order of last two lines like
scale_x_continuous(breaks=pretty_breaks(n=10)) + xlim(input$slider[1],input$slider[2]) )
Then scale_x_continuous doesn't take effect, and I get the message "Scale for 'x' is already present. Adding another scale for 'x', which will replace the existing scale." (so I lose the additional ticks I intended to have).
How can I implement both in this case? I want a dynamic x axis, and I also want to override the predefined ticks so that there are more of them.
The overview can be seen in this pic.
The pic shows that even though the slider bar values are changed, the x axis does not adjust, because of the order of scale_x_continuous and xlim described above.
How can I make both work?
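I think the limits argument of scale_x_continuous() is what you want. xlim() is shorthand for adding an x scale with limits, so combining it with scale_x_continuous() creates two x scales and the later one replaces the earlier one. Setting the limits inside a single scale avoids that.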
Replace:
xlim(input$slider[1],input$slider[2]) +
scale_x_continuous(breaks=pretty_breaks(n=10)) )
With:
scale_x_continuous(breaks=pretty_breaks(n=10), limits=c(input$slider[1],input$slider[2])) )
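A self-contained sketch of the same idea outside Shiny, assuming the scales package is loaded for pretty_breaks() and using a plain vector named slider as a stand-in for input$slider (the data frame is invented):

```r
library(ggplot2)
library(scales)

slider <- c(5, 20)  # stand-in for input$slider in a Shiny app
df <- data.frame(Day = 1:30, value = sqrt(1:30))

p <- ggplot(df, aes(Day, value)) +
  geom_point() +
  # One x scale carries both the tick density and the limits.
  scale_x_continuous(breaks = pretty_breaks(n = 10), limits = slider)

# The break function can also be inspected directly.
b <- pretty_breaks(n = 10)(slider)
```

Note that limits set on a scale drop observations outside the range. If the points should only be hidden rather than removed, coord_cartesian(xlim = slider) zooms without discarding data.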
