I want to explore the directlabels package with ggplot. I am trying to plot labels at the endpoint of a simple line chart; however, the labels are clipped by the plot panel. (I intend to plot about 10 financial time series in one plot and I thought directlabels would be the best solution.)
I would imagine there may be another solution using annotate or some other geoms. But I would like to solve the problem using directlabels. Please see code and image below. Thanks.
library(ggplot2)
library(directlabels)
library(tidyr)
#generate data frame with random data, for illustration and plot:
x <- seq(1:100)
y <- cumsum(rnorm(n = 100, mean = 6, sd = 15))
y2 <- cumsum(rnorm(n = 100, mean = 2, sd = 4))
data <- as.data.frame(cbind(x, y, y2))
names(data) <- c("month", "stocks", "bonds")
# reshape to long format: one row per month/asset pair
tidy_data <- gather(data, key = "asset", value = "value", -month)
p <- ggplot(tidy_data, aes(x = month, y = value, colour = asset)) +
geom_line() +
geom_dl(aes(colour = asset, label = asset), method = "last.points") +
theme_bw()
On data visualization principles, I would like to avoid extending the x-axis to make the labels fit--this would mean having data space with no data. Rather, I would like the labels to extend toward the white space beyond the chart box/panel (if that makes sense).
In my opinion, directlabels is the way to go. Indeed, I would position labels at the beginning and at the end of the lines, creating space for the labels with the expand argument of scale_x_continuous(). Also note that with the labels, there is no need for the legend.
This is similar to answers here and here.
library(ggplot2)
library(directlabels)
library(grid)
library(tidyr)
x <- seq(1:100)
y <- cumsum(rnorm(n = 100, mean = 6, sd = 15))
y2 <- cumsum(rnorm(n = 100, mean = 2, sd = 4))
data <- as.data.frame(cbind(x, y, y2))
names(data) <- c("month", "stocks", "bonds")
# reshape to long format: one row per month/asset pair
tidy_data <- gather(data, key = "asset", value = "value", -month)
ggplot(tidy_data, aes(x = month, y = value, colour = asset, group = asset)) +
geom_line() +
scale_colour_discrete(guide = 'none') +
scale_x_continuous(expand = c(0.15, 0)) +
geom_dl(aes(label = asset), method = list(dl.trans(x = x + .3), "last.bumpup")) +
geom_dl(aes(label = asset), method = list(dl.trans(x = x - .3), "first.bumpup")) +
theme_bw()
If you prefer to push the labels into the plot margin, direct labels will do that. But because the labels are positioned outside the plot panel, clipping needs to be turned off.
p1 <- ggplot(tidy_data, aes(x = month, y = value, colour = asset, group = asset)) +
geom_line() +
scale_colour_discrete(guide = 'none') +
scale_x_continuous(expand = c(0, 0)) +
geom_dl(aes(label = asset), method = list(dl.trans(x = x + .3), "last.bumpup")) +
theme_bw() +
theme(plot.margin = unit(c(1,4,1,1), "lines"))
# Code to turn off clipping
gt1 <- ggplotGrob(p1)
gt1$layout$clip[gt1$layout$name == "panel"] <- "off"
grid.draw(gt1)
This effect can also be achieved using geom_text (and probably also annotate), that is, without the need for direct labels.
p2 = ggplot(tidy_data, aes(x = month, y = value, group = asset, colour = asset)) +
geom_line() +
geom_text(data = subset(tidy_data, month == 100),
aes(label = asset, colour = asset, x = Inf, y = value), hjust = -.2) +
scale_x_continuous(expand = c(0, 0)) +
scale_colour_discrete(guide = 'none') +
theme_bw() +
theme(plot.margin = unit(c(1,3,1,1), "lines"))
# Code to turn off clipping
gt2 <- ggplotGrob(p2)
gt2$layout$clip[gt2$layout$name == "panel"] <- "off"
grid.draw(gt2)
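As far as I know, more recent versions of ggplot2 (3.0.0 onward) can turn clipping off directly with coord_cartesian(clip = "off"), so the gtable step is not needed. Below is a minimal sketch of the geom_text variant under that assumption. The data frame is generated without tidyr purely for illustration, and p3 is just a name picked for this example.

```r
library(ggplot2)

# Illustrative data (not from the original post): two random walks in long format
set.seed(1)
tidy_data <- data.frame(
  month = rep(1:100, times = 2),
  asset = rep(c("stocks", "bonds"), each = 100),
  value = c(cumsum(rnorm(100, mean = 6, sd = 15)),
            cumsum(rnorm(100, mean = 2, sd = 4)))
)

p3 <- ggplot(tidy_data, aes(x = month, y = value, colour = asset, group = asset)) +
  geom_line() +
  # Label only the last observation of each series, pushed to the right of it
  geom_text(data = subset(tidy_data, month == max(month)),
            aes(label = asset), hjust = -0.2) +
  scale_x_continuous(expand = c(0, 0)) +
  scale_colour_discrete(guide = "none") +
  # clip = "off" lets layers draw outside the panel area
  coord_cartesian(clip = "off") +
  theme_bw() +
  # Widen the right margin so the labels have somewhere to go
  theme(plot.margin = grid::unit(c(1, 3, 1, 1), "lines"))

p3
```

The right-hand margin still has to be wide enough for the longest label, so the second value in plot.margin may need adjusting.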
Since you didn't provide a reproducible example, it's hard to say what the best solution is. However, I would suggest manually adjusting the x-scale: add a "buffer" to the upper limit to make room for the labels.
# extend the upper x limit by a buffer so the end labels fit inside the panel
buffer <- 15  # illustrative value, in units of month; adjust to the label width
p <- ggplot(tidy_data, aes(x = month, y = value, colour = asset)) +
geom_line() +
geom_dl(aes(colour = asset, label = asset), method = "last.points") +
theme_bw() +
xlim(min(tidy_data$month), max(tidy_data$month) + buffer)
Using scale_x_continuous() (or scale_x_discrete() for a categorical axis) would likely also work well here if you want to use the directlabels package. Alternatively, annotate() or a simple geom_text() would also work well.
I have created a monthly plot with facet_wrap.
The plot has 3 rows and 4 columns. Now I want a common y-axis for each row, e.g. the 1st row should share one set of y values, and the same for the 2nd and 3rd rows.
I tried but was not able to do it.
I used:
ggplot(data = PB,
aes(x = new_date, y = Mean, group = 1)) +
geom_line(aes(color = experiment)) +
theme(legend.title = element_blank()) +
facet_wrap( ~MonthAbb, ncol = 4, scales = "free")
The issue is the scales = "free". Remove this and it will set a common scale across rows and columns (or use "free_y" or "free_x" to adjust accordingly).
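To make the difference between the scales settings concrete, here is a small sketch. The demo data frame is made up for illustration (it is not the asker's PB data), and the object names are arbitrary.

```r
library(ggplot2)

# Illustrative data: 12 months with 10 points each, with the mean rising by month
set.seed(1)
demo <- data.frame(
  MonthAbb = factor(rep(month.abb, each = 10), levels = month.abb),
  day      = rep(1:10, times = 12),
  Mean     = rnorm(120, mean = rep(1:12, each = 10))
)

base <- ggplot(demo, aes(x = day, y = Mean)) +
  geom_line()

# Default (scales = "fixed"): one shared x and y scale for every panel
fixed_plot <- base + facet_wrap(~ MonthAbb, ncol = 4)

# scales = "free_y": every panel gets its own y range, x stays shared
free_y_plot <- base + facet_wrap(~ MonthAbb, ncol = 4, scales = "free_y")

fixed_plot
free_y_plot
```

Note that with facet_wrap the free scales are per panel, not per row, which is why a per-row scale needs one of the workarounds.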
If what you're looking for is a separate scale for each row, it will require a bit more work. See the solution at "R: How do I use coord_cartesian on facet_grid with free-ranging axis", which layers invisible points on the plot to force the look you want. Otherwise, a simple solution might be to use gridExtra: plot each row separately, then merge the plots into a grid.
Edit: a gridExtra solution would look something like:
library(gridExtra)
# PB1 and PB2 are assumed to be subsets of PB, one per facet row
g1 <- ggplot(data = PB1, aes(x=new_date, y = Mean, group = 1)) +
geom_line(aes(color = experiment)) +
theme(legend.title = element_blank())
g2 <- ggplot(data = PB2, aes(x=new_date, y = Mean, group = 1)) +
geom_line(aes(color = experiment)) +
theme(legend.title = element_blank())
grid.arrange(g1, g2, nrow=2)
Here is an option to set these on a per-panel basis. It is based on a function I've put in a github package. I'm using some dummy data as example.
library(ggplot2)
library(ggh4x)
df <- data.frame(
x = rep(1:20, 9),
y = c(cumsum(rnorm(60)) + 90,
cumsum(rnorm(60)) - 90,
cumsum(rnorm(60))),
row = rep(LETTERS[1:3], each = 60),
col = rep(LETTERS[1:3], each = 20)
)
ggplot(df, aes(x, y)) +
geom_line() +
facet_wrap(row ~ col, scales = "free_y") +
facetted_pos_scales(
y = rep(list(
scale_y_continuous(limits = c(90, 100)),
scale_y_continuous(limits = c(-100, -80)),
scale_y_continuous(limits = c(0, 20))
), each = 3)
)
As you can see on the image, R automatically assigns the values 0, 0.25... 1 for the size of the point. I was wondering if I could replace the 0, 0.25... 1 and make these text values instead while keeping the actual numerical values from the data.
library(ggplot2)
library(scales)
# SLC4A1 is the asker's own data set (it is not shipped with ggplot2), read from a CSV file
SLC4A1 <- read.csv(file.choose(), header = TRUE)
# bubble chart showing position of polymorphisms on gene, the frequency of each of these
# polymorphisms, where they are prominent on earth, and p-value
SLC4A1ggplot <- ggplot(SLC4A1, aes(Position, log10(Frequency)))+
geom_jitter(aes(col=Geographical.Location, size =(p.value)))+
labs(subtitle="Frequency of Various Polymorphisms", title="SLC4A1 Gene") +
labs(color = "Geographical Location") +
labs(size = "p-value") + labs(x = "Position of Polymorphism on SLC4A1 Gene") +
scale_size_continuous(range=c(1,4.5), trans = "reverse") +
guides(size = guide_legend(reverse = TRUE))
library(tidyverse)
df <- data.frame(x = 1:5, y = 1:5,z = 1:5)
ggplot(df,aes(x = x, y = y, size = z)) +
geom_point()
ggplot(df,aes(x = x, y = y, size = z)) +
geom_point() +
scale_size_continuous(range = 1:2) # control range of circle size
See more here:
https://ggplot2.tidyverse.org/reference/scale_size.html
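To show text instead of numbers in the size legend, one option is the breaks and labels arguments of scale_size_continuous(). The sketch below uses made-up data where z stands in for the p-value column. The label strings and the name p_size are placeholders, not anything from the question.

```r
library(ggplot2)

# Illustrative data; z plays the role of the p-value column
df <- data.frame(x = 1:5, y = 1:5, z = c(0, 0.25, 0.5, 0.75, 1))

p_size <- ggplot(df, aes(x = x, y = y, size = z)) +
  geom_point() +
  scale_size_continuous(
    range  = c(1, 4.5),
    breaks = c(0, 0.5, 1),               # data values that get a legend key
    labels = c("low", "medium", "high")  # hypothetical text shown for those keys
  )

p_size
```

The points are still sized by the numeric values in z; only the text next to the legend keys changes. breaks and labels must have the same length.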
I would like to create a plot with multiple breaks of different sized intervals on the y axis. The closest post I could find is "Show customised X-axis ticks in ggplot2", but it doesn't fully solve my problem.
# dummy data
require(ggplot2)
require(reshape2)
a<-rnorm(mean=15,sd=1.5, n=100)
b<-rnorm(mean=1500,sd=150, n=100)
df<-data.frame(a=a,b=b)
df$x <- factor(seq(100), ordered = T)
df.m <- melt(df)
ggplot(data = df.m, aes(x = x, y=value, colour=variable, group=variable)) +
geom_line() + scale_y_continuous(breaks = c(seq(from = 0, to = 20, by = 1),
seq(from = 1100, to = max(df.m$value), by = 100))) +
theme(axis.text.x = element_text(angle = 90, hjust = 1))
The problem is how to get the first set of breaks to be proportional to the second (thus visible).
Any pointer would be very much appreciated, thanks!
You can try something like this:
# Rearrange the factors in the data.frame
df.m$variable <- factor(df.m$variable, levels = c("b", "a"))
ggplot(data = df.m, aes(x = x, y=value, colour=variable, group=variable)) +
geom_line() + facet_grid(variable~., scales = "free")
Hope this helps
I don't know the name of this type of plot (comments on this are welcome). Essentially it is a bar plot with glyphs that are filled to indicate a loss/gain. The glyph is arrow-like, encoding information about direction and magnitude, and allowing the bar geom underneath to be seen.
This looks interesting, but I can't think of how to do it in ggplot2 (grid framework). How could we recreate this plot in the ggplot2/grid framework (base solutions are welcome as well, for completeness)? Specifically the glyphs, not the text, as that is already pretty straightforward in ggplot2.
Here is some code to create data and traditional overlaid & coordinate flipped dodged bar plots and line graphs to show typical ways of visualizing this type of data.
set.seed(10)
x <- sample(30:60, 12)
y <- jitter(x, 60)
library(ggplot2)
dat <- data.frame(
year = rep(2012:2013, each=12),
month = rep(month.abb, 2),
profit = c(x, y)
)
ggplot() +
geom_bar(data=subset(dat, year==2012), aes(x=month, weight=profit)) +
geom_bar(data=subset(dat, year==2013), aes(x=month, weight=profit), width=.5, fill="red")
ggplot(dat, aes(x=month, fill=factor(year))) +
geom_bar(position="dodge", aes(weight=profit)) +
coord_flip()
ggplot(dat, aes(x=month, y=profit, group = year, color=factor(year))) +
geom_line(size=1)
Here is an example, though perhaps there are other ways:
dat <- data.frame(
year = rep(2012:2013, each=12),
month = factor(rep(1:12, 2), labels=month.abb),
profit = c(x, y)
)
dat2 <- reshape2::dcast(dat, month~ year, value.var = "profit")
names(dat2)[2:3] <- paste0("Y", names(dat2)[2:3])
ggplot(dat2) +
geom_bar(aes(x=month, y = Y2012), stat = "identity", fill = "grey80", width = 0.6) +
geom_segment(aes(x=as.numeric(month)-0.4, xend = as.numeric(month)+0.4, y = Y2013, yend = Y2013)) +
geom_segment(aes(x = month, xend = month, y = Y2013, yend = Y2012, colour = Y2013 < Y2012),
arrow = arrow(60, type = "closed", length = unit(0.1, "inches")), size = 1.5) +
theme_bw()
I'm trying to annotate a bar chart with the percentage of observations falling into that bucket, within a facet. This question is very closely related to this question:
Show % instead of counts in charts of categorical variables but the introduction of faceting introduces a wrinkle. The answer to the related question is to use stat_bin w/ the text geom and then have the label be constructed as so:
stat_bin(geom="text", aes(x = bins,
y = ..count..,
label = paste(round(100*(..count../sum(..count..)),1), "%", sep="")
))
This works fine for an un-faceted plot. However, with facets, this sum(..count..) is summing over the entire collection of observations without regard for the facets. The plot below illustrates the issue---note that the percentages do not sum to 100% within a panel.
Here is the actual code for the figure above:
g.invite.distro <- ggplot(data = df.exp) +
geom_bar(aes(x = invite_bins)) +
facet_wrap(~cat1, ncol=3) +
stat_bin(geom="text", aes(x = invite_bins,
y = ..count..,
label = paste(round(100*(..count../sum(..count..)),1), "%", sep="")
),
vjust = -1, size = 3) +
theme_bw() +
scale_y_continuous(limits = c(0, 3000))
UPDATE: As per request, here's a small example re-producing the issue:
df <- data.frame(x = c('a', 'a', 'b','b'), f = c('c', 'd','d','d'))
ggplot(data = df) + geom_bar(aes(x = x)) +
stat_bin(geom = "text", aes(
x = x,
y = ..count.., label = ..count../sum(..count..)), vjust = -1) +
facet_wrap(~f)
Update: geom_bar requires stat = "identity".
Sometimes it's easier to obtain summaries outside the call to ggplot.
df <- data.frame(x = c('a', 'a', 'b','b'), f = c('c', 'd','d','d'))
# Load packages
library(ggplot2)
library(plyr)
# Obtain summary. 'Freq' is the count, 'pct' is the percent within each 'f'
m = ddply(data.frame(table(df)), .(f), mutate, pct = round(Freq/sum(Freq) * 100, 1))
# Plot the data using the summary data frame
ggplot(data = m, aes(x = x, y = Freq)) +
geom_bar(stat = "identity", width = .7) +
geom_text(aes(label = paste0(pct, "%")), vjust = -1, size = 3) +
facet_wrap(~ f, ncol = 2) + theme_bw() +
scale_y_continuous(limits = c(0, 1.2*max(m$Freq)))
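If you would rather not load plyr, the same per-facet summary can be sketched in base R with ave(), which returns the group total for every row. This is an alternative under the same toy data, not the method used above.

```r
df <- data.frame(x = c("a", "a", "b", "b"), f = c("c", "d", "d", "d"))

# Count every x/f combination; the result has columns x, f and Freq
m <- as.data.frame(table(df))

# Divide each count by the total count within its level of f
m$pct <- round(100 * m$Freq / ave(m$Freq, m$f, FUN = sum), 1)

m
```

The resulting m has the same x, f, Freq and pct columns, so it should work with the same geom_bar(stat = "identity") and geom_text() calls.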