I'd like to start a bar chart somewhere other than y = 0. In my case, I want to start the bar chart at y = 1.
As an example, let's say that I build an identity geom_bar() chart with ggplot2.
df <- data.frame(values = c(1, 2, 0),
labels = c("A", "B", "C"))
library(ggplot2)
ggplot(df, aes(x = labels, y = values, fill = labels, colour = labels)) +
geom_bar(stat="identity")
Now, I'm not asking how to set scale or axis limits. I want bars representing values less than 1 to flow down from y = 1.
It needs to look like this...but with a different y axis:
Any advice?
You could just change the labels manually, as shown in the other answer. However, I think conceptually the better solution is to define a transformation object that transforms the y axis scale as requested. With that approach, you're literally just modifying the relative baseline for the bar plots, and you can still set breaks and limits as you normally would.
df <- data.frame(values = c(1,2,0), labels = c("A", "B", "C"))
t_shift <- scales::trans_new("shift",
transform = function(x) {x-1},
inverse = function(x) {x+1})
ggplot(df, aes(x = labels, y = values, fill = labels, colour = labels)) +
geom_bar(stat="identity") +
scale_y_continuous(trans = t_shift)
Setting breaks and limits:
ggplot(df, aes(x = labels, y = values, fill = labels, colour = labels)) +
geom_bar(stat="identity") +
scale_y_continuous(trans = t_shift,
limits = c(-0.5, 2.5),
breaks = c(0, 1, 2))
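To see why the shift transformation moves the baseline, it can help to apply the transformation object directly. The sketch below assumes (as I understand ggplot2's behaviour) that bars are anchored at 0 in the transformed space, so the baseline seen on the axis is whatever the inverse returns for 0. The names shifted, restored and baseline are only for illustration.

```r
library(scales)

# Same transformation object as in the answer above
t_shift <- trans_new("shift",
                     transform = function(x) x - 1,
                     inverse   = function(x) x + 1)

values <- c(1, 2, 0)

# What the bars see: values relative to the new baseline
shifted <- t_shift$transform(values)

# Axis labels are mapped back with the inverse, so the round trip is lossless
restored <- t_shift$inverse(shifted)

# 0 in transformed space corresponds to this value on the axis
baseline <- t_shift$inverse(0)
```

Changing the constant in both transform and inverse (they must stay consistent) should move the baseline to any other value.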
You could use
ggplot(df, aes(x = labels, y = values-1, fill = labels, colour = labels)) +
geom_bar(stat = "identity") +
scale_y_continuous(name = 'values',
breaks = seq(-1, 1, 0.5),
labels = seq(-1, 1, 0.5) + 1)
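A variant of the same idea, sketched here under the assumption that you would rather not hard-code the breaks: scale_y_continuous() also accepts a function for labels, so the offset can be added back automatically. The names baseline and relabel are hypothetical helpers, not part of ggplot2.

```r
library(ggplot2)

df <- data.frame(values = c(1, 2, 0), labels = c("A", "B", "C"))

baseline <- 1  # assumed baseline; change it to move where the bars start

# Maps plotted positions back to the original values for the axis labels
relabel <- function(breaks) breaks + baseline

p <- ggplot(df, aes(x = labels, y = values - baseline, fill = labels)) +
  geom_col() +
  scale_y_continuous(name = "values", labels = relabel)
p
```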
I have been confused by this problem for a long time. A simple data frame is constructed as follows:
data <- data.frame(
x = 1:5,
y = 5:1,
fill = c(rep("pink", 3), rep("blue", 2)),
shape = c(rep(21, 3), rep(22, 2))
)
Suppose I want to show the legend for the fill:
uniFill <- unique(data$fill)
p <- ggplot(data,
mapping = aes(x = x,
y = y,
fill = fill)) +
geom_point(shape = data$shape) +
# show legend so that I do not call `scale_fill_identity()`
scale_fill_manual(values = uniFill,
labels = uniFill,
breaks = uniFill)
p
The plot itself is OK; however, the legend is not correct.
I guess, maybe different shapes (21 to 25) cannot be merged? Then, I partition the data into two subsets where the first set has shape 21 and the second has shape 22.
data1 <- data[1:3, ]
data2 <- data[4:5, ]
# > data1$shape
# [1] 21 21 21
# > data2$shape
# [1] 22 22
ggplot(mapping = aes(x = x,
y = y,
fill = fill)) +
geom_point(data = data1, shape = data1$shape) +
geom_point(data = data2, shape = data2$shape) +
scale_fill_manual(values = uniFill,
labels = uniFill,
breaks = uniFill)
Unfortunately, the legend does not change. Then, I changed the shape from a vector to a scalar, as in
ggplot(mapping = aes(x = x,
y = y,
fill = fill)) +
geom_point(data = data1, shape = 21) +
geom_point(data = data2, shape = 22) +
scale_fill_manual(values = uniFill,
labels = uniFill,
breaks = uniFill)
The legend of the fill color is correct finally...
So what happens here? Is it a bug? Is it possible to just add a single layer but with different shapes (21 to 25)?
A possible solution is that one can add component guides(), as in
p +
guides(fill = guide_legend(override.aes = list(fill = uniFill,
shape = 21)))
But I am more interested in why p does not work (legend)
The main reason your legend does not work in your first example is that you did not put shape in the aesthetics.
I have a couple other suggestions: Do not define colors in your data frame; instead define a column to change the aesthetics using a code. Then define your fill and shape values explicitly. Each of the scales needs to have the same name - in this case "Legend."
Give this edit a try.
data <- data.frame(
x = 1:5,
y = 5:1,
fill = c(rep("p", 3), rep("b", 2))
)
uniFill <- c("p"="pink", "b"="blue")
uniShape <- c("p" = 21, "b" = 22)
p <- ggplot(data,
mapping = aes(x = x,
y = y,
fill = fill,
shape = fill)) +
geom_point() +
# show legend so that I do not call `scale_fill_identity()`
scale_fill_manual("Legend",values = uniFill,
labels = uniFill)+
scale_shape_manual("Legend",values = uniShape,
labels = uniFill)
p
(edit) If your fill and shape aesthetics do not match up, I don't see any other way than to use guides and two legends. Notice that if your attribute column is descriptive, you do not need to set the labels and your code will be cleaner (see shape vs fill aesthetics).
data <- data.frame(
x = 1:5,
y = 5:1,
fill = c(rep("p", 3), rep("b", 2)),
shape = c(rep("circles", 2), rep("squares", 3))
)
uniFill <- c("p"="pink", "b"="blue")
uniShape <- c("circles" = 21, "squares" = 22)
p <- ggplot(data,
mapping = aes(x = x,
y = y,
fill = fill,
shape = shape)) +
geom_point() +
# show legend so that I do not call `scale_fill_identity()`
scale_fill_manual("Legend fill",values = uniFill,
labels = uniFill)+
scale_shape_manual("Legend shape",values = uniShape )+
guides(fill = guide_legend("Legend fill", override.aes = list(shape = 21)))
p
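On the original question of why the vector version of shape gives a wrong legend: my understanding (an assumption based on reading ggplot2's legend code, not documented behaviour) is that aesthetics set outside aes() are carried into the legend keys only when they have length 1. A vector such as data$shape would then be dropped, so the keys fall back to the default point shape, which has no fill. The sketch below only inspects the layer objects; aes_params is an internal field and may change between versions.

```r
library(ggplot2)

shapes <- c(21, 21, 21, 22, 22)

vector_layer <- geom_point(shape = shapes)  # as in the first attempt
scalar_layer <- geom_point(shape = 21)      # as in the attempt that worked

# Length of the fixed (non-mapped) shape parameter stored on each layer
n_vector <- length(vector_layer$aes_params$shape)
n_scalar <- length(scalar_layer$aes_params$shape)

cat("vector layer:", n_vector, "- scalar layer:", n_scalar, "\n")
```

If that reading is correct, it also explains why mapping shape inside aes(), as in the answer above, is the more robust route for a single layer.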
I want to plot the data separately in a bubble plot, like the image on the right (I made this in PowerPoint just to visualize it).
At the moment I can only create a plot that looks like the one on the left, where the bubbles overlap. How can I do this in R?
b <- ggplot(df, aes(x = Year, y = Type))
b + geom_point(aes(color = Spp, size = value), alpha = 0.6) +
scale_color_manual(values = c("#0000FF", "#DAA520", "#228B22","#E7B888")) +
scale_size(range = c(0.5, 12))
You can use position_dodge() as the position argument of geom_point. If you apply it directly to your code, it will dodge the points horizontally, so the idea is to switch your x and y variables and use coord_flip to get them the right way round:
library(ggplot2)
ggplot(df, aes(y = as.factor(Year), x = Type))+
geom_point(aes(color = Group, size = Value), alpha = 0.6, position = position_dodge(0.9)) +
scale_color_manual(values = c("#0000FF", "#DAA520", "#228B22","#E7B888")) +
scale_size(range = c(1, 15)) +
coord_flip()
Does it look like what you are trying to achieve?
EDIT: Adding text in the middle of each points
To add a label inside each point, you can use geom_text and set the same position_dodge2 argument as for geom_point.
NB: I use position_dodge2 instead of position_dodge and slightly change values of width because I found position_dodge2 more adapted to this case.
library(ggplot2)
ggplot(df, aes(y = as.factor(Year), x = Type))+
geom_point(aes(color = Group, size = Value), alpha = 0.6,
position = position_dodge2(width = 1)) +
scale_color_manual(values = c("#0000FF", "#DAA520", "#228B22","#E7B888")) +
scale_size(range = c(3, 15)) +
coord_flip()+
geom_text(aes(label = Value, group = Group),
position = position_dodge2(width = 1))
Reproducible example
As you did not provide a reproducible example, I made one that is maybe not fully representative of your original dataset. If my answer is not working for you, you should consider providing a reproducible example (see here: How to make a great R reproducible example)
Group <- c(LETTERS[1:3],"A",LETTERS[1:2],LETTERS[1:3])
Year <- c(rep(1918,4),rep(2018,5))
Type <- c(rep("PP",3),"QQ","PP","PP","QQ","QQ","QQ")
Value <- sample(1:50,9)
df <- data.frame(Group, Year, Value, Type)
df$Type <- factor(df$Type, levels = c("PP","QQ"))
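Because Value is drawn with sample(), the numbers differ on every run. A small addition, assuming an arbitrary seed of 42, should make the example data identical each time it is rebuilt:

```r
set.seed(42)  # arbitrary seed; any fixed integer works
Value <- sample(1:50, 9)

# Re-running with the same seed reproduces the same draw
set.seed(42)
Value_again <- sample(1:50, 9)
identical(Value, Value_again)
```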
I've been struggling with one last bit of code to make this graph I'm working on really work for me and my audience. I have a bar chart with two lines (one acts as a rolling average, the other as the peak of that rolling average). What I want to do is label that peak line with a number, once per facet, where the number differs between facets. Here's some stripped-down data and code:
tdf <- data.frame(a=as.POSIXct(c("2019-10-15 08:00:00","2019-10-15 09:00:00","2019-10-15 10:00:00","2019-10-15 08:00:00","2019-10-15 09:00:00","2019-10-15 10:00:00")),
b=as.Date(c("2019-09-02","2019-09-02","2019-09-02","2019-09-03","2019-09-03","2019-09-03")),
m1=c(0.2222222,0.3636364, 0.2307692, 0.4000000, 0.3428571, 0.3529412),
m2=c(0.2222222,0.2929293, 0.2972028, 0.3153846, 0.3714286, 0.3529412),
m3=c(0.2929293, 0.2929293, 0.2929293, 0.3529412,0.3529412,0.3529412))
g <- ggplot(data = tdf, aes(x = a, y = m1)) +
geom_bar(stat = "identity", alpha = 0.75, fill = 352) +
xlab("time of day") +
ylab("metric name") +
ggtitle("Graph Title") +
scale_x_datetime(breaks = scales::date_breaks("1 hours"),
date_labels = "%H")+
scale_y_continuous(breaks = seq(0, 1, by = 0.1),
labels = scales::percent) +
theme_minimal()
# add line for m2
g <- g +
geom_line(data = tdf,
aes(x = a, y = m2),
color = "blue",
size = 1.2)
# add line for m3
g <- g + geom_line(data=tdf,
aes(x = a, y = m3),
color = "#d95f02",
size = 0.6,
linetype = "dashed")
# last attempt to label the line results in an error: Invalid input: time_trans works with objects of class POSIXct
#g <- g+geom_text(aes(x=-Inf, y=Inf, label=median(tdf$m3)), size=2, hjust=-0.5, vjust= 1.4,inherit.aes=FALSE)
# facet wrap
g <- g + facet_wrap(~b, ncol = 5, scales = "fixed")
I've seen a few techniques, but none of them seem to deal with having a time on the x-axis in the facets, with each facet having a different date. I'm reasonably certain it's related to the date, but I have no clue how to make the text block appear in each facet anyway.
You just need to pass a different dataset to the labeling layer that still preserves your faceting variable. This will work using dplyr:
library(dplyr)  # provides %>%, group_by() and summarize()
g <- g +
geom_text(data = tdf %>%
group_by(b) %>%
summarize(median = median(m3)),
aes(x = as.POSIXct(-Inf, origin="1970-01-01"),
y = Inf,
label = median),
size = 2,
hjust = -0.5,
vjust = 1.4,
inherit.aes = FALSE)
We also have to explicitly convert the x to a date/time value for the axis to work.
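To make the idea concrete, here is what the summarised label data looks like on its own: one row per value of the faceting variable b. This is only a sketch; tdf_small and label_df are illustrative names, and the percent formatting is an optional extra that was not in the original answer.

```r
library(dplyr)

# Only the columns needed for the labels (values copied from tdf above)
tdf_small <- data.frame(
  b  = as.Date(rep(c("2019-09-02", "2019-09-03"), each = 3)),
  m3 = c(0.2929293, 0.2929293, 0.2929293, 0.3529412, 0.3529412, 0.3529412)
)

# One row per facet, plus a formatted label column
label_df <- tdf_small %>%
  group_by(b) %>%
  summarize(median = median(m3)) %>%
  mutate(label = scales::percent(median, accuracy = 0.1))

label_df
```

Because label_df keeps the column b, facet_wrap(~b) can route each row to its own panel; using label = label in aes() would then print the formatted text instead of the raw number.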
I prepared a MWE and hope for help on how to set ticks and labels at different position on the x-axis for a grouped bar plot.
library(ggplot2)
library(reshape2)
data <- data.frame(name = c("X","Y","Z"), A = c(2,4,6), B = c(1,3,4), C = c(3,4,5))
data <- melt(data, id = 1)
ggplot(data, aes(name,value)) +
geom_bar(aes(fill = variable), position = "dodge", stat = "identity")
The ticks should appear BETWEEN the groups, but the labels centered below the grouped bars (as they are in the figure). I tried to set user-defined breaks (as factors) for scale_x_discrete but it only made my ticks and labels disappear completely.
Any help is much appreciated!
One option would be to convert the discrete x scale to continuous, to facilitate calculation of break positions:
# numeric version of x values
data$x <- as.integer(as.factor(data$name))
1. x ticks between groups of bars
x_tick <- head(unique(data$x), -1) + 0.5
len <- length(x_tick)
ggplot(data, aes(x = x, y = value, fill = variable)) +
geom_col(position = "dodge") +
scale_x_continuous(breaks = c(sort(unique(data$x)), x_tick),
labels = c(sort(unique(data$name)), rep(c(""), len))) +
theme(axis.ticks.x = element_line(color = c(rep(NA, len + 1), rep("black", len))))
2. x ticks before, between, and after groups of bars
x_tick <- c(0, unique(data$x)) + 0.5
len <- length(x_tick)
ggplot(data, aes(x = x, y = value, fill = variable)) +
geom_col(position = "dodge") +
scale_x_continuous(breaks = c(sort(unique(data$x)), x_tick),
labels = c(sort(unique(data$name)), rep(c(""), len))) +
theme(axis.ticks.x = element_line(color = c(rep(NA, len - 1), rep("black", len))))
Don't ask me about the additional grid lines which appeared at 2.25 and 1.75 respectively...
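A possible explanation for those extra grid lines, offered as an assumption rather than documented behaviour: the default minor breaks seem to be placed halfway between consecutive major breaks in the order they are supplied, and here the breaks are not sorted (3 is followed by 1.5 in the first variant and by 0.5 in the second). If that is right, suppressing the minor breaks with minor_breaks = NULL should remove them. The sketch rebuilds the long data by hand instead of using melt(); x_lab and stray are illustrative names.

```r
library(ggplot2)

# Long-format data equivalent to the melted data frame above
data <- data.frame(
  name     = rep(c("X", "Y", "Z"), times = 3),
  variable = rep(c("A", "B", "C"), each = 3),
  value    = c(2, 4, 6, 1, 3, 4, 3, 4, 5)
)
data$x <- as.integer(as.factor(data$name))

x_lab  <- sort(unique(data$x))     # label positions: 1 2 3
x_tick <- head(x_lab, -1) + 0.5    # tick positions between groups

# Midpoint of the unsorted neighbours 3 and 1.5 (first variant)
stray <- mean(c(max(x_lab), x_tick[1]))

p <- ggplot(data, aes(x = x, y = value, fill = variable)) +
  geom_col(position = "dodge") +
  scale_x_continuous(breaks = c(x_lab, x_tick),
                     labels = c(as.character(sort(unique(data$name))),
                                rep("", length(x_tick))),
                     minor_breaks = NULL)  # drop minor grid lines
p
```

The theme() line from the first variant can be appended unchanged to hide the ticks under the labels.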
Here is another solution, which uses the grid package.
library(grid)
nTicks <- 2
tickersPosition <- unit(rep(1:nTicks /(nTicks+1), each=2), "native")
The expression 1:nTicks / (nTicks + 1) identifies the positions where the ticks will be placed.
p1 <- ggplot(data, aes(name,value)) +
geom_bar(aes(fill = variable), position = "dodge", stat = "identity")
To change the position of the ticks, we need to create a gtable
p2 <- ggplot_gtable(ggplot_build(p1))
and find the right grob (using str):
p2$grobs[[7]]$children$axis$grobs[[1]]$x <- tickersPosition
After the position is rewritten, we can run
grid::grid.draw(p2)
which will show warnings. This is because of a different number of splits.
I want to explore the directlabels package with ggplot. I am trying to plot labels at the endpoint of a simple line chart; however, the labels are clipped by the plot panel. (I intend to plot about 10 financial time series in one plot and I thought directlabels would be the best solution.)
I would imagine there may be another solution using annotate or some other geoms. But I would like to solve the problem using directlabels. Please see code and image below. Thanks.
library(ggplot2)
library(directlabels)
library(tidyr)
#generate data frame with random data, for illustration and plot:
x <- seq(1:100)
y <- cumsum(rnorm(n = 100, mean = 6, sd = 15))
y2 <- cumsum(rnorm(n = 100, mean = 2, sd = 4))
data <- as.data.frame(cbind(x, y, y2))
names(data) <- c("month", "stocks", "bonds")
tidy_data <- gather(data, month)
names(tidy_data) <- c("month", "asset", "value")
p <- ggplot(tidy_data, aes(x = month, y = value, colour = asset)) +
geom_line() +
geom_dl(aes(colour = asset, label = asset), method = "last.points") +
theme_bw()
On data visualization principles, I would like to avoid extending the x-axis to make the labels fit--this would mean having data space with no data. Rather, I would like the labels to extend toward the white space beyond the chart box/panel (if that makes sense).
In my opinion, direct labels is the way to go. Indeed, I would position labels at the beginning and at the end of the lines, creating space for the labels using the expand argument of scale_x_continuous(). Also note that with the labels, there is no need for the legend.
This is similar to answers here and here.
library(ggplot2)
library(directlabels)
library(grid)
library(tidyr)
x <- seq(1:100)
y <- cumsum(rnorm(n = 100, mean = 6, sd = 15))
y2 <- cumsum(rnorm(n = 100, mean = 2, sd = 4))
data <- as.data.frame(cbind(x, y, y2))
names(data) <- c("month", "stocks", "bonds")
tidy_data <- gather(data, month)
names(tidy_data) <- c("month", "asset", "value")
ggplot(tidy_data, aes(x = month, y = value, colour = asset, group = asset)) +
geom_line() +
scale_colour_discrete(guide = 'none') +
scale_x_continuous(expand = c(0.15, 0)) +
geom_dl(aes(label = asset), method = list(dl.trans(x = x + .3), "last.bumpup")) +
geom_dl(aes(label = asset), method = list(dl.trans(x = x - .3), "first.bumpup")) +
theme_bw()
If you prefer to push the labels into the plot margin, direct labels will do that. But because the labels are positioned outside the plot panel, clipping needs to be turned off.
p1 <- ggplot(tidy_data, aes(x = month, y = value, colour = asset, group = asset)) +
geom_line() +
scale_colour_discrete(guide = 'none') +
scale_x_continuous(expand = c(0, 0)) +
geom_dl(aes(label = asset), method = list(dl.trans(x = x + .3), "last.bumpup")) +
theme_bw() +
theme(plot.margin = unit(c(1,4,1,1), "lines"))
# Code to turn off clipping
gt1 <- ggplotGrob(p1)
gt1$layout$clip[gt1$layout$name == "panel"] <- "off"
grid.draw(gt1)
This effect can also be achieved using geom_text (and probably also annotate), that is, without the need for direct labels.
p2 = ggplot(tidy_data, aes(x = month, y = value, group = asset, colour = asset)) +
geom_line() +
geom_text(data = subset(tidy_data, month == 100),
aes(label = asset, colour = asset, x = Inf, y = value), hjust = -.2) +
scale_x_continuous(expand = c(0, 0)) +
scale_colour_discrete(guide = 'none') +
theme_bw() +
theme(plot.margin = unit(c(1,3,1,1), "lines"))
# Code to turn off clipping
gt2 <- ggplotGrob(p2)
gt2$layout$clip[gt2$layout$name == "panel"] <- "off"
grid.draw(gt2)
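In more recent ggplot2 releases (3.0.0 and later, as far as I know), clipping can be switched off directly in the coordinate system, which avoids editing the gtable. A minimal sketch with geom_text, assuming made-up data built without tidyr; the seed and the object names long_data, ends and p3 are only for illustration.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed so the random walk is repeatable
n <- 100
long_data <- data.frame(
  month = rep(seq_len(n), times = 2),
  asset = rep(c("stocks", "bonds"), each = n),
  value = c(cumsum(rnorm(n, mean = 6, sd = 15)),
            cumsum(rnorm(n, mean = 2, sd = 4)))
)

# One row per series: the last observation, used to place the labels
ends <- subset(long_data, month == max(month))

p3 <- ggplot(long_data, aes(x = month, y = value, colour = asset)) +
  geom_line() +
  geom_text(data = ends, aes(label = asset), hjust = -0.1) +
  scale_x_continuous(expand = c(0, 0)) +
  scale_colour_discrete(guide = "none") +
  coord_cartesian(clip = "off") +  # let labels draw outside the panel
  theme_bw() +
  theme(plot.margin = unit(c(1, 4, 1, 1), "lines"))
p3
```

The right-hand plot margin still has to be wide enough for the longest label, as in the gtable versions above.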
Since you didn't provide a reproducible example, it's hard to say what the best solution is. However, I would suggest trying to manually adjust the x-scale. Use a "buffer" to increase the plot area.
#generate data frame with random data, for illustration and plot:
p <- ggplot(tidy_data, aes(x = month, y = value, colour = asset)) +
geom_line() +
geom_dl(aes(colour = asset, label = asset), method = "last.points") +
theme_bw() +
xlim(minimum_value, maximum_value + buffer)
Using scale_x_discrete() or scale_x_continuous() would likely also work well here if you want to use the direct labels package. Alternatively, annotate or a simple geom_text would also work well.
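For a concrete version of the buffer idea, the placeholders minimum_value, maximum_value and buffer can be computed from the data. The sketch below assumes a buffer of 15% of the x range (an arbitrary choice) and uses geom_text for the labels so that it runs with ggplot2 alone; a geom_dl() layer could presumably be used in its place. The data and the names long_data, x_limits and ends are made up for illustration.

```r
library(ggplot2)

set.seed(2)  # arbitrary seed for repeatable example data
n <- 100
long_data <- data.frame(
  month = rep(seq_len(n), times = 2),
  asset = rep(c("stocks", "bonds"), each = n),
  value = c(cumsum(rnorm(n, mean = 6, sd = 15)),
            cumsum(rnorm(n, mean = 2, sd = 4)))
)

# Derive the limits from the data instead of typing them in
x_range  <- range(long_data$month)
buffer   <- 0.15 * diff(x_range)  # assumed: 15% extra room on the right
x_limits <- c(x_range[1], x_range[2] + buffer)

# Last observation of each series, where the label is attached
ends <- subset(long_data, month == max(month))

p <- ggplot(long_data, aes(x = month, y = value, colour = asset)) +
  geom_line() +
  geom_text(data = ends, aes(label = asset), hjust = 0, nudge_x = 1) +
  theme_bw() +
  xlim(x_limits[1], x_limits[2])
p
```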