Label placement in ggplot2 using the geom_text function in R

I'm designing some graphs within a function, using ggplot2's geom_text.
It is a sequence of five graphs and, in each one, I want to place my label (my text) in the top-right position.
The problem is that I will constantly change my X and Y values (according to an input interval), so the X and Y coordinates of the label will change and may even fall outside the scale.
So how do I keep the label placement fixed, say in the top right of my graph?
Here's my code:
parte.mac <- subset(dados, subset = (dados$Especie == 'C.macelaria' & dados$Temp >= minima & dados$Temp <= maxima))
mac <- qplot(Temp, Tempo, data = parte.mac, color = Especie, main = 'C.macelaria', geom = c("point", "line"), add = T) +
stat_smooth(method = 'lm', level = 0.99, alpha = 0.5, aes(group=1), color = 'blue') +
geom_text(x = maxima, y = mean(range(dados$Tempo)), label = mac.sm, parse = TRUE)
Please, help

Echoing the comment by @jazzuro, could you provide us with your reproducible data (don't send anything confidential!) by running
dput(parte.mac)
and pasting the output into your question?
In the absence of your exact data, I'll echo @baptiste with a simple example using the "faithful" dataset of Old Faithful geyser eruptions:
data(faithful)
head(faithful)
p <- qplot(x=eruptions, y=waiting, data=faithful)
and then here is one example of an annotation:
p + annotate("text", x=3, y=40, label="Group 1") + annotate("text", x=4.5, y=60, label="Group 2")
Below is a second example, which uses min() and max() of the plotted variables to place the annotations:
p + annotate("text", x=min(faithful$eruptions), y=min(faithful$waiting), label="Group 1") + annotate("text", x=max(faithful$eruptions), y=max(faithful$waiting), label="Group 2")
If this doesn't help, remember to dput your data and paste into your question.
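To keep a label pinned to the top-right corner whatever the data range is, one common approach is to pass Inf as the position and pull the text back inside the panel with hjust and vjust. The following is a minimal sketch on the faithful data, not the asker's own data; the helper name top_right_label and the justification values 1.1 and 1.5 are illustrative assumptions to tune by eye.

```r
library(ggplot2)

# Hypothetical helper: on continuous position scales, Inf is drawn at the
# upper edge of the panel, so the label follows the panel, not the data.
top_right_label <- function(p, text) {
  p + annotate("text", x = Inf, y = Inf, label = text,
               hjust = 1.1, vjust = 1.5)  # values > 1 nudge the text inwards
}

p <- ggplot(faithful, aes(x = eruptions, y = waiting)) + geom_point()
p_labelled <- top_right_label(p, "Group 1")
p_labelled
```

Because the position no longer depends on the data, the same call should work for each of the five graphs even when the input interval changes.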

Related

How to plot multiple boxplots with numeric x values properly in ggplot2?

I am trying to get a boxplot with 3 different tools in each dataset size like the one below:
ggplot(data1, aes(x = dataset, y = time, color = tool)) + geom_boxplot() +
labs(x = 'Datasets', y = 'Seconds', title = 'Time') +
scale_y_log10() + theme_bw()
But I need to transform x-axis to log scale. For that, I need to numericize each dataset to be able to transform them to log scale. Even without transforming them, they look like the one below:
ggplot(data2, aes(x = dataset, y = time, color = tool)) + geom_boxplot() +
labs(x = 'Datasets', y = 'Seconds', title = 'Time') +
scale_y_log10() + theme_bw()
I checked the boxplot parameters and the grouping parameters of aes, but could not resolve my problem. At first, I thought this problem was caused by the log scaling, but removing those elements did not resolve it.
What am I missing exactly? Thanks...
Files are in this link. "data2" is the numericized version of "data1".
Your question was a tough cookie, but I learned something new from it!
Just using group = dataset is not sufficient because you also have the tool variable to look out for. After digging around a bit, I found this post which made use of the interaction() function.
This is the trick that was missing. You need group because the x values are numeric rather than a factor, and the grouping must also separate the data by tool; interaction() builds one group for each combination of the two variables.
# This is for pretty-printing the axis labels
my_labs <- function(x){
paste0(x/1000, "k")
}
levs <- unique(data2$dataset)
ggplot(data2, aes(x = dataset, y = time, color = tool,
group = interaction(dataset, tool))) +
geom_boxplot() + labs(x = 'Datasets', y = 'Seconds', title = 'Time') +
scale_x_log10(breaks = levs, labels = my_labs) + # define a log scale with your axis ticks
scale_y_log10() + theme_bw()
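Since the original files are not available here, the following sketch illustrates the same grouping idea on made-up data. The data frame toy, its column values, and the tool names are assumptions for illustration only.

```r
library(ggplot2)

set.seed(1)
# Hypothetical timings: 3 numeric dataset sizes x 2 tools, 10 runs each
toy <- expand.grid(run = 1:10,
                   dataset = c(1000, 10000, 100000),
                   tool = c("toolA", "toolB"),
                   stringsAsFactors = FALSE)
toy$time <- rexp(nrow(toy), rate = 1 / 5)

# interaction() yields one factor level per (dataset, tool) combination
groups <- interaction(toy$dataset, toy$tool)
levels(groups)

# With a numeric x, the boxes are only separated if group says how
p_box <- ggplot(toy, aes(x = dataset, y = time, color = tool,
                         group = interaction(dataset, tool))) +
  geom_boxplot() +
  scale_x_log10(breaks = unique(toy$dataset)) +
  scale_y_log10()
p_box
```

Inspecting levels(groups) is a quick way to confirm that every box you expect corresponds to exactly one group.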

Adding a legend to a double plot using ggplot

I'm trying to add a legend to my plot using ggplot in R. Everything is OK so far. My case is unusual because I'm working with three variables, not to draw a 3D plot but to draw a 2D plot showing v1 vs. v2 and v1 vs. v3 together.
The plot itself comes out correctly, but I don't get the legend.
This is my code:
colfuncWarmest <- colorRampPalette(c("orange","red"))
colfuncColdest <- colorRampPalette(c("green","blue"))
plot <- ggplot(data=temperatures_Celsius, aes(x=temperatures_Celsius$Year))
params <- labs(title=paste("Year vs. (Warmest minimum temperature\n",
"and Coldest minimum temperature)"),
x="Year",
y="Coldest min temp / Warmest min temp")
theme <- theme(plot.title = element_text(hjust = 0.5)) #Centering title
wmtl<-geom_line(data=temperatures_Celsius,
aes(y=temperatures_Celsius$Warmest.Minimum.Temperature..C.,
color="red"
),
colour=colfuncWarmest(length(temperatures_Celsius$Year))
)
wmtt<-stat_smooth(data=temperatures_Celsius,
aes(y=temperatures_Celsius$Warmest.Minimum.Temperature..C.),
color="green",
method = "loess")
cmtl<- geom_line(data=temperatures_Celsius,
aes(y=temperatures_Celsius$Coldest.Minimum.Temperature..C.,
color="blue"
),
colour=colfuncColdest(length(temperatures_Celsius$Year))
)
cmtt<-stat_smooth(data=temperatures_Celsius,
aes(y=temperatures_Celsius$Coldest.Minimum.Temperature..C.),
color="orange",
method = "loess")
plot + theme + params + wmtl + wmtt + cmtl + cmtt
(Not all code was added because I did a lot of changes. It is only to get an idea) I get this:
If I add
+ scale_color_manual(values=c("red","blue"))
(for example) in order to add the legend, I get no error, but nothing different happens. I get the same plot.
What I want is only two lines. A red one that says "Warmest minimum" and another blue line that says "Coldest minimum". What could I do to get my legend in this way?
Thanks in advance.
Generally I would say that the correct way to add a legend to a ggplot is to map a variable to an aesthetic (such as fill, color, size, alpha). Usually this consists of transforming the data to long format (key-value pairs) and mapping the key variable to color or another aesthetic.
In the current case this is not desirable since there is next to no chance the color gradient (colorRampPalette) on the line could be achieved. So I suggest a hacky way where a dummy layer (layer which will not be seen on the plot) is used to create the legend.
Here is some data
temperatures_Celsius = data.frame(year = 1900:2000,
Warmest = rnorm(101, mean = 20, sd = 5), # 1900:2000 spans 101 years
Coldest = rnorm(101, mean = 10, sd = 5))
Your plot:
colfuncWarmest <- colorRampPalette(c("orange","red"))
colfuncColdest <- colorRampPalette(c("green","blue"))
plot <- ggplot(data=temperatures_Celsius, aes(x=year))
params <- labs(title=paste("Year vs. (Warmest minimum temperature\n",
"and Coldest minimum temperature)"),
x="Year",
y="Coldest min temp / Warmest min temp")
theme <- theme(plot.title = element_text(hjust = 0.5)) #Centering title
wmtl<-geom_line(data=temperatures_Celsius,
aes(y=Warmest),
colour=colfuncWarmest(length(temperatures_Celsius$year)))
wmtt<-stat_smooth(data=temperatures_Celsius,
aes(y=Warmest),
color="green",
method = "loess")
cmtl<- geom_line(data=temperatures_Celsius,
aes(y=Coldest),
colour=colfuncColdest(length(temperatures_Celsius$year)))
cmtt<-stat_smooth(data=temperatures_Celsius,
aes(y=Coldest),
color="orange",
method = "loess")
plot1 <- plot + theme + params + wmtl + wmtt + cmtl + cmtt
Now add a dummy layer:
plot1+
geom_line(data = data.frame(year = c(1900, 1900),
group = factor(c("Coldest", "Warmest"), levels = c("Warmest", "Coldest")),
value = c(10, 20)), aes(x=year, y = value, color = group), size = 2)+
scale_color_manual(values=c("red","blue"))
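For comparison, here is a minimal sketch of the long-format approach described at the start of this answer, for cases where the per-line colour gradient is not needed. The simulated data and the names temps, long, and series are assumptions for illustration.

```r
library(ggplot2)

set.seed(42)
# Simulated stand-in for the real temperature data
temps <- data.frame(year = 1900:2000,
                    Warmest = rnorm(101, mean = 20, sd = 5),
                    Coldest = rnorm(101, mean = 10, sd = 5))

# Long format: one row per (year, series) pair
long <- data.frame(
  year = rep(temps$year, times = 2),
  series = factor(rep(c("Warmest minimum", "Coldest minimum"),
                      each = nrow(temps)),
                  levels = c("Warmest minimum", "Coldest minimum")),
  value = c(temps$Warmest, temps$Coldest)
)

# Mapping series to colour is what makes ggplot2 draw the legend
p_long <- ggplot(long, aes(x = year, y = value, colour = series)) +
  geom_line() +
  scale_colour_manual(values = c("Warmest minimum" = "red",
                                 "Coldest minimum" = "blue"))
p_long
```

Naming the entries of values ties each colour to a level explicitly, so the legend does not depend on the factor order.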

Labels outside of plots

I am hoping to add the labels "a)" and "b)" to my two plots so that I can differentiate and discuss them more effectively when writing up. I've tried to do this with the text and legend functions, but I'm not getting any good results. Ideally I would have the a) in the very top left of the ep.var.hist plot (1st plot) and the b) in the very top left of the tp.var.hist plot (2nd plot), with the labels sitting outside the actual plot and above the y-axis labels.
My code is below
par(mfrow=c(2,1), mar=c(4,4,0.9,4))
ep.var.hist<-hist(data.ep, breaks=5, xlim=c(0,0.011), ylim=c(0,6000), xlab=NULL, main=NULL)
tp.var.hist<-hist(data.tp, breaks=66, xlim=c(0,0.011), ylim=c(0,6000), xlab="Variance", main=NULL)
One option is the cowplot package, which is designed to ease the process of producing publication-ready plots:
library(cowplot)
library(ggplot2)
sepal <- ggplot(data = iris, aes(x = Species, y = Sepal.Length)) +
geom_bar(stat = "identity") +
theme(text = element_text(margin = margin(), debug = FALSE))
petal <- ggplot(data = iris, aes(x = Species, y = Petal.Length)) +
geom_bar(stat = "identity") +
theme(text = element_text(margin = margin(), debug = FALSE))
plot_grid(sepal, petal, labels = c("A", "B"))
plot_grid and save_plot (a polished version of ggsave) are my two favorite cowplot functions. I highly recommend looking at the help pages for more options and customization.
If you really want to keep it to just base graphics, try this solution; I think you are looking for adj = 0:
par(mfrow=c(2,1), mar=c(4,4,0.9,4))
petal <- hist(iris$Petal.Length, main = "Petal", adj = 0)
sepal <- hist(iris$Sepal.Length, main = "Sepal", adj = 0)
Full disclosure: I would strongly consider using ggplot2 in the long term, as @rosscova suggested. You will have a lot more options for controlling the details of your plots, plus lots of modern visualizations that base R just can't do. There is a reason why ggplot2 is so popular :)
EDIT: I apologise, I went ahead and answered without realising you said "outside of plots", which I don't think my answer can achieve.
I don't know how to do this in base, but ggplot2 has the annotate function to achieve what you're after. Here's an example from which you can start playing (I've added a few bits you might want to help get you started):
library( ggplot2 )
plot <- ggplot( diamonds ) +
geom_histogram( aes( carat ), bins = 30 ) +
annotate( "text", label = "label here", x = 1, y = 7500, col = "red" ) +
annotate( "text", label = "and another", x = 2, y = 5500, col = "blue" )
plot <- plot +
xlim( 0, 3 ) +
ggtitle( "Main title" ) +
xlab( "label x" ) +
ylab( "label y" )
plot
Adjust the x and y values within the annotate function to move the label around. You can add as many of these as you like by adding more calls to annotate.
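For the base-graphics version of the original question, one possible sketch is to convert a position given as a fraction of the figure region into user coordinates with grconvertX() and grconvertY(), then draw the text with clipping switched off. The helper add_panel_label, the fractions 0.02 and 0.95, and the larger top margin are assumptions, and the iris data stands in for the asker's data.

```r
# Null PDF device so the sketch also runs non-interactively;
# drop this line and dev.off() to draw on screen instead
pdf(NULL)
par(mfrow = c(2, 1), mar = c(4, 4, 2, 4))  # top margin widened to fit the label

# Hypothetical helper: put a label near the top-left corner of the
# current figure region, outside the plotting area
add_panel_label <- function(label) {
  x <- grconvertX(0.02, from = "nfc", to = "user")
  y <- grconvertY(0.95, from = "nfc", to = "user")
  text(x, y, label, xpd = NA, adj = c(0, 1), font = 2)  # xpd = NA: no clipping
  invisible(c(x = x, y = y))
}

hist(iris$Sepal.Length, main = NULL, xlab = "")
pos_a <- add_panel_label("a)")
hist(iris$Petal.Length, main = NULL, xlab = "Length")
pos_b <- add_panel_label("b)")
dev.off()
```

The helper has to be called right after each hist() call, because the conversion uses the coordinate system of the panel that was drawn last.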

coord_flip() mixing up axis labels?

I am trying to build a horizontal bar chart.
library(ggplot2)
library(plyr)
salary <- read.csv('September 15 2015 Salary Information - Alphabetical.csv', na.strings = '')
head(salary)
salary$X <- NULL
salary$X.1 <- NULL
salary$Club <- as.factor(salary$Club)
levels(salary$Club)
salary$Base.Salary <- gsub(',', '', salary$Base.Salary)
salary$Base.Salary <- as.numeric(as.character(salary$Base.Salary))
salary$Base.Salary <- salary$Base.Salary / 1000000
salary <- ddply(salary, .(Club), transform, pos = cumsum(Base.Salary) - (0.5 * Base.Salary))
ggplot(salary, aes(x = Club, y = Base.Salary, fill = Base.Salary)) +
geom_bar(stat = 'identity') +
ylab('Base Salary in millions of dollars') +
theme(axis.title.y = element_blank()) +
coord_flip() +
geom_text(data = subset(salary, Base.Salary > 2), aes(label = Last.Name, y = pos))
(credits to this thread: Showing data values on stacked bar chart in ggplot2 for the text position calculation)
and the resulting plot is this:
I was thoroughly confused for a while, because I was using xlab to specify the label, and theme(axis.title.y = element_blank()) to hide the y label. However, this didn't work, and I got it to work by changing it to ylab. This seems rather confusing, is it intended?
This seems rather confusing, is it intended?
Yes.
Rather than using theme() to hide the y label, I think
labs(x = "My x label",
y = "")
is more straightforward.
When you flip x and y, they take their labels with them. If this weren't the case, a graph compared with and without coordinate flip would have incorrect axis labels in one of the two cases - which seems confusing and inconsistent. As-is, the labels will be correct always (with and without coord_flip).
Theming, on the other hand, is applied after-the-fact.
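A small sketch of the distinction, using made-up data (the club names and salary figures are assumptions): the label set through labs() travels with the y aesthetic and ends up on the horizontal axis after the flip, while the theme element refers to the vertical axis as it is finally drawn.

```r
library(ggplot2)

# Made-up stand-in for the salary data
clubs <- data.frame(club = c("A", "B", "C"),
                    salary = c(1.2, 3.4, 2.1))

p_flip <- ggplot(clubs, aes(x = club, y = salary)) +
  geom_col() +
  labs(y = "Base salary (millions of dollars)") +  # follows the y aesthetic
  coord_flip() +
  theme(axis.title.y = element_blank())  # hides the title of the drawn vertical axis
p_flip
```

Removing coord_flip() from this sketch should leave the salary label attached to the salary values, which is the consistency described above.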

How to smartly place text labels beside points of different sizes in ggplot2?

I am trying to make a labeled bubble plot with ggplot2 in R. Here is the simplified scenario:
I have a data frame with 4 variables: 3 quantitative variables, x, y, and z, and another variable that labels the points, lab.
I want to make a scatter plot, where the position is determined by x and y, and the size of the points is determined by z. I then want to place text labels beside the points (say, to the right of the point) without overlapping the text on top of the point.
If the points did not vary in size, I could try to simply modify the aesthetic of the geom_text layer by adding a scaling constant (e.g. aes(x=x+1, y=y+1)). However, even in this simple case, I am having a problem with positioning the text correctly because the points do not scale with the output dimensions of the plot. In other words, the size of the points remains constant in a 500x500 plot and a 1000x1000 plot - they do not scale up with the dimensions of the outputted plot.
Therefore, I think I have to scale the position of the label by the size (e.g. dimensions) of the output plot, or I have to get the radius of the points from ggplot somehow and shift my text labels. Is there a way to do this in ggplot2?
Here is some code:
# Stupid data
df <- data.frame(x=c(1,2,3),
y=c(1,2,3),
z=c(1,2,1),
lab=c("a","b","c"), stringsAsFactors=FALSE)
# Plot with bad label placement
ggplot(aes(x=x, y=y), data=df) +
geom_point(aes(size=z)) +
geom_text(aes(label=lab),
colour="red") +
scale_size_continuous(range=c(5, 50), guide="none")
EDIT: I should mention, I tried hjust and vjust inside of geom_text, but it does not produce the desired effect.
# Trying hjust and vjust, but it doesn't look nice
ggplot(aes(x=x, y=y), data=df) +
geom_point(aes(size=z)) +
geom_text(aes(label=lab), hjust=0, vjust=0.5,
colour="red") +
scale_size_continuous(range=c(5, 50), guide="none")
EDIT: I managed to get something that works for now, thanks to Henrik and shujaa. I will leave the question open just in case someone shares a more general solution.
Just a blurb of what I am using this for: I am plotting a map, and indicating the amount of precipitation at certain stations with a point that is sized proportionally to the amount of precipitation observed. I wanted to add a station label beside each point in an aesthetically pleasing manner. I will be making more of these plots for different regions, and my output plot may have a different resolution or scale (e.g. due to different projections) for each plot, so a general solution is desired. I might try my hand at creating a custom position_jitter, like baptiste suggested, if I have time during the weekend.
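It appears that the position_* functions don't have access to the scales used by other layers, so that's a no-go. You could make a clone of GeomText that shifts the labels according to the mapped size,
but it's a lot of effort for a very kludgy and fragile solution. Note that the code below relies on the proto-based internals that ggplot2 used before version 2.0.0, so it will not run unchanged on current releases: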
geom_shiftedtext <- function (mapping = NULL, data = NULL, stat = "identity",
position = "identity",
parse = FALSE, ...) {
GeomShiftedtext$new(mapping = mapping, data = data, stat = stat, position = position,
parse = parse, ...)
}
require(proto)
GeomShiftedtext <- proto(ggplot2:::GeomText, {
objname <- "shiftedtext"
draw <- function(., data, scales, coordinates, ..., parse = FALSE, na.rm = FALSE) {
data <- remove_missing(data, na.rm,
c("x", "y", "label"), name = "geom_shiftedtext")
lab <- data$label
if (parse) {
lab <- parse(text = lab)
}
with(coord_transform(coordinates, data, scales),
textGrob(lab, unit(x, "native") + unit(0.375* size, "mm"),
unit(y, "native"),
hjust=hjust, vjust=vjust, rot=angle,
gp = gpar(col = alpha(colour, alpha),
fontfamily = family, fontface = fontface, lineheight = lineheight))
)
}
})
df <- data.frame(x=c(1,2,3),
y=c(1,2,3),
z=c(1.2,2,1),
lab=c("a","b","c"), stringsAsFactors=FALSE)
ggplot(aes(x=x, y=y), data=df) +
geom_point(aes(size=z), shape=1) +
geom_shiftedtext(aes(label=lab, size=z),
hjust=0, colour="red") +
scale_size_continuous(range=c(5, 100), guide="none")
This isn't a very general solution, because you'll need to tweak it every time, but you should be able to add to the x value for the text some value that's linear depending on z.
I had luck with
ggplot(aes(x=x, y=y), data=df) +
geom_point(aes(size=z)) +
geom_text(aes(label=lab, x = x + .06 + .14 * (z - min(z))),
colour="red") +
scale_size_continuous(range=c(5, 50), guide="none")
but, as the font size depends on your window size, you would need to decide on your output size and tweak accordingly. I started with x = x + .05 + 0 * (z-min(z)) and calibrated the intercept based on the smallest point, then when I was happy with that I adjusted the linear term for the biggest point.
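The calibration described above can be wrapped in a small function so the intercept and slope are tuned in one place. This is only a sketch of the same linear shift; the name label_offset is an assumption, and the defaults are the values found by eye for this particular output size.

```r
library(ggplot2)

# Hypothetical helper: horizontal shift that grows linearly with z.
# intercept is calibrated on the smallest point, slope on the largest.
label_offset <- function(z, intercept = 0.06, slope = 0.14) {
  intercept + slope * (z - min(z))
}

df <- data.frame(x = c(1, 2, 3),
                 y = c(1, 2, 3),
                 z = c(1, 2, 1),
                 lab = c("a", "b", "c"), stringsAsFactors = FALSE)
df$label_x <- df$x + label_offset(df$z)

p_shift <- ggplot(df, aes(x = x, y = y)) +
  geom_point(aes(size = z)) +
  geom_text(aes(x = label_x, label = lab), colour = "red") +
  scale_size_continuous(range = c(5, 50), guide = "none")
p_shift
```

The offsets are in data units, so they still need recalibrating whenever the output size or the axis range changes.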
Another alternative. Looks OK with your test data, but you need to check how general it is.
dodge <- abs(scale(df$z))/4
ggplot(data = df, aes(x = x, y = y)) +
geom_point(aes(size = z)) +
geom_text(aes(x = x + dodge), label = df$lab, colour = "red") +
scale_size_continuous(range = c(5, 50), guide = "none")
Update
Just tried position_jitter, but the width argument only takes one value, so right now I am not sure how useful that function would be. But I would be happy to find that I am wrong. Example with another small data set:
df3 <- mtcars[1:10, ]
ggplot(data = df3, aes(x = wt, y = mpg)) +
geom_point(aes(size = qsec), alpha = 0.1) +
geom_text(label = df3$carb, position = position_jitter(width = 0.1, height = 0)) +
scale_size_continuous(range = c(5, 50), guide = "none")

Resources