paste function for labeling in ggplot2 - r

In the following plot, I want to relabel the x-axis ticks using the paste0 function.
daata <- data.frame(
q = paste0("q",1:20),
value = runif(n = 20, 2, 10))
ggplot2::ggplot(data = daata, aes(x = q, y = value)) +
geom_col()
so I used the following code:
q = paste0("q",1:20)
labels <- paste0("'", q,"'" , " = ", 1:20) %>% noquote()
# Or
labels <- noquote(paste0("'", q,"'" , " = ", 1:20))
ggplot2::ggplot(data = daata, aes(x = q, y = value)) +
geom_col() +
scale_x_discrete(labels = labels)
But it did not work. Why? (main question)
I am looking for solutions that make labels = c("'q1' = 1", ...) work.
Besides the paste function, I know two alternatives.
Using list:
labels = sapply(1:20, list)
names(labels) <- daata$q
ggplot2::ggplot(data = daata, aes(x = q, y = value)) +
geom_col() +
scale_x_discrete(labels = labels)
Using function:
ggplot2::ggplot(data = daata, aes(x = q, y = value)) +
geom_col() +
scale_x_discrete(labels = function(i){gsub("q", "", i)})
I am eager to know other solutions too.

How about something like this? Extract the question number in the data = step, and use that for the axis:
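library(ggplot2)  # for aes(), geom_col(), scale_x_continuous()
library(dplyr)    # for %>% and mutate()
library(stringr)  # for str_remove()
daata <- data.frame(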
q = paste0("q",1:20),
value = 1:20)
ggplot2::ggplot(data = daata %>% mutate(order = str_remove(q, "q") %>% as.numeric),
aes(x = order, y = value)) +
geom_col() +
scale_x_continuous(breaks = 1:20, minor_breaks = NULL)
Edit: here's an alternative that extracts the numeric part of the label. As you'll note, this preserves the alphabetical ordering created by mapping x to q.
ggplot2::ggplot(data = daata, aes(x = q, y = value)) +
geom_col() +
scale_x_discrete(labels = function(x) readr::parse_number(x))

Why not give a named vector?
labels <- readr::parse_number(as.character(daata$q))
names(labels) <- as.character(daata$q)
p1 <- ggplot2::ggplot(data = daata, aes(x = q, y = value)) +
geom_col()
p2 <- p1 + scale_x_discrete(labels = labels)
cowplot::plot_grid(p1, p2, nrow = 1)
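On the main question of why the pasted labels did not work: as far as I can tell, paste0() returns a plain, unnamed character vector, so each element such as 'q1' = 1 is printed verbatim as a tick label rather than being read as a name = value pair (noquote() only changes how the vector prints). A minimal base-R sketch of the difference; setNames() is just one way to build the named vector, and the variable names are illustrative:

```r
q <- paste0("q", 1:20)

# What the question built: 20 literal strings, with no names attribute
pasted <- paste0("'", q, "'", " = ", 1:20)
pasted[1]        # the single string 'q1' = 1
names(pasted)    # NULL, so there is nothing to match against the breaks

# A named vector: names are the breaks, values are the labels
named <- setNames(as.character(1:20), q)
named[["q3"]]    # "3"

# named can then be passed as scale_x_discrete(labels = named)
```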

Related

With ggplot, use both unit_format and dollar_format from scales for tick text labeling

I have created the following ggplot to highlight my issue:
mydf = data.frame(x = c(1,2,3,4,5), y = c(1,2,3,4,5))
ggplot(data = mydf) +
geom_point(aes(x = x, y = y)) +
scale_x_continuous(labels = scales::dollar_format()) +
scale_y_continuous(labels = scales::unit_format(unit = "M"))
which gives the following amazing, advanced ggplot graph:
My question is simply: how can I make one axis show both the $ and M unit labels, so that the labels read $1M, $2M, etc.? Is this possible? Is it also possible to remove the gap between the number and the M sign, so that it shows 5M instead of 5 M?
Thanks as always!
Hacky, but works:
ggplot(data = mydf) +
geom_point(aes(x = x, y = y)) +
scale_x_continuous(labels = scales::dollar_format()) +
scale_y_continuous(labels = scales::dollar_format(prefix="$", suffix = "M"))
You can also define your own function:
ggplot(data = mydf) +
geom_point(aes(x = x, y = y)) +
scale_x_continuous(labels = f <- function(x) paste0("$",x,"M")) +
scale_y_continuous(labels = f)
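A label function like this can be checked on its own before it goes into a scale, since it simply maps a vector of break values to a character vector. A small base-R sketch; the name fmt_millions is made up for illustration:

```r
# Hypothetical helper: wraps each break value in "$" and "M"
fmt_millions <- function(x) paste0("$", x, "M")

fmt_millions(1:5)
# "$1M" "$2M" "$3M" "$4M" "$5M"
```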
A method using unit_format() on y to generate the desired result: y tick labels such as "$1M", with no gap between the dollar sign and the amount, and no gap between the amount and M:
mydf = data.frame(x = c(1,2,3,4,5), y = c(1,2,3,4,5))
ggplot(data = mydf) +
geom_point(aes(x = x, y = y)) +
scale_x_continuous(labels = scales::dollar_format()) +
scale_y_continuous(labels = scales::unit_format(unit = "M", prefix = "$", sep = "", accuracy = 1))
Using Roman's method: since y uses the dollar format, the results are the same without the prefix = "$" argument in dollar_format():
ggplot(data = mydf) +
geom_point(aes(x = x, y = y)) +
scale_x_continuous(labels = scales::dollar_format()) +
scale_y_continuous(labels = scales::dollar_format(suffix = "M"))

R ggplot2 : continuous x + colors

I'm trying to create a boxplot using ggplot2 with:
X as a continuous variable
Colors for different groups
Here is an example:
x <- sample(c(1,2,5),300,replace = TRUE)
y <- sapply(x,function(mu) rnorm(1,mean = mu))
color <- sample(c("color 1","color 2"),300,replace = TRUE)
data <- data.frame(x, y, color)
I can either have colors and x as a factor:
ggplot(data = data) + geom_boxplot(aes(x = factor(x),y = y,col = color))
or x as a continuous variable and no colors:
ggplot(data = data) + geom_boxplot(aes(x = x,y = y,group = x))
But not both.
Does somebody know how to do this?
Thanks
I think you need one more column for group, which is the combination of color and x. For example, how about simply paste()ing them?
set.seed(1)
x <- sample(c(1,2,5),300,replace = TRUE)
y <- sapply(x,function(mu) rnorm(1,mean = mu))
color <- sample(c("color 1","color 2"),300,replace = TRUE)
data <- data.frame(x, y, color)
library(ggplot2)
ggplot(data = data) +
geom_boxplot(aes(x = x, y = y, col = color, group = paste(color, x)))
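The reason this works is that group needs one distinct value per box, that is, per combination of x and color. A quick sketch on a toy grid (made-up data, not the sample above) showing that paste() yields exactly one key per combination; interaction(color, x) would be an alternative way to build the same grouping:

```r
# Toy data: every combination of the three x values and two colours
toy <- expand.grid(x = c(1, 2, 5),
                   color = c("color 1", "color 2"),
                   stringsAsFactors = FALSE)

# One grouping key per (color, x) pair
toy$grp <- paste(toy$color, toy$x)
toy$grp
length(unique(toy$grp))   # 6 boxes: 3 x positions times 2 colours
```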
You can use scales to change the x-axis scale.
library(ggplot2)
library(scales)
x <- sample(c(1,2,5),300,replace = TRUE)
y <- sapply(x,function(mu) rnorm(1,mean = mu))
color <- sample(c("color 1","color 2"),300,replace = TRUE)
data <- data.frame(x, y, color)
ggplot(data = data) + geom_boxplot(aes(x = factor(x),y = y,col = color)) + scale_x_discrete(limits = c('1','2','3','4','5'))
Hack for dynamic limits:
x_min <- min(data$x)
x_max <- max(data$x)
limits <- as.character(seq(x_min, x_max))  # seq(from, to) gives the values between the two ends
ggplot(data = data) + geom_boxplot(aes(x = factor(x),y = y,col = color)) + scale_x_discrete(limits = limits)
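A note on building the limits dynamically: seq() called with a single vector argument returns positions along that vector, whereas seq(from, to) returns the values themselves. The two only coincide when the minimum is 1. A base-R sketch with made-up values:

```r
vals <- c(3, 4, 7)          # illustrative x values whose minimum is not 1

seq(min(vals):max(vals))    # 1 2 3 4 5  (positions along 3:7)
seq(min(vals), max(vals))   # 3 4 5 6 7  (the actual range)

# Character limits suitable for a discrete scale
as.character(seq(min(vals), max(vals)))
```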
You could misuse the fill aesthetic:
ggplot(data = data) +
geom_boxplot(aes(x = x, y = y, col = color, fill = factor(x))) +
scale_fill_manual(values = rep(NA, 3), guide = "none")

ggplot axis custom order with duplicate labels

set.seed(357)
x <- data.frame(name = sample(letters, 10), val = runif(10), stringsAsFactors = F)
x[c(2,6),"name"] <- c("k","k")
ggplot(x, aes(x = name, y = val)) + theme_bw() + geom_bar(stat = "identity")
How can I plot the axis in the same order as x$name? (Yes, the k is duplicate, I want that to show up in the plot like this axis: c k g f o k s v t q)
In the past I used to do:
x$name <- factor(x$name, levels = x$name[order(x$val)], ordered = T)
which doesn't work any more because of:
http://r.789695.n4.nabble.com/factors-with-non-unique-quot-duplicated-quot-levels-have-been-deprecated-since-2009-are-more-depreca-td4721481.html
This is not a duplicate of: ggplot: order of factors with duplicate levels
The data structure there is completely different.
Also, I have tried setting limits in scale_x_discrete. That doesn't work.
Try this...
x$name2 <- 1:nrow(x)
ggplot(x, aes(x = factor(name2), y = val)) + theme_bw() + geom_bar(stat = "identity") +
scale_x_discrete(labels=x$name)
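The idea behind this is to map x to a unique position and carry the (possibly duplicated) names only as labels. A sketch of the same approach with made-up values, assuming ggplot2 is installed; the pos column name is illustrative:

```r
x <- data.frame(name = c("c", "k", "g", "k"),
                val  = c(0.4, 0.1, 0.9, 0.3),
                stringsAsFactors = FALSE)

# Unique positions in row order; duplicates in name no longer matter
x$pos <- factor(seq_len(nrow(x)))
nlevels(x$pos)            # 4 positions
length(unique(x$name))    # only 3 distinct names

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(x, aes(x = pos, y = val)) +
    geom_col() +
    scale_x_discrete(labels = x$name)   # labels applied in position order
}
```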
Actually, simply add the following setting xlab(x$name)
ggplot(x, aes(x = name, y = val)) + theme_bw() + geom_bar(stat = "identity") + xlab(x$name)

How to extract an interaction to an external variable from a ggplot graph?

I have a ggplot graph defined like this:
x <- seq(0, 10, by = 0.1)
y1 <- sin(x)
y2 <- cos(x)
df1 <- data.frame(x = x, y = y1, type = "sin", id = 1)
df2 <- data.frame(x = x, y = y2, type = "cos", id = 2)
df3 <- data.frame(x = 2, y = 0.5, type = "constant", id = 3)
df4 <- data.frame(x = 4, y = 0.2, type = "constant", id = 4)
combined <- rbind(df1, df2, df3, df4)
ggplot(combined, aes(x, y, colour = interaction(type, id))) + geom_line() +
geom_point(data = subset(combined, type == "constant"))
This works very well as illustrated below:
Now I would like to extract the interaction in a variable to reuse it later (e.g. customize the legend style or labels).
I did that in a very naïve way:
my.interaction <- interaction(combined$type, combined$id)
ggplot(combined, aes(x, y, colour = my.interaction)) + geom_line() +
geom_point(data = subset(combined, type == "constant"))
But then I have an error:
Error: Aesthetics must be either length 1 or the same as the data (2):
x, y, colour
Edit:
Here is the kind of manipulation I could do: edit the linetype of the legend
displayed <- levels(factor(my.interaction))
line.style <- rep(1, length.out = length(displayed))
line.style[grep("constant", displayed)] <- 0
That works:
ggplot(combined, aes(x, y, colour = interaction(type, id))) + geom_line() +
geom_point(data = subset(combined, type == "constant")) +
guides(colour=guide_legend(override.aes=list(linetype = line.style)))
That does not:
ggplot(combined, aes(x, y, colour = my.interaction)) + geom_line() +
geom_point(data = subset(combined, type == "constant")) +
guides(colour=guide_legend(override.aes=list(linetype = line.style)))
In the end, I could also edit the shapes or the legend labels (e.g. "Id: 1 / Type: sin" or any other advanced transformation of the labels based on the interaction values).
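This'll work. The error comes from the geom_point() layer: its data is a 2-row subset, while the external vector my.interaction has one value per row of combined, so the lengths cannot match. What's wrong with adding a column to your data frame? Store the result so that the subset carries the column too:
library(dplyr)  # for %>% and mutate()
combined2 <- combined %>% mutate(my.interaction = paste(type, id, sep='.'))
ggplot(combined2, aes(x, y, colour = my.interaction)) + geom_line() +
geom_point(data = subset(combined2, type == "constant"))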
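Once the interaction lives in a column, the custom legend labels mentioned in the question (e.g. "Id: 1 / Type: sin") can be built as a named vector keyed by that column's values. A sketch under that assumption, with a cut-down stand-in for combined; legend_labels and keys are illustrative names:

```r
# Cut-down stand-in for the combined data frame
combined <- data.frame(
  x    = c(0, 1, 0, 1, 2, 4),
  y    = c(0, 0.84, 1, 0.54, 0.5, 0.2),
  type = c("sin", "sin", "cos", "cos", "constant", "constant"),
  id   = c(1, 1, 2, 2, 3, 4)
)
combined$my.interaction <- paste(combined$type, combined$id, sep = ".")

# One row per distinct (type, id) pair
keys <- unique(combined[, c("type", "id", "my.interaction")])

# Names are the values of my.interaction, values are the legend text
legend_labels <- setNames(
  paste0("Id: ", keys$id, " / Type: ", keys$type),
  keys$my.interaction
)
legend_labels[["sin.1"]]   # "Id: 1 / Type: sin"

# Could then be used as scale_colour_discrete(labels = legend_labels)
```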

Legend for summary statistics in ggplot2

Here is the code for the plot
library(ggplot2)
df <- data.frame(gp = factor(rep(letters[1:3], each = 10)), y = rnorm(30))
library(plyr)
ds <- ddply(df, .(gp), summarise, mean = mean(y), sd = sd(y))
ggplot(df, aes(x = gp, y = y)) +
geom_point() +
geom_point(data = ds, aes(y = mean), colour = 'red', size = 3)
I want a legend for this plot that identifies the data values and the mean values, something like this:
Black point = Data
Red point = Mean.
How can I achieve this?
Use a manual scale, i.e. in your case scale_colour_manual. Then map the colours to values in the scale using the aes() function of each geom:
ggplot(df, aes(x = gp, y = y)) +
geom_point(aes(colour="data")) +
geom_point(data = ds, aes(y = mean, colour = "mean"), size = 3) +
scale_colour_manual("Legend", values=c("mean"="red", "data"="black"))
You can combine the means and the data in the same data.frame and map colour and size to a column that takes the value either data or mean:
library(reshape2)
# in long format
dsl <- melt(ds, value.name = 'y')
# add variable column to df data.frame
df[['variable']] <- 'data'
# combine
all_data <- rbind(df,dsl)
# drop sd rows
data_w_mean <- subset(all_data,variable != 'sd',drop = T)
# create vectors for use with scale_..._manual
colour_scales <- setNames(c('black','red'),c('data','mean'))
size_scales <- setNames(c(1,3),c('data','mean') )
ggplot(data_w_mean, aes(x = gp, y = y)) +
geom_point(aes(colour = variable, size = variable)) +
scale_colour_manual(name = 'Type', values = colour_scales) +
scale_size_manual(name = 'Type', values = size_scales)
Or, instead of combining, include the column in both data sets:
dsl_mean <- subset(dsl,variable != 'sd',drop = T)
ggplot(df, aes(x = gp, y = y, colour = variable, size = variable)) +
geom_point() +
geom_point(data = dsl_mean) +
scale_colour_manual(name = 'Type', values = colour_scales) +
scale_size_manual(name = 'Type', values = size_scales)
This gives the same result.

Resources