How to add a legend inside a combined plot in ggplot2 - r

I have the following df and ggplot2 code to make a scatter plot but failed to add a legend inside the plot. Thanks :)
x1 = 1:5
x2 = 6:10
y1 = 3:7
y2 = 2:6
df <- data.frame(x1, y1, x2, y2)
ggplot(df) + geom_point(aes(x=x1, y = y1),col='red') + geom_point(aes(x = x2, y = y2),col='black')

Try:
x1 = 1:5
x2 = 6:10
y1 = 3:7
y2 = 2:6
df <- data.frame(x1, y1, x2, y2)
ggplot(df) + geom_point(aes(x=x1, y = y1, col = "1")) +
geom_point(aes(x = x2, y = y2, col = "2")) + scale_colour_manual(values = c("red", "black"))
In the above code, by putting col = "1" and col = "2" inside aes(), you're telling ggplot to map colour as a dimension of the plot (and not just paint the points red and black). That mapping is why a legend now appears, and the strings "1" and "2" become its labels. scale_colour_manual then lets you set the colours for those labels to red and black, instead of the default red and blue.
The same applies anytime you want to add any dimension to the plot. But, instead of using col and scale_colour_manual, you would use an alternative such as shape and scale_shape_manual.
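As a rough illustration of the shape variant mentioned above, here is a minimal sketch. The shape codes 16 and 17 and the legend title "Series" are my own choices for the example, not something from the question.

```r
library(ggplot2)

df <- data.frame(x1 = 1:5, y1 = 3:7, x2 = 6:10, y2 = 2:6)

# Constant strings inside aes() create a shape scale, and therefore a legend.
p_shape <- ggplot(df) +
  geom_point(aes(x = x1, y = y1, shape = "1")) +
  geom_point(aes(x = x2, y = y2, shape = "2")) +
  # 16 = filled circle, 17 = filled triangle (arbitrary picks for this sketch)
  scale_shape_manual(values = c("1" = 16, "2" = 17)) +
  labs(x = "x", y = "y", shape = "Series")

p_shape
```

Naming the values ("1" = 16, "2" = 17) ties each shape to its label explicitly, so the result does not depend on the order of the vector.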

Here is a way to do it with the data in long format:
#data into long format
x1 = 1:5
x2 = 6:10
y1 = 3:7
y2 = 2:6
df <- data.frame(x=c(x1, x2), y=c(y1, y2), group=rep(c("x1", "x2"), c(5, 5)))
#plot it
library(ggplot2)
ggplot(df) +
geom_point(aes(x=x, y = y, colour=group))+
scale_colour_manual(values=c("red", "black"))
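The snippets so far create the legend but leave it at its default position beside the panel, while the title asks for it inside the plot. A hedged sketch of one way to move it: set legend.position in theme() to relative coordinates between 0 and 1. As far as I know, ggplot2 3.5.0 and later prefer legend.position = "inside" together with legend.position.inside, so the sketch branches on the installed version. The coordinates c(0.85, 0.2) are just an example.

```r
library(ggplot2)

df <- data.frame(x = c(1:5, 6:10), y = c(3:7, 2:6),
                 group = rep(c("x1", "x2"), each = 5))

p <- ggplot(df) +
  geom_point(aes(x = x, y = y, colour = group)) +
  scale_colour_manual(values = c(x1 = "red", x2 = "black"))

# c(0, 0) is the bottom-left corner, c(1, 1) the top-right corner.
# Assumption: the "inside" form is available from ggplot2 3.5.0 onwards.
inside_theme <- if (utils::packageVersion("ggplot2") >= "3.5.0") {
  theme(legend.position = "inside", legend.position.inside = c(0.85, 0.2))
} else {
  theme(legend.position = c(0.85, 0.2))
}

p_inside <- p + inside_theme
p_inside
```

Pick coordinates that fall in an empty region of the data, otherwise the legend will cover points.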

Related

Control legend in ggplotly when using subplot

I use the R plotly package and the functions ggplotly and subplot to create an interactive plot consisting of multiple individually interactive ggplot2 plots. Some of the plots share the same grouping variables.
col <- factor(rep(c(1, 2), 5))
fill <- factor(c(rep("a", 5), rep("b", 5)))
x1 <- x2 <- y1 <- y2 <- 1:10
x3 <- y3 <- 11:20
d1 <- dplyr::tibble(x1 = x1, y1 = y1, col = col)
d2 <- dplyr::tibble(x2 = x2, y2 = y2, col = col, fill = fill)
d3 <- dplyr::tibble(x3 = x3, y3 = y3, col = col)
g1 <-
ggplot2::ggplot(d1) +
ggplot2::geom_point(ggplot2::aes(x = x1, y = y1, col = col))
g2 <-
ggplot2::ggplot(d2) +
ggplot2::geom_point(ggplot2::aes(x = x2, y = y2, col = col, fill = fill)) +
ggplot2::scale_fill_manual(values = c("red","green"))
g3 <-
ggplot2::ggplot(d3) +
ggplot2::geom_point(ggplot2::aes(x = x3, y = y3, col = col))
plotly::subplot(plotly::ggplotly(g1), plotly::ggplotly(g2), plotly::ggplotly(g3))
1) How can I remove the duplicated "col" labels in the interactive plotly legend?
2) How can I remove the legend for "fill", but keep the legend for "col"?
EDIT: I know about the following "dirty" solution, which is to manually disable the legend:
t <-
plotly::subplot(plotly::ggplotly(g1), plotly::ggplotly(g2), plotly::ggplotly(g3))
t$x$data[[1]]$showlegend <- FALSE
t$x$data[[2]]$showlegend <- FALSE
t$x$data[[3]]$showlegend <- FALSE
t$x$data[[4]]$showlegend <- FALSE
However, this requires me to know the positions of the list elements in advance, which is why I am looking for a more general solution.
Another way to manually remove the unwanted legends is to use style(). In your example, lt <- t %>% style(showlegend = FALSE, traces = 3:n), with n <- 8 defined beforehand, will suppress the unwanted legends. Note that t is already piped in, so it must not be passed to style() a second time.
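If the trace positions are not known in advance, one possible generalisation is to walk over the traces and hide every legend entry whose name has already appeared. This is only a sketch: hide_duplicate_legends is a hypothetical helper, not part of plotly, and it assumes the object stores its traces in x$data with name and showlegend fields, as the manual solution in the question does. Traces built from more than one aesthetic may carry combined names, so it is worth inspecting the names first.

```r
# Hypothetical helper: hide legend entries whose trace name was already seen.
# `p` is assumed to be a list with traces stored in p$x$data.
hide_duplicate_legends <- function(p) {
  seen <- character(0)
  for (i in seq_along(p$x$data)) {
    nm <- p$x$data[[i]]$name
    if (is.null(nm)) next                  # unnamed trace: leave untouched
    if (nm %in% seen) {
      p$x$data[[i]]$showlegend <- FALSE    # duplicate: hide its legend entry
    } else {
      seen <- c(seen, nm)                  # first occurrence: keep it
    }
  }
  p
}

# Mock stand-in for a subplot() result, so the sketch runs without plotly.
mock <- list(x = list(data = list(
  list(name = "1", showlegend = TRUE),
  list(name = "2", showlegend = TRUE),
  list(name = "1", showlegend = TRUE),
  list(name = "2", showlegend = TRUE)
)))

cleaned <- hide_duplicate_legends(mock)
legend_flags <- sapply(cleaned$x$data, function(tr) tr$showlegend)
```

With a real plot the call would be hide_duplicate_legends(t), where t is the subplot() result from the question.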

Dynamically creating unit labels (K, Mn, Bn, Tn) + adjusting axis limit for dual-axis ggplot2 graph?

I have a date-time dataset with two columns y1 and y2. y2 is always bigger than y1 by an unknown order of magnitude, and both y2 and y1 can have zeroes. When y2 is really big (in the hundreds of billions), reading its y-axis is tedious (even with commas). I want to be able to change 1,000 to K, 1,000,000 to M, 10,000,000 to 10M, and so forth.
Optional/Bonus: I want to change the y-axis for either plot to show only the range of y for which data is available, i.e. if y2 has values that are all above 1,000,000, don't plot a graph starting from zero for this column. Likewise for y1.
Data:
df = data.frame('date' = c(seq(as.Date('2019-01-01'), as.Date('2019-02-01'), 'day')))
df$y1 = runif(nrow(df))
df$y2 = runif(nrow(df)) + 10000000
My Current Approach:
Calculating how much to scale y2 by:
#Calculates nearest power of 10
log10_ceiling <- function(x) {
10^(ceiling(log10(x)))
}
scale = df$y2 / df$y1
scale = scale[!is.na(scale)]
scale = scale[!is.infinite(scale)]
scale = log10_ceiling(max(scale))/100
Plotting:
p <- ggplot(df, aes(x = date))
p <- p + geom_line(aes(y = y1, colour = "y1"))
p <- p + geom_line(aes(y = y2/scale, colour = "y2"))+
scale_y_continuous(sec.axis = sec_axis(trans = ~.*scale, name = "y2"))
p <- p + scale_colour_manual(values = c("blue", "red"))
p <- p + labs(y = "y1",
x = "date",
colour = "Parameter")
p <- p + theme(legend.position = c(0.8, 0.9))
p
I am aware of label=unit_format(unit = "B"), but don't know how to change it dynamically to M, or handle cases where the scale is 10B instead of 1B.
Edit- a problem with how I'm currently calculating the scale is y2 is always plotted to the next order of magnitude. i.e.- if y2 values fall between 10 and 11 for instance, the window shows y2 from 0 to 100.
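One way the dynamic K/M/B/T labelling could be approached is with a small labelling function passed to the labels argument of the scale. The sketch below uses base R only. short_label is a hypothetical helper name, and the choice of three significant digits is an assumption of mine.

```r
# Hypothetical helper: format numbers with K / M / B / T suffixes.
short_label <- function(x) {
  units  <- c("", "K", "M", "B", "T")
  breaks <- c(1e3, 1e6, 1e9, 1e12)
  # 0 for values below 1e3, 1 for thousands, 2 for millions, and so on
  idx    <- findInterval(abs(x), breaks)
  scaled <- x / 1000^idx
  # "fg" keeps up to 3 significant digits and drops trailing zeros
  out <- paste0(formatC(scaled, format = "fg", digits = 3), units[idx + 1])
  out[is.na(x)] <- NA_character_   # axis breaks can contain NA
  out
}

labels_demo <- short_label(c(0, 999, 1000, 1500, 1e6, 1e7, 2.5e9, 1e12))
```

It would then be used as scale_y_continuous(labels = short_label, sec.axis = sec_axis(~ . * scale, name = "y2", labels = short_label)). If I remember correctly, recent versions of the scales package offer something similar through label_number(scale_cut = cut_short_scale()), which may be preferable to a hand-rolled helper.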

How to plot many x variables against one y variable using ggplot function in

I have an excel file with multiple columns with titles as x, x1, x2, x3, x4 etc. I am using ggplot function in R to plot x against x1. The code is
data %>%
ggplot(aes(x = x1, y = x)) +
geom_point(colour = "red") +
geom_smooth(method = "lm", fill = NA)
How can I modify the present code so as to plot x against x1, x against x2, x against x3 and x against x4 in the same ggplot call?
You should change the way your data.frame is formatted to do this easily with ggplot2 syntax.
Instead of having 5 columns (x, x1, x2, x3, x4), you may want a data.frame with 3 columns: x, y and type, with type being a categorical variable indicating which column each y value comes from (x1, x2, x3 or x4).
That would be something like this:
df <- data.frame(x = rep(data$x, 4),
y = c(data$x1, data$x2, data$x3, data$x4),
type = rep(c("x1", "x2", "x3", "x4"), each = nrow(data)))
Then, with this data.frame, you can set the aes in order to plot x according to y for each category of your variable type thanks to the color argument.
ggplot(df, aes(x = x, y = y, color = type)) + geom_point() + geom_smooth(method = "lm", fill = NA)
You should check http://www.sthda.com/english/wiki/ggplot2-scatter-plots-quick-start-guide-r-software-and-data-visualization for detailed explanations and customizations.
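The reshaping step above can also be written so that it does not hard-code the column names. This is a sketch in base R, with a small toy data.frame standing in for the Excel file. The toy values and the name value_cols are made up for the example.

```r
# Toy stand-in for the spreadsheet; column names follow the question.
data <- data.frame(
  x  = c(2, 4, 6, 8),
  x1 = c(1, 2, 3, 4),
  x2 = c(2, 3, 5, 9),
  x3 = c(0, 1, 1, 2)
)

# Every column except x is treated as a series to stack.
value_cols <- setdiff(names(data), "x")

long <- data.frame(
  x    = rep(data$x, times = length(value_cols)),
  y    = unlist(data[value_cols], use.names = FALSE),  # columns stacked in order
  type = rep(value_cols, each = nrow(data))
)
```

The long data.frame can then go into the same ggplot(long, aes(x = x, y = y, color = type)) call, however many x1, x2, ... columns the file contains.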

Drawing polygon: limits?

The polygon function in R seems rather simple...however I can't get it to work.
It easily works with this code:
x <- seq(-3,3,0.01)
y1 <- dnorm(x,0,1)
y2 <- 0.5*dnorm(x,0,1)
plot(x,y1,type="l",bty="L",xlab="X",ylab="dnorm(X)")
points(x,y2,type="l",col="red")
polygon(c(x,rev(x)),c(y2,rev(y1)),col="skyblue")
When adapting this to something else, it doesn't work. Here is some code to reproduce the issue:
lowerbound = c(0.05522914,0.06567045,0.07429926,0.08108482,0.08624472,0.09008050,0.09288837,0.09492226)
upperbound = c(0.1743657,0.1494058,0.1333106,0.1227383,0.1156714,0.1108787,0.1075915,0.1053178)
lim = c(100,200,400,800,1600,3200,6400,12800)
plot(upperbound, ylim=c(0, 0.2), type="b", axes=FALSE)
lines(lowerbound, type="b", col="red")
atvalues <- seq(1:8)
axis(side=1, at=atvalues, labels=lim)
axis(side=2, at=c(0,0.05,0.1,0.15,0.2), labels=c(0,0.05,0.1,0.15,0.2))
polygon(lowerbound,upperbound, col="skyblue")
It also doesn't work when only segmenting a subset when directly calling the coordinates:
xpoly <- c(100,200,200,100)
ypoly <- c(lowerbound[1], lowerbound[2], upperbound[2], upperbound[1])
polygon(xpoly,ypoly, col="skyblue")
What am I missing?
Plotting the whole polygon
You need to supply both x and y to polygon. Normally, you'd also do that for plot, but if you don't it will just use the Index as x, that is integers 1 to n. We can use that to make an x range. seq_along will create a 1:n vector, where n is the length of another object.
x <- c(seq_along(upperbound), rev(seq_along(lowerbound)))
y <- c(lowerbound, rev(upperbound))
plot(upperbound, ylim=c(0, 0.2), type="b", axes=FALSE)
lines(lowerbound, type="b", col="red")
atvalues <- seq(1:8)
axis(side=1, at=atvalues, labels=lim)
axis(side=2, at=c(0,0.05,0.1,0.15,0.2), labels=c(0,0.05,0.1,0.15,0.2))
polygon(x = x, y = y, col="skyblue")
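The reason for the c(..., rev(...)) pattern is that polygon() expects the outline as one closed path: along the lower curve from left to right, then back along the upper curve from right to left. A small sketch of that idea follows, using a hypothetical helper band_outline that is not part of base R.

```r
# Hypothetical helper: build the outline of the band between two curves.
# x, lower and upper must have the same length.
band_outline <- function(x, lower, upper) {
  stopifnot(length(x) == length(lower), length(x) == length(upper))
  list(
    x = c(x, rev(x)),          # left to right, then right to left
    y = c(lower, rev(upper))   # along the bottom, then back along the top
  )
}

band <- band_outline(1:3, lower = c(1, 2, 3), upper = c(4, 5, 6))
```

After a plot() call, polygon(band$x, band$y, col = "skyblue") would fill the band. Passing lowerbound and upperbound directly, as in the question, treats one vector as x and the other as y, which is why nothing sensible is drawn.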
Plotting a subset
For a subset, I would create the y first, and then use the old x to easily get the x values:
y2 <- c(lowerbound[1:2], upperbound[2:1])
x2 <- x[match(y2, y)]  # look up each y2 value in y; which(y2 == y) only works here because of vector recycling
polygon(x2, y2, col="skyblue")
How I would do it
Creating something like this is much easier in ggplot2, where geom_ribbon does a lot of the heavy lifting. We just have to make an actual data.frame, and stop relying on indices.
Full polygon:
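library(ggplot2)
# d is not defined in the snippet; reconstructed here from the column names used below
d <- data.frame(x = lim, low = lowerbound, up = upperbound)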
ggplot(d, aes(x = x, ymin = low, ymax = up)) +
geom_ribbon(fill = 'skyblue', alpha = 0.5) +
geom_line(aes(y = low), col = 'red') +
geom_line(aes(y = up), col = 'black') +
scale_x_continuous(trans = 'log2') +
theme_bw()
Subset:
ggplot(d, aes(x = x, ymin = low, ymax = up)) +
geom_ribbon(data = d[1:2, ], fill = 'skyblue', alpha = 0.5) +
geom_line(aes(y = low), col = 'red') +
geom_line(aes(y = up), col = 'black') +
scale_x_continuous(trans = 'log2') +
theme_bw()

Fill superimposed ellipses in ggplot2 scatterplots

This question is a follow-up of "How can a data ellipse be superimposed on a ggplot2 scatterplot?".
I want to create a 2D scatterplot using ggplot2 with filled superimposed confidence ellipses. Using the solution of Etienne Low-Décarie from the above mentioned post, I do get superimposed ellipses to work. The solution is based on stat_ellipse available from https://github.com/JoFrhwld/FAAV/blob/master/r/stat-ellipse.R
Q: How can I fill the inner area of the ellipse(s) with a certain color (more specifically I want to use the color of the ellipse border with some alpha)?
Here is the minimal working example modified from the above mentioned post:
# create data
set.seed(20130226)
n <- 200
x1 <- rnorm(n, mean = 2)
y1 <- 1.5 + 0.4 * x1 + rnorm(n)
x2 <- rnorm(n, mean = -1)
y2 <- 3.5 - 1.2 * x2 + rnorm(n)
class <- rep(c("A", "B"), each = n)
df <- data.frame(x = c(x1, x2), y = c(y1, y2), colour = class)
# get code for "stat_ellipse"
library(devtools)
library(ggplot2)
source_url("https://raw.github.com/JoFrhwld/FAAV/master/r/stat-ellipse.R")
# scatterplot with confidence ellipses (but inner ellipse areas are not filled)
qplot(data = df, x = x, y = y, colour = class) + stat_ellipse()
As mentioned in the comments, polygon is needed here:
qplot(data = df, x = x, y = y, colour = class) +
stat_ellipse(geom = "polygon", alpha = 1/2, aes(fill = class))
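For what it's worth, current ggplot2 releases ship their own stat_ellipse, so sourcing the external script should no longer be necessary. As far as I know qplot is also deprecated in recent versions. Under those assumptions, the same plot can be sketched with ggplot() directly. The alpha value of 0.5 mirrors the answer above.

```r
library(ggplot2)

# Same simulated data as in the question
set.seed(20130226)
n  <- 200
x1 <- rnorm(n, mean = 2)
y1 <- 1.5 + 0.4 * x1 + rnorm(n)
x2 <- rnorm(n, mean = -1)
y2 <- 3.5 - 1.2 * x2 + rnorm(n)
df <- data.frame(x = c(x1, x2), y = c(y1, y2),
                 class = rep(c("A", "B"), each = n))

p_ell <- ggplot(df, aes(x = x, y = y, colour = class)) +
  geom_point() +
  # geom = "polygon" makes the ellipse fillable; fill follows the border colour
  stat_ellipse(aes(fill = class), geom = "polygon", alpha = 0.5)

p_ell
```

Mapping both colour and fill to class produces one merged legend, since the two scales share the same labels.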
