I would like to plot points twice using two different color scales:
In the example here, 5 points are drawn and color is mapped to two covariates (cov1 and cov2). cov1 and cov2 are on different scales: 1 to 5 and 0.01 to 0.05, respectively.
I wish to have 2 independent color keys, one for cov1 and one for cov2,
a bit like in the graph below. However, for the graph below I used 'color = cov1' and 'fill = cov2' in order to bring in another color key...
Any help would be appreciated.
library(ggplot2)
library(gridExtra)  # provides grid.arrange()
gg1 <- ggplot(data = df1 , aes( x = x , y = y ) ) +
geom_point( aes(x = x , y = y - 1 , color = cov1 )) +
geom_point( aes(x = x , y = y + 1 , color = cov2 )) +
scale_y_continuous(limits = c(-3,3))
gg2 <- ggplot(data = df1 , aes( x = x , y = y ) ) +
geom_point( aes(x = x , y = y - 1 , color = cov1 )) +
geom_point( aes(x = x , y = y + 1 , fill = cov2 ), pch = 21 ) +
scale_y_continuous(limits = c(-3,3))
grid.arrange( gg1 , gg2 , ncol = 2 )
If I remember correctly, this is impossible in plain ggplot2, but this repository may be your answer:
https://github.com/eliocamp/ggnewscale
or this one (mentioned in the description of the previous one):
https://github.com/clauswilke/relayer
I haven't used ggplot2 for quite a long time, so I'm not familiar with these two, but I remember that I used one of them at least once.
I've just written a quick example to check if it works:
d1 <- data.frame(x=1:5, y=1)
d2 <- data.frame(x=1:5, y=2)
library(ggplot2)
library(ggnewscale)
ggplot() +
geom_point(data = d1, aes(x=x, y=y, color = x)) +
scale_color_continuous(low = "#0000aa", high="#ffffff") +
new_scale_color() +
geom_point(data = d2, aes(x=x, y=y, color = x)) +
scale_color_continuous(low = "#aa0000", high="#00aa00")
And it seems to work as you want.
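Applied to the covariates from the question, the same idea might look like the sketch below. It assumes ggnewscale is installed, rebuilds a small data frame from the ranges described in the question (the name `my_data` and the gradient colors are illustrative choices, not from the original post), and gives each layer its own named gradient so the two keys can be told apart.

```r
library(ggplot2)
library(ggnewscale)  # assumption: installed, e.g. from CRAN

# Illustrative data: cov1 runs 1..5, cov2 runs 0.01..0.05, as described in the question
my_data <- data.frame(x = 1:5, cov1 = 1:5, cov2 = seq(0.01, 0.05, 0.01))

p <- ggplot(my_data, aes(x = x)) +
  # First layer and its colour scale
  geom_point(aes(y = 0.5, colour = cov1)) +
  scale_colour_gradient(low = "red1", high = "red4", name = "cov1") +
  # Layers added after this call get a fresh, independent colour scale
  new_scale_color() +
  geom_point(aes(y = -0.5, colour = cov2)) +
  scale_colour_gradient(low = "skyblue", high = "navy", name = "cov2") +
  ylim(-1, 1)

p
```

Each call to a colour scale applies to the layers added since the previous `new_scale_color()`, which is why the order of the lines matters here.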
I used your idea about combining col and fill, plus a small hack that uses different shapes for cov1 and cov2:
# sample data
my_data <- data.frame(x = 1:5,
cov1 = 1:5,
cov2 = seq(0.01, 0.05, 0.01))
library(ggplot2)
ggplot() +
geom_point(data = my_data, aes(x = x, y = 0.5, col = cov1), shape = 16) +
scale_color_continuous(low = "red1", high = "red4") +
geom_point(data = my_data, aes(x = x, y = -0.5, fill = cov2), shape = 21, col = "white", size = 2) +
ylim(-1, 1)
Hope it helps.
I have the following graph and code:
Graph
ggplot(long2, aes(x = DATA, y = value, fill = variable)) + geom_area(position="fill", alpha=0.75) +
scale_y_continuous(labels = scales::comma,n.breaks = 5,breaks = waiver()) +
scale_fill_viridis_d() +
scale_x_date(date_labels = "%b/%Y",date_breaks = "6 months") +
ggtitle("Proporcions de les visites, només 9T i 9C") +
xlab("Data") + ylab("% visites") +
theme_minimal() + theme(legend.position="bottom") + guides(fill=guide_legend(title=NULL)) +
annotate("rect", fill = "white", alpha = 0.3,
xmin = as.Date("2020-03-16"), xmax = as.Date("2020-06-22"),
ymin = 0, ymax = 1)
But the result has a sawtooth pattern. How can I smooth it out?
I believe your situation is roughly analogous to the following, wherein we have missing x-positions for one group, but not the other at the same position. This causes spikes if you set position = "fill".
library(ggplot2)
x <- seq_len(100)
df <- data.frame(
x = c(x[-c(25, 75)], x[-50]),
y = c(cos(x[-c(25, 75)]), sin(x[-50])) + 5,
group = rep(c("A", "B"), c(98, 99))
)
ggplot(df, aes(x, y, fill = group)) +
geom_area(position = "fill")
To smooth out these spikes, it has been suggested to linearly interpolate the data at the missing positions.
# Find all used x-positions
ux <- unique(df$x)
# Split data by group, interpolate data groupwise
df <- lapply(split(df, df$group), function(xy) {
approxed <- approx(xy$x, xy$y, xout = ux)
data.frame(x = ux, y = approxed$y, group = xy$group[1])
})
# Recombine data
df <- do.call(rbind, df)
# Now without spikes :)
ggplot(df, aes(x, y, fill = group)) +
geom_area(position = "fill")
Created on 2022-06-17 by the reprex package (v2.0.1)
P.S. I would also have expected a red spike at x=50, but for some reason this didn't happen.
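One way to sanity-check the interpolation step is to confirm that every group ends up with a value at every x-position. The sketch below uses only base R and rebuilds the example data from scratch; `filled` is a name introduced here for illustration.

```r
# Two groups with different missing x-positions (same construction as above)
x <- seq_len(100)
df <- data.frame(
  x = c(x[-c(25, 75)], x[-50]),
  y = c(cos(x[-c(25, 75)]), sin(x[-50])) + 5,
  group = rep(c("A", "B"), c(98, 99))
)

# All x-positions used by any group
ux <- sort(unique(df$x))

# Interpolate each group onto the full set of x-positions
filled <- do.call(rbind, lapply(split(df, df$group), function(xy) {
  data.frame(x = ux, y = approx(xy$x, xy$y, xout = ux)$y, group = xy$group[1])
}))

# Each group should now have one row per x-position and no missing values
table(filled$group)
anyNA(filled$y)
```

Note that `approx()` returns `NA` for positions outside a group's own x-range, so data with gaps at the very start or end of a group would need separate handling.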
I first make a plot
df <- data.frame(x = c(1:40, rep(1:20, 3), 15:40))
p <- ggplot(df, aes(x=x, y = x)) +
stat_density2d(aes(fill='red',alpha=..level..),geom='polygon', show.legend = F)
Then I want to change the geom_density values and use these in another plot.
# build plot
q <- ggplot_build(p)
# Change density
dens <- q$data[[1]]
dens$y <- dens$y - dens$x
Build the other plot using the changed densities, something like this:
# Build another plot
ggplot(df, aes(x=x, y =1)) +
geom_point(alpha = 0.3) +
geom_density2d(dens)
This does not work, however. Is there a way of doing this?
EDIT: doing it when there are multiple groups:
df <- data.frame(x = c(1:40, rep(1:20, 3), 15:40), group = c(rep('A',40), rep('B',60), rep('C',26)))
p <- ggplot(df, aes(x=x, y = x)) +
stat_density2d(aes(fill=group,alpha=..level..),geom='polygon', show.legend = F)
q <- ggplot_build(p)
dens <- q$data[[1]]
dens$y <- dens$y - dens$x
ggplot(df, aes(x=x, y =1)) +
geom_point(aes(col = group), alpha = 0.3) +
geom_polygon(data = dens, aes(x, y, fill = fill, group = piece, alpha = alpha)) +
scale_alpha_identity() +
guides(fill = F, alpha = F)
Results when applied to my own dataset
Although this is exactly what I'm looking for, the fill colors do not seem to correspond to the initial colors (linked to A, B and C):
Like this? It is possible to plot a transformation of the shapes plotted by geom_density. But that's not quite the same as manipulating the underlying density...
ggplot(df, aes(x=x, y =1)) +
geom_point(alpha = 0.3) +
geom_polygon(data = dens, aes(x, y, fill = fill, group = piece, alpha = alpha)) +
scale_alpha_identity() +
guides(fill = "none", alpha = "none")
Edit - OP now has multiple groups. We can plot those with the code below, which produces an artistic plot of questionable utility. It does what you propose, but if you are looking for representative output, I would suggest it is more fruitful to transform the underlying data and summarize that.
ggplot(df, aes(x=x, y =1)) +
geom_point(aes(col = group), alpha = 0.3) +
geom_polygon(data = dens, aes(x, y, fill = group, group = piece, alpha = alpha)) +
scale_alpha_identity() +
guides(fill = "none", alpha = "none") +
theme_minimal()
Hi, I have some code to simulate a Gaussian process. Could someone please help me add a legend in the top right corner of my plot? I want it to state the parameter value for each of the line styles/colours, e.g. l=1, l=5, l=10. Thanks.
# simulate a gaussian process
simGP = function(K){
n = nrow(K)
U = chol(K) # cholesky decomposition
z = rnorm(n)
c(t(U) %*% z)
}
# choose points to simulate the covariance.
x = seq(-1, 1, length.out = 500)
# Exponential kernel ------------------------------------------------------
kernel_exp = function(x, l = 1) {
d = as.matrix(dist(x))/l
K = exp(-d)
diag(K) = diag(K) + 1e-8
K
}
{y1 = simGP(kernel_exp(x,l=10))
y2 = simGP(kernel_exp(x,l=1))
y3 = simGP(kernel_exp(x,l=0.1))
data1 <- as.data.frame(x,y1)
data2 <- as.data.frame(x,y2)
data3 <- as.data.frame(x,y3)
df=data.frame(data1,data2,data3)
ggplot() +
geom_line(data=data1, aes(x=x, y=y1), color="green4", linetype = "twodash", size=0.5) +
geom_line(data=data2, aes(x=x, y=y2), color='red', linetype="longdash", size=0.5) +
geom_line(data=data3, aes(x=x, y=y3), color='blue') +
scale_color_manual(values = colors) +
theme_classic() +
labs(x='input, x',
y='output, f(x)')+
theme(axis.text=element_text(size=16),
axis.title=element_text(size=14))}
You can do it by adding a variable to the data frame and mapping both linetype and colour to it; ggplot2 then builds the legend for you.
If you want to choose the colors and linetypes yourself, use scale_color_manual and scale_linetype_manual.
y1 = simGP(kernel_exp(x,l=10))
y2 = simGP(kernel_exp(x,l=1))
y3 = simGP(kernel_exp(x,l=0.1))
data1 <- data.frame(x, y = y1, value = "10")
data2 <- data.frame(x, y = y2, value = "1")
data3 <- data.frame(x, y = y3, value = "0.1")
df=rbind(data1,data2,data3)
ggplot(data = df, aes(x=x, y=y, color = value, linetype = value, group = value)) +
geom_line(size=0.5) +
theme_classic() +
labs(x='input, x',
y='output, f(x)')+
theme(axis.text=element_text(size=16),
axis.title=element_text(size=14))
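To keep the original colors and line styles and move the legend into the top right corner, something along the following lines may work. This is a sketch under a few assumptions: the seed, the reduced number of points (100 instead of 500), and the names `length_scales` and `l` are choices made here, and the numeric `legend.position` is the older interface (ggplot2 3.5.0 and later prefer `legend.position = "inside"` together with `legend.position.inside`).

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# Same simulation helpers as in the question
simGP <- function(K) {
  U <- chol(K)                 # upper-triangular factor with t(U) %*% U == K
  c(t(U) %*% rnorm(nrow(K)))
}
kernel_exp <- function(x, l = 1) {
  K <- exp(-as.matrix(dist(x)) / l)
  diag(K) <- diag(K) + 1e-8    # small jitter for numerical stability
  K
}

x <- seq(-1, 1, length.out = 100)
length_scales <- c(10, 1, 0.1)

# One long data frame with a label column that drives the legend
df <- do.call(rbind, lapply(length_scales, function(l) {
  data.frame(x = x, y = simGP(kernel_exp(x, l = l)), l = paste0("l = ", l))
}))
df$l <- factor(df$l, levels = paste0("l = ", length_scales))

p <- ggplot(df, aes(x, y, colour = l, linetype = l)) +
  geom_line() +
  # Values are matched to the factor levels in order: l = 10, l = 1, l = 0.1
  scale_colour_manual(values = c("green4", "red", "blue")) +
  scale_linetype_manual(values = c("twodash", "longdash", "solid")) +
  labs(x = "input, x", y = "output, f(x)", colour = NULL, linetype = NULL) +
  theme_classic() +
  # Older numeric interface; newer ggplot2 versions warn and suggest "inside"
  theme(legend.position = c(1, 1), legend.justification = c(1, 1))

p
```

Because colour and linetype are mapped to the same variable and share the same title, ggplot2 merges them into a single legend.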
I am creating several plots to use as frames for a gif. It is supposed to show points growing over time (see plots 1 and 2: the values increase). Using the size aesthetic is problematic, because the scaling is done for each plot individually.
I tried to set breaks with scale_size_area() to provide a sequence of absolute values, so that the scaling is based on all values rather than only the values present in each plot, but without success.
Plot 3 shows how the points should be scaled, but this scaling should be achieved in each plot.
library(tidyverse)
df1 <- data.frame(x = letters[1:5], y = 1:5, size2 = 21:25)
ggplot(df1, aes(x, y, size = y)) +
geom_point() +
scale_size_area(breaks = seq(0,25,1))
ggplot(df1, aes(x, y, size = size2)) +
geom_point() +
scale_size_area(breaks = seq(0,25,1))
df2 <- data.frame(x = letters[1:5], y = 1:5, size2 = 21:25) %>% gather(key, value, y:size2)
ggplot(df2, aes(x, value, size = value)) +
geom_point() +
scale_size_area(breaks = seq(0,25,1))
Created on 2019-05-12 by the reprex package (v0.2.1)
Pass the lower and upper bounds to the limits argument of scale_size_area():
ggplot(df1, aes(x, y, size = y)) +
geom_point() +
labs(
title = "Y on y-axis",
size = NULL
) +
scale_size_area(limits = c(0, 25))
ggplot(df1, aes(x, y, size = size2 )) +
geom_point() +
labs(
title = "size2 on y-axis",
size = NULL
) +
scale_size_area(limits = c(0, 25))
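For a series of gif frames, the shared limits can be computed once from all the values and reused in every plot. A minimal sketch, assuming each frame's sizes live in their own column; `size_cols`, `shared_limits` and `frames` are names introduced here for illustration:

```r
library(ggplot2)

df1 <- data.frame(x = letters[1:5], y = 1:5, size2 = 21:25)

# Columns holding the size values for each frame
size_cols <- c("y", "size2")

# One set of limits covering every frame, starting at 0 for scale_size_area()
shared_limits <- c(0, max(unlist(df1[size_cols])))

# Build one plot per frame, all using the same size scale
frames <- lapply(size_cols, function(col) {
  ggplot(df1, aes(x, y, size = .data[[col]])) +
    geom_point() +
    scale_size_area(limits = shared_limits) +
    labs(size = col)
})

frames[[1]]
frames[[2]]
```

Since the limits no longer depend on the data in an individual frame, a value of 5 is drawn at the same size in every plot.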
How about this?
library("ggplot2")
df1 <- data.frame(x = letters[1:5],
y = 1:5)
ggplot(data = df1,
aes(x = x,
y = y,
size = y)) +
geom_point() +
scale_size_area(breaks = seq(1,25,1),
limits = c(1, 25))
I have three dataframes, containing data for the same variables (x and y, grouped by variable case) but each dataframe contains data from a different source (test, sim and model). The levels of case are identical for test and model, but they are different for sim. For each value of case, I want all xy curves from different sources but with the same case to have the same color. I need to have a legend which clearly identifies the data source, but I would also like to use different geoms for different data sources. This is what I've been able to do:
rm(list=ls())
gc()
graphics.off()
library(ggplot2)
# build the dataframes
nx <- 10
x1 <- seq(0, 1, len = nx)
x2 <- x1+ 0.1
x3 <- x2+ 0.1
x4 <- x3+ 0.1
x <- c(x1, x2, x3, x4)
y1 <- 1 - x1
y2 <- 1.1 * y1
y3 <- 1.1 * y2
y4 <- 1.1 * y3
y <- c(y1, y2, y3, y4)
z1 <- (y1 + y2)/2
z2 <- (y2 + y3)/2
z3 <- (y3 + y4)/2
z4 <- (y4 + 1.1 * y4)/2
z <- c(z1, z2, z3, z4)
w <- y*1.01
case_y <- c("I-26_1", "I00", "I20_5", "I40_9")
case_z <- c("I-23_6", "I00", "I22_4", "I42_3")
case_y <- rep(case_y, each = nx)
case_z <- rep(case_z, each = nx)
foo <- data.frame(x = x, y = z, case = case_z, type = "test")
bar <- data.frame(x = x, y = y, case = case_y, type = "sim")
mod <- data.frame(x = x, y = w, case = case_z, type = "model")
# different data frames have different factor levels: to avoid this,
# I bind all dataframes and I reorder the levels of case
foobar <- rbind(foo, bar, mod)
case_levels <- c("I-26_1", "I-23_6", "I00", "I20_5", "I22_4", "I40_9", "I42_3")
foobar$case <- factor(foobar$case, levels = case_levels)
# now I can plot the resulting dataframe
p <- ggplot(data = foobar, aes(x = x, y = y, color = case)) +
geom_line(aes(linetype = type), size = 1)
p
The problem here is that it's difficult to discern sim and model. In order to make a more readable plot, I switch to geom_point for the model data:
foobar <- rbind(foo, bar)
case_levels <- c("I-26_1", "I-23_6", "I00", "I20_5", "I22_4", "I40_9", "I42_3")
foobar$case <- factor(foobar$case, levels = case_levels)
mod$case <- factor(mod$case, levels = case_levels)
# now I can plot the resulting dataframe
p <- ggplot(data = foobar, aes(x = x, y = y, color = case)) +
geom_line(aes(linetype = type), size = 1) +
geom_point(data = mod)
However, now I don't have a model label in the legend. How can I make sure that the model curves are clearly labeled in the legend, but they are also easy to discern visually from the sim and test curves?
EDIT Procrastinatus Maximus suggests an edit to Pierre Lafortune's code which should eliminate the space between the model label and the type legend, but apparently it eliminates the space between model and the case legend instead:
ggplot(data = foobar, aes(x = x, y = y, color = case)) +
geom_line(aes(linetype = type), size = 1) +
geom_point(data = mod, aes(shape=type)) +
scale_shape_discrete(name="") +
guides(colour = guide_legend(override.aes = list(linetype=c(1),
shape=c(NA)))) +
theme(legend.margin = margin(0,0,0,0), legend.spacing = unit(0, 'lines'))
The result is
This will get you closer to your goal. I will look to see if we can close the gap between the two legends.
ggplot(data = foobar, aes(x = x, y = y, color = case)) +
geom_line(aes(linetype = type), size = 1) +
geom_point(data = mod, aes(shape=type)) +
scale_shape_discrete(name="") +
guides(colour = guide_legend(override.aes = list(linetype=c(1),
shape=c(NA))))
Edit
##ProcrastinatusMaximus
ggplot(data = foobar, aes(x = x, y = y, color = case)) +
geom_line(aes(linetype = type), size = 1) +
geom_point(data = mod, aes(shape = type)) +
guides(color = guide_legend(override.aes = list(linetype = c(1), shape = c(NA)), order = 1),
linetype = guide_legend(order = 2),
shape = guide_legend(title = NULL, order = 3))+
theme(legend.margin = margin(0,0,0,0), legend.spacing = unit(0, 'lines'))
Personally, I think all you need to do is change the order of the type, so that the solid line is in the middle. If you make the background white and the line colors a bit brighter, I think your figure is clear:
(p <- ggplot(data = foobar, aes(x = x, y = y, color = case)) +
geom_line(aes(linetype = rev(type)), size = 1) +
scale_color_manual(values = c("black","green","blue","purple","pink","red","brown"))+
theme_bw())
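Another way to control which source gets which line style is to set the factor levels of type explicitly and name the linetypes in a manual scale. The sketch below uses a small made-up data frame (`toy`) rather than foobar, and the particular level order and linetypes are assumptions chosen for illustration.

```r
library(ggplot2)

# Hypothetical miniature data: three sources, three points each
toy <- data.frame(
  x = rep(1:3, times = 3),
  y = c(1, 2, 3, 1.1, 2.1, 3.1, 1.2, 2.2, 3.2),
  type = rep(c("test", "sim", "model"), each = 3)
)

# The level order decides the order of the entries in the legend
toy$type <- factor(toy$type, levels = c("sim", "test", "model"))

p <- ggplot(toy, aes(x, y, linetype = type)) +
  geom_line() +
  # Named values tie each source to a line style regardless of level order
  scale_linetype_manual(values = c(sim = "dashed", test = "solid", model = "dotted")) +
  theme_bw()

p
```

This keeps the mapping between rows and their type intact, so the legend labels always describe the curve they are drawn next to.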