I am creating several plots to use as frames for a GIF. The GIF is supposed to show points growing over time (see plots 1 and 2, where the values increase). Using the size aesthetic is problematic because the scaling is done for each plot individually.
I tried setting breaks in scale_size_area() to provide a sequence of absolute values, so that the points are scaled against all values rather than only the values present in each plot, but without success.
Plot 3 shows how the points should be scaled; this scaling should be achieved in each individual plot.
library(tidyverse)
df1 <- data.frame(x = letters[1:5], y = 1:5, size2 = 21:25)
ggplot(df1, aes(x, y, size = y)) +
geom_point() +
scale_size_area(breaks = seq(0,25,1))
ggplot(df1, aes(x, y, size = size2)) +
geom_point() +
scale_size_area(breaks = seq(0,25,1))
df2 <- data.frame(x = letters[1:5], y = 1:5, size2 = 21:25) %>% gather(key, value, y:size2)
ggplot(df2, aes(x, value, size = value)) +
geom_point() +
scale_size_area(breaks = seq(0,25,1))
Created on 2019-05-12 by the reprex package (v0.2.1)
Pass the lower and upper bounds to the limits argument of scale_size_area():
# Same limits in both plots, so point sizes are comparable across frames
ggplot(df1, aes(x, y, size = y)) +
  geom_point() +
  labs(
    title = "Size mapped to y",
    size = NULL
  ) +
  scale_size_area(limits = c(0, 25))
ggplot(df1, aes(x, y, size = size2)) +
  geom_point() +
  labs(
    title = "Size mapped to size2",
    size = NULL
  ) +
  scale_size_area(limits = c(0, 25))
How about this?
library("ggplot2")
df1 <- data.frame(x = letters[1:5],
y = 1:5)
ggplot(data = df1,
aes(x = x,
y = y,
size = y)) +
geom_point() +
scale_size_area(breaks = seq(1,25,1),
limits = c(1, 25))
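Because the goal is a series of GIF frames, it may help to compute the limits once from every value that will ever be mapped to size and reuse them in each frame. The following is a minimal sketch under that assumption; `size_limits` and `make_frame()` are illustrative names, not part of ggplot2.

```r
library(ggplot2)

df1 <- data.frame(x = letters[1:5], y = 1:5, size2 = 21:25)

# Limits shared by all frames: from 0 up to the largest value in any frame
size_limits <- c(0, max(df1$y, df1$size2))

# Hypothetical helper: builds one frame, mapping the named column to size
make_frame <- function(size_col) {
  ggplot(df1, aes(x, y, size = .data[[size_col]])) +
    geom_point() +
    scale_size_area(limits = size_limits) +
    labs(size = NULL)
}

frames <- lapply(c("y", "size2"), make_frame)
```

Each element of `frames` can then be printed or saved as one frame; since every frame uses the same limits, a value of 5 is drawn at the same size in all of them.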
I have the following graph and code:
ggplot(long2, aes(x = DATA, y = value, fill = variable)) +
  geom_area(position = "fill", alpha = 0.75) +
  scale_y_continuous(labels = scales::comma, n.breaks = 5, breaks = waiver()) +
  scale_fill_viridis_d() +
  scale_x_date(date_labels = "%b/%Y", date_breaks = "6 months") +
  ggtitle("Proportions of visits, 9T and 9C only") +
  xlab("Date") + ylab("% visits") +
  theme_minimal() +
  theme(legend.position = "bottom") +
  guides(fill = guide_legend(title = NULL)) +
  # Semi-transparent white rectangle over the period 2020-03-16 to 2020-06-22
  annotate("rect", fill = "white", alpha = 0.3,
           xmin = as.Date("2020-03-16"), xmax = as.Date("2020-06-22"),
           ymin = 0, ymax = 1)
But the result has a sawtooth pattern. How can I smooth it out?
I believe your situation is roughly analogous to the following, wherein we have missing x-positions for one group, but not the other at the same position. This causes spikes if you set position = "fill".
library(ggplot2)
x <- seq_len(100)
df <- data.frame(
x = c(x[-c(25, 75)], x[-50]),
y = c(cos(x[-c(25, 75)]), sin(x[-50])) + 5,
group = rep(c("A", "B"), c(98, 99))
)
ggplot(df, aes(x, y, fill = group)) +
geom_area(position = "fill")
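To confirm that missing x-positions are the cause, one can list which positions each group lacks. A small base R check on the example data (the name `missing_by_group` is made up for illustration):

```r
x <- seq_len(100)
df <- data.frame(
  x = c(x[-c(25, 75)], x[-50]),
  y = c(cos(x[-c(25, 75)]), sin(x[-50])) + 5,
  group = rep(c("A", "B"), c(98, 99))
)

# All x-positions used by any group
ux <- unique(df$x)

# For each group, the positions that occur elsewhere but not in that group
missing_by_group <- lapply(split(df, df$group), function(d) setdiff(ux, d$x))
missing_by_group
```

Any group with a non-empty entry here has gaps at those positions, which is where the stacked areas can jump.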
To smooth out these spikes, it has been suggested to linearly interpolate the data at the missing positions.
# Find all used x-positions
ux <- unique(df$x)
# Split data by group, interpolate data groupwise
df <- lapply(split(df, df$group), function(xy) {
approxed <- approx(xy$x, xy$y, xout = ux)
data.frame(x = ux, y = approxed$y, group = xy$group[1])
})
# Recombine data
df <- do.call(rbind, df)
# Now without spikes :)
ggplot(df, aes(x, y, fill = group)) +
geom_area(position = "fill")
Created on 2022-06-17 by the reprex package (v2.0.1)
P.S. I would also have expected a red spike at x=50, but for some reason this didn't happen.
I have made the graph below with ggplot. I would like to reduce the distance between the y axis and the first category (a). Which function should I use? Thanks! :)
library(ggplot2)
library(reshape2)
data <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10), group = 1:10)
data <- melt(data, id = "group")
ggplot(data, aes(x = variable, y = value, group = group, color = as.factor(group))) +
  geom_point() +
  geom_line() +
  theme_minimal() +
  theme(axis.line = element_line(), panel.grid = element_blank())
Suppose we have the following plot:
library(ggplot2)
df <- data.frame(x = rep(LETTERS[1:3], 3),
y = rnorm(9),
z = rep(letters[1:3], each = 3))
ggplot(df, aes(x, y, colour = z, group = z)) +
geom_line() +
geom_point()
We can reduce the space between the extreme points and the panel edges by adjusting the expand argument in a scale function:
ggplot(df, aes(x, y, colour = z, group = z)) +
geom_line() +
geom_point() +
scale_x_discrete(expand = c(0,0.1))
Setting expand = c(0,0) completely removes the space. The first argument is a relative number, the second an absolute; so in the example above we set the expand to 0.1 x-axis units.
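If different padding is wanted on each side, the `expansion()` helper (available in ggplot2 3.3.0 and later, as far as I know) builds the expand vector from separate multiplicative and additive parts. A sketch, where the padding values 0.1 and 0.6 and the name `x_expand` are arbitrary choices for illustration:

```r
library(ggplot2)

set.seed(1)  # only to make the random y values reproducible
df <- data.frame(x = rep(LETTERS[1:3], 3),
                 y = rnorm(9),
                 z = rep(letters[1:3], each = 3))

# mult = 0: no relative padding; add = c(left, right) in x-axis units
x_expand <- expansion(mult = 0, add = c(0.1, 0.6))

p <- ggplot(df, aes(x, y, colour = z, group = z)) +
  geom_line() +
  geom_point() +
  scale_x_discrete(expand = x_expand)
p
```

Here the first category sits 0.1 units from the left edge, while 0.6 units of space remain to the right of the last one.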
As you can see in the image, R automatically assigns the values 0, 0.25... 1 to the point-size legend. I was wondering whether I could replace 0, 0.25... 1 with text labels while keeping the actual numerical values from the data.
library(ggplot2)
library(scales)
data(SLC4A1, package="ggplot2")
SLC4A1 <- read.csv(file.choose(), header = TRUE)
# bubble chart showing position of polymorphisms on gene, the frequency of each of these
# polymorphisms, where they are prominent on earth, and p-value
SLC4A1ggplot <- ggplot(SLC4A1, aes(Position, log10(Frequency)))+
geom_jitter(aes(col=Geographical.Location, size =(p.value)))+
labs(subtitle="Frequency of Various Polymorphisms", title="SLC4A1 Gene") +
labs(color = "Geographical Location") +
labs(size = "p-value") + labs(x = "Position of Polymorphism on SLC4A1 Gene") +
scale_size_continuous(range=c(1,4.5), trans = "reverse") +
guides(size = guide_legend(reverse = TRUE))
library(tidyverse)
df <- data.frame(x = 1:5, y = 1:5,z = 1:5)
ggplot(df,aes(x = x, y = y, size = z)) +
geom_point()
ggplot(df,aes(x = x, y = y, size = z)) +
geom_point() +
scale_size_continuous(range = 1:2) # control range of circle size
See more here:
https://ggplot2.tidyverse.org/reference/scale_size.html
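To show text in the size legend while the points are still sized by the numeric values, one option is to pass matching `breaks` and `labels` to the size scale. A minimal sketch with made-up data; the break positions and the label texts ("low", "medium", "high") are assumptions for illustration only:

```r
library(ggplot2)

df <- data.frame(x = 1:5, y = 1:5, z = c(0, 0.25, 0.5, 0.75, 1))

# One label per break; the data values themselves are not changed
size_breaks <- c(0, 0.5, 1)
size_labels <- c("low", "medium", "high")

p <- ggplot(df, aes(x = x, y = y, size = z)) +
  geom_point() +
  scale_size_continuous(
    range  = c(1, 4.5),
    breaks = size_breaks,
    labels = size_labels
  )
p
```

The legend then shows the three text labels at the chosen break values, and `labels` must have the same length as `breaks`.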
I first make a plot
df <- data.frame(x = c(1:40, rep(1:20, 3), 15:40))
p <- ggplot(df, aes(x=x, y = x)) +
stat_density2d(aes(fill='red',alpha=..level..),geom='polygon', show.legend = F)
Then I want to change the geom_density values and use these in another plot.
# build plot
q <- ggplot_build(p)
# Change density
dens <- q$data[[1]]
dens$y <- dens$y - dens$x
Build the other plot using the changed densities, something like this:
# Build another plot
ggplot(df, aes(x=x, y =1)) +
geom_point(alpha = 0.3) +
geom_density2d(dens)
This does not work, however. Is there a way of doing this?
EDIT: doing it when there are multiple groups:
df <- data.frame(x = c(1:40, rep(1:20, 3), 15:40), group = c(rep('A',40), rep('B',60), rep('C',26)))
p <- ggplot(df, aes(x=x, y = x)) +
stat_density2d(aes(fill=group,alpha=..level..),geom='polygon', show.legend = F)
q <- ggplot_build(p)
dens <- q$data[[1]]
dens$y <- dens$y - dens$x
ggplot(df, aes(x=x, y =1)) +
geom_point(aes(col = group), alpha = 0.3) +
geom_polygon(data = dens, aes(x, y, fill = fill, group = piece, alpha = alpha)) +
scale_alpha_identity() +
guides(fill = F, alpha = F)
Results when applied to my own dataset
Although this is exactly what I'm looking for, the fill colors do not seem to correspond to the initial colors (linked to A, B and C):
Like this? It is possible to plot a transformation of the shapes plotted by geom_density. But that's not quite the same as manipulating the underlying density...
ggplot(df, aes(x=x, y =1)) +
geom_point(alpha = 0.3) +
geom_polygon(data = dens, aes(x, y, fill = fill, group = piece, alpha = alpha)) +
scale_alpha_identity() +
guides(fill = "none", alpha = "none")
Edit - OP now has multiple groups. We can plot those with the code below, which produces an artistic plot of questionable utility. It does what you propose, but I would suggest that it would be more fruitful to transform the underlying data and summarize that, if you are looking for representative output.
ggplot(df, aes(x=x, y =1)) +
geom_point(aes(col = group), alpha = 0.3) +
geom_polygon(data = dens, aes(x, y, fill = group, group = piece, alpha = alpha)) +
scale_alpha_identity() +
guides(fill = "none", alpha = "none") +
theme_minimal()
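If the fill colours no longer match groups A, B and C, one possible cause is that the hex codes stored in `dens$fill` are passed through a new fill scale instead of being drawn as-is. Below is a sketch that reuses the computed colours with `scale_fill_identity()`. It assumes ggplot2 3.3.0 or later (for `after_stat()`) and the MASS package that `stat_density_2d()` relies on. Grouping the polygons by the built `group` column rather than `piece` is my own choice, intended to keep contours from different groups apart.

```r
library(ggplot2)

df <- data.frame(
  x = c(1:40, rep(1:20, 3), 15:40),
  group = c(rep("A", 40), rep("B", 60), rep("C", 26))
)

p <- ggplot(df, aes(x = x, y = x)) +
  stat_density_2d(aes(fill = group, alpha = after_stat(level)),
                  geom = "polygon", show.legend = FALSE)

# Built layer data: fill holds hex colours, alpha holds the rescaled levels
dens <- layer_data(p, 1)
dens$y <- dens$y - dens$x

p2 <- ggplot(df, aes(x = x, y = 1)) +
  geom_point(aes(colour = group), alpha = 0.3) +
  geom_polygon(data = dens,
               aes(x, y, fill = fill, alpha = alpha, group = group)) +
  scale_fill_identity() +   # draw the stored hex colours unchanged
  scale_alpha_identity()    # draw the stored alpha values unchanged
p2
```

Since the points use the default discrete colour scale for the same three groups, their colours should line up with the polygon fills taken from the first plot.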
I would like to plot points twice using two different color scales.
In the example here, 5 points are drawn and color is mapped to two covariates (cov1 and cov2), which are on different scales: 1 to 5 and 0.01 to 0.05, respectively.
I wish to have 2 independent color keys, one for cov1 and one for cov2,
a bit like in the graph below. However, for the graph below I used 'color = cov1' and 'fill = cov2' in order to get a second color key...
Any help would be appreciated.
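library(ggplot2)
library(gridExtra)  # provides grid.arrange()
# Assumed example data matching the description above (not given in the question)
df1 <- data.frame(x = 1:5, y = 0, cov1 = 1:5, cov2 = seq(0.01, 0.05, 0.01))
# Both layers mapped to color: a single shared color key
gg1 <- ggplot(data = df1, aes(x = x, y = y)) +
  geom_point(aes(x = x, y = y - 1, color = cov1)) +
  geom_point(aes(x = x, y = y + 1, color = cov2)) +
  scale_y_continuous(limits = c(-3, 3))
# Workaround: color for cov1, fill for cov2 (shape 21 supports fill)
gg2 <- ggplot(data = df1, aes(x = x, y = y)) +
  geom_point(aes(x = x, y = y - 1, color = cov1)) +
  geom_point(aes(x = x, y = y + 1, fill = cov2), pch = 21) +
  scale_y_continuous(limits = c(-3, 3))
grid.arrange(gg1, gg2, ncol = 2)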
In basic ggplot2 it is impossible if I remember correctly. But this repository may be your answer:
https://github.com/eliocamp/ggnewscale
or this (mentioned in description of the previous one):
https://github.com/clauswilke/relayer
I haven't been using ggplot2 for quite a long time so I'm not familiar with these two, but I remember that I used one of them at least once.
I just wrote a quick example to check whether it works:
d1 <- data.frame(x=1:5, y=1)
d2 <- data.frame(x=1:5, y=2)
library(ggplot2)
library(ggnewscale)
ggplot() +
geom_point(data = d1, aes(x=x, y=y, color = x)) +
scale_color_continuous(low = "#0000aa", high="#ffffff") +
new_scale_color() +
geom_point(data = d2, aes(x=x, y=y, color = x)) +
scale_color_continuous(low = "#aa0000", high="#00aa00")
And it seems to work as you want.
I used your idea of combining col and fill, plus a small hack that uses different shapes for cov1 and cov2:
# sample data
my_data <- data.frame(x = 1:5,
cov1 = 1:5,
cov2 = seq(0.01, 0.05, 0.01))
library(ggplot2)
ggplot() +
geom_point(data = my_data, aes(x = x, y = 0.5, col = cov1), shape = 16) +
scale_color_continuous(low = "red1", high = "red4") +
geom_point(data = my_data, aes(x = x, y = -0.5, fill = cov2), shape = 21, col = "white", size = 2) +
ylim(-1, 1)
Hope it helps.