I am using geom_ellipse to create an ellipse on faithful dataset.
Here is the code:
t <- position_nudge(x=-1,y=-0.5)
obj <- ggplot(faithful, aes(waiting, eruptions))+
geom_point()+ geom_ellipse(aes(x0 = 70, y0 = 3, a= 3, b = 10,angle = pi/3),color="red",
position = t)
The graph looks like this:
I want to extract the points that fall inside the ellipse. How can I do that? I tried what was done here, but it does not work for geom_ellipse.
With geom_ellipse, when you inspect the layer data with ggplot_build, you can see that the ellipse is repeated many times, because it is drawn once for every row of the data. So you can take just the first copy of the values, here the first 272 values (the same number as rows in your data). Based on the linked answer, you can do the following:
library(ggplot2)
library(ggforce)
library(sp)
t <- position_nudge(x=-1,y=-0.5)
obj <- ggplot(faithful, aes(waiting, eruptions))+
geom_point()+ geom_ellipse(aes(x0 = 70, y0 = 3, a= 3, b = 10,angle = pi/3),color="red",
position = t)
obj
#> Warning: Using the `size` aesthetic in this geom was deprecated in ggplot2 3.4.0.
#> ℹ Please use `linewidth` in the `default_aes` field and elsewhere instead.
# Extract components
build <- ggplot_build(obj)$data
points <- build[[1]]
ell <- build[[2]]
# Find which points are inside the ellipse, and add this to the data
dat <- data.frame(
points[1:2],
in.ell = as.logical(point.in.polygon(points$x, points$y, ell$x[1:272], ell$y[1:272]))
)
# Plot the result
ggplot(dat, aes(x, y)) +
geom_point(aes(col = in.ell)) +
geom_ellipse(aes(x0 = 70, y0 = 3, a= 3, b = 10,angle = pi/3),color="red",
position = t)
Created on 2022-12-17 with reprex v2.0.2
When you check the dat dataframe, you can see which points are in the ellipse.
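As an alternative to reading the polygon back out of the built plot, membership can also be tested directly from the ellipse equation. This is only a sketch: `in_ellipse` is a hypothetical helper (not part of ggforce), and it assumes the usual parameterisation in which `a` and `b` are the semi-axes, the ellipse is rotated counter-clockwise by `angle` about (`x0`, `y0`), and `position_nudge` simply shifts the centre.

```r
# Hypothetical helper: TRUE for points inside (or on) a rotated ellipse
in_ellipse <- function(x, y, x0, y0, a, b, angle) {
  dx <- x - x0
  dy <- y - y0
  # Rotate by -angle so the ellipse axes line up with the coordinate axes
  u <-  dx * cos(angle) + dy * sin(angle)
  v <- -dx * sin(angle) + dy * cos(angle)
  (u / a)^2 + (v / b)^2 <= 1
}

# Centre of the ellipse after position_nudge(x = -1, y = -0.5)
inside <- in_ellipse(faithful$waiting, faithful$eruptions,
                     x0 = 70 - 1, y0 = 3 - 0.5,
                     a = 3, b = 10, angle = pi / 3)

# Rows of the original data that lie within the ellipse
faithful_in <- faithful[inside, ]
```

Because this works on the original columns, it does not depend on how many vertices the drawn ellipse has.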
Below is a minimal working example.
library(ggplot2)
set.seed(926)
df <- data.frame(x. = rnorm(100),
y. = rnorm(100),
color. = rnorm(100))
p <- ggplot(df, aes(x = x., y = y., color = color.)) +
geom_point() +
viridis::scale_color_viridis(option = "C")
p
p_build <- ggplot_build(p)
# The desired vector is below, but I feel there must be an easier way to get it
p_build[["data"]][[1]][["colour"]]
df$color_converted <- p_build[["data"]][[1]][["colour"]]
Specifically, I would like to use the viridis::viridis(option = "C") color scheme. Could anyone help with this? Thanks.
*Modify*
Sorry, my question wasn't clear enough. Let me put it this way, I couldn't utilize ggplot2 package and had to use the pure plot() function that comes with R, in my specific project.
My goal is to try to reproduce the above plot with the base R package.
plot(df$x., df$y., col = df$color_converted)
If possible, could anyone also direct me on how to customize a gradient legend that is similar to ggplot2, with base legend()?
First of all, you can assign the colors to a vector called "color2" and use scale_colour_gradientn to apply these colors to your plot. The problem is that the colors come out in data order rather than gradient order, so you have to sort them first. The code below orders them by lightness (the L channel in Lab color space); the TSP package is loaded, but this ordering does not call it. In the output below you can see that you can recreate the plot without using scale_color_viridis:
set.seed(926)
df <- data.frame(x. = rnorm(100),
y. = rnorm(100),
color. = rnorm(100))
library(ggplot2)
library(TSP)
p <- ggplot(df, aes(x = x., y = y., color = color.)) +
geom_point() +
viridis::scale_color_viridis(option = "C")
p
p_build <- ggplot_build(p)
# The desired vector is below, but I feel there must be an easier way to get it
color2 <- p_build[["data"]][[1]][["colour"]]
rgb <- col2rgb(color2)
# col2rgb() returns values in 0-255, so tell convertColor() about the scale
lab <- convertColor(t(rgb), 'sRGB', 'Lab', scale.in = 255)
ordered_cols2 <- color2[order(lab[, 'L'])]
ggplot(df, aes(x = x., y = y.)) +
geom_point(aes(colour = color.)) +
scale_colour_gradientn(colours = ordered_cols2, guide = "colourbar")
#viridis::scale_color_viridis(option = "C")
Created on 2022-08-17 with reprex v2.0.2
Base r
You can use the following code:
color2 <- p_build[["data"]][[1]][["colour"]]
rgb <- col2rgb(color2)
# col2rgb() returns values in 0-255, so tell convertColor() about the scale
lab <- convertColor(t(rgb), 'sRGB', 'Lab', scale.in = 255)
ordered_cols2 <- color2[order(lab[, 'L'])]
layout(matrix(1:2, ncol = 2), widths = c(2, 1), heights = c(1, 1))
plot(df$x., df$y., col = df$color_converted)
# A raster is drawn top row first, so reverse the colors to put the lightest (highest value) at the top
legend_image <- as.raster(matrix(rev(ordered_cols2), ncol = 1))
plot(c(0, 2), c(0, 1), type = 'n', axes = FALSE, xlab = '', ylab = '', main = 'legend title')
text(x = 1.5, y = seq(0, 1, l = 5), labels = seq(-3, 3, l = 5))
rasterImage(legend_image, 0, 0, 1, 1)
Output:
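If the goal is to avoid ggplot2 altogether, the colors can also be computed in base R by rescaling the values onto a palette. The following is a rough sketch, not the method used above: `value_to_colour` is a hypothetical helper, and `hcl.colors(n, "Plasma")` (available in R 3.6 and later) is only an approximation of the viridis option "C" palette.

```r
# Hypothetical helper: map numeric values onto a palette by linear rescaling
value_to_colour <- function(values, palette) {
  # Rescale to [0, 1], then to palette indices 1..length(palette)
  scaled <- (values - min(values)) / (max(values) - min(values))
  palette[1 + floor(scaled * (length(palette) - 1))]
}

set.seed(926)
df <- data.frame(x. = rnorm(100),
                 y. = rnorm(100),
                 color. = rnorm(100))

pal <- hcl.colors(256, "Plasma")   # approximation of viridis option "C"
df$color_base <- value_to_colour(df$color., pal)

# Usage: plot(df$x., df$y., col = df$color_base, pch = 19)
```

The same `pal` vector can be passed to `as.raster()` to draw the gradient legend, which keeps the points and the legend on one consistent scale.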
I am trying to combine facet strips across two adjacent panels (there is always two adjacent ones with the same first ID variable, but with two different scenarios, let's call them "A" and "B"). I am not particularly wedded to the gtable + grid solution I tried, but sadly I cannot use the facet_nested() from the ggh4x package (I cannot install it on my company's server due to various restrictions that are in place and needed dependencies - I looked at using only the relevant code, but that again is not easy due to the dependencies).
A minimum viable example of the basic plot I want to make easier to read by indicating which panels "belong together" by combining the top facet strips looks like this:
library(tidyverse)
library(gtable)
library(grid)
idx = 1:16
p1 = expand_grid(id=idx, id2=c("A", "B"), x=1:10) %>%
mutate(y=rnorm(n=n())) %>%
ggplot(aes(x=x,y=y)) +
geom_jitter() +
facet_wrap(~id + id2, nrow = 4, ncol=8)
The strips with the "1"s, the ones with the "2"s etc. should be combined (in reality it's a somewhat longer text, but this is just for illustration). I tried to adapt an answer for a similar scenario (https://stackoverflow.com/a/40316170/7744356 - thank you @markus for finding it again); my attempt is below. As you can see, the height of what I produce seems wrong. I assume this must be some trivial thing I am overlooking/not understanding.
# Combine strips for a ID
g <- ggplot_gtable(ggplot_build(p1))
strip <- gtable_filter(g, "strip-t", trim = FALSE)
stript <- which(grepl('strip-t', g$layout$name))
stript2 = stript[idx*2-1]
top <- strip$layout$t[idx*2-1]
# # Using the $b below instead of b = top[i]+1, also seems not to work
#bot <- strip$layout$b[idx*2-1]
l <- strip$layout$l[idx*2-1]
r <- strip$layout$r[idx*2]
mat <- matrix(vector("list",
length = length(idx)*3),
nrow = length(idx))
mat[] <- list(zeroGrob())
res <- gtable_matrix("toprow", mat,
unit(c(1, 0, 1), "null"),
unit( rep(1, length(idx)),
"null"))
for (i in 1:length(stript2)){
if (i==1){
zz <- res %>%
gtable_add_grob(g$grobs[[stript2[i]]]$grobs[[1]], 1, 1, 1, 3) %>%
gtable_add_grob(g, .,
t = top[i],
l = l[i],
b = top[i]+1,
r = r[i],
name = c("add-strip"))
} else {
zz <- res %>%
gtable_add_grob(g$grobs[[stript2[i]]]$grobs[[1]], 1, 1, 1, 3) %>%
gtable_add_grob(zz, .,
t = top[i],
l = l[i],
b = top[i]+1,
r = r[i],
name = c("add-strip"))
}
}
grid::grid.draw(zz)
------------ Update with a ggh4x implementation -----------------
This may solve this type of problem for many, but it has its downsides (e.g. axes alignment across rows gets a bit manual, you probably need to manually remove x-axes and ensure the limits are the same, add a unified y-axis label, it requires installation of a package from github: devtools::install_github("teunbrand/ggh4x@v0.1") for a specific version, plus cowplot interacts badly with e.g. ggtern). So I'd love it if someone still managed to do a pure gtable + grid version.
library(tidyverse)
library(ggh4x)
library(cowplot)
plots = expand_grid(id=idx, id2=c("A", "B"), x=1:10) %>%
mutate(y=rnorm(n=n()),
plotrow=(id-1)%/%4+1) %>%
group_by(plotrow) %>%
group_map( ~ ggplot(data=.,
aes(x=x,y=y)) +
geom_jitter() +
facet_nested( ~ id + id2))
plot_grid(plotlist = plots, nrow = 4, ncol=1)
I'm a bit late to this game, but ggh4x now has a facet_nested_wrap() implementation that should greatly simplify this problem (disclaimer: I wrote ggh4x).
library(tidyverse)
library(ggh4x)
idx = 1:16
p1 = expand_grid(id=idx, id2=c("A", "B"), x=1:10) %>%
mutate(y=rnorm(n=n())) %>%
ggplot(aes(x=x,y=y)) +
geom_jitter() +
facet_nested_wrap(~id + id2, nrow = 4, ncol=8)
p1
Created on 2020-08-12 by the reprex package (v0.3.0)
Keep in mind that there might still be a few bugs in this. Also, I'm aware that this doesn't help the OP because his package versions are constrained, but I thought I'd mention it here anyway.
Here's a reprex of a somewhat pedestrian way to do it in grid. I have made the "parent" facet somewhat darker to emphasise the nesting, but if you prefer the color to match just change the rectGrob fill color to "gray85".
# Set up plot as per example
library(tidyverse)
library(gtable)
library(grid)
idx = 1:16
p1 = expand_grid(id=idx, id2=c("A", "B"), x=1:10) %>%
mutate(y=rnorm(n=n())) %>%
ggplot(aes(x=x,y=y)) +
geom_jitter() +
facet_wrap(~id + id2, nrow = 4, ncol=8)
g <- ggplot_gtable(ggplot_build(p1))
# Code to produce facet strips
stript <- grep("strip", g$layout$name)
grid_cols <- sort(unique(g$layout[stript,]$l))
t_vals <- rep(sort(unique(g$layout[stript,]$t)), each = length(grid_cols)/2)
l_vals <- rep(grid_cols[seq_along(grid_cols) %% 2 == 1], length = length(t_vals))
r_vals <- rep(grid_cols[seq_along(grid_cols) %% 2 == 0], length = length(t_vals))
labs <- levels(as.factor(p1$data$id))
for(i in seq_along(labs))
{
filler <- rectGrob(y = 0.7, height = 0.6, gp = gpar(fill = "gray80", col = NA))
tg <- textGrob(label = labs[i], y = 0.75, gp = gpar(cex = 0.8))
g <- gtable_add_grob(g, filler, t = t_vals[i], l = l_vals[i], r = r_vals[i],
name = paste0("filler", i))
g <- gtable_add_grob(g, tg, t = t_vals[i], l = l_vals[i], r = r_vals[i],
name = paste0("textlab", i))
}
grid.newpage()
grid.draw(g)
And to demonstrate changing the rectGrob to 50% height and "gray85":
Or if you wanted you could assign a different fill for each cycle of the loop:
Obviously the above method might take a few tweaks to fit other plots with different numbers of levels etc.
Created on 2020-07-04 by the reprex package (v0.3.0)
This may not solve the issue directly, but I would like to post it because it presents the results in a different plot while keeping the same structure. You will have to define the number of columns for the plot (here ncol = 4). This code uses the patchwork package. I hope it is useful.
library(tidyverse)
library(gtable)
library(grid)
library(patchwork)
idx = 1:16
#Data
p1 = expand_grid(id=idx, id2=c("A", "B"), x=1:10) %>%
mutate(y=rnorm(n=n()))
#Split data
List <- split(p1,p1$id)
#Sketch function
myplot <- function(x)
{
d <- ggplot(x,aes(x=x,y=y)) +
geom_jitter() +
facet_wrap(~id2, nrow = 1, ncol=2)+
ggtitle(unique(x$id))+
theme(plot.title = element_text(hjust = 0.5))
return(d)
}
#List of plots
Lplots <- lapply(List,myplot)
#Combine plots
#wrap_plots() takes the list of plots directly, so no string needs to be built and parsed
Plot <- wrap_plots(Lplots, ncol = 4)+
plot_annotation(title = 'A nice plot')&theme(plot.title = element_text(hjust=0.5))
#Display
Plot
You will end up with a plot like this:
This is a followup post from here
and here
I have successfully implemented the split violin in ggplot2 for my data (two median estimator densities, for two cases) that need to be compared. Now I would like to add confidence intervals. I am following the code posted in the links above:
EDIT: A reproducible example
tmp <- rnorm(1000,0,1)
tmp.2 <- rnorm(1000,0,1)
x.1 <- density(tmp)
y.1 <- density(tmp.2)
Here, I am making the densities and extracting the (x, y) pairs. Then I am getting the quantiles back:
# Make densities
densities <- as.data.frame(c(x.1$x,y.1$x))
colnames(densities) <- "loc"
densities$dens <- c(x.1$y,y.1$y)
densities$drop_case <- c(rep("B",512),rep("S",512))
densities$dens <- ifelse(densities$drop_case=="B",densities$dens*-1,densities$dens)
densities$dens <- ifelse(densities$drop_case=="S",densities$dens*1,densities$dens)
conf <- as.data.frame(c(quantile(tmp,c(0.025,0.975))[1],quantile(tmp,c(0.025,0.975))[2],quantile(tmp.2,c(0.025,0.975))[1],quantile(tmp.2,c(0.025,0.975))[2]))
colnames(conf) <- "intervals"
conf$drop_case <- c(rep("B",2),rep("S",2))
conf$length <- rep(1000,4)
Now I am trying to extract the values inside the densities, as was noted in the linked posts:
# Find data points in densities
val.tmp <- rep(0,4)
val.tmp.2 <- rep(0,4)
for (i in 1:4) {
x.here <- densities$loc
y.here <- densities$dens
your.number<- conf$intervals[i]
pos.tmp <- which(abs(x.here-your.number)==min(abs(x.here-your.number)))
val.tmp[i] <- x.here[pos.tmp]
val.tmp.2[i] <- y.here[pos.tmp]
}
conf$positions <- val.tmp
conf$length <- val.tmp.2
conf$length <- ifelse(conf$drop_case=="B",conf$length*-1,conf$length)
conf$length <- ifelse(conf$drop_case=="S",conf$length*1,conf$length)
ggplot(densities,aes(dens, loc, fill = factor(drop_case)))+
geom_polygon()+
scale_x_continuous(breaks = 0, name = info$Name)+
ylab('Estimator Density') +
theme(axis.title.x = element_blank())+
geom_point(data = conf, aes(x = positions, y = length, fill = factor(drop_case), group = factor(drop_case))
,shape = 21, colour = "black", show.legend = FALSE)
Unfortunately, I then face the following problem: the points are not mapped onto the densities but end up elsewhere on the plane.
There are a few small mistakes in the code. Firstly, within that for loop, you can't set x.here and y.here to all of the density and location values, since that includes both groups. Secondly, since the signs are already changed in densities, there is no need to use those ifelse statements afterwards. Thirdly, you would only need the top ifelse anyway, since the bottom one does absolutely nothing. Finally, you had the x and y mappings in geom_point the wrong way around!
There is a bunch of other things one could change to make the code more understandable and pretty, but I'm on limited time, so I'll leave those for what they are.
Below the full adjusted code:
library(ggplot2)
tmp <- rnorm(1000,0,1)
tmp.2 <- rnorm(1000,0,1)
x.1 <- density(tmp)
y.1 <- density(tmp.2)
# Make densities
densities <- as.data.frame(c(x.1$x,y.1$x))
colnames(densities) <- "loc"
densities$dens <- c(x.1$y,y.1$y)
densities$drop_case <- c(rep("B",512),rep("S",512))
densities$dens <- ifelse(densities$drop_case=="B",densities$dens*-1,densities$dens)
conf <- as.data.frame(c(quantile(tmp,c(0.025,0.975)), quantile(tmp.2,c(0.025,0.975))))
colnames(conf) <- "intervals"
conf$drop_case <- c(rep("B",2),rep("S",2))
conf$length <- rep(1000,4)
val.tmp <- rep(0,4)
val.tmp.2 <- rep(0,4)
for (i in 1:4) {
x.here <- densities$loc[densities$drop_case == conf$drop_case[i]]
y.here <- densities$dens[densities$drop_case == conf$drop_case[i]]
your.number<- conf$intervals[i]
pos.tmp <- which(abs(x.here-your.number)==min(abs(x.here-your.number)))
val.tmp[i] <- x.here[pos.tmp]
val.tmp.2[i] <- y.here[pos.tmp]
}
conf$positions <- val.tmp
conf$length <- val.tmp.2
ggplot(densities, aes(dens, loc, fill = drop_case)) +
geom_polygon()+
ylab('Estimator Density') +
theme(axis.title.x = element_blank())+
geom_point(data = conf, aes(x = length, y = positions, fill = drop_case),
shape = 21, colour = "black", show.legend = FALSE)
This results in:
I would personally prefer a plot with line segments:
ggplot(densities, aes(dens, loc, fill = factor(drop_case)))+
geom_polygon()+
ylab('Estimator Density') +
theme(axis.title.x = element_blank())+
geom_segment(data = conf, aes(x = length, xend = 0, y = positions, yend = positions))
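The nearest-grid-point lookup done inside the for loop can also be written more compactly. The sketch below handles a single group with `which.min()`; the seed and the name `ci_points` are illustrative assumptions, and the same lookup would be repeated for the second group with its own density.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
tmp <- rnorm(1000, 0, 1)
d <- density(tmp)                # 512 grid points by default
q <- quantile(tmp, c(0.025, 0.975))

# For each quantile, find the index of the closest grid point of this group's density
pos <- vapply(q, function(v) which.min(abs(d$x - v)), integer(1))

# Location and density height at those grid points
ci_points <- data.frame(interval = unname(q),
                        positions = d$x[pos],
                        length = d$y[pos])
```

Searching only within one group's density is what keeps a quantile of group "B" from being matched to a grid point of group "S".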
In short, I am looking for help with creating a plot in which a specific area of the plot is designated to a certain randomly assigned observation (who possesses a certain factor designated by color). I'd prefer ggplot2 or d3.
I was in a discussion today and was trying to demonstrate the point that even if we randomly chose an observation, we would probably pick an observation of interest. In other words, if every observation was randomly assigned to a certain part of a dartboard and I threw a dart, chances are that my dart would hit an observation that would be of interest.
Now, I'm literally trying to create this dartboard.
I've been messing around with ggplot2 and I've come close to visualizations that I want but not quite.
First, I tried using coord_polar. I randomly generated coordinates for each observation
df$Coord1 <- sample(50, size = nrow(df), replace = TRUE)
df$Coord2 <- sample(50, size = nrow(df), replace = TRUE)
and then plotted them. The problem here, though, is that not all of the area of the plot was covered. (There's also the issue that some points actually overlapped... so if someone also knows how to generate coordinates that wouldn't ever overlap, that would be nice.) If I were to throw a dart, I might not hit an observation. Here's the code I used:
dartboard <- ggplot(df, aes(Coord1, Coord2, fill = Classification)) +
geom_tile()+coord_polar()
So, then I tried my hand with a pie chart.
pie <- ggplot(df, aes(x = factor(1), fill = factor(Classification))) +
geom_bar(width = 1) + coord_polar()
which was nice because it was a whole circle, but it grouped the classifications together when I want them randomly scattered across the plot.
I also tried replicating this heat map creation (http://www.r-bloggers.com/controlling-heatmap-colors-with-ggplot2/) but I wasn't quite able to figure out how to make it fit correctly with my data.
In short, I am looking for help with creating a plot in which a specific area of the plot is designated to a certain observation who possesses a certain factor.
Any ideas?
Update 1:
This code is what I'm looking for visually at the conceptual level (every part of the chart is covered by an observation):
df <- expand.grid(x = 1:20, y = 1:20)
samples <- c("one", "two", "three", "four", "five")
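# sample() picks uniformly from all five labels; indexing with runif() truncates and would never pick the last one
df$series <- sample(samples, nrow(df), replace = TRUE)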
g <- ggplot(df, aes(fill=series, xmin = x, ymin = y, xmax = x+1, ymax = y+1))
g <- g + geom_rect()
g <- g + coord_polar(theta="y")
g <- g + theme(panel.grid=element_blank())
g <- g + theme(axis.text=element_blank())
g <- g + theme(axis.ticks=element_blank())
g
but now I'm trying to figure out how to apply this to my own data set which has about 1,500 observations. The coordinates in that picture are used with expand.grid, so now I'm trying to figure out how to assign coordinates to my own 1,500 observations.
Update 2:
I have relative success with the code below.
random <- expand.grid(x = 1:40, y = 1:39)
random <- random %>%
mutate(ID = 1,
ID = cumsum(ID))
plot.data <- full_join(data, random, by = "ID")
samples <- c("UI", "IN", "OW", "BE" , "Five")
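# sample() picks uniformly from all five labels; indexing with runif() truncates and would never pick the last one
plot.data$Classification <- sample(samples, nrow(plot.data), replace = TRUE)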
g <- ggplot(plot.data, aes(fill=Classification, xmin = x, ymin = y, xmax = x+1, ymax = y+1))
g <- g + geom_rect()
g <- g + coord_polar(theta="y")
g <- g + theme(panel.grid=element_blank())
g <- g + theme(axis.text=element_blank())
g <- g + theme(axis.ticks=element_blank())
g <- g + scale_fill_manual(values=c("dark green", "orange" , "yellow" , "red" , "green"))
g
I used the expand.grid function to assign coordinates to kids. Some observations got left out because they couldn't be given coordinates, but that's okay.
My only concern/complaint now is that some observations are larger (and thus easier to hit) than others, so I might examine that heat map route. Time will tell. Thank you very much for the help so far.
Update 3:
Another version (and probably final form):
This way, each observation is roughly the same size.
The base of this heatmap version can be found here: http://www.r-bloggers.com/controlling-heatmap-colors-with-ggplot2/ but here's my own code
ggplot(plot.data, aes(x = x, y = y, fill = factor(Classification))) +
geom_tile(color = "black") +
scale_fill_manual(values=c("dark green", "orange" , "yellow" , "red" , "green")) +
theme(legend.position="none") +
theme_change
How's this?
df <- expand.grid(x = 1:20, y = 1:20)
samples <- c("one", "two", "three", "four", "five")
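# sample() picks uniformly from all five labels; indexing with runif() truncates and would never pick the last one
df$series <- sample(samples, nrow(df), replace = TRUE)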
g <- ggplot(df, aes(fill=series, xmin = x, ymin = y, xmax = x+1, ymax = y+1))
g <- g + geom_rect()
g <- g + coord_polar(theta="y")
g <- g + theme(panel.grid=element_blank())
g <- g + theme(axis.text=element_blank())
g <- g + theme(axis.ticks=element_blank())
g
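To apply this to a real data set, each observation needs its own grid cell. A minimal sketch, assuming about 1,500 observations: the `obs` data frame, its `ID` column and the seed are placeholders rather than the asker's actual data, and the classification labels are randomly generated for illustration.

```r
set.seed(42)                                # arbitrary seed
n <- 1500                                   # assumed number of observations
side <- ceiling(sqrt(n))                    # smallest square grid that fits them all

grid <- expand.grid(x = seq_len(side), y = seq_len(side))

# Shuffle the cells and keep one per observation, so no two observations overlap
cells <- grid[sample(nrow(grid), n), ]
rownames(cells) <- NULL

# Placeholder observations with a randomly assigned classification
obs <- data.frame(ID = seq_len(n),
                  Classification = sample(c("UI", "IN", "OW", "BE", "Five"),
                                          n, replace = TRUE))
plot_data <- cbind(obs, cells)
```

Since the cells are drawn without replacement, every observation gets a unique (x, y) pair; the few leftover cells of the grid simply stay empty. `plot_data` can then be passed to the geom_rect or geom_tile code in place of `df`.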