I'm struggling to overlay a rotated density plot on the original scatterplot. Here are the two plots I have:
require(ggplot2); set.seed(1);
df1 <- data.frame(ID=paste0('ID',1:1000), value=rnorm(1000,500,100))
p1 <- ggplot(data = df1, aes(x=reorder(ID, value), y=value)) +
geom_point(size=2, alpha = 0.7)+
coord_trans(y="log10")
p2 <- ggplot(data = df1, aes(x=value)) +
coord_trans(x="log10") +
geom_density() +
coord_flip()
p1
p2
First, there's a small problem with the density plot: its vertical axis is not log10-transformed. But the main issue is that I can't work out how to draw it on the previous plot while keeping the coordinates correct.
Because you are using coord_flip on your second plot, you are effectively trying to plot two different values onto the same x axis (density and ID). There are plenty of posts discouraging this. Here's one example: How do I plot points with two different y-axis ranges on the same panel in the same X axis?.
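On the side issue of the untransformed axis: a ggplot holds only one coordinate system, so the later coord_flip() replaces coord_trans(x="log10") rather than combining with it. Below is a minimal sketch of one way around that. It assumes ggplot2 3.3.0 or later (where geom_density() accepts a y aesthetic), and it assumes that estimating the density on the log10 scale is acceptable, which is not the same thing as what coord_trans() does. The name p2_log is made up for illustration.

```r
library(ggplot2)
set.seed(1)
df1 <- data.frame(ID = paste0("ID", 1:1000), value = rnorm(1000, 500, 100))

# Map value to y so the density is drawn sideways without coord_flip(),
# and log-transform with a scale instead of a coord.
# Assumption: scale_y_log10() transforms the data before the density is
# estimated, so the curve is a density of log10(value).
p2_log <- ggplot(df1, aes(y = value)) +
  geom_density() +
  scale_y_log10()
p2_log
```

Giving p1 a scale_y_log10() as well would put both plots on the same kind of value axis, which helps if they are later placed side by side.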
I have a data frame with three continuous variables (x,y,z). I want a column plot in which x defines the x-axis position of the columns, y defines the length of the columns, and the column colors (function of y) are defined by z. The test code below shows the set up.
require(ggplot2)
require(viridis)
# Create a dummy data frame
x <- c(rep(0.0, 5),rep(0.5,10),rep(1.0,15))
y <- c(seq(0.0,-5,length.out=5),
seq(0.0,-10,length.out=10),
seq(0.0,-15,length.out=15))
z <- c(seq(10,0,length.out=5),
seq(8,0,length.out=10),
seq(6,0,length.out=15))
df <- data.frame(x=x, y=y, z=z)
pbase <- ggplot(df, aes(x=x, y=y, fill=z))
ptest <- pbase + geom_col(width=0.5, position="identity") +
scale_fill_viridis(option="turbo",
limits = c(0,10),
breaks=seq(0,10,2.5),
labels=c("0","2.5","5.0","7.5","10.0"))
print(ptest)
The legend has the correct colors but the columns do not. Perhaps this is not the correct way to do this type of plot. I tried using geom_bar(), which creates bars with the correct colors, but the y-values are incorrect.
It looks like you have 3 X values that each appear 5, 10, or 15 times. Do you want the bars to be overlaid on top of one another, as they are now? If you add an alpha = 0.5 to the geom_col call you'll see the overlapping bars.
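A minimal sketch of that suggestion, rebuilding the df from the question; the only change to the plot is alpha = 0.5, and the name p_alpha is made up here:

```r
library(ggplot2)

# Same dummy data as in the question
df <- data.frame(
  x = c(rep(0.0, 5), rep(0.5, 10), rep(1.0, 15)),
  y = c(seq(0, -5, length.out = 5),
        seq(0, -10, length.out = 10),
        seq(0, -15, length.out = 15)),
  z = c(seq(10, 0, length.out = 5),
        seq(8, 0, length.out = 10),
        seq(6, 0, length.out = 15))
)

# Semi-transparent bars make the overlaid columns visible through each other
p_alpha <- ggplot(df, aes(x = x, y = y, fill = z)) +
  geom_col(width = 0.5, position = "identity", alpha = 0.5) +
  scale_fill_viridis_c(option = "turbo", limits = c(0, 10))
p_alpha
```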
Alternatively, you might use dodging to show the bars next to one another instead of on top of one another.
ggplot(df, aes(x=x, y=y, fill=z, group = z)) +
geom_col(width=0.5, position=position_dodge()) +
scale_fill_viridis_c(option="turbo", # scale_fill_viridis_c() was added in ggplot2 3.0.0 (2018)
limits = c(0,10),
breaks=seq(0,10,2.5),
labels=c("0","2.5","5.0","7.5","10.0"))
Or you might plot the data in order of y, so that the smaller bars are drawn on top and stay visible:
ggplot(dplyr::arrange(df,y), aes(x=x, y=y, fill=z))+
geom_col(width=0.5, position="identity") +
scale_fill_viridis_c(option="turbo",
limits = c(0,10),
breaks=seq(0,10,2.5),
labels=c("0","2.5","5.0","7.5","10.0"))
I solved this by using geom_tile() in place of geom_col().
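The post does not show the geom_tile() call itself. A rough sketch of what it might look like follows; the per-column tile height (the helper column h) and the name p_tile are my own assumptions, not the poster's code.

```r
library(ggplot2)

# Same dummy data as in the question
df <- data.frame(
  x = c(rep(0.0, 5), rep(0.5, 10), rep(1.0, 15)),
  y = c(seq(0, -5, length.out = 5),
        seq(0, -10, length.out = 10),
        seq(0, -15, length.out = 15)),
  z = c(seq(10, 0, length.out = 5),
        seq(8, 0, length.out = 10),
        seq(6, 0, length.out = 15))
)

# Assumed helper: spacing between consecutive y values within each x column,
# used as the tile height so tiles in a column touch without overlapping
df$h <- ave(df$y, df$x, FUN = function(v) abs(mean(diff(v))))

# One tile per row, coloured by z
p_tile <- ggplot(df, aes(x = x, y = y, fill = z)) +
  geom_tile(aes(height = h), width = 0.5) +
  scale_fill_viridis_c(option = "turbo", limits = c(0, 10))
p_tile
```

Each tile is centred on its y value, so a column extends half a tile above 0 and half a tile below its last value.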
I am trying to make a scatter plot with ggplot to show time spent watching TV on the x axis and immigrant sentiment on the y axis.
The code I am using is
ggplot(totalTV,
aes(x = dfnew.TV.watching..total.time.on.average.weekday,
y = dfnew.Immigrant.Sentiment)) +
geom_point()
I am getting this output
My table looks like this, with the first variable being character and the following two numeric:
Any idea on how to produce a representative scatter of the outcome?
Cheers
Here are some examples using the mtcars dataset.
library(ggplot2)
# Original
ggplot(mtcars,aes(factor(cyl),mpg)) +
geom_point()
# Jitter
ggplot(mtcars,aes(factor(cyl),mpg)) +
geom_jitter(width = .2) # Control spread with width
# Violin plot
ggplot(mtcars,aes(factor(cyl),mpg)) +
geom_violin()
# Boxplot
ggplot(mtcars,aes(factor(cyl),mpg)) +
geom_boxplot()
# Remember that different geoms can be combined
ggplot(mtcars,aes(factor(cyl),mpg)) +
geom_violin() +
geom_jitter(width = .2)
# Or something more exotic ala Raincloud-plots
# https://micahallen.org/2018/03/15/introducing-raincloud-plots/
I am plotting a series of points that are grouped by two factors. I would like to add lines within one group across the other and within the x value (across the position-dodge distance) to visually highlight trends within the data.
geom_line(), geom_segment(), and geom_path() all seem to plot only to the actual x value rather than the position-dodge place of the data points. Is there a way to add a line connecting points within the x value?
Here is a structurally analogous sample:
# Create a sample data set
d <- data.frame(expand.grid(x=letters[1:3],
g1=factor(1:2),
g2=factor(1:2)),
y=rnorm(12))
# Load ggplot2
library(ggplot2)
# Define position dodge
pd <- position_dodge(0.75)
# Define the plot
p <- ggplot(d, aes(x=x, y=y, colour=g1, group=interaction(g1,g2))) +
geom_point(aes(shape = factor(g2)), position=pd) +
geom_line()
# Look at the figure
p
# How to plot the line instead across g1, within g2, and within x?
Simply trying to close this question (@Axeman please feel free to take over my answer).
p <- ggplot(d, aes(x=x, y=y, colour=g1, group=interaction(g1,g2))) +
geom_point(aes(shape = factor(g2)), position=pd) +
geom_line(position = pd)
# Look at the figure
p
I use ggplot2::ggplot for all 2D plotting needs, including density plots, but I find that when plotting a number of overlapping densities with extreme outliers on a single space (in different colors) the line on the x-axis becomes a little distracting.
My question is then, can you remove the bottom section of the density plot from being plotted? If so, how?
You can use this example:
library(ggplot2)
library(ggplot2movies) # the movies data set moved to this package in ggplot2 2.0.0
ggplot(movies, aes(x = rating)) + geom_density()
Should turn out like this:
How about using stat_density directly?
ggplot(movies, aes(x = rating)) + stat_density(geom="line")
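For several overlapping densities in different colors, one detail to watch is that stat_density() defaults to position = "stack", unlike geom_density(). A minimal sketch, using the built-in mtcars data as a stand-in for the movies data (the name p_lines is made up here):

```r
library(ggplot2)

# One density line per cylinder count. position = "identity" overlays the
# curves instead of stacking them, and geom = "line" draws only the curve,
# with no line along the x-axis.
p_lines <- ggplot(mtcars, aes(x = mpg, colour = factor(cyl))) +
  stat_density(geom = "line", position = "identity")
p_lines
```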
You can just draw a white line over it:
ggplot(movies, aes(x = rating)) +
geom_density() +
geom_hline(color = "white", yintercept = 0)
I have a scatterplot in R. Each (x,y) point is colored according to its z value. So you can think of each point as (x,y,z), where (x,y) determines its position and z determines its color along a color gradient. I would like to add two things:
1. A legend on the right side showing the color gradient and which z values correspond to which colors.
2. I would like to smooth all the colors using some type of interpolation, I assume. In other words, the entire plotting region (or at least most of it) should become colored so that it looks like a huge heatmap instead of a scatterplot. So, in the example below, there would be lots of orange/yellow around and then some patches of purple throughout. I'm happy to further clarify what I'm trying to explain here, if need be.
Here is the code I have currently, and the image it makes.
x <- seq(1,150)
y <- runif(150)
z <- c(rnorm(mean=1,100),rnorm(mean=20,50))
colorFunction <- colorRamp(rainbow(100))
zScaled <- (z - min(z)) / (max(z) - min(z))
zMatrix <- colorFunction(zScaled)
zColors <- rgb(zMatrix[,1], zMatrix[,2], zMatrix[,3], maxColorValue=255)
df <- data.frame(x,y)
x <- densCols(x,y, colramp=colorRampPalette(c("black", "white")))
df$dens <- col2rgb(x)[1,] + 1L
plot(y~x, data=df[order(df$dens),],pch=20, col=zColors, cex=1)
Here are some solutions using the ggplot2 package.
# Load library
library(ggplot2)
# Recreate the scatterplot from the example with default colours
ggplot(df) +
geom_point(aes(x=x, y=y, col=dens))
# Recreate the scatterplot with a custom set of colours. I use rainbow(100)
ggplot(df) +
geom_point(aes(x=x, y=y, col=dens)) +
scale_color_gradientn(colours=rainbow(100))
# A 2d density plot, using default colours
ggplot(df) +
stat_density2d(aes(x=x, y=y, z=dens, fill = after_stat(level)), geom="polygon") +
ylim(-0.2, 1.2) + xlim(-30, 180) # I had to twiddle with the ranges to get a nicer plot
# A better density plot, in my opinion. Tiles across your range of data
ggplot(df) +
stat_density2d(aes(x=x, y=y, z=dens, fill = after_stat(density)), geom="tile",
contour = FALSE)
# Using custom colours. I use rainbow(100) again.
ggplot(df) +
stat_density2d(aes(x=x, y=y, z=dens, fill = after_stat(density)), geom="tile",
contour = FALSE) +
scale_fill_gradientn(colours=rainbow(100))
# You can also plot the points on top, if you want
ggplot(df) +
stat_density2d(aes(x=x, y=y, z=dens, fill = after_stat(density)), geom="tile",
contour = FALSE) +
geom_point(aes(x=x, y=y, col=dens)) +
scale_colour_continuous(guide="none") # This removes the extra legend
I attach the plots as well:
Also, using ggplot2, you can use color and size together, as in:
ggplot(df, aes(x=x, y=y, size=dens, color=dens)) + geom_point() +
scale_color_gradientn(name="Density", colours=rev(rainbow(100))) +
scale_size_continuous(range=c(1,15), guide="none")
which might make it a little clearer.
Notes:
The expression rev(rainbow(100)) reverses the rainbow color scale,
so that red goes with the larger values of dens.
Unfortunately, you cannot combine a continuous legend (color) and a
discrete legend (size), so you would normally get two legends. The
expression guide="none" hides the size legend.
Here's the plot: