modifying ggplot objects after creation - r

Is there a preferred way to modify ggplot objects after creation?
For example, I recommend that my students save the R object together with the PDF file for later changes:
library(ggplot2)
graph <-
ggplot(mtcars, aes(x=mpg, y=qsec, fill=cyl)) +
geom_point() +
geom_text(aes(label=rownames(mtcars))) +
xlab('miles per gallon') +
ggtitle('my title')
ggsave('test.pdf', graph)
save(graph, file='graph.RData')
So now, in case they have to change the title, the labels, or sometimes other things, they can easily load the object and make simple changes.
load('graph.RData')
print(graph)
graph +
ggtitle('better title') +
ylab('seconds per quarter mile')
What do I have to do, for example, to change the colour to a discrete scale? In the original plot I would wrap the mapped variable (here cyl) in as.factor. But is there a way to do it afterwards?
Or is there a better way of modifying the objects when the data is gone? I would love to get some advice.

You could use ggplot_build() to alter the plot without the code or data:
Example plot:
data("iris")
p <- ggplot(iris) +
aes(x = Sepal.Length, y = Sepal.Width, colour = Species) +
geom_point()
The colours correspond to Species.
Disassemble the plot using ggplot_build():
q <- ggplot_build(p)
Take a look at the object q to see what is happening here.
To change the colour of the point, you can alter the respective table in q:
q$data[[1]]$colour <- "black"
Reassemble the plot using ggplot_gtable():
q <- ggplot_gtable(q)
And plot it:
plot(q)
Now, the points are black.
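For the specific case of switching to a discrete colour scale, a lighter alternative may be enough. The sketch below assumes the saved object still carries its data (a ggplot object stores the data frame it was built from) and that colour rather than fill is the mapped aesthetic, since the default point shape ignores fill. The name graph_discrete is only illustrative.

```r
library(ggplot2)

# Stand-in for the plot restored by load('graph.RData'); colour is mapped
# here (an assumption) because the default point shape does not show fill.
graph <- ggplot(mtcars, aes(x = mpg, y = qsec, colour = cyl)) +
  geom_point()

# The plot object keeps its own copy of the data frame.
head(graph$data)

# Adding a new aes() overrides the stored colour mapping, so the column
# can be wrapped in factor() without the original script.
graph_discrete <- graph +
  aes(colour = factor(cyl)) +
  labs(colour = "cyl")

# print(graph_discrete) draws the plot with a discrete legend.
```

Because only the mapping changes, titles, labels and other layers already in the object are kept.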

Related

Plotting multiple wordcloud (ggwordcloud) with other types of plots in ggplot2

I am trying to plot several wordclouds in a scatterplot and I wonder if one can control the position of a wordcloud in ggplot?
As an example the code below overlays both wordclouds around the origin of the plot.
Say I want to place the second wordcloud at x=4 and y =35. Is that possible?
library(ggplot2)
library(ggwordcloud)
ggplot() +
geom_point(mtcars,mapping=aes(wt,mpg)) +
geom_text_wordcloud(love_words_small,mapping=aes(label=word)) +
geom_text_wordcloud(mtcars,mapping=aes(label=rownames(mtcars))) +
theme_minimal()
I was looking for the exact same thing. It looks like you can simply add the x and y aesthetic arguments, i.e.:
ggplot() +
geom_point(mtcars,mapping=aes(wt,mpg)) +
geom_text_wordcloud(love_words_small,mapping=aes(label=word)) +
geom_text_wordcloud(mtcars,mapping=aes(label=rownames(mtcars), x=4,y=35)) +
theme_minimal()
What I did may be more generally helpful for folks, which is to pass x and y vectors:
library(tidyverse)
library(ggwordcloud)
ggplot(data = mtcars %>% mutate(car_names = rownames(mtcars)) %>%
group_by(cyl),
mapping = aes(label=car_names, x=mpg, y=disp)) +
geom_text_wordcloud()
Perhaps you could save the wordclouds as separate plots, and then add them to one plot with cowplot or gridExtra or any of the packages that lets you combine plots?

Arranging data for two facet R line plot

I am trying to make a two-facet line plot like this example. My problem is arranging the data so that the desired variable appears on the x-axis. Here is the small data set I want to use.
Study,Cat,Dim1,Dim2,Dim3,Dim4
Study1,PK,-3.00,0.99,-0.86,0.46
Study1,US,-4.67,0.76,1.01,0.45
Study2,FL,-2.856,4.15,1.554,0.765
Study2,FL,-8.668,5.907,3.795,4.754
I tried to use the following code to draw line graph from this data frame.
plot1 <- ggplot(data = dims, aes(x = Cat, y = Dim1, group = Study)) +
geom_line() +
geom_point() +
facet_wrap(~Study)
As is clear, I can only use one value column to draw lines. I want to put Dim1, Dim2, Dim3 and Dim4 on the x-axis, which I cannot do with the data arranged this way (I tried c(Dim1, Dim2, Dim3, Dim4) with no luck).
Probably the solution is to transpose the table, but then I cannot reproduce the categorization for the facets (Study in the table above) and the colour (Cat in the table above). Any ideas how to solve this issue?
You can try this:
library(tidyr)
library(dplyr)
gather(dims, variable, value, -Study, -Cat) %>%
ggplot(aes(x=variable, y=value, group=Cat, col=Cat)) +
geom_point() + geom_line() + facet_wrap(~Study)
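In current tidyr, gather() is superseded by pivot_longer(). The sketch below does the same reshaping on the sample data from the question. The row_id column is a hypothetical addition, assuming each original row should be drawn as its own line; without it, the two Study2/FL rows share one group and are joined into a single zigzag line.

```r
library(tidyr)
library(ggplot2)

# Sample data copied from the question.
dims <- read.csv(text = "Study,Cat,Dim1,Dim2,Dim3,Dim4
Study1,PK,-3.00,0.99,-0.86,0.46
Study1,US,-4.67,0.76,1.01,0.45
Study2,FL,-2.856,4.15,1.554,0.765
Study2,FL,-8.668,5.907,3.795,4.754")

# Hypothetical helper column: one id per original row, used for grouping.
dims$row_id <- seq_len(nrow(dims))

# Wide to long: Dim1..Dim4 become a name column (Dim) and a value column (Value).
dims_long <- pivot_longer(dims, cols = Dim1:Dim4,
                          names_to = "Dim", values_to = "Value")

plot1 <- ggplot(dims_long,
                aes(x = Dim, y = Value, colour = Cat, group = row_id)) +
  geom_line() +
  geom_point() +
  facet_wrap(~Study)
```

Grouping by row_id while colouring by Cat keeps the legend per category but draws one line per original row.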
The solution was quite easy. I just had to think a bit; the rearranged data looks like this:
Study,Cat,Dim,Value
Study1,PK,Dim1,-3
Study1,PK,Dim2,0.99
Study1,PK,Dim3,-0.86
Study1,PK,Dim4,0.46
Study1,US,Dim1,-4.67
Study1,US,Dim2,0.76
Study1,US,Dim3,1.01
Study1,US,Dim4,0.45
Study2,FL,Dim1,-2.856
Study2,FL,Dim2,4.15
Study2,FL,Dim3,1.554
Study2,FL,Dim4,0.765
Study2,FL,Dim1,-8.668
Study2,FL,Dim2,5.907
Study2,FL,Dim3,3.795
Study2,FL,Dim4,4.754
After that, R produced the desired result with this code:
plot1 <- ggplot(data=dims, aes(x=Dim, y=Value, colour=Cat, group=Cat)) +
geom_line() +
geom_point() +
facet_wrap(~Study)

3-variables plotting heatmap ggplot2

I'm currently working on a very simple data.frame, containing three columns:
x contains x-coordinates of a set of points,
y contains y-coordinates of the set of points, and
weight contains a value associated with each point.
Now, working in ggplot2 I seem to be able to plot contour levels for these data, but I can't manage to find a way to fill the plot according to the variable weight. Here's the code that I used:
ggplot(df, aes(x,y, fill=weight)) +
geom_density_2d() +
coord_fixed(ratio = 1)
You can see that there's no filling whatsoever, sadly.
I've been trying for three days now, and I'm starting to get depressed.
Specifying fill=weight and/or color = weight in the general ggplot call, resulted in nothing. I've tried to use different geoms (tile, raster, polygon...), still nothing. Tried to specify the aes directly into the geom layer, also didn't work.
I tried to convert the object to a ppp, but ggplot can't handle those, and base-R plotting didn't work either. I honestly have no idea what's wrong!
I'm attaching the first 10 points' data, which is spaced on an irregular grid:
x = c(-0.13397460,-0.31698730,-0.13397460,0.13397460,-0.28867513,-0.13397460,-0.31698730,-0.13397460,-0.28867513,-0.26794919)
y = c(-0.5000000,-0.6830127,-0.5000000,-0.2320508,-0.6547005,-0.5000000,-0.6830127,-0.5000000,-0.6547005,0.0000000)
weight = c(4.799250e-01,5.500250e-01,4.799250e-01,-2.130287e+12,5.798250e-01,4.799250e-01,5.500250e-01,4.799250e-01,5.798250e-01,6.618956e-01)
Any advice? The desired output was shown in a linked image (not reproduced here).
Thank you in advance.
From your description geom_density doesn't sound right.
You could try geom_raster:
ggplot(df, aes(x,y, fill = weight)) +
geom_raster() +
coord_fixed(ratio = 1) +
scale_fill_gradientn(colours = rev(rainbow(7))) # colourmap
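Because the points sit on an irregular grid, another option (an assumption on my part, not something proposed in the thread) is to bin the points into cells and colour each cell by the mean weight with stat_summary_2d(). The sketch uses the ten sample points from the question; the bin count of 10 is arbitrary.

```r
library(ggplot2)

# The ten sample points from the question.
x <- c(-0.13397460, -0.31698730, -0.13397460, 0.13397460, -0.28867513,
       -0.13397460, -0.31698730, -0.13397460, -0.28867513, -0.26794919)
y <- c(-0.5000000, -0.6830127, -0.5000000, -0.2320508, -0.6547005,
       -0.5000000, -0.6830127, -0.5000000, -0.6547005, 0.0000000)
weight <- c(4.799250e-01, 5.500250e-01, 4.799250e-01, -2.130287e+12,
            5.798250e-01, 4.799250e-01, 5.500250e-01, 4.799250e-01,
            5.798250e-01, 6.618956e-01)
df <- data.frame(x = x, y = y, weight = weight)

# Bin the x/y plane and fill each non-empty cell with the mean of weight.
p <- ggplot(df, aes(x, y, z = weight)) +
  stat_summary_2d(fun = mean, bins = 10) +
  coord_fixed(ratio = 1)

# The binned values can be inspected without drawing the plot.
cells <- layer_data(p)
```

Note that the sample contains one extreme weight (-2.130287e+12), which will dominate a continuous fill scale unless it is filtered out or the scale limits are set.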
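Here is a second-best option using fill=..level.., which fills each contour band by the density level computed by stat_density_2d (newer ggplot2 versions spell this after_stat(level)). Note that this colours by point density, not by weight.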
# load libraries
library(ggplot2)
library(RColorBrewer)
library(ggthemes)
# build your data.frame
df <- data.frame(x=x, y=y, weight=weight)
# build color Palette
myPalette <- colorRampPalette(rev(brewer.pal(11, "Spectral")), space="Lab")
# Plot
ggplot(df, aes(x,y, fill=..level..) ) +
stat_density_2d( bins=11, geom = "polygon") +
scale_fill_gradientn(colours = myPalette(11)) +
theme_minimal() +
coord_fixed(ratio = 1)

How to get the points inside of the ellipse in ggplot2?

I'm trying to identify the densest region in the plot, and I do this using stat_ellipse() in ggplot2. But I cannot get information about the points inside the ellipse (the total count, the row number of each point, and so on).
I have seldom seen this problem discussed. Is this possible?
For example:
ggplot(faithful, aes(waiting, eruptions))+
geom_point()+
stat_ellipse()
Here is Roman's suggestion implemented. The help for stat_ellipse says it uses a modified version of car::ellipse, so I chose to extract the ellipse points from the ggplot object. That way it should always be correct (even if you change options in stat_ellipse).
# Load packages
library(ggplot2)
library(sp)
# Build the plot first
p <- ggplot(faithful, aes(waiting, eruptions)) +
geom_point() +
stat_ellipse()
# Extract components
build <- ggplot_build(p)$data
points <- build[[1]]
ell <- build[[2]]
# Find which points are inside the ellipse, and add this to the data
dat <- data.frame(
points[1:2],
in.ell = as.logical(point.in.polygon(points$x, points$y, ell$x, ell$y))
)
# Plot the result
ggplot(dat, aes(x, y)) +
geom_point(aes(col = in.ell)) +
stat_ellipse()
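The question also asked for the total count and the row numbers of the points inside the ellipse. A minimal sketch follows, assuming the default stat_ellipse() settings. It uses a hypothetical base-R helper, inside_polygon() (an even-odd ray-casting test), in place of sp::point.in.polygon so that only ggplot2 is needed.

```r
library(ggplot2)

p <- ggplot(faithful, aes(waiting, eruptions)) +
  geom_point() +
  stat_ellipse()

pts <- layer_data(p, 1)  # point layer, same row order as faithful
ell <- layer_data(p, 2)  # vertices of the drawn ellipse

# Hypothetical helper: TRUE for each point (px, py) that lies inside the
# polygon with vertices (vx, vy), using the even-odd ray-casting rule.
inside_polygon <- function(px, py, vx, vy) {
  n <- length(vx)
  j <- c(n, seq_len(n - 1))  # index of the previous vertex for each edge
  vapply(seq_along(px), function(k) {
    crosses <- ((vy > py[k]) != (vy[j] > py[k])) &
      (px[k] < (vx[j] - vx) * (py[k] - vy) / (vy[j] - vy) + vx)
    sum(crosses) %% 2 == 1
  }, logical(1))
}

inside <- inside_polygon(pts$x, pts$y, ell$x, ell$y)

n_inside   <- sum(inside)         # how many points fall inside
idx_inside <- which(inside)       # their row numbers in faithful
inside_dat <- faithful[inside, ]  # the corresponding observations
```

Points lying exactly on the boundary may be classified either way; for a smooth ellipse approximated by line segments this rarely matters.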

Get data associated to ggplot + stat_ecdf()

I like the stat_ecdf() feature of the ggplot2 package, which I find quite useful for exploring a data series. However, it is only visual, and I wonder whether it is feasible (and if so, how) to get the associated table.
Please have a look at the following reproducible example:
p <- ggplot(iris, aes_string(x = "Sepal.Length")) + stat_ecdf() # building of the cumulated chart
p
attributes(p) # chart attributes
p$data # data is the iris dataset, not the series used to draw the chart
As @krfurlong showed me in this question, the layer_data function in ggplot2 can get you exactly what you're looking for without the need to recreate the data.
p <- ggplot(iris, aes_string(x = "Sepal.Length")) + stat_ecdf()
p.data <- layer_data(p)
The "y" column in p.data contains the ECDF values, and "x" holds the Sepal.Length values on the x-axis of your plot.
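As a sketch of how the extracted table might be tidied, the code below uses aes() rather than the deprecated aes_string(), drops the -Inf/Inf padding rows that stat_ecdf() adds by default, and sorts by x. The name ecdf_tab is only illustrative.

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Sepal.Length)) +
  stat_ecdf()

# Data computed by the stat for the first (and only) layer.
ecdf_tab <- layer_data(p)

# Keep the finite x values and the two columns of interest, ordered by x.
ecdf_tab <- ecdf_tab[is.finite(ecdf_tab$x), c("x", "y")]
ecdf_tab <- ecdf_tab[order(ecdf_tab$x), ]

head(ecdf_tab)
```

The y values should agree with base R's ecdf(iris$Sepal.Length) evaluated at the same x values, which makes for an easy cross-check.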
We can recreate the data:
# Recreate the ECDF data: one row per distinct x value
x_vals <- sort(unique(iris$Sepal.Length))
dat_ecdf <-
data.frame(x=x_vals,
y=ecdf(iris$Sepal.Length)(x_vals))
# ecdf() already returns values in the 0-1 range, so no rescaling is needed
The two plots below should look the same:
#plot using new data
ggplot(dat_ecdf,aes(x,y)) +
geom_step() +
xlim(4,8)
#plot with built-in stat_ecdf
ggplot(iris, aes_string(x = "Sepal.Length")) +
stat_ecdf() +
xlim(4,8)

Resources