once again Im confronted with a complicated ggplot. I want to plot different plottypes within one plot using facet grid.
I hope I can make my point clear using the following example:
I want to produce a plot similar to the first picture but the upper plot should look like the second picture.
I already found the trick using the subset function but I can't add vertical lines to only one plot let alone two or three (or specify the color).
CODE:
a <- rnorm(100)
b <- rnorm(100,8,1)
c <- rep(c(0,1),50)
dfr <- data.frame(a=a,b=b,c=c,d=seq(1:100))
dfr_melt <- melt(dfr,id.vars="d")
#I want only two grids, not three
ggplot(dfr_melt,aes(x=d,y=value)) + facet_grid(variable~.,scales="free")+
geom_line(subset=.(variable=="a")) + geom_line(subset=.(variable=="b"))
#Upper plot should look like this
ggplot(dfr,aes(x=d,y=a)) + geom_line() + geom_line(aes(y=c,color="c"))+
geom_hline(aes(yintercept=1),linetype="dashed")+
geom_hline(aes(yintercept=-2),linetype="dashed")
If I understand your question correctly, you just need to a variable column to dfr in order to allow the faceting to work:
dfr$variable = "a"
ggplot(subset(dfr_melt, variable=="a"),aes(x=d,y=value)) +
facet_grid(variable~.,scales="free")+
geom_line(data=subset(dfr_melt,variable=="a")) +
geom_line(data=subset(dfr_melt, variable=="b")) +
geom_line(data=dfr, aes(y=c, colour=factor(c))) +
geom_hline(aes(yintercept=1),linetype="dashed")+
geom_hline(aes(yintercept=-2),linetype="dashed")
Notice that my plot doesn't have the zig-zig line, this is because I changed:
#This is almost certainly not what you want
geom_line(data=dfr, aes(y=c, colour="c"))
to
#I made c a factor since it only takes the values 0 or 1
geom_line(data=dfr, aes(y=c, colour=factor(c)))
##Alternatively, you could have
geom_line(data=dfr, aes(y=c), colour="red") #or
geom_line(data=dfr, aes(y=c, colour=c)) #or
To my knowledge, you can't put multiple plot types in a single plot using facet.grid(). Your two options, as far as I can see, are
to put empty data in the first facet, so the lines are 'there' but not displayed, or
to combine multiple plots into one using viewports.
I think the second solution is more general, so that's what I did:
#name each of your plots
p2 <- ggplot(subset(dfr_melt, variable=="a"),aes(x=d,y=value)) + facet_grid(variable~.,scales="free")+
geom_line(subset=.(variable=="a")) + geom_line(subset=.(variable=="b"))
#Upper plot should look like this
p1 <- ggplot(dfr,aes(x=d,y=a)) + geom_line() + geom_line(aes(y=c,color="c"))+
geom_hline(aes(yintercept=1),linetype="dashed")+
geom_hline(aes(yintercept=-2),linetype="dashed")
#From Wickham ggplot2, p154
vplayout <- function(x,y) {
viewport(layout.pos.row=x, layout.pos.col=y)
}
require(grid)
png("myplot.png", width = 600, height = 300) #or use a different device, e.g. quartz for onscreen display on a mac
grid.newpage()
pushViewport(viewport(layout=grid.layout(2, 1)))
print(p1, vp=vplayout(1, 1))
print(p2, vp=vplayout(2, 1))
dev.off()
You might need to fiddle a bit to get them to line up exactly right. Turning off the faceting on the upper plot, and moving the legend on the lower plot to the bottom, should do the trick.
Related
I have a somewhat "weird" two-dimensional distribution (not normal with some uniform values, but it kinda looks like this.. this is just a minimal reproducible example), and want to log-transform the values and plot them.
library("ggplot2")
library("scales")
df <- data.frame(x = c(rep(0,200),rnorm(800, 4.8)), y = c(rnorm(800, 3.2),rep(0,200)))
Without the log transformation, the scatterplot (incl. rug plot which I need) works (quite) well, apart from a marginally narrower rug plot on the x axis:
p <- ggplot(df, aes(x, y)) + geom_point() + geom_rug(alpha = I(0.5)) + theme_minimal()
p
When plotting the same with a log10-transform though, the points at the margin (at x = 0 and y = 0, respectively) are plotted outside the rug plot or just on the axis (with other data, and only one half side of a point is visible).
p + scale_x_log10() + scale_y_log10()
How can I "rescale" the axes so that all the points are contained fully within the grid and the rug plots are unaffected, as in the first example?
Maybe you want
p + scale_x_log10(oob=squish_infinite) + scale_y_log10(oob=squish_infinite)
I don't really know what you expect to happen for those values that can be negative or infinite, but one general advice when transformations don't do what you want is to perform them outside of ggplot2. Something like this might be useful,
library(plyr)
df2 <- colwise(log10)(df) # log transform columns
df2 <- colwise(squish_infinite)(df2) # do something with infinites
p %+% df2 # plot the transformed data
What's the ggplot2 equivalent of "dotplot" histograms? With stacked points instead of bars? Similar to this solution in R:
Plot Histogram with Points Instead of Bars
Is it possible to do this in ggplot2? Ideally with the points shown as stacks and a faint line showing the smoothed line "fit" to these points (which would make a histogram shape.)
ggplot2 does dotplots Link to the manual.
Here is an example:
library(ggplot2)
set.seed(789); x <- data.frame(y = sample(1:20, 100, replace = TRUE))
ggplot(x, aes(y)) + geom_dotplot()
In order to make it behave like a simple dotplot, we should do this:
ggplot(x, aes(y)) + geom_dotplot(binwidth=1, method='histodot')
You should get this:
To address the density issue, you'll have to add another term, ylim(), so that your plot call will have the form ggplot() + geom_dotplot() + ylim()
More specifically, you'll write ylim(0, A), where A will be the number of stacked dots necessary to count 1.00 density. In the example above, the best you can do is see that 7.5 dots reach the 0.50 density mark. From there, you can infer that 15 dots will reach 1.00.
So your new call looks like this:
ggplot(x, aes(y)) + geom_dotplot(binwidth=1, method='histodot') + ylim(0, 15)
Which will give you this:
Usually, this kind of eyeball estimate will work for dotplots, but of course you can try other values to fine-tune your scale.
Notice how changing the ylim values doesn't affect how the data is displayed, it just changes the labels in the y-axis.
As #joran pointed out, we can use geom_dotplot
require(ggplot2)
ggplot(mtcars, aes(x = mpg)) + geom_dotplot()
Edit: (moved useful comments into the post):
The label "count" it's misleading because this is actually a density estimate may be you could suggest we changed this label to "density" by default. The ggplot implementation of dotplot follow the original one of Leland Wilkinson, so if you want to understand clearly how it works take a look at this paper.
An easy transformation to make the y axis actually be counts, i.e. "number of observations". From the help page it is written that:
When binning along the x axis and stacking along the y axis, the numbers on y axis are not meaningful, due to technical limitations of ggplot2. You can hide the y axis, as in one of the examples, or manually scale it to match the number of dots.
So you can use this code to hide y axis:
ggplot(mtcars, aes(x = mpg)) +
geom_dotplot(binwidth = 1.5) +
scale_y_continuous(name = "", breaks = NULL)
I introduce an exact approach using #Waldir Leoncio's latter method.
library(ggplot2); library(grid)
set.seed(789)
x <- data.frame(y = sample(1:20, 100, replace = TRUE))
g <- ggplot(x, aes(y)) + geom_dotplot(binwidth=0.8)
g # output to read parameter
### calculation of width and height of panel
grid.ls(view=TRUE, grob=FALSE)
real_width <- convertWidth(unit(1,'npc'), 'inch', TRUE)
real_height <- convertHeight(unit(1,'npc'), 'inch', TRUE)
### calculation of other values
width_coordinate_range <- diff(ggplot_build(g)$panel$ranges[[1]]$x.range)
real_binwidth <- real_width / width_coordinate_range * 0.8 # 0.8 is the argument binwidth
num_balls <- real_height / 1.1 / real_binwidth # the number of stacked balls. 1.1 is expanding value.
# num_balls is the value of A
g + ylim(0, num_balls)
Apologies : I don't have enough reputation to 'comment'.
I like cuttlefish44's "exact approach", but to make it work (with ggplot2 [2.2.1]) I had to change the following line from :
### calculation of other values
width_coordinate_range <- diff(ggplot_build(g)$panel$ranges[[1]]$x.range)
to
### calculation of other values
width_coordinate_range <- diff(ggplot_build(g)$layout$panel_ranges[[1]]$x.range)
I am trying to write a code that I wrote with a basic graphics package in R to ggplot.
The graph I obtained using the basic graphics package is as follows:
I was wondering whether this type of graph is possible to create in ggplot2. I think we could create this kind of graph by using panels but I was wondering is it possible to use faceting for this kind of plot. The major difficulty I encountered is that maximum and minimum have common lengths whereas the observed data is not continuous data and the interval is quite different.
Any thoughts on arranging the data for this type of plot would be very helpful. Thank you so much.
Jdbaba,
From your comments, you mentioned that you'd like for the geom_point to have just the . in the legend. This is a feature that is yet to be implemented to be used directly in ggplot2 (if I am right). However, there's a fix/work-around that is given by #Aniko in this post. Its a bit tricky but brilliant! And it works great. Here's a version that I tried out. Hope it is what you expected.
# bind both your data.frames
df <- rbind(tempcal, tempobs)
p <- ggplot(data = df, aes(x = time, y = data, colour = group1,
linetype = group1, shape = group1))
p <- p + geom_line() + geom_point()
p <- p + scale_shape_manual("", values=c(NA, NA, 19))
p <- p + scale_linetype_manual("", values=c(1,1,0))
p <- p + scale_colour_manual("", values=c("#F0E442", "#0072B2", "#D55E00"))
p <- p + facet_wrap(~ id, ncol = 1)
p
The idea is to first create a plot with all necessary attributes set in the aesthetics section, plot what you want and then change settings manually later using scale_._manual. You can unset lines by a 0 in scale_linetype_manual for example. Similarly you can unset points for lines using NA in scale_shape_manual. Here, the first two values are for group1=maximum and minimum and the last is for observed. So, we set NA to the first two for maximum and minimum and set 0 to linetype for observed.
And this is the plot:
Solution found:
Thanks to Arun and Andrie
Just in case somebody needs the solution of this sort of problem.
The code I used was as follows:
library(ggplot2)
tempcal <- read.csv("temp data ggplot.csv",header=T, sep=",")
tempobs <- read.csv("temp data observed ggplot.csv",header=T, sep=",")
p <- ggplot(tempcal,aes(x=time,y=data))+geom_line(aes(x=time,y=data,color=group1))+geom_point(data=tempobs,aes(x=time,y=data,colour=group1))+facet_wrap(~id)
p
The dataset used were https://www.dropbox.com/s/95sdo0n3gvk71o7/temp%20data%20observed%20ggplot.csv
https://www.dropbox.com/s/4opftofvvsueh5c/temp%20data%20ggplot.csv
The plot obtained was as follows:
Jdbaba
In the following example, how do I set separate ylims for each of my facets?
qplot(x, value, data=df, geom=c("smooth")) + facet_grid(variable ~ ., scale="free_y")
In each of the facets, the y-axis takes a different range of values and I would like to different ylims for each of the facets.
The defaults ylims are too long for the trend that I want to see.
This was brought up on the ggplot2 mailing list a short while ago. What you are asking for is currently not possible but I think it is in progress.
As far as I know this has not been implemented in ggplot2, yet. However a workaround - that will give you ylims that exceed what ggplot provides automatically - is to add "artificial data". To reduce the ylims simply remove the data you don't want plot (see at the and for an example).
Here is an example:
Let's just set up some dummy data that you want to plot
df <- data.frame(x=rep(seq(1,2,.1),4),f1=factor(rep(c("a","b"),each=22)),f2=factor(rep(c("x","y"),22)))
df <- within(df,y <- x^2)
Which we could plot using line graphs
p <- ggplot(df,aes(x,y))+geom_line()+facet_grid(f1~f2,scales="free_y")
print(p)
Assume we want to let y start at -10 in first row and 0 in the second row, so we add a point at (0,-10) to the upper left plot and at (0,0) ot the lower left plot:
ylim <- data.frame(x=rep(0,2),y=c(-10,0),f1=factor(c("a","b")),f2=factor(c("x","y")))
dfy <- rbind(df,ylim)
Now by limiting the x-scale between 1 and 2 those added points are not plotted (a warning is given):
p <- ggplot(dfy,aes(x,y))+geom_line()+facet_grid(f1~f2,scales="free_y")+xlim(c(1,2))
print(p)
Same would work for extending the margin above by adding points with higher y values at x values that lie outside the range of xlim.
This will not work if you want to reduce the ylim, in which case subsetting your data would be a solution, for example to limit the upper row between -10 and 1.5 you could use:
p <- ggplot(dfy,aes(x,y))+geom_line(subset=.(y < 1.5 | f1 != "a"))+facet_grid(f1~f2,scales="free_y")+xlim(c(1,2))
print(p)
There are actually two packages that solve that problem now:
https://github.com/zeehio/facetscales, and https://cran.r-project.org/package=ggh4x.
I would recommend using ggh4x because it has very useful tools, such as facet grid multiple layers (having 2 variables defining the rows or columns), scaling the x and y-axis as you wish in each facet, and also having multiple fill and colour scales.
For your problems the solution would be like this:
library(ggh4x)
scales <- list(
# Here you have to specify all the scales, one for each facet row in your case
scale_y_continuous(limits = c(2,10),
scale_y_continuous(breaks = c(3, 4))
)
qplot(x, value, data=df, geom=c("smooth")) +
facet_grid(variable ~ ., scale="free_y") +
facetted_pos_scales(y = scales)
I have one example of function facet_wrap
ggplot(mpg, aes(displ, hwy)) +
geom_point() +
facet_wrap(vars(class), scales = "free",
nrow=2,ncol=4)
Above code generates plot as:
my level too low to upload an image, click here to see plot
I've got a nice facet_wrap density plot that I have created with ggplot2. I would like for each panel to have x and y axis labels instead of only having the y axis labels along the left side and the x labels along the bottom. What I have right now looks like this:
library(ggplot2)
myGroups <- sample(c("Mo", "Larry", "Curly"), 100, replace=T)
myValues <- rnorm(300)
df <- data.frame(myGroups, myValues)
p <- ggplot(df) +
geom_density(aes(myValues), fill = alpha("#335785", .6)) +
facet_wrap(~ myGroups)
p
Which returns:
(source: cerebralmastication.com)
It seems like this should be simple, but my Google Fu has been too poor to find an answer.
You can do this by including the scales="free" option in your facet_wrap call:
myGroups <- sample(c("Mo", "Larry", "Curly"), 100, replace=T)
myValues <- rnorm(300)
df <- data.frame(myGroups, myValues)
p <- ggplot(df) +
geom_density(aes(myValues), fill = alpha("#335785", .6)) +
facet_wrap(~ myGroups, scales="free")
p
Short answer: You can't do that. It might make sense with 3 graphs, but what if you had a big lattice of 32 graphs? That would look noisy and bad. GGplot's philosophy is about doing the right thing with a minimum of customization, which means, naturally, that you can't customize things as much as other packages.
Long answer: You could fake it by constructing three separate ggplot objects and combining them. But it's not a very general solution. Here's some code from Hadley's book that assumes you've created ggplot objects a, b, and c. It puts a in the top row, with b and c in the bottom row.
grid.newpage()
pushViewport(viewport(layout=grid.layout(2,2)))
vplayout<-function(x,y)
viewport(layout.pos.row=x,layout.pos.col=y)
print(a,vp=vplayout(1,1:2))
print(b,vp=vplayout(2,1))
print(c,vp=vplayout(2,2))