I am having some trouble creating a facet grid of a back-to-back histogram created with ggplot.
# create data frame with latency values
latc_sorted <- data.frame(
subject=c(1,1,1,1,1,2,2,2,2,2),
grp=c("K_N","K_I","K_N","K_I","K_N","K_I","K_N","K_I","K_N","K_I"),
lat=c(22,45,18,55,94,11,67,22,64,44)
)
# subset and order data
x.sub_ki<-subset(latc_sorted, grp=="K_I")
x.sub_kn<-subset(latc_sorted, grp=="K_N")
x.sub_k<-rbind(x.sub_ki,x.sub_kn)
x=x.sub_ki$lat
y=x.sub_kn$lat
nm<-list("x","y")
# make absolute values on x axis
my.abs<-function(x){abs(x)}
# plot back-to-back histogram
hist_K<-qplot(x, geom="histogram", fill="inverted", binwidth=20) +
geom_histogram(data=data.frame(x=y), aes(fill="non-inverted", y=-..count..),
binwidth= 20) + scale_y_continuous(formatter='my.abs') + coord_flip() +
scale_fill_hue("variable")
hist_K
This plots fine, but if I try the following I get this error:
Error: Casting formula contains variables not found in molten data: x.sub_k$subject
hist_K_sub<-qplot(x, geom="histogram", fill="inverted", binwidth=20) +
geom_histogram(data=data.frame(x=y), aes(fill="non-inverted", y=-..count..),
binwidth= 20) + scale_y_continuous(formatter='my.abs') + coord_flip() +
scale_fill_hue("variable")+
facet_grid(x.sub_k$subject ~ .)
hist_K_sub
Any ideas what is causing this to fail?
The problem is that the variables referenced in facet_grid are looked for in the data.frames that are passed to the various layers. You have created (implicitly and explicitly) data.frames which have only the lat data and do not have the subject information. If you use x.sub_ki and x.sub_kn instead, they do have the subject variable associated with the lat values.
hist_K_sub <-
ggplot() +
geom_histogram(data=x.sub_ki, aes(x=lat, fill="inverted", y= ..count..), binwidth=20) +
geom_histogram(data=x.sub_kn, aes(x=lat, fill="not inverted", y=-..count..), binwidth=20) +
facet_grid(subject ~ .) +
scale_y_continuous(formatter="my.abs") +
scale_fill_hue("variable") +
coord_flip()
hist_K_sub
I also converted from qplot to full ggplot syntax; that shows the parallel structure of ki and kn better.
The syntax above doesn't work with newer versions of ggplot2. Use
the following instead to format the axis labels:
abs_format <- function() {
function(x) abs(x)
}
hist_K_sub <- hist_K_sub+ scale_y_continuous(labels=abs_format())
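For reference, the whole faceted plot can be assembled in one piece for recent ggplot2 releases. This is a minimal sketch, assuming ggplot2 3.3.0 or later, where after_stat(count) replaces the older ..count.. notation. It passes base R's abs directly as the labels function instead of wrapping it, which is a simplification and not part of the original answer.

```r
library(ggplot2)

# Data from the question
latc_sorted <- data.frame(
  subject = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  grp = rep(c("K_N", "K_I"), times = 5),
  lat = c(22, 45, 18, 55, 94, 11, 67, 22, 64, 44)
)
x.sub_ki <- subset(latc_sorted, grp == "K_I")
x.sub_kn <- subset(latc_sorted, grp == "K_N")

# Both layers keep the subject column, so facet_grid() can find it
hist_K_sub <- ggplot() +
  geom_histogram(data = x.sub_ki,
                 aes(x = lat, y = after_stat(count), fill = "inverted"),
                 binwidth = 20) +
  geom_histogram(data = x.sub_kn,
                 aes(x = lat, y = -after_stat(count), fill = "not inverted"),
                 binwidth = 20) +
  facet_grid(subject ~ .) +
  scale_y_continuous(labels = abs) +  # show mirrored counts as positive
  scale_fill_hue("variable") +
  coord_flip()
hist_K_sub
```

The key point is unchanged: each layer's data frame has to contain the faceting variable.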
First, is there a way to fix my code so that the equation includes ln(x)? For example, the equation on the plot is shown as y = 1.2 + 0.32x, but it should be y = 1.2 + 0.32ln(x).
Second, is there a way to create a new data frame that summarizes the logarithmic equations from all the plots produced with stat_regline_equation(formula=y~log(x))?
iris<-rotated.plot.data %>%
select(`2014-02-03 06:10:00` : `2014-09-30 22:10:00`)
plots <- purrr::map(iris, function(y) {
ggplot(rotated.plot.data,
aes(x=instrument.supersaturation, y={{ y }})) +
geom_point() + geom_smooth(method="lm", formula = y~log(x)) +
stat_regline_equation(formula=y~log(x)) +
ylab("Nccn/Ncn") +
xlab("instrument supersaturation(%)")})
Unfortunately, I have been searching Google and can't find any methods that help with these problems.
Your example is not reproducible (since it relies on rotated.plot.data that is not available).
But from your question it seems all you want is to transform the x axis with a logarithm. There is a scale for that: scale_x_log10(). You can remove the log() from the geom_smooth() call and add this scale to your plot.
Try this:
rotated.plot.data %>%
ggplot(aes(x=instrument.supersaturation, y={{ y }})) +
geom_point() +
geom_smooth(method="lm") +
scale_x_log10() +
ylab("Nccn/Ncn") +
xlab("instrument supersaturation(%)")
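The second part of the question (collecting the fitted equations in a data frame) can be handled outside ggplot by fitting the same model with lm(). The sketch below is only an illustration: the data frame demo and its columns supersaturation, t1 and t2 are hypothetical stand-ins for rotated.plot.data, which is not available. Note that log() in R is the natural logarithm, whereas scale_x_log10() uses base 10, so slopes read off a log10 axis differ from these by a factor of log(10).

```r
# Hypothetical stand-in for rotated.plot.data: one predictor, two responses
demo <- data.frame(supersaturation = seq(0.1, 1, by = 0.1))
demo$t1 <- 1.2 + 0.32 * log(demo$supersaturation)
demo$t2 <- 0.8 + 0.10 * log(demo$supersaturation)

response_cols <- c("t1", "t2")

# Fit y ~ log(x) for each response column and keep the coefficients
fits <- lapply(response_cols, function(col) {
  fit <- lm(demo[[col]] ~ log(demo$supersaturation))
  data.frame(series = col,
             intercept = unname(coef(fit)[1]),
             slope = unname(coef(fit)[2]))
})
equations <- do.call(rbind, fits)

# Text form of each equation, written with ln(x)
equations$label <- sprintf("y = %.2f + %.2f ln(x)",
                           equations$intercept, equations$slope)
print(equations)
```

The label column could then be placed on each plot with annotate() or a similar text layer if the ln(x) form is needed on the figure itself.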
I'm trying to add geom_point() markers to an autolayer() line ("fitted" in the picture). autolayer() is part of the autoplot() wrapper for ggplot2 in Rob Hyndman's forecast package (there's a base autoplot/autolayer in ggplot2 too, so the same likely applies there).
The problem is that geom_point() applies fine to the main call, but how do I apply something similar to the autolayer (fitted values)? I'm no ggplot2 expert, and the autoplot wrapper makes it trickier.
I tried type="b" like a normal geom_line(), but it's not a parameter of autolayer().
require(fpp2)
model.ses <- ets(mdeaths, model="ANN", alpha=0.4)
model.ses.fc <- forecast(model.ses, h=5)
forecast::autoplot(mdeaths) +
forecast::autolayer(model.ses.fc$fitted, series="Fitted") + # cannot set to show points, and type="b" not allowed
geom_point() # this works fine against the main autoplot call
This seems to work:
library(forecast)
library(fpp2)
model.ses <- ets(mdeaths, model="ANN", alpha=0.4)
model.ses.fc <- forecast(model.ses, h=5)
# Pre-compute the fitted layer so we can extract the data out of it with
# layer_data()
fitted_layer <- forecast::autolayer(model.ses.fc$fitted, series="Fitted")
fitted_values <- fitted_layer$layer_data()
plt <- forecast::autoplot(mdeaths) +
fitted_layer +
geom_point() +
geom_point(data = fitted_values, aes(x = timeVal, y = seriesVal))
There might be a way to make forecast::autolayer do what you want directly but this solution works. If you want the legend to look right, you'll want to merge the input data and fitted values into a single data.frame.
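The single-data.frame approach mentioned above could look roughly like this. It is a sketch that bypasses autoplot()/autolayer() entirely and builds the plot with plain ggplot2; the names plot_df, time, value and series are made up for the illustration.

```r
library(ggplot2)
library(forecast)

model.ses <- ets(mdeaths, model = "ANN", alpha = 0.4)

# Stack observed and fitted values into one long data frame
plot_df <- rbind(
  data.frame(time = as.numeric(time(mdeaths)),
             value = as.numeric(mdeaths),
             series = "Observed"),
  data.frame(time = as.numeric(time(mdeaths)),
             value = as.numeric(fitted(model.ses)),
             series = "Fitted")
)

# One line layer and one point layer cover both series,
# and the legend shows a line-plus-point key for each
plt <- ggplot(plot_df, aes(x = time, y = value, colour = series)) +
  geom_line() +
  geom_point()
plt
```

Because colour is mapped to series, both geoms share one legend entry per series.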
I need to add lines via stat_contour() to my ggplot2 plot. Unfortunately, I cannot share the real data from which the point values should be evaluated. However, another easily reproducible example behaves the same way:
testPts <- data.frame(x=rep(seq(7.08, 7.14, by=0.005), 200))
testPts$y <- runif(length(testPts$x), 50.93, 50.96)
testPts$z <- sin(testPts$y * 500)
ggplot(data=testPts, aes(x=x, y=y, z=z)) + geom_point(aes(colour=z)) +
stat_contour()
This results in the following error message:
Error in if (nrow(layer_data) == 0) return() : argument is of length zero
In addition: Warning message:
Not possible to generate contour data
The example doesn't look any different to me from others posted on Stack Overflow or in the official manual/tutorial, and it doesn't seem to matter whether I provide more specifications to stat_contour. It seems the function is not passed the data (layer), as pointed out in the error message.
Use stat_density2d instead of stat_contour with irregularly spaced data.
library(ggplot2)
testPts <- data.frame(x=rep(seq(7.08, 7.14, by=0.005), 200))
testPts$y <- runif(length(testPts$x), 50.93, 50.96)
testPts$z <- sin(testPts$y * 500)
(ggplot(data=testPts, aes(x=x, y=y, z=z))
+ geom_point(aes(colour=z))
+ stat_density2d()
)
One solution to this problem is to generate a regular grid and interpolate the point values with respect to that grid. Here is how I did it for just one of multiple data fields:
# interp() is assumed here to come from the akima package
pts.df <- as.data.frame(pts)
pts.grid <- interp(pts.df$coords.x1, pts.df$coords.x2, pts.df$GWLEVEL_TI)
pts.grid2 <- expand.grid(x=pts.grid$x, y=pts.grid$y)
pts.grid2$z <- as.vector(pts.grid$z)
This results in a data frame which can be used in a ggplot in stat_contour() when defined in the data-parameter of that function:
(ggplot(as.data.frame(pts), aes(x=coords.x1, y=coords.x2, z=GWLEVEL_TI))
#+ geom_tile(data=na.omit(pts.grid2), aes(x=x, y=y, z=z, fill=z))
+ stat_contour(data=na.omit(pts.grid2), binwidth=2, colour="red", aes(x=x, y=y, z=z))
+ geom_point()
)
This solution most likely includes unnecessary transformations because I don't know any better yet. Furthermore, I must repeat the same grid generation for every data field individually before combining them into a single data frame again, which is not as efficient as I would like for bigger data sets.
You should generate a z for each combination of x and y using expand.grid or outer. For example:
library(ggplot2)
testPts <- transform(expand.grid(x=1:10,y=1:5),z=sin(x*y))
(ggplot(data=testPts, aes(x=x, y=y, z=z))
+ stat_contour()
+ geom_point(aes(colour=z))
)
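For scattered points like the question's testPts, one rough way to get such a grid without an interpolation package is to snap the irregular coordinate onto bins and average z within each cell. This is a swapped-in technique (binning and averaging, not interpolation) and only a sketch: the choice of 15 bins and the names y_bin and grid_df are assumptions made for the illustration.

```r
library(ggplot2)

set.seed(42)
testPts <- data.frame(x = rep(seq(7.08, 7.14, by = 0.005), 200))
testPts$y <- runif(nrow(testPts), 50.93, 50.96)
testPts$z <- sin(testPts$y * 500)

# Snap y onto 15 equal-width bins and label each bin by its midpoint
breaks <- seq(50.93, 50.96, length.out = 16)
mids <- head(breaks, -1) + diff(breaks) / 2
bin_index <- cut(testPts$y, breaks, include.lowest = TRUE, labels = FALSE)
testPts$y_bin <- mids[bin_index]

# One averaged z per (x, y_bin) cell gives a regular grid
grid_df <- aggregate(z ~ x + y_bin, data = testPts, FUN = mean)

p <- ggplot(grid_df, aes(x = x, y = y_bin, z = z)) +
  stat_contour()
p
```

Coarser bins smooth z more heavily, while finer bins risk empty cells, which would leave holes in the grid that stat_contour() needs.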
I would like to plot another series of data on top of a current graph. The additional data only contains information for 3 (out of 6) spp, which are used in the facet_wrap().
The other series of data is currently a column (in the same data file).
Current graph:
ped.num <- ggplot(data, aes(ped.length, seeds.inflorstem))
ped.num + geom_point(size=2) + theme_bw() + facet_wrap(~spp, scales = "free_y")
Additional layer would be:
aes(ped.length, seeds.filled)
I feel I should be able to plot them using the same y-axis, because they have just slightly smaller values. How do I go about adding this layer?
@ialm's solution should work fine, but I recommend calling the aes function separately in each geom_* because it makes the code easier to read.
ped.num <- ggplot(data) +
geom_point(aes(x=ped.length, y=seeds.inflorstem), size=2) +
theme_bw() +
facet_wrap(~spp, scales="free_y") +
geom_point(aes(x=ped.length, y=seeds.filled))
(You'll always get better answers if you include example data, but I'll take a shot in the dark)
Since you want to plot two variables that are on the same data.frame, it's probably easiest to reshape the data before feeding it into ggplot:
library(reshape2)
# Melting data gives you exactly one observation per row - ggplot likes that
dat.melt <- melt(dat,
id.vars = c("spp", "ped.length"),
measure.vars = c("seeds.inflorstem", "seeds.filled")
)
# Plotting is slightly different - instead of explicitly naming each variable,
# you'll refer to "variable" and "value"
ggplot(dat.melt, aes(x = ped.length, y = value, color = variable)) +
geom_point(size=2) +
theme_bw() +
facet_wrap(~spp, scales = "free_y")
The seeds.filled values should plot only on the facets for the corresponding species.
I prefer this to Drew's (totally valid) approach of explicitly mapping different layers because you only need a single geom_point() whether you have two variables or twenty and it's easy to map a variety of aesthetics to variable.
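Since the question came without data, here is a small self-contained run of the melt approach. The data frame below is entirely made up (six species labelled A to F, with seeds.filled recorded for only A, B and C); na.rm = TRUE drops the missing seeds.filled rows so that those facets show only one series.

```r
library(ggplot2)
library(reshape2)

# Hypothetical data: six species, seeds.filled recorded for only three
dat <- data.frame(
  spp = rep(LETTERS[1:6], each = 4),
  ped.length = rep(c(10, 20, 30, 40), times = 6),
  seeds.inflorstem = seq(5, by = 2, length.out = 24)
)
dat$seeds.filled <- ifelse(dat$spp %in% c("A", "B", "C"),
                           dat$seeds.inflorstem - 1, NA)

# Long format: one row per observed value, missing values dropped
dat.melt <- melt(dat,
                 id.vars = c("spp", "ped.length"),
                 measure.vars = c("seeds.inflorstem", "seeds.filled"),
                 na.rm = TRUE)

p <- ggplot(dat.melt, aes(x = ped.length, y = value, colour = variable)) +
  geom_point(size = 2) +
  theme_bw() +
  facet_wrap(~spp, scales = "free_y")
p
```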
Using scale_shape_manual() in ggplot, I assigned different shapes to three types of factories (squares, triangles, and circles), which correspond to North, South, and West respectively. Is it possible to have the North/South/West labels in the legend without creating three different data frames for each region? Can I add these labels to the original data frame?
I have one data frame for a plot (as recommended by the ggplot2 book), and with my code below, the default legend lists every row in my data frame, which is repetitive and not what I want.
Basically, I would like to know the best way to label these regions in the plot. The only reason I would like to maintain one data frame is because the code will be easy to use over and over again by just switching the data frame (the benefit of one df mentioned in the ggplot2 book).
I think part of the problem is that I am using scale_shape_manual() to assign values to each point individually. Should I put the North/South/West labels in my data frame and alter my scale_shape_manual() call? If so, what is the best way to accomplish this?
Please let me know if my question is unclear. My code is below, and it replicates my plot as it stands. Thanks.
#Data frame
points <- c(3,5,4,7,12)
bars <- c(.8,1.2,1.4,2.1,4)
points_df<-data.frame(points)
row.names(points_df) <- c( "Factory 1","Factory 2","Factory 3","Factory 4","Factory 5" )
df<-data.frame(Output=points,Errors=bars,lev.names= rownames(points_df))
df$lev.names<-factor(df$lev.names,levels=df$lev.names[order(df$Output)])
# GGPLOT #
library(ggplot2)
library(scales)
p2 <- ggplot(df,aes(lev.names,Output,shape=lev.names))
p2 <- p2 +geom_errorbar(aes(ymin=Output-Errors, ymax=Output+Errors), width=0,color="gray40", lty=1, size=0)
p2 <- p2 + geom_point(aes(size=2))
p2 <- p2 + scale_shape_manual(values=c(6,7,6,1,1))
p2 <- p2 + theme_bw() + xlab(" ") + ylab("Output")
p2 <- p2 + opts(title = expression("Production"))
p2 <- p2+ coord_flip()
print(p2)
Yes, put the location in your data.frame and use it in the aes mapping:
df$location <- c("North","South","North","West","West")
p2 <- ggplot(df,aes(lev.names,Output,shape=location)) +
geom_errorbar(aes(ymin=Output-Errors, ymax=Output+Errors),
width=0,color="gray40", lty=1, size=0) +
geom_point(size=3) +
theme_bw() + xlab(" ") + ylab("Output") +
ggtitle(expression("Production")) +
coord_flip()
print(p2)
I've also fixed some other stuff (e.g., opts is deprecated and you don't want to map size, but to set it).
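If specific shapes per region are still wanted, scale_shape_manual() accepts a named vector keyed by the values of location. The sketch below is one way to do that; the shape codes (0 for an open square, 2 for an open triangle, 1 for an open circle) and the name region_shapes are assumptions, chosen to follow the square/triangle/circle order described in the question.

```r
library(ggplot2)

df <- data.frame(
  lev.names = paste("Factory", 1:5),
  Output = c(3, 5, 4, 7, 12),
  Errors = c(0.8, 1.2, 1.4, 2.1, 4),
  location = c("North", "South", "North", "West", "West")
)
# Order factories by output, as in the question
df$lev.names <- factor(df$lev.names, levels = df$lev.names[order(df$Output)])

# Assumed shape codes: 0 = open square, 2 = open triangle, 1 = open circle
region_shapes <- c(North = 0, South = 2, West = 1)

p2 <- ggplot(df, aes(lev.names, Output)) +
  geom_errorbar(aes(ymin = Output - Errors, ymax = Output + Errors),
                width = 0, colour = "gray40") +
  geom_point(aes(shape = location), size = 3) +
  scale_shape_manual(values = region_shapes) +
  theme_bw() + xlab(" ") + ylab("Output") +
  ggtitle("Production") +
  coord_flip()
print(p2)
```

The legend then has one entry per region, and swapping in a different data frame only requires that it carry a location column with the same values.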