stat_contour() cannot generate contour lines with six decimal data [duplicate] - r

I need to add lines via stat_contour() to my ggplot2 plot. Unfortunately, I cannot share the real data from which the point values should be evaluated. However, another easily reproducible example behaves the same way:
testPts <- data.frame(x=rep(seq(7.08, 7.14, by=0.005), 200))
testPts$y <- runif(length(testPts$x), 50.93, 50.96)
testPts$z <- sin(testPts$y * 500)
ggplot(data=testPts, aes(x=x, y=y, z=z)) + geom_point(aes(colour=z)) +
stat_contour()
This results in the following error message:
Error in if (nrow(layer_data) == 0) return() : argument is of length zero
In addition: Warning message:
Not possible to generate contour data
To me, the example looks no different from others posted on Stack Overflow or in the official manual/tutorial, and it seemingly doesn't matter if I provide more specifications to stat_contour. It seems the function does not pass on the data (layer), as pointed out in the error message.

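Use stat_density2d instead of stat_contour with irregularly spaced data. Note that stat_density2d draws contours of the 2D density of the point locations and does not use z.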
library(ggplot2)
testPts <- data.frame(x=rep(seq(7.08, 7.14, by=0.005), 200))
testPts$y <- runif(length(testPts$x), 50.93, 50.96)
testPts$z <- sin(testPts$y * 500)
(ggplot(data=testPts, aes(x=x, y=y, z=z))
+ geom_point(aes(colour=z))
+ stat_density2d()
)

One solution to this problem is to generate a regular grid and interpolate the point values with respect to that grid. Here is how I did it for just one of multiple data fields:
# interp() is assumed here to be akima::interp(); load it with library(akima)
pts.grid <- interp(as.data.frame(pts)$coords.x1, as.data.frame(pts)$coords.x2, as.data.frame(pts)$GWLEVEL_TI)
pts.grid2 <- expand.grid(x=pts.grid$x, y=pts.grid$y)
pts.grid2$z <- as.vector(pts.grid$z)
The result is a data frame that stat_contour() can use when it is passed via that function's data parameter:
(ggplot(as.data.frame(pts), aes(x=coords.x1, y=coords.x2, z=GWLEVEL_TI))
#+ geom_tile(data=na.omit(pts.grid2), aes(x=x, y=y, z=z, fill=z))
+ stat_contour(data=na.omit(pts.grid2), binwidth=2, colour="red", aes(x=x, y=y, z=z))
+ geom_point()
)
This solution most likely includes unnecessary transformations because I don't know better yet. Furthermore, I must repeat the same grid generation for every data field individually before combining them in a single data frame again, which is not as efficient as I would like for bigger data sets.

You should generate a z for each combination of x and y using expand.grid or outer. For example:
library(ggplot2)
testPts <- transform(expand.grid(x=1:10,y=1:5),z=sin(x*y))
(ggplot(data=testPts, aes(x=x, y=y, z=z))
+ stat_contour()
+ geom_point(aes(colour=z))
)
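To see why the example from the question fails while the one above works, it can help to check whether the data actually form a regular grid. Here is a minimal base R sketch; the helper name is_regular_grid and the two example data frames are made up for illustration and are not part of ggplot2:

```r
# Hypothetical helper: TRUE when every combination of the unique x and
# unique y values occurs in the data, i.e. the points form a regular grid.
is_regular_grid <- function(df) {
  nx <- length(unique(df$x))
  ny <- length(unique(df$y))
  nrow(unique(df[, c("x", "y")])) == nx * ny
}

set.seed(1)

# Data built like in the question: x is regular, y is random
scattered <- data.frame(x = rep(seq(7.08, 7.14, by = 0.005), 200))
scattered$y <- runif(nrow(scattered), 50.93, 50.96)

# Data built with expand.grid: one row per (x, y) combination
gridded <- expand.grid(x = 1:10, y = 1:5)

is_regular_grid(scattered)  # FALSE
is_regular_grid(gridded)    # TRUE
```

If the check returns FALSE, the data would need to be gridded or interpolated first before stat_contour() can use them.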

Related

Weird geom_path behavior

I have noticed an odd behavior in geom_path() in ggplot2. I am not sure whether I am doing something wrong or whether it's a bug.
Here's my data set:
x <- abs(rnorm(10))
y <- abs(rnorm(10)/10)
categs <- c("a","b","c","d","e","f","g","h","i","j")
df <- data.frame(x,y,categs)
I make a plot with points and I join them using geom_path. Works well:
ggplot(df, aes(categs, x, group=1)) + geom_point() + geom_errorbar(aes(ymin=x-y, ymax=x+y)) + geom_path()
However, if I reorder my levels, for instance like this:
df$categs <- factor(df$categs, levels = c("f","i","c","g","e","a","d","h","b","j"))
then geom_path still keeps the original order (although the order of the factor levels has been updated on the x axis).
Any guesses at what I am doing wrong? Thanks.
Order the df rows by df$categs; geom_path goes row by row when plotting:
ggplot(df[ order(df$categs), ], aes(categs, x, group=1)) +
geom_point() +
geom_errorbar(aes(ymin=x-y, ymax=x+y)) +
geom_path()
From ?geom_path manual:
geom_path() connects the observations in the order in which they appear in the data.
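As a small illustration of why order(df$categs) does the job: order() on a factor sorts by the level order, not alphabetically. The vector below is made up for this sketch:

```r
# Four categories with a custom level order
categs <- c("a", "b", "c", "d")
f <- factor(categs, levels = c("d", "b", "a", "c"))

# order() sorts by the position of each value in levels(f)
row_order <- order(f)
row_order                    # 4 2 1 3
as.character(f[row_order])   # "d" "b" "a" "c"
```

Rows reordered this way appear in the data in the same sequence as the categories on the x axis, so the path is drawn from left to right.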

How to add geom_point() to autolayer() line?

I'm trying to add geom_point() markers to an autolayer() line ("fitted" in pic), which is a wrapper part of autoplot() for ggplot2 in Rob Hyndman's forecast package (there's a base autoplot/autolayer in ggplot2 too, so the same likely applies there).
The problem is (I'm no ggplot2 expert, and the autoplot wrapper makes it trickier) that geom_point() applies fine to the main call, but how do I apply something similar to the autolayer (fitted values)?
I tried type="b", as with a normal geom_line(), but it is not a parameter of autolayer().
require(fpp2)
model.ses <- ets(mdeaths, model="ANN", alpha=0.4)
model.ses.fc <- forecast(model.ses, h=5)
forecast::autoplot(mdeaths) +
forecast::autolayer(model.ses.fc$fitted, series="Fitted") + # cannot set to show points, and type="b" not allowed
geom_point() # this works fine against the main autoplot call
This seems to work:
library(forecast)
library(fpp2)
model.ses <- ets(mdeaths, model="ANN", alpha=0.4)
model.ses.fc <- forecast(model.ses, h=5)
# Pre-compute the fitted layer so we can extract the data out of it with
# layer_data()
fitted_layer <- forecast::autolayer(model.ses.fc$fitted, series="Fitted")
fitted_values <- fitted_layer$layer_data()
plt <- forecast::autoplot(mdeaths) +
fitted_layer +
geom_point() +
geom_point(data = fitted_values, aes(x = timeVal, y = seriesVal))
There might be a way to make forecast::autolayer do what you want directly but this solution works. If you want the legend to look right, you'll want to merge the input data and fitted values into a single data.frame.
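A rough sketch of that merging idea, assuming the forecast package is installed. The helper ts_to_df and the column names time, value and series are made up for this example and do not come from forecast:

```r
library(ggplot2)
library(forecast)

model.ses <- ets(mdeaths, model = "ANN", alpha = 0.4)
model.ses.fc <- forecast(model.ses, h = 5)

# Hypothetical helper: turn a ts object into a long data frame
ts_to_df <- function(series, name) {
  data.frame(time = as.numeric(time(series)),
             value = as.numeric(series),
             series = name)
}

# Stack the observed and the fitted values into one data frame
both <- rbind(ts_to_df(mdeaths, "Data"),
              ts_to_df(model.ses.fc$fitted, "Fitted"))

# One line and one set of points per series, with a shared legend
plt_merged <- ggplot(both, aes(x = time, y = value, colour = series)) +
  geom_line() +
  geom_point()
plt_merged
```

This bypasses autoplot()/autolayer() completely, so any forecast-specific styling would have to be added by hand.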

R geom_line not plotting as expected

I am using the following code to plot a stacked area graph and I get the expected plot.
P <- ggplot(DATA2, aes(x=bucket,y=volume, group=model, fill=model,label=volume)) + #ggplot initial parameters
geom_ribbon(position='fill', aes(ymin=0, ymax=1))
But when I add lines that read the same data source, I get misaligned results towards the right side of the graph:
P + geom_line(position='fill', aes(group=model, ymax=1))
Does anyone know why this may be? Both plots read the same data source, so I can't figure out what the problem is.
Actually, if all you wanted to do was draw an outline around the areas, then you could do the same using the colour aesthetic.
ggplot(DATA2, aes(x=bucket,y=volume, group=model, fill=model,label=volume)) +
geom_ribbon(position='fill', aes(ymin=0, ymax=1), colour = "black")
I have an answer that I hope works for you. It looks good, but very different from your original graph:
library(ggplot2)
DATA2 <- read.csv("C:/Users/corcoranbarriosd/Downloads/porsche model volumes.csv", header = TRUE, stringsAsFactors = FALSE)
In my experience you want X to be a numeric variable, and you have it as a string. If that is not the case I can change that, but this will transform your bucket into a numeric vector:
bucket.list <- strsplit(unlist(DATA2$bucket), "[^0-9]+")
x=numeric()
for (i in 1:length(bucket.list)) {
x[i] <- bucket.list[[i]][2]
}
DATA2$bucket <- as.numeric(x)
P <- ggplot(DATA2, aes(x=bucket,y=volume, group=model, fill=model,label=volume)) +
geom_ribbon(aes(ymin=0, ymax=volume))+ geom_line(aes(group=model, ymax=volume))
It gives me the area and the line tracking each other. I hope that's what you needed.
If you switch to using geom_path in place of geom_line, it all seems to work as expected. I don't think the ordering of geom_line is behaving the same as geom_ribbon (and suspect that geom_line -- like geom_area -- assumes a zero base y value)
ggplot(DATA2, aes(x=bucket, y=volume, ymin=0, ymax=1,
group=model, fill=model, label=volume)) +
geom_ribbon(position='fill') +
geom_path(position='fill')

Second layer in ggplot2 is shifted by one

I'm trying to plot a scatter plot with two layers. The reason is that I want the size of each point to represent its number of answers. Then I need a smooth curve laid over it. So I use two datasets to achieve this.
The problem is that when I add the second layer with the smoother using the original dataset, the smoother is shifted by one point to the left on the x scale.
Does anyone know how to correct this in the R code? Is there maybe something wrong with it?
I thought about adding 1 to the x variable, but I don't want to have to go that far.
library(ggplot2)
q.tab <- xtabs(~x + y, mydata)
q.df <- as.data.frame(q.tab)
pointsize <- q.df$Freq
qplot(x, y, data=q.df) + geom_point(aes(size=as.factor(pointsize))) +
geom_smooth(data=mydata, method="loess", span=1)
With ggplot2, when you think in terms of layers it is better to use the ggplot function rather than qplot.
I generate your data (the sample function is very convenient for generating data):
mydata <- data.frame(x=sample(1:10,100,replace=TRUE))
mydata$y <- sample(1:10,100,replace=TRUE)
q.tab <- xtabs(~x + y, mydata)
q.df <- as.data.frame(q.tab)
ggplot version:
library(ggplot2)
ggplot(data=mydata,aes(x,y,size=Freq)) +
geom_point() +
geom_smooth( method="loess", span=1)
qplot version:
qplot(data=mydata,x=x,y=y,size=Freq,geom='point')+
geom_smooth( method="loess", span=1)
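One plausible cause of the shift, sketched below with made-up data: as.data.frame() on an xtabs table turns x and y into factors, so the points sit on a discrete scale at positions 1, 2, 3, ... while the smoother uses the raw numbers. If the x values start at 0, for example, the smoother ends up one step to the left. Converting the factors back to numbers keeps both layers on the same continuous scale. The data here are an assumption, since the original mydata is not shown:

```r
library(ggplot2)

set.seed(42)
# Made-up data whose x values start at 0 (an assumption for this sketch)
mydata <- data.frame(x = sample(0:9, 100, replace = TRUE),
                     y = sample(1:10, 100, replace = TRUE))

q.df <- as.data.frame(xtabs(~ x + y, mydata))
was_factor <- is.factor(q.df$x)   # TRUE: the table dimensions became factors

# Convert the factor labels back to the numbers they stand for
q.df$x <- as.numeric(as.character(q.df$x))
q.df$y <- as.numeric(as.character(q.df$y))

# Drop combinations that never occur so they are not drawn
q.df <- q.df[q.df$Freq > 0, ]

p <- ggplot(q.df, aes(x, y)) +
  geom_point(aes(size = Freq)) +
  geom_smooth(data = mydata, method = "loess", span = 1)
p
```

Note the as.character() step: calling as.numeric() directly on a factor returns the internal level codes rather than the original values.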

facet_grid of back to back histogram failing

I am having some trouble creating a facet grid of a back-to-back histogram created with ggplot.
# create data frame with latency values
latc_sorted <- data.frame(
subject=c(1,1,1,1,1,2,2,2,2,2),
grp=c("K_N","K_I","K_N","K_I","K_N","K_I","K_N","K_I","K_N","K_I"),
lat=c(22,45,18,55,94,11,67,22,64,44)
)
# subset and order data
x.sub_ki<-subset(latc_sorted, grp=="K_I")
x.sub_kn<-subset(latc_sorted, grp=="K_N")
x.sub_k<-rbind(x.sub_ki,x.sub_kn)
x=x.sub_ki$lat
y=x.sub_kn$lat
nm<-list("x","y")
# make absolute values on x axis
my.abs<-function(x){abs(x)}
# plot back-to-back histogram
hist_K<-qplot(x, geom="histogram", fill="inverted", binwidth=20) +
geom_histogram(data=data.frame(x=y), aes(fill="non-inverted", y=-..count..),
binwidth= 20) + scale_y_continuous(formatter='my.abs') + coord_flip() +
scale_fill_hue("variable")
hist_K
This plots fine, but if I try the following I get the error:
Error: Casting formula contains variables not found in molten data: x.sub_k$subject
hist_K_sub<-qplot(x, geom="histogram", fill="inverted", binwidth=20) +
geom_histogram(data=data.frame(x=y), aes(fill="non-inverted", y=-..count..),
binwidth= 20) + scale_y_continuous(formatter='my.abs') + coord_flip() +
scale_fill_hue("variable")+
facet_grid(x.sub_k$subject ~ .)
hist_K_sub
Any ideas what is causing this to fail?
The problem is that the variables referenced in facet_grid are looked for in the data.frames that are passed to the various layers. You have created (implicitly and explicitly) data.frames which have only the lat data and do not have the subject information. If you use x.sub_ki and x.sub_kn instead, they do have the subject variable associated with the lat values.
hist_K_sub <-
ggplot() +
geom_histogram(data=x.sub_ki, aes(x=lat, fill="inverted", y= ..count..), binwidth=20) +
geom_histogram(data=x.sub_kn, aes(x=lat, fill="not inverted", y=-..count..), binwidth=20) +
facet_grid(subject ~ .) +
scale_y_continuous(formatter="my.abs") +
scale_fill_hue("variable") +
coord_flip()
hist_K_sub
I also converted from qplot to full ggplot syntax; that shows the parallel structure of ki and kn better.
The syntax above doesn't work with newer versions of ggplot2; use the following instead to format the axis:
abs_format <- function() {
function(x) abs(x)
}
hist_K_sub <- hist_K_sub+ scale_y_continuous(labels=abs_format())
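For reference, here is a sketch of how the whole plot might look with a recent ggplot2 (3.3.0 or later is assumed), where after_stat(count) replaces ..count.. and the labels argument replaces formatter. The object name hist_modern is made up; the data are the ones from the question:

```r
library(ggplot2)

latc_sorted <- data.frame(
  subject = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  grp = c("K_N", "K_I", "K_N", "K_I", "K_N", "K_I", "K_N", "K_I", "K_N", "K_I"),
  lat = c(22, 45, 18, 55, 94, 11, 67, 22, 64, 44)
)

hist_modern <- ggplot() +
  # K_I counts go upwards
  geom_histogram(data = subset(latc_sorted, grp == "K_I"),
                 aes(x = lat, y = after_stat(count), fill = "inverted"),
                 binwidth = 20) +
  # K_N counts are negated so they go in the opposite direction
  geom_histogram(data = subset(latc_sorted, grp == "K_N"),
                 aes(x = lat, y = -after_stat(count), fill = "not inverted"),
                 binwidth = 20) +
  facet_grid(subject ~ .) +
  # show absolute values on the count axis
  scale_y_continuous(labels = abs) +
  scale_fill_hue("variable") +
  coord_flip()
hist_modern
```

Passing abs directly as the labels function does the same job as the abs_format() wrapper above.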
