R + ggplot: geom_text label does not recognize a variable in function call

I'm an R/ggplot newbie switching over from MATLAB.
I would like to write a function that uses ggplot to print a linear regression equation on the graph (as discussed in Adding Regression Line Equation and R2 on graph), but I haven't been able to get it to work inside a function.
I keep getting this error:
"Error in eval(expr, envir, enclos) : object 'label' not found".
One workaround is to define the "label" variable outside of the function, but I don't understand why the version below doesn't work.
Can anyone explain why?
df <- data.frame(x = c(1:100))
df$y <- 2 + 3 * df$x + rnorm(100, sd = 40)
f <- function(DS, x, y, z) {
label <- z
print(label)
ggplot(DS, aes(x=x, y=y)) +
geom_point() +
labs(y=y) +
labs(title=y) +
xlim(0,5)+
ylim(0,5)+
geom_smooth(method="lm", se=FALSE)+
geom_text (aes(x=1, y=4, label=label))
}
f(df, x, y, "aaa") #execution line

See the following code:
library(ggplot2)
df <- data.frame(x = c(1:100))
df$y <- 2 + 3 * df$x + rnorm(100, sd = 40)
f <- function(DS, x, y, z) {
label.df = data.frame(x=1, y=4, label=z)
ggplot(DS, aes_string(x=x, y=y)) +
geom_point() +
labs(y=y) +
labs(title=y) +
geom_smooth(method="lm", se=FALSE)+
geom_text (aes(x=x, y=y, label=label), label.df)
}
f(df, "x", "y", "aaa")
I made a few fixes to your code:
The data used by geom_text is the same data you defined in ggplot() unless you override it. Here I created a temporary data.frame, label.df, for this purpose.
The xlim() and ylim() calls were filtering out most of your data points, since the ranges of x and y are much larger than the limits (0 to 5) set in the original code.
When you want to pass the names of the data.frame columns to plot, it is easier to pass them as strings (i.e. "x"). To support this, aes() is replaced with aes_string().
Edit
Thanks to @Gregor, a simpler version would be:
f <- function(DS, x, y, z) {
ggplot(DS, aes_string(x=x, y=y)) +
geom_point() +
labs(y=y) +
labs(title=y) +
geom_smooth(method="lm", se=FALSE)+
annotate(geom="text", x=1, y=4, label=z)
}
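As a side note, aes_string() has been soft-deprecated since ggplot2 3.0.0. A minimal sketch of the same function using the .data pronoun instead, assuming a reasonably recent ggplot2; the name f_tidy, the seed, and the explicit formula argument are my own additions and not part of the answer above:

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the example is reproducible
df <- data.frame(x = 1:100)
df$y <- 2 + 3 * df$x + rnorm(100, sd = 40)

# f_tidy is a hypothetical name; x and y are column names given as strings
f_tidy <- function(DS, x, y, z) {
  ggplot(DS, aes(x = .data[[x]], y = .data[[y]])) +  # look columns up by name
    geom_point() +
    labs(x = x, y = y, title = y) +
    geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
    annotate(geom = "text", x = 1, y = 4, label = z)  # z is used as a plain value
}

p <- f_tidy(df, "x", "y", "aaa")
print(p)
```

Because annotate() takes label as an ordinary argument rather than an aesthetic mapping, z never has to be looked up in the plot data.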

Related

Using facet_trelliscope without defining axis limits

Using facet_trelliscope from trelliscopejs this works:
library(trelliscopejs)
library(ggplot2)
x <- 1:10
y <- 1:10
group <- rep(c("A","B"),5)
df <- data.frame(group, x, y)
ggplot(df, aes(x, y)) + geom_point() +
xlim(0, 10) + ylim(0, 10) +
facet_trelliscope(~group)
But this doesn't:
ggplot(df, aes(x, y)) + geom_point() +
facet_trelliscope(~group)
It throws up this error:
Error in scale_fn() : could not find function "scale_fn"
My question is, do axis limits always need to be defined for facet_trelliscope to work?
Thanks, both. It was indeed a package error. If anyone sees this error in the future, just remove and re-install ggplot2.

Change axis breaks without defining sequence - ggplot

Is there any way to set the break step size in ggplot without defining a sequence? For example:
x <- 1:10
y <- 1:10
df <- data.frame(x, y)
# Plot with auto scale
ggplot(df, aes(x,y)) + geom_point()
# Plot with breaks defined by sequence
ggplot(df, aes(x,y)) + geom_point() +
scale_y_continuous(breaks = seq(0,10,1))
# Plot with automatic sequence for breaks
ggplot(df, aes(x,y)) + geom_point() +
scale_y_continuous(breaks = seq(min(df$y),max(df$y),1))
# Does this exist?
ggplot(df, aes(x,y)) + geom_point() +
scale_y_continuous(break_step = 1)
You may say I am being lazy, but there have been a few occasions where I have had to change the min and max limits of my seq due to the addition of error bars. So I just want to say: use a break size of x, with automatic scale limits.
You can define your own function to pass to the breaks argument; ggplot2 calls it with the axis limits and uses the values it returns as breaks. An example that would work in your case is:
f <- function(y) seq(floor(min(y)), ceiling(max(y)))
Then
ggplot(df, aes(x,y)) + geom_point() + scale_y_continuous(breaks = f)
You could modify this to pass the step of the breaks, e.g.
f <- function(k) {
step <- k
function(y) seq(floor(min(y)), ceiling(max(y)), by = step)
}
then
ggplot(df, aes(x,y)) + geom_point() + scale_y_continuous(breaks = f(2))
would create a y-axis with ticks at 2, 4, ..., 10.
You can take this even further by writing your own scale function
my_scale <- function(step = 1, ...) scale_y_continuous(breaks = f(step), ...)
and just call it like
ggplot(df, aes(x,y)) + geom_point() + my_scale()
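To see what such a break function returns, it can help to call it directly on a pair of limits. The sketch below is plain base R; the name step_breaks and the example limits c(0.55, 10.45) are illustrative assumptions (a continuous scale typically pads the data range a little before calling the function):

```r
# Closure factory: returns a function of the scale limits,
# which is the shape that the `breaks =` argument accepts
step_breaks <- function(step = 1) {
  force(step)
  function(limits) seq(floor(min(limits)), ceiling(max(limits)), by = step)
}

# Hypothetical limits, roughly what an axis for data in 1:10 might be padded to
limits <- c(0.55, 10.45)

b1 <- step_breaks(1)(limits)  # every integer from 0 to 11
b2 <- step_breaks(2)(limits)  # 0, 2, 4, 6, 8, 10
print(b1)
print(b2)
```

Candidate breaks that fall outside the final axis limits (0 and 11 here) are not drawn, which would explain why the plot starts its ticks inside the data range.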
> # Does this exist?
> ggplot(df, aes(x,y)) + geom_point() +
> scale_y_continuous(break_step = 1)
If you're looking for an off-the-shelf solution, then you can use the scales::breaks_width() function like so:
scale_y_continuous(breaks = scales::breaks_width(1))
The scales package also includes handy functions to control breaks easily in "special" scales such as date-time, e.g. scale_x_datetime(breaks='6 hours').

Looping through columns with ggplot and modifying geom_hline(yintercept) accordingly

This is an incremental question that refers directly to this topic:
How do I loop through column names and make a ggplot scatteplot for each one
I would like to loop through column names and make a ggplot scatterplot for each one, but each time I want to add a horizontal line whose intercept depends on the values in the column.
So I start from this code:
Y <- rnorm(100)
df <- data.frame(A = rnorm(100), B = runif(100), C = rlnorm(100),
Y = Y)
colNames <- names(df)[1:3]
for(i in colNames){
plt <- ggplot(df, aes_string(x=i, y = Y)) +
geom_point(color="#B20000", size=4, alpha=0.5) +
geom_hline(yintercept=0, size=0.06, color="black") +
geom_smooth(method=lm, alpha=0.25, color="black", fill="black")
print(plt)
Sys.sleep(2)
}
I swap x and y:
aes_string(x=Y, y = i)
and I want to modify this line:
geom_hline(yintercept=0, size=0.06, color="black")
...so that yintercept is not constant, but depends on i,
for example:
geom_hline(yintercept=c(quantile(i, 0.25)))
so that yintercept is always the first quartile of my column.
However, it doesn't work:
Error in (1 - h) * qs[i] :
non-numeric argument to binary operator
I tried different options such as aes_string, paste(), etc., but none of them worked.
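You should call quantile(df[,i], 0.25) instead of quantile(i, 0.25). Inside the loop, i is the column name (a character string), not the column's values, which is why quantile() fails with a non-numeric argument. Your code would be: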
for(i in colNames){
# "Y" is quoted because aes_string() expects column names as strings
plt <- ggplot(df, aes_string(x="Y", y = i)) +
geom_point(color="#B20000", size=4, alpha=0.5) +
geom_hline(yintercept=c(quantile(df[,i], 0.25)))+
geom_smooth(method=lm, alpha=0.25, color="black", fill="black")
print(plt)
Sys.sleep(2)
}
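The difference between the column name and the column values can be checked without any plotting. A minimal base R sketch, assuming the same df layout as in the question; the seed and the name first_quartiles are my own additions:

```r
set.seed(42)  # assumption: fixed seed for reproducibility
df <- data.frame(A = rnorm(100), B = runif(100), C = rlnorm(100),
                 Y = rnorm(100))
colNames <- names(df)[1:3]

i <- colNames[1]
print(is.character(i))      # TRUE: the loop variable is only the column *name*
print(is.numeric(df[[i]]))  # TRUE: indexing the data frame gives the values

# First quartile of each column, computed from the values rather than the name
first_quartiles <- vapply(colNames,
                          function(nm) unname(quantile(df[[nm]], 0.25)),
                          numeric(1))
print(first_quartiles)
```

Each element of first_quartiles is the kind of number geom_hline(yintercept = ...) expects for the corresponding column.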

How can I show different degree polynomial fits in ggplot2 with facet_grid?

I want to use facets (because I like the way they look for this) to show polynomial fits of increasing degree. It's easy enough to plot them separately as follows:
df <- data.frame(x=rep(1:10,each=10),y=rnorm(100))
ggplot(df,aes(x=x,y=y)) + stat_smooth(method="lm",formula=y~poly(x,2))
ggplot(df,aes(x=x,y=y)) + stat_smooth(method="lm",formula=y~poly(x,3))
ggplot(df,aes(x=x,y=y)) + stat_smooth(method="lm",formula=y~poly(x,4))
I know I can always combine them in some fashion using grobs, but I would like to combine them using facet_grid if possible. Maybe something similar to:
poly2 <- df
poly2$degree <- 2
poly3 <- df
poly3$degree <- 3
poly4 <- df
poly4$degree <- 4
polyn <- rbind(poly2,poly3,poly4)
ggplot(polyn,aes(x=x,y=y)) + stat_smooth(method="lm",formula=y~poly(x,degree)) +
facet_grid(degree~.)
This doesn't work, of course, because the faceting does not work on y~poly(x,degree) so that degree gets pulled from the data. Is there some way to make this work?
You can always predict the points manually and then facet quite easily:
library(ggplot2)
## Data
set.seed(0)
df <- data.frame(x=rep(1:10,each=10),y=rnorm(100))
## Get poly fits
dat <- do.call(rbind, lapply(1:4, function(d) {
x <- runif(1000, 0, 10)  # x positions at which to evaluate the fitted curve
data.frame(x=x,
y=predict(lm(y ~ poly(x, d), data=df), newdata=data.frame(x=x)),
degree=d)
}))
ggplot(dat, aes(x, y)) +
geom_point(data=df, aes(x, y), alpha=0.3) +
geom_line(color="steelblue", lwd=1.1) +
facet_grid(~ degree)
To add confidence bands, you can use the option interval='confidence' with predict. You might also be interested in the function ggplot2::fortify to get more fit statistics.
dat <- do.call(rbind, lapply(1:4, function(d) {
x <- seq(0, 10, len=100)
preds <- predict(lm(y ~ poly(x, d), data=df), newdata=data.frame(x=x), interval="confidence")
data.frame(cbind(preds, x=x, degree=d))
}))
ggplot(dat, aes(x, fit)) +
geom_point(data=df, aes(x, y), alpha=0.3) +
geom_line(color="steelblue", lwd=1.1) +
geom_ribbon(aes(x=x, ymin=lwr, ymax=upr), alpha=0.3) +
facet_grid(~ degree)
I have a very ugly solution, in which the plot is faceted and the fits are plotted for the appropriate subsets of the data:
p1 <- ggplot(polyn,aes(x=x,y=y)) + facet_grid(.~degree)
p1 +
stat_smooth(data=polyn[polyn$degree==2,],formula=y~poly(x,2),method="lm") +
stat_smooth(data=polyn[polyn$degree==3,],formula=y~poly(x,3),method="lm") +
stat_smooth(data=polyn[polyn$degree==4,],formula=y~poly(x,4),method="lm")
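The same one-layer-per-facet idea can be written without repeating the stat_smooth() call by building the layers in a loop and adding them as a list. This is only a sketch under my own assumptions: the names degrees and smooth_layers are hypothetical, and polyn is rebuilt here so the example is self-contained:

```r
library(ggplot2)

set.seed(0)
df <- data.frame(x = rep(1:10, each = 10), y = rnorm(100))
degrees <- 2:4

# One copy of the data per degree, tagged with the degree used for facetting
polyn <- do.call(rbind, lapply(degrees, function(d) cbind(df, degree = d)))

# One stat_smooth layer per degree, each restricted to its own facet's rows
smooth_layers <- lapply(degrees, function(d) {
  force(d)  # make sure each layer keeps its own degree
  stat_smooth(data = polyn[polyn$degree == d, ],
              formula = y ~ poly(x, d), method = "lm")
})

# A list of layers can be added to a plot with `+`
p <- ggplot(polyn, aes(x = x, y = y)) +
  facet_grid(. ~ degree) +
  smooth_layers
print(p)
```

Changing degrees is then enough to add or remove facets, at the cost of still fitting each model in a separate layer.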

Error in trying to write a plotting function in ggplot2

I am trying to write a function in ggplot2 and obtain this error message:
Error in layout_base(data, vars, drop = drop) :
At least one layer must contain all variables used for facetting
Here is my code:
growth.plot<-function(data,x,y,fac){
gp <- ggplot(data = data,aes(x = x, y = y))
gp <- gp + geom_point() + facet_wrap(~ fac)
return(gp)
}
growth.plot(data=mydata, x=x.var, y=y.var,fac= fac.var)
If I try without the function, the plot appears perfectly:
gp1 <- ggplot(data = mydata,aes(x = x.var, y = y.var))
gp1+ geom_point()+ facet_wrap(~ fac.var) # this works
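Inside your function, facet_wrap(~ fac) looks for a column literally named fac, which none of the layers' data contains, hence the error. Here is a reproducible solution in which your x, y, and fac arguments must be passed as character strings: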
library(ggplot2)
make_plot = function(data, x, y, fac) {
p = ggplot(data, aes_string(x=x, y=y, colour=fac)) +
geom_point(size=3) +
facet_wrap(as.formula(paste("~", fac)))
return(p)
}
p = make_plot(iris, x="Sepal.Length", y="Petal.Length", fac="Species")
ggsave("iris_plot.png", plot=p, height=4, width=8, dpi=120)
Thanks to commenters @Roland and @aosmith for pointing the way to this solution.
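For facetting specifically, a recent ggplot2 also allows the facet variable to be given as a string through vars() and the .data pronoun, which avoids building a formula with paste(). A hedged sketch, assuming ggplot2 3.0.0 or later; make_plot_tidy is a hypothetical name and this is not the code from the answer above:

```r
library(ggplot2)

# make_plot_tidy is a hypothetical variant; x, y and fac are column names as strings
make_plot_tidy <- function(data, x, y, fac) {
  ggplot(data, aes(x = .data[[x]], y = .data[[y]], colour = .data[[fac]])) +
    geom_point(size = 3) +
    facet_wrap(vars(.data[[fac]])) +  # facet by the column named in fac
    labs(x = x, y = y, colour = fac)  # restore readable axis and legend titles
}

p <- make_plot_tidy(iris, x = "Sepal.Length", y = "Petal.Length", fac = "Species")
print(p)
```

With iris this should produce one panel per species, the same layout as the as.formula() version.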
