Name aesthetics using ".resid" syntax in ggplot2 - r

Someone pointed out to me that there's a different way to specify the data and the aesthetics in ggplot2 as below. I've never seen this -- in all the books, docs, data is always a data frame and inside aes are the names of the variables. What's this dot syntax?
y <- rnorm(100) ; x <- rnorm(100)
m <- lm(y ~ x)
library(ggplot2)
ggplot(data = m, aes(.resid, .fitted)) + geom_point()

ggplot is calling fortify on the lm object, which produces a dataframe that is then passed to ggplot.data.frame.
To see the code use
ggplot2:::ggplot.default
#function (data = NULL, mapping = aes(), ..., environment = parent.frame())
#{
# ggplot.data.frame(fortify(data, ...), mapping, environment = environment)
#}
#<environment: namespace:ggplot2>
As for fortify, it coerces various models and R objects to data frames. Have a look at methods(fortify).
You can directly see the results of fortify
ff <- fortify(m)
names(ff)
#[1] "y" "x" ".hat" ".sigma" ".cooksd" ".fitted" ".resid" ".stdresid"
So the dot isn't doing anything clever within aes, but is actually part of the column names that fortify produces.
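To illustrate that the leading dot is just part of an ordinary column name, here is a minimal sketch that builds a similar data frame by hand with base R. The name `manual` is made up for this example, and this is not how fortify works internally; it only mimics two of its columns.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
y <- rnorm(100); x <- rnorm(100)
m <- lm(y ~ x)

# Hand-built stand-in for two of the columns that fortify(m) returns.
# A name such as .fitted is a legal R name, so no special syntax is involved.
manual <- data.frame(.fitted = fitted(m), .resid = resid(m))

names(manual)
nrow(manual)
```

Passing `manual` to ggplot(manual, aes(.resid, .fitted)) + geom_point() should then give the same kind of scatter plot as the call in the question.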

Related

How to make my custom `$<-` method honor `invisible()`

I am documenting my research in rmarkdown workbooks but want to also save my ggplots into a variable to reconfigure the plots for other cases, for example resize them for presentations.
First, I had a function that prints my plot and saves it into a global variable:
PLOTS <- list()
`%<p%` <- function(name, ggplot){
PLOTS[[name]] <<- ggplot
print(ggplot)
return(invisible(NULL)) # returning ggplot would be better (e.g. if used with %>%), but this is simpler for the question
}
# Plots and saves
"testplot" %<p% qplot(x = Sepal.Length, y = Sepal.Width, data = iris)
Soon I ran into the problem that I wanted to save one version of the plot but display another, for example save the plot and then show two versions with different x-axis limits in my workbook. So I made the function recognize invisible():
`%<p%`<- function(name, ggplot){
v <- withVisible(ggplot)$visible
if(v) print(ggplot)
PLOTS[[name]] <<- ggplot
return(invisible(NULL))
}
# Save full plot but print only x between 4 and 5
"testplot" %<p% invisible(qplot(x = Sepal.Length, y = Sepal.Width, data = iris))
PLOTS$testplot + coord_cartesian(xlim = c(4,5))
Works beautifully. Then I got really lazy and wanted to use the RStudio shortcut for <- instead of typing the unwieldy special characters of %<p%, so I thought of making this function an implementation of the $<- generic for a new class "plotlist". I then create a list with this new class, so that `$<-.plotlist`() is invoked whenever an element is assigned to it.
PLOTS2 <- structure(list(), class = "plotlist")
`$<-.plotlist` <- function(x, name, value){
v <- withVisible(value)$visible
if(v) print(value)
NextMethod()
}
But now things get strange, because invisible() no longer works. For example, this renders the ggplot:
PLOTS2$test <- invisible(qplot(x = Sepal.Length, y = Sepal.Width, data = iris))
On the other hand, if I hadn't used the custom $<- implementation, it wouldn't even print by default, for example if I use my ordinary list without the special class "plotlist" to store my plot!
PLOTS$test <- qplot(x = Sepal.Length, y = Sepal.Width, data = iris)
How is this? I do not really know how invisible() works, and the manuals of invisible() and withVisible() only say the object stays invisible "for a while". What determines how long an object stays invisible, why does it not stay invisible in my custom $<- implementation, and can I make it honor invisible()?
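As a small reference point for the question, withVisible() reports whether evaluating an expression left R's visibility flag switched on. The sketch below covers only direct calls at top level; it does not by itself explain what happens inside a `$<-` method. The names `plain` and `hidden` are made up for this example.

```r
# withVisible() returns a list holding the value and the visibility flag.
plain  <- withVisible(1 + 1)
hidden <- withVisible(invisible(1 + 1))

plain$visible    # TRUE
hidden$visible   # FALSE
hidden$value     # 2
```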

passing arguments to geom_point2 with mapply

My objective is to pass lists as arguments to the function geom_point2 using lapply or, analogously, mapply. In similar situations, I have had success passing a list (or lists) to geom_cladelabel, as in:
mapply(function (x,y,z,w,v,u,t,s) geom_cladelabel(node=x, label=y,
align=F, etc. # Where x y z etc are lists.
The problem is related to the use of aes inside geom_point2 (which geom_cladelabel does not need).
In the case of geom_point2, the node info is inside aes, and I couldn't do it. Normally I do not get any error message, but it does not work.
The objective is to make this example work, but using mapply instead of writing geom_point2 twice.
# source("https://bioconductor.org/biocLite.R")
# biocLite("ggtree")
library(ggtree)
library(ape)
#standard code
newstree<-rtree(10)
node1<-getMRCA(newstree,c(1,2))
node2<-getMRCA(newstree,c(3,4))
ggtree(newstree)+
geom_point2(aes(subset=(node ==node1) ), fill="black", size=3, shape=23)+
geom_point2(aes(subset=(node ==node2) ), fill="green", size=3, shape=23)
#desire to substitute the geom_point2 layers with mapply or lapply:
#lapply(c(node1,node2), function (x) geom_point2(aes(subset=(node=x)))))
Here is a solution calling geom_point2 using mapply:
library(ggtree)
ggtree(rtree(10)) +
mapply(function(x, y, z)
geom_point2(
aes_string(subset=paste("node ==", x)),
fill=y,
size=10,
shape=z
),
x=c(11,12),
y=c("green", "firebrick"),
z=c(23,24)
) +
geom_text2(aes(subset=!isTip, label=node))
The solution is in the aes_string(), which writes the value of x directly into the aesthetics. The default aes() does not pass on the value of x, only the unevaluated symbol x. When plotting, ggtree then looks for something called x instead of the intended node number, and ends up with an empty node list.
I guess this has to do with the variable being stored in the mapply-environment and not being passed on to the plot.
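The difference between the two approaches can be sketched in base R, without ggtree. This is only an illustration: `captured`, `exprs`, `parsed`, `hits` and the data frame `d` are made up for this example, and quote() is used as a stand-in for the way an aesthetic expression is stored unevaluated.

```r
# A stand-in for an expression stored unevaluated: it still refers to x.
captured <- quote(node == x)
all.vars(captured)            # "node" "x"

# What the paste() approach builds: the value is written into the text.
nodes <- c(11, 12)
exprs <- vapply(nodes, function(x) paste("node ==", x), character(1))
exprs                         # "node == 11" "node == 12"

# Parsed back, the expressions depend only on the column `node`.
parsed <- lapply(exprs, function(s) parse(text = s)[[1]])
all.vars(parsed[[1]])         # "node"

# Evaluate against a small made-up data frame standing in for the tree data.
d <- data.frame(node = 10:13)
hits <- vapply(parsed, function(e) sum(eval(e, d)), integer(1))
hits                          # 1 1
```

Because the node number is already part of the expression text, nothing needs to be looked up in the mapply environment when the plot is drawn.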
PS: Sorry for my too quick answer with do.call() earlier. It is useful, but off-topic here.

how do deparse & substitute work to allow access to an objects name?

My question is regarding the following code:
myfunc <- function(v1) {
deparse(substitute(v1))
}
myfunc(foo)
[1] "foo"
I typed in ?deparse and ?substitute into R and obtained the following:
deparse = Turn unevaluated expressions into character strings.
and
substitute = returns the parse tree for the (unevaluated) expression expr,
substituting any variables bound in env.
I don't seem to really understand this language. Would someone be able to simplify the technical aspect of these descriptions so that I could begin to appreciate how these two functions work together to allow us to do something cool like access the variable name of an object?
I struggle(d) with this too. The myplot() example from ?substitute is helpful. There, they define:
myplot <- function(x, y)
plot(x, y, xlab = deparse(substitute(x)),
ylab = deparse(substitute(y)))
calling
myplot(x=1:10, y = rnorm(10))
gives a plot whose axis labels are the literal expressions 1:10 and rnorm(10) (image not preserved),
whereas the alternative
x = 1:10
y = rnorm(10)
plot(x, y, xlab = x, ylab = y)
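gives a plot whose axis labels are taken from the values of x and y rather than from the expressions (image not preserved).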
Hopefully this shows what deparse(substitute()) is used for. In the plot version, the xlab and ylab arguments are the outputs of whatever was used to generate x and y. myplot knows to pass "character string versions of the actual arguments to the function" for xlab and ylab. (quotes from ?substitute)
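The two steps can also be looked at separately. In this minimal sketch, `show_name` is a hypothetical helper written for this example: substitute() returns the unevaluated expression that was typed for the argument, and deparse() turns that expression into text.

```r
show_name <- function(v1) {
  expr <- substitute(v1)        # the expression as typed, not its value
  list(class = class(expr), text = deparse(expr))
}

# foo does not need to exist, because the argument is never evaluated.
res1 <- show_name(foo)
res1$class    # "name"
res1$text     # "foo"

# With a call as the argument, the whole call is captured.
res2 <- show_name(1:10)
res2$class    # "call"
res2$text     # "1:10"
```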

Pass function argument to ggplot label [duplicate]

I need to wrap ggplot2 into another function, and want to be able to pass variables in the same manner that ggplot2 accepts them. Can someone steer me in the correct direction?
Let's say, for example, we consider the below MWE.
#Load Required libraries.
library(ggplot2)
##My Wrapper Function.
mywrapper <- function(data,xcol,ycol,colorVar){
writeLines("This is my wrapper")
plot <- ggplot(data=data,aes(x=xcol,y=ycol,color=colorVar)) + geom_point()
print(plot)
return(plot)
}
Dummy Data:
##Demo Data
myData <- data.frame(x=0,y=0,c="Color Series")
Existing Usage which executes without hassle:
##Example of Original Function Usage, which executes as expected
plot <- ggplot(data=myData,aes(x=x,y=y,color=c)) + geom_point()
print(plot)
Objective usage syntax:
##Example of Intended Usage, which Throws Error ----- "object 'xcol' not found"
mywrapper(data=myData,xcol=x,ycol=y,colorVar=c)
The above gives an example of the 'original' usage by the ggplot2 package, and, how I would like to wrap it up in another function. The wrapper however, throws an error.
I am sure this applies to many other applications, and it has probably been answered a thousand times, however, I am not sure what this subject is 'called' within R.
The problem here is that ggplot looks for a column named xcol in the data object. I would recommend switching to aes_string and passing the names of the columns you want to map as strings, e.g.:
mywrapper(data = myData, xcol = "x", ycol = "y", colorVar = "c")
And modify your wrapper accordingly:
mywrapper <- function(data, xcol, ycol, colorVar) {
writeLines("This is my wrapper")
plot <- ggplot(data = data, aes_string(x = xcol, y = ycol, color = colorVar)) + geom_point()
print(plot)
return(plot)
}
Some remarks:
Personal preference, I use a lot of spaces around e.g. x = 1, for me this greatly improves the readability. Without spaces the code looks like a big block.
If you return the plot to outside the function, I would not print it inside the function, but just outside the function.
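If the unquoted calling style from the question (xcol = x) is preferred, recent ggplot2 releases (3.0.0 and later, together with rlang 0.4.0 or later) support the embrace operator {{ }} inside aes(), and aes_string() is soft-deprecated there. The following is a sketch under that version assumption, not part of the original answer; `mywrapper2` is a name made up here.

```r
library(ggplot2)

# Hypothetical variant of the wrapper: {{ }} forwards the unquoted
# column names given by the caller into aes().
mywrapper2 <- function(data, xcol, ycol, colorVar) {
  ggplot(data, aes(x = {{ xcol }}, y = {{ ycol }}, color = {{ colorVar }})) +
    geom_point()
}

myData <- data.frame(x = 0, y = 0, c = "Color Series")

# Columns are passed bare, as in the intended usage from the question.
p <- mywrapper2(myData, xcol = x, ycol = y, colorVar = c)
```

The plot is returned rather than printed inside the function, in line with the remark above, so the caller decides when to call print(p).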
This is just an addition to the original answer (I do know that this is quite an old post):
The original answer provides the following code to execute the wrapper:
mywrapper(data = "myData", xcol = "x", ycol = "y", colorVar = "c")
Here, data is provided as a character string. To my knowledge this will not execute correctly. Only the variables within the aes_string are provided as character strings, while the data object is passed to the wrapper as an object.

R : pass Graph as parameter to a function

I have a decent-looking graph, which I plotted using
r <- ggplot(data=data2.Gurgaon,aes(x=createdDate,y=count))+geom_point()
Now I want to highlight a few points on the graph, say 500, 1000, 5000, etc.
So I am trying to write a function to which I can pass the point I want to mark.
Below is the function I have written:
graphPoint <- function(graph,point) {
g <- graph
g <- g+geom_point(aes(x=createdDate[point],y=count[point]),pch=1,size=8,col='black')
g <- g+ geom_point(aes(x=createdDate[point],y=count[point]),pch=16,size=5,col='red')
g
}
When I pass the parameters
r <- graphPoint(r,500)
I get this error:
Error in lapply(X = x, FUN = "[", ..., drop = drop) :
object 'point' not found
I am not that great with R. I hope it's possible, but I am missing some small point. Thanks.
This is actually an extremely subtle (and annoying...) problem in ggplot, although not a bug. The aes(...) function evaluates all symbols first in the context of the default dataset (i.e. it looks for columns with that name) and, if that fails, in the global environment. It does not move up the calling chain, as you might justifiably expect it to. So in your case the symbol point is first evaluated in the context of data2.Gurgaon. Since there is no such column, it looks for point in the global environment, but not in the context of your graphPoint(...) function. Here is a demonstration:
df <- mtcars
library(ggplot2)
graphPoint <- function(graph,point) {
g <- graph
g <- g + geom_point(aes(x=wt[point],y=mpg[point]),pch=1,size=8,col='black')
g <- g + geom_point(aes(x=wt[point],y=mpg[point]),pch=16,size=5,col='red')
g
}
ggp <- ggplot(df, aes(x=wt, y=mpg)) + geom_point()
point=10
graphPoint(ggp, 10)
This works because I defined point in the global environment; the point variable inside the function is ignored (you can demonstrate this by calling the function with something other than 10: you'll get the same plot).
The correct way around this is by subsetting the data=... argument, as shown in the other answer.
You cannot select a subset of the data within the aesthetics part of a ggplot function, as you are trying to do. However you can achieve this by extracting the original data from the ggplot object, subsetting it and using the subset in the rest of the function.
r <- ggplot(data=mtcars,aes(x=cyl,y=drat))+geom_point()
graphPoint <- function(graph,point) {
g <- graph
data_subset <- g$data[point, ]
g <- g+geom_point(data = data_subset,
aes(x=cyl,y=drat),pch=1,size=8,col='black')
g <- g+ geom_point(data = data_subset,
aes(x=cyl,y=drat),pch=16,size=5,col='red')
g
}
graphPoint(r, point = 2)
PS for upcoming posts I would advise you to make a reproducible example by using data that is generally accessible, like the mtcars data. This would make it easier to help you out.
