Difficulty understanding the ggplotly interface for custom geoms - r

I've had some success extending ggflags to use SVGs, and now I'm trying to make this extension play nice with ggplotly(). The Plotly documentation has a nice section on translating custom geoms, and it essentially comes down to either providing a to_basic method for your geom or a geom2trace method.
I'm gonna be honest: I'm outta my depth here, and I'm trying to cobble something together based on the two named examples: ggmosaic and ggalt. But something simple like:
#' @name plotly_helpers
#' @title Plotly helpers
#' @description Helper functions to make it easier to automatically create plotly charts
#' @importFrom plotly to_basic
#' @export
to_basic.GeomFlag <- getFromNamespace("to_basic.GeomPoint", asNamespace("plotly"))
Gives me an error when running devtools::document():
object 'to_basic.GeomPoint' not found
If I change it to GeomLine, the package documents and installs, but a simple plot displays like this:
df = data_frame(
f = rep(1:5, each = 4),
x = rep(1:4, times = 5),
y = c(1:2, 2:1, 2:1, 1:2, 1:2, 2:1, 2:1, 1:2, 2:1, 1:2),
s = c(1:4, 4:1, 1:4, 4:1, 1:4),
c = rep(c('au', 'us', 'gb', 'de'), 5))
p1 = ggplot(df) + geom_flag(aes(x = x, y = y, size = s, frame = f, country = c)) + scale_country()
ggplotly(p1)
Obviously GeomLine isn't right (GeomPoint seems like a more natural fit for this extension), but it could simply be that I can't reduce it to a basic geom and need to implement geom2trace instead. I don't really have a great idea of what needs to go in or out of either method. Can anyone help? Are there other examples of ggplot2 extensions that implement the Plotly interfaces?
EDIT: I've had a look at some of the built-in examples of geom2trace, but I'm not sure I can go much further here. I was kind of hoping I might be able to substitute custom grobs for the GeomPoint points, like with grid, but it seems like the Plotly API expects just basic point attributes.
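For what it's worth, the to_basic route can be sketched without reaching into plotly's namespace. This is a minimal sketch, assuming (as the ggalt and ggmosaic sources appear to do) that a to_basic method receives the layer data and hands it back carrying the class of a geom plotly already knows how to draw. prefix_class is a hypothetical local helper and the argument list is an assumption, not something confirmed from the plotly documentation. Flags would come out as ordinary markers, not SVGs.

```r
# Hypothetical local helper: put the name of a built-in geom in front of the
# existing class vector so plotly's own translation for that geom is used.
prefix_class <- function(x, cls) {
  structure(x, class = unique(c(cls, class(x))))
}

# Sketch of an S3 method for plotly's to_basic() generic. The argument list
# mirrors the one used by other extension packages; treat it as an assumption.
to_basic.GeomFlag <- function(data, prestats_data, layout, params, p, ...) {
  prefix_class(data, "GeomPoint")
}

# Stand-in for the layer data that ggplotly() would pass to the method.
layer_data <- structure(
  data.frame(x = 1:3, y = c(2, 1, 2), size = 1:3),
  class = c("GeomFlag", "data.frame")
)
basic <- to_basic.GeomFlag(layer_data)
class(basic)
```

In a package, the method would still need the roxygen tags from the question (@importFrom plotly to_basic and @export) so that it is registered against plotly's generic.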

Related

Run points() after plot() on a dataframe

I'm new to R and want to plot specific points over an existing plot. I'm using the swiss data frame, which I visualize through the plot(swiss) function.
After this, I want to add the outliers given by the Mahalanobis distance:
mu_hat <- apply(swiss, 2, mean); sigma_hat <- cov(swiss)
mahalanobis_distance <- mahalanobis(swiss, mu_hat, sigma_hat)
outliers <- swiss[names(mahalanobis_distance[mahalanobis_distance > 10]),]
points(outliers, pch = 'x', col = 'red')
but this last line has no effect: the outlier points aren't added to the previous plot. If I repeat this procedure on a pair of variables, say
plot(swiss[2:3])
points(outliers[2:3], pch = 'x', col = 'red')
the red points are added to the plot.
Ask: is there any restriction to how the points() function can be used for a multivariate data frame?
Here's a solution using GGally::ggpairs. It's a little ugly as we need to modify the ggally_points function to specify the desired color scheme.
I've assumed that mu_hat = colMeans(swiss) and sigma_hat = cov(swiss).
library(dplyr)
library(GGally)
swiss %>%
bind_cols(distance = mahalanobis(swiss, colMeans(swiss), cov(swiss))) %>%
mutate(is_outlier = ifelse(distance > 10, "yes", "no")) %>%
ggpairs(columns = 1:6,
mapping = aes(color = is_outlier),
upper = list(continuous = function(data, mapping, ...) {
ggally_points(data = data, mapping = mapping) +
scale_colour_manual(values = c("black", "red"))
}),
lower = list(continuous = function(data, mapping, ...) {
ggally_points(data = data, mapping = mapping) +
scale_colour_manual(values = c("black", "red"))
}),
axisLabels = "internal")
Unfortunately this isn't possible the way you're currently doing things. When you plot a data frame, R produces many plots and aligns them. What you're actually seeing is 6 by 6 = 36 individual panels that have been arranged in a grid.
When you call points(), it draws on the current plot, which doesn't really make sense when you have 36 panels, at least not in the way you want it to.
ggplot2 is a really powerful tool in R and gives you far greater customizability. For example, you could add a column to the data frame that labels your outliers as "outlier" and use it in each plot that you have set up as facets. The more you explore it, the more likely you are to find other plots that suit your needs as well.
Plotting a data frame in base R is a good exploratory tool. You could set up those outliers as a separate data frame and plot it, so you can see each of the 6 by 6 plots side by side and compare. It all depends on your goal. If your goal is to produce exactly what you've described, the ggplot2 package will help you create something more professional. As @Gregor suggested in the comments, looking up the function ggpairs from the GGally package would be a good place to start.
A quick google image search shows some funky plots akin to what you're after and then some!
Find it here
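If you would rather stay in base R, one way around the "current plot" problem described above is to let pairs() (which is what plot() calls for a data frame) do the marking itself. A minimal sketch, assuming the same cutoff of 10 as in the question:

```r
# Squared Mahalanobis distance of every row from the column means.
d2 <- mahalanobis(swiss, colMeans(swiss), cov(swiss))
is_outlier <- d2 > 10  # same cutoff as in the question

# pairs() forwards col and pch to every panel, so the outliers are marked in
# each of the scatterplots without a separate points() call.
pairs(swiss,
      col = ifelse(is_outlier, "red", "black"),
      pch = ifelse(is_outlier, 4, 1))
```

The colour and symbol vectors have one entry per row of swiss, so the same observations are highlighted in every panel.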

ggplot2 equivalent of 'factorization or categorization' in googleVis in R

Because ggplot produces static graphs, we are moving our graphs to googleVis for interactive charts. But when it comes to categorization we are running into many problems. Here is an example that should help you understand:
#dataframe
df = data.frame( x = sample(1:100), y = sample(1:100), cat = sample(c('a','b','c'), 100, replace=TRUE) )
ggplot2 provides parameters like alpha, colour, linetype and size, which we can map to categories as shown below:
ggplot(df) + geom_line(aes(x = x, y = y, colour = cat))
This is not limited to line charts: the majority of ggplot2 graphs support categorization based on column values. I would like to do the same in googleVis, so that parameters change, or lines and charts are grouped, based on the value of df$cat.
Note:
I have already tried dcast to make multiple columns based on the category column and use those columns as Y input, but that is not what I would like to do.
Can anyone help me regarding this?
Let me know if you need more information.
vrajs5 you are not alone! We struggled with this issue. In our case we wanted to fill bar charts like in ggplot. This is the solution. You need to add specifically named columns, linked to your variables, to your data table for googleVis to pick up.
In my fill example, these are called roles, but once you see my syntax you can abstract it to annotations and other cool features. Google has them all documented here (check out the superheroes example!) but it was not obvious how it applied to R.
@mages has this documented on this webpage, which shows features not in demo(googleVis):
http://cran.r-project.org/web/packages/googleVis/vignettes/Using_Roles_via_googleVis.html
EXAMPLE ADDING NEW DIMENSIONS TO GOOGLEVIS CHARTS
# in this case
# How do we fill a bar chart showing bars depend on another variable?
# We wanted to show C in a different fill to other assets
suppressPackageStartupMessages(library(googleVis))
library(data.table) # You can use data frames if you don't like DT
test.dt = data.table(px = c("A","B","C"), py = c(1,4,9),
"py.style" = c('silver', 'silver', 'gold'))
# Add your modifier to your chart as a new variable e.g. py.style
test <- gvisBarChart(test.dt,
xvar = "px",
yvar = c("py", "py.style"),
options = list(legend = 'none'))
plot(test)
We have shown py.style deterministically here, but you could code it to be dependent on your categories.
The secret is myvar.googleVis_thing_youneed linking the variable myvar to the googleVis feature.
RESULT BEFORE FILL (yvar = "py")
RESULT AFTER FILL (yvar = c("py", "py.style"))
Take a look at mages' examples (code also on GitHub) and you will have cracked the "categorization based on column values" issue.
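To make the style column depend on a category instead of typing it out by hand, something along these lines may work. It is only a sketch: the bars data frame, its cat column, and the fills lookup are all made up for illustration, and the chart call is the same gvisBarChart pattern as in the example above.

```r
# Hypothetical data: one bar per asset plus a category column.
bars <- data.frame(px = c("A", "B", "C", "D"),
                   py = c(1, 4, 9, 16),
                   cat = c("a", "b", "c", "a"),
                   stringsAsFactors = FALSE)

# Hypothetical lookup from category to fill colour.
fills <- c(a = "silver", b = "#b87333", c = "gold")

# The ".style" suffix ties the new column to the "py" series as a style role.
bars$py.style <- unname(fills[bars$cat])

# Only build the chart if googleVis is installed.
if (requireNamespace("googleVis", quietly = TRUE)) {
  chart <- googleVis::gvisBarChart(bars,
                                   xvar = "px",
                                   yvar = c("py", "py.style"),
                                   options = list(legend = "none"))
  # plot(chart) would open the chart in a browser.
}
```

Keeping the lookup as a named vector means a new category only needs one more entry in fills.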

How to draw loess estimation in GGally using ggpairs?

I tried the GGally package a little, especially the ggpairs function. However, I cannot figure out how to use loess instead of lm when plotting the smooth. Any ideas?
Here is my code:
require(GGally)
diamonds.samp <- diamonds[sample(1:dim(diamonds)[1],200),]
ggpairs(diamonds.samp[,c(1,5)],
lower = list(continuous = "smooth"),
params = c(method = "loess"),
axisLabels = "show")
Thanks!
P.S. Compared with the plotmatrix function, ggpairs is much, much slower... As a result, most of the time I just use plotmatrix from ggplot2.
Often it is best to write your own function for it to use. Adapted from this answer to a similar question.
library(GGally)
diamonds_sample = diamonds[sample(1:dim(diamonds)[1],200),]
# Function to return points and geom_smooth
# allow for the method to be changed
custom_function = function(data, mapping, method = "loess", ...){
p = ggplot(data = data, mapping = mapping) +
geom_point() +
geom_smooth(method=method, ...)
p
}
# test it
ggpairs(diamonds_sample,
lower = list(continuous = custom_function)
)
Produces this:
Well the documentation doesn't say, so use the source, Luke
You can dig deeper into the source with:
ls('package:GGally')
GGally::ggpairs
... and browse every function it references ...
It seems the args get mapped into ggpairsPlots and then passed on to plotMatrix, which then gets called.
So apparently selecting the smoother is not explicitly supported; you can only select continuous = "smooth". If it behaves like ggplot2::geom_smooth, it internally figures out which of the supported smoothers to call (loess for <1000 datapoints, gam for >=1000).
You might like to step it through the debugger to see what's happening inside your plot. I tried to follow the source but my eyes glaze over.
or 2. Browse on https://github.com/ggobi/ggally/blob/master/R/ggpairs.r [4/14/2013]
#' upper and lower are lists that may contain the variables 'continuous',
#' 'combo' and 'discrete'. Each element of the list is a string implementing
#' the following options: continuous = exactly one of ('points', 'smooth',
#' 'density', 'cor', 'blank') , ...
#'
#' diag is a list that may only contain the variables 'continuous' and 'discrete'.
#' Each element of the diag list is a string implmenting the following options:
#' continuous = exactly one of ('density', 'bar', 'blank');

Plotting three densities on the same graph in different line patterns with titles etc

I am very, very new to R so please forgive the basic nature of my question. In short, I have done a lot of Google searching to try to answer this, but I find that even the basic guides available, and simple discussions on forums are assuming more prior knowledge than I have, especially when it comes to outlining what all of the coding terms are and what changing them means for a plot.
In short I have a tab formatted table with three columns of data that I wish to plot densities for on a single graph. I would like the lines to be different patterns (dotted, dashed etc. whatever makes it easy to tell them apart, I cannot use colours as my supervisor is colour blind).
I have code that reads in the data and makes accessible the columns I am interested in:
mydata <- read.table("c:/Users/Demon/Desktop/Thesis/Fst_all_genome.txt", header=TRUE,
sep="\t")
fstdata <- data.frame(Fst_ceu_mkk =rnorm(10),
Fst_ceu_yri =rnorm(10),
Fst_mkk_yri =rnorm(10))
Where do I go from here?
Appendix A of 'An Introduction to R' has a nice walkthrough tutorial you can do in ten minutes; it teaches among other things about line types etc
After that, plotting densities was explained dozens of times here too; search in the search box above for eg '[r] density'. There is also the R Graph Gallery (possibly down right now) and more.
A nice, free guide I often recommend is John Verzani's simpleR which stresses graphs a lot and will teach you what you need here.
Two options for you to explore using high-level graphics.
# dummy data
d = data.frame(x = rnorm(10), y = rnorm(10), z = rnorm(10))
You first need to reshape the data from wide to long format,
require(reshape2)
m = melt(d)
ggplot2 graphics
require(ggplot2)
ggplot(data = m, mapping = aes(x = value, linetype = variable)) +
geom_line(stat = "density")
Lattice graphics
Using the same melt()ed data,
require(lattice)
densityplot( ~ value, data = m, groups = variable,
auto.key = TRUE, par.settings = col.whitebg())
If you need something very simple, you could do simply:
plot(density(mydata$col_1))
lines(density(mydata$col_2), lty = 2)
lines(density(mydata$col_3), lty = 3)
If the second and third density curves are far away from the first, you'll need to define the x and y limits of the plotting region explicitly:
dens1 <- density(mydata$col_1)
dens2 <- density(mydata$col_2)
dens3 <- density(mydata$col_3)
plot(dens1, xlim = range(dens1$x, dens2$x, dens3$x),
ylim = range(dens1$y, dens2$y, dens3$y))
lines(dens2, lty = 2)
lines(dens3, lty = 3)
Hope this helps.
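Since the question also asks about titles, here is a base-graphics sketch that adds a title, an axis label and a line-type legend. It uses the dummy fstdata from the question; the title and axis text are placeholders, and set.seed is only there to make the dummy data reproducible.

```r
set.seed(1)  # reproducible dummy data in place of the real file
fstdata <- data.frame(Fst_ceu_mkk = rnorm(10),
                      Fst_ceu_yri = rnorm(10),
                      Fst_mkk_yri = rnorm(10))

# One density estimate per column, kept in a named list.
dens <- lapply(fstdata, density)

# Axis limits wide enough to show all three curves.
xlim <- range(sapply(dens, function(d) range(d$x)))
ylim <- range(sapply(dens, function(d) range(d$y)))

plot(dens[[1]], lty = 1, xlim = xlim, ylim = ylim,
     main = "Fst densities", xlab = "Fst")
lines(dens[[2]], lty = 2)
lines(dens[[3]], lty = 3)

# The legend distinguishes the curves by line type only, no colour needed.
legend("topright", legend = names(dens), lty = 1:3, bty = "n")
```

The legend labels come straight from the column names, so renaming the columns is enough to change them.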

Annotate ggplot2 graphs using tikzAnnotate in tikzDevice

I would like to use tikzDevice to include annotated ggplot2 graphs in a Latex document.
tikzAnnotate help has an example of how to use it with base graphics, but how to use it with a grid-based plotting package like ggplot2? The challenge seems to be the positioning of the tikz node.
playwith package has a function convertToDevicePixels (http://code.google.com/p/playwith/source/browse/trunk/R/gridwork.R) that seems to be similar to grconvertX/grconvertY, but I am unable to get this to work either.
Would appreciate any pointers on how to proceed.
tikzAnnotate example using base graphics
library(tikzDevice)
library(ggplot2)
options(tikzLatexPackages = c(getOption('tikzLatexPackages'),
"\\usetikzlibrary{shapes.arrows}"))
tikz(standAlone=TRUE)
print(plot(15:20, 5:10))
#print(qplot(15:20, 5:10))
x <- grconvertX(17,,'device')
y <- grconvertY(7,,'device')
#px <- playwith::convertToDevicePixels(17, 7)
#x <- px$x
#y <- px$y
tikzAnnotate(paste('\\node[single arrow,anchor=tip,draw,fill=green] at (',
x,',',y,') {Look over here!};'))
dev.off()
Currently, tikzAnnotate only works with base graphics. When tikzAnnotate was first written, the problem with grid graphics was that we needed a way of specifying the x,y coordinates relative to the absolute lower left corner of the device canvas. grid thinks in terms of viewports and for many cases it seems the final coordinate system of the graphic is not known until it is heading to the device by means of the print function.
It would be great to have this functionality, but I could not figure out a good way to implement it and so the feature got shelved. If anyone has details on a good implementation, feel free to start a discussion on the mailing list (which now has an alternate portal on Google Groups) and it will get on the TODO list.
Even better, implement the functionality and open a pull request to the project on GitHub. This is guaranteed to get the feature into a release over 9000 times faster than if it sits on my TODO list for months.
Update
I have had some time to work on this, and I have come up with a function for converting grid coordinates in the current viewport to absolute device coordinates:
gridToDevice <- function(x = 0, y = 0, units = 'native') {
# Converts a coordinate pair from the current viewport to an "absolute
# location" measured in device units from the lower left corner. This is done
# by first casting to inches in the current viewport and then using the
# current.transform() matrix to obtain inches in the device canvas.
x <- convertX(unit(x, units), unitTo = 'inches', valueOnly = TRUE)
y <- convertY(unit(y, units), unitTo = 'inches', valueOnly = TRUE)
transCoords <- c(x,y,1) %*% current.transform()
transCoords <- (transCoords / transCoords[3])
return(
# Finally, cast from inches to native device units
c(
grconvertX(transCoords[1], from = 'inches', to ='device'),
grconvertY(transCoords[2], from = 'inches', to ='device')
)
)
}
Using this missing piece, one can use tikzAnnotate to mark up a grid or lattice plot:
require(tikzDevice)
require(grid)
options(tikzLatexPackages = c(getOption('tikzLatexPackages'),
"\\usetikzlibrary{shapes.arrows}"))
tikz(standAlone=TRUE)
xs <- 15:20
ys <- 5:10
pushViewport(plotViewport())
pushViewport(dataViewport(xs,ys))
grobs <- gList(grid.rect(),grid.xaxis(),grid.yaxis(),grid.points(xs, ys))
coords <- gridToDevice(17, 7)
tikzAnnotate(paste('\\node[single arrow,anchor=tip,draw,fill=green,left=1em]',
'at (', coords[1],',',coords[2],') {Look over here!};'))
dev.off()
This gives the following output:
There is still some work to be done, such as:
Creation of an "annotation grob" that can be added to grid graphics.
Determine how to add such an object to a ggplot.
These features are scheduled to appear in release 0.7 of the tikzDevice.
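The homogeneous-coordinate step inside gridToDevice can be checked in isolation. The sketch below is an illustration under assumed values, not part of tikzDevice: it pushes a viewport at a known offset (1in, 2in, chosen arbitrarily) on a throwaway pdf device and maps the viewport origin to inches on the device canvas, which should give that offset back.

```r
library(grid)

pdf(file = NULL)  # throwaway device, nothing is written to disk

# A viewport whose lower-left corner sits 1in right and 2in up from the
# lower-left corner of the device.
pushViewport(viewport(x = unit(1, "inches"), y = unit(2, "inches"),
                      width = unit(3, "inches"), height = unit(3, "inches"),
                      just = c("left", "bottom")))

# Viewport origin in inches, as a homogeneous row vector (x, y, 1).
origin <- c(convertX(unit(0, "npc"), "inches", valueOnly = TRUE),
            convertY(unit(0, "npc"), "inches", valueOnly = TRUE),
            1)

# Multiplying by current.transform() gives inches on the device canvas.
on_device <- origin %*% current.transform()
on_device <- on_device / on_device[3]
print(on_device[1:2])

dev.off()
```

gridToDevice does the same thing and then converts the inches to device units with grconvertX and grconvertY.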
I have made up a small example based on @Andrie's suggestion with geom_text and geom_polygon:
Initializing your data:
df <- structure(list(x = 15:20, y = 5:10), .Names = c("x", "y"), row.names = c(NA, -6L), class = "data.frame")
And the point you are to annotate is the 4th row in the dataset, the text should be: "Look over here!"
point <- df[4,]
ptext <- "Look over here!"
Make a nice arrow calculated from the coords of the point given above:
arrow <- data.frame(
x = c(point$x-0.1, point$x-0.3, point$x-0.3, point$x-2, point$x-2, point$x-0.3, point$x-0.3, point$x-0.1),
y = c(point$y, point$y+0.3, point$y+0.2, point$y+0.2, point$y-0.2, point$y-0.2, point$y-0.3, point$y)
)
And also make some calculations for the position of the text:
ptext <- data.frame(label=ptext, x=point$x-1, y=point$y)
No more to do besides plotting:
ggplot(df, aes(x,y)) + geom_point() + geom_polygon(aes(x,y), data=arrow, fill="green") + geom_text(aes(x, y, label=label), ptext) + theme_bw()
Of course, this is a rather hackish solution, but could be extended:
compute the size of arrow based on the x and y ranges,
compute the position of the text based on the length of the text (or by the real width of the string with textGrob),
define a shape which does not overlap your points :)
Good luck!
