Here is a reproducible example to work with:
library(lattice)
myimage<-matrix(c(1,1,2,3,3,4), nrow=3, ncol=2)
mytable<-data.frame(Xcoord=c(1.5, 1.5, 3,3), Ycoord=c(1,2,1,2), Labels=c("A","B","C","D"))
mycolors<-colorRampPalette(c("red","yellow","green","cyan","blue"))
windows()
levelplot(myimage, aspect="iso", col.regions = mycolors)
which produces the graph below.
Now I want to add the Labels in mytable as text at the coordinates given by Xcoord and Ycoord (which correspond to the rows and columns of the image). How can I do this?
The only solution I could find was to follow user20650's link above and convert the matrix image to a data.frame with x, y coordinates (which I would have preferred to avoid):
library(latticeExtra) # provides the `+` method for combining trellis objects
dat <- data.frame(expand.grid(x = 1:3, y = 1:2), value = c(myimage))
Obj <-
levelplot(value ~ x+y, data = dat, aspect="iso", col.regions = mycolors) +
xyplot(y ~ x, data = dat,
panel = function(y, x, ...) {
ltext(x = mytable$Xcoord, y = mytable$Ycoord, labels = mytable$Labels, cex = 1, font = 2)
})
print(Obj)
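A possible way to avoid the data.frame conversion, offered as a sketch rather than a definitive answer: give levelplot a custom panel function that first draws the image with panel.levelplot and then adds the labels with panel.text. This assumes the default coordinates of the matrix method, where x is the row index and y is the column index. The name `labelled` is only illustrative.

```r
library(lattice)

myimage  <- matrix(c(1, 1, 2, 3, 3, 4), nrow = 3, ncol = 2)
mytable  <- data.frame(Xcoord = c(1.5, 1.5, 3, 3), Ycoord = c(1, 2, 1, 2),
                       Labels = c("A", "B", "C", "D"))
mycolors <- colorRampPalette(c("red", "yellow", "green", "cyan", "blue"))

labelled <- levelplot(myimage, aspect = "iso", col.regions = mycolors,
  panel = function(...) {
    panel.levelplot(...)                       # draw the image as usual
    panel.text(x = mytable$Xcoord, y = mytable$Ycoord,
               labels = mytable$Labels, cex = 1, font = 2)  # then the labels
  })
print(labelled)
```

Because the labels are drawn inside the panel function, no second plot object and no `+` from latticeExtra are needed.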
I have two spatial datasets with coordinates indicating observations of a species and want to estimate the area of overlap among these datasets. Since point coordinates cannot represent an area, one has to bin the coordinates using similar x (longitude) and y (latitude) categories for both datasets.
For this task, I found the practical hexbin package, which does hexagonal binning. The package is great, but I have failed to find a function that directly outputs the coordinates / IDs of overlapping bins among hexbin objects. For example, hdiffplot returns a nice graphical overview of overlapping bins, but how can this information be extracted for further analysis?
library(hexbin)
set.seed(1); df1 <- data.frame(x = rnorm(10, 0, 5), y = rnorm(10, 0, 5))
set.seed(2); df2 <- data.frame(x = rnorm(10, 0, 5), y = rnorm(10, 0, 5))
xrange <- c(floor(min(c(df1$x, df2$x))-1), ceiling(max(c(df1$x, df2$x))+1))
#-/+1 just to make the plot nicer
yrange <- c(floor(min(c(df1$y, df2$y))-1), ceiling(max(c(df1$y, df2$y)))+1)
hb1 <- hexbin(df1$x, df1$y, xbins = 10, xbnds = xrange, ybnds = yrange)
hb2 <- hexbin(df2$x, df2$y, xbins = 10, xbnds = xrange, ybnds = yrange)
hdiffplot(hb1,hb2, xbnds = xrange, ybnds = yrange)
I figured out a solution to this problem while writing the question. I will post it here in the hope that it helps someone one day.
You can extract the coordinates using the hcell2xy function. Here is a little function to find the unique and overlapping coordinates for bin centroids:
#' @title Print overlapping and unique bin centroid coordinates for two hexbin objects
#' @param bin1,bin2 two objects of class hexbin.
#' @details The hexbin objects for comparison, bin1 and bin2, must have the same plotting limits and cell size.
#' @return Returns a list of data frames with unique coordinates for \code{bin1} and \code{bin2} as well as overlapping coordinates among bins.
hdiffcoords <- function(bin1, bin2) {
## Checks modified from: https://github.com/edzer/hexbin/blob/master/R/hdiffplot.R
if(is.null(bin1) || is.null(bin2)) {
stop("Need 2 hex bin objects")
} else {
if(bin1@shape != bin2@shape)
stop("Bin objects must have same shape parameter")
if(all(bin1@xbnds == bin2@xbnds) & all(bin1@ybnds == bin2@ybnds))
equal.bounds <- TRUE
else stop("Bin objects need the same xbnds and ybnds")
if(bin1@xbins != bin2@xbins)
stop("Bin objects need the same number of bins")
}
## Find overlapping and unique bins
hd1 <- data.frame(hcell2xy(bin1), count_bin1 = bin1@count, cell_bin1 = bin1@cell)
hd2 <- data.frame(hcell2xy(bin2), count_bin2 = bin2@count, cell_bin2 = bin2@cell)
overlapping_hd1 <- apply(hd1, 1, function(r, A){ sum(A$x==r[1] & A$y==r[2]) }, hd2)
overlapping_hd2 <- apply(hd2, 1, function(r, A){ sum(A$x==r[1] & A$y==r[2]) }, hd1)
overlaps <- merge(hd1[as.logical(overlapping_hd1),], hd2[as.logical(overlapping_hd2),])
unique_hd1 <- hd1[!as.logical(overlapping_hd1),]
unique_hd2 <- hd2[!as.logical(overlapping_hd2),]
## Return list of data.frames
list(unique_bin1 = unique_hd1, unique_bin2 = unique_hd2, overlapping = overlaps)
}
This information should be the same as that shown graphically by hdiffplot:
df <- hdiffcoords(hb1, hb2)
library(ggplot2)
ggplot() +
geom_point(data = df$unique_bin1, aes(x = x, y = y), color = "red", size = 10) +
geom_point(data = df$unique_bin2, aes(x = x, y = y), color = "cyan", size = 10) +
geom_point(data = df$overlapping, aes(x = x, y = y), color = "green", size = 10) + theme_bw()
Any comments/corrections are appreciated.
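A shorter route may be to compare the cell IDs directly instead of the centroid coordinates. The sketch below assumes, as the checks in hdiffcoords do, that both hexbin objects were built with the same xbnds, ybnds, xbins and shape, so that equal cell IDs refer to the same hexagon. The names `shared_cells`, `only_hb1`, `only_hb2` and `shared_xy` are illustrative.

```r
library(hexbin)

set.seed(1); df1 <- data.frame(x = rnorm(10, 0, 5), y = rnorm(10, 0, 5))
set.seed(2); df2 <- data.frame(x = rnorm(10, 0, 5), y = rnorm(10, 0, 5))

# Common plotting limits for both data sets
xrange <- range(c(df1$x, df2$x)) + c(-1, 1)
yrange <- range(c(df1$y, df2$y)) + c(-1, 1)

hb1 <- hexbin(df1$x, df1$y, xbins = 10, xbnds = xrange, ybnds = yrange)
hb2 <- hexbin(df2$x, df2$y, xbins = 10, xbnds = xrange, ybnds = yrange)

# Cell IDs are comparable only because both grids are identical
shared_cells <- intersect(hb1@cell, hb2@cell)
only_hb1     <- setdiff(hb1@cell, hb2@cell)
only_hb2     <- setdiff(hb2@cell, hb1@cell)

# Centroid coordinates of the shared cells
xy1 <- hcell2xy(hb1)
shared_xy <- data.frame(cell = hb1@cell, x = xy1$x, y = xy1$y)
shared_xy <- shared_xy[shared_xy$cell %in% shared_cells, ]
```

Set operations on integer IDs also avoid comparing floating-point coordinates for equality.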
Suppose I have the following data frame:
df <- data.frame(A1 = c(0,3.5,0,2.1), A2 =c(0.9,0,0,0.6), A3 = c(0,0.3,0,0.3),A4= c(0,1.9,0,0))
rownames(df) <- names(df)
Every element df(i,j) is the strength of the relation between the ith column and the jth row. The relations are directed, meaning the strength from i to j can differ from the strength from j to i. A "0" entry means there is no relation.
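To make the orientation concrete, here is a small base R sketch that lists every non-zero entry as a directed edge. It assumes the column is the origin and the row is the destination; the names `m`, `idx` and `edge_list` are illustrative only.

```r
df <- data.frame(A1 = c(0, 3.5, 0, 2.1), A2 = c(0.9, 0, 0, 0.6),
                 A3 = c(0, 0.3, 0, 0.3), A4 = c(0, 1.9, 0, 0))
rownames(df) <- names(df)

m   <- as.matrix(df)
idx <- which(m != 0, arr.ind = TRUE)   # row/column positions of non-zero entries

edge_list <- data.frame(
  origin = colnames(m)[idx[, "col"]],  # assumed: column = origin
  dest   = rownames(m)[idx[, "row"]],  # assumed: row = destination
  weight = m[idx]
)
edge_list
```

Each row of `edge_list` corresponds to one arrow in the desired plot, with `weight` as its width.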
Now I would like to draw a circle, with the variables on the perimeter of the circle, and an arrow that shows which variables are connected to each other, and hopefully show the strength of the connection based on the width of the arrow.
The final product I have in mind is something like this:
Is it even possible to do something like that with ggplot2?
Thanks in advance.
igraph
We start by making a graph from your adjacency matrix:
library(igraph)
df <- t(df) # transpose: igraph reads entry [i, j] as an edge from row i to column j
ga <- graph.adjacency(as.matrix(df), weighted = TRUE, mode = "directed")
Then, plot a circle:
par(mar = rep(0.25, 4))
pts <- seq(0, 2*pi, l = 100)
plot(cbind(sin(pts), cos(pts)), type = "l", frame = F, xaxt = "n", yaxt = "n")
Finally, plot the graph:
plot.igraph(ga,
vertex.label = V(ga)$name,
edge.width = E(ga)$weight,
edge.curved = TRUE,
edge.label = E(ga)$weight,
layout = layout_in_circle(ga, order = V(ga)),
add = T)
Output below. You can customize your graph (e.g. curvature and colors of edges, shapes of vertices) as desired.
ggplot2
The main idea is to set up three sets of geoms: the circle, the nodes (vertices), and the lines (edges). First, we load some packages, and prep the circle and nodes:
library(ggplot2)
library(tidyr)
library(dplyr)
# For circle
pts <- seq(0, 2*pi, l = 100)
# For nodes
theta <- seq(0, 2*pi, l = nrow(df) + 1)[1:nrow(df)]
l <- data.frame(x = sin(theta), y = cos(theta), v = names(df),
stringsAsFactors = FALSE)
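As a quick check of the node layout, the same formula can be evaluated for four nodes. This is only an illustrative sketch; `n` and `pos` are hypothetical names, not part of the answer's code.

```r
n     <- 4
theta <- seq(0, 2 * pi, length.out = n + 1)[1:n]  # drop the duplicate end point at 2*pi
pos   <- data.frame(x = sin(theta), y = cos(theta))

# With x = sin(theta) and y = cos(theta), the first node sits at the top
# of the circle and the following nodes proceed clockwise.
round(pos, 3)
```

Dropping the last element of the sequence matters because angles 0 and 2*pi would otherwise place two nodes on the same spot.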
The edges are a little bit more involved. I make a function to make coordinates for the lines, given an origin and destination:
make_edge <- function(origin, dest, l, shrink = .9) {
# l is the layout matrix for the nodes that we made previously
data.frame(
x0 = l$x[l$v == origin],
y0 = l$y[l$v == origin],
x1 = l$x[l$v == dest],
y1 = l$y[l$v == dest]
) * shrink
}
Then, we turn the adjacency matrix into an edge list, and bind the edge coordinates to it:
gr <- gather(mutate(df, dest = names(df)), origin, wt, -dest)
gr <- gr[gr$wt != 0, ]
edges <- do.call(rbind,
mapply(make_edge, gr$origin, gr$dest, list(l), shrink = .94, SIMPLIFY = F)
)
ga <- cbind(gr, edges)
Finally, we plot:
ggplot() +
geom_path(data = data.frame(x = sin(pts), y = cos(pts)), aes(x, y)) +
geom_label(data = l, aes(x, y, label = v)) +
geom_curve(data = ga,
aes(x = x0, y = y0, xend = x1, yend = y1, size = wt, colour = origin),
alpha = 0.8,
curvature = 0.1,
arrow = arrow(length = unit(2, "mm"))) +
scale_size_continuous(range=c(.25,2), guide = "none") +
theme_void()
Output:
I wrote a little package that does this kind of thing. Here's a small demo vignette https://github.com/mkearney/lavplot/blob/master/vignettes/demo.Rmd. Image of plot provided below.
I'm trying to plot 2 sets of data points and a single line in R using ggplot.
The issue I'm having is with the legend.
As can be seen in the attached image, the legend applies the lines to all 3 data sets even though only one of them is plotted with a line.
I have melted the data into one long frame, but this still requires me to filter the data sets for each individual call to geom_point() and geom_path().
I want to graph the melted data, plotting a line based on one data set, and points on the remaining two, with a complete legend.
Here is the sample script I wrote to produce the plot:
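library(ggplot2)
library(reshape2) # melt()
library(dplyr)    # filter()
library(grid)     # pushViewport(), viewport()
xseq <- 1:100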
x <- rnorm(n = 100, mean = 0.5, sd = 2)
x2 <- rnorm(n = 100, mean = 1, sd = 0.5)
x.lm <- lm(formula = x ~ xseq)
x.fit <- predict(x.lm, newdata = data.frame(xseq = 1:100), type = "response", se.fit = TRUE)
my_data <- data.frame(x = xseq, ypoints = x, ylines = x.fit$fit, ypoints2 = x2)
## Now try and plot it
melted_data <- melt(data = my_data, id.vars = "x")
p <- ggplot(data = melted_data, aes(x = x, y = value, color = variable, shape = variable, linetype = variable)) +
geom_point(data = filter(melted_data, variable == "ypoints")) +
geom_point(data = filter(melted_data, variable == "ypoints2")) +
geom_path(data = filter(melted_data, variable == "ylines"))
pushViewport(viewport(layout = grid.layout(1, 1))) # One on top of the other
print(p, vp = viewport(layout.pos.row = 1, layout.pos.col = 1))
You can set them manually like this:
We set linetype = "solid" for the line series and "blank" for the others (no line). The entries follow the order of the levels of variable, which after melt() is ypoints, ylines, ypoints2, so the line series is the second item.
Similarly, for the line series we set no shape (NA) and for the others whatever shape we need (I just put 7 and 8 there as an example). See e.g. http://www.r-bloggers.com/how-to-remember-point-shape-codes-in-r/ to help you choose the right shapes for your needs.
If you are happy with dots then you can use my_shapes = c(16,NA,16) and scale_shape_manual(...) is not needed.
my_shapes = c(7,NA,8) # order: ypoints, ylines, ypoints2
ggplot(data = melted_data, aes(x = x, y = value, color=variable, shape=variable )) +
geom_path(data = filter(melted_data, variable == "ylines") ) +
geom_point(data = filter(melted_data, variable %in% c("ypoints", "ypoints2"))) +
scale_colour_manual(values = c("red", "green", "blue"),
guide = guide_legend(override.aes = list(
linetype = c("blank", "solid","blank"),
shape = my_shapes))) +
scale_shape_manual(values = my_shapes)
But I am very curious if there is some more automated way. Hopefully someone can post better answer.
This post relied quite heavily on this answer: ggplot2: Different legend symbols for points and lines
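One alternative that avoids override.aes, sketched here under assumptions rather than offered as the definitive approach: map colour, shape and linetype to the same variable and give the shape and linetype scales named values, so that the three legends merge into one. The data frame `long` is built by hand to keep the example self-contained; its name and the chosen shapes are illustrative, and the legend rendering may differ slightly between ggplot2 versions.

```r
library(ggplot2)

set.seed(1)
xseq <- 1:100
y1   <- rnorm(100, mean = 0.5, sd = 2)
y2   <- rnorm(100, mean = 1, sd = 0.5)
fit  <- fitted(lm(y1 ~ xseq))

# Long format, equivalent in layout to the melted data in the question
series <- c("ypoints", "ylines", "ypoints2")
long <- data.frame(
  x        = rep(xseq, 3),
  variable = factor(rep(series, each = 100), levels = series),
  value    = c(y1, fit, y2)
)

p <- ggplot(long, aes(x, value, colour = variable,
                      shape = variable, linetype = variable)) +
  geom_point(data = subset(long, variable != "ylines")) +
  geom_line(data = subset(long, variable == "ylines")) +
  # Named values: no point symbol for the line series, no line for the points
  scale_shape_manual(values = c(ypoints = 16, ylines = NA, ypoints2 = 17)) +
  scale_linetype_manual(values = c(ypoints = "blank", ylines = "solid",
                                   ypoints2 = "blank"))
print(p)
```

Naming the values ties each symbol to a series by name, so the result does not depend on the order of the factor levels.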
Following up on a recent question of mine, this one is a bit different and illustrates the problem more fully using simpler examples. Below are two data sets and three functions. The first one draws some points and a circle as expected:
library("ggplot2")
library("grid")
td1 <- data.frame(x = rnorm(10), y = rnorm(10))
tf1 <- function(df) { # works as expected
p <- ggplot(aes(x = x, y = y), data = df)
p <- p + geom_point(color = "red")
p <- p + annotation_custom(circleGrob())
print(p)
}
tf1(td1)
This next one seems to ask for the exact same plot, but the code is slightly different. It does not give an error, but it does not draw the circle either:
tf2 <- function(df) { # circle isn't drawn, but no error either
p <- ggplot()
p <- p + geom_point(data = df, aes(x = x, y = y), color = "red")
p <- p + annotation_custom(circleGrob())
print(p)
}
tf2(td1)
Finally, this one involves a more complex aesthetic and gives an empty layer when you try to create the circle:
td3 <- data.frame(r = c(rnorm(5, 5, 1.5), rnorm(5, 8, 2)),
f1 = c(rep("L", 5), rep("H", 5)), f2 = rep(c("A", "B"), 5))
tf3 <- function(df) {
p <- ggplot()
p <- p + geom_point(data = df,
aes(x = f1, y = r, color = f2, group = f2))
# p <- p + annotation_custom(circleGrob()) # comment out and it works
print(p)
}
tf3(td3)
Now, I suspect the problem here is not the code but my failure to grasp the inner workings of ggplot2. I could sure use an explanation of why the circle is not drawn in the 2nd case and why the layer is empty in the third case. I looked at the code for annotation_custom and it has a hard-wired inherit.aes = TRUE, which I think is the problem. I don't see why this function needs any aesthetic at all (see the docs on it). I tried several ways to override it and set inherit.aes = FALSE, but I was unable to fully penetrate the namespace and make it stick. I tried to examine the objects created by ggplot2, but these proto objects are nested very deeply and are hard to decipher.
To answer this:
"I don't see why this function needs any aesthetic at all".
In fact, annotation_custom needs the x and y aesthetics to scale its grob, because its position arguments (xmin, xmax, ymin, ymax) are given in native (data) units.
Basically, it does this:
x_rng <- range(df$x, na.rm = TRUE) ## ranges of x :aes x
y_rng <- range(df$y, na.rm = TRUE) ## ranges of y :aes y
vp <- viewport(x = mean(x_rng), y = mean(y_rng), ## create a viewport
width = diff(x_rng), height = diff(y_rng),
just = c("center","center"))
dd <- editGrob(grob = circleGrob(), vp = vp) ## place the grob in this viewport
To illustrate this I add a grob to a dummy plot used as a scale for my grob. The first is a big scale and the second is a small one.
base.big <- ggplot(aes(x = x1, y = y1), data = data.frame(x1=1:100,y1=1:100))
base.small <- ggplot(aes(x = x1, y = y1), data = data.frame(x1=1:20,y1=1:1))
I define my grob; note that xmin, xmax, ymin and ymax are given in native (data) units:
annot <- annotation_custom(grob = circleGrob(), xmin = 0,
xmax = 20,
ymin = 0,
ymax = 1)
Now see the difference in scale (small point / big circle) between (base.big + annot) and (base.small + annot).
library(gridExtra)
grid.arrange(base.big+annot,
base.small+annot)
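For comparison, here is a minimal sketch of the two ways of placing the grob: with the default limits (xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = Inf) it spans the whole panel, while explicit limits place it in data units. The geom_blank() layer is an assumption added here so that the x and y scales are trained from the data; `base`, `full` and `part` are illustrative names.

```r
library(ggplot2)
library(grid)

base <- ggplot(data.frame(x1 = 1:100, y1 = 1:100), aes(x = x1, y = y1)) +
  geom_blank()   # trains the scales without drawing anything

# Default limits: the circle fills the whole panel
full <- base + annotation_custom(circleGrob())

# Explicit limits in data units: the circle occupies the lower-left corner only
part <- base + annotation_custom(circleGrob(),
                                 xmin = 0, xmax = 20, ymin = 0, ymax = 20)

print(full)
print(part)
```

In both cases the grob itself is the same; only the region it is drawn into changes.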
I'm having a bit of trouble here; please help.
I have this data
set.seed(4)
mydata <- data.frame(var = rnorm(100),
temp = rnorm(100),
subj = as.factor(rep(c(1:10),5)),
trt = rep(c("A","B"), 50))
and this model that fits them
lm <- lm(var ~ temp * subj, data = mydata)
I want to plot the results with lattice and draw the regression line predicted by my model through them. To do so, I'm using the approach outlined in "Lattice Tricks for the power useR" by D. Sarkar:
temp_rng <- range(mydata$temp, finite = TRUE)
grid <- expand.grid(temp = do.breaks(temp_rng, 30),
subj = unique(mydata$subj),
trt = unique(mydata$trt))
model <- cbind(grid, var = predict(lm, newdata = grid))
orig <- mydata[c("var","temp","subj","trt")]
combined <- make.groups(original = orig, model = model)
xyplot(var ~ temp | subj,
data = combined,
groups = which,
type = c("p", "l"),
distribute.type = TRUE
)
So far everything is fine, but I also want to assign a fill color to the data points for the two treatments, trt = "A" and trt = "B".
So I have written this piece of code, which works fine for the points, but when it comes to plotting the regression line, it seems that type is not recognized by the panel function...
my.fill <- c("black", "grey")
plot <- with(combined,
xyplot(var ~ temp | subj,
data = combined,
group = combined$which,
type = c("p", "l"),
distribute.type = TRUE,
panel = function(x, y, ..., subscripts){
fill <- my.fill[combined$trt[subscripts]]
panel.xyplot(x, y, pch = 21, fill = my.fill, col = "black")
},
key = list(space = "right",
text = list(c("trt1", "trt2"), cex = 0.8),
points = list(pch = c(21), fill = c("black", "grey")),
rep = FALSE)
)
)
plot
I've also tried moving type and distribute.type into panel.xyplot, as well as subsetting the data inside the panel function, like this:
plot <- with(combined,
xyplot(var ~ temp | subj,
data = combined,
panel = function(x, y, ..., subscripts){
fill <- my.fill[combined$trt[subscripts]]
panel.xyplot(x[combined$which=="original"], y[combined$which=="original"], pch = 21, fill = my.fill, col = "black")
panel.xyplot(x[combined$which=="model"], y[combined$which=="model"], type = "l", col = "black")
},
key = list(space = "right",
text = list(c("trt1", "trt2"), cex = 0.8),
points = list(pch = c(21), fill = c("black", "grey")),
rep = FALSE)
)
)
plot
but no success with that either.
Can anyone help me to get the predicted values plotted as a line instead of being points?
This might be a job for the latticeExtra package.
library(latticeExtra)
p1 <- xyplot(var ~ temp | subj, data=orig, panel=function(..., subscripts) {
fill <- my.fill[factor(orig$trt)[subscripts]] # factor() so that indexing uses the integer codes
panel.xyplot(..., pch=21, fill=fill, col="black")
})
p2 <- xyplot(var ~ temp | subj, data=model, type="l")
p1+p2
I'm not sure what's going on in your first attempt, but the one with the subscripts isn't working because x and y are a subset of the data for subj, so subsetting them using a vector based on combined won't work the way you think it will. Try this instead.
xyplot(var ~ temp | subj, groups=which, data = combined,
panel = function(x, y, groups, subscripts){
fill <- my.fill[factor(combined$trt)[subscripts]] # factor() so that indexing uses the integer codes
g <- groups[subscripts]
panel.points(x[g=="original"], y[g=="original"], pch = 21,
fill = fill[g=="original"], col = "black")
panel.lines(x[g=="model"], y[g=="model"], col = "black")
},
key = list(space = "right",
text = list(c("trt1", "trt2"), cex = 0.8),
points = list(pch = c(21), fill = c("black", "grey")),
rep = FALSE)
)
This may be trivial, but you may try:
xyplot(... , type=c("p","l","r"))
"p" adds points, "l" connects them with broken lines, "r" fits a linear model through your data. type="r" alone will plot only regression lines without showing data points.
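A minimal sketch of this suggestion on the data from the question. Note that type = "r" fits a separate simple regression in each panel; since the model var ~ temp * subj gives every subject its own intercept and slope, the per-panel lines should coincide with the model's predictions. The comparison at the end checks this for subject 1, the reference level. The names `per_panel`, `fit_all` and `fit_s1` are illustrative.

```r
library(lattice)

set.seed(4)
mydata <- data.frame(var  = rnorm(100),
                     temp = rnorm(100),
                     subj = as.factor(rep(1:10, 5)),
                     trt  = rep(c("A", "B"), 50))

# Points plus a per-panel least-squares line
per_panel <- xyplot(var ~ temp | subj, data = mydata, type = c("p", "r"))
print(per_panel)

# Full interaction model versus a separate fit for subject 1
fit_all <- lm(var ~ temp * subj, data = mydata)
fit_s1  <- lm(var ~ temp, data = mydata, subset = subj == "1")
all.equal(unname(coef(fit_all)[c("(Intercept)", "temp")]),
          unname(coef(fit_s1)))
```

This shortcut only applies when the model is equivalent to independent per-panel regressions; a model with shared terms across subjects would still need the predict-based approach.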
It might be easier to simply use the panel.lmline function on just your original data:
xyplot(var ~ temp | subj,
data = orig,
panel = function(x,y,...,subscripts){
fill <- my.fill[factor(orig$trt)[subscripts]] # factor() so that indexing uses the integer codes
panel.xyplot(x, y, pch = 21, fill = fill,col = "black")
panel.lmline(x,y,col = "salmon")
},
key = list(space = "right",
text = list(c("trt1", "trt2"), cex = 0.8),
points = list(pch = c(21), fill = c("black", "grey")),
rep = FALSE)
)