lattice plot only lines with positive slope - r

Is there an easy and efficient way to define a function like panel.xyplot (or rather panel.lines) that connects two points (x1,y1) and (x2,y2) only if x1 <= x2 and y1 <= y2? (Ideally, all other properties of panel.xyplot(...) are retained.)
I asked the same question a few months ago and the solution is great:
lattice, connect points only if the connection has a positive slope
Now I would like a real panel.xyplot-like function so that I can use my own groups. It should work and plot like the figure below, but without the crossed-out lines.
I welcome suggestions.

I'm not sure I understand what you're after, but if I do, then I think this should work for any given group:
library(dplyr)
set.seed(1)
dat <- data.frame(x=1:10,y=sample(1:10))
dat <- mutate(dat, x0 = x, y0 = y, x1 = lead(x), y1 = lead(y), slope = (y1 - y0)/(x1 - x0))
with(dat, plot(x, y))
with(dat[seq_len(nrow(dat) - 1), ], segments(x0 = x0, y0 = y0, x1 = x1, y1 = y1,
col = ifelse(slope >= 0, "black", "white"))) # This bit makes the line drawing conditional
Here's what I get from that:
And here's a version for grouped data that doesn't depend on lattice:
dat2 <- data.frame(x = rep(seq(10), 10),
y = sample(1:10, size = 100, replace = TRUE),
indx = rep(seq(10), each = 10))
dat2g <- dat2 %>%
group_by(indx) %>%
mutate(x0 = x, y0 = y, x1 = lead(x), y1 = lead(y), slope = (y1 - y0)/(x1 - x0))
plotit <- function(group) {
require(dplyr)
datsub <- filter(dat2g, indx == group)
with(datsub, plot(x, y, main = group))
with(datsub[seq_len(nrow(datsub) - 1), ], segments(x0 = x0, y0 = y0, x1 = x1, y1 = y1, col = ifelse(slope >= 0, "black", "white")))
}
par(mfrow=c( floor(sqrt(max(dat2g$indx))), ceiling(sqrt(max(dat2g$indx)))))
par(mai=c(0.5,0.5,0.5,0.25))
for (i in sort(unique(dat2g$indx))) { plotit(i) }
Here's the plot output from that process. It could use fine-tuning, but I think it's what you're after?
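The same condition can also be expressed without drawing white segments. The sketch below, in base R, assumes a hypothetical helper named rising_path() (it is not part of lattice or dplyr). It keeps only the consecutive pairs with x1 <= x2 and y1 <= y2 and separates them with NA, so that a line-drawing function which lifts the pen at NA, as base lines() does, draws only those segments.

```r
# Hypothetical helper (not part of lattice): build an NA-separated path that
# contains only the segments where neither x nor y decreases.
rising_path <- function(x, y) {
  n <- length(x)
  i <- seq_len(max(n - 1, 0))
  keep <- i[x[i] <= x[i + 1] & y[i] <= y[i + 1]]
  if (length(keep) == 0) {
    return(list(x = numeric(0), y = numeric(0)))
  }
  # Each kept segment contributes start, end, NA; the NA lifts the pen.
  list(x = as.vector(rbind(x[keep], x[keep + 1], NA)),
       y = as.vector(rbind(y[keep], y[keep + 1], NA)))
}

# Example: the middle segment (2,3) -> (3,2) falls and is dropped.
p <- rising_path(c(1, 2, 3, 4), c(1, 3, 2, 5))
p$x  # 1 2 NA 3 4 NA
p$y  # 1 3 NA 2 5 NA
```

In base graphics, lines(p$x, p$y) would then draw just the rising segments. Inside a lattice panel function the same vectors could be handed to a line-drawing panel function, on the assumption that it also breaks lines at NA.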


rgl: plot3d with "extended" plotting symbols

I am trying to extend the symbols available to me for plotting in 3D. In 2D, I use:
x1 <- sort(rnorm(10))
y1 <- rnorm(10)
z1 <- rnorm(10) + atan2(x1, y1)
x2 <- sort(rnorm(10))
y2 <- rnorm(10)
z2 <- rnorm(10) + atan2(x2, y2)
x3 <- sort(rnorm(10))
y3 <- rnorm(10)
z3 <- rnorm(10) + atan2(x3, y3)
new.styles <- -1*c(9818:9824, 9829, 9830, 9831)
In 2D, my plot works and gives the appropriate symbol:
plot(x1, y1, col="red", bg="red", pch=new.styles, cex = 2)
and the plot is here:
In 3D, however, the symbols do not get translated correctly.
rgl::plot3d(x1, y1, z1, col="red", bg="red", pch=new.styles, size = 10)
this yields:
The symbols are getting replaced with (one) circle.
I also tried with pch3d and got blank plots. However, pch3d does work with the "standard" plotting symbols.
rgl::pch3d(x1, y1, z1, col="red", bg="red", pch=10:19, size = 10)
I get the plot:
So it appears that the symbols are not displayed in 3D. How can I display the preferred symbols?
The only solution I found uses text3d(); hopefully a better one exists.
x1 <- sort(rnorm(12))
y1 <- rnorm(12)
z1 <- rnorm(12) + atan2(x1, y1)
x2 <- sort(rnorm(12))
y2 <- rnorm(12)
z2 <- rnorm(12) + atan2(x2, y2)
x3 <- sort(rnorm(12))
y3 <- rnorm(12)
z3 <- rnorm(12) + atan2(x3, y3)
new.styles <- c(9818:9824, 9829, 9830, 9831, 9832, 9827)
rgl::open3d()
pal.col <- RColorBrewer::brewer.pal(name = "Paired", n = 12)
for (i in 1:12)
rgl::text3d(x1[i], y1[i], z1[i], col=pal.col[i], texts = intToUtf8(new.styles[i]), cex = 2, usePlotmath = TRUE)
rgl::box3d()
This yields the figure:
This may well be too complicated; hopefully there are better solutions out there.
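The conversion step in the loop above can be looked at on its own. This is a small base R sketch, using the code points from the question, of how intToUtf8() turns them into one string per symbol. The commented text3d() call at the end is an untested assumption about how the result could be passed on in a single call.

```r
# Code points taken from the question; base graphics takes them as pch = -codepoint.
codes <- c(9818:9824, 9829, 9830, 9831)

# multiple = TRUE returns one string per code point instead of one long string.
glyphs <- intToUtf8(codes, multiple = TRUE)

length(glyphs)        # 10, one glyph per point
utf8ToInt(glyphs[1])  # 9818, round-trips to the original code point

# With rgl installed, the whole vector could then be passed at once, e.g.
# rgl::text3d(x1, y1, z1, texts = glyphs, cex = 2)
```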
This is the best I could do:
Set up file for texture/shape:
crown <- tempfile(pattern = "crown", fileext = ".png")
png(filename = crown)
plot(1,1, ann=FALSE, axes=FALSE, pch=-9818, cex = 40, col = 2)
dev.off()
Load package, define a function to plot the texture at a random point:
library(rgl)
xyz <- cbind(c(0,1,1,0), 0, c(0,0,1,1))
add_quad_point <- function(shape = crown, sd = 3) {
pos <- rnorm(3, sd = sd)
m <- sweep(xyz, MARGIN=2, STATS = pos, FUN = "+")
quads3d(m,
texture = shape,
texcoords = xyz[,c(1,3)],
col = "white",
specular = "black")
}
open3d()
replicate(10, add_quad_point())
axes3d()
## close3d()

Using effects package inside a function

I have a large data frame with many variables that I want to analyze in the same way. Specifically, I want to plot effect confidence intervals from a mixed-effects model. I want to write a function that makes a custom plot for one dependent variable. Calling effect() directly works fine, but the same code inside a function causes an error.
I tried two variants of function. Both cause errors.
Here is my reproducible example:
library(nlme)
library(effects)
df <- data.frame(y = rnorm(90), x = gl(3, 30), b = factor(rep(1:30, 3)))
fit <- lme(fixed = y ~ x, random = ~ 1 | b, data = df, method = "REML")
ef <- effect("x", fit)
bp <- barplot(as.vector(ef$fit), col = c("tomato", "skyblue", "limegreen"),
ylim = c(min(ef$lower), max(ef$upper) + (max(ef$upper) - min(ef$lower)) * 0.2 ))
arrows(x0 = bp, y0 = ef$lower, y1 = ef$upper, code = 3, angle = 90)
test1 <- function(y, x, b)
{
fit <- lme(fixed = y ~ x, random = ~ 1 | b, method = "REML")
ef <- effect("x", fit)
bp <- barplot(as.vector(ef$fit), col = c("tomato", "skyblue", "limegreen"),
ylim = c(min(ef$lower), max(ef$upper) + (max(ef$upper) - min(ef$lower)) * 0.2 ))
arrows(x0 = bp, y0 = ef$lower, y1 = ef$upper, code = 3, angle = 90)
}
test1(df$y, df$x, df$b)
# Error in eval(predvars, data, env) : object 'y' not found
test2 <- function(y, x, b)
{
frame <- data.frame(y, x, b)
fit <- lme(fixed = y ~ x, random = ~ 1 | b, frame, method = "REML")
ef <- effect("x", fit)
bp <- barplot(as.vector(ef$fit), col = c("tomato", "skyblue", "limegreen"),
ylim = c(min(ef$lower), max(ef$upper) + (max(ef$upper) - min(ef$lower)) * 0.2 ))
arrows(x0 = bp, y0 = ef$lower, y1 = ef$upper, code = 3, angle = 90)
}
test2(df$y, df$x, df$b)
# Error in as.data.frame.default(data, optional = TRUE) :
# cannot coerce class ‘"function"’ to a data.frame
Simpler:
function(df) {
fit <- lme(fixed = y ~ x, random = ~ 1 | b, data = df, method = "REML")
ef <- effect("x", fit)
bp <- barplot(as.vector(ef$fit), col = c("tomato", "skyblue", "limegreen"),
ylim = c(min(ef$lower),
max(ef$upper) + (max(ef$upper) - min(ef$lower)) * 0.2 ))
arrows(x0 = bp, y0 = ef$lower, y1 = ef$upper, code = 3, angle = 90)
}
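You need to pass the data to lme explicitly; the formula by itself does not carry any data.
That said, your test2 should work. I can replicate your error, and it is odd: the code works in the global environment but not inside the function. A possible explanation (an assumption, not verified here) is that effect() re-evaluates the model's data argument outside the function, where the name frame refers to the graphics function frame() rather than your data frame. That would match the "cannot coerce class "function"" message.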
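One plausible reading of the second error message can be checked in base R alone. The sketch below only demonstrates name lookup; it does not call lme() or effect(), and it is an assumption that effect() looks the data up outside the function. The wrapper name lookup_demo is hypothetical.

```r
# Hypothetical demo function: a local variable called 'frame', as in test2.
lookup_demo <- function() {
  frame <- data.frame(a = 1)
  class(frame)  # inside the function, 'frame' is the local data frame
}

lookup_demo()
# "data.frame"

# Looked up from the global environment, the same name resolves to
# graphics::frame(), assuming no global object called 'frame' exists.
class(get("frame", envir = globalenv()))
# "function"
```

If this is indeed the cause, giving the local data frame a name that is not also a function on the search path would at least change the error; that is a guess rather than a tested fix.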

Control legend in ggplotly when using subplot

I use the R plotly package and the functions ggplotly and subplot to create an interactive plot consisting of multiple individually interactive ggplot2 plots. Some of the plots share the same grouping variables.
col <- factor(rep(c(1, 2), 5))
fill <- factor(c(rep("a", 5), rep("b", 5)))
x1 <- x2 <- y1 <- y2 <- 1:10
x3 <- y3 <- 11:20
d1 <- dplyr::tibble(x1 = x1, y1 = y1, col = col)
d2 <- dplyr::tibble(x2 = x2, y2 = y2, col = col, fill = fill)
d3 <- dplyr::tibble(x3 = x3, y3 = y3, col = col)
g1 <-
ggplot2::ggplot(d1) +
ggplot2::geom_point(ggplot2::aes(x = x1, y = y1, col = col))
g2 <-
ggplot2::ggplot(d2) +
ggplot2::geom_point(ggplot2::aes(x = x2, y = y2, col = col, fill = fill)) +
ggplot2::scale_fill_manual(values = c("red","green"))
g3 <-
ggplot2::ggplot(d3) +
ggplot2::geom_point(ggplot2::aes(x = x3, y = y3, col = col))
plotly::subplot(plotly::ggplotly(g1), plotly::ggplotly(g2), plotly::ggplotly(g3))
1) How can I remove the duplicated "col" labels in the interactive plotly legend?
2) How can I remove the legend for "fill", but keep the legend for "col"?
EDIT: I know about the following "dirty" solution, which is to manually disable the legend:
t <-
plotly::subplot(plotly::ggplotly(g1), plotly::ggplotly(g2), plotly::ggplotly(g3))
t$x$data[[1]]$showlegend <- FALSE
t$x$data[[2]]$showlegend <- FALSE
t$x$data[[3]]$showlegend <- FALSE
t$x$data[[4]]$showlegend <- FALSE
However, this requires me to know the positions of the list elements in advance, which is why I am looking for a more general solution.
Another way to manually remove the unwanted legends is to use style(). In your example, lt <- t %>% style(showlegend = FALSE, traces = 3:n), with n <- 8 defined beforehand, will suppress the unwanted legends.
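To avoid hard-coding trace positions, the traces could be scanned for repeated names. The sketch below assumes a hypothetical helper, dedupe_legend(), and relies on each trace in t$x$data carrying a name field. It is demonstrated on a plain list that mimics that structure, not on a real plotly object.

```r
# Hypothetical helper: hide the legend entry of every trace whose name
# has already appeared in an earlier trace.
dedupe_legend <- function(p) {
  seen <- character(0)
  for (i in seq_along(p$x$data)) {
    nm <- p$x$data[[i]]$name
    if (is.null(nm)) next
    if (nm %in% seen) p$x$data[[i]]$showlegend <- FALSE
    seen <- c(seen, nm)
  }
  p
}

# Mock object standing in for the result of subplot(); names are made up.
mock <- list(x = list(data = list(
  list(name = "1", showlegend = TRUE),
  list(name = "2", showlegend = TRUE),
  list(name = "1", showlegend = TRUE)
)))

out <- dedupe_legend(mock)
shown <- vapply(out$x$data, function(tr) tr$showlegend, logical(1))
shown  # TRUE TRUE FALSE
```

Whether the trace names produced by ggplotly() repeat exactly across subplots would need to be checked on the real object, for example by printing the name of each element of t$x$data.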

Add labels to the center of a geom_curve line (ggplot)

Is there any way to add a label on or near the center of a geom_curve line? Currently, I can only do so by labeling either the start or end point of the curve.
library(tidyverse)
library(ggrepel)
df <- data.frame(x1 = 1, y1 = 1, x2 = 2, y2 = 3, details = "Object Name")
ggplot(df, aes(x = x1, y = y1, label = details)) +
geom_point(size = 4) +
geom_point(aes(x = x2, y = y2),
pch = 17, size = 4) +
geom_curve(aes(x = x1, y = y1, xend = x2, yend = y2)) +
geom_label(nudge_y = 0.05) +
geom_label_repel(box.padding = 2)
I would love some way to automatically label the curve near coordinates x=1.75, y=1.5. Is there a solution out there I haven't seen yet? My intended graph is quite busy, and labeling the origin points makes it harder to see what's happening, while labeling the arcs would make a cleaner output.
I've come to a solution for this problem. It's large and clunky, but effective.
The core problem is that geom_curve() does not draw a fixed path; the curve moves and scales with the aspect ratio of the plot window. So, short of locking the aspect ratio with coord_fixed(ratio = 1), I could not find an easy way to predict where the midpoint of a geom_curve() segment will be.
So instead I set about finding a midpoint for the curve and then forcing the curve through that point, which I would later label. To find the midpoint I had to copy two functions from the grid package:
library(grid)
library(tidyverse)
library(ggrepel)
# Find origin of rotation
# Rotate around that origin
calcControlPoints <- function(x1, y1, x2, y2, curvature, angle, ncp,
debug=FALSE) {
# Negative curvature means curve to the left
# Positive curvature means curve to the right
# Special case curvature = 0 (straight line) has been handled
xm <- (x1 + x2)/2
ym <- (y1 + y2)/2
dx <- x2 - x1
dy <- y2 - y1
slope <- dy/dx
# Calculate "corner" of region to produce control points in
# (depends on 'angle', which MUST lie between 0 and 180)
# Find by rotating start point by angle around mid point
if (is.null(angle)) {
# Calculate angle automatically
angle <- ifelse(slope < 0,
2*atan(abs(slope)),
2*atan(1/slope))
} else {
angle <- angle/180*pi
}
sina <- sin(angle)
cosa <- cos(angle)
# FIXME: special case of vertical or horizontal line ?
cornerx <- xm + (x1 - xm)*cosa - (y1 - ym)*sina
cornery <- ym + (y1 - ym)*cosa + (x1 - xm)*sina
# Debugging
if (debug) {
grid.points(cornerx, cornery, default.units="inches",
pch=16, size=unit(3, "mm"),
gp=gpar(col="grey"))
}
# Calculate angle to rotate region by to align it with x/y axes
beta <- -atan((cornery - y1)/(cornerx - x1))
sinb <- sin(beta)
cosb <- cos(beta)
# Rotate end point about start point to align region with x/y axes
newx2 <- x1 + dx*cosb - dy*sinb
newy2 <- y1 + dy*cosb + dx*sinb
# Calculate x-scale factor to make region "square"
# FIXME: special case of vertical or horizontal line ?
scalex <- (newy2 - y1)/(newx2 - x1)
# Scale end points to make region "square"
newx1 <- x1*scalex
newx2 <- newx2*scalex
# Calculate the origin in the "square" region
# (for rotating start point to produce control points)
# (depends on 'curvature')
# 'origin' calculated from 'curvature'
ratio <- 2*(sin(atan(curvature))^2)
origin <- curvature - curvature/ratio
# 'hand' also calculated from 'curvature'
if (curvature > 0)
hand <- "right"
else
hand <- "left"
oxy <- calcOrigin(newx1, y1, newx2, newy2, origin, hand)
ox <- oxy$x
oy <- oxy$y
# Calculate control points
# Direction of rotation depends on 'hand'
dir <- switch(hand,
left=-1,
right=1)
# Angle of rotation depends on location of origin
maxtheta <- pi + sign(origin*dir)*2*atan(abs(origin))
theta <- seq(0, dir*maxtheta,
dir*maxtheta/(ncp + 1))[c(-1, -(ncp + 2))]
costheta <- cos(theta)
sintheta <- sin(theta)
# May have BOTH multiple end points AND multiple
# control points to generate (per set of end points)
# Generate consecutive sets of control points by performing
# matrix multiplication
cpx <- ox + ((newx1 - ox) %*% t(costheta)) -
((y1 - oy) %*% t(sintheta))
cpy <- oy + ((y1 - oy) %*% t(costheta)) +
((newx1 - ox) %*% t(sintheta))
# Reverse transformations (scaling and rotation) to
# produce control points in the original space
cpx <- cpx/scalex
sinnb <- sin(-beta)
cosnb <- cos(-beta)
finalcpx <- x1 + (cpx - x1)*cosnb - (cpy - y1)*sinnb
finalcpy <- y1 + (cpy - y1)*cosnb + (cpx - x1)*sinnb
# Debugging
if (debug) {
ox <- ox/scalex
fox <- x1 + (ox - x1)*cosnb - (oy - y1)*sinnb
foy <- y1 + (oy - y1)*cosnb + (ox - x1)*sinnb
grid.points(fox, foy, default.units="inches",
pch=16, size=unit(1, "mm"),
gp=gpar(col="grey"))
grid.circle(fox, foy, sqrt((ox - x1)^2 + (oy - y1)^2),
default.units="inches",
gp=gpar(col="grey"))
}
list(x=as.numeric(t(finalcpx)), y=as.numeric(t(finalcpy)))
}
calcOrigin <- function(x1, y1, x2, y2, origin, hand) {
# Positive origin means origin to the "right"
# Negative origin means origin to the "left"
xm <- (x1 + x2)/2
ym <- (y1 + y2)/2
dx <- x2 - x1
dy <- y2 - y1
slope <- dy/dx
oslope <- -1/slope
# The origin is a point somewhere along the line between
# the end points, rotated by 90 (or -90) degrees
# Two special cases:
# If slope is non-finite then the end points lie on a vertical line, so
# the origin lies along a horizontal line (oslope = 0)
# If oslope is non-finite then the end points lie on a horizontal line,
# so the origin lies along a vertical line (oslope = Inf)
tmpox <- ifelse(!is.finite(slope),
xm,
ifelse(!is.finite(oslope),
xm + origin*(x2 - x1)/2,
xm + origin*(x2 - x1)/2))
tmpoy <- ifelse(!is.finite(slope),
ym + origin*(y2 - y1)/2,
ifelse(!is.finite(oslope),
ym,
ym + origin*(y2 - y1)/2))
# ALWAYS rotate by -90 about midpoint between end points
# Actually no need for "hand" because "origin" also
# encodes direction
# sintheta <- switch(hand, left=-1, right=1)
sintheta <- -1
ox <- xm - (tmpoy - ym)*sintheta
oy <- ym + (tmpox - xm)*sintheta
list(x=ox, y=oy)
}
With that in place, I calculated a midpoint for each record:
df <- data.frame(x1 = 1, y1 = 1, x2 = 10, y2 = 10, details = "Object Name")
df_mid <- df %>%
mutate(midx = calcControlPoints(x1, y1, x2, y2,
angle = 130,
curvature = 0.5,
ncp = 1)$x) %>%
mutate(midy = calcControlPoints(x1, y1, x2, y2,
angle = 130,
curvature = 0.5,
ncp = 1)$y)
I then make the graph, but draw two separate curves: one from the origin to the calculated midpoint, and another from the midpoint to the destination. The angle and curvature settings, both for finding the midpoint and for drawing these curves, need careful tuning so that the result does not obviously look like two different curves.
ggplot(df_mid, aes(x = x1, y = y1)) +
geom_point(size = 4) +
geom_point(aes(x = x2, y = y2),
pch = 17, size = 4) +
geom_curve(aes(x = x1, y = y1, xend = midx, yend = midy),
curvature = 0.25, angle = 135) +
geom_curve(aes(x = midx, y = midy, xend = x2, yend = y2),
curvature = 0.25, angle = 45) +
geom_label_repel(aes(x = midx, y = midy, label = details),
box.padding = 4,
nudge_x = 0.5,
nudge_y = -2)
Though the answer isn't ideal or elegant, it scales with a large number of records.
Maybe annotations would help here (see: http://ggplot2.tidyverse.org/reference/annotate.html)
library(tidyverse)
library(ggrepel)
df <- data.frame(x1 = 1, y1 = 1, x2 = 2, y2 = 3, details = "Object Name")
ggplot(df, aes(x = x1, y = y1, label = details)) +
geom_point(size = 4) +
geom_point(aes(x = x2, y = y2),
pch = 17, size = 4) +
geom_curve(aes(x = x1, y = y1, xend = x2, yend = y2)) +
geom_label(nudge_y = 0.05) +
geom_label_repel(box.padding = 2) +
annotate("label", x=1.75, y=1.5, label=df$details)
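The hard-coded position passed to annotate() could also be computed from the end points. This is a minimal base R sketch, assuming a hypothetical helper offset_midpoint() and an arbitrary offset fraction k. It takes the midpoint of the straight chord and shifts it perpendicular to the chord, so it only approximates where geom_curve() draws the arc, which depends on the plot's aspect ratio.

```r
# Hypothetical helper: midpoint of the chord, shifted to the right of the
# direction of travel by a fraction k of the chord length (in data units).
offset_midpoint <- function(x1, y1, x2, y2, k = 0.25) {
  mx <- (x1 + x2) / 2
  my <- (y1 + y2) / 2
  dx <- x2 - x1
  dy <- y2 - y1
  # (dy, -dx) is the chord vector rotated by -90 degrees.
  list(x = mx + k * dy, y = my - k * dx)
}

# End points from the example data frame above.
pos <- offset_midpoint(1, 1, 2, 3)
pos$x  # 2
pos$y  # 1.75
```

The result could then be used as annotate("label", x = pos$x, y = pos$y, label = df$details), with k adjusted by eye until the label sits on the curve.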

R script - NLS not working

I have 5 (x,y) data points and I'm trying to find a best fit solution consisting of two lines which intersect at a point (x0,y0), and which follow these equations:
y1 = (m1)(x1 - x0) + y0
y2 = (m2)(x2 - x0) + y0
Specifically, I require that the intersection must occur between x=2 and x=3. Have a look at the code:
#Initialize x1, y1, x2, y2
x1 <- c(1,2)
y1 <- c(10,10)
x2 <- c(3,4,5)
y2 <- c(20,30,40)
g <- c(TRUE, TRUE, FALSE, FALSE, FALSE)
q <- nls(c(y1, y2) ~ ifelse(g == TRUE, m1 * (x1 - x0) + y0, m2 * (x2 - x0) + y0),
start = c(m1 = -1, m2 = 1, y0 = 0, x0 = 2),
algorithm = "port",
lower = c(m1 = -Inf, m2 = -Inf, y0 = -Inf, x0 = 2),
upper = c(m1 = Inf, m2 = Inf, y0 = Inf, x0 = 3))
coef <- coef(q)
m1 <- coef[1]
m2 <- coef[2]
y0 <- coef[3]
x0 <- coef[4]
#Plot the original x1, y1, and x2, y2
plot(x1,y1,xlim=c(1,5),ylim=c(0,50))
points(x2,y2)
#Plot the fits
x1 <- c(1,2,3,4,5)
fit1 <- m1 * (x1 - x0) + y0
lines(x1, fit1, col="red")
x2 <- c(1,2,3,4,5)
fit2 <- m2 * (x2 - x0) + y0
lines(x2, fit2, col="blue")
So, you can see the data points listed there. Then, I run it through my nls, get my parameters m1, m2, x0, y0 (the slopes, and the intersection point).
But, take a look at the solution:
Clearly, the red line (which is supposed to be based only on the first 2 points) is not the best-fit line for the first 2 points. The same is true of the blue line (the second fit), which is supposed to depend only on the last 3 points. What is wrong here?
This is segmented regression:
# input data
x1 <- c(1,2); y1 <- c(10,10); x2 <- c(3,4,5); y2 <- c(20,30,40)
x <- c(x1, x2); y <- c(y1, y2)
# segmented regression
library(segmented)
fm <- segmented.lm(lm(y ~ x), ~ x, NA, seg.control(stop.if.error = FALSE, K = 2))
summary(fm)
# plot
plot(fm)
points(y ~ x)
See ?lm, ?segmented.lm and ?seg.control for more info.
I'm not exactly sure what's wrong but I can get it to work by rearranging things a bit. Please note the comment in ?nls about "Do not use ‘nls’ on artificial "zero-residual" data."; I added a bit of noise.
## Initialize x1, y1, x2, y2
x1 <- c(1,2)
y1 <- c(10,10)
x2 <- c(3,4,5)
y2 <- c(20,30,40)
## make single x, y vector
x <- c(x1,x2)
set.seed(1001)
## (add a bit of noise to avoid zero-residual artificiality)
y <- c(y1,y2)+rnorm(5,sd=0.01)
g <- c(TRUE,TRUE,FALSE,FALSE,FALSE) ## specify identities of points
## particular changes:
## * you have lower=upper=2 for x0. Did you want 2<x0<3?
## * specified data argument explicitly (allows use of predict() etc.)
## * changed name from 'q' to 'fit1' (avoid R built-in function)
fit1 <- nls(y ~ ifelse(g,m1,m1+delta_m)*(x - x0) + y0,
start = c(m1 = -1, delta_m = 2, y0 = 0, x0 = 2),
algorithm = "port",
lower = c(m1 = -Inf, delta_m = 0, y0 = -Inf, x0 = 2),
upper = c(m1 = Inf, delta_m = Inf, y0 = Inf, x0 = 3),
data=data.frame(x,y))
#Plot the original 'data'
plot(x,y,col=rep(c("red","blue"),c(2,3)),
xlim=c(1,5),ylim=c(0,50))
## add predicted values
xvec <- seq(1,5,length.out=101)
lines(xvec,predict(fit1,newdata=data.frame(x=xvec)))
edit: based ifelse clause on point identity, not x position
edit: changed to require second slope to be > first slope
On a second look, I think the issue above is probably due to the use of separate vectors for x1 and x2 above, rather than a single x vector: I suspect these got replicated by R to match up with the g vector, which would have messed things up pretty badly. For example, this stripped-down example:
g <- c(TRUE, TRUE, FALSE, FALSE, FALSE)
ifelse(g,x1,x2)
## [1] 1 2 5 3 4
shows that x2 gets extended to (3 4 5 3 4) before being used in the ifelse clause. The scariest part is that normally one gets a warning such as this:
> x2 + 1:5
[1] 4 6 8 7 9
Warning message:
In x2 + 1:5 :
longer object length is not a multiple of shorter object length
but in this case there is no warning ...
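A small check makes the point concrete. The sketch below uses a single x vector of the same length as g and a hypothetical helper, piecewise(), that evaluates the two-line model from the question. The parameter values are picked by hand for illustration, not fitted.

```r
x1 <- c(1, 2)
x2 <- c(3, 4, 5)
g <- c(TRUE, TRUE, FALSE, FALSE, FALSE)

# One x vector, so nothing is recycled inside ifelse().
x <- c(x1, x2)
stopifnot(length(x) == length(g))

# Hypothetical helper: slope m1 where g is TRUE, slope m2 otherwise,
# both lines passing through (x0, y0).
piecewise <- function(x, g, m1, m2, x0, y0) {
  ifelse(g, m1, m2) * (x - x0) + y0
}

# Hand-picked parameters that reproduce the noise-free data exactly.
pred <- piecewise(x, g, m1 = 0, m2 = 10, x0 = 2, y0 = 10)
pred  # 10 10 20 30 40
```

With these values the intersection sits at x0 = 2, the lower end of the allowed interval, and the predictions match y1 and y2 from the question.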
