Related
When I do the below code on my data, since there are 35 variables the resulting plot is almost useless because of all the overlap. I can't seem to find anywhere that would give me the list of data that's used to make the plot. For instance, I have a factor called avg_sour that has a direction of about 272 degrees and a magnitude of 1. That's one of the few I can actually see. If I had this data in a table, however, I could see clearly what I'm looking for without having to zoom in and out every time. Add to that the fact that this is for a presentation, so I need to be able to make this visible quickly, without them looking at multiple things--but I think I could get away with a crowded graph and a table that explained the crowded portion. Seems like it ought to be simple, but...I'm afraid I haven't found it yet. Any ideas? I can use any package I can find.
ggbiplot(xD4PCA,obs.scale = .1, var.scale = 1,
varname.size = 3, labels.size=6, circle = T, alpha = 0, center = T)+
scale_x_continuous(limits=c(-2,2)) +
scale_y_continuous(limits=c(-2,2))
If your xD4PCA is from prcomp function, then $rotation gives you eigenvectors. See prcomp function - Value.
You may manually choose and add arrows from xD4PCA$rotation[,1:2]
I was working on this with sample data ir.pca, which is just simple prcomp object using iris data, and all these jobs are based on source code of ggbiplot.
pcobj <- ir.pca # change here with your prcomp object
nobs.factor <- sqrt(nrow(pcobj$x) - 1)
d <- pcobj$sdev
u <- sweep(pcobj$x, 2, 1 / (d * nobs.factor), FUN = '*')
v <- pcobj$rotation
choices = 1:2
choices <- pmin(choices, ncol(u))
df.u <- as.data.frame(sweep(u[,choices], 2, d[choices]^obs.scale, FUN='*'))
v <- sweep(v, 2, d^1, FUN='*')
df.v <- as.data.frame(v[, choices])
names(df.u) <- c('xvar', 'yvar')
names(df.v) <- names(df.u)
df.u <- df.u * nobs.factor
r <- sqrt(qchisq(circle.prob, df = 2)) * prod(colMeans(df.u^2))^(1/4)
v.scale <- rowSums(v^2)
df.v <- r * df.v / sqrt(max(v.scale))
df.v$varname <- rownames(v)
df.v$angle <- with(df.v, (180/pi) * atan(yvar / xvar))
df.v$hjust = with(df.v, (1 - 1.5 * sign(xvar)) / 2)
theta <- c(seq(-pi, pi, length = 50), seq(pi, -pi, length = 50))
circle <- data.frame(xvar = r * cos(theta), yvar = r * sin(theta))
df.v <- df.v[1:2,] # change here like df.v[1:2,]
ggbiplot::ggbiplot(ir.pca,obs.scale = .1, var.scale = 1,
varname.size = 3, labels.size=6, circle = T, alpha = 0, center = T, var.axes = FALSE)+
scale_x_continuous(limits=c(-2,2)) +
scale_y_continuous(limits=c(-2,2)) +
geom_segment(data = df.v, aes(x = 0, y = 0, xend = xvar, yend = yvar),
arrow = arrow(length = unit(1/2, 'picas')),
color = muted('red')) +
geom_text(data = df.v,
aes(label = rownames(df.v), x = xvar, y = yvar,
angle = angle, hjust = hjust),
color = 'darkred', size = 3)
ggbiplot::ggbiplot(ir.pca)+
scale_x_continuous(limits=c(-2,2)) +
scale_y_continuous(limits=c(-2,2)) +
geom_path(data = circle, color = muted('white'),
size = 1/2, alpha = 1/3)
Original one(having all four variables)
Edited one(select only first two variables)
I can't get the scatter3D() function to work in R. I would like to plot a surface but am unable to get the code to run although it is very simple to some I found online. I have tried reading the CRAN code documentation but I can't get anywhere. I would like my code to run.
Essentially there is a problem with some variable I am putting in but I can't see where. I think it might be hidden inside a function and the error message is not actually for me.
Any help would be appreciated.
Code:
library("plot3D")
w <- 1:100
sigma <- 5
x <- w + rnorm(w, w, 2*sigma)
y <- w + rnorm(w, w, sigma)
z<- w + rnorm(w, w, 3*sigma)
a <- c()
for (i in 1:10){
a <- c(a, rep(10*i, times = 10))
}
b <- rep(seq(10, 100, 10), times = 10)
ratio = 0.4
d = ratio*a + (1 - ratio)*b + rnorm(100, 0, 10)
fit <- lm(d ~ a + b)
grid.lines = 26
a.pred <- seq(min(a), max(a), length.out = grid.lines)
b.pred <- seq(min(b), max(b), length.out = grid.lines)
ab <- expand.grid(a = a.pred, b = b.pred)
d.pred <- matrix(predict(fit, newdata = ab),
nrow = grid.lines, ncol = grid.lines)
fitpoints <- predict(fit)
scatter plot with regression plane
scatter3D(a,b,d, pch = 18, cex = 2,
theta = 20, phi = 20,
surf = list(a=a.pred, b=b.pred, d=d.pred,
facets = F, fit = fitpoints))
Error message:
Error in addimg(poly = NULL, plist = plist, a = c(10, 13.6, 17.2, 20.8, :
argument 4 matches multiple formal arguments
I think you need to call x, y, and z-axis:
# scatter plot with regression plane
scatter3D(a,b,d, pch = 18, cex = 2,
theta = 20, phi = 20,
surf = list(x=a.pred, y=b.pred, z=d.pred,
facets = F, fit = fitpoints))
This worked in my case.
Comment concerning bugfixing:
Run your code line by line (RStudio Ctrl + Enter). Then you realize that your scatter3D is the problem.
Remove arguments until the command works. In your case this suggested that surf = list(...) was the problem.
I have a little issue with the binomial tree plot in R; I'm using the package fOptions. Given St=39, K=40, T1=0.5, r=0.02, sigma=0.2, n=2, I use the following code:
CRRTree<- BinomialTreeOption(TypeFlag='ce',39,40,0.5,0.02,0.02,0.2,2)
BinomialTreePlot(CRRTree)
and the corresponding plot is
I have two problems.
First: I want that the x axis starts from zero and goes to 2
Second: I don't undestand why the upper value of the tree is not showed in the picture; how can I fix it?
Thank you very much.
EDIT: I solved the second problem in the easiest way, I think. It was sufficient to code the plot in this way:
BinomialTreePlot(CRRTree,ylim=c(-2,2.5))
There is an easy way to solve also the problem of making the tree starts from 0?
You will have to modify the code for the BinomialTreePlot function. For example, you could try something like that:
my_BinomialTreePlot<-function (BinomialTreeValues, dx = -0.025, dy = 0.4, cex = 1,
digits = 2, ...)
{
Tree = round(BinomialTreeValues, digits = digits)
depth = ncol(Tree)
plot(x = c(0, depth-1), y = c(-depth + 1, depth - 1), type = "n",
col = 0, ...)
points(x = 0, y = 0)
text(0 + dx, 0 + dy, deparse(Tree[1, 1]), cex = cex)
for (i in 1:(depth - 1)) {
y = seq(from = -i, by = 2, length = i + 1)
x = rep(i, times = length(y)) + 0
points(x, y, col = 1)
for (j in 1:length(x)) text(x[j] + dx, y[j] + dy, deparse(Tree[length(x) +
1 - j, i + 1]), cex = cex)
y = (-i):i
x = rep(c(i, i-1), times = 2 * i)[1:length(y)]
lines(x, y, col = 2)
}
invisible()
}
Then use it like this:
CRRTree<- BinomialTreeOption(TypeFlag='ce',39,40,0.5,0.02,0.02,0.2,2)
my_BinomialTreePlot(CRRTree,xlim=c(-0.1,2), ylim=c(-2.5,2.5))
The question R interpolated polar contour plot shows an excellent way to produce interpolated polar plots in R. I include the very slightly modified version I'm using:
PolarImageInterpolate <- function(
### Plotting data (in cartesian) - will be converted to polar space.
x, y, z,
### Plot component flags
contours=TRUE, # Add contours to the plotted surface
legend=TRUE, # Plot a surface data legend?
axes=TRUE, # Plot axes?
points=TRUE, # Plot individual data points
extrapolate=FALSE, # Should we extrapolate outside data points?
### Data splitting params for color scale and contours
col_breaks_source = 1, # Where to calculate the color brakes from (1=data,2=surface)
# If you know the levels, input directly (i.e. c(0,1))
col_levels = 10, # Number of color levels to use - must match length(col) if
#col specified separately
col = rev(heat.colors(col_levels)), # Colors to plot
# col = rev(heat.colors(col_levels)), # Colors to plot
contour_breaks_source = 1, # 1=z data, 2=calculated surface data
# If you know the levels, input directly (i.e. c(0,1))
contour_levels = col_levels+1, # One more contour break than col_levels (must be
# specified correctly if done manually
### Plotting params
outer.radius = ceiling(max(sqrt(x^2+y^2))),
circle.rads = pretty(c(0,outer.radius)), #Radius lines
spatial_res=1000, #Resolution of fitted surface
single_point_overlay=0, #Overlay "key" data point with square
#(0 = No, Other = number of pt)
### Fitting parameters
interp.type = 1, #1 = linear, 2 = Thin plate spline
lambda=0){ #Used only when interp.type = 2
minitics <- seq(-outer.radius, outer.radius, length.out = spatial_res)
# interpolate the data
if (interp.type ==1 ){
Interp <- akima:::interp(x = x, y = y, z = z,
extrap = extrapolate,
xo = minitics,
yo = minitics,
linear = FALSE)
Mat <- Interp[[3]]
}
else if (interp.type == 2){
library(fields)
grid.list = list(x=minitics,y=minitics)
t = Tps(cbind(x,y),z,lambda=lambda)
tmp = predict.surface(t,grid.list,extrap=extrapolate)
Mat = tmp$z
}
else {stop("interp.type value not valid")}
# mark cells outside circle as NA
markNA <- matrix(minitics, ncol = spatial_res, nrow = spatial_res)
Mat[!sqrt(markNA ^ 2 + t(markNA) ^ 2) < outer.radius] <- NA
### Set contour_breaks based on requested source
if ((length(contour_breaks_source == 1)) & (contour_breaks_source[1] == 1)){
contour_breaks = seq(min(z,na.rm=TRUE),max(z,na.rm=TRUE),
by=(max(z,na.rm=TRUE)-min(z,na.rm=TRUE))/(contour_levels-1))
}
else if ((length(contour_breaks_source == 1)) & (contour_breaks_source[1] == 2)){
contour_breaks = seq(min(Mat,na.rm=TRUE),max(Mat,na.rm=TRUE),
by=(max(Mat,na.rm=TRUE)-min(Mat,na.rm=TRUE))/(contour_levels-1))
}
else if ((length(contour_breaks_source) == 2) & (is.numeric(contour_breaks_source))){
contour_breaks = pretty(contour_breaks_source,n=contour_levels)
contour_breaks = seq(contour_breaks_source[1],contour_breaks_source[2],
by=(contour_breaks_source[2]-contour_breaks_source[1])/(contour_levels-1))
}
else {stop("Invalid selection for \"contour_breaks_source\"")}
### Set color breaks based on requested source
if ((length(col_breaks_source) == 1) & (col_breaks_source[1] == 1))
{zlim=c(min(z,na.rm=TRUE),max(z,na.rm=TRUE))}
else if ((length(col_breaks_source) == 1) & (col_breaks_source[1] == 2))
{zlim=c(min(Mat,na.rm=TRUE),max(Mat,na.rm=TRUE))}
else if ((length(col_breaks_source) == 2) & (is.numeric(col_breaks_source)))
{zlim=col_breaks_source}
else {stop("Invalid selection for \"col_breaks_source\"")}
# begin plot
Mat_plot = Mat
Mat_plot[which(Mat_plot<zlim[1])]=zlim[1]
Mat_plot[which(Mat_plot>zlim[2])]=zlim[2]
image(x = minitics, y = minitics, Mat_plot , useRaster = TRUE, asp = 1, axes = FALSE, xlab = "", ylab = "", zlim = zlim, col = col)
# add contours if desired
if (contours){
CL <- contourLines(x = minitics, y = minitics, Mat, levels = contour_breaks)
A <- lapply(CL, function(xy){
lines(xy$x, xy$y, col = gray(.2), lwd = .5)
})
}
# add interpolated point if desired
if (points){
points(x, y, pch = 21, bg ="blue")
}
# add overlay point (used for trained image marking) if desired
if (single_point_overlay!=0){
points(x[single_point_overlay],y[single_point_overlay],pch=0)
}
# add radial axes if desired
if (axes){
# internals for axis markup
RMat <- function(radians){
matrix(c(cos(radians), sin(radians), -sin(radians), cos(radians)), ncol = 2)
}
circle <- function(x, y, rad = 1, nvert = 500){
rads <- seq(0,2*pi,length.out = nvert)
xcoords <- cos(rads) * rad + x
ycoords <- sin(rads) * rad + y
cbind(xcoords, ycoords)
}
# draw circles
if (missing(circle.rads)){
circle.rads <- pretty(c(0,outer.radius))
}
for (i in circle.rads){
lines(circle(0, 0, i), col = "#66666650")
}
# put on radial spoke axes:
axis.rads <- c(0, pi / 6, pi / 3, pi / 2, 2 * pi / 3, 5 * pi / 6)
r.labs <- c(90, 60, 30, 0, 330, 300)
l.labs <- c(270, 240, 210, 180, 150, 120)
for (i in 1:length(axis.rads)){
endpoints <- zapsmall(c(RMat(axis.rads[i]) %*% matrix(c(1, 0, -1, 0) * outer.radius,ncol = 2)))
segments(endpoints[1], endpoints[2], endpoints[3], endpoints[4], col = "#66666650")
endpoints <- c(RMat(axis.rads[i]) %*% matrix(c(1.1, 0, -1.1, 0) * outer.radius, ncol = 2))
lab1 <- bquote(.(r.labs[i]) * degree)
lab2 <- bquote(.(l.labs[i]) * degree)
text(endpoints[1], endpoints[2], lab1, xpd = TRUE)
text(endpoints[3], endpoints[4], lab2, xpd = TRUE)
}
axis(2, pos = -1.25 * outer.radius, at = sort(union(circle.rads,-circle.rads)), labels = NA)
text( -1.26 * outer.radius, sort(union(circle.rads, -circle.rads)),sort(union(circle.rads, -circle.rads)), xpd = TRUE, pos = 2)
}
# add legend if desired
# this could be sloppy if there are lots of breaks, and that's why it's optional.
# another option would be to use fields:::image.plot(), using only the legend.
# There's an example for how to do so in its documentation
if (legend){
library(fields)
image.plot(legend.only=TRUE, smallplot=c(.78,.82,.1,.8), col=col, zlim=zlim)
# ylevs <- seq(-outer.radius, outer.radius, length = contour_levels+ 1)
# #ylevs <- seq(-outer.radius, outer.radius, length = length(contour_breaks))
# rect(1.2 * outer.radius, ylevs[1:(length(ylevs) - 1)], 1.3 * outer.radius, ylevs[2:length(ylevs)], col = col, border = NA, xpd = TRUE)
# rect(1.2 * outer.radius, min(ylevs), 1.3 * outer.radius, max(ylevs), border = "#66666650", xpd = TRUE)
# text(1.3 * outer.radius, ylevs[seq(1,length(ylevs),length.out=length(contour_breaks))],round(contour_breaks, 1), pos = 4, xpd = TRUE)
}
}
Unfortunately, this function has a few bugs:
a) Even with a purely radial pattern, the produced plot has a distortion whose origin I don't understand:
#example
r <- rep(seq(0.1, 0.9, len = 8), each = 8)
theta <- rep(seq(0, 7/4*pi, by = pi/4), times = 8)
x <- r*sin(theta)
y <- r*cos(theta)
z <- z <- rep(seq(0, 1, len = 8), each = 8)
PolarImageInterpolate(x, y, z)
why the wiggles between 300° and 360°? The z function is constant in theta, so there's no reason why there should be wiggles.
b) After 4 years, some of the packages loaded have been modified and at least one functionality of the function is broken. Setting interp.type = 2 should use thin plate splines for interpolation instead than a basic linear interpolation, but it doesn't work:
> PolarImageInterpolate(x, y, z, interp.type = 2)
Warning:
Grid searches over lambda (nugget and sill variances) with minima at the endpoints:
(GCV) Generalized Cross-Validation
minimum at right endpoint lambda = 9.493563e-06 (eff. df= 60.80002 )
predict.surface is now the function predictSurface
Error in image.default(x = minitics, y = minitics, Mat_plot, useRaster = TRUE, :
'z' must be a matrix
the first message is a warning and doesn't worry me, but the second one is actually an error and prevents me from using thin plate splines. Can you help me solve these two problems?
Also, I'd like to "upgrade" to using ggplot2, so if you can give an answer which does that, it would be great. Otherwise, after the bugs are fixed, I'll try asking a specific question which only asks to modify the function so that it uses ggplot2.
For the ggplot2 solution, here is a start. geom_raster allows interpolation, but does not work for polar coordinates. Instead, you can use geom_tile, though then you may need to do the interpolation yourself before passing the values to ggplot.
Of important note: the example data you gave gives an error when working with geom_raster that I believe is caused by the spacing of the values. Here is an example set that works (note, used this blog as a guide, though it is now outdated):
dat_grid <-
expand.grid(x = seq(0,350,10), y = 0:10)
dat_grid$density <- runif(nrow(dat_grid))
ggplot(dat_grid
, aes(x = x, y = y, fill = density)) +
geom_tile() +
coord_polar() +
scale_x_continuous(breaks = seq(0,360,90)) +
scale_fill_gradient2(low = "white"
, mid = "yellow"
, high = "red3"
, midpoint = 0.5)
If you are working from raw data, you might be able to get ggplot to do the work for you. Here is an example working from raw data. There are a lot of manual tinkering things to do, but it is at least an optional starting point:
polarData <-
data.frame(
theta = runif(10000, 0, 2*pi)
, r = log(abs(rnorm(10000, 0, 10)))
)
toCart <-
data.frame(
x = polarData$r * cos(polarData$theta)
, y = polarData$r * sin(polarData$theta)
)
axisLines <-
data.frame(
x = 0
, y = 0
, xend = max(polarData$r)*cos(seq(0, 2*pi, pi/4))
, yend = max(polarData$r)*sin(seq(0, 2*pi, pi/4))
, angle = paste(seq(0, 2, 1/4), "pi") )
ticks <-
data.frame(
label = pretty(c(0, max(polarData$r)) )[-1]
)
ggplot(toCart) +
# geom_point(aes(x = x, y = y)) +
stat_density_2d(aes(x = x, y = y
, fill = ..level..)
, geom = "polygon") +
scale_fill_gradient(low = "white"
, high = "red3") +
theme(axis.text = element_blank()
, axis.title = element_blank()
, axis.line = element_blank()
, axis.ticks = element_blank()) +
geom_segment(data = axisLines
, aes(x = x, y = y
, xend = xend
, yend = yend)) +
geom_label(data = axisLines
, aes(x = xend, y = yend, label = angle)) +
geom_label(data = ticks
, aes(x = 0, y = label, label = label))
From an another post, I came to know that the fucnction predict.surface from package fields is deprecated whic is used for interp.type = 2 in PolarImageInterpolate. Instead, a new predictSurface function is introduced, which can be used here.
Example:
r <- rep(seq(0.1, 0.9, len = 8), each = 8)
theta <- rep(seq(0, 7/4*pi, by = pi/4), times = 8)
x <- r*sin(theta)
y <- r*cos(theta)
z <- z <- rep(seq(0, 1, len = 8), each = 8)
PolarImageInterpolate(x, y, z, interp.type = 2)
I'd like to create a sankey-like plot that I can create in ggplot2 where there are curved lines between my start and end locations. Currently, I have data that looks like this:
df <- data.frame(Line = rep(letters[1:4], 2),
Location = rep(c("Start", "End"), each=4),
X = rep(c(1, 10), each = 4),
Y = c(c(1,3, 5, 15), c(9,12, 14, 6)),
stringsAsFactors = F)
ex:
Line Location X Y
1 a Start 1 1
2 a End 10 9
and creates a plot that looks something like this:
library(ggplot2)
ggplot(df) +
geom_path(aes(x= X, y= Y, group = Line))
I would like to see the data come out like this:
This is another option for setting up the data:
df2 <- data.frame(Line = letters[1:4],
Start.X= rep(1, 4),
Start.Y = c(1,3,5,15),
End.X = rep(10, 4),
End.Y = c(9,12,14,6))
ex:
Line Start.X Start.Y End.X End.Y
1 a 1 1 10 9
I can find examples of how to add a curve to the graphics of base R but these examples don't demonstrate how to get a data frame of the points in between in order to draw that curve. I would prefer to use dplyr for data manipulation. I imagine this will require a for-loop to build a table of the interpolated points.
These examples are similar but do not produce an s-shaped curve:
Plotting lines on map - gcIntermediate
http://flowingdata.com/2011/05/11/how-to-map-connections-with-great-circles/
Thank you in advance!
The code below creates curved lines via a logistic function. You could use whatever function you like instead, but this is the main idea. I should note that for other than graphical purposes, creating a curved line out of 2 points is a bad idea. It implies that the data show a certain type of relation while it actually doesn't imply that relation.
df <- data.frame(Line = rep(letters[1:4], 2),
Location = rep(c("Start", "End"), each=4),
X = rep(c(1, 10), each = 4),
Y = c(c(1,3, 5, 15), c(9,12, 14, 6)),
stringsAsFactors = F)
# logistic function for curved lines
logistic = function(x, y, midpoint = mean(x)) {
ry = range(y)
if (y[1] < y[2]) {
sign = 2
} else {
sign = -2
}
steepness = sign*diff(range(x)) / diff(ry)
out = (ry[2] - ry[1]) / (1 + exp(-steepness * (x - midpoint))) + ry[1]
return(out)
}
# an example
x = c(1, 10)
y = c(1, 9)
xnew = seq(1, 10, .5)
ynew = logistic(xnew, y)
plot(x, y, type = 'b', bty = 'n', las = 1)
lines(xnew, ynew, col = 2, type = 'b')
# applying the function to your example
xnew = seq(min(df$X), max(df$X), .1) # new x grid
m = matrix(NA, length(xnew), 4) # matrix to store results
uniq = unique(df$Line) # loop over all unique values in df$Line
for (i in seq_along(uniq)) {
m[, i] = logistic(xnew, df$Y[df$Line == uniq[i]])
}
# base R plot
matplot(xnew, m, type = 'b', las = 1, bty = 'n', pch = 1)
# put stuff in a dataframe for ggplot
df2 = data.frame(x = rep(xnew, ncol(m)),
y = c(m),
group = factor(rep(1:ncol(m), each = nrow(m))))
library(ggplot2)
ggplot(df) +
geom_path(aes(x= X, y= Y, group = Line, color = Line)) +
geom_line(data = df2, aes(x = x, y = y, group = group, color = group))