How to set a logarithmic scale across multiple ggplot2 contour plots? - r

I am attempting to create three contour plots, each illustrating the following function applied to two input vectors and a fixed alpha:
alphas <- c(1, 5, 25)
x_vals <- seq(0, 25, length.out = 100)
y_vals <- seq(0, 50, length.out = 100)
my_function <- function(x, y, alpha) {
z <- (1 / (x + alpha)) * (1 / (y + alpha))
}
For each alpha in the vector alphas, I am creating a contour plot of the z values, relative to the minimal z value, over the x and y axes.
I do so with the following code (probably not best practices; I'm still learning the basics with R):
library(ggplot2)
library(purrr) # provides flatten()
plots <- list()
for(i in seq_along(alphas)) {
z_table <- sapply(x_vals, my_function, y = y_vals, alpha = alphas[i])
x <- rep(x_vals, each = 100)
y <- rep(y_vals, 100)
z <- unlist(flatten(list(z_table)))
z_rel <- z / min(z)
d <- data.frame(cbind(x, y, z_rel))
plots[[i]] <- ggplot(data = d, aes(x = x, y = y, z = z_rel)) +
geom_contour_filled()
}
When alpha = 1:
When alpha = 25:
I want to display these plots in one grouping using ggarrange(), with one logarithmic color scale (as relative z varies so much from plot to plot). Is there a way to do this?

You can build a single data frame holding the data for all alphas, with a column indicating the alpha, so that you can facet the graph.
I basically removed the plots[[i]] part and stacked up the d's created in your loop:
d = numeric()
for(i in seq_along(alphas)) {
z_table <- sapply(x_vals, my_function, y = y_vals, alpha = alphas[i])
x <- rep(x_vals, each = 100)
y <- rep(y_vals, 100)
z <- unlist(flatten(list(z_table)))
z_rel <- z / min(z)
d <- rbind(d, cbind(x, y, z_rel))
}
d = as.data.frame(d)
Then we create the alphas column:
d$alpha = factor(paste("alpha =", alphas[rep(1:3, each=nrow(d)/length(alphas))]),
levels = paste("alpha =", alphas[1:3]))
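As an aside, the loop and the alpha column above could also be built in one step with expand.grid(). This is only a sketch in base R, not the code used for the plot below; the name d2 and the use of ave() for the per-alpha minimum are my own choices, and the formula is copied from my_function in the question.

```r
alphas <- c(1, 5, 25)
x_vals <- seq(0, 25, length.out = 100)
y_vals <- seq(0, 50, length.out = 100)

# One row per (x, y, alpha) combination
d2 <- expand.grid(x = x_vals, y = y_vals, alpha = alphas)

# Same formula as my_function in the question
d2$z <- (1 / (d2$x + d2$alpha)) * (1 / (d2$y + d2$alpha))

# Divide by the minimum z within each alpha, as z / min(z) does in the loop
d2$z_rel <- d2$z / ave(d2$z, d2$alpha, FUN = min)

# Labelled factor for faceting, in the original order of alphas
d2$alpha <- factor(paste("alpha =", d2$alpha),
                   levels = paste("alpha =", alphas))
```

d2 has the same x, y, z_rel and alpha columns as d (with the rows in a different order), so it should be usable in place of d in the plotting call.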
Then build the log scale inside the contour:
ggplot(data = d, aes(x = x, y = y, z = z_rel)) +
geom_contour_filled(breaks=round(exp(seq(log(1), log(1400), length = 14)),1)) +
facet_wrap(~alpha)
Output:
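To make the break calculation a bit more explicit, here is a small sketch of how the log-spaced breaks can be generated. The helper name log_breaks is hypothetical (it is not a ggplot2 function); it just wraps the exp(seq(log(...))) expression used above.

```r
# Hypothetical helper: n breaks equally spaced on a log scale between lo and hi
log_breaks <- function(lo, hi, n) {
  exp(seq(log(lo), log(hi), length.out = n))
}

# The upper limit has to be at least the largest z_rel in the data;
# for alpha = 1 that is 26 * 51 = 1326, so 1400 covers it.
b <- log_breaks(1, 1400, 14)

# Consecutive breaks differ by a constant factor, which is what makes
# the filled bands equally wide on a log scale.
ratios <- b[-1] / b[-length(b)]

round(b, 1)
```

The rounded vector is what gets passed as geom_contour_filled(breaks = ...); rounding only tidies the legend labels.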

Related

Standalone legend in ggpairs

How can I include a legend inside one of the empty panels of the following matrix plot?
I have color coded different regression lines in the plots. I need a legend based on color.
I believe this answer comes closest to answering my question, yet I do not know exactly how to modify my code to get a legend based on color for the different regression lines.
As for the background of the code, I am trying to study different robust and non-robust regression methods applied to multivariate data with and without outliers.
library(ggplot2)
library(GGally)
library(MASS)
library(robustbase)
## Just create data -- you can safely SKIP this function.
##
## Take in number of input variables (k), vector of ranges of k inputs
## ranges = c(min1, max1, min2, max2, ...) (must have 2k elements),
## parameters to create data (must be consistent with the number of
## input variables plus one), parameters are vector of linear
## coefficients (b) and random seed (seed), number of observations
## (n), vector of outliers (outliers)
##
## Return uncontaminated dataframe and contaminated dataframe
create_data <- function(k, ranges, b, seed = 6, n,
outliers = NULL) {
x <- NULL # x: matrix of input variables
for (i in 1:k) {
set.seed(seed^i)
## x <- cbind(x, runif(n, ranges[2*i-1], ranges[2*i]))
x <- cbind(x, rnorm(n, ranges[2*i-1], ranges[2*i]))
}
set.seed(seed - 2)
x_aug = cbind(rep(1, n), x)
y <- x_aug %*% b
y_mean = mean(y)
e <- rnorm(n, 0, 0.20 * y_mean) # rnorm x
y <- y + e
df <- data.frame(x = x, y = y)
len <- length(outliers)
n_rows <- len %/% (k+1)
if (!is.null(outliers)) {
outliers <- matrix(outliers, n_rows, k+1, byrow = TRUE)
df_contamin <- data.frame(x = rbind(x, outliers[,1:k]), y = c(y, outliers[,k+1]))
} else {
df_contamin <- df
}
dat <- list(df, df_contamin)
}
# plot different regression models (some are robust) for two types of
# data (one is contaminated with outliers)
plot_models <- function(data, mapping, data2) {
cb_palette <- c("#999999", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
## 1.grey, 2.light orange, 3.light blue, 4.green, 5.yellow, 6.blue, 7.red, 8.purple
plt <- ggplot(data = data, mapping = mapping) +
geom_point() +
theme_bw() +
geom_smooth(method = lm, formula = y ~ x, data = data2, color = cb_palette[3], se = FALSE) +
geom_smooth(method = lm, formula = y ~ x, color = cb_palette[7], se = FALSE) +
geom_smooth(method = rlm, formula = y ~ x, color = cb_palette[4], se = FALSE) +
geom_smooth(method = lmrob, formula = y ~ x, color = cb_palette[1], se = FALSE)
plt
}
# trim the upper and right panels of plots
trim_gg <- function(gg) {
n <- gg$nrow
gg$nrow <- gg$ncol <- n-1
v <- 1:n^2
gg$plots <- gg$plots[v > n & v%%n != 0]
gg$xAxisLabels <- gg$xAxisLabels[-n]
gg$yAxisLabels <- gg$yAxisLabels[-1]
gg
}
dat <- create_data(3, c(1, 10, 1, 10, 1, 10), c(5, 8, 6, 7), 6, 20, c(30, 30, 50, 400))
df <- dat[[1]]
df_contamin <- dat[[2]]
## Note that plot_models is called here
g <- ggpairs(df_contamin, columns = 1:4, lower = list(continuous = wrap(plot_models, data2 = df)), diag = list(continuous = "blankDiag"), upper = list(continuous = "blank")) #, legend = lgd)
gr <- trim_gg(g)
print(gr)
Created on 2019-10-09 by the reprex package (v0.3.0)
Sorry for the long code, but most probably only the plot_models function and the line where ggpairs is called need to be modified.
I want to get a legend in the blank upper half of the plots. It may be done by somehow tweaking the plot_models function, setting the mapping in ggpairs to color using ggplot2::aes_string, and using getPlot and putPlot of the GGally package. But I can't wrap my head around how to do it exactly.

List of plots generated in ggplot2 using scale_color_gradientn have wrong coloring

I'm attempting to use library(scales) and scale_color_gradientn() to create a custom mapping of colors to a continuous variable, in an attempt to limit the effect of outliers on the coloring of my plot. This works for a single plot, but does not work when I use a loop to generate several plots and store them in a list.
Here is a minimal working example:
library(ggplot2)
library(scales)
data1 <- as.data.frame(cbind(x = rnorm(100),
y = rnorm(100),
v1 = rnorm(100, mean = 2, sd = 1),
v2 = rnorm(100, mean = -2, sd = 1)))
#add outliers
data1[1,"v1"] <- 200
data1[2,"v1"] <- -200
data1[1,"v2"] <- 50
data1[2,"v2"] <- -50
#define color palette
cols <- colorRampPalette(c("#3540FF","black","#FF3535"))(n = 100)
#simple color scale
col2 <- scale_color_gradient2(low = "#3540FF",
mid = "black",
high = "#FF3535"
)
#outlier-adjusted color scale
{
aa <- min(data1$v1)
bb <- quantile(data1$v1, 0.05)
cc <- quantile(data1$v1, 0.95)
dd <- max(data1$v1)
coln <- scale_color_gradientn(colors = cols[c(1,5,95,100)],
values = rescale(c(aa,bb,cc,dd),
limits = c(aa,dd))
)
}
Plots:
1. Plot with simple scales - outliers cause scales to stretch out.
ggplot(data1, aes(x = x, y = y, color = v1))+
geom_point()+
col2
2. Plot with outlier-adjusted scales - correct color scaling.
ggplot(data1, aes(x = x, y = y, color = v1))+
geom_point()+
coln
3. The scales for v1 do not work for v2 as the data is different.
ggplot(data1, aes(x = x, y = y, color = v2))+
geom_point()+
coln
#loop to produce list of plots each with own scale
{
plots <- list()
k <- 1
for (i in c("v1","v2")){
aa <- min(data1[,i])
bb <- quantile(data1[,i],0.05)
cc <- quantile(data1[,i], 0.95)
dd <- max(data1[,i])
colm <- scale_color_gradientn(colors = cols[c(1,5,95,100)],
values = rescale(c(aa,bb,cc,dd),
limits = c(aa,dd)))
plots[[k]] <- ggplot(data1, aes_string(x = "x",
y = "y",
color = i
))+
geom_point()+
colm
k <- k + 1
}
}
4. First plot has the wrong scales.
plots[[1]]
5. Second plot has the correct scales.
plots[[2]]
So I'm guessing this has something to do with the scale_color_gradientn() function being called when the plotting takes place, rather than within the loop.
If anyone can help with this, it'd be much appreciated. In base R I would bin the continuous data and assign hex colors to a vector used for the fill color, but I'm unsure how I can apply this within ggplot.
You need to use a closure (function with associated environment):
{
plots <- list()
k <- 1
for (i in c("v1", "v2")){
colm <- function() {
aa <- min(data1[, i])
bb <- quantile(data1[, i], 0.05)
cc <- quantile(data1[, i], 0.95)
dd <- max(data1[, i])
scale_color_gradientn(colors = cols[c(1, 5, 95, 100)],
values = rescale(c(aa, bb, cc, dd),
limits = c(aa, dd)))
}
plots[[k]] <- ggplot(data1, aes_string(x = "x",
y = "y",
color = i)) +
geom_point() +
colm()
k <- k + 1
}
}
plots[[1]]
plots[[2]]
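If it helps, the four scale positions computed inside colm() can also be factored out into a helper that takes the column values as an argument. This is only a sketch in base R: the name outlier_values and the probs argument are made up for illustration, and the rescaling to [0, 1] is written out by hand rather than calling rescale().

```r
# Hypothetical helper: positions in [0, 1] of min, 5% quantile,
# 95% quantile and max of v, for use as `values` in scale_color_gradientn()
outlier_values <- function(v, probs = c(0.05, 0.95)) {
  anchors <- c(min(v), quantile(v, probs, names = FALSE), max(v))
  (anchors - anchors[1]) / (anchors[4] - anchors[1])
}

# Toy data with two extreme outliers, similar to v1 in the question
set.seed(1)
v <- c(rnorm(98, mean = 2, sd = 1), 200, -200)
vals <- outlier_values(v)
vals
```

Inside the loop this would then be used as scale_color_gradientn(colors = cols[c(1, 5, 95, 100)], values = outlier_values(data1[, i])), so that each plot gets positions computed from its own column.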

surface plots of large 3D datasets using R [duplicate]

Could you give me an example of how to use rgl to plot 3 variables on the x, y and z axes and a fourth one with different colours?
Thanks
You use a combination of persp and colour according to a separate function. Here's some example code:
## Create a simple surface f(x,y) = -x^2 - y^2
## Colour the surface according to x^2 only
nx = 31; ny = 31
x = seq(-1, 1, length = nx)
y = seq(-1, 1, length = ny)
z = outer(x, y, function(x,y) -x^2 -y^2)
## Fourth dim
z_col = outer(x, y, function(x,y) x^2)
## Average the values at the corner of each facet
## and scale to a value in [0, 1]. We will use this
## to select a gray for colouring the facet.
hgt = 0.25 * (z_col[-nx,-ny] + z_col[-1,-ny] + z_col[-nx,-1] + z_col[-1,-1])
hgt = (hgt - min(hgt))/ (max(hgt) - min(hgt))
## Plot the surface with the specified facet colours.
persp(x, y, z, col = gray(1 - hgt))
persp(x, y, z, col=cm.colors(32)[floor(31*hgt+1)], theta=-35, phi=10)
This gives:
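The facet-averaging step can be wrapped in a small helper so that it works with any palette. This is a sketch only: the function name facet_colours is made up, and it assumes the colour variable is not constant over the grid (otherwise the rescaling divides by zero).

```r
# Hypothetical helper: one colour per persp() facet, taken from `palette`,
# based on the mean of the colour variable v at the four facet corners
facet_colours <- function(v, palette) {
  nx <- nrow(v)
  ny <- ncol(v)
  # Average the four corners of each facet: (nx - 1) x (ny - 1) values
  m <- 0.25 * (v[-nx, -ny] + v[-1, -ny] + v[-nx, -1] + v[-1, -1])
  # Rescale to [0, 1] and map onto palette indices 1..length(palette)
  m <- (m - min(m)) / (max(m) - min(m))
  palette[floor((length(palette) - 1) * m + 1)]
}

x <- seq(-1, 1, length.out = 31)
y <- seq(-1, 1, length.out = 31)
z <- outer(x, y, function(x, y) -x^2 - y^2)
z_col <- outer(x, y, function(x, y) x^2)

fc <- facet_colours(z_col, cm.colors(32))
persp(x, y, z, col = fc, theta = -35, phi = 10)
```

With cm.colors(32) this reproduces the floor(31 * hgt + 1) indexing used above.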
RGL
It's fairly straightforward to use the above technique with the rgl library:
library(rgl)
## Generate the data using the above commands
## New window
open3d()
## clear scene:
clear3d("all")
## setup env:
bg3d(color="#887777")
light3d()
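## surface3d() takes one colour per vertex (nx * ny values), not per facet,
## so rescale z_col itself instead of reusing the facet averages in hgt
vcol = (z_col - min(z_col)) / (max(z_col) - min(z_col))
surface3d(x, y, z, color=cm.colors(32)[floor(31*vcol+1)], alpha=0.5)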
There is an example in ?plot3d if you are talking about plotting points in a 3d space and colouring them:
x <- sort(rnorm(1000))
y <- rnorm(1000)
z <- rnorm(1000) + atan2(x,y)
plot3d(x, y, z, col=rainbow(1000))
But if you mean to colour the points by a 4th variable, say a grouping variable, then we can modify the example above to do this by creating a grouping variable:
grp <- gl(5, 200) ## 5 groups 200 members each
## now select the colours we want
cols <- 1:5
## Now plot
plot3d(x, y, z, col=cols[grp])
OK, is this more what you want?
X <- 1:10
Y <- 1:10
## Z is now a 100 row object of X,Y combinations
Z <- expand.grid(X = X, Y = Y)
## Add in Z1, which is the 3rd variable
## X,Y,Z1 define the surface, which we colour according to
## 4th variable Z2
Z <- within(Z, {
Z1 <- 1.2 + (1.4 * X) + (-1.9 * Y)
Z2 <- 1.2 + (1.4 * X) - (1.2 * X^2) + (1.9 * Y) + (-1.3 * Y^2)
Z3 <- 1.2 + (1.4 * X) + (-1.9 * Y) + (-X^2) + (-Y^2)})
## show the data
head(Z)
## Set-up the rgl device
with(Z, plot3d(X, Y, Z1, type = "n"))
## Need a scale for Z2 to display as colours
## Here I choose 10 equally spaced colours from a palette
cols <- heat.colors(10)
## Break Z2 into 10 equal regions
cuts <- with(Z, cut(Z2, breaks = 10))
## Add in the surface, colouring by Z2
with(Z, surface3d(1:10,1:10, matrix(Z1, ncol = 10),
color = cols[cuts], back = "fill"))
with(Z, points3d(X, Y, Z1, size = 5)) ## show grid X,Y,Z1
Here's a modification where the plane surface Z1 is curved (Z3).
## Set-up the rgl device plotting Z3, a curved surface
with(Z, plot3d(X, Y, Z3, type = "n"))
with(Z, surface3d(1:10,1:10, matrix(Z3, ncol = 10),
color = cols[cuts], back = "fill"))
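The colour lookup used in both surfaces, cols[cuts], relies on cut() returning a factor whose integer codes index into the palette. A minimal base R sketch of just that step, with made-up values standing in for Z2:

```r
# Made-up values standing in for the 4th variable
v <- c(-3, 0.5, 2, 7.5, 12)

pal <- heat.colors(10)

# Split the range of v into 10 equal-width bins; the result is a factor
bins <- cut(v, breaks = 10)

# Use the bin number (1..10) of each value to pick its colour
colours <- pal[as.integer(bins)]
data.frame(v = v, bin = as.integer(bins), colour = colours)
```

The smallest value always falls in bin 1 and the largest in bin 10, so the palette spans the full range of the variable.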
The detail of what I did to get Z2 probably doesn't matter, but I tried to get something like the graph you linked to.
If I've still not got what you want, can you edit your Q with some example data and give us a better idea of what you want?
HTH
Take a look at example(points3d).
The r3d help page shows you how to draw axes.
x <- c(0, 10, 0, 0)
y <- c(0, 0, 100, 0)
z <- c(0, 0, 0, 1)
i <- c(1,2,1,3,1,4)
labels <- c("Origin", "X", "Y", "Z")
text3d(x,y,z,labels)
segments3d(x[i],y[i],z[i])
Now you add some points
dfr <- data.frame(x = 1:10, y = (1:10)^2, z = runif(10), col = rainbow(10))
with(dfr, points3d(x, y, z, col = col))

How to plot the intersection of a hyperplane and a plane in R

I have a set of (2-dimensional) data points that I run through a classifier that uses higher-order polynomial transformations. I want to visualize the results as a 2-dimensional scatterplot of the points with the classifier superimposed on top, preferably using ggplot2, as all other visualizations are made with it. Pretty much like this one that was used in the CaltechX online course on machine learning (the background color is optional).
I can display the points with colors and symbols and all, that's easy, but I can't figure out how to draw anything like the classifiers (the intersection of the classifying hyperplane with the plane representing my threshold). The only thing I found was stat_function, and that only takes a function with a single argument.
Edit:
The example that was asked for in the comments:
Sample data:
"","x","y","x*y","x^2","y^2","value"
"1",4.17338115745224,0.303530843229964,1.26674990184152,17.4171102853774,0.0921309727918932,-1
"2",4.85514814266935,3.452660451876,16.7631779801937,23.5724634872656,11.9208641959486,1
"3",3.51938610081561,3.41200957307592,12.0081790673332,12.3860785266141,11.6418093267617,1
"4",3.18545089452527,0.933340128976852,2.97310914874565,10.1470974014319,0.871123796358526,-1
"5",2.77556006214581,2.49701633118093,6.93061880335166,7.70373365857888,6.23509055818427,-1
"6",2.45974169578403,4.56341833807528,11.2248303614692,6.05032920997851,20.8247869282818,1
"7",2.73947941488586,3.35344674880616,9.18669833727041,7.50474746458339,11.2456050970786,-1
"8",2.01721803518012,3.55453519499861,7.17027250203368,4.06916860145595,12.6347204524838,-1
"9",3.52376445778646,1.47073399974033,5.1825201951431,12.4169159539591,2.1630584979922,-1
"10",3.77387718763202,0.509284208528697,1.92197605658768,14.2421490273294,0.259370405056702,-1
"11",4.15821685106494,1.03675272315741,4.31104264382058,17.2907673804804,1.0748562089743,-1
"12",2.57985028671101,3.88512040604837,10.0230289934507,6.65562750184287,15.0941605694935,1
"13",3.99800728890114,2.39457673509605,9.5735352407471,15.9840622821066,5.73399774026327,1
"14",2.10979392635636,4.58358959294856,9.67042948411309,4.45123041169019,21.0092935565863,1
"15",2.26988795562647,2.96687697409652,6.73447830932721,5.15239133109813,8.80235897942413,-1
"16",1.11802248633467,0.114183261757717,0.127659454208164,1.24997427994995,0.0130378172656312,-1
"17",0.310411276295781,2.09426849964075,0.650084557879535,0.0963551604515758,4.38596054858751,-1
"18",1.93197490065359,1.72926536411978,3.340897280049,3.73252701675543,2.99035869954433,-1
"19",3.45879891654477,1.13636834081262,3.93046958599847,11.9632899450912,1.29133300600123,-1
"20",0.310697768582031,0.730971727753058,0.227111284709427,0.0965331034018534,0.534319666774291,-1
"21",3.88408110360615,0.915658151498064,3.55649052359657,15.0860860193904,0.838429850404852,-1
"22",0.287852146429941,2.16121324687265,0.622109872005114,0.0828588582043242,4.67084269845782,-1
"23",2.80277011333965,1.22467750683427,3.4324895146344,7.85552030822994,1.4998349957458,-1
"24",0.579150241101161,0.57801398797892,0.334756940497835,0.335415001767533,0.334100170299295,-1
"25",2.37193428212777,1.58276639413089,3.7542178708388,5.62607223873297,2.50514945839009,-1
"26",0.372461311053485,2.51207412336953,0.935650421453748,0.138727428231681,6.31051640130279,-1
"27",3.56567220995203,1.03982002707198,3.70765737388213,12.7140183088242,1.08122568869998,-1
"28",0.634770628530532,2.26303249713965,1.43650656059435,0.402933750845047,5.12131608311011,-1
"29",2.43812176748179,1.91849716124125,4.67752968967431,5.94443775306852,3.68063135769073,-1
"30",1.08741064323112,3.01656032912433,3.28023980783858,1.18246190701233,9.0996362192467,-1
"31",0.98,2.74,2.6852,0.9604,7.5076,1
"32",3.16,1.78,5.6248,9.9856,3.1684,1
"33",4.26,4.28,18.2328,18.1476,18.3184,-1
The code to generate a classifier:
perceptron_train <- function(data, maxIter=10000) {
set.seed(839)
X <- as.matrix(data[1:5])
Y <- data["value"]
d <- dim(X)
X <- cbind(rep(1, d[1]), X)
W <- rep(0, d[2] + 1)
count <- 0
while (count < maxIter){
H <- sign(X %*% W)
indexs <- which(H != Y)
if (length(indexs) == 0){
break
} else {
i <- sample(indexs, 1)
W <- W + 0.1 * (X[i,] * Y[i,])
}
count <- count + 1
point <- as.data.frame(data[i,])
plot_it(data, point, W, paste("plot", sprintf("%05d", count), ".png", sep=""))
}
W
}
The code to generate the plot:
plot_it <- function(data, point, weights, name = "plot.png") {
line <- weights_to_line(weights)
point <- point
png(name)
p = ggplot() + geom_point(data = data, aes(x, y, color = value, size = 2)) + theme(legend.position = "none")
p = p + geom_abline(intercept = line[2], slope = line[1])
print(p)
dev.off()
}
This was solved using material from the question and answers of Issues plotting a fitted SVM model's decision boundary using ggplot2's stat_contour(). I skipped the call to geom_point for the grid entries and some of the aesthetic definitions like scale_fill_manual and scale_colour_manual. Removing the dots for the grid entries solved the problem with the vanishing contour line in my case.
library(e1071) # provides svm()
library(ggplot2)
train_and_plot_svm <- function(train, kernel = "sigmoid", type = "C", cost, gamma) {
  # Note: gamma is accepted as an argument but is not passed on to svm()
  fit <- svm(as.factor(value) ~ x + y, data = train, kernel = kernel, type = type, cost = cost)
  # Evaluate the fitted model on a regular grid covering the plotting area
  grid <- expand.grid(x = seq(from = -0.1, to = 15, length = 100), y = seq(from = -0.1, to = 15, length = 100))
  decisionValues <- as.vector(attributes(predict(fit, grid, decision.values = TRUE))$decision)
  p <- predict(fit, grid)
  grid$value <- p
  grid$z <- decisionValues
  # The decision boundary is the contour where the decision value is 0
  p <- ggplot() + stat_contour(data = grid, aes(x = x, y = y, z = z), breaks = c(0))
  p <- p + geom_point(data = train, aes(x, y, colour = as.factor(value)), alpha = 0.7)
  p <- p + xlim(0,15) + ylim(0,15) + theme(legend.position="none")
  p # return the plot visibly
}
Note that this function doesn't return the result of the svm training but the ggplot2 object.
This is what I got:
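The same grid-and-contour idea should also work for the perceptron in the question, without an SVM: evaluate the linear score on the transformed features over a grid and draw its zero level set. The sketch below is not the code I used. The weight vector W is made up for illustration (it is not an output of perceptron_train), and the feature order (1, x, y, x*y, x^2, y^2) is an assumption read off the sample data. The ggplot2 part is guarded so that the grid computation runs even if ggplot2 is not installed.

```r
# Hypothetical weights for the features (1, x, y, x*y, x^2, y^2).
# With these values the boundary is x^2 + y^2 = 4, a circle of radius 2.
W <- c(-4, 0, 0, 0, 1, 1)

# Regular grid over the range of the sample data
grid <- expand.grid(x = seq(0, 5, length.out = 101),
                    y = seq(0, 5, length.out = 101))

# Apply the same polynomial transformation to every grid point
features <- cbind(1, grid$x, grid$y, grid$x * grid$y, grid$x^2, grid$y^2)

# Classifier score; the decision boundary is where the score equals 0
grid$score <- as.vector(features %*% W)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(grid, aes(x = x, y = y, z = score)) +
    geom_contour(breaks = 0)  # draw only the score == 0 contour
}
```

The training points can then be added on top with geom_point(), as in the SVM function above.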
