Building on the answer given here, where the same question was asked for a scatter plot: is it possible to plot a line whose colour is based on the y value?
Example data
x = 1:11
y = abs(6 - x)
plot(1:22,c(y,y), col = ifelse(c(y,y) < 2.5, 2, 3), pch = 16)
Will give
However, trying
plot(1:22,c(y,y), col = ifelse(c(y,y) < 2.5, 2, 3), type = "l")
Gives
or doing lines on y<2.5 which gives
instead of the solution I am after, which is
Is there any easy method to do this? This is only a simple case, and I can't manually add each section to my actual data.
Thanks!
Try this
x = 1:11
y = abs(6 - x)
y = c(y,y)
plot(1:22,y, col = ifelse(y < 2.5, 2, 3), pch = 16)
for(i in 1:21){
if(y[i]>1.9&& y[i+1]>1.9){
linecolour="green"
} else {
linecolour="red"
}
lines(c((1:22)[i],(1:22)[i+1]),c(y[i],y[i+1]),col=linecolour)
}
Here is a vectorized solution. It is partly based on the answers to this question with the main difference that the function plotted there is a smooth function, unlike this one. This makes a difference when computing the points to plot and their colors.
fun <- function(x) abs(6 - x)
x <- 1:11
y <- fun(x)
X <- c(x, x + 11)
Y <- c(y, y)
n <- length(X)
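# one colour per segment (n - 1 values)
color <- ifelse((Y[-n] < 2.5) & (Y[-1] < 2.5), 2, 3)
# one colour per point (n values), so nothing is recycled
plot(X, Y, col = ifelse(Y < 2.5, 2, 3), pch = 16)
segments(x0 = X[-n], y0 = Y[-n],
x1 = X[-1], y1 = Y[-1],
col = color)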
To remove the points, start with
plot(X, Y, type = "n")
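If the colour should change exactly where the line crosses the threshold, rather than at the nearest data point, one option is to insert the crossing points by linear interpolation first. This is only a sketch and not part of either answer above: add_crossings is a hypothetical helper, and it assumes x is strictly increasing.

```r
# Hypothetical helper (an assumption, not from the answers above): insert the
# points where the line crosses `threshold`, found by linear interpolation.
add_crossings <- function(x, y, threshold) {
  n <- length(x)
  below.start <- y[-n] < threshold
  below.end   <- y[-1] < threshold
  cross <- which(below.start != below.end)   # segments that cross the threshold
  # fraction of the way along each crossing segment
  frac <- (threshold - y[cross]) / (y[cross + 1] - y[cross])
  xc <- x[cross] + frac * (x[cross + 1] - x[cross])
  ord <- order(c(x, xc))
  data.frame(x = c(x, xc)[ord],
             y = c(y, rep(threshold, length(xc)))[ord])
}

X <- 1:22
Y <- rep(abs(6 - 1:11), 2)
p <- add_crossings(X, Y, 2.5)
m <- nrow(p)

# colour each segment by the height of its midpoint
mid <- (p$y[-m] + p$y[-1]) / 2
seg.col <- ifelse(mid < 2.5, 2, 3)

plot(X, Y, type = "n")
segments(p$x[-m], p$y[-m], p$x[-1], p$y[-1], col = seg.col)
```

Because every segment now lies entirely on one side of the threshold, the midpoint test is enough to pick its colour.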
I tried to plot six normal distribution curves to show the effect of the mean and variance on the plot. My code is as follows:
par(mfrow=c(3,2),bty = "n") # 3 rows by 2 columns, turn off border
mu <- c(6, 8, 6, 8, 6, 8) #designate the 6 mean values
sigma <- c(3, 3, 2, 2, 1, 1) #designate the 6 sd values
label <- c("(a)","(b)","(c)","(d)","(e)","(f)") #designate the 6 labels of the 6 figures
for(i in 1:length(mu))
{
mu.r <- mu[i]
sigma.r <- sigma[i]
lab.r <- label[i]
x <- seq((mu.r - 4*sigma.r), (mu.r + 4*sigma.r), len = 200)
# x covers the mean plus/minus 4 standard deviations
plot(x, dnorm(x, mean = mu.r, sd = sigma.r), axes = FALSE,
type = "l", lwd = 2, xlab = lab.r, ylab = "",
main = paste0('mu=', mu.r, ', sigma=', sigma.r))
axis(1, at = (mu.r - 4*sigma.r) : (mu.r + 4*sigma.r))
abline(v = mu.r, col = "red", lwd = 2.5, lty = "longdash")
}
The generated figures are as follows:
[enter image description here][1]
[1]: https://i.stack.imgur.com/Z4czh.png
You didn't say exactly what the problem was. I assume it is that all your graphs look the same. That happens because you set your x axis depending on the standard deviation; you need to keep all the graphs on the same scale in order to compare them. I simply set an arbitrary interval of 7 around the mean:
for(i in 1:length(mu))
{
mu.r <- mu[i]
sigma.r <- sigma[i]
lab.r <- label[i]
# fixed-width window of 7 around the mean, sampled finely enough for a smooth curve
x <- seq(mu.r - 7, mu.r + 7, len = 200)
plot(x, dnorm(x, mean = mu.r, sd = sigma.r), axes = FALSE,
type = "l", lwd = 2, xlab = lab.r, ylab = "",
main = paste0('mu=', mu.r, ', sigma=', sigma.r))
axis(1, at = (mu.r - 7):(mu.r + 7))
abline(v = mu.r, col = "red", lwd = 2.5, lty = "longdash")
}
Output:
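A further option, not part of the answer above, is to give every panel the same xlim and ylim, so that both the shift in the mean and the change in spread are directly comparable. The sketch below assumes that the widest curve should set the x range and the tallest curve the y range; those choices are illustrative.

```r
mu    <- c(6, 8, 6, 8, 6, 8)
sigma <- c(3, 3, 2, 2, 1, 1)

# Assumed choice: one x range wide enough for the widest curve, and one
# y range tall enough for the narrowest (tallest) curve.
xlim <- range(mu - 4 * sigma, mu + 4 * sigma)
ylim <- c(0, max(dnorm(mu, mean = mu, sd = sigma)))
x <- seq(xlim[1], xlim[2], length.out = 200)

opar <- par(mfrow = c(3, 2), bty = "n")
for (i in seq_along(mu)) {
  plot(x, dnorm(x, mean = mu[i], sd = sigma[i]), type = "l", lwd = 2,
       xlim = xlim, ylim = ylim, xlab = "x", ylab = "",
       main = paste0("mu=", mu[i], ", sigma=", sigma[i]))
  abline(v = mu[i], col = "red", lwd = 2.5, lty = "longdash")
}
par(opar)  # restore the previous layout
```

With shared limits, a smaller sigma shows up as a visibly narrower and taller curve instead of being hidden by the rescaled axes.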
The picture below explains what I'd like to achieve in R. The blue line is added with a simple call to lines. This line shows the values of a parameter over time. The line is thick (lwd=3) so it can be seen at a low zoom, and also so that the individual data points fit inside of it. This means that it overlaps with itself when there is chatter over a small x interval. Can the line be made to have a border to better resolve areas with overlap?
Draw a thicker line first and then draw a thinner line on top of it.
set.seed(42)
x = 1:10
y = sample(1:10)
plot(x, y, type = "l", lwd = 5, col = "red")
lines(x, y, lwd = 3, col = "green")
An alternative.
This assumes that it is important to see the borders in drawing order and on top of the "fill" color. With d.b's excellent, fast, and very simple answer, the intersections give no indication of which segment was drawn first (left of the 3-pack, below).
Altered data, to show overlaps:
set.seed(42)
x <- sample(10)
y <- sample(10)
One solution is to draw each segment individually, controlling colors each time.
segments2 <- function(x, y, lwd = c(2, 1), ..., border = NA, fill = NA) {
stopifnot(length(x) == length(y))
len <- length(x)
ign <- Map(function(x0, y0, x1 = x0, y1 = y0) {
if (!is.na(border)) lines(c(x0, x1), c(y0, y1), lwd = lwd[1], col = border, ...)
if (!is.na(fill)) lines(c(x0, x1), c(y0, y1), lwd = lwd[2], col = fill, ...)
}, x[-len], y[-len], x[-1], y[-1])
invisible()
}
This produces the middle plot below, but notice that each vertex has border-intrusion. A third option, much more complex, can mitigate that with a lot of trickery.
segments3 <- function(x, y, lwd = c(2, 1), ..., border = NA, fill = NA, lend = 0) {
stopifnot(length(x) == length(y))
len <- length(x)
dx <- x[-len] + diff(x) / 2
mx <- rbind(
c(NA, x[1], dx[1]),
cbind(dx[-(len-1)], x[-c(1,len)], dx[-c(1)]))
mx <- rbind(
mx[-(len-1),],
c(x[len], dx[len-1], NA),
mx[len-1,])
dy <- y[-len] + diff(y) / 2
my <- rbind(
c(NA, y[1], dy[1]),
cbind(dy[-(len-1)], y[-c(1,len)], dy[-c(1)]))
my <- rbind(
my[-(len-1),],
c(y[len], dy[len-1], NA),
my[len-1,])
for (rn in seq_len(nrow(mx))) {
lend0 <- if (rn %in% c(1L, len-1)) lend else 1
lines(mx[rn,], my[rn,], lwd = lwd[1], col = border, ..., lend = lend0)
lines(mx[rn,], my[rn,], lwd = lwd[2], col = fill, ..., lend = lend0)
}
}
It is likely possible to simplify this, but its performance is not heinous, and it does produce a slightly clearer plot.
par(mfrow=c(1, 3))
# simpler method
plot(x, y, type = "l", lwd = 5, col = "red", main = "Simpler")
lines(x, y, lwd = 3, col = "green")
# slower method, full-size
plot(x, y, type = "n", main = "Slower")
segments2(x, y, lwd = c(5, 3), border="red", fill="green")
# slowest method, fairly complex
plot(x, y, type = "n", main = "Slowest")
segments3(x, y, lwd = c(5, 3), border="red", fill="green")
Some notes:
lwd is a length-2 vector used to control the widths of the two lines; perhaps it would be more intuitive to have lwd be the main line's width and border.lwd be the border's thickness?
... arguments are passed through to lines, except lend, which is tightly controlled in order to keep mid-segment line endings controlled while still allowing the user to override the two ends
if either border or fill is NA, it just won't be drawn, which would then make one question why this function is being used at all
depending on your perspective, dimensions, etc., it is possible that a segment midpoint shows a small junction, which can be seen faintly in the left-most segment around (1.8, 3.7); this can go away just by resizing, and is, I believe, just an artifact of raster graphics in general
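The alternative interface floated in the first note could look roughly like this for the simple two-pass method. It is only a sketch: lines_bordered, border.lwd and the default colours are hypothetical names and values, not part of any answer above.

```r
# Hypothetical wrapper; `border.lwd` and the default colours are assumptions.
lines_bordered <- function(x, y, lwd = 3, border.lwd = 1,
                           border = "red", fill = "green", ...) {
  # the border shows on both sides of the line, so widen by 2 * border.lwd
  lines(x, y, lwd = lwd + 2 * border.lwd, col = border, ...)
  lines(x, y, lwd = lwd, col = fill, ...)
  invisible(NULL)
}

set.seed(42)
x <- sample(10)
y <- sample(10)
plot(x, y, type = "n")
lines_bordered(x, y, lwd = 3, border.lwd = 1)
```

With lwd = 3 and border.lwd = 1 this draws the same widths as the "Simpler" panel (5 underneath, 3 on top), but the caller thinks in terms of line width plus border thickness.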
I have a data frame DOTS with the following columns: DOT, X, Y. There are 10 dots.
I want to display all possible connections (a) between dots 1,2,3,4,5; (b) between dots 5,6,7; and (c) between dots 7,8,9,10.
# what I tried so far
plot(DOTS$X, DOTS$Y, main= "DOTS", xlab= "X", ylab= "Y",
col= "blue", pch = 19, cex = 1, lty = "solid", lwd = 2)
text(DOTS$X, DOTS$Y, labels=DOTS$Dot, cex= 0.7, pos = 3)
lines(DOTS$X,DOTS$Y)
# the last line displays connection from 1 to 2 to 3 etc only
Thank you in advance for your suggestions.
First I create a data set:
x <- runif(10, 0, 10)
y <- runif(10, 0, 10)
df <- data.frame(dot = LETTERS[1:10], x = x, y = y)
A flexible approach is to write a custom function that uses combn() to generate all possible pairs of dots and then connects each pair with segments(). The custom function below accepts any set of dot indices, plus graphical arguments such as col and lwd.
plot(df$x, df$y)
text(df$x, df$y, labels = df$dot, pos = 3)
line.fun <- function(index, ...){
comb <- combn(index, 2)
start <- comb[1, ] # starting points
end <- comb[2, ] # end points
segments(df$x[start], df$y[start], df$x[end], df$y[end], ...)
}
line.fun(1:5, col = 2)
line.fun(5:7, col = 3)
line.fun(7:10, col = 4)
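To see what combn() hands to segments(), it can help to look at its result for one small group. In this short illustration each column is one pair of dot indices; the variable name pairs is only for the example.

```r
pairs <- combn(5:7, 2)   # all pairs drawn from dots 5, 6 and 7
pairs
#      [,1] [,2] [,3]
# [1,]    5    5    6
# [2,]    6    7    7

# number of segments drawn for a group of 5 dots
ncol(combn(1:5, 2))      # choose(5, 2) = 10
```

Row 1 holds the start indices and row 2 the end indices, which is exactly how line.fun splits them.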
I want to plot a confusion matrix, but I don't want to just use a heatmap, because I think heatmaps give poor numerical resolution. Instead, I also want to plot the frequency in the middle of each square. For instance, I like the output of this:
library(mlearning);
data("Glass", package = "mlbench")
Glass$Type <- as.factor(paste("Glass", Glass$Type))
summary(glassLvq <- mlLvq(Type ~ ., data = Glass));
(glassConf <- confusion(predict(glassLvq, Glass, type = "class"), Glass$Type))
plot(glassConf) # Image by default
However, 1.) I don't understand what the "01, 02, etc." along each axis mean. How can we get rid of them?
2.) I would like 'Predicted' to be the label of the 'y' dimension, and 'Actual' to be the label of the 'x' dimension.
3.) I would like to replace absolute counts with frequency / probability.
Alternatively, is there another package that will do this?
In essence, I want this in R:
http://www.mathworks.com/help/releases/R2013b/nnet/gs/gettingstarted_nprtool_07.gif
OR:
http://c431376.r76.cf2.rackcdn.com/8805/fnhum-05-00189-HTML/image_m/fnhum-05-00189-g009.jpg
The mlearning package seems quite inflexible with plotting confusion matrices.
Starting with your glassConf object, you probably want to do something like this:
prior(glassConf) <- 100
# The above rescales the confusion matrix such that columns sum to 100.
opar <- par(mar=c(5.1, 6.1, 2, 2))
x <- x.orig <- unclass(glassConf)
x <- log(x + 0.5) * 2.33
x[x < 0] <- NA
x[x > 10] <- 10
diag(x) <- -diag(x)
image(1:ncol(x), 1:ncol(x),
-(x[, nrow(x):1]), xlab='Actual', ylab='',
col=colorRampPalette(c(hsv(h = 0, s = 0.9, v = 0.9, alpha = 1),
hsv(h = 0, s = 0, v = 0.9, alpha = 1),
hsv(h = 2/6, s = 0.9, v = 0.9, alpha = 1)))(41),
xaxt='n', yaxt='n', zlim=c(-10, 10))
axis(1, at=1:ncol(x), labels=colnames(x), cex.axis=0.8)
axis(2, at=ncol(x):1, labels=colnames(x), las=1, cex.axis=0.8)
title(ylab='Predicted', line=4.5)
abline(h = 0:ncol(x) + 0.5, col = 'gray')
abline(v = 0:ncol(x) + 0.5, col = 'gray')
text(1:6, rep(6:1, each=6),
labels = sub('^0$', '', round(c(x.orig), 0)))
box(lwd=2)
par(opar) # reset par
The above code uses bits and pieces of the confusionImage function called by plot.confusion.
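For point 3 of the question, a plain count matrix can also be turned into proportions with base R before the labels are drawn. The sketch below uses a made-up 3-class matrix (not the Glass data) and assumes rows are predicted classes and columns are actual classes.

```r
# Hypothetical counts: rows = predicted class, columns = actual class.
counts <- matrix(c(50,  3,  2,
                    4, 40,  6,
                    1,  7, 37),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(Predicted = c("A", "B", "C"),
                                 Actual    = c("A", "B", "C")))

# margin = 2 divides each column by its total, so each actual class sums to 1
props <- prop.table(counts, margin = 2)
round(props, 2)

# strings that could be passed as `labels` to text()
lab <- sprintf("%.2f", props)
```

Using margin = 1 instead would normalise by predicted class; which one is wanted depends on how the matrix is meant to be read.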
Here is a function for plotting confusion matrices that I developed from jbaums' excellent answer.
It is similar, but looks a bit nicer (IMO), and does not transpose the confusion matrix you feed it, which might be helpful.
### Function for plotting confusion matrices
confMatPlot = function(confMat, titleMy, shouldPlot = TRUE) {
#' Function for plotting confusion matrices
#'
#' @param confMat: confusion matrix with counts, i.e. integers.
#' Fractions won't work
#' @param titleMy: String containing plot title
#' @return Nothing: It only plots
## Prepare data
x.orig = confMat; rm(confMat) # Lazy conversion to function internal variable name
n = nrow(x.orig) # conf mat is square by definition, so nrow(x) == ncol(x)
opar <- par(mar = c(5.1, 8, 3, 2))
x <- x.orig
x <- log(x + 0.5) # x<1 -> x<0 , x>=1 -> x>0
x[x < 0] <- NA
diag(x) <- -diag(x) # change sign to give diagonal different color
## Plot confusion matrix
image(1:n, 1:n, # grid of coloured boxes
# matrix giving color values for the boxes
# t() and [,ncol(x):1] since image puts [1,1] in bottom left by default
-t(x)[, n:1],
# ylab added later to avoid overlap with tick labels
xlab = 'Actual', ylab = '',
col = colorRampPalette(c("darkorange3", "white", "steelblue"),
bias = 1.65)(100),
xaxt = 'n', yaxt = 'n'
)
# Plot counts
text(rep(1:n, each = n), rep(n:1, times = n),
labels = sub('^0$', '', round(c(x.orig), 0)))
# Axis ticks but no labels
axis(1, at = 1:n, labels = rep("", n), cex.axis = 0.8)
axis(2, at = n:1, labels = rep("", n), cex.axis = 0.8)
# Tilted axis labels
text(cex = 0.8, x = (1:n), y = -0.1, colnames(x), xpd = TRUE, srt = 30, adj = 1)
text(cex = 0.8, y = (n:1), x = +0.1, colnames(x), xpd = TRUE, srt = 30, adj = 1)
title(main = titleMy)
title(ylab = 'Predicted', line = 6)
# Grid and box
abline(h = 0:n + 0.5, col = 'gray')
abline(v = 0:n + 0.5, col = 'gray')
box(lwd = 1, col = 'gray')
par(opar)
}
Example of output:
I came across this interesting website, with an idea of a way to visualize a clustering algorithm called "Clustergram":
(source: schonlau.net)
I am not sure how useful this really is, but in order to play with it I would like to reproduce it with R, but am not sure how to go about doing it.
How would you create a line for each item so it would stay consistent throughout the different number of clusters?
Here is some example code/data to play with for a potential answer:
hc <- hclust(dist(USArrests), "ave")
plot(hc)
Update: I posted a solution with a lengthy example and discussion here (it is based on the code I gave below). Also, Hadley was very kind and offered a ggplot2 implementation of the code.
Here is a basic solution (for a better one, look at the "update" above):
set.seed(100)
Data <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
colnames(Data) <- c("x", "y")
# noise <- runif(100,0,.05)
line.width <- rep(.004, dim(Data)[1])
Y <- NULL
X <- NULL
k.range <- 2:10
plot(0, 0, col = "white", xlim = c(1,10), ylim = c(-.5,1.6),
xlab = "Number of clusters", ylab = "Clusters means",
main = "(Basic) Clustergram")
axis(side =1, at = k.range)
abline(v = k.range, col = "grey")
centers.points <- list()
for(k in k.range){
cl <- kmeans(Data, k)
clusters.vec <- cl$cluster
the.centers <- apply(cl$centers,1, mean)
noise <- unlist(tapply(line.width, clusters.vec,
cumsum))[order(seq_along(clusters.vec)[order(clusters.vec)])]
noise <- noise - mean(range(noise))
y <- the.centers[clusters.vec] + noise
Y <- cbind(Y, y)
x <- rep(k, length(y))
X <- cbind(X, x)
centers.points[[k]] <- data.frame(y = the.centers , x = rep(k , k))
# points(the.centers ~ rep(k , k), pch = 19, col = "red", cex = 1.5)
}
require(colorspace)
COL <- rainbow_hcl(100)
matlines(t(X), t(Y), pch = 19, col = COL, lty = 1, lwd = 1.5)
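# add points; skip the empty list slot left for k = 1, which is not in k.range
invisible(lapply(Filter(Negate(is.null), centers.points),
function(xx){ with(xx,points(y~x, pch = 19, col = "red", cex = 1.3)) }))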
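The line that computes noise inside the loop is the hardest part of the code to read: it takes a running sum of line.width within each cluster and then puts the values back into the original item order. As far as I can tell, ave() does the same thing in one call. The small check below uses made-up cluster labels to compare the two.

```r
# Made-up example: 6 items assigned to 2 clusters.
line.width   <- rep(0.004, 6)
clusters.vec <- c(2, 1, 2, 1, 1, 2)

# the expression used in the loop above
original <- unlist(tapply(line.width, clusters.vec,
                          cumsum))[order(seq_along(clusters.vec)[order(clusters.vec)])]

# per-cluster running sum, returned in the original item order
simpler <- ave(line.width, clusters.vec, FUN = cumsum)

all.equal(unname(original), simpler)
```

Each item therefore gets a small, distinct vertical offset within its cluster, which is what keeps the individual lines from collapsing onto one another at each cluster mean.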