R - colouring scatterplot points

Hi there, I'm wondering why my code below colours the legend but not the dots themselves:
# dataset <- data.frame(IDName, Value, Setpoints)
# dataset <- unique(dataset)
# Paste or type your script code here:
dat <- aggregate(Value ~ Setpoints + IDName, dataset, mean)
x <- dat$Value
y <- dat$Setpoints
z <- dataset$IDName
plot(x,y, main ="Turbidity Frequency Distribution",xlab="% Time < Turbidity level", ylab="Turbidity (NTU)")
lines(spline(x,y))
palette()
legend('topleft', legend = unique(z), col = 1:3, cex = 0.8, pch = 1)
#constant lines
abline(h=c(0.1,0.15,0.3), col=c("red","pink","purple"), lty=2, lwd=3)

Make sure that z is a factor. Then, use col = z when you create the plot. You will get colored points.
In your legend call, set legend (the character values to appear in the legend) to the levels of your factor z. In addition, set the colors based on unique(z), so that they match your points.
Here is the complete example. In the future, instead of putting data in a comment, please edit your question with the data. Also, you may want to consider ggplot2 for future plotting.
dat <- aggregate(Value ~ Setpoints + IDName, dataset, mean)
x <- dat$Value
y <- dat$Setpoints
# take the grouping from dat (not dataset) so z lines up with x and y
z <- factor(dat$IDName)
plot(x, y,
main ="Turbidity Frequency Distribution",
xlab="% Time < Turbidity level",
ylab="Turbidity (NTU)",
col = z)
lines(spline(x,y))
palette()
legend('topleft',
legend = levels(z),
col = unique(z),
cex = 0.8,
pch = 1)
#constant lines
abline(h=c(0.1,0.15,0.3), col=c("red","pink","purple"), lty=2, lwd=3)
Data
dataset <- structure(list(IDName = c("Filter01", "Filter01", "Filter01",
"Filter01", "Filter01", "Filter02", "Filter02", "Filter02", "Filter02",
"Filter02"), Setpoints = c(0.16, 0.2, 0.3, 2, 2.2, 0.16, 0.2,
0.3, 2, 2.2), Value = c(96.1, 96.2, 96.428, 99.603, 99.6, 98.8,
98.9, 99.049, 99.194, 99.2)), class = "data.frame", row.names = c(NA,
-10L))
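If you would rather pick the colours yourself than rely on the default palette, one option is a named colour vector indexed by group. This is only a sketch: the colour names and the helper variables grp, group_cols and point_cols are my own choices, not part of the question. The data are the same values as in the dput above.

```r
# Same values as the dput above, written out compactly
dataset <- data.frame(
  IDName    = rep(c("Filter01", "Filter02"), each = 5),
  Setpoints = rep(c(0.16, 0.2, 0.3, 2, 2.2), times = 2),
  Value     = c(96.1, 96.2, 96.428, 99.603, 99.6,
                98.8, 98.9, 99.049, 99.194, 99.2)
)
dat <- aggregate(Value ~ Setpoints + IDName, dataset, mean)

grp <- factor(dat$IDName)
# one colour per factor level; the colour names are arbitrary choices
group_cols <- setNames(c("steelblue", "darkorange"), levels(grp))
# look up each point's colour by its group name
point_cols <- unname(group_cols[as.character(grp)])

plot(dat$Value, dat$Setpoints, col = point_cols, pch = 19,
     xlab = "% Time < Turbidity level", ylab = "Turbidity (NTU)")
legend("topleft", legend = names(group_cols), col = group_cols,
       pch = 19, cex = 0.8)
```

Because the points and the legend both read from group_cols, they cannot drift apart if the order of the rows changes.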

Related

Adding observations as proportions on a horizontal barplot in R using text() function

I cannot figure out how to get the percentage of responses at the end of the bars. I know I'm missing something within the text() function, just not sure what exactly I'm missing. Thank you!
#Training/Specialty Barplot
trainbarplot <- barplot(table(PSR$training), horiz = TRUE,
main="Respondent Distribution of Training", cex.main = 1.1, font.main = 2,
cex.lab = 0.8, cex.names = 0.4, font.axis = 4, las = 2,
xlab="Response Frequency", xlim=c(0, 40), cex.axis = 0.8,
border="black",
col=rgb (0.1, 0.1, 0.4, 0.5, 0.6),
density=c(50,40,30) , angle=c(9,11,36)
)
text(trainbarplot, table(PSR$training) - 3,
labels=paste(round(proportions(table(PSR$training))*100, 0), "%"))
Generate data
I generated some sample data to replicate your problem. Please note that you should always try to provide an example dataset :)
set.seed(123)
df1 <- data.frame(x = rnorm(10, mean=10, sd=2), y = LETTERS[1:10])
Plot the data
Here's a plot that follows the same structure as your code:
bp <- barplot(df1$x, names.arg = df1$y, horiz = TRUE)
text(x = df1$x + 0.5, y = bp, labels = paste0(round(df1$x), "%"), xpd = TRUE)
Using ggplot2
You can also plot your data using ggplot2. For instance, you could first create a new column in your dataset with information on the labels...
df1$perc <- paste0(round(df1$x),"%")
Next, you can plot your data using ggplot and adding different relevant layers.
library(ggplot2)
ggplot(df1, aes(x = x, y = y)) +
geom_col() +
geom_text(aes(label = perc)) +
theme_minimal()
Good luck!
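To tie this back to the table() call in the question: as far as I can tell, the piece that was missing is that with horiz = TRUE the bar midpoints returned by barplot() are y positions, and the counts are the x positions. Here is a minimal sketch of that. The training vector is a made-up stand-in for PSR$training, and I use prop.table(), which does the same job as proportions().

```r
set.seed(42)
# hypothetical stand-in for PSR$training
training <- sample(c("Nursing", "Pharmacy", "Surgery"), size = 60, replace = TRUE)

counts <- table(training)
pct <- round(100 * prop.table(counts))   # percentage of responses per category

bp <- barplot(counts, horiz = TRUE, las = 1,
              xlim = c(0, max(counts) * 1.2),   # leave room for the labels
              xlab = "Response Frequency")
# horizontal bars: the count is the x position, the bar midpoint is the y position
text(x = as.vector(counts), y = bp,
     labels = paste0(as.vector(pct), "%"), pos = 4, xpd = TRUE)
```

pos = 4 puts each label just to the right of the end of its bar.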

Draw discrete CDF in R

I want to draw a CDF in R, but I am having some problems. I want it to look like this:
But I get lines between the open and closed points by using the command plot(x,y,type="s")
So how do I get rid of those lines?
This isn't a general purpose example, but it will show you how to build the plot you desire in a couple of steps.
First, let's create some data (notice the zeros at the beginning):
x <- 0:6
fx <- c(0, 0.19, 0.21, 0.4, 0.12, 0.05, 0.03)
Fx <- cumsum(fx)
n <- length(x)
Then let's make an empty plot
plot(x = NA, y = NA, pch = NA,
xlim = c(0, max(x)),
ylim = c(0, 1),
xlab = "X label",
ylab = "Y label",
main = "Title")
Add closed circles, open circles, and finally the horizontal lines
points(x = x[-n], y = Fx[-1], pch=19)
points(x = x[-1], y = Fx[-1], pch=1)
for(i in 1:(n-1)) points(x=x[i+0:1], y=Fx[c(i,i)+1], type="l")
Voilà!
If you insist on not seeing the line "inside" of the white points, do this instead:
points(x = x[-n], y = Fx[-1], pch=19)
for(i in 1:(n-1)) points(x=x[i+0:1], y=Fx[c(i,i)+1], type="l")
points(x = x[-1], y = Fx[-1], pch=19, col="white")
points(x = x[-1], y = Fx[-1], pch=1)
You can construct this plot using:
plot(x, y, pch = 16, ylim = c(-0.03, 1.03), ylab = "CDF") # solid points/graphic settings
points(x[-1], y[-length(y)]) # open points
abline(h = c(0, 1), col = "grey", lty = 2) # horizontal lines
Note: plot(x, y, type = "s") does not produce a plot like the one in your question, but rather a step function with both treads (horizontal lines) and risers (vertical lines):
Data
library(dplyr)
set.seed(1)
df <- data.frame(x = rpois(30, 3)) %>%
dplyr::arrange(x) %>%
dplyr::add_count(x) %>%
dplyr::distinct(x, .keep_all = T) %>%
mutate(y = cumsum(n) / sum(n))
x <- df$x
y <- df$y
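Another route, in case it helps: base R has stepfun(), and its plot method takes verticals = FALSE, which drops the risers. The sketch below assumes a support of 0:5 with the same probabilities as in the first answer. The names k, p, heights and cdf are mine.

```r
# support and probabilities chosen to mirror the first answer's numbers
k <- 0:5
p <- c(0.19, 0.21, 0.40, 0.12, 0.05, 0.03)

# stepfun() needs one more height than knots: the value left of the first knot
heights <- c(0, cumsum(p))
cdf <- stepfun(k, heights)   # right-continuous by default

# closed points at the knots, horizontal segments only
plot(cdf, verticals = FALSE, pch = 19,
     xlab = "x", ylab = "F(x)", main = "Discrete CDF")
# open circles at the left limits, i.e. the value just before each jump
points(k, heights[seq_along(k)], pch = 21, bg = "white")
```

If you start from raw observations rather than probabilities, plot(ecdf(obs)) gives the same style of plot, since plot.ecdf also defaults to verticals = FALSE.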

Adding a third dimension on a 2D heatmap

I am wondering if you could help me out with the following question:
I have a correlation matrix and a third variable (continuous) for every possible pair in the correlation matrix.
Here is a toy example:
set.seed(1234)
x <- rnorm(1000,2,1)
y <- 0.1*x+rnorm(1000,1,1)
z <- y+rnorm(1000)
third.dimension <- c("(x,y)" = 0.3, "(x,z)" = 0.5, "(y,z)"= 1)
my.df <- data.frame(x,y,z)
First, I want to create a heatmap of that correlation matrix which I do with
heatmap(cor(my.df))
Next, I would like to have a coloured dot within each "cell" of the heatmap, depending on the value of the third dimension for the respective pair. Example - if the value is between 0 and 0.49, I have a black dot, if it is between 0.5 and 1, a grey dot etc.
Hence, where I have the correlation between z and y, say, I would have a grey dot painted in the corresponding "cell" of the correlation matrix.
Thanks in advance for the help!
This should work for you:
set.seed(1234)
x <- rnorm(1000,2,1)
y <- 0.1*x+rnorm(1000,1,1)
z <- y+rnorm(1000)
third.dimension <- c("(x,y)" = 0.3, "(x,z)" = 0.5, "(y,z)"= 1)
my.df <- data.frame(x,y,z)
# required function
val2col <- function(z, zlim, col = heat.colors(12), breaks){
if(!missing(breaks)){
if(length(breaks) != (length(col)+1)){stop("must have one more break than color")}
}
if(missing(breaks) & !missing(zlim)){
breaks <- seq(zlim[1], zlim[2], length.out=(length(col)+1))
}
if(missing(breaks) & missing(zlim)){
zlim <- range(z, na.rm=TRUE)
breaks <- seq(zlim[1], zlim[2], length.out=(length(col)+1))
}
CUT <- cut(z, breaks=breaks, include.lowest = TRUE)
colorlevels <- col[match(CUT, levels(CUT))] # assign colors to heights for each point
return(colorlevels)
}
# plot
COR <- list(
x = seq(ncol(my.df)),
y = seq(ncol(my.df)),
z = cor(my.df)
)
image(COR, xaxt="n", yaxt="n")
axis(1, at=COR$x, labels = names(my.df))
axis(2, at=COR$y, labels = names(my.df))
box()
COR$col <- val2col(c(COR$z), col = grey.colors(21), zlim=c(0,1))
points(expand.grid(x=COR$x, y=COR$y), col=COR$col, pch=16, cex=3)
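Note that the code above colours the dots by the correlation itself. If the dots should instead follow third.dimension with the two bins described in the question, something along these lines might do. It is a sketch under my own assumptions: I put the three pairwise values into a symmetric matrix by hand, and the names third, bins, dot_col and idx are not from the original.

```r
set.seed(1234)
x <- rnorm(1000, 2, 1)
y <- 0.1 * x + rnorm(1000, 1, 1)
z <- y + rnorm(1000)
my.df <- data.frame(x, y, z)
vars <- names(my.df)

# put the pairwise values into a symmetric matrix; the diagonal stays NA
third <- matrix(NA_real_, 3, 3, dimnames = list(vars, vars))
third["x", "y"] <- third["y", "x"] <- 0.3
third["x", "z"] <- third["z", "x"] <- 0.5
third["y", "z"] <- third["z", "y"] <- 1

# [0, 0.5) -> black, [0.5, 1] -> grey
bins <- cut(c(third), breaks = c(0, 0.5, 1), right = FALSE, include.lowest = TRUE)
dot_col <- c("black", "grey")[as.integer(bins)]

idx <- seq_along(vars)
image(idx, idx, cor(my.df), xaxt = "n", yaxt = "n", xlab = "", ylab = "")
axis(1, at = idx, labels = vars)
axis(2, at = idx, labels = vars)
box()
# expand.grid() varies x fastest, matching the column-major order of c(third);
# cells with an NA colour (the diagonal) get no dot
points(expand.grid(x = idx, y = idx), col = dot_col, pch = 16, cex = 3)
```

More bins only need more entries in breaks and in the colour vector.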

Can anybody help figure out why my labels for the y-axis and x-axis are not appearing?

As part of my code to lay out 8 plots in a 4 rows by 2 columns panel, it was suggested that I use the code below as an example, but when I do so I cannot get the text on the y and x axes. Please see the code below.
#This is the code to have the plots as 4 x 2 in the page
m <- rbind(c(1,2,3,4), c(5,6,7,8) )
layout(m)
par(oma = c(6, 6, 1, 1)) # manipulate the room for the overall x and y axis titles
par(mar = c(.1, .1, .8, .8)) # manipulate the plots be closer together or further apart
###this is the code to insert for instance one of my linear regression plots as part of this panel (imagine I have other 7 identical replicates of this)
####ASF 356 standard curve
asf_356<-read.table("asf356.csv", head=TRUE, sep=',')
asf_356
# Linear Regression
fit <- lm( ct ~ count, data=asf_356)
summary(fit) # show results
predict.lm(fit, interval = c("confidence"), level = 0.95, add=TRUE)
newx <- seq(min(asf_356$count), max(asf_356$count), 0.1)
a <- predict(fit, newdata=data.frame(count=newx), interval="confidence")
plot(x = asf_356$count, y = asf_356$ct, xlab="Log(10) for total ASF 356 genome copies", ylab="Cycle threshold value", xlim=c(0,10), ylim=c(0,35), lty=1, family="serif")
curve(expr=fit$coefficients[1]+fit$coefficients[2]*x,xlim=c(min(asf_356$count), max(asf_356$count)),col="black", add=TRUE, lwd=2)
lines(newx,a[,2], lty=3)
lines(newx,a[,3], lty=3)
legend(x = 0.5, y = 20, legend = c("Logistic regression model", "95% individual confidence interval"), lty = c("solid", "dotdash"), col = c("black", "black"), bty = "n")
mod.fit=summary(fit)
r2 = mod.fit$r.squared
mylabel = bquote(italic(R)^2 == .(format(r2, digits = 3)))
text(x = 8.2, y = 25, labels = mylabel)
legend(x = 7, y = 35, legend =c("y= -3.774*x + 41.21"), bty="n")
I was able to find a similar post here, and the argument I was missing was:
title(xlab="xx", ylab="xx", outer=TRUE, line=3, family="serif")
Thanks
I finally have it working. Thanks to everyone who helped me earlier as well.
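For anyone landing here later, this is roughly how the pieces fit together. It is a stripped-down sketch with random placeholder data instead of the standard curves, so the numbers mean nothing. The outer margin (oma) reserves the space, and title(..., outer = TRUE) writes the shared labels into it.

```r
set.seed(1)
n_panels <- layout(rbind(1:4, 5:8))   # 2 rows x 4 columns, as in the question
op <- par(oma = c(5, 5, 1, 1),        # outer margins hold the shared axis titles
          mar = c(1, 1, 1, 1))        # small margins inside each panel

for (i in 1:8) {
  # placeholder data standing in for each standard curve
  plot(runif(10, 0, 10), runif(10, 0, 35), xlab = "", ylab = "",
       xlim = c(0, 10), ylim = c(0, 35))
}

oma_used <- par("oma")
# one x title and one y title for the whole page, drawn in the outer margin
title(xlab = "Log(10) genome copies", ylab = "Cycle threshold value",
      outer = TRUE, line = 3, family = "serif")

par(op)      # restore the previous margins
layout(1)    # back to a single panel
```

With mar this small, the per-panel xlab and ylab have no room to be drawn, which is why they never showed up in the first place.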

Plot A Confusion Matrix with Color and Frequency in R

I want to plot a confusion matrix, but, I don't want to just use a heatmap, because I think they give poor numerical resolution. Instead, I want to also plot the frequency in the middle of the square. For instance, I like the output of this:
library(mlearning);
data("Glass", package = "mlbench")
Glass$Type <- as.factor(paste("Glass", Glass$Type))
summary(glassLvq <- mlLvq(Type ~ ., data = Glass));
(glassConf <- confusion(predict(glassLvq, Glass, type = "class"), Glass$Type))
plot(glassConf) # Image by default
However, 1.) I don't understand what the "01, 02, etc." labels along each axis mean. How can we get rid of them?
2.) I would like 'Predicted' to be as the label of the 'y' dimension, and 'Actual' to be as the label for the 'x' dimension
3.) I would like to replace absolute counts by frequency / probability.
Alternatively, is there another package that will do this?
In essence, I want this in R:
http://www.mathworks.com/help/releases/R2013b/nnet/gs/gettingstarted_nprtool_07.gif
OR:
http://c431376.r76.cf2.rackcdn.com/8805/fnhum-05-00189-HTML/image_m/fnhum-05-00189-g009.jpg
The mlearning package seems quite inflexible with plotting confusion matrices.
Starting with your glassConf object, you probably want to do something like this:
prior(glassConf) <- 100
# The above rescales the confusion matrix such that columns sum to 100.
opar <- par(mar=c(5.1, 6.1, 2, 2))
x <- x.orig <- unclass(glassConf)
x <- log(x + 0.5) * 2.33
x[x < 0] <- NA
x[x > 10] <- 10
diag(x) <- -diag(x)
image(1:ncol(x), 1:ncol(x),
-(x[, nrow(x):1]), xlab='Actual', ylab='',
col=colorRampPalette(c(hsv(h = 0, s = 0.9, v = 0.9, alpha = 1),
hsv(h = 0, s = 0, v = 0.9, alpha = 1),
hsv(h = 2/6, s = 0.9, v = 0.9, alpha = 1)))(41),
xaxt='n', yaxt='n', zlim=c(-10, 10))
axis(1, at=1:ncol(x), labels=colnames(x), cex.axis=0.8)
axis(2, at=ncol(x):1, labels=colnames(x), las=1, cex.axis=0.8)
title(ylab='Predicted', line=4.5)
abline(h = 0:ncol(x) + 0.5, col = 'gray')
abline(v = 0:ncol(x) + 0.5, col = 'gray')
text(1:6, rep(6:1, each=6),
labels = sub('^0$', '', round(c(x.orig), 0)))
box(lwd=2)
par(opar) # reset par
The above code uses bits and pieces of the confusionImage function called by plot.confusion.
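On point 3 of the question (proportions instead of counts), here is a package-free sketch that starts from a plain count matrix. The matrix conf below is invented for illustration, with rows as predicted and columns as actual classes. prop.table(..., margin = 2) rescales each actual-class column to sum to 1.

```r
labs <- c("A", "B", "C")
# hypothetical counts: rows = predicted class, columns = actual class
conf <- matrix(c(50,  3,  2,
                  5, 40,  5,
                  1,  4, 45),
               nrow = 3, byrow = TRUE,
               dimnames = list(Predicted = labs, Actual = labs))

prop <- prop.table(conf, margin = 2)   # each Actual column now sums to 1
n <- nrow(prop)

opar <- par(mar = c(5.1, 6.1, 2, 2))
# t() and [, n:1] so that Actual runs along x and the first Predicted row is on top
image(1:n, 1:n, t(prop)[, n:1], zlim = c(0, 1),
      col = colorRampPalette(c("white", "steelblue"))(50),
      xlab = "Actual", ylab = "", xaxt = "n", yaxt = "n")
axis(1, at = 1:n, labels = colnames(prop))
axis(2, at = n:1, labels = rownames(prop), las = 1)
title(ylab = "Predicted", line = 4.5)
# print each proportion in the middle of its cell
text(c(col(prop)), n + 1 - c(row(prop)), labels = sprintf("%.2f", c(prop)))
box()
par(opar)
```

Use margin = 1 instead if you want each predicted-class row to sum to 1.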
Here is a function for plotting confusion matrices that I developed from jbaums' excellent answer.
It is similar, but looks a bit nicer (IMO), and does not transpose the confusion matrix you feed it, which might be helpful.
### Function for plotting confusion matrices
confMatPlot = function(confMat, titleMy, shouldPlot = T) {
#' Function for plotting confusion matrices
#'
#' @param confMat: confusion matrix with counts, i.e. integers.
#' Fractions won't work
#' @param titleMy: String containing plot title
#' @return Nothing: It only plots
## Prepare data
x.orig = confMat; rm(confMat) # Lazy conversion to function internal variable name
n = nrow(x.orig) # conf mat is square by definition, so nrow(x) == ncol(x)
opar <- par(mar = c(5.1, 8, 3, 2))
x <- x.orig
x <- log(x + 0.5) # x<1 -> x<0 , x>=1 -> x>0
x[x < 0] <- NA
diag(x) <- -diag(x) # change sign to give diagonal different color
## Plot confusion matrix
image(1:n, 1:n, # grid of coloured boxes
# matrix giving color values for the boxes
# t() and [,ncol(x):1] since image puts [1,1] in bottom left by default
-t(x)[, n:1],
# ylab added later to avoid overlap with tick labels
xlab = 'Actual', ylab = '',
col = colorRampPalette(c("darkorange3", "white", "steelblue"),
bias = 1.65)(100),
xaxt = 'n', yaxt = 'n'
)
# Plot counts
text(rep(1:n, each = n), rep(n:1, times = n),
labels = sub('^0$', '', round(c(x.orig), 0)))
# Axis ticks but no labels
axis(1, at = 1:n, labels = rep("", n), cex.axis = 0.8)
axis(2, at = n:1, labels = rep("", n), cex.axis = 0.8)
# Tilted axis labels
text(cex = 0.8, x = (1:n), y = -0.1, colnames(x), xpd = T, srt = 30, adj = 1)
text(cex = 0.8, y = (n:1), x = +0.1, colnames(x), xpd = T, srt = 30, adj = 1)
title(main = titleMy)
title(ylab = 'Predicted', line = 6)
# Grid and box
abline(h = 0:n + 0.5, col = 'gray')
abline(v = 0:n + 0.5, col = 'gray')
box(lwd = 1, col = 'gray')
par(opar)
}
