How to find y value in line-and-dots plot? - r

I have this line-and-dots plot:
#generate fake data
xLab <- seq(0, 50, by=5);
yLab <- c(0, sort(runif(10, 0, 1)));
#this value is fixed
fixedVal <- 27.3
#new window
dev.new();
#generate the plot
plot(xLab, yLab, col=rgb(50/255, 205/255, 50/255, 1), type="o", lwd=3,
main="a line-and-dots plot", xlab="some values", ylab="a percentage",
pch=20, xlim=c(0, 50), ylim=c(0, 1), xaxt="n", cex.lab=1.5, cex.axis=1.5,
cex.main=1.5, cex.sub=1.5);
#set axis
axis(side = 1, at=c(seq(min(xLab), max(xLab), by=5)))
#plot line
abline(v=fixedVal, col="firebrick", lwd=3, lty=1);
Now I would like to find the y coordinate of the intersection point between the green and the red lines.
Can I achieve the goal without the need of a regression line? Is there a simple way of getting the coordinates of that unknown point?

You can use approxfun to do the interpolation:
> approxfun(xLab,yLab)(fixedVal)
[1] 0.3924427
Alternatively, just use approx:
> approx(xLab,yLab,fixedVal)
$x
[1] 27.3
$y
[1] 0.3924427
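As a minimal sketch of reuse: approxfun returns a function, so it can be stored and evaluated at several x values at once. The seed and the extra query points below are made up for illustration; only approxfun and approx come from the answer.

```r
# Fake data as in the question, with a seed (an assumption) for repeatability
set.seed(42)
xLab <- seq(0, 50, by = 5)
yLab <- c(0, sort(runif(10, 0, 1)))

# approxfun() returns a function that linearly interpolates between the points
f <- approxfun(xLab, yLab)

# Evaluate at several x values in one call
f(c(12.5, 27.3, 41))

# With the default rule = 1, values outside range(xLab) give NA
f(60)
```

At the original x positions the interpolating function reproduces the data exactly, so f(5) equals yLab[2].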

Quick fix, as @JohnColeman suggested:
# find the two points flanking your value
idx <- findInterval(fixedVal,xLab)
# calculate the deltas
y_delta <- diff(yLab[idx:(idx+1)])
x_delta <- diff(xLab[idx:(idx+1)])
# interpolate...
ycut = (y_delta/x_delta) * (fixedVal-xLab[idx]) + yLab[idx]
ycut
[1] 0.4046399
So we try it on the plot:
plot(xLab, yLab, col=rgb(50/255, 205/255, 50/255, 1), type="o", lwd=3,
main="a line-and-dots plot", xlab="some values", ylab="a percentage",
pch=20, xlim=c(0, 50), ylim=c(0, 1), xaxt="n", cex.lab=1.5, cex.axis=1.5,
cex.main=1.5, cex.sub=1.5);
#set axis
axis(side = 1, at=c(seq(min(xLab), max(xLab), by=5)))
#plot line
abline(v=fixedVal, col="firebrick", lwd=3, lty=1);
abline(h=ycut, col="lightblue", lwd=3, lty=1);
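One way to check the manual interpolation is to compare it with approx and mark the intersection with points. This is only a sketch: the seed and the simplified plot call are assumptions, and rightmost.closed = TRUE is added so that a fixedVal equal to max(xLab) still falls in the last interval instead of indexing past the end of the vectors.

```r
set.seed(42)                      # assumed seed, only for repeatability
xLab <- seq(0, 50, by = 5)
yLab <- c(0, sort(runif(10, 0, 1)))
fixedVal <- 27.3

# Index of the left neighbour of fixedVal; the last interval is treated as closed
idx <- findInterval(fixedVal, xLab, rightmost.closed = TRUE)

# Linear interpolation between the two flanking points
slope <- diff(yLab[idx:(idx + 1)]) / diff(xLab[idx:(idx + 1)])
ycut <- slope * (fixedVal - xLab[idx]) + yLab[idx]

# The manual result should agree with approx()
same <- isTRUE(all.equal(ycut, approx(xLab, yLab, fixedVal)$y))

# Mark the intersection on a simplified version of the plot
plot(xLab, yLab, type = "o", pch = 20, ylim = c(0, 1))
abline(v = fixedVal, col = "firebrick")
points(fixedVal, ycut, pch = 19, col = "blue")
```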

Related

In R plotting line with different color above threshold limits

I have the following data and code in R:
x <- runif(1000, -9.99, 9.99)
mx <- mean(x)
stdevs_3 <- mx + c(-3, +3) * sd(x/5) # Standard deviation, 3-sigma
And I plotted it as a line (along with the 3-sigma and mean lines) in R:
plot(x, t="l", main="Plot of Data", ylab="X", xlab="")
abline(h=mx, col="red", lwd=2)
abline(h=stdevs_3, lwd=2, col="blue")
What I want to do:
Anywhere on the plot, whenever the line crosses the 3-sigma thresholds (blue lines), above or below, it should be drawn in a color other than black.
I tried this, but it did not work:
plot(x, type="l", col= ifelse(x < stdevs_3[[1]],"red", "black"))
abline(h=mx, col="red", lwd=2)
abline(h=stdevs_3, lwd=2, col="blue")
Is there any other way?
This is what is requested, but it appears meaningless to me because of the arbitrary division of x by 5:
png( )
plot(NA, xlim=c(0,length(x)), ylim=range(x), main="Plot of Data", ylab="X", xlab="")
stdevs_3 <- mx + c(-3, +3) * sd(x/5)
abline(h=mx, col="red", lwd=2)
abline(h=stdevs_3, lwd=2, col="blue")
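# segment i joins point i to point i+1; it is red when its end point lies outside the 3-sigma band
n <- length(x)
segments( 1:(n-1), head(x,-1), 2:n, tail(x,-1) , col=c("black", "red")[
1+(tail(x,-1) < stdevs_3[1] | tail(x,-1) > stdevs_3[2])] )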
dev.off()
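The attempt in the question fails because, as ?plot.default notes, lines are all drawn in the first colour given, so a vector passed to col has no per-segment effect with type="l". A lighter alternative to segments, shown here only as a sketch, is to keep the black line and overlay the out-of-band values with points. The seed and the name outside are assumptions for illustration.

```r
set.seed(1)                                   # assumed seed for repeatability
x <- runif(1000, -9.99, 9.99)
mx <- mean(x)
stdevs_3 <- mx + c(-3, 3) * sd(x / 5)

# TRUE for every observation below the lower or above the upper threshold
outside <- x < stdevs_3[1] | x > stdevs_3[2]

plot(x, type = "l", main = "Plot of Data", ylab = "X", xlab = "")
abline(h = mx, col = "red", lwd = 2)
abline(h = stdevs_3, col = "blue", lwd = 2)

# Overlay only the values outside the band
points(which(outside), x[outside], col = "red", pch = 20, cex = 0.6)
```

This highlights the offending observations rather than recolouring the line itself, which may or may not be what is wanted.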

How to get rid of a double label in x axis (plot function in R)

I'm trying to get rid of the double labels in the x axis when using the plot function in R with my data.
Sample data can be created like this:
Data=read.table(textConnection("
Geslacht y x1 x2 x3 x4 x5 x6 x7 x8 x9
M 0.00 5 6 16 9 5 13 14 5 7
M 0.25 6 7 17 9 5 13 17 5 7
M 0.67 12 12 28 14 7 21 24 13 13
"),head=TRUE)
I've used the following code so far:
Data <- read.table(Data, header=TRUE, sep="\t", na.string="NA", strip.white=TRUE)
par(new=T)
plot(Data$y, Data$x1, xlab="Chronological Age", ylab="%DNA methylation", cex = 0.9, col="black", pch=20, xlim=c(0, 100), ylim=c(0, 100), font.lab=2)
par(new=T)
plot(Data$y, Data$x2, pch=20, axes=F, ylab="", cex= 0.9, col="blue", xlim=c(0, 100), ylim=c(0, 100))
par(new=T)
plot(Data$y, Data$x3, pch=20, axes=F, ylab="", cex= 0.9, col="red", xlim=c(0, 100), ylim=c(0, 100))
par(new=T)
plot(Data$y, Data$x4, pch=20, axes=F, ylab="", cex= 0.9, col="green", xlim=c(0, 100), ylim=c(0, 100))
par(new=T)
plot(Data$y, Data$x5, pch=20, axes=F, ylab="", cex= 0.9, col="coral4", xlim=c(0, 100), ylim=c(0, 100))
par(new=T)
plot(Data$y, Data$x6, pch=20, axes=F, ylab="", cex= 0.9, col="darkgrey", xlim=c(0, 100), ylim=c(0, 100))
par(new=T)
plot(Data$y, Data$x7, pch=20, axes=F, ylab="", cex= 0.9, col="darkorchid1", xlim=c(0, 100), ylim=c(0, 100))
par(new=T)
plot(Data$y, Data$x8, pch=20, axes=F, ylab="", cex= 0.9, col="darkgoldenrod1", xlim=c(0, 100), ylim=c(0, 100))
par(new=T)
plot(Data$y, Data$x9, pch=20, axes=F, ylab="", cex= 0.9, col="forestgreen", xlim=c(0, 100), ylim=c(0, 100))
legend("bottomright", title="CpG regions", cex=.8, c("CpG1 ELOVL2","CpG2 ELOVL2","CpG3 ELOVL2", "CpG4 ELOVL2", "CpG5 ELOVL2", "CpG6 ELOVL2", "CpG7 ELOVL2", "CpG8 ELOVL2", "CpG9 ELOVL2"),
pch=c(20,20,20,20,20,20,20,20,20),col=c("black","blue", "red", "green", "coral4", "darkgrey","darkorchid1","darkgoldenrod1","forestgreen"),horiz=FALSE)
I can't post the picture due to being a newbie...
Does anyone know why I get the double label in the x axis? The second label is "Data$y" even though I specify the xlabel in code line 3.
What are you trying to do here? Looks like you are trying to plot several sets of points in different colours on the same axes in one graph.
There are better ways...
Use plot to set the axis limits with xlim and ylim. Plot nothing, but name your axes:
plot(NA,xlab="my x",ylab="my y",xlim=c(1,23),ylim=c(99,1000))
Then use points(x,y,col=....) to add points to the current plot. Don't use par(new=T)!
Or
matplot
Or
rearrange your data into three columns, x, y, and col. Then plot(d$x, d$y, col=d$col) will do most of the job in one line.
Or
with the same rearrangement, you can use ggplot2 and make a pretty plot.
Here's how to do that rearrangement using reshape2 package:
library(reshape2)
md = melt(Data,id="y",measure=names(Data)[-(1:2)],value.name="x")
md is now a "long" data frame. Now convert the variable column (which holds x1 to x9) to a numeric index:
md$icol = as.numeric(substr(md$variable,2,20))
define your colour palette:
col=c("black","blue", "red", "green", "coral4", "darkgrey","darkorchid1","darkgoldenrod1","forestgreen")
set the point colour by explicit lookup:
md$colour = col[md$icol]
and plot. Tweak axis labels, limits, title, etc to taste, simmer for 20 minutes and stir in a legend:
plot(md$y,md$x,col=md$colour,pch=19,xlim=c(0,100),ylim=c(0,100))
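On the cause of the double label: every plot() call draws its own axis labels, and the later calls in the question set ylab="" but not xlab, so the default label "Data$y" is drawn on top of "Chronological Age". The matplot option mentioned above avoids the repeated calls altogether. A minimal sketch, assuming the sample data from the question and shortened legend text (the name m is hypothetical):

```r
Data <- read.table(text = "
Geslacht y x1 x2 x3 x4 x5 x6 x7 x8 x9
M 0.00 5 6 16 9 5 13 14 5 7
M 0.25 6 7 17 9 5 13 17 5 7
M 0.67 12 12 28 14 7 21 24 13 13
", header = TRUE)

cols <- c("black", "blue", "red", "green", "coral4", "darkgrey",
          "darkorchid1", "darkgoldenrod1", "forestgreen")

# One column per CpG region; matplot draws each column in its own colour
m <- as.matrix(Data[, paste0("x", 1:9)])
matplot(Data$y, m, type = "p", pch = 20, col = cols, cex = 0.9,
        xlab = "Chronological Age", ylab = "%DNA methylation",
        xlim = c(0, 100), ylim = c(0, 100), font.lab = 2)

legend("bottomright", title = "CpG regions", cex = 0.8,
       legend = paste0("CpG", 1:9, " ELOVL2"), pch = 20, col = cols)
```

Because there is a single high-level plotting call, the axes and labels are drawn once.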

Combining a box plot with a dot plot using different Y scales

I am trying to generate a figure that consists of a box plot with a set of points overlaid on the boxplot. The key issue is that the y scale of the box plot is different from that of the points. (Yes, this is very poor visualization - but I'm not the lead author of the paper). I have been able to generate a plot with different y scales, but am facing an issue with the x axis.
Using the following code
boxdata <- data.frame(fc=runif(100, min=-4, max=4),
sym=sample(c('A', 'B', 'C', 'D', 'E'), 100, replace=TRUE))
par(mar=c(5, 4, 1, 6) + 0.1)
junk <- boxplot(fc ~ sym, boxdata, las=2, pch=19, ylim=c(-5,5),
varwidth=FALSE, xaxt='n')
mtext("Y-axis",side=2,line=2.5)
axis(1, at=1:5, labels=sort(unique(boxdata$sym)), las=2)
par(new=TRUE)
x <- 1:5
y <- runif(5, min=-1, max=1)
plot(x,y, col='red', type='p', pch=15, axes=FALSE, ylim=c(-1,1), cex=1.5)
axis(4, ylim=c(-1,1), las=1)
I get the following figure. As you can see the points in red do not align with the X-axis labels (or box centers). The box centers are located at 1:5, so I thought that the plot() call with x = 1:5 should line up.
Could anybody point me to a way to line up the second set of points with the box centers?
EDIT: This problem doesn't occur if I plot two sets of points on different y scales
plot(1:10, runif(10) , col='red', pch=19)
par(new=TRUE)
plot(1:10, runif(10, min=5, max=20), col='blue', pch=19, axes=FALSE)
axis(4, las=2)
Don't use par(new=TRUE); use points instead of the second plot command:
boxdata <- data.frame(fc=runif(100, min=-4, max=4),
sym=sample(c('A', 'B', 'C', 'D', 'E'), 100, replace=TRUE))
par(mar=c(5, 4, 1, 6) + 0.1)
junk <- boxplot(fc ~ sym, boxdata, las=2, pch=19, ylim=c(-5,5),
varwidth=FALSE, xaxt='n')
mtext("Y-axis",side=2,line=2.5)
axis(1, at=1:5, labels=sort(unique(boxdata$sym)), las=2)
x <- 1:5
y <- runif(5, min=-1, max=1)
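# y lies in [-1, 1]; multiply by 4 to map it onto the boxplot's [-4, 4] scale (ylim is not a parameter of points)
points(x, 4*y, col='red', type='p', pch=15, cex=1.5)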
axis(4, at=seq(-4, 4, by=2), label=seq(-1, 1, by=.5), las=1)
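The factor 4 above is a linear map from the secondary range [-1, 1] onto the part of the primary axis, [-4, 4], used for it. For ranges that are not symmetric around zero, the same idea can be written as a small helper. The function name rescale and its arguments are hypothetical, not part of base R; this is only a sketch of the arithmetic.

```r
# Hypothetical helper: map values linearly from the interval `from` to the interval `to`
rescale <- function(v, from, to) {
  (v - from[1]) / diff(from) * diff(to) + to[1]
}

# Secondary-axis values and the tick labels to show for them
ticks <- seq(-1, 1, by = 0.5)

# Positions of those ticks on the primary (boxplot) scale
tick_pos <- rescale(ticks, from = c(-1, 1), to = c(-4, 4))
tick_pos
```

The rescaled data would then go to points(), and tick_pos and ticks to the at and labels arguments of axis(4, ...).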
EDIT: Check the ?bxp help page. You will find a note that xlim defaults to range(at, *) + c(-0.5, 0.5). So, you could specify the same for your second plot:
junk <- boxplot(fc ~ sym, boxdata, las=2, pch=19, ylim=c(-5,5),
varwidth=FALSE, xaxt='n')
mtext("Y-axis",side=2,line=2.5)
axis(1, at=1:5, labels=sort(unique(boxdata$sym)), las=2)
par(new=TRUE)
plot(x,y, col='red', type='p', pch=15, axes=FALSE, ylim=c(-1,1), cex=1.5,
xlim=range(x) + c(-0.5, 0.5))
axis(4, ylim=c(-1,1), las=1)

How to avoid weird ylab error when plotting in R

I need a figure with two y-axes. hrbrmstr suggested using simple plots, but when adapting the graph to my setting I found that I cannot add the ylab on the right-hand side; I get a weird error:
Error in axis(4, ylim = c(0, 1), col = "black", col.axis = "black", las = 1, :
'labels' is supplied and not 'at'
Is this avoidable?
Look at the bottom line of the code for the SOURCE OF ERROR:
featPerf <- data.frame( expS=c("1", "2", "3", "4"),
exp1=c(1000, 0, 0, 0),
exp2=c(1000, 5000, 0, 0),
exp3=c(1000, 5000, 10000, 0),
exp4=c(1000, 5000, 10000,20000),
accuracy=c(0.4, 0.5, 0.65, 0.9) )
# make room for both axes ; adjust as necessary
par(mar=c(5, 5, 5, 7) + 0.2)
# plot the bars first with no annotations and specify limits for y
#barplot(as.matrix(featPerf[,2:5]), axes=FALSE, xlab="", ylab="", ylim=c(0, max(colSums(featPerf[2:5]))))
barplot(as.matrix(featPerf[,2:5]), axes=FALSE, xlab="", ylab="", beside=TRUE)
# make the bounding box (or not...it might not make sense for your plot)
#box()
# now make the left axis
axis(2, ylim=c(0, max(colSums(featPerf[2:5]))), col="black", las=1)
# start a new plot
par(new=TRUE)
# plot the line; adjust lwd as necessary
plot(x=1:4, y=featPerf[,6], xlab="Experiments", ylab="Abs. # of Features", axes=FALSE, type="l", ylim=c(0,1), lwd=5)
# annotate the second axis -- SOURCE OF ERROR -> VVVVVVVVVVVVVVVVVV
axis(4, ylim=c(0,1), col="black", col.axis="black", las=1, labels="Accuracy")
Like this?
par(mar=c(4,4,1,4) + 0.2)
barplot(as.matrix(featPerf[,2:5]), axes=FALSE, xlab="", ylab="", beside=TRUE)
axis(2, ylim=c(0, max(colSums(featPerf[2:5]))), col="black", las=1)
par(new=TRUE)
plot(x=1:4, y=featPerf[,6], xlab="Experiments", ylab="Abs. # of Features", axes=FALSE, type="l", ylim=c(0,1), lwd=5, col="blue")
axis(4, ylim=c(0,1), col="blue", col.axis="blue", las=1)
mtext("Accuracy",4,line=2, col="blue")
For the record, it is never a good idea to stack plots on top of each other this way (with two axes). I've made the line and the axis the same color in an attempt to draw attention to what you are doing, but this is still a very bad idea.
First of all, it is not advisable to use two y-axes in the same plot.
If you add at argument to the axis call, you get the name "Accuracy" on the right hand side of the plot.
axis(4, ylim=c(0,1), col="black", col.axis="black", las=1, labels="Accuracy",
at = .5)

R: plotting untransformed data on a log x axis (similar to plotting on log graph paper)

I have 3 sets of data that I am trying to plot on a single plot. The first data set x values range from ~ 1 to 1700 whereas the other two data sets x values are less than 20. Therefore I want to plot them on a log axis to show variations in all the data sets. However I do not want to transform the data as I want to be able to read the values off the graph. The x axis labels I would like are 1, 10, 100 and 1000 all equally spaced. Does anyone know how to do this? I can only find examples where the data is log as well as the axis. I have attached the code I am currently using below:
Thanks in advance for any help given.
Holly
Stats_nineteen<-read.csv('C:/Users/Holly/Documents/Software Manuals/R Stuff/Stats_nineteen.csv')
attach(Stats_nineteen)
x<-Max
x1<-Min
x2<-Max
y1<-Depth
y2<-Depth
par(bg="white")
par(xlog=TRUE)
plot(x2,y1, type="n", ylim=c(555,0), log="x", axes=FALSE, ann=FALSE)
box()
axis(3, at=c(1,10,100,1000), label=c(1,10,100,1000), pos=0, cex.axis=0.6)
axis(1, at=c(1,10,100,1000), label=c(1,10,100,1000), cex.axis=0.6)
axis(2, at=c(600,550,500,450,400,350,300,250,200,150,100,50,0), label=c
(600,"",500,"",400,"",300,"",200,"",100,"",0), cex.axis=0.6)
mtext("CLAST SIZE / mm", side=3, line=1, cex=0.6, las=0, col="black")
mtext("DEPTH / m", side=2, line=2, cex=0.6, las=0, col="black")
grid(nx = NULL, ny = NULL, col = "lightgray", lty = "solid",
lwd = par("lwd"), equilogs = TRUE)
par(new=TRUE)
lines(x1,y1, col="black", lty="solid", lwd=1)
lines(x2,y2, col="black", lty="solid", lwd=1)
polygon(c(x1,rev(x2)), c(y1,rev(y2)), col="grey", border="black")
par(new=TRUE)
plot(x=Average,y=Depth, type="o",
bg="red", cex=0.5, pch=21,
col="red", lty="solid",
axes=FALSE, xlim=c(0,1670), ylim=c(555,0),
ylab = "",xlab = "")
par(new=TRUE)
plot(x=Mode,y=Depth, type="o",
bg="blue", cex=0.5, pch=21,
col="blue", lty="solid",
axes=FALSE, xlim=c(0,1670), ylim=c(555,0),
ylab = "",xlab = "")
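You can do this in ggplot2 using scale_x_log10(), which puts the axis on a log scale while labelling it with the untransformed values,
so something like:
library(ggplot2)
myplot <- ggplot( Stats_nineteen,
aes (x = myResponse,
y = myPredictor,
group = myGroupingVariable) ) +
geom_point() +
scale_x_log10()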
myplot
Also, avoid attach(); it can give odd behavior.
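The same result is possible in base graphics. In the question, the later plot() calls after par(new=TRUE) do not pass log="x" and use xlim=c(0,1670), so they are drawn on a linear scale over the log-scaled first plot; note also that a log axis cannot start at 0. A minimal sketch, with made-up data standing in for the CSV (the names depth, max_size and avg_size are assumptions): set log="x" once, then add the other series with lines() so they share the coordinate system.

```r
set.seed(1)                                   # made-up data, not the original file
depth    <- seq(0, 550, by = 50)
max_size <- runif(length(depth), 100, 1700)   # large clast sizes
avg_size <- runif(length(depth), 1, 20)       # small clast sizes

# Untransformed data on a log x axis; the lower x limit must be > 0
plot(max_size, depth, log = "x", type = "o", pch = 20,
     xlim = c(1, 2000), ylim = c(555, 0), xaxt = "n",
     xlab = "CLAST SIZE / mm", ylab = "DEPTH / m")

# Decade ticks labelled with the untransformed values
axis(1, at = c(1, 10, 100, 1000))

# Further series go on the same axes, so no par(new=TRUE) is needed
lines(avg_size, depth, type = "o", pch = 20, col = "red")
```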
