Create a secondary y-axis in R

I have a question about creating a secondary y-axis in R. Here is an example dataset:
#generate some artificial data
per_cur <- runif(1171, 0.1, 7.62)
obs<-runif(1171,100,1000)
#create a density histogram of per_cur
par(mfrow=c(2,1))
op <- par(mar = c(5,4,4,4) + 0.5)
hist(per_cur, prob=TRUE, border="white",main=NULL,las=1,cex.axis=0.8,ann = FALSE)
lines(density(per_cur), col="blue",lwd=2)
#add obs with a secondary y-axis
par(new = TRUE)
plot(per_cur,obs, cex=.5, pch=16, col=adjustcolor("black",alpha=0.2), axes = FALSE, ylab="Density")
axis(4,cex.axis=0.5)
It produces a plot that shows the distribution of per_cur and, through the secondary y-axis, the relationship
between per_cur and obs. However, when I run the following code, where the only difference is that I set the
limits of the primary y-axis with ylim=c(0,0.3), the plot changes completely.
It gives the impression that the relationship between obs and per_cur differs between the two plots (more obs points
fall under the curve in the first plot than in the second).
op <- par(mar = c(5,4,4,4) + 0.5)
hist(per_cur, prob=TRUE,ylim=c(0,0.3), border="white",main=NULL,las=1,cex.axis=0.8,ann = FALSE)
lines(density(per_cur), col="blue",lwd=2)
par(new = TRUE)
plot(per_cur,obs, cex=.5, pch=16, col=adjustcolor("black",alpha=0.2), axes = FALSE, ylab="Density")
axis(4,cex.axis=0.5)
Is there a way to make the secondary y-axis adjust as well when I change the primary y-axis limits, so that
the same number of obs points fall under the curve in both plots? I hope this is clear.

This can be accomplished by manipulating the ylim on each plot. For example:
#generate some artificial data
per_cur <- runif(1171, 0.1, 7.62)
obs<-runif(1171,100,1000)
#create a density histogram of per_cur
# use these variables to set the limits on all plots
y1max = max(density(per_cur)$y)
y2max = max(obs)
par(mfrow=c(2,1))
op <- par(mar = c(5,4,4,4) + 0.5)
hist(per_cur, prob=TRUE, ylim = c(0, y1max), border="white",main=NULL,las=1,cex.axis=0.8,ann = FALSE)
lines(density(per_cur), col="blue",lwd=2)
#add obs with a secondary y-axis
par(new = TRUE)
plot(per_cur,obs, cex=.5, ylim = c(0, y2max), pch=16, col=adjustcolor("black",alpha=0.2), axes = FALSE, ylab="Density")
axis(4,cex.axis=0.5)
Then you scale each axis by the same factor:
# used to scale the axes
factor <- 2
op <- par(mar = c(5,4,4,4) + 0.5)
hist(per_cur, prob=TRUE, ylim = c(0, y1max * factor), border="white",main=NULL,las=1,cex.axis=0.8,ann = FALSE)
lines(density(per_cur), col="blue",lwd=2)
par(new = TRUE)
plot(per_cur,obs, cex=.5, ylim = c(0, y2max * factor),pch=16, col=adjustcolor("black",alpha=0.2), axes = FALSE, ylab="Density")
axis(4,cex.axis=0.5)
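The same idea can be wrapped in a small helper so that both limits always share one scaling factor. This is only a sketch: `paired_ylim` and its argument names are hypothetical and not part of the answer above, and the seed is added just to make the example reproducible.

```r
# Hypothetical helper: returns matching limits for the primary (density) and
# secondary (obs) axes, both stretched by the same factor.
paired_ylim <- function(y1max, y2max, factor = 1) {
  list(primary = c(0, y1max * factor),
       secondary = c(0, y2max * factor))
}

set.seed(42)
per_cur <- runif(1171, 0.1, 7.62)
obs <- runif(1171, 100, 1000)

lims <- paired_ylim(max(density(per_cur)$y), max(obs), factor = 2)

op <- par(mar = c(5, 4, 4, 4) + 0.5)
hist(per_cur, prob = TRUE, ylim = lims$primary, border = "white",
     main = NULL, las = 1, cex.axis = 0.8, ann = FALSE)
lines(density(per_cur), col = "blue", lwd = 2)
par(new = TRUE)
plot(per_cur, obs, ylim = lims$secondary, cex = 0.5, pch = 16,
     col = adjustcolor("black", alpha.f = 0.2), axes = FALSE, ann = FALSE)
axis(4, cex.axis = 0.5)
par(op)
```

Because both axes are stretched by the same proportion, each point keeps its vertical position relative to the density curve whatever value of `factor` is chosen.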

Related

"col" argument in plot function not working when a factor value is used for x - axis

I am doing a quarterly analysis and want to plot it. To keep the x-axis continuous, I converted the quarters to factors. But when I call plot and try to color it red, the col argument has no effect.
An example:
quarterly_analysis <- data.frame(Quarter = as.factor(c(2020.1,2020.2,2020.3,2020.4,2021.1,2021.2,2021.3,2021.4)),
AvgDefault = as.numeric(c(0.24,0.27,0.17,0.35,0.32,0.42,0.38,0.40)))
plot(quarterly_analysis, col="red")
But the graph is drawn in black.
Converting the quarters to a factor is not ideal for plotting unless you have multiple values per level, because plot then draws a box plot. For example, with 10 observations per level, col = "red" shows up as the fill:
set.seed(123)
fact_example <- data.frame(factvar = as.factor(rep(LETTERS[1:3], 10)),
numvar = runif(30))
plot(fact_example$factvar, fact_example$numvar,
col = "red")
With only one observation per level, this is not ideal because all you see is the median line that each box would have.
You could use border = "red":
plot(quarterly_analysis$Quarter,
quarterly_analysis$AvgDefault, border="red")
Or, if you want more flexibility, you can plot it numerically and do a little tweaking for more control (e.g., you can change the pch or make it a line graph):
# make numeric x values to plot: year plus (quarter - 1) / 4,
# so that Q4 of one year does not land on Q1 of the next
q_lab <- as.character(quarterly_analysis$Quarter)
x_vals <- as.numeric(substr(q_lab, 1, 4)) + (as.numeric(substr(q_lab, 6, 6)) - 1) / 4
par(mfrow=c(1,3))
plot(x_vals,
quarterly_analysis$AvgDefault, col="red",
pch = 7, main = "Square Symbol", axes = FALSE)
axis(1, at = x_vals,
labels = quarterly_analysis$Quarter)
axis(2)
plot(x_vals,
quarterly_analysis$AvgDefault, col="red",
type = "l", main = "Line graph", axes = FALSE)
axis(1, at = x_vals,
labels = quarterly_analysis$Quarter)
axis(2)
plot(x_vals,
quarterly_analysis$AvgDefault, col="red",
type = "b", pch = 7, main = "Both", axes = FALSE)
axis(1, at = x_vals,
labels = quarterly_analysis$Quarter)
axis(2)
Data
set.seed(123)
quarterly_analysis <- data.frame(Quarter = as.factor(paste0(2019:2022,
rep(c(".1", ".2", ".3", ".4"),
each = 4))),
AvgDefault = runif(16))
quarterly_analysis <- quarterly_analysis[order(quarterly_analysis$Quarter),]
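The conversion from quarter labels to numeric x positions can be isolated in a helper. The sketch below assumes labels of the form "YYYY.Q" as in the question; `quarter_to_numeric` is a hypothetical name, and the data are the first five values from the question.

```r
# Hypothetical helper: turn labels like "2020.3" into 2020.5
# (year plus a quarter offset of 0, 0.25, 0.5 or 0.75).
quarter_to_numeric <- function(quarter) {
  q <- as.character(quarter)            # works for factors too
  year <- as.numeric(substr(q, 1, 4))
  qtr <- as.numeric(substr(q, 6, 6))
  year + (qtr - 1) / 4
}

quarters <- factor(c("2020.1", "2020.2", "2020.3", "2020.4", "2021.1"))
avg_default <- c(0.24, 0.27, 0.17, 0.35, 0.32)

x_vals <- quarter_to_numeric(quarters)
plot(x_vals, avg_default, col = "red", type = "b", pch = 16, xaxt = "n",
     xlab = "Quarter", ylab = "AvgDefault")
axis(1, at = x_vals, labels = as.character(quarters))
```

Since the x values are numeric, plot draws an ordinary scatter or line plot and col applies to the points and lines directly.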

Plot percentage change figure with 95% CI and stats

I would like to reproduce the attached figure, but I have no clue how to do so:
Let's say I use the CO2 example dataset and want to plot the relative change of uptake according to Treatment. Instead of the three variables in the example figure, I would like to show the different Plants grouped by day/Type.
So far I have only managed this bit of code, which is far from what it should look like.
aov1 <- aov(CO2$uptake~CO2$Type+CO2$Treatment+CO2$Plant)
plot(TukeyHSD(aov1, conf.level=.95))
The axes should be switched, and I would like to mark statistically significant changes with letters or stars.
You can do this by building it in base R - this should get you started. See comments in code for each step, and I suggest running it line by line to see what's being done to customize for your specifications:
Set up data
# Run model
aov1 <- aov(CO2$uptake ~ CO2$Type + CO2$Treatment + CO2$Plant)
# Organize plot data
aov_plotdata <- data.frame(coef(aov1), confint(aov1))[-1,] # remove intercept
aov_plotdata$coef_label <- LETTERS[1:nrow(aov_plotdata)] # Example labels
Build plot
#set up plot elements
xvals <- 1:nrow(aov_plotdata)
yvals <- range(aov_plotdata[,2:3])
# Build plot
plot(x = range(xvals), y = yvals, type = 'n', axes = FALSE, xlab = '', ylab = '') # set up blank plot
points(x = xvals, y = aov_plotdata[,1], pch = 19, col = xvals) # add in point estimate
segments(x0 = xvals, y0 = aov_plotdata[,2], y1 = aov_plotdata[,3], lty = 1, col = xvals) # add in 95% CI lines
axis(1, at = xvals, labels = aov_plotdata$coef_label) # add in x axis
axis(2, at = seq(floor(min(yvals)), ceiling(max(yvals)), 10)) # add in y axis
segments(x0=min(xvals), x1 = max(xvals), y0=0, lty = 2) #add in midline
legend(x = max(xvals)-2, y = max(yvals), aov_plotdata$coef_label, bty = "n", # add in legend
pch = 19,col = xvals, ncol = 2)
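The question also asks for significance markers. One possible approach, sketched here under assumptions, is to print a star above every interval that excludes zero (for a single coefficient, a 95% CI that excludes zero corresponds to p < 0.05 in its t-test). `ci_stars` is a hypothetical helper, and the model is deliberately simplified to Type + Treatment rather than the full model used above.

```r
# Hypothetical helper: "*" when the interval excludes zero, "" otherwise.
ci_stars <- function(lower, upper) {
  ifelse(lower > 0 | upper < 0, "*", "")
}

# Simplified model (assumption: Plant is left out to keep the example small)
fit <- lm(uptake ~ Type + Treatment, data = CO2)
est <- data.frame(coef = coef(fit), confint(fit))[-1, ]  # drop intercept
names(est) <- c("coef", "lower", "upper")
est$star <- ci_stars(est$lower, est$upper)

xvals <- seq_len(nrow(est))
plot(x = c(0.5, nrow(est) + 0.5), y = range(c(0, est$lower, est$upper)),
     type = "n", xaxt = "n", xlab = "", ylab = "Estimate (95% CI)")
abline(h = 0, lty = 2)                                  # reference line at zero
segments(x0 = xvals, y0 = est$lower, y1 = est$upper)    # 95% CI lines
points(xvals, est$coef, pch = 19)                       # point estimates
axis(1, at = xvals, labels = rownames(est))
# Place each star just above the upper end of its interval
text(xvals, est$upper, labels = est$star, pos = 3, xpd = NA)
```

Letters for pairwise group comparisons would need a separate multiple-comparison step, which this sketch does not attempt.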

R plot: why does introducing axis limits create havoc?

My code
library(Hmisc)
r1 <- read.table("mt7.1r1.rp", header = FALSE)
r2 <- read.table("mt7.1r2.rp", header = FALSE)
r3 <- read.table("mt7.2r1.rp", header = FALSE)
r4 <- read.table("mt7.2r2.rp", header = FALSE)
p1=r1[1]
per1=log10(p1)
p2=r2[1]
per2=log10(p2)
p3=r3[1]
per3=log10(p3)
p4=r4[1]
per4=log10(p4)
m1=nrow(per1)
m2=nrow(per2)
m3=nrow(per3)
m4=nrow(per4)
xmin <- floor( min(per1,per2,per3,per4))
xmax <- ceiling( max(per1,per2,per3,per4))
lxmax=10^(xmax)
lxmin=10^(xmin)
rhoaxy = r2[3]
phaxy = r2[5]
rhoayx = r3[3]
phayx = r3[5]
rhoaxx = r1[3]
phaxx = r1[5]
rhoayy = r4[3]
phayy = r4[5]
per2=unname(per2)
per2=unlist(per2)
per3=unname(per3)
per3=unlist(per3)
rhoaxy=unname(rhoaxy)
rhoaxy=unlist(rhoaxy)
rhoaxy=log10(rhoaxy)
rhoayx=unname(rhoayx)
rhoayx=unlist(rhoayx)
rhoayx=log10(rhoayx)
ymin1=floor(min(rhoaxy)-1)
ymax1=ceiling(max(rhoaxy)+1)
ymin2=floor(min(rhoayx)-1)
ymax2=ceiling(max(rhoayx)+1)
ymin=min(ymin1,ymin2)
ymax=max(ymax1,ymax2)
png("withlim.png")
plot(per2,rhoaxy, col='red', xlab='Per (s)', ylab = 'Rho-xy/yx',ylim=c(ymin, ymax))
par(new=TRUE)
plot(per3,rhoayx, col='green', xaxt='n', xlab= NA, yaxt = 'n', ylab = NA)
dev.off()
The image I got
If I delete ylim
My question is: why do the axis limits change the image content? The values in the second image correspond to the proper data values. The first image shows values that do not represent rhoaxy and rhoayx.
It is difficult to test without the data, but my guess is that the second plot uses a different y-axis from the first, even though that axis is not drawn (yaxt = 'n').
So you end up with two superimposed plots, each with its own y-axis scale.
If you want the same ylim on both plots, add ylim=c(ymin, ymax) to the second plot call as well.
If that does not work, please provide example data so we can test.
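A reproducible sketch of the fix, using synthetic stand-ins for the period and resistivity columns (the real files are not available, so all values here are assumptions). Instead of overlaying a second plot with par(new = TRUE), it adds the second series with points(), which reuses the coordinate system of the first plot.

```r
set.seed(1)
# Synthetic stand-ins for the log10 period and log10 resistivity columns
per <- seq(-2, 3, length.out = 30)
rho_xy <- 2 + 0.3 * per + rnorm(30, sd = 0.1)
rho_yx <- 1 + 0.1 * per + rnorm(30, sd = 0.1)

# One set of limits covering both series
ylim_shared <- range(rho_xy, rho_yx)

plot(per, rho_xy, col = "red", xlab = "Per (s)", ylab = "Rho-xy/yx",
     ylim = ylim_shared)
# points() draws in the coordinate system of the existing plot,
# so no par(new = TRUE) and no second set of limits is needed
points(per, rho_yx, col = "green")
usr_after <- par("usr")  # plot region limits actually in use
```

If the two series have different x values, as in the question, xlim should be set from the combined range in the same way.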

Points Scale in R barplot [duplicate]

I'm having trouble with a multi-axis barplot. I have bars and dots in the same graph, and I need to show them on different scales.
I can draw both (bars and dots) correctly, but the problem comes when I try to set different scales on the left and right axes. I don't know how to change the scale of the additional axis, or how to bind the red dots to the right axis and the bars to the left one.
This is my code and what I get:
labels <- value
mp <- barplot(height = churn, main = title, ylab = "% churn", space = 0, ylim = c(0,5))
text(mp, par("usr")[3], labels = labels, srt = 45, adj = c(1.1,1.1), xpd = TRUE, cex=.9)
# Population dots
points(popul, col="red", bg="red", pch=21, cex=1.5)
# Churn Mean
media <- mean(churn)
abline(h=media, col = "black", lty=2)
# Population scale
axis(side = 4, col= "red")
ylim= c(0,50)
ylim= c(0,5)
What I want is the left (grey) axis at ylim=c(0,5) with the bars bound to it, and the right (red) axis at ylim=c(0,50) with the dots bound to it.
The goal is to show bars and points in the same graph with different axes.
I hope I explained myself clearly.
Thanks for your assistance!
Here is a toy example. The only "trick" is to store the x locations of the bar centers and the limits of the x axis when creating the barplot, so that you can overlay a plot with the same x axis and add your points over the centers of the bars. The xaxs = "i" in the call to plot.window indicates to use the exact values given rather than expanding by a constant (the default behavior).
set.seed(1234)
dat1 <- sample(10, 5)
dat2 <- sample(50, 5)
par(mar = c(2, 4, 2, 4))
cntrs <- barplot(dat1)
xlim0 <- par()$usr[1:2]
par(new = TRUE)
plot.new()
plot.window(xlim = xlim0, ylim = c(0, 50), xaxs = "i")
points(dat2 ~ cntrs, col = "darkred")
axis(side = 4, col = "darkred")
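An alternative that avoids overlaying a second plot is to rescale the dot values onto the left-axis scale and relabel the right axis in the original units. This is a sketch with made-up numbers; `to_left`, `left_max` and `right_max` are hypothetical names chosen to match the 0-5 and 0-50 ranges in the question.

```r
churn <- c(2.1, 3.4, 1.8, 4.2, 2.9)   # made-up bar heights, left axis 0-5
popul <- c(12, 35, 48, 20, 27)         # made-up dot values, right axis 0-50

left_max <- 5
right_max <- 50
# Map a right-axis value onto the left-axis scale
to_left <- function(v) v * left_max / right_max

par(mar = c(2, 4, 2, 4))
cntrs <- barplot(churn, ylim = c(0, left_max), ylab = "% churn")
points(cntrs, to_left(popul), col = "red", bg = "red", pch = 21, cex = 1.5)
# Ticks are placed at left-scale positions but labelled in right-scale units
right_ticks <- seq(0, right_max, by = 10)
axis(side = 4, at = to_left(right_ticks), labels = right_ticks, col = "red")
```

With this approach there is only one coordinate system, so later additions such as abline() for the churn mean keep working on the left-axis scale.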

combine histogram with scatter plot in R

I am trying to produce a single plot that combines a histogram and a scatter plot using a secondary axis. In detail, here is some example data:
#generate example data
set.seed(1)
a <- rnorm(200,mean=500,sd=35)
data <- data.frame(a = a,
b = rnorm(200, mean=10, sd=2),
c = c(rep(1,100), rep(0,100)))
# produce a histogram of data$a
hist(a, prob=TRUE, col="grey")
#add a density line
lines(density(a), col="blue", lwd=2)
#scatter plot
plot(data$a,data$b,col=ifelse(data$c==1,"red","black"))
What I want to do is to combine the histogram and scatter plot together. This implies my x-axis will be data$a, my primary y-axis is the frequency/density for the histogram and my secondary y-axis is data$b.
Maybe something like this...
# produce a histogram of data$a
hist(a, prob=TRUE, col="grey")
#add a density line
lines(density(a), col="blue", lwd=2)
par(new = TRUE)
#scatter plot
plot(data$a,data$b,col=ifelse(data$c==1,"red","black"),
axes = FALSE, ylab = "", xlab = "")
axis(side = 4, at = seq(4, 14, by = 2))
There's a good blog post on this at http://www.r-bloggers.com/r-single-plot-with-two-different-y-axes/.
Basically, as the blog describes, you need to do:
par(new = TRUE)
plot(data$a,data$b,col=ifelse(data$c==1,"red","black"), axes = FALSE, xlab = NA, ylab = NA)
axis(side = 4)
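One detail worth checking with this overlay: hist() sets its x range from the bin breaks, while plot() sets it from the data range, so the two x-axes need not line up. The sketch below reuses the histogram's limits for the scatter plot and adds a label for the secondary axis. It borrows the par("usr") and xaxs = "i" idea from the barplot answer above; the variable names `hist_usr`, `scatter_usr` and `grp` are assumptions.

```r
set.seed(1)
a <- rnorm(200, mean = 500, sd = 35)
b <- rnorm(200, mean = 10, sd = 2)
grp <- rep(c(1, 0), each = 100)

op <- par(mar = c(5, 4, 4, 4) + 0.1)    # extra room on the right
hist(a, prob = TRUE, col = "grey")
lines(density(a), col = "blue", lwd = 2)
hist_usr <- par("usr")                   # x/y limits of the histogram panel

par(new = TRUE)
plot(a, b, col = ifelse(grp == 1, "red", "black"),
     xlim = hist_usr[1:2], xaxs = "i",   # reuse the histogram's x range exactly
     axes = FALSE, xlab = NA, ylab = NA)
scatter_usr <- par("usr")
axis(side = 4)
mtext("b", side = 4, line = 3)           # label for the secondary axis
par(op)
```

With xaxs = "i" the scatter plot uses the supplied limits without the default expansion, so each point sits at the x position shown on the histogram's axis.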
