2 Y axis histogram (normal frequency vs relative frequency) - r

I would like your help, please.
I have these 2 plots, separately. One shows the normal frequency and the other, with exactly the same data, shows the relative frequency.
Can you tell me how I can join them in a single plot with 2 y axes (frequency and relative frequency)?
x<- AAA$starch
h<-hist(x, breaks=40, col="lightblue", xlab="Starch ~ Corn",
main="Histogram with Normal Curve", xlim=c(58,70),ylim = c(0,2500),axes=TRUE)
xfit<-seq(min(x),max(x),length=40)
yfit<-dnorm(xfit,mean=mean(x),sd=sd(x))
yfit <- yfit*diff(h$mids[1:2])*length(x)
lines(xfit, yfit, col="blue", lwd=3)
library(HistogramTools)
x<- AAA$starch
c <- hist(x,breaks=10, ylab="Relative Frequency", main="Histogram with Normal Curve",ylim=c(0,2500), xlim=c(58,70), axes=TRUE)
PlotRelativeFrequency((c))
Thank you!!
EDIT:
This is just an example image of what I want...

I use doubleYScale from package latticeExtra.
Here is an example (I am not sure about the relative frequency calculation):
library(latticeExtra)
set.seed(42)
firstSet <- rnorm(500,4)
breaks = 0:10
#Cut data into sections
firstSet.cut = cut(firstSet, breaks, right=FALSE)
firstSet.freq = table(firstSet.cut)
#Calculate relative frequency
firstSet.relfreq = firstSet.freq / length(firstSet)
#Parse to a list to use xyplot later and assigning x values
firstSet.list <- list(x = 1:10, y = as.vector(firstSet.relfreq))
#Build histogram and relative frequency curve
hist1 <- histogram(firstSet, breaks = 10, freq = TRUE, col='skyblue', xlab="Starch ~ Corn", ylab="Frequency", main="Histogram with Normal Curve", ylim=c(0,40), xlim=c(0,10), plot=FALSE)
relFreqCurve <- xyplot(y ~ x, firstSet.list, type="l", ylab = "Relative frequency", ylim=c(0,1))
#Build double objects plot
doubleYScale(hist1, relFreqCurve, add.ylab2 = TRUE)
And here is the result, with two y axes on different scales:
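For comparison, here is a minimal base-graphics sketch of the same idea, without latticeExtra. It relies on the fact that relative frequency is just count / n, so the right-hand axis can be a relabelling of the left-hand one. The simulated `x` is only a stand-in for `AAA$starch`, and the margin and tick choices are assumptions, not part of the original answer.

```r
set.seed(42)
x <- rnorm(500, mean = 4)             # stand-in for AAA$starch

old_par <- par(mar = c(5, 4, 4, 5))   # widen the right margin for the second axis
h <- hist(x, breaks = 40, col = "lightblue",
          xlab = "Starch ~ Corn", ylab = "Frequency",
          main = "Histogram with Normal Curve")

# Right axis: ticks chosen on the relative scale, placed at the matching counts
n <- length(x)
rel_ticks <- pretty(h$counts / n)
axis(side = 4, at = rel_ticks * n, labels = rel_ticks)
mtext("Relative frequency", side = 4, line = 3)

# Normal curve scaled from density to counts (density * bin width * n)
xfit <- seq(min(x), max(x), length.out = 200)
yfit <- dnorm(xfit, mean = mean(x), sd = sd(x)) * diff(h$breaks[1:2]) * n
lines(xfit, yfit, col = "blue", lwd = 3)
par(old_par)
```

Because both axes describe the same bars, no second plot is overlaid; only the tick labels differ.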

Related

Density curve on histogram is flat

I am trying to plot a curve that follows the trend of the histogram of my data. I have looked around and tried out other people's code, but I still get a flat line. Here is my code:
hist(Ferr, xlab = "Ferritin Plasma Concentration", ylab = "Frequency",
main = "Histogram of Ferritin Plasma Concentration",
xlim = c(0,250), ylim = c(0,50), cex.axis=0.8, cex.lab=0.8, cex.main = 1)
curve(dnorm(x, mean = mean(Ferr), sd = sd(Ferr)), col="blue", add=TRUE)
lines(density(Ferr), col="red")
If anyone can help me to see where I have gone wrong, that would be great thank you.
Unlike a histogram of counts, the integral of a density function over the whole space is equal to 1:
sum(density(x)*dx) = 1
To scale the density function to the histogram, you can multiply it by the maximum value of the histogram bins and divide it by the distance between points.
Let's take mtcars$mpg as example:
Ferr <- mtcars$mpg
d <- density(Ferr)
dx <- diff(d$x)[1]
sum(d$y)*dx
[1] 1.000851
h <- hist(Ferr)
lines(x=d$x,y=max(h$counts)*d$y/dx)
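Another scaling that may be worth trying is the one the histogram code in the first question applies to `dnorm`: multiply the density by the number of observations and by the bin width. This is only a sketch under the assumption that all bins have equal width; `bin_width` and `scaled_y` are names introduced here for illustration.

```r
Ferr <- mtcars$mpg
h <- hist(Ferr)                      # counts on the y axis
d <- density(Ferr)

# A density integrates to 1, while the count histogram has area n * bin width
bin_width <- diff(h$breaks)[1]
scaled_y  <- d$y * length(Ferr) * bin_width
lines(d$x, scaled_y, col = "red")

# The fitted normal curve can be rescaled the same way
curve(dnorm(x, mean = mean(Ferr), sd = sd(Ferr)) * length(Ferr) * bin_width,
      col = "blue", add = TRUE)
```

With this scaling the area under the red curve is approximately the total area of the bars, so the curve tracks the bar heights rather than only matching the tallest one.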
You need to set freq = FALSE (and remove the constraints on ylim and xlim, and change "Frequency" to "Density"):
hist(Ferr,
freq= FALSE,
xlab = "Ferritin Plasma Concentration", ylab = "Density",
main = "Histogram of Ferritin Plasma Concentration",
cex.axis=0.8, cex.lab=0.8,cex.main = 1)
curve(dnorm(x, mean = mean(Ferr), sd = sd(Ferr)), col="blue", add=TRUE)
lines(density(Ferr), col="red")
Toy data:
Ferr <- rnorm(1000)

adjusting plot axis in user defined function - R

I have a function in R which creates a standard normal plot, and then uses a for loop that calls density plots for the t distribution for various degrees of freedom. The plot looks like:
Note that the density for degrees of freedom = 2 extends outside of the y axis limits. I am wondering if there is a way to edit the for loop so that the axis limits are adjusted according to the range of the density lines that are drawn.
The for loop code that I am using is as follows:
N <- 1000
n <- c(25,50,100,200)
df<-c(1:4,seq(5,25,by=5))
histPlot <- function(data) {
x <- seq(-4, 4, length=100)
y <- dnorm(x, mean=0, sd=1)
plot(x, y, type="l",
main=paste("Distribution of size", nrow(data)/9000, sep=" "),
xlab="standard deviation")
colors <- brewer.pal(n = 9, name = "Spectral")
i<-1
for (d in df) {
lines(density(data[data$df==d, "t"]),col=colors[i])
legend("topright", pch=c(21,21), col=c(colors, "black"), legend=c(df, "normal"), bty="o", cex=.8)
i <- i+1
}
}
The lines() calls inside the for loop only add to the existing plot; they do not rescale its axes.
This means you have to set the ylim parameter in the plot() call. A larger y range makes the plot taller, so the lines stay visible when they are added.
Try like this:
plot(x, y, type="l",
main=paste("Distribution of size", nrow(data)/9000, sep=" "),
xlab="standard deviation",
ylim = c(0, 1)) # This line will make the plot higher, i.e. the y axis range will be from 0 to 1
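If a hard-coded upper limit is not wanted, one possible approach is to compute all the densities first and derive ylim from their maxima. The sketch below assumes a data frame with the question's `df` and `t` columns; the simulated `data`, the `from`/`to` limits, and the `rainbow()` colours are stand-ins chosen for illustration.

```r
set.seed(1)
df <- c(1:4, seq(5, 25, by = 5))
# Hypothetical stand-in for the question's data: 1000 draws per degree of freedom
data <- data.frame(df = rep(df, each = 1000),
                   t  = unlist(lapply(df, function(d) rt(1000, df = d))))

x <- seq(-4, 4, length.out = 100)
y <- dnorm(x, mean = 0, sd = 1)

# Compute every density before plotting so the y range is known up front
dens  <- lapply(df, function(d) density(data$t[data$df == d], from = -4, to = 4))
y_max <- max(y, sapply(dens, function(d) max(d$y)))

plot(x, y, type = "l", xlab = "standard deviation", ylim = c(0, y_max))
colors <- rainbow(length(df))
for (i in seq_along(dens)) {
  lines(dens[[i]], col = colors[i])
}
# Draw the legend once, after the loop
legend("topright", lty = 1, col = c(colors, "black"),
       legend = c(df, "normal"), bty = "o", cex = 0.8)
```

Storing the densities in a list means each one is computed only once and then reused for both the axis limit and the drawing.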

reversing the cutoff values in R using ROCR package

I am using a confusion matrix to create a ROC curve. The problem is that the cutoff values are reversed. How do I put them in ascending order?
pred <- prediction(predictions =c(rep(5,19),rep(7,24),rep(9,40),rep(10,42)), labels =
c(rep(0,18),rep(1,1),rep(0,7),rep(1,17),rep(0,4),rep(1,36),rep(0,3),rep(1,39)))
perf <- performance(pred,"tpr","fpr")
plot(perf,colorize=TRUE)
abline(0,1,col='red')
x = c(0.09375,0.21875,0.43750)
y = c(0.4193548,0.8064516,0.9892473)
points(x , y , col="red", pch=19)
text(x , y+0.03, labels= c("9","7","5"), col="red", pch=19)
predictions = c(rep(5,19),rep(7,24),rep(9,40),rep(10,42))
labels = c(rep(0,18),rep(1,1),rep(0,7),rep(1,17),rep(0,4),rep(1,36),rep(0,3),rep(1,39))
auc(predictions,labels)

R: Plot fixed axis range for each plot in a loop

I have a sample data set for which I plot several png files divided by groups (in this case by ID) in a loop.
A question about the x axis: how can I set a fixed range (say 1940 to 2014) inside the for loop so that the x axis always covers this range (case 1), but fall back to the automatic axis range used in the loop below whenever a group contains values of YEAR before 1940 (case 2)?
Case 1 with the sample data would be for the group with ID 259 (NAME2) and case 2 would be for the group with ID 47 (NAME1)
Here is my code:
xy <- data.frame(NAME=c("NAME1", "NAME1","NAME1","NAME1","NAME2","NAME2","NAME2"),ID=c(47,47,47,47,259,259,259),YEAR=c(1932,1942,1965,1989,2007,2008,2014),VALUE=c(0,NA,-6,-16,0,-9,-28))
ind <- split(x = xy,f = xy[,'ID'])
### PLOT
for(i in 1:length(ind)){
png(names(ind[i]), width=3358, height=2329, res=300)
par(mar=c(6,8,6,5))
plot(ind[[i]][,c('YEAR','VALUE')],
type='n',
main=ind[[i]][1,'NAME'],
xlab="Time [Years]",
ylab="Length change [m]")
# plot axis
axis(1, at = seq(1000,2030,10), cex.axis=1, labels=FALSE, tcl=-0.3)
# plot points and lines
points(ind[[i]][,c('YEAR','VALUE')], type="l", lwd=2)
points(ind[[i]][,c('YEAR','VALUE')], type="p", lwd=1, cex=1, pch=21, bg='white')
# plot vertical line through 0
abline(h=0)
dev.off()
}
You've almost got it! Starting with a blank plot and then adding points/lines is perfect. Change your initial plot call to include the ranges you want, and you're good to go:
x.range <- c(1940, 2014)
if (min(ind[[i]][, 'YEAR'], na.rm = TRUE) < 1940) {
x.range <- range(ind[[i]][, 'YEAR'], finite = TRUE)
}
plot(x = x.range,
y = range(ind[[i]][,'VALUE'], finite = T),
type='n',
main=ind[[i]][1,'NAME'],
xlab="Time [Years]",
ylab="Length change [m]")
Note that these plots will still have different y axes.
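The range selection can also be pulled out into a small helper, which makes both cases easy to check on the sample data. `choose_xlim` is a hypothetical name introduced here, not part of the original answer.

```r
# Return the fixed range unless the group has years before its lower bound
choose_xlim <- function(years, fixed = c(1940, 2014)) {
  if (min(years, na.rm = TRUE) < fixed[1]) {
    range(years, finite = TRUE)
  } else {
    fixed
  }
}

xy <- data.frame(NAME  = c("NAME1", "NAME1", "NAME1", "NAME1", "NAME2", "NAME2", "NAME2"),
                 ID    = c(47, 47, 47, 47, 259, 259, 259),
                 YEAR  = c(1932, 1942, 1965, 1989, 2007, 2008, 2014),
                 VALUE = c(0, NA, -6, -16, 0, -9, -28))
ind <- split(xy, xy$ID)

# One x range per group: ID 47 falls back to its data range, ID 259 gets the fixed one
xlims <- lapply(ind, function(g) choose_xlim(g$YEAR))
```

Inside the loop, the result could be passed as `xlim = xlims[[i]]` to the original plot() call instead of plotting the range vectors themselves.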

Controlling z labels in contourplot

I am trying to control how many z labels should be written in my contour plot plotted with contourplot() from the lattice library.
I have 30 contour lines but I only want the first 5 to be labelled. I tried a bunch of things like
contourplot(z ~ z+y, data=d3, cuts=30, font=3, xlab="x axis", ylab="y axis", scales=list(at=seq(2,10,by=2)))
contourplot(z ~ z+y, data=d3, cuts=30, font=3, xlab="x axis", ylab="y axis", at=seq(2,10,by=2))
but nothing works.
Also, is it possible to plot two contourplot() on the same graph? I tried
contourplot(z ~ z+y, data=d3, cuts=30)
par(new=T)
contourplot(z ~ z+y, data=d3, cuts=20)
but it's not working.
Thanks!
Here is my take:
library(lattice)
x <- rep(seq(-1.5,1.5,length=50),50)
y <- rep(seq(-1.5,1.5,length=50),rep(50,50))
z <- exp(-(x^2+y^2+x*y))
# here is default plot
lp1 <- contourplot(z~x*y)
# here is an enhanced one
my.panel <- function(at, labels, ...) {
# draw odd and even contour lines with or without labels
panel.contourplot(..., at=at[seq(1, length(at), 2)], col="blue", lty=2)
panel.contourplot(..., at=at[seq(2, length(at), 2)], col="red",
labels=as.character(at[seq(2, length(at), 2)]))
}
lp2 <- contourplot(z~x*y, panel=my.panel, at=seq(0.2, 0.8, by=0.2))
lp3 <- update(lp2, at=seq(0.2,0.8,by=0.1))
lp4 <- update(lp3, lwd=2, label.style="align")
library(gridExtra)
grid.arrange(lp1, lp2, lp3, lp4)
You can adapt the custom panel function to best suit your needs (e.g. other scale for leveling the z-axis, color, etc.).
You can specify the labels as a character vector argument and set the last values to empty strings with rep("", 5). For example, using the data you offered in an earlier question about contour plots:
x = seq(0, 10, by = 0.5)
y = seq(0, 10, by = 0.5)
z <- outer(x, y)
d3 <- expand.grid(x=x,y=y); d3$z <- as.vector(z)
contourplot(z~x+y, data=d3)
# labeled '5'-'90'
contourplot(z~x+y, data=d3,
at=seq(5,90, by=5),
labels=c(seq(5,25, by=5),rep("", 16) ),
main="Labels only at the first 5 contour lines")
# contourplot seems to ignore 'extra' labels
# c() will coerce the 'numeric' elements to 'character' if any others are 'character'
?contourplot # and follow the link in the info about labels to ?panel.levelplot
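To avoid counting the blank labels by hand, the label vector can be derived from the `at` values so that both always have the same length. This is a sketch; `first_n_labels` is a hypothetical helper name, and the lattice call is guarded in case the package is not installed.

```r
# Label only the first n contour levels; the rest get empty strings
first_n_labels <- function(at, n = 5) {
  n <- min(n, length(at))
  c(as.character(at[seq_len(n)]), rep("", length(at) - n))
}

at  <- seq(5, 90, by = 5)
lab <- first_n_labels(at, n = 5)

x <- seq(0, 10, by = 0.5)
y <- seq(0, 10, by = 0.5)
d3 <- expand.grid(x = x, y = y)
d3$z <- as.vector(outer(x, y))

if (requireNamespace("lattice", quietly = TRUE)) {
  # Builds the trellis object; print(p) draws it
  p <- lattice::contourplot(z ~ x + y, data = d3, at = at, labels = lab,
                            main = "Labels only at the first 5 contour lines")
}
```

Changing `n` or `at` then needs no further adjustment of the labels.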
