Reversing the cutoff values in R using the ROCR package - r

I am using a confusion matrix to create a ROC curve. The problem is that the cutoff values are reversed. How do I put them in ascending order?
pred <- prediction(predictions =c(rep(5,19),rep(7,24),rep(9,40),rep(10,42)), labels =
c(rep(0,18),rep(1,1),rep(0,7),rep(1,17),rep(0,4),rep(1,36),rep(0,3),rep(1,39)))
perf <- performance(pred,"tpr","fpr")
plot(perf,colorize=TRUE)
abline(0,1,col='red')
x = c(0.09375,0.21875,0.43750)
y = c(0.4193548,0.8064516,0.9892473)
points(x , y , col="red", pch=19)
text(x , y+0.03, labels= c("9","7","5"), col="red", pch=19)
predictions = c(rep(5,19),rep(7,24),rep(9,40),rep(10,42))
labels = c(rep(0,18),rep(1,1),rep(0,7),rep(1,17),rep(0,4),rep(1,36),rep(0,3),rep(1,39))
auc(predictions,labels)
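One way to see the cutoffs in ascending order is to rebuild the ROC points by hand in base R. This is only a sketch: the names `cutoffs` and `roc_table` are made up for illustration, and it assumes that a score at or above the cutoff counts as a positive call, which reproduces the three red points plotted above. As far as I know, ROCR keeps its own cutoffs in the `alpha.values` slot of the performance object, sorted from high to low, so reversing that vector together with `x.values` and `y.values` should give the same ordering.

```r
predictions <- c(rep(5, 19), rep(7, 24), rep(9, 40), rep(10, 42))
labels <- c(rep(0, 18), rep(1, 1), rep(0, 7), rep(1, 17),
            rep(0, 4), rep(1, 36), rep(0, 3), rep(1, 39))

# Distinct scores, sorted in ascending order, serve as the cutoffs
cutoffs <- sort(unique(predictions))

# Assumption: an observation is called positive when its score is >= cutoff
tpr <- sapply(cutoffs, function(k) sum(predictions >= k & labels == 1) / sum(labels == 1))
fpr <- sapply(cutoffs, function(k) sum(predictions >= k & labels == 0) / sum(labels == 0))

# One row per cutoff, lowest cutoff first
roc_table <- data.frame(cutoff = cutoffs, fpr = fpr, tpr = tpr)
print(roc_table)
```

The rows of `roc_table` can then be passed to `points()` and `text()` so that the labels follow the ascending cutoffs.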

Related

how to calculate the slope of a smoothed curve in R

I have the following data:
I plotted the points of that data and then smoothed it on the plot using the following code :
scatter.smooth(x=1:length(Ticker$ROIC[!is.na(Ticker$ROIC)]),
y=Ticker$ROIC[!is.na(Ticker$ROIC)],col = "#AAAAAA",
ylab = "ROIC Values", xlab = "Quarters since Feb 29th 2012 till Dec 31st 2016")
Now I want to find the point-wise slope of this smoothed curve and also fit a trend line to the smoothed graph. How can I do that?
There are some interesting R packages that implement nonparametric derivative estimation. The short review of Newell and Einbeck can be helpful: http://maths.dur.ac.uk/~dma0je/Papers/newell_einbeck_iwsm07.pdf
Here we consider an example based on the pspline package (smoothing splines with penalties on order m derivatives):
The data generating process is a negative logistic model with additive noise (hence the y values are all negative, like the ROIC variable of @ForeverLearner):
set.seed(1234)
x <- sort(runif(200, min=-5, max=5))
y = -1/(1+exp(-x))-1+0.1*rnorm(200)
We start by plotting the nonparametric estimate of the curve (the black line is the true curve and the red one is the estimated curve):
library(pspline)
pspl <- smooth.Pspline(x, y, df=5, method=3)
f0 <- predict(pspl, x, nderiv=0)
Then, we estimate the first derivative of the curve:
f1 <- predict(pspl, x, nderiv=1)
curve(-exp(-x)/(1+exp(-x))^2,-5,5, lwd=2, ylim=c(-.3,0))
lines(x, f1, lwd=3, lty=2, col="red")
And here the second derivative:
f2 <- predict(pspl, x, nderiv=2)
curve((exp(-x))/(1+exp(-x))^2-2*exp(-2*x)/(1+exp(-x))^3, -5, 5,
lwd=2, ylim=c(-.15,.15), ylab="")
lines(x, f2, lwd=3, lty=2, col="red")
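If installing pspline is not an option, roughly the same derivative estimates can be sketched with `smooth.spline()` from base R's stats package. This swaps in a different smoother from the one used above, so the numbers will not match the pspline fit exactly. The names `fit`, `d1` and `d2` are only illustrative.

```r
set.seed(1234)
x <- sort(runif(200, min = -5, max = 5))
y <- -1 / (1 + exp(-x)) - 1 + 0.1 * rnorm(200)

# Cubic smoothing spline; the smoothing parameter is chosen by GCV by default
fit <- smooth.spline(x, y)

# predict() on a smooth.spline fit returns a list with components x and y
d1 <- predict(fit, x, deriv = 1)$y   # estimated first derivative
d2 <- predict(fit, x, deriv = 2)$y   # estimated second derivative

# True first derivative (black) against the estimate (red, dashed)
curve(-exp(-x) / (1 + exp(-x))^2, -5, 5, lwd = 2, ylim = c(-0.3, 0),
      ylab = "first derivative")
lines(x, d1, lwd = 3, lty = 2, col = "red")
```

Second derivatives from a cubic smoothing spline tend to be noisy, so `d2` is best treated as a rough guide.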
#DATA
set.seed(42)
x = rnorm(20)
y = rnorm(20)
#Plot the points
plot(x, y, type = "p")
#Obtain points for the smooth curve
temp = loess.smooth(x, y, evaluation = 50) #Use higher evaluation for more points
#Plot smooth curve
lines(temp$x, temp$y, lwd = 2)
#Obtain slope of the smooth curve
slopes = diff(temp$y)/diff(temp$x)
#Add a trend line
abline(lm(y~x))
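A small extension of the loess approach, offered as a sketch: each finite-difference slope belongs to the interval between two neighbouring grid points, so it helps to store it against the interval midpoint. The trend line here is fitted to the smoothed values rather than to the raw points, which is one reading of what the question asks for. The names `mid_x`, `slope_table` and `trend` are illustrative.

```r
set.seed(42)
x <- rnorm(20)
y <- rnorm(20)

# Points on the smooth curve
temp <- loess.smooth(x, y, evaluation = 50)

# Finite-difference slopes, one per interval between neighbouring grid points
slopes <- diff(temp$y) / diff(temp$x)
mid_x <- head(temp$x, -1) + diff(temp$x) / 2
slope_table <- data.frame(x = mid_x, slope = slopes)

# Trend line fitted to the smoothed curve instead of the raw data
trend <- lm(y ~ x, data = data.frame(x = temp$x, y = temp$y))

plot(x, y, type = "p")
lines(temp$x, temp$y, lwd = 2)
abline(trend, lty = 2)
```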

2 Y axis histogram (normal frequency vs relative frequency)

I would like your help, please.
I have these 2 plots, separately. One shows normal frequency and the other, with exactly the same data, shows relative frequency.
Can you tell me how I can join them in a single plot with 2 y axes (frequency and relative frequency)?
x<- AAA$starch
h<-hist(x, breaks=40, col="lightblue", xlab="Starch ~ Corn",
main="Histogram with Normal Curve", xlim=c(58,70),ylim = c(0,2500),axes=TRUE)
xfit<-seq(min(x),max(x),length=40)
yfit<-dnorm(xfit,mean=mean(x),sd=sd(x))
yfit <- yfit*diff(h$mids[1:2])*length(x)
lines(xfit, yfit, col="blue", lwd=3)
library(HistogramTools)
x<- AAA$starch
c <- hist(x,breaks=10, ylab="Relative Frequency", main="Histogram with Normal Curve",ylim=c(0,2500), xlim=c(58,70), axes=TRUE)
PlotRelativeFrequency((c))
Thank you!!
EDIT:
This is just an example image of what I want...
I use doubleYScale from package latticeExtra.
Here is an example (I am not sure about relative frequency calculation) :
library(latticeExtra)
set.seed(42)
firstSet <- rnorm(500,4)
breaks = 0:10
#Cut data into sections
firstSet.cut = cut(firstSet, breaks, right=FALSE)
firstSet.freq = table(firstSet.cut)
#Calculate relative frequency
firstSet.relfreq = firstSet.freq / length(firstSet)
#Parse to a list to use xyplot later and assigning x values
firstSet.list <- list(x = 1:10, y = as.vector(firstSet.relfreq))
#Build histogram and relative frequency curve
hist1 <- histogram(firstSet, breaks = 10, freq = TRUE, col='skyblue', xlab="Starch ~ Corn", ylab="Frequency", main="Histogram with Normal Curve", ylim=c(0,40), xlim=c(0,10), plot=FALSE)
relFreqCurve <- xyplot(y ~ x, firstSet.list, type="l", ylab = "Relative frequency", ylim=c(0,1))
#Build double objects plot
doubleYScale(hist1, relFreqCurve, add.ylab2 = TRUE)
And here is the result, with two y axes on different scales:
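A base-graphics alternative, as a sketch only: since relative frequency is just the count divided by the sample size, the right-hand axis can be drawn as a relabelled copy of the left-hand one. The simulated `starch` vector is a hypothetical stand-in for `AAA$starch`, which is not available here.

```r
set.seed(42)
starch <- rnorm(5000, mean = 64, sd = 1.5)  # hypothetical stand-in for AAA$starch
n <- length(starch)

# Widen the right margin so the second axis label fits
old_par <- par(mar = c(5, 4, 4, 5))

h <- hist(starch, breaks = 40, col = "lightblue", xlab = "Starch ~ Corn",
          ylab = "Frequency", main = "Histogram with Normal Curve")

# Right axis: tick positions in count units, labels in relative-frequency units
rel_ticks <- pretty(h$counts / n)
axis(side = 4, at = rel_ticks * n, labels = rel_ticks)
mtext("Relative frequency", side = 4, line = 3)

par(old_par)
```

Because both axes describe the same bars, no second plot needs to be overlaid.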

Interpolation on a Curve in R

I have a dataset called dataframe (a 2d table) and a best fit curve as:
scatter.smooth(dataframe, xlab="", ylab="")
What code would I need to realize and evaluate (get numerical value of) a Y value on that best fit curve at a single x value?
Try
set.seed(1)
dataframe <- data.frame(x=runif(100), y=runif(100))
scatter.smooth(dataframe, xlab="", ylab="")
res <- with(dataframe, loess.smooth(x, y, evaluation = 200))
lengths(res)
# x y
# 200 200
x <- 0.5
y <- res$y[res$x>=x][1]
points(x, y, col="blue", pch = 19, cex=2)
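The lookup above returns the first grid point at or to the right of x. If a value exactly at x is wanted, linear interpolation between the two neighbouring grid points is a small refinement. A sketch using `approx()` from base R, where `x0` and `y_interp` are illustrative names:

```r
set.seed(1)
dataframe <- data.frame(x = runif(100), y = runif(100))
scatter.smooth(dataframe, xlab = "", ylab = "")

# Evaluate the smooth curve on a fine grid
res <- with(dataframe, loess.smooth(x, y, evaluation = 200))

# Linearly interpolate the grid to get the curve value at a single x
x0 <- 0.5
y_interp <- approx(res$x, res$y, xout = x0)$y

points(x0, y_interp, col = "blue", pch = 19, cex = 2)
```

With 200 grid points the two approaches should give very similar numbers; the difference grows if `evaluation` is small.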

Extract coefficient numeric for r plot legend

I'm trying to get the legend in a simple R plot to report the coefficient (i.e. slope) without manually extracting the value. Does anybody know how to code the legend so that it displays the value rather than the command? Thanks
y <- rnorm(100)
x <- sample(rnorm(100), 100, replace = TRUE)
plot(x, y)
mod <- lm(y ~x)
abline(lm(y~x))
legend("topleft", "Slope = coef(mod)[2]", col = "black", pch = 15, cex = .8)
paste0("Slope = ", coef(mod)[2])
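Putting the pieces together, a sketch of the full plot: the label is built with `paste0()` first and then handed to `legend()`. Rounding to three decimals and the name `slope_label` are my own choices, not part of the question.

```r
set.seed(1)
y <- rnorm(100)
x <- sample(rnorm(100), 100, replace = TRUE)

mod <- lm(y ~ x)

# Build the label from the fitted coefficient instead of quoting the command
slope_label <- paste0("Slope = ", round(coef(mod)[2], 3))

plot(x, y)
abline(mod)
legend("topleft", legend = slope_label, col = "black", pch = 15, cex = 0.8)
```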

Poisson regression line

How can I add a Poisson regression line to a plot? I tried the following, but the abline function does not work. This is because abline() uses the intercept and slope, whereas a Poisson regression line uses a log link.
x = rpois(12, 5)
plot(x, axes = F)
axis(1,at=1:length(month.name), labels = month.name)
axis(side = 2)
y = c(1:12)
poislm = glm(x~y, family=poisson)
abline(poislm)
How about this, from R-help:
predProbs<-predict(poislm,data.frame(y=seq(min(y), max(y), length.out=100)), type="response")
lines(seq(min(y), max(y), length.out=100), predProbs, col=2, lwd=2)
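To see why `abline()` cannot draw this line, it may help to compute the fitted curve by hand. With a log link the fitted mean is exp(intercept + slope * y), which is curved on the count scale. The sketch below uses the illustrative names `counts`, `month` and `grid`, and checks that the hand computation agrees with `predict(..., type = "response")`:

```r
set.seed(1)
counts <- rpois(12, 5)
month <- 1:12

fit <- glm(counts ~ month, family = poisson)

# Fine grid over the predictor for a smooth line
grid <- seq(min(month), max(month), length.out = 100)

# Log link: linear predictor on the log scale, exponentiated back to counts
by_hand <- exp(coef(fit)[1] + coef(fit)[2] * grid)
by_predict <- predict(fit, data.frame(month = grid), type = "response")

plot(month, counts)
lines(grid, by_hand, col = 2, lwd = 2)
```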
