Exponential curve fit on histogram in R

I made a histogram in R and I have to fit an exponential curve on it.
But the curve doesn't appear on the histogram.
This is the code:
hist(Adat$price, main="histogram",xlab="data")
curve(dexp(x, rate=1,log=FALSE), add = TRUE)
Could someone help me please?

You need to set the argument freq=FALSE if you want the histogram to be normalized:
set.seed(32418)
sim <- rexp(100) + rnorm(100,0,.01)
hist(sim, freq=FALSE)
curve(dexp(x, rate=1, log=FALSE), add = TRUE)
Otherwise, the height of the bins depends on the number of samples. In fact, the curve did appear on your graph; it is just so small that you can't distinguish it from a flat line at y = 0.
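The scale mismatch can be checked numerically. The following is a minimal sketch on simulated data; the names h and area are illustrative, and estimating the rate as 1 / mean(sim) is an assumption added here for data whose rate is not known to be 1:

```r
set.seed(32418)
sim <- rexp(100)

# Compute the histogram without drawing it, to inspect both scales
h <- hist(sim, plot = FALSE)
max(h$counts)    # tallest bar on the default count scale
max(h$density)   # tallest bar on the density scale (freq = FALSE)

# On the density scale the bars enclose a total area of 1, like dexp()
area <- sum(h$density * diff(h$breaks))

# Overlay with the rate estimated from the data instead of fixed at 1
hist(sim, freq = FALSE)
curve(dexp(x, rate = 1 / mean(sim)), add = TRUE, col = "red")
```

Since dexp(x, rate = 1) never exceeds 1, it is only visible next to bars that are on the density scale.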

Related

Changing scale of the ROC chart

I am using the following code to plot the ROC curve after having run the logistic regression.
fit1 <- glm(formula=GB160M3~Behvscore, data=eflscr,family="binomial", na.action = na.exclude)
prob1=predict(fit1, type=c("response"))
eflscr$prob1 = prob1
library(pROC)
g1 <- roc(GB160M3~prob1, data=eflscr, plot=TRUE, grid=TRUE, print.auc=TRUE)
The ROC curves plotted look like this (see link below)
The x-axis scale does not fill the whole chart.
How can I change the x axis to report 1 - specificity?
By default pROC sets asp = 1 to ensure the plot is square and both sensitivity and specificity are on the same scale. You can set it to NA or NULL to free the axis and fill the chart, but your ROC curve will be misshaped.
plot(g1, asp = NA)
Using par(pty="s") as suggested by Joe is probably a better approach
This is purely a labeling problem: note that the x axis goes decreasing from 1 to 0, which is exactly the same as plotting 1-specificity on an increasing axis. You can set the legacy.axes argument to TRUE to change the behavior if the default one bothers you.
plot(g1, legacy.axes = TRUE)
A good shortcut to getting a square plot is to run the following before plotting:
par(pty="s")
This forces the shape of the plot region to be square. Set the plotting region back to maximal by simply resetting the graphics device and clearing the plot.
dev.off()
As pointed out by @Calimo, there is the legacy.axes argument to reverse the x-axis, and the label is also changed automatically. You can run ?plot.roc to see all the pROC plotting options.
Example
# Get ROC object
data(aSAH)
roc1 <- roc(aSAH$outcome, aSAH$s100b)
# Plot
par(pty="s")
plot(roc1, grid = TRUE, legacy.axes = TRUE)
# Reset graphics device and clear plot
dev.off()
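If closing the device is not desirable, an alternative is to save the previous graphical parameters and restore them afterwards, since par() returns the old values of whatever it changes. A minimal base-R sketch follows; the line being plotted is made-up placeholder data, and with pROC the plot() call would be the plot(roc1, ...) line from the example above:

```r
# par() returns the previous settings, so they can be restored later
op <- par(pty = "s")   # square plotting region

# Placeholder curve on a decreasing x axis, as in a ROC plot
plot(c(1, 0), c(0, 1), type = "l", xlim = c(1, 0),
     xlab = "Specificity", ylab = "Sensitivity")

par(op)                # back to the previous plotting region
```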

Understanding the Local Polynomial Regression

Could someone explain to me why I get different lines when I plot? I thought the lines should be the same.
library(sm) # the aircraft data set ships with the sm package
data(aircraft)
help(aircraft)
attach(aircraft)
lgWeight <- log(Weight)
library(KernSmooth)
# a) Fit a nonparametric regression to data (x_i, y_i) and save the estimated values m_hat(x_i).
# Regression of degree 2 polynomial of lgWeight against Yr
op <- par(mfrow=c(2,1))
lpr1 <- locpoly(Yr,lgWeight, bandwidth=7, degree = 2, gridsize = length(Yr))
plot(Yr,lgWeight,col="grey", ylab="Log(Weight)", xlab = "Year")
lines(lpr1,lwd=2, col="blue")
lines(lpr1$y, col="black")
How can I get the values from the model? If I print the model, it gives me the values in $x and $y, but if I plot them, the result is not the same as the blue line. I need the values of the fitted model (blue) for every x. Could someone help me?
The fitted model (blue curve) is stored correctly in lpr1. As you said, the correct y-values are in lpr1$y and the correct x-values are in lpr1$x.
The second plot looks like a straight line because you are only giving the plot function one variable, lpr1$y. Since you don't specify the x-coordinates, R automatically plots the values against their index, from 1 to the length of the y variable.
The following are two explicit and equivalent ways to plot the curve and line:
lines(x = lpr1$x, y = lpr1$y,lwd=2, col="blue") # plots curve
lines(x = 1:length(lpr1$y), y = lpr1$y, col="black") # plot line
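Note that locpoly() evaluates the fit on an equally spaced grid, so lpr1$x holds grid points rather than the observed x values. To get a fitted value at every observed x, one option is to interpolate along the fitted curve. This is a sketch on simulated data, not the aircraft data; the names fit and fitted_at_x are illustrative, and linear interpolation with approx() is an assumption about what is accurate enough here:

```r
library(KernSmooth)

set.seed(1)
x <- sort(runif(200, 0, 10))
y <- sin(x) + rnorm(200, sd = 0.3)

# Local quadratic fit, evaluated on locpoly's default equally spaced grid
fit <- locpoly(x, y, degree = 2, bandwidth = 0.7)

# Interpolate the fitted curve at the observed x values;
# rule = 2 reuses the end values if a point falls just outside the grid
fitted_at_x <- approx(fit$x, fit$y, xout = x, rule = 2)$y

plot(x, y, col = "grey")
lines(fit, lwd = 2, col = "blue")
points(x, fitted_at_x, pch = 20, cex = 0.4)
```

The interpolated points should sit on the blue curve, one per observation.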

Overlaying a normal pdf onto a histogram in R

I've looked through similar problems and I am unable to resolve mine based on the answers given there. I have a histogram of Body Mass Index data. I have found the mean and sd of the data, and I am trying to overlay a normal pdf with the same mean and sd to compare it with the histogram.
Here is what I have so far.
I used hist(BMI) to graph the histogram. I found the mean(BMI) to be 26.65 and the sd(BMI) to be 3.47.
I am trying to use the curve function to plot a normal curve with those same parameters over the histogram.
curve(dnorm(x, mean = 26.65, sd = 3.47), add = T, col = "red")
[Image: HistOverlay]
As you can see, the red curve is barely visible at the very bottom of the histogram. Why is this error occurring?
Thank you.
dnorm() returns a probability density, so the area under the curve is 1. hist(), by default, plots counts, so the bars are far taller than the density curve. To get the result you want, multiply the curve by the size of your population times the bin width (or draw the histogram with freq = FALSE and leave the curve as it is).
Something along the lines of the following, where 100 stands in for that scaling factor:
curve(100*dnorm(x, mean = 26.65, sd = 3.47), add = T, col = "red")
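For a count histogram with equal-width bins, the factor that puts a density on the same scale is the number of observations times the bin width. A minimal sketch, assuming simulated data in place of the real BMI values (bmi, n and binwidth are illustrative names):

```r
set.seed(1)
# Hypothetical stand-in for the BMI data in the question
bmi <- rnorm(200, mean = 26.65, sd = 3.47)

h <- hist(bmi)                   # default: counts on the y axis
n <- length(bmi)
binwidth <- diff(h$breaks)[1]    # hist() uses equal-width bins by default

# Scale the density by n * binwidth to match the count scale
curve(n * binwidth * dnorm(x, mean = mean(bmi), sd = sd(bmi)),
      add = TRUE, col = "red")
```

This works because each bar's count equals its density times n times the bin width.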

How to plot exponential function on barplot R?

So I have a barplot in which the y axis is log(frequency). From just eyeing it, it appears that the bars decrease exponentially, but I would like to know this for sure. What I want to do is also plot an exponential on this same graph. Thus, if my bars fall below the exponential, I would know that my bars decrease either exponentially or faster than exponentially, and if the bars lie on top of the exponential, I would know that they don't decrease exponentially. How do I plot an exponential on a bar graph?
Here is my graph if that helps:
If you're trying to fit the density of an exponential distribution, you should probably plot a density histogram (not a frequency histogram). See this question on how to plot distributions in R.
This is how I would do it.
x.gen <- rexp(1000, rate = 3)
hist(x.gen, prob = TRUE)
library(MASS)
x.est <- fitdistr(x.gen, "exponential")$estimate
curve(dexp(x, rate = x.est), add = TRUE, col = "red", lwd = 2)
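Because the y axis in the question is log(frequency), another check is available: exponentially decaying counts fall on a straight line on the log scale, with a slope of roughly minus the rate. The sketch below uses simulated data; the threshold of 5 counts per bin is an arbitrary assumption to keep sparse tail bins from dominating, and the variable names are illustrative:

```r
set.seed(7)
x <- rexp(5000, rate = 3)

h <- hist(x, breaks = 30, plot = FALSE)
keep <- h$counts >= 5            # drop sparse (and empty) tail bins

log_counts <- log(h$counts[keep])
mids <- h$mids[keep]

# Straight-line fit on the log scale; the slope estimates -rate
fit <- lm(log_counts ~ mids)
slope <- unname(coef(fit)[2])

plot(mids, log_counts, xlab = "x", ylab = "log(frequency)")
abline(fit, col = "red", lwd = 2)
```

Systematic curvature of the points around the red line would suggest that the decay is not exponential.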
One way of visually inspecting whether two distributions are the same is with a Quantile-Quantile plot, or Q-Q plot for short. Typically this is done to check whether data follow a standard normal distribution.
The basic idea is to plot your data against some theoretical quantiles; if the data match that distribution, you will see a straight line. For example:
x <- qnorm(seq(0, 1, length.out = 1002)) # Theoretical normal quantiles
x <- x[-c(1, length(x))] # Drop ends because they are -Inf and Inf
y <- rnorm(1000) # Actual data. 1000 points drawn from a normal distribution
l.1 <- lm(sort(y)~sort(x))
qqplot(x, y, xlab="Theoretical Quantiles", ylab="Actual Quantiles")
abline(coef(l.1)[1], coef(l.1)[2])
Under perfect conditions you should see a straight line when plotting the theoretical quantiles against your data. So you can do the same by plotting your data against the quantiles of the exponential distribution you think they follow.
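The same idea applied to the exponential case might look like the sketch below. It assumes simulated data and estimates the rate as 1 / mean(y); the names rate_hat and theo are illustrative:

```r
set.seed(42)
y <- rexp(1000, rate = 3)        # stand-in for the observed data

rate_hat <- 1 / mean(y)          # rate estimated from the data

# Theoretical exponential quantiles at evenly spread probabilities
p <- ppoints(length(y))
theo <- qexp(p, rate = rate_hat)

plot(theo, sort(y),
     xlab = "Theoretical Quantiles", ylab = "Actual Quantiles")
abline(0, 1)                     # points near this line support the exponential
```

ppoints() avoids the probabilities 0 and 1, so no infinite quantiles need to be dropped by hand.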

best fitting curve from plot in R

I have a probability density function in a plot called ph that I derived from two samples of data, with the help of a Stack Overflow user, in this way:
few <-read.table('outcome.dat',head=TRUE)
many<-read.table('alldata.dat',head=TRUE)
mh <- hist(many$G,breaks=seq(0,1.,by=0.03), plot=FALSE)
fh <- hist(few$G, breaks=mh$breaks, plot=FALSE)
ph <- fh
ph$density <- fh$counts/(mh$counts+0.001)
plot(ph,freq=FALSE,col="blue")
I would like to fit the best curve to the plot of ph, but I can't find a working method.
How can I do this? Do I have to extract the values from ph and then work on them, or is there some function that works on
plot(ph,freq=FALSE,col="blue")
directly?
Assuming you mean that you want to perform a curve fit to the data in ph, then something along the lines of
nls(y ~ FUN(x, a, b), data = data.frame(x = ph$mids, y = ph$counts), start = list(a = ..., b = ...)) may work. You need to know what sort of function FUN you think the histogram data should fit, e.g. a normal density; a and b stand for its coefficients. Read the help file on nls() to learn how to set up starting "guess" values for the coefficients in FUN.
If you simply want to overlay a curve onto the histogram, then smoo<-spline(ph$mids,ph$counts);
lines(smoo$x,smoo$y)
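will come close to doing that. You may have to adjust the x and/or y scaling; since the histogram above is drawn with freq=FALSE, ph$density rather than ph$counts matches its y axis.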
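As a concrete sketch of the nls() route, the following fits a Gaussian-shaped curve to the bin midpoints and densities of a histogram. The data are simulated, and the Gaussian form, the parameter names a, m and s, and the starting values are all assumptions chosen for illustration; the right function for ph depends on the data:

```r
set.seed(123)
g <- rnorm(2000, mean = 0.5, sd = 0.1)   # simulated stand-in data

h <- hist(g, breaks = 30, plot = FALSE)
d <- data.frame(x = h$mids, y = h$density)

# Nonlinear least squares; starting values are taken from the sample
fit <- nls(y ~ a * exp(-(x - m)^2 / (2 * s^2)),
           data = d,
           start = list(a = max(d$y), m = mean(g), s = sd(g)))

plot(h, freq = FALSE, col = "lightblue")
curve(predict(fit, newdata = data.frame(x = x)),
      add = TRUE, col = "red", lwd = 2)
```

coef(fit) returns the fitted coefficients, and poor starting values are the usual reason nls() fails to converge.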
Do you want a density function?
x = rnorm(1000)
hist(x, breaks = 30, freq = FALSE)
lines(density(x), col = "red")
