Model adequacy checking - normal probability plot in R [closed] - r

How can I create a normal probability plot of residuals in R so that the y-axis shows normal probability values?

Normally you make a normal probability plot with qqnorm and qqline.
Example:
fit <- lm(resp ~ dep1 + dep2)
qqnorm(fit$residuals, datax=TRUE)
qqline(fit$residuals, datax=TRUE)
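You can also plot the residuals against their cumulative probabilities with plot and pnorm. pnorm assumes a standard deviation of 1 unless told otherwise, so pass the standard deviation of the residuals:
plot(fit$residuals, pnorm(fit$residuals, sd = sd(fit$residuals)))
(with the probabilities on the y-axis)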
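If the goal is a classic normal probability plot whose y-axis is labelled with probabilities rather than quantiles, one possible approach is to draw the plot on the quantile scale and relabel the axis. The sketch below is only an illustration: the simulated `resp`, `dep1` and `dep2`, the use of standardized residuals and the chosen probability ticks are all assumptions, not part of the original answer.

```r
# Simulated data standing in for the question's resp, dep1 and dep2 (assumed)
set.seed(1)
dep1 <- rnorm(100)
dep2 <- rnorm(100)
resp <- 1 + 2 * dep1 - dep2 + rnorm(100)
fit <- lm(resp ~ dep1 + dep2)

res <- rstandard(fit)                    # standardized residuals
theo <- qnorm(ppoints(length(res)))      # theoretical normal quantiles, ascending

# Probabilities to show on the y-axis and their positions on the quantile scale
probs <- c(0.01, 0.05, 0.25, 0.5, 0.75, 0.95, 0.99)
tick_pos <- qnorm(probs)

# Sorted residuals on x, theoretical quantiles on y, default y-axis suppressed
plot(sort(res), theo, yaxt = "n",
     xlab = "Standardized residuals", ylab = "Normal probability")
axis(2, at = tick_pos, labels = probs, las = 1)  # relabel quantiles as probabilities
qqline(res, datax = TRUE)                        # reference line through the quartiles
```

The points are spaced on the quantile scale, so a normal sample still falls on a straight line; only the tick labels change.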

Related

normally distributed population, calculating in R the probability of negative or zero readings occurring [closed]

In R, how do you calculate the probability of negative or zero readings occurring?
μ and σ are given.
You can use the distribution function of the Gaussian distribution:
pnorm(0,μ,σ)
(I assume you mean a Gaussian distribution.)
Edit:
pnorm is the cumulative distribution function. Its values lie between 0 and 1, and its value at x is the area under the Gaussian curve from -Inf to x. The value of pnorm at 0 is therefore the area under the curve to the left of 0 (shaded pink in the original figure), which is the probability you are looking for: the probability that a value sampled from the corresponding Gaussian distribution is less than or equal to 0.
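As a concrete illustration, here is a minimal sketch with made-up values for μ and σ (`mu` and `sigma` below are hypothetical, not taken from the question). It computes the probability with pnorm, repeats the calculation on the standardized scale, and compares both against a simulation.

```r
mu <- 2        # hypothetical mean
sigma <- 1.5   # hypothetical standard deviation

# P(X <= 0) for X ~ N(mu, sigma^2)
p <- pnorm(0, mean = mu, sd = sigma)

# Same probability after standardizing: P(Z <= (0 - mu) / sigma)
p_std <- pnorm((0 - mu) / sigma)

# Rough check by simulation: share of simulated readings that are <= 0
set.seed(42)
p_sim <- mean(rnorm(1e6, mean = mu, sd = sigma) <= 0)

print(c(pnorm = p, standardized = p_std, simulated = p_sim))
```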

Fit saturation growth-rate model in R [closed]

I have a response variable and an independent variable that visually follow a saturation growth-rate model. How can I fit such a model in R? Thank you!
Give the nls function a try, but next time please provide some example data. I use the data from this excellent tutorial by a colleague (https://bscheng.com/2014/05/07/modeling-logistic-growth-data-in-r/), which fits a logistic growth curve:
library("ggplot2") #only ggplot2 is needed for the code below
#Here's the data
mass<-c(6.25,10,20,23,26,27.6,29.8,31.6,37.2,41.2,48.7,54,54,63,66,72,72.2,
76,75) #Wilson's mass in pounds
days.since.birth<-c(31,62,93,99,107,113,121,127,148,161,180,214,221,307,
452,482,923, 955,1308) #days since Wilson's birth
data<-data.frame(mass,days.since.birth) #create the data frame
plot(mass~days.since.birth, data=data) #always look at your data first!
wilson<-nls(mass~phi1/(1+exp(-(phi2+phi3*days.since.birth))),
start=list(phi1=100,phi2=-1.096,phi3=.002),data=data,trace=TRUE)
#set parameters
phi1<-coef(wilson)[1]
phi2<-coef(wilson)[2]
phi3<-coef(wilson)[3]
x<-c(min(data$days.since.birth):max(data$days.since.birth)) #construct a range of x values bounded by the data
y<-phi1/(1+exp(-(phi2+phi3*x))) #predicted mass
predict<-data.frame(x,y) #create the prediction data frame
#And add a nice plot (I cheated and added the awesome inset jpg in another program)
ggplot(data=data,aes(x=days.since.birth,y=mass))+
geom_point(color='blue',size=5)+theme_bw()+
labs(x='Days Since Birth',y='Mass (lbs)')+
scale_x_continuous(breaks=c(0,250,500,750, 1000,1250))+
scale_y_continuous(breaks=c(0,10,20,30,40,50,60,70,80))+
theme(axis.text=element_text(size=18),axis.title=element_text(size=24))+
geom_line(data=predict,aes(x=x,y=y), size=1)
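The tutorial code above fits a logistic curve. If "saturation growth-rate model" is meant in the narrower sense of y = a*x/(b + x) (the Michaelis-Menten form), a sketch along the following lines may be closer to what was asked. The data are simulated and the parameter names `a` and `b` are assumptions; `SSmicmen` is the self-starting model from the stats package, so no starting values are supplied.

```r
# Simulated data from y = a * x / (b + x) plus noise (hypothetical values)
set.seed(123)
x <- seq(1, 50, length.out = 40)
a_true <- 80   # saturation level (asymptote)
b_true <- 10   # x value at which y reaches half of a
y <- a_true * x / (b_true + x) + rnorm(length(x), sd = 2)
d <- data.frame(x, y)

# SSmicmen(input, Vm, K) evaluates Vm * input / (K + input) and picks its own starting values
fit_sat <- nls(y ~ SSmicmen(x, a, b), data = d)
print(coef(fit_sat))

# Data with the fitted curve overlaid
plot(y ~ x, data = d)
lines(d$x, predict(fit_sat))
```

With real data, it is worth checking the plot first, as the answer recommends, because the self-starting routine can fail when the data do not resemble a saturating curve.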

function scale() in R doesn't scale the data symmetrically [closed]

I apologize if my question is simple. I tried to find the answer but I didn't find much info.
I use the scale() function in R to scale my data. What I don't understand is that when I plot the scaled data with matplot(), they don't look symmetric: the scaled values range over -1, -0.5, 0, 0.5, 1, 1.5. As far as I know, scaling gives the data mean zero and standard deviation s, so my data should deviate from the mean by the same amount on both sides, but here I have a deviation of 1.5 on one side and -1 on the other. Why?
Your data are not symmetric around their mean. By default, scale() only shifts the data to mean 0 and rescales them to standard deviation 1; it does not change the shape of the distribution, so a skewed sample stays skewed.
Compare the following:
x <- runif(1000) # symmetric around 0.5
y <- rexp(1000) # not symmetric around 1 at all
summary(scale(x))
summary(scale(y))
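To make the comparison explicit, the following sketch (with an arbitrary seed, chosen only for reproducibility) checks the mean, standard deviation and range of the scaled values. Both scaled samples should have mean 0 and standard deviation 1, yet only the uniform sample should have a roughly symmetric range.

```r
set.seed(1)
x <- runif(1000)   # symmetric around 0.5
y <- rexp(1000)    # right-skewed, mean 1

# scale() returns a one-column matrix; convert to plain vectors
sx <- as.numeric(scale(x))
sy <- as.numeric(scale(y))

# Mean and standard deviation are (numerically) 0 and 1 in both cases
print(c(mean_x = mean(sx), sd_x = sd(sx), mean_y = mean(sy), sd_y = sd(sy)))

# The ranges differ: the skewed sample extends much further above 0 than below
print(range(sx))
print(range(sy))
```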

distribution from percentage with R [closed]

I have the distribution of a parameter (natural gas mixture composition) expressed in percent. How can I estimate the distribution parameters for such data (it should be a gamma, normal or lognormal distribution) and generate random compositions based on those parameters in R?
This might be a better question for CrossValidated, but:
It is not generally a good idea to choose among a range of possible distributions according to goodness of fit. Instead, you should choose according to the qualitative characteristics of your data (the original answer showed a distribution-selection chart at this point).
Frustratingly, that chart doesn't actually include the best choice for your data (composition, continuous, bounded between 0 and 1 [or 0 and 100]), which is a Beta distribution (although there are technical issues if you have values of exactly 0 or 100 in your sample).
In R:
## some arbitrary data
z <- c(2,8,40,45,56,58,70,89)
## fit (beta values must be in (0,1), not (0,100), so divide by 100)
(m <- MASS::fitdistr(z/100,"beta",start=list(shape1=1,shape2=1)))
## sample 1000 new values
z_new <- 100*rbeta(n=1000,shape1=m$estimate["shape1"],
shape2=m$estimate["shape2"])
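The answer starts the fit from shape1 = shape2 = 1. As an optional alternative, method-of-moments estimates can serve as a quick sanity check or as starting values. The helper `beta_mom` below is hypothetical (it is not part of MASS or base R) and uses the standard moment formulas for the Beta distribution; it assumes all values lie strictly between 0 and 1.

```r
# Hypothetical helper: method-of-moments estimates for a Beta distribution
beta_mom <- function(p) {
  m <- mean(p)
  v <- var(p)
  k <- m * (1 - m) / v - 1   # must be positive for a valid Beta fit
  list(shape1 = m * k, shape2 = (1 - m) * k)
}

## the same arbitrary data as above, rescaled to (0, 1)
z <- c(2, 8, 40, 45, 56, 58, 70, 89)
start_vals <- beta_mom(z / 100)
print(start_vals)

## sample new percentages from the moment-based estimates
set.seed(1)
z_mom <- 100 * rbeta(n = 1000, shape1 = start_vals$shape1,
                     shape2 = start_vals$shape2)
print(summary(z_mom))
```

The resulting list has the same names as the `start` argument in the fitdistr call above, so it could be passed there in place of the fixed starting values.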

How to calculate the distance between the Best Fit Curve and the data points? [closed]

Hello Everyone!
I am fairly new to R programming and hence I have a small doubt regarding the distance (or offset) of the data-set points from their Best-fit Curve.
The given figure shows some points and a Best-fit Curve for those points.
As we can see, some points are very far away from the Best-fit Curve, and I want to write code that tells me the distance (or offset) of every point from the curve. Then I want to display all the points that are far away from the curve.
I have the equation of the curve and all the data points. The curve has an exponential equation.
The uploaded image is just an approximation of the real figure. I drew this one just as an example.
If someone can tell me what method or functions should be used here, it would be a big help.
Thank You.
In many R situations you will actually fit the data with a function such as lm, loess or glm, and the fitted model object stores the residuals, which you can extract with residuals().
If you do have your own equation, take the x-values of the data points, calculate the equation's y-values at them, and subtract those from the corresponding data y-values.
For example, a toy example:
# decay function
x= 1:50
start= 80
decay=0.95
equation_y=start*(decay^x)
plot(x,equation_y, type="l")
# simulated data points
data_y = equation_y + rnorm(50, sd=3)
points(x,data_y, col="red")
# the differences (data minus curve, so points above the curve are positive)
data_y - equation_y
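The question also asks how to display the points that lie far from the curve. One way to sketch this is to flag points whose vertical offset exceeds a cutoff. The cutoff of two standard deviations of the offsets is an arbitrary assumption made here for illustration, and the distance is measured vertically, not perpendicular to the curve.

```r
# Same toy decay curve as above, with a fixed seed for reproducibility
set.seed(7)
x <- 1:50
start <- 80
decay <- 0.95
equation_y <- start * decay^x
data_y <- equation_y + rnorm(50, sd = 3)

# Vertical offset of each point from the curve (data minus curve)
resid_y <- data_y - equation_y

# Assumed cutoff: more than 2 standard deviations of the offsets
threshold <- 2 * sd(resid_y)
far <- abs(resid_y) > threshold

# List the far-away points and highlight them on the plot
print(data.frame(x, data_y, resid_y)[far, ])
plot(x, equation_y, type = "l", ylim = range(c(equation_y, data_y)))
points(x, data_y, col = "red")
points(x[far], data_y[far], pch = 19, col = "blue")
```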
