Still quite new to R (and statistics, to be honest) and I have currently only used it for simple linear regression models. But now one of my data sets clearly shows an inverted U pattern. I think I have to do a quadratic regression analysis on this data, but I'm not sure how. What I have tried so far is:
independentvar2 <- independentvar^2
regression <- lm(dependentvar ~ independentvar + independentvar2)
summary(regression)
plot(independentvar, dependentvar)
abline(regression)
While this would work for a normal linear regression, it doesn't work for non-linear regressions. Can I even use the lm() function, given that I thought it stands for "linear model"?
Thanks
Bert
This example is from this SO post by @Tom Liptrot.
plot(speed ~ dist, data = cars)
fit1 = lm(speed ~ dist, cars) #fits a linear model
plot(speed ~ dist, data = cars)
abline(fit1) #puts line on plot
fit2 = lm(speed ~ I(dist^2) + dist, cars) #fits a model with a quadratic term
fit2line = predict(fit2, data.frame(dist = -10:130))
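The quoted snippet stops just before the predicted values are drawn. Assuming the fit2 and fit2line objects above, the fitted curve could be added to the plot along these lines:
lines(-10:130, fit2line, col = "red")  # draw the quadratic fit over the prediction grid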
Related
"gls function fits a linear model using generalized least squares. The errors are allowed to be correlated and/or have unequal variances."
Example
library(nlme)

# AR(1) errors within each Mare
fm1 <- gls(follicles ~ sin(2*pi*Time) + cos(2*pi*Time), Ovary,
           correlation = corAR1(form = ~ 1 | Mare))

# variance increases as a power of the absolute fitted values
fm2 <- update(fm1, weights = varPower())
I got all the above information from https://www.rdocumentation.org/packages/nlme/versions/3.1-137/topics/gls
In the example, they used a nonlinear model "follicles ~ sin(2*pi*Time) + cos(2*pi*Time)". My question is: why did they use the gls function to fit the nonlinear model? Any ideas, please!
Thank you in advance
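For what it's worth, the sin and cos terms in that formula are just transformed covariates, so the mean part of the model is still linear in its coefficients; a minimal sketch (assuming nlme is attached as above, and using its Ovary data) showing the same mean model fitted with plain lm:
# same mean model, ordinary least squares with independent errors;
# gls is used in the documentation example because it also models AR(1) errors
fm0 <- lm(follicles ~ sin(2*pi*Time) + cos(2*pi*Time), data = Ovary)
summary(fm0)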
I am hitting my head against the computer...
I have a prediction model in R that goes like this
m.final.glm <- glm(binary_outcome ~ rcs(PredictorA, parms=kn.a) + rcs(PredictorB, parms=kn.b) + PredictorC , family = "binomial", data = train_data)
I want to validate this model on test_data2 - first by updating the linear predictor (lp)
# linear predictors of the existing model on the training and test data
train_data$lp <- predict(m.final.glm, train_data)
test_data2$lp <- predict(m.final.glm, test_data2)
lp2 <- predict(m.final.glm, test_data2)

# intercept-only model with the old linear predictor as an offset
m.update2.lp <- glm(binary_outcome ~ 1, family="binomial", offset=lp2, data=test_data2)
m.update2.lp$coefficients[1]

# copy the original model and shift its intercept by the estimated correction
m.final.update2.lp <- m.final.glm
m.final.update2.lp$coefficients[1] <- m.final.update2.lp$coefficients[1] + m.update2.lp$coefficients[1]
m.final.update2.lp$coefficients[1]

# predicted probabilities on the test data from the updated model
p2.update.lp <- predict(m.final.update2.lp, test_data2, type="response")
This gets me to the point where I have updated the linear predictor, i.e. in the summary of the model only the intercept is different, but the coefficients of each predictor are the same.
Next, I want to include a new predictor (it is categorical, if that matters), PredictorD, in the updated model. This means the model has to keep the updated linear predictor and the same coefficients for Predictors A, B and C, but it also has to contain PredictorD and estimate its significance.
How do I do this? I would be very grateful if you could help me with this. Thanks!!!
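One possible approach (a sketch only, not necessarily what the original poster settled on, with the name m.update2.D made up for illustration) is to hold the existing linear predictor fixed as an offset and estimate only the intercept correction and the coefficient of PredictorD:
# sketch: the old linear predictor enters as an offset, so the coefficients of
# Predictors A, B and C stay fixed; only PredictorD (and an intercept) is estimated
m.update2.D <- glm(binary_outcome ~ PredictorD, family = "binomial",
                   offset = lp2, data = test_data2)
summary(m.update2.D)  # the PredictorD rows give its estimate and p-value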
I am trying to use a logistic model of the form
Y = exp(a0 + a1*X1 + ... + a33*X33) / (1 + exp(a0 + a1*X1 + ... + a33*X33))
for multiple nonlinear regression in R. The dependent variable Y is a vector of about 500 values, and there are 33 independent variables X1, X2, X3, ..., X33.
I am reading my data from an EXCEL file:
data1<-read.csv(file.choose(), header=TRUE)
which populates R with my data. I performed linear regression with the lm() function using input:
results <- lm(Y ~ X1 + X2 + ... + X33, data = data1)
which worked perfectly fine and now I am trying to use the self starting logistic function of the form:
nls(Y ~ SSlogis(x, Asym, xmid, scal), data1)
for nonlinear regression; however, I do not seem to be applying the function properly. So my question is: how would I use this function to perform multiple nonlinear regression analysis on my dataset? Thank you for any help you can provide.
You simply choose the type of model when doing the regression. The following code should help. (I used an online dataset as an example.)
mydata <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")
model <- glm(admit ~ .,
             family = binomial(link = 'logit'),
             data = mydata)
Then you can use the following to get more information about your model:
summary(model)       # coefficients, significance tests and deviance
coef(model)          # fitted coefficients
model$fitted.values  # fitted probabilities
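As a follow-up (assuming the model object fitted above), predicted probabilities can be obtained with predict():
pred <- predict(model, type = "response")  # predicted probability of admit = 1
head(pred)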
I have input the following dataset into R:
Distance Overall Average
-9.867344 0.001928728
-8.769583 0.0011855
-8.667573 0.001401361
-8.373794 0.001514139
-7.443686 0.001046918
-3.862468 0.000790981
1.817748 0.000945557
2.5333892 0.000940648
4.190557 0.001649773
When plotted on a scatter plot these give a U-shaped curve, so I am trying to plot a non-linear regression 'line' onto the plot.
[Plot of the original data showing the U-shaped curve]
I saw the following example given to a previous question...
plot(speed ~ dist, data = cars)
fit1 = lm(speed ~ dist, cars) #fits a linear model
plot(speed ~ dist, data = cars)
abline(fit1) #puts line on plot
fit2 = lm(speed ~ I(dist^2) + dist, cars) #fits a model with a quadratic term
fit2line = predict(fit2, data.frame(dist = -10:130))
However, after trying this multiple times I am consistently shown the same error message:
plot(Ammonia$Overall.Average~Ammonia$Distance)
fit1=lm(Ammonia$Overall.Average~Ammonia$Distance)
abline(fit1)
fit2=lm(Ammonia$Overall.Average ~ I(Ammonia$Distance^2) + Ammonia$Distance)
fit2line = predict(fit2, data.frame(Ammonia$Distance = 9,3))
Error: unexpected '=' in "fit2line = predict(fit2, data.frame(Ammonia$Distance ="
I am also not sure whether "9,3" are the right numbers to have put in here, as I'm not really sure where the example I followed got its numbers from.
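For reference, the parse error comes from using Ammonia$Distance as a column name inside data.frame(); newdata columns must be named after the predictors used in the model formula. A sketch of how the fit and prediction could be written instead, assuming the Ammonia data frame shown above:
# fit with data = Ammonia so the predictor is simply called "Distance"
fit2 <- lm(Overall.Average ~ I(Distance^2) + Distance, data = Ammonia)

# predict over a grid of distances and draw the curve on the scatter plot
dist.grid <- seq(-10, 5, by = 0.1)
fit2line  <- predict(fit2, newdata = data.frame(Distance = dist.grid))
plot(Overall.Average ~ Distance, data = Ammonia)
lines(dist.grid, fit2line, col = "red")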
I have some multiple linear models without an intercept, like the one below:
Y = a*X1 + b*X2 + c*X3
This is a linear model, but since it does not have an intercept we have to write it as a non-linear model in R:
model1= nls(Y ~ a*X1+b*X2, data = trainDat, start = list(a = 1, b=1))
The problem is that summary(model1) does not give us model statistics such as the F-statistic, because it is not an lm fit.
How can we report the significance of these models in R?
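For what it's worth, an intercept-free model can still be fitted with lm by suppressing the intercept in the formula, which makes the usual summary output (including an F-statistic) available; a minimal sketch, assuming the same trainDat data:
# "+ 0" (or "- 1") removes the intercept; the model remains linear in a and b
model1.lm <- lm(Y ~ 0 + X1 + X2, data = trainDat)
summary(model1.lm)  # R-squared and F-statistic are reported (computed without an intercept)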