Any resources for curve fitting in R?
I came across https://systatsoftware.com/products/sigmaplot/product-uses/sigmaplot-products-uses-curve-fitting-using-sigmaplot/
Any similar recommendations or libraries in R?
Thank you!
Hi. There are several ways to do curve fitting in R. You could start with something as simple as the following:
x <- c(32,64,96,118,126,144,152.5,158)
#make y as response variable
y <- c(99.5,104.8,108.5,100,86,64,35.3,15)
plot(x,y,pch=19)
This should give you a scatter plot of y against x. Eyeballing the points suggests that a polynomial curve should fit them nicely.
Now we can fit several candidate curves to the data:
#first degree (linear) model
linMod <- lm(y~x)
#second degree polynomial model
linMod2 <- lm(y~poly(x,2,raw=TRUE))
#third degree polynomial model
linMod3 <- lm(y~poly(x,3,raw=TRUE))
#fourth degree polynomial model
linMod4 <- lm(y~poly(x,4,raw=TRUE))
#generate 50 evenly spaced x values from 30 to 160 to predict on
newData <- seq(30,160, length=50)
plot(x,y,pch=19,ylim=c(0,150))
lines(newData, predict(linMod, data.frame(x=newData)), col="red")
lines(newData, predict(linMod2, data.frame(x=newData)), col="green")
lines(newData, predict(linMod3, data.frame(x=newData)), col="blue")
lines(newData, predict(linMod4, data.frame(x=newData)), col="purple")
Giving us:
This is just a simple illustration of curve fitting in R. There are plenty of tutorials available; you could start with these:
http://www.css.cornell.edu/faculty/dgr2/teach/R/R_CurveFit.pdf
https://rpubs.com/carlmart/228874
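As a follow-up to the polynomial fits above, here is a minimal sketch of comparing them numerically rather than by eye. It reuses the x and y values from the answer; the names fits and comparison are made up for this illustration, and adjusted R-squared, AIC, and a nested F-test are just one reasonable choice of criteria, not the only one.

```r
x <- c(32, 64, 96, 118, 126, 144, 152.5, 158)
y <- c(99.5, 104.8, 108.5, 100, 86, 64, 35.3, 15)

# Fit polynomials of degree 1 to 4 (same models as linMod ... linMod4)
fits <- lapply(1:4, function(d) lm(y ~ poly(x, d, raw = TRUE)))
names(fits) <- paste0("degree", 1:4)

# Collect adjusted R-squared and AIC for each model
comparison <- data.frame(
  degree = 1:4,
  adj_r2 = sapply(fits, function(m) summary(m)$adj.r.squared),
  aic    = sapply(fits, AIC)
)
print(comparison)

# F-tests between successive nested models
print(anova(fits[[1]], fits[[2]], fits[[3]], fits[[4]]))
```

With only eight data points, the higher-degree polynomials can easily overfit, so a lower AIC or a significant F-test should be read with some caution.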
Fitting a curve to specific data
I'm trying to plot the curve that results from fitting a non-linear mixed model. It should look something like a normal distribution curve, but skewed to the right. I followed previous links here and here, but I cannot get it to work with my data, for several reasons (see below).
Here is the dataset
and code
s=read.csv("GRVMAX tadpoles.csv")
t=s[s$SPP== levels(s$SPP)[1],]
head(t)
vmax=t[t$PERFOR=="VMAX",]
colnames(vmax)[6]="vmax"
vmax$TEM=as.numeric(as.character(vmax$TEM));
require(lme4)
start =c(TEM=25)
is.numeric(start)
nm1 <- nlmer ( vmax ~ deriv(TEM)~TEM|INDIVIDUO,nlpars=start, nAGQ =0,data= vmax)# this gives an error suggesting nlpars is not numeric, despite start is numeric...:~/
After that, I want to plot the curve over the original data
with(vmax,plot(vmax ~ (TEM)))
x=vmax$TEM
lines(x, predict(nm1, newdata = data.frame(TEM = x, INDIVIDUO = "ACI5")))
Any hint?
Thanks in advance
I am having problems adding a line to one scatter plot. My data is fit to an exponential model. I used nls to get coefficients:
fit <- nls(volumen ~ bo+ exp(b1*dap), data= df, start = list(bo=0, b1=55))
Later I used plot and lines commands to visualize the fit to my data:
plot(dap,volumen)
lines(dap,predict(fit,data.frame(x=dap)))
My big problem is that the plot shows a straight line that does not follow my exponential-looking points.
Could my model statement have errors?
I would appreciate any comments.
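Without the data it is hard to say what went wrong here, but the following is a minimal sketch of one way to fit and draw an exponential curve with nls. Everything in it is an assumption for illustration: the data are simulated, the model has a multiplicative coefficient a (volumen = a*exp(b*dap)) rather than the bo + exp(b1*dap) form above, and the starting values come from a log-linear lm fit. Two details often matter when plotting: the newdata column should carry the same name as the predictor in the formula (dap, not x), and the x values passed to lines() should be sorted.

```r
# Simulated data (hypothetical; stands in for the real df)
set.seed(1)
df <- data.frame(dap = runif(40, min = 10, max = 50))
df$volumen <- 0.05 * exp(0.08 * df$dap) * exp(rnorm(40, sd = 0.1))

# Rough starting values from a linear fit on the log scale
logfit <- lm(log(volumen) ~ dap, data = df)
start <- list(a = exp(coef(logfit)[[1]]), b = coef(logfit)[[2]])

# Non-linear least squares fit of the exponential model
fit <- nls(volumen ~ a * exp(b * dap), data = df, start = start)

# Predict on a sorted grid whose column name matches the model variable
grid <- data.frame(dap = seq(min(df$dap), max(df$dap), length.out = 200))
grid$pred <- predict(fit, newdata = grid)

plot(volumen ~ dap, data = df, pch = 19)
lines(grid$dap, grid$pred, col = "red")
```

Starting values far from the truth (such as a very large exponent rate) can make nls fail or converge poorly, which is why the sketch derives them from the log-scale fit.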
I have a logistic regression model (using R) as
fit6 <- glm(formula = survived ~ ascore + gini + failed, data=records, family = binomial)
summary(fit6)
I'm using the pROC package to draw ROC curves and compute the AUC for six models, fit1 through fit6.
This is how I plot a single ROC curve:
prob6=predict(fit6,type=c("response"))
records$prob6 = prob6
g6 <- roc(survived~prob6, data=records)
plot(g6)
Is there a way to combine the ROC curves for all six models in one plot and display the AUC for each of them and, if possible, the confidence intervals too?
You can use the add = TRUE argument of the plot function to draw multiple ROC curves on the same plot.
Make up some fake data
library(pROC)
a=rbinom(100, 1, 0.25)
b=runif(100)
c=rnorm(100)
Get model fits
fit1=glm(a~b+c, family='binomial')
fit2=glm(a~c, family='binomial')
Predict on the same data you trained the model with (or hold some out to test on if you want)
preds=predict(fit1)
roc1=roc(a ~ preds)
preds2=predict(fit2)
roc2=roc(a ~ preds2)
Plot it up.
plot(roc1)
plot(roc2, add=TRUE, col='red')
This draws both ROC curves on the same plot. You can get the AUC of a curve with roc1$auc, and you can add it to the plot either with the text() function in base R graphics or by putting it in the legend.
I don't know how to quantify confidence intervals...or if that is even a thing you can do with ROC curves. Someone else will have to fill in the details on that one. Sorry. Hopefully the rest helped though.
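On the confidence-interval point: pROC does provide ci.auc(), which by default computes a DeLong confidence interval for the AUC. Below is a minimal sketch that overlays several ROC curves and puts each AUC with its 95% CI in the legend. The data are made up, as in the answer above, and the names outcome, x1, x2, fits, rocs, and legend_text are hypothetical; with real models you would put fit1 through fit6 in the list.

```r
library(pROC)

# Fake data, for illustration only
set.seed(42)
outcome <- rbinom(100, 1, 0.25)
x1 <- runif(100)
x2 <- rnorm(100)

fits <- list(
  fit1 = glm(outcome ~ x1 + x2, family = binomial),
  fit2 = glm(outcome ~ x2, family = binomial)
)

# One ROC object per model, built from predicted probabilities
rocs <- lapply(fits, function(f) {
  roc(outcome, predict(f, type = "response"), quiet = TRUE)
})

# ci.auc() returns lower bound, AUC, upper bound
cis <- lapply(rocs, ci.auc)
legend_text <- sapply(names(rocs), function(n) {
  ci <- as.numeric(cis[[n]])
  sprintf("%s: AUC %.2f (95%% CI %.2f-%.2f)", n, ci[2], ci[1], ci[3])
})

# First curve creates the plot, the rest are added to it
cols <- c("black", "red")
plot(rocs[[1]], col = cols[1])
for (i in seq_along(rocs)[-1]) plot(rocs[[i]], add = TRUE, col = cols[i])
legend("bottomright", legend = legend_text, col = cols, lwd = 2, cex = 0.8)
```

For six models you would extend cols to six colours; the loop and the legend need no other changes.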
In Excel, it's pretty easy to fit a logarithmic trend line to a given set of data. Just click "Add Trendline" and then select "Logarithmic." Having switched to R for more power, I am a bit lost as to which function one should use to generate this.
To generate the graph, I used ggplot2 with the following code.
ggplot(data, aes(horizon, success)) + geom_line() + geom_area(alpha=0.3)+
stat_smooth(method='loess')
But this code does local polynomial regression fitting, which is based on combining many small local regressions. My question is whether there is a log trend line in R similar to the one in Excel.
Alternatively, I would like to get a log equation of the form y = (c*ln(x))+b; is there a coef() function to get 'c' and 'b'?
Let my data be:
c(0.599885189,0.588404133,0.577784156,0.567164179,0.556257176,
0.545350172,0.535112897,0.52449292,0.51540375,0.507271336,0.499904325,
0.498851894,0.498851894,0.497321087,0.4964600,0.495885955,0.494068121,
0.492154612,0.490145427,0.486892461,0.482395714,0.477229238,0.471010333)
The data above are the y-values; the x-values are simply the integers 1:length(y). In Excel, I can simply plot this and add a logarithmic trend line, and the result would look like this:
The black line is the log trend. In R, how would one do this with the above dataset?
I prefer to use base graphics instead of ggplot2:
#some data with a linear model
x <- 1:20
set.seed(1)
y <- 3*log(x)+5+rnorm(20)
#plot data
plot(y~x)
#fit log model
fit <- lm(y~log(x))
#look at result and statistics
summary(fit)
#extract coefficients only
coef(fit)
#plot fit with confidence band
matlines(x=seq(from=1,to=20,length.out=1000),
y=predict(fit,newdata=list(x=seq(from=1,to=20,length.out=1000)),
interval="confidence"))
#some data with a non-linear model
set.seed(1)
y <- log(0.1*x)+rnorm(20,sd=0.1)
#plot data
plot(y~x)
#fit log model
fit <- nls(y~log(a*x),start=list(a=0.2))
#look at result and statistics
summary(fit)
#plot fit
lines(seq(from=1,to=20,length.out=1000),
predict(fit,newdata=list(x=seq(from=1,to=20,length.out=1000))))
You can easily specify an alternative smoothing method (such as lm, for linear least-squares fitting) and an alternative formula in stat_smooth:
library(ggplot2)
g0 <- ggplot(dat, aes(horizon, success)) + geom_line() + geom_area(alpha=0.3)
g0 + stat_smooth(method="lm",formula=y~log(x),fill="red")
The confidence bands are automatically included: I changed the color to make them visible since they're very narrow. You can use se=FALSE in stat_smooth to turn them off.
The other answer shows you how to get the coefficients:
coef(lm(success~log(horizon),data=dat))
I can imagine you might next want to add the equation to the graph: see Adding Regression Line Equation and R2 on graph
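As a minimal sketch of that last step, the fitted equation can be built with sprintf() and placed on the plot with annotate(). The data are the y-values from the question; the data frame name dat, the label format, and the label position in the top-right corner are assumptions for illustration.

```r
library(ggplot2)

# y-values from the question; x is simply 1, 2, ..., 23
success <- c(0.599885189, 0.588404133, 0.577784156, 0.567164179, 0.556257176,
             0.545350172, 0.535112897, 0.52449292, 0.51540375, 0.507271336,
             0.499904325, 0.498851894, 0.498851894, 0.497321087, 0.4964600,
             0.495885955, 0.494068121, 0.492154612, 0.490145427, 0.486892461,
             0.482395714, 0.477229238, 0.471010333)
dat <- data.frame(horizon = seq_along(success), success = success)

# Fit y = c*ln(x) + b and build a label from the coefficients
fit <- lm(success ~ log(horizon), data = dat)
cf <- coef(fit)
eq_label <- sprintf("y = %.4f * ln(x) + %.4f, R^2 = %.3f",
                    cf[[2]], cf[[1]], summary(fit)$r.squared)

# Points, log trend line, and the equation in the top-right corner
p <- ggplot(dat, aes(horizon, success)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ log(x), se = FALSE) +
  annotate("text", x = Inf, y = Inf, hjust = 1.1, vjust = 2, label = eq_label)
print(p)
```

Here cf[[2]] plays the role of 'c' and cf[[1]] the role of 'b' in the question's y = (c*ln(x))+b.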
I'm pretty sure a simple + scale_y_log10() would get you what you want. ggplot2 computes stats after scale transformations, so the loess() fit would then be calculated on the log-transformed data.
I've just written a blog post here that describes how to match Excel's logarithmic curve fitting exactly. The nub of the approach centers around the lm() function:
# Set x and data.to.fit to the independent and dependent variables
data.to.fit <- c(0.5998,0.5884,0.5777,0.5671,0.5562,0.5453,0.5351,0.524,0.515,0.5072,0.4999,0.4988,0.4988,0.4973,0.49,0.4958,0.4940,0.4921,0.4901,0.4868,0.4823,0.4772,0.4710)
x <- seq_along(data.to.fit)
data.set <- data.frame(x, data.to.fit)
# Perform a logarithmic fit to the data set
log.fit <- lm(data.to.fit~log(x), data=data.set)
# Print out the intercept, log(x) parameters, R-squared values, etc.
summary(log.fit)
# Plot the original data set
plot(data.set)
# Add the log.fit line with confidence intervals
matlines(predict(log.fit, data.frame(x=x), interval="confidence"))
Hope that helps.