I have data that should follow a power-law distribution.
x = distance
y = %
I want to create a model and add the fitted line to my plot.
My aim is to recreate something like this:
Since the author reports R-squared, I assume they fitted a linear model, as R^2 is not suitable for non-linear models: http://blog.minitab.com/blog/adventures-in-statistics-2/why-is-there-no-r-squared-for-nonlinear-regression
However, I can't figure out how to "curve" my line to the points, i.e. how to add the formula y ~ a*x^(-b) to my model.
Instead of a curved line, I get a straight line, as from a simple linear regression.
My questions are:
Do I correctly assume that the model y ~ a*x^(-b) used by the author is linear?
What type of model should I use to recreate my example: lm, glm, nls, etc.?
I generated dummy data using the power-law formula from the plot above:
set.seed(42)
scatt<-runif(10)
x<-seq(1, 1000, 100)
b = 1.8411
a = 133093
y = a*x^(-b) + scatt # add some variability in my dependent variable
plot(y ~ x)
and tried to create a glm model.
# formula for non-linear model
m<-m.glm<-glm(y ~ x^2, data = dat) #
# add predicted line to plot
lines(x,predict(m),col="red",lty=2,lwd=3)
This is my first time building a model, so I am really confused and don't know where to start. Thank you for any suggestions or directions, I really appreciate it.
I personally think this question is a duplicate of this one: `nls` fails to estimate parameters of my model. But it would be cold-blooded of me to close it, since the OP has put a bounty on it. In any case, a question with a bounty cannot be closed.
So the best I can think of is to post a community wiki answer (I don't want the bounty).
Since you want to fit a model of the form y ~ a*x^(-b), it often helps to take logs of both sides and fit a linear model log(y) ~ log(x).
fit <- lm(log(y) ~ log(x))
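Spelled out, the reason this works is that taking logs turns the power law into a straight line in log(x). This assumes x > 0 and y > 0 so that both logs are defined:

```latex
\begin{aligned}
y &= a\,x^{-b} \\
\log y &= \log a - b \log x
\end{aligned}
```

So the intercept of the linear fit estimates log(a) and the slope estimates -b, i.e. a = exp(intercept) and b = -slope.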
Since you already know how to use curve to plot a regression curve and are happy with it, I will now show how to make the plot.
Some people call this log-log regression. Here are some other links I have for such kind of regression:
How to predict a new value using simple linear regression log(y)=b0+b1*log(x)
How to plot confidence bands for my weighted log-log linear regression?
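dat <- data.frame(x, y)             # x and y from the question's dummy data
m <- lm(log(y) ~ log(x), data=dat)
# log(y) = log(a) - b*log(x), so back-transform the coefficients
a <- exp(coef(m)[[1]])              # a = exp(intercept)
b <- -coef(m)[[2]]                  # b = -slope
plot(y ~ x, type="p", lty=3)
# predictions are on the log scale, so exponentiate before plotting
lines(x, exp(predict(m)), col="blue", lty=2, lwd=3)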
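Since the question also asks about nls, here is a minimal sketch comparing the log-log lm fit with a direct nls fit of y ~ a*x^(-b). The data are hypothetical (a power law with multiplicative noise, not the question's dummy data), and the names m_lm, m_nls, a_lm and b_lm are made up for illustration. The log-log estimates are used as starting values for nls, which usually helps convergence. Note that inside lm and glm formulas `^` is a formula operator, so y ~ x^2 is the same as y ~ x; in an nls formula it is ordinary arithmetic.

```r
set.seed(42)
# Hypothetical data: power law with multiplicative noise
x <- 1:50
y <- 5 * x^(-1.5) * exp(rnorm(50, sd = 0.05))

# Log-log linear fit; back-transform the coefficients to the original scale
m_lm <- lm(log(y) ~ log(x))
a_lm <- exp(coef(m_lm)[[1]])   # a = exp(intercept)
b_lm <- -coef(m_lm)[[2]]       # b = -slope

# Direct non-linear least squares, started from the log-log estimates
m_nls <- nls(y ~ a * x^(-b), start = list(a = a_lm, b = b_lm))

# Plot the data and both fitted curves on the original scale
plot(y ~ x)
xs <- seq(min(x), max(x), length.out = 200)
lines(xs, a_lm * xs^(-b_lm), col = "blue", lty = 2, lwd = 2)
lines(xs, predict(m_nls, newdata = data.frame(x = xs)), col = "red", lwd = 2)
legend("topright", legend = c("lm on log-log scale", "nls"),
       col = c("blue", "red"), lty = c(2, 1), lwd = 2, bty = "n")
```

The two fits weight the data differently: lm on the log scale minimises relative errors, while nls minimises absolute errors on the original scale, so the estimates will generally not coincide.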
I'm trying to plot the curve resulting from fitting a non-linear mixed model. It should look something like a normal density curve, but skewed to the right. I followed previous links here and here, but with my own data I cannot make it work, for several reasons (see below).
Here is the dataset
and code
s=read.csv("GRVMAX tadpoles.csv")
t=s[s$SPP== levels(s$SPP)[1],]
head(t)
vmax=t[t$PERFOR=="VMAX",]
colnames(vmax)[6]="vmax"
vmax$TEM=as.numeric(as.character(vmax$TEM));
require(lme4)
start =c(TEM=25)
is.numeric(start)
nm1 <- nlmer ( vmax ~ deriv(TEM)~TEM|INDIVIDUO,nlpars=start, nAGQ =0,data= vmax)# this gives an error suggesting nlpars is not numeric, despite start is numeric...:~/
After that, I want to plot the curve over the original data
with(vmax,plot(vmax ~ (TEM)))
x=vmax$TEM
lines(x, predict(nm1, newdata = data.frame(TEM = x, INDIVIDUO = "ACI5")))
Any hint?
Thanks in advance
Please help.
I have managed to fit a fairly decent nonlinear regression curve to my data using:
model2 <- nls(urine~a*(1-exp(-c*water)), data = eagle, start = list(a=550, c=0.00385))
and then I generate my plot using:
av <- seq(0,1200, 0.5)
bv <- predict (model2, data=eagle, list(water=av))
plot (urine~water, data = eagle)
lines (av,bv)
I want to add dotted lines for the upper and lower 95% confidence limits. I have managed to write code for this in my linear regression analyses, but I am struggling with the current nonlinear regression. I tried the following code:
a <- predict (model2, newdata=data.frame(water=av), interval="confidence")
For linear regressions this gives me a matrix with three columns: fit, lwr and upr. With my nonlinear regression it just gives me a vector of numbers that I can't do anything with. Any suggestions? Is there a simple way to get upper and lower confidence lines for a nonlinear regression?
Apparently se.fit isn't implemented for nls predictions. I found lots of forum threads discussing the difficulty of computing confidence intervals for nonlinear regressions in R. After a whole day of searching and banging my head against the internet, I finally found a very helpful function. I'm not 100% sure how it works, but it works, so here it is for anyone else who needs to do this:
http://www.r-bloggers.com/predictnls-part-1-monte-carlo-simulation-confidence-intervals-for-nls-models/
Once you have produced your table of statistics for your predicted values based on a sequence of x values, you can access the lwr and upr CI value data in columns 6 and 7 of the output and plot that way.
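As a base-R alternative to the Monte Carlo approach in the linked post, approximate confidence bands can also be sketched with the delta method (a first-order Taylor approximation, so not the same technique as the blog's function). The eagle data are not available here, so the sketch below simulates a hypothetical stand-in with the same column names; the noise level and the helper names a_hat, c_hat, J and se are assumptions made for illustration.

```r
set.seed(1)
# Hypothetical stand-in for the 'eagle' data (the real data are not shown)
eagle <- data.frame(water = seq(50, 1200, length.out = 40))
eagle$urine <- 550 * (1 - exp(-0.00385 * eagle$water)) + rnorm(40, sd = 15)

model2 <- nls(urine ~ a * (1 - exp(-c * water)), data = eagle,
              start = list(a = 550, c = 0.00385))

av <- seq(0, 1200, 0.5)
a_hat <- coef(model2)[["a"]]
c_hat <- coef(model2)[["c"]]
fit <- a_hat * (1 - exp(-c_hat * av))

# Jacobian of the fitted curve with respect to (a, c) at each value of av
J <- cbind(a = 1 - exp(-c_hat * av),
           c = a_hat * av * exp(-c_hat * av))

# Delta method: var(fit) is approximately diag(J V J'), V = vcov of the estimates
se <- sqrt(rowSums((J %*% vcov(model2)) * J))
tcrit <- qt(0.975, df.residual(model2))
lwr <- fit - tcrit * se
upr <- fit + tcrit * se

plot(urine ~ water, data = eagle)
lines(av, fit)
lines(av, lwr, lty = 3)   # lower 95% confidence limit
lines(av, upr, lty = 3)   # upper 95% confidence limit
```

These are confidence bands for the mean curve, not prediction intervals for new observations, and they rely on the model being roughly linear in its parameters near the estimates.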
I have a logistic regression model (using R) as
fit6 <- glm(formula = survived ~ ascore + gini + failed, data=records, family = binomial)
summary(fit6)
I'm using the pROC package to draw ROC curves and compute the AUC for six models, fit1 through fit6.
This is how I plot a single ROC curve:
prob6=predict(fit6,type=c("response"))
records$prob6 = prob6
g6 <- roc(survived~prob6, data=records)
plot(g6)
Is there a way to combine the ROC curves for all six models in one plot and display the AUC for each of them and, if possible, the confidence intervals too?
You can use the add = TRUE argument to the plot function to plot multiple ROC curves.
Make up some fake data
library(pROC)
a=rbinom(100, 1, 0.25)
b=runif(100)
c=rnorm(100)
Get model fits
fit1=glm(a~b+c, family='binomial')
fit2=glm(a~c, family='binomial')
Predict on the same data you trained the model with (or hold some out to test on if you want)
preds=predict(fit1)
roc1=roc(a ~ preds)
preds2=predict(fit2)
roc2=roc(a ~ preds2)
Plot it up.
plot(roc1)
plot(roc2, add=TRUE, col='red')
This produces the different fits on the same plot. You can get the AUC of the ROC curve by roc1$auc, and can add it either using the text() function in base R plotting, or perhaps just toss it in the legend.
I don't know how to quantify confidence intervals...or if that is even a thing you can do with ROC curves. Someone else will have to fill in the details on that one. Sorry. Hopefully the rest helped though.
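On the confidence-interval part: pROC provides ci.auc(), which by default computes a 95% interval for the AUC using DeLong's method. A minimal sketch on the same kind of fake data, putting each AUC and its interval in the legend (the object names fits, rocs, cis and labels are made up for illustration):

```r
library(pROC)

set.seed(1)
# Fake data, as in the answer above
a <- rbinom(100, 1, 0.25)
b <- runif(100)
c <- rnorm(100)
fits <- list(fit1 = glm(a ~ b + c, family = binomial),
             fit2 = glm(a ~ c, family = binomial))

# One ROC object per model, built from the predicted probabilities
rocs <- lapply(fits, function(f) {
  suppressMessages(roc(a, predict(f, type = "response")))
})

# ci.auc() returns c(lower, AUC, upper); collect them as a 3 x n matrix
cis <- sapply(rocs, function(r) as.numeric(ci.auc(r)))

# Overlay the curves and report AUC with its 95% CI in the legend
cols <- c("black", "red")
plot(rocs[[1]], col = cols[1])
plot(rocs[[2]], add = TRUE, col = cols[2])
labels <- sprintf("%s: AUC %.2f (95%% CI %.2f-%.2f)",
                  names(rocs), cis[2, ], cis[1, ], cis[3, ])
legend("bottomright", legend = labels, col = cols, lwd = 2, bty = "n")
```

For six models the same pattern applies: put fit1 through fit6 in the list and supply six colours.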
In Excel, it's pretty easy to fit a logarithmic trend line to a given set of data points: just click "Add Trendline" and select "Logarithmic". Having switched to R for more power, I am a bit lost as to which function to use to generate this.
To generate the graph, I used ggplot2 with the following code.
ggplot(data, aes(horizon, success)) + geom_line() + geom_area(alpha=0.3)+
stat_smooth(method='loess')
But this code does local polynomial regression fitting, which is based on averaging out numerous small linear regressions. My question is whether there is a log trend line in R similar to the one in Excel.
Alternatively, I am looking for a log equation of the form y = (c*ln(x))+b; is there a coef() function to get 'c' and 'b'?
Let my data be:
c(0.599885189,0.588404133,0.577784156,0.567164179,0.556257176,
0.545350172,0.535112897,0.52449292,0.51540375,0.507271336,0.499904325,
0.498851894,0.498851894,0.497321087,0.4964600,0.495885955,0.494068121,
0.492154612,0.490145427,0.486892461,0.482395714,0.477229238,0.471010333)
The above data are the y-values, while the x-values are simply the integers 1:length(y). In Excel I can simply plot this and add a logarithmic trend line, and the result looks like this:
with the black line being the log fit. In R, how would one do this with the above dataset?
I prefer to use base graphics instead of ggplot2:
#some data with a linear model
x <- 1:20
set.seed(1)
y <- 3*log(x)+5+rnorm(20)
#plot data
plot(y~x)
#fit log model
fit <- lm(y~log(x))
#look at result and statistics
summary(fit)
#extract coefficients only
coef(fit)
#plot fit with confidence band
matlines(x=seq(from=1,to=20,length.out=1000),
y=predict(fit,newdata=list(x=seq(from=1,to=20,length.out=1000)),
interval="confidence"))
#some data with a non-linear model
set.seed(1)
y <- log(0.1*x)+rnorm(20,sd=0.1)
#plot data
plot(y~x)
#fit log model
fit <- nls(y~log(a*x),start=list(a=0.2))
#look at result and statistics
summary(fit)
#plot fit
lines(seq(from=1,to=20,length.out=1000),
predict(fit,newdata=list(x=seq(from=1,to=20,length.out=1000))))
You can easily specify an alternative smoothing method (such as lm, linear least-squares fitting) and an alternative formula:
library(ggplot2)
g0 <- ggplot(dat, aes(horizon, success)) + geom_line() + geom_area(alpha=0.3)
g0 + stat_smooth(method="lm",formula=y~log(x),fill="red")
The confidence bands are automatically included: I changed the color to make them visible since they're very narrow. You can use se=FALSE in stat_smooth to turn them off.
The other answer shows you how to get the coefficients:
coef(lm(success~log(horizon),data=dat))
I can imagine you might next want to add the equation to the graph: see Adding Regression Line Equation and R2 on graph
I'm pretty sure a simple +scale_y_log10() would get you what you wanted. GGPlot stats are calculated after transformations, so the loess() would then be calculated on the log transformed data.
I've just written a blog post here that describes how to match Excel's logarithmic curve fitting exactly. The nub of the approach centers around the lm() function:
# Set x and data.to.fit to the independent and dependent variables
data.to.fit <- c(0.5998,0.5884,0.5777,0.5671,0.5562,0.5453,0.5351,0.524,0.515,0.5072,0.4999,0.4988,0.4988,0.4973,0.49,0.4958,0.4940,0.4921,0.4901,0.4868,0.4823,0.4772,0.4710)
x <- c(seq(1, length(data.to.fit)))
data.set <- data.frame(x, data.to.fit)
# Perform a logarithmic fit to the data set
log.fit <- lm(data.to.fit~log(x), data=data.set)
# Print out the intercept, log(x) parameters, R-squared values, etc.
summary(log.fit)
# Plot the original data set
plot(data.set)
# Add the log.fit line with confidence intervals
matlines(predict(log.fit, data.frame(x=x), interval="confidence"))
Hope that helps.
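To display the fitted equation and R-squared on the plot the way Excel does, the coefficients can be pulled out with coef() and formatted into a label. A minimal base-R sketch using the question's full-precision data (the names cf, r2 and eq are made up for illustration):

```r
# y-values from the question; x is simply 1, 2, ..., length(y)
y <- c(0.599885189, 0.588404133, 0.577784156, 0.567164179, 0.556257176,
       0.545350172, 0.535112897, 0.52449292, 0.51540375, 0.507271336,
       0.499904325, 0.498851894, 0.498851894, 0.497321087, 0.4964600,
       0.495885955, 0.494068121, 0.492154612, 0.490145427, 0.486892461,
       0.482395714, 0.477229238, 0.471010333)
x <- seq_along(y)

# Logarithmic trend: y = c*ln(x) + b, which is linear in log(x)
log.fit <- lm(y ~ log(x))
cf <- coef(log.fit)                   # cf[1] is b (intercept), cf[2] is c
r2 <- summary(log.fit)$r.squared

# Build an Excel-style label and draw the data, the trend line and the label
eq <- sprintf("y = %.4f*ln(x) + %.4f, R^2 = %.3f", cf[[2]], cf[[1]], r2)
plot(x, y)
lines(x, fitted(log.fit), col = "red", lwd = 2)
legend("topright", legend = eq, bty = "n")
```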