Best-fit quadratic regression in R

I'm running into an odd problem; you can get my dataset here: dataset
All I need is a simple graph showing the best-fit regression (a quadratic regression) between rao and Obs_Richness, but the polynomial model I plot looks nothing like a simple curve. Any suggestions on how to fix this?
# read in data
F_Div <- read.csv('F_Div.csv', header = TRUE)
str(F_Div)
pairs(F_Div[2:12], pch = 16)

# richness vs functional diversity
par(mfrow = c(1, 1))
lm1 <- lm(rao ~ Obs_Richness, data = F_Div)
summary(lm1)
plot(rao ~ Obs_Richness, data = F_Div, pch = 16,
     xlab = "Species Richness", ylab = "Rao's Q")
abline(lm1, lty = 3)                              # linear fit
lines(lowess(F_Div$rao ~ F_Div$Obs_Richness))     # lowess smoother

# quadratic (2nd-degree polynomial) fit
poly.mod <- lm(F_Div$rao ~ poly(F_Div$Obs_Richness, 2, raw = TRUE))
summary(poly.mod)
lines(F_Div$Obs_Richness, predict(poly.mod))      # this draws the "squiggly" line
I need the line that best approximates the lowess line (a single smooth curve), not this squiggly mess.
I also tried this, but it's not what I need:
xx <- seq(0, 30, length = 67)
plot(rao ~ Obs_Richness, data = F_Div, pch = 16, xlab = "Species Richness", ylab = "Rao's Q")
lines(xx, predict(poly.mod, data.frame(x = xx)), col = "blue")

The squiggly mess happens because lines(...) draws segments between successive points in the order they appear in the data. Try this at the end:
p <- data.frame(x = F_Div$Obs_Richness, y = predict(poly.mod))
p <- p[order(p$x), ]   # sort the fitted values by x before drawing
lines(p)
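To get a curve as smooth as the lowess line, an alternative sketch (the name poly.mod2 and the 200-point grid are my own choices, assuming F_Div is loaded as in the question) is to refit the quadratic with a data= argument. Your earlier data.frame(x=xx) attempt is ignored by predict() because the formula hard-codes F_Div$Obs_Richness, so it simply returns the fitted values at the original, unordered points.
poly.mod2 <- lm(rao ~ poly(Obs_Richness, 2, raw = TRUE), data = F_Div)
grid <- data.frame(Obs_Richness = seq(min(F_Div$Obs_Richness),
                                      max(F_Div$Obs_Richness),
                                      length.out = 200))
plot(rao ~ Obs_Richness, data = F_Div, pch = 16,
     xlab = "Species Richness", ylab = "Rao's Q")
lines(grid$Obs_Richness, predict(poly.mod2, newdata = grid), col = "blue", lwd = 2)  # smooth quadratic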

Related

How to add a Poisson distribution curve that approaches 3?

I want to add a curve to an existing plot.
This curve should be a Poisson distribution curve that approaches the mean of 3.
I've tried this code (points is a vector with 1000 values):
plot(1:1000, points, type = "l")
abline(h = 3)
x <- 0:1000
curve(dnorm(x, 3, sqrt(3)), lwd = 2, col = "red", add = TRUE)
I am getting a plot, but without any curve.
I would like to see a curve that approaches 3.
You can do something like this:
# shift the Poisson probabilities up so they sit around the level 3
plot(0:20, 3 + dpois(x = 0:20, lambda = 3), xlim = c(-2, 20))
# normal approximation to the same Poisson (mean 3, variance 3), shifted the same way
normden <- function(x) { 3 + dnorm(x, mean = 3, sd = sqrt(3)) }
curve(normden, from = -4, to = 20, add = TRUE, col = "red")
Running this code produces a plot of the shifted Poisson probabilities with the red normal-approximation curve on top. Is that what you intended?
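If what you were after is instead a curve on your original plot of the 1000 values that converges to the mean 3, one possible reading (a sketch only; I am assuming points holds Poisson(3) draws, e.g. from rpois) is the running mean:
set.seed(1)
points <- rpois(1000, lambda = 3)              # assumption: this is how 'points' was generated
plot(1:1000, points, type = "l", col = "grey")
abline(h = 3, lty = 2)
lines(1:1000, cumsum(points) / seq_along(points), col = "red", lwd = 2)  # running mean approaches 3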

How to plot ellipses with Bray-Curtis similarity values?

I want to do an nMDS analysis and add ellipses to my graph that represent the percentage of Bray-Curtis similarity, but I do not know how to do this in R. You can make this type of graphic with PRIMER, but I assume it is possible in R as well.
How can I do it?
I want my graphic to look like this:
graph
I have never used PRIMER and I do not know how these graphs are supposed to be drawn, so I have to guess from the graph you posted. It may be that you first build a cluster dendrogram (function hclust), then cut it by tree height (function cutree), and then draw enclosing ellipses for those clusters (vegan::ordiellipse(..., kind = "ehull")). If so, this should do the trick:
library(vegan)
data(BCI)
d <- vegdist(BCI)
ord <- metaMDS(d, trace=FALSE)
cl <- hclust(d, "average") # using average linkage
plot(cl, hang=-1)
rect.hclust(cl, h=0.4, border="red") # to see the clusters
rect.hclust(cl, h=0.5, border="cyan")
rect.hclust(cl, h=0.6, border="blue")
plot(ord) # then ordination & enclosing ellipses
ordiellipse(ord, cutree(cl, h=0.4), kind="ehull", col="red", lwd=2)
ordiellipse(ord, cutree(cl, h=0.5), kind="ehull", col="cyan", lwd=2)
ordiellipse(ord, cutree(cl, h=0.6), kind="ehull", col="blue", lwd=2)
I adjusted the cut heights (the h values) for this particular dendrogram. ordiellipse() will give you an error message for every one-item class, but these are harmless (I ought to clean the function of those).
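If those messages bother you, ordiellipse() also has, if I remember its arguments correctly, a show.groups option, so a sketch for drawing ellipses only for clusters with more than one site would be:
grp <- cutree(cl, h = 0.5)
big <- names(which(table(grp) > 1))   # labels of clusters with at least two members
ordiellipse(ord, grp, kind = "ehull", col = "cyan", lwd = 2, show.groups = big)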

Smoothing over a Polynomial curve

Given data that looks like this:
x<-c(0.287,0.361,0.348,0.430,0.294)
y<-c(105,230,249,758,379)
I'm trying to fit several different methods to this data. For this question I'm looking at 2nd order polynomial fits vs Loess fits. To get a smoother curve, I'd like to expand the x data to give me more points to predict over. So for my Loess curve I do this:
Loess_Fit<-loess(y ~ x)
MakeSmooth<-seq(min(x), max(x), (max(x)-min(x))/1000)
plot(x,y)
#WithoutSmoothing
lines(x=sort(x), y=predict(Loess_Fit)[order(x)], col="red", type="l")
#With Smoothing
lines(x=sort(MakeSmooth), y=predict(Loess_Fit,MakeSmooth)[order(MakeSmooth)], col="blue", type="l")
When I attempt to do the same thing with a 2nd-order polynomial fit, I get an error:
Poly2<-lm(y ~ poly(x,2,raw=TRUE))
plot(x,y)
#WithoutSmoothing
lines(x=sort(x), y=predict(Poly2)[order(x)], col="red", type="l")
#With Smoothing
lines(x=sort(MakeSmooth), y=predict(Poly2,MakeSmooth)[order(MakeSmooth)], col="blue", type="l")
Obviously there is some difference between Poly2 and Loess_Fit, but I don't know what the difference is. Is there a way to smooth out the Poly2 fit as I did with the Loess_Fit?
For lm, the new data passed to predict() needs to be a data frame whose column name matches the variable in the model formula (here x):
lines(x=sort(MakeSmooth), y=predict(Poly2,data.frame(x=MakeSmooth))[order(MakeSmooth)], col="blue", type="l")
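The underlying difference is that predict() on a loess fit with a single predictor accepts a plain numeric vector, while predict() on an lm object looks for the formula variables in a data frame. Also, since MakeSmooth is built with seq() it is already increasing, so the sorting can be dropped; a minimal sketch putting both fits on one plot:
plot(x, y)
lines(MakeSmooth, predict(Loess_Fit, MakeSmooth), col = "blue")                       # loess
lines(MakeSmooth, predict(Poly2, newdata = data.frame(x = MakeSmooth)), col = "red")  # quadratic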

Understanding the Local Polynomial Regression

Could someone explain to me why I get different lines when I plot? Somehow I thought the lines should be the same.
library(sm)          # assumption: the aircraft data set comes from the sm package
data(aircraft)
help(aircraft)
attach(aircraft)
lgWeight <- log(Weight)

library(KernSmooth)
# a) Fit a nonparametric regression to the data (xi, yi) and save the estimated values m^(xi).
# Local polynomial regression of degree 2 of lgWeight against Yr
op <- par(mfrow = c(2, 1))
lpr1 <- locpoly(Yr, lgWeight, bandwidth = 7, degree = 2, gridsize = length(Yr))
plot(Yr, lgWeight, col = "grey", ylab = "Log(Weight)", xlab = "Year")
lines(lpr1, lwd = 2, col = "blue")
lines(lpr1$y, col = "black")
How can I get the values from the model? If I print the model, it gives me the values in $x and $y, but somehow when I plot them it is not the same as the blue line. I need the values of the fitted model (blue) for every x; could someone help me?
The fitted model (blue curve) is correctly in lpr1. As you said, the correct y-values are in lpr1$y and the correct x-values are in lpr1$x.
The reason the second plot looks like a straight line is because you are only giving the plot function one variable, lpr1$y. Since you don't specify the x-coordinates, R will automatically plot them along an index, from 1 to the length of the y variable.
The following are two explicit and equivalent ways to plot the curve and line:
lines(x = lpr1$x, y = lpr1$y,lwd=2, col="blue") # plots curve
lines(x = 1:length(lpr1$y), y = lpr1$y, col="black") # plot line
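Note that locpoly() evaluates the smoother on an equally spaced grid rather than at the observed Yr values, so if you need a fitted value for every original x, one rough option (fit_at_Yr is just an illustrative name) is to interpolate the grid with approx():
fit_at_Yr <- approx(x = lpr1$x, y = lpr1$y, xout = Yr)$y   # fitted values at each observed year
head(data.frame(Yr, lgWeight, fit_at_Yr))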

How do I plot the 'inverse' of a survival function?

I am trying to plot the inverse of a survival function, as the data I'm working with actually show an increase in the proportion of an event over time. I can produce Kaplan-Meier survival plots, but I want to produce the 'opposite' of these. I can kind of get what I want using fun="cloglog":
plot(survfit(Surv(Days_until_workers,Workers)~Queen_Number+Treatment,data=xdata),
fun="cloglog", lty=c(1:4), lwd=2, ylab="Colonies with Workers",
xlab="Days", las=1, font.lab=2, bty="n")
But I don't quite understand what this has done to the time axis (it doesn't start at 0, and the spacing between values decreases), or why the survival lines extend above the y-axis.
Would really appreciate some help with this!
Cheers
Use fun="event" to get the desired output:
library(survival)                      # provides survfit() and the aml example data
fit <- survfit(Surv(time, status) ~ x, data = aml)
par(mfrow = 1:2, las = 1)
plot(fit, col = 2:3)                   # survival, S(t)
plot(fit, col = 2:3, fun = "event")    # cumulative events, 1 - S(t)
The reason fun="cloglog" messes up the axes is that it does not plot a fraction at all. Instead it plots this, according to ?plot.survfit:
"cloglog" creates a complementary log-log survival plot (f(y) = log(-log(y)), along with a log scale for the x-axis)
Moreover, the fun argument is not limited to predefined functions like "event" or "cloglog", so you can easily give it your own custom function.
plot(fit, col=2:3, fun=function(y) 3*sqrt(1-y))
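Applied to the call in your question (a sketch only, reusing your variable names and assuming xdata is your data frame), that would look like:
fit2 <- survfit(Surv(Days_until_workers, Workers) ~ Queen_Number + Treatment, data = xdata)
plot(fit2, fun = "event", lty = 1:4, lwd = 2,
     ylab = "Proportion of colonies with workers", xlab = "Days",
     las = 1, font.lab = 2, bty = "n")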
