How do I extract all data points from a lowess smoother using R?

I have used the following code to create the graph and a smoother. Now, I am wondering how I can get the data points for the line.
plot(mydata$chlindex ~ mydata$Time, pch=mydata$treatment, col=mydata$treatment)
for (i in c(1, 2, 3, 4)){
  lines(lowess(mydata$chl[mydata$treatment==i] ~ mydata$Time[mydata$treatment==i]),
        lty=2, col=i)
}
Thanks,
Michelle

Fabricated data example
create a couple correlated variables (correlation not necessary, but slightly more fun)
df <- data.frame(x=1:200)
df <- within(df, y <- rnorm(200,x*.01))
produce a scatter plot with the loess line
plot(df)
lines(predict(loess(y~x,df)),col="red")
Getting the loess line points
Note that predict() was used to draw the line. Call it without lines() to get the points.
predict(loess(y~x,df))
# [1] 0.2461715 0.2498436 0.2536022 0.2574490 0.2613854 0.2654131 0.2695336
# [8] 0.2737485 0.2780593 0.2824677 0.2869751 0.2915832 ...
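The answer above switches to loess(). If you want to keep lowess() as in the question, a minimal sketch follows. lowess() returns a list with components x and y, so storing the result instead of passing it straight to lines() gives you the points. The data here are made up for illustration; only the column names (Time, chlindex, treatment) are borrowed from the question.

```r
# Hypothetical data standing in for the asker's mydata (values are invented)
set.seed(1)
mydata <- data.frame(Time = rep(1:25, times = 4),
                     treatment = rep(1:4, each = 25))
mydata$chlindex <- 0.1 * mydata$Time + mydata$treatment + rnorm(100, sd = 0.3)

# Fit one lowess smoother per treatment and keep its x/y points
smooth_by_trt <- lapply(split(mydata, mydata$treatment), function(d) {
  s <- lowess(d$Time, d$chlindex)   # list with components x and y
  data.frame(treatment = d$treatment[1], Time = s$x, fitted = s$y)
})

# One data frame holding the smoothed points for all treatments
smooth_points <- do.call(rbind, smooth_by_trt)
head(smooth_points)
```

The same stored object can still be drawn with lines(s$x, s$y), so nothing about the plot has to change.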

Related

r coding for customising vegan plot

I am attempting to produce an NMDS plot in vegan, but really struggling with the code. I am trying to display the site points and species points differently, with the site points coloured according to treatment. Both lines work individually, but I cannot work out how to combine these two lines of code into one line to form one graph. I am using ordipointlabel to prevent overlap. These are the two lines of code I want to combine into one.
ordipointlabel(NMDS10, scaling=2, display="species", select=sel)
ordipointlabel(NMDS10,display="sites", col=c(rep("darkgreen",4),rep("blue4",4)),cex=0.75)
You can access the ordipointlabel object directly and modify it as you wish. See the example below:
library(vegan)
data(dune)
NMDS10 <- metaMDS(dune[1:8, ])
pdf(file = NULL)
y <- ordipointlabel(NMDS10, display=c("sites", "species"))
dev.off()
# select sites & species
sel <- unlist(dimnames(dune[1:8, ]))[-(20:ncol(dune))]
# messing with ordipointlabel object
y$points <- y$points[rownames(y$points) %in% sel, ]
y$args$pcol[] = rep("red", length(y$args$pcol))
y$args$pcol[1:8] <- c(rep("darkgreen", 4), rep("blue4", 4))
y$par$cex <- 0.75
plot(y)

labeling axis for parametric terms with plot.gam

I am trying to plot my gam results. The plotting works very well for all the smooth terms (in my case terms 1 to 8) but if I want to plot parametric terms (from 9 onwards), I can't change the axis labels. No matter if I use plot, plot.gam, termplot or text I can't do it. Any tips? Below is the code example
par(mfrow=c(3,3), oma=c(1,1,1,1),pty="s",mar=c(4.5,4.5,1,1))
# the first three graphs work perfectly
plot.gam(model$gam,select=1,scale=0,pers=TRUE,all.terms=T,shade=T,xlab="Water depth",ylab="")
plot.gam(model$gam,select=2,scale=0,pers=TRUE,all.terms=T,shade=T,xlab="Bottom current speed",ylab="")
plot.gam(model$gam,select=3,scale=0,pers=TRUE,all.terms=T,shade=T,xlab="Substance",ylab="")
# this graph for the parametric term is plotted but I cannot change axis labels
plot.gam(model$gam,select=9,scale=0,pers=T,all.terms=T,shade=T,xlab="AIS",ylab="")
If you are using RStudio, you can view the source code of plot.gam by pressing F2. In plain R, type plot.gam without parentheses. There you will find that plot() is replaced by termplot() for some select values (the parametric terms).
Thus, to manipulate the x-axis labels you have to use xlabs instead of xlab.
require(mgcv)
pa <- c(1, rep(0, 9))
term_A <- runif(10, 9, 15)
term_B <- runif(10, 1, 25)
data <- as.data.frame(cbind(pa, term_A, term_B))
mod <- gam(pa ~ s(term_A, k=3) + term_B, family=binomial, data=data)
summary(mod)
par(mfrow=c(2, 2))
# xlab=""
plot.gam(mod, select=1, all.terms=T, shade=T, xlab="your own lab title", ylab="")
# xlabs=""
plot.gam(mod, select=2, all.terms=T, shade=T, xlabs="your own lab title", ylab="")
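To see the xlabs argument in isolation, here is a small sketch that calls termplot() directly on an ordinary lm fit. It uses the built-in mtcars data rather than a gam, so it only illustrates the argument name; the model and the label texts are assumptions chosen for the example.

```r
# termplot() takes xlabs/ylabs (plural), not xlab/ylab
fit <- lm(mpg ~ wt + hp, data = mtcars)

pdf(file = NULL)   # draw to a null device so no file is written
termplot(fit, terms = "wt",
         xlabs = "Weight (1000 lbs)",
         ylabs = "Partial effect on mpg")
dev.off()

# The argument list confirms the plural names
names(formals(termplot))
```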

R superimposing bivariate normal density (ellipses) on scatter plot

There are similar questions on the website, but I could not find an answer to this seemingly very simple problem. I fit a mixture of two gaussians on the Old Faithful Dataset:
if(!require("mixtools")) { install.packages("mixtools"); require("mixtools") }
data_f <- faithful
plot(data_f$waiting, data_f$eruptions)
data_f.k2 = mvnormalmixEM(as.matrix(data_f), k=2, maxit=100, epsilon=0.01)
data_f.k2$mu # estimated mean coordinates for the 2 multivariate Gaussians
data_f.k2$sigma # estimated covariance matrix
I simply want to superimpose two ellipses for the two Gaussian components of the model described by the mean vectors data_f.k2$mu and the covariance matrices data_f.k2$sigma. To get something like:
For those interested, here is the MatLab solution that created the plot above.
If you are interested in the colors as well, you can use the posterior to get the appropriate groups. I did it with ggplot2, but first I show the colored solution using @Julian's code.
# group data for coloring
data_f$group <- factor(apply(data_f.k2$posterior, 1, which.max))
# plotting
plot(data_f$eruptions, data_f$waiting, col = data_f$group)
for (i in 1: length(data_f.k2$mu)) ellipse(data_f.k2$mu[[i]],data_f.k2$sigma[[i]], col=i)
And for my version using ggplot2.
# needs ggplot2 package
require("ggplot2")
# ellipsis data
ell <- cbind(data.frame(group=factor(rep(1:length(data_f.k2$mu), each=250))),
             do.call(rbind, mapply(ellipse, data_f.k2$mu, data_f.k2$sigma,
                                   npoints=250, SIMPLIFY=FALSE)))
# plotting command
p <- ggplot(data_f, aes(color=group)) +
  geom_point(aes(waiting, eruptions)) +
  geom_path(data=ell, aes(x=`2`, y=`1`)) +
  theme_bw(base_size=16)
print(p)
print(p)
You can use the ellipse() function from package mixtools. The initial problem was that this function swaps x and y from your plot. I'll try to figure this out and update the answer. (I'll leave the colors to somebody else...)
plot( data_f$eruptions,data_f$waiting)
for (i in 1: length(data_f.k2$mu)) ellipse(data_f.k2$mu[[i]],data_f.k2$sigma[[i]])
Using mixtools internal plotting function:
plot.mixEM(data_f.k2, whichplots=2)
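If you would rather not depend on a plotting helper, the ellipse for one Gaussian component can be computed in base R from its mean vector and covariance matrix. The sketch below is not the mixtools implementation; ellipse_points() is a hypothetical helper, and mu and sigma are made-up values standing in for one element of data_f.k2$mu and data_f.k2$sigma.

```r
# Hypothetical helper: points on the `level` probability contour of a
# bivariate normal with mean `mu` and covariance `sigma`
ellipse_points <- function(mu, sigma, level = 0.95, n = 100) {
  theta  <- seq(0, 2 * pi, length.out = n)
  circle <- rbind(cos(theta), sin(theta))   # unit circle, 2 x n
  r      <- sqrt(qchisq(level, df = 2))     # contour radius in standardized units
  # chol() returns upper-triangular R with t(R) %*% R == sigma,
  # so t(R) maps the circle onto the covariance ellipse
  t(mu + r * t(chol(sigma)) %*% circle)     # n x 2 matrix of (x, y) points
}

# Made-up parameters in (eruptions, waiting) order, for illustration only
mu    <- c(4.3, 80)
sigma <- matrix(c(0.17, 0.94, 0.94, 36), nrow = 2)

pts <- ellipse_points(mu, sigma)
head(pts)
```

After plot(data_f$eruptions, data_f$waiting), a call such as lines(pts) would overlay the contour; swap the columns of pts if your plot has waiting on the x-axis.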

function lines() is not working

I have a problem with the function lines().
This is what I have written so far:
model.ew<-lm(Empl~Wage)
summary(model.ew)
plot(Empl,Wage)
mean<-1:500
lw<-1:500
up<-1:500
for(i in 1:500){
mean[i]<-predict(model.ew,data.frame(Wage=i*100),interval="confidence",level=0.90)[1]
lw[i]<-predict(model.ew,data.frame(Wage=i*100),interval="confidence",level=0.90)[2]
up[i]<-predict(model.ew,data.frame(Wage=i*100),interval="confidence",level=0.90)[3]
}
plot(Wage,Empl)
lines(mean,type="l",col="red")
lines(up,type="l",col="blue")
lines(lw,type="l",col="blue")
My problem is that no line appears on my plot and I cannot figure out why.
Can somebody help me?
You really need to read some introductory manuals for R. Go to this page, and select one that illustrates using R for linear regression: http://cran.r-project.org/other-docs.html
First we need to make some data:
set.seed(42)
Wage <- rnorm(100, 50)
Empl <- Wage + rnorm(100, 0)
Now we run your regression and plot the lines:
model.ew <- lm(Empl~Wage)
summary(model.ew)
plot(Empl~Wage) # Note. You had the axes flipped here
Your first problem was that you flipped the axes. The dependent variable (Empl) goes on the vertical axis. That is the main reason you didn't get any lines on the plot. To get the prediction lines requires no loops at all and only a single plot call using matlines():
xval <- seq(min(Wage), max(Wage), length.out=101)
conf <- predict(model.ew, data.frame(Wage=xval),
interval="confidence", level=.90)
matlines(xval, conf, col=c("red", "blue", "blue"))
That's all there is to it.
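One more detail worth knowing: when lines() is given a single vector, it plots that vector against its index (1, 2, 3, ...), not against Wage. In the question the three vectors were passed without x values, so they would have been drawn at x = 1 to 500, which may well fall outside the range of the plot. A sketch of the same result with lines() and explicit x values, reusing the simulated data from above:

```r
set.seed(42)
Wage <- rnorm(100, 50)
Empl <- Wage + rnorm(100, 0)
model.ew <- lm(Empl ~ Wage)

# Predict over the observed range of Wage; one call returns fit, lwr and upr
xval <- seq(min(Wage), max(Wage), length.out = 101)
conf <- predict(model.ew, data.frame(Wage = xval),
                interval = "confidence", level = 0.90)

pdf(file = NULL)   # null device; drop this line and dev.off() to see the plot
plot(Empl ~ Wage)
lines(xval, conf[, "fit"], col = "red")    # x values supplied explicitly
lines(xval, conf[, "lwr"], col = "blue")
lines(xval, conf[, "upr"], col = "blue")
dev.off()
```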

Plot power of a straight line not a curve

So I am using the following script:
area <- c(1854,2001,2182,2520,4072,1627,1308,1092,854,1223,2231,1288,898,2328,1660,6018,5420,943,1625,1095,1484,929,1178,4072,2413)
weight1 <- c(24281,28474,33725,40707,76124,16263,12190,10153,8631,13690,34408,15375,8806,36245,20506,109489,104014,11308,23262,11778,20650,8771,12356,76124,28346)
weight <- weight1/1000
df <- data.frame(weight = log10(weight), area = log10(area))
fit_line <- predict(lm(area ~ weight, data=df))
fit_power <- predict(nls(area ~ i*weight^z, start=list(i=2,z=0.7), data=df))
plot(df$weight,df$area)
lines(df$weight,fit_line,col="red")
lines(sort(df$weight),sort(fit_power), col="blue")
I want to do a log-log plot. I can plot a straight line with lm(), but when I use nls() to do a power fit, it plots a curve and not a straight line, see below:
How do I plot the power fit in the form of a straight line, or how can I derive it from lm(), so that I have the answer in the form y = a*x^b?
Your plot is not a log plot. To do a log plot:
plot(log(area)~log(weight), df)
Then to fit a line:
LM.Log <- lm(log(area)~log(weight), df)
abline(LM.Log, col="red")
And to do a curved line through a straight plot more efficiently:
Power <- coef(LM.Log)[2]
LM.Normal <- lm(area~I(weight^Power)+0, df)
plot(area~weight, df)
plot(function(x) coef(LM.Normal)*x^Power, 0, 2, add=T, col="blue")
Perhaps the following will be instructive...
df <- data.frame(weight, area, weightl = log10(weight), areal = log10(area))
df <- df[order(df$weight),]
fit_line <- predict(lm(areal ~ weightl, data=df))
fit_power <- predict(nls(area ~ i*weight^z, start=list(i=2,z=0.7), data=df))
plot(df$weightl, df$areal)
lines(df$weightl, fit_line, col="red")
lines(df$weightl, log10(fit_power), col="blue")
plot(df$weight, df$area)
lines(df$weight, 10^fit_line, col="red")
lines(df$weight, fit_power, col="blue")
I guessed, I hope correctly, that you really want a power curve through the raw values and you're taking log10 as a proxy for such. So, what you need to do is get predicted values of the raw weight / area relations and then log those and put everything on a log graph. Or get the linear fit of the log values and put both as curves on a raw graph. Examine both of the plots produced above.
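To get the coefficients in the form y = a*x^b from lm(), note that a straight line on the log10 scale, log10(y) = c0 + b*log10(x), is the same as y = 10^c0 * x^b. A minimal sketch using the data from the question (the names a and b are chosen here to match the requested form; this is the log-scale least-squares fit, which will generally differ a little from the nls() fit on the raw scale):

```r
area <- c(1854,2001,2182,2520,4072,1627,1308,1092,854,1223,2231,1288,898,
          2328,1660,6018,5420,943,1625,1095,1484,929,1178,4072,2413)
weight1 <- c(24281,28474,33725,40707,76124,16263,12190,10153,8631,13690,
             34408,15375,8806,36245,20506,109489,104014,11308,23262,11778,
             20650,8771,12356,76124,28346)
weight <- weight1 / 1000

# Straight-line fit on the log10-log10 scale
fit <- lm(log10(area) ~ log10(weight))

b <- coef(fit)[[2]]       # exponent: slope of the log-log line
a <- 10^coef(fit)[[1]]    # multiplier: back-transformed intercept

# Power curve on the raw scale, equivalent to the straight line on the log scale
power_fit <- a * weight^b
c(a = a, b = b)
```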
