Understanding the Local Polynomial Regression - r

Could someone explain why I get different lines when I plot? I thought the two lines should be the same.
library(sm) # the aircraft data set ships with the sm package
data(aircraft)
help(aircraft)
attach(aircraft)
lgWeight <- log(Weight)
library(KernSmooth)
# a) Fit a nonparametric regression to the data (x_i, y_i) and save the estimated values m_hat(x_i).
# Local polynomial regression (degree 2) of lgWeight on Yr
op <- par(mfrow=c(2,1))
lpr1 <- locpoly(Yr,lgWeight, bandwidth=7, degree = 2, gridsize = length(Yr))
plot(Yr,lgWeight,col="grey", ylab="Log(Weight)", xlab = "Year")
lines(lpr1,lwd=2, col="blue")
lines(lpr1$y, col="black")
How can I get the values from the model? If I print the model, it gives me the values in $x and $y, but when I plot them the result is not the same as the blue line. I need the values of the fitted model (blue) for every x. Could someone help me?

The fitted model (blue curve) is correctly in lpr1. As you said, the correct y-values are in lpr1$y and the correct x-values are in lpr1$x.
The second line looks straight because you pass lines() only one variable, lpr1$y. Since you don't specify the x-coordinates, R plots the values against their index, from 1 to length(lpr1$y).
The following two calls are explicit equivalents of your two lines() calls:
lines(x = lpr1$x, y = lpr1$y,lwd=2, col="blue") # plots curve
lines(x = 1:length(lpr1$y), y = lpr1$y, col="black") # plot line
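
Note that locpoly() returns the curve on an equally spaced grid, not at the observed x values, even when gridsize equals the number of observations. If fitted values at each observation are needed, one option is to interpolate the returned curve. The following is only a sketch, assuming linear interpolation on a dense grid is accurate enough; the simulated data and the names fit and fitted_at_x are made up for illustration:

```r
library(KernSmooth)

# Simulated data standing in for (Yr, lgWeight); purely illustrative
set.seed(1)
x <- sort(runif(200, 0, 10))
y <- sin(x) + rnorm(200, sd = 0.3)

# Local quadratic fit evaluated on a dense, equally spaced grid
fit <- locpoly(x, y, degree = 2, bandwidth = 1, gridsize = 401)

# Interpolate the gridded curve at the observed x values;
# rule = 2 reuses the end value if a point falls just outside the grid
fitted_at_x <- approx(fit$x, fit$y, xout = x, rule = 2)$y

plot(x, y, col = "grey")
lines(fit, lwd = 2, col = "blue")              # gridded curve
points(x, fitted_at_x, pch = 16, cex = 0.4)    # fitted value per observation
```

The interpolated points should sit on the blue curve; a denser grid reduces the interpolation error.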

Related

`abline` does not add line when producing regression diagnostic plots with `par()`

I am using the par() function to draw a multi-panel plot and I want to add a line to the second plot only...
par(mfrow = c(2, 2))
hist(model$residuals) # model is some predefined lm object
plot((model$residuals + model$fitted.values) ~ model$fitted.values)
# Now I want to add a line (or points or curve) to only above plot like
abline(model$coef) # but this doesn't work
qqnorm(model$residuals) # some more plots, doesn't matter which
Any help? I do not intend to use ggplot() and want to keep it simple.
The problem is not with par, as you think; you are simply passing inappropriate values to abline. You changed your question several times, which suggests you are not sure which line should be added to each plot. I will clarify this, assuming mod is your fitted model.
residuals vs. fitted
with(mod, plot(fitted.values, residuals))
abline(h = 0) ## residuals are centred, so we want a horizontal line
fitted vs. response
with(mod, plot(fitted.values + residuals, fitted.values))
abline(0, 1) ## perfect fit has `fitted = response`, so we want line `y = x`
scatter plot with regression line
v <- attr(mod$terms, "term.labels") ## independent variable name
with(mod, plot(model[[v]], fitted.values + residuals)) ## scatter plot
abline(mod$coef) ## or simply `abline(mod)`, to add the regression line
reproducible example
set.seed(0)
xx <- rnorm(100)
yy <- 1.3 * xx - 0.2 + rnorm(100, sd = 0.5)
mod <- lm(yy ~ xx)
rm(xx, yy)
par(mfrow = c(2,2))
with(mod, plot(fitted.values, residuals))
abline(h = 0)
with(mod, plot(fitted.values + residuals, fitted.values))
abline(0, 1)
v <- attr(mod$terms, "term.labels") ## independent variable name
with(mod, plot(model[[v]], fitted.values + residuals)) ## scatter plot
abline(mod$coef) ## or simply `abline(mod)`
As @ZheyuanLi says, it's hard to see exactly what you want. Some of your problems appear to be from adding lines that don't overlap with the existing plot limits.
model <- lm(Illiteracy~Income,data.frame(state.x77))
par(mfrow = c(2, 2))
hist(model$residuals)
plot(model$residuals ~ model$fitted.values)
plot((model$residuals+model$fitted.values) ~ model$fitted.values)
Adding elements immediately after the plot works fine:
abline(a=0,b=1)
What if you want to go back and add elements to a previous frame? That's a bit difficult. Reset plot to row 1, column 2: this does not put us inside the plotting frame of the previous plot, it just gets us ready to plot in this subframe.
par(mfg=c(1,2))
We want to set up the same plot frame again: we'll cheat by plotting the same thing again (ensuring the same axis limits, etc. etc.), but turning off all aspects of the plot (new=FALSE means we don't blank out the previous plot):
plot(model$residuals ~ model$fitted.values,
type="n",new=FALSE,axes=FALSE,ann=FALSE)
abline(h=0,col=2)
Base graphics are really not designed for modifying existing plots; if you want to do much of it, you should look into the grid graphics system (which lattice and ggplot2 graphics are built on).
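
To check whether a line falls outside the current plot limits, as suggested above, one can compare it with par("usr") right after drawing the panel. This is only a sketch; the variable names usr, y_at_edges and line_visible are made up for illustration:

```r
model <- lm(Illiteracy ~ Income, data = data.frame(state.x77))

plot(residuals(model) ~ fitted(model))

# par("usr") gives the limits of the current plot region: c(x1, x2, y1, y2)
usr <- par("usr")

# Height of the line y = a + b * x at the left and right edges of the plot
cf <- coef(model)
y_at_edges <- cf[1] + cf[2] * usr[1:2]

# The line is visible only if its y-range overlaps the y-limits of the plot
line_visible <- max(y_at_edges) >= usr[3] && min(y_at_edges) <= usr[4]
line_visible
```

If line_visible is FALSE, abline(model$coef) draws nothing visible on that panel, even though the call itself succeeds.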

Plotting Modeled Data as my Regression Curve in R/RStudio

I have two numerical data sets; one experimental, and one modeled. I have plotted the data set as a scatter plot in R. The experimental data are:
x=c(1.11,2.84,3.97,6.40,7.60,7.43,5.75,3.60,3.59)
y=c(0.973,0.818,0.74,0.44,0.2688,0.282,0.50,0.613,0.656)
This is a pretty straightforward scatter plot. I also have some modeled data I acquired using an equation. These data need to be superimposed on this plot. The modeled data are as follows:
x = c(1,2,3,4,5,6,7,8,9,10,11,12)
y = c(.994,.954,.860,.721,.570,.434,.362,.244,.185,.142,.110,.087)
How do I take these data, make a regression curve with them, and use them to generate some simple statistics comparing the modeled data to the experimental data?
Thanks!
First plot the modeled data as a line:
x = c(1,2,3,4,5,6,7,8,9,10,11,12)
y = c(.994,.954,.860,.721,.570,.434,.362,.244,.185,.142,.110,.087)
plot(x, y, type = "l")
Then add the experimental points:
x=c(1.11,2.84,3.97,6.40,7.60,7.43,5.75,3.60,3.59)
y=c(0.973,0.818,0.74,0.44,0.2688,0.282,0.50,0.613,0.656)
points(x, y, col = "red")
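
For the simple statistics the question asks about, one possible approach is to evaluate the modeled curve at the experimental x values and summarise the differences. This is a sketch only: linear interpolation between the modeled points via approx() is an assumption, and the names x_exp, y_exp, x_mod, y_mod and the chosen summaries are illustrative rather than part of the original answer:

```r
# Experimental data
x_exp <- c(1.11, 2.84, 3.97, 6.40, 7.60, 7.43, 5.75, 3.60, 3.59)
y_exp <- c(0.973, 0.818, 0.74, 0.44, 0.2688, 0.282, 0.50, 0.613, 0.656)

# Modeled data
x_mod <- 1:12
y_mod <- c(.994, .954, .860, .721, .570, .434, .362, .244, .185, .142, .110, .087)

# Modeled value at each experimental x (linear interpolation, an assumption)
y_mod_at_exp <- approx(x_mod, y_mod, xout = x_exp)$y

# Simple agreement summaries
res  <- y_exp - y_mod_at_exp
rmse <- sqrt(mean(res^2))          # root mean squared error
bias <- mean(res)                  # mean difference (experiment minus model)
r    <- cor(y_exp, y_mod_at_exp)   # linear correlation

c(rmse = rmse, bias = bias, r = r)
```

If the model equation itself is available, evaluating it directly at x_exp would avoid the interpolation step.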

Calibration (inverse prediction) from LOESS object in R

I have fit a LOESS local regression to some data and I want to be able to find the X value associated with a given Y value.
plot(cars, main = "Stopping Distance versus Speed")
car_loess <- loess(cars$dist~cars$speed,span=.5)
lines(1:50, predict(car_loess,data.frame(speed=1:50)))
I was hoping that I could use the inverse.predict function from the chemCal package, but that does not work for LOESS objects.
Does anyone have any idea how I might be able to do this calibration in a better way than predicting Y values from a long vector of X values, looking through the resulting fitted Y for the Y value of interest, and taking its corresponding X value?
Practically speaking in the above example, let's say I wanted to find the speed at which the stopping distance is 15.
Thanks!
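The predicted line that you added to the plot is not quite right. Because the model formula refers to cars$speed directly, predict() does not pick up the speed column of the new data frame, so the values it returns are drawn against 1:50 rather than against the actual speeds. Use code like this instead: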
# plot the loess line
lines(cars$speed, car_loess$fitted, col="red")
You can use the approx() function to get a linear approximation from the loess line at a given y value. It works just fine for the example that you give:
# define a given y value at which you wish to approximate x from the loess line
givenY <- 15
estX <- approx(x=car_loess$fitted, y=car_loess$x, xout=givenY)$y
# add corresponding lines to the plot
abline(h=givenY, lty=2)
abline(v=estX, lty=2)
But, with a loess fit, there may be more than one x for a given y. The approach I am suggesting does not provide you with ALL of the x values for the given y. For example ...
# example with non-monotonic x-y relation
y <- c(1:20, 19:1, 2:20)
x <- seq(y)
plot(x, y)
fit <- loess(y ~ x)
# plot the loess line
lines(x, fit$fitted, col="red")
# define a given y value at which you wish to approximate x from the loess line
givenY <- 15
estX <- approx(x=fit$fitted, y=fit$x, xout=givenY)$y
# add corresponding lines to the plot
abline(h=givenY, lty=2)
abline(v=estX, lty=2)
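
To recover every x at which the fitted curve crosses a given y, one option is to bracket the sign changes of fitted minus target on a fine grid and refine each bracket with uniroot(). This is a sketch, not part of the original answer: the span of 0.3 is an assumption chosen so the fit follows the zigzag, and the names f, grid, idx and roots are illustrative:

```r
# Non-monotonic example data, as above
y <- c(1:20, 19:1, 2:20)
x <- seq_along(y)
fit <- loess(y ~ x, span = 0.3)  # span = 0.3 is an assumption

given_y <- 15

# Difference between the loess prediction and the target value
f <- function(x0) predict(fit, data.frame(x = x0)) - given_y

# Evaluate on a fine grid and find intervals where the sign changes
grid <- seq(min(x), max(x), length.out = 500)
vals <- f(grid)
idx  <- which(diff(sign(vals)) != 0)

# Refine each bracketing interval with a root finder
# (a grid point that hits the target exactly would be reported twice)
roots <- vapply(idx, function(i) {
  uniroot(f, lower = grid[i], upper = grid[i + 1], tol = 1e-8)$root
}, numeric(1))

plot(x, y)
lines(x, fit$fitted, col = "red")
abline(h = given_y, lty = 2)
abline(v = roots, lty = 2)
```

A crossing can be missed if the curve touches the target without changing sign, or crosses twice between neighbouring grid points, so the grid resolution matters.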

How to plot a linear regression to a double logarithmic R plot?

I have the following data:
someFactor = 500
x = c(1:250)
y = x^-.25 * someFactor
which I show in a double logarithmic plot:
plot(x, y, log="xy")
Now I "find out" the slope of the data using a linear model:
model = lm(log(y) ~ log(x))
model
which gives:
Call:
lm(formula = log(y) ~ log(x))
Coefficients:
(Intercept) log(x)
6.215 -0.250
Now I'd like to plot the linear regression as a red line, but abline does not work:
abline(model, col="red")
What is the easiest way to add a regression line to my plot?
lines(log(x), exp(predict(model, newdata=list(x=log(x)))) ,col="red")
The range of values for x plotted on the log-scale and for log(x) being used as the independent variable are actually quite different. This will give you the full range:
lines(x, exp(predict(model, newdata=list(x=x))) ,col="red")
Your line is being plotted; you just can't see it in the window because the values are quite different. When you include the log='xy' argument, the space underneath the plot (so to speak) is distorted (stretched and/or compressed), but the original numbers are still being used. (Imagine you are plotting these points by hand on graph paper; you are still marking a point where the faint blue graph lines for, say, (1,500) cross, but the graph paper has been continuously stretched such that the lines are not equally spaced anymore.) Your model, on the other hand, uses the transformed data.
You need to make your plot with the same transformed data as your model, and then simply re-mark your axes in a way that will be sufficiently intuitively accessible. This is a first try:
plot(log(x), log(y), axes=FALSE, xlab="X", ylab="Y")
box()
axis(side=1, at=log(c(1,2, 10,20, 100,200)),
labels=c( 1,2, 10,20, 100,200))
axis(side=2, at=log(c(125,135, 250,260, 350, 500)),
labels=c( 125,135, 250,260, 350, 500))
abline(model, col="red")
Instead of transforming the axes, plot the log-transformed x and y.
plot(log(x), log(y))
abline(model, col="red")
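
Another option keeps the log="xy" axes. On log-scaled axes abline() draws a straight line in base-10 log coordinates, so a model fitted with log10() can be passed to it directly, while natural-log coefficients need their intercept rescaled. The sketch below is not taken from the answers above, and the names model10 and model_ln are illustrative:

```r
some_factor <- 500
x <- 1:250
y <- x^-0.25 * some_factor

plot(x, y, log = "xy")

# Fit in base-10 logs so the coefficients match the plot's internal coordinates
model10 <- lm(log10(y) ~ log10(x))
abline(model10, col = "red")

# Equivalent line from a natural-log fit: the slope is unchanged,
# the intercept is divided by log(10) to convert ln units to log10 units
model_ln <- lm(log(y) ~ log(x))
abline(a = coef(model_ln)[1] / log(10), b = coef(model_ln)[2],
       col = "blue", lty = 2)
```

Both lines should coincide and pass through the plotted points.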

Only plotting the fitted spline line and not the data points

I have checked my references, and it seems that to fit a dataset with x and y, many tutorials first plot x and y and then add the fitted line. The usual procedure looks like this:
## Calculate the fitted line
smoothingSpline = smooth.spline(tree_number[2:100], jaccard[1:99], spar=0.35)
plot(tree_number[2:100],jaccard[1:99]) #plot the data points
lines(smoothingSpline) # add the fitted spline line.
However, I do not want to plot tree_number and jaccard; I only want to plot the fitted spline line. How should I do this?
You can call plot() directly on the fitted object:
plot(smoothingSpline, type="l")
Or you can extract the x and y values explicitly and plot them
plot(smoothingSpline$x, smoothingSpline$y, type="l")
Why not just plot(smoothingSpline, type = "l")? That should allow you to add the fitted spline line without having to first plot the data points.
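
If a smoother-looking curve is wanted, the spline can also be evaluated on a finer grid with predict() before plotting. This is a sketch with simulated data, since tree_number and jaccard are not given in the question; the names x, y, sp and pred are illustrative:

```r
# Simulated stand-in for tree_number and jaccard; purely illustrative
set.seed(42)
x <- 1:99
y <- exp(-x / 30) + rnorm(99, sd = 0.05)

sp <- smooth.spline(x, y, spar = 0.35)

# Evaluate the fitted spline on a fine grid; predict() returns a list with x and y
grid <- seq(min(x), max(x), length.out = 300)
pred <- predict(sp, grid)

# Plot only the fitted curve, without the data points
plot(pred$x, pred$y, type = "l", xlab = "x", ylab = "fitted spline")
```

Setting ylim in the plot() call is an easy way to keep the axis range comparable to a plot of the raw data.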
