How to plot a linear regression to a double logarithmic R plot?

I have the following data:
someFactor = 500
x = c(1:250)
y = x^-.25 * someFactor
which I show in a double logarithmic plot:
plot(x, y, log="xy")
Now I "find out" the slope of the data using a linear model:
model = lm(log(y) ~ log(x))
model
which gives:
Call:
lm(formula = log(y) ~ log(x))
Coefficients:
(Intercept) log(x)
6.215 -0.250
Now I'd like to plot the linear regression as a red line, but abline does not work:
abline(model, col="red")
What is the easiest way to add a regression line to my plot?

lines(log(x), exp(predict(model, newdata=list(x=log(x)))) ,col="red")
The x values plotted on the log-scale axis and the log(x) values used as the independent variable cover quite different ranges, so the line above spans only part of the plot. This will give you the full range:
lines(x, exp(predict(model, newdata=list(x=x))) ,col="red")

Your line is being plotted; you just can't see it in the window because the values are quite different. When you include the log='xy' argument, the space underneath the plot (so to speak) is distorted (stretched and/or compressed), but the original numbers are still being used. (Imagine you are plotting these points by hand on graph paper; you are still marking a point where the faint blue graph lines for, say, (1,500) cross, but the graph paper has been continuously stretched such that the lines are not equally spaced anymore.) Your model, on the other hand, is fitted to the transformed data.
You need to make your plot with the same transformed data as your model, and then simply relabel your axes with the original values so that they are easy to read. This is a first try:
plot(log(x), log(y), axes=FALSE, xlab="X", ylab="Y")
box()
axis(side=1, at=log(c(1, 2, 10, 20, 100, 200)),
     labels=c(1, 2, 10, 20, 100, 200))
axis(side=2, at=log(c(125, 135, 250, 260, 350, 500)),
     labels=c(125, 135, 250, 260, 350, 500))
abline(model, col="red")

Instead of transforming the axes, plot the log-transformed x and y.
plot(log(x), log(y))
abline(model, col="red")
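If you would rather keep the original plot(x, y, log="xy") call, one further option is to fit the model in base-10 logarithms. On logarithmic axes, abline() draws its line in the transformed (log10) coordinate system, so the coefficients of a log10 model line up with the plot, whereas those of a natural-log model do not. A minimal sketch, reusing the data from the question (the name model10 is only illustrative):

```r
# Data from the question
someFactor <- 500
x <- 1:250
y <- x^-0.25 * someFactor

# Double logarithmic plot, as in the question
plot(x, y, log = "xy")

# Fit in base-10 logs so the coefficients match the log10 axis coordinates
model10 <- lm(log10(y) ~ log10(x))

# abline() now draws the regression line directly on the log-log plot
abline(model10, col = "red")
```

The slope is the same as in the natural-log model; only the intercept changes, because it is now expressed in log10 units.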

Related

Smoothing over a Polynomial curve

Given data that looks like this:
x<-c(0.287,0.361,0.348,0.430,0.294)
y<-c(105,230,249,758,379)
I'm trying to fit several different methods to this data. For this question I'm looking at 2nd order polynomial fits vs Loess fits. To get a smoother curve, I'd like to expand the x data to give me more points to predict over. So for my Loess curve I do this:
Loess_Fit<-loess(y ~ x)
MakeSmooth<-seq(min(x), max(x), (max(x)-min(x))/1000)
plot(x,y)
#WithoutSmoothing
lines(x=sort(x), y=predict(Loess_Fit)[order(x)], col="red", type="l")
#With Smoothing
lines(x=sort(MakeSmooth), y=predict(Loess_Fit,MakeSmooth)[order(MakeSmooth)], col="blue", type="l")
When I attempt to do the same thing with a 2nd order polynomial fit, I get an error:
Poly2<-lm(y ~ poly(x,2,raw=TRUE))
plot(x,y)
#WithoutSmoothing
lines(x=sort(x), y=predict(Poly2)[order(x)], col="red", type="l")
#With Smoothing
lines(x=sort(MakeSmooth), y=predict(Poly2,MakeSmooth)[order(MakeSmooth)], col="blue", type="l")
Obviously there is some difference between Poly2 and Loess_Fit, but I don't know what the difference is. Is there a way to smooth out the Poly2 fit as I did with the Loess_Fit?
For lm, the new data needs to be a data frame whose column name matches the predictor in the formula (here x):
lines(x=sort(MakeSmooth), y=predict(Poly2,data.frame(x=MakeSmooth))[order(MakeSmooth)], col="blue", type="l")
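As a self-contained sketch of the same idea, the version below builds the prediction grid with length.out and passes it to predict() as a data frame with a column named x. The names grid and smooth_y are illustrative and not from the original post; because the grid is already sorted, no sort() or order() calls are needed.

```r
# Data and model from the question
x <- c(0.287, 0.361, 0.348, 0.430, 0.294)
y <- c(105, 230, 249, 758, 379)
Poly2 <- lm(y ~ poly(x, 2, raw = TRUE))

# Fine, already sorted grid over the observed range of x
grid <- seq(min(x), max(x), length.out = 1001)

# newdata must be a data frame (or list) with a column named like the predictor
smooth_y <- predict(Poly2, newdata = data.frame(x = grid))

plot(x, y)
lines(grid, smooth_y, col = "blue")
```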

Understanding the Local Polynomial Regression

Could someone explain to me why I get different lines when I plot? Somehow I thought the lines should be the same.
library(sm) # the aircraft data set ships with the sm package
data(aircraft)
help(aircraft)
attach(aircraft)
lgWeight <- log(Weight)
library(KernSmooth)
# a) Fit a nonparametric regression to the data (x_i, y_i) and save the estimated values m-hat(x_i).
# Regression of degree 2 polynomial of lgWeight against Yr
op <- par(mfrow=c(2,1))
lpr1 <- locpoly(Yr,lgWeight, bandwidth=7, degree = 2, gridsize = length(Yr))
plot(Yr,lgWeight,col="grey", ylab="Log(Weight)", xlab = "Year")
lines(lpr1,lwd=2, col="blue")
lines(lpr1$y, col="black")
How can I get the values from the model? If I print the model, it gives me the values in $x and $y, but if I plot them, the result is not the same as the blue line. I need the values of the fitted model (blue) for every x. Could someone help me?
The fitted model (blue curve) is stored correctly in lpr1. As you said, the correct y-values are in lpr1$y and the correct x-values are in lpr1$x.
The second line looks straight because you are giving the lines function only one variable, lpr1$y. Since you don't specify the x-coordinates, R plots the values against their index, from 1 to the length of the y variable.
The following are two explicit and equivalent ways to plot the curve and line:
lines(x = lpr1$x, y = lpr1$y,lwd=2, col="blue") # plots curve
lines(x = 1:length(lpr1$y), y = lpr1$y, col="black") # plot line
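Note that locpoly() returns its estimates on an equally spaced grid, not at the observed x values. If fitted values are needed at every observation, one possible approach is to interpolate the grid estimates with approx(). The sketch below uses simulated data in place of the aircraft data, and the names fit and fit_at_x are illustrative assumptions rather than anything from the original post.

```r
library(KernSmooth)

# Simulated stand-in for the aircraft data (assumption, for illustration only)
set.seed(1)
x <- sort(runif(200, 0, 10))
y <- sin(x) + rnorm(200, sd = 0.2)

# Local quadratic fit, evaluated on an equally spaced grid over range(x)
fit <- locpoly(x, y, degree = 2, bandwidth = 0.5)

# Linearly interpolate the grid estimates at the observed x values;
# rule = 2 guards against NA at the two endpoints
fit_at_x <- approx(fit$x, fit$y, xout = x, rule = 2)$y

plot(x, y, col = "grey")
lines(fit$x, fit$y, lwd = 2, col = "blue")
points(x, fit_at_x, pch = 20, cex = 0.5)
```

With the default gridsize the interpolation error is small compared with the smoothing itself; a larger gridsize reduces it further.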

How do I plot an abline() when I don't have any data points (in R)

I have to plot a few different simple linear models on a chart, the main point being to comment on them. I have no data for the models. I can't get R to create a plot with appropriate axes, i.e. I can't get the range of the axes correct. I think I'd like my y-axis to run from 0 to 400 and my x-axis from 0 to 50.
Models are:
$$
\widehat y=108+0.20x_1
$$
$$
\widehat y=101+2.15x_1
$$
$$
\widehat y=132+0.20x_1
$$
$$
\widehat y=119+8.15x_1
$$
I know I could possibly do this much more easily in a different software or create a dataset from the model and estimate and plot the model from that but I'd love to know if there is a better way in R.
As @Glen_b noticed, type = "n" in plot produces a plot with nothing on it. As it demands data, you have to provide something as x; it can be NA, or some data. If you provide actual data, the plot function will work out the axis limits from the data; otherwise you have to choose the limits by hand using the xlim and ylim arguments. Next, you use abline, which has parameters a and b for intercept and slope (or h and v if you want just a horizontal or vertical line).
plot(x=NA, type="n", ylim=c(100, 250), xlim=c(0, 50),
xlab=expression(x[1]), ylab=expression(hat(y)))
abline(a=108, b=0.2, col="red")
abline(a=101, b=2.15, col="green")
abline(a=132, b=0.2, col="blue")
abline(a=119, b=8.15, col="orange")
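With the fixed ylim above, the steepest line leaves the plotting window well before x reaches 50. One way to avoid choosing the limits by hand is to evaluate every model at both ends of the x range and take the overall range as ylim. This is only a sketch; the data frame models and the variable ends are illustrative names, not part of the original answer.

```r
# Intercepts (a), slopes (b) and colours of the four models from the question
models <- data.frame(
  a   = c(108, 101, 132, 119),
  b   = c(0.20, 2.15, 0.20, 8.15),
  col = c("red", "green", "blue", "orange"),
  stringsAsFactors = FALSE
)

xlim <- c(0, 50)

# Each row holds one model evaluated at the two x limits: a + b * x
ends <- outer(models$b, xlim) + models$a
ylim <- range(ends)

# Empty plot with limits that contain every line over the full x range
plot(x = NA, type = "n", xlim = xlim, ylim = ylim,
     xlab = expression(x[1]), ylab = expression(hat(y)))

for (i in seq_len(nrow(models))) {
  abline(a = models$a[i], b = models$b[i], col = models$col[i])
}
```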

Calibration (inverse prediction) from LOESS object in R

I have fit a LOESS local regression to some data and I want to be able to find the X value associated with a given Y value.
plot(cars, main = "Stopping Distance versus Speed")
car_loess <- loess(cars$dist~cars$speed,span=.5)
lines(1:50, predict(car_loess,data.frame(speed=1:50)))
I was hoping that I could use the inverse.predict function from the chemCal package, but that does not work for LOESS objects.
Does anyone have any idea how I might do this calibration in a better way than predicting Y values from a long vector of X values, searching the fitted Y values for the Y value of interest, and taking its corresponding X value?
Practically speaking in the above example, let's say I wanted to find the speed at which the stopping distance is 15.
Thanks!
The predicted line that you added to the plot is not quite right. Use code like this instead:
# plot the loess line
lines(cars$speed, car_loess$fitted, col="red")
You can use the approx() function to get a linear approximation from the loess line at a given y value. It works just fine for the example that you give:
# define a given y value at which you wish to approximate x from the loess line
givenY <- 15
estX <- approx(x=car_loess$fitted, y=car_loess$x, xout=givenY)$y
# add corresponding lines to the plot
abline(h=givenY, lty=2)
abline(v=estX, lty=2)
But, with a loess fit, there may be more than one x for a given y. The approach I am suggesting does not provide you with ALL of the x values for the given y. For example ...
# example with non-monotonic x-y relation
y <- c(1:20, 19:1, 2:20)
x <- seq(y)
plot(x, y)
fit <- loess(y ~ x)
# plot the loess line
lines(x, fit$fitted, col="red")
# define a given y value at which you wish to approximate x from the loess line
givenY <- 15
estX <- approx(x=fit$fitted, y=fit$x, xout=givenY)$y
# add corresponding lines to the plot
abline(h=givenY, lty=2)
abline(v=estX, lty=2)
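To recover every x at which the fitted curve reaches the given y, one possible approach is to predict on a dense grid, look for sign changes of (fitted value minus target), and interpolate linearly within each interval where the sign flips. The helper find_crossings below is a hypothetical function written for this sketch, not part of any package, and span = 0.3 is an arbitrary choice so that the fit follows the zigzag more closely.

```r
# Hypothetical helper: all x positions where the curve (x, yhat) crosses target
find_crossings <- function(x, yhat, target) {
  d <- yhat - target
  n <- length(d)
  # intervals [x[i], x[i + 1]] over which d changes sign
  idx <- which(d[-n] * d[-1] < 0)
  # linear interpolation inside each such interval
  interpolated <- x[idx] - d[idx] * (x[idx + 1] - x[idx]) / (d[idx + 1] - d[idx])
  # grid points that hit the target exactly are returned as well
  sort(c(x[d == 0], interpolated))
}

# Non-monotonic example from above
y <- c(1:20, 19:1, 2:20)
x <- seq_along(y)
fit <- loess(y ~ x, span = 0.3)

# Dense prediction grid inside the observed range of x
grid <- seq(min(x), max(x), length.out = 1000)
yhat <- predict(fit, newdata = data.frame(x = grid))

givenY <- 15
allX <- find_crossings(grid, yhat, givenY)

plot(x, y)
lines(grid, yhat, col = "red")
abline(h = givenY, lty = 2)
abline(v = allX, lty = 2)
```

How many crossings are found depends on how strongly the loess fit smooths the data, so the result should be checked against the plot.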

Scatter plot in R

I want a scatterplot in R for points (x, y), where 500 sample points of x and y are drawn from the normal distributions N(0,1) and N(0,16). The points from N(0,1) should be marked in red and those from N(0,16) in blue.
I am new to R and know only basic plotting. Can anyone please help me with this?
Thanks
As the comments suggest, you have two possibilities with data from
x <- rnorm(500, mean=0, sd=sqrt(1))
y <- rnorm(500, mean=0, sd=sqrt(16))
One way would be to plot y against x, but in that case your colour suggestions do not mean anything:
plot(y ~ x)
Alternatively you can show the two sets of data with colours, remembering that y probably has a wider range than x.
plot(y, col="blue")
points(x, col="red")
The plots you get are: [figures not included]
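A third reading of the question, which is an assumption and not something the original post confirms, is that there are two separate clouds of 500 (x, y) points each: one with both coordinates drawn from N(0,1) and one with both drawn from N(0,16). A minimal sketch under that assumption, with the illustrative names narrow and wide:

```r
set.seed(42)
n <- 500

# Both coordinates from N(0, 1): standard deviation 1
narrow <- data.frame(x = rnorm(n, mean = 0, sd = 1),
                     y = rnorm(n, mean = 0, sd = 1))

# Both coordinates from N(0, 16): variance 16, so standard deviation 4
wide <- data.frame(x = rnorm(n, mean = 0, sd = 4),
                   y = rnorm(n, mean = 0, sd = 4))

# Draw the wider cloud first so the axes cover both samples
plot(wide$x, wide$y, col = "blue", xlab = "x", ylab = "y")
points(narrow$x, narrow$y, col = "red")
legend("topright", legend = c("N(0, 1)", "N(0, 16)"),
       col = c("red", "blue"), pch = 1)
```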
