using lines() with 'multiple x entries' - r

I'm looking for a way to plot a nonlinear regression line on a data set where every value in my vector y is being stored multiple times, so I tried to use something like:
x <- c(1,2,3,4,5,6,7,8,9,10)
y <- c(1,4,9,15,25,9,36,25,36,25)
reg4 <- lm( x ~ y + I(y^2) )
plot(x ~ y)
lines(y, predict(reg4), type="l", col="red", lwd=1)
this gives http://i.imgur.com/qSEVNdT.png
So my question is: is there a way to, let's say, use some sort of mean value for each y entry? Or just make it a 'continuous' line instead of something that branches off into multiple lines or returns to a lower y value at the points where there are multiple 'entries'?

In these cases, it is best to predict from the model over the range of the covariate. Do this at, say, 50 or 100 locations equally spaced over the range of the covariate (here that is y, because the model is x ~ y + I(y^2)), and increase or decrease the number of locations as needed; more complex responses will need more locations. Doing this also solves the spaghetti plot issue, because the newdata supplied are already sorted by the covariate.
x <- c(1,2,3,4,5,6,7,8,9,10)
y <- c(1,4,9,15,25,9,36,25,36,25)
reg4 <- lm( x ~ y + I(y^2) )
## predictions
pred <- data.frame(y = seq(min(y), max(y), length = 100))
pred <- transform(pred, x = predict(reg4, newdata = pred))
## plot
plot(x ~ y)
lines(x ~ y, data = pred, type = "l", col = "red", lwd = 1)

The problem does not come from the ties in the data:
for a given value of y, there is only one forecast.
The problem is that the points are not sorted,
so that when you join them, you end up with a tangle of lines.
You can use order to reorder the points.
plot(
x ~ y,
xlab = "y", ylab = "x" # Confusing...
)
i <- order(y)
lines( y[i], predict(reg4)[i] )
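Both claims above can be checked directly. The following is a minimal sketch using the question's data; the names fits and spread are introduced here for illustration only. It confirms that tied y values share one fitted value and that indexing by order(y) makes the covariate non-decreasing, so the joined line cannot double back.

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
y <- c(1, 4, 9, 15, 25, 9, 36, 25, 36, 25)
reg4 <- lm(x ~ y + I(y^2))

fits <- predict(reg4)

# Spread of the fitted values within each distinct y value.
# A spread of (numerically) zero means tied y values get the same forecast.
spread <- tapply(fits, y, function(v) diff(range(v)))
all(spread < 1e-8)

# After reordering, y is non-decreasing, so lines() draws left to right.
i <- order(y)
is.unsorted(y[i])
```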

Related

Use lattice xyplot for grouped data with points and lines

I wish to make a single-panel graph in lattice that shows data (y) from several groups (g) with superimposed lines showing predicted values (y_pred). I generate example data below:
d <- data.frame(x = rep(1:100,2), g = factor(rep(c('a','b'), each = 100)))
d$y_pred <- with(d, -0.1*x + 0.001*x^2)  # x lives inside d, so evaluate it there
d$y_pred <- with(d, ifelse(g=='a', y_pred+2,y_pred))
d$y <- d$y_pred + rnorm(nrow(d),0,1)
Using type=c('p','l') with distribute.type=TRUE does not work, nor does my attempt at writing a panel function:
xyplot(y + y_pred ~ x, data = d,
groups = g,
panel = panel.superpose,
panel.groups=function(...){
panel.xyplot(x, y, type='p')
panel.xyplot(x, y_pred, type='l')
}
)
What should I do here?
OK, you can do this with latticeExtra:
library(latticeExtra)
dat <- xyplot(y ~ x, data=d,
groups = g,
type="p"
)
dat
dat + layer(panel.xyplot(x=x, y=y_pred,
groups = g,
type="l",
subscripts=TRUE),
data=d)
But this is really finicky and non-robust. The 'dat + layer()' code will render properly if it is the first thing I run after starting the program, but after that it will frequently render with several of the lines for different groups missing.
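One possible alternative that needs only lattice is to let panel.superpose() split the data by group and to look up the predicted values through the subscripts argument it passes to panel.groups. This is a sketch under the assumption that d is built as in the question (with x taken from d); it is not the asker's method, and the object name p is introduced here for illustration.

```r
library(lattice)

set.seed(1)
d <- data.frame(x = rep(1:100, 2),
                g = factor(rep(c("a", "b"), each = 100)))
d$y_pred <- with(d, -0.1 * x + 0.001 * x^2 + ifelse(g == "a", 2, 0))
d$y <- d$y_pred + rnorm(nrow(d), 0, 1)

p <- xyplot(y ~ x, data = d, groups = g,
            panel = panel.superpose,
            # panel.superpose() calls this once per group; subscripts holds
            # the row numbers of d that belong to the current group.
            panel.groups = function(x, y, subscripts, col.symbol, col.line, ...) {
              panel.points(x, y, col = col.symbol)                   # observed
              panel.lines(x, d$y_pred[subscripts], col = col.line)   # predicted
            })
```

Typing p at the console (or calling print(p) in a script) draws the plot. Within each group x is already increasing here; for unsorted data the rows would need to be ordered by x before the line is drawn.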

Adding two regression graphs having problem in dots and lines

I cannot find my answer anywhere.
I have two regression lines from two different datasets. I am trying to put these two regression lines in one graph. The following worked well.
regression1<-lm(Y ~ X, data = mydata1)
regression2<-lm(Y ~ X, data = mydata2)
abline(regression1)
abline(regression2)
However in this plot I just have lines and I don't have dots. I run:
regression1<-lm(Y ~ X, data = mydata1)
regression2<-lm(Y ~ X, data = mydata2)
plot(c(0,2),c(0,2),type="n") +
points(rnorm(200), rnorm(200), col = "red")
abline(regression1)
abline(regression2)
With this command I get just the dots and no lines, so it still does not work for me. What I want is one graph with two different lines (each regression's fitted line) and the dots for both regressions, in different colours. Any help would be appreciated. Thanks
With base R graphics, when you plot something and it doesn't show up in the graph, the odds are that you have plotted it outside the plot area. In the example below I make sure that everything is in the plot area by first computing appropriate x and y axis limits.
The first two code lines do the trick.
rangeX <- range(c(mydata1$X, mydata2$X))
rangeY <- range(c(mydata1$Y, mydata2$Y))
regression1 <- lm(Y ~ X, data = mydata1)
regression2 <- lm(Y ~ X, data = mydata2)
plot(rangeX, rangeY, type = "n", xlab = "X", ylab = "Y")
with(mydata1, points(X, Y, col = "red"))
with(mydata2, points(X, Y, col = "blue"))
abline(regression1, col = "red")
abline(regression2, col = "blue")
Data creation code.
set.seed(1234)
n <- 20
x <- seq_len(n) + rnorm(n)
mydata1 <- data.frame(X = x, Y = x + rnorm(n))
x <- seq_len(n) + rnorm(n)
mu <- 3
mydata2 <- data.frame(X = x + rnorm(n), Y = mu + x + rnorm(n))
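To see why the question's plot(c(0,2), c(0,2), type="n") call can hide content, it may help to count how many observations fall inside the requested limits. This is a rough sketch using the data creation code above for mydata1; xlim, ylim and inside are names introduced here, and the real plot region is slightly wider than the requested limits (after plotting, par("usr") reports the exact region).

```r
set.seed(1234)
n <- 20
x <- seq_len(n) + rnorm(n)
mydata1 <- data.frame(X = x, Y = x + rnorm(n))

# Axis limits implied by plot(c(0, 2), c(0, 2), type = "n") in the question.
xlim <- c(0, 2)
ylim <- c(0, 2)

# TRUE for observations that would land inside those limits.
inside <- with(mydata1,
               X >= xlim[1] & X <= xlim[2] & Y >= ylim[1] & Y <= ylim[2])

sum(inside)       # number of points that would be visible
range(mydata1$X)  # compare with xlim
range(mydata1$Y)  # compare with ylim
```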

`scatterplot3d`: can not add a regression plane to 3D scatter plot

I have created a 3d Scatterplot in R and want to add a regression plane. I have looked at code from the statmethods.net website, which can be very useful, and it worked. I then tried it with my own data and the plane did not show up.
library(scatterplot3d)
s3d <- scatterplot3d(Try$Visits, Try$Net.Spend, Try$Radio, pch=16, highlight.3d = TRUE, type = "h", main = "3D Scatterplot")
fit <- lm(Try$Visits ~ Try$Net.Spend +Try$Radio)
s3d$plane3d(fit)
I can not reproduce the issue with the following reproducible example:
set.seed(0)
x <- runif(20)
y <- runif(20)
z <- 0.1 + 0.3 * x + 0.5 * y + rnorm(20, sd = 0.1)
dat <- data.frame(x, y, z)
rm(x,y,z)
fit <- lm(z ~ x + y, data = dat)
library(scatterplot3d)
s3d <- scatterplot3d(dat$x, dat$y, dat$z, pch=16, highlight.3d = TRUE, type = "h", main = "3D Scatterplot")
s3d$plane3d(fit)
You should avoid $ in a model formula. Use the data argument instead:
fit <- lm(Visits ~ Net.Spend + Radio, data = Try)
In your scatter plot the z variable (the third argument, Try$Radio) is not the dependent variable of the regression model (Try$Visits), and this mismatch is the problem. plane3d() draws the fitted plane as z against x and y, so the response of the model must be the third argument of scatterplot3d() and the two predictors the first two, in the same order as in the formula.

Line instead of Dot (R) Not so easy

If you could help me it would be great:
So I'm doing a double curve (SDT) graph, and I have a bit of a problem; here is my graph:
This is the first time I have had this problem and I have no clue how to solve it. I think my data is just not ordered, but how can I order it easily?
Here's my code (nothing special):
x = TDSindice2$Hit
mean = mean(x)
sd = sd(x)
y = dnorm(x,mean,sd)
plot(x,y, col = "red")
x = TDSindice2$Fa
mean = mean(x)
sd = sd(x)
y = dnorm(x,mean,sd)
par(new=TRUE)
plot(x,y ,type = "l", col ="blue")
Thanks for all :)
You need to order your data in terms of increasing values of x before plotting. For example:
set.seed(1)
x <- runif(50)
y <- 1.2 + (1.4 * x) + (-2.5 * x^2)
plot(x, y)
lines(x, y)
The order() function can be used to generate an index that when applied to a variable/object places the values of that object in the required order (increasing by default):
ord <- order(x)
plot(x[ord], y[ord], type = "o")
But you'd be better off having x and y in the same object, a data frame, and just sorting the rows of that:
dat <- data.frame(x = x, y = y)
ord <- with(dat, order(x))
plot(y ~ x, data = dat[ord, ], type = "o") ## or
## lines(y ~ x, data = dat[ord, ])
Note that order() is used to index the data hence we don't change the original ordering, we just permute the rows as we supply the object to the plot() function.
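For the two density curves in the question, another option is to evaluate dnorm() on a sorted grid rather than at the raw observations, which avoids the ordering problem and puts both curves on the same axes without par(new=TRUE). The sketch below uses simulated vectors hit and fa as hypothetical stand-ins for TDSindice2$Hit and TDSindice2$Fa, which are not available here; plot_sdt_curves is a name introduced for illustration.

```r
set.seed(1)
hit <- rnorm(40, mean = 1.5, sd = 1)  # hypothetical stand-in for TDSindice2$Hit
fa  <- rnorm(40, mean = 0,   sd = 1)  # hypothetical stand-in for TDSindice2$Fa

# An increasing grid covering both samples; seq() returns it already sorted.
grid <- seq(min(hit, fa) - 1, max(hit, fa) + 1, length.out = 200)

# Normal densities using each sample's mean and standard deviation.
d_hit <- dnorm(grid, mean = mean(hit), sd = sd(hit))
d_fa  <- dnorm(grid, mean = mean(fa),  sd = sd(fa))

plot_sdt_curves <- function() {
  # One plot() call sets up shared axes; lines() adds the second curve to it.
  plot(grid, d_hit, type = "l", col = "red",
       ylim = range(0, d_hit, d_fa), xlab = "x", ylab = "density")
  lines(grid, d_fa, col = "blue")
}
```

Calling plot_sdt_curves() draws both curves as smooth lines on a common scale.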

Plot the observed and fitted values from a linear regression using xyplot() from the lattice package

I can create simple graphs. I would like to have observed and predicted values (from a linear regression) on the same graph. I am plotting say Yvariable vs Xvariable. There is only 1 predictor and only 1 response. How could I also add linear regression curve to the same graph?
So to conclude need help with:
plotting actuals and predicted both
plotting regression line
Here is one option for showing the observed and predicted values in a single plot as points. It is easier to get the regression line on the observed points, which I illustrate second.
First some dummy data
set.seed(1)
x <- runif(50)
y <- 2.5 + (3 * x) + rnorm(50, mean = 2.5, sd = 2)
dat <- data.frame(x = x, y = y)
Fit our model
mod <- lm(y ~ x, data = dat)
Combine the model output and observed x into a single object for plotting
res <- stack(data.frame(Observed = dat$y, Predicted = fitted(mod)))
res <- cbind(res, x = rep(dat$x, 2))
head(res)
Load lattice and plot
require("lattice")
xyplot(values ~ x, data = res, groups = ind, auto.key = TRUE)
The resulting plot should look similar to this
If you just want the regression line on the observed data, and the regression model is a simple straight-line model like the one I show, then you can circumvent most of this and just plot using
xyplot(y ~ x, data = dat, type = c("p","r"), col.line = "red")
(i.e. you don't even need to fit the model or make new data for plotting)
The resulting plot should look like this
An alternative to the first example which can be used with anything that will give coefficients for the regression line is to write your own panel functions - not as scary as it seems
xyplot(y ~ x, data = dat, col.line = "red",
panel = function(x, y, ...) {
panel.xyplot(x, y, ...)
panel.abline(coef = coef(mod), ...) ## using mod from earlier
}
)
That gives a plot from Figure 2 above, but by hand.
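When the fitted model is not a straight line, panel.abline() with coefficients no longer applies, but the same custom-panel idea can be combined with predictions on a sorted grid. The following is a hedged sketch, not part of the original answer; the simulated data and the names dat2, mod2, grid and p are assumptions made for illustration.

```r
library(lattice)

set.seed(1)
x <- runif(50)
y <- 1.2 + 1.4 * x - 2.5 * x^2 + rnorm(50, sd = 0.1)
dat2 <- data.frame(x = x, y = y)

# Quadratic fit, then predictions at 100 equally spaced (hence sorted) x values.
mod2 <- lm(y ~ x + I(x^2), data = dat2)
grid <- data.frame(x = seq(min(dat2$x), max(dat2$x), length.out = 100))
grid$fit <- predict(mod2, newdata = grid)

p <- xyplot(y ~ x, data = dat2,
            panel = function(x, y, ...) {
              panel.xyplot(x, y, ...)                      # observed points
              panel.lines(grid$x, grid$fit, col = "red")   # fitted curve
            })
```

Typing p at the console (or print(p) in a script) draws the points with the fitted curve on top.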
Assuming you've done this with caret, then
library(caret)
mod <- train(y ~ x, data = dat, method = "lm",
trControl = trainControl(method = "cv"))
xyplot(y ~ x, data = dat, col.line = "red",
panel = function(x, y, ...) {
panel.xyplot(x, y, ...)
panel.abline(coef = coef(mod$finalModel), ...) ## using mod from caret
}
)
Will produce a plot the same as Figure 2 above.
Another option is to use panel.lmlineq from latticeExtra.
library(latticeExtra)
set.seed(0)
xsim <- rnorm(50, mean = 3)
ysim <- (0 + 2 * xsim) * (1 + rnorm(50, sd = 0.3))
## basic use as a panel function
xyplot(ysim ~ xsim, panel = function(x, y, ...) {
panel.xyplot(x, y, ...)
panel.lmlineq(x, y, adj = c(1,0), lty = 1, col.text = 'red',
col.line = "blue", digits = 1,r.squared =TRUE)
})

Resources