I have created a 3d Scatterplot in R and want to add a regression plane. I have looked at code from the statmethods.net website, which can be very useful, and it worked. I then tried it with my own data and the plane did not show up.
library(scatterplot3d)
s3d <- scatterplot3d(Try$Visits, Try$Net.Spend, Try$Radio, pch=16, highlight.3d = TRUE, type = "h", main = "3D Scatterplot")
fit <- lm(Try$Visits ~ Try$Net.Spend +Try$Radio)
s3d$plane3d(fit)
I cannot reproduce the issue with the following example:
set.seed(0)
x <- runif(20)
y <- runif(20)
z <- 0.1 + 0.3 * x + 0.5 * y + rnorm(20, sd = 0.1)
dat <- data.frame(x, y, z)
rm(x,y,z)
fit <- lm(z ~ x + y, data = dat)
library(scatterplot3d)
s3d <- scatterplot3d(dat$x, dat$y, dat$z, pch=16, highlight.3d = TRUE, type = "h", main = "3D Scatterplot")
s3d$plane3d(fit)
You should avoid $ in model formula. Use data argument instead:
fit <- lm(Visits ~ Net.Spend + Radio, data = Try)
Your z-variable (the dependent variable) in the scatter plot is Try$Radio, whereas in the regression model the dependent variable is Try$Visits, and this mismatch is causing the confusion. The third variable passed to scatterplot3d is drawn on the z axis, and plane3d treats that axis as the dependent variable.
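A minimal sketch of a consistent ordering, using simulated data as a stand-in for `Try` (the column names come from the question; the values are invented for illustration). The response goes on the z axis, and the two predictors appear in the formula in the same order as the x and y axes:

```r
library(scatterplot3d)

# Simulated stand-in for the asker's data frame `Try` (values are made up)
set.seed(42)
Try <- data.frame(Net.Spend = runif(30, 0, 100),
                  Radio     = runif(30, 0, 50))
Try$Visits <- 5 + 0.4 * Try$Net.Spend + 0.8 * Try$Radio + rnorm(30, sd = 5)

# x and y are the predictors, z is the response
s3d <- scatterplot3d(Try$Net.Spend, Try$Radio, Try$Visits,
                     pch = 16, highlight.3d = TRUE, type = "h",
                     main = "3D Scatterplot")

# Same response, and predictors in the same order as the x and y axes
fit <- lm(Visits ~ Net.Spend + Radio, data = Try)
s3d$plane3d(fit)
```

If Radio really is meant to be the response, the same pattern applies with the roles swapped: put Radio third in the scatterplot3d call and on the left-hand side of the formula.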
I cannot find my answer anywhere.
I have two regression lines from two different datasets. I am trying to put these two regression lines in one graph. The following worked well.
regression1<-lm(Y ~ X, data = mydata1)
regression2<-lm(Y ~ X, data = mydata2)
abline(regression1)
abline(regression2)
However in this plot I just have lines and I don't have dots. I run:
regression1<-lm(Y ~ X, data = mydata1)
regression2<-lm(Y ~ X, data = mydata2)
plot(c(0,2),c(0,2),type="n") +
points(rnorm(200), rnorm(200), col = "red")
abline(regression1)
abline(regression2)
With this code I get only dots and no lines, so it still does not work for me. What I want is one graph with two different lines (each regression's fitted line) together with the data points of both regressions, each in a different colour. Any help would be appreciated. Thanks
When you plot something with base R graphics and it doesn't show up in the graph, the odds are that you have plotted it outside the plot area. In the example below I make sure that everything is inside the plot area by first computing appropriate x and y axis limits.
The first two code lines do the trick.
rangeX <- range(c(mydata1$X, mydata2$X))
rangeY <- range(c(mydata1$Y, mydata2$Y))
regression1 <- lm(Y ~ X, data = mydata1)
regression2 <- lm(Y ~ X, data = mydata2)
plot(rangeX, rangeY, type = "n", xlab = "X", ylab = "Y")
with(mydata1, points(X, Y, col = "red"))
with(mydata2, points(X, Y, col = "blue"))
abline(regression1, col = "red")
abline(regression2, col = "blue")
Data creation code.
set.seed(1234)
n <- 20
x <- seq_len(n) + rnorm(n)
mydata1 <- data.frame(X = x, Y = x + rnorm(n))
x <- seq_len(n) + rnorm(n)
mu <- 3
mydata2 <- data.frame(X = x + rnorm(n), Y = mu + x + rnorm(n))
I have a multi-level model with categorical and continuous variables and splines. Nice and complex. Anyhow I am trying to visualize model fit.
For example, here is some toy data:
library(lme4)
library(rms)
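library(gridExtra)
library(ggplot2)  # ggplot() is used for the fit plots below
library(lattice)  # wireframe() is used further down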
## Make model using sleepstudy data
head(sleepstudy)
# Add some extra vars
sleepstudy$group <- factor( sample(c(1,2), nrow(sleepstudy), replace=TRUE) )
sleepstudy$x1 <- jitter(sleepstudy$Days, factor=5)^2 * jitter(sleepstudy$Reaction)
# Set up a mixed model with spline
fm1 <- lmer(Reaction ~ rcs(Days, 4) * group + (rcs(Days, 4) | Subject), sleepstudy)
# Now add continuous covar
fm2 <- lmer(Reaction ~ rcs(Days, 4) * group + x1 + (rcs(Days, 4) | Subject), sleepstudy)
# Plot fit
new.df <- sleepstudy
new.df$pred1 <- predict(fm1, new.df, allow.new.levels=TRUE, re.form=NA)
new.df$pred2 <- predict(fm2, new.df, allow.new.levels=TRUE, re.form=NA)
g1 <- ggplot(data=new.df, aes(x=Days)) +
geom_line(aes(y=pred1, col=group), size=2) +
ggtitle("Model 1")
g2 <- ggplot(data=new.df, aes(x=Days)) +
geom_line(aes(y=pred2, col=group), size=2) +
ggtitle("Model 2")
grid.arrange(g1, g2, nrow=1)
Plot 1 is smooth, but plot 2 is jagged due to the effect of x1. So I would like to make a surface plot with x = Days, y = x1 and z = pred2 and stratified by group. Not having experience of surface plots I've started out with the wireframe command:
wireframe(pred2 ~ Days * x1, data = new.df[new.df$group==1,],
xlab = "Days", ylab = "x1", zlab="Predicted fit"
)
However although this command does not give an error, my plot is blank:
Questions:
Where am I going wrong with my wireframe?
Is there a better way to visualize my model fit?
I figured out that the data format needed for a `wireframe` or `plot_ly` surface is a 2D matrix of x rows by y columns holding the corresponding z values (I got a hint towards this from the question "Plotly 3d surface graph has incorrect x and y axis values"). I also realised I could use `expand.grid` to build a grid covering the range of possible x and y values and use it to predict z as follows:
days <- 0:9
x1_range <- range(sleepstudy$x1)[2] * c(0.05, 0.1, 0.15, 0.2, 0.25, 0.3)
new.data2 <- expand.grid(Days = days, x1 = x1_range, group = unique(sleepstudy$group) )
new.data2$pred <- predict(fm2, new.data2, allow.new.levels=TRUE, re.form=NA)
I can then stuff those into two different matrices to represent the z-surface for each group in my model:
surf1 <- ( matrix(new.data2[new.data2$group == 1, ]$pred, nrow = length(days), ncol = length(x1_range)) )
surf2 <- ( matrix(new.data2[new.data2$group == 2, ]$pred, nrow = length(days), ncol = length(x1_range)) )
group <- c(rep(1, nrow(surf1)), rep(2, nrow(surf2) ))
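The reshaping step relies on `expand.grid` varying its first argument fastest, which matches the column-major filling used by `matrix`. A small base R check of that assumption, with toy grids and a made-up function `f` standing in for the model prediction:

```r
# Toy grids; `f` is a hypothetical stand-in for predict()
days <- 0:3
x1_range <- c(10, 20, 30)
f <- function(d, x) 100 * d + x

toy <- expand.grid(Days = days, x1 = x1_range)  # Days varies fastest
toy$pred <- f(toy$Days, toy$x1)

# Column-major fill: rows index Days, columns index x1
surf <- matrix(toy$pred, nrow = length(days), ncol = length(x1_range))

# Entry [i, j] is the prediction at days[i], x1_range[j]
surf[2, 3] == f(days[2], x1_range[3])  # TRUE
```

If the arguments to `expand.grid` were given in a different order, the matrix would need `nrow` and `ncol` swapped (or a transpose) to keep the axes lined up.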
Finally I can use plot_ly to plot each surface:
library(plotly)
# Rows of each surface matrix correspond to Days (y), columns to x1 (x)
plot_ly(z = surf1, x = x1_range, y = days, type = "surface") %>%
  add_surface(z = surf2, surfacecolor = surf2,
              color = c('red','yellow'))
The resulting plot:
So the resulting plot is what I wanted (albeit not very useful in this made-up example, but useful with real data). The only thing I can't figure out is how to show two different colour scales. I can suppress the scale altogether, but if anyone knows how to show two scales for different surfaces, please let me know and I will edit the answer.
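As for the blank `wireframe` from the question, the same gridded idea seems to apply there too: lattice draws a surface when every combination of the two predictors is present. A minimal sketch, with a made-up smooth function standing in for the model predictions (the grid values and the function are assumptions for illustration only):

```r
library(lattice)

# Regular grid of the two predictors; `pred` is a made-up smooth
# function standing in for predict(fm2, ...)
g <- expand.grid(Days = 0:9, x1 = seq(0, 1, length.out = 6))
g$pred <- 250 + 10 * g$Days + 30 * g$x1^2

# Every Days/x1 combination is present, so wireframe can draw a surface
p <- wireframe(pred ~ Days * x1, data = g,
               xlab = "Days", ylab = "x1", zlab = "Predicted fit",
               drape = TRUE)
print(p)
```

With the real model, `g$pred` would come from `predict()` on the grid, one group at a time.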
I made a 3D plot with rgl's persp3d, but I don't know how to smooth it to see the trend. Alternatively, the solution may be to draw the wireframe surface with persp3d (because I need this plot to be interactive). Please help.
library(mgcv)
x<- rnorm(200)
y<- rnorm(200)
z<-rnorm(200)
tab<-data.frame(x,y,z)
tab
# surface wireframe:
library(lattice)
mod <- gam(z ~ te(x, y), data = tab)
wyk <- matrix(fitted(mod), ncol = 20) # 8 and 10 also work
wireframe(wyk, drape=TRUE, colorkey=TRUE)
#surface persp3d
library(rgl)
library(akima)
z_interpolation <- 200
tabint <- interp(x, y, z)
x.si <- tabint$x
y.si <- tabint$y
z.si <- tabint$z
nbcol <- 200
vertcol <- cut(z.si, nbcol) # colour by interpolated height (the original used an undefined `t`)
color = rev(rainbow(nbcol, start = 0/6, end = 4/6))
persp3d(x.si, y.si, z.si, col = color[vertcol], smooth=T)
So wireframe is neither smoothed nor interactive
...and rgl.persp3d is interactive but not smoothed. And I can't have both smoothed and interactive.
rgl just draws what you give it. You need to use mgcv as in your first example to do the smoothing, but you don't get a matrix of fitted values back at the end, so you'll want to use deldir to turn the results into a surface. For example,
library(mgcv)
x<- rnorm(200)
y<- rnorm(200)
z<-rnorm(200)
tab<-data.frame(x,y,z)
tab
#surface wireframe:
mod <- gam(z ~ te(x, y), data = tab)
library(rgl)
library(deldir)
zfit <- fitted(mod)
col <- cm.colors(20)[1 +
round(19*(zfit - min(zfit))/diff(range(zfit)))]
persp3d(deldir(x, y, z = zfit), col = col)
aspect3d(1, 2, 1)
This gives a nice smooth surface, for example
A simpler way, without the Delaunay triangulation:
library(mgcv)
x <- rnorm(200)
y <- rnorm(200)
z <- rnorm(200)
tab <- data.frame(x,y,z)
mod <- mgcv::gam(z ~ te(x, y), data = tab)
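library(rgl)
grid <- -5:5
# predict() returns a vector; reshape it so rows follow x and columns follow y
zfit <- matrix(predict(mod, expand.grid(x = grid, y = grid)), nrow = length(grid))
persp3d(grid, grid, zfit)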
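A variation on the same idea, as a sketch: build the grid from the observed range of `x` and `y` rather than a fixed `-5:5`, so the surface is not extrapolated far beyond the data, and reshape the predictions into a matrix before plotting. The grid size of 30 and the colour are arbitrary choices, and the names `gx` and `gy` are introduced here for illustration.

```r
library(mgcv)

set.seed(1)
tab <- data.frame(x = rnorm(200), y = rnorm(200), z = rnorm(200))
mod <- gam(z ~ te(x, y), data = tab)

# Evenly spaced grid over the observed range (30 points per axis is arbitrary)
gx <- seq(min(tab$x), max(tab$x), length.out = 30)
gy <- seq(min(tab$y), max(tab$y), length.out = 30)

# expand.grid varies x fastest, so the matrix rows follow gx and columns gy
zfit <- matrix(predict(mod, newdata = expand.grid(x = gx, y = gy)),
               nrow = length(gx), ncol = length(gy))

# Draw only in an interactive session with rgl installed
if (interactive() && requireNamespace("rgl", quietly = TRUE)) {
  rgl::persp3d(gx, gy, zfit, col = "lightblue")
}
```

A finer grid gives a smoother-looking surface at the cost of more predictions.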
I can create simple graphs. I would like to have observed and predicted values (from a linear regression) on the same graph. I am plotting say Yvariable vs Xvariable. There is only 1 predictor and only 1 response. How could I also add linear regression curve to the same graph?
So to conclude need help with:
plotting actuals and predicted both
plotting regression line
Here is one option for showing the observed and predicted values as points in a single plot. It is easier to get the regression line on the observed points, which I illustrate second.
First some dummy data
set.seed(1)
x <- runif(50)
y <- 2.5 + (3 * x) + rnorm(50, mean = 2.5, sd = 2)
dat <- data.frame(x = x, y = y)
Fit our model
mod <- lm(y ~ x, data = dat)
Combine the model output and observed x into a single object for plotting
res <- stack(data.frame(Observed = dat$y, Predicted = fitted(mod)))
res <- cbind(res, x = rep(dat$x, 2))
head(res)
Load lattice and plot
require("lattice")
xyplot(values ~ x, data = res, group = ind, auto.key = TRUE)
The resulting plot should look similar to this
If you just want the regression line on the observed data, and the model is a simple straight-line regression like the one shown here, you can skip most of this and plot using
xyplot(y ~ x, data = dat, type = c("p","r"), col.line = "red")
(i.e. you don't even need to fit the model or make new data for plotting)
The resulting plot should look like this
An alternative to the first example which can be used with anything that will give coefficients for the regression line is to write your own panel functions - not as scary as it seems
xyplot(y ~ x, data = dat, col.line = "red",
panel = function(x, y, ...) {
panel.xyplot(x, y, ...)
panel.abline(coef = coef(mod), ...) ## using mod from earlier
}
)
That gives a plot from Figure 2 above, but by hand.
Assuming you've done this with caret then
library(caret)
mod <- train(y ~ x, data = dat, method = "lm",
trControl = trainControl(method = "cv"))
xyplot(y ~ x, data = dat, col.line = "red",
panel = function(x, y, ...) {
panel.xyplot(x, y, ...)
panel.abline(coef = coef(mod$finalModel), ...) ## using mod from caret
}
)
Will produce a plot the same as Figure 2 above.
Another option is to use panel.lmlineq from latticeExtra.
library(latticeExtra)
set.seed(0)
xsim <- rnorm(50, mean = 3)
ysim <- (0 + 2 * xsim) * (1 + rnorm(50, sd = 0.3))
## basic use as a panel function
xyplot(ysim ~ xsim, panel = function(x, y, ...) {
panel.xyplot(x, y, ...)
panel.lmlineq(x, y, adj = c(1,0), lty = 1, col.text = 'red',
col.line = "blue", digits = 1,r.squared =TRUE)
})
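For comparison, the observed and fitted values from the `lm` example earlier in this thread can also be drawn with base graphics. This is only a sketch of one possible layout; the plotting symbols, colours and legend position are arbitrary choices.

```r
# Same dummy data and model as in the lattice answer
set.seed(1)
x <- runif(50)
y <- 2.5 + (3 * x) + rnorm(50, mean = 2.5, sd = 2)
dat <- data.frame(x = x, y = y)
mod <- lm(y ~ x, data = dat)

# Observed values as open circles
plot(y ~ x, data = dat, pch = 1, main = "Observed and fitted values")
# Fitted values at the observed x, as filled red points
points(dat$x, fitted(mod), pch = 16, col = "red")
# Regression line from the same model
abline(mod, col = "red")
legend("topleft", legend = c("Observed", "Fitted"),
       pch = c(1, 16), col = c("black", "red"))
```

Because the model is a straight line, the fitted points all fall on the line drawn by `abline`.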
I'm looking for a way to plot a nonlinear regression line on a data set where every value in my vector y is being stored multiple times, so I tried to use something like:
x <- c(1,2,3,4,5,6,7,8,9,10)
y <- c(1,4,9,15,25,9,36,25,36,25)
reg4 <- lm( x ~ y + I(y^2) )
plot(x ~ y)
lines(y, predict(reg4), type="l", col="red", lwd=1)
this gives http://i.imgur.com/qSEVNdT.png
So my question is: is there a way to, let's say, use some sort of mean value for each y entry? Or just make it a 'continuous' line instead of something that branches off into multiple lines or returns to a lower y value at the points where there are multiple 'entries'?
In these cases, it is best to predict from the model over the range of the covariate. You do this for, say, 50 or 100 locations equally spaced over that range, increasing or decreasing the number of locations as needed; more complex responses will need more locations. Doing this also solves the spaghetti plot issue, as the newdata supplied will already be in sorted order.
x <- c(1,2,3,4,5,6,7,8,9,10)
y <- c(1,4,9,15,25,9,36,25,36,25)
reg4 <- lm( x ~ y + I(y^2) )
## predictions
pred <- data.frame(y = seq(min(y), max(y), length = 100))
pred <- transform(pred, x = predict(reg4, newdata = pred))
## plot
plot(x ~ y)
lines(x ~ y, data = pred, type = "l", col = "red", lwd = 1)
The problem does not come from the ties in the data:
for a given value of y, there is only one forecast.
The problem is that the points are not sorted,
so that when you join them, you end up with a tangle of lines.
You can use order to reorder the points.
plot(
x ~ y,
xlab = "y", ylab = "x" # Confusing...
)
i <- order(y)
lines( y[i], predict(reg4)[i] )
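Both points can be checked directly with the data from the question. A short base R sketch: tied values of `y` receive the same fitted value, and reordering by `y` makes the horizontal positions non-decreasing, which is what lets `lines()` draw a single curve.

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
y <- c(1, 4, 9, 15, 25, 9, 36, 25, 36, 25)
reg4 <- lm(x ~ y + I(y^2))
fit <- predict(reg4)

# One fitted value per distinct y: the spread within each tie is
# zero up to floating-point error
tapply(fit, y, function(v) diff(range(v)))

# In the original order y jumps back and forth; after sorting it does not
i <- order(y)
is.unsorted(y)     # TRUE
is.unsorted(y[i])  # FALSE
```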