R - Extending Linear Model beyond scatterplot3d - r

I have created a scatterplot3d with a linear model applied.
Unfortunately, the effect shown by the LM is subtle and needs to be emphasised. How can I extend the LM grid outside of the 'cube'?
Plot:
Code:
Plot1 <-scatterplot3d(
d$MEI,
d$YYYYMM,
d$AOELog10,
pch=20,
grid = FALSE,
color = "black",
xlab="MEI",
ylab="Date",
zlab="AOE Log(10)"
)
fit <- lm(d$AOELog10 ~ d$MEI+d$Rank)
Plot1$plane3d(fit)
I guess it might be an argument to lm(), but I can't find anything.

To see a larger region, or region of interest, specify the x, y, and z limits in the scatterplot command.
library(scatterplot3d)
d<-data.frame(MEI=runif(200,-3,3),
YYYYMM=runif(200,1,300),
AOELog10=runif(200,1,20),
Rank=runif(200,1,5))
fit <- lm(d$AOELog10 ~ d$MEI+d$Rank)
Plot1 <-scatterplot3d(
d$MEI, d$YYYYMM, d$AOELog10,
pch=20, grid = FALSE, color = "black",
xlab="MEI", ylab="Date", zlab="AOE Log(10)",
main="baseline"
)
Plot1$plane3d(fit)
Plot2 <-scatterplot3d(
x=d$MEI, y=d$YYYYMM, z=d$AOELog10,
pch=20, grid = FALSE, color = "black",
xlab="MEI", ylab="Date", zlab="AOE Log(10)",
xlim = c(-5,5), ylim = c(-50,400), zlim = c(-10,50), # Specify the plot range
main="larger region"
)
Plot2$plane3d(fit)
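One thing to check in the code above: plane3d() reads the model's coefficients as the intercept, the x coefficient, and the y coefficient of the plot. A fit on MEI and Rank is therefore drawn as if Rank were the Date axis. The sketch below is an assumption about what was intended: it simulates data (the values are made up) and fits the two plotted variables, so the plane matches the axes. The plot is only drawn if scatterplot3d is installed.

```r
# Simulated stand-in data (assumption): replace with your own data frame.
set.seed(1)
d <- data.frame(MEI = runif(200, -3, 3), YYYYMM = runif(200, 1, 300))
d$AOELog10 <- 5 + 1.5 * d$MEI + 0.02 * d$YYYYMM + rnorm(200)

# Fit z on the same two variables used for the x and y axes, in that order.
fit_xy <- lm(AOELog10 ~ MEI + YYYYMM, data = d)

if (requireNamespace("scatterplot3d", quietly = TRUE)) {
  p <- scatterplot3d::scatterplot3d(
    d$MEI, d$YYYYMM, d$AOELog10,
    pch = 20, grid = FALSE, color = "black",
    xlab = "MEI", ylab = "Date", zlab = "AOE Log(10)",
    xlim = c(-5, 5), ylim = c(-50, 400), zlim = c(-10, 30)  # wider than the data
  )
  p$plane3d(fit_xy)  # plane is drawn across the enlarged axis ranges
}
```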

Related

Padding Around Legend when using Pch in Base R

Just a minor question. I am trying to make a legend for the following plot.
# fitting the linear model
iris_lm = lm(Petal.Length ~ Sepal.Length, data = iris)
summary(iris_lm)
# calculating the confidence interval for the fitted line
preds = predict(iris_lm, newdata = data.frame(Sepal.Length = seq(4,8,0.1)),
interval = "confidence")
# making the initial plot
par(family = "serif")
plot(Petal.Length ~ Sepal.Length, data = iris, col = "darkgrey",
family = "serif", las = 1, xlab = "Sepal Length", ylab = "Petal Length")
# shading in the confidence interval
polygon(
c(seq(8,4,-0.1), seq(4,8,0.1)), # all of the necessary x values
c(rev(preds[,3]), preds[,2]), # all of the necessary y values
col = rgb(0.2745098, 0.5098039, 0.7058824, 0.4), # the color of the interval
border = NA # turning off the border
)
# adding the regression line
abline(iris_lm, col = "SteelBlue")
# adding a legend
legend("bottomright", legend = c("Fitted Values", "Confidence Interval"),
lty = c(1,0))
Here's the output so far:
My goal is to put a box in the legend next to the "Confidence Interval" tab, and color it in the same shade that it is in the picture. Naturally, I thought to use the pch parameter. However, when I re-run my code with the additional legend option pch = c(NA, 25), I get the following:
It is not super noticeable, but if you look closely at the padding on the left margin of the legend, it actually has decreased, and the edge of the border is now closer to the line than I would like. Is there any way to work around this?
That's a curious behavior in legend(). I'm sure someone will suggest a ggplot2 alternative. However, legend() does offer a solution. This solution calls the function without plotting anything to capture the dimensions of the desired rectangle. The legend is then plotted with the elements you really want but no enclosing box (bty = "n"). The desired rectangle is added explicitly. I assume you mean pch = 22 to get the filled box symbol. I added pt.cex = 2 to make it a bit larger.
# Capture the confidence interval color, reusable variables
myCol <- rgb(0.2745098, 0.5098039, 0.7058824, 0.4)
legText <- c("Fitted Values", "Confidence Interval")
# Picking it up from 'adding a legend'
ans <- legend("bottomright", lty = c(1,0), legend = legText, plot = FALSE)
r <- ans$rect
legend("bottomright", lty = c(1,0), legend = legText, pch = c(NA,22),
pt.bg = myCol, col = c(1, 0), pt.cex = 2, bty = "n")
# Draw the desired box
rect(r$left, r$top - r$h, r$left + r$w, r$top)
By the way, I don't think this will work without further tweaking if you place the legend on the left side.

Shade area of full confidence interval (base graphics)

I am using the following code in R to plot a linear regression with 95% confidence bands around the regression line.
Average <- c(0.298,0.783429,0.2295,0.3725,0.598,0.892,2.4816,2.79975,
1.716368,0.4845,0.974133,0.824,0.936846,1.54905,0.8166,1.83535,
1.6902,1.292667,0.2325,0.801,0.516,2.06645,2.64965,2.04785,0.55075,
0.698615,1.285,2.224118,2.8576,2.42905,1.138143,1.94225,2.467357,0.6615,
0.75,0.547,0.4518,0.8002,0.5936,0.804,0.7,0.6415,0.702182,0.7662,0.847)
Area <-c(8.605,16.079,4.17,5.985,12.419,10.062,50.271,61.69,30.262,11.832,25.099,
8.594,17.786,36.995,7.473,33.531,30.97,30.894,4.894,8.572,5.716,45.5,69.431,
40.736,8.613,14.829,4.963,33.159,66.32,37.513,27.302,47.828,39.286,9.244,19.484,
11.877,9.73,11.542,12.603,9.988,7.737,9.298,14.918,17.632,15)
lm.out <- lm (Area ~ Average)
newx = seq(min(Average), by = 0.05)
conf_interval <- predict(lm.out, newdata = data.frame(Average = newx), interval ="confidence",
level = 0.95)
plot(Average, Area, xlab ="Average", ylab = "Area", main = "Regression")
abline(lm.out, col = "lightblue")
lines(newx, conf_interval[,2], col = "blue", lty ="dashed")
lines(newx, conf_interval[,3], col = "blue", lty ="dashed")
I am stuck because the graph I get shows the bands only for the first part of the line and leaves out the rest (the link to the image is at the bottom of the message). What is going wrong? I would also like to shade the area of the confidence interval (not just draw the lines at its limits), but I can't work out how to do it.
Any help would be really appreciated, I am completely new in R.
This is very easy with the ggplot2 library. Here is the code:
library(ggplot2)
data = data.frame(Average, Area)
ggplot(data=data, aes(x=Average, y=Area))+ # same roles as lm(Area ~ Average)
geom_smooth(method="lm", level=0.95)+
geom_point()
Code to install the library:
install.packages("ggplot2")
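For base graphics, the bands stop early because seq(min(Average), by = 0.05) has no end point: seq()'s to argument defaults to 1, so newx stops at about 1. Giving seq() both ends fixes that, and polygon() can shade the band. A minimal sketch, using the built-in cars data as a stand-in (an assumption; substitute Average and Area for x and y):

```r
# Stand-in data (assumption): built-in 'cars'; use Average and Area instead.
x <- cars$speed
y <- cars$dist

fit <- lm(y ~ x)

# Cover the whole observed range of x, not just the part below 1.
newx <- seq(min(x), max(x), length.out = 100)
ci <- predict(fit, newdata = data.frame(x = newx),
              interval = "confidence", level = 0.95)

plot(x, y, xlab = "x", ylab = "y", main = "Regression")

# Shade the band: lower limit left to right, then upper limit right to left.
polygon(c(newx, rev(newx)),
        c(ci[, "lwr"], rev(ci[, "upr"])),
        col = rgb(0, 0, 1, 0.2), border = NA)

abline(fit, col = "lightblue")
lines(newx, ci[, "lwr"], col = "blue", lty = "dashed")
lines(newx, ci[, "upr"], col = "blue", lty = "dashed")
```

Drawing the polygon before the line and points keeps the shading from hiding them; a semi-transparent colour helps as well.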

R: adding biplot arrows to CCA plot

Starting with the following code:
library(vegan)
data(dune)
data(dune.env)
Ordination.model1 <- cca(dune ~ Management,dune.env)
plot1 <- plot(Ordination.model1, choices=c(1,2), scaling=1)
I get a plot with sites, species, centroids, and biplot arrows. I want to build up a plot with just the sites depicted by points, and the arrows with customized labels.
So far, I have:
colvec <- c("red", "green", "blue")
plot(Ordination.model1, type="n", scaling=1)
with(dune.env, points(Ordination.model1, display ="sites", col=colvec[Use], scaling=1, pch =16, bg = colvec[Use]))
I am stuck as far as how to put the arrows in. Thanks in advance!
You can add the arrows with text(), using display = "bp". I was not able to use your code as I kept getting errors, but here is a basic example that does what you want. I took it from R Help: CCA Plot
Once you add the text, the arrows should show.
require(vegan)
data(varespec)
data(varechem)
vare.cca <- cca(varespec ~ ., data = varechem)
plot(vare.cca, display = c("sites","species"), scaling = 3)
text(vare.cca, scaling = 3, display = "bp")
The text method also has a labels argument. Its usage, from the help page "Plot or Extract Results of Constrained Correspondence Analysis or Redundancy Analysis":
## S3 method for class 'cca':
text(x, display = "sites", labels, choices = c(1, 2),
scaling = "species", arrow.mul, head.arrow = 0.05, select, const,
axis.bp = TRUE, correlation = FALSE, hill = FALSE, ...)
labels: Optional text to be used instead of row names.
I was able to rename the arrows: below is the full code.
library(vegan)
data(dune)
data(dune.env)
Ordination.model1 <- cca(dune ~ Management,dune.env)
summary(Ordination.model1) # Lets you see the current biplot labels in the output.
colvec <- c("red", "green", "blue", "orange")
plot(Ordination.model1, type="n", scaling=1)
with(dune.env, points(Ordination.model1, display ="sites", col=colvec[Management],scaling=1, pch =16, bg = colvec[Management]))
labl <- c("HF", "NM", "SF") # new labels. Need to be in the same order as the old biplot labels.
text(Ordination.model1, display="bp", scaling=1, labels=labl)

Why doesn't my GAM fit seem to have the correct intercept? [R]

My GAM curves are being shifted downwards. Is there something wrong with the intercept? I'm using the same code as Introduction to statistical learning... Any help's appreciated..
Here's the code. I simulated some data (a straight line with noise), and fit GAM multiple times using bootstrap.
(It took me a while to figure out how to plot multiple GAM fits in one graph. Thanks to this post Sam's answer, and this post)
library(gam)
N = 1e2
set.seed(123)
dat = data.frame(x = 1:N,
y = seq(0, 5, length = N) + rnorm(N, mean = 0, sd = 2))
plot(dat$x, dat$y, xlim = c(1,100), ylim = c(-5,10))
gamFit = vector('list', 5)
for (ii in 1:5){
ind = sample(1:N, N, replace = T) #bootstrap
gamFit[[ii]] = gam(y ~ s(x, 10), data = dat, subset = ind)
par(new=T)
plot(gamFit[[ii]], col = 'blue',
xlim = c(1,100), ylim = c(-5,10),
axes = F, xlab='', ylab='')
}
The issue is with plot.gam. If you take a look at the help page (?plot.gam), there is a parameter called scale, which states:
a lower limit for the number of units covered by the limits on the ‘y’ for each plot. The default is scale=0, in which case each plot uses the range of the functions being plotted to create their ylim. By setting scale to be the maximum value of diff(ylim) for all the plots, then all subsequent plots will produced in the same vertical units. This is essential for comparing the importance of fitted terms in additive models.
This is an issue because you are not using the range of the function being plotted (i.e. the range of the plotted function is not -5 to 10). So what you need to do is change
plot(gamFit[[ii]], col = 'blue',
xlim = c(1,100), ylim = c(-5,10),
axes = F, xlab='', ylab='')
to
plot(gamFit[[ii]], col = 'blue',
scale = 15,
axes = F, xlab='', ylab='')
And you get:
Or you can just remove the xlim and ylim parameters from both calls to plot, and the automatic setting of plot to use the full range of the data will make everything work.
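Another option, offered as a sketch rather than the book's method: plot.gam draws the centered contribution of s(x), without the intercept, so it does not share the data's y scale. Drawing the fitted values with lines() keeps everything on the original axes. This assumes the gam package is installed, and it resamples rows explicitly instead of using subset.

```r
library(gam)

N <- 100
set.seed(123)
dat <- data.frame(x = 1:N,
                  y = seq(0, 5, length = N) + rnorm(N, mean = 0, sd = 2))

plot(dat$x, dat$y, xlim = c(1, 100), ylim = c(-5, 10))

for (ii in 1:5) {
  boot <- dat[sample(N, N, replace = TRUE), ]  # bootstrap resample of rows
  fit  <- gam(y ~ s(x, 10), data = boot)
  o    <- order(boot$x)
  # fitted() includes the intercept, so the curve sits on the data's scale
  lines(boot$x[o], fitted(fit)[o], col = "blue")
}
```

Because lines() adds to the existing plot, there is no need for par(new = TRUE) or for matching xlim and ylim by hand.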

Adding point and lines to 3D scatter plot in R

I want to visualize concentration ellipsoids in a 3D scatter plot with respect to the principal components (the principal components being the axes of these ellipsoids). I used the function scatter3d with the option ellipsoid = TRUE
data3d <- iris[which(iris$Species == "versicolor"), ]
library(car)
library(rgl)
scatter3d(x = data3d[,1], y = data3d[,2], z = data3d[,3],
surface=FALSE, grid = TRUE, ellipsoid = TRUE,
axis.col = c("black", "black", "black"), axis.scales = FALSE,
xlab = "X1", ylab = "X2", zlab = "X3", surface.col = "blue",
revolution=0, ellipsoid.alpha = 0.0, level=0.7, point.col = "yellow", add=TRUE)
to draw this plot:
Then I was trying to add "mean point" using
points3d(mean(data3d[,1]), mean(data3d[,2]), mean(data3d[,3]), col="red", size=20)
but this point is not in the place it's supposed to be (in the center of ellipsoid):
and I'm wondering why, and how I can rescale it. A follow-up question: how can I add the axes of this ellipsoid to the plot?
Looking at car:::scatter3d.default shows that the coordinates are internally scaled by the min and max of each dimension; the following code scales before plotting:
sc <- function(x,orig) {
d <- diff(range(orig))
m <- min(orig)
(x-m)/d
}
msc <- function(x) {
sc(mean(x),x)
}
points3d(msc(data3d[,1]),
msc(data3d[,2]),
msc(data3d[,3]), col="red", size=20)
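To see what the helper does without opening an rgl window, here is a small check on the first plotted column (Sepal.Length for versicolor). It only illustrates the min-max rescaling described above; it is not taken from the car source.

```r
# Same idea as sc() above: map x onto [0, 1] using the range of the original data.
sc <- function(x, orig) (x - min(orig)) / diff(range(orig))

v <- iris$Sepal.Length[iris$Species == "versicolor"]

sc(min(v), v)   # 0: the smallest value lands on the lower face of the cube
sc(max(v), v)   # 1: the largest value lands on the upper face
sc(mean(v), v)  # strictly between 0 and 1: the rescaled mean
```

The unscaled mean falls far outside the unit cube, which is why the red point appeared in the wrong place.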
