R: multiple smoothScatter plots

I have two sets of points that I want to show in one plot with smoothScatter(), using a separate color for each distribution, but the add argument does not seem to work. For example:
X1<-rnorm(1000, mean = -2, sd = 1)
Y1<-rnorm(1000, mean = -2, sd = 1)
X2<-rnorm(1000, mean = 2, sd = 1)
Y2<-rnorm(1000, mean = 2, sd = 1)
smoothScatter(X1,Y1,col="green",colramp=colorRampPalette(c("white", "green")));
smoothScatter(X2,Y2,col="green",colramp=colorRampPalette(c("white", "red")),add=T);
Is it possible?

Use a color ramp that starts from a fully transparent color in the second call, for example colramp = colorRampPalette(c(rgb(1, 1, 1, 0), rgb(1, 0, 0, 1)), alpha = TRUE).
This makes the background of the second plot transparent, so it no longer paints over the first plot.
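Put together, the suggestion could look like the minimal sketch below. The names green_ramp, red_ramp and lims are made up for illustration, and the shared xlim/ylim on the first call is an assumption added here so that both clouds fit inside the plotting region; it is not part of the original answer.

```r
# Simulated data from the question (seed added only for reproducibility).
set.seed(42)
X1 <- rnorm(1000, mean = -2, sd = 1)
Y1 <- rnorm(1000, mean = -2, sd = 1)
X2 <- rnorm(1000, mean = 2, sd = 1)
Y2 <- rnorm(1000, mean = 2, sd = 1)

# First ramp: opaque white to green (this call draws the background).
green_ramp <- colorRampPalette(c("white", "green"))
# Second ramp: fully transparent white to opaque red.
# alpha = TRUE keeps the alpha channel when interpolating.
red_ramp <- colorRampPalette(c(rgb(1, 1, 1, 0), rgb(1, 0, 0, 1)), alpha = TRUE)

# Assumption: use common axis limits so the second cloud is visible.
lims <- range(X1, Y1, X2, Y2)

smoothScatter(X1, Y1, colramp = green_ramp, xlim = lims, ylim = lims)
# add = TRUE is passed on to image(), which overlays the second density.
smoothScatter(X2, Y2, colramp = red_ramp, add = TRUE)
```

Because low-density cells of the second plot map to the transparent end of the ramp, the green density underneath stays visible.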

Related

Use of mapply() to prevent double nested loop

I am trying to compute the density of a bivariate normal distribution for sets of x and y values. Using mapply(), I want to iterate over a set of means (means, means_2) and over the x and y values specified in the lower = and upper = arguments. I want mapply() to replace a nested for loop (one loop for the elements of lower and upper, one for the elements of means, and one for the elements of means_2).
library(mvtnorm)
# Params needed for pmvnorm()
sigma1 <- matrix(c(1, 0.5, 0.5, 2), 2)
means <- seq(from = 0, to = 15, by = 0.5)
means_2 <- seq(from = 10, to = 15, by = 0.5)
mapply(
pmvnorm,
lower = c(
c(-Inf, 7, 10),
c(-Inf, seq(from = -3, to = 4, by = 1))
),
upper = c(
c(7, 10, Inf),
c(seq(from = -3, to = 4, by = 1), Inf)
),
mean = c(
means,
means_2
),
MoreArgs = list(sigma = sigma1, keepAttr = FALSE)
)
However, this produces the following error message:
Error in checkmvArgs(lower = lower, upper = upper, mean = mean, corr = corr, :
‘diag(sigma)’ and ‘lower’ are of different length
For simply calculating the density for one set of x and y values and means, the following code works:
pmvnorm(lower = c(0, 1), upper = c(7, 10),
mean = c(1, 1), sigma = matrix(c(1, 0.5, 0.5, 2), 2), keepAttr = FALSE)
Could someone give me pointers on how to fix this error?
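One likely reading of the error is that mapply() walks over an atomic vector one element at a time, so each call receives a single number for lower while sigma is 2 x 2. The sketch below illustrates that behaviour in base R only; arg_length is a hypothetical stand-in for pmvnorm(), and the interval values are chosen just for illustration.

```r
# Hypothetical stand-in for pmvnorm(): reports how many values
# each call receives in its `lower` argument.
arg_length <- function(lower, upper) length(lower)

# Atomic vectors: c(c(...), c(...)) is flattened, and mapply()
# hands one scalar to each call.
scalar_calls <- mapply(arg_length,
                       lower = c(-Inf, 7),
                       upper = c(7, 10))

# Lists: mapply() hands one list element (here a length-2 vector,
# one bound per dimension) to each call.
pair_calls <- mapply(arg_length,
                     lower = list(c(-Inf, -Inf), c(7, -3)),
                     upper = list(c(7, -3), c(10, 4)))

print(scalar_calls)
print(pair_calls)
```

If that is indeed the cause, wrapping each (x, y) pair of bounds and each pair of means in a list, rather than concatenating them with c(), should give pmvnorm() arguments whose length matches diag(sigma).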

Prp plot - Coloring positive and negative values differently

I am fitting regression trees via the function rpart(). Given my data, I am going to have both positive and negative estimates in nodes. Is there a way to color them differently?
In particular, what I would like to have is a tree whose nodes are shaded in blue for negative values and in red for positive values, where darker colors signal stronger absolute values.
I attach a minimal reproducible example.
library(rpart)
library(rpart.plot)
# Simulating data.
set.seed(1986)
X = matrix(rnorm(2000, 0, 1), nrow = 1000, ncol = 2)
epsilon = matrix(rnorm(1000, 0, 0.01), nrow = 1000)
y = X[, 1] + X[, 2] + epsilon
dta = data.frame(X, y)
# Fitting regression tree.
my.tree = rpart(y ~ X1 + X2, data = dta, method = "anova", maxdepth = 3)
# Plotting.
prp(my.tree,
type = 2,
clip.right.labs = FALSE,
extra = 101,
under = FALSE,
under.cex = 1,
fallen.leaves = TRUE,
box.palette = "BuRd",
branch = 1,
round = 0,
leaf.round = 0,
prefix = "" ,
main = "",
cex.main = 1.5,
branch.col = "gray",
branch.lwd = 3)
# Repeating, with median(y) != 0.
X = matrix(rnorm(2000, 5, 1), nrow = 1000, ncol = 2)
epsilon = matrix(rnorm(1000, 0, 0.01), nrow = 1000)
y = X[, 1] + X[, 2] + epsilon
dta = data.frame(X, y)
my.tree = rpart(y ~ X1 + X2, data = dta, method = "anova", maxdepth = 3)
# HERE I NEED HELP!
prp(my.tree,
type = 2,
clip.right.labs = FALSE,
extra = 101,
under = FALSE,
under.cex = 1,
fallen.leaves = TRUE,
box.palette = "BuRd",
branch = 1,
round = 0,
leaf.round = 0,
prefix = "" ,
main = "",
cex.main = 1.5,
branch.col = "gray",
branch.lwd = 3)
As far as I understand, the box.palette option gave me the result I need in the first setting only because median(y) is close to zero.
In the second setting, however, the result is not what I want: I get blue shades for values below median(y) and red shades for values above it. How can I impose zero as the threshold between the two colors?
To be more specific, I would like a command that automatically enforces this two-color scheme in any tree.
OK, I answered my own question. The solution is actually quite simple: if the box.palette option is a two-color diverging palette (as in my example), we can use pal.thresh to set the threshold we want. In my case:
prp(my.tree,
type = 2,
clip.right.labs = FALSE,
extra = 101,
under = FALSE,
under.cex = 1,
fallen.leaves = TRUE,
box.palette = "BuRd",
branch = 1,
round = 0,
leaf.round = 0,
prefix = "" ,
main = "",
cex.main = 1.5,
branch.col = "gray",
branch.lwd = 3,
pal.thresh = 0) # HERE THE SOLUTION!
Even if this is probably bad for me, I will leave the answer here for future users and close the question, rather than deleting it.

3D trajectory visualization with path in R

I'm looking for an efficient way to plot time, x, y, z with different colors for different objects - to view proximity of the objects over time.
plot3D::lines3D works with add = TRUE, but it is not very elegant. Here is some sample code that works:
data$object_id <- factor(data$object_id)
library(plot3D)
for (tr in unique(data$object_id)) {
  lines3D(data$x[data$object_id == tr], data$y[data$object_id == tr], data$z[data$object_id == tr], add = T, col = data$object_id[data$object_id == tr])
}
Example data:
data <- data.frame(object_id = c(1, 1, 2, 2), t = c(0, 1, 0, 1), x = c(0, 1, 1, 0), y = c(0, 1, 1, 0), altitude = c(0, 1, 1, 0))
Desired result: the path traced by each object up to a given time, along with an arrow that indicates the current heading (determined by joining the last 2 known positions).
At time t = 0, this should yield nothing or only points. At t = 1, this should yield 2 lines (one over the other) in different colors, one color for each object.
The 2D equivalent is ggplot2::geom_path, which does all the heavy lifting through its group parameter, joining the points into one path per value of the grouping variable.

R function with multiple operators

My data range from -6 to 6 and I am trying to create 3 categories. However, my code does not assign anyone to category 2, even though there are people with values in that range.
FFMIBMDcopdcases$lowBMD = ifelse((FFMIBMDcopdcases$copd_Tscore >= -1) , 0,
ifelse((FFMIBMDcopdcases$copd_Tscore < -1), 1,
ifelse((FFMIBMDcopdcases$copd_Tscore <= -2.5), 2, NA)))
Try using the cut() function. Example:
myValues <- runif(n = 20, min = -6, max = 6)
as.numeric(as.character(cut(x = myValues, breaks = c(-Inf, -2.5, -1, Inf), labels = c(2, 1, 0))))
Since you want a numeric result, it might be easiest to use findInterval(), although you will need to subtract the result from 2 to get the inverse order (2 for lowest and 0 for highest). Pass only the two inner cut points, so that findInterval() returns 0, 1 or 2:
FFMIBMDcopdcases$lowBMD = 2 - findInterval(FFMIBMDcopdcases$copd_Tscore,
c(-2.5, -1))
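The reason the nested ifelse() in the question never returns 2 is that the second test (< -1) already catches every value that would also satisfy <= -2.5. As a small sketch, assuming the intended categories are exactly the ones written in the question, testing the most extreme condition first avoids this; t_score is a made-up vector standing in for FFMIBMDcopdcases$copd_Tscore.

```r
# Made-up scores standing in for FFMIBMDcopdcases$copd_Tscore.
t_score <- c(-3, -2.5, -2, -1, 0, 2)

# Test the narrowest condition first, then the wider one;
# everything left over is >= -1. NA inputs stay NA.
low_bmd <- ifelse(t_score <= -2.5, 2,
           ifelse(t_score < -1, 1, 0))

print(low_bmd)
# 2 2 1 0 0 0
```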

Density distributions in R

An assignment has tasked us with creating a series of variables: normal1, normal2, normal3, chiSquared1 and 2, t, and F. They are defined as follows:
library(tibble)
Normal.Frame <- data_frame(normal1 = rnorm(5000, 0, 1),
normal2 = rnorm(5000, 0, 1),
normal3 = rnorm(5000, 0, 1),
chiSquared1 = normal1^2,
chiSquared2 = normal2^2,
F = sum(chiSquared1/chiSquared2),
t = sum(normal3/sqrt(chiSquared1 )))
We then have to make histograms of the distributions for normal1, chiSquared1 and 2, t, and F, which is simple enough for normal1 and the chiSquared variables, but when I try to plot F and t, the plot space is blank.
Our lecturer recommended limiting the range of F to 0-10, and t to -5 to 5. To do this, I use:
HistT <- hist(Normal.Frame$t, xlim = c(-5, 5))
HistF <- hist(Normal.Frame$F, xlim = c(0, 10))
Like I mentioned, this yields blank plots.
Your t and F are defined as sums; they will be single values. If those values are outside your range, the histogram will be empty. If you remove the sum() function you should get the desired results.
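A minimal sketch of that fix in base R follows, using plain vectors instead of the tibble so it runs without extra packages. The names F_stat and t_stat are chosen here to avoid masking F and t, and subsetting to the plotted range before calling hist() is an added assumption: both ratios have heavy tails, and a few extreme values would otherwise make the bins very wide.

```r
set.seed(123)  # seed added only for reproducibility
n <- 5000

normal1 <- rnorm(n, 0, 1)
normal2 <- rnorm(n, 0, 1)
normal3 <- rnorm(n, 0, 1)
chiSquared1 <- normal1^2
chiSquared2 <- normal2^2

# Element-wise ratios, without sum(): one value per draw.
F_stat <- chiSquared1 / chiSquared2
t_stat <- normal3 / sqrt(chiSquared1)

# Assumption: keep only values inside the recommended ranges
# so the histogram bins are not stretched by extreme draws.
HistT <- hist(t_stat[abs(t_stat) <= 5], breaks = 50, xlim = c(-5, 5), main = "t")
HistF <- hist(F_stat[F_stat <= 10], breaks = 50, xlim = c(0, 10), main = "F")
```

With sum() removed, each variable holds 5000 values rather than a single number, so the histograms are no longer empty.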
