Legend for Scatterplot Matrix in R - r

I am making a scatterplot matrix with three variables (x, y, z), with the points coloured differently based on type (w). I want to add a legend to this scatterplot matrix to explain what the colours of the plotted points stand for. However, the legend command does not seem to work (i.e. I see no legend in the plot), and changing the margins of the scatterplot matrix does not seem to work either (nothing changes in the plot). My code is below:
x <- rnorm(20, 1, 0.1)
y <- rnorm(20, 5, 1)
z <- rnorm(20, 10, 2)
w <- c(1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2)
dat <- data.frame(x, y, z, w)
pallet = c( "red", "blue")
plot(dat[c('x','y','z')], col=pallet[dat$w])
legend("topright", legend=c("one", "two"), col=pallet)
Is there a way to add a legend to a scatterplot matrix that I am missing here?
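One possible approach, sketched here under assumptions rather than taken from an answer in this thread, is to reserve outer margin space through the oma argument of pairs() and then draw the legend on an empty plot laid over the whole device. The margin sizes, the legend position and the pch value are illustrative choices:

```r
set.seed(1)
x <- rnorm(20, 1, 0.1)
y <- rnorm(20, 5, 1)
z <- rnorm(20, 10, 2)
w <- rep(1:2, times = 10)
dat <- data.frame(x, y, z, w)
pallet <- c("red", "blue")

# oma enlarges the right outer margin so the legend has room (sizes are a guess)
pairs(dat[c("x", "y", "z")], col = pallet[dat$w], oma = c(4, 4, 6, 12))

# Overlay an empty plot that spans the whole device, then place the legend in it
op <- par(fig = c(0, 1, 0, 1), oma = c(0, 0, 0, 0), mar = c(0, 0, 0, 0), new = TRUE)
plot(0, 0, type = "n", bty = "n", xaxt = "n", yaxt = "n")
legend("right", legend = c("one", "two"), col = pallet, pch = 1, bty = "n")
par(op)
```

Note that legend() needs pch (or lty/fill) for the coloured symbols to appear; col alone only sets their colour.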

Related

How can I highlight minimum values in a levelplot in R?

How can I highlight the ten minimum-value grid points of a 385*373 levelplot as black points?
I have the indexes as well as the coordinates of the ten minimum grid points. Preferably I would use the indexes...
I have the following levelplot displaying Europe's air temperature (Z), with X and Y being longitude and latitude respectively.
levelplot(Z ~ X*Y, data=data , xlab="X" , col.regions = heat.colors(100))
One further question: how can I add the country contours with the same projection type as the base data? I tried that before within another function
image(x,y,data,...)
data(wrld_simpl)
plot(wrld_simpl, add = TRUE)
where the country contours plot seemed to have a totally different projection. However, I want to do this for levelplot() now.
I am very thankful for any help!
Lattice plots differ from base graphics plots, so base functions such as points() do not work on them. Lattice provides panel equivalents instead. Here is a way to do it:
library(lattice)
x <- seq(-10, 10, length.out = 100)
y <- seq(-10, 10, length.out = 100)
z <- as.vector(sqrt(outer(x^2, y^2, "+")))
grid <- cbind(expand.grid(x=x, y=y), z)
minimum <- grid[which.min(grid$z),]
levelplot(z ~ x * y, grid, panel = function(...) {
  panel.levelplot(...)
  panel.points(x = minimum$x, y = minimum$y, pch = "x", cex = 2)
})
The plot is built up inside the panel argument: the custom panel function first draws the level plot and then adds the point on top.
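The same idea extends to the ten smallest values asked about in the question. The following is a minimal sketch on the dummy grid, assuming the data are in long format (one row per grid point); the names idx and minima are made up for illustration:

```r
library(lattice)

x <- seq(-10, 10, length.out = 100)
y <- seq(-10, 10, length.out = 100)
grid <- expand.grid(x = x, y = y)
grid$z <- sqrt(grid$x^2 + grid$y^2)

# Row indexes of the ten smallest z values, then the matching coordinates
idx <- order(grid$z)[1:10]
minima <- grid[idx, ]

p <- levelplot(z ~ x * y, grid, col.regions = heat.colors(100),
               panel = function(...) {
                 panel.levelplot(...)
                 # black filled points on top of the level plot
                 panel.points(minima$x, minima$y, pch = 16, col = "black")
               })
print(p)
```

If the indexes of the minimum grid points are already known, they can replace the order() call, provided they refer to rows of the same long-format data frame.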

R: how to optimize the position of labeling in plot

Hi, I guess I have quite a rudimentary question here.
I have a plot like this
but as you can easily see, some of the labels are not displayed properly (some overlap the symbols, some fall outside the figure frame).
I noticed that there is a way to adjust the position of the labels:
text(tsne_out$Y[,1], tsne_out$Y[,2], labels=samplegrouptry, pos=1)
For example, I can specify the value of "pos" (from 1 to 4). I guess that is good enough in most cases, but I wonder whether there are better ways to do this.
Any suggestions? Thanks!
Following the suggestion from vas_u, by changing the axis ranges as well as "pos", I could get a better plot:
One way around the problem would be to enlarge the axes of the plot.
Your example approximately reproduced with dummy data:
x <- rnorm(16, mean = 0)
y <- rnorm(16, mean = 1)
# Initial scatterplot with text labels out of plot area:
plot(x, y, pch = 16)
text(x, y, labels = paste("Name", 1:16), pos = 1) # Some labels outside plot area
# Second plot with the X and Y axes gently expanded:
plot(x, y, pch = 16,
     xlim = 1.1*range(x),
     ylim = 1.1*range(y))
text(x, y, labels = paste("Name", 1:16), pos = 1) # Labels now fit inside!
I hope this helps.
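One caveat: multiplying range() by 1.1 scales the limits away from zero, which widens the axes only when the data straddle zero. For data that are all positive, the lower limit would move inward instead. A sketch of an alternative that pads both ends relative to the data range uses extendrange() from grDevices; the 10% padding is an arbitrary choice:

```r
set.seed(42)
x <- rnorm(16, mean = 0)
y <- rnorm(16, mean = 1)

# extendrange() pads each end by a fraction f of the data range
xlim <- extendrange(x, f = 0.1)
ylim <- extendrange(y, f = 0.1)

plot(x, y, pch = 16, xlim = xlim, ylim = ylim)
text(x, y, labels = paste("Name", 1:16), pos = 1)
```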

How to get R plot to plot variable on heat.color scale

I'm plotting data in R. I'm running the following two commands:
plot(x = df$Latitude, df$Longitude, col = heat.colors(nrow(df)), type = "p")
plot(x = df$Latitude, df$Longitude, col = df$feature, type = "p")
The first line plots the points along a color gradient (points with higher values are red, points with lower values are yellow) and the second line plots data with color dictated by the int values given by features.
However, I want to combine both such that I'm plotting points with colors on a scale using the numeric values from feature. In some sense, I want to pass two arguments to col. How can I do this?
You can try:
# some data
set.seed(123)
x <- rnorm(100)
# Create some breaks and use colorRampPalette to transform the breaks into a color code
gr <- .bincode(x, seq(min(x), max(x), length.out = length(x)), include.lowest = TRUE)
col <- colorRampPalette(c("red", "white", "blue"))(length(x))[gr]
# the plot:
plot(x, pch=16, col=col)
For a legend see solutions here or here
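Applied to the question's setting, the same binning idea might look like the sketch below. The data frame is a made-up stand-in for the asker's df, and the number of colours (100) is arbitrary. cut() is used here in place of .bincode():

```r
set.seed(123)
# Hypothetical stand-in for the question's data frame
df <- data.frame(Latitude  = runif(50, 40, 50),
                 Longitude = runif(50, 0, 10),
                 feature   = rnorm(50))

n_col <- 100
pal <- rev(heat.colors(n_col))   # reversed so low values are pale and high values are red
# Assign each feature value to one of n_col equal-width bins (integer codes 1..n_col)
bin <- cut(df$feature, breaks = n_col, labels = FALSE)

plot(df$Latitude, df$Longitude, col = pal[bin], pch = 16,
     xlab = "Latitude", ylab = "Longitude")
```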

How to do a 3D plot using R?

I want to plot a 3D plot using R. My data set is independent, which means the values of x, y, and z are not dependent on each other. The plot I want is given in this picture:
This plot was drawn by someone using MATLAB. How can I do the same kind of plot using R?
Since you posted your image file, it appears you are not trying to make a 3d scatterplot, rather a 2d scatterplot with a continuous color scale to indicate the value of a third variable.
Option 1: For this approach I would use ggplot2
library(ggplot2)
# make data
mydata <- data.frame(x = rnorm(100, 10, 3),
                     y = rnorm(100, 5, 10),
                     z = rpois(100, 20))
ggplot(mydata, aes(x, y)) + geom_point(aes(color = z)) + theme_bw()
Which produces:
Option 2: To make a 3d scatterplot, use the cloud function from the lattice package.
library(lattice)
# make some data
x <- runif(20)
y <- rnorm(20)
z <- rpois(20, 5) / 5
cloud(z ~ x * y)
I usually do these kinds of plots with the base plotting functions and some helper functions for the color levels and color legend from the sinkr package (you need the devtools package to install it from GitHub).
Example:
#library(devtools)
#install_github("marchtaylor/sinkr")
library(sinkr)
# example data
grd <- expand.grid(
  x=seq(nrow(volcano)),
  y=seq(ncol(volcano))
)
grd$z <- c(volcano)
# plot
COL <- val2col(grd$z, col=jetPal(100))
op <- par(no.readonly = TRUE)
layout(matrix(1:2,1,2), widths=c(4,1), heights=4)
par(mar=c(4,4,1,1))
plot(grd$x, grd$y, col=COL, pch=20)
par(mar=c(4,1,1,4))
imageScale(grd$z, col=jetPal(100), axis.pos=4)
mtext("z", side=4, line=3)
par(op)
Result:

Plotting multiple curves same graph and same scale

This is a follow-up of this question.
I wanted to plot multiple curves on the same graph but so that my new curves respect the same y-axis scale generated by the first curve.
Notice the following example:
y1 <- c(100, 200, 300, 400, 500)
y2 <- c(1, 2, 3, 4, 5)
x <- c(1, 2, 3, 4, 5)
# first plot
plot(x, y1)
# second plot
par(new = TRUE)
plot(x, y2, axes = FALSE, xlab = "", ylab = "")
That actually plots both sets of values on the same coordinates of the graph (because I'm hiding the new y-axis that would be created with the second plot).
My question then is how to maintain the same y-axis scale when plotting the second graph.
(The typical method is to call plot just once to set up the limits, usually covering the range of all series combined, and then to add the separate series with points and lines.) To call plot multiple times with par(new=TRUE), you need to make sure that the first plot has a ylim wide enough to accommodate all the series (in other situations you may need the same strategy for xlim):
# first plot
plot(x, y1, ylim=range(c(y1,y2)))
# second plot EDIT: needs to have same ylim
par(new = TRUE)
plot(x, y2, ylim=range(c(y1,y2)), axes = FALSE, xlab = "", ylab = "")
The next code does the task more compactly. By default matplot draws the column numbers as plotting symbols; the second call uses pch=1 to give the usual R open-circle points:
matplot(x, cbind(y1,y2))
matplot(x, cbind(y1,y2), pch=1)
points or lines come in handy if
y2 is generated later, or
the new data does not have the same x but still should go into the same coordinate system.
As your ys share the same x, you can also use matplot:
matplot (x, cbind (y1, y2), pch = 19)
(without the pch, matplot will plot the column numbers of the y matrix instead of dots).
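For the case where the new data do not share the same x, a minimal sketch with lines() could look like this. The second series (x2, y2) is invented for illustration; the limits are computed over both series so that neither is clipped:

```r
x  <- c(1, 2, 3, 4, 5)
y1 <- c(100, 200, 300, 400, 500)
# Hypothetical second series on a different x grid
x2 <- c(1.5, 2.5, 3.5, 4.5)
y2 <- c(150, 120, 480, 300)

# Limits that cover both series
xlim <- range(c(x, x2))
ylim <- range(c(y1, y2))

plot(x, y1, type = "b", xlim = xlim, ylim = ylim, ylab = "y")
# lines() adds to the existing coordinate system instead of starting a new plot
lines(x2, y2, type = "b", col = "red", pch = 19)
```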
You aren't being very clear about what you want here, since I think @DWin's answer is technically correct, given your example code. I think what you really want is this:
y1 <- c(100, 200, 300, 400, 500)
y2 <- c(1, 2, 3, 4, 5)
x <- c(1, 2, 3, 4, 5)
# first plot
plot(x, y1,ylim = range(c(y1,y2)))
# Add points
points(x, y2)
DWin's solution was operating under the implicit assumption (based on your example code) that you wanted to plot the second set of points overlaid on the original scale. That's why his image looks like the points are plotted at 1, 101, etc. Calling plot a second time isn't what you want; you want to add to the existing plot using points. The above code on my machine produces this:
But DWin's main point about using ylim is correct.
My solution is to use ggplot2. It takes care of these types of things automatically. The biggest thing is to arrange the data appropriately.
y1 <- c(100, 200, 300, 400, 500)
y2 <- c(1, 2, 3, 4, 5)
x <- c(1, 2, 3, 4, 5)
df <- data.frame(x=rep(x,2), y=c(y1, y2), class=c(rep("y1", 5), rep("y2", 5)))
Then use ggplot2 to plot it
library(ggplot2)
ggplot(df, aes(x=x, y=y, color=class)) + geom_point()
This is saying plot the data in df, and separate the points by class.
The plot generated is
I'm not sure what you want, but I'll use lattice.
library(lattice)
x = rep(x,2)
y = c(y1,y2)
fac.data = as.factor(rep(1:2,each=5))
df = data.frame(x=x,y=y,z=fac.data)
# this creates a data frame with a factor variable, z, that tells me which series each row belongs to (y1 or y2)
Then, just plot
xyplot(y ~x|z, df)
# or maybe
xyplot(x ~y|z, df)
