Add points to pairs plot? - r

Is there any way for me to add some points to a pairs plot?
For example, I can plot the Iris dataset with pairs(iris[1:4]), but I wanted to execute a clustering method (for example, kmeans) over this dataset and plot its resulting centroids on the plot I already had.
It would also help if there were a way to plot the whole data set and the centroids together in a single pairs plot, with the centroids drawn differently. The idea is to plot pairs(rbind(iris[1:4], centers)) (where centers holds the three centroids) but draw the last three rows of this matrix differently, for example by changing cex or pch. Is it possible?

You give the solution yourself in the last paragraph of your question. Yes, you can use pch and col in the pairs function.
pairs(rbind(iris[1:4], kmeans(iris[1:4],3)$centers),
pch=rep(c(1,2), c(nrow(iris), 3)),
col=rep(c(1,2), c(nrow(iris), 3)))

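Another option is to use a panel function:
cl <- kmeans(iris[1:4],3)
# Off-diagonal panels in the order pairs() draws them (row by row): x = column, y = row
idx <- subset(expand.grid(x=1:4,y=1:4),x!=y)
i <- 1  # global counter: which panel is being drawn
pairs(iris[1:4],bg=cl$cluster,pch=21,
panel=function(x, y,bg, ...) {
points(x, y, pch=21,bg=bg)
# Centroids for the two variables shown in this panel
points(cl$centers[,idx[i,'x']],cl$centers[,idx[i,'y']],
cex=4,pch=10,col='blue')
i <<- i +1
})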
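The counter i above depends on the order in which pairs draws its panels. A minimal sketch of a variation that avoids the counter, assuming it is acceptable to stack the centroids under the data: the per-point pch, col and cex vectors (pt_pch, pt_col, pt_cex are illustrative names, as are the seed and the style values) are looked up inside the panel function, so the last three rows are drawn differently in every panel.

```r
# Assumption: fixed seed and 3 clusters, chosen only for illustration
set.seed(42)
cl <- kmeans(iris[1:4], centers = 3)

# Stack the observations and the centroids into one data frame
combined <- rbind(iris[1:4], as.data.frame(cl$centers))
is_center <- rep(c(FALSE, TRUE), c(nrow(iris), nrow(cl$centers)))

# Per-point styles; the values are arbitrary choices
pt_pch <- ifelse(is_center, 17, 1)
pt_col <- ifelse(is_center, "red", "grey40")
pt_cex <- ifelse(is_center, 2.5, 0.8)

# Each panel receives the full columns, so the style vectors line up row by row
pairs(combined,
      panel = function(x, y, ...) {
        points(x, y, pch = pt_pch, col = pt_col, cex = pt_cex)
      })
```

Because the styles are set inside the panel function rather than passed through pairs, cex can vary per point without being forwarded to the axis code.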
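But I think it is safer and easier to use the splom function from the lattice package, because its panel function receives the row and column indices i and j directly. The legend is also generated automatically.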
cl <- kmeans(iris[1:4],3)
library(lattice)
splom(iris[1:4],groups=cl$cluster,pch=21,
panel=function(x, y,i,j,groups, ...) {
panel.points(x, y, pch=21,col=groups)
panel.points(cl$centers[,j],cl$centers[,i],
pch=10,col='blue')
},auto.key=TRUE)

Related

How to add more points/dots on an existing pairs plot?

par(mfrow=c(1,2))
Trigen <- data.frame(OTriathlon$Gender,OTriathlon$Swim,OTriathlon$Bike,OTriathlon$Run)
colnames(Trigen) <- c("Gender","Swim","Bike","Run")
res <- split(Trigen[,2:4],Trigen$Gender)
pairs(res$Male, pch="M", col = 4)
points(res$Female, pch ="F", col= 2)
Basically, I want to customize the pairs plot so that the plot symbol and color of each data point represent gender.
I tried a few things in the code, but the problem is that I can't add the female points to the existing plot. After running the points() call, the plot stays the same and is not updated.
There is no need to call points several times, because you can use the factor directly as a color. Example:
plot(iris[,c(2,3)], col=iris$Species)
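The same idea carries over to pairs. A call to points() after pairs() cannot target the individual panels, because pairs draws several panels and the plot region left afterwards does not correspond to any one of them. Below is a minimal sketch, assuming a made-up stand-in for the Trigen data frame (the OTriathlon data is not available here, so all values are simulated); the symbol and color of each point are looked up from Gender inside the panel function.

```r
# Simulated stand-in for Trigen; the numbers are invented for illustration
set.seed(1)
n <- 40
Trigen <- data.frame(
  Gender = sample(c("Female", "Male"), n, replace = TRUE),
  Swim   = rnorm(n, mean = 30, sd = 4),
  Bike   = rnorm(n, mean = 75, sd = 8),
  Run    = rnorm(n, mean = 50, sd = 6)
)

# One symbol and one color per row, looked up by gender
gender <- as.character(Trigen$Gender)
sym  <- unname(c(Female = "F", Male = "M")[gender])
cols <- unname(c(Female = 2, Male = 4)[gender])

# All points are drawn in a single pairs() call
pairs(Trigen[, c("Swim", "Bike", "Run")],
      panel = function(x, y, ...) points(x, y, pch = sym, col = cols))
```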

R distinct some points with different color

I have around 20,000 points in my scatter plot. I have a list of interesting points and want to show those points in the scatter plot in a different color. Is there a simple way to do it? Thank you.
Further explanation:
I have a matrix with 20,000 rows, say R1 to R20000, and 4 columns, say A, B, C, and D. Each row has its own row name. I want to make a scatter plot of A against C. That is easy with plot(data$A,data$C).
I also have a list of row names whose positions in the scatter plot I want to check, say R1, R3, R5, R10, R20, R25.
I just want the points R1, R3, R5, R10, R20, R25 to have a different color from the other points in the scatter plot. Sorry if my explanation is not clear.
If your data is in a simple form, then it is easy to do. For example:
# Make some toy data
dat <- data.frame(x = rnorm(1000), y = rnorm(1000))
# List of indices (or a logical vector) defining your interesting points
is.interesting <- sample(1000, 30)
# Create vector/column of colours
dat$col <- "lightgrey"
dat$col[is.interesting] <- "red"
# Plot
with(dat, plot(x, y, col = col, pch = 16))
Without a reproducible example, it's hard to say anything more specific.
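Since the question identifies the interesting points by row name rather than by index, here is a minimal sketch of that variant. It assumes toy data with columns A to D and row names R1, R2, ... (all made up); %in% turns the list of names into a logical vector, and the highlighted points are drawn last so they are not hidden by the others.

```r
# Toy data with named rows; values are random and only for illustration
set.seed(1)
dat <- data.frame(A = rnorm(200), B = rnorm(200), C = rnorm(200), D = rnorm(200))
rownames(dat) <- paste0("R", seq_len(nrow(dat)))

# Row names of the points to highlight (taken from the question)
interesting <- c("R1", "R3", "R5", "R10", "R20", "R25")
hit <- rownames(dat) %in% interesting

# Draw everything in grey first, then the interesting points on top in red
plot(dat$A, dat$C, col = "lightgrey", pch = 16, xlab = "A", ylab = "C")
points(dat$A[hit], dat$C[hit], col = "red", pch = 16)
```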

Tiled xyplots, how to colour independently to mimic levelplot map?

I am trying to plot several lines using the lattice package to produce a tiled figure. Here is an example (which is wrong, see below):
(source: ubuntuone.com)
I would like to mimic this one (in terms of the colour pattern) :
(source: ubuntuone.com)
Both plots are colouring the same variable "subset$speed1DaboveT". The first is done with xyplot the second with levelplot.
My problem is that the one with the curves is not coloured correctly, possibly because I am asking for the wrong thing. I have followed this example from another question (which, importantly, is a single figure):
dfrm <- data.frame( y=c(rnorm(10),rnorm(10)),
x=1:10,
grp=rep(c("a","b"),each=10))
xyplot(y~x, group=grp, type="l", data=dfrm, col=c("red","blue"))
By doing:
keys=unique(subset$speed1DaboveT)
theseCol= colF(length(keys))
pp= xyplot( freq~patch|B+G
,data=subset
,scales=list( x=list(cex=0.5), y=list(cex=0.5,lim=c(0,1.1)) )
,strip = FALSE
,t='l'
,group=speed1DaboveT
,col=theseCol
,lwd=2
,cex=0.5
,panel = function(x, y, col, ...) {
panel.xyplot(x, y, col=col, ...)
}
)
In my case I am using two conditioning factors (B and G), while the example has none. Does this mean that the colours given to xyplot should follow a different order? Does anyone know how to do this?
So, I failed to notice that the range would not be the same. Sorry for the question; things were solved once I forced the colour range of the xyplot to match that of the levelplot, that is, [0, 0.16].
Thanks
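A minimal sketch of what forcing a common colour range can look like, assuming toy data in place of the asker's data frame and a blue-to-red palette in place of the unknown colF function. Each group value is binned on fixed breaks over [0, 0.16], so the same value gets the same colour no matter which values happen to occur in the data.

```r
library(lattice)

# Toy stand-in for the asker's data; every value here is made up
d <- expand.grid(patch = 1:10,
                 speed = c(0.02, 0.08, 0.14),
                 B = c("b1", "b2"),
                 G = c("g1", "g2"))
d$freq <- with(d, (patch / 10) * (0.5 + 3 * speed))

# Fixed colour scale over [0, 0.16], independent of the observed values
pal    <- colorRampPalette(c("blue", "red"))(16)   # assumed palette
breaks <- seq(0, 0.16, length.out = 17)

# Sort the keys so the colours follow the order of the group levels
keys      <- sort(unique(d$speed))
line_cols <- pal[findInterval(keys, breaks, all.inside = TRUE)]

p <- xyplot(freq ~ patch | B + G, data = d, groups = speed,
            type = "l", lwd = 2, col = line_cols)
print(p)
```

The keys are sorted because lattice orders the levels of a numeric grouping variable by value, whereas unique() alone returns them in order of appearance.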

How can I recreate this 2d surface + contour + glyph plot in R?

I've run a 2D simulation in some modelling software, from which I've got an export of x,y point locations with a set of 6 attributes. I wish to recreate a figure that combines the data, like this:
The ellipses and the background are shaded according to attribute 1 (and the borders of these are of course representing the model geometry, but I don't think I can replicate that), the isolines are contours of attribute 2, and the arrow glyphs are from attributes 3 (x magnitude) and 4 (y magnitude).
The x,y points are centres of the triangulated mesh I think, and look like this:
I want to know how I can recreate a plot like this with R. To start with, I have irregularly spaced data because it was exported from an irregular mesh. That's immediately where I get stuck with R, having only ever used it for producing box-and-whisker plots and the like.
Here's the data:
https://dl.dropbox.com/u/22417033/Ellipses_noheader.txt
Edit: fields: x, y, heat flux (x), heat flux (y), thermal conductivity, Temperature, gradT (x), gradT (y).
names(Ellipses) <- c('x','y','dfluxx','dfluxy','kxx','Temps','gradTx','gradTy')
It's quite easy to make the lower plot, assuming that there is a data frame named 'edat' read in with:
edat <- read.table(file=file.choose())
with(edat, plot(V1,V2, cex=0.2))
Things get a bit more beautiful with:
with(edat, plot(V1,V2, cex=0.2, col=V5))
So I do not think your original is being faithfully represented by the data. The contour lines are NOT straight across the "conductors". I call them "conductors" because this looks somewhat like iso-potential lines in electrostatics. I'm adding some text here to serve as a search handle for others who might be searching for plotting problems in real-world physics: vector field (the arrows), heat equations, gradient, potential lines.
You can then overlay the vector field with:
with(edat, arrows(V1,V2, V1-20*V6*V7, V2-20*V6*V8, length=0.04, col="orange") )
You could "zoom in" with xlim and ylim:
with(edat, plot(V1,V2, cex=0.3, col=V5, xlim=c(0, 10000), ylim=c(-8000, -2000) ))
with(edat, arrows(V1,V2, V1-20*V6*V7, V2-20*V6*V8, length=0.04, col="orange") )
Guessing that the contour requested is for the Temps variable (column V6 of edat). Take your pick of contour plots.
require(akima)
# edat still has the default column names: V1 = x, V2 = y, V6 = Temps
intflow <- with(edat, interp(x=V1, y=V2, z=V6, xo=seq(min(V1), max(V1), length = 410),
yo=seq(min(V2), max(V2), length = 410), duplicate="mean", linear=FALSE) )
require(lattice)
contourplot(intflow$z)
filled.contour(intflow)
with( intflow, contour(x=x, y=y, z=z) )
The last one will mix with the other plotting examples since those were using base plotting functions. You may need to switch to points instead of plot.
There are several parts to your plot so you will probably need several tools to make the different parts.
The background and ellipses can be created with polygon (once you figure where they should be).
The contourLines function can calculate the contour lines for you, which you can then add with the lines function (or contour has an add argument and could probably be used to add the lines directly).
The akima package has a function interp which can estimate values on a grid given the values ungridded.
The my.symbols function along with ms.arrows, both from the TeachingDemos package, can be used to draw the vector field.
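As a small illustration of the contourLines and lines step, here is a sketch in base R only. It assumes values that are already on a regular grid (a made-up Gaussian bump named z), so the interpolation step with akima is left out.

```r
# Made-up gridded surface standing in for an interpolated attribute
gx <- seq(-2, 2, length.out = 50)
gy <- seq(-2, 2, length.out = 50)
z  <- outer(gx, gy, function(x, y) exp(-(x^2 + y^2)))

# Shaded background, then contour lines computed separately and drawn on top
image(gx, gy, z, col = grey(seq(0.5, 0.9, length.out = 20)), xlab = "x", ylab = "y")
iso <- contourLines(gx, gy, z, nlevels = 8)
for (one_line in iso) {
  lines(one_line$x, one_line$y, col = "blue")
}
```

Each element of the list returned by contourLines holds one line with its level and its x and y coordinates, which is why the loop can pass them straight to lines.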
@DWin is right to say that your graph doesn't faithfully represent your data, so I would advise you to follow his answer. However, here is how to reproduce your graph (as closely as I could):
Ellipses <- read.table(file.choose())
names(Ellipses) <- c('x','y','dfluxx','dfluxy','kxx','Temps','gradTx','gradTy')
require(splancs)
require(akima)
First preparing the data:
#First the background layer (the 'kxx' layer):
# Here the regular grid on which we're gonna do the interpolation
E.grid <- with(Ellipses,
expand.grid(seq(min(x),max(x),length=200),
seq(min(y),max(y),length=200)))
names(E.grid) <- c("x","y") # Without this step, function inout throws an error
E.grid$Value <- rep(0,nrow(E.grid))
#Split the dataset according to unique values of kxx
E.k <- split(Ellipses,Ellipses$kxx)
# Find the convex hull delimiting each of those values domain
E.k.ch <- lapply(E.k,function(X){X[chull(X$x,X$y),]})
for(i in unique(Ellipses$kxx)){ # Pick the value for each coordinate in our regular grid
E.grid$Value[inout(E.grid[,1:2],E.k.ch[names(E.k.ch)==i][[1]],bound=TRUE)]<-i
}
# Then the regular grid for the second layer (Temp)
T.grid <- with(Ellipses,
interp(x,y,Temps, xo=seq(min(x),max(x),length=200),
yo=seq(min(y),max(y),length=200),
duplicate="mean", linear=FALSE))
# The regular grids for the arrow layer (gradT)
dx <- with(Ellipses,
interp(x,y,gradTx,xo=seq(min(x),max(x),length=15),
yo=seq(min(y),max(y),length=10),
duplicate="mean", linear=FALSE))
dy <- with(Ellipses,
interp(x,y,gradTy,xo=seq(min(x),max(x),length=15),
yo=seq(min(y),max(y),length=10),
duplicate="mean", linear=FALSE))
T.grid2 <- with(Ellipses,
interp(x,y,Temps, xo=seq(min(x),max(x),length=15),
yo=seq(min(y),max(y),length=10),
duplicate="mean", linear=FALSE))
gradTgrid<-expand.grid(dx$x,dx$y)
And then the plotting:
palette(grey(seq(0.5,0.9,length=5)))
par(mar=rep(0,4))
plot(E.grid$x, E.grid$y, col=E.grid$Value,
axes=F, xaxs="i", yaxs="i", pch=19)
contour(T.grid, add=TRUE, col=colorRampPalette(c("blue","red"))(15), drawlabels=FALSE)
arrows(gradTgrid[,1], gradTgrid[,2], # Here I multiply the values so you can see them
gradTgrid[,1]-dx$z*40*T.grid2$z, gradTgrid[,2]-dy$z*40*T.grid2$z,
col="yellow", length=0.05)
To understand in detail how this code works, I advise you to read the following help pages: ?inout, ?chull, ?interp, ?expand.grid and ?contour.

matrix of hexbin plots with common bin/legend breaks

Suppose I have a number of replications of bivariate experiments which I wish to display simultaneously in hexagonally binned plots, with common cell counts. Is there existing code to do this? Is there an easy way to modify the hexbin package to do this for me?
For example:
library(hexbin)
x <- replicate(9, rnorm(10000), simplify=FALSE)
y <- replicate(9, rnorm(10000), simplify=FALSE)
h <- mapply(hexbin, x, y)
par(mfrow=c(3,3))
lapply(h, plot)
This code doesn't display a grid of hexbin plots with common cell counts, but I'd like it to.
hexbin objects are plotted using grid graphics so your par(mfrow=c(3,3)) does not do anything. Each graph is plotted on a separate page. To get the details of the plot options:
?gplot.hexbin
In this case, we want to set maxcnt to the largest cell count:
# count is a slot of the S4 hexbin object, so it is accessed with @
lapply(h, plot, maxcnt=max(unlist(lapply(h, function(x) max(x@count)))))
This will apply the same legend to each graph.
