I have created a NMDS plot using the 'vegan' package, like this:
y=metaMDS(data,type="p")
plot(y)
Now I have this NMDS with a good spread of my points. However, I would like to adjust the graphics of the plot: I would like to give the points a different colour depending on a categorical variable in my dataset (called 'regio'), which has two values (1 or 2).
Is this possible? And if so, how?
Best,
Koen
The easiest way is to use the grouping variable regio to index into a vector of colours you want to plot with. E.g., (untested as I don't have your data...)
colvec <- c("red","blue")
plot(y, type = "n")
points(y, display = "sites", col = colvec[data$regio])
## or
text(y, display = "sites", col = colvec[data$regio])
## depending on how you want to represent the sample scores
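To make the indexing step concrete, here is a minimal base R sketch of what colvec[data$regio] evaluates to. The regio values below are made up for illustration (they are not the asker's data), and the character case at the end is an assumption about how the column might be stored.

```r
# Hypothetical grouping variable with the two values from the question
regio <- c(1, 2, 2, 1, 2)
colvec <- c("red", "blue")

# Indexing picks one colour per sample: value 1 -> "red", value 2 -> "blue"
site_cols <- colvec[regio]
print(site_cols)

# If regio is stored as text ("1", "2"), convert it to a factor first so the
# index uses the integer level codes rather than (missing) element names
regio_chr <- c("1", "2", "2", "1", "2")
site_cols_chr <- colvec[factor(regio_chr)]
```

The resulting vector has one colour per row, in row order, which is why it can be passed straight to col in points() or text().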
Related
par(mfrow=c(1,2))
Trigen <- data.frame(OTriathlon$Gender,OTriathlon$Swim,OTriathlon$Bike,OTriathlon$Run)
colnames(Trigen) <- c("Gender","Swim","Bike","Run")
res <- split(Trigen[,2:4],Trigen$Gender)
pairs(res$Male, pch="M", col = 4)
points(res$Female, pch ="F", col= 2)
Basically, I want to customize the pairs plot so that the plot symbol and color of each data point represent gender.
I tried a few things in the code, but the issue I am facing is that I can't add the female points to the existing plot. After running the points code, the plot stays the same and doesn't get updated.
There is no need to call points several times, because you can use the factor directly as a color. Example:
plot(iris[,c(2,3)], col=iris$Species)
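Applied to the pairs plot from the question, the same idea might look like the sketch below. The OTriathlon data is not available here, so the data frame is a made-up stand-in with the same column names. As far as I can tell, the reason points() appears to do nothing is that pairs() draws many panels and restores the graphics settings when it finishes, so a later points() call is not drawn inside any of the panels.

```r
# Hypothetical stand-in for the OTriathlon data, which is not available here
set.seed(1)
Trigen <- data.frame(
  Gender = factor(rep(c("Female", "Male"), each = 10)),
  Swim   = rnorm(20, mean = 20, sd = 2),
  Bike   = rnorm(20, mean = 60, sd = 5),
  Run    = rnorm(20, mean = 35, sd = 3)
)

# One symbol and one colour per factor level (levels sort as Female, Male),
# picked by indexing with the factor's integer codes
sym  <- c("F", "M")[Trigen$Gender]
colr <- c(2, 4)[Trigen$Gender]

# A single pairs() call draws both genders in every panel
pairs(Trigen[, c("Swim", "Bike", "Run")], pch = sym, col = colr)
```

Because pch and col are full-length vectors, no split() or second plotting call is needed.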
I am currently trying to plot a large dataset with its subgroups shown in different colours. I have separated the data into 6 subgroups with 6 colours. However, my plot3d function only plots the first two principal components.
Here is the example of the plot.
Here is the code. I have run a PCA on my dataset and originally only wanted to show the first 3 principal components, but I have tried plotting all the principal components to make sure the problem isn't to do with the data.
PCA_Model <- prcomp(t(Input_dataset), center = T, scale=F)
samples_names <- row.names(PCA_Model$rotation)
# Bind sample names to their subgroup
pca_matrix <- cbind(samples_names, "Subgroup"=labeled_subgroup, stringsAsFactors=FALSE)
# Link dataframe to color
colours <- as.character(factor(pca_matrix[,"Subgroup"], levels = paste0("C", 1:6),labels = c("blue",
"red", "yellow", "green", "black", "white")))
plot3d(PCA_Model$x[,1:440], col=colours)
The dataset is very diverse so should show all subgroups. Any help would be much appreciated!
You may be using the wrong plotting function. Using scatter3D in the latest version of the plot3D package:
library(plot3D)
# fit PCA model
PCA_Model <- prcomp(dplyr::select(iris, -Species), center = TRUE, scale. = FALSE)
# Plot: colvar carries the group codes, col supplies one colour per group
scatter3D(x = PCA_Model$x[,1], y = PCA_Model$x[,2], z = PCA_Model$x[,3],
colvar = as.integer(iris$Species),
col = c("red", "green3", "blue"))
I think you get something odd from feeding characters into the col option of plot3d. So below I show an example of how to feed in the colours: create a colour vector first, name it after your levels, and then index it by the group labels. Adjust the script below for 6 colours:
library(rgl)
library(RColorBrewer)
pca = prcomp(iris[,-5])$x
COLS = brewer.pal(3,"Set1")
names(COLS) = levels(iris$Species)
plot3d(pca,col=COLS[as.character(iris$Species)])
I used snapshot3d() to capture the image, and the axis labels seem quite squished
I found some similar questions but the answers didn't solve my problem.
I am trying to plot a time series of two variables as a scatterplot, using the date to colour the points. In this example, I created a simple dataset (see below) and I want to plot all data with timesteps in the 1960s, 1970s, 1980s and 1990s with one colour per decade.
Using the standard plot command (plot(x,y,...)) it works the way it should, but when I try using the ggplot library something strange happens. I guess I am missing something. Does anyone have an idea how to solve this and generate a correct plot?
Here is my code using the standard plot command with a colorbar
# generate data frame with test data
x <- seq(1,40)
y <- seq(1,40)
year <- c(rep(seq(1960,1969),2),seq(1970,1989,2),seq(1990,1999))
df <- data.frame(x,y,year)
# define intervals and assign a color to each interval
myinterval <- seq(1959,1999,10)
mycolors <- rainbow(4)
colbreaks <- findInterval(df$year, vec = myinterval, left.open = T)
# basic plot
layout(array(1:2,c(1,2)),widths =c(5,1)) # divide the device area in two panels
par(oma=c(0,0,0,0), mar=c(3,3,3,3))
plot(x,y,pch=20,col = mycolors[colbreaks])
# add colorbar
ncols <- length(myinterval)-1
colbarlabs <- seq(1960,2000,10)
par(mar=c(5,0,5,5))
image(t(array(1:ncols, c(ncols,1))), col=mycolors, axes=F)
box()
axis(4, at=seq(0.5/(ncols-1)-1/(ncols-1),1+1/(ncols-1),1/(ncols-1)), labels=colbarlabs, cex.axis=1, las=1)
abline(h=seq(0.5/(ncols-1),1,1/(ncols-1)))
mtext("year",side=3,line=0.5,cex=1)
Because I would like to use the ggplot2 package, as I do for my other plots, I tried this version with ggplot:
# plot with ggplot
require(ggplot2)
ggplot(df, aes(x=x,y=y,color=year)) + geom_point() +
scale_colour_gradientn(colours= mycolors[colbreaks])
but it didn't work the way I thought it would. Obviously, there is something wrong with the color coding. Also, the colorbar looks strange. I also tried it with scale_color_manual and scale_color_gradient2 but I got more errors (Error in continuous_scale).
Any idea how to solve this and generate a plot that matches the standard plot, including a colorbar?
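One possible approach, sketched under the assumption that a discrete legend is an acceptable substitute for the colorbar: bin year into decades with cut() and map the resulting factor to the four colours with scale_colour_manual. My understanding is that scale_colour_gradientn treats its colours argument as evenly spaced stops along the continuous year range, so passing the 40 per-point colours does not assign one colour per point. The decade labels and the requireNamespace guard are my own additions, not part of the question.

```r
# Same test data as in the question
x <- seq(1, 40)
y <- seq(1, 40)
year <- c(rep(seq(1960, 1969), 2), seq(1970, 1989, 2), seq(1990, 1999))
df <- data.frame(x, y, year)

myinterval <- seq(1959, 1999, 10)
mycolors <- rainbow(4)

# Discretise year into decades; cut() uses right-closed intervals by default,
# which matches findInterval(..., left.open = TRUE) in the base version
df$decade <- cut(df$year, breaks = myinterval,
                 labels = c("1960s", "1970s", "1980s", "1990s"))

# Draw the plot only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = x, y = y, colour = decade)) +
    geom_point() +
    scale_colour_manual(values = mycolors, name = "year")
  print(p)
}
```

With a factor on the colour aesthetic, ggplot builds a four-entry legend instead of a continuous colorbar, which is the closest match to the four-block colorbar in the base plot.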
Is there any way for me to add some points to a pairs plot?
For example, I can plot the Iris dataset with pairs(iris[1:4]), but I wanted to execute a clustering method (for example, kmeans) over this dataset and plot its resulting centroids on the plot I already had.
It would also help if there were a way to plot the whole data and the centroids together in a single pairs plot, with the centroids drawn differently. The idea is to plot pairs(rbind(iris[1:4], centers)) (where centers holds the three centroids) but draw the last three rows of this matrix differently, for example by changing cex or pch. Is it possible?
You give the solution yourself in the last paragraph of your question. Yes, you can use pch and col in the pairs function.
pairs(rbind(iris[1:4], kmeans(iris[1:4],3)$centers),
pch=rep(c(1,2), c(nrow(iris), 3)),
col=rep(c(1,2), c(nrow(iris), 3)))
Another option is to use a panel function:
cl <- kmeans(iris[1:4],3)
idx <- subset(expand.grid(x=1:4,y=1:4),x!=y)
i <- 1
pairs(iris[1:4],bg=cl$cluster,pch=21,
panel=function(x, y,bg, ...) {
points(x, y, pch=21,bg=bg)
points(cl$centers[,idx[i,'x']],cl$centers[,idx[i,'y']],
cex=4,pch=10,col='blue')
i <<- i +1
})
But I think it is safer and easier to use lattice splom function. The legend is also automatically generated.
cl <- kmeans(iris[1:4],3)
library(lattice)
splom(iris[1:4],groups=cl$cluster,pch=21,
panel=function(x, y,i,j,groups, ...) {
panel.points(x, y, pch=21,col=groups)
panel.points(cl$centers[,j],cl$centers[,i],
pch=10,col='blue')
},auto.key=TRUE)
I have a simple scatter plot
x<-rnorm(100)
y<-rnorm(100)
z<-rnorm(100)
I want to plot the plot(x,y) but the color of the points should be color coded based on z.
Also, I would like to have the ability to define how many groups (and thus colours) z should have. And that this grouping should be resistant to outliers (maybe split the z density into n equal density groups).
Until now I have done this manually. Is there any way to do this automatically?
Note: I want to do this with base R not with ggplot.
You can pass a vector of colours to the col parameter, so it is just a matter of defining your z groups in a way that makes sense for your application. There is the cut() function in base, or cut2() in Hmisc which offers a bit more flexibility. To assist in picking reasonable colour palettes, the RColorBrewer package is invaluable. Here's a quick example after defining x,y,z:
z.cols <- cut(z, 3, labels = c("pink", "green", "yellow"))
plot(x,y, col = as.character(z.cols), pch = 16)
You can obviously add a legend manually. Unfortunately, I don't think all types of plots accept vectors for the col argument, but type = "p" obviously works. For instance, plot(x,y, type = "l", col = as.character(z.cols)) comes out as a single colour for me. For these plots, you can add different colours with lines() or segments(), or whichever low-level plotting command you need. See the answer by @Andrie for doing this with type = "l" plots in base graphics here.
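For the equal-density grouping the question asks about, one option is to hand cut() quantile breaks instead of a number of bins. This is only a sketch: n, the heat.colors palette, and the variable names z.grp and palette_n are arbitrary choices of mine.

```r
set.seed(42)
x <- rnorm(100)
y <- rnorm(100)
z <- rnorm(100)

n <- 4  # number of groups (and colours)

# Quantile breaks give groups of (nearly) equal size, so a few outliers
# in z cannot stretch one bin over most of the data
breaks <- quantile(z, probs = seq(0, 1, length.out = n + 1))
z.grp <- cut(z, breaks = breaks, include.lowest = TRUE)

# One colour per group, picked by indexing with the factor
palette_n <- heat.colors(n)
plot(x, y, col = palette_n[z.grp], pch = 16)
legend("topright", legend = levels(z.grp), col = palette_n, pch = 16,
       title = "z")
```

By contrast, cut(z, 3) splits the range of z into equal-width intervals, so a single extreme value can leave almost all points in one colour. Note that quantile breaks need distinct values: with heavily tied data cut() will complain that the breaks are not unique.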