How to colour every sub group of points? - r

I am trying to plot a set of points in R with the plot command, and I would like to colour them by subgroup. E.g. I have 9 points: the first three should be red, the next three blue, ...

You just have to provide a vector of the respective colours:
plot(1:9, 1:9, col = c(rep("black", 3), rep("blue", 3), rep("red", 3)))
Under normal circumstances, though, you shouldn't do this manually; create the vector of colours from a grouping variable instead.
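A minimal sketch of that idea, assuming a made-up grouping factor grp (it is not part of the question): index a vector of colours by the factor, so that each point picks up the colour of its group.

```r
# Hypothetical grouping variable: three groups of three points each
grp <- factor(rep(c("a", "b", "c"), each = 3))
# One colour per factor level, in level order (a, b, c)
group_cols <- c("red", "blue", "green")
# Indexing by a factor uses its integer codes, giving one colour per point
point_cols <- group_cols[grp]
plot(1:9, 1:9, col = point_cols, pch = 16)
```

If the grouping changes, only grp (or group_cols) needs to be updated; the plot call stays the same.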

Without more detailed requirements, sample code and some example data it'll be difficult to determine exactly what you're looking for, but perhaps the col parameter for qplot is what you need.
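# Load the required library.
library(ggplot2)
# Create a data.frame with coordinates and a grouping column.
p <- data.frame(x=1:9, y=1:9, c=rep(c("red","blue","green"), each=3))
# Plot; the values in c are treated as group labels, so ggplot2 picks its own colours for them.
qplot(x, y, col=c, data=p)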
Hope that helps.
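If the colour names stored in the column should be used literally rather than as group labels, one option is ggplot2's scale_colour_identity(). A sketch on the same toy data; the object name g is only for illustration.

```r
library(ggplot2)

# Same toy data: column c holds literal colour names
p <- data.frame(x = 1:9, y = 1:9, c = rep(c("red", "blue", "green"), each = 3))

# scale_colour_identity() tells ggplot2 to use the values in c as the colours themselves
g <- ggplot(p, aes(x, y, colour = c)) +
  geom_point() +
  scale_colour_identity()
print(g)
```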

Related

How to add more points/dots on an existing pairs plot?

par(mfrow=c(1,2))
Trigen <- data.frame(OTriathlon$Gender,OTriathlon$Swim,OTriathlon$Bike,OTriathlon$Run)
colnames(Trigen) <- c("Gender","Swim","Bike","Run")
res <- split(Trigen[,2:4],Trigen$Gender)
pairs(res$Male, pch="M", col = 4)
points(res$Female, pch ="F", col= 2)
Basically, I want to customize the pairs plot so that the plot symbol and color of each data point represent gender.
I tried a few things in the code, but the issue I am facing is that I can't add the female points to the existing plot. After running the points code, the plot just stays the same and doesn't get updated.
There is no need to call points several times, because you can use the factor directly as a color. Example:
plot(iris[,c(2,3)], col=iris$Species)
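The same idea should carry over to pairs, which passes per-point col and pch vectors on to its panels. A sketch with made-up data standing in for OTriathlon (the data frame tri and its values are assumptions, not the asker's data):

```r
set.seed(1)
# Hypothetical stand-in for the triathlon data
tri <- data.frame(
  Gender = factor(rep(c("Female", "Male"), each = 10)),
  Swim = rnorm(20, 30, 3),
  Bike = rnorm(20, 70, 5),
  Run = rnorm(20, 40, 4)
)
# Factor levels are sorted alphabetically: Female first, then Male
pt_col <- c("red", "blue")[tri$Gender]
pt_pch <- c("F", "M")[tri$Gender]
# One call draws both groups in every panel
pairs(tri[, c("Swim", "Bike", "Run")], col = pt_col, pch = pt_pch)
```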

R distinct some points with different color

I have around 20,000 points in my scatter plot. I have a list of interesting points and want to show them in the scatter plot in a different color. Is there a simple way to do it? Thank you.
Further explanation,
I have a matrix consisting of 20,000 rows, let's say R1 to R20000, and 4 columns, let's say A, B, C, and D. Each row has its own row name. I want to make a scatter plot between A and C. It is easy with plot(data$A,data$B).
On the other hand, I have a list of row names, and I want to check where in the scatter plot those points are. Let's say R1,R3,R5,R10,R20,R25.
I just want to make the color of R1,R3,R5,R10,R20,R25 in the scatter plot different from the other points. Sorry if my explanation is not clear.
If your data is in a simple form, then it is easy to do. For example:
# Make some toy data
dat <- data.frame(x = rnorm(1000), y = rnorm(1000))
# List of indices (or a logical vector) defining your interesting points
is.interesting <- sample(1000, 30)
# Create vector/column of colours
dat$col <- "lightgrey"
dat$col[is.interesting] <- "red"
# Plot
with(dat, plot(x, y, col = col, pch = 16))
Without a reproducible example, it's hard to say anything more specific.
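Since the question identifies the interesting points by row name, here is a sketch of the same approach using %in% on the row names. The data are simulated, and the names is_hit and point_cols are made up for the example.

```r
set.seed(42)
# Toy data with row names R1..R100 and columns A to D
dat <- as.data.frame(matrix(rnorm(400), ncol = 4,
                            dimnames = list(paste0("R", 1:100), c("A", "B", "C", "D"))))
interesting <- c("R1", "R3", "R5", "R10", "R20", "R25")
# TRUE for rows whose name appears in the list
is_hit <- rownames(dat) %in% interesting
point_cols <- ifelse(is_hit, "red", "lightgrey")
plot(dat$A, dat$C, col = point_cols, pch = 16)
# Redraw the interesting points on top so the grey ones do not hide them
points(dat$A[is_hit], dat$C[is_hit], col = "red", pch = 16)
```

With 20,000 points the second points call matters, because points drawn later cover earlier ones.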

Mismatch in legend/color in R plot

I created several plots in R. Occasionally, the program does not match the color of the variables in the plot to the variable colors in the legend. In the attached file (Unfortunately, I can't yet attach images b/c of reputation), the first 2 graphs are assigned a black/red color scheme. But, the third chart automatically uses a green/black and keeps the legend with black/red. I cannot understand why this happens.
How can I prevent this from happening?
I know it's possible to assign color, but I am struggling to find a clear way to do this.
Code:
plot(rank, abundance, pch=16, col=type, cex=0.8)
legend(60,50,unique(type),col=1:length(type),pch=16)
plot(rank, abundance, pch=16, col=Origin, cex=0.8)
legend(60,50,unique(Origin),col=1:length(Origin),pch=16)
Below is where color pattern won't match
plot(rank, abundance, pch=16, col=Lifecycle, cex=0.8)
legend(60,50,unique(Lifecycle),col=1:length(Lifecycle),pch=16)
data frame looks like this:
Plant  rank  abundance  Lifecycle  Origin  type
X      1     23         Perennial  Native  Weedy
Y      2     10         Annual     Exotic  Ornamental
Z      3     9          Perennial  Native  Ornamental
First, I create some fake data.
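# factor() is needed in R >= 4.0, where data.frame() no longer converts strings to factors
df <- data.frame(rank = 1:10, abundance = runif(10, 10, 100),
                 Lifecycle = factor(sample(c('Perennial', 'Annual'), 10, replace = TRUE)))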
Then I explicitly say what colors I want my points to be.
cols=c('dodgerblue', 'plum')
Then I plot, using the factor df$Lifecycle to color points.
plot(df$rank, df$abundance, col = cols[df$Lifecycle], pch=16)
When the factor df$Lifecycle is used as an index above, R uses its underlying integer codes, which follow the order of the factor levels (alphabetical by default). Therefore, in the legend, we just need to sort the unique df$Lifecycle values and hand it the same color vector (cols).
legend(5, 40, sort(unique(df$Lifecycle)), col=cols, pch=16, bty='n')
Hopefully this helps.
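A variation that does not depend on level order at all, sketched with a named colour vector (the name life_cols and the data are made up): look the colours up by label, and build the legend from that same vector so the two cannot drift apart.

```r
set.seed(7)
df <- data.frame(rank = 1:10, abundance = runif(10, 10, 100),
                 Lifecycle = sample(c("Perennial", "Annual"), 10, replace = TRUE))
# Names are the group labels, values are the colours
life_cols <- c(Annual = "dodgerblue", Perennial = "plum")
# as.character() makes the lookup go by name, whether the column is a factor or not
point_cols <- life_cols[as.character(df$Lifecycle)]
plot(df$rank, df$abundance, col = point_cols, pch = 16)
# Labels and colours both come from life_cols
legend("topright", legend = names(life_cols), col = life_cols, pch = 16, bty = "n")
```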

Coloring scatterplot in R based on fold enrichment

I'm very new to R and have tried to search around for an answer to my question, but couldn't find quite what I was looking for (or I just couldn't figure out the right keywords to include!). I think this is a fairly common task in R though, I am just very new.
I have an x vs y scatterplot and I want to color those points for which there is at least a 2-fold enrichment, i.e. where x/y>=2. Since my values are expressed as log2 values, the transformed value needs to be x/y>=4.
I currently have the scatterplot plotted with
plot(log2(counts[,40]), log2(counts[,41]))
where counts is an imported .csv file and 40 & 41 are my columns of interest.
I've also created a column for fold change using
counts$fold<-counts[,41]/counts[,40]
I don't know how to incorporate these two pieces of information... Ultimately I want a graph that looks something like the example here: http://s17.postimg.org/s3k1w8r7j/error_messsage_1.png
where those points that are at least two-fold enriched will be colored in blue.
Any help would be greatly appreciated. Thanks!
Is this what you're looking for:
# Fake data
dat = data.frame(x=runif(100,0,50), y = rnorm(100, 10, 2))
plot(dat$x, dat$y, col=ifelse(dat$x/dat$y > 4, "blue", "red"), pch=16)
The ifelse statement creates a vector of "blue" and "red" (or whatever colors you want) based on the values of dat$x/dat$y and plot uses that to color the points.
This might be helpful if you've never worked with colors in R.
Another option is to use ggplot2 instead of base graphics. Here's an example:
library(ggplot2)
ggplot(dat, aes(x,y, colour=cut(x/y, breaks=c(-1000,4,1000),
labels=c("<=4",">4")))) +
geom_point(size=5) +
labs(colour="x/y")
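For the log2-transformed plot in the question, note that a ratio threshold becomes a difference threshold, since log2(a/b) = log2(a) - log2(b); a 2-fold enrichment corresponds to a difference of at least 1 on the log2 scale. A sketch with simulated counts (the columns ctrl and trt are assumptions standing in for columns 40 and 41):

```r
set.seed(3)
# Simulated counts; +1 avoids taking log2 of zero
counts <- data.frame(ctrl = rpois(200, 50) + 1, trt = rpois(200, 60) + 1)
# Make the first 20 rows clearly enriched (exactly 4-fold)
counts$trt[1:20] <- counts$ctrl[1:20] * 4
lx <- log2(counts$ctrl)
ly <- log2(counts$trt)
# trt/ctrl >= 2 corresponds to log2(trt) - log2(ctrl) >= 1
enriched <- (ly - lx) >= 1
plot(lx, ly, col = ifelse(enriched, "blue", "grey50"), pch = 16)
# Dashed line marks the 2-fold boundary: ly = lx + 1
abline(a = 1, b = 1, lty = 2)
```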

Setting the color for an individual data point

How can I set the colour for a single data point in a scatter plot in R?
I am using plot
To expand on @Dirk Eddelbuettel's answer, you can use any function for col in the call to plot. For instance, this colors the x==3 point red, leaving all others black:
x <- 1:5
plot(x, x, col=ifelse(x==3, "red", "black"))
Same goes for point character pch, character expansion cex, etc.
plot(x, x, col=ifelse(x==3, "red", "black"),
pch=ifelse(x==3, 19, 1), cex=ifelse(x==3, 2, 1))
Doing what you want to do through code is easy enough and others have given nice ways to do this. If, however, you would rather click on the points you want to change the color of you can do this by using 'identify' along with the 'points' command to replot over those points in a new color.
# Make some data
n <- 15
x <- rnorm(n)
y <- rnorm(n)
# Plot the data
plot(x,y)
# This lets you click on the points you want to change
# the color of. Right click and select "stop" when
# you have clicked all the points you want
pnt <- identify(x, y, plot = FALSE)
# This colors those points red
points(x[pnt], y[pnt], col = "red")
# identify beeps when you click.
# Adding the following line before the 'identify' line will disable that.
# options(locatorBell = FALSE)
Use the col= argument, which is vectorized, so that e.g. in
plot(1:5, 1:5, col=1:5)
you get five points in five different colors.
You can use the same logic to use just two or three colors among your data points. Understanding indexing is key in languages like R.
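As a small illustration of that indexing logic (the threshold and the colour names are arbitrary choices for the example): a logical condition can be turned into an index into a two-colour vector.

```r
x <- 1:10
palette_cols <- c("grey40", "red")
# (x > 7) is FALSE/TRUE; adding 1 turns it into index 1 or 2
point_cols <- palette_cols[(x > 7) + 1]
plot(x, x, col = point_cols, pch = 16)
```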

Resources