Legend with rainbow colours in R

I have been trying to create a legend for an R plot with the rainbow option but I am facing some difficulties.
I plot
plot(test$a,test$b, col = rainbow(length(test$s))[rank(test$s)])
with the colour assigned according to test$s. The problem is that test$s is equal for many rows of the data frame test, so if I then write
legend('topright',legend=test.sub$s,col=rainbow(length(test.sub$s))[rank(test.sub$s)])
I get all the duplicates of test$s in the legend, but the colours are correct. Since I don't want the duplicates, I wrote
legend('topright',legend=unique(test.sub$s),col=rainbow(length(test.sub$s))[rank(test.sub$s)])
but then all the colours are messed up!
Thanks in advance

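Your problem is that unique(test.sub$s) is not the same length as rainbow(length(test.sub$s))[rank(test.sub$s)], so the legend labels and colours no longer line up. My solution would be to keep only the colour of the first occurrence of each value: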
col=rainbow(length(test.sub$s))[rank(test.sub$s)[!duplicated(test.sub$s)]]
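A minimal sketch of that fix, assuming a small made-up data frame `test.sub` with repeated values in `s` (the data below is hypothetical, not the asker's). `pch = 1` is added as an assumption so that the legend has a symbol to colour:

```r
# Hypothetical data: 's' contains repeated values, as in the question
test.sub <- data.frame(a = 1:6,
                       b = c(2, 4, 3, 6, 5, 7),
                       s = c(10, 10, 20, 30, 30, 30))

# One colour per row, picked by the rank of s.
# Tied values get the same (possibly fractional) rank; R truncates a
# fractional index, so tied rows still share one colour.
pal      <- rainbow(nrow(test.sub))
row_cols <- pal[rank(test.sub$s)]

# TRUE for the first occurrence of each value of s
keep <- !duplicated(test.sub$s)

plot(test.sub$a, test.sub$b, col = row_cols, pch = 1)
legend("topright",
       legend = test.sub$s[keep],   # same values as unique(test.sub$s)
       col    = row_cols[keep],     # colours of those first occurrences
       pch    = 1)
```

Because the labels and the colours are subset with the same logical vector, they stay the same length and in the same order.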

Related

Plot a curve with different color for each point in R

I have a curve, for instance
y_curve=c(1,2,5,6,9,1).
and the colors for each curve point
colors=c("#0000FF","#606060","#606060","#FF0000","#FF0000","#FF0000").
In theory I want to plot a curve where the first half has one color (except for the first point which is blue) and the second half has another color. In my example the dataset has more than 3000 observations so it makes sense.
For some reason, if I plot the data just using the command
plot(y_curve,col=colors), the colors of the points are plotted correctly.
Nevertheless, if I add the option type="l", the plotted curve has only one color - the blue, which is the first color in the vector colors ("#0000FF").
Does anyone know what I am doing wrong?
So the code is
y_curve=c(1,2,5,6,9,1)
colors=c("#0000FF","#606060","#606060","#FF0000","#FF0000","#FF0000")
plot(y_curve,col=colors,type="l")
Thank you all in advance.
I avoid using ggplot since this part of code is inside an already complicated function and I prefer using the base R commands.
With type="l", plot() draws the whole line in a single color and uses only the first element of col.
Instead, we can use the segments() function to draw each segment individually with its own color.
y_curve=c(1,2,5,6,9,1)
colors=c("#0000FF","#606060","#606060","#FF0000","#FF0000","#FF0000")
#create a mostly blank plot
plot(y_curve,col=NA)
# Use this to show the points:
#plot(y_curve,col=colors)
#index variable
x = seq_along(y_curve)
#draw the segments (segments() has no type argument; n points give n-1 segments)
segments(head(x,-1), head(y_curve,-1), x[-1], y_curve[-1], col=head(colors,-1))
This answer is based on the solution to this question:
How do I plot a graph in R, with the first values in one colour and the next values in another colour?
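The same idea can be wrapped in a small helper for reuse inside a larger function. This is only a sketch: `plot_coloured_line` is a hypothetical name, and colouring each segment by its starting point is an assumption (use `col[-1]` instead to colour by the end point).

```r
# Hypothetical helper: draw y as a line whose segments have individual colours.
# Segment i joins point i to point i + 1 and is drawn in col[i].
plot_coloured_line <- function(y, col) {
  stopifnot(length(y) == length(col), length(y) >= 2)
  x <- seq_along(y)
  seg_col <- head(col, -1)             # n points give n - 1 segments
  plot(x, y, type = "n")               # empty plot with the right axes
  segments(head(x, -1), head(y, -1), x[-1], y[-1], col = seg_col)
  invisible(seg_col)                   # return the colours actually used
}

y_curve <- c(1, 2, 5, 6, 9, 1)
colors  <- c("#0000FF", "#606060", "#606060", "#FF0000", "#FF0000", "#FF0000")
used <- plot_coloured_line(y_curve, colors)
```

Everything here is base R graphics, so it needs no extra packages.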

How to prevent geom_text_repel from labeling points on scatter plot with default number ordering list?

My dataset looks like this:
I'm trying to create a simple scatter plot with data labels that are names (first and last name).
I used geom_text_repel in ggrepel to create data labels, but the labels on the plot are just numbers in the order of the data points in my dataset.
For example, if you look at the first datapoint, instead of the label being "Stephen Curry" it is "1"
I have no idea why this is happening and I can't find anyone else who even has my problem, let alone a solution.
Code:
ggplot(gravity,
aes(TS., USG., label = rownames(gravity))) +
geom_point(aes(TS., USG.), color='black') +
geom_text_repel(aes(TS., USG., label = rownames(gravity)))
The image above shows the plot created by the code. As you can see, the labels are just the row numbers instead of the names. I don't see why this is happening, considering those numbers are not part of the dataset I imported.
Thanks in advance

text annotation to a graph in ggplot

I am drawing a PC plot using ggplot2.
I know this question has been answered in some previous posts, but I still could not solve my problem.
I have a data set called tab which is the output of PCA
sample.id               pop    EV1          EV2
HT185_MK8-2.sort.bam    HA_27  -0.03796869  0.046369552
HT48_SD1A-37.sort.bam   HA_14   0.04208393  0.032961404
HT53_IA1A-10.sort.bam   HA_1   -0.02580365  0.005262476
HT260_MK1-4.sort.bam    HA_20  -0.06090545  0.005578504
HT170_SD2W-14.sort.bam  HA_17   0.01288395  0.012117833
Q093_MK7-13.sort.bam    HA_26   0.06310162  0.188558067
I want to add a label to each dot in the plot. These dots are individuals from several populations, so I want to give them their population ID (the pop column in the data set).
I am using something like this
ggplot(data=tab,aes(EV1,EV2, label=tab[,2])) + geom_point(aes(color=as.factor(pop))) + ylab("Principal component 2") + xlab("Principal component 1")
But I do not get my desired output.
This is my PC plot!
So could anyone help me add the population label to each dot in the plot?
Thanks
Try geom_text:
geom_text(aes(label=as.character(pop)),hjust=0,vjust=0)
Also consider looking into plotly, or setting a threshold on the labels, because labeling every point will lead to a very crowded plot, and probably very little additional useful information.
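As a rough sketch of the threshold idea, using the six rows shown in the question: the cut-off value and the rule "label only points with a large EV2" are assumptions chosen for illustration, and ggplot2 must be installed.

```r
library(ggplot2)

# The six rows shown in the question
tab <- data.frame(
  sample.id = c("HT185_MK8-2.sort.bam", "HT48_SD1A-37.sort.bam",
                "HT53_IA1A-10.sort.bam", "HT260_MK1-4.sort.bam",
                "HT170_SD2W-14.sort.bam", "Q093_MK7-13.sort.bam"),
  pop = c("HA_27", "HA_14", "HA_1", "HA_20", "HA_17", "HA_26"),
  EV1 = c(-0.03796869, 0.04208393, -0.02580365,
          -0.06090545, 0.01288395, 0.06310162),
  EV2 = c(0.046369552, 0.032961404, 0.005262476,
          0.005578504, 0.012117833, 0.188558067)
)

# Assumed rule: label only the points whose EV2 exceeds an arbitrary cut-off
threshold <- 0.04
to_label  <- subset(tab, EV2 > threshold)

p <- ggplot(tab, aes(EV1, EV2)) +
  geom_point(aes(color = pop)) +
  geom_text(data = to_label, aes(label = pop), hjust = 0, vjust = 0) +
  xlab("Principal component 1") +
  ylab("Principal component 2")
print(p)
```

Passing a subset as `data` to `geom_text()` keeps all points on the plot while labelling only the selected ones.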

Adding a factor legend to a scatterplot matrix in R?

I have made a scatterplot matrix which is conditioned on two factors, one with different colours and one with different shapes. I want to add a legend to the right-middle of the plot showing the labels for just the factor with the shapes (Scenario), is this possible? I have looked online but cannot figure it out! Here is my code, I would be grateful for any help. Thank you.
pairs(Data[,1:3], col=Data$Physician, pch=Data$Scenario, main="Scatterplot Matrix")

R plotting a graph with different groups of data

I have a dataset:
a<-c(1,2,3,4,5,6,7,8,9,10)
b<-c(2,2,2,2,4,5,6,8,4,1)
c<-c("red","red","red","blue","blue","blue","orange","orange","orange","orange")
data<-data.frame(a=a,b=b,c=c)
I now want to plot the data on a graph with each group having a different colour:
plot(a[c=="red"],b[c=="red"],col="red",xlim=c(min(a),max(a)),ylim=c(min(b),max(b)))
points(a[c=="blue"],b[c=="blue"],col="blue")
points(a[c=="orange"],b[c=="orange"],col="orange")
This works fine - however, say if I have 30 groups, the task of writing the code becomes tedious. I am wondering if there is a better way of writing the code such that R will automatically plot the graph and give different colours to different groups?
Also, I wonder if there is a quick way to display a legend in the graph.
Thank you for all your help.
Try this:
with(data,plot(a,b,col=c))
The col argument in plot() sets the color. It can be a vector with one color per point.
Additionally, you don't have to make a column just to define the color if the color-group relationship is not that important. For example, you could make column c a more meaningful column like this:
a<-c(1,2,3,4,5,6,7,8,9,10)
b<-c(2,2,2,2,4,5,6,8,4,1)
c<-c(rep('Group1',3),rep('Group2',3),rep('Group3',4))
# c must be a factor for col=c and levels() below (R >= 4.0 keeps strings as character)
data<-data.frame(a=a,b=b,c=factor(c))
Then to plot, use:
with(data,plot(a,b,col=c))
To add a legend:
legend('topleft',legend = levels(data[,'c']),col=1:nlevels(data[,'c']),pch=1)
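Putting the pieces together, here is a minimal sketch in base R. It assumes R 4.0 or later, where `data.frame()` no longer converts strings to factors by default, so the grouping column is converted explicitly. The `rainbow()` palette is an arbitrary choice, and it scales to 30 groups without extra code.

```r
a <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
b <- c(2, 2, 2, 2, 4, 5, 6, 8, 4, 1)
grp <- factor(c(rep("Group1", 3), rep("Group2", 3), rep("Group3", 4)))
data <- data.frame(a = a, b = b, c = grp)

# One colour per factor level; indexing by a factor uses its integer codes
pal        <- rainbow(nlevels(data$c))
point_cols <- pal[data$c]

plot(data$a, data$b, col = point_cols, pch = 1)
legend("topleft", legend = levels(data$c), col = pal, pch = 1)
```

Since the legend and the points both take their colours from `pal`, the legend entries match the plotted groups.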
Try ggplot2
library(ggplot2)
ggplot(data=data, aes(x=a, y=b, colour=c)) + geom_point()
