Group geom_point with the geom_polygon - r

My dataset:
Taxa dn dc
Cha 10.2 -20.4
Cha 10.7 -19.7
Cha 4.9 -21.0
Cha 5.4 -20.6
Cha 8.6 -21.2
Cha 8.0 -20.9
Cha 8.1 -21.3
Cha 6.9 -21.1
Cha 8.5 -21.1
Cha 9.1 -20.8
Hyd 6.6 -19.2
Hyd 10.2 -17.0
Hyd 9.7 -18.2
Hyd 8.1 -16.5
Hyd 8.8 -15.8
Hyd 8.7 -15.8
Hyd 7.6 -18.3
Hyd 8.9 -16.0
Hyd 8.4 -17.5
Hyd 9.8 -18.8
Hyd 8.3 -18.4
Scy 9.4 -20.1
Scy 9.1 -20.0
Scy 7.8 -20.2
Scy 9.1 -17.6
Scy 8.2 -19.8
Scy 9.4 -19.2
Scy 9.0 -20.1
Sip 5.7 -15.2
Sip 6.2 -18.6
Sip 5.6 -18.0
Sip 8.6 -17.6
Sip 4.8 -16.9
Sip 5.2 -15.4
Sip 1.9 -18.4
The code I use is:
library(ggplot2)
ggplot(mydata, aes(x=dc, y=dn, colour=Taxa, shape=Taxa))+
geom_point(size=2, alpha=0.5)+
geom_polygon(aes(fill=Taxa, group=Taxa))+
theme(legend.position = "none")
I would like to draw one polygon per "Taxa" group in my data. However, it looks like the polygon connects every point in the group.
What I want is something like this one. How should I edit my code?

To connect the outer points of each group and enclose the points that lie inside it, use the geom_encircle function from the ggalt package.
library(ggplot2)
library(ggalt)
ggplot(mydata, aes(dc, dn)) +
geom_point(aes(color = Taxa)) +
geom_encircle(aes(fill = Taxa), s_shape = 1, expand = 0,
alpha = 0.2, color = "black", show.legend = FALSE)
Use s_shape = 1 and expand = 0 to connect the outer points directly; otherwise the outline is drawn with a margin around them.

You can also calculate the convex hulls, then plot them:
library(ggplot2)
library(plyr)
# some fake data:
mydata <- data.frame(Taxa = c('Cha','Cha','Cha','Cha','Cha','Cha','Hyd','Hyd','Hyd','Hyd','Hyd','Hyd'),
dn = c(10.2,10.7,4.9,5.4,8.6,8.0, 6.6,10.2,9.7,8.1,8.8,8.7),
dc =c(-20.4,-19.7,-21.0,-20.6,-21.2,-20.9,-19.2,-17.0,-18.2,-16.5,-15.8,-15.8))
# calculate convex hulls:
chulls <- ddply(mydata, .(Taxa), function(mydata) mydata[chull(mydata$dn, mydata$dc), ])
# plot them:
ggplot(data=mydata, aes(x=dn, y=dc, color=Taxa)) + geom_point() +
geom_polygon(data=chulls, aes(x=dn, y=dc, fill=Taxa), alpha=0.2)
Nice source here.
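If you would rather not load plyr, the same per-group hull calculation can be sketched in base R. This is only an illustration under the assumption that the data has the columns Taxa, dn and dc shown above; hull_rows is a name introduced here, and chull() comes from grDevices, which R attaches by default.

```r
# A subset of the question's data (columns Taxa, dn, dc), kept short for illustration.
mydata <- data.frame(
  Taxa = rep(c("Cha", "Hyd"), each = 6),
  dn = c(10.2, 10.7, 4.9, 5.4, 8.6, 8.0, 6.6, 10.2, 9.7, 8.1, 8.8, 8.7),
  dc = c(-20.4, -19.7, -21.0, -20.6, -21.2, -20.9,
         -19.2, -17.0, -18.2, -16.5, -15.8, -15.8)
)

# Split the data by Taxa and keep only the rows that lie on each group's convex hull.
# chull() returns the row indices of the hull vertices in clockwise order.
hull_rows <- lapply(split(mydata, mydata$Taxa), function(d) {
  d[chull(d$dn, d$dc), ]
})

# Stack the per-group hulls back into one data frame.
chulls <- do.call(rbind, hull_rows)
print(chulls)
```

The resulting chulls data frame has the same columns as mydata, so it can be passed to geom_polygon(data = chulls, ...) in the same way as the ddply result.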

Related

Problems with scatterplot error bars in ggplot2

I have a question about how to make a scatterplot with error bars. I'm working with stable isotopes, so I have data on D13C and D15N for faunal samples. I want to obtain a plot like the attached target (target.png), but without the convex hulls.
Instead, I obtain a plot like this one (CNPlot.png).
I´m using this script :
a<-read.table("Means.txt", header = TRUE)
theme_set(theme_classic(base_size = 16))
ggplot(a, aes(x=D13C, y=D15N)) +
geom_errorbar(aes(ymax=D13C+D13C.ds, ymin=D13C-D13C.ds), width=0.15,alpha=.8)+
geom_errorbarh(aes(xmax=D15N+D15N.ds, xmin=D15N-D15N.ds), height=0.15,alpha=.8)+
geom_point(aes(shape=Species),fill="white",size=4) +
geom_point(aes(color=Species,fill=Species,shape=Species),size=4, alpha = .5) +
scale_color_manual(values=c("black","dodgerblue1","coral4","darkorchid"))+
scale_fill_manual(values=c("black","dodgerblue1","coral4","darkorchid"))+
scale_shape_manual(values=c(21,23,22,24))+
labs(title=NULL,
subtitle=NULL,
caption=NULL,
x=expression(paste(delta^{13}, "Ccol(‰)")),
y=expression(paste(delta^{15}, "N(‰)")))
I have two datasets. I'm using the one named Means, but I also have another one named CN_fauna, which contains the raw data.
Means:
Species D13C D13C.ds D15N D15N.ds
Bird -16.4 7.1 7.6 1.5
SH -18.5 1.7 5.5 2.7
CH -14.8 2.9 8.8 0.6
Deer -19.2 0.7 4.8 1.04
CN_fauna:
taxa D13C D15N
Bird -24.1 7.9
Bird -9.9 9
Bird -15.2 5.9
SH -17.0 9.6
SH -16.6 7.3
SH -20.3 4.6
SH -20.3 2.6
SH -20.3 2.7
SH -18.6 6.6
CH -16.9 9.4
CH -11.5 8.2
CH -16.1 8.8
Deer -18.6 3.0
Deer -19.1 6.0
Deer -18.3 5.4
Deer -17.9 5.4
Deer -19.2 5.6
Deer -20.4 5.6
Deer -19.5 6.1
Deer -20.3 5.9
Deer -18.7 5.4
Deer -19.7 3.8
Deer -19.2 3.4
Deer -19.9 4.1
Deer -18.4 4.3
Deer -20.1 4.1
I do not understand why the scales of the error bars are different in my plot. Any help is more than welcome.
Not to take away from the sage advice that @stefan provided (reproducible questions get better responses faster)...
I could be wrong, but I think your errorbar data is on the wrong axis. Is that what you were trying to create?
If so, you need to change your xmin to ymin and so on for the two errorbar layers. It would look something like this:
ggplot(Means, aes(x = D13C, y = D15N)) +
geom_errorbar(aes(xmax = D13C + D13C.ds,
xmin = D13C - D13C.ds), width=0.15,alpha=.8)+
geom_errorbar(aes(ymax = D15N + D15N.ds,
ymin = D15N - D15N.ds), width=0.15,alpha=.8)+
geom_point(aes(shape=Species),fill="white",size=4) +
geom_point(aes(color=Species,fill=Species,shape=Species),size=4, alpha = .5) +
scale_color_manual(values=c("black","dodgerblue1","coral4","darkorchid"))+
scale_fill_manual(values=c("black","dodgerblue1","coral4","darkorchid"))+
scale_shape_manual(values=c(21,23,22,24))+
labs(title=NULL,
subtitle=NULL,
caption=NULL,
x=expression(paste(delta^{13}, "Ccol(‰)")),
y=expression(paste(delta^{15}, "N(‰)")))
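To see why the question's bars looked mis-scaled, it can help to compute the bar endpoints by hand. The base R sketch below uses the Means table from the question; the names bars and wrong are introduced here only for illustration.

```r
# The Means table from the question.
Means <- data.frame(
  Species = c("Bird", "SH", "CH", "Deer"),
  D13C    = c(-16.4, -18.5, -14.8, -19.2),
  D13C.ds = c(7.1, 1.7, 2.9, 0.7),
  D15N    = c(7.6, 5.5, 8.8, 4.8),
  D15N.ds = c(1.5, 2.7, 0.6, 1.04)
)

# Each bar is built from the variable plotted on the same axis:
# D13C is on x, so its spread gives xmin/xmax; D15N is on y, so its spread gives ymin/ymax.
bars <- transform(Means,
  xmin = D13C - D13C.ds, xmax = D13C + D13C.ds,
  ymin = D15N - D15N.ds, ymax = D15N + D15N.ds
)

# What the question's code did instead: D13C values used as y limits.
wrong <- transform(Means, ymin = D13C - D13C.ds, ymax = D13C + D13C.ds)

print(range(Means$D15N))               # where the points sit on the y axis
print(range(wrong$ymin, wrong$ymax))   # where the mis-mapped bars end up
```

With the question's mapping, the vertical bars fall entirely in the negative D13C range while the points have positive D15N values, so the y axis has to stretch to cover both.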

How to draw a multiline plot in R

I have a dataframe with 6 features like this:
X1 X2 X3 X4 X5 X6
Modern Dog 9.7 21.0 19.4 7.7 32.0 36.5
Golden Jackal 8.1 16.7 18.3 7.0 30.3 32.9
Chinese Wolf 13.5 27.3 26.8 10.6 41.9 48.1
Indian Wolf 11.5 24.3 24.5 9.3 40.0 44.6
Cuon 10.7 23.5 21.4 8.5 28.8 37.6
Dingo 9.6 22.6 21.1 8.3 34.4 43.1
I want to draw a line plot like this:
I'm trying this:
plot(df$X1, type = "o",col = "red", xlab = "Month", ylab = "Rain fall")
lines(c(df$X2, df$X3, df$X4, df$X5, df$X6), type = "o", col = "blue")
But it's only plotting a single variable. I'm sorry if this question is annoying; I'm totally new to R and I just don't know how to get this done. I would really appreciate any help on this.
Thanks in advance
The easiest way would be to convert your dataset to a long format (e.g. by using the gather function in the tidyr package), and then plotting using the group aesthetic in ggplot.
I recreate your dataset, assuming your group variable is named "Group":
df <- read.table(text = "
Group X1 X2 X3 X4 X5 X6
Modern_Dog 9.7 21.0 19.4 7.7 32.0 36.5
Golden_Jackal 8.1 16.7 18.3 7.0 30.3 32.9
Chinese_Wolf 13.5 27.3 26.8 10.6 41.9 48.1
Indian_Wolf 11.5 24.3 24.5 9.3 40.0 44.6
Cuon 10.7 23.5 21.4 8.5 28.8 37.6
Dingo 9.6 22.6 21.1 8.3 34.4 43.1 ",
header = TRUE, stringsAsFactors = FALSE)
Then convert the dataset to long format and plot:
library(tidyr)
library(ggplot2)
df_long <- df %>% gather(X1:X6, key = "Month", value = "Rainfall")
ggplot(df_long, aes(x = Month, y = Rainfall, group = Group, shape = Group)) +
geom_line() +
geom_point() +
theme(legend.position = "bottom")
See also the answers here: Group data and plot multiple lines.
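If you want to stay with base graphics, as in the question's attempt, matplot() is one possible alternative. The sketch below assumes the same df as above, with the group label in a column named "Group"; the matrix name m and the axis labels are choices made here, not part of the original question.

```r
# Recreate the data, with the group label in a column named "Group".
df <- read.table(text = "
Group X1 X2 X3 X4 X5 X6
Modern_Dog 9.7 21.0 19.4 7.7 32.0 36.5
Golden_Jackal 8.1 16.7 18.3 7.0 30.3 32.9
Chinese_Wolf 13.5 27.3 26.8 10.6 41.9 48.1
Indian_Wolf 11.5 24.3 24.5 9.3 40.0 44.6
Cuon 10.7 23.5 21.4 8.5 28.8 37.6
Dingo 9.6 22.6 21.1 8.3 34.4 43.1",
header = TRUE, stringsAsFactors = FALSE)

# matplot() draws one line per matrix column, so transpose the numeric part:
# rows become the features X1..X6 and columns become the groups.
m <- t(as.matrix(df[, -1]))
colnames(m) <- df$Group

# One line (with points) per group; suppress the default x axis and label it X1..X6.
matplot(m, type = "o", pch = 1:6, lty = 1, col = 1:6,
        xaxt = "n", xlab = "Feature", ylab = "Value")
axis(1, at = seq_len(nrow(m)), labels = rownames(m))
legend("topleft", legend = colnames(m), col = 1:6, pch = 1:6, lty = 1, cex = 0.7)
```

The original attempt drew a single series because lines() was given one long concatenated vector; matplot() instead treats every column of the matrix as its own series.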

How to add labels at the top of vlines in ggplot2 and add these in separate legends

I have created a dummy dataframe representative of my data-
SQ AgeGroup Prop LCI UCI
2010-1 0 to 18 4.3 4.2 4.4
2010-1 19 to 25 5.6 5.3 5.6
2010-1 26 and over 7.8 7.6 7.9
2010-2 0 to 18 4.1 3.9 4.2
2010-2 19 to 25 5.8 5.6 5.9
2010-2 26 and over 8.1 7.9 8.3
2010-3 0 to 18 4.2 4 4.4
2010-3 19 to 25 5.5 5.2 5.6
2010-3 26 and over 7.6 7.4 7.7
2010-4 0 to 18 3.9 3.6 4.1
2010-4 19 to 25 5.2 5 5.4
2010-4 26 and over 7.4 7.2 7.6
2011-1 0 to 18 4.3 4.1 4.5
2011-1 19 to 25 5.7 5.5 5.8
2011-1 26 and over 8.2 8 8.3
2011-2 0 to 18 4.1 4 4.5
2011-2 19 to 25 5.7 5.5 5.9
2011-2 26 and over 8.2 8 8.4
2011-3 0 to 18 4.4 4.2 4.6
2011-3 19 to 25 5.7 5.5 7.9
2011-3 26 and over 8.2 8 8.4
which creates an image that looks like this-
I have used the following code-
library(readxl)
library(dplyr)
library(epitools)
library(gtools)
library(reshape2)
library(binom)
library(pivottabler)
library(readxl)
library(phecharts)
library(ggplot2)
library(RODBC)
rm(list=ls())
df<-read_xlsx("Dummydata.xlsx")
pd<-position_dodge(width=0.3)
limits <- aes(ymax =df$UCI , ymin = df$LCI)
p<-ggplot(df, aes(x = SQ, y =Prop, group=AgeGroup, colour= AgeGroup)) +
geom_line(position=pd)+
geom_point(size=2.0, position=pd)+
geom_errorbar(limits, width = 0.55, size=0.4, position= pd)+
labs(
y = "Percentage",
x = "Study Quarter")
p<-p +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))+
scale_y_continuous(name="Percentage",breaks=c(0,2,4,6,8,10),limits=c(0,10))+#limits need to change with every plot
scale_fill_manual(values = pal)+
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1,size=16))+
theme(axis.text.y=element_text(size=16))+
theme(legend.text = element_text(size=18))+
theme(legend.title=element_text(size=16))+
theme(legend.title=element_blank())+
theme(legend.position="bottom")+
theme(axis.title = element_text(size=22))
p + geom_vline(xintercept = c(2,4,6), linetype="dotted",
color = "black", size=1.0, show.legend = TRUE)
However, what I want is for each of the three vertical lines to have a label (L1, L2 and L3) at the top, plus a separate legend at the bottom where I can explain what these lines stand for. Something like this-
L1: Launch of x
L2: Launch of y
L3: Launch of z
Can someone please help with this?

Merge information on the same plot

I have some data that looks like this:
Expression1 Expression2 CellType Patient
9.34 8.23 3.2 A
8.2 3.2 10.9 B
2.12 5.3 12.9 B
2.10 1.3 2.9 B
2.12 1.5 2.9 A
2.11 9.5 6.9 A
... .... ... ... ....
I would like to generate a plot (with ggplot) with Expression1 and Expression2 on the y and x axes respectively and dots coloured in a gradient of a single color according to the CellType column and at the same time distinguishing between Patient A and B on the same plot.
Can anyone help me please?
ggplot(myDF, aes(Expression1, Expression2)) + geom_point(aes(colour = CellType)) + scale_colour_gradient2(low="black", mid="white", high="red") + ggtitle("First_attempt")
I don't know how to add a gradient for Patient
Thank you in advance
The below seems to work fine:
dt <- data.table::fread('Expression1 Expression2 CellType Patient
9.34 8.23 3.2 A
8.2 3.2 10.9 B
2.12 5.3 12.9 B
2.10 1.3 2.9 B
2.12 1.5 2.9 A
2.11 9.5 6.9 A ')
library(ggplot2)
ggplot(dt) + geom_point(aes(x = Expression2, y = Expression1,
color = CellType, shape = Patient))

0 x 0 matrix when running PCA in FactoMineR

I'm trying to run a principal component analysis (PCA) indicating the quantitative data and the qualitative data, but I get this error when performing:
library(FactoMineR)
PCA(data, quanti.sup = 4:12, quali.sup = 1:3, scale.unit = FALSE, ncp=2)
Error in eigen(t(X) %*% X, symmetric = TRUE) : 0 x 0 matrix
My data is a 2980 x 12 data frame with names, so it's really weird.
Any advice would be very much appreciated.
The problem you encountered is because you have specified all of your variables as supplementary variables when you call PCA().
To illustrate with an example we can use the built in dataset USJudgeRatings.
head(USJudgeRatings)
CONT INTG DMNR DILG CFMG DECI PREP FAMI ORAL WRIT PHYS RTEN
AARONSON,L.H. 5.7 7.9 7.7 7.3 7.1 7.4 7.1 7.1 7.1 7.0 8.3 7.8
ALEXANDER,J.M. 6.8 8.9 8.8 8.5 7.8 8.1 8.0 8.0 7.8 7.9 8.5 8.7
ARMENTANO,A.J. 7.2 8.1 7.8 7.8 7.5 7.6 7.5 7.5 7.3 7.4 7.9 7.8
BERDON,R.I. 6.8 8.8 8.5 8.8 8.3 8.5 8.7 8.7 8.4 8.5 8.8 8.7
BRACKEN,J.J. 7.3 6.4 4.3 6.5 6.0 6.2 5.7 5.7 5.1 5.3 5.5 4.8
BURNS,E.B. 6.2 8.8 8.7 8.5 7.9 8.0 8.1 8.0 8.0 8.0 8.6 8.6
In this data there are 43 judges who were ranked on 11 qualities by lawyers (columns 2:12). Column 1 is the number of contacts the lawyers had with the judge.
The PCA won't work if you specify that all variables are supplementary.
library(FactoMineR)
result <- PCA(USJudgeRatings, ncp = 3, quanti.sup = 1:12)
# Error in eigen(t(X) %*% X, symmetric = TRUE) : 0 x 0 matrix
We have to give the PCA some variables to work with. Instead, we let our 11 variables go into the PCA and specify only the number of contacts the lawyers had with the judges as a quantitative supplementary variable:
result <- PCA(USJudgeRatings, ncp = 3, quanti.sup = 1)
This runs and you can then view the results with summary.PCA(result).
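A quick way to catch this before calling PCA() is to check which columns are left as active variables once the supplementary ones are removed. The helper below, active_columns, is a hypothetical name introduced for this sketch and is not part of FactoMineR; it only uses base R.

```r
# Hypothetical helper (not a FactoMineR function): given the number of columns and the
# indices passed as quanti.sup / quali.sup, return the columns that remain active.
active_columns <- function(n_col, quanti.sup = NULL, quali.sup = NULL) {
  setdiff(seq_len(n_col), c(quanti.sup, quali.sup))
}

# The call from the question: 12 columns, all of them declared supplementary.
print(active_columns(12, quanti.sup = 4:12, quali.sup = 1:3))

# The working USJudgeRatings call: only column 1 is supplementary.
print(active_columns(12, quanti.sup = 1))
```

If the first result is empty, as it is for the question's indices, there is nothing left for the PCA to decompose, which matches the 0 x 0 matrix error.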

Resources