Problems with scatterplot error bars in ggplot2 - r

I have a question about how to make a scatterplot with error bars. I'm working with stable isotopes, so I have δ13C and δ15N data (columns D13C and D15N) for faunal samples. I want to obtain a plot like the attached one, without the convex hulls (target.png).
Instead, I get a plot like this one (CNPlot.png).
I'm using this script:
a<-read.table("Means.txt", header = TRUE)
theme_set(theme_classic(base_size = 16))
ggplot(a, aes(x=D13C, y=D15N)) +
geom_errorbar(aes(ymax=D13C+D13C.ds, ymin=D13C-D13C.ds), width=0.15,alpha=.8)+
geom_errorbarh(aes(xmax=D15N+D15N.ds, xmin=D15N-D15N.ds), height=0.15,alpha=.8)+
geom_point(aes(shape=Species),fill="white",size=4) +
geom_point(aes(color=Species,fill=Species,shape=Species),size=4, alpha = .5) +
scale_color_manual(values=c("black","dodgerblue1","coral4","darkorchid"))+
scale_fill_manual(values=c("black","dodgerblue1","coral4","darkorchid"))+
scale_shape_manual(values=c(21,23,22,24))+
labs(title=NULL,
subtitle=NULL,
caption=NULL,
x=expression(paste(delta^{13}, "Ccol(‰)")),
y=expression(paste(delta^{15}, "N(‰)")))
I have two datasets. I'm using the one named Means; the other one, CN_fauna, contains the raw data.
Means:
Species D13C D13C.ds D15N D15N.ds
Bird -16.4 7.1 7.6 1.5
SH -18.5 1.7 5.5 2.7
CH -14.8 2.9 8.8 0.6
Deer -19.2 0.7 4.8 1.04
CN_fauna:
taxa D13C D15N
Bird -24.1 7.9
Bird -9.9 9
Bird -15.2 5.9
SH -17.0 9.6
SH -16.6 7.3
SH -20.3 4.6
SH -20.3 2.6
SH -20.3 2.7
SH -18.6 6.6
CH -16.9 9.4
CH -11.5 8.2
CH -16.1 8.8
Deer -18.6 3.0
Deer -19.1 6.0
Deer -18.3 5.4
Deer -17.9 5.4
Deer -19.2 5.6
Deer -20.4 5.6
Deer -19.5 6.1
Deer -20.3 5.9
Deer -18.7 5.4
Deer -19.7 3.8
Deer -19.2 3.4
Deer -19.9 4.1
Deer -18.4 4.3
Deer -20.1 4.1
I do not understand why the scales of the error bars are different in my plot. Any help is more than welcome.

Not to take away from the sage advice that @stefan provided (reproducible questions get better responses faster)...
I could be wrong, but I think your error bar data is on the wrong axis: D13C is plotted on the x axis, yet its limits are passed as ymin/ymax, and the D15N limits are passed as xmin/xmax. Is this what you were trying to create?
If so, you need to swap xmin/xmax and ymin/ymax between the two error bar layers. It would look something like this:
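library(ggplot2)
# Means is the data frame read from Means.txt (called `a` in the question)
ggplot(Means, aes(x = D13C, y = D15N)) +
# horizontal bars: D13C is on the x axis, so its limits are xmin/xmax
# (geom_errorbar infers the orientation in ggplot2 >= 3.3.0)
geom_errorbar(aes(xmax = D13C + D13C.ds,
xmin = D13C - D13C.ds), width=0.15,alpha=.8)+
# vertical bars: D15N is on the y axis, so its limits are ymin/ymax
# (geom_errorbar has no height argument; the cap size is always width)
geom_errorbar(aes(ymax = D15N + D15N.ds,
ymin = D15N - D15N.ds), width=0.15,alpha=.8)+
geom_point(aes(shape=Species),fill="white",size=4) +
geom_point(aes(color=Species,fill=Species,shape=Species),size=4, alpha = .5) +
scale_color_manual(values=c("black","dodgerblue1","coral4","darkorchid"))+
scale_fill_manual(values=c("black","dodgerblue1","coral4","darkorchid"))+
scale_shape_manual(values=c(21,23,22,24))+
labs(title=NULL,
subtitle=NULL,
caption=NULL,
x=expression(paste(delta^{13}, "Ccol(‰)")),
y=expression(paste(delta^{15}, "N(‰)")))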
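If the Means table is not already at hand, it could be derived from the raw CN_fauna measurements. The following is a minimal base R sketch, assuming the raw data has the columns taxa, D13C and D15N shown in the question. Only the Bird and CH rows are reproduced here, and the names cn_fauna and summary_df are chosen for illustration. It also computes the bar limits so that each limit comes from the variable plotted on that axis.

```r
# Bird and CH rows copied from the question's CN_fauna table
cn_fauna <- data.frame(
  taxa = rep(c("Bird", "CH"), each = 3),
  D13C = c(-24.1, -9.9, -15.2, -16.9, -11.5, -16.1),
  D15N = c(7.9, 9.0, 5.9, 9.4, 8.2, 8.8)
)

# Per-taxon mean and standard deviation of both isotopes
means <- aggregate(cbind(D13C, D15N) ~ taxa, data = cn_fauna, FUN = mean)
sds   <- aggregate(cbind(D13C, D15N) ~ taxa, data = cn_fauna, FUN = sd)
names(sds)[-1] <- c("D13C.ds", "D15N.ds")  # same suffix as in Means.txt
summary_df <- merge(means, sds, by = "taxa")

# Limits for each bar come from the variable plotted on that axis
summary_df$xmin <- summary_df$D13C - summary_df$D13C.ds
summary_df$xmax <- summary_df$D13C + summary_df$D13C.ds
summary_df$ymin <- summary_df$D15N - summary_df$D15N.ds
summary_df$ymax <- summary_df$D15N + summary_df$D15N.ds

print(summary_df)
```

The Bird row comes out with a D13C mean of -16.4 and a D15N mean of 7.6, in line with the Means table. The xmin/xmax and ymin/ymax columns can then be mapped directly in aes() for the horizontal and vertical bars.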

Related

How to add labels at the top of vlines in ggplot2 and add these in separate legends

I have created a dummy dataframe representative of my data-
SQ AgeGroup Prop LCI UCI
2010-1 0 to 18 4.3 4.2 4.4
2010-1 19 to 25 5.6 5.3 5.6
2010-1 26 and over 7.8 7.6 7.9
2010-2 0 to 18 4.1 3.9 4.2
2010-2 19 to 25 5.8 5.6 5.9
2010-2 26 and over 8.1 7.9 8.3
2010-3 0 to 18 4.2 4 4.4
2010-3 19 to 25 5.5 5.2 5.6
2010-3 26 and over 7.6 7.4 7.7
2010-4 0 to 18 3.9 3.6 4.1
2010-4 19 to 25 5.2 5 5.4
2010-4 26 and over 7.4 7.2 7.6
2011-1 0 to 18 4.3 4.1 4.5
2011-1 19 to 25 5.7 5.5 5.8
2011-1 26 and over 8.2 8 8.3
2011-2 0 to 18 4.1 4 4.5
2011-2 19 to 25 5.7 5.5 5.9
2011-2 26 and over 8.2 8 8.4
2011-3 0 to 18 4.4 4.2 4.6
2011-3 19 to 25 5.7 5.5 7.9
2011-3 26 and over 8.2 8 8.4
which creates an image that looks like this-
I have used the following code-
library(readxl)
library(dplyr)
library(epitools)
library(gtools)
library(reshape2)
library(binom)
library(pivottabler)
library(readxl)
library(phecharts)
library(ggplot2)
library(RODBC)
rm(list=ls())
df<-read_xlsx("Dummydata.xlsx")
pd<-position_dodge(width=0.3)
limits <- aes(ymax = UCI, ymin = LCI) # refer to columns by name; the layer takes them from df
p<-ggplot(df, aes(x = SQ, y =Prop, group=AgeGroup, colour= AgeGroup)) +
geom_line(position=pd)+
geom_point(size=2.0, position=pd)+
geom_errorbar(limits, width = 0.55, size=0.4, position= pd)+
labs(
y = "Percentage",
x = "Study Quarter")
p<-p +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))+
scale_y_continuous(name="Percentage",breaks=c(0,2,4,6,8,10),limits=c(0,10))+#limits need to change with every plot
scale_fill_manual(values = pal)+# pal (a vector of colours) is not defined in this snippet
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1,size=16))+
theme(axis.text.y=element_text(size=16))+
theme(legend.text = element_text(size=18))+
theme(legend.title=element_text(size=16))+
theme(legend.title=element_blank())+
theme(legend.position="bottom")+
theme(axis.title = element_text(size=22))
p + geom_vline(xintercept = c(2,4,6), linetype="dotted",
color = "black", size=1.0, show.legend = TRUE)
However, what I want is for each of the three vertical lines to have a label (L1, L2 and L3) at its top, plus a separate legend at the bottom where I can explain what these lines stand for. Something like this:
L1: Launch of x
L2: Launch of y
L3: Launch of z
Can someone please help with this?
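One possible approach, offered as a sketch rather than a definitive answer: keep the line positions, short labels and descriptions in a small data frame, map the line type to the description so that the lines get their own legend, and place the short labels with annotate(). The data frame name events and its column names are assumptions made for this illustration, and only the "0 to 18" age group from the dummy data is used to keep the example short.

```r
library(ggplot2)

# First six quarters of the "0 to 18" group from the dummy data
df <- data.frame(
  SQ   = c("2010-1", "2010-2", "2010-3", "2010-4", "2011-1", "2011-2"),
  Prop = c(4.3, 4.1, 4.2, 3.9, 4.3, 4.1)
)

# One row per reference line (hypothetical helper table)
events <- data.frame(
  x     = c(2, 4, 6),
  label = c("L1", "L2", "L3"),
  desc  = c("L1: Launch of x", "L2: Launch of y", "L3: Launch of z")
)

p <- ggplot(df, aes(x = SQ, y = Prop, group = 1)) +
  geom_line() +
  geom_point() +
  # Mapping linetype to the description gives the lines a separate legend
  geom_vline(data = events, aes(xintercept = x, linetype = desc)) +
  # y = Inf pins each label to the top edge; vjust moves it back inside the panel
  annotate("text", x = events$x, y = Inf, label = events$label, vjust = 1.5) +
  # Same dotted style for all three lines; only the legend text differs
  scale_linetype_manual(name = NULL, values = rep("dotted", 3)) +
  theme(legend.position = "bottom")

p
```

In the full plot, the colour legend for AgeGroup and this line type legend would both sit at the bottom, since legend.position applies to all legends.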

Group geom_point with the geom_polygon

My dataset:
Taxa dn dc
Cha 10.2 -20.4
Cha 10.7 -19.7
Cha 4.9 -21.0
Cha 5.4 -20.6
Cha 8.6 -21.2
Cha 8.0 -20.9
Cha 8.1 -21.3
Cha 6.9 -21.1
Cha 8.5 -21.1
Cha 9.1 -20.8
Hyd 6.6 -19.2
Hyd 10.2 -17.0
Hyd 9.7 -18.2
Hyd 8.1 -16.5
Hyd 8.8 -15.8
Hyd 8.7 -15.8
Hyd 7.6 -18.3
Hyd 8.9 -16.0
Hyd 8.4 -17.5
Hyd 9.8 -18.8
Hyd 8.3 -18.4
Scy 9.4 -20.1
Scy 9.1 -20.0
Scy 7.8 -20.2
Scy 9.1 -17.6
Scy 8.2 -19.8
Scy 9.4 -19.2
Scy 9.0 -20.1
Sip 5.7 -15.2
Sip 6.2 -18.6
Sip 5.6 -18.0
Sip 8.6 -17.6
Sip 4.8 -16.9
Sip 5.2 -15.4
Sip 1.9 -18.4
The code I use is:
library(ggplot2)
ggplot(mydata, aes(x=dc, y=dn, colour=Taxa, shape=Taxa))+
geom_point(size=2, alpha=0.5)+
geom_polygon(aes(fill=Taxa, group=Taxa))+
theme(legend.position = "none")
I would like to draw one polygon per group, using "Taxa" as the grouping variable. However, the polygon connects every point in the order it appears in the data.
What I want is like this one. How should I edit my code?
To connect the outer points of each group and enclose the points that lie inside it, use the geom_encircle function from the ggalt package.
library(ggplot2)
library(ggalt)
ggplot(mydata, aes(dc, dn)) +
geom_point(aes(color = Taxa)) +
geom_encircle(aes(fill = Taxa), s_shape = 1, expand = 0,
alpha = 0.2, color = "black", show.legend = FALSE)
Use s_shape = 1 and expand = 0 to connect outer points, otherwise it will encircle with margins.
You can also calculate the convex hulls, then plot them:
library(ggplot2)
library(plyr)
# some fake data:
mydata <- data.frame(Taxa = c('Cha','Cha','Cha','Cha','Cha','Cha','Hyd','Hyd','Hyd','Hyd','Hyd','Hyd'),
dn = c(10.2,10.7,4.9,5.4,8.6,8.0, 6.6,10.2,9.7,8.1,8.8,8.7),
dc =c(-20.4,-19.7,-21.0,-20.6,-21.2,-20.9,-19.2,-17.0,-18.2,-16.5,-15.8,-15.8))
# calculate convex hulls:
chulls <- ddply(mydata, .(Taxa), function(mydata) mydata[chull(mydata$dn, mydata$dc), ])
# plot them:
ggplot(data=mydata, aes(x=dn, y=dc, color=Taxa)) + geom_point() +
# alpha is a fixed value, so it goes outside aes() (inside, it would be mapped and get a legend)
geom_polygon(data=chulls, aes(x=dn, y=dc, fill=Taxa), alpha=0.2)
Nice source here.
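The per-group hull calculation can also be done without plyr. Below is a minimal base R sketch, assuming a data frame with one grouping column and two coordinate columns. The helper name hull_by_group and the toy data are made up for this illustration. chull() comes from grDevices, which is attached in a standard R session.

```r
# Keep only the rows that lie on the convex hull of each group
hull_by_group <- function(data, x, y, group) {
  pieces <- lapply(split(data, data[[group]]), function(g) {
    g[chull(g[[x]], g[[y]]), , drop = FALSE]
  })
  do.call(rbind, pieces)
}

# Toy data: two unit squares, each with one interior point
toy <- data.frame(
  Taxa = rep(c("A", "B"), each = 5),
  dc   = c(0, 1, 1, 0, 0.5,   2, 3, 3, 2, 2.5),
  dn   = c(0, 0, 1, 1, 0.5,   0, 0, 1, 1, 0.5)
)

hulls <- hull_by_group(toy, "dc", "dn", "Taxa")
print(hulls)  # the interior points (dc = 0.5 and dc = 2.5) are dropped
```

The result has the same columns as the input, so it can be passed to geom_polygon(data = hulls, ...) in the same way as chulls above.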

Merge information on the same plot

I have some data that looks like this:
Expression1 Expression2 CellType Patient
9.34 8.23 3.2 A
8.2 3.2 10.9 B
2.12 5.3 12.9 B
2.10 1.3 2.9 B
2.12 1.5 2.9 A
2.11 9.5 6.9 A
... .... ... ... ....
I would like to generate a plot (with ggplot) with Expression1 and Expression2 on the y and x axes respectively and dots coloured in a gradient of a single color according to the CellType column and at the same time distinguishing between Patient A and B on the same plot.
Can anyone help me please?
ggplot(myDF, aes(Expression1, Expression2)) + geom_point(aes(colour = CellType)) + scale_colour_gradient2(low = "black", mid = "white", high = "red") + ggtitle("First_attempt")
I don't know how to add a gradient for Patient
Thank you in advance
The below seems to work fine:
dt <- data.table::fread('Expression1 Expression2 CellType Patient
9.34 8.23 3.2 A
8.2 3.2 10.9 B
2.12 5.3 12.9 B
2.10 1.3 2.9 B
2.12 1.5 2.9 A
2.11 9.5 6.9 A ')
library(ggplot2)
ggplot(dt) + geom_point(aes(x = Expression2, y = Expression1,
color = CellType, shape = Patient))

Making a stacked bar plot based on ranges in R and plotly

I want to create a stacked bar chart in R and plotly using iris dataset. In the x-axis, I want to set limits like iris_limits below in the code and the y-axis should contain all the Sepal.Length values which fit into these ranges. I want to pass the values as a single vector. Also, if the limits can be made dynamic by understanding the range of the Sepal.Length instead of hard coding it, please help. I have written a basic script with values to give you an idea. Thanks.
library(plotly)
iris_limits <- c("1-4", "4-6", "6-8")
sepal <- c(2.4,5.4,7.1)
data <- data.frame(iris_limits, sepal)
p <- plot_ly(data, x = ~iris_limits, y = ~sepal, type = 'bar', name =
'Sepal') %>%
layout(yaxis = list(title = 'Count'), barmode = 'group')
p
I tried my best to understand. First, divide the sepal lengths into the desired categories, iris_limits: "1-3", "3-6", "6-9"
iris$iris_limits <- cut(iris$Sepal.Length, c(1,3,6,9))
Note: no sepal length is in between 1-3, so you only have 2 groups.
Then you want each sepal length range as a separate bar on the x axis, with every individual sepal length that falls into that range stacked on top of the others? You linked to a stacked bar chart with a different colour for each stacked segment; is this what you want?
Create an ID for each sepal length:
iris$ID <- factor(1:nrow(iris))
Plot, set color=~ID if you want different colors for the stacked bars:
library(plotly)
p <- plot_ly(iris, x = ~iris_limits, y = ~Sepal.Length, type = 'bar', color=~ID) %>%
layout(yaxis = list(title = 'Count'), barmode = 'stack')
EDITED For version that is not stacked but grouped by iris_limits, I switched to ggplot2 to make use of facet_wrap functionality to segregate by iris_limits, then use ggplotly.
gg <- ggplot(iris, aes(x=ID, y=Sepal.Length, fill=iris_limits)) +
geom_bar(stat="identity", position="dodge") +
facet_wrap(~iris_limits, scales="free_x", labeller=label_both) +
theme_minimal() + xlab("") + ylab("Sepal Length") +
theme(axis.text.x=element_blank())
ggplotly(gg)
EDITED: Re: Changing legend title and tooltip display
To change the legend title, use labs. Here it was also necessary to reduce the legend.title font size under theme so that the title fits within the ggplotly margins.
To change the tooltip text, add a text aesthetic to aes that builds the desired character string, then list the aesthetics to display in the tooltip argument of ggplotly.
gg <- ggplot(iris, aes(x=ID, y=Sepal.Length, fill=iris_limits,
text=paste("Sepal Length:", Sepal.Length, "cm"))) +
geom_bar(stat="identity", position="dodge") +
facet_wrap(~iris_limits, scales="free_x") +
theme_minimal() + xlab("") + ylab("Sepal Length (cm)") +
theme(axis.text.x=element_blank(), legend.title=element_text(size=10)) +
labs(fill="Sepal \nLength (cm)")
ggplotly(gg, tooltip=c("x", "text"))
Try using cut:
library(plotly)
iris$iris_limits <- as.numeric(cut(iris$Sepal.Length,3))
p <- plot_ly(iris, x = ~iris_limits, y = ~Sepal.Length, type = 'bar', name =
'Sepal') %>%
layout(yaxis = list(title = 'Count'), barmode = 'group')
p
The grouping details:
> iris$Sepal.Length[iris$iris_limits==1]
[1] 5.1 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9 5.4 4.8 4.8 4.3 5.4 5.1 5.1 5.4 5.1 4.6 5.1 4.8 5.0 5.0 5.2 5.2 4.7 4.8
[29] 5.4 5.2 5.5 4.9 5.0 5.5 4.9 4.4 5.1 5.0 4.5 4.4 5.0 5.1 4.8 5.1 4.6 5.3 5.0 5.5 4.9 5.2 5.0 5.5 5.5 5.4 5.5 5.5
[57] 5.0 5.1 4.9
> iris$Sepal.Length[iris$iris_limits==2]
[1] 5.8 5.7 5.7 6.4 6.5 5.7 6.3 6.6 5.9 6.0 6.1 5.6 6.7 5.6 5.8 6.2 5.6 5.9 6.1 6.3 6.1 6.4 6.6 6.7 6.0 5.7 5.8 6.0
[29] 6.0 6.7 6.3 5.6 6.1 5.8 5.6 5.7 5.7 6.2 5.7 6.3 5.8 6.3 6.5 6.7 6.5 6.4 5.7 5.8 6.4 6.5 6.0 5.6 6.3 6.7 6.2 6.1
[57] 6.4 6.4 6.3 6.1 6.3 6.4 6.0 6.7 5.8 6.7 6.7 6.3 6.5 6.2 5.9
> iris$Sepal.Length[iris$iris_limits==3]
[1] 7.0 6.9 6.8 7.1 7.6 7.3 7.2 6.8 7.7 7.7 6.9 7.7 7.2 7.2 7.4 7.9 7.7 6.9 6.9 6.8
>
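On the request for dynamic limits: the breaks can be derived from the data rather than typed in. The sketch below is one way to do it in base R, assuming that "round" break points are acceptable. pretty() is used because its output always spans the range of the data, and the target of about three intervals is an arbitrary choice. The names sepal, breaks, iris_limits and counts are illustrative.

```r
# Derive break points from the observed range instead of hard coding them
sepal  <- iris$Sepal.Length
breaks <- pretty(sepal, n = 3)  # roughly three intervals covering min..max

# include.lowest = TRUE keeps a value equal to the first break in the first bin
iris_limits <- cut(sepal, breaks = breaks, include.lowest = TRUE)

# One count per range, usable as bar heights
counts <- as.data.frame(table(iris_limits), responseName = "n")
print(counts)
```

The counts data frame has one row per range, so it could be handed to plot_ly(counts, x = ~iris_limits, y = ~n, type = 'bar') if a bar per range is wanted.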

0 x 0 matrix when running PCA in FactoMineR

I'm trying to run a principal component analysis (PCA) indicating the quantitative data and the qualitative data, but I get this error when performing:
library(FactoMineR)
PCA(data, quanti.sup = 4:12, quali.sup = 1:3, scale.unit = FALSE, ncp=2)
Error in eigen(t(X) %*% X, symmetric = TRUE) : 0 x 0 matrix
My data is a 2980 x 12 data frame with names, so it's really weird.
Any advice would be very much appreciated.
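The problem you encountered arises because you have specified all of your variables as supplementary when you call PCA(): quali.sup = 1:3 and quanti.sup = 4:12 together cover all 12 columns, so no active variables are left for the analysis.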
To illustrate with an example we can use the built in dataset USJudgeRatings.
head(USJudgeRatings)
CONT INTG DMNR DILG CFMG DECI PREP FAMI ORAL WRIT PHYS RTEN
AARONSON,L.H. 5.7 7.9 7.7 7.3 7.1 7.4 7.1 7.1 7.1 7.0 8.3 7.8
ALEXANDER,J.M. 6.8 8.9 8.8 8.5 7.8 8.1 8.0 8.0 7.8 7.9 8.5 8.7
ARMENTANO,A.J. 7.2 8.1 7.8 7.8 7.5 7.6 7.5 7.5 7.3 7.4 7.9 7.8
BERDON,R.I. 6.8 8.8 8.5 8.8 8.3 8.5 8.7 8.7 8.4 8.5 8.8 8.7
BRACKEN,J.J. 7.3 6.4 4.3 6.5 6.0 6.2 5.7 5.7 5.1 5.3 5.5 4.8
BURNS,E.B. 6.2 8.8 8.7 8.5 7.9 8.0 8.1 8.0 8.0 8.0 8.6 8.6
In this data there are 43 judges who were rated on 11 qualities by lawyers (columns 2:12). Column 1 is the number of contacts the lawyers had with the judge.
The PCA won't work if you specify that all variables are supplementary.
library(FactoMineR)
result <- PCA(USJudgeRatings, ncp = 3, quanti.sup = 1:12)
# Error in eigen(t(X) %*% X, symmetric = TRUE) : 0 x 0 matrix
We have to give the PCA some variables to work with. Instead, we let our 11 variables go into the PCA and specify only the number of contacts the lawyers had with the judges as a quantitative supplementary variable:
result <- PCA(USJudgeRatings, ncp = 3, quanti.sup = 1)
This runs and you can then view the results with summary.PCA(result).
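A quick way to catch this before calling PCA() is to check which columns remain active once the supplementary ones are set aside. The following base R sketch illustrates the idea only; it is not FactoMineR's internal code, and active_columns is a hypothetical helper written for this example.

```r
# Columns left for the analysis once the supplementary ones are set aside
active_columns <- function(n_cols, quanti.sup = NULL, quali.sup = NULL) {
  setdiff(seq_len(n_cols), c(quanti.sup, quali.sup))
}

# The call from the question: all 12 columns are declared supplementary
none_left <- active_columns(12, quanti.sup = 4:12, quali.sup = 1:3)
length(none_left)  # 0

# With no active columns, X has zero columns, so t(X) %*% X is 0 x 0
X <- as.matrix(USJudgeRatings)[, none_left, drop = FALSE]
dim(t(X) %*% X)  # 0 0

# The working call above keeps columns 2 to 12 active
active_columns(12, quanti.sup = 1)
```

If the first result is empty, at least one column has to be removed from quanti.sup or quali.sup before the analysis can run.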
