Multiple Plots in R - r

I want to plot 2 graphs in 1 frame. Basically I want to compare the results.
Anyways, the code I tried is:
plot(male,pch=16,col="red")
lines(male,pch=16,col="red")
par(new=TRUE)
plot(female,pch=16,col="green")
lines(female,pch=16,col="green")
When I run it, I DO get 2 plots in a frame BUT it changes my y-axis. Added my plot below. Anyways, y-axis values are -4,-4,-3,-3,...
It's like both of the plots display their own axis.
Please help.
Thanks

You don't need the second plot. Just use
> plot(male,pch=16,col="red")
> lines(male, pch=16, col = "red")
> lines(female, pch=16, col = "green")
> points(female, pch=16, col = "green")
Note: that will set the frame boundaries based on the first data set, so some data from the second plot could be outside the boundaries of the plot. You can fix it by e.g. setting the limits of the first plot yourself.

For this kind of plot I usually like the plotting with ggplot2 much better. The main reason: It generalizes nicely to more than two lines without a lot of code.
The drawback for your sample data is that it is not available as a data.frame, which is required for ggplot2. Furthermore, in every case you need a x-variable to plot against. Thus, first let us create a data.frame out of your data.
dat <- data.frame(index=rep(1:10, 2), vals=c(male, female), group=rep(c('male', 'female'), each=10))
Which leaves us with
> dat
index vals group
1 1 -0.4334269341 male
2 2 0.8829902521 male
3 3 -0.6052638138 male
4 4 0.2270191965 male
5 5 3.5123679143 male
6 6 0.0615821014 male
7 7 3.6280155376 male
8 8 2.3508890457 male
9 9 2.9824432680 male
10 10 1.1938052833 male
11 1 1.3151289227 female
12 2 1.9956491556 female
13 3 0.8229389822 female
14 4 1.2062726250 female
15 5 0.6633392820 female
16 6 1.1331669670 female
17 7 -0.9002109636 female
18 8 3.2137052284 female
19 9 0.3113656610 female
20 10 1.4664434215 female
Note that my command assumes you have 10 data values each. That command would have to be adjusted according to your actual data.
Now we may use the mighty power of ggplot2:
library(ggplot2)
ggplot(dat, aes(x=index, y=vals, color=group)) + geom_point() + geom_line()
The call above has three elements: ggplot initializes the plot, tells R to use dat as datasource and defines the plot aesthetics, or better: Which aesthetic properties of the plot (such as color, position, size, etc.) are influenced by your data. We use the x and y-values as expected and furthermore set the color aesthetic to the grouping variable - that makes ggplot automatically plot two groups with different colors. Finally, we add two geometries, that pretty much do what is written above: Draw lines and draw points.
The result:
If you have your data saved in the standard way in R (in a data.frame), you end with one line of code. And if after some thousands years of evolution you want to add another gender, it is still one line of code.

Related

ggplot multiple lines in same graph

I am trying to plot multiple gene expressions over time in the same graph to demonstrate a similar profile and then add a line to illustrate the mean of total for each timepoint (like the figure 4b in recent Nature comm article https://www.nature.com/articles/s41467-017-02546-5/figures/4). My data has been normalised to be around 0 so they are all on the same scale.
df2 sample:
variable value gene
1 5 -0.610384193 1
2 5 -6.25967087 2
3 5 -3.773389731 3
50 6 -0.358879035 1
51 6 -6.066341017 2
52 6 -4.202998579 3
99 7 -0.103885903 1
100 7 -6.648844687 2
101 7 -5.041554127 3
I plot the expression levels with ggplot2:
plotC <- ggplot(df2, aes(x=variable, y=value, group=factor(gene), colour=gene)) + geom_line(size=0.5, aes(color=gene), alpha=0.4)
But adding the mean line in red to this plot is proving difficult. I calculated the means and put them in another dataframe:
means
value variable gene
1 -1.5037354 5 50
2 -0.8783492 6 50
3 -0.7769085 7 50
Then tried adding them as another layer:
plotC + geom_line(data=means, aes(x=variable, y=value, color="red", group=factor(gene)), size=0.75)
But I get an error Error: Discrete value supplied to continuous scale
Do you have any suggestions as to how I can plot this mean on the same graph in another color?
Thank you,
Anna
edit: the answer by RG20 is helpful, thanks for pointing out I had the color in the wrong place. However it plots the line outside the rest of the graph... I really don't understand what's wrong with my graph...
enter image description here
plotC + geom_line(data=means, aes(x=variable, y=value, group=factor(gene)), color='red',size=0.75)

Cumulative plot in R

I try to make a cumulative plot for a particular (for instance the first) column of my data (example):
1 3
2 5
4 9
8 11
12 17
14 20
16 34
20 40
Than I want to overlap this plot with another cumulative plot of another data (for example the second column) and save it as a png or jpg image.
Without using the vectors implementation "by hand" as in Cumulative Plot with Given X-Axis because if I have a very large dataset i can't be able to do that.
I try the follow simple commands:
A <- read.table("cumul.dat", header=TRUE)
Read the file, but now I want that the cumulative plot is down with a particular column of this file.
The command is:
cdat1<-cumsum(dat1)
but this is for a particular vector dat1 that I need to take from the data array (cumul.dat).
Thanks
I couldn't follow your question so this is a shot in the dark answer based on key words I did get:
m <- read.table(text=" 1 3
2 5
4 9
8 11
12 17
14 20
16 34
20 40")
library(ggplot2)
m2 <- stack(m)
qplot(rep(1:nrow(m), 2), values, colour=ind, data=m2, geom="step")
EDIT I decided I like this approach bettwe:
library(ggplot2)
library(reshape2)
m$x <- seq_len(nrow(m))
m2 <- melt(m, id='x')
qplot(x, value, colour=variable, data=m2, geom="step")
I wasn't quite sure when the events were happening and what the observations were. I'm assuming the events are just at 1,2,3,4 and the columns represent sounds of the different groups. If that's the case, using Lattice I would do
require(lattice)
A<-data.frame(dat1=c(1,2,4,8,12,14,16,20), dat2=c(3,5,9,11,17,20,34,40))
dd<-do.call(make.groups, lapply(A, function(x) {data.frame(x=seq_along(x), y=cumsum(x))}))
xyplot(y~x,dd, groups=which, type="s", auto.key=T)
Which produces
With base graphics, this can be done by specifying type='s' in the plot call:
matplot(apply(A, 2, cumsum), type='s', xlab='x', ylab='y', las=1)
Note I've used matplot here, but you could also plot the series one at a time, the first with plot and the second with points or lines.
We could also add a legend with, for example:
legend('topleft', c('Series 1', 'Series 2'), bty='n', lty=c(1, 3), col=1:2)

Adding bidirectional error bars to points on scatter plot in ggplot

I am trying to add x and y axis error bars to each individual point in a scatter plot.
Each point represents a standardized mean value for fitness for males and females (n=33).
I have found the geom_errorbar and geom_errorbarh functions and this example
ggplot2 : Adding two errorbars to each point in scatterplot
However my issue is that I want to specify the standard error for each point (which I have already calculated) from another column in my dataset which looks like this below
line MaleBL1 FemaleBL1 BL1MaleSE BL1FemaleSE
3 0.05343516 0.05615977 0.28666600 0.3142001
4 -0.53321642 -0.27279609 0.23929438 0.1350793
5 -0.25853484 -0.08283566 0.25904025 0.2984323
6 -1.11250479 0.03299387 0.23553281 0.2786233
7 -0.14784506 0.28781883 0.27872358 0.2657080
10 0.38168220 0.89476555 0.25620796 0.3108585
11 0.24466921 0.14419021 0.27386482 0.3322349
12 -0.06119015 1.42294820 0.32903199 0.3632367
14 0.38957538 1.66850680 0.30362671 0.4437925
15 0.05784842 -0.12453429 0.32319116 0.3372879
18 0.71964923 -0.28669563 0.16336556 0.1911489
23 0.03191843 0.13955703 0.34522310 0.1872229
28 -0.04598340 -0.35156017 0.27001451 0.1822967
'line' is the population (n=10 individuals in each) from where each value comes from my x,y variables are 'MaleBL1' & 'FemaleBL1' and the standard error for each populations for males and females respectively 'BL1MaleSE' & 'BL1FemaleSE'
So far code wise I have
p<-ggplot(BL1ggplot, aes(x=MaleBL1, y=FemaleBL1)) +
geom_point(shape=1) +
geom_smooth(method=lm)+ # add regression line
xmin<-(MaleBL1-BL1MaleSE)
xmax<-(MaleBL1+BL1MaleSE)
ymin<-(FemaleBL1-BL1FemaleSE)
ymax<-(FemaleBL1+BL1FemaleSE)
geom_errorbarh(aes(xmin=xmin,xmax=xmax))+
geom_errorbar(aes(ymin=ymin,ymax=ymax))
I think the last two lines are wrong with specifying the limits of the error bars. I just don't know how to tell R where to take the SE values for each point from the columns BL1MaleSE and BL1FemaleSE
Any tips greatly appreciated
You really should study some tutorials. You haven't understood ggplot2 syntax.
BL1ggplot <- read.table(text=" line MaleBL1 FemaleBL1 BL1MaleSE BL1FemaleSE
3 0.05343516 0.05615977 0.28666600 0.3142001
4 -0.53321642 -0.27279609 0.23929438 0.1350793
5 -0.25853484 -0.08283566 0.25904025 0.2984323
6 -1.11250479 0.03299387 0.23553281 0.2786233
7 -0.14784506 0.28781883 0.27872358 0.2657080
10 0.38168220 0.89476555 0.25620796 0.3108585
11 0.24466921 0.14419021 0.27386482 0.3322349
12 -0.06119015 1.42294820 0.32903199 0.3632367
14 0.38957538 1.66850680 0.30362671 0.4437925
15 0.05784842 -0.12453429 0.32319116 0.3372879
18 0.71964923 -0.28669563 0.16336556 0.1911489
23 0.03191843 0.13955703 0.34522310 0.1872229
28 -0.04598340 -0.35156017 0.27001451 0.1822967", header=TRUE)
library(ggplot2)
p<-ggplot(BL1ggplot, aes(x=MaleBL1, y=FemaleBL1)) +
geom_point(shape=1) +
geom_smooth(method=lm)+
geom_errorbarh(aes(xmin=MaleBL1-BL1MaleSE,
xmax=MaleBL1+BL1MaleSE),
height=0.2)+
geom_errorbar(aes(ymin=FemaleBL1-BL1FemaleSE,
ymax=FemaleBL1+BL1FemaleSE),
width=0.2)
print(p)
Btw., looking at the errorbars you should probably use Deming regression or Total Least Squares instead of OLS regression.

Simplest way to do grouped barplot

I have the following dataframe:
Catergory Reason Species
1 Decline Genuine 24
2 Improved Genuine 16
3 Improved Misclassified 85
4 Decline Misclassified 41
5 Decline Taxonomic 2
6 Improved Taxonomic 7
7 Decline Unclear 41
8 Improved Unclear 117
I'm trying to make a grouped bar chart, species as height and then 2 colours for catergory.
here is my code:
Reasonstats<-read.csv("bothstats.csv")
Reasonstats2<-as.matrix(Reasonstats[,3])
barplot((Reasonstats2),beside=T,col=c("darkblue","red"),ylab="number of
species",names.arg=Reasonstats$Reason, cex.names=0.8,las=2,space=c(0,100)
,ylim=c(0,120))
box(bty="l")
Now what I want, is to not have to label the two bars twice and to group them apart, I've tried changing the space value to all sorts of things and it doesn't seem to move the bars apart. Can anyone tell me what I'm doing wrong?
with ggplot2:
library(ggplot2)
Animals <- read.table(
header=TRUE, text='Category Reason Species
1 Decline Genuine 24
2 Improved Genuine 16
3 Improved Misclassified 85
4 Decline Misclassified 41
5 Decline Taxonomic 2
6 Improved Taxonomic 7
7 Decline Unclear 41
8 Improved Unclear 117')
ggplot(Animals, aes(factor(Reason), Species, fill = Category)) +
geom_bar(stat="identity", position = "dodge") +
scale_fill_brewer(palette = "Set1")
Not a barplot solution but using lattice and barchart:
library(lattice)
barchart(Species~Reason,data=Reasonstats,groups=Catergory,
scales=list(x=list(rot=90,cex=0.8)))
There are several ways to do plots in R; lattice is one of them, and always a reasonable solution, +1 to #agstudy. If you want to do this in base graphics, you could try the following:
Reasonstats <- read.table(text="Category Reason Species
Decline Genuine 24
Improved Genuine 16
Improved Misclassified 85
Decline Misclassified 41
Decline Taxonomic 2
Improved Taxonomic 7
Decline Unclear 41
Improved Unclear 117", header=T)
ReasonstatsDec <- Reasonstats[which(Reasonstats$Category=="Decline"),]
ReasonstatsImp <- Reasonstats[which(Reasonstats$Category=="Improved"),]
Reasonstats3 <- cbind(ReasonstatsImp[,3], ReasonstatsDec[,3])
colnames(Reasonstats3) <- c("Improved", "Decline")
rownames(Reasonstats3) <- ReasonstatsImp$Reason
windows()
barplot(t(Reasonstats3), beside=TRUE, ylab="number of species",
cex.names=0.8, las=2, ylim=c(0,120), col=c("darkblue","red"))
box(bty="l")
Here's what I did: I created a matrix with two columns (because your data were in columns) where the columns were the species counts for Decline and for Improved. Then I made those categories the column names. I also made the Reasons the row names. The barplot() function can operate over this matrix, but wants the data in rows rather than columns, so I fed it a transposed version of the matrix. Lastly, I deleted some of your arguments to your barplot() function call that were no longer needed. In other words, the problem was that your data weren't set up the way barplot() wants for your intended output.
I wrote a function wrapper called bar() for barplot() to do what you are trying to do here, since I need to do similar things frequently. The Github link to the function is here. After copying and pasting it into R, you do
bar(dv = Species,
factors = c(Category, Reason),
dataframe = Reasonstats,
errbar = FALSE,
ylim=c(0, 140)) #I increased the upper y-limit to accommodate the legend.
The one convenience is that it will put a legend on the plot using the names of the levels in your categorical variable (e.g., "Decline" and "Improved"). If each of your levels has multiple observations, it can also plot the error bars (which does not apply here, hence errbar=FALSE

How can I produce this graph with ggplot2

This is the basic graph R presents when plotting a data frame.
plot(df)
It displays the relationship between all variables.
I know about faceting in ggplot2 but it's used for partition according to specific variables. I want to facet by a target parameter (for color) and split the grid by the variables.
sample data:
prediction.date mean.forcast mean.Error standard.Deviation AIC param.u param.v
2012-08-29 0.0015608102 0.008296402 0.008296402 -6.165365 2 5
2012-08-30 -0.0002720289 0.008537309 0.008537309 -6.164167 2 4
2012-09-02 -0.0014277972 0.008194409 0.008194409 -6.168868 4 0
2012-09-03 0.0016537998 0.008062687 0.008062687 -6.176634 5 3
2012-09-04 -0.0030247699 0.007885009 0.007885009 -6.181844 4 3
2012-09-05 0.0001538991 0.007524703 0.007524703 -6.197240 3 4
If you just need to color points in plot you provided, then you can use argument col= in plot() and set names of colors and variable to use in determining color.
#variable of test result (should be the same length as number of rows in df)
test.result<-c(0,1,1,0,0,1)
plot(df[,3:7],col=c("green","red")[as.factor(test.result)])

Resources