Overlap of time series in ggplot2 keeping the x labels - r

How can I overlap two time series with ggplot2 and keep both X labels (one with 1970 and another with 1980)?
This is an overview of my datasets and the code I use to plot each graphic.
> dataset1.data
Date Obs
1 1/1/1970 2.0
2 1/2/1970 1.0
3 1/3/1970 0.0
4 1/4/1970 0.0
5 1/5/1970 0.5
6 1/6/1970 5.1
7 1/7/1970 0.0
8 1/8/1970 0.0
> dataset2.data
Date Obs
1 1/1/1980 3.0
2 1/2/1980 0.5
3 1/3/1980 0.5
4 1/4/1980 5.0
5 1/5/1980 0.4
6 1/6/1980 6.2
7 1/7/1980 9.0
8 1/8/1980 1.3
qplot(main="Observations 1")+xlab("Date")+ylab("Obs")+
geom_point(data = dataset1.data,aes(Date, Obs, colour="blue"),alpha = 0.7,na.rm = TRUE)+
scale_colour_identity("Legend", breaks=c("blue"), labels="1970")
qplot(main="Observations 2")+xlab("Date")+ylab("Obs")+
geom_point(data = dataset2.data,aes(Date, Obs, colour="red"),alpha = 0.7,na.rm = TRUE)+
scale_colour_identity("Legend", breaks=c("red"), labels="1980")

I would put them both in a single dataset, and then use a new Year variable for the color aesthetic:
dataset1.data = read.table('dataset1.txt')
dataset2.data = read.table('dataset2.txt')
dataset1.data$Date = as.Date(dataset1.data$Date, format='%m/%d/%Y')
dataset2.data$Date = as.Date(dataset2.data$Date, format='%m/%d/%Y')
data = rbind(dataset1.data, dataset2.data)
library(ggplot2)
# MonthDay gives both years a shared x axis; Year drives the colour
data = transform(data, MonthDay=format(Date, '%m-%d'), Year=format(Date, '%Y'))
ggplot(data, aes(MonthDay, Obs, colour=Year))+geom_point(alpha = 0.7,na.rm = TRUE)+ggtitle("Observations")+xlab("Date")+ylab("Obs")
It's probably also possible to do it by editing the grid objects. For example, see: https://github.com/hadley/ggplot2/wiki/Editing-raw-grid-objects-from-a-ggplot
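A variation on the approach above, as a minimal sketch: instead of a character MonthDay column, every date can be re-anchored on one reference year so the x axis stays a true date scale, while Year is kept for the colour. The small d1 and d2 data frames, the names both and Aligned, and the reference year 2000 are assumptions made for illustration; they are not part of the original answer.

```r
library(ggplot2)

# Hypothetical stand-ins for the first rows of dataset1.data and dataset2.data
d1 <- data.frame(Date = c("1/1/1970", "1/2/1970", "1/3/1970"), Obs = c(2.0, 1.0, 0.0))
d2 <- data.frame(Date = c("1/1/1980", "1/2/1980", "1/3/1980"), Obs = c(3.0, 0.5, 0.5))

both <- rbind(d1, d2)
both$Date <- as.Date(both$Date, format = "%m/%d/%Y")
both$Year <- format(both$Date, "%Y")

# Re-anchor every observation on one reference year (2000 is an arbitrary
# choice; being a leap year, it also accepts 29 February)
both$Aligned <- as.Date(paste0("2000-", format(both$Date, "%m-%d")))

p <- ggplot(both, aes(Aligned, Obs, colour = Year)) +
  geom_point(alpha = 0.7, na.rm = TRUE) +
  scale_x_date(date_labels = "%b %d") +   # label ticks without the dummy year
  labs(title = "Observations", x = "Date", y = "Obs")
p
```

The legend then carries the 1970/1980 distinction, and the axis shows only month and day.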

Related

Color and shape coding within ggplot

I am working with a chemical dataset. I want to color-code the geom_points by the depth at which they were sampled and set the shape by the date they were sampled on. I also want to add a thin black border to all the geom_points in order to distinguish them.
Here is a sample table:
ID Depth(m) Sampling Date Cl Br
1 1 May 4.0 .05
2 1 June 5.0 .07
3 2 May 6.0 .03
4 2 June 7.0 .05
5 3 May 8.0 .01
6 3 June 9.0 .03
7 4 May 10.0 .00
8 4 June 11.0 .01
I am trying to use the code
graph <- df %>%
ggplot(aes(x = Cl, y = Br, fill = Depth, shape = Sampling Date), color = black) +
geom_point(shape = c(21:24, size = 4) +
labs(x = "Cl", y = "Br")
graph
But every time I do this, it just fills the shapes in black, ignoring the color specification. I also need to use the shapes 21:25, but every time I try to specify the number of shapes, it says that it doesn't match the number of variables in my dataset.
Your code is somewhat filled with ... challenges.
Remove all spaces from the column names! That makes your life easier. Also add the shape aesthetic to geom_point and specify the shapes with a scale call.
library(ggplot2)
df <- read.table(text = "ID Depth SamplingDate Cl Br
1 1 May 4.0 .05
2 1 June 5.0 .07
3 2 May 6.0 .03
4 2 June 7.0 .05
5 3 May 8.0 .01
6 3 June 9.0 .03
7 4 May 10.0 .00
8 4 June 11.0 .01", header = T)
ggplot(df, aes(x = Cl, y = Br, fill = Depth, shape = SamplingDate)) +
geom_point(aes(shape = SamplingDate), size = 4) +
scale_shape_manual(values = 21:24)
Created on 2020-07-30 by the reprex package (v0.3.0)
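If depth should read as four distinct colors rather than a continuous gradient, one possible variation is to convert Depth to a factor. This is only a sketch: the explicit May/June shape assignment, the legend titles and the override.aes call are assumptions added here, not part of the answer above. The override is there because fill legends are drawn with a shape that has no fill unless told otherwise.

```r
library(ggplot2)

df <- data.frame(
  Depth        = rep(1:4, each = 2),
  SamplingDate = rep(c("May", "June"), times = 4),
  Cl           = c(4, 5, 6, 7, 8, 9, 10, 11),
  Br           = c(0.05, 0.07, 0.03, 0.05, 0.01, 0.03, 0.00, 0.01)
)
df$Depth <- factor(df$Depth)  # discrete fill colors instead of a gradient

p <- ggplot(df, aes(x = Cl, y = Br, fill = Depth, shape = SamplingDate)) +
  # shapes 21-25 take `fill` for the interior and `colour` for the border
  geom_point(size = 4, colour = "black", stroke = 0.5) +
  scale_shape_manual(values = c(May = 21, June = 24)) +
  # draw the fill legend keys with a fillable shape so the colors show up
  guides(fill = guide_legend(override.aes = list(shape = 21))) +
  labs(x = "Cl", y = "Br", fill = "Depth (m)", shape = "Sampling date")
p
```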

Make scatter (or X, Y) plot by treatment for different time period

I have data (an R data frame) like this:
Treatment Diameter(inches).Sep Diameter(inches).Dec
Aux_Drop NA NA
Aux_Spray 3.7 2
DMSO NA NA
Water 4.2 2
Aux_Drop 2.6 3
Aux_Spray 3.7 3
DMSO 4 2
Water 5.2 1
Aux_Drop 5.4 2
Aux_Spray 3.4 2
DMSO 4.8 2
Water 4.2 2
Aux_Drop 4.7 2
Aux_Spray 2.7 2
DMSO 3.4 2
Water 4.9 2
.......
.......
I want to make a scatter (x, y) plot of diameter for each treatment group. So far I have found the lattice library most helpful, and I have used:
require(lattice)
xyplot(`Diameter(inches).Sep` ~ Treatment , merged.Sep.Dec.Mar, pch= 20)
to generate the plot:
However, I want to add the scatter plot of the December diameters next to the September diameters for each treatment, in a different color. So far I have not been able to find a workable example that fits my purpose.
A method using lattice, ggplot2, base plotting, or anything else would be really helpful.
Something like this?
library(tidyverse)
df %>%
gather(Month, Diameter, -Treatment) %>%
ggplot(aes(Treatment, Diameter)) +
geom_point(aes(colour = Month), position = position_dodge(width = 0.9))
You can adjust the amount of separation between the different coloured points by changing width inside position_dodge.
Sample data
df <- read.table(text =
"Treatment Diameter(inches).Sep Diameter(inches).Dec
Aux_Drop NA NA
Aux_Spray 3.7 2
DMSO NA NA
Water 4.2 2
Aux_Drop 2.6 3
Aux_Spray 3.7 3
DMSO 4 2
Water 5.2 1
Aux_Drop 5.4 2
Aux_Spray 3.4 2
DMSO 4.8 2
Water 4.2 2
Aux_Drop 4.7 2
Aux_Spray 2.7 2
DMSO 3.4 2
Water 4.9 2", header = T)
Here's a tidyverse solution. It uses tidyr::gather to put the two diameter types into one column. You can then facet on the values in that column. I hide the colour legend, since the categories are apparent from the axis labels.
Assuming the data frame is named mydata.
library(tidyverse)
mydata %>%
gather(Result, Value, -Treatment) %>%
ggplot(aes(Result, Value)) +
geom_jitter(aes(color = Result),
width = 0.1) +
facet_wrap(~Treatment) +
guides(color = "none")
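Both answers rely on tidyr::gather, which newer tidyr releases mark as superseded in favour of pivot_longer. A rough equivalent of the reshaping step is sketched below. The four-row data frame and the short column names Sep and Dec are assumptions for illustration; with the real data the diameter columns would need renaming first, or the names_to values would be the long column names.

```r
library(tidyr)
library(ggplot2)

# Hypothetical stand-in for the real data, with shortened column names
df <- data.frame(
  Treatment = c("Aux_Spray", "Water", "Aux_Drop", "DMSO"),
  Sep       = c(3.7, 4.2, 2.6, 4.0),
  Dec       = c(2, 2, 3, 2)
)

# One row per (Treatment, Month) pair instead of one column per month
long <- pivot_longer(df, cols = -Treatment,
                     names_to = "Month", values_to = "Diameter")

# Order the months chronologically rather than alphabetically
long$Month <- factor(long$Month, levels = c("Sep", "Dec"))

p <- ggplot(long, aes(Treatment, Diameter, colour = Month)) +
  geom_point(position = position_dodge(width = 0.5))
p
```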

Plotting multi repetitions in R

I have the following dataset:
Class R1 R2 R3 R4 R5
Operator 6.5 2 18 3.6 5.1
Assest 1.3 9.5 6 6.3 7.5
Operator 10 5 9 2.2 7.5
Execute 6.3 4 2.5 9 9
Execute 6 5 5 5 1.6
Assest 6 2.5 6.6 7 7.9
Operator 10 5 13 5 7.5
Assest 5 2.5 6.6 9 7.9
I would like to generate a multiplot for each class, where each individual plot represents a single run (each multiplot will have three plots based on the example).
I started by doing the following:
data <- read_csv("/home/adam/Desktop/dataa.csv")
dataset <- data %>% melt(id.vars = c("Class"))
p2_data <- dataset %>% filter(Class == "Operator")
pp2 <- p2_data %>% ggplot(aes(x=variable, y=value, group=Class, colour=Class)) +
geom_line() +
scale_x_discrete(breaks = seq(0, 1000, 100)) +
but that only gives me a plot of one class with all the runs, which is not what I want. Can you please help me solve this?
If I understand your question correctly and you would like to have separate plots for each of the three Classes with a line for each row of observations (3 for Assest, 2 for Execute and 3 for Operator), perhaps the below would help?
data %>%
group_by(Class) %>%
mutate(run=row_number()) %>%
melt(id.vars = c("Class", "run")) %>%
mutate(run=as.factor(run)) %>%
ggplot(aes(variable, value, colour=run, group=run)) +
geom_point() + geom_line() + facet_wrap(~Class)
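If the goal is literally one multi-panel figure per class with one panel per run, a possible sketch is to build a list of plots, each faceted by run. Here ave() and tidyr::pivot_longer stand in for group_by/row_number and melt; that swap, and the names long and plots, are choices made for this illustration rather than anything from the answer above.

```r
library(ggplot2)
library(tidyr)

data <- data.frame(
  Class = c("Operator", "Assest", "Operator", "Execute",
            "Execute", "Assest", "Operator", "Assest"),
  R1 = c(6.5, 1.3, 10, 6.3, 6, 6, 10, 5),
  R2 = c(2, 9.5, 5, 4, 5, 2.5, 5, 2.5),
  R3 = c(18, 6, 9, 2.5, 5, 6.6, 13, 6.6),
  R4 = c(3.6, 6.3, 2.2, 9, 5, 7, 5, 9),
  R5 = c(5.1, 7.5, 7.5, 9, 1.6, 7.9, 7.5, 7.9)
)

# Number the rows within each class: this is the "run" index
data$run <- ave(seq_along(data$Class), data$Class, FUN = seq_along)

long <- pivot_longer(data, cols = R1:R5,
                     names_to = "variable", values_to = "value")

# One figure per class, one panel per run
plots <- lapply(split(long, long$Class), function(d) {
  ggplot(d, aes(variable, value, group = 1)) +
    geom_point() + geom_line() +
    facet_wrap(~run) +
    ggtitle(d$Class[1])
})
```

plots$Operator then holds the three-panel figure for the Operator class, and printing each list element draws the corresponding figure.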

Filter data frame based off factor - R

I have the following data frame (called cats, which can be accessed using library(MASS)):
Sex Bwt Hwt
1 F 2.0 7.0
2 F 2.0 7.4
3 F 2.0 9.5
4 F 2.1 7.2
5 F 2.1 7.3
6 F 2.1 7.6
7 F 2.1 8.1
8 F 2.1 8.2
9 F 2.1 8.3
10 F 2.1 8.5
I first create a factor with 3 levels:
x = cut(cats$Bwt, breaks=3)
Now I need to grab all the data that falls in the first level and plot it in a boxplot, then do the same for the other 2 levels.
I have tried:
new_data = subset(cats, cats$Bwt %in% x[1])
also
new_data = cats[which(cats$Bwt == x[1])]
I can't seem to filter this data based on the factor. How is this done?
The simple answer is that the comparison should be made against the factor you created, not against cats$Bwt. So:
# levels(x)[1] is the lowest interval; unique(x)[1] would be whichever interval appears first in the data
new_data <- cats[x == levels(x)[1], ]
Another alternative is not to subset at all, but to use ggplot2's faceting instead, something like this:
library(MASS) # provides the cats data frame
library(dplyr)
library(ggplot2)
cats %>%
mutate(breaks = cut(Bwt, breaks=3)) %>%
ggplot() +
geom_boxplot(aes(x = Sex, y = Hwt)) +
facet_wrap(~breaks)
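For completeness, the same idea can be sketched in base R: split() returns one data frame per factor level, so each piece can be plotted on its own. The names bins, by_bin and first_bin are made up for this illustration.

```r
library(MASS)  # provides the cats data frame

bins <- cut(cats$Bwt, breaks = 3)  # factor with 3 interval levels
by_bin <- split(cats, bins)        # list of 3 data frames, one per level

# Rows that fall in the lowest body-weight interval
first_bin <- by_bin[[1]]
boxplot(first_bin$Hwt, main = names(by_bin)[1], ylab = "Hwt")

# Or all three intervals side by side in a single call
boxplot(Hwt ~ bins, data = cats, xlab = "Bwt interval", ylab = "Hwt")
```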

R slopegraph geom_line color ggplot2

I am trying to create a slopegraph with ggplot and geom_line. I want the lines of a subset of the data (e.g. those higher than 0.5) to be in red and those less than 0.5 to be another color. Here's my code:
library(ggplot2)
library(reshape2)
mydata <- read.csv("testset.csv")
mydatam = melt(mydata)
line plot:
ggplot(mydatam, aes(factor(variable), value, group = Gene, label = Gene)) +
geom_line(col='red')
In this case, all the lines are red. How do I make the lines red for those genes whose "Low" value is > 0.5 (there are 5 of them: aa, ac, ba, bc and bd) and black for the rest?
mydatam looks like this:
Gene variable value
1 aa Control 0.0
2 ab Control 0.0
3 ac Control 0.0
4 ad Control 0.0
5 ba Control 0.0
6 bb Control 0.0
7 bc Control 0.0
8 bd Control 0.0
9 aa Low 0.6
10 ab Low 0.2
11 ac Low 0.8
12 ad Low 0.1
13 ba Low 0.7
14 bb Low 0.3
15 bc Low 0.8
16 bd Low 1.2
17 aa High -0.6
18 ab High 1.6
19 ac High 2.1
20 ad High 0.7
21 ba High -1.2
22 bb High -0.7
23 bc High -0.8
24 bd High 0.6
You'll probably want to create a new variable in the data for this. Here's one way:
## Load dplyr package for data manipulation
library("dplyr")
## Genes where "Low" value is >0.5
genes <- mydatam[mydatam$variable == "Low" & mydatam$value > 0.5, "Gene"]
## Add new column
newdat <- mutate(mydatam, newval = ifelse(Gene %in% genes, ">0.5", "<=0.5"))
Now we can create the plot using newval to set the color.
## Color lines based on `newval` column
ggplot(newdat, aes(factor(variable), value, group = Gene, label = Gene)) +
geom_line(aes(color = newval)) +
scale_color_manual(values = c("<=0.5" = "#000000", ">0.5" = "#FF0000"))
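The extra column is not strictly required. As a sketch of an alternative, a logical expression can be mapped to colour directly and the manual scale keyed on "TRUE" and "FALSE". The data frame below retypes the mydatam values from the question; the legend title is an assumption made for illustration.

```r
library(ggplot2)

mydatam <- data.frame(
  Gene     = rep(c("aa", "ab", "ac", "ad", "ba", "bb", "bc", "bd"), times = 3),
  variable = factor(rep(c("Control", "Low", "High"), each = 8),
                    levels = c("Control", "Low", "High")),
  value    = c(rep(0, 8),
               0.6, 0.2, 0.8, 0.1, 0.7, 0.3, 0.8, 1.2,
               -0.6, 1.6, 2.1, 0.7, -1.2, -0.7, -0.8, 0.6)
)

# Genes whose "Low" value exceeds 0.5
genes <- as.character(mydatam$Gene[mydatam$variable == "Low" & mydatam$value > 0.5])

p <- ggplot(mydatam, aes(variable, value, group = Gene)) +
  # TRUE for every row of a flagged gene, so the whole line gets one colour
  geom_line(aes(colour = Gene %in% genes)) +
  scale_colour_manual(values = c("TRUE" = "red", "FALSE" = "black"),
                      name = "Low > 0.5")
p
```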
