I am trying to make a lexis_grid for a series of events for a synthetic cohort of people aged 0 to 80 over the period 1900-2021. What I'd like to get is something that looks a little like this:
Which I have taken from this article.
I have some dummy code created below:
library('dplyr')
library('LexisPlotR')
library('lubridate')
library('ggplot2')
# Use `=` inside data.frame() so the columns are named directly
# (with `<-` the names are mangled and stray global variables are created)
df <- data.frame(
year = sample(1900:2021, 1000, TRUE),
age = sample(0:80, 1000, TRUE),
event = sample(0:5, 1000, TRUE)
)
mylexis <- lexis_grid(year_start = 1900,
year_end = 2021,
age_start = 0,
age_end = 80,
delta = 10
)
And I can create a heatmap in ggplot:
ggplot(df, aes(x = year, y = age, fill = event)) + geom_tile()
But I have been unsuccessful at combining them. These were my best guesses:
mylexis + geom_tile(df, mapping = aes(x = year(year), y = age, fill = event))
mylexis + ggplot(df, aes(x = year, y = age, fill = event)) + geom_tile()
Any advice on where to go from here?
lexis_grid() puts dates on the x-axis, so one option would be to convert your year variable to a proper date:
library(ggplot2)
mylexis +
geom_tile(data = df, mapping = aes(x = as.Date(paste0(year, "-01-01")), y = age, fill = event))
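The conversion used above can be checked on its own before it goes into the plot. This is a minimal sketch on a small made-up data frame (`df_demo` and its values are hypothetical, not the question's data). Because geom_tile() centres each tile on its x value, a mid-year date such as 1 July may line up better with the grid cells than 1 January; that choice is an assumption to check against your own plot.

```r
# Hypothetical three-row stand-in for the question's `df`
df_demo <- data.frame(year = c(1900L, 1960L, 2021L), age = c(0, 40, 80))

# Integer year -> Date on 1 January, as in the answer above
df_demo$year_date <- as.Date(paste0(df_demo$year, "-01-01"))

# Optional variant: mid-year date, so each tile is centred within its calendar year
df_demo$year_mid <- as.Date(paste0(df_demo$year, "-07-01"))

str(df_demo)
```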
EDIT: A bit hacky, but a quick way to change the order of the layers is to manipulate the layers of the ggplot2 object directly, i.e. move the geom_tile (layer 3) to the first position so that it is drawn underneath the grid. (I have to admit that, at least for your example data, the difference is hardly visible.)
library(ggplot2)
p <- mylexis +
geom_tile(data = df, mapping = aes(x = as.Date(paste0(year, "-01-01")), y = age, fill = event))
p$layers <- p$layers[c(3, 1, 2)]
p
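The hard-coded index c(3, 1, 2) only works when the plot has exactly three layers. A slightly more general sketch is shown below; it moves whichever layer was added last to the bottom of the stack. It uses mtcars and an ordinary ggplot as a stand-in for the Lexis grid, so the plot itself is purely illustrative.

```r
library(ggplot2)

# Stand-in for `mylexis`: any plot that already has some layers
base <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)

# Add a tile layer on top, as in the answer above
p <- base + geom_tile(aes(fill = hp))

# Move the most recently added layer to the first (bottom) position
n <- length(p$layers)
p$layers <- p$layers[c(n, seq_len(n - 1))]

# Geom class of each layer, in drawing order (bottom to top)
layer_geoms <- vapply(p$layers, function(l) class(l$geom)[1], character(1))
layer_geoms
```

Editing p$layers directly relies on the internal structure of the plot object, so treat it as a workaround rather than a supported interface.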
I want to create a box plot and a line plot in a single plot using ggplot2.
This is my code so far:
library(ggplot2)
dat <- data.frame(day = c(0,0,0,0,0,0,10,10,10,10,10,10,14,14,14,14,14,14,21,21,21,21,21,21,28,28,28,28,28,28,35,35,35,35,35,35,42,42,42,42,42,42), group = c('Saline','RP','Saline','Saline','RP','RP','Saline','RP','Saline','Saline','RP','RP','Saline','RP','Saline','Saline','RP','RP','Saline','RP','Saline','Saline','RP','RP','Saline','RP','Saline','Saline','RP','RP','Saline','RP','Saline','Saline','RP','RP','Saline','RP','Saline','Saline','RP','RP'), score = c(37.5,43,7,63,26,15,17,16,43,26,53,26,26,26,43,10,6,15,18,9,10,4,8,18,60,26,20,12.5,9,43,43,43,11,10,7,60,43,43,32,10.5,8,57.5))
g1 = ggplot(data = dat, aes(x = factor(day), y = score)) +
geom_boxplot(aes(fill = group))
g1
For the box plot, I want the scores of the different treatments (groups) to be shown separately, so I set x = factor(day).
For the line plot, I want each day's score to be the average of the two treatments (groups) for that day.
This is how my plot looks now:
This is how I want my plot to look:
How can I do this? Thank you so much!
#Libraries
library(tidyverse)
#Data
dat <- data.frame(day = c(0,0,0,0,0,0,10,10,10,10,10,10,14,14,14,14,14,14,21,21,21,21,21,21,28,28,28,28,28,28,35,35,35,35,35,35,42,42,42,42,42,42), group = c('Saline','RP','Saline','Saline','RP','RP','Saline','RP','Saline','Saline','RP','RP','Saline','RP','Saline','Saline','RP','RP','Saline','RP','Saline','Saline','RP','RP','Saline','RP','Saline','Saline','RP','RP','Saline','RP','Saline','Saline','RP','RP','Saline','RP','Saline','Saline','RP','RP'), score = c(37.5,43,7,63,26,15,17,16,43,26,53,26,26,26,43,10,6,15,18,9,10,4,8,18,60,26,20,12.5,9,43,43,43,11,10,7,60,43,43,32,10.5,8,57.5))
#How to
dat %>%
ggplot(aes(x = factor(day), y = score)) +
geom_boxplot(aes(fill = group))+
geom_line(
data = dat %>%
group_by(day) %>%
summarise(score = median(score,na.rm = TRUE)),
aes(group = 1),
size = 1,
col = "red"
)
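The question asks for the average per day, whereas the code above summarises with median(). If the mean is what is wanted, a minimal variant might look like the sketch below. It uses a small made-up data set (`toy`) in the same shape as `dat`, and base R's aggregate() instead of dplyr. Note that linewidth needs ggplot2 3.4.0 or later; older versions use size instead.

```r
library(ggplot2)

# Made-up data in the same shape as the question's `dat`
toy <- data.frame(
  day   = rep(c(0, 10, 14), each = 4),
  group = rep(c("Saline", "RP"), times = 6),
  score = c(10, 20, 30, 40, 5, 15, 25, 35, 2, 4, 6, 8)
)

# One mean score per day, pooled over both groups
day_means <- aggregate(score ~ day, data = toy, FUN = mean)

p <- ggplot(toy, aes(x = factor(day), y = score)) +
  geom_boxplot(aes(fill = group)) +
  # group = 1 joins the per-day means with a single line across the discrete x axis
  geom_line(data = day_means, aes(group = 1), colour = "red", linewidth = 1)
p
```

With equal numbers of observations per group, the pooled mean equals the average of the two group means; with unbalanced groups the two differ, and you would need to average the group means explicitly.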
I want to create in R a plot which contains side by side bars and line charts as follows:
I tried:
Total <- c(584,605,664,711,759,795,863,954,1008,1061,1117,1150)
Infected <- c(366,359,388,402,427,422,462,524,570,560,578,577)
Recovered <- c(212,240,269,301,320,359,385,413,421,483,516,548)
Death <- c(6,6,7,8,12,14,16,17,17,18,23,25)
day <- itemizeDates(startDate="01.04.20", endDate="12.04.20")
df <- data.frame(Day=day, Infected=Infected, Recovered=Recovered, Death=Death, Total=Total)
value_matrix = matrix(, nrow = 2, ncol = 12)
value_matrix[1,] = df$Recovered
value_matrix[2,] = df$Death
plot(c(1:12), df$Total, ylim=c(0,1200), xlim=c(1,12), type = "b", col="peachpuff", xaxt="n", xlab = "", ylab = "")
points(c(1:12), df$Infected, type = "b", col="red")
barplot(value_matrix, beside = TRUE, col = c("green", "black"), width = 0.35, add = TRUE)
But the bar chart does not line up with the line chart. I guess it would be easier to use ggplot2, but I don't know how. Could anyone help me? Thanks a lot in advance!
With ggplot2, the margins are handled nicely for you, but you'll need the data in two separate long forms. Reshape from wide to long with tidyr::gather, tidyr::pivot_longer, reshape2::melt, reshape, or whatever you prefer.
library(tidyr)
library(ggplot2)
df <- data.frame(
Total = c(584,605,664,711,759,795,863,954,1008,1061,1117,1150),
Infected = c(366,359,388,402,427,422,462,524,570,560,578,577),
Recovered = c(212,240,269,301,320,359,385,413,421,483,516,548),
Death = c(6,6,7,8,12,14,16,17,17,18,23,25),
day = seq(as.Date("2020-04-01"), as.Date("2020-04-12"), by = 'day')
)
ggplot(
tidyr::gather(df, Population, count, Total:Infected),
aes(day, count, color = Population, fill = Population)
) +
geom_line() +
geom_point() +
geom_col(
data = tidyr::gather(df, Population, count, Recovered:Death),
position = 'dodge', show.legend = FALSE
)
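Since tidyr::gather is superseded in current tidyr releases, here is a sketch of the same plot using tidyr::pivot_longer. The data are the same as above; the intermediate names `lines_long` and `bars_long` are introduced only for this illustration.

```r
library(tidyr)
library(ggplot2)

df <- data.frame(
  Total     = c(584, 605, 664, 711, 759, 795, 863, 954, 1008, 1061, 1117, 1150),
  Infected  = c(366, 359, 388, 402, 427, 422, 462, 524, 570, 560, 578, 577),
  Recovered = c(212, 240, 269, 301, 320, 359, 385, 413, 421, 483, 516, 548),
  Death     = c(6, 6, 7, 8, 12, 14, 16, 17, 17, 18, 23, 25),
  day       = seq(as.Date("2020-04-01"), as.Date("2020-04-12"), by = "day")
)

# Long form for the two line series
lines_long <- pivot_longer(df, cols = c(Total, Infected),
                           names_to = "Population", values_to = "count")

# Long form for the two bar series
bars_long <- pivot_longer(df, cols = c(Recovered, Death),
                          names_to = "Population", values_to = "count")

p <- ggplot(lines_long, aes(day, count, color = Population, fill = Population)) +
  geom_line() +
  geom_point() +
  geom_col(data = bars_long, position = "dodge", show.legend = FALSE)
p
```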
Another way to do it is to gather twice before plotting. Not sure if this is easier or harder to understand, but you get the same thing.
df %>%
tidyr::gather(Population, count, Total:Infected) %>%
tidyr::gather(Resolution, count2, Recovered:Death) %>%
ggplot(aes(x = day, y = count, color = Population)) +
geom_line() +
geom_point() +
geom_col(
aes(y = count2, color = Resolution, fill = Resolution),
position = 'dodge', show.legend = FALSE
)
You can actually plot the lines and points without reshaping by making separate calls for each, but to dodge bars (or get legends), you'll definitely need to reshape.
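As a rough sketch of the no-reshape approach for the lines and points only, each series gets its own geom call with a fixed colour. The colours are taken from the question's base R attempt; no legend is produced because colour is set rather than mapped.

```r
library(ggplot2)

df <- data.frame(
  Total    = c(584, 605, 664, 711, 759, 795, 863, 954, 1008, 1061, 1117, 1150),
  Infected = c(366, 359, 388, 402, 427, 422, 462, 524, 570, 560, 578, 577),
  day      = seq(as.Date("2020-04-01"), as.Date("2020-04-12"), by = "day")
)

# One geom_line/geom_point pair per column of the wide data frame
p <- ggplot(df, aes(x = day)) +
  geom_line(aes(y = Total), colour = "peachpuff") +
  geom_point(aes(y = Total), colour = "peachpuff") +
  geom_line(aes(y = Infected), colour = "red") +
  geom_point(aes(y = Infected), colour = "red") +
  labs(y = "count")
p
```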
I need to plot hourly data for different days using ggplot, and here is my dataset:
The data consist of hourly observations, and I want to plot each day's observations as a separate line.
Here is my code:
# One 24-row slice per day (non-overlapping row ranges)
xbj1 = bj[c(1:24),c(1,6)]
xbj2 = bj[c(25:48),c(1,6)]
xbj3 = bj[c(49:72),c(1,6)]
ggplot()+
geom_line(data = xbj1,aes(x = Date, y= Value), colour="blue") +
geom_line(data = xbj2,aes(x = Date, y= Value), colour = "grey") +
geom_line(data = xbj3,aes(x = Date, y= Value), colour = "green") +
xlab('Hour') +
ylab('PM2.5')
Please advise on this.
I'll make some fake data (I won't try to transcribe yours) first:
set.seed(2)
x <- data.frame(
Date = rep(Sys.Date() + 0:1, each = 24),
# Year, Month, Day ... are not used here
Hour = rep(0:23, times = 2),
Value = sample(1e2, size = 48, replace = TRUE)
)
This is a straightforward ggplot2 plot, either with one colored line per date or with one facet per date:
library(ggplot2)
ggplot(x) +
geom_line(aes(Hour, Value, color = as.factor(Date))) +
scale_color_discrete(name = "Date")
ggplot(x) +
geom_line(aes(Hour, Value)) +
facet_grid(Date ~ .)
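If the real data only carry a single timestamp column rather than separate date and hour columns, the grouping variables can be derived first. The sketch below assumes a POSIXct column named Date and a numeric column named Value; the actual layout of `bj` is not shown in the question, so `bj_demo` and its values are made up.

```r
library(ggplot2)

# Hypothetical stand-in for `bj`: 72 hourly timestamps with a dummy value
stamps <- seq(as.POSIXct("2020-01-01 00:00", tz = "UTC"),
              by = "hour", length.out = 72)
bj_demo <- data.frame(Date = stamps, Value = seq_along(stamps))

# Derive the calendar day and the hour of day from the timestamp
bj_demo$Day  <- as.Date(bj_demo$Date, tz = "UTC")
bj_demo$Hour <- as.integer(format(bj_demo$Date, "%H", tz = "UTC"))

# One line per day, hour of day on the x axis
p <- ggplot(bj_demo, aes(Hour, Value, colour = factor(Day))) +
  geom_line() +
  labs(x = "Hour", y = "PM2.5", colour = "Date")
p
```

This replaces slicing the rows by hand: the colour mapping splits the data into one line per day, however many days there are.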
I highly recommend you find good tutorials for ggplot2, such as http://www.cookbook-r.com/Graphs/. Others exist, many quite good.