I'm trying to "capture" some points within a bar.
The points represent 36 values on a monthly basis for 3 years.
The bars represent 3 values on a yearly basis for the same 3 years.
If you run the code you can see that some points of the first year are captured by the bar of the second year, and that the points of the third year "run out" of the last bar.
How can I align the bars and the points?
library(ggplot2)
library(scales) # provides date_format() and date_breaks()
set.seed(1)
df.year <- data.frame(yeardate = seq(as.Date("2010-01-01"), by = "year", length.out = 3), datevalue = abs(rnorm(3)))
df.month <- data.frame(monthdate = seq(as.Date("2010-01-01"), by = "month", length.out = 36), datevalue = abs(rnorm(36)))
df.month$inyear <- format(df.month$monthdate, "%Y")
df.month
p <- ggplot()
p <- p + geom_point(
data = df.month
,aes(x = monthdate, y = datevalue, color=inyear)
)
p <- p + geom_bar(
data = df.year
,aes(x = yeardate, y = datevalue)
,alpha=0.7
,stat = "identity"
)
p + scale_x_date(labels = date_format("%Y"), breaks = date_breaks("years"))
geom_bar is centering the bars on the dates given. Since the given dates are the first of the year, it is centered around the first of the year, and so much of 2012 lies outside the bar centered on 2012-01-01 (and much of that bar lies in 2011). So either center the bars in the middle of the year:
df.year <- data.frame(yeardate = seq(as.Date("2010-07-01"),
by = "year",
length.out = 3),
datevalue = abs(rnorm(3)))
or draw rectangles with the exact extent that you want them to be:
df.year <- data.frame(yearstart = seq(as.Date("2010-01-01"),
by = "year", length.out = 3),
yearend = seq(as.Date("2010-12-31"),
by = "year", length.out = 3),
datevalue = abs(rnorm(3)))
and replace the geom_bar call with
p <- p + geom_rect(
data = df.year
,aes(xmin = yearstart, xmax = yearend,
ymin = 0, ymax = datevalue)
,alpha=0.7
)
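As a rough check on the rectangle approach, the year boundaries can be derived from the yearly dates instead of being typed in, and every monthly date can be tested against its own year's rectangle. This is only a sketch in base R; the names boundaries, bounds and idx are made up for illustration and are not part of ggplot2.

```r
# Same dates as in the question
yearstart <- seq(as.Date("2010-01-01"), by = "year", length.out = 3)
monthdate <- seq(as.Date("2010-01-01"), by = "month", length.out = 36)

# One extra step of the yearly sequence gives the start of the following year
boundaries <- seq(yearstart[1], by = "year", length.out = length(yearstart) + 1)

# Each rectangle runs from 1 January to 31 December of its year
bounds <- data.frame(
  yearstart = boundaries[-length(boundaries)],
  yearend   = boundaries[-1] - 1
)

# For every monthly date, find the index of the rectangle it should fall in
idx <- findInterval(as.numeric(monthdate), as.numeric(boundaries))

# TRUE for every point that lies inside its own year's rectangle
inside <- monthdate >= bounds$yearstart[idx] & monthdate <= bounds$yearend[idx]
```

The bounds data frame has the same yearstart and yearend columns as the df.year used with geom_rect, so a datevalue column can be added to it and it can be passed to the same call.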
I am attempting to plot a patient's blood test results as a time series. I have managed to do this and have included a reference range shaded between two y-intercepts. My problem is that the annotate() or geom_segment() calls want me to specify the arrow co-ordinates in the units of my independent variable, which is, unhelpfully, a date (YYYY-MM-DD).
Is it possible to get R to ignore the units of the x- and y-axis and specify the arrow co-ordinates as if they were on a grid?
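library(ggplot2)
library(scales) # date_breaks(), date_format()
library(lubridate) # ymd()
library(data.table)
library(ggprism) # theme_prism()
result <- runif(25, min = 2.0, max = 3.5)
start_date <- ymd("2021-08-16")
end_date <- ymd("2022-10-29")
date <- sample(seq(start_date, end_date, by = "days"), 25, replace = TRUE)
q <- data.table(result, date)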
ggplot(q, aes(x = date, y = result)) +
geom_line() +
geom_point(aes(x = date, y = result), shape = 21, size = 3) +
scale_x_date(limits = c(min(q$date), max(q$date)),
breaks = date_breaks("1 month"),
labels = date_format("%b %Y")) +
ylab("Corrected calcium (mmol/L)")+
xlab("Date of blood test") +
ylim(1,4)+
geom_ribbon(aes(ymin=2.1, ymax=2.6), fill="grey", alpha=0.2, colour="grey")+
geom_vline(xintercept=as.numeric(q$date[c(3, 2)]),
linetype=4, colour="black") +
theme(axis.text.x = element_text(angle = 45)) + theme_prism(base_size = 10) +
annotate("segment", x = 1, y = 2, xend = 3, yend = 4, arrow = arrow(length = unit(0.15, "cm")))
The error produced is Error: Invalid input: date_trans works with objects of class Date only.
I can confirm that:
> class(q$date)
[1] "Date"
I've just gone with test co-ordinates (1,2,3,4) for the annotate("segment"...). Ideally I want the arrow to point to a specific data point on the plot to indicate when the patient went on treatment.
Many thanks,
Sandro
You don't need to convert to points or coordinates. Just use the actual values from your data frame. I am just subsetting within annotate using a hard coded index (you can also automate this of course), but you will need to "remind" R that you are dealing with dates - thus the added lubridate::as_date call.
library(ggplot2)
library(lubridate)
result <- runif(25, min = 2.0, max = 3.5)
start_date <- ymd("2021-08-16")
end_date <- ymd("2022-10-29")
date <- sample(seq(start_date, end_date, by = "days"), 25, replace = TRUE)
q <- data.frame(result, date)
## I am arranging the data frame by date
q <- dplyr::arrange(q, date)
ggplot(q, aes(x = date, y = result)) +
geom_line() +
## for start use a random x and y so it starts wherever you want it to start
## for end, use the same row from your data frame, in this case row 20
annotate(geom = "segment",
x = as_date(q$date[2]), xend = as_date(q$date[20]),
y = min(q$result), yend = q$result[20],
arrow = arrow(),
size = 2, color = "red")
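The hard-coded row index can be replaced by a lookup. Here is a minimal sketch, assuming a made-up treatment_start date (it is not in the question) and evenly spaced test dates so the result is reproducible. It picks the first test on or after that date and keeps the arrow endpoints as Date values, which is what the date scale expects in place of plain numbers such as x = 1.

```r
# Toy data: 25 tests, one every 18 days (an assumption for reproducibility)
set.seed(1)
q <- data.frame(
  date   = seq(as.Date("2021-08-16"), by = "18 days", length.out = 25),
  result = runif(25, min = 2.0, max = 3.5)
)

# Assumption: the day the patient went on treatment (hypothetical value)
treatment_start <- as.Date("2022-03-01")

# Row of the first test taken on or after the treatment date
target <- which(q$date >= treatment_start)[1]

# Arrow endpoints; x and xend stay of class Date
arrow_df <- data.frame(
  x    = q$date[target] - 30,  # start 30 days to the left of the point
  xend = q$date[target],
  y    = min(q$result),
  yend = q$result[target]
)
```

The columns can then be handed to annotate(geom = "segment", x = arrow_df$x, xend = arrow_df$xend, y = arrow_df$y, yend = arrow_df$yend, arrow = arrow()). Date arithmetic such as q$date[target] - 30 keeps the Date class, so no numeric conversion is needed.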
R noob here. I have been stumped on this graph all day, and solutions like this and this seem to hold my answer, but I cannot get them to work for me.
I have a data frame that is a large version of the below sample which I am trying to plot using ggplot.
# create data
df <- data.frame(
"ID" = rep(1:5, each = 4),
"Date" = c(seq(as.Date("2019/09/18"), by = "day", length.out = 4),
seq(as.Date("2019/09/18"), by = "day", length.out = 4),
seq(as.Date("2020/08/07"), by = "day", length.out = 4),
seq(as.Date("2020/09/12"), by = "day", length.out = 4),
seq(as.Date("2020/09/29"), by = "day", length.out = 4)),
"MaxDepth" = round(runif(20, min = 10, max = 50), 1),
"Trip" = rep(1:5, each = 4)
)
# plot using ggplot
ggplot(df, aes(Date, MaxDepth, col = factor(Trip))) +
geom_line() +
facet_grid(ID ~ format(Date, "%Y"), scales = "free_x") +
scale_y_reverse() +
scale_x_date(date_labels = "%b") +
labs(title = "Daily maximum depth\n",
x = "",
y = "Depth [m]\n",
col = "Fishing trip")
This turns out nicely as a two column, eleven row faceted graph with the fishing trips as colours.
However, it includes a lot of empty panels which I would like to avoid by creating a one column graph still with all eleven ID rows but that are separated by the same split label the two columns had. I.e. I would like the two individuals that were in the LHS 2019 plot to have that 2019 label on top, separated by the 2020 label from the other 9 individuals.
Hope this is clear. Please correct me or let me know what to improve for a better question.
Grateful for any help! Even if those are suggestions that this is not a good way of representation or something like this is simply not possible. Thank you all!
Here is a possible way. I am not sure whether it works for your real data.
library(ggplot2)
library(patchwork)
library(dplyr)
plot_fun <- function(dtt){
ggplot(dtt, aes(Date, MaxDepth, col = factor(Trip))) +
geom_line() +
facet_grid(ID ~ format(Date, "%Y"), scales = "free_x") +
scale_y_reverse() +
scale_x_date(date_labels = "%b") +
labs(x = NULL, y = NULL, col = "Fishing trip")
}
p1 <- plot_fun(df %>% filter(format(Date, '%Y') == '2019'))
p2 <- plot_fun(df %>% filter(format(Date, '%Y') == '2020'))
p1 / p2
ggsave('~/Downloads/test.png', width = 6, height = 6)
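If there are more than two years, the same idea can be written without naming each year. The sketch below splits the data by year in base R and, as an assumption about what looks best, gives each year's block a height proportional to its number of IDs. The plotting part only runs if ggplot2 and patchwork are installed; it relies on patchwork::wrap_plots() with its ncol and heights arguments.

```r
set.seed(1)
# Same sample data as in the question
df <- data.frame(
  ID = rep(1:5, each = 4),
  Date = c(seq(as.Date("2019-09-18"), by = "day", length.out = 4),
           seq(as.Date("2019-09-18"), by = "day", length.out = 4),
           seq(as.Date("2020-08-07"), by = "day", length.out = 4),
           seq(as.Date("2020-09-12"), by = "day", length.out = 4),
           seq(as.Date("2020-09-29"), by = "day", length.out = 4)),
  MaxDepth = round(runif(20, min = 10, max = 50), 1),
  Trip = rep(1:5, each = 4)
)

# One data frame per year, named by the year
parts <- split(df, format(df$Date, "%Y"))

# Number of individuals per year, used as relative panel heights
heights <- vapply(parts, function(d) length(unique(d$ID)), integer(1))

if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("patchwork", quietly = TRUE)) {
  library(ggplot2)
  plots <- lapply(parts, function(d) {
    ggplot(d, aes(Date, MaxDepth, col = factor(Trip))) +
      geom_line() +
      facet_grid(ID ~ format(Date, "%Y"), scales = "free_x") +
      scale_y_reverse() +
      scale_x_date(date_labels = "%b")
  })
  # Stack the yearly plots in a single column
  combined <- patchwork::wrap_plots(plots, ncol = 1, heights = unname(heights))
}
```

With the sample data this gives two IDs under the 2019 strip and three under the 2020 strip, so the panels end up roughly the same size.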
Is there any way to shift the dates of a seasonal graph so that they match an arbitrary fiscal year (for example MARCH/FEB instead of DEC/JAN)?
I have this so far:
library(ggplot2)
library(lubridate) # year(), month(), yday(), today(), days()
startMonth = 3 # march
startDate = as.Date(paste0(year(today()), '-', startMonth, '-1'))
dts = seq.Date(from = today() - 500, to = today(), by = 'day')
dat = data.frame(date = dts, value = runif(n = length(dts), min = 1, max = 10))
dat$month = month(dat$date)
dat$year = year(dat$date)
dat$yearPlot = ifelse(test = dat$month < startMonth, yes = (dat$year - 1), no = dat$year)
dat$year = as.character(dat$year)
dat$ydaydiff = yday(dat$date) - yday(startDate)
dat$datePlot1 = ifelse(dat$ydaydiff < 0, dat$ydaydiff + 365, dat$ydaydiff)
dat$datePlot1 = as.Date('0001-01-01') + days(dat$datePlot1)
dat$yearPlot = as.character(dat$yearPlot)
ggplot(dat) +
geom_path(aes(x = datePlot1, y = value, color = yearPlot)) +
scale_x_date(date_labels = '%b')
Which makes this plot:
However I'd like the x-axis to start at March instead of Jan. Is there any way to adjust this? I thought of using the month column in dat but not sure how to implement.
Here is a not-too-pretty solution. The logic somewhat follows your own: find the starting date/time for each fiscal year (March 1 = time 1) and the last date/time (Feb 28 = time 365). Use this separate 'time' variable as your x-axis, then re-label the tick marks. You can change the scale_x_continuous() breaks and labels to get your desired dates along the x-axis.
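library(dplyr)
# lookup table: one row per day, tagged with its fiscal year (adjust the dates to cover your data)
t <- data.frame(date=seq.Date(as.Date('2018-03-01'),as.Date('2020-02-28'),by='days'),
fy=1)
t$fy[t$date>=as.Date('2019-03-01')] <- 2
# day counter within each fiscal year (1 = March 1)
t <- t %>% group_by(fy) %>% mutate(time=row_number()) %>% ungroup()
dat <- left_join(dat,t,by='date')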
dat %>% ggplot(.) +
geom_path(aes(x = time, y = value, color = factor(fy),group=fy)) +
scale_x_continuous(breaks = c(1,100,200,300),labels=c('March 1','June 8','Sept 16','Dec 25'))
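The lookup table above is tied to fixed dates. As a sketch of the same logic for any date range, the fiscal year and the day within it can be computed directly from the date. fiscal_position is a hypothetical helper name and start_month = 3 is just the March example from the question.

```r
# Hypothetical helper: fiscal year and day-within-fiscal-year for each date
fiscal_position <- function(dates, start_month = 3) {
  start_month <- as.integer(start_month)
  yr  <- as.integer(format(dates, "%Y"))
  mon <- as.integer(format(dates, "%m"))
  # Dates before the start month belong to the fiscal year that began a year earlier
  fy <- ifelse(mon < start_month, yr - 1L, yr)
  # First day of that fiscal year
  fy_start <- as.Date(sprintf("%d-%02d-01", fy, start_month))
  data.frame(
    date = dates,
    fy   = fy,
    time = as.integer(dates - fy_start) + 1L  # 1 on the first day of the fiscal year
  )
}

fp <- fiscal_position(as.Date(c("2019-02-28", "2019-03-01", "2020-02-29")))
```

The time column can be used as the x aesthetic and factor(fy) as the colour, in the same way as in the plot above.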
The breaks argument to scale_x_date() accepts scales::breaks_width(), whose offset parameter shifts the yearly breaks away from 1 January, for example to the start of a fiscal year.
The labels argument accepts a function to format the labels as a fiscal year, e.g. to convert a date 2019-03-01 to "19/20".
# Function to create fiscal year labels like "14/15" for the 2014/15 fiscal year
fiscal_year <- function(x) {
year_number <- lubridate::year(x)
paste(substr(year_number, 3, 4),
substr(year_number + 1, 3, 4),
sep = "/")
}
ggplot(dat) +
geom_path(aes(x = date, y = value, color = yearPlot)) +
scale_x_date(labels = fiscal_year, # Use the function to create the labels
breaks = scales::breaks_width("1 year", offset = 90)) # Offset in days from 1 January; 90 lands near 1 April, so use 59 for 1 March (non-leap year)
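One thing to watch with a label function like this: it labels by calendar year, so a break that falls in January or February would get the label of the wrong fiscal year. A possible variant that takes the start month into account is sketched below. fiscal_label and its start_month parameter are hypothetical names, not part of scales or lubridate.

```r
# Hypothetical label function: "19/20" for any date in the fiscal year
# that starts in start_month of 2019
fiscal_label <- function(x, start_month = 3) {
  yr  <- as.integer(format(x, "%Y"))
  mon <- as.integer(format(x, "%m"))
  # Months before the start month still belong to the previous fiscal year
  fy <- ifelse(mon < start_month, yr - 1L, yr)
  sprintf("%02d/%02d", fy %% 100L, (fy + 1L) %% 100L)
}

labs_out <- fiscal_label(as.Date(c("2019-03-01", "2019-02-28", "2099-06-01")))
```

It can be passed as labels = fiscal_label in scale_x_date(), or wrapped as function(x) fiscal_label(x, start_month = 4) for a fiscal year that starts in April.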
I have a matrix with three variables: the rows are Time, the columns are Date, and the third variable, Money, is the value at the intersection of a row and a column. E.g. for Time = 5 and Date = 10, Money is 12, and for Time = 6 and Date = 15, Money is 15. I would like to draw the value of Money at the intersection of x-axis = Time and y-axis = Date.
How can I place Money in the plot below?
plot.new()
matplot(Time,Date, type = "p", lty = 1:5, lwd = 1, lend = par("lend"),col = 1,
pch = 17 , xlab = "Time", ylab = "Date", xlim = range(0,40), ylim = range (0,120))
I think you could use geom_raster if you convert your data to a data.frame first:
ggplot(data, aes(Time, Date)) +
geom_raster(aes(fill = Money))
See more on this here: http://docs.ggplot2.org/current/geom_tile.html
edit:
see with random data here:
time <- c(1:100)
date <- c(1:100)
data <- expand.grid(TIME = time, DATE = date)
data$MONEY <- runif(nrow(data), 0, 10)
ggplot(data, aes(TIME, DATE)) +
geom_raster(aes(fill = MONEY), interpolate = FALSE)
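Since the question starts from a matrix, here is a small sketch of the conversion to the long data frame that geom_raster needs. The toy matrix and its values are assumptions, apart from the two cells mentioned in the question (Time 5 / Date 10 is 12, Time 6 / Date 15 is 15).

```r
# Assumed toy matrix: rows are Time values, columns are Date values
money <- matrix(
  c(12, 7, 3, 15, 9, 4),
  nrow = 2,
  dimnames = list(Time = c("5", "6"), Date = c("10", "15", "20"))
)

# One row per cell: the dimnames become the Time and Date columns
long <- as.data.frame(as.table(money),
                      responseName = "Money",
                      stringsAsFactors = FALSE)

# The dimnames come back as character, so convert them to numbers
long$Time <- as.numeric(long$Time)
long$Date <- as.numeric(long$Date)
```

long can then be plotted with ggplot(long, aes(Time, Date)) + geom_raster(aes(fill = Money)). geom_raster assumes evenly spaced cells; geom_tile is the more general choice if the Time or Date values are not evenly spaced.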
I am trying to shade winter months using geom_rect() in ggplot2 and have a number of different time series data with varying start and end points. For each time series I want the period from November - March of every year to be shaded (the remaining months should receive no coloring).
For example, with the following time series
library(ggplot2)
library(lubridate)
now <- Sys.time()
set.seed(123)
datOne <- data.frame(IndID = "One",
DateTime = seq(from = now - dyears(0.2), length.out = 50, by = "months"),
Value = rnorm(50))
I can define a new data frame object with the starting and ending point of each shading section (i.e. Nov - Mar for every year)
temp <- data.frame(
start = as.Date(c('2016-11-1', '2017-11-01', '2018-11-01', '2019-11-01')),
end = as.Date(c('2017-03-01', '2018-03-01', '2019-03-01', '2020-03-01')))
dateRanges <- data.frame(
start = as.POSIXct(temp [,1], "%Y-%m-%d")+ hours(6),
end = as.POSIXct(temp [,2], "%Y-%m-%d")+ hours(6))
And then make a plot.
ggplot(datOne) +
geom_rect(data = dateRanges, aes(xmin = start , xmax = end, ymin = -Inf, ymax = Inf),
inherit.aes=FALSE, alpha = 0.4, fill = c("lightblue"))+
geom_line(aes(x= DateTime, y = Value), size = 1.5)
While this works fine for a single time series, I am making a separate figure for each of many individual time series, each of which has different start and end points and requires a unique data frame from which to create shading regions.
For example, as pictured below the same dateRanges data frame applied to a different time series obviously does not work.
datTwo <- data.frame(IndID = "Two",
DateTime = seq(from = now , length.out = 100, by = "months"),
Value = rnorm(100))
ggplot(datTwo) +
geom_rect(data = dateRanges, aes(xmin = start , xmax = end, ymin = -Inf, ymax = Inf),
inherit.aes=FALSE, alpha = 0.4, fill = c("lightblue"))+
geom_line(aes(x= DateTime, y = Value), size = 1.5)
Is there a different way to shade the two time series so that for each I can either automate the creation of dateRanges or use a different method altogether (e.g. annotate...)? Said differently, given the need to make roughly 100 different figures with 100 different time series over different periods, what is the best way to shade the period from Nov - Mar for any given year?
Thanks in advance.
I don't know if this is "the best" method, but it's an easy one: Expand your dateRanges frame and reset the limits for your axis:
# let's make it past & future proof for the next few years:
dateRanges <- data.frame(
start = seq(as.POSIXct("1900-11-01 07:00:00"), as.POSIXct("2100-11-01 07:00:00"), "1 year"),
end = seq(as.POSIXct("1901-03-01 07:00:00"), as.POSIXct("2101-03-01 07:00:00"), "1 year")
)
ggplot(datTwo) +
geom_rect(data = dateRanges, aes(xmin = start , xmax = end, ymin = -Inf, ymax = Inf),
inherit.aes=FALSE, alpha = 0.4, fill = c("lightblue"))+
geom_line(aes(x= DateTime, y = Value), size = 1.5) +
coord_cartesian(xlim = range(datTwo$DateTime))
Try this:
# make it a function
get.date.ranges <- function(data) {
  range.df <- NULL
  first <- min(data$DateTime)
  last <- max(data$DateTime)
  # start one year early so a winter already under way at the start is included
  for (yr in seq(year(first) - 1, year(last), 1)) {
    cur <- data.frame(
      start = as.POSIXct(as.Date(paste(yr, '-11-01', sep=''))) + hours(6),
      end = as.POSIXct(as.Date(paste(yr + 1, '-03-01', sep=''))) + hours(6)
    )
    # keep only the winters that overlap the data
    if (cur$start <= last && cur$end >= first) {
      range.df <- rbind(range.df, cur)
    }
  }
  if (!is.null(range.df)) {
    # clip the first and last ranges to the extent of the data
    n <- nrow(range.df)
    range.df$start[1] <- max(range.df$start[1], first)
    range.df$end[n] <- min(range.df$end[n], last)
  }
  return(range.df)
}
# plot function
plot.data <- function(data) {
ggplot(data) +
geom_rect(data = get.date.ranges(data), aes(xmin = start , xmax = end, ymin = -Inf, ymax = Inf),
inherit.aes=FALSE, alpha = 0.4, fill = c("lightblue"))+
geom_line(aes(x= DateTime, y = Value), size = 1.5)
}
plot.data(datOne)
plot.data(datTwo)
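For what it's worth, the same ranges can be built without a loop. This is a sketch in base R only, with winter_ranges as a made-up name and UTC as an assumed time zone. It starts one year before the first observation so that a winter already under way at the start of the series is not missed, and it clips the ranges to the extent of the data.

```r
# Hypothetical helper: Nov-Mar shading ranges that overlap a POSIXct vector
winter_ranges <- function(x, tz = "UTC") {
  first <- min(x)
  last  <- max(x)
  # Candidate winters start in November of each of these years
  years <- seq(as.integer(format(first, "%Y")) - 1L,
               as.integer(format(last, "%Y")))
  ranges <- data.frame(
    start = as.POSIXct(paste0(years, "-11-01"), tz = tz),
    end   = as.POSIXct(paste0(years + 1L, "-03-01"), tz = tz)
  )
  # Drop winters that do not overlap the data at all
  ranges <- ranges[ranges$start <= last & ranges$end >= first, , drop = FALSE]
  # Clip what is left to the first and last observation
  ranges$start <- pmax(ranges$start, first)
  ranges$end   <- pmin(ranges$end, last)
  rownames(ranges) <- NULL
  ranges
}

x <- as.POSIXct(c("2017-01-15", "2018-12-15"), tz = "UTC")
r <- winter_ranges(x)
```

The result has the same start and end columns as dateRanges, so winter_ranges(datTwo$DateTime) could be passed as the data argument of geom_rect in the plots above.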