Excel days since 1899-12-30 from POSIX in R - r

I have a POSIXct date-time in R and need to supply it in a data frame as an Excel serial date, i.e. the number of days since 1899-12-30 with the time of day as a fraction of 24 hours.
I can only find examples of the conversion in the other direction.
I have had a go below, but it is a bit long-winded and I am sure there must be an existing function in lubridate or openxlsx?
library(lubridate)
start_date <- "1899-12-31"
end_date <- "2022-11-15 07:00:00"
d8_df <- as.character(seq(
from=as.POSIXct(start_date, tz="UTC"),
to=as.POSIXct(end_date, tz="UTC"),
by="day"
))
day_decimal <- NROW(d8_df)+hour(ymd_hms(end_date))/24

If you only need whole days, it's seq(0, 44883). In Excel, serial number 1 is 1900-01-01 (0 is displayed as 1900-01-00) and 44883 is 2022-11-18. For modern dates the serial number is simply the number of days since 1899-12-30.
library(openxlsx2)
dat <- data.frame(days_since_31Dec1899 = c(0, 44883))
wb <- wb_workbook() %>%
wb_add_worksheet() %>% wb_add_data(x = dat) %>%
wb_add_numfmt(dims = "A2:A3", numfmt = 22) %>%
wb_open()
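For the conversion the question asks about (POSIXct to an Excel serial number, including the time of day), a minimal base R sketch is to take the difference in days from the 1899-12-30 origin. The helper name posix_to_excel is made up for illustration, and the sketch assumes the timestamps are meant to be read in UTC and that the workbook uses the 1900 date system.

```r
# Hypothetical helper (not part of lubridate or openxlsx):
# convert POSIXct to an Excel serial number.
# Assumes UTC timestamps and Excel's 1900 date system (origin 1899-12-30).
posix_to_excel <- function(x) {
  origin <- as.POSIXct("1899-12-30 00:00:00", tz = "UTC")
  as.numeric(difftime(x, origin, units = "days"))
}

end_date <- as.POSIXct("2022-11-15 07:00:00", tz = "UTC")
day_decimal <- posix_to_excel(end_date)
```

The integer part of day_decimal is the day count and the fractional part is the time of day as a fraction of 24 hours, so this should agree with the long-winded version in the question.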

Related

add_months function in Spark R

I have a variable of the form "2020-09-01". I need to shift it forward and backward by 3 months and by 5 months and store the results in other variables. I need the syntax for SparkR, but any other method will also work. Thanks.
In plain R the following code works fine:
y <- as.Date(load_date,"%Y-%m-%d") %m+% months(i)
The code below didn't work. The error says:
unable to find an inherited method for function ‘add_months’ for signature ‘"Date", "numeric"’
loaddate = 202009
year <- substr(loaddate,1,4)
month <- substr(loaddate,5,6)
load_date <- paste(year,month,"01",sep = "-")
y <- as.Date(load_date,"%Y%m%d")
y1 <- add_months(y,-3)
Expected Result - 2020-06-01
The lubridate package makes dealing with dates much easier. Here I have shuffled as.Date up a step, then simply subtract 3 months.
library(lubridate)
loaddate = 202009
year <- substr(loaddate,1,4)
month <- substr(loaddate,5,6)
load_date <- as.Date(paste(year,month,"01",sep = "-"))
new_date <- load_date - months(3)
new_date
Output:
Date[1:1], format: "2020-06-01"
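Since the question says any other method will do, the same shifts can also be sketched in base R without lubridate, using seq() with a month step. The helper name shift_months is hypothetical, and the sketch assumes first-of-month dates like the one in the question (month-end dates would need extra care).

```r
# Hypothetical helper: shift a Date by n whole months using base R's seq().
# Intended for first-of-month dates such as "2020-09-01".
shift_months <- function(date, n) {
  seq(date, by = paste(n, "months"), length.out = 2)[2]
}

load_date <- as.Date("2020-09-01")
minus_3 <- shift_months(load_date, -3)
plus_3  <- shift_months(load_date, 3)
minus_5 <- shift_months(load_date, -5)
plus_5  <- shift_months(load_date, 5)
```

This runs on ordinary R Date objects only; it does not address add_months on a Spark column.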

How to generate specific periods during several days in R

I want to generate the same period on each of several days, e.g. from 09:30:00 to 16:00:00 every day. I know that
dates<- seq(as.POSIXct("2000-01-01 9:00",tz='UTC'), as.POSIXct("2000-04-9 16:00",tz='UTC'), by=300)
gives me a time series observed every 5 minutes around the clock for 100 days. But what I want is only 09:30:00 to 16:00:00 on each of the 100 days.
Thanks in advance
Here is one way. We can create a date sequence with one entry per day, and then build a list holding the five-minute sequence for each day. Finally, we can combine this list into a single vector. final_seq is the final output.
date_seq <- seq(as.Date("2000-01-01"), as.Date("2000-04-09"), by = 1)
hour_seq <- lapply(date_seq, function(x){
temp_date <- as.character(x)
temp_seq <- seq(as.POSIXct(paste(temp_date, "09:30"), tz = "UTC"),
as.POSIXct(paste(temp_date, "16:00"), tz = "UTC"),
by = 300)
})
final_seq <- do.call("c", hour_seq)
An option using tidyr::crossing() (which I love) and the lubridate package:
library(tidyr)
library(dplyr)
library(lubridate)
# 0:99 so that the 100 days start on 2000-01-01 itself
crossing(c1 = paste(dmy("01/01/2000") + 0:99, "09:30"),
         c2 = seq(0, 390, 5)) %>%
  mutate(time_series = ymd_hm(c1) + minutes(c2)) %>%
  pull(time_series)
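Another possible approach, sketched here in base R, is to keep the full five-minute sequence from the question and filter it by time of day. The variable names are illustrative, and the window is expressed as minutes since midnight (570 for 09:30, 960 for 16:00).

```r
# Full five-minute grid over the 100 days, in UTC
all_times <- seq(as.POSIXct("2000-01-01 00:00", tz = "UTC"),
                 as.POSIXct("2000-04-09 23:55", tz = "UTC"),
                 by = 300)

# Minutes since midnight for every timestamp
mins <- as.integer(format(all_times, "%H")) * 60L +
  as.integer(format(all_times, "%M"))

# Keep only 09:30 (570 min) through 16:00 (960 min), inclusive
in_window <- all_times[mins >= 570 & mins <= 960]
```

This builds more timestamps than it keeps, so the per-day approaches above are leaner, but the filter is easy to adapt to a different window.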

How do I customize the start and finish hour of a 24-hour line plot? (for example, start at 7:30)

I have 24 hours of data starting at 7:30 today (for example) and running until 7:30 the next day. Because I didn't link the date to the line plot, R sorts the hours starting from 00:00 even though the data starts at 7:30. I am a beginner in R and don't know where to begin. Should I also link the date to the x axis, or is there a better solution?
My times() call somehow didn't work either; it used to work when I was plotting a 15-minute data set.
library(chron)
d <- read.csv(file="data.csv", header = T)
t <- times(d$Time)
plot(t,d$MCO2, type="l")
Graph created from the 24 hour data I have :
Graph created from a 15 minute data using the same code :
I wanted the x axis to run from 7:30 to 7:30 the next day, but it now shows a decimal number from 0.0 to 1.
Here is the link to the data, just in case:
https://www.dropbox.com/s/wsg437gu00e5t08/Data%20210519.csv?dl=0
The question is actually about combining a date column and a time column to create a timestamp containing date AND time. Note that I suggest processing everything as if we were in the GMT timezone. You can pick whatever timezone you want, then stick to it.
# use ggplot
library(ggplot2)
# assume everything happens in GMT timezone
Sys.setenv( TZ = "GMT" )
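# replicating the data: a measurement result sampled at 1 sec interval
# (start and end were not defined in the original; the values below are
# assumed so that the data begins at 22/05/2019 0:00:00 as in the output shown)
start <- as.POSIXct("2019-05-22 00:00:00", tz = "GMT")
end <- start + 24 * 60 * 60 - 1
t <- seq(start, end, by = "1 sec")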
Time24 <- trimws(strftime(t, format = "%k:%M:%OS", tz="GMT"))
Date <- strftime(t, format = "%d/%m/%Y", tz="GMT")
head(Time24)
head(Date)
d <- data.frame(Date, Time24)
# this is just a random data of temperature
d$temp <- rnorm(length(d$Date),mean=25,sd=5)
head(d)
# the resulting data is as follows
# Date Time24 temp
#1 22/05/2019 0:00:00 22.67185
#2 22/05/2019 0:00:01 19.91123
#3 22/05/2019 0:00:02 19.57393
#4 22/05/2019 0:00:03 15.37280
#5 22/05/2019 0:00:04 31.76683
#6 22/05/2019 0:00:05 26.75153
# this is the answer to the question,
# which is combining the date and the time columns of the data
# note we still assume that this happens in GMT
t <- as.POSIXct(paste(d$Date,d$Time24,sep=" "), format = "%d/%m/%Y %H:%M:%OS", tz="GMT")
# print the data into a plot
png(filename = "test.png", width = 800, height = 600, units = "px", pointsize = 22 )
ggplot(d,aes(x=t,y=temp)) + geom_line() +
  scale_x_datetime(date_breaks = "3 hour",
                   date_labels = "%H:%M\n%d-%b")
# close the png device so that test.png is actually written
dev.off()
The problem is that the function times does not include information about the day. This is a problem since your data spans two days.
The data type you use should be able to include information about the day. POSIXct is such a data type. Also, since POSIXct is the go-to date-time class in R, it is much easier to plot.
Before plotting the data, the time column should have the correct difference in days. When the column is simply transformed with as.POSIXct, the times of day 2 are read as if they were from day 1. This is why we have to add 24 hours to the correct entries.
After that, it is just a matter of plotting. I added an example using the ggplot2 package since I prefer these plots.
You might notice that as.POSIXct adds an incorrect date (today's date) to your time information. Don't worry about this: the date serves only as a dummy, so that the difference in days can be represented.
library(ggplot2)
# Read in your data set
d <- read.csv(file="Data 210519.csv", header = T)
# Read column into R date-time object
t <- as.POSIXct(d$Time24, format = "%H:%M:%OS")
# Add 24 hours to the times on day 2.
startOfDayTwo <- as.POSIXct("00:00:00", format = "%H:%M:%OS")
endOfDayTwo <- as.POSIXct("07:35:00", format = "%H:%M:%OS")
t[t >= startOfDayTwo & t <= endOfDayTwo] <- t[t >= startOfDayTwo & t <= endOfDayTwo] + 24*60*60
plot(t,d$MCO2, type="l")
# arguably a nicer plot
ggplot(d,aes(x=t,y=MCO2)) + geom_line() +
scale_x_datetime(date_breaks = "2 hour",
date_labels = "%I:%M %p")
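Instead of hard-coding the start and end of day two, the rollover could also be detected from the data, on the assumption that the rows are in chronological order: every time the clock value goes backwards, a new day has started. The following is a base R sketch with made-up sample times standing in for d$Time24; the dummy start date and all variable names are assumptions for illustration.

```r
# Made-up sample of clock times that wrap past midnight (stand-in for d$Time24)
time_strings <- c("07:30:00", "15:00:00", "23:59:00", "00:01:00", "07:30:00")

# Seconds since midnight for each reading
secs <- as.numeric(as.POSIXct(paste("1970-01-01", time_strings),
                              format = "%Y-%m-%d %H:%M:%S", tz = "UTC"))

# Each time the clock goes backwards, a new day has started
day_offset <- cumsum(c(0, diff(secs) < 0))

# Attach the times to a dummy start date (assumed) and add the day offsets
dummy_origin <- as.POSIXct("2019-05-21 00:00:00", tz = "GMT")
t_fixed <- dummy_origin + secs + day_offset * 24 * 60 * 60
```

The resulting t_fixed increases monotonically across midnight, so it can be passed to plot() or ggplot() in place of t.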

How to change time into time intervals in R? [duplicate]

Date,Time,Lots,Status
"10-28-15","00:04:50","13-09","1111111110000000"
"10-28-15","00:04:50","13-10","1111100000000000"
"10-28-15","00:04:50","13-11","1111111011100000"
"10-28-15","00:04:50","13-12","1111011111000000"
"10-28-15","00:04:57","13-13","1111111111000000"
"10-28-15","00:04:57","13-14","1111111111110000"
"10-28-15","00:04:57","13-15","1111111100000000"
"10-28-15","00:04:57","13-16","1111111111000000"
"10-28-15","00:05:04","13-17","1111111110000000"
"10-28-15","00:05:04","13-18","1111101100000000"
"10-28-15","00:05:04","13-19","1111111111100000"
"10-28-15","00:05:04","13-20","1111111111100000"
"10-28-15","00:05:11","13-21","1111110100000000"
"10-28-15","00:05:11","13-22","1000011111100000"
"10-28-15","00:05:11","13-23","1101011111110000"
"10-28-15","00:05:11","13-24","1111111111000000"
"10-28-15","00:05:19","13-25","1011000000000000"
"10-28-15","00:05:19","13-26","0000000000000000"
"10-28-15","00:05:19","13-27","1111011110000000"
"10-28-15","00:05:19","13-28","1010000000000000"
Call the data frame above "sample". I need to convert the time into 15-minute intervals, i.e. 4 intervals for every hour. How do I do that? E.g. 00:04:50 will go into 00:04:45 - 00:04:50. Thanks!
I have tried using:
format(as.POSIXlt(as.POSIXct('2000-1-1', "UTC") + round(as.numeric(sample$V3)/300)*300), format = "%H:%M:%S")
I hope I understood your question correctly, but I think you could do it like this:
# example data frame:
myDat <- data.frame(date = rep("2010-08-15",30),
time = sprintf("%02i:%02i:%02i",rep(12:14,each=10),rep(c(0,15,16,29:31,44:45,59,1),3),sample(1:59)[30]))
#produce a datetime variable
myDat$datetime <- strptime(x = paste(myDat$date,myDat$time),format = "%Y-%m-%d %H:%M:%S")
# extract the minutes (format() is used because as.character() may ignore `format`)
myDat$min <- as.integer(format(myDat$datetime, "%M"))
# find the interval to put them in (0 = minutes 00-14, ..., 3 = minutes 45-59);
# fixing the levels keeps the labels correct even if an interval is empty
myDat$ival <- factor(findInterval(myDat$min, c(15, 30, 45, 60)),
                     levels = 0:3, labels = c("00", "15", "30", "45"))
# concatenate minute interval and hour
myDat$timeIval <- sprintf("%s:%s:00", format(myDat$datetime, "%H"), as.character(myDat$ival))
myDat[order(myDat$datetime),]
I'd first convert the data to POSIXct
time <- as.POSIXct(
strptime(paste0(df$column1, " ", df$column2), format="%m-%d-%y %H:%M:%S")
)
and then use seq.POSIXct and cut to generate a factor variable for the 15 min intervals.
interval <- cut(
  time,
  # starttime and endtime are POSIXct values spanning the data
  breaks = seq(starttime, endtime, by = as.difftime(15, units = "mins"))
)
You may want to add -Inf or Inf to the breaks to avoid generating NA values.
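As an alternative to cut(), the start of each 15-minute interval can be computed arithmetically by flooring the timestamp to a multiple of 900 seconds. This is a sketch in base R, not the method of either answer above; it assumes the times are handled in UTC, and the three sample values (the last one made up) stand in for the Date and Time columns of the question.

```r
# Sample timestamps in the question's "mm-dd-yy HH:MM:SS" layout
times <- as.POSIXct(c("10-28-15 00:04:50", "10-28-15 00:05:19", "10-28-15 00:16:00"),
                    format = "%m-%d-%y %H:%M:%S", tz = "UTC")

# Floor each timestamp to the start of its 15-minute (900-second) interval
interval_start <- as.POSIXct(floor(as.numeric(times) / 900) * 900,
                             origin = "1970-01-01", tz = "UTC")

# Human-readable label "start - end" for each interval
interval_label <- paste(format(interval_start, "%H:%M:%S"),
                        format(interval_start + 900, "%H:%M:%S"),
                        sep = " - ")
```

Because the intervals are aligned to the quarter hour rather than to the first observation, no break vector is needed and no NA values arise at the edges.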

Create vector of non-weekend time intervals for part of a day in R

I have a raw dataset of observations taken at 5 minute intervals between 6am and 9pm during weekdays only. These do not come with date-time information for plotting etc., so I am attempting to create a vector of date-times to add to my data, i.e. to go from this:
X425 X432 X448
1 0.07994814 0.1513559 0.1293103
2 0.08102852 0.1436480 0.1259074
to this
X425 X432 X448
2010-05-24 06:00 0.07994814 0.1513559 0.1293103
2010-05-24 06:05 0.08102852 0.1436480 0.1259074
I have gone about this as follows:
# using lubridate and xts
library(xts)
library(lubridate)
# sequence of 5 min intervals from 06:00 to 21:00
sttime <- hms("06:00:00")
intervals <- sttime + c(0:180) * minutes(5)
# sequence of days from 2010-05-24 to 2010-11-05
dayseq <- timeBasedSeq("2010-05-24/2010-11-05/d")
# add intervals to dayseq
dayPlusTime <- function(days, times) {
dd <- NULL
for (i in 1:2) {
dd <- c(dd,(days[i] + times))}
return(dd)
}
obstime <- dayPlusTime(dayseq, intervals)
But obstime is coming out as a list. days[1] + times works, so I guess it has something to do with the way the POSIXct objects are concatenated together to make dd, but I can't figure out what I am doing wrong or where to go next.
Any help appreciated
A base alternative:
# create some dummy dates
dates <- Sys.Date() + 0:14
# select non-weekend days
wd <- dates[as.integer(format(dates, format = "%u")) %in% 1:5]
# create times from 06:00 to 21:00 by 5 min interval
times <- format(seq(from = as.POSIXct("2015-02-18 06:00"),
to = as.POSIXct("2015-02-18 21:00"),
by = "5 min"),
format = "%H:%M")
# create all date-time combinations, paste, convert to as.POSIXct and sort
wd_times <- sort(as.POSIXct(do.call(paste, expand.grid(wd, times))))
One of the issues is that your interval vector does not change the hour when the minutes go over 60.
Here is one way you could do this:
#create the interval vector
intervals<-c()
for(p in 6:20){
for(j in seq(0,55,by=5)){
intervals<-c(intervals,paste(p,j,sep=":"))
}
}
intervals<-c(intervals,"21:0")
#get the days
dayseq <- timeBasedSeq("2010-05-24/2010-11-05/d")
#concatenate everything and format to POSIXct at the end
obstime<-strptime(unlist(lapply(dayseq,function(x){paste(x,intervals)})),format="%Y-%m-%d %H:%M", tz="GMT")
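A further base R sketch, applied to the date range from the question, avoids string pasting altogether: build the weekday dates, express the 06:00 to 21:00 grid as second offsets from midnight, and add the two with rep(). The variable names are illustrative and GMT is assumed as the time zone.

```r
# All days in the observation period, then keep Monday (1) to Friday (5)
days <- seq(as.Date("2010-05-24"), as.Date("2010-11-05"), by = "day")
weekdays_only <- days[format(days, "%u") %in% as.character(1:5)]

# Offsets in seconds from midnight: 06:00 to 21:00 every 5 minutes
offsets <- seq(6 * 3600, 21 * 3600, by = 300)

# Midnight of each weekday as POSIXct (GMT assumed)
day_start <- as.POSIXct(format(weekdays_only), tz = "GMT")

# Every weekday combined with every offset, in chronological order
obstime <- rep(day_start, each = length(offsets)) +
  rep(offsets, times = length(day_start))
```

The result is a single POSIXct vector rather than a list, so it can be attached to the data directly, for example as the index of an xts object.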
