Data frames and datetimes [duplicate] - r

I have a dataset in which I'm trying to change the format of my time column. The current format looks like "2022-05-23 23:06:58". I want to keep only the time of day and drop the date.
Other info: I want to make this change within my data frame, not just on a few standalone values. There are over 100,000 rows, so I need a function or solution that handles the whole column. Tidyverse, lubridate, format(), etc. are all fine. Thank you.
Edit: One thing I may not have articulated fully: I want to keep the exact time and nothing else, so "23:48:07" is what I'm looking for, not just the hour. I need it so that I can eventually compute the time elapsed between two columns.

Try this.
To keep only the time of day, format the datetime with "%H:%M:%S":
your_time <- format(as.POSIXct(your_time), format = "%H:%M:%S")
# gives "23:06:58" for "2022-05-23 23:06:58"; use format = "%H" to get just the hour, "23"
Since you want to apply this to a large dataset, use mutate:
library(dplyr)
large_df %>%
  mutate(Hour = format(as.POSIXct(Datetime), format = "%H:%M:%S"))
Here large_df is your dataset of over 100,000 records and Datetime is its datetime column.
mutate adds a new column, named Hour, that holds the result as a character string.

Is the time as a string OK? Because then you can use substr to extract the hours and minutes like so:
time <- c("2022-05-23 23:02:58", "2022-05-23 13:52:58", "2022-05-23 03:31:58", "2022-05-23 09:09:58")
n <- nchar(time)
hour <- substr(time, n - 7, n - 3)  # "23:02" "13:52" "03:31" "09:09"
Just replace time with your 100,000-row time column. Use substr(time, n - 7, n) to keep the seconds as well.

library(data.table)
hour("2022-05-23 23:06:58") # 23
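The edit to the question mentions eventually subtracting one time column from another. A minimal sketch of that step in base R, assuming a hypothetical data frame with two datetime string columns named started and ended (these names are not from the question): it can be simpler to take the difference on the full datetimes and only format the time of day for display.

```r
# Hypothetical example data; the column names are assumptions for illustration
df <- data.frame(
  started = c("2022-05-23 23:06:58", "2022-05-23 13:52:58"),
  ended   = c("2022-05-23 23:48:07", "2022-05-23 14:02:58"),
  stringsAsFactors = FALSE
)

# Parse both columns as datetimes (a fixed time zone keeps the result reproducible)
started <- as.POSIXct(df$started, tz = "UTC")
ended   <- as.POSIXct(df$ended, tz = "UTC")

# Elapsed time in minutes, computed on the full datetimes
df$elapsed_mins <- as.numeric(difftime(ended, started, units = "mins"))

# Time of day as a string, for display only
df$started_time <- format(started, "%H:%M:%S")
```

Working on the full datetimes avoids wrong results when an interval crosses midnight, which a subtraction of bare "HH:MM:SS" strings could not handle.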

Related

Date Class Altered After Calling R.ts

I have a dataframe with 2 columns: a date column and an observation column.
Here is the head:
I call ts with this line of code
data_grp_ts = ts(data_grp, frequency = 52)
And when I look at the head of the resulting time series, the date is gone:
There are 2 years of data in the data frame (104 observations), and the resulting time series index is not what I expected:
index(data_grp_ts)
Is there something wrong with the original date format?
Thanks in advance
Since there is not enough info in the post for people to answer this, I am closing it and will re-open a new post with more detail. It is found here: Working With Weekly Time Series Data In R using xts

How to convert date format to total number of days?

I'm trying to convert a yyyy-mm-dd date column in a data frame to the total number of days since a reference date, to use in my survival function.
I've already tried as_date() and grepl(), but I can't get it to work: either there are too many NA values in my data frame or I'm doing something wrong.
Ref.date <- ymd("1941-08-24")
Date.MI <- ymd("Date.MI")
Day <- as.numeric(difftime(Date.MI, Ref.date))
I expect just the total number of days since 1941-08-24.
How do I solve the problem?
difftime() lets you specify the units of the result, so try something like this:
as.numeric(difftime(as.POSIXct("1941-08-25"), as.POSIXct("1941-08-24"), units = c("days")))
The way to solve it:
as.numeric(difftime(as.POSIXct(Date.MI[[1]]), as.POSIXct("1941-08-24"), units = c("days")))
The double square brackets were needed because Date.MI[[1]] extracts the first column of Date.MI.
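As a hedged alternative to the as.POSIXct calls above, the same day count can be sketched with the Date class, which has no time-of-day or time zone component. The vector date_mi below is hypothetical example data standing in for the Date.MI column.

```r
# Reference date from the question
ref_date <- as.Date("1941-08-24")

# Hypothetical stand-in for the Date.MI column, including a missing value
date_mi <- c("1941-08-25", "1942-08-24", NA)

# Subtracting two Date objects gives a difference in days; NA stays NA
days_since <- as.numeric(as.Date(date_mi) - ref_date)
```

Missing dates simply propagate as NA in the result, so they do not need to be removed beforehand.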

How to get monthly time series cross sectional into zoo using R

I want to get a panel data set into zoo so that it catches both month and year. My data set looks like this.
and the data can be downloaded from HERE.
The best I could do is:
dat<-read.csv("dat_lag.csv")
zdat <- read.zoo(dat, format="%d/%m/%Y")
However, I could only do this by including column 1 (Date) and column 4 (Day) in my data set. Is there a clever way to get both month and year into zoo using R without including the Date and Day columns? Thanks in advance for any help.

R - For loop over large list of elements

I have split my large data set by date like so to create a large list of several elements:
days <- split(df, df$Date)
My data has columns including time of sunrise, sunset etc. for each day. I now want to use a for loop to do further work on each day separately like this:
for(i in 1:length(days)){
sunrisetime <- as.character(df$Sunrise[1])
# Further similar work (using time of sunrise & sunset for each date to split
# into daytime hours and nighttime hours)
}
My question is about the df$Sunrise on the second line - I don't think this is the right code to use when trying to access the sunrise time of each day on the days list. I have tried all sorts of variations but am an R newbie so must just be hitting the wrong terms.
Thanks in advance.
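sunrisetime <- rep(NA_character_, length(days))
for (i in seq_along(days)) {
  # days[[i]] is the data frame for the i-th date; take its first sunrise value
  sunrisetime[i] <- as.character(days[[i]]$Sunrise[1])
}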
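A loop-free sketch of the same idea, assuming each element of days is a data frame with a Sunrise column as described in the question. The small df below is made-up example data, not the asker's.

```r
# Made-up example data standing in for the asker's df
df <- data.frame(
  Date    = c("2020-01-01", "2020-01-01", "2020-01-02"),
  Sunrise = c("08:05", "08:05", "08:04"),
  stringsAsFactors = FALSE
)

# One data frame per date, as in the question
days <- split(df, df$Date)

# Take the first sunrise value from each day's data frame
sunrisetime <- vapply(
  days,
  function(day) as.character(day$Sunrise[1]),
  character(1)
)
```

The result is a character vector named by date, so sunrisetime["2020-01-02"] looks up a single day.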

Creating a single timestamp from separate DAY OF YEAR, Year and Time columns in R

I have a time series dataset for several meteorological variables. The time data is logged in three separate columns:
Year (e.g. 2012)
Day of year (e.g. 261 representing 17-September in a Leap Year)
Hrs:Mins (e.g. 1610)
Is there a way I can merge the three columns to create a single timestamp in R? I'm not very familiar with how R deals with the Day of Year variable.
Thanks for any help with this!
It looks like the timeDate package can handle gregorian time frames. I haven't used it personally but it looks straightforward. There is a shift argument in some methods that allow you to set the offset from your data.
http://cran.r-project.org/web/packages/timeDate/timeDate.pdf
Because you mentioned it, I thought I'd show the actual code to merge together separate columns. When you have the values you need in separate columns you can use paste to bring them together and lubridate::mdy to parse them.
library(lubridate)
col.month <- "Jan"
col.year <- "2012"
col.day <- "23"
date <- mdy(paste(col.month, col.day, col.year, sep = "-"))
Lubridate is a great package, here's the official page: https://github.com/hadley/lubridate
And here is a nice set of examples: http://www.r-statistics.com/2012/03/do-more-with-dates-and-times-in-r-with-lubridate-1-1-0/
You should get quite far using ISOdatetime. This function takes vectors of year, month, day, hour, minute, and second as input and returns a POSIXct object that represents the time. You have to split the third column into separate hour and minute columns, and convert the day of year into a month and day, before you can use the function.
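For the day-of-year layout described in the question, one possible sketch uses base R's strptime with the %j (day of year) format code. The three vectors below are hypothetical stand-ins for the Year, Day of year, and Hrs:Mins columns, and UTC is an assumed time zone.

```r
# Hypothetical stand-ins for the three logged columns
year <- c(2012L, 2012L)
doy  <- c(261L, 1L)      # day of year
hhmm <- c(1610L, 5L)     # 16:10 and 00:05, stored as integers

# Zero-pad each part so every row has the same fixed-width layout
stamp_text <- sprintf("%04d %03d %04d", year, doy, hhmm)

# %j parses the day of year; %H%M parses the four-digit time
stamp <- as.POSIXct(strptime(stamp_text, format = "%Y %j %H%M", tz = "UTC"))
```

The zero-padding matters because a time such as 00:05 would otherwise be logged as 5 and could not be split into hours and minutes reliably.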
