I have an object in R that I have converted to a POSIXct object:
data<- data.frame(date_time= c('2021-06-24 18:37:00', '2021-06-24 19:07:00', '2021-06-24 19:37:00', '2021-06-24 20:07:00','2021-06-24 20:37:00'))
data$date_time<- as.POSIXct(data$date_time, format = "%Y-%m-%d %H:%M:%S")
I want to convert this column to a decimal that gets bigger as the time passes. For example, '2021-06-24 18:37:00' should be smaller than '2021-06-24 19:07:00' and so on. However everything that I have tried so far does yield a decimal, but it does not get bigger as the time goes on. I have tried this:
data$date_time2<- yday(data$date_time) + hour(data$date_time)/24 + minute(data$date_time)/60
However this yields:
[1] 176.3667 175.9083 176.4083 175.9500 176.4500
I need the numbers to increase incrementally as minutes go by. Any help?
A POSIXct object stores the number of seconds since 1970-01-01 00:00:00 UTC as a numeric value, so as.integer(data$date_time) gives an integer count of seconds. Note that this count is referenced to the GMT (UTC) time zone.
Getting the date as a decimal requires some integer arithmetic. The end result is the number of days since 1/1/1970, with the time of day as the fractional part.
data<- data.frame(date_time= c('2021-06-24 18:37:00', '2021-06-24 19:07:00', '2021-06-24 19:37:00', '2021-06-24 20:07:00','2021-06-24 20:37:00'))
data$date_time<- as.POSIXct(data$date_time, format = "%Y-%m-%d %H:%M:%S", tz="GMT")
intvalue <- as.integer(data$date_time)
#numbers of seconds, take the MOD with the seconds per day
#divide the result by seconds per day to make the decimal part
decfraction <- intvalue%%(3600*24)/(3600*24)
#perform integer division to get the number of days
days <- intvalue%/%(3600*24)
# or as.integer(as.Date(data$date_time))
#put together for the final answer
dateAsDecimal <- days + decfraction
#result
#18802.78 18802.80 18802.82 18802.84 18802.86
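The attempt in the question fails because the minutes are divided by 60 rather than by the 1440 minutes in a day, so the minute term alone can add almost a full day. A minimal sketch, assuming base R only (as.POSIXlt fields stand in for lubridate's yday/hour/minute, and lt$yday is zero-based), shows the corrected formula next to a shorter route to days since the epoch:

```r
# Three of the timestamps from the question, parsed in GMT
dt <- as.POSIXct(c("2021-06-24 18:37:00", "2021-06-24 19:07:00",
                   "2021-06-24 19:37:00"),
                 format = "%Y-%m-%d %H:%M:%S", tz = "GMT")

# Corrected version of the question's formula:
# minutes are scaled by 24 * 60 = 1440, not by 60
lt <- as.POSIXlt(dt)
day_of_year <- (lt$yday + 1) + lt$hour / 24 + lt$min / 1440

# Shorter route: seconds since the epoch divided by seconds per day
days_since_epoch <- as.numeric(dt) / 86400

print(day_of_year)
print(days_since_epoch)
```

Both vectors increase with time; the second one also keeps increasing across year boundaries, which a day-of-year value does not.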
If you only need the mapping to preserve order, xtfrm will map objects to order-preserving numbers. For POSIXct objects it simply returns the internal numeric representation, i.e. seconds since the UNIX epoch.
xtfrm(data$date_time)
I have a dataset in .csv, and I have added a column of my own that holds the total time taken for a task to be completed. There are two other columns that contain the start time and the end time, and I calculated the total time taken from them. The start time and end time columns are in the datetime format 5/7/2018 16:13, while the total time taken column is in the format 0:08:20 (H:MM:SS).
I understand that for datetime, it is possible to use the functions as.Date or as.POSIXlt to change the variable type from a factor to that of date. Is there a function that I can convert my total time taken column to (from that of factor) so that I can use it to plot scatterplots/plots in general? I tried as.numeric but the numbers that come out are gibberish and do not correspond to the original time.
If you want to plot the total time taken for each row, then I would suggest just plotting that difference as seconds. Here is a code snippet which shows how you can convert your start or end date into a numerical value:
start <- "5/7/2018 16:13"
start_date <- as.POSIXct(start, format="%d/%m/%Y %H:%M")
as.numeric(start_date)
[1] 1530799980
The above is a UNIX timestamp, which is the number of seconds since the epoch (January 1, 1970). The exact value depends on the time zone used to parse the string, but since you want the difference between start and end times, this detail does not really matter for you, and the difference you get should be valid.
If you want to use minutes, hours, or some other time unit, then you can easily convert.
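As a minimal sketch of that conversion, assuming a made-up end time of 16:21 on the same day and a fixed UTC time zone (both are illustrative choices, not from the question), difftime lets you pick the unit explicitly. The second half shows one possible way to turn an H:MM:SS column directly into seconds:

```r
# Start time from the answer; the end time is a hypothetical example value
start <- as.POSIXct("5/7/2018 16:13", format = "%d/%m/%Y %H:%M", tz = "UTC")
end   <- as.POSIXct("5/7/2018 16:21", format = "%d/%m/%Y %H:%M", tz = "UTC")

# Ask difftime for the unit you want, then drop the class with as.numeric
duration_secs <- as.numeric(difftime(end, start, units = "secs"))
duration_mins <- as.numeric(difftime(end, start, units = "mins"))

# Alternative: parse an "H:MM:SS" column into seconds directly.
# If the column is a factor, wrap it in as.character() first.
total <- c("0:08:20", "1:02:03")
parts <- strsplit(total, ":", fixed = TRUE)
total_secs <- vapply(parts,
                     function(p) sum(as.numeric(p) * c(3600, 60, 1)),
                     numeric(1))

print(duration_secs)
print(duration_mins)
print(total_secs)
```

Either numeric vector can be passed straight to plot().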
SAS documentation states the following for time and datetime values:
SAS time value: is a value representing the number of seconds since midnight of the current day. SAS time values are between 0 and 86400.
SAS datetime value: is a value representing the number of seconds between January 1, 1960 and an hour/minute/second within a specified date.
I want to convert the following date and hour values with R, but I am unsure about the hour (datetime) conversion: which of the "HH:MM:SS" values in R_hour1 and R_hour2 is correct?
I have two separate columns in my table, SAS date = 20562 and SAS hour = 143659.
R: R_date <- as.Date(as.integer(20562), origin="1960-01-01"); R_date
[1] "2016-04-18"
R: R_hour1 <- as.POSIXct(143659, origin = R_date); R_hour1
[1] "2016-04-19 17:54:19 CEST"
R: R_hour2 <- as.POSIXct(143659, origin = "1960-01-01"); R_hour2
[1] "1960-01-02 16:54:19 CET"
Similar to R, SAS Date and DateTime values can have whatever origin you wish them to. The default formats have a default (1/1/1960 for both), but you can use the datetime field to mean any origin you wish, and it will generally still work perfectly well with any of the datetime functions (though it will not display properly unless you write a custom format). It is very possible to have a different origin, as you show above with R_hour1.
As such, you would have to ask the person who generated the data what the meaning of the field is and what its origin should be.
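Note that 143659 is larger than 86400, so it cannot be a count of seconds since midnight within a single day. One possible reading, which is purely an assumption to confirm with whoever produced the data, is that the digits are a clock time written as HHMMSS (14:36:59). A sketch of combining the two columns under that assumption:

```r
sas_date <- 20562    # days since 1960-01-01 (SAS default date origin)
sas_hour <- 143659   # ASSUMPTION: digits read as HHMMSS, i.e. 14:36:59

r_date <- as.Date(sas_date, origin = "1960-01-01")

# Split the digits into hours, minutes, and seconds
hh <- sas_hour %/% 10000
mm <- (sas_hour %/% 100) %% 100
ss <- sas_hour %% 100
secs_since_midnight <- hh * 3600 + mm * 60 + ss

# Midnight of that date in UTC, plus the time of day
stamp <- as.POSIXct(format(r_date), tz = "UTC") + secs_since_midnight
stamp_text <- format(stamp, "%Y-%m-%d %H:%M:%S")
print(stamp_text)
```

If the field instead turns out to be seconds from some other origin, only the computation of secs_since_midnight would change.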
I have some numbers that represent dates in milliseconds since epoch, 00:00:00 Coordinated Universal Time (UTC), Thursday, 1 January 1970
1365368400000,
1365973200000,
1366578000000
I'm converting them to date format:
as.Date(as.POSIXct(my_dates/1000, origin="1970-01-01", tz="GMT"))
answer:
[1] "2013-04-07" "2013-04-14" "2013-04-21"
How to convert these strings back to milliseconds since epoch?
Here are your javascript dates
x <- c(1365368400000, 1365973200000, 1366578000000)
You can convert them to R dates more easily by dividing by the number of milliseconds in one day.
y <- as.Date(x / 86400000, origin = "1970-01-01")
To convert back, just convert to numeric and multiply by this number.
z <- as.numeric(y) * 86400000
Finally, check that the answer is what you started with.
stopifnot(identical(x, z))
As per the comment, you may sometimes get numerical rounding errors leading to x and z not being identical. For numerical comparisons like this, use:
library(testthat)
expect_equal(x, z)
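The round trip above is exact only because y still carries the fractional day internally. If all you have left are the printed date strings, the time of day is gone and the conversion can only return midnight. A small sketch, assuming the strings are interpreted as GMT dates:

```r
date_strings <- c("2013-04-07", "2013-04-14", "2013-04-21")

# Parse as midnight GMT, then convert seconds to milliseconds
ms <- as.numeric(as.POSIXct(date_strings, format = "%Y-%m-%d", tz = "GMT")) * 1000

# Original values from the question, for comparison
x <- c(1365368400000, 1365973200000, 1366578000000)

# The original timestamps were at 21:00 GMT, so they sit 21 hours later
offset_hours <- (x - ms) / 3600000
print(ms)
print(offset_hours)
```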
I will provide a simple framework for handling various kinds of date encodings and for going back and forth between them. The R package ‘lubridate’ makes this very easy with its period and interval classes.
When dealing with days, it can be easy, as one can use as.numeric(Date) to get the number of days since the epoch. To get any unit of time smaller than a day, one can convert using the various factors (24 for hours, 24 * 60 for minutes, etc.). However, for months the math can get a bit more tricky, and thus I prefer in many instances to use this method.
library(lubridate)
as.period(interval(start = epoch, end = Date), unit = 'month')#month
This can be used for year, month, day, hour, minute, and smaller units by applying the factors.
Going the other way such as being given months since epoch:
library(lubridate)
epoch %m+% as.period(Date, unit = 'months')
I presented this approach with months as it might be the more complicated one. An advantage to using period and intervals is that it can be adjusted to any epoch and unit very easily.
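For comparison, here is a base-R counterpart of the month calculation. It is a sketch that swaps in POSIXlt fields and seq() for lubridate's interval and period classes, and the names epoch, d, and months_since_epoch are illustrative:

```r
epoch <- as.Date("1970-01-01")
d     <- as.Date("2013-04-07")

# Whole months since the epoch: POSIXlt stores years since 1900
# and zero-based months
lt <- as.POSIXlt(d)
months_since_epoch <- (lt$year - 70) * 12 + lt$mon

# Going the other way: step forward from the epoch by that many months.
# This returns the first day of the month, since the day was not encoded.
back <- seq(epoch, by = paste(months_since_epoch, "months"), length.out = 2)[2]

print(months_since_epoch)
print(back)
```

Unlike the lubridate version, this hard-codes the 1970 epoch in the `- 70` term, which is the kind of adjustment the period and interval classes handle for you.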
I have an array of time strings, for example 115521.45 which corresponds to 11:55:21.45 in terms of an actual clock.
I have another array of time strings in the standard format (HH:MM:SS.0) and I need to compare the two.
I can't find any way to convert the original time format into something useable.
I've tried using strptime but all it does is add a date (the wrong date) and get rid of time decimal places. I don't care about the date and I need the decimal places:
for example
t <- strptime(105748.35, '%H%M%OS') = ... 10:57:48
using %OSn (n = 1,2 etc) gives NA.
Alternatively, is there a way to convert a time such as 10:57:48 to 105748?
Set the options to allow digits in seconds, and then add the date you wish before converting (so that the start date is meaningful).
options(digits.secs=3)
strptime(paste0('2013-01-01 ',105748.35), '%Y-%m-%d %H%M%OS')
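If the date is irrelevant, another option is to skip date-time classes and compare both formats as seconds since midnight. The following is a sketch under the assumption that the compact values always have the form HHMMSS.ss; the last line also covers the alternative asked about in the question (turning 10:57:48 into 105748):

```r
# Compact format HHMMSS.ss, read as numbers
compact <- as.numeric(c("115521.45", "105748.35"))
hh <- compact %/% 10000
mm <- (compact %/% 100) %% 100
ss <- compact %% 100
secs_compact <- hh * 3600 + mm * 60 + ss

# Standard format HH:MM:SS.s, split on the colons
standard <- c("11:55:21.45", "10:57:48.0")
parts <- strsplit(standard, ":", fixed = TRUE)
secs_standard <- vapply(parts,
                        function(p) sum(as.numeric(p) * c(3600, 60, 1)),
                        numeric(1))

# Compare with a small tolerance because the values are floating point
same_time <- abs(secs_compact - secs_standard) < 1e-6

# Converting "10:57:48" to 105748 is a matter of dropping the colons
as_compact <- as.numeric(gsub(":", "", "10:57:48", fixed = TRUE))

print(same_time)
print(as_compact)
```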
I have one question: how can I convert a date and time in the format 20110711201023 to a number of hours? This is the output of software that I use for image analysis, and I can’t change it. It is very important to be able to define the starting date and time.
Format: 2011 year, 07 month, 11 day, 20 hour, 10 minute, 23 second.
Example:
Starting Date and Time - 20110709201023,
First Date and Time - 20110711214020
Result = 49,5h.
I have 10000 values in this format, so I don't want to do this manually.
I will be very grateful for any advice.
Best is to first make it a real R time object using strptime:
time_obj = strptime("20110711201023", format = "%Y%m%d%H%M%S")
If you do this with both the start and the end date, you can simply say:
end_time - start_time
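to get the difference as a difftime object. Its units are chosen automatically (not necessarily seconds), so use difftime(end_time, start_time, units = "hours") to get the number of hours directly. To convert a whole list of these time strings, simply do: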
time_vector = strptime(dat$time_string, format = "%Y%m%d%H%M%S")
where dat is the data.frame with the data, and time_string the column containing the time strings. Note that strptime works also on a vector (it is vectorized). You can also make the new time vector part of dat:
dat$time = strptime(dat$time_string, format = "%Y%m%d%H%M%S")
or more elegantly (at least if you hate $ as much as me :)), since within lets you refer to the columns by name:
dat = within(dat, { time = strptime(time_string, format = "%Y%m%d%H%M%S") })
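Putting the pieces together for the example in the question, a minimal sketch could look like this. It assumes the first element is the starting time and parses in UTC to avoid daylight-saving surprises; as.POSIXct is used instead of strptime because it stores more compactly in a data.frame:

```r
# The two values from the question; the first one is taken as the start
time_strings <- c("20110709201023", "20110711214020")

# Parse the compact timestamps (as.POSIXct uses strptime internally)
times <- as.POSIXct(time_strings, format = "%Y%m%d%H%M%S", tz = "UTC")

# Hours elapsed since the starting time, as plain numbers
hours_since_start <- as.numeric(difftime(times, times[1], units = "hours"))

print(round(hours_since_start, 1))
```

The second value comes out at about 49.5 hours, matching the expected result in the question.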