I have data containing epoch times, and I need to extract a human-readable time with year, month, day, hours, minutes, seconds, and milliseconds.
epoch time before conversion:
1517166673385
After conversion I need it to be in this format:
20180128191113385
I have written the following function and it works well, but it takes a long time. I am searching for a faster function because I have thousands of files to process.
getDTI <- function(echotime) {
  # First 10 digits are seconds since the epoch; last 3 are milliseconds
  DTItemp <- as.POSIXct(as.numeric(substr(echotime, 1, 10)),
                        origin = "1970-01-01", tz = "GMT")
  # Rebuild the string field by field with substr(), then append the
  # millisecond digits taken straight from the input
  DTI <- paste0(substr(DTItemp, 1, 4), substr(DTItemp, 6, 7),
                substr(DTItemp, 9, 10), substr(DTItemp, 12, 13),
                substr(DTItemp, 15, 16), substr(DTItemp, 18, 19),
                substr(echotime, 11, 13))
  return(DTI)
}
a <- 1517166673385
paste0(format(as.POSIXct(a / 1000, origin = "1970-01-01", tz = "GMT"),
              "%Y%m%d%H%M%S"),
       sprintf("%03d", a %% 1000))
[1] "20180128191113385"
In function form:
fun <- function(a) {
  paste0(format(as.POSIXct(a / 1000, origin = "1970-01-01", tz = "GMT"),
                "%Y%m%d%H%M%S"),
         sprintf("%03d", a %% 1000))
}
d <- c(1517166673385, 1517701556075)
fun(d)
[1] "20180128191113385" "20180203234556075"
Can someone please explain how the epoch time / Unix timestamp 1668443121840 converts to the date 2022-11-14T16:25:21.840+0000?
How does the conversion take place, and how can you tell whether an epoch timestamp is given in seconds, milliseconds, microseconds, or nanoseconds?
Additionally, is there a function in PySpark to convert the date back to an epoch timestamp?
Thanks in advance!
I tried a number of methods, but I am not achieving the expected result:
import datetime

t = datetime.datetime.strptime('2021-11-12 02:12:23', '%Y-%m-%d %H:%M:%S')
print(t.strftime('%s'))
With this I cannot control the format, or the precision in terms of seconds, milliseconds, microseconds, or nanoseconds.
The epoch time / Unix timestamp uses a reference date: 00:00:00 UTC on 1 January 1970. It counts the seconds (or milliseconds) elapsed since that date. For current-era dates you can usually tell the unit from the digit count: roughly 10 digits means seconds, 13 milliseconds, 16 microseconds, and 19 nanoseconds.
The value you are looking for is in milliseconds, so you have to calculate the millisecond part and concatenate it with the epoch seconds:
import pyspark.sql.functions as F

df = spark.createDataFrame([('2022-11-14T16:25:21.840+0000',)]).toDF("timestamp")

df\
    .withColumn("timestamp", F.to_timestamp(F.col("timestamp")))\
    .withColumn("epoch_seconds", F.unix_timestamp("timestamp"))\
    .withColumn("epoch_milliseconds",
                F.concat(F.unix_timestamp("timestamp"),
                         F.date_format("timestamp", "SSS")))\
    .show(truncate=False)

# +----------------------+-------------+------------------+
# |timestamp             |epoch_seconds|epoch_milliseconds|
# +----------------------+-------------+------------------+
# |2022-11-14 16:25:21.84|1668443121   |1668443121840     |
# +----------------------+-------------+------------------+
The Unix timestamp counts the seconds that have elapsed since 00:00:00 UTC on 1 January 1970.
To convert dates back to a Unix timestamp in PySpark, you can use the unix_timestamp function, as shown above.
I have a very specific problem. I have been trying to convert a date-time character string, for example "2017-05-21 00:00:00", into a date-time format in R.
Whenever I try to convert it using strptime or as.POSIXct, it gives me "2017-05-21".
Thanks for any help
As #ngm says, this is only a formatting choice on the part of R. You can check to make sure it's actually midnight. Datetimes are stored as seconds past the epoch, and can actually be used in arithmetic.
t1 <- as.POSIXct("2017-05-21 00:00:00")
t1
# [1] "2017-05-21 EDT"
as.integer(t1)
# [1] 1495339200
So your time is 1,495,339,200 seconds after the epoch. Now we can look at midnight plus one second.
t2 <- as.POSIXct("2017-05-21 00:00:01")
t2
# [1] "2017-05-21 00:00:01 EDT"
as.integer(t2)
# [1] 1495339201
Which is one second higher than t1. So t1 is, in fact, midnight.
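If you want the zero time of day to print anyway, you can always force it with an explicit format string (plain base R, shown on t1 from above):
# Printing is just formatting; request the H:M:S part explicitly
format(t1, "%Y-%m-%d %H:%M:%S")
# [1] "2017-05-21 00:00:00"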
I have some numbers that represent dates in milliseconds since the epoch (00:00:00 Coordinated Universal Time (UTC), Thursday, 1 January 1970):
1365368400000,
1365973200000,
1366578000000
I'm converting them to date format:
as.Date(as.POSIXct(my_dates/1000, origin="1970-01-01", tz="GMT"))
answer:
[1] "2013-04-07" "2013-04-14" "2013-04-21"
How to convert these strings back to milliseconds since epoch?
Here are your JavaScript dates:
x <- c(1365368400000, 1365973200000, 1366578000000)
You can convert them to R dates more easily by dividing by the number of milliseconds in one day.
y <- as.Date(x / 86400000, origin = "1970-01-01")
To convert back, just convert to numeric and multiply by this number.
z <- as.numeric(y) * 86400000
Finally, check that the answer is what you started with.
stopifnot(identical(x, z))
As per the comment, you may sometimes get numerical rounding errors leading to x and z not being identical. For numerical comparisons like this, use:
library(testthat)
expect_equal(x, z)
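Since POSIXct counts seconds since the epoch (rather than days), the same round trip also works for full datetimes; a small sketch using the same x:
# POSIXct is seconds since the epoch, so scale by 1000 instead of 86400000
pts <- as.POSIXct(x / 1000, origin = "1970-01-01", tz = "GMT")
z2  <- as.numeric(pts) * 1000
expect_equal(x, z2)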
I will provide a simple framework for handling various kinds of date encodings and for going back and forth between them. Using the R package lubridate, this is made very easy with the period and interval classes.
When dealing with days it can be easy, as one can use as.numeric(Date) to get the number of days since the epoch. To get any unit of time smaller than a day, one can convert using the various factors (24 for hours, 24 * 60 for minutes, etc.). For months, however, the math gets trickier, and in many instances I prefer this method:
library(lubridate)
# epoch and Date are Date/POSIXct objects; unit can be 'year', 'month',
# 'day', 'hour', 'minute', etc.
as.period(interval(start = epoch, end = Date), unit = 'month')
This can be used for year, month, day, hour, minute, and smaller units by applying the factors above.
Going the other way, such as being given months since the epoch:
library(lubridate)
# n_months is the given count of months since the epoch
epoch %m+% months(n_months)
I presented this approach with months as it is probably the most complicated case. An advantage of using periods and intervals is that they can be adjusted to any epoch and unit very easily.
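As a small worked sketch of both directions, using 1970-01-01 as the epoch (the 519-month figure simply falls out of the forward calculation):
library(lubridate)
epoch <- as.POSIXct("1970-01-01", tz = "GMT")
d     <- as.POSIXct("2013-04-07", tz = "GMT")
# Forward: whole months (plus leftover days) since the epoch
as.period(interval(start = epoch, end = d), unit = 'month')
# [1] "519m 6d 0H 0M 0S"
# Backward: add 519 whole months back onto the epoch
epoch %m+% months(519)
# [1] "2013-04-01 GMT"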
I'm trying to get acquainted with weatherData in R.
Having downloaded a set of temperature data, I exported it to CSV.
Opening the CSV in LibreOffice Calc shows the date and time for each temperature reading as a string of ten digits. In spite of some Googling, I have not found a way of converting that string into the format in which it appears in R.
For example, 1357084200 should, I believe, translate to 2013-01-01 23:50:00.
Any help in getting the correct date, in the same format, to appear in Calc via the CSV would be greatly appreciated.
Here is the direct way:
as.POSIXct(1357084200, origin="1970-01-01", tz="GMT")
#[1] "2013-01-01 23:50:00 GMT"
If it's really a character:
as.POSIXct(as.numeric("1357084200"), origin="1970-01-01", tz="GMT")
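Since the goal is to see the readable datetime in Calc, one option is to format the column as text in R before writing the CSV. A sketch, where the data frame weather and its Time column are hypothetical names:
# Hypothetical data frame with an epoch-seconds column named Time
weather$Time <- format(as.POSIXct(weather$Time, origin = "1970-01-01", tz = "GMT"),
                       "%Y-%m-%d %H:%M:%S")
write.csv(weather, "weather_readable.csv", row.names = FALSE)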
I'm not aware of a direct way of doing this, but I believe I've figured out a workaround.
For starters, your example is correct. The long number (the timestamp) is the number of seconds that have passed since 1970-01-01 00:00:00. Knowing this, you can calculate the exact date and time from the timestamp; it's a bit involved because leap years have to be taken into account.
What comes in handy is the ability to supply an arbitrary number of days/months/years to the LibreOffice function DATE. In essence, you find the number of whole days represented in the timestamp by dividing it by 60*60*24 (seconds per minute × minutes per hour × hours per day) and rounding down, then supply that number to the DATE function.
timestamp  = 1357084200
days       = FLOOR(timestamp / (60*60*24); 1)   // comes out at 15706
actualdate = DATE(1970; 1; 1 + days)            // comes out at 2013-01-01
seconds    = timestamp - days * 60*60*24        // comes out at 85800
actualtime = TIME(0; 0; seconds)                // comes out at 23:50:00
Then you can concatenate these or whatever else you want to do.
In a CSV file I have a few columns. One column has timestamps, where each stamp is the number of microseconds past midnight of the current day (each CSV file only holds data from within one day, so this is not ambiguous).
My question is: how do I parse these microsecond timestamps into R? Thanks a lot!
Part of my CSV file:
34201881666,250,10.8,2612,10.99,11,460283,11.01,21450,,,,,
34201883138,23712,10.02,562,10.03,10.04,113650,11,460283,,,,,
34201883138,23712,10.02,562,10.03,10.04,113650,10.05,57811,,,,,
The first column holds the timestamps (microseconds past midnight). I want to construct a time series, for example with the xts package, indexed by these timestamps.
Here is what I would do:
Create an 'anchor' timestamp of midnight using, e.g., ISOdatetime(). Keep it as POSIXct, or convert using as.numeric().
Add your microseconds-since-midnight to it, properly scaled.
Convert to POSIXct (if needed), and you're done.
Quick example using your first three timestamps:
R> ISOdatetime(2011,8,2,0,0,0) + c(34201881666, 34201883138, 34201883138)*1e-6
[1] "2011-08-02 09:30:01.881665 CDT" "2011-08-02 09:30:01.883137 CDT"
[3] "2011-08-02 09:30:01.883137 CDT"
R>
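To get from those POSIXct stamps to the xts series the question asks about, one possible sketch (values taken from the sample rows; the trading date 2011-08-02 is assumed, as above):
library(xts)
us  <- c(34201881666, 34201883138, 34201883138)   # microseconds past midnight
px  <- c(10.8, 10.02, 10.02)                      # third CSV column, as sample data
idx <- ISOdatetime(2011, 8, 2, 0, 0, 0) + us * 1e-6
options(digits.secs = 6)   # print sub-second precision
x <- xts(px, order.by = idx)
x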