In a CSV file I have a few columns. One column has timestamps, where each stamp is the number of microseconds past midnight of today (each CSV file only has data within a single day), so this is not ambiguous.
My question is, how do I parse these microsecond timestamps into R? Thanks a lot!
part of my CSV file:
34201881666,250,10.8,2612,10.99,11,460283,11.01,21450,,,,,
34201883138,23712,10.02,562,10.03,10.04,113650,11,460283,,,,,
34201883138,23712,10.02,562,10.03,10.04,113650,10.05,57811,,,,,
The first column holds the timestamps (the microseconds past midnight of today). I want to construct a time series, for example with the xts package, so that the timestamps of that series come from the first column.
Here is what I would do:
Create an 'anchor' timestamp of midnight using, e.g., ISOdatetime(). Keep it as POSIXct, or convert it using as.numeric().
Add your microseconds-since-midnight to it, properly scaled.
Convert to POSIXct (if needed), and you're done.
Quick example using your first three timestamps:
R> ISOdatetime(2011,8,2,0,0,0) + c(34201881666, 34201883138, 34201883138)*1e-6
[1] "2011-08-02 09:30:01.881665 CDT" "2011-08-02 09:30:01.883137 CDT"
[3] "2011-08-02 09:30:01.883137 CDT"
R>
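The same idea applied to the first column of the CSV can be sketched as follows. This is only a minimal sketch: the date 2011-08-02 is taken from the example above, while the UTC time zone and the names csv_text, dat, midnight and stamps are assumptions made for illustration.

```r
# Sample rows copied from the question; column 1 is microseconds since midnight.
csv_text <- "34201881666,250,10.8,2612,10.99,11,460283,11.01,21450,,,,,
34201883138,23712,10.02,562,10.03,10.04,113650,11,460283,,,,,
34201883138,23712,10.02,562,10.03,10.04,113650,10.05,57811,,,,,"

dat <- read.csv(text = csv_text, header = FALSE)

# Anchor at midnight of the (assumed) trading day, in an assumed time zone.
midnight <- ISOdatetime(2011, 8, 2, 0, 0, 0, tz = "UTC")

# Scale microseconds to seconds and add them to the anchor.
stamps <- midnight + dat$V1 * 1e-6

# Show sub-second digits when printing.
options(digits.secs = 6)
print(stamps)
```

If the xts package is installed, something like xts::xts(dat[, -1], order.by = stamps) should then give a series indexed by these timestamps.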
Related
Newbie here, first post (please be gentle). I have been trying to resolve this for several hours, so finally decided time to ask advice.
I have a large spreadsheet which I am importing with readxl. It contains one column with the date (format dd/mm/yyyy) and several time columns in format hh:mm (screenshot: excel).
Essentially I want to be able to import both time and date columns and combine them, so that I can then do some other calculations, like time elapsed.
If I import letting R guess the col-types, it converts the times to POSIXct, but these then have a date in 1899 attached to them (screenshot: R_POSIXct).
If I force readxl to assign the time column to numeric, I get a decimal (e.g. 0.315972222 for 07:35), which I then tried converting using syntax similar to
format(as.POSIXct(Sys.Date() + 0.315972222), "%Y-%m-%d %H:%M:%S", tz="UTC")
i.e.
df$datetime <- format(as.POSIXct(df$date + df$time), "%Y-%m-%d %H:%M", tz="UTC")
which results in the correct date, but with a time of 00:00, not the time it is passed.
I have tried searching here and found posts that are not quite the same question (e.g. Combining date and time columns into dd/mm/yyyy hh:mm), and have read widely, including about lubridate, but as I'm only 6 months into R, I am finding some explanations a bit cryptic.
Suggestions or signposting appreciated (if there are solutions I haven't found).
If you subtract the number of days between 1899-12-30 (the origin used by Excel on Windows for its serial dates) and 1970-01-01 and then multiply that (shifted) Excel numeric value by 86400 (the number of seconds in a day), you should come close to the number of seconds since the start of 1970. You could then convert to POSIXct with as.POSIXct(x, origin="1970-01-01"). That does seem to be "the hard way", however.
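As a rough sketch of that arithmetic: assuming the workbook uses the 1900 date system, in which serial 0 corresponds to 1899-12-30 (the origin R's ?as.Date suggests for Excel on Windows), the offset to 1970-01-01 is 25569 days, and one day is 86400 seconds. The serial value below is made up for illustration, and rounding to whole seconds is my own choice to guard against floating-point truncation.

```r
# Hypothetical Excel serial value: 04/07/2012 at 07:35.
excel_serial <- 41094.315972222

# Shift the origin from 1899-12-30 to 1970-01-01, then scale days to seconds.
secs  <- round((excel_serial - 25569) * 86400)
stamp <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
print(format(stamp, "%Y-%m-%d %H:%M"))

# With separate columns: a date (already POSIXct) plus a day fraction.
date_col  <- as.POSIXct("2012-07-04", tz = "UTC")
time_frac <- 0.315972222
combined  <- date_col + round(time_frac * 86400)   # fraction of a day -> seconds
print(format(combined, "%Y-%m-%d %H:%M"))
```

Adding the raw fraction to a POSIXct date adds only a fraction of a second, which is presumably why the attempt in the question still shows 00:00.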
It would be far easier and probably more accurate to convert the date-times to YYYY-MM-DD H:M:S format and then export as CSV to be imported into R as text. There is a "POSIXct" colClasses argument to read.csv, although it doesn't handle separate columns of date and time. For that you would be advised to import as character values and then paste the dates and times together. Then watch your format strings for as.POSIXct. The dd/mm/yyyy "format" would be specified by "%d/%m/%Y".
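A minimal sketch of the paste-then-parse route, assuming the date and time columns have already been imported as character. The data frame df and its values are invented for illustration, and UTC is an arbitrary choice of time zone.

```r
# Hypothetical character columns, as they might arrive from a text import.
df <- data.frame(
  date = c("04/07/2012", "05/07/2012"),
  time = c("07:35", "16:10"),
  stringsAsFactors = FALSE
)

# Paste date and time together, then parse with a matching format string.
df$datetime <- as.POSIXct(paste(df$date, df$time),
                          format = "%d/%m/%Y %H:%M", tz = "UTC")

# Time elapsed between the two rows, in hours.
elapsed_hours <- as.numeric(difftime(df$datetime[2], df$datetime[1],
                                     units = "hours"))
print(df)
print(elapsed_hours)
```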
I want to convert number of days to date with time:
> 15525.1+as.Date("1970-01-01")
[1] "2012-07-04" ## correct but no time
I tried this:
> apollo.fmt <- "%B %d, %Y, %H:%M:%S"
> as.POSIXct((15525.1+as.Date("1970-01-01")), format=apollo.fmt, tz="UTC")
[1] "2012-07-04 04:24:00 CEST"
But as you can see, the result is given in CEST, and I need it in UTC.
Any hints on this?
For the original conversion, refer to this question: Converting numeric time to datetime POSIXct format in R, and these pages: Date-times in R, Date-time conversions, and Converting excel dates (number) to R date-time object. Basically, it depends on your data source, the time origin for that data source (Excel, Apache, etc.) and the units. For example, you may have the total time elapsed in seconds, minutes, hours or days since the time origin for your data source, which will be different for Excel or Apache. Once you have this information, you can use strptime or the origin argument to convert to R date-time objects.
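To make the "origin plus units" idea concrete, here is a small sketch. The helper elapsed_to_posixct is hypothetical (it is not part of base R or any package), and rounding to whole seconds is an assumption that suits the examples here.

```r
# Hypothetical helper: convert elapsed time since a known origin to POSIXct.
elapsed_to_posixct <- function(x,
                               unit = c("secs", "mins", "hours", "days"),
                               origin = "1970-01-01",
                               tz = "UTC") {
  unit <- match.arg(unit)
  secs_per_unit <- c(secs = 1, mins = 60, hours = 3600, days = 86400)
  origin_ct <- as.POSIXct(origin, tz = tz)
  # Scale to seconds, round away floating-point noise, add to the origin.
  origin_ct + round(x * secs_per_unit[[unit]])
}

# 15525.1 days since 1970-01-01 (the value from the question).
unix_days <- elapsed_to_posixct(15525.1, unit = "days")
print(unix_days)

# The same instant expressed as days since an Excel-style origin (assumed 1899-12-30).
excel_days <- elapsed_to_posixct(41094.1, unit = "days", origin = "1899-12-30")
print(excel_days)
```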
If you are only concerned with changing the timezone, you can use attr:
> u <- Sys.time()
> u
[1] "2017-12-21 09:01:35 EST"
> attr(u, "tzone") <- "UTC"
> u
[1] "2017-12-21 14:01:35 UTC"
You may want to check up on the valid timezones for your machine though. A good way to get a time-zone that works with your machine would be googleway::google_timezone. To get the coordinates for your location (or the location from where you're importing data), you can either look those up online or use ggmap::geocode() - useful if converting time stamps in data from different time zones.
I think the problem is as.POSIXct doesn't change anything if the time is already POSIXct, so the tz option has no effect.
Use attr as explained here
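A short sketch of what changing the attribute does and does not do: the stored instant stays the same, only the zone used for display changes. The zone name "America/New_York" is an assumption about what the system's time-zone database contains; format(x, tz = ...) is shown as an alternative that leaves the object untouched.

```r
u <- as.POSIXct("2017-12-21 14:01:35", tz = "UTC")

# Copy the value and change only the display time zone.
v <- u
attr(v, "tzone") <- "America/New_York"   # assumed to be a valid zone name here

# The underlying number of seconds since 1970-01-01 UTC is unchanged.
same_instant <- as.numeric(u) == as.numeric(v)
print(same_instant)

# Formatting in UTC on the fly, without modifying v.
as_utc <- format(v, tz = "UTC", usetz = TRUE)
print(as_utc)
```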
The SAS documentation states the following for date and datetime values:
SAS time value: is a value representing the number of seconds since midnight of the current day. SAS time values are between 0 and 86400.
SAS datetime value: is a value representing the number of seconds between January 1, 1960 and an hour/minute/second within a specified date.
I want to convert the following date and hour values with R. I have a big doubt about the hour (datetime) conversion: which one of the "HH:MM:SS" values in R_hour1 and R_hour2 is correct?
I have two separate columns, SAS date = 20562 and SAS hour = 143659, in my table.
R: R_date <- as.Date(as.integer(20562), origin="1960-01-01"); R_date
[1] "2016-04-18"
R: R_hour1 <- as.POSIXct(143659, origin = R_date); R_hour1
[1] "2016-04-19 17:54:19 CEST"
R: R_hour2 <- as.POSIXct(143659, origin = "1960-01-01"); R_hour2
[1] "1960-01-02 16:54:19 CET"
Similar to R, SAS Date and DateTime values can have whatever origin you wish them to. The default formats have a default (1/1/1960 for both), but you can use the datetime field to mean any origin you wish, and it will generally still work perfectly well with any of the datetime functions (though it will not display properly unless you write a custom format). It is very possible to have a different origin, as you show above with R_hour1.
As such, you would have to ask the person who generated the data what the meaning of the field is and what its origin should be.
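One observation that may help when asking: by the definition quoted above, a SAS time value cannot exceed 86400, and 143659 does. A possible reading (purely an assumption, to be confirmed with whoever produced the data) is that the column holds a clock time written as the digits HHMMSS, i.e. 14:36:59. Under that assumption, and with UTC chosen arbitrarily, a sketch of the conversion would be:

```r
sas_date <- 20562
sas_hour <- 143659

# SAS dates count days since 1960-01-01.
r_date <- as.Date(sas_date, origin = "1960-01-01")

# A true SAS time value would be at most 86400 seconds.
exceeds_one_day <- sas_hour > 86400
print(exceeds_one_day)

# ASSUMPTION: the value is the digit string HHMMSS, not a count of seconds.
hh <- as.integer(sas_hour %/% 10000)
mm <- as.integer((sas_hour %/% 100) %% 100)
ss <- as.integer(sas_hour %% 100)

stamp <- as.POSIXct(sprintf("%s %02d:%02d:%02d", format(r_date), hh, mm, ss),
                    tz = "UTC")
print(stamp)
```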
I have a vector of decimal numbers, which represent 'decimal day', the fraction of a day. I want to convert it into HH:MM format using R.
For example, the number 0.8541667 would correspond to 20:30. How can I convert the numbers to HH:MM format?
Using chron:
chron::times(0.8541667)
#[1] 20:30:00
Try this:
R> format(as.POSIXct(Sys.Date() + 0.8541667), "%H:%M", tz="UTC")
[1] "20:30"
R>
We start with a date--which can be any date, so we use today--and add your desired fractional day.
We then convert the Date type into a Datetime object.
Finally, we format the hour and minute part of the Datetime object, ensuring that UTC is used for the timezone.
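A variant of the steps above that skips the date entirely and works on a whole vector is sketched below, using plain arithmetic. The vector day_frac is illustrative. Rounding to whole seconds is an assumption on my part; without it, a value such as 0.315972222 (meant as 07:35) lands a hair below the minute boundary and may be shown as a minute early.

```r
# Fractions of a day; the last one is 07:35 stored with limited precision.
day_frac <- c(0.8541667, 0.4305556, 0.315972222)

# Convert to whole seconds since midnight, rounding away floating-point noise.
secs <- as.integer(round(day_frac * 86400))

# Split into hours and minutes and format as HH:MM.
hhmm <- sprintf("%02d:%02d", secs %/% 3600L, (secs %% 3600L) %/% 60L)
print(hhmm)
```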
One option with data.table:
> library(data.table)
> structure(as.integer(0.4305556*60*60*24), class="ITime")
[1] "10:20:00"
We convert from day fraction to seconds since midnight; coerce to integer; and apply ITime class. (ITime works with integer-stored seconds since midnight.)
Other resources:
@GaborGrothendieck on the chron package, with a link to his R News article with Thomas Petzoldt about converting from Excel in particular: "Converting a time decimal/fraction representing days to its actual time in R?"
@JorisChau on RStudio's hms package: "how to convert excel internal coding for hours to hours in R?"
The R code that I am working on is supposed to use data collected at five-minute intervals.
The data is saved in CSV format. However, due to inconsistency in the data collected, the time column sometimes contains a full timestamp instead of just a time (dd/mm/yyyy HH:MM instead of HH:MM).
This causes an error in my system, as the system reads the data as having multiple different values for the same time value. Therefore, I would like to drop the date from the timestamp so that the code only reads the time value.
My failed attempt was:
as.Date(data[[1]],"%H:%M")
which gave me all NA values for the time column.
I have searched for similar questions on SO, but I did not manage to find a clear answer to my question. Can anyone suggest some possible functions to use?
I appreciate your help.
You could just strip the date portion of the text and then use as.POSIXct to convert them all to a %H:%M timestamp, e.g.:
x <- c("10:25","01/01/2014 10:30")
x <- gsub("^.+(\\d{2}:\\d{2})$","\\1",x)
as.POSIXct(x,format="%H:%M",tz="UTC")
#[1] "2014-06-02 10:25:00 UTC" "2014-06-02 10:30:00 UTC"
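Note that the result above carries the date on which the code was run. If that is unwanted, one option is to pin every time to a fixed dummy date, sketched below. The dummy date 1970-01-01, the variable names, and the assumption that date and time are separated by a space are all mine.

```r
x <- c("10:25", "01/01/2014 10:30")

# Drop everything up to and including the last space (assumes "date time" layout);
# entries without a space are left unchanged.
time_only <- sub("^.* ", "", x)

# Attach a fixed dummy date so the values are comparable across runs.
stamps <- as.POSIXct(paste("1970-01-01", time_only),
                     format = "%Y-%m-%d %H:%M", tz = "UTC")

print(time_only)
print(format(stamps, "%H:%M"))
```

Keeping time_only as plain "HH:MM" text may already be enough if the values are only used as keys for the five-minute slots.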