I am currently trying to determine the time and date on the observations in my dataset.
The date/timestamp is as follows:
1458024601.18659
1458024660.818
The observation are recorded ever minute.
I am trying to convert the above date/time stamp into something for understandable/ interpretable.
Could you please help me with this issue.
Many thanks.
Looks like seconds, but seconds starting from when? Typically, 1970-01-01:
> x = 1458024601.18659
> as.POSIXct(x, origin="1970-01-01")
[1] "2016-03-15 06:50:01 GMT"
So if you are expecting that timestamp to be that time, we've got the origin right.
If you are expecting a date in 1946, then origin="1900-01-01" is probably what you want.
Since, according to your most recent post, the data is stored as a factor class, some further manipulations are required.
To convert the factor column into the required numeric class, this modification of #Spacedman's answer should work:
as.POSIXct(as.numeric(as.character(all_prices$timestamp)), origin="1970-01-01")
Your solution is perfect, except that i have another issue :(
I tried to run this code on the data.frame that i have. Unfortunately, i keep getting this following error after running the code.
dates <- as.POSIXct(all_prices$timestamp, origin="2016-03-15")
Error in as.POSIXlt.character(as.character(x), ...) :
character string is not in a standard unambiguous format
Data "all_prices" is a data.frame.
class(all_prices)
[1] "data.frame"
data "all_prices$timestamp" is a factor.
class(all_prices$timestamp)
[1] "factor"
Related
I have a dataset called EPL2011_12. I would like to make new a dataset by subsetting the original by date. The dates are in the column named Date The dates are in DD-MM-YY format.
I have tried
EPL2011_12FirstHalf <- subset(EPL2011_12, Date > 13-01-12)
and
EPL2011_12FirstHalf <- subset(EPL2011_12, Date > "13-01-12")
but get this error message each time.
Warning message:
In Ops.factor(Date, 13- 1 - 12) : > not meaningful for factors
I guess that means R is treating like text instead of a number and that why it won't work?
Well, it's clearly not a number since it has dashes in it. The error message and the two comments tell you that it is a factor but the commentators are apparently waiting and letting the message sink in. Dirk is suggesting that you do this:
EPL2011_12$Date2 <- as.Date( as.character(EPL2011_12$Date), "%d-%m-%y")
After that you can do this:
EPL2011_12FirstHalf <- subset(EPL2011_12, Date2 > as.Date("2012-01-13") )
R date functions assume the format is either "YYYY-MM-DD" or "YYYY/MM/DD". You do need to compare like classes: date to date, or character to character. And if you were comparing character-to-character, then it's only going to be successful if the dates are in the YYYYMMDD format (with identical delimiters if any delimiters are used).
The first thing you should do with date variables is confirm that R reads it as a Date. To do this, for the variable (i.e. vector/column) called Date, in the data frame called EPL2011_12, input
class(EPL2011_12$Date)
The output should read [1] "Date". If it doesn't, you should format it as a date by inputting
EPL2011_12$Date <- as.Date(EPL2011_12$Date, "%d-%m-%y")
Note that the hyphens in the date format ("%d-%m-%y") above can also be slashes ("%d/%m/%y"). Confirm that R sees it as a Date. If it doesn't, try a different formatting command
EPL2011_12$Date <- format(EPL2011_12$Date, format="%d/%m/%y")
Once you have it in Date format, you can use the subset command, or you can use brackets
WhateverYouWant <- EPL2011_12[EPL2011_12$Date > as.Date("2014-12-15"),]
I have a column "DateTime".
Example value: 2016-12-05-16.25.54.875000
When I import this, R reads it as a Factor.
Now, when I sort the dataset by decreasing "DateTime", the maximum DateTime is 23 June 2017. When I use DateTime = as.POSIXct(DateTime), it changes to 22 June 2017. How is this happening?
P.S. I am running this R script in Power BI.
So some comments first. When you read strings in R, unless you specify otherwise they are imported as factors. You can use the Option
Trying what #Disco Superfly has suggested works if you define the data as a string in R
> a <- "2016-12-05-16.25.54.875000"
> as.POSIXct(a, format="%Y-%m-%d-%H.%M.%S")
[1] "2016-12-05 16:25:54 CET"
> as.POSIXct(a)
[1] "2016-12-05 CET"
Is not clear what you are saying about the fact that the data is being changed. Can you give a reproducible example?
To summarize, if your Dates are strings that what other have already suggested works perfectly. I suppose you are trying to do more than what you are explaining and therefore I don't understand what you are saying exactly.
I have date format in following format in a data frame:
Jan-85
Apr-99
1-Nov
Feb-96
When I see the typeof(df$col) I get the answer as "integer".
Actually when I see the format in excel it is in m/d/yyyy format. I was trying to convert this to date format in R. All my efforts yielded NA.
I tried parse_date_time function. I tried as.date along with as.character. I tried as.POSIXct but everything is giving me NA.
My trials were as follows and everything was a failure:
as.Date.numeric(df$col,"m%d%Y")
transform(df$col, as.Date(as.character(df$col), "%m%d%Y"))
as.Date(df$col,"m%d%Y")
as.POSIXct.numeric(as.character(loan_new$issue_d), format="%Y%m%d")
as.POSIXct.date(as.character(df$col), format="%Y%m%d")
mdy(df$col)
parse_date_time(df$col,c("mdy"))
How can I convert this to date format? I have used lubridate package for parse_date_time and mdy package.
dput output is below
Label <- factor(c("Apr-08",
"Apr-09", "Apr-10", "Apr-11", "Aug-07", "Aug-08", "Aug-09", "Aug-10",
"Aug-11", "Dec-07", "Dec-08", "Dec-09", "Dec-10", "Dec-11", "Feb-08",
"Feb-09", "Feb-10", "Feb-11", "Jan-08", "Jan-09", "Jan-10", "Jan-11",
"Jul-07", "Jul-08", "Jul-09", "Jul-10", "Jul-11", "Jun-07", "Jun-08",
"Jun-09", "Jun-10", "Jun-11", "Mar-08", "Mar-09", "Mar-10", "Mar-11",
"May-08", "May-09", "May-10", "May-11", "Nov-07", "Nov-08", "Nov-09",
"Nov-10", "Nov-11", "Oct-07", "Oct-08", "Oct-09", "Oct-10", "Oct-11",
"Sep-07", "Sep-08", "Sep-09", "Sep-10", "Sep-11"))
NA is typically what you get when you misspecify the format. Which is what you do. That said, if your data is really looking like the first example you gave, it's impossible to simply convert this to a date. You have two different formats, one being month-year and the other day-month.
If your updated date (i.e. Dec-11) is the correct format, then you use the format argument of as.Date like this:
date <- "Dec-11"
as.Date(date, format = "%b-%d")
# [1] "2017-12-11"
Or on your example data:
as.Date(Label, format = "%b-%d")
# [1] "2017-04-08" "2017-04-09" "2017-04-10" "2017-04-11" "2017-08-07" "2017-08-08"
# [7] "2017-08-09" "2017-08-10" "2017-08-11" "2017-12-07" "2017-12-08" "2017-12-09"
If you want to convert something like Jan-85, you have to decide which day of the month that date should have. Say we just take the first of each month, then you can do:
x <- "Jan-85"
xd <- paste0("1-",x)
as.Date(xd, "%d-%b-%y")
# [1] "1985-01-01"
More information on the format codes can be found on ?strptime
Note that R will automatically add this year as the year. It has to, otherwise it can't specify the date. In case you do not have a day of the month (eg like Jan-85), conversion to a date is impossible because the underlying POSIX algorithms don't have all necessary information.
Also keep in mind that this only works when your locale is set to english. Otherwise you have a big chance your OS won't recognize the month abbreviations correctly. To do so, do eg:
Sys.setlocale(category = "LC_TIME", locale = "English_United Kingdom")
You can later set it back to the original one if you must, or restart your R session to reset the locale settings.
note: Please check carefully which locale notations are valid for your OS. The above example works on Windows, but is not guaranteed on either Linux or Mac.
Why you see integer
The fact that these string values are of integer type, is due to the fact that R automatically convert character vectors to factors when reading in a data frame. So typeof() returns integer because that's the internal representation of a factor.
I have a dataset with some date time like this "{datetime:2015-07-01 09:10:00" So I wanted to remove the text, and then keep the date & the time as as.Date returns only the date. So I write this code but the only problem I have is that during the second line with strsplit, it only returns me the date time of the first line and so erase the others... I woud love to get ALL my date time not only the first. I thought about sapply maybe, but I can't make it right I have many errors or maybe with a loop for? I am novice to R so I don't really know how to do this the best way.
Could you help me please? Besides If you have another idea for the time & date format or a simple way to do it, it should be very nice of you too.
data$`Date Time`=as.character(data$`Date Time`)
data$`Date Time`=unlist(strsplit(data[,1], split='e:'))[2]
date=substr(data$`Date Time`,0,10)
date=as.Date(date)
time=substr(data$`Date Time`,12,19)
data$Date=date
data$Time=time
Thank you very much for your help!
You could use the format argument to avoid all the strsplit:
times <- as.POSIXct(data$`Date Time`, format='{datetime:%Y-%m-%d %H:%M:%S')
(The reason for the "{datetime:" in the format is because you mentioned this is the format of your strings).
This object has both date and time in it, and then you can just store it in the dataframe as a single column of type POSIXct rather than two columns of type string e.g.
data$datetime <- times
but if you do want to store the date as a Date and the time as a string (as in your example above):
data$Date <- as.Date(times)
data$Time <- strftime(times, format='%H:%M:%S')
See ?as.Date, ?as.POSIXct, ?strptime for more details on that format argument and various conversions between date and string.
My code:
axis.Date(1,sites$date, origin="1970-01-01")
Error:
Error in as.Date.numeric(x) : 'origin' must be supplied
Why is it asking me for the origin when I supplied it in the above code?
I suspect you meant:
axis.Date(1, as.Date(sites$date, origin = "1970-01-01"))
as the 'x' argument to as.Date() has to be of type Date.
As an aside, this would have been appropriate as a follow-up or edit of your previous question.
My R use 1970-01-01:
>as.Date(15103, origin="1970-01-01")
[1] "2011-05-09"
and this matches the calculation from
>as.numeric(as.Date(15103, origin="1970-01-01"))
So generally this has been solved, but you might get this error message because the date you use is not in the correct format.
I know this is an old post, but whenever I run this I get NA all the way down my date column. My dates are in this format 20150521 – NealC Jun 5 '15 at 16:06
If you have dates of this format just check the format of your dates with:
str(sides$date)
If the format is not a character, then convert it:
as.character(sides$date)
For as.Date, you won't need an origin any longer, because this is supplied for numeric values only. Thus you can use (assuming you have the format of NealC):
as.Date(as.character(sides$date),format="%Y%m%d")
I hope this might help some of you.
Another option is the lubridate package:
library(lubridate)
x <- 15103
as_date(x, origin = lubridate::origin)
"2011-05-09"
y <- 1442866615
as_datetime(y, origin = lubridate::origin)
"2015-09-21 20:16:55 UTC"
From the docs:
Origin is the date-time for 1970-01-01 UTC in POSIXct format. This date-time is the origin for the numbering system used by POSIXct, POSIXlt, chron, and Date classes.
If you have both date and time information in the numeric value, then use as.POSIXct. Data.table package IDateTime format is such a case. If you use fwrite to save a file, the package automatically converts date-times to idatetime format which is unix time. To convert back to normal format following can be done.
Example: Let's say you have a unix time stamp with date and time info: 1442866615
> as.POSIXct(1442866615,origin="1970-01-01")
[1] "2015-09-21 16:16:54 EDT"
by the way, the zoo package, if it is loaded, overrides the base as.Date() with its own which, by default, provides origin="1970-01-01".
(i mention this in case you find that sometimes you need to add the origin, and sometimes you don't.)