As a new and self-taught R user, I am struggling to convert date and time values stored as characters into numbers so that I can group unique combinations of data. I'm hoping someone has come across this before and knows how I might go about it.
I'd like to convert a field of DateTime data (30/11/2012 14:35) to a numeric version of the date and time (seconds from 1970 maybe??) so that I can back reference the date and time if needed.
I have searched the R help and online help and can only find POSIXct and strptime, which seem to convert the other way in the examples I've seen.
I will need to apply the conversion to a large dataset so I need to set the formatting for a field not an individual value.
I have tried to modify some python code but to no avail...
Any help with this, including pointers to tools I should read about would be much appreciated.
You can do this with base R just fine, but there are some shortcuts for common date formats in the lubridate package:
library(lubridate)
d <- dmy_hm("30/11/2012 14:35")
> as.numeric(d)
[1] 1354286100
From ?POSIXct:
Class "POSIXct" represents the (signed) number of seconds since the
beginning of 1970 (in the UTC timezone) as a numeric vector.
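For a whole column rather than a single value, a base R sketch might look like the following. The data frame df and its DateTime column are made-up names for illustration, and UTC is assumed as the time zone; substitute the zone your data was recorded in.

```r
# Hypothetical data; 'df' and 'DateTime' are illustrative names only.
df <- data.frame(DateTime = c("30/11/2012 14:35", "01/12/2012 09:00"),
                 stringsAsFactors = FALSE)

# Parse day/month/year hour:minute strings into POSIXct (time zone assumed UTC).
df$dt <- as.POSIXct(df$DateTime, format = "%d/%m/%Y %H:%M", tz = "UTC")

# Seconds since 1970-01-01 00:00:00 UTC, usable as a grouping key.
df$secs <- as.numeric(df$dt)

# Convert the numbers back to date-times when a back reference is needed.
df$back <- as.POSIXct(df$secs, origin = "1970-01-01", tz = "UTC")
```

Because as.POSIXct is vectorised, the same call works on a column of any length.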
I'm trying to crawl data from the Singapore Stock Exchange, whose web content is loaded dynamically from an API call returning JSON (example here).
I'm having a problem with the dates, which are given as numbers in the JSON. For example, 1575491760000 is supposed to be 2019-12-04 20:36:00 GMT.
After some trial and error, I've figured out a solution using R:
as.POSIXct(1575491760000/1000, origin="1970-01-01", tz = 'GMT')
# not sure why the number needs to be divided by 1000 here, but I guess this is the way to make it work
The above code does return "2019-12-04 20:36:00 GMT" in R.
However, my question is: is there an equivalent conversion in Excel? I've tried a few different ways, but none of them can deal with such a long value (date + time format). I'd appreciate it if anyone could provide a specific solution!
Here's the Excel equivalent.
=DATE(1970,1,1) + 1575491760000/(1000*60*60*24)
# 12/4/19 20:36:00 with cell formatting set to m/d/yy h:mm:ss
Unix time normally counts seconds since 1/1/1970, but this value is in milliseconds (which is also why the R code has to divide by 1000). Excel datetimes increment by one for every day since 1/1/1900.
So to convert from this millisecond timestamp to Excel, divide by the number of milliseconds in a day (1000*60*60*24) and add the result to the date 1/1/70 (25569 under the hood in Excel).
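The same arithmetic can be checked in R. This is only a sketch: 25569 is the Excel serial for 1/1/1970 quoted above, the variable names are made up, and the result is rounded to whole seconds on the way back to avoid floating-point drift.

```r
ms <- 1575491760000                 # timestamp from the JSON, in milliseconds
ms_per_day <- 1000 * 60 * 60 * 24   # milliseconds in one day

# Excel serial date-time: whole days since Excel's origin plus the fraction of a day.
excel_serial <- 25569 + ms / ms_per_day

# Round trip: Excel serial back to seconds since 1970, then to POSIXct.
secs <- round((excel_serial - 25569) * 86400)
back <- as.POSIXct(secs, origin = "1970-01-01", tz = "GMT")
back_text <- format(back, "%Y-%m-%d %H:%M:%S")
```

Here back_text should match the date-time that the as.POSIXct call in the question returns.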
I have downloaded the SP500 data from Yahoo Finance ticker GSPC and am trying to filter it by year, however the Date column is stored as Factor so R can't filter it. Can anyone help me convert it? I tried multiple solutions, but nothing worked.
So far I've loaded the lubridate package and used the following code, but all the values just got replaced with NAs.
as.Date(SP500$Date, format = "%m-%d-%Y")
Then I used SP500$Date <- ymd(SP500$Date, format = "%Y-%m-%d") and again nothing happened. (SP500 is the name of the data frame that I stored the data in.)
I also tried using just SP500$Date <- as.Date(SP500$Date), but R says it does not know how to convert it to Date.
Any help would be much appreciated! Thank you!
Classes only exist in the environment of a programming language. What likely happened was that your data (perhaps a .csv file?) got interpreted as factor by R during reading.
Everything you're trying to do here can be accomplished using the base library in R (meaning you don't need to import anything).
If you're dealing with dates:
df$date <- as.Date(df$date, format = "%Y-%m-%d")
If you're dealing with datetimes:
df$date <- as.POSIXct(df$date, format = "%Y-%m-%d %H:%M:%S")
(obviously the specific format may vary; see list)
Occasionally, coercion in R may act finicky. The format parameter is somewhat unforgiving of errors. I personally frequently mistake - for /, or conflate "%Y-%m-%d" with "%d-%m-%Y", which makes the operation return NA instead of the dates I expected. Likewise, if the format isn't consistent in your data, instances that can't be described by the specific format you supplied will result in NAs.
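Sometimes your dates are actually integers. If the integer is just a date written without separators (e.g. 20181111), convert it to character and parse it with format = "%Y%m%d". If it is a count of days since the epoch (e.g. 17846), you may need to supply '1970-01-01' to the origin parameter of as.Date(). The second case comes up, for example, if you are iterating through a vector of Dates using a for loop: R won't honour the class of the passed Dates and will convert them to plain numbers.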
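A small sketch of both integer cases and of the for loop behaviour, using made-up values and variable names:

```r
# An integer that is really a date written without separators:
# go through character and parse it with a matching format.
d1 <- as.Date(as.character(20181111), format = "%Y%m%d")

# A count of days since 1970-01-01: supply the origin.
d2 <- as.Date(17846, origin = "1970-01-01")

dates <- as.Date(c("2018-11-11", "2018-11-12"))

# Looping over the vector directly hands the loop body bare numbers.
loop_classes <- character(0)
for (d in dates) loop_classes <- c(loop_classes, class(d))

# Looping over the indices keeps the Date class.
index_classes <- character(0)
for (i in seq_along(dates)) index_classes <- c(index_classes, class(dates[i]))
```

Both d1 and d2 come out as the same Date, and comparing loop_classes with index_classes shows where the class is lost.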
It may sound like a bandaid solution, but class coercions from common types like character are usually written well; I often pre-emptively coerce the object to character when I'm clueless about why my attempt to coerce a class failed.
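Applied to the question, the coerce-to-character-first approach might look like this. The data frame below is a made-up stand-in for the downloaded file (the Close values are placeholders), and it assumes the file stores dates as year-month-day.

```r
# Made-up stand-in for the downloaded data; the values are placeholders.
SP500 <- data.frame(Date = factor(c("2019-12-31", "2020-01-02", "2020-01-03")),
                    Close = c(100, 101, 102))

# Go through character first, then parse with the format the file actually uses.
SP500$Date <- as.Date(as.character(SP500$Date), format = "%Y-%m-%d")

# Once the column is a real Date, filtering by year is straightforward.
SP500_2020 <- SP500[format(SP500$Date, "%Y") == "2020", ]
```

If the result is all NA, the format string most likely does not match the text in the column, so it is worth printing a few raw values first.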
I have a spreadsheet in which one of the columns is a date. When importing that spreadsheet into R, most of the columns have the right information, but the date column has the row number instead of the date. I'm using openxlsx. Any idea what the problem is?
Try loading your data with the readxl package. It loads very fast and keeps most data in the right format. Otherwise, you could try XLConnect, which is slower but more versatile.
Is this happening, by any chance?
as.numeric(as.Date("29.3.2016", format = "%d.%m.%Y"))
[1] 16889
If yes, then be amazed at this.
diff(as.Date(c("29.3.2016", "1.1.1970"), format = "%d.%m.%Y"))
Time difference of -16889 days
What is going on? Each date has an origin, and by default it is set to that wonderful day of January 1, 1970. If you coerce a date to numeric, the result is the difference in days from the origin. See how R handles dates.
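Going the other way is a matter of supplying the right origin. A hedged sketch: the first conversion reverses the example above; the second assumes the number came straight from Excel (1900 date system), for which the R documentation for as.Date suggests the origin 1899-12-30. The serial 42458 is a made-up example value. If I recall correctly, openxlsx also offers a detectDates argument in read.xlsx() and a convertToDate() helper, so check its documentation before converting by hand.

```r
# Date -> number of days since 1970-01-01, and back again.
n <- as.numeric(as.Date("29.3.2016", format = "%d.%m.%Y"))
restored <- as.Date(n, origin = "1970-01-01")

# A raw Excel serial number (assumed 1900 date system) uses a different origin.
excel_serial <- 42458
from_excel <- as.Date(excel_serial, origin = "1899-12-30")
```

Both restored and from_excel describe the same calendar day, which shows that only the origin differs between the two number systems.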
I currently am very new with R and am working with stock data. I am trying to set up a date and closing price dataset with 3 different stocks. I have merged all 3 stocks by date into one dataset, but now I have no clue how to get R to recognize my column "Date" as actual dates, instead of numerals. I need to plot date by price for these stocks. I have dabbled with as.Date() but I think that the necessary format for this command is 01/01/15, whereas the format I have for my data is in 1/1/15. Long story short, I cannot change the format in Excel then import it back over, so I am currently stuck with 1/1/15 format and unable to get R to recognize my data as dates. Any help would be greatly appreciated!
Sorry for wall of text.
So, the format it expects (assuming that's 1 January 2015?) is "2015-01-01" or similar. You can use base R's tools, but they're more painful for you as a user than, say, lubridate, a package designed for working with dates that includes a function for day-month-year strings (if the first number is actually the month, use mdy() instead of dmy()):
install.packages("lubridate")
library(lubridate)
day <- "1/1/15"
as.Date(dmy(day))
[1] "2015-01-01"
Give that a whirl, see if it works for you.
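If installing a package is not an option, base R can read this format too, since as.Date() accepts day and month numbers without leading zeros when given a matching format string. A minimal sketch, with the caveat that only you know whether the data is month/day or day/month; the second string is a made-up value chosen to show the difference.

```r
day <- "1/1/15"

# %y is the two-digit year; %m and %d accept "1" as well as "01".
d_mdy <- as.Date(day, format = "%m/%d/%y")   # read as month/day/year
d_dmy <- as.Date(day, format = "%d/%m/%y")   # read as day/month/year

# The two readings only differ when day and month are not the same number.
ambiguous <- "3/4/15"
as_march <- as.Date(ambiguous, format = "%m/%d/%y")
as_april <- as.Date(ambiguous, format = "%d/%m/%y")
```

Applied to a column, the same call converts every value at once, after which plotting date against price works as usual.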
I have a vector of times in R, all_symbol$Time, and I am trying to find out how to get JUST the times (or convert the times to strings without losing information). I use
strptime(all_symbol$Time[j], format="%H:%M:%S")
which for some reason assumes the date is today and returns
[1] "2013-10-18 09:34:16"
Date and time formatting in R is quite annoying. I am trying to get the time only without adding too many packages (really any--I am on a school computer where I cannot install libraries).
Once you use strptime you will of necessity get a date-time object and the default behavior for no date in the format string is to assume today's date. If you don't like that you will need to prepend a string that is the date of your choice.
@James' suggestion is equivalent to what I was going to suggest:
format(all_symbol$Time[j], format="%H:%M:%S")
The only package I know of that has time classes (i.e time of day with no associated date value) is package:chron. However I find that using format as a way to output character values from POSIXt objects lends itself well to functions that require factor input.
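Both points can be sketched in base R with a literal time string standing in for all_symbol$Time[j]. The fixed date 1970-01-01 and the UTC time zone are arbitrary choices for illustration.

```r
# Parsing a bare time fills in today's date...
parsed <- strptime("09:34:16", format = "%H:%M:%S")

# ...but formatting the result gives back just the time as a character string.
time_only <- format(parsed, "%H:%M:%S")

# To control the date, prepend one of your choice before parsing.
fixed <- strptime(paste("1970-01-01", "09:34:16"),
                  format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
fixed_date <- format(fixed, "%Y-%m-%d")
```

Pinning every time to the same date also makes the values directly comparable with < and >.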
In the decade since this was written there is now a package named “hms” that has some sort of facility for hours, minutes, and seconds.
hms: Pretty Time of Day
Implements an S3 class for storing and formatting time-of-day values, based on the 'difftime' class.
I came across the same problem recently and found this and other posts, such as "R: How to handle times without dates?", inspiring. I'd like to contribute a little for whoever has similar questions.
If you only want to use base R, take advantage of as.POSIXct(..., format = "...") to transform your date-time into a standard format (as.Date() would drop the time part). Then, you can use substr to extract the time, e.g. substr("2013-10-01 01:23:45 UTC", 12, 16) gives you 01:23.
If you can use package lubridate, functions like mdy_hms will make life much easier. And substr works most of the time.
If you want to compare the time, it should work if they are in Date or POSIXt objects. If you only want the time part, maybe force it into numeric (you may need to transform it back later). e.g. as.numeric(hm("00:01")) gives 60, which means it's 60 seconds after 00:00:00. as.numeric(hm("23:59")) will give 86340.
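Without lubridate, the same seconds-after-midnight number can be computed from the fields of a POSIXlt object. A minimal sketch with made-up timestamps, assuming UTC:

```r
# Two made-up timestamps on different days.
x <- as.POSIXlt(c("2013-10-01 00:01:00", "2013-10-02 23:59:00"), tz = "UTC")

# POSIXlt exposes hour, min and sec as list components.
secs_after_midnight <- x$hour * 3600 + x$min * 60 + x$sec

# Comparing these numbers compares the time of day and ignores the date.
later_in_day <- secs_after_midnight[2] > secs_after_midnight[1]
```

The two values should match what as.numeric(hm(...)) returns for the same times of day.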