How can I change the datatype in my dataframe from chr to time, without adding dates to it?
My dataframe is called exit_count and my column is called time.
When I run head(exit_count), the table looks the way it is supposed to; however, when I run str(exit_count), I can see that my time column is stored as chr.
Programming in R:
I have tried using the as.POSIXct():
exit_count$time <- as.POSIXct(exit_count$time,format="%H:%M")
However, this adds today's date, which messes up my dataset (I have a separate column for the date).
I have tried the chron-package:
chron(times = exit_count$time, format="%H:%M")
however, it still doesn't work.
exit_count <- read.csv("Data/Exit count/exit_count_all_data.csv",
sep=";", dec=".", header=TRUE, stringsAsFactors=FALSE)
exit_count$date <- as.Date(as.POSIXct(exit_count$date, format="%d.%m.%Y", tz="GMT")) #This works
exit_count$time <- chron(times = exit_count$time,format="%H:%M") #However, neither this
exit_count$time <- as.POSIXct(exit_count$time,format="%H:%M") #Nor this
exit_count$time <- times(paste0(exit_count$date, "%H:%M")) #Nor does this work (this was something that @Sotos wanted me to try, but it still doesn't work).
exit_count$time <- chron(times=exit_count$time) #Nor this. I do get an output here, but it yields big numbers (e.g. 17911.93), which is not what I want.
I need for this to work in order to get my data frames and my plots to be correct for my masters thesis.
I am no expert, but I imagine it can be done in such a way that the date and the time are in two separate columns, with the time stored in a datatype that R understands as time (like it does with POSIX and Date, which I have already managed to set up).
Please help! Thanks :)
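One possible way to get a time-of-day column with no date attached, sketched in base R only: parse the "HH:MM" strings into a difftime, which stores minutes since midnight. The vector time_chr is a made-up stand-in for exit_count$time, and this is an assumption about what would suit the data, not the accepted solution from the linked duplicates.

```r
# Made-up sample standing in for exit_count$time (character "HH:MM" values)
time_chr <- c("08:30", "17:45", "00:05")

# as.difftime() parses the strings and returns the offset from midnight;
# no calendar date is stored in the result
time_dt <- as.difftime(time_chr, format = "%H:%M", units = "mins")

class(time_dt)       # "difftime"
as.numeric(time_dt)  # 510 1065 5 (minutes since midnight)

# Turn the minutes back into "HH:MM" labels, e.g. for plot axes
mins <- as.integer(round(as.numeric(time_dt)))
time_lbl <- sprintf("%02d:%02d", mins %/% 60L, mins %% 60L)
time_lbl
```

A difftime sorts and subtracts like a number, so it can sit next to a separate Date column without either column affecting the other.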
Related
I have data below for work hours which I need to compare - start and stop with date and time. I first extract the time portion of each as start and stop variables, then use the chron package to change them from factor data to something I can compare more easily.
require(chron)
eg_data3 <- data.frame(
id = c('42', '42', '42', '42', '42'),
time_in = as.factor(c('11/5/2017 13:52', '11/4/2017 14:25', '11/5/2017 15:30', '11/5/2017 17:10', '11/6/2017 18:20')),
time_out = as.factor(c('11/5/2017 13:59', '11/4/2017 14:59', '11/5/2017 16:00', '11/5/2017 17:45', '11/6/2017 18:50')))
eg_data3$start_time <- substring(strptime(eg_data3$time_in, format = "%m/%d/%Y %H:%M"),12,19)
eg_data3$end_time <- substring(strptime(eg_data3$time_out, format = "%m/%d/%Y %H:%M"),12,19)
eg_data3$end_time <- chron(times = eg_data3$end_time)
eg_data3$start_time <- chron(times = eg_data3$start_time)
Next, I generate another variable that holds the gap between consecutive shifts, i.e. the difference between the stop time in row 1 and the start time in row 2.
require(dplyr)
eg_data3 <- eg_data3 %>% group_by(id) %>% mutate(diff_outX0_inX1 = start_time - lag(end_time))
When I do this, the variable is formatted as a decimal. I cannot for the life of me get it to display as hh:mm:ss. I have tried specifying out.format as hh:mm:ss in chron, changing time_in / time_out to numeric and character before and after extraction and applying chron(times), changing the format of the diff_ variable after, etc.
What seems like a very simple question -
How do I get the result comparison (diff_outX0_inX1) variable to display as time, either hh:mm or hh:mm:ss ?? I know the formula to convert fractional days into minutes in Excel, but I'd prefer to not write out a two step function, I assume it's a simple formatting issue.
Any help is appreciated.
EDIT - got flagged as a duplicate...OK. I asked if there was a way to do this that did not involve writing a function. The answer that was linked involves a function. The first comment provided a clean, simple answer. I can reproduce the answer in the comment; I could not reproduce the function myself, so it was not nearly as helpful. I also added another solution that does not require dplyr. Nowhere I looked online showed me something as simple as "just format the result with chron."
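For readers who hit the same decimal output: a chron time difference is a fraction of a day, so it can also be displayed as hh:mm:ss with base R formatting. The sketch below uses made-up fractions in gap_days in place of diff_outX0_inX1 and assumes every gap is non-negative and shorter than 24 hours; it is an alternative to formatting the result with chron, not the comment's answer.

```r
# Made-up gaps as fractions of a day (1.5 hours and 31 minutes)
gap_days <- c(0.0625, 31 / 1440)

# Convert to whole seconds; round() guards against floating-point error
# such as 1859.9999 being truncated to 00:30:59
gap_secs <- round(gap_days * 86400)

# Borrow POSIXct formatting in UTC to print the seconds as hh:mm:ss
gap_hms <- format(as.POSIXct(gap_secs, origin = "1970-01-01", tz = "UTC"),
                  "%H:%M:%S")
gap_hms  # "01:30:00" "00:31:00"
```

Negative gaps or gaps of a day or more would wrap around with this approach, so they would need separate handling.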
I am relatively new to R and I have a dataset in which I am trying to convert a date and time into a numeric value. The date and time are in the format 01JUN17:00:00:00 under a variable called pickup_datetime. I have tried using the code
cab_small_sample$pickup_datetime <- as.numeric(as.Date(cab_small_sample$pickup_datetime, format = '%d%b%y'))
but this doesn't incorporate the time. I tried adding the time specifiers to the format string, but that still did not work. Is there an R function that will convert the data into a numeric value?
R has two main time classes: "Date" and "POSIXct". POSIXct is a datetime class and you can get all the gory details at ?DateTimeClasses. The help page for the format specifiers used when parsing input, however, is ?strptime.
cab_small_sample <- data.frame(pickup_datetime = "01JUN17:00:00:00")
cab_small_sample$pickup_dt <- as.numeric(as.POSIXct(cab_small_sample$pickup_datetime,
format = '%d%b%y:%H:%M:%S'))
cab_small_sample
# pickup_datetime pickup_dt
#1 01JUN17:00:00:00 1496300400 # seconds since 1970-01-01
I find that a "destructive reassignment of values" is generally a bad idea so as a "my (best?) practice rule" I don't assign to the same column until I'm sure I have the code working properly. (And I always leave an untouched copy somewhere safe.)
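A small variation on the conversion above, as a sketch: passing tz explicitly makes the numeric value independent of the machine's time zone, and converting back is a quick check that nothing was lost. The tz = "UTC" choice and the variable names are assumptions for illustration; the raw strings do not say which zone they were recorded in.

```r
# %b matches month abbreviations in the current locale; "C" gives English names
invisible(Sys.setlocale("LC_TIME", "C"))

pickup <- c("01JUN17:00:00:00", "01JUN17:11:00:00")

# Parse into POSIXct with an explicit time zone (assumed here to be UTC)
pickup_ct <- as.POSIXct(pickup, format = "%d%b%y:%H:%M:%S", tz = "UTC")

# Seconds since 1970-01-01 00:00:00 UTC, kept in a new variable
pickup_num <- as.numeric(pickup_ct)
pickup_num  # 1496275200 1496314800

# Round trip: numeric back to a datetime in the same zone
back <- as.POSIXct(pickup_num, origin = "1970-01-01", tz = "UTC")
format(back, "%Y-%m-%d %H:%M:%S")
```

The result goes into new variables rather than overwriting pickup, in line with the advice about not reassigning a column until the code is known to work.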
lubridate is an extremely handy package for dealing with dates. It includes a variety of functions which do the date/time parsing for you, as long as you can provide the order of components. In this case, since your data is in day-month-year-hms form, you can use the dmy_hms function.
library(lubridate)
cab_small_sample <- dplyr::tibble(
pickup_datetime = c("01JUN17:00:00:00", "01JUN17:11:00:00"))
cab_small_sample$pickup_POSIX <- dmy_hms(cab_small_sample$pickup_datetime)
I believe this may have been asked a lot, but I have data in the format below and I can't apply the existing answers to my questions (including this nearest answer - R - Stock market data from csv to xts).
Date,AX,BY,CZ
5/21/2015,817,57,22.55
5/22/2015,810.5,57.45,22.7
So the data is in the format of DATE, CLOSE of stock AX, CLOSE of stock BY, CLOSE of stock CZ. Just to specify, the date is in the format shown above that is, m/d/YYYY where months and days are flexible in digits (one or two) while year is always in the format of four digits. The file is saved as CSV.
I wanted to use this code to convert the "zoo" read data to xts.
x <- as.xts(z)
The xts and zoo vignettes aren't really newbie friendly, so I hope someone can give a little nudge.
You might look at the examples in the help page for read.zoo. You need to tell the function about the header, the date format, and the separator between values. To read the data from a text string would look like
library(xts)
z <- read.zoo(header=TRUE, format="%m/%d/%Y", sep=",",
text ="Date,AX,BY,CZ
5/21/2015,817,57,22.55
5/22/2015,810.5,57.45,22.7")
z <- as.xts(z)
To read from the file fileName, replace text=... with file="filename".
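If read.zoo complains about the index, it can help to check the date format on its own first. This base R sketch reads the same two sample rows with read.csv and parses the Date column with the format string used above; the names csv_text and prices are made up for the example.

```r
# The sample rows from the question, held in a string instead of a file
csv_text <- "Date,AX,BY,CZ
5/21/2015,817,57,22.55
5/22/2015,810.5,57.45,22.7"

prices <- read.csv(text = csv_text, header = TRUE, sep = ",",
                   stringsAsFactors = FALSE)

# %m and %d accept one- or two-digit values on input; %Y is the 4-digit year
prices$Date <- as.Date(prices$Date, format = "%m/%d/%Y")

# Any NA here means the format string does not match the file
stopifnot(!anyNA(prices$Date))
str(prices)
```

Once the dates parse cleanly here, the same format string should be safe to pass to read.zoo.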
I have a dataset with date-time strings like "{datetime:2015-07-01 09:10:00". I want to remove the text and keep both the date and the time, since as.Date returns only the date. I wrote the code below, but the second line (the one with strsplit) returns only the date-time of the first row and so overwrites all the others. I would love to get ALL my date-times, not only the first. I thought about using sapply, but I can't get it right and I get many errors. Or maybe a for loop? I am a novice in R, so I don't really know the best way to do this.
Could you help me, please? If you have another idea for the time and date format, or a simpler way to do it, that would be very welcome too.
data$`Date Time`=as.character(data$`Date Time`)
data$`Date Time`=unlist(strsplit(data[,1], split='e:'))[2]
date=substr(data$`Date Time`,0,10)
date=as.Date(date)
time=substr(data$`Date Time`,12,19)
data$Date=date
data$Time=time
Thank you very much for your help!
You could use the format argument to avoid all the strsplit:
times <- as.POSIXct(data$`Date Time`, format='{datetime:%Y-%m-%d %H:%M:%S')
(The reason for the "{datetime:" in the format is because you mentioned this is the format of your strings).
This object has both date and time in it, and then you can just store it in the dataframe as a single column of type POSIXct rather than two columns of type string e.g.
data$datetime <- times
but if you do want to store the date as a Date and the time as a string (as in your example above):
data$Date <- as.Date(times)
data$Time <- strftime(times, format='%H:%M:%S')
See ?as.Date, ?as.POSIXct, ?strptime for more details on that format argument and various conversions between date and string.
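If you would rather keep the strsplit approach, the piece that was missing is picking the second element for every row, not just the first. A minimal sketch, with a made-up vector raw in place of data$`Date Time`:

```r
# Made-up sample in the format described in the question
raw <- c("{datetime:2015-07-01 09:10:00", "{datetime:2015-07-02 18:45:30")

# strsplit() returns one character vector per input string
pieces <- strsplit(raw, "e:", fixed = TRUE)

# Take element 2 of every list entry (unlist(...)[2] only ever returns one value)
stamp <- vapply(pieces, `[`, character(1), 2)

date_part <- as.Date(substr(stamp, 1, 10))
time_part <- substr(stamp, 12, 19)
date_part
time_part  # "09:10:00" "18:45:30"
```

vapply is used instead of sapply so that the result is guaranteed to be a character vector with one entry per row.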
I need to modify this example code for using it with intraday data which I should get from here and from here. As I understand, the code in that example works well with any historical data (or not?), so my problem then boils down to a question of loading the initial data in a necessary format (I mean daily or intraday).
As I also understand from the answers to this question, it is impossible to load intraday data with getSymbols(). I tried downloading the data to my hard drive and reading it with read.csv(), but this approach didn't work either. Finally, I found a few solutions to this problem in various articles (e.g. here), but all of them seem very complicated and "artificial".
So, my question is how to load the given intraday data into the given code elegantly and correctly from programmer's point of view, without reinventing the wheel?
P.S. I am very new to analysis of time series in R and quantstrat thus if my question seems to be obscure let me know what you need to know to answer it.
I don't know how to do this without "reinventing the wheel" because I'm not aware of any existing solutions. It's pretty easy to do with a custom function though.
intradataYahoo <- function(symbol, ...) {
  # ensure xts is available
  stopifnot(require(xts))
  # construct URL
  URL <- paste0("http://chartapi.finance.yahoo.com/instrument/1.0/",
    symbol, "/chartdata;type=quote;range=1d/csv")
  # read the metadata from the top of the file and put it into a usable list
  metadata <- readLines(paste(URL, collapse=""), 17)[-1L]
  # split into name/value pairs, set the names as the first element of the
  # result and the values as the remaining elements
  metadata <- strsplit(metadata, ":")
  names(metadata) <- sub("-","_",sapply(metadata, `[`, 1))
  metadata <- lapply(metadata, function(x) strsplit(x[-1L], ",")[[1]])
  # convert GMT offset to numeric
  metadata$gmtoffset <- as.numeric(metadata$gmtoffset)
  # read data into an xts object; timestamps are in GMT, so we don't set it
  # explicitly. I would set it explicitly, but timezones are provided in
  # an ambiguous format (e.g. "CST", "EST", etc).
  Data <- as.xts(read.zoo(paste(URL, collapse=""), sep=",", header=FALSE,
    skip=17, FUN=function(i) .POSIXct(as.numeric(i))))
  # set column names and metadata (as xts attributes)
  colnames(Data) <- metadata$values[-1L]
  xtsAttributes(Data) <- metadata[c("ticker","Company_Name",
    "Exchange_Name","unit","timezone","gmtoffset")]
  Data
}
I'd consider adding something like this to quantmod, but it would need to be tested. I wrote this in under 15 minutes, so I'm sure there will be some issues.
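To see what the two parsing steps in the function do without hitting the network, here is a small offline sketch. The header lines and timestamps are invented for illustration and only mimic the name:value shape the function expects; they are not a copy of the real file.

```r
# Invented header lines in "name:value[,value...]" form (not real data)
header <- c("ticker:abc",
            "Company-Name:Example Corp",
            "gmtoffset:-18000",
            "values:Timestamp,close,high,low,open,volume")

# Split name from value, then split the value on commas
parts <- strsplit(header, ":")
meta <- lapply(parts, function(x) strsplit(x[-1L], ",")[[1]])
names(meta) <- sub("-", "_", sapply(parts, `[`, 1))
meta$gmtoffset <- as.numeric(meta$gmtoffset)

# Column names for the data: everything after the timestamp column
meta$values[-1L]

# Invented epoch-second strings, converted the same way as the FUN argument
stamps <- c("1496275200", "1496275260")
idx <- .POSIXct(as.numeric(stamps), tz = "UTC")
format(idx, "%Y-%m-%d %H:%M:%S")
```

Here tz = "UTC" is set only so the printed times do not depend on the local time zone; the function above leaves it unset.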