I'm relatively new to R. I'm running an experiment in which I record the exact time at which a pair of insects mates over 14 days under different light conditions (12:12 h light/dark, continuous light, continuous dark). The idea is to analyze the data with ANOVA, but I'm having problems with the data. I have a .csv file with three columns: light condition, date and time. The date is not required for the analysis, so I don't need it, but I'm having trouble converting the time data into a format R can work with. I've already tried read.csv(file="", stringsAsFactors = FALSE), but it doesn't work at all. I've also tried lubridate, the as.POSIX* functions and strptime(), but nothing seems to work (or maybe I'm not converting the data properly for the analysis).
Thank you in advance.
It looks like you have date and time in two different columns and for time you have only hour and minutes. You can combine them using paste and convert to date time using appropriate format.
df$DateTime <- as.POSIXct(paste(df$Dia, df$Hora),
format = "%Y-%m-%d %H:%M",tz = "UTC")
#Can also use strptime
df$DateTime <- strptime(paste(df$Dia, df$Hora),
format = "%Y-%m-%d %H:%M", tz = "UTC")
Or with lubridate
df$DateTime <- lubridate::ymd_hm(paste(df$Dia, df$Hora))
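Since the analysis only needs the time of day, one option is to turn the parsed value into decimal hours, which can then serve as a numeric response. A minimal base R sketch, assuming the columns are named Dia and Hora as in the code above and that Hora holds HH:MM strings; the sample rows and the Condicion column are made up for illustration:

```r
# Hypothetical sample rows; the real data would come from read.csv().
df <- data.frame(
  Condicion = c("LD", "LL", "DD"),
  Dia  = c("2019-05-01", "2019-05-01", "2019-05-02"),
  Hora = c("08:30", "14:15", "23:45"),
  stringsAsFactors = FALSE
)

# Combine date and time, then parse with an explicit format and time zone.
df$DateTime <- as.POSIXct(paste(df$Dia, df$Hora),
                          format = "%Y-%m-%d %H:%M", tz = "UTC")

# Time of day as decimal hours (for example, 08:30 becomes 8.5).
df$HoraDecimal <- as.numeric(format(df$DateTime, "%H")) +
  as.numeric(format(df$DateTime, "%M")) / 60

df$HoraDecimal
```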
I am trying to read an excel file into R. Among other fields, the excel file has two "date" fields, each containing both the date and time stamp in the SAME field.
Example:
StartDate 9/14/2019 10:18:59 AM
EndDate 9/18/2019 2:27:14 AM
When I used read_excel to read in the Excel file, these two columns came out very strangely in the data frame: as day counts with decimals, such as 43712.429849537039. I assumed these were days since 1970-01-01 (the origin date that popped up when I typed lubridate::origin).
data %<>%
mutate(StartDate = as.Date(StartDate, origin = "1970-01-01 UTC"))
So I tried converting this back using as.Date, but it converts it to the totally wrong date... (converts all the dates to the year 2089). Example, 2089-09-05.
Any help with this would be really appreciated! There must be a simpler way to directly read in a date-time column?!
You can use the lubridate package, it is excellent:
library(tidyverse)
df <- data.frame(StartDate =c("9/14/2019 10:18:59 AM","9/14/2019 3:18:59 PM"),
EndDate= c("9/18/2019 2:27:14 AM","9/18/2019 1:27:14 PM"))
df <- df %>% mutate(StartDate = lubridate::mdy_hms(StartDate), EndDate = lubridate::mdy_hms(EndDate))
It turns out that Excel uses a different origin date from R. Excel counts days from 1900-01-01, whereas R counts days from 1970-01-01.
When I used read_excel to read the file into a data frame, R kept Excel's day counts, which is why I got a weird date when I converted them with the 1970 origin. As soon as I used as.Date with Excel's origin in 1900 (not 1970), my dates parsed correctly! Note that the origin to pass to as.Date for Excel's 1900 date system is "1899-12-30", not "1900-01-01": Excel numbers 1900-01-01 as day 1 and counts a nonexistent 29 February 1900, which shifts the effective origin back by two days.
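As a sketch of that conversion in base R, using the serial value quoted in the question: the origin "1899-12-30" is the usual choice for Excel's 1900 date system, and treating the timestamp as UTC is an assumption, since Excel serials carry no time zone.

```r
# Serial value as read from the spreadsheet (taken from the question).
serial <- 43712.429849537039

# Date only: whole days since the 1900-system origin.
start_date <- as.Date(floor(serial), origin = "1899-12-30")

# Date and time: convert days to seconds; round() guards against
# floating-point error in the fractional day.
start_dt <- as.POSIXct(round(serial * 86400), origin = "1899-12-30", tz = "UTC")

format(start_date)
format(start_dt, "%Y-%m-%d %H:%M:%S")
```

The same call works on a whole column inside mutate(), in place of the 1970 origin used in the question.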
I have data below for work hours which I need to compare - start and stop with date and time. I first extract the time portion of each as start and stop variables, then use the chron package to change them from factor data to something I can compare more easily.
require(chron)
eg_data3 <- data.frame(
id = c('42', '42', '42', '42', '42'),
time_in = as.factor(c('11/5/2017 13:52', '11/4/2017 14:25', '11/5/2017 15:30', '11/5/2017 17:10', '11/6/2017 18:20')),
time_out = as.factor(c('11/5/2017 13:59', '11/4/2017 14:59', '11/5/2017 16:00', '11/5/2017 17:45', '11/6/2017 18:50')))
eg_data3$start_time <- substring(strptime(eg_data3$time_in, format = "%m/%d/%Y %H:%M"),12,19)
eg_data3$end_time <- substring(strptime(eg_data3$time_out, format = "%m/%d/%Y %H:%M"),12,19)
eg_data3$end_time <- chron(times = eg_data3$end_time)
eg_data3$start_time <- chron(times = eg_data3$start_time)
Next, I generate another variable which compares the difference between stop time 1, and start time 2, IE stop time in row 1 with start time in row 2, to see the gap between them.
require(dplyr)
eg_data3 <- eg_data3 %>% group_by(id) %>% mutate(diff_outX0_inX1 = start_time - lag(end_time))
When I do this, the variable is formatted as a decimal. I cannot for the life of me get it to display as hh:mm:ss. I have tried specifying out.format as hh:mm:ss in chron, changing time_in / time_out to numeric and character before and after extraction and applying chron(times), changing the format of the diff_ variable after, etc.
What seems like a very simple question -
How do I get the result comparison (diff_outX0_inX1) variable to display as time, either hh:mm or hh:mm:ss ?? I know the formula to convert fractional days into minutes in Excel, but I'd prefer to not write out a two step function, I assume it's a simple formatting issue.
Any help is appreciated.
EDIT: this got flagged as a duplicate... OK. I asked whether there was a way to do this that did not involve writing a function, and the answer that was linked involves a function. The first comment provided a clean, simple answer. I can reproduce the answer in the comment; I could not reproduce the function myself, so it was not nearly as helpful. I also added another solution that does not require dplyr. Nowhere I looked online showed me something as simple as "just format the result with chron."
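The decimal is a fraction of a day, which is how chron stores times, so a 26-minute gap shows up as roughly 0.018. One way to display it as hh:mm:ss without writing a helper function is sketched below in base R. It assumes every gap is non-negative and shorter than 24 hours, and the vector gap is a hypothetical stand-in for the diff_outX0_inX1 column computed from the sample data. Wrapping the numeric column in chron's times() should give an equivalent display.

```r
# Gaps as fractions of a day, matching the sample data above:
# 26 min, 31 min, 1 h 10 min and 35 min (the first row has no predecessor).
gap <- c(NA, 26, 31, 70, 35) / (24 * 60)

# Convert to seconds and let format() render the clock time.
# round() avoids off-by-one-second results from floating-point error.
gap_hms <- format(
  as.POSIXct(round(gap * 86400), origin = "1970-01-01", tz = "UTC"),
  "%H:%M:%S"
)
gap_hms
```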
I am relatively new to R and I have a dataset in which I am trying to convert a date and time into a numeric value. The date and time are in the format 01JUN17:00:00:00 under a variable called pickup_datetime. I have tried using the code
cab_small_sample$pickup_datetime <- as.numeric(as.Date(cab_small_sample$pickup_datetime, format = '%d%b%y'))
but this doesn't incorporate the time. I tried adding the time to the format string, but that still did not work. Is there an R function that will convert the data into a numeric value?
R has two main time classes: "Date" and "POSIXct". POSIXct is a datetime class, and you can get all the gory details at ?DateTimeClasses. The format codes used when parsing input, however, are documented at ?strptime.
cab_small_sample <- data.frame(pickup_datetime = "01JUN17:00:00:00")
cab_small_sample$pickup_dt <- as.numeric(as.POSIXct(cab_small_sample$pickup_datetime,
format = '%d%b%y:%H:%M:%S'))
cab_small_sample
# pickup_datetime pickup_dt
#1 01JUN17:00:00:00 1496300400 # seconds since 1970-01-01
I find that a "destructive reassignment of values" is generally a bad idea so as a "my (best?) practice rule" I don't assign to the same column until I'm sure I have the code working properly. (And I always leave an untouched copy somewhere safe.)
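The numeric value depends on the time zone used for parsing: the 1496300400 above corresponds to a session seven hours behind UTC. A small sketch that fixes the zone explicitly so the result is reproducible follows. Treating the timestamps as UTC is an assumption, and %b expects English month abbreviations, so the code relies on an English or C locale. It writes to new objects and leaves the original strings untouched.

```r
pickup <- c("01JUN17:00:00:00", "01JUN17:11:00:00")

# Parse with an explicit time zone instead of the session default.
pickup_ct <- as.POSIXct(pickup, format = "%d%b%y:%H:%M:%S", tz = "UTC")

# Seconds since 1970-01-01 00:00:00 UTC.
pickup_num <- as.numeric(pickup_ct)
pickup_num
```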
lubridate is an extremely handy package for dealing with dates. It includes a variety of functions which do the date/time parsing for you, as long as you can provide the order of components. In this case, since your data is in day-month-year-hms form, you can use the dmy_hms function.
library(lubridate)
cab_small_sample <- dplyr::tibble(
pickup_datetime = c("01JUN17:00:00:00", "01JUN17:11:00:00"))
cab_small_sample$pickup_POSIX <- dmy_hms(cab_small_sample$pickup_datetime)
I have a data frame containing what should be a datetime column that has been read into R. The time values are appearing as numeric time as seen in the below data example. I would like to convert these into datetime POSIXct or POSIXlt format, so that date and time can be viewed.
tdat <- c(974424L, 974430L, 974436L, 974442L, 974448L, 974454L, 974460L, 974466L, 974472L,
974478L, 974484L, 974490L, 974496L, 974502L, 974508L, 974514L, 974520L, 974526L,
974532L,974538L)
974424 should equate to 00:00:00 01/03/2011, but I do not know the origin time of the numeric values (i.e. the 1970-01-01 used below does not work). I have tried commands such as those below and have spent time trying to get as.POSIXct to work, but I haven't found a solution (i.e. I either end up with a POSIXct object of NAs or with obscure datetime values).
Attempts to convert numeric time to datetime:
datetime <- as.POSIXct(strptime(time, format = "%d/%m/%Y %H:%M:%S"))
datetime <- as.POSIXct(as.numeric(time), origin='1970-01-01')
I am sure that this is a simple thing to do. Any help would be greatly received. Thanks!
Try one of these depending on which time zone you want:
t.gmt <- as.POSIXct(3600 * (tdat - 974424), origin = '2011-03-01', tz = "GMT")
t.local <- as.POSIXct(format(t.gmt))
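The offset used here can be checked independently: 974424 hours is exactly 40601 days, and 40601 days before 2011-03-01 is 1900-01-01, so the values look like hours since 1900-01-01 00:00. That origin is an inference from the arithmetic, not something stated in the question. A sketch comparing both routes on the first few values:

```r
tdat <- c(974424L, 974430L, 974436L, 974442L)

# Route from the answer: hours relative to the known first timestamp.
t_rel <- as.POSIXct(3600 * (tdat - 974424), origin = "2011-03-01", tz = "GMT")

# Inferred route: hours since 1900-01-01 (an assumption, see above).
t_abs <- as.POSIXct(3600 * tdat, origin = "1900-01-01", tz = "GMT")

format(t_abs, "%Y-%m-%d %H:%M")
all(t_rel == t_abs)
```

If the inferred origin holds for the whole file, the second form avoids hard-coding the first value of the series.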
I am working with data from csv files that will all look the same so I am hoping to come up with a code that can be easily applied to all of them.
However, sadly enough I am failing at step one :-(.
The csv files have the date and time saved in one column, so when I import them with read.csv that column gets read as a chr. How can I most easily convert this into a date that I then can use for plotting and analysis?
Here is what I tried:
load the data --> will save the date and time as chr under mydata$Date.Time (e.g. 1/1/15 0:00)
mydata<-read.csv(file.choose(), stringsAsFactors = FALSE,
strip.white = TRUE,
na.strings = c("NA",""), skip=16,
header=TRUE)
separate the Date.Time into Date and Time:
new <- do.call( rbind , strsplit( as.character( mydata$Date.Time ) , " " ) )
add these two back to the df mydata:
cbind( mydata , Date = new[,2] , Time = new[,1] )
convert Date into a date format via as.Date:
mydata$Date <- as.Date(new[,1], format="")
So this works fine for the date however I am stuck with the time, I tried this:
mydata$Time <- format(as.POSIXct(new[,2], format="%H:%M"))
this gives me the following error:
Error in as.POSIXlt.character(x, tz, ...) :
character string is not in a standard unambiguous format
I wonder if there is a smarter way of doing this? Reading in dates and times seems to be one of the fundamental tasks that I would like to understand. Is there a way for R to recognize the date and time directly from the csv? Or is it generally smarter to generate a time vector separately, and if so, how would I do that?
Thanks so much for your help.
Sandra
If you want to use time only, consider using the chron package:
library(chron)
mytime <- times("21:19:37")
or in your case
times(new[,2])
assuming that that's a character vector.
I tried the chron approach but it wouldn't work for me :-(.
So what I ended up doing is just creating a time vector for the period that I am loading the data in for:
date <-seq(as.POSIXct("2015/1/1 00:00"), as.POSIXct("2015/1/31 23:00"), "hours")
and then adding it back to the df.
Not what I wanted but it will work until I find the ultimate solution :-)
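A combined date-time column can usually be parsed in one step, without splitting it first. A minimal sketch, assuming the strings look like the 1/1/15 0:00 example, that the order is month/day/two-digit year (the sample is ambiguous, so swap %m and %d if it is day first), and that UTC is an acceptable time zone; the sample values are hypothetical:

```r
# Hypothetical stand-in for the imported column mydata$Date.Time.
mydata <- data.frame(
  Date.Time = c("1/1/15 0:00", "1/1/15 1:00", "1/31/15 23:00"),
  stringsAsFactors = FALSE
)

# Parse the whole string at once with a format that matches it exactly.
mydata$DateTime <- as.POSIXct(mydata$Date.Time,
                              format = "%m/%d/%y %H:%M", tz = "UTC")

# Date and time of day can then be derived from the parsed column.
mydata$Date <- as.Date(mydata$DateTime)
mydata$Time <- format(mydata$DateTime, "%H:%M")
```

Unlike a generated sequence, this keeps each timestamp tied to its own row, so gaps or duplicates in the file stay visible.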