I would like to split a date-time that I have in R into a separate date and time. At the moment the column is of class POSIXct.
An example is given here:
"2019-03-29 20:42:07"
I want the date to be in one column and the time of that date in a corresponding column. I have found something similar here, but it doesn't answer my question.
Many thanks
If the column is of class POSIXct, create two new columns: coerce to Date (as.Date) for the date part, and use format for the time part.
# tz = "" makes as.Date use the local time zone, so the date matches the printed datetime
df1 <- transform(df1, date = as.Date(datetime, tz = ""), time = format(datetime, "%T"))
df1
# datetime date time
#1 2019-03-29 20:42:07 2019-03-29 20:42:07
data
df1 <- structure(list(datetime = structure(1553910127, class = c("POSIXct",
"POSIXt"), tzone = "")), class = "data.frame", row.names = c(NA,
-1L))
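One point worth checking when splitting a POSIXct value: as.Date takes a tz argument, and the calendar date it returns depends on the zone in which the instant is interpreted. For instants near midnight the UTC date and the local date can differ. A minimal sketch, where the America/Chicago zone is only an assumption chosen so that the example timestamp prints as 20:42:07:

```r
# The same instant as in the example data: 2019-03-30 01:42:07 UTC.
# "America/Chicago" is an assumed zone, used only for illustration.
x <- as.POSIXct(1553910127, origin = "1970-01-01", tz = "America/Chicago")

# Wall-clock time as printed in the assumed zone
local_time <- format(x, "%Y-%m-%d %H:%M:%S")

# Calendar date of the same instant, interpreted in two different zones
date_utc   <- as.Date(x, tz = "UTC")
date_local <- as.Date(x, tz = "America/Chicago")

print(local_time)
print(date_utc)
print(date_local)
```

Passing tz explicitly keeps the date column consistent with the time column produced by format.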
Related
I have a column in a dataframe that is the number of seconds past midnight. How would I go about converting that number to a time displayed as hh:mm:ss? For instance:
hrsecs
1563
13088
14309
becomes
Time
00:26:03
03:38:08
03:58:29
Convert the seconds to a period (seconds_to_period) and use hms from the hms package
library(lubridate)
library(dplyr)
df1 <- df1 %>%
transmute(Time = hms::hms(seconds_to_period(hrsecs)))
-output
df1
Time
1 00:26:03
2 03:38:08
3 03:58:29
data
df1 <- structure(list(hrsecs = c(1563L, 13088L, 14309L)),
class = "data.frame", row.names = c(NA,
-3L))
1) character output: Convert to POSIXct and then format. No packages are used.
x <- c(1563, 13088, 14309)
tt <- format(as.POSIXct("1970-01-01") + x, "%T"); tt
## [1] "00:26:03" "03:38:08" "03:58:29"
or
tt <- format(structure(x, class = c("POSIXct", "POSIXt"), tzone = "UTC"), "%T")
tt
## [1] "00:26:03" "03:38:08" "03:58:29"
2) times class output: If you want to be able to manipulate the times, this will express them internally as fractions of a day but render them as times.
library(chron)
times(tt)
## [1] 00:26:03 03:38:08 03:58:29
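The conversion can also be done with plain integer arithmetic, without any date-time class. This is only a sketch; the helper name secs_to_hms is hypothetical, and it assumes non-negative whole seconds below 24 hours.

```r
# Hypothetical helper: seconds past midnight -> "hh:mm:ss" character vector.
# Assumes non-negative whole numbers of seconds smaller than 86400.
secs_to_hms <- function(secs) {
  secs <- as.integer(secs)
  h <- secs %/% 3600L            # whole hours
  m <- (secs %% 3600L) %/% 60L   # whole minutes left after the hours
  s <- secs %% 60L               # remaining seconds
  sprintf("%02d:%02d:%02d", h, m, s)
}

hrsecs <- c(1563, 13088, 14309)
tt <- secs_to_hms(hrsecs)
print(tt)
```

The result is a character vector, so it suits display; for arithmetic on the times a dedicated class such as the ones shown above is the better fit.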
I have two dataframes: DF1 contains a monthly snapshot of data, whereas DF2 contains a particular date. I want to retrieve only the row of DF1 whose date is the closest one on or before (<=) the date in DF2.
DF1
Account   Date
A1000001  1-JAN-2021
A1000002  1-FEB-2021
A1000003  1-MAR-2021
A1000004  1-APR-2021
DF2
Date
15-MAR-2021
Output Expected:
Account   Date
A1000003  1-MAR-2021
Change the dates to the actual Date class; then, using sapply, you can find the closest date in df1 for each date in df2.
df1$Date <- as.Date(df1$Date, '%d-%b-%Y')
df2$Date <- as.Date(df2$Date, '%d-%b-%Y')
result <- df1[sapply(df2$Date, function(x) which.min(abs(df1$Date - x))), ]
result
# Account Date
#3 A1000003 2021-03-01
data
It is easier to help if you provide data in a reproducible format
df1 <- structure(list(Account = c("A1000001", "A1000002", "A1000003",
"A1000004"), Date = c("1-JAN-2021", "1-FEB-2021", "1-MAR-2021",
"1-APR-2021")), row.names = c(NA, -4L), class = "data.frame")
df2 <- structure(list(Date = "15-MAR-2021"), row.names = c(NA, -1L),
class = "data.frame")
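Note that which.min(abs(...)) returns the nearest date in either direction. If only dates on or before the target should qualify, as the (<=) in the question suggests, a sketch along the following lines may help. The helper name latest_on_or_before and the second target date (2021-03-20) are hypothetical additions for illustration, and the dates are written in ISO format to avoid locale-dependent month parsing.

```r
df1 <- data.frame(
  Account = c("A1000001", "A1000002", "A1000003", "A1000004"),
  Date = as.Date(c("2021-01-01", "2021-02-01", "2021-03-01", "2021-04-01"))
)
# Second row is an extra, hypothetical target that is closer to April than to March
df2 <- data.frame(Date = as.Date(c("2021-03-15", "2021-03-20")))

# Hypothetical helper: index of the latest entry in `dates` that is <= `target`,
# or NA when no entry qualifies.
latest_on_or_before <- function(target, dates) {
  ok <- which(dates <= target)
  if (length(ok) == 0L) return(NA_integer_)
  ok[which.max(as.numeric(dates[ok]))]
}

idx <- vapply(
  seq_along(df2$Date),
  function(i) latest_on_or_before(df2$Date[i], df1$Date),
  integer(1)
)
result <- df1[idx, ]
print(result)
```

Both targets resolve to the 1 March row here, whereas an absolute-difference search would pick 1 April for the second one.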
I have 2 dataframes:
df1<-structure(list(id = c(1, 2), date = structure(c(1483636800, 1485192000
), class = c("POSIXct", "POSIXt"), tzone = "")), class = "data.frame", row.names = c(NA,
-2L))
df2<-structure(list(id.1 = structure(1:3, .Label = c("A", "B", "C"
), class = "factor"), sunrise = structure(c(1483617946, 1485198384,
1485205584), class = c("POSIXct", "POSIXt"), tzone = "")), class = "data.frame", row.names = c(NA,
-3L))
df1:
id date
1 1 2017-01-05 12:20:00
2 2 2017-01-23 12:20:00
df2:
id.1 sunrise
1 A 2017-01-05 07:05:46
2 B 2017-01-23 14:06:24
3 C 2017-01-23 16:06:24
I would like to find the closest sunrise date in df2 to each of the dates in df1, and then calculate the time difference (in hours) and put these values in a new column "closest" in df1. The value should be the time until sunrise, so a negative value indicates a time after sunrise and a positive value a time before sunrise.
My df1 is very large (> 100 million rows), so it's important that the solution is fast and memory-efficient. The solution I found cannot accommodate the size of my data set, not even on a single column (it returns "Error: vector memory exhausted (limit reached?)"). Attempts to remedy this issue have so far been unsuccessful.
library(data.table)
sDTA <- data.table(df1)[2]
rDTB <- data.table(df2)
try1 <- sDTA[, closest := rDTB[sDTA, on = .(sunrise = date), roll = "nearest", x.sunrise]][]
try1$Time_until_sunrise <- difftime(try1$closest, try1$date, units = "hours")
I previously encountered exhausted memory issues while trying to use aggregate() functions, but was able to 'repair' this by replacing these with those that used dplyr. Perhaps this could be a solution again.
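One possible direction, offered only as a sketch and not benchmarked on data of that size, is a vectorized nearest-neighbour lookup in base R with findInterval on the sorted sunrise times. It works on plain numeric vectors and does not build a join result. The helper name nearest_index is hypothetical, and the timestamps are rebuilt in UTC purely so the example is reproducible.

```r
df1 <- data.frame(
  id = c(1, 2),
  date = as.POSIXct(c(1483636800, 1485192000), origin = "1970-01-01", tz = "UTC")
)
df2 <- data.frame(
  id.1 = c("A", "B", "C"),
  sunrise = as.POSIXct(c(1483617946, 1485198384, 1485205584),
                       origin = "1970-01-01", tz = "UTC")
)

# Hypothetical helper: for each x, the index of the nearest value in `sorted`
# (`sorted` must be in increasing order; ties go to the earlier value).
nearest_index <- function(x, sorted) {
  n <- length(sorted)
  lo <- findInterval(x, sorted)   # position of the last value <= x (0 if none)
  lo <- pmin(pmax(lo, 1L), n)     # clamp to a valid index
  hi <- pmin(lo + 1L, n)          # candidate on the other side
  ifelse(abs(sorted[hi] - x) < abs(x - sorted[lo]), hi, lo)
}

sun <- sort(as.numeric(df2$sunrise))
d   <- as.numeric(df1$date)
idx <- nearest_index(d, sun)

# Hours until the nearest sunrise: negative after sunrise, positive before
df1$closest <- (sun[idx] - d) / 3600
print(df1)
```

For very large inputs the same function could be applied to chunks of df1$date, since each row is handled independently.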
Can we subtract a time from a column of times stored as character (a lag of a specific amount)? I want to reduce each value by 05:30:00
df
Date
12:48:36
12:48:37
13:48:36
Required data frame
df
Date
07:48:36
07:48:37
08:48:36
df <- structure(list(Date = structure(1:3, .Label = c("12:48:36", "12:48:37",
"13:48:36"), class = "factor")), class = "data.frame", row.names = c(NA,
-3L))
You could use as.ITime from data.table
library(data.table)
setDT(df)
df[, Date := as.ITime(Date) - as.ITime('05:00:00')]
df
# Date
# 1: 07:48:36
# 2: 07:48:37
# 3: 08:48:36
Edit: If you have stored Date as a factor (as in this example) you need to convert to character first
df[, Date := as.character(Date)]
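For comparison, a base R sketch of the same idea without data.table: attach the times to an arbitrary date, subtract the offset in seconds, and format back to a time of day. The variable names and the dummy date 1970-01-02 are assumptions made for illustration; the five-hour offset matches the code and the required output above.

```r
# Times stored as a factor, as in the example data
times <- factor(c("12:48:36", "12:48:37", "13:48:36"))

# Offset in seconds (assumed here to be five hours)
offset_secs <- 5 * 3600

# Attach an arbitrary dummy date so the strings can be parsed as date-times;
# as.character() is needed because the input is a factor.
parsed <- as.POSIXct(
  paste("1970-01-02", as.character(times)),
  format = "%Y-%m-%d %H:%M:%S",
  tz = "UTC"
)

# Subtract the offset and keep only the time-of-day part
shifted <- format(parsed - offset_secs, "%H:%M:%S")
print(shifted)
```

A different shift only requires changing offset_secs, for example 5.5 * 3600 for 05:30:00. Times that cross midnight wrap around to the previous day, which this sketch silently drops.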
I have a revision data frame with 3 columns:
revisionTime
date
value
For instance, here is a sample, but mine is very long (several hundred thousand rows)
df = structure(list(revisionTime = structure(c(1471417781, 1471417781,
1471417781, 1473978576, 1473978576, 1473978576), class = c("POSIXct",
"POSIXt"), tzone = ""), date = structure(c(1464652800, 1467244800,
1469923200, 1456704000, 1467244800, 1472601600), class = c("POSIXct",
"POSIXt"), tzone = ""), value = c(103.7, 104.1, 104.9, 104.414,
104.3, 104.4)), .Names = c("revisionTime", "date", "value"), row.names = 536:541, class = "data.frame")
What I need is a very fast way to extract from this data.frame the latest revisionTime for each date (and the corresponding value). There are some similar questions, but my question is more precise: is there a way to avoid loops?
Thank you
We can use data.table. Convert the 'data.frame' to a 'data.table' (setDT(df)), order by 'revisionTime' in descending order (in i), group by 'date' after converting it to Date class, and get the first row of each group with head.
library(data.table)
setDT(df)[order(-revisionTime), head(.SD, 1), .(date = as.Date(date))]
If your revisionTime is always nicely formatted (Y-m-d H:M:S), as in your example, you may not need to convert it at all. This should simply work:
aggregate(revisionTime ~ date, df, max)
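As far as I can tell, that aggregate call returns the latest revisionTime per date but not the matching value. If the value is needed as well, one loop-free base R sketch is to sort so that the latest revision comes first within each date and then keep the first row per date. The sample data is rebuilt in UTC here only for reproducibility, and the names ord and latest are hypothetical.

```r
df <- data.frame(
  revisionTime = as.POSIXct(
    c(1471417781, 1471417781, 1471417781, 1473978576, 1473978576, 1473978576),
    origin = "1970-01-01", tz = "UTC"
  ),
  date = as.POSIXct(
    c(1464652800, 1467244800, 1469923200, 1456704000, 1467244800, 1472601600),
    origin = "1970-01-01", tz = "UTC"
  ),
  value = c(103.7, 104.1, 104.9, 104.414, 104.3, 104.4)
)

# Sort by date, then by revisionTime in descending order within each date
ord <- df[order(as.numeric(df$date), -as.numeric(df$revisionTime)), ]

# Keep the first row of each date, which is now the latest revision
latest <- ord[!duplicated(ord$date), ]
print(latest)
```

Only one date in the sample has two revisions, and the row kept for it is the one from the later revisionTime.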