My data has a start and end time stamp such as this:
200401010000 200401010030
200401010030 200401010100
200401010100 200401010130 and so on...
I'm trying to convert these fields into %YYYY%MM%DD%HH%MM format using lubridate and as.POSIXct, but I get only NAs. Any help will be appreciated.
My goal is to aggregate the data for each month.
The code I've used so far is as follows:
start_time = as.POSIXct(dat$TIMESTAMP_START, format = "%YYYY%MM%DD %HH%MM",origin = "2004-01-01 00:00", tz="EDT")
stop_time = as.POSIXct(dat$TIMESTAMP_END, format = "%YYYY%MM%DD%HH%MM",origin = "2004-01-01 00:30", tz="EDT")
dat$interval <- interval(start_time, stop_time)
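Two problems I can see:
The format string is wrong. Conversion codes are a single letter each, so the pattern for your data is "%Y%m%d%H%M", not "%YYYY%MM%DD%HH%MM"; that mismatch is why you get only NAs. The origin argument isn't needed when parsing strings, and "EDT" is usually not recognized as a time zone name (something like "America/New_York" is safer).
If you're using lubridate already, you should probably use the function ymd_hm(), which is just cleaner IMO. It is vectorized, so you can pass it the whole column (which I presume dat$TIMESTAMP_START and dat$TIMESTAMP_END are) without sapply:
start_time <- ymd_hm(dat$TIMESTAMP_START)
end_time <- ymd_hm(dat$TIMESTAMP_END)
That will parse each item in your vector.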
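Since the goal is a monthly aggregate, here is a minimal base-R sketch of the whole path from compact timestamps to per-month values. The value column and the three sample rows are made up for illustration, and UTC is only an assumption about the data's time zone.

```r
# Hypothetical sample data: timestamps as yyyymmddHHMM strings plus a made-up value column
dat <- data.frame(
  TIMESTAMP_START = c("200401010000", "200401010030", "200402010000"),
  TIMESTAMP_END   = c("200401010030", "200401010100", "200402010030"),
  value           = c(1.5, 2.5, 4.0),
  stringsAsFactors = FALSE
)

# One conversion code per field: year, month, day, hour, minute
fmt <- "%Y%m%d%H%M"
dat$start_time <- as.POSIXct(dat$TIMESTAMP_START, format = fmt, tz = "UTC")  # tz is an assumption
dat$stop_time  <- as.POSIXct(dat$TIMESTAMP_END,   format = fmt, tz = "UTC")

# Label each row with its month, then average within each month
dat$month <- format(dat$start_time, "%Y-%m")
monthly <- aggregate(value ~ month, data = dat, FUN = mean)
print(monthly)
```

Swap mean for sum (or whatever statistic you need) in the aggregate() call.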
Related
I have two databases where I need to combine columns based on 2 common Date columns, with condition that the DAY for those dates are the same.
"2020/01/01 20:30" MUST MATCH "2020/01//01 17:50"
All dates are in POSIXct format.
While I could do some pre-processing with string parsing or the like, I wanted to handle it via lubridate/dplyr like:
DB_New <- left_join(DB_A,DB_B, by=c((date(Date1) = date(Date2)))
Notice I am using the function "date" from lubridate to match on the day only, as explained above. However, I am getting the error below:
DB_with_rain <- left_join(DB_FEB_2019_join,Chuvas_BH, by=c(date(Saida_Real)= date(DateTime)))
Error: unexpected '=' in "DB_with_rain <- left_join(DB_FEB_2019_join,Chuvas_BH, by=c(date(Saida_Real)="
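Within the by argument we cannot do the conversion, since it expects the column names as strings. The conversion should be done before the left_join:
library(dplyr)
# The columns are already POSIXct, so no format string is needed
DB_with_rain <- DB_FEB_2019_join %>%
  mutate(Saida_Real = as.Date(Saida_Real)) %>%
  left_join(Chuvas_BH %>%
              mutate(DateTime = as.Date(DateTime)),
            by = c(Saida_Real = "DateTime"))
Note that as.Date on a POSIXct column may take the day in UTC unless you pass tz. With lubridate, as.Date can be replaced with as_date(), which ignores the time zone attribute and keeps the displayed day. If the columns were still character strings, parse them with ymd_hm first and then convert to Date class.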
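To see the "same day" match on something runnable, here is a small sketch. It uses base merge() instead of left_join() purely to stay dependency-free (all.x = TRUE gives the left-join behaviour). The two tiny tables, the trip_id and rain_mm columns, and the helper column day are all hypothetical stand-ins for the real data.

```r
# Hypothetical stand-ins for DB_FEB_2019_join and Chuvas_BH
trips <- data.frame(
  Saida_Real = as.POSIXct(c("2020-01-01 20:30", "2020-01-02 08:15"), tz = "UTC"),
  trip_id    = c(1, 2)
)
rain <- data.frame(
  DateTime = as.POSIXct(c("2020-01-01 17:50", "2020-01-03 09:00"), tz = "UTC"),
  rain_mm  = c(4.2, 0.0)
)

# Derive a plain Date key on both sides BEFORE joining
trips$day <- as.Date(trips$Saida_Real, tz = "UTC")
rain$day  <- as.Date(rain$DateTime,  tz = "UTC")

# Left join on the day key: every trip is kept, rain is NA where no day matches
joined <- merge(trips, rain, by = "day", all.x = TRUE)
print(joined)
```

Keeping the key in a separate day column also leaves the original timestamps intact, which the mutate-in-place version above does not.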
I have to pull different data sets from the same API regularly but for different reasons, so I have to write out the code for many different pulls. I'd like to create some functions to help with this, but I need some help.
I haven't been able to figure out how to set up the function so that I can change the data set but still pull from the same column each time. In this example, I have 3 columns with timestamps that mean different things (made up in this data). I need to change the timezone here to my local time zone. The column name will remain the same in all of my datasets, but the name of the dataset will change. I have a few places in my code where I need to do this, and I haven't been able to figure it out, so any suggestions would be much appreciated!
The second section of this example code is not included in the actual code, but it is there to set the data up correctly. The data comes out of the API in the format shown as GMT.
df <- data.frame(col_1 = c(1, 2, 3, 4),
time_1 = c("2021-01-20 23:58:21", "2021-01-20 21:21:00", "2021-01-20 17:14:04", "2021-01-20 01:05:18"),
time_2 = c("2021-01-19 23:58:21", "2021-01-19 21:21:00", "2021-01-19 17:14:04", "2021-01-19 01:05:18"),
time_3 = c("2021-01-18 23:46:21", "2021-01-18 36:21:00", "2021-01-18 15:14:04", "2021-01-18 01:05:18"),
time_4 = c("2021-01-17 23:58:21", "2021-01-17 20:21:00", "2021-01-17 18:14:04", "2021-01-17 02:05:18"))
# Not part of actual code
df$time_1 <- as.POSIXlt(df$time_1, tz = "GMT")
df$time_2 <- as.POSIXlt(df$time_2, tz = "GMT")
df$time_3 <- as.POSIXlt(df$time_3, tz = "GMT")
df$time_4 <- as.POSIXlt(df$time_4, tz = "GMT")
# What I want it to do
# df$time_1 <- lubridate::with_tz(df$time_1, tz = "America/Los_Angeles")
# df$time_2 <- lubridate::with_tz(df$time_2, tz = "America/Los_Angeles")
# df$time_3 <- lubridate::with_tz(df$time_3, tz = "America/Los_Angeles")
# df$time_4 <- lubridate::with_tz(df$time_4, tz = "America/Los_Angeles")
# Attempted function
timezone_cleanup <- function(my_df){
my_df$time_1 <- lubridate::with_tz(my_df$time_1, tz = "America/Los_Angeles")
my_df$time_2 <- lubridate::with_tz(my_df$time_2, tz = "America/Los_Angeles")
my_df$time_3 <- lubridate::with_tz(my_df$time_3, tz = "America/Los_Angeles")
my_df$time_4 <- lubridate::with_tz(my_df$time_4, tz = "America/Los_Angeles")
}
# how I'd like to use this function. Not working now. Even if I wrap it with data.frame(), it's not what I wanted.
new_df <- timezone_cleanup(df)
I think you need to return my_df in your function to get the changed dataframe back. However, you can use lapply or across to apply the same function to multiple columns.
library(dplyr)
timezone_cleanup <- function(my_df){
  # The value of the last expression (the modified data frame) is returned
  my_df %>%
    mutate(across(starts_with('time'),
                  ~ lubridate::with_tz(.x, tz = "America/Los_Angeles")))
}
new_df <- timezone_cleanup(df)
By the way, I do receive a warning message while using this: Unrecognized time zone 'America/Los_Angeles'. Are you sure you are using the correct tz value?
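For the lapply route mentioned above, a base-R sketch could look like the following. It assumes every column to convert has a name starting with "time" and that the incoming strings are GMT. Setting the tzone attribute is used here as a base-R stand-in for with_tz, and the tz argument is my addition.

```r
# Sketch: convert every column whose name starts with "time" to a target zone
timezone_cleanup <- function(my_df, tz = "America/Los_Angeles") {
  time_cols <- grep("^time", names(my_df), value = TRUE)
  my_df[time_cols] <- lapply(my_df[time_cols], function(x) {
    x <- as.POSIXct(x, tz = "GMT")  # assumption: the API delivers GMT
    attr(x, "tzone") <- tz          # same instant, displayed in the target zone
    x
  })
  my_df                             # return the modified data frame
}

df <- data.frame(col_1  = c(1, 2),
                 time_1 = c("2021-01-20 23:58:21", "2021-01-20 21:21:00"),
                 time_2 = c("2021-01-19 23:58:21", "2021-01-19 21:21:00"),
                 stringsAsFactors = FALSE)

new_df <- timezone_cleanup(df)
print(new_df)
```

Because the data frame is an argument, the same function works for any pull that keeps these column names.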
In R:
How can I change the column of a data frame from yyyymmddHHMMSS to yyyy-mm-dd HH:MM:SS?
I tried
for(i in 1:nrow(tabla_eaq)){
tabla_eaq[i,'datetime'] = ymd_hms(tabla_eaq[i,'datetime'])
}
But it shows up as for example 1606514222 for input 20201127215702.
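We don't need a for loop, as ymd_hms is vectorized. The loop also explains the number you see: assigning a parsed value into a single cell of the existing column drops the date-time class and leaves only the seconds since 1970-01-01 UTC (1606514222 is 2020-11-27 21:57:02 UTC).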
library(lubridate)
tabla_eaq$datetime <- ymd_hms(tabla_eaq$datetime)
data
tabla_eaq <- data.frame(datetime = c(20201127215702, 20201127215702, 20201127215702))
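If you want to check this without lubridate, here is a rough base-R equivalent on the same data. UTC is assumed (it is also the default for ymd_hms), and the as_text helper variable is just for illustration.

```r
tabla_eaq <- data.frame(datetime = c(20201127215702, 20201127215702, 20201127215702))

# Turn the numbers into plain digit strings (no scientific notation)
as_text <- sprintf("%.0f", tabla_eaq$datetime)

# Parse the whole column in one call; no loop needed
tabla_eaq$datetime <- as.POSIXct(as_text, format = "%Y%m%d%H%M%S", tz = "UTC")

format(tabla_eaq$datetime[1], "%Y-%m-%d %H:%M:%S")  # "2020-11-27 21:57:02"
as.numeric(tabla_eaq$datetime[1])                   # 1606514222, the number from the loop
```

Replacing the whole column at once keeps the POSIXct class, which is what the cell-by-cell loop loses.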
I am trying to convert a string column to a date-time series.
The rows in the column look like this example: "2019-02-27T19:08:29+000"
(dateTime is the column, the variable)
mutate(df,dateTime=as.Date(dateTime, format = "%Y-%m-%dT%H:%M:%S+0000"))
But the result is:
2019-02-27
What about the hours, minutes and seconds?
I need it to apply a filter by date-time
Your code is almost correct. Just the extra 0 and the as.Date command were wrong:
library("dplyr")
df <- data.frame(dateTime = "2019-02-27T19:08:29+000",
stringsAsFactors = FALSE)
mutate(df, dateTime = as.POSIXct(dateTime, format = "%Y-%m-%dT%H:%M:%S+000"))
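Since the end goal is filtering by date-time, here is a small base-R sketch of that step. The two extra rows and the cutoff are made up, and tz = "UTC" is an assumption that the "+000" suffix means a zero offset.

```r
# Hypothetical sample: the original row plus two made-up ones
df <- data.frame(dateTime = c("2019-02-27T19:08:29+000",
                              "2019-02-27T07:15:00+000",
                              "2019-02-28T01:00:00+000"),
                 stringsAsFactors = FALSE)

# as.POSIXct keeps the time of day; as.Date would keep only the calendar day
df$dateTime <- as.POSIXct(df$dateTime,
                          format = "%Y-%m-%dT%H:%M:%S+000",
                          tz = "UTC")  # assumption: "+000" is a zero offset

# Keep rows at or after a date-time cutoff (made-up value)
cutoff <- as.POSIXct("2019-02-27 12:00:00", tz = "UTC")
after_cutoff <- df[df$dateTime >= cutoff, , drop = FALSE]
print(after_cutoff)
```

Comparing POSIXct values with >= works on the full instant, so the hours, minutes and seconds take part in the filter.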
I'm trying to write a script that will scan a table for times, and those that happen to be past 6pm will be changed to be 6am of the following day. I have tried using the lubridate package (ymd_hms), but the problem is that it forces me to specify a date (I would like to just use the current system date).
I am kind of new to R (and programming in general) so I'm having trouble wrapping my head around how factors, variables and all that works.
endTime <- ymd_hms("x 18:00:00", tz = "America/Chicago")
Ideally I would want the "x" to take on the system date (no time), but lubridate won't let me do that as it only wants a numerical date in there, it won't let me assign some date to a name and use that.
After that, this should happen
for (Time in firstTen) {
if (tables$Time > endTime ) {
dateTime = ymd_hms("x+1 06:00:00")
}
}
I know the code isn't functional but I just want to give you an idea of what I have in mind.
Any help appreciated!
Here you go mate
dateTime = ymd_hms( paste(Sys.Date()+1, "06:00:00", sep="-"))
EDIT: for your other question regarding changing timezones, you can use with_tz(), which keeps the same instant and only changes the time zone it is displayed in:
library(lubridate)
dateTime <- ymd_hms(paste(Sys.Date() + 1, "06:00:00"), tz = "Europe/London")
dateTime <- with_tz(dateTime, "America/Los_Angeles")
dateTime
You can achieve this with
library(lubridate)
library(dplyr)
# 6pm today, used as the base for some random test times
endTime <- ymd_hms(paste(Sys.Date(), "18:00:00"), tz = "America/Chicago")
test.data <- data.frame("Original.time" = endTime + minutes(round(rnorm(15, 1440, 2000))))
# Time of day as decimal hours
time.hours <- hour(test.data$Original.time) +
  minute(test.data$Original.time)/60 +
  second(test.data$Original.time)/3600
# Times past 6pm become 6am of the following day; the rest stay as they are
test.data$New.Time <- if_else(time.hours > 18,
                              ymd_hms(paste(date(test.data$Original.time)+1, "6:00:00"), tz = "America/Chicago"),
                              test.data$Original.time)
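If you want to check the rule on fixed inputs instead of random ones, here is a base-R sketch of the same idea. The three sample times are made up, and "past 6pm" is taken to mean strictly later than 18:00:00.

```r
tz <- "America/Chicago"
# Made-up sample times: one before 6pm, two after
times <- as.POSIXct(c("2021-03-01 17:30:00",
                      "2021-03-01 18:30:00",
                      "2021-03-01 23:59:59"), tz = tz)

# Time of day in decimal hours, read in the column's own time zone
hrs <- as.numeric(format(times, "%H")) +
  as.numeric(format(times, "%M")) / 60 +
  as.numeric(format(times, "%S")) / 3600

# 6am on the day after each timestamp
next_6am <- as.POSIXct(paste(as.Date(times, tz = tz) + 1, "06:00:00"), tz = tz)

# Replace only the late entries
late <- hrs > 18
result <- times
result[late] <- next_6am[late]
format(result, "%Y-%m-%d %H:%M:%S")
```

To use the system date, build the inputs with paste(Sys.Date(), ...) as in the answers above.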
I hope this helps!