I have three dates in the database. They all represent 5 pm "real user time" meaning that in real life, the event occurring at those timestamps will occur at 5 pm:
1600117200000 (09/14/2020 @ 5:00pm America/Toronto, 09/14/2020 @ 9:00pm UTC)
1615240800000 (03/08/2021 @ 5:00pm America/Toronto, 03/08/2021 @ 10:00pm UTC)
1615842000000 (03/15/2021 @ 5:00pm America/Toronto, 03/15/2021 @ 9:00pm UTC)
When I run the following, the third date is displayed as 4 pm instead of 5 pm. Why?
moment(1600117200000).tz('America/Toronto').format('LLL')
-> "September 14, 2020 5:00 PM"
moment(1615240800000).tz('America/Toronto').format('LLL')
-> "March 8, 2021 5:00 PM"
moment(1615842000000).tz('America/Toronto').format('LLL')
-> "March 15, 2021 4:00 PM"
I could understand the middle date (1615240800000) displaying the wrong hour, since it falls under a different daylight saving offset than the one in effect now, but the third (1615842000000) falls under the same offset as the moment I run the code.
Thanks!
:facepalm: I had an old version of moment-timezone. Since it did not include the 2021 timezone updates, it displayed the wrong hour. I updated the library (and the database), and everything is fine.
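One way to cross-check the expected wall-clock times independently of moment-timezone is to convert the same three timestamps with another tool that reads the system's time zone database. The following is only a sketch in base R, assuming the machine's tzdata is reasonably current; the variable names are made up for illustration.

```r
# Epoch milliseconds from the question
ms <- c(1600117200000, 1615240800000, 1615842000000)

# as.POSIXct() expects seconds, so divide by 1000; tz only affects display
toronto <- as.POSIXct(ms / 1000, origin = "1970-01-01", tz = "America/Toronto")

# Hour and minute as seen on a wall clock in Toronto
wall_clock <- format(toronto, "%H:%M")
print(format(toronto, "%Y-%m-%d %H:%M %Z"))
```

If all three values come out as 17:00 here but not in the application, stale time zone data in the application's library is a likely cause.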
I have read, studied, and tested, but I'm just not getting it. Here is my data frame:
MyDate TEMP1 TEMP2
Monday, July 1, 2019 12:00:00:000 AM 90.0 1586
Monday, July 1, 2019 12:01:00:000 AM 88.6 1581
Monday, July 1, 2019 12:02:00:000 AM 89.4 1591
Monday, July 1, 2019 12:03:00:000 AM 90.5 1586
I need to compare it to a second data frame:
Date Time A.B.Flow A.B.Batch.Volume
7/1/2019 14:47:46 1.0 2.0
7/9/2019 14:47:48 3.0 5.0
7/11/2019 14:47:52 0.0 2.0
7/17/2019 14:48:52 3.8 4.0
7/24/2019 14:49:52 0.0 3.1
I just have to combine the two data frames where the dates, hours, and minutes match. The seconds do not have to match.
So far I have gleaned that I need to convert the first Column MyDate into separate Dates and Times. I've been unable to come up with a strsplit command that actually does this.
This just gives each element in quotes:
Tried, newdate <- strsplit(testdate$MyDate, "\\s+ ")[[3]]
This is better, but "2019" is gone:
Tried, newdate <- strsplit(testdate$MyDate, "2019")
It looks like this:
[[1]]
[1] "Monday, July 1, " "12:00:00:000 AM"
[[2]]
[1] "Monday, July 1, " "12:01:00:000 AM"
[[3]]
[1] "Monday, July 1, " "12:02:00:000 AM"
[[4]]
[1] "Monday, July 1, " "12:03:00:000 AM"
Please tell me what I am doing wrong. I would love some input as to whether I am barking up the wrong tree.
I've tried a few other things using anytime and lubridate, but I keep coming back to this combined date and time with the day written out as my nemesis.
You could get rid of the day (Monday, ...) in your MyDate field by splitting on ',', removing the first element, then combining the rest and converting to POSIXct.
Assuming your first dataframe is called df:
dt <- strsplit(df$MyDate, ',')
df$MyDate2 <- sapply(dt, function(x) trimws(paste0(x[-1], collapse = ',')))
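# drop the ":000" milliseconds, then parse the 12-hour clock with %I and %p (with %H, 12:00 AM would be read as noon)
df$MyDate2 <- as.POSIXct(sub(':[0-9]{3} ', ' ', df$MyDate2), format = '%B %d, %Y %I:%M:%S %p')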
And since you are not interested in the seconds part of the timestamps, you can do:
df$MyDate2 <- format(df$MyDate2, '%Y-%m-%d %H:%M')
You should similarly convert the Date/Time fields of your second dataframe df2, creating a MyDate2 field there with the seconds part removed as above.
Now you can merge the two dataframes on the MyDate2 column.
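The merge step can be sketched end to end as follows. This is a minimal illustration, not the poster's exact code: the two small data frames are made-up stand-ins (the second row of df is invented so that one minute actually matches), the time zone is pinned to UTC only to keep the result reproducible, and an English or C locale is assumed for the month name and AM/PM marker.

```r
# Hypothetical miniature versions of the two data frames
df <- data.frame(
  MyDate = c("Monday, July 1, 2019 12:00:00:000 AM",
             "Monday, July 1, 2019 02:47:00:000 PM"),
  TEMP1 = c(90.0, 88.6),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Date = c("7/1/2019", "7/9/2019"),
  Time = c("14:47:46", "14:47:48"),
  A.B.Flow = c(1.0, 3.0),
  stringsAsFactors = FALSE
)

# First frame: strip the weekday and the ":000" milliseconds, then parse
clean <- sub(":[0-9]{3} ", " ", sub("^[^,]+, *", "", df$MyDate))
stamp1 <- as.POSIXct(clean, format = "%B %d, %Y %I:%M:%S %p", tz = "UTC")
df$MyDate2 <- format(stamp1, "%Y-%m-%d %H:%M")

# Second frame: glue Date and Time together, parse, and drop the seconds
stamp2 <- as.POSIXct(paste(df2$Date, df2$Time),
                     format = "%m/%d/%Y %H:%M:%S", tz = "UTC")
df2$MyDate2 <- format(stamp2, "%Y-%m-%d %H:%M")

# Rows are kept only where the minute-resolution keys agree
merged <- merge(df, df2, by = "MyDate2")
print(merged)
```

Because the key is a plain character string truncated to the minute, rows whose seconds differ still match.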
This might give you a hint:
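Since you have times, you shouldn't use as.Date but rather as.POSIXct, imho.
library(stringr) # provides str_remove_all() and word()
x=c("Monday, July 1, 2019 12:00:00:000 AM 90.0 1586")
Months=c("January","February","March","April","May","June","July","August","September","October","November","December")
GetDate=function(x){
x=str_remove_all(x,",") # get rid of the commas
mo=which(Months==word(x,2)) # month name -> month number
day=word(x,3)
year=word(x,4)
time=sub(":[0-9]{3}$","",word(x,5)) # drop the ":000" milliseconds
ampm=word(x,6) # AM/PM marker, so that 12:00 AM is read as midnight
as.POSIXct(paste(paste(year,mo,day,sep="-"),time,ampm),format="%Y-%m-%d %I:%M:%S %p")
}
GetDate(x)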
I'm working with a dataset that looks like this -
data = data.frame(ID=c(1,2,3,4,5,6,7,8,9,10),
Date=c('Jan 11, 2019 12:00:00 am','Feb 15, 2019 12:00:00 am','Mar 8, 2019 12:00:00 am',
'Apr 5, 2019 12:00:00 am','Apr 12, 2019 12:00:00 am','26/01/2015 00:00','2015-02-16 00:00:00',
'2015-02-12 00:00:00','2015-11-10 00:00:00','Dec 7, 2018 12:00:00 am'))
#Converting the Date column to character
data$Date=as.character(data$Date)
The column Date contains dates in different formats. I'd like to clean this column up so that all dates are in the same format.
The desired format - YYYY-MM-DD
HERE'S MY ATTEMPT -
I used the AsDate function from the flipTime package to convert my dates.
require(devtools)
install_github("Displayr/flipTime")
library(flipTime)
data$Date_New=AsDate(data$Date)
Which gives me the following error
Error in handleParseFailure(deparse(substitute(x)), length(x),
on.parse.failure) : Could not parse data$Date into a valid date in
any format.
However when I try the same function with any single date from my dataset, it works fine.
AsDate("Feb 15, 2018 12:00:00 AM")
[1] "2018-02-15"
AsDate("19/07/2017 00:00")
[1] "2017-07-19"
Any suggestions or alternative solutions would be highly appreciated
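One possible alternative, sketched here in base R rather than with flipTime, is to try a list of candidate formats in turn and keep the first one that parses each element. The format list below is an assumption based on the three layouts visible in the sample data, the variable names are made up, and an English or C locale is assumed for the month abbreviations.

```r
# A few values taken from the sample data, one per layout
dates <- c("Jan 11, 2019 12:00:00 am", "26/01/2015 00:00", "2015-02-16 00:00:00")

# Candidate formats, tried in order (assumed from the sample, not exhaustive)
formats <- c("%b %d, %Y %I:%M:%S %p", "%d/%m/%Y %H:%M", "%Y-%m-%d %H:%M:%S")

parsed <- rep(as.Date(NA), length(dates))
for (f in formats) {
  todo <- is.na(parsed)  # only retry elements that no earlier format parsed
  parsed[todo] <- as.Date(strptime(dates[todo], format = f, tz = "UTC"))
}

# Uniform YYYY-MM-DD strings; anything no format matched stays NA
result <- format(parsed, "%Y-%m-%d")
print(result)
```

The order of the formats matters for ambiguous inputs such as 01/02/2015, so the day-first versus month-first convention has to be decided from knowledge of the data source.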
I wish to convert some dates in Norwegian to actual dates in R. I'm using readr, and it kind of works, but I stumbled upon an issue which really annoys me, and I don't really know how to get around it.
Here is an illustration of my problem:
> parse_date(c("29. mai 2017", "29. sep 2017"), format = "%d. %b %Y", locale = locale("nn"))
Warning: 1 parsing failure.
# A tibble: 1 x 4
    row   col expected            actual
  <int> <int> <chr>               <chr>
1     2    NA date like %d. %b %Y 29. sep 2017
[1] "2017-05-29" NA
So it catches the date in May but not the one in September. It turns out that this is because the abbreviation for September in Norwegian needs a "." (sep. instead of sep), whereas the May abbreviation does not (probably because it's actually not an abbreviation ;-)):
locale("nb")
<locale>
Numbers: 123,456.78
Formats: %AD / %AT
Timezone: UTC
Encoding: UTF-8
<date_names>
Days: søndag (søn.), mandag (man.), tirsdag (tir.), onsdag (ons.), torsdag(tor.), fredag (fre.), lørdag (lør.)
Months: januar (jan.), februar (feb.), mars (mar.), april (apr.), mai (mai), juni (jun.), juli (jul.), august (aug.), september (sep.), oktober (okt.), november (nov.), desember (des.)
AM/PM: a.m./p.m.
However, it seems inconsistent that it does not require the same number of characters for all months. I also noticed that these annoying "." are not part of the abbreviations in English:
> locale("en")
<locale>
Numbers: 123,456.78
Formats: %AD / %AT
Timezone: UTC
Encoding: UTF-8
<date_names>
Days: Sunday (Sun), Monday (Mon), Tuesday (Tue), Wednesday (Wed), Thursday (Thu), Friday (Fri), Saturday (Sat)
Months: January (Jan), February (Feb), March (Mar), April (Apr), May (May), June (Jun), July (Jul), August (Aug),
September (Sep), October (Oct), November (Nov), December (Dec)
AM/PM: AM/PM
It really is terribly inconvenient, also because I believe it is somewhat rare to include the "." at all when recording dates with abbreviations (but that is really just based on personal preference and experience). Any input is much appreciated.
You can edit the locale manually like this...
loc <- locale("nb")
loc$date_names$mon_ab <- substr(loc$date_names$mon_ab, 1, 3) #just take first 3 characters
parse_date(c("29. mai 2017", "29. sep 2017"), format = "%d. %b %Y", locale = loc)
[1] "2017-05-29" "2017-09-29"
A solution similar to, and inspired by, @Andrew Gustar's answer is to create your own date_names object:
loc <- locale("nb")
myNo <- date_names(mon = loc$date_names$mon,
mon_ab = substr(loc$date_names$mon_ab, 1, 3),
day = loc$date_names$day,
day_ab = substr(loc$date_names$day, 1, 3))
parse_date(c("29. mai 2017", "29. sep 2017"), format = "%d. %b %Y", locale = locale(date_names = myNo))
[1] "2017-05-29" "2017-09-29"
I have a large database of text, read as a data frame with one column of text. Some of the sentences mention times in different formats, as below:
Row 1. I tried to call you on xxx-xxx-xxxx, however reached voice mail I'm scheduling our next follow up on 6/13/2018 between 12 PM and 2 PM PST.
Row 2. I will call you again today if I hear something from them, if not, will call you tomorrow between 4 - 6PM EST.
Row 3. We will await for your reply, if we don't hear from you then we will call you tomorrow between 12:00PM to 2:00PM CST
Row 4. As discussed over the call, we scheduled call back for tomorrow between 12 - 02 PM EST.
Row 5. As suggested by you, we will have our next follow up on 6/13/2018 between 12 PM TO 2 PM PST.
I would like to extract just the time part along with EST/CST/PST.
Expected Outputs:
6/13/2018 4 PM - 6 PM EST
tomorrow 12 PM TO 2 PM PST
Have tried the below:
x <- text$string
sc1 <- str_match(x, " follow up on (.*?) T.")
which returns something like:
follow up on 6/13/2018 between 1 PM TO | 6/13/2018 between 1 PM
Tried to combine other formats using below codes
sc2 <- str_match(x, " will call you tomorrow between (.*?) T.")
and do a rowbind to include both formats (follow up * and will call you*)
sc1rb <- rbind(sc1,sc2)
which did not work.
Any way to extract only the time part along with timezone from the above example strings?
Thanks in advance!
Here's something that works for the sample. As @MrFlick mentioned, please try to share your data in a reproducible way.
Data
> dput(txt)
c("Next follow up on 6/13/2018 between 12 PM and 2 PM PST.",
"will call you tomorrow between 4 - 6PM EST.", "will call you tomorrow between 12:00PM to 2:00PM CST",
"will call you tomorrow between 11 AM to 12 PM EST", "Next follow up on 6/13/2018 between 12 PM TO 2 PM PST."
)
Code
> regmatches(txt, regexec('[[:space:]]([[:digit:]]{1,2}[[:space:]].*[[:upper:]]{3})', txt))
[[1]]
[1] " 12 PM and 2 PM PST" "12 PM and 2 PM PST"
[[2]]
[1] " 4 - 6PM EST" "4 - 6PM EST"
[[3]]
character(0)
[[4]]
[1] " 11 AM to 12 PM EST" "11 AM to 12 PM EST"
[[5]]
[1] " 12 PM TO 2 PM PST" "12 PM TO 2 PM PST"
The output is a list in which each matched element holds two strings, the full match and the captured group (read the help page for regmatches). You can simplify this further to get only the output indicated above:
> unname(sapply(txt, function(z){
pattern <- '[[:space:]]([[:digit:]]{1,2}([[:space:]]|:).*[[:upper:]]{3})'
k <- unlist(regmatches(z, regexec(pattern = pattern, z)))
return(k[2])
}))
[1] "12 PM and 2 PM PST" "4 - 6PM EST" "12:00PM to 2:00PM CST" "11 AM to 12 PM EST"
[5] "12 PM TO 2 PM PST"
This is based on the sample input. Of course, if the input is far too irregular, it'll be hard to use a single regex. If you have such a case, I'd recommend using multiple regex functions that are called one after the other, each applied only where the preceding ones returned NA. Hope this is helpful!
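The fallback idea can be sketched roughly like this. The helper names first_match and extract_time and the two patterns are hypothetical, written only to fit the five sample strings; real data would likely need more patterns.

```r
txt <- c("Next follow up on 6/13/2018 between 12 PM and 2 PM PST.",
         "will call you tomorrow between 4 - 6PM EST.",
         "will call you tomorrow between 12:00PM to 2:00PM CST",
         "will call you tomorrow between 11 AM to 12 PM EST",
         "Next follow up on 6/13/2018 between 12 PM TO 2 PM PST.")

# Return the first match of `pattern` in each string, or NA where none exists
first_match <- function(x, pattern) {
  hit <- regexpr(pattern, x, perl = TRUE)
  out <- rep(NA_character_, length(x))
  out[hit > 0] <- regmatches(x, hit)  # regmatches() skips non-matching elements
  out
}

# Try each pattern in order, but only on elements that are still NA
extract_time <- function(x, patterns) {
  out <- rep(NA_character_, length(x))
  for (p in patterns) {
    todo <- is.na(out)
    out[todo] <- first_match(x[todo], p)
  }
  out
}

patterns <- c(
  # "12 PM and 2 PM PST", "12:00PM to 2:00PM CST"
  "\\d{1,2}(:\\d{2})?\\s*[AP]M\\s+(and|to|TO|-)\\s+\\d{1,2}(:\\d{2})?\\s*[AP]M\\s+[A-Z]{3}",
  # "4 - 6PM EST" (AM/PM given only once)
  "\\d{1,2}\\s*-\\s*\\d{1,2}\\s*[AP]M\\s+[A-Z]{3}"
)

times <- extract_time(txt, patterns)
print(times)
```

Putting the stricter pattern first keeps the looser ones from grabbing a partial match on strings the stricter one handles fully.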
sub(".*?(\\d+\\s*[PA:-].*)","\\1",data)
[1] "12 PM and 2 PM PST." "4 - 6PM EST." "12:00PM to 2:00PM CST"
[4] "11 AM to 12 PM EST" "12 PM TO 2 PM PST."
This code works for almost all of your cases, except for the substring "4 - 6PM EST". I hope it is useful on your whole data set.
data=c(
"Next follow up on 6/13/2018 between 12 PM and 2 PM PST.",
"will call you tomorrow between 4 - 6PM EST.",
"will call you tomorrow between 12:00PM to 2:00PM CST",
"will call you tomorrow between 11 AM to 12 PM EST",
"Next follow up on 6/13/2018 between 12 PM TO 2 PM PST.")
#date exclusion with regex
data=gsub( "*(\\d{1,2}/\\d{1,2}/\\d{4})*", "", data)
#parameters for exclusion and substitution#
excluded_texts=c("Next follow up on","between","will call you tomorrow",":00","\\.")
replaced_input=c(" ","\'-","and","TO"," AM"," PM")
replaced_output=c("","to","to","to","AM","PM")
for (i in excluded_texts){
data=gsub(i, "", data)}
for (j in 1:length(replaced_input)){
data=gsub(replaced_input[j],replaced_output[j],data)
}
print(data)
I have this data frame of times:
3/31/2001 8:15
4/31/2001 8:25
2/31/2001 8:45
4/31/2001 8:55
Which I am converting into months in a different column using this line of code
all$month<-strftime(as.Date(all$time, format="%m/%d/%Y %H:%M"), "%B")
The result I get is of the form:
March
April
February
April
But I would like to get only the short form of the month names, i.e.
Mar
Apr
Feb
Apr
How can I implement it?
Use a lowercase "b":
> strftime(as.Date("3/31/2001 8:15", format="%m/%d/%Y %H:%M"), "%b")
[1] "Mar"