Total items in a list - r

Can anyone tell me why I have only 1895 elements instead of 1896 (79 days × 24 hours)?
time_index <- seq(from = as.POSIXct("2017-01-02 01:00"),
to = as.POSIXct("2017-03-21 24:00"), by = "hour")
length(time_index)
# >[1] 1895

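Daylight saving time? On 2017-03-12 the clock jumps from 01:00 EST straight to 03:00 EDT, so that day has one hour fewer: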
time_index[1655:1660]
[1] "2017-03-11 23:00:00 EST" "2017-03-12 00:00:00 EST"
[3] "2017-03-12 01:00:00 EST" "2017-03-12 03:00:00 EDT"
[5] "2017-03-12 04:00:00 EDT" "2017-03-12 05:00:00 EDT"
To stop this from happening, choose a time zone that has no daylight saving time, such as UTC. Here is an example:
time_index <- seq(from = as.POSIXct("2017-01-02 01:00",tz = 'UTC'),
to = as.POSIXct("2017-03-21 24:00", tz = 'UTC'),
by = "hour")
length(time_index)
[1] 1896
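To confirm that the missing element is the skipped daylight saving hour, the two sequences can be compared directly. The following is a minimal sketch that assumes a US Eastern zone (the EST/EDT labels above suggest it). The zone name "America/New_York" and the variable names are assumptions for illustration, and the end point is written as 2017-03-22 00:00, which is the same instant as 2017-03-21 24:00.

```r
# Assumed zone with DST; replace with your own zone if it differs
local_index <- seq(from = as.POSIXct("2017-01-02 01:00", tz = "America/New_York"),
                   to   = as.POSIXct("2017-03-22 00:00", tz = "America/New_York"),
                   by   = "hour")
utc_index <- seq(from = as.POSIXct("2017-01-02 01:00", tz = "UTC"),
                 to   = as.POSIXct("2017-03-22 00:00", tz = "UTC"),
                 by   = "hour")

length(local_index)  # 1895: one wall-clock hour does not exist on 2017-03-12
length(utc_index)    # 1896

# Locate the jump: consecutive wall-clock hours normally differ by 1 (mod 24)
local_hours <- as.integer(format(local_index, "%H"))
jump <- which(diff(local_hours) %% 24 != 1)
local_index[jump + 0:1]  # the two elements on either side of the skipped hour
```

Whether UTC is the right choice depends on the data: if the timestamps really are local wall-clock times, converting them to UTC changes what each hour means.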

Related

How do I format 12 hour data with am/pm without NAs?

The reformatting works for just the date, but when I add in the am/pm data it doesn't work. Here is the code and data:
str(Hourly_Steps$activity_hour)
chr [1:22099] "4/12/2016 12:00:00 AM" "4/12/2016 1:00:00 AM" ...
This worked to format just the date...
> day <- strptime(Hourly_Steps$activity_hour, format = "%m/%d/%Y %I:%M:%S")
> day
[1] "2016-04-12 00:00:00 IST" "2016-04-12 01:00:00 IST"
[3] "2016-04-12 02:00:00 IST" "2016-04-12 03:00:00 IST"
[5] "2016-04-12 04:00:00 IST" "2016-04-12 05:00:00 IST"
[7] "2016-04-12 06:00:00 IST" "2016-04-12 07:00:00 IST"
[9] "2016-04-12 08:00:00 IST" "2016-04-12 09:00:00 IST"
[11] "2016-04-12 10:00:00 IST" "2016-04-12 11:00:00 IST"
But when I try to add in the %p for am/pm info, it goes back to 24hr...
> day <- strptime(Hourly_Steps$activity_hour, format = "%m/%d/%Y %I:%M:%S %p")
> day
[1] "2016-04-12 00:00:00 IST" "2016-04-12 01:00:00 IST"
[3] "2016-04-12 02:00:00 IST" "2016-04-12 03:00:00 IST"
[5] "2016-04-12 04:00:00 IST" "2016-04-12 05:00:00 IST"
[7] "2016-04-12 06:00:00 IST" "2016-04-12 07:00:00 IST"
[9] "2016-04-12 08:00:00 IST" "2016-04-12 09:00:00 IST"
[11] "2016-04-12 10:00:00 IST" "2016-04-12 11:00:00 IST"
[13] "2016-04-12 12:00:00 IST" "2016-04-12 13:00:00 IST"
[15] "2016-04-12 14:00:00 IST" "2016-04-12 15:00:00 IST"
What am I missing here?
Your code works correctly.
You are starting with a "character" vector, and strptime() correctly parses your 12-hour clock times into a "POSIXlt" date-time object. By default such objects are always printed in the "YYYY-MM-DD HH:MM:SS TZ" format, whatever the format of the input, and this is what you see displayed.
x <- c("4/12/2016 1:00:00 AM", "4/12/2016 1:00:00 PM", "4/12/2016 12:00:00 AM",
"4/12/2016 12:00:00 PM")
y <- strptime(x, '%m/%d/%Y %I:%M:%S %p')
y
# [1] "2016-04-12 01:00:00 CEST" "2016-04-12 13:00:00 CEST"
# [3] "2016-04-12 00:00:00 CEST" "2016-04-12 12:00:00 CEST"
and
class(y)
# [1] "POSIXlt" "POSIXt"
Maybe you are looking for a way to change your output format, in which case you'd want to use strftime().
z <- strftime(y, '%m/%d/%Y %I:%M:%S %p')
z
# [1] "04/12/2016 01:00:00 am" "04/12/2016 01:00:00 pm"
# [3] "04/12/2016 12:00:00 am" "04/12/2016 12:00:00 pm"
Note, however, that you now have a "character" class.
class(z)
# [1] "character"
So, altogether you may want:
strftime(strptime(x, '%m/%d/%Y %I:%M:%S %p'), '%d/%m/%Y %I:%M:%S %p')
# [1] "12/04/2016 01:00:00 am" "12/04/2016 01:00:00 pm"
# [3] "12/04/2016 12:00:00 am" "12/04/2016 12:00:00 pm"
Sidenote: When every element falls exactly on midnight, the time part is not printed, even though it is stored internally.
strptime("4/12/2016 12:00:00 AM", '%m/%d/%Y %I:%M:%S %p')
# [1] "2016-04-12 CEST"
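If the goal is simply to see the time component, including midnight, an explicit output format can be passed to format(). This is a small sketch, assuming an English or C locale so that %p recognises AM/PM. The variable names and the choice of tz = "UTC" are illustrative only.

```r
x <- c("4/12/2016 12:00:00 AM", "4/12/2016 1:00:00 PM")

# Parse the 12-hour strings; %I is the hour (01-12) and %p the AM/PM marker
parsed <- as.POSIXct(x, format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")

# An explicit format always shows the time, even for midnight
shown <- format(parsed, "%Y-%m-%d %H:%M:%S")
shown
# [1] "2016-04-12 00:00:00" "2016-04-12 13:00:00"
```

The parsed object stays a date-time; only the result of format() is a character vector.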

Generate an ordered series of datetime

I am working in R.
I have to generate a series of dates and times. In particular, I would like to have two data points per day, that is, each date should appear twice with a different time, for instance:
"2001-05-13 00:00:00"
"2001-05-13 12:00:00"
"2001-05-14 00:00:00"
"2001-05-14 12:00:00"
I found the following code to produce a series of dates:
seq(as.Date("2000/1/1"), as.Date("2003/1/1"), by = 0.5)
Nevertheless, even if I set by = 0.5, the code returns only a date, not a datetime.
Any idea how to produce a series of datetimes?
as.Date will produce only dates; use as.POSIXct to produce date-times.
seq(as.POSIXct("2000-01-01 00:00:00", tz = 'UTC'),
as.POSIXct("2003-01-01 00:00:00", tz = 'UTC'), by = '12 hours')
# [1] "2000-01-01 00:00:00 UTC" "2000-01-01 12:00:00 UTC"
# [3] "2000-01-02 00:00:00 UTC" "2000-01-02 12:00:00 UTC"
# [5] "2000-01-03 00:00:00 UTC" "2000-01-03 12:00:00 UTC"
# [7] "2000-01-04 00:00:00 UTC" "2000-01-04 12:00:00 UTC"
# [9] "2000-01-05 00:00:00 UTC" "2000-01-05 12:00:00 UTC"
#[11] "2000-01-06 00:00:00 UTC" "2000-01-06 12:00:00 UTC"
#[13] "2000-01-07 00:00:00 UTC" "2000-01-07 12:00:00 UTC"
#...
#...
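When the number of points is known rather than the end date, length.out can replace the end point. A minimal sketch using the dates from the question; the variable names are illustrative, and UTC is assumed so that daylight saving cannot shift the 12-hour steps.

```r
# Two points per day for two days, starting at midnight
twice_daily <- seq(as.POSIXct("2001-05-13 00:00:00", tz = "UTC"),
                   by = "12 hours", length.out = 4)

# Format explicitly so that midnight is shown with its time part
labels <- format(twice_daily, "%Y-%m-%d %H:%M:%S")
labels
# [1] "2001-05-13 00:00:00" "2001-05-13 12:00:00"
# [3] "2001-05-14 00:00:00" "2001-05-14 12:00:00"
```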

Create a time series by 30 minute intervals

I am trying to create a time series with 30 min intervals. I used the following command with the output also shown:
ts = seq(as.POSIXct("2009-01-01 00:00"), as.POSIXct("2014-12-31 23:30"),by = "hour")
"2010-02-21 12:00:00 EST" "2010-02-21 13:00:00 EST" "2010-02-21 14:00:00 EST"
When I change it to by ="min" it changes to be every minute.
How do I create a time series at 30-minute intervals?
You can specify minutes in the by argument, and pass the time zone "UTC" as Adrian pointed out. Check ?seq.POSIXt for more details about the by argument specified as a character string:
A character string, containing one of "sec", "min", "hour", "day",
"DSTday", "week", "month", "quarter" or "year". This can optionally be
preceded by a (positive or negative) integer and a space, or followed
by "s".
ts <- seq(as.POSIXct("2017-01-01", tz = "UTC"),
as.POSIXct("2017-01-02", tz = "UTC"),
by = "30 min")
head(ts)
Output
[1] "2017-01-01 00:00:00 UTC"
[2] "2017-01-01 00:30:00 UTC"
[3] "2017-01-01 01:00:00 UTC"
[4] "2017-01-01 01:30:00 UTC"
[5] "2017-01-01 02:00:00 UTC"
[6] "2017-01-01 02:30:00 UTC"
A numeric by is taken to be in seconds, so use 1800 seconds to get 30 minutes.
ts = seq(as.POSIXct("2009-01-01 00:00"), as.POSIXct("2014-12-31 23:30"),by = 1800)
ts[1:20]
[1] "2009-01-01 00:00:00 EST" "2009-01-01 00:30:00 EST" "2009-01-01 01:00:00 EST" "2009-01-01 01:30:00 EST" "2009-01-01 02:00:00 EST"
[6] "2009-01-01 02:30:00 EST" "2009-01-01 03:00:00 EST" "2009-01-01 03:30:00 EST" "2009-01-01 04:00:00 EST" "2009-01-01 04:30:00 EST"
[11] "2009-01-01 05:00:00 EST" "2009-01-01 05:30:00 EST" "2009-01-01 06:00:00 EST" "2009-01-01 06:30:00 EST" "2009-01-01 07:00:00 EST"
[16] "2009-01-01 07:30:00 EST" "2009-01-01 08:00:00 EST" "2009-01-01 08:30:00 EST" "2009-01-01 09:00:00 EST" "2009-01-01 09:30:00 EST"
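The character, numeric, and difftime forms of by should give the same sequence. A short sketch to check this on a single day; the variable names are illustrative, and UTC is assumed to keep daylight saving out of the picture.

```r
start <- as.POSIXct("2017-01-01", tz = "UTC")
end   <- as.POSIXct("2017-01-02", tz = "UTC")

by_string   <- seq(start, end, by = "30 min")
by_seconds  <- seq(start, end, by = 1800)
by_difftime <- seq(start, end, by = as.difftime(30, units = "mins"))

length(by_string)  # 49: 48 half-hour steps plus the starting point
all(by_string == by_seconds, by_string == by_difftime)  # TRUE
```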

Subsetting results from sapply

After I use sapply, I get a list, and I would like to access individual elements of those lists. So far, I have:
large.list <- sapply(1:length(visit_num), function(x)
seq(enter.shift.want[x], to= exit.prime[x], by= 'hour'))
where enter.shift.want and exit.prime are vectors of dates.
head(large.list, 2)
[[1]]
[1] "1982-05-17 13:00:00 PDT" "1982-05-17 14:00:00 PDT" "1982-05-17 15:00:00 PDT"
[4] "1982-05-17 16:00:00 PDT" "1982-05-17 17:00:00 PDT" "1982-05-17 18:00:00 PDT"
[7] "1982-05-17 19:00:00 PDT" "1982-05-17 20:00:00 PDT" "1982-05-17 21:00:00 PDT"
[10] "1982-05-17 22:00:00 PDT"
[[2]]
[1] "1982-07-14 13:00:00 PDT" "1982-07-14 14:00:00 PDT" "1982-07-14 15:00:00 PDT"
[4] "1982-07-14 16:00:00 PDT" "1982-07-14 17:00:00 PDT" "1982-07-14 18:00:00 PDT"
[7] "1982-07-14 19:00:00 PDT" "1982-07-14 20:00:00 PDT" "1982-07-14 21:00:00 PDT"
[10] "1982-07-14 22:00:00 PDT"
I would like to have large.list[1] as a vector of dates/time.
Then I would like to do
large.list[1]<=enter.shift.want[1]
and get a vector of true and false results. Then I would want to generalize and do
large.list[n] <= enter.shift.want[n] for each n in 1:length(visit_num), and add up the TRUE/FALSE values.
Thanks in advance.
If enter.shift.want is a list or a vector with same number of elements as large.list, here is one way to apply it to the whole list.
res <- Map(`<=`, large.list, enter.shift.want)
res1 <- Map(`<=`, large.list, enter.shift.want1)
To get the total number of TRUE per list element
colSums(do.call(cbind, res))
#[1] 3 3
Or
sapply(res, sum)
#[1] 3 3
sapply(res1,sum)
#[1] 3 7
data
large.list <- list(structure(c(390488400, 390492000, 390495600, 390499200,
390502800, 390506400, 390510000, 390513600, 390517200, 390520800
), class = c("POSIXct", "POSIXt"), tzone = "PDT"), structure(c(395499600,
395503200, 395506800, 395510400, 395514000, 395517600, 395521200,
395524800, 395528400, 395532000), class = c("POSIXct", "POSIXt"
), tzone = "PDT"))
v1 <- c('1982-05-17 00:00:00', '1982-07-14 00:00:00')
enter.shift.want <- lapply(v1, function(x) seq(as.POSIXct(x, tz='PDT'),
length.out=10, by='3 hour'))
enter.shift.want1 <- as.POSIXct(c('1982-05-17 15:00:00',
'1982-07-14 19:00:00'), tz='PDT')
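On the subsetting part of the question: single brackets return a list of length one, while double brackets return the element itself. The sketch below illustrates this with small made-up data; the names starts, ends, hourly and cutoff are hypothetical, and UTC is assumed as the time zone.

```r
# Hypothetical data: two visits, each spanning ten hourly time points
starts <- as.POSIXct(c("1982-05-17 13:00:00", "1982-07-14 13:00:00"), tz = "UTC")
ends   <- starts + 9 * 3600

# lapply always returns a list, so each element keeps its POSIXct class
hourly <- lapply(seq_along(starts),
                 function(i) seq(starts[i], ends[i], by = "hour"))

class(hourly[1])      # "list": still a list, of length one
first <- hourly[[1]]  # the POSIXct vector itself

# Compare each element with its own cutoff and count the TRUE values
cutoff <- starts + 2 * 3600
n_true <- vapply(seq_along(hourly),
                 function(i) sum(hourly[[i]] <= cutoff[i]),
                 integer(1))
n_true
# [1] 3 3
```

Indexing by position inside lapply and vapply avoids iterating over the date-time vector directly, which can drop its class.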

05:00:00 - 28:59:59 time format

I have a dataset where Time.start varies from 5:00:00 to 28:59:59 (i.e. 01.01.2013 28:00:00 is actually 02.01.2013 04:00:00). Dates are in %d.%m.%Y format.
Date Time.start
01.01.2013 22:13:07
01.01.2013 22:52:23
01.01.2013 23:34:06
01.01.2013 23:44:25
01.01.2013 27:18:48
01.01.2013 28:41:04
I want to convert it to normal date format.
dates$date <- paste(dates$Date,dates$Time.start, sep = " ")
dates$date <- as.POSIXct(strptime(dates$date, "%d.%m.%Y %H:%M:%S"))
But obviously I get NA for times after 23:59:59.
How should I modify my code?
For example, add the time as seconds to the date:
df <- read.table(header=TRUE, text=" Date Time.start
01.01.2013 22:13:07
01.01.2013 22:52:23
01.01.2013 23:34:06
01.01.2013 23:44:25
01.01.2013 27:18:48
01.01.2013 28:41:04", stringsAsFactors=FALSE)
as.POSIXct(df$Date, format="%d.%m.%Y") +
sapply(strsplit(df$Time.start, ":"), function(t) {
t <- as.integer(t)
t[3] + t[2] * 60 + t[1] * 60 * 60
})
# [1] "2013-01-01 22:13:07 CET" "2013-01-01 22:52:23 CET" "2013-01-01 23:34:06 CET"
# [4] "2013-01-01 23:44:25 CET" "2013-01-02 03:18:48 CET" "2013-01-02 04:41:04 CET"
Just a modification of lukeA's solution:
with(df, as.POSIXct(Date, format="%d.%m.%Y")+
colSums(t(read.table(text=Time.start, sep=":",header=FALSE))*c(3600,60,1)))
[1] "2013-01-01 22:13:07 EST" "2013-01-01 22:52:23 EST"
[3] "2013-01-01 23:34:06 EST" "2013-01-01 23:44:25 EST"
[5] "2013-01-02 03:18:48 EST" "2013-01-02 04:41:04 EST"
Using lubridate (dmy() matches the %d.%m.%Y date format):
library(lubridate)
with(dates, dmy(Date) + hms(Time.start))
Generates:
[1] "2013-01-01 22:13:07 UTC" "2013-01-01 22:52:23 UTC"
[3] "2013-01-01 23:34:06 UTC" "2013-01-01 23:44:25 UTC"
[5] "2013-01-02 03:18:48 UTC" "2013-01-02 04:41:04 UTC"
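The base R approaches above can be wrapped in a small helper. This is only a sketch: the function name extended_time_to_posix and its tz argument are hypothetical, and UTC is assumed by default so that adding seconds cannot cross a daylight saving change.

```r
# Hypothetical helper: combine a %d.%m.%Y date with an H:M:S time whose
# hour part may exceed 23 (for example 28:41:04 is 04:41:04 the next day)
extended_time_to_posix <- function(date_chr, time_chr, tz = "UTC") {
  # Split each time into hours, minutes, seconds: one row per input
  parts <- do.call(rbind, lapply(strsplit(time_chr, ":", fixed = TRUE),
                                 as.numeric))
  # Convert each row to seconds since midnight
  secs <- as.vector(parts %*% c(3600, 60, 1))
  # Midnight of the given date plus the offset in seconds
  as.POSIXct(date_chr, format = "%d.%m.%Y", tz = tz) + secs
}

res <- extended_time_to_posix(c("01.01.2013", "01.01.2013"),
                              c("23:44:25", "28:41:04"))
out <- format(res, "%Y-%m-%d %H:%M:%S")
out
# [1] "2013-01-01 23:44:25" "2013-01-02 04:41:04"
```

In a zone with daylight saving, the added seconds are elapsed time, so a result that crosses a clock change would differ from the wall-clock reading by an hour.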
