Related
I have two fields in a dataframe that are of the class "times". Call it Time1 and Time2. I am trying to find the time difference between the two.
CombinedFrame2$Duration <- difftime(CombinedFrame2$Time1, CombinedFrame2$Time2)
Error in as.POSIXct.numeric(CombinedFrame2$Time1) :
'origin' must be supplied
How do I get the classes to cooperate to do the calculation?
Example:
Time1 Time2 Duration
5:30:00 6:24:00 0:54:00
$ Time1 : POSIXlt, format: "2019-07-10 16:07:00" "2019-07-10 22:05:00" "2019-07-10 22:20:00" "2019-07-10 22:43:00" ...
$ Time2 : POSIXlt, format: "2019-07-10 22:05:00" "2019-07-10 22:20:00" "2019-07-10 22:43:00" "2019-07-10 23:15:00" ...
> dput(head(CombinedFrame2[,c("Time1", "Time2")]))
structure(list(Time1 = structure(list(sec = c(0, 0, 0, 0,
0, 0), min = c(7L, 5L, 20L, 43L, 15L, 35L), hour = c(16L, 22L,
22L, 22L, 23L, 23L), mday = c(11L, 11L, 11L, 11L, 11L, 11L),
mon = c(6L, 6L, 6L, 6L, 6L, 6L), year = c(119L, 119L, 119L,
119L, 119L, 119L), wday = c(4L, 4L, 4L, 4L, 4L, 4L), yday = c(191L,
191L, 191L, 191L, 191L, 191L), isdst = c(1L, 1L, 1L, 1L,
1L, 1L), zone = c("EDT", "EDT", "EDT", "EDT", "EDT", "EDT"
), gmtoff = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_,
NA_integer_, NA_integer_)), class = c("POSIXlt", "POSIXt"
)), Time2 = structure(list(sec = c(0, 0, 0, 0, 0, 0), min = c(5L,
20L, 43L, 15L, 35L, 55L), hour = c(22L, 22L, 22L, 23L, 23L, 23L
), mday = c(11L, 11L, 11L, 11L, 11L, 11L), mon = c(6L, 6L, 6L,
6L, 6L, 6L), year = c(119L, 119L, 119L, 119L, 119L, 119L), wday = c(4L,
4L, 4L, 4L, 4L, 4L), yday = c(191L, 191L, 191L, 191L, 191L, 191L
), isdst = c(1L, 1L, 1L, 1L, 1L, 1L), zone = c("EDT", "EDT",
"EDT", "EDT", "EDT", "EDT"), gmtoff = c(NA_integer_, NA_integer_,
NA_integer_, NA_integer_, NA_integer_, NA_integer_)), class = c("POSIXlt",
"POSIXt"))), row.names = c("1:1", "1:2", "1:3", "1:4", "1:5",
"1:6"), class = "data.frame")
You need to make sure that your time is formatted correctly. See the code below.
You can use strptime() to format your time into hours, minutes, and seconds.
time1 <- "5:30:00"
time2 <- "6:24:00"
time1a <- strptime(time1,format="%H:%M:%S")
time2a <- strptime(time2,format="%H:%M:%S")
duration <- difftime(time2a,time1a)
Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 5 years ago.
Improve this question
I am referencing another post that appears to provide the exact solution I'm looking for:
Creating new column based on earliest date value in other column in R
Here is my sample data:
structure(list(ID = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("a1", "b1"), class = "factor"), Begin = structure(list(sec = c(0, 0, 0, 0, 0, 0), min = c(0L, 0L, 0L, 0L, 0L, 0L), hour = c(0L, 0L, 0L, 0L, 0L, 0L), mday = c(28L, 4L, 10L, 10L, 12L, 13L), mon = c(11L, 11L, 11L, 11L, 11L, 11L), year = c(115L, 115L,115L, 115L, 115L, 115L), wday = c(1L, 5L, 4L, 4L, 6L, 0L), yday = c(361L, 337L, 343L, 343L, 345L, 346L), isdst = c(0L, 0L, 0L, 0L, 0L, 0L), zone = c("PST", "PST", "PST", "PST", "PST", "PST"), gmtoff = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_)), .Names = c("sec", "min", "hour", "mday", "mon", "year", "wday", "yday", "isdst", "zone", "gmtoff"), class = c("POSIXlt", "POSIXt"))), .Names = c("ID", "Begin"), row.names = c(NA, -6L), class = "data.frame")
Here is what I am looking for:
structure(list(ID = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("a1", "b1"), class = "factor"), Begin = structure(list(sec = c(0, 0, 0, 0, 0, 0), min = c(0L, 0L, 0L, 0L, 0L, 0L), hour = c(0L, 0L, 0L, 0L, 0L, 0L), mday = c(28L, 4L, 10L, 10L, 12L, 13L), mon = c(11L, 11L, 11L, 11L, 11L, 11L), year = c(115L, 115L, 115L, 115L, 115L, 115L), wday = c(1L, 5L, 4L, 4L, 6L, 0L), yday = c(361L, 337L, 343L, 343L, 345L, 346L), isdst = c(0L, 0L, 0L, 0L, 0L, 0L), zone = c("PST", "PST", "PST", "PST", "PST", "PST"), gmtoff = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_)), .Names = c("sec", "min", "hour", "mday", "mon", "year", "wday", "yday", "isdst", "zone", "gmtoff"), class = c("POSIXlt", "POSIXt")), BeginE = structure(list(
sec = c(0, 0, 0, 0, 0, 0), min = c(0L, 0L, 0L, 0L, 0L, 0L
), hour = c(0L, 0L, 0L, 0L, 0L, 0L), mday = c(4L, 4L, 4L,
10L, 10L, 10L), mon = c(11L, 11L, 11L, 11L, 11L, 11L), year = c(115L,
115L, 115L, 115L, 115L, 115L), wday = c(5L, 5L, 5L, 4L, 4L,
4L), yday = c(337L, 337L, 337L, 343L, 343L, 343L), isdst = c(0L,
0L, 0L, 0L, 0L, 0L), zone = c("PST", "PST", "PST", "PST",
"PST", "PST"), gmtoff = c(NA_integer_, NA_integer_, NA_integer_,
NA_integer_, NA_integer_, NA_integer_)), .Names = c("sec", "min", "hour", "mday", "mon", "year", "wday", "yday", "isdst", "zone", "gmtoff"), class = c("POSIXlt", "POSIXt"))), .Names = c("ID", "Begin", "BeginE"), row.names = c(NA, -6L), class = "data.frame")
In response to good comment about providing all code, I attempted the following:
df2 <- as.data.frame(data.table(df)[, BeginE:= min(Begin), by = ID])
This was the error:
`Error in as.POSIXct.POSIXlt(X[[i]], ...) : invalid 'x' argument`
I fixed the issue with a simple conversion:
df$Begin<-as.POSIXct(df$Begin)
Works on my huge dataset as well.
I have 9x3 dataframe DATA, where one column has timestamps (DateTime), one prices (Close), and one binary values (FOMCBinary).
I want to add column SignalBinary recording a 1 IF Close < an X value (1126 in this example) AND FOMCBinary > 0 in any of the two rows below, but only if SignalBinary = 0 in the row below (i.e. do not want consecutive 1s).
In the example here I need to record a 1 under SignalBinary only at 14:15:00. My coding attempt is instead recording a 1 at 14:15:00 and at 14:30:00. Should be fairly simple, don't understand why my code is not producing the desired result. How could I get this fixed?
DATA <- structure(list(DateTime = structure(list(sec = c(0, 0, 0, 0,0, 0, 0, 0, 0), min = c(30L, 15L, 0L, 45L, 30L, 15L, 0L, 45L,30L), hour = c(15L, 15L, 15L, 14L, 14L, 14L, 14L, 13L, 13L),mday = c(27L, 27L, 27L, 27L, 27L, 27L, 27L, 27L, 27L), mon = c(0L,0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), year = c(116L, 116L, 116L,116L, 116L, 116L, 116L, 116L, 116L), wday = c(3L, 3L, 3L,3L, 3L, 3L, 3L, 3L, 3L), yday = c(26L, 26L, 26L, 26L, 26L,26L, 26L, 26L, 26L), isdst = c(0L, 0L, 0L, 0L, 0L, 0L, 0L,0L, 0L), zone = c("EST", "EST", "EST", "EST", "EST", "EST","EST", "EST", "EST"), gmtoff = c(NA_integer_, NA_integer_,NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,NA_integer_, NA_integer_)), .Names = c("sec", "min", "hour","mday", "mon", "year", "wday", "yday", "isdst", "zone", "gmtoff"), class = c("POSIXlt", "POSIXt")), Close = c(1127.2, 1127.5,1126.9, 1128.3, 1125.4, 1122.7, 1122.8, 1117.3, 1116), FOMCBinary = c(0,0, 0, 0, 0, 0, 1, 0, 0)), .Names = c("DateTime", "Close", "FOMCBinary"), row.names = 2131:2139, class = "data.frame")
Xvalue = 1126
#For comparing lagged or forward rows
rowShift <- function(x, shiftLen = 1L) {
r <- (1L + shiftLen):(length(x) + shiftLen)
r[r<1] <- NA
return(x[r]) }
DATA$SignalBinary <- ifelse(
DATA$Close < Xvalue & (
rowShift(DATA$FOMCBinary, +1) > 0 |
(rowShift(DATA$FOMCBinary, +2) > 0 & rowShift(DATA$FOMCBinary, +1) == 0))
, 1, 0)
##Note rowShift(DATA$FOMCBinary, +1) is equivalent to DATA$FOMCBinary[seq(nrow(DATA))+1]##
Output for DATA after calculations:
DateTime Close FOMCBinary SignalBinary
2131 2016-01-27 15:30:00 1127.2 0 0
2132 2016-01-27 15:15:00 1127.5 0 0
2133 2016-01-27 15:00:00 1126.9 0 0
2134 2016-01-27 14:45:00 1128.3 0 0
2135 2016-01-27 14:30:00 1125.4 0 1 => UNWANTED 1
2136 2016-01-27 14:15:00 1122.7 0 1
2137 2016-01-27 14:00:00 1122.8 1 0
2138 2016-01-27 13:45:00 1117.3 0 NA
2139 2016-01-27 13:30:00 1116.0 0 NA
Thank you very much.
had a closer look. you wanted to remove the first consecutive 1 in SignalBinary but you didnt sweep through SignalBinary. Here is a rough code
DATA <- structure(list(DateTime = structure(list(sec = c(0, 0, 0, 0,0, 0, 0, 0, 0), min = c(30L, 15L, 0L, 45L, 30L, 15L, 0L, 45L,30L), hour = c(15L, 15L, 15L, 14L, 14L, 14L, 14L, 13L, 13L),mday = c(27L, 27L, 27L, 27L, 27L, 27L, 27L, 27L, 27L), mon = c(0L,0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), year = c(116L, 116L, 116L,116L, 116L, 116L, 116L, 116L, 116L), wday = c(3L, 3L, 3L,3L, 3L, 3L, 3L, 3L, 3L), yday = c(26L, 26L, 26L, 26L, 26L,26L, 26L, 26L, 26L), isdst = c(0L, 0L, 0L, 0L, 0L, 0L, 0L,0L, 0L), zone = c("EST", "EST", "EST", "EST", "EST", "EST","EST", "EST", "EST"), gmtoff = c(NA_integer_, NA_integer_,NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,NA_integer_, NA_integer_)), .Names = c("sec", "min", "hour","mday", "mon", "year", "wday", "yday", "isdst", "zone", "gmtoff"), class = c("POSIXlt", "POSIXt")), Close = c(1127.2, 1127.5,1126.9, 1128.3, 1125.4, 1122.7, 1122.8, 1117.3, 1116), FOMCBinary = c(0,0, 0, 0, 0, 0, 1, 0, 0)), .Names = c("DateTime", "Close", "FOMCBinary"), row.names = 2131:2139, class = "data.frame")
Xvalue = 1126
#For comparing lagged or forward rows
rowShift <- function(x, shiftLen = 1) {
r <- (1 + shiftLen):(length(x) + shiftLen)
r[r<1] <- NA
return(x[r]) }
DATA$SignalBinary <- as.numeric(DATA$Close < Xvalue & rowShift(DATA$FOMCBinary, +1) > 0)
DATA$SignalBinary <- c(sapply(1:(nrow(DATA)-1), function(n) {
if (is.na(DATA$SignalBinary[n+1])) return(NA)
if (DATA$SignalBinary[n+1]) return(0)
DATA$SignalBinary[n]
}), tail(DATA$SignalBinary,1))
DATA
anotherDATA <- structure(list(DateTime = structure(list(sec = c(0, 0, 0, 0, 0, 0, 0, 0, 0), min = c(30L, 15L, 0L, 45L, 30L, 15L, 0L, 45L,30L),
hour = c(15L, 15L, 15L, 14L, 14L, 14L, 14L, 13L, 13L),mday = c(27L, 27L, 27L, 27L, 27L, 27L, 27L, 27L, 27L),
mon = c(0L,0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), year = c(116L, 116L, 116L,116L, 116L, 116L, 116L, 116L, 116L),
wday = c(3L, 3L, 3L,3L, 3L, 3L, 3L, 3L, 3L), yday = c(26L, 26L, 26L, 26L, 26L,26L, 26L, 26L, 26L),
isdst = c(0L, 0L, 0L, 0L, 0L, 0L, 0L,0L, 0L), zone = c("EST", "EST", "EST", "EST", "EST", "EST","EST", "EST", "EST"),
gmtoff = c(NA_integer_, NA_integer_,NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,NA_integer_, NA_integer_)),
.Names = c("sec", "min", "hour","mday", "mon", "year", "wday", "yday", "isdst", "zone", "gmtoff"), class = c("POSIXlt", "POSIXt")),
Close = c(1127.2, 1127.5,1126.9, 1128.3, 1125.4, 1122.7, 1122.8, 1117.3, 1116),
FOMCBinary = c(0,0, 0, 0, 0, 1, 1, 0, 0)), .Names = c("DateTime", "Close", "FOMCBinary"), row.names = 2131:2139, class = "data.frame")
Xvalue = 1126
#For comparing lagged or forward rows
rowShift <- function(x, shiftLen = 1) {
r <- (1 + shiftLen):(length(x) + shiftLen)
r[r<1] <- NA
return(x[r]) }
anotherDATA$SignalBinary <- as.numeric(anotherDATA$Close < Xvalue & rowShift(anotherDATA$FOMCBinary, +1) > 0)
anotherDATA$SignalBinary <- c(sapply(1:(nrow(anotherDATA)-1), function(n) {
if (is.na(anotherDATA$SignalBinary[n+1])) return(NA)
if (anotherDATA$SignalBinary[n+1]) return(0)
anotherDATA$SignalBinary[n]
}), tail(anotherDATA$SignalBinary,1))
anotherDATA
In R I want to merge two dataframes on a range of dates, taking all rows from the second dataframe which fall on and between two columns of dates from the first dataframe. I couldn't find a strictly R function or version of the merge function that could do this, but I know there's a 'between' function in sql and I was thinking of trying the sqldf package (although I'm not well versed in sql). If there's a more R-ish way to do this, that would be preferable. Thank you in advance for your help!
df1 <- structure(list(ID = 1:2, PtID = structure(c(1L, 1L), .Label = c("T031", "T040", "T045", "T064", "T074", "T081", "T092", "T094", "T096", "T105", "T107", "T108", "T115", "T118", "T120", "T124", "T125", "T128", "T130", "T132", "T138", "T140", "T142", "T142_R1", "T146", "T158", "T159", "T160", "T164", "T166", "T169", "T171", "T173", "T197", "T208", "T214", "T221"), class = "factor"), StartDateTime = structure(list(sec = c(0, 0), min = c(11L, 35L), hour = c(17L, 17L), mday = c(23L, 23L), mon = c(9L, 9L), year = c(112L, 112L), wday = c(2L, 2L), yday = c(296L, 296L), isdst = c(1L, 1L)), .Names = c("sec", "min", "hour", "mday", "mon", "year", "wday", "yday", "isdst"), class = c("POSIXlt", "POSIXt")), EndDateTime = structure(list(sec = c(0, 0), min = c(16L, 37L), hour = c(17L, 17L), mday = c(23L, 23L), mon = c(9L, 9L), year = c(112L, 112L), wday = c(2L, 2L), yday = c(296L, 296L), isdst = c(1L, 1L)), .Names = c("sec", "min", "hour", "mday", "mon", "year", "wday", "yday", "isdst"), class = c("POSIXlt", "POSIXt"))), .Names = c("ID", "PtID", "StartDateTime", "EndDateTime"), row.names = 1:2, class = "data.frame")
df1
ID PtID StartDateTime EndDateTime
1 1 T031 2012-10-23 17:11:00 2012-10-23 17:16:00
2 2 T031 2012-10-23 17:35:00 2012-10-23 17:37:00
The second dataframe has several IDs (which match the first dataframe) and timestamps on the minute level.
df2
df2 <- structure(list(ID = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), dateTime = structure(list(sec = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), min = 2:44, hour = c(17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L), mday = c(23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L, 23L), mon = c(9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L), year = c(112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L), wday = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), yday = c(296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L, 296L), isdst = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L)), .Names = c("sec", "min", "hour", "mday", "mon", "year", "wday", "yday", "isdst"), class = c("POSIXlt", "POSIXt")), lat = c(33.06621406, 33.06616621, 33.06617305, 33.06617624, 33.06617932, 33.06618161, 33.06618326, 33.06618604, 33.06615089, 33.06628004, 33.06618461, 33.06615113, 33.0661362, 33.06620301, 33.0662218, 33.06624283, 33.06622268, 33.06622425, 33.06622787, 33.06623042, 33.06623318, 33.06623654, 33.06623826, 33.06623919, 33.06623907, 33.06624009, 33.06623804, 33.06624255, 33.06624377, 33.06624446, 33.06624242, 33.06624254, 33.06624513, 33.06624582, 33.06615573, 33.06625534, 33.06618541, 33.06613825, 33.06613624, 33.06614027, 33.06614551, 33.06614844, 33.06615393), lon = c(-116.6105531, -116.6105651,-116.6105613, -116.6105553, -116.610551, -116.610549, -116.6105484, -116.6105512, -116.6105712, -116.6104996, -116.6104711, -116.6104854, -116.6105596, -116.6104509, -116.610524, -116.6105535, -116.6105461, -116.6105461, -116.6105477, -116.6105498, -116.6105478, -116.6105473, -116.6105473, -116.6105488, -116.6105497, -116.6105479, -116.610545, -116.6105461, -116.6105448, -116.610543, -116.6105409, -116.6105395, -116.6105367, -116.6105337, -116.6105344, -116.6104779, -116.6104953,-116.6105222, -116.610526, -116.6105255, -116.6105282, -116.6105265,-116.6105282)), .Names = c("ID", "dateTime", "lat", "lon"), row.names = 1023:1065, class = "data.frame")
So the desired output would look like this:
ID PtID DateTime lat lon
1 T031 2012-10-23 17:11:00 33.06628 -116.6105
1 T031 2012-10-23 17:12:00 33.06618 -116.6105
1 T031 2012-10-23 17:13:00 33.06615 -116.6105
1 T031 2012-10-23 17:14:00 33.06614 -116.6106
1 T031 2012-10-23 17:15:00 33.06620 -116.6105
1 T031 2012-10-23 17:16:00 33.06622 -116.6105
2 T031 2012-10-23 17:35:00 33.06625 -116.6105
2 T031 2012-10-23 17:36:00 33.06616 -116.6105
2 T031 2012-10-23 17:37:00 33.06626 -116.6105
So with sqldf maybe something like this?
sqldf("SELECT df2.ID, df2.lon, df2.lat, FROM df1
INNER JOIN df2 ON df1.ID = df2.ID
WHERE df2.DateTime BETWEEN df1.StartDateTime AND df1.EndDateTime")
In general, its not a good idea to use POSIXlt in data frames. Use POSIXct instead. Also your SQL statement is ok except the comma before FROM needs to be removed:
df1a <- transform(df1,
StartDateTime = as.POSIXct(StartDateTime),
EndDateTime = as.POSIXct(EndDateTime))
df2a <- transform(df2, dateTime = as.POSIXct(dateTime))
The SQL statement in the question has an extraneous commma before FROM.
Here is a slightly simplified statement. This one uses a left join instead to ensure that all ID's from df1a are included even if they have no matches in df2a.
sqldf("SELECT df1a.ID, PtID, dateTime, lat, lon
FROM df1a LEFT JOIN df2a
ON df1a.ID = df2a.ID AND dateTime BETWEEN StartDateTime AND EndDateTime")
You may want to look into defining your data as zoo objects. merge.zoo does something very close to what you ask. Refer to this question for more: R: merge two irregular time series
I've to plot these data:
day temperature
02/01/2012 13:30:00 10
10/01/2012 20:30:00 8
15/01/2012 13:30:00 12
25/01/2012 20:30:00 6
02/02/2012 13:30:00 5
10/02/2012 20:30:00 3
15/02/2012 13:30:00 6
25/02/2012 20:30:00 -1
02/03/2012 13:30:00 4
10/03/2012 20:30:00 -2
15/03/2012 13:30:00 7
25/03/2012 20:30:00 1
in the x-axis I want to label only the month and the day (e.g. Jan 02 ). How can I do this using the command plot() and axis()?
First, you will need to put your date text into a dtae class (e.g. as.POSIXct):
df <- structure(list(day = structure(list(sec = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0), min = c(30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L), hour = c(13L, 20L, 13L, 20L, 13L, 20L,
13L, 20L, 13L, 20L, 13L, 20L), mday = c(2L, 10L, 15L, 25L, 2L,
10L, 15L, 25L, 2L, 10L, 15L, 25L), mon = c(0L, 0L, 0L, 0L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L), year = c(112L, 112L, 112L, 112L,
112L, 112L, 112L, 112L, 112L, 112L, 112L, 112L), wday = c(1L,
2L, 0L, 3L, 4L, 5L, 3L, 6L, 5L, 6L, 4L, 0L), yday = c(1L, 9L,
14L, 24L, 32L, 40L, 45L, 55L, 61L, 69L, 74L, 84L), isdst = c(0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L)), .Names = c("sec",
"min", "hour", "mday", "mon", "year", "wday", "yday", "isdst"
), class = c("POSIXlt", "POSIXt")), temperature = c(10L, 8L,
12L, 6L, 5L, 3L, 6L, -1L, 4L, -2L, 7L, 1L)), .Names = c("day",
"temperature"), row.names = c(NA, -12L), class = "data.frame")
df
df$day <- as.POSIXct(df$day, format="%d/%m/%Y %H:%M:%S")
Your dates should now plot correctly. Don't apply the x-axis, by using the argument xaxt="n". Afterwards, you can create a sequence of dates where you would like your axis labeled, and apply this with axis.POSIXct:
plot(df$day, df$temperature, t="l", ylab="Temperature", xlab="Date", xaxt="n")
SEQ <- seq(min(df$day), max(df$day), by="months")
axis.POSIXct(SEQ, at=SEQ, side=1, format="%b %Y")
Similarly, to get a daily axis, simply modify the SEQ and axis.POSIXct code accordingly. For example, you may try:
plot(df$day, df$temperature, t="l", ylab="Temperature", xlab="Date", xaxt="n")
SEQ <- seq(min(df$day), max(df$day), by="days")
axis.POSIXct(SEQ, at=SEQ, side=1, format="%b %d")