I am trying to plot months and days that span several years, without displaying the year. I can use a factor to force the dates into a non-calendar order. With a single year of data this code works, but with multiple years I am running into trouble with duplicated dates and the leap year, and with getting the plot to work.
Error: Column x must be length 1 or 730, not 365
I would like to use the factor to plot the years seasonally, one over the other, for each "%m-%d".
Thank you.
dates <- format(seq(from = as.Date("2016-12-01"), to = as.Date("2018-11-30"),
by = "days"), format = "%m-%d")
values = c(rnorm(length(dates)/2, 8, 1.5), rnorm(length(dates)/2, 16, 2))
fDates <- factor(dates, dates)
plot_ly(x = ~fDates, y = ~values, type = "scatter", mode = "lines")
EDIT: This is the desired range for the x-axis. I use the factor to force the levels to conform to this order. I just need to be able to plot multiple years %m-%d against these x axis coordinates.
[1] 12-01 12-02 12-03 12-04 12-05 12-06 12-07 12-08 12-09 12-10 12-11 12-12 12-13 12-14 12-15 12-16 12-17 12-18 12-19 12-20 12-21
[22] 12-22 12-23 12-24 12-25 12-26 12-27 12-28 12-29 12-30 12-31 01-01 01-02 01-03 01-04 01-05 01-06 01-07 01-08 01-09 01-10 01-11
[43] 01-12 01-13 01-14 01-15 01-16 01-17 01-18 01-19 01-20 01-21 01-22 01-23 01-24 01-25 01-26 01-27 01-28 01-29 01-30 01-31 02-01
[64] 02-02 02-03 02-04 02-05 02-06 02-07 02-08 02-09 02-10 02-11 02-12 02-13 02-14 02-15 02-16 02-17 02-18 02-19 02-20 02-21 02-22
[85] 02-23 02-24 02-25 02-26 02-27 02-28 03-01 03-02 03-03 03-04 03-05 03-06 03-07 03-08 03-09 03-10 03-11 03-12 03-13 03-14 03-15
[106] 03-16 03-17 03-18 03-19 03-20 03-21 03-22 03-23 03-24 03-25 03-26 03-27 03-28 03-29 03-30 03-31 04-01 04-02 04-03 04-04 04-05
[127] 04-06 04-07 04-08 04-09 04-10 04-11 04-12 04-13 04-14 04-15 04-16 04-17 04-18 04-19 04-20 04-21 04-22 04-23 04-24 04-25 04-26
[148] 04-27 04-28 04-29 04-30 05-01 05-02 05-03 05-04 05-05 05-06 05-07 05-08 05-09 05-10 05-11 05-12 05-13 05-14 05-15 05-16 05-17
[169] 05-18 05-19 05-20 05-21 05-22 05-23 05-24 05-25 05-26 05-27 05-28 05-29 05-30 05-31 06-01 06-02 06-03 06-04 06-05 06-06 06-07
[190] 06-08 06-09 06-10 06-11 06-12 06-13 06-14 06-15 06-16 06-17 06-18 06-19 06-20 06-21 06-22 06-23 06-24 06-25 06-26 06-27 06-28
[211] 06-29 06-30 07-01 07-02 07-03 07-04 07-05 07-06 07-07 07-08 07-09 07-10 07-11 07-12 07-13 07-14 07-15 07-16 07-17 07-18 07-19
[232] 07-20 07-21 07-22 07-23 07-24 07-25 07-26 07-27 07-28 07-29 07-30 07-31 08-01 08-02 08-03 08-04 08-05 08-06 08-07 08-08 08-09
[253] 08-10 08-11 08-12 08-13 08-14 08-15 08-16 08-17 08-18 08-19 08-20 08-21 08-22 08-23 08-24 08-25 08-26 08-27 08-28 08-29 08-30
[274] 08-31 09-01 09-02 09-03 09-04 09-05 09-06 09-07 09-08 09-09 09-10 09-11 09-12 09-13 09-14 09-15 09-16 09-17 09-18 09-19 09-20
[295] 09-21 09-22 09-23 09-24 09-25 09-26 09-27 09-28 09-29 09-30 10-01 10-02 10-03 10-04 10-05 10-06 10-07 10-08 10-09 10-10 10-11
[316] 10-12 10-13 10-14 10-15 10-16 10-17 10-18 10-19 10-20 10-21 10-22 10-23 10-24 10-25 10-26 10-27 10-28 10-29 10-30 10-31 11-01
[337] 11-02 11-03 11-04 11-05 11-06 11-07 11-08 11-09 11-10 11-11 11-12 11-13 11-14 11-15 11-16 11-17 11-18 11-19 11-20 11-21 11-22
[358] 11-23 11-24 11-25 11-26 11-27 11-28 11-29 11-30
365 Levels: 12-01 12-02 12-03 12-04 12-05 12-06 12-07 12-08 12-09 12-10 12-11 12-12 12-13 12-14 12-15 12-16 12-17 12-18 ... 11-30
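One possible way to get there, as a minimal sketch rather than a definitive answer: keep the "%m-%d" factor for the x-axis, but give it unique levels and add a separate grouping column so each December-to-November season becomes its own line. The names season, lev and df are made up for illustration, and the level order is taken from a reference season that contains a leap day so that "02-29" has a slot if it ever occurs. The plotting call assumes the plotly package is installed and that the axis order follows the factor levels.

```r
# Sketch only: 'season', 'lev' and 'df' are illustrative names, not from the question.
set.seed(42)
d  <- seq(as.Date("2016-12-01"), as.Date("2018-11-30"), by = "day")
md <- format(d, "%m-%d")

# Label each Dec-Nov season by the calendar year in which it starts.
yr     <- as.integer(format(d, "%Y"))
mon    <- as.integer(format(d, "%m"))
season <- ifelse(mon == 12, yr, yr - 1)

# Unique levels in Dec-Nov order, taken from a season that includes Feb 29.
lev <- format(seq(as.Date("2015-12-01"), as.Date("2016-11-30"), by = "day"),
              "%m-%d")

df <- data.frame(
  md     = factor(md, levels = lev),
  season = factor(season),
  values = c(rnorm(365, 8, 1.5), rnorm(365, 16, 2))
)

# One line per season over the shared month-day axis (needs plotly).
if (requireNamespace("plotly", quietly = TRUE)) {
  p <- plotly::plot_ly(df, x = ~md, y = ~values, color = ~season,
                       type = "scatter", mode = "lines")
  # print(p) to display the figure
}
```

Because the factor levels are unique, x and y have the same length (730) and the "must be length 1 or 730" mismatch should not arise.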
I have a very small data set of 63 observations in a data frame. I want to convert it to a ts object so that I can do seasonal and trend decomposition using stl() with an s.window of 7. The odd thing is that when I coerce the data with a frequency of 7 for weekly seasonal effects, the resulting ts has a frequency of 1, not 7. Frequency 7 is needed for weekly periodicity. I don't get an error or warning, and I can't find any reference that explains this behavior.
Actual data set:
x <- c(77, 19, 13, 46, 302, 676, 1311, 1479, 1884, 1941, 2601, 2603, 2812,
2570, 2602, 2459, 2623, 2672, 2710, 2780, 2945, 2711, 2645, 2077, 2842,
2990, 3205, 2956, 2823, 2421, 1302, 1180, 2539, 2087, 2610, 2726, 3129,
4716, 2848, 3152, 2869, 2894, 2763, 3011, 2579, 2610, 2724, 2832, 2373,
2412, 2269, 2488, 1958, 2161, 2574, 2818, 2164, 2973, 2840, 2762, 2613,
1896, 3180)
df <- data.frame(x)
xts <- as.ts(df$x, start=c(2017,1), frequency=7, calendar=T)
xts
Results
Time Series:
Start = 1
End = 63
Frequency = 1
[1] 77 19 13 46 302 676 1311 1479 1884 1941 2601 2603 2812 2570
2602 2459 2623 2672 2710 2780 2945 2711
[23] 2645 2077 2842 2990 3205 2956 2823 2421 1302 1180 2539 2087 2610 2726
3129 4716 2848 3152 2869 2894 2763 3011
[45] 2579 2610 2724 2832 2373 2412 2269 2488 1958 2161 2574 2818 2164 2973
2840 2762 2613 1896 3180
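A likely explanation, offered as a sketch rather than a confirmed diagnosis: as.ts() is a coercion function, and its default method appears to ignore extra arguments such as start and frequency, so the series is built with frequency 1. Constructing the series with ts() and an explicit frequency should give the weekly periodicity. The name x_ts and the omission of start are choices made here for illustration.

```r
# The 63 observations from the question.
x <- c(77, 19, 13, 46, 302, 676, 1311, 1479, 1884, 1941, 2601, 2603, 2812,
       2570, 2602, 2459, 2623, 2672, 2710, 2780, 2945, 2711, 2645, 2077, 2842,
       2990, 3205, 2956, 2823, 2421, 1302, 1180, 2539, 2087, 2610, 2726, 3129,
       4716, 2848, 3152, 2869, 2894, 2763, 3011, 2579, 2610, 2724, 2832, 2373,
       2412, 2269, 2488, 1958, 2161, 2574, 2818, 2164, 2973, 2840, 2762, 2613,
       1896, 3180)

# Build the series with ts(), which does take a frequency argument.
x_ts <- ts(x, frequency = 7)
frequency(x_ts)

# With 9 full weekly cycles, stl() has enough periods to decompose.
fit <- stl(x_ts, s.window = 7)
head(fit$time.series)
```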
This may be an odd question: I imported a sas7 dataset into R.
One of my variables is visit_date.
It now looks like this. How can I transform the values back to MM-DD-YYYY? I need to exclude data earlier than MDY(08-01-2010).
> chris$visit_date
[1] 17077 17091 17105 17119 17133 17069 17083 17097 17111 17125 17080 17094 17108
[14] 17122 17136 17098 17112 17210 17224 17238 17252 17266 17247 17261 17254 17268
[27] 17282 17296 17324 17237 17251 17265 17279 17293 17329 17343 17357 17385 17413
[40] 17259 17273 17287 17301 17315 17328 17342 17356 17370 17384 17335 17349 17377
[53] 17391 17405 17331 17345 17359 17373 17387 17435 17449 17463 17477 17505 17336
[66] 17364 17378 17392 17406 17352 17366 17380 17394 17408 17427 17441 17469 17483
[79] 17497 17440 17454 17468 17482 17496 17434 17448 17462 17476 17490 17419 17433
[92] 17447 17461 17475 17518 17560 17574 17588 17616 17653 17667 17681 17695 17709
[105] 17644 17658 17686 17700 17728 17755 17769 17783 17811 17825 17825 17610 17624
[118] 17638 17652 17666 18072 18114 18127 18155 18169 17651 17665 17680 17693 17707
[131] 17657 17671 17685 17699 17659 17673 17687 17701 17715 17646 17660 17674 17688
[144] 17702 17721 17735 17749 17763 17770 17734 17748 17762 17790 17861 17736 17750
[157] 17764 17778 17792 17751 17765 17779 17793 17807 17742 17756 17770 17784 17798
[170] 17772 17757 17771 17785 17799 17813 17777 17791 17819 17833 17854 17923 17937
[183] 17965 17979 17993 17825 17839 17853 17867 17909 17832 17846 17860 17874 17888
[196] 17919 17933 17961 17975 17989 17960 17974 17988 18002 18016 18183 18211 18225
[209] 18239 18253 17931 17945 17959 17973 17987 17940 17954 17968 17982 17996 17966
[222] 17980 17994 18022 18036 18021 18035 18049 18063 18091 18050 18064 18078 18092
[235] 18106 18045 18059 18073 18087 18115 18024 18038 18052 18066 18080 18056 18070
[248] 18084 18098 18112 18107 18121 18135 18149 18163 18105 18119 18133 18161 18175
[261] 18143 18171 18185 18199 18213 18203 18246 18274 18288 18302 18316 18248 18276
[274] 18290 18304 18318 18310 18324 18338 18352 18366 18315 18343 18357 18364 18378
[287] 18350 18364 18378 18406 18420 18337 18351 18365 18379 18393 18374 18388 18402
[300] 18430 18472 18344 18358 18386 18400 18414 18353 18381 18395 18409 18423 18387
[313] 18415 18429 18443 18450 18408 18422 18436 18443 18464 18430 18437 18457 18464
[326] 18471 18427 18434 18441 18455 18462 18428 18442 18456 18463 18470
Thanks
Those "dates" are day counts that clearly use a different origin/offset than the usual POSIX one (1970-01-01), which is what this conversion assumes (ddd here holds the values of chris$visit_date). R generally prints dates in YYYY-MM-DD format.
as.Date(ddd, origin="1970-01-01")
> head( as.Date(ddd, origin="1970-01-01") )
[1] "2016-10-03" "2016-10-17" "2016-10-31" "2016-11-14" "2016-11-28" "2016-09-25"
So you need to establish the correct origin. If it was 1960-01-01 (the origin SAS uses for date values), then none of those dates is on or after 08-01-2010.
> sum( as.Date(ddd, origin="1960-01-01") >= as.Date("2010-08-01") )
[1] 0
> sum( as.Date(ddd, origin="1960-01-01") < as.Date("2010-08-01") )
[1] 336
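As a minimal sketch of the whole round trip, assuming the SAS origin of 1960-01-01 is the right one for this file: convert the day counts to Date, format them as MM-DD-YYYY for display, and filter on the Date values rather than on the formatted strings. The names visit_num, visit_date and keep are illustrative, and only a few values from the question are used.

```r
# A few of the raw values from chris$visit_date (days since the assumed origin).
visit_num <- c(17077, 17091, 18072, 18472)

# Convert using the SAS date origin.
visit_date <- as.Date(visit_num, origin = "1960-01-01")

# Display as MM-DD-YYYY; this is a character vector, so do not compare on it.
format(visit_date, "%m-%d-%Y")

# Filter on the Date objects: TRUE for visits on or after 2010-08-01.
keep <- visit_date >= as.Date("2010-08-01")
visit_date[keep]
```

Comparing Date objects keeps the filter chronological, whereas comparing "MM-DD-YYYY" strings would sort by month first.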
If we have data for three years:
dat= (x1:x1096)
I want to compute the average this way:
[x1+x366(the first day in the second year)+ x731(the first day in the third year)]/3
[x2+x367(the second day in the second year)+ x732(the second day in the third year)]/3
and so on till the day 365:
[x365+x730(the last day in the second year)+ x1096(the last day in the third year)]/3
finally I will get 365 values out of that.
dat= c(1:1096)
Any idea on how to do this?
data.table comes in quite handy here (even though a base R solution is perfectly doable):
> library(data.table)
> set.seed(1)
> dat <- data.table(date=seq(as.Date("2010-01-01"), as.Date("2012-12-31"), "days"),
+ var=rnorm(1096))
> dat
date var
1: 2010-01-01 -0.626453811
2: 2010-01-02 0.183643324
3: 2010-01-03 -0.835628612
4: 2010-01-04 1.595280802
5: 2010-01-05 0.329507772
---
1092: 2012-12-27 0.711213964
1093: 2012-12-28 -0.337691156
1094: 2012-12-29 -0.009148952
1095: 2012-12-30 -0.125309208
1096: 2012-12-31 -2.090846097
> dat[, mean(var), by=list(month=month(date), mday(date))]
month mday V1
1: 1 1 -0.16755484
2: 1 2 0.59942582
3: 1 3 -0.44336168
4: 1 4 0.01297244
5: 1 5 -0.20317854
---
362: 12 28 -0.18076284
363: 12 29 0.07302903
364: 12 30 -0.01790655
365: 12 31 -0.87164859
366: 2 29 -0.78859794
February 29th is at the end because, when [.data.table formed the groups, that day was the last unique combination of month(date) and mday(date) it found, since it first appears in 2012. Once you have your result, you can set the keys, which sorts the table:
> result <- dat[, mean(var), by=list(month=month(date), mday(date))]
> setkey(result, month, mday)
> result
month mday V1
1: 1 1 -0.16755484
2: 1 2 0.59942582
3: 1 3 -0.44336168
4: 1 4 0.01297244
5: 1 5 -0.20317854
---
362: 12 27 -0.60348463
363: 12 28 -0.18076284
364: 12 29 0.07302903
365: 12 30 -0.01790655
366: 12 31 -0.87164859
Here is a base solution that accounts for leap years:
# First your data
set.seed(1)
dat <- rnorm(1096) #Value for each day
day <- seq(as.Date("2010-01-01"), as.Date("2012-12-31"), "days") #Corresponding days
sapply(split(dat,format(day,"%m-%d")),mean)
01-01 01-02 01-03 01-04 01-05 01-06 01-07 01-08 01-09
-0.167554841 0.599425816 -0.443361675 0.012972442 -0.203178536 -0.553501370 0.563475994 -0.094459075 0.567263811
01-10 01-11 01-12 01-13 01-14 01-15 01-16 01-17 01-18
-0.325835336 -0.247226807 -0.272224241 0.171886332 -0.562604980 0.640473418 -0.209380261 0.709635402 -0.263715734
01-19 01-20 01-21 01-22 01-23 01-24 01-25 01-26 01-27
0.929096171 1.173422823 -0.197411808 -0.730959553 -0.277022971 -1.075673025 -0.494038031 -0.255709319 0.827062779
01-28 01-29 01-30 01-31 02-01 02-02 02-03 02-04 02-05
0.208963353 0.215192803 -0.118735162 0.141028516 0.703267761 -0.282852177 -0.297731589 -0.112031601 0.784073396
02-06 02-07 02-08 02-09 02-10 02-11 02-12 02-13 02-14
0.714499179 0.206640777 0.283234842 -0.255182989 -0.293285997 -0.761585755 0.443379228 1.138436815 -0.483004921
02-15 02-16 02-17 02-18 02-19 02-20 02-21 02-22 02-23
-0.692188333 0.701422889 0.677544133 -0.423576371 0.498868978 0.053960271 0.518228979 -0.250840385 -0.722647734
02-24 02-25 02-26 02-27 02-28 02-29 03-01 03-02 03-03
1.344507325 0.693403586 -0.226489715 -0.406929668 -0.171335064 -0.788597935 0.115894011 1.798749522 -0.502676829
03-04 03-05 03-06 03-07 03-08 03-09 03-10 03-11 03-12
0.244453933 -0.278023124 -0.817932086 -0.618472996 -0.842995408 -0.887451556 0.432459430 0.559562525 -0.516256302
03-13 03-14 03-15 03-16 03-17 03-18 03-19 03-20 03-21
0.392447923 0.191049834 -0.727128826 -0.261740657 -0.189455949 0.775326029 0.236835450 -0.266491426 -0.010319849
03-22 03-23 03-24 03-25 03-26 03-27 03-28 03-29 03-30
-0.949967889 -0.277676523 -0.556777524 -0.507373521 0.076952129 0.697147181 -0.416867359 -0.906909972 -0.231494410
03-31 04-01 04-02 04-03 04-04 04-05 04-06 04-07 04-08
-0.453616811 0.158367456 0.670354625 -0.285493660 -0.040162162 0.762953404 -0.388049908 1.079423205 -0.246508050
04-09 04-10 04-11 04-12 04-13 04-14 04-15 04-16 04-17
-0.215358691 -0.337611847 0.486368813 0.115883308 -0.282207017 0.614554509 0.531435739 1.063455284 -0.199968099
04-18 04-19 04-20 04-21 04-22 04-23 04-24 04-25 04-26
-0.080662691 -0.052822528 1.679629547 -1.341639141 0.986160744 0.468143827 0.029621883 -0.025910053 0.061093981
04-27 04-28 04-29 04-30 05-01 05-02 05-03 05-04 05-05
-0.387992910 -0.917561336 0.161867089 0.874549452 0.866708261 0.048304939 -1.209756576 -0.825689257 -0.176605953
05-06 05-07 05-08 05-09 05-10 05-11 05-12 05-13 05-14
-0.381265758 0.419105218 -0.440418731 -0.293923704 1.427366374 -0.020773738 -0.358619841 -0.294738750 -0.269765222
05-15 05-16 05-17 05-18 05-19 05-20 05-21 05-22 05-23
0.277361477 -0.505072373 -0.765572754 -0.493223200 -0.253297588 0.902399037 0.007676731 -0.273059247 -0.784701888
05-24 05-25 05-26 05-27 05-28 05-29 05-30 05-31 06-01
0.063532445 -0.681369105 -1.034300631 0.689037398 -0.209889037 -0.535166412 -0.994984541 0.438795387 -0.167806908
06-02 06-03 06-04 06-05 06-06 06-07 06-08 06-09 06-10
0.079629296 -0.063908968 0.484892252 -0.922112094 0.978258635 -0.790949931 -0.303356059 0.681310315 -0.512109593
06-11 06-12 06-13 06-14 06-15 06-16 06-17 06-18 06-19
0.337126461 0.526594905 0.742784618 -0.163083706 0.027435241 0.709630255 -1.144544436 -0.374108608 0.102721328
06-20 06-21 06-22 06-23 06-24 06-25 06-26 06-27 06-28
0.577569049 0.224528626 0.206667019 0.392007605 -0.557974448 0.068685789 0.460201512 1.101334023 0.035838933
06-29 06-30 07-01 07-02 07-03 07-04 07-05 07-06 07-07
0.873903793 -0.586658280 -0.395094221 0.303312480 -0.631756580 0.088308518 0.046129624 0.642985443 -0.615693218
07-08 07-09 07-10 07-11 07-12 07-13 07-14 07-15 07-16
0.372776652 0.453644860 0.466905164 -0.526930331 -0.351139797 0.250132593 -0.881175203 -1.090136940 0.409708249
07-17 07-18 07-19 07-20 07-21 07-22 07-23 07-24 07-25
0.206436178 0.056134229 -0.057927905 0.807127686 0.423170493 -0.325181464 -0.053593067 0.261438323 0.520617153
07-26 07-27 07-28 07-29 07-30 07-31 08-01 08-02 08-03
0.053800701 0.326492953 -0.471839346 0.438963172 0.499502012 0.620917026 0.619923442 -1.422177067 0.212056501
08-04 08-05 08-06 08-07 08-08 08-09 08-10 08-11 08-12
0.497181456 0.703607380 -0.054104370 0.931407619 0.545759743 -0.323646872 0.127371847 0.017697636 -0.033060879
08-13 08-14 08-15 08-16 08-17 08-18 08-19 08-20 08-21
-0.583034512 0.824859915 -0.019064796 -0.226035270 -1.026526076 -0.882074229 -0.079167867 -2.073168805 0.378121135
08-22 08-23 08-24 08-25 08-26 08-27 08-28 08-29 08-30
-0.004516521 -0.661187139 0.339497500 -0.042210229 0.026970585 0.431653210 0.104619786 0.149562359 -0.473661114
08-31 09-01 09-02 09-03 09-04 09-05 09-06 09-07 09-08
-0.235250025 -0.624645896 0.141205349 -0.485201261 0.097633486 0.462059099 -0.500082678 1.386621118 -0.070895288
09-09 09-10 09-11 09-12 09-13 09-14 09-15 09-16 09-17
-0.126090048 -0.371028573 -0.010479329 0.192555782 0.025085776 -1.410061589 1.046273116 0.938254501 -0.072773342
09-18 09-19 09-20 09-21 09-22 09-23 09-24 09-25 09-26
-0.272947102 0.279357832 0.172702983 0.219560592 0.922992902 -0.612832806 -0.450896711 -1.134353324 -0.336199724
09-27 09-28 09-29 09-30 10-01 10-02 10-03 10-04 10-05
-0.459242718 0.049888664 0.079844541 -0.058636867 0.581553407 -0.315806482 -0.163864166 -1.513984901 0.069093641
10-06 10-07 10-08 10-09 10-10 10-11 10-12 10-13 10-14
-0.325709367 0.114176104 -0.470510646 -0.393891025 -0.659031395 -0.224657523 -0.336803115 -0.510526475 -0.941899166
10-15 10-16 10-17 10-18 10-19 10-20 10-21 10-22 10-23
0.559205646 0.346629848 0.310935589 -0.851962382 0.387930834 0.505692192 -0.738722861 0.410302113 -0.181359914
10-24 10-25 10-26 10-27 10-28 10-29 10-30 10-31 11-01
0.831105889 -0.398852239 -0.164535170 -0.870295447 0.057609116 -1.058556114 0.809784093 0.188277796 1.432543613
11-02 11-03 11-04 11-05 11-06 11-07 11-08 11-09 11-10
0.040680316 0.711553107 0.565285429 -0.829181807 0.455487776 -0.037182199 -0.644669824 -0.704611643 0.491631958
11-11 11-12 11-13 11-14 11-15 11-16 11-17 11-18 11-19
-0.051188454 0.963031185 -0.511791970 0.193671830 -0.333065645 -0.176479500 0.367566807 -0.056534518 1.391773053
11-20 11-21 11-22 11-23 11-24 11-25 11-26 11-27 11-28
0.162741879 -0.269991630 0.866532461 -0.352034768 -0.028515790 -0.671437717 -0.393703641 0.394041604 -0.959721458
11-29 11-30 12-01 12-02 12-03 12-04 12-05 12-06 12-07
-0.187149463 0.203037321 -0.824439261 -0.081277243 0.361409692 -0.300022665 -0.067589145 -0.265877741 -0.474834675
12-08 12-09 12-10 12-11 12-12 12-13 12-14 12-15 12-16
-0.903405316 0.026396956 0.930117145 -0.489879346 -0.481598661 0.122388492 0.042287328 -0.160328704 0.777249363
12-17 12-18 12-19 12-20 12-21 12-22 12-23 12-24 12-25
-0.359802827 0.252189848 0.754686655 -0.012767780 0.683605939 0.782528149 -0.786087093 0.751560196 -0.610885984
12-26 12-27 12-28 12-29 12-30 12-31
0.203570612 -0.603484627 -0.180762839 0.073029026 -0.017906554 -0.871648586
The idea is to split according to the day of the year (%m-%d) and take the mean of each subgroup.
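One thing worth checking with this approach is how many values go into each mean. The sketch below rebuilds the same three years of data and counts the group sizes; n_per_day, groups and means are names chosen here for illustration.

```r
# Same setup as above: one value per day for 2010-2012 (2012 is a leap year).
set.seed(1)
day <- seq(as.Date("2010-01-01"), as.Date("2012-12-31"), "days")
dat <- rnorm(length(day))

# Group the values by month-day, then look at group sizes and means.
groups    <- split(dat, format(day, "%m-%d"))
n_per_day <- lengths(groups)
means     <- vapply(groups, mean, numeric(1))

# Every day has three values except the leap day, which has one.
table(n_per_day)
n_per_day[["02-29"]]
```

So the result has 366 entries, and the "02-29" entry is a single observation rather than an average over years.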
EDIT - Michele (I thought it was better to improve this answer as exclusively base related, instead of mine):
If the above vectors were used to create a data.frame, then this solution is a good alternative:
> library(plyr)
> dat <- data.frame(date=day, var=dat)
> ddply(dat, .(day=format(date,"%m-%d")), summarise, result=mean(var))
day result
1 01-01 -0.167554841
2 01-02 0.599425816
3 01-03 -0.443361675
4 01-04 0.012972442
5 01-05 -0.203178536
6 01-06 -0.553501370
NB: sorry, this actually uses the plyr package, but it still works on a data.frame, and ddply could be replaced by by() from the base package.
Perhaps like this? I tried it out on a slightly smaller example than your 1:1096 vector - I used 5 values per year instead.
# the data, here 3 years with 5 values per year.
dat <- 1:15
# put your vector in a matrix
# by default, the matrix is filled column-wise
# thus, each column corresponds to a year, and each row to day of year
mm <- matrix(dat, ncol = 3)
# calculate row means
mm <- cbind(mm, rowMeans(mm))
mm
# [,1] [,2] [,3] [,4]
# [1,] 1 6 11 6
# [2,] 2 7 12 7
# [3,] 3 8 13 8
# [4,] 4 9 14 9
# [5,] 5 10 15 10
Update
Another base alternative that accounts for leap years, using the same (i.e. set.seed(1)) 'full' data from @Michele's answer:
df2 <- aggregate(var ~ format(date, "%m-%d"), data = dat, FUN = mean)
head(df2)
# format(date, "%m-%d") var
# 1 01-01 -0.16755484
# 2 01-02 0.59942582
# 3 01-03 -0.44336168
# 4 01-04 0.01297244
# 5 01-05 -0.20317854
# 6 01-06 -0.55350137
I am attempting to perform a study on the clustering of high/low points based on time. I managed to achieve the above by using to.daily on intraday data and merging the two using:
intraday.merge <- merge(intraday,daily)
intraday.merge <- na.locf(intraday.merge)
intraday.merge <- intraday.merge["T08:30:00/T16:30:00"] # remove record at 00:00:00
Next, I tried to obtain the records where the high == daily.high/low == daily.low using:
intradayhi <- test[test$High == test$Daily.High]
intradaylo <- test[test$Low == test$Daily.Low]
Resulting data resembles the following:
Open High Low Close Volume Daily.Open Daily.High Daily.Low Daily.Close Daily.Volume
2012-06-19 08:45:00 258.9 259.1 258.5 258.7 1424 258.9 259.1 257.7 258.7 31523
2012-06-20 13:30:00 260.8 260.9 260.6 260.6 1616 260.4 260.9 259.2 260.8 35358
2012-06-21 08:40:00 260.7 260.8 260.4 260.5 493 260.7 260.8 257.4 258.3 31360
2012-06-22 12:10:00 255.9 256.2 255.9 256.1 626 254.5 256.2 253.9 255.3 50515
2012-06-22 12:15:00 256.1 256.2 255.9 255.9 779 254.5 256.2 253.9 255.3 50515
2012-06-25 11:55:00 254.5 254.7 254.4 254.6 1589 253.8 254.7 251.5 253.9 65621
2012-06-26 08:45:00 253.4 254.2 253.2 253.7 5849 253.8 254.2 252.4 253.1 70635
2012-06-27 11:25:00 255.6 256.0 255.5 255.9 973 251.8 256.0 251.8 255.2 53335
2012-06-28 09:00:00 257.0 257.3 256.9 257.1 601 255.3 257.3 255.0 255.1 23978
2012-06-29 13:45:00 253.0 253.4 253.0 253.4 451 247.3 253.4 246.9 253.4 52539
The subset contains more than one record for some days. How do I keep only the first record of each day? I would then be able to plot the count of records for periods in the day.
Also, are there alternate methods to get the results I want? Thanks in advance.
Edit:
Sample output should look like this, count could either be 1st result for day or aggregated (more than 1 occurrence in that day):
Time Count
08:40:00 60
08:45:00 54
08:50:00 60
...
14:00:00 20
14:05:00 12
14:10:00 30
You can get the first observation of each day via:
y <- apply.daily(x, first)
Then you can simply count the records by hours and minutes:
z <- aggregate(1:NROW(y), by=list(Time=format(index(y),"%H:%M")), length)
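The same idea can be sketched in base R without xts, which may help when checking the result by hand. This assumes the timestamps are already sorted and uses the ten sample rows from the question; stamp, first_of_day and counts are illustrative names.

```r
# Timestamps of the sample records where High == Daily.High (from the question).
stamp <- as.POSIXct(c(
  "2012-06-19 08:45:00", "2012-06-20 13:30:00", "2012-06-21 08:40:00",
  "2012-06-22 12:10:00", "2012-06-22 12:15:00", "2012-06-25 11:55:00",
  "2012-06-26 08:45:00", "2012-06-27 11:25:00", "2012-06-28 09:00:00",
  "2012-06-29 13:45:00"), tz = "UTC")

# Keep only the first record of each calendar day (input must be time-ordered).
first_of_day <- stamp[!duplicated(format(stamp, "%Y-%m-%d"))]

# Count how many days had their first hit at each time of day.
counts <- table(Time = format(first_of_day, "%H:%M"))
as.data.frame(counts, responseName = "Count")
```

To count every occurrence instead of only the first per day, tabulate format(stamp, "%H:%M") directly.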