Group rows based on time window and sum observations - r

Here is an example of my data.frame:
df_1 = read.table(text='ID Day Episode Count
1 30001 7423 47 16
2 33021 7423 47 16
3 33024 7423 47 16
4 33034 7423 47 16
5 37018 7423 47 16
6 40004 7423 47 16
7 40011 7423 47 16
8 41028 7423 47 16
9 42001 7423 47 16
10 42011 7423 47 16
11 45003 7423 47 16
12 45004 7423 47 16
13 45005 7423 47 16
14 46006 7423 47 16
15 46008 7423 47 16
16 47004 7423 47 16
17 2001 7438 54 13
18 19007 7438 54 13
19 20002 7438 54 13
20 22006 7438 54 13
21 22007 7438 54 13
22 29002 7438 54 13
23 29003 7438 54 13
24 30001 7438 54 13
25 30004 7438 54 13
26 33023 7438 54 13
27 33029 7438 54 13
28 41006 7438 54 13
29 41020 7438 54 13
30 21009 7428 65 12
31 24001 7428 65 12
32 25005 7428 65 12
33 25009 7428 65 12
34 27002 7428 65 12
35 27003 7428 65 12
36 27009 7428 65 12
37 30001 7428 65 12
38 33023 7428 65 12
39 33029 7428 65 12
40 33050 7428 65 12
41 34003 7428 65 12
42 14001 7427 81 10
43 16004 7427 81 10
44 17001 7427 81 10
45 17005 7427 81 10
46 19001 7427 81 10
47 19006 7427 81 10
48 19007 7427 81 10
49 19010 7427 81 10
50 20001 7427 81 10
51 21009 7427 81 10
52 28047 7424 143 9
53 28049 7424 143 9
54 29002 7424 143 9
55 29003 7424 143 9
56 30003 7424 143 9
57 30004 7424 143 9
58 32010 7424 143 9
59 33014 7424 143 9
60 33023 7424 143 9
104 30001 6500 111 9
105 33021 6500 111 9
106 33024 6500 111 9
107 33034 6500 111 9
108 37018 6500 111 9
109 40004 6500 111 9
110 40011 6500 111 9
111 41028 6500 111 9
112 42001 6500 111 9
61 25005 7422 158 8
62 27021 7422 158 8
63 28015 7422 158 8
64 28047 7422 158 8
65 28049 7422 158 8
66 29001 7422 158 8
67 29002 7422 158 8
68 29003 7422 158 8
69 27002 7425 246 6
70 27003 7425 246 6
71 27021 7425 246 6
72 33006 7425 246 6
73 34001 7425 246 6
74 37019 7425 246 6
75 33014 7429 979 5
76 33021 7429 979 5
77 33024 7429 979 5
78 34001 7429 979 5
79 35010 7429 979 5
80 28022 7426 1199 5
81 34006 7426 1199 5
82 37006 7426 1199 5
83 37008 7426 1199 5
84 37018 7426 1199 5
85 29001 7437 1756 4
86 30014 7437 1756 4
87 32010 7437 1756 4
88 45004 7437 1756 4
89 4003 7430 1757 4
90 15013 7430 1757 4
91 16004 7430 1757 4
92 43007 7430 1757 4
93 7002 7434 1570 4
94 8006 7434 1570 4
95 15006 7434 1570 4
96 94001 7434 1570 4
113 33024 6499 135 4
114 33034 6499 135 4
115 37018 6499 135 4
116 40004 6499 135 4
222 3005 7440 999 2
223 3400 7440 999 2
97 3002 7433 2295 2
98 4003 7433 2295 2
99 48005 7436 3389 2
100 49004 7436 3389 2
101 8006 7431 3390 2
102 15006 7431 3390 2
104 6780 7439 22 1
103 41020 7435 4511 1', header = TRUE)
The data.frame is ordered by -Count and -Day, and that order cannot be changed.
What I need to do is group the data.frame by Day together with its previous 16 days (17 days in total) and sum the observations in the Count column. So in this case the Day whose 17-day window captures the largest total Count is 7438: take 7438, 7437, 7436, 7435, etc. down to 7422 and sum the Count values. The next Day observation which cannot be included in the 17-day time window of 7438 is 6500. Then look for 6500, 6499, etc. and do the same as for the 7438 group. And do the same for Days 7440 and 7439.
The thing is that if we start counting Days backwards from the top of the Day column, Day = 7423 will include Day = 7422 in its Episode, and both of them will be thrown away from the Day = 7438 Episode.
How can we write code that first analyses all the possible Episode combinations (based on the time window) and eventually selects the one which covers the largest number of days (no more than the time window)?
Expected output:
ID Day Episode Count
1 2001 7438 1 103
2 19007 7438 1 103
3 20002 7438 1 103
4 22006 7438 1 103
5 22007 7438 1 103
6 29002 7438 1 103
7 29003 7438 1 103
8 30001 7438 1 103
9 30004 7438 1 103
10 33023 7438 1 103
11 33029 7438 1 103
12 41006 7438 1 103
13 41020 7438 1 103
14 29001 7437 1 103
15 30014 7437 1 103
16 32010 7437 1 103
17 45004 7437 1 103
18 48005 7436 1 103
19 49004 7436 1 103
20 41020 7435 1 103
21 7002 7434 1 103
22 8006 7434 1 103
23 15006 7434 1 103
24 94001 7434 1 103
25 3002 7433 1 103
26 4003 7433 1 103
27 8006 7431 1 103
28 15006 7431 1 103
29 4003 7430 1 103
30 15013 7430 1 103
31 16004 7430 1 103
32 43007 7430 1 103
33 33014 7429 1 103
34 33021 7429 1 103
35 33024 7429 1 103
36 34001 7429 1 103
37 35010 7429 1 103
38 21009 7428 1 103
39 24001 7428 1 103
40 25005 7428 1 103
41 25009 7428 1 103
42 27002 7428 1 103
43 27003 7428 1 103
44 27009 7428 1 103
45 30001 7428 1 103
46 33023 7428 1 103
47 33029 7428 1 103
48 33050 7428 1 103
49 34003 7428 1 103
50 14001 7427 1 103
51 16004 7427 1 103
52 17001 7427 1 103
53 17005 7427 1 103
54 19001 7427 1 103
55 19006 7427 1 103
56 19007 7427 1 103
57 19010 7427 1 103
58 20001 7427 1 103
59 21009 7427 1 103
60 28022 7426 1 103
61 34006 7426 1 103
62 37006 7426 1 103
63 37008 7426 1 103
64 37018 7426 1 103
65 27002 7425 1 103
66 27003 7425 1 103
67 27021 7425 1 103
68 33006 7425 1 103
69 34001 7425 1 103
70 37019 7425 1 103
71 28047 7424 1 103
72 28049 7424 1 103
73 29002 7424 1 103
74 29003 7424 1 103
75 30003 7424 1 103
76 30004 7424 1 103
77 32010 7424 1 103
78 33014 7424 1 103
79 33023 7424 1 103
80 30001 7423 1 103
81 33021 7423 1 103
82 33024 7423 1 103
83 33034 7423 1 103
84 37018 7423 1 103
85 40004 7423 1 103
86 40011 7423 1 103
87 41028 7423 1 103
88 42001 7423 1 103
89 42011 7423 1 103
90 45003 7423 1 103
91 45004 7423 1 103
92 45005 7423 1 103
93 46006 7423 1 103
94 46008 7423 1 103
95 47004 7423 1 103
96 25005 7422 1 103
97 27021 7422 1 103
98 28015 7422 1 103
99 28047 7422 1 103
100 28049 7422 1 103
101 29001 7422 1 103
102 29002 7422 1 103
103 29003 7422 1 103
104 30001 6500 2 13
105 33021 6500 2 13
106 33024 6500 2 13
107 33034 6500 2 13
108 37018 6500 2 13
109 40004 6500 2 13
110 40011 6500 2 13
111 41028 6500 2 13
112 42001 6500 2 13
113 33024 6499 2 13
114 33034 6499 2 13
115 37018 6499 2 13
116 40004 6499 2 13
117 3005 7440 3 3
118 3400 7440 3 3
119 6780 7439 3 3
My real data.frame has >40,000 rows and >1,600 episodes.

There is probably a better way to accomplish this, but something like this may get you what you need.
library(dplyr)
# 17-day break points, counting back from the day after the latest Day
groups <- seq(max(df_1$Day) + 1, min(df_1$Day), by = -17)
groups <- rev(append(groups, min(df_1$Day)))
# assign each row the lower bound of its bin
df_1$group <- groups[cut(df_1$Day, breaks = groups, labels = FALSE, right = FALSE)]
df_1 <- df_1 %>%
  group_by(group) %>%
  mutate(TotalPerGroup = sum(Count))
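There is also a greedy way to get the episode assignment the question describes; this is a sketch of my reading of the rule, not a tested general solution: among the still-unassigned days, pick the 17-day window [d - 16, d] whose Counts sum highest, label those days as the next episode, and repeat. Summing Count over each episode's unique days then gives new_Count.
day_counts <- unique(df_1[, c("Day", "Count")])
day_counts$new_Episode <- NA_integer_
ep <- 0
while (anyNA(day_counts$new_Episode)) {
  left <- day_counts[is.na(day_counts$new_Episode), ]
  # total Count each candidate window [d - 16, d] would capture
  totals <- sapply(left$Day, function(d)
    sum(left$Count[left$Day <= d & left$Day > d - 17]))
  top <- left$Day[which.max(totals)]
  ep <- ep + 1
  in_win <- is.na(day_counts$new_Episode) &
    day_counts$Day <= top & day_counts$Day > top - 17
  day_counts$new_Episode[in_win] <- ep
}
out <- merge(df_1, day_counts[, c("Day", "new_Episode")], by = "Day")
On the example data this should reproduce the expected grouping: days 7422-7438 as episode 1, 6499-6500 as episode 2 and 7439-7440 as episode 3.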

Related

Subset a dataframe with specific condition in R

Hello, I have this df:
res1 res4 aa1234
1 1 4 IVGG
2 10 13 RQFP
3 102 105 TSSV
4 112 115 LQNA
5 118 121 EAGT
6 12 15 FPFL
7 132 135 RSGG
8 138 141 SRFP
9 150 153 PEDQ
10 151 154 EDQC
11 155 158 RPNN
12 165 168 TRRG
13 171 174 CNGD
14 172 175 NGDG
15 174 177 DGGT
16 181 184 CEGL
17 195 198 PCGR
18 20 23 NQGR
19 205 208 RVAL
20 32 35 HARF
21 39 42 AASC
22 40 43 ASCF
23 48 51 PGVS
24 57 60 AYDL
25 59 62 DLRR
26 64 67 ERQS
27 65 68 RQSR
28 78 81 ENGY
29 8 11 RPRQ
30 82 85 DPQQ
31 83 86 PQQN
32 86 89 NLND
33 95 98 LDRE
I want to subset it, keeping only rows whose res1 values are in sequence, i.e. where consecutive sorted res1 values are within 4 of each other (res1[i+1] <= res1[i] + 4), like this:
res1 res4 aa1234
29 8 11 RPRQ
6 12 15 FPFL
21 39 42 AASC
22 40 43 ASCF
24 57 60 AYDL
25 59 62 DLRR
26 64 67 ERQS
27 65 68 RQSR
28 78 81 ENGY
30 82 85 DPQQ
31 83 86 PQQN
32 86 89 NLND
9 150 153 PEDQ
10 151 154 EDQC
11 155 158 RPNN
13 171 174 CNGD
14 172 175 NGDG
15 174 177 DGGT
I tried something with the functions filter and subset but I didn't get the expected result.
So in general, I need the overlap between two rows to fall within a range of (i, i+4), including i+4.
For example, in these 3 lines there is an overlap between rows [9] and [10] (150-153 overlaps with 151-154), but row [11] also corresponds to res1[10] + 4 (151 + 4 = 155). So maybe an idea would be to consider res1[i] and check whether res1[i+1] <= res1[i] + 4.
9 150 153 PEDQ
10 151 154 EDQC
11 155 158 RPNN
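That check can be written almost verbatim (a sketch, sorting by res1 first): keep every row whose neighbour on either side starts within 4 of it. Note that this matches your expected output except that it also keeps row 2 (res1 = 10), which lies within 4 of both 8 and 12.
df_s <- df[order(df$res1), ]
g <- diff(df_s$res1)                        # gap to the next sorted row
keep <- c(g <= 4, FALSE) | c(FALSE, g <= 4) # close to the next or previous row
df_s[keep, ]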
Why don't we simply do this?
df[df$res1 %in% c(df$res1 -4,df$res1 -3, df$res1-2, df$res1 -1, df$res1+1,df$res1 +2, df$res1 +3, df$res1 +4),]
res1 res4 aa1234
2 10 13 RQFP
6 12 15 FPFL
9 150 153 PEDQ
10 151 154 EDQC
11 155 158 RPNN
13 171 174 CNGD
14 172 175 NGDG
15 174 177 DGGT
21 39 42 AASC
22 40 43 ASCF
24 57 60 AYDL
25 59 62 DLRR
26 64 67 ERQS
27 65 68 RQSR
28 78 81 ENGY
29 8 11 RPRQ
30 82 85 DPQQ
31 83 86 PQQN
32 86 89 NLND
Edited scenario: just order the df first, and the rest stays the same. See:
df <- df[order(df$res1),]
g <- diff(df$res1)                                # gap between consecutive rows
df[sort(unique(c(which(g >= 0 & g <= 3),          # row that starts a run
                 which(g >= 0 & g <= 4) + 1))), ] # row that continues a run
res1 res4 aa1234
29 8 11 RPRQ
2 10 13 RQFP
6 12 15 FPFL
21 39 42 AASC
22 40 43 ASCF
24 57 60 AYDL
25 59 62 DLRR
26 64 67 ERQS
27 65 68 RQSR
30 82 85 DPQQ
31 83 86 PQQN
32 86 89 NLND
9 150 153 PEDQ
10 151 154 EDQC
11 155 158 RPNN
13 171 174 CNGD
14 172 175 NGDG
15 174 177 DGGT
Old answer: use this
df[sort(unique(c(which(rev(diff(rev(df$res1))) >= -3 & rev(diff(rev(df$res1))) <= 0), which(diff(df$res1) <= 4 & diff(df$res1) >= 0)+1))),]
res1 res4 aa1234
9 150 153 PEDQ
10 151 154 EDQC
11 155 158 RPNN
13 171 174 CNGD
14 172 175 NGDG
15 174 177 DGGT
21 39 42 AASC
22 40 43 ASCF
24 57 60 AYDL
25 59 62 DLRR
26 64 67 ERQS
27 65 68 RQSR
30 82 85 DPQQ
31 83 86 PQQN
32 86 89 NLND
Data used
df <- read.table(text = "res1 res4 aa1234
1 1 4 IVGG
2 10 13 RQFP
3 102 105 TSSV
4 112 115 LQNA
5 118 121 EAGT
6 12 15 FPFL
7 132 135 RSGG
8 138 141 SRFP
9 150 153 PEDQ
10 151 154 EDQC
11 155 158 RPNN
12 165 168 TRRG
13 171 174 CNGD
14 172 175 NGDG
15 174 177 DGGT
16 181 184 CEGL
17 195 198 PCGR
18 20 23 NQGR
19 205 208 RVAL
20 32 35 HARF
21 39 42 AASC
22 40 43 ASCF
23 48 51 PGVS
24 57 60 AYDL
25 59 62 DLRR
26 64 67 ERQS
27 65 68 RQSR
28 78 81 ENGY
29 8 11 RPRQ
30 82 85 DPQQ
31 83 86 PQQN
32 86 89 NLND
33 95 98 LDRE", header = T)

GAMs in R: Fewer unique covariate combinations than df

I tried fitting GAMs to some dataframes I have. All but one work; that one fails with the error:
Error in smooth.construct.tp.smooth.spec(object, dk$data, dk$knots) : A term has fewer unique covariate combinations than specified maximum degrees of freedom
I looked around a bit on the internet but couldn't really figure out what's going wrong. All my 7 other dataframes run without a problem.
I then ran epiR::epi.cp(srtm[-c(1,7,8)]) and it gave me this output:
$cov.pattern
id n curv_plan curv_prof dem slope ca
1 1 1 1.113192e-02 3.991046e-03 3909 43.601479 5.225853
2 2 1 -2.686749e-03 3.474989e-03 3312 35.022511 4.418310
3 3 1 -1.033450e-02 -4.626922e-03 3326 36.678623 4.421465
4 4 1 -5.439283e-03 2.066148e-03 4069 31.501045 3.887526
5 5 1 -2.602015e-03 -1.249511e-04 3021 37.199219 5.010560
6 6 1 1.068216e-03 1.216902e-03 2844 44.694374 4.852220
7 7 1 -1.855443e-02 -5.965539e-03 2841 42.753750 5.088554
8 8 1 2.363193e-03 2.353357e-03 2833 33.160995 4.652209
9 9 1 2.169674e-02 1.049735e-02 2964 32.311535 4.671970
10 10 1 2.850910e-02 9.416230e-03 2956 50.791847 3.496096
11 11 1 -1.932028e-02 4.949751e-04 2794 38.714302 4.217102
12 12 1 -1.372750e-03 -4.437230e-03 3799 48.356312 4.597039
13 13 1 1.154181e-04 -4.114155e-03 3808 54.669777 3.518823
14 14 1 2.743768e-02 7.829833e-03 3580 23.674162 3.268744
15 15 1 7.216539e-03 9.818082e-04 3969 29.421440 4.354250
16 16 1 2.385139e-03 6.333927e-04 3635 10.555381 4.905733
17 17 1 -1.129411e-02 2.719948e-03 2805 29.195084 4.807369
18 18 1 4.584329e-04 -1.497223e-03 3676 32.754879 3.729304
19 19 1 1.883965e-03 4.189690e-03 3165 30.973505 4.833158
20 20 1 -5.350136e-03 -2.615470e-03 2745 32.534698 4.420852
21 21 1 1.484253e-02 -1.245213e-03 3872 26.113234 4.045357
22 22 1 -2.449377e-02 -5.045668e-04 2931 31.060991 5.170872
23 23 1 -2.962795e-02 -9.271557e-03 2917 21.680889 4.547461
24 24 1 -2.487545e-02 -7.834328e-03 2736 41.775677 4.543325
25 25 1 2.890568e-03 -2.040353e-03 2577 47.003765 3.739546
26 26 1 -5.119631e-03 8.869720e-03 3401 38.519680 5.428564
27 27 1 6.171266e-03 -6.515175e-04 2687 36.678623 4.152842
28 28 1 -8.297552e-03 -7.053435e-03 3678 39.532673 4.081311
29 29 1 8.652663e-03 2.394378e-03 3515 33.895370 4.220177
30 30 1 -2.528805e-03 -1.293259e-03 3404 42.548138 4.266330
31 31 1 1.899994e-02 6.367806e-03 3191 41.696201 3.300749
32 32 1 -2.243623e-02 -1.866033e-04 2433 34.162479 5.364681
33 33 1 -6.934012e-03 9.280805e-03 2309 32.667160 5.650699
34 34 1 -1.121149e-02 6.376335e-05 2188 31.119059 4.706416
35 35 1 -1.429000e-02 5.299596e-04 2511 34.543365 4.538456
36 36 1 -7.168889e-03 1.301791e-03 2625 30.826660 4.059711
37 37 1 -4.226461e-03 7.440552e-03 2830 33.398251 4.941027
38 38 1 -2.635832e-03 8.748529e-03 3378 45.972672 4.861779
39 39 1 -2.007920e-02 -8.081778e-03 3281 31.735376 5.173269
40 40 1 -3.453595e-02 -6.867430e-03 2690 47.515182 4.935358
41 41 1 1.698363e-03 -8.296107e-03 2529 42.224693 4.386349
42 42 1 5.257193e-03 1.021242e-02 2571 43.070564 4.194372
43 43 1 6.968817e-03 5.538784e-03 2581 36.055031 4.209373
44 44 1 -7.632907e-04 2.803704e-04 2582 28.257311 4.230427
45 45 1 -3.468894e-03 -9.099842e-04 2409 29.421440 4.190946
46 46 1 1.879089e-02 6.532978e-03 3733 41.535984 4.032614
47 47 1 -1.076225e-03 -1.138945e-03 2712 39.260731 4.580621
48 48 1 -5.306205e-03 2.667941e-03 3446 34.250553 4.925404
49 49 1 -5.380515e-03 -2.595619e-03 3785 50.561493 4.642792
50 50 1 -2.571232e-03 -2.063937e-03 3768 46.160892 4.728879
51 51 1 -7.638110e-03 -2.432463e-03 3413 32.401161 5.058373
52 52 1 -2.950254e-03 -2.034031e-04 3852 32.543564 4.443869
53 53 1 -2.702386e-03 -1.776183e-03 2483 31.002720 3.879390
54 54 1 -3.892425e-02 -2.266178e-03 2225 26.126318 5.750985
55 55 1 -2.644659e-03 3.034660e-03 2192 32.103516 4.949506
56 56 1 -2.862503e-02 3.673996e-04 2361 23.930893 5.181818
57 57 1 6.263880e-03 -7.725377e-04 3780 17.752790 4.890797
58 58 1 1.054093e-03 -1.563014e-03 3089 36.422310 4.520845
59 59 1 9.474340e-04 -3.901043e-03 3155 42.552841 4.265886
60 60 1 5.569567e-03 -1.770366e-04 3516 13.166321 4.772187
61 61 1 -8.342760e-03 -9.908290e-03 3097 36.815479 5.346615
62 62 1 -1.422498e-03 -1.645628e-03 2865 29.802414 4.131463
63 63 1 4.523963e-02 1.067406e-02 2163 36.154739 3.369432
64 64 1 -1.164162e-02 6.808200e-04 2316 19.610609 4.634536
65 65 1 -8.043590e-03 9.395104e-03 2614 44.298817 3.983136
66 66 1 -1.925332e-02 -4.521391e-03 2035 31.205780 4.134195
67 67 1 -1.429050e-02 5.435983e-03 2799 38.876656 4.180761
68 68 1 6.935605e-04 3.015038e-03 2679 37.863647 4.213497
69 69 1 -5.062089e-03 5.961242e-04 2831 32.401161 3.729215
70 70 1 -3.617065e-04 -2.874465e-03 3152 45.871994 4.703659
71 71 1 -4.216370e-02 -4.917050e-03 3726 25.376934 4.614913
72 72 1 -2.184333e-02 -2.840071e-03 3610 43.138550 4.237120
73 73 1 -1.735273e-02 -2.199261e-03 3339 33.984894 4.811754
74 74 1 1.929157e-02 5.358084e-03 3447 32.356407 3.355368
75 75 1 -4.118797e-02 -2.408211e-03 3251 22.373844 5.160147
76 76 1 -1.393304e-02 7.900328e-05 3297 22.090260 4.724728
77 77 1 -3.078095e-02 -5.535597e-03 3143 37.298687 4.625203
78 78 1 1.717030e-02 -1.120720e-03 3617 37.965389 4.627342
79 79 1 -5.965119e-04 -5.377157e-04 3689 28.360373 4.767213
80 80 1 7.843294e-03 -9.579902e-04 3676 48.356312 3.907819
81 81 1 5.994634e-03 2.034169e-03 2759 25.142431 3.980591
82 82 1 -1.323012e-02 2.393529e-03 3972 26.880308 5.107575
83 83 1 6.312347e-03 2.877600e-04 3323 32.167103 3.496723
84 84 1 -1.180464e-02 4.438243e-03 3790 40.369972 4.081389
85 85 1 -8.333334e-03 4.009274e-03 3248 14.931417 4.881107
86 86 1 2.016023e-03 -5.707344e-04 3994 18.305449 4.278613
87 87 1 -5.515654e-03 -8.373593e-04 3368 40.703190 4.229169
88 88 1 8.931696e-03 1.677515e-03 4651 30.133842 4.327270
89 89 1 1.962347e-04 -7.458636e-04 5075 57.352509 3.263017
90 90 1 -2.880805e-02 -5.200595e-04 2645 11.976726 5.634262
91 91 1 -2.101875e-02 -5.110677e-03 3109 34.218582 4.925558
92 92 1 -8.390786e-03 -1.188547e-02 3667 39.895481 4.249029
93 93 1 -1.366958e-02 9.873455e-04 2827 22.636129 5.269634
94 94 1 1.004551e-02 5.205147e-04 3667 44.028976 3.993555
95 95 1 5.892557e-03 -5.482296e-04 2416 5.385977 4.614692
96 96 1 -1.662132e-02 -9.946494e-04 3806 42.599808 3.951163
97 97 1 -7.977792e-03 5.937776e-03 3470 28.888371 3.120762
98 98 1 -2.408042e-02 -2.647421e-03 2975 16.228737 4.227977
99 99 1 -1.191509e-02 -2.014583e-03 2461 30.051607 4.361413
100 100 1 1.110316e-02 2.506189e-04 3362 29.517509 4.591039
101 101 1 2.010373e-03 4.185408e-04 5104 17.387333 3.642855
102 102 1 -3.218945e-03 1.004196e-02 4113 44.448421 3.282414
103 103 1 2.438254e-03 2.551999e-03 3234 31.205780 3.844411
104 104 1 -1.178511e-02 2.775465e-04 1864 1.350224 3.875072
105 105 1 -9.511201e-04 -1.446065e-03 2351 22.406872 4.392300
106 106 1 -4.563018e-03 -5.890041e-03 3141 24.862123 3.998985
107 107 1 -1.471223e-02 5.965497e-03 3765 25.363234 3.661456
108 108 1 -5.857890e-03 -9.363544e-03 2272 22.878105 5.105480
109 109 1 1.369277e-02 1.019289e-02 4016 44.848000 4.092690
110 110 1 -8.784844e-03 3.358194e-03 3293 32.543564 4.115062
111 111 1 -5.148044e-03 5.372697e-03 3038 31.772562 3.626687
112 112 1 -1.556184e+35 5.799786e+34 4961 29.421440 3.020591
113 113 1 3.831991e-03 1.570888e-03 2069 28.821898 3.790284
114 114 1 8.289138e-04 6.439757e-04 2154 21.045721 3.959267
115 115 1 -4.800863e-03 3.194520e-03 5294 45.660866 3.701611
116 116 1 2.974254e-02 1.197812e-02 4380 31.670097 3.877057
117 117 1 1.137725e-02 -1.082659e-02 5172 18.774675 3.572600
118 118 1 -4.678526e-03 7.448288e-03 2257 39.260731 4.227000
119 119 1 -4.655881e-03 -1.119303e-03 3233 30.205467 5.613868
120 120 1 -4.827522e-03 -4.766134e-03 3414 42.974857 3.831894
121 121 1 -8.568994e-04 1.053632e-03 1750 29.421440 4.132886
122 122 1 1.212121e-02 0.000000e+00 5018 20.136303 3.669850
123 123 1 -4.711660e-03 -2.261143e-03 3013 45.007954 3.622240
124 124 1 -1.226328e-02 4.688181e-04 3842 26.880308 3.098333
125 125 1 3.438910e-03 1.441129e-03 3470 11.386165 4.552782
126 126 1 1.192164e-02 -1.295839e-03 3473 22.684824 4.748498
127 127 1 -1.960781e-40 0.000000e+00 4155 90.000000 2.960569
128 128 1 2.124726e-04 1.945100e-03 2496 32.103516 5.242211
129 129 1 5.669804e-03 -4.589476e-03 2577 35.398876 4.271112
130 130 1 -8.838220e-03 -9.496282e-04 4921 14.506372 4.088247
131 131 1 1.009090e-02 -2.243944e-03 3385 38.372120 4.067030
132 132 1 5.630660e-03 -8.632211e-04 4003 33.322365 3.776054
133 133 1 -9.103803e-03 -6.322661e-03 2758 47.934212 3.739807
134 134 1 6.225513e-03 -1.824928e-03 3925 37.085732 3.389725
135 135 1 -1.303080e-03 3.580316e-03 2978 27.432941 4.345174
136 136 1 1.355920e-02 3.468190e-03 5058 57.797195 3.739124
137 137 1 2.092464e-02 -3.244962e-04 2400 3.931096 3.032193
138 138 1 5.691811e-02 -7.933985e-04 3885 15.069956 3.414036
139 139 1 8.052407e-05 -3.197287e-03 3493 33.993008 3.881695
140 140 1 -1.892967e-02 -5.049255e-03 2985 24.904482 4.417928
141 141 1 2.278842e-02 1.188287e-02 3666 31.670097 3.313449
142 142 1 1.496110e-02 2.181270e-03 3702 30.498932 3.171413
[ reached 'max' / getOption("max.print") -- omitted 18 rows ]
$id
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33
[34] 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66
[67] 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
[100] 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132
[133] 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160
I tried to lower the number of knots in the gam call but didn't succeed either...
Does anyone have an idea?
I fit the gam using the following line:
mgcv::gam(slide ~ s(curv_plan) + s(curv_prof) + s(dem) + s(slope) + s(ca), data = dataframes_new[[7]], family = binomial)
I have experienced the same issue. The root cause was that some of my categorical variables had fewer levels than k in my formula specification. To give an example:
Suppose one of the terms in my formula specification was:
s(I(pmin(example_variable, 120)), k = 5)
and the data in my example_variable had 3 levels (say, "yes", "no", "maybe"). This would throw the above-mentioned error.
In my case, I solved it by creating additional levels in my data (I was creating test data for a unit test). In other cases it could be solved by ensuring k does not exceed the number of levels in your categorical variables.
If you're using categorical variables, check if the root cause might be the same for you.
I found the solution to my problem by reading these:
https://stat.ethz.ch/pipermail/r-sig-ecology/2011-May/002148.html
https://stat.ethz.ch/pipermail/r-help/2007-October/143569.html
The error means that you tried to create a thin plate spline basis expansion with more basis functions than the variable from which the expansion is to be made has unique values.
As you don't show the model fitting code, we can't say more than that one of the smooths in the model you tried to fit didn't have enough unique values for the value of k you specified (if you didn't set k, a default value was used).
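For what it's worth, the error is easy to reproduce, and counting unique values shows which term is to blame. A minimal sketch (the data here is made up; mgcv's default basis dimension for a 1-D thin plate smooth is k = 10):
library(mgcv)
set.seed(1)
d <- data.frame(x = rep(1:3, each = 20), y = rnorm(60))
sapply(d, function(v) length(unique(v)))  # x has only 3 unique values
# gam(y ~ s(x), data = d)                 # errors: 3 unique values < default k = 10
gam(y ~ s(x, k = 3), data = d)            # fits once k is no larger than the unique-value count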

Sum grouped observations based on time window of given days

Here is an example of my data.frame:
df = read.table(text = 'ID Day Episode Count
28047 6000 143 7
28049 6000 143 7
29002 6000 143 7
29003 6000 143 7
30003 6000 143 7
30004 6000 143 7
32010 6000 143 7
30001 7436 47 6
33021 7436 47 6
33024 7436 47 6
33034 7436 47 6
37018 7436 47 6
40004 7436 47 6
29003 7300 111 6
30003 7300 111 6
30004 7300 111 6
32010 7300 111 6
30001 7300 111 6
33021 7300 111 6
2001 7438 54 5
19007 7438 54 5
20002 7438 54 5
22006 7438 54 5
22007 7438 54 5
32010 7301 99 5
30001 7301 99 5
33021 7301 99 5
2001 7301 99 5
19007 7301 99 5
27021 5998 158 5
28015 5998 158 5
28047 5998 158 5
28049 5998 158 5
29001 5998 158 5
21009 7437 65 4
24001 7437 65 4
25005 7437 65 4
25009 7437 65 4
14001 7435 81 4
16004 7435 81 4
17001 7435 81 4
17005 7435 81 4
21009 7299 77 4
24001 7299 77 4
25005 7299 77 4
25009 7299 77 4
29002 5996 158 4
29003 5996 158 4
27002 5996 158 4
27003 5996 158 4
33014 5999 56 3
33023 5999 56 3
25005 5999 56 3
27021 5995 246 2
33006 5995 246 2
8876 7439 765 2
5421 7439 765 2
6678 7298 68 1
34001 5994 125 1
4432 7440 841 1', header = TRUE)
What I need to do is, for each unique Day observation, take its Count value and add it to the Count values of the previous 3 days (i.e. a 4-day time window).
e.g. 1) Day = 6000: add 7 (its Count value) to the Count values of Days 5999, 5998 and 5997 (the last one not present in the df), which are respectively 3, 5 and 0 -> 7 + 3 + 5 + 0 = new_Count 15;
2) next Day = 7436: add 6 to the Count values of 7435, 7434 and 7433 -> 6 + 4 + 0 + 0 = new_Count 10;
and so on up to the last Day within df.
Desired output:
ID Day new_Episode new_Count
2001 7438 1 19
19007 7438 1 19
20002 7438 1 19
22006 7438 1 19
22007 7438 1 19
21009 7437 1 19
24001 7437 1 19
25005 7437 1 19
25009 7437 1 19
30001 7436 1 19
33021 7436 1 19
33024 7436 1 19
33034 7436 1 19
37018 7436 1 19
40004 7436 1 19
14001 7435 1 19
16004 7435 1 19
17001 7435 1 19
17005 7435 1 19
8876 7439 2 17
5421 7439 2 17
2001 7438 2 17
19007 7438 2 17
20002 7438 2 17
22006 7438 2 17
22007 7438 2 17
21009 7437 2 17
24001 7437 2 17
25005 7437 2 17
25009 7437 2 17
30001 7436 2 17
33021 7436 2 17
33024 7436 2 17
33034 7436 2 17
37018 7436 2 17
40004 7436 2 17
32010 7301 3 16
30001 7301 3 16
33021 7301 3 16
2001 7301 3 16
19007 7301 3 16
29003 7300 3 16
30003 7300 3 16
30004 7300 3 16
32010 7300 3 16
30001 7300 3 16
33021 7300 3 16
21009 7299 3 16
24001 7299 3 16
25005 7299 3 16
25009 7299 3 16
6678 7298 3 16
28047 6000 4 15
28049 6000 4 15
29002 6000 4 15
29003 6000 4 15
30003 6000 4 15
30004 6000 4 15
32010 6000 4 15
33014 5999 4 15
33023 5999 4 15
25005 5999 4 15
27021 5998 4 15
28015 5998 4 15
28047 5998 4 15
28049 5998 4 15
29001 5998 4 15
21009 7437 5 14
24001 7437 5 14
25005 7437 5 14
25009 7437 5 14
30001 7436 5 14
33021 7436 5 14
33024 7436 5 14
33034 7436 5 14
37018 7436 5 14
40004 7436 5 14
14001 7435 5 14
16004 7435 5 14
17001 7435 5 14
17005 7435 5 14
4432 7440 6 12
8876 7439 6 12
5421 7439 6 12
2001 7438 6 12
19007 7438 6 12
20002 7438 6 12
22006 7438 6 12
22007 7438 6 12
21009 7437 6 12
24001 7437 6 12
25005 7437 6 12
25009 7437 6 12
33014 5999 7 12
33023 5999 7 12
25005 5999 7 12
27021 5998 7 12
28015 5998 7 12
28047 5998 7 12
28049 5998 7 12
29001 5998 7 12
29002 5996 7 12
29003 5996 7 12
27002 5996 7 12
27003 5996 7 12
29003 7300 8 11
30003 7300 8 11
30004 7300 8 11
32010 7300 8 11
30001 7300 8 11
33021 7300 8 11
21009 7299 8 11
24001 7299 8 11
25005 7299 8 11
25009 7299 8 11
6678 7298 8 11
27021 5998 9 11
28015 5998 9 11
28047 5998 9 11
28049 5998 9 11
29001 5998 9 11
29002 5996 9 11
29003 5996 9 11
27002 5996 9 11
27003 5996 9 11
27021 5995 9 11
33006 5995 9 11
30001 7436 10 10
33021 7436 10 10
33024 7436 10 10
33034 7436 10 10
37018 7436 10 10
40004 7436 10 10
14001 7435 10 10
16004 7435 10 10
17001 7435 10 10
17005 7435 10 10
29002 5996 11 7
29003 5996 11 7
27002 5996 11 7
27003 5996 11 7
27021 5995 11 7
33006 5995 11 7
34001 5994 11 7
21009 7299 12 5
24001 7299 12 5
25005 7299 12 5
25009 7299 12 5
6678 7298 12 5
14001 7435 13 4
16004 7435 13 4
17001 7435 13 4
17005 7435 13 4
27021 5995 14 3
33006 5995 14 3
34001 5994 14 3
6678 7298 15 1
34001 5994 16 1
Note that output_df is larger than df (but that's ok) and it is ranked by -new_Count and -Day, with the new_Episode column assigned according to the -new_Count ranking.
Any suggestion?
So I'm not sure why output_df has more rows than the original data.frame, but we can use the by function along with subset to calculate new_Count. Note that I've called your data.frame df1 instead of df.
output_df1 <- do.call('rbind', by(df1, list(df1$Day, df1$ID), FUN = function(d){
  # grab the subset of df1 falling in the 3 days before d's Day
  sub_df <- subset(df1, Day < d$Day & Day > (d$Day - 4))
  # keep one row per unique (Day, Episode, Count) combination
  sub_df_u <- unique(sub_df[,-1])
  d$new_Count <- sum(sub_df_u$Count) + d$Count
  d
}))
head(output_df1)
ID Day Episode Count new_Count
14 2001 7438 54 5 15
28 14001 7435 81 4 4
29 16004 7435 81 4 4
30 17001 7435 81 4 4
31 17005 7435 81 4 4
15 19007 7438 54 5 15
To get the new_Episode column, we can use the dense_rank function from the dplyr package:
output_df1$new_Episode <- dplyr::dense_rank(-output_df1$new_Count)
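If each Day carries a single Count value, as in the example, the window sum can also be computed once per day and merged back; a compact sketch of the same idea:
days <- unique(df1[, c("Day", "Count")])
days$new_Count <- sapply(days$Day, function(d)
  sum(days$Count[days$Day <= d & days$Day > d - 4]))  # 4-day window [d - 3, d]
days[order(-days$new_Count, -days$Day), ]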

How can we determine a transition point in a curve for a bar graph

I have plotted a bar graph, and knowing that the transition in the curve takes place around values 21 to 25, I want to find such a point in general.
barplot(Views$V2,names=Views$V1,las=2,cex.names=0.2,border="blue",xpd=FALSE)
abline(v=25,col="red")
Any help is appreciated
Update:
A part of my data frame is given below. It is sorted by the values of V2. By transition I mean a point in the sorted data where there is a large deviation, after which the data continues in the new pattern. So, a point where the pattern changes.
V1 V2
1 1 16154424
2 2 3701944
3 3 1618377
4 4 903302
5 5 569824
6 6 389772
7 7 281751
8 8 212450
9 9 166364
10 10 133339
11 11 109410
12 12 90934
13 13 77155
14 14 66124
15 15 57861
16 16 50765
17 17 44805
18 18 39996
19 19 35850
20 20 32492
21 21 29522
22 22 27152
23 23 24821
24 24 22619
25 25 21238
26 26 19639
27 27 18320
28 28 16867
29 29 15890
30 30 14936
31 31 14252
32 32 13150
33 33 12696
34 34 11656
35 35 11191
36 36 10951
37 37 10232
38 38 9605
39 39 9058
40 40 8916
41 41 8531
42 42 8010
43 43 7932
44 44 7436
45 45 6991
46 46 6750
47 47 6613
48 48 6254
49 49 6292
50 50 5731
51 51 5659
52 52 5551
53 53 5396
54 54 5122
55 55 4845
56 56 4860
57 57 4591
58 58 4504
59 59 4233
60 60 4371
61 61 4014
62 62 4083
63 63 3923
64 64 3796
65 65 3616
66 66 3519
67 67 3466
68 68 3409
69 69 3357
70 70 3215
71 71 3118
72 72 3081
73 73 3040
74 74 2951
75 75 2808
76 76 2797
77 77 2829
78 78 2714
79 79 2564
80 80 2563
81 81 2445
82 82 2528
83 83 2443
84 84 2316
85 85 2314
86 86 2212
87 87 2215
88 88 2102
89 89 2172
90 90 2020
91 91 2108
92 92 2020
93 93 2027
94 94 1982
95 95 1936
96 96 1836
97 97 1801
98 98 1850
99 99 1751
100 100 1810
It probably has something to do with the ratios of differences, something which should be evident from looking at the graph.
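One common heuristic along those lines (a sketch, not the only possible definition of a transition point): take the point on the sorted curve that falls farthest below the straight line joining its two endpoints.
knee <- function(y) {
  x <- seq_along(y)
  # chord from the first point to the last
  chord <- y[1] + (y[length(y)] - y[1]) / (length(y) - 1) * (x - 1)
  which.max(chord - y)  # index with the largest gap below the chord
}
knee(Views$V2)       # dominated by the huge first values, lands early (~6)
knee(log(Views$V2))  # on the log scale it lands around 21, inside the 21-25 range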

Sequence with different intervals in R: matching sensor data

I need a vector that repeats numbers in a sequence at varying intervals. I basically need this
c(rep(1:42, each=6), rep(43:64, each = 7),
rep(65:106, each=6), rep(107:128, each = 7),
... but I need this to keep going, until almost 2 million.
So I want a vector that looks like
[1] 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 5 5 ...
.....
[252] 43 43 43 43 43 43 43 44 44 44 44 44 44 44
....
[400] 64 64 64 64 64 64 65 65 65 65 65 65...
and so on: not just alternating between 6 and 7 repetitions, but mostly 6s and fewer 7s, until the whole vector is something like 1.7 million entries long. So, is there a loop I can use? Or apply, replicate? I need the 400th entry in the vector to be 64, the 800th entry to be 128, and so on, in somewhat evenly spaced integers.
UPDATE
Thank you all for the quick, clever tricks there. They worked, at least well enough for the deadline I was dealing with. I realize repeating 6 xs and 7 xs is a really dumb way to try to solve this, but at least it was quick. Now that I have some time, I would like to get everyone's opinions/ideas on my real underlying issue here.
I have two datasets to merge. They are both sensor datasets, both with stopwatch time as the primary key. But one records every 1/400 of a second, and the other records every 1/256 of a second. I have trimmed the top of each so that they start at the exact same moment. But... now what? I have 400 records for each second in one set, and 256 records per second in the other. Is there a way to merge these without losing data? Interpolating or just repeating obs is a-ok, and necessary I think, but I'd rather not throw any data out.
I read a post here that had to do with using xts and zoo for a very similar problem to mine. But they had nice epoch date/times for each observation. I just have these awful fractions of seconds!
sample data (A):
time dist a_lat
1 139.4300 22 0
2 139.4325 22 0
3 139.4350 22 0
4 139.4375 22 0
5 139.4400 22 0
6 139.4425 22 0
7 139.4450 22 0
8 139.4475 22 0
9 139.4500 22 0
10 139.4525 22 0
sample data (B):
timestamp hex_acc_x hex_acc_y hex_acc_z
1 367065215501 -0.5546875 -0.7539062 0.1406250
2 367065215505 -0.5468750 -0.7070312 0.2109375
3 367065215509 -0.4218750 -0.6835938 0.1796875
4 367065215513 -0.5937500 -0.7421875 0.1562500
5 367065215517 -0.6757812 -0.7773438 0.2031250
6 367065215521 -0.5937500 -0.8554688 0.2460938
7 367065215525 -0.6132812 -0.8476562 0.2109375
8 367065215529 -0.3945312 -0.8906250 0.2031250
9 367065215533 -0.3203125 -0.8906250 0.2226562
10 367065215537 -0.3867188 -0.9531250 0.2578125
(Oh yeah, and btw, the B dataset timestamps are epoch format * 256, because life is hard. I haven't converted them for this because dataset A has nothing like that, just 0.0025 intervals. Also, the B data sensor was left on for hours after the A data sensor turned off, so that doesn't help.)
Or, if you like, you can try this using apply:
# using this sample data
df <- data.frame(from=c(1,4,7,11), to = c(3,6,10,13),rep=c(6,7,6,7));
> df
# from to rep
#1 1 3 6
#2 4 6 7
#3 7 10 6
#4 11 13 7
unlist(apply(df, 1, function(x) rep(x['from']:x['to'], each=x['rep'])))
# [1] 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 4 4 4 4 4 4 4
#[26] 5 5 5 5 5 5 5 6 6 6 6 6 6 6 7 7 7 7 7 7 8 8 8 8 8
#[51] 8 9 9 9 9 9 9 10 10 10 10 10 10 11 11 11 11 11 11 11 12 12 12 12 12
#[76] 12 12 13 13 13 13 13 13 13
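The same expansion can be done with Map, which avoids apply's coercion of the data frame to a matrix (harmless here, since every column is numeric, but worth knowing):
unlist(Map(function(f, t, r) rep(f:t, each = r), df$from, df$to, df$rep))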
Now that you put it that way ... I have absolutely no idea how you are planning on using all of the 6s and 7s. :-)
Regardless, I recommend standardizing the time, adding a "sample" column, and merging on them. Having the "sample" column may facilitate your processing later on, perhaps.
Your data:
df400 <- structure(list(time = c(139.43, 139.4325, 139.435, 139.4375, 139.44, 139.4425,
139.445, 139.4475, 139.45, 139.4525),
dist = c(22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L, 22L),
a_lat = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)),
.Names = c("time", "dist", "a_lat"),
class = "data.frame", row.names = c(NA, -10L))
df256 <- structure(list(timestamp = c(367065215501, 367065215505, 367065215509, 367065215513,
367065215517, 367065215521, 367065215525, 367065215529,
367065215533, 367065215537),
hex_acc_x = c(-0.5546875, -0.546875, -0.421875, -0.59375, -0.6757812,
-0.59375, -0.6132812, -0.3945312, -0.3203125, -0.3867188),
hex_acc_y = c(-0.7539062, -0.7070312, -0.6835938, -0.7421875,
-0.7773438, -0.8554688, -0.8476562, -0.890625,
-0.890625, -0.953125),
hex_acc_z = c(0.140625, 0.2109375, 0.1796875, 0.15625, 0.203125,
0.2460938, 0.2109375, 0.203125, 0.2226562, 0.2578125)),
.Names = c("timestamp", "hex_acc_x", "hex_acc_y", "hex_acc_z"),
class = "data.frame", row.names = c(NA, -10L))
Standardize your time frames:
colnames(df256)[1] <- 'time'
df400$time <- df400$time - df400$time[1]
df256$time <- (df256$time - df256$time[1]) / 256
Assign a label for easy reference (not that the NAs won't be clear enough):
df400 <- cbind(sample='A', df400, stringsAsFactors=FALSE)
df256 <- cbind(sample='B', df256, stringsAsFactors=FALSE)
And now for the merge and sorting:
dat <- merge(df400, df256, by=c('sample', 'time'), all.x=TRUE, all.y=TRUE)
dat <- dat[order(dat$time),]
dat
## sample time dist a_lat hex_acc_x hex_acc_y hex_acc_z
## 1 A 0.000000 22 0 NA NA NA
## 11 B 0.000000 NA NA -0.5546875 -0.7539062 0.1406250
## 2 A 0.002500 22 0 NA NA NA
## 3 A 0.005000 22 0 NA NA NA
## 4 A 0.007500 22 0 NA NA NA
## 5 A 0.010000 22 0 NA NA NA
## 6 A 0.012500 22 0 NA NA NA
## 7 A 0.015000 22 0 NA NA NA
## 12 B 0.015625 NA NA -0.5468750 -0.7070312 0.2109375
## 8 A 0.017500 22 0 NA NA NA
## 9 A 0.020000 22 0 NA NA NA
## 10 A 0.022500 22 0 NA NA NA
## 13 B 0.031250 NA NA -0.4218750 -0.6835938 0.1796875
## 14 B 0.046875 NA NA -0.5937500 -0.7421875 0.1562500
## 15 B 0.062500 NA NA -0.6757812 -0.7773438 0.2031250
## 16 B 0.078125 NA NA -0.5937500 -0.8554688 0.2460938
## 17 B 0.093750 NA NA -0.6132812 -0.8476562 0.2109375
## 18 B 0.109375 NA NA -0.3945312 -0.8906250 0.2031250
## 19 B 0.125000 NA NA -0.3203125 -0.8906250 0.2226562
## 20 B 0.140625 NA NA -0.3867188 -0.9531250 0.2578125
I'm guessing your data was just a small representation. If I've guessed poorly (that A's integers are seconds and B's integers are 1/400ths of a second) then just scale differently. Either way, once each first value is reset to zero, the two sets are easy to merge and sort.
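And if interpolated values are preferred over the NAs (the question says interpolating is fine), something like zoo::na.approx can fill them in after the merge; a sketch assuming the merged dat from above:
library(zoo)
num <- c("dist", "a_lat", "hex_acc_x", "hex_acc_y", "hex_acc_z")
dat[num] <- lapply(dat[num], function(col)
  na.approx(col, x = dat$time, na.rm = FALSE))  # linear in time; leading/trailing NAs kept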
alt <- data.frame(len=c(42,22),rep=c(6,7));
alt;
## len rep
## 1 42 6
## 2 22 7
altrep <- function(alt,cyc,len) {
  ## length of one full cycle: each run of alt$len integers repeated alt$rep times
  cyclen <- sum(alt$len*alt$rep);
  if (missing(cyc)) {
    if (missing(len)) {
      cyc <- 1;
      len <- cyc*cyclen;
    } else {
      ## enough whole cycles to cover the requested length
      cyc <- ceiling(len/cyclen);
    };
  } else if (missing(len)) {
    len <- cyc*cyclen;
  };
  if (isTRUE(all.equal(len,0))) return(integer());
  ## consecutive integers, each repeated 6 or 7 times per the alternation
  result <- rep(1:(cyc*sum(alt$len)),rep(rep(alt$rep,alt$len),cyc));
  ## truncate to exactly the requested length
  length(result) <- len;
  result;
};
altrep(alt,2);
## [1] 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 5 5 6 6 6 6 6 6 7 7 7 7 7 7 8 8 8 8 8 8 9 9 9
## [52] 9 9 9 10 10 10 10 10 10 11 11 11 11 11 11 12 12 12 12 12 12 13 13 13 13 13 13 14 14 14 14 14 14 15 15 15 15 15 15 16 16 16 16 16 16 17 17 17 17 17 17
## [103] 18 18 18 18 18 18 19 19 19 19 19 19 20 20 20 20 20 20 21 21 21 21 21 21 22 22 22 22 22 22 23 23 23 23 23 23 24 24 24 24 24 24 25 25 25 25 25 25 26 26 26
## [154] 26 26 26 27 27 27 27 27 27 28 28 28 28 28 28 29 29 29 29 29 29 30 30 30 30 30 30 31 31 31 31 31 31 32 32 32 32 32 32 33 33 33 33 33 33 34 34 34 34 34 34
## [205] 35 35 35 35 35 35 36 36 36 36 36 36 37 37 37 37 37 37 38 38 38 38 38 38 39 39 39 39 39 39 40 40 40 40 40 40 41 41 41 41 41 41 42 42 42 42 42 42 43 43 43
## [256] 43 43 43 43 44 44 44 44 44 44 44 45 45 45 45 45 45 45 46 46 46 46 46 46 46 47 47 47 47 47 47 47 48 48 48 48 48 48 48 49 49 49 49 49 49 49 50 50 50 50 50
## [307] 50 50 51 51 51 51 51 51 51 52 52 52 52 52 52 52 53 53 53 53 53 53 53 54 54 54 54 54 54 54 55 55 55 55 55 55 55 56 56 56 56 56 56 56 57 57 57 57 57 57 57
## [358] 58 58 58 58 58 58 58 59 59 59 59 59 59 59 60 60 60 60 60 60 60 61 61 61 61 61 61 61 62 62 62 62 62 62 62 63 63 63 63 63 63 63 64 64 64 64 64 64 64 65 65
## [409] 65 65 65 65 66 66 66 66 66 66 67 67 67 67 67 67 68 68 68 68 68 68 69 69 69 69 69 69 70 70 70 70 70 70 71 71 71 71 71 71 72 72 72 72 72 72 73 73 73 73 73
## [460] 73 74 74 74 74 74 74 75 75 75 75 75 75 76 76 76 76 76 76 77 77 77 77 77 77 78 78 78 78 78 78 79 79 79 79 79 79 80 80 80 80 80 80 81 81 81 81 81 81 82 82
## [511] 82 82 82 82 83 83 83 83 83 83 84 84 84 84 84 84 85 85 85 85 85 85 86 86 86 86 86 86 87 87 87 87 87 87 88 88 88 88 88 88 89 89 89 89 89 89 90 90 90 90 90
## [562] 90 91 91 91 91 91 91 92 92 92 92 92 92 93 93 93 93 93 93 94 94 94 94 94 94 95 95 95 95 95 95 96 96 96 96 96 96 97 97 97 97 97 97 98 98 98 98 98 98 99 99
## [613] 99 99 99 99 100 100 100 100 100 100 101 101 101 101 101 101 102 102 102 102 102 102 103 103 103 103 103 103 104 104 104 104 104 104 105 105 105 105 105 105 106 106 106 106 106 106 107 107 107 107 107
## [664] 107 107 108 108 108 108 108 108 108 109 109 109 109 109 109 109 110 110 110 110 110 110 110 111 111 111 111 111 111 111 112 112 112 112 112 112 112 113 113 113 113 113 113 113 114 114 114 114 114 114 114
## [715] 115 115 115 115 115 115 115 116 116 116 116 116 116 116 117 117 117 117 117 117 117 118 118 118 118 118 118 118 119 119 119 119 119 119 119 120 120 120 120 120 120 120 121 121 121 121 121 121 121 122 122
## [766] 122 122 122 122 122 123 123 123 123 123 123 123 124 124 124 124 124 124 124 125 125 125 125 125 125 125 126 126 126 126 126 126 126 127 127 127 127 127 127 127 128 128 128 128 128 128 128
altrep(alt,len=1000);
## [1] 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 5 5 6 6 6 6 6 6 7 7 7 7 7 7 8 8 8 8 8 8 9 9 9
## [52] 9 9 9 10 10 10 10 10 10 11 11 11 11 11 11 12 12 12 12 12 12 13 13 13 13 13 13 14 14 14 14 14 14 15 15 15 15 15 15 16 16 16 16 16 16 17 17 17 17 17 17
## [103] 18 18 18 18 18 18 19 19 19 19 19 19 20 20 20 20 20 20 21 21 21 21 21 21 22 22 22 22 22 22 23 23 23 23 23 23 24 24 24 24 24 24 25 25 25 25 25 25 26 26 26
## [154] 26 26 26 27 27 27 27 27 27 28 28 28 28 28 28 29 29 29 29 29 29 30 30 30 30 30 30 31 31 31 31 31 31 32 32 32 32 32 32 33 33 33 33 33 33 34 34 34 34 34 34
## [205] 35 35 35 35 35 35 36 36 36 36 36 36 37 37 37 37 37 37 38 38 38 38 38 38 39 39 39 39 39 39 40 40 40 40 40 40 41 41 41 41 41 41 42 42 42 42 42 42 43 43 43
## [256] 43 43 43 43 44 44 44 44 44 44 44 45 45 45 45 45 45 45 46 46 46 46 46 46 46 47 47 47 47 47 47 47 48 48 48 48 48 48 48 49 49 49 49 49 49 49 50 50 50 50 50
## [307] 50 50 51 51 51 51 51 51 51 52 52 52 52 52 52 52 53 53 53 53 53 53 53 54 54 54 54 54 54 54 55 55 55 55 55 55 55 56 56 56 56 56 56 56 57 57 57 57 57 57 57
## [358] 58 58 58 58 58 58 58 59 59 59 59 59 59 59 60 60 60 60 60 60 60 61 61 61 61 61 61 61 62 62 62 62 62 62 62 63 63 63 63 63 63 63 64 64 64 64 64 64 64 65 65
## [409] 65 65 65 65 66 66 66 66 66 66 67 67 67 67 67 67 68 68 68 68 68 68 69 69 69 69 69 69 70 70 70 70 70 70 71 71 71 71 71 71 72 72 72 72 72 72 73 73 73 73 73
## [460] 73 74 74 74 74 74 74 75 75 75 75 75 75 76 76 76 76 76 76 77 77 77 77 77 77 78 78 78 78 78 78 79 79 79 79 79 79 80 80 80 80 80 80 81 81 81 81 81 81 82 82
## [511] 82 82 82 82 83 83 83 83 83 83 84 84 84 84 84 84 85 85 85 85 85 85 86 86 86 86 86 86 87 87 87 87 87 87 88 88 88 88 88 88 89 89 89 89 89 89 90 90 90 90 90
## [562] 90 91 91 91 91 91 91 92 92 92 92 92 92 93 93 93 93 93 93 94 94 94 94 94 94 95 95 95 95 95 95 96 96 96 96 96 96 97 97 97 97 97 97 98 98 98 98 98 98 99 99
## [613] 99 99 99 99 100 100 100 100 100 100 101 101 101 101 101 101 102 102 102 102 102 102 103 103 103 103 103 103 104 104 104 104 104 104 105 105 105 105 105 105 106 106 106 106 106 106 107 107 107 107 107
## [664] 107 107 108 108 108 108 108 108 108 109 109 109 109 109 109 109 110 110 110 110 110 110 110 111 111 111 111 111 111 111 112 112 112 112 112 112 112 113 113 113 113 113 113 113 114 114 114 114 114 114 114
## [715] 115 115 115 115 115 115 115 116 116 116 116 116 116 116 117 117 117 117 117 117 117 118 118 118 118 118 118 118 119 119 119 119 119 119 119 120 120 120 120 120 120 120 121 121 121 121 121 121 121 122 122
## [766] 122 122 122 122 122 123 123 123 123 123 123 123 124 124 124 124 124 124 124 125 125 125 125 125 125 125 126 126 126 126 126 126 126 127 127 127 127 127 127 127 128 128 128 128 128 128 128 129 129 129 129
## [817] 129 129 130 130 130 130 130 130 131 131 131 131 131 131 132 132 132 132 132 132 133 133 133 133 133 133 134 134 134 134 134 134 135 135 135 135 135 135 136 136 136 136 136 136 137 137 137 137 137 137 138
## [868] 138 138 138 138 138 139 139 139 139 139 139 140 140 140 140 140 140 141 141 141 141 141 141 142 142 142 142 142 142 143 143 143 143 143 143 144 144 144 144 144 144 145 145 145 145 145 145 146 146 146 146
## [919] 146 146 147 147 147 147 147 147 148 148 148 148 148 148 149 149 149 149 149 149 150 150 150 150 150 150 151 151 151 151 151 151 152 152 152 152 152 152 153 153 153 153 153 153 154 154 154 154 154 154 155
## [970] 155 155 155 155 155 156 156 156 156 156 156 157 157 157 157 157 157 158 158 158 158 158 158 159 159 159 159 159 159 160 160
You can specify len=1.7e6 (and omit the cyc argument) to get exactly 1.7 million elements, or you can get a whole number of cycles using cyc.
How about
len <- 2e6
step <- 400
# each 400-long block runs through 1:64 (1-16 repeated 7 times, 17-64
# repeated 6 times), offset by 64 per block
x <- rep(64 * seq(0, ceiling(len / step) - 1), each = step) +
  sort(rep(1:64, length.out = step))
x <- x[seq(len)] # drop the extra elements beyond len
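A quick check against the stated requirements:
x[c(400, 800)]  # should be 64 and 128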
