I am conducting a phylogenetic independent contrast on sex ratios in 200 bird species using RStudio. I need to be able to identify which post-contrast value is from which species so I can find an outlier. However, the row names disappear when I conduct the pic and when I try to add them, I get the error message
Error in names(csr) <- row.names(data) : 'names' attribute [200] must be the same length as the vector [199]
I downloaded my pic values for 'csr' to a .csv and found that the length is indeed now 199. Why is a species missing and how can I attach the species name to my pic values?
Quick note: I thought maybe this was an issue of duplicate data in my original data frame but I checked and none of my species are duplicated.
New R user, thanks in advance!
library(ape)
library(maps)
library(phytools)
library(geiger)
#upload data
>data<-read.csv(file.choose(),row.names=1)
#upload set of trees
> et<-read.nexus(file.choose())
#form consensus tree
> cet<-consensus.edges(et,"least.squares")
[1] "RSS: 74.744666106374"
#make tree dichotomous
> rcet<-multi2di(cet)
#check tree and data match up
> name.check(rcet,data)
[1] "OK"
> sr<-data$SR
> names(sr)<-row.names(data)
> csr<-pic(sr,rcet)
#check to see if row names are attached
>head.matrix(csr)
201 202 203 204
5.090712e-04 -9.531727e-04 1.648872e-03 -4.288288e-03
205 206 207 208
1.460023e-03 1.940847e-03 1.430754e-03 8.495663e-04
209 210 211 212
-1.590387e-03 -4.440047e-03 5.930776e-03 3.885212e-03
213 214 215 216
-8.180639e-03 4.020204e-03 9.256576e-03 2.211563e-02
217 218 219 220
-1.236238e-02 2.187909e-02 8.064300e-03 2.221089e-02
221 222 223 224
-2.730282e-02 5.591690e-03 -1.043775e-02 -5.360213e-03
225 226 227 228
1.414753e-02 1.078473e-02 -2.452003e-03 2.211674e-03
229 230 231 232
3.004410e-03 -5.365461e-03 -5.391057e-03 5.968397e-03
233 234 235 236
8.282451e-03 -7.260091e-03 4.575852e-03 2.860073e-03
237 238 239 240
1.052456e-02 -1.903541e-03 1.125396e-02 6.927645e-03
241 242 243 244
-3.089605e-02 -1.153509e-02 3.953120e-02 5.213560e-02
245 246 247 248
5.349170e-03 1.309613e-03 8.532669e-03 3.641861e-02
249 250 251 252
-2.571262e-02 2.900506e-03 -3.481454e-02 -6.424101e-03
253 254 255 256
9.802964e-03 -3.150135e-03 -1.101131e-02 -2.131249e-02
257 258 259 260
6.274294e-02 2.587949e-02 -8.674770e-04 6.398537e-03
261 262 263 264
-2.207722e-02 6.961859e-03 -7.092074e-03 3.326304e-04
265 266 267 268
-8.826976e-04 2.446652e-02 3.202071e-03 -4.300357e-03
269 270 271 272
-8.697415e-03 1.632332e-02 1.139373e-02 -8.293938e-03
273 274 275 276
3.187131e-03 -2.838793e-03 -3.491220e-03 7.986199e-03
277 278 279 280
-5.931380e-03 8.005507e-04 -1.515201e-03 6.203605e-03
281 282 283 284
-1.763623e-03 2.263001e-02 2.058192e-03 -6.677623e-03
285 286 287 288
-6.068511e-04 1.232161e-02 1.137790e-02 1.129776e-02
289 290 291 292
1.467367e-02 -1.221627e-02 -1.236961e-02 2.468580e-03
293 294 295 296
-1.562174e-02 2.392474e-03 -2.466936e-04 9.032847e-03
297 298 299 300
7.028428e-03 -1.605058e-02 1.090764e-01 3.823460e-03
301 302 303 304
-3.617284e-04 -3.620753e-03 -1.493839e-03 1.757362e-03
305 306 307 308
4.024892e-03 1.011166e-03 3.607874e-04 2.564815e-04
309 310 311 312
1.339123e-03 7.928470e-04 -1.579597e-03 5.422977e-03
313 314 315 316
-2.079001e-03 9.967008e-03 9.050382e-03 -8.922487e-03
317 318 319 320
-1.695307e-04 1.028737e-02 1.216367e-02 3.031379e-03
321 322 323 324
-1.263116e-02 -1.537278e-02 2.242444e-04 2.426469e-03
325 326 327 328
-2.664895e-03 3.884286e-03 6.880529e-03 -5.927206e-04
329 330 331 332
9.830635e-03 1.280008e-03 1.424032e-02 -7.288540e-04
333 334 335 336
-9.240581e-04 3.195132e-04 -4.259236e-03 -2.214205e-03
337 338 339 340
-6.881941e-03 -6.423759e-03 -2.609067e-03 -3.503663e-05
341 342 343 344
2.788641e-03 1.372338e-03 6.089936e-04 9.587636e-04
345 346 347 348
3.785345e-03 -2.026423e-03 -1.177728e-03 6.512821e-04
349 350 351 352
-3.906498e-04 -8.785059e-03 -1.431750e-03 -2.324442e-04
353 354 355 356
-1.076415e-03 1.441769e-03 -1.714267e-03 -1.674929e-03
357 358 359 360
8.652113e-04 1.238680e-03 -7.712385e-04 2.297910e-02
361 362 363 364
2.757363e-03 -1.088451e-05 -7.907335e-03 -4.752825e-03
365 366 367 368
-1.202807e-03 5.597340e-03 -2.864217e-04 8.340569e-04
369 370 371 372
-3.930913e-03 -5.725912e-03 4.980890e-04 3.697257e-03
373 374 375 376
5.995110e-03 -1.339679e-03 -9.186386e-03 8.241024e-03
377 378 379 380
-6.799925e-03 -3.594279e-03 3.265258e-03 3.038261e-03
381 382 383 384
-9.738195e-04 1.535296e-03 -8.603250e-04 -4.378884e-03
385 386 387 388
-2.952824e-03 2.063849e-03 -4.624888e-03 3.525655e-03
389 390 391 392
-5.207749e-03 -9.276466e-04 1.684872e-03 2.511384e-03
393 394 395 396
-2.189145e-03 -1.098284e-02 -4.546533e-03 -1.349024e-03
397 398 399
-5.619031e-04 -5.592868e-03 8.620104e-03
#they're not, so try to add them
> names(csr)<-row.names(data)
Error in names(csr) <- row.names(data) :
'names' attribute [200] must be the same length as the vector [199]
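A note on the length mismatch: pic returns one contrast per internal node, not one per species, and a fully dichotomous tree with 200 tips has 199 internal nodes, so no species has been dropped. The names 201 to 399 in the output above are node numbers. A minimal sketch on a made-up five-tip tree (the tree, the trait values, and the name tips_by_node are illustrative, not from the question) shows how one might list the species below each node:

```r
library(ape)

set.seed(1)
tr <- rtree(5)                          # toy rooted, binary tree with 5 tips
x  <- setNames(rnorm(5), tr$tip.label)  # made-up trait values, named by tip

con <- pic(x, tr)
length(con)   # number of tips minus 1: one contrast per internal node
names(con)    # internal node numbers, not species names

# For each contrast, list the tips (species) descending from its node
tips_by_node <- lapply(as.integer(names(con)),
                       function(nd) extract.clade(tr, nd)$tip.label)
names(tips_by_node) <- names(con)
tips_by_node
```

With the real data, an outlying contrast could then be traced to the clade under its node rather than to a single species.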
I am working on a time-series study of the Czech Republic, using macroeconomic data from 1993 to 2021. I tested my time series for stationarity using both R (function adfTest from package fUnitRoots) and Gretl. The results differ substantially: for example, the differences of GDP are strongly stationary according to Gretl but nonstationary according to R. Both the test statistics and the p-values differ. Do you have any idea why that is, and which result is correct?
The test statistic for the differences (I used the "constant" version and 3 lags, as recommended by R):
According to R: -1.8587
According to Gretl: -4.27469
The p-values:
According to R: 0.3727
According to Gretl: 0.0004865
I am also enclosing the data
Year;GDP_(CZKm)
1993;1 205 330
1994;1 375 851
1995;1 596 306
1996;1 829 255
1997;1 971 024
1998;2 156 624
1999;2 252 983
2000;2 386 289
2001;2 579 126
2002;2 690 982
2003;2 823 452
2004;3 079 207
2005;3 285 601
2006;3 530 881
2007;3 859 533
2008;4 042 860
2009;3 954 320
2010;3 992 870
2011;4 062 323
2012;4 088 912
2013;4 142 811
2014;4 345 766
2015;4 625 378
2016;4 796 873
2017;5 110 743
2018;5 410 761
2019;5 791 498
2020;5 709 131
2021;6 108 717
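For anyone wanting to reproduce the comparison, here is a minimal sketch of reading data in this layout into R (semicolon-separated, with spaces as thousands separators) and taking first differences. Only the first three rows are inlined, and the names gdp and d_gdp are illustrative. Before comparing statistics, it is worth confirming that both programs receive the same differenced series, the same lag order, and the same deterministic terms.

```r
# First three rows of the posted data, inlined for illustration
txt <- "Year;GDP_(CZKm)
1993;1 205 330
1994;1 375 851
1995;1 596 306"

gdp <- read.table(text = txt, sep = ";", header = TRUE,
                  check.names = FALSE, stringsAsFactors = FALSE)
names(gdp) <- c("Year", "GDP")

# Strip the thousands separators before converting to numeric
gdp$GDP <- as.numeric(gsub(" ", "", gdp$GDP))

# First differences, the series the unit-root test was run on
d_gdp <- diff(gdp$GDP)
d_gdp
```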
I am trying to write a function to calculate Williams %R on data in R. Here is my code:
getSymbols('AMD', src = 'yahoo', from = '2018-01-01')
wr = function(high, low, close, n) {
highh = runMax((high),n)
lowl = runMin((low),n)
-100 * ((highh - close) / (highh - lowl))
}
williampr = wr(AMD$AMD.High, AMD$AMD.Low, AMD$AMD.Close, n = 10)
After implementing a buy/sell/hold signal, it returns integer(0):
## 1 = BUY, 0 = HOLD, -1 = SELL
## implement Lag to shift the time back to the previous day
tradingSignal = Lag(
## if wpr is greater than 0.8, BUY
ifelse(Lag(williampr) > 0.8 & williampr < 0.8,1,
## if wpr signal is less than 0.2, SELL, else, HOLD
ifelse(Lag(williampr) > 0.2 & williampr < 0.2,-1,0)))
## make all missing values equal to 0
tradingSignal[is.na(tradingSignal)] = 0
## see how many SELL signals we have
which(tradingSignal == "-1")
What am I doing wrong?
It would have been a good idea to identify that you were using the package quantmod in your question.
There are two things preventing this from working.
First, the values in williampr are not what you expected: they are all negative. Second, you multiplied them by 100, so 80% is 80, not 0.8. I removed the -100 * factor.
I have done the same thing so many times.
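library(quantmod)  # provides getSymbols() and Lag(); attaches TTR for runMax()/runMin()
wr = function(high, low, close, n) {
  highh = runMax(high, n)   # highest high over the last n periods
  lowl = runMin(low, n)     # lowest low over the last n periods
  # fraction between 0 and 1; without -100 the 0.8 / 0.2 thresholds apply
  (highh - close) / (highh - lowl)
}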
That's it. It works now.
which(tradingSignal == "-1")
# [1] 13 15 19 22 39 71 73 84 87 104 112 130 134 136 144 146 151 156 161 171 175
# [22] 179 217 230 255 268 288 305 307 316 346 358 380 386 404 449 458 463 468 488 492 494
# [43] 505 510 515 531 561 563 570 572 574 594 601 614 635 642 644 646 649 666 668 672 691
# [64] 696 698 719 729 733 739 746 784 807 819 828 856 861 872 877 896 900 922 940 954 968
# [85] 972 978 984 986 1004 1035 1048 1060
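The scale issue described in this answer can be checked without downloading any prices. The sketch below uses made-up high/low/close series and a hypothetical base-R helper roll() in place of runMax/runMin, purely to show the range of each version of the indicator:

```r
# Rolling extreme over the last n observations (NA until n points are available)
roll <- function(x, n, f) {
  sapply(seq_along(x), function(i) if (i < n) NA_real_ else f(x[(i - n + 1):i]))
}

set.seed(42)
close <- 100 + cumsum(rnorm(60))   # made-up prices, for illustration only
high  <- close + runif(60, 0, 1)
low   <- close - runif(60, 0, 1)

hh <- roll(high, 10, max)
ll <- roll(low, 10, min)

wr_classic <- -100 * (hh - close) / (hh - ll)  # conventional Williams %R
wr_scaled  <- (hh - close) / (hh - ll)         # rescaled version from the answer

range(wr_classic, na.rm = TRUE)  # stays within -100 and 0
range(wr_scaled,  na.rm = TRUE)  # stays within 0 and 1
```

Because the conventional version never exceeds 0, comparisons against 0.8 or 0.2 can never produce a crossing, which is why the original signal came back empty.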
I have some data taken from an instrument moving through the water. The instrument followed a zigzag path, and flow data were logged every 0.5 seconds.
I need to make a graph showing flow along the path of the instrument with different colors for each flow value.
How can I use R to plot the graph?
Here's some of my data:
idNr flow dep
27 0.288261301 4.04
28 0.321201425 3.96
29 0.348002863 4.05
30 0.266207609 3.98
31 0.344623682 3.98
32 0.33590977 4.02
33 0.333196711 3.98
34 0.443371838 4.08
35 0.751650508 4.35
36 1.026660332 5.15
37 1.79303221 6.52
38 1.804413243 8.04
39 1.773816905 9.55
40 1.782303493 10.99
41 1.726813914 12.49
42 1.61061413 13.95
43 1.747734972 15.44
44 1.619344989 16.88
45 1.527087967 18.37
46 1.552443997 19.84
47 1.580849856 21.36
48 1.47038517 22.8
49 1.392708417 24.28
50 1.56442883 25.78
51 1.777948528 27.22
52 1.802147241 28.7
53 1.87299915 30.2
54 2.053852522 31.7
55 1.642625947 33.18
56 1.427217507 34.62
57 1.52030689 36.05
58 1.417431073 37.55
59 1.443192082 39.1
60 1.34374145 40.56
61 1.421155629 42.01
62 1.333494728 43.58
63 1.3194019 45.03
64 1.394158603 46.62
65 1.429844828 48.08
66 1.367911241 49.58
67 1.355840925 51.02
68 1.378465281 52.55
69 1.523250886 53.91
70 1.365535668 55.61
71 1.396372615 57.04
72 1.347452677 58.57
73 1.382778102 60.02
74 1.455112272 61.48
75 1.350807161 63.02
76 1.386283066 64.42
77 1.390035765 65.97
78 1.383424985 67.4
79 1.395385154 68.96
80 1.381371239 70.49
81 1.400707773 71.98
82 1.476066775 73.48
83 1.284056739 75.03
84 1.475329288 76.46
85 1.459387313 78.01
86 1.465585987 79.61
87 1.431249165 81.19
88 1.357601114 82.55
89 1.382301557 84.12
90 1.445689198 85.61
91 1.36922513 87.16
92 1.520221768 88.68
93 1.498713299 90.21
94 1.598120373 91.74
95 1.434218834 93.23
96 1.526169617 94.77
97 1.53240429 96.32
98 1.593795786 97.73
99 1.60067114 99.26
100 1.699682725 100.85
101 1.656267698 102.39
102 1.688246548 103.97
103 1.665317693 105.45
104 1.710451732 106.94
105 1.558604843 108.5
106 1.682163929 109.96
107 1.754686611 111.43
108 1.451731985 112.94
109 1.75889143 114.49
110 1.578577562 115.98
111 1.660725181 117.43
112 1.638376473 119.05
113 1.701685385 120.54
114 1.603928968 122.06
115 1.716673882 123.55
116 1.721613119 125.04
117 1.520097264 126.52
118 1.673504264 128.04
119 1.651535476 129.54
120 1.716307179 131.06
121 1.661417444 132.56
122 1.807044943 134.09
123 1.670927777 135.61
124 1.816092103 137.15
125 1.61581054 138.66
126 1.443289015 140.17
127 1.548887918 141.65
128 1.742922886 143.19
129 1.407817467 144.64
130 1.537282981 146.2
131 1.605707701 147.66
132 1.595514766 149.11
133 1.664969522 150.64
134 1.597832102 152.2
135 1.656365329 153.63
136 1.475701825 155.17
137 1.584298389 156.64
138 1.511851004 158.09
139 1.81195684 159.64
140 1.429699891 161.16
141 1.453907433 162.64
142 1.583450822 164.13
143 1.670092861 165.61
144 1.564082726 167.12
145 1.705749786 168.64
146 1.617373325 170.1
147 1.705749786 171.67
148 1.750116174 173.24
149 1.612112174 174.71
150 1.543739041 176.19
151 1.658449408 177.8
152 1.544094384 179.26
153 1.660865163 180.72
154 1.718616091 182.32
155 1.652198157 183.8
156 1.663230727 185.21
157 1.760015837 186.74
158 1.543345815 188.16
159 1.518992563 189.66
160 1.719743279 191.14
161 1.871325988 192.63
162 1.338309201 194.15
163 1.834802202 195.66
164 1.900303456 197.19
165 1.789994802 198.73
166 1.641265789 200.15
167 1.711407354 201.66
168 1.777665955 203.12
169 1.650013219 204.61
170 1.752274015 206.09
171 1.769734944 207.59
172 1.63480019 209.04
173 1.67727874 210.53
174 1.661860415 212.01
175 1.670834431 213.51
176 1.875008828 214.91
177 1.198086144 216.41
178 1.473233127 217.84
179 1.401750052 219.35
180 1.465064864 220.81
181 1.507361683 222.31
182 1.594120168 223.76
183 1.603827938 225.28
184 1.836556292 226.72
185 1.778422956 228.19
186 1.644971831 229.7
187 1.803737202 231.17
188 1.828251113 232.68
189 1.762103594 234.16
190 1.659823074 235.6
191 1.894313483 237.05
192 1.814170041 238.56
193 1.887235594 239.99
194 1.941720868 241.54
195 1.948726387 242.95
196 1.563589543 244.41
197 2.066101568 245.9
198 1.880538604 247.35
199 1.728849162 248.83
200 1.653487854 250.29
201 1.79743983 251.68
202 1.871167606 253.23
203 2.080688007 254.73
204 1.929135455 256.02
205 1.911069255 257.82
206 1.989423875 259.3
207 1.990416392 260.79
208 2.114887554 262.26
209 1.906502828 263.69
210 1.85085669 265.23
211 1.831357625 266.45
212 1.045764375 266.63
213 0.917485268 266.81
214 0.569920704 266.97
215 0.622531097 267.12
216 0.256491127 267.56
217 0.274736316 267.07
218 0.292981505 266.97
219 0.214553306 266.97
220 0.136125106 267.07
221 0.160799901 267.07
222 0.185474696 267.02
223 0.210149491 267.05
224 0.234824285 267.18
225 0.25949908 267.09
226 0.284173875 267.07
227 0.30884867 267.09
228 0.333523464 267.22
229 0.358198259 267.14
230 0.382873054 267.12
231 0.407547849 267.03
232 0.600179663 266.73
233 0.433797472 266.11
234 1.060132593 265.09
235 1.425310466 264.15
236 1.486875715 263.14
237 1.43689659 262.02
238 1.431465999 260.75
239 1.583936163 259.53
240 1.473233127 258.31
241 1.531834262 257.17
242 1.533104902 255.91
243 1.328999131 254.8
244 1.421761697 253.69
245 1.335436257 252.51
246 1.419856749 251.32
247 1.3703995 250.15
248 1.408783282 248.96
249 1.517088222 247.77
250 1.343802584 246.62
251 1.421537306 245.37
252 1.40061345 244.24
253 1.272161015 243.04
254 1.406138921 241.84
255 1.085843236 240.68
256 1.33602153 239.51
257 1.388280895 238.34
258 1.529552382 237.13
259 1.369493364 235.92
260 1.483642587 234.77
261 1.32851338 233.57
262 1.44039718 232.31
263 1.496418268 231.13
264 1.165322973 229.97
265 1.106764899 228.73
266 1.329991043 227.55
267 1.207740991 226.4
268 1.235528323 225.22
269 1.306827536 224.05
270 1.143623212 222.81
271 1.441714487 221.66
272 1.232362719 220.44
273 1.383297901 219.22
274 1.352388645 218.1
275 1.311460327 216.89
276 1.321217162 215.72
277 1.207279506 214.54
278 1.398024867 213.36
279 1.213781005 212.17
280 1.261582854 210.99
281 1.25985833 209.84
282 1.286305023 208.73
283 1.182046703 207.58
284 1.184988693 206.45
285 1.402951933 205.31
286 1.192968302 204.19
287 1.196836091 203.12
288 1.32083886 201.98
289 1.253255036 200.76
290 1.317613352 199.63
291 1.18891426 198.5
292 1.112625424 197.37
293 1.449004681 196.23
294 1.221578748 195.05
295 1.08139601 193.89
296 1.124301897 192.79
297 1.317373248 191.67
298 0.752773451 190.53
299 1.286574964 189.41
300 1.221500152 188.31
301 1.277374406 187.24
302 1.14505629 186.12
303 1.114030657 184.93
304 1.156791522 183.85
305 1.339019098 182.71
306 1.072565891 181.58
307 1.219754186 180.44
308 1.176963397 179.29
309 1.311925665 178.2
310 1.106051874 176.99
311 1.210041464 175.82
312 1.08161895 174.75
313 1.236070783 173.61
314 1.249614259 172.45
315 1.129300446 171.35
316 1.091558486 170.16
317 1.191516344 169.05
318 1.127521887 167.92
319 1.218621558 166.77
320 1.213781005 165.66
321 1.128293182 164.56
322 1.088186282 163.4
323 1.137727115 162.3
324 1.127388814 161.14
325 1.12490278 160.04
326 1.161036195 158.9
327 1.151329101 157.69
328 1.15757108 156.62
329 1.021796004 155.52
330 1.209076674 154.36
331 1.140870746 153.2
332 1.216107261 152.13
333 1.143405517 150.94
334 1.215779458 149.84
335 1.197327517 148.74
336 1.160052933 147.56
337 1.15880757 146.43
338 1.140225768 145.32
339 1.150143952 144.23
340 1.181602396 143.07
341 1.262274239 141.98
342 1.150799931 140.89
343 1.190711665 139.69
344 1.192584323 138.56
345 1.122635588 137.42
346 1.17500199 136.32
347 1.23807517 135.17
348 1.237353656 134.06
349 1.186746005 132.94
350 1.194924674 131.84
351 1.259003717 130.69
352 1.19677967 129.59
353 1.194242503 128.47
354 1.220205134 127.36
355 1.127990121 126.24
356 1.213781005 125.12
357 1.225973285 123.98
358 1.084756214 122.92
359 1.16161302 121.79
360 1.133334361 120.69
361 1.103127249 119.47
362 1.213781005 118.39
363 1.145836506 117.33
364 1.20781161 116.23
365 1.268126196 115.11
366 1.246070826 113.87
367 1.152060081 112.76
368 1.115006535 111.65
369 1.317280173 110.52
370 1.122671561 109.52
371 1.023553395 108.36
372 1.238632991 107.22
373 1.235124322 106.09
374 1.345223487 104.93
375 1.129630965 103.82
376 1.203403012 102.73
377 1.218779527 101.63
378 1.083551945 100.55
379 1.217290587 99.34
380 1.189233024 98.26
381 1.181995069 97.11
382 1.178651465 95.98
383 1.167093928 94.92
384 1.151487191 93.77
385 1.117688875 92.67
386 1.20046723 91.55
387 1.110913079 90.4
388 1.159221326 89.29
389 1.129196317 88.25
390 1.147979401 87.2
391 1.063138489 86.2
392 1.039855855 85.19
393 1.055872599 84.26
394 1.021566005 83.23
395 1.031262374 82.26
396 1.036922271 81.3
397 1.071390137 80.36
398 1.051908853 79.33
399 1.05363193 78.37
400 1.049712054 77.42
401 1.065736139 76.43
402 1.040108714 75.45
403 1.036886439 74.46
404 1.034880221 73.49
405 1.026660332 72.52
406 1.045490737 71.48
407 1.041245049 70.52
408 1.043207468 69.59
409 1.040543451 68.63
410 1.048860034 67.59
411 1.051861014 66.62
412 1.037198757 65.68
413 1.046622175 64.69
414 1.030435095 63.77
415 1.030156191 62.77
416 1.03762919 61.78
417 1.050493282 60.81
418 1.038155437 59.82
419 1.045090365 58.8
420 1.043834133 57.86
421 1.029653777 56.85
422 1.030381872 55.92
423 1.05024807 54.98
424 1.04567738 54.06
425 1.050961984 53.12
426 1.064207877 52.05
427 1.041433576 51.05
428 1.049447154 49.99
429 1.030094049 49.05
430 1.061292117 48.12
431 1.03820263 47.11
432 1.047136302 46.12
433 1.043207468 45.15
434 1.031741777 44.27
435 1.042840882 43.31
436 1.023755536 42.17
437 1.039746288 41.24
438 1.01600053 40.31
439 1.043669864 39.36
440 1.034305507 38.36
441 1.037222818 37.37
442 1.071353563 36.34
443 1.041574131 35.4
Not exactly what you were asking for, but this shows that the flows were distinctly lower as the device was ascending - something the colors don't really demonstrate IMO.
library(ggplot2)
# label each sample by whether it was logged before or after the maximum dep
data$direction <- with(data, ifelse(idNr < idNr[which.max(dep)],
                                    "Descending", "Ascending"))
ggplot(data, aes(x = dep, y = flow)) +
  geom_path(aes(color = direction)) +
  theme_bw()
One possible way (assuming I've got the x-axis and y-axis variables correct and that dep is "depth", which is why I inverted the y-axis):
library(ggplot2)
gg <- ggplot(dat, aes(x=idNr, y=dep))
gg <- gg + geom_line(aes(color=flow), size=2)
gg <- gg + scale_y_reverse()
gg <- gg + scale_color_gradient(low="blue", high="red")
gg <- gg + theme_bw()
gg
Though you might be better off using cut to define discrete breaks (and, hence, colors) for the flow values rather than assigning a continuous color range.
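As a rough sketch of the cut idea, here are five flow values taken from the posted data, binned with break points that are an arbitrary choice for illustration (the name flow_bin is also made up):

```r
# Five flow values from the question (idNr 27, 35, 36, 37 and 54)
flow <- c(0.288261301, 0.751650508, 1.026660332, 1.79303221, 2.053852522)

# Discrete bins of width 0.5; the break points are an assumption
flow_bin <- cut(flow, breaks = c(0, 0.5, 1, 1.5, 2, 2.5))

levels(flow_bin)
table(flow_bin)
```

In the ggplot2 code, one would presumably map aes(color = flow_bin, group = 1) so the path stays a single line, and replace scale_color_gradient with a discrete scale.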
I have a time series. When I plot it, I get a diagram like this.
My data:
539 532 531 538 544 554 575 571 543 559 511 525 512 540
535 514 524 527 532 547 564 548 572 564 549 532 519 520
520 543 550 542 528 523 531 548 554 574 575 560 534 518
511 519 527 554 543 527 540 524 523 539 569 552 553 540
522 522 492 519 532 527 532 550 535 517 551 548 571 574
539 535 515 512 510 527 533 543 540 533 519 539 555 542
574 543 555 539 507 522 518 519 516 546 523 530 532 539
540 568 554 563 550 526 509 492 525 519 527 526 515 530
531 553 563 562 576 568 539 516 512 500 516 542 522 527
523 531
How can I smooth this graph to see the sine function more clearly?
Here are some things to get you started.
# values is assumed to hold the series from the question as a numeric vector
df <- data.frame(index=1:length(values),values)
# loess smoothing; note the use of predict(fit)
fit.loess <- loess(values~index,df,span=.1)
plot(df, type="l", col="blue",main="loess")
lines(df$index,predict(fit.loess),col="red")
# non-linear regression using a single sine term
fit.nls <- nls(values~a*sin(b*index+c)+d,df,
start=c(a=1000,b=pi/10,c=0,d=mean(df$values)))
plot(df, type="l", col="blue",main="sin [1 term]")
lines(df$index,predict(fit.nls),col="red")
# non-linear regression using 2 sine terms
fit.nls <- nls(values~a1*sin(b1*index+c1)+a2*sin(b2*index+c2)+d,df,
start=c(a1=1000,b1=pi/10,c1=1,
a2=1000,b2=pi/2,c2=1,d=mean(df$values)))
plot(df, type="l", col="blue",main="sin [2 terms]")
lines(df$index,predict(fit.nls),col="red")
From the non-linear fits you can get an estimate of the angular frequency b using summary(fit.nls); the period is 2*pi/b.
Read the documentation on loess, nls, and predict.
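To make the link between b and the period concrete: in a*sin(b*index+c)+d, one full cycle spans 2*pi/b index steps. A small worked example, using the starting value for b from the fits above rather than a fitted estimate (the names b_start and period are illustrative):

```r
b_start <- pi / 10           # starting value used for b in the fits above
period  <- 2 * pi / b_start  # index steps per full cycle
period
# for a fitted one-term model, coef(fit.nls)[["b"]] would take the place of b_start
```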
You can use a smoothing function from any R package you wish. A simple option is a moving average.
Here is a scenario that is easy to explore (I hope it helps):
#Read the data
cd4Data <- read.table("./RData/cd4.data", col.names=c("time", "cd4", "age", "packs", "drugs", "sex", "cesd", "id"))
cd4Data <- cd4Data[order(cd4Data$time),]
head(cd4Data)
n <- nrow(cd4Data)
#Plot the data
par(mfrow=c(1,1))
plot(cd4Data$time,cd4Data$cd4,pch=19,cex=0.1)
#A moving average (5-point window: 2 points on each side)
plot(cd4Data$time,cd4Data$cd4,pch=19,cex=0.1)
aveTime <- aveCd4 <- rep(NA,n)
for(i in 3:(n-2)){
  aveTime[i] <- mean(cd4Data$time[(i-2):(i+2)])
  aveCd4[i] <- mean(cd4Data$cd4[(i-2):(i+2)])
}
lines(aveTime,aveCd4,col="blue",lwd=3)
#Average many more points (401-point window: 200 points on each side)
plot(cd4Data$time,cd4Data$cd4,pch=19,cex=0.1)
aveTime <- aveCd4 <- rep(NA,n)
for(i in 201:(n-200)){
  aveTime[i] <- mean(cd4Data$time[(i-200):(i+200)])
  aveCd4[i] <- mean(cd4Data$cd4[(i-200):(i+200)])
}
lines(aveTime,aveCd4,col="blue",lwd=3)
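The same centered moving average can be written without an explicit loop. The sketch below is one possible alternative, not the approach used in this answer: it applies stats::filter to the first 14 values of the series from the question, with a 5-point window chosen only for illustration.

```r
# First 14 values of the series posted in the question
x <- c(539, 532, 531, 538, 544, 554, 575, 571, 543, 559, 511, 525, 512, 540)

# Centered 5-point moving average; the two values at each end are NA
ma <- stats::filter(x, rep(1 / 5, 5), sides = 2)

plot(x, type = "l", col = "blue")
lines(as.numeric(ma), col = "red", lwd = 2)
```

A wider window smooths more but leaves more NA values at both ends.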
I have the data given below, and my goal is to aggregate column v3 over bins of columns v1 and v2, combining the v3 values that fall into each (v1, v2) bin. For example, the first row falls in the bin v1=21, v2=16, so its v3 value is aggregated with the other rows in that bin, and likewise for the remaining rows. I want to use the mean as the aggregation function.
> df
v1 v2 v3
1 21.359 16.234 24.283
2 47.340 9.184 21.328
3 35.363 -13.258 14.556
4 -29.888 14.154 17.718
5 -10.109 -16.994 20.200
6 -32.387 1.722 15.735
7 49.240 -5.266 17.601
8 -38.933 2.558 16.377
9 41.213 5.937 21.654
10 -33.287 -4.028 19.525
11 -10.223 11.961 16.756
12 -48.652 16.558 20.800
13 44.778 27.741 17.793
14 -38.546 29.708 13.948
15 -45.622 4.729 17.793
16 -36.290 12.383 18.014
17 -19.626 19.767 18.182
18 -32.248 29.480 15.108
19 -41.859 35.502 8.490
20 -36.058 21.191 16.714
21 -23.588 0.524 21.471
22 -24.423 39.963 18.257
23 -0.042 -45.899 17.654
24 -35.479 32.049 9.294
25 -24.632 20.603 17.757
26 -26.591 25.882 18.968
27 -34.364 43.959 13.905
28 -19.334 29.728 20.102
29 12.304 -39.997 17.002
30 0.958 37.162 20.779
31 -35.475 -40.611 14.719
32 -39.268 44.382 11.247
33 -10.154 39.053 19.458
34 -12.612 32.056 17.759
35 2.730 -1.473 20.228
36 -45.326 -52.299 9.305
37 -1.996 -15.551 13.295
38 -26.655 -37.319 19.148
39 -18.509 -30.047 18.889
40 -22.705 -25.577 19.007
41 -15.705 -15.397 19.112
42 -2.637 9.790 10.548
43 -14.107 -3.145 19.654
44 -29.272 -19.906 18.503
45 -9.569 -4.632 11.334
46 2.114 18.048 14.744
47 -4.241 16.073 15.420
48 31.869 -3.394 21.559
49 20.425 35.205 22.250
50 -18.605 -8.866 20.082
51 -26.677 -7.690 21.850
52 -5.240 4.805 11.399
53 -6.766 2.538 6.292
54 4.567 22.554 19.682
55 -20.701 6.430 20.996
56 -23.972 16.141 17.976
57 -6.651 24.048 18.082
58 -32.243 -6.100 19.517
59 2.236 29.736 19.667
60 18.830 15.586 15.969
61 -9.598 28.414 17.806
62 -30.825 12.194 22.346
63 -17.415 15.795 18.135
64 -14.823 5.931 17.915
65 -14.234 12.882 13.001
66 9.937 18.368 20.421
67 -38.766 9.590 21.648
68 -30.896 27.047 16.453
69 -4.432 -10.562 10.061
70 -4.290 33.170 22.942
71 7.285 41.416 23.906
72 24.411 40.531 23.584
73 45.409 -32.420 20.831
74 49.341 -34.047 15.269
75 -7.730 -47.724 21.692
76 -10.563 -29.082 17.984
77 4.412 -41.182 16.845
78 31.822 -37.297 19.665
79 -43.355 31.093 17.688
80 -44.353 -44.723 13.832
81 -16.961 38.438 20.715
82 -21.225 -39.244 18.156
83 -42.022 -8.686 20.362
84 -42.904 -25.498 18.394
85 43.822 -25.990 21.287
86 43.013 -9.071 19.285
87 -36.901 -24.185 21.938
88 -28.251 -36.583 19.330
89 -19.830 -22.412 21.677
90 -3.789 -15.663 17.439
91 40.453 -21.796 17.432
92 -40.778 -31.188 18.762
93 -27.072 -48.609 18.913
94 -18.035 -1.791 19.909
95 -20.781 -7.912 22.563
96 47.307 -15.432 19.101
97 30.700 5.097 22.801
98 46.453 0.171 17.810
99 -27.439 -5.860 22.626
100 -30.526 -18.007 23.219
101 -18.280 -15.187 25.302
102 -18.367 6.044 18.864
103 41.265 -1.686 22.743
104 29.227 -14.814 19.196
105 -36.080 -32.715 18.930
106 7.475 7.061 25.002
107 -18.586 -45.207 21.864
108 35.227 11.148 21.388
109 -7.581 38.773 22.048
110 -43.685 14.083 22.037
111 -29.533 39.735 17.613
112 8.760 -39.400 22.421
113 -14.962 24.624 12.030
114 18.627 -32.888 23.036
115 -31.300 33.612 15.608
116 -38.024 45.839 16.567
117 -15.104 36.893 18.162
118 -12.809 -23.029 21.589
119 -21.614 36.264 16.680
120 42.917 -36.838 18.738
121 6.104 -14.961 14.468
122 44.032 -41.556 17.618
123 -24.493 21.886 17.366
124 -24.361 29.941 14.374
125 -25.060 43.383 16.437
126 -6.017 -24.640 19.207
127 -32.617 -40.549 18.059
128 -43.285 -43.364 18.827
129 -29.856 -46.089 16.881
130 -16.547 -43.619 22.547
131 -16.257 42.814 18.932
132 -9.236 -11.694 14.455
133 13.488 -35.422 24.436
134 -47.456 -32.714 18.123
135 39.476 -28.008 16.087
136 -21.933 -43.522 15.390
137 -17.347 -38.250 16.738
138 -4.948 -39.747 21.598
139 -31.018 -28.912 21.332
140 -36.364 30.461 17.542
141 -39.639 18.272 23.663
142 -24.162 -13.582 19.136
143 -8.935 -32.699 22.108
144 0.001 -19.219 17.888
145 -6.912 -24.885 20.683
146 7.785 -31.229 15.972
147 22.176 -7.478 21.335
148 8.755 -13.323 20.831
149 44.081 41.160 11.938
150 -8.451 -37.721 17.465
151 18.671 -2.776 23.374
152 12.668 -26.749 18.071
153 1.582 -21.252 20.750
154 20.832 -27.718 16.190
155 44.220 -45.690 12.598
156 -0.226 -37.737 17.634
157 -25.130 -19.197 23.170
158 2.086 -31.271 18.180
159 -20.445 -33.083 19.984
160 23.801 1.116 24.230
161 18.283 -17.922 20.256
162 -38.985 -13.770 20.702
163 -26.264 -27.413 20.276
164 10.396 -19.375 20.415
165 -16.343 -22.847 16.516
166 29.992 -8.215 21.661
167 35.052 -19.475 16.953
168 3.052 -6.800 22.509
169 -10.350 -5.413 19.222
170 14.371 -10.383 23.471
171 11.896 -4.191 21.773
172 18.152 8.741 23.669
173 25.748 -47.786 18.578
174 31.613 -0.735 23.898
175 12.660 25.645 23.549
176 2.933 29.345 25.170
177 9.369 18.791 26.817
178 15.805 4.798 27.866
179 27.556 -25.571 14.796
180 -5.112 -7.835 21.201
181 -30.571 3.471 20.496
182 19.816 -22.114 21.210
183 2.826 47.437 22.911
184 25.488 -33.064 21.442
185 44.826 42.162 22.994
186 25.208 -48.487 25.325
187 14.635 -17.430 17.083
188 -1.901 -33.370 22.163
189 12.306 -47.265 20.052
190 42.552 35.750 23.213
191 37.318 -46.069 22.599
192 4.725 -22.289 21.600
193 -40.815 -37.793 17.371
194 11.890 -12.862 14.286
195 35.251 -31.746 17.816
196 27.121 -27.638 19.677
197 36.024 -39.105 20.202
198 -47.119 41.940 17.526
199 0.837 -40.694 23.063
200 23.797 -39.795 20.198
201 -42.859 -21.372 23.554
202 39.407 -20.211 21.246
203 25.782 -18.892 20.423
204 34.529 -9.576 20.411
205 44.397 -13.247 23.180
206 5.534 6.856 14.248
207 31.598 -18.085 22.350
208 7.250 -0.481 15.453
209 -43.458 -15.204 23.193
210 -38.296 -31.524 21.776
211 4.276 -3.483 12.145
212 25.757 -11.708 22.360
213 15.634 37.478 24.624
214 -43.669 -3.197 20.742
215 45.381 6.365 21.351
216 -38.755 -6.877 20.879
217 -6.925 3.994 21.120
218 8.059 12.831 26.032
219 3.572 22.105 26.920
220 16.042 30.267 21.039
221 26.629 13.042 23.633
222 -12.126 -0.151 21.261
223 -11.981 24.600 19.236
224 29.480 28.362 21.838
225 -2.500 22.858 23.177
226 -41.163 19.863 20.059
227 35.953 27.401 19.101
228 -16.641 13.248 17.984
229 -3.778 14.090 18.943
230 11.643 34.817 21.621
231 34.921 38.666 17.359
232 25.621 22.451 22.866
233 34.936 17.384 19.836
234 40.017 37.599 13.987
235 19.547 33.838 22.575
236 11.197 39.977 19.347
237 16.972 -33.927 14.205
238 22.938 38.064 20.351
239 40.234 18.672 23.030
240 -0.846 42.320 18.383
241 -11.437 18.284 16.502
242 19.552 43.222 21.370
243 13.925 -46.486 18.917
244 41.709 -39.559 16.143
245 19.014 -44.563 17.796
246 32.260 33.114 18.402
247 -4.693 29.228 18.622
248 21.765 -38.452 15.147
249 39.157 -31.135 19.800
250 32.638 46.241 18.943
251 2.797 10.089 21.330
252 8.256 46.910 18.834
253 38.634 -2.429 20.413
254 28.642 2.763 19.580
255 0.456 1.422 7.452
256 3.050 11.792 14.196
257 24.736 14.532 17.886
258 16.787 -10.155 18.607
259 12.676 11.651 18.656
260 13.184 1.081 15.385
261 27.365 26.576 25.486
262 -7.878 -18.191 14.547
263 -42.112 32.576 20.865
264 15.069 21.684 17.986
265 33.045 27.166 25.252
266 21.810 -0.186 19.477
267 18.227 26.690 20.415
268 33.759 18.366 21.255
269 39.491 13.272 23.036
270 30.662 9.368 20.192
271 5.470 35.303 22.685
272 21.663 -44.343 20.999
273 31.261 33.178 24.335
274 21.854 22.665 20.876
275 21.853 7.932 18.588
276 -40.168 3.682 19.642
277 -42.292 23.997 22.199
278 10.233 28.731 21.263
279 17.745 41.831 19.536
280 38.406 25.165 26.534
281 -49.329 -0.465 20.887
282 40.398 -8.120 21.362
283 -2.531 46.118 22.933
284 7.959 -30.856 20.497
285 -34.467 -23.724 22.206
286 30.541 44.284 25.878
287 45.682 29.897 21.964
288 -22.251 -0.089 20.756
289 21.484 16.532 23.513
290 46.912 10.195 21.908
291 35.320 -13.352 16.102
292 -30.431 14.048 17.362
293 -8.976 -17.325 21.645
294 -32.661 2.301 16.805
295 49.317 -5.509 17.711
296 -37.756 4.459 16.054
297 41.445 6.158 21.442
298 -33.148 -3.499 19.543
299 -10.065 12.238 16.649
300 -48.323 17.153 20.974
301 45.010 28.147 17.838
302 -39.630 29.183 13.254
303 -45.191 5.065 18.214
304 -35.936 11.953 16.540
305 -19.816 19.624 18.279
306 -32.055 29.757 15.358
307 -41.533 36.169 10.005
308 -35.448 20.960 16.720
309 -23.384 0.511 20.005
310 -25.101 40.569 18.180
311 -0.547 -45.779 17.603
312 -35.291 32.643 9.548
313 -25.109 20.826 17.494
314 -26.202 27.012 18.678
315 -34.805 43.850 14.006
316 -18.819 30.611 20.309
317 13.019 -40.248 16.874
318 -0.655 37.112 20.924
319 -34.142 -41.553 15.237
320 -39.509 43.886 12.464
321 -9.491 38.639 18.839
322 -12.164 31.977 17.598
323 3.437 -1.596 20.318
324 -45.713 -52.599 9.918
325 -2.062 -15.946 12.847
326 -27.435 -37.600 18.257
327 -18.094 -29.624 18.791
328 -22.647 -26.123 18.746
329 -16.775 -15.505 19.204
330 -2.628 9.599 11.219
331 -15.718 -1.797 19.491
332 -29.476 -20.107 17.485
333 -10.618 -4.938 12.227
334 1.423 17.458 14.706
335 -4.503 16.630 14.718
336 32.450 -2.029 21.591
337 20.529 35.464 21.630
338 -19.348 -7.844 19.464
339 -26.760 -6.856 21.422
340 -4.539 4.393 11.819
341 -5.741 1.934 7.121
342 4.781 21.919 18.908
343 -19.797 6.928 20.928
344 -24.555 16.834 19.796
345 -5.664 24.465 18.432
346 -32.891 -6.571 18.691
347 2.354 28.462 19.825
348 18.058 16.251 16.335
349 -9.603 28.582 17.743
350 -31.282 11.454 22.342
351 -17.580 16.428 18.401
352 -13.884 6.206 17.270
353 -13.631 13.767 11.761
354 9.712 18.008 18.896
355 -37.987 9.024 21.309
356 -29.969 27.506 16.964
357 -4.248 -10.813 9.284
358 -5.755 32.673 22.541
359 6.675 41.952 24.227
360 24.564 41.173 23.241
361 45.314 -32.299 20.778
362 -45.890 -33.510 16.314
363 -8.277 -47.943 21.573
364 -11.044 -29.464 17.708
365 3.972 -41.396 17.411
366 31.776 -36.643 19.998
367 -43.072 31.311 17.828
368 -45.805 -43.071 14.477
369 -15.628 39.837 19.709
370 -21.129 -39.101 18.814
371 -41.628 -8.980 19.850
372 -42.244 -23.659 18.856
373 44.149 -25.710 21.099
374 42.623 -9.185 20.147
375 -35.949 -23.979 22.255
376 -28.512 -36.367 19.378
377 -19.827 -21.781 21.621
378 -3.429 -15.706 18.677
379 39.741 -20.721 18.670
380 -41.663 -29.499 19.260
381 -26.931 -48.467 18.185
382 -17.571 -1.467 19.770
383 -20.039 -7.591 22.737
384 46.370 -14.790 19.922
385 30.710 4.167 22.987
386 46.755 0.417 18.088
387 -27.293 -4.398 22.168
388 -30.364 -17.573 23.869
389 -16.870 -14.893 25.817
390 -18.152 6.546 18.392
391 40.134 0.160 23.661
392 28.179 -14.323 19.301
393 -35.907 -32.647 19.306
394 8.486 7.101 24.551
395 -17.155 -45.435 22.745
396 34.226 10.748 19.773
397 -7.760 38.754 22.211
398 -42.899 13.804 22.628
399 -29.972 40.435 17.784
400 8.764 -39.195 22.070
401 -15.624 25.585 12.291
402 18.620 -33.314 23.282
403 -30.436 34.219 15.102
404 -37.665 44.955 15.257
405 -15.861 37.488 18.956
406 -13.375 -22.408 20.312
407 -20.972 36.906 17.387
408 43.162 -35.948 19.695
409 6.639 -15.783 14.608
410 44.186 -41.037 17.398
411 -23.917 22.236 18.702
412 -23.957 30.033 14.725
413 -25.056 43.824 15.489
414 -6.795 -24.375 18.537
415 -33.485 -40.651 17.538
416 -43.186 -43.071 17.481
417 -30.325 -46.122 16.440
418 -17.489 -43.551 22.006
419 -16.376 43.928 18.992
420 -9.076 -10.921 14.131
421 13.704 -36.352 23.812
422 -47.302 -31.918 18.719
423 39.459 -27.814 15.558
424 -22.509 -42.660 14.366
425 -17.920 -37.614 16.572
426 -5.780 -39.212 21.667
427 -30.519 -28.942 21.931
428 -35.937 31.435 17.106
429 -38.680 18.435 23.342
430 -24.796 -13.279 18.543
431 -9.283 -32.388 21.895
432 0.493 -19.505 17.276
433 -7.046 -25.243 20.741
434 7.884 -32.006 16.727
435 22.451 -7.834 21.082
436 8.379 -13.690 22.002
437 43.730 41.697 11.894
438 -9.040 -38.086 17.500
439 18.831 -2.759 23.252
440 12.732 -27.410 18.948
441 0.739 -21.091 21.354
442 20.339 -27.959 16.514
443 44.688 -46.449 12.356
444 -0.402 -36.951 17.891
445 -24.790 -18.139 23.337
446 2.173 -30.577 18.023
447 -18.995 -33.799 20.730
448 23.372 0.223 24.855
449 17.835 -17.372 19.878
450 -38.915 -13.815 20.923
451 -26.241 -27.800 19.877
452 11.074 -18.156 19.249
453 -16.478 -22.928 16.386
454 29.646 -8.349 21.115
455 33.910 -20.809 16.629
456 3.306 -6.830 22.059
457 -10.512 -5.322 19.876
458 14.024 -10.406 23.456
459 12.365 -3.699 21.818
460 18.186 8.532 23.951
461 25.140 -47.653 18.592
462 32.288 -2.117 23.423
463 10.836 24.937 23.310
464 4.531 28.913 25.238
465 9.944 18.397 26.661
466 16.274 4.852 27.837
467 27.316 -26.007 15.934
468 -4.508 -8.010 20.906
469 -29.858 2.412 19.958
470 20.376 -21.957 21.306
471 2.077 47.431 23.248
472 25.777 -33.367 21.695
473 44.854 42.801 22.904
474 25.356 -48.833 25.402
475 15.322 -16.926 17.318
476 -2.656 -33.400 20.365
477 11.950 -47.390 20.328
478 42.961 36.955 22.919
479 35.726 -45.402 24.272
480 4.675 -21.758 21.780
481 -40.568 -36.931 16.934
482 11.758 -12.859 14.206
483 35.483 -31.760 16.975
484 27.336 -27.577 19.429
485 36.689 -39.218 19.668
486 -46.357 41.618 17.456
487 0.002 -40.589 22.558
488 23.525 -39.918 21.247
489 -43.269 -21.304 22.699
490 40.191 -20.594 21.145
491 25.728 -18.024 20.298
492 34.964 -10.441 20.189
493 43.627 -13.279 23.038
494 5.766 6.876 14.077
495 32.432 -18.172 21.848
496 7.087 -1.122 15.098
497 -44.110 -14.034 23.080
498 -39.474 -31.289 22.312
499 4.118 -4.077 11.067
500 26.597 -11.667 22.641
So, using these commands I can find the bin boundaries, as shown below:
x.bin <- seq(floor(min(df[,1])), ceiling(max(df[,1])), by=2)
y.bin <- seq(floor(min(df[,2])), ceiling(max(df[,2])), by=2)
> x.bin
[1] -50 -48 -46 -44 -42 -40 -38 -36 -34 -32 -30 -28 -26 -24 -22 -20 -18 -16 -14
[20] -12 -10 -8 -6 -4 -2 0 2 4 6 8 10 12 14 16 18 20 22 24
[39] 26 28 30 32 34 36 38 40 42 44 46 48 50
> y.bin
[1] -53 -51 -49 -47 -45 -43 -41 -39 -37 -35 -33 -31 -29 -27 -25 -23 -21 -19 -17
[20] -15 -13 -11 -9 -7 -5 -3 -1 1 3 5 7 9 11 13 15 17 19 21
[39] 23 25 27 29 31 33 35 37 39 41 43 45 47
But then I don't know how to assign each row of the raw data (df) to its x.bin and y.bin interval and calculate the aggregate (sum) for each bin.
library(plyr)
# cut() splits v1 and v2 into 50 equal-width intervals each; ddply() from plyr then computes the mean of v3 within each pair of bins (use sum(v3) instead of mean(v3) for bin totals)
newdata <- ddply(df, .(cut(v1, 50), cut(v2, 50)), summarise, mean.v3 = mean(v3))
> head(newdata)
cut(v1, 50) cut(v2, 50) mean.v3
1 (-49.4,-47.5] (-34.7,-32.7] 18.123
2 (-49.4,-47.5] (-0.576,1.43] 20.887
3 (-49.4,-47.5] (15.5,17.5] 20.887
4 (-47.5,-45.5] (-52.7,-50.7] 9.918
5 (-47.5,-45.5] (-44.7,-42.7] 14.477
6 (-47.5,-45.5] (-34.7,-32.7] 16.314
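If you want the fixed 2-unit bins and the sum asked for in the question, rather than 50 equal-width breaks and a mean, the same idea can be sketched in base R. This is only an illustration under assumptions: the small data frame is a toy stand-in for the real data, and make_breaks() is a hypothetical helper, not part of any package. It extends the breaks to the first value at or above the maximum, because seq(floor(min), ceiling(max), by=2) can stop short of the largest observation when the range is not a multiple of 2, and cut() would then return NA for that row.

```r
# Toy stand-in for the question's data (v1, v2 = coordinates, v3 = value).
df <- data.frame(
  v1 = c(-47.456, -46.900, 3.306, 3.900, 12.365),
  v2 = c(-32.714, -32.100, -6.830, -6.200, -3.699),
  v3 = c(18.123, 1.000, 22.059, 2.000, 21.818)
)

# Hypothetical helper: breaks every `width` units from floor(min(x)) up to
# the first break at or above max(x), so no observation falls outside.
make_breaks <- function(x, width = 2) {
  lo <- floor(min(x))
  n  <- max(1, ceiling((max(x) - lo) / width))
  lo + width * (0:n)
}

x.bin <- make_breaks(df$v1)
y.bin <- make_breaks(df$v2)

# Assign every row to its interval; include.lowest keeps a value that
# sits exactly on the first break.
df$xbin <- cut(df$v1, breaks = x.bin, include.lowest = TRUE)
df$ybin <- cut(df$v2, breaks = y.bin, include.lowest = TRUE)

# Sum v3 within each occupied (xbin, ybin) cell; empty cells are dropped.
bin.sums <- aggregate(v3 ~ xbin + ybin, data = df, FUN = sum)
print(bin.sums)
```

With the real data you would drop the toy data frame and keep the rest; swapping FUN = sum for FUN = mean gives per-bin means instead.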
Updated as per the comments: if you want the lower bound, upper bound and midpoint of each bin, you can extract them from the interval labels as follows (sub() is needed to strip the ( and ] from the labels):
df$newv1<-with(df,cut(v1,50))
df$newv2<-with(df,cut(v2,50))
df$lowerv1<-with(df,as.numeric( sub("\\((.+),.*", "\\1", newv1))) #lower value
df$upperv1<-with(df,as.numeric( sub("[^,]*,([^]]*)\\]", "\\1", newv1))) # upper value
df$midv1<-with(df,(lowerv1+upperv1)/2) #mid value
df$lowerv2<-with(df,as.numeric( sub("\\((.+),.*", "\\1",newv2))) #lower value
df$upperv2<-with(df,as.numeric( sub("[^,]*,([^]]*)\\]", "\\1", newv2))) # upper value
df$midv2<-with(df,(lowerv2+upperv2)/2)#mid value
newdata<-ddply(df,.(newv1,newv2),transform,mean.v3=mean(v3))
> head(newdata)
v1 v2 v3 newv1 newv2 lowerv1 upperv1 midv1 lowerv2 upperv2 midv2 mean.v3
1 -47.456 -32.714 18.123 (-49.4,-47.5] (-34.7,-32.7] -49.4 -47.5 -48.45 -34.700 -32.70 -33.700 18.123
2 -49.329 -0.465 20.887 (-49.4,-47.5] (-0.576,1.43] -49.4 -47.5 -48.45 -0.576 1.43 0.427 20.887
3 -48.652 16.558 20.800 (-49.4,-47.5] (15.5,17.5] -49.4 -47.5 -48.45 15.500 17.50 16.500 20.887
4 -48.323 17.153 20.974 (-49.4,-47.5] (15.5,17.5] -49.4 -47.5 -48.45 15.500 17.50 16.500 20.887
5 -45.713 -52.599 9.918 (-47.5,-45.5] (-52.7,-50.7] -47.5 -45.5 -46.50 -52.700 -50.70 -51.700 9.918
6 -45.805 -43.071 14.477 (-47.5,-45.5] (-44.7,-42.7] -47.5 -45.5 -46.50 -44.700 -42.70 -43.700 14.477
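One caveat with parsing the labels: cut() formats them to 3 significant digits by default (dig.lab = 3), so the recovered bounds are rounded. A possible alternative, sketched here under assumptions, is to build the breaks yourself and look the bounds up numerically. bin_bounds() is a hypothetical helper, and its breaks span exactly min(x) to max(x), so they will differ slightly from those of cut(x, 50), which pads the range a little on each side.

```r
# Hypothetical helper: bin x into n equal-width intervals spanning
# [min(x), max(x)] and return each value's bin index and numeric bounds.
bin_bounds <- function(x, n = 50) {
  breaks <- seq(min(x), max(x), length.out = n + 1)
  # labels = FALSE returns the integer bin index instead of a factor label
  idx <- cut(x, breaks = breaks, include.lowest = TRUE, labels = FALSE)
  data.frame(
    bin   = idx,
    lower = breaks[idx],
    upper = breaks[idx + 1],
    mid   = (breaks[idx] + breaks[idx + 1]) / 2
  )
}

# Small illustrative input: 5 bins of width 2 over [0, 10].
b <- bin_bounds(c(0, 1, 5, 10), n = 5)
print(b)
```

The bin index can then serve as the grouping variable, for example cbind(df, bin_bounds(df$v1)), without any regular expressions.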