How to interpolate column vectors of unequal length to the length of the longest column vector in a data frame in R - r
I have an Excel sheet containing several data series stored as column vectors, each of a different length. Sample data from the sheet is shown below.
No 1 2 3 4 5 6
1 7.68565 7.431991 7.620156 7.34955 7.493848 7.244905
2 8.247334 7.895186 8.107751 7.629121 8.01165 7.898938
3 8.861417 8.411331 8.616113 7.960177 8.551065 8.432346
4 9.522981 8.945542 9.117843 8.263698 9.129371 9.118917
5 10.10206 9.465829 9.621576 8.515904 9.680468 9.695693
6 10.74194 10.05058 10.2111 8.824739 10.22375 10.48411
7 11.41614 10.59113 10.70612 9.12775 10.78299 11.1652
8 12.08601 11.12069 11.23061 9.445629 11.32874 11.8499
9 12.8509 11.68692 11.81479 9.762563 11.92125 12.77563
10 13.79793 12.31746 12.3436 10.12344 12.5586 14.05427
11 14.40335 12.85409 12.81579 10.4148 13.2323 14.74745
12 14.96397 13.44764 13.39124 10.76968 13.91571 15.48449
13 15.49457 13.5184 13.94058 11.05081 14.43318 16.12423
14 16.06153 13.99386 14.35261 11.38416 14.95082 16.84513
15 16.61133 14.4879 14.86438 11.71484 15.47574 17.42593
16 17.24876 14.95296 15.30651 12.06838 16.01853 18.05138
17 17.8686 15.48764 15.82241 12.41315 16.546 18.69939
18 18.49424 16.01478 16.33324 12.76782 17.07923 19.29467
19 19.0651 16.5115 16.8808 13.11234 17.62211 20.00391
20 19.73842 17.07482 17.40481 13.46479 18.14528 20.67474
21 20.47123 17.51353 17.88455 13.55012 18.69565 21.35446
22 21.16333 18.00172 18.38069 13.82592 19.23222 22.16516
23 21.83083 18.55357 18.79004 14.10343 19.93576 23.0249
24 22.50095 19.04932 19.25296 14.38997 20.6087 23.75609
25 23.27895 19.66359 19.68497 14.66933 21.19856 24.33014
26 23.86791 20.19746 20.25114 14.96252 21.7933 25.16132
27 24.42128 20.79322 20.8394 15.27082 22.4216 25.64038
28 25.02747 21.34963 21.36803 15.59645 22.95553 26.40612
29 25.64392 21.96625 21.92369 15.90159 23.62858 26.99359
30 26.15457 22.51419 22.49119 16.21841 24.27062 27.48933
31 26.78083 23.14052 23.09582 16.5353 24.75912 28.13525
32 27.39095 23.71215 23.71597 16.84909 25.34079 28.66253
33 28.04546 24.23099 24.22622 17.23782 25.90887 29.27824
34 28.68887 24.69722 24.76757 17.58071 26.51803 30.06892
35 29.45707 25.24266 25.30781 17.91193 27.12488 30.87034
36 30.03946 25.75705 25.86998 18.24291 27.73606 31.71053
37 30.71511 26.29254 26.34333 18.50986 28.30462 32.37958
38 31.42378 26.91853 26.69165 18.81327 28.91142 33.07085
39 32.50335 27.44403 27.12134 19.20657 29.51637 33.8685
40 33.12328 27.98299 27.578 19.55173 30.14371 34.5783
41 33.71293 28.42661 28.16382 19.818 30.7509 35.29098
42 34.22313 29.11766 28.58075 20.20322 31.50584 35.97233
43 34.84822 29.69339 29.14229 20.60828 32.14028 36.53085
44 35.51228 30.30699 29.71523 20.86474 32.72842 36.82623
45 36.11674 30.89355 30.28881 21.24548 33.02594 37.79391
46 36.80722 31.50952 30.94186 21.56593 33.17226 38.42553
47 37.60966 31.98561 31.63391 21.89768 33.34089 39.20039
48 38.25016 32.63639 32.19883 22.23119 33.67384 39.98531
49 38.95744 33.18134 32.72147 22.4859 34.27073 40.76857
50 39.66163 33.67109 33.14864 22.90394 34.86681 41.49251
51 40.37425 34.12463 33.60807 23.26918 35.59697 42.51444
52 41.23707 34.66628 34.09723 23.52158 36.24535 43.14603
53 41.82558 35.1961 34.57659 23.89679 36.90796 44.16233
54 42.55081 35.72951 35.03618 24.49229 37.65297 44.59068
55 43.39907 36.31952 35.46371 24.81181 38.33818 45.22966
56 44.05056 37.05194 35.98615 25.12065 38.85623 46.23367
57 44.78049 37.1323 36.51719 25.4582 39.54339 46.54872
58 45.43282 37.76535 37.09313 25.88998 40.23827 47.07784
59 46.18882 38.27575 37.17476 26.22639 40.92604 47.807
60 46.90982 38.88576 37.90604 26.56257 41.63398 48.4778
61 47.56264 39.64927 38.5283 26.8499 42.29979 49.21885
62 48.10035 40.19561 39.16806 27.1614 42.99679 50.18735
63 49.01068 40.89077 39.80176 27.43677 43.8278 51.9102
64 49.76271 41.6514 40.39578 27.89204 44.4915 52.78747
65 50.53434 42.09778 41.03402 28.18638 45.01828 53.46253
66 51.67479 42.83619 41.44307 28.49254 45.8151 54.44443
67 52.20818 43.35224 42.17046 28.87821 46.38069 55.20507
68 52.84818 43.94838 42.54818 29.18387 47.27983 55.71156
69 53.54274 44.61937 43.04368 29.58712 47.76875 56.11357
70 54.24117 45.2113 43.55424 29.97786 48.52082 56.56269
71 55.10781 45.87016 44.19418 30.30342 49.17041 57.04574
72 55.81844 46.58728 44.70245 30.92939 50.00576 57.61847
73 56.53417 47.17022 45.19135 64.12819 50.76387 58.46774
74 56.99077 47.80587 45.81162 64.46482 51.44632 59.35406
75 57.70125 48.4632 46.53608 64.47179 52.09271 60.34232
76 58.40646 49.11251 47.44626 65.28538 52.77505 60.76057
77 59.20803 49.70755 48.0586 65.42728 53.3777 61.86707
78 59.71753 50.13534 48.76304 65.97044 54.06384 63.14102
79 60.58331 50.72049 49.47997 66.51449 54.7547 64.43312
80 61.03398 51.41927 50.11546 67.02634 55.4798 65.58254
81 61.80681 51.97609 50.69514 67.59518 55.96139 66.72086
82 62.48501 52.59973 51.31683 68.12712 56.93643 67.53484
83 63.36452 53.36562 51.73617 68.64816 57.6551 68.07806
84 64.31261 53.98405 52.21327 69.24711 58.23373 68.63623
85 65.24776 54.51552 52.77048 70.48085 58.97933 69.02074
86 66.17772 55.20282 53.22162 70.64199 59.76285 69.38057
87 67.08787 55.91391 53.7916 71.38781 60.25809 70.01195
88 68.01987 56.61301 54.46721 71.58064 61.31948 70.5335
89 68.92189 57.28238 55.16064 71.99983 62.18978 71.61938
90 69.79762 57.88332 55.85772 72.89091 63.02894 72.77907
91 69.86632 58.52047 56.78106 73.05919 63.78964 74.13258
92 70.60662 59.12164 57.49112 73.58095 64.54343 75.77073
93 71.63203 59.77399 58.20212 74.1192 65.36834 76.57243
94 72.18227 60.47282 58.77127 74.6143 65.83804 77.84715
95 72.97624 60.7739 59.41283 75.4809 66.61507 78.78102
96 73.75372 61.22352 59.84708 75.66663 67.44336 79.33527
97 74.66983 61.87689 60.49374 76.09998 68.30974 79.86294
98 75.85329 62.58495 60.7886 76.67287 69.23421 80.51763
99 76.38837 63.32424 61.5629 77.20351 70.00735 80.91219
100 77.38139 64.07433 62.21648 77.95189 70.7836 81.57964
101 78.25631 64.82328 62.74316 78.21231 71.2177 82.16656
102 79.19827 65.50484 63.64724 78.89301 72.00792 83.12364
103 80.38764 66.23685 64.48991 79.32261 73.00548 84.00261
104 80.87278 66.95412 65.2793 79.95379 73.50331 85.22213
105 81.76581 67.70247 65.82581 80.52102 74.28909 86.6621
106 83.02712 68.55701 66.62666 81.06393 75.11777 88.11059
107 83.48909 69.23235 67.35486 81.7409 75.9652
108 84.82759 70.58522 68.15342 82.25188 76.8884
109 85.28537 71.04559 68.92251 82.98396 77.83717
110 86.70018 71.73407 69.51888 83.51862 78.45438
111 87.35397 72.45837 70.31539 83.69946 79.32315
112 88.69969 73.14394 70.9007 84.25947 80.39831
113 73.92206 71.50578 85.10349 81.20853
114 74.65082 72.20686 85.26869 81.95338
115 75.32388 72.81664 86.07426 82.36201
116 76.37313 73.52561 86.33713 83.16817
117 76.85229 74.32013 86.85325 83.96463
118 77.55033 75.04207 87.32344 84.8136
119 78.19957 75.90256 87.93314 85.7303
120 79.23823 76.41772 88.39268 86.46136
121 79.57755 77.11913 88.96714 87.30937
122 79.70834 78.01459 88.17579
123 80.44374 78.76607 89.00109
124 81.47443 79.56496
125 81.80569 79.69939
126 82.57823 80.52383
127 83.38485 81.27236
128 84.09743 81.94386
129 84.78618 83.01913
130 85.91491 83.52692
131 86.18631 84.52093
132 86.87262 85.26204
133 88.0145 85.93992
134 88.30018 86.70402
135 89.08487 87.58891
136 88.27903
In the data above, the values run from about 7.3 at the top to about 89.08 at the bottom. In another sheet of the Excel file, however, I have data that runs from about 7.3 to about 89.09 from bottom to top.
Now I would like to take the longest column vector (column 3 in the sample data, of size 136 x 1) and stretch the other column vectors (1, 2, 4, 5 and 6) to the same size. The original values (magnitudes) should remain the same, although their positions (rows) may shift. Between the original values I need to interpolate so that all column vectors have the same length (136 x 1).
I have several hundred column vectors like these.
The expected output is shown below for column 1 only, with column 3 as the reference:
No 1 3
1 7.68565 7.620156
2 8.247334 8.107751
3 8.861417 8.616113
4 9.522981 9.117843
5 **9.8125205** 9.621576
6 10.10206 10.2111
7 10.74194 10.70612
8 11.41614 11.23061
9 **11.751075** 11.81479
10 12.08601 12.3436
11 12.8509 12.81579
12 13.79793 13.39124
13 **14.10064** 13.94058
14 14.40335 14.35261
15 14.96397 14.86438
16 15.49457 15.30651
17 **15.77805** 15.82241
18 16.06153 16.33324
19 16.61133 16.8808
20 17.24876 17.40481
21 17.8686 17.88455
22 18.49424 18.38069
23 **18.77967** 18.79004
24 19.0651 19.25296
25 19.73842 19.68497
26 20.47123 20.25114
27 **20.81728** 20.8394
28 21.16333 21.36803
29 21.83083 21.92369
30 22.50095 22.49119
31 23.27895 23.09582
32 23.86791 23.71597
33 24.42128 24.22622
34 **24.724375** 24.76757
35 25.02747 25.30781
36 25.64392 25.86998
37 26.15457 26.34333
38 26.78083 26.69165
39 27.39095 27.12134
40 **27.718205** 27.578
41 28.04546 28.16382
42 28.68887 28.58075
43 29.45707 29.14229
44 **29.748265** 29.71523
45 30.03946 30.28881
46 30.71511 30.94186
47 31.42378 31.63391
48 32.50335 32.19883
49 **32.813315** 32.72147
50 33.12328 33.14864
51 33.71293 33.60807
52 34.22313 34.09723
53 34.84822 34.57659
54 **35.18025** 35.03618
55 35.51228 35.46371
56 **35.81451** 35.98615
57 36.11674 36.51719
58 36.80722 37.09313
59 37.60966 37.17476
60 **37.92991** 37.90604
61 38.25016 38.5283
62 38.95744 39.16806
63 39.66163 39.80176
64 40.37425 40.39578
65 41.23707 41.03402
66 41.82558 41.44307
67 42.55081 42.17046
68 **42.97494** 42.54818
69 43.39907 43.04368
70 **43.724815** 43.55424
71 44.05056 44.19418
72 44.78049 44.70245
73 45.43282 45.19135
74 **45.81082** 45.81162
75 46.18882 46.53608
76 46.90982 47.44626
77 47.56264 48.0586
78 48.10035 48.76304
79 49.01068 49.47997
80 49.76271 50.11546
81 50.53434 50.69514
82 51.67479 51.31683
83 **51.941485** 51.73617
84 52.20818 52.21327
85 52.84818 52.77048
86 53.54274 53.22162
87 **53.891955** 53.7916
88 54.24117 54.46721
89 55.10781 55.16064
90 55.81844 55.85772
91 56.53417 56.78106
92 56.99077 57.49112
93 57.70125 58.20212
94 58.40646 58.77127
95 59.20803 59.41283
96 59.71753 59.84708
97 60.58331 60.49374
98 61.03398 60.7886
99 61.80681 61.5629
100 62.48501 62.21648
101 **62.924765** 62.74316
102 63.36452 63.64724
103 64.31261 64.48991
104 65.24776 65.2793
105 **65.71274** 65.82581
106 66.17772 66.62666
107 67.08787 67.35486
108 68.01987 68.15342
109 68.92189 68.92251
110 69.79762 69.51888
111 69.86632 70.31539
112 70.60662 70.9007
113 71.63203 71.50578
114 72.18227 72.20686
115 72.97624 72.81664
116 73.75372 73.52561
117 74.66983 74.32013
118 75.85329 75.04207
119 76.38837 75.90256
120 **76.88488** 76.41772
121 77.38139 77.11913
122 78.25631 78.01459
123 **78.72729** 78.76607
124 79.19827 79.56496
125 **79.792955** 79.69939
126 80.38764 80.52383
127 80.87278 81.27236
128 81.76581 81.94386
129 83.02712 83.01913
130 83.48909 83.52692
131 84.82759 84.52093
132 85.28537 85.26204
133 **85.992775** 85.93992
134 86.70018 86.70402
135 87.35397 87.58891
136 88.69969 88.27903
The interpolated values in column 1 are marked with double asterisks. Each interpolated value is the average of its two neighbours, i.e. the (i-1)th and (i+1)th cells for the ith cell (simple linear interpolation).
The main purpose is to perform clustering, since column or row vectors of unequal length cannot be used for clustering directly.
Is there any code to do that?
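A minimal base-R sketch of the midpoint padding described above. It assumes each shorter series needs at most one inserted value per gap and that missing cells are read in as NA; `pad_with_midpoints` and the small `cols` list are illustrative names, not part of the question. The inserted positions are spread evenly along the series, so they need not match the hand-made example row for row.

```r
# Pad a numeric vector to `target_len` by inserting the average of two
# neighbouring values at evenly spread gaps. Original values are kept as-is.
pad_with_midpoints <- function(x, target_len) {
  x <- x[!is.na(x)]                 # drop NA padding from ragged columns
  n <- length(x)
  k <- target_len - n               # number of values to insert
  if (k <= 0) return(x)
  stopifnot(k <= n - 1)             # sketch assumption: one insert per gap at most
  # gap i lies between x[i] and x[i + 1]; pick k distinct gaps, evenly spread
  gaps <- floor((seq_len(k) - 0.5) * (n - 1) / k) + 1
  mids <- (x[gaps] + x[gaps + 1]) / 2
  out <- x
  # insert from right to left so earlier positions stay valid
  for (j in rev(seq_along(gaps))) out <- append(out, mids[j], after = gaps[j])
  out
}

# First few values of columns 1 and 3 from the sample data
cols <- list(
  c1 = c(7.68565, 8.247334, 8.861417, 9.522981, 10.10206),
  c3 = c(7.620156, 8.107751, 8.616113, 9.117843, 9.621576, 10.2111, 10.70612)
)
target <- max(lengths(cols))
padded <- as.data.frame(lapply(cols, pad_with_midpoints, target_len = target))
print(padded)
```

If keeping the original values exactly is not essential, `approx(seq_along(x), x, n = target)$y` resamples a series to the target length on an even grid instead; that is a different technique from the midpoint insertion shown here.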
Or can a distance be calculated with DTW (Dynamic Time Warping), or any other method, for column vectors of unequal length (as in the example dataset), so that clustering can be performed?
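On the DTW idea, here is a small self-contained sketch of the classic dynamic-programming DTW distance in base R, followed by hierarchical clustering on the resulting distance matrix. The function name `dtw_distance`, the absolute-difference local cost, and the three short series are assumptions made for illustration; the `dtw` package on CRAN offers a fuller implementation and is probably the better choice for hundreds of series.

```r
# Classic DTW distance between two numeric vectors of possibly unequal length.
# Local cost is the absolute difference; no window constraint is applied.
dtw_distance <- function(a, b) {
  n <- length(a)
  m <- length(b)
  cost <- matrix(Inf, n + 1, m + 1)
  cost[1, 1] <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      d <- abs(a[i] - b[j])
      cost[i + 1, j + 1] <- d + min(cost[i, j + 1],   # step in a only
                                    cost[i + 1, j],   # step in b only
                                    cost[i, j])       # step in both
    }
  }
  cost[n + 1, m + 1]
}

# Illustrative series of different lengths (leading values of columns 1, 2, 4)
series <- list(
  s1 = c(7.68565, 8.247334, 8.861417, 9.522981, 10.10206),
  s2 = c(7.431991, 7.895186, 8.411331, 8.945542, 9.465829, 10.05058),
  s3 = c(7.34955, 7.629121, 7.960177)
)

# Pairwise distance matrix
k <- length(series)
dmat <- matrix(0, k, k, dimnames = list(names(series), names(series)))
for (i in seq_len(k)) {
  for (j in seq_len(k)) {
    dmat[i, j] <- dtw_distance(series[[i]], series[[j]])
  }
}

# Hierarchical clustering on the DTW distances
hc <- hclust(as.dist(dmat), method = "average")
groups <- cutree(hc, k = 2)
print(groups)
```

Because the distance is computed pairwise, no padding or interpolation is needed before clustering with this approach.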
Related
How to isolate certain letters inside a string in R?
I need to find the index of every value that ends in Q2 or Q4, skipping the first element, which is a title. I need to save 3, 5, 7, 9, ... up to the end of the object. How can I do this by analysing the strings? This is the code:
    line = c("ISSUE_CUR",
      "1993-Q1", "1993-Q2", "1993-Q3", "1993-Q4", "1994-Q1", "1994-Q2", "1994-Q3", "1994-Q4",
      "1995-Q1", "1995-Q2", "1995-Q3", "1995-Q4", "1996-Q1", "1996-Q2", "1996-Q3", "1996-Q4",
      "1997-Q1", "1997-Q2", "1997-Q3", "1997-Q4", "1998-Q1", "1998-Q2", "1998-Q3", "1998-Q4",
      "1999-Q1", "1999-Q2", "1999-Q3", "1999-Q4", "2000-Q1", "2000-Q2", "2000-Q3", "2000-Q4",
      "2001-Q1", "2001-Q2", "2001-Q3", "2001-Q4", "2002-Q1", "2002-Q2", "2002-Q3", "2002-Q4",
      "2003-Q1", "2003-Q2", "2003-Q3", "2003-Q4", "2004-Q1", "2004-Q2", "2004-Q3", "2004-Q4",
      "2005-Q1", "2005-Q2", "2005-Q3", "2005-Q4", "2006-Q1", "2006-Q2", "2006-Q3", "2006-Q4",
      "2007-Q1", "2007-Q2", "2007-Q3", "2007-Q4", "2008-Q1", "2008-Q2", "2008-Q3", "2008-Q4",
      "2009-Q1", "2009-Q2", "2009-Q3", "2009-Q4", "2010-Q1", "2010-Q2", "2010-Q3", "2010-Q4",
      "2011-Q1", "2011-Q2", "2011-Q3", "2011-Q4", "2012-Q1", "2012-Q2", "2012-Q3", "2012-Q4",
      "2013-Q1", "2013-Q2", "2013-Q3", "2013-Q4", "2014-Q1", "2014-Q2", "2014-Q3", "2014-Q4",
      "2015-Q1", "2015-Q2", "2015-Q3", "2015-Q4", "2016-Q1", "2016-Q2", "2016-Q3", "2016-Q4",
      "2017-Q1", "2017-Q2", "2017-Q3", "2017-Q4", "2018-Q1", "2018-Q2", "2018-Q3", "2018-Q4",
      "2019-Q1", "2019-Q2", "2019-Q3", "2019-Q4")
We can use grep to return the index by matching 'Q2' or (|) 'Q4' at the end ($) of the string:
    grep("(Q2|Q4)$", line)
    #[1] 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61
    #[31] 63 65 67 69 71 73 75 77 79 81 83 85 87 89 91 93 95 97 99 101 103 105 107 109
Or the 'Q' can be placed outside the group, as it is common to both alternatives:
    grep("Q(2|4)$", line)
Another option is endsWith:
    which(endsWith(line, "Q2") | endsWith(line, "Q4"))
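You can use the grep() function:
    grep("[24]$", line, value = TRUE)
The "[24]$" is a regular expression pattern that matches a 2 or a 4 at the end of the string. With value = TRUE, grep returns the matching strings; leave that argument out to get the indices instead. For a good tutorial on it, you may check here. https://www.regular-expressions.info/rlanguage.html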
Calculate mean value for each row with interval
I need to calculate the mean value for each row (the mean of each interval). Here is a basic example (maybe someone has an even better idea of how to do it):
    M_31_mb <- (15 : -15)    # creating a vector value --> small
    M_31 <- cut(M_31_mb, 128)    # getting 128 groups from the small vector
    #M_1_mb <- (1500 : -1500)    # creating a vector value
    #M_1 <- cut(M_1_mb, 128)    # getting 128 groups from the vector
I need the mean value for each row/group of the 128 intervals created in M_1 (actually I do not even need the intervals themselves, just their means), and I cannot figure out how to do it. I had a look at the cut2 function from the Hmisc library, but unfortunately it has no option to set the number of intervals into which the vector is cut (although there is an option to get the mean value of the created intervals: levels.mean).
I would appreciate any help! Thanks!
Additional info: the cut2 function works well for bigger vectors (M_1_mb); however, when my vector is small (M_31_mb), I get a warning message:
    Warning message:
    In min(xx[xx > upper]) : no non-missing arguments to min; returning Inf
and only 31 groups are created:
    M_31_mb <- (15 : -15)    # smaller vector
    M_31 <- table(cut2(M_31_mb, g = 128, levels.mean = TRUE))
where g is the number of quantile groups.
Like this?
    aggregate(M_1_mb, by = list(M_1), mean)
EDIT: Result
    Group.1 x
    1 (-1.5e+03,-1.48e+03] -1488.5
    2 (-1.48e+03,-1.45e+03] -1465.0
    3 (-1.45e+03,-1.43e+03] -1441.5
    4 (-1.43e+03,-1.41e+03] -1418.0
    5 (-1.41e+03,-1.38e+03] -1394.5
    6 (-1.38e+03,-1.36e+03] -1371.0
    7 (-1.36e+03,-1.34e+03] -1347.5
    8 (-1.34e+03,-1.31e+03] -1324.0
    9 (-1.31e+03,-1.29e+03] -1301.0
    10 (-1.29e+03,-1.27e+03] -1277.5
    11 (-1.27e+03,-1.24e+03] -1254.0
    12 (-1.24e+03,-1.22e+03] -1230.5
    13 (-1.22e+03,-1.2e+03] -1207.0
    14 (-1.2e+03,-1.17e+03] -1183.5
    15 (-1.17e+03,-1.15e+03] -1160.0
    16 (-1.15e+03,-1.12e+03] -1136.5
    17 (-1.12e+03,-1.1e+03] -1113.0
    18 (-1.1e+03,-1.08e+03] -1090.0
    19 (-1.08e+03,-1.05e+03] -1066.5
    20 (-1.05e+03,-1.03e+03] -1043.0
    21 (-1.03e+03,-1.01e+03] -1019.5
    22 (-1.01e+03,-984] -996.0
    23 (-984,-961] -972.5
    24 (-961,-938] -949.0
    25 (-938,-914] -926.0
    26 (-914,-891] -902.5
    27 (-891,-867] -879.0
    28 (-867,-844] -855.5
    29 (-844,-820] -832.0
    30 (-820,-797] -808.5
    31 (-797,-773] -785.0
    32 (-773,-750] -761.5
    33 (-750,-727] -738.0
    34 (-727,-703] -715.0
    35 (-703,-680] -691.5
    36 (-680,-656] -668.0
    37 (-656,-633] -644.5
    38 (-633,-609] -621.0
    39 (-609,-586] -597.5
    40 (-586,-562] -574.0
    41 (-562,-539] -551.0
    42 (-539,-516] -527.5
    43 (-516,-492] -504.0
    44 (-492,-469] -480.5
    45 (-469,-445] -457.0
    46 (-445,-422] -433.5
    47 (-422,-398] -410.0
    48 (-398,-375] -386.5
    49 (-375,-352] -363.0
    50 (-352,-328] -340.0
    51 (-328,-305] -316.5
    52 (-305,-281] -293.0
    53 (-281,-258] -269.5
    54 (-258,-234] -246.0
    55 (-234,-211] -222.5
    56 (-211,-188] -199.0
    57 (-188,-164] -176.0
    58 (-164,-141] -152.5
    59 (-141,-117] -129.0
    60 (-117,-93.8] -105.5
    61 (-93.8,-70.3] -82.0
    62 (-70.3,-46.9] -58.5
    63 (-46.9,-23.4] -35.0
    64 (-23.4,0] -11.5
    65 (0,23.4] 12.0
    66 (23.4,46.9] 35.0
    67 (46.9,70.3] 58.5
    68 (70.3,93.8] 82.0
    69 (93.8,117] 105.5
    70 (117,141] 129.0
    71 (141,164] 152.5
    72 (164,188] 176.0
    73 (188,211] 199.0
    74 (211,234] 222.5
    75 (234,258] 246.0
    76 (258,281] 269.5
    77 (281,305] 293.0
    78 (305,328] 316.5
    79 (328,352] 340.0
    80 (352,375] 363.5
    81 (375,398] 387.0
    82 (398,422] 410.0
    83 (422,445] 433.5
    84 (445,469] 457.0
    85 (469,492] 480.5
    86 (492,516] 504.0
    87 (516,539] 527.5
    88 (539,562] 551.0
    89 (562,586] 574.0
    90 (586,609] 597.5
    91 (609,633] 621.0
    92 (633,656] 644.5
    93 (656,680] 668.0
    94 (680,703] 691.5
    95 (703,727] 715.0
    96 (727,750] 738.5
    97 (750,773] 762.0
    98 (773,797] 785.0
    99 (797,820] 808.5
    100 (820,844] 832.0
    101 (844,867] 855.5
    102 (867,891] 879.0
    103 (891,914] 902.5
    104 (914,938] 926.0
    105 (938,961] 949.0
    106 (961,984] 972.5
    107 (984,1.01e+03] 996.0
    108 (1.01e+03,1.03e+03] 1019.5
    109 (1.03e+03,1.05e+03] 1043.0
    110 (1.05e+03,1.08e+03] 1066.5
    111 (1.08e+03,1.1e+03] 1090.0
    112 (1.1e+03,1.12e+03] 1113.5
    113 (1.12e+03,1.15e+03] 1137.0
    114 (1.15e+03,1.17e+03] 1160.0
    115 (1.17e+03,1.2e+03] 1183.5
    116 (1.2e+03,1.22e+03] 1207.0
    117 (1.22e+03,1.24e+03] 1230.5
    118 (1.24e+03,1.27e+03] 1254.0
    119 (1.27e+03,1.29e+03] 1277.5
    120 (1.29e+03,1.31e+03] 1301.0
    121 (1.31e+03,1.34e+03] 1324.0
    122 (1.34e+03,1.36e+03] 1347.5
    123 (1.36e+03,1.38e+03] 1371.0
    124 (1.38e+03,1.41e+03] 1394.5
    125 (1.41e+03,1.43e+03] 1418.0
    126 (1.43e+03,1.45e+03] 1441.5
    127 (1.45e+03,1.48e+03] 1465.0
    128 (1.48e+03,1.5e+03] 1488.5
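A short sketch of the same per-interval mean using `tapply`, which returns one entry per interval (a named vector) and leaves empty intervals as NA instead of dropping them. The variable names `means_big` and `means_small` are illustrative; the vectors are the two from the question.

```r
# Large vector: every one of the 128 intervals contains values
M_1_mb <- 1500:-1500
M_1 <- cut(M_1_mb, 128)
means_big <- tapply(M_1_mb, M_1, mean)    # named vector, one mean per interval

# Small vector: only 31 of the 128 intervals contain a value,
# so the remaining entries are NA
M_31_mb <- 15:-15
M_31 <- cut(M_31_mb, 128)
means_small <- tapply(M_31_mb, M_31, mean)

head(means_big, 3)
sum(!is.na(means_small))    # number of non-empty intervals
```

Keeping the NA entries makes it explicit that a 31-value vector cannot fill 128 intervals, which is also why cut2 reported only 31 groups.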
Loop Linear Regression
As a beginner in R I have a probably simple question. I have a linear regression with this specification: X1 = X1_t-h + X2_t-h, where h is equal to 1, 2, 3, 4, 5. For example, when h = 1 I run this code:
    Modelo11 <- dynlm(X1 ~ L(X1,1) + L(X2, 1)-1, data = GDP)
It's a simple regression. I want to implement a function that gives me the five linear regressions (h = 1, 2, 3, 4 and 5) with and without HAC (heteroskedasticity and autocorrelation consistent) estimation. I tried this, and it didn't work:
    for(h in 1:5){
      Modelo1[h] <- dynlm(GDPTrimestralemT ~ L(SpreademT,h) + L(GDPTrimestralemT, h)-1, data = MatrizDadosUS)
      coeftest(Modelo1[h], df = Inf, vcov = parzenHAC)
      return(list(summary(Modelo1[h])))
    }
One of the error messages is: number of items to replace is not a multiple of replacement length
This is my data.frame:
    GDP <- data.frame(data )
    GDP
    X1 X2
    1 0.542952690 0.226341364
    2 0.102328393 0.743360185
    3 0.166345969 0.186533485
    4 1.406733422 1.392420181
    5 -0.469811005 -0.114609464
    6 -0.509268267 0.687555461
    7 1.470439930 0.298655018
    8 1.046456428 -1.056387597
    9 -0.492462197 -0.530284962
    10 -0.516065519 0.645957530
    11 0.624638996 1.044731264
    12 0.213616470 -1.652979785
    13 0.669747432 1.398602289
    14 0.552089131 -0.821013792
    15 0.452715216 1.420094663
    16 -0.892063248 -1.436600779
    17 1.429284965 0.559738610
    18 0.853740565 -0.898976767
    19 0.741864168 1.352012831
    20 0.171494650 1.704764705
    21 0.422326351 -0.267064235
    22 -1.261643503 -2.090694608
    23 -1.321086283 -0.273954212
    24 0.365226000 1.965167113
    25 -0.080888690 -0.594498893
    26 -0.183293801 -0.483053404
    27 -1.033792032 0.586491772
    28 0.718322432 1.776210145
    29 -2.822693790 -0.731509917
    30 -1.251740437 -1.918124078
    31 1.184256949 -0.016548037
    32 2.255202675 0.303438286
    33 -0.930446147 0.803126180
    34 -1.691383225 -0.157839283
    35 -1.081643279 -0.006652717
    36 1.034162006 -1.970063305
    37 -0.716827488 0.306792930
    38 0.098471514 0.338333164
    39 0.343536547 0.389775011
    40 1.442117465 -0.668885360
    41 0.095131066 -0.298356861
    42 0.222524607 0.291485267
    43 -0.499969717 1.308312472
    44 0.588162304 0.026539575
    45 0.581215173 0.167710855
    46 0.629343124 -0.052835206
    47 0.811618963 0.716913172
    48 1.463610069 -0.356369304
    49 -2.000576321 1.226446201
    50 1.278233553 0.313606888
    51 -0.700373666 0.770273988
    52 -1.206455648 0.344628878
    53 0.024602262 1.001621886
    54 0.858933385 -0.865771777
    55 -1.592291995 -0.384908852
    56 -0.833758365 -1.184682199
    57 -0.281305858 2.070391729
    58 -0.122848757 -0.308397782
    59 -0.661013984 1.590741535
    60 1.887869805 -1.240283364
    61 -0.313677463 -1.393252994
    62 1.142864110 -1.150916732
    63 -0.633380499 -0.223923970
    64 -0.158729527 -1.245647224
    65 0.928619010 -1.050636078
    66 0.424317087 0.593892028
    67 1.108704956 -1.792833100
    68 -1.338231248 1.138684394
    69 -0.647492569 0.181495183
    70 0.295906675 -0.101823172
    71 -0.079827607 0.825158278
    72 0.050353111 -0.448453121
    73 0.129068772 0.205619797
    74 -0.221450137 0.051349511
    75 -1.300967949 1.639063824
    76 -0.861963677 1.273104220
    77 -1.691001610 0.746514122
    78 0.365888734 -0.055308006
    79 1.297349754 1.146102001
    80 -0.652382297 -1.095031447
    81 0.165682952 -0.012926971
    82 0.127996446 0.510673745
    83 0.338743162 -3.141650682
    84 -0.266916587 -2.483389321
    85 0.148135154 -1.239997153
    86 1.256591385 0.051984536
    87 -0.646281986 0.468210275
    88 0.180472423 0.393014848
    89 0.231892902 -0.545305005
    90 -0.709986273 0.104969765
    91 1.231712844 -1.703489840
    92 0.435378714 0.876505107
    93 -1.880394798 -0.885893722
    94 1.083580732 0.117560662
    95 -0.499072654 -1.039222894
    96 1.850756855 -1.308752222
    97 1.653952857 0.440405804
    98 -1.057618294 -1.611779530
    99 -0.021821282 -0.807071503
    100 0.682923562 -2.358596342
    101 -1.132293845 -1.488806929
    102 0.319237353 0.706203968
    103 -2.393105781 -1.562111727
    104 0.188653972 -0.637073832
    105 0.667003685 0.047694037
    106 -0.534018861 1.366826933
    107 -2.240330371 -0.071797320
    108 -0.220633546 1.612879694
    109 -0.022442941 1.172582601
    110 -1.542418139 0.635161458
    111 -0.684128812 -0.334973482
    112 0.688849615 0.056557966
    113 0.848602803 0.785297518
    114 -0.874157558 -0.434518305
    115 -0.404999060 -0.078893114
    116 0.735896917 1.637873669
    117 -0.174398836 0.542952690
    118 0.222418628 0.102328393
    119 0.419461884 0.166345969
    120 -0.042602368 1.406733422
    121 2.135670836 -0.469811005
    122 1.197644287 -0.509268267
    123 0.395951293 1.470439930
    124 0.141327444 1.046456428
    125 0.691575897 -0.492462197
    126 -0.490708151 -0.516065519
    127 -0.358903359 0.624638996
    128 -0.227550909 0.213616470
    129 -0.766692832 0.669747432
    130 -0.001690915 0.552089131
    131 -1.786701123 0.452715216
    132 -1.251495762 -0.892063248
    133 1.123462446 1.429284965
    134 0.237862653 0.853740565
Thanks.
Your variable Modelo1 is a vector, which cannot store lm objects. If Modelo1 is a list, it should work.
    library(dynlm)
    dat <- data.frame(rnorm(50), rnorm(50))
    names(dat) <- c("a", "b")
    models <- list()    # a list can hold one fitted model per lag
    for (h in 1:5) {
      models[[h]] <- dynlm(a ~ L(a, h) + L(b, h) - 1, data = dat)
    }
To get the summary you have to access the single list elements. For example:
    summary(models[[1]])
*Edit in response to Richard Scriven's comment: the most efficient way to get all summaries would be:
    lapply(models, summary)
This applies the summary function to each element of the list and returns a list with the results.
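For comparison, a dependency-free sketch of the same idea with plain `lm`: the lags are built by hand and the five fits are collected in a list with `lapply`. The helper `fit_lagged`, the lagged column names, and the random placeholder data are assumptions for illustration; this swaps `dynlm` for manual lagging and does not include the HAC step.

```r
set.seed(1)
# Placeholder data with the same column names as in the question
GDP <- data.frame(X1 = rnorm(60), X2 = rnorm(60))

# Fit X1_t on X1_{t-h} and X2_{t-h} without an intercept
fit_lagged <- function(h, data) {
  n <- nrow(data)
  d <- data.frame(
    y      = data$X1[(h + 1):n],     # X1 at time t
    X1_lag = data$X1[1:(n - h)],     # X1 at time t - h
    X2_lag = data$X2[1:(n - h)]      # X2 at time t - h
  )
  lm(y ~ X1_lag + X2_lag - 1, data = d)
}

# One model per horizon, stored in a named list
models <- lapply(1:5, fit_lagged, data = GDP)
names(models) <- paste0("h", 1:5)

summaries <- lapply(models, summary)   # full summaries, one per horizon
coefs <- t(sapply(models, coef))       # 5 x 2 matrix of coefficients
print(coefs)
```

A HAC covariance could then be applied per model inside another `lapply`, for example with `lmtest::coeftest` and a covariance function from the sandwich package; that step is left out here to keep the sketch free of extra packages.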
How do I select points in a dataset above the x% contour of a density map?
I have a matrix of data (see below) and I am trying to turn it into a density contour map (Can1 and Can2 variables), maybe with the ks or sm packages. My question is: how do I select those points in the dataset which lie above (say) the 80% contour of the density map? Thanks
    ID Can1 Can2
    4 -12.3235137 -1.0788867664
    1 -12.2949912 -0.9321009837
    5 -12.2835123 -1.0164225574
    2 -12.2571822 -0.7094457036
    3 -12.2713779 -0.9908419863
    10 -12.9870438 -1.0936405526
    6 -12.7167605 -1.4620772026
    7 -12.8193776 -1.0911349785
    8 -12.9781963 -1.1762698594
    9 -12.7983478 -1.3453369581
    13 -14.0389948 0.2855210115
    11 -14.0015922 0.1467552738
    15 -14.0723604 0.0244576488
    14 -14.0743560 0.1417245145
    12 -13.9898266 0.0005437008
    20 -6.5881994 0.5124980991
    17 -6.1812321 0.6789584579
    16 -6.4704200 0.5942317307
    18 -6.6960456 0.5720874622
    19 -6.1159788 0.5960966790
    22 -2.4794887 2.5493267897
    24 -2.4918040 2.7823374576
    21 -2.5145044 2.5877290160
    23 -2.5048371 2.4916280770
    25 -2.5018765 2.8536302559
    29 -0.1781852 2.0805229401
    26 -0.1581308 2.0151355747
    28 -0.2118605 1.9658284615
    27 -0.4184119 2.0540218901
    30 -0.2994573 2.0205573385
    35 2.6254869 1.3858705991
    31 2.3146430 1.3510499304
    33 2.5346138 1.2524229847
    34 2.3741699 1.3842499455
    32 2.6008389 1.3446707509
    37 3.0920503 1.5807032840
    38 3.1559727 1.4924092104
    36 3.1593556 1.5803284343
    39 3.0801444 1.6031732981
    40 3.2562384 1.5810975265
    43 4.8414364 2.1539254215
    41 4.7938193 2.1613978258
    44 4.7919209 2.2151527426
    42 4.9830802 2.2374622446
    45 4.7629268 2.4217335005
    46 5.5631728 0.9986762598
    50 5.5250403 1.0549399894
    48 5.5833619 1.1368625963
    47 5.5660312 1.1881215490
    49 5.6224256 1.1634998303
    53 5.5536366 0.2513665533
    54 5.5276808 0.2685455911
    51 5.7103045 0.2193839293
    52 5.6014729 0.2353172964
    55 5.5959034 0.2447836618
    56 5.1542133 0.6070006863
    59 5.0043394 0.4518710615
    58 5.2314146 0.5656457888
    60 5.1318728 0.4771275341
    57 5.3599822 0.4918185651
    61 7.0235173 -0.2669136870
    63 7.0216315 -0.0097862523
    64 7.0521253 -0.2457722410
    62 7.0150637 -0.1456269078
    65 7.0729018 -0.3573952321
    69 5.8115406 -1.4652084167
    67 5.7624475 -1.4147564126
    68 5.8692888 -1.4695783153
    70 5.9088094 -1.4927034632
    66 5.8400205 -1.4817447808
    71 4.8586107 -1.3111515744
    73 4.7198564 -1.2891991780
    72 4.9153659 -1.4499710448
    74 4.7653488 -1.2839433419
    75 4.7754971 -1.4655359108
    77 3.8955675 -7.0922887151
    78 3.8338151 -7.1595858283
    80 3.7255063 -7.2147373050
    79 3.7367055 -7.3468877516
    76 4.0166957 -7.1952570639
Calculate the 80% point. One way:
    y <- x[x > 0.8 * max(x)]
(I'm assuming you wanted 80% of the max level, not the 80th percentile.) Then plot y.
After a bit of searching I think it can be achieved using the kde2d function from the MASS package.
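A rough sketch of how the `kde2d` route might look. It assumes "80% contour" means the density level whose enclosed region holds about 80% of the estimated probability mass, and it approximates the density at each point by the value at the grid node just below and to the left of it. The random `pts` data frame and all variable names are placeholders, not the data from the question.

```r
library(MASS)    # provides kde2d

set.seed(42)
# Placeholder data with the same column names as in the question
pts <- data.frame(
  Can1 = c(rnorm(150, 0, 1), rnorm(50, 5, 0.5)),
  Can2 = c(rnorm(150, 0, 1), rnorm(50, 2, 0.5))
)

# Kernel density estimate on a 100 x 100 grid spanning the data range
dens <- kde2d(pts$Can1, pts$Can2, n = 100)

# Approximate density at each point from its lower-left grid node
ix <- findInterval(pts$Can1, dens$x, all.inside = TRUE)
iy <- findInterval(pts$Can2, dens$y, all.inside = TRUE)
pt_dens <- dens$z[cbind(ix, iy)]

# Density level enclosing roughly 80% of the estimated mass:
# sort grid densities from high to low and accumulate until 80% is reached
z_sorted <- sort(as.vector(dens$z), decreasing = TRUE)
cum_mass <- cumsum(z_sorted) / sum(z_sorted)
level80 <- z_sorted[which(cum_mass >= 0.80)[1]]

inside  <- pts[pt_dens >= level80, ]   # points within the 80% region
outside <- pts[pt_dens <  level80, ]   # points in the low-density tail

# Visual check: contour(dens, levels = level80); points(outside, col = "red")
```

If the other reading is intended (points where the density exceeds 80% of its maximum), replace `level80` with `0.8 * max(dens$z)`.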
Converting probe ids to entrez ids from a list of lists
The conversion of probe ids to entrez ids is quite straightforward:
    i1 <- c("246653_at", "246897_at", "251347_at", "252988_at", "255528_at", "256535_at", "257203_at", "257582_at", "258807_at", "261509_at", "265050_at", "265672_at")
    select(ath1121501.db, i1, "ENTREZID", "PROBEID")
    PROBEID ENTREZID
    1 246653_at 833474
    2 246897_at 832631
    3 251347_at 825272
    4 252988_at 829998
    5 255528_at 827380
    6 256535_at 840223
    7 257203_at 821955
    8 257582_at 841494
    9 258807_at 819558
    10 261509_at 843504
    11 265050_at 841636
    12 265672_at 817757
But I am unsure how to do it for a long list of lists resulting from a clustering, and how to store the result as a list of ENTREZ ids instead of probe ids. For instance:
    [[1]]
    247964_at 248684_at 249126_at 249214_at 250223_at 253620_at 254907_at 259897_at 261256_at 267126_s_at
    28 40 44 45 54 95 108 152 171 229
    [[2]]
    248230_at 250869_at 259765_at 265948_at 266221_at
    33 64 151 216 221
    [[3]]
    245385_at 247282_at 248967_at 250180_at 250881_at 251073_at 53874_at 256093_at 257054_at 260007_at
    5 22 42 52 65 67 101 117 125 155
    261868_s_at 263136_at 267497_at
    181 195 232
It should be something like
    [[1]]
    "835761","834904","834356","834281","831256","829175","826721","843479","837084","816891","816892"
and similarly for the other elements of the list.
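One possible shape for this, sketched in base R only: build a named lookup vector from probe id to entrez id once, then map the names of every cluster through it with `lapply`. The `lookup` values are the first six pairs from the `select()` output above; the `clusters` list, the made-up id "999999_at", and the helper `to_entrez` are hypothetical and only there to make the example runnable without the annotation package.

```r
# Lookup table: names are probe ids, values are entrez ids
# (pairs copied from the select() output shown in the question)
lookup <- c("246653_at" = "833474", "246897_at" = "832631",
            "251347_at" = "825272", "252988_at" = "829998",
            "255528_at" = "827380", "256535_at" = "840223")

# Hypothetical clustering result: probe ids are the element names
clusters <- list(
  c("246653_at" = 1, "251347_at" = 3),
  c("246897_at" = 2, "252988_at" = 4, "255528_at" = 5),
  c("256535_at" = 6, "999999_at" = 7)   # made-up id, to show unmapped probes
)

# Map the probe ids of one cluster to entrez ids, dropping unmapped probes
to_entrez <- function(cl, map) {
  ids <- unname(map[names(cl)])
  ids[!is.na(ids)]
}

entrez_by_cluster <- lapply(clusters, to_entrez, map = lookup)
print(entrez_by_cluster)
```

With the real annotation data, the lookup could presumably be built from the data frame returned by `select()`, for example `setNames(res$ENTREZID, res$PROBEID)` where `res` is that result. A named vector keeps only one entrez id per probe, so probes that map to several ids (as 267126_s_at appears to in the expected output) would need `split(res$ENTREZID, res$PROBEID)` or similar instead.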