Technical error of measurement in between two columns - r

I have the following data frame:
data_2
sex age seca1 chad1 DL alog1 dig1 scifirst1 crimetech1
1 F 19 1800 1797 180 70 69 421 424
2 F 19 1682 1670 167 69 69 421 423
3 F 21 1765 1765 178 80 81 421 423
4 F 21 1829 1833 181 74 72 421 419
5 F 21 1706 1705 170 103 101 439 440
6 F 18 1607 1606 160 76 76 440 439
7 F 19 1578 1576 156 50 48 422 422
8 F 19 1577 1575 156 61 61 439 441
9 F 21 1666 1665 166 52 51 439 441
10 F 17 1710 1716 172 65 65 420 420
11 F 28 1616 1619 161 66 65 426 428
12 F 22 1648 1644 165 58 57 426 429
13 F 19 1569 1570 155 55 54 419 420
14 F 19 1779 1777 177 55 54 422 422
15 M 18 1773 1772 179 70 69 420 419
16 M 18 1816 1809 181 81 80 442 440
17 M 19 1766 1765 178 77 76 425 425
18 M 19 1745 1741 174 76 76 421 423
19 M 18 1716 1714 170 71 70 445 446
20 M 21 1785 1783 179 64 63 446 445
21 M 19 1850 1854 185 71 72 422 421
22 M 31 1875 1880 188 95 95 419 420
23 M 26 1877 1877 186 106 106 420 420
24 M 19 1836 1837 185 100 100 426 423
25 M 18 1825 1823 182 85 85 444 439
26 M 19 1755 1754 174 79 78 420 419
27 M 26 1658 1658 165 69 69 421 421
28 M 20 1816 1818 183 84 83 439 440
29 M 18 1755 1755 175 67 67 429 422
I wish to compute the technical error of measurement (TEM) between "alog1" and "dig1", which has the following formula:
TEM = √(D / (2n))
where D is the sum of the squared differences between alog1 and dig1, and n is 29 (the number of rows).
I'm not sure how to compute the sum of the squared differences between the two columns in the first place. Please help.

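Probably with
n <- nrow(data_2) # 29 paired measurements
D <- sum((data_2$alog1 - data_2$dig1)^2) # sum of the squared differences
TEM <- sqrt(D / (2 * n)) # note the parentheses: /2*n would divide by 2 and then multiply by n
TEM is a single number for the whole data set, so there is no need to bind it to the table.
Check the formula of TEM; maybe I didn't understand it correctly.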
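As a quick check of the formula on the data in the question, here is a minimal sketch in base R. The helper name tem() is made up for this illustration, and the two vectors are the alog1 and dig1 columns copied from the table above.

```r
# Hypothetical helper (not from the original post):
# TEM = sqrt(sum of squared paired differences / (2 * n))
tem <- function(x, y) {
  stopifnot(length(x) == length(y))
  d <- x - y                        # paired differences
  sqrt(sum(d^2) / (2 * length(d)))
}

# alog1 and dig1 as printed in the question (29 rows)
alog1 <- c(70, 69, 80, 74, 103, 76, 50, 61, 52, 65, 66, 58, 55, 55, 70,
           81, 77, 76, 71, 64, 71, 95, 106, 100, 85, 79, 69, 84, 67)
dig1  <- c(69, 69, 81, 72, 101, 76, 48, 61, 51, 65, 65, 57, 54, 54, 69,
           80, 76, 76, 70, 63, 72, 95, 106, 100, 85, 78, 69, 83, 67)

D <- sum((alog1 - dig1)^2)     # sum of the squared differences
tem_value <- tem(alog1, dig1)  # same as sqrt(D / (2 * 29))
```

With data_2 loaded, tem(data_2$alog1, data_2$dig1) should give the same value. For these 29 rows D works out to 27.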

Related

Split a data.frame into n random groups with x rows each

Assume a data.frame as follows:
df <- data.frame(name = paste0("Person",rep(1:30)),
number = sample(1:100, 30, replace=TRUE),
focus = sample(1:500, 30, replace=TRUE))
I want to split the above data.frame into 9 groups, each with 9 observations. Each person can be assigned to multiple groups (replacement), so that all 9 groups have all 9 observations (since 9 groups x 9 observations require 81 rows while the df has only 30).
The output will ideally be a large list of 1000 data.frames.
Are there any efficient ways of doing this? This is just a sample data.frame. The actual df has ~10k rows and will require 1000 groups each with 30 rows.
Many thanks.
Is this what you are looking for?
res <- replicate(1000, df[sample.int(nrow(df), 30, TRUE), ], FALSE)
df I used
df <- data.frame(name = paste0("Person",rep(1:1e4)),
number = sample(1:100, 1e4, replace=TRUE),
focus = sample(1:500, 1e4, replace=TRUE))
Output
> res[1:3]
[[1]]
name number focus
529 Person529 5 351
9327 Person9327 4 320
1289 Person1289 78 164
8157 Person8157 46 183
6939 Person6939 38 61
4066 Person4066 26 103
132 Person132 34 39
6576 Person6576 36 397
5376 Person5376 47 456
6123 Person6123 10 18
5318 Person5318 39 42
6355 Person6355 62 212
340 Person340 90 256
7050 Person7050 19 198
1500 Person1500 42 208
175 Person175 34 30
3751 Person3751 99 441
3813 Person3813 93 492
7428 Person7428 72 142
6840 Person6840 58 45
6501 Person6501 95 499
5124 Person5124 16 159
3373 Person3373 38 36
5622 Person5622 40 203
8761 Person8761 9 225
6252 Person6252 75 444
4502 Person4502 58 337
5344 Person5344 24 233
4036 Person4036 59 265
8764 Person8764 45 1
[[2]]
name number focus
8568 Person8568 87 360
3968 Person3968 67 468
4481 Person4481 46 140
8055 Person8055 73 286
7794 Person7794 92 336
1110 Person1110 6 434
6736 Person6736 4 58
9758 Person9758 60 49
9356 Person9356 89 300
9719 Person9719 100 366
4183 Person4183 5 124
1394 Person1394 87 346
2642 Person2642 81 449
3592 Person3592 65 358
579 Person579 21 395
9551 Person9551 39 495
4946 Person4946 73 32
4081 Person4081 98 270
4062 Person4062 27 150
7698 Person7698 52 436
5388 Person5388 89 177
9598 Person9598 91 474
8624 Person8624 3 464
392 Person392 82 483
5710 Person5710 43 293
4942 Person4942 99 350
3333 Person3333 89 91
6789 Person6789 99 259
7115 Person7115 100 320
1431 Person1431 77 263
[[3]]
name number focus
201 Person201 100 272
4674 Person4674 27 410
9728 Person9728 18 275
9422 Person9422 2 396
9783 Person9783 45 37
5552 Person5552 76 109
3871 Person3871 49 277
3411 Person3411 64 24
5799 Person5799 29 131
626 Person626 31 122
3103 Person3103 2 76
8043 Person8043 90 384
3157 Person3157 90 392
7093 Person7093 11 169
2779 Person2779 83 2
2601 Person2601 77 122
9003 Person9003 50 163
9653 Person9653 4 235
9361 Person9361 100 391
4273 Person4273 83 383
4725 Person4725 35 436
2157 Person2157 71 486
3995 Person3995 25 258
3735 Person3735 24 221
303 Person303 81 407
4838 Person4838 64 198
6926 Person6926 90 417
6267 Person6267 82 284
8570 Person8570 67 317
2670 Person2670 21 342
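Note that the replicate() call samples with replacement inside each group, so the same person can appear twice in one group. If that is not wanted, a possible variation is sketched below: each group is drawn without replacement, while the groups themselves are drawn independently, so a person can still land in several groups. The function name sample_groups() and its arguments are made up for this illustration.

```r
# Hypothetical helper: draw n_groups independent samples of `size` rows each.
# replace = FALSE means no person is repeated *within* a group.
sample_groups <- function(df, n_groups, size, replace = FALSE) {
  lapply(seq_len(n_groups), function(i) {
    df[sample.int(nrow(df), size, replace = replace), , drop = FALSE]
  })
}

set.seed(1)  # only so the illustration is reproducible
df <- data.frame(name = paste0("Person", 1:30),
                 number = sample(1:100, 30, replace = TRUE),
                 focus = sample(1:500, 30, replace = TRUE))

# 9 groups of 9 rows from the 30-row sample data
groups <- sample_groups(df, n_groups = 9, size = 9)
```

For the real data this would be sample_groups(df, n_groups = 1000, size = 30), which returns a list of 1000 data frames.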

Remove rows under condition in R

I have this dataframe
time power hr fr VE VO2 VCO2 id
1 1462.0104166666667 25 90 24 20 632 549 LM01-PRD-S1
2 1462.0194444444444 25 92 23 21 679 597 LM01-PRD-S1
3 1462.0305555555556 25 93 22 21 675 607 LM01-PRD-S1
4 1462.0416666666667 25 93 20 19 680 577 LM01-PRD-S1
5 1462.0520833333333 40 96 20 22 745 660 LM01-PRD-S1
6 1462.0618055555556 40 98 21 22 764 675 LM01-PRD-S1
7 1462.0722222222223 40 100 21 22 789 703 LM01-PRD-S1
8 1462.0826388888888 40 100 20 23 805 734 LM01-PRD-S1
9 1462.09375 55 105 22 26 911 843 LM01-PRD-S1
10 1462.1041666666667 55 105 20 25 881 831 LM01-PRD-S1
11 1462.1131944444444 55 109 19 25 895 847 LM01-PRD-S1
12 1462.1229166666667 55 112 21 25 908 868 LM01-PRD-S1
13 1462.1347222222223 70 120 21 28 981 947 LM01-PRD-S1
14 1462.1451388888888 70 120 21 29 1044 1021 LM01-PRD-S1
15 1462.1548611111111 70 122 22 27 1066 1031 LM01-PRD-S1
16 1462.1652777777779 70 127 19 30 1136 1122 LM01-PRD-S1
17 1462.1770833333333 85 130 20 32 1181 1218 LM01-PRD-S1
18 1462.1868055555556 85 141 21 32 1194 1216 LM01-PRD-S1
19 1462.1958333333334 85 139 22 34 1231 1295 LM01-PRD-S1
20 1462.2069444444444 85 139 19 32 1193 1268 LM01-PRD-S1
21 1462.2166666666667 100 139 21 31 1192 1274 LM01-PRD-S1
22 1462.2291666666667 100 146 21 38 1363 1460 LM01-PRD-S1
23 1462.2395833333333 100 150 28 50 1551 1801 LM01-PRD-S1
24 1462.2479166666667 100 148 30 51 1499 1810 LM01-PRD-S1
25 1462.2597222222223 115 150 30 55 1564 1883 LM01-PRD-S1
26 1462.2708333333333 115 153 31 56 1544 1892 LM01-PRD-S1
27 1462.2805555555556 115 157 33 59 1545 2012 LM01-PRD-S1
28 1462.2881944444443 115 157 34 62 1647 2091 LM01-PRD-S1
29 NA NA NA Récupé ration NA NA LM01-PRD-S1
30 1462.0027777777777 65 157 39 61 1466 1940 LM01-PRD-S1
31 1462.0131944444445 20 153 32 58 1518 1939 LM01-PRD-S1
32 1462.0236111111112 20 148 28 50 1422 1748 LM01-PRD-S1
33 1462.0333333333333 20 144 26 46 1222 1555 LM01-PRD-S1
34 1462.0430555555556 20 141 22 37 963 1209 LM01-PRD-S1
35 1462.0541666666666 20 133 22 42 1165 1464 LM01-PRD-S1
36 1462.0645833333333 20 133 24 47 1021 1384 LM01-PRD-S1
37 1462.0743055555556 20 130 22 40 914 1228 LM01-PRD-S1
38 1462.0854166666666 20 130 23 38 847 1128 LM01-PRD-S1
39 1462.0944444444444 20 120 18 32 755 998 LM01-PRD-S1
40 1462.1069444444445 0 117 17 29 674 904 LM01-PRD-S1
41 1462.1173611111112 0 115 20 27 587 805 LM01-PRD-S1
42 1462.1277777777777 0 113 20 28 536 803 LM01-PRD-S1
43 1462.1368055555556 0 112 18 26 489 744 LM01-PRD-S1
44 1462.1479166666666 0 110 18 25 457 703 LM01-PRD-S1
45 1462.1590277777777 0 103 19 23 419 633 LM01-PRD-S1
46 1462.16875 0 103 17 24 479 672 LM01-PRD-S1
47 1462.1791666666666 0 103 19 21 423 560 LM01-PRD-S1
48 1462.1902777777777 0 100 19 22 459 609 LM01-PRD-S1
49 1462.1993055555556 0 101 18 22 440 599 LM01-PRD-S1
50 1462.004861111111 0 98 18 22 410 572 LM01-PRD-S1
51 1.0416666666666666E-2 35 102 16 18 659 576 LB02-PRD-S1
52 1.9444444444444445E-2 35 101 17 19 729 613 LB02-PRD-S1
53 3.0555555555555555E-2 35 105 15 28 977 851 LB02-PRD-S1
54 4.0972222222222222E-2 35 96 16 28 886 852 LB02-PRD-S1
55 4.9999999999999996E-2 50 90 16 16 593 504 LB02-PRD-S1
56 6.1111111111111116E-2 50 106 18 17 737 552 LB02-PRD-S1
57 7.2222222222222229E-2 50 108 19 23 1053 775 LB02-PRD-S1
58 8.2638888888888887E-2 50 117 17 30 1236 1008 LB02-PRD-S1
59 9.2361111111111116E-2 65 113 18 29 1181 983 LB02-PRD-S1
60 0.10347222222222223 65 114 15 31 1167 1016 LB02-PRD-S1
61 0.11388888888888889 65 118 16 31 1167 1052 LB02-PRD-S1
62 0.12430555555555556 65 114 17 28 1104 967 LB02-PRD-S1
63 0.13402777777777777 80 120 17 35 1318 1172 LB02-PRD-S1
64 0.1451388888888889 80 117 16 32 1236 1153 LB02-PRD-S1
65 0.15486111111111112 80 122 17 31 1168 1094 LB02-PRD-S1
66 0.16458333333333333 80 122 17 34 1312 1205 LB02-PRD-S1
67 0.1763888888888889 95 126 18 37 1311 1274 LB02-PRD-S1
68 0.18611111111111112 95 129 18 35 1248 1201 LB02-PRD-S1
69 0.19722222222222222 95 131 15 33 1275 1196 LB02-PRD-S1
70 0.20625000000000002 95 134 18 39 1444 1381 LB02-PRD-S1
71 0.21736111111111112 110 134 19 43 1539 1472 LB02-PRD-S1
72 0.22847222222222222 110 136 19 41 1417 1406 LB02-PRD-S1
73 0.2388888888888889 110 137 20 43 1496 1437 LB02-PRD-S1
74 0.25 110 139 20 44 1561 1539 LB02-PRD-S1
75 0.25972222222222224 125 142 21 46 1561 1560 LB02-PRD-S1
76 0.26944444444444443 125 146 21 46 1535 1552 LB02-PRD-S1
77 0.28055555555555556 125 148 23 51 1698 1703 LB02-PRD-S1
78 0.29166666666666669 125 150 23 53 1725 1776 LB02-PRD-S1
79 0.30069444444444443 140 151 22 52 1726 1760 LB02-PRD-S1
80 0.31180555555555556 140 151 23 53 1713 1763 LB02-PRD-S1
81 0.32222222222222224 140 153 25 55 1807 1836 LB02-PRD-S1
82 0.33263888888888887 140 155 26 58 1897 1941 LB02-PRD-S1
83 0.34375 155 153 26 59 1929 1963 LB02-PRD-S1
84 0.35347222222222219 155 157 26 57 1843 1908 LB02-PRD-S1
85 0.36388888888888887 155 160 28 65 1942 2065 LB02-PRD-S1
86 0.375 155 164 26 64 2011 2131 LB02-PRD-S1
87 0.38472222222222219 170 166 26 65 2048 2178 LB02-PRD-S1
88 0.39583333333333331 170 166 26 64 2069 2171 LB02-PRD-S1
89 0.40625 170 169 25 64 2165 2269 LB02-PRD-S1
90 0.41666666666666669 170 169 28 76 2328 2539 LB02-PRD-S1
91 0.42638888888888887 185 169 30 76 2189 2449 LB02-PRD-S1
92 0.4368055555555555 185 171 29 73 2225 2411 LB02-PRD-S1
93 0.44722222222222219 185 171 29 68 2170 2292 LB02-PRD-S1
94 0.45763888888888887 185 171 31 82 2458 2712 LB02-PRD-S1
95 0.4680555555555555 200 171 33 89 2443 2780 LB02-PRD-S1
96 0.47847222222222219 200 173 33 87 2465 2784 LB02-PRD-S1
97 0.48888888888888887 200 176 32 88 2536 2853 LB02-PRD-S1
98 0.5 200 176 34 93 2571 2899 LB02-PRD-S1
99 0.51041666666666663 215 176 36 98 2529 2924 LB02-PRD-S1
100 0.52083333333333337 215 179 36 105 2602 3087 LB02-PRD-S1
101 0.53125 215 179 39 111 2795 3282 LB02-PRD-S1
102 0.54097222222222219 215 181 40 118 2679 3240 LB02-PRD-S1
103 0.55208333333333337 230 179 40 113 2649 3160 LB02-PRD-S1
104 0.56180555555555556 230 179 41 111 2601 3055 LB02-PRD-S1
105 0.57291666666666663 230 176 42 116 2639 3129 LB02-PRD-S1
106 0.58263888888888882 230 181 43 126 2683 3277 LB02-PRD-S1
107 0.59375 245 181 47 123 2597 3160 LB02-PRD-S1
108 0.60416666666666663 245 181 48 128 2482 3122 LB02-PRD-S1
109 NA NA NA Récupé ration NA NA LB02-PRD-S1
110 9.7222222222222224E-3 20 179 42 108 2320 2830 LB02-PRD-S1
111 2.013888888888889E-2 20 173 40 106 2134 2594 LB02-PRD-S1
112 3.125E-2 20 171 37 103 1869 2531 LB02-PRD-S1
113 4.0972222222222222E-2 20 166 38 97 1438 2207 LB02-PRD-S1
114 5.1388888888888894E-2 20 164 36 88 1192 1918 LB02-PRD-S1
115 6.1805555555555558E-2 20 155 37 81 1121 1746 LB02-PRD-S1
116 7.0833333333333331E-2 20 142 32 71 1072 1585 LB02-PRD-S1
117 8.1944444444444445E-2 20 151 26 56 961 1345 LB02-PRD-S1
118 9.2361111111111116E-2 20 148 28 58 996 1367 LB02-PRD-S1
119 0.10277777777777779 20 144 24 49 858 1189 LB02-PRD-S1
120 0.11319444444444444 20 141 25 49 722 1053 LB02-PRD-S1
121 0.125 0 136 25 42 611 895 LB02-PRD-S1
122 0.13472222222222222 0 131 26 42 642 893 LB02-PRD-S1
123 0.1451388888888889 0 129 28 44 612 874 LB02-PRD-S1
124 0.15555555555555556 0 126 24 36 544 728 LB02-PRD-S1
125 0.16527777777777777 0 127 26 40 658 840 LB02-PRD-S1
126 0.1763888888888889 0 130 23 31 511 665 LB02-PRD-S1
127 0.18611111111111112 0 126 24 39 646 815 LB02-PRD-S1
128 0.19652777777777777 0 120 25 38 527 716 LB02-PRD-S1
129 0.20694444444444446 0 120 24 36 509 684 LB02-PRD-S1
130 1462.0104166666667 25 101 20 18 712 584 GC03-PRD-S1
131 1462.0208333333333 25 99 20 17 673 551 GC03-PRD-S1
132 1462.03125 25 97 20 17 686 559 GC03-PRD-S1
133 1462.0402777777779 25 96 20 16 639 524 GC03-PRD-S1
134 1462.0506944444444 40 99 19 16 647 518 GC03-PRD-S1
135 1462.0604166666667 40 105 19 16 669 543 GC03-PRD-S1
136 1462.0729166666667 40 107 21 18 723 598 GC03-PRD-S1
137 1462.0826388888888 40 107 25 19 746 605 GC03-PRD-S1
138 1462.0916666666667 55 109 23 20 775 645 GC03-PRD-S1
139 1462.1020833333334 55 111 20 20 780 671 GC03-PRD-S1
140 1462.1118055555555 55 116 21 21 811 710 GC03-PRD-S1
141 1462.1243055555556 55 113 17 22 858 765 GC03-PRD-S1
142 1462.1340277777779 70 117 21 23 900 789 GC03-PRD-S1
143 1462.1458333333333 70 117 20 23 953 843 GC03-PRD-S1
144 1462.15625 70 120 20 25 980 882 GC03-PRD-S1
145 1462.1652777777779 70 122 22 26 1000 916 GC03-PRD-S1
146 1462.1763888888888 85 122 23 27 1049 961 GC03-PRD-S1
147 1462.1868055555556 85 126 23 28 1072 992 GC03-PRD-S1
148 1462.1965277777779 85 131 22 29 1110 1056 GC03-PRD-S1
149 1462.2076388888888 85 130 22 30 1066 1047 GC03-PRD-S1
150 1462.2173611111111 100 129 21 28 1166 1057 GC03-PRD-S1
151 1462.2284722222223 100 137 27 34 1346 1247 GC03-PRD-S1
152 1462.2395833333333 100 137 22 34 1272 1261 GC03-PRD-S1
153 1462.25 100 136 20 33 1222 1235 GC03-PRD-S1
154 1462.2590277777779 115 139 23 36 1321 1321 GC03-PRD-S1
155 1462.2701388888888 115 142 23 37 1340 1377 GC03-PRD-S1
156 1462.2798611111111 115 144 24 38 1362 1418 GC03-PRD-S1
157 1462.2909722222223 115 150 27 44 1470 1579 GC03-PRD-S1
158 1462.3013888888888 130 151 27 45 1466 1618 GC03-PRD-S1
159 1462.3125 130 153 31 54 1686 1875 GC03-PRD-S1
160 1462.3222222222223 130 155 33 59 1679 1998 GC03-PRD-S1
161 1462.3326388888888 130 157 33 59 1676 2021 GC03-PRD-S1
162 1462.3423611111111 145 157 33 61 1700 2041 GC03-PRD-S1
163 1462.3534722222223 145 160 35 64 1764 2120 GC03-PRD-S1
164 1462.3638888888888 145 160 36 67 1765 2182 GC03-PRD-S1
165 1462.3743055555556 145 162 40 71 1762 2208 GC03-PRD-S1
166 1462.0006944444444 145 162 39 69 1754 2208 GC03-PRD-S1
167 NA NA NA Récupé ration NA NA GC03-PRD-S1
168 1462.0097222222223 20 155 38 68 1687 2124 GC03-PRD-S1
169 1462.0194444444444 20 148 39 67 1576 1996 GC03-PRD-S1
170 1462.0298611111111 20 142 35 62 1390 1842 GC03-PRD-S1
171 1462.0409722222223 20 136 35 58 1189 1632 GC03-PRD-S1
172 1462.05 20 127 26 46 991 1337 GC03-PRD-S1
173 1462.0604166666667 20 117 21 26 776 896 GC03-PRD-S1
174 1462.0715277777779 20 115 22 31 855 1012 GC03-PRD-S1
175 1462.0819444444444 20 111 23 30 783 950 GC03-PRD-S1
176 1462.0930555555556 20 109 23 30 756 939 GC03-PRD-S1
177 1462.1020833333334 20 100 23 28 702 870 GC03-PRD-S1
178 1462.1131944444444 20 104 23 29 685 853 GC03-PRD-S1
179 1462.1236111111111 20 90 19 20 471 594 GC03-PRD-S1
180 1462.1340277777779 0 96 20 20 494 607 GC03-PRD-S1
181 1462.1444444444444 0 94 20 19 439 559 GC03-PRD-S1
182 1462.1548611111111 0 93 20 19 425 561 GC03-PRD-S1
183 1462.1638888888888 0 90 19 17 357 480 GC03-PRD-S1
184 1462.175 0 91 18 16 345 443 GC03-PRD-S1
185 1462.1854166666667 0 96 21 18 370 480 GC03-PRD-S1
186 1462.1958333333334 0 92 20 16 324 420 GC03-PRD-S1
187 1462.2076388888888 0 92 20 16 324 414 GC03-PRD-S1
188 1462.0083333333334 0 93 20 15 309 391 GC03-PRD-S1
189 1462.0104166666667 60 127 27 40 1267 1274 GT04-PRD-S1
190 1462.0201388888888 60 131 29 40 1264 1274 GT04-PRD-S1
191 1462.0305555555556 60 133 30 40 1281 1298 GT04-PRD-S1
192 1462.0402777777779 60 134 29 42 1304 1360 GT04-PRD-S1
193 1462.0513888888888 80 134 28 40 1274 1324 GT04-PRD-S1
194 1462.0625 80 137 28 40 1337 1335 GT04-PRD-S1
195 1462.0729166666667 80 144 29 45 1485 1501 GT04-PRD-S1
196 1462.0833333333333 80 144 30 50 1573 1630 GT04-PRD-S1
197 1462.0930555555556 100 148 30 47 1380 1478 GT04-PRD-S1
198 1462.1034722222223 100 150 30 49 1520 1576 GT04-PRD-S1
199 1462.1145833333333 100 153 31 50 1553 1589 GT04-PRD-S1
200 1462.1243055555556 100 151 31 55 1735 1818 GT04-PRD-S1
201 1462.1340277777779 120 153 32 65 1905 2146 GT04-PRD-S1
202 1462.1444444444444 120 151 32 62 1748 2026 GT04-PRD-S1
203 1462.1555555555556 120 160 31 61 1799 2041 GT04-PRD-S1
204 1462.1652777777779 120 160 30 64 1810 2105 GT04-PRD-S1
205 1462.1756944444444 140 164 33 73 1895 2314 GT04-PRD-S1
206 1462.1861111111111 140 162 33 72 1966 2345 GT04-PRD-S1
207 1462.1972222222223 140 166 36 79 2021 2470 GT04-PRD-S1
208 1462.2083333333333 140 166 35 76 2022 2450 GT04-PRD-S1
209 1462.2180555555556 160 164 37 78 2115 2491 GT04-PRD-S1
210 1462.2284722222223 160 169 40 82 2147 2583 GT04-PRD-S1
211 1462.2388888888888 160 169 38 83 2190 2647 GT04-PRD-S1
212 1462.2493055555556 160 173 38 85 2202 2713 GT04-PRD-S1
213 1462.2604166666667 180 171 38 88 2332 2837 GT04-PRD-S1
214 1462.2701388888888 180 171 41 95 2321 2937 GT04-PRD-S1
215 1462.28125 180 176 39 94 2358 2994 GT04-PRD-S1
216 1462.2909722222223 180 176 42 104 2339 3086 GT04-PRD-S1
217 1462.2979166666667 200 176 44 105 2444 3186 GT04-PRD-S1
218 NA NA NA Récupé ration NA NA GT04-PRD-S1
219 1462.0034722222222 125 179 42 97 2304 2957 GT04-PRD-S1
220 1462.0131944444445 30 171 38 92 2266 2900 GT04-PRD-S1
221 1462.0236111111112 30 166 36 93 2136 2851 GT04-PRD-S1
222 1462.0347222222222 30 166 35 91 1829 2619 GT04-PRD-S1
223 1462.0444444444445 30 162 34 83 1576 2306 GT04-PRD-S1
224 1462.0548611111112 30 160 31 65 1411 1904 GT04-PRD-S1
225 1462.0652777777777 30 155 36 78 1439 2013 GT04-PRD-S1
226 1462.0763888888889 30 153 34 69 1337 1832 GT04-PRD-S1
227 1462.0861111111112 30 153 34 66 1283 1716 GT04-PRD-S1
228 1462.0965277777777 30 144 28 49 1012 1303 GT04-PRD-S1
229 1462.1069444444445 30 134 25 41 897 1147 GT04-PRD-S1
230 1462.1180555555557 0 130 25 40 756 1051 GT04-PRD-S1
231 1462.1284722222222 0 126 20 28 500 741 GT04-PRD-S1
232 1462.1381944444445 0 123 23 27 533 712 GT04-PRD-S1
233 1462.1486111111112 0 123 23 29 548 737 GT04-PRD-S1
234 1462.1590277777777 0 117 24 24 415 560 GT04-PRD-S1
235 1462.16875 0 114 21 27 610 728 GT04-PRD-S1
236 1462.1798611111112 0 111 19 23 508 612 GT04-PRD-S1
237 1462.1902777777777 0 113 21 26 548 666 GT04-PRD-S1
238 1462.2006944444445 0 113 23 27 552 683 GT04-PRD-S1
239 1462.0020833333333 0 114 22 28 547 702 GT04-PRD-S1
I would like to remove, within each id, the row containing the word "ration" in the column VE and all rows after it.
That means removing rows 29 to 50, 109 to 129, 167 to 188, and 218 to 239.
The word "ration" appears several times, and please take into account that I have many more ids (I cannot include them in my question because it would be too long).
I tried to create at the end of each id but it did not work.
Thank you for your help!
With dplyr:
library(dplyr)
data %>%
  group_by(id) %>%
  # keep rows only while no "ration" marker has been seen within the id
  filter(cumsum(VE == "ration") == 0)
Assuming every id has a row with "ration", you can use dplyr like this:
library(dplyr)
# seq_len() also copes with "ration" being the first row of an id (keeps nothing)
df %>% group_by(id) %>% slice(seq_len(which.max(VE == "ration") - 1))
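The same idea can be sketched in base R without dplyr. This is only an illustration on made-up toy data with the same id and VE columns; it counts the "ration" markers seen so far within each id and keeps the rows where that count is still zero.

```r
# Toy data: two ids, each with a "ration" marker row in VE (values are made up)
df <- data.frame(
  id = c("A", "A", "A", "A", "B", "B", "B"),
  VE = c("20", "21", "ration", "61", "18", "ration", "40"),
  stringsAsFactors = FALSE
)

# 1 on the marker row, 0 elsewhere; %in% treats NA as "not a marker"
is_marker <- as.integer(df$VE %in% "ration")

# Running count of markers within each id; 0 means "before the marker"
seen <- ave(is_marker, df$id, FUN = cumsum)

# Drops the marker row and everything after it, separately for each id
trimmed <- df[seen == 0, ]
```

Using %in% rather than == is a deliberate choice here: a missing value in VE would otherwise turn the running count into NA for the rest of that id.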

Convert non-numeric rows and columns to zero

I have this data from an R package, where X is the dataset with all the data:
library(ISLR)
data("Hitters")
X=Hitters
head(X)
here is one part of the data:
AtBat Hits HmRun Runs RBI Walks Years CAtBat CHits CHmRun CRuns CRBI CWalks League Division PutOuts Assists Errors Salary NewLeague
-Andy Allanson 293 66 1 30 29 14 1 293 66 1 30 29 14 A E 446 33 20 NA A
-Alan Ashby 315 81 7 24 38 39 14 3449 835 69 321 414 375 N W 632 43 10 475.0 N
-Alvin Davis 479 130 18 66 72 76 3 1624 457 63 224 266 263 A W 880 82 14 480.0 A
-Andre Dawson 496 141 20 65 78 37 11 5628 1575 225 828 838 354 N E 200 11 3 500.0 N
-Andres Galarraga 321 87 10 39 42 30 2 396 101 12 48 46 33 N E 805 40 4 91.5 N
-Alfredo Griffin 594 169 4 74 51 35 11 4408 1133 19 501 336 194 A W 282 421 25 750.0 A
I want to convert all the columns and rows with non-numeric values to zero. Is there a simple way to do this?
I found an example here of how to remove the rows for just one column, but for more columns I would have to do it manually for each one.
Is there a function in R that does this for all columns and rows?
To remove non-numeric columns, perhaps something like this?
df %>%
select(which(sapply(., is.numeric)))
# AtBat Hits HmRun Runs RBI Walks Years CAtBat CHits CHmRun
#-Andy Allanson 293 66 1 30 29 14 1 293 66 1
#-Alan Ashby 315 81 7 24 38 39 14 3449 835 69
#-Alvin Davis 479 130 18 66 72 76 3 1624 457 63
#-Andre Dawson 496 141 20 65 78 37 11 5628 1575 225
#-Andres Galarraga 321 87 10 39 42 30 2 396 101 12
#-Alfredo Griffin 594 169 4 74 51 35 11 4408 1133 19
# CRuns CRBI CWalks PutOuts Assists Errors Salary
#-Andy Allanson 30 29 14 446 33 20 NA
#-Alan Ashby 321 414 375 632 43 10 475.0
#-Alvin Davis 224 266 263 880 82 14 480.0
#-Andre Dawson 828 838 354 200 11 3 500.0
#-Andres Galarraga 48 46 33 805 40 4 91.5
#-Alfredo Griffin 501 336 194 282 421 25 750.0
or
df %>%
select(-which(sapply(., function(x) is.character(x) | is.factor(x))))
Or much neater (thanks to @AntoniosK):
df %>% select_if(is.numeric)
Update
To additionally replace NAs with 0, you can do
df %>% select_if(is.numeric) %>% replace(is.na(.), 0)
# AtBat Hits HmRun Runs RBI Walks Years CAtBat CHits CHmRun
#-Andy Allanson 293 66 1 30 29 14 1 293 66 1
#-Alan Ashby 315 81 7 24 38 39 14 3449 835 69
#-Alvin Davis 479 130 18 66 72 76 3 1624 457 63
#-Andre Dawson 496 141 20 65 78 37 11 5628 1575 225
#-Andres Galarraga 321 87 10 39 42 30 2 396 101 12
#-Alfredo Griffin 594 169 4 74 51 35 11 4408 1133 19
# CRuns CRBI CWalks PutOuts Assists Errors Salary
#-Andy Allanson 30 29 14 446 33 20 0.0
#-Alan Ashby 321 414 375 632 43 10 475.0
#-Alvin Davis 224 266 263 880 82 14 480.0
#-Andre Dawson 828 838 354 200 11 3 500.0
#-Andres Galarraga 48 46 33 805 40 4 91.5
#-Alfredo Griffin 501 336 194 282 421 25 750.0
library(ISLR)
data("Hitters")
d = head(Hitters)
library(dplyr)
d %>%
mutate_if(function(x) !is.numeric(x), function(x) 0) %>% # if column is non numeric add zeros
mutate_all(function(x) ifelse(is.na(x), 0, x)) # if there is an NA element replace it with 0
# AtBat Hits HmRun Runs RBI Walks Years CAtBat CHits CHmRun CRuns CRBI CWalks League Division PutOuts Assists Errors Salary NewLeague
# 1 293 66 1 30 29 14 1 293 66 1 30 29 14 0 0 446 33 20 0.0 0
# 2 315 81 7 24 38 39 14 3449 835 69 321 414 375 0 0 632 43 10 475.0 0
# 3 479 130 18 66 72 76 3 1624 457 63 224 266 263 0 0 880 82 14 480.0 0
# 4 496 141 20 65 78 37 11 5628 1575 225 828 838 354 0 0 200 11 3 500.0 0
# 5 321 87 10 39 42 30 2 396 101 12 48 46 33 0 0 805 40 4 91.5 0
# 6 594 169 4 74 51 35 11 4408 1133 19 501 336 194 0 0 282 421 25 750.0 0
If you want to avoid function(x) you can use this
d %>%
mutate_if(Negate(is.numeric), ~0) %>%
mutate_all(~ifelse(is.na(.), 0, .))
You can get the numeric columns with sapply/inherits.
X <- Hitters
inx <- sapply(X, inherits, c("integer", "numeric"))
Y <- X[inx]
After that, it wouldn't make much sense to remove the rows with non-numeric entries, since they have already been removed, but you could do
inx <- apply(Y, 1, function(y) all(inherits(y, c("integer", "numeric"))))
Y[inx, ]
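If the goal is to keep every column and set the non-numeric ones (and any NA) to zero, a base R version without dplyr could look like the sketch below. It uses a small stand-in data frame with values copied from the rows shown in the question rather than the full Hitters data, and the helper name to_zero() is made up for this illustration.

```r
# Toy stand-in for a few Hitters columns (values copied from the rows shown above)
X <- data.frame(
  AtBat  = c(293L, 315L, 479L),
  League = factor(c("A", "N", "A")),
  Salary = c(NA, 475, 480)
)

# Hypothetical helper: non-numeric column -> all zeros;
# numeric column -> kept, with NA replaced by 0
to_zero <- function(col) {
  if (!is.numeric(col)) return(rep(0, length(col)))
  col[is.na(col)] <- 0
  col
}

Y <- X
Y[] <- lapply(X, to_zero)   # Y[] keeps the data-frame shape and column names
```

The same call should work on the full data set, for example X <- Hitters followed by X[] <- lapply(X, to_zero).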

Merge rows with duplicate IDs

I would like to merge and sum the values of each row that contains duplicated IDs.
For example, the data frame below contains a duplicated symbol 'LOC102723897'. I would like to merge these two rows and sum the value within each column, so that one row appears for the duplicated symbol.
> head(y$genes)
SM01 SM02 SM03 SM04 SM05 SM06 SM07 SM08 SM09 SM10 SM11 SM12 SM13 SM14 SM15 SM16 SM17 SM18 SM19 SM20 SM21 SM22
1 32 29 23 20 27 105 80 64 83 80 94 58 122 76 78 70 34 32 45 42 138 30
2 246 568 437 343 304 291 542 457 608 433 218 329 483 376 410 296 550 533 537 473 296 382
3 30 23 30 13 20 18 23 13 31 11 15 27 36 21 23 25 26 27 37 27 31 16
4 1450 2716 2670 2919 2444 1668 2923 2318 3867 2084 1121 2175 3022 2308 2541 1613 2196 1851 2843 2078 2180 1902
5 288 366 327 334 314 267 550 410 642 475 219 414 679 420 425 308 359 406 550 398 399 268
6 34 59 62 68 42 31 49 45 62 51 40 32 30 39 41 75 54 59 83 99 37 37
SM23 SM24 SM25 SM26 SM27 SM28 SM29 SM30 Symbol
1 41 23 57 160 84 67 87 113 LOC102723897
2 423 535 624 304 568 495 584 603 LINC01128
3 31 21 49 13 33 31 14 31 LINC00115
4 2453 3041 3590 2343 3450 3725 3336 3850 NOC2L
5 403 347 468 478 502 563 611 577 LOC102723897
6 45 51 56 107 79 105 92 131 PLEKHN1
> dim(y)
[1] 12928 30
I attempted using plyr to merge rows based on the 'Symbol' column, but it's not working.
> ddply(y$genes,"Symbol",numcolwise(sum))
> dim(y)
[1] 12928 30
> length(y$genes$Symbol)
[1] 12928
> length(unique(y$genes$Symbol))
[1] 12896
Group by Symbol and sum all the other columns.
library(dplyr)
df %>% group_by(Symbol) %>% summarise_all(sum)
Using data.table:
library(data.table)
setDT(df)[ , lapply(.SD, sum),by="Symbol"]
We can just use aggregate from base R
aggregate(.~ Symbol, df, FUN = sum)
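To see the effect on a small case, here is a sketch using the aggregate() call on a toy data frame built from the first two sample columns printed in the question. Note that the result has to be assigned to a variable; calling the function without assigning its result leaves the original object unchanged, which may be why the dimensions in the question did not change.

```r
# Toy data: SM01 and SM02 values for the six rows shown in the question
df <- data.frame(
  Symbol = c("LOC102723897", "LINC01128", "LINC00115",
             "NOC2L", "LOC102723897", "PLEKHN1"),
  SM01 = c(32, 246, 30, 1450, 288, 34),
  SM02 = c(29, 568, 23, 2716, 366, 59),
  stringsAsFactors = FALSE
)

# Sum every remaining column within each Symbol; assign the result
merged <- aggregate(. ~ Symbol, data = df, FUN = sum)

# The duplicated symbol is now a single row holding the column sums
loc <- merged[merged$Symbol == "LOC102723897", ]
```

Here the two LOC102723897 rows collapse into one, with SM01 = 32 + 288 and SM02 = 29 + 366.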

continuous value supplied to discrete scale

I am new to ggplot2. In fact, I only discovered it last week and I haven't quite figured out yet how to use aesthetics and scales etc. There is probably a very easy solution to my problem but I couldn't find a satisfying answer online.
Sorry for the size of the message, but all the data used is in the following script:
dados
Fres Vc Lu
1 466 30 10
2 416 30 10
3 465 30 10
4 416 30 10
5 464 30 10
6 416 30 10
7 476 30 10
8 412 30 10
9 468 30 10
10 410 30 10
11 470 30 10
12 407 30 10
13 468 30 10
14 412 30 10
15 469 30 10
16 414 30 10
17 469 30 10
18 412 30 10
19 467 30 10
20 409 30 10
21 469 30 10
22 415 30 10
23 471 30 10
24 420 30 10
25 469 30 10
26 416 30 10
27 464 30 10
28 409 30 10
29 465 30 10
30 412 30 10
31 464 30 10
32 409 30 10
33 466 30 10
34 417 30 10
35 466 30 10
36 417 30 10
37 464 30 10
38 414 30 10
39 466 30 10
40 415 30 10
41 585 30 94
42 234 30 94
43 589 30 94
44 231 30 94
45 585 30 94
46 223 30 94
47 586 30 94
48 223 30 94
49 572 30 94
50 233 30 94
51 585 30 94
52 233 30 94
53 589 30 94
54 234 30 94
55 598 30 94
56 237 30 94
57 605 30 94
58 237 30 94
59 586 30 94
60 233 30 94
61 588 30 94
62 227 30 94
63 585 30 94
64 230 30 94
65 586 30 94
66 230 30 94
67 591 30 94
68 237 30 94
69 586 30 94
70 234 30 94
71 592 30 94
72 237 30 94
73 595 30 94
74 236 30 94
75 600 30 94
76 227 30 94
77 592 30 94
78 237 30 94
79 592 30 94
80 240 30 94
81 468 30 10
82 408 30 10
83 471 30 10
84 405 30 10
85 475 30 10
86 403 30 10
87 470 30 10
88 409 30 10
89 478 30 10
90 405 30 10
91 474 30 10
92 403 30 10
93 472 30 10
94 402 30 10
95 478 30 10
96 408 30 10
97 477 30 10
98 406 30 10
99 473 30 10
100 406 30 10
101 474 30 10
102 406 30 10
103 477 30 10
104 411 30 10
105 480 30 10
106 413 30 10
107 479 30 10
108 408 30 10
109 476 30 10
110 406 30 10
111 476 30 10
112 404 30 10
113 472 30 10
114 407 30 10
115 474 30 10
116 411 30 10
117 473 30 10
118 415 30 10
119 479 30 10
120 409 30 10
121 578 30 94
122 370 30 94
123 570 30 94
124 378 30 94
125 575 30 94
126 367 30 94
127 579 30 94
128 371 30 94
129 576 30 94
130 362 30 94
131 579 30 94
132 372 30 94
133 588 30 94
134 375 30 94
135 586 30 94
136 372 30 94
137 589 30 94
138 378 30 94
139 587 30 94
140 375 30 94
141 578 30 94
142 368 30 94
143 575 30 94
144 375 30 94
145 574 30 94
146 376 30 94
147 575 30 94
148 367 30 94
149 580 30 94
150 382 30 94
151 583 30 94
152 368 30 94
153 591 30 94
154 386 30 94
155 595 30 94
156 379 30 94
157 593 30 94
158 384 30 94
159 607 30 94
160 399 30 94
161 760 30 122
162 625 30 122
163 746 30 122
164 612 30 122
165 762 30 122
166 625 30 122
167 783 30 122
168 637 30 122
169 778 30 122
170 640 30 122
171 778 30 122
172 638 30 122
173 791 30 122
174 638 30 122
175 782 30 122
176 635 30 122
177 792 30 122
178 640 30 122
179 783 30 122
180 637 30 122
181 774 30 122
182 622 30 122
183 777 30 122
184 618 30 122
185 777 30 122
186 622 30 122
187 765 30 122
188 623 30 122
189 769 30 122
190 625 30 122
191 775 30 122
192 622 30 122
193 777 30 122
194 628 30 122
195 769 30 122
196 620 30 122
197 778 30 122
198 623 30 122
199 788 30 122
200 634 30 122
201 457 40 38
202 416 40 38
203 460 40 38
204 438 40 38
205 465 40 38
206 441 40 38
207 467 40 38
208 442 40 38
209 473 40 38
210 452 40 38
211 469 40 38
212 446 40 38
213 478 40 38
214 450 40 38
215 476 40 38
216 454 40 38
217 479 40 38
218 452 40 38
219 480 40 38
220 450 40 38
221 481 40 38
222 443 40 38
223 476 40 38
224 447 40 38
225 472 40 38
226 450 40 38
227 479 40 38
228 449 40 38
229 478 40 38
230 455 40 38
231 478 40 38
232 457 40 38
233 481 40 38
234 447 40 38
235 504 40 38
236 452 40 38
237 472 40 38
238 447 40 38
239 472 40 38
240 451 40 38
241 622 40 66
242 377 40 66
243 619 40 66
244 378 40 66
245 622 40 66
246 369 40 66
247 616 40 66
248 374 40 66
249 619 40 66
250 374 40 66
251 616 40 66
252 374 40 66
253 621 40 66
254 375 40 66
255 618 40 66
256 397 40 66
257 633 40 66
258 406 40 66
259 652 40 66
260 412 40 66
261 652 40 66
262 419 40 66
263 658 40 66
264 423 40 66
265 659 40 66
266 409 40 66
267 650 40 66
268 405 40 66
269 653 40 66
270 405 40 66
271 652 40 66
272 403 40 66
273 656 40 66
274 408 40 66
275 644 40 66
276 406 40 66
277 649 40 66
278 412 40 66
279 650 40 66
280 406 40 66
281 853 40 122
282 330 40 122
283 859 40 122
284 323 40 122
285 842 40 122
286 308 40 122
287 842 40 122
288 324 40 122
289 831 40 122
290 334 40 122
291 838 40 122
292 341 40 122
293 836 40 122
294 328 40 122
295 840 40 122
296 324 40 122
297 836 40 122
298 321 40 122
299 831 40 122
300 328 40 122
301 833 40 122
302 328 40 122
303 840 40 122
304 330 40 122
305 831 40 122
306 321 40 122
307 833 40 122
308 328 40 122
309 833 40 122
310 321 40 122
311 840 40 122
312 319 40 122
313 838 40 122
314 317 40 122
315 831 40 122
316 319 40 122
317 827 40 122
318 323 40 122
319 836 40 122
320 328 40 122
321 442 40 38
322 407 40 38
323 437 40 38
324 410 40 38
325 444 40 38
326 412 40 38
327 440 40 38
328 414 40 38
329 439 40 38
330 413 40 38
331 436 40 38
332 416 40 38
333 446 40 38
334 412 40 38
335 438 40 38
336 414 40 38
337 443 40 38
338 408 40 38
339 446 40 38
340 407 40 38
341 445 40 38
342 413 40 38
343 453 40 38
344 414 40 38
345 449 40 38
346 417 40 38
347 447 40 38
348 411 40 38
349 443 40 38
350 417 40 38
351 447 40 38
352 410 40 38
353 449 40 38
354 409 40 38
355 442 40 38
356 413 40 38
357 451 40 38
358 412 40 38
359 447 40 38
360 420 40 38
361 526 40 66
362 467 40 66
363 532 40 66
364 470 40 66
365 528 40 66
366 474 40 66
367 529 40 66
368 472 40 66
369 533 40 66
370 480 40 66
371 542 40 66
372 487 40 66
373 545 40 66
374 504 40 66
375 549 40 66
376 507 40 66
377 546 40 66
378 517 40 66
379 541 40 66
380 518 40 66
381 554 40 66
382 514 40 66
383 564 40 66
384 514 40 66
385 571 40 66
386 522 40 66
387 575 40 66
388 525 40 66
389 582 40 66
390 533 40 66
391 588 40 66
392 536 40 66
393 591 40 66
394 553 40 66
395 592 40 66
396 557 40 66
397 592 40 66
398 563 40 66
399 583 40 66
400 568 40 66
> dadosc <- summarySE(dados, measurevar="Fres", groupvars=c("Vc","Lu"))
> dadosc
Vc Lu N Fres sd se ci
1 30 10 80 440.6875 30.91540 3.456447 6.879885
2 30 94 80 445.0250 150.97028 16.878990 33.596789
3 30 122 40 701.7000 75.06688 11.869115 24.007552
4 40 38 80 444.6125 23.31973 2.607225 5.189552
5 40 66 80 526.7125 90.77824 10.149316 20.201707
6 40 122 40 581.1250 259.74092 41.068645 83.069175
> ggplot(dadosc, aes(x=Lu, y=Fres, colour=Vc)) +
+ geom_errorbar(aes(ymin=Fres-se, ymax=Fres+se), width=5) +
+ geom_point()
> pd <- position_dodge(0.1)
Up to this point I get this graph, which is very close to the one I want, except that I'd like a legend with only two colours, one for Vc=30 and the other for Vc=40.
![enter image description here][1]
Then I tried the following script:
ggplot(dadosc, aes(x=Lu, y=Fres, ymax = max(Fres), colour=Vc, group=Vc)) +
  geom_errorbar(aes(ymin=Fres-se, ymax=Fres+se), colour="black", width=.1, position=pd) +
  geom_point(position=pd, size=3, shape=21, fill="white") + # 21 is filled circle
  xlab("Machining length (mm)") +
  ylab("Machining forces (N)") +
  scale_colour_hue(name="Cutting Velocity",
                   breaks=c("30", "40"),
                   labels=c("Vc = 30 m/min", " Vc = 40 m/min "),
                   l=40) +
  ggtitle("The Effect of Cutting Velocity on Machining Forces") +
  expand_limits(y=0) +
  scale_y_continuous(breaks=0:750*50) +
  theme_bw() +
  theme(legend.justification=c(1,0),
        legend.position=c(1,0))
And I receive this message:
Error: Continuous value supplied to discrete scale
Vc should be a factor if you want two values in the legend. You were getting that error because scale_colour_hue() is a discrete scale (with breaks "30" and "40"), while Vc was of type integer, which ggplot2 treats as continuous:
ggplot(dadosc, aes(x=Lu, y=Fres, colour=factor(Vc))) +
...
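A fuller sketch of the factor approach is given below, under a few assumptions: the summary table is rebuilt by hand from the dadosc output printed in the question (only the columns the plot needs), the legend labels are set through the factor's labels instead of the scale, and the plot is only built if ggplot2 is installed.

```r
# Summary values copied from the dadosc output in the question
dadosc <- data.frame(
  Vc   = c(30, 30, 30, 40, 40, 40),
  Lu   = c(10, 94, 122, 38, 66, 122),
  Fres = c(440.6875, 445.0250, 701.7000, 444.6125, 526.7125, 581.1250),
  se   = c(3.456447, 16.878990, 11.869115, 2.607225, 10.149316, 41.068645)
)

# Make Vc discrete; the labels become the legend entries
dadosc$Vc <- factor(dadosc$Vc, levels = c(30, 40),
                    labels = c("Vc = 30 m/min", "Vc = 40 m/min"))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # colour = Vc is now a factor, so the discrete hue scale is valid
  p <- ggplot(dadosc, aes(x = Lu, y = Fres, colour = Vc)) +
    geom_errorbar(aes(ymin = Fres - se, ymax = Fres + se), width = 5) +
    geom_point() +
    scale_colour_hue(name = "Cutting Velocity", l = 40)
}
```

Printing p should then draw the plot with a two-entry legend. Converting the column once, rather than writing colour = factor(Vc) inside aes(), keeps the legend labels in one place.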
