I am using purrr to fit several linear mixed spline models and to select the best model based on the lowest BIC. I would like to extract predictions from the best model; however, I get the following error message when trying to do this:
Error in model.matrix.default(fixed, model.frame(delete.response(Terms), : model frame and formula mismatch in model.matrix()
How can I extract predictions from the best model?
This is the example data
dat <- structure(list(id = c(1001L, 1001L, 1001L, 1001L, 1001L, 1002L,
1003L, 1004L, 1004L, 1004L, 1004L, 1004L, 1004L, 1004L, 1005L,
1005L, 1005L, 1005L, 1005L, 1006L, 1006L, 1006L, 1006L, 1006L,
1007L, 1007L, 1008L, 1008L, 1008L, 1008L, 1008L, 1009L, 1009L,
1009L, 1010L, 1010L, 1010L, 1011L, 1012L, 1012L, 1012L, 1013L,
1013L, 1014L, 1015L, 1015L, 1015L, 1016L, 1016L, 1016L, 1016L,
1016L, 1017L, 1017L, 1018L, 1020L, 1020L, 1021L, 1021L, 1021L,
1021L, 1022L, 1022L, 1023L, 1023L, 1023L, 1023L, 1023L, 1023L,
1023L, 1023L, 1023L, 1023L, 1024L, 1024L, 1024L, 1024L, 1024L,
1025L, 1025L, 1025L, 1026L, 1026L, 1026L, 1026L, 1027L, 1027L,
1028L, 1028L, 1028L, 1028L, 1028L, 1028L, 1028L, 1029L, 1029L,
1029L, 1029L, 1029L, 1029L, 1030L, 1030L, 1030L, 1030L, 1030L,
1030L, 1030L, 1030L, 1031L, 1031L, 1031L, 1031L, 1032L, 1032L,
1032L, 1032L, 1032L, 1033L, 1033L, 1033L, 1033L, 1034L, 1034L,
1034L, 1034L, 1034L, 1035L, 1035L, 1036L, 1037L, 1037L, 1037L,
1037L, 1039L, 1039L, 1040L, 1040L, 1040L, 1040L, 1040L, 1040L,
1041L, 1041L, 1041L, 1041L, 1041L, 1041L, 1042L, 1042L, 1042L,
1042L, 1042L, 1042L, 1042L, 1043L, 1043L, 1043L, 1043L, 1044L,
1044L, 1044L, 1045L, 1045L, 1045L, 1045L, 1045L, 1045L, 1047L,
1048L, 1048L, 1049L, 1049L, 1049L, 1049L, 1051L, 1051L, 1052L,
1052L, 1052L, 1052L, 1052L, 1053L, 1053L, 1053L, 1053L, 1053L,
1054L, 1054L, 1054L, 1054L, 1054L, 1054L, 1054L, 1054L, 1056L,
1056L, 1056L, 1056L, 1057L, 1057L, 1058L, 1058L, 1058L, 1058L,
1058L, 1060L, 1060L, 1060L, 1061L, 1061L, 1061L, 1061L, 1061L,
1062L, 1062L, 1062L, 1062L, 1062L, 1063L, 1063L, 1063L, 1064L,
1064L, 1064L, 1064L, 1065L, 1065L, 1066L, 1066L, 1066L, 1066L,
1066L, 1066L, 1067L, 1067L, 1067L, 1068L, 1068L, 1068L, 1068L,
1068L, 1068L, 1068L, 1069L, 1070L, 1070L, 1070L, 1071L, 1071L,
1071L, 1072L, 1072L, 1072L, 1072L, 1072L, 1073L, 1073L, 1073L,
1073L, 1074L, 1074L, 1074L, 1075L, 1075L, 1075L, 1075L, 1075L,
1075L, 1076L, 1076L, 1076L, 1077L, 1077L, 1077L, 1077L, 1077L,
1077L, 1078L, 1078L, 1078L, 1078L, 1078L, 1078L, 1078L, 1080L,
1080L, 1080L, 1080L, 1081L, 1081L, 1082L, 1082L, 1082L, 1083L,
1083L, 1084L, 1085L, 1085L, 1085L, 1085L, 1085L, 1085L, 1086L,
1086L, 1086L, 1087L, 1087L, 1087L, 1087L, 1087L, 1087L, 1087L,
1087L, 1088L, 1088L, 1088L, 1088L, 1089L, 1089L, 1089L, 1089L,
1089L, 1090L, 1090L, 1091L, 1091L, 1091L, 1091L, 1091L, 1092L,
1092L, 1092L, 1092L, 1092L, 1093L, 1093L, 1093L, 1093L, 1094L,
1094L, 1094L, 1094L, 1094L, 1095L, 1095L, 1095L, 1095L, 1096L,
1097L, 1097L, 1098L, 1098L, 1098L, 1098L, 1098L, 1099L, 1099L,
1099L, 1099L, 1099L, 1099L, 1099L, 1099L, 1100L, 1100L, 1100L,
1101L, 1101L, 1101L, 1101L, 1103L, 1103L, 1103L, 1103L, 1103L,
1103L, 1103L, 1104L, 1104L, 1104L, 1104L, 1105L, 1105L, 1105L,
1106L, 1106L, 1106L, 1106L, 1106L, 1106L, 1106L, 1106L, 1106L,
1107L, 1108L, 1110L, 1111L, 1112L, 1117L, 1123L), y = c(1934.047646,
1075.598345, 1956.214821, 2000.38538, 2000.38538, 732.315937,
3119.86, 624.951231, 791.2764892, 1884.530826, 624.951231, 1047.57,
1047.57, 791.2764892, 1238.306103, 1555.042976, 2547.870529,
2547.870529, 2467.385, 1181.635212, 1181.635212, 565.306282,
2016.027874, 2016.027874, 712.6134567, 635.2537841, 2167.362267,
2575.574188, 2167.362267, 2480.028259, 2575.574188, 2875.363243,
1180.139938, 2828.037147, 3017.119362, 2722.940933, 2167.92,
2409.652458, 2245.442558, 724.1520328, 635.6034756, 1649.08326,
966.8182507, 865.2717723, 1570.23, 916.1300105, 1180.999973,
2351.32885, 2418.851707, 2290.038887, 2224.060562, 2509.52, 1174.589081,
1540.219376, 2692.26, 1300.899734, 1100.650177, 1786.628242,
1705.842979, 543.8596134, 1786.628242, 2115.374241, 2331.46,
875.949604, 2241.945103, 2319.666939, 2316.220234, 719.7139549,
2042.803307, 719.7139549, 1132.977503, 875.949604, 2316.220234,
1737.18, 1351.629826, 1291.44593, 1291.44593, 1108.26586, 1028.979719,
1291.44593, 2068.934227, 2440.784416, 1036.72, 894.6663704, 2449.184731,
1109.9, 672.9310664, 2072.320354, 2114.215416, 2114.215416, 1805.422001,
2461.18, 2101.374248, 2105.879, 1600.086481, 2866.84, 1600.086481,
2807.311, 3055.569931, 1600.086481, 2602.287521, 2690.007614,
620.5975037, 2608.4, 2722.3, 2713.66185, 2608.4, 1590.002, 2198.211,
2488.097725, 2198.211, 2322.616348, 2627.1, 2418.328346, 2601.661034,
531.7369251, 811.9494571, 884.31, 768.0526981, 652.1271248, 768.0526981,
2767.479, 1047.144354, 1047.144354, 1995.119, 1995.119, 707.6093158,
707.6093158, 1120.650104, 3036.591904, 3036.591904, 3081.86,
1193.583691, 2056.569244, 1823.155, 1238.948124, 2124.685, 887.20438,
1823.155, 2056.569244, 2056.569244, 2560.155342, 3095.923164,
3095.923164, 3003.729011, 2861.12, 2560.155342, 2735.26, 822.8209591,
1648.951, 1648.951, 1648.951, 822.8209591, 906.7692623, 582.787096,
1286.45, 797.2365359, 2566.770554, 2666.41, 2666.41, 2045.320816,
2401.21, 2401.21, 2583.2, 2581.32, 2622.357, 2581.32, 2588.462498,
442.433671, 1251.627064, 406.2565479, 2108.787437, 983.1101169,
2102.085403, 1155.713411, 1909.797131, 2871.55, 2711.07, 2883.22245,
2883.22245, 2711.07, 3027.103172, 3108.21537, 3007.87294, 3208.963631,
3108.21537, 2617.91, 2457.464466, 2890.51, 2698.48214, 2700.723,
2700.723, 2817.668579, 2700.723, 1349.90691, 1476.19994, 1552.95,
1349.90691, 925.8325004, 1258.28, 840.1875095, 2405.175911, 840.1875095,
1056.678543, 1571.936, 1210.89, 1210.89, 673.7005405, 687.7842464,
1016.86, 1217.866, 1493.791817, 2246.726913, 1054.821, 1054.821,
563.6580887, 1054.821, 1540.429863, 2209.006493, 1437.835186,
2191.308, 1412.128944, 2724.164597, 2791.705185, 2727.774208,
2070.451198, 866.7974147, 1661.082638, 2108.271309, 2411.515434,
2342.026085, 2071.06, 2258.321014, 1537.06, 760.6319065, 867.7596569,
1907.60466, 1770.658, 760.6319065, 912.8781966, 912.8781966,
912.8781966, 1257.222706, 2586.922356, 1608.28, 962.5674305,
1085.451181, 2539.218132, 2535.526085, 2561.60054, 1600.198,
2100.048149, 758.3851737, 758.3851737, 2643.373329, 367.7795143,
866.0683727, 718.5049658, 866.0683727, 1906.694649, 2291.48,
2190.560314, 744.1710777, 1498.981777, 2460.912292, 590.1345787,
2487.559135, 1855.601353, 660.9104843, 1116.08, 792.929533, 708.8373737,
2272.232933, 1801.729801, 2299.800095, 2272.232933, 2299.800095,
1895.828438, 1757.75, 1050.279345, 1757.75, 1326.09478, 1326.09478,
1633.119305, 1558, 1167.971405, 1828.16, 1788.571758, 2175.469,
1071.039494, 941.6030864, 2053.067215, 1461.02132, 1597.646778,
1885.321567, 2195.704372, 2195.704372, 1675.768558, 3157.550789,
1565.173126, 2195.704372, 3157.550789, 2404.836883, 2541.045593,
585.7223682, 2465.177761, 2678.462074, 500.3733997, 2465.177761,
781.342, 898.3551559, 2465.177761, 2465.177761, 1807.02, 1418.888027,
1797.36, 1807.02, 2200.06, 2218.369926, 2200.06, 1986.642735,
2088.292, 2069.139, 1507.901432, 2061.395798, 2075.164864, 2081.913219,
2081.913219, 483.8579493, 1857.88, 2578.772636, 1857.88, 1857.88,
1039.632153, 2288.28, 2288.28, 1831.349922, 2349.23, 933.1002788,
2626.298935, 1521.744, 933.1002788, 2626.298935, 1984.760715,
2450.333, 1732.339031, 1984.760715, 2731.9, 869.2320918, 1785.72,
1922.798, 3081.28, 1508.8, 2421.288597, 1922.798, 1268.074959,
1569.05, 1808.115, 1569.05, 1268.074959, 2165.724808, 2165.724808,
1808.115, 2084.149837, 2693.027184, 2464.489, 2607.653496, 1012.837271,
1012.837271, 2673.190872, 2635.290516, 2773.42, 2635.290516,
2654.772674, 2377.905655, 2679.014969, 2654.772674, 1226.40016,
1470.69, 1273.789799, 2294.926086, 1226.40016, 1470.69, 1273.789799,
1873.817, 2274.930534, 2317.429165, 959.1709613, 1328.159428,
1328.159428, 1328.159428, 959.1709613, 1630.28, 1610.54982, 2507.05302,
750.467966, 750.467966, 821.2255058, 802.8240452, 2829.47879),
age = c(31.54004107, 11.95071869, 27.88501027, 27.88501027,
25.07871321, 10.90759754, 25.70020534, 9.560574949, 11.17864476,
15.8384668, 9.560574949, 11.23613963, 14.01232033, 10.54620123,
12.89527721, 14.52977413, 24.96919918, 24.72005476, 23.95893224,
13.31690623, 11.52087611, 9.927446954, 22.10814511, 16.44353183,
10.90759754, 7.991786448, 17.26488706, 23.95893224, 15.66872005,
17.63723477, 24.72005476, 30.97330595, 11.52087611, 17.5633128,
30.11088296, 23.31279945, 17.26488706, 20.58590007, 28.27926078,
11.66324435, 9.927446954, 13.92744695, 11.20328542, 12.70362765,
13.52498289, 12.21355236, 13.80150582, 22.81724846, 39.3045859,
16.62696783, 22.63107461, 29.86447639, 12.54483231, 14.42299795,
34.27789185, 12.91170431, 12.25462012, 21.81245722, 21.81245722,
10.05065024, 23.6659822, 16.22450376, 28.74743326, 12.70362765,
35.43052704, 21.21013005, 19.28542094, 12.77207392, 16.59411362,
12.12867899, 11.29637235, 11.81930185, 19.04449008, 19.93429158,
16.14236824, 12.85420945, 13.21560575, 11.61396304, 11.85763176,
13.3798768, 17.42915811, 24.41341547, 13.08418891, 11.6659822,
24.41341547, 12.06297057, 10.22861054, 26.15468857, 21.71937029,
20.1889117, 12.60232717, 25.39904175, 30.72689938, 19.22245038,
14.45037645, 24.77207392, 13.47570157, 17.87816564, 27.52635181,
15.16221766, 19.68514716, 21.67282683, 9.062286105, 20.43805613,
21.67282683, 21.24024641, 20.70362765, 13.5687885, 17.13347023,
28.11498973, 24.16974675, 18.19575633, 27.73442847, 15.52361396,
20.70362765, 11.76728268, 10.98699521, 11.51540041, 9.902806297,
13.05407255, 8.703627652, 25.60164271, 10.59000684, 10.59000684,
14.45859001, 14.05886379, 10.88295688, 10.75427789, 10.59000684,
26.50513347, 18.83093771, 22.86379192, 11.8384668, 15.04449008,
15.42505133, 14.14099932, 28.06844627, 11.51540041, 14.66119097,
13.79055441, 15.37850787, 22.58179329, 22.86379192, 30.0752909,
21.85900068, 25.60164271, 15.29089665, 26.79534565, 11.68514716,
15.42505133, 15.58384668, 15.08555784, 14.11909651, 11.6659822,
10.21765914, 12.1670089, 10.50239562, 23.3045859, 15.92607803,
22.58179329, 16.65982204, 20.58590007, 39.3045859, 32.56947296,
16.90349076, 25.12799452, 17.88364134, 19.46338125, 8.736481862,
14.14099932, 8.736481862, 17.68104038, 14.54893908, 19.22245038,
12.98562628, 22.45311431, 18.83093771, 38.68856947, 26.50513347,
25.44010951, 28.70910335, 19.21697467, 30.0752909, 26.50513347,
29.45106092, 33.31690623, 16.68172485, 15.816564, 24.89801506,
15.816564, 18.7761807, 18.4366872, 19.45790554, 19.78370979,
14.98973306, 15.89869952, 29.06502396, 16.14236824, 10.74880219,
13.47843943, 10.5982204, 24.61875428, 10.74880219, 12.47364819,
16.95277207, 12.41889117, 13.44832307, 9.984941821, 9.451060917,
12.59137577, 13.38261465, 15.14852841, 21.65913758, 12.57494867,
12.40520192, 10.75701574, 15.16495551, 15.67419576, 22.52703628,
13.31143053, 16.71457906, 12.98288843, 32.16974675, 25.3798768,
30.57084189, 22.14647502, 11.43874059, 13.25119781, 18.48049281,
25.81519507, 24.78028747, 17.85626283, 27.70704997, 13.28952772,
8.703627652, 11.61396304, 35.04996578, 15.61943874, 8.703627652,
13.33333333, 10.56810404, 11.34017796, 13.5797399, 28.79671458,
12.56673511, 13.33333333, 12.55578371, 30.80082136, 23.63039014,
29.66461328, 13.25119781, 17.46748802, 8.703627652, 8.703627652,
21.21013005, 9.768651608, 13.46748802, 10.75427789, 13.24298426,
26.87474333, 27.43326489, 20.6899384, 10.0752909, 13.37713895,
28.38056126, 8.911704312, 24.62149213, 14.32443532, 10.24229979,
13.87268994, 10.54620123, 11.44421629, 21.68377823, 15.61943874,
27.97809719, 28.90075291, 28.90075291, 24.64339493, 14.32443532,
10.61190965, 15.8110883, 14.25051335, 14.25051335, 13.64818617,
26.05338809, 13.69746749, 23.98083504, 16.68172485, 20.42162902,
12.68172485, 11.51813826, 16.65982204, 14.32443532, 15.49897331,
35.04996578, 18.70225873, 17.47570157, 14.66666667, 26.83915127,
13.29226557, 18.14647502, 25.70020534, 14.67761807, 16.61601643,
9.812457221, 15.96714579, 24.41341547, 8.911704312, 17.61806982,
11.87953457, 11.80561259, 19.15400411, 17.61806982, 15.70704997,
12.35318275, 18.12457221, 16.8733744, 32.02464066, 32.02464066,
25.30047912, 16.13415469, 19.37850787, 26.50513347, 15.89869952,
13.79055441, 25.42368241, 16.05201916, 15.43874059, 9.158110883,
14.39014374, 22.12183436, 15.70704997, 15.35934292, 11.44421629,
28.45995893, 17.06502396, 14.39014374, 26.32991102, 12.38056126,
16.42436687, 13.37713895, 11.70978782, 17.62628337, 16.13415469,
17.61806982, 15.11019849, 14.09993155, 21.89185489, 13.80150582,
16.8733744, 17.73305955, 25.55509925, 14.75975359, 24.03559206,
14.36002738, 12.73100616, 16.09034908, 18.12457221, 15.11019849,
13.69472964, 23.03901437, 16.94182067, 15.70704997, 13.99315537,
21.89185489, 15.65776865, 19.25530459, 10.43394935, 12.72826831,
24.41341547, 24.25735797, 37.41820671, 37.41820671, 25.25393566,
24.78028747, 25.25393566, 37.41820671, 12.11772758, 14.19575633,
14.091718, 15.10746064, 13.16906229, 12.09856263, 13.3798768,
14.39014374, 36.3504449, 22.68035592, 11.21149897, 12.73100616,
13.34702259, 14.5982204, 11.31827515, 15.14579055, 15.44969199,
15.65776865, 12.12867899, 12.43531828, 12.72005476, 14.11909651,
24.25735797)), row.names = c(7L, 303L, 323L, 372L, 391L,
240L, 311L, 38L, 46L, 94L, 149L, 154L, 185L, 362L, 40L, 70L,
98L, 262L, 305L, 73L, 74L, 77L, 306L, 374L, 104L, 397L, 14L,
43L, 188L, 248L, 370L, 50L, 101L, 143L, 25L, 155L, 251L, 37L,
173L, 208L, 263L, 49L, 383L, 389L, 30L, 237L, 353L, 156L, 283L,
288L, 302L, 325L, 33L, 158L, 159L, 35L, 360L, 57L, 128L, 204L,
387L, 300L, 365L, 16L, 51L, 82L, 85L, 93L, 148L, 150L, 232L,
242L, 287L, 32L, 62L, 200L, 285L, 290L, 193L, 352L, 398L, 54L,
175L, 203L, 324L, 69L, 195L, 92L, 106L, 141L, 189L, 218L, 347L,
394L, 23L, 24L, 120L, 166L, 257L, 349L, 6L, 118L, 235L, 266L,
269L, 275L, 282L, 390L, 122L, 153L, 330L, 378L, 53L, 88L, 229L,
241L, 314L, 135L, 278L, 332L, 384L, 64L, 168L, 207L, 212L, 359L,
329L, 338L, 130L, 67L, 108L, 286L, 316L, 182L, 254L, 113L, 215L,
247L, 273L, 322L, 336L, 27L, 102L, 162L, 171L, 270L, 326L, 19L,
205L, 210L, 307L, 333L, 358L, 375L, 41L, 111L, 179L, 226L, 2L,
277L, 367L, 68L, 83L, 147L, 180L, 260L, 354L, 144L, 81L, 342L,
103L, 217L, 321L, 376L, 131L, 280L, 39L, 267L, 291L, 301L, 400L,
11L, 36L, 152L, 177L, 377L, 21L, 201L, 236L, 281L, 312L, 331L,
355L, 369L, 8L, 176L, 202L, 385L, 45L, 327L, 12L, 138L, 151L,
157L, 233L, 95L, 258L, 279L, 224L, 239L, 243L, 310L, 328L, 63L,
191L, 214L, 227L, 356L, 80L, 110L, 366L, 97L, 107L, 293L, 373L,
117L, 335L, 22L, 160L, 209L, 221L, 230L, 268L, 55L, 163L, 284L,
5L, 10L, 76L, 132L, 222L, 256L, 399L, 228L, 127L, 343L, 357L,
133L, 259L, 334L, 261L, 341L, 382L, 393L, 395L, 213L, 219L, 249L,
289L, 44L, 126L, 368L, 42L, 72L, 196L, 297L, 308L, 320L, 84L,
137L, 172L, 60L, 129L, 142L, 186L, 197L, 319L, 15L, 109L, 115L,
116L, 125L, 199L, 223L, 190L, 245L, 346L, 396L, 146L, 364L, 1L,
29L, 192L, 112L, 170L, 315L, 164L, 225L, 231L, 255L, 274L, 345L,
65L, 96L, 264L, 4L, 28L, 31L, 59L, 87L, 250L, 271L, 295L, 161L,
198L, 265L, 339L, 18L, 26L, 114L, 124L, 174L, 145L, 304L, 105L,
119L, 140L, 238L, 381L, 48L, 52L, 71L, 351L, 371L, 244L, 253L,
294L, 340L, 20L, 75L, 86L, 165L, 167L, 47L, 89L, 298L, 318L,
211L, 350L, 380L, 66L, 79L, 90L, 234L, 309L, 61L, 99L, 139L,
276L, 299L, 344L, 348L, 361L, 313L, 337L, 379L, 9L, 58L, 181L,
187L, 17L, 100L, 121L, 123L, 184L, 206L, 220L, 178L, 292L, 386L,
392L, 194L, 252L, 272L, 3L, 56L, 134L, 136L, 183L, 216L, 246L,
296L, 363L, 169L, 388L, 78L, 34L, 13L, 91L, 317L), class = "data.frame")
This is the code to fit different models and select the best model based on the lowest BIC
library(nlme)
library(splines)
library(tidyverse)
models <- map(c(3:6), possibly(~ {
lme(as.formula(paste("y ~", capture.output(print(call("ns", quote(age), .x))))),
data = dat, random = ~ age | id, method = "ML")
}, otherwise = NA_real_))
(models_bic <- unlist(map(models, BIC)))
(best_model <- which.min(models_bic))
(best_model <- models[[best_model]])
This is what I want to do to get the predictions
best_model_pred <- data.frame(age =seq(min(dat$age), max(dat$age), length = 100))
best_model_pred$pred <- predict(best_model, best_model_pred, level = 0)
Error in model.matrix.default(fixed, model.frame(delete.response(Terms), :
model frame and formula mismatch in model.matrix()
The issue is that lme keeps the formula argument literally, as the expression you typed, rather than the formula it evaluates to. In your case it's this
as.formula(paste("y ~ ns(age, ", .x, ")"))
That expression only works inside the map loop, where .x is defined. Print the best model and take a look at the fourth line.
To fix it, you can take it a step further: construct the entire call as a string and then evaluate it.
models <- map(c(3:6), possibly(~ {
eval(parse(text = paste0("lme(y ~ ns(age, ", .x, "), data = dat, random = ~ age | id, method = 'ML')")))
}, otherwise = NA_real_))
It's not perfect but it works :)
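A variant of the same fix, as a sketch only: bquote() can splice the spline degrees of freedom into the call before it is evaluated, which avoids building code from strings. The simulated data frame sim, the helper fit_spline, and the simpler random-intercept structure are assumptions made to keep the example small and self-contained; they are not taken from the question.

```r
library(nlme)
library(splines)

# Simulated stand-in for `dat` (assumption): 40 subjects, 6 visits each.
set.seed(1)
n_id <- 40
n_obs <- 6
sim <- data.frame(id = rep(seq_len(n_id), each = n_obs),
                  age = runif(n_id * n_obs, 8, 40))
subject_effect <- rnorm(n_id, sd = 150)
sim$y <- 500 + 1500 * (1 - exp(-(sim$age - 8) / 6)) +
  subject_effect[sim$id] + rnorm(nrow(sim), sd = 200)

# Hypothetical helper: .(k) is replaced by the value of k before lme() runs,
# so the stored call contains a literal number, e.g. ns(age, 3L).
fit_spline <- function(k) {
  eval(bquote(lme(y ~ ns(age, .(k)), data = sim,
                  random = ~ 1 | id, method = "ML")))
}

models <- lapply(3:6, fit_spline)
best_model <- models[[which.min(sapply(models, BIC))]]

# Population-level (level = 0) predictions on a grid of ages.
grid <- data.frame(age = seq(min(sim$age), max(sim$age), length.out = 100))
grid$pred <- predict(best_model, grid, level = 0)
```

Because the stored formula no longer refers to .x, predict can rebuild the model matrix outside the loop.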
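Unrelated note: I'd use safely instead of possibly. It returns a list with result and error elements for each model, and result is NULL when the fit fails, so you can pull out the results and then use compact to remove the missing models.
models <- map(3:6, safely(your_function)) %>% map("result") %>% compact()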
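To make the note concrete, here is a minimal sketch of how safely, possibly and compact behave together. The function risky_fit is a hypothetical stand-in for the model-fitting function; it fails for one input on purpose.

```r
library(purrr)

# Hypothetical stand-in for a model fit that fails for one input.
risky_fit <- function(k) {
  if (k == 4) stop("did not converge")
  k^2
}

# safely() never throws: each element is list(result = ..., error = ...),
# and result is NULL when the call failed.
res <- map(3:6, safely(risky_fit))
fits <- compact(map(res, "result"))   # keep only the successful results

# possibly() with otherwise = NULL reaches the same list in one step.
fits2 <- compact(map(3:6, possibly(risky_fit, otherwise = NULL)))
```

Note that compact applied directly to the output of safely would drop nothing, since every element is a non-empty list; the results have to be extracted first.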
Related
I have the following list of vectors:
list(c(663L, 705L, 680L, 769L, 775L, 327L, 665L, 805L, 808L,
689L, 774L, 831L, 832L, 217L, 739L, 918L, 354L, 373L, 764L, 691L,
839L, 372L, 146L, 840L, 727L, 728L, 617L, 647L, 159L, 161L, 581L,
142L, 618L, 332L, 585L, 134L, 809L, 154L, 158L, 133L, 448L, 736L,
737L, 815L, 876L, 151L, 750L, 701L, 778L, 861L, 584L, 692L, 427L,
455L, 601L, 412L, 432L, 449L, 457L, 456L, 620L, 124L, 125L, 679L,
329L, 667L, 697L, 806L, 807L, 312L, 315L, 733L, 821L, 222L, 583L,
702L, 631L, 642L, 812L, 850L, 726L, 853L, 129L, 660L, 799L, 410L,
188L, 798L, 130L, 703L, 341L, 826L, 137L, 253L, 123L, 827L, 844L,
786L, 655L, 879L, 695L, 749L, 866L, 820L, 890L, 889L, 888L, 694L,
744L, 746L, 813L, 818L, 868L, 873L, 872L, 869L, 870L, 414L, 738L,
751L, 208L, 209L, 210L, 899L, 900L, 901L, 903L, 902L, 904L, 913L,
911L, 912L, 767L, 917L, 777L, 521L, 396L, 397L, 915L, 277L, 529L,
740L, 509L, 508L, 524L, 224L, 790L, 791L, 698L, 725L, 696L, 817L,
802L, 897L, 898L, 787L, 788L, 789L, 462L, 356L, 395L, 693L, 745L,
469L, 519L, 336L, 355L, 792L, 556L, 375L, 398L, 358L, 399L, 720L,
539L, 558L, 331L, 166L, 167L, 128L, 131L, 214L, 239L, 269L, 276L,
213L, 337L, 176L, 304L, 503L, 394L, 296L, 298L, 211L, 223L, 238L,
338L, 487L, 490L, 488L, 489L, 273L, 274L, 892L, 300L, 301L, 816L,
819L, 275L, 752L, 139L, 206L, 420L, 793L, 215L, 320L, 321L, 676L,
226L, 699L, 325L, 252L, 319L, 672L, 236L, 306L, 743L, 237L, 439L,
212L, 675L, 333L, 429L, 476L, 478L, 704L, 768L, 440L, 517L, 518L,
776L, 810L, 413L, 554L, 555L, 765L, 622L, 626L, 624L, 625L, 231L,
577L, 335L, 628L, 629L, 511L, 339L, 352L, 353L, 138L, 578L, 349L,
496L, 611L, 606L, 614L, 612L, 613L, 607L, 609L, 608L, 610L, 328L,
194L, 195L, 639L, 183L, 632L, 340L, 418L, 308L, 435L, 436L, 437L,
543L, 905L, 914L, 428L, 374L, 444L, 502L, 825L, 510L, 732L, 557L,
559L, 730L, 566L, 567L, 506L, 520L, 531L, 534L, 549L, 630L, 174L,
175L, 140L, 677L, 426L, 377L, 392L, 196L, 186L, 197L, 144L, 141L,
407L), c(887L, 886L, 884L, 885L), c(528L, 527L, 525L, 526L),
c(70L, 71L, 75L, 77L, 72L, 73L, 74L, 76L), c(111L, 109L,
110L, 98L, 120L, 112L, 116L, 103L, 106L, 93L, 95L, 94L, 119L,
117L, 99L, 118L), c(87L, 88L, 89L, 81L, 82L, 83L, 84L, 85L,
86L, 91L, 92L, 949L, 126L, 127L, 90L, 122L), c(530L, 185L,
202L, 363L, 729L, 880L, 368L, 401L, 391L, 405L, 906L, 513L,
652L, 708L, 552L, 766L, 505L, 382L, 383L, 803L, 565L, 571L,
572L, 688L, 460L, 480L, 661L, 153L, 859L, 256L, 268L, 685L,
763L, 147L, 865L, 874L, 741L, 754L, 858L, 878L, 220L, 225L,
307L, 317L, 313L, 758L, 314L, 848L, 163L, 165L, 387L, 452L,
378L, 270L, 271L, 464L, 302L, 280L, 283L, 504L, 712L, 281L,
801L), c(595L, 596L, 597L, 908L, 841L, 842L, 493L, 669L,
783L, 360L, 507L, 500L, 501L, 823L, 824L, 779L, 891L, 780L,
781L, 760L, 379L, 756L, 762L, 857L, 814L, 759L, 854L, 867L,
871L, 856L, 855L, 877L, 851L, 852L, 318L, 735L, 811L, 619L,
863L, 322L, 326L, 310L, 309L, 323L, 324L, 459L, 700L, 461L,
687L, 664L, 668L, 587L, 590L, 562L, 563L, 564L, 574L, 569L,
573L, 342L, 547L, 561L, 568L, 575L, 662L, 240L, 316L, 311L,
761L, 443L, 445L, 446L, 836L, 755L, 909L, 910L, 830L, 533L,
881L, 916L, 716L, 843L, 666L, 690L, 670L, 551L, 173L, 466L,
415L, 748L, 718L, 860L, 673L, 747L, 742L, 846L, 875L, 576L,
345L, 594L, 604L, 644L, 603L, 602L, 605L, 598L, 441L, 442L,
450L, 453L, 616L, 447L, 454L, 419L, 433L, 822L, 431L, 634L,
633L, 645L, 586L, 615L, 359L, 421L, 361L, 385L, 386L, 347L,
351L, 757L, 834L, 835L, 155L, 481L, 169L, 390L, 170L, 636L,
417L, 711L, 160L, 162L, 143L, 156L, 593L, 150L, 657L, 656L,
658L, 152L, 648L, 357L, 380L, 434L, 829L, 847L, 580L, 145L,
678L, 164L, 430L, 203L, 204L, 198L, 199L, 635L, 637L, 640L,
641L, 544L, 179L, 828L, 148L, 254L, 184L, 653L, 650L, 651L,
191L, 200L, 201L, 177L, 178L, 181L, 182L, 207L, 495L, 424L,
381L, 403L, 282L, 404L, 406L, 710L, 278L, 279L, 494L, 484L,
485L, 486L, 425L, 498L, 497L, 334L, 348L, 371L, 463L, 467L,
686L, 362L, 402L, 384L, 400L, 230L, 344L, 671L, 684L, 546L,
560L, 709L, 479L, 550L, 570L, 388L, 389L, 149L, 190L, 221L,
376L), c(1364L, 1373L, 1371L, 1372L, 1148L, 1211L, 1369L,
1370L, 1165L, 1377L, 1378L, 1112L, 1140L, 1139L, 1143L, 1019L,
1006L, 1247L, 1263L, 1191L, 1208L, 1059L, 1062L, 1115L, 1451L,
1448L, 1449L, 1113L, 1144L, 1458L, 1498L, 1499L, 955L, 968L,
1093L, 1365L, 1141L, 1265L, 1248L, 1249L, 1040L, 985L, 1119L,
1107L, 986L, 1197L, 1317L, 975L, 1155L, 1267L, 1215L, 1266L,
1106L, 1111L, 1058L, 1060L, 1457L, 1250L, 1314L, 1234L, 1146L,
1315L, 1101L, 1116L, 1310L, 1335L, 1041L, 1114L, 1124L, 954L,
1351L, 1358L, 1011L, 1409L, 1049L, 1167L, 1341L, 1278L, 1316L,
1392L, 1418L, 1307L, 1342L, 1086L, 1356L, 1432L, 1434L, 1466L,
1467L, 1479L, 1501L, 1487L, 1496L, 1495L, 1497L, 1476L, 1505L,
1506L, 1508L, 1507L, 1510L, 944L, 950L), c(1069L, 1094L,
1200L, 1306L, 981L, 1110L, 1206L, 1308L, 1047L, 1207L, 1312L,
1313L, 1109L, 1334L, 1309L, 1332L), c(1237L, 1242L, 1240L,
1243L, 1239L, 1238L, 1241L, 1343L, 1181L, 1301L, 1298L, 1300L,
1117L, 1133L, 1061L, 1419L, 1416L, 1417L, 1453L, 1311L, 1339L,
1333L, 1336L, 1028L, 1079L, 1459L, 1486L, 1192L, 1010L, 1012L,
1125L, 1199L, 1142L, 1205L, 1196L, 1198L, 951L, 1137L, 1128L,
1435L), c(930L, 942L, 922L, 940L, 941L, 943L, 920L, 921L,
923L, 925L, 927L, 928L, 924L, 926L, 931L, 932L, 937L, 938L,
939L, 935L, 936L, 929L, 933L, 934L), c(956L, 1051L, 1433L,
1468L, 1077L, 973L, 1438L, 1009L, 1158L, 1082L, 1170L, 1195L,
1177L, 1212L, 1213L, 1088L, 1153L, 1152L, 1354L, 959L, 1052L,
1176L, 1178L, 957L, 1376L, 1374L, 1375L, 1159L, 1223L, 1227L,
1268L, 1302L, 1275L, 1285L, 1016L, 1014L, 1126L, 1055L, 1102L,
1171L, 1327L, 1183L, 1274L, 1288L, 1296L, 1186L, 1297L, 1426L,
1454L, 1515L, 1078L, 989L, 990L, 980L, 1098L, 1150L, 1151L
), 78:79, c(1455L, 1475L, 1509L, 1477L, 1478L, 1494L, 1490L,
1491L, 1492L, 1427L, 1425L, 1473L, 1471L, 1472L, 1474L, 977L,
1179L, 1299L, 1290L, 1292L, 1480L, 1187L, 1295L, 1233L, 1188L,
1185L, 1293L, 1184L, 1294L, 1291L, 1175L, 1286L, 1424L, 1469L,
1502L, 1503L, 1421L, 1103L, 1488L, 1489L, 1092L, 1452L, 1350L,
1046L, 1166L, 1100L, 1305L, 1180L, 1182L, 1190L, 1289L, 979L,
961L, 1406L, 1273L, 1303L, 1456L, 1105L, 1331L, 1304L, 1407L,
994L, 1022L, 1021L, 1020L, 1025L, 1024L, 1023L, 1026L, 1216L,
1163L, 1161L, 1262L, 1156L, 1164L, 1230L, 1228L, 1224L, 80L,
953L, 962L, 974L, 992L, 1004L, 1005L, 1017L, 1031L, 1032L,
1029L, 1030L, 1057L, 982L, 1003L, 1007L, 1008L, 1042L, 1097L,
1089L, 1160L, 963L, 972L, 1070L, 1044L, 1431L, 1194L, 1204L,
993L, 1000L, 1001L, 1209L, 1210L, 1470L, 1287L, 1493L, 1075L,
1073L, 1074L, 1355L, 1090L, 1154L, 1357L, 1085L, 1087L, 1218L,
1504L, 1217L, 1174L, 1269L, 1270L, 1120L, 1272L, 1015L, 1018L,
946L, 1145L, 1397L, 971L, 1083L, 1284L, 1045L, 1048L, 1360L,
1361L, 1149L, 1282L, 1235L, 1236L, 1172L, 1367L, 1368L, 1345L,
964L, 976L, 1189L, 1281L, 1280L, 1279L, 1330L, 1328L, 1329L,
1157L, 1271L, 1324L, 1325L, 1081L, 1398L, 1391L, 1393L, 1405L,
1420L, 1104L, 1168L, 1201L, 1202L, 1338L, 1340L, 1277L, 1283L,
945L, 978L, 1422L, 1054L, 1076L, 960L, 1096L, 1091L, 1080L,
1169L, 1276L, 1050L, 1084L, 1035L, 1053L, 1095L, 1173L, 1056L,
1099L, 1138L, 997L, 1162L, 958L, 947L, 1344L), c(1222L, 1221L,
1219L, 1220L), c(1444L, 1446L, 1447L, 1445L, 1450L, 1132L,
1131L, 1130L, 1253L, 1462L, 1129L, 1254L, 965L, 966L, 967L,
1463L, 1134L, 1485L, 1483L, 1481L, 1482L, 1513L, 1465L, 1464L,
1512L, 1255L, 1258L, 1381L, 1318L, 1257L, 1323L, 1027L, 1251L,
1252L, 1214L, 1229L, 1256L, 1225L, 1226L, 1349L, 1352L, 1347L,
1348L, 1430L, 1428L, 1429L, 1436L, 1439L, 1440L, 952L, 1399L,
1389L, 1410L, 1385L, 1380L, 1401L, 1382L, 1366L, 1404L, 1403L,
1402L, 1400L, 1259L, 1415L, 1414L, 1413L, 1411L, 1412L, 1036L,
1039L, 1387L, 1386L, 1383L, 1379L, 1396L, 1394L, 1395L),
c(1322L, 1321L, 1319L, 1320L), c(998L, 1193L, 1072L, 991L,
999L, 1261L, 1326L, 1043L, 1037L, 1038L, 1353L, 1260L, 1390L,
1437L, 1346L, 1384L, 1408L, 1127L, 1423L, 1147L, 1135L, 1514L
), c(579L, 643L, 189L, 192L, 599L, 600L, 591L, 423L, 458L,
422L, 654L, 365L, 772L, 833L, 771L, 770L, 837L, 838L, 227L,
416L, 706L, 773L, 849L, 542L, 621L, 364L, 845L, 919L, 346L,
707L, 659L, 135L, 721L), c(305L, 255L, 795L, 800L, 719L,
734L, 794L, 1108L, 1136L, 1118L, 1071L, 1264L, 1203L, 1337L,
108L, 1232L, 1362L), c(674L, 796L, 864L, 235L, 724L, 408L,
731L, 723L, 722L, 548L, 168L, 797L, 132L, 205L, 649L, 180L,
582L, 330L, 157L, 465L, 499L, 536L, 516L, 883L), c(491L,
411L, 171L, 172L, 216L, 681L, 682L, 343L, 862L, 896L, 538L,
882L, 907L, 468L, 474L, 473L, 472L, 471L, 470L, 475L, 244L,
243L, 242L, 257L, 260L, 263L, 262L, 261L, 259L, 258L, 266L,
265L, 264L, 267L, 229L, 483L, 893L, 245L, 241L, 299L, 409L,
136L, 638L, 588L, 589L, 234L, 232L, 293L, 294L, 251L, 250L,
247L, 246L, 286L, 287L, 292L, 291L, 290L, 272L, 233L, 248L,
249L, 297L, 303L, 785L, 717L, 894L, 895L, 366L, 367L, 477L,
532L, 350L, 370L), c(289L, 288L, 284L, 285L), c(96L, 101L,
104L, 107L, 105L, 114L, 121L, 102L, 113L, 115L, 97L, 100L
), c(948L, 970L, 1033L, 969L, 996L, 987L, 988L, 995L, 1002L,
1034L, 1067L, 1068L, 1013L, 983L, 984L, 1460L, 1442L, 1500L,
1484L, 1246L, 1511L, 1461L, 1123L, 1443L, 1388L, 1063L, 1363L,
1064L, 1122L, 1359L, 1121L, 1231L, 1244L, 1245L, 1066L, 1065L,
1441L), c(295L, 438L, 753L, 782L, 219L, 228L, 714L, 369L,
553L, 393L, 713L, 683L, 784L, 492L, 715L, 482L, 541L, 592L,
451L, 627L, 187L, 193L, 804L, 623L, 646L, 514L, 515L, 522L,
512L, 523L, 545L, 218L, 535L, 537L, 540L), 16L, c(15L, 18L
), 1L, c(7L, 9L), 6L, 14L, 4L, 5L, 3L, 11L, 17L, 8L, 10L)
I want to sample a single value from each of the list entries for each iteration in order to create a large matrix of samples, meaning that I'll have 40 columns (the number of groups) and 5000 rows (the number of times to sample).
I tried the following:
# groups - is the list
# repetition - is 5000
as.matrix(sapply(groups, sample, repetition, TRUE))
This seems to work for a small list, but when I try it on the big list I get elements from other groups that shouldn't appear:
Example using the code above:
When you have a vector of length 1, the sampling happens from 1:x. From ?sample:
If x has length 1, is numeric (in the sense of is.numeric) and x >= 1, sampling via sample takes place from 1:x
So when you do
set.seed(123)
sample(10, 1)
#[1] 3
It is selecting 1 number from 1 to 10. To avoid that, you can check the length of the vector in sapply:
sapply(groups, function(x) if(length(x) == 1) rep(x, repetition)
else sample(x, repetition, replace = TRUE))
So when the length of the vector is 1, this returns the same number repetition times.
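The difference can be seen on a toy list. This is a small sketch with made-up groups (the second one has a single element), not the list from the question:

```r
set.seed(123)
groups <- list(c(5L, 6L, 7L), 10L)  # hypothetical groups; the second has one element
repetition <- 1000

# Naive version: the length-1 group is sampled from 1:10, not from the value 10.
naive <- sapply(groups, sample, repetition, TRUE)
range(naive[, 2])

# Guarded version: a length-1 group is simply repeated.
guarded <- sapply(groups, function(x) {
  if (length(x) == 1) rep(x, repetition) else sample(x, repetition, replace = TRUE)
})
unique(guarded[, 2])   # 10
dim(guarded)           # 1000 rows, 2 columns
```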
We may wrap single values in a sub-list to avoid the 1:x "convenience". Example:
groups <- list(2, 9, 2:9, 22:99)
groups[lengths(groups) == 1] <- lapply(groups[lengths(groups) == 1], list)
str(groups)
# List of 4
# $ :List of 1
# ..$ : num 2
# $ :List of 1
# ..$ : num 9
# $ : int [1:8] 2 3 4 5 6 7 8 9
# $ : int [1:78] 22 23 24 25 26 27 28 29 30 31 ...
repetition <- 10
set.seed(42)
r <- t(replicate(repetition, sapply(groups, sample, 1, replace=TRUE)))
r
# [,1] [,2] [,3] [,4]
# [1,] 2 9 2 46
# [2,] 2 9 3 70
# [3,] 2 9 9 92
# [4,] 2 9 6 41
# [5,] 2 9 8 24
# [6,] 2 9 4 57
# [7,] 2 9 6 26
# [8,] 2 9 5 24
# [9,] 2 9 3 45
# [10,] 2 9 8 43
Note that the sub-lists of length one are sampled as lists, and sapply simplifies them to integers internally using simplify2array (i.e. unlists them).
The manual of sample gives a solution for the case "If ‘x’ has length 1, is numeric" in its examples:
resample <- function(x, ...) x[sample.int(length(x), ...)]
set.seed(42)
repetition <- 5
as.matrix(sapply(groups, resample, repetition, TRUE))
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26] [,27] [,28] [,29] [,30] [,31] [,32] [,33] [,34] [,35] [,36] [,37] [,38] [,39] [,40]
#[1,] 778 886 525 77 106 90 153 310 1059 1110 1243 937 1051 79 1489 1220 1394 1320 1408 706 1232 180 907 288 96 1231 482 16 18 1 9 6 14 4 5 3 11 17 8 10
#[2,] 802 887 526 71 106 126 878 857 1250 1094 1205 936 989 78 1478 1222 1253 1321 1127 838 1362 723 216 284 97 1442 545 16 18 1 7 6 14 4 5 3 11 17 8 10
#[3,] 222 885 528 71 95 82 202 145 1370 1109 1196 938 1433 79 1424 1221 1214 1321 999 845 1264 516 350 288 113 983 537 16 15 1 7 6 14 4 5 3 11 17 8 10
#[4,] 237 884 528 74 98 81 280 309 1365 1313 1028 943 980 79 1277 1222 1463 1321 1437 772 1136 408 287 288 113 987 523 16 18 1 9 6 14 4 5 3 11 17 8 10
#[5,] 224 885 527 75 120 88 763 143 1114 981 1336 943 1052 79 1044 1222 1036 1320 1043 227 1071 674 473 285 104 1441 218 16 18 1 9 6 14 4 5 3 11 17 8 10
Here sample.int takes the number of items to choose from, whereas sample takes either the elements to choose from or a positive integer.
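As a small check of that difference, the sketch below compares the two calls on a length-one vector. The value 10L is only illustrative.

```r
resample <- function(x, ...) x[sample.int(length(x), ...)]

set.seed(42)
x <- 10L                     # illustrative length-one group

a <- sample(x, 5, TRUE)      # draws from 1:10
b <- resample(x, 5, TRUE)    # draws indices from 1:length(x), so always index 1
```

Here b can only contain the value 10, while a may contain any integer from 1 to 10.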
I would like to fit several latent trajectory models where the only difference between models is the number of groups (2 to 4). How can I automate this process and save the models in a list?
This is the example data
library(lcmm)
library(splines)
library(tidyverse)
data <- structure(list(id = c(1001L, 1001L, 1001L, 1001L, 1001L, 1002L,
1003L, 1004L, 1004L, 1004L, 1004L, 1004L, 1004L, 1004L, 1005L,
1005L, 1005L, 1005L, 1005L, 1006L, 1006L, 1006L, 1006L, 1006L,
1007L, 1007L, 1008L, 1008L, 1008L, 1008L, 1008L, 1009L, 1009L,
1009L, 1010L, 1010L, 1010L, 1011L, 1012L, 1012L, 1012L, 1013L,
1013L, 1014L, 1015L, 1015L, 1015L, 1016L, 1016L, 1016L, 1016L,
1016L, 1017L, 1017L, 1018L, 1020L, 1020L, 1021L, 1021L, 1021L,
1021L, 1022L, 1022L, 1023L, 1023L, 1023L, 1023L, 1023L, 1023L,
1023L, 1023L, 1023L, 1023L, 1024L, 1024L, 1024L, 1024L, 1024L,
1025L, 1025L, 1025L, 1026L, 1026L, 1026L, 1026L, 1027L, 1027L,
1028L, 1028L, 1028L, 1028L, 1028L, 1028L, 1028L, 1029L, 1029L,
1029L, 1029L, 1029L, 1029L, 1030L, 1030L, 1030L, 1030L, 1030L,
1030L, 1030L, 1030L, 1031L, 1031L, 1031L, 1031L, 1032L, 1032L,
1032L, 1032L, 1032L, 1033L, 1033L, 1033L, 1033L, 1034L, 1034L,
1034L, 1034L, 1034L, 1035L, 1035L, 1036L, 1037L, 1037L, 1037L,
1037L, 1039L, 1039L, 1040L, 1040L, 1040L, 1040L, 1040L, 1040L,
1041L, 1041L, 1041L, 1041L, 1041L, 1041L, 1042L, 1042L, 1042L,
1042L, 1042L, 1042L, 1042L, 1043L, 1043L, 1043L, 1043L, 1044L,
1044L, 1044L, 1045L, 1045L, 1045L, 1045L, 1045L, 1045L, 1047L,
1048L, 1048L, 1049L, 1049L, 1049L, 1049L, 1051L, 1051L, 1052L,
1052L, 1052L, 1052L, 1052L, 1053L, 1053L, 1053L, 1053L, 1053L,
1054L, 1054L, 1054L, 1054L, 1054L, 1054L, 1054L, 1054L, 1056L,
1056L, 1056L, 1056L, 1057L, 1057L, 1058L, 1058L, 1058L, 1058L,
1058L, 1060L, 1060L, 1060L, 1061L, 1061L, 1061L, 1061L, 1061L,
1062L, 1062L, 1062L, 1062L, 1062L, 1063L, 1063L, 1063L, 1064L,
1064L, 1064L, 1064L, 1065L, 1065L, 1066L, 1066L, 1066L, 1066L,
1066L, 1066L, 1067L, 1067L, 1067L, 1068L, 1068L, 1068L, 1068L,
1068L, 1068L, 1068L, 1069L, 1070L, 1070L, 1070L, 1071L, 1071L,
1071L, 1072L, 1072L, 1072L, 1072L, 1072L, 1073L, 1073L, 1073L,
1073L, 1074L, 1074L, 1074L, 1075L, 1075L, 1075L, 1075L, 1075L,
1075L, 1076L, 1076L, 1076L, 1077L, 1077L, 1077L, 1077L, 1077L,
1077L, 1078L, 1078L, 1078L, 1078L, 1078L, 1078L, 1078L, 1080L,
1080L, 1080L, 1080L, 1081L, 1081L, 1082L, 1082L, 1082L, 1083L,
1083L, 1084L, 1085L, 1085L, 1085L, 1085L, 1085L, 1085L, 1086L,
1086L, 1086L, 1087L, 1087L, 1087L, 1087L, 1087L, 1087L, 1087L,
1087L, 1088L, 1088L, 1088L, 1088L, 1089L, 1089L, 1089L, 1089L,
1089L, 1090L, 1090L, 1091L, 1091L, 1091L, 1091L, 1091L, 1092L,
1092L, 1092L, 1092L, 1092L, 1093L, 1093L, 1093L, 1093L, 1094L,
1094L, 1094L, 1094L, 1094L, 1095L, 1095L, 1095L, 1095L, 1096L,
1097L, 1097L, 1098L, 1098L, 1098L, 1098L, 1098L, 1099L, 1099L,
1099L, 1099L, 1099L, 1099L, 1099L, 1099L, 1100L, 1100L, 1100L,
1101L, 1101L, 1101L, 1101L, 1103L, 1103L, 1103L, 1103L, 1103L,
1103L, 1103L, 1104L, 1104L, 1104L, 1104L, 1105L, 1105L, 1105L,
1106L, 1106L, 1106L, 1106L, 1106L, 1106L, 1106L, 1106L, 1106L,
1107L, 1108L, 1110L, 1111L, 1112L, 1117L, 1123L), y = c(1934.047646,
1075.598345, 1956.214821, 2000.38538, 2000.38538, 732.315937,
3119.86, 624.951231, 791.2764892, 1884.530826, 624.951231, 1047.57,
1047.57, 791.2764892, 1238.306103, 1555.042976, 2547.870529,
2547.870529, 2467.385, 1181.635212, 1181.635212, 565.306282,
2016.027874, 2016.027874, 712.6134567, 635.2537841, 2167.362267,
2575.574188, 2167.362267, 2480.028259, 2575.574188, 2875.363243,
1180.139938, 2828.037147, 3017.119362, 2722.940933, 2167.92,
2409.652458, 2245.442558, 724.1520328, 635.6034756, 1649.08326,
966.8182507, 865.2717723, 1570.23, 916.1300105, 1180.999973,
2351.32885, 2418.851707, 2290.038887, 2224.060562, 2509.52, 1174.589081,
1540.219376, 2692.26, 1300.899734, 1100.650177, 1786.628242,
1705.842979, 543.8596134, 1786.628242, 2115.374241, 2331.46,
875.949604, 2241.945103, 2319.666939, 2316.220234, 719.7139549,
2042.803307, 719.7139549, 1132.977503, 875.949604, 2316.220234,
1737.18, 1351.629826, 1291.44593, 1291.44593, 1108.26586, 1028.979719,
1291.44593, 2068.934227, 2440.784416, 1036.72, 894.6663704, 2449.184731,
1109.9, 672.9310664, 2072.320354, 2114.215416, 2114.215416, 1805.422001,
2461.18, 2101.374248, 2105.879, 1600.086481, 2866.84, 1600.086481,
2807.311, 3055.569931, 1600.086481, 2602.287521, 2690.007614,
620.5975037, 2608.4, 2722.3, 2713.66185, 2608.4, 1590.002, 2198.211,
2488.097725, 2198.211, 2322.616348, 2627.1, 2418.328346, 2601.661034,
531.7369251, 811.9494571, 884.31, 768.0526981, 652.1271248, 768.0526981,
2767.479, 1047.144354, 1047.144354, 1995.119, 1995.119, 707.6093158,
707.6093158, 1120.650104, 3036.591904, 3036.591904, 3081.86,
1193.583691, 2056.569244, 1823.155, 1238.948124, 2124.685, 887.20438,
1823.155, 2056.569244, 2056.569244, 2560.155342, 3095.923164,
3095.923164, 3003.729011, 2861.12, 2560.155342, 2735.26, 822.8209591,
1648.951, 1648.951, 1648.951, 822.8209591, 906.7692623, 582.787096,
1286.45, 797.2365359, 2566.770554, 2666.41, 2666.41, 2045.320816,
2401.21, 2401.21, 2583.2, 2581.32, 2622.357, 2581.32, 2588.462498,
442.433671, 1251.627064, 406.2565479, 2108.787437, 983.1101169,
2102.085403, 1155.713411, 1909.797131, 2871.55, 2711.07, 2883.22245,
2883.22245, 2711.07, 3027.103172, 3108.21537, 3007.87294, 3208.963631,
3108.21537, 2617.91, 2457.464466, 2890.51, 2698.48214, 2700.723,
2700.723, 2817.668579, 2700.723, 1349.90691, 1476.19994, 1552.95,
1349.90691, 925.8325004, 1258.28, 840.1875095, 2405.175911, 840.1875095,
1056.678543, 1571.936, 1210.89, 1210.89, 673.7005405, 687.7842464,
1016.86, 1217.866, 1493.791817, 2246.726913, 1054.821, 1054.821,
563.6580887, 1054.821, 1540.429863, 2209.006493, 1437.835186,
2191.308, 1412.128944, 2724.164597, 2791.705185, 2727.774208,
2070.451198, 866.7974147, 1661.082638, 2108.271309, 2411.515434,
2342.026085, 2071.06, 2258.321014, 1537.06, 760.6319065, 867.7596569,
1907.60466, 1770.658, 760.6319065, 912.8781966, 912.8781966,
912.8781966, 1257.222706, 2586.922356, 1608.28, 962.5674305,
1085.451181, 2539.218132, 2535.526085, 2561.60054, 1600.198,
2100.048149, 758.3851737, 758.3851737, 2643.373329, 367.7795143,
866.0683727, 718.5049658, 866.0683727, 1906.694649, 2291.48,
2190.560314, 744.1710777, 1498.981777, 2460.912292, 590.1345787,
2487.559135, 1855.601353, 660.9104843, 1116.08, 792.929533, 708.8373737,
2272.232933, 1801.729801, 2299.800095, 2272.232933, 2299.800095,
1895.828438, 1757.75, 1050.279345, 1757.75, 1326.09478, 1326.09478,
1633.119305, 1558, 1167.971405, 1828.16, 1788.571758, 2175.469,
1071.039494, 941.6030864, 2053.067215, 1461.02132, 1597.646778,
1885.321567, 2195.704372, 2195.704372, 1675.768558, 3157.550789,
1565.173126, 2195.704372, 3157.550789, 2404.836883, 2541.045593,
585.7223682, 2465.177761, 2678.462074, 500.3733997, 2465.177761,
781.342, 898.3551559, 2465.177761, 2465.177761, 1807.02, 1418.888027,
1797.36, 1807.02, 2200.06, 2218.369926, 2200.06, 1986.642735,
2088.292, 2069.139, 1507.901432, 2061.395798, 2075.164864, 2081.913219,
2081.913219, 483.8579493, 1857.88, 2578.772636, 1857.88, 1857.88,
1039.632153, 2288.28, 2288.28, 1831.349922, 2349.23, 933.1002788,
2626.298935, 1521.744, 933.1002788, 2626.298935, 1984.760715,
2450.333, 1732.339031, 1984.760715, 2731.9, 869.2320918, 1785.72,
1922.798, 3081.28, 1508.8, 2421.288597, 1922.798, 1268.074959,
1569.05, 1808.115, 1569.05, 1268.074959, 2165.724808, 2165.724808,
1808.115, 2084.149837, 2693.027184, 2464.489, 2607.653496, 1012.837271,
1012.837271, 2673.190872, 2635.290516, 2773.42, 2635.290516,
2654.772674, 2377.905655, 2679.014969, 2654.772674, 1226.40016,
1470.69, 1273.789799, 2294.926086, 1226.40016, 1470.69, 1273.789799,
1873.817, 2274.930534, 2317.429165, 959.1709613, 1328.159428,
1328.159428, 1328.159428, 959.1709613, 1630.28, 1610.54982, 2507.05302,
750.467966, 750.467966, 821.2255058, 802.8240452, 2829.47879),
age = c(31.54004107, 11.95071869, 27.88501027, 27.88501027,
25.07871321, 10.90759754, 25.70020534, 9.560574949, 11.17864476,
15.8384668, 9.560574949, 11.23613963, 14.01232033, 10.54620123,
12.89527721, 14.52977413, 24.96919918, 24.72005476, 23.95893224,
13.31690623, 11.52087611, 9.927446954, 22.10814511, 16.44353183,
10.90759754, 7.991786448, 17.26488706, 23.95893224, 15.66872005,
17.63723477, 24.72005476, 30.97330595, 11.52087611, 17.5633128,
30.11088296, 23.31279945, 17.26488706, 20.58590007, 28.27926078,
11.66324435, 9.927446954, 13.92744695, 11.20328542, 12.70362765,
13.52498289, 12.21355236, 13.80150582, 22.81724846, 39.3045859,
16.62696783, 22.63107461, 29.86447639, 12.54483231, 14.42299795,
34.27789185, 12.91170431, 12.25462012, 21.81245722, 21.81245722,
10.05065024, 23.6659822, 16.22450376, 28.74743326, 12.70362765,
35.43052704, 21.21013005, 19.28542094, 12.77207392, 16.59411362,
12.12867899, 11.29637235, 11.81930185, 19.04449008, 19.93429158,
16.14236824, 12.85420945, 13.21560575, 11.61396304, 11.85763176,
13.3798768, 17.42915811, 24.41341547, 13.08418891, 11.6659822,
24.41341547, 12.06297057, 10.22861054, 26.15468857, 21.71937029,
20.1889117, 12.60232717, 25.39904175, 30.72689938, 19.22245038,
14.45037645, 24.77207392, 13.47570157, 17.87816564, 27.52635181,
15.16221766, 19.68514716, 21.67282683, 9.062286105, 20.43805613,
21.67282683, 21.24024641, 20.70362765, 13.5687885, 17.13347023,
28.11498973, 24.16974675, 18.19575633, 27.73442847, 15.52361396,
20.70362765, 11.76728268, 10.98699521, 11.51540041, 9.902806297,
13.05407255, 8.703627652, 25.60164271, 10.59000684, 10.59000684,
14.45859001, 14.05886379, 10.88295688, 10.75427789, 10.59000684,
26.50513347, 18.83093771, 22.86379192, 11.8384668, 15.04449008,
15.42505133, 14.14099932, 28.06844627, 11.51540041, 14.66119097,
13.79055441, 15.37850787, 22.58179329, 22.86379192, 30.0752909,
21.85900068, 25.60164271, 15.29089665, 26.79534565, 11.68514716,
15.42505133, 15.58384668, 15.08555784, 14.11909651, 11.6659822,
10.21765914, 12.1670089, 10.50239562, 23.3045859, 15.92607803,
22.58179329, 16.65982204, 20.58590007, 39.3045859, 32.56947296,
16.90349076, 25.12799452, 17.88364134, 19.46338125, 8.736481862,
14.14099932, 8.736481862, 17.68104038, 14.54893908, 19.22245038,
12.98562628, 22.45311431, 18.83093771, 38.68856947, 26.50513347,
25.44010951, 28.70910335, 19.21697467, 30.0752909, 26.50513347,
29.45106092, 33.31690623, 16.68172485, 15.816564, 24.89801506,
15.816564, 18.7761807, 18.4366872, 19.45790554, 19.78370979,
14.98973306, 15.89869952, 29.06502396, 16.14236824, 10.74880219,
13.47843943, 10.5982204, 24.61875428, 10.74880219, 12.47364819,
16.95277207, 12.41889117, 13.44832307, 9.984941821, 9.451060917,
12.59137577, 13.38261465, 15.14852841, 21.65913758, 12.57494867,
12.40520192, 10.75701574, 15.16495551, 15.67419576, 22.52703628,
13.31143053, 16.71457906, 12.98288843, 32.16974675, 25.3798768,
30.57084189, 22.14647502, 11.43874059, 13.25119781, 18.48049281,
25.81519507, 24.78028747, 17.85626283, 27.70704997, 13.28952772,
8.703627652, 11.61396304, 35.04996578, 15.61943874, 8.703627652,
13.33333333, 10.56810404, 11.34017796, 13.5797399, 28.79671458,
12.56673511, 13.33333333, 12.55578371, 30.80082136, 23.63039014,
29.66461328, 13.25119781, 17.46748802, 8.703627652, 8.703627652,
21.21013005, 9.768651608, 13.46748802, 10.75427789, 13.24298426,
26.87474333, 27.43326489, 20.6899384, 10.0752909, 13.37713895,
28.38056126, 8.911704312, 24.62149213, 14.32443532, 10.24229979,
13.87268994, 10.54620123, 11.44421629, 21.68377823, 15.61943874,
27.97809719, 28.90075291, 28.90075291, 24.64339493, 14.32443532,
10.61190965, 15.8110883, 14.25051335, 14.25051335, 13.64818617,
26.05338809, 13.69746749, 23.98083504, 16.68172485, 20.42162902,
12.68172485, 11.51813826, 16.65982204, 14.32443532, 15.49897331,
35.04996578, 18.70225873, 17.47570157, 14.66666667, 26.83915127,
13.29226557, 18.14647502, 25.70020534, 14.67761807, 16.61601643,
9.812457221, 15.96714579, 24.41341547, 8.911704312, 17.61806982,
11.87953457, 11.80561259, 19.15400411, 17.61806982, 15.70704997,
12.35318275, 18.12457221, 16.8733744, 32.02464066, 32.02464066,
25.30047912, 16.13415469, 19.37850787, 26.50513347, 15.89869952,
13.79055441, 25.42368241, 16.05201916, 15.43874059, 9.158110883,
14.39014374, 22.12183436, 15.70704997, 15.35934292, 11.44421629,
28.45995893, 17.06502396, 14.39014374, 26.32991102, 12.38056126,
16.42436687, 13.37713895, 11.70978782, 17.62628337, 16.13415469,
17.61806982, 15.11019849, 14.09993155, 21.89185489, 13.80150582,
16.8733744, 17.73305955, 25.55509925, 14.75975359, 24.03559206,
14.36002738, 12.73100616, 16.09034908, 18.12457221, 15.11019849,
13.69472964, 23.03901437, 16.94182067, 15.70704997, 13.99315537,
21.89185489, 15.65776865, 19.25530459, 10.43394935, 12.72826831,
24.41341547, 24.25735797, 37.41820671, 37.41820671, 25.25393566,
24.78028747, 25.25393566, 37.41820671, 12.11772758, 14.19575633,
14.091718, 15.10746064, 13.16906229, 12.09856263, 13.3798768,
14.39014374, 36.3504449, 22.68035592, 11.21149897, 12.73100616,
13.34702259, 14.5982204, 11.31827515, 15.14579055, 15.44969199,
15.65776865, 12.12867899, 12.43531828, 12.72005476, 14.11909651,
24.25735797)), row.names = c(7L, 303L, 323L, 372L, 391L,
240L, 311L, 38L, 46L, 94L, 149L, 154L, 185L, 362L, 40L, 70L,
98L, 262L, 305L, 73L, 74L, 77L, 306L, 374L, 104L, 397L, 14L,
43L, 188L, 248L, 370L, 50L, 101L, 143L, 25L, 155L, 251L, 37L,
173L, 208L, 263L, 49L, 383L, 389L, 30L, 237L, 353L, 156L, 283L,
288L, 302L, 325L, 33L, 158L, 159L, 35L, 360L, 57L, 128L, 204L,
387L, 300L, 365L, 16L, 51L, 82L, 85L, 93L, 148L, 150L, 232L,
242L, 287L, 32L, 62L, 200L, 285L, 290L, 193L, 352L, 398L, 54L,
175L, 203L, 324L, 69L, 195L, 92L, 106L, 141L, 189L, 218L, 347L,
394L, 23L, 24L, 120L, 166L, 257L, 349L, 6L, 118L, 235L, 266L,
269L, 275L, 282L, 390L, 122L, 153L, 330L, 378L, 53L, 88L, 229L,
241L, 314L, 135L, 278L, 332L, 384L, 64L, 168L, 207L, 212L, 359L,
329L, 338L, 130L, 67L, 108L, 286L, 316L, 182L, 254L, 113L, 215L,
247L, 273L, 322L, 336L, 27L, 102L, 162L, 171L, 270L, 326L, 19L,
205L, 210L, 307L, 333L, 358L, 375L, 41L, 111L, 179L, 226L, 2L,
277L, 367L, 68L, 83L, 147L, 180L, 260L, 354L, 144L, 81L, 342L,
103L, 217L, 321L, 376L, 131L, 280L, 39L, 267L, 291L, 301L, 400L,
11L, 36L, 152L, 177L, 377L, 21L, 201L, 236L, 281L, 312L, 331L,
355L, 369L, 8L, 176L, 202L, 385L, 45L, 327L, 12L, 138L, 151L,
157L, 233L, 95L, 258L, 279L, 224L, 239L, 243L, 310L, 328L, 63L,
191L, 214L, 227L, 356L, 80L, 110L, 366L, 97L, 107L, 293L, 373L,
117L, 335L, 22L, 160L, 209L, 221L, 230L, 268L, 55L, 163L, 284L,
5L, 10L, 76L, 132L, 222L, 256L, 399L, 228L, 127L, 343L, 357L,
133L, 259L, 334L, 261L, 341L, 382L, 393L, 395L, 213L, 219L, 249L,
289L, 44L, 126L, 368L, 42L, 72L, 196L, 297L, 308L, 320L, 84L,
137L, 172L, 60L, 129L, 142L, 186L, 197L, 319L, 15L, 109L, 115L,
116L, 125L, 199L, 223L, 190L, 245L, 346L, 396L, 146L, 364L, 1L,
29L, 192L, 112L, 170L, 315L, 164L, 225L, 231L, 255L, 274L, 345L,
65L, 96L, 264L, 4L, 28L, 31L, 59L, 87L, 250L, 271L, 295L, 161L,
198L, 265L, 339L, 18L, 26L, 114L, 124L, 174L, 145L, 304L, 105L,
119L, 140L, 238L, 381L, 48L, 52L, 71L, 351L, 371L, 244L, 253L,
294L, 340L, 20L, 75L, 86L, 165L, 167L, 47L, 89L, 298L, 318L,
211L, 350L, 380L, 66L, 79L, 90L, 234L, 309L, 61L, 99L, 139L,
276L, 299L, 344L, 348L, 361L, 313L, 337L, 379L, 9L, 58L, 181L,
187L, 17L, 100L, 121L, 123L, 184L, 206L, 220L, 178L, 292L, 386L,
392L, 194L, 252L, 272L, 3L, 56L, 134L, 136L, 183L, 216L, 246L,
296L, 363L, 169L, 388L, 78L, 34L, 13L, 91L, 317L), class = "data.frame")
The models I am trying to automate are the following (the only parameter that varies between models is ng):
library(splines)  # provides ns()
lcmm2g <- lcmm::hlme(fixed = y ~ 1 + ns(age, df = 3),
                     mixture = ~ 1 + ns(age, df = 3),
                     random = ~ 1 + age,
                     ng = 2, nwg = TRUE,
                     idiag = FALSE,
                     data = dat, subject = "id")
lcmm3g <- lcmm::hlme(fixed = y ~ 1 + ns(age, df = 3),
                     mixture = ~ 1 + ns(age, df = 3),
                     random = ~ 1 + age,
                     ng = 3, nwg = TRUE,
                     idiag = FALSE,
                     data = dat, subject = "id")
lcmm4g <- lcmm::hlme(fixed = y ~ 1 + ns(age, df = 3),
                     mixture = ~ 1 + ns(age, df = 3),
                     random = ~ 1 + age,
                     ng = 4, nwg = TRUE,
                     idiag = FALSE,
                     data = dat, subject = "id")
I guess you mean the following:
library(splines)  # provides ns()
ng = 2:4
# Fit one model per number of groups; the example data frame is called dat
res = lapply(ng, function(x) lcmm::hlme(fixed = y ~ 1 + ns(age, df = 3),
                                        mixture = ~ 1 + ns(age, df = 3),
                                        random = ~ 1 + age,
                                        ng = x, nwg = TRUE,
                                        idiag = FALSE,
                                        data = dat, subject = "id"))
names(res) = ng
res$`3` # the model fitted with 3 groups
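Once the fits are in a named list, picking the one with the lowest BIC and predicting from it follows a simple pattern. The sketch below is only an illustration of that pattern: it uses lm() fits on the built-in cars data as stand-ins so that it runs without lcmm, and the names fits, bics, best and newdat are made up for the example. For hlme fits, the BIC should be available as the BIC element of each fitted object and lcmm offers predictY() for predictions, but both points are assumptions here and worth checking against the lcmm documentation.

```r
library(splines)

# Stand-in candidates: lm() fits with different spline df (NOT hlme fits).
dfs <- 2:4
fits <- lapply(dfs, function(k) lm(dist ~ ns(speed, df = k), data = cars))
names(fits) <- dfs

# One BIC per candidate; for hlme objects one would presumably
# use sapply(res, function(m) m$BIC) instead (assumption).
bics <- sapply(fits, BIC)
best <- fits[[which.min(bics)]]

# Predict from the selected model on new data.
newdat <- data.frame(speed = c(10, 15, 20))
pred <- predict(best, newdata = newdat)
```

Keeping the models in a list indexed by name makes the selection a single which.min() call, instead of comparing lcmm2g, lcmm3g and lcmm4g by hand.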
I have a list in which each entry is a vector of indices, for example:
list(c(563L, 688L, 630L, 160L, 568L, 908L, 457L, 798L, 3L, 558L,
56L, 389L, 506L, 106L, 807L, 556L, 809L, 63L, 343L, 242L, 470L,
894L, 804L, 970L, 406L, 881L, 893L, 952L, 126L, 827L, 282L, 910L,
61L, 66L, 763L, 787L, 337L, 41L, 712L, 144L, 450L, 12L, 200L,
574L, 945L, 236L, 336L, 684L, 280L, 721L, 233L, 686L, 64L, 504L,
174L, 934L, 40L, 850L, 26L, 799L, 853L, 978L), c(85L, 564L, 591L,
662L, 377L, 536L, 325L, 402L, 72L, 410L, 687L, 216L, 603L, 67L,
794L, 388L, 627L, 376L, 863L, 491L, 598L, 861L, 991L, 651L, 670L,
401L, 459L, 39L, 997L, 806L, 623L, 954L), c(427L, 791L, 212L,
779L, 657L, 740L, 800L, 838L, 104L, 985L, 167L, 486L, 685L, 739L,
60L, 862L, 130L, 134L, 175L, 375L, 683L, 885L, 575L, 859L, 341L,
726L, 472L, 802L, 76L, 424L, 177L, 624L, 189L, 334L, 378L, 329L,
581L, 224L, 851L, 218L, 993L, 678L, 248L, 365L, 188L, 774L, 58L,
813L, 514L, 59L, 777L, 485L, 606L, 480L, 826L, 350L, 608L, 27L,
661L, 775L, 340L, 10L, 207L, 260L, 483L, 150L, 205L), c(138L,
587L, 165L, 1L, 722L, 300L, 500L, 535L, 832L, 392L, 432L, 139L,
744L, 676L, 839L, 107L, 769L, 589L, 647L, 548L, 704L, 197L, 689L,
111L, 342L, 319L, 567L, 17L, 925L, 5L, 116L, 493L, 241L, 965L
), c(89L, 440L, 228L, 884L, 88L, 147L, 413L, 821L, 70L, 95L,
71L, 917L, 463L, 990L, 672L, 981L, 765L, 937L, 75L, 766L, 374L,
636L, 449L, 816L, 1000L, 356L, 629L), c(421L, 650L, 453L, 666L,
584L, 717L, 220L, 605L, 182L, 811L, 157L, 523L, 28L, 527L, 737L,
812L, 263L, 675L, 132L, 879L, 438L, 451L, 883L, 950L, 114L, 466L,
348L, 711L, 209L, 887L, 593L, 949L, 349L, 764L, 595L, 736L, 660L,
801L, 118L, 877L), c(23L, 231L, 78L, 988L, 55L, 57L, 753L, 994L,
437L, 202L, 842L, 190L, 822L, 968L, 331L, 733L, 782L, 886L, 105L,
943L, 743L, 815L, 311L, 498L, 792L, 795L, 184L, 728L, 573L, 771L,
117L, 251L, 192L, 735L, 15L, 776L, 295L, 677L, 631L, 235L, 237L,
705L, 856L, 97L, 725L), c(229L, 671L, 129L, 405L, 115L, 644L,
98L, 492L, 871L, 935L, 435L, 707L, 773L, 754L, 803L, 120L, 656L,
345L, 875L, 330L, 533L, 366L, 240L, 408L, 332L, 577L, 550L, 452L,
963L, 8L, 187L, 226L, 901L, 371L, 426L, 339L, 519L, 86L, 501L,
274L, 831L), c(16L, 79L, 68L, 477L, 133L, 659L, 2L, 973L, 264L,
953L, 90L, 234L, 420L, 588L, 21L, 788L, 363L, 539L, 227L, 565L,
30L, 642L, 786L, 982L, 347L, 680L, 52L, 96L, 592L, 409L, 643L,
81L, 419L, 245L, 658L, 416L, 590L, 448L, 819L, 277L, 357L, 442L,
789L, 516L, 980L, 93L, 998L, 149L, 166L, 299L, 454L, 529L, 986L,
127L, 541L, 45L, 829L, 289L, 418L, 179L, 310L, 113L, 729L), c(429L,
781L, 303L, 434L, 83L, 259L, 387L, 583L, 393L, 770L, 246L, 428L,
947L, 976L, 31L, 382L, 710L, 944L, 164L, 868L, 373L, 899L, 74L,
468L, 614L, 701L, 221L, 645L, 268L, 785L, 293L, 632L, 24L, 749L,
283L, 741L, 796L, 915L), c(258L, 844L, 649L, 752L, 474L, 613L,
351L, 551L, 309L, 380L, 497L, 724L, 327L, 992L, 845L, 607L, 818L,
693L, 914L, 291L, 720L, 633L, 974L, 367L, 639L, 94L, 467L, 92L,
522L, 141L, 496L, 276L, 542L, 665L, 695L, 634L, 602L, 913L, 396L,
597L, 443L, 892L, 65L, 394L, 222L, 778L, 169L, 960L, 35L, 655L,
422L, 927L, 154L, 215L, 262L, 203L, 880L, 217L, 423L, 755L, 904L,
180L, 620L), c(507L, 628L, 29L, 902L, 738L, 897L, 664L, 967L,
294L, 682L, 254L, 302L, 128L, 559L, 511L, 526L, 7L, 742L, 464L,
621L, 265L, 599L, 102L, 546L, 458L, 969L, 751L, 860L, 326L, 873L,
335L, 580L, 499L, 962L, 290L, 557L, 213L, 716L, 53L, 835L, 600L,
610L, 321L, 673L, 713L, 876L, 244L, 462L, 136L, 272L, 195L, 447L,
230L, 679L, 465L, 611L, 297L, 731L, 44L, 824L, 162L, 837L), c(446L,
561L, 391L, 652L, 857L, 946L, 560L, 784L, 854L, 204L, 512L, 82L,
455L, 372L, 407L, 328L, 808L, 152L, 178L, 185L, 543L, 108L, 473L,
490L, 955L, 719L, 757L, 198L, 338L, 223L, 919L, 531L, 653L, 734L,
923L, 487L, 637L, 398L, 431L, 46L, 848L, 324L, 948L, 43L, 183L,
288L, 697L, 87L, 307L, 42L, 571L, 360L, 433L, 390L, 569L, 956L,
534L, 6L, 381L, 549L, 301L, 920L, 69L, 322L, 267L, 503L, 285L,
961L, 370L, 425L), c(344L, 959L, 364L, 552L, 11L, 481L, 287L,
891L, 692L, 762L, 47L, 292L, 358L, 810L, 942L, 730L, 746L, 638L,
750L, 759L, 761L, 140L, 444L, 191L, 805L, 306L, 691L, 170L, 715L,
508L, 984L, 461L, 911L, 103L, 938L, 718L, 928L), c(124L, 284L,
123L, 513L, 417L, 933L, 121L, 168L, 208L, 385L, 32L, 273L, 869L,
932L, 397L, 509L, 239L, 797L, 379L, 723L, 898L, 163L, 320L, 833L,
151L, 906L, 648L, 732L, 279L, 834L, 489L, 840L, 783L, 971L, 49L,
145L, 253L, 352L, 137L, 261L, 247L, 143L, 544L, 109L, 921L, 830L,
972L, 585L, 690L, 609L, 703L, 250L, 708L, 225L, 889L, 181L, 987L,
54L, 502L, 148L, 355L, 888L, 579L, 983L, 825L, 855L, 62L, 918L,
979L, 586L, 681L, 384L, 709L, 333L, 758L, 194L, 368L), c(646L,
930L, 361L, 399L, 13L, 298L, 395L, 975L, 482L, 940L, 596L, 772L,
700L, 843L, 171L, 537L, 173L, 836L, 767L, 989L, 532L, 890L, 99L,
865L, 142L, 135L, 271L, 346L, 441L, 48L, 941L, 866L, 201L, 872L,
36L, 520L, 530L, 77L, 270L), c(238L, 699L, 22L, 50L, 615L, 702L,
4L, 469L, 101L, 314L, 616L, 995L, 996L, 414L, 566L, 249L, 572L,
369L, 553L, 158L, 159L, 199L, 317L, 515L, 517L, 524L, 562L, 19L,
476L, 20L, 146L, 618L, 895L, 312L, 912L), c(768L, 939L, 578L,
849L, 196L, 640L, 323L, 635L, 304L, 318L, 874L, 977L, 488L, 619L,
155L, 905L, 9L, 112L, 484L, 847L, 313L, 900L, 494L, 727L, 625L,
931L, 119L, 846L, 186L, 219L, 471L, 696L, 404L, 460L, 668L, 896L,
439L, 964L, 275L, 756L, 411L, 878L, 538L, 669L, 478L, 570L, 255L,
547L, 257L, 841L, 37L, 576L, 456L, 663L, 525L, 817L, 612L, 820L
), c(243L, 594L, 33L, 176L, 415L, 667L, 748L, 852L, 232L, 922L,
308L, 436L, 153L, 505L, 14L, 281L, 316L, 495L, 540L, 622L, 156L,
926L, 521L, 698L, 545L, 760L, 84L, 210L, 359L, 131L, 745L, 34L,
91L, 555L, 858L, 445L, 867L, 125L, 814L, 604L, 706L, 315L, 654L,
747L, 936L, 269L, 957L), c(80L, 924L, 110L, 193L, 958L, 296L,
475L, 18L, 907L, 626L, 999L, 278L, 362L, 51L, 641L, 211L, 929L,
122L, 694L, 73L, 353L, 25L, 100L, 305L, 864L, 214L, 790L, 286L,
518L, 674L, 206L, 400L, 554L, 903L, 780L, 916L, 38L, 430L, 617L,
823L, 172L, 966L, 412L, 951L, 510L, 828L, 479L, 909L, 266L, 582L,
870L, 882L, 161L, 252L, 256L, 383L, 403L, 601L, 386L, 793L, 528L,
354L, 714L))
Each entry (vector) represents a group obtained from a clustering method.
The following function takes this list and the number of samples required, and returns a data frame in which each row is one sample and each column holds the index drawn from the corresponding group.
# Draw `repetition` indices (with replacement) from every group; one column per group.
groups_samples <- function(groups, repetition) {
  as.data.frame(sapply(groups, sample, repetition, TRUE))
}
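One caveat with this approach: when sample() is given a single number x (with x >= 1), it draws from 1:x rather than returning x itself, so a group containing exactly one index would be sampled incorrectly. A minimal sketch of a variant that avoids this by sampling positions instead of values is shown here; groups_samples_safe and toy are hypothetical names introduced only for the illustration.

```r
# Hypothetical safer variant: sample positions, then index into the group.
groups_samples_safe <- function(groups, repetition) {
  cols <- lapply(groups, function(g) {
    g[sample.int(length(g), repetition, replace = TRUE)]
  })
  names(cols) <- paste0("V", seq_along(cols))
  as.data.frame(cols)
}

# Toy input in which the second group has a single index.
toy <- list(c(1L, 5L, 6L), 42L, c(8L, 7L, 10L))
set.seed(1)
out <- groups_samples_safe(toy, 5)
```

For groups with two or more elements the two versions draw from the same set of values; they only differ for single-element groups.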
Let's take the following as an example (ll is the list shown above):
df <- groups_samples(ll, 100)
structure(list(V1 = c(106L, 686L, 721L, 200L, 970L, 910L, 556L,
807L, 908L, 568L, 688L, 389L, 56L, 470L, 630L, 893L, 574L, 236L,
804L, 798L, 721L, 934L, 763L, 807L, 457L, 568L, 684L, 934L, 787L,
450L, 688L, 64L, 568L, 934L, 894L, 558L, 568L, 343L, 450L, 853L,
336L, 64L, 712L, 144L, 934L, 144L, 809L, 763L, 457L, 763L, 558L,
457L, 688L, 763L, 504L, 66L, 406L, 881L, 3L, 343L, 556L, 799L,
712L, 568L, 61L, 799L, 908L, 688L, 64L, 881L, 236L, 787L, 66L,
160L, 853L, 343L, 809L, 200L, 827L, 893L, 894L, 799L, 470L, 406L,
337L, 389L, 63L, 952L, 236L, 337L, 763L, 41L, 945L, 144L, 56L,
978L, 233L, 978L, 881L, 910L), V2 = c(72L, 651L, 861L, 651L,
591L, 72L, 564L, 662L, 402L, 623L, 603L, 377L, 401L, 603L, 598L,
67L, 991L, 376L, 67L, 325L, 325L, 377L, 536L, 861L, 564L, 670L,
806L, 377L, 687L, 603L, 954L, 627L, 67L, 388L, 954L, 564L, 991L,
564L, 591L, 863L, 376L, 991L, 85L, 85L, 564L, 598L, 591L, 687L,
806L, 564L, 401L, 72L, 603L, 536L, 459L, 603L, 954L, 67L, 216L,
410L, 687L, 806L, 623L, 388L, 67L, 401L, 491L, 662L, 85L, 627L,
598L, 954L, 459L, 591L, 997L, 687L, 687L, 536L, 863L, 459L, 670L,
459L, 603L, 401L, 39L, 687L, 39L, 651L, 991L, 376L, 388L, 954L,
997L, 85L, 39L, 627L, 861L, 670L, 39L, 459L), V3 = c(424L, 775L,
862L, 791L, 683L, 826L, 60L, 205L, 802L, 740L, 58L, 985L, 683L,
341L, 838L, 212L, 993L, 59L, 851L, 657L, 375L, 885L, 150L, 167L,
218L, 205L, 58L, 260L, 341L, 661L, 791L, 350L, 726L, 378L, 188L,
150L, 60L, 813L, 774L, 104L, 207L, 207L, 485L, 514L, 424L, 514L,
859L, 130L, 350L, 188L, 188L, 740L, 859L, 177L, 212L, 802L, 606L,
104L, 608L, 260L, 329L, 993L, 427L, 427L, 485L, 472L, 859L, 424L,
661L, 514L, 791L, 678L, 993L, 726L, 188L, 340L, 483L, 150L, 340L,
514L, 606L, 248L, 205L, 188L, 581L, 813L, 175L, 657L, 862L, 775L,
212L, 341L, 27L, 885L, 575L, 334L, 350L, 486L, 483L, 340L), V4 = c(138L,
493L, 111L, 241L, 548L, 107L, 548L, 965L, 839L, 1L, 139L, 1L,
165L, 769L, 111L, 965L, 548L, 1L, 676L, 319L, 689L, 769L, 567L,
197L, 139L, 319L, 319L, 832L, 116L, 500L, 392L, 704L, 689L, 500L,
689L, 832L, 165L, 138L, 116L, 676L, 197L, 589L, 832L, 165L, 925L,
165L, 647L, 832L, 116L, 744L, 587L, 925L, 500L, 116L, 107L, 832L,
500L, 319L, 17L, 925L, 116L, 548L, 17L, 107L, 676L, 111L, 832L,
925L, 111L, 107L, 17L, 722L, 139L, 432L, 319L, 548L, 241L, 769L,
319L, 17L, 689L, 342L, 165L, 722L, 676L, 319L, 197L, 241L, 139L,
139L, 111L, 744L, 689L, 722L, 965L, 432L, 647L, 432L, 1L, 111L
), V5 = c(816L, 95L, 884L, 821L, 88L, 374L, 981L, 672L, 70L,
71L, 89L, 95L, 374L, 75L, 917L, 765L, 917L, 449L, 71L, 884L,
766L, 70L, 672L, 89L, 816L, 937L, 937L, 440L, 413L, 1000L, 1000L,
413L, 70L, 356L, 821L, 440L, 990L, 821L, 147L, 356L, 629L, 374L,
766L, 766L, 71L, 937L, 89L, 95L, 917L, 937L, 937L, 449L, 95L,
463L, 1000L, 440L, 821L, 884L, 917L, 816L, 89L, 1000L, 766L,
356L, 765L, 440L, 75L, 463L, 440L, 440L, 765L, 636L, 672L, 629L,
88L, 356L, 374L, 374L, 463L, 95L, 463L, 75L, 71L, 89L, 449L,
88L, 990L, 884L, 765L, 463L, 884L, 672L, 463L, 449L, 629L, 821L,
981L, 75L, 990L, 440L), V6 = c(650L, 675L, 737L, 466L, 883L,
877L, 209L, 887L, 584L, 263L, 605L, 132L, 584L, 950L, 650L, 451L,
737L, 453L, 348L, 675L, 949L, 349L, 209L, 584L, 801L, 593L, 711L,
666L, 466L, 605L, 527L, 666L, 584L, 717L, 114L, 660L, 118L, 466L,
811L, 595L, 438L, 28L, 593L, 811L, 118L, 711L, 605L, 593L, 466L,
650L, 801L, 438L, 348L, 349L, 118L, 584L, 114L, 584L, 801L, 209L,
157L, 466L, 801L, 182L, 812L, 132L, 523L, 666L, 605L, 527L, 950L,
950L, 812L, 421L, 584L, 801L, 132L, 182L, 737L, 887L, 883L, 605L,
737L, 711L, 28L, 675L, 220L, 157L, 118L, 887L, 675L, 132L, 736L,
811L, 887L, 438L, 182L, 717L, 737L, 950L), V7 = c(994L, 202L,
311L, 725L, 437L, 725L, 776L, 295L, 792L, 57L, 57L, 295L, 842L,
15L, 776L, 331L, 822L, 795L, 78L, 988L, 498L, 822L, 988L, 782L,
776L, 728L, 631L, 725L, 735L, 573L, 105L, 295L, 23L, 78L, 202L,
117L, 190L, 705L, 105L, 57L, 792L, 251L, 251L, 968L, 192L, 23L,
231L, 822L, 295L, 231L, 631L, 842L, 57L, 235L, 815L, 331L, 117L,
705L, 331L, 994L, 795L, 237L, 815L, 815L, 23L, 822L, 235L, 631L,
78L, 97L, 57L, 192L, 677L, 184L, 57L, 231L, 231L, 753L, 733L,
237L, 743L, 677L, 631L, 988L, 815L, 311L, 815L, 311L, 771L, 728L,
23L, 988L, 728L, 705L, 97L, 988L, 994L, 57L, 728L, 192L), V8 = c(754L,
875L, 332L, 935L, 86L, 339L, 86L, 644L, 339L, 501L, 803L, 229L,
644L, 426L, 550L, 129L, 330L, 129L, 229L, 86L, 773L, 803L, 129L,
901L, 452L, 8L, 229L, 98L, 129L, 366L, 187L, 8L, 773L, 187L,
229L, 8L, 98L, 935L, 98L, 345L, 754L, 533L, 332L, 550L, 240L,
875L, 773L, 229L, 426L, 754L, 120L, 803L, 129L, 901L, 901L, 644L,
345L, 707L, 707L, 773L, 533L, 120L, 332L, 330L, 803L, 86L, 803L,
8L, 226L, 345L, 871L, 240L, 550L, 963L, 330L, 345L, 226L, 533L,
366L, 452L, 803L, 405L, 803L, 405L, 550L, 577L, 8L, 339L, 901L,
577L, 330L, 229L, 330L, 656L, 452L, 330L, 519L, 226L, 366L, 435L
), V9 = c(643L, 953L, 642L, 21L, 592L, 16L, 127L, 539L, 409L,
516L, 419L, 277L, 986L, 590L, 45L, 980L, 998L, 516L, 541L, 980L,
454L, 81L, 149L, 986L, 227L, 45L, 420L, 363L, 986L, 90L, 409L,
986L, 953L, 45L, 982L, 588L, 68L, 127L, 127L, 16L, 418L, 21L,
953L, 442L, 418L, 419L, 565L, 980L, 659L, 16L, 149L, 448L, 789L,
454L, 516L, 2L, 127L, 79L, 277L, 980L, 234L, 357L, 357L, 642L,
980L, 680L, 729L, 81L, 21L, 454L, 986L, 357L, 980L, 973L, 680L,
592L, 788L, 2L, 264L, 79L, 680L, 729L, 52L, 986L, 539L, 79L,
277L, 416L, 786L, 477L, 113L, 454L, 419L, 442L, 953L, 79L, 245L,
788L, 93L, 234L), V10 = c(31L, 468L, 468L, 387L, 164L, 796L,
701L, 785L, 915L, 614L, 741L, 770L, 770L, 583L, 373L, 373L, 393L,
221L, 303L, 83L, 74L, 785L, 387L, 741L, 741L, 393L, 468L, 701L,
382L, 393L, 387L, 899L, 429L, 947L, 781L, 781L, 645L, 645L, 710L,
915L, 74L, 796L, 259L, 749L, 373L, 393L, 246L, 632L, 785L, 259L,
614L, 785L, 428L, 741L, 632L, 382L, 770L, 710L, 781L, 749L, 868L,
915L, 434L, 221L, 429L, 303L, 393L, 468L, 632L, 976L, 781L, 373L,
947L, 428L, 781L, 781L, 645L, 868L, 645L, 710L, 283L, 31L, 868L,
583L, 915L, 246L, 373L, 373L, 781L, 164L, 428L, 710L, 373L, 303L,
632L, 868L, 614L, 947L, 74L, 382L), V11 = c(351L, 154L, 423L,
496L, 818L, 913L, 665L, 913L, 380L, 720L, 542L, 380L, 634L, 551L,
258L, 818L, 634L, 474L, 222L, 639L, 974L, 755L, 262L, 665L, 522L,
217L, 927L, 351L, 755L, 914L, 380L, 65L, 844L, 633L, 613L, 222L,
649L, 892L, 752L, 423L, 755L, 169L, 904L, 309L, 639L, 276L, 217L,
394L, 291L, 522L, 203L, 720L, 35L, 422L, 724L, 423L, 720L, 914L,
180L, 327L, 92L, 422L, 258L, 467L, 724L, 620L, 665L, 367L, 639L,
443L, 892L, 724L, 141L, 422L, 327L, 396L, 92L, 309L, 844L, 258L,
914L, 634L, 497L, 222L, 141L, 880L, 467L, 443L, 496L, 913L, 394L,
217L, 35L, 396L, 35L, 880L, 351L, 755L, 474L, 215L), V12 = c(102L,
546L, 682L, 464L, 162L, 876L, 162L, 302L, 682L, 162L, 302L, 53L,
967L, 679L, 837L, 824L, 44L, 53L, 294L, 738L, 254L, 557L, 546L,
7L, 902L, 244L, 128L, 499L, 621L, 499L, 458L, 526L, 837L, 465L,
290L, 969L, 265L, 507L, 835L, 837L, 546L, 136L, 897L, 213L, 195L,
244L, 465L, 835L, 464L, 621L, 162L, 511L, 969L, 230L, 580L, 335L,
610L, 969L, 546L, 897L, 835L, 447L, 526L, 302L, 464L, 302L, 682L,
628L, 610L, 272L, 53L, 254L, 969L, 962L, 511L, 621L, 290L, 458L,
559L, 860L, 136L, 507L, 462L, 136L, 462L, 731L, 873L, 462L, 335L,
897L, 580L, 447L, 628L, 731L, 7L, 335L, 102L, 128L, 679L, 742L
), V13 = c(108L, 637L, 757L, 734L, 534L, 42L, 808L, 322L, 757L,
204L, 808L, 324L, 288L, 82L, 285L, 961L, 955L, 652L, 808L, 961L,
503L, 549L, 697L, 87L, 734L, 43L, 204L, 455L, 398L, 961L, 183L,
433L, 431L, 854L, 490L, 69L, 407L, 808L, 398L, 69L, 87L, 338L,
446L, 178L, 6L, 198L, 82L, 543L, 370L, 534L, 87L, 267L, 455L,
360L, 534L, 407L, 431L, 446L, 854L, 857L, 46L, 637L, 848L, 923L,
560L, 531L, 919L, 223L, 307L, 561L, 6L, 719L, 560L, 43L, 734L,
288L, 324L, 87L, 808L, 322L, 757L, 446L, 425L, 324L, 757L, 857L,
87L, 848L, 223L, 503L, 307L, 152L, 503L, 757L, 956L, 152L, 43L,
69L, 719L, 637L), V14 = c(746L, 805L, 191L, 47L, 508L, 508L,
715L, 461L, 928L, 750L, 140L, 746L, 364L, 552L, 287L, 984L, 481L,
715L, 762L, 959L, 750L, 344L, 959L, 959L, 306L, 911L, 103L, 638L,
759L, 761L, 750L, 444L, 692L, 692L, 761L, 481L, 552L, 942L, 810L,
938L, 306L, 762L, 344L, 942L, 344L, 364L, 552L, 891L, 11L, 103L,
762L, 287L, 891L, 358L, 730L, 959L, 750L, 191L, 718L, 959L, 358L,
306L, 287L, 692L, 746L, 461L, 750L, 170L, 358L, 911L, 805L, 938L,
481L, 759L, 750L, 140L, 715L, 959L, 928L, 692L, 461L, 750L, 306L,
762L, 691L, 306L, 287L, 481L, 170L, 746L, 810L, 762L, 358L, 292L,
750L, 191L, 47L, 942L, 344L, 191L), V15 = c(987L, 972L, 151L,
397L, 250L, 825L, 681L, 825L, 723L, 49L, 585L, 109L, 833L, 137L,
49L, 690L, 681L, 253L, 385L, 921L, 708L, 151L, 109L, 385L, 54L,
247L, 979L, 121L, 225L, 124L, 825L, 417L, 320L, 979L, 681L, 918L,
145L, 397L, 681L, 145L, 586L, 709L, 284L, 840L, 121L, 368L, 250L,
898L, 840L, 109L, 417L, 513L, 544L, 194L, 417L, 544L, 320L, 987L,
840L, 987L, 888L, 489L, 855L, 906L, 62L, 579L, 379L, 783L, 368L,
379L, 49L, 732L, 279L, 509L, 54L, 145L, 797L, 979L, 709L, 840L,
368L, 830L, 502L, 123L, 681L, 194L, 855L, 703L, 247L, 833L, 609L,
830L, 708L, 609L, 509L, 397L, 987L, 609L, 320L, 124L), V16 = c(346L,
48L, 865L, 865L, 173L, 890L, 482L, 13L, 537L, 171L, 482L, 940L,
843L, 173L, 975L, 866L, 142L, 646L, 482L, 700L, 395L, 298L, 975L,
890L, 361L, 173L, 890L, 975L, 940L, 271L, 395L, 989L, 395L, 142L,
865L, 361L, 399L, 441L, 441L, 772L, 142L, 520L, 142L, 520L, 975L,
930L, 890L, 989L, 530L, 866L, 941L, 530L, 596L, 890L, 36L, 441L,
346L, 865L, 173L, 646L, 270L, 441L, 866L, 866L, 346L, 441L, 482L,
872L, 36L, 890L, 271L, 13L, 36L, 836L, 767L, 395L, 890L, 537L,
395L, 530L, 346L, 346L, 940L, 173L, 865L, 772L, 520L, 171L, 48L,
866L, 135L, 298L, 135L, 77L, 361L, 872L, 395L, 596L, 772L, 532L
), V17 = c(912L, 146L, 312L, 22L, 618L, 317L, 618L, 199L, 369L,
101L, 515L, 4L, 476L, 699L, 517L, 317L, 159L, 517L, 553L, 616L,
995L, 314L, 317L, 314L, 562L, 101L, 249L, 369L, 615L, 562L, 476L,
702L, 312L, 312L, 515L, 101L, 159L, 572L, 101L, 618L, 895L, 317L,
616L, 618L, 572L, 562L, 4L, 517L, 312L, 312L, 249L, 699L, 312L,
158L, 469L, 20L, 524L, 476L, 572L, 249L, 50L, 19L, 249L, 912L,
469L, 476L, 101L, 146L, 616L, 618L, 476L, 20L, 146L, 249L, 50L,
101L, 158L, 517L, 238L, 515L, 895L, 553L, 702L, 146L, 312L, 517L,
158L, 895L, 517L, 101L, 314L, 238L, 22L, 146L, 317L, 895L, 469L,
912L, 369L, 572L), V18 = c(525L, 635L, 488L, 456L, 878L, 119L,
119L, 849L, 768L, 817L, 931L, 275L, 460L, 900L, 494L, 669L, 846L,
488L, 768L, 494L, 570L, 439L, 878L, 275L, 471L, 896L, 768L, 619L,
727L, 977L, 155L, 155L, 896L, 112L, 817L, 768L, 411L, 304L, 964L,
612L, 905L, 768L, 456L, 255L, 119L, 404L, 304L, 576L, 219L, 756L,
612L, 668L, 255L, 768L, 196L, 668L, 155L, 931L, 896L, 878L, 488L,
576L, 640L, 37L, 846L, 494L, 257L, 37L, 411L, 411L, 625L, 820L,
304L, 112L, 619L, 9L, 669L, 494L, 471L, 323L, 318L, 570L, 817L,
578L, 878L, 696L, 977L, 768L, 896L, 525L, 669L, 841L, 471L, 727L,
619L, 304L, 874L, 931L, 37L, 619L), V19 = c(926L, 281L, 957L,
308L, 315L, 814L, 622L, 153L, 858L, 315L, 867L, 176L, 555L, 210L,
867L, 540L, 555L, 867L, 622L, 852L, 540L, 436L, 269L, 505L, 436L,
505L, 654L, 505L, 91L, 125L, 131L, 706L, 243L, 125L, 922L, 281L,
91L, 359L, 33L, 957L, 232L, 698L, 555L, 540L, 667L, 34L, 545L,
698L, 555L, 308L, 926L, 445L, 316L, 748L, 243L, 14L, 521L, 232L,
654L, 243L, 232L, 359L, 156L, 131L, 555L, 359L, 521L, 852L, 706L,
957L, 308L, 125L, 91L, 852L, 315L, 604L, 604L, 760L, 604L, 936L,
521L, 747L, 922L, 555L, 243L, 521L, 316L, 867L, 84L, 176L, 814L,
232L, 315L, 316L, 555L, 505L, 745L, 505L, 232L, 540L), V20 = c(554L,
882L, 823L, 386L, 966L, 694L, 286L, 354L, 214L, 25L, 25L, 110L,
353L, 475L, 479L, 252L, 582L, 999L, 266L, 211L, 18L, 278L, 828L,
412L, 528L, 386L, 296L, 353L, 412L, 80L, 206L, 714L, 18L, 211L,
475L, 554L, 38L, 882L, 25L, 362L, 510L, 110L, 206L, 823L, 362L,
694L, 256L, 479L, 582L, 25L, 828L, 193L, 951L, 80L, 793L, 999L,
882L, 903L, 38L, 386L, 354L, 214L, 916L, 25L, 110L, 864L, 882L,
25L, 353L, 780L, 296L, 864L, 510L, 38L, 386L, 400L, 694L, 793L,
999L, 122L, 278L, 475L, 916L, 903L, 958L, 161L, 828L, 73L, 790L,
73L, 430L, 18L, 958L, 828L, 582L, 383L, 51L, 278L, 18L, 122L)), class = "data.frame", row.names = c(NA,
-100L))
What I would like to do now is reduce the number of entries, say from 100 to 50, where each entry is a set of indices, one from each group. I tried computing a distance matrix with several methods and choosing the most distant entries, but on inspection the result was not very informative.
Is there a way to do this, perhaps by taking the list of lists into account, or with some other, more sophisticated method?
I would appreciate any help or insights.
Edit - Clarifying the objective
Let's say I sampled 100 groups, where each group contains one element from each list of the nested list.
Some of the groups are close to others. If only 1 element differs between 2 groups, I will probably want to discard one of them, and likewise if only 2 elements differ, and so on. In the end I wish to keep the K groups which are as "distant" as possible.
It would also be nice to take into account the number of elements in each nested list, through some sort of weighting procedure.
Edit No.2
For the list list(c(1L, 5L, 6L), c(3L, 4L, 2L, 9L), c(8L, 7L, 10L)) we get the following data frame:
structure(list(V1 = c(1L, 5L, 6L, 1L, 6L, 1L, 1L, 6L, 1L, 5L,
5L, 5L, 1L, 1L, 5L, 6L, 5L, 6L, 6L, 5L, 5L, 5L, 6L, 5L, 6L, 1L,
6L, 1L, 1L, 1L, 5L, 5L, 6L, 6L, 5L, 1L, 6L, 6L, 5L, 6L, 1L, 1L,
5L, 5L, 5L, 1L, 6L, 5L, 1L, 5L, 5L, 5L, 5L, 1L, 5L, 5L, 1L, 6L,
5L, 6L, 5L, 6L, 5L, 1L, 5L, 1L, 5L, 6L, 5L, 1L, 6L, 1L, 6L, 1L,
1L, 5L, 5L, 6L, 1L, 5L, 1L, 5L, 5L, 6L, 6L, 1L, 1L, 6L, 6L, 6L,
5L, 5L, 1L, 6L, 1L, 1L, 6L, 5L, 5L, 1L), V2 = c(9L, 3L, 9L, 4L,
2L, 4L, 3L, 3L, 3L, 2L, 2L, 9L, 3L, 3L, 2L, 2L, 9L, 9L, 9L, 3L,
4L, 3L, 2L, 3L, 4L, 2L, 2L, 3L, 4L, 9L, 9L, 2L, 3L, 2L, 9L, 9L,
3L, 2L, 4L, 4L, 3L, 4L, 3L, 2L, 2L, 9L, 9L, 2L, 4L, 4L, 4L, 9L,
2L, 3L, 9L, 3L, 3L, 2L, 2L, 2L, 4L, 2L, 4L, 3L, 3L, 3L, 2L, 9L,
9L, 9L, 2L, 9L, 3L, 3L, 9L, 4L, 3L, 3L, 4L, 3L, 4L, 4L, 4L, 4L,
2L, 9L, 9L, 4L, 9L, 2L, 2L, 9L, 4L, 4L, 9L, 9L, 2L, 4L, 4L, 3L
), V3 = c(7L, 7L, 7L, 8L, 7L, 7L, 7L, 7L, 10L, 8L, 10L, 8L, 7L,
7L, 10L, 10L, 10L, 8L, 8L, 8L, 8L, 8L, 8L, 7L, 10L, 7L, 10L,
10L, 7L, 8L, 7L, 8L, 7L, 8L, 8L, 8L, 7L, 8L, 8L, 8L, 10L, 7L,
8L, 7L, 7L, 10L, 7L, 7L, 10L, 7L, 10L, 8L, 8L, 7L, 10L, 10L,
10L, 8L, 8L, 10L, 7L, 8L, 8L, 10L, 8L, 10L, 10L, 10L, 8L, 10L,
10L, 10L, 8L, 10L, 8L, 7L, 10L, 7L, 7L, 10L, 8L, 7L, 8L, 10L,
7L, 8L, 10L, 7L, 7L, 7L, 7L, 10L, 7L, 7L, 10L, 10L, 7L, 7L, 8L,
10L)), class = "data.frame", row.names = c(NA, -100L))
Running @Allan Cameron's code produces the following, although a better set of 5 exists:
V1 V2 V3
26 1 2 7
68 6 9 10
7 1 3 7
17 5 9 10
13 1 3 7
As you have described it, the concept of overall "distance" between two groups is a bit vague. It's clear that a pair like c(1, 5, 2, 6) and c(2, 9, 12, 3) are closer than the pair c(1, 5, 2, 6) and c(101, 78, 96, 54), but should there be a penalty for an exact match? Is variance important? In the absence of a clearer notion of distance, the best measure we have is the mean of each group. This is easy to obtain by rowMeans(df).
There's also some vagueness with regard to the concept of "the K furthest apart groups". Distance between groups is a function of pairs of groups, not individual groups. If K = 1, then presumably any group is fine. If K = 2, then you want the single pair of groups with the largest difference between their means. After that, it's not clear what you are looking for, but one approach would be to find the set of K groups which has the highest variance.
So if we do something like:
k <- 5
group_means <- rowMeans(df)
indices <- seq(nrow(df))
# seed the selection with the rows having the smallest and largest means
k_furthest <- c(which.min(group_means), which.max(group_means))
k_vals <- c(min(group_means), max(group_means))
# drop the selected rows from the candidate pool
group_means <- group_means[-k_furthest]
indices <- indices[-k_furthest]
# greedily add the candidate whose mean has the largest summed squared
# distance to the means already selected
while(length(k_furthest) < k)
{
  best <- which.max(rowSums(sapply(k_vals, function(x) (x - group_means)^2)))
  k_vals <- c(k_vals, group_means[best])
  k_furthest <- c(k_furthest, indices[best])
  group_means <- group_means[-best]
  indices <- indices[-best]
}
Then k_furthest will contain a set of 5 rows of the data frame, chosen greedily so that the variance between their means is as large as possible at each step. Your result would be obtained like:
df[k_furthest,]
#> V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20
#> 63 236 794 885 300 71 114 725 492 52 468 92 128 948 191 585 441 414 196 156 18
#> 51 798 536 739 704 1000 883 237 644 299 915 695 860 338 47 972 890 996 939 957 793
#> 61 41 388 624 689 672 466 55 229 454 164 542 265 338 170 32 271 314 640 922 582
#> 33 970 598 775 548 228 132 842 644 986 781 818 679 920 287 825 361 562 756 748 929
#> 12 336 216 774 107 71 801 725 492 642 74 613 297 948 306 124 646 19 439 281 122
Note though that this algorithm effectively just takes the rows with the highest and lowest means alternately on each iteration. Although this produces the largest overall collective "difference" between the samples, you might end up with some samples that are very close together, provided that they are also both very far apart from another sample. This may not be what you are looking for, and it is why it might be a good idea to specify exactly what you mean by "distance" in this context.
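One way to address that caveat, sketched under the assumption that the Manhattan distance between rows is an acceptable notion of "distance", is a greedy max-min (farthest-point) selection: start from the most distant pair, then repeatedly add the row whose nearest already-chosen row is as far away as possible. This is a different heuristic from the mean-based algorithm above and is not guaranteed to be optimal; select_spread and toy are hypothetical names used only for illustration.

```r
# Hypothetical helper: greedy max-min (farthest-point) selection of k rows.
# Assumes every column of `data` is numeric and 2 <= k <= nrow(data).
select_spread <- function(data, k, method = "manhattan") {
  stopifnot(k >= 2, k <= nrow(data))
  d <- as.matrix(dist(data, method = method))
  # seed with one of the most distant pairs (first match in column-major order)
  chosen <- as.vector(which(d == max(d), arr.ind = TRUE)[1, ])
  while (length(chosen) < k) {
    remaining <- setdiff(seq_len(nrow(d)), chosen)
    # distance from each remaining row to its nearest already-chosen row
    nearest <- apply(d[remaining, chosen, drop = FALSE], 1, min)
    # add the row that is furthest from everything chosen so far
    chosen <- c(chosen, remaining[which.max(nearest)])
  }
  chosen
}

# Made-up example data, not the data from the question
toy <- data.frame(V1 = c(1, 6, 6, 1, 5),
                  V2 = c(9, 2, 2, 9, 3),
                  V3 = c(7, 10, 7, 10, 8))
picked <- select_spread(toy, k = 3)
toy[picked, ]
```

Because each new row is chosen by its distance to the nearest selected row, two near-identical rows are unlikely to be selected together, which is the situation described in the note above.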
EDIT
With further clarification and a new example from the OP, it seems that we are looking to maximize the sum of element-wise difference between groups. This means we can do:
distances <- as.data.frame(t(sapply(1:nrow(df), function(i) {
  a <- rowSums(apply(df, 2, function(x) abs(x[i] - x)))
  c(row = i, most_distant = which.max(a), difference = max(a))
})))
This will give us a data frame which for each row tells us the most "distant" other group.
head(distances)
#> row most_distant difference
#> 1 1 16 15
#> 2 2 46 13
#> 3 3 9 14
#> 4 4 68 12
#> 5 5 46 15
#> 6 6 68 13
If we sort this according to the biggest difference, and take the first K groups mentioned in the first two columns, we will have our result:
i <- unique(c(t(distances[order(-distances$difference)[seq(k)], 1:2])))[seq(k)]
df[i,]
#> V1 V2 V3
#> 1 1 9 7
#> 16 6 2 10
#> 5 6 2 7
#> 46 1 9 10
#> 26 1 2 7
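As a side note, the summed absolute element-wise differences used in the EDIT above are Manhattan distances, so the same kind of table can be sketched with base R's dist(). The data frame toy and the name distances_alt are made up for illustration; this is a minimal sketch, not a replacement for the code above.

```r
# Made-up example data, not the data from the question
toy <- data.frame(V1 = c(1L, 6L, 6L, 1L, 5L),
                  V2 = c(9L, 2L, 2L, 9L, 3L),
                  V3 = c(7L, 10L, 7L, 10L, 8L))

# Full pairwise matrix of summed absolute element-wise differences
d <- as.matrix(dist(toy, method = "manhattan"))

# For each row: the index of the most distant other row, and that distance
distances_alt <- data.frame(row          = seq_len(nrow(toy)),
                            most_distant = apply(d, 1, which.max),
                            difference   = apply(d, 1, max))
distances_alt
```

Computing the whole matrix once also makes it easy to inspect the distance between any particular pair of groups, e.g. d[1, 2].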
I am dealing with tick data which contains price and volume (buy and sell).
I've tried the code from this post, How to label max value points in a faceted plot in R?, but still cannot solve it. I think what I want to do involves the x and y coordinates: e.g. the maximum volume (4622) occurred at price 11360, and I would like to label the point with the maximum volume 4622 with its price, 11360.
Here is my code:
ggplot(data=ts629sum) +
geom_point(mapping=aes(x=BS,y=Price)) +
geom_label(filter(BS==max(BS)) +
aes(label(sprintf(%0.2f,y)), hjust=-0.5)
I would appreciate it if someone could tell me how to solve this problem.
Below is the dataset.
ts629sum <- structure(list(Price = 11315:11528, BS = c(236L, 340L, 266L,
306L, 300L, 546L, 700L, 1106L, 1064L, 1312L, 1358L, 1126L, 876L,
1382L, 1382L, 2290L, 2292L, 2282L, 2454L, 2710L, 3082L, 2252L,
2214L, 2574L, 2498L, 3088L, 2644L, 2664L, 2558L, 2452L, 2508L,
2122L, 2188L, 2152L, 1730L, 2222L, 1210L, 1074L, 1736L, 1750L,
2340L, 2252L, 2004L, 2448L, 2590L, 4622L, 3428L, 3642L, 3628L,
3960L, 4020L, 2690L, 2110L, 1974L, 1018L, 1182L, 796L, 788L,
762L, 780L, 1442L, 1048L, 814L, 862L, 616L, 916L, 808L, 626L,
552L, 506L, 588L, 888L, 1222L, 1942L, 1300L, 1856L, 1284L, 968L,
932L, 1942L, 1320L, 1218L, 1514L, 1746L, 1886L, 3186L, 2540L,
2194L, 2314L, 2166L, 3072L, 2344L, 2238L, 2568L, 2132L, 2806L,
2606L, 2492L, 2610L, 2860L, 3754L, 2940L, 2754L, 3246L, 2912L,
4018L, 3402L, 3534L, 3374L, 3028L, 3760L, 3820L, 3822L, 3890L,
3296L, 4596L, 2780L, 2546L, 2958L, 2706L, 2990L, 2558L, 2518L,
2462L, 2110L, 2818L, 2276L, 2184L, 1828L, 1436L, 1878L, 1468L,
1464L, 1590L, 1580L, 2524L, 1586L, 1480L, 1702L, 1568L, 2490L,
2074L, 1872L, 1872L, 1274L, 2000L, 1252L, 1194L, 1422L, 1422L,
1630L, 1668L, 1798L, 2264L, 1806L, 2244L, 1480L, 2028L, 1616L,
2074L, 2066L, 1798L, 1514L, 1440L, 1116L, 1308L, 780L, 816L,
904L, 1162L, 1434L, 1042L, 1074L, 666L, 400L, 356L, 164L, 130L,
110L, 48L, 48L, 54L, 36L, 34L, 28L, 106L, 32L, 56L, 64L, 54L,
38L, 24L, 18L, 42L, 34L, 86L, 42L, 76L, 196L, 316L, 316L, 422L,
418L, 358L, 300L, 348L, 378L, 238L, 214L, 178L, 248L, 168L, 76L,
18L)), class = "data.frame", row.names = c(NA, -214L))
You can subset the data in geom_label and keep only the row with max BS.
library(ggplot2)
ggplot(data=ts629sum, aes(x=BS,y=Price, label = Price)) +
  geom_point() +
  geom_label(data = ts629sum[which.max(ts629sum$BS), ], vjust = 1.5)
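If you also want the sprintf formatting from the question, a variant along these lines might work. It is only a sketch: prices is a made-up stand-in for ts629sum with the same column names, and it relies on ggplot2 layers accepting a function as their data argument (the function receives the plot data and returns the rows to draw).

```r
library(ggplot2)

# Made-up stand-in for ts629sum (same column names, fewer rows)
prices <- data.frame(Price = 11315:11324,
                     BS = c(236L, 340L, 266L, 306L, 4622L,
                            546L, 700L, 1106L, 1064L, 1312L))

p <- ggplot(prices, aes(x = BS, y = Price)) +
  geom_point() +
  # keep only the row(s) with the maximum BS for the label layer
  geom_label(data = function(d) d[d$BS == max(d$BS), , drop = FALSE],
             aes(label = sprintf("%0.2f", Price)),
             vjust = 1.5)
p
```

Unlike which.max, the comparison d$BS == max(d$BS) labels every row that ties for the maximum.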
Hi programming fellows,
Please consider the following data frame:
df <- structure(list(date = structure(c(1251350100.288, 1251351900,
1251353699.712, 1251355500.288, 1251357300, 1251359099.712), class = c("POSIXct",
"POSIXt")), mix.ratio.csi = c(442.78316237477, 436.757082063885,
425.742872761246, 395.770804307671, 386.758335309866, 392.115887652156
), mix.ratio.licor = c(447.141491945547, 441.319548211994, 430.854166343173,
402.232640566763, 393.683007533694, 398.388336602215), ToKeep = c(FALSE,
FALSE, TRUE, TRUE, TRUE, TRUE)), .Names = c("date", "value1",
"value2", "ToKeep"), index = structure(integer(0), ToKeep = c(1L,
2L, 8L, 52L, 53L, 54L, 55L, 85L, 86L, 87L, 88L, 89L, 92L, 93L,
94L, 95L, 96L, 97L, 98L, 99L, 100L, 102L, 103L, 105L, 106L, 192L,
193L, 220L, 223L, 225L, 228L, 229L, 260L, 263L, 264L, 265L, 266L,
267L, 305L, 306L, 307L, 308L, 309L, 310L, 311L, 312L, 313L, 314L,
315L, 352L, 353L, 354L, 375L, 376L, 378L, 379L, 380L, 383L, 411L,
412L, 413L, 414L, 415L, 416L, 418L, 419L, 445L, 453L, 463L, 464L,
465L, 466L, 467L, 468L, 497L, 504L, 547L, 548L, 549L, 586L, 589L,
630L, 631L, 632L, 633L, 634L, 635L, 636L, 644L, 645L, 646L, 647L,
648L, 649L, 650L, 651L, 674L, 675L, 676L, 677L, 678L, 682L, 687L,
690L, 691L, 724L, 725L, 726L, 727L, 728L, 729L, 730L, 731L, 732L,
733L, 734L, 735L, 736L, 739L, 740L, 741L, 742L, 768L, 771L, 772L,
773L, 774L, 775L, 776L, 777L, 778L, 779L, 3L, 4L, 5L, 6L, 7L,
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L,
22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L, 34L,
35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L, 47L,
48L, 49L, 50L, 51L, 56L, 57L, 58L, 59L, 60L, 61L, 62L, 63L, 64L,
65L, 66L, 67L, 68L, 69L, 70L, 71L, 72L, 73L, 74L, 75L, 76L, 77L,
78L, 79L, 80L, 81L, 82L, 83L, 84L, 90L, 91L, 101L, 104L, 107L,
108L, 109L, 110L, 111L, 112L, 113L, 114L, 115L, 116L, 117L, 118L,
119L, 120L, 121L, 122L, 123L, 124L, 125L, 126L, 127L, 128L, 129L,
130L, 131L, 132L, 133L, 134L, 135L, 136L, 137L, 138L, 139L, 140L,
141L, 142L, 143L, 144L, 145L, 146L, 147L, 148L, 149L, 150L, 151L,
152L, 153L, 154L, 155L, 156L, 157L, 158L, 159L, 160L, 161L, 162L,
163L, 164L, 165L, 166L, 167L, 168L, 169L, 170L, 171L, 172L, 173L,
174L, 175L, 176L, 177L, 178L, 179L, 180L, 181L, 182L, 183L, 184L,
185L, 186L, 187L, 188L, 189L, 190L, 191L, 194L, 195L, 196L, 197L,
198L, 199L, 200L, 201L, 202L, 203L, 204L, 205L, 206L, 207L, 208L,
209L, 210L, 211L, 212L, 213L, 214L, 215L, 216L, 217L, 218L, 219L,
221L, 222L, 224L, 226L, 227L, 230L, 231L, 232L, 233L, 234L, 235L,
236L, 237L, 238L, 239L, 240L, 241L, 242L, 243L, 244L, 245L, 246L,
247L, 248L, 249L, 250L, 251L, 252L, 253L, 254L, 255L, 256L, 257L,
258L, 259L, 261L, 262L, 268L, 269L, 270L, 271L, 272L, 273L, 274L,
275L, 276L, 277L, 278L, 279L, 280L, 281L, 282L, 283L, 284L, 285L,
286L, 287L, 288L, 289L, 290L, 291L, 292L, 293L, 294L, 295L, 296L,
297L, 298L, 299L, 300L, 301L, 302L, 303L, 304L, 316L, 317L, 318L,
319L, 320L, 321L, 322L, 323L, 324L, 325L, 326L, 327L, 328L, 329L,
330L, 331L, 332L, 333L, 334L, 335L, 336L, 337L, 338L, 339L, 340L,
341L, 342L, 343L, 344L, 345L, 346L, 347L, 348L, 349L, 350L, 351L,
355L, 356L, 357L, 358L, 359L, 360L, 361L, 362L, 363L, 364L, 365L,
366L, 367L, 368L, 369L, 370L, 371L, 372L, 373L, 374L, 377L, 381L,
382L, 384L, 385L, 386L, 387L, 388L, 389L, 390L, 391L, 392L, 393L,
394L, 395L, 396L, 397L, 398L, 399L, 400L, 401L, 402L, 403L, 404L,
405L, 406L, 407L, 408L, 409L, 410L, 417L, 420L, 421L, 422L, 423L,
424L, 425L, 426L, 427L, 428L, 429L, 430L, 431L, 432L, 433L, 434L,
435L, 436L, 437L, 438L, 439L, 440L, 441L, 442L, 443L, 444L, 446L,
447L, 448L, 449L, 450L, 451L, 452L, 454L, 455L, 456L, 457L, 458L,
459L, 460L, 461L, 462L, 469L, 470L, 471L, 472L, 473L, 474L, 475L,
476L, 477L, 478L, 479L, 480L, 481L, 482L, 483L, 484L, 485L, 486L,
487L, 488L, 489L, 490L, 491L, 492L, 493L, 494L, 495L, 496L, 498L,
499L, 500L, 501L, 502L, 503L, 505L, 506L, 507L, 508L, 509L, 510L,
511L, 512L, 513L, 514L, 515L, 516L, 517L, 518L, 519L, 520L, 521L,
522L, 523L, 524L, 525L, 526L, 527L, 528L, 529L, 530L, 531L, 532L,
533L, 534L, 535L, 536L, 537L, 538L, 539L, 540L, 541L, 542L, 543L,
544L, 545L, 546L, 550L, 551L, 552L, 553L, 554L, 555L, 556L, 557L,
558L, 559L, 560L, 561L, 562L, 563L, 564L, 565L, 566L, 567L, 568L,
569L, 570L, 571L, 572L, 573L, 574L, 575L, 576L, 577L, 578L, 579L,
580L, 581L, 582L, 583L, 584L, 585L, 587L, 588L, 590L, 591L, 592L,
593L, 594L, 595L, 596L, 597L, 598L, 599L, 600L, 601L, 602L, 603L,
604L, 605L, 606L, 607L, 608L, 609L, 610L, 611L, 612L, 613L, 614L,
615L, 616L, 617L, 618L, 619L, 620L, 621L, 622L, 623L, 624L, 625L,
626L, 627L, 628L, 629L, 637L, 638L, 639L, 640L, 641L, 642L, 643L,
652L, 653L, 654L, 655L, 656L, 657L, 658L, 659L, 660L, 661L, 662L,
663L, 664L, 665L, 666L, 667L, 668L, 669L, 670L, 671L, 672L, 673L,
679L, 680L, 681L, 683L, 684L, 685L, 686L, 688L, 689L, 692L, 693L,
694L, 695L, 696L, 697L, 698L, 699L, 700L, 701L, 702L, 703L, 704L,
705L, 706L, 707L, 708L, 709L, 710L, 711L, 712L, 713L, 714L, 715L,
716L, 717L, 718L, 719L, 720L, 721L, 722L, 723L, 737L, 738L, 743L,
744L, 745L, 746L, 747L, 748L, 749L, 750L, 751L, 752L, 753L, 754L,
755L, 756L, 757L, 758L, 759L, 760L, 761L, 762L, 763L, 764L, 765L,
766L, 767L, 769L, 770L, 780L, 781L, 782L, 783L, 784L, 785L, 786L,
787L, 788L, 789L)), row.names = c(NA, 6L), class = "data.frame")
I need to create a new data.frame with the following structure:
1) if column 'ToKeep' is TRUE, then columns 'date', 'value1' and 'value2' remain the same;
2) if column 'ToKeep' is FALSE, then columns 'value1' and 'value2' are set to NA (and 'date' remains the same).
I have been trying to use ifelse so far, but still haven't found the right indexing procedure:
df[, c(2,3)] <- lapply(df[, 4], function(x) ifelse(x == FALSE, NA, x))
Any suggestion?
Thanks in advance,
Thiago.
You can use the logical column to subset the rows, choose the columns you want, then assign the NA values with [<-:
df2 <- df ## so that we don't over-write the original data set
df2[!df2$ToKeep, c("value1", "value2")] <- NA
which results in
df2
# date value1 value2 ToKeep
# 1 2009-08-26 22:15:00 NA NA FALSE
# 2 2009-08-26 22:45:00 NA NA FALSE
# 3 2009-08-26 23:14:59 425.7429 430.8542 TRUE
# 4 2009-08-26 23:45:00 395.7708 402.2326 TRUE
# 5 2009-08-27 00:15:00 386.7583 393.6830 TRUE
# 6 2009-08-27 00:44:59 392.1159 398.3883 TRUE
You could replace the lapply command with
df[,2:3] <- lapply(df[,2:3], function(x)
  ifelse(df[,'ToKeep'], x, NA))
df
# date value1 value2 ToKeep
#1 2009-08-27 01:15:00 NA NA FALSE
#2 2009-08-27 01:45:00 NA NA FALSE
#3 2009-08-27 02:14:59 425.7429 430.8542 TRUE
#4 2009-08-27 02:45:00 395.7708 402.2326 TRUE
#5 2009-08-27 03:15:00 386.7583 393.6830 TRUE
#6 2009-08-27 03:44:59 392.1159 398.3883 TRUE
Or instead of ifelse, you can use replace
df[,2:3] <- lapply(df[,2:3], function(x)
  replace(x, !df[,'ToKeep'], NA ))
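To see that the three approaches above (row subsetting with [<-, ifelse, and replace) lead to the same result, here is a small self-contained check. The data frame toy is made up and only mimics the column layout of the question's df, without the date column.

```r
# Made-up data with the same value/flag columns as the question's df
toy <- data.frame(value1 = c(442.8, 436.8, 425.7),
                  value2 = c(447.1, 441.3, 430.9),
                  ToKeep = c(FALSE, TRUE, TRUE))
cols <- c("value1", "value2")

# 1) logical row subsetting with [<-
a <- toy
a[!a$ToKeep, cols] <- NA

# 2) ifelse() applied column by column
b <- toy
b[cols] <- lapply(b[cols], function(x) ifelse(b$ToKeep, x, NA))

# 3) replace() applied column by column
r <- toy
r[cols] <- lapply(r[cols], function(x) replace(x, !r$ToKeep, NA))

# compare the three results
isTRUE(all.equal(a, b)) && isTRUE(all.equal(a, r))
```

The first form is usually the shortest to write; the lapply forms can be convenient when the replacement rule differs from column to column.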