Find point of systematic decrease in R - r

I have the following data frame:
df <- structure(list(x = c(1059.6, 1061.4, 1063.4, 1064.9, 1066.3,
1068, 1069.8, 1071.4, 1072.9, 1074.4, 1075.9, 1077.5, 1079.1,
1080.5, 1082.1, 1083.8, 1085.1, 1086.7, 1088.1, 1089.5, 1091.6,
1093.1, 1094.5, 1095.8, 1097.1, 1098.4, 1099.8, 1101.1, 1102.5,
1103.9, 1105.3, 1106.6, 1108, 1109.4, 1110.8, 1112.2, 1113.7,
1115.2, 1116.5, 1117.9, 1119.1, 1120.4, 1121.8, 1123.1, 1124.8,
1126.2, 1127.4, 1128.8, 1130.2, 1131.8, 1133.3, 1134.6, 1138.5,
1141.2, 1142.4, 1143.6, 1144.8, 1146.8, 1148.2, 1149.6, 1150.9,
1152.2, 1153.4, 1154.7, 1155.9, 1157.1, 1158.3, 1159.5, 1161.9,
1163.4, 1164.7, 1166, 1167.2, 1169, 1170.3, 1171.5, 1172.8, 1173.9,
1175.1, 1176.8, 1178, 1179.2, 1180.3, 1181.6, 1182.8, 1184.1,
1185.8, 1187, 1188.2, 1189.4, 1190.5, 1191.8, 1193, 1194.3, 1195.5,
1205.8, 1206.9, 1208, 1209, 1210.2, 1211.3, 1212.4, 1213.6, 1214.7,
1217.2, 1218.6, 1222.3, 1223.6, 1224.7, 1225.9, 1227.1, 1228.2,
1229.3, 1230.4, 1231.6, 1232.7, 1233.6, 1234.6, 1235.7, 1236.9,
1238.4, 1239.5, 1240.6, 1241.6, 1242.7, 1243.7, 1244.8, 1245.9,
1247, 1248.1, 1249.2, 1250.3, 1251.3, 1252.6, 1253.7, 1254.8,
1255.8, 1256.8, 1257.8, 1258.8, 1261.4, 1262.5, 1263.5, 1264.5,
1265.6, 1266.6, 1267.8, 1268.8, 1270.1, 1271.1, 1272.1, 1273.2,
1274.1, 1275.2, 1276.3, 1279, 1280, 1281, 1282.1, 1283.1, 1284.1,
1285, 1286, 1287, 1288, 1289, 1290, 1291.1, 1292.3, 1293.3, 1294.4,
1298.6, 1299.6, 1300.5, 1301.5, 1302.5, 1303.5, 1304.6, 1305.5,
1306.4, 1307.6, 1308.6, 1309.7, 1310.7, 1311.7, 1312.7, 1315.2,
1316.3, 1317.3, 1318.3, 1319.3, 1320.3, 1321.3, 1322.3, 1323.2,
1326.8, 1327.8, 1329, 1330, 1331, 1332, 1333, 1333.9, 1335, 1336,
1337.3, 1338.3, 1339.3, 1340.5, 1341.6, 1342.7, 1343.8, 1344.9,
1345.9, 1346.8, 1347.8, 1348.8, 1350, 1351.1, 1352, 1353.3, 1354.3,
1355.3, 1356.2, 1357.1, 1358, 1359.2, 1360.2, 1364.4, 1365.5,
1366.6, 1367.6, 1368.7, 1369.8, 1371, 1372, 1373, 1374.1, 1375,
1376, 1376.9, 1377.8, 1378.7, 1379.6, 1380.5, 1381.4, 1382.3,
1383.3, 1384.2, 1385.2, 1387.6, 1388.5, 1389.5, 1390.4, 1391.4,
1392.5, 1393.6, 1394.6, 1395.6, 1397, 1397.9, 1398.8, 1399.8,
1400.6, 1401.6, 1402.5, 1403.4, 1404.2, 1405.1, 1407.4, 1408.3,
1409.2, 1410.1, 1411.2, 1412.2, 1413.2, 1414.2, 1415.6, 1416.7,
1417.8, 1418.9, 1420.2, 1421.5, 1424.6, 1425.7, 1427, 1428.1,
1429.3, 1430.7, 1431.9, 1433.1, 1434.5, 1435.7, 1436.8, 1438,
1439.4, 1440.6, 1441.9, 1443, 1444.4, 1445.6, 1447.3, 1448.5,
1449.7, 1450.9, 1452.1, 1453.2, 1454.5, 1455.6, 1456.8, 1458.1,
1459.3, 1460.3, 1461.4, 1462.4, 1463.9, 1465.1, 1466.3, 1469.8,
1471.1, 1472.6, 1473.8, 1475, 1476.2, 1477.5, 1479.1, 1480.7,
1482, 1483.2, 1484.9, 1486.2, 1487.5, 1488.8, 1490, 1491.3, 1492.4,
1503, 1504.3, 1506.3, 1507.5, 1508.8, 1510.2, 1511.4, 1512.5,
1513.8, 1515.6, 1517.1, 1520.1, 1523.9, 1526.5, 1527.9, 1529.8,
1531.2, 1532.4, 1533.7, 1536, 1537.4, 1538.8, 1540.2, 1541.5,
1542.9, 1544.2, 1545.6, 1546.9, 1548.3, 1549.7, 1551.1, 1552.7,
1554.1, 1556.4, 1557.8, 1559.2, 1560.6, 1562, 1563.4, 1564.7,
1566.2, 1567.5, 1568.9, 1570.2, 1571.4, 1573.9, 1576.7, 1581.5,
1582.8, 1584.7, 1586.2, 1587.7, 1589.3, 1591, 1592.8, 1594.7,
1596.4, 1598.5, 1600.6, 1602.4, 1604.6, 1606.9, 1609, 1611, 1612.6,
1614.4, 1616.3, 1618.6, 1620.6, 1622.4, 1624.5, 1627.2, 1629.3,
1631.4, 1635, 1636.9, 1638.6, 1640.5, 1642.1, 1643.7, 1645.5,
1647.1, 1648.7, 1650.9, 1653, 1655.2, 1657.1, 1659.1, 1661.5,
1663.6, 1665.9, 1668.1, 1671.7, 1674, 1676.2, 1678.1, 1679.7,
1681.6, 1683.6, 1685.7, 1688, 1693.7, 1695.7, 1697.6, 1699.7,
1701.7, 1704.1), y = c(1.876, 2.027, 2.087, 2.231, 2.18, 1.922,
1.921, 1.851, 1.961, 2.035, 2.043, 2.043, 1.838, 2.032, 2.112,
1.976, 2.046, 2.117, 2.062, 2.07, 1.748, 1.917, 2.092, 2.283,
2.158, 2.119, 2.023, 1.971, 1.882, 2.058, 2.141, 2.241, 2.079,
1.946, 1.959, 2.117, 1.923, 2.015, 2.066, 1.98, 2.091, 1.929,
1.987, 1.852, 1.935, 2.127, 1.982, 2.182, 2.099, 2.03, 1.912,
1.998, 2.491, 2.359, 2.188, 1.965, 1.906, 1.772, 1.927, 2.077,
2.381, 2.191, 2.089, 2.086, 2.017, 2.028, 1.832, 1.88, 2.053,
2.177, 1.995, 2.045, 2.116, 1.961, 1.99, 2.227, 2.235, 2.208,
2.249, 1.992, 2.045, 2.152, 2.237, 2.239, 2.247, 2.114, 1.956,
2.042, 1.926, 2.396, 2.184, 2.208, 2.016, 2.177, 2.29, 2.469,
2.502, 2.115, 2.081, 2.091, 2.188, 2.118, 2.179, 2.067, 1.962,
2.181, 2.246, 2.526, 2.145, 1.961, 2.299, 2.306, 2.34, 2.133,
1.974, 1.997, 2.47, 2.24, 2.247, 2.137, 1.965, 2.232, 2.225,
2.417, 2.362, 2.155, 2.034, 2.151, 2.176, 2.183, 2.372, 2.145,
2.284, 1.967, 2.299, 2.299, 2.183, 2.292, 2.193, 2.249, 2.32,
2.333, 2.286, 2.216, 2.233, 2.453, 2.373, 2.284, 2.074, 2.014,
2.153, 2.353, 2.465, 2.373, 2.181, 2.424, 2.334, 2.349, 2.39,
2.513, 2.526, 2.268, 2.098, 2.326, 2.385, 2.306, 2.378, 2.126,
2.191, 2.363, 2.222, 2.723, 2.686, 2.4, 2.251, 2.121, 2.104,
2.16, 2.333, 2.151, 2.116, 2.136, 2.293, 2.281, 2.313, 2.374,
2.585, 2.521, 2.656, 2.66, 2.399, 2.442, 2.413, 2.528, 2.212,
2.58, 2.667, 2.153, 2.736, 2.486, 2.406, 2.39, 2.403, 2.504,
2.502, 2.158, 2.617, 2.434, 2.364, 2.497, 2.456, 2.263, 2.432,
2.562, 2.453, 2.249, 2.18, 2.141, 2.324, 2.176, 2.184, 2.153,
2.332, 2.202, 2.332, 2.125, 2.156, 2.189, 2.71, 2.458, 2.502,
2.285, 2.527, 2.437, 2.418, 2.507, 2.087, 2.321, 2.701, 2.486,
2.389, 2.335, 2.26, 2.108, 2.164, 2.286, 2.103, 2.257, 2.137,
2.076, 2.378, 2.637, 2.446, 2.448, 2.539, 2.253, 2.099, 2.59,
2.405, 2.219, 2.542, 2.532, 2.507, 2.439, 2.463, 2.342, 2.329,
2.436, 2.511, 2.557, 2.603, 2.5, 2.428, 2.204, 2.307, 2.174,
2.193, 1.793, 2.116, 2.107, 2.209, 1.967, 1.834, 2.713, 2.647,
2.379, 2.229, 2.11, 1.964, 1.985, 2.162, 1.996, 2.074, 1.994,
1.839, 1.838, 1.743, 1.668, 1.91, 1.735, 1.714, 1.421, 1.767,
1.816, 1.755, 1.755, 1.698, 1.608, 1.556, 1.511, 1.394, 1.425,
1.579, 1.495, 1.627, 1.305, 1.471, 1.469, 1.67, 1.697, 1.42,
1.483, 1.274, 1.341, 1.235, 1.295, 1.401, 1.463, 1.313, 1.176,
1.333, 1.373, 1.299, 1.086, 1.139, 1.237, 1.303, 1.143, 1.13,
1.114, 1.096, 1.248, 1.302, 1.19, 1.069, 1.1, 1.027, 0.897, 1.09,
0.922, 1.116, 0.963, 1.011, 1.053, 1.025, 0.985, 0.981, 1.025,
1.117, 1.141, 1.135, 1.068, 0.982, 1.028, 1.06, 1.004, 1.112,
1.108, 1.04, 0.857, 0.91, 0.98, 1.081, 1.025, 0.996, 0.931, 1,
1.074, 0.987, 0.996, 1.125, 0.9, 0.607, 1.17, 1.08, 1, 0.909,
0.841, 0.924, 0.818, 0.846, 0.732, 1.006, 0.717, 0.594, 0.786,
0.685, 0.619, 0.684, 0.69, 0.633, 0.564, 0.689, 0.555, 0.445,
0.696, 0.677, 0.729, 0.541, 0.362, 0.312, 0.568, 0.711, 0.515,
0.622, 0.583, 0.631, 0.645, 0.696, 0.535, 0.424, 0.469, 0.519,
0.511, 0.485, 0.436, 0.412, 0.351, 0.556, 0.255, 0.519, 0.399,
0.497, 0.477, 0.564, 0.462, 0.433, 0.616, 0.547, 0.42, 0.499,
0.415, 0.368)), row.names = c(NA, -443L), class = c("tbl_df",
"tbl", "data.frame"), .Names = c("x", "y"))
Plot:
And I need to find the point that y starts to systematically decrease.
I know that the real point is x == 1405. However, is there a way to automatically detect it?
I am not expecting to find the exact x point. A really good approximation would do the job.
I already tried to perform a break point analysis with the segmented package, but with not much success. The best number I could get was x == 1363, but I am looking for a closer approximation.

Here's how to get a fitted smooth of the data using loess. When you say "starts to systematically decrease," I think you mean something like "when the slope gets negative beyond a certain threshold," since it seems to me that it visually peaks and starts to decline around the 1350's. I could manually get the peak to occur later by smoothing more than default, using span = 0.4.
library(broom)
fit <- loess(y ~ x, df, span = 0.4)
df_aug <- augment(fit)
Using that model, the peak looks to be around the 1370's.
library(dplyr); library(ggplot2)
df_aug %>% filter(.fitted == max(.fitted))
# # A tibble: 1 x 5
# y x .fitted .se.fit .resid
# <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 2.09 1373 2.39 0.0181 -0.307
I presume you could get a better result if you can more definitively describe what model should be used to define "systematically decrease."
You might alternately extract the slope and acceleration from the loess curve, but it's not clear that'd get you much closer you your expected result:
# Extract slope & acceleration
df_aug_slope <- df_aug %>%
mutate(slope = (.fitted - lag(.fitted)) /
(x - lag(x)),
curve = (slope - lag(slope)) /
(x - lag(x)))
ggplot(df_aug_slope, aes(x)) +
geom_point(aes(y=y)) +
geom_line(aes(y=.fitted), color ="red") +
geom_line(aes(y= slope * 100), color = "blue") +
geom_line(aes(y= curve * 1000), color = "green") +
geom_vline(xintercept = 1405, lty = "dashed") +
theme_minimal()

Related

Change distance between ticks on axis with DateTime variable

I have made a plot with ggplot where the x-axis is a DateTime variable. I would like to increase the distance between the ticks on the x axis so basically make the plot longer without changing the actual limits (the limits are two dates which appear correctly on the plot). For visibility reasons I need the plot to have more distance between each point/tick on the x axis so that the plotted change is easier to follow.
plot2 <- ggplot(Data, aes(DateTime, group = 1)) +
geom_line(aes(y = Change, colour = "Change"))
I have tried using a function I found online which makes ggplot display all of the ticks over a specified interval. However, this did not expand the actual distance between the ticks, just added more ticks. Function used shown below.
breaks = function(x)
seq.Date(from = min(x),
to = max(x),
by = "10 days")
minor_breaks = function(x)
seq.Date(from = min(x),
to = max(x),
by = "1 day")
plot2 +
scale_x_datetime(breaks = "15 days",
minor_breaks = "1 days")
Any ideas how I can expand the distance between ticks rather than simply adding more ticks or expanding the limits of the plot?
Here is the data:
structure(list(DateTime = structure(c(1670415374, 1670506704,
1670511629, 1670517043, 1670523367, 1670528144, 1670584731, 1670587219,
1670594506, 1670598044, 1670602687, 1670612571, 1670615016, 1670669321,
1670674548, 1670685813, 1670696903, 1670700710, 1670765570, 1670773501,
1670783869, 1670789931, 1670841137, 1670846470, 1670847141, 1670852707,
1670862620, 1670868387, 1670873869, 1670928407, 1670933500, 1670939207,
1670943626, 1670950125, 1670955461, 1670966455, 1671019197, 1671043928,
1671116424, 1671127891, 1671193896, 1671193896, 1671205681, 1671219303,
1671284825, 1671290071, 1671312145, 1671373499, 1671379886, 1671387493,
1671462033, 1671474322, 1671542373, 1671563513, 1671624638, 1671649144,
1671725712, 1671732040, 1671827734, 1671888249, 1671893712, 1671915669,
1671915669, 1671915669, 1671994322, 1671994322, 1672076098, 1672130014,
1672152035, 1672167136, 1672255279, 1672333309, 1672337893, 1672404995,
1672413431, 1672424101, 1672488057, 1672523537, 1672589460, 1672596553,
1672662188, 1672671457, 1672691857, 1672748172, 1672760508, 1672772744,
1672935802, 1672945396, 1673007435, 1673019876, 1673028304, 1673092863,
1673108795, 1673120023, 1673179769, 1673190069, 1673201163, 1673265620,
1673274863, 1673287229, 1673356126, 1673365774, 1673387010, 1673442864,
1673462481, 1673538703, 1673549051, 1673620368, 1673622111, 1673634853,
1673699438, 1673719261, 1673786770, 1673794936, 1673807902, 1673876509,
1673884159, 1673895512, 1673957880, 1673967545, 1673968363, 1673978875,
1674052099, 1674065378, 1674144571, 1674151393, 1674217307, 1674227431,
1674248909, 1674308920, 1674318010, 1674325431, 1674398675, 1674400192,
1674410504, 1674475288, 1674486319, 1674497762, 1674563116, 1674583639,
1674648195, 1674671450, 1674902477, 1674902477, 1674908963, 1674912648,
1674920521, 1674923572, 1674929410, 1674934232, 1674989081, 1674993673,
1675001115, 1675024579, 1675076219, 1675076219, 1675086360, 1675095923,
1675096891, 1675101964, 1675107017, 1675161022, 1675166442, 1675172546,
1675178387, 1675183108, 1675188015, 1675195910, 1675247464, 1675252883,
1675263629, 1675269012, 1675278293, 1675288565, 1675334193, 1675339936,
1675347483, 1675350039, 1675355417, 1675362687, 1675368937), class = c("POSIXct",
"POSIXt"), tzone = ""), Change = c(0.677, 0.522,
0.252, 0.759, 0.733, 0.331, 0.658, 0.661, 0.245, 0.5, 0.5, 0.679,
0.703, 0.5, 0.5, 0.391, 0.688, 0.702, 0.824, 0.718, 0.5, 0.778,
0.295, 0.263, 0.249, 0.297, 0.737, 0.76, 0.755, 0.704, 0.492,
0.333, 0.5, 0.774, 0.899, 0.5, 0.5, 0.822, 0.649, 0.684, 0.72,
0.72, 0.694, 0.813, 0.318, 0.257, 0.739, 0.74, 0.691, 0.786,
0.735, 0.5, 0.5, 0.834, 0.5, 0.452, 0.706, 0.735, 0.733, 0.74,
0.778, 0.887, 0.887, 0.887, 0.688, 0.688, 0.728, 0.726, 0.803,
0.922, 0.788, 0.764, 0.736, 0.724, 0.5, 0.763, 0.286, 0.831,
0.838, 0.743, 0.701, 0.5, 0.316, 0.5, 0.5, 0.774, 0.741, 0.732,
0.902, 0.5, 0.749, 0.592, 0.699, 0.72, 0.785, 0.702, 0.764, 0.74,
0.329, 0.181, 0.902, 0.5, 0.654, 0.79, 0.705, 0.729, 0.766, 0.26,
0.5, 0.728, 0.767, 0.673, 0.832, 0.804, 0.88, 0.274, 0.792, 0.732,
0.741, 0.834, 0.272, 0.745, 0.726, 0.76, 0.305, 0.83, 0.741,
0.284, 0.691, 0.771, 0.755, 0.669, 0.662, 0.696, 0.684, 0.721,
0.5, 0.775, 0.5, 0.66, 0.699, 0.711, 0.743, 0.743, 0.654, 0.717,
0.767, 0.714, 0.724, 0.683, 0.68, 0.771, 0.699, 0.733, 0.21,
0.21, 0.717, 0.739, 0.774, 0.823, 0.73, 0.5, 0.508, 0.5, 0.5,
0.5, 0.785, 0.469, 0.887, 0.732, 0.5, 0.702, 0.696, 0.862, 0.142,
0.815, 0.5, 0.5, 0.694, 0.775, 0.883)), row.names = c(NA, -181L
), class = c("tbl_df", "tbl", "data.frame"))

How to implement k-fold cross-validation while forcing linear regression of predicted to real values to 1:1 line

I'm trying to train y as a polynomial function of x so that when the predicted y values are linearly regressed against the real y values, the relationship is on the 1:1 line (diagram - The image on the right uses geom_smooth(method="lm") for demonstration, but with SMA from the lmodel2() function, the regression line is 1:1). I'm kind of a stats amateur so I'm aware there might be problems with this, but without forcing the model tends to overestimate low values and underestimate high values. My question is: How do I introduce k-fold cross-validation using an existing package like caret or cvms? It seems like they need a model object to be returned and I can't figure out how to code my problem like that. Is there some way I can train the model by minimizing my custom metric and still return a model object with ypred and use it in k-fold CV?
This is my code for calculating the coefficients without k-fold CV:
data <- data.frame(
x = c(1.514, 1.514, 1.825, 1.281, 1.118, 1.279, 1.835, 1.819, 0.462, 1.53, 1.004, 1.19, 1.275, 0.428, 0.313, 0.909, 0.995, 0.995, 0.706, 0.563, 0.827, 0.65, 0.747, 1.013, 1.013, 1.163, 1.091, 1.163, 1.091, 0.955, 0.955, 2.044, 2.044, 1.777, 1.777, 1.434, 1.393, 1.324, 0.981, 0.845, 1.595, 1.595, 1.517, 1.517, 1.403, 1.403, 0.793, 0.793, 1.016, 0.901, 0.847, 1.054, 0.877, 1.639, 1.639, 1.268, 1.268, 0.842, 0.842, 0.827, 0.777, 1.024, 1.238, 1.238, 1.702, 1.702, 0.673, 0.673, 1.256, 1.256, 0.898, 0.898, 0.66, 0.933, 0.827, 0.836, 1.122, 1.5, 1.5, 1.44, 1.44, 0.671, 0.671, 0.486, 0.486, 1.051, 1.051, 0.971, 0.538, 0.971, 0.538, 1.012, 1.012, 0.776, 0.776, 0.854, 0.854, 0.74, 0.989, 0.989),
y = c(0.19, 0.18, 0.816, 2.568, 0.885, 0.521, 0.268, 0.885, 4.781, 1.648, 0.989, 1.614, 1.492, 0.679, 2.256, 3.17, 1.926, 1.631, 0.462, 2.48, 0.658, 0.355, 0.373, 2.31, 3.263, 1.374, 1.374, 2.637, 2.637, 2.073, 2.298, 0.257, 0.292, 0.359, 0.329, 1.329, 1.272, 3.752, 1.784, 0.76, 0.458, 0.488, 0.387, 0.387, 3.401, 1.458, 8.945, 9.12, 0.308, 0.386, 0.405, 6.444, 3.17, 0.458, 0.47, 0.572, 0.589, 1.961, 1.909, 0.636, 0.32, 1.664, 0.756, 0.851, 0.403, 0.232, 23.112, 22.042, 0.745, 0.477, 2.349, 3.01, 0.39, 0.246, 0.43, 1.407, 1.358, 0.235, 0.215, 0.595, 0.685, 2.539, 2.128, 8.097, 5.372, 0.644, 0.626, 17.715, 17.715, 6.851, 6.851, 2.146, 1.842, 3.147, 2.95, 1.127, 1.019, 8.954, 0.796, 0.758),
stringsAsFactors = FALSE)
optim_results <- optim(par = c(a0 = 0.3, a1 = -3.8, a2 = -1, a3 = 1, a4 = 1),
fn = function (params, x, y) {
params <- as.list(params)
ypred <- with(params, (a0 + (a1*x) + (a2*x^2) + (a3*x^3) + (a4*x^4)))
mod <- suppressMessages(lmodel2::lmodel2(ypred ~ y))$regression.results[3,]
line <- mod$Slope * y + mod$Intercept
return(sum((y - line)^2))},
x = log10(data$x),
y = log10(data$y))
cf <- as.numeric(optim_results$par)
data <- data %>% dplyr::mutate(ypred = 10^(cf[1] + cf[2]*log10(x) + cf[3]*log10(x)^2 + cf[4]*log10(x)^3 + cf[5]*log10(x)^4))
str(data)
Great question!
cvms::cross_validate_fn() allows you to cross-validate custom functions. You just have to wrap your code in a model function and a predict function as so:
EDIT: Added extraction of model parameters from the optim() output. optim() returns a list, which we convert to a class and then tell coef() how to extract the coefficients for that class.
library(dplyr)
library(groupdata2)
library(cvms)
# Set seed for reproducibility
set.seed(2)
data <- data.frame(
x = c(1.514, 1.514, 1.825, 1.281, 1.118, 1.279, 1.835, 1.819, 0.462, 1.53, 1.004, 1.19, 1.275, 0.428, 0.313, 0.909, 0.995, 0.995, 0.706, 0.563, 0.827, 0.65, 0.747, 1.013, 1.013, 1.163, 1.091, 1.163, 1.091, 0.955, 0.955, 2.044, 2.044, 1.777, 1.777, 1.434, 1.393, 1.324, 0.981, 0.845, 1.595, 1.595, 1.517, 1.517, 1.403, 1.403, 0.793, 0.793, 1.016, 0.901, 0.847, 1.054, 0.877, 1.639, 1.639, 1.268, 1.268, 0.842, 0.842, 0.827, 0.777, 1.024, 1.238, 1.238, 1.702, 1.702, 0.673, 0.673, 1.256, 1.256, 0.898, 0.898, 0.66, 0.933, 0.827, 0.836, 1.122, 1.5, 1.5, 1.44, 1.44, 0.671, 0.671, 0.486, 0.486, 1.051, 1.051, 0.971, 0.538, 0.971, 0.538, 1.012, 1.012, 0.776, 0.776, 0.854, 0.854, 0.74, 0.989, 0.989),
y = c(0.19, 0.18, 0.816, 2.568, 0.885, 0.521, 0.268, 0.885, 4.781, 1.648, 0.989, 1.614, 1.492, 0.679, 2.256, 3.17, 1.926, 1.631, 0.462, 2.48, 0.658, 0.355, 0.373, 2.31, 3.263, 1.374, 1.374, 2.637, 2.637, 2.073, 2.298, 0.257, 0.292, 0.359, 0.329, 1.329, 1.272, 3.752, 1.784, 0.76, 0.458, 0.488, 0.387, 0.387, 3.401, 1.458, 8.945, 9.12, 0.308, 0.386, 0.405, 6.444, 3.17, 0.458, 0.47, 0.572, 0.589, 1.961, 1.909, 0.636, 0.32, 1.664, 0.756, 0.851, 0.403, 0.232, 23.112, 22.042, 0.745, 0.477, 2.349, 3.01, 0.39, 0.246, 0.43, 1.407, 1.358, 0.235, 0.215, 0.595, 0.685, 2.539, 2.128, 8.097, 5.372, 0.644, 0.626, 17.715, 17.715, 6.851, 6.851, 2.146, 1.842, 3.147, 2.95, 1.127, 1.019, 8.954, 0.796, 0.758),
stringsAsFactors = FALSE)
# Fold data
# Will do 10-fold repeated cross-validation (10 reps)
data <- fold(
data = data,
k = 10, # Num folds
num_fold_cols = 10 # Num repetitions
)
# Write a model function from your code
# This ignores the formula and hyperparameters but
# you could pass values through those if you wanted
# to try different formulas or hyperparameter values
model_fn <- function(train_data, formula, hyperparameters){
out <- optim(par = c(a0 = 0.3, a1 = -3.8, a2 = -1, a3 = 1, a4 = 1),
fn = function (params, x, y) {
params <- as.list(params)
ypred <- with(params, (a0 + (a1*x) + (a2*x^2) + (a3*x^3) + (a4*x^4)))
mod <- suppressMessages(lmodel2::lmodel2(ypred ~ y))$regression.results[3,]
line <- mod$Slope * y + mod$Intercept
return(sum((y - line)^2))},
x = log10(train_data$x),
y = log10(train_data$y))
# Convert output to an S3 class
# so we can extract parameters with coef()
class(out) <- "OptimModel"
out
}
# Tell coef() how to extract the parameters
# This can modified if you need more info from the optim() output
# Just return a named list
coef.OptimModel <- function(object) {
object$par
}
# Write a predict function from your code
predict_fn <- function(test_data, model, formula, hyperparameters, train_data){
cf <- as.numeric(model$par)
test_data %>%
dplyr::mutate(
ypred = 10^(cf[1] + cf[2]*log10(x) + cf[3]*log10(x)^2 + cf[4]*log10(x)^3 + cf[5]*log10(x)^4)
) %>%
.[["ypred"]]
}
# Cross-validate the model
cv <- cross_validate_fn(
data = data,
model_fn = model_fn,
predict_fn = predict_fn,
formulas = c("y ~ x"), # Not currently used by the model function
fold_cols = paste0('.folds_', seq_len(10)),
type = 'gaussian'
)
#> Will cross-validate 1 models. This requires fitting 100 model instances.
# Check output
cv
# A tibble: 1 × 17
Fixed RMSE MAE NRMSE(I…¹ RRSE RAE RMSLE Predic…² Results Coeffi…³ Folds
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <list> <list> <list> <int>
1 x 4.00 2.31 2.66 1.47 1.17 0.662 <tibble> <tibble> <tibble> 100
# … with 6 more variables: `Fold Columns` <int>, `Convergence Warnings` <int>,
# `Other Warnings` <int>, `Warnings and Messages` <list>, Process <list>,
# Dependent <chr>, and abbreviated variable names ¹​`NRMSE(IQR)`,
# ²​Predictions, ³​Coefficients
# ℹ Use `colnames()` to see all variable names
Created on 2022-10-15 with reprex v2.0.2

How to find comon patterns between different groups using R?

I have a data frame that indicates a road type and 24 columns (h_1 ... h_24) that show how many vehicles pass (relatively over the day) per hour. Each row is a different road.
I'm interested to find commonalities among types. My intended output is a condensation of the roadtypes. I.e. road type 2 and 3 appear to have the same pattern, so they are group into a new category (e.g. category).
So my question is, how can one detect this kind of pattern with as many as 15 different types?
Part of my data:
structure(list(type = c(14, 14, 11, 4, 13, 12, 13, 13, 13, 13,
11, 14, 1, 11, 14, 11, 4, 13, 14, 9, 14, 13, 13, 9, 14, 13, 1,
11, 14, 13, 13, 13, 11, 13, 15, 11, 14, 11, 14, 13, 9, 11, 13,
9, 14, 13, 13, 13, 13, 13, 9, 14, 13, 12, 11, 14, 11, 4, 11,
4, 13, 9, 13, 9, 13, 13, 1, 15, 1, 6, 13, 11, 13, 6, 11, 11,
11, 13, 13, 13, 12, 13, 14, 13, 11, 9, 14, 11, 13, 11, 3, 11,
11, 11, 14, 11, 13, 14, 13, 11, 11, 14, 11, 11, 13, 15, 12, 11,
4, 13, 14, 13, 11, 13, 14, 11, 9, 13, 13, 11, 11, 11, 13, 11,
11, 13, 13, 13, 14, 11, 9, 11, 13, 4, 12, 13, 13, 9, 13, 11,
13, 11, 13, 1, 9, 13, 11, 11, 13, 11), h_1 = c(1.091, 0.591,
1.129, 0.274, 0.178, 1.507, 0.654, 1.003, 0.228, 0.657, 1.411,
0.97, 0.875, 0.397, 1.462, 1.063, 0.648, 1.181, 0.629, 1.219,
2.193, 1.054, 0.768, 0.922, 1.525, 2.891, 0.888, 1.171, 0.684,
0.455, 0.562, 1.138, 0.895, 0.71, 0.445, 1.444, 3.644, 2.365,
0.391, 0.687, 1.037, 0.423, 2.14, 0.942, 1.33, 0.737, 1.766,
0.144, 1.08, 0.672, 0.629, 0.39, 0.325, 1.079, 2.099, 0.163,
0.871, 1.112, 1.731, 0.313, 1.039, 1.057, 1.159, 0.959, 0.755,
0.741, 0.429, 1.017, 0.602, 0.359, 0.574, 0.872, 0.639, 0.786,
0.857, 1.212, 2.553, 1.755, 0.543, 1.691, 0.715, 0.352, 1.431,
1.188, 2.115, 0.536, 0.605, 0.894, 0.745, 2.639, 0.545, 1.135,
0.702, 0.82, 0.462, 0.263, 1.362, 0.226, 0.801, 1.783, 1.301,
1.024, 1.394, 1.512, 1.151, 4.175, 0.644, 2.11, 0.518, 1.938,
1.048, 0.942, 1.233, 1.024, 1.967, 1.601, 0.736, 0.496, 1.346,
1.109, 0.78, 0.635, 0.567, 0.378, 2.976, 0.453, 0.392, 1.362,
1.042, 0.555, 1.218, 0.936, 1.098, 0.868, 1.172, 0.247, 1.287,
0.824, 1.025, 0.863, 1.484, 0.507, 1.335, 0.637, 1.986, 1.137,
0.837, 1.787, 0.353, 1.865), h_2 = c(0.607, 0.284, 0.753, 0.164,
0.085, 1.046, 0.422, 0.816, 0.1, 0.445, 1.032, 0.559, 0.699,
0.334, 1.092, 0.544, 0.494, 0.803, 0.251, 0.862, 2.53, 1.389,
0.705, 0.382, 0.932, 2.332, 0.604, 0.801, 0.329, 0.248, 0.411,
0.866, 0.584, 0.295, 0.26, 0.873, 2.943, 1.887, 0.287, 0.462,
0.668, 0.411, 2.101, 0.636, 0.88, 0.389, 1.24, 0.072, 0.804,
0.481, 0.346, 0.194, 0.093, 0.629, 1.644, 0.122, 0.615, 0.604,
1.308, 0.25, 0.577, 0.996, 0.849, 0.594, 0.418, 0.452, 0.252,
0.706, 0.348, 0.16, 0.297, 0.608, 0.57, 0.413, 0.745, 0.839,
1.894, 1.315, 0.344, 1.046, 0.35, 0.206, 0.987, 0.422, 1.595,
0.229, 0.263, 0.501, 0.556, 2.112, 0.303, 0.765, 0.485, 0.517,
0.24, 0.11, 0.88, 0.104, 0.649, 1.198, 0.948, 0.708, 0.917, 0.729,
0.743, 3.336, 0.35, 1.635, 0.253, 1.421, 0.539, 0.554, 0.82,
0.708, 1.411, 1.011, 0.638, 0.297, 0.918, 0.427, 0.676, 0.449,
0.556, 0.401, 2.192, 0.194, 0.264, 0.879, 0.667, 0.319, 0.854,
0.613, 0.683, 0.481, 0.855, 0.305, 0.865, 0.593, 0.568, 0.552,
1.002, 0.314, 0.953, 0.341, 1.415, 0.508, 0.441, 1.18, 0.24,
1.277), h_3 = c(0.505, 0.171, 0.277, 0.164, 0.097, 0.774, 0.305,
0.646, 0.132, 0.416, 0.853, 0.412, 0.621, 0.508, 0.8, 0.336,
0.432, 0.667, 0.163, 0.7, 2.953, 0.383, 0.656, 0.161, 0.635,
0.551, 0.466, 0.295, 0.229, 0.217, 0.141, 1.002, 0.498, 0.138,
0.177, 0.531, 1.688, 1.634, 0.259, 0.472, 0.565, 0.42, 2.051,
0.488, 0.703, 0.202, 1.03, 0.072, 0.603, 0.552, 0.208, 0.122,
0.023, 0.419, 1.278, 0.081, 0.54, 0.397, 0.921, 0.188, 0.357,
1.049, 0.602, 0.431, 0.193, 0.191, 0.204, 0.452, 0.3, 0.1, 0.173,
0.216, 0.531, 0.28, 0.772, 0.307, 1.486, 0.994, 0.164, 0.681,
0.229, 0.222, 0.723, 0.134, 1.217, 0.189, 0.152, 0.205, 0.562,
1.579, 0.242, 0.114, 0.434, 0.401, 0.218, 0.07, 0.645, 0.111,
0.604, 0.876, 0.847, 0.603, 0.797, 0.573, 0.464, 2.183, 0.266,
1.08, 0.161, 1.034, 0.342, 0.43, 0.533, 0.603, 1.064, 0.601,
0.731, 0.24, 0.801, 0.173, 0.192, 0.141, 0.522, 0.435, 1.044,
0.129, 0.226, 0.64, 0.502, 0.113, 0.466, 0.54, 0.523, 0.283,
0.697, 0.321, 0.701, 0.461, 0.358, 0.403, 0.828, 0.151, 0.662,
0.272, 0.997, 0.28, 0.195, 0.611, 0.353, 1.027), h_4 = c(0.366,
0.166, 0.218, 0.206, 0.047, 0.625, 0.333, 0.685, 0.691, 0.739,
0.937, 0.397, 0.739, 0.703, 0.737, 0.304, 0.432, 0.621, 0.163,
0.774, 2.831, 0.101, 0.929, 0.153, 0.466, 0.186, 0.454, 0.218,
0.331, 0.302, 0.105, 2.127, 0.638, 0.188, 0.264, 0.548, 1.327,
1.387, 0.394, 0.791, 0.614, 0.639, 1.671, 0.496, 1.185, 0.179,
1.192, 0.287, 0.779, 0.844, 0.265, 0.187, 0.023, 0.383, 1.169,
0.122, 0.764, 0.334, 0.835, 0.188, 0.367, 1.277, 0.799, 0.414,
0.193, 0.245, 0.3, 0.367, 0.363, 0.2, 0.254, 0.167, 0.574, 0.186,
1.166, 0.221, 1.512, 0.832, 0.161, 0.714, 0.326, 0.416, 0.867,
0.23, 1.107, 0.348, 0.218, 0.179, 0.935, 1.295, 0.262, 0.177,
0.823, 0.472, 0.295, 0.116, 0.717, 0.271, 0.791, 0.958, 1.371,
0.744, 0.968, 0.706, 0.409, 1.466, 0.249, 0.912, 0.207, 0.827,
0.332, 0.359, 0.702, 0.744, 0.859, 0.544, 0.993, 0.357, 1.109,
0.218, 0.299, 0.144, 0.558, 1.027, 0.761, 0.172, 0.349, 0.719,
0.664, 0.18, 0.406, 0.777, 0.541, 0.274, 0.918, 1.109, 0.675,
0.522, 0.321, 0.489, 0.828, 0.085, 0.57, 0.351, 0.823, 0.134,
0.215, 0.559, 0.396, 1.172), h_5 = c(0.759, 0.362, 0.394, 1.041,
0.08, 0.598, 0.714, 0.738, 4.113, 1.674, 0.948, 0.661, 1.051,
2.606, 0.996, 0.462, 0.571, 1.056, 0.415, 1.242, 2.291, 0.096,
1.211, 0.288, 0.593, 0.727, 0.721, 0.305, 0.967, 0.687, 0.257,
4.004, 1.267, 0.381, 0.628, 0.813, 1.521, 1.281, 0.925, 1.738,
0.872, 2.14, 1.936, 0.698, 1.752, 0.29, 2.07, 0.647, 1.381, 1.377,
0.531, 0.514, 0.162, 0.408, 1.346, 0.448, 2.163, 0.429, 0.907,
0.563, 0.499, 2.062, 1.532, 0.572, 0.396, 0.52, 0.754, 0.537,
0.714, 0.819, 0.821, 0.255, 0.836, 0.266, 2.112, 0.313, 2.199,
0.796, 0.367, 1.3, 0.544, 1.199, 1.131, 0.23, 1.134, 1.306, 0.417,
0.199, 1.416, 1.598, 0.545, 0.3, 2.043, 0.963, 0.652, 0.354,
1.26, 0.793, 1.481, 1.838, 3.07, 1.419, 1.634, 0.932, 0.39, 1.545,
0.701, 1.102, 0.46, 0.887, 0.585, 0.543, 0.999, 1.419, 1.107,
0.623, 1.456, 0.56, 1.963, 0.231, 0.439, 0.162, 0.784, 2.901,
1.314, 0.323, 0.847, 1.264, 1.313, 0.417, 0.78, 1.9, 0.857, 0.538,
1.647, 2.558, 1.153, 0.768, 0.568, 0.91, 1.421, 0.237, 0.755,
0.832, 1.183, 0.107, 0.491, 1.175, 0.848, 2.117), h_6 = c(1.836,
0.961, 1.605, 3.069, 0.089, 1.005, 2.508, 1.717, 5.413, 4.381,
1.665, 1.441, 1.89, 6.892, 2.116, 1.612, 1.157, 2.141, 1.571,
3.35, 2.667, 0.718, 1.845, 0.978, 1.186, 1.787, 1.974, 1.03,
2.14, 1.405, 1.073, 5.952, 3.467, 1.101, 1.578, 2.122, 2.36,
1.302, 2.865, 3.508, 1.718, 4.415, 2.705, 1.552, 3.541, 0.97,
3.108, 0.862, 3.365, 2.782, 1.806, 1.757, 2.921, 0.752, 2.146,
0.855, 4.545, 0.683, 1.447, 1.939, 1.292, 3.794, 3.581, 1.262,
1.564, 1.683, 3.417, 0.65, 1.804, 2.016, 2.446, 0.702, 1.869,
0.746, 3.343, 1.033, 3.984, 1.067, 1.434, 2.076, 2.226, 4.295,
2.023, 1.016, 1.737, 3.669, 1.225, 0.485, 2.42, 3.062, 1.736,
1.372, 3.244, 2.885, 1.806, 2.403, 2.771, 1.791, 2.834, 3.889,
4.502, 2.807, 3.486, 2.375, 0.817, 1.847, 2.197, 2.185, 1.645,
1.362, 1.48, 1.149, 1.788, 2.807, 2.003, 0.997, 2.618, 1.616,
3.618, 1.369, 1.354, 0.5, 2.065, 4.543, 2.877, 1.271, 2.494,
2.778, 3.167, 1.745, 2.359, 4.095, 1.943, 1.415, 3.547, 5.202,
2.58, 1.96, 1.124, 3.238, 3.02, 1.367, 1.726, 2.302, 2.512, 0.759,
2.192, 3.15, 2.839, 4), h_7 = c(3.944, 2.971, 3.597, 7.331, 0.157,
3.246, 5.143, 4.038, 4.62, 8.276, 3.273, 4.792, 4.084, 7.116,
5.171, 4.521, 4.489, 3.858, 4.978, 4.881, 5.335, 1.361, 3.734,
2.205, 3.643, 3.594, 3.541, 2.524, 4.838, 4.157, 2.983, 6.644,
7.517, 2.41, 4.247, 5.508, 4.549, 2.551, 6.072, 6.069, 2.842,
5.862, 5.345, 3, 4.769, 2.266, 4.495, 1.15, 6.454, 4.976, 5.955,
6.088, 4.59, 2.782, 4.862, 2.28, 5.885, 2.574, 3.431, 4.878,
5.899, 5.832, 5.674, 3.591, 3.531, 4.825, 7.777, 2.232, 5.971,
6.209, 5.998, 2.159, 4.549, 2.784, 4.631, 2.271, 5.291, 3.231,
3.554, 4.512, 5.732, 8.902, 3.408, 3.22, 3.883, 6.306, 2.911,
1.414, 4.128, 3.826, 5.774, 2.821, 5.404, 5.808, 5.438, 5.799,
4.229, 4.481, 4.958, 5.625, 5.415, 4.743, 5.998, 4.464, 2.284,
3.182, 6.053, 4.695, 6.305, 3.106, 2.605, 2.989, 4.678, 4.743,
4.143, 2.808, 4.523, 4.027, 4.915, 5.688, 2.799, 1.262, 4.947,
5.731, 5.335, 3.685, 5.917, 4.241, 7.878, 3.28, 5.54, 4.96, 5.138,
4.028, 6.175, 7.239, 4.571, 4.087, 3.57, 5.434, 4.19, 3.716,
3.671, 7.065, 4.635, 1.697, 4.962, 4.667, 4.746, 4.848), h_8 = c(5.215,
5.386, 6.699, 8.865, 0.427, 5.445, 7.39, 6.853, 6.185, 7.8, 5.74,
6.424, 6.53, 7.32, 6.008, 8.395, 10.026, 4.249, 8.699, 5.004,
6.313, 2.168, 7.191, 5.241, 5.125, 5.37, 5.069, 4.746, 6.941,
7.316, 6.779, 6.306, 7.13, 4.673, 7.553, 6.066, 5.055, 4.024,
8.355, 6.071, 3.719, 6.783, 7.185, 3.995, 5.642, 4.749, 5.131,
1.833, 6.354, 6.263, 7.299, 8.336, 6.351, 4.624, 5.835, 4.439,
5.088, 4.1, 5.431, 4.44, 7.039, 6.247, 5.87, 5.735, 6.145, 6.091,
7.619, 6.102, 8.134, 6.349, 8.065, 5.498, 6.431, 4.169, 5.304,
4.546, 5.721, 5.607, 7.894, 6.478, 7.121, 7.671, 4.972, 5.923,
4.956, 7.251, 5.109, 4.452, 6.651, 4.112, 6.743, 5.306, 5.581,
6.09, 8.751, 10.775, 4.649, 6.467, 6.948, 5.593, 5.717, 5.431,
6.158, 5.759, 3.807, 3.975, 6.76, 5.866, 7.375, 4.356, 4.117,
4.146, 7.317, 5.431, 6.002, 4.177, 5.432, 6.185, 5.164, 8.952,
4.595, 3.218, 6.032, 5.805, 5.246, 4.59, 8.141, 4.648, 8.084,
6.147, 5.986, 4.995, 6.718, 4.528, 8.55, 8.742, 5.106, 6.335,
4.793, 5.607, 4.527, 9.264, 5.148, 7.653, 5.048, 3.544, 6.039,
5.443, 5.538, 4.979), h_9 = c(5.279, 5.904, 7.211, 6.111, 0.686,
5.486, 6.411, 7.185, 7.688, 5.984, 5.925, 5.703, 6.011, 6.917,
5.155, 6.805, 9.44, 4.454, 7.568, 4.831, 5.196, 1.964, 7.191,
4.502, 4.701, 4.84, 5.185, 4.929, 7.045, 7.729, 6.769, 5.887,
5.477, 5.668, 7.034, 6.077, 4.772, 4.451, 7.419, 6.329, 4.182,
6.222, 6.138, 4.144, 6.23, 4.38, 5.154, 2.3, 6.404, 5.772, 5.932,
7.208, 7.626, 4.986, 5.043, 8.145, 5.184, 4.195, 5.652, 4.941,
5.91, 5.384, 5.672, 6.187, 5.849, 6.842, 5.453, 7.599, 6.645,
4.831, 8.056, 5.92, 6.229, 4.236, 5.425, 4.728, 5.045, 5.764,
9.408, 6.174, 6.258, 6.956, 5.795, 6.728, 4.565, 6.555, 5.895,
4.633, 7.259, 4.601, 5.855, 5.48, 5.246, 5.448, 7.517, 9.931,
4.938, 6.758, 6.838, 4.945, 5.694, 5.569, 5.11, 5.72, 4.327,
4.316, 6.338, 5.586, 6.627, 4.62, 5.437, 5.406, 6.071, 5.569,
4.98, 4.102, 5.439, 7.023, 5.06, 6.871, 5.272, 3.925, 6.269,
5.077, 4.981, 4.224, 6.76, 4.936, 7.325, 6.029, 5.669, 5.122,
6.133, 4.009, 8.74, 8.375, 4.909, 5.907, 4.212, 5.546, 4.625,
9.848, 5.362, 6.648, 4.422, 3.788, 6.14, 5.741, 5.806, 5.027),
h_10 = c(5.058, 6.346, 6.55, 5.07, 1.484, 5.323, 5.278, 5.735,
6.742, 5.904, 5.429, 5.483, 5.835, 6.039, 4.841, 5.971, 6.849,
5.019, 5.946, 5.286, 4.46, 2.251, 5.69, 4.145, 4.659, 4.651,
4.937, 4.407, 6.683, 8.145, 6.157, 5.74, 5.485, 6.393, 7.573,
6.145, 4.81, 4.735, 5.986, 6.048, 4.754, 6.089, 5.299, 4.212,
5.896, 4.195, 5.432, 4.06, 5.801, 4.853, 5.505, 6.439, 8.785,
5.16, 4.865, 5.701, 6.12, 4.608, 5.585, 5.691, 5.775, 5.481,
5.769, 5.989, 5.114, 6.312, 4.829, 5.96, 5.713, 4.073, 6.565,
4.873, 5.742, 5.089, 5.368, 4.296, 4.905, 5.46, 6.302, 6.095,
5.621, 6.09, 5.853, 6.594, 4.785, 5.915, 6.814, 4.733, 7.454,
4.967, 6.562, 5.022, 5.416, 5.736, 6.562, 7.703, 5.361, 7.068,
5.823, 5.008, 5.717, 6.034, 5.176, 5.567, 4.828, 4.853, 5.759,
5.522, 6.224, 5.135, 6.105, 6.028, 5.328, 6.034, 5.202, 4.424,
5.67, 5.98, 5.371, 5.011, 5.106, 4.307, 6.163, 5.434, 5.086,
4.31, 5.25, 5.36, 5.911, 5.119, 5.45, 5.3, 5.851, 4.142,
5.795, 5.223, 5.202, 4.818, 4.274, 5.377, 5.083, 7.644, 5.714,
5.638, 4.582, 4.417, 6.002, 5.24, 6.413, 5.264), h_11 = c(5.189,
7.161, 6.16, 5.029, 3.229, 5.663, 5.523, 6.22, 5.834, 5.981,
5.588, 5.924, 6.1, 5.593, 5.359, 5.823, 5.877, 5.658, 6.034,
5.53, 4.847, 3.975, 5.727, 4.918, 5.083, 5.428, 5.219, 4.19,
6.593, 8.498, 5.693, 5.474, 5.866, 7.03, 7.822, 6.103, 4.899,
5.251, 5.579, 5.878, 5.56, 6.383, 5.226, 4.756, 6.012, 4.302,
5.936, 3.881, 5.65, 4.818, 5.626, 6.501, 4.288, 5.74, 5.171,
6.108, 6.609, 5.514, 5.815, 7.067, 5.748, 5.871, 6.02, 6.103,
5.188, 6.302, 4.994, 6.102, 5.514, 4.232, 4.934, 4.352, 5.851,
6.021, 5.562, 3.678, 5.036, 5.673, 4.629, 6.192, 5.913, 5.603,
5.916, 7.591, 5.181, 5.692, 7.577, 5.298, 7.374, 5.33, 6.885,
5.202, 5.675, 6.281, 6.411, 5.672, 5.839, 7.519, 6.107, 5.004,
6.014, 6.47, 5.543, 5.677, 5.682, 5.023, 6.043, 5.651, 6.42,
5.696, 6.552, 6.117, 5.337, 6.47, 5.716, 5.585, 6.013, 5.962,
5.692, 4.826, 6.334, 6.454, 6.221, 5.687, 5.396, 5.452, 5.035,
5.838, 5.372, 4.676, 5.881, 5.53, 5.994, 5.085, 4.497, 4.996,
5.709, 4.862, 5.015, 5.277, 5.425, 6.354, 6.517, 5.427, 5.353,
5.554, 5.542, 5.476, 7.077, 5.483), h_12 = c(6.006, 7.094,
5.992, 4.892, 5.527, 5.758, 5.853, 6.067, 5.441, 5.388, 5.872,
6.497, 6.347, 5.749, 5.449, 5.799, 5.769, 5.261, 6.084, 5.822,
4.877, 2.203, 6.523, 6.14, 5.972, 5.421, 5.607, 4.782, 6.095,
8.027, 5.084, 5.227, 5.285, 7.219, 8.326, 5.96, 4.82, 5.55,
5.881, 5.624, 6.303, 6.584, 5.299, 5.366, 7.179, 4.949, 5.607,
5.102, 5.977, 5.271, 5.926, 6.274, 5.239, 6.359, 5.375, 6.231,
6.533, 6.817, 5.978, 6.754, 5.845, 6.12, 5.96, 6.148, 5.58,
6.066, 5.093, 6.271, 5.443, 5.151, 5.033, 5.141, 6.123, 6.887,
5.763, 4.457, 4.872, 5.669, 4.608, 6.125, 6.323, 5.127, 6.01,
5.655, 5.62, 6.021, 7.806, 5.97, 6.284, 5.317, 6.945, 5.344,
5.857, 5.594, 6.486, 5.019, 5.764, 7.185, 6.517, 4.913, 5.67,
6.085, 5.197, 5.733, 6.351, 4.874, 5.961, 5.461, 6.546, 6.145,
7.203, 6.776, 5.585, 6.085, 5.576, 6.505, 6.235, 6.488, 5.459,
5.14, 5.987, 6.034, 6.521, 6.028, 5.297, 6.444, 5.302, 5.763,
5.304, 5.01, 6.146, 5.496, 6.166, 5.821, 5.193, 5.692, 5.393,
5.274, 5.633, 5.832, 5.267, 6.054, 6.359, 5.537, 5.942, 5.417,
5.874, 5.397, 6.357, 5.321), h_13 = c(6.382, 7.456, 5.787,
5.111, 8.092, 5.445, 5.874, 6.724, 5.643, 5.912, 5.375, 5.762,
6.451, 5.818, 5.291, 5.136, 5.244, 5.707, 5.607, 6.193, 4.612,
3.928, 6.428, 8.983, 6.057, 6.555, 6.157, 5.853, 6.179, 5.981,
4.841, 5.41, 5.753, 7, 8.053, 5.501, 4.74, 4.947, 6.016,
5.256, 6.574, 6.457, 4.814, 6.047, 7.346, 5.871, 5.659, 7.654,
5.198, 5.101, 5.81, 5.447, 4.822, 6.117, 4.999, 6.068, 6.337,
6.293, 5.421, 6.567, 5.83, 6.171, 5.885, 6.25, 6.075, 6.12,
4.863, 5.763, 5.438, 5.849, 5.203, 6.08, 6.343, 6.008, 6.01,
5.84, 4.99, 5.15, 4.9, 5.82, 5.972, 4.861, 5.949, 4.792,
5.057, 6.005, 7.517, 6.63, 5.745, 5.071, 6.42, 5.11, 5.905,
6.089, 6.229, 4.908, 5.724, 6.269, 6.566, 5.418, 5.983, 6.295,
5.578, 5.634, 6.518, 4.916, 5.878, 5.115, 6.017, 5.491, 6.706,
6.965, 5.24, 6.295, 5.458, 6.077, 6.325, 6.684, 5.684, 5.056,
5.306, 6.174, 6.427, 5.738, 4.824, 6.271, 5.476, 5.723, 4.761,
5.836, 5.52, 5.633, 5.515, 5.849, 4.908, 5.357, 5.863, 5.775,
5.497, 6.402, 5.548, 5.791, 5.729, 5.537, 5.428, 5.091, 6.287,
4.913, 6.385, 5.436), h_14 = c(6.865, 5.34, 5.906, 6.166,
9.527, 6.315, 6.553, 6.379, 6.105, 6.025, 6.292, 6.424, 6.883,
6.203, 5.484, 6.954, 6.694, 6.35, 6.6, 6.346, 4.679, 6.227,
6.058, 6.076, 5.972, 6.76, 6.654, 6.336, 5.184, 5.248, 5.422,
5.054, 5.889, 4.89, 5.566, 6.149, 5.025, 5.514, 7.41, 4.755,
6.912, 6.719, 5.293, 6.723, 6.496, 6.209, 5.786, 8.121, 5.55,
5.609, 6.491, 4.488, 5.447, 6.319, 6.112, 6.516, 6.502, 5.911,
6.069, 6.692, 7.081, 6.498, 5.976, 6.348, 6.311, 6.341, 5.381,
7.006, 5.965, 5.37, 5.652, 6.401, 6.82, 6.394, 6.207, 6.547,
5.016, 6.152, 5.456, 5.946, 6.595, 4.944, 6.151, 6.095, 5.714,
5.279, 5.83, 6.864, 5.711, 5.508, 7.248, 5.787, 6.396, 6.47,
4.872, 5.098, 6.271, 5.829, 6.88, 5.547, 5.769, 6.688, 5.786,
6.033, 6.425, 5.2, 6.2, 5.459, 6.834, 5.91, 5.718, 6.784,
6.603, 6.688, 5.405, 6.225, 6.74, 7.376, 5.74, 6.335, 5.935,
6.486, 6.772, 5.953, 5.135, 8.082, 5.81, 6.27, 5.372, 6.161,
5.642, 6.044, 6.19, 6.189, 5.193, 5.573, 6.394, 6.608, 5.954,
6.519, 5.845, 6.174, 5.795, 6.169, 5.288, 5.769, 6.364, 5.253,
6.865, 5.5), h_15 = c(7.454, 5.195, 5.94, 6.002, 10.864,
6.654, 6.799, 6.13, 5.498, 6.077, 6.352, 6.776, 6.88, 6.709,
5.871, 6.426, 6.077, 6.678, 6.65, 6.6, 4.99, 8.095, 5.381,
5.34, 5.845, 6.582, 7.072, 6.841, 5.417, 6.091, 7.357, 4.885,
6.02, 5.261, 4.931, 6.01, 5.036, 5.988, 8.589, 5.737, 7.698,
6.582, 5.255, 7.67, 5.562, 6.825, 5.761, 11.858, 5.826, 6.003,
6.589, 6.054, 6.282, 6.523, 6.136, 8.43, 6.082, 6.404, 6.26,
7.192, 6.736, 6.455, 6.068, 6.549, 6.964, 6.164, 5.568, 6.864,
6.019, 5.55, 5.904, 6.778, 7.021, 6.847, 6.406, 7.216, 5.114,
6.28, 5.775, 5.734, 6.615, 6.213, 6.276, 7.322, 6.023, 5.611,
5.306, 6.76, 5.528, 5.739, 7.329, 6.309, 7.257, 6.663, 5.473,
5.547, 6.534, 5.841, 6.962, 5.748, 5.743, 6.83, 5.843, 5.92,
6.964, 5.049, 6.489, 5.705, 6.73, 6.203, 5.531, 7.219, 6.125,
6.83, 5.7, 6.524, 7.36, 8.519, 5.953, 5.96, 6.946, 7.938,
6.982, 6.174, 5.345, 7.418, 6.511, 6.533, 5.5, 6.58, 6.061,
6.319, 6.395, 6.557, 5.7, 6.636, 6.572, 7.005, 6.473, 6.506,
6.007, 5.993, 6.241, 6.091, 5.691, 6.779, 6.31, 6.182, 7.12,
5.51), h_16 = c(7.353, 5.787, 6.161, 6.509, 11.262, 6.6,
6.672, 6.25, 5.535, 6.324, 6.301, 6.674, 6.839, 6.908, 6.514,
6.121, 5.723, 7.035, 6.788, 6.444, 5.472, 10.921, 5.469,
6.733, 6.099, 7.474, 7.784, 8.487, 6.129, 6.309, 8.839, 4.975,
6.244, 6.157, 5.251, 5.895, 5.091, 6.433, 7.946, 6.376, 8.058,
6.393, 5.201, 8.733, 5.122, 8.096, 5.822, 14.05, 6.429, 7.071,
6.739, 6.508, 9.411, 6.858, 6.112, 7.9, 5.593, 6.642, 6.337,
9.631, 6.39, 6.58, 6.185, 6.99, 7.174, 6.788, 6.501, 6.554,
6.3, 6.069, 6.71, 8.661, 7.087, 7.406, 6.401, 8.674, 5.231,
6.311, 6.501, 5.362, 6.612, 6.229, 6.31, 7.476, 6.371, 5.721,
5.987, 7.538, 5.516, 5.92, 6.986, 7.295, 7.904, 6.874, 5.512,
5.875, 6.642, 6.295, 7.16, 5.927, 5.687, 6.876, 5.897, 5.846,
6.982, 5.035, 6.525, 6.14, 6.65, 6.413, 5.853, 6.836, 6.103,
6.876, 7.094, 6.733, 7.558, 8.504, 6.279, 5.826, 7.607, 9.105,
6.933, 6.308, 5.709, 8.642, 7.368, 6.641, 5.486, 7.448, 6.471,
6.54, 6.284, 7.538, 5.985, 7.316, 6.634, 7.738, 7.202, 6.59,
6.25, 6.21, 7.009, 6.275, 6.944, 8.511, 6.414, 6.345, 8.462,
5.458), h_17 = c(7.167, 7.165, 5.919, 7.756, 12.506, 7.129,
7.412, 6.438, 5.298, 6.466, 6.854, 7.129, 7.202, 6.979, 6.77,
6.627, 6.324, 7.777, 7.203, 6.508, 5.667, 8.669, 6.065, 9.439,
7.285, 7.226, 8.281, 8.832, 6.582, 6.924, 8.522, 5.317, 6.729,
7.497, 6.009, 5.773, 5.262, 7.128, 7.873, 6.709, 8.085, 6.174,
5.836, 9.419, 4.893, 9.529, 5.867, 11.822, 6.705, 7.529,
8.222, 6.754, 9.434, 7.511, 6.234, 9.366, 5.612, 7.929, 6.565,
11.007, 6.612, 6.502, 5.569, 7.629, 7.499, 6.537, 8.403,
7.203, 7.194, 9.064, 7.337, 9.488, 7.423, 9.125, 6.571, 9.008,
5.25, 6.816, 7.807, 4.967, 7.059, 6.581, 6.776, 6.44, 7.372,
6.307, 7.067, 7.937, 6.697, 6.099, 6.541, 7.105, 7.735, 7.206,
6.397, 6.699, 6.785, 7.179, 6.538, 5.999, 5.715, 7.28, 5.899,
6.314, 7.874, 4.973, 6.749, 6.529, 7.329, 6.864, 6.686, 6.716,
6.861, 7.28, 7.058, 7.46, 7.692, 8.819, 6.566, 6.706, 7.498,
8.927, 7.233, 6.051, 5.679, 9.051, 8.124, 6.784, 6.083, 8.368,
6.798, 6.812, 6.552, 9.764, 6.555, 7.427, 6.784, 8.472, 9.907,
6.597, 6.547, 6.019, 7.633, 6.337, 7.115, 8.625, 6.184, 6.103,
8.772, 5.442), h_18 = c(7.058, 7.77, 5.969, 8.633, 12.037,
7.306, 6.894, 6.405, 5.153, 6.146, 7.364, 7.364, 6.957, 5.973,
7.169, 6.985, 6.725, 7.839, 6.587, 6.543, 5.967, 11.16, 6.889,
9.07, 8.09, 6.462, 7.865, 8.502, 7.144, 7.165, 8.531, 5.366,
6.528, 7.46, 6.469, 5.544, 5.238, 6.939, 6.006, 6.491, 7.755,
6.095, 5.895, 9.065, 4.411, 9.792, 5.774, 9.019, 6.529, 7.222,
7.882, 7.166, 10.153, 7.306, 6.164, 7.33, 5.754, 9.185, 6.489,
7.255, 5.806, 5.824, 6.048, 7.509, 7.557, 6.832, 9.135, 7.684,
7.419, 11.18, 7.139, 9.235, 7.238, 9.511, 6.481, 8.704, 5.364,
6.689, 8.963, 4.646, 6.716, 6.597, 6.789, 7.246, 7.212, 6.937,
7.719, 8.534, 9.163, 6.1, 5.794, 7.592, 7.438, 7.023, 6.806,
7.155, 6.878, 8.076, 6.093, 5.725, 5.875, 7.098, 5.715, 6.532,
8.449, 4.723, 6.721, 6.45, 6.776, 6.859, 7.236, 6.727, 7.348,
7.098, 7.021, 7.778, 6.834, 7.109, 6.399, 7.01, 7.898, 8.565,
6.871, 5.729, 5.585, 7.22, 8.55, 6.877, 6.558, 8.775, 6.949,
6.736, 6.673, 9.698, 7.03, 7.066, 6.649, 7.933, 9.846, 6.694,
6.554, 6.14, 7.51, 6.06, 6.983, 8.531, 5.996, 5.576, 7.289,
5.478), h_19 = c(5.687, 8.075, 5.892, 6.372, 10.094, 6.505,
6.055, 5.797, 4.308, 4.77, 6.675, 6.218, 5.605, 4.336, 6.558,
5.887, 5.707, 6.176, 5.783, 6.067, 5.872, 11.352, 6.618,
7.651, 7.454, 5.176, 6.72, 7.979, 6.034, 5.747, 8.054, 4.695,
4.984, 7.736, 6.157, 5.458, 5.261, 6.257, 3.821, 6.21, 6.625,
4.79, 5.316, 7.361, 4.711, 9.315, 5.403, 5.857, 5.525, 5.933,
6.151, 6.781, 4.937, 7.457, 5.419, 7.371, 5.079, 8.358, 6.193,
4.878, 5.904, 4.811, 6.298, 6.727, 7.392, 6.331, 6.913, 6.525,
6.486, 8.624, 6.014, 7.782, 5.672, 7.646, 5.664, 8.11, 5.042,
6.073, 8.246, 5.056, 5.858, 5.91, 5.882, 6.555, 6.22, 6.095,
7.354, 8.002, 4.622, 6.014, 5.29, 8.174, 5.627, 5.462, 6.43,
6.281, 6.237, 7.622, 4.942, 5.132, 5.341, 5.682, 5.091, 6.343,
8.041, 4.818, 6.213, 5.921, 5.557, 6.2, 7.53, 6.659, 6.437,
5.682, 6.235, 7.109, 5.464, 4.284, 5.601, 6.94, 8.228, 9.3,
5.611, 5.032, 5.403, 7.09, 6.221, 6.236, 5.1, 8.594, 6.678,
6.037, 5.728, 7.849, 5.225, 4.217, 5.536, 6.219, 7.906, 6.242,
5.73, 5.873, 6.65, 5.933, 6.499, 8.912, 6.167, 5.503, 4.775,
5.145), h_20 = c(4.335, 7.319, 5.038, 3.809, 7.279, 5.174,
4.37, 4.575, 4.01, 2.962, 4.964, 4.469, 4.154, 2.597, 5.11,
4.389, 4.196, 4.113, 4.463, 4.879, 4.825, 6.801, 4.192, 6.711,
5.633, 3.724, 4.825, 5.548, 5.191, 3.998, 4.551, 3.612, 3.448,
7.235, 5.353, 4.955, 4.804, 5.121, 2.528, 5.686, 4.958, 3.28,
4.306, 4.993, 4.507, 6.502, 4.176, 4.384, 4.169, 5.125, 4.224,
5.82, 2.643, 5.897, 4.583, 3.95, 4.013, 5.609, 5.064, 3.315,
4.745, 3.624, 4.79, 4.735, 5.903, 5.266, 4.293, 4.887, 4.835,
4.672, 4.213, 4.997, 3.919, 5.129, 4.28, 5.412, 4.299, 4.906,
5.149, 4.539, 4.291, 5.092, 4.387, 4.581, 4.551, 5.163, 6.564,
6.223, 3.045, 5.073, 3.796, 5.932, 3.251, 3.328, 5.627, 4.159,
4.307, 5.604, 3.305, 4.146, 3.588, 3.425, 4.128, 5.207, 5.942,
5.47, 4.795, 4.636, 3.923, 5.068, 7.024, 5.559, 4.797, 3.425,
4.749, 5.418, 3.967, 2.817, 4.05, 4.93, 6.056, 5.937, 3.952,
4.567, 4.945, 4.655, 3.797, 4.306, 3.789, 5.892, 5.141, 4.769,
4.452, 4.943, 3.388, 2.495, 3.919, 4.386, 5.201, 4.942, 4.51,
4.579, 4.658, 5.084, 5.059, 6.827, 5.386, 4.819, 3.15, 4.386
), h_21 = c(3.86, 3.419, 4.111, 2.631, 2.901, 3.924, 3.444,
3.567, 3.663, 2.51, 3.592, 3.19, 3.056, 1.699, 3.313, 3.106,
2.53, 4.047, 2.854, 3.757, 3.214, 5.125, 3.464, 3.532, 4.447,
2.206, 3.573, 4.029, 3.56, 2.206, 3.041, 3.331, 3.154, 3.713,
2.721, 3.886, 3.941, 4.302, 1.914, 3.58, 3.578, 2.488, 3.295,
3.419, 3.464, 4.121, 3.959, 2.623, 3.114, 3.845, 2.729, 3.271,
2.364, 4.42, 3.901, 2.958, 3.419, 3.846, 3.942, 2.502, 3.382,
2.949, 2.888, 3.22, 3.898, 3.666, 2.721, 3.333, 3.232, 3.074,
2.769, 3.477, 2.933, 3.637, 3.498, 3.93, 4.048, 4.086, 2.949,
3.302, 3.14, 2.861, 3.726, 3.719, 3.748, 3.726, 2.921, 4.167,
2.42, 3.438, 2.806, 4.92, 2.626, 3.024, 3.408, 2.507, 3.624,
2.182, 2.355, 4.125, 2.99, 2.472, 4.123, 4.021, 3.955, 4.36,
3.393, 3.194, 2.623, 4.183, 4.177, 4.048, 3.315, 2.472, 3.006,
4.156, 2.923, 2.275, 3.577, 3.417, 4.577, 4.397, 2.805, 3.977,
3.541, 3.47, 2.691, 3.623, 3.066, 3.351, 2.95, 3.814, 3.464,
3.415, 2.565, 1.514, 4.005, 3.375, 3.57, 3.788, 4.596, 2.815,
2.734, 3.424, 3.232, 5.065, 4.094, 4.471, 2.373, 4.404),
h_22 = c(3.696, 2.023, 3.479, 2.055, 2.163, 3.123, 2.472,
2.54, 3.189, 2.453, 2.989, 2.69, 2.246, 1.216, 3.239, 2.623,
1.928, 4.207, 2.124, 2.968, 3.213, 3.784, 2.654, 2.123, 3.304,
2.814, 2.691, 3.423, 2.341, 1.337, 2.192, 3.08, 3.232, 2.579,
1.511, 3.344, 4.409, 4.034, 1.686, 2.237, 2.774, 2.157, 2.773,
2.464, 3.446, 2.751, 3.929, 2.048, 2.612, 3.692, 2.106, 1.589,
2.063, 3.477, 3.512, 2.851, 3.059, 3.226, 3.348, 2.064, 3.048,
2.397, 2.743, 2.451, 2.823, 2.567, 2.173, 2.458, 2.375, 2.476,
2.321, 2.831, 2.438, 2.611, 3.325, 3.481, 4.156, 3.557, 2.092,
4.33, 2.636, 1.537, 3.45, 2.511, 3.678, 2.096, 1.851, 3.461,
2.603, 3.488, 2.362, 3.282, 2.314, 3.142, 1.986, 1.642, 3.632,
1.055, 2.153, 4.491, 2.856, 2.397, 4.122, 3.356, 2.934, 4.786,
2.589, 3.187, 2.048, 4.002, 2.735, 2.987, 2.698, 2.397, 2.828,
3.752, 2.097, 2.043, 3.777, 2.9, 2.703, 2.717, 2.257, 3.32,
3.618, 2.996, 2.167, 3.632, 2.673, 2.361, 2.528, 3.243, 2.883,
2.632, 1.963, 1.051, 4.194, 2.675, 2.631, 2.803, 4.846, 2.025,
2.77, 2.626, 2.987, 3.567, 3.198, 3.879, 1.992, 4.539), h_23 = c(2.771,
1.806, 3.231, 1.822, 0.874, 3.096, 1.884, 2.118, 3.401, 1.712,
2.575, 2.425, 1.7, 0.833, 3.21, 2.331, 1.635, 3.212, 1.81,
2.477, 2.83, 3.832, 2.955, 2.364, 3.092, 3.731, 2.257, 2.996,
2.081, 1.062, 1.65, 2.328, 2.478, 2.521, 1.246, 3.046, 4.689,
3.862, 1.118, 1.894, 2.37, 1.515, 2.688, 2.22, 3.068, 2.166,
3.274, 1.653, 2.135, 2.91, 1.881, 1.311, 1.53, 2.917, 3.262,
1.914, 2.184, 3.067, 2.963, 1.126, 2.437, 1.724, 2.374, 2.059,
2.302, 1.865, 1.824, 2.203, 1.893, 2.236, 2.286, 2.196, 1.89,
2.371, 2.525, 3.207, 3.87, 3.359, 1.699, 4.334, 1.992, 1.304,
3.103, 2.454, 3.464, 1.819, 1.758, 3.134, 2.076, 3.827, 1.898,
3.213, 1.611, 2.344, 1.466, 1.318, 2.943, 0.808, 1.592, 3.638,
2.408, 1.957, 3.311, 2.834, 2.711, 5.069, 1.998, 3.226, 1.703,
3.527, 2.505, 2.487, 2.322, 1.957, 3.006, 3.355, 1.529, 1.497,
2.97, 2.723, 2.029, 2.036, 1.741, 2.5, 4.043, 2.586, 1.534,
2.942, 2.556, 1.832, 2.586, 2.321, 2.488, 2.472, 2.027, 0.891,
3.262, 2.081, 2.582, 2.053, 3.648, 1.831, 2.555, 2.312, 3.148,
2.915, 2.893, 3.55, 0.353, 3.713), h_24 = c(1.516, 1.248,
1.984, 0.918, 0.314, 2.254, 1.036, 1.374, 1.011, 1, 1.993,
1.617, 1.244, 0.555, 2.287, 1.777, 1.033, 1.891, 1.031, 1.717,
2.167, 2.443, 1.655, 1.944, 2.202, 3.513, 1.457, 1.779, 1.281,
0.745, 0.987, 1.579, 1.433, 1.748, 0.825, 2.249, 4.116, 3.06,
0.679, 1.393, 1.779, 0.978, 2.229, 1.602, 1.854, 1.215, 2.428,
0.503, 1.557, 1.299, 1.148, 0.801, 0.487, 1.878, 2.732, 0.652,
1.45, 2.161, 2.309, 0.313, 1.683, 1.293, 1.692, 1.547, 1.174,
1.252, 1.102, 1.525, 1.292, 1.338, 1.234, 1.309, 1.27, 1.452,
1.584, 1.97, 3.122, 2.457, 1.055, 2.882, 1.158, 0.83, 2.086,
1.878, 2.694, 1.223, 1.133, 1.786, 1.089, 3.285, 1.131, 2.242,
1.025, 1.362, 0.956, 0.597, 2.006, 0.465, 1.102, 2.474, 1.779,
1.363, 2.129, 2.214, 1.95, 4.824, 1.127, 2.634, 1.07, 2.754,
1.954, 1.576, 1.76, 1.363, 2.411, 2.436, 1.027, 0.843, 1.989,
2.183, 1.383, 1.187, 1.211, 1.204, 3.67, 1.271, 0.774, 2.006,
1.825, 1.212, 1.921, 1.468, 1.728, 1.623, 1.678, 0.444, 2.039,
1.32, 1.767, 1.336, 2.22, 1.008, 1.943, 1.45, 2.728, 2.065,
1.78, 2.981, 0.24, 2.61)), class = "data.frame", row.names = c(NA,
-150L))
There are different ways of achieving this. In general, you are looking for some unsupervised learning method (have some unlabelled data with characteristics and want to group observations (roads) based on similarity)
First note that in your data, type includes duplicates. That should not be the case, if each row is a different street. I assume this is a mistake:
d$type <- paste0("id_", 1:nrow(d))
dd <- as.matrix(d[,-1])
rownames(dd) <- d$type
K-means clustering:
dd <- scale(dd)
# 4 means clusering
set.seed(123)
km.res <- kmeans(dd, 15, nstart = 25)
# get cluster membership
km.res$cluster[1:10]
id_1 id_2 id_3 id_4 id_5 id_6 id_7 id_8 id_9 id_10
3 10 3 6 2 9 3 3 12 15
Alternatively, hierarchical clustering:
# hierarchical clustering
dist_mat <- dist(dd, method = 'euclidean')
hclust_avg <- hclust(dist_mat, method = 'average')
plot(hclust_avg)
cut_avg <- cutree(hclust_avg, k = 15)
plot(hclust_avg)
rect.hclust(hclust_avg , k = 15, border = 2:6)
abline(h = 3, col = 'red')
# get cluster membership:
cut_avg[1:10]
id_1 id_2 id_3 id_4 id_5 id_6 id_7 id_8 id_9 id_10
1 2 3 3 4 3 3 3 5 3
Note that in general different methods will have different results. If you look into the help files of the functions you will find more information about the possible options for each method, eg for the definition of distance, and for how to compute the clusters (average, max, min, ward).

Colour a Q-Q plot comparing two distributions by quartiles in R

I am trying to construct a Q-Q plot comparing two distributions, with the 99th percentile colored like the following example:
However I am not sure how to achieve this, here is a subset of my data:
dfw <- structure(list(Date.Time = structure(c(848502000, 848509200,
848512800, 848520000, 848523600, 848530800, 848534400, 848541600,
848545200, 848552400, 848556000, 848563200, 848566800, 848574000,
848577600, 848588400, 848595600, 848599200, 848606400, 848610000,
848617200, 848620800, 848628000, 848631600, 848638800, 848642400,
848649600, 848653200, 848660400, 848664000, 848674800, 848682000,
848685600, 848692800, 848696400, 848703600, 848707200, 848714400,
848718000, 848725200, 848728800, 848736000, 848739600, 848746800,
848750400, 848761200, 848768400, 848772000, 848779200, 848782800,
848790000, 848793600, 848800800, 848804400, 848811600, 848815200,
848822400, 848826000, 848833200, 848847600, 848854800, 848858400,
848865600, 848869200, 848876400, 848880000, 848887200, 848890800,
848898000, 848901600, 848908800, 848912400, 848919600, 848923200,
848934000, 848941200, 848944800, 848952000, 848955600, 848962800,
848966400, 848973600, 848977200, 848984400, 848988000, 848995200,
848998800, 849006000, 849009600, 853682400, 853686000, 853714800,
853718400, 853725600, 853729200, 853736400, 853750800, 853758000,
853761600, 853768800), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
Hs.Mod = c(1.5960001, 1.5600001, 1.5480001, 1.552, 1.552,
1.534, 1.5180001, 1.462, 1.4260001, 1.3740001, 1.36, 1.3340001,
1.3080001, 1.2720001, 1.256, 1.218, 1.212, 1.21, 1.2060001,
1.2160001, 1.248, 1.25, 1.25, 1.264, 1.3560001, 1.394, 1.3700001,
1.332, 1.2900001, 1.2800001, 1.268, 1.2800001, 1.2800001,
1.3240001, 1.3240001, 1.286, 1.2540001, 1.19, 1.172, 1.1700001,
1.1860001, 1.2340001, 1.274, 1.3640001, 1.4120001, 1.58,
1.6580001, 1.6660001, 1.682, 1.748, 1.9280001, 1.9800001,
2.026, 2.052, 2.214, 2.328, 2.4320002, 2.39, 2.2180002, 1.9080001,
1.792, 1.7400001, 1.7140001, 1.7140001, 1.692, 1.6680001,
1.608, 1.58, 1.536, 1.524, 1.5640001, 1.5760001, 1.6100001,
1.6240001, 1.6120001, 1.5920001, 1.58, 1.542, 1.5200001,
1.48, 1.4640001, 1.4260001, 1.406, 1.386, 1.3820001, 1.34,
1.312, 1.268, 1.248, 1.8080001, 1.7960001, 1.644, 1.6420001,
1.6600001, 1.6880001, 1.7820001, 2.138, 2.2740002, 2.2940001,
2.252), Hs.Obs = c(1.741, 1.524, 1.618, 1.658, 1.697, 1.822,
1.792, 1.463, 1.433, 1.376, 1.208, 1.299, 1.255, 1.304, 1.328,
1.182, 1.282, 1.293, 1.228, 1.281, 1.45, 1.356, 1.501, 1.5,
1.356, 1.477, 1.408, 1.544, 1.497, 1.768, 2.04, 2.074, 2.042,
2.147, 2.224, 2.022, 2.017, 2.047, 2.353, 2.597, 2.838, 2.67,
2.762, 2.687, 2.734, 2.738, 2.938, 2.795, 2.549, 2.669, 2.447,
2.676, 2.577, 2.383, 2.362, 2.284, 2.341, 2.33, 2.397, 2.498,
2.317, 2.373, 2.377, 2.362, 2.218, 2.226, 1.97, 2.087, 1.874,
2.116, 2.022, 1.886, 2.046, 1.879, 1.638, 1.677, 1.638, 1.647,
1.551, 1.596, 1.591, 1.384, 1.345, 1.522, 1.469, 1.503, 1.459,
1.327, 1.453, 2.448, 2.235, 2.104, 1.958, 2.118, 2.209, 2.034,
2.229, 2.505, 2.163, 2.372)), row.names = c(NA, 100L), class = "data.frame")
Code to make the Q-Q plot:
ggplot(data=dfw, aes(x=sort(Hs.Obs), y=sort(Hs.Mod))) + geom_point(shape = 1, size =2) + xlab('Obs') + ylab('Model')+
theme_bw()+
geom_abline(linetype=2)
Code attempting to colour the 99th percentile:
ggplot(data=dfw, aes(x=sort(Hs.Obs), y=sort(Hs.Mod), col=cut(Hs.Mod,quantile(Hs.Mod, probs = .99)))) +
geom_point(shape = 1, size =2) + xlab('Obs') + ylab('Model')+
theme_bw()+
geom_abline(linetype=2)
Resulting plot:
I'm looking for some help to sort this out, as the attempts I have tried aren't working.
Thanks in advance!
try this
dfw %>%
mutate(qq = quantile(Hs.Mod, probs = c(0.99)),
qq_gt99 = ifelse(qq<= Hs.Mod, 1, 0)) %>%
ggplot(aes(x=Hs.Mod, y = Hs.Obs, col= as.factor(qq_gt99))) + geom_point()
if you order the observations first
dfw %>%
mutate(mod_ordered = sort(Hs.Mod),
obs_ordered = sort(Hs.Obs),
qq = quantile(mod_ordered, probs = c(0.99)),
qq_gt99 = ifelse(qq<= mod_ordered, 1, 0)) %>%
ggplot(aes(x=mod_ordered, y = obs_ordered, col=
as.factor(qq_gt99))) + geom_point()
Im using the dplyr library to filter() the data by quantile():
The code:
library(dplyr)
library(ggplot2)
ggplot()+
geom_point(data=filter(dfw,Hs.Obs>quantile(dfw$Hs.Mod,.99)),aes(x=sort(Hs.Obs),y=sort(Hs.Mod), col="Cuantile 99%"))+
geom_point(data=filter(dfw,Hs.Obs<quantile(dfw$Hs.Mod,.99)),aes(x=sort(Hs.Obs),y=sort(Hs.Mod), col="Cuantile 1-98%"))+
geom_abline(linetype=2)+xlab('Obs') + ylab('Model')+ theme_bw()

How to fit a regression of information (negative entropy) ~ size in R?

I would like to fit a regression to negative entropy ~ size data in order to estimate the optimum size (pointed with the arrow).
The range of entropy data is between 0 and 1, while the range of size data goes from x > 0 to ∞. The information value here was computed following Information = Hmax - H using Shannon
An example of the data is:
size <- c(0.0010, 0.0035, 0.0060, 0.0085, 0.0110, 0.0135, 0.0160, 0.0185, 0.0210, 0.0235, 0.0260, 0.0285, 0.0310, 0.0335, 0.0360, 0.0385, 0.0410, 0.0435, 0.0460, 0.0485, 0.0510, 0.0535, 0.0560, 0.0585, 0.0610, 0.0635, 0.0660, 0.0685, 0.0710, 0.0735, 0.0760, 0.0785, 0.0810, 0.0835, 0.0860, 0.0885, 0.0910, 0.0935, 0.0960, 0.0985, 0.1010, 0.1035, 0.1060, 0.1085, 0.1110, 0.1135, 0.1160, 0.1185, 0.1210, 0.1235, 0.1260, 0.1285, 0.1310, 0.1335, 0.1360, 0.1385, 0.1410, 0.1435, 0.1460, 0.1485, 0.1510, 0.1535, 0.1560, 0.1585, 0.1610, 0.1635, 0.1660, 0.1685, 0.1710, 0.1735, 0.1760, 0.1785, 0.1810, 0.1835, 0.1860, 0.1885, 0.1910, 0.1935, 0.1960, 0.1985, 0.2010, 0.2035, 0.2060, 0.2085, 0.2110, 0.2135, 0.2160, 0.2185, 0.2210, 0.2235, 0.2260, 0.2285, 0.2310, 0.2335, 0.2360, 0.2385, 0.2410, 0.2435, 0.2460, 0.2485, 0.2510, 0.2535, 0.2560, 0.2585, 0.2610, 0.2635, 0.2660, 0.2685, 0.2710, 0.2735, 0.2760, 0.2785, 0.2810, 0.2835, 0.2860, 0.2885, 0.2910, 0.2935, 0.2960, 0.2985, 0.3010, 0.3035, 0.3060, 0.3085, 0.3110, 0.3135, 0.3160, 0.3185, 0.3210, 0.3235, 0.3260, 0.3285, 0.3310, 0.3335, 0.3360, 0.3385, 0.3410, 0.3435, 0.3460, 0.3485, 0.3510, 0.3535, 0.3560, 0.3585, 0.3610, 0.3635, 0.3660, 0.3685, 0.3710, 0.3735, 0.3760, 0.3785, 0.3810, 0.3835, 0.3860, 0.3885, 0.3910, 0.3935, 0.3960, 0.3985, 0.4010, 0.4035, 0.4060, 0.4085, 0.4110, 0.4135, 0.4160, 0.4185, 0.4210, 0.4235, 0.4260, 0.4285, 0.4310, 0.4335, 0.4360, 0.4385, 0.4410, 0.4435, 0.4460, 0.4485, 0.4510, 0.4535, 0.4560, 0.4585, 0.4610, 0.4635, 0.4660, 0.4685, 0.4710, 0.4735, 0.4760, 0.4785, 0.4810, 0.4835, 0.4860, 0.4885, 0.4910, 0.4935, 0.4960, 0.4985)
information <- c(0.001, 0.136, 0.366, 0.532, 0.642, 0.719, 0.773, 0.810, 0.839, 0.854, 0.871, 0.878, 0.882, 0.885, 0.885, 0.886, 0.884, 0.878, 0.877, 0.873, 0.867, 0.864, 0.847, 0.851, 0.839, 0.839, 0.836, 0.828, 0.822, 0.821, 0.817, 0.817, 0.805, 0.805, 0.791, 0.796, 0.798, 0.795, 0.799, 0.788, 0.787, 0.785, 0.779, 0.775, 0.769, 0.771, 0.772, 0.769, 0.770, 0.746, 0.777, 0.755, 0.755, 0.752, 0.744, 0.745, 0.745, 0.759, 0.740, 0.747, 0.740, 0.747, 0.740, 0.738, 0.745, 0.718, 0.732, 0.748, 0.714, 0.731, 0.744, 0.710, 0.720, 0.750, 0.725, 0.708, 0.715, 0.753, 0.720, 0.702, 0.722, 0.708, 0.701, 0.716, 0.723, 0.719, 0.695, 0.692, 0.701, 0.720, 0.719, 0.699, 0.709, 0.699, 0.703, 0.714, 0.706, 0.686, 0.698, 0.694, 0.703, 0.708, 0.698, 0.653, 0.676, 0.687, 0.697, 0.707, 0.689, 0.691, 0.666, 0.646, 0.660, 0.687, 0.706, 0.722, 0.714, 0.702, 0.654, 0.642, 0.647, 0.650, 0.663, 0.673, 0.703, 0.704, 0.698, 0.694, 0.655, 0.641, 0.620, 0.625, 0.631, 0.644, 0.655, 0.663, 0.691, 0.669, 0.674, 0.647, 0.644, 0.659, 0.657, 0.652, 0.649, 0.636, 0.619, 0.613, 0.609, 0.629, 0.655, 0.667, 0.652, 0.640, 0.636, 0.643, 0.640, 0.652, 0.649, 0.645, 0.657, 0.654, 0.650, 0.622, 0.614, 0.617, 0.612, 0.621, 0.627, 0.622, 0.616, 0.626, 0.615, 0.624, 0.634, 0.633, 0.631, 0.629, 0.614, 0.617, 0.630, 0.633, 0.629, 0.620, 0.629, 0.626, 0.614, 0.624, 0.608, 0.591, 0.606, 0.607, 0.605, 0.618, 0.610, 0.622, 0.618, 0.616, 0.613, 0.612)
It seems (please correct me) that the information data follows a Maxwell-Boltzmann distribution
require(shotGroups)
plot(information ~ log(size))
lines(pMaxwell(information, sigma= 0.3639920) ~ log(size), col = "red")
However, I am not sure how to estimate this optimum value using a parameter in a regression or if there is any other approach to determine this optimum rather than max(information).
Any thoughts?
This works OK, although I had to limit the upper bound of the root-finding function below the region where the spline starts to wiggle ...
library(splines)
ss <- smooth.spline(log(size),information,spar=0.4)
uu <- uniroot(function(x) predict(ss,x=x,deriv=1)$y,interval=c(-5,-3))
Result is -3.29.
Picture:
plot(information ~log(size))
lines(ss$x,ss$y,col="red",lwd=2)
abline(v=uu$root,col="blue")
I cannot place an image in a comment, and so place it here. Using the example data in your post, I got an OK fit to the equation "y = a*pow(x,b+c/x)" with parameters a = 5.3705331969760373E-01, b = -1.8691263532001362E-01 and c = 1.5557275459064772E-03 yielding an R-squared of 0.9770 and RMSE of 0.0156

Resources