Example:
library(tidyverse)
mtcars1 <- mtcars %>% mutate(rn = row_number(), blah = rnorm(n(), 10, 1))
mtcars2 <- mtcars %>% mutate(rn = row_number(), blah2 = rnorm(n(), 5, 1))
mtcars_combined <- mtcars1 %>% inner_join(mtcars2, by = 'rn')
mtcars_combined %>% glimpse
Rows: 32
Columns: 25
$ mpg.x <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 17.8, 16.4, 17.3, 15.2, 10.4, 10.4, 14.7, 32.4, 30.4, …
$ cyl.x <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 8, 8, 8, 8, 4, 4, 4, 8, 6, 8, 4
$ disp.x <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 140.8, 167.6, 167.6, 275.8, 275.8, 275.8, 472.0, 460.0, 44…
$ hp.x <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 180, 180, 205, 215, 230, 66, 52, 65, 97, 150, 150, 245, 1…
$ drat.x <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92, 3.92, 3.07, 3.07, 3.07, 2.93, 3.00, 3.23, 4.08, 4.93, …
$ wt.x <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.150, 3.440, 3.440, 4.070, 3.730, 3.780, 5.250, 5.424, 5.…
$ qsec.x <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22.90, 18.30, 18.90, 17.40, 17.60, 18.00, 17.98, 17.82, 17…
$ vs.x <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1
$ am.x <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1
$ gear.x <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3, 3, 3, 3, 3, 4, 5, 5, 5, 5, 5, 4
$ carb.x <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1, 1, 2, 2, 4, 2, 1, 2, 2, 4, 6, 8, 2
$ rn <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,…
$ blah.x <dbl> 9.652697, 10.497945, 9.402642, 10.134072, 9.645391, 10.177435, 10.691140, 10.800154, 10.005802, 10.681475, 8.91997…
$ mpg.y <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 17.8, 16.4, 17.3, 15.2, 10.4, 10.4, 14.7, 32.4, 30.4, …
$ cyl.y <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4, 8, 8, 8, 8, 4, 4, 4, 8, 6, 8, 4
$ disp.y <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 140.8, 167.6, 167.6, 275.8, 275.8, 275.8, 472.0, 460.0, 44…
$ hp.y <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 180, 180, 205, 215, 230, 66, 52, 65, 97, 150, 150, 245, 1…
$ drat.y <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92, 3.92, 3.07, 3.07, 3.07, 2.93, 3.00, 3.23, 4.08, 4.93, …
$ wt.y <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.150, 3.440, 3.440, 4.070, 3.730, 3.780, 5.250, 5.424, 5.…
$ qsec.y <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22.90, 18.30, 18.90, 17.40, 17.60, 18.00, 17.98, 17.82, 17…
$ vs.y <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1
$ am.y <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1
$ gear.y <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3, 3, 3, 3, 3, 4, 5, 5, 5, 5, 5, 4
$ carb.y <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1, 1, 2, 2, 4, 2, 1, 2, 2, 4, 6, 8, 2
$ blah.y <dbl> 6.047953, 4.379261, 4.609405, 4.420695, 6.545795, 4.962723, 5.955824, 5.011969, 5.617293, 4.347312, 3.126674, 4.13…
I only joined on one field, rn. Because the remaining field names match in both data frames, the duplicates get the suffixes .x and .y. Of course, I could just have joined onto a smaller df with e.g.
mtcars_combined <- mtcars1 %>% inner_join(mtcars2 %>% select(rn, blah2), by = 'rn')
But I'd like to know whether there's a clever way to tell R to keep the matching fields from the left side only and drop any duplicate fields coming from the right.
One approach is to make use of the suffix argument and drop the duplicated cols using select:
library(dplyr)
mtcars1 <- mtcars %>% mutate(rn = row_number(), blah = rnorm(n(), 10, 1))
mtcars2 <- mtcars %>% mutate(rn = row_number(), blah2 = rnorm(n(), 5, 1))
mtcars_combined <- mtcars1 %>% inner_join(mtcars2, by = 'rn', suffix = c("", "_drop"))
mtcars_combined <- select(mtcars_combined, -ends_with("_drop"))
glimpse(mtcars_combined)
#> Rows: 32
#> Columns: 14
#> $ mpg <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2, 1...
#> $ cyl <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4, 4...
#> $ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 140.8...
#> $ hp <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 180,...
#> $ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92, 3...
#> $ wt <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.150...
#> $ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22.90...
#> $ vs <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1...
#> $ am <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0...
#> $ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3...
#> $ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1, 1...
#> $ rn <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18...
#> $ blah <dbl> 10.856380, 9.634127, 10.280296, 10.153320, 9.255293, 10.38564...
#> $ blah2 <dbl> 5.724742, 5.740158, 4.743665, 5.337721, 4.239426, 5.989236, 4...
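A variant of the same idea is to drop the duplicated columns from the right-hand table before joining, so no suffixes are created at all. This is only a sketch: it assumes the join key is rn, and key and new_cols are illustrative names that do not appear in the question.

```r
library(dplyr)

mtcars1 <- mtcars %>% mutate(rn = row_number(), blah = rnorm(n(), 10, 1))
mtcars2 <- mtcars %>% mutate(rn = row_number(), blah2 = rnorm(n(), 5, 1))

key <- "rn"
# Columns that exist only on the right-hand side
new_cols <- setdiff(names(mtcars2), names(mtcars1))

# Join the left table to the key plus the right-only columns
mtcars_combined <- mtcars1 %>%
  inner_join(select(mtcars2, all_of(c(key, new_cols))), by = key)

names(mtcars_combined)
```

Compared with the suffix approach, this never materialises the duplicate columns, which may matter for wide tables.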
I have a dataset comparing 15 hybrids, each with 5 separate measurements. I am trying to spread the data into a wider dataset using pivot_wider for a regression analysis, since spread() would not work (probably because of the repeated observations).
The dataset I am working with is below:
data <- structure(list(hybrid = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7,
7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8,
8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11,
11, 11, 11, 11, 11, 11, 11, 11, 12, 12, 12, 12, 12, 12, 12, 12,
12, 12, 12, 12, 12, 12, 12, 13, 13, 13, 13, 13, 13, 13, 13, 13,
13, 13, 13, 13, 13, 13, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14,
14, 14, 14, 14, 14, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15,
15, 15, 15), measurement = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4,
4, 5, 5, 5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 1, 1,
1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 1, 1, 1, 2, 2, 2, 3, 3,
3, 4, 4, 4, 5, 5, 5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5,
5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 1, 1, 1, 2, 2,
2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4,
4, 5, 5, 5, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 1, 1, 1, 2, 2, 2, 3,
3, 3, 4, 4, 4, 5, 5, 5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5,
5, 5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 1, 1, 1, 2,
2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4,
4, 4, 5, 5, 5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5), value = c(245,
889, 450, 45, 515, 318, 956, 434, 29, 740, 156, 516, 767, 292,
753, 573, 636, 611, 777, 557, 408, 95, 482, 227, 495, 360, 55,
76, 393, 37, 667, 802, 724, 900, 885, 191, 79, 143, 531, 398,
324, 129, 172, 467, 25, 101, 476, 629, 915, 122, 498, 649, 354,
527, 920, 788, 565, 552, 586, 127, 461, 307, 77, 552, 198, 240,
816, 144, 136, 781, 593, 421, 233, 264, 812, 407, 492, 932, 940,
139, 764, 200, 352, 754, 271, 506, 381, 973, 678, 848, 432, 358,
218, 736, 287, 411, 220, 264, 531, 669, 666, 727, 841, 792, 79,
460, 159, 426, 90, 395, 793, 507, 262, 814, 157, 641, 230, 870,
304, 591, 636, 277, 534, 783, 562, 938, 889, 68, 557, 892, 809,
157, 71, 54, 256, 246, 301, 823, 622, 953, 6, 66, 556, 902, 207,
832, 248, 540, 192, 65, 381, 712, 15, 323, 1, 193, 146, 637,
488, 158, 289, 839, 229, 237, 273, 978, 560, 969, 898, 204, 335,
930, 444, 968, 920, 398, 303, 318, 975, 182, 630, 4, 624, 271,
272, 438, 661, 728, 32, 106, 473, 465, 498, 33, 189, 918, 704,
605, 867, 240, 833, 497, 514, 241, 860, 228, 643, 791, 4, 898,
574, 225, 339, 365, 387, 548, 88, 604, 283)), class = "data.frame", row.names = c(NA,
-219L))
I'm new to the pivot_wider function, so when I run my code, I get an error:
data%>%
pivot_wider(cols = -hybrid, names_to = c("1","2","3","4","5"))
Error in pivot_wider(., cols = -hybrid, names_to = c("1", "2", "3", "4", :
unused arguments (cols = -hybrid, names_to = c("1", "2", "3", "4", "5"))
How can I spread this data so that I have 5 columns? Hybrid, 1, 2, 3, 4, 5 (with the values under the columns entitled 1:5).
My guess is that you are looking for this:
library(tidyr)
pivot_wider(data, id_cols = hybrid, names_from = measurement, values_from = "value", values_fn = sum)
# # A tibble: 15 x 6
# hybrid `1` `2` `3` `4` `5`
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 1 1584 878 1419 1412 1812
# 2 2 1820 1742 804 910 506
# 3 3 2193 1976 753 851 664
# 4 4 1206 1535 1530 2273 1265
# 5 5 845 990 1096 1795 1309
# 6 6 1831 1843 1306 1158 2499
# 7 7 1008 1434 1015 2062 1712
# 8 8 1045 1278 1583 1028 1765
# 9 9 913 1317 1500 957 1449
# 10 10 1037 556 1746 1025 1665
# 11 11 1620 638 1050 340 1283
# 12 12 1357 1488 2427 1469 2332
# 13 13 1019 1787 899 1371 866
# 14 14 1436 1140 2176 1570 1615
# 15 15 1662 1476 929 1023 887
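As a quick cross-check of the summed table, base R's xtabs() also adds up a value column over the combinations of two grouping variables. The sketch below uses a small made-up data frame (toy is hypothetical, not the question's data) so the sums are easy to verify by hand.

```r
# Toy data: hybrid 1 has two observations for measurement 1
toy <- data.frame(
  hybrid      = c(1, 1, 1, 2, 2),
  measurement = c(1, 1, 2, 1, 2),
  value       = c(10, 20, 5, 7, 3)
)

# Sum `value` for every hybrid x measurement combination
wide <- xtabs(value ~ hybrid + measurement, data = toy)

# Convert the contingency table to a plain data frame, one column per measurement
as.data.frame.matrix(wide)
```

Note that xtabs() fills combinations that never occur with 0 rather than NA, so it is best used as a sanity check rather than a drop-in replacement.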
Using dcast from data.table
library(data.table)
dcast(setDT(data), hybrid ~ measurement, fun.aggregate = sum, value.var = "value")
# hybrid 1 2 3 4 5
# 1: 1 1584 878 1419 1412 1812
# 2: 2 1820 1742 804 910 506
# 3: 3 2193 1976 753 851 664
# 4: 4 1206 1535 1530 2273 1265
# 5: 5 845 990 1096 1795 1309
# 6: 6 1831 1843 1306 1158 2499
# 7: 7 1008 1434 1015 2062 1712
# 8: 8 1045 1278 1583 1028 1765
# 9: 9 913 1317 1500 957 1449
#10: 10 1037 556 1746 1025 1665
#11: 11 1620 638 1050 340 1283
#12: 12 1357 1488 2427 1469 2332
#13: 13 1019 1787 899 1371 866
#14: 14 1436 1140 2176 1570 1615
#15: 15 1662 1476 929 1023 887
library(tidyverse)
ex <- structure(list(group = c("Group A", "Group B", "Group C"), data = list(
structure(list(a = c(25.1, 15.1, 28.7, 29.7, 5.3, 3.4, 5.3,
10.1, 2.4, 18, 4.7, 22.1, 9.5, 3.1, 26.5, 5.1, 24, 22.5,
19.4, 22.9, 24.5, 18.2, 7.9, 5.3, 24.7), b = c(95.1, 51,
100, 94.1, 47.3, 0, 50.7, 45.8, 40.7, 49.4, 51.9, 76.4, 26.7,
19.8, 37.4, 59.4, 59.1, 60.2, 26.1, 2.8, 100, 40.7, 56.4,
42.5, 0), c = c(39.9, 42.7, 16.3, 11.1, 56.9, 17.8, 62, 28.1,
43, 44.8, 54.8, 8.7, 5.5, 40.2, 7.7, 60.7, 24.8, 7.5, 3.5,
16.9, 31.6, 45.8, 76.7, 58.6, 15.8), d = c(-2.39999999999999,
28.6, -4.59999999999999, -1.39999999999999, 10.3, 3.1, 23.4,
-43, -36.3, 32.4, 33.1, 9.8, 1.5, -17.6, 16.6, 20.9, 7.8,
-1.7, -23.3, 0, -15, 59.3, -40.2, 46.9, 4.7)), .Names = c("a",
"b", "c", "d"), row.names = c(NA, -25L), class = c("tbl_df",
"tbl", "data.frame")), structure(list(a = c(5, 4.7, 30.3,
14.3, 31.6, 6, 4.9, 23.3, 26.9, 16.9, 27.2, 23.8, 19.9, 28.6,
9.9, 17.4, 14.3, 12.5, 30.4, 30.3, 30, 6, 18, 23.7, 5.1),
b = c(48.9, 41.3, 20.1, 63.7, 85.1, 30.3, 52.8, 49.7,
27.1, 51.6, 21.8, 52.4, 52.5, 59.6, 13.7, 53.1, 69, 66.9,
23.4, 35.4, 45.8, 23.7, 62.9, 90.3, 59.6), c = c(37.4,
18.5, 64.6, 13.5, 7.8, 6.8, 12.7, 8.5, 7.8, 5.4, 14.1,
20.5, 10.9, 10.5, 7.5, 14.7, 6.9, 0.699999999999999,
4.7, 1.9, 11.9, 0.9, 7.2, 9.2, 42.2), d = c(4.9, -3.7,
13.5, 21.9, -2.69999999999999, 6.6, 0.5, -12.3, 38.7,
-25.8, -18, 28.4, 38.3, -3.6, 39.4, 19, 23.4, -38.7,
17, 36.3, -31.7, -9.3, -10.5, 9.7, -10.6)), .Names = c("a",
"b", "c", "d"), row.names = c(NA, -25L), class = c("tbl_df",
"tbl", "data.frame")), structure(list(a = c(29.9, 12.8, 23.9,
26.2, 27.5, 32.6, 33.2, 24.8, 29, 22.6, 4.7, 25.6, 4.7, 13.1,
25.9, 14.5, 23.5, 26.6, 12.8, 24.1, 9.1, 31.9, 24.8, 4.6,
17.9), b = c(63.7, 23.3, 71.2, 46.7, 30.6, 49.3, 14.6, 68.4,
27.9, 49.1, 60.5, 26.4, 56.9, 55.4, 37.9, 40.7, 32.7, 68.5,
42.7, 27.9, 67.5, 43.4, 76.6, 53.3, 26.8), c = c(1.6, 32,
18.6, 14, 0.5, 7.2, 27.3, 8.9, 11, 15.5, 16.7, 16.4, 63.1,
14.7, 6.8, 9, 3.1, 11.7, 11, 11.5, 10.6, 14.9, 7.1, 13.2,
5.1), d = c(-35.4, 21, 12, 1.8, 37.6, 9.2, 17.6, 0, -19.4,
32.6, -32, -3.6, 7.2, -25.7, 9.1, -8, 35.8, 24.8, -13.9,
-21.7, -28.7, 0.200000000000003, -16.9, -26.5, 26.2)), .Names = c("a",
"b", "c", "d"), row.names = c(NA, -25L), class = c("tbl_df",
"tbl", "data.frame"))), h_candidates = list(structure(c(0.17320508075689, 2.37782856461527, 2.94890646051978, 3.35205778704499, 3.66771041547043, 3.95224618679369), .Names = c("0%", "0.01%", "0.02%", "0.03%", "0.04%", "0.05%")), structure(c(0.316227766016836, 2.63452963884554, 3.2327619513522, 3.63593179253957, 3.97743636027027, 4.22137418384109), .Names = c("0%", "0.01%", "0.02%", "0.03%", "0.04%", "0.05%")), structure(c(0.316227766016837, 2.7258026340878, 3.24807635378234, 3.62353418639869, 3.92683078321437, 4.17731971484109), .Names = c("0%", "0.01%", "0.02%", "0.03%", "0.04%", "0.05%"))), assignment = list(
structure(list(`0%` = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25),
`0.01%` = c(1, 2, 3, 3, 4, 5, 4, 6, 7, 8, 9, 10, 11,
12, 13, 4, 14, 15, 16, 17, 18, 19, 20, 21, 17), `0.02%` = c(1,
2, 3, 3, 4, 5, 4, 6, 7, 8, 9, 10, 11, 12, 13, 4, 14,
15, 16, 17, 18, 19, 20, 21, 17), `0.03%` = c(1, 2, 3,
3, 4, 5, 4, 6, 7, 8, 9, 10, 11, 12, 13, 4, 10, 14, 15,
16, 17, 18, 19, 9, 16), `0.04%` = c(1, 2, 3, 4, 5, 6,
5, 7, 8, 9, 10, 11, 12, 13, 14, 5, 11, 15, 16, 17, 18,
19, 20, 10, 17)), .Names = c("0%", "0.01%", "0.02%",
"0.03%", "0.04%"), row.names = c(NA, -25L), class = c("tbl_df",
"tbl", "data.frame")), structure(list(`0%` = c(1, 2, 3, 4,
5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25), `0.01%` = c(1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 13, 14, 15, 16, 4, 17, 18, 19, 20, 21, 22,
23, 24), `0.02%` = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 13, 4, 16, 17, 9, 18, 19, 14, 20, 21), `0.03%` = c(1,
2, 3, 4, 5, 6, 2, 7, 8, 9, 10, 11, 12, 13, 14, 12, 4, 15,
6, 8, 16, 17, 13, 18, 19), `0.04%` = c(1, 2, 3, 4, 5, 6,
2, 7, 8, 9, 10, 11, 12, 13, 14, 12, 4, 15, 6, 8, 7, 16, 13,
17, 1)), .Names = c("0%", "0.01%", "0.02%", "0.03%", "0.04%"
), row.names = c(NA, -25L), class = c("tbl_df", "tbl", "data.frame"
)), structure(list(`0%` = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25
), `0.01%` = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,
14, 12, 15, 16, 17, 15, 18, 19, 4, 20, 21, 22), `0.02%` = c(1,
2, 3, 4, 5, 6, 7, 8, 9, 5, 10, 11, 12, 13, 11, 14, 5, 15,
14, 16, 17, 18, 8, 19, 20), `0.03%` = c(1, 2, 3, 4, 5, 6,
7, 3, 8, 9, 10, 11, 12, 10, 11, 13, 5, 14, 13, 8, 10, 4,
3, 13, 6), `0.04%` = c(1, 2, 3, 4, 5, 5, 6, 3, 7, 8, 9, 10,
11, 9, 10, 12, 5, 13, 12, 7, 9, 4, 3, 12, 5)), .Names = c("0%",
"0.01%", "0.02%", "0.03%", "0.04%"), row.names = c(NA, -25L
), class = c("tbl_df", "tbl", "data.frame")))), .Names = c("group", "data", "h_candidates", "assignment"), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -3L))
With the data structured as above, I would like to change all values within the assignment data.frames that appear fewer than k times (let's say k = 5) in a column.
So I need a solution that goes through each data.frame in turn, then each column within it, checks which values appear fewer than 5 times in that column, and replaces any such values with 0.
Ideally, the solution would use tidyverse functions. I think nested purrr::map calls and dplyr::mutate are needed here, but I don't know how to count appearances within a column and then replace the values.
You can use purrr::map() to loop over the list column with the dataframes,
and then purrr::modify() to loop over each column in each dataframe. Then
it's just a matter of defining a function that counts occurrences of values in
a vector, and replaces them if the count is less than k:
library(tidyverse)
ex %>%
mutate(assignment = map(assignment, modify, function(x, k) {
n <- table(x)[as.character(x)]
replace(x, n < k, 0)
}, k = 5))
#> # A tibble: 3 x 4
#> group data h_candidates assignment
#> <chr> <list> <list> <list>
#> 1 Group A <tibble [25 x 4]> <dbl [6]> <tibble [25 x 5]>
#> 2 Group B <tibble [25 x 4]> <dbl [6]> <tibble [25 x 5]>
#> 3 Group C <tibble [25 x 4]> <dbl [6]> <tibble [25 x 5]>
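The counting step inside the anonymous function can be seen in isolation on a small vector. This is a base-R sketch with made-up values; the names counts, per_element and result are illustrative only.

```r
x <- c(1, 1, 2, 3, 3, 3)

# Frequency of each distinct value, named by the value as a string
counts <- table(x)

# Look up the frequency for every element of x, in the original order
per_element <- as.vector(counts[as.character(x)])

# Replace elements whose value occurs fewer than 2 times with 0
result <- replace(x, per_element < 2, 0)
result
```

Indexing the table by as.character(x) is what turns a per-value count into a per-element count of the same length as x.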
We can also define a couple of helper functions to make this more readable:
# Replace elements in x given by f(x) with val
replace_if <- function(x, f, val, ...) {
replace(x, f(x, ...), val)
}
appears_less_than <- function(x, k) {
table(x)[as.character(x)] < k
}
Combining these two functions gets what we are after:
replace_if(c(1, 1, 2, 3), appears_less_than, k = 2, 0)
#> [1] 1 1 0 0
Now all that remains is to put the pieces together:
res <- ex %>%
mutate(assignment = map(assignment, modify, replace_if,
appears_less_than, k = 3, 0))
As @thothal mentioned, no value in your data occurs more than 4 times in a
column, but with k = 3 we can have a look at the result
(to illustrate, just the 3rd dataframe in assignment):
res %>% pluck("assignment", 3)
#> # A tibble: 25 x 5
#> `0%` `0.01%` `0.02%` `0.03%` `0.04%`
#> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 0 0 0 0 0
#> 2 0 0 0 0 0
#> 3 0 0 0 3 3
#> 4 0 0 0 0 0
#> 5 0 0 5 0 5
#> 6 0 0 0 0 5
#> 7 0 0 0 0 0
#> 8 0 0 0 3 3
#> 9 0 0 0 0 0
#> 10 0 0 5 0 0
#> # ... with 15 more rows
Finally, we could also use a scoped mutate_at() to further reduce some of
the excess syntax:
ex %>%
mutate_at(vars(assignment), map, modify,
replace_if, appears_less_than, k = 3, 0)
Created on 2018-08-08 by the reprex package (v0.2.0.9000).
This should do the trick:
library(tidyverse)
ex %>%
mutate(
assignment = map(assignment,
~ rowid_to_column(.x, "id") %>%
gather(key, value, -id) %>%
group_by(key) %>%
add_count(value) %>%
mutate(value = ifelse(n < 5, 0, value)) %>%
select(-n) %>%
spread(key, value) %>%
select(-id)
)
)
Note in your example there is no single value appearing more than 4 times.
Explanation
You map over all assignment data.frames
For each data.frame you first add an id column (needed for gather/spread)
Then you gather all columns but id into a key (the former column names) / value (the values) pair
For each group of former columns (now in key) you add a counter of the values in value
Then you replace occurrences which appear less than 5 times by 0
You remove n (the counter)
spread the data back into the original format
Remove the id column
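The same count-and-replace logic can also be sketched without reshaping, by applying a per-column helper. The version below is base R and uses a made-up data frame; zero_rare and toy are hypothetical names, and ave() is swapped in here as the counting technique instead of add_count().

```r
# Replace values that occur fewer than k times in x with 0
zero_rare <- function(x, k) {
  # ave() returns, for each element, the size of the group it belongs to
  counts <- ave(seq_along(x), x, FUN = length)
  replace(x, counts < k, 0)
}

toy <- data.frame(a = c(1, 1, 2), b = c(3, 4, 4))

# Apply the helper to every column while keeping the data frame structure
toy[] <- lapply(toy, zero_rare, k = 2)
toy
```

A helper like this could presumably be applied to each data frame in the assignment list column with map(), which avoids the round trip through long format.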
I am running a path analysis in lavaan (with ordinal) and would like to use imputed data.
But whether I impute the data separately and use runMI, or let the original data be imputed as part of the sem.mi command, I get the same error:
Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
Error during wrapup: evaluation nested too deeply: infinite recursion / options(expressions=)?
If I run:
options(expressions = 100000)
the error message changes to: Error: protect(): protection stack overflow
I tried to change
--max-ppsize=500000
but from the command line I can't launch rstudio.exe (it says "The system cannot find the path specified", even though I double-checked the path:
C:\Program Files\RStudio\bin\rstudio.exe --max-ppsize=500000)
What can I do to run my analysis with imputed data or to impute it as a part of the path analyses estimation?
Here is my code:
library(mice)
library(lavaan)
library(semTools)
imp <- mice(dat2, m = 5, print = FALSE)
imputedData <- NULL
for(i in 1:5) {
imputedData[[i]] <- complete(x=imp, action=i, include=FALSE)
}
model5 <- 'ceadiff ~ mompa + cdpea + momabhx
mompa ~ b1*peadiff + c*momabhx + cdpea + b2*mommhpsi
peadiff ~ a1*momabhx + mommhpsi
cdpea ~ momabhx + mommhpsi
mommhpsi ~ a2*momabhx
peadiff ~~ cdpea
direct := c
indirect1 := a1 * b1
indirect2 := a2 * b2
total := c + (a1 * b1) + (a2 * b2)'
fit5 <- runMI(model5, data = imputedData, fun="sem", ordered = "mompa")
summary(fit5, standardized = TRUE, fit = TRUE, ci = T)
# or:
fit5 <- sem.mi(model5, data = dat2, m=5, ordered = "mompa")
summary(fit5, standardized = TRUE, fit = TRUE, ci = T)
P.S. In this scenario it prints a summary with a warning but no p-values or CIs, so I cannot tell which coefficients are significant:
fit5 <- sem.mi(model5, data = dat2, m=5, ordered = "mompa")
summary(fit5)
** WARNING ** lavaan (0.5-23.1097) model has NOT been fitted
** WARNING ** Estimates below are simply the starting values
Thank you!
P.S. I don't know how to supply my data sample.
Here is the unimputed data output:
> dput(dat2)
structure(list(id = structure(c(145, 253, 189, 305, 149, 567,
151, 853, 272, 67, 111, 695, 1695, 1301, 2322, 1335, 1490, 580,
209, 1109, 1317, 812, 1459, 2150, 685, 1583, 839, 2156, 1627,
1103, 649, 2294, 1712, 1711, 793, 1425, 1114, 146, 1529, 985,
1889, 1974, 444, 1664, 1569, 859, 1947, 1219, 1427, 1533, 2143,
769, 256, 147, 1393, 1847, 1967, 1651, 1084, 1343, 996, 1765,
1596, 2157, 978, 1448, 915, 1411, 1412, 675, 1876, 53, 400, 2103,
1028, 663, 1090, 360, 2134, 1937, 1061, 1823, 935, 891, 1968,
34, 487, 207, 295, 1118, 1164, 1053, 1511, 777, 1760, 38, 480,
459, 307, 1962, 199, 499, 1375, 782, 1855, 1624, 109, 1481, 483,
536, 972, 1151, 19, 403, 543, 502, 2251, 254, 429, 2118, 1272,
1995, 982, 1748, 1641, 1994, 1718, 510, 494, 273, 602, 549, 293,
1796, 1497, 1197, 1874, 1179, 159, 205, 242, 299, 100, 1200,
579, 870, 1482, 2131, 33, 1319, 148, 1297, 626, 1051, 1948, 1057,
1581, 1349, 1284, 1178, 1178, 1044, 1001, 547, 276, 507, 871,
698, 1006, 1946, 2101, 68, 265, 1186, 1895, 1864, 1884, 1553,
1761, 2171, 168, 30, 1132, 1983, 1897, 1383, 1353, 1697, 1752,
505, 1605, 1144, 1358, 1052, 1645, 1346, 14, 439, 2154, 932,
971, 2104, 1345, 1821, 52, 1642, 1661, 1835, 1232, 2132, 809,
606, 54, 528, 59, 1848, 232, 1750, 2340, 882, 716, 2105, 711,
2109, 2353, 41, 2144, 552, 304, 2404, 1527, 1980, 927, 1586,
1805, 1982, 1181, 2163, 861, 198, 1404, 986, 1404, 238, 2115,
1125), format.spss = "F4.0", display_width = 11L), peadiff = structure(c(4,
7, 2, 2, 3, 4, 5, 5, 2, 6, 2, 6, 4, 3, 4, 5, 2, 3, 2, 1, 1, 3,
3, 3, 3, 5, 6, 3, 2, 2, 2, 4, 2, 2, 3, 5, 2, 4, 6, 2, 2, 3, 2,
1, 7, 7, 2, 5, 6, 4, 4, 4, 2, 9, 3, 4, 6, 7, 3, 3, 4, 3, 7, 5,
7, 4, 1, 1, 6, 14, 6, 2, 4, 3, 6, 4, 6, 7, 8, 5, 3, 4, 5, 1,
5, 4, 4, 9, 6, 3, 4, 3, 6, 6, 3, 1, 2, 2, 5, 4, 4, 1, 1, 3, 3,
3, 3, 7, 5, 4, 3, 4, 3, 4, 3, 4, 4, 4, 6, 3, 1, 1, 6, 4, 6, 9,
2, 3, 3, 7, 4, 1, 2, 9, 2, 3, 6, 1, 5, 3, 8, 4, 0, 4, 4, 6, 2,
4, 2, 7, 6, 8, 5, 3, 10, 3, 1, 4, 6, 6, 6, 5, 4, 5, 3, 7, 3,
4, 8, 4, 7, 4, 15, 4, 0, 2, 5, 3, 3, 3, 5, 7, 4, 7, 5, 2, 3,
2, 8, 5, 2, 5, 4, 5, 2, 4, 3, 3, 5, 4, 4, 3, 5, 2, 4, 3, 2, 1,
6, 2, 8, 2, 6, 3, 0, NA, 6, 3, 4, 2, 9, 3, 4, 4, 2, 12, 5, 4,
0, 2, 2, 5, 2, 1, 3, 3, 4, 3, 2, 4, 7, 9, 5, 4, 6, 8), format.spss = "F8.2", display_width = 10L),
ceadiff = structure(c(5, 4, 2, 1, 2, 2, 3, 4, 3, 4, 0, 2,
2, 1, 4, 2, 6, 4, 2, 2, 2, 3, 4, 2, 6, 4, 4, 4, 5, 3, 2,
4, 4, 3, 1, 7, 3, 6, 8, 2, 3, 2, 2, 1, 4, 5, 0, 4, 2, 3,
4, 4, 1, 5, 3, 1, 4, 3, 5, 2, 0, 4, 0, 5, 4, 2, 4, 3, 2,
7, 7, 0, 5, 0, 4, 5, 2, 4, 4, 3, 2, 4, 2, 2, 3, 4, 4, 3,
1, 3, 4, 6, 8, 2, 2, 5, 2, 6, 6, 2, 4, 0, 2, 4, 2, 2, 2,
5, 2, 2, 7, 6, 3, 6, 4, 8, 2, 2, 5, 1, 1, 1, 2, 1, 3, 3,
4, 3, 5, 8, 2, 1, 4, 3, 1, 3, 5, 5, 2, 4, 4, 5, 1, 1, 8,
6, 1, 4, 12, 5, 7, 8, 3, 6, 5, 6, 3, 5, 4, 3, 3, 4, 6, 4,
2, 6, 2, 3, 4, 2, 7, 4, 7, 4, 3, 0, 3, 0, 2, 2, 1, 3, 5,
1, 4, 2, 1, 2, 7, 4, 4, 4, 8, 6, 2, 6, 1, 1, 5, 3, 0, 5,
8, 4, 8, 3, 0, 3, 4, 5, 5, 2, 6, 0, 6, NA, 4, 4, 1, 3, 12,
2, 0, 4, 0, 5, 4, 3, 2, 1, 1, 5, 5, 6, 3, 1, 2, 1, 4, 2,
8, 6, 3, 0, 1, 3), format.spss = "F8.2", display_width = 10L),
cdpea = structure(c(22, 18, 17, 13, 19, 20, 19, 20, 17, 17,
17, 14, 17, 15, 21, 12, 16, 15, 14, 17, 19, 18, 17, 18, 19,
16, 18, 15, 16, 18, 17, 19, 18, 15, 16, 18, 18, 17, 22, 18,
18, 12, 19, 16, 15, 17, 14, 17, 15, 19, 17, 18, 14, 17, 19,
20, 16, 6, 12, 17, 17, 16, 13, 20, 18, 16, 16, 18, 21, 17,
21, 13, 17, 14, 18, 15, 18, 17, 23, 19, 17, 18, 15, 17, 19,
15, 21, 17, 20, 16, 15, 18, 15, 18, 17, 18, 16, 18, 21, 16,
19, 21, 18, 16, 19, 18, 18, 18, 18, 18, 19, 20, 20, 22, 14,
19, 18, 16, 22, 14, 16, 17, 18, 15, 16, 19, 16, 19, 18, 18,
15, 18, 19, 16, 16, 18, 15, 13, 12, 20, 19, 18, 19, 13, 19,
19, 16, 20, 18, 18, 18, 18, 18, 18, 19, 15, 14, 18, 16, 15,
15, 18, 18, 18, 18, 20, 17, 16, 19, 18, 19, 17, 18, 18, 16,
16, 18, 15, 19, 19, 17, 17, 16, 15, 15, 15, 17, 12, 17, 17,
19, 14, 21, 19, 19, 18, 23, 18, 21, 18, 16, 17, 18, 13, 14,
17, 18, 16, 18, 16, 18, 18, 17, 17, 6, 22, 17, 18, 20, 18,
10, 18, 15, 10, 16, 16, 18, 18, 17, 21, 18, 18, 15, 13, 15,
17, 12, 16, 16, 16, 15, 20, 17, 14, 17, 17), format.spss = "F8.2", display_width = 10L),
mompa = structure(c(0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0,
0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1,
0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0,
1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1,
0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0,
0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0,
1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1,
0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0,
0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0,
1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1,
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1,
0, 0, 1, 0, 0), format.spss = "F8.2", display_width = 10L),
momabhx = structure(c(0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1,
1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1,
1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1,
0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1,
0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1,
0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1,
1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0,
0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1,
1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0,
1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1,
1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0,
1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 0, 1, 0, 1), format.spss = "F8.2", display_width = 10L),
capiabr1 = structure(c(36, 43, NA, NA, 90, 95, 128, 137,
136, 245, 322, 154, 87, 111, 181, 278, 173, 137, 69, 24,
27, 70, 34, 27, 11, 53, 31, 49, 14, 54, 131, 35, 43, 43,
60, 58, 55, 60, 18, 38, 76, 98, 41, 20, 117, 58, 98, 10,
16, 101, 120, 165, 44, 96, 23, 19, 53, 57, 77, 41, 53, 100,
90, 96, 91, 29, 54, 134, 134, 105, 106, NA, 125, 61, 72,
34, 215, 42, NA, 106, 47, 45, 107, 208, 191, NA, 50, 56,
222, 47, 89, 134, 204, 211, 228, NA, 24, 34, 34, 135, 174,
112, 239, 104, 102, 129, 71, 100, 159, 280, 97, 105, NA,
56, 76, 120, 176, 89, 154, 46, 59, 214, 53, 245, 197, 60,
425, 25, 62, 137, 199, 171, 191, 46, 49, 117, 183, 79, 47,
76, NA, 158, 151, 47, 70, 118, 198, 94, 43, 296, 108, 56,
277, 214, 331, NA, 293, 277, 41, 134, 134, 283, 87, 96, 126,
305, 152, 82, 308, 168, 274, NA, 48, 171, 98, 90, 84, 257,
144, 255, NA, 106, 67, 184, 173, 156, 243, 357, 116, 132,
226, 260, 308, 358, 225, 312, 102, 244, 87, 176, 270, 224,
136, 243, NA, 117, 234, 280, 133, 143, 234, 273, NA, 169,
145, 310, 255, 280, 58, 152, 239, 254, 322, 342, 288, NA,
155, 179, 206, 270, 173, 319, 194, 206, 319, 111, 408, 310,
324, 296, 288, 391, 409, 379, 311, 338), format.spss = "F3.0", display_width = 11L),
cbclint = structure(c(51, 55, NA, NA, 65, 57, 46, 58, 53,
56, 75, 65, 33, NA, 65, NA, 51, 65, 34, 60, 45, 29, 43, 37,
65, 49, 56, 64, 53, 51, 39, 43, 64, 61, 74, 29, 60, 53, 45,
43, 45, 49, 47, 47, 66, 57, 73, 41, 56, 37, 65, 45, 53, 60,
53, 33, 43, 51, 53, 45, 47, 59, NA, 47, 79, 68, 56, 66, 70,
47, 63, 61, 61, 56, 33, 53, 56, 43, 51, 55, 51, 73, 56, 88,
56, 59, 30, 54, 82, 50, 63, 51, 58, 37, 67, 58, 51, 52, 40,
72, 63, NA, 43, 56, 60, 48, 66, NA, 55, 47, 61, 56, 55, 51,
55, 40, 64, 40, 66, 76, 45, 63, 53, 47, 51, 70, 80, 40, 53,
51, 43, 54, 64, 53, 64, 58, 56, 60, 55, 40, 40, 49, 48, 41,
47, 56, 60, 53, 55, 49, 55, 33, 67, 58, 41, 46, 67, 63, 64,
73, 73, 60, 49, 40, 51, 45, 53, 49, 65, 54, 58, 51, 68, 45,
41, 53, 60, 55, 61, 66, 69, 66, 67, 70, 66, NA, 56, 58, 61,
67, 73, 47, 74, 65, 62, 72, 59, 60, 73, 64, 48, 56, 53, 81,
65, 65, 65, 65, 59, 56, 70, 68, 63, 64, 74, 60, 75, 58, 63,
43, 72, 69, 59, 71, 71, 64, 66, 63, 46, 66, 66, 66, 53, NA,
73, 68, 65, 68, 62, 57, 68, 69, 74, 65, 78, 47), format.spss = "F8.0", display_width = 10L),
bpsidrr1 = structure(c(NA, 21, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, 18, NA, NA, NA, 7, 7, 7, 7, 7, 7, 7,
7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 9, 8, 9, 10, 10, 10, 11,
11, 11, 9, 11, 8, 11, 9, 10, 12, 11, 13, 10, 8, 11, 10, 13,
12, 14, 9, 10, 13, 11, 11, 10, 13, 13, 13, 12, 10, 11, 13,
10, 13, 16, 12, 15, 10, 12, 13, 13, 11, 14, 15, 13, 13, 14,
13, 14, 13, 18, 13, 14, 14, 14, 15, 16, 17, 16, 14, 15, 14,
14, 15, 14, 20, 16, 16, 13, 17, 16, 15, 14, 16, 18, 17, 17,
19, 14, 17, 16, 16, 17, 16, 14, 14, 15, 17, 18, 17, 14, 14,
18, 17, 19, 16, 16, 17, 18, 15, 19, 16, 21, 18, 17, 19, 15,
20, 18, 19, 16, 18, 23, 15, 18, 20, 19, 12, 12, 21, 16, 17,
17, 20, 20, 19, 19, 22, 20, 19, 22, 14, 19, 19, 23, 19, 20,
19, 19, 20, 20, 23, 18, 19, 25, 20, 23, 20, 21, 22, 21, 21,
24, 22, 24, 22, 22, 18, 23, 24, 22, 22, 24, 21, 23, 21, 20,
21, 23, 23, 25, 24, 22, 23, 26, 23, 26, 26, 23, 26, 26, 23,
25, 24, 22, 27, 25, 24, 27, 23, 25, 25, 26, 23, 27, 30, 28,
29, 27, 31, 34, 32, 31, 34), format.spss = "F2.0", display_width = 11L),
ecbiir1 = structure(c(177, 197, 148, 133, 172, 133, 129,
NA, 159, 67, 141, 167, 111, 190, 174, NA, 137, 93, 99, 136,
54, 36, 36, 75, 126, 97, 68, 205, 110, NA, 109, 47, 93, 200,
183, 42, 73, 132, 82, 91, 154, 157, 82, 124, 207, 84, 188,
76, 104, 73, 185, 108, 140, 183, 52, 48, 100, 110, 109, 56,
88, 69, 189, 82, 210, 159, 68, 144, 119, 81, 190, 180, 199,
206, 72, 153, 151, NA, 115, 111, NA, 161, 118, 159, 127,
124, 136, 174, 232, 48, 161, 54, 74, 53, NA, 112, 148, 135,
137, 159, 75, 74, 36, 101, 142, 83, 132, 99, 141, 117, 117,
134, 105, 134, 147, 54, 206, 170, 69, 134, 64, 55, 129, 79,
110, 173, 159, 113, 163, 139, 111, 103, 93, 86, 179, 144,
167, 118, 124, 118, 91, 166, 66, 127, 54, 177, 108, 125,
115, 142, 130, 156, 152, 51, 132, 76, 155, 185, 148, 132,
146, 147, 134, 50, 158, 143, 142, 98, 111, 150, 138, NA,
221, 150, 167, 145, 146, 63, 201, 195, 192, 183, 168, 162,
170, NA, 87, 119, 171, 136, 66, 183, 162, NA, 168, 153, 151,
109, 147, 214, 156, 147, 148, 117, NA, 140, 124, 165, 175,
106, 198, 141, 183, 208, 201, 139, 171, 170, 165, 116, 226,
102, 157, 182, 161, 169, 208, 144, 140, 139, 128, 174, 158,
231, 168, 181, 211, 176, 159, 180, 110, 188, 151, 206, 205,
67), format.spss = "F3.0", display_width = 11L), mommhpsi = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, 35.75, 32.75, 32.75, 32.75, 32.75, 38.5, 38.5,
32.75, 32.75, 32.75, 32.75, 34.25, 36.5, 43, 43, 49, 33,
38, NA, 33.5, 36.5, 36.75, 43.75, NA, 33.75, 50, 35.75, 49.25,
34, 39, 45.25, 50.75, 50, NA, NA, 34.25, 34.25, 34.25, 38.25,
42.75, NA, 34.5, 42.75, 36.25, 43, NA, 34.75, 34.75, 39.5,
39.5, 39, 48, NA, NA, 35, 35, 38.5, 50.5, NA, 41.5, 38.25,
43.5, 44.5, 43, 51.75, 44.5, NA, NA, NA, NA, 35.5, 38.5,
35.5, 38.5, 42.75, 50.25, NA, NA, NA, NA, NA, NA, 35.75,
35.75, 45, 40.5, 46, NA, NA, NA, NA, 47, 45.75, NA, NA, NA,
NA, NA, NA, NA, 47, 39.25, 50.75, 42.25, 42.25, 44.75, 44,
43.75, NA, NA, NA, NA, NA, NA, 45.75, 40.5, 38.25, 42.25,
51.75, NA, NA, NA, NA, NA, 39.75, 43.25, 50.5, 53.5, 54,
NA, 52.75, NA, 37.25, 41.5, 46.5, NA, 55.25, NA, 59.75, 42.25,
44.25, 44.25, 48.25, 47, NA, NA, NA, 46.5, 49.75, 50, 49.25,
56.25, NA, NA, NA, 39.75, 47, 44, 41, 54.75, 55.25, NA, NA,
38.25, 51, 48.75, NA, 43.75, 50.25, NA, NA, 46.25, 57, 59.75,
58.5, 62.5, 62.25, NA, NA, 46.75, 46, 56.25, 55, 55.75, 58.25,
NA, 44.75, 49.5, 46.5, 57.25, 53, 60.5, 63, NA, NA, NA, 56.75,
NA, 60.5, 43.75, 39.75, 59.25, 58.75, 57.5, 56.5, 63, NA,
NA, NA, NA, 55.5, 50, NA, 61.25, 61.5, 61, 62.75, 66.5, 57,
64.75, NA, 59.25, 68.25, 65.25, NA, 68.75, 50)), .Names = c("id",
"peadiff", "ceadiff", "cdpea", "mompa", "momabhx", "capiabr1",
"cbclint", "bpsidrr1", "ecbiir1", "mommhpsi"), row.names = c(NA,
-246L), class = "data.frame")
Your code works correctly. The problem is caused by the versions of lavaan and semTools that you are using.
Following the suggestions given here by Terrence D. Jorgensen (one of the authors of semTools), start a new session of R and reinstall the two packages as follows:
install.packages("lavaan", repos = "http://www.da.ugent.be", type = "source")
# if necessary: install.packages("devtools")
devtools::install_github("simsem/semTools/semTools")
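After reinstalling, it may help to confirm which versions R actually picks up in the new session. The helper below is a small sketch (installed_version is a hypothetical name); it relies only on requireNamespace() and utils::packageVersion().

```r
# Return the installed version of a package as a string, or NA if it is missing
installed_version <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

# Report the versions of the two packages involved
sapply(c("lavaan", "semTools"), installed_version)
```

If either entry still shows the old version, the session was probably not restarted or the package was installed into a different library path.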
Now the commands:
fit5 <- runMI(model5, data = imputedData, fun="sem", ordered = "mompa")
summary(fit5, standardized = TRUE, ci = T)
give the following output:
Rubin's (1987) rules were used to pool point and SE estimates across 5 imputed data sets, and to calculate degrees of freedom for each parameter's t test and CI.
lavaan.mi object based on 5 imputed data sets.
See class?lavaan.mi help page for available methods.
Convergence information:
The model converged on 5 imputed data sets
Parameter Estimates:
Information Expected
Information saturated (h1) model
Standard Errors Robust.sem
Regressions:
Estimate Std.Err t df P(>|z|) ci.lower ci.upper Std.lv Std.all
ceadiff ~
mompa 0.473 0.165 2.863 2016.256 0.004 0.149 0.797 0.473 0.223
cdpea 0.137 0.038 3.589 2507.509 0.000 0.062 0.212 0.137 0.157
momabhx -0.251 0.302 -0.831 Inf 0.406 -0.843 0.341 -0.251 -0.059
mompa ~
peadiff (b1) 0.108 0.035 3.091 Inf 0.002 0.039 0.176 0.108 0.245
momabhx (c) 0.548 0.165 3.324 Inf 0.001 0.225 0.871 0.548 0.273
cdpea -0.048 0.031 -1.525 Inf 0.127 -0.109 0.014 -0.048 -0.116
mommhpsi (b2) -0.022 0.009 -2.365 61.332 0.021 -0.040 -0.003 -0.022 -0.192
...