I have a large data frame and I'm trying to reorder its columns by pattern instead of manually writing each column name in select(). More details here.
A glimpse of the issue (edit): All my columns share a pattern ARG_G1_50_AAA or ARG_G2_50_AAA or NARR_G1_50_AAA or NARR_G2_50_AAA. The final parts are: AAA, AAC, AC and AB. I need two subsets of this data.
Set 1: I need to intercalate (interleave) the "G1" and "G2" columns, with the numbers in the order 50, 100, 150, 200 and the suffixes in the order AAA, AAC, AC, AB. Ex:
NARR_G1_50_AAA, NARR_G2_50_AAA,
NARR_G1_50_AAC, NARR_G2_50_AAC.... so on
Set 2: I need to intercalate the "NARR" and "ARG" columns (again, 50 before 100, 150 and 200, and AAA before AAC, AC and AB). There is no need to intercalate G1 and G2 here. Ex:
NARR_G1_50_AAA, ARG_G1_50_AAA,
NARR_G2_50_AAA, ARG_G2_50_AAA... so on
Basically, I was able to partially solve my problem (cf. linked post above) with:
dfPaired <- merged_DF %>%
dplyr::select(ID, str_subset(names(merged_DF), "G?_50\\w*"))
head(dfPaired)
ID ARG_G1_50_AAA ARG_G1_50_AAC ARG_G1_50_AC ARG_G1_50_AB
ARG_G2_50_AAA ARG_G2_50_AAC, ARG_G2_50_AC ARG_G2_50_AB....
## I know that I'm only getting the "50" columns here. In fact I need all of them, but it wouldn't be a problem to repeat the code for 100, 150 and 200.
How can I make R "intercalate" the strings? I mean, I need:
ARG_G1_50_AAA, ARG_G2_50_AAA
ARG_G1_50_AAC, ARG_G2_50_AAC,
ARG_G1_50_AC, ARG_G2_50_AC,
ARG_G1_50_AB, ARG_G2_50_AB ... (so on)
(intercalate the G1 and G2 columns, in the case of set 1)
Questions :
Could I use something like seq(by = 2)?
Is there a way to pass two patterns to str_subset() and ask it to intercalate the output?
Is there an "intercalate()" function that I could apply to str_subset(names(merged_DF), "G?_50\\w*")?
** I mean, something like intercalate(str_subset(names(merged_DF), "G1_50\\w*"), str_subset(names(merged_DF), "G2_50\\w*"))
Thanks in advance :)
EDIT:
dput(merged_DF[1:50])
structure(list(ID = structure(c("P1", "P2", "P3", "P4", "P5",
"P6", "P7", "P8", "P9", "P10", "P11", "P12", "P13", "P14", "P15",
"P16", "P17", "P18", "P19", "P20", "P21", "P22", "P23", "P24",
"P25", "P26", "P27", "P28", "P29", "P30", "P31", "P32", "P33",
"P34", "P35", "P36", "P37", "P38", "P39", "P40", "P41", "P42",
"P43", "P44", "P45", "P46", "P47", "P48", "P49", "P50", "P51",
"P52", "P53", "P54", "P55", "P56", "P57", "P58", "P59", "P60",
"P61", "P62", "P63", "P64", "P65", "P66", "P67", "P68", "P69",
"P70", "P71"), class = c("glue", "character")), ARG_G1_100_AAA = c(68.53,
65.9, 69.78, 68.29, NaN, 69.5, 67.05, 73.74, 73.59, 72.57, 64.33,
67.79, 72.94, 63.75, 71.56, 75.5, 68.16, NA, 65.64, 68.36, 69.75,
72.73, 67.67, 66.19, 62.94, 72.48, 72.19, 62.44, 72.5, 71.06,
70.4, 69.14, NA, 67.59, 69.1, 74.05, NA, 68.6, 68.27, 59.12,
NA, NA, 63.7, 67.18, NA, 68.38, 63.44, 72.56, 66.06, 66.53, 73.19,
NA, NA, NA, 73.44, 67.45, 72.91, 65.81, 73.96, 75, 75.89, 72,
NA, 68.2, 67.29, 69.91, NaN, 69.67, 68.39, 69.2, 67.55), ARG_G1_100_AAC = c(70.18,
67.65, 71.89, 70.42, NaN, 72.38, 69.67, 75.63, 76.7, 76.21, 66.5,
70.57, 76.72, 66.4, 74.75, 79.17, 70.84, NA, 67.82, 70, 71.88,
74.55, 69.33, 69.5, 65.25, 75.05, 75.44, 64.56, 74.88, 74.29,
72.4, 71.93, NA, 69.12, 71.43, 77.53, NA, 71.93, 70.4, 60.25,
NA, NA, 64.8, 69, NA, 71.19, 71.12, 75.04, 68.89, 68.26, 75.81,
NA, NA, NA, 75.89, 68.82, 77.35, 68.38, 76.71, 79.12, 78.89,
73.5, NA, 69.7, 69.82, 70.91, NaN, 72, 71.17, 71.85, 69.7), ARG_G1_100_AC = c(4.35,
4.95, 1.44, 2.71, NaN, 3.25, 3.95, 2.26, 0.85, 1.21, 5.33, 5.43,
0.83, 10.4, 2.56, 0.33, 4.92, NA, 10.55, 3.43, 2.94, 1.55, 5.33,
6.44, 5.25, 2, 3.12, 8.5, 1.38, 3.76, 1.9, 2.79, NA, 4.06, 5.57,
1.95, NA, 6.07, 2.67, 7, NA, NA, 8, 4.76, NA, 4.19, 2.68, 3,
4.94, 4.79, 2.19, NA, NA, NA, 1.78, 5.27, 2.52, 5.88, 1.96, 1.12,
0.67, 3.28, NA, 3.5, 3.41, 3.73, NaN, 3.83, 6.06, 3.3, 3.9),
ARG_G1_100_AB = c(4.94, 6.55, 2.44, 3, NaN, 3.25, 4.71, 2.84,
1.07, 2, 5.33, 5.43, 1.72, 10.55, 3, 1.17, 5.8, NA, 10.55,
4.21, 2.94, 3.55, 6.33, 8.25, 5.88, 2, 3.44, 9.22, 1.69,
4.18, 2.5, 4.71, NA, 4.41, 5.9, 2.21, NA, 6.67, 3.33, 7,
NA, NA, 8, 4.76, NA, 4.44, 2.68, 3.16, 4.94, 5.42, 2.81,
NA, NA, NA, 1.78, 6.09, 2.52, 6.56, 1.96, 1.12, 0.67, 3.78,
NA, 3.5, 3.65, 5.27, NaN, 4.33, 6.78, 3.6, 4.35), ARG_G1_150_AAA = c(93.38,
90.2, 98.33, 94.69, NaN, 99, 93.64, 104.22, 104.8, 103.17,
87, 93.83, 101.89, 87.5, 100.38, 107, 94.69, NA, 90.75, 91.5,
93.88, 99.5, NaN, 89.5, 86.5, 100.55, 101, 84.22, 101.88,
94.62, 97.2, 96.5, NA, 87.38, 96.82, 103.67, NA, 97.57, 95.86,
84, NA, NA, 85.5, 90.5, NA, 96.29, 89.71, 101.64, 92.33,
93.89, 104.43, NA, NA, NA, 101.33, 93.5, 105.42, 90.75, 104.23,
108.86, 102.67, 97, NA, 91.9, 91.38, 93.5, NaN, 98, 94.78,
95.1, 93.4), ARG_G1_150_AAC = c(96.38, 90.9, 100, 96.08,
NaN, 99.5, 95.82, 106.33, 106.6, 106.5, 92, 95.83, 104, 89,
103.75, 109, 96.92, NA, 93, 93.17, 95.12, 102.75, NaN, 93.5,
89.38, 102.09, 104.12, 85.44, 103.38, 96.75, 99.2, 98.5,
NA, 90.38, 99.18, 105.89, NA, 99.43, 97, 84, NA, NA, 86.75,
91.88, NA, 96.86, 98.64, 103.71, 94.22, 95.22, 105.71, NA,
NA, NA, 102.33, 94.25, 108.08, 91.75, 107, 112.29, 106.33,
98.22, NA, 93.5, 93.25, 94.25, NaN, 100, 96.78, 97.8, 95.5
), ARG_G1_150_AC = c(8.75, 10.1, 3.67, 5.23, NaN, 6.5, 6.73,
4.78, 2.27, 3.17, 12, 9.83, 3.44, 21.1, 4.25, 2, 11.85, NA,
17.5, 6.17, 7.25, 3, NaN, 13.5, 10.62, 5, 5.75, 17.44, 4,
10.75, 5, 5.5, NA, 9.5, 9.36, 3.56, NA, 10, 6.86, 9.5, NA,
NA, 16.25, 10.25, NA, 10.43, 6, 6.21, 9.22, 9.22, 5.14, NA,
NA, NA, 3, 10.75, 6, 12.88, 3.77, 2.57, 4.33, 7.22, NA, 8.6,
7.88, 10, NaN, 7, 11.67, 7.8, 7.7), ARG_G1_150_AB = c(10.12,
12.6, 5.33, 5.77, NaN, 6.5, 7.91, 5.44, 2.53, 4.33, 12, 9.83,
4.78, 21.4, 5.25, 3, 13.77, NA, 17.5, 7.33, 7.25, 6, NaN,
16.5, 11.5, 5, 6.25, 18.67, 4.5, 11.38, 5.8, 8.5, NA, 10,
9.82, 4.33, NA, 11, 7.71, 9.5, NA, NA, 16.25, 10.25, NA,
10.86, 6, 7, 9.22, 10.33, 6.43, NA, NA, NA, 3.33, 11.75,
6, 14, 3.77, 2.57, 4.33, 8.22, NA, 8.8, 9, 12, NaN, 8, 12.67,
8.2, 8.4), ARG_G1_200_AAA = c(121.5, 110.6, NaN, 120.57,
NaN, NaN, 115.67, 132.4, 131.11, 128.5, NaN, 114.5, 126.25,
107.4, 124.67, NaN, 120.5, NA, 108, 110.5, 114.33, 125, NaN,
114.67, 108, 123.5, 126.67, 105.5, 129.67, 117.75, 121, 120,
NA, 108.5, 122.83, 130.8, NA, 123.67, 119, NaN, NA, NA, NaN,
109.75, NA, 119, 114.75, 128.88, 115.25, 117, 134, NA, NA,
NA, NaN, 113, 131.86, 110.67, 133.57, 138.33, 127.5, 118.25,
NA, 112.8, 111.5, 113, NaN, NaN, 114.25, 118, 112.8), ARG_G1_200_AAC = c(123.25,
111.6, NaN, 121.29, NaN, NaN, 116.33, 133.4, 132.89, 130.5,
NaN, 115.5, 129.5, 108.2, 128.33, NaN, 123, NA, 108, 111.5,
115.67, 125, NaN, 118, 112, 125.17, 129, 105.75, 130.33,
119.5, 121.4, 121, NA, 109.75, 124.33, 133.4, NA, 125, 120.33,
NaN, NA, NA, NaN, 110.75, NA, 123, 124, 129.75, 117.5, 117.2,
134, NA, NA, NA, NaN, 116, 134.43, 111.33, 135, 141.33, 129.5,
119.5, NA, 114, 113.5, 113, NaN, NaN, 115.5, 120.6, 114),
ARG_G1_200_AC = c(12, 15.6, NaN, 8, NaN, NaN, 10.83, 7.8,
5.33, 6, NaN, 16.5, 6.75, 31.2, 9.33, NaN, 18, NA, 30, 14.5,
13, 11, NaN, 19.67, 17, 9, 9.33, 25.5, 8, 16.25, 9.6, 9,
NA, 16, 12.67, 6.2, NA, 13.67, 11.67, NaN, NA, NA, NaN, 17.5,
NA, 17, 9, 9.5, 14.75, 15.8, 8, NA, NA, NA, NaN, 23, 10.43,
21.33, 5.71, 4.67, 10.25, 13.25, NA, 14.6, 13.25, 19, NaN,
NaN, 21.5, 13.2, 14.6), ARG_G1_200_AB = c(14, 19.4, NaN,
8.71, NaN, NaN, 12.5, 9, 6, 8, NaN, 16.5, 8.5, 31.8, 11,
NaN, 21, NA, 30, 15.5, 13, 15, NaN, 24, 18, 9, 10, 27, 9,
17.25, 10.8, 12, NA, 17, 13.5, 7.2, NA, 14.67, 14, NaN, NA,
NA, NaN, 17.5, NA, 17.67, 9, 10.88, 14.75, 17, 9.67, NA,
NA, NA, NaN, 24, 10.43, 23.33, 5.71, 4.67, 10.5, 15, NA,
14.8, 14.75, 21, NaN, NaN, 23.25, 13.8, 15.8), ARG_G1_50_AAA = c(36.35,
35.88, 36.22, 35.72, 36.12, 36.96, 35.24, 37.62, 36.05, 34.63,
34.19, 33.71, 36.22, 34.43, 34.95, 34.59, 36.03, NA, 32.61,
35.29, 37.17, 37.13, 35.62, 34.64, 34.4, 35.69, 37.36, 36.4,
36.69, 35.8, 36.57, 35.97, NA, 36.44, 34.94, 35.26, NA, 34.44,
37.85, 33.15, NA, NA, 36.13, 34.91, NA, 35.54, 29.02, 35.55,
35.64, 35.79, 35.93, NA, NA, NA, 37, 32.58, 35.71, 34.98,
36.64, 33.29, 35.29, 37.2, NA, 36.29, 36.91, 31.26, 34, 37.48,
33.89, 36.34, 35.88), ARG_G1_50_AAC = c(41.19, 38.7, 41.22,
40.53, 44.12, 41.04, 40.18, 42.38, 42.17, 41.87, 38, 41.21,
42.24, 38.69, 42.64, 42.14, 41.53, NA, 39.65, 40.76, 41.88,
42.23, 39.62, 41.55, 38.19, 42.53, 42.24, 39.49, 42.07, 43.3,
40.92, 39.92, NA, 40.35, 40.49, 44.11, NA, 41.72, 40.64,
36.15, NA, NA, 39.03, 40.86, NA, 40.93, 37.95, 42.27, 39.47,
39.72, 42.12, NA, NA, NA, 42.11, 39.81, 42.82, 39.12, 42.67,
43.02, 43.58, 42.61, NA, 40.04, 41.42, 40.9, 41.5, 41.62,
40.02, 41.08, 40.18), ARG_G1_50_AC = c(0.98, 1.5, 0.37, 0.6,
0.88, 0.73, 1.51, 0.23, 0.25, 0.42, 1.67, 1.58, 0.31, 3.27,
0.62, 0.05, 0.83, NA, 3.71, 1.47, 1.07, 0.1, 1.81, 1.19,
1.62, 0.61, 0.76, 1.73, 0.24, 0.64, 0.33, 0.97, NA, 0.6,
1.98, 0.34, NA, 1.69, 0.26, 2.12, NA, NA, 1.5, 1.14, NA,
1, 0.65, 0.88, 1.62, 1.3, 0.39, NA, NA, NA, 0.57, 1.48, 0.58,
2.21, 0.43, 0.24, 0.16, 0.65, NA, 0.96, 0.4, 1.13, 1.5, 1.05,
1.91, 0.7, 0.94), ARG_G1_50_AB = c(1.09, 2.24, 0.74, 0.68,
0.88, 0.73, 1.82, 0.38, 0.36, 0.89, 1.67, 1.58, 0.76, 3.27,
0.83, 0.45, 1.15, NA, 3.71, 1.82, 1.07, 1.16, 2.25, 1.93,
1.86, 0.61, 1, 2.09, 0.31, 0.86, 0.61, 1.73, NA, 0.77, 2.18,
0.34, NA, 1.92, 0.49, 2.12, NA, NA, 1.5, 1.14, NA, 1.2, 0.65,
0.88, 1.62, 1.49, 0.63, NA, NA, NA, 0.57, 1.77, 0.58, 2.6,
0.43, 0.24, 0.16, 0.85, NA, 0.96, 0.4, 1.84, 1.5, 1.05, 2.4,
0.76, 1.14), ARG_G2_100_AAA = c(64.9, 63.8, 71.73, 67.67,
NA, NA, 52.5, 72.35, 65.28, 57.22, NA, NaN, 69, 66.67, NaN,
66.58, 69, 60.55, 56.29, 67.45, 68.4, 64.25, NaN, 50.86,
67.83, 65.96, 57, 53.07, 66.89, NaN, NA, 59, 61.5, NA, 65.9,
64.07, NA, NA, 57.91, 67.89, 68.75, 68.5, NaN, 63.24, 66.19,
60.59, 59.24, 54.33, 64.39, 65.83, 65.71, 63, 63.78, 63.62,
64, 65.08, NA, 67.61, 67.57, 72.71, 65.46, 61.71, NA, 57.62,
NA, NA, NA, 64, 61.33, 62.64, NA), ARG_G2_100_AAC = c(65.7,
65.8, 74.45, 68, NA, NA, 53.75, 73.94, 67.24, 58.22, NA,
NaN, 71.07, 68.07, NaN, 69.88, 71.32, 62.18, 58.65, 76.45,
71.13, 67.25, NaN, 51.76, 69.33, 68.17, 58, 54.27, 68.05,
NaN, NA, 61, 61.67, NA, 67.79, 65.93, NA, NA, 59.27, 69.67,
71.38, 70, NaN, 64.88, 68.19, 62.06, 61, 55.48, 65.67, 67.72,
68.47, 64, 65.11, 66, 67.5, 66.33, NA, 69.61, 69.33, 75.67,
68.17, 63, NA, 58.81, NA, NA, NA, 66.5, 62.33, 65, NA), ARG_G2_100_AC = c(7.1,
6.4, 0.18, 3.67, NA, NA, 12.75, 1.24, 2.96, 9.78, NA, NaN,
1.43, 1.33, NaN, 5.21, 2.76, 7.91, 8.06, 2.36, 2.87, 4, NaN,
15.52, 2.67, 4.17, 13, 10.07, 5.05, NaN, NA, 9.5, 8.17, NA,
5.86, 3.87, NA, NA, 7, 3.33, 1.75, 3, NaN, 7.94, 3.11, 5.29,
5.29, 13.1, 3.78, 3.33, 3.06, 5.18, 2.56, 5.04, 5.5, 5.75,
NA, 2.22, 2.48, 1, 3.83, 4.82, NA, 8.19, NA, NA, NA, 5, 6.44,
5.29, NA), ARG_G2_100_AB = c(7.1, 7.4, 1.09, 3.67, NA, NA,
12.75, 1.24, 3.28, 9.78, NA, NaN, 1.71, 1.93, NaN, 6.21,
2.76, 7.91, 8.65, 3.55, 3.4, 5, NaN, 16.05, 3.39, 4.52, 13,
11.6, 5.05, NaN, NA, 9.5, 9.67, NA, 7.03, 3.87, NA, NA, 8,
3.33, 2.19, 3, NaN, 8.53, 3.37, 5.47, 7.35, 13.48, 5.33,
3.83, 3.65, 5.82, 4, 6.17, 6, 6.42, NA, 3.83, 2.71, 2.19,
4.58, 5.18, NA, 9.75, NA, NA, NA, 5, 6.44, 5.36, NA), ARG_G2_150_AAA = c(85.25,
NaN, 99, NaN, NA, NA, 66.86, 101, 89.31, 71.33, NA, NaN,
94.5, 88.57, NaN, 95, 95.5, 81.5, 78.5, 107.75, 93.43, NaN,
NaN, 66.18, 92.33, 92.25, NaN, 67.43, 87.44, NaN, NA, NaN,
78, NA, 89.81, 86.43, NA, NA, 75.75, 91.67, 95, NaN, NaN,
85.12, 91.47, 81.88, 79.38, 72.45, 87.67, 91.22, 90.88, 83,
85, 89.23, NaN, 86.2, NA, 92, 93.09, 100.27, 88.62, 83.88,
NA, 75, NA, NA, NA, NaN, 80, 83.5, NA), ARG_G2_150_AAC = c(86.75,
NaN, 101, NaN, NA, NA, 67.29, 103.75, 91.15, 71.67, NA, NaN,
96.33, 88.86, NaN, 96.23, 97.5, 83.5, 79.12, 109.5, 95, NaN,
NaN, 66.45, 93.56, 93.42, NaN, 68, 88.33, NaN, NA, NaN, 78,
NA, 91.69, 87, NA, NA, 76.75, 93, 96.88, NaN, NaN, 85.5,
92.67, 83.38, 80.25, 73.09, 88.33, 92.44, 92.38, 84.25, 85.33,
91.23, NaN, 87.8, NA, 92.67, 94.09, 102.09, 90.15, 84.75,
NA, 76.14, NA, NA, NA, NaN, 81, 85.67, NA), ARG_G2_150_AC = c(15.75,
NaN, 1, NaN, NA, NA, 25.71, 2.62, 6.85, 19.33, NA, NaN, 3.83,
4.57, NaN, 9.85, 6.5, 15.5, 13.88, 3.75, 6.29, NaN, NaN,
27.36, 5.67, 8.42, NaN, 18.86, 11.33, NaN, NA, NaN, 19, NA,
11.25, 9.57, NA, NA, 12.75, 6, 4.5, NaN, NaN, 15.75, 5.67,
10.75, 9.75, 24.82, 8.67, 6.67, 5.88, 13.25, 7, 10, NaN,
10.6, NA, 6.56, 4.18, 2.55, 8.54, 9.75, NA, 17.86, NA, NA,
NA, NaN, 15.67, 13.17, NA), ARG_G2_150_AB = c(15.75, NaN,
2, NaN, NA, NA, 25.71, 2.62, 8.69, 19.33, NA, NaN, 4.33,
5.43, NaN, 11.31, 6.5, 15.5, 14.75, 6, 7.14, NaN, NaN, 28.27,
7.22, 9, NaN, 21.29, 11.33, NaN, NA, NaN, 22, NA, 13.44,
9.71, NA, NA, 14.75, 6, 5.12, NaN, NaN, 16.75, 6, 11.25,
12.75, 25.36, 11.11, 7.33, 6.62, 14.25, 9.33, 11.62, NaN,
11.8, NA, 9.22, 4.91, 4.64, 10, 10.38, NA, 19.86, NA, NA,
NA, NaN, 15.67, 13.33, NA), ARG_G2_200_AAA = c(NaN, NaN,
125, NaN, NA, NA, 81.33, 129.5, 112.25, NaN, NA, NaN, 117.5,
108.33, NaN, 120, 119.25, 99, 94, 134, 113.67, NaN, NaN,
77.67, 112.25, 112.86, NaN, 78.33, 106.6, NaN, NA, NaN, NaN,
NA, 112.4, 106.67, NA, NA, 93, NaN, 122, NaN, NaN, 104.25,
114.89, 101.25, 96.75, 87, 107, 112.25, 112.25, 100, NaN,
111.86, NaN, 101, NA, 114, 114.5, 124.17, 108.86, 103.25,
NA, 90.67, NA, NA, NA, NaN, NaN, 99, NA), ARG_G2_200_AAC = c(NaN,
NaN, 126, NaN, NA, NA, 82.33, 129.75, 113.5, NaN, NA, NaN,
118, 109.33, NaN, 120.71, 120.25, 101, 94.25, 136, 114, NaN,
NaN, 78, 114, 114, NaN, 78.67, 106.8, NaN, NA, NaN, NaN,
NA, 114, 108.33, NA, NA, 93, NaN, 123, NaN, NaN, 104.25,
116.67, 102.75, 97.25, 87.67, 107.75, 113.25, 113.25, 101,
NaN, 113.14, NaN, 101, NA, 114.5, 115, 126.17, 111.29, 104.25,
NA, 92, NA, NA, NA, NaN, NaN, 99, NA), ARG_G2_200_AC = c(NaN,
NaN, 1, NaN, NA, NA, 36, 5.25, 12.25, NaN, NA, NaN, 8.5,
8.33, NaN, 14.29, 11.38, 24, 22.25, 6, 11.67, NaN, NaN, 42.5,
9.25, 13.14, NaN, 32, 19.4, NaN, NA, NaN, NaN, NA, 15.6,
17, NA, NA, 24, NaN, 6.67, NaN, NaN, 21.5, 8.89, 17.5, 16,
37.83, 15.75, 12.25, 11.75, 20, NaN, 15.43, NaN, 26, NA,
12.25, 7.5, 5.67, 12.86, 14.75, NA, 27, NA, NA, NA, NaN,
NaN, 28.5, NA), ARG_G2_200_AB = c(NaN, NaN, 2, NaN, NA, NA,
36, 5.25, 16, NaN, NA, NaN, 10, 9.33, NaN, 16.57, 11.38,
24, 23.25, 9, 13, NaN, NaN, 44.33, 11.5, 14.29, NaN, 35,
19.4, NaN, NA, NaN, NaN, NA, 18.8, 17.33, NA, NA, 26, NaN,
7.67, NaN, NaN, 22.5, 9.33, 18.25, 20.25, 38.67, 19, 13.25,
13.25, 22, NaN, 18, NaN, 28, NA, 15.75, 8.83, 8.17, 15.14,
16, NA, 29.33, NA, NA, NA, NaN, NaN, 29, NA), ARG_G2_50_AAA = c(36.97,
35.4, 34.72, 33.81, NA, NA, 32.98, 35.7, 35.59, 35.36, NA,
36, 37.66, 36.35, 33.44, 34.72, 36.9, 34.32, 32.28, 33.74,
36.38, 35.06, 34.5, 31.47, 36.59, 36.18, 34.75, 31.9, 36.53,
32.62, NA, 33.85, 34.86, NA, 35.36, 34.52, NA, NA, 33.68,
35.89, 36.24, 37.21, 28, 34.05, 36.3, 34.16, 32.86, 32.06,
34.65, 35.57, 35.95, 33.19, 34.61, 34.6, 34.92, 34.24, NA,
34.33, 35.65, 36.16, 33.91, 34.37, NA, 33.44, NA, NA, NA,
33.93, 33.71, 35.42, NA), ARG_G2_50_AAC = c(40.2, 38.6, 42.09,
39.25, NA, NA, 35.68, 41.41, 39.12, 37.68, NA, 39, 41.16,
40.67, 36.11, 39.25, 40.65, 37.52, 35.14, 41.26, 41.13, 40.71,
36.25, 33.33, 40.59, 39.67, 36.83, 34.44, 40.57, 34, NA,
37, 36.45, NA, 39.52, 38.17, NA, NA, 36.52, 40.39, 40.69,
41.21, 29, 39.63, 40.23, 37.27, 36.58, 34.45, 38.87, 38.98,
39.51, 38.13, 37.68, 37.88, 38.85, 38.48, NA, 40, 40.43,
42.73, 39.93, 38.19, NA, 36.41, NA, NA, NA, 39.71, 36.43,
38.03, NA), ARG_G2_50_AC = c(0.8, 1.9, 0, 0.5, NA, NA, 2.93,
0.52, 0.58, 2.75, NA, 1.25, 0.21, 0.25, 2.11, 2, 0.85, 2.03,
2.67, 0.71, 0.82, 0.29, 0.75, 4.27, 0.63, 0.78, 2.92, 2.77,
1.17, 4.88, NA, 3, 2.64, NA, 1.78, 0.98, NA, NA, 2.29, 0.82,
0.45, 0.93, 6, 1.67, 0.86, 1.27, 1.79, 3.37, 1.11, 0.74,
0.79, 1.1, 0.71, 1.11, 1.08, 2.48, NA, 0.17, 0.75, 0.22,
0.91, 1.19, NA, 1.66, NA, NA, NA, 1.07, 1.75, 1.42, NA),
ARG_G2_50_AB = c(0.8, 2, 0.31, 0.5, NA, NA, 2.93, 0.52, 0.58,
2.75, NA, 1.25, 0.34, 0.5, 3.33, 2.44, 0.85, 2.03, 2.91,
1.42, 1, 0.94, 0.75, 4.63, 0.85, 0.96, 2.92, 3.49, 1.17,
4.88, NA, 3, 3.36, NA, 2.3, 0.98, NA, NA, 2.61, 0.82, 0.52,
0.93, 6, 1.91, 1.02, 1.34, 2.58, 3.67, 1.59, 0.96, 1.09,
1.39, 1.5, 1.65, 1.15, 2.76, NA, 0.93, 0.8, 0.82, 1.25, 1.44,
NA, 2.49, NA, NA, NA, 1.07, 1.75, 1.47, NA), NARR_G1_100_AAA = c(71.32,
NA, NA, 67.83, NaN, 71.6, 64.2, 71.68, 73.29, 70.53, 73.35,
59.31, 71.08, 74.06, 68.7, 74, 69.08, NA, 68.52, 63.47, 68.33,
NA, 65.64, 62.11, 63.9, 70.41, 60.36, 65.88, 68.81, 69.62,
70.68, 67.5, NA, 68.45, 67.16, 74.39, 60.6, 65.89, 71.94,
68.75, NA, NA, 67, 66.85, NA, NA, 62.56, 73.33, 69.81, 67.68,
73.06, 65.8, 63.85, NA, 67.64, 71.6, 68.47, 69.39, 71.16,
72.33, NA, 66.68, NA, 66.22, 67, 61.27, NaN, 72.33, 68.29,
71.33, 65.57), NARR_G1_100_AAC = c(74.26, NA, NA, 70.94,
NaN, 75, 66.14, 74.48, 77.07, 73.47, 76, 60.44, 73.92, 77.19,
71.4, 77.59, 72, NA, 70.38, 65.47, 70.54, NA, 68.09, 64.61,
66.5, 72.52, 62.59, 69.25, 71.48, 71.88, 74.4, 70.1, NA,
70, 69.6, 78.04, 62.3, 68.79, 73.44, 72.25, NA, NA, 67, 68.25,
NA, NA, 65.94, 75.71, 72.43, 69.68, 76, 68.6, 65.65, NA,
70.43, 74, 71.76, 71.17, 74.63, 74.22, NA, 69.47, NA, 68.72,
67, 62.82, NaN, 77.33, 69.76, 75.42, 67.62), NARR_G1_100_AC = c(3.05,
NA, NA, 2.33, NaN, 2.4, 1.89, 0.84, 0.07, 5.47, 1.12, 8.81,
2.39, 1.38, 3.6, 0.88, 2.65, NA, 2.05, 5.18, 2.38, NA, 5,
4.78, 6.4, 1.85, 7.41, 3.69, 1.85, 2.62, 1.28, 3.9, NA, 2.35,
3.8, 1.87, 5.1, 6.95, 1.67, 4.5, NA, NA, 4, 4.25, NA, NA,
7.17, 1.29, 2.62, 1.37, 1.47, 3.3, 7.27, NA, 3.64, 3.6, 2.59,
4.83, 0.63, 2.28, NA, 6.58, NA, 4.56, 6, 4.82, NaN, 0.67,
3.95, 1.75, 4.38), NARR_G1_100_AB = c(3.42, NA, NA, 3.17,
NaN, 2.5, 3.29, 1.64, 1.07, 6, 1.41, 9.25, 3.25, 2.69, 3.8,
1.32, 3.04, NA, 2.38, 5.18, 2.38, NA, 6.18, 6.11, 6.4, 1.85,
7.45, 3.69, 1.89, 3.25, 1.6, 4.8, NA, 2.8, 4.32, 2.3, 6.6,
7.42, 2.83, 4.75, NA, NA, 5, 4.75, NA, NA, 8, 1.71, 2.67,
2.05, 1.47, 4.8, 7.96, NA, 4.43, 3.8, 4.47, 4.91, 1.68, 2.78,
NA, 6.58, NA, 6.67, 6, 5.18, NaN, 1.67, 4.86, 2.08, 4.38),
NARR_G1_150_AAA = c(102, NA, NA, 96.22, NaN, 105.33, 87.1,
100.14, 106.17, 97.67, 99.88, 75.43, 99.62, 106.86, 95.3,
105.68, 97.14, NA, 92.82, 87.25, 96.23, NA, 88.5, 83.56,
89.75, 98.47, 80.64, 92.14, 96.07, 94.62, 99.46, 100, NA,
92.6, 94.54, 106.25, 82.5, 93.6, 100.33, 95, NA, NA, NaN,
90.9, NA, NA, 87.89, 101.08, 96.18, 95, 103.12, 92.75, 85.71,
NA, 94.17, NaN, 95.25, 97.5, 100.67, 100.44, NA, 90.9, NA,
90.11, NaN, 81.5, NaN, NaN, 94.45, 100.4, 91.64), NARR_G1_150_AAC = c(103.2,
NA, NA, 97.67, NaN, 106.67, 88.55, 102.43, 109.17, 98.78,
103.25, 76.57, 102.05, 109.43, 97.4, 108.42, 99.29, NA, 94.73,
89, 98, NA, 89.75, 85, 91.75, 100.47, 81.64, 93.14, 97.73,
96, 101.08, 101.33, NA, 94.1, 95.92, 110.33, 83.25, 95.5,
101.67, 98, NA, NA, NaN, 93, NA, NA, 90.56, 102.38, 99, 96.78,
106.5, 94.25, 87.43, NA, 98.33, NaN, 99, 98.92, 103.44, 103,
NA, 93.8, NA, 92, NaN, 82.25, NaN, NaN, 95.45, 102.8, 93.82
), NARR_G1_150_AC = c(6.4, NA, NA, 5.78, NaN, 5, 4.85, 2.29,
0.5, 12.44, 2.5, 19, 4.71, 3, 8, 1.63, 5.86, NA, 4.82, 9.25,
4.08, NA, 10.75, 9.44, 12.25, 3.6, 15.73, 7.14, 3.73, 7.12,
4.08, 6.33, NA, 5.1, 6.62, 3.08, 10.25, 12.5, 4.56, 7.5,
NA, NA, NaN, 8.6, NA, NA, 13.67, 3.15, 6, 2.22, 2.5, 8, 15,
NA, 6, NaN, 5.5, 8.75, 2.44, 4.33, NA, 13.9, NA, 8.78, NaN,
13.75, NaN, NaN, 7.73, 4.4, 9.36), NARR_G1_150_AB = c(7,
NA, NA, 7.33, NaN, 5.33, 7.4, 3.71, 2.17, 13.33, 2.88, 20.14,
6, 5.14, 8.5, 2.42, 6.43, NA, 5.18, 9.25, 4.08, NA, 12.5,
11.56, 12.25, 3.6, 15.73, 7.14, 4, 8.12, 4.46, 7.33, NA,
5.9, 7.54, 3.67, 13, 13.3, 6.78, 8, NA, NA, NaN, 9.1, NA,
NA, 15.11, 4.15, 6.09, 3.22, 2.5, 10.5, 16.29, NA, 7.33,
NaN, 8.38, 8.83, 4, 5.22, NA, 13.9, NA, 12.11, NaN, 15.25,
NaN, NaN, 9.27, 5, 9.36), NARR_G1_200_AAA = c(127.8, NA,
NA, 120.25, NaN, NaN, 105.85, 126.62, 134.5, 121.4, 126.25,
89.33, 126.23, 136, 120.4, 133.17, 124, NA, 115.5, 106.5,
120.86, NA, 115, 104.25, NaN, 123.22, 100, 114, 120.22, 115.67,
124.38, NaN, NA, 112.6, 119, 137.29, NaN, 118.4, 127, NaN,
NA, NA, NaN, 113.8, NA, NA, 111.5, 123.57, 122.33, 118.8,
130, NaN, 106.38, NA, 123.5, NaN, 123.75, 123.29, 127.2,
126.5, NA, 113.8, NA, 113.75, NaN, 101, NaN, NaN, 117.83,
125, 114.5), NARR_G1_200_AAC = c(130, NA, NA, 123, NaN, NaN,
107.54, 128.75, 136.5, 123, 128.5, 90, 128, 137.33, 121.6,
136.92, 125.5, NA, 117, 108.25, 122.29, NA, 115, 105, NaN,
125.11, 102, 116, 122.33, 117.33, 126.25, NaN, NA, 114.6,
121.12, 138.86, NaN, 119.2, 127.75, NaN, NA, NA, NaN, 114.4,
NA, NA, 113, 124.43, 124, 120.6, 133, NaN, 107, NA, 124.5,
NaN, 127.75, 123.57, 129, 127.5, NA, 115.6, NA, 117, NaN,
101, NaN, NaN, 118.5, 129, 115.5), NARR_G1_200_AC = c(11.2,
NA, NA, 12.5, NaN, NaN, 9.31, 4.25, 2, 17.8, 4.5, 32.33,
7.77, 5.67, 13.4, 2.67, 9.62, NA, 7.67, 15, 6.14, NA, 16,
14.75, NaN, 6.22, 24.33, 11, 6.67, 14.33, 7.62, NaN, NA,
9.4, 9.75, 4.86, NaN, 18.6, 8.25, NaN, NA, NA, NaN, 13.8,
NA, NA, 21.75, 6.14, 9.33, 6, 4.5, NaN, 23.75, NA, 8.5, NaN,
6.75, 13.86, 3.8, 6.75, NA, 21.4, NA, 12.75, NaN, 20, NaN,
NaN, 12.83, 7, 15.83), NARR_G1_200_AB = c(12, NA, NA, 14.5,
NaN, NaN, 12.85, 6.38, 4.5, 18.8, 5.25, 34.67, 9.54, 8.67,
14.4, 4, 10.62, NA, 8.33, 15, 6.29, NA, 18, 17.5, NaN, 6.22,
24.33, 11.33, 7, 15.33, 8.12, NaN, NA, 10.8, 11, 5.71, NaN,
19.6, 10.75, NaN, NA, NA, NaN, 14.6, NA, NA, 24, 7.57, 9.5,
8, 5, NaN, 25.75, NA, 10.5, NaN, 10.5, 14, 6, 8.75, NA, 21.4,
NA, 17.75, NaN, 22, NaN, NaN, 15.5, 8, 15.83), NARR_G1_50_AAA = c(37.69,
NA, NA, 37.02, 35.38, 34.34, 36.19, 37.25, 36.78, 36.83,
36.61, 34.2, 34.24, 37.51, 35.74, 34, 35.02, NA, 37.4, 36.18,
36.63, NA, 34.42, 34.38, 35.43, 37.2, 34.49, 34.2, 36.41,
37.07, 36.56, 34.93, NA, 36.06, 36.49, 35.31, 33.33, 34.27,
36.5, 36.5, NA, NA, 34.21, 36.02, NA, NA, 34.02, 35.59, 37.16,
36.02, 37.58, 36.53, 35.46, NA, 36.46, 38.42, 36.05, 37.39,
37.3, 36.22, NA, 35.31, NA, 33.96, 35.55, 35.03, 35, 35.31,
36.54, 36.06, 34.98), NARR_G1_50_AAC = c(41.85, NA, NA, 40.71,
37.5, 42.38, 39.05, 41.98, 42.51, 42.47, 43.43, 36.41, 42.17,
43.27, 40.42, 43.1, 40.52, NA, 41.65, 38.82, 40.63, NA, 40.35,
39.18, 38.93, 41.44, 38.3, 39.54, 40.73, 41.83, 42.54, 40.34,
NA, 40.69, 40.31, 43.51, 36.13, 39.1, 41.65, 41.62, NA, NA,
38.57, 40.02, NA, NA, 38.26, 42.66, 41.55, 39.7, 42.91, 40.43,
38.87, NA, 40.86, 43.26, 40.55, 40.84, 42.13, 42.09, NA,
40.31, NA, 39.69, 39.73, 36.97, 37.71, 43.44, 40.44, 42.33,
39.65), NARR_G1_50_AC = c(0.77, NA, NA, 0.69, 2.25, 0.45,
0.59, 0.12, 0, 1.15, 0.34, 2.61, 0.61, 0.24, 0.64, 0.26,
0.79, NA, 0.19, 1.43, 0.65, NA, 1.39, 1.11, 1.87, 0.31, 1.98,
1.07, 0.54, 0.29, 0.24, 0.76, NA, 0.59, 1.05, 0.62, 2.17,
2.25, 0.33, 1.62, NA, NA, 1.36, 1.53, NA, NA, 2.22, 0.22,
0.65, 0.45, 0.42, 0.9, 2.18, NA, 0.97, 0.05, 0.84, 0.98,
0, 0.44, NA, 1.83, NA, 1.71, 0.91, 1.16, 1.86, 0.12, 0.69,
0.45, 1.24), NARR_G1_50_AB = c(0.88, NA, NA, 0.82, 2.25,
0.45, 1.03, 0.45, 0.54, 1.36, 0.55, 2.71, 0.96, 0.73, 0.64,
0.47, 0.97, NA, 0.29, 1.43, 0.65, NA, 1.81, 1.69, 1.87, 0.31,
2.02, 1.07, 0.54, 0.52, 0.39, 1.1, NA, 0.8, 1.31, 0.82, 2.9,
2.44, 0.74, 1.62, NA, NA, 1.86, 1.76, NA, NA, 2.48, 0.38,
0.67, 0.66, 0.42, 1.67, 2.38, NA, 1.43, 0.16, 1.64, 1.04,
0.57, 0.69, NA, 1.83, NA, 2.6, 0.91, 1.16, 2.71, 0.75, 0.98,
0.58, 1.24), NARR_G2_100_AAA = c(64.25, 59, NA, 67.88, 67.08,
NA, 60.75, 64.42, 71.17, 58.42, NA, 49.8, 63.36, 65.2, NaN,
70.2, 62.85, NaN, 61.6, 53.92, 62.63, NA, NaN, 50.46, 65.14,
60.58, 63.29, NA, 64.33, NaN, NA, 68.57, NA, NA, 66.3, NA,
57.29, NA, 53.5, 63.48, NA, 57.07, NaN, 61.82, NA, 68.61,
57.1, 62.84, 63, 61.91, 58.38, NaN, 61.56, NA, NaN, 65.55,
63.8, 65, 63.14, 67.31, 67.75, 57.62, 63.31, 54.83, 66.43,
NA, NA, 64.67, 57.92, 59, NA)), row.names = c(NA, -71L), class = "data.frame")
I would suggest pulling your column names into a data frame, separating them into their components, and ordering them as desired:
library(dplyr)
library(tidyr)
col_df = data.frame(names = names(merged_DF)[-1]) ## -1 to skip the ID col
col_df = col_df %>%
  separate(
    col = names, sep = "_",
    into = c("s1", "gnum", "num2", "astring"),
    remove = FALSE, convert = TRUE
  ) %>%
  arrange(s1, num2, astring, gnum)
## now we have the names in order:
col_df
# names s1 gnum num2 astring
# 1 ARG_G1_50_AAA ARG G1 50 AAA
# 2 ARG_G2_50_AAA ARG G2 50 AAA
# 3 ARG_G1_50_AAC ARG G1 50 AAC
# 4 ARG_G2_50_AAC ARG G2 50 AAC
# 5 ARG_G1_50_AB ARG G1 50 AB
# 6 ARG_G2_50_AB ARG G2 50 AB
# 7 ARG_G1_50_AC ARG G1 50 AC
# 8 ARG_G2_50_AC ARG G2 50 AC
# 9 ARG_G1_100_AAA ARG G1 100 AAA
# 10 ARG_G2_100_AAA ARG G2 100 AAA
# ...
## we can use this order to rearrange the columns
merged_DF = select(merged_DF, ID, all_of(col_df$names)) ## all_of() marks col_df$names as an external character vector
names(merged_DF)
# [1] "ID" "ARG_G1_50_AAA" "ARG_G2_50_AAA" "ARG_G1_50_AAC" "ARG_G2_50_AAC"
# [6] "ARG_G1_50_AB" "ARG_G2_50_AB" "ARG_G1_50_AC" "ARG_G2_50_AC" "ARG_G1_100_AAA"
# [11] "ARG_G2_100_AAA" "ARG_G1_100_AAC" "ARG_G2_100_AAC" "ARG_G1_100_AB" "ARG_G2_100_AB"
# [16] "ARG_G1_100_AC" "ARG_G2_100_AC" "ARG_G1_150_AAA" "ARG_G2_150_AAA" "ARG_G1_150_AAC"
# [21] "ARG_G2_150_AAC" "ARG_G1_150_AB" "ARG_G2_150_AB" "ARG_G1_150_AC" "ARG_G2_150_AC"
# [26] "ARG_G1_200_AAA" "ARG_G2_200_AAA" "ARG_G1_200_AAC" "ARG_G2_200_AAC" "ARG_G1_200_AB"
# [31] "ARG_G2_200_AB" "ARG_G1_200_AC" "ARG_G2_200_AC" "NARR_G1_50_AAA" "NARR_G1_50_AAC"
# [36] "NARR_G1_50_AB" "NARR_G1_50_AC" "NARR_G1_100_AAA" "NARR_G2_100_AAA" "NARR_G1_100_AAC"
# [41] "NARR_G1_100_AB" "NARR_G1_100_AC" "NARR_G1_150_AAA" "NARR_G1_150_AAC" "NARR_G1_150_AB"
# [46] "NARR_G1_150_AC" "NARR_G1_200_AAA" "NARR_G1_200_AAC" "NARR_G1_200_AB" "NARR_G1_200_AC"
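Note that arrange() sorts astring alphabetically, so AB lands before AC in the output above, whereas the question asks for AAA, AAC, AC, AB. The following is a minimal base-R sketch of one way to impose that custom order with factor levels, and to build both requested sets. The vector nms is a hypothetical stand-in for names(merged_DF)[-1], and the choice of ARG before NARR in set 1 is an assumption.

```r
# Hypothetical stand-in for names(merged_DF)[-1]: every combination of parts.
grid <- expand.grid(astring = c("AAA", "AAC", "AB", "AC"),
                    num2 = c(100, 150, 200, 50),
                    gnum = c("G1", "G2"),
                    s1 = c("ARG", "NARR"),
                    stringsAsFactors = FALSE)
nms <- with(grid, paste(s1, gnum, num2, astring, sep = "_"))

# Split each name into its four underscore-separated parts.
parts <- do.call(rbind, strsplit(nms, "_", fixed = TRUE))
colnames(parts) <- c("s1", "gnum", "num2", "astring")

# Factor levels encode the requested, non-alphabetical orders.
suffix <- factor(parts[, "astring"], levels = c("AAA", "AAC", "AC", "AB"))
s1_f   <- factor(parts[, "s1"], levels = c("NARR", "ARG"))
num2   <- as.integer(parts[, "num2"])  # integer, so 50 sorts before 100

# Set 1: G1 and G2 alternate within each s1 / number / suffix.
set1 <- nms[order(parts[, "s1"], num2, suffix, parts[, "gnum"])]

# Set 2: NARR and ARG alternate within each number / suffix / G group.
set2 <- nms[order(num2, suffix, parts[, "gnum"], s1_f)]

head(set1, 8)
head(set2, 4)
```

Assuming every name in set1 exists in the data frame, merged_DF[c("ID", set1)] would then reorder the columns.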
I bet that there are simpler ways of doing this but this one seems to work.
intercalate <- function(X, pattern) {
  f <- function(h, n) {
    i <- seq(1, length(h), by = 2)
    j <- seq(2, length(h), by = 2)
    h[order(c(i, j))]
  }
  #
  g <- function(x, y) {
    nx <- length(x)
    ny <- length(y)
    if(nx == ny) {
      h <- c(x, y)
      f(h, nx)
    } else if(nx > ny) {
      h <- c(x[seq_along(y)], y)
      h <- f(h, ny)
      c(h, x[-seq_along(y)])
    } else {
      h <- c(x, y[seq_along(x)])
      h <- f(h, nx)
      c(h, y[-seq_along(x)])
    }
  }
  #
  s <- grepl(pattern = pattern, X)
  s <- abs(c(0, diff(s)))
  sp <- split(X, cumsum(s))
  i_odd <- seq(1, length(sp), by = 2)
  i_even <- seq(2, length(sp), by = 2)
  new_names <- mapply(g, sp[i_odd], sp[i_even])
  unname(unlist(new_names))
}
newnames <- intercalate(names(merged_DF)[-1], pattern = "G2")
newnames <- c(names(merged_DF)[1], newnames)
merged_DF[newnames]
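For the simpler case where the two groups have the same length and are already in matching order, a common base-R idiom may be enough: stack the vectors with rbind() and flatten the result column by column with c(). This is only a sketch under that equal-length assumption; the vector nms is a hypothetical stand-in for names(merged_DF).

```r
# Hypothetical stand-in for names(merged_DF).
nms <- c("ID",
         "ARG_G1_50_AAA", "ARG_G1_50_AAC", "ARG_G1_50_AC", "ARG_G1_50_AB",
         "ARG_G2_50_AAA", "ARG_G2_50_AAC", "ARG_G2_50_AC", "ARG_G2_50_AB")

# Pick out the two groups by pattern (base-R equivalent of str_subset()).
g1 <- grep("_G1_50_", nms, value = TRUE)
g2 <- grep("_G2_50_", nms, value = TRUE)

# The idiom only interleaves cleanly when both groups have the same length.
stopifnot(length(g1) == length(g2))

# rbind() builds a 2-row matrix; c() reads it column by column,
# which alternates g1[1], g2[1], g1[2], g2[2], ...
interleaved <- c(rbind(g1, g2))
interleaved
```

Unlike the function above, this does not handle groups of unequal length, so the length check is there to fail early in that case.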
This is probably insufficient to the task:
strings <- c('ARG_G1_50_AAA' ,'ARG_G1_50_AAC', 'ARG_G1_50_AC' ,'ARG_G1_50_AB',
'ARG_G2_50_AAA' ,'ARG_G2_50_AAC', 'ARG_G2_50_AC')
substring(strings, regexpr('_\\K[[:upper:]]{2,3}', strings, perl = TRUE), nchar(strings))
[1] "AAA" "AAC" "AC" "AB" "AAA" "AAC" "AC"
idx_strings <- order(substring(strings, regexpr('_\\K[[:upper:]]{2,3}', strings, perl = TRUE), nchar(strings)))
idx_strings
[1] 1 5 2 6 4 3 7
strings[idx_strings]
[1] "ARG_G1_50_AAA" "ARG_G2_50_AAA" "ARG_G1_50_AAC" "ARG_G2_50_AAC"
[5] "ARG_G1_50_AB" "ARG_G1_50_AC" "ARG_G2_50_AC"
Nearly the desired 'set1' results for 'NARR_' and 'ARG_' can be obtained as follows.
For 'NARR_', using @akrun's data v1 (though [7] & [8] appear reversed):
idx_v1_N <- which(regexpr('^[N]', v1, perl = TRUE) == 1)
v1[idx_v1_N[order(
substring(v1[idx_v1_N],
regexpr('[^_.G][\\d_]\\d.+[[:upper:]]', v1[idx_v1_N], perl = TRUE),
nchar(v1[idx_v1_N]))[idx_v1_N])]]
[1] "NARR_G1_100_AAC" "NARR_G1_100_AB" "NARR_G2_100_AC" "NARR_G1_150_AAC"
[5] "NARR_G1_150_AB" "NARR_G1_100_AAA" "NARR_G2_150_AAA" "NARR_G2_100_AAA"
[9] "NARR_G1_100_AC" "NARR_G1_150_AAA" "NARR_G1_150_AC" "NARR_G2_100_AAC"
[13] "NARR_G2_50_AB" "NARR_G1_50_AC" "NARR_G1_50_AAA" "NARR_G2_150_AB"
[17] "NARR_G2_150_AAC" "NARR_G2_50_AC" "NARR_G1_50_AAC" "NARR_G2_150_AC"
[21] "NARR_G1_50_AB" "NARR_G2_100_AB" "NARR_G2_50_AAA" "NARR_G2_50_AAC"
the substring and regexpr '[^_.G][\\d_]\\d.+[[:upper:]]' return
substring(v1[idx_v1_N], regexpr('[^_.G][\\d_]\\d.+[[:upper:]]', v1[idx_v1_N], perl = TRUE), nchar(v1[idx_v1_N]))
[1] "1_100_AB" "1_150_AAC" "2_50_AB" "1_150_AB" "2_100_AAA" "1_100_AAC"
[7] "1_150_AAA" "2_100_AC" "1_100_AAA" "1_150_AC" "2_100_AAC" "2_150_AAA"
[13] "1_100_AC" "1_50_AC" "1_50_AAA" "2_150_AB" "2_150_AAC" "2_50_AC"
[19] "1_50_AAC" "2_150_AC" "1_50_AB" "2_100_AB" "2_50_AAA" "2_50_AAC"
which is then order()ed nearly correctly. Results for 'ARG_' just need an index for names starting with 'A'. There are better hammers for this nail, as seen above.
I have two matrices in R lag_mat and r_mat and both have dimensions 16x16x3x2x2.
I have the following code that I use to plot these in R.
library(R.matlab)
library("wesanderson")
library("ggplot2")
library("ggsci")
library(corrplot)
library(plotly)
library(viridis)
#CCO left and right stimulation time window 2
lag_mat = matrix(CCO_lag[, , 1,2], 16)
r_mat = matrix(CCO[, , 1,2], 16)
row = c(row(lag_mat))
col = c(col(lag_mat))
dd = data.frame( lag = c(lag_mat), r = c(r_mat), row, col )
p1 <- ggplot(dd, aes(x = row, y = col, size = lag, color = r)) +
  geom_point( alpha = 1.5, stroke = 2.5) +
  ggtitle("CCO, RIGHT Stimulation") +
  theme(plot.title = element_text(size=10, face="bold"),
        legend.position = "none",
        axis.title.x=element_blank(),
        axis.title.y=element_blank(),
        panel.grid.major = element_line(size = 0.5, linetype = 'solid',
                                        colour = "white"),
        panel.grid.minor = element_line(size = 0.5, linetype = 'solid',
                                        colour = "white"),axis.text.x = element_text(size=8)) +
  # scale_color_viridis( begin = 0.2 , end = 1, direction = 1 )+
  scale_color_gradient2(low = "#4169E1" , mid = "#ffffbf" , high = "#FF8C00", limits=c(-1 ,1)) +
  # scale_y_reverse() +
  # scale_size_area(trans = "reverse")+
  scale_size_continuous(range = c(5,0),limits=c(-12,0))+
  scale_x_discrete(limits=c("CP1","P7","P3","Pz","PO3","T1", "M1","Oz","M2","T2","PO4","P4","P8", "CP2","Cz","Fz")) +
  scale_y_continuous(limits = c(1,16),breaks=seq(1,16,1))
The issue that I am having is that I need to loop through the last dimension. I have run some more analyses, and the last dimension of the matrices is now 21 instead of 2. Until now I simply used two scripts, one per slice of the last dimension (not very efficient, I know): one where I plotted
r_mat = matrix(CCO[, , 1,1], 16)
and the other for
r_mat = matrix(CCO[, , 1,2], 16)
But now, of course, I can't maintain 21 scripts, and I'm unsure how to loop and plot in R.
Can anyone help me with this, so that I can loop through the last dimension and plot 21 figures using ggplot?
Thanks!
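To make the goal concrete, here is a minimal sketch of the kind of loop being asked about, not a tested solution for the real data. It assumes CCO and CCO_lag are 4-D arrays whose last dimension has 21 slices; the random arrays below are hypothetical stand-ins, and the plot is reduced to its essentials.

```r
# Hypothetical stand-ins for the real arrays: 16 x 16 x 1 x 21, random values.
set.seed(1)
CCO     <- array(runif(16 * 16 * 21, -1, 1),  dim = c(16, 16, 1, 21))
CCO_lag <- array(runif(16 * 16 * 21, -12, 0), dim = c(16, 16, 1, 21))

# Build the long data frame for slice k of the last dimension.
slice_df <- function(k) {
  lag_mat <- matrix(CCO_lag[, , 1, k], 16)
  r_mat   <- matrix(CCO[, , 1, k], 16)
  data.frame(lag = c(lag_mat), r = c(r_mat),
             row = c(row(lag_mat)), col = c(col(lag_mat)))
}

# One data frame per slice, instead of one script per slice.
n_slices <- dim(CCO)[4]
dd_list  <- lapply(seq_len(n_slices), slice_df)

# One ggplot object per slice, kept in a list (skipped if ggplot2 is missing).
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  plots <- lapply(seq_along(dd_list), function(k) {
    ggplot(dd_list[[k]], aes(x = row, y = col, size = lag, color = r)) +
      geom_point() +
      ggtitle(paste("CCO, slice", k))
  })
  # e.g. print(plots[[1]]), or ggsave("CCO_01.png", plots[[1]])
}
```

The theme and scale settings from the single-plot code could be added inside the function passed to lapply(), so that all 21 figures share them.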
Here is the data. I have reproduced a smaller matrix, such that both matrices are now of dimension 16x16x1x2.
CCO<-structure(c(-0.492578655481339, NaN, NaN, NaN, -0.492525190114975,
-0.492525696754456, NaN, -0.492627799510956, -0.492677986621857,
-0.492468953132629, NaN, NaN, NaN, -0.49228835105896, -0.492546766996384,
-0.492437690496445, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, -0.521651923656464, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
0.473261743783951, NaN, 0.472789525985718, -0.600778460502625,
NaN, NaN, -0.600829541683197, -0.6008580327034, -0.601057589054108,
NaN, -0.600822031497955, -0.600911736488342, -0.600730240345001,
NaN, NaN, NaN, -0.600953936576843, -0.600802004337311, -0.600861430168152,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, -0.521026790142059, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, -0.577225089073181, NaN, NaN, -0.577208399772644,
-0.577145278453827, -0.577321112155914, NaN, -0.577184557914734,
-0.577165722846985, -0.577133357524872, NaN, NaN, NaN, -0.577190637588501,
-0.577230930328369, -0.577144026756287, -0.41020467877388, NaN,
NaN, NaN, -0.410186648368835, -0.410334318876266, NaN, -0.410211980342865,
-0.410197377204895, -0.410110324621201, NaN, NaN, NaN, -0.410272806882858,
NaN, NaN, -0.733388960361481, NaN, NaN, NaN, NaN, -0.733434438705444,
NaN, -0.733347833156586, -0.733303666114807, -0.733347356319427,
NaN, NaN, NaN, NaN, -0.733397245407104, -0.73332667350769, -0.702324509620667,
NaN, NaN, NaN, NaN, NaN, NaN, -0.702237844467163, -0.702238082885742,
-0.702193081378937, NaN, NaN, NaN, -0.702261865139008, -0.702301025390625,
NaN, -0.80294394493103, NaN, NaN, -0.802956938743591, -0.802938997745514,
-0.803096830844879, NaN, -0.802961885929108, -0.802923500537872,
-0.802861630916595, NaN, NaN, NaN, -0.803063333034515, -0.802979350090027,
-0.802873134613037, -0.684592604637146, NaN, NaN, -0.684580564498901,
-0.684580743312836, -0.684802889823914, NaN, -0.684630811214447,
-0.684578239917755, -0.684465110301971, NaN, NaN, NaN, -0.684730887413025,
-0.684608578681946, -0.684436023235321, -0.606923937797546, NaN,
NaN, NaN, -0.606987476348877, NaN, NaN, -0.606982827186584, NaN,
NaN, NaN, NaN, NaN, -0.606993675231934, NaN, NaN, -0.746234655380249,
NaN, NaN, -0.7463099360466, -0.746258854866028, -0.746564209461212,
NaN, -0.746362566947937, -0.746387183666229, -0.746385276317596,
NaN, NaN, NaN, -0.746756434440613, -0.746286571025848, -0.746472299098969,
NaN, NaN, NaN, NaN, -0.526792407035828, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, -0.526629209518433, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, -0.402197241783142, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, -0.515719473361969, NaN, NaN, NaN, -0.515782594680786,
-0.516006171703339, NaN, -0.515946447849274, -0.515853404998779,
-0.515883803367615, NaN, NaN, NaN, -0.515994668006897, -0.515867114067078,
-0.515911042690277, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, -0.4820496737957, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
0.535082995891571, NaN, 0.534462213516235, -0.567049205303192,
NaN, NaN, -0.567097425460815, -0.567124307155609, -0.567312657833099,
NaN, -0.567090332508087, -0.567174971103668, -0.567003667354584,
NaN, NaN, NaN, -0.567214787006378, -0.567071437835693, -0.567127525806427,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, -0.437827885150909, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, -0.496517241001129, NaN, NaN, -0.496502816677094,
-0.496448516845703, -0.496599793434143, NaN, -0.496482282876968,
-0.496466100215912, -0.496438264846802, NaN, NaN, NaN, -0.496487557888031,
-0.496522217988968, -0.496447324752808, 0.43168780207634, NaN,
NaN, NaN, 0.43162015080452, 0.431624948978424, NaN, 0.43173423409462,
0.431787043809891, 0.431506514549255, NaN, NaN, NaN, 0.431388199329376,
NaN, NaN, -0.673626005649567, NaN, NaN, NaN, NaN, -0.673667669296265,
NaN, -0.67358809709549, -0.673547565937042, -0.673587679862976,
NaN, NaN, NaN, NaN, -0.673633456230164, -0.673568665981293, -0.657320320606232,
NaN, NaN, NaN, NaN, NaN, NaN, -0.65728884935379, -0.657253861427307,
-0.657285273075104, NaN, NaN, NaN, -0.657291948795319, -0.657335460186005,
NaN, -0.793729186058044, NaN, NaN, -0.793741881847382, -0.793724238872528,
-0.793880224227905, NaN, -0.793746829032898, -0.793708860874176,
-0.793647706508636, NaN, NaN, NaN, -0.793846964836121, -0.793764173984528,
-0.793659150600433, -0.639408528804779, NaN, NaN, -0.639397382736206,
-0.63939756155014, -0.639605164527893, NaN, -0.639444351196289,
-0.63939505815506, -0.639289438724518, NaN, NaN, NaN, -0.639537692070007,
-0.639423429965973, -0.63926237821579, -0.567462205886841, NaN,
NaN, NaN, -0.567524492740631, NaN, NaN, -0.567518472671509, NaN,
NaN, NaN, NaN, NaN, -0.567527711391449, NaN, NaN, -0.76900988817215,
NaN, NaN, -0.769101619720459, -0.769054174423218, -0.769321501255035,
NaN, -0.769179046154022, -0.769175291061401, -0.769182145595551,
NaN, NaN, NaN, -0.769531965255737, -0.769078016281128, -0.769262313842773,
NaN, NaN, NaN, NaN, -0.0669489949941635, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, -0.0665916055440903, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, 0.425303876399994, NaN, NaN, NaN,
NaN, NaN, NaN, NaN), .Dim = c(16L, 16L, 2L))
CCO_lag<-structure(c(0, NaN, NaN, NaN, 0, 0, NaN, 0, 1, 0, NaN, NaN, NaN,
1, 0, 0, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, -3, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, 5, NaN, 5, -3, NaN, NaN,
-3, -3, -3, NaN, -3, -3, -3, NaN, NaN, NaN, -3, -3, -3, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, -1, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, -3, NaN, NaN, -3, -3, -3, NaN, -3, -3, -3, NaN, NaN,
NaN, -3, -3, -3, -4, NaN, NaN, NaN, -4, -4, NaN, -4, -4, -4,
NaN, NaN, NaN, -4, NaN, NaN, 0, NaN, NaN, NaN, NaN, 0, NaN, 0,
0, 0, NaN, NaN, NaN, NaN, 0, 0, 0, NaN, NaN, NaN, NaN, NaN, NaN,
0, 0, 0, NaN, NaN, NaN, 0, 0, NaN, 0, NaN, NaN, 1, 0, 1, NaN,
0, 1, 1, NaN, NaN, NaN, 1, 0, 1, -2, NaN, NaN, -2, -2, -2, NaN,
-2, -2, -1, NaN, NaN, NaN, -2, -2, -2, 0, NaN, NaN, NaN, 0.5,
NaN, NaN, 0.5, NaN, NaN, NaN, NaN, NaN, 0.5, NaN, NaN, 1, NaN,
NaN, 1, 1, 1, NaN, 1, 1, 1, NaN, NaN, NaN, 1, 1, 1, NaN, NaN,
NaN, NaN, 0, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
0, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, 0.5, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, 0, NaN, NaN, NaN, 0, 0, NaN, 0, 0, 0, NaN,
NaN, NaN, 0, 0, 0, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, -4, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, 5, NaN, 5,
-2, NaN, NaN, -2, -2, -2, NaN, -2, -2, -2, NaN, NaN, NaN, -2,
-2, -2, NaN, NaN, NaN, NaN, NaN, NaN, NaN, -2, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, -2, NaN, NaN, -2, -2, -2, NaN, -2, -2,
-2, NaN, NaN, NaN, -2, -2, -2, -2, NaN, NaN, NaN, -2, -2, NaN,
-2, -2, -2, NaN, NaN, NaN, -2, NaN, NaN, -1, NaN, NaN, NaN, NaN,
-1, NaN, -1, -1, -1, NaN, NaN, NaN, NaN, -1, -1, -1, NaN, NaN,
NaN, NaN, NaN, NaN, -1, -1, -1, NaN, NaN, NaN, -1, -1, NaN, 0,
NaN, NaN, 0, 0, 0, NaN, 0, 0, 0, NaN, NaN, NaN, 0, 0, 0, -3,
NaN, NaN, -3, -3, -3, NaN, -3, -3, -2, NaN, NaN, NaN, -3, -3,
-3, 0, NaN, NaN, NaN, 0, NaN, NaN, 0, NaN, NaN, NaN, NaN, NaN,
0, NaN, NaN, 0, NaN, NaN, 0, 0, 0, NaN, 0, 0, 0, NaN, NaN, NaN,
0, 0, 0, NaN, NaN, NaN, NaN, 0, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, 0, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
0, NaN, NaN, NaN, NaN, NaN, NaN, NaN), .Dim = c(16L, 16L, 2L))
You could loop along the desired dimension of your array with lapply(seq_len(dim(my_array)[n]), ...), where n is your dimension of interest.
If you then use function(i) {...} inside the lapply() and put i in the matching position of the subsetting operation, it picks out the appropriate slice.
If the last expression of the function evaluates to a ggplot object, lapply() collects those objects in a list. Simplified example below:
library(ggplot2)
# Simulated arrays standing in for the real data
CCO <- array(rnorm(prod(16, 2, 1, 21)), c(16, 2, 1, 21))
CCO_lag <- array(rnorm(prod(16, 2, 1, 21)), c(16, 2, 1, 21))
# One plot per slice along the 4th dimension
plots <- lapply(seq_len(dim(CCO)[4]), function(i) {
  lag_mat <- matrix(CCO_lag[, , 1, i], 16)
  r_mat <- matrix(CCO[, , 1, i], 16)
  row <- c(row(lag_mat))
  col <- c(col(lag_mat))
  dd <- data.frame(lag = c(lag_mat), r = c(r_mat), row, col)
  # alpha must lie in [0, 1]
  ggplot(dd, aes(x = row, y = col)) +
    geom_point(alpha = 1, stroke = 2.5)
})
# Just to show plots come out
patchwork::wrap_plots(plots)
Created on 2021-01-07 by the reprex package (v0.3.0)
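The same slicing pattern can be checked without ggplot2. This is a minimal sketch, assuming a small made-up 3-d array called arr (hypothetical, not the asker's data), that sums each slice along the third dimension:

```r
# Hypothetical 2 x 3 x 4 array filled with 1..24
arr <- array(seq_len(2 * 3 * 4), dim = c(2, 3, 4))

# Loop over the 3rd dimension; i goes in the 3rd subscript position
slice_sums <- lapply(seq_len(dim(arr)[3]), function(i) {
  slice <- arr[, , i]  # a 2 x 3 matrix
  sum(slice)           # last expression is what lapply() collects
})

length(slice_sums)  # one list element per slice
```

Swapping sum(slice) for a ggplot() call gives a list of plots, as in the answer above.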
I am trying to pull data from one data frame to another based on the equivalent of a VLOOKUP table in Excel. I have had a look at the most popular VLOOKUP question in R, but I cannot see how it applies to my specific problem. The key thing is that I don't want to pull all of the columns from the second data frame into my first one - I only want to pull in one column. I'm pretty sure this will be some kind of derivation of a merge function.
Referring to the data below, I am trying to create a new column called df1$Trait1Percentile. This needs to draw from LookupTable$Trait1Percentiles based on a match between df1$Trait1Score and LookupTable$Scores.
#Import data.
df1 <- structure(list(JobNumber = c(634L, 21L, 300L, 797L, 1112L, 147L,
1L, 4L, 260L, 194L, 981L, 1110L, 634L, 554L, 213L, 722L, 1036L,
855L, 624L, 1113L, 681L, 547L, 195L, 624L, 546L, 201L, 918L,
1069L, 300L, 294L, 587L, 933L, 918L, 620L, 918L, 298L, 749L,
295L, 635L, 515L, 624L, 147L, 200L, 527L, 800L, 827L, 4L, 568L,
252L, 655L, 559L, 629L, 639L, 933L, 214L, 750L, 1066L, 495L,
1113L, 1L, 1113L, 12L, 561L, 741L, 495L, 981L, 147L, 199L, 629L,
163L, 615L, 294L, 49L, 624L, 260L, 1L, 299L, 193L, 108L, 113L,
426L, 299L, 708L, 749L, 749L, 483L, 935L, 1036L, 295L, 12L, 1113L,
1038L, 4L, 973L, 448L, 295L, 197L, 76L, 1L, 1L), Trait1Score = c(3.89,
4.39, 4.22, 4.21, 3.94, 3.9, 4.58, 4.5, 4.29, 4.47, 4.41, 4.4,
4.14, 4.78, 4.09, 4.58, 4.27, 4.24, 3.96, 3.94, 4.3, 4.07, 4.28,
4.19, 4.57, 4.74, 3.29, 4.23, 3.51, 3.77, 4.46, 5.04, 4.25, 3.92,
3.78, 4.43, 4.12, 4.18, 4.63, 3.25, 3.87, 4.4, 3.83, 4.03, 3.42,
4.9, 4.09, 4.58, 4.29, 4.7, 4.38, 4.61, 4.41, 4.5, 4.6, 4.22,
3.72, 4.34, 4.34, 4.38, 4.15, 4.22, 3.93, 5, 3.81, 4.3, 4.6,
4.96, 4.29, 4.8, 5.05, 3.76, 4.81, 4.77, 4.25, 4.17, 4.75, 4.15,
4.35, 4.23, 5.31, 4.18, 3.67, 3.84, 4.06, 3.66, 3.58, 4.37, 4.43,
4.63, 4.74, 4.79, 5.04, 3.55, 3.64, 4.9, 4.38, 4.01, 4.47, 4.53
), Trait2Score = c(4, 2.94, 3.17, 3.83, 4.22, 3.83, 5.11, 3,
2.83, 2.78, 2.22, 2.22, 4.11, 2.39, 2.22, 2.06, 2.89, 3.61, 3.89,
4.89, 3.78, 4.22, 4.5, 4.39, 1.89, 4.78, 4.56, 3.78, 2.28, 4.61,
2.72, 1.89, 4.44, 4.06, 3.72, 2.44, 3.61, 2.06, 2.17, 6.44, 3.22,
2.78, 4.61, 2.72, 2.83, 2.44, 6.5, 2.28, 2.89, 2.11, 4.44, 2.83,
3, 6.33, 3.11, 3.17, 3.67, 4.5, 2.5, 4.33, 5, 2.89, 3.89, 1.72,
3.33, 4.28, 2.17, 3.17, 2.61, 2.89, 1.22, 3.39, 1.28, 2.61, 2.5,
4.56, 2.89, 4.89, 3.11, 3.5, 1.44, 2.39, 5.33, 3.78, 1.5, 3.44,
5.83, 3.17, 3.78, 2.67, 1.61, 1.83, 4.56, 4.67, 4.61, 2.5, 4.94,
3.94, 4.33, 2.72)), row.names = c(NA, -100L), class = "data.frame")
LookupTable <- structure(list(Scores = c(0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06,
0.07, 0.08, 0.09, 0.1, 0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17,
0.18, 0.19, 0.2, 0.21, 0.22, 0.23, 0.24, 0.25, 0.26, 0.27, 0.28,
0.29, 0.3, 0.31, 0.32, 0.33, 0.34, 0.35, 0.36, 0.37, 0.38, 0.39,
0.4, 0.41, 0.42, 0.43, 0.44, 0.45, 0.46, 0.47, 0.48, 0.49, 0.5,
0.51, 0.52, 0.53, 0.54, 0.55, 0.56, 0.57, 0.58, 0.59, 0.6, 0.61,
0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.7, 0.71, 0.72,
0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.8, 0.81, 0.82, 0.83,
0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.9, 0.91, 0.92, 0.93, 0.94,
0.95, 0.96, 0.97, 0.98, 0.99, 1, 1.01, 1.02, 1.03, 1.04, 1.05,
1.06, 1.07, 1.08, 1.09, 1.1, 1.11, 1.12, 1.13, 1.14, 1.15, 1.16,
1.17, 1.18, 1.19, 1.2, 1.21, 1.22, 1.23, 1.24, 1.25, 1.26, 1.27,
1.28, 1.29, 1.3, 1.31, 1.32, 1.33, 1.34, 1.35, 1.36, 1.37, 1.38,
1.39, 1.4, 1.41, 1.42, 1.43, 1.44, 1.45, 1.46, 1.47, 1.48, 1.49,
1.5, 1.51, 1.52, 1.53, 1.54, 1.55, 1.56, 1.57, 1.58, 1.59, 1.6,
1.61, 1.62, 1.63, 1.64, 1.65, 1.66, 1.67, 1.68, 1.69, 1.7, 1.71,
1.72, 1.73, 1.74, 1.75, 1.76, 1.77, 1.78, 1.79, 1.8, 1.81, 1.82,
1.83, 1.84, 1.85, 1.86, 1.87, 1.88, 1.89, 1.9, 1.91, 1.92, 1.93,
1.94, 1.95, 1.96, 1.97, 1.98, 1.99, 2, 2.01, 2.02, 2.03, 2.04,
2.05, 2.06, 2.07, 2.08, 2.09, 2.1, 2.11, 2.12, 2.13, 2.14, 2.15,
2.16, 2.17, 2.18, 2.19, 2.2, 2.21, 2.22, 2.23, 2.24, 2.25, 2.26,
2.27, 2.28, 2.29, 2.3, 2.31, 2.32, 2.33, 2.34, 2.35, 2.36, 2.37,
2.38, 2.39, 2.4, 2.41, 2.42, 2.43, 2.44, 2.45, 2.46, 2.47, 2.48,
2.49, 2.5, 2.51, 2.52, 2.53, 2.54, 2.55, 2.56, 2.57, 2.58, 2.59,
2.6, 2.61, 2.62, 2.63, 2.64, 2.65, 2.66, 2.67, 2.68, 2.69, 2.7,
2.71, 2.72, 2.73, 2.74, 2.75, 2.76, 2.77, 2.78, 2.79, 2.8, 2.81,
2.82, 2.83, 2.84, 2.85, 2.86, 2.87, 2.88, 2.89, 2.9, 2.91, 2.92,
2.93, 2.94, 2.95, 2.96, 2.97, 2.98, 2.99, 3, 3.01, 3.02, 3.03,
3.04, 3.05, 3.06, 3.07, 3.08, 3.09, 3.1, 3.11, 3.12, 3.13, 3.14,
3.15, 3.16, 3.17, 3.18, 3.19, 3.2, 3.21, 3.22, 3.23, 3.24, 3.25,
3.26, 3.27, 3.28, 3.29, 3.3, 3.31, 3.32, 3.33, 3.34, 3.35, 3.36,
3.37, 3.38, 3.39, 3.4, 3.41, 3.42, 3.43, 3.44, 3.45, 3.46, 3.47,
3.48, 3.49, 3.5, 3.51, 3.52, 3.53, 3.54, 3.55, 3.56, 3.57, 3.58,
3.59, 3.6, 3.61, 3.62, 3.63, 3.64, 3.65, 3.66, 3.67, 3.68, 3.69,
3.7, 3.71, 3.72, 3.73, 3.74, 3.75, 3.76, 3.77, 3.78, 3.79, 3.8,
3.81, 3.82, 3.83, 3.84, 3.85, 3.86, 3.87, 3.88, 3.89, 3.9, 3.91,
3.92, 3.93, 3.94, 3.95, 3.96, 3.97, 3.98, 3.99, 4, 4.01, 4.02,
4.03, 4.04, 4.05, 4.06, 4.07, 4.08, 4.09, 4.1, 4.11, 4.12, 4.13,
4.14, 4.15, 4.16, 4.17, 4.18, 4.19, 4.2, 4.21, 4.22, 4.23, 4.24,
4.25, 4.26, 4.27, 4.28, 4.29, 4.3, 4.31, 4.32, 4.33, 4.34, 4.35,
4.36, 4.37, 4.38, 4.39, 4.4, 4.41, 4.42, 4.43, 4.44, 4.45, 4.46,
4.47, 4.48, 4.49, 4.5, 4.51, 4.52, 4.53, 4.54, 4.55, 4.56, 4.57,
4.58, 4.59, 4.6, 4.61, 4.62, 4.63, 4.64, 4.65, 4.66, 4.67, 4.68,
4.69, 4.7, 4.71, 4.72, 4.73, 4.74, 4.75, 4.76, 4.77, 4.78, 4.79,
4.8, 4.81, 4.82, 4.83, 4.84, 4.85, 4.86, 4.87, 4.88, 4.89, 4.9,
4.91, 4.92, 4.93, 4.94, 4.95, 4.96, 4.97, 4.98, 4.99, 5, 5.01,
5.02, 5.03, 5.04, 5.05, 5.06, 5.07, 5.08, 5.09, 5.1, 5.11, 5.12,
5.13, 5.14, 5.15, 5.16, 5.17, 5.18, 5.19, 5.2, 5.21, 5.22, 5.23,
5.24, 5.25, 5.26, 5.27, 5.28, 5.29, 5.3, 5.31, 5.32, 5.33, 5.34,
5.35, 5.36, 5.37, 5.38, 5.39, 5.4, 5.41, 5.42, 5.43, 5.44, 5.45,
5.46, 5.47, 5.48, 5.49, 5.5, 5.51, 5.52, 5.53, 5.54, 5.55, 5.56,
5.57, 5.58, 5.59, 5.6, 5.61, 5.62, 5.63, 5.64, 5.65, 5.66, 5.67,
5.68, 5.69, 5.7, 5.71, 5.72, 5.73, 5.74, 5.75, 5.76, 5.77, 5.78,
5.79, 5.8, 5.81, 5.82, 5.83, 5.84, 5.85, 5.86, 5.87, 5.88, 5.89,
5.9, 5.91, 5.92, 5.93, 5.94, 5.95, 5.96, 5.97, 5.98, 5.99, 6,
6.01, 6.02, 6.03, 6.04, 6.05, 6.06, 6.07, 6.08, 6.09, 6.1, 6.11,
6.12, 6.13, 6.14, 6.15, 6.16, 6.17, 6.18, 6.19, 6.2, 6.21, 6.22,
6.23, 6.24, 6.25, 6.26, 6.27, 6.28, 6.29, 6.3, 6.31, 6.32, 6.33,
6.34, 6.35, 6.36, 6.37, 6.38, 6.39, 6.4, 6.41, 6.42, 6.43, 6.44,
6.45, 6.46, 6.47, 6.48, 6.49, 6.5, 6.51, 6.52, 6.53, 6.54, 6.55,
6.56, 6.57, 6.58, 6.59, 6.6, 6.61, 6.62, 6.63, 6.64, 6.65, 6.66,
6.67, 6.68, 6.69, 6.7, 6.71, 6.72, 6.73, 6.74, 6.75, 6.76, 6.77,
6.78, 6.79, 6.8, 6.81, 6.82, 6.83, 6.84, 6.85, 6.86, 6.87, 6.88,
6.89, 6.9, 6.91, 6.92, 6.93, 6.94, 6.95, 6.96, 6.97, 6.98, 6.99,
7), Trait1Percentiles = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0.03, 0.03, 0.03, 0.03, 0.03, 0.03,
0.03, 0.03, 0.03, 0.03, 0.03, 0.03, 0.03, 0.06, 0.06, 0.06, 0.06,
0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06,
0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06,
0.06, 0.06, 0.09, 0.09, 0.09, 0.09, 0.09, 0.09, 0.13, 0.13, 0.13,
0.13, 0.13, 0.13, 0.13, 0.13, 0.13, 0.16, 0.19, 0.19, 0.19, 0.19,
0.25, 0.28, 0.35, 0.38, 0.41, 0.5, 0.5, 0.5, 0.5, 0.54, 0.57,
0.57, 0.6, 0.6, 0.66, 0.66, 0.69, 0.69, 0.85, 0.85, 0.91, 1.01,
1.04, 1.07, 1.1, 1.13, 1.2, 1.23, 1.32, 1.42, 1.48, 1.48, 1.67,
1.73, 1.89, 1.98, 2.14, 2.14, 2.14, 2.33, 2.33, 2.52, 2.52, 2.77,
2.77, 3.12, 3.34, 3.46, 3.75, 3.97, 4.16, 4.57, 4.82, 5.1, 5.26,
5.45, 5.61, 5.73, 6.14, 6.36, 6.65, 7.09, 7.43, 7.43, 8.31, 8.31,
9.01, 9.01, 9.51, 9.51, 10.65, 11.15, 11.69, 12.03, 12.6, 13.39,
14.08, 14.61, 14.96, 15.59, 16.5, 17.23, 18.02, 18.8, 19.78,
20.79, 21.57, 22.33, 22.93, 22.93, 25.01, 25.92, 26.9, 26.9,
28.79, 29.83, 31.28, 32.35, 33.45, 34.43, 35.43, 36.91, 37.95,
39.31, 40.88, 42.05, 43.15, 44.22, 45.61, 46.87, 48.22, 49.23,
50.77, 52.03, 52.03, 54.46, 55.81, 56.94, 56.94, 59.37, 60.66,
61.95, 61.95, 64.28, 65.48, 66.96, 68, 68.79, 69.73, 70.55, 71.43,
72.44, 73.48, 74.3, 74.99, 75.81, 76.76, 77.73, 78.46, 78.46,
80.16, 80.79, 81.64, 81.64, 83.28, 83.97, 84.63, 84.63, 85.76,
86.27, 86.99, 87.46, 87.91, 88.35, 88.69, 88.91, 89.32, 89.73,
90.11, 90.43, 90.8, 91.24, 91.5, 91.69, 91.69, 92.44, 92.72,
93.2, 93.2, 93.76, 93.92, 94.17, 94.17, 94.52, 94.83, 95.21,
95.46, 95.72, 95.87, 95.94, 96.19, 96.35, 96.6, 96.85, 96.98,
97.1, 97.1, 97.2, 97.32, 97.32, 97.54, 97.64, 97.76, 97.76, 97.92,
97.98, 98.05, 98.05, 98.24, 98.3, 98.36, 98.39, 98.55, 98.58,
98.68, 98.77, 98.83, 98.87, 98.87, 98.99, 99.06, 99.06, 99.06,
99.06, 99.06, 99.06, 99.15, 99.15, 99.15, 99.28, 99.28, 99.31,
99.31, 99.31, 99.5, 99.53, 99.59, 99.62, 99.62, 99.65, 99.65,
99.65, 99.65, 99.69, 99.69, 99.72, 99.78, 99.78, 99.81, 99.81,
99.81, 99.81, 99.81, 99.81, 99.81, 99.81, 99.81, 99.81, 99.81,
99.81, 99.81, 99.81, 99.81, 99.81, 99.81, 99.81, 99.81, 99.87,
99.87, 99.91, 99.91, 99.94, 99.94, 99.97, 99.97, 99.97, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100), Trait2Percentiles = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.03, 0.03, 0.03,
0.03, 0.03, 0.03, 0.03, 0.03, 0.03, 0.06, 0.06, 0.06, 0.09, 0.09,
0.09, 0.09, 0.13, 0.13, 0.13, 0.13, 0.19, 0.19, 0.19, 0.22, 0.25,
0.28, 0.31, 0.31, 0.31, 0.41, 0.41, 0.41, 0.41, 0.41, 0.41, 0.57,
0.6, 0.63, 0.69, 0.76, 0.82, 0.82, 0.91, 0.98, 0.98, 1.13, 1.23,
1.23, 1.35, 1.51, 1.54, 1.57, 1.67, 1.67, 1.89, 1.89, 2.08, 2.08,
2.55, 2.55, 2.93, 3.02, 3.15, 3.34, 3.53, 3.94, 4.19, 4.6, 4.85,
5.04, 5.35, 5.57, 6.02, 6.27, 6.65, 7.02, 7.56, 8.06, 8.06, 8.85,
8.85, 9.7, 9.7, 10.55, 10.55, 11.62, 12.38, 12.98, 13.64, 14.3,
14.93, 15.62, 16.35, 17.35, 18.11, 19.15, 20.19, 21.39, 22.52,
23.5, 24.85, 25.61, 26.68, 26.68, 29.13, 29.13, 31.78, 31.78,
34.2, 34.2, 36.91, 38.33, 39.53, 41.35, 42.93, 44.72, 46.52,
48, 49.57, 50.96, 52.41, 54.27, 55.69, 57.64, 58.96, 60.85, 62.2,
64, 65.48, 65.48, 68.31, 69.86, 71.21, 71.21, 73.73, 75.02, 76.54,
77.45, 78.46, 79.78, 80.63, 81.8, 82.8, 83.81, 84.91, 85.86,
86.68, 87.4, 88.16, 88.54, 89.04, 89.86, 90.52, 90.93, 90.93,
91.81, 92.16, 92.47, 92.47, 93.48, 93.8, 94.27, 94.27, 94.68,
95.02, 95.37, 95.59, 95.94, 96.13, 96.41, 96.66, 96.76, 96.79,
97.01, 97.13, 97.32, 97.45, 97.51, 97.57, 97.57, 97.57, 97.95,
98.05, 98.05, 98.24, 98.36, 98.39, 98.39, 98.61, 98.61, 98.74,
98.77, 98.87, 98.93, 98.99, 99.02, 99.12, 99.21, 99.31, 99.4,
99.43, 99.53, 99.59, 99.59, 99.59, 99.59, 99.62, 99.65, 99.65,
99.65, 99.75, 99.75, 99.75, 99.78, 99.81, 99.84, 99.87, 99.87,
99.87, 99.87, 99.87, 99.87, 99.91, 99.91, 99.91, 99.94, 99.94,
99.94, 99.94, 99.94, 99.94, 99.94, 99.97, 99.97, 99.97, 99.97,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100)), class = "data.frame", row.names = c(NA, -701L))
You can use match here; it returns, for each df1$Trait1Score, the position of the first matching row in LookupTable$Scores:
df1$Trait1Percentile <- LookupTable$Trait1Percentiles[match(df1$Trait1Score, LookupTable$Scores)]
head(df1)
# JobNumber Trait1Score Trait2Score Trait1Percentile
#1 634 3.89 4.00 14.08
#2 21 4.39 2.94 68.00
#3 300 4.22 3.17 46.87
#4 797 4.21 3.83 45.61
#5 1112 3.94 4.22 17.23
#6 147 3.90 3.83 14.61
With merge, you need to select the relevant columns afterwards:
merge(df1, LookupTable, by.x = 'Trait1Score', by.y = 'Scores')[1:4]
Similarly in dplyr:
library(dplyr)
inner_join(df1, LookupTable, by = c('Trait1Score' = 'Scores')) %>% select(1:4)
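To see what the lookup does on something small, here is a base-R sketch on toy data. The data frames scores and lookup and their columns are hypothetical and not taken from the question. It also shows one way to keep rows that have no match, by subsetting the lookup table before merging and passing all.x = TRUE:

```r
# Toy data (hypothetical values, not the asker's data)
scores <- data.frame(id = 1:4, score = c(2, 1, 3, 9))
lookup <- data.frame(score = c(1, 2, 3),
                     pct_a = c(10, 50, 90),
                     pct_b = c(5, 45, 95))

# match() gives the row position in lookup for each score (NA if absent)
idx <- match(scores$score, lookup$score)
scores$pct_a <- lookup$pct_a[idx]

# With merge(): keep only the key and the one wanted column of the lookup
# table; all.x = TRUE keeps rows of scores that have no match
merged <- merge(scores[c("id", "score")], lookup[c("score", "pct_a")],
                by = "score", all.x = TRUE)
```

Note that match() compares numbers exactly, so if the scores were computed rather than typed in, rounding both sides first (for example with round(x, 2)) may be safer. merge() also sorts the result by the key, whereas the match() approach keeps the original row order.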
I would like to divide each column of my dataframe by the values of one row.
I tried to transform my dataframe into a matrix and to extract one row of the dataframe as a vector then divide the matrix by the vector but it did not work. Indeed, only the first row of the matrix got divided by the vector.
Here is my original dataframe.
And this is the code I tried to run :
library(readxl)
data <- read_excel("Documents/TFB/xlsx_geochimie/solfatara_maj.xlsx")
View(data)
data.mat <- as.matrix(data[,2:20])
vector <- data[12,2:20]
data.mat/vector
We replicate the vector so that it has one element per matrix cell (col(data.mat) gives the column index of every cell) and then do the division:
data.mat/unlist(vector)[col(data.mat)]
# FeO Total S SO4 Total N SiO2 Al2O3 Fe2O3 MnO MgO CaO Na2O K2O
#[1,] 0.10 16.5555556 NA NA 0.8908607 0.8987269 0.1835206 0.08333333 0.03680982 0.04175365 0.04823151 0.5738562
#[2,] 0.40 125.8333333 NA NA 0.5510204 0.4456019 0.2359551 0.08333333 0.04294479 0.01878914 0.04501608 0.2588235
#[3,] 0.85 0.6111111 NA NA 1.0021295 1.0162037 0.7715356 1.08333333 0.53987730 0.69728601 1.03858521 1.0457516
#[4,] 0.15 48.0555556 NA NA 1.1027507 0.2569444 NA 0.08333333 0.01840491 0.01878914 0.04180064 0.1647059
#[5,] 0.85 NA NA NA 1.0889086 1.0271991 0.6591760 0.75000000 0.59509202 0.53862213 1.02250804 1.1228758
#[6,] NA NA NA NA 1.3426797 0.6319444 0.0411985 0.08333333 0.03067485 0.11899791 0.65594855 0.7764706
# TiO2 P2O5 LOI LOI2 Total Total 2 Fe2O3(T)
#[1,] 0.7924528 0.3928571 7.0841837 6.6963855 0.9922233 0.9894632 0.14489796
#[2,] 0.5094340 0.3214286 14.5561224 13.7710843 0.9958126 0.9936382 0.31020408
#[3,] 0.8679245 0.6428571 1.5637755 1.5228916 0.9990030 0.9970179 0.80612245
#[4,] 1.4905660 0.2857143 7.4056122 7.0024096 0.9795613 0.9769384 0.05510204
#[5,] 1.0377358 0.2500000 0.3520408 0.3783133 0.9969093 0.9960239 0.74489796
#[6,] 0.3018868 0.2500000 1.2551020 1.1879518 1.0019940 1.0000000 0.04489796
Or use sweep
sweep(data.mat, MARGIN = 2, unlist(vector), FUN = `/`)
Or using mapply with asplit
mapply(`/`, asplit(data.mat, 2), vector)
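A small sketch on a toy matrix may make the column-wise division easier to follow. The names m and divisor are hypothetical and the values are invented for illustration:

```r
# Toy 3 x 2 matrix: column a = 2, 4, 6; column b = 10, 20, 30
m <- matrix(c(2, 4, 6, 10, 20, 30), nrow = 3,
            dimnames = list(NULL, c("a", "b")))
divisor <- c(2, 10)  # one divisor per column

# sweep() applies FUN between each column (MARGIN = 2) and its divisor
by_sweep <- sweep(m, MARGIN = 2, STATS = divisor, FUN = "/")

# Same result by expanding the divisor to one value per cell
by_index <- m / divisor[col(m)]

# A plain m / divisor would recycle down the columns instead,
# which is not the column-wise division wanted here
naive <- m / divisor
```

Here by_sweep and by_index agree, while naive differs because R recycles the shorter vector in column-major order.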
data
data.mat <- structure(c(0.2, 0.8, 1.7, 0.3, 1.7, NA, 5.96, 45.3, 0.22, 17.3,
NA, NA, NA, 6.72, NA, 4.08, 0.06, 0.16, NA, NA, NA, NA, NA, NA,
50.2, 31.05, 56.47, 62.14, 61.36, 75.66, 15.53, 7.7, 17.56, 4.44,
17.75, 10.92, 0.49, 0.63, 2.06, NA, 1.76, 0.11, 0.01, 0.01, 0.13,
0.01, 0.09, 0.01, 0.06, 0.07, 0.88, 0.03, 0.97, 0.05, 0.2, 0.09,
3.34, 0.09, 2.58, 0.57, 0.15, 0.14, 3.23, 0.13, 3.18, 2.04, 4.39,
1.98, 8, 1.26, 8.59, 5.94, 0.42, 0.27, 0.46, 0.79, 0.55, 0.16,
0.11, 0.09, 0.18, 0.08, 0.07, 0.07, 27.77, 57.06, 6.13, 29.03,
1.38, 4.92, 27.79, 57.15, 6.32, 29.06, 1.57, 4.93, 99.52, 99.88,
100.2, 98.25, 99.99, 100.5, 99.54, 99.96, 100.3, 98.28, 100.2,
100.6, 0.71, 1.52, 3.95, 0.27, 3.65, 0.22), .Dim = c(6L, 19L), .Dimnames = list(
NULL, c("FeO", "Total S", "SO4", "Total N", "SiO2", "Al2O3",
"Fe2O3", "MnO", "MgO", "CaO", "Na2O", "K2O", "TiO2", "P2O5",
"LOI", "LOI2", "Total", "Total 2", "Fe2O3(T)")))
vector <- structure(list(FeO = 2, `Total S` = 0.36, SO4 = NA_real_, `Total N` = NA_real_,
SiO2 = 56.35, Al2O3 = 17.28, Fe2O3 = 2.67, MnO = 0.12, MgO = 1.63,
CaO = 4.79, Na2O = 3.11, K2O = 7.65, TiO2 = 0.53, P2O5 = 0.28,
LOI = 3.92, LOI2 = 4.15, Total = 100.3, `Total 2` = 100.6,
`Fe2O3(T)` = 4.9), row.names = c(NA, -1L), class = c("tbl_df",
"tbl", "data.frame"))
To divide a data frame, df, by its third row:
df/df[rep(3, nrow(df)), ]
I have a nested tibble which looks like the following:
# A tibble: 2 x 3
SCORE score1_rank score2_rank
<chr> <list> <list>
1 scr_rnk_1 <tibble [54 x 5]> <tibble [54 x 5]>
2 scr_rnk_2 <tibble [46 x 5]> <tibble [46 x 5]>
I want to fit a regression for each of the 4 tibbles. I can extract the data as follows and run the regressions individually:
sub_data1 <- nested_df$score1_rank[[1]]
sub_data2 <- nested_df$score1_rank[[2]]
#Regression 1
sub_data1 <- sub_data1[!is.na(sub_data1$Y), ]
lm(Y ~ X1 + X2, data = sub_data1)
#Regression 2
sub_data2 <- sub_data2[!is.na(sub_data2$Y), ]
lm(Y ~ X1 + X2, data = sub_data2)
However, I would like to do this for the whole nested tibble, i.e. I am trying to map the regression over the tibbles.
Data:
nested_df <- structure(list(SCORE = c("scr_rnk_1", "scr_rnk_2"), score1_rank = list(
structure(list(time = c("July_2013_June_2014", "July_2013_June_2014",
"July_2013_June_2014", "July_2013_June_2014", "July_2013_June_2014",
"July_2014_June_2015", "July_2014_June_2015", "July_2014_June_2015",
"July_2014_June_2015", "July_2014_June_2015", "July_2014_June_2015",
"July_2014_June_2015", "July_2016_June_2017", "July_2016_June_2017",
"July_2016_June_2017", "July_2016_June_2017", "July_2010_June_2011",
"July_2010_June_2011", "July_2010_June_2011", "July_2010_June_2011",
"July_2010_June_2011", "July_2010_June_2011", "July_2012_June_2013",
"July_2012_June_2013", "July_2012_June_2013", "July_2012_June_2013",
"July_2012_June_2013", "July_2012_June_2013", "July_2018_June_2019",
"July_2018_June_2019", "July_2018_June_2019", "July_2018_June_2019",
"July_2015_June_2016", "July_2015_June_2016", "July_2015_June_2016",
"July_2015_June_2016", "July_2015_June_2016", "July_2015_June_2016",
"July_2015_June_2016", "July_2011_June_2012", "July_2011_June_2012",
"July_2011_June_2012", "July_2011_June_2012", "July_2011_June_2012",
"July_2008_June_2009", "July_2008_June_2009", "July_2008_June_2009",
"July_2017_June_2018", "July_2017_June_2018", "July_2017_June_2018",
"July_2009_June_2010", "July_2009_June_2010", "July_2009_June_2010",
"July_2019_June_2020"), score1 = c(0.878385627705134, 0.829149886628575,
0.873633400824437, 0.873191548477804, 0.833360020840671,
0.821514348879447, 0.93893179382238, 0.902566094498171, 0.832521540654393,
0.904546026086165, 0.944312545893212, 0.90721438246816, 0.925563285777056,
0.837735581176652, 0.898314100598163, 0.881156591451732,
0.927432166201199, 0.810462622843289, 0.924966424794594,
0.54982486102469, 0.632637353015548, 0.93598101241571, 0.748712668464033,
0.887355002120062, 0.00606213355201044, 0.66570681669867,
0.809662797719473, 0.80883896141453, 0.410059100270974, 0.45097086832185,
0.855118540355703, 0.73792861592456, 0.582170697766921, 0.910913548399676,
0.909192361557635, 0.61000565934628, 0.541242004262667, 0.847840909074889,
0.838844407944549, 0.638014235742945, 0.948686837455938,
0.569343264654849, 0.942357992461572, 0.956483422999484,
0.716630105733463, 0.757677906984471, 0.840660131450953,
0.944095864840561, 0.74291963665858, 0.944596570938035, 0.916460742106468,
0.90890022256817, 0.895889262055934, 0.886515265060623),
Y = c(-0.0392143242061138, 0.00517332553863525, 0.0475661605596542,
-0.0140374358743429, -0.0235463473945856, 0.0460794232785702,
0.0647838711738586, -0.0257589742541313, 0.0539961569011211,
-0.170428335666656, 0.0925306528806686, 0.11557175219059,
0.0496749319136143, -0.11405622959137, 0.0666666403412819,
-0.0189777128398418, -0.00572755141183734, 0.0277173686772585,
-0.0241545476019383, 0.0328245237469673, 0.223529428243637,
0.0253662765026092, 0.0394621938467026, 0.0815821811556816,
0.0597507022321224, -0.0132956989109516, 0.0609685145318508,
0.0393742695450783, -0.00168346334248781, -0.000859459512867033,
0.0345749147236347, NA, 0.0327170714735985, 0.144188165664673,
0.0415891073644161, 0.0028026478830725, -0.0840985849499702,
0.00914959330111742, 0.0197730101644993, -0.0929021015763283,
0.0382972247898579, NA, 0.015947800129652, 0.0136986169964075,
-0.139593943953514, 0.113736107945442, 0.0216289088129997,
-0.209788918495178, 0.00545153254643083, 0.126438871026039,
0.0538020096719265, 0.0774460881948471, 0.0651820451021194,
NA), X1 = c(0.14, 5.52, 0.14, -3.29, 1.82, -1.17, 1.93,
2.7, -1.44, -1.74, 5.91, -2.05, 2.72, 1.86, 2.28, 1.39,
3.49, 4.47, -1.52, 4.47, 9.85, -0.68, -2.52, 5.46, -0.43,
-0.43, 2.3, 0.56, -8.19, 0.87, 2.53, NA, 7.32, 6.92,
6.92, -6.18, -3.91, -6.32, 0.45, -8.88, -0.44, NA, -0.44,
-1.11, -8.54, 7.28, -6.53, 1.93, 1.93, 1.93, 6.24, 8.62,
6.24, NA), X2 = c(-0.5, 2.22, -0.5, 2.93, -0.17, 1.42,
-0.53, 0.78, 1.67, -0.05, -0.39, -1.08, 0.46, 0.37, -0.62,
0.17, 0.18, -0.69, -0.42, -0.69, 1.48, 1.32, 0.21, 0.17,
-0.76, -0.76, 1.19, -0.66, -2.51, -0.38, -2.56, NA, -2.36,
1.33, 1.33, 1.16, -0.25, -2.16, 0.04, -0.53, -0.46, NA,
-0.46, 0.23, 2.23, -1.27, -0.57, -0.61, -0.61, -0.61,
-0.19, -1.37, -0.19, NA)), class = c("tbl_df", "tbl",
"data.frame"), row.names = c(NA, -54L)), structure(list(time = c("July_2013_June_2014",
"July_2013_June_2014", "July_2013_June_2014", "July_2013_June_2014",
"July_2013_June_2014", "July_2014_June_2015", "July_2014_June_2015",
"July_2014_June_2015", "July_2014_June_2015", "July_2014_June_2015",
"July_2014_June_2015", "July_2016_June_2017", "July_2016_June_2017",
"July_2016_June_2017", "July_2010_June_2011", "July_2010_June_2011",
"July_2010_June_2011", "July_2010_June_2011", "July_2010_June_2011",
"July_2012_June_2013", "July_2012_June_2013", "July_2012_June_2013",
"July_2012_June_2013", "July_2012_June_2013", "July_2018_June_2019",
"July_2018_June_2019", "July_2018_June_2019", "July_2015_June_2016",
"July_2015_June_2016", "July_2015_June_2016", "July_2015_June_2016",
"July_2015_June_2016", "July_2015_June_2016", "July_2015_June_2016",
"July_2011_June_2012", "July_2011_June_2012", "July_2011_June_2012",
"July_2011_June_2012", "July_2008_June_2009", "July_2008_June_2009",
"July_2008_June_2009", "July_2017_June_2018", "July_2017_June_2018",
"July_2017_June_2018", "July_2009_June_2010", "July_2009_June_2010"
), score1 = c(0.910630243821458, 0.887211746784698, 0.920092482844549,
0.94450683954903, 0.886972163304589, 0.991052738161695, 0.981619567238222,
0.977490375052585, 0.961036277360393, 0.985523653404714,
0.948091565971217, 0.959812930740014, 0.936269500157121,
0.948541666157695, 0.939675946745415, 0.995146212267317,
0.944554298851532, 0.982930629437269, 0.963858517802992,
0.92872841572452, 0.968099127001545, 0.945198156814004, 0.892947157198215,
0.906930889247629, 0.957790348580216, 0.928122479697648,
0.953267485671018, 0.963714595673124, 0.976914001156382,
0.973623547932495, 0.962870831719229, 0.978333062077069,
0.958765402277667, 0.959032891808224, 0.972965648015492,
0.982760065777063, 0.957170836537733, 0.961880715763936,
0.975885654717621, 0.924673632533321, 0.925318007280836,
0.987246011368269, 0.98249943727474, 0.980272445641619, 0.978206000922261,
0.929807352926533), Y = c(0.0737265646457672, 0.0278251487761736,
0.201131358742714, 0.125700861215591, 0.0777644738554955,
-0.0130416098982096, -0.0990565568208694, 0.0333333089947701,
-0.031569954007864, 0.0422280319035053, -0.0111790159717202,
-0.278726726770401, -0.139534845948219, -0.0800638571381569,
0.23757965862751, -0.0746169164776802, 0.0465963147580624,
0.0337920561432838, -0.0111621227115393, -0.0133928591385484,
0.0778210312128067, -0.0821536555886269, 0.00643268134444952,
NA, 0.152694001793861, 0.0409262739121914, 0.0360006913542747,
-0.0233012177050114, -0.211209982633591, -0.11425743252039,
-0.169167995452881, 0.0282719731330872, 0.161968618631363,
-0.0525752492249012, 0.0127659253776074, -0.0466842725872993,
-0.115001328289509, -0.00946897640824318, 0.114568591117859,
0.2675521671772, -0.0196253582835197, 0.123595483601093,
NA, 0.12380950897932, -0.0350765138864517, -0.16666667163372
), X1 = c(2.01, 0.14, 5.06, 5.52, 1.82, 2.7, -3.09, 1.65,
0.5, 1.93, -1.17, 2.25, 1.86, -1.88, 9.85, -3.9, 3.94, 7.6,
4.47, -2.52, 1.32, 2.78, 0.09, NA, 0.88, 2.53, 2.53, 7.32,
1.13, -6.18, -6.32, -0.3, 7.32, -6.18, 4.93, -1.11, -9.2,
-7.52, 11.42, 9.96, -0.26, 1.93, NA, 0.49, 8.62, 0.49), X2 = c(2.18,
-0.5, -1.03, 2.22, -0.17, 0.78, -2.72, -2.19, 1.22, -0.53,
1.42, -0.51, 0.37, -1.55, 1.48, -0.22, -0.02, 2.08, -0.69,
0.21, -1.2, -0.32, 0.35, NA, -0.57, -2.56, -2.56, -2.36,
-3.09, 1.16, -2.16, 1.75, -2.36, 1.16, -0.77, 0.23, -1.33,
-0.63, 1.64, 1.63, 2.85, -0.61, NA, 1.88, -1.37, 3.81)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -46L))), score2_rank = list(
structure(list(time = c("July_2013_June_2014", "July_2013_June_2014",
"July_2013_June_2014", "July_2013_June_2014", "July_2013_June_2014",
"July_2014_June_2015", "July_2014_June_2015", "July_2014_June_2015",
"July_2014_June_2015", "July_2014_June_2015", "July_2014_June_2015",
"July_2014_June_2015", "July_2016_June_2017", "July_2016_June_2017",
"July_2016_June_2017", "July_2016_June_2017", "July_2010_June_2011",
"July_2010_June_2011", "July_2010_June_2011", "July_2010_June_2011",
"July_2010_June_2011", "July_2010_June_2011", "July_2012_June_2013",
"July_2012_June_2013", "July_2012_June_2013", "July_2012_June_2013",
"July_2012_June_2013", "July_2012_June_2013", "July_2018_June_2019",
"July_2018_June_2019", "July_2018_June_2019", "July_2018_June_2019",
"July_2015_June_2016", "July_2015_June_2016", "July_2015_June_2016",
"July_2015_June_2016", "July_2015_June_2016", "July_2015_June_2016",
"July_2015_June_2016", "July_2011_June_2012", "July_2011_June_2012",
"July_2011_June_2012", "July_2011_June_2012", "July_2011_June_2012",
"July_2008_June_2009", "July_2008_June_2009", "July_2008_June_2009",
"July_2017_June_2018", "July_2017_June_2018", "July_2017_June_2018",
"July_2009_June_2010", "July_2009_June_2010", "July_2009_June_2010",
"July_2019_June_2020"), score2 = c(0.573384803196917, 0.95560973004494,
0.936151601862601, 0.940067094946625, 0.790149367637373,
0.885023225824309, 0.956490411723667, 0.918534374861312,
0.9660240615445, 0.961407533200788, 0.794743982673356, 0.926614681101157,
0.924390324452674, 0.838697174839086, 0.548480558835933,
0.928419789574611, 0.942229561212187, 0.808215644539813,
0.89946853678008, 0.931010276978734, 0.780385177969094, 0.945728847589739,
0.958939314931932, 0.101395325662518, 0.0547541695358364,
0.757995973046388, 0.815555744982054, 0.947726570770333,
0.589921893700343, 0.924114006154793, 0.164071857964122,
0.946752193254218, 0.801515206601873, 0.709037475517904,
0.730962189352849, 0.872901083488831, 0.958819700206169,
0.951829945538551, 0.924000702901887, 0.963439907199707,
0.94482417669742, 0.817381450384857, 0.977233364779766, 0.881676744287434,
0.820839678297149, 0.449214983785051, 0.536396658733052,
0.756705578897905, 0.904306523171427, 0.947974271863387,
0.947487349720247, 0.95821125132286, 0.890792036806817, 0.983129670844182
), Y = c(-0.0392143242061138, 0.0475661605596542, 0.0278251487761736,
-0.0235463473945856, 0.0777644738554955, 0.0333333089947701,
0.0460794232785702, 0.0647838711738586, -0.0257589742541313,
-0.170428335666656, 0.0925306528806686, 0.11557175219059,
-0.278726726770401, -0.139534845948219, -0.11405622959137,
0.0666666403412819, -0.00572755141183734, 0.0277173686772585,
0.23757965862751, -0.0241545476019383, 0.0465963147580624,
0.0253662765026092, 0.0394621938467026, 0.00643268134444952,
0.0597507022321224, -0.0132956989109516, 0.0609685145318508,
0.0393742695450783, -0.00168346334248781, 0.0345749147236347,
NA, 0.0360006913542747, 0.0327170714735985, -0.0233012177050114,
0.0028026478830725, -0.0840985849499702, 0.161968618631363,
0.00914959330111742, 0.0197730101644993, -0.0466842725872993,
-0.0929021015763283, 0.0382972247898579, 0.015947800129652,
0.0136986169964075, -0.139593943953514, 0.113736107945442,
0.0216289088129997, -0.209788918495178, 0.00545153254643083,
0.12380950897932, 0.0538020096719265, 0.0774460881948471,
-0.16666667163372, NA), X1 = c(0.14, 0.14, 0.14, 1.82, 1.82,
1.65, -1.17, 1.93, 2.7, -1.74, 5.91, -2.05, 2.25, 1.86, 1.86,
2.28, 3.49, 4.47, 9.85, -1.52, 3.94, -0.68, -2.52, 0.09,
-0.43, -0.43, 2.3, 0.56, -8.19, 2.53, NA, 2.53, 7.32, 7.32,
-6.18, -3.91, 7.32, -6.32, 0.45, -1.11, -8.88, -0.44, -0.44,
-1.11, -8.54, 7.28, -6.53, 1.93, 1.93, 0.49, 6.24, 8.62,
0.49, NA), X2 = c(-0.5, -0.5, -0.5, -0.17, -0.17, -2.19,
1.42, -0.53, 0.78, -0.05, -0.39, -1.08, -0.51, 0.37, 0.37,
-0.62, 0.18, -0.69, 1.48, -0.42, -0.02, 1.32, 0.21, 0.35,
-0.76, -0.76, 1.19, -0.66, -2.51, -2.56, NA, -2.56, -2.36,
-2.36, 1.16, -0.25, -2.36, -2.16, 0.04, 0.23, -0.53, -0.46,
-0.46, 0.23, 2.23, -1.27, -0.57, -0.61, -0.61, 1.88, -0.19,
-1.37, 3.81, NA)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -54L)), structure(list(time = c("July_2013_June_2014",
"July_2013_June_2014", "July_2013_June_2014", "July_2013_June_2014",
"July_2013_June_2014", "July_2014_June_2015", "July_2014_June_2015",
"July_2014_June_2015", "July_2014_June_2015", "July_2014_June_2015",
"July_2014_June_2015", "July_2016_June_2017", "July_2016_June_2017",
"July_2016_June_2017", "July_2010_June_2011", "July_2010_June_2011",
"July_2010_June_2011", "July_2010_June_2011", "July_2010_June_2011",
"July_2012_June_2013", "July_2012_June_2013", "July_2012_June_2013",
"July_2012_June_2013", "July_2012_June_2013", "July_2018_June_2019",
"July_2018_June_2019", "July_2018_June_2019", "July_2015_June_2016",
"July_2015_June_2016", "July_2015_June_2016", "July_2015_June_2016",
"July_2015_June_2016", "July_2015_June_2016", "July_2015_June_2016",
"July_2011_June_2012", "July_2011_June_2012", "July_2011_June_2012",
"July_2011_June_2012", "July_2008_June_2009", "July_2008_June_2009",
"July_2008_June_2009", "July_2017_June_2018", "July_2017_June_2018",
"July_2017_June_2018", "July_2009_June_2010", "July_2009_June_2010"
), score2 = c(0.977777238266838, 0.994161535248162, 0.973746623206586,
0.959737686390477, 0.960771840809366, 0.973573416279972,
0.971473417619078, 0.994362749200424, 0.998832204612857,
0.969953961861552, 0.974595202023975, 0.990460167618893,
0.977938934839813, 0.933720130788891, 0.997555980989323,
0.983534940461115, 0.961638641355128, 0.98302503175898, 0.955924205281728,
0.960588460795172, 0.980272014323638, 0.99319344527155, 0.990396166187007,
0.96928405964874, 0.958824291095735, 0.94735915935544, 0.956799713877734,
0.974313477760366, 0.959422857050319, 0.970981339110875,
0.986720965210939, 0.988119219123952, 0.987757971968369,
0.998331238333002, 0.985606980938901, 0.996309951852897,
0.978123949182993, 0.980322946112709, 0.870995840583191,
0.99620925825849, 0.952471805464684, 0.967521340577839, 0.997358168481063,
0.954089152398106, 0.99961257213601, 0.971649355774121),
Y = c(0.00517332553863525, 0.0737265646457672, 0.201131358742714,
-0.0140374358743429, 0.125700861215591, -0.0130416098982096,
-0.0990565568208694, 0.0539961569011211, -0.031569954007864,
0.0422280319035053, -0.0111790159717202, 0.0496749319136143,
-0.0189777128398418, -0.0800638571381569, -0.0746169164776802,
0.0328245237469673, 0.223529428243637, 0.0337920561432838,
-0.0111621227115393, -0.0133928591385484, 0.0815821811556816,
0.0778210312128067, -0.0821536555886269, NA, -0.000859459512867033,
0.152694001793861, 0.0409262739121914, -0.211209982633591,
0.144188165664673, 0.0415891073644161, -0.11425743252039,
-0.169167995452881, 0.0282719731330872, -0.0525752492249012,
0.0127659253776074, -0.115001328289509, -0.00946897640824318,
NA, 0.114568591117859, 0.2675521671772, -0.0196253582835197,
0.123595483601093, NA, 0.126438871026039, -0.0350765138864517,
0.0651820451021194), X1 = c(5.52, 2.01, 5.06, -3.29,
5.52, 2.7, -3.09, -1.44, 0.5, 1.93, -1.17, 2.72, 1.39,
-1.88, -3.9, 4.47, 9.85, 7.6, 4.47, -2.52, 5.46, 1.32,
2.78, NA, 0.87, 0.88, 2.53, 1.13, 6.92, 6.92, -6.18,
-6.32, -0.3, -6.18, 4.93, -9.2, -7.52, NA, 11.42, 9.96,
-0.26, 1.93, NA, 1.93, 8.62, 6.24), X2 = c(2.22, 2.18,
-1.03, 2.93, 2.22, 0.78, -2.72, 1.67, 1.22, -0.53, 1.42,
0.46, 0.17, -1.55, -0.22, -0.69, 1.48, 2.08, -0.69, 0.21,
0.17, -1.2, -0.32, NA, -0.38, -0.57, -2.56, -3.09, 1.33,
1.33, 1.16, -2.16, 1.75, 1.16, -0.77, -1.33, -0.63, NA,
1.64, 1.63, 2.85, -0.61, NA, -0.61, -1.37, -0.19)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -46L)))), row.names = c(NA,
-2L), class = c("tbl_df", "tbl", "data.frame"))
I would use two nested lapply() calls to run the regressions across the whole tibble:
# iterate across the score1_rank and score2_rank list-columns
lapply(df[-1], function(x) {
  # iterate over the data frames within each list-column and run the regressions
  lapply(x, function(y) {
    sub_data1 <- y[!is.na(y$Y), ]
    lm(Y ~ X1 + X2, data = sub_data1)
  })
})
Output (4 regressions):
# $score1_rank
# $score1_rank[[1]]
#
# Call:
# lm(formula = Y ~ X1 + X2, data = sub_data1)
#
# Coefficients:
# (Intercept) X1 X2
# 0.010491 0.008486 -0.002082
#
#
# $score1_rank[[2]]
#
# Call:
# lm(formula = Y ~ X1 + X2, data = sub_data1)
#
# Coefficients:
# (Intercept) X1 X2
# -0.013118 0.013098 0.008622
#
#
#
# $score2_rank
# $score2_rank[[1]]
#
# Call:
# lm(formula = Y ~ X1 + X2, data = sub_data1)
#
# Coefficients:
# (Intercept) X1 X2
# -0.003704 0.007486 -0.009675
#
#
# $score2_rank[[2]]
#
# Call:
# lm(formula = Y ~ X1 + X2, data = sub_data1)
#
# Coefficients:
# (Intercept) X1 X2
# -0.002017 0.012093 0.014742
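If you want the coefficients in one table rather than as printed model objects, the nested list can be flattened afterwards. Below is a minimal base R sketch, assuming toy data in place of the real tibble: `make_data()`, `toy`, `models` and `coefs` are made-up names for illustration, and the simulated values have nothing to do with the data in the question.

```r
set.seed(1)

# Hypothetical helper: simulate one small data frame with Y, X1, X2 and one missing Y
make_data <- function(n) {
  d <- data.frame(X1 = rnorm(n), X2 = rnorm(n))
  d$Y <- 0.5 * d$X1 - 0.2 * d$X2 + rnorm(n, sd = 0.1)
  d$Y[3] <- NA
  d
}

# Toy stand-in for the nested tibble: one id column plus two list-columns
toy <- data.frame(SCORE = c("scr_rnk_1", "scr_rnk_2"), stringsAsFactors = FALSE)
toy$score1_rank <- list(make_data(20), make_data(20))
toy$score2_rank <- list(make_data(20), make_data(20))

# Same nested lapply() pattern: outer loop over list-columns, inner loop over data frames
models <- lapply(toy[-1], function(col) {
  lapply(col, function(d) lm(Y ~ X1 + X2, data = d[!is.na(d$Y), ]))
})

# Flatten the nested list of models into one data frame of coefficients
coefs <- do.call(rbind, lapply(names(models), function(nm) {
  do.call(rbind, lapply(seq_along(models[[nm]]), function(i) {
    data.frame(key = nm, SCORE = toy$SCORE[i],
               t(coef(models[[nm]][[i]])), check.names = FALSE)
  }))
}))

print(coefs)
```

Note that lm() drops rows with missing values by default (na.action = na.omit), so the explicit !is.na() filter is mostly there to make the intent visible.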
Another option is the tidy modelling approach with tidyverse and broom, which returns all coefficients in a single tibble:
library(tidyverse)
library(broom)
nested_df %>%
  gather(key, data, -SCORE) %>%
  mutate(tidymod = map(data, ~lm(Y ~ X1 + X2, data = .) %>% tidy)) %>%
  unnest(tidymod)
# A tibble: 12 x 7
SCORE key term estimate std.error statistic p.value
<chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 scr_rnk_1 score1_rank (Intercept) 0.0105 0.00962 1.09 0.281
2 scr_rnk_1 score1_rank X1 0.00849 0.00212 4.00 0.000219
3 scr_rnk_1 score1_rank X2 -0.00208 0.00808 -0.258 0.798
4 scr_rnk_2 score1_rank (Intercept) -0.0131 0.0155 -0.848 0.402
5 scr_rnk_2 score1_rank X1 0.0131 0.00320 4.10 0.000192
6 scr_rnk_2 score1_rank X2 0.00862 0.00894 0.965 0.340
7 scr_rnk_1 score2_rank (Intercept) -0.00370 0.0125 -0.296 0.769
8 scr_rnk_1 score2_rank X1 0.00749 0.00291 2.57 0.0132
9 scr_rnk_1 score2_rank X2 -0.00968 0.00961 -1.01 0.319
10 scr_rnk_2 score2_rank (Intercept) -0.00202 0.0121 -0.166 0.869
11 scr_rnk_2 score2_rank X1 0.0121 0.00242 4.99 0.0000121
12 scr_rnk_2 score2_rank X2 0.0147 0.00774 1.91 0.0640
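In recent tidyr versions gather() is superseded by pivot_longer(), and unnest() keeps other list-columns instead of dropping them. The pipeline above could be written roughly as follows. This is only a sketch on simulated data: `make_data()`, `toy` and `result` are hypothetical names, and the numbers it produces are unrelated to the output shown above.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

set.seed(42)

# Hypothetical helper: simulate one small tibble with Y, X1 and X2
make_data <- function(n) {
  tibble::tibble(
    X1 = rnorm(n),
    X2 = rnorm(n),
    Y  = 0.5 * X1 - 0.2 * X2 + rnorm(n, sd = 0.1)
  )
}

# Toy stand-in for nested_df: an id column plus two list-columns of tibbles
toy <- tibble::tibble(
  SCORE = c("scr_rnk_1", "scr_rnk_2"),
  score1_rank = list(make_data(20), make_data(20)),
  score2_rank = list(make_data(20), make_data(20))
)

result <- toy %>%
  # long format: one row per (SCORE, list-column) pair
  pivot_longer(-SCORE, names_to = "key", values_to = "data") %>%
  # fit one model per nested data frame and tidy it
  mutate(tidymod = map(data, ~ tidy(lm(Y ~ X1 + X2, data = .x)))) %>%
  # drop the raw data so only the model summaries are unnested
  select(-data) %>%
  unnest(tidymod)

print(result)
```

The rows come out grouped by SCORE first and then by key, so the order differs from the gather() version, but the content per model is the same.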