I have two lists named h and g.
They each contain 244 dataframes and they look like the following:
h[[1]]
year avg hr sal
1 2010 0.300 31 2000
2 2011 0.290 30 4000
3 2012 0.275 14 600
4 2013 0.280 24 800
5 2014 0.295 18 1000
6 2015 0.330 26 7000
7 2016 0.315 40 9000
g[[1]]
year pos fld
1 2010 A 0.990
2 2011 B 0.995
3 2013 C 0.970
4 2014 B 0.980
5 2015 D 0.990
I want to cbind these two dataframes.
But as you can see, they have different numbers of rows.
I want to combine them so that rows with the same year end up in one row, with the empty cells filled with NA.
The result I expect looks like this:
year avg hr sal pos fld
1 2010 0.300 31 2000 A 0.990
2 2011 0.290 30 4000 B 0.995
3 2012 0.275 14 600 NA NA
4 2013 0.280 24 800 C 0.970
5 2014 0.295 18 1000 B 0.980
6 2015 0.330 26 7000 D 0.990
7 2016 0.315 40 9000 NA NA
Also, I want to repeat this for all the 244 dataframes in each list, h and g.
I'd like to make a new list named final which contains the 244 combined dataframes.
How can I do this...?
All answers will be greatly appreciated :)
I think you should use merge instead:
merge(df1, df2, by = "year", all = TRUE)
For your data:
df1 = data.frame(matrix(0, 7, 4))
names(df1) = c("year", "avg", "hr", "sal")
df1$year = 2010:2016
df1$avg = c(.3, .29, .275, .280, .295, .33, .315)
df1$hr = c(31, 30, 14, 24, 18, 26, 40)
df1$sal = c(2000, 4000, 600, 800, 1000, 7000, 9000)
df2 = data.frame(matrix(0, 5, 3))
names(df2) = c("year", "pos", "fld")
df2$year = c(2010, 2011, 2013, 2014, 2015)
df2$pos = c('A', 'B', 'C', 'B', 'D')
df2$fld = c(.99,.995,.97,.98,.99)
cbind is meant to column-bind two data frames that are compatible in every sense, in particular with the same number of rows in the same order. What you want here is an actual merge: no rows from either data frame are discarded, and where one side has no match you get NA instead.
We can use Map with cbind.fill (from rowr) to cbind the corresponding 'data.frame' from 'h' and 'g'.
library(rowr)
Map(cbind.fill, h, g, MoreArgs = list(fill=NA))
Update
Based on the expected output shown, it seems the OP wanted a merge instead of a cbind:
f1 <- function(...) merge(..., all = TRUE, by = 'year')
Map(f1, h, g)
#[[1]]
# year avg hr sal pos fld
#1 2010 0.300 31 2000 A 0.990
#2 2011 0.290 30 4000 B 0.995
#3 2012 0.275 14 600 <NA> NA
#4 2013 0.280 24 800 C 0.970
#5 2014 0.295 18 1000 B 0.980
#6 2015 0.330 26 7000 D 0.990
#7 2016 0.315 40 9000 <NA> NA
Or, as @Colonel Beauvel mentioned, this can be made more compact:
Map(merge, h, g, by='year', all=TRUE)
data
h <- list(structure(list(year = 2010:2016, avg = c(0.3, 0.29, 0.275,
0.28, 0.295, 0.33, 0.315), hr = c(31L, 30L, 14L, 24L, 18L, 26L,
40L), sal = c(2000L, 4000L, 600L, 800L, 1000L, 7000L, 9000L)), .Names = c("year",
"avg", "hr", "sal"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7")))
g <- list(structure(list(year = c(2010L, 2011L, 2013L, 2014L, 2015L
), pos = c("A", "B", "C", "B", "D"), fld = c(0.99, 0.995, 0.97,
0.98, 0.99)), .Names = c("year", "pos", "fld"), class = "data.frame",
row.names = c("1",
"2", "3", "4", "5")))
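To get the list named final that the question asks for, the result of Map can simply be assigned. The following is a minimal sketch, assuming h and g have the same length and pair up by position; the one-element lists here are cut-down stand-ins for the real 244-element ones.

```r
# Cut-down stand-ins for the real lists (one data frame each).
h <- list(data.frame(year = 2010:2016,
                     avg  = c(0.300, 0.290, 0.275, 0.280, 0.295, 0.330, 0.315),
                     hr   = c(31, 30, 14, 24, 18, 26, 40),
                     sal  = c(2000, 4000, 600, 800, 1000, 7000, 9000)))
g <- list(data.frame(year = c(2010, 2011, 2013, 2014, 2015),
                     pos  = c("A", "B", "C", "B", "D"),
                     fld  = c(0.990, 0.995, 0.970, 0.980, 0.990)))

# Map pairs h[[i]] with g[[i]] and recycles silently, so check lengths first.
stopifnot(length(h) == length(g))

# Full outer join on year: years missing from either side get NA.
final <- Map(merge, h, g, MoreArgs = list(by = "year", all = TRUE))

final[[1]]
```

Passing the fixed arguments through MoreArgs keeps them out of the element-wise pairing, which makes it explicit that only h and g are iterated over.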
Here is how you could do this with tidyverse tools:
library(tidyverse)
h <- list()
g <- list()
h[[1]] <- tribble(
~year, ~avg, ~hr, ~sal,
2010, 0.300, 31, 2000,
2011, 0.290, 30, 4000,
2012, 0.275, 14, 600,
2013, 0.280, 24, 800,
2014, 0.295, 18, 1000,
2015, 0.330, 26, 7000,
2016, 0.315, 40, 9000
)
g[[1]] <- tribble(
~year, ~pos, ~fld,
2010, "A", 0.990,
2011, "B", 0.995,
2013, "C", 0.970,
2014, "B", 0.980,
2015, "D", 0.990
)
map2(h, g, left_join, by = "year")
Which produces:
[[1]]
# A tibble: 7 x 6
year avg hr sal pos fld
<dbl> <dbl> <dbl> <dbl> <chr> <dbl>
1 2010 0.3 31 2000 A 0.99
2 2011 0.290 30 4000 B 0.995
3 2012 0.275 14 600 NA NA
4 2013 0.28 24 800 C 0.97
5 2014 0.295 18 1000 B 0.98
6 2015 0.33 26 7000 D 0.99
7 2016 0.315 40 9000 NA NA
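One thing to keep in mind: left_join only matches the merge(..., all = TRUE) result when every year in g also appears in h, as it does in the example. A small sketch of the difference, using made-up data in which g has a year that h lacks (the 2013 row is purely hypothetical):

```r
library(dplyr)
library(purrr)

h <- list(tibble(year = c(2010, 2011, 2012), avg = c(0.300, 0.290, 0.275)))
# Hypothetical case: g contains a year (2013) that h does not.
g <- list(tibble(year = c(2010, 2013), pos = c("A", "C")))

left <- map2(h, g, \(x, y) left_join(x, y, by = "year"))
full <- map2(h, g, \(x, y) full_join(x, y, by = "year"))

nrow(left[[1]])  # 3 rows: the 2013 row from g is dropped
nrow(full[[1]])  # 4 rows: 2013 is kept, with NA in avg
```

If rows from both sides must survive, full_join is the closer equivalent of the base R merge with all = TRUE.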
Anonymised example subset of a much larger dataset (now edited to show an option with multiple competing types):
structure(list(`Sample File` = c("A", "A", "A", "A", "A", "A",
"A", "A", "A", "B", "B", "B", "B", "B", "C", "C", "C", "C"),
Marker = c("X", "X", "X", "X", "Y", "Y", "Y", "Y", "Y", "Z",
"Z", "Z", "Z", "Z", "q", "q", "q", "q"), Allele = c(19, 20,
22, 23, 18, 18.2, 19, 19.2, 20, 12, 13, 14, 15, 16, 10, 10.2,
11, 12), Size = c(249.15, 253.13, 260.64, 264.68, 366, 367.81,
369.97, 372.02, 373.95, 91.65, 95.86, 100, 104.24, 108.38,
177.51, 179.4, 181.42, 185.49), Height = c(173L, 1976L, 145L,
1078L, 137L, 62L, 1381L, 45L, 1005L, 38L, 482L, 5766L, 4893L,
19L, 287L, 36L, 5001L, 50L), Type = c("minusone", "allele",
"minusone", "allele", "ambiguous", "minushalf", "allele",
"minushalf", "allele", "minustwo", "ambiguous", "allele",
"allele", "plusone", "minusone", "minushalf", "allele", "plusone"
), LUS = c(11.75, 11.286, 13.375, 13.5, 18, 9, 19, 10, 20,
12, 11, 14, 15, 16, 9.5, NA, 11, 11.5)), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -18L), groups = structure(list(
`Sample File` = c("A", "A", "B", "C"), Marker = c("X", "Y",
"Z", "q"), .rows = structure(list(1:4, 5:9, 10:14, 15:18), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -4L), .drop = TRUE))
I want to look up values based on the classification $Type.
"minustwo" means I want to look up the "Allele", "Height" and "LUS"
values for the row with "Allele" equal to the current row plus two,
with the same Sample File and Marker.
"minusone" means the same but for "Allele" equal to the current row plus one.
"minushalf" means the same but for "Allele" equal to the current row plus 0.2, except that the decimal steps here are 25% each (12.1, 12.2, 12.3, 13, 13.1, etc.); I have a helper function plusTwoBP() for this.
"plusone" means the same but for "Allele" equal to the current row minus one.
"allele" and "ambiguous" rows don't need anything done.
Ideal output:
# A tibble: 18 × 10
# Rowwise: Sample File, Marker
`Sample File` Marker Allele Size Height Type LUS ParentHeight ParentAllele ParentLUS
<chr> <chr> <dbl> <dbl> <int> <chr> <dbl> <int> <dbl> <dbl>
1 A X 19 249. 173 minusone 11.8 1976 20 11.3
2 A X 20 253. 1976 allele 11.3 NA NA NA
3 A X 22 261. 145 minusone 13.4 1078 23 13.5
4 A X 23 265. 1078 allele 13.5 NA NA NA
5 A Y 18 366 137 ambiguous 18 NA NA NA
6 A Y 18.2 368. 62 minushalf 9 1381 19 19
7 A Y 19 370. 1381 allele 19 NA NA NA
8 A Y 19.2 372. 45 minushalf 10 1005 20 20
9 A Y 20 374. 1005 allele 20 NA NA NA
10 B Z 12 91.6 38 minustwo 12 5766 14 14
11 B Z 13 95.9 482 ambiguous 11 NA NA NA
12 B Z 14 100 5766 allele 14 NA NA NA
13 B Z 15 104. 4893 allele 15 NA NA NA
14 B Z 16 108. 19 plusone 16 4893 15 15
15 C q 10 178. 287 minusone 9.5 5001 11 11
16 C q 10.2 179. 36 minushalf NA 5001 11 11
17 C q 11 181. 5001 allele 11 NA NA NA
18 C q 12 185. 50 plusone 11.5 5001 11 11
I have a rather belaboured way of doing it:
# eg for minustwo
sampleData %>%
filter(Type == "minustwo") %>%
rowwise() %>%
mutate(ParentHeight = sampleData$Height[sampleData$`Sample File` == `Sample File` & sampleData$Marker == Marker & sampleData$Allele == (Allele + 2)],
ParentAllele = sampleData$Allele[sampleData$`Sample File` == `Sample File` & sampleData$Marker == Marker & sampleData$Allele == (Allele + 2)],
ParentLUS = sampleData$LUS[sampleData$`Sample File` == `Sample File` & sampleData$Marker == Marker & sampleData$Allele == (Allele + 2)]) %>%
right_join(sampleData)
I then have to redo that for each of my Types
My real dataset is thousands of rows so this ends up being a little slow but manageable, but more to the point I want to learn a better way to do it, in particular the sampleData$'Sample File' == 'Sample File' & sampleData$Marker == Marker seems like it should be doable with grouping so I must be missing a trick there.
I have tried using group_map() but I've clearly not understood it correctly:
sampleData$ParentHeight <- sampleData %>%
group_by(`Sample File`, `Marker`) %>%
group_map(.f = \(.x, .y) {
pmap_dbl(.l = .x, .f = \(Allele, Height, Type, ...){
if(Type == "allele" | Type == "ambiguous") { return(0)
} else if (Type == "plusone") {
return(.x$Height[.x$Allele == round(Allele - 1, 1)])
} else if (Type == "minushalf") {
return(.x$Height[.x$Allele == round(plustwoBP(Allele), 1)])
} else if (Type == "minusone") {
return(.x$Height[.x$Allele == round(Allele + 1, 1)])
} else if (Type == "minustwo") {
return(.x$Height[.x$Allele == round(Allele + 2, 1)])
} else { stop("unexpected peak type") }
})}) %>% unlist()
Initially seems to work, but on investigation it's not respecting both layers of grouping, so brings matches from the wrong Marker. Additionally, here I'm assigning the output to a new column in the data frame, but if I try to instead wrap a mutate() around this so that I can create all three new columns in one go then the group_map() no longer works at all.
I also considered using complete() to hugely extend the data frame with all possible values of Allele (including x.0, x.1, x.2, x.3 variants), then using lag() to select the corresponding rows, then dropping the spare rows. This seems like it would make the data frame enormous in the interim.
To summarise
This works, but it feels ugly and like I'm missing a more elegant and obvious solution. How would you approach this?
You can create two versions of Allele: one identical to the original Allele, and one that is equal to an adjustment based on minusone, minustwo, etc
Then do a self left join, based on that adjusted version of Allele (and Sample File and Marker)
sampleData = sampleData %>% group_by(`Sample File`,Marker) %>% mutate(id = Allele) %>% ungroup()
left_join(
sampleData %>%
mutate(id = case_when(
Type=="minusone"~id+1,
Type=="minustwo"~id+2,
Type=="plusone"~id-1,
Type=="minushalf"~ceiling(id))),
sampleData %>% select(-c(Size,Type)),
by=c("Sample File", "Marker", "id"),
suffix = c("", ".parent")
) %>% select(-id)
Output:
# A tibble: 18 × 10
`Sample File` Marker Allele Size Height Type LUS Allele.parent Height.parent LUS.parent
<chr> <chr> <dbl> <dbl> <int> <chr> <dbl> <dbl> <int> <dbl>
1 A X 19 249. 173 minusone 11.8 20 1976 11.3
2 A X 20 253. 1976 allele 11.3 NA NA NA
3 A X 22 261. 145 minusone 13.4 23 1078 13.5
4 A X 23 265. 1078 allele 13.5 NA NA NA
5 A Y 18 366 137 ambiguous 18 NA NA NA
6 A Y 18.2 368. 62 minushalf 9 19 1381 19
7 A Y 19 370. 1381 allele 19 NA NA NA
8 A Y 19.2 372. 45 minushalf 10 20 1005 20
9 A Y 20 374. 1005 allele 20 NA NA NA
10 B Z 12 91.6 38 minustwo 12 14 5766 14
11 B Z 13 95.9 482 ambiguous 11 NA NA NA
12 B Z 14 100 5766 allele 14 NA NA NA
13 B Z 15 104. 4893 allele 15 NA NA NA
14 B Z 16 108. 19 plusone 16 15 4893 15
15 C q 10 178. 287 minusone 9.5 11 5001 11
16 C q 10.2 179. 36 minushalf NA 11 5001 11
17 C q 11 181. 5001 allele 11 NA NA NA
18 C q 12 185. 50 plusone 11.5 11 5001 11
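The same self-join idea can also produce the column names from the ideal output directly. This is a sketch on a small stand-in for sampleData (one Sample File/Marker group only), with the lookup table renamed up front; "minushalf" is left out because it depends on the asker's plusTwoBP() helper, which is not shown.

```r
library(dplyr)

# Small stand-in for sampleData: a single Sample File / Marker group.
sampleData <- tibble(
  `Sample File` = "B", Marker = "Z",
  Allele = c(12, 13, 14, 15, 16),
  Height = c(38L, 482L, 5766L, 4893L, 19L),
  Type   = c("minustwo", "ambiguous", "allele", "allele", "plusone"),
  LUS    = c(12, 11, 14, 15, 16)
)

# Lookup table: every row is a potential parent, keyed by its own Allele.
parents <- sampleData %>%
  select(`Sample File`, Marker,
         ParentAllele = Allele, ParentHeight = Height, ParentLUS = LUS)

result <- sampleData %>%
  mutate(ParentAllele = case_when(
    Type == "minustwo" ~ Allele + 2,
    Type == "minusone" ~ Allele + 1,
    Type == "plusone"  ~ Allele - 1
    # "allele" / "ambiguous" fall through to NA, so they match nothing
  )) %>%
  left_join(parents, by = c("Sample File", "Marker", "ParentAllele"))

result
```

Because Sample File and Marker are join keys, the grouping is respected without any group_by. Note that if a computed ParentAllele has no matching row, it stays filled in while ParentHeight and ParentLUS come back as NA.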
In the data below I want to compute the following ratio: tr(year) / (op(year) - op(year-1)). I would appreciate an answer with dplyr.
year op tr cp
<chr> <dbl> <dbl> <dbl>
1 1984 10 39.1 38.3
2 1985 55 132. 77.1
3 1986 79 69.3 78.7
4 1987 78 47.7 74.1
5 1988 109 77.0 86.4
This is the expected output:
year2 ratio
1 1985 2.933333
2 1986 2.887500
3 1987 -47.700000
4 1988 2.483871
I have not managed to get any result...
Use lag:
library(dplyr)
df %>%
mutate(year = year,
ratio = tr / (op - lag(op)),
.keep = "none") %>%
tidyr::drop_na()
# year ratio
#2 1985 2.933333
#3 1986 2.887500
#4 1987 -47.700000
#5 1988 2.483871
We may use
library(dplyr)
df1 %>%
reframe(year = year[-1], ratio = tr[-1]/diff(op))
-output
year ratio
1 1985 2.933333
2 1986 2.887500
3 1987 -47.700000
4 1988 2.483871
data
df1 <- structure(list(year = 1984:1988, op = c(10L, 55L, 79L, 78L, 109L
), tr = c(39.1, 132, 69.3, 47.7, 77), cp = c(38.3, 77.1, 78.7,
74.1, 86.4)), class = "data.frame", row.names = c("1", "2", "3",
"4", "5"))
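Both answers compute the same denominator, op(year) - op(year - 1). A base R sketch that spells this out, assuming the rows are already sorted by year (neither lag nor diff checks this):

```r
df1 <- data.frame(year = 1984:1988,
                  op = c(10, 55, 79, 78, 109),
                  tr = c(39.1, 132, 69.3, 47.7, 77))

# op(year) - op(year - 1), written two ways
with_lag  <- df1$op - c(NA, head(df1$op, -1))  # shift by one, as dplyr::lag() does
with_diff <- c(NA, diff(df1$op))               # diff() drops the first element

ratio <- df1$tr / with_diff

# The first year has no predecessor, so drop it.
data.frame(year = df1$year, ratio = ratio)[-1, ]
```

For 1988 this gives 77 / (109 - 78), which is positive, in line with the output of both answers.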
I repost here what I posted on stats exchange, having been told it was better suited for stack overflow. Here is the structure of my dataset for reproducibility :
structure(list(numero = c("133", "62", "75", "76", "86", "281"
), tranche_age = c("20-30", "20-30", "20-30", "20-30", "20-30",
"20-30"), tranche_anciennete = c("5 ans et moins", "5 à 10 ans",
"5 ans et moins", "5 ans et moins", "5 à 10 ans", "5 à 10 ans"
), code_statut = c("C", "E", "E", "E", "E", "E"), code_contrat = c("A",
"A", "A", "A", "A", "A"), taux_demploi_mois = c(100, 100, 100,
100, 100, 100), echelon = c("E1", NA, NA, NA, NA, NA), niveau = c("N7",
NA, NA, NA, NA, NA), brut_mensuel = c(NA, 786.13, 1156.95, 1156.95,
904.79, 904.79), estimation_annuelle = c(NA, 10219.69, 15040.35,
15040.35, 11762.27, 11762.27), annee = c(2017, 2017, 2017, 2017,
2017, 2017), primes_en_montant = c(0, 0, 0, 0, 0, 0), primes_en_pourcentage =
c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), brut_mensuel_ETP = c(NA,
786.13, 1156.95, 1156.95, 904.79, 904.79)), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
Each worker is identified by a number ("numero") that does not change from year to year. I would like to add a new variable to this dataframe giving the year-on-year change in "estimation_annuelle" (the yearly wage) for each worker from 2017 to 2021, and then the average annual growth rate over the 5 years. Then I would like to list the workers whose raise was below 2% in a given year (2017-2018, for example) and see whether it was made up in the following years: that is, if someone's wage rose by less than 2% between 2017 and 2018, whether the raise between 2018 and 2019 compensated for the shortfall, and by how much.
I have tried some code to compute the year-on-year change, but it doesn't work:
test <- liste_complete %>%
group_by(annee, numero) %>%
select(numero, annee, estimation_annuelle)%>%
data.frame()
for(i in 1:length(test$estimation_annuelle)) {
print((test[i+1,] - test[i,])/test[i,])
}
I have also not found a way to compute the average annual growth rate (here is the formula: https://investinganswers.com/dictionary/a/average-annual-growth-rate-aagr), nor to check whether an insufficient increase was made up for in the following years.
Could anyone help?
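We can use summarise and then match. Note that this computes the change in the mean brut_mensuel per year, not per worker. The sample only covers 2017, so first alter it to span several years:
library(dplyr)
df$annee <- c(2017, 2017, 2018,2018, 2019,2019)
df$brut_mensuel[1] <- 11000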
# first, summarise
summary <- df %>% select(numero, annee, estimation_annuelle, brut_mensuel) %>%
group_by(annee) %>% summarise(estimation_annuelle=mean(brut_mensuel)) %>% arrange(annee) %>%
mutate(salaire_annee_prec = lag(estimation_annuelle),
variation_annee_precedente=(estimation_annuelle-salaire_annee_prec)/salaire_annee_prec)
# matching
df$variation_annee_prec <- summary$variation_annee_precedente[match(df$annee,summary$annee)]
df
# A tibble: 6 x 15
numero tranche_age tranche_anciennete code_statut code_contrat taux_demploi_mois echelon niveau brut_mensuel estimation_annuelle annee primes_en_montant
<chr> <chr> <chr> <chr> <chr> <dbl> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 133 20-30 5 ans et moins C A 100 E1 N7 11000 NA 2017 0
2 62 20-30 5 à 10 ans E A 100 NA NA 786. 10220. 2017 0
3 75 20-30 5 ans et moins E A 100 NA NA 1157. 15040. 2018 0
4 76 20-30 5 ans et moins E A 100 NA NA 1157. 15040. 2018 0
5 86 20-30 5 à 10 ans E A 100 NA NA 905. 11762. 2019 0
6 281 20-30 5 à 10 ans E A 100 NA NA 905. 11762. 2019 0
primes_en_pourcentage brut_mensuel_ETP variation_annee_prec
<dbl> <dbl> <dbl>
1 NA NA NA
2 NA 786. NA
3 NA 1157. -0.804
4 NA 1157. -0.804
5 NA 905. -0.218
6 NA 905. -0.218
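For the per-worker part of the question, one possible approach is to group by numero and use lag within each group. This is only a sketch under assumptions: the data are in long format with one row per worker per year, the toy values are invented, and the new column names (growth, below_2pct, cum_2y, caught_up) are illustrative. "Caught up" is taken here to mean that the two-year cumulative growth reaches at least 1.02^2 - 1, which is one possible reading of the question.

```r
library(dplyr)

# Hypothetical long-format data: one row per worker per year.
salaires <- tibble(
  numero = rep(c("62", "75"), each = 3),
  annee  = rep(2017:2019, times = 2),
  estimation_annuelle = c(10000, 10100, 10500,   # worker 62
                          15000, 15600, 15900)   # worker 75
)

evolution <- salaires %>%
  arrange(numero, annee) %>%
  group_by(numero) %>%
  mutate(
    growth     = estimation_annuelle / lag(estimation_annuelle) - 1,
    below_2pct = growth < 0.02,
    # cumulative growth over the last two years
    cum_2y     = estimation_annuelle / lag(estimation_annuelle, 2) - 1,
    # previous year was below 2%, but two years together reach 2% per year
    caught_up  = lag(below_2pct) & cum_2y >= 1.02^2 - 1
  ) %>%
  ungroup()

# AAGR: arithmetic mean of the yearly growth rates, per worker.
aagr <- evolution %>%
  group_by(numero) %>%
  summarise(aagr = mean(growth, na.rm = TRUE))
```

Grouping by numero before calling lag is what keeps one worker's wage from being compared with another's; the first year of each worker has no predecessor and stays NA.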
I would like to link the x and df2 datasets. In x I have a Percentage value, which is 0.1 for 03-01-2021 and 0.45 for 01-02-2021 and 01-01-2022. The value 0.1 falls into category I of df2 (whose values range from 0.1 to 0.2), while 0.45, for 01-02-2021 and 01-01-2022, falls into category F (0.4 to 0.5). So, I would like to generate an output table based on that match:
library(dplyr)
df1<- structure(
list(date2= c("01-01-2022","01-01-2022","03-01-2021","03-01-2021","01-02-2021","01-02-2021"),
Category= c("ABC","CDE","ABC","CDE","ABC","CDE"),
coef= c(5,4,0,2,4,5)),
class = "data.frame", row.names = c(NA, -6L))
x<-df1 %>%
group_by(date2) %>%
summarize(across("coef", sum),.groups = 'drop')%>%
arrange(date2 = as.Date(date2, format = "%d-%m-%Y"))
number<-20
x$Percentage<-x$coef/number
date2 coef Percentage
<chr> <dbl> <dbl>
1 03-01-2021 2 0.1
2 01-02-2021 9 0.45
3 01-01-2022 9 0.45
df2 <- structure(
list(
Category = c("A", "B", "C", "D",
"E", "F", "G", "H", "I", "J"),
From = c(0.9,
0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0),
Until = c(
1,
0.8999,
0.7999,
0.6999,
0.5999,
0.4999,
0.3999,
0.2999,
0.1999,
0.0999
),
`1 Val` = c(
2222,
2017.8,
1793.6,
1621.5,
1522.4,
1457.3,
1325.2,
1229.15,
1223.1,
1177.05
),
`2 Val` = c(3200, 2220, 2560,
2200, 2220, 2080, 1220, 1240, 1720, 1620),
`3 Val` = c(
4665,
4122.5,
3732,
3498.75,
3265.5,
3032.25,
2799,
2682.375,
2565.75,
2449.125
),
`4 Val` = c(
6112,
5222.8,
4889.6,
4224,
4278.4,
3972.8,
3667.2,
3224.4,
3361.6,
3222.8
)
),
row.names = c(NA,-10L),
class = c("tbl_df",
"tbl", "data.frame")
)
Category From Until 1 Val 2 Val 3 Val 4 Val
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 A 0.9 1 2222 3200 4665 6112
2 B 0.8 0.900 2018 2220 4122 5223
3 C 0.7 0.800 1794 2560 3732 4890
4 D 0.6 0.700 1622 2200 3499 4224
5 E 0.5 0.600 1522 2220 3266 4278
6 F 0.4 0.500 1457 2080 3032 3973
7 G 0.3 0.400 1325 1220 2799 3667
8 H 0.2 0.300 1229 1240 2682 3224
9 I 0.1 0.200 1223 1720 2566 3362
10 J 0 0.0999 1177 1620 2449 3223
Using tidyverse, we do a rowwise on the 'x' dataset, slice the rows of 'df2' where the 'Percentage' falls between the 'From' and 'Until', and unpack the data.frame/tibble column
library(dplyr)
library(tidyr)
x %>%
rowwise %>%
mutate(out = df2 %>%
slice(which(Percentage>= From &
Percentage <= Until)[1]) %>%
select(-(1:3)) ) %>%
ungroup %>%
unpack(out)
-output
# A tibble: 3 × 7
date2 coef Percentage `1 Val` `2 Val` `3 Val` `4 Val`
<chr> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
1 03-01-2021 2 0.1 1223. 1720 2566. 3362.
2 01-02-2021 9 0.45 1457. 2080 3032. 3973.
3 01-01-2022 9 0.45 1457. 2080 3032. 3973.
Or this could be done with a non-equi join
library(data.table)
nm1 <- names(df2)[endsWith(names(df2), 'Val')]
setDT(x)[setDT(df2), (nm1) := mget(nm1),
on = .(Percentage >= From, Percentage <= Until)]
-output
> x
date2 coef Percentage 1 Val 2 Val 3 Val 4 Val
1: 03-01-2021 2 0.10 1223.1 1720 2565.75 3361.6
2: 01-02-2021 9 0.45 1457.3 2080 3032.25 3972.8
3: 01-01-2022 9 0.45 1457.3 2080 3032.25 3972.8
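A base R alternative is findInterval, which looks up the band for each value in one vectorised call. This sketch uses only the From column, so it assumes the bands are contiguous (each band runs up to the next From) and that every Percentage is at least 0; df2 is cut down to the two columns that matter here.

```r
df2 <- data.frame(
  Category = c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J"),
  From     = c(0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0)
)
x <- data.frame(date2 = c("03-01-2021", "01-02-2021", "01-01-2022"),
                Percentage = c(2, 9, 9) / 20)

# findInterval() needs ascending breakpoints, so sort the lower bounds first.
ord <- order(df2$From)

# Position of each Percentage among the sorted lower bounds,
# mapped back to the original row number of df2.
idx <- ord[findInterval(x$Percentage, df2$From[ord])]

x$Category <- df2$Category[idx]
x
```

With idx in hand, df2[idx, ] gives the matching rows, so the Val columns can be bound onto x with cbind. Unlike the From/Until comparison, this also assigns a band to values that fall in the small gaps between Until and the next From (for example 0.19995).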
Let's say I have a time series and, in each iteration, I take a fixed portion of it and calculate the correlation matrix. Assume there are only three elements, which are denoted by their names in the correlation matrix. I want to give them sequential numbers, meaning the first element is 1, the second is 2 and so forth. Then I want a data frame that expands these matrices. For example:
The first column is the element "from", the second is "to", the third is the correlation value and the fourth is the time. I can give the times as input and repeat each one twice as many times as there are elements. I realize that I will have duplicates for each correlation value, differing only in the "to" and "from" elements, and that is what I am looking for. How can I construct this? Here is my data, where g.list is a list of correlation matrices:
> dput(g.list)
list(structure(c(1, 0.352209944821856, 0.802051885793422, 0.352209944821857,
1, 0.827370298950111, 0.802051885793422, 0.827370298950111, 1
), .Dim = c(3L, 3L), .Dimnames = list(c("jpm", "gs", "ms"), c("jpm",
"gs", "ms"))), structure(c(1, 0.670163753398499, 0.753168359152204,
0.6701637533985, 1, 0, 0.753168359152202, 0, 1), .Dim = c(3L,
3L), .Dimnames = list(c("jpm", "gs", "ms"), c("jpm", "gs", "ms"
))), structure(c(1, 0.681190013681026, 0.153608963486821, 0.681190013681026,
1, 0.82058156983829, 0.153608963486822, 0.82058156983829, 1), .Dim = c(3L,
3L), .Dimnames = list(c("jpm", "gs", "ms"), c("jpm", "gs", "ms"
))))
Are you looking for this?
result <- do.call(rbind, Map(function(x, y)
cbind(which(x < 1, arr.ind = TRUE), value = x[x != 1], year = y),
g.list, 2018:2020))
result
# row col value year
#gs 2 1 0.352 2018
#ms 3 1 0.802 2018
#jpm 1 2 0.352 2018
#ms 3 2 0.827 2018
#jpm 1 3 0.802 2018
#gs 2 3 0.827 2018
#gs 2 1 0.352 2019
#ms 3 1 0.802 2019
#jpm 1 2 0.352 2019
#ms 3 2 0.827 2019
#jpm 1 3 0.802 2019
#gs 2 3 0.827 2019
#gs 2 1 0.352 2020
#ms 3 1 0.802 2020
#jpm 1 2 0.352 2020
#ms 3 2 0.827 2020
#jpm 1 3 0.802 2020
#gs 2 3 0.827 2020
To keep only the lower triangle values and so avoid duplicates, you may use:
do.call(rbind, Map(function(x, y) {
x[upper.tri(x)] <- 1
cbind(which(x < 1, arr.ind = TRUE), value = x[x != 1], year = y)
}, g.list, 2018:2020))
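A variant that returns a data frame with the from/to/value/time columns described in the question could look like the sketch below. It selects the off-diagonal cells by position rather than by value, so an off-diagonal correlation of exactly 1 would not be dropped. The rounded matrices, the times and the helper name to_edges are all made up for illustration.

```r
nm <- c("jpm", "gs", "ms")
# Hypothetical, rounded correlation matrices standing in for g.list.
mats <- list(
  matrix(c(1, 0.35, 0.80, 0.35, 1, 0.83, 0.80, 0.83, 1), 3,
         dimnames = list(nm, nm)),
  matrix(c(1, 0.67, 0.75, 0.67, 1, 0, 0.75, 0, 1), 3,
         dimnames = list(nm, nm))
)
times <- c(2018, 2019)

to_edges <- function(m, t) {
  off <- row(m) != col(m)          # TRUE for every off-diagonal cell
  data.frame(from  = row(m)[off],  # sequential number of the row element
             to    = col(m)[off],  # sequential number of the column element
             value = m[off],
             time  = t)
}

edges <- do.call(rbind, Map(to_edges, mats, times))
edges
```

Here row(m) and col(m) give the position numbers directly, so jpm is 1, gs is 2 and ms is 3, following the order of the dimnames. Replacing the mask with lower.tri(m) would keep each pair once.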