Fixing column names with unnest_wider in R

I am having a problem in R and seek your help!
I have a tibble that looks like this (unfortunately, I can't figure out how to write the code to create the table here). My table looks exactly like this to the viewer, e.g., you can see the letter "c" in the table.
person zip_code
Laura c("11001", "28720", "32948", "10309")
Mel c("80239", "23909")
Jake c("20930", "23929", "13909")
In short, my "zip_code" column contains rows of character vectors, each of which contains multiple ZIP codes.
I would like to separate the column "zip_code" into multiple columns, each containing one zip code (e.g., "zip_code_1", "zip_code_2", etc.). To do so, I have been using unnest_wider:
unnest_wider(zip_code, names_sep="_")
However, whenever I do this, the names of the new columns generated by unnest_wider come out wrong. Instead of being "zip_code_1", "zip_code_2", "zip_code_3", the new names are "zip_code_1[,1]", "zip_code_2[,1]", and "zip_code_3[,1]". Basically, each column name has a "[,1]" afterward.
I have not repeated these column names anywhere, so I have no idea why they look like this.
I cannot manually rename them with:
dplyr::rename(zip_code_1=`zip_code_[,1]`)
If I do this, I get an error message.
Any help fixing these names is greatly appreciated! Thank you!

With the OP's data, the list-column holds one-column matrices rather than plain vectors. If we convert each element to a vector (which has no dim attribute), names_sep works as expected:
library(dplyr)
library(purrr)
library(tidyr)
df1 %>%
mutate(zip = map(zip, c)) %>%
unnest_wider(zip, names_sep = "_")
# A tibble: 3 × 4
zip_1 zip_2 zip_3 zip_4
<chr> <chr> <chr> <chr>
1 10010 10019 10010 10019
2 10019 10032 10019 10032
3 11787 11375 11787 11375
Or, as @IceCreamToucan mentioned, the transform argument of unnest_wider makes this more concise:
unnest_wider(df1, zip, names_sep = '_', transform = c)
# A tibble: 3 × 4
zip_1 zip_2 zip_3 zip_4
<chr> <chr> <chr> <chr>
1 10010 10019 10010 10019
2 10019 10032 10019 10032
3 11787 11375 11787 11375
data
df1 <- structure(list(zip = list(structure(c("10010", "10019", "10010",
"10019"), dim = c(4L, 1L)), structure(c("10019", "10032", "10019",
"10032"), dim = c(4L, 1L)), structure(c("11787", "11375", "11787",
"11375"), dim = c(4L, 1L)))), row.names = c(NA, -3L), class = c("tbl_df",
"tbl", "data.frame"))
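The matrix-versus-vector distinction this answer relies on can be checked directly in base R. The sketch below builds one hypothetical element shaped like those in df1 (the object names m and v are illustrative only) and shows that c() drops the dim attribute:

```r
# One element shaped like those in df1: a 4 x 1 character matrix
m <- structure(c("10010", "10019", "10010", "10019"), dim = c(4L, 1L))

is.matrix(m)  # the element carries a dim attribute
dim(m)

# c() strips the dim attribute and returns a plain character vector
v <- c(m)
is.null(dim(v))
```

Running is.matrix() over a list-column (for example with sapply) is a quick way to see whether this is the cause of the "[,1]" suffix in your own data.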

library(dplyr)
library(tidyr) # unnest, pivot_wider
dat %>%
mutate(
# because your sample data didn't have an ID-column
person = LETTERS[row_number()],
# it's better to work with list-columns of strings, not matrices
zip_code = lapply(zip_code, c)
) %>%
unnest(zip_code) %>%
group_by(person) %>%
mutate(rn = paste0("zip", row_number())) %>%
pivot_wider(id_cols = person, names_from = "rn", values_from = "zip_code") %>%
ungroup()
# # A tibble: 6 x 5
# person zip1 zip2 zip3 zip4
# <chr> <chr> <chr> <chr> <chr>
# 1 A 11374 11374 NA NA
# 2 B 10023 10023 NA NA
# 3 C 10028 10028 NA NA
# 4 D 11210 12498 11210 12498
# 5 E 10301 10301 NA NA
# 6 F 12524 10605 12524 10605
Data
dat <- structure(list(zip_code = list(structure(c("11374", "11374"), .Dim = 2:1), structure(c("10023", "10023"), .Dim = 2:1), structure(c("10028", "10028"), .Dim = 2:1), structure(c("11210", "12498", "11210", "12498"), .Dim = c(4L, 1L)), structure(c("10301", "10301"), .Dim = 2:1), structure(c("12524", "10605", "12524", "10605"), .Dim = c(4L, 1L)))), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -6L))

Related

Sum column based on variable name in other column that contains x following similar letters

I have a table that is somewhat like this:
var        RC
distance50 2
distance20 4
precMax    5
precMin    1
total_prec 8
travelTime 5
travelTime 2
I want to sum all similar type variables, resulting in something like this:
var  sum
dist 6
prec 14
trav 7
Using 4 letters is enough to separate the different types. I have tried and tried but not figured it out. Could anyone please assist? I generally try to work with dplyr, so that would be preferred. The datasets are small (n<100) so speed is not required.
Base R solution:
aggregate(
RC ~ var,
data = transform(
with(df, df[!(grepl("total", var)),]),
var = gsub("^(\\w+)([A-Z0-9]\\w+$)", "\\1", var)
),
FUN = sum
)
Data:
df <- structure(list(var = c("distance50", "distance20", "precMax",
"precMin", "total_prec", "travelTime", "travelTime"), RC = c(2L,
4L, 5L, 1L, 8L, 5L, 2L)), class = "data.frame", row.names = c(NA,
-7L))
library(dplyr)
library(tidyr)
df %>%
separate(var, c("var", "b"), sep = "[_A-Z0-9]", extra = "merge") %>%
group_by(var = ifelse(b %in% var, b, var)) %>%
summarize(RC = sum(RC), .groups = "drop")
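separate() splits var into two columns at the first underscore (_), capital letter (A-Z), or digit (0-9); the matched character is dropped and extra = "merge" keeps the remainder in the second column.
In the group_by statement, if the value in the second column also appears in the first column (as "prec" from "total_prec" does), it is used as the group name; otherwise the first column is kept.
Lastly, sum RC by group.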
Output
var RC
<chr> <int>
1 distance 6
2 prec 14
3 travel 7
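To see what the splitting step produces before any grouping happens, here is a minimal base R sketch that mimics it with regexpr() and substr(). It is not the separate() implementation, only an illustration; the names vars, first, and rest are made up for this example:

```r
vars <- c("distance50", "precMax", "total_prec", "travelTime")

# Position of the first underscore, capital letter, or digit in each name
pos <- regexpr("[_A-Z0-9]", vars)

# Text before the split character, and everything after it
first <- substr(vars, 1, pos - 1)
rest  <- substr(vars, pos + 1, nchar(vars))

data.frame(var = first, b = rest, stringsAsFactors = FALSE)
```

The row for "total_prec" ends up with "total" in the first column and "prec" in the second, which is why the answer's group_by step swaps in the second column when it matches an existing group name.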
library(dplyr)   # also re-exports tibble()
library(stringr) # str_sub
tibble(
var=c('dista', 'distb', 'travelTime'),
rc=2:4) %>%
print() %>%
# A tibble: 3 x 2
# var rc
# <chr> <int>
#1 dista 2
#2 distb 3
#3 travelTime 4
group_by(var=str_sub(var, end=4)) %>%
print() %>%
# A tibble: 3 x 2
# Groups: var [2]
# var rc
# <chr> <int>
#1 dist 2
#2 dist 3
#3 trav 4
summarise(sum=sum(rc))
# A tibble: 2 x 2
# var sum
# <chr> <int>
#1 dist 5
#2 trav 4

Summarizing a dataframe in R with multiple functions in place?

I am new to R and trying to summarize a dataframe with multiple functions and I would like the result to appear in the same column, rather than in separated columns for each function. For example, my data set looks something like this
data =
A B
----
1 2
2 2
3 2
4 2
When I call summarize_all(data, c(min, max)), the dataframe becomes
A_fn1 B_fn1 A_fn2 B_fn2
1 2 4 2
How can I make it so that the result of the summarize_all becomes this:
A B
----
1 2
4 2
Thanks
Does this work:
library(dplyr)
bind_rows(apply(data,2,min),apply(data,2,max))
# A tibble: 2 x 2
A B
<dbl> <dbl>
1 1 2
2 4 2
Here is an option with transpose
library(dplyr)
library(tidyr)
pivot_longer(df1, cols = everything()) %>%
group_by(name) %>%
summarise(min = min(value), max = max(value)) %>%
data.table::transpose(., make.names = 'name')
A B
1 1 2
2 4 2
data
df1 <- structure(list(A = 1:4, B = c(2L, 2L, 2L, 2L)),
class = "data.frame", row.names = c(NA,
-4L))
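A further option, not taken from the answers above, is a base R sketch that relies on range() returning the minimum and maximum together. It assumes every column is numeric, as in df1:

```r
df1 <- data.frame(A = 1:4, B = c(2L, 2L, 2L, 2L))

# range() returns c(min, max) per column; sapply() binds the results into a
# 2-row matrix whose columns keep the original names
res <- as.data.frame(sapply(df1, range))
res
```

The first row of res holds the minima and the second row the maxima, so the column names A and B are kept as requested.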

How to convert string in value to attributes and values?

I have 3 million observations with the attribute "other_tags". The values of "other_tags" have to be converted to new attributes and values.
dput()
structure(list(osm_id = c(105093, 107975, 373652), other_tags = structure(c(2L,
3L, 1L), .Label = c("\"addr:city\"=>\"Neuenegg\",\"addr:street\"=>\"Stuberweg\",\"building\"=>\"school\",\"building:levels\"=>\"2\"",
"\"building\"=>\"commercial\",\"name\"=>\"Pollahof\",\"type\"=>\"multipolygon\"",
"\"building\"=>\"yes\",\"amenity\"=>\"sport\",\"type\"=>\"multipolygon\""
), class = "factor")), class = "data.frame", row.names = c(NA,
-3L))
Here is a subsample of the data:
osm_id other_tags
105093 "building"=>"commercial","name"=>"Pollahof","type"=>"multipolygon"
107975 "building"=>"yes","amenity"=>"sport","type"=>"multipolygon"
373652 "addr:city"=>"Neuenegg","addr:street"=>"Stuberweg","building"=>"school","building:levels"=>"2"
This is the desired data format: Make new attributes (only for building and amenity) and add the value.
osm_id building amenity
105093 commercial
107975 yes sport
373652 school
Thx for your help!
Not that difficult.
other_tags is a factor column, so we have to use as.character on it.
Split each string on ',' with strsplit and store the result in an intermediate list, say s, in which all attributes are separated.
Store these attributes in a separate row for each attribute in a new data frame, say df2.
Use separate() from tidyr to break the attribute name and value into two separate columns; this time the separator sep is =>.
Remove the extra quotation marks with str_remove_all.
Optionally filter the dataset.
pivot_wider into the desired format.
library(tidyverse)
s <- strsplit(as.character(df$other_tags), split = ",")
df2 <- data.frame(osm_id = rep(df$osm_id, sapply(s, length)), other_tags = unlist(s))
df2 %>% separate(other_tags, into = c("Col1", "Col2"), sep = "=>") %>%
mutate(across(starts_with("Col"), ~str_remove_all(., '"'))) %>%
filter(Col1 %in% c("amenity", "building")) %>%
pivot_wider(id_cols = osm_id, names_from = Col1, values_from = Col2)
# A tibble: 3 x 3
osm_id building amenity
<dbl> <chr> <chr>
1 105093 commercial NA
2 107975 yes sport
3 373652 school NA
If however, filter is not used
df2 %>% separate(other_tags, into = c("Col1", "Col2"), sep = "=>") %>%
mutate(across(starts_with("Col"), ~str_remove_all(., '"'))) %>%
pivot_wider(id_cols = osm_id, names_from = Col1, values_from = Col2)
# A tibble: 3 x 8
osm_id building name type amenity `addr:city` `addr:street` `building:levels`
<dbl> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 105093 commercial Pollahof multipolygon NA NA NA NA
2 107975 yes NA multipolygon sport NA NA NA
3 373652 school NA NA NA Neuenegg Stuberweg 2
A single pipe syntax
df %>% mutate(other_tags = as.character(other_tags),
other_tags = str_split(other_tags, ",")) %>%
unnest(other_tags) %>%
mutate(other_tags = str_remove_all(other_tags, '"')) %>%
separate(other_tags, into = c("Col1", "Col2"), sep = "=>") %>%
filter(Col1 %in% c("amenity", "building")) %>%
pivot_wider(id_cols = osm_id, names_from = Col1, values_from = Col2)
# A tibble: 3 x 3
osm_id building amenity
<dbl> <chr> <chr>
1 105093 commercial NA
2 107975 yes sport
3 373652 school NA
We can use (g)sub and str_extract as well as lookaround (in just two lines of code):
library(stringr)
# (?=,|$) also lets the match succeed when the tag is the last one in the string
df$building <- str_extract(gsub('"','', df$other_tags),'(?<=building=>)\\w+(?=,|$)')
df$amenity <- str_extract(gsub('"','', df$other_tags),'(?<=amenity=>)\\w+(?=,|$)')
If for some reason you want to remove column other_tags:
df$other_tags <- NULL
Result:
df
osm_id building amenity
1 105093 commercial <NA>
2 107975 yes sport
3 373652 school <NA>
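For readers without stringr, the lookbehind idea can be sketched in base R with regexpr(perl = TRUE) and regmatches(). The helper name extract_tag is hypothetical, and the pattern [^,]+ (everything up to the next comma) is a choice made for this sketch rather than the \\w+ used in the answer:

```r
tags <- c(
  '"building"=>"commercial","name"=>"Pollahof","type"=>"multipolygon"',
  '"building"=>"yes","amenity"=>"sport","type"=>"multipolygon"'
)

# Hypothetical helper: return the value that follows `key=>`, or NA if absent
extract_tag <- function(x, key) {
  x <- gsub('"', "", x, fixed = TRUE)                 # drop the quotation marks
  m <- regexpr(paste0("(?<=", key, "=>)[^,]+"), x, perl = TRUE)
  out <- rep(NA_character_, length(x))
  out[m > 0] <- regmatches(x, m)                      # fill only where a match exists
  out
}

extract_tag(tags, "building")
extract_tag(tags, "amenity")
```

The lookbehind (?<=key=>) anchors the match right after the key without including it in the result, so only the value is returned.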

Splitting a column into multiple columns based on 2 conditions

I have a large dataframe and I would like to split a column into many columns based on two conditions: the caret character ^ and the letter following IMM-. Based on the data below, Column1 would be split into columns named IMM-A, IMM-B, IMM-C, and IMM-W. I tried the separate function, but it only works if you specify the column names, and because my data is not uniform I don't always know what the column names should be.
SampleId Column1
1 IMM-A*010306+IMM-A*0209^IMM-B*6900+IMM-B*779999^IMM-C*1212+IMM-C*3333
2 IMM-A*010306+IMM-A*0209^IMM-C*6900+IMM-C*779999^IMM-W*1212+IMM-W*3333
3 IMM-B*010306+IMM-B*0209^IMM-C*6900+IMM-C*779999^IMM-W*1212+IMM-W*3333
The expected output would be:
SampleId IMM-A IMM-B IMM-C IMM-W
1 IMM-A*010306+IMM-A*0209 IMM-B*6900+IMM-B*779999 IMM-C*1212+IMM-C*3333
2 IMM-A*010306+IMM-A*0209 IMM-C*6900+IMM-C*779999 IMM-W*1212+IMM-W*3333
3 IMM-B*010306+IMM-B*0209 IMM-C*6900+IMM-C*779999 IMM-W*1212+IMM-W*3333
Not clear about the expected output. Based on the description, we may need
library(tidyverse)
map(strsplit(df$Column1, "[*+^]"), ~
stack(setNames(as.list(.x[c(FALSE, TRUE)]), .x[c(TRUE, FALSE)])) %>%
group_by(ind) %>%
mutate(rn = row_number()) %>%
spread(ind, values)) %>%
set_names(df$SampleId) %>%
bind_rows(.id = 'SampleId') %>%
select(-rn)
# A tibble: 6 x 5
# SampleId `IMM-A` `IMM-B` `IMM-C` `IMM-W`
# <chr> <chr> <chr> <chr> <chr>
#1 1 010306 6900 1212 <NA>
#2 1 0209 779999 3333 <NA>
#3 2 010306 <NA> 6900 1212
#4 2 0209 <NA> 779999 3333
#5 3 <NA> 010306 6900 1212
#6 3 <NA> 0209 779999 3333
Update
Based on the OP's expected output, we expand the data by splitting 'Column1' at the ^ delimiter, then separate 'Column1' into 'colA' and 'colB' at the * delimiter, remove 'colB', and spread to 'wide' format:
df %>%
separate_rows(Column1, sep = "\\^") %>%
separate(Column1, into = c("colA", "colB"), remove = FALSE, sep="[*]") %>%
select(-colB) %>%
spread(colA, Column1, fill = "")
#SampleId IMM-A IMM-B IMM-C IMM-W
#1 1 IMM-A*010306+IMM-A*0209 IMM-B*6900+IMM-B*779999 IMM-C*1212+IMM-C*3333
#2 2 IMM-A*010306+IMM-A*0209 IMM-C*6900+IMM-C*779999 IMM-W*1212+IMM-W*3333
#3 3 IMM-B*010306+IMM-B*0209 IMM-C*6900+IMM-C*779999 IMM-W*1212+IMM-W*3333
data
df <- structure(list(SampleId = 1:3, Column1 =
c("IMM-A*010306+IMM-A*0209^IMM-B*6900+IMM-B*779999^IMM-C*1212+IMM-C*3333",
"IMM-A*010306+IMM-A*0209^IMM-C*6900+IMM-C*779999^IMM-W*1212+IMM-W*3333",
"IMM-B*010306+IMM-B*0209^IMM-C*6900+IMM-C*779999^IMM-W*1212+IMM-W*3333"
)), class = "data.frame", row.names = c(NA, -3L))
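The two steps behind the update (split at ^, then take the text before the first * as the column name) can be illustrated on a single string in base R. This is only a sketch of the idea, with the variable names x, parts, and keys chosen for the example:

```r
x <- "IMM-A*010306+IMM-A*0209^IMM-B*6900+IMM-B*779999"

# "^" is a regex metacharacter, so split on it literally with fixed = TRUE
parts <- strsplit(x, "^", fixed = TRUE)[[1]]

# The future column name is whatever precedes the first "*"
keys <- sub("\\*.*$", "", parts)

setNames(parts, keys)
```

Each piece keeps its full text as the value, and the derived key decides which column it lands in once the data is spread to wide format.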

Match part of a pattern to a string

I have two dataframes, and I want to do a match and merge.
Initially I was using inner_join and coalesce, but realized the match portion wasn't properly matching.
I found an example which seemed to be in the right direction: "How to merge two data frame based on partial string match with R?". One answer suggested using this code:
idx2 <- sapply(df_mouse_human$Protein.IDs, grep, df_mouse$Protein.IDs)
idx1 <- sapply(seq_along(idx2), function(i) rep(i, length(idx2[[i]])))
merged <- cbind(df_mouse_human[unlist(idx1),,drop=F], df_mouse[unlist(idx2),,drop=F])
However, it fell short. The issue is that the dataset I want to use as the pattern has strings that are longer than what I want to match against, and thus nothing matched. Let me show a subset of the data:
dput(droplevels(df_mouse))
structure(list(Protein.IDs = c("Q8CBM2;A2AL85;Q8BSY0", "A2AMH3;A2AMH5;A2AMH4;Q6X893;Q6X893-2;A2AMH8",
"A2AMW0;P47757-2;A2AMV7;P47757;F6QJN8;F6YHZ8;F7CAZ6", "Q3U8S1;A2APM5;A2APM3;A2APM4;E9QKM8;Q80X37;A2APM1;A2APM2;P15379-2;P15379-3;P15379-6;P15379-11;P15379-5;P15379-10;P15379-9;P15379-4;P15379-8;P15379-7;P15379;P15379-12;P15379-13",
"A2ASS6;E9Q8N1;E9Q8K5;A2ASS6-2;A2AT70;F7CR78", "A2AUR7;Q9D031;Q01730"
), Replicate = c(2L, 2L, 2L, 2L, 2L, 2L), Ratio.H.L.normalized.01 = c(NaN,
NaN, NaN, NaN, NaN, NaN), Ratio.H.L.normalized.02 = c(NaN, NaN,
NaN, NaN, NaN, NaN), Ratio.H.L.normalized.03 = c(NaN, NaN, NaN,
NaN, NaN, NaN)), .Names = c("Protein.IDs", "Replicate", "Ratio.H.L.normalized.01",
"Ratio.H.L.normalized.02", "Ratio.H.L.normalized.03"), row.names = 12:17, class = "data.frame")
dput(droplevels(df_mouse_human))
structure(list(Human = c("Q8WZ42", "Q8NF91", "Q9UPN3", "Q96RW7",
"Q8WXG9", "P20929", "Q5T4S7", "O14686", "Q2LD37", "Q92736"),
Protein.IDs = c("A2ASS6", "Q6ZWR6", "Q9QXZ0", "D3YXG0", "Q8VHN7",
"E9Q1W3", "A2AN08", "Q6PDK2", "A2AAE1", "E9Q401")), .Names = c("Human",
"Protein.IDs"), row.names = c(NA, 10L), class = "data.frame")
So I want to match the Protein.IDs in df_mouse to where they exist in df_mouse_human. In the sample data I'm trying to match A2ASS6;E9Q8N1;E9Q8K5;A2ASS6-2;A2AT70;F7CR78 to the entry A2ASS6. It works well if I do it the other way, but is there a way so that if part of the pattern matches the query, it will come back TRUE?
My long term goal is to match and merge the data, so that df_mouse gets a new column with the matching Human protein ids, and where there is no match I'll just replace the NA value with the original string of mouse IDs.
thanks
One method I commonly use with partial matches like this is to reduce the more-complex field to make it look like the simpler one. Sometimes this involves just removing extraneous characters (e.g., if "match only on the first four chars", then I'd make a new index column from substr(idcol, 1, 4) and join on that), but in this case it involves breaking one string into multiple.
This involves associating each of the semi-colon-delimited ids with the big-string, making this intermediate frame taller (sometimes much taller) than the original data.
(For the sake of presentability/aesthetics, I'm modifying df1 to remove the other invariant columns and, for the sake of "other data", adding a row number column. Here df1 stands for df_mouse and df2 for df_mouse_human.)
I'm using dplyr and tidyr, so:
library(dplyr)
library(tidyr)
df1 <- select(df1, Protein.IDs) %>%
mutate(other = row_number())
First I'll break the 6-row frame into a much larger one:
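df1ids <- as_tibble(df1) %>% # tbl_df() is deprecated in favour of as_tibble()
select(Protein.IDs) %>%
mutate(eachID = strsplit(Protein.IDs, ";")) %>%
unnest(eachID) # newer tidyr versions expect the column to be named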
df1ids
# # A tibble: 46 x 2
# Protein.IDs eachID
# <chr> <chr>
# 1 Q8CBM2;A2AL85;Q8BSY0 Q8CBM2
# 2 Q8CBM2;A2AL85;Q8BSY0 A2AL85
# 3 Q8CBM2;A2AL85;Q8BSY0 Q8BSY0
# 4 A2AMH3;A2AMH5;A2AMH4;Q6X893;Q6X893-2;A2AMH8 A2AMH3
# 5 A2AMH3;A2AMH5;A2AMH4;Q6X893;Q6X893-2;A2AMH8 A2AMH5
# 6 A2AMH3;A2AMH5;A2AMH4;Q6X893;Q6X893-2;A2AMH8 A2AMH4
# 7 A2AMH3;A2AMH5;A2AMH4;Q6X893;Q6X893-2;A2AMH8 Q6X893
# 8 A2AMH3;A2AMH5;A2AMH4;Q6X893;Q6X893-2;A2AMH8 Q6X893-2
# 9 A2AMH3;A2AMH5;A2AMH4;Q6X893;Q6X893-2;A2AMH8 A2AMH8
# 10 A2AMW0;P47757-2;A2AMV7;P47757;F6QJN8;F6YHZ8;F7CAZ6 A2AMW0
# # ... with 36 more rows
Notice how the first row, which holds three IDs, has become three rows, one per ID. We'll use "eachID" to join.
left_join(df1ids, df2, by = c("eachID" = "Protein.IDs")) %>%
filter(complete.cases(.)) %>%
select(Human, Protein.IDs) %>%
right_join(df1)
# Joining, by = "Protein.IDs"
# # A tibble: 6 x 3
# Human Protein.IDs other
# <chr> <chr> <int>
# 1 <NA> Q8CBM2;A2AL85;Q8BSY0 1
# 2 <NA> A2AMH3;A2AMH5;A2AMH4;Q6X893;Q6X893-2;A2AMH8 2
# 3 <NA> A2AMW0;P47757-2;A2AMV7;P47757;F6QJN8;F6YHZ8;F7CAZ6 3
# 4 <NA> Q3U8S1;A2APM5;A2APM3;A2APM4;E9QKM8;Q80X37;A2APM1;A2APM2;P15~ 4
# 5 Q8WZ42 A2ASS6;E9Q8N1;E9Q8K5;A2ASS6-2;A2AT70;F7CR78 5
# 6 <NA> A2AUR7;Q9D031;Q01730 6
If you happen to have multiple Human rows for each Protein.IDs, things change a little.
df2$Protein.IDs[2] <- "E9Q8K5"
left_join(df1ids, df2, by = c("eachID" = "Protein.IDs")) %>%
filter(complete.cases(.)) %>%
select(Human, Protein.IDs) %>%
right_join(df1)
# Joining, by = "Protein.IDs"
# # A tibble: 7 x 3
# Human Protein.IDs other
# <chr> <chr> <int>
# 1 <NA> Q8CBM2;A2AL85;Q8BSY0 1
# 2 <NA> A2AMH3;A2AMH5;A2AMH4;Q6X893;Q6X893-2;A2AMH8 2
# 3 <NA> A2AMW0;P47757-2;A2AMV7;P47757;F6QJN8;F6YHZ8;F7CAZ6 3
# 4 <NA> Q3U8S1;A2APM5;A2APM3;A2APM4;E9QKM8;Q80X37;A2APM1;A2APM2;P15~ 4
# 5 Q8WZ42 A2ASS6;E9Q8N1;E9Q8K5;A2ASS6-2;A2AT70;F7CR78 5
# 6 Q8NF91 A2ASS6;E9Q8N1;E9Q8K5;A2ASS6-2;A2AT70;F7CR78 5
# 7 <NA> A2AUR7;Q9D031;Q01730 6
Notice how you now have two copies of other 5? Likely not what you want. If you intend to continue with the semi-colon-delimited theme, though:
left_join(df1ids, df2, by = c("eachID" = "Protein.IDs")) %>%
filter(complete.cases(.)) %>%
group_by(Protein.IDs) %>%
summarize(Human = paste(Human, collapse = ";")) %>%
select(Human, Protein.IDs) %>%
right_join(df1)
# Joining, by = "Protein.IDs"
# # A tibble: 6 x 3
# Human Protein.IDs other
# <chr> <chr> <int>
# 1 <NA> Q8CBM2;A2AL85;Q8BSY0 1
# 2 <NA> A2AMH3;A2AMH5;A2AMH4;Q6X893;Q6X893-2;A2AMH8 2
# 3 <NA> A2AMW0;P47757-2;A2AMV7;P47757;F6QJN8;F6YHZ8;F7CAZ6 3
# 4 <NA> Q3U8S1;A2APM5;A2APM3;A2APM4;E9QKM8;Q80X37;A2APM1;A2APM~ 4
# 5 Q8WZ42;Q8N~ A2ASS6;E9Q8N1;E9Q8K5;A2ASS6-2;A2AT70;F7CR78 5
# 6 <NA> A2AUR7;Q9D031;Q01730 6
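The same split, match, and collapse idea can be sketched in base R on a two-row toy subset of the question's data. The object names (mouse, lookup, tall, human, result) are illustrative, and the final step follows the question's stated goal of falling back to the original mouse string when nothing matches. The sketch assumes the mouse strings are unique:

```r
mouse <- c("Q8CBM2;A2AL85;Q8BSY0",
           "A2ASS6;E9Q8N1;E9Q8K5;A2ASS6-2;A2AT70;F7CR78")
lookup <- data.frame(Human = c("Q8WZ42", "Q8NF91"),
                     Protein.IDs = c("A2ASS6", "Q6ZWR6"),
                     stringsAsFactors = FALSE)

# One row per individual ID, remembering which long string it came from
ids  <- strsplit(mouse, ";", fixed = TRUE)
tall <- data.frame(Protein.IDs = rep(mouse, lengths(ids)),
                   eachID = unlist(ids),
                   stringsAsFactors = FALSE)

# Exact match of each single ID against the lookup table
tall$Human <- lookup$Human[match(tall$eachID, lookup$Protein.IDs)]

# Collapse back to one value per original string (";"-separated if several)
human <- vapply(
  split(tall$Human, factor(tall$Protein.IDs, levels = mouse)),
  function(h) {
    h <- h[!is.na(h)]
    if (length(h)) paste(h, collapse = ";") else NA_character_
  },
  character(1)
)

# Fall back to the original mouse IDs where nothing matched
result <- unname(ifelse(is.na(human), mouse, human))
result
```

Because the match is done on whole IDs after splitting, "A2ASS6-2" does not accidentally match "A2ASS6", which a plain grep on the long string would risk.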
@r2evans asks a good question about what to do with multiple matches. Once that question gets answered, I may need to edit my answer, but here is a quick solution. First, we split up the string of possible IDs, then we see which IDs are matched in the other dataframe, then we join on the row index of the match.
library(tidyverse)
df_mouse %>% mutate(all_id = str_split(Protein.IDs, ";"),
# index of the matching row(s) in df_mouse_human, not the position within all_id
row = map(all_id, ~ which(df_mouse_human$Protein.IDs %in% .x))) %>%
unnest(row) %>%
list(., df_mouse_human %>% rownames_to_column("row") %>% mutate(row = as.numeric(row))) %>%
reduce(left_join, by = "row")
#> Protein.IDs.x Replicate
#> 1 A2ASS6;E9Q8N1;E9Q8K5;A2ASS6-2;A2AT70;F7CR78 2
#> Ratio.H.L.normalized.01 Ratio.H.L.normalized.02 Ratio.H.L.normalized.03
#> 1 NaN NaN NaN
#> row Human Protein.IDs.y
#> 1 1 Q8WZ42 A2ASS6
