I got a list with a weird format:
[[1]]
[1] "Freq.2432.40862794099" "Freq.2792.87280096993" "Freq.2955.16577598796"
[4] "Freq.3161.12982491516" "Freq.3194.19720315405" "Freq.3218.83311568825"
[7] "Freq.3265.37951283662" "Freq.3317.86908506493" "Freq.3900.50408838719"
[10] "Freq.4073.33935633108" "Freq.4302.8830598659" "Freq.4404.80065271461"
[13] "Freq.4469.12305573234" "Freq.4567.90688886175" "Freq.4965.4984006347"
[16] "Freq.5854.45161215455" "Freq.5905.64933878776" "Freq.6175.68130655941"
[19] "Freq.6433.22411185796" "Freq.6631.46775487994" "Freq.6958.20015968149"
[22] "Freq.7469.83422424355" "Freq.8602.43342069553" "Freq.8766.14436081853"
[25] "Freq.8811.22677706485" "Freq.8915.90029255773" "Freq.9131.39810096"
[28] "Freq.9378.82122607608"
Never saw that [[1]] in a list before, and the problem is that I can't append things to this list.
How can I solve this?
This is a list in a list. Normally this can be referred to as a nested list.
a <- c(1,2,3)
b <- c(4,5,6)
list <- list(a,b)
In this code snippet we are creating two vectors and put them into a list. Now you can access the nested vectors/lists using the double brackets. Like so:
list[[1]]
> [1] 1 2 3
Now, if you want to change the value (or append it, see comment) you can use the normal syntax but solely assign it to the nested object.
list[[1]] <- c(7,8,9)
list[[1]]
> [1] 7 8 9
Related
I have a data frame in R with 341 rows. I want to rename the row names using a list with 349 names. All 341 names will be in this list for sure. But not all of them will be perfect hits.
The data looks like this
rownames(df_RPM1)
[1] "LQNS02059392.1_11686_5p"
[2] "LQNS02277998.1_30984_3p"
[3] "LQNS02277998.1_30984_5p"
[4] "LQNS02277998.1_30988_3p"
[5] "LQNS02277998.1_30988_5p"
[6] "LQNS02277997.1_30943_3p"
[7] "miR-9|LQNS02278070.1_31740_3p"
[8] "miR-9|LQNS02278094.1_36129_3p"
head(inlist)
[1] "dpu-miR-2-03_LQNS02059392.1_11686_5p" "dpu-miR-10-P2_LQNS02277998.1_30984_3p"
[3] "dpu-miR-10-P2_LQNS02277998.1_30984_5p" "dpu-miR-10-P3_LQNS02277998.1_30988_3p"
[5] "dpu-miR-10-P3_LQNS02277998.1_30988_5p" "miR-9|LQNS02278070.1_31740_3p"
[6] "miR-9|LQNS02278094.1_36129_3p"
The order won't necessarily be the same in the two.
Can anyone suggest me how to do this in R?
Thanks a lot
Depends a lot what a "non-perfect hit" looks like. Assuming the row name is a substring of the real name, str_detect() does the job quite well:
library(tidyverse)
real_names <- c("dpu-miR-2-03_LQNS02059392.1_11686_5p",
"dpu-miR-10-P2_LQNS02277998.1_30984_3p",
"dpu-miR-10-P2_LQNS02277998.1_30984_5p",
"dpu-miR-10-P3_LQNS02277998.1_30988_3p",
"dpu-miR-10-P3_LQNS02277998.1_30988_5p",
"miR-9|LQNS02278070.1_31740_3p",
"miR-9|LQNS02278094.1_36129_3p")
str_which(real_names, "LQNS02059392.1_11686_5p")
#> [1] 1
So we can vectorize (I removed the element 6 which is not found in the example list):
pos <- map_int(rownames(df_RPM1), ~ str_which(real_names, fixed(.)))
pos
#> [1] 1 2 3 4 5 6 7
And all that's left is to change the row names:
rownames(df_RPM1) <- real_names[pos]
Of course, if a non-perfect hit means something more complicated, you may need to create a regex from the row names or something like that.
I'm working on a shiny app which plots data trees. I'm looking to incorporate the shinyTree app to permit quick comparison of plotted nodes. The issue is that the shinyTree app returns a redundant list of lists of the sub node plot.
The actual list of list is included below. I would like to keep the longest branches only. I would also like to remove the id node (integer node), I'm struggling as to why it even shows up based on the list. I have tried many different methods to work with this list but it's been a real struggle. The list concept is difficult to understand.
I create the data.tree and plot via:
dataTree.a <- FromListSimple(checkList)
plot(dataTree.a)
> checkList
[[1]]
[[1]]$Asia
[[1]]$Asia$China
[[1]]$Asia$China$Beijing
[[1]]$Asia$China$Beijing$Round
[[1]]$Asia$China$Beijing$Round$`20383994`
[1] 0
[[2]]
[[2]]$Asia
[[2]]$Asia$China
[[2]]$Asia$China$Beijing
[[2]]$Asia$China$Beijing$Round
[1] 0
[[3]]
[[3]]$Asia
[[3]]$Asia$China
[[3]]$Asia$China$Beijing
[1] 0
[[4]]
[[4]]$Asia
[[4]]$Asia$China
[[4]]$Asia$China$Shanghai
[[4]]$Asia$China$Shanghai$Round
[[4]]$Asia$China$Shanghai$Round$`23740778`
[1] 0
[[5]]
[[5]]$Asia
[[5]]$Asia$China
[[5]]$Asia$China$Shanghai
[[5]]$Asia$China$Shanghai$Round
[1] 0
[[6]]
[[6]]$Asia
[[6]]$Asia$China
[[6]]$Asia$China$Shanghai
[1] 0
[[7]]
[[7]]$Asia
[[7]]$Asia$China
[1] 0
[[8]]
[[8]]$Asia
[[8]]$Asia$India
[[8]]$Asia$India$Delhi
[[8]]$Asia$India$Delhi$Round
[[8]]$Asia$India$Delhi$Round$`25703168`
[1] 0
[[9]]
[[9]]$Asia
[[9]]$Asia$India
[[9]]$Asia$India$Delhi
[[9]]$Asia$India$Delhi$Round
[1] 0
[[10]]
[[10]]$Asia
[[10]]$Asia$India
[[10]]$Asia$India$Delhi
[1] 0
[[11]]
[[11]]$Asia
[[11]]$Asia$India
[1] 0
[[12]]
[[12]]$Asia
[[12]]$Asia$Japan
[[12]]$Asia$Japan$Tokyo
[[12]]$Asia$Japan$Tokyo$Round
[[12]]$Asia$Japan$Tokyo$Round$`38001000`
[1] 0
[[13]]
[[13]]$Asia
[[13]]$Asia$Japan
[[13]]$Asia$Japan$Tokyo
[[13]]$Asia$Japan$Tokyo$Round
[1] 0
[[14]]
[[14]]$Asia
[[14]]$Asia$Japan
[[14]]$Asia$Japan$Tokyo
[1] 0
[[15]]
[[15]]$Asia
[[15]]$Asia$Japan
[1] 0
[[16]]
[[16]]$Asia
[1] 0
Well, I did cobble together a poor hack to make this work here is what I did to the 'checkList' list
checkList <- get_selected(tree, format = "slices")
# Convert and collapse shinyTree slices to data.tree
# This is a bit of a cluge to work the graphic with
# shinyTree an alternate one liner is in works
# This transform works by finding the longest branches
# and only plotting them since the other branches are
# subsets due to the slices.
# Extract the checkList name (as characters) from the checkList
tmp <- names(unlist(checkList))
# Determine the length of the individual checkList Names
lens <- lapply(tmp, function(x) length(strsplit(x, ".", fixed=TRUE)[[1]]))
# Find the elements with the highest length returns a list of high vals
lens.max <- which(lens == max(sapply(lens, max)))
# Replace all '.' with '\' prepping for DataFrameTable Converions
tmp <- relist(str_replace_all(tmp, "\\.", "/"), skeleton=tmp)
# Add a root node to work with multiple branches
tmp <- unlist(lapply(tmp, function(x) paste0("Root/", x)))
# Create a list of only the longest branches
longBranches <- as.list(tmp[lens.max])
# Convert the list into a data.frame for convert
longBranches.df <- data.frame(pathString = do.call(rbind, longBranches))
# Publish the data.frame for use
vals$selDF <- longBranches.df
#save(checkList, file = "chkLists.RData") # Save for troubleshooting
print(vals$selDF)ode here
The new checkList looks like this:
[1] "Root/Europe/France/Paris/Round/10843285" "Root/Europe/France/Paris/Round"
[3] "Root/Europe/France/Paris" "Root/Europe/France"
[5] "Root/Europe/Germany/Berlin/Diamond/3563194" "Root/Europe/Germany/Berlin/Diamond"
[7] "Root/Europe/Germany/Berlin/Round/3563194" "Root/Europe/Germany/Berlin/Round"
[9] "Root/Europe/Germany/Berlin" "Root/Europe/Germany"
[11] "Root/Europe/Italy/Rome/Round/3717956" "Root/Europe/Italy/Rome/Round"
[13] "Root/Europe/Italy/Rome" "Root/Europe/Italy"
[15] "Root/Europe/United Kingdom/London/Round/10313307" "Root/Europe/United Kingdom/London/Round"
[17] "Root/Europe/United Kingdom/London" "Root/Europe/United Kingdom"
[19] "Root/Europe"
It works :)... but I think this could be done with a two liner.... I'll work on it again in a week or so. Any other Ideas would be appreciated.
I have several vectors to combine into a named list ("my_list"). The names of the vectors are already stored in the vector ("zI").
> zI
[1] "Chemokines" "Cell_Cycle" "Regulation"
[4] "Senescence" "B_cell_Functions" "T_Cell_Functions"
[7] "Cell_Functions" "Adhesion" "Transporter_Functions"
[10] "Complement" "Pathogen_Defense" "Cytokines"
[13] "Antigen_Processing" "Leukocyte_Functions" "TNF_Superfamily"
[16] "Macrophage_Functions" "Microglial_Functions" "Interleukins"
[19] "Cytotoxicity" "NK_Cell_Functions" "TLR"
If it's a small number of vectors, I'd simply do
my_list <- setNames(list(Chemokines, Adhesion), c("Chemokines", "Adhesion"))
I'd like to find a smarter way, other than to combine the vector names into a long string and then copying/pasting.
> toString(zI)
[1] "Chemokines, Cell_Cycle, Regulation, Senescence, B_cell_Functions, T_Cell_Functions, Cell_Functions, Adhesion, Transporter_Functions, Complement, Pathogen_Defense, Cytokines, Antigen_Processing, Leukocyte_Functions, TNF_Superfamily, Macrophage_Functions, Microglial_Functions, Interleukins, Cytotoxicity, NK_Cell_Functions, TLR"
> my_lists <- list(Chemokines, Cell_Cycle, Regulation, Senescence, B_cell_Functions, T_Cell_Functions, Cell_Functions, Adhesion, Transporter_Functions, Complement, Pathogen_Defense, Cytokines, Antigen_Processing, Leukocyte_Functions, TNF_Superfamily, Macrophage_Functions, Microglial_Functions, Interleukins, Cytotoxicity, NK_Cell_Functions, TLR)
> my_lists <- setNames(my_lists, zI)
This is probably a really fundamental question, but I've searched and read about 10 separate threads and still can't figure it out. Much thanks for any help!
We can use mget to get the values of the character strings.
mget(zI)
I have some variables in my current R environment:
ls()
[1] "clt.list" "commands.list" "dirs.list" "eq" "hurs.list" "mlist" "prec.list" "temp.list" "vars"
[10] "vars.list" "wind.list"
where each one of the variables "clt.list", "hurs.list", "prec.list", "temp.list" and "wind.list" is a (huge) list of strings.
For example:
clt.list[1:20]
[1] "clt_Amon_ACCESS1-0_historical_r1i1p1_185001-200512.nc" "clt_Amon_ACCESS1-3_historical_r1i1p1_185001-200512.nc"
[3] "clt_Amon_bcc-csm1-1_historical_r1i1p1_185001-201212.nc" "clt_Amon_bcc-csm1-1-m_historical_r1i1p1_185001-201212.nc"
[5] "clt_Amon_BNU-ESM_historical_r1i1p1_185001-200512.nc" "clt_Amon_CanESM2_historical_r1i1p1_185001-200512.nc"
[7] "clt_Amon_CCSM4_historical_r1i1p1_185001-200512.nc" "clt_Amon_CESM1-BGC_historical_r1i1p1_185001-200512.nc"
[9] "clt_Amon_CESM1-CAM5_historical_r1i1p1_185001-200512.nc" "clt_Amon_CESM1-CAM5-1-FV2_historical_r1i1p1_185001-200512.nc"
[11] "clt_Amon_CESM1-FASTCHEM_historical_r1i1p1_185001-200512.nc" "clt_Amon_CESM1-WACCM_historical_r1i1p1_185001-200512.nc"
[13] "clt_Amon_CMCC-CESM_historical_r1i1p1_190001-190412.nc" "clt_Amon_CMCC-CESM_historical_r1i1p1_190001-200512.nc"
[15] "clt_Amon_CMCC-CESM_historical_r1i1p1_190501-190912.nc" "clt_Amon_CMCC-CESM_historical_r1i1p1_191001-191412.nc"
[17] "clt_Amon_CMCC-CESM_historical_r1i1p1_191501-191912.nc" "clt_Amon_CMCC-CESM_historical_r1i1p1_192001-192412.nc"
[19] "clt_Amon_CMCC-CESM_historical_r1i1p1_192501-192912.nc" "clt_Amon_CMCC-CESM_historical_r1i1p1_193001-193412.nc"
What I need to do is extract the subset of the string that is between "Amon_" and "_historical".
I can do this for a single variable, as shown here:
levels(as.factor(sub(".*?Amon_(.*?)_historical.*", "\\1", clt.list[1:20])))
[1] "ACCESS1-0" "ACCESS1-3" "bcc-csm1-1" "bcc-csm1-1-m" "BNU-ESM" "CanESM2" "CCSM4"
[8] "CESM1-BGC" "CESM1-CAM5" "CESM1-CAM5-1-FV2" "CESM1-FASTCHEM" "CESM1-WACCM" "CMCC-CESM"
However, what I'd like to do is to run the command above for all the five variables at once. Instead of using just "ctl.list" as argument in the command above, I'd like to use all variables "clt.list", "hurs.list", "prec.list", "temp.list" and "wind.list" at once.
How can I do that?
Many thanks in advance!
You can put your operation into a function and then iterate over it:
get_my_substr <- function(vecname)
levels(as.factor(sub(".*?Amon_(.*?)_historical.*", "\\1", get(vecname))))
lapply(my_vecnames,get_my_substr)
lapply acts like a loop. You can create your list of vector names with
my_vecnames <- ls(pattern=".list$")
It is generally good practice to post a reproducible example in your question. Since none was provided here, I tested this approach with...
# example-maker
prestr <- "grr_Amon_"
posstr <- "_historical_zzz"
make_ex <- function()
replicate(
sample(10,1),
paste0(prestr,paste0(sample(LETTERS,sample(5,1)),collapse=""),posstr)
)
# make a couple examples
set.seed(1)
m01 <- make_ex()
m02 <- make_ex()
# test result
lapply(ls(pattern="^m[0-9][0-9]$"),get_my_substr)
One solution would be to create a vector containing the variable names that you want extract the data from, for example:
var.names <- c("clt.list", "commands.list", "dirs.list")
Then to access the value of each variable from the name:
for (var.name in var.names) {
var.value <- as.list(environment())[[var.name]]
# Do something with var.value
}
I have two matrix with the same number of columns, but with different number of rows:
a <- cbind(runif(5), runif(5))
b <- cbind(runif(8), runif(8))
I want to associate these in a same list, so that the first columns of a and b are associated with each other, and so on:
my_result <- list(list(a[,1], b[,1]), list(a[,2], b[,2]))
So the result would look like this:
> print(my_result)
[[1]]
[[1]][[1]]
[1] 0.9440956 0.7259602 0.7804068 0.7115368 0.2771190
[[1]][[2]]
[1] 0.4155642 0.1535414 0.6983123 0.7578231 0.2126765 0.6753884 0.8160817
[8] 0.6548915
[[2]]
[[2]][[1]]
[1] 0.7343330 0.7751599 0.4463870 0.6926663 0.9692621
[[2]][[2]]
[1] 0.5708726 0.1234482 0.2875474 0.4760349 0.2027653 0.5142006 0.4788264
[8] 0.7935544
I can't figure how to do that without a for loop, but I'm pretty sure some *pply magic could be used here.
Any directions would be much appreciated.
I'm not sure how general a solution you're looking for (arbitrary number of matrices, ability to pass a list of matrices, etc.) but this works for your specific example:
lapply(1:2,function(i){list(a[,i],b[,i])})