How to remove elements of a list in R? - r

I have an igraph object that I created with the igraph library. This object is a list. Some of the components of this list have a length of 2, and I would like to remove all of them.
IGRAPH clustering walktrap, groups: 114, mod: 0.79
+ groups:
$`1`
[1] "OTU0041" "OTU0016" "OTU0062"
[4] "OTU1362" "UniRef90_A0A075FHQ0" "UniRef90_A0A075FSE2"
[7] "UniRef90_A0A075FTT8" "UniRef90_A0A075FYU2" "UniRef90_A0A075G543"
[10] "UniRef90_A0A075G6B2" "UniRef90_A0A075GIL8" "UniRef90_A0A075GR85"
[13] "UniRef90_A0A075H910" "UniRef90_A0A075HTF5" "UniRef90_A0A075IFG0"
[16] "UniRef90_A0A0C1R539" "UniRef90_A0A0C1R6X4" "UniRef90_A0A0C1R985"
[19] "UniRef90_A0A0C1RCN7" "UniRef90_A0A0C1RE67" "UniRef90_A0A0C1RFI5"
[22] "UniRef90_A0A0C1RFN8" "UniRef90_A0A0C1RGE0" "UniRef90_A0A0C1RGX0"
[25] "UniRef90_A0A0C1RHM1" "UniRef90_A0A0C1RHR5" "UniRef90_A0A0C1RHZ4"
+ ... omitted several groups/vertices
For example, this one :
> a[[91]]
[1] "OTU0099" "UniRef90_UPI0005B28A7E"
I tried this but it does not work :
a[lapply(a,length)>2]
Any help?

Since you didn't provide any reproducible data or example, I had to produce some dummy data:
# create dummy data
a <- list(x = 1, y = 1:4, z = 1:2)
# remove elements in list with lengths greater than 2:
a[which(lapply(a, length) > 2)] <- NULL
In case you wanted to remove the items with lengths exactly equal to 2 (the question is unclear), the last line should be replaced by:
a[which(lapply(a, length) == 2)] <- NULL
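If the goal is to keep everything except the length-2 components, a non-destructive variant is to index with a logical vector built by lengths(). This is only a minimal sketch on the same dummy list; the name kept is introduced here for illustration, and it has been checked on a plain list only, not on an igraph communities object.

```r
# dummy data, as above
a <- list(x = 1, y = 1:4, z = 1:2)

# lengths() returns one integer per component: x = 1, y = 4, z = 2
# keep every component whose length is not exactly 2
kept <- a[lengths(a) != 2]

names(kept)
```

Because the original list a is left untouched, the filter can be adjusted and rerun without rebuilding the data.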

Related

Partial Variances at each row of a Matrix

I generated a series of 10,000 random numbers through:
rand_x = rf(10000, 3, 5)
Now I want to produce another series that contains the variance up to each point, i.e. the column should look like this:
[variance(first two numbers)]
[variance(first three numbers)]
[variance(first four numbers)]
[variance(first five numbers)]
.
.
.
.
[variance of 10,000 numbers]
I have written the code as:
c ( var(rand_x[1:1]) : var(rand_x[1:10000])
but I am only getting 157 elements in the column rather than 10,000. Can someone explain what I am doing wrong here?
One option is to loop over the indices 2 to 10000 with sapply: for each index i, extract the elements of rand_x from position 1 to i, apply var, and collect the results in a vector of variances.
out <- sapply(2:10000, function(i) var(rand_x[1:i]))
Your code creates a sequence incrementing by one with the variance of the first two elements as start value and the variance of the whole vector as limit.
var(rand_x[1:2]):var(rand_x[1:n])
# [1] 0.9026262 1.9026262 2.9026262
## compare:
.9026262:3.33433
# [1] 0.9026262 1.9026262 2.9026262
What you want is to loop over the vector indices, using seq_along to get the variances of sequences growing by one. To see what needs to be done, I show you first a (rather slow) for loop.
vars <- numeric() ## initialize numeric vector
for (i in seq_along(rand_x)) {
vars[i] <- var(rand_x[1:i])
}
vars
# [1] NA 0.9026262 1.4786540 1.2771584 1.7877717 1.6095619
# [7] 1.4483273 1.5653797 1.8121144 1.6192175 1.4821020 3.5005254
# [13] 3.3771453 3.1723564 2.9464537 2.7620001 2.7086317 2.5757641
# [19] 2.4330738 2.4073546 2.4242747 2.3149455 2.3192964 2.2544765
# [25] 3.1333738 3.0343781 3.0354998 2.9230927 2.8226541 2.7258979
# [31] 2.6775278 2.6651541 2.5995346 3.1333880 3.0487177 3.0392603
# [37] 3.0483917 4.0446074 4.0463367 4.0465158 3.9473870 3.8537925
# [43] 3.8461463 3.7848464 3.7505158 3.7048694 3.6953796 3.6605357
# [49] 3.6720684 3.6580296
The first element has to be NA because the variance of one element is not defined (division by zero).
However, this for loop is slow, partly because vars grows by one element on every iteration. A function from the *apply family, e.g. vapply, is the more idiomatic choice and avoids growing the result. The third argument of vapply, numeric(1) (or just 0), is a template for the return value: it declares that the result of each iteration is a numeric vector of length one.
vars <- vapply(seq_along(rand_x), function(i) var(rand_x[1:i]), numeric(1))
vars
# [1] NA 0.9026262 1.4786540 1.2771584 1.7877717 1.6095619
# [7] 1.4483273 1.5653797 1.8121144 1.6192175 1.4821020 3.5005254
# [13] 3.3771453 3.1723564 2.9464537 2.7620001 2.7086317 2.5757641
# [19] 2.4330738 2.4073546 2.4242747 2.3149455 2.3192964 2.2544765
# [25] 3.1333738 3.0343781 3.0354998 2.9230927 2.8226541 2.7258979
# [31] 2.6775278 2.6651541 2.5995346 3.1333880 3.0487177 3.0392603
# [37] 3.0483917 4.0446074 4.0463367 4.0465158 3.9473870 3.8537925
# [43] 3.8461463 3.7848464 3.7505158 3.7048694 3.6953796 3.6605357
# [49] 3.6720684 3.6580296
Data:
n <- 50
set.seed(42)
rand_x <- rf(n, 3, 5)
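Both the for loop and vapply recompute the variance from scratch for every prefix, so the total work grows quadratically with the length of the vector. As a sketch of an alternative (not part of the answers above), the running variance can be derived from cumulative sums, using the identity var = (sum(x^2) - sum(x)^2 / k) / (k - 1). The names x, k, running_var and reference are assumptions made for this example, and the formula can lose precision when the values are very large relative to their spread.

```r
set.seed(42)
x <- rf(50, 3, 5)

k <- seq_along(x)  # number of observations in each prefix

# running sample variance from cumulative sums
running_var <- (cumsum(x^2) - cumsum(x)^2 / k) / (k - 1)
running_var[1] <- NA_real_  # variance of a single value is undefined

# cross-check against the vapply approach
reference <- vapply(k, function(i) var(x[1:i]), numeric(1))
all.equal(running_var, reference)
```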

Is there a specific function in R to merge 2 vectors [duplicate]

I have two vectors, one that contains a list of variables, and one that contains dates, such as
Variables_Pays <- c("PIB", "ConsommationPrivee","ConsommationPubliques",
"FBCF","ProductionIndustrielle","Inflation","InflationSousJacente",
"PrixProductionIndustrielle","CoutHoraireTravail")
Annee_Pays <- c("2000","2001")
I want to merge them to have a vector with each variable indexed by my date, that is my desired output is
> Colonnes_Pays_Principaux
[1] "PIB_2000" "PIB_2001" "ConsommationPrivee_2000"
[4] "ConsommationPrivee_2001" "ConsommationPubliques_2000" "ConsommationPubliques_2001"
[7] "FBCF_2000" "FBCF_2001" "ProductionIndustrielle_2000"
[10] "ProductionIndustrielle_2001" "Inflation_2000" "Inflation_2001"
[13] "InflationSousJacente_2000" "InflationSousJacente_2001" "PrixProductionIndustrielle_2000"
[16] "PrixProductionIndustrielle_2001" "CoutHoraireTravail_2000" "CoutHoraireTravail_2001"
Is there a simpler / more readable way than the double for loop below, which I have tried and which works?
Colonnes_Pays_Principaux <- vector()
for (Variable in seq_along(Variables_Pays)) {
  for (Annee in seq_along(Annee_Pays)) {
    Colonnes_Pays_Principaux <- append(
      Colonnes_Pays_Principaux,
      paste(Variables_Pays[Variable], Annee_Pays[Annee], sep = "_")
    )
  }
}
expand.grid will create a data frame with all combinations of the two vectors.
with(
expand.grid(Variables_Pays, Annee_Pays),
paste0(Var1, "_", Var2)
)
#> [1] "PIB_2000" "ConsommationPrivee_2000"
#> [3] "ConsommationPubliques_2000" "FBCF_2000"
#> [5] "ProductionIndustrielle_2000" "Inflation_2000"
#> [7] "InflationSousJacente_2000" "PrixProductionIndustrielle_2000"
#> [9] "CoutHoraireTravail_2000" "PIB_2001"
#> [11] "ConsommationPrivee_2001" "ConsommationPubliques_2001"
#> [13] "FBCF_2001" "ProductionIndustrielle_2001"
#> [15] "Inflation_2001" "InflationSousJacente_2001"
#> [17] "PrixProductionIndustrielle_2001" "CoutHoraireTravail_2001"
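Note that the output above is grouped by year, while the question asks for grouping by variable. expand.grid varies its first argument fastest, so one way to get the requested order is to pass the years first. The following is a small sketch with shortened stand-in vectors; the names vars, years, grid and by_variable are assumptions made for this example.

```r
vars  <- c("PIB", "FBCF")      # shortened stand-in for Variables_Pays
years <- c("2000", "2001")     # stand-in for Annee_Pays

# the first argument (year) changes fastest, the second (variable) slowest
grid <- expand.grid(year = years, variable = vars, stringsAsFactors = FALSE)

by_variable <- paste0(grid$variable, "_", grid$year)
by_variable
```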
We can use outer :
c(t(outer(Variables_Pays, Annee_Pays, paste, sep = '_')))
# [1] "PIB_2000" "PIB_2001"
# [3] "ConsommationPrivee_2000" "ConsommationPrivee_2001"
# [5] "ConsommationPubliques_2000" "ConsommationPubliques_2001"
# [7] "FBCF_2000" "FBCF_2001"
# [9] "ProductionIndustrielle_2000" "ProductionIndustrielle_2001"
#[11] "Inflation_2000" "Inflation_2001"
#[13] "InflationSousJacente_2000" "InflationSousJacente_2001"
#[15] "PrixProductionIndustrielle_2000" "PrixProductionIndustrielle_2001"
#[17] "CoutHoraireTravail_2000" "CoutHoraireTravail_2001"
No real need to go beyond the basics here! Use paste for pasting the strings and rep to repeat either Annee_Pays or Variables_Pays to get all combinations:
Variables_Pays <- c("PIB", "ConsommationPrivee","ConsommationPubliques",
"FBCF","ProductionIndustrielle","Inflation","InflationSousJacente",
"PrixProductionIndustrielle","CoutHoraireTravail")
Annee_Pays <- c("2000","2001")
# To get this in the same order as in your example:
paste(rep(Variables_Pays, rep(2, length(Variables_Pays))), Annee_Pays, sep = "_")
# Alternative order:
paste(Variables_Pays, rep(Annee_Pays, rep(length(Variables_Pays), 2)), sep = "_")
# Or, if order doesn't matter too much (this covers every combination
# only because the two lengths, 9 and 2, share no common factor):
paste(Variables_Pays, rep(Annee_Pays, length(Variables_Pays)), sep = "_")
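The first variant can arguably be written more readably with the each argument of rep, which repeats every element a given number of times before moving on to the next one. This is a minimal sketch on shortened stand-in vectors; vars, years, combined and via_outer are names assumed for the example.

```r
vars  <- c("PIB", "FBCF", "Inflation")  # shortened stand-in for Variables_Pays
years <- c("2000", "2001")              # stand-in for Annee_Pays

# repeat each variable once per year; paste recycles years alongside
combined <- paste(rep(vars, each = length(years)), years, sep = "_")

# cross-check against the outer() approach
via_outer <- c(t(outer(vars, years, paste, sep = "_")))
identical(combined, via_outer)
```

Unlike the plain recycling variant, this does not depend on the lengths of the two vectors.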
In base R:
Variables_Pays <- c("PIB", "ConsommationPrivee","ConsommationPubliques",
"FBCF","ProductionIndustrielle","Inflation","InflationSousJacente",
"PrixProductionIndustrielle","CoutHoraireTravail")
Annee_Pays <- c("2000","2001")
cbind(paste(Variables_Pays, Annee_Pays, sep = "_"), paste(Variables_Pays, rev(Annee_Pays), sep = "_"))

How to select a specific interval of dataframes/objects inside a list()?

I have a list composed of 10 numeric vectors. I would like to select the first five (1:5) or, say, just the 3rd and the 9th of the numeric vectors inside the list.
Below is an example of such a list:
n_vec = lapply(1:10, function(x) rnorm(20,5,2))
bLister = list()
keeping_names = NULL
for (i in 1:length(n_vec)) {
  single_name_ = paste("thisis_vec", i)
  temp = n_vec[[i]]
  keeping_names = c(keeping_names, single_name_)
  bLister[[i]] = temp
}
names(bLister) = keeping_names
This way doesn't work:
bLister[[1:5]]
bLister[[c(3,9)]]
How can I do this?
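You can subset the list like so. Notice the number of square brackets: single brackets return a sub-list containing the selected elements, whereas double brackets extract one single element.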
> bLister[c(3, 9)]
$`thisis_vec 3`
[1] 5.603467 3.749571 3.944807 7.279552 7.122220 2.065051 2.587282 4.405463
[9] 6.687400 7.567451 6.239640 6.017510 2.484759 3.223271 5.301008 1.545704
[17] 2.465992 1.518966 6.997675 3.966775
$`thisis_vec 9`
[1] 3.900151 5.260895 7.971662 6.578425 4.861220 3.770569 1.128102 6.164506
[9] 4.767511 5.286352 3.898185 2.298500 8.476691 7.794415 7.148588 6.699527
[17] 3.638074 4.240355 8.575829 5.340551
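The difference between the two bracket forms can be seen on a small list. With a vector index, [[ does not select several elements; it indexes recursively, so x[[c(3, 2)]] means x[[3]][[2]]. The sketch below uses a hypothetical three-element list named small rather than bLister so that the results are fixed.

```r
small <- list(a = 1:3, b = 4:6, c = 7:9)

# single brackets: a list holding the 1st and 3rd elements
sub <- small[c(1, 3)]

# double brackets with one index: the element itself (an integer vector)
one <- small[[2]]

# double brackets with a vector index: recursive indexing,
# i.e. the 2nd value of the 3rd element
nested <- small[[c(3, 2)]]

str(sub)
```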

How can I use two lists to create a Table (Columns and Rows)

I want to write a script in R that allows me to import MSG files and store the information in a table. The fields may vary by course, so the column names are defined based on the first MSG file being imported.
The import and extraction are already working (special thanks to the user "January")
What does not work is filling in the table, which consists of two steps: adding the column names and filling in the rows.
I've tried using unlist to prepare the contents of the lists so that I can add them as columns and rows to a table.
Anmeldung <- gsub("^\\s+", "", Anmeldung) # remove spaces at the beginning and end
Anmeldung <- gsub("\\s+$", "", Anmeldung)
words <- strsplit(Anmeldung, " *[\n\r]+ *")[[1]]
fields <- as.list(words[seq(1, length(words), 2)])
information <- as.list(words[seq(2, length(words), 2)])
resTab1 = data.frame(t(unlist(fields)))
resTab2 = data.frame(t(unlist(information)))
colnames(resTab2) = c(resTab1)
variable.names(resTab2)
When I try to create the table, this error appears:
colnames(resTab2) = c(resTab1)
Error in names(x) <- value :
'names' attribute [22] must be the same length as the vector [21]
This is what the lists fields and information look like:
Fields
> fields
[[1]]
[1] "Anrede"
[[2]]
[1] "Vorname"
[[3]]
[1] "Name"
[[4]]
[1] "Email (für Kontaktaufnahme)"
[[5]]
[1] "Telefon/Mobile (geschäftlich)"
[[6]]
[1] "Telefon/Mobile (privat)"
[[7]]
[1] "Strasse/Nr."
Information:
> information
[[1]]
[1] "Herr"
[[2]]
[1] "James"
[[3]]
[1] "Bond"
[[4]]
[1] "james.bond#email.com"
[[5]]
[1] "007 000 77 07"
[[6]]
[1] "007 000 77 07"
[[7]]
[1] "Lampenstrasse 8"
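The error message says that you are trying to assign 22 names (taken from resTab1) to resTab2, which has only 21 columns. The number of names must match the length of the object they are assigned to.
For example: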
x <- c(1,2)
y <- c("a","b","c")
names(x) <- y
#Error in names(x) <- y :
#'names' attribute [3] must be the same length as the vector [2]
EDIT:
Use unlist to flatten both lists, then assign the field names as the names of the information vector:
information <- unlist(information)
fields <- unlist(fields)
names(information) <- fields
information
#OUTPUT
#Anrede 'Herr'
#Vorname 'James'
#Name 'Bond'
#Email (für Kontaktaufnahme) 'james.bond#email.com'
#Telefon/Mobile (geschäftlich) '007 000 77 07'
#Telefon/Mobile (privat) '007 000 77 07'
#Strasse/Nr. 'Lampenstrasse 8'
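To get from the named vector to a one-row table, one possible sketch is shown below. It uses a shortened, hypothetical version of the two lists, checks that both have the same length before combining them (the mismatch behind the original error), and passes check.names = FALSE so that data.frame does not rewrite the column names. The name res_row is an assumption made for this example.

```r
# shortened stand-ins for the lists from the question
fields      <- list("Anrede", "Vorname", "Name")
information <- list("Herr", "James", "Bond")

# stop early if a field has no value or a value has no field
stopifnot(length(fields) == length(information))

# named character vector: values from information, names from fields
named_info <- setNames(unlist(information), unlist(fields))

# one row, one column per field
res_row <- data.frame(as.list(named_info),
                      check.names = FALSE,
                      stringsAsFactors = FALSE)
res_row
```

Rows built this way from further MSG files could then be stacked with rbind, provided every file yields the same fields.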

Accessing selected elements of a list of lists in R

I have a list of lists, subgame[[i]]$Weight, of this type:
[[1]]
[1] 0.4720550 0.4858826 0.4990469 0.5115899 0.5235512 0.5349672 0.5458720
[8] 0.5562970 0.5662715 0.5758226 0.5849754 0.5937532 0.6021778 0.6102692
[15] 0.6180462 0.6255260 0.6327250 0.6396582 0.6463397 0.6527826
[[2]]
[1] 0.4639948 0.4779027 0.4911519 0.5037834 0.5158356 0.5273443 0.5383429
[8] 0.5488623 0.5589313 0.5685767 0.5778233 0.5866943 0.5952111 0.6033936
[15] 0.6112605 0.6188291 0.6261153 0.6331344 0.6399002 0.6464260
[[3]]
[1] 0.4629488 0.4768668 0.4901266 0.5027692 0.5148329 0.5263534 0.5373639
[8] 0.5478953 0.5579764 0.5676339 0.5768926 0.5857755 0.5943041 0.6024984
[15] 0.6103768 0.6179568 0.6252543 0.6322844 0.6390611 0.6455976
What I am looking for is a way to access the j-th element of every list. For example, if j=1 I should get:
>0.4720550 0.4639948 0.4629488
How can I do it?
I found
sapply(1:length(subgame[[i]]$Weight),function(k) subgame[[i]]$Weight[[k]][1])
But this seems too tricky to me.
Is there a more elegant way?
If j=1, then you're interested in subgame[[i]]$Weight[[1]][1], subgame[[i]]$Weight[[2]][1], and subgame[[i]]$Weight[[3]][1]. In other words, you want to use [1] on each list element.
But what happens when you subset a vector? For example:
(x <- rnorm(5))
# [1] -1.8965529 0.4688618 0.6588774 0.2749539 0.1829046
x[3]
# [1] 0.6588774
[ is actually a function, and it gets called in this situation. You can read a bit more about it with ?"[", but the point is that you can call it like any other function. Its first argument will be the object to subset, then you can pass it the index (or indices) you're interested in (along with some other arguments that the help page discusses):
x[3]
# [1] 0.6588774
`[`(x, 3)
# [1] 0.6588774
Note the backticks surrounding the name. A bare [ will throw an error, so you need to quote it. The same goes for other functions like +.
So if you want to get the first element of each list element, you can apply [ to each element of the list, passing it 1 or whatever j is:
sapply(subgame[[i]]$Weight, `[`, 1)
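A slightly stricter variant of the same idea uses vapply, which additionally checks that every extracted value is a single number. The sketch below runs on a small made-up list named weights standing in for subgame[[i]]$Weight; j and jth are likewise names chosen for the example.

```r
# made-up stand-in for subgame[[i]]$Weight
weights <- list(c(0.47, 0.48), c(0.46, 0.47), c(0.45, 0.49))

j <- 1

# apply `[` to each element; numeric(1) declares one number per element,
# and j is passed on to `[` as the index
jth <- vapply(weights, `[`, numeric(1), j)
jth
```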
I would like to add a solution that returns the result you want for the Weight list of each element of your subgame list.
> subgame <- list(list(weight = list(c(1, 2), c(3, 4), c(5, 6))), list(weight = list(c(7, 8), c(9, 10), c(11, 12))))
>
> j = 1
>
> do.call(rbind, subgame[[1]]$weight)[,j]
[1] 1 3 5
>
> lapply(subgame, function(x) {do.call(rbind, x$weight)[,j]})
[[1]]
[1] 1 3 5
[[2]]
[1] 7 9 11
