In Python we can do this:
numbers = [1, 2, 3]
characters = ['foo', 'bar', 'baz']
for item in zip(numbers, characters):
    print(item[0], item[1])
(1, 'foo')
(2, 'bar')
(3, 'baz')
We can also unpack the tuple rather than using the index.
for num, char in zip(numbers, characters):
    print(num, char)
(1, 'foo')
(2, 'bar')
(3, 'baz')
How can we do the same using base R?
To do something like this in an R-native way, you'd use the idea of a data frame. A data frame has multiple variables which can be of different types, and each row is an observation of each variable.
d <- data.frame(numbers = c(1, 2, 3),
characters = c('foo', 'bar', 'baz'))
d
## numbers characters
## 1 1 foo
## 2 2 bar
## 3 3 baz
You then access each row using matrix notation, where leaving an index blank includes everything.
d[1,]
## numbers characters
## 1 1 foo
You can then loop over the rows of the data frame to do whatever you want. Presumably you want to do something more interesting than printing.
for(i in seq_len(nrow(d))) {
print(d[i,])
}
## numbers characters
## 1 1 foo
## numbers characters
## 2 2 bar
## numbers characters
## 3 3 baz
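As a minimal sketch of doing something other than printing inside such a loop, the example below builds one label per row by reading the columns by name. The labels vector is a name introduced here for illustration, and stringsAsFactors = FALSE is set only so the result is the same on older R versions.

```r
d <- data.frame(numbers = c(1, 2, 3),
                characters = c("foo", "bar", "baz"),
                stringsAsFactors = FALSE)

# Pre-allocate the result, then fill it one row at a time
labels <- character(nrow(d))
for (i in seq_len(nrow(d))) {
  labels[i] <- paste(d$numbers[i], d$characters[i])
}
labels
# [1] "1 foo" "2 bar" "3 baz"

# paste() is vectorized, so the loop-free form gives the same result
identical(labels, paste(d$numbers, d$characters))
# [1] TRUE
```

In practice the vectorized form is usually preferable; the loop is shown only because it mirrors the Python pattern.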
For another option, how about mapply, which is the closest analog to zip I can think of in R. Here I'm using the c function to make a new vector, but you could use any function you'd like:
numbers<- c(1, 2, 3)
characters<- c('foo', 'bar', 'baz')
mapply(c,numbers, characters, SIMPLIFY = FALSE)
[[1]]
[1] "1" "foo"
[[2]]
[1] "2" "bar"
[[3]]
[1] "3" "baz"
Which way is of most use depends on what you want to do with your output, but as the other answers mention, a dataframe is the most natural approach in R (and pandas dataframe probably in python).
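Note that mapply(c, ...) turns each pair into a character vector, which is why the numbers appear as "1", "2", "3" in the output. If the original types matter, a possible variation (a sketch, not the only way) is to combine with list instead of c. The name pairs is introduced here for illustration.

```r
numbers <- c(1, 2, 3)
characters <- c("foo", "bar", "baz")

# list() keeps each element's type; c() would coerce the pair to character
pairs <- mapply(list, numbers, characters, SIMPLIFY = FALSE)

for (p in pairs) {
  # p[[1]] is still numeric, so arithmetic works on it
  cat(p[[1]] * 10, p[[2]], "\n")
}
```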
To index a vector x in R, use x[1]; this returns the first element of the vector. R element numbering starts at 1, in contrast to Python, which starts at 0.
For this problem it would be:
x = seq(1,10)
j = seq(11,20)
for (i in 1:length(x)){
print (c(x[i],j[i]))
}
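One caveat with the 1:length(x) idiom is worth knowing: when x is empty, 1:length(x) is 1:0, which counts down instead of being empty. The sketch below illustrates this and shows seq_along as the usual safer choice. The iterations counter is a name introduced here for illustration.

```r
x <- numeric(0)          # an empty vector

1:length(x)              # counts down from 1 to 0, so a loop would run twice
# [1] 1 0
seq_along(x)             # empty, so a loop body would never run
# integer(0)

iterations <- 0
for (i in seq_along(x)) iterations <- iterations + 1
iterations
# [1] 0
```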
Many functions in R are vectorized and don't require loops:
numbers = c(1, 2, 3)
characters = c('foo', 'bar', 'baz')
myList <- list(numbers, characters)
myDF <- data.frame(numbers,characters, stringsAsFactors = F)
print(myList)
print(myDF)
This is the conceptual equivalent:
for (item in Map(list,numbers,characters)){ # though most of the time you would actually do all your work inside Map
print(item[c(1,2)])
}
# [[1]]
# [1] 1
#
# [[2]]
# [1] "foo"
#
# [[1]]
# [1] 2
#
# [[2]]
# [1] "bar"
#
# [[1]]
# [1] 3
#
# [[2]]
# [1] "baz"
Though most of the time you would actually do all your work inside Map and do something like this:
Map(function(nu,ch){print(data.frame(nu,ch))},numbers,characters)
This is the closest I could get to a clone:
zip <- function(...){ Map(list,...)}
print2 <- function(...){do.call(cat,c(list(...),"\n"))}
for (item in zip(numbers,characters)){
print2(item[[1]],item[[2]])
}
# 1 foo
# 2 bar
# 3 baz
To be able to call items by their names (indices still work):
zip <- function(...){
names <- sapply(substitute(list(...))[-1],deparse)
Map(function(...){setNames(list(...),names)}, ...)
}
for (item in zip(numbers,characters)){
print2(item[["numbers"]],item[["characters"]])
}
The tidyverse solution is to use the purrr::map2 function. For example:
library(purrr)
numbers <- c(1, 2, 3)
characters <- c('foo', 'bar', 'baz')
map2(numbers, characters, ~paste0(.x, ',', .y))
#[[1]]
#[1] "1,foo"
#[[2]]
#[1] "2,bar"
#[[3]]
#[1] "3,baz"
Another scalable alternative: store the vectors in a list and iterate over the index.
vect1 <- c('foo', 'bar', 'baz')
vect2 <- c('a', 'b', 'c')
idx_list <- list(vect1, vect2)
idx_vect <- seq_along(idx_list[[1]])
for (i in idx_vect) {
  x <- idx_list[[1]][i]
  j <- idx_list[[2]][i]
  print(c(i, x, j))
}
If I have a symmetric binary operator that I want to apply over the pairs of elements from a list, is there an easy way I can do this in R? I tried:
A <- list(1,2,3)
mapply(function(x,y) x+y, A,A)
but this only gives x[n]+y[n] for all n=1..N, whereas I want x[n]+y[m] for all m=1..n, n=1..N, returned as a list. outer(..) does that for m=1..N, n=1..N, which involves redundant computation, so I want to avoid it.
Note that I don't want a solution to this simple example only. I need a general solution that works for non-numeric input as well. What I'm trying to do is something like:
mapply(function(set_1, set_2) setequal(intersect(set_1, set_2), set_3), list_of_sets, list_of_sets)
In both cases the operators (addition and intersection) are symmetric. In the first example, I expect list(3,4,5) from list(1+2,1+3,2+3). For the second case my input list_of_sets is:
> list_of_sets
[[1]]
numeric(0)
[[2]]
[1] 1
[[3]]
[1] 2
[[4]]
[1] 1 2
[[5]]
[1] 3
[[6]]
[1] 1 3
[[7]]
[1] 2 3
[[8]]
[1] 1 2 3
and set_3 being c(1,2) as a simple example.
You may use outer -
values <- c(1, 2, 3)
outer(values, values, `+`)
# [,1] [,2] [,3]
#[1,] 2 3 4
#[2,] 3 4 5
#[3,] 4 5 6
outer also works for non-numeric input. If the function that you want to apply is not vectorised you can use Vectorize. Since OP did not provide an example I have created one of my own.
list_of_sets_1 <- list(c('a', 'b', 'c'), c('a'))
list_of_sets_2 <- list(c('a', 'c'), c('a', 'b'))
fun <- function(x, y) intersect(x, y)
result <- outer(list_of_sets_1, list_of_sets_2, Vectorize(fun))
result
We need combn to do pairwise computation without redundancy
combn(A, 2, FUN = function(x) x[[1]] + x[[2]], simplify = FALSE)
-output
[[1]]
[1] 3
[[2]]
[1] 4
[[3]]
[1] 5
This will also work with non-numeric elements
list_of_sets <- list(c('a', 'b', 'c'), "a", c("a", "c"))
combn(list_of_sets, 2, FUN = function(x) Reduce(intersect, x), simplify = FALSE)
-output
[[1]]
[1] "a"
[[2]]
[1] "a" "c"
[[3]]
[1] "a"
We may also do
combn(list_of_sets, 2, FUN = function(x)
setequal(intersect(x[[1]], x[[2]]), set_3), simplify = FALSE)
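When a result needs to be traced back to the pair that produced it, one option (a sketch, not part of the answers above) is to run combn over the positions rather than over the elements. The names idx and sums and the "i-j" labels are chosen here for illustration.

```r
A <- list(1, 2, 3)

# Each column of idx holds the positions of one unordered pair
idx <- combn(seq_along(A), 2)

# Apply the symmetric operator to each pair of positions
sums <- apply(idx, 2, function(p) A[[p[1]]] + A[[p[2]]])

# Label every result with the positions it came from
names(sums) <- paste(idx[1, ], idx[2, ], sep = "-")
sums
# 1-2 1-3 2-3 
#   3   4   5 
```

The same index matrix can be reused with any other pairwise function, for example the setequal/intersect test from the question.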
I'm looking for a simple way to check whether values in an R data frame contain a comma (or any other character, for that matter).
Let's suppose I have the following data frame:
df <- data.frame(A = c("apple","orange", "banana","strawberries"),
B = c(23,12,10,15),
C = c("2,53", "1.35","0,25","1,44"))
If I know the column with commas in it I use this:
which(grepl(",",df$C))
length(which(grepl(",",df$C)))
However, I want output like the above without specifying the column of my data frame.
Any suggestions?
You need to simply go through all three columns; sapply works here:
sapply(df, grep, pattern = ",")
##output:
# $A
# integer(0)
#
# $B
# integer(0)
#
# $C
# [1] 1 3 4
To get the length you can do this:
sapply(sapply(df, grep, pattern = ","), length)
# A B C
# 0 0 3
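A slightly more compact variation, sketched here under the assumption that only the count per column is needed, uses grepl with fixed = TRUE (so the comma is matched literally rather than as a regular expression) and vapply to guarantee an integer result. The name comma_counts is introduced for illustration.

```r
df <- data.frame(A = c("apple", "orange", "banana", "strawberries"),
                 B = c(23, 12, 10, 15),
                 C = c("2,53", "1.35", "0,25", "1,44"))

# Count, per column, how many values contain a literal comma
comma_counts <- vapply(df,
                       function(col) sum(grepl(",", col, fixed = TRUE)),
                       integer(1))
comma_counts
# A B C 
# 0 0 3 

# Names of the columns that contain at least one comma
names(comma_counts)[comma_counts > 0]
# [1] "C"
```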
A solution that is somewhat simpler to grasp: first, convert your data frame to a vector.
df2vector <- as.vector(t(df))
df2vector
# [1] "apple" "23" "2,53" "orange" "12"
# [6] "1.35" "banana" "10" "0,25" "strawberries"
# [11] "15" "1,44"
Then use your approach.
length(which(grepl(",",df2vector)))
# [1] 3
I have two lists x and y created from
x1 = list(c(1,2,3,4))
x2 = list(c(seq(1, 10, by = 2)))
x<- list(x1,x2)
x
[[1]]
[[1]][[1]]
[1] 1 2 3 4
[[2]]
[[2]][[1]]
[1] 1 3 5 7 9
and y,
y1 = list(c(5, 6, 7, 8))
y2 = list(c(9, 7, 5, 3, 1))
y <- list(y1, y2)
y
[[1]]
[[1]][[1]]
[1] 5 6 7 8
[[2]]
[[2]][[1]]
[1] 9 7 5 3 1
So basically, I want to match the elements of x against y, so that only '1 3 5 7 9' is reported as a match. I also need the indexes.
I want to match the values of each x[[ ]] against each y[[ ]] irrespective of position. I have tried:
Matches <- x[x %in% y]
IDX <- which(x %in% y)
This does not work....
I would like something that can return matches of the same elements irrespective of positions per each list. This would be a rough idea of what I need...
matches
[1] FALSE
[1] 1 3 5 7 9
Thanks in advance, appreciate all the help.
Here is what you can do:
You have made a list of lists, which is quite confusing to work with. You could have avoided this by using c, i.e. x <- c(x1, x2), to get a list of vectors, which is much easier to work with.
But since you provided a list of lists, I will work with that.
Now back to solving your question:
flags <- lapply(Map(`%in%`, unlist(x, recursive = F), unlist(y, recursive=F)),all)
k <- lapply(1:length(x), function(i)ifelse(unlist(flags)[i] == TRUE,
list(unlist(x, recursive=F)[[i]]),
unlist(flags[i])))
unlist(k, recursive = F) #Final Output
Logic:
First, Map applies %in% to each pair of vectors to check whether every element of the x vector occurs in the corresponding y vector, and all reduces each result to a single TRUE or FALSE. In your case this gives FALSE and TRUE respectively.
Next, we iterate over x using flags as the filter criterion to build another list k: where the flag from the previous step is TRUE, the contents of x are copied; where it is FALSE, the entry stays FALSE.
Finally, unlist k with recursive = F to turn it into a list of vectors.
Output:
# [[1]]
# [1] FALSE
# [[2]]
# [1] 1 3 5 7 9
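If "match irrespective of position" means that the two vectors should contain the same set of values, a shorter alternative is setequal. This is a swapped-in technique, not the approach of the answer above: setequal is symmetric, whereas all(x %in% y) only checks that x is contained in y. The names xf, yf and same_set are introduced here for illustration.

```r
x <- list(list(c(1, 2, 3, 4)), list(seq(1, 10, by = 2)))
y <- list(list(c(5, 6, 7, 8)), list(c(9, 7, 5, 3, 1)))

# Flatten one level so each element is a plain vector
xf <- unlist(x, recursive = FALSE)
yf <- unlist(y, recursive = FALSE)

# TRUE where the i-th vectors of x and y hold the same values
same_set <- mapply(setequal, xf, yf)
same_set
# [1] FALSE  TRUE

which(same_set)   # indexes of the matching pairs
# [1] 2
xf[same_set]      # the matching vectors themselves
```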
I can make this list by hand:
list( list(n=1) , list(n=2), list(n=3) )
But how do I automate this, for instance if I want n to go up to 10? I tried as.list(1:10), which firstly is a different type of data structure, and secondly I couldn't work out how to specify n.
I'm hoping the answer can be expanded to multiple element lists, e.g. all combinations of 1:3 and c('A','B'):
list( list(n=1,z='A') , list(n=2,z='A'), list(n=3,z='A'),
list(n=1,z='B') , list(n=2,z='B'), list(n=3,z='B') )
Background: I'll be using it along the lines of: lapply( outer_list, function(params) do.call(FUN,params) )
UPDATE:
It was difficult to choose which answer to give the tick to. I went with the expand.grid approach as it can scale to more than two parameters more easily; the use of mapply as shown in the comment makes the two examples above look reasonably compact and readable:
outer_list=with( expand.grid(n=1:10,stringsAsFactors=F),
mapply(list, n=n, SIMPLIFY=F)
)
outer_list=with( expand.grid(n=1:3,z=c('A','Z'), stringsAsFactors=F),
mapply(list, n=n, z=z, SIMPLIFY=F)
)
They violate the DRY principle, by repeating the parameter names in the mapply() call, which bothers me a little. So, when it bothers me enough I will use the alply call as shown in Sebastian's answer.
You don't need to expand using expand.grid.
L <- mapply(function(x, y) list("n"=x,"z"=y),
rep(1:10, each=10), LETTERS[1:10],
SIMPLIFY=FALSE)
EDIT (see comment below)
L <- mapply(function(x, y) list("n"=x,"z"=y),
rep(1:10, each=length(LETTERS[1:10])), LETTERS[1:10],
SIMPLIFY=FALSE)
vals <- expand.grid(n=1:3, z=c("A", "B"),
KEEP.OUT.ATTRS=FALSE, stringsAsFactors=FALSE)
library(plyr)
alply(vals, 1, as.list)
$`1`
$`1`$n
[1] 1
$`1`$z
[1] "A"
$`2`
$`2`$n
[1] 2
$`2`$z
[1] "A"
$`3`
$`3`$n
[1] 3
$`3`$z
[1] "A"
$`4`
$`4`$n
[1] 1
$`4`$z
[1] "B"
$`5`
$`5`$n
[1] 2
$`5`$z
[1] "B"
$`6`
$`6`$n
[1] 3
$`6`$z
[1] "B"
attr(,"split_type")
[1] "array"
attr(,"split_labels")
n z
1 1 A
2 2 A
3 3 A
4 1 B
5 2 B
6 3 B
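If plyr is not available, a base R sketch of the same idea is to turn each row of the expand.grid result into a list with lapply. This also avoids repeating the parameter names. The function f below is a hypothetical stand-in for FUN from the question.

```r
vals <- expand.grid(n = 1:3, z = c("A", "B"),
                    KEEP.OUT.ATTRS = FALSE, stringsAsFactors = FALSE)

# One list of named parameters per row of the grid
outer_list <- lapply(seq_len(nrow(vals)), function(i) as.list(vals[i, ]))

# Hypothetical function taking the parameters by name
f <- function(n, z) paste0(z, n)

results <- unlist(lapply(outer_list, function(params) do.call(f, params)))
results
# [1] "A1" "A2" "A3" "B1" "B2" "B3"
```

Unlike the alply result, this list carries no split_type or split_labels attributes.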
I have a list and I want to remove a single element from it. How can I do this?
I've tried looking up what I think the obvious names for this function would be in the reference manual and I haven't found anything appropriate.
If you don't want to modify the list in-place (e.g. for passing the list with an element removed to a function), you can use indexing: negative indices mean "don't include this element".
x <- list("a", "b", "c", "d", "e"); # example list
x[-2]; # without 2nd element
x[-c(2, 3)]; # without 2nd and 3rd
Also, logical index vectors are useful:
x[x != "b"]; # without elements that are "b"
This works with dataframes, too:
df <- data.frame(number = 1:5, name = letters[1:5])
df[df$name != "b", ]; # rows without "b"
df[df$number %% 2 == 1, ] # rows with odd numbers only
I don't know R at all, but a bit of creative googling led me here: http://tolstoy.newcastle.edu.au/R/help/05/04/1919.html
The key quote from there:
I do not find explicit documentation for R on how to remove elements from lists, but trial and error tells me
myList[[5]] <- NULL
will remove the 5th element and then "close up" the hole caused by deletion of that element. That shuffles the index values, so I have to be careful in dropping elements. I must work from the back of the list to the front.
A response to that post later in the thread states:
For deleting an element of a list, see R FAQ 7.1
And the relevant section of the R FAQ says:
... Do not set x[i] or x[[i]] to NULL, because this will remove the corresponding component from the list.
Which seems to tell you (in a somewhat backwards way) how to remove an element.
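The distinction the FAQ is drawing can be sketched as follows: assigning NULL removes a component, while assigning list(NULL) with single brackets keeps the slot and stores NULL in it. The list x here is an example made up for illustration.

```r
x <- list(a = 1, b = 2, c = 3)

# Keep the slot "b" but make its value NULL
x["b"] <- list(NULL)
length(x)
# [1] 3

# Remove the component "c" entirely
x[["c"]] <- NULL
length(x)
# [1] 2
names(x)
# [1] "a" "b"
```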
I would like to add that if it's a named list you can simply use within.
l <- list(a = 1, b = 2)
> within(l, rm(a))
$b
[1] 2
So you can overwrite the original list
l <- within(l, rm(a))
to remove element named a from list l.
Here is how to remove the last element of a list in R:
x <- list("a", "b", "c", "d", "e")
x[length(x)] <- NULL
If x might be a vector then you would need to create a new object:
x <- c("a", "b", "c", "d", "e")
x <- x[-length(x)]
This works for both lists and vectors.
Removing NULL elements from a list in a single line:
x <- x[!vapply(x, is.null, logical(1))]  # logical indexing also works when no element is NULL
Cheers
If you have a named list and want to remove a specific element you can try:
lst <- list(a = 1:4, b = 4:8, c = 8:10)
if("b" %in% names(lst)) lst <- lst[ - which(names(lst) == "b")]
This makes a list lst with elements a, b and c. The second line removes element b after checking that it exists (to avoid the problem @hjv mentioned).
or better:
lst$b <- NULL
This way it is not a problem to try to delete a non-existent element (e.g. lst$g <- NULL)
Use - (the negative sign) together with the position of the element. For example, if the 3rd element is to be removed, use your_list[-3].
Input
my_list <- list(a = 3, b = 3, c = 4, d = "Hello", e = NA)
my_list
# $`a`
# [1] 3
# $b
# [1] 3
# $c
# [1] 4
# $d
# [1] "Hello"
# $e
# [1] NA
Remove single element from list
my_list[-3]
# $`a`
# [1] 3
# $b
# [1] 3
# $d
# [1] "Hello"
# $e
# [1] NA
Remove multiple elements from list
my_list[c(-1,-3,-2)]
# $`d`
# [1] "Hello"
# $e
# [1] NA
my_list[c(-3:-5)]
# $`a`
# [1] 3
# $b
# [1] 3
my_list[-seq(1:2)]
# $`c`
# [1] 4
# $d
# [1] "Hello"
# $e
# [1] NA
There's the rlist package (http://cran.r-project.org/web/packages/rlist/index.html) to deal with various kinds of list operations.
Example (http://cran.r-project.org/web/packages/rlist/vignettes/Filtering.html):
library(rlist)
devs <-
list(
p1=list(name="Ken",age=24,
interest=c("reading","music","movies"),
lang=list(r=2,csharp=4,python=3)),
p2=list(name="James",age=25,
interest=c("sports","music"),
lang=list(r=3,java=2,cpp=5)),
p3=list(name="Penny",age=24,
interest=c("movies","reading"),
lang=list(r=1,cpp=4,python=2)))
list.remove(devs, c("p1","p2"))
Results in:
# $p3
# $p3$name
# [1] "Penny"
#
# $p3$age
# [1] 24
#
# $p3$interest
# [1] "movies" "reading"
#
# $p3$lang
# $p3$lang$r
# [1] 1
#
# $p3$lang$cpp
# [1] 4
#
# $p3$lang$python
# [1] 2
I don't know if you still need an answer to this, but from my limited experience with R (three weeks of self-teaching) I found that the NULL assignment can go wrong, especially when you are dynamically updating an object in something like a for-loop.
To be more precise, using
myList[[5]] <- NULL
can throw the error
myList[[5]] <- NULL : replacement has length zero
or
more elements supplied than there are to replace
Errors like these typically appear when myList is an atomic vector rather than a true list; on a genuine list the assignment simply removes the element. What I found to work more consistently is
myList <- myList[-5]
Just wanted to quickly add (because I didn't see it in any of the answers) that, for a named list, you can also do l["name"] <- NULL. For example:
l <- list(a = 1, b = 2, cc = 3)
l['b'] <- NULL
In the case of named lists I find these helper functions useful:
member <- function(list,names){
## return the elements of the list with the input names
member..names <- names(list)
index <- which(member..names %in% names)
list[index]
}
exclude <- function(list,names){
## return the elements of the list not belonging to names
member..names <- names(list)
index <- which(!(member..names %in% names))
list[index]
}
aa <- structure(list(a = 1:10, b = 4:5, fruits = c("apple", "orange"
)), .Names = c("a", "b", "fruits"))
> aa
## $a
## [1] 1 2 3 4 5 6 7 8 9 10
## $b
## [1] 4 5
## $fruits
## [1] "apple" "orange"
> member(aa,"fruits")
## $fruits
## [1] "apple" "orange"
> exclude(aa,"fruits")
## $a
## [1] 1 2 3 4 5 6 7 8 9 10
## $b
## [1] 4 5
Using lapply and grep:
lst <- list(a = 1:4, b = 4:8, c = 8:10)
# say you want to remove a and c
toremove<-c("a","c")
lstnew<-lst[-unlist(lapply(toremove, function(x) grep(x, names(lst)) ) ) ]
#or
pattern<-"a|c"
lstnew<-lst[-grep(pattern, names(lst))]
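One thing to keep in mind with grep is that it matches patterns, so a pattern like "a" also hits any other name that contains an a. The sketch below shows the difference from an exact match with %in%. The element name ab is a hypothetical addition made to expose the effect.

```r
lst <- list(a = 1:4, ab = 4:8, c = 8:10)

# Pattern match: "ab" contains "a", so it is removed as well
names(lst[-grep("a", names(lst))])
# [1] "c"

# Exact match on the name: only "a" is removed
names(lst[!(names(lst) %in% "a")])
# [1] "ab" "c"
```

Anchoring the pattern, as in grep("^a$", names(lst)), is another way to get an exact match while staying with grep.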
You can also negatively index from a list using the extract function of the magrittr package to remove a list item.
a <- seq(1,5)
b <- seq(2,6)
c <- seq(3,7)
l <- list(a,b,c)
library(magrittr)
extract(l,-1) #simple one-function method
[[1]]
[1] 2 3 4 5 6
[[2]]
[1] 3 4 5 6 7
There are a few options in the purrr package that haven't been mentioned:
pluck and assign_in work well with nested values and you can access it using a combination of names and/or indices:
library(purrr)
l <- list("a" = 1:2, "b" = 3:4, "d" = list("e" = 5:6, "f" = 7:8))
# select values (by name and/or index)
all.equal(pluck(l, "d", "e"), pluck(l, 3, "e"), pluck(l, 3, 1))
[1] TRUE
# or if element location stored in a vector use !!!
pluck(l, !!! as.list(c("d", "e")))
[1] 5 6
# remove values (modifies in place)
pluck(l, "d", "e") <- NULL
# assign_in to remove values with name and/or index (does not modify in place)
assign_in(l, list("d", 1), NULL)
$a
[1] 1 2
$b
[1] 3 4
$d
$d$f
[1] 7 8
Or you can remove values using list_modify by assigning zap() or NULL:
all.equal(list_modify(l, a = zap()), list_modify(l, a = NULL))
[1] TRUE
You can remove or keep elements using a predicate function with discard and keep:
# remove numeric elements
discard(l, is.numeric)
$d
$d$e
[1] 5 6
$d$f
[1] 7 8
# keep numeric elements
keep(l, is.numeric)
$a
[1] 1 2
$b
[1] 3 4
Here is a simple solution that can be done using base R. It removes the number 5 from the original list of numbers. You can use the same method to remove whatever element you want from a list.
#the original list
original_list = c(1:10)
#the list element to remove
remove = 5
#the new list (which will not contain whatever the `remove` variable equals)
new_list = c()
#go through all the elements in the list and add them to the new list if they don't equal the `remove` variable
counter = 1
for (n in original_list){
  if (n != remove){
    new_list[[counter]] = n
    counter = counter + 1
  }
}
The new_list variable no longer contains 5.
new_list
# [1] 1 2 3 4 6 7 8 9 10
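Since comparison operators are vectorized in R, the same result can be obtained without a loop. The following is a minimal sketch; the names original, to_remove and kept are chosen here for illustration.

```r
original <- 1:10
to_remove <- 5

# Keep every element that is not equal to the value being removed
kept <- original[original != to_remove]
kept
# [1]  1  2  3  4  6  7  8  9 10

# %in% extends this to several values at once
original[!(original %in% c(2, 5, 9))]
# [1]  1  3  4  6  7  8 10
```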
How about this? Again, using indices
> m <- c(1:5)
> m
[1] 1 2 3 4 5
> m[1:length(m)-1]
[1] 1 2 3 4
or
> m[-(length(m))]
[1] 1 2 3 4
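A note on why the first form works: the colon operator binds more tightly than the minus, so 1:length(m)-1 is evaluated as (1:5) - 1, giving the indices 0 to 4, and an index of 0 selects nothing. The sketch below makes that visible and shows head with a negative n as another way to drop the last element.

```r
m <- 1:5

# ':' is evaluated before '-', so this is (1:5) - 1
1:length(m) - 1
# [1] 0 1 2 3 4

# Index 0 is silently ignored, so the result still looks right
m[1:length(m) - 1]
# [1] 1 2 3 4

# head() with a negative n drops that many elements from the end
head(m, -1)
# [1] 1 2 3 4
```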
You can use which.
x<-c(1:5)
x
#[1] 1 2 3 4 5
x<-x[-which(x==4)]
x
#[1] 1 2 3 5
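A caveat with negating which: if nothing matches, which returns an empty index, and negating an empty index selects nothing, so every element is dropped. The sketch below shows this and the logical-index form that avoids it.

```r
x <- c(1, 2, 3, 5)

# No element equals 4, so which() is empty and the result is empty too
x[-which(x == 4)]
# numeric(0)

# A logical index leaves the vector unchanged when nothing matches
x[x != 4]
# [1] 1 2 3 5
```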
If you'd like to avoid numeric indices, you can use
a <- a[setdiff(names(a),c("name1", ..., "namen"))]
to delete the names name1 ... namen from a. This works for lists
> l <- list(a=1,b=2)
> l[setdiff(names(l),"a")]
$b
[1] 2
as well as for vectors
> v <- c(a=1,b=2)
> v[setdiff(names(v),"a")]
b
2