R how to merge 2 list [duplicate]

I'm trying to set the default value for a function parameter to a named numeric. Is there a way to create one in a single statement? I checked ?numeric and ?vector but it doesn't seem so. Perhaps I can convert/coerce a matrix or data.frame and achieve the same result in one statement? To be clear, I'm trying to do the following in one shot:
test = c( 1 , 2 )
names( test ) = c( "A" , "B" )

The setNames() function is made for this purpose. As described in Advanced R and ?setNames:
test <- setNames(c(1, 2), c("A", "B"))
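Since the question is about a default argument, here is a minimal sketch of setNames() used directly in a function signature. The function name scale_by and the parameter name weights are made up for illustration.

```r
# Hypothetical function: `weights` defaults to a named numeric
# that is built in a single expression.
scale_by <- function(x, weights = setNames(c(1, 2), c("A", "B"))) {
  x * weights  # the names of `weights` carry over to the result
}

res <- scale_by(10)
print(res)
```

The default is only evaluated when the caller does not supply weights, so passing another named vector works the same way.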

How about:
c(A = 1, B = 2)
A B
1 2

...as a side note, the structure function allows you to set ALL attributes, not just names:
structure(1:10, names=letters[1:10], foo="bar", class="myclass")
Which would produce
a b c d e f g h i j
1 2 3 4 5 6 7 8 9 10
attr(,"foo")
[1] "bar"
attr(,"class")
[1] "myclass"
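As a small sketch of how such attributes can be read back (the object name x is arbitrary, and the class is left out to keep printing simple): names survive subsetting with [, while a custom attribute such as foo does not.

```r
# Build a named vector with one extra, made-up attribute
x <- structure(1:3, names = c("a", "b", "c"), foo = "bar")

names(x)       # the names attribute
attr(x, "foo") # the custom attribute

# Subsetting keeps names but drops other attributes
sub <- x[1:2]
names(sub)
attr(sub, "foo")
```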

The convention for naming vector elements is the same as with lists. Here each name is a separate parameter with its own default:
newfunc <- function(A=1, B=2) { body} # two parameters; the formals are stored as a pairlist with two items
If instead you wanted a single parameter that is a named vector (the sort of function that would handle arguments supplied by apply):
newfunc <- function(params =c(A=1, B=2) ) { body} # a vector with two elements
If instead you wanted a single parameter that is a named list:
newfunc <- function(params =list(A=1, B=2) ) { body}
# a single parameter (with two elements in a list structure)

magrittr offers a nice and clean solution.
library(magrittr)
result = c(1,2) %>% set_names(c("A", "B"))
print(result)
A B
1 2
You can also use it to transform data.frames into vectors.
df = data.frame(value=1:10, label=letters[1:10])
vec = extract2(df, 'value') %>% set_names(df$label)
vec
a b c d e f g h i j
1 2 3 4 5 6 7 8 9 10
df
value label
1 1 a
2 2 b
3 3 c
4 4 d
5 5 e
6 6 f
7 7 g
8 8 h
9 9 i
10 10 j
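For comparison, a base R sketch of the same data.frame-to-named-vector step, assuming the same df with columns value and label; no packages are needed.

```r
df <- data.frame(value = 1:10, label = letters[1:10])

# Take one column as the values and another as the names
vec <- setNames(df$value, df$label)
print(vec)
```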

To expand upon @joran's answer (I couldn't get this to format correctly as a comment): If the named vector is assigned to a variable, the values of A and B are accessed via subsetting using the [ function. Use the names to subset the vector the same way you might use the index number to subset:
my_vector = c(A = 1, B = 2)
my_vector["A"] # subset by name
# A
# 1
my_vector[1] # subset by index
# A
# 1
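A few more cases, sketched with the same my_vector: several names at once, [[ to get the bare value without its name, and what happens with a name that is not present.

```r
my_vector <- c(A = 1, B = 2)

both    <- my_vector[c("B", "A")]  # several names, returned in the order asked for
bare    <- my_vector[["A"]]        # [[ drops the name and returns just the value
missing <- my_vector["C"]          # an unknown name gives NA rather than an error

print(both)
print(bare)
print(missing)
```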

Related

environment issue

e <<- data.env ## here i am storing my rdata
data_frames <- Filter(function(x) is.data.frame(get(x)), ls(envir = e)) ## getting only dataframe
for(i in data_frames) e[[i]] <<- mytest_function(e[[i]]) ### here i am iterating the dataframe
Now, how do I convert the for loop into an apply function? The loop takes so long to iterate.

OK, here is a basic demonstration. I think it is a good call to use apply here, especially because of the environment issues in loops and such.
# lets create some data.frames
df1 <- data.frame(x = LETTERS[1:3], y = rep(1:3))
df2 <- data.frame(x = LETTERS[4:6], y = rep(4:6))
# what df's are we going to "loop" over
data_frames <- c("df1", "df2")
# just some simple function to paste x and y from your df's to a new column z
mytest_function <- function(x) {
df <- get(x)
df$z <- paste(df$x, df$y)
df
}
# apply over your df's and call your function for every df
e <- lapply(data_frames, mytest_function)
# note that e will be a list with data.frames
e
[[1]]
x y z
1 A 1 A 1
2 B 2 B 2
3 C 3 C 3
[[2]]
x y z
1 D 4 D 4
2 E 5 E 5
3 F 6 F 6
# most of the time you want them combined
e <- do.call(rbind, e)
e
x y z
1 A 1 A 1
2 B 2 B 2
3 C 3 C 3
4 D 4 D 4
5 E 5 E 5
6 F 6 F 6

It's unclear what you want the result to be. However, if you just want to apply a function to each column in a data frame, you can use sapply.
sapply(df, function(x) mytest_function(x))
Or you can use the purrr package.
purrr::map(df, function(x) mytest_function(x)) %>%
as.data.frame
If you have a list of data frames and are applying a function to each data frame, then you can also use purrr.
library(purrr)
purrr::map(data_frames, mytest_function)

When I want to convert a loop into an apply function, I usually go for lapply, but it depends on the situation:
my_f <- function(x) {
mytest_function(e[[x]])
}
# iterate over the names: an environment can only be indexed by name, not by position
my_var <- lapply(data_frames, my_f)
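If the data frames have to stay inside the environment, as in the question, one possible sketch is to pull them out with mget(), transform them with lapply(), and write them back with list2env(). The environment contents and the function add_double below are invented for illustration; they stand in for data.env and mytest_function.

```r
# Stand-in for the environment holding the loaded data
e <- new.env()
e$df1 <- data.frame(x = 1:2)
e$df2 <- data.frame(x = 3:4)
e$not_df <- "skip me"

# Stand-in for mytest_function: takes a data frame, returns a data frame
add_double <- function(df) {
  df$y <- df$x * 2
  df
}

# Fetch every object as a named list and keep only the data frames
objs <- mget(ls(envir = e), envir = e)
dfs  <- Filter(is.data.frame, objs)

# Transform each one and write the results back under the same names
invisible(list2env(lapply(dfs, add_double), envir = e))

print(e$df1)
```

This only restructures the loop; whether it is faster depends on how expensive the function itself is.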

Using letters in function argument

I'm trying to create a function that maps a character symbol to a defined value, like this:
"a" to 1
"b" to 2
"c" to 3
The function takes only one input argument (one of "a", "b" or "c"), like this: function(x). For example, function("a") returns 1.

We can use match against the built-in constant vector letters:
f1 <- function(arg1){
match(arg1, letters)
}
f1('a')
#[1] 1
f1('b')
#[1] 2
f1(c('a', 'b', 'c'))
#[1] 1 2 3

letterToNumber <- function(x){
which(x == letters)}
sapply(letters[1:10], letterToNumber)
a b c d e f g h i j
1 2 3 4 5 6 7 8 9 10

You can create a dictionary like structure by using a named vector.
f <- function(x)
{
dict <- setNames(seq_along(letters),letters)
unname(dict[x])
}
f("a")
[1] 1
f("g")
[1] 7
f(c("a","z"))
[1] 1 26

This will be faster than the other solutions, but it won't fail if you feed it something other than a lower case letter; it silently returns some other number instead:
foo <- function(x) utf8ToInt(x) - 96L
foo("m")
#> [1] 13
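To make that difference concrete, here is a short sketch comparing the match() and utf8ToInt() approaches on an upper case input, plus a hypothetical checked variant (foo_checked is not from the answers above) that stops on anything outside letters.

```r
f1  <- function(x) match(x, letters)
foo <- function(x) utf8ToInt(x) - 96L

na_result  <- f1("A")   # match() finds nothing and returns NA
odd_result <- foo("A")  # utf8ToInt("A") is 65, so this is 65 - 96

# Hypothetical variant: validate the input before converting
foo_checked <- function(x) {
  stopifnot(is.character(x), length(x) == 1L, x %in% letters)
  utf8ToInt(x) - 96L
}

print(na_result)
print(odd_result)
print(foo_checked("m"))
```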

Assign results of apply to multiple columns of data frame

I would like to process all rows in data frame df by applying function f to every row. Since f returns a numeric vector with two elements, I would like to assign the individual elements to new columns in df.
Here is a sample df, a trivial function f returning two elements, and my attempt using apply:
df <- data.frame(a = 1:3, b = 3:5)
f <- function (a, b) {
c(a + b, a * b)
}
df[, c('apb', 'amb')] <- apply(df, 1, function(x) f(a = x[1], b = x[2]))
This does not work; the results are filled in column by column:
> df
a b apb amb
1 1 3 4 8
2 2 4 3 8
3 3 5 6 15

You could also use Reduce instead of apply, as it is generally more efficient. You just need to slightly modify your function to use cbind instead of c:
f <- function (a, b) {
cbind(a + b, a * b) # modified to use `cbind` instead of `c`
}
df[c('apb', 'amb')] <- Reduce(f, df)
df
# a b apb amb
# 1 1 3 4 3
# 2 2 4 6 8
# 3 3 5 8 15
Note: This will only work nicely if you have only two columns (as in your example), so if you have more columns in your data set, run this only on a subset.

You need to transpose the apply result to get what you want:
df[, c('apb', 'amb')] <- t(apply(df, 1, function(x) f(a = x[1], b = x[2])))
> df
a b apb amb
1 1 3 4 3
2 2 4 6 8
3 3 5 8 15
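Another option, sketched here as an alternative rather than as part of the answers above, is mapply(), which passes the columns to f element by element. Like apply(), it returns one column per call, so the result still needs t().

```r
df <- data.frame(a = 1:3, b = 3:5)
f <- function(a, b) {
  c(a + b, a * b)
}

# mapply() calls f(a[i], b[i]) for each row; each call fills one column
res <- mapply(f, df$a, df$b)

# Transpose so that each row of the result lines up with a row of df
df[c("apb", "amb")] <- t(res)
print(df)
```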

Extract subset of data

Ok, I have a matrix of values with certain identifiers, such as:
A 2
B 3
C 4
D 5
E 6
F 7
G 8
I would like to pull out a subset of these values (using R) based on a list of the identifiers ("B", "D", "E") for example, so I would get the following output:
B 3
D 5
E 6
I'm sure there's an easy way to do this (some sort of apply?) but I can't seem to figure it out. Any ideas? Thanks!

If the letters are the row names, then you can just use this:
m <- matrix(2:8, dimnames = list(LETTERS[1:7], NULL))
m[c("B","D","E"),]
# B D E
# 3 5 6
Note that there is a subtle but very important difference between: m[c("B","D","E"),] and m[rownames(m) %in% c("B","D","E"),]. Both return the same rows, but not necessarily in the same order.
The former uses the character vector c("B","D","E") as an index into m. As a result, the rows will be returned in the order of the character vector. For instance:
# result depends on order in c(...)
m[c("B","D","E"),]
# B D E
# 3 5 6
m[c("E","D","B"),]
# E D B
# 6 5 3
The second method, using %in%, creates a logical vector with length = nrow(m). For each element, that element is T if the row name is present in c("B","D","E"), and F otherwise. Indexing with a logical vector returns rows in the original order:
# result does NOT depend on order in c(...)
m[rownames(m) %in% c("B","D","E"),]
# B D E
# 3 5 6
m[rownames(m) %in% c("E","D","B"),]
# B D E
# 3 5 6
This is probably more than you wanted to know...

Your matrix:
> m <- matrix(2:8, dimnames = list(LETTERS[1:7]))
You can use %in% to select the desired rows. If the original matrix has only a single column, using drop = FALSE will keep the matrix structure. Without it, the result will be converted to a named vector.
> m[rownames(m) %in% c("B", "D", "E"), , drop = FALSE]
# [,1]
# B 3
# D 5
# E 6
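One case neither approach above covers directly is keeping the requested order while tolerating identifiers that are not in the matrix (indexing by a missing row name is an error). A possible sketch, with ids as a made-up request list:

```r
m <- matrix(2:8, dimnames = list(LETTERS[1:7], NULL))

# "Z" is not a row name; m["Z", ] would fail with a subscript error
ids <- c("E", "B", "Z")

# intersect() keeps the order of `ids` and drops the unknown ones
found <- intersect(ids, rownames(m))
res <- m[found, , drop = FALSE]
print(res)
```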

Difference between `names(df[1]) <- ` and `names(df)[1] <- `

Consider the following:
df <- data.frame(a = 1, b = 2, c = 3)
names(df[1]) <- "d" ## First method
## a b c
##1 1 2 3
names(df)[1] <- "d" ## Second method
## d b c
##1 1 2 3
Neither method returned an error, but the first didn't change the column name, while the second did.
I thought it had something to do with the fact that I'm operating only on a subset of df, but then why does the following work fine?
df[1] <- 2
## a b c
##1 2 2 3

What I think is happening is that replacement into a data frame ignores the attributes of the data frame that is drawn from. I am not 100% sure of this, but the following experiments appear to back it up:
df <- data.frame(a = 1:3, b = 5:7)
# a b
# 1 1 5
# 2 2 6
# 3 3 7
df2 <- data.frame(c = 10:12)
# c
# 1 10
# 2 11
# 3 12
df[1] <- df2[1] # in this case `df[1] <- df2` is equivalent
Which produces:
# a b
# 1 10 5
# 2 11 6
# 3 12 7
Notice how the values changed for df, but not the names. Basically the replacement operator `[<-` only replaces the values. This is why the name was not updated. I believe this explains all the issues.
In the scenario:
names(df[2]) <- "x"
You can think of the assignment as follows (this is a simplification, see end of post for more detail):
tmp <- df[2]
# b
# 1 5
# 2 6
# 3 7
names(tmp) <- "x"
# x
# 1 5
# 2 6
# 3 7
df[2] <- tmp # `tmp` has "x" for names, but it is ignored!
# a b
# 1 10 5
# 2 11 6
# 3 12 7
The last step of which is an assignment with `[<-`, which doesn't respect the names attribute of the RHS.
But in the scenario:
names(df)[2] <- "x"
you can think of the assignment as (again, a simplification):
tmp <- names(df)
# [1] "a" "b"
tmp[2] <- "x"
# [1] "a" "x"
names(df) <- tmp
# a x
# 1 10 5
# 2 11 6
# 3 12 7
Notice how we directly assign to names, instead of assigning to df which ignores attributes.
df[2] <- 2
works because we are assigning directly to the values, not the attributes, so there are no problems here.
EDIT: based on some commentary from @AriB.Friedman, here is a more elaborate version of what I think is going on (note I'm omitting the S3 dispatch to `[.data.frame`, etc., for clarity):
Version 1 names(df[2]) <- "x" translates to:
df <- `[<-`(
df, 2,
value=`names<-`( # `names<-` here returns a re-named one column data frame
`[`(df, 2),
value="x"
) )
Version 2 names(df)[2] <- "x" translates to:
df <- `names<-`(
df,
`[<-`(
names(df), 2, "x"
) )
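The two translations can be run as ordinary function calls to check the claim. This is only a sketch: the results go into new variables v1 and v2 (names chosen here) so that df itself stays unchanged.

```r
df <- data.frame(a = 1:3, b = 5:7)

# Version 1: rename a one-column copy, then put it back with `[<-`
v1 <- `[<-`(
  df, 2,
  value = `names<-`(`[`(df, 2), value = "x")
)

# Version 2: change the names vector, then assign it with `names<-`
v2 <- `names<-`(
  df,
  `[<-`(names(df), 2, "x")
)

print(names(v1))  # the existing column name is kept
print(names(v2))  # the second name is replaced
```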
Also, it turns out this is "documented" in R Inferno Section 8.2.34 (thanks @Frank):
right <- wrong <- c(a=1, b=2)
names(wrong[1]) <- 'changed'
wrong
# a b
# 1 2
names(right)[1] <- 'changed'
right
# changed b
# 1 2
