Paste column values together in a data frame - r

I am trying to paste together the row name along with the data in the desired columns. I wrote the following code but could not find a way to do it correctly.
The desired output will be: "a,1,11" "b,2,22" "c,3,33"
x = data.frame(cbind(f1 = c(1,2,3), f2 = c(5,6,7), f3=c(11,22,33)), row.names= c('a','b','c'))
x
# f1 f2 f3
# a 1 5 11
# b 2 6 22
# c 3 7 33
do.call("paste", c(rownames(x), x[c('f1','f3')], sep=","))
# [1] "a,b,c,1,11" "a,b,c,2,22" "a,b,c,3,33"

Two main points:
Use apply instead of do.call(paste, .)
Use cbind instead of c in this case.
If you would rather use c, you would need to coerce the row names to a list or column first, eg: c(list(rownames(x)), x)
Try the following:
apply(cbind(rownames(x), x[c('f1','f3')]), 1, paste, collapse=",")
a b c
"a,1,11" "b,2,22" "c,3,33"

Your do.call instructs R to paste the list c(rownames(x), x[c('f1','f3')]) together. But take a look at your list.
> c(rownames(x), x[c('f1','f3')])
[[1]]
[1] "a"
[[2]]
[1] "b"
[[3]]
[1] "c"
$f1
[1] 1 2 3
$f3
[1] 11 22 33
The c command takes the elements of each argument and joins them together. This properly deconstructs x[c('f1','f3')] but also deconstructs rownames(x) in a way you don't want. Obeying the standard recycling rule, paste then takes an item from each list element and patches them together with sep=",".
You could fix this by encapsulating rownames(x) inside a list structure so that your list of arguments comes out properly:
do.call("paste", c(list(rownames(x)), x[c('f1','f3')], sep=","))
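A minimal sketch of the difference, reusing the question's data frame (the names flat and wrapped are only illustrative). Without list(), c() spreads the three row names over three list elements; with list(), they stay together as one argument for paste:

```r
x <- data.frame(f1 = c(1, 2, 3), f2 = c(5, 6, 7), f3 = c(11, 22, 33),
                row.names = c("a", "b", "c"))

# Row names are split into three separate list elements
flat <- c(rownames(x), x[c("f1", "f3")])

# Row names are kept together as a single list element
wrapped <- c(list(rownames(x)), x[c("f1", "f3")])

length(flat)     # 5 arguments would be handed to paste
length(wrapped)  # 3 arguments: row names, f1, f3

res <- do.call(paste, c(wrapped, sep = ","))
res
# [1] "a,1,11" "b,2,22" "c,3,33"
```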

No need for do.call or apply:
paste(rownames(x),x[[1]],x[[3]] , sep=",")
[1] "a,1,11" "b,2,22" "c,3,33"
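One reason to prefer pasting the columns directly: apply() converts a data frame with as.matrix() first, and for mixed column types that formats each numeric column to a common width. The sketch below uses a hypothetical data frame d (not from the question) whose numbers have different widths, so the padding becomes visible:

```r
# Hypothetical data: numbers of different widths next to a character column
d <- data.frame(n = c(1, 10), s = c("x", "y"), stringsAsFactors = FALSE)

# apply() goes through as.matrix(), which pads the 1 to the width of the 10
via_apply <- unname(apply(d, 1, paste, collapse = ","))
via_apply
# [1] " 1,x" "10,y"

# paste() on the columns converts each value on its own, without padding
via_paste <- paste(d$n, d$s, sep = ",")
via_paste
# [1] "1,x"  "10,y"
```

In the question's data all values in a column have the same width, so both routes happen to agree there.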

Related

Split dataframe columns into vectors in R

I have a dataframe as such:
Number <- c(1,2,3)
Number2 <- c(10,12,14)
Letter <- c("A","B","C")
df <- data.frame(Number,Number2,Letter)
I would like to split the df into its respective three columns, each one becoming a vector with the respective column name. In essence, the output should look exactly like the original three input vectors in the above example.
I have tried the split function and also using for loop, but without success.
Any ideas? Thank you.
We may use unclass, since a data.frame is a list with additional attributes. Unclassing removes the data.frame class attribute.
unclass(df)
Or another option is asplit with MARGIN specified as 2
asplit(df, 2)
NOTE: Both of them return a named list. If we intend to create new objects in the global env, use list2env (not recommended though)
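One difference between the two is worth checking before choosing: asplit converts the data frame to a matrix first, so with mixed column types every piece comes back as character, while unclass keeps each column's original type. A short sketch with the question's data (by_unclass and by_asplit are illustrative names):

```r
df <- data.frame(Number = c(1, 2, 3),
                 Number2 = c(10, 12, 14),
                 Letter = c("A", "B", "C"))

by_unclass <- unclass(df)    # plain list, original column types kept
by_asplit  <- asplit(df, 2)  # goes through as.matrix(), one common type

is.numeric(by_unclass[[1]])   # TRUE
is.character(by_asplit[[1]])  # TRUE: the numbers are now strings
```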
We can use c or as.list
> c(df)
$Number
[1] 1 2 3
$Number2
[1] 10 12 14
$Letter
[1] "A" "B" "C"
> as.list(df)
$Number
[1] 1 2 3
$Number2
[1] 10 12 14
$Letter
[1] "A" "B" "C"
Assuming you are trying to create these as vectors in the global environment, use list2env:
df <- data.frame(Number = c(1, 2, 3),
Number2 = c(10, 12, 14),
Letter = c("A", "B", "C"))
list2env(df, .GlobalEnv)
## <environment: R_GlobalEnv>
ls()
## [1] "df" "Letter" "Number" "Number2"
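If cluttering the global environment is a concern, the same call can target a fresh environment instead. This is a sketch under that assumption; the name vars is made up for the example:

```r
df <- data.frame(Number = c(1, 2, 3),
                 Number2 = c(10, 12, 14),
                 Letter = c("A", "B", "C"))

# Copy each column into a new, separate environment
vars <- list2env(df, envir = new.env())

ls(vars)      # "Letter" "Number" "Number2"
vars$Number   # 1 2 3
```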
list2env is clearly the easiest way, but if you want to do it with a for loop it can also be achieved.
The "tricky" part is to make a new vector based on the column names inside the for loop. If you just write
names(df[i]) <- input
a vector will not be created.
A workaround is to use paste to create a string with the new vector name and what should be in it, then use eval(parse(text = ...)) to evaluate this expression.
Maybe not the most elegant solution, but seems to work.
for (i in colnames(df)){
vector_name <- names(df[i])
expression_to_be_evaluated <- paste(vector_name, "<- df[[i]]")
eval(parse(text=expression_to_be_evaluated))
}
> Letter
[1] A B C
Levels: A B C
> Number
[1] 1 2 3
> Number2
[1] 10 12 14
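The same loop can be written without building code as a string, since assign() creates a variable from a name held in a character value. A sketch, writing into a separate environment called target (an illustrative choice; envir = .GlobalEnv would mimic the loop above):

```r
df <- data.frame(Number = c(1, 2, 3),
                 Number2 = c(10, 12, 14),
                 Letter = c("A", "B", "C"))

target <- new.env()
for (nm in names(df)) {
  # Create a variable called nm holding the column of the same name
  assign(nm, df[[nm]], envir = target)
}

get("Number2", envir = target)
# [1] 10 12 14
```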

How to use grep to search for patterns matches within a list of data frames using a second list of character vectors in R

I have two lists in R. One is a list of data frames with rows that contain strings (List 1). The other is a list (of the same length) of characters (List 2). I would like to go through the lists in a parallel fashion taking the character string from List 2 and searching for it to get its position (using grep) in the data frame at the corresponding element in List 1. Here is a toy example to show what my lists look like:
List1 <- list(data.frame(a = c("other","other","dog")),
data.frame(a = c("cat","other","other")),
data.frame(a = c("other","other","bird")))
List2 <- list("a" = c("dog|xxx|xxx"),
"a" = c("cat|xxx|xxx"),
"a" = c("bird|xxx|xxx"))
The output I would like to get would be a list of the position in each data frame in List 1 of the pattern match i.e. in this example the positions would be 3, 1 & 3. So the list would be:
[[1]]
[1] 3
[[2]]
[1] 1
[[3]]
[1] 3
I cannot seem to figure out how to do this.
I tried lapply:
NewList1 <- lapply(1:length(List1),
function(x) grep(List2[[x]]))
But that does not work. I also tried purrr::map2:
NewList2<-map2(List2, List1, grep(List2$A, List1))
This also does not work. I would be very grateful for any suggestions as to how to fix this. Many thanks to anyone willing to wade in!
Try Map + unlist
> Map(grep, List2, unlist(List1, recursive = FALSE))
$a
[1] 3
$a
[1] 1
$a
[1] 3
Using Map you can do -
Map(function(x, y) grep(y, x$a), List1, List2)
#[[1]]
#[1] 3
#[[2]]
#[1] 1
#[[3]]
#[1] 3
The map2 attempt was close, but you need to refer to the lists as .x and .y in the function.
purrr::map2(List2, List1, ~grep(.x, .y$a))
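For comparison, the lapply attempt from the question can also be repaired: grep needs both the pattern and the vector to search, so the index has to be used on both lists. A sketch with the question's toy data (positions is an illustrative name):

```r
List1 <- list(data.frame(a = c("other", "other", "dog")),
              data.frame(a = c("cat", "other", "other")),
              data.frame(a = c("other", "other", "bird")))
List2 <- list(a = "dog|xxx|xxx",
              a = "cat|xxx|xxx",
              a = "bird|xxx|xxx")

# Walk both lists by position: pattern from List2, column a from List1
positions <- lapply(seq_along(List1),
                    function(i) grep(List2[[i]], List1[[i]]$a))
positions
# [[1]]
# [1] 3
# [[2]]
# [1] 1
# [[3]]
# [1] 3
```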

paste character vector as a comma separated, unquoted list in R

I have a character vector that looks like this
vector <- c('a','b','c','d','e')
I have an object in a for-loop that takes input as:
out[a,] <- c(a,b,c,d,e)
Where a-e are variables with values (for instance, a=0.7). I would like to feed the out object some transformed version of the vector object. I've tried
paste(noquote(vector),collapse=',')
However, this just returns
"a,b,c,d,e"
Which is still not useful.
Reverse the order of the function calls:
noquote(paste(vector, collapse = ','))
This will print [1] a,b,c,d,e. If you don't like the [1] use
cat(paste(vector, collapse = ','))
which prints
a,b,c,d,e
You can use mget to put objects into a named list:
# data
a <- 1; b <- 2; c <- 3; d <- 4; e <- 5
mget(letters[1:5])
$a
[1] 1
$b
[1] 2
$c
[1] 3
$d
[1] 4
$e
[1] 5
or wrap mget in unlist to get a named vector:
unlist(mget(letters[1:5]))
a b c d e
1 2 3 4 5
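Applied to the question's setting, this could look like the sketch below. The values of a to e, the shape of out, and the row index are assumptions made for the example; only the vector of names comes from the question:

```r
vector <- c("a", "b", "c", "d", "e")

# Hypothetical values for the variables named in `vector`
a <- 0.7; b <- 0.2; c <- 1.5; d <- 3; e <- 4

# Hypothetical result matrix with one column per variable
out <- matrix(NA_real_, nrow = 2, ncol = length(vector),
              dimnames = list(NULL, vector))

# Look the variables up by name and store their values in row 1
out[1, ] <- unlist(mget(vector, envir = environment()))
out[1, ]
#   a   b   c   d   e
# 0.7 0.2 1.5 3.0 4.0
```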
This is a very basic question, but a tiny mistake tripped me up almost every time. I simplified it and wrote a function in R.
Here you go buddy!
numbers <- list(2,5,8,9,14,20) #List containing even odd numbers
en<-list() #Initiating even numbers’ list
on<-list() #Initiating odd numbers’ list
#Function creation
separate <- function(x){
for (i in x)
{
ifelse((i%%2)==0, en <- paste(append(en,i, length(en)+1), collapse = ","),
on <- paste(append(on,i, length(on)+1), collapse = ","))
}
message("Even numbers are : ", en)
message("Odd numbers are : ", on)
}
#Passing the function with argument
separate(numbers)
Result!
Even numbers are : 2,8,14,20
Odd numbers are : 5,9
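A vectorised sketch of the same idea, assuming the numbers can be held in a plain numeric vector rather than a list: logical indexing picks out the even and odd values, and paste with collapse joins each group once.

```r
numbers <- c(2, 5, 8, 9, 14, 20)

# Split by remainder after division by 2
even <- numbers[numbers %% 2 == 0]
odd  <- numbers[numbers %% 2 != 0]

even_txt <- paste(even, collapse = ",")
odd_txt  <- paste(odd, collapse = ",")

message("Even numbers are : ", even_txt)  # Even numbers are : 2,8,14,20
message("Odd numbers are : ", odd_txt)    # Odd numbers are : 5,9
```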

Subset different vector elements within a list

Assume I have this list of vectors:
mylist <- list(a=1:3,b=4:1,c=1:5)
mylist
$a
[1] 1 2 3
$b
[1] 4 3 2 1
$c
[1] 1 2 3 4 5
I want to get the last or the max element of each vector like this for the last element:
$a
[1] 3
$b
[1] 1
$c
[1] 5
What I have tried so far:
First use lapply and the length function to get the last element index and then subset:
last <- unlist(lapply(mylist, length))
lapply(mylist,"[", last) # not working
Then I tried to use sapply with lapply. This is working, but I'm not sure whether this is generally valid. There must be a better base R solution (without loops!).
mymatrix <- sapply(last, function(x) lapply(mylist, "[",x))
diag(mymatrix)
$a
[1] 3
$b
[1] 1
$c
[1] 5
(Making this a CV as there were many contributions here and it is worth summing them up)
If you have some function you want to apply on your list, a simple lapply should do, such as
lapply(mylist, max) # retrieving the maximum values
Or
lapply(mylist, tail, 1) # retrieving the last values (by @docendo)
If you want to operate on two vectors simultaneously, you could use mapply or Map
Map(`[`, mylist, lengths(mylist)) # A Map version of @docendo's lapply suggestion
Or per your newest request
Map(`[`, mylist, 1:3)
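If a plain named vector is more convenient than a list, vapply can do the same job while also checking that each result is a single integer. A sketch using the question's list (last_vals and max_vals are illustrative names):

```r
mylist <- list(a = 1:3, b = 4:1, c = 1:5)

# Last element of each vector, returned as a named integer vector
last_vals <- vapply(mylist, function(v) v[length(v)], integer(1))
last_vals
# a b c
# 3 1 5

# Maximum of each vector
max_vals <- vapply(mylist, max, integer(1))
max_vals
# a b c
# 3 4 5
```

The original attempt, lapply(mylist, "[", last), fails because the whole last vector is passed to every element instead of one index per element.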

Apply function to corresponding elements in list of data frames

I have a list of data frames in R. All of the data frames in the list are of the same size. However, the elements may be of different types. For example,
I would like to apply a function to corresponding elements of data frame. For example, I want to use the paste function to produce a data frame such as
"1a" "2b" "3c"
"4d" "5e" "6f"
Is there a straightforward way to do this in R? I know it is possible to use the Reduce function to apply a function on corresponding elements of data frames within lists. But using the Reduce function in this case does not seem to have the desired effect.
Reduce(paste,l)
Produces:
"c(1, 4) c(\"a\", \"d\")" "c(2, 5) c(\"b\", \"e\")" "c(3, 6) c(\"c\", \"f\")"
Wondering if I can do this without writing messy for loops. Any help is appreciated!
Instead of Reduce, use Map.
# not quite the same as your data
l <- list(data.frame(matrix(1:6,ncol=3)),
data.frame(matrix(letters[1:6],ncol=3), stringsAsFactors=FALSE))
# this returns a list
LL <- do.call(Map, c(list(f=paste0),l))
#
as.data.frame(LL)
# X1 X2 X3
# 1 1a 3c 5e
# 2 2b 4d 6f
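To get the layout asked for in the question ("1a" "2b" "3c" in the first row), the example data can be filled row-wise. This is a sketch with made-up data frames built via byrow = TRUE; the question's actual data is not shown, so the construction is an assumption:

```r
l <- list(data.frame(matrix(1:6, ncol = 3, byrow = TRUE)),
          data.frame(matrix(letters[1:6], ncol = 3, byrow = TRUE),
                     stringsAsFactors = FALSE))

# Map pairs up column 1 with column 1, column 2 with column 2, and so on
LL <- do.call(Map, c(list(f = paste0), l))
res <- as.data.frame(LL, stringsAsFactors = FALSE)
res
#   X1 X2 X3
# 1 1a 2b 3c
# 2 4d 5e 6f
```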
To explain @mnel's excellent answer a bit more, consider the simple example of summing the corresponding elements of two vectors:
Map(sum,1:3,4:6)
[[1]]
[1] 5 # sum(1,4)
[[2]]
[1] 7 # sum(2,5)
[[3]]
[1] 9 # sum(3,6)
Map(sum,list(1:3,4:6))
[[1]]
[1] 6 # sum(1:3)
[[2]]
[1] 15 # sum(4:6)
Why the second one is the case might be made more obvious by adding a second list, like:
Map(sum,list(1:3,4:6),list(0,0))
[[1]]
[1] 6 # sum(1:3,0)
[[2]]
[1] 15 # sum(4:6,0)
Now, the next is more tricky. As the help page ?do.call states:
‘do.call’ constructs and executes a function call from a name or a
function and a list of arguments to be passed to it.
So, doing:
do.call(Map,c(sum,list(1:3,4:6)))
calls Map with the inputs of the list c(sum,list(1:3,4:6)), which looks like:
[[1]] # first argument to Map
function (..., na.rm = FALSE) .Primitive("sum") # the 'sum' function
[[2]] # second argument to Map
[1] 1 2 3
[[3]] # third argument to Map
[1] 4 5 6
...and which is therefore equivalent to:
Map(sum, 1:3, 4:6)
Looks familiar! It is equivalent to the first example at the top of this answer.
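The equivalence can be checked directly. A small sketch (via_do_call and direct are illustrative names):

```r
# Build the call from a list of arguments: the function first, then the inputs
via_do_call <- do.call(Map, c(sum, list(1:3, 4:6)))

# Spell the same call out by hand
direct <- Map(sum, 1:3, 4:6)

identical(via_do_call, direct)
# [1] TRUE
```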
