My data frame:
df <- structure(list(g1 = 1:12, g2 = c(3L, 4L, 5L, 6L, 7L, 8L, 9L,
10L, 11L, 12L, 13L, 67L)), class = "data.frame", row.names = c(NA,
-12L))
I would like to combine the two columns of my data frame row by row and get a list of vectors.
What I want to get:
list(c("1","3"),c("2","4"),c("3","5"),c("4","6"),c("5","7"),c("6","8"),c("7","9"),c("8","10"),c("9","11"),c("10","12"),c("11","13"),c("12","67"))
What I tried:
mark.list <- list()
for(i in 1:length(bact$group1)){
x <- bact$group1[i]
y <- bact$group2[i]
df <- combn(paste(x,y))
mark.list <- c(mark.list,df)
}
Another base R option
> unclass(data.frame(t(df)))
$X1
[1] 1 3
$X2
[1] 2 4
$X3
[1] 3 5
$X4
[1] 4 6
$X5
[1] 5 7
$X6
[1] 6 8
$X7
[1] 7 9
$X8
[1] 8 10
$X9
[1] 9 11
$X10
[1] 10 12
$X11
[1] 11 13
$X12
[1] 12 67
attr(,"row.names")
[1] "g1" "g2"
If you want to have output with characters, you can try
> strsplit(do.call(paste, df), " ")
[[1]]
[1] "1" "3"
[[2]]
[1] "2" "4"
[[3]]
[1] "3" "5"
[[4]]
[1] "4" "6"
[[5]]
[1] "5" "7"
[[6]]
[1] "6" "8"
[[7]]
[1] "7" "9"
[[8]]
[1] "8" "10"
[[9]]
[1] "9" "11"
[[10]]
[1] "10" "12"
[[11]]
[1] "11" "13"
[[12]]
[1] "12" "67"
Here are a couple of base R options.
Using asplit:
asplit(df, 1)
Transposing and then using as.list:
t(df) |> as.data.frame() |> as.list() |> unname()
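Both one-liners above return integer vectors, and asplit keeps the column names on each element. If the goal is exactly the unnamed character vectors shown in the question, a minimal sketch with Map could look like this (the name `pairs` is only illustrative, and `df` is rebuilt here so the snippet runs on its own):

```r
# Rebuild the example data frame from the question
df <- data.frame(g1 = 1:12, g2 = c(3:13, 67L))

# Walk both columns in parallel and combine each pair into a character vector
pairs <- unname(Map(function(a, b) as.character(c(a, b)), df$g1, df$g2))

pairs[[1]]
pairs[[12]]
```

Map iterates over the two columns element by element, so no explicit index or pre-allocated list is needed.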
A plain for loop also works:
# Pre-allocate one list slot per row of df
ls <- vector("list", nrow(df))
for (k in seq_len(nrow(df))) {
  # combine the k-th values of both columns as a character vector
  ls[[k]] <- c(as.character(df$g1[k]), as.character(df$g2[k]))
}
Output:
> ls
[[1]]
[1] "1" "3"
[[2]]
[1] "2" "4"
[[3]]
[1] "3" "5"
[[4]]
[1] "4" "6"
[[5]]
[1] "5" "7"
[[6]]
[1] "6" "8"
[[7]]
[1] "7" "9"
[[8]]
[1] "8" "10"
[[9]]
[1] "9" "11"
[[10]]
[1] "10" "12"
[[11]]
[1] "11" "13"
[[12]]
[1] "12" "67"
Using pmap
library(purrr)
pmap(df, ~ unname(c(...)))
Output:
[[1]]
[1] 1 3
[[2]]
[1] 2 4
[[3]]
[1] 3 5
[[4]]
[1] 4 6
[[5]]
[1] 5 7
[[6]]
[1] 6 8
[[7]]
[1] 7 9
[[8]]
[1] 8 10
[[9]]
[1] 9 11
[[10]]
[1] 10 12
[[11]]
[1] 11 13
[[12]]
[1] 12 67
Related
> my_data <- "08,23,02.06.2022,5,7,THISPRODUCT,09.02.2022,yes,89,25"
> lengths(gregexpr(",", my_data))+1
[1] 10
I need to get each element individually. I tried with
print(gregexpr(",", my_data))[[1]][1]
> print(gregexpr(",", my_data))[[1]][1]
[[1]]
[1] 3 6 17 19 21 33 44 48 51
attr(,"match.length")
[1] 1 1 1 1 1 1 1 1 1
attr(,"index.type")
[1] "chars"
attr(,"useBytes")
[1] TRUE
[1] 3
The first element of my_data is "08", but this displays 3. Can anyone give me the correct syntax to display each element?
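The 3 comes from the fact that gregexpr returns the character positions of the matches (here, the commas), not the text between them. As a small sketch, the same gregexpr result can be passed to regmatches with invert = TRUE to recover the pieces between the commas (the names `comma_pos` and `fields` are only illustrative):

```r
my_data <- "08,23,02.06.2022,5,7,THISPRODUCT,09.02.2022,yes,89,25"

# Positions of every comma in the string
comma_pos <- gregexpr(",", my_data, fixed = TRUE)

# invert = TRUE returns the text between the matches instead of the matches
fields <- regmatches(my_data, comma_pos, invert = TRUE)[[1]]

fields[1]
length(fields)
```

Individual elements are then available as fields[1], fields[2], and so on.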
library(tidyverse)
strings <- "08,23,02.06.2022,5,7,THISPRODUCT,09.02.2022,yes,89,25" %>%
str_split(pattern = ",") %>%
unlist()
strings[1]
#> [1] "08"
Created on 2022-06-29 by the reprex package (v2.0.1)
Let's try scan
> scan(text = my_data, what = "",sep = ",",quiet = TRUE)
[1] "08" "23" "02.06.2022" "5" "7"
[6] "THISPRODUCT" "09.02.2022" "yes" "89" "25"
Using lapply:
lapply(strsplit(my_data, ","), `[`)
Output:
[[1]]
[1] "08" "23" "02.06.2022" "5" "7" "THISPRODUCT" "09.02.2022" "yes"
[9] "89" "25"
You can simply do:
unlist(strsplit(my_data, split = ","))
I have a list, which contains lists of different length. Each of these sub-lists contains a vector of the same length.
My goal is to get a dataframe, where each of these vectors is a row.
Every solution I came across was not usable because the sub-lists don't have the same length. One example of my list below
[[1]]
[[1]][[1]]
[1] "CGATCTGCTTCTATT" "1" "GA/TC" "DpnI"
[[1]][[2]]
[1] "CGATCTGCTTCTATT" "1" "/GATC" "DpnII Sau3AI MboI"
[[2]]
[[2]][[1]]
[1] "AGATCTGCTTCTATT" "1" "A/GATCT" "BglII"
[[2]][[2]]
[1] "AGATCTGCTTCTATT" "1" "GA/TC" "DpnI"
[[2]][[3]]
[1] "AGATCTGCTTCTATT" "1" "/GATC" "DpnII Sau3AI MboI"
[[3]]
[[3]][[1]]
[1] "CGATCCGCTTCTATT" "1" "GA/TC" "DpnI"
[[3]][[2]]
[1] "CGATCCGCTTCTATT" "1" "/GATC" "DpnII Sau3AI MboI"
[[4]]
[[4]][[1]]
[1] "AGATCCGCTTCTATT" "1" "GA/TC" "DpnI"
[[4]][[2]]
[1] "AGATCCGCTTCTATT" "1" "/GATC" "DpnII Sau3AI MboI"
[[5]]
[[5]][[1]]
[1] "CGTTCAGCTTCTATT" "1" "AG/CT" "AluI"
[[6]]
[[6]][[1]]
[1] "CGCTCAGCTTCTATT" "1" "AG/CT" "AluI"
Each of these "CGCTCAGCTTCTATT" "1" "AG/CT" "AluI" should be a row in the dataframe
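One possible approach, sketched on a shortened copy of the list: because every inner vector has the same length, the outer level can be removed with unlist(recursive = FALSE) and the vectors stacked with rbind. The names `nested`, `flat`, and `result`, and the column names, are assumptions for illustration only.

```r
# A shortened stand-in for the list in the question
nested <- list(
  list(c("CGATCTGCTTCTATT", "1", "GA/TC", "DpnI"),
       c("CGATCTGCTTCTATT", "1", "/GATC", "DpnII Sau3AI MboI")),
  list(c("CGTTCAGCTTCTATT", "1", "AG/CT", "AluI"))
)

# Drop one level of nesting: a flat list of equal-length character vectors
flat <- unlist(nested, recursive = FALSE)

# Stack the vectors as rows and convert the matrix to a data frame
result <- as.data.frame(do.call(rbind, flat), stringsAsFactors = FALSE)

# Hypothetical column names; the question does not give any
names(result) <- c("sequence", "count", "site", "enzyme")
result
```

The differing lengths of the sub-lists do not matter here, since only the innermost vectors need to have equal length.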
My goal is to divide a list into n groups in all possible combinations (where the group has a variable length).
I found the same question answered here (Python environment), but I'm unable to replicate it in the R environment.
Could anyone kindly help me? Thanks a lot.
If you want an easy implementation for a similar objective, you can try listParts from the partitions package, e.g.,
> x <- 4
> partitions::listParts(x)
[[1]]
[1] (1,2,3,4)
[[2]]
[1] (1,2,4)(3)
[[3]]
[1] (1,2,3)(4)
[[4]]
[1] (1,3,4)(2)
[[5]]
[1] (2,3,4)(1)
[[6]]
[1] (1,4)(2,3)
[[7]]
[1] (1,2)(3,4)
[[8]]
[1] (1,3)(2,4)
[[9]]
[1] (1,4)(2)(3)
[[10]]
[1] (1,2)(3)(4)
[[11]]
[1] (1,3)(2)(4)
[[12]]
[1] (2,4)(1)(3)
[[13]]
[1] (2,3)(1)(4)
[[14]]
[1] (3,4)(1)(2)
[[15]]
[1] (1)(2)(3)(4)
where x is the number of elements in the set, and each partition lists the indices of the elements in its groups.
If you want to choose the number of groups in each partition, the user-defined function below may help
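f <- function(x, n) {
  # all set partitions of 1:x (requires the partitions package)
  res <- partitions::listParts(x)
  # keep only the partitions that have exactly n groups
  subset(res, lengths(res) == n)
}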
such that
> f(x, 2)
[[1]]
[1] (1,2,4)(3)
[[2]]
[1] (1,2,3)(4)
[[3]]
[1] (1,3,4)(2)
[[4]]
[1] (2,3,4)(1)
[[5]]
[1] (1,4)(2,3)
[[6]]
[1] (1,2)(3,4)
[[7]]
[1] (1,3)(2,4)
> f(x, 3)
[[1]]
[1] (1,4)(2)(3)
[[2]]
[1] (1,2)(3)(4)
[[3]]
[1] (1,3)(2)(4)
[[4]]
[1] (2,4)(1)(3)
[[5]]
[1] (2,3)(1)(4)
[[6]]
[1] (3,4)(1)(2)
Update
Given x <- LETTERS[1:4], we can replace the indices with the actual elements by running
res <- rapply(partitions::listParts(length(x)), function(v) x[v], how = "replace")
such that
> res
[[1]]
[1] (A,B,C,D)
[[2]]
[1] (A,B,D)(C)
[[3]]
[1] (A,B,C)(D)
[[4]]
[1] (A,C,D)(B)
[[5]]
[1] (B,C,D)(A)
[[6]]
[1] (A,D)(B,C)
[[7]]
[1] (A,B)(C,D)
[[8]]
[1] (A,C)(B,D)
[[9]]
[1] (A,D)(B)(C)
[[10]]
[1] (A,B)(C)(D)
[[11]]
[1] (A,C)(B)(D)
[[12]]
[1] (B,D)(A)(C)
[[13]]
[1] (B,C)(A)(D)
[[14]]
[1] (C,D)(A)(B)
[[15]]
[1] (A)(B)(C)(D)
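If installing an extra package is not an option, the same enumeration can be sketched in base R. This is not how partitions implements it; it is a plain recursive sketch in which each element either joins an existing group or opens a new one, and the function name `partitions_into` is hypothetical.

```r
# Enumerate all partitions of the vector x into exactly n non-empty groups
partitions_into <- function(x, n) {
  out <- list()
  assign_next <- function(i, groups) {
    if (i > length(x)) {
      # all elements placed: keep the partition only if it has n groups
      if (length(groups) == n) out[[length(out) + 1L]] <<- groups
      return(invisible(NULL))
    }
    # option 1: put x[i] into each existing group in turn
    for (g in seq_along(groups)) {
      new_groups <- groups
      new_groups[[g]] <- c(new_groups[[g]], x[i])
      assign_next(i + 1L, new_groups)
    }
    # option 2: open a new group, as long as we stay within n groups
    if (length(groups) < n) {
      assign_next(i + 1L, c(groups, list(x[i])))
    }
  }
  assign_next(1L, list())
  out
}

two_groups <- partitions_into(LETTERS[1:4], 2)
length(two_groups)
```

Each result is a list of groups, so two_groups[[1]] is itself a list of character vectors. The number of partitions grows very quickly with the length of x, so this sketch is only practical for small sets.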
I have the following list. As you can see the 5th element contains multiple variables. I want to split the 5th element up and insert each single variable into the overall list as an individual element.
[[1]]
[1] "319"
[[2]]
[1] "321"
[[3]]
[1] "328"
[[4]]
[1] "333"
[[5]]
[1] "344" " 345" " 346"
[[6]]
[1] "353"
I'm coding in RStudio. The result I want is:
[[1]]
[1] "319"
[[2]]
[1] "321"
[[3]]
[1] "328"
[[4]]
[1] "333"
[[5]]
[1] "344"
[[6]]
[1] "345"
[[7]]
[1] "346"
[[8]]
[1] "353"
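A minimal sketch of one way to do this in base R: flatten everything into a single character vector with unlist, strip the stray leading spaces with trimws, and turn the vector back into a list with as.list. The names `lst` and `flat` are assumptions, since the question does not name the list.

```r
# Stand-in for the list in the question
lst <- list("319", "321", "328", "333", c("344", " 345", " 346"), "353")

# Flatten, remove surrounding whitespace, and rebuild a one-value-per-element list
flat <- as.list(trimws(unlist(lst)))

length(flat)
flat[[6]]
```

This treats every element the same way, so it also works if more than one element holds several values.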
I tried to work with merge:
budget = sapply(budget1, function(i) {
Reduce(merge,budget1)[[i]]
}
)
and got the error: Error in budget1[[i]] : no such index at level 1
I cannot find a solution to this problem. Does someone know of a way to deal with this data?
I want to add a column to a data frame that contains the numbers that are still split up in the following list:
> budget1
[[1]]
[1] "4" "000" "000"
[[2]]
character(0)
[[3]]
character(0)
[[4]]
[1] "30" "000" "000"
[[5]]
[1] "20" "000" "000"
[[6]]
character(0)
[[7]]
character(0)
[[8]]
character(0)
[[9]]
[1] "22" "500" "000"
[[10]]
[1] "0"
[[11]]
[1] "4" "635" "000"
[[12]]
[1] "12" "000" "000"
[[13]]
character(0)
[[14]]
[1] "9" "000" "000"
[[15]]
character(0)
[[16]]
character(0)
[[17]]
[1] "18" "745" "17" "2017"
[[18]]
[1] "0"
[[19]]
[1] "47" "000" "000"
[[20]]
character(0)
Try this:
# A shortened version of your list
budget1 <- list(c("4", "000", "000"),
                character(0),
                character(0),
                c("18", "745", "17", "2017"),
                c("0"))
# Create a toy data frame with one row per list element
mydataframe <- data.frame(a = seq_along(budget1))
# Add a new column "budget": paste the pieces of each element together and convert to numeric
# (empty elements collapse to "" and become NA)
mydataframe$budget <- as.numeric(sapply(budget1, paste, collapse = ""))
The result:
mydataframe$budget
[1] 4000000 NA NA
[4] 18745172017 0
class(mydataframe$budget)
[1] "numeric"
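Note that an element such as c("18", "745", "17", "2017") collapses to 18745172017, which may not be a genuine budget figure. As a hedged sketch, assuming that a valid amount is a leading group of one to three digits followed only by groups of exactly three digits, such elements could be set to NA instead. The helper name `collapse_budget` and this validity rule are assumptions, not part of the original answer.

```r
budget1 <- list(c("4", "000", "000"),
                character(0),
                c("18", "745", "17", "2017"),
                c("0"))

collapse_budget <- function(parts) {
  # empty elements have no budget
  if (length(parts) == 0L) return(NA_real_)
  # assumption: first group has 1-3 digits, every later group exactly 3 digits
  ok <- grepl("^[0-9]{1,3}$", parts[1]) && all(grepl("^[0-9]{3}$", parts[-1]))
  if (!ok) return(NA_real_)
  as.numeric(paste(parts, collapse = ""))
}

budget <- vapply(budget1, collapse_budget, numeric(1))
budget
```

vapply is used so that the result is guaranteed to be a numeric vector with one value per list element, which can be assigned directly to a data frame column.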