How to merge two excel spreadsheets in r? [duplicate]

How to merge two excel spreadsheets in r? [duplicate] - r

cbind(1:2, 1:10)
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] 1 3
[4,] 2 4
[5,] 1 5
[6,] 2 6
[7,] 1 7
[8,] 2 8
[9,] 1 9
[10,] 2 10
I want an output like below
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] 3
[4,] 4
[5,] 5
[6,] 6
[7,] 7
[8,] 8
[9,] 9
[10,] 10

The trick is to make all your inputs the same length.
x <- 1:2
y <- 1:10
n <- max(length(x), length(y))
length(x) <- n
length(y) <- n
If you want you output to be an array, then cbind works, but you get additional NA values to pad out the rectangle.
cbind(x, y)
x y
[1,] 1 1
[2,] 2 2
[3,] NA 3
[4,] NA 4
[5,] NA 5
[6,] NA 6
[7,] NA 7
[8,] NA 8
[9,] NA 9
[10,] NA 10
To get rid of the NAs, the output must be a list.
Map(function(...)
{
ans <- c(...)
ans[!is.na(ans)]
}, as.list(x), as.list(y)
)
[[1]]
[1] 1 1
[[2]]
[1] 2 2
[[3]]
[1] 3
[[4]]
[1] 4
[[5]]
[1] 5
[[6]]
[1] 6
[[7]]
[1] 7
[[8]]
[1] 8
[[9]]
[1] 9
[[10]]
[1] 10
EDIT: I swapped mapply(..., SIMPLIFY = FALSE) for Map.

I came across similar problem and I would like to suggest that additional solution that some, I hope, may find useful. The solution is fairly straightforward and makes use of the qpcR package and the provided cbind.na function.
Example
x <- 1:2
y <- 1:10
dta <- qpcR:::cbind.na(x, y)
Results
> head(dta)
x y
[1,] 1 1
[2,] 2 2
[3,] NA 3
[4,] NA 4
[5,] NA 5
[6,] NA 6
Side comments
Following the OP's original example, column names can be easily removed:
colnames(dta) <- NULL
the operation would produce the desired output in full:
> head(dta)
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] NA 3
[4,] NA 4
[5,] NA 5
[6,] NA 6

I would like to propose an alternate solution that makes use of the rowr package and their cbind.fill function.
> rowr::cbind.fill(1:2,1:10, fill = NA);
object object
1 1 1
2 2 2
3 NA 3
4 NA 4
5 NA 5
6 NA 6
7 NA 7
8 NA 8
9 NA 9
10 NA 10
Or alternatively, to match the OP's desired output:
> rowr::cbind.fill(1:2,1:10, fill = '');
object object
1 1 1
2 2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10

Helper function...
bind.pad <- function(l, side="r", len=max(sapply(l,length)))
{
if (side %in% c("b", "r")) {
out <- sapply(l, 'length<-', value=len)
} else {
out <- sapply(sapply(sapply(l, rev), 'length<-', value=len, simplify=F), rev)}
if (side %in% c("r", "l")) out <- t(out)
out
}
Examples:
> l <- lapply(c(3,2,1,2,3),seq)
> lapply(c("t","l","b","r"), bind.pad, l=l, len=4)
[[1]]
[,1] [,2] [,3] [,4] [,5]
[1,] NA NA NA NA NA
[2,] 1 NA NA NA 1
[3,] 2 1 NA 1 2
[4,] 3 2 1 2 3
[[2]]
[,1] [,2] [,3] [,4]
[1,] NA 1 2 3
[2,] NA NA 1 2
[3,] NA NA NA 1
[4,] NA NA 1 2
[5,] NA 1 2 3
[[3]]
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 1
[2,] 2 2 NA 2 2
[3,] 3 NA NA NA 3
[4,] NA NA NA NA NA
[[4]]
[,1] [,2] [,3] [,4]
[1,] 1 2 3 NA
[2,] 1 2 NA NA
[3,] 1 NA NA NA
[4,] 1 2 NA NA
[5,] 1 2 3 NA

Given that some of the solutions above rely on packages that are no longer available, here a helper function that only uses dplyr.
bind_cols_fill <- function(df_list) {
max_rows <- map_int(df_list, nrow) %>% max()
map(df_list, function(df) {
if(nrow(df) == max_rows) return(df)
first <- names(df)[1] %>% sym()
df %>% add_row(!!first := rep(NA, max_rows - nrow(df)))
}) %>% bind_cols()
}
Note that this takes a list of data frames, so that it is slightly cumbersome if one only wants to combine two vectors:
x <- 1:2
y <- 1:10
bind_cols_fill(list(tibble(x), tibble(y))

Another solution with no dependencies:
my_bind <- function(x, y){
if(length(x = x) > length(x = y)){
len_diff <- length(x) - length(y)
y <- c(y, rep(NA, len_diff))
}else if(length(x = x) < length(x = y)){
len_diff <- length(y) - length(x)
x <- c(x, rep(NA, len_diff))
}
cbind(x, y)
}
my_bind(x = letters[1:4], y = letters[1:2])

Related

From tree list to a matrix in R

Apologies, if the question is too basic. What would be an effective approach/idea (in R) to convert
list(c(1), c(1,2), c(1,2,3), c(1,2,3,4))
to square matrix form
[,1] [,2] [,3] [,4]
[1,] 1 NA NA NA
[2,] 1 2 NA NA
[3,] 1 2 3 NA
[4,] 1 2 3 4
I suppose there is some quick dynamic way to append just the right number of NA values and then convert to a matrix.
Naturally, the size of the (square) matrix can change).
Thanks in advance for your time.

You can use
## create the list
x <- Map(":", 1, 1:4)
ml <- max(lengths(x))
do.call(rbind, lapply(x, "length<-", ml))
# [,1] [,2] [,3] [,4]
# [1,] 1 NA NA NA
# [2,] 1 2 NA NA
# [3,] 1 2 3 NA
# [4,] 1 2 3 4
Or you could do
library(data.table)
as.matrix(unname(rbindlist(lapply(x, as.data.frame.list), fill = TRUE)))
# [,1] [,2] [,3] [,4]
# [1,] 1 NA NA NA
# [2,] 1 2 NA NA
# [3,] 1 2 3 NA
# [4,] 1 2 3 4
And one more for good measure ... Fore!
m <- stringi::stri_list2matrix(x, byrow = TRUE)
mode(m) <- "numeric"
m
# [,1] [,2] [,3] [,4]
# [1,] 1 NA NA NA
# [2,] 1 2 NA NA
# [3,] 1 2 3 NA
# [4,] 1 2 3 4

Rearrange list into a dataframe and

There is a list with different sublists of different length
l
[[1]] 3,4,5,7,8
[[2]] 4,5,7
[[3]] 3,9
[[4]] 6,7,8,10
How can I reorder the list to be a data frame in which the cols are the sublists 1 to 4? The rest should be filled up with NA. Can I then use melt on this data frame?

With stringi package
library(stringi)
(Res <- stri_list2matrix(a))
# [,1] [,2] [,3] [,4]
# [1,] "3" "4" "3" "6"
# [2,] "4" "5" "9" "7"
# [3,] "5" "7" NA "8"
# [4,] "7" NA NA "10"
# [5,] "8" NA NA NA
If you want them to be numeric class, could add the following line
(Res <- matrix(as.numeric(Res), ncol = ncol(Res)))
# Res
# [,1] [,2] [,3] [,4]
# [1,] 3 4 3 6
# [2,] 4 5 9 7
# [3,] 5 7 NA 8
# [4,] 7 NA NA 10
# [5,] 8 NA NA NA

You could try something like this:
#create some dummy date
a<-list(c(3,4,5,7,8),c(4,5,7),c(3,9),c(6,7,8,10))
#get the longest vector
m<-max(sapply(a,length))
#fill in the sublists with NAs so that they all have the same length
as.data.frame(sapply(a,function(x){length(x)<-m;x}))
# V1 V2 V3 V4
#1 3 4 3 6
#2 4 5 9 7
#3 5 7 NA 8
#4 7 NA NA 10
#5 8 NA NA NA

You can do as follows:
The following add zeros instead NA
a<-list(c(3,4,5,7,8),c(4,5,7),c(3,9),c(6,7,8,10))
n.col <- vapply(a, length, 1L)
mat <- matrix(0, nrow = length(a), ncol = max(n.col))
rand <- cbind(rep(seq_along(n.col), times = n.col), sequence(n.col))
M[rand] <- unlist(a, use.names = FALSE)
M2 <- t(M)
print(M2)
# [,1] [,2] [,3] [,4]
#[1,] 3 4 3 6
#[2,] 4 5 9 7
#[3,] 5 7 0 8
#[4,] 7 0 0 10
#[5,] 8 0 0 0
if you want now to have NA instead zero , you can do as follows:
M2[M2 == 0] <- NA
print(M2)
# [,1] [,2] [,3] [,4]
#[1,] 3 4 3 6
#[2,] 4 5 9 7
#[3,] 5 7 NA 8
#[4,] 7 NA NA 10
#[5,] 8 NA NA NA

create new matrix with new dimension and omitting NA values

I have a matrix with some NA values
for example:
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 2 NA 8 11
[3,] 3 6 NA 12
I want to create new matrix with data from my matrix above with new dimension and no NA value. (it is ok to have NA only some last elements)
something like:
[,1] [,2] [,3]
[1,] 1 6 11
[2,] 2 7 12
[3,] 3 8 NA
[4,] 4 10 NA
I would appreciate if anyone can help me.
Thanks

Something like this as well:
m <- matrix(1:12, nc=4)
m[c(5, 9)] <- NA
matrix(c(na.omit(c(m)), rep(NA, sum(is.na(m)))), nrow=4)

m <- matrix(1:12, nc=4)
m[c(5, 9)] <- NA
# create an array of the appropriate class and dimension (filled with NA values)
dims <- c(4, 3)
md <- array(m[0], dim=dims)
# replace first "n" values with non-NA values from m
nonNAm <- na.omit(c(m))
md[seq_along(nonNAm)] <- nonNAm
md
# [,1] [,2] [,3]
# [1,] 1 6 11
# [2,] 2 7 12
# [3,] 3 8 NA
# [4,] 4 10 NA

Yet another attempt. This will keep the order of the values in column order as a matrix usually would. E.g.:
mat <- matrix(c(1,2,3,4,NA,6,7,8,NA,10,11,12),nrow=3)
array(mat[order(is.na(mat))],dim=dim(mat))
# [,1] [,2] [,3] [,4]
#[1,] 1 4 8 12
#[2,] 2 6 10 NA
#[3,] 3 7 11 NA
Now change a value to check it doesn't affect the ordering.
mat[7] <- 20
array(mat[order(is.na(mat))],dim=dim(mat))
# [,1] [,2] [,3] [,4]
#[1,] 1 4 8 12
#[2,] 2 6 10 NA
#[3,] 3 20 11 NA
You can then specify whatever dimensions you feel like to the dim= argument:
array(mat[order(is.na(mat))],dim=c(4,3))
# [,1] [,2] [,3]
#[1,] 1 6 11
#[2,] 2 20 12
#[3,] 3 8 NA
#[4,] 4 10 NA

This is fairly straightforward if you want to preserve order column-wise or row-wise.
originalMatrix <- matrix(c(1,2,3,4,NA,6,7,8,NA,10,11,12),nrow=3)
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 2 NA 8 11
[3,] 3 6 NA 12
newMatrixNums <- originalMatrix[!is.na(originalMatrix)]
[1] 1 2 3 4 6 7 8 10 11 12
Pad with NA:
newMatrixNums2 <- c(newMatrixNums,rep(NA,2))
Column-wise:
matrix(newMatrixNums2,nrow=3)
[,1] [,2] [,3] [,4]
[1,] 1 4 8 12
[2,] 2 6 10 NA
[3,] 3 7 11 NA
Row-wise:
matrix(newMatrixNums2,nrow=3,byrow=T)
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 6 7 8 10
[3,] 11 12 NA NA

Here's one way:
# Reproducing your data
m <- matrix(1:12, nc=4)
m[c(5, 9)] <- NA
# Your desired dimensions
dims <- c(4, 3)
array(c(na.omit(c(m)), rep(NA, prod(dims) - length(na.omit(c(m))))), dim=dims)
# [,1] [,2] [,3]
# [1,] 1 6 11
# [2,] 2 7 12
# [3,] 3 8 NA
# [4,] 4 10 NA

This can do the job but dunno whether it is a good way.
list1 <- m[m]
list2 <- m[!is.na(m)]
element1 <- list2
element2 <- rep(NA, (length(list1)-length(list2)))
newm <- matrix(c(element1,element2), nrow=4)

If you increase the length of a numeric vector with length(x)<- without assigning values to the new elements, the new values are given NA as their value. So length(M2) <- length(M) takes the shorter M2 vector and makes it the same length as M by adding NA values to the new elements.
## original
> (M <- matrix(c(1:4,NA,6:8,NA,10:12), nrow = 3))
# [,1] [,2] [,3] [,4]
# [1,] 1 4 7 10
# [2,] 2 NA 8 11
# [3,] 3 6 NA 12
## new
> M2 <- M[!is.na(M)]; length(M2) <- length(M)
> matrix(M2, ncol(M))
# [,1] [,2] [,3]
# [1,] 1 6 11
# [2,] 2 7 12
# [3,] 3 8 NA
# [4,] 4 10 NA

In R how to control the cycling direction of values when copying in an array?

I would like to copy some values across rows but the default seems to be cycling over columns, is there an elegant way to achieve what I want ?
The following code I have is:
> w = array(NA,dim=c(4,2))
> w
[,1] [,2]
[1,] NA NA
[2,] NA NA
[3,] NA NA
[4,] NA NA
> w[2:4,] = c(2,3)
> w
[,1] [,2]
[1,] NA NA
[2,] 2 3
[3,] 3 2
[4,] 2 3
But I would like the values to be cycled over rows to obtain:
[,1] [,2]
[1,] NA NA
[2,] 2 3
[3,] 2 3
[4,] 2 3

You can write a new transposed assignment function:
`t<-` <- function(x, value)
{
t(matrix(value, nrow=ncol(x), ncol=nrow(x)))
}
Result:
> w = array(NA,dim=c(4,2))
> t(w[2:4,]) = c(2,3)
> w
[,1] [,2]
[1,] NA NA
[2,] 2 3
[3,] 2 3
[4,] 2 3

Create the matrix row-wise, then assign it:
w[2:4, ] <- matrix(c(2,3), nrow=3, ncol=2, byrow=TRUE)

Another elegant way :-)
w[2:4,] <- rep(c(2,3),each=length(2:4))
[,1] [,2]
[1,] NA NA
[2,] 2 3
[3,] 2 3
[4,] 2 3

w[2:4, ] <- matrix(c(2,3),nrow=1)[ rep(1,3), ]

How to cbind or rbind different lengths vectors without repeating the elements of the shorter vectors?

cbind(1:2, 1:10)
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] 1 3
[4,] 2 4
[5,] 1 5
[6,] 2 6
[7,] 1 7
[8,] 2 8
[9,] 1 9
[10,] 2 10
I want an output like below
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] 3
[4,] 4
[5,] 5
[6,] 6
[7,] 7
[8,] 8
[9,] 9
[10,] 10

The trick is to make all your inputs the same length.
x <- 1:2
y <- 1:10
n <- max(length(x), length(y))
length(x) <- n
length(y) <- n
If you want you output to be an array, then cbind works, but you get additional NA values to pad out the rectangle.
cbind(x, y)
x y
[1,] 1 1
[2,] 2 2
[3,] NA 3
[4,] NA 4
[5,] NA 5
[6,] NA 6
[7,] NA 7
[8,] NA 8
[9,] NA 9
[10,] NA 10
To get rid of the NAs, the output must be a list.
Map(function(...)
{
ans <- c(...)
ans[!is.na(ans)]
}, as.list(x), as.list(y)
)
[[1]]
[1] 1 1
[[2]]
[1] 2 2
[[3]]
[1] 3
[[4]]
[1] 4
[[5]]
[1] 5
[[6]]
[1] 6
[[7]]
[1] 7
[[8]]
[1] 8
[[9]]
[1] 9
[[10]]
[1] 10
EDIT: I swapped mapply(..., SIMPLIFY = FALSE) for Map.

I came across similar problem and I would like to suggest that additional solution that some, I hope, may find useful. The solution is fairly straightforward and makes use of the qpcR package and the provided cbind.na function.
Example
x <- 1:2
y <- 1:10
dta <- qpcR:::cbind.na(x, y)
Results
> head(dta)
x y
[1,] 1 1
[2,] 2 2
[3,] NA 3
[4,] NA 4
[5,] NA 5
[6,] NA 6
Side comments
Following the OP's original example, column names can be easily removed:
colnames(dta) <- NULL
the operation would produce the desired output in full:
> head(dta)
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] NA 3
[4,] NA 4
[5,] NA 5
[6,] NA 6

I would like to propose an alternate solution that makes use of the rowr package and their cbind.fill function.
> rowr::cbind.fill(1:2,1:10, fill = NA);
object object
1 1 1
2 2 2
3 NA 3
4 NA 4
5 NA 5
6 NA 6
7 NA 7
8 NA 8
9 NA 9
10 NA 10
Or alternatively, to match the OP's desired output:
> rowr::cbind.fill(1:2,1:10, fill = '');
object object
1 1 1
2 2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10

Helper function...
bind.pad <- function(l, side="r", len=max(sapply(l,length)))
{
if (side %in% c("b", "r")) {
out <- sapply(l, 'length<-', value=len)
} else {
out <- sapply(sapply(sapply(l, rev), 'length<-', value=len, simplify=F), rev)}
if (side %in% c("r", "l")) out <- t(out)
out
}
Examples:
> l <- lapply(c(3,2,1,2,3),seq)
> lapply(c("t","l","b","r"), bind.pad, l=l, len=4)
[[1]]
[,1] [,2] [,3] [,4] [,5]
[1,] NA NA NA NA NA
[2,] 1 NA NA NA 1
[3,] 2 1 NA 1 2
[4,] 3 2 1 2 3
[[2]]
[,1] [,2] [,3] [,4]
[1,] NA 1 2 3
[2,] NA NA 1 2
[3,] NA NA NA 1
[4,] NA NA 1 2
[5,] NA 1 2 3
[[3]]
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 1
[2,] 2 2 NA 2 2
[3,] 3 NA NA NA 3
[4,] NA NA NA NA NA
[[4]]
[,1] [,2] [,3] [,4]
[1,] 1 2 3 NA
[2,] 1 2 NA NA
[3,] 1 NA NA NA
[4,] 1 2 NA NA
[5,] 1 2 3 NA

Given that some of the solutions above rely on packages that are no longer available, here a helper function that only uses dplyr.
bind_cols_fill <- function(df_list) {
max_rows <- map_int(df_list, nrow) %>% max()
map(df_list, function(df) {
if(nrow(df) == max_rows) return(df)
first <- names(df)[1] %>% sym()
df %>% add_row(!!first := rep(NA, max_rows - nrow(df)))
}) %>% bind_cols()
}
Note that this takes a list of data frames, so that it is slightly cumbersome if one only wants to combine two vectors:
x <- 1:2
y <- 1:10
bind_cols_fill(list(tibble(x), tibble(y))

Another solution with no dependencies:
my_bind <- function(x, y){
if(length(x = x) > length(x = y)){
len_diff <- length(x) - length(y)
y <- c(y, rep(NA, len_diff))
}else if(length(x = x) < length(x = y)){
len_diff <- length(y) - length(x)
x <- c(x, rep(NA, len_diff))
}
cbind(x, y)
}
my_bind(x = letters[1:4], y = letters[1:2])

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex

How to merge two excel spreadsheets in r? [duplicate] - r

cbind(1:2, 1:10) [,1] [,2] [1,] 1 1 [2,] 2 2 [3,] 1 3 [4,] 2 4 [5,] 1 5 [6,] 2 6 [7,] 1 7 [8,] 2 8 [9,] 1 9 [10,] 2 10 I want an output like below [,1] [,2] [1,] 1 1 [2,] 2 2 [3,] 3 [4,] 4 [5,] 5 [6,] 6 [7,] 7 [8,] 8 [9,] 9 [10,] 10

Related

From tree list to a matrix in R

Rearrange list into a dataframe and

create new matrix with new dimension and omitting NA values

In R how to control the cycling direction of values when copying in an array?

How to cbind or rbind different lengths vectors without repeating the elements of the shorter vectors?

Categories

Resources