I plan to write a function called lagit(a,k) that produces a result like this:
lagit(c(1,5,6,4,7),c(1,3))
then it should output:
L0 L1 L3
1 NA NA
5 1 NA
6 5 NA
4 6 1
7 4 5
I'm facing two problems right now:
1. how to lag each column by the corresponding element of vector k;
2. how to bind a vector to a matrix (I used a for loop).
I was asked to use base functions only, so I can't call functions from other packages.
Try this:
lagit <- function(a,k) {
tmp <- lapply(k,function(i) c(rep(NA,i),head(a,length(a)-i)))
res <- cbind(a,do.call(cbind,tmp))
colnames(res) <- paste0("L",c(0,k))
res
}
lagit(a,k)
#[1,] 1 NA NA
#[2,] 5 1 NA
#[3,] 6 5 NA
#[4,] 4 6 1
#[5,] 7 4 5
Where:
a <- c(1,5,6,4,7)
k <- c(1,3)
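The lapply() version above assumes every lag in k is no larger than length(a): head() with a negative n drops elements from the end instead of returning nothing, so a larger lag would give a column of the wrong length. Below is a minimal base-R sketch that guards against this. The name lagit_safe and the input check are my own assumptions, not part of the answer.

```r
# Hypothetical variant of lagit(): base R only, tolerates lags >= length(a)
lagit_safe <- function(a, k) {
  stopifnot(is.numeric(k), all(k >= 0))
  n <- length(a)
  lags <- c(0, k)
  cols <- lapply(lags, function(i) {
    i <- min(i, n)                     # cap the lag at the vector length
    c(rep(NA, i), a[seq_len(n - i)])   # i leading NAs, then the first n - i values
  })
  res <- do.call(cbind, cols)          # bind the list of columns into a matrix
  colnames(res) <- paste0("L", lags)
  res
}

lagit_safe(c(1, 5, 6, 4, 7), c(1, 3))
lagit_safe(c(1, 5, 6, 4, 7), 9)        # second column is all NA
```

Building the columns with lapply() and binding them with do.call(cbind, ...) always returns a matrix, even when a has length 1.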
Here is an alternative approach
x <- c(1,5,6,4,7)
# Define a function that operates on a vector x
lagit <- function(x, k) {
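# Lags must be non-negative and no longer than the vector
stopifnot(k >= 0 & k <= length(x))
# seq_len() gives an empty index when k == length(x), so the result is all NA
idx <- seq_len(length(x) - k)
replace(rep(NA, length(x)), idx + k, x[idx])
}
While not strictly necessary, I've added a stopifnot statement to ensure that the lag is non-negative and less than or equal to the length of the vector.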
# Use sapply to apply lagit to different lags and store result as a matrix
sapply(c(0, 1, 3), function(k) lagit(x, k))
# [,1] [,2] [,3]
#[1,] 1 NA NA
#[2,] 5 1 NA
#[3,] 6 5 NA
#[4,] 4 6 1
#[5,] 7 4 5
A recursive solution:
myLag <- function(x, n){
if(n > 0) myLag(c(NA, x)[1:length(x)], n-1) else x
}
For a non-negative lag n, this function gives the same result as dplyr::lag(x, n) and data.table::shift(x, n). Let's test it:
myLag(1:10, 3)
# [1] NA NA NA 1 2 3 4 5 6 7
In your case:
a <- c(1,5,6,4,7)
b <- c(1,3)
> sapply(b, myLag, x = a)
[1,] NA NA
[2,] 1 NA
[3,] 5 NA
[4,] 6 1
[5,] 4 5
> cbind(a, sapply(b, myLag, x = a))
[1,] 1 NA NA
[2,] 5 1 NA
[3,] 6 5 NA
[4,] 4 6 1
[5,] 7 4 5
Yet another option that uses vapply and length<- under the hood
lagit <- function(a, k) {
l <- length(a)
k <- if (0 %in% k) k else c(0, k)
vapply(k, function(x) `length<-`(c(rep(NA, times = x), a), l), numeric(l))
}
lagit(1:5, c(1, 3, 6))
# [,1] [,2] [,3] [,4]
#[1,] 1 NA NA NA
#[2,] 2 1 NA NA
#[3,] 3 2 NA NA
#[4,] 4 3 1 NA
#[5,] 5 4 2 NA
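The replacement function `length<-` does the work in this answer: given a length longer than its input it pads with NA, and given a shorter length it truncates. A small illustration (the vector v is made up for the example):

```r
v <- 1:3

# Longer target length: the tail is filled with NA
padded <- `length<-`(v, 5)
padded
# [1]  1  2  3 NA NA

# Shorter target length: the vector is cut off
truncated <- `length<-`(c(NA, NA, v), 3)
truncated
# [1] NA NA  1
```

Prepending x NAs and then cutting back to the original length is exactly a lag by x.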
A base R solution
myLag <- function(x, n){
if(n >= length(x))
return(rep(NA, length(x)))  # all NA, same length as x
else if(n < length(x) & n > 0)
c(rep(NA,n), x[1:(length(x)-n)])
else
x
}
lagit <- function(x,y){
cbind(x, sapply(y, function(z) myLag(x,z)))
}
> lagit(c(1,5,6,4,7),c(1,3))
x
[1,] 1 NA NA
[2,] 5 1 NA
[3,] 6 5 NA
[4,] 4 6 1
[5,] 7 4 5
Related
I generated the following binomial tree table in R.
Matrix 1:
I want to convert this matrix into another form.
Matrix 2:
How can I do this with a for loop ?
Here's a function that gets it done, but not with a for loop:
fBT <- function(m) {
n <- nrow(m)
v <- rep(c(m[0], NA), n*(2*n - 1))
i <- c(n, rep(2L, n*(n + 1)/2 - 1))
if (n > 2L) {
ii <- 2:(n - 1)
i[ii*(ii - 1)/2 + 1] <- seq(2*n - 2, 4, -2)
}
v[cumsum(i)] <- m[upper.tri(m, diag = TRUE)]
return(matrix(v, ncol = n))
}
n <- 4L
m <- matrix(nrow = n, ncol = n)
m[upper.tri(m, diag = TRUE)] <- 1:(n*(n + 1)/2)
m
#> [,1] [,2] [,3] [,4]
#> [1,] 1 2 4 7
#> [2,] NA 3 5 8
#> [3,] NA NA 6 9
#> [4,] NA NA NA 10
fBT(m)
#> [,1] [,2] [,3] [,4]
#> [1,] NA NA NA 7
#> [2,] NA NA 4 NA
#> [3,] NA 2 NA 8
#> [4,] 1 NA 5 NA
#> [5,] NA 3 NA 9
#> [6,] NA NA 6 NA
#> [7,] NA NA NA 10
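Since the question asks for a for loop, the same rearrangement can be sketched with two nested loops. This assumes, judging only from the example output above, that entry m[i, j] of the upper-triangular input moves to row (n - j + 1) + 2 * (i - 1) of column j. The name fBT_loop is hypothetical.

```r
# Hypothetical loop-based counterpart of fBT()
fBT_loop <- function(m) {
  n <- nrow(m)
  out <- matrix(NA, nrow = 2 * n - 1, ncol = n)
  for (j in seq_len(n)) {        # columns of the input
    for (i in seq_len(j)) {      # rows on or above the diagonal
      out[(n - j + 1) + 2 * (i - 1), j] <- m[i, j]
    }
  }
  out
}

n <- 4L
m <- matrix(nrow = n, ncol = n)
m[upper.tri(m, diag = TRUE)] <- 1:(n * (n + 1) / 2)
res <- fBT_loop(m)
res
```

Each column starts one row higher than the previous one and its entries are spaced two rows apart, which produces the tree shape.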
cbind(1:2, 1:10)
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] 1 3
[4,] 2 4
[5,] 1 5
[6,] 2 6
[7,] 1 7
[8,] 2 8
[9,] 1 9
[10,] 2 10
I want an output like below
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] 3
[4,] 4
[5,] 5
[6,] 6
[7,] 7
[8,] 8
[9,] 9
[10,] 10
The trick is to make all your inputs the same length.
x <- 1:2
y <- 1:10
n <- max(length(x), length(y))
length(x) <- n
length(y) <- n
If you want your output to be an array, then cbind works, but you get additional NA values to pad out the rectangle.
cbind(x, y)
x y
[1,] 1 1
[2,] 2 2
[3,] NA 3
[4,] NA 4
[5,] NA 5
[6,] NA 6
[7,] NA 7
[8,] NA 8
[9,] NA 9
[10,] NA 10
To get rid of the NAs, the output must be a list.
Map(function(...)
{
ans <- c(...)
ans[!is.na(ans)]
}, as.list(x), as.list(y)
)
[[1]]
[1] 1 1
[[2]]
[1] 2 2
[[3]]
[1] 3
[[4]]
[1] 4
[[5]]
[1] 5
[[6]]
[1] 6
[[7]]
[1] 7
[[8]]
[1] 8
[[9]]
[1] 9
[[10]]
[1] 10
EDIT: I swapped mapply(..., SIMPLIFY = FALSE) for Map.
I came across a similar problem and would like to suggest an additional solution that I hope some may find useful. The solution is fairly straightforward and makes use of the cbind.na function provided by the qpcR package.
Example
x <- 1:2
y <- 1:10
dta <- qpcR:::cbind.na(x, y)
Results
> head(dta)
x y
[1,] 1 1
[2,] 2 2
[3,] NA 3
[4,] NA 4
[5,] NA 5
[6,] NA 6
Side comments
Following the OP's original example, column names can be easily removed:
colnames(dta) <- NULL
This gives the layout the OP asked for, apart from the NA padding:
> head(dta)
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] NA 3
[4,] NA 4
[5,] NA 5
[6,] NA 6
I would like to propose an alternate solution that makes use of the rowr package and their cbind.fill function.
> rowr::cbind.fill(1:2,1:10, fill = NA);
object object
1 1 1
2 2 2
3 NA 3
4 NA 4
5 NA 5
6 NA 6
7 NA 7
8 NA 8
9 NA 9
10 NA 10
Or alternatively, to match the OP's desired output:
> rowr::cbind.fill(1:2,1:10, fill = '');
object object
1 1 1
2 2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10
Helper function...
bind.pad <- function(l, side="r", len=max(sapply(l,length)))
{
if (side %in% c("b", "r")) {
out <- sapply(l, 'length<-', value=len)
} else {
out <- sapply(sapply(sapply(l, rev), 'length<-', value=len, simplify=F), rev)}
if (side %in% c("r", "l")) out <- t(out)
out
}
Examples:
> l <- lapply(c(3,2,1,2,3),seq)
> lapply(c("t","l","b","r"), bind.pad, l=l, len=4)
[[1]]
[,1] [,2] [,3] [,4] [,5]
[1,] NA NA NA NA NA
[2,] 1 NA NA NA 1
[3,] 2 1 NA 1 2
[4,] 3 2 1 2 3
[[2]]
[,1] [,2] [,3] [,4]
[1,] NA 1 2 3
[2,] NA NA 1 2
[3,] NA NA NA 1
[4,] NA NA 1 2
[5,] NA 1 2 3
[[3]]
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 1
[2,] 2 2 NA 2 2
[3,] 3 NA NA NA 3
[4,] NA NA NA NA NA
[[4]]
[,1] [,2] [,3] [,4]
[1,] 1 2 3 NA
[2,] 1 2 NA NA
[3,] 1 NA NA NA
[4,] 1 2 NA NA
[5,] 1 2 3 NA
Given that some of the solutions above rely on packages that are no longer available, here is a helper function that uses only tidyverse packages (dplyr, plus purrr for map() and map_int()).
bind_cols_fill <- function(df_list) {
max_rows <- map_int(df_list, nrow) %>% max()
map(df_list, function(df) {
if(nrow(df) == max_rows) return(df)
first <- names(df)[1] %>% sym()
df %>% add_row(!!first := rep(NA, max_rows - nrow(df)))
}) %>% bind_cols()
}
Note that this takes a list of data frames, so that it is slightly cumbersome if one only wants to combine two vectors:
x <- 1:2
y <- 1:10
bind_cols_fill(list(tibble(x), tibble(y)))
Another solution with no dependencies:
my_bind <- function(x, y){
if(length(x = x) > length(x = y)){
len_diff <- length(x) - length(y)
y <- c(y, rep(NA, len_diff))
}else if(length(x = x) < length(x = y)){
len_diff <- length(y) - length(x)
x <- c(x, rep(NA, len_diff))
}
cbind(x, y)
}
my_bind(x = letters[1:4], y = letters[1:2])
I want to create a matrix in which every entry is chosen randomly, with the property that no value is repeated within a column. If two rows in the same column (for example row i and row i+1) have the same value, I want to replace the later entry with NA. For example, if the entries of column 1 are (1,2,2,4,1), I want them to become (1,2,NA,4,NA). I have tried this:
solution = matrix(NA,nrow=5,ncol=5)
for (i in 1:5) {
for (j in 1:5) {
one_entry = sample(1:10, 1)
solution[j,i] = one_entry
if (solution[j+1,i]==solution[j,i]){
#is.na(solution[j+1,i]) <- solution[j+1,I]
solution[j+1,i]<- NA
#solution[solution[j+1,i]] <- NA
} else {
solution[j+1, i] = one_entry
}
}
}
print(solution)
I got the error "Error in if (solution[j + 1, i] == solution[j, i]) { :
missing value where TRUE/FALSE needed". How can I resolve this?
Instead of comparing elements one by one with an if statement, you can replace duplicated entries with NA. duplicated() returns a logical vector that is TRUE at the position of every repeated value; the first occurrence stays FALSE.
set.seed(1)
nr <- 5
nc <- 7
m <- matrix(sample(1:10, nr * nc, replace = TRUE), nrow = nr)
m
# [,1] [,2] [,3] [,4] [,5] [,6] [,7]
# [1,] 9 7 5 9 5 1 10
# [2,] 4 2 10 5 5 4 6
# [3,] 7 3 6 5 2 3 4
# [4,] 1 1 10 9 10 6 4
# [5,] 2 5 7 9 9 10 10
for (i in seq_len(nc)) {
m[, i][duplicated(m[, i])] <- NA
}
m
# [,1] [,2] [,3] [,4] [,5] [,6] [,7]
# [1,] 9 7 5 9 5 1 10
# [2,] 4 2 10 5 NA 4 6
# [3,] 7 3 6 NA 2 3 4
# [4,] 1 1 NA NA 10 6 NA
# [5,] 2 5 7 NA 9 10 NA
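To see what duplicated() does on the column from the question, and how the loop over columns can be folded into one apply() call, here is a small sketch. The vector col and the matrix m2 are made-up inputs.

```r
col <- c(1, 2, 2, 4, 1)
duplicated(col)
# [1] FALSE FALSE  TRUE FALSE  TRUE
replace(col, duplicated(col), NA)
# [1]  1  2 NA  4 NA

# Same idea applied to every column of a matrix at once
m2 <- cbind(c(1, 2, 2, 4, 1), c(3, 3, 3, 5, 6))
cleaned <- apply(m2, 2, function(v) replace(v, duplicated(v), NA))
cleaned
```

apply() over margin 2 returns one result per column, so the cleaned matrix keeps the shape of the input.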
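Using the purrr package (note that this compares each element only with the one directly before it, in column-major order):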
library(purrr)
set.seed(123)
#populate the matrix
(mat <- rerun(5, sample(1:10,size = 5, replace = TRUE)) %>%
reduce(cbind))
#> out elt elt elt elt
#> [1,] 3 5 5 3 9
#> [2,] 3 4 3 8 3
#> [3,] 10 6 9 10 4
#> [4,] 2 9 9 7 1
#> [5,] 6 10 9 10 7
map(2:length(mat), ~{ if (mat[[. - 1]] == mat[[.]]) .x } ) %>%
compact() %>%
walk(~{ mat[[.x]] <<- NA })
mat
#> out elt elt elt elt
#> [1,] 3 5 5 3 9
#> [2,] NA 4 3 8 3
#> [3,] 10 6 9 10 4
#> [4,] 2 9 NA 7 1
#> [5,] 6 10 NA 10 7
Created on 2021-06-28 by the reprex package (v2.0.0)
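For reference, the error in the question arises because the matrix starts out filled with NA, so solution[j+1, i] == solution[j, i] evaluates to NA, which if() cannot use; at j = 5 the index j+1 would also fall outside the matrix. Here is a hedged sketch of how the original loop could be repaired while staying element-wise, assuming the goal is to blank any value that already appears higher up in the same column:

```r
set.seed(1)
solution <- matrix(NA_integer_, nrow = 5, ncol = 5)
for (i in 1:5) {
  for (j in 1:5) {
    one_entry <- sample(1:10, 1)
    # Values already stored in rows 1..(j - 1) of this column
    earlier <- solution[seq_len(j - 1), i]
    # Keep the draw only if it is new; otherwise the cell stays NA
    if (!(one_entry %in% earlier)) {
      solution[j, i] <- one_entry
    }
  }
}
print(solution)
```

%in% never returns NA, so the condition is always a valid TRUE or FALSE even when earlier rows contain NA.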
I have several vectors of unequal length and I would like to cbind them. I've put the vectors into a list and I have tried to combine them using do.call(cbind, ...):
nm <- list(1:8, 3:8, 1:5)
do.call(cbind, nm)
# [,1] [,2] [,3]
# [1,] 1 3 1
# [2,] 2 4 2
# [3,] 3 5 3
# [4,] 4 6 4
# [5,] 5 7 5
# [6,] 6 8 1
# [7,] 7 3 2
# [8,] 8 4 3
# Warning message:
# In (function (..., deparse.level = 1) :
# number of rows of result is not a multiple of vector length (arg 2)
As expected, the number of rows in the resulting matrix is the length of the longest vector, and the values of the shorter vectors are recycled to make up for the length.
Instead I'd like to pad the shorter vectors with NA values to obtain the same length as the longest vector. I'd like the matrix to look like this:
# [,1] [,2] [,3]
# [1,] 1 3 1
# [2,] 2 4 2
# [3,] 3 5 3
# [4,] 4 6 4
# [5,] 5 7 5
# [6,] 6 8 NA
# [7,] 7 NA NA
# [8,] 8 NA NA
How can I go about doing this?
You can use indexing: if you index beyond the length of a vector, R returns NA for the missing positions. This works for any number of rows, set here with foo:
nm <- list(1:8,3:8,1:5)
foo <- 8
sapply(nm, '[', 1:foo)
EDIT:
Or in one line using the largest vector as number of rows:
sapply(nm, '[', seq(max(sapply(nm,length))))
From R 3.2.0 you may use lengths ("get the length of each element of a list") instead of sapply(nm, length):
sapply(nm, '[', seq(max(lengths(nm))))
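The behaviour this answer relies on can be checked directly: subsetting a vector with indices past its end yields NA for the missing positions. A short check, reusing nm from the question (the name padded is just for illustration):

```r
(1:5)[1:8]
# [1]  1  2  3  4  5 NA NA NA

nm <- list(1:8, 3:8, 1:5)
# Index every vector with 1..(length of the longest one)
padded <- sapply(nm, `[`, seq_len(max(lengths(nm))))
padded
```

seq_len() is used here instead of seq() so that an all-empty list would give an empty index rather than c(1, 0).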
You should fill vectors with NA before calling do.call.
nm <- list(1:8,3:8,1:5)
max_length <- max(unlist(lapply(nm,length)))
nm_filled <- lapply(nm, function(x) {
ans <- rep(NA, length.out = max_length)
ans[seq_along(x)] <- x
ans
})
do.call(cbind,nm_filled)
This is a shorter version of Wojciech's solution.
nm <- list(1:8,3:8,1:5)
max_length <- max(sapply(nm,length))
sapply(nm, function(x){
c(x, rep(NA, max_length - length(x)))
})
Here is an option using stri_list2matrix from stringi
library(stringi)
out <- stri_list2matrix(nm)
class(out) <- 'numeric'
out
# [,1] [,2] [,3]
#[1,] 1 3 1
#[2,] 2 4 2
#[3,] 3 5 3
#[4,] 4 6 4
#[5,] 5 7 5
#[6,] 6 8 NA
#[7,] 7 NA NA
#[8,] 8 NA NA
Late to the party but you could use cbind.fill from rowr package with fill = NA
library(rowr)
do.call(cbind.fill, c(nm, fill = NA))
# object object object
#1 1 3 1
#2 2 4 2
#3 3 5 3
#4 4 6 4
#5 5 7 5
#6 6 8 NA
#7 7 NA NA
#8 8 NA NA
If you have a named list instead and want to maintain the headers you could use setNames
nm <- list(a = 1:8, b = 3:8, c = 1:5)
setNames(do.call(cbind.fill, c(nm, fill = NA)), names(nm))
# a b c
#1 1 3 1
#2 2 4 2
#3 3 5 3
#4 4 6 4
#5 5 7 5
#6 6 8 NA
#7 7 NA NA
#8 8 NA NA