How do I loop correctly? - r

Here is the data below. I'm not sure which type of looping I should be using, but here is what I am looking to do: If, for row 1, there is a 6 present, then for column 7 we have "Yes", if there is no 6 present, then column 7 has "No". Ignore columns 8 & 9.
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 6 1 1 6 1 NA NA NA
[2,] 5 5 5 5 5 5 NA NA NA
[3,] 1 1 6 1 1 6 NA NA NA
[4,] 5 5 5 5 5 5 NA NA NA
[5,] 6 1 1 6 1 1 NA NA NA
[6,] 5 5 5 5 5 5 NA NA NA
[7,] 1 6 1 1 6 1 NA NA NA
[8,] 5 5 5 5 5 5 NA NA NA
[9,] 1 1 6 1 1 6 NA NA NA
[10,] 5 5 5 5 5 5 NA NA NA
Here is the code that I have.
data.matrix <- matrix(data=NA,nrow = b, ncol = n+3)
b <- 10
n <- 6
for (i in 1:b)
{
data.matrix[,1:n] <- sample(6,n,replace=T)
}
Side Note: I keep getting this error
"the condition has length > 1 and only the first element will be used"

Here is a solution using apply:
a[,7] <- apply(a, 1, function(x) ifelse(max(x,na.rm = T) == 6,"YES","NO"))
where a is the input data.frame/tibble. As commented above, if you have matrix, then convert it to data.frame and perform this operation.

Here is solution with lapply and which:
res <- apply(data.matrix, 1, function(x) {
x[[7]] <- length(which(x == 6)) > 0
x
})
res <- t(res)

Related

How to merge two excel spreadsheets in r? [duplicate]

cbind(1:2, 1:10)
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] 1 3
[4,] 2 4
[5,] 1 5
[6,] 2 6
[7,] 1 7
[8,] 2 8
[9,] 1 9
[10,] 2 10
I want an output like below
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] 3
[4,] 4
[5,] 5
[6,] 6
[7,] 7
[8,] 8
[9,] 9
[10,] 10
The trick is to make all your inputs the same length.
x <- 1:2
y <- 1:10
n <- max(length(x), length(y))
length(x) <- n
length(y) <- n
If you want you output to be an array, then cbind works, but you get additional NA values to pad out the rectangle.
cbind(x, y)
x y
[1,] 1 1
[2,] 2 2
[3,] NA 3
[4,] NA 4
[5,] NA 5
[6,] NA 6
[7,] NA 7
[8,] NA 8
[9,] NA 9
[10,] NA 10
To get rid of the NAs, the output must be a list.
Map(function(...)
{
ans <- c(...)
ans[!is.na(ans)]
}, as.list(x), as.list(y)
)
[[1]]
[1] 1 1
[[2]]
[1] 2 2
[[3]]
[1] 3
[[4]]
[1] 4
[[5]]
[1] 5
[[6]]
[1] 6
[[7]]
[1] 7
[[8]]
[1] 8
[[9]]
[1] 9
[[10]]
[1] 10
EDIT: I swapped mapply(..., SIMPLIFY = FALSE) for Map.
I came across similar problem and I would like to suggest that additional solution that some, I hope, may find useful. The solution is fairly straightforward and makes use of the qpcR package and the provided cbind.na function.
Example
x <- 1:2
y <- 1:10
dta <- qpcR:::cbind.na(x, y)
Results
> head(dta)
x y
[1,] 1 1
[2,] 2 2
[3,] NA 3
[4,] NA 4
[5,] NA 5
[6,] NA 6
Side comments
Following the OP's original example, column names can be easily removed:
colnames(dta) <- NULL
the operation would produce the desired output in full:
> head(dta)
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] NA 3
[4,] NA 4
[5,] NA 5
[6,] NA 6
I would like to propose an alternate solution that makes use of the rowr package and their cbind.fill function.
> rowr::cbind.fill(1:2,1:10, fill = NA);
object object
1 1 1
2 2 2
3 NA 3
4 NA 4
5 NA 5
6 NA 6
7 NA 7
8 NA 8
9 NA 9
10 NA 10
Or alternatively, to match the OP's desired output:
> rowr::cbind.fill(1:2,1:10, fill = '');
object object
1 1 1
2 2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10
Helper function...
bind.pad <- function(l, side="r", len=max(sapply(l,length)))
{
if (side %in% c("b", "r")) {
out <- sapply(l, 'length<-', value=len)
} else {
out <- sapply(sapply(sapply(l, rev), 'length<-', value=len, simplify=F), rev)}
if (side %in% c("r", "l")) out <- t(out)
out
}
Examples:
> l <- lapply(c(3,2,1,2,3),seq)
> lapply(c("t","l","b","r"), bind.pad, l=l, len=4)
[[1]]
[,1] [,2] [,3] [,4] [,5]
[1,] NA NA NA NA NA
[2,] 1 NA NA NA 1
[3,] 2 1 NA 1 2
[4,] 3 2 1 2 3
[[2]]
[,1] [,2] [,3] [,4]
[1,] NA 1 2 3
[2,] NA NA 1 2
[3,] NA NA NA 1
[4,] NA NA 1 2
[5,] NA 1 2 3
[[3]]
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 1
[2,] 2 2 NA 2 2
[3,] 3 NA NA NA 3
[4,] NA NA NA NA NA
[[4]]
[,1] [,2] [,3] [,4]
[1,] 1 2 3 NA
[2,] 1 2 NA NA
[3,] 1 NA NA NA
[4,] 1 2 NA NA
[5,] 1 2 3 NA
Given that some of the solutions above rely on packages that are no longer available, here a helper function that only uses dplyr.
bind_cols_fill <- function(df_list) {
max_rows <- map_int(df_list, nrow) %>% max()
map(df_list, function(df) {
if(nrow(df) == max_rows) return(df)
first <- names(df)[1] %>% sym()
df %>% add_row(!!first := rep(NA, max_rows - nrow(df)))
}) %>% bind_cols()
}
Note that this takes a list of data frames, so that it is slightly cumbersome if one only wants to combine two vectors:
x <- 1:2
y <- 1:10
bind_cols_fill(list(tibble(x), tibble(y))
Another solution with no dependencies:
my_bind <- function(x, y){
if(length(x = x) > length(x = y)){
len_diff <- length(x) - length(y)
y <- c(y, rep(NA, len_diff))
}else if(length(x = x) < length(x = y)){
len_diff <- length(y) - length(x)
x <- c(x, rep(NA, len_diff))
}
cbind(x, y)
}
my_bind(x = letters[1:4], y = letters[1:2])

Assigning NA to the Matrix entries in R

I want to create a Matrix where the entry for each row is chosen randomly. I want the matrix to have the property that each row in the same column has a different value. If different rows (for example row i and row i+1) in the same column have the same value then I want to replace the entry for row i+1 with NA. Basically, I want the column to have different entries for each row. For example, column 1 entries are (1,2,2,4,1). Then, I want to make this column entries are (1,2,NA,4,NA). I have tried this
solution = matrix(NA,nrow=5,ncol=5)
for (i in 1:5) {
for (j in 1:5) {
one_entry = sample(1:10, 1)
solution[j,i] = one_entry
if (solution[j+1,i]==solution[j,i]){
#is.na(solution[j+1,i]) <- solution[j+1,I]
solution[j+1,i]<- NA
#solution[solution[j+1,i]] <- NA
} else {
solution[j+1, i] = one_entry
}
}
}
print(solution)
I got the error "Error in if (solution[j + 1, i] == solution[j, i]) { :
missing value where TRUE/FALSE needed". Please help how to resolve this.
Instead of element-wise comparison using if statement, you can replace duplicated entries with NA. The output of duplicated() is a logical vector setting the position of the duplicates to TRUE.
set.seed(1)
nr <- 5
nc <- 7
m <- matrix(sample(1:10, nr * nc, replace = TRUE), nrow = nr)
m
# [,1] [,2] [,3] [,4] [,5] [,6] [,7]
# [1,] 9 7 5 9 5 1 10
# [2,] 4 2 10 5 5 4 6
# [3,] 7 3 6 5 2 3 4
# [4,] 1 1 10 9 10 6 4
# [5,] 2 5 7 9 9 10 10
for (i in seq_len(nc)) {
m[, i][duplicated(m[, i])] <- NA
}
m
# [,1] [,2] [,3] [,4] [,5] [,6] [,7]
# [1,] 9 7 5 9 5 1 10
# [2,] 4 2 10 5 NA 4 6
# [3,] 7 3 6 NA 2 3 4
# [4,] 1 1 NA NA 10 6 NA
# [5,] 2 5 7 NA 9 10 NA
Using purrr library:
library(purrr)
set.seed(123)
#populate the matrix
(mat <- rerun(5, sample(1:10,size = 5, replace = TRUE)) %>%
reduce(cbind))
#> out elt elt elt elt
#> [1,] 3 5 5 3 9
#> [2,] 3 4 3 8 3
#> [3,] 10 6 9 10 4
#> [4,] 2 9 9 7 1
#> [5,] 6 10 9 10 7
map(2:length(mat), ~{ if (mat[[. - 1]] == mat[[.]]) .x } ) %>%
compact() %>%
walk(~{ mat[[.x]] <<- NA })
mat
#> out elt elt elt elt
#> [1,] 3 5 5 3 9
#> [2,] NA 4 3 8 3
#> [3,] 10 6 9 10 4
#> [4,] 2 9 NA 7 1
#> [5,] 6 10 NA 10 7
Created on 2021-06-28 by the reprex package (v2.0.0)

Multiplication of matrices with NA values

If I have 2 square Matrices with random NA values, for example:
Matrix A:
1 2 3
1 5 NA 7
2 NA 3 8
3 NA 4 5
Matrix B:
1 2 3
1 NA 8 NA
2 2 5 9
3 NA 4 3
What is the best way to multiply them? Would changing NA values to 0 give a different result of the dot product?
NAs will be ignored:
## Dummy matrices
mat1 <- matrix(sample(1:9, 9), 3, 3)
mat2 <- matrix(sample(1:9, 9), 3, 3)
## Adding NAs
mat1[sample(1:9, 4)] <- NA
mat2[sample(1:9, 4)] <- NA
mat1
# [,1] [,2] [,3]
#[1,] 9 NA 3
#[2,] 2 NA NA
#[3,] NA 1 8
mat2
# [,1] [,2] [,3]
#[1,] NA NA 4
#[2,] NA 9 3
#[3,] NA 7 1
mat1 * mat2
# [,1] [,2] [,3]
#[1,] NA NA 12
#[2,] NA NA NA
#[3,] NA 7 8
mat1 %*% mat2
# [,1] [,2] [,3]
#[1,] NA NA NA
#[2,] NA NA NA
#[3,] NA NA NA
In this case the dot product results in only NAs because there are no operations that does not involve an NA. Different matrices can lead to different results.

convert all zeros of a matrix in R to NA [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Replace all 0 values to NA in R
Going off of this question. Is there a similar function in R such as x[is.na(x)] <- 0 except that it will changes every zero in a matrix to NA?
You can do it like this:
x[x == 0] <- NA
For example:
x = matrix(rep(0:1, 50), nrow=10)
x[x == 0] <- NA
print(x)
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,] NA NA NA NA NA NA NA NA NA NA
# [2,] 1 1 1 1 1 1 1 1 1 1
# [3,] NA NA NA NA NA NA NA NA NA NA
# [4,] 1 1 1 1 1 1 1 1 1 1
# [5,] NA NA NA NA NA NA NA NA NA NA
# [6,] 1 1 1 1 1 1 1 1 1 1
# [7,] NA NA NA NA NA NA NA NA NA NA
# [8,] 1 1 1 1 1 1 1 1 1 1
# [9,] NA NA NA NA NA NA NA NA NA NA
#[10,] 1 1 1 1 1 1 1 1 1 1

How to cbind or rbind different lengths vectors without repeating the elements of the shorter vectors?

cbind(1:2, 1:10)
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] 1 3
[4,] 2 4
[5,] 1 5
[6,] 2 6
[7,] 1 7
[8,] 2 8
[9,] 1 9
[10,] 2 10
I want an output like below
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] 3
[4,] 4
[5,] 5
[6,] 6
[7,] 7
[8,] 8
[9,] 9
[10,] 10
The trick is to make all your inputs the same length.
x <- 1:2
y <- 1:10
n <- max(length(x), length(y))
length(x) <- n
length(y) <- n
If you want you output to be an array, then cbind works, but you get additional NA values to pad out the rectangle.
cbind(x, y)
x y
[1,] 1 1
[2,] 2 2
[3,] NA 3
[4,] NA 4
[5,] NA 5
[6,] NA 6
[7,] NA 7
[8,] NA 8
[9,] NA 9
[10,] NA 10
To get rid of the NAs, the output must be a list.
Map(function(...)
{
ans <- c(...)
ans[!is.na(ans)]
}, as.list(x), as.list(y)
)
[[1]]
[1] 1 1
[[2]]
[1] 2 2
[[3]]
[1] 3
[[4]]
[1] 4
[[5]]
[1] 5
[[6]]
[1] 6
[[7]]
[1] 7
[[8]]
[1] 8
[[9]]
[1] 9
[[10]]
[1] 10
EDIT: I swapped mapply(..., SIMPLIFY = FALSE) for Map.
I came across similar problem and I would like to suggest that additional solution that some, I hope, may find useful. The solution is fairly straightforward and makes use of the qpcR package and the provided cbind.na function.
Example
x <- 1:2
y <- 1:10
dta <- qpcR:::cbind.na(x, y)
Results
> head(dta)
x y
[1,] 1 1
[2,] 2 2
[3,] NA 3
[4,] NA 4
[5,] NA 5
[6,] NA 6
Side comments
Following the OP's original example, column names can be easily removed:
colnames(dta) <- NULL
the operation would produce the desired output in full:
> head(dta)
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] NA 3
[4,] NA 4
[5,] NA 5
[6,] NA 6
I would like to propose an alternate solution that makes use of the rowr package and their cbind.fill function.
> rowr::cbind.fill(1:2,1:10, fill = NA);
object object
1 1 1
2 2 2
3 NA 3
4 NA 4
5 NA 5
6 NA 6
7 NA 7
8 NA 8
9 NA 9
10 NA 10
Or alternatively, to match the OP's desired output:
> rowr::cbind.fill(1:2,1:10, fill = '');
object object
1 1 1
2 2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10
Helper function...
bind.pad <- function(l, side="r", len=max(sapply(l,length)))
{
if (side %in% c("b", "r")) {
out <- sapply(l, 'length<-', value=len)
} else {
out <- sapply(sapply(sapply(l, rev), 'length<-', value=len, simplify=F), rev)}
if (side %in% c("r", "l")) out <- t(out)
out
}
Examples:
> l <- lapply(c(3,2,1,2,3),seq)
> lapply(c("t","l","b","r"), bind.pad, l=l, len=4)
[[1]]
[,1] [,2] [,3] [,4] [,5]
[1,] NA NA NA NA NA
[2,] 1 NA NA NA 1
[3,] 2 1 NA 1 2
[4,] 3 2 1 2 3
[[2]]
[,1] [,2] [,3] [,4]
[1,] NA 1 2 3
[2,] NA NA 1 2
[3,] NA NA NA 1
[4,] NA NA 1 2
[5,] NA 1 2 3
[[3]]
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 1
[2,] 2 2 NA 2 2
[3,] 3 NA NA NA 3
[4,] NA NA NA NA NA
[[4]]
[,1] [,2] [,3] [,4]
[1,] 1 2 3 NA
[2,] 1 2 NA NA
[3,] 1 NA NA NA
[4,] 1 2 NA NA
[5,] 1 2 3 NA
Given that some of the solutions above rely on packages that are no longer available, here a helper function that only uses dplyr.
bind_cols_fill <- function(df_list) {
max_rows <- map_int(df_list, nrow) %>% max()
map(df_list, function(df) {
if(nrow(df) == max_rows) return(df)
first <- names(df)[1] %>% sym()
df %>% add_row(!!first := rep(NA, max_rows - nrow(df)))
}) %>% bind_cols()
}
Note that this takes a list of data frames, so that it is slightly cumbersome if one only wants to combine two vectors:
x <- 1:2
y <- 1:10
bind_cols_fill(list(tibble(x), tibble(y))
Another solution with no dependencies:
my_bind <- function(x, y){
if(length(x = x) > length(x = y)){
len_diff <- length(x) - length(y)
y <- c(y, rep(NA, len_diff))
}else if(length(x = x) < length(x = y)){
len_diff <- length(y) - length(x)
x <- c(x, rep(NA, len_diff))
}
cbind(x, y)
}
my_bind(x = letters[1:4], y = letters[1:2])

Resources