How to apply a function to a matrix in R - r

Write a function which takes an argument that can be coerced into a matrix; the function should return a matrix which is the same as the function argument, except that every even number is left unchanged and every odd number is doubled.
I'm very new to R. Can someone help me complete my code:
mx = matrix(c(1,1,3,5,2,6,-2,-1,-3), nrow = 3, byrow = TRUE)
fun = function(mx){
for(i in mx){
if(i %% 2 == 0){
return(i)
}
else if(i %% 2 > 0){
return(2*i)
}
}
}

Don't need a function, just use the built-in function ifelse:
mx <- ifelse(mx %% 2 == 0, mx, 2*mx)
Or, if you prefer to encapsulate it into a function:
fun = function(mx) {
ifelse(mx %% 2 == 0, mx, 2*mx)
}
res <- fun(mx)
## [,1] [,2] [,3]
##[1,] 2 2 6
##[2,] 10 2 6
##[3,] -2 -2 -6
Explanation:
ifelse performs a vectorized comparison over all elements of the matrix mx to see if each element is even (i.e., mx %% 2 == 0). For each element if this comparison condition is TRUE, the next argument is returned, which in this case is just the value from that element in mx. Otherwise, the last argument is returned, which is 2 times the value from that element in mx as you wish.
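Since the question asks for an argument that "can be coerced into a matrix", here is a minimal sketch that adds that coercion step in front of ifelse. The name double_odd_values and the sample data frame are made up for illustration; only as.matrix and ifelse come from base R.

```r
# Hypothetical wrapper: coerce first, then apply the vectorized ifelse
double_odd_values <- function(x) {
  mx <- as.matrix(x)                 # vectors, data frames, etc. become a matrix
  ifelse(mx %% 2 == 0, mx, 2 * mx)   # evens kept, odds doubled, dimensions preserved
}

df <- data.frame(a = c(1, 2), b = c(3, 4))   # illustrative input
out <- double_odd_values(df)
print(out)
```

ifelse returns a result with the shape of its test argument, which is why the output is still a matrix here.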

That's easy using indices :)
double_odd <- function(mx){
odds_idx <- (mx %% 2 != 0)
mx[odds_idx] <- 2 * mx[odds_idx]
mx # If it is the last statement, you don't need return
}
Cheers

Using your try:
fun = function(mx){
res <- matrix(data = NA, ncol = ncol(mx), nrow = nrow(mx))
for(i in 1:ncol(mx)){
for(j in 1:nrow(mx))
if(mx[j, i] %% 2 == 0){
res[j, i] <- mx[j, i]
}else{
res[j, i] <- 2 * mx[j, i]
}
}
return(res)
}
of course not the most elegant solution :)
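For reference, a short sketch of why the loop in the question never gets past the first element: return() leaves the function immediately, so only one value ever comes back. The name fun_original is just a label for the question's code, condensed slightly.

```r
# The question's function, condensed: return() exits at the very first element
fun_original <- function(mx) {
  for (i in mx) {
    if (i %% 2 == 0) return(i) else return(2 * i)
  }
}

mx <- matrix(c(1, 1, 3, 5, 2, 6, -2, -1, -3), nrow = 3, byrow = TRUE)
first_only <- fun_original(mx)
print(first_only)   # a single number rather than a 3 x 3 matrix
```

That is why the working versions above either fill a result matrix element by element or avoid the loop entirely.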

Related

Error in if (a[i][j] > 4) { : missing value where TRUE/FALSE needed

Find the number of entries in each row which are greater than 4.
set.seed(75)
aMat <- matrix( sample(10, size=60, replace=T), nr=6)
rowmax=function(a)
{
x=nrow(a)
y=ncol(a)
i=1
j=1
z=0
while (i<=x) {
for(j in 1:y) {
if(!is.na(a[i][j])){
if(a[i][j]>4){
z=z+1
}
}
j=j+1
}
print(z)
i=i+1
}
}
rowmax(aMat)
It shows the error given in the title. I don't want to use a built-in function.
An easier way is to count the values in each row that are greater than 4 using length.
rowmax2 <- function(x) apply(x, 1, function(x) {x <- na.omit(x);length(x[x > 4])})
rowmax2(aMat)
# [1] 8 7 8 7 4 3
If you wanted to do this absolutely without any shortcut you could use two for loops. 1 for each row and another for each value in the row.
rowmax = function(a) {
y=nrow(a)
result <- numeric(y)
for(j in seq_len(y)) {
count = 0
for(val in a[j, ]) {
if(!is.na(val) && val > 4)
count = count + 1
}
result[j] <- count
}
return(result)
}
rowmax(aMat)
#[1] 8 7 8 7 4 3
If you wanted to do this using in-built functions in base R you could use rowSums.
rowSums(aMat > 4, na.rm = TRUE)
#[1] 8 7 8 7 4 3
There are several errors in your code:
You should put z <- 0 inside the while loop, so the count is reset for each row
You should use a[i,j] for the matrix indexing, rather than a[i][j]
Below is a version after fixing the problems
rowmax <- function(a) {
x <- nrow(a)
y <- ncol(a)
i <- 1
j <- 1
while (i <= x) {
z <- 0
for (j in 1:y) {
if (!is.na(a[i, j])) {
if (a[i, j] > 4) {
z <- z + 1
}
}
j <- j + 1
}
print(z)
i <- i + 1
}
}
and then we get
> rowmax(aMat)
[1] 8
[1] 7
[1] 8
[1] 7
[1] 4
[1] 3
A more concise approach is to use rowSums on the logical matrix aMat > 4, e.g.,
rowSums(aMat > 4, na.rm = TRUE)
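To illustrate the indexing point, here is a small sketch on a made-up 2 x 3 matrix showing how a[i][j] differs from a[i, j]. The matrix a below is only an example and is not the aMat from the question.

```r
a <- matrix(1:6, nrow = 2)   # illustrative matrix: rows are 1 3 5 and 2 4 6

chained_1 <- a[2][1]   # a[2] is the 2nd element in column-major order, i.e. 2
chained_2 <- a[2][2]   # a[2] has length 1, so asking for its 2nd element gives NA
proper    <- a[2, 2]   # row 2, column 2

print(c(chained_1, chained_2, proper))
```

An NA reaching a comparison inside if() is what produces "missing value where TRUE/FALSE needed", which is consistent with the error in the title.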

How to make a function order some values in a matrix in R?

Trying to write a function to sort a matrix by rows.
I could write something to loop over the values on a vector of values but couldn't add complexity to make it loop over some matrix.
sww = function(x){
n <- length(x)
for(i in 1:(n-1)){
for (j in (i+1):n) {
if(x[i] > x[j]){
tmp = x[i]; x[i] = x[j]; x[j] = tmp
}
}
}
return(x)
}
Does anyone know how to make it loop over an entire matrix?
Edit:
By sorting a matrix by rows I meant to have a matrix like:
2 1 4 "Sorted by row" 1 2 4
5 4 0 --> 0 4 5
Thank you
Edit1: I know about the r functions but would like to write my own
Use apply:
m <- matrix(c(2, 5, 1, 4, 4, 0), 2) # test matrix
t(apply(m, 1, sort))
## [,1] [,2] [,3]
## [1,] 1 2 4
## [2,] 0 4 5
If you really want to loop over the rows:
mm <- m
for(i in 1:nrow(m)) mm[i, ] <- sort(m[i, ])
and, of course, you can replace sort with your own version if you wish.
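As a sketch of that last remark, the loop below plugs the sww function from the question into the row loop in place of sort. The only change to sww is an early return for inputs shorter than 2, which is an addition here so that 1:(n-1) cannot count backwards.

```r
# Exchange sort from the question, with a guard for very short inputs (added here)
sww <- function(x) {
  n <- length(x)
  if (n < 2) return(x)
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      if (x[i] > x[j]) {
        tmp <- x[i]; x[i] <- x[j]; x[j] <- tmp
      }
    }
  }
  x
}

m <- matrix(c(2, 5, 1, 4, 4, 0), 2)   # same test matrix as above
sorted <- m
for (i in seq_len(nrow(m))) sorted[i, ] <- sww(m[i, ])
print(sorted)
```

The result should match t(apply(m, 1, sort)), since each row is handled independently.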

Matching patterns in a matrix

My data looks like this:
S
0101001010000000000000000100111100000000000011101100010101010
1001010000000001100000000100000000000100000010101110101010010
1101010101010010000000000100000000100101010010110101010101011
0000000000000000001000000111000110000000000000000000000000000
The S marks the column I am talking about. It is col 26. All four rows share a 1 at that position.
I would need to be able to count for each row from 2 to 4:
How many columns to the left and right are the same as row 1?
For row 2 it would be 3 to the right (as it reaches 1/0) and 8 to the left (as it reaches 0/1).
The result for every row should be entered into a matrix like this:
row2 8 3
row3 11 9
Is there a fast and efficient way to do that? The matrix I am dealing with is very large.
If you need something fast, you could use Rcpp:
mat <- as.matrix(read.fwf(textConnection("0101001010000000000000000100111100000000000011101100010101010
1001010000000001100000000100000000000100000010101110101010010
1101010101010010000000000100000000100101010010110101010101011
0000000000000000001000000111000110000000000000000000000000000"), widths = rep(1, 61)))
library(Rcpp)
cppFunction('
IntegerMatrix countLR(const LogicalMatrix& mat, const int S) {
const int nr(mat.nrow()), nc(mat.ncol());
IntegerMatrix res(nr - 1, 2);
for(int i=1; i<nr;i++){
for(int j=S-2; j>=0;j--) {
if (mat(0,j) != mat(i,j)) break;
else res(i-1,0)++;
}
for(int j=S; j<nc;j++) {
if (mat(0,j) != mat(i,j)) break;
else res(i-1,1)++;
}
}
return(res);
}' )
countLR(mat, 26)
# [,1] [,2]
#[1,] 8 2
#[2,] 10 2
#[3,] 6 0
I assumed that column 26 itself doesn't count for the result. I also assumed that the matrix can only contain 0/1 (i.e., boolean) values. Adjust as needed.
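For checking the Rcpp result on small inputs, a plain R sketch of the same counting rule may help. It makes the same assumptions (column S itself is not counted, every row is compared to row 1); the names lead_run and count_lr are made up for this example, and it is not meant to be fast.

```r
rows <- c("0101001010000000000000000100111100000000000011101100010101010",
          "1001010000000001100000000100000000000100000010101110101010010",
          "1101010101010010000000000100000000100101010010110101010101011",
          "0000000000000000001000000111000110000000000000000000000000000")
mat <- do.call(rbind, lapply(strsplit(rows, ""), as.integer))

# Length of the run of TRUE values at the start of a logical vector
lead_run <- function(same) {
  first_diff <- which(!same)[1]
  if (is.na(first_diff)) length(same) else first_diff - 1L
}

# For rows 2..n: matching columns to the left and right of column S
count_lr <- function(mat, S) {
  ref <- mat[1, ]
  res <- vapply(seq_len(nrow(mat))[-1], function(i) {
    same <- mat[i, ] == ref
    left  <- if (S > 1) lead_run(rev(same[seq_len(S - 1)])) else 0L
    right <- if (S < ncol(mat)) lead_run(same[(S + 1):ncol(mat)]) else 0L
    c(left = left, right = right)
  }, c(left = 0L, right = 0L))
  t(res)
}

res <- count_lr(mat, 26)
print(res)
```

On the sample data this gives the same counts as the Rcpp output shown above.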
It's pretty easy with strsplit and rle to pull apart and assemble this data:
> S <- scan(what="") #input of character mode
1: 0101001010000000000000000100111100000000000011101100010101010
2: 1001010000000001100000000100000000000100000010101110101010010
3: 1101010101010010000000000100000000100101010010110101010101011
4: 0000000000000000001000000111000110000000000000000000000000000
5:
s2 <- strsplit(S, split="")
sapply(s2, "[[", 26) # verify the 26th position is all ones
#[1] "1" "1" "1" "1"
#length of strings from 26th position to right
rtlen <- length(s2[[1]])-(26-1)
# Pick from the `rle` $values where values TRUE
rle( tail( s2[[1]] == s2[[2]], rtlen) )
Run Length Encoding
lengths: int [1:11] 3 4 5 1 7 1 4 1 1 6 ...
values : logi [1:11] TRUE FALSE TRUE FALSE TRUE FALSE ...
Now that you have an algorithm for a single instance, you can iterate over the rest of the items in s2. To do the backwards look I just did the same operation on a rev-ersed section of the strings.
m<-matrix(NA, 3,2);
for (i in 2:4) { m[i-1,2] <- rle(tail( s2[[1]] == s2[[i]], rtlen) )$lengths[1]
m[i-1, 1] <- rle( rev( head( s2[[1]] == s2[[i]], 26)) )$lengths[1] }
m
[,1] [,2]
[1,] 9 3 # I think you counted wrong
[2,] 11 3
[3,] 7 1
Notice that I was comparing each one to the first row and your results suggest you were doing something else...perhaps comparing to the row above. That could easily be done instead with only a very small mod to the code indices for choice of the comparison vector:
m<-matrix(NA, 3,2);
for (i in 2:4) { m[i-1,2] <- rle(tail( s2[[i-1]] == s2[[i]], rtlen) )$lengths[1]
m[i-1, 1] <- rle( rev( head( s2[[i-1]] == s2[[i]], 26)) )$lengths[1] }
m
[,1] [,2]
[1,] 9 3
[2,] 9 9 #Again I think you may have miscounted. Easy to do, eh?
[3,] 7 1
This problem intrigued me. Since the matrix is binary, it's far more efficient to pack the matrix into a raw matrix than it is to use sparse matrices. It means that the storage for a 1,000 x 21,000,000 pattern matrix is approx. 2.4 GiB (print(object.size(raw(1000 * 21000000 / 8)), units = "GB")).
The following should be a relatively efficient way to tackle the problem. The Rcpp code takes a raw matrix which indicates the differences between the first row of the original matrix and the other rows. For efficiency in the R code, it's actually arranged with the patterns in columns rather than rows. The other functions help to convert existing sparse or regular matrices into packed ones and to read a matrix directly from a file.
library("Rcpp")
library("Matrix")
writeLines("0101001010000000000000000100111100000000000011101100010101010
1001010000000001100000000100000000000100000010101110101010010
1101010101010010000000000100000000100101010010110101010101011
0000000000000000001000000111000110000000000000000000000000000", "example.txt")
cppFunction('
IntegerMatrix countLRPacked(IntegerMatrix mat, long S) {
long l = S - 2;
long r = S;
long i, cl, cr;
int nr(mat.nrow()), nc(mat.ncol());
IntegerMatrix res(nc, 2);
for(int i=0; i<nc;i++){
// First the left side
// Work out which byte is the first to have a 1 in it
long j = l >> 3;
int x = mat(j, i) & ((1 << ((l & 7) + 1)) - 1);
long cl = l & 7;
while(j > 0 && !x) {
j --;
x = mat(j, i);
cl += 8;
}
// Then work out where the 1 is in the byte
while (x >>= 1) --cl;
// Now the right side
j = r >> 3;
x = mat(j, i) & ~((1 << ((r & 7))) - 1);
cr = 8 - (r & 7);
while(j < (nr-1) && !x) {
j ++;
x = mat(j, i);
cr += 8;
}
cr--;
while (x = (x << 1) & 0xff) --cr;
res(i, 0) = cl;
res(i, 1) = cr;
}
return(res);
}')
# Reads a binary matrix from file or character vector
# Borrows the first bit of code from read.table
readBinaryMatrix <- function(file = NULL, text = NULL) {
if (missing(file) && !missing(text)) {
file <- textConnection(text)
on.exit(close(file))
}
if (is.character(file)) {
file <- file(file, "rt")
on.exit(close(file))
}
if (!inherits(file, "connection"))
stop("'file' must be a character string or connection")
if (!isOpen(file, "rt")) {
open(file, "rt")
on.exit(close(file))
}
lst <- list()
i <- 1
while(length(line <- readLines(file, n = 1)) > 0) {
lst[[i]] <- packRow(as.integer(strsplit(line, "", fixed = TRUE)[[1]]))
i <- i + 1
}
do.call("cbind", lst)
}
# Converts a binary integer vector into a packed raw vector,
# padding out at the end to make the input length a multiple of 8
packRow <- function(row) {
packBits(as.raw(c(row, rep(0, (8 - length(row)) %% 8 ))))
}
# Converts a binary integer matrix to a packed raw matrix
# Note the matrix is transposed (makes the subsequent xor more efficient)
packMatrix <- function(mat) {
stopifnot(inherits(mat, c("matrix", "dgCMatrix"))) # class() has length 2 for matrices in R >= 4.0
apply(mat, 1, packRow)
}
# Takes either a packed raw matrix or a binary integer matrix, uses xor to compare all the first row
# with the others and then hands it over to the Rcpp code for processing
countLR <- function(mat, S) {
stopifnot(inherits(mat, c("matrix", "dgCMatrix"))) # class() has length 2 for matrices in R >= 4.0
if (storage.mode(mat) != "raw") {
mat <- packMatrix(mat)
}
stopifnot(8 * nrow(mat) > S)
y <- xor(mat[, -1, drop = FALSE], mat[, 1, drop = TRUE])
countLRPacked(y, S)
}
sMat <- Matrix(as.matrix(read.fwf("example.txt", widths = rep(1, 61))))
pMat <- readBinaryMatrix("example.txt")
countLR(sMat, 26)
countLR(pMat, 26)
You should note that the width of the pattern matrix is right-padded to a multiple of 8, so if the patterns match all the way to the right hand side this will result in the right hand count being possibly a bit high. This could be corrected if need be.
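The padding can be seen on a single row. This sketch reuses packRow as defined above on a made-up 61-bit row of ones and unpacks it again with rawToBits; the variable names are illustrative.

```r
# packRow as defined above: pad with zeros to a multiple of 8 bits, then pack
packRow <- function(row) {
  packBits(as.raw(c(row, rep(0, (8 - length(row)) %% 8))))
}

row <- rep(1, 61)                       # illustrative 61-column pattern
packed <- packRow(row)
bits <- as.integer(rawToBits(packed))   # unpack again, one element per bit

print(length(packed))   # number of bytes used for the 61-bit row
print(tail(bits, 3))    # the padding bits added on the right
```

Because the padding bits are zero in every row, they always compare as equal, which is where the possible overcount on the right-hand side comes from.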
Slow R version to do this (moved from duplicate):
countLR <- function(mat, S) {
mat2 <- mat[1, ] != t(mat[-1, , drop = FALSE])
l <- apply(mat2[(S - 1):1, ], 2, function(x) which(x)[1] - 1)
l[is.na(l)] <- S - 1
r <- apply(mat2[(S + 1):nrow(mat2), ], 2, function(x) which(x)[1] - 1)
r[is.na(r)] <- ncol(mat) - S
cbind(l, r)
}

Euler Project #1 in R

Problem
Find the sum of all numbers below 1000 that are divisible by 3 or 5
One solution I created:
x <- c(1:999)
values <- x[x %% 3 == 0 | x %% 5 == 0]
sum(values)
Second solution I can't get to work and need help with. I've pasted it below.
I'm trying to use a loop (here, I use while() and after this I'll try for()). I am still struggling with keeping references to indexes (locations in a vector) separate from values/observations within vectors. Loops seem to make it more challenging for me to distinguish the two.
Why does this not produce the answer to Euler #1?
x <- 0
i <- 1
while (i < 100) {
if (i %% 3 == 0 | i %% 5 == 0) {
x[i] <- c(x, i)
}
i <- i + 1
}
sum(x)
And in words, line by line this is what I understand is happening:
x gets value 0
i gets value 1
while object i's value (not the index #) is < 1000
if i is divisible by 3 or 5
add that number i to the vector x
add 1 to i (in order to keep the loop going to the defined limit of 1e3)
sum all items in vector x
I am guessing x[i] <- c(x, i) is not the right way to add an element to vector x. How do I fix this and what else is not accurate?
First, your loop runs until i < 100, not i < 1000.
Second, replace x[i] <- c(x, i) with x <- c(x, i) to add an element to the vector.
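Putting both fixes together, the loop from the question would look roughly like this (the variable total is added here only to hold the result):

```r
x <- 0
i <- 1
while (i < 1000) {                      # limit corrected from 100 to 1000
  if (i %% 3 == 0 | i %% 5 == 0) {
    x <- c(x, i)                        # append the value i to the vector x
  }
  i <- i + 1
}
total <- sum(x)
print(total)
```

The leading 0 in x does not affect the sum. Note that x[i] <- c(x, i) tried to store several values in the single position i, which is why it did not behave as intended.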
Here is a shortcut that performs this sum, which is probably more in the spirit of the problem:
3*(333*334/2) + 5*(199*200/2) - 15*(66*67/2)
## [1] 233168
Here's why this works:
In the set of integers [1,999] there are:
333 values that are divisible by 3. Their sum is 3*sum(1:333) or 3*(333*334/2).
199 values that are divisible by 5. Their sum is 5*sum(1:199) or 5*(199*200/2).
Adding these up gives a number that is too high by the sum of their intersection, i.e. the values that are divisible by 15. There are 66 such values, and their sum is 15*sum(1:66) or 15*(66*67/2).
As a function of N, this can be written:
f <- function(N) {
threes <- floor(N/3)
fives <- floor(N/5)
fifteens <- floor(N/15)
3*(threes*(threes+1)/2) + 5*(fives*(fives+1)/2) - 15*(fifteens*(fifteens+1)/2)
}
Giving:
f(999)
## [1] 233168
f(99)
## [1] 2318
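As a sanity check on the closed form, here is a small sketch comparing f with a brute-force sum for a few upper limits. The helper brute and the chosen limits are illustrative and not part of the original answer.

```r
f <- function(N) {
  threes <- floor(N/3)
  fives <- floor(N/5)
  fifteens <- floor(N/15)
  3*(threes*(threes+1)/2) + 5*(fives*(fives+1)/2) - 15*(fifteens*(fifteens+1)/2)
}

# Hypothetical brute-force reference: sum the qualifying integers directly
brute <- function(N) {
  x <- seq_len(N)
  as.numeric(sum(x[x %% 3 == 0 | x %% 5 == 0]))
}

limits <- c(9, 99, 999, 1000)
closed <- vapply(limits, f, numeric(1))
direct <- vapply(limits, brute, numeric(1))
print(rbind(limits, closed, direct))
```

The two rows should agree for every limit, since the formula is just inclusion-exclusion applied to three arithmetic series.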
And another way:
x <- 1:999
sum(which(x%%5==0 | x%%3==0))
# [1] 233168
A very efficient approach is the following:
div_sum <- function(x, n) {
# calculates the double of the sum of all integers from 1 to n
# that are divisible by x
max_num <- n %/% x
(x * (max_num + 1) * max_num)
}
n <- 999
a <- 3
b <- 5
(div_sum(a, n) + div_sum(b, n) - div_sum(a * b, n)) / 2
In contrast, a very short code is the following:
x=1:999
sum(x[!x%%3|!x%%5])
Here is an alternative that I think gives the same answer (using 99 instead of 999 as the upper bound):
iters <- 100
x <- rep(0, iters-1)
i <- 1
while (i < iters) {
if (i %% 3 == 0 | i %% 5 == 0) {
x[i] <- i
}
i <- i + 1
}
sum(x)
# [1] 2318
Here is the for-loop mentioned in the original post:
iters <- 99
x <- rep(0, iters)
for (i in 1:iters) { # for() sets i on each pass, so no manual counter is needed
if (i %% 3 == 0 | i %% 5 == 0) {
x[i] <- i
}
}
sum(x)
# [1] 2318

Count occurrences of condition in lapply

I am running a simulation in which I need to keep track of the number of times a particular condition occurs within a function call. I attempted to accomplish this with an assignment to a global object. It works if you run the function once, but if you lapply the function, as I'm doing, you get a single count of all the times the condition happened rather than a separate count for each element in the list fed to lapply.
Here's a dummy situation where the occurrence is evenness of a number:
FUN <- function(x){
lapply(1:length(x), function(i) {
y <- x[i]
if (y %% 2 == 0){
assign("count.occurrences", count.occurrences + 1, env=.GlobalEnv)
}
print("do something")
})
list(guy="x", count=count.occurrences)
}
#works as expected
count.occurrences <- 0
FUN(1:10)
count.occurrences <- 0
lapply(list(1:10, 1:3, 11:16, 9), FUN)
#gives me...
#> count.occurrences
#[1] 9
#I want...
#> count.occurrences
#[1] 5 1 3 0
It's in a simulation so speed is an issue. I want this to be as fast as possible so I'm not married to the global assignment idea.
Rather than assign to the global environment, why not just assign to inside FUN's environment?
FUN <- function(x){
count.occurances <- 0
lapply(1:length(x), function(i) {
y <- x[i]
if (y %% 2 == 0){
count.occurances <<- count.occurances + 1
}
print("do something")
})
list(guy="x", count=count.occurances)
}
Z <- lapply(list(1:10, 1:3, 11:16, 9), FUN)
Then you can just pull the counts out.
> sapply(Z, `[[`, "count")
[1] 5 1 3 0
I haven't done any benchmarking on this, but have you tried just using a for loop? I know that loops aren't generally encouraged in R, but they're also not always slower.
FUN <- function(x) {
count.occurrences = 0
for (i in 1:length(x)) {
y = x[i]
if (y %% 2 == 0) {
count.occurrences = count.occurrences + 1
}
print("do something")
}
list(guy="x", count=count.occurrences)
}
lapply(list(1:10, 1:3, 11:16, 9), FUN)
I can get it like this:
count.occurrences <- 0
Z <-lapply(list(1:10, 1:3, 11:16, 9), FUN)
diff(c(0, sapply(1:length(Z), function(x) Z[[x]]$count)))
I'm open to better ideas (faster).
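One further option, sketched here under the assumption that the condition can be reported as a TRUE/FALSE value per element: have the inner function return the flag and sum the flags afterwards, so no counter has to be modified from inside the loop. This is only an illustration (the print call is dropped and the per-element work is a placeholder), and it has not been benchmarked.

```r
FUN <- function(x) {
  hits <- vapply(x, function(y) {
    # ... the real per-element work of the simulation would go here ...
    y %% 2 == 0            # report whether the condition occurred
  }, logical(1))
  list(guy = "x", count = sum(hits))
}

Z <- lapply(list(1:10, 1:3, 11:16, 9), FUN)
counts <- vapply(Z, function(z) z$count, integer(1))
print(counts)
```

Because each call to FUN computes its own count, the results stay separate per list element without any global state.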
