Remove columns with lowest variance - r

I have a large matrix of low values, such as:
m <- matrix(c(0.000000217, 0.000000021, 0.000000403, 0.000000272,
0.000000209, 0.000000310, 0.000000161, 0.000000243,
0.000000375, 0.000000185, 0.000000298, 0.000000269),
nrow = 3, ncol = 4)
In what I'm working on, columns with low variance are causing issues. My actual matrix has over 7,000 rows. How can I remove the n columns with the lowest variance? I've tried various iterations of apply() with no success.

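Use apply and var to get the column variances, then order and head to pick the top or bottom n. With n set to the number of columns to drop, this keeps all but the n lowest-variance columns (returned in order of decreasing variance, not in their original order):
m[, head(order(apply(m, 2, var), decreasing = TRUE), -n)]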

Using %in% and which.min (this drops only the single lowest-variance column):
m[, !(seq_len(ncol(m)) %in% which.min(apply(m, 2, var)))]
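The two answers above can be combined into a minimal sketch that drops the n lowest-variance columns while keeping the remaining columns in their original order. The names n, col_var, drop_idx, keep_idx and m_kept are illustrative and do not come from the original answers.

```r
m <- matrix(c(0.000000217, 0.000000021, 0.000000403, 0.000000272,
              0.000000209, 0.000000310, 0.000000161, 0.000000243,
              0.000000375, 0.000000185, 0.000000298, 0.000000269),
            nrow = 3, ncol = 4)

n <- 2                                   # assumed number of columns to drop
col_var  <- apply(m, 2, var)             # variance of each column
drop_idx <- order(col_var)[seq_len(n)]   # indices of the n smallest variances
# setdiff() keeps the original column order and also behaves well when n is 0
keep_idx <- setdiff(seq_len(ncol(m)), drop_idx)
m_kept   <- m[, keep_idx, drop = FALSE]  # drop = FALSE keeps a matrix even with one column left
```

Using setdiff() rather than negative indexing avoids the edge case where an empty index vector would select no columns at all.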

Related

Coding numerical derivatives in R

I am totally new to R and I am struggling to write code to find the numerical derivatives of vector fields. I have two matrices U and V, e.g.,
U <- matrix(runif(9), nrow = 3, ncol = 3, byrow = T)
V <- matrix(runif(9), nrow = 3, ncol = 3, byrow = T)
These matrices (not the actual values, obviously) represent the components of a 2D wind vector field. I would like to code the numerical derivatives of the two vector components, du/dy and dv/dx. I have no idea how to do this in R. Please help. Sorry in advance if this question has been answered already.
What you are looking for is the diff() function. You can apply it efficiently over one dimension of a matrix using apply():
U <- matrix(runif(9), nrow = 3, ncol = 3, byrow = TRUE) # your wind component
apply(U, 2, diff) # differences down each column; use 1 instead of 2 to difference along each row (the result is then transposed)
Hope this helped.
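As a rough sketch of how the diff() approach could give both requested derivatives, the block below uses forward differences. It assumes that rows index y and columns index x, and that the grid spacings dx and dy are uniform; those names and their values are hypothetical and not part of the question.

```r
set.seed(1)
U <- matrix(runif(9), nrow = 3, ncol = 3, byrow = TRUE)
V <- matrix(runif(9), nrow = 3, ncol = 3, byrow = TRUE)

dx <- 1   # assumed grid spacing along x (columns)
dy <- 1   # assumed grid spacing along y (rows)

# du/dy: difference between consecutive rows, one fewer row than U
dU_dy <- apply(U, 2, diff) / dy

# dv/dx: difference between consecutive columns; apply() over rows returns
# the result transposed, so t() restores the row/column layout of V
dV_dx <- t(apply(V, 1, diff)) / dx
```

Each result has one fewer entry along the differenced dimension, so the two derivative matrices do not share a shape unless they are interpolated onto a common grid.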

How to initialize an m*n matrix in R with specific row and column names

I am trying to find a way to initialize a m*n matrix in R.
Let's say I have a seq of variable names c(a, b, c, d), and I would like to create a 4*10 matrix with c(a, b, c, d) being the vertical variable, and seq(1:10) to be horizontal variable, so I can check the matrix with the call matrix[a, 1].
Thanks in advance
We can create the matrix as
m1 <- matrix(nrow = 4, ncol = 10, dimnames = list(letters[1:4], NULL))
and use the row names and column index to extract elements
m1['a', 1]
Another base R option using row.names<-
`row.names<-`(matrix(nrow = 4, ncol = 10), head(letters, 4))
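If the column labels should be usable as names too, a sketch along the same lines could set both dimnames at once. The vector row_vars and the choice of NA_real_ as the fill value are assumptions made for illustration.

```r
row_vars <- c("a", "b", "c", "d")          # hypothetical row labels
m1 <- matrix(NA_real_,                     # numeric matrix, initially all NA
             nrow = length(row_vars), ncol = 10,
             dimnames = list(row_vars, as.character(1:10)))

m1["a", 1]   <- 3.5   # index the column by position
m1["b", "2"] <- 7     # or by its character name
```

Note that column names are character strings, so m1["a", "1"] and m1["a", 1] refer to the same cell only because the names happen to match the positions.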

Unique combinations of vector elements that fulfill criteria

I have a vector of integers, e.g., totalVector <- c(4,2,1), and two variables totalResult and totalNumber. What I want to do is the following:
I want to find all UNIQUE combinations of "totalNumber" elements from totalVector that add up to "totalResult". To clarify, if totalResult = 100 and totalNumber = 50, I want all combinations of 50 elements from totalVector that have a sum of 100 (repetitions are obviously allowed, but duplicate results such as 25 fours and 25 re-arranged fours should only be counted once).
I originally did this by expanding the total vector (repeating each element 50 times), getting all combinations of 50 elements with combn() and then filtering their sums. For large values however, this proved very inefficient, and failed due to the sheer amount of data. Is there a quicker and less data-heavy way to do this?
I think the OP is looking for the combinations with repetition of a vector that sum to a particular number. This will do it:
totalVector <- c(4,2,1)
totalNumber <- 50
totalResult <- 100
library(RcppAlgos)
myAns <- comboGeneral(totalVector, totalNumber, repetition = TRUE,
constraintFun = "sum", comparisonFun = "==",
limitConstraints = totalResult)
dim(myAns)
[1] 17 50
all(apply(myAns, 1, sum) == totalResult)
[1] TRUE
Disclaimer: I am the author of RcppAlgos
This would give you what you need for a small sample, but you will encounter issues with combinatorial explosion very quickly as you increase the size of the problem:
tv <- sample(1:10, 10, replace = TRUE)
tn <- 5
tr <- 20
combinations <- combn(tv, tn)
equals.tr <- apply(combinations, MARGIN = 2, FUN = function(x) sum(x) == tr)
combinations[, equals.tr]
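Because order does not matter, each unique combination is fully described by how many times each value is used. The base R sketch below enumerates those counts instead of the combinations themselves; it is only an illustration under that reformulation, and the names counts, ok and solutions are made up here. It stays small for a few distinct values but grows quickly as totalVector gets longer.

```r
totalVector <- c(4, 2, 1)
totalNumber <- 50
totalResult <- 100

# One column per distinct value, holding how many times that value is used
counts <- expand.grid(setNames(rep(list(0:totalNumber), length(totalVector)),
                               paste0("n_", totalVector)))

# Keep rows that use exactly totalNumber elements and sum to totalResult
ok <- rowSums(counts) == totalNumber &
  drop(as.matrix(counts) %*% totalVector) == totalResult
solutions <- counts[ok, ]
nrow(solutions)   # number of unique combinations
```

Each row of solutions can be expanded back into an explicit combination with rep(totalVector, times = unlist(solutions[i, ])) for a row index i.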

Combinations from grouped elements

I have a list of grouped elements and I want to make all possible combinations of these elements, however I only want to take one element from each group. Order does not matter.
vars_list <- list(Group1 = letters[1:5], Group2 = letters[6:9],
Group3 = letters[10:11], Group4 = letters[12:15])
Let's say I want to make combinations for n=2, n=3, n=4 where n is the number of groups I want to use.
I found a solution to do it when n=number of groups (Combinations from recursive lists) :
lengths <- c(1, 1, 1, 1)
combos <- expand.grid( mapply(function(element, n) combn(element, m=n,
FUN=paste0, collapse=""), vars_list, lengths, SIMPLIFY = F) )
How could I do this for n < number of groups?
You could use combn to get all combinations of your groups for n=1, n=2, n=3 and n=4, and then use expand.grid:
n = 2
apply(combn(1:length(vars_list), n), 2, function(x){expand.grid(vars_list[x])})
so for n=4, you would get the same as in your question. Is this what you meant?
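A slightly more explicit variant of the same idea is sketched below: it selects groups by name and returns a plain list with one data frame per group selection. The names group_sets, combos and total are illustrative only.

```r
vars_list <- list(Group1 = letters[1:5], Group2 = letters[6:9],
                  Group3 = letters[10:11], Group4 = letters[12:15])
n <- 2

# Every way of choosing n groups, as a list of character vectors
group_sets <- combn(names(vars_list), n, simplify = FALSE)

# For each choice of groups, all ways of taking one element per group
combos <- lapply(group_sets, function(g) {
  expand.grid(vars_list[g], stringsAsFactors = FALSE)
})

# Total number of element combinations across all group selections
total <- sum(vapply(combos, nrow, integer(1)))
```

The data frames are kept separate because each one has different column names (the groups it was built from), so they cannot be row-bound without first renaming the columns.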

How to get all rows of a dataframe NOT designated in a vector in R?

I'm looking for an elegant, R-like way to capture rows in a dataframe that don't have their indices listed in a vector:
table.combos <- matrix(data = 1:12, nrow = 10, ncol = 6, byrow=T)
table.combos
not.these<-c(2,4,5,9)
x<-table.combos[c(not.these),]
#y<- everything not in x
Just negate the same index vector:
y <- table.combos[-not.these,]
which tells R to choose all the rows of table.combos except those listed in the not.these vector.
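One caveat with negative indexing is worth a short sketch: if the exclusion vector is empty, -not.these is also empty and selects no rows rather than all of them. The setdiff() variant below is an alternative that handles that case; the names keep, y2, none, y_neg and y_set are illustrative.

```r
table.combos <- matrix(data = 1:12, nrow = 10, ncol = 6, byrow = TRUE)
not.these <- c(2, 4, 5, 9)

# Negative indexing, as in the answer above
y <- table.combos[-not.these, , drop = FALSE]

# Equivalent selection built from the rows to keep
keep <- setdiff(seq_len(nrow(table.combos)), not.these)
y2 <- table.combos[keep, , drop = FALSE]

# Edge case: nothing to exclude
none <- integer(0)
y_neg <- table.combos[-none, , drop = FALSE]   # zero rows
y_set <- table.combos[setdiff(seq_len(nrow(table.combos)), none), , drop = FALSE]   # all rows
```

For a fixed, non-empty exclusion vector the two forms give the same result, so the shorter negative index is fine in that case.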
