Matlab or R: replace elements in matrix by values from another matrix in order

I have a problem to solve in either Matlab or R (preferably in R).
Imagine I have a vector A with 10 elements.
I have also a vector B with 30 elements, of which 10 have value 'x'.
Now, I want to replace all the 'x' in B by the corresponding values taken from A, in the order that is established in A. Once a value in A is taken, the next one is ready to be used when the next 'x' in B is found.
Note that the sizes of A and B are different; it is the number of 'x' cells in B that matches the size of A.
I have tried different ways to do it. Any suggestion on how to program this?

As long as the number of x entries in B matches the length of A, this will do what you want:
B[B=='x'] <- A
(It should be clear that this is the R solution.)
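A small worked example may make the ordering behaviour concrete. The vectors below are made up for illustration (they are shorter than the ones in the question), and the stopifnot guard is an added assumption-check, not part of the original answer:

```r
# Illustrative data: A holds the replacement values, B holds 'x' placeholders
A <- c("a", "b", "c")
B <- c("p", "x", "q", "x", "x", "r")

# Guard: the number of placeholders must match the length of A
stopifnot(sum(B == "x") == length(A))

# Logical indexing fills the 'x' positions left to right with A[1], A[2], ...
B[B == "x"] <- A
B
# [1] "p" "a" "q" "b" "c" "r"
```

If A is numeric and B is a character vector, R will coerce the inserted values to character.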

MATLAB Solution
In MATLAB it's quite simple, use logical indexing:
B(B == 'x') = A;


How to find the length of a list based on a condition in R

The problem
I would like to find a length of a list.
The expected output
I would like to find the length based on a condition.
Example
Suppose that I have a list of 4 elements as follows:
myve <- list(1,2,3,0)
Here I have 4 elements, one of which is zero. How can I find the length after excluding the zero values? Then, if the length is > 1, I would like to subtract one. That is:
If the length is 4, then I would like to have 4-1=3. So, the output should be 3.
Note
Please note that I am working with a problem where the number of zero values may change from one case to another. For example, the first list may have only one 0 value, while the second list may have 2 or 3 zero values.
The values are always positive or zero.
You just need to apply the condition to each element. This produces a vector of logicals, which you then sum to get the number of TRUE elements (i.e. the elements satisfying your condition).
In your case:
sum(myve != 0)
In a more complex case, where the condition is expressed by a function f, apply f to each element and sum the result:
sum(sapply(myve, f))
Use sapply to flag the elements that are different from zero, and sum to count them:
sum(sapply(myve, function(x) x!=0))
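As a minimal sketch of the same idea with a named condition, the snippet below uses vapply, which checks that every result is a single logical value. The helper name is_nonzero is hypothetical and only stands in for whatever condition is needed:

```r
myve <- list(1, 2, 3, 0)

# Hypothetical condition function; any function returning TRUE/FALSE works
is_nonzero <- function(x) x != 0

# vapply returns a logical vector; sum counts the TRUE entries
n_nonzero <- sum(vapply(myve, is_nonzero, logical(1)))
n_nonzero
# [1] 3
```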

transforming a matrix in R by

I want to make a new matrix B from a previous matrix A, where B has the same number of rows and columns as A and every position holds a ranking of the corresponding entry of A.
In particular, for the value x at any location [i,j] in A, I want to find how many values in A are greater than x (that is, sum(A>x), which I can find when x is discrete, but not for any x), followed by division by the total number of entries (observations*variables) in the matrix A.
I think using the apply function would be able to create matrix B as I wish, but I'm having trouble finding a way to apply "sum" for each position (i.e., sum(A>x)/# of positions in A).
I think I could use apply(A, c(1,2), FUN(X...)), but I do not know what function I can use.
Thanks for any suggestions.
Short version: matrix((length(M) - rank(M))/length(M), nrow=nrow(M), ncol=ncol(M))
Long version:
length(M) will give you the number of elements in the matrix.
length(M) - rank(M) will give the number of elements greater than each element.
So you want (length(M) - rank(M)) / length(M) but formatted into a matrix like M, so
matrix((length(M) - rank(M))/length(M), nrow=nrow(M), ncol=ncol(M))
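One way to sanity-check the rank approach is to compare it with the direct sum(A > x) computation on a small matrix. The matrix below is made up for illustration. Note that rank defaults to ties.method = "average", so tied values get fractional counts; the sketch passes ties.method = "max" on the assumption that a strict "greater than" count is wanted:

```r
# Illustrative matrix containing a tie (the value 2 appears twice)
A <- matrix(c(5, 2, 9, 2, 7, 1), nrow = 2)

# Direct version: for each cell, the share of entries strictly greater than it
direct <- apply(A, c(1, 2), function(x) sum(A > x)) / length(A)

# Rank version: with ties.method = "max", rank(x) counts entries <= x,
# so length(A) - rank is the number of entries strictly greater than x
via_rank <- matrix((length(A) - rank(A, ties.method = "max")) / length(A),
                   nrow = nrow(A), ncol = ncol(A))

isTRUE(all.equal(direct, via_rank))
# [1] TRUE
```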

Find n closest non-NA values to position t in vector

This is probably a simple question for those experienced in R, but it is something that I (a novice) am struggling with...
I have two examples of vectors that are common to the problem I am trying to solve, A and B:
A <- c(1,3,NA,3,NA,4,NA,1,7,NA,2,NA,9,9,10)
B <- c(1,3,NA,NA,NA,NA,NA,NA,NA,NA,2,NA,9)
#and three scalars
R <- 4
t <- 5
N <- 3
There is a fourth scalar, n, where 0<=n<=N. In general, N <= R.
I want to find the n closest non-NA values to t such that they fall within a radius R centered on t. I.e., the search radius, R, comprises R+1 values. For example A, the search radius sequence is (3,NA,3,NA,4,NA,1), where t=NA, the middle value in the search radius sequence.
The expected answer can be one of two results for A:
answerA1 <- c(3,4,1)
OR
answerA2 <- c(3,4,3)
The expected answer for B:
answerB <- c(1,3)
How would I accomplish this task in the most time- and space-efficient manner? One liners, loops, etc. are welcome. If I have to choose a preference, it is for speed!
Thanks in advance!
Note:
For this case, I understand that the third closest non-NA value may involve choosing a preference for the third value to fall on either the right or left of t (as shown by the two possible answers above). I do not have a preference for whether this value falls to the left or the right of t but, if there is a way to leave it to random chance (whether the third value falls to the right or the left), that would be ideal (but, again, it is not a requirement).
A relatively short solution is:
orderedA <- A[order(abs(seq_len(length(A)) - t))][seq_len(R*2)]
n_obj <- min(sum(!is.na(orderedA)), N, length(na.omit(orderedA)))
res <- na.omit(orderedA)[seq_len(n_obj)]
res
#[1] 3 4 3
Breaking this down a little more, the steps are:
1) Order A by the absolute distance from the position of interest, t.
Code is: A[order(abs(seq_len(length(A)) - t))]
2) Subset to the first R*2 elements (so this will get the elements on either side of t within R).
Code is: [seq_len(R*2)]
3) Get the number of elements to keep: min(N, # of non-NA, len of non-NA).
Code is: min(sum(!is.na(orderedA)), N, length(na.omit(orderedA)))
4) Drop NA.
Code is: na.omit()
5) Take the first elements, as many as determined in step 3 (whichever is smaller).
Code is: [seq_len(n_obj)]
Something like this?
thingfinder <- function(A,R,t,n) {
  left <- A[t:(t-R-1)]
  right <- A[t:(t+R+1)]
  leftrightmat <- cbind(left,right)
  raw_ans <- as.vector(t(leftrightmat))
  ans <- raw_ans[!is.na(raw_ans)]
  return(ans[1:n])
}
thingfinder(A=c(1,3,NA,3,NA,4,NA,1,7,NA,2,NA,9,9,10), R=3, t=5, n=3)
## [1] 3 4 3
This would give priority to the left side, of course.
In case it is helpful to others, @Mike H. also provided me with a solution to return the index positions associated with the desired vector elements res:
A <- setNames(A, seq_len(length(A)))
orderedA <- A[order(abs(seq_len(length(A)) - t))][seq_len(R*2)]
n_obj <- min(sum(!is.na(orderedA)), N, length(na.omit(orderedA)))
res <- na.omit(orderedA)[seq_len(n_obj)]
positions <- as.numeric(names(res))
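The question's note asks whether ties between the left and right side could be left to chance. A possible sketch is below. The function name closest_non_na is hypothetical, and it assumes the search window is positions t-R through t+R, clipped to the ends of the vector, which is one reading of the radius described in the question:

```r
# Hypothetical helper: up to N closest non-NA values within R positions of t,
# with equal-distance ties broken at random
closest_non_na <- function(x, t, R, N) {
  # Candidate positions inside the window, clipped to the vector bounds
  idx <- seq(max(1, t - R), min(length(x), t + R))
  idx <- idx[!is.na(x[idx])]
  # Sort by distance from t; the random second key shuffles equal distances
  ord <- order(abs(idx - t), runif(length(idx)))
  keep <- idx[ord][seq_len(min(N, length(idx)))]
  x[keep]
}

A <- c(1,3,NA,3,NA,4,NA,1,7,NA,2,NA,9,9,10)
B <- c(1,3,NA,NA,NA,NA,NA,NA,NA,NA,2,NA,9)
closest_non_na(A, t = 5, R = 4, N = 3)  # 3 and 4 in either order, then 3 or 1
closest_non_na(B, t = 5, R = 4, N = 3)  # only two non-NA values are in range
```

Replacing the last line of the function with `keep` would return the positions instead of the values.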

Generate sets from given overlap matrix

Note: I edited the original question to explain more precisely.
While I was doing a simulation for my new method, I needed to generate a special type of dataset consisting of multiple subsets. The problem is that there are some "shared" variables across the subsets, and the number of shared variables is called "overlap" here. Since the distribution of the overlap proportion is given, I need to generate an appropriate list of variables whose overlap follows the given distribution. But I have failed to implement such an algorithm...
I am not sure whether there is a specific algorithm for this kind of question, but I have failed to find such a thing after a long search.
I prefer an R solution, but anything else will also be very appreciated. Please help me to solve this problem! Thank you so much in advance!
Below is a standardized explanation of my problem. I tried to explain it as generally as I can, but please give me any suggestions if it is not sufficient.
Purpose: Generate n sets from a given overlap matrix of elements. Each set contains k elements.
Input: There is an n*n matrix whose (i,j)th cell value represents the number of overlapping elements between the (i)th set and the (j)th set.
Output: A list of k element identifiers (anything can be used, such as numbers) for each of the n sets.
Assumption: The number of elements for each set is k, and it is the same across all n sets. Hence, the input matrix is symmetric.
Example (assumes k=3 and n=3)
Input
3 1 0
1 3 1
0 1 3
Output
Set 1: A B C
Set 2: A D E
Set 3: D F G
In the above example input, (1,2)th and (2,1)th cells are 1 because set 1 and 2 share "A" element and vice versa, and diagonal cells are 3(=k) because each set shares all elements with itself.
I would repeat the following process until I had accounted for all the matrix entries:
1) Treat the matrix as the adjacency matrix of a graph, and find the largest clique in it. That is, find the largest possible set S of indexes such that M(i,j) > 0 for all i, j in set S.
2) Create an item that is in all of the sets which correspond to the indexes in S. In fact, if the minimum value of M(i,j) over S is v, create v such items.
3) Subtract v from M(i,j) for all i, j in set S, accounting for the counts generated by the items you have just created.
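The three steps above can be sketched in R roughly as follows. This is only an illustration under stated assumptions: the function name sets_from_overlap is hypothetical, the largest clique is found by brute force over all index subsets (so it is only practical for small n), and the diagonal is treated as part of the clique test so that single-set items are created once all shared items are placed. The greedy procedure is not guaranteed to reproduce every valid overlap matrix exactly.

```r
# Hypothetical sketch of the greedy clique procedure (small n only)
sets_from_overlap <- function(M) {
  n <- nrow(M)
  sets <- vector("list", n)
  next_id <- 1
  # All non-empty index subsets, largest first
  subsets <- unlist(lapply(n:1, function(k) combn(n, k, simplify = FALSE)),
                    recursive = FALSE)
  repeat {
    found <- FALSE
    for (S in subsets) {
      sub <- M[S, S, drop = FALSE]
      if (all(sub > 0)) {                      # step 1: largest clique
        v <- min(sub)
        new_items <- next_id + seq_len(v) - 1  # step 2: create v shared items
        for (i in S) sets[[i]] <- c(sets[[i]], new_items)
        next_id <- next_id + v
        M[S, S] <- sub - v                     # step 3: subtract their counts
        found <- TRUE
        break
      }
    }
    if (!found) break
  }
  sets
}

M <- matrix(c(3, 1, 0,
              1, 3, 1,
              0, 1, 3), nrow = 3, byrow = TRUE)
sets_from_overlap(M)
```

On the example input this yields three sets of three numeric identifiers, where sets 1 and 2 share one item, sets 2 and 3 share one item, and sets 1 and 3 share none.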

Make vector with 2 elements with equal chance in R

I want to create an R vector made up of two repeated elements. The length of the vector is 200.
Each element can be either 'x' or 'y'.
An element can be x or y with equal chance.
Is there any built-in function in R to do the above task?
Could someone please help?
A possible way to do it is to use rbinom. Step by step, first generate a vector of 0s and 1s, then change it into x and y:
vec = ifelse(rbinom(200, 1, 0.5)==0,"x","y")
We need a little bit more information to be helpful, but if you want a vector of 200 values, 100 x's and 100 y's, then just do this:
t <- rep(c('X','Y'), 100)
If you want this in a random order:
t <- sample(t)
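If each element should independently be 'x' or 'y' with probability 0.5 (rather than exactly 100 of each), sampling with replacement is another option. The seed below is an arbitrary choice added only to make the illustration reproducible:

```r
set.seed(1)  # arbitrary seed, only for reproducibility of this illustration

# Draw 200 values from c("x", "y"), each with equal probability by default
vec <- sample(c("x", "y"), size = 200, replace = TRUE)

# The counts will usually be near 100 each, but not exactly
table(vec)
```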
