Need help coding an answer in R

I do not understand how the x and y vectors relate to each other in this question. What confuses me most is that the answer is a single vector.
I tried taking the negative rank (i.e. rank(-s)) and then trying
rev(rank(x)) and rev(rank(y)), but I get incorrect ranks. Here's my code:
x <- c(-1, 0, 1, 2, -3) #vector x
y <- c(1, 3, 2, 5, 8) # Vector y
S <- x[order(match(x,y))] #orders and matches x with y
R <- rank(-S) #ranks S in element number order
print(R) #I want R = (5 2 1 3 4)
instead, I get [2 1 4 3 5]
with this other code:
never <- rev(rank(-x)) #takes the reverse rank of x
bird <- rev(rank(y)) #takes the reverse rank of y
kid <- x[order(match(never, bird))] #orders the match of the reverse ranks
tr <- rank(kid) #ranks the match of the reverse ranks
print(tr)
I get [2 1 3 4 5].

As I understand the question (which is not clear, I agree), there seems to be an error, but $x$ and $y$ are linked by the $s_i$ index.
Following the question: Let us find the $s_1$ index thanks to $y$. We know that $y_{s_1}$ is the largest value of $y$, hence $y_5$ in this case. Thus, $s_1 = 5$, so we need to find the reverse rank of $x_5 = -3$. As this is the lowest value, we have $r_1 = 5$.
Now with this approach, we would have $s_2 = 4$, so my guess is that there is an error in the example or the question.
For the example, I get $R = (5, 1, 3, 2, 4)$.
Note that if you sort $y$ in decreasing order and apply the same permutation to $x$ to obtain $x'$, then $R$ is the vector of reverse ranks of $x'$.
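A minimal sketch of the interpretation described above, assuming that $s$ is the permutation that sorts $y$ in decreasing order and that "reverse rank" means rank 1 goes to the largest value. The names `s` and `x_sorted` are introduced here for illustration only.

```r
x <- c(-1, 0, 1, 2, -3)
y <- c(1, 3, 2, 5, 8)

# s[i] is the index of the i-th largest value of y
s <- order(y, decreasing = TRUE)   # 5 4 2 3 1

# reorder x by the permutation that sorts y
x_sorted <- x[s]                   # -3 2 0 1 -1

# reverse rank: the largest value of x_sorted gets rank 1
R <- rank(-x_sorted)
print(R)                           # 5 1 3 2 4
```

This reproduces the vector quoted in this answer, not the (5 2 1 3 4) expected in the question.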

I figured it out.
x <- c(-1,0,1,2,-3)
y <- c(1,3,2,5,8)
myrank <- function(x, y) {
  # reverse rank of y: the largest value of y gets 1
  b <- length(y) + 1 - rank(y)
  # x[b] indexes x by b; then take the reverse rank of the result
  R <- length(y) + 1 - rank(x[b])
  print(R)
}
myrank(x,y)

Related

In R, How to apply if statement in matrix

Recently I have been trying to simulate a game.
I throw 2 dice at the same time. If the sum of the 2 dice is greater than or equal to 10, I win 1 point.
If it is lower than 10, I lose 1 point. I do this 1000 times.
To begin, I draw 2000 random samples with set.seed(1234):
set.seed(1234)
d = sample(c(1:6), size = 2000, replace = T)
d
And then, I turn it into a matrix, and sum each row
a = matrix(d, nrow=1000, ncol=2, byrow=T)
t = rowSums(a)
t
Now I have 1000 elements (the sum of the two dice for each throw). I would like to create a vector X holding the point I get on each throw.
How can I apply an if statement to create the vector X?
Thank you very much
Do you mean this?
X <- ifelse(t>=10,1,-1)
or
X <- 2*(t>=10)-1
Using case_when
library(dplyr)
case_when(t >= 10 ~ 1, TRUE ~ -1)
You could assign a temporary variable and assign points by comparing the values.
tmp <- t
t[tmp >= 10] <- 1
t[tmp < 10] <- -1
Or without a temporary variable.
t1 <- c(-1, 1)[(t >= 10) + 1]
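As a quick check, the following sketch runs the whole simulation and confirms that the vectorised approaches above give the same scores. The names `rolls` and `total` are assumptions introduced here (used instead of `a` and `t` so that the base function `t()` is not masked).

```r
set.seed(1234)
d <- sample(1:6, size = 2000, replace = TRUE)

# one row per throw, one column per die
rolls <- matrix(d, nrow = 1000, ncol = 2, byrow = TRUE)
total <- rowSums(rolls)

# three equivalent ways to score each throw: +1 if the sum is >= 10, else -1
X1 <- ifelse(total >= 10, 1, -1)
X2 <- 2 * (total >= 10) - 1
X3 <- c(-1, 1)[(total >= 10) + 1]

all(X1 == X2) && all(X1 == X3)  # TRUE
sum(X1)                         # net score over the 1000 throws
```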

Count "changes in direction" in a vector in R

I need to count how many times a variable inverts its growth pattern - from increasing values to decreasing values (as well as from decreasing values to increasing values). In the following example, I should be able to find 4 such inversions. How can I create a new dummy variable that shows such inversions?
x <- c(1:20,19:5,6:15,12:9,10:11)
plot(x)
You're effectively asking "when is the second derivative of x not equal to zero?", so you could just do a double diff:
x <- c(1:20,19:5,6:15,12:9,10:11)
plot(seq_along(x), x)
changes <- c(0, diff(diff(x)), 0) != 0
To show it picks the right points, colour them red.
points(seq_along(x)[changes], x[changes], col = "red")
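One caveat: the double diff flags every change in slope, not only reversals. For the example data it marks 5 points, because the jump from 15 to 12 followed by steps of -1 also changes the second difference. A possible variant, sketched here under the assumption that x has no flat runs (no zero differences), compares only the sign of successive differences. The names `direction` and `turn` are illustrative.

```r
x <- c(1:20, 19:5, 6:15, 12:9, 10:11)

# +1 where x increases, -1 where it decreases
direction <- sign(diff(x))

# TRUE at each element where the direction flips; pad both ends to match length(x)
turn <- c(FALSE, diff(direction) != 0, FALSE)

sum(turn)    # 4
which(turn)  # 20 35 45 49
```

Here `turn` can serve as the dummy variable asked for in the question.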
This function will return the indices at which the direction changed:
get_change_indices <- function(x){
  # return NULL if x contains one (or none, if NULL) unique elements
  if(length(unique(x)) <= 1) return(NULL)
  # make x named, so we can recapture its indices later
  x <- setNames(x, paste0("a", seq_along(x)))
  # calculate diff between successive elements
  diff_x <- diff(x)
  # remove points that are equal to zero
  diff_x <- diff_x[!diff_x==0]
  # identify indices of changepoints
  diff_x <- c(diff_x[1], diff_x)
  change_ind <- NULL
  for(i in 2:length(diff_x)){
    if(sign(diff_x[i]) != sign(diff_x[i-1])){
      change_ind_curr <- as.numeric(gsub("a", "", names(diff_x[i]))) - 1
      change_ind <- c(change_ind, change_ind_curr)
    }
  }
  change_ind
}
The length of its output is the number of changes.
Note that it also works when the change in x is non-linear, e.g. if x <- c(1, 4, 9, 1).

Sampling from a subset of data

I have the following problem.
I have multiple subarrays (say 2) that I have populated with character labels (1, 2, 3, 4, 5). My algorithm selects labels at random based on occurrence probabilities.
How can I get R to instead select labels 1:3 for subarray 1 and 4:5 for subarray 2, say, without using subsetting (i.e., []). That is, I want a random subset of labels to be selected for each subarray, instead of all labels assigned to each subarray manually using [].
I know sample() should help.
Using subsetting (which I don't want) one would do
x <- 1:5
sample(x[1:3], size, prob = probs[1:3])
but this assigns labels 1:3 to ALL subarrays.
Would
sample(sample(x), size, replace = TRUE, prob = probs)
work?
Any ideas? Please let me know if this is unclear.
Here is a small example, which selects labels from 1:5 for each of 10 subarrays.
set.seed(1)
N <- 10
K <- 2
Hstar <- 5
probs <- rep(1/Hstar, Hstar)
perms <- 5
## Set up container(s) to hold the identity of each individual from each permutation ##
num.specs <- ceiling(N / K)
## Create an ID for each haplotype ##
haps <- 1:Hstar
## Assign individuals (N) to each subpopulation (K) ##
specs <- 1:num.specs
## Generate permutations, assume each permutation has N individuals, and sample those individuals' haplotypes from the probabilities ##
gen.perms <- function() {
  sample(haps, size = num.specs, replace = TRUE, prob = probs) # I would like each subarray to contain a random subset of 1:5.
}
pop <- array(dim = c(perms, num.specs, K))
for (i in 1:K) {
  pop[,, i] <- replicate(perms, gen.perms())
}
pop
Hopefully this helps.
I think what you actually want is something like this:
num.specs <- 3
haps[sample(seq(haps), size = num.specs, replace = FALSE)]
[1] 3 5 4
That is, a random subset of your vector haps?
Not quite what you want (returns list of matrices instead of 3D array) but this might help
lapply(split(1:5, cut(1:5, breaks=c(0, 2, 5))), function(i) matrix(sample(i, 25, replace=TRUE), ncol=5))
Use cut and split to partition your vector of character labels before sampling them. Here I split your character labels at the value 2. Also, rather than sampling 5 numbers 5 times, you can sample 25 numbers once, and convert to matrix.
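If a 3D array is needed, one possible sketch is to partition the labels at random and then sample within each group. This assumes equal sampling probabilities inside each group and a random (rather than fixed) partition; the names `groups`, `labs`, and `draws` are hypothetical and not taken from the question.

```r
set.seed(1)
haps <- 1:5      # label IDs
K <- 2           # number of subarrays
perms <- 5       # rows per subarray
num.specs <- 5   # columns per subarray

# randomly partition the labels into K disjoint groups
groups <- split(sample(haps), rep(seq_len(K), length.out = length(haps)))

pop <- array(dim = c(perms, num.specs, K))
for (i in seq_len(K)) {
  labs <- groups[[i]]
  # index via sample.int() so that a group with a single label is handled correctly
  draws <- labs[sample.int(length(labs), perms * num.specs, replace = TRUE)]
  pop[, , i] <- matrix(draws, nrow = perms)
}
pop
```

Each slice `pop[, , i]` then contains only the labels assigned to subarray `i`.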

How to skip an error in a loop

I want to skip an error (if there is one) in a loop and continue with the next iteration. I want to compute the inverses of 100 2-by-2 matrices whose elements are randomly sampled from {0, 1, 2}. Some of these matrices may be singular, for example:
1 0
2 0
Here is my code:
set.seed(1)
count <- 1
inverses <- vector(mode = "list", 100)
repeat {
  x <- matrix(sample(0:2, 4, replace = T), 2, 2)
  inverses[[count]] <- solve(x)
  count <- count + 1
  if (count > 100) break
}
At the third iteration, the matrix is singular and the code stops running with an error message. In practice, I would like to bypass this error and continue to the next loop. I know I need to use a try or tryCatch function but I don't know how to use them. Similar questions have been asked here, but they are all really complicated and the answers are far beyond my understanding. If someone can give me a complete code specifically for this question, I really appreciate it.
This would put NULLs into inverses for the singular matrices:
inverses[[count]] <- tryCatch(solve(x), error=function(e) NULL)
If the first expression in a call to tryCatch raises an error, it executes and returns the value of the function supplied to its error argument. The function supplied to the error arg has to take the error itself as an argument (here I call it e), but you don't have to do anything with it.
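You could then drop the NULL entries with inverses[!sapply(inverses, is.null)]. (Note that is.null(inverses) tests the list as a whole and returns a single FALSE, so it cannot be used directly as an index.)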
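Put together, a complete loop along these lines might look as follows. This is a sketch rather than the only way to do it: it uses a for loop instead of repeat, and it assigns through `inverses[i] <- list(res)` because assigning NULL with `[[<-` deletes a list element instead of storing NULL. The names `n`, `res`, `ok`, and `valid` are introduced here for illustration.

```r
set.seed(1)
n <- 100
inverses <- vector(mode = "list", n)

for (i in seq_len(n)) {
  x <- matrix(sample(0:2, 4, replace = TRUE), 2, 2)
  # return NULL instead of stopping when solve() signals an error
  res <- tryCatch(solve(x), error = function(e) NULL)
  # wrap in list() so that a NULL result keeps its slot
  inverses[i] <- list(res)
}

# keep only the matrices that could be inverted
ok <- !vapply(inverses, is.null, logical(1))
valid <- inverses[ok]
length(valid)  # number of non-singular matrices
```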
Alternatively, you could use the lower level try. The choice is really a matter of taste.
count <- 0
repeat {
  if (count == 100) break
  count <- count + 1
  x <- matrix(sample(0:2, 4, replace = T), 2, 2)
  x.inv <- try(solve(x), silent=TRUE)
  if ('try-error' %in% class(x.inv)) next
  else inverses[[count]] <- x.inv
}
If your expression generates an error, try returns an object with class try-error. It will print the message to screen if silent=FALSE. In this case, if x.inv has class try-error, we call next to stop the execution of the current iteration and move to the next one, otherwise we add x.inv to inverses.
Edit:
You could avoid using the repeat loop with replicate and lapply.
matrices <- replicate(100, matrix(sample(0:2, 4, replace=T), 2, 2), simplify=FALSE)
inverses <- lapply(matrices, function(mat) if (det(mat) != 0) solve(mat))
It's interesting to note that the second argument to replicate is treated as an expression, meaning it gets executed afresh for each replicate. This means you can use replicate to make a list of any number of random objects that are generated from the same expression.
Instead of using tryCatch you could simply calculate the determinant of the matrix with the function det. A matrix is singular if and only if the determinant is zero.
Hence, you could test whether the determinant is different from zero and calculate the inverse only if the test is positive:
set.seed(1)
count <- 1
inverses <- vector(mode = "list", 100)
repeat {
  x <- matrix(sample(0:2, 4, replace = T), 2, 2)
  # if (det(x)) inverses[[count]] <- solve(x)
  # a more robust replacement for the above line (see comment):
  if (is.finite(determinant(x)$modulus)) inverses[[count]] <- solve(x)
  count <- count + 1
  if (count > 100) break
}
Update:
It is, however, possible to avoid generating singular matrices. The determinant of a 2-by-2 matrix mat is defined as mat[1] * mat[4] - mat[3] * mat[2]. You can use this knowledge when sampling the random numbers: simply do not sample numbers that would produce a singular matrix. Which numbers are excluded depends, of course, on the numbers sampled before.
set.seed(1)
count <- 1
inverses <- vector(mode = "list", 100)
set <- 0:2 # the set of numbers to sample from
repeat {
  # sample the first value
  x <- sample(set, 1)
  # if the first value is zero, the second and third one are not allowed to be zero.
  # (if/else is used because ifelse() would return only a single element here)
  new_set <- if (x == 0) setdiff(set, 0) else set
  # sample the second and third value
  x <- c(x, sample(new_set, 2, replace = T))
  # calculate which 4th number would result in a singular matrix
  not_allowed <- abs(-x[3] * x[2] / x[1])
  # remove this number from the set
  new_set <- setdiff(0:2, not_allowed)
  # sample the fourth value and build the matrix
  x <- matrix(c(x, sample(new_set, 1)), 2, 2)
  inverses[[count]] <- solve(x)
  count <- count + 1
  if (count > 100) break
}
This procedure guarantees that all generated matrices have an inverse.
try is just a way of telling R: "If you commit an error inside the following parentheses, then skip it and move on."
So if you're worried that x <- matrix(sample(0:2, 4, replace = T), 2, 2) might give you an error, then all you have to do is:
try(x <- matrix(sample(0:2, 4, replace = T), 2, 2))
However, keep in mind then that x will be undefined if you do this and it ends up not being able to compute the answer. That could cause a problem when you get to solve(x) - so you can either define x before try or just "try" the whole thing:
try(
{
x <- matrix(sample(0:2, 4, replace = T), 2, 2)
inverses[[count]] <- solve(x)
}
)
The documentation for try explains your problem pretty well. I suggest you go through it completely.
Edit: The documentation example looked pretty straightforward and very similar to the op's question. Thanks for the suggestion though. Here goes the answer following the example in the documentation page:
# `idx` is used as a dummy variable here just to illustrate that
# all 100 entries are indeed calculated. You can remove it.
set.seed(1)
mat_inv <- function(idx) {
  print(idx)
  x <- matrix(sample(0:2, 4, replace = T), nrow = 2)
  solve(x)
}
inverses <- lapply(1:100, function(idx) try(mat_inv(idx), TRUE))

Constrained randomization of column order in a data.frame

I am trying to duplicate each column of a data frame and move it to a random position within 1-3 columns of its original location, and to do this for every column in the data frame. I want each column to move AT LEAST one position to the left or right. Of course, sample(data) reorders the columns randomly, but my attempts to put it in a loop are embarrassingly bad (I admit I skipped the majority of my linear algebra classes, damn...). Below is some example data:
dat <- read.table(textConnection(
"-515.5718 94.33423 939.6324 -502.9918 -75.14629 946.6926
-515.2283 96.10239 939.5687 -503.1425 -73.39015 946.6360
-515.0044 97.68119 939.4177 -503.4021 -71.79252 946.6909
-514.7430 99.59141 939.3976 -503.6645 -70.08514 946.6887
-514.4449 101.08511 939.2342 -503.9207 -68.48133 946.7183
-514.2769 102.29453 939.0013 -504.2665 -67.04509 946.7809
-513.9294 104.02753 938.9436 -504.4703 -65.34361 946.7899
-513.5900 105.49624 938.7684 -504.7405 -63.75965 946.7991"
),header=F,as.is=T)
sample(dat)#random columns position
How about this brute-force but plenty-fast solution?
It tries out different permutations of the columns until it finds one in which each column is moved at least 1, and not more than 3 columns to left or right. When it finds such a permutation, the test in the final line of the while() call evaluates to FALSE, terminating the loop and leaving the variable x containing the acceptable permutation.
n <- ncol(dat)
while({x <- sample(n) # Proposed new column positions
y <- seq_len(n) # Original column positions
max(abs(x - y)) > 3 | min(abs(x - y)) == 0
}) NULL
dat[x]
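The same rejection-sampling idea can be written with an explicit repeat loop, which makes the acceptance test easier to read and check. This is only a sketch: it assumes a 6-column data frame as in the question, and the name `displacement` is introduced here for illustration.

```r
set.seed(42)
n <- 6  # number of columns, as in the example data

repeat {
  x <- sample(n)                        # proposed column order
  displacement <- abs(x - seq_len(n))   # how far each column moved
  # accept only if every column moved by at least 1 and at most 3 positions
  if (min(displacement) >= 1 && max(displacement) <= 3) break
}

x
displacement
```

The accepted permutation can then be applied with `dat[x]`, exactly as above.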
I should probably wait to post this until I have time to comment it up, and discuss some of the ambiguities in the problem as currently specified in the comments above. But since I won't be able to do that, possibly for a while, I thought I'd give you code for a solution that you can examine yourself.
# Create a function that generates acceptable permutations of the data
getPermutation <- function(blockSize, # number of columns/block
nBlock, # number of blocks of data
fromBlocks) { # indices of blocks to be moved
X <- unique(as.vector(outer(fromBlocks, c(-2,-1,1,2), "+")))
# To remove nonsensical indices like 0 or -1
X <- X[X %in% seq.int(nBlock)]
while({toBlocks <- sample(X, size = length(fromBlocks))
max(abs(toBlocks - fromBlocks)) > 2 | min(abs(toBlocks - fromBlocks)) < 1
}) NULL
A <- seq.int(nBlock)
A[toBlocks] <- fromBlocks
A[fromBlocks] <- toBlocks
blockColIndices <-
lapply(seq.int(nBlock) - 1,
function(X) {
seq(from = X * blockSize + 1,
by = 1,
length.out = blockSize)
})
unlist(blockColIndices[A])
}
# Create an example dataset, a 90 column data.frame
dat <- as.data.frame(matrix(seq.int(90*4), ncol=90))
# Call the function for a data frame with 30 3-column blocks
# within which you want to move blocks 2, 14, and 15.
index <- getPermutation(3, 30, c(2, 14, 15))
newdat <- dat[index]

Resources