ifelse statement in R - r

I'm looking at a gene in 10 people. The gene has two alleles, say a and b, and each allele takes one of 3 forms: type 2, 3, or 4.
a <- c(2, 2, 2, 2, 3, 3, 3, 2, 4, 3)
b <- c(4, 2, 3, 2, 4, 2, 3, 4, 4, 4)
I wish to code a variable that tells me how many type 4 alleles the person has: 0, 1, or 2.
var <- ifelse(a==4 & b==4, 2, 0)
The code above doesn't work since I didn't account for the individuals who have just one copy of the type 4 allele. I feel like I might need 2 ifelse statements that work simultaneously?

EDIT: You don't actually need ifelse or any fancy operations other than plus and equal to.
var <- (a == 4) + (b == 4)
If you're set on ifelse, this can be done with
var <- ifelse(a == 4, 1, 0) + ifelse(b == 4, 1, 0)
However, I prefer the following solution using apply. The following will give you three cases, the result being the number of 4's the person has (assuming each row is a person).
a = c(2, 2, 2, 2, 3, 3, 3, 2, 4, 3)
b = c(4, 2, 3, 2, 4, 2, 3, 4, 4, 4)
d <- cbind(a,b)
apply(d, 1, function(x) {sum(x == 4)})
For this operation, I first combined the two vectors into a matrix, since that makes applying the function easier. In R, when the data are all of the same type, it is generally convenient to combine them into a matrix and then write a function that is applied to each row or column.
To understand the output, consider what happens to the first row of d.
> d[1, ]
a b
2 4
> d[1, ] == 4
a b
FALSE TRUE
Booleans are interpreted as integers under addition, so
> FALSE + TRUE
[1] 1
It doesn't seem to matter whether the 4 came from a or b, so we end up with three cases: 0, 1, and 2, depending on the number of 4's.
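As a quick check, the approaches above can be compared on the example data. This is only a sketch: the variable names are illustrative, and the rowSums() variant is not part of the original answer; it is added here on the assumption that a vectorized equivalent of the apply() call is useful to see.

```r
a <- c(2, 2, 2, 2, 3, 3, 3, 2, 4, 3)
b <- c(4, 2, 3, 2, 4, 2, 3, 4, 4, 4)

# Logical vectors are coerced to 0/1 when added
by_sum    <- (a == 4) + (b == 4)
by_ifelse <- ifelse(a == 4, 1, 0) + ifelse(b == 4, 1, 0)
by_apply  <- apply(cbind(a, b), 1, function(x) sum(x == 4))
# Not from the answer above: rowSums() on the logical matrix
by_rowsum <- rowSums(cbind(a, b) == 4)

by_sum
# [1] 1 0 0 0 1 0 0 1 2 1
all(by_sum == by_ifelse, by_sum == by_apply, by_sum == by_rowsum)
# [1] TRUE
```

All four give the same counts, so the choice is mostly a matter of readability.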

Related

Is there an R function for finding the index of an element in an array?

Which function can I use to find the index of an element in an array?
You can find the index of an element with which() or match().
Example for using which():
# vector created
v <- c(0, 1, 2, 3, 4,
5, 6, 7, 8, 9)
# which function is used
# to get the index
which(v == 5) # output is: 6
Example for using match():
# vector created
v <- c(0, 1, 2, 3, 4,
5, 6, 7, 8, 9)
# match function is
# used to get the index
match( 5 , v ) # output is: 6
For a matrix or an array, set the argument arr.ind = TRUE:
which(myarray == 5, arr.ind = TRUE)
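The difference between the two functions shows up when the value occurs more than once, and arr.ind = TRUE is easiest to read on a small matrix. The data below are hypothetical and chosen only for illustration.

```r
# Hypothetical example data; not from the original answer
v <- c(5, 1, 5, 3)
all_hits  <- which(v == 5)   # every matching position
first_hit <- match(5, v)     # first matching position only
all_hits
# [1] 1 3
first_hit
# [1] 1

m <- matrix(c(1, 5, 3, 5, 2, 6), nrow = 2)   # filled column by column
hits <- which(m == 5, arr.ind = TRUE)        # one row per match
hits
#      row col
# [1,]   2   1
# [2,]   2   2
```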

Identify position range values within a vector

Is it possible to identify the positions of a range of values in a vector according to a condition? Here the condition is the longest run of consecutive values lower than 3.
For instance,
x <- c(4, 1, 2, 1, 1, 4, 1, 1, 1, 1, 2, 1, 1, 1, 1, 4, 1, 1)
Desired output:
c(7:15)
split() and rle() might be useful here, but any help would be appreciated.
You could run rle() on x < 3 and pick the longest run of TRUEs. The cumulative sum of the run lengths gives the last position of that run; subtracting the run's length and adding one gives its first position. Finally, build the sequence between the two.
rl <- rle(x < 3)
# Index of the longest TRUE run (FALSE runs get weight 0; ties pick the first)
w <- which.max(rl$lengths * rl$values)
# Last position of each run
ends <- cumsum(rl$lengths)
seq.int(ends[w] - rl$lengths[w] + 1, ends[w])
# [1]  7  8  9 10 11 12 13 14 15
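To see why this works, it may help to look at the intermediate values for the example vector. This is only an illustrative sketch of what rle() returns here, not additional logic.

```r
x <- c(4, 1, 2, 1, 1, 4, 1, 1, 1, 1, 2, 1, 1, 1, 1, 4, 1, 1)
rl <- rle(x < 3)
rl$lengths
# [1] 1 4 1 9 1 2
rl$values
# [1] FALSE  TRUE FALSE  TRUE FALSE  TRUE
# Cumulative sums of the run lengths give the last position of each run
cumsum(rl$lengths)
# [1]  1  5  6 15 16 18
```

The fourth run is the longest TRUE run (9 values). It ends at position 15, so it starts at 15 - 9 + 1 = 7.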

R sapply new evaluation of function for each incidence in array

Thanks to lots of help, I've got an expression that substitutes the value from a rbinom into a vector, when certain conditions are met. My problem is that it always substitutes the same value, i.e. does not do a new evaluation for each instance of the conditions being met. I think I just need to wrap it in a sapply statement but haven't got the syntax correct. MWE:
arr1 <- c(8, 2, 5, 2, 3, 2, 2, 2, 8, 2, 4)
arr2 <- c(0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0)
arr1
arr1[Reduce("&", list(arr1 == 2, arr2 ==1))] <- rbinom(1,1,0.5) * 2
arr1
arr1
[1] 8 2 5 0 3 0 0 0 8 0 4
I would have hoped that it changed some of the values but not others, so evaluated the result again for each instance. Is this a good application of purrr::modify2 ? Thx. J
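rbinom(1, 1, 0.5) draws a single value, which R then recycles across every selected position. You probably mean to draw one value per selected position: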
inds <- arr1 == 2 & arr2 == 1
arr1[inds] <- rbinom(sum(inds), 1, 0.5) * 2
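A small sketch of the same idea on the question's data, with a sanity check that only the selected positions change. The seed and the `before` copy are assumptions added for illustration; the actual draws depend on the seed, so no expected values are shown for them.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
arr1 <- c(8, 2, 5, 2, 3, 2, 2, 2, 8, 2, 4)
arr2 <- c(0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0)
before <- arr1  # copy kept only for the comparison below

inds <- arr1 == 2 & arr2 == 1
sum(inds)
# [1] 5

# One independent Bernoulli draw per selected position, scaled to 0 or 2
arr1[inds] <- rbinom(sum(inds), 1, 0.5) * 2

# Positions outside inds are untouched
identical(arr1[!inds], before[!inds])
# [1] TRUE
```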

How to count the number of times a pattern changes?

I have a vector created by simulating a continuous-time Markov chain. The vector represents one path the chain may follow. Simulating 20 steps, we could have:
Xt <- c(5, 5, 5, 5, 5, 4, 4, 4, 4, 3, 3, 3, 2, 2, 2, 1, 1, 1, 0 ,0)
Further, the chain can jump down one state at a time, or jump from any state (5, 4, 3, 2, 1) directly to 0. So another simulation could be:
Xt <- c(5, 5, 5, 5, 5, 4, 4, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
I want to count the number of times the simulated chain jumps to another state (that is, the number of times the value in the vector changes) within a given interval. For example:
For the second vector, the number of jumps within the first 10 elements is 2 (from 5 to 4 and from 4 to 0). Within its last 10 elements, the number of jumps is 0 (the last 10 elements are all 0).
So I would like to count the number of jumps (the number of times the pattern changes). I tried using toString(Xt) and then matching a regex, but nothing worked. Any ideas?
You can use diff() for this; it computes the difference between adjacent elements of a vector. Count the differences that are not equal to zero to get the number of times the pattern changes.
First 10:
sum(diff(Xt[1:10]) != 0)
[1] 2
Last 10:
sum(diff(Xt[(length(Xt) - 9):length(Xt)]) != 0)
[1] 0
It seems that simply counting the number of non-zero differences delivers the desired result:
Xt <- c(5, 5, 5, 5, 5, 4, 4, 4, 4, 3, 3, 3, 2, 2, 2, 1, 1, 1, 0, 0)
sum(diff(Xt) != 0)
If the goal is a function that takes a vector and a starting position, it could be written like this:
jump_in_next_10 <- function(x, start) {
  sum(diff(x[start:(start + 9)]) != 0)
}
jump_in_next_10(Xt, 3)
#[1] 2
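A window that runs past the end of the vector would index NA values, so a slightly more general version might clamp the interval first. The helper below, count_jumps(), is hypothetical and not from the original answers, as are the names Xt1 and Xt2 for the two example vectors. It is a minimal sketch assuming the interval is given as a start and end index.

```r
# Hypothetical helper, not from the original answers
count_jumps <- function(x, from = 1, to = length(x)) {
  # Clamp the window to the vector so out-of-range indices cannot yield NA
  from <- max(1, from)
  to <- min(length(x), to)
  if (to <= from) return(0L)  # fewer than two elements: no jump possible
  sum(diff(x[from:to]) != 0)
}

Xt1 <- c(5, 5, 5, 5, 5, 4, 4, 4, 4, 3, 3, 3, 2, 2, 2, 1, 1, 1, 0, 0)
Xt2 <- c(5, 5, 5, 5, 5, 4, 4, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)

count_jumps(Xt1)           # whole path
# [1] 5
count_jumps(Xt2, 1, 10)    # first 10 elements
# [1] 2
count_jumps(Xt2, 11, 20)   # last 10 elements
# [1] 0
count_jumps(Xt1, 15, 30)   # window clamped to 15:20
# [1] 2
```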

How to only get value from data.frame in R?

I am trying to calculate the probabilities for 4 dice being thrown in R. I am nearly finished; I just want to know how I can access ONLY the value for a specific row of my test1 data frame. If I write rowSums(test1[1,]), it gives me both the index AND the sum, but I only want the sum, so that I can store how many possibilities there are to get, e.g., a 4 with 4 dice.
Here's the relevant part of the code.
wurf1 <- c(1, 2, 3, 4, 5, 6)
wurf2 <- c(1, 2, 3, 4, 5, 6)
wurf3 <- c(1, 2, 3, 4, 5, 6)
wurf4 <- c(1, 2, 3, 4, 5, 6)
test1 <- data.frame(expand.grid(wurf1, wurf2, wurf3, wurf4))
rowSums(test1[1,]) #this gives me:
1
4 #because the sum of the values in index 1 = 4
Thank you for your help in advance.
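One possible approach, offered as a sketch rather than an answer from the original thread: rowSums() returns a named vector, and the "1" printed above the sum is the row name rather than part of the value. Dropping the name, or summing the row directly, leaves just the number. The variable names s and counts are illustrative.

```r
test1 <- expand.grid(1:6, 1:6, 1:6, 1:6)  # all 6^4 = 1296 outcomes

# rowSums() returns a named vector; the "1" is the row name
s <- rowSums(test1[1, ])
unname(s)
# [1] 4

# For a single row, summing its values directly gives the same number
sum(test1[1, ])
# [1] 4

# Number of outcomes for each total across all rows
counts <- table(rowSums(test1))
counts[["4"]]
# [1] 1
counts[["5"]]
# [1] 4
```

The table() call may be closer to the end goal, since it counts the possibilities for every total at once.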
