How to only get value from data.frame in R?

I am trying to calculate the probabilities for throwing 4 dice in R. I am nearly finished; I just want to know how I can access ONLY the value in a specific row of my test1 data frame. If I write rowSums(test1[1,]), it gives me both the index AND the sum, but I only want the sum, so that I can store how many possibilities there are to get, e.g., a 4 with 4 dice.
Here's the important part of the code.
wurf1 <- c(1, 2, 3, 4, 5, 6)
wurf2 <- c(1, 2, 3, 4, 5, 6)
wurf3 <- c(1, 2, 3, 4, 5, 6)
wurf4 <- c(1, 2, 3, 4, 5, 6)
test1 <- data.frame(expand.grid(wurf1, wurf2, wurf3, wurf4))
rowSums(test1[1,]) # this gives me:
# 1
# 4   (because the sum of the values in row 1 is 4)
Thank you for your help in advance.
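One possible way to get just the number, sketched under the assumption that the "index" shown is simply the row name that rowSums() attaches to its result. The names wurf, sums, counts and probs are illustrative and not part of the question.

```r
wurf <- 1:6
test1 <- expand.grid(wurf, wurf, wurf, wurf)   # expand.grid already returns a data.frame

# rowSums() returns a named vector; the "1" is the row name, not a second value
unname(rowSums(test1[1, ]))   # 4
sum(test1[1, ])               # 4

# Totals for all 6^4 = 1296 throws, and how often each total occurs
sums <- rowSums(test1)
counts <- table(sums)
sum(sums == 4)                # 1 (only 1 + 1 + 1 + 1)

# Probability of each total
probs <- counts / nrow(test1)
```

Working on the whole vector of row sums avoids extracting one row at a time.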

Related

Is there an R function for finding the index of an element in an array?

Which function can I use to find the index of an element in an array?
You can find the index of an element with which() or match().
Example for using which():
# vector created
v <- c(0, 1, 2, 3, 4,
5, 6, 7, 8, 9)
# which function is used
# to get the index
which(v == 5) # output is: 6
Example for using match():
# vector created
v <- c(0, 1, 2, 3, 4,
5, 6, 7, 8, 9)
# match function is
# used to get the index
match( 5 , v ) # output is: 6
For a matrix or an array, set the argument arr.ind = TRUE:
which(myarray == 5, arr.ind = TRUE)
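As a small illustration of the arr.ind argument, here is a sketch on a made-up matrix m (not from the original answer). It also shows one difference between the two functions: which() reports every match, while match() reports only the first.

```r
# A small matrix (filled column by column) that contains the value 5 twice
m <- matrix(c(1, 5, 3, 4, 5, 6), nrow = 2)

# Without arr.ind, which() returns positions counted column by column
which(m == 5)                    # 2 5

# With arr.ind = TRUE, it returns one (row, col) pair per match
idx <- which(m == 5, arr.ind = TRUE)
idx

# which() reports every match, whereas match() reports only the first
v <- c(5, 0, 5)
which(v == 5)                    # 1 3
match(5, v)                      # 1
```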

Calculate the Average of Data frame iteratively

I have a set of data and I want to iteratively calculate the average, i.e. of the first two points, then the first three points, then the first four points, and so on.
For example,
D <- c(1, 2, 3, 4, 5, 6)
Average <- c(1.5, 2, 2.5, 3, 3.5)
Is there any way to do that in R?
Try the following:
D <- c(1, 2, 3, 4, 5, 6)
# running mean of the first k elements; [-1] drops the trivial k = 1 case
(cumsum(D) / seq_along(D))[-1]
# [1] 1.5 2.0 2.5 3.0 3.5
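To see why the cumulative-sum form works, it can be checked against the explicit definition of "mean of the first k points". This is only a sketch; the names explicit and running are illustrative.

```r
D <- c(1, 2, 3, 4, 5, 6)

# Explicit definition: mean of the first k elements, for k = 2, ..., length(D)
explicit <- sapply(2:length(D), function(k) mean(D[1:k]))

# Cumulative-sum form: running total divided by the number of elements so far
running <- (cumsum(D) / seq_along(D))[-1]

all.equal(explicit, running)   # TRUE
```

The cumulative-sum version makes a single pass over the data, whereas the explicit version recomputes each mean from scratch.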

How to count the number of times a pattern changes?

I have a vector created by simulating a continuous-time Markov chain. The vector represents a path the chain may follow. Simulating 20 steps, we could have:
Xt <- c(5, 5, 5, 5, 5, 4, 4, 4, 4, 3, 3, 3, 2, 2, 2, 1, 1, 1, 0 ,0)
Further, the chain can move down one state at a time or jump from any state (5, 4, 3, 2, 1) directly to 0. So another simulation could be:
Xt <- c(5, 5, 5, 5, 5, 4, 4, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
I want to count the number of times the simulated chain jumps to another state (i.e. the value in the vector changes) within a given interval. For example:
For the second vector above, the number of jumps within the first 10 elements is 2 (from 5 to 4 and from 4 to 0), and the number of jumps within the last 10 elements is 0 (the last 10 elements are all 0).
So I would like to count the number of jumps (the number of times the value changes). I tried using toString(Xt) and then matching a regex, but nothing worked. Any ideas?
You can use diff for this, which computes the differences between adjacent elements of a vector. Count the differences that are not equal to zero to get the number of times the value changes.
First 10:
sum(diff(Xt[1:10])!=0)
[1] 2
Last 10:
sum(diff(Xt[(length(Xt)-9):length(Xt)])!=0)
[1] 0
It seems that simply counting the number of times the difference is not zero delivers the desired result:
Xt <- c(5, 5, 5, 5, 5, 4, 4, 4, 4, 3, 3, 3, 2, 2, 2, 1, 1, 1, 0 ,0)
sum(diff(Xt) != 0)
If the goal is to write a function that takes a vector and a starting position, it could be done like this:
jump_in_next_10 <- function(x, start) {
  sum(diff(x[start:(start + 9)]) != 0)
}
jump_in_next_10(Xt, 3)
#[1] 2
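The same idea can be generalized to an arbitrary interval. The helper below, count_jumps, is a hypothetical name used only for this sketch; it counts the changes between positions from and to (inclusive) and checks that the interval lies inside the vector.

```r
# Hypothetical helper: count value changes within x[from:to]
count_jumps <- function(x, from = 1, to = length(x)) {
  stopifnot(from >= 1, to <= length(x), from <= to)
  sum(diff(x[from:to]) != 0)
}

Xt1 <- c(5, 5, 5, 5, 5, 4, 4, 4, 4, 3, 3, 3, 2, 2, 2, 1, 1, 1, 0, 0)
Xt2 <- c(5, 5, 5, 5, 5, 4, 4, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)

count_jumps(Xt1)           # 5 (5 -> 4 -> 3 -> 2 -> 1 -> 0)
count_jumps(Xt2, 1, 10)    # 2
count_jumps(Xt2, 11, 20)   # 0
```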

ifelse statement in R

I'm looking at a gene in 10 people. And this gene has two alleles, say a and b. And each allele has 3 forms: type 2, 3 or 4.
a <- c(2, 2, 2, 2, 3, 3, 3, 2, 4, 3)
b <- c(4, 2, 3, 2, 4, 2, 3, 4, 4, 4)
I wish to code a variable that tells me how many type 4 alleles the person has: 0, 1, or 2.
var <- ifelse(a==4 & b==4, 2, 0)
The code above doesn't work since I didn't account for the individuals who have just one copy of the type 4 allele. I feel like I might need 2 ifelse statements that work simultaneously?
EDIT: You don't actually need ifelse or any fancy operations other than plus and equal to.
var <- (a == 4) + (b == 4)
If you're set on ifelse, this can be done with
var <- ifelse(a == 4, 1, 0) + ifelse(b == 4, 1, 0)
However, I prefer the following solution using apply. The following will give you three cases, the result being the number of 4's the person has (assuming each row is a person).
a = c(2, 2, 2, 2, 3, 3, 3, 2, 4, 3)
b = c(4, 2, 3, 2, 4, 2, 3, 4, 4, 4)
d <- cbind(a,b)
apply(d, 1, function(x) {sum(x == 4)})
For this operation, I first combined the two vectors into a matrix, since that makes applying the function easier. In R, if the data are all of the same type, it is often convenient to combine them into a matrix (or a data frame) and then apply a function to each row or column.
To understand the output, consider what happens to the first row of d.
> d[1, ]
a b
2 4
> d[1, ] == 4
a b
FALSE TRUE
Logical values are coerced to integers under addition (FALSE becomes 0 and TRUE becomes 1), so
> FALSE + TRUE
[1] 1
It doesn't seem to matter whether the 4 came from a or b, so we end up with three cases: 0, 1, and 2, depending on the number of 4's.
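Since the comparison d == 4 already yields a logical matrix, the per-row count can also be taken with rowSums(). This is a sketch of an alternative, not the original answerer's method; the name n4 is illustrative.

```r
a <- c(2, 2, 2, 2, 3, 3, 3, 2, 4, 3)
b <- c(4, 2, 3, 2, 4, 2, 3, 4, 4, 4)
d <- cbind(a, b)

# TRUE counts as 1 and FALSE as 0, so the row sums give the number of 4s per person
n4 <- rowSums(d == 4)
n4   # 1 0 0 0 1 0 0 1 2 1
```

This gives the same counts as the apply() and (a == 4) + (b == 4) versions above.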

Linearly regress a vector against each column of a matrix

I have a very simple question which I am sure there is an elegant answer to (I am also sure the title above is inappropriate). I have a vector of y values:
y = matrix(c(1, 2, 3, 4, 5, 6, 7), nrow=7, ncol=1)
which I would like to regress against each column in a matrix, x:
x = matrix(c(1, 2, 3, 4, 5, 6, 7, 7, 6, 5, 4, 3, 2, 1, 4, 4, 4, 4, 4, 4, 4), nrow=7, ncol=3)
For example, I would like to linearly regress y on the first column of x, then on the second column of x, and so on until the last column of x is reached:
regression.1=lm(y~x[,1])
regression.2=lm(y~x[,2])
I would later like to plot the slope of these regression versus other parameters so it would be useful if the model coefficient parameters are easily accessible in the usual way:
slope.1 = summary(regression.1)$coefficients[2,1]
I am guessing this involves a list and something like plyr, but I am too new to this game to find the simplest way to code it.
store <- mapply(col.ind = 1:ncol(x),function(col.ind){ lm(y~x[,col.ind]) })
You can then access the coefficients (intercept and slope) of each model using:
> store[1,]
[[1]]
(Intercept) x[, col.ind]
6.713998e-16 1.000000e+00
[[2]]
(Intercept) x[, col.ind]
8 -1
[[3]]
(Intercept) x[, col.ind]
4 NA
Another way:
regression <- apply(x, 2, function(z)lm(y~z))
slope <- sapply(regression, function(z)unname(coef(z)[2]))
Result:
> slope
[1] 1 -1 NA
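A variant that keeps the fitted models in a plain list, so each one stays available for later inspection, might look like the sketch below. The names fits, slopes and xj are illustrative, and y is written as a plain vector here for simplicity.

```r
y <- c(1, 2, 3, 4, 5, 6, 7)
x <- matrix(c(1:7, 7:1, rep(4, 7)), nrow = 7, ncol = 3)

# Fit one model per column and keep the fitted objects in a list
fits <- lapply(seq_len(ncol(x)), function(j) {
  xj <- x[, j]
  lm(y ~ xj)
})

# Pull out the slope (second coefficient) of each fit as a numeric vector
slopes <- vapply(fits, function(fit) coef(fit)[[2]], numeric(1))
slopes   # approximately 1, -1, NA

# Each individual model remains accessible, e.g. the second one
coef(fits[[2]])
```

The third slope is NA because the third column of x is constant, so it is collinear with the intercept.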
