Multiply certain elements of a vector in R

I have a vector [1:360] with integers and need to find the products of the first, second ... twelfth set of 30 elements. Ultimately, I need a function that gives me a vector [1:12] with the products of all twelve 30-element intervals.
I'm fairly new to R and have been stuck on this for too long.

A simple way to do this would be to turn your vector into a 30-row matrix and get the product of each column.
In the absence of a reproducible example, let's make one with a vector of 360 numbers drawn from a normal distribution:
set.seed(69)
vec <- rnorm(360)
We can turn vec into a 30 x 12 matrix with matrix(vec, nrow = 30), which fills the matrix by column. We then get the product of each column by using apply to apply the function prod to each column.
apply(matrix(vec, nrow = 30), 2, prod)
#> [1] -6.253460e-09 -4.413086e-09 -1.332389e-10 1.041448e-08 -1.779489e-08 1.255979e-10
#> [7] 3.463687e-13 -6.265196e-12 8.300651e-04 -1.041469e-10 4.256378e-09 1.439522e-09
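As a rough cross-check of the matrix approach, the sketch below compares it with an explicit grouping by block index. The names by_matrix, grp and by_group are made up for this illustration and are not part of the original answer.

```r
set.seed(69)
vec <- rnorm(360)

# Column-wise products: column j holds elements (30*(j-1)+1):(30*j)
by_matrix <- apply(matrix(vec, nrow = 30), 2, prod)

# Same computation with an explicit block label for every element
grp <- rep(1:12, each = 30)
by_group <- as.vector(tapply(vec, grp, prod))

length(by_matrix)               # 12
all.equal(by_matrix, by_group)  # compares the two results within tolerance
```

The tapply version is handy when the blocks are not all the same length, since matrix() needs the vector length to be a multiple of the row count.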

Related

Create a list of names/indices of overlapping fragments of vector based on condition [R]

I want to perform a sliding-window analysis of a long vector in R. In doing so, I would like to check whether given fragments of this vector contain a certain value.
Below I paste a reproducible example. This vector (vctr) contains 77 elements (either 0 or 1). I am analyzing it with a sliding window of 10 items (segment), where consecutive windows overlap by 5 elements (overlap).
I know how to check whether a given fragment contains a certain value (in this case 1) or not (split_vctr). However, I would also like to do something else, namely:
I would like to create a new variable (list or vector) containing only the indices of those fragments which fulfill the given criterion (in this case: those which contain at least one value equal to 1, i.e. TRUE).
Suppose the initial list were named: how could I extract only the names of the fragments which are TRUE?
I would highly appreciate your help.
Dummy data:
# dummy vector
vctr <- c(rep(0, 11), rep(1, 4), rep(0, 25), rep(1, 3), rep(0, 31),rep(1, 3))
# split parameters:
segment <- 10 # length of each segment
overlap <- 5 # length of each overlapping part
#finding coordinates
start_coordinates <- seq(1, length(vctr), by=segment-overlap)
end_coordinates <- start_coordinates + segment - 1
#check whether the split vector fragments meet a condition
split_vctr <- lapply(1:length(start_coordinates), function(i) 1 %in% vctr[start_coordinates[i]:end_coordinates[i]])
which(unlist(split_vctr))
will return the indices of the elements of split_vctr that are TRUE.
If split_vctr itself were named, you could use these indices to extract the names of the TRUE fragments like this:
names(split_vctr)[which(unlist(split_vctr))]
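To make the named case concrete, here is a minimal sketch built on the dummy data from the question. The "start-end" fragment names and the variable hits are assumptions for illustration, and the window ends are clipped with pmin so the last windows do not run past the vector.

```r
vctr <- c(rep(0, 11), rep(1, 4), rep(0, 25), rep(1, 3), rep(0, 31), rep(1, 3))
segment <- 10
overlap <- 5

start_coordinates <- seq(1, length(vctr), by = segment - overlap)
# Clip window ends at the vector length (assumption: short final windows are fine)
end_coordinates <- pmin(start_coordinates + segment - 1, length(vctr))

# One TRUE/FALSE per window, returned directly as a logical vector
hits <- vapply(seq_along(start_coordinates), function(i) {
  1 %in% vctr[start_coordinates[i]:end_coordinates[i]]
}, logical(1))

# Hypothetical fragment names of the form "start-end"
names(hits) <- paste(start_coordinates, end_coordinates, sep = "-")

which(hits)         # indices of the TRUE fragments, labelled with their names
names(hits)[hits]   # names only
```

Using vapply instead of lapply gives a plain logical vector, so the unlist step is no longer needed.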

How to sum elements by intervals?

I am wondering how I can use dplyr (or other methods) to sum intervals of elements of a vector?
Let's say I have the vector: v = rep(2,800).
I want to get a new vector with the sums of intervals of 16 elements, having the content like this:
Vsum <- c(sum(v[1:16]), sum(v[17:32]), ..., sum(v[785:800]) )
length(Vsum)
[1] 50
NB! What I have tried myself:
sixteen <- seq(1,800,16)
sixteen_end <- sixteen + 15
sum(v[sixteen:sixteen_end])
[1] 32
But it only sums the first interval (1:16) and not the rest of vector v.
You can use matrix with colSums:
colSums(matrix(v, 16))
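A short sketch of why this works, plus one possible fallback for vectors whose length is not a multiple of 16. The vector w and the names block and Wsum are hypothetical and only serve the illustration.

```r
v <- rep(2, 800)

# 16 rows per column, so each column is one block of 16 consecutive elements
Vsum <- colSums(matrix(v, nrow = 16))
length(Vsum)   # 50

# Alternative that also tolerates a trailing incomplete block
w <- rep(2, 805)                      # hypothetical: 50 full blocks plus 5 extra
block <- ceiling(seq_along(w) / 16)   # block label for every element
Wsum <- as.vector(tapply(w, block, sum))
length(Wsum)   # 51
```

The matrix version is the more compact one; the tapply version trades some speed for not requiring the length to divide evenly.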

Convert a one column matrix to n x c matrix

I have an (n*c + n + c) by 1 matrix, and I want to drop the last n+c rows and convert the rest into an n x c matrix. Below is what I've tried, but it returns a matrix in which every element of a row is the same. I'm not sure why this is. Could someone help me out, please?
tmp=x[1:n*c,]
Membership <- matrix(tmp, nrow=n, ncol=c)
x has n*c + n + c elements in a single column, so it can be indexed like a vector; the comma in your extraction is not needed.
You should do tmp=x[1:(n*c)].
Notice the importance of the parentheses: the : operator binds more tightly than *, so tmp=x[1:n*c] takes the range from 1 to n, multiplies it by c to give a new set of indices, and then extracts based on those.
For example, you want to avoid:
(1:100)[1:5*5]
[1] 5 10 15 20 25
You can also avoid the index arithmetic altogether:
matrix(head(x, n*c), ncol=c)
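A small sketch with made-up sizes (n = 3, c = 2) shows the precedence issue and the head-based fix side by side. The toy matrix x is an assumption; only the names n, c, x and Membership come from the question.

```r
n <- 3
c <- 2                                   # named c only to mirror the question
x <- matrix(1:(n*c + n + c), ncol = 1)   # 11 x 1 toy matrix

1:n*c        # 2 4 6, i.e. (1:n) * c
1:(n*c)      # 1 2 3 4 5 6

# Keep the first n*c rows and reshape them into n rows and c columns
Membership <- matrix(head(x, n*c), ncol = c)
dim(Membership)   # 3 2
```

Note that matrix() fills column by column; if the values in x are stored row by row, byrow = TRUE would be needed.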

gene expression datamatrix filtration

I have one matrix with 3064 rows and 27 columns which contains values between -0.5 and 2.0. I want to extract every row which has at least one value >=0.5. As an answer I would like to have the whole row in its original matrix form.
Consider m is my matrix, I tried:
m[m[1:190,1:16]>0.5,1:16]
As this command does not accept more than 190 rows, I went for 190 rows, but somehow it went wrong, because it gave me rows which also have values < 0.5.
Is it possible to write a function that can be applied to the whole matrix?
You can also try this, if your data is named df:
df2<- df[apply(df, MARGIN = 1, function(x) any(x >= 0.5)), ]
library(fBasics)
m2 <- subset(x = m, subset = rowMaxs(m)>=0.5)
What mm=m[1:190,1:16]>0.5 gives you is a logical matrix indicating which values of m[1:190,1:16] are greater than 0.5.
Then when you do m[mm], R treats mm as a plain vector and returns the corresponding values. The problem is that dim(m) is 3064 x 27 while dim(m[1:190,1:16]) is 190 x 16. R stores matrices column by column, so the 3040 values of mm are matched against the first 3040 elements of the first column of m (and then recycled), not against the cells they were computed from.
So in order to get only the elements greater than 0.5, you need to apply mm to m[1:190,1:16], which has the same dimensions, i.e.:
m[1:190,1:16][m[1:190,1:16]>0.5]
But what you do here is m[mm, 1:16], so each individual value of mm is used as a TRUE/FALSE flag for one row of m, even though mm is a 190 x 16 matrix. That means you specify 190*16=3040 row flags; a larger selection does not work because m only has 3064 rows.
What you want is a logical vector of length 190 (or even 3064, I guess) specifying which rows to take. You can get this vector with rowSums(m >= 0.5) > 0, which is TRUE for each row with at least one value greater than or equal to 0.5. Then you get your output with:
m[rowSums(m >= 0.5) > 0,]
And it will work for the whole matrix. Note that some values will be smaller than 0.5 since you selected the whole line if at least one value was greater than 0.5.
Edit
For rows with values <0.5, the idea is the same:
m[rowSums(m < 0.5) > 0,]
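The row filter can be tried on a toy matrix. The 4 x 3 values below are invented for illustration, and keep and keep2 are hypothetical names; the sketch also shows that the rowSums test and the apply/any test from the other answer select the same rows.

```r
# Toy 4 x 3 matrix, written row by row for readability
m <- matrix(c(-0.2, 0.1, 0.4,
               0.6, 0.0, 0.3,
               0.1, 0.2, 0.3,
               2.0, 0.5, 0.7), nrow = 4, byrow = TRUE)

keep <- rowSums(m >= 0.5) > 0   # TRUE for rows 2 and 4
m[keep, , drop = FALSE]         # drop = FALSE keeps a matrix even for one row

# Equivalent row test written with apply()
keep2 <- apply(m, 1, function(x) any(x >= 0.5))
identical(keep, keep2)
```

If the matrix may contain NA values, rowSums(m >= 0.5, na.rm = TRUE) is probably the safer form.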

How to create a list from an array of z-scores in R?

I have an array of z-scores that is structured like num [1:27, 1:11, 1:467], so there are 467 entries with 27 rows and 11 columns. Is there a way that I can make a list from this array? For example a list of entries which contain a z-score over 2.0 (not just a list of z scores, a list which identifies which 1:467 entries have z > 2).
Say that your array is called z in your R session. The function you are looking for is which with the argument arr.ind set to TRUE.
m <- which(z > 2, arr.ind=TRUE)
This will give you a selection matrix, i.e. a matrix with three columns, each row corresponding to one entry with a Z-score greater than 2. To know the number of Z-scores greater than 2 you can do
nrow(m)
# Note that 'sum(z > 2)' is easier.
and to get the values
z[m]
# Note that 'z[z > 2]' is easier
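To get from the selection matrix to the list of entries along the third dimension that the question asks for, one possible sketch is shown below. The toy array and the names idx, entries and entries2 are assumptions for illustration.

```r
# Toy array: 2 rows, 3 columns, 4 entries along the third dimension
z <- array(0, dim = c(2, 3, 4))
z[1, 2, 3] <- 2.5
z[2, 1, 1] <- 3.0

idx <- which(z > 2, arr.ind = TRUE)

# The third column of idx is the position along the third dimension
entries <- sort(unique(idx[, 3]))
entries   # 1 3

# Same answer without building the index matrix
entries2 <- which(apply(z, 3, function(s) any(s > 2)))
```

For the array in the question the same two lines would return the subset of 1:467 whose 27 x 11 slice contains at least one z-score above 2.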
