Create a numeric vector based on values in a logical vector - R

I have a logical vector in R, something like this:
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
[19] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[55] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[73] FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
I want to construct another numeric vector that contains a 1 where the logical vector is TRUE and a 0 where it is FALSE. I have tried the following code:
## create an empty vector
numericvec <- vector(mode="numeric", length=0)
## for loop
for (i in logicvec){
if(i == TRUE){
c(numericvec, 1)
} else {
c(numericvec, 0)
}
}
The for loop syntax seems ok because I don't get errors when I run it but it isn't currently adding any values to the numeric vector.

This should work:
numericvec <- as.numeric(logicvec)
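There is no need for a for() loop: most R functions are vectorized and operate on entire vectors at once. Your loop adds nothing because the result of c(numericvec, 1) is never assigned back to numericvec.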
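As a minimal sketch of both points, the snippet below converts a small stand-in vector with as.numeric() and also shows the loop from the question repaired by assigning the result of c() back. The short logicvec here and the name loopvec are made up for illustration.

```r
# Hypothetical stand-in for the logical vector in the question
logicvec <- c(FALSE, TRUE, FALSE, TRUE)

# Vectorized conversion: TRUE becomes 1, FALSE becomes 0
numericvec <- as.numeric(logicvec)

# Loop version, repaired by assigning the result of c() back each time
loopvec <- numeric(0)
for (flag in logicvec) {
  loopvec <- c(loopvec, if (flag) 1 else 0)
}

identical(numericvec, loopvec)  # both approaches give the same vector
```

The vectorized form is usually preferable, since growing a vector inside a loop copies it on every iteration.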

Related

subsetting by index in R

I have a vector of indexes:
indexes
[1] 25 2 16 23
and another vector with logical:
logical
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[19] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
I want to keep all items of logical except those whose positions are stored in indexes.
I thought this would have an easy solution, but mine doesn't work:
for(index in indexes){
logical[index] = NULL
}
You could just use negative (-) indexing:
indexes <- c(25, 2, 16, 23)
logicals <- sample(c(TRUE, FALSE), 25, replace = TRUE)
logicals
#> [1] FALSE TRUE TRUE TRUE FALSE FALSE TRUE FALSE TRUE TRUE FALSE TRUE
#> [13] FALSE FALSE TRUE FALSE FALSE FALSE FALSE TRUE FALSE TRUE TRUE TRUE
#> [25] FALSE
logicals[-indexes]
#> [1] FALSE TRUE TRUE FALSE FALSE TRUE FALSE TRUE TRUE FALSE TRUE FALSE
#> [13] FALSE TRUE FALSE FALSE FALSE TRUE FALSE TRUE TRUE
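For a deterministic illustration, here is a small sketch of negative indexing on a made-up vector. The names flags, drop, kept, null_fails and kept_safe are hypothetical. It also shows why the loop in the question fails: assigning NULL to an element of an atomic vector raises an error rather than deleting the element.

```r
flags <- c(TRUE, FALSE, TRUE, TRUE, FALSE)  # hypothetical data
drop  <- c(2, 5)                            # positions to remove

# Negative indexing keeps everything except the listed positions
kept <- flags[-drop]

# Assigning NULL to an element of an atomic vector is an error
null_fails <- inherits(
  tryCatch({ tmp <- flags; tmp[2] <- NULL; tmp }, error = function(e) e),
  "error"
)

# If 'drop' might be empty, flags[-drop] would return an empty vector,
# so selecting the complement explicitly is a safer variant
kept_safe <- flags[setdiff(seq_along(flags), drop)]
```

Both kept and kept_safe contain the elements at positions 1, 3 and 4.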

Logical vector to see whether an element of a df is contained within a df inside a list

I tried:
mdf$CLAVE.EMISORA %in% BMV[[9]]$`CLAVE EMISORA`
But it only returns:
logical(0)
For some reason the reverse seems to work:
BMV[[9]]$`CLAVE EMISORA` %in% mdf$CLAVE.EMISORA
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[20] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[39] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[58] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[77] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
My data (mdf): I have it but I don't know how to embed
My list (BMV): .... I don't know how to copy a list to clipboard sorry...
logical(0) is a vector of base type logical with length 0.
You're getting this because you're trying to check whether any element of a vector of length 0 is present in BMV[[9]]$`CLAVE EMISORA`.
If you run
length(mdf$CLAVE.EMISORA)
you'll get 0 as output.
The reverse works because you're checking whether each element of a non-empty vector is present in a vector of length 0, which gives one FALSE per element.
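The behaviour can be reproduced with a small sketch. The data below is invented, and the cause shown (a column whose real name contains a space, so that mdf$CLAVE.EMISORA is NULL) is only one possible explanation for the zero-length vector, not something the question confirms.

```r
keys <- c("AAA", "BBB", "CCC")  # hypothetical lookup values

# Hypothetical data frame whose column name contains a space
mdf <- data.frame(`CLAVE EMISORA` = c("AAA", "ZZZ"), check.names = FALSE)

# No column has exactly this name, so $ returns NULL (length 0)
missing_col <- mdf$CLAVE.EMISORA
length(missing_col)

res_empty   <- missing_col %in% keys          # logical(0)
res_reverse <- keys %in% missing_col          # one FALSE per element of keys
res_ok      <- mdf$`CLAVE EMISORA` %in% keys  # TRUE FALSE
```

Checking names(mdf) is a quick way to see what the column is actually called.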

Unexpected results from str_detect()

str_detect(c("abc", "xyz"), letters) does not return the expected results.
It should be a vector of
[1] TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[23] FALSE TRUE TRUE TRUE
But instead it returns
str_detect(c("abc", "xyz"), letters)
[1] TRUE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[23] FALSE TRUE FALSE TRUE
Why? And how do I get the desired result?
This happens because str_detect() recycles its arguments: it compares "abc" against "a", then "xyz" against "b", then "abc" against "c", and so on. You can paste "abc" and "xyz" together into a single string, or just supply "abcxyz" directly, but I'm assuming this is a simplified version of a more complex issue.
library(stringr)
rgx <- paste0(c("abc", "xyz"), collapse = "")
str_detect(rgx, letters)
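If the strings need to stay separate, one alternative sketch is to test each letter against every string and combine the results with any(). This uses base R's grepl() rather than stringr, and the names strings and found are made up here.

```r
strings <- c("abc", "xyz")

# For each letter, check whether it occurs in at least one of the strings
found <- vapply(
  letters,
  function(ch) any(grepl(ch, strings, fixed = TRUE)),
  logical(1)
)

names(found)[found]  # the letters that occur in any string
```

The result is a named logical vector of length 26, TRUE for a, b, c, x, y and z.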

Search for vector of motifs in vector of sequences with dataframe output

I have a set of nucleotide sequences in a vector of strings called x.
I want to check whether some (say 10) motifs are present in x. I want to produce a data frame or table where the rows are the sequences in x and the columns are the patterns/motifs in the vector sdseqs.
sdframe <- data.frame
sdseqs = c("AGGAG.+ATG",
"AGAAG.+ATG","AAAGG.+ATG","GGAGG.+ATG","GAAGA.+ATG",
"GGAGA.+ATG","AAGGT.+ATG","AGGAA.+ATG","AAGGA.+ATG","GTGGA.+ATG")
for (i in 1:10) {
sdframe <- cbind(sdframe,(grepl(sdseqs[i], x)))
}
This code works just fine, but the first column of the data frame is empty, filled with question marks. The other columns are populated with TRUE and FALSE, which is what I want.
I tried to define an empty data frame outside the loop at the beginning. I am new to R and I am coming from Perl. This is what I usually did in Perl: you define variables to be used within a loop outside it. How can I do this in R?
Also, a viable option would be to delete the first column from my data frame, but that does not seem so straightforward to me.
Any help is appreciated.
The output I get with my code now:
sdframe
[1,] ? TRUE FALSE TRUE TRUE FALSE TRUE TRUE TRUE TRUE FALSE
[2,] ? FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE
[3,] ? FALSE FALSE TRUE FALSE TRUE FALSE TRUE TRUE TRUE TRUE
[4,] ? TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[5,] ? FALSE TRUE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
[6,] ? FALSE FALSE FALSE TRUE FALSE FALSE FALSE TRUE FALSE TRUE
[7,] ? FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE
[8,] ? FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE TRUE FALSE
[9,] ? FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[10,] ? FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
[11,] ? FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
I want the same but without the first column of ?. Note that my x has 11 sequences, and the motifs I checked for are the columns (10 columns, 11 counting the first with ?).
A common R solution would use a function from the apply family to apply a function over a vector.
sdseqs = c(
"AGGAG.+ATG",
"AGAAG.+ATG",
"AAAGG.+ATG",
"GGAGG.+ATG",
"GAAGA.+ATG",
"GGAGA.+ATG",
"AAGGT.+ATG",
"AGGAA.+ATG",
"AAGGA.+ATG",
"GTGGA.+ATG"
)
sdframe <- sapply(sdseqs, function(one.motif) {
grepl(one.motif, x = x)
})
sdframe
AGGAG.+ATG AGAAG.+ATG AAAGG.+ATG GGAGG.+ATG GAAGA.+ATG GGAGA.+ATG AAGGT.+ATG AGGAA.+ATG AAGGA.+ATG GTGGA.+ATG
[1,] FALSE TRUE FALSE FALSE TRUE TRUE TRUE FALSE TRUE FALSE
[2,] FALSE TRUE FALSE FALSE TRUE TRUE TRUE FALSE TRUE FALSE
[3,] FALSE TRUE FALSE FALSE TRUE TRUE TRUE FALSE TRUE FALSE
sdframe.t <- t(sdframe)
sdframe.t
[,1] [,2] [,3]
AGGAG.+ATG FALSE FALSE FALSE
AGAAG.+ATG TRUE TRUE TRUE
AAAGG.+ATG FALSE FALSE FALSE
GGAGG.+ATG FALSE FALSE FALSE
GAAGA.+ATG TRUE TRUE TRUE
GGAGA.+ATG TRUE TRUE TRUE
AAGGT.+ATG TRUE TRUE TRUE
AGGAA.+ATG FALSE FALSE FALSE
AAGGA.+ATG TRUE TRUE TRUE
GTGGA.+ATG FALSE FALSE FALSE
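In the first line you do not actually create a data frame: data.frame without parentheses is the function itself, so your output is a list.
Call data.frame() instead, and use rbind rather than cbind to add rows (one row per motif):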
sdframe <- data.frame()
sdseqs = c("AGGAG.+ATG",
"AGAAG.+ATG","AAAGG.+ATG","GGAGG.+ATG","GAAGA.+ATG",
"GGAGA.+ATG","AAGGT.+ATG","AGGAA.+ATG","AAGGA.+ATG","GTGGA.+ATG")
for (i in 1:10) {
sdframe <- rbind(sdframe,(grepl(sdseqs[i], x)))
}
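To keep the layout asked for in the question (sequences as rows, motifs as columns) while still defining the container before the loop, one possible sketch is to preallocate a logical matrix and fill it column by column. The sequences in x and the shortened motif list below are invented for illustration, and the name hits is hypothetical.

```r
# Hypothetical sequences and a shortened motif list
x <- c("AGGAGCCATG", "TTTTTTTT", "AGAAGTATG")
sdseqs <- c("AGGAG.+ATG", "AGAAG.+ATG")

# Preallocate: one row per sequence, one column per motif
hits <- matrix(FALSE, nrow = length(x), ncol = length(sdseqs),
               dimnames = list(NULL, sdseqs))

# Fill one column per motif
for (i in seq_along(sdseqs)) {
  hits[, i] <- grepl(sdseqs[i], x)
}

sdframe <- as.data.frame(hits)
```

Preallocating avoids both the stray first column and the cost of growing the object on every iteration.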

Comparing vector values: one element with all others

I'm wondering how I can compare 1 element of a vector with all elements in the other vector. As an example: suppose
x <- c(1:10)
y <- c(10,11,12,13,14,1,7)
Now I can compare the elements pairwise
x == y
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
But I want to compare all elements of y with a specific element of x, something like
x[7] == y
[1] FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
Is this possible?
Do you mean something like this?
x <- 1:10
y <- c(10,7,11,12,13,14,15,16,17,18)
res <- outer(x, y, `==`)
colnames(res) <- paste0("y=", y)
rownames(res) <- paste0("x=", x)
Which gives you the following matrix:
y=10 y=7 y=11 y=12 y=13 y=14 y=15 y=16 y=17 y=18
x=1 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
x=2 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
x=3 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
x=4 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
x=5 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
x=6 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
x=7 FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
x=8 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
x=9 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
x=10 TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
If you want the dimnames to be as y[1] use
colnames(res) <- paste0("y[", seq_along(y), "]")
rownames(res) <- paste0("x[", seq_along(x), "]")
which gives you:
y[1] y[2] y[3] y[4] y[5] y[6] y[7] y[8] y[9] y[10]
x[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
x[2] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
x[3] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
x[4] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
x[5] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
x[6] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
x[7] FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
x[8] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
x[9] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
x[10] TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
To get the index use which as follows:
which(res)
[1] 10 17
Because R stores matrices column-wise (column-major order), the linear indices are 10 and 17.
If you want the row and column indices (the positions in x and y), use:
which(res, arr.ind = TRUE)
row col
x=10 10 1
x=7 7 2
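The relationship between the linear indices and the row/column positions can be checked with a short sketch using base R's arrayInd(). The names lin and pos are made up here.

```r
x <- 1:10
y <- c(10, 7, 11, 12, 13, 14, 15, 16, 17, 18)
res <- outer(x, y, `==`)

# Linear (column-major) indices of the TRUE cells
lin <- which(res)

# Convert linear indices back to (row, column) pairs
pos <- arrayInd(lin, dim(res))
pos
```

Index 17 maps to row 7, column 2 because the first column occupies indices 1 to 10.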
If you want to compare each element of x to y, one of the 'apply' functions will usually help, as follows:
x <- c(1:10)
y <- c(10,11,12,13,14,1,7)
sapply(x,function(z){z==y})
Column i of the output is the result of x[i] == y.
Is this what you're looking for?
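For the single-element case in the question, a minimal sketch is to compare directly: a length-1 value is recycled across the other vector, so x[7] == y already works. The name hit is hypothetical.

```r
x <- 1:10
y <- c(10, 11, 12, 13, 14, 1, 7)

# x[7] has length 1, so it is compared against every element of y
hit <- x[7] == y

which(hit)    # position(s) in y equal to x[7]
any(hit)      # is x[7] present in y at all?
x[7] %in% y   # equivalent membership test
```

Here hit has the same length as y, with TRUE only at the last position.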
