This question already has answers here:
Boolean operators && and ||
(4 answers)
Closed 3 years ago.
I can't find the answer and the simple approaches I've tried haven't worked.
Basically, I have two corresponding dataframes with identical dimensions, full of boolean values.
I want "OR" logic: a third data frame of the same shape, with TRUE wherever either of the starting data frames has TRUE.
df1 <- data.frame(a = c(TRUE, TRUE),
                  b = c(FALSE, FALSE))
df2 <- data.frame(a = c(FALSE, TRUE),
                  b = c(FALSE, TRUE))
Desired output:
        a     b
[1,] TRUE FALSE
[2,] TRUE  TRUE
It works using the | operator:
df1 | df2
        a     b
[1,] TRUE FALSE
[2,] TRUE  TRUE
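As the `[1,]` row labels in the output suggest, `df1 | df2` returns a logical matrix rather than a data frame. If a data frame is needed, one option is to wrap the result in `as.data.frame()`. The sketch below rebuilds the example data; the names `res_mat` and `res_df` are only illustrative.

```r
# Rebuild the two example data frames from the question
df1 <- data.frame(a = c(TRUE, TRUE), b = c(FALSE, FALSE))
df2 <- data.frame(a = c(FALSE, TRUE), b = c(FALSE, TRUE))

# Element-wise OR; the result is a logical matrix
res_mat <- df1 | df2

# Convert back to a data frame, keeping the column names
res_df <- as.data.frame(res_mat)
res_df
```

The single `|` is the element-wise operator; `||` is meant for single TRUE/FALSE values and is not suitable here.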
This question already has answers here:
Using regex in R to find strings as whole words (but not strings as part of words)
(2 answers)
Closed 1 year ago.
I looked at this question (How to filter Exact match string using dplyr), but my case is slightly different: the word is not necessarily at the start and can occur anywhere in the string. I want TRUE to be returned only for the first element, not for the second and third.
library(stringr)
vec <- c("this should be selected", "thisus should not be selected","not selected thisis too")
str_detect(vec,"this")
Current output
TRUE TRUE TRUE
Expected output
TRUE FALSE FALSE
Use a word boundary (\\b):
stringr::str_detect(vec,"\\bthis\\b")
#[1] TRUE FALSE FALSE
In base R:
grepl('\\bthis\\b', vec)
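If the word to match is stored in a variable, the same pattern can be built with `paste0()`. This is only a sketch: `detect_whole_word` is a hypothetical helper, and it assumes the word contains only letters, digits, or underscores (no regex metacharacters that would need escaping).

```r
vec <- c("this should be selected",
         "thisus should not be selected",
         "not selected thisis too")

# Hypothetical helper: TRUE where `word` occurs as a whole word.
# Assumes `word` has no regex metacharacters.
detect_whole_word <- function(x, word) {
  pattern <- paste0("\\b", word, "\\b")  # wrap the word in word boundaries
  grepl(pattern, x)
}

detect_whole_word(vec, "this")
#[1]  TRUE FALSE FALSE
```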
This question already has answers here:
List distinct values in a vector in R
(7 answers)
Closed 2 years ago.
I would like to extract duplicated strings from a list. Because the unique function did not seem to work on my non-numerical data, I used the stri_duplicated function from the stringi package to obtain logical values (TRUE or FALSE). I would like to extract the strings that are duplicated (those for which stri_duplicated reports TRUE).
Here is a minimal example:
library(stringi)
ex1 <- c("SE1", "SE2", "SE5", "SE2")
dupl <- stri_duplicated(ex1)
> dupl
[1] FALSE FALSE FALSE TRUE
Many thanks in advance.
In base R there is duplicated():
duplicated(ex1)
[1] FALSE FALSE FALSE TRUE
If you want to extract the duplicated items:
ex1[duplicated(ex1)]
[1] "SE2"
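Note that `duplicated()` flags every repeat after the first occurrence, so a value that appears three times is returned twice. If each duplicated value should be listed only once, wrapping the result in `unique()` (which does work on character vectors) is one option. The vector `ex2` below is a made-up extension of the example, not data from the question.

```r
# Illustrative vector in which "SE2" appears three times
ex2 <- c("SE1", "SE2", "SE5", "SE2", "SE2")

# Every repeat after the first occurrence
repeats <- ex2[duplicated(ex2)]

# Each duplicated value listed once
dup_values <- unique(repeats)

repeats
dup_values
```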
This question already has answers here:
How to count the frequency of a string for each row in R
(4 answers)
Counting number of instances of a condition per row R [duplicate]
(1 answer)
Closed 3 years ago.
I have the following data frame, and I want to count the occurrences of the word "High" in each row and append the result to the data frame as another column, say "countHigh".
     a    b    c
1 High High High
2 High  Low High
3  Low  Low High
So I should get a vector of counts (3,2,1).
I've tried apply() and stringr::str_count as follows:
> apply(test.df[,1:3],1,str_count,"High" )
[,1] [,2] [,3]
[1,] 1 1 0
[2,] 1 0 0
[3,] 1 1 1
and I used the apply() function twice:
> apply(apply(test.df[,1:3],1,str_count,"High" ),2,sum)
[1] 3 2 1
Is there a better way to do this, particularly using apply() just once and using grep() or which() ?
Thanks
If it is a fixed string, create a logical matrix with == and take its rowSums (base R); this should be fast compared to apply.
test.df$countHigh <- rowSums(test.df == "High")
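A self-contained version of the above, as a sketch: the question does not show how `test.df` was built, so the construction below is an assumption based on the printed table. Restricting the comparison to the three original columns keeps the line safe to re-run after `countHigh` has been added.

```r
# Assumed reconstruction of the data frame shown in the question
test.df <- data.frame(a = c("High", "High", "Low"),
                      b = c("High", "Low", "Low"),
                      c = c("High", "High", "High"),
                      stringsAsFactors = FALSE)

# Logical matrix of matches, then count TRUE values per row
test.df$countHigh <- rowSums(test.df[, c("a", "b", "c")] == "High")
test.df$countHigh
```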
This question already has an answer here:
Removing duplicate combinations (irrespective of order)
(1 answer)
Closed 7 years ago.
I have the following problem:
I have a k*2 matrix with unique rows (unique() was applied beforehand); the essential point is that it has two columns.
Now I only want to keep the rows that are not a permutation of another row,
but in such a way that if there is a permutation, I keep exactly one of the two.
Background: each element of this matrix is associated with a column of another data vector, and I want to take differences of many pairs of such vectors and project onto the resulting (difference) vector.
Projecting onto +/- the vector gives the same result, so for this application a permuted row is a duplicate.
Example:
     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    2    1
Desired result:
     [,1] [,2]
[1,]    1    2
[2,]    3    4
1. Create a copy of the matrix.
2. Sort each of its rows.
3. Find the indices of non-duplicate rows using the duplicated function.
4. Select these rows from the original matrix.
Or, if the order within each row does not matter, just run unique after step 2.
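The steps above can be sketched in base R as follows. The variable names (`m`, `m_sorted`, `keep`, `result`) are illustrative, and the matrix is the example from the question.

```r
# Example matrix from the question
m <- matrix(c(1, 2,
              3, 4,
              2, 1), ncol = 2, byrow = TRUE)

# Steps 1-2: sorted copy; apply() returns sorted rows as columns, so transpose
m_sorted <- t(apply(m, 1, sort))

# Step 3: rows whose sorted form has not appeared earlier
keep <- !duplicated(m_sorted)

# Step 4: select those rows from the original matrix
result <- m[keep, , drop = FALSE]
result

# Alternative when the order within a row does not matter
unique(m_sorted)
```

Using `drop = FALSE` keeps the result a matrix even when only one row survives.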
This question already has answers here:
Reshaping data.frame from wide to long format
(8 answers)
Repeat each row of data.frame the number of times specified in a column
(10 answers)
Closed 5 years ago.
I have a data frame with counts of each combination of a trait (true / false) for species A and B. Here's a smaller version of my data:
species <- c("A", "B")
true <- c(3, 2)
false <- c(1, 4)
df <- data.frame(species, true, false)
df
  species true false
1       A    3     1
2       B    2     4
Is there any way to convert these summarized counts to one row per observation, with a first column "Species" (A or B) and a second column "Trait" (true or false)?
Species Trait
A true
A true
A true
B true
B true
A false
B false
B false
B false
B false
I don't really know how to approach this. Usually the raw data is available and a summary table can easily be constructed from it, but this is the reverse direction.
I'm thankful for every answer! :)
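One possible base R approach, offered only as a sketch and not taken from the thread, is to repeat each species label by its count with `rep()`. The name `long` and the column names `Species` and `Trait` follow the desired output above; everything else reuses the question's data.

```r
species <- c("A", "B")
true <- c(3, 2)
false <- c(1, 4)
df <- data.frame(species, true, false, stringsAsFactors = FALSE)

# Repeat each species label by its "true" count, then by its "false" count
long <- data.frame(
  Species = c(rep(df$species, times = df$true),
              rep(df$species, times = df$false)),
  Trait = rep(c("true", "false"),
              times = c(sum(df$true), sum(df$false))),
  stringsAsFactors = FALSE
)
long
```

The rows come out in the same order as the desired table: all "true" rows first (A, then B), followed by all "false" rows.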