Extract indices from array meeting a condition in R

Say I have d<-c(1,2,3,4,5,6,6,7). How can I select the indices from d that meet a certain condition such as x>3 and x<=6 (i.e. d[4], d[5], d[6], d[7])?

Use which
> which(d>3 & d<=6)
[1] 4 5 6 7

Minor: c() creates a vector, which is similar to but not exactly an array.
You can also create a logical vector and use it to subset d directly. Note that this returns the matching values, not their indices.
d[d>3 & d<=6] # the operators return logical vectors, [] extracts
# only the elements where the logical vector is TRUE
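To make the difference between the two answers concrete, here is a small sketch using the d from the question. The names keep and idx are introduced only for illustration.

```r
d <- c(1, 2, 3, 4, 5, 6, 6, 7)

# Logical vector: one TRUE/FALSE per element of d
keep <- d > 3 & d <= 6

# which() turns the logical vector into the positions of the TRUE values
idx <- which(keep)
idx
# [1] 4 5 6 7

# Indexing with either the positions or the logical vector gives the values
d[idx]
# [1] 4 5 6 6
d[keep]
# [1] 4 5 6 6
```

If the indices themselves are needed (as in the question), which() is the right tool; if only the values are needed, the logical vector can be used on its own.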

Related

Count minimum values in a vector, where the minimum values are in consecutive order

As the title says (and it's probably very easy): how can I count the number of minimum values in a vector, or more specifically in a subset of a vector?
Here is an example:
a <- c(1,1,1,2,2)
So I want the output to be 3 (since there are three 1's).
You can use == to get a logical vector; sum() then counts the number of TRUE values in it.
sum(a == min(a))
# [1] 3
You can use table, which sorts its counts by value, so the first entry belongs to the minimum:
table(a)[1]
#1
#3
or, if you want to unname it,
unname(table(a)[1])
#[1] 3
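We can use tabulate, but only because the minimum here is 1: tabulate counts occurrences of the integers 1, 2, 3, ..., so its first bin is the count of 1s, not the count of the minimum in general.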
tabulate(a)[1]
#[1] 3
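The answers above behave differently once the minimum is not 1. The sketch below uses a second, hypothetical vector b (not from the question) to show this, and adds an rle() variant for one possible reading of the title, where only the first consecutive run of the minimum should be counted.

```r
a <- c(1, 1, 1, 2, 2)
b <- c(5, 3, 3, 7, 3)   # hypothetical vector whose minimum is 3, not 1

sum(a == min(a))
# [1] 3
sum(b == min(b))
# [1] 3

# tabulate() counts occurrences of the integers 1, 2, 3, ...,
# so its first bin is the count of 1s, not the count of the minimum
tabulate(b)[1]
# [1] 0

# table() sorts by value, so its first entry is the minimum
unname(table(b)[1])
# [1] 3

# Length of the first consecutive run of the minimum only
r <- rle(b)
r$lengths[which(r$values == min(b))[1]]
# [1] 2
```

sum(x == min(x)) is the most general of these, since it makes no assumption about the values or their order.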

How to test if an object is a vector in R

I want to test if an object is a vector in R. I'm confused as to why
is.vector(c(0.1))
returns TRUE and so does
is.vector(0.1)
I would like it to return false when it is just a number and true when it is a vector. Can anyone offer any help on this please?
Many thanks in advance.
In R, a single number or string never exists on its own. It is always a vector of length 1, or an element embedded in some more complex structure.
is.vector(c(0.1)) and is.vector(0.1) are therefore exactly equivalent in R.
That is also why length("this is a string/character") returns 1: length() here measures the number of elements in the vector.
You can see this if you type "this is a string/character" into the R console:
It returns [1] "this is a string/character". The [1] is the index of the first element printed on that line, and here the vector has only that one element.
So you have to call nchar("this is a string/character") to get the number of characters in the first element (the character string), which returns 26.
nchar(c("this is a string/character", "and this another string"))
## [1] 26 23
## nchar is vectorized as you see ...
This is an important difference from Python, where strings and numbers can stand alone.
So len("this") returns 4 in Python, whereas len(["this"]) returns 1 (one element in the list, so the length of the list is 1).
As already mentioned by @RHertel, R considers c(0.1) a vector of length 1. You may want to test for length as well, e.g.
> x <- 1
> y <- 1:2
> is.vector(x) && length(x) > 1
[1] FALSE
> is.vector(y) && length(y) > 1
[1] TRUE
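The length test above can be wrapped in a small function. This is only a sketch: is_multi_vector is a hypothetical name, not a base R function, and "more than one element" is an assumption about what the asker means by "a vector".

```r
# Hypothetical helper: TRUE only for vectors with more than one element
is_multi_vector <- function(x) {
  is.vector(x) && length(x) > 1
}

is_multi_vector(0.1)         # FALSE: a length-1 vector
is_multi_vector(c(0.1))      # FALSE: identical to the line above
is_multi_vector(c(0.1, 2))   # TRUE
is_multi_vector(list(1, 2))  # TRUE: lists count as vectors for is.vector()
```

If lists should be excluded, is.atomic(x) could be used in place of is.vector(x).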

R: how does the which() function operate [duplicate]

I am confused about the which function. Basically I thought that it checks at which position of an input object (e.g., a vector) a logical condition is true. As seen in the documentation:
which(LETTERS == "R")
[1] 18
In other words, it goes through all LETTERS values and checks if value == R. But this seems to be a misunderstanding. If I input
a <- c("test","test2","test3","test4")
b <- c("test","test3")
which(a==b)
[1] 1
it returns [1] 1 although test3 does also appear in both vectors. Also, if I input a shorter vector for a, it returns a warning:
a <- c("test","test2","test3")
b <- c("test","test3")
which(a==b)
[1] 1
Warning message:
In a == b : longer object length is not a multiple of shorter object length
My question here is twofold:
How can I return the positions of a character vector a that match a character vector b?
How does which() operate? I obviously misunderstand the function.
Thank you for your answers
Edit: thank you for your quick replies, you clarified my misunderstanding!
You need to give which an input that tells it which elements of a are in b:
which(a%in%b)
[1] 1 3
which essentially identifies which elements are TRUE in a logical vector.
== compares values element by element (a[1]==b[1], a[2]==b[2], ...), not as sets.
For set operations, use %in%.
Use a[which(a %in% b)] to get [1] "test" "test3".
which() returns the indices of the TRUE elements (!), not the values.
which(a %in% b) will return
[1] 1 3
The reason for the strange warning message is R's recycling:
Warning message:
In a == b : longer object length is not a multiple of shorter object length
When you compare a vector of length 4 with a vector of length 2 element by element (using ==), R 'recycles' the shorter vector. With lengths 4 and 2 this works silently, and the comparisons are (a1==b1, a2==b2, a3==b1, a4==b2). With lengths 3 and 2, R still recycles and returns a result, but it warns that the longer length is not an integer multiple of the shorter one.
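The following sketch puts the positional comparison and the set-style comparison side by side, using the vectors from the question. match() is included as a related base R function that answers the reverse question (where in a each element of b is found).

```r
a <- c("test", "test2", "test3", "test4")
b <- c("test", "test3")

# == recycles b to c("test", "test3", "test", "test3") and compares by position
a == b
# [1]  TRUE FALSE FALSE FALSE

# %in% asks, for each element of a, whether it occurs anywhere in b
a %in% b
# [1]  TRUE FALSE  TRUE FALSE

which(a %in% b)   # positions in a that have a match in b
# [1] 1 3

match(b, a)       # position in a of each element of b
# [1] 1 3
```

which(a %in% b) and match(b, a) agree here, but they can differ in general: match() follows the order of b and returns NA for elements of b that are not in a.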

How can I compare two strings to find the number of characters that match in R, using substitution distance?

In R, I have two character vectors, a and b.
a <- c("abcdefg", "hijklmnop", "qrstuvwxyz")
b <- c("abXdeXg", "hiXklXnoX", "Xrstuvwxyz")
I want a function that counts the character mismatches between each element of a and the corresponding element of b. Using the example above, such a function should return c(2,3,1). There is no need to align the strings.
I need to compare each pair of strings character-by-character and count matches and/or mismatches in each pair. Does any such function exist in R?
Or, to ask the question in another way, is there a function to give me the edit distance between two strings, where the only allowed operation is substitution (ignore insertions or deletions)?
Using some mapply fun:
mapply(function(x,y) sum(x!=y),strsplit(a,""),strsplit(b,""))
#[1] 2 3 1
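Another option is adist, which computes the approximate string distance between character vectors. Note that this is the generalized Levenshtein distance, so it also allows insertions and deletions and can be smaller than the substitution-only count; for these examples the two agree:
mapply(adist,a,b)
   abcdefg  hijklmnop qrstuvwxyz
         2          3          1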
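A substitution-only count is only well defined when the paired strings have the same length, so a sketch of the mapply approach with that check added may be useful. count_mismatches is a hypothetical name, and the "abcd"/"bcda" pair is an illustrative input that is not from the question.

```r
# Hypothetical helper: count positions where two equal-length strings differ
count_mismatches <- function(x, y) {
  stopifnot(length(x) == length(y), all(nchar(x) == nchar(y)))
  mapply(function(p, q) sum(p != q),
         strsplit(x, ""), strsplit(y, ""),
         USE.NAMES = FALSE)
}

a <- c("abcdefg", "hijklmnop", "qrstuvwxyz")
b <- c("abXdeXg", "hiXklXnoX", "Xrstuvwxyz")
count_mismatches(a, b)
# [1] 2 3 1

# A shifted string shows where the two approaches disagree
count_mismatches("abcd", "bcda")   # every position differs
# [1] 4
drop(adist("abcd", "bcda"))        # one deletion plus one insertion
# [1] 2
```

For strings that may be shifted relative to each other, the positional count is therefore the one that matches "substitution only".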

How many elements in a vector are greater than x without using a loop

If I have the following vector :
x
[1] 1 5 8 9 1 0 15 15
and I want to know how many elements are greater than 10, how can I proceed without using a loop ?
I would like to get :
2
as a result
Use length or sum:
> length(x[x > 10])
[1] 2
> sum(x > 10)
[1] 2
In the first approach, you create a vector containing the values that match your condition and then retrieve its length.
In the second approach, you simply create a logical vector that states whether each value matches the condition (TRUE) or not (FALSE). Since sum coerces TRUE and FALSE to 1 and 0, you can use it directly to get your answer.
Because the first approach has to subset before counting, I am almost certain that the second approach is faster.
Another way to do this:
> length(which(as.vector(x) > 10))
[1] 2
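The three approaches give the same count for the x above, but they diverge when the vector contains missing values. The sketch below illustrates this with a hypothetical vector y (x with an NA appended), which is not part of the question.

```r
x <- c(1, 5, 8, 9, 1, 0, 15, 15)
sum(x > 10)
# [1] 2
mean(x > 10)                # proportion rather than count
# [1] 0.25

# Hypothetical variant with a missing value
y <- c(x, NA)
sum(y > 10)                 # NA propagates through sum()
# [1] NA
sum(y > 10, na.rm = TRUE)
# [1] 2
length(which(y > 10))       # which() ignores NA
# [1] 2
length(y[y > 10])           # an NA index yields an NA element, so this over-counts
# [1] 3
```

With possible NAs, sum(..., na.rm = TRUE) or length(which(...)) is the safer choice.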

Resources