Count number of identical values between two vectors - r

a<-c(19,24,34,47,47,47)
b<-c(3,14,24,25,47,47)
I want to know how many values in a match those in b, but I'm running into issues when duplicate numbers are present in both vectors. My desired answer for the above example would be 3, because 24, 47, 47 are shared between the two vectors.
If I use intersect:
intersect(a,b)
[1] 24 47
The 2nd matching 47 is ignored.
If I use %in%:
length(which(a %in% b))
[1] 4
The extra 47 in a is also counted.
I realise that I can do:
length(which(b %in% a))
[1] 3
However, I may also have cases where there is an extra matching value in b instead of a and so %in% is also not useful. For example:
a<-c(19,24,34,7,47,47)
b<-c(3,14,24,47,47,47)
length(which(b %in% a))
[1] 4 (I want the answer to still be 3)
So, without rearranging which vector comes first in the %in% function, for each test - I cannot figure out how to do this. Can somebody show me how?

How about:
sum(pmin(
table(a[a %in% intersect(a, b)]),
table(b[b %in% intersect(a, b)])
))
We make table()s of the parts of a and b that are common to both, then take the smaller count for each value with pmin() and add them up.
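As a sketch, the same idea can be wrapped in a helper. The name count_shared is hypothetical, and this variant tabulates both vectors over the same set of common values via factor(), an assumption made here so that the two count vectors line up position by position:

```r
# count_shared is a hypothetical helper name, not part of the original answer.
count_shared <- function(a, b) {
  common <- intersect(a, b)                           # distinct values present in both
  ta <- as.vector(table(factor(a, levels = common)))  # how often each occurs in a
  tb <- as.vector(table(factor(b, levels = common)))  # how often each occurs in b
  sum(pmin(ta, tb))                                   # smaller count per value, summed
}

count_shared(c(19, 24, 34, 47, 47, 47), c(3, 14, 24, 25, 47, 47))
count_shared(c(19, 24, 34, 7, 47, 47), c(3, 14, 24, 47, 47, 47))
```

Both calls give 3 for the example vectors in the question, and swapping the arguments does not change the result.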

Related

Making a subvector, that only keeps elements divisible by three

I am currently struggling with starting out in R. I had a task of creating a vector v=c(1,3,5,7,8,9,11,13,15,17,19,21), and then creating a subvector that only keeps the elements of v that are divisible by three.
I suppose I would have to use the %% operator, but I am not really sure how to get it to pick and choose, instead of just dividing every element by three. I also tried to create a vector of just threes in order to divide the original vector by that... no surprise that didn't work lol.
Any help appreciated, I just want to get to know how to use the different operators and commands.
Create vector
x <- c(1,3,5,7,8,9,11,13,15,17,19,21)
Filter elements that can be divided by 3
x[(x %% 3) == 0]
Result
[1] 3 9 15 21
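To see why this picks elements rather than dividing them, it may help to look at the intermediate steps. The names remainders and is_div3 below are only illustrative:

```r
x <- c(1, 3, 5, 7, 8, 9, 11, 13, 15, 17, 19, 21)

remainders <- x %% 3        # remainder of each element after dividing by 3
is_div3 <- remainders == 0  # TRUE where the element is a multiple of 3
x[is_div3]                  # keeps only the elements flagged TRUE
```

Indexing with a logical vector of the same length returns the elements at the TRUE positions, so nothing in x is modified.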
If you want to keep only the elements divisible by 3, you can do it with purrr::keep and the modulo operator x %% y.
library(purrr)
v %>% keep(v%%3==0)
#OR simply
keep(v, v%%3==0)
[1] 3 9 15 21

Vectorized vector construction in R through indexing

I would like to construct an atomic vector X using values from a vector A, such that length(X)>=length(A). Furthermore, the values of X are indexed by a third vector B such that length(B)=length(X). The mapping to construct X is as follows:
X[i] <- A[B[i]]
Now, it is clear to me how I would construct the vector X in a for loop. My question is: since the X is due to be quite large (length(X) ~ 30,000) is there a way to vectorize the construction of X? That is, apply a blanket function that avoids element by element calculation. I looked into functions such as sapply and mapply, but I didn't see how I could incorporate the indexing of vector B into those.
For example, if:
A <- c(20,31,17,110,87)
B <- c(1,1,2,1,1,3,4,3,5)
I would expect X to be:
X <- c(20,20,31,20,20,17,110,17,87)
That's very simple to vectorise, so you can avoid overcomplicating it with applys or loops etc. - simply use B as numerical vector to index the values of A.
In your case, using A[B] translates to A[c(1,1,2,1,...,5)] which is basically saying "return the 1st element of A, the first element of A, the second element of A, the first element of A... the fifth element of A".
A <- c(20,31,17,110,87)
B <- c(1,1,2,1,1,3,4,3,5)
A[B]
## > A[B]
## [1] 20 20 31 20 20 17 110 17 87
X <- A[B]
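As a quick check, here is a sketch comparing the explicit loop from the question with the vectorised form. The names X_loop and X_vec are illustrative:

```r
A <- c(20, 31, 17, 110, 87)
B <- c(1, 1, 2, 1, 1, 3, 4, 3, 5)

# Explicit loop version of X[i] <- A[B[i]]
X_loop <- numeric(length(B))
for (i in seq_along(B)) {
  X_loop[i] <- A[B[i]]
}

# Vectorised version: index A by the whole vector B at once
X_vec <- A[B]

identical(X_loop, X_vec)
```

The two results are identical for this example, so the loop can simply be replaced by A[B].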

$value in unidimensional integrals in R [duplicate]

I have transitioned from STATA to R, and I was experimenting with different data types so that R's data structures are clear in my mind.
Here's how I set up my data structure:
b<-list(u=5,v=12)
c<-list(u=7)
j<-list(name="Joe",salary=55000,union=T)
bcj<-list(b,c,j)
Now, I was trying to figure out different ways to access u=5. I believe there are three ways:
Try1:
bcj[[1]][[1]]
I got 5. Correct!
Try2:
bcj[[1]][["u"]]
I got 5. Correct!
Try3:
bcj[[1]]$u
I got 5. Correct!
Try4
bcj[[1]][1][1]
Here's what I got:
bcj[[1]][1][1]
$u
[1] 5
class(bcj[[1]][1][1])
[1] "list"
Question 1: Why did this happen?
Also, I experimented with the following:
bcj[[1]][1][1][1][1][1]
$u
[1] 5
class(bcj[[1]][1][1][1][1][1])
[1] "list"
Question 2: I would have expected an error because I don't think so many lists exist in bcj, but R gave me a list. Why did this happen?
PS: I did look at this thread on SO, but it's talking about a different issue.
I think this is sufficient to answer your question. Consider a length-1 list:
x <- list(u = 5)
#$u
#[1] 5
length(x)
#[1] 1
x[1]
x[1][1]
x[1][1][1]
...
always gives you the same:
#$u
#[1] 5
In other words, x[1] is identical to x, so applying [1] again just returns the same list. No matter how many [1] you write, you get x itself.
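This can be verified directly with identical(), shown here as a small sketch:

```r
x <- list(u = 5)

identical(x[1], x)        # [ returns a list holding the selected element
identical(x[1][1][1], x)  # repeating [1] keeps returning the same list
is.list(x[1])             # the result is still a list, never the bare value
```

All three expressions evaluate to TRUE.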
If I create t1<-list(u=5,v=7), and then do t1[2][1][1][1]...this works as well. However, t1[[2]][2] gives NA
That is the difference between [[ and [ when indexing a list. Using [ will always end up with a list, while [[ will take out the content. Compare:
z1 <- t1[2]
## this is a length-1 list
#$v
#[1] 7
class(z1)
# "list"
z2 <- t1[[2]]
## this takes out the content; in this case, a vector
#[1] 7
class(z2)
#[1] "numeric"
When you do z1[1][1]..., as discussed above, you always end up with z1 itself. While if you do z2[2], you surely get an NA, because z2 has only one element, and you are asking for the 2nd element.
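Applied to the nested list from the question, the same distinction looks like this. This is a sketch that rebuilds bcj inline and writes union=T as TRUE:

```r
b <- list(u = 5, v = 12)
bcj <- list(b, list(u = 7), list(name = "Joe", salary = 55000, union = TRUE))

class(bcj[[1]][1])    # "list": [ keeps the list wrapper around u
class(bcj[[1]][[1]])  # "numeric": [[ extracts the value itself
bcj[[1]][[1]][2]      # NA: the extracted vector has only one element
```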
Perhaps this post and my answer there is useful for you: Extract nested list elements using bracketed numbers and names?

Combining the common elements in two lists in R, using only logical and arithmetic operators

I'm currently attempting to work out the GCD of two numbers (x and y) in R. I'm not allowed to use loops or if, else, ifelse statements, so I'm restricted to logical and arithmetic operators. So far, using the code below, I've managed to make lists of the factors of x and y.
xfac <- 1:x
xfac[x %% xfac == 0]
This gives me two lists of factors but i'm not sure where to go from here. Is there a way I can combine the common elements in the two lists and then return the greatest value?
Thanks in advance.
Yes, max(intersect(xfac,yfac)) should give the gcd.
You have almost solved the problem. Let's take the example x <- 12 and y <- 18. The GCD is in this case 6.
We can start by creating vectors xf and yf containing the divisors of each number, similar to the code you have shown:
xf <- (1:x)[!(x%%(1:x))]
#> xf
#[1] 1 2 3 4 6 12
yf <- (1:y)[!(y%%(1:y))]
#> yf
#[1] 1 2 3 6 9 18
The parentheses after the negation operator ! are not necessary due to specific rules of operator precedence in R, but I think that they make the code clearer in this case (see fortunes::fortune(138)).
Once we have defined these vectors, we can extract the GCD with
max(xf[xf %in% yf])
#[1] 6
Or, equivalently,
max(yf[yf %in% xf])
#[1] 6
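A more compact variant, offered as a sketch rather than as the answer's own method, tests a single vector of candidate divisors against both numbers using only arithmetic and logical operators. The function name gcd_simple is hypothetical:

```r
# gcd_simple is a hypothetical name; x and y are assumed to be positive integers.
gcd_simple <- function(x, y) {
  d <- 1:min(x, y)                   # candidates: no common divisor exceeds min(x, y)
  max(d[x %% d == 0 & y %% d == 0])  # largest candidate dividing both numbers
}

gcd_simple(12, 18)
```

For x = 12 and y = 18 this returns 6, matching the result above.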

How to find common elements on two different length vectors in R?

I need to find common elements in 2 different length vectors.
For example, I have a vector A with 10 elements, and a vector B with 3 elements.
I need to get the positions of the elements in A that are equal to elements of B.
A=c(1,2,45,3,10,5,11,13,6,7)
B=c(45,3,10)
C would be [3,4,5]
I have already tried the "match" and "intersect" functions, but with no success :(
Thanks a lot! :)
You can use the which function.
> which(A %in% B)
[1] 3 4 5
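For comparison, match(B, A) also returns positions in A, but in the order of B and only the first occurrence of each value. A short sketch, where B2 is a made-up second example:

```r
A <- c(1, 2, 45, 3, 10, 5, 11, 13, 6, 7)
B <- c(45, 3, 10)

which(A %in% B)   # positions in A, reported in A's order
match(B, A)       # position in A of each element of B, in B's order

B2 <- c(10, 45)   # same kind of lookup, but listed in a different order than in A
which(A %in% B2)  # 3 5
match(B2, A)      # 5 3
```

For the original A and B both calls give 3 4 5. They differ only when the order in B differs from the order in A, or when A contains repeated values.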
