I have a vector containing a list of numbers. How do I find numbers that are missing from the vector?
For example:
sequence <- c(12:17,1:4,6:10,19)
The missing numbers are 5, 11 and 18.
sequence <- c(12:17,1:4,6:10,19)
seq2 <- min(sequence):max(sequence)
seq2[!seq2 %in% sequence]
...and the output:
> seq2[!seq2 %in% sequence]
[1] 5 11 18
You can use the setdiff() function to compute set differences. You want the difference between the complete sequence (from min(sequence) to max(sequence)) and the sequence with missing values.
setdiff(min(sequence):max(sequence), sequence)
This answer just gets all of the numbers from the lowest to highest in the sequence, then asks which are not present in the original sequence.
# which() returns positions; they equal the missing values here only because min(sequence) is 1
which(!(seq(min(sequence), max(sequence)) %in% sequence))
[1] 5 11 18
c(1:max(sequence))[!duplicated(c(sequence,1:max(sequence)))[-(1:length(sequence))]]
[1] 5 11 18
Not a particularly elegant solution, I admit, but it determines which elements of the vector 1:max(sequence) are duplicates of elements in sequence, and then drops those from that same vector.
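The answers above can be wrapped into a small helper. This is only a sketch: the name find_missing is hypothetical, and ignoring NA values is an assumption the question does not state. It returns values rather than positions, so it also works when the range does not start at 1.

```r
# Hypothetical helper (not from the answers above): values absent from the
# integer range spanned by x.
find_missing <- function(x) {
  x <- x[!is.na(x)]                 # assumption: NAs are simply ignored
  if (length(x) == 0) return(integer(0))
  full <- seq(min(x), max(x))       # complete range, lowest to highest
  setdiff(full, x)                  # values in the range that x lacks
}

sequence <- c(12:17, 1:4, 6:10, 19)
find_missing(sequence)
# 5 11 18

# A range that does not start at 1: values, not positions, come back
find_missing(c(10, 11, 13, 16))
# 12 14 15
```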
Related
Suppose a vector is produced by repeating a vector of unknown length, whose elements are all unique, an unknown number of times:
small_v <- c("as","d2","GI","Worm")
big_v <- rep(small_v, 3)
How can I then determine how long the original vector was and how many times it was repeated?
So in this example the original length was 4 and it repeats 3 times.
Realistically in my case the vectors will be fairly small and will be repeated only a few times.
1) Assuming that there is at least one unique element in small_v (which is the case in the question since it assumes all elements in small_v are unique):
min(table(big_v))
## [1] 3
or using pipes
big_v |> table() |> min()
## [1] 3
Here is a more difficult test but it still works because small_v2[2] is unique in small_v2 even though the other elements of small_v2 are not unique.
# test data
small_v2 <- c(small_v, small_v[-2])
big_v2 <- rep(small_v2, 3)
min(table(big_v2))
## [1] 3
2) If we knew that the first element of small_v were unique (which is the case in the question since it assumes all elements in small_v are unique) then this would work:
sum(big_v[1] == big_v)
## [1] 3
1) If every value in big_v is repeated the same number of times (which holds when all elements of small_v are unique), then use
length(big_v)/length(unique(big_v))
[1] 3
2) Or use
library(data.table)
max(rowid(big_v))
[1] 3
Alternatively, we could use rle() together with with() to count the repeats
with(rle(sort(big_v)), max(lengths))
[1] 3
Created on 2023-02-04 with reprex v2.0.2
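The answers above return only the repeat count, while the question also asks for the original length. A minimal sketch of recovering both, assuming (as the question does) that all elements of small_v are unique; the names n_rep and orig_len are introduced here for illustration:

```r
small_v <- c("as", "d2", "GI", "Worm")
big_v <- rep(small_v, 3)

# Repeat count: a unique element occurs exactly once per repetition
n_rep <- min(table(big_v))
# The original length then follows from the total length
orig_len <- length(big_v) / n_rep

n_rep     # 3
orig_len  # 4

# Sanity check: repeating the first orig_len elements reproduces big_v
identical(rep(big_v[seq_len(orig_len)], n_rep), big_v)  # TRUE
```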
I have a data.frame as given below. I want the index (row number) of the first row where (b-a)>8, but I want to start checking at row 7 rather than at row 1. The code I have written finds the row where b-a>8, but it checks from row 1. How can I check from row 7?
a <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16)
b <- c(2,12,4,5,2,5,8,5,7,19,6,7,4,23,1,2)
df <- data.frame(a,b)
which((df$b-df$a)>8)[1]
Desired output: Row number 10 not 2.
One can subset both vectors starting at row 7, take the first match, and add the offset back to get the row number in the full data.frame:
which((df$b[7:nrow(df)]-df$a[7:nrow(df)])>8)[1] + 6
#[1] 10
This is just a math calculation
(which(with(df[-(1:7),],b-a>8))+7)[1]
[1] 10
(a<-which((df$b-df$a)>8))[a>7][1]
[1] 10
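The same idea can be written without manual offset arithmetic by testing the row number and the condition together. This is a sketch; the helper name first_row_from and its arguments are hypothetical, and returning NA when nothing matches is an assumed convention.

```r
a <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16)
b <- c(2,12,4,5,2,5,8,5,7,19,6,7,4,23,1,2)
df <- data.frame(a, b)

# Hypothetical helper: first row at or after `start` where b - a > threshold
first_row_from <- function(df, start, threshold) {
  rows <- seq_len(nrow(df))
  hits <- which(rows >= start & (df$b - df$a) > threshold)
  if (length(hits) == 0) NA_integer_ else hits[1]   # NA when nothing matches
}

first_row_from(df, start = 7, threshold = 8)   # 10
first_row_from(df, start = 1, threshold = 8)   # 2
```

Because the row numbers are never shifted, there is no offset to add back afterwards.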
Assume I have three matrices...
A=matrix(c("a",1,2),nrow=1,ncol=3)
B=matrix(c("b","c",3,4,5,6),nrow=2,ncol=3)
C=matrix(c("d","e","f",7,8,9,10,11,12),nrow=3,ncol=3)
I want to find all possible combinations of column 1 (characters or names) while summing up columns 2 and 3. The result would be a single matrix whose number of rows equals the total number of possible combinations, in this case 6. The result would look like the following matrix...
Result <- matrix(c("abd","abe","abf","acd","ace","acf",11,12,13,12,13,14,17,18,19,18,19,20),nrow=6,ncol=3)
I do not know how to add a table to this question, otherwise I would show it more descriptively. Thank you in advance.
You are mixing character and numeric values in a matrix and this will coerce all elements to character. Much better to define your matrix as numeric and keep the character values as the row names:
A <- matrix(c(1,2),nrow=1,dimnames=list("a",NULL))
B <- matrix(c(3,4,5,6),nrow=2,dimnames=list(c("b","c"),NULL))
C <- matrix(c(7,8,9,10,11,12),nrow=3,dimnames=list(c("d","e","f"),NULL))
#put all the matrices in a list
mlist<-list(A,B,C)
Then we use some Map, Reduce and lapply magic:
res <- Reduce("+",Map(function(x,y) y[x,],
                      expand.grid(lapply(mlist,function(x) seq_len(nrow(x)))),
                      mlist))
Finally, we build the rownames
rownames(res)<-do.call(paste0,expand.grid(lapply(mlist,rownames)))
#    [,1] [,2]
#abd   11   17
#acd   12   18
#abe   12   18
#ace   13   19
#abf   13   19
#acf   14   20
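The rows above come out in expand.grid order, which differs from the row order shown in the question. A minimal sketch of reordering by row name, rebuilding res so the snippet runs on its own; the use of drop = FALSE and the name res_sorted are additions made here, not part of the answer:

```r
A <- matrix(c(1, 2), nrow = 1, dimnames = list("a", NULL))
B <- matrix(c(3, 4, 5, 6), nrow = 2, dimnames = list(c("b", "c"), NULL))
C <- matrix(c(7, 8, 9, 10, 11, 12), nrow = 3, dimnames = list(c("d", "e", "f"), NULL))
mlist <- list(A, B, C)

# One row of indices per combination, one column per matrix
idx <- expand.grid(lapply(mlist, function(m) seq_len(nrow(m))))
# Pick the indexed rows from each matrix and add them up element-wise
res <- Reduce("+", Map(function(i, m) m[i, , drop = FALSE], idx, mlist))
rownames(res) <- do.call(paste0, expand.grid(lapply(mlist, rownames)))

# Sort by the combined name to match the order in the question
res_sorted <- res[order(rownames(res)), ]
res_sorted
# rows: abd, abe, abf, acd, ace, acf
```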
You have the following vector which has NA's mixed in. There are no consecutive NA's and the vector is of unknown length. There are always the same number of NA's.
#Data
testvector <- c(NA,rnorm(round(abs(rnorm(1))*10)),NA,rnorm(round(abs(rnorm(1))*10)),NA,rnorm(round(abs(rnorm(1))*10)),NA,rnorm(round(abs(rnorm(1))*10)),NA,rnorm(round(abs(rnorm(1))*10)))
You need to find the number of non-NA values that exist after each NA. This needs to be returned as a vector. The length of this vector will equal the number of NA's.
For this vector.
thisvector <- c(NA,rnorm(4),NA,rnorm(5),NA,rnorm(9),NA,rnorm(2),NA,rnorm(6))
What you want is
somefunction(thisvector)
[1] 4 5 9 2 6
How can this be done?
Use rle on the output of is.na to get what you want:
x <- rle(is.na(testvector))
x$lengths[!x$values]
## [1] 1 2 5 9 4
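Since testvector has random run lengths, the result is easier to check on the fixed-layout vector from the question. A short sketch; the variable name runs and the set.seed() call are additions made here, and the seed matters only for reproducing the values, not the counts:

```r
set.seed(1)  # the values are irrelevant here; only the NA positions matter
thisvector <- c(NA, rnorm(4), NA, rnorm(5), NA, rnorm(9), NA, rnorm(2), NA, rnorm(6))

# Run-length encode the TRUE/FALSE pattern of is.na()
runs <- rle(is.na(thisvector))
# Keep the lengths of the FALSE runs, i.e. the stretches of non-NA values
runs$lengths[!runs$values]
# 4 5 9 2 6
```

One caveat: if the vector ended with an NA, no non-NA run would follow it, so the result would have one element fewer than the number of NA's.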
I need some help in determining more than one minimum value in a vector. Let's suppose, I have a vector x:
x<-c(1,10,2, 4, 100, 3)
and would like to determine the indexes of the smallest 3 elements, i.e. 1, 2 and 3. I need the indexes because I will be using them to access the corresponding elements in another vector. Of course, sorting will provide the minimum values, but I want to know the indexes of their actual occurrence prior to sorting.
In order to find the index try this
which(x %in% sort(x)[1:3]) # this gives you an index vector
[1] 1 3 6
This says that the first, third and sixth elements are the first three lowest values in your vector, to see which values these are try:
x[ which(x %in% sort(x)[1:3])] # this gives the vector of values
[1] 1 2 3
or just
x[c(1,3,6)]
[1] 1 2 3
If you have any duplicated values, you may want to select the unique values first and then sort them in order to find the indexes, like this (suggested by #Jeffrey Evans in his answer):
which(x %in% sort(unique(x))[1:3])
I think you mean you want to know what are the indices of the bottom 3 elements? In that case you want order(x)[1:3]
You can use unique to account for duplicate minimum values.
x<-c(1,10,2,4,100,3,1)
which(x %in% sort(unique(x))[1:3])
Here's another way with rank that includes duplicates.
x <- c(x, 3)
# [1] 1 10 2 4 100 3 3
which(rank(x, ties.method='min') <= 3)
# [1] 1 3 6 7
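To connect this back to the question's goal of indexing another vector, here is a minimal sketch using order(). The second vector y is hypothetical and stands in for the "other vector" mentioned in the question.

```r
x <- c(1, 10, 2, 4, 100, 3)
y <- c("a", "b", "c", "d", "e", "f")   # hypothetical companion vector

# order() returns positions sorted by value; keep the first three
idx <- order(x)[1:3]
idx      # 1 3 6
y[idx]   # "a" "c" "f"
```

Note that order() returns the positions in ascending order of value, whereas which(x %in% ...) returns them in order of position; the two agree here only because the three smallest values already appear in increasing order.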