This question already has answers here:
How to retrieve the most repeated value in a column present in a data frame
(9 answers)
Closed 2 years ago.
I was given a sample vector v and was asked to use R code to extract, as a number (meaning: not as a character string), the value that was repeated the most times in v.
(Hints: use table(); note that which.max() gives you the index of a vector's maximum value, such as the maximum count within a table; names() allows you to extract the values of the original vector when applied to the output of table().)
My answer is as follows:
names(which.max(table(v)))
It returns the correct answer, but as a string rather than a number. Am I using the hints correctly? Thanks.
names() returns the values as character strings, so wrap the result in as.integer() or as.numeric() to convert it to a number.
as.integer(names(which.max(table(v))))
Moreover, in case of a tie, which.max() returns only the first maximum. If you want all the tied values, you can use:
v <- c(1, 1, 2, 4, 5, 3, 3)
as.integer(names(which.max(table(v))))
#[1] 1
tab <- table(v)
as.integer(names(tab[max(tab) == tab]))
#[1] 1 3
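The two steps above can be wrapped in a small helper. This is only a sketch: most_frequent is a name made up here, and it uses as.numeric() rather than as.integer() on the assumption that v may contain non-integer values, which as.integer() would truncate.

```r
# Hypothetical helper (not from the original answer): returns every value
# tied for the highest count, as numbers rather than strings.
most_frequent <- function(v) {
  tab <- table(v)                          # counts per distinct value
  as.numeric(names(tab)[tab == max(tab)])  # keep all values with the top count
}

most_frequent(c(1, 1, 2, 4, 5, 3, 3))
#[1] 1 3
most_frequent(c(2.5, 2.5, 1))
#[1] 2.5
```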
This question already has answers here:
How to return 5 topmost values from vector in R?
(4 answers)
Closed last month.
I'm trying to obtain the three largest numbers in a vector with R.
With max() I can get the largest one. Is there a function that lets me choose how many values I want?
For example:
vector <- c(100,215,200,180,300,500,130)
max(vector)
This returns 500.
I want something that returns 500, 300, 215.
I've tried
pmax(vector,3)
You can use the tail function to get the last three elements of the sorted vector. Because sort() orders values ascending by default, the result is 215, 300, 500.
Example:
largest_three <- tail(sort(vector), 3)
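If the values are wanted largest-first, as in the question, one option is to sort in decreasing order and take the head instead. A minimal sketch (top_three is just an illustrative name):

```r
vector <- c(100, 215, 200, 180, 300, 500, 130)

# Sort from largest to smallest, then keep the first three elements.
top_three <- head(sort(vector, decreasing = TRUE), 3)
top_three
#[1] 500 300 215
```

Changing the 3 to another number gives that many of the largest values.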
This question already has answers here:
Find nearest smaller number
(7 answers)
Closed 2 years ago.
Building on R - Fastest way to find nearest value in vector, I am interested in getting the nearest value in a vector prior to a specific value.
The Closest function from the DescTools package does not differentiate by direction.
E.g.:
x=c(1,7:10)
min(DescTools::Closest(x, 6, which = F, na.rm = FALSE))
would return 7, while I want 1. Anyone?
You could try writing a simple function to do this.
closest_preceding <- function(vec, value) max(vec[vec < value])
closest_preceding(x, 6)
#> [1] 1
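One caveat with the one-liner: when no element is smaller than value, max() is called on an empty vector and returns -Inf with a warning. A guarded variant might look like the sketch below; closest_preceding_safe is a hypothetical name, and returning NA in that case is an assumption about the desired behaviour.

```r
# Hypothetical variant: returns NA instead of -Inf when nothing precedes value.
closest_preceding_safe <- function(vec, value) {
  candidates <- vec[!is.na(vec) & vec < value]  # drop NAs, keep strictly smaller values
  if (length(candidates) == 0) return(NA_real_)
  max(candidates)
}

x <- c(1, 7:10)
closest_preceding_safe(x, 6)
#> [1] 1
closest_preceding_safe(x, 1)
#> [1] NA
```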
This question already has answers here:
How to index a vector sequence within a vector sequence
(5 answers)
Closed 5 years ago.
I have got a dataframe and I need to find row numbers where the values of the entries in one column match a certain pattern.
Let col1 = matrix(c(1,0,0,0,0,0,0,0,0,0,2,0,2,0,0,0,0,0,0,0,1), nrow = 21, ncol = 1) be an example of my column, and let r = c(2, 0, 2) be the vector I need to match against it.
I need R to return the row numbers where the pattern in r matches the values in col1 (in this case rows 11, 12 and 13).
I thought I could achieve this with row.match, but that is not the case. I have tried different combinations of match function, but it doesn't yield any results either.
Maybe the way I am approaching this problem is wrong from the beginning, but I have trouble believing that there isn't any function, that would provide me with the expected result given some adjustment.
Thanks.
You could do this using rollapply from zoo. Basically, this runs identical on a rolling window of length(r). The result tells you that the sequence starts at position 11 of col1.
library(zoo)
which(rollapply(col1,length(r),identical,r))
[1] 11
To get a vector of positions, you could do:
which(rollapply(col1,length(r),identical,r))+0:(length(r)-1)
[1] 11 12 13
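Note that adding 0:(length(r)-1) to the which() result only works as intended when the pattern occurs once; with several matches the two vectors are recycled against each other. As a rough alternative without zoo, the sketch below slides the window by hand in base R and expands every match start into its rows. The names x, n, starts and rows are introduced here for illustration.

```r
col1 <- matrix(c(1,0,0,0,0,0,0,0,0,0,2,0,2,0,0,0,0,0,0,0,1), nrow = 21, ncol = 1)
r <- c(2, 0, 2)

x <- as.vector(col1)  # treat the single column as a plain vector
n <- length(r)

# Test every window of length n for an exact element-wise match with r.
starts <- which(vapply(seq_len(max(0, length(x) - n + 1)),
                       function(i) all(x[i:(i + n - 1)] == r),
                       logical(1)))

# Expand each starting position into the n rows it covers.
rows <- sort(unique(as.vector(outer(starts, seq_len(n) - 1, `+`))))
rows
#[1] 11 12 13
```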
This question already has answers here:
Sum rows in data.frame or matrix
(7 answers)
Closed 7 years ago.
I need to sum the columns of a table whose names start with a particular string.
An example table might be:
tbl<-data.frame(num1=c(3,2,9), num2=c(3,2,9),n3=c(3,2,9),char1=c('a', 'b', 'c'))
I get the list of columns (in this example there are only 2, but the real case has more than 20).
a<-colnames(tbl)[grep('num', colnames(tbl))]
I tried with
sum(tbl[,a])
But I get only one number with the total sum of the elements in both vectors.
What I need is the result of:
tbl$num1+ tbl$num2
We can either use Reduce
Reduce(`+`, tbl[a])
Or rowSums, which also has the option of removing NA elements with na.rm=TRUE.
rowSums(tbl[a])
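Since grep('num', ...) matches 'num' anywhere in a column name, a stricter selection may be needed when only names that begin with the string should count. The sketch below uses startsWith() for that and passes na.rm = TRUE to rowSums; num_cols and row_totals are illustrative names.

```r
tbl <- data.frame(num1 = c(3, 2, 9), num2 = c(3, 2, 9),
                  n3 = c(3, 2, 9), char1 = c("a", "b", "c"))

# Keep only the columns whose names begin with "num".
num_cols <- names(tbl)[startsWith(names(tbl), "num")]

# Row-wise sum over those columns, ignoring any NA values.
row_totals <- rowSums(tbl[num_cols], na.rm = TRUE)
unname(row_totals)
#[1]  6  4 18
```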
This question already has answers here:
Count of unique elements of each row in a data frame in R
(3 answers)
Closed 5 years ago.
I want to count the number of unique values per row.
For instance with this data frame:
example <- data.frame(var1 = c(2,3,3,2,4,5),
var2 = c(2,3,5,4,2,5),
var3 = c(3,3,4,3,4,5))
I want to add a column which counts the number of unique values per row; e.g. 2 for the first row (as there are 2's and 3's in the first row) and 1 for the second row (as there are only 3's in the second row).
Does anyone know a simple way to do this? So far I have only found code for counting the number of unique values per column.
This apply function returns a vector of the number of unique values in each row:
apply(example, 1, function(x)length(unique(x)))
You can append it to your data.frame in one of the following two ways (here naming the new column count):
example <- cbind(example, count = apply(example, 1, function(x)length(unique(x))))
or
example$count <- apply(example, 1, function(x)length(unique(x)))
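One thing to watch when running these lines more than once: after count has been added, apply(example, 1, ...) also sees the count column, which can change the result. A minimal sketch that restricts the calculation to the original value columns (value_cols is a name introduced here for illustration):

```r
example <- data.frame(var1 = c(2, 3, 3, 2, 4, 5),
                      var2 = c(2, 3, 5, 4, 2, 5),
                      var3 = c(3, 3, 4, 3, 4, 5))

# Name the value columns explicitly so a previously added count column is ignored.
value_cols <- c("var1", "var2", "var3")
example$count <- apply(example[value_cols], 1, function(x) length(unique(x)))

# Running the same line again gives the same answer.
example$count <- apply(example[value_cols], 1, function(x) length(unique(x)))
example$count
#[1] 2 1 3 3 2 1
```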
We can also use a vectorized approach with regex. After pasting together the elements of each row of the dataset (do.call(paste0, ...)), we match any character and capture it as a group ((.)), use a positive lookahead so that it matches only if the same character appears again later in the string (\\1 is the backreference to the captured group), and replace it with a blank (""). In effect, only one copy of each distinct character remains, and nchar then counts the characters left in the string. Note that this counts characters rather than values, so it assumes every entry is a single character, as in this example.
example$count <- nchar(gsub("(.)(?=.*?\\1)", "", do.call(paste0, example), perl = TRUE))
example$count
#[1] 2 1 3 3 2 1