I Want Frequency Value with the Maximum Sample Using R [duplicate] - r

This question already has answers here:
Find value corresponding to maximum in other column [duplicate]
(2 answers)
Closed 1 year ago.
I want the frequency with the maximum samples in df
df <- data.frame(Freq = c(1,2,3,4,5,6,7,8,9,10), Valu = c(10,5,11,7,13,15,9,6,12,12))
apply(df, 2, which.max)
.
What I want
I want it to print just the frequency of the maximum Valu which is 6

We could use which.max on the column 'Sample', get the index and extract ([), the corresponding 'Freq' value
with(df, Freq[which.max(Valu)])
#[1] 6
If the column names are changing, then use position index
df[[1]][which.max(df[[2]])]
[1] 6
Or may use order as well
df[[1]][order(-df[[2]])][1]
[1] 6
If we loop over the columns (*apply) with MARGIN = 2 and apply the function which.max, it returns the index of max for those columns separately

Related

Count the number of observations in the data frame in R [duplicate]

This question already has answers here:
Count number of unique levels of a variable
(7 answers)
Count number of distinct values in a vector
(6 answers)
Closed 2 months ago.
I want to know the way of counting the number of observations using R.
For example, let's say I have a data df as follows:
df <- data.frame(id = c(1,1,1,2,2,2,2,3,3,5,5,5,9,9))
Even though the biggest number of id is 9, there are only 5 numbers: 1,2,3,5,and 9. So there are only 5 numbers in id. I want to count how many numbers exist in id like this.
In base R:
length(unique(df$id))
[1] 5
Here, unique filters only distinct values and length then counts the number of values in the vector
In dplyr:
df %>%
summarise(n = length(unique(id)))
Alternatively:
nrow(distinct(df))
Here, distinct subsets the whole dataframe (not just the column id!) to unique rows before nrow counts the number of remaining rows
Here another two options:
df <- data.frame(id = c(1,1,1,2,2,2,2,3,3,5,5,5,9,9))
sum(!duplicated(df$id))
#> [1] 5
library(dplyr)
n_distinct(df$id)
#> [1] 5
Created on 2022-07-09 by the reprex package (v2.0.1)

Finding the maximum value for each row and extract column names [duplicate]

This question already has answers here:
R Create column which holds column name of maximum value for each row
(4 answers)
Closed 1 year ago.
Say we have the following matrix,
x <- matrix(1:9, nrow = 3, dimnames = list(c("X","Y","Z"), c("A","B","C")))
What I'm trying to do is:
1- Find the maximum value of each row. For this part, I'm doing the following,
df <- apply(X=x, MARGIN=1, FUN=max)
2- Then, I want to extract the column names of the maximum values and put them next to the values. Following the reproducible example, it would be "C" for the three rows.
Any assistance would be wonderful.
You can use apply like
maxColumnNames <- apply(x,1,function(row) colnames(x)[which.max(row)])
Since you have a numeric matrix, you can't add the names as an extra column (it would become converted to a character-matrix).
You can choose a data.frame and do
resDf <- cbind(data.frame(x),data.frame(maxColumnNames = maxColumnNames))
resulting in
resDf
A B C maxColumnNames
X 1 4 7 C
Y 2 5 8 C
Z 3 6 9 C

How to add cells based off of a specific integer? [duplicate]

This question already has answers here:
Sum elements of a vector beween zeros in R
(3 answers)
Closed 2 years ago.
I want to add values from a column. They go in sequence:
0,225,2352,34234,23442,23456,0,123,...
I want to add the values from 0 until the following 0 but not including the second.
For example, i want an output of
(0+225+2352+34234+23442+23456),(0+123+,...,),...
I want to store them as a new column of totals
One simple solution in base R is
sapply(split(x, cumsum(x == 0)), sum)
With split you basically create groups of elements that you want to sum together using sapply. The final result will be a named numeric vector.
Sample data
x <- c(0,225,2352,34234,23442,23456,0,123,2,0,1,42)
sapply(split(x, cumsum(x == 0)), sum)
# 1 2 3
# 83709 125 43

Finding the most repeated value using table() function [duplicate]

This question already has answers here:
How to retrieve the most repeated value in a column present in a data frame
(9 answers)
Closed 2 years ago.
I was given a sample vector v and was asked to use R code to extract, as a number (meaning: not as a character string), the value that was repeated the most times in v.
(Hints: use table(); note that which.max() gives you index of a vector's maximum value, like the maximum value within a table; names() allows you the extract the values of the original vector, when applied to the output of table().)
My answer is as follows:
names(which.max(table(v)))
it returns the correct answer as a string, not as a number. Am i using the hint correctly? Thanks.
names return the number as character, perhaps add as.integer/as.numeric to convert it to number.
as.integer(names(which.max(table(v))))
Moreover, in case of tie which.max would return only the first maximum. If you want all the values which are tied you can use :
v <- c(1, 1, 2, 4, 5, 3, 3)
as.integer(names(which.max(table(v))))
#[1] 1
tab <- table(v)
as.integer(names(tab[max(tab) == tab]))
#[1] 1 3

Count number of unique values per row [duplicate]

This question already has answers here:
Count of unique elements of each row in a data frame in R
(3 answers)
Closed 5 years ago.
I want to count the number of unique values per row.
For instance with this data frame:
example <- data.frame(var1 = c(2,3,3,2,4,5),
var2 = c(2,3,5,4,2,5),
var3 = c(3,3,4,3,4,5))
I want to add a column which counts the number of unique values per row; e.g. 2 for the first row (as there are 2's and 3's in the first row) and 1 for the second row (as there are only 3's in the second row).
Does anyone know an easy code to do this? Up until now I only found code for counting the number of unique values per column.
This apply function returns a vector of the number of unique values in each row:
apply(example, 1, function(x)length(unique(x)))
You can append it to your data.frame using on of the following two ways (and if you want to name that column as count):
example <- cbind(example, count = apply(example, 1, function(x)length(unique(x))))
or
example$count <- apply(example, 1, function(x)length(unique(x)))
We can also use a vectorized approach with regex. After pasteing the elements of each row of the dataset (do.call(paste0, ...), match a pattern of any character, capture as a group ((.)), using the positive lookahead, match characters only if it appears again later in the string (\\1 - backreference for the captured group and replace it with blank (""). So, in effect only those characters remain that will be unique. Then, with nchar we count the number of characters in the string.
example$count <- nchar(gsub("(.)(?=.*?\\1)", "", do.call(paste0, example), perl = TRUE))
example$count
#[1] 2 1 3 3 2 1

Resources