how to get all rows with max value of a variable [duplicate]

This question already has answers here:
Extracting indices for data frame rows that have MAX value for named field
(3 answers)
Closed 4 years ago.
I have a matrix with two columns and many rows. The first column is named idCombinaison and the second is named accuarcy. The accuarcy column holds floating-point values.
Now I want to get all rows where accuarcy equals the maximum value. In some cases (as depicted in the picture), several rows have accuarcy equal to the maximum, and I want to get all of them.
I tried this:
maxAccuracy <- subset(accuarcyMatrix, accuarcyMatrix['accuarcy'] == max(accuarcyMatrix['accuarcy']))
But this returns an empty result. Any ideas?

Here is some reproducible data simulating your matrix:
set.seed(123)
x <- matrix(sample(1:9, 30, T), 10, 3)
row.names(x) <- 1:10
colnames(x) <- LETTERS[1:3]
# A B C
# 1 3 9 9
# 2 8 5 7
# 3 4 7 6
# ...
With matrix objects, you need two-index subsetting, such as data[a, b], to extract elements. With the data above, x["C"] returns NA, whereas x[, "C"] returns all elements in column C. The following two calls therefore produce different output.
subset(x, x["C"] == max(x["C"]))
# A B C (Empty)
subset(x, x[, "C"] == max(x[, "C"]))
# A B C
# 1 3 9 9
# 4 8 6 9
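As a minimal sketch of the same idea applied to the two-column layout from the question (the values are made up for illustration, and m is a hypothetical stand-in for accuarcyMatrix), two-index subsetting with drop = FALSE keeps every tied row and preserves the matrix shape even when only one row matches:

```r
# Hypothetical data: a numeric matrix with the question's column names
m <- cbind(idCombinaison = 1:5,
           accuarcy = c(0.7, 0.9, 0.8, 0.9, 0.6))

# Single-index subsetting by name finds nothing:
# a matrix has dimnames, but no element names
single <- m["accuarcy"]

# Two-index subsetting selects the column; drop = FALSE keeps a matrix result
is_max <- m[, "accuarcy"] == max(m[, "accuarcy"])
best <- m[is_max, , drop = FALSE]
print(best)
```

Here best holds the two rows that share the maximum accuracy, while single is NA.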

Maybe something like this?
library(dplyr)
# dplyr verbs need a data frame; convert a matrix first with as.data.frame()
accuarcyMatrix %>%
  filter_at(vars(accuarcy), any_vars(. == max(.)))

Base R solution (although this is very likely a duplicate):
# $ works on a data frame; for a matrix use accuarcyMatrix[, "accuarcy"] instead
accuarcyMatrix[which(accuarcyMatrix$accuarcy == max(accuarcyMatrix$accuarcy)), ]
I'm guessing you will want to change "accuarcy" to "accuracy".
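One reason to wrap the comparison in which() can be sketched as follows, assuming the accuracy column may contain missing values (the data frame acc and its values are hypothetical). A logical index containing NA produces an all-NA row, whereas which() keeps only the positions that are TRUE:

```r
# Hypothetical data frame with one missing accuracy value
acc <- data.frame(idCombinaison = 1:4,
                  accuarcy = c(0.9, NA, 0.9, 0.5))

# na.rm = TRUE is needed here, otherwise max() itself returns NA
top <- max(acc$accuarcy, na.rm = TRUE)

# Logical indexing: the NA comparison yields an extra row filled with NA
with_logical <- acc[acc$accuarcy == top, ]

# which() drops the NA and keeps only rows that really equal the maximum
with_which <- acc[which(acc$accuarcy == top), ]
print(with_which)
```

If the column has no missing values, both forms select the same rows.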

Related

Defining what species I want to include in a scatterplot [duplicate]

This question already has answers here:
Filter data.frame rows by a logical condition
(9 answers)
Closed 6 years ago.
I have a data frame df with an ID column (e.g. A, B, etc.). I also have a vector containing certain IDs:
L <- c("A", "B", "E")
How can I filter the data frame to get only the IDs present in the vector? Individually, I would use
subset(df, ID == "A")
but how do I filter on a whole vector?

You can use the %in% operator:
> df <- data.frame(id=c(LETTERS, LETTERS), x=1:52)
> L <- c("A","B","E")
> subset(df, id %in% L)
id x
1 A 1
2 B 2
5 E 5
27 A 27
28 B 28
31 E 31
If your IDs are unique, you can use match():
> df <- data.frame(id=c(LETTERS), x=1:26)
> df[match(L, df$id), ]
id x
1 A 1
2 B 2
5 E 5
or make them the rownames of your dataframe and extract by row:
> rownames(df) <- df$id
> df[L, ]
id x
A A 1
B B 2
E E 5
Finally, for more advanced users, and if speed is a concern, I'd recommend looking into the data.table package.

I reckon you need to use 'match'. It matches the values in one vector to the values in another vector, and gives NA where there's no match. So then you subset based on !is.na of the match.
See ?match and you can probably work it out for yourself, in which case you'll learn more than from the exact answer someone will do shortly which will just encourage you to cut n paste :)
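For readers who do want to see it spelled out, the match() idea described here can be sketched like this, reusing the example data from the %in% answer (the names pos and kept are only illustrative):

```r
# Same example data as in the %in% answer
df <- data.frame(id = c(LETTERS, LETTERS), x = 1:52)
L <- c("A", "B", "E")

# match() gives the position of each id within L, or NA when the id is not in L
pos <- match(df$id, L)

# Keep only the rows whose id was found in L
kept <- df[!is.na(pos), ]
print(kept)

# %in% performs the same membership test in one step
same_as_in <- identical(!is.na(pos), df$id %in% L)
```

Note the argument order: match(df$id, L) tests every row, so duplicated IDs are all kept, whereas match(L, df$id) returns only the first row for each ID.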

Merge two columns into one, delete colnames

I have a table like:
a
n_msi2010 n_msi2011
1 -0.122876 1.818750
2 1.328930 0.931426
3 -0.111653 4.400060
4 1.222900 4.500450
5 3.604160 6.110930
I would like to merge these two columns into one column to obtain (I don't want to keep column names):
a
n_msi2010
1 -0.122876
2 1.328930
3 -0.111653
4 1.222900
5 3.604160
6 1.818750
7 0.931426
8 4.400060
9 4.500450
10 6.110930
When I am using prefabricated data like
x <- cbind(c(1, 2, 3), c(4, 5, 6))
colnames(x)<-c("a","b")
c(t(x))
# 1 4 2 5 3 6
c((x))
# 1 2 3 4 5 6
the column merging works fine. Only with the "a" example does it not work: it creates 2 separate vectors instead. I don't really understand why. Any help? Thanks

It seems like your question is about column versus row order vector creation from a data.frame.
Using t() on a data.frame converts the data.frame to a matrix, and using c() on the matrix removes its dimensions.
With that knowledge, you can try:
# create a vector of values, column by column
c(as.matrix(a)) # you are missing the `as.matrix` in your current approach
# create a vector of values, row by row
c(t(a)) # you already know this works
Other approaches to get the "column by column" result would be:
unlist(a, use.names = FALSE)
stack(a)[, "values"] # add `drop = FALSE` if you want to retain a data.frame
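To make the difference concrete, here is a small sketch that rebuilds the table from the question, assuming a is a data frame (the question does not say so explicitly). It shows why c() alone gives two separate vectors and how the column-by-column result can be put back into a single unnamed-style column:

```r
# The table from the question, rebuilt as a data frame (an assumption)
a <- data.frame(
  n_msi2010 = c(-0.122876, 1.328930, -0.111653, 1.222900, 3.604160),
  n_msi2011 = c(1.818750, 0.931426, 4.400060, 4.500450, 6.110930)
)

# c() on a data frame returns a list with one element per column,
# which is why two separate vectors appear
as_list <- c(a)

# Column by column: all of n_msi2010 first, then all of n_msi2011
by_column <- unlist(a, use.names = FALSE)

# Row by row: values alternate between the two columns
by_row <- c(t(a))

# Back to a one-column data frame with 10 rows
merged <- data.frame(value = by_column)
print(merged)
```

The column name value is arbitrary; a data frame column always needs some name, so dropping it entirely means keeping the plain vector by_column instead.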

Not an elegant way, but it can combine two or more columns into one.
n_msi2010 <- 1:5
n_msi2011 <- 6:10
a <- data.frame(n_msi2010, n_msi2011)
# start with an empty vector and append each column in turn
vector <- vector()
for (i in seq_len(ncol(a))) {
  vector <- append(vector, as.vector(a[, i]))
}
vector
You can then convert the result with as.matrix(vector) or data.frame(vector).
