Trying to find a specific element based on a condition [duplicate] - r

This question already has answers here:
Find value corresponding to maximum in other column [duplicate]
(2 answers)
Closed 2 years ago.
This is my data frame in RStudio. I'm trying to find code that will produce the name of the student with the highest age.
students.df #Name of dataframe
name DAD BDA gender nationality age
1 Amy 80 70 F IRL 20
2 Bill 65 50 M UK 21
3 Carl 50 80 M IRL 22

as.character(subset(students.df,students.df$age==max(students.df$age))$name)
library(dplyr)
students.df %>% filter(age==max(age)) %>% select(name)

You can try this, which returns the whole row for the oldest student:
students.df[which.max(students.df$age),]
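For reference, here is a minimal self-contained sketch that rebuilds the example data frame from the question and compares the two approaches. The object names oldest and oldest_all are made up for illustration. One difference worth knowing: which.max returns only the first maximum, while comparing against max(...) keeps every tied row.

```r
# Rebuild the example data frame from the question
students.df <- data.frame(
  name = c("Amy", "Bill", "Carl"),
  DAD = c(80, 65, 50),
  BDA = c(70, 50, 80),
  gender = c("F", "M", "M"),
  nationality = c("IRL", "UK", "IRL"),
  age = c(20, 21, 22),
  stringsAsFactors = FALSE
)

# which.max() gives the index of the first maximum only
oldest <- students.df$name[which.max(students.df$age)]

# Comparing against max() keeps every row that ties for the maximum
oldest_all <- students.df$name[students.df$age == max(students.df$age)]

print(oldest)      # [1] "Carl"
print(oldest_all)  # [1] "Carl"
```

With this data both give the same answer, since only one student has the maximum age.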

Related

Should I use for loop? OR apply? [duplicate]

This question already has answers here:
Split dataframe by levels of a factor and name dataframes by those levels
(3 answers)
Closed 5 years ago.
This is my first post.
I have a data frame of the NHL draft.
What I would like to do is use some sort of recursive function to create 10 objects.
So, I want to create these 10 objects by subsetting the NHL data frame by Year.
Here are the first 6 rows of the data set (nhl_draft):
  Year Overall                  Team           Player    PS
1 2000       1    New York Islanders    Rick DiPietro  49.3
2 2000       2     Atlanta Thrashers     Dany Heatley  95.2
3 2000       3        Minnesota Wild   Marian Gaborik 103.6
4 2000       4 Columbus Blue Jackets Rostislav Klesla  34.5
5 2000       5    New York Islanders     Raffi Torres  28.4
6 2000       6   Nashville Predators   Scott Hartnell  74.5
I want to create 10 objects by subsetting out the Years, 2000 ~ 2009.
I tried,
for (i in 2000:2009) {
nhl_draft.i <- subset(nhl_draft, Year == "i")
}
But this doesn't do anything. What's the problem with this for loop? Can you suggest any other ways?
Please tell me if this is confusing; after all, this is my first post.
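The loop fails for two reasons. First, Year == "i" compares Year with the literal string "i" rather than with the loop variable i, so every subset is empty. Second, nhl_draft.i is one fixed name, so the same object is overwritten on every iteration. The following code fixes both by storing each subset in a list.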
# Create an empty list
nhl_list <- list()
for (i in 2000:2009) {
# Subset the data frame based on Year
nhl_draft_temp <- subset(nhl_draft, Year == i)
# Assign the subset to the list
nhl_list[[as.character(i)]] <- nhl_draft_temp
}
But you can consider split, which is more concise.
nhl_list <- split(nhl_draft, f = nhl_draft$Year)
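As a rough sketch of how the split result can be used, the block below rebuilds only the six rows shown in the question (all from 2000, so the list has a single element here) and then pulls out one year and applies a function to every year. The names draft_2000 and mean_ps are made up for illustration.

```r
# Rebuild the six rows shown in the question (all from the year 2000)
nhl_draft <- data.frame(
  Year = rep(2000, 6),
  Overall = 1:6,
  Team = c("New York Islanders", "Atlanta Thrashers", "Minnesota Wild",
           "Columbus Blue Jackets", "New York Islanders", "Nashville Predators"),
  Player = c("Rick DiPietro", "Dany Heatley", "Marian Gaborik",
             "Rostislav Klesla", "Raffi Torres", "Scott Hartnell"),
  PS = c(49.3, 95.2, 103.6, 34.5, 28.4, 74.5),
  stringsAsFactors = FALSE
)

# One list element per distinct Year, named by the year
nhl_list <- split(nhl_draft, f = nhl_draft$Year)

# Access a single year by its name (a character string)
draft_2000 <- nhl_list[["2000"]]

# Apply a function to every year's subset instead of writing a loop
mean_ps <- sapply(nhl_list, function(d) mean(d$PS))
```

Keeping the subsets in one named list, rather than in 10 separate objects, is what makes the sapply call possible.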

Average for column value across multiple datasets in R [duplicate]

This question already has answers here:
calculate average over multiple data frames
(5 answers)
Closed 6 years ago.
I am new to R and need help with this. I have 3 data sets from 3 different years. They have the same columns with different values for each year. I want to find the average of the column values across the three years, based on the Name field. To be specific:
First data set
Name Age Height Weight
A 4 20 20
B 5 22 22
C 8 25 21
D 10 25 23
Second data set
Name Age Height Weight
A 5 22 25
B 6 23 26
Third data set
Name Age Height Weight
A 6 24 24
B 7 24 27
C 10 27 28
I want to find the average height for "A" across the three data sets
We can place them in a list, rbind them, group by 'Name', and get the mean of each column:
library(data.table)
rbindlist(list(df1, df2, df3))[, lapply(.SD, mean), by = Name]
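Or with dplyr (version 1.0.0 or later, where across() replaces the deprecated summarise_each() and funs()):
library(dplyr)
bind_rows(df1, df2, df3) %>%
group_by(Name) %>%
summarise(across(everything(), mean))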
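The same idea can be sketched in base R, with no extra packages, using the three data sets from the question. This swaps in aggregate() as an alternative to the answers above; the names combined, avg and height_A are made up for illustration.

```r
# The three data sets from the question
df1 <- data.frame(Name = c("A", "B", "C", "D"), Age = c(4, 5, 8, 10),
                  Height = c(20, 22, 25, 25), Weight = c(20, 22, 21, 23))
df2 <- data.frame(Name = c("A", "B"), Age = c(5, 6),
                  Height = c(22, 23), Weight = c(25, 26))
df3 <- data.frame(Name = c("A", "B", "C"), Age = c(6, 7, 10),
                  Height = c(24, 24, 27), Weight = c(24, 27, 28))

# Stack the rows, then average every other column within each Name
combined <- rbind(df1, df2, df3)
avg <- aggregate(. ~ Name, data = combined, FUN = mean)

# Average height for "A" across the three data sets: (20 + 22 + 24) / 3
height_A <- avg$Height[avg$Name == "A"]
print(height_A)  # [1] 22
```

Names that are missing from a data set (such as "D" in the second and third) are averaged over the rows that do exist.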

Sort data.table unique values (r language) [duplicate]

This question already has answers here:
unique / sort in data.frame
(3 answers)
Closed 6 years ago.
I have this table (input):
user_id is_Leaver age
1 Helen yes 25
2 Helen yes 25
3 Helen yes 25
4 Rob no 31
5 Rob no 31
I need a table with only the unique rows (output):
user_id is_Leaver age
1 Helen yes 25
2 Rob no 31
Thanks!
Assuming your data frame is named df, do this:
df <- unique(df)
It should do the trick.
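A small sketch with the input table from the question shows one detail: unique() keeps the original row names (1 and 4 here), so getting the 1, 2 numbering from the desired output takes one more step. The name df_unique is made up for illustration.

```r
# Input table from the question
df <- data.frame(
  user_id = c("Helen", "Helen", "Helen", "Rob", "Rob"),
  is_Leaver = c("yes", "yes", "yes", "no", "no"),
  age = c(25, 25, 25, 31, 31),
  stringsAsFactors = FALSE
)

# Drop duplicated rows; the first occurrence of each row is kept
df_unique <- unique(df)

# Reset the row names so they run 1, 2, ... as in the desired output
rownames(df_unique) <- NULL

print(df_unique)
```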

How to create a step-by-step cumulation of data? [duplicate]

This question already has answers here:
Calculating cumulative sum for each row
(6 answers)
Closed 7 years ago.
Probably my question is really dull, but I couldn't find an easy solution for it. We have a data.frame without the (overall) column. The overall column must hold the cumulative number of pies (in my case) eaten up to a given time period. What is the easiest way to create it in R for any number of rows? Thanks!
Year Pies eaten Pies eaten(overall)
1 1960 3 3
2 1961 2 5
3 1962 5 10
4 1963 1 11
5 1964 7 18
6 1965 4 22
We can use cumsum (assuming the data frame is df1 and the column is named Pies_eaten):
df1$Pies_eaten_Overall <- cumsum(df1$Pies_eaten)
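To check this against the table in the question, here is a short sketch that rebuilds the Year and Pies eaten columns and computes the running total. The column names Pies_eaten and Pies_eaten_Overall follow the answer and are assumptions about how the real data frame is named.

```r
# Year and "Pies eaten" columns from the question
df1 <- data.frame(
  Year = 1960:1965,
  Pies_eaten = c(3, 2, 5, 1, 7, 4)
)

# cumsum() returns the running total, one value per row
df1$Pies_eaten_Overall <- cumsum(df1$Pies_eaten)

print(df1$Pies_eaten_Overall)  # [1]  3  5 10 11 18 22
```

The result matches the (overall) column in the question, and the call works unchanged for any number of rows.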

Sub-setting with dplyr [duplicate]

This question already has answers here:
Select the row with the maximum value in each group
(19 answers)
Closed 2 years ago.
Dataframe has the columns:
State Sex Year Name Number Percent
For each year and each state, I need to keep the one male and the one female name with the highest percentage.
Example:
Washington M 2011 John 34 0.46
Washington F 2011 Mary 42 0.67
Washington M 2012 John 46 0.46
Washington F 2012 Mary 64 0.67
and so on for every State and year.
You can try
library(dplyr)
df %>%
group_by(State, Year, Sex) %>%
slice(which.max(Percent))
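For comparison, the same selection can be sketched in base R by sorting on Percent and keeping the first row of each group. This is an alternative technique, not the answer's method. The data below uses the four rows from the question plus two hypothetical rows (Mike and Anna, with invented numbers) so that each 2011 group has more than one candidate; the names ordered and top are also made up.

```r
# Four rows from the question plus two HYPOTHETICAL rows (Mike, Anna)
df <- data.frame(
  State = rep("Washington", 6),
  Sex = c("M", "F", "M", "F", "M", "F"),
  Year = c(2011, 2011, 2012, 2012, 2011, 2011),
  Name = c("John", "Mary", "John", "Mary", "Mike", "Anna"),
  Number = c(34, 42, 46, 64, 10, 12),
  Percent = c(0.46, 0.67, 0.46, 0.67, 0.10, 0.12),
  stringsAsFactors = FALSE
)

# Sort so that the highest Percent comes first
ordered <- df[order(-df$Percent), ]

# Keep only the first row seen for each State/Year/Sex combination
top <- ordered[!duplicated(ordered[c("State", "Year", "Sex")]), ]

# Tidy up: sort by Year and Sex, and reset the row names
top <- top[order(top$Year, top$Sex), ]
rownames(top) <- NULL

print(top)
```

The hypothetical Mike and Anna rows are dropped because John and Mary have higher percentages in the same groups.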
