R programming built-in Titanic dataset [duplicate] - r

This question already has an answer here:
How can I count the number of instances a value occurs within a subgroup in R?
(1 answer)
Closed 1 year ago.
I am new to R programming. I have to work with the built-in Titanic data in R. I want to find out how many children and adults there are in the dataset. Can someone give me a hint on how to find this?
I tried using the length() function, but it did not give the result.

Here's a solution in tidyverse syntax. Titanic is a contingency table, so converting it into a tibble (a type of dataframe) gives one row per combination of Class, Sex, Age and Survived, with the number of passengers in the column n. Grouping by the Age column and summing n gives the number of children and adults. (Using n() here would only count the rows of the table, 16 per level of Age, not the passengers.)
library(tidyverse)
Titanic %>%
  as_tibble() %>%
  group_by(Age) %>%
  summarise(N = sum(n))
This gives the output:
# A tibble: 2 x 2
  Age       N
  <chr> <dbl>
1 Adult  2092
2 Child   109
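Because Titanic is stored as a 4-dimensional table (Class, Sex, Age, Survived) rather than one row per passenger, the totals per age group can also be read off in base R without loading any package. The following is a minimal sketch; the variable names age_totals and age_totals_apply are only illustrative.

```r
# Titanic is a contingency table: each cell holds a passenger count,
# so the answer is a sum of cells, not a count of rows.
# Age is the third dimension of the table.
age_totals <- margin.table(Titanic, margin = 3)
print(age_totals)   # Child 109, Adult 2092

# The same totals with apply(), summing all cells for each level of Age
age_totals_apply <- apply(Titanic, 3, sum)
print(age_totals_apply)
```

Both calls collapse the Class, Sex and Survived dimensions and keep only Age.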

Related

Need help to find total of specific name in column! (R-Coding) [duplicate]

This question already has answers here:
How to sum a variable by group
(18 answers)
Closed 4 months ago.
I'm trying to find the total amount of each class in the Genus column of the following dataframe:
Genus/Species Dataframe
I want a dataframe that looks like this:
Genus - Arachnida
Total Sum - (Total amount of Arachnida)
Thank you to whoever replies! (This is my first post and English is my second language, so hopefully someone understands!)
I've tried using dplyr's count function:
BIO205 %>% count(Genus)
But it returns how many times each Genus is repeated, not its total number of species observed.
For example, BIO205 %>% count(Genus) returns:
Returning Dataframe
This indicates that the word Arachnida is repeated 21 times.
It seems like you are trying to sum the Total values for each genus. If so, group_by() followed by summarise() is what you want. Here is a reprex:
library(dplyr)
data("iris")
iris %>%
group_by(Species) %>%
summarise(sum_col = sum(Sepal.Length))
The code above groups the data by species, then sums the sepal lengths for each species. In your case, I would try the following:
library(dplyr)
BIO205 %>%
group_by(Genus) %>%
summarise(sum_col = sum(Total))
Hope this helps.
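To make the difference between counting rows and summing a column concrete, here is a small sketch. The data frame bio is a hypothetical stand-in for BIO205 (its values are invented for illustration); it assumes a Total column holding the number of individuals per record. The wt argument of count() is shown as an alternative to group_by() plus summarise().

```r
library(dplyr)

# Hypothetical stand-in for BIO205: one row per record,
# Total = number of individuals observed in that record.
bio <- tibble(
  Genus = c("Arachnida", "Arachnida", "Insecta", "Insecta", "Insecta"),
  Total = c(3, 5, 2, 4, 1)
)

# count() without weights: how many rows each Genus has
rows_per_genus <- bio %>% count(Genus)

# count() with wt = Total: the sum of Total for each Genus
total_per_genus <- bio %>% count(Genus, wt = Total, name = "sum_col")

# Equivalent group_by() + summarise() form
total_per_genus2 <- bio %>%
  group_by(Genus) %>%
  summarise(sum_col = sum(Total))

print(rows_per_genus)
print(total_per_genus)
```

Here Arachnida appears in 2 rows but has a total of 8, which is the distinction the question is about.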

Grouping characters in R using as.factor [duplicate]

This question already has answers here:
Extract the maximum value within each group in a dataframe [duplicate]
(3 answers)
Closed 2 years ago.
I'm trying to find the maximum number of flights delayed from certain origins using library(nycflights13), and I can't figure out how to group by a "chr" column.
library(nycflights13)
library(dplyr)
flights2 <- mutate(flights,factori = as.factor(flights$origin))
flights2 %>%
filter(dep_delay > 2) %>%
select(dep_delay, factori) %>%
group_by(factori)
Sample of output:
How can I get them grouped together? How can I find the max count?
group_by() doesn't change anything in the structure of the data. The number of rows and columns remains the same after group_by(). It is what you do after group_by() that decides the output.
To get the maximum dep_delay for each factori you can do:
library(nycflights13)
library(dplyr)
flights2 %>%
filter(dep_delay > 2) %>%
select(dep_delay, factori) %>%
group_by(factori) %>%
summarise(max = max(dep_delay, na.rm = TRUE))
# factori max
#* <fct> <dbl>
#1 EWR 1126
#2 JFK 1301
#3 LGA 911
summarise() usually gives only one row per group, whereas mutate() keeps the same number of rows as the original data.
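The contrast between summarise() and mutate() after group_by() can be seen on a small table. This is a sketch only: delays is a hypothetical toy data frame with invented values, not the real flights data.

```r
library(dplyr)

# Hypothetical toy data standing in for the filtered flights table
delays <- tibble(
  origin    = c("EWR", "EWR", "JFK", "JFK", "LGA"),
  dep_delay = c(10, 45, 5, 120, 30)
)

# summarise(): collapses each group to a single row
per_group <- delays %>%
  group_by(origin) %>%
  summarise(max_delay = max(dep_delay, na.rm = TRUE))

# mutate(): keeps all five rows and repeats the group maximum on each
per_row <- delays %>%
  group_by(origin) %>%
  mutate(max_delay = max(dep_delay, na.rm = TRUE)) %>%
  ungroup()

print(per_group)  # 3 rows, one per origin
print(per_row)    # 5 rows, same as the input
```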

How to manipulate a data? [duplicate]

This question already has answers here:
Divide the values in one column by those in another column in R
(2 answers)
Closed 2 years ago.
I'm confused by an exercise that I'm working on in R. I'm just a beginner in R.
The instructions are: Use dplyr to manipulate the data so that you have the proportion immunised (i.e. Immunised divided by Eligible) for each DHB, for each Age, and each Date. Save the result so you can use it for the remaining questions. You should end up with a data frame containing variables for DHB, Date, Age and Proportion with 4834 observations.
I don't understand how to do this, but this is what I've tried:
```{r}
vacc %>% mutate(Proportion = Immunised/DHB(vacc))
vacc %>% select(DHB, Date, Age, Proportion)
```
but it gave me this error
Error: Problem with `mutate()` input `Proportion`.
x could not find function "DHB"
i Input `Proportion` is `Immunised/DHB(vacc)`.
Can someone please help me?
DHB is a column in the dataframe, but you are calling it as a function.
You can group_by DHB, Age and Date and calculate the ratio between Immunised and Eligible.
library(dplyr)
vacc %>% group_by(DHB, Age, Date) %>% mutate(Proportion = Immunised/Eligible)
Since the division is done row by row, I think this would work too, without grouping:
vacc %>% mutate(Proportion = Immunised/Eligible)
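The exercise also asks for the result to be saved with only the DHB, Date, Age and Proportion variables. A minimal sketch of the full pipeline follows; vacc_toy is a hypothetical miniature of the vacc data with invented values, and vacc_prop is an illustrative name for the saved result.

```r
library(dplyr)

# Hypothetical miniature of the vacc data (values invented for illustration)
vacc_toy <- tibble(
  DHB       = c("Auckland", "Auckland", "Waikato"),
  Date      = c("2019-03", "2019-06", "2019-03"),
  Age       = c("8 months", "8 months", "24 months"),
  Eligible  = c(200, 250, 100),
  Immunised = c(180, 200, 95)
)

# Compute the proportion row by row, keep the four requested columns,
# and assign the result so later questions can reuse it.
vacc_prop <- vacc_toy %>%
  mutate(Proportion = Immunised / Eligible) %>%
  select(DHB, Date, Age, Proportion)

print(vacc_prop)
```

Note that the original attempt never assigned the output of mutate(), so the Proportion column would not have existed when select() was called on vacc in the next line.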

How can I count the frequency of a variable in R? [duplicate]

This question already has answers here:
Count number of occurences for each unique value
(14 answers)
Counting the number of elements with the values of x in a vector
(20 answers)
Closed 3 years ago.
I am currently trying to count how often each country appears in a dataframe object.
I tried using count commands as well as rle(sort(x)), which apparently is used to search for strings, but it does not seem to yield any results.
rle(sort(x))
I also tried
count(x, "COUNTRY")
but all it does is count how many entries there are.
How can I get a result such as:
Country Frequency
[1] United States 3
[2] Mexico 5
[3] Germany 12
Here is a small example using dplyr and the built-in dataset mtcars:
library(dplyr)
mtcars %>%
group_by(cyl) %>%
count(cyl)
or, to keep every row and add the count as a new column:
mtcars %>%
group_by(cyl) %>%
add_count(cyl)
Another solution is: table(yourdataframe$x)
count(x, Country, Frequency)
You have to include both columns to see a deeper breakdown; it will then count each combination of Country and Frequency.
or
x %>% group_by(Country) %>% summarise(sum = sum(Frequency), n = n())
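To produce a Country/Frequency table like the one asked for, the approaches above can be tried on a small example. This is a sketch under assumptions: visits is a hypothetical data frame with one row per entry and a Country column, and the values are invented.

```r
library(dplyr)

# Hypothetical data: one row per entry, with the country it belongs to
visits <- tibble(
  Country = c("Mexico", "Germany", "Mexico",
              "United States", "Germany", "Germany")
)

# dplyr: one row per country, most frequent first
freq <- visits %>% count(Country, name = "Frequency", sort = TRUE)
print(freq)

# Base R: a named table of counts
freq_base <- table(visits$Country)
print(freq_base)
```

Note that count() takes the column name unquoted; count(x, "COUNTRY") groups by the constant string instead, which is why it only returned the total number of entries.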

how to obtain summary of statistics for distinct values of a column in dataframe in R? [duplicate]

This question already has answers here:
Aggregate / summarize multiple variables per group (e.g. sum, mean)
(10 answers)
Closed 6 years ago.
Suppose we have a data.frame named IND with a column called dept. There are 100 rows in total and 20 distinct values in dept.
I would like to obtain the summary statistics for each of these 20 subsets of the data.frame (5 rows each), working from the main data.frame.
summary(IND) gives the summary statistics for the whole dataset, but what should I do in my case?
Something like this
mtcars %>% group_by(cyl) %>% summarise_each(funs(sum, mean))
can be used for your case as
IND %>% group_by(dept) %>% summarise_each(funs(sum, mean))
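summarise_each() and funs() have since been deprecated in dplyr. Assuming dplyr 1.0.0 or later is installed, a sketch of the same idea with across() looks like the following; the choice of the mpg and hp columns is only for illustration.

```r
library(dplyr)

# Sum and mean of selected columns for each value of cyl.
# The list names become suffixes: mpg_sum, mpg_mean, hp_sum, hp_mean.
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(across(c(mpg, hp), list(sum = sum, mean = mean)))

print(by_cyl)
```

For the data in the question, the equivalent call would group by dept and select the numeric columns of IND, for example with across(where(is.numeric), ...).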
