How to manipulate a data? [duplicate] - r

This question already has answers here:
Divide the values in one column by those in another column in R
(2 answers)
Closed 2 years ago.
I'm confused by an exercise that I'm working on in R. I'm just a beginner in R.
The instructions are: use dplyr to manipulate the data so that you have the proportion immunised (i.e. Immunised divided by Eligible) for each DHB, for each Age, and each Date. Save the result so you can use it for the remaining questions. You should end up with a data frame containing variables for DHB, Date, Age and Proportion with 4834 observations.
I don't understand how to do this, but this is what I've tried:
```{r}
vacc %>% mutate(Proportion = Immunised/DHB(vacc))
vacc %>% select(DHB, Date, Age, Proportion)
```
but it gave me this error:
Error: Problem with `mutate()` input `Proportion`.
x could not find function "DHB"
i Input `Proportion` is `Immunised/DHB(vacc)`.
Can someone please help me?

DHB is a column in the data frame, but you are calling it as a function.
You can group_by DHB, Age and Date and calculate the ratio of Immunised to Eligible.
library(dplyr)
vacc %>% group_by(DHB, Age, Date) %>% mutate(Proportion = Immunised / Eligible)
Because the division is done row by row, the grouping does not change the values, so this should work too:
vacc %>% mutate(Proportion = Immunised / Eligible)
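To make the answer above concrete, here is a minimal sketch on invented data. The rows and values of `vacc` below are hypothetical stand-ins (the real data set is not shown in the question), and `vacc_prop` is just an illustrative name for the saved result. It assumes there is one row per DHB, Age and Date combination, so a row-wise division is enough.

```r
library(dplyr)

# Hypothetical toy data; the real `vacc` has many more rows.
vacc <- data.frame(
  DHB       = c("Auckland", "Auckland", "Waikato", "Waikato"),
  Date      = as.Date(rep("2019-03-31", 4)),
  Age       = c("6 months", "8 months", "6 months", "8 months"),
  Eligible  = c(200, 180, 150, 160),
  Immunised = c(150, 153, 120, 128)
)

# Divide row by row, keep only the requested columns, and save the result.
vacc_prop <- vacc %>%
  mutate(Proportion = Immunised / Eligible) %>%
  select(DHB, Date, Age, Proportion)

vacc_prop
```

If the real data had several rows per combination, one would probably need `group_by()` plus `summarise(Proportion = sum(Immunised) / sum(Eligible))` instead; that is an assumption about the data, not something the question states.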

Related

Need help to find total of specific name in column! (R-Coding) [duplicate]

This question already has answers here:
How to sum a variable by group
(18 answers)
Closed 4 months ago.
I'm trying to find the total amount for each class in the Genus column. The dataframe is shown below:
Genus/Species Dataframe
I want a dataframe that looks like this:
Genus - Arachnida
Total Sum - (Total amount of Arachnida)
Thank you to whoever replies! (This is my first post and English is my second language, so hopefully someone understands!)
I've tried using dplyr's count function, like:
BIO205 %>% count(Genus)
But it returns how many times each Genus is repeated, not its total amount of species observed.
For example, if I run BIO205 %>% count(Genus), it returns
Returning Dataframe
This indicates that the word Arachnida appears 21 times.
So it seems like you are trying to sum all the total values for each genus. If so, group_by() followed by summarise() is what you want. Here is a reprex:
library(dplyr)
data("iris")
iris %>%
  group_by(Species) %>%
  summarise(sum_col = sum(Sepal.Length))
The above code is grouping the data by species, followed by summing all the sepal lengths for each species. In your case, what I would try is the following code:
library(dplyr)
BIO205 %>%
  group_by(Genus) %>%
  summarise(sum_col = sum(Total))
Hope this helps.
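The difference between counting rows and summing a column can be seen on a small invented data frame. This is only a sketch: the values below are made up, and the column name `Total` is assumed from the answer above rather than confirmed by the question. It also shows `count()` with its `wt` argument, which sums a column instead of counting rows.

```r
library(dplyr)

# Hypothetical stand-in for BIO205; the column name `Total` is an assumption.
BIO205 <- data.frame(
  Genus = c("Arachnida", "Arachnida", "Insecta", "Insecta", "Insecta"),
  Total = c(3, 5, 2, 4, 1)
)

# Number of rows per Genus (what the question was getting).
row_counts <- BIO205 %>% count(Genus)

# Sum of Total per Genus (what the question wants).
totals <- BIO205 %>%
  group_by(Genus) %>%
  summarise(sum_col = sum(Total))

# Same sums via count(), weighting each row by Total; the result column is n.
weighted <- BIO205 %>% count(Genus, wt = Total)
```

Here `row_counts` reports 2 and 3 rows, while `totals` and `weighted` both report sums of 8 and 7.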

R programming inbuilt Titanic Data set [duplicate]

This question already has an answer here:
How can I count the number of instances a value occurs within a subgroup in R?
(1 answer)
Closed 1 year ago.
I am new to R programming. I have to work with the built-in Titanic data in R. I want to find out how many children and adults there are in the dataset. Can someone give me a hint on how to find this?
I tried using the length() function, but it did not give the result.
Here's a solution in tidyverse syntax. It converts the Titanic dataset, which is a table of counts, into a tibble (a type of data frame) with one row per combination of Class, Sex, Age and Survived and the count in column n. It then groups the data by Age and sums n within each group, giving the number of children and adults. (Counting rows with n() instead would return 16 for both groups, because it counts combinations rather than people.)
library(tidyverse)
Titanic %>%
  as_tibble() %>%
  group_by(Age) %>%
  summarise(N = sum(n))
This gives the output:
# A tibble: 2 x 2
  Age       N
  <chr> <dbl>
1 Adult  2092
2 Child   109
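As a cross-check that does not depend on the tidyverse, the per-age totals can also be read directly from the built-in table in base R. This is a sketch using `margin.table()`; the variable name `age_totals` is only illustrative.

```r
# Titanic is a built-in 4-dimensional table: Class x Sex x Age x Survived.
# Summing over every dimension except Age (the 3rd) gives people per age group.
age_totals <- margin.table(Titanic, 3)

age_totals
```

Because the table stores counts of people rather than one row per person, the cell values have to be summed; counting the cells only says how many combinations of the other variables exist.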

How can I add the populations of males and females together to remove gender as a variable in a demographics table. In R Studio [duplicate]

This question already has answers here:
How to sum a variable by group
(18 answers)
Closed 2 years ago.
This is my first time posting a question, so may not have the correct info to start, apologies in advance. Am new to R. Prefer to use dplyr or tidyverse because those are the packages we've used so far. I did search for a similar question, but most gender/sex related questions are around separating the data, or performing operations on each separately.
I have a table of population counts, with variables (factors) Age Range, Year and Sex, with Population as the dependent variable. I want to create a plot to show whether the population is aging, that is, how the relative proportion of different age groups changes over time. Gender is not relevant, so I want to add together the population counts for males and females, for each year and age range.
I don't know how to provide a copy of the raw data .csv file, so if you have any suggestions, please let me know.
This is a sample of the data(output table):
And here is the code so far:
file_name <- "AusPopDemographics.csv"
AusDemo_df = read.table(file_name,",", header=TRUE)
(grp_AusDemo_df <- AusDemo_df %>% group_by(Year, Age))
I am guessing it may be something like pivot_wider() to bring male and female up as column headings, then transmute() to sum them and create a new population column.
Thanks for your help.
With dplyr you could do something like this:
library(dplyr)
grp_AusDemo_df <- AusDemo_df %>%
  group_by(Year, Age) %>%
  summarise(Population = sum(Population, na.rm = TRUE))
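A small sketch of this on invented numbers, extended with the per-year age shares that the question wants to plot. The data values, the column name `Sex`, and the names `pop_by_age`, `age_share` and `Share` are assumptions made for illustration; only `Year`, `Age` and `Population` appear in the code above.

```r
library(dplyr)

# Hypothetical sample; the values are invented for illustration.
AusDemo_df <- data.frame(
  Year = c(2000, 2000, 2000, 2000, 2010, 2010, 2010, 2010),
  Age  = c("0-14", "0-14", "65+", "65+", "0-14", "0-14", "65+", "65+"),
  Sex  = c("Male", "Female", "Male", "Female", "Male", "Female", "Male", "Female"),
  Population = c(100, 100, 50, 50, 90, 90, 60, 60)
)

# Add males and females together within each Year and Age group.
pop_by_age <- AusDemo_df %>%
  group_by(Year, Age) %>%
  summarise(Population = sum(Population, na.rm = TRUE), .groups = "drop")

# Share of each age group within its year.
age_share <- pop_by_age %>%
  group_by(Year) %>%
  mutate(Share = Population / sum(Population)) %>%
  ungroup()
```

Leaving `Sex` out of `group_by()` is what collapses the two sexes into one row, so no pivoting is needed. `.groups = "drop"` returns an ungrouped result, which avoids surprises in later steps.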

How can I count the frequency of a variable in R? [duplicate]

This question already has answers here:
Count number of occurences for each unique value
(14 answers)
Counting the number of elements with the values of x in a vector
(20 answers)
Closed 3 years ago.
I am currently trying to count the frequency of countries that appear in a dataframe object.
I tried using count commands as well as rle(sort(x)), which apparently is used to search for strings, but it does not seem to yield any results.
rle(sort(x))
I tried using this, but it does not seem to work. I also tried to use
count(x, "COUNTRY")
but all it does is count how many entries there are.
How can I get a result such as:
Country Frequency
[1] United States 3
[2] Mexico 5
[3] Germany 12
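Here is a small example using dplyr and the built-in dataset mtcars:
library(dplyr)
# One row per value of cyl, with the number of rows in column n
mtcars %>%
  group_by(cyl) %>%
  count(cyl)
or
# Keeps every row of mtcars and appends the group size as column n
mtcars %>%
  group_by(cyl) %>%
  add_count(cyl)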
Another solution is: table(yourdataframe$x)
count(x, Country, Frequency)
You have to include both columns to see a deeper breakdown; it will then count each combination of Country and Frequency.
or
x %>% group_by(Country) %>% summarise(sum = sum(Frequency), n = n())
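For the shape of output the question asks for, a minimal sketch on invented data might look like this. The data frame `x` and its `COUNTRY` column are hypothetical stand-ins based on the question. Note that dplyr's `count()` expects a bare column name; a quoted string such as `"COUNTRY"` is treated as a constant, which would explain getting one overall count.

```r
library(dplyr)

# Hypothetical data frame with one row per observation.
x <- data.frame(
  COUNTRY = c("Mexico", "Germany", "Mexico", "United States", "Germany", "Germany")
)

# Bare column name; `name` sets the label of the count column.
freq <- x %>% count(COUNTRY, name = "Frequency")

# Base R equivalent, returned as a named table rather than a data frame.
freq_base <- table(x$COUNTRY)
```

`freq` has one row per country with its frequency, sorted by country name; adding `sort = TRUE` to `count()` would order the rows by frequency instead.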

Calculate mean of multiple rows using grouping variables [duplicate]

This question already has answers here:
Mean per group in a data.frame [duplicate]
(8 answers)
Closed 7 years ago.
I am trying to calculate an overall mean across multiple classes. Currently the database is in long format. I tried selecting first by ID number (grouping variable 1), then by a dummy variable that flags the classes I am interested in (stem=1, grouping variable 2), and then calculating one GPA mean (i.e., the stem GPA mean) for the grades received in those classes.
I have attached an example of the database below. Overall, I am trying to figure out how to calculate the stem GPA for each student.
See example here
I have tried using library(psych) and describeBy(data, dataset$id, dataset$stem), but to no avail. Any suggestions?
I prefer the dplyr package for these operations. Group by the grouping variable and take the mean of the value column (grade below is a placeholder for your own column name), e.g.
df %>% group_by(id) %>% summarise(mean_grade = mean(grade))
For instance, using the mtcars dataset:
library(dplyr)
mtcars %>% group_by(cyl) %>% summarise(mean_disp = mean(disp))
will give you the mean of disp for each value of the grouping variable cyl.
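Applied to the stem GPA question, a sketch on invented long-format data could look like the following. The columns `id` and `stem` are taken from the question, while `grade`, the values, and the result name `stem_gpa` are assumptions, since the example data is not shown.

```r
library(dplyr)

# Hypothetical long-format data: one row per student and class.
dataset <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2),
  stem  = c(1, 1, 0, 1, 0, 1),
  grade = c(4.0, 3.0, 2.0, 3.5, 4.0, 2.5)
)

# Keep only STEM classes, then average the grades within each student.
stem_gpa <- dataset %>%
  filter(stem == 1) %>%
  group_by(id) %>%
  summarise(stem_gpa = mean(grade))
```

Filtering first gives one row per student. Grouping by both `id` and `stem` instead would return the STEM and non-STEM means side by side.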
