R: Display value with aggregate [duplicate]

I have a dataframe in R with the following variables:
DateTime year dayYear diameter dendro
I want to calculate the min diameter for each day of the year for each dendrometer, so I used aggregate:
dailymin <- aggregate(diameter~year+dayYear+dendro, FUN=min, data=myData)
However, I also need the time at which the min diameter occurred for each day and dendrometer. Therefore, I need to keep the DateTime value from the row of each min.
Thanks for helping me.

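Alternatively, you can use the plyr package and add a new column with transform. This keeps every row (including DateTime) and adds the group minimum next to it, so the rows where diameter equals new.variable carry the time of each min:
library(plyr)
ddply(myData, .(year, dayYear, dendro), transform, new.variable = min(diameter))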
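If you would rather stay with aggregate, one possible sketch is to merge its result back onto the original data, so that only the rows holding each daily minimum survive. The toy myData below is made up for illustration; only the column names come from the question.

```r
# Toy data standing in for the asker's myData (all values are invented)
myData <- data.frame(
  DateTime = as.POSIXct(c("2012-01-01 06:00", "2012-01-01 14:00",
                          "2012-01-01 06:00", "2012-01-01 14:00",
                          "2012-01-02 06:00", "2012-01-02 14:00"), tz = "UTC"),
  year     = 2012,
  dayYear  = c(1, 1, 1, 1, 2, 2),
  dendro   = c("A", "A", "B", "B", "A", "A"),
  diameter = c(5.2, 4.9, 7.1, 7.4, 5.0, 5.3)
)

# Daily minimum per dendrometer, as in the question
dailymin <- aggregate(diameter ~ year + dayYear + dendro, FUN = min, data = myData)

# merge() joins on all shared columns (year, dayYear, dendro, diameter),
# so only the rows that hold a daily minimum are kept, with their DateTime
dailymin_time <- merge(dailymin, myData)
```

Note that if the minimum is reached more than once on the same day, every tied row is returned.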

Related

How do I calculate the mean of selected columns? [duplicate]

I have a large dataset and want to calculate the means of several of its columns at once. I am not sure how to use colMeans() for this; I have only found how to calculate means for categories and rows.
Take the built-in iris dataset in R as an example:
colMeans(iris[, 1:3], na.rm=TRUE) # select columns 1 to 3
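Selecting by position can break when columns are reordered, so here is a small sketch of the same idea using column names, plus a variant that keeps every numeric column. The two column names chosen are only an illustration.

```r
# Pick columns by name; these two names are just an example choice
cols <- c("Sepal.Length", "Petal.Length")
col_means <- colMeans(iris[, cols], na.rm = TRUE)

# Alternative: take every numeric column, whatever its position
num_means <- colMeans(iris[, sapply(iris, is.numeric)], na.rm = TRUE)
```

Both calls return a named numeric vector, so a single mean can be pulled out with col_means[["Sepal.Length"]].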

How to find the min, median and max (variable) for another variable [duplicate]

I'm quite new to using R so please bear with me.
I'm using the Theoph dataset and need to find the min, median, max concentration for each subject. Both are listed as variables in the dataset.
The subjects are listed like 1,1,1,1,1,2,2,2,2,2,3,3...etc. each with a corresponding concentration value.
How would I approach this, preferably using the tidyverse (although another package is fine if it works), if I want to present the end result in a table (a data frame)?
Here is a solution in base R:
aggregate(Theoph$conc, list(Theoph$Subject), summary)
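One thing to be aware of: when FUN returns several values, aggregate stores them as a single matrix column rather than as separate columns. A minimal sketch that restricts the output to min, median and max and then flattens it into an ordinary data frame could look like this (the names theoph, res and flat are just placeholders):

```r
# Work on a plain data frame copy of the built-in Theoph data
theoph <- as.data.frame(Theoph)

# A named vector per subject; aggregate() stores it as one matrix column "conc"
res <- aggregate(conc ~ Subject, data = theoph,
                 FUN = function(x) c(min = min(x), median = median(x), max = max(x)))

# Flatten the matrix column into conc.min, conc.median and conc.max
flat <- do.call(data.frame, res)
```

The result has one row per subject and can be printed or exported like any other data frame.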

Replace value in dataframe by value divided by mean of the entire column [duplicate]

I need to normalise my data by dividing each value by the mean of the entire column, preferably using dplyr.
Assume:
inputs <- c(3,5,3,9,12)
mydata = data.frame(inputs)
I would like all the values replaced by themselves divided by the mean, which is 6.4.
Any straightforward suggestion?
We can use sapply in base R for a generalized approach
sapply(mydata, function(x) x/mean(x))
Or with colMeans if more than one column
mydata/colMeans(mydata)[col(mydata)]
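Since sapply returns a matrix rather than a data frame, a variant that keeps the data frame structure may be handy. This is only a sketch; the second column, other, is a hypothetical addition to show that it works column by column.

```r
# The asker's data plus one made-up extra column
mydata <- data.frame(inputs = c(3, 5, 3, 9, 12), other = c(2, 4, 6, 8, 10))

# Assigning into normalised[] keeps the data.frame class and column names
normalised <- mydata
normalised[] <- lapply(mydata, function(x) x / mean(x))
```

After this step every column of normalised has a mean of 1, which is a quick way to check the result.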

R sum on rows based on day [duplicate]

I have a dataframe containing sales by day, but there are many values for each day, so I'd like to rebuild my dataframe to have just one value per day: the sum of all the sales for that day.
What I have:
What I'd like to have:
(the values are not calculated correctly, but it's just for the example)
I tried aggregate and similar functions, but it doesn't work and I don't know how to do this.
Thanks for the help.
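aggregate should work (assuming the data frame is called df, so the columns have to be referenced through it):
aggregate(df$TOT_OP_MAIN_PAID_MNT, by = list(DATE_ID = df$DATE_ID), FUN = sum)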
This should work
df <- data.frame(DATE_ID = 1, TOT_OP_MAIN_PAID_MNT = 1)
aggregate(TOT_OP_MAIN_PAID_MNT ~ DATE_ID, df, sum)
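To see the collapse to one row per day, here is a small sketch with several rows per date. The dates and amounts are invented; only the column names come from the question.

```r
# Hypothetical sales data with several rows per day (values are made up)
df <- data.frame(
  DATE_ID = c(20180101, 20180101, 20180102, 20180102, 20180102),
  TOT_OP_MAIN_PAID_MNT = c(10, 5, 3, 4, 9)
)

# One row per DATE_ID, holding the sum of that day's amounts
daily <- aggregate(TOT_OP_MAIN_PAID_MNT ~ DATE_ID, data = df, FUN = sum)
```

The formula interface keeps the original column names in the result, which the by = list(...) form only does when the list elements are named.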

Categorising data within a range in R [duplicate]

I have a data frame in R that has a personal ID, an income and some other variables. I would like to add a new column to this data that categorises people into the income group they fit into (0-24,999, 25,000-49,999, 50,000-74,999, 75,000-99,999, etc).
I then want to be able to make frequency tables of this data compared with some of the other variables (eg: weekly hours worked, age).
I should be fine to figure out the latter of these problems, but I am having trouble figuring out how to categorise my data. Any help would be greatly appreciated.
Thank you.
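We can use cut or findInterval to group the "Variable". Adding include.lowest = TRUE keeps an income of exactly 0 in the first group instead of turning it into NA:
gr <- cut(df1$Variable, breaks = c(0, 24999, 49999, 74999, 99999, Inf), include.lowest = TRUE)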
Then, use table to get the frequency count
table(gr, df1$age)
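If readable group names are wanted, cut also accepts labels. The sketch below uses left-closed intervals (right = FALSE) so that 25,000 falls in the second group; the sample incomes, ages, and the open-ended "100,000+" label are assumptions made for illustration.

```r
# Made-up incomes and ages; df1 and Variable follow the answer's naming
df1 <- data.frame(
  Variable = c(0, 12000, 24999, 25000, 60000, 99999, 150000),
  age      = c(23, 35, 41, 29, 52, 47, 60)
)

breaks <- c(0, 25000, 50000, 75000, 100000, Inf)
labels <- c("0-24,999", "25,000-49,999", "50,000-74,999",
            "75,000-99,999", "100,000+")

# right = FALSE gives intervals [0, 25000), [25000, 50000), ...
df1$group <- cut(df1$Variable, breaks = breaks, labels = labels, right = FALSE)

# Frequency of each income group
counts <- table(df1$group)
```

The new group column is a factor, so it can go straight into table together with any other variable.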
