How to add cells based off of a specific integer? [duplicate] - r

This question already has answers here:
Sum elements of a vector beween zeros in R
(3 answers)
Closed 2 years ago.
I want to add values from a column. They go in sequence:
0,225,2352,34234,23442,23456,0,123,...
I want to sum the values from each 0 up to, but not including, the next 0.
For example, I want an output of
(0+225+2352+34234+23442+23456),(0+123+,...,),...
I want to store them as a new column of totals

One simple solution in base R is
sapply(split(x, cumsum(x == 0)), sum)
cumsum(x == 0) increases by one at every zero, so each run that starts with a 0 gets its own group number. split uses those numbers to break x into groups, and sapply sums each group. The final result is a named numeric vector.
Sample data
x <- c(0,225,2352,34234,23442,23456,0,123,2,0,1,42)
sapply(split(x, cumsum(x == 0)), sum)
# 1 2 3
# 83709 125 43
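Since the question asks for the totals as a new column, here is a minimal sketch of one way to do that. It reuses the cumsum(x == 0) grouping from the answer and swaps sapply/split for ave, which returns one value per original element. The names grp, totals and out are made up for illustration.

```r
x <- c(0, 225, 2352, 34234, 23442, 23456, 0, 123, 2, 0, 1, 42)

# Group id: increases by one at every zero
grp <- cumsum(x == 0)

# ave() repeats each group's sum for every element of that group
totals <- ave(x, grp, FUN = sum)

# Hypothetical data frame holding the values next to their group totals
out <- data.frame(x = x, grp = grp, total = totals)
```

If only one row per group is wanted, the sapply(split(...)) form above is the shorter choice.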

Related

I Want Frequency Value with the Maximum Sample Using R [duplicate]

This question already has answers here:
Find value corresponding to maximum in other column [duplicate]
(2 answers)
Closed 1 year ago.
I want the frequency with the maximum samples in df
df <- data.frame(Freq = c(1,2,3,4,5,6,7,8,9,10), Valu = c(10,5,11,7,13,15,9,6,12,12))
apply(df, 2, which.max)
What I want
I want it to print just the frequency of the maximum Valu which is 6
We could use which.max on the column 'Valu' to get the index of the maximum, then use that index to extract ([) the corresponding 'Freq' value
with(df, Freq[which.max(Valu)])
#[1] 6
If the column names may change, use the column positions instead
df[[1]][which.max(df[[2]])]
#[1] 6
Or use order as well
df[[1]][order(-df[[2]])][1]
#[1] 6
If we loop over the columns with apply (MARGIN = 2) and call which.max, it returns the index of the maximum for each column separately, which is why the original attempt printed a result for both Freq and Valu.
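To make the difference concrete, the following sketch runs the apply call from the question and then picks out the single value the asker wanted. The variable names idx and best are illustrative only.

```r
df <- data.frame(Freq = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                 Valu = c(10, 5, 11, 7, 13, 15, 9, 6, 12, 12))

# One index per column: where Freq peaks and where Valu peaks
idx <- apply(df, 2, which.max)

# Use only the Valu index to look up the matching Freq
best <- df$Freq[idx[["Valu"]]]
```

Note that which.max returns only the first position of the maximum, so ties would need separate handling.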

Split dataframe into 20 groups based on column values [duplicate]

This question already has answers here:
Splitting a continuous variable into equal sized groups
(11 answers)
How to categorize a continuous variable in 4 groups of the same size in R?
(1 answer)
R divide data into groups
(1 answer)
Closed 2 years ago.
I am fairly new to R and can't find a concise solution to a problem.
I have a dataframe in R called df that looks as such. It contains a column called value that holds values from 0 to 1 in ascending order, and a binary column called flag that contains either 0 or 1.
df
value flag
0.033 0
0.139 0
0.452 1
0.532 0
0.687 1
0.993 1
I wish to split this dataframe into X groups covering the range 0 to 1. For example, for a 4-way split the data would be split into 0-0.25, 0.25-0.5, 0.5-0.75 and 0.75-1. Each group would also keep the corresponding flag for every row.
I want the solution to be scalable, so that I can split the data into more groups if I wish. I am also limited to the tidyverse packages.
Does anyone have a solution for this? Thanks
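If n is the number of partitions:
n <- 4  # example value
L <- seq_len(n) / n  # upper bound of each interval
GroupedList <- lapply(L, function(x) {
  # keep rows in the half-open interval (x - 1/n, x]
  df[df$value > (x - 1/n) & df$value <= x, ]
})
This should produce a list of dataframes, one per interval. The upper bound is inclusive so that values lying exactly on a boundary are not lost; a value of exactly 0 still falls outside the first interval.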
You can use cut to divide the data into n groups and pass the result to split to get a list of dataframes. With breaks = n, cut divides the observed range of value into n equal-width intervals.
n <- 4
list_df <- split(df, cut(df$value, breaks = n))
If you want to split the fixed range 0-1 into n groups instead, pass explicit break points:
list_df <- split(df, cut(df$value, seq(0, 1, length.out = n + 1)))
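As a worked version of the cut approach on the sample data from the question, the sketch below keeps the group label as a column before splitting. It adds include.lowest = TRUE, which is not in the answer above: by default cut uses intervals that are open on the left, so a value of exactly 0 would otherwise become NA. The column name group is an assumption for illustration.

```r
df <- data.frame(value = c(0.033, 0.139, 0.452, 0.532, 0.687, 0.993),
                 flag  = c(0, 0, 1, 0, 1, 1))

n <- 4
breaks <- seq(0, 1, length.out = n + 1)  # 0, 0.25, 0.5, 0.75, 1

# Label every row with its interval; include.lowest keeps a value of exactly 0
df$group <- cut(df$value, breaks = breaks, include.lowest = TRUE)

# One data frame per interval, each still carrying the flag column
list_df <- split(df, df$group)

# Number of rows that landed in each interval
counts <- sapply(list_df, nrow)
```

Changing n is enough to get a finer or coarser split, since the breaks are derived from it.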

How to create a variable using another variable as an Index? [duplicate]

This question already has answers here:
Using row-wise column indices in a vector to extract values from data frame [duplicate]
(2 answers)
Closed 3 years ago.
I'm looking to create a new variable, d, which grabs the value from either a or b based on the variable c.
dat = data.frame(a=1:10,b=11:20,c=rep(1:2,5))
The result would be:
d = c(1,12,3,14,... etc)
We can use row/column matrix indexing, where the row index is the sequence of rows and the column index is the 'c' column. cbind them into a two-column matrix and use it to extract the elements from the dataset
dat$d <- dat[1:2][cbind(seq_len(nrow(dat)), dat$c)]
dat$d
#[1] 1 12 3 14 5 16 7 18 9 20
NOTE: This should also work when there are multiple column values to extract.
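To illustrate the note, here is a small sketch with three source columns instead of two. The data frame and the extra column e are invented for the example; only the cbind indexing idea comes from the answer.

```r
# Hypothetical data: c selects which of the columns a, b, e to read in each row
dat <- data.frame(a = 1:6, b = 11:16, e = 21:26, c = rep(1:3, 2))

# Convert the candidate columns to a matrix so [row, col] pairs can index it
m <- as.matrix(dat[c("a", "b", "e")])

# Each row of the index matrix is one (row, column) pair
dat$d <- m[cbind(seq_len(nrow(dat)), dat$c)]
```

The values in c must be valid column positions within the selected columns for this to work.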
You can do
dat$d <- ifelse(dat$c==1,dat$a,dat$b)
A dplyr variant
library(dplyr)
dat %>%
  mutate(d = case_when(c == 1 ~ a,
                       TRUE ~ b))

Apply function over consecutive groups in vector [duplicate]

This question already has answers here:
Calculate the mean of every 13 rows in data frame
(4 answers)
Closed 1 year ago.
I want to calculate the means of every three consecutive elements of a vector.
Ex:
Vec<-rep(1:10)
I would like the output to be like the screenshot below:
You can create the following function to calculate means by groups of 3 (or any other number):
f <- function(x, k = 3) {
  # overwrite the first length(x)/k slots with the mean of each block of k
  for (i in seq(k, length(x), k)) {
    x[i / k] <- mean(x[(i - k + 1):i])
  }
  x[1:(length(x) / k)]
}
f(1:15)
#[1]  2  5  8 11 14
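We can create a grouping variable using gl and then get the group means with ave. Note that ave returns a vector the same length as Vec, with each element replaced by the mean of its group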
ave(Vec, as.numeric(gl(length(Vec), 3, length(Vec))))
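If one mean per group is wanted rather than a vector as long as the input, a possible alternative is tapply. This is a swapped-in technique, not what either answer uses, and the grouping via ceiling(seq_along(Vec) / 3) is likewise an assumption chosen for brevity.

```r
Vec <- 1:10

# Group id for blocks of 3: 1 1 1 2 2 2 3 3 3 4
grp <- ceiling(seq_along(Vec) / 3)

# One mean per group; the last group holds only the leftover element
means <- tapply(Vec, grp, mean)
```

Unlike the function f above, this keeps the incomplete final block instead of dropping it.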

R, accessing a column vector of a matrix by name [duplicate]

This question already has answers here:
Extract matrix column values by matrix column name
(2 answers)
Closed 7 years ago.
In R I can access the data in a column vector of a column matrix by the following:
mat2[,1]
Each column of mat2 has a name. How can I retrieve the data from the first column by using the name attribute instead of [,1]?
For example suppose my first column had the name "saturn". I want something like
mat2[,1] == mat2[saturn]
The following should do it:
mat2[,'saturn']
For example:
> x <- matrix(1:21, nrow=7, ncol=3)
> colnames(x) <- paste('name', 1:3)
> x[,'name 1']
[1] 1 2 3 4 5 6 7
Bonus information (adding to the first answer)
x[,c('name 1','name 2')]
would return two columns just as if you had done
x[,1:2]
And finally, the same operations can be used to subset rows
x[1:2,]
And if rows were named...
x[c('row 1','row 2'),]
Note the position of the comma within the brackets and with respect to the indices.
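The pieces above can be put together in one short sketch. The row names are added here only so that the row-name lookup has something to find, and drop = FALSE is an extra not mentioned in the answers: it keeps a single selected column as a one-column matrix instead of a plain vector.

```r
x <- matrix(1:21, nrow = 7, ncol = 3)
colnames(x) <- paste('name', 1:3)
rownames(x) <- paste('row', 1:7)   # hypothetical row names

# Single column by name: returned as a (named) vector
v <- x[, 'name 1']

# Same column, kept as a 7 x 1 matrix
m <- x[, 'name 1', drop = FALSE]

# Rows and a column selected by name at the same time
sub <- x[c('row 1', 'row 2'), 'name 2']
```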
