Using IF to rate an interval value in R [duplicate]

This question already has answers here:
Assign a value, if a number is in between two numbers
(5 answers)
Nested ifelse statement
(10 answers)
Closed last year.
I have data like this
df:
A rate
10 ...
20
How can I fill in rate from A with this rule, using the simplest code in R?
# if A < 20, rate = 1; if 20 < A < 30, rate = 2
Thanks

There are several ways to do this, perhaps the simplest and most general being findInterval:
set.seed(1)
A <- sample(40, 5)
A
#> [1] 4 39 1 34 23
rate <- findInterval(A, c(-Inf, 20, 30, Inf))
rate
#> [1] 1 3 1 3 2
Created on 2022-02-17 by the reprex package (v2.0.1)
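The question's rule does not say what should happen when A is exactly 20 or 30. As a minimal sketch (the sample values below are illustrative, not from the original post), this shows that findInterval treats each break as the start of the next interval, and that cut with right = FALSE gives the same codes:

```r
# Illustrative values that include the boundary cases 20 and 30
A <- c(10, 20, 25, 30, 35)
breaks <- c(-Inf, 20, 30, Inf)

# findInterval uses left-closed intervals: [-Inf, 20), [20, 30), [30, Inf)
rate_fi <- findInterval(A, breaks)

# cut() with right = FALSE uses the same intervals; labels = FALSE returns integer codes
rate_cut <- cut(A, breaks = breaks, labels = FALSE, right = FALSE)

rate_fi
rate_cut
```

If a boundary value should fall into the lower interval instead, cut with right = TRUE (its default) is one way to get that behaviour.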

Related

Adding successive values in a column of a csv in R [duplicate]

This question already has answers here:
Calculating cumulative sum for each row
(6 answers)
Closed 5 months ago.
I am new to coding so this is probably very basic. I was wondering how to add two successive values in a column of a csv along its entire length. Say this was my data:
[1] 2
[2] 3
[3] 4
[4] 5
I want to make a vector which contains 2+3, 3+4 and 4+5 (but obviously my real data set is much larger).
Thanks a lot!
I think you need cumsum():
If this is your vector:
vector <- 2:5
vector
[1] 2 3 4 5
You would need to run:
cumsum(vector)
[1] 2 5 9 14
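Note that cumsum() returns running totals, whereas the question asks for the sums of neighbouring pairs (2+3, 3+4, 4+5). A minimal sketch of the pairwise version, assuming the csv column has already been read into a numeric vector (the name x is illustrative, not from the original post):

```r
# Illustrative vector standing in for the csv column
x <- 2:5

# Drop the last element, drop the first element, and add the two shifted copies
pair_sums <- head(x, -1) + tail(x, -1)
pair_sums
```

The result has one element fewer than x, because each entry combines a value with its successor.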

How to group in a row values of the same value in a column with R? [duplicate]

This question already has answers here:
Collapse / concatenate / aggregate a column to a single comma separated string within each group
(6 answers)
Closed 4 years ago.
I'm trying to sequence a dataset, and I'm a little lost. Everything else is done (filtering the data, removing duplicate values, ordering by date), but I'm stuck on what may be one of the simplest parts. My goal is to convert this data frame:
Type Value
A 12
B 20
A 14
A 13
B 15
To something like this:
A 12,14,13
B 20,15
Any idea on how to do this?
Thanks in advance!
Using base R is simplest:
aggregate(df$Value~df$Type,FUN=c)
df$Type df$Value
1 A 12, 14, 13
2 B 20, 15
Using FUN=c keeps Value numeric (each group becomes a numeric vector), which is better, in my opinion, than converting it to a string.
However, if no further transformations are needed and you want to save the result as a CSV, you do want to convert it to a string:
write.csv(x = aggregate(df$Value~df$Type,FUN=toString),file = "nameMe")
This works fine.
We could use aggregate from base R
aggregate(Value~., df1, FUN= toString)
# Type Value
#1 A 12, 14, 13
#2 B 20, 15
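For a self-contained check, here is a minimal sketch that rebuilds the example data frame from the question and collapses Value within each Type. The helper using paste with collapse = "," is an illustrative variant for output without spaces after the commas, as in the question's desired result:

```r
# Rebuild the example data from the question
df <- data.frame(Type = c("A", "B", "A", "A", "B"),
                 Value = c(12, 20, 14, 13, 15))

# toString() joins the values of each group with ", "
res <- aggregate(Value ~ Type, data = df, FUN = toString)

# paste(..., collapse = ",") joins them without the space
res_nospace <- aggregate(Value ~ Type, data = df,
                         FUN = function(v) paste(v, collapse = ","))

res
res_nospace
```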
Another Alternative using data.table:
Assumption: The data.frame is stored in variable df.
library(data.table)
setDT(df)
df[,.(Value = paste(Value,collapse=',')),.(Type)]
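You could use the tidyr library, although this spreads the values into one column per Type rather than collapsing them into a single row per Type: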
> library(tidyr)
> spread(df, Type, Value)
A B
1 12 NA
2 NA 20
3 14 NA
4 13 NA
5 NA 15

R Data-Frame: Get Maximum of Variable B conditional on Variable A [duplicate]

This question already has answers here:
Extract the maximum value within each group in a dataframe [duplicate]
(3 answers)
Closed 7 years ago.
I am searching for an efficient and fast way to do the following:
I have a data frame with, say, 2 variables, A and B, where the values for A can occur several times:
mat<-data.frame('VarA'=rep(seq(1,10),2),'VarB'=rnorm(20))
VarA VarB
1 0.95848233
2 -0.07477916
3 2.08189370
4 0.46523827
5 0.53500190
6 0.52605101
7 -0.69587974
8 -0.21772252
9 0.29429577
10 3.30514605
1 0.84938361
2 1.13650996
3 1.25143046
Now I want to get a vector giving me for every unique value of VarA
unique(mat$VarA)
the maximum of VarB conditional on VarA.
In the example here that would be
1 0.95848233
2 1.13650996
3 2.08189370
etc...
My data-frame is very big so I want to avoid the use of loops.
Try this:
library(dplyr)
mat %>% group_by(VarA) %>%
summarise(max=max(VarB))
Try the data.table package:
library(data.table)
mat <- data.table(mat)
result <- mat[,max(VarB),VarA]
print(result)
Try this:
library(plyr)
ddply(mat, .(VarA), summarise, VarB=max(VarB))
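The same per-group maximum can also be obtained without any extra packages. A minimal base R sketch, assuming a data frame shaped like the one in the question (the seed is an illustrative choice so the example is reproducible):

```r
# Example data shaped like the question's: each VarA value occurs twice
set.seed(42)
mat <- data.frame(VarA = rep(seq(1, 10), 2), VarB = rnorm(20))

# Data frame with one row per VarA and the maximum VarB for that group
max_by_a <- aggregate(VarB ~ VarA, data = mat, FUN = max)

# Named vector of maxima, with names taken from the unique VarA values
max_vec <- tapply(mat$VarB, mat$VarA, max)

max_by_a
max_vec
```

Both calls avoid an explicit loop; for a very large data frame, the dplyr and data.table versions are likely to be faster.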

Apply function over consecutive groups in vector [duplicate]

This question already has answers here:
Calculate the mean of every 13 rows in data frame
(4 answers)
Closed 1 year ago.
I want to calculate the means of each three consecutive values in a vector.
Ex:
Vec<-rep(1:10)
I would like the output to be like the screenshot below:
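You can create the following function to calculate means by groups of 3 (or any other number):
f <- function(x, k = 3) {
  n <- length(x) %/% k              # number of complete groups of size k
  out <- numeric(n)                 # separate result vector, so x is not overwritten
  for (j in seq_len(n)) {
    out[j] <- mean(x[((j - 1) * k + 1):(j * k)])   # mean of the j-th group
  }
  out                               # trailing elements that do not fill a group are dropped
}
f(1:15)
[1] 2 5 8 11 14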
We can create a grouping variable using gl and then get the mean of each group with ave (which returns the group mean repeated for every element of the group):
ave(Vec, as.numeric(gl(length(Vec), 3, length(Vec))))
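If one mean per group is wanted rather than a vector of the original length, a minimal sketch with tapply is shown below. The grouping by ceiling(seq_along(Vec) / k) and the names k and grp are illustrative choices, not from the original answers:

```r
Vec <- 1:10
k <- 3

# Group index: elements 1-3 get 1, 4-6 get 2, 7-9 get 3, and the leftover 10th gets 4
grp <- ceiling(seq_along(Vec) / k)

# One mean per group; the last, incomplete group is averaged over what it has
group_means <- tapply(Vec, grp, mean)
group_means
```

Unlike the loop-based function, this keeps the incomplete final group instead of dropping it.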

Sum from element 1 to current element in R [duplicate]

This question already has answers here:
Calculating cumulative sum for each row
(6 answers)
Closed 7 years ago.
I've come across the aggregate() function and things like seq_along(), but I'm not sure how to solve this yet:
For the following:
x
1
5
10
20
I'd like to get the following output:
y
1
6
16
36
It seemed to me that doing something like x[1:seq_along(x)] would do the trick but it seems not because seq_along(x) is a sequence rather than a number.
As mentioned by @DavidArenburg, you can use the cumsum function:
x <- c(1, 5, 10, 20)
cumsum(x)
[1] 1 6 16 36
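To connect this with the x[1:seq_along(x)] idea from the question: the index has to be built one position at a time. A minimal sketch of that explicit version, shown only to illustrate what cumsum computes:

```r
x <- c(1, 5, 10, 20)

# For each position i, sum the elements from the first up to the i-th
y_explicit <- sapply(seq_along(x), function(i) sum(x[seq_len(i)]))

# cumsum() produces the same running totals in a single vectorised call
y_cumsum <- cumsum(x)

y_explicit
y_cumsum
```

The explicit version re-sums the prefix for every element, so cumsum is the better choice for long vectors.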
