Creating a repeating but increasing sequence in R [duplicate]

This question already has answers here:
Using seq and rep to create a sequence of 5 integers that go up by 1 on each repetition
(4 answers)
Closed 1 year ago.
I want to create the sequence 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9, if possible using only rep and seq, so that each repetition of the sequence is shifted up by one. This could be achieved by creating rep(seq(1,5),5) and then adding the vector rep(0:4, each = 5).
But is there any way to do this without creating a new vector and adding it to the first one?

You can use outer + seq in one line
> c(outer(seq(5), seq(5) - 1, `+`))
[1] 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9
or shorter code with embed
> c(embed(1:9, 5)[, 5:1])
[1] 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9
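To see why flattening the outer() result gives the right order, here is a minimal sketch that wraps the idea in a helper. The name shifted_seq and its arguments are made up for illustration; the sketch relies only on the fact that R stores matrices column by column.

```r
# Hypothetical helper (not from the answer above): `len` values per block,
# `reps` blocks, each block starting one higher than the previous one.
shifted_seq <- function(len, reps) {
  # m[i, j] = i + (j - 1), so column j holds the block j:(j + len - 1)
  m <- outer(seq_len(len), seq_len(reps) - 1, `+`)
  as.vector(m)  # column-major flattening emits the blocks in order
}

shifted_seq(5, 5)
# [1] 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9
```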

Related

adding rank of size column in r [duplicate]

This question already has answers here:
How to create a consecutive group number
(13 answers)
Closed 1 year ago.
I have this set of values in the column Num. I want to create another column that ranks them by size, similar to rankt below, but I don't like the ranks that rank() produces.
x <- data.frame("Num" = c(2,5,2,7,7,7,2,5,5))
x$rankt <- rank(x$Num)
Num rankt
1 2 2
2 5 5
3 2 2
4 7 8
5 7 8
6 7 8
7 2 2
8 5 5
9 5 5
Desired Outcome I would like for rankt
Num rankt
1 2 1
2 5 2
3 2 1
4 7 3
5 7 3
6 7 3
7 2 1
8 5 2
9 5 2
Well, a crude approach is to turn them to factors, which are just increasing numbers with labels, and then fetch those numbers:
x <- data.frame("Num" = c(2,5,2,7,7,7,2,5,5))
x$rankt <- as.numeric(as.factor( rank(x$Num) ))
x
It produces:
Num rankt
1 2 1
2 5 2
3 2 1
4 7 3
5 7 3
6 7 3
7 2 1
8 5 2
9 5 2
A solution with dplyr
library(dplyr)
x1 <- x %>%
  mutate(rankt = dense_rank(Num))  # desc(-Num) is just Num, so rank Num directly
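If neither factors nor dplyr are wanted, the same consecutive group numbers can be sketched in base R by matching each value against the sorted unique values. This is an alternative to the answers above, not a restatement of them.

```r
x <- data.frame(Num = c(2, 5, 2, 7, 7, 7, 2, 5, 5))

# sort(unique(...)) gives 2 5 7; match() returns each value's position in it
x$rankt <- match(x$Num, sort(unique(x$Num)))
x$rankt
# [1] 1 2 1 3 3 3 1 2 2
```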

Sequentially remove vector elements

I want to replicate a vector several times, with a different single value missing from each copy (in sequence).
For example, my vector is
value <- 1:7
The first copy omits 1, the second omits 2, and so on. In the end, all copies are concatenated into one vector.
The intended output looks like
2 3 4 5 6 7 1 3 4 5 6 7 1 2 4 5 6 7 1 2 3 5 6 7 1 2 3 4 6 7 1 2 3 4 5 7 1 2 3 4 5 6
Is there any smart way to do this?
You could use the diagonal matrix to set up a logical vector, using it to remove the appropriate values.
n <- 7
rep(1:n, n)[!diag(n)]
# [1] 2 3 4 5 6 7 1 3 4 5 6 7 1 2 4 5 6 7 1 2 3 5 6 7 1 2 3 4 6 7 1 2 3 4 5
# [36] 7 1 2 3 4 5 6
Well, you can certainly do it as a one-liner but I am not sure it qualifies as smart. For example:
x <- 1:7
do.call("c", lapply(as.list(-1:-length(x)), function(a)x[a]))
This simply uses lapply to create a list of copies of x, each with one entry deleted, and then concatenates them using c. The do.call function applies its first argument (a function) to its second argument (a list of arguments to the function).
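A smaller input makes the two steps easier to follow. The sketch below uses a three-element vector and -seq_along(x) in place of -1:-length(x); the names pieces and result are only illustrative.

```r
x <- 1:3

# A negative index drops one element: x[-1] is 2 3, x[-2] is 1 3, x[-3] is 1 2
pieces <- lapply(-seq_along(x), function(a) x[a])

# do.call("c", pieces) is the same as c(pieces[[1]], pieces[[2]], pieces[[3]])
result <- do.call("c", pieces)
result
# [1] 2 3 1 3 1 2

identical(result, unlist(pieces))
# [1] TRUE
```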
For fun, it's also possible to just use rep:
> n <- 7
> rep(1:n, n)[rep(c(FALSE, rep(TRUE, n)), length.out=n^2)]
[1] 2 3 4 5 6 7 1 3 4 5 6 7 1 2 4 5 6 7 1 2 3 5 6 7 1 2 3 4 6 7 1 2 3 4 5 7 1 2
[39] 3 4 5 6
But lapply is cleaner, I think.
You could also do:
n <- 7
rep(seq(n), n)[-seq(1,n*n,n+1)]
#[1] 2 3 4 5 6 7 1 3 4 5 6 7 1 2 4 5 6 7 1 2 3 5 6 7 1 2 3 4 6 7 1 2 3 4 5 7 1 2 3 4 5 6
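The indexing answers all drop the same positions of rep(1:n, n): the diagonal of an n-by-n matrix, which in column-major order sits at 1, n + 2, 2n + 3, and so on. A quick sketch to check that, and that the approaches agree (variable names are illustrative):

```r
n <- 7

# The diagonal positions of an n x n matrix are 1, n + 2, 2n + 3, ...
diag_pos <- which(diag(n) == 1)
all(diag_pos == seq(1, n * n, n + 1))
# [1] TRUE

by_diag   <- rep(1:n, n)[!diag(n)]
by_lapply <- do.call("c", lapply(-seq_len(n), function(i) (1:n)[i]))
by_rep    <- rep(1:n, n)[rep(c(FALSE, rep(TRUE, n)), length.out = n^2)]
by_seq    <- rep(seq(n), n)[-seq(1, n * n, n + 1)]

length(by_diag)  # n * (n - 1) values remain
# [1] 42
all(by_diag == by_lapply, by_diag == by_rep, by_diag == by_seq)
# [1] TRUE
```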

How to order data.table by custom column [duplicate]

This question already has an answer here:
Data.table meta-programming
(1 answer)
Closed 6 years ago.
Ordering of the data.frame by column index:
> df <- data.frame(5:9, 8:4)
> df
X5.9 X8.4
1 5 8
2 6 7
3 7 6
4 8 5
5 9 4
> df[order(df[,2]),]
X5.9 X8.4
5 9 4
4 8 5
3 7 6
2 6 7
1 5 8
or by column name:
> df[order(df[,"X5.9"]),]
X5.9 X8.4
1 5 8
2 6 7
3 7 6
4 8 5
5 9 4
Is it possible to achieve the same with data.table and order by custom column name or index?
We can use setkey
setkey(setDT(df), X5.9)
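If the column is only known at run time, as a name stored in a variable or as an index, one possible sketch uses setorderv(), which takes column names as a character vector. This assumes the data.table package is installed; col_name and col_index are hypothetical variables, and copy() is used so the original table is left untouched.

```r
library(data.table)

dt <- data.table(X5.9 = 5:9, X8.4 = 8:4)

# Order by a column name held in a variable (hypothetical name)
col_name <- "X8.4"
by_name <- setorderv(copy(dt), col_name)   # setorderv sorts by reference
by_name$X8.4
# [1] 4 5 6 7 8

# Order by column index by translating the index into a name first
col_index <- 1L
by_index <- setorderv(copy(dt), names(dt)[col_index])
by_index$X5.9
# [1] 5 6 7 8 9
```

setkeyv() accepts a character vector in the same way if a key is wanted rather than just a sort.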

How to input this vector (1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9) simply using seq()&rep()?

I want to create the vector (1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9).
Perhaps seq() and rep() alone cannot take the parameters needed for this.
I read the help pages but could not find a way.
You could try
(1:5) + rep(0:4,each=5)
#[1] 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9
NOTE: (1:5) and 0:4 can be replaced by seq(1,5) and seq(0,4)
Another one:
as.vector(outer(1:5,0:4,"+"))

Trouble splitting up data evenly?

I am trying to split up data evenly in R. For example, I am using the built-in dataset cars, which has 50 rows. If I want to split the data into two sections, I would do something along the lines of this:
cars$split <- rep(1:2, each=25)
This creates a column called split and assigns 1 to the first 25 rows and 2 to the next 25. However, if I wanted to split my data into, let's say, 8 sections (the number is chosen by the user), 50 does not divide evenly by 8 (50 / 8 = 6.25). In this case, I would give each section 6 rows (6 * 8 = 48) and assign the 2 leftover rows to section 8. I cannot do this with the call above because rep needs the lengths to match, so I tried to write out some logic, but I get this error:
Error in `$<-.data.frame`(`*tmp*`, "split", value = c(1L, 1L, 1L, 1L, : replacement has 48 rows, data has 50
Any ideas on how to fix this? My attempt is shown below:
numDataPerSection <- floor(nrow(cars) / userInputNum)
if(nrow(cars) %% userInputNum != 0){
#If not divisible, assign last few data points to the last number
cars$split <- rep(1:ncls, each=numDataPerSection, len = nrow(cars) - (nrow(cars) %% userInputNum))
for(i in nrow(cars) %% userInputNum){
cars$split[nrow(cars) - i] <- userInputNum
}
}
#Everything divides correctly
else{
cars$split <- rep(1:ncls, each=numDataPerSection)
}
How about using a function such as this one to create your indices?
create.indices <- function(nrows, sections) {
indices <- rep(1:sections,each=floor(nrows/sections))
indices <- append(indices, rep(tail(indices, 1), nrows%%sections))
return(indices)
}
create.indices(50,8)
# [1] 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 5 5 6 6 6 6 6 6 7 7 7 7 7 7 8 8 8 8 8 8 8 8
You could use something like
split1 <- function(n,s){ c( rep(1:s, each=n%/%s), rep(s, n%%s) ) }
cars$split <- split1(nrow(cars), userInputNum)
but this is not very balanced: in your example category 8 is two larger than any other, and it would be worse with 55 rows and 8 sections:
> split1(50,8)
[1] 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 5 5 6 6 6 6 6 6 7 7
[39] 7 7 7 7 8 8 8 8 8 8 8 8
> table(split1(50,8))
1 2 3 4 5 6 7 8
6 6 6 6 6 6 6 8
> table(split1(55,8))
1 2 3 4 5 6 7 8
6 6 6 6 6 6 6 13
You could do better with something like
split2 <- function(n,s){ ((1:n)*s+n-s) %/% n }
which produces
> split2(50,8)
[1] 1 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 5 5 5 6 6 6 6 6 6
[39] 7 7 7 7 7 7 8 8 8 8 8 8
> table(split2(50,8))
1 2 3 4 5 6 7 8
7 6 6 6 7 6 6 6
> table(split2(55,8))
1 2 3 4 5 6 7 8
7 7 7 7 7 7 7 6
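The formula in split2 can be read as floor((i - 1) * s / n) + 1 for row i, since (i * s + n - s) / n = (i - 1) * s / n + 1. A minimal sketch applying it to cars and checking the balance follows; userInputNum is set to 8 here purely as an example value.

```r
split2 <- function(n, s) ((1:n) * s + n - s) %/% n

# Same values as floor((i - 1) * s / n) + 1
all(split2(50, 8) == floor((1:50 - 1) * 8 / 50) + 1)
# [1] TRUE

userInputNum <- 8                      # example value, normally supplied by the user
cars$split <- split2(nrow(cars), userInputNum)
sizes <- table(cars$split)
diff(range(sizes))                     # largest minus smallest group size
# [1] 1
```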
You can use the length.out argument of rep() to create your split column: rep(1:8, length.out = 50, each = round(50/8)). Rounding the each value keeps the group sizes close to uniform:
> table(rep(1:8, length.out = 50, each = round(50/8)))
1 2 3 4 5 6 7 8
8 6 6 6 6 6 6 6
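One thing to be aware of with this approach: rep() applies each first and then recycles from the start to reach length.out, so the leftover rows at the end fall into group 1 rather than group 8. A small sketch showing this, with sort() as one possible way to keep groups contiguous (g and g_sorted are illustrative names):

```r
g <- rep(1:8, each = round(50 / 8), length.out = 50)

tail(g, 4)   # the last two rows wrap around to group 1
# [1] 8 8 1 1

# One possible fix if each group must be a contiguous block of rows
g_sorted <- sort(g)
head(g_sorted, 9)
# [1] 1 1 1 1 1 1 1 1 2
```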
