Use rep() and seq() to create a vector - r

I am new to R. In Java I would introduce a control variable to create a sequence such as
1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9
I was thinking of doing something like
seq(from=c(1:5),to=c(5,10),by=1)
However that does not work...
Can that be solved purely with seq and rep?

How about this?
rep(0:4, each=5)+seq(from=1, to=5, by=1)
[1] 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9
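This one-liner relies on R's vector recycling: the shorter operand (length 5) is reused until it covers the longer one (length 25). A minimal sketch that writes the recycling out by hand; the names `offsets` and `base` are only illustrative and do not come from the answer:

```r
# Offsets 0,0,0,0,0,1,1,...: one block of five per starting value
offsets <- rep(0:4, each = 5)

# Base pattern 1..5 written out to full length instead of relying on recycling
base <- rep(seq(from = 1, to = 5, by = 1), times = 5)

result <- offsets + base
print(result)

# Letting R recycle the shorter vector gives the same values
all(result == rep(0:4, each = 5) + seq(from = 1, to = 5, by = 1))
```

Recycling is silent when the longer length is a multiple of the shorter one, as it is here; otherwise R emits a warning.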

Try this. You can write a function that builds each sequence and apply it to an initial vector v1. Here is the code:
#Data
v1 <- 1:5
#Code
v2 <- c(sapply(v1, function(x) seq(from=x,by=1,length.out = 5)))
Output:
[1] 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9
And the way using seq() and rep() can be:
#Code2
rep(1:5, each = 5) + 0:4
Output:
[1] 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9

Using outer is pretty concise:
c(outer(1:5, 0:4, `+`))
#> [1] 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9
Note, 0:4 is short for seq(from = 0, to = 4, by = 1)

A perfect use case for Map or mapply. I always prefer Map because it does not simplify the output by default.
Map(seq, from = 1:5, to = 5:9)
[[1]]
[1] 1 2 3 4 5
[[2]]
[1] 2 3 4 5 6
[[3]]
[1] 3 4 5 6 7
[[4]]
[1] 4 5 6 7 8
[[5]]
[1] 5 6 7 8 9
You can use unlist() to get it the way you want.
unlist(Map(seq, from = 1:5, to = 5:9))
[1] 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9
Note that `by = 1` is the default.
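The answers above hard-code five runs of length five. As a rough sketch, the same rep()-plus-recycling idea could be wrapped in a small helper; the function name `shifted_runs` and its parameters are hypothetical, chosen here for illustration only:

```r
# Hypothetical helper: for each value in `starts`, emit a run of `len`
# consecutive integers beginning at that value.
shifted_runs <- function(starts, len) {
  # Repeat each start `len` times, then add the recycled offsets 0..len-1
  rep(starts, each = len) + (seq_len(len) - 1L)
}

shifted_runs(1:5, 5)

# Agrees with the Map()-based version for the original problem
all(shifted_runs(1:5, 5) == unlist(Map(seq, from = 1:5, to = 5:9)))

# Starts do not have to be consecutive
shifted_runs(c(10, 20), 3)
```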

Related

Generating a vector with rep and seq but without the c() function

Suppose that I am not allowed to use the c() function.
My target is to generate the vector
"1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9"
Here is my attempt:
rep(seq(1, 5, 1), 5)
# [1] 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5
rep(0:4,rep(5,5))
# [1] 0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4
So basically I sum them up. But I wonder if there is a better way that uses the rep and seq functions ONLY.
Like so:
1:5 + rep(0:4, each = 5)
# [1] 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9
I like the sequence option as well:
sequence(rep(5, 5), 1:5)
You could do
rep(1:5, each=5) + rep.int(0:4, 5)
# [1] 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9
Just to be precise and use seq as well:
rep(seq.int(1, 5), each=5) + rep.int(0:4, 5)
(PS: You can remove the .ints, but it's slower.)
One possible way:
as.vector(sapply(1:5, `+`, 0:4))
[1] 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9
I would also propose the outer() function:
library(dplyr)
outer(1:5, 0:4, "+") %>%
array()
Or, without the magrittr %>% operator, using the native pipe available in R 4.1 and later:
outer(1:5, 0:4, "+") |>
array()
Explanation.
The first function builds a matrix from the sequences 1:5 and 0:4 and fills each cell with the sum of the corresponding values:
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    4    5
[2,]    2    3    4    5    6
[3,]    3    4    5    6    7
[4,]    4    5    6    7    8
[5,]    5    6    7    8    9
The second will pull the vector from the array and return the required vector:
[1] 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9
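The outer() approach gives the right order here partly because the 5-by-5 matrix is symmetric. R flattens a matrix column by column, so the argument order matters once the number of starts differs from the run length. A small sketch with illustrative inputs (not taken from the question):

```r
starts  <- 1:3   # illustrative starting values
offsets <- 0:1   # illustrative offsets, i.e. runs of length 2

# Rows index the starts, so flattening walks through all starts first
by_start <- as.vector(outer(starts, offsets, "+"))
by_start
# [1] 1 2 3 2 3 4

# Rows index the offsets, so each column is one complete run
by_offset <- as.vector(outer(offsets, starts, "+"))
by_offset
# [1] 1 2 2 3 3 4
```

For the "run after run" layout asked for in the question, putting the offsets in the first argument is the order that generalizes.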

How to input this vector (1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9) simply using seq()&rep()?

The vector (1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9)
Maybe seq() and rep() cannot pass parameters to each other.
I read the help docs but failed to find a way.
You could try
(1:5) + rep(0:4,each=5)
#[1] 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9
NOTE: (1:5) and 0:4 can be replaced by seq(1,5) and seq(0,4)
Another one:
as.vector(outer(1:5,0:4,"+"))

Converting multiple histogram frequency count into an array in R

For each row in the matrix "result" shown below
A B C D E F G H I J
1 4 6 3 5 9 9 9 3 4 4
2 5 7 5 5 8 8 8 7 4 5
3 7 5 4 4 7 9 7 4 4 5
4 6 6 6 6 8 9 8 6 3 6
5 4 5 5 5 8 8 7 4 3 7
6 7 9 7 6 7 8 8 5 7 6
7 5 6 6 5 8 8 7 3 3 5
8 6 7 4 5 8 9 8 4 6 5
9 6 8 8 6 7 7 7 7 6 6
I would like to plot a histogram for each row with 3 bins as shown below:
samp<-result[1,]
hist(samp, breaks = 3, col="lightblue", border="pink")
Now what is needed is to convert the histogram frequency counts into an array as follows
Say I have 4 bins, where the first bin has count 5, the second has count 2, and the fourth has count 3. I want the counts of all bins, computed from each row of result, as one vector per row:
row1 5 2 0 3
For hundreds of rows I would like to do it in an automated way and hence posted this question.
In the end the matrix should look like
bin 2-4 bin 4-6 bin 6-8 bin 8-10
row 1 5 2 0 3
row 2
row 3
row 4
row 5
row 6
row 7
row 8
row 9
DF <- read.table(text="A B C D E F G H I J
1 4 6 3 5 9 9 9 3 4 4
2 5 7 5 5 8 8 8 7 4 5
3 7 5 4 4 7 9 7 4 4 5
4 6 6 6 6 8 9 8 6 3 6
5 4 5 5 5 8 8 7 4 3 7
6 7 9 7 6 7 8 8 5 7 6
7 5 6 6 5 8 8 7 3 3 5
8 6 7 4 5 8 9 8 4 6 5
9 6 8 8 6 7 7 7 7 6 6", header=TRUE)
m <- as.matrix(DF)
apply(m,1,function(x) hist(x,breaks = 3)$counts)
# $`1`
# [1] 5 2 0 3
#
# $`2`
# [1] 5 0 2 3
#
# $`3`
# [1] 6 3 1
#
# $`4`
# [1] 1 6 2 1
#
# $`5`
# [1] 3 3 4
#
# $`6`
# [1] 3 4 2 1
#
# $`7`
# [1] 2 5 3
#
# $`8`
# [1] 6 3 1
#
# $`9`
# [1] 4 4 0 2
Note that according to the documentation the number of breaks is only a suggestion. If you want to have the same number of breaks in all rows, you should do the binning outside of hist:
breaks <- 1:5*2
t(apply(m,1,function(x) table(cut(x,breaks,include.lowest = TRUE))))
# [2,4] (4,6] (6,8] (8,10]
# 1 5 2 0 3
# 2 1 4 5 0
# 3 4 2 3 1
# 4 1 6 2 1
# 5 3 3 4 0
# 6 0 3 6 1
# 7 2 5 3 0
# 8 2 4 3 1
# 9 0 4 6 0
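As a possible variation on the same idea, hist() itself accepts a vector of break points, in which case the bins are fixed rather than suggested, and plot = FALSE skips the drawing. The sketch below rebuilds only the first three rows of the question's data so that it is self-contained; the column labels are an illustrative choice:

```r
# First three rows of the question's data
m <- rbind(
  c(4, 6, 3, 5, 9, 9, 9, 3, 4, 4),
  c(5, 7, 5, 5, 8, 8, 8, 7, 4, 5),
  c(7, 5, 4, 4, 7, 9, 7, 4, 4, 5)
)
breaks <- seq(2, 10, by = 2)

# With an explicit break vector every row is binned identically;
# plot = FALSE returns the histogram object without drawing it
counts <- t(apply(m, 1, function(x) hist(x, breaks = breaks, plot = FALSE)$counts))
colnames(counts) <- paste(head(breaks, -1), tail(breaks, -1), sep = "-")
counts
```

Note that hist() stops with an error if a value falls outside the supplied breaks, so the break vector has to span the whole data range.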
You could access the counts vector which is returned by hist (see ?hist for details):
counts <- hist(samp, breaks = 3, col="lightblue", border="pink")$counts

Sequentially reorganize a vector in R

I have a numeric element z as below:
> sort(z)
[1] 1 5 5 5 6 6 7 7 7 7 7 9 9
I would like to sequentially reorganize this element so as to have
> z
[1] 1 2 2 2 3 3 4 4 4 4 4 5 5
I guess converting z to a factor and use it as an index should be the way.
You answered it yourself really:
as.integer(factor(sort(z)))
I know this has been accepted already but I decided to look inside factor() to see how it's done there. It more or less comes down to this:
x <- sort(z)
match(x, unique(x))
Which is an extra line I suppose but it should be faster if that matters.
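A quick check of both variants on data shaped like the question's. The unsorted input vector below is made up so that it sorts to the vector shown in the question; the names `by_factor` and `by_match` are illustrative:

```r
# Made-up unsorted input that sorts to 1 5 5 5 6 6 7 7 7 7 7 9 9
z <- c(7, 5, 9, 1, 7, 6, 5, 7, 9, 6, 5, 7, 7)
x <- sort(z)

by_factor <- as.integer(factor(x))  # level codes of the sorted values
by_match  <- match(x, unique(x))    # position of each value among the distinct ones

by_factor
# [1] 1 2 2 2 3 3 4 4 4 4 4 5 5
identical(by_factor, by_match)
# [1] TRUE
```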
This should do the trick
z = sort(sample(1:10, 100, replace = TRUE))
cumsum(diff(z)) + 1
[1] 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3
[26] 3 3 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6
[51] 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 7 8 8 8 8 8 8 8 8 8
[76] 8 8 8 8 8 9 9 9 9 9 9 9 9 9 9 9 9 9 9 10 10 10 10 10
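Note that diff omits the first element of the series, and that summing the raw differences only gives consecutive ranks when neighbouring distinct values differ by exactly 1 (which is not the case for the z in the question). To handle both, count the positions where the value changes:
c(1, cumsum(diff(z) != 0) + 1)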
Alternative using rle:
z = sort(sample(1:10, 100, replace = TRUE))
rle_result = rle(sort(z))
rep(rle_result$values, rle_result$lengths)
[1] 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3
[26] 3 3 3 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 6 6 6
[51] 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 7 8 8 8 8 8 8 8 8
[76] 8 8 8 8 8 8 9 9 9 9 9 9 9 9 9 9 9 9 9 9 10 10 10 10 10
Another rle variant, with x <- sort(z) as above:
rep(seq_along(rle(x)$lengths), rle(x)$lengths)
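The two rle() variants above behave differently once the values have gaps. A short sketch on the sorted vector from the question (the names `runs`, `rebuilt` and `ranks` are illustrative):

```r
z <- c(1, 5, 5, 5, 6, 6, 7, 7, 7, 7, 7, 9, 9)  # sorted vector from the question
runs <- rle(z)

# Repeating the run values simply reconstructs z
rebuilt <- rep(runs$values, runs$lengths)
rebuilt
# [1] 1 5 5 5 6 6 7 7 7 7 7 9 9

# Repeating the run index gives the consecutive numbering that was asked for
ranks <- rep(seq_along(runs$lengths), runs$lengths)
ranks
# [1] 1 2 2 2 3 3 4 4 4 4 4 5 5
```

The first form only looks like a ranking when the distinct values happen to be 1, 2, 3, and so on, as in the sample(1:10, ...) example.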

Sequence expansion question

I have a sequence of 'endpoints', e.g.:
c(7,10,5,11,15)
that I want to expand to a sequence of 'elapsed time' between the endpoints, e.g.
c(7,1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,1,2,3,4,5,6,7,8,9,10,11,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15)
What's the most efficient way to do this in R? I'm imagining some creative use of the embed function, but I can't quite get there without using an ugly for loop.
Here's the naive way to do this:
expandSequence <- function(x) {
out <- x[1]
for (y in (x[-1])) {
out <- c(out,seq(1,y))
}
return(out)
}
expandSequence(c(7,10,5,11,15))
There is a base function to do this, called, wait for it, sequence:
sequence(c(7,10,5,11,15))
[1] 1 2 3 4 5 6 7 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 1 2 3
[26] 4 5 6 7 8 9 10 11 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
In your case it seems your first endpoint is in fact not part of the sequence, so it becomes:
c(7, sequence(c(10,5,11,15)))
[1] 7 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 1 2 3 4 5 6 7 8 9
[26] 10 11 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
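For intuition, sequence() applied to a vector of lengths behaves like concatenating seq_len(n) for each n. A small sketch checking that on the question's endpoints (the names `by_hand` and `expanded` are illustrative):

```r
endpoints <- c(7, 10, 5, 11, 15)

# Build each 1..n run separately and concatenate them
by_hand <- unlist(lapply(endpoints, seq_len))
all(sequence(endpoints) == by_hand)
# [1] TRUE

# Keep the first endpoint as-is and expand only the remaining ones
expanded <- c(endpoints[1], sequence(endpoints[-1]))
length(expanded)
# [1] 42
```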
How about this:
> unlist(sapply(x,seq))
[1] 1 2 3 4 5 6 7 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 1 2
[25] 3 4 5 6 7 8 9 10 11 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
With the first element kept as-is at the front:
c( x[1], unlist( sapply( x[seq(2,length(x))], seq ) ) )
And a slightly more readable version:
library(taRifx)
c( x[1], unlist( sapply( shift(x,wrap=FALSE), seq ) ) )
A combination of lapply() and seq_len() is useful here:
expandSequence <- function(x) {
out <- lapply(x[-1], seq_len)
do.call(c, c(x[1], out))
}
Which gives for
pts <- c(7,10,5,11,15)
> expandSequence(pts)
[1] 7 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 1 2 3 4
[21] 5 6 7 8 9 10 11 1 2 3 4 5 6 7 8 9 10 11 12 13
[41] 14 15
(An alternative is:
expandSequence <- function(x) {
out <- lapply(x[-1], seq_len)
unlist(c(x[1], out), use.names = FALSE)
}
)

Resources