How to roll the order of a fixed length vector [duplicate] - r

I have this vector:
1:12
and I want to roll the order of the values in this way:
#Iter 1
1 2 3 4 5 6 7 8 9 10 11 12
#Iter 2
12 1 2 3 4 5 6 7 8 9 10 11
#Iter 3
11 12 1 2 3 4 5 6 7 8 9 10
#Iter 4
10 11 12 1 2 3 4 5 6 7 8 9
#Iter 5
...
#Iter 12
2 3 4 5 6 7 8 9 10 11 12 1
I tried dplyr::lead, seq(to = 1, by = -1, length.out = 12), and a loop, but I don't know how to do backwards (reverse) slicing in R.

You can try this:
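vec <- 1:12
n <- length(vec)
# List to hold each rotation; the first entry is the original vector
List <- vector("list", n)
List[[1]] <- vec
# Each pass moves the last element to the front
for (i in seq_len(n)[-1])
{
  List[[i]] <- vec[c(n, seq_len(n - 1))]
  vec <- List[[i]]
}
List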
[[1]]
[1] 1 2 3 4 5 6 7 8 9 10 11 12
[[2]]
[1] 12 1 2 3 4 5 6 7 8 9 10 11
[[3]]
[1] 11 12 1 2 3 4 5 6 7 8 9 10
[[4]]
[1] 10 11 12 1 2 3 4 5 6 7 8 9
[[5]]
[1] 9 10 11 12 1 2 3 4 5 6 7 8
[[6]]
[1] 8 9 10 11 12 1 2 3 4 5 6 7
[[7]]
[1] 7 8 9 10 11 12 1 2 3 4 5 6
[[8]]
[1] 6 7 8 9 10 11 12 1 2 3 4 5
[[9]]
[1] 5 6 7 8 9 10 11 12 1 2 3 4
[[10]]
[1] 4 5 6 7 8 9 10 11 12 1 2 3
[[11]]
[1] 3 4 5 6 7 8 9 10 11 12 1 2
[[12]]
[1] 2 3 4 5 6 7 8 9 10 11 12 1
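The same rotations can also be produced without a loop that carries state from one pass to the next. The following is only a sketch: roll_right is a hypothetical helper name (not a base R function), and it assumes a right rotation by k positions is what is wanted.

```r
# Hypothetical helper: rotate x to the right by k positions.
roll_right <- function(x, k = 1) {
  n <- length(x)
  if (n == 0) return(x)
  k <- k %% n                      # wrap k into 0..(n - 1)
  if (k == 0) return(x)
  c(tail(x, k), head(x, n - k))    # last k elements move to the front
}

vec <- 1:12
# One rotation per shift 0, 1, ..., 11
rolls <- lapply(seq_along(vec) - 1, function(k) roll_right(vec, k))
rolls[[2]]
```

Because each entry is computed directly from the original vector, any single rotation can be obtained on its own, e.g. roll_right(vec, 3).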

Related

Create numbers based on different probability in R

I am trying to simulate an i × j matrix of data with i = 2 and j = 200, where i indexes subjects and j indexes trials, and fill it with random numbers between 1 and 10 whose probabilities depend on the trial. For the first subject (i = 1), in the first 100 trials (j = 1 to 100) there is a 70% probability of a number from 1-5 and a 30% probability of a number from 6-10, and the probabilities are reversed in trials 101 to 200. For the second subject (i = 2), in the first 100 trials (j = 1 to 100) there is a 60% probability of a number from 1-5 and a 40% probability of a number from 6-10, and the probabilities are reversed in trials 101 to 200.
I gave an example with 2 subjects because I need to do this for multiple values of i, not just one.
Can I work this out with sample?
I guess what you are after is Stratified Sampling.
With base R, you can implement stratified sampling via sample, but you may need to define a user function like f as below
f <- function(N, p) {
c(
sapply(
list(p, rev(p)),
function(v) {
sapply(
sample(c(TRUE, FALSE), N, replace = TRUE, prob = v),
function(x) ifelse(x, sample(1:5, 1), sample(6:10, 1))
)
}
)
)
}
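When you use it, you first define a list probs with one probability vector per subject, e.g.,
probs <- list(c(0.7, 0.3), c(0.6, 0.4))
and then run it with N set to the number of trials in each half (N = 100 gives j = 200 trials per subject):
> lapply(probs, f, N = 100)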
[[1]]
[1] 2 1 2 5 3 6 9 2 2 2 3 2 3 7 4 5 3 7 1 4 10 2 3 6 8
[26] 7 8 3 1 2 5 1 4 4 4 2 1 5 5 4 1 6 4 2 9 10 5 1 1 5
[51] 4 4 3 4 8 4 10 3 2 1 3 4 7 4 2 10 1 4 3 3 5 2 7 6 5
[76] 3 10 4 2 2 5 1 2 3 2 3 3 2 9 10 10 10 10 3 1 4 3 1 1 5
[101] 8 6 5 9 1 6 1 9 10 4 5 4 6 5 8 2 4 10 6 3 8 5 10 8 8
[126] 8 9 3 8 6 5 7 10 9 6 8 9 5 6 8 4 6 6 7 4 4 8 10 10 6
[151] 9 10 9 7 8 7 3 7 4 6 10 8 10 8 5 6 10 8 9 6 6 1 9 4 8
[176] 1 5 10 7 10 8 7 6 6 5 4 7 7 8 8 1 10 8 5 8 9 4 5 6 7
[[2]]
[1] 7 9 4 9 5 3 3 9 4 5 6 10 4 5 2 3 2 5 4 5 3 8 5 2 1
[26] 6 5 3 9 3 9 9 9 8 7 3 4 5 7 3 5 3 5 7 5 3 4 2 6 4
[51] 7 6 2 7 4 4 10 4 10 2 8 10 3 2 8 1 8 10 8 4 3 2 9 8 4
[76] 4 10 1 3 10 6 8 6 3 5 2 3 3 9 4 7 5 1 1 1 3 10 5 2 7
[101] 2 10 2 6 8 10 10 7 3 7 3 3 7 1 10 3 4 1 1 8 2 5 2 4 7
[126] 2 7 7 4 9 10 7 1 4 4 9 7 9 9 9 8 4 1 10 6 10 4 4 8 9
[151] 7 8 3 2 9 1 9 7 6 9 1 6 3 9 7 8 5 9 3 8 9 6 5 1 2
[176] 5 10 2 7 8 7 8 8 8 8 8 5 1 1 7 6 3 3 4 2 3 2 3 1 3
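The nested sapply calls above draw one number at a time. A shorter variant, sketched here under the assumption that numbers within each group (1-5 and 6-10) are equally likely, passes per-value weights straight to sample. The name simulate_subject and its arguments are made up for this illustration.

```r
# Hypothetical helper: p_low is the probability of drawing from 1:5 in the
# first block of trials; the probabilities are reversed in the second block.
simulate_subject <- function(p_low, n_per_block) {
  w <- rep(c(p_low, 1 - p_low) / 5, each = 5)   # weights for the values 1:10
  c(sample(1:10, n_per_block, replace = TRUE, prob = w),
    sample(1:10, n_per_block, replace = TRUE, prob = rev(w)))
}

set.seed(1)
# One row per subject, one column per trial
sim <- t(sapply(c(0.7, 0.6), simulate_subject, n_per_block = 100))
dim(sim)
```

Adding more subjects only requires extending the vector of probabilities passed to sapply.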

Specify unique levels when creating multiple factors

I have a dataframe which I am trying to turn into factors. I want each row to represent a factor, with the levels ordered in the order that the values appear. My code is falling short of this last task:
> x
V11 V12 V13 V21 V22 V23 V31 V32 V33 V41 V42 V43
r1 1 2 3 4 5 6 7 8 9 10 11 12
r2 1 2 3 4 5 6 10 11 12 7 8 9
r3 1 2 3 7 8 9 10 11 12 4 5 6
r4 4 5 6 7 8 9 10 11 12 1 2 3
>
> x %>%
+ t %>%
+ as_data_frame %>%
+ mutate_all(factor) %>%
+ lapply(., unlist)
$r1
[1] 1 2 3 4 5 6 7 8 9 10 11 12
Levels: 1 2 3 4 5 6 7 8 9 10 11 12
$r2
[1] 1 2 3 4 5 6 10 11 12 7 8 9
Levels: 1 2 3 4 5 6 7 8 9 10 11 12
$r3
[1] 1 2 3 7 8 9 10 11 12 4 5 6
Levels: 1 2 3 4 5 6 7 8 9 10 11 12
$r4
[1] 4 5 6 7 8 9 10 11 12 1 2 3
Levels: 1 2 3 4 5 6 7 8 9 10 11 12
Is there any way to specify that the levels should match the order of the values in each row of the initial dataframe (it is transposed by the first piped command)? Right now each factor has the same order of levels, which is incorrect.
You need to specify the levels = argument inside factor():
lapply(data.frame(t(x)), function(v) factor(v, levels = unique(v)))
#$r1
# [1] 1 2 3 4 5 6 7 8 9 10 11 12
#Levels: 1 2 3 4 5 6 7 8 9 10 11 12
#$r2
# [1] 1 2 3 4 5 6 10 11 12 7 8 9
#Levels: 1 2 3 4 5 6 10 11 12 7 8 9
#$r3
# [1] 1 2 3 7 8 9 10 11 12 4 5 6
#Levels: 1 2 3 7 8 9 10 11 12 4 5 6
#$r4
# [1] 4 5 6 7 8 9 10 11 12 1 2 3
#Levels: 4 5 6 7 8 9 10 11 12 1 2 3
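To see what levels = unique(x) changes, it may help to look at a single vector. This is a small sketch using the first six values of row r4; the variable names are chosen only for the illustration.

```r
v <- c(4, 5, 6, 1, 2, 3)

f_default <- factor(v)                       # levels sorted by value
f_appear  <- factor(v, levels = unique(v))   # levels in order of appearance

levels(f_default)     # "1" "2" "3" "4" "5" "6"
levels(f_appear)      # "4" "5" "6" "1" "2" "3"
as.integer(f_appear)  # 1 2 3 4 5 6
```

The values stored in the factor are the same in both cases; only the level order, and therefore the underlying integer codes, differs.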

Converting multiple histogram frequency count into an array in R

For each row in the matrix "result" shown below
A B C D E F G H I J
1 4 6 3 5 9 9 9 3 4 4
2 5 7 5 5 8 8 8 7 4 5
3 7 5 4 4 7 9 7 4 4 5
4 6 6 6 6 8 9 8 6 3 6
5 4 5 5 5 8 8 7 4 3 7
6 7 9 7 6 7 8 8 5 7 6
7 5 6 6 5 8 8 7 3 3 5
8 6 7 4 5 8 9 8 4 6 5
9 6 8 8 6 7 7 7 7 6 6
I would like to plot a histogram for each row with 3 bins as shown below:
samp<-result[1,]
hist(samp, breaks = 3, col="lightblue", border="pink")
Now I need to convert the histogram frequency counts into an array, as follows.
Suppose there are 4 bins, where the first bin has count = 5, the second bin has count = 2, and the fourth bin has count = 3. For every row of result, I want the counts of all of these bins as one vector:
row1 5 2 0 3
For hundreds of rows I would like to do it in an automated way and hence posted this question.
In the end the matrix should look like
bin 2-4 bin 4-6 bin 6-8 bin 8-10
row 1 5 2 0 3
row 2
row 3
row 4
row 5
row 6
row 7
row 8
row 9
DF <- read.table(text="A B C D E F G H I J
1 4 6 3 5 9 9 9 3 4 4
2 5 7 5 5 8 8 8 7 4 5
3 7 5 4 4 7 9 7 4 4 5
4 6 6 6 6 8 9 8 6 3 6
5 4 5 5 5 8 8 7 4 3 7
6 7 9 7 6 7 8 8 5 7 6
7 5 6 6 5 8 8 7 3 3 5
8 6 7 4 5 8 9 8 4 6 5
9 6 8 8 6 7 7 7 7 6 6", header=TRUE)
m <- as.matrix(DF)
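# plot = FALSE skips drawing; the element holding the frequencies is named "counts"
apply(m, 1, function(x) hist(x, breaks = 3, plot = FALSE)$counts)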
# $`1`
# [1] 5 2 0 3
#
# $`2`
# [1] 5 0 2 3
#
# $`3`
# [1] 6 3 1
#
# $`4`
# [1] 1 6 2 1
#
# $`5`
# [1] 3 3 4
#
# $`6`
# [1] 3 4 2 1
#
# $`7`
# [1] 2 5 3
#
# $`8`
# [1] 6 3 1
#
# $`9`
# [1] 4 4 0 2
Note that according to the documentation the number of breaks is only a suggestion. If you want to have the same number of breaks in all rows, you should do the binning outside of hist:
breaks <- 1:5*2
t(apply(m,1,function(x) table(cut(x,breaks,include.lowest = TRUE))))
# [2,4] (4,6] (6,8] (8,10]
# 1 5 2 0 3
# 2 1 4 5 0
# 3 4 2 3 1
# 4 1 6 2 1
# 5 3 3 4 0
# 6 0 3 6 1
# 7 2 5 3 0
# 8 2 4 3 1
# 9 0 4 6 0
You could access the counts vector which is returned by hist (see ?hist for details):
counts <- hist(samp, breaks = 3, col="lightblue", border="pink")$counts
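If hist itself is preferred over cut, passing an explicit vector of break points (rather than a single number) fixes the bins for every row. The following sketch uses only the first two rows of the data and assumes the bin edges 2, 4, 6, 8, 10 from the answer above.

```r
# First two rows of the example data
m <- rbind(`1` = c(4, 6, 3, 5, 9, 9, 9, 3, 4, 4),
           `2` = c(5, 7, 5, 5, 8, 8, 8, 7, 4, 5))

breaks <- seq(2, 10, by = 2)

# A vector of break points is used as given, so every row gets the same 4 bins
counts <- t(apply(m, 1, function(x) hist(x, breaks = breaks, plot = FALSE)$counts))
counts
```

With its defaults, hist uses right-closed bins and includes the lowest value in the first bin, which is the same binning as cut(x, breaks, include.lowest = TRUE).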

Sequentially reorganize a vector in R

I have a numeric vector z as below:
> sort(z)
[1] 1 5 5 5 6 6 7 7 7 7 7 9 9
I would like to renumber this vector sequentially so as to have
> z
[1] 1 2 2 2 3 3 4 4 4 4 4 5 5
I guess converting z to a factor and using it as an index should be the way.
You answered it yourself really:
as.integer(factor(sort(z)))
I know this has been accepted already but I decided to look inside factor() to see how it's done there. It more or less comes down to this:
x <- sort(z)
match(x, unique(x))
Which is an extra line I suppose but it should be faster if that matters.
This should do the trick. Count the positions where the sorted value changes (diff(z) != 0), so that gaps between values, such as 1 followed by 5, still increase the rank by only one:
z = sort(sample(1:10, 100, replace = TRUE))
cumsum(diff(z) != 0) + 1
[1] 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3
[26] 3 3 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6
[51] 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 7 8 8 8 8 8 8 8 8 8
[76] 8 8 8 8 8 9 9 9 9 9 9 9 9 9 9 9 9 9 9 10 10 10 10 10
Note that diff omits the first element of the series. So to compensate:
c(1, cumsum(diff(z) != 0) + 1)
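Alternative using rle (this reproduces the sorted values themselves, so it only yields consecutive ranks when the values present are exactly 1, 2, 3, ..., as in this sample):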
z = sort(sample(1:10, 100, replace = TRUE))
rle_result = rle(sort(z))
rep(rle_result$values, rle_result$lengths)
> rep(rle_result$values, rle_result$lengths)
[1] 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3
[26] 3 3 3 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 6 6 6
[51] 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 7 8 8 8 8 8 8 8 8
[76] 8 8 8 8 8 8 9 9 9 9 9 9 9 9 9 9 9 9 9 9 10 10 10 10 10
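To get consecutive ranks regardless of the actual values, number the runs instead (x is the sorted vector):
r <- rle(x)
rep(seq_along(r$lengths), r$lengths)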
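As a quick check, the approaches above can be run side by side on the vector from the question. This is only a sketch; the by_* variable names are made up for the comparison.

```r
z <- c(1, 5, 5, 5, 6, 6, 7, 7, 7, 7, 7, 9, 9)
x <- sort(z)

by_factor <- as.integer(factor(x))            # factor codes
by_match  <- match(x, unique(x))              # position among the unique values
by_diff   <- c(1L, cumsum(diff(x) != 0) + 1L) # count the value changes

r <- rle(x)
by_rle <- rep(seq_along(r$lengths), r$lengths) # number the runs

by_factor
# [1] 1 2 2 2 3 3 4 4 4 4 4 5 5
```

All four give the same result here because x is sorted; on unsorted input the diff and rle versions number runs rather than distinct values.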

Sequence expansion question

I have a sequence of 'endpoints', e.g.:
c(7,10,5,11,15)
that I want to expand to a sequence of 'elapsed time' between the endpoints, e.g.
c(7,1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,1,2,3,4,5,6,7,8,9,10,11,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15)
What's the most efficient way to do this in R? I'm imagining some creative use of the embed function, but I can't quite get there without using an ugly for loop.
Here's the naive way to do this:
expandSequence <- function(x) {
out <- x[1]
for (y in (x[-1])) {
out <- c(out,seq(1,y))
}
return(out)
}
expandSequence(c(7,10,5,11,15))
There is a base function to do this, called, wait for it, sequence:
sequence(c(7,10,5,11,15))
[1] 1 2 3 4 5 6 7 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 1 2 3
[26] 4 5 6 7 8 9 10 11 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
In your case it seems your first endpoint is in fact not part of the sequence, so it becomes:
c(7, sequence(c(10,5,11,15)))
[1] 7 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 1 2 3 4 5 6 7 8 9
[26] 10 11 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
How about this (with x <- c(7,10,5,11,15)):
> unlist(sapply(x,seq))
[1] 1 2 3 4 5 6 7 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 1 2
[25] 3 4 5 6 7 8 9 10 11 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
With the first element added on at the end:
c( x[1], unlist( sapply( x[seq(2,length(x))], seq ) ) )
And a slightly more readable version:
library(taRifx)
c( x[1], unlist( sapply( shift(x,wrap=FALSE), seq ) ) )
A combination of lapply() and seq_len() is useful here:
expandSequence <- function(x) {
out <- lapply(x[-1], seq_len)
do.call(c, c(x[1], out))
}
Which gives for
pts <- c(7,10,5,11,15)
> expandSequence(pts)
[1] 7 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 1 2 3 4
[21] 5 6 7 8 9 10 11 1 2 3 4 5 6 7 8 9 10 11 12 13
[41] 14 15
(An alternative is:
expandSequence <- function(x) {
out <- lapply(x[-1], seq_len)
unlist(c(x[1], out), use.names = FALSE)
}
)
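For comparison, the same expansion can be written with plain vector arithmetic and no apply-style call. This is a sketch of one way to do it, not how sequence is implemented; expand_endpoints is a hypothetical name, and the endpoints after the first are assumed to be positive whole numbers.

```r
# Hypothetical helper: keep the first endpoint, then count 1..n for each later one.
expand_endpoints <- function(x) {
  n <- x[-1]
  if (length(n) == 0) return(x)
  # Number of positions already used before each run starts
  starts <- cumsum(c(0, head(n, -1)))
  # A running index minus each run's offset restarts the count at 1
  c(x[1], seq_len(sum(n)) - rep(starts, n))
}

pts <- c(7, 10, 5, 11, 15)
res <- expand_endpoints(pts)
length(res)
# [1] 42
```

The result has one leading element plus 10 + 5 + 11 + 15 = 41 counted elements, matching the output of c(7, sequence(c(10,5,11,15))).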
