R Replicating until length is met

Say we have the following:
a=c( 1, 9, 5, 7, 8, 11)
length(a) ## 6
and I want to obtain:
a_desired=c( 1, 1, 9, 9, 5, 5, 7, 7, 8, 11)
length(a_desired) ## 10
Basically it stops replicating when it reaches the desired length, in this case 10.
If the desired length is 14,
a_desired=c( 1, 1, 1, 9, 9, 9, 5, 5, 7, 7, 8, 8, 11, 11)
Does anyone have a suggestion on how to obtain this, or perhaps a link to a similar question asked before? (I'm not too sure what keyword to look for.)

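You could write your own function to do something like this:
extend_to <- function(x, len) {
  stopifnot(len > 0)
  # every element is repeated at least this many times
  times <- len %/% length(x)
  each <- rep(times, length(x))
  # the first `more` elements get one extra copy
  more <- len - sum(each)
  if (more > 0) {
    each[seq_len(more)] <- each[seq_len(more)] + 1
  }
  rep(x, each)
}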
a <- c( 1, 9, 5, 7, 8, 11)
extend_to(a, 6)
# [1] 1 9 5 7 8 11
extend_to(a, 10)
# [1] 1 1 9 9 5 5 7 7 8 11
extend_to(a, 14)
# [1] 1 1 1 9 9 9 5 5 7 7 8 8 11 11
extend_to(a, 2)
# [1] 1 9
We use rep() with a vector of counts to repeat each element a certain number of times.

So if your sequence is currently of length M and you want length N > M, then you have these possibilities:
N <= 2M: double the first (N-M) items
2M < N <= 3M: triple the first (N-2M) items, double the rest
3M < N <= 4M: quadruple the first (N-3M) items, triple the rest.
and so on.
So first, divide the target length by the current length and take the floor: every item is repeated that many times. Then add one extra copy of each of the first N mod M items (the remainder).
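a <- c(1, 9, 5, 7, 8, 11)
m <- length(a)
n <- 10 # desired new length
k <- n %/% m # copies that every item gets
r <- n %% m # number of leading items that get one extra copy
# seq_len() gives an empty index when r is 0, whereas 1:r would give c(1, 0)
new_a <- append(
  rep(a[seq_len(r)], each = k + 1),
  rep(a[r + seq_len(m - r)], each = k))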
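The floor-and-remainder rule can also be written as a single rep() call with a vector of counts. This is only a sketch: extend_rep is a made-up name, and it assumes x is non-empty and n is a non-negative whole number.

```r
# extend_rep is a hypothetical helper name, not a base R function
extend_rep <- function(x, n) {
  m <- length(x)
  # every item gets n %/% m copies; the first n %% m items get one more
  rep(x, times = n %/% m + (seq_along(x) <= n %% m))
}

a <- c(1, 9, 5, 7, 8, 11)
extend_rep(a, 10)
# [1] 1 1 9 9 5 5 7 7 8 11
extend_rep(a, 14)
# [1] 1 1 1 9 9 9 5 5 7 7 8 8 11 11
```

Because the logical comparison is added to the base count, the case where n is an exact multiple of the length needs no special handling.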

Related

Finding Values Present in Two or More Unequal-Length Vectors

I have the following two numeric vectors:
A <- c(1, 3, 5, 7, 9)
B <- c(2, 3, 4, 5, 6, 10, 12, 13)
I want to generate a new vector C that contains the values that are present in both A and B (not the positions at which these values are found). The result should be:
C <- c(3, 5)
I also want to generate a vector D containing the values present in A but not present in B and a vector E containing the values present in B but not A.
D <- c(1, 7, 9)
E <- c(2, 4, 6, 10, 12, 13)
What is the best way to do this using base R? Thanks!

You can use the base R function intersect().
In addition, generally speaking I wouldn't use C as a variable name, as it is really close to c(), which might cause you problems.
A <- c(1, 3, 5, 7, 9)
B <- c(2, 3, 4, 5, 6, 10, 12, 13)
Inter <- intersect(A, B)
Inter
# [1] 3 5
For the opposite of intersect():
# taken from here: https://www.r-bloggers.com/outersect-the-opposite-of-rs-intersect-function/
outersect <- function(x, y) {
sort(c(setdiff(x, y),
setdiff(y, x)))
}
outersect(A, B)
# [1] 1 2 4 6 7 9 10 12 13

A <- c(1, 3, 5, 7, 9)
B <- c(2, 3, 4, 5, 6, 10, 12, 13)
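C <- A[!A %in% B] # values in A but not in B (called D in the question)
D <- B[!B %in% A] # values in B but not in A (called E in the question)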
Which yields
> C
[1] 1 7 9
> D
[1] 2 4 6 10 12 13
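All three vectors from the question can also be obtained with the base R set functions intersect() and setdiff(). A minimal sketch; the names common, only_A and only_B are just illustrative, chosen to stay clear of C and c():

```r
A <- c(1, 3, 5, 7, 9)
B <- c(2, 3, 4, 5, 6, 10, 12, 13)

common <- intersect(A, B) # values present in both A and B
only_A <- setdiff(A, B)   # values present in A but not in B
only_B <- setdiff(B, A)   # values present in B but not in A

common
# [1] 3 5
only_A
# [1] 1 7 9
only_B
# [1] 2 4 6 10 12 13
```

Note that setdiff() and intersect() drop duplicated values, whereas the %in% indexing above keeps them, so the two approaches only agree when the inputs have no repeats.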

Group matching numbers in random order in R

I'm working on a Monte-Carlo simulation type problem and need to generate a vector of repeated random numbers, with the matching numbers grouped together, but in random order.
It's easier to explain with an example. If I had:
1, 3, 7, 12, 1, 3, 7, 12, 1, 3, 7, 12
I would like it sorted as:
7, 7, 7, 3, 3, 3, 12, 12, 12, 1, 1, 1 (or with the groups of matching numbers in any order but ascending/descending).
The reason I need the random order is because my MC simulation is for 2 variables, so if both are in order they won't vary independently.
I've got as far as:
sort(rep(runif(50,1,10),10), decreasing = FALSE)
Which generates 50 random numbers between 1 and 10, repeats each 10 times, then sorts the 50 groups of 10 matching random numbers in ascending order (or it could easily be descending order if I changed "FALSE" to "TRUE"). I just can't figure out the last step of getting 50 groups of 10 matching numbers in random order. Can anyone help?

Here is one option with split():
unlist(sample(split(v1, v1)), use.names = FALSE)
# [1] 3 3 3 1 1 1 12 12 12 7 7 7 (one possible result; the group order is random)
Or another option is match() with unique():
v1[order(match(v1, sample(unique(v1))))]
Data:
v1 <- c(1, 3, 7, 12, 1, 3, 7, 12, 1, 3, 7, 12)

An option could be:
v <- c(1, 3, 7, 12, 1, 3, 7, 12, 1, 3, 7, 12)
# split by value; split(v, unique(v)) would only work here because v repeats in a fixed pattern
lst <- split(v, v)
# returns a matrix because all groups have the same size; the column order is random
sapply(sample(seq_along(lst)), function(i) lst[[i]])
# [,1] [,2] [,3] [,4]
#[1,] 3 12 7 1
#[2,] 3 12 7 1
#[3,] 3 12 7 1
# or, to get just a vector
as.vector(sapply(sample(seq_along(lst)), function(i) lst[[i]]))
#[1] 3 3 3 12 12 12 7 7 7 1 1 1
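For the simulation described in the question, it may be simpler to skip the sorting step altogether: runif() already returns its draws in random order, so repeating each draw with rep(..., each = ) keeps matching numbers together without putting the groups in ascending or descending order. A minimal sketch, assuming the 50 draws are distinct (practically always true for runif); the variable names and the seed are only for illustration:

```r
set.seed(42) # only so that the example is reproducible

draws <- runif(50, 1, 10)        # 50 random numbers between 1 and 10
grouped <- rep(draws, each = 10) # each draw repeated 10 times, groups left in draw order

length(grouped)
# [1] 500
# every run of identical values has length 10
all(rle(grouped)$lengths == 10)
# [1] TRUE
```

If the vector already exists in sorted form, the split() or match() approaches above are the way to shuffle the groups after the fact.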

Merging two consecutive values

I have a vector of different values, and I would like to merge and add two values together if a 5 is followed by a 3.
Input:
vector <- c(1, 2, 7, 4, 3, 8, 5, 3, 2, 6, 9, 4, 4, 5, 6, 2, 6, 5, 3)
Expected output:
1 2 7 4 3 8 8 2 6 9 4 4 5 6 2 6 8
So as you can see, the two occurrences of a three following a 5 have been added together to show 8. I'm sure there is a simple function that will do this in a matter of seconds, I just wasn't able to find it.
Thanks in advance!

vector <- c(1, 2, 7, 4, 3, 8, 5, 3, 2, 6, 9, 4, 4, 5, 6, 2, 6, 5, 3)
# flag positions where a 5 is followed by a 3 (one element shorter than vector)
fives <- head(vector, -1) == 5 & tail(vector, -1) == 3
# add three to those fives; pad with FALSE so the logical index is not recycled
vector[c(fives, FALSE)] <- vector[c(fives, FALSE)] + 3
# remove the threes that followed them
vector <- vector[c(TRUE, !fives)]
vector
# [1] 1 2 7 4 3 8 8 2 6 9 4 4 5 6 2 6 8

Here is one possibility:
x <- c(1, 2, 7, 4, 3, 8, 5, 3, 2, 6, 9, 4, 4, 5, 6, 2, 6, 5, 3)
A <- rbind(x[-length(x)], x[-1])
id <- which( colSums( abs(A - c(5, 3)) ) == 0 )
x[rbind(id, id + 1L)] <- c(8, NA)
na.omit(x)
This solution was proposed to make it easier to extend to general cases. (It may not best meet the OP's need, but I did it as an exercise.)
In general, if you want to match a chunk xc in a vector x, you can do:
A <- t(embed(x, length(xc)))
id <- which(colSums(abs(A - rev(xc))) == 0)
Now id gives you the starting index of the matching chunk in x.
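To make the general version easier to try out, the two lines can be wrapped in a small function. This is just a sketch: find_chunk is a made-up name, and it assumes numeric vectors without NA values and a chunk that is no longer than x.

```r
# find_chunk is a hypothetical helper name, not a base R function
find_chunk <- function(x, xc) {
  stopifnot(length(xc) >= 1, length(xc) <= length(x))
  # each column of A is one window of x; embed() stores it in reverse order
  A <- t(embed(x, length(xc)))
  # a window matches when all element-wise differences from rev(xc) are zero
  which(colSums(abs(A - rev(xc))) == 0)
}

x <- c(1, 2, 7, 4, 3, 8, 5, 3, 2, 6, 9, 4, 4, 5, 6, 2, 6, 5, 3)
find_chunk(x, c(5, 3))
# [1] 7 18
find_chunk(x, c(4, 4, 5))
# [1] 12
```

The chunk is reversed because embed() puts the most recent value of each window in the first column.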

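vector <- c(1, 2, 7, 4, 3, 8, 5, 3, 2, 6, 9, 4, 4, 5, 6, 2, 6, 5, 3)
# find indexes of 5s followed by 3s, comparing two equal-length shifted copies
# (vector == 5 against vector[-1] == 3 would recycle and raise a warning)
temp <- rev(which(head(vector, -1) == 5 & tail(vector, -1) == 3))
# work from the back so that earlier indexes stay valid after each removal
for (i in temp) {
  vector <- vector[-(i + 1)] # remove the three
  vector[i] <- 8 # replace the five with an eight
}
vector
# [1] 1 2 7 4 3 8 8 2 6 9 4 4 5 6 2 6 8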

Exchange two elements of a vector in one call

I have a vector c(9,6,3,4,2,1,5,7,8), and I want to switch the elements at index 2 and at index 5 in the vector. However, I don't want to have to create a temporary variable and would like to make the switch in one call. How would I do that?

How about just x[c(i,j)] <- x[c(j,i)]? Similar to replace(...), but perhaps a bit simpler.
swtch <- function(x,i,j) {x[c(i,j)] <- x[c(j,i)]; x}
swtch(c(9,6,3,4,2,1,5,7,8) , 2,5)
# [1] 9 2 3 4 6 1 5 7 8

You could use replace().
x <- c(9, 6, 3, 4, 2, 1, 5, 7, 8)
replace(x, c(2, 5), x[c(5, 2)])
# [1] 9 2 3 4 6 1 5 7 8
And if you don't even want to assign x, you can use
replace(
c(9, 6, 3, 4, 2, 1, 5, 7, 8),
c(2, 5),
c(9, 6, 3, 4, 2, 1, 5, 7, 8)[c(5, 2)]
)
# [1] 9 2 3 4 6 1 5 7 8
but that's a bit silly. You will probably want x assigned to begin with.

If you actually want to do it without creating a temporary copy of the vector, you would need to write a short C function.
library(inline)
swap <- cfunction(c(i = "integer", j = "integer", vec="integer"),"
int *v = INTEGER(vec);
int ii = INTEGER(i)[0]-1, jj = INTEGER(j)[0]-1;
int tmp = v[ii];
v[ii] = v[jj];
v[jj] = tmp;
return R_NilValue;
")
vec <- as.integer(c(9,6,3,4,2,1,5,7,8))
swap(2L, 5L, vec)
vec
# [1] 9 2 3 4 6 1 5 7 8

Updating empirical cumulative function

I have the following problem:
Given a stream of observations, find the number of observations so far that are less than or equal to the most recent observation. For example, if the streaming observations are
8, 1, 10, 3, 9, 7, 4, 5, 6, 2
then we have the following updates:
Observations - 8: there is 1 observation less than or equal to 8
Observations - 8, 1: there is 1 observation less than or equal to 1
Observations - 8, 1, 10: there are 3 observations less than or equal to 10
...
As a result one would obtain these values:
1, 1, 3, 2, 4, 3, 3, 4, 5, 2
The solution should be very fast, as I am working with a huge dataset.

Use a for loop, but in the reverse direction. I haven't tested it, but I think it is faster.
xx <- c(8, 1, 10, 3, 9, 7, 4, 5, 6, 2)
res = vector('integer',length=length(xx))
for (i in rev(seq_along(xx))) {
res[i] <- sum(xx[i]>=xx)
xx <- xx[-i]
}
res
# [1] 1 1 3 2 4 3 3 4 5 2

You can use sapply:
vec <- c(8, 1, 10, 3, 9, 7, 4, 5, 6, 2)
sapply(seq_along(vec), function(x) sum(vec[seq(x)] <= vec[x]))
# [1] 1 1 3 2 4 3 3 4 5 2
Since performance is important, you can also use vapply. It might be faster (untested):
vapply(seq_along(vec), function(x) sum(vec[seq(x)] <= vec[x]), integer(1))
# [1] 1 1 3 2 4 3 3 4 5 2

So I couldn't leave well enough alone, so I created a kludgemonster
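carl <- function(vec) {
  vlen <- length(vec)
  newct <- vector('integer', length = vlen)
  for (j in seq_len(vlen)) {
    # positions at or after j whose value is at least vec[j]
    wins <- which(vec[j:vlen] >= vec[j]) + j - 1
    newct[wins] <- newct[wins] + 1L
  }
  newct # return the counts (a bare for loop would return NULL)
}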
It appears to work, but...
Rgames> set.seed(20)
Rgames> vec<-runif(2000)
Rgames> microbenchmark(carl(vec),agstudy(vec),times=10)
Unit: milliseconds
         expr      min       lq   median       uq      max neval
    carl(vec) 86.75314 87.55323 88.16816 88.80831 89.65117    10
 agstudy(vec) 70.26213 70.83771 71.06158 71.72247 71.93800    10
Still not quite as good as agstudy's code. Maybe someone can tighten up my loop?
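For really long streams, the pairwise comparisons in the answers above grow quadratically with the number of observations. One possible alternative is a Fenwick tree (binary indexed tree) over the ranks of the values, which needs roughly n log n steps. The sketch below has not been benchmarked: running_le_count is a made-up name, it assumes there are no NA values, and for short vectors the vectorised versions may well be faster because of R's loop overhead.

```r
# running_le_count is a hypothetical helper name, not taken from any package
running_le_count <- function(x) {
  n <- length(x)
  if (n == 0L) return(integer(0))
  u <- sort(unique(x))
  r <- match(x, u)   # dense rank of each observation, 1..k
  k <- length(u)
  tree <- integer(k) # Fenwick tree holding counts per rank
  res <- integer(n)
  for (i in seq_len(n)) {
    # record the new observation: add 1 at rank r[i]
    p <- r[i]
    while (p <= k) {
      tree[p] <- tree[p] + 1L
      p <- p + bitwAnd(p, -p) # move to the next node covering this rank
    }
    # count observations seen so far with rank <= r[i]
    p <- r[i]
    s <- 0L
    while (p > 0L) {
      s <- s + tree[p]
      p <- p - bitwAnd(p, -p) # strip the lowest set bit
    }
    res[i] <- s
  }
  res
}

running_le_count(c(8, 1, 10, 3, 9, 7, 4, 5, 6, 2))
# [1] 1 1 3 2 4 3 3 4 5 2
```

Ties are counted as "less than or equal" because equal values share the same rank. If this still turns out to be too slow, the same loop is a natural candidate for moving to compiled code.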
