Variable sample upper value in R - r

I have the following matrix
m <- matrix(c(2, 4, 3, 5, 1, 5, 7, 9, 3, 7), nrow=5, ncol=2)
colnames(m) = c("Y","Z")
m <-data.frame(m)
I am trying to create a random number in each row where the upper limit is based on a variable's value (in this case 1*Y, based on each row's value for Z).
I currently have:
samp <- function(x) {
  sample(0:1, 1, replace = TRUE)
}
m$randoms <- apply(m, 1, samp)
which works well, applying the sample function independently to each row, but I always get an error when I try to alter the x in sample. I thought I could do something like this:
samp <- function(x) {
  sample(0:m$Z, 1, replace = TRUE)
}
m$randoms <- apply(m, 1, samp)
but I guess that was wishful thinking.
Ultimately I want the result:
Y Z randoms
2 5 4
4 7 7
3 9 3
5 3 1
1 7 6
Any ideas?

The following will sample from 0 to x$Y for each row, and store the result in randoms:
x$randoms <- sapply(x$Y + 1, sample, 1) - 1
Explanation:
The sapply takes each value in x$Y separately (let's call this y), and calls sample(y + 1, 1) on it.
Note that (e.g.) sample(y+1, 1) will sample 1 random integer from the range 1:(y+1). Since you want a number from 0 to y rather than 1 to y + 1, we subtract 1 at the end.
Also, just pointing out - no need for replace=T here because you are only sampling one value anyway, so it doesn't matter whether it gets replaced or not.
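As a quick sanity check of the approach above, here is a minimal sketch that assumes a data frame named x with columns Y and Z as in the question. It uses sample.int(), which always treats its first argument as the upper end of 1:n, and then confirms that every draw lies between 0 and that row's Y. Swap x$Y for x$Z if Z is the intended upper limit.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# Assumed example data, mirroring the question
x <- data.frame(Y = c(2, 4, 3, 5, 1), Z = c(5, 7, 9, 3, 7))

# sample.int(y + 1, 1) draws one integer from 1:(y + 1);
# subtracting 1 shifts the range to 0:y
x$randoms <- vapply(x$Y, function(y) sample.int(y + 1, 1) - 1L, integer(1))

# every draw should be within [0, Y] for its own row
in_bounds <- all(x$randoms >= 0 & x$randoms <= x$Y)
print(x)
print(in_bounds)
```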

Based on @mathematical.coffee's suggestion and my edited example, this is the slick final result:
m <- matrix(c(2, 4, 3, 5, 1, 5, 7, 9, 3, 7), nrow=5, ncol=2)
colnames(m) = c("Y","Z")
m <- data.frame(m)
# draw one integer from 0 to Z for each row
m$randoms <- sapply(m$Z + 1, sample, 1) - 1

Related

r sequence problem - max number of changes in a given sequence

Can somebody help me understand a CS problem.
The problem is the New York Time Rollercoaster problem.
I have a queue:
queue <- seq(from = 1, to = 5)
1 2 3 4 5
A person can bribe another person who is ahead of them in the queue, but only a maximum of 2 times. Thus a queue sequence might look like:
Ride: 1, 2, 3, 4, 5 # Original queue
Ride: 1, 2, 3, 5, 4 # 5 bribes number 4
Ride: 1, 2, 5, 3, 4 # 5 bribes number 3 and thus runs out of bribes and cannot move further (it does not state in the problem if 3 can "re-bribe" 5 so I assume they cannot).
Ride: 2, 1, 5, 3, 4 # 2 bribes number 1
So given the input c(1, 2, 3, 4, 5), what is the minimum number of swaps it would take to get to the final output, which would be c(2, 1, 5, 3, 4)?
Python code from here:
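def minimumBribes(q):
    moves = 0
    for pos, val in enumerate(q):
        # nobody can end up more than two places ahead of where they started
        if (val-1) - pos > 2:
            return "Too chaotic"
        # count people ahead of val who started behind val
        for j in range(max(0,val-2), pos):
            if q[j] > val:
                moves+=1
    return moves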
I am trying to re-create this in R and understand the solution.
Here's a way I think -
minimumBribes <- function(final_q) {
  change <- final_q - seq_along(final_q)
  if(any(change > 2)) return("Too chaotic!")
  sum(change[change > 0])
}
minimumBribes(final_q = c(2, 1, 5, 3, 4))
[1] 3
Explanation -
initial_q <- 1:5
final_q <- c(2, 1, 5, 3, 4)
# calculate change in position; +ve is gain and -ve is loss
change <- final_q - initial_q
[1] 1 -1 2 -1 -1
# it is clear that if some gained x posn combined then other(s) lost x posn combined
# i.e. sum of posn gains and losses will always be 0
# therefore, to get min total swaps, simply add either gains or losses
# which in a way implies the most direct path from initial_q to final_q
sum(change[change > 0])
[1] 3
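For comparison, here is a sketch of a more literal R port of the Python code quoted in the question, with the indices shifted to R's 1-based convention. The name minimum_bribes_port is made up for this example. One reason to keep the inner count: the position-difference shortcut agrees on the example queue, but on c(1, 2, 5, 3, 7, 8, 6, 4) it gives 6 while the Python logic gives 7, so the two are not equivalent in general.

```r
# Hypothetical name; a direct translation of the Python loop, not a tuned solution
minimum_bribes_port <- function(q) {
  moves <- 0
  for (pos in seq_along(q)) {
    val <- q[pos]
    # more than two places ahead of the starting spot means more than two bribes
    if (val - pos > 2) return("Too chaotic")
    # anyone who bribed val can be no further forward than one place
    # ahead of val's original position, so only scan from there up to pos - 1
    start <- max(1, val - 1)
    if (start < pos) {
      moves <- moves + sum(q[start:(pos - 1)] > val)
    }
  }
  moves
}

print(minimum_bribes_port(c(2, 1, 5, 3, 4)))
print(minimum_bribes_port(c(2, 5, 1, 3, 4)))
print(minimum_bribes_port(c(1, 2, 5, 3, 7, 8, 6, 4)))
```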

Creating shuffled numbers in R

As a result of seeing THIS EXAMPLE, I was wondering how I could create one set of 15 shuffled orderings of 1 through 4 in R?
On THIS Website, you can get 1 Set of 15 shuffled Numbers
Ranging: From 1 to 4
As an example, on my run I got:
Set #1:
3, 2, 2, 1, 1, 1, 3, 2, 2, 3, 2, 1, 3, 4, 1
Is there a way I can replicate the above in R?
If I understood your question correctly, a basic solution like the following comes to mind; it is simple, but it does the job.
size <- 40
vec <- sample(1:4, size = size, replace = TRUE)
while (length(unique(vec)) < 4) {
  vec <- sample(1:4, size = size, replace = TRUE)
}
vec
The while loop will not run for long, as it is very unlikely that a digit fails to appear in the random vector vec when you sample 40 times.
Of course you can change the size of your vector and the code will still work, unless you set size below 4; in that case the loop will go on indefinitely, because all four digits can never appear.
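If the goal is simply 15 draws from 1 to 4 with repeats allowed, as in the sample output in the question, a single call to sample() may be enough. This is a minimal sketch; whether the website guarantees that every value appears at least once is not known, so no such check is made here, and the seed is arbitrary.

```r
set.seed(123)  # arbitrary seed, only so the run can be repeated

# 15 independent draws from 1:4, repeats allowed
draws <- sample(1:4, size = 15, replace = TRUE)

# print in the same comma-separated layout as the example
cat(paste(draws, collapse = ", "), "\n")
```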

Extract first continuous sequence in vector

I have a vector:
as <- c(1,2,3,4,5,9)
I need to extract the first continuous sequence in the vector, starting at index 1, such that the output is the following:
1 2 3 4 5
Is there a smart function for doing this, or do I have to do something not so elegant like this:
a <- c(1,2,3,4,5,9)
is_continunous <- c()
for (i in 1:length(a)) {
  if(a[i+1] - a[i] == 1) {
    is_continunous <- c(is_continunous, i)
  } else {
    break
  }
}
continunous_numbers <- c()
if(is_continunous[1] == 1) {
  is_continunous <- c(is_continunous, length(is_continunous)+1)
  continunous_numbers <- a[is_continunous]
}
It does the trick, but I would expect that there is a function that can already do this.
It isn't clear whether you need the indices of the continuous sequence only when it starts at index one, or the first sequence wherever it begins.
In both cases, you need to start by checking the difference between adjacent elements:
d_as <- diff(as)
If you need the first sequence only if it starts at index 1:
if(d_as[1]==1) 1:(rle(d_as)$lengths[1]+1) else NULL
# [1] 1 2 3 4 5
rle gives the lengths and values of each run of consecutive identical values.
If you need the first continuous sequence, whatever the starting index is:
rle_d_as <- rle(d_as)
which(d_as==1)[1]+(0:(rle_d_as$lengths[rle_d_as$values==1][1]))
Examples (for the second option):
as <- c(1,2,3,4,5,9)
d_as <- diff(as)
rle_d_as <- rle(d_as)
which(d_as==1)[1]+(0:(rle_d_as$lengths[rle_d_as$values==1][1]))
#[1] 1 2 3 4 5
as <- c(4,3,1,2,3,4,5,9)
d_as <- diff(as)
rle_d_as <- rle(d_as)
which(d_as==1)[1]+(0:(rle_d_as$lengths[rle_d_as$values==1][1]))
# [1] 3 4 5 6 7
as <- c(1, 2, 3, 6, 7, 8)
d_as <- diff(as)
rle_d_as <- rle(d_as)
which(d_as==1)[1]+(0:(rle_d_as$lengths[rle_d_as$values==1][1]))
# [1] 1 2 3
A simple way to catch the sequence would be to find the diff of your vector and grab all elements with diff == 1 plus the very next element, i.e.
d1<- which(diff(as) == 1)
as[c(d1, d1[length(d1)]+1)]
NOTE
This will only work if you have only one sequence in your vector. To make it more general, I'd suggest creating a function like so:
get_seq <- function(vec){
  # positions where the next element is exactly one larger
  d1 <- which(diff(vec) == 1)
  if(all(diff(d1) == 1)){
    return(c(d1, d1[length(d1)]+1))
  }else{
    # keep only the first run of adjacent positions
    d2 <- split(d1, cumsum(c(1, diff(d1) != 1)))[[1]]
    return(c(d2, d2[length(d2)]+1))
  }
}
#testing it
as <- c(3, 5, 1, 2, 3, 4, 9, 7, 5, 4, 5, 6, 7, 8)
get_seq(as)
#[1] 3 4 5 6
as <- c(8, 9, 10, 11, 1, 2, 3, 4, 7, 8, 9, 10)
get_seq(as)
#[1] 1 2 3 4
as <- c(1, 2, 3, 4, 5, 6, 11)
get_seq(as)
#[1] 1 2 3 4 5 6
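For the exact case in the question, where the run must start at index 1 and the values themselves are wanted, a shorter sketch is possible. The helper name first_run is made up for this example; it cuts the vector at the first place where the step between neighbours is not 1.

```r
# Hypothetical helper: values of the leading run that increases by exactly 1
first_run <- function(x) {
  # positions i where x[i + 1] - x[i] is not 1
  breaks <- which(diff(x) != 1)
  # the run ends at the first break, or at the end of x if there is none
  end <- if (length(breaks) > 0) breaks[1] else length(x)
  x[seq_len(end)]
}

print(first_run(c(1, 2, 3, 4, 5, 9)))
print(first_run(c(1, 3, 4)))
```

If the first two elements are not consecutive, this returns just the first element, which may or may not be what is wanted.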

R - Randomly pick a number and do it over and over until a condition is achieved

I want to randomly pick a number from a vector with 8 elements that sums to 35. If the number is 0 look for another number. If the number is greater than 0, make this number -1. Do this in a loop until the sum of the vector is 20. How can I do this in R?
For example: vec<-c(2,3,6,0,8,5,6,5)
Pick a number from this list randomly and make the number -1 until the sum of the elements becomes 20.
I'm really not sure this is what you want, but based on what I understand of your question, here is my solution. You'll find most of the concepts and key functions in my script. Use that and help() to understand them and optimize it.
vec <- c(2, 3, 6, 0, 8, 5, 6, 5)
summ <- 0
new.vec <- NULL
iter <- 1
while (summ < 20) {
  selected <- sample(vec, 1)
  if (selected != 0) {
    new.vec[iter] <- selected - 1
    # advance only after storing a value, so new.vec never contains NA gaps
    iter <- iter + 1
  }
  summ <- sum(new.vec)
}
Try this:
vec <- c(2, 3, 6, 0, 8, 5, 6, 5)
#just setting the seed for reproducibility
set.seed(19)
tabulate(sample(rep(seq_along(vec),vec),20))
#[1] 0 2 4 0 4 5 3 2
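A more literal reading of the question can also be sketched as a loop: pick one of the non-zero elements at random, subtract 1 from it, and repeat until the sum reaches the target. This assumes "make this number -1" means "subtract 1" and that the vector holds non-negative integers; the function name decrement_until is made up for this example. Note that this picks uniformly among non-zero positions, which is not the same distribution as the tabulate approach above.

```r
# Hypothetical helper: decrement random positive elements until sum(vec) == target
decrement_until <- function(vec, target) {
  stopifnot(all(vec >= 0), target >= 0, sum(vec) >= target)
  while (sum(vec) > target) {
    positive <- which(vec > 0)
    # choose by position so a single remaining candidate is handled correctly
    i <- positive[sample.int(length(positive), 1)]
    vec[i] <- vec[i] - 1
  }
  vec
}

set.seed(19)  # arbitrary seed, only for reproducibility
start <- c(2, 3, 6, 0, 8, 5, 6, 5)
res <- decrement_until(start, 20)
print(res)
print(sum(res))
```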

Replace column if the same column name R

I have used mice package in R to impute some missing values in my data, but not for all variables. Now I would like to replace the columns from the original data with columns from the imputed data, if their column names are equal. Here is my function:
replace_imp <- function(data,impdata) {
  for(i in 1:length(impdata)){
    for(k in 1:length(data)){
      if(colnames(impdata)[i]==colnames(data)[k]){
        data[,k] <- imp_data[,i]
      }
    }
  }
}
But it does not seem to work, any help?
Starting with a minimal data set:
original <- data.frame(X=c(1, 1, 1), Y=c(2, 2, 2), Z=c(3, 3, 3))
imputed <- data.frame(A=c(2, 2, 2), Y=c(5, 5, 5), Z=c(1, 1, 1))
We should expect the original data frame to change its 'Y' and 'Z' columns to the imputed values. Let's create a function that finds all matching column names and, for every match, replaces the original's values with the imputed ones.
replace_imputed <- function(original, imputed){
  # names present in both data frames
  namestoChange <- colnames(imputed)[colnames(imputed) %in% colnames(original)]
  for(i in seq_along(namestoChange)){
    original[namestoChange[i]] <- imputed[namestoChange[i]]
  }
  return(original)
}
> replace_imputed(original, imputed)
X Y Z
1 1 5 1
2 1 5 1
3 1 5 1
Is this more or less what you were looking for?
original <- data.frame(X=c(1, 1, 1), Y=c(2, 2, 2), Z=c(3, 3, 3))
imputed <- data.frame(A=c(2, 2, 2), Y=c(5, 5, 5), Z=c(1, 1, 1))
original[names(imputed)] <- imputed
X Y Z A
1 5 1 2
1 5 1 2
1 5 1 2
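If the extra column A should not be added, one option is to restrict the assignment to the names the two data frames share. This is a minimal sketch using intersect(); the function name replace_common is made up for this example, and it assumes both data frames have the same number of rows in the same order.

```r
# Hypothetical helper: overwrite only the columns present in both data frames
replace_common <- function(original, imputed) {
  common <- intersect(names(original), names(imputed))
  original[common] <- imputed[common]
  original
}

original <- data.frame(X = c(1, 1, 1), Y = c(2, 2, 2), Z = c(3, 3, 3))
imputed <- data.frame(A = c(2, 2, 2), Y = c(5, 5, 5), Z = c(1, 1, 1))
result <- replace_common(original, imputed)
print(result)
```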

Resources