I want to reorder a vector of 200 values, and I am using sample, repeat and if to do so:
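library(tibble)
x <- tibble(returns = rnorm(200, mean = 0.06, sd = 0.20))
x$ret_coef <- 1 + x$returns
x$ret <- cumprod(x$ret_coef) - 1
reorder1 <- function(x) {
  repeat {
    temp <- tibble(
      ret = sample(x$ret, 200)
    )
    # stop once the 200th and 180th values sum to a negative number
    if (sum(temp$ret[200], temp$ret[180]) < 0) break
  }
  temp
}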
Unfortunately, the new vector never fulfills the if-condition.
I figured it out:
It's important to set replace = TRUE:
sample(x$ret, 200, replace=TRUE)
It worked afterwards!
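To illustrate what replace changes, here is a minimal sketch with a made-up vector v (not the returns from the question): without replacement, sample() only permutes the existing values, while with replace = TRUE values can repeat and others can drop out.

```r
set.seed(123)  # arbitrary seed, only for reproducibility
v <- c(-0.3, 0.1, 0.2, 0.4, 0.5)  # hypothetical values

perm <- sample(v, length(v))                  # a permutation: same values, new order
boot <- sample(v, length(v), replace = TRUE)  # values may repeat or be missing

identical(sort(perm), sort(v))  # TRUE: nothing added or lost
all(boot %in% v)                # TRUE: still drawn from v
```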
Recently, I learned how to write a loop that initializes some number, and then randomly generates numbers until the initial number is guessed (while recording the number of guesses it took) such that no number will be guessed twice:
# https://stackoverflow.com/questions/73216517/making-sure-a-number-isnt-guessed-twice
all_games <- vector("list", 100)
for (i in 1:100) {
  guess_i <- 0
  correct_i <- sample(1:100, 1)
  guess_sets <- 1:100 ## initialize a set
  trial_index <- 0 ## no guesses made yet
  while (guess_i != correct_i) {
    ## sample from this set by index, which stays safe when one value is left
    guess_i <- guess_sets[sample.int(length(guess_sets), 1)]
    guess_sets <- setdiff(guess_sets, guess_i) ## remove it from the set
    trial_index <- trial_index + 1
  }
  ## no need to store `i` and `guess_i` (as same as `correct_i`), right?
  game_results_i <- data.frame(i, trial_index, guess_i, correct_i)
  all_games[[i]] <- game_results_i
}
all_games <- do.call("rbind", all_games)
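A caveat when sampling from a set that shrinks to a single number, as guess_sets can in the loop above: for a length-one numeric x >= 1, sample(x, 1) draws from 1:x rather than returning x itself. The sketch below shows the pitfall and an index-based workaround; safe_sample is a hypothetical helper name, not part of base R.

```r
# Hypothetical helper: always picks one element of x, even when length(x) == 1.
safe_sample <- function(x) x[sample.int(length(x), 1)]

remaining <- 57  # a set with a single number left

draws_plain <- replicate(200, sample(remaining, 1))    # drawn from 1:57
draws_safe  <- replicate(200, safe_sample(remaining))  # always the element itself

range(draws_plain)
unique(draws_safe)
```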
I am now trying to modify the above code to create the following two loops:
(Deterministic) Loop 1 always guesses the midpoint (rounded up) and is told whether the guess is smaller or bigger than the correct number. It then takes the midpoint again (e.g. between its guess and the floor/ceiling) until it reaches the correct number.
(Semi-Deterministic) Loop 2 first makes a random guess and is told whether the guess is bigger or smaller than the number. It then divides the difference in half and makes its next guess randomly in a smaller range. It repeats this process until it reaches the correct number.
I tried to write a sketch of the code:
#Loop 2:
correct = sample(1:100, 1)
guess_1 = sample(1:100, 1)
guess_2 = ifelse(guess_1 > correct, sample(50:guess_1, 1), sample(guess_1:100, 1))
guess_3 = ifelse(guess_2 > correct, sample(50:guess_2, 1), sample(guess_2:100, 1))
guess_4 = ifelse(guess_3 > correct, sample(50:guess_3, 1), sample(guess_3:100, 1))
#etc
But I am not sure if I am doing this correctly.
Can someone please help me with this?
Thank you!
Example : Suppose I pick the number 68
Loop 1: first random guess = 51, (100-51)/2 + 51 = 75, (75-50)/2 + 50 = 63, (75 - 63)/2 + 63 = 69, (69 - 63)/2 + 63 = 66, etc.
Loop 2: first random guess = 53, rand_between(53,100) = 71, rand_between(51,71) = 65, rand(65,71) = 70, etc.
I don't think you need a for loop for this; you can build the structures up front with sample, sapply and which:
## correct values can repeat, so we set replace to TRUE
corrects <- sample(1:100, 100, replace = TRUE)
## replace is FALSE by default in sample(), so guesses within a game never repeat
## sapply() creates a matrix with one column of guesses per game
guesses <- sapply(1:100, function(x) sample(1:100, 100))
## constructing game_results_i equal to yours, but could be simplified
game_results_i <- data.frame(
  i = 1:100,
  trial_index = sapply(
    1:100,
    function(x) which(
      ## which() returns the indices where the predicate is TRUE; each column is
      ## a permutation of 1:100, so exactly one index matches
      guesses[, x] == corrects[x]
    )
  ),
  guess_i = corrects,
  correct_i = corrects # guess_i and correct_i are obviously equal
)
OK, let's see if my answer now matches the question properly :)
If I understood your intentions correctly, both loops set increasingly tighter lower and upper bounds, so each guess reduces the search space. This interpretation does not always match your description, however, so please double-check that it is acceptable for your purposes.
I wrote two functions, guess_bisect for the deterministic loop_1 and guess_sample for loop_2:
guess_bisect <- function(correct, n = 100) {
  lb <- 0
  ub <- n + 1
  trial_index <- 1
  guess <- round((ub - lb) / 2) + lb
  while (guess != correct) {
    # cat(lb, ub, guess, "\n") # uncomment to print the guess iteration
    if (guess < correct) {
      lb <- guess
    } else {
      ub <- guess
    }
    guess <- round((ub - lb) / 2) + lb
    trial_index <- trial_index + 1
  }
  trial_index
}
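guess_sample <- function(correct, n = 100) {
  lb <- 0
  ub <- n + 1
  trial_index <- 1
  # pick by index: sample(k, 1) with a single number k would draw from 1:k
  candidates <- (lb + 1):(ub - 1)
  guess <- candidates[sample.int(length(candidates), 1)]
  while (guess != correct) {
    # cat(lb, ub, guess, "\n") # uncomment to print the guess iteration
    if (guess < correct) {
      lb <- guess
    } else {
      ub <- guess
    }
    candidates <- (lb + 1):(ub - 1)
    guess <- candidates[sample.int(length(candidates), 1)]
    trial_index <- trial_index + 1
  }
  trial_index
}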
Naturally, guess_bisect always produces the same result for the same input, whereas guess_sample varies randomly.
Plotting the results in a simple chart suggests that the deterministic bisection is much better on average, since random sampling may happen to pick its improvements from the wrong side. The x-axis is the correct number, spanning 1 to 100, and the y-axis is the trial index; guess_bisect gives the red curve, and repeated runs of guess_sample give the blue curves.
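The comparison can also be made numerically. The following is only a sketch: it restates both strategies in compact form so it runs on its own, and the seed, the 50 repetitions and the helper name pick are arbitrary choices of mine.

```r
# Compact restatements of the two strategies so this block is self-contained.
guess_bisect <- function(correct, n = 100) {
  lb <- 0; ub <- n + 1; trials <- 1
  guess <- round((ub - lb) / 2) + lb
  while (guess != correct) {
    if (guess < correct) lb <- guess else ub <- guess
    guess <- round((ub - lb) / 2) + lb
    trials <- trials + 1
  }
  trials
}

guess_sample <- function(correct, n = 100) {
  lb <- 0; ub <- n + 1; trials <- 1
  # hypothetical helper: uniform draw from the integers lo..hi
  pick <- function(lo, hi) lo + sample.int(hi - lo + 1, 1) - 1
  guess <- pick(lb + 1, ub - 1)
  while (guess != correct) {
    if (guess < correct) lb <- guess else ub <- guess
    guess <- pick(lb + 1, ub - 1)
    trials <- trials + 1
  }
  trials
}

set.seed(1)  # arbitrary seed
targets <- 1:100
bisect_trials <- sapply(targets, guess_bisect)
sample_trials <- replicate(50, sapply(targets, guess_sample))  # 100 x 50 matrix

mean(bisect_trials)  # average number of guesses, bisection
mean(sample_trials)  # average number of guesses, random narrowing
```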
Good evening,
I asked a question earlier and found it hard to implement the solution, so I am going to ask it again more clearly.
My problem is that I want to add a column to a data frame of daily returns of a stock. Let's say the returns are normally distributed, and I would like to add a column that contains the historical value at risk (VaR), computed with a function I wrote myself.
The restriction is that the function should be applied to each observation together with the 249 observations before it.
So when the next observation is calculated, it should again use only the 249 observations of the days before. The input window should move as time goes on; in other words, I want values from 251 days ago to be excluded. Hopefully I explained myself well enough. If not, maybe the code speaks for me:
df<- data.frame(Date=seq(ISOdate(2000,1,1), by = "days", length.out = 500), Returns=rnorm(500))
#function
VaR.hist<- function(x, n=250, hd=20, q=0.05){
width<-nrow(x)
NA.x<-na.omit(x)
quantil<-quantile(NA.x[(width-249):width],probs=q)
VaR<- quantil*sqrt(hd)%>%
return()
}
# Run the function on the dataframe
df$VaR<- df$Returns%>%VaR.hist()
Error in (width - 249):width : argument of length 0
This is the error message that I get instead of my new variable...
Thanks !!
As wibom wrote in the comments, nrow(x) does not work for vectors; what you need is length() instead. You also do not need return() in the last line, as R automatically returns the value of the last expression of a function if there is no earlier return().
library(dplyr)
df <- data.frame(Date = seq(ISOdate(2000, 1, 1), by = "days", length.out = 500), Returns = rnorm(500))
# function
VaR.hist <- function(x, n = 250, hd = 20, q = 0.05) {
  NA.x <- na.omit(x)
  # use length() as x is a vector; nrow() only works for data frames/matrices.
  # Measured after na.omit() so the index range stays inside NA.x.
  width <- length(NA.x)
  quantil <- quantile(NA.x[(width - 249):width], probs = q)
  quantil * sqrt(hd)
}
# Run the function on the dataframe
df$VaR <- df$Returns %>% VaR.hist()
It's a bit hard to understand what you want to do exactly.
My understanding is that you wish to compute a new variable VaR, calculated from the current and previous 249 observations of df$Returns, right?
Is this about what you wish to do?:
library(tidyverse)
set.seed(42)
df <- tibble(
  Date = seq(ISOdate(2000, 1, 1), by = "days", length.out = 500),
  Returns = rnorm(500)
)
the_function <- function(i, mydata, hd = 20, q = .05) {
  # current row plus the 249 rows before it (fewer near the start)
  r <-
    mydata %>%
    filter(ridx <= i, ridx > i - 250) %>%
    pull(Returns)
  quantil <- quantile(r, probs = q)
  quantil * sqrt(hd)
}
df <-
  df %>%
  mutate(ridx = row_number()) %>%
  mutate(VaR = map_dbl(ridx, the_function, mydata = .))
If you are looking for a base-R solution:
set.seed(42)
df <- data.frame(
  Date = seq(ISOdate(2000, 1, 1), by = "days", length.out = 500),
  Returns = rnorm(500)
)
a_function <- function(i, mydata, hd = 20, q = .05) {
  # current row plus the 249 rows before it (fewer near the start)
  r <- mydata$Returns[mydata$ridx <= i & mydata$ridx > (i - 250)]
  quantil <- quantile(r, probs = q)
  quantil * sqrt(hd)
}
df$ridx <- seq_len(nrow(df)) # add index
df$VaR <- sapply(df$ridx, a_function, mydata = df)
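If rows without a full 250-day history should stay empty rather than use a shorter window, a plain loop over window end points is one option. This is only a sketch under that assumption; rolling_var is a hypothetical helper name, and the defaults simply mirror the n, hd and q used in the question.

```r
# Hypothetical helper: VaR over a moving window of the last n observations.
rolling_var <- function(returns, n = 250, hd = 20, q = 0.05) {
  out <- rep(NA_real_, length(returns))  # NA until a full window exists
  if (length(returns) < n) return(out)
  for (i in n:length(returns)) {
    window <- returns[(i - n + 1):i]     # current value plus the n - 1 before it
    out[i] <- unname(quantile(window, probs = q, na.rm = TRUE)) * sqrt(hd)
  }
  out
}

set.seed(42)
df <- data.frame(
  Date = seq(ISOdate(2000, 1, 1), by = "days", length.out = 500),
  Returns = rnorm(500)
)
df$VaR <- rolling_var(df$Returns)
df[249:252, ]  # row 249 is still NA, row 250 is the first full window
```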
I'm trying to "pseudo-randomize" a vector in R using a while loop.
I have a vector delays with the elements that need to be randomized.
I am using sample on a vector value to index randomly into delays. I cannot have more than two identical values in a row, so I am trying to use an if/else statement. If the condition is met, the value should be added to random and removed from delays.
When I run the individual lines outside the loop they all work, but when I run the loop, one of the vectors is populated with NA_real_, and that stops the logical operators from working.
I'm probably not great at explaining this, but can anyone spot what I'm doing wrong? :)
delay_0 <- rep(0, 12)
delay_6 <- rep(6, 12)
delays <- c(delay_6, delay_0)
value <- c(1:24)
count <- 0
outcasts <- c()
random <- c(1,2)
while (length(random) < 27) {
count <- count + 1
b <- sample(value, 1, replace = FALSE)
a <- delays[b]
if(a == tail(random,1) & a == head(tail(random,2),1)) {
outcast <- outcasts + 1
}
else {
value <- value[-(b)]
delays <- delays[-(b)]
random <- c(random,a)
}
}
Two problems with your code:
b can take a value that is greater than the number of elements in delays. I fixed this by using sample(1:length(delays), 1, replace = FALSE)
The loop continues when delays is empty. You could either change length(random) < 27 to length(random) < 26, I think, or add the condition length(delays) > 0.
The code:
delay_0 <- rep(0, 12)
delay_6 <- rep(6, 12)
delays <- c(delay_6, delay_0)
value <- c(1:24)
count <- 0
outcasts <- 0 # number of rejected draws
random <- c(1, 2)
# note: this can still loop forever if every remaining delay equals the last two values
while (length(random) < 27 && length(delays) > 0) {
  count <- count + 1
  b <- sample(1:length(delays), 1, replace = FALSE)
  a <- delays[b]
  if (a == tail(random, 1) && a == head(tail(random, 2), 1)) {
    outcasts <- outcasts + 1
  } else {
    value <- value[-b]
    delays <- delays[-b]
    random <- c(random, a)
  }
}
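Because a draw-and-remove loop like this can reach a dead end when only values equal to the last two remain, a different technique is to shuffle the whole vector and reject orders that contain a run longer than two. This is a rejection-sampling sketch, not the approach from the answer; shuffle_max_run and its max_tries cap are hypothetical names and choices of mine.

```r
# Hypothetical helper: reshuffle until no value occurs more than max_run times in a row.
shuffle_max_run <- function(x, max_run = 2, max_tries = 10000) {
  for (attempt in seq_len(max_tries)) {
    candidate <- x[sample.int(length(x))]          # full random permutation
    if (max(rle(candidate)$lengths) <= max_run) {  # longest run of equal values
      return(candidate)
    }
  }
  stop("no valid order found within max_tries")
}

set.seed(1)  # arbitrary seed
delays <- c(rep(6, 12), rep(0, 12))
random <- shuffle_max_run(delays)
rle(random)$lengths  # every run has length 1 or 2
```

Most shuffles of this vector are rejected, so the cap guards against looping forever on inputs where no valid order exists.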
This is my code, with comments marking the parts where I think there is a problem.
set.seed(5623)
t_llegada <- (1:30)
t_viaje <- (1:30)
t_intervalo<- (1:30)
#Prob. morning
pllegadaM <- rexp(30,rate=0.81) #Prob
pviajeM <- rexp(30,rate=30.47) #Prob
pinterM<- rexp(30,rate=0.12) #Prob
#Prob afternoon
pllegadaT <- rexp(30,rate=0.096) #Prob
pviajeT <- rexp(30,rate=31.80) #Prob
pinterT<- rexp(30,rate=0.97) #Prob
#Prob night
pllegadaN <- rexp(30,rate=0.12) #Prob
pviajeN <- rexp(30,rate=32.12) #Prob
pinterN<- rexp(30,rate=0.9) #Prob
sim <-NULL
##Dimension variables:
minutos.dia<-numeric(600)
min.llegada <- minutos.dia
min.salida <- minutos.dia
tinterval <- minutos.dia
tservicio.llegada<-minutos.dia
##Sample time with probs
tintervalM <- sample(t_intervalo,size=1, replace=TRUE, prob= pinterM)
tllegadasM <- sample(t_llegada,size=1,replace = TRUE,prob=pllegadaM)
tviajeM <- sample(t_viaje,size=1,replace = TRUE,prob=pviajeM)
tintervalT <- sample(t_intervalo,size=1, replace=T, prob= pinterT)
tllegadasT <- sample(t_llegada,size=1,replace = TRUE,prob=pllegadaT)
tviajeT <- sample(t_viaje,size=1,replace = TRUE,prob=pviajeT)
tintervalN <- sample(t_intervalo,size=1, replace=T, prob= pinterN)
tllegadasN <- sample(t_llegada,size=1,replace = TRUE,prob=pllegadaN)
tviajeN <- sample(t_viaje,size=1,replace = TRUE,prob=pviajeN)
##Count first person
min.llegada[1]<- 1
tinterval[1]<- 1
min.salida[1]<-tinterval[1]+tviajeM[1]
###Save in data frame "Sim"
uno <- data.frame (caso = 1,
minuto_llegada = min.llegada[1],
minuto_inicio_del_viaje = tinterval[1],
Tiempo_viaje = tviajeM [1],
minuto_salida_del_cliente = min.salida[1])
sim <- rbind(sim, uno)
##Loop to assign probs according to the number of cases
for (c in 2:600) {
  tllegadasM[c] <- if (c < 300) {
    sample(t_llegada, size = 1, replace = TRUE, prob = pllegadaM) #VAL 2
  } else {
    sample(t_llegada, size = 1, replace = TRUE, prob = pllegadaT)
  }
  tviajeM[c] <- if (c < 300) {
    sample(t_viaje, size = 1, replace = TRUE, prob = pviajeM) #VAL 3
  } else {
    sample(t_viaje, size = 1, replace = TRUE, prob = pviajeT)
  }
  tintervalM[c] <- if (c < 300) {
    sample(t_intervalo, size = 1, replace = TRUE, prob = pinterM) #VAL 2
  } else {
    sample(t_intervalo, size = 1, replace = TRUE, prob = pinterT)
  }
}
I previously assigned the random numbers to the variable tintervalM, and in the second loop I am supposed to only pick the numbers from those random variables and sum them. I hope this is explained well enough and that someone can help.
#Loop for times
for (c in 3:600){
min.llegada[c]<-min.llegada[c-1]+tllegadasM[c] #VAL 1
tinterval[c]<-if(min.llegada[c-1]>tinterval[c-1]){
tinterval[c-1]+ tintervalM[c]+tintervalM[c+1]} #VAL 2 HERE IS THE PROBLEM
min.salida[c]<-tinterval[c]+tviajeM[c] #VAL 4
nuevo <- data.frame (caso = c,
minuto_llegada = min.llegada[c],#1
minuto_inicio_del_viaje = tinterval[c],#2
Tiempo_viaje = tviajeM [c],#3
minuto_salida_del_cliente = min.salida[c])#4
sim <- rbind(sim, nuevo)
}
I want to assign to tinterval[c] the sum of the previous value tinterval[c-1], the number generated by tintervalM[c] and the next number generated by tintervalM[c+1], if min.salida[c] is greater than tinterval[c], but I receive the error "replacement has length zero".
Although I don't fully understand what your code does, I think I can point to the error. This part will fail if the condition within if is not TRUE.
tinterval[c] <- if(min.llegada[c-1]>tinterval[c-1]){
tinterval[c-1] + tintervalM[c] + tintervalM[c+1]
} #VAL 2 HERE IS THE PROBLEM
If the condition is FALSE, the if expression returns NULL, which cannot be assigned to tinterval[c].
My guess is that it should be written as follows:
if( min.llegada[c-1] > tinterval[c-1] ){
tinterval[c] <- tinterval[c-1] + tintervalM[c] + tintervalM[c+1]
}
Now if the condition is FALSE, nothing happens.
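The behaviour described above can be reproduced in isolation. This is a minimal sketch with a made-up vector x; the tryCatch is only there so the message can be inspected without stopping the script.

```r
x <- numeric(3)

res <- if (FALSE) 1  # FALSE condition and no else branch
is.null(res)         # TRUE

# Assigning that NULL into one element fails; capture the message instead of stopping.
msg <- tryCatch({
  x[2] <- if (FALSE) 1
  "no error"
}, error = function(e) conditionMessage(e))
msg

# With an explicit else branch there is always a value to assign.
x[2] <- if (FALSE) 1 else x[1]
x
```

Which of the two fixes fits, moving the assignment inside the if or adding an else branch, depends on what tinterval[c] should hold when the condition is FALSE.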
Beginning R programmer here. I'm trying to run a function with the argument being the number of samples (user-defined) and the output being a vector of means of those samples.
Here is what I have so far; however, I only get one mean value returned. How do I alter the function so that I get a vector of means whose length depends on the number the user inputs?
Pop1 <- rnorm(500, mean = 0.5, sd = 0.2)
My_Func <- function(Samples) {
A <- sample(Pop1, size = 25, replace = TRUE)
for (i in 1:Samples) {
Means <- mean(A)
}
return(Means)
}
Using a for loop, it can be done like this. As @MrFlick mentioned, avoid overwriting the same variable on every iteration: draw the sample inside the loop and store each mean by index.
Pop1 <- rnorm(500, mean = 0.5, sd = 0.2)
My_Func <- function(Samples) {
  Means <- numeric(Samples)
  for (i in seq_len(Samples)) {
    A <- sample(Pop1, size = 25, replace = TRUE)
    Means[i] <- mean(A)
  }
  return(Means)
}
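The same result can be obtained without an explicit loop. As a sketch, replicate() evaluates the expression once per requested sample and collects the means into a vector; My_Func2 is a hypothetical name used only to keep it apart from the function above.

```r
Pop1 <- rnorm(500, mean = 0.5, sd = 0.2)

# Hypothetical variant of My_Func: one sample mean per replication.
My_Func2 <- function(Samples) {
  replicate(Samples, mean(sample(Pop1, size = 25, replace = TRUE)))
}

means <- My_Func2(10)
length(means)  # 10
```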