Constrained optimization of a complicated function - R

I need to optimize the following to find the maximum value for r1:
ad = 0.95*M_D + 0.28*G_D + 0.43*S_D + 2.25*Q_D
as = 0.017*M_A + 0.0064*G_A + 0.0076*S_A + 0.034*Q_A
ccb = 0.0093*M_CC+ 0.0028*G_CC + 0.0042*S_CC + 0.0186*Q_CC
ccd = 0.0223*M_CD + 0.0056*G_CD + 0.0078*S_CD + 0.0446*Q_CD
apb = 1.28*M_P + 2.56*Q_P
r1=(1+ccb*(1+ccd))*ad*as*100/(130-apb)
subject to the following constraints:
0 <= M_CD <= M_CC <= M_A <= M_D <= M_P <= 9
0 <= G_CD <= G_CC <= G_A <= G_D <= 9
0 <= S_CD <= S_CC <= S_A <= S_D <= 9
0 <= Q_CD <= Q_CC <= Q_A <= Q_D <= Q_P <= 3
The approach I've tried before doesn't work very well and I'm hoping to find a better solution.

Once the problem is stated correctly, you may be able to first map the parameters to lower and
upper bounds of [0, 1]. You can then implement the inequalities inside your function and optimise using an algorithm that accepts simple lower- and upper-bound constraints. nlminb could be used, but the vignette suggests its algorithm may not be the best.
UPDATE:
With the OP's revised function:
dumFun <- function(p){
p[1] -> M_CD; p[2] -> M_CC; p[3] -> M_A; p[4] -> M_D; p[5] -> M_P;
M_P <- 9*M_P; M_D <- M_P*M_D; M_A <- M_D*M_A; M_CC <- M_A*M_CC; M_CD <- M_CC*M_CD;
p[6] -> G_CD; p[7] -> G_CC; p[8] -> G_A; p[9] -> G_D;
G_D <- 9*G_D; G_A <- G_D*G_A; G_CC <- G_A*G_CC; G_CD <- G_CC*G_CD;
p[10] -> S_CD; p[11] -> S_CC; p[12] -> S_A; p[13] -> S_D;
S_D <- 9*S_D; S_A <- S_D*S_A; S_CC <- S_A*S_CC; S_CD <- S_CC*S_CD;
p[14] -> Q_CD; p[15] -> Q_CC; p[16] -> Q_A; p[17] -> Q_D; p[18] -> Q_P;
Q_P <- 3*Q_P; Q_D <- Q_P*Q_D; Q_A <- Q_D*Q_A; Q_CC <- Q_A*Q_CC; Q_CD <- Q_CC*Q_CD;
ad = 0.95*M_D + 0.28*G_D + 0.43*S_D + 2.25*Q_D
as = 0.017*M_A + 0.0064*G_A + 0.0076*S_A + 0.034*Q_A
ccb = 0.0093*M_CC+ 0.0028*G_CC + 0.0042*S_CC + 0.0186*Q_CC
ccd = 0.0223*M_CD + 0.0056*G_CD + 0.0078*S_CD + 0.0446*Q_CD
apb = 1.28*M_P + 2.56*Q_P
r1=(1+ccb*(1+ccd))*ad*as*100/(130-apb)
-r1
}
require(minqa)
p <- rep(.1, 18)
bobyqa(p, dumFun, lower = rep(0, length(p)), upper = rep(1, length(p)))
parameter estimates: 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
objective: -9.65605526502482
number of function evaluations: 97
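For comparison, the same box-constrained search can be run with nlminb from base R, as suggested above (a sketch reusing dumFun; it should reach the same all-ones optimum):
p <- rep(0.1, 18)
res <- nlminb(p, dumFun, lower = rep(0, length(p)), upper = rep(1, length(p)))
res$par         # expected: all ones
-res$objective  # expected: about 9.656, matching bobyqa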

I finally solved my problem not with vectorization but with C. My program, containing 14 nested loops, executes 100 to 1000 times faster in C than in R! Which is sad, because I didn't learn anything new from it, and it proves that R can be pretty useless on some problems, but what can we do.

Related

How is it possible to collect NLexpression in a loop?

Would you please help me with how I can collect some NLexpressions in a loop?
I want to keep k8 and k9 for all i=1:10 as scenarios, meaning that at the end of the loop Q equals the collection of k8 and k9 under each scenario (i). I couldn't define a matrix and store each pair of k8 and k9 in it as an element, and the code doesn't work with Q included either.
Many thanks for your kind help.
using JuMP, CPUTime, Distributions, Ipopt, Juniper, CPLEX
n1=1; # first scenario
N=4; # number of scenarios
M=20; # number of samples
landa=0.01;
E=0.05
T0=0;
T1=2;
T2=2;
gam2=1; gam1=1;
a1=0.5; a2=0.1; a3=50; ap=25;
c0=10;
Zn=zeros(N, 4)
Q=0;
for i in n1:N
C1=rand(100:100:300);
sig=rand(0.5:0.5:2);
f(x) = cdf(Normal(0, 1), x);
#---------------------------------------------------------------------------
ALT= Model(optimizer_with_attributes(Juniper.Optimizer, "nl_solver"=>optimizer_with_attributes(Ipopt.Optimizer, "print_level" => 0),
"mip_solver"=>optimizer_with_attributes(Cplex.Optimizer, "logLevel" => 0),"registered_functions" =>[Juniper.register( :f, 1, f; autodiff = true)])
);
# variables-----------------------------------------------------------------
JuMP.register(ALT, :f, 1, f; autodiff = true);
@variable(ALT, h >= 0.001);
@variable(ALT, L >= 0.000001);
@variable(ALT, n >= 2, Int);
#---------------------------------------------------------------------------
@NLexpression(ALT,k1,h/(1-f(L-sig*sqrt(n))+f(-L - sig*sqrt(n)))); # HARL1
@NLexpression(ALT,k2,(1-(1+landa*h)*exp(-landa*h))/(landa*(1-exp(-landa*h)))); #to
@NLexpression(ALT,k3,E*n+T1*gam1+T2*gam2);
@NLexpression(ALT,k8,(C1*(k1-k2+k3))); # depends on scenario
@NLexpression(ALT,k9,(((a1+a2*n)/h)*(k1))); # depends on scenario
Q=Q+k8+k9;
#-----------------------------------------------------------------------
end
You have a couple of issues in your code.
Why are you creating a new model in every loop iteration? You cannot aggregate expressions across models the way you do with Q=Q+k8+k9.
You cannot add nonlinear expressions like Q=Q+k8+k9; all nonlinear expressions must occur inside macros.
In general, you are not limited to the specific syntax of JuMP. You can use any Julia data structures to help. In the code below, I just push the expressions into a vector.
I haven't tested, so there might be typos etc, but this should point you in the right direction:
using JuMP, Distributions
E, landa, T1, T2, gam1, gam2, a1, a2 = 0.05, 0.01, 2, 2, 1, 1, 0.5, 0.1
ALT = Model()
f(x) = cdf(Normal(0, 1), x)
JuMP.register(ALT, :f, 1, f; autodiff = true)
@variable(ALT, h >= 0.001)
@variable(ALT, L >= 0.000001)
@variable(ALT, n >= 2, Int)
k8, k9 = Any[], Any[]
for i in 1:4
C1 = rand(100:100:300)
sig = rand(0.5:0.5:2)
k1 = @NLexpression(ALT, h / (1 - f(L - sig * sqrt(n)) + f(-L - sig * sqrt(n))))
k2 = @NLexpression(ALT, (1 - (1 + landa * h) * exp(-landa * h)) / (landa * (1 - exp(-landa * h))))
k3 = @NLexpression(ALT, E * n + T1 * gam1 + T2 * gam2)
push!(k8, @NLexpression(ALT, C1 * (k1 - k2 + k3)))
push!(k9, @NLexpression(ALT, (a1 + a2 * n) / h * k1))
end
Q = @NLexpression(ALT, sum(k for k in k8) + sum(k for k in k9))

Renewal Function for Weibull Distribution

The renewal function m(t) for the Weibull distribution with t = 10 is given by the series
m(t) = Sum(k=1..Infinity) (-1)^(k-1) * A_k * t^(2k) / Γ(2k+1)
where A_1 = Γ(3) and A_k = Γ(2k+1)/k! - Sum(j=1..k-1) (Γ(2j+1)/j!) * A_(k-j).
I want to find the value of m(t). I wrote the following R code to compute m(t):
last_term = NULL
gamma_k = NULL
n = 50
for(k in 1:n){
gamma_k[k] = gamma(2*k + 1)/factorial(k)
}
for(j in 1: (n-1)){
prev = gamma_k[n-j]
last_term[j] = gamma(2*j + 1)/factorial(j)*prev
}
final_term = NULL
find_value = function(n){
for(i in 2:n){
final_term[i] = gamma_k[i] - sum(last_term[1:(i-1)])
}
return(final_term)
}
all_k = find_value(n)
af_sum = NULL
m_t = function(t){
for(k in 1:n){
af_sum[k] = (-1)^(k-1) * all_k[k] * t^(2*k)/gamma(2*k + 1)
}
return(sum(na.omit(af_sum)))
}
m_t(20)
The output is m(t) = 2.670408e+93. Is my iterative procedure correct? Thanks.
I don't think it will work. First, let's move Γ(2k+1) from the denominator of m(t) into A_k; then A_k will behave roughly as 1/k!.
In the numerator of the m(t) terms there is t^(2k), so roughly speaking you're computing a sum with terms
100^k/k!
From Stirling's formula
k! ~ k^k, making the terms
(100/k)^k
so yes, they will start to decrease and converge to something, but only after the 100th term.
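A quick R check of that claim (a sketch, working in log space with lgamma to avoid overflow):
k <- 1:200
log10_term <- k * log10(100) - lgamma(k + 1) / log(10)  # log10 of 100^k / k!
k[which.max(log10_term)]  # the terms keep growing until k is about 100
max(log10_term)           # ~42: the largest term is on the order of 10^42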
Anyway, here is the code. You could try to improve it, but it breaks at k ~ 70:
N <- 20
A <- rep(0, N)
# compute A_k/gamma(2k+1) terms
ps <- 0.0 # previous sum
A[1] = 1.0
for(k in 2:N) {
ps <- ps + A[k-1]*gamma(2*(k-1) + 1)/factorial(k-1)
A[k] <- 1.0/factorial(k) - ps/gamma(2*k+1)
}
print(A)
t <- 10.0
t2 <- t*t
r <- 0.0
for(k in 1:N){
r <- r + (-t2)^k*A[k]
}
print(-r)
UPDATE
OK, I calculated A_k as in your question and got the same answer. I want to estimate the terms A_k/Γ(2k+1) from m(t); I believe they will be pretty much dominated by the 1/k! term. To do that I made another array, k!*A_k/Γ(2k+1), which should be close to one.
Code
N <- 20
A <- rep(0.0, N)
psum <- function( pA, k ) {
ps <- 0.0
if (k >= 2) {
jmax <- k - 1
for(j in 1:jmax) {
ps <- ps + (gamma(2*j+1)/factorial(j))*pA[k-j]
}
}
ps
}
# compute A_k/gamma(2k+1) terms
A[1] = gamma(3)
for(k in 2:N) {
A[k] <- gamma(2*k+1)/factorial(k) - psum(A, k)
}
print(A)
B <- rep(0.0, N)
for(k in 1:N) {
B[k] <- (A[k]/gamma(2*k+1))*factorial(k)
}
print(B)
shows that:
I got the same A_k values as you did.
B_k is indeed very close to 1.
It means that the term A_k/Γ(2k+1) could be replaced by 1/k! to get a quick estimate of what we might get (with the replacement):
m(t) ~= -Sum(k=1..Infinity) (-1)^k (t^2)^k / k! = 1 - Sum(k=0..Infinity) (-t^2)^k / k!
This is actually a well-known sum, and it is equal to exp() with a negative argument (well, you have to add the k=0 term):
m(t) ~= 1 - exp(-t^2)
Conclusions
Approximate value is positive. It will probably stay positive after all; A_k/Γ(2k+1) is only a bit different from 1/k!.
We're talking about 1 - exp(-100), which is 1 - 3.72*10^-44! And we're trying to compute it precisely by summing and subtracting values on the order of 10^100 or even higher (see the quick check below). Even with MPFR I don't think this is possible.
Another approach is needed
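To see the scale of that cancellation in R (a sketch, assuming t = 10 and summing up to k = 70 as in the code above):
t <- 10
k <- 1:70
max(2 * k * log10(t))                            # 140: t^(2k) alone reaches 10^140
max(2 * k * log10(t) - lgamma(k + 1) / log(10))  # ~40: terms ~100^k/k! still reach 10^40
-expm1(-t^2)  # the target 1 - exp(-100) is indistinguishable from 1 in doubles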
OK, so I ended up going down a pretty different road on this. I have implemented a simple discretization of the integral equation which defines the renewal function:
m(t) = F(t) + integrate (m(t - s)*f(s), s, 0, t)
The integral is approximated with the rectangle rule. Approximating the integral for different values of t gives a system of linear equations. I wrote a function to generate the equations and extract a matrix of coefficients from it. After looking at some examples, I guessed a rule to define the coefficients directly and used that to generate solutions for some examples. In particular I tried shape = 2, t = 10, as in OP's example, with step = 0.1 (so 101 equations).
I found that the result agrees pretty well with an approximate result which I found in a paper (Baxter et al., cited in the code). Since the renewal function is the expected number of events, for large t it is approximately equal to t/mu where mu is the mean time between events; this is a handy way to know if we're anywhere in the neighborhood.
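For example, a quick check in R (assuming shape = 2, scale = 1 to match the example below):
mu <- gamma(1 + 1/2)  # mean of Weibull(shape = 2, scale = 1), about 0.886
10 / mu               # about 11.3 expected renewals by t = 10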
I was working with Maxima (http://maxima.sourceforge.net), which is not efficient for numerical stuff, but which makes it very easy to experiment with different aspects. At this point it would be straightforward to port the final, numerical stuff to another language such as Python.
Thanks to OP for suggesting the problem, and S. Pappadeux for insightful discussions. Here is the plot I got comparing the discretized approximation (red) with the approximation for large t (blue). Trying some examples with different step sizes, I saw that the values tend to increase a little as step size gets smaller, so I think the red line is probably a little low, and the blue line might be more nearly correct.
Here is my Maxima code:
/* discretize weibull renewal function and formulate system of linear equations
* copyright 2020 by Robert Dodier
* I release this work under terms of the GNU General Public License
*
* This is a program for Maxima, a computer algebra system.
* http://maxima.sourceforge.net/
*/
"Definition of the renewal function m(t):" $
renewal_eq: m(t) = F(t) + 'integrate (m(t - s)*f(s), s, 0, t);
"Approximate integral equation with rectangle rule:" $
discretize_renewal (delta_t, k) :=
if equal(k, 0)
then m(0) = F(0)
else m(k*delta_t) = F(k*delta_t)
+ m(k*delta_t)*f(0)*(delta_t / 2)
+ sum (m((k - j)*delta_t)*f(j*delta_t)*delta_t, j, 1, k - 1)
+ m(0)*f(k*delta_t)*(delta_t / 2);
make_eqs (n, delta_t) :=
makelist (discretize_renewal (delta_t, k), k, 0, n);
make_vars (n, delta_t) :=
makelist (m(k*delta_t), k, 0, n);
"Discretized integral equation and variables for n = 4, delta_t = 1/2:" $
make_eqs (4, 1/2);
make_vars (4, 1/2);
make_eqs_vars (n, delta_t) :=
[make_eqs (n, delta_t), make_vars (n, delta_t)];
load (distrib);
subst_pdf_cdf (shape, scale, e) :=
subst ([f = lambda ([x], pdf_weibull (x, shape, scale)), F = lambda ([x], cdf_weibull (x, shape, scale))], e);
matrix_from (eqs, vars) :=
(augcoefmatrix (eqs, vars),
[submatrix (%%, length(%%) + 1), - col (%%, length(%%) + 1)]);
"Subsitute Weibull pdf and cdf for shape = 2 into discretized equation:" $
apply (matrix_from, make_eqs_vars (4, 1/2));
subst_pdf_cdf (2, 1, %);
"Just the right-hand side matrix:" $
rhs_matrix_from (eqs, vars) :=
(map (rhs, eqs),
augcoefmatrix (%%, vars),
[submatrix (%%, length(%%) + 1), col (%%, length(%%) + 1)]);
"Generate the right-hand side matrix, instead of extracting it from equations:" $
generate_rhs_matrix (n, delta_t) :=
[delta_t * genmatrix (lambda ([i, j], if i = 1 and j = 1 then 0
elseif j > i then 0
elseif j = i then f(0)/2
elseif j = 1 then f(delta_t*(i - 1))/2
else f(delta_t*(i - j))), n + 1, n + 1),
transpose (makelist (F(k*delta_t), k, 0, n))];
"Generate numerical right-hand side matrix, skipping over formulas:" $
generate_rhs_matrix_numerical (shape, scale, n, delta_t) :=
block ([f, F, numer: true], local (f, F),
f: lambda ([x], pdf_weibull (x, shape, scale)),
F: lambda ([x], cdf_weibull (x, shape, scale)),
[genmatrix (lambda ([i, j], delta_t * if i = 1 and j = 1 then 0
elseif j > i then 0
elseif j = i then f(0)/2
elseif j = 1 then f(delta_t*(i - 1))/2
else f(delta_t*(i - j))), n + 1, n + 1),
transpose (makelist (F(k*delta_t), k, 0, n))]);
"Solve approximate integral equation (shape = 3, t = 1) via LU decomposition:" $
fpprintprec: 4 $
n: 20 $
t: 1;
[AA, bb]: generate_rhs_matrix_numerical (3, 1, n, t/n);
xx_by_lu: linsolve_by_lu (ident(n + 1) - AA, bb, floatfield);
"Iterative solution of approximate integral equation (shape = 3, t = 1):" $
xx: bb;
for i thru 10 do xx: AA . xx + bb;
xx - (AA.xx + bb);
xx_iterative: xx;
"Should find iterative and LU give same result:" $
xx_diff: xx_iterative - xx_by_lu[1];
sqrt (transpose(xx_diff) . xx_diff);
"Try shape = 2, t = 10:" $
n: 100 $
t: 10 $
[AA, bb]: generate_rhs_matrix_numerical (2, 1, n, t/n);
xx_by_lu: linsolve_by_lu (ident(n + 1) - AA, bb, floatfield);
"Baxter, et al., Eq. 3 (for large values of t) compared to discretization:" $
/* L.A. Baxter, E.M. Scheuer, D.J. McConalogue, W.R. Blischke.
* "On the Tabulation of the Renewal Function,"
 * Technometrics, vol. 24, no. 2 (May 1982).
* H(t) is their notation for the renewal function.
*/
H(t) := t/mu + sigma^2/(2*mu^2) - 1/2;
tx_points: makelist ([float (k/n*t), xx_by_lu[1][k, 1]], k, 1, n);
plot2d ([H(u), [discrete, tx_points]], [u, 0, t]), mu = mean_weibull(2, 1), sigma = std_weibull(2, 1);
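Since the final numerical step is just building a matrix and solving a linear system, here is a rough R port of the same discretization (a sketch; renewal_weibull is a name introduced here, and I haven't verified it against the Maxima output):
# Discretize m(t) = F(t) + integrate(m(t - s)*f(s), s, 0, t) with the same
# rectangle rule as above: half weights at the endpoints s = 0 and s = t_i.
renewal_weibull <- function(shape, scale = 1, t = 10, n = 100) {
  dt <- t / n
  tt <- seq(0, t, by = dt)                  # grid t_0 = 0, ..., t_n = t
  f <- function(x) dweibull(x, shape, scale)
  A <- matrix(0, n + 1, n + 1)
  for (i in 2:(n + 1)) {
    A[i, i] <- f(0) * dt / 2                # half weight at s = 0
    A[i, 1] <- f(tt[i]) * dt / 2            # half weight at s = t_i
    if (i > 2) A[i, 2:(i - 1)] <- f(tt[i] - tt[2:(i - 1)]) * dt
  }
  b <- pweibull(tt, shape, scale)           # F(t_i)
  solve(diag(n + 1) - A, b)                 # m(t_i) on the grid
}
m <- renewal_weibull(2, 1, t = 10, n = 100)
tail(m, 1)  # compare with Baxter et al.: 10/mu + sigma^2/(2*mu^2) - 1/2, about 10.9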

Continued fractions and Pell's equation - numerical issues

Mathematical background
Continued fractions are a way to represent numbers (rational or not), with a basic recursion formula to calculate them. Given a number r, we define r[0]=r and have:
for n in range(0..N):
a[n] = floor(r[n])
if r[n] == a[n]: break
r[n+1] = 1 / (r[n]-a[n])
where a is the final representation. We can also define a series of convergents by
h[-2,-1] = [0, 1]
k[-2, -1] = [1, 0]
h[n] = a[n]*h[n-1]+h[n-2]
k[n] = a[n]*k[n-1]+k[n-2]
where h[n]/k[n] converge to r.
Pell's equation is a problem of the form x^2-D*y^2=1 where all numbers are integers and D is not a perfect square in our case. A solution for a given D that minimizes x is given by continued fractions. Basically, for the above equation, it is guaranteed that this (fundamental) solution is x=h[n] and y=k[n] for the lowest n found which solves the equation in the continued fraction expansion of sqrt(D).
Problem
I am failing to get this simple algorithm to work for D=61. I first noticed it did not solve Pell's equation within 100 coefficients, so I compared it against Wolfram Alpha's convergents and continued fraction representation and noticed that the 20th element fails: the representation there is 3, compared to the 4 that I get, yielding different convergents - h[20]=335159612 on Wolfram compared to 425680601 for me.
I tested the code below in two languages (though to be fair, Python is C under the hood, I guess) and on two systems, and I get the same result - a difference at loop 20. I'll note that the convergents are still accurate and converge! Why am I getting different results compared to Wolfram Alpha, and is it possible to fix it?
For testing, here's a Python program to solve Pell's equation for D=61, printing the first 20 convergents and the continued fraction representation cf (and some extra unneeded fluff):
from math import floor, sqrt # Can use mpmath here as well.
def continued_fraction(D, count=100, thresh=1E-12, verbose=False):
cf = []
h = (0, 1)
k = (1, 0)
r = start = sqrt(D)
initial_count = count
x = (1+thresh+start)*start
y = start
while abs(x/y - start) > thresh and count:
i = int(floor(r))
cf.append(i)
f = r - i
x, y = i*h[-1] + h[-2], i*k[-1] + k[-2]
if verbose is True or verbose == initial_count-count:
print(f'{x}\u00B2-{D}x{y}\u00B2 = {x**2-D*y**2}')
if x**2 - D*y**2 == 1:
print(f'{x}\u00B2-{D}x{y}\u00B2 = {x**2-D*y**2}')
print(cf)
return
count -= 1
r = 1/f
h = (h[1], x)
k = (k[1], y)
print(cf)
raise OverflowError(f"Converged on {x} {y} with count {count} and diff {abs(start-x/y)}!")
continued_fraction(61, count=20, verbose=True, thresh=-1) # We don't want to stop on account of thresh in this example
A C program doing the same:
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
int main() {
long D = 61;
double start = sqrt(D);
long h[] = {0, 1};
long k[] = {1, 0};
int count = 20;
float thresh = 1E-12;
double r = start;
long x = (1+thresh+start)*start;
long y = start;
while(fabs(x/(double)y-start) > -1 && count) {
long i = floor(r);
double f = r - i;
x = i * h[1] + h[0];
y = i * k[1] + k[0];
printf("%ld\u00B2-%ldx%ld\u00B2 = %lf\n", x, D, y, x*x-D*y*y);
r = 1/f;
--count;
h[0] = h[1];
h[1] = x;
k[0] = k[1];
k[1] = y;
}
return 0;
}
mpmath, Python's multi-precision library, can be used. Just be careful that all the important numbers are in mp format.
In the code below, x, y and i are standard multi-precision integers. r and f are multi-precision real numbers. Note that the initial count is set higher than 20.
from mpmath import mp, mpf
mp.dps = 50 # precision in number of decimal digits
def continued_fraction(D, count=22, thresh=mpf(1E-12), verbose=False):
cf = []
h = (0, 1)
k = (1, 0)
r = start = mp.sqrt(D)
initial_count = count
x = 0 # some dummy starting values, they will be overwritten early in the while loop
y = 1
while abs(x/y - start) > thresh and count > 0:
i = int(mp.floor(r))
cf.append(i)
x, y = i*h[-1] + h[-2], i*k[-1] + k[-2]
if verbose or initial_count == count:
print(f'{x}\u00B2-{D}x{y}\u00B2 = {x**2-D*y**2}')
if x**2 - D*y**2 == 1:
print(f'{x}\u00B2-{D}x{y}\u00B2 = {x**2-D*y**2}')
print(cf)
return
count -= 1
f = r - i
r = 1/f
h = (h[1], x)
k = (k[1], y)
print(cf)
raise OverflowError(f"Converged on {x} {y} with count {count} and diff {abs(start-x/y)}!")
continued_fraction(61, count=22, verbose=True, thresh=mpf(1e-100))
Output is similar to Wolfram's:
...
335159612²-61x42912791² = 3
1431159437²-61x183241189² = -12
1766319049²-61x226153980² = 1
[7, 1, 4, 3, 1, 2, 2, 1, 3, 4, 1, 14, 1, 4, 3, 1, 2, 2, 1, 3, 4, 1]

Linear optimization with R

I'm trying to maximize the following function over x[1] and x[2]:
(a - 1) * x[1] * c + (a - 1) * x[2] * b * d
Where a, b, c, d are known positive constants and
0 < x[1] < 1, 0 < x[2] < 1,
x[1] + x[2] = 1
Using NlcOptim, I did:
solver_1 <- function(a, b, c, d){
obj = function(x){
return((a - 1) * x[1] * c + (a - 1) * x[2] * b * d )
}
con = function(x){
f = NULL
f = rbind(f, x[1] + x[2] - 1)
return(list(ceq = f, c = NULL))
}
x0 = c(1, 0)
solnl(x0, objfun = obj, confun = con)
}
solver_1(1.2, 5.2, 0.8, 0.1)
But it gives me:
Error in if (norm(H, "I") == 0) { : missing value where TRUE/FALSE needed
Does anyone know what I'm doing wrong?
Using an optimization package is massive overkill.
Either by following up on the comment of @Roland and doing some elementary calculus, or by using the corner point theorem of linear programming, your objective function is optimized when x[1] = 0 and x[2] = 1 or vice versa. Thus you only have to evaluate two constant expressions in a, b, c, d and pick the larger of the two:
max(c*(a-1), b*d*(a-1))
For a > 1, which is larger is determined by whether or not c > b*d; for a < 1 the comparison reverses.
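A minimal sketch of that closed-form check in R (solver_1_closed is a hypothetical name introduced here):
solver_1_closed <- function(a, b, c, d) {
  # evaluate the objective at the two corner points of x1 + x2 = 1
  corners <- c("x = (1, 0)" = (a - 1) * c,
               "x = (0, 1)" = (a - 1) * b * d)
  corners[which.max(corners)]
}
solver_1_closed(1.2, 5.2, 0.8, 0.1)  # picks x = (1, 0) with value 0.16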

Create a new probability distribution (relying on the previous r.v.) in R

I want to write this pdf and generate random numbers using it.
Suppose that X is a random variable that takes the values 0 or 1 only, the pdf is as follows:
P(X_t) = a^[(1-X_(t-1))*X_t] * (1-a)^[(1-X_(t-1))*(1-X_t)] * b^[X_(t-1)*(1-X_t)] * (1-b)^[X_(t-1)*X_t]
X_t: the current r.v., X_(t-1): the previous r.v., where t=1,2,...,T and the initial value at t=0 is given. Finally, a and b are two known probabilities.
Not sure if I completely understood, but you could possibly do as follows.
If we say 'y' is the previous realisation, and 'x' is the current one, then we have:
P(x=0|y=0) = 1-a
P(x=1|y=0) = a
P(x=0|y=1) = b
P(x=1|y=1) = 1-b
So then we can generate uniform variates U in [0,1] and if y = 0 then set x = 0 if U <= 1-a, else x = 1; and if y = 1, then set x = 0 if U <= b, else set x = 1
The following function may do the trick, where x0 is the initial value of x:
rhany <- function(n, a, b, x0 = 0) {
sim <- c(x0, runif(n-1))
for (i in 2:n){
sim[i] = (sim[i-1] == 0) * ((sim[i] <= 1-a) * 0 + (sim[i] > 1-a) * 1) +
(sim[i-1] == 1) * ((sim[i] <= b) * 0 + (sim[i] > b) * 1)
}
sim
}
So then if you run the function:
rhany(10, 0.1, 0.7)
[1] 0 1 0 1 1 1 0 1 1 0
Admittedly, the for-loop slows down the function; it takes about 9 seconds on my machine to generate 1e7 variates. You could reimplement using the Rcpp package:
library(Rcpp)
cppFunction('NumericVector rhanya(double a, double b, NumericVector zs) {
int n = zs.size();
NumericVector sim = zs;
for (int i = 1; i < n; i++) {
sim[i] = (sim[i-1] == 0) * ((sim[i] <= 1-a) * 0 + (sim[i] > 1-a) * 1) + (sim[i-1] == 1) * ((sim[i] <= b) * 0 + (sim[i] > b) * 1);
}
return(sim);
}')
rhany1 <- function(n, a, b, x0 = 0) {
sim <- c(x0, runif(n-1))
rhanya(a, b, sim)
}
This function rhany1 takes less than 0.5 sec to generate 1e7 variates.
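To reproduce the timing comparison (a sketch; the numbers quoted above are machine-dependent):
system.time(rhany(1e7, 0.1, 0.7))   # about 9 seconds reported above
system.time(rhany1(1e7, 0.1, 0.7))  # under 0.5 seconds reported above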
You can test that the two functions rhany and rhany1 give the same results when the same seed is set:
set.seed(123); bc <- rhany(1e7, 0.1, 0.7)
set.seed(123); ac <- rhany1(1e7, 0.1, 0.7)
all.equal(ac, bc)
[1] TRUE
