Difference between drawing random numbers from distributions in R

I am comparing two ways of drawing random numbers from a beta and a Gaussian distribution. What are their differences? Why are they different?
The first way (_1) simulates from a Uniform(0,1) and then applies the inverse CDF of the Beta (Normal) distribution to those uniform draws to get draws from the Beta (Normal) distribution.
The second way (_2) uses the built-in function for generating random numbers from each distribution.
Beta Distribution
set.seed(1)
beta_1 <- qbeta(runif(1000,0,1), 2, 5)
set.seed(1)
beta_2 <- rbeta(1000, 2,5)
> summary(beta_1); summary(beta_2)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.009481 0.164551 0.257283 0.286655 0.387597 0.895144
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.006497 0.158083 0.261649 0.284843 0.396099 0.841760
Here every number is different.
Normal distribution
set.seed(1)
norm_1 <- qnorm(runif(1000, 0,1), 0, 0.1)
set.seed(1)
norm_2 <- rnorm(1000, 0, 0.1)
> summary(norm_1); summary(norm_2)
Min. 1st Qu. Median Mean 3rd Qu. Max.
-0.3008048 -0.0649125 -0.0041975 0.0009382 0.0664868 0.3810274
Min. 1st Qu. Median Mean 3rd Qu. Max.
-0.300805 -0.069737 -0.003532 -0.001165 0.068843 0.381028
Here the numbers are much closer: the minimum and maximum agree to the printed precision, while the quartiles, median, and mean differ slightly.
Shouldn't they all be equal, since I am generating random numbers from distributions with the same parameters?

I think your question boils down to an assumption about the random number generator. If rnorm drew its values in exactly the same way as qnorm(runif(...)) under the hood, your expectation would hold. It does not: the normal generator is configured separately from the uniform generator (see normal.kind in ?RNGkind). Without that exact match, you are left with the statistical tests below:
Is the mean of norm_1 different from the mean of norm_2?
t.test(x = norm_1, y = norm_2)
A p-value > 0.05 indicates there is insufficient evidence, at the 0.05 Type I error level, to reject the null hypothesis that the means are equal.
Are the distributions different?
ks.test(x = norm_1, y = norm_2)
A p-value > 0.05 indicates there is insufficient evidence, at the 0.05 Type I error level, to reject the null hypothesis that the distributions are equal.
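For reference, a minimal sketch of running both tests on the vectors from the question (re-creating them here so the snippet is self-contained; exact p-values depend on the seed):
set.seed(1)
norm_1 <- qnorm(runif(1000, 0, 1), 0, 0.1)
set.seed(1)
norm_2 <- rnorm(1000, 0, 0.1)
# Two-sample t-test for a difference in means
t.test(x = norm_1, y = norm_2)$p.value
# Kolmogorov-Smirnov test for a difference in distributions
ks.test(x = norm_1, y = norm_2)$p.value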

I tried sampling a Bernoulli variable at home in two different ways.
I flip a coin and assign 1 to heads and 0 to tails.
I roll a six-sided die and assign the result 1 to a roll of the three highest numbers and the result 0 to a roll of the three lowest numbers.
I did this only twenty times instead of a thousand times, but the principle is the same. I got the following results:
          result 0   result 1
Method 1        11          9
Method 2         8         12
Q: Why did I not get the same result for both methods?
A: Because they are random samples, of course, and samples vary from run to run.
Even if I could reset some random seed to remove that variability, it still wouldn't matter, because the two methods are different.
Why is there no use of inverse transform sampling?
The normal distribution actually does use inverse transform sampling. The following commands both return the same value, 0.3735462:
set.seed(1)
rnorm(1,1,1)
set.seed(1)
qnorm(runif(1),1,1)
rbeta also relates to inverse transform sampling: the following return 0.7344913 and 0.2655087 respectively, which differ only through the relationship Y = 1 - X (so internally there is some inversion).
alpha = 1
beta = 1
set.seed(1)
rbeta(1,alpha,beta)
set.seed(1)
qbeta(runif(1),alpha,beta)
The beta generator behaves differently when $\alpha$ and $\beta$ are not both equal to one. This is because inverse transform sampling is not very efficient for the beta distribution, and the rbeta function instead uses an algorithm that creates the sample in a different way. Below is code with the algorithm for the case $\min(\alpha,\beta) \leq 1$.
For more about the algorithm, see: Hung, Ying-Chao, Narayanaswamy Balakrishnan, and Yi-Te Lin. "Evaluation of beta generation algorithms." Communications in Statistics - Simulation and Computation 38.4 (2009): 750-770.
You can see a few points that are calculated differently. The algorithm has a few steps where it starts redrawing random numbers, and it does this because redrawing numbers is cheaper than computing the inverse transform for a difficult case.
alpha = 0.9
beta = 0.9
#### Cheng's BC algorithm
### used if min(alpha,beta)<=1
### initialize
set.seed(1)
p = min(alpha,beta)
q = max(alpha,beta)
a = p+q
b = p^-1
delta = 1+q-p
k1 = delta*(0.0138889+0.0416667*p)/(q*b-0.777778)
k2 = 0.25 + (0.5+0.25/delta)*p
sample = function() {
  ### Perform steps of algorithm in a loop
  step = 1
  while (step < 6) {
    if (step == 1) {
      U1 = runif(1)
      U2 = runif(1)
      if (U1 < 0.5) {
        step = 2
      } else {
        step = 3
      }
    }
    if (step == 2) {
      Y = U1*U2
      Z = U1*Y
      if (0.25*U2 + Z - Y >= k1) {
        step = 1
      } else {
        step = 5
      }
    }
    if (step == 3) {
      Z = U1^2*U2
      if (Z > 0.25) {
        step = 4
      } else {
        V = b*log(U1/(1-U1))
        W = q*exp(V)
        step = 6
      }
    }
    if (step == 4) {
      if (Z < k2) {
        step = 5
      } else {
        step = 1
      }
    }
    if (step == 5) {
      V = b*log(U1/(1-U1))
      W = q*exp(V)
      if (a*(log(a/(p+W)) + V) - 1.3862944 < log(Z)) {
        step = 1
      } else {
        step = 6
      }
    }
  }
  if (q == alpha) {
    X = W/(p+W)
  } else {
    X = p/(p+W)
  }
  return(X)
}
sample()
n = 20
beta_orig = sapply(1:n, function(x) {
  set.seed(x)
  rbeta(1, alpha, beta)
})
beta_quantile = sapply(1:n, function(x) {
  set.seed(x)
  qbeta(runif(1), alpha, beta)
})
beta_BC = sapply(1:n, function(x) {
  set.seed(x)
  sample()
})
plot(beta_orig,beta_BC, pch = 1, xlim = c(0,1), ylim = c(0,1))
points(beta_orig,beta_quantile, col = 2, pch = 3)
legend(0.3,1, c("rbeta compared to inverse transform sampling", "rbeta compared to manual"), pch=c(3,1), col = c(2,1), cex = 0.85)
Some weird effect
In the code above I was resetting the random seed for each computation. When you compute several numbers from a single seed, only the first number matches between rnorm and the inverse transform.
The following code
set.seed(1)
rnorm(6,1,1)
set.seed(1)
qnorm(runif(6),1,1)
set.seed(2)
rnorm(6,1,1)
set.seed(2)
qnorm(runif(6),1,1)
returns
[1] 0.3735462 1.1836433 0.1643714 2.5952808 1.3295078 0.1795316
[1] 0.3735462 0.6737666 1.1836433 2.3297993 0.1643714 2.2724293
[1] 0.1030855 1.1848492 2.5878453 -0.1303757 0.9197482 1.1324203
[1] 0.10308546 1.53124079 1.18484918 0.03810797 2.58784531 2.58463150
What you see here is that the rnorm function skips every other uniform number: it consumes two uniform draws per normal value in order to get more precision.
See these lines in the source code of the norm_rand() function that R uses: https://svn.r-project.org/R/trunk/src/nmath/snorm.c
#define BIG 134217728 /* 2^27 */
/* unif_rand() alone is not of high enough precision */
u1 = unif_rand();
u1 = (int)(BIG*u1) + unif_rand();
return qnorm5(u1/BIG, 0.0, 1.0, 1, 0);
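As a sketch (my own reconstruction of those lines, not code from the R sources), you can mimic the pairing in R and compare with rnorm; both should print the same values:
BIG <- 2^27
set.seed(1)
u <- runif(4)
qnorm((floor(BIG * u[c(1, 3)]) + u[c(2, 4)]) / BIG, 1, 1)  # combine uniforms in pairs
set.seed(1)
rnorm(2, 1, 1)                                             # 0.3735462 1.1836433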

Related

Why am I getting NAs in this calculation in R?

While working on an Rcpp program, I used the sample() function, which gave me the following error: "NAs not allowed in probability." I traced this issue to the fact that the probability vector I used had NA values in it. I have no idea how. Below is some R code that captures the errors:
n.0=20
n.1=20
n.reps=1
beta0.vals=rep(seq(-.3,.1,,n.0),n.reps)
beta1.vals=rep(seq(-7,0,,n.1),n.reps)
beta.grd=as.matrix(expand.grid(beta0.vals,beta1.vals))
n.rnd=200
beta.rnd.grd=cbind(runif(n.rnd,min(beta0.vals),max(beta0.vals)),runif(n.rnd,min(beta1.vals),max(beta1.vals)))
beta.grd=rbind(beta.grd,beta.rnd.grd)
N = 22670
count = 0
for (i in 1:dim(beta.grd)[1]) { # iterate through 600 possible beta values in beta grid
  beta.ind = 0 # indicator for current pair of beta values
  for (j in 1:N) { # iterate through all possible Nsums
    logit = beta.grd[i,1]/N*(j - .1*N)^2 + beta.grd[i,2]
    phi01 = exp(logit)/(1 + exp(logit))
    if (is.na(phi01)) {
      count = count + 1
    }
  }
}
cat("Total number of invalid probabilities: ", count)
Here, $\beta_0 \in (-0.3, 0.1), \beta_1 \in (-7, 0), N = 22670, N_\text{sum} \in (1, N)$. Note that $N$ and $N_\text{sum}$ are integers, whereas the beta values may not be.
Since mathematically $\phi_{01} \in (0,1)$, I'm assuming that the NAs arise because R does not like extremely small values. I am also receiving an overwhelming number of NA values, more NAs than numbers. Why would I be getting NAs in this code?
Include print(logit) next to count = count + 1 and you will find lots of logit values greater than 1000. exp(1000) == Inf, so you divide Inf by Inf, which gives you NaN, and NaN is NA:
> exp(500)
[1] 1.403592e+217
> Inf/Inf
[1] NaN
> is.na(NaN)
[1] TRUE
So your problem is not numbers that are too small but numbers that are too large, coming out of the evaluation of exp(x) for x larger than roughly 709:
> exp(709)
[1] 8.218407e+307
> exp(710)
[1] Inf
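The threshold of roughly 709 comes straight from the largest double R can represent (a quick check, not part of the original answer):
log(.Machine$double.xmax)
# [1] 709.7827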
Bernhard's answer correctly identifies the problem:
If logit is large, exp(logit) = Inf.
Here is a solution:
for (i in 1:dim(beta.grd)[1]) { # iterate through 600 possible beta values in beta grid
  beta.ind = 0 # indicator for current pair of beta values
  for (j in 1:N) { # iterate through all possible Nsums
    logit = beta.grd[i,1]/N*(j - .1*N)^2 + beta.grd[i,2]
    ## This one isn't great because exp(logit) can be very large
    # phi01 = exp(logit)/(1 + exp(logit))
    ## So, we say instead
    ## phi01 = 1 / ( 1 + exp(-logit) )
    phi01 = plogis(logit)
    if (is.na(phi01)) {
      count = count + 1
    }
  }
}
cat("Total number of invalid probabilities: ", count)
# Total number of invalid probabilities: 0
We can use the more stable 1 / (1 + exp(-logit)) instead
(to convince yourself of this, multiply your expression by exp(-logit) / exp(-logit)),
and luckily, either way, R has a built-in function plogis() that can calculate these probabilities quickly and accurately.
You can see from the help file (?plogis) that this function evaluates the expression I gave, but you can also double-check to assure yourself:
x = rnorm(1000)
y = 1 / (1 + exp(-x))
z = plogis(x)
all.equal(y, z)
[1] TRUE
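Where the naive formula overflows, plogis() stays well defined; a quick extra check (not in the original answer):
exp(1000) / (1 + exp(1000))   # NaN, because Inf / Inf
1 / (1 + exp(-1000))          # 1
plogis(1000)                  # 1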

t-distribution in R

I would like to find the t-value for a 90% confidence interval with 17 observations.
In Excel, I can do this calculation with T.INV.2T(.10, 16) = 1.75; however, in R I cannot find the correct way to get the same result:
qt(p = 1-.9, df = 17-1) = -1.34
qt(p = (1-.9)/2, df = 17-1) = -1.75 # trying with two-tailed?
Which R function does the same computation as T.INV.2T in Excel?
Similarly, Excel also has T.DIST.2T; what is the corresponding function in R?
You need the 1 - .1 / 2 = 0.95 quantile from the t-distribution with 17 - 1 = 16 degrees of freedom:
qt(0.95, 16)
# [1] 1.745884
Explanation
Excel describes T.INV.2T as
Returns the two-tailed inverse of the Student's t-distribution
which is a quantile in math talk (though I would never use the term two-tailed quantile). The p% quantile q is defined as the point which satisfies P(X <= q) >= p%.
In R we get that with the function qt (q for quantile, t for t-distribution). Now we just have to sort out what is meant by a two-tailed inverse. It turns out we are looking for the point q which satisfies P(X <= -|q| or X >= |q|) = .1. Since the t-distribution is symmetric, this simplifies to P(X >= |q|) = .1 / 2.
You can easily verify that in R with the use of the probability function pt:
pt(qt(0.05, 16), 16, lower.tail = TRUE) +
pt(qt(0.95, 16), 16, lower.tail = FALSE)
# [1] 0.1
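If you want a reusable analogue of T.INV.2T, here is a small sketch; the helper name is my own, not a built-in R function:
# t_inv_2t(p, df): two-tailed inverse of the t-distribution, mirroring Excel's T.INV.2T
t_inv_2t <- function(p, df) qt(1 - p / 2, df)
t_inv_2t(0.10, 16)
# [1] 1.745884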
As you correctly guessed, you do it by splitting the tail probability over the two sides (alpha/2 = 0.1/2 = 0.05):
> qt(p = 0.95, df = 16)
[1] 1.745884
So 5% is cut off in each of the upper and lower tails. I don't know Excel, but I am guessing that's what that function is doing.
As for T.DIST.2T, that is, I assume, the two-sided tail probability:
pt(-1.745884, df=16, lower.tail=T) +
pt(1.745884, df=16, lower.tail=F)
which is equal to 0.09999994.
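Along the same lines, a small T.DIST.2T-style helper (again an illustrative name, not a built-in function):
# t_dist_2t(x, df): two-tailed p-value P(|T| >= x), mirroring Excel's T.DIST.2T
t_dist_2t <- function(x, df) 2 * pt(abs(x), df, lower.tail = FALSE)
t_dist_2t(1.745884, 16)
# [1] 0.09999994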

Integration and false convergence of optimization in R

I am trying to find MLEs of three positive parameters a, mu and theta, and then the value of a function, say f1.
f1 <- function(para) {
  a <- para[1]
  mu <- para[2]
  the <- para[3]
  return(a*mu/the)
}
Step 1. Suppose we have the following (negative) log-likelihood function, where x_ij and t_ij are known; the R code below evaluates it term by term, with the integral handled by Gauss-Laguerre quadrature.
loglik <- function(para, data) {
  n.quad <- 64
  a <- para[1]
  mu <- para[2]
  the <- para[3]
  k <- length(table(data$group))
  rule <- glaguerre.quadrature.rules(n.quad, alpha = 0)[[n.quad]]
  int.ing.gl <- function(y, x, t) {
    (y^(a-mu-1)/(y+t)^(x+a))*exp(-the/y)
  }
  int.f <- function(x, t) glaguerre.quadrature(int.ing.gl, lower = 0, upper = Inf,
                                               x = x, t = t, rule = rule, weighted = F)
  v.int.f <- Vectorize(int.f)
  int <- v.int.f(data$count, data$time)
  loglik.value <- lgamma(a+data$count) - lgamma(a) + mu*log(the) - lgamma(mu) + log(int)
  log.sum <- sum(loglik.value)
  return(-log.sum)
}
Step 2. Let's fix true values and generate data.
### Set ###
library(tolerance)
library(lbfgs3)
a<-2
mu<-0.01
theta<-480
k<-10
f1(c(a, mu, theta))
[1] 5e-04
##### Data Generation #####
set.seed(k+100+floor(a*100)+floor(theta*1000)+floor(mu*1024))
n<-sample(50:150, k) # sample size for each group
X<-rep(0,sum(n))
# Initiate time vector
t<-rep(0, sum(n))
# Initiate the data set
group<-sample(rep(1:k,n)) # Randomly assign the group index
data.pre<-data.frame(X,t,group)
colnames(data.pre)<-c('count','time','group')
data<-data.pre[order(data.pre$group),] # Arrange by group index
# Generate time variable
mut<-runif(k, 50, 350)
for (i in 1:k) {
  data$time[which(data$group==i)] <- ceiling(r2exp(n[i], rate = mut[i], shift = 1))
}
### Generate count variable: Poisson
## First, Generate beta for each group: beta_i
beta<-rgamma(k, shape = mu, rate = theta)
# Generate lambda for each observation
lambda<-0
for (i in 1:k) {
  l <- rgamma(n[i], shape = a, rate = 1/beta[i])
  lambda <- c(lambda, l)
}
lambda<-lambda[-1]
data<-data.frame(data,lambda)
data$count<-rpois(length(data$time), data$lambda*data$time) # Generate count variable
Step 3. Optimization
head(data)
  count time group
      0  400     1
      0   39     1
      0  407     1
      0  291     1
      0  210     1
      0  241     1
start.value<-c(2, 0.01, 100)
fit<-nlminb(start = start.value, loglik, data=data,
lower = c(0, 0, 0), control = list(trace = T))
fit
$par
[1] 1.674672e-02 1.745698e+02 3.848568e+03
$objective
[1] 359.5767
$convergence
[1] 1
$iterations
[1] 40
$evaluations
function gradient
79 128
$message
[1] "false convergence (8)"
One possible reason for the false convergence is the integral in Step 1. In the loglik function I used glaguerre.quadrature; however, it failed to give the correct result because the integral converges slowly.
I asked for suggestions about that integral in the following question:
Use the Gauss-Laguerre quadrature to approximate an integral in R
Here, I just provide a complete example. Is there any method I can use to handle this integral?
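One quick diagnostic (my suggestion, not from the original post) is to evaluate the integral for a single (x, t) pair at two different node counts; if the two values disagree noticeably, the quadrature has not converged. A sketch, assuming the glaguerre.* functions come from the gaussquad package and using illustrative values for x and t rather than the real data:
library(gaussquad)  # assumption: provides glaguerre.quadrature and its rules
a <- 2; mu <- 0.01; the <- 480
int.ing.gl <- function(y, x, t) (y^(a - mu - 1) / (y + t)^(x + a)) * exp(-the / y)
for (n.quad in c(32, 64)) {
  rule <- glaguerre.quadrature.rules(n.quad, alpha = 0)[[n.quad]]
  print(glaguerre.quadrature(int.ing.gl, lower = 0, upper = Inf,
                             x = 0, t = 100, rule = rule, weighted = FALSE))
}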

Expected return and covariance from return time series

I’m trying to reproduce the Matlab ewstats function defined here:
https://it.mathworks.com/help/finance/ewstats.html
The results given by Matlab are the following:
ExpReturn = 1×2
    0.1995    0.1002
ExpCovariance = 2×2
    0.0032   -0.0017
   -0.0017    0.0010
I’m trying to replicate the example with the RiskPortfolios R package:
https://cran.r-project.org/web/packages/RiskPortfolios/RiskPortfolios.pdf
The R code I’m using is this one:
library(RiskPortfolios)
rets <- as.matrix(cbind(c(0.24, 0.15, 0.27, 0.14), c(0.08, 0.13, 0.06, 0.13)))
w <- 0.98
rets
w
meanEstimation(rets, control = list(type = 'ewma', lambda = w))
covEstimation(rets, control = list(type = 'ewma', lambda = w))
The mean estimate is the same as the one in the example, but the covariance matrix is different:
> rets
[,1] [,2]
[1,] 0.24 0.08
[2,] 0.15 0.13
[3,] 0.27 0.06
[4,] 0.14 0.13
> w
[1] 0.98
>
> meanEstimation(rets, control = list(type = 'ewma', lambda = w))
[1] 0.1995434 0.1002031
>
> covEstimation(rets, control = list(type = 'ewma', lambda = w))
[,1] [,2]
[1,] 0.007045044 -0.003857217
[2,] -0.003857217 0.002123827
Am I missing something?
Thanks
They give the same answer if type = "lw" is used:
round(covEstimation(rets, control = list(type = 'lw')), 4)
## 0.0032 -0.0017
## -0.0017 0.0010
They are using different algorithms. From the RiskPortfolios manual:
ewma ... See RiskMetrics (1996)
From the Matlab help page:
There is no relationship between ewstats function and the RiskMetrics® approach for determining the expected return and covariance from a return time series.
Unfortunately Matlab does not tell us which algorithm is used.
For those who eventually need an equivalent ewstats function in R, here is the code I wrote:
ewstats <- function(RetSeries, DecayFactor=NULL, WindowLength=NULL){
#EWSTATS Expected return and covariance from return time series.
# Optional exponential weighting emphasizes more recent data.
#
# [ExpReturn, ExpCovariance, NumEffObs] = ewstats(RetSeries, ...
# DecayFactor, WindowLength)
#
# Inputs:
# RetSeries : NUMOBS by NASSETS matrix of equally spaced incremental
# return observations. The first row is the oldest observation, and the
# last row is the most recent.
#
# DecayFactor : Controls how much less each observation is weighted than its
# successor. The k'th observation back in time has weight DecayFactor^k.
# DecayFactor must lie in the range: 0 < DecayFactor <= 1.
# The default is DecayFactor = 1, which is the equally weighted linear
# moving average Model (BIS).
#
# WindowLength: The number of recent observations used in
# the computation. The default is all NUMOBS observations.
#
# Outputs:
# ExpReturn : 1 by NASSETS estimated expected returns.
#
# ExpCovariance : NASSETS by NASSETS estimated covariance matrix.
#
# NumEffObs: The number of effective observations is given by the formula:
# NumEffObs = (1-DecayFactor^WindowLength)/(1-DecayFactor). Smaller
# DecayFactors or WindowLengths emphasize recent data more strongly, but
# use less of the available data set.
#
# The standard deviations of the asset return processes are given by:
# STDVec = sqrt(diag(ECov)). The correlation matrix is :
# CorrMat = VarMat./( STDVec*STDVec' )
#
# See also MEAN, COV, COV2CORR.
NumObs <- dim(RetSeries)[1]
NumSeries <- dim(RetSeries)[2]
# size the series and the window
if (is.null(WindowLength)) {
WindowLength <- NumObs
}
if (is.null(DecayFactor)) {
DecayFactor = 1
}
if (DecayFactor <= 0 | DecayFactor > 1) {
stop('Must have 0< decay factor <= 1.')
}
if (WindowLength > NumObs){
stop(sprintf('Window Length %d must be <= number of observations %d',
WindowLength, NumObs))
}
# ------------------------------------------------------------------------
# size the data to the window
RetSeries <- RetSeries[(NumObs-WindowLength+1):NumObs, ]
# Calculate decay coefficients
DecayPowers <- seq(WindowLength-1, 0, by = -1)
VarWts <- sqrt(DecayFactor)^DecayPowers
RetWts <- (DecayFactor)^DecayPowers
NEff = sum(RetWts) # number of equivalent values in computation
# Compute the exponentially weighted mean return
WtSeries <- matrix(rep(RetWts, times = NumSeries),
nrow = length(RetWts), ncol = NumSeries) * RetSeries
ERet <- colSums(WtSeries)/NEff;
# Subtract the weighted mean from the original Series
CenteredSeries <- RetSeries - matrix(rep(ERet, each = WindowLength),
nrow = WindowLength, ncol = length(ERet))
# Compute the weighted variance
WtSeries <- matrix(rep(VarWts, times = NumSeries),
nrow = length(VarWts), ncol = NumSeries) * CenteredSeries
ECov <- t(WtSeries) %*% WtSeries / NEff
list(ExpReturn = ERet, ExpCovariance = ECov, NumEffObs = NEff)
}
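A quick usage check against the Matlab example from the question; the rounded values should reproduce the ExpReturn and ExpCovariance shown there:
rets <- cbind(c(0.24, 0.15, 0.27, 0.14), c(0.08, 0.13, 0.06, 0.13))
res <- ewstats(rets, DecayFactor = 0.98)
round(res$ExpReturn, 4)       # 0.1995 0.1002
round(res$ExpCovariance, 4)   # 0.0032 -0.0017 / -0.0017 0.0010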

Monte Carlo integration in R : getting the wrong answer [using Hit or Miss]

So I'm using the Monte Carlo method to evaluate definite integrals of a bunch of functions.
To start with,
y = x^(-0.5) ; for x in [0.01, 1]
for which my code in R looks like this:
#
s <- NULL
m<- 100
a<- 0.01
b<- 1
set.seed(5)
x<-runif(m,a,b)
y<-runif(m,0,1)
for (i in 1:m) {
  if (y[i] < (x[i]^(-0.5))) {
    s[i] <- 1
  } else {
    s[i] <- 0
  }
}
nn<-sum(s==1)*(b-a)/m
print(nn)
#
Answer (nn): 0.99
Actual answer: $\int_{0.01}^{1} x^{-0.5}\,dx = \left[2\sqrt{x}\right]_{0.01}^{1} = 2(1 - 0.1) = 1.8$
I cannot figure out where I'm going wrong with this. Have I done something wrong?
Any number less than 1 raised to a negative power is greater than 1, so x[i]^(-0.5) always exceeds y[i] when y is drawn from Uniform(0,1); you shouldn't be surprised to get a vector of all 1s.
The rectangle you're using is too short (a height of 1). It should really be 10 tall, since 0.01^(-0.5) = 10 is the maximum value of the function on [0.01, 1].
Then you take the total area of the rectangle and multiply it by the average of s, so the revised code looks like this:
s <- NULL
m<- 100
a<- 0.01
b<- 1
set.seed(5)
x<-runif(m,a,b)
y<-10*runif(m,0,1)
for (i in 1:m) {
  if (y[i] < (x[i]^(-0.5))) {
    s[i] <- 1
  } else {
    s[i] <- 0
  }
}
nn <- sum(s)*(b-a)/m*10  # note the extra factor of 10, the height of the rectangle
print(nn)
I got a result of 1.683, which is a lot closer to the real answer.
Edit: made a superfluous multiplication, answer revised slightly
As user1362215 points out, your function should be contained in the rectangle. You get closer to the solution if you increase n. Here is a vectorised solution; the results are in the right range.
# Hit and miss
f <- function(x) x ^ (-0.5)
n <- 1000000
a <- 0.01
b <- 1
#ceiling(max(f((seq(0.01,1,by=0.001)))))
#[1] 10
set.seed(5)
x <- runif(n,a,b)
y <- 10*runif(n,0,1)
R <- sum(y < f(x))/n
(b-a)*10*R
#[1] 1.805701
# Repeat a few times to look at the distribution
set.seed(5)
n <- 100000
r <- replicate(1000,sum(10*runif(n,0,1) < f(runif(n,a,b)))/n *(b-a)*10)
hist(r)
summary(r)
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# 1.755 1.792 1.800 1.800 1.809 1.845
# Sample mean method for comparison
set.seed(5)
r <- replicate(1000, mean(f(runif(n, a,b)))*(b-a))
hist(r)
summary(r)
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# 1.788 1.798 1.800 1.800 1.803 1.813
Re your edit: I am assuming that by x^2 + y^2, [-1,1] you mean a circle rather than a function f(z), so really this is estimating the area of the unit circle (i.e. Pi) by simulation.
f2 <- function(x) sqrt(1-x^2)
s <- seq(-1 , 1 ,by=0.001)
plot(s,f2(s))
# Get the max value of function within the range
c <- ceiling(max(f2(s)))
# [1] 1
n <- 1000000
a <- -1
b <- 1
set.seed(5)
x <- runif(n,a,b)
y <- c*runif(n,0,1)
R <- sum(y < f2(x))/n
(b-a)*c*R
#[1] 1.57063 # multiply it by 2 to get full area
pi/2
#[1] 1.570796
A Monte Carlo alternative to acceptance/rejection is to uniformly generate x values, average the resulting y = f(x) values to estimate the average height, and multiply that by the interval length to get the estimated area. I don't know R well enough, so here it is in Ruby to illustrate the algorithm:
def f(x)
  x ** -0.5
end
sum = 0.0
10000.times { sum += f(0.01 + 0.99 * rand) }
print (1.0 - 0.01) * (sum / 10000)
I'm getting results in the range 1.8 +/- 0.02
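For reference, roughly the same sample-mean estimator in R (my translation of the Ruby sketch above, not the answerer's code):
f <- function(x) x^(-0.5)
set.seed(5)
mean(f(runif(10000, 0.01, 1))) * (1 - 0.01)
# should land close to 1.8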
You can also improve the precision of your estimator by using antithetic random variates: for each x you generate, also use the x value mirrored about the midpoint of the interval (the median of the x's); a sketch of this idea applied to the original integrand appears at the end of this answer.
Using #user20650's code for guidance for how to do this in R, you can estimate Pi / 2 as follows:
f <- function(x) sqrt(1-x^2)
n <- 100000
a <- -1
b <- 1
range <- b-a
set.seed(5)
r <- replicate(1000, mean(f(runif(n,a,b))) * range)
hist(r)
summary(r)
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# 1.566 1.570 1.571 1.571 1.572 1.575
No bounding function is needed for this approach, and generally it yields greater precision than the acceptance/rejection approach.
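Here is a sketch of the antithetic-variate idea for the original integrand, which is monotone on [0.01, 1], so the pairing should help; the exact spreads will vary:
f <- function(x) x^(-0.5)
n <- 100000; a <- 0.01; b <- 1
set.seed(5)
r_plain <- replicate(1000, mean(f(runif(n, a, b))) * (b - a))
set.seed(5)
r_anti <- replicate(1000, {
  u <- runif(n / 2, a, b)                      # half as many draws...
  mean((f(u) + f(a + b - u)) / 2) * (b - a)    # ...each paired with its mirror image
})
c(sd_plain = sd(r_plain), sd_anti = sd(r_anti))
# the antithetic estimator should show the smaller spread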
