What does rpois() do in the following case?

rpois() takes two values (n and lambda) and generates n random numbers from a Poisson distribution.
But what is rpois() doing in the following case?
> n = c(0,1,2,3,4,5,6,7,8,9)
> lamda = 10
> rpois(n, lamda)
[1] 13 15 10 9 10 11 10 10 11 15
>

from the docs:
The length of the result is determined by ‘n’ for ‘rpois’, and is
the maximum of the lengths of the numerical arguments for the
other functions.
It is therefore the same as:
rpois(length(n), lamda)
With a bit more digging, it ends up calling do_random1 in src/main/random.c, which basically says:
if (length(param1) == 1) {
n = as.integer(param1)
} else {
n = length(param1)
}
but in C, and with fiddling to make sure it works with "long" vectors, etc.
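A quick way to check this from the console is to compare the lengths of the results. This is only a sketch: the drawn values are random, so only the lengths are meaningful here.

```r
n <- c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
lamda <- 10   # spelling kept from the question

# A vector of length 10 as the first argument: its length is used, not its values.
length(rpois(n, lamda))          # 10
length(rpois(length(n), lamda))  # 10

# A single number as the first argument: its value is used.
length(rpois(3, lamda))          # 3
```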

Related

Sum of function values in R

I am trying to calculate the following (the image shows f(n) = n \sum_{i=1}^{\infty} c(i) (1 - c(i))^n),
where c(i) is
c <- function(i){1/i^3}
In other words, f(2) is 2*{1^(-3)(1-1^(-3))^2+2^(-3)(1-2^(-3))^2+3^(-3)(1-3^(-3))^2+4^(-3)(1-4^(-3))^2+...}.
How to write such an f function in R?
My initial attempt is:
f <- function(n){n*sum(c*(1-c)^n)}
but this is obviously wrong and fails with the error
Error in 1 - c : non-numeric argument to binary operator
Please let me know if further clarification is needed. Thanks.
Clearly, you can't get an infinite sum unless you tackle it analytically, but since we can see that it's a convergent sum, we could look at, say, the first million like this:
f <- function(n) {
C <- 1 / seq(1e6)^3
n * sum(C * (1 - C)^n)
}
Which allows:
f(1)
#> [1] 0.1847138
f(2)
#> [1] 0.3387583
f(3)
#> [1] 0.4674204
In case you are worried that this is not accurate enough, we get the same result out to 7 digits by summing only the first 10,000 terms, so 1 million should be very close to the converged value.
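The truncation claim can be checked directly. The sketch below uses a hypothetical helper, f_trunc, which is the same f with the number of summed terms exposed as a second argument (that argument is an assumption added for illustration, not part of the answer above).

```r
# f_trunc is a hypothetical helper: f(n) with the series cut off after `terms` terms.
f_trunc <- function(n, terms) {
  C <- 1 / seq_len(terms)^3
  n * sum(C * (1 - C)^n)
}

# Absolute difference between the 10,000-term and 1,000,000-term sums for n = 1, 2, 3.
diffs <- sapply(1:3, function(n) abs(f_trunc(n, 1e6) - f_trunc(n, 1e4)))
all(diffs < 1e-7)
```

The omitted terms behave like 1/i^3, so the tail beyond 10,000 terms is on the order of 1e-8, which is why the two truncations agree to about 7 digits.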

How to generate k random numbers from a hypergeometric distribution?

I am trying to make my own function in order to generate nn random variables from a hypergeometric distribution. I know with rhyper(nn,m,n,k), which is an R function, I can directly do it. However, I want to make my own function named rHypg(nn,m,n,k). I have made the following function:
rHypg <- function(nn, m, n, k) {
x <- seq(nn)
for (i in 1:nn) {
N = n
M = m
for (j in 1:k) {
b <-
sample(0:1,
size = 1,
prob = c(M / (M + N), N / (M + N)))
if (b == 0) {
M <- M - 1
x[i] <- x[i] + 1
} else if (b == 1) {
N <- N - 1
}
}
}
return(x)
}
When I test my function (rHypg) on examples such as rHypg(10, 7, 8, 5) and rHypg(10, 7, 3, 10), I get the following output:
rHypg(10, 7, 8, 5)
## [1] 4 5 6 7 7 9 9 11 11 12
rHypg(10,7, 3, 10)
## [1] 8 9 10 11 12 13 14 15 16 17
The above results are wrong because the generated numbers must be <= m. I think I need to use stop somewhere inside my function to resolve this problem, but I do not know how (I think that, for my function to generate nn random variables from a hypergeometric distribution, it has to stop when x[i] = m)!
Could you please help me correct my code so that it generates nn random variables from a hypergeometric distribution with parameters m, n and k?
[[Hint: Explanations regarding hypergeometric distribution:
nn: number of observations.
m: the number of white balls in the urn.
n: the number of black balls in the urn.
k: the number of balls drawn from the urn, hence must be in 0,1,…, m+n.]]
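One possible fix, offered as a sketch rather than a definitive answer: the printed results look like 1:nn plus a count, which suggests the problem is the initialisation x <- seq(nn) rather than a missing stop. The name rHypg_fixed is hypothetical, and the version below assumes k <= m + n, as stated in the hint.

```r
rHypg_fixed <- function(nn, m, n, k) {
  x <- numeric(nn)                 # counts start at zero (seq(nn) would start them at 1..nn)
  for (i in seq_len(nn)) {
    M <- m                         # white balls left in the urn
    N <- n                         # black balls left in the urn
    for (j in seq_len(k)) {        # seq_len(k) also handles k = 0 correctly
      # draw one ball: 0 = white, 1 = black
      b <- sample(0:1, size = 1, prob = c(M, N) / (M + N))
      if (b == 0) {
        M <- M - 1
        x[i] <- x[i] + 1
      } else {
        N <- N - 1
      }
    }
  }
  x
}

set.seed(1)
res_a <- rHypg_fixed(10, 7, 8, 5)    # every value lies between 0 and 5
res_b <- rHypg_fixed(10, 7, 3, 10)   # all 10 balls are drawn, so every value is 7
```

No explicit stop is needed: once M reaches 0 the probability of drawing a white ball is 0, so x[i] can never exceed m.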

How do I maximize a vector

I'd like to maximize numbers in a vector and get the results as a new vector
Like this:
w <- 0:10
maximizer <- function(w){
max(10, w + 5)
}
I'm expecting getting a vector (10,10,10,10,10,10,11,12,13,14,15), but all I'm getting is 15. I know weird ways of fixing this, but I'm sure there must be an easier way...
Instead of max you should use pmax:
maximizer <- function(w) pmax(10, w + 5)
maximizer(0:10)
# [1] 10 10 10 10 10 10 11 12 13 14 15
since
pmax() and pmin() take one or more vectors as arguments, recycle
them to common length and return a single vector giving the ‘parallel’ maxima (or minima) of the argument vectors.
while
max and min return the maximum or minimum of all the values present in their arguments
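The difference is easy to see side by side. A small sketch using the w from the question:

```r
w <- 0:10

max(10, w + 5)    # 15: a single maximum over all values
pmax(10, w + 5)   # 10 10 10 10 10 10 11 12 13 14 15: element-wise maximum
pmin(10, w + 5)   # 5 6 7 8 9 10 10 10 10 10 10: element-wise minimum
```

In the pmax() and pmin() calls the scalar 10 is recycled to the length of w + 5.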
You could also use a for-loop with ifelse, although this is not as elegant as the pmax solution:
for(i in seq_along(w)){
w[[i]] <- ifelse(w[[i]]+5 < 10, 10, w[[i]] + 5)
}
w
[1] 10 10 10 10 10 10 11 12 13 14 15

Sum of random numbers in R

I am trying to write a program in R to sum n random numbers. However, when I try it for some values of n it does not work.
For example,
## rm(list=ls())
random.sum <- function(n) {
x[1:n] <- ceiling(10*runif(n))
cat("x:", x[1:n], "\n")
return(sum(x))
}
x <- rep(100, 10)
show(random.sum(10))
show(random.sum(5))
when I try to sum 10 random numbers it will give me the correct sum which is
show(random.sum(10))
x: 1 3 10 1 3 2 8 6 7 9
[1] 50
However, when I try it for the next one which is 5, it won't work,
show(random.sum(5))
x: 7 5 6 2 9
[1] 529
I am not sure what I am doing wrong
The easiest way would be something like this (updated as per @Axeman's comment):
sum(sample(1:10, 10, replace = TRUE))
where the second argument (10) is your n and 1:10 defines the range of values to draw from.
Also keep x local to the function:
random.sum <- function(n) {
x <- sample(1:10, n, replace = TRUE)
cat("x:", x, "\n")
return(sum(x))
}
The reason for your error is the variable scoping rules of R. Your variable x in global scope is copied upon modification, but the copy keeps the length of the global declaration. In random.sum(5) only the first 5 elements are overwritten and the other 5 are still 100, so sum(x) returns 29 + 500 = 529. If you sum over only the first n elements with sum(x[1:n]) you will get the correct answer.
Now, that begs the question, are you trying to modify the global object x inside the function? If that is your intent, the superassignment operator <<- can be used. See the R intro section 10.5 "Assignments within functions" for details.
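The copy-on-modify behaviour and the effect of <<- can be seen in a small sketch. The function names f_local and f_global are hypothetical and only serve to illustrate the two cases.

```r
x <- rep(100, 10)

f_local <- function(n) {
  x[1:n] <- 1      # modifies a local copy of x, which still has 10 elements
  sum(x)
}
f_local(5)         # 5 * 1 + 5 * 100 = 505
x                  # the global x still holds ten 100s

f_global <- function(n) {
  x[1:n] <<- 1     # superassignment: modifies the global x itself
  sum(x[1:n])
}
f_global(5)        # 5
sum(x)             # 505, because the global x has now changed
```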

fill up a matrix one random cell at a time

I am filling a 10x10 matrix (mat) randomly until sum(mat) == 100
I wrote the following.... (i = 2 for another reason not specified here but i kept it at 2 to be consistent with my actual code)
mat <- matrix(rep(0, 100), nrow = 10)
mat[1,] <- c(0,0,0,0,0,0,0,0,0,1)
mat[2,] <- c(0,0,0,0,0,0,0,0,1,0)
mat[3,] <- c(0,0,0,0,0,0,0,1,0,0)
mat[4,] <- c(0,0,0,0,0,0,1,0,0,0)
mat[5,] <- c(0,0,0,0,0,1,0,0,0,0)
mat[6,] <- c(0,0,0,0,1,0,0,0,0,0)
mat[7,] <- c(0,0,0,1,0,0,0,0,0,0)
mat[8,] <- c(0,0,1,0,0,0,0,0,0,0)
mat[9,] <- c(0,1,0,0,0,0,0,0,0,0)
mat[10,] <- c(1,0,0,0,0,0,0,0,0,0)
i <- 2
set.seed(129)
while( sum(mat) < 100 ) {
# pick random cell
rnum <- sample( which(mat < 1), 1 )
mat[rnum] <- 1
##
print(paste0("i =", i))
print(paste0("rnum =", rnum))
print(sum(mat))
i = i + 1
}
For some reason, when sum(mat) == 99 there are several extra steps. I would assume that once i = 91 the while loop would stop, but it continues past this. Can someone explain what I have done wrong?
If I change the while condition to
while( sum(mat) < 100 & length(which(mat < 1)) > 0 )
the issue remains..
Your problem is equivalent to randomly ordering the indices of a matrix that are equal to 0. You can do this in one line with sample(which(mat < 1)). I suppose if you wanted to get exactly the same sort of output, you might try something like:
set.seed(144)
idx <- sample(which(mat < 1))
for (i in seq_along(idx)) {
print(paste0("i =", i))
print(paste0("rnum =", idx[i]))
print(sum(mat)+i)
}
# [1] "i =1"
# [1] "rnum =5"
# [1] 11
# [1] "i =2"
# [1] "rnum =70"
# [1] 12
# ...
See ?sample
Arguments:
x: Either a vector of one or more elements from which to choose,
or a positive integer. See ‘Details.’
...
If ‘x’ has length 1, is numeric (in the sense of ‘is.numeric’) and
‘x >= 1’, sampling _via_ ‘sample’ takes place from ‘1:x’. _Note_
that this convenience feature may lead to undesired behaviour when
‘x’ is of varying length in calls such as ‘sample(x)’. See the
examples.
In other words, if x in sample(x) is of length 1, sample returns a random number from 1:x. This happens towards the end of your loop, where there is just one 0 left in your matrix and one index is returned by which(mat < 1).
The iteration repeats at level 99 because sample() behaves very differently depending on whether its first argument has length 1 or length greater than 1. When it has length 1, sample() assumes you want a random number from 1 to that number. When it has length > 1, you get a random element of that vector.
Compare
sample(c(99,100),1)
and
sample(c(100),1)
Of course, this is an inefficient way of filling your matrix. As @josilber pointed out, a single call to sample could do everything you need.
The issue comes from how sample and which do the sampling when you have only a single '0' value left.
For example, do this:
mat <- matrix(rep(1, 100), nrow = 10)
Now you have a matrix of all 1's. Now let's make two numbers 0:
mat[15]<-0
mat[18]<-0
and then sample
sample(which(mat<1))
[1] 18 15
By adding a size=1 argument you get one or the other.
Now let's try this:
mat[18]<-1
sample(which(mat<1))
[1] 3 13 8 2 4 14 11 9 10 5 15 7 1 12 6
Oops, you did not get [1] 15. Instead, what happens is that only a single integer (15 in this case) is passed to sample. When you do sample(x) and x is a single integer, it gives you a sample from 1:x, that is, the integers 1 to x in random order.
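One way to keep the original loop but avoid this trap is to sample an index into the vector instead of sampling the vector itself. The sketch below uses the resample helper given in the examples of ?sample. The matrix is built with diag() as a shorthand for the anti-diagonal setup in the question, and steps is a hypothetical counter added for illustration.

```r
# Helper from the examples in ?sample: index into x rather than passing x
# itself, so a length-one x is never expanded to 1:x.
resample <- function(x, ...) x[sample.int(length(x), ...)]

mat <- diag(10)[, 10:1]      # 10 x 10 matrix with ones on the anti-diagonal
steps <- 0
while (sum(mat) < 100) {
  rnum <- resample(which(mat < 1), 1)  # always one of the remaining zero cells
  mat[rnum] <- 1
  steps <- steps + 1
}
steps                        # 90: each iteration fills exactly one empty cell
```

With this change resample(15, 1) returns 15, whereas sample(15, 1) returns a random number from 1:15.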
