Generate random numbers with 3 to 7 digits in R

How can I generate random numbers of varying length, say between 3 and 7 digits, with each length equally likely?
In the end I would like the code to produce a 3- to 7-digit number (each length with equal probability) whose digits are random numbers between 0 and 9.
I came up with this solution, but I feel it is overly complicated because it requires generating a data frame.
library(dplyr)  # provides sample_n()
options(scipen=999)
t <- as.data.frame(c(1000,10000,100000,1000000,10000000))
round(runif(1, 0,1) * sample_n(t,1, replace = TRUE),0)
Is there a more elegant solution?

Based on the information you provided, I came up with another solution that might be closer to what you want. In the end, it consists of these steps:
randomly pick a number len from [3, 7] determining the length of the output
randomly pick len numbers from [0, 9]
concatenate those numbers
Code to do that:
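(len <- runif(1, 3, 8) %/% 1)   # integer length in 3..7; an upper bound of 7 would never yield 7
(s <- runif(len, 0, 10) %/% 1)  # len digits in 0..9; an upper bound of 9 would never yield 9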
cat(s, sep = "")
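The three steps above can also be sketched with sample.int() and sample(), which draw integers directly and avoid the floor-division trick. This is a minimal sketch, not the answerer's code: the helper name random_digit_string and its arguments are made up for illustration, and it returns a character string so that leading zeros are kept.

```r
# Hypothetical helper (name and arguments are illustrative, not from the answer).
random_digit_string <- function(min_len = 3, max_len = 7) {
  # Pick the length uniformly; sample.int() avoids the 1:x surprise of
  # sample() when min_len == max_len.
  len <- min_len + sample.int(max_len - min_len + 1, 1) - 1
  # Pick len digits from 0..9 with replacement and concatenate them.
  paste(sample(0:9, len, replace = TRUE), collapse = "")
}

set.seed(42)
x <- replicate(1000, random_digit_string())
table(nchar(x))  # counts per length; each of 3..7 should appear about equally often
```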
I previously provided this answer; it does not meet the requirements though, as became clear after OP provided further details.
Doesn't that boil down to generating a random number between 100 and 9999999?
If so, does this do what you want?
runif(5, 100, 9999999) %/% 1
You could probably also use floor(), since the value always has to be rounded down; round() would round to the nearest integer instead.
Output:
[1] 4531543 9411580 2195906 3510185 1129009
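One way to see why this version does not give each length equal probability is to count the digits in a large sample. The sketch below is an illustration only; counting digits via log10 assumes every value is at least 100, which holds for this range.

```r
set.seed(1)
x <- runif(10000, 100, 9999999) %/% 1
digits <- floor(log10(x)) + 1          # number of digits of each value
round(prop.table(table(digits)), 3)    # share of each length
```

Because the values from 1000000 to 9999999 cover about 90% of the interval, roughly nine out of ten draws have 7 digits.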

You could use a vectorized approach, and sample from the allowed range of exponents directly in the exponent:
pick.nums <- function(n){floor(10^(sample(3:7,n,replace = TRUE))*runif(n))}
For example,
> set.seed(123)
> pick.nums(5)
[1] 455 528105 89241 5514350 4566147
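With this function, floor(10^k * runif(n)) yields a number with at most k digits, so shorter values can occur. If exactly k digits are wanted, one possible variant draws uniformly between 10^(k-1) and 10^k instead. This is a sketch under that assumption; the name pick.nums.exact is hypothetical.

```r
# Hypothetical variant of pick.nums(); the name is illustrative.
pick.nums.exact <- function(n) {
  k <- sample(3:7, n, replace = TRUE)            # digit count, each equally likely
  floor(runif(n, min = 10^(k - 1), max = 10^k))  # uniform among the k-digit integers
}

set.seed(123)
x <- pick.nums.exact(1000)
table(floor(log10(x)) + 1)  # digit counts; only 3..7 should occur
```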

Related

How can I use sum function in R?

This is my first post here and I couldn't find the answer I was looking for.
I'm currently taking an edX course on Probability in Data Science, but I got stuck on section 1.
The task asks you to simulate a series of 6 games with random, independent outcomes of either a loss (0) or a win (1), and then use the sum function to determine whether a simulated series contained at least 4 wins.
Here's what I did:
l <- list(0:1)
n <- 6
games <- expand.grid(rep(l, n))
games <- paste (games$Var1, games$Var2, games$Var3, games$Var4, games$Var5, games$Var6)
sample(games, 1, replace = TRUE)
but I can't seem to use the sum function to sum the result of sample and check whether the series has at least 4 wins. I've been trying to use
sum(sample(games, 1, replace = TRUE))
but can't seem to get anywhere with it.
Any light would be greatly appreciated!
Thanks a lot!
This is what one simulated series looks like:
sample(c(0, 1), 6, replace = TRUE)
To count the number of wins (i.e. 1s), you could use sum like this:
sum(sample(c(0, 1), 6, replace = TRUE)) >= 4
Now you could generate such a series n times with replicate:
n <- 1000
replicate(n, sum(sample(c(0, 1), 6, replace = TRUE)) >= 4)
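If you have to use games for the calculation, you can apply rowSums to the data frame returned by expand.grid (before the paste step) to count the number of 1s in each series, and then sum to count the series with at least 4 wins: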
sum(rowSums(games) >= 4)
#[1] 22
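To turn the replicate call into a probability, one can take the mean of the logical results. The sketch below assumes each game is won with probability 0.5 (which is what sample(c(0, 1), ...) implies) and compares the estimate with the exact binomial value; the variable names are illustrative.

```r
set.seed(1)
n <- 10000
# TRUE when a simulated 6-game series has at least 4 wins
results <- replicate(n, sum(sample(c(0, 1), 6, replace = TRUE)) >= 4)
estimate <- mean(results)                        # Monte Carlo estimate
exact <- sum(dbinom(4:6, size = 6, prob = 0.5))  # 22/64 = 0.34375
c(estimate = estimate, exact = exact)
```

The exact value 22/64 matches the 22 qualifying rows out of the 64 combinations counted with rowSums above.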

sampling bug in R? [duplicate]

I am trying to sample one element out of a numeric vector.
When the length of the vector is greater than 1, the result is one of the numbers in the vector, as expected. However, when the vector contains a single element, it samples a number between 0 and that single number.
For example:
sample(c(100, 1000), 1)
results in either 100 or 1000, however
sample(c(100), 1)
results in different numbers smaller than 100.
What is going on?
Have a look at the Details of the sample function:
"If x has length 1, is numeric (in the sense of is.numeric) and x >= 1, sampling via sample takes place from 1:x"
This is (unfortunately) expected behavior. See ?sample. The first line of the Details section:
If x has length 1, is numeric (in the sense of is.numeric) and x >= 1, sampling via sample takes place from 1:x. Note that this convenience feature may lead to undesired behaviour when x is of varying length in calls such as sample(x). See the examples.
Luckily the Examples section provides a suggested fix:
# sample()'s surprise -- example
x <- 1:10
sample(x[x > 8]) # length 2
sample(x[x > 9]) # oops -- length 10!
sample(x[x > 10]) # length 0
## safer version:
resample <- function(x, ...) x[sample.int(length(x), ...)]
resample(x[x > 8]) # length 2
resample(x[x > 9]) # length 1
resample(x[x > 10]) # length 0
You could, of course, also just use an if statement:
sampled_x = if (length(my_x) == 1) my_x else sample(my_x, size = 1)
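Applied to the vectors from the question, the resample helper from ?sample behaves the way the asker expected. This is a small check rather than new functionality; the variable names one and two are illustrative.

```r
# Helper copied from the Examples section of ?sample
resample <- function(x, ...) x[sample.int(length(x), ...)]

set.seed(1)
one <- replicate(20, resample(c(100), 1))        # always 100
two <- replicate(20, resample(c(100, 1000), 1))  # either 100 or 1000
unique(one)
sort(unique(two))
```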

How to generate random 64-digit hexadecimal number in R?

For a blockchain application, I need to generate random 64-digit hexadecimal numbers in R.
I thought that, given the limits of how computers represent numbers, obtaining such a 64-digit hexadecimal number in one step would be cumbersome, perhaps impossible.
So I thought I should produce random hexadecimal numbers with fewer digits and concatenate them to obtain a random 64-digit hexadecimal number.
I am close to a solution:
library(fBasics)
.dec.to.hex(abs(ceiling(rnorm(1) * 1e6)))
produces random hexadecimal numbers. The problem is that in some instances I get a 6-digit hexadecimal number and in others a 7-digit one. Hence, fixing this became the first priority.
Any ideas?
You can simply sample each digit and paste them together.
set.seed(123)
paste0(sample(c(0:9, LETTERS[1:6]), 64, replace = TRUE), collapse = '')
## [1] "4C6EF08E87F7A91E305FEBAFAB8942FEBC07C353266522374D07C1832CE5A164"
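Since 64 hexadecimal digits correspond to 32 bytes, an equivalent sketch draws 32 byte values and formats each one as two hex digits. This is an illustrative variant, not part of the answer above; the variable names are made up.

```r
set.seed(123)
bytes <- sample(0:255, 32, replace = TRUE)           # 32 random byte values
hex <- paste(sprintf("%02X", bytes), collapse = "")  # two uppercase hex digits per byte
nchar(hex)  # 64
```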
The maximum argument of .dec.to.hex() is just below 2^31, i.e. .dec.to.hex(2^30.99999....9).
So the question reduces to: which power of 10 is 2^30.99999 = 2147468763?
2147468763 = 2.147468763e9
Since 1e9 < 2.147468763e9, that is the 9th power. But rnorm(1) may produce a value greater than 5, so for safety use the 8th power: .dec.to.hex(abs(ceiling(rnorm(1) * 1e8))) gives 7 or 8 hex digits, and 10 such pieces give at least 10*7 = 70 >= 64 digits.
library(fBasics)
strtrim(paste(sapply(1:10, function(i) .dec.to.hex(abs(ceiling(rnorm(1) * 1e8)))), collapse=""), 64)
# 0397601803C22E220509810703BDE2300460EA80322F000CF50ABD0226F27009
10 iterations instead of 11; hence, slightly fewer operations!
nchar(strtrim(paste(sapply(1:10, function(i) .dec.to.hex(abs(ceiling(rnorm(1) * 1e8)))), collapse=""), 64))
# 64
library(fBasics)
strtrim(paste(sapply(1:11, function(i) .dec.to.hex(abs(ceiling(rnorm(1) * 1e6)))), collapse=""), 64)
# 08FBFA019B4930E2AF707AFEE08A0F90D765E05757607609B0691190FC54E012
Let's check:
nchar(strtrim(paste(sapply(1:11, function(i) .dec.to.hex(abs(ceiling(rnorm(1) * 1e6)))), collapse=""), 64)) # 64

Check if numbers in a vector are alternating in R

I need to check whether the first number of a vector is smaller than the second, the second is greater than the third, and so on. So far I can calculate the differences between the numbers of a vector like this:
n <- sample(3) # suppose n is 1 3 2
diff(n) # outputs 2 -1
I need to check whether the first difference is positive, the second negative, etc. The problem is that I need the program to do this for a vector of length n. How can I implement this?
Since it is not very clear what I am trying to do here, I will give a better example:
Let v be the vector c(1,2,4,3).
I need to check whether the first number of the vector is smaller than the second, the second greater than the third, and the third smaller than the fourth.
So I need to check whether 1 < 2 > 4 < 3. (This vector wouldn't meet the requirements.) Every number I get will be > 0 and is guaranteed to appear only once.
This process needs to be generalized to a given n, which is a natural number > 0.
v <- c(1, 2, 4, 3)
all(sign(diff(v)) == c(1, -1))
# [1] FALSE
# Warning message:
# In sign(diff(v)) == c(1, -1) :
# longer object length is not a multiple of shorter object length
We can safely ignore the warning message, since we make deliberate use of "recycling" (which means c(1, -1) is implicitly repeated to match the length of sign(diff(v))).
Edit: taking @digEmAll's comment into account, if you want to allow the sequence to start with a negative difference rather than a positive one, then this naive change should do it:
diffs <- sign(diff(v))
all(diffs == c(1, -1)) || all(diffs == c(-1, 1))
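If the recycling warning is unwelcome, one option is to build the expected sign pattern at exactly the right length with rep_len(). The helper below is a sketch; its name is_alternating and the starts_up argument are hypothetical, not part of the answer.

```r
# Hypothetical helper; starts_up = TRUE expects v[1] < v[2] > v[3] < ...
is_alternating <- function(v, starts_up = TRUE) {
  d <- sign(diff(v))
  pattern <- if (starts_up) c(1, -1) else c(-1, 1)
  # rep_len() repeats the pattern to exactly length(d), so no recycling warning
  all(d == rep_len(pattern, length(d)))
}

is_alternating(c(1, 2, 4, 3))  # FALSE
is_alternating(c(1, 3, 2, 4))  # TRUE
```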
If we need to check whether the differences alternate between positive and negative, then
all(rle(as.vector(tapply(n, as.integer(gl(length(n),
2, length(n))), FUN = diff)))$lengths==1)
#[1] TRUE
Also, as @digEmAll commented, here is a variation of my initial response:
all(rle(sign(diff(n)) > 0)$lengths == 1)
data
n <- c(1, 2, 4, 3)

Counting consecutive repeats, and returning the maximum value in each string of repeats if over a threshold

I am working with long strings of repeating 1's and 0's representing the presence of a phenomenon as a function of depth. If this phenomenon is flagged for over 1m, it is deemed significant enough to use for further analyses, if not it could be due to experimental error.
I ultimately need to get a total thickness displaying this phenomenon at each location (if over 1m).
In a dummy data set the input and expected output would look like this:
#Depth from 0m to 10m with 0.5m readings
depth <- seq(0, 10, 0.5)
#Phenomenon found = 1, not = 0
phenomflag <- c(1,0,1,1,1,1,0,0,1,0,1,0,1,0,1,1,1,1,1,0)
What I would like as an output is a vector with: 4, 5 (which gets converted back to 2m and 2.5m)
I have attempted to solve this problem using
y <- rle(phenomflag)
z <- y$lengths[y$values == 1]
but once I have my count, I have no idea how to:
a) Isolate 1 maximum number from each group of consecutive repeats.
b) Restrict to consecutive strings longer than (x) - this might be easier after a.
Thanks in advance.
count posted a good solution in the comments section.
y <- rle(phenomflag)  # phenomflag: the repeating series of 1's and 0's
x <- cbind(y$lengths, y$values); x[which(x[,1] >= 3 & x[,2] == 1)]
This returns only the lengths of runs of 1's that repeat more than a threshold of 2 readings, with one value (the run length) per run.
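The same idea can be carried through to the thickness the question asks for. The sketch below reuses the dummy data from the question and assumes, as the question does, that a run's thickness is its number of readings times the 0.5 m spacing; the variable names are illustrative.

```r
depth_step <- 0.5  # spacing between readings in metres (from the question)
phenomflag <- c(1,0,1,1,1,1,0,0,1,0,1,0,1,0,1,1,1,1,1,0)

runs <- rle(phenomflag)
# Keep runs of 1's with at least 3 consecutive readings (i.e. over 1 m)
keep <- runs$values == 1 & runs$lengths >= 3
run_lengths <- runs$lengths[keep]       # 4 5
thickness <- run_lengths * depth_step   # 2.0 2.5
total_thickness <- sum(thickness)       # 4.5
```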
