For a blockchain application, I need to generate random 64-digit hexadecimal numbers in R.
I assumed that, given the limits of machine arithmetic, producing such a 64-digit hexadecimal number in one step would be cumbersome, perhaps impossible.
So I decided to generate random hexadecimal numbers with fewer digits and concatenate them to obtain a random 64-digit hexadecimal number.
I am close to a solution:
library(fBasics)
.dec.to.hex(abs(ceiling(rnorm(1) * 1e6)))
produces random hexadecimal numbers. The problem is that it sometimes returns a 6-digit hexadecimal number and sometimes a 7-digit one, so fixing this became the first priority.
Any ideas?
You can simply sample each digit and paste them together.
set.seed(123)
paste0(sample(c(0:9, LETTERS[1:6]), 64, replace = TRUE), collapse = '')
## [1] "4C6EF08E87F7A91E305FEBAFAB8942FEBC07C353266522374D07C1832CE5A164"
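A variation on the same idea, as a minimal sketch: draw fixed-size chunks and zero-pad each one with sprintf, which also takes care of the varying-width problem from the question. The function name rand_hex and the 4-digit chunk size are illustrative choices, not part of the answer above. Keep in mind that R's default generator (Mersenne Twister) is not cryptographically secure, so values meant to act as keys would need a cryptographic source instead.

```r
# Hypothetical helper: n_digits random hex digits built from zero-padded 16-bit chunks.
rand_hex <- function(n_digits = 64) {
  n_chunks <- ceiling(n_digits / 4)                    # each chunk contributes 4 hex digits
  chunks <- sample(0:65535, n_chunks, replace = TRUE)  # uniform integers in 0..65535
  padded <- sprintf("%04X", chunks)                    # "%04X" pads every chunk to 4 digits
  substr(paste0(padded, collapse = ""), 1, n_digits)   # trim if n_digits is not a multiple of 4
}

h <- rand_hex(64)
nchar(h)                     # always 64, regardless of leading zeros
grepl("^[0-9A-F]{64}$", h)   # only hex digits
```

Because every chunk has a fixed width, no trimming of uneven pieces is needed and every digit position is uniform over 0..F.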
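The largest argument .dec.to.hex() accepts is just below 2^31, i.e. .dec.to.hex(2^30.99999....9).
So the question reduces to: which power of 10 fits below 2^30.99999 = 2147468763?
2147468763 = 2.147468763e9
1e9 < 2.147468763e9, so the 9th power would fit. But rnorm(1) may return a value whose absolute value is well above 1 (even ">5"), which would push rnorm(1) * 1e9 over the limit. For safety, use the 8th power: .dec.to.hex(abs(ceiling(rnorm(1) * 1e8))) yields 7 or 8 hex digits, and 10 * 7 >= 64, so 10 pieces are enough.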
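The arithmetic above can be checked with a short sketch. It assumes the effective limit is R's largest integer, .Machine$integer.max (2^31 - 1); that is an assumption about .dec.to.hex(), not something taken from its documentation. The variable names are illustrative.

```r
limit <- .Machine$integer.max   # 2147483647 = 2^31 - 1 (assumed limit of .dec.to.hex)
floor(log10(limit))             # largest power of 10 that stays below the limit

# Probability that |rnorm(1)| * scale exceeds the limit, for a standard normal draw:
p_overflow_1e9 <- 2 * pnorm(-limit / 1e9)   # scale 1e9
p_overflow_1e8 <- 2 * pnorm(-limit / 1e8)   # scale 1e8
c(p_overflow_1e9, p_overflow_1e8)
```

With a scale of 1e9 roughly 3% of the draws would exceed the limit, while with 1e8 the probability is negligible, which is why the 8th power is the safe choice.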
library(fBasics)
strtrim(paste(sapply(1:10, function(i) .dec.to.hex(abs(ceiling(rnorm(1) * 1e8)))), collapse=""), 64)
# 0397601803C22E220509810703BDE2300460EA80322F000CF50ABD0226F27009
This takes 10 iterations instead of the 11 needed with 1e6, hence slightly fewer operations!
nchar(strtrim(paste(sapply(1:10, function(i) .dec.to.hex(abs(ceiling(rnorm(1) * 1e8)))), collapse=""), 64))
# 64
library(fBasics)
strtrim(paste(sapply(1:11, function(i) .dec.to.hex(abs(ceiling(rnorm(1) * 1e6)))), collapse=""), 64)
# 08FBFA019B4930E2AF707AFEE08A0F90D765E05757607609B0691190FC54E012
Let's check:
nchar(strtrim(paste(sapply(1:11, function(i) .dec.to.hex(abs(ceiling(rnorm(1) * 1e6)))), collapse=""), 64)) # 64
I wrote some code in R to reverse a number, but I get Inf as the output.
digit<-512
rev_num<-0
while(digit>0){
rev_num=rev_num*10 + digit %% 10
digit=digit / 10
}
print(paste(rev_num))
Can anyone tell me the error in this code?
A quick fix to your approach is to wrap the division by 10 in floor:
digit<-512
rev_num<-0
while(digit>0){
rev_num=rev_num*10 + digit %% 10
digit= floor(digit / 10)
}
rev_num
#[1] 215
There is also the stri_reverse function in the stringi package:
stringi::stri_reverse(512)
#[1] "215"
You need digit = digit %/% 10 instead of /, where %/% performs integer division. You need integer division because / leaves a fractional part, so your while loop does not stop until digit underflows to zero (after passing the smallest number your machine can represent), while rev_num keeps growing by a factor of about 10 in each iteration until it reaches Inf.
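As a minimal sketch of that fix, the loop can be wrapped in a function that uses %/% throughout. The name reverse_number and the input check are illustrative additions.

```r
# Reverse the decimal digits of a non-negative whole number.
reverse_number <- function(n) {
  stopifnot(n >= 0, n == floor(n))      # only whole, non-negative inputs
  rev_num <- 0
  while (n > 0) {
    rev_num <- rev_num * 10 + n %% 10   # append the last digit of n
    n <- n %/% 10                       # integer division drops that digit
  }
  rev_num
}

reverse_number(512)    # 215
reverse_number(1200)   # 21 (trailing zeros of the input are lost)
```

Because n %/% 10 is always a whole number, n reaches exactly 0 after one iteration per digit and the loop terminates.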
Fix to your code (digit / 10 is not an integer, so digit goes 51.2, then 5.12 and so on, which is why you got Inf as output):
digit<-512
rev_num<-0
while(digit>0){
rev_num=rev_num*10 + digit %% 10
digit=as.integer(digit / 10)
}
print(paste(rev_num))
Another approach to reversing a number:
z <- 4321
as.numeric(paste(rev(strsplit(as.character(z),"")[[1]]),collapse=""))
First, convert the number into a string.
Then use the stri_reverse() function from stringi to reverse it.
Finally, convert the string back into a number:
as.numeric(stringi::stri_reverse(as.character(512)))
Maybe you can try the base R code below, using toString + utf8ToInt + intToUtf8:
digit<-512
rev_num <- as.numeric(intToUtf8(rev(utf8ToInt(toString(digit)))))
I was looking into the RNG of base R and was curious whether the 32-bit implementation of the Mersenne Twister might become a limitation when large numbers of random numbers are needed, so I ran a simple test:
set.seed(8)
length(unique(runif(1e8)))
# [1] 98845641
1e8 - 98845641
# 1154359
So it turns out that there are indeed numerous duplicates among the 100 million draws.
When I switch to the 64-bit version of the MT RNG implemented by the dqrng package, the problem does not appear.
Question 1:
Does the "64-bit" refer to the type of floating point numbers used?
Question 2:
Am I right to conclude that, because of the larger span of possible numbers (64-bit FP vs. 32-bit FP), duplicates are less likely when using the 64-bit MT?
From ?Random:
Do not rely on randomness of low-order bits from RNGs. Most of the supplied uniform generators return 32-bit integer values that are converted to doubles, so they take at most 2^32 distinct values and long runs will return duplicated values.
Indeed, when we calculate the expected number of draws that have a duplicate, we get
M <- 2^32
n <- 1e8
(n * (1 - (1 - 1 / M)^(n - 1))) / 2
# [1] 1150705
which is very close to the result that you have.
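A slightly different way to arrive at a comparable figure, as a sketch: compute the expected number of distinct values among n draws from M equally likely values, M * (1 - (1 - 1/M)^n), and subtract it from n. This matches what 1e8 - length(unique(...)) measures in the question. The function name is illustrative, and the second call uses a hypothetical generator with 2^53 equally likely values purely for comparison; it is not a statement about how dqrng works internally.

```r
# Expected number of draws that repeat an earlier value,
# for n uniform draws from M equally likely values.
expected_duplicates <- function(n, M) {
  # M * (1 - (1 - 1/M)^n), written with log1p/expm1 so that the
  # tiny quantity 1/M is not lost to floating-point rounding.
  expected_distinct <- -M * expm1(n * log1p(-1 / M))
  n - expected_distinct
}

expected_duplicates(1e8, 2^32)   # on the order of a million, as observed
expected_duplicates(1e8, 2^53)   # well below one duplicate
```

The number of expected duplicates scales roughly like n^2 / (2 * M), so enlarging the set of possible outputs is what makes duplicates rare.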
How can I generate random numbers of varying length, say between 3 and 7 digits, with equal probability?
In the end I would like the code to come up with a 3 to 7 digit number (each length with equal probability) consisting of random digits between 0 and 9.
I came up with this solution but feel that it is overly complicated because of the obligatory generation of a data frame.
library(dplyr)  # provides sample_n()
options(scipen=999)
t <- as.data.frame(c(1000,10000,100000,1000000,10000000))
round(runif(1, 0,1) * sample_n(t,1, replace = TRUE),0)
Is there a more elegant solution?
Based on the information you provided, I came up with another solution that might be closer to what you want. In the end, it consists of these steps:
randomly pick a number len from [3, 7] determining the length of the output
randomly pick len numbers from [0, 9]
concatenate those numbers
Code to do that:
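(len <- runif(1, 3, 8) %/% 1)   # whole number in 3..7; the upper bound is 8 so that 7 can occur
(s <- runif(len, 0, 10) %/% 1)  # len digits in 0..9; the upper bound is 10 so that 9 can occur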
cat(s, sep = "")
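The same three steps can also be sketched with sample(), which draws whole numbers directly and returns the result as a string so that leading zeros survive. The function name and its default arguments are illustrative.

```r
# Hypothetical helper: a string of random digits whose length is uniform on min_len..max_len.
random_digit_string <- function(min_len = 3, max_len = 7) {
  # Step 1: pick the length (sample.int avoids the sample(n, 1) pitfall when min_len == max_len)
  len <- min_len + sample.int(max_len - min_len + 1, 1) - 1
  # Step 2: pick len digits from 0..9, each equally likely
  digits <- sample(0:9, len, replace = TRUE)
  # Step 3: concatenate the digits
  paste0(digits, collapse = "")
}

random_digit_string()
table(nchar(replicate(1000, random_digit_string())))  # lengths 3..7 occur about equally often
```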
I previously provided this answer; it does not meet the requirements though, as became clear after OP provided further details.
Doesn't that boil down to generating a random number between 100 and 9999999?
If so, does this do what you want?
runif(5, 100, 9999999) %/% 1
You could probably also use round, but you'd always have to round down.
Output:
[1] 4531543 9411580 2195906 3510185 1129009
You could use a vectorized approach, and sample from the allowed range of exponents directly in the exponent:
pick.nums <- function(n){floor(10^(sample(3:7,n,replace = TRUE))*runif(n))}
For example,
> set.seed(123)
> pick.nums(5)
[1] 455 528105 89241 5514350 4566147
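Note that floor(10^k * runif(n)) can return numbers with fewer than k digits (for example 455 could just as well have been 45). If every number must have exactly the sampled number of digits, one possible variant, sketched here under that assumption, draws uniformly between 10^(k-1) and 10^k instead. The name pick_exact is illustrative.

```r
# Hypothetical variant: n numbers whose digit count is uniform on min_digits..max_digits
# and whose leading digit is never zero.
pick_exact <- function(n, min_digits = 3, max_digits = 7) {
  k <- min_digits + sample.int(max_digits - min_digits + 1, n, replace = TRUE) - 1
  floor(runif(n, 10^(k - 1), 10^k))   # runif is vectorized over its min and max arguments
}

x <- pick_exact(1000)
table(nchar(sprintf("%.0f", x)))      # digit counts 3..7, roughly 200 each
```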
I previously used the prod() function in R to compute products, and it works fine for big numbers. The numbers I have now are very small, like 1.294528e-07, and once I take the product I get 0. How can I get an accurate product with all its decimal places?
This sounds like a job for Rmpfr:
x <- 1.294528e-07;
x^100;
## [1] 0
library('Rmpfr');
mpfr200 <- function(...) mpfr(...,precBits=200);
x <- mpfr200(1.294528e-07);
x^100;
## 1 'mpfr' number of precision 200 bits
## [1] 1.6260909947893069252110098843324246495136887016336218206204661e-689
Or you could try using big rationals from gmp, although that can get unwieldy fast:
library('gmp');
x <- as.bigq(1.294528e-07);
x^100;
## Big Rational ('bigq') :
## [1] 863201774730415899361077141424209396921125795515936990572548766128423810485376236957138978501283812494839104247698243918794530436552717763753923415271480284747830193402522833445368586578235815336302329397252055448087703163456038666698935354732741208048881276922439385153961853551463063865863458552423073323374618023089542368149617965922293453325011050815707644365808660423522776994133587512616070446174486428153909409377712433145387919387175172482342168167092570331925560436171324785600650158865676862026009729048525389272889703709380624434349438465164559591242984791355618727989751127260257309242805499481511142165616784856784942417419338154196561431152388897904866047119736434465720555151366859879507712533271194851766672024261634534292974370183919317337519761869292257947511511111895425764926171263133863980854536246390233199502174749146911367500644673909293464659053254182209374190247093969004854275674922622004129684228283369400286413197699863545912211106461047595912964876198783114172495242447965086668614473659343976994896679029656266765528726921384485164153780083326552118475505412074971594197272427677831237385443579950907872485700396062323506489811292749356021319775368577664875790645937426179486396681942844892307294288187671687510056569029216067321069225537944854772595983467728588640812585079820715315382504185719050646602130250650306723313760231069912835376365077115331890400516502810239814459239282321065702537572103441710647744406489548580900916084596895906189449738524638127337711843685456775272799845630310027842996833372802952394634016929280394482001/530844692884029654158579539028888401092503155945917721978512762884080821588693457768721842842914704953268358354912680177762972130986162082865928013224499827405708984378457675646516394515948417589571682403600729293423488789097840834880700095331766586986757604700102154361325345267069164460328421952490597600602093933885781762810277446794362850352933767869659574290938386894380266121418200629600732379442274304092900828736057
48455349816343081466372681612738552636294666573661997989000563367893746123926316870120929629731301360766711677076606816082939449299471019533119911247114865751561110071173719092050562287666719013887097553109594042589835370732409680749273477741701995190365166750190649349508586781414000112900259875654673888811075100415790235930270448790550846107436360615795146817327563001398966815753145995110673462134196824939359497706430237425390060733252224220979131253874493851823165781457427314551881655553433521397260371393668335053627112038905459972042994824319713601980913921755210218082851146588240382210406887660412630451416112206306502796230074832738884324233514086958873577398927563897852887699678587467916741882087729456997268017154775450070371874680332524775280043635270781581135769020865808796073362505082201497885288236887117727248916243704472117062440908853938749830047482062530807255930310025959255017626077608884577821899189515192765061903944746945679900747890669013446056647522960607623949336195016784356934505954115977283306997415274187752372480232652878520163417566097389917356958213807994947634794209832173114356392822435844256992216702376456519427913660058510779720382422426729352627871147386069533497741201545766859622239986052843328404669795291152396764393702113455439610494964706121896827191760262984304690887475635610694078746800875955908479469409563047706020039725815572087005410536948676710051947334063697608908539017716795313119629533327813992071776512146390147185444968042597376280912548998385218672413833388480083312283356458666501168482077474943858309275329657545594008977132937574085616539588927255406484838471048801662983695044651588178399267847057683205128610922849562650411762874243138435345707386259866513245949164125352989897783775852403441398200260025840304184956928642989258546038224598392605412214974894309376
(Notice the slash at character 1570 into that digit sequence.)
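If only the magnitude of the product is needed rather than every digit, a base-R alternative is to work in log space, sketched below. This is a different technique from the arbitrary-precision approaches above; it assumes all factors are strictly positive, and the variable names are illustrative.

```r
x <- rep(1.294528e-07, 100)     # 100 tiny factors, as in x^100 above

prod(x)                         # underflows to 0 in double precision

# Sum the base-10 logarithms instead of multiplying the values.
log10_prod <- sum(log10(x))
exponent <- floor(log10_prod)             # power of ten of the product
mantissa <- 10^(log10_prod - exponent)    # leading digits, in [1, 10)
sprintf("%.6fe%d", mantissa, exponent)
```

The mantissa and exponent obtained this way agree with the leading digits of the Rmpfr result, but only to roughly double precision.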
Using R, for integers x and y ∈ [1, 1000], calculate how many unique powers x^y exist.
This is what I have right now; I just don't know how to eliminate the duplicate numbers:
x <- 1:1000
y <- 1:1000
for (i in x) {
  for (j in y) {
    print(i^j)
  }
}
A combinatorial approach to this could split the numbers from 1-1000 into equivalence classes, where all numbers in a class are powers of the class's smallest member. For instance, we would split the numbers 1-10 into (1), (2, 4, 8), (3, 9), (5), (6), (7), (10). Powers of values from different equivalence classes never coincide, so we can handle each equivalence class separately.
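To make the equivalence classes concrete, here is a small sketch that builds them with a sieve over the powers of each base. The function name power_classes is illustrative and is not used by the code that follows.

```r
# Group 1..limit into classes of numbers that are powers of the same smallest base.
power_classes <- function(limit) {
  stopifnot(limit >= 2)
  root <- seq_len(limit)            # root[n] = smallest base that has n as a power
  for (base in 2:limit) {
    if (root[base] != base) next    # base is itself a power of a smaller number
    p <- as.numeric(base)^2         # use doubles to avoid integer overflow
    while (p <= limit) {
      root[p] <- base
      p <- p * base
    }
  }
  split(seq_len(limit), root)       # one list element per class, named by its base
}

power_classes(10)
```

For limit = 10 this yields the seven classes listed above, with 4 and 8 grouped under 2 and 9 grouped under 3.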
num.unique.comb <- function(limit) {
  # Count number of powers in each equivalence class (labeled by lowest val)
  num.powers <- rep(0, limit)

  # Handle 1 as special case
  num.powers[1] <- 1

  # Beyond sqrt(limit), all unhandled numbers are in own equivalence class
  handled <- c(TRUE, rep(FALSE, limit - 1))
  for (base in 2:ceiling(sqrt(limit))) {
    if (!handled[base]) {
      # Handle all the values in 1:limit that are powers of base
      num.handle <- floor(log(limit, base))
      handled[base^(1:num.handle)] <- TRUE

      # Compute the powers of base that we cover
      num.powers[base] <- length(unique(as.vector(outer(1:num.handle, 1:limit))))
    }
  }
  num.powers[!handled] <- limit

  # Handle sums too big for standard numeric types
  library(gmp)
  print(sum(as.bigz(num.powers)))
}
num.unique.comb(10)
# [1] 76
num.unique.comb(1000)
# [1] 978318
One nice property of this combinatorial approach is that it's very fast compared to a brute-force approach. For instance, it takes less than 0.1 seconds to compute with limit set to 1000. This allows us to compute the result for much larger values:
# ~0.15 seconds
num.unique.comb(10000)
# [1] 99357483
# ~4 seconds
num.unique.comb(100000)
# [1] 9981335940
# ~220 seconds
num.unique.comb(1000000)
# [1] 999439867182
This is a pretty neat result -- in under 4 minutes we can compute the number of unique values within 1 trillion numbers, where each number can have up to 6 million digits!
Update: Based on this combinatorial code I've updated the OEIS entry for this sequence to include terms up to 10,000.
A brute-force approach would be to just compute all the powers and count the number of unique values:
num.unique.bf <- function(limit) {
  length(unique(as.vector(sapply(1:limit, function(x) x^(1:limit)))))
}
num.unique.bf(10)
# [1] 76
A problem with this brute-force analysis is that you are dealing with large numbers that will create numerical issues. For instance:
1000^1000
# [1] Inf
As a result we get an inaccurate value:
# Wrong due to numerical issues!
num.unique.bf(1000)
# [1] 119117
However, a package like gmp enables us to compute numbers even as large as 1000^1000 exactly. My computer has trouble storing all 1 million numbers in memory at once, so I'll write them to a file (size for n=1000 is 1.2 GB on my computer) and then compute the number of unique values in that file:
library(gmp)
num.unique.bf2 <- function(limit) {
  sink("foo.txt")
  for (x in 1:limit) {
    vals <- as.bigz(x)^(1:limit)
    for (idx in 1:limit) {
      cat(paste0(as.character(vals[idx]), "\n"))
    }
  }
  sink()
  as.numeric(system("sort foo.txt | uniq | wc -l", intern = TRUE))
}
num.unique.bf2(10)
# [1] 76
num.unique.bf2(1000)
# [1] 978318
A quick visit to the OEIS (which lists the first 1000 values of this sequence) shows that this is correct. This approach is rather slow (roughly 40 minutes on my computer), and combinatorial approaches should be significantly faster.
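A middle ground between the two approaches, as a sketch: avoid big numbers entirely by writing every x as root^k, where root is not itself a perfect power, so that x^y = root^(k * y). Two powers are then equal exactly when their (root, k * y) pairs are equal, and the pairs can be deduplicated with plain doubles. The function name count_unique_powers is illustrative; the sketch holds a limit-by-limit matrix in memory, so it is only meant for moderate limits.

```r
# Count distinct values of x^y for x, y in 1..limit without computing any power.
count_unique_powers <- function(limit) {
  stopifnot(limit >= 2)
  root <- seq_len(limit)       # root[x]: smallest base with x = root[x]^expo[x]
  expo <- rep(1L, limit)
  for (base in 2:limit) {
    if (root[base] != base) next          # skip perfect powers of smaller bases
    p <- as.numeric(base)^2
    k <- 2L
    while (p <= limit) {
      root[p] <- base
      expo[p] <- k
      p <- p * base
      k <- k + 1L
    }
  }
  x <- 2:limit
  keys <- outer(expo[x], seq_len(limit))        # k * y for every x (rows) and y (columns)
  codes <- root[x] * (max(keys) + 1) + keys     # encode each (root, k * y) pair as one number
  1 + length(unique(as.vector(codes)))          # the extra 1 accounts for 1^y = 1
}

count_unique_powers(10)     # 76
count_unique_powers(1000)   # 978318
```

Both results match the values obtained above, and the call for limit = 1000 only needs to deduplicate about a million ordinary numbers.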