How to find all Pythagorean triples under 1000 without control flow functions (for, while, if) in R? - r

Without using any control flow statements (i.e., if, while, for), write an R function that lists all Pythagorean triples {a, b, c} with a < b < c and every value less than 1000. I have no idea how to approach this problem other than knowing that the which function will help. I'm guessing it needs some sort of recursion.

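# Just because I think this is pretty
xx <- (1:999)^2                # squares of every candidate side below 1000
xy <- combn(xx, 2)             # all pairs a^2 < b^2, one pair per column
xz <- rbind(xy, colSums(xy))   # third row holds a^2 + b^2
xp <- xz[, xz[3, ] %in% xx]    # keep columns whose sum is itself a square
sqrt(xp)                       # each column is a triple (a, b, c)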
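Since the question mentions which, here is a minimal sketch of the same idea built on outer() and which(arr.ind = TRUE). The function name pythagorean_triples and its argument n are illustrative and not from the original post; n is the largest side length considered, so n = 999 corresponds to "less than 1000".

```r
# Hypothetical helper: all triples a < b < c <= n with a^2 + b^2 = c^2,
# written without if, for, or while.
pythagorean_triples <- function(n) {
  sq <- (1:n)^2
  # sums[a, b] = a^2 + b^2 for every pair of sides
  sums <- outer(sq, sq, "+")
  # TRUE where the sum is itself one of the squares (so c <= n)
  is_square <- matrix(sums %in% sq, nrow = n)
  # upper.tri() keeps only cells with row < column, i.e. a < b
  idx <- which(is_square & upper.tri(is_square), arr.ind = TRUE)
  cbind(a = idx[, 1], b = idx[, 2], c = sqrt(sums[idx]))
}

triples <- pythagorean_triples(999)
head(triples)
```

This trades memory for simplicity: outer() materialises an n by n matrix, which is fine for n = 999 but would not scale to much larger limits.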

Related

Integration of a function with while loop in R

I want to integrate a function involving a while loop in R. I have pasted an MWE here. Could anyone please advise on how to get rid of the warning messages when integrating such a function?
Thank You
myfun <- function(X, a, b, kmin, kmax){
  term <- 0
  k <- 1
  while(k < kmax | term < 10000){
    term <- term + a * b * X^k
    k <- k+1
  }
  fx <- exp(X) * term
  return(fx)
}
a <- 5
b <- 4
kmax <- 20
integrate(myfun, lower = 0, upper = 10, a = a, b = b, kmax = kmax)
Produces a warning, accessed via warnings():
In while (k < kmax | term < 10000) { ... :
the condition has length > 1 and only the first element will be used
From the integrate() documentation:
f must accept a vector of inputs and produce a vector of function evaluations at those points.
This is the crux of the problem here, which you can see by running myfun(c(1, 2), a, b, kmin, kmax) and reproducing a similar warning. What's happening is that integrate() wants to pass a vector of inputs to myfun in X; this means that inside your while loop, term will become a vector as well. This creates a problem when the while loop kicks back to the evaluation stage, because now the condition k < kmax | term < 10000 has a vector structure as well (since term does), which while doesn't like.
This warning is very good in this case, because it strongly suggests that integrate() isn't doing what you want it to do. Your goal here isn't to get rid of the warning messages; the function as written simply won't work with integrate() due to the while loop structure.
Your choices for how to proceed are to either (1) rewrite the function in a way that doesn't use a while loop, or (2) just hard-code some numeric integration yourself, perhaps with a for loop. The best way to use R is to vectorize everything and to avoid things like while and for when at all possible.
Finally, I'll note that there seems to be a problem with the underlying function itself: myfun(0.5, a, b, kmin, kmax) never terminates. When the supplied X is less than 1 the geometric sum is bounded (for X = 0.5 it approaches a * b * X / (1 - X) = 20), so term < 10000 stays TRUE and the | condition never becomes FALSE. You therefore won't be able to integrate it on the interval [0, 10] no matter what you do.
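As a minimal sketch of option (1), assume the intent was a fixed number of terms k = 1, ..., kmax - 1. The term < 10000 part of the original condition is dropped here, which is an assumption about what the asker wanted. Under that assumption the sum can be built with outer(), so the function accepts a vector X as integrate() requires. The name myfun_vec is illustrative.

```r
# Hypothetical vectorised rewrite: sums a * b * X^k for k = 1, ..., kmax - 1.
# Assumption: the loop was meant to run a fixed number of terms.
myfun_vec <- function(X, a, b, kmax) {
  # One row per element of X, one column per exponent k
  powers <- outer(X, seq_len(kmax - 1), "^")
  term <- a * b * rowSums(powers)
  exp(X) * term
}

# Returns one value per input, so integrate() can call it with a vector
myfun_vec(c(1, 2), a = 5, b = 4, kmax = 20)

res <- integrate(myfun_vec, lower = 0, upper = 10, a = 5, b = 4, kmax = 20)
res
```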

What is going on with floating point precision here?

This question refers to an observation from a code-golf challenge.
The submitted R solution works, but a few of us (maybe just me) are dumbfounded as to why the initial X=m reassignment is necessary.
The code was golfed down a bit by @Giuseppe, so I'll add a few comments for the reader.
function(m){
X=m
# Re-assign input m as X
while(any(X-(X=X%*%m))) 0
# Instead of doing the meat of the calculation in the code block after `while`
# OP exploited its infinite looping properties to perform the
# calculations within the condition check.
# `-` here is an abuse of inequality check and relies on `any` to coerce
# the numeric to logical. See `as.logical(.Machine$double.xmin)`
# The code basically multiplies the matrix `X` with the starting matrix `m`
# Until the condition is met: X == X%*%m
X
# Return result
}
As far as I can tell, multiplying X%*%m is equivalent to X%*%X, since X is just an iteratively self-multiplied version of m. Once the matrix has converged, multiplying by additional copies of m or X does not change its value. See a linear algebra textbook, or try v(m)%*%v(m)%*%v(m)%*%v(m)%*%v(m)%*%m%*%m after defining the above function as v. Fun, right?
So the question is, why does @CodesInChaos's implementation of this idea not work?
function(m){while(any(m!=(m=m%*%m)))0;m}
Is this caused by a floating point precision issue? Or is it caused by a function in the code, such as the inequality check or .Primitive("any")? I do not believe this is caused by as.logical, since R seems to coerce errors smaller than .Machine$double.xmin to 0.
Here is a demonstration of the above. We simply loop and take the difference between m and m%*%m. This error becomes 0 as the stochastic matrix converges. It seems to converge and then eventually blow up to 0 or Inf, depending on the input.
mat = matrix(c(7/10, 4/10, 3/10, 6/10), 2, 2, byrow = T)
m = mat
for (i in 1:25) {
m = m%*%m
cat("Mean Error:", mean(m-(m=m%*%m)),
"\n Float to Logical:", as.logical(m-(m=m%*%m)),
"\n iter", i, "\n")
}
Some additional thoughts on why this is a floating point math issue
1) The loop indicates that this is probably not a problem with any or with any logical check/conversion step, but rather something to do with floating point matrix math.
2) @user202729's comment in the original thread that this issue persists in Jelly, a code golf language, gives more credence to the idea that this is a floating point issue.
The different methods iterate different functions, both starting with seed value m. Function iteration only converges to a given fixed point if that fixed point is stable and the seed is within the basin of attraction of that fixed point.
In the original code, you are iterating the function
f <- function(X) X %*% m
The limit matrix is a stable fixed point under the assumption (stated in the Code Golf problem) that a well-defined limit exists. Since the function definition depends on m, it isn't surprising that the fixed point is a function of m.
On the other hand, the proposed variation using m = m %*% m is obtained by iterating the function
g <- function(X) X %*% X
Note that all idempotent matrices are fixed points of this function, but clearly they can't all be stable fixed points. Apparently, the limiting matrix of the original function f is not a stable fixed point of g (even though it is a fixed point).
To really nail this down, you would need to get into the theory of matrix fixed points under function iteration to show why the fixed point in the case of g is unstable.
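A scalar analogy, offered as a hedged illustration rather than a proof: the limit matrix has an eigenvalue that should be exactly 1, and repeated squaring roughly doubles any rounding error in it at every step. The helper square_n below is hypothetical and only mimics g on a single number.

```r
# Hypothetical helper: apply x <- x * x a total of n times (scalar version of g).
square_n <- function(x, n) {
  for (i in seq_len(n)) x <- x * x
  x
}

eps <- .Machine$double.eps
square_n(1, 70)        # an exact 1 is a fixed point and stays 1
square_n(1 + eps, 70)  # a tiny error above 1 grows until it overflows to Inf
square_n(1 - eps, 70)  # a tiny error below 1 decays until it underflows to 0
```

This matches the "0 or Inf depending on the input" behaviour reported in the question: which way the iteration drifts depends on the sign of the rounding error.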
This is indeed a floating point math issue. To see it, look at the results of this function:
test2 <- function(m) {
  c <- 0
  res <- list()
  while (any(m!=(m=m%*%m))) {
    c <- c + 1
    res[[c]] <- m
  }
  print(c)
  res
}
To test equality with some tolerance, you can use:
test3 <- function(m) {
while (!isTRUE(all.equal(m, m <- m %*% m))) 0
m
}

Urn model in R (trying to calculate probabilities for random events)

I really hope you can help me with a problem I can't solve on my own.
I'm trying to program a basic urn model for a web app. I want to calculate the probabilities of specific random events according to different drawing methods in a model with 2 different colors.
The composition of the urn (red and black balls) is specified in a vector
a <-c(number_red, number_black)
The random event is specified in another vector, let's say
b<-c("red","red","black","red") or any other combination of red and black balls
Now I want to calculate the probability of the event (vector b) when the balls are
1) replaced in the urn, and order does matter
2) NOT replaced in the urn, and order does matter
3) NOT replaced in the urn, and order doesn't matter
4) replaced in the urn, and order doesn't matter
I came up with several different ideas, but none of them really worked.
At first I wrote functions to determine how many different combinations one could draw in each of the scenarios.
stan = function(n,x) {return(n^x)}
perm = function(n, x) {return(factorial(n) / factorial(n-x))}
komb = function(n, x) {return(factorial(n) / (factorial(n-x)*factorial(x)))}
komb2 = function(n, x) {return(factorial(n+x-1) / (factorial(n-1)*factorial(x)))}
But then I didn't really know how to apply them to calculate the final probabilities.
I also experimented with for loops to emulate a tree diagram, but it became too complex for me. For example:
c <- c(number_red/(number_red+number_black), number_black/(number_red+number_black))
b <- c("red","black","red")
b[b=="red"]<-1
b[b=="black"]<-2
b<-as.numeric(b)
vec<-NULL
for (i in b){
vec<-c(vec, c[i])}
prod(vec)
A solution like that gives correct results for problem #1, but I don't really know how to apply it to the other problems, since I would have to find a way to alter vector c according to the composition of vector b each time I run the loop.
Of course I have experimented with different ideas, but none of them really seems to work. I would be very thankful if someone could help me with my problem.
Best,
Henry
Is this correct?
a <- c(red = 5, black = 5)
b <- c("red","red","black","red")
# (1)
prod((a/sum(a))[b])
# (2)
p <- c()
n <- a
for(i in b){
p <- c(n[i] / sum(n), p)
n[i] <- n[i] - 1
}
prod(p)
# (3)
komb <- function(n, x) {
return(factorial(n) / (factorial(n-x)*factorial(x)))
}
n <- table(b)
prod(sapply(names(n), function(i){
komb(a[i], n[i])
})) / komb(sum(a), sum(n))
# (4)
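# With replacement the draws are independent, so any one ordering has the
# probability from (1). If order is ignored, multiply by the number of
# orderings: choose(length(b), sum(b == "red")) * prod((a/sum(a))[b])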
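As a cross-check, and assuming "order doesn't matter" means that only the number of red and black balls drawn counts, cases (3) and (4) correspond to the hypergeometric and binomial distributions, which base R provides as dhyper() and dbinom(). The variable names k_red and n_draws are illustrative.

```r
a <- c(red = 5, black = 5)
b <- c("red", "red", "black", "red")

k_red <- sum(b == "red")   # number of red balls in the event
n_draws <- length(b)       # total number of draws

# (3) without replacement, order ignored: hypergeometric probability
p3 <- dhyper(k_red, a[["red"]], a[["black"]], n_draws)

# (4) with replacement, order ignored: binomial probability
p4 <- dbinom(k_red, n_draws, a[["red"]] / sum(a))

c(without_replacement = p3, with_replacement = p4)
```

For this urn the two values are 5/21 and 1/4, and p3 agrees with the komb-based expression in (3).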

Vectorize function to avoid loop

I'm trying to speed up my code because it's running very long. I already found out where the problem lies. Consider the following example:
x<-c((2+2i),(3+1i),(4+1i),(5+3i),(6+2i),(7+2i))
P<-matrix(c(2,0,0,3),nrow=2)
out<-sum(c(0.5,0.5)%*%mtx.exp(P%*%(matrix(c(x,0,0,x),nrow=2)),5))
I have a vector x of complex values with 12^11 entries, and I want to calculate the sum shown in the third line. (I need the function mtx.exp, from the package Biodem, because this is a power of a complex matrix; I found out that the %^% function does not support complex arguments.)
So my problem is that if I try
sum(c(0.5,0.5)%*%mtx.exp(P%*%(matrix(c(x,0,0,x),nrow=2)),5))
I get an error: "Error in pot %*% pot : non-conformable arguments." So my solution was to use a loop:
tmp<-NULL
for (i in 1:length(x)){
tmp[length(tmp)+1]<-sum(c(0.5,0.5)%*%mtx.exp(P%*%matrix(c(x[i],0,0,x[i]),nrow=2),5))
}
But as I said, this takes very long. Do you have any ideas on how to speed up the code? I also tried sapply, but that takes just as long as the loop.
I hope you can help me, because I have to run this function approximately 500 times, and the first try took more than 3 hours, which is not very satisfying.
Thank you very much
The code can be sped up by pre-allocating your vector,
tmp <- rep(NA,length(x))
but I do not really understand what you are trying to compute: in the first example, you are trying to take the power of a non-square matrix; in the second, you are taking the power of a diagonal matrix (which can be done with ^).
The following seems to be equivalent to your computations:
sum(P^5/2) * x^5
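A minimal sketch to check that equivalence on the example data without the Biodem package. It assumes that mtx.exp(M, 5) computes the 5th matrix power of M; the helper mat_pow is a hypothetical base-R stand-in, not part of the original post.

```r
x <- c(2+2i, 3+1i, 4+1i, 5+3i, 6+2i, 7+2i)
P <- matrix(c(2, 0, 0, 3), nrow = 2)

# Hypothetical stand-in for mtx.exp: k-th matrix power by repeated %*%
mat_pow <- function(M, k) Reduce(`%*%`, replicate(k, M, simplify = FALSE))

# Element-by-element computation, as in the question's loop
loop_result <- vapply(
  x,
  function(u) sum(c(0.5, 0.5) %*% mat_pow(P %*% matrix(c(u, 0, 0, u), nrow = 2), 5)),
  complex(1)
)

# Vectorised closed form, valid here because P is diagonal
closed_form <- sum(P^5 / 2) * x^5

all.equal(loop_result, closed_form)
```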
EDIT
If P is not diagonal and C not scalar, I do not see any easy simplification of mtx.exp( P %*% C, 5 ).
You could try something like
y <- sapply(x, function(u)
  sum(
    c(0.5,0.5)
    %*%
    mtx.exp( P %*% matrix(c(u,0,0,u),nrow=2), 5 )
  )
)
but if your vector really has 12^11 entries, that will take an insanely long time.
Alternatively, since you have a very large number of very small (2*2) matrices, you can explicitly compute the product P %*% C and its 5th power using some computer algebra system (Maxima, Sage, Yacas, Maple, etc.) and use the resulting formulas: these are just (50 lines of) straightforward operations on vectors.
/* Maxima code */
p: matrix([p11,p12], [p21,p22]);
c: matrix([c1,0],[0,c2]);
display2d: false;
factor(p.c . p.c . p.c . p.c . p.c);
I then copy and paste the result in R:
c1 <- dnorm(abs(x),0,1); # C is still a diagonal matrix
c2 <- dnorm(abs(x),1,3);
p11 <- P[1,1]
p12 <- P[1,2]
p21 <- P[2,1]
p22 <- P[2,2]
# Result of the Maxima computations:
# I just add all the elements of the resulting 2*2 matrix,
# but you may want to do something slightly different with them.
c1*(c2^4*p12*p21*p22^3+2*c1*c2^3*p11*p12*p21*p22^2
+2*c1*c2^3*p12^2*p21^2*p22
+3*c1^2*c2^2*p11^2*p12*p21*p22
+3*c1^2*c2^2*p11*p12^2*p21^2
+4*c1^3*c2*p11^3*p12*p21+c1^4*p11^5)
+
c2*p12
*(c2^4*p22^4+c1*c2^3*p11*p22^3+3*c1*c2^3*p12*p21*p22^2
+c1^2*c2^2*p11^2*p22^2+4*c1^2*c2^2*p11*p12*p21*p22
+c1^3*c2*p11^3*p22+c1^2*c2^2*p12^2*p21^2
+3*c1^3*c2*p11^2*p12*p21+c1^4*p11^4)
+
c1*p21
*(c2^4*p22^4+c1*c2^3*p11*p22^3+3*c1*c2^3*p12*p21*p22^2
+c1^2*c2^2*p11^2*p22^2+4*c1^2*c2^2*p11*p12*p21*p22
+c1^3*c2*p11^3*p22+c1^2*c2^2*p12^2*p21^2
+3*c1^3*c2*p11^2*p12*p21+c1^4*p11^4)
+
c2*(c2^4*p22^5+4*c1*c2^3*p12*p21*p22^3
+3*c1^2*c2^2*p11*p12*p21*p22^2
+3*c1^2*c2^2*p12^2*p21^2*p22
+2*c1^3*c2*p11^2*p12*p21*p22
+2*c1^3*c2*p11*p12^2*p21^2+c1^4*p11^3*p12*p21)

Why is nlogn so hard to invert?

Let's say I have a function whose space requirement is n log n, and I want to work out the maximum input size for that function given the available space, i.e. I want to find n where n log n = c.
I followed an approach to calculate n, that looks like this in R:
step = function(R, z) { log(log(R)-z)}
guess = function(R) log(log(R))
inverse_nlogn = function(R, accuracy=1e-10) {
  zi_1 = 0
  z = guess(R)
  while(abs(z - zi_1)>accuracy) {
    zi_1 = z
    z = step(R, z)
  }
  exp(exp(z))
}
But I can't understand why it must be solved iteratively. For the range we are interested in (n > 1), the function is non-singular.
There's nothing special about n log n. Nearly all elementary functions fail to have elementary inverses, and so have to be solved by some other means: bisection, Newton's method, Lagrange inversion theorem, series reversion, Lambert W function...
As Gareth hinted, the Lambert W function (e.g. here) gets you almost there: indeed n = c/W(c), which equals exp(W(c)) because W(c) * exp(W(c)) = c.
A wee google found this, which might be helpful.
Following up (being completely explicit):
library(emdbook)
n <- 2.5
c <- 2.5*log(2.5)
exp(lambertW(c)) ## 2.5
library(gsl)
exp(lambert_W0(c)) ## 2.5
There are probably minor differences in speed, accuracy, etc. of the two implementations. I haven't tested/benchmarked them extensively. (Now that I tried
library(sos)
findFn("lambert W")
I discover that it's implemented all over the place: the games package, and a whole package that's called LambertW ...)
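If installing a package is not an option, a minimal base-R sketch using uniroot() is another of the "other means" mentioned above. The function name inverse_nlogn_root and its arguments are illustrative; it assumes target > 0 and looks for the solution with n > 1.

```r
# Hypothetical helper: solve n * log(n) = target for n > 1 with base R only.
inverse_nlogn_root <- function(target, tol = 1e-12) {
  # n * log(n) is increasing for n > 1 and equals 0 at n = 1.
  # At n = max(e, target) it is already >= target, so adding 1 gives an
  # upper bound where the function is strictly above target.
  upper <- max(exp(1), target) + 1
  uniroot(function(n) n * log(n) - target,
          lower = 1, upper = upper, tol = tol)$root
}

inverse_nlogn_root(2.5 * log(2.5))  # approximately 2.5
```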
