R - how to define a "symbolic sequence"

Let's say that we have this sequence of numbers:
1, 1/2, 1/3, 1/4, ..., 1/N
Let's say that we define a function which sums all the elements of this sequence:
\sum_{n=1}^{N} 1/n
That is, this function calculates the sum of all 1/n elements from n = 1 to N.
I now want to find the limit of this function when N tends to infinity using R. For this, I was planning to use the lim function from the Ryacas package as suggested in this question.
However, I can't seem to find a way to make my script work. My idea was this:
x <- ysym("x")
sq <- 1:x
fun <- sum(1/sq)
lim(fun, x, Inf)
However, I already get an error at the second step of this process. I can't run the line sq <- 1:x:
Error in 1:x : NA/NaN argument
In addition: Warning message:
In 1:x : numerical expression has 3 elements: only the first used
So it seems that it's not possible to define a "symbolic sequence" (don't know what else I would call it) in this way.
What would be the proper way to calculate what I want?

This series goes to infinity:
library(Ryacas)
yac_str("Sum(n, 1, Infinity, 1/n)")
# "Infinity"
Let's try a convergent series:
yac_str("Sum(n, 1, Infinity, 1/n^2)")
# "Pi^2/6"
If you really want to use Limit:
yac_str("Limit(n, Infinity) Sum(i, 1, n, 1/i^2)")
Use yac_str to execute a Yacas command.
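As a rough numerical cross-check of the two symbolic results above, here is a minimal sketch in plain Python rather than R (the helper name partial_sum and the term counts are illustrative assumptions, not part of Ryacas). It shows the partial sums of 1/n growing like ln(N), while the partial sums of 1/n^2 settle near pi^2/6.

```python
import math

def partial_sum(power, n_terms):
    """Sum of 1/n**power for n = 1..n_terms."""
    return sum(1.0 / n ** power for n in range(1, n_terms + 1))

# Harmonic series: the partial sums keep growing, roughly like ln(N).
harmonic_small = partial_sum(1, 1_000)
harmonic_large = partial_sum(1, 100_000)
growth = harmonic_large - harmonic_small  # close to ln(100000 / 1000) = ln(100)

# 1/n^2 series: the partial sums approach pi^2 / 6.
basel = partial_sum(2, 100_000)

print(growth, math.log(100))
print(basel, math.pi ** 2 / 6)
```

A finite partial sum can only suggest divergence or convergence; the symbolic Sum and Limit calls are what actually establish it.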

Related

Is there a way to optimize the calculation of Bernoulli Log-Likelihoods for many multivariate samples?

I currently have two Torch Tensors, p and x, which both have the shape of (batch_size, input_size).
I would like to calculate the Bernoulli log likelihoods for the given data, and return a tensor of size (batch_size)
Here's an example of what I'd like to do:
I have the formula for log likelihoods of Bernoulli Random variables:
\sum_{i=1}^{d} [ x_i \ln(p_i) + (1 - x_i) \ln(1 - p_i) ]
Say I have p Tensor:
[[0.6 0.4 0], [0.33 0.34 0.33]]
And say I have the x tensor for the binary inputs based on those probabilities:
[[1 1 0], [0 1 1]]
And I want to calculate the log likelihood for every sample, which would result in:
[[ln(0.6)+ln(0.4)], [ln(0.67)+ln(0.34)+ln(0.33)]]
Would it be possible to do this computation without the use of for loops?
I know I could use torch.sum(axis=1) for the final summation of the logs, but can the Bernoulli log-likelihood computation itself be done without for loops, or with at most one? I am trying to vectorize this operation as much as possible. I could've sworn we could use LaTeX for equations before; did something change, or was that another website?
Though not good practice, you can apply the formula directly to the tensors as follows (this works because these are element-wise operations):
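import torch
# Show enough digits to match the printed results below
torch.set_printoptions(precision=10)
# Probabilities, shape (batch_size, input_size)
p = torch.tensor([
    [0.6, 0.4, 0],
    [0.33, 0.34, 0.33]
])
# Binary observations, same shape as p
x = torch.tensor([
    [1., 1, 0],
    [0, 1, 1]
])
eps = 1e-8
# Element-wise log-likelihood terms, summed over the input dimension
bll1 = (x * torch.log(p+eps) + (1-x) * torch.log(1-p+eps)).sum(axis=1)
print(bll1)
#tensor([-1.4271162748, -2.5879497528])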
Note that log(0) evaluates to -inf and 0 * -inf gives nan, so I have introduced a very small constant eps inside each log to avoid this.
A better way to do this is to use BCELoss from PyTorch's nn module.
import torch.nn as nn
bce = nn.BCELoss(reduction='none')
bll2 = -bce(p, x).sum(axis=1)
print(bll2)
#tensor([-1.4271162748, -2.5879497528])
Since PyTorch computes the BCE as a loss, it prepends your formula with a negative sign, which is why the result is negated above. The argument reduction='none' says that I do not want the computed losses to be reduced (averaged/summed) across the batch in any way. This approach is advisable because we do not need to take care of numerical stability manually (such as adding eps above): BCELoss clamps its log outputs to be no smaller than -100, so a log(0) term cannot produce -inf or nan.
You can indeed verify that the two solutions return the same tensor (up to a tolerance):
torch.allclose(bll1, bll2)
# True
or the tensors (without summing each row):
torch.allclose((x * torch.log(p+eps) + (1-x) * torch.log(1-p+eps)), -bce(p, x))
# True
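For an independent reference value, the same formula can be evaluated with nothing but the Python standard library. This is only a sketch for checking the numbers, not a replacement for the vectorized versions above; the function name bernoulli_log_likelihood is made up for illustration, and it assumes the usual convention that a term with zero weight contributes 0 (so 0 * log(0) is treated as 0).

```python
import math

def bernoulli_log_likelihood(p_row, x_row):
    """Sum of x*ln(p) + (1-x)*ln(1-p) over one sample."""
    total = 0.0
    for p, x in zip(p_row, x_row):
        # Skip terms whose weight is zero so that 0 * log(0) counts as 0.
        if x != 0:
            total += x * math.log(p)
        if x != 1:
            total += (1 - x) * math.log(1 - p)
    return total

p = [[0.6, 0.4, 0.0], [0.33, 0.34, 0.33]]
x = [[1, 1, 0], [0, 1, 1]]

log_likelihoods = [bernoulli_log_likelihood(pr, xr) for pr, xr in zip(p, x)]
print(log_likelihoods)
```

The two values should agree with the tensors printed above to within float32 precision and the effect of eps.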
Feel free to ask for further clarifications.

Integration of a function with while loop in R

I want to integrate a function involving a while loop in R. I have pasted an MWE here. Could anyone please advise on how to get rid of the warning messages when integrating such a function?
Thank you.
myfun <- function(X, a, b, kmin, kmax){
  term <- 0
  k <- 1
  while(k < kmax | term < 10000){
    term <- term + a * b * X^k
    k <- k+1
  }
  fx <- exp(X) * term
  return(fx)
}
a <- 5
b <- 4
kmax <- 20
integrate(myfun, lower = 0, upper = 10, a = a, b = b, kmax = kmax)
Produces a warning, accessed via warnings():
In while (k < kmax | term < 10000) { ... :
the condition has length > 1 and only the first element will be used
From the integrate() documentation:
f must accept a vector of inputs and produce a vector of function evaluations at those points.
This is the crux of the problem here, which you can see by running myfun(c(1, 2), a, b, kmin, kmax) and reproducing a similar warning. What's happening is that integrate() wants to pass a vector of inputs to myfun in X; this means that inside your while loop, term will become a vector as well. This creates a problem when the while loop kicks back to the evaluation stage, because now the condition k < kmax | term < 10000 has a vector structure as well (since term does), which while doesn't like.
This warning is very good in this case, because it strongly suggests that integrate() isn't doing what you want it to do. Your goal here isn't to get rid of the warning messages; the function as written simply won't work with integrate() due to the while loop structure.
Your choices for how to proceed are to either (1) rewrite the function in a way that doesn't use a while loop, or (2) just hard-code some numeric integration yourself, perhaps with a for loop. The best way to use R is to vectorize everything and to avoid things like while and for when at all possible.
Finally, I'll note that there seems to be a problem with the underlying function: myfun(0.5, a, b, kmin, kmax) never returns. When the supplied X is well below 1, the sum levels off far below 10000, so the condition term < 10000 stays TRUE forever. You therefore won't be able to integrate it on the interval [0, 10] no matter what you do.
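The arithmetic behind that last point can be checked with a small sketch, written here in plain Python rather than R (the helper name geometric_partial_sum is an illustrative assumption). For 0 <= X < 1 the sum of a * b * X^k over k = 1, 2, ... converges to a * b * X / (1 - X), which for a = 5, b = 4 and X = 0.5 is only 20.

```python
def geometric_partial_sum(x, a, b, k_terms):
    """Sum of a*b*x**k for k = 1..k_terms, mirroring the body of the while loop."""
    term = 0.0
    for k in range(1, k_terms + 1):
        term += a * b * x ** k
    return term

a, b = 5, 4
x = 0.5

limit = a * b * x / (1 - x)                    # closed-form limit of the series
partial = geometric_partial_sum(x, a, b, 200)  # 200 terms is already at the limit

print(limit, partial)

# Smallest X for which the limit reaches the 10000 threshold used in the loop
threshold_x = 10000 / (10000 + a * b)
print(threshold_x)
```

Under these assumptions the loop condition can only become FALSE for X above roughly 0.998, so part of the integration interval [0, 10] would never finish evaluating.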

Integrand Syntax in R

I'm trying to complete a straightforward integration, but I'm running into an issue that (I think) is due to the form in which I'm writing the integrand.
Suppose I want to find the area bounded by f(x) = 3x and g(x) = x^2, i.e. the area between the two curves.
Ok, so not a big deal to do analytically: \int_0^3 (3x - x^2) dx = 27/2 - 9 = 9/2.
But I'd like to accomplish this with R, of course.
So I enter my function and there's a problem:
> g <- function(x) {3x-x^2}
Error: unexpected symbol in "g <- function(x) {3x"
This frustrated me, so I started playing around with things. Interestingly, I found that if I factor x out of the integrand, writing it as x(3 - x), everything works smoothly:
> f <- function(x) {x*(3-x)}
> integrate(f, 0, 3)
4.5 with absolute error < 5e-14
My next step was to check ?integrate, part of which is attached below:
integrate(f, lower, upper, ..., subdivisions = 100L,
          rel.tol = .Machine$double.eps^0.25, abs.tol = rel.tol,
          stop.on.error = TRUE, keep.xy = FALSE, aux = NULL)
Arguments
f: an R function taking a numeric first argument and returning a numeric vector of the same length. Returning a non-finite element will generate an error.
lower, upper: the limits of integration. Can be infinite.
Am I somehow not taking a numeric first argument in my first attempt to integrate? Thanks in advance.
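Change 3x to 3*x. R has no implicit multiplication, so 3x is a syntax error raised by the parser before integrate() is ever involved.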
(This may be the smallest answer-length-to-question-length ratio I've seen in a long time ;-)
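As a cross-check of the 4.5 reported by integrate(), here is a minimal sketch in plain Python rather than R, using composite Simpson's rule (a swapped-in technique, not the adaptive quadrature that integrate() uses; the function name simpson is an illustrative assumption). Note the explicit multiplication sign, which Python requires just as R does.

```python
def simpson(f, lower, upper, n=100):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (upper - lower) / n
    total = f(lower) + f(upper)
    for i in range(1, n):
        # Interior points alternate between weight 4 (odd i) and weight 2 (even i)
        total += (4 if i % 2 else 2) * f(lower + i * h)
    return total * h / 3

# Area between 3x and x^2 on [0, 3]
area = simpson(lambda x: 3 * x - x ** 2, 0.0, 3.0)
print(area)
```

Simpson's rule is exact for polynomials up to degree three, so the result should match 9/2 up to floating-point rounding.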

Non-conformable arrays in R

y <- matrix(c(7, 9, -5, 0, 2, 6), ncol = 1)
try <- t(y)
tryy <- try %*% y
i <- solve(tryy)
h <- y %*% i %*% try
uniroot(as.vector(solve(((1-x) * diag(6)) + h)), c(-Inf, Inf))
Error in (1 - x) * diag(6) : non-conformable arrays
The purpose of the command uniroot(as.vector(solve(((1-x) * diag(6)) + h)), c(-Inf, Inf)) is to solve the characteristic equation det[(1-λ)I+h] = 0,
where λ = eigenvalues, I = identity matrix, and h = hat matrix = y(y'y)^(-1)y'.
Here λ is unknown; we have to solve for it.
I don't understand where the problem is. I have tried:
as.vector(solve(6*diag(6)+h))
This does not give the non-conformable error. So why doesn't it work inside the uniroot function?
Your question is a bit confusing, so I have to make a couple of assumptions. If you want the eigenvalues of h, then the characteristic equation is:
det(h - I*λ) = 0
not
det[(1-λ)I+h] = 0
So I used the former.
Given the above, the short answer is: do it this way.
f <- function(lambda) det(h - lambda * diag(6))
F <- Vectorize(f)
library(rootSolve)
uniroot.all(F, c(-1000, 1000), n = 2000)
# [1] 0 1
# or, much more simply
eigen(h)$values
# [1] 1.000000e+00 2.220446e-16 0.000000e+00 -2.731318e-18 -6.876381e-18 -7.365903e-17
So h has two distinct eigenvalues, 0 and 1. Note that the built-in function eigen(...) returns six values, but five of them are zero to within machine tolerance.
The question about why your code fails is a bit more involved.
First, your code:
tryy <- try %*% y
is the dot product of y with itself (so, a scalar), returned as a matrix with one element. When you "invert" that using solve(...)
i <- solve(tryy)
you simply take the reciprocal, so i is also a matrix with 1 element. I'm not sure if this is what you had in mind.
Second, uniroot(...) does not work this way. The first argument must be a function; you've passed an expression which depends on x, which in turn is undefined. You could try:
f <- function(x) det(h-x*diag(6))
uniroot(f,c(-Inf,Inf))
but this wouldn't work either because (a) uniroot(...) works on a finite interval, (b) it requires that the function f(...) have different sign at the ends of the interval, and (c) in any event it would return only one root (the smaller one).
So you could use uniroot.all(...) in package rootSolve. uniroot.all(...) also requires a function as its first argument, but there's a twist: the function must be "vectorized". This means that if you pass a vector of lambda values, f(...) should return a vector of the same length. Fortunately in R there is an easy way to "vectorize" a given function, as in:
F <- Vectorize(f)
Even this has its limits. uniroot.all(...) also requires a finite interval, so we have to guess what that is, and it evaluates F on n sub-intervals. So if your interval does not contain all the roots, or if the sub-intervals are not small enough, you will not find all the roots.
Using the built-in eigen(...) function is definitely the best option.
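The eigenvalues 0 and 1 can also be sanity-checked without any root finding. The sketch below, in plain Python rather than R, builds h = y(y'y)^(-1)y' for the y from the question and checks two properties: h multiplied by itself gives h again (so every eigenvalue must be 0 or 1), and the trace of h is 1 (so the eigenvalue 1 occurs exactly once). The variable names are illustrative assumptions.

```python
y = [7, 9, -5, 0, 2, 6]
n = len(y)

# y'y is a scalar: the dot product of y with itself
yty = sum(v * v for v in y)

# h = y (y'y)^(-1) y' is the 6x6 outer product of y with itself, scaled by 1/yty
h = [[y[i] * y[j] / yty for j in range(n)] for i in range(n)]

# h %*% h, computed entry by entry
hh = [[sum(h[i][k] * h[k][j] for k in range(n)) for j in range(n)]
      for i in range(n)]

# Largest entry-wise difference between h*h and h (zero up to rounding if idempotent)
max_diff = max(abs(hh[i][j] - h[i][j]) for i in range(n) for j in range(n))

# The trace equals the sum of the eigenvalues
trace = sum(h[i][i] for i in range(n))

print(yty, max_diff, trace)
```

This does not compute the eigenvalues themselves; it only confirms that the output of eigen(h) above is what the structure of h implies.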

computing an integral with multiple variables in R

Hi, I have an equation like the following that I want to calculate.
The equation is given by:
In this equation x is an array from 0 to 500.
The value of t is 500, i.e. the upper limit of the integration.
Now I want to compute c as c(500, x).
The code that I have written so far is as follows:
x <- seq(from=0,by=0.5,length=1000)
t=500
integrand <- function(t)t^(-0.5)*exp((-x^2/t)-t)
integrated <- integrate(integrand, lower=0, upper=t)
final <- pi^(-0.5)*exp(2*x)*integrated
The error I get is as follows:
Error in integrate(integrand, lower = 0, upper = t) :
evaluation of function gave a result of wrong length
In addition: Warning messages:
1: In -x^2/t :
longer object length is not a multiple of shorter object length
2: In -x^2/t - t :
longer object length is not a multiple of shorter object length
3: In t^(-0.5) * exp(-x^2/t - t) :
longer object length is not a multiple of shorter object length
But it doesn't work because there is a variable x inside the integrand which is an array. Can anyone suggest how I can compute the integration first and then calculate the total expression for each value of x? If I change x in the integrand to a constant I can compute the integration, but I want to compute it for all the values of x from 0 to 500.
Thank you so much.
Well, here is some code, but it blows up after x = 353:
Cfun <- function(XX, upper) {
  integrand <- function(x) x^(-0.5) * exp((-XX^2 / x) - x)
  integrated <- integrate(integrand, lower = 0, upper = upper)$value
  (final <- pi^(-0.5) * exp(2 * XX) * integrated)
}
sapply(1:400, Cfun, upper = 500)
I'd put the loop over values for x outside the integration. Iterate over the x-values and perform the integration for each one inside. Then you'll have C(x) as a function of x suitable for plotting.
You realize, of course, that the indefinite integral can be evaluated:
http://www.wolframalpha.com/input/?i=integrate+exp%28-%28c%2Bt%5E2%29%2Ft%29%2Fsqrt%28t%29
Maybe that will help you see what the answer looks like before you get started.
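The "loop over x outside, integrate inside" idea can be sketched as follows, in plain Python rather than R. The function name c_value, the substitution t = u^2 (used to remove the singularity at t = 0), and Simpson's rule are all assumptions made for this illustration; the expression being evaluated is the one from the code in the question.

```python
import math

def c_value(x, upper, n=20000):
    """Approximate pi^(-1/2) * exp(2x) * integral_0^upper t^(-1/2) exp(-x^2/t - t) dt."""
    # Substituting t = u^2 turns the integrand into 2 * exp(-x^2/u^2 - u^2),
    # which is well behaved at the lower limit.
    def g(u):
        if u == 0.0:
            return 2.0 if x == 0 else 0.0
        return 2.0 * math.exp(-(x * x) / (u * u) - u * u)

    b = math.sqrt(upper)
    h = b / n  # n must be even for composite Simpson's rule
    total = g(0.0) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i * h)
    integral = total * h / 3
    return math.exp(2 * x) / math.sqrt(math.pi) * integral

# One integration per x value, with the loop over x outside the integral
xs = [0.0, 0.5, 1.0, 2.0]
values = [c_value(x, 500) for x in xs]
print(values)
```

For small x the values should come out close to 1, because over [0, ∞) the integral equals sqrt(pi) * exp(-2x). The same relation hints at why large x is numerically awkward: exp(2x) becomes huge while the integral becomes tiny.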
