Running two loops in R for mathematical expression calculation - r

I am trying to write R code to evaluate the expression given in the image. I tried using two loops and the sapply function, but I failed. Can anyone suggest suitable code for calculating this expression?
The code I tried is also shown in the image.

Please see the code below:
gamma <- 1.5
s <- 1
k <- 3
i <- s:k
j <- lapply(i, function(x) 0:x)
prod_i <- sapply(j, function(x) prod(k + gamma - x))
f <- sum(factorial(k) / factorial(k - i) * prod_i)
f
# [1] 637.875
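Since the question asked about two loops, here is a minimal sketch of the same calculation written with explicit nested for loops. It assumes the expression is exactly what the code above computes, namely the sum over i from s to k of k!/(k - i)! times the product over j from 0 to i of (k + gamma - j); the original image is not available, so that reading is an assumption. The name f_loop is only illustrative.

```r
gamma <- 1.5
s <- 1
k <- 3

f_loop <- 0
for (i in s:k) {
  # inner loop: product of (k + gamma - j) for j = 0, ..., i
  prod_i <- 1
  for (j in 0:i) {
    prod_i <- prod_i * (k + gamma - j)
  }
  # outer loop: accumulate the weighted terms
  f_loop <- f_loop + factorial(k) / factorial(k - i) * prod_i
}
f_loop
# [1] 637.875
```

Both versions give the same value; the sapply version is shorter, while the loop version makes the two indices explicit.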

Related

Head/tail breaks classification algorithm in R

I am trying to implement the head/tail breaks classification algorithm in R. This relatively new algorithm is a less computationally expensive alternative to other classification methods used in cartography for highly skewed data.
So far, I have been using a Python implementation as a template, with relative success. Here is my implementation in R:
# fake data to classify
pareto_data <- c()
for (i in 1:100){
pareto_data[i] <- (1.0/i)^1.16
}
# head/tail breaks algorithm
ht <- function(data){
ln <- length(data)
mn <- mean(data)
res <- append(c(),mn) # this is where I was hoping to store my output
head <- subset(data,data>=mn)
while (length(head)>=1 & length(head)/ln <= 0.40){
print(res)
return(ht(head))
}
#return(res)
}
ht(pareto_data)
Running the above code prints the following:
[1] 0.03849691
[1] 0.1779904
[1] 0.4818454
This output is very likely the same as what the original Python code I used as a template produces. However, I have not been able to store it in either a vector or a list.
I would be really thankful for any hints on how to overcome this problem and how to improve my code (which is not exactly the same as the original Python version, particularly in the conditions of the while statement).
A possible recursive version of the algorithm could be the following.
ht_breaks <- function(x){
ht_inner <- function(x, mu){
n <- length(x)
mu <- c(mu, mean(x))
h <- x[x > mean(x)]
if(length(h) > 1 && length(h)/n <= 0.4){
ht_inner(h, mu)
} else mu
}
ht_inner(x, NULL)
}
pareto_data <- (1.0/(1:100))^1.16
ht_breaks(pareto_data)
#[1] 0.03849691 0.17799039 0.48184535
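For comparison, here is a minimal iterative sketch that collects the breaks in a vector without recursion. It uses the same stopping rule as ht_breaks above (stop when the head has at most one element or exceeds 40% of the current data); the function name ht_breaks_iter and the head_limit argument are illustrative, not part of the original answer.

```r
ht_breaks_iter <- function(x, head_limit = 0.4) {
  breaks <- numeric(0)
  repeat {
    m <- mean(x)
    breaks <- c(breaks, m)          # store the current mean as a break
    head_part <- x[x > m]           # values above the mean form the head
    # stop when the head is too small or no longer a minority
    if (length(head_part) <= 1 || length(head_part) / length(x) > head_limit) break
    x <- head_part                  # continue with the head only
  }
  breaks
}

pareto_data <- (1.0 / (1:100))^1.16
cuts <- ht_breaks_iter(pareto_data)
cuts
```

The result is an ordinary numeric vector, so it can be passed directly to functions such as cut() or findInterval().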

R integrate over values

I have been trying to integrate the following function over r in [0,1], but to no avail:
brownian_bridge <- function(r){X[r*(length(X)-1)+1]-r*X[length(X)]}
X is a vector of length 1000, and r is defined as
r=seq(from=0,to=1,length=1000)
Furthermore,
X=cumsum(rnorm(1000,mean=0,sd=sqrt(1/1000)))
Now my question is: How can I integrate brownian_bridge over r in [0,1]? Is there a built-in R function to do this? Standard tools such as integrate2() don't seem to work, because r is a sequence and not a function that holds for all r in [0,1].
Maybe you should use Vectorize first and then apply integrate:
f <- Vectorize(brownian_bridge,"r")
res <- integrate(f,0,1,subdivisions = 1e5)
such that
set.seed(1)
X=cumsum(rnorm(1000,mean=0,sd=sqrt(1/1000)))
brownian_bridge <- function(r){
X[r*(length(X)-1)+1]-r*X[length(X)]
}
f <- Vectorize(brownian_bridge,"r")
res <- integrate(f,0,1,subdivisions = 1e5)
gives
> res
0.2478581 with absolute error < 1e-04
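Because X is only known at the grid points, another option is to integrate the sampled bridge directly with the trapezoidal rule. This is a sketch under the assumption that linear interpolation between grid points is acceptable; the helper trapezoid() is hypothetical and not a base R function. Its result will generally differ slightly from what integrate() reports, since integrate() sees a step function through the truncated index.

```r
# Trapezoidal rule for values y sampled at increasing points x
trapezoid <- function(x, y) {
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

set.seed(1)
X <- cumsum(rnorm(1000, mean = 0, sd = sqrt(1 / 1000)))
r <- seq(from = 0, to = 1, length.out = length(X))

# Bridge evaluated on the grid: X[r * (n - 1) + 1] is simply X here
bridge <- X - r * X[length(X)]

res_trap <- trapezoid(r, bridge)
res_trap
```

By construction the bridge is zero at r = 1, which is a quick sanity check on the indexing.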

I am a beginner in R and I'm trying to solve a system of equations, but when I run it I get an error [duplicate]

# my error : Error in F[1] <- n/(X[0]) - sum(log(1 + Y^exp(X[1] + X[2] * x))) : replacement has length zero
set.seed(16)
#Inverse Transformation on CDF
n=100
SimRRR.f <- function(100, lambda=1,tau)) {
x= rnorm(100,0,1)
tau= exp(-1-x)
u=runif(100)
y= (1/(u^(1/lambda)-1))^(1/tau)
y
}
Y<-((1/u)-1)^exp(-1-x)
# MLE for Simple Linear Regresion
# System of equations
library(rootSolve)
library(nleqslv)
model <- function(X){
F <- numeric(length(X))
F[1] <- n/(X[0])-sum(log(1+Y^exp(X[1]+X[2]*x)))
F[2] <- 2*n -(X[0]+1)*sum(exp(X[1]+X[2]*x))*Y^( exp(X[1]+X[2]*x))*log(Y)/(1+ Y^( exp(X[1]+X[2]*x)))
F[3] <- sum(x) + sum(x*log(Y))*exp(X[1]+X[2]*x) -(X[0]+1)*X[1]*sum(exp(X[1]+X[2]*x)*Y^(exp(X[1]+X[2]*x)*log(Y)))/(1+ Y^( exp(X[1]+X[2]*x)))
# Solution
F
}
startx <- c(0.5,3,1) # start the answer search here
answers<-as.data.frame(nleqslv(startx,model))
The problem is that you define x, u, tau and y inside the SimRRR function, but are trying to define Y in terms of them outside the function.
With a function, you give it input and you get back output. All the other variables defined while the function does its job go away when it returns. As it stands, the line defining Y cannot work, because u and x do not exist outside the function (unless you defined them in the global environment while you were working on your function...)
Try the following functions, see if they do the job:
# I usually put all my library calls together at the beginning of the script.
library(rootSolve)
library(nleqslv)
n <- 100 # sample size, as in the question
x = rnorm(n,0,1) # see below for why this is pulled out.
SimRRR.f <- function(x, lambda=1, tau) { # 100 can't be by itself in the function definition. everything in there needs to be attached to a variable.
n <- length(x)
tau= exp(-1-x) # note: this overwrites whatever tau was passed in
u=runif(n)
# caution: u^(1/lambda) - 1 is negative for u in (0, 1), so this line yields NaN as written; check it against the intended inverse CDF
y= (1/(u^(1/lambda)-1))^(1/tau)
y
}
Y_sim = SimRRR.f(x, lambda = 1, tau = 1) # pick the right tau, it's never defined here.
Your second function has more issues. Namely, it relies on x, which is not defined anywhere that can be found. Either you need x from the previous function, or you really meant X. I'm going to assume you do need the values of x, since X is only of length 3. This is why I pulled it out of the last function call - we need it now.
[Update]
It's also been pointed out in the comments that the indexing here is wrong. I didn't catch that previously (and the F elements are defined correctly). I think I've fixed the indexing issues too now:
model <- function(X, Y, x){ # If you use x and Y in the function, define them here.
n <- length(x)
F <- numeric(length(X))
F[1] <- n/(X[1])-sum(log(1+Y^exp(X[2]+X[3]*x)))
F[2] <- 2*n -(X[1]+1)*sum(exp(X[2]+X[3]*x))*Y^( exp(X[2]+X[3]*x))*log(Y)/(1+ Y^( exp(X[2]+X[3]*x)))
F[3] <- sum(x) + sum(x*log(Y))*exp(X[2]+X[3]*x) -(X[1]+1)*X[2]*sum(exp(X[2]+X[3]*x)*Y^(exp(X[2]+X[3]*x)*log(Y)))/(1+ Y^( exp(X[2]+X[3]*x)))
# Solution
F
}
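The original "replacement has length zero" message comes from the zero index: R vectors are 1-indexed, and X[0] returns an empty vector rather than the first element. A minimal sketch that reproduces the error in isolation (the names out and msg are only illustrative):

```r
X <- c(0.5, 3, 1)

X[0]          # numeric(0): index 0 selects nothing
length(X[0])  # 0
X[1]          # 0.5: the first element

out <- numeric(3)
# Assigning a zero-length value to a single element triggers the error
msg <- tryCatch({
  out[1] <- 100 / X[0]
  "no error"
}, error = function(e) conditionMessage(e))
msg
```

Any arithmetic on X[0] stays zero-length, so the failure only shows up at the assignment, which is why the error points at the F[1] line.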
I'm not familiar with the nleqslv package, but unless there is a method defined to convert it to a data frame, that might not go so well. I'd make sure everything else is working before the conversion.
startx <- c(0.5,3,1) # start the answer search here
answers <- nleqslv(startx,model, Y = Y_sim, x = x)
answer_df <- as.data.frame(answers)

R store expressions in data.frame

I have a matrix of computer-generated equations. Each equation is evaluated to produce a single number. The resulting numeric matrix is then fed to the solve function. To plot results over a variable, this has to be repeated N times.
I want to speed up the calculations by modifying the AST of the equations and simplifying them before evaluating.
The problem I encountered is that I cannot properly store modified equations of type expression or language in a matrix or data.frame. For example:
foo <- data.frame(matrix(expression(NA), nrow = 100, ncol = 100))
# does not work
# apply(foo, MARGIN = c(1,2), function(x) {expression(1+1)})
for (i in c(1:100)) {for (j in c(1:100)) {foo[i,j] <- expression(1+1)}}
The resulting data.frame foo is 3.1Mb even for the shortest expression. Real equations are even bigger and take an awfully long time to subset. Is there a way to store these types efficiently?
The best way to deal with this that I've found so far is to store the expressions in a function. I create a function:
out <- function() {
m <- matrix(NA, nrow = 100, ncol = 100)
}
Then I append expressions to it:
n <- 3
for (i in c(1:100)) {
for (j in c(1:100)) {
body(out)[[n]] <- substitute(m[i, j] <- f+i+j, list(i=i, j=j))
n <- n + 1
}
n <- n + 1
}
Finally, I append the return statement and add arguments to the function:
body(out)[[length(body(out))+1]] <- quote(m)
formals(out) <- alist(f=)
To speed things up for the repeated evaluation of the resulting function, I compile it to bytecode (cmpfun comes from the compiler package):
outc <- compiler::cmpfun(out)
When called, this function returns a numeric matrix of the evaluated equations, so no subsetting is needed. It is really big though. out and outc are both 56Mb.
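Another option, not part of the original post, is a list-matrix: a list with a dim attribute whose cells hold unevaluated calls. The sketch below is a small 2 x 2 illustration using the same f + i + j pattern as above; the names exprs and vals are hypothetical, and no claim is made here about its memory footprint for large real equations.

```r
nr <- 2
nc <- 2
# A matrix of mode "list": each cell can hold a language object
exprs <- matrix(vector("list", nr * nc), nrow = nr, ncol = nc)

for (i in seq_len(nr)) {
  for (j in seq_len(nc)) {
    # substitute() fills in i and j but leaves f as a free symbol
    exprs[[i, j]] <- substitute(f + i + j, list(i = i, j = j))
  }
}

exprs[[2, 1]]  # the stored call: f + 2L + 1L

# Evaluate every cell for a given f and restore the matrix shape
vals <- matrix(
  vapply(exprs, function(e) eval(e, list(f = 10)), numeric(1)),
  nrow = nr
)
vals
```

Cells are accessed with double brackets (exprs[[i, j]]), and each call can still be inspected or rewritten before evaluation.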

Calculating a GP correlation matrix outside of a loop

I am trying to calculate the correlation matrix associated with a Gaussian process using R, and was hoping for suggestions on doing so without the triple for-loop I have written below. Mainly I want to condense the code for readability and also to speed up the calculations.
#Example Data
n = 500
x1 = sample(1:100,n,replace=T)
x2 = sample(1:100,n,replace=T)
x3 = sample(1:100,n,replace=T)
X = cbind(x1,x2,x3)
R = matrix(NA,nrow=n,ncol=n)
for(i in 1:nrow(X)){
for(j in 1:nrow(X)){
temp = 0
for(k in 1:ncol(X)){
temp = -abs(X[i,k]-X[j,k])^1.99 + temp
}
R[i,j] = exp(temp)
}
}
As n gets large, the code gets much slower. Also worth noting: since this is a correlation matrix, it is symmetric and its diagonal is equal to 1.
It's much faster using this:
y <- t(X)
R <- exp(-sapply(1:ncol(y), function(i) colSums((y-y[,i])^2)))
If you want to keep your original formula:
R <- exp(-sapply(1:ncol(y), function(i) colSums(abs(y-y[,i])^1.99)))
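As a further sketch, the same matrix can be built from stats::dist(): the Minkowski distance with power p is (sum |x_i - x_j|^p)^(1/p), so raising it back to the power p recovers the sum in the exponent. The toy data below is hypothetical and drawn from [0, 1] rather than 1:100, purely so that the off-diagonal entries are not all numerically zero when the two results are compared.

```r
set.seed(1)
n <- 50
X <- matrix(runif(n * 3), ncol = 3)  # hypothetical toy data
p <- 1.99

# Pairwise sum of |difference|^p via the Minkowski distance
R_dist <- exp(-as.matrix(dist(X, method = "minkowski", p = p))^p)

# Reference: the straightforward double loop
R_loop <- matrix(NA_real_, n, n)
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    R_loop[i, j] <- exp(-sum(abs(X[i, ] - X[j, ])^p))
  }
}

all.equal(R_loop, R_dist, check.attributes = FALSE)
```

dist() computes only one triangle internally, so it also takes advantage of the symmetry of the matrix.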
I'm wondering if you could cut your calculation and looping times in half by changing these two lines? (In fact the timing improved by more than 50%, from 14.304 secs to 6.234 secs.)
1: for(j in 1:nrow(X)){
2: R[i,j] = exp(temp)
To:
1: for(j in i:nrow(X)){
2: R[i,j] = R[j,i]= exp(temp)
Tested:
> all.equal(R, R2)
[1] TRUE
That way you populate the lower triangle without doing any calculations. BTW, what's with the 1.99? This is perhaps a problem better suited to being written in C or C++. The Rcpp package supports this and there are a lot of worked examples on SO. Perhaps try a search on: [r] rcpp nested loops
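Put together, the two-line change gives a loop like the sketch below. It uses the question's data setup with a smaller n to keep the run short, and replaces the innermost k loop with sum() over the row difference; that last substitution is an extra simplification, not part of the suggestion above.

```r
set.seed(42)
n <- 50
X <- cbind(
  x1 = sample(1:100, n, replace = TRUE),
  x2 = sample(1:100, n, replace = TRUE),
  x3 = sample(1:100, n, replace = TRUE)
)

R2 <- matrix(NA_real_, nrow = n, ncol = n)
for (i in seq_len(n)) {
  for (j in i:n) {                       # upper triangle, including the diagonal
    temp <- -sum(abs(X[i, ] - X[j, ])^1.99)
    R2[i, j] <- R2[j, i] <- exp(temp)    # mirror the value into the lower triangle
  }
}

isSymmetric(R2)
```

Each pair (i, j) is computed once and written to both positions, which is where the roughly halved run time comes from.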
