Derivative of a set of points - r

So I know you can find the derivative of something like: "x^3-6*x^2" by doing: D(expression(x^3-6*x^2), 'x'), but what if I need to find the first derivative maximum of a list of values such as:
value <- c(610,618,627,632,628,634,634,628,634,642,637,643,653,666,684,717,787,923,1197,1716,2638,4077,5461,7007,8561,9994,11278,12382,13382,14252)
These values are the y coordinates; the x coordinate starts at 1 and increments by 1, i.e. the first point is (1, 610), the second is (2, 618), etc. Thanks.

Consider using the package numDeriv from CRAN. It has a function grad that computes the derivative of a function at a point. Example:
f = function(x) x^3 - 6*x^2
library(numDeriv)
grad(f, 1) #derivative of f at x=1
To solve your problem with a list of values, use a for loop:
xval <- c(YOUR VALUES HERE)
xval.derivatives <- numeric(length(xval)) # preallocated vector to hold the derivatives
for (i in seq_along(xval)) xval.derivatives[i] <- grad(f, xval[i])

The gradient function from the pracma package calculates the derivative from a vector of values.
library(pracma)
value <- c(610,618,627,632,628,634,634,628,634,642,637,643,653,666,684,717,787,923,1197,1716,2638,4077,5461,7007,8561,9994,11278,12382,13382,14252)
value_prime <- pracma::gradient(value, h1 = 1)
plot(value_prime)
Alternatively, fit a spline.
spl <- smooth.spline(1:length(value), y=value)
pred <- predict(spl)
pred.prime <- predict(spl, deriv=1)
plot(pred.prime, type = 'b')
If you are interested in higher derivatives, check the pspline package.
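For the specific question asked (the maximum of the first derivative), plain finite differences may already be enough. The following is a minimal sketch in base R, assuming unit spacing in x as stated in the question; the names d, i_max and max_slope are illustrative only.

```r
value <- c(610,618,627,632,628,634,634,628,634,642,637,643,653,666,684,717,
           787,923,1197,1716,2638,4077,5461,7007,8561,9994,11278,12382,13382,14252)

# Forward differences: d[i] approximates the slope between x = i and x = i + 1
d <- diff(value)

i_max <- which.max(d)   # index of the steepest interval
max_slope <- d[i_max]   # size of the largest step

c(from_x = i_max, to_x = i_max + 1, slope = max_slope)
```

On these data the steepest step lies between x = 24 and x = 25. Unlike the spline approach, this does no smoothing, so noisy data may call for one of the methods above instead.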


Trying to replicate rgeom() function

As an exercise, I'm trying to write a function which replicates the rgeom() function. I want it to have the same arguments and return values. I've started out by using runif to generate a vector with x elements, but I'm not sure how to apply the probability distribution:
rgeometric <- function(x, prob) {
outcomes <- runif(x)
P <- (1 - prob)^length(x) * prob
return (P)
}
Would it be something like the following? How can I check that the distribution is geometric?
set.seed(0)
rgeometric <- function(x, prob) {
outcomes <- runif(x)
P <- (1 - prob)^length(x) * prob
for (i in x) {
x[i] <- x[i]*P
}
return (outcomes)
}
rgeometric(5, 0.4)
We can accomplish this task using Inverse Transform Sampling.
First, let's clear up some of your notation.
In the rgeom() function, we'll want that first argument to be n, an integer vector of length one giving the number of samples to generate:
rgeometric <- function(n, prob) {
u <- runif(n)
## do stuff
}
So how does inverse transform sampling work?
First we generate a vector u of standard uniform deviates, as shown above.
Then, for each element u_i of u, we find the value of the inverse of the cumulative distribution function (CDF) at u_i.
For the geometric distribution, the CDF is 1 - (1 - prob)^(x+1); the inverse of the CDF is ceiling(log(1-u) / log(1-prob)) - 1 (link to derivation, p. 11).
So, we can complete the function like so:
rgeometric <- function(n, prob) {
u <- runif(n)
return(ceiling(log(1-u) / log(1-prob)) - 1)
}
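As a quick sanity check of the inverse transform sampler, one can compare its sample mean and first few relative frequencies with the theoretical values from dgeom(). This is a sketch only; the seed, sample size and the cut-off at 5 are arbitrary choices, not part of the original answer.

```r
rgeometric <- function(n, prob) {
  u <- runif(n)
  ceiling(log(1 - u) / log(1 - prob)) - 1
}

set.seed(42)
p <- 0.25
draws <- rgeometric(1e5, p)

# Theoretical mean of the number of failures before the first success
theory_mean <- (1 - p) / p
c(sample_mean = mean(draws), theory_mean = theory_mean)

# Empirical relative frequencies next to dgeom() for the values 0..5
rbind(empirical   = as.vector(table(factor(draws, levels = 0:5))) / length(draws),
      theoretical = dgeom(0:5, p))
```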
Your last question is how to test whether the resulting samples follow a geometric distribution.
I don't know of a formal test that will help, but we can see it appears to work when we compare the density of 1 million random draws from this custom function to the density of 1 million random draws from base R's rgeom() function:
n <- 1e6
p <- 0.25
set.seed(0)
x <- rgeometric(n, p)
y <- rgeom(n, p)
png("so-answer.png", width = 960)
opar <- par(mfrow = c(1, 2))
plot(density(x), main = "Draws from custom function")
plot(density(y), main = "Draws from base R function")
par(opar)
dev.off()
Note that for the definition of the geometric distribution implemented by R, the random variable is the number of failures before the first success. Therefore you could do:
my_rgeom <- function(n, p){
fun <- function(p){
n <- 0
stopifnot(p>0)
while(runif(1)>p) n <- n+1
n
}
replicate(n, fun(p))
}
Now test the function:
n <- 100000
p <- 0.25
X <- rgeom(n, p)
Y <- my_rgeom(n, p)
You can run ks.test on X and Y, although that test is meant for continuous variables. A better choice is chisq.test to check whether the two samples are similar.
Lastly, we could use graphical methods, e.g. superimposed bar plots:
barplot(table(X), col = rgb(0.5, 1, 0.5, 0.4))
barplot(table(Y), add = TRUE, col = rgb(1, 0.5, 0, 0.3))
The two bar plots come out nearly identical.
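The chisq.test suggestion can be sketched as a goodness-of-fit test against the theoretical probabilities from dgeom(). This is one possible setup under stated assumptions: draws is a stand-in sample generated with rgeom(), and k_max is an arbitrary cut-off used to pool the sparse upper tail into a single cell.

```r
set.seed(1)
n <- 100000
p <- 0.25
draws <- rgeom(n, p)  # stand-in sample; substitute draws from your own sampler

k_max <- 15           # pool everything >= k_max into one tail cell
observed <- table(factor(pmin(draws, k_max), levels = 0:k_max))

# Cell probabilities: P(X = 0), ..., P(X = k_max - 1), then P(X >= k_max)
expected_prob <- c(dgeom(0:(k_max - 1), p),
                   pgeom(k_max - 1, p, lower.tail = FALSE))

res <- chisq.test(as.vector(observed), p = expected_prob)
res$p.value
```

A small p-value would suggest the sample does not follow the geometric distribution with the given prob; a large one only means the test found no evidence against it.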

How to solve an objective function having an exponential term with a different base in CVXR?

I am using CVXR to solve a concave objective function. The decision variable (x) is one-dimensional and the objective function is the summation of 2 logarithmic terms in which the second term is exponential with different bases of “a and b” (e.g., a^x, b^x); “a and b” are constants.
My full objective function is:
(-x*sum(ln(y))) + ln((1-x)/((a^(1-x))-(b^(1-x))))
where y is a given 1-D vector of data.
When I add the second term having (a^x and b^x) to the objective function, I keep getting
Error in a^(1 - x): non-numeric argument to binary operator
Is there any atom function in CVXR that can be used to code constant^x?
Here is my code:
library(CVXR)
a <- 7
b <- 0.3
M=1000
x_i # is a given vector of 1-D data
x <- Variable(1)
nominator <- (1-x)
denominator <- (1/((a^(1-x))-(b^(1-x))))
obj <- (-x*sum(log(x_i)) + M*log(nominator/denominator)) # change M to the length of x_i later
constr <- list(x>0)
prob <- Problem(Maximize(obj), constr)
result <- solve(prob)
alpha_hat <- result$getValue(x)
Please tell me what I am doing wrong. I appreciate your help in advance.
Do some math first. Any positive constant can be rewritten with base e:
2 = e^(log 2)
2^x = (e^(log 2))^x = e^(x * log 2)
So you can try:
denominator <- 1/(exp(log(a)*(1-x)) - exp(log(b)*(1-x)))
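The identity behind this rewrite can be checked numerically in plain R, outside CVXR. The grid of x values below is an arbitrary choice for illustration; the check says nothing about whether CVXR accepts the resulting objective as a valid convex problem.

```r
a <- 7
b <- 0.3
x <- seq(-2, 2, by = 0.5)  # arbitrary test points

# a^x and exp(log(a) * x) agree up to floating-point error
all.equal(a^x, exp(log(a) * x))
all.equal(b^(1 - x), exp(log(b) * (1 - x)))
```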

integrating the square of probability density?

Suppose I have
set.seed(2020) # make the results reproducible
a <- rnorm(100, 0, 1)
My probability density is estimated through kernel density estimator (gaussian) in R using the R built in function density. The question is how to integrate the square of the estimated density. It does not matter between which values, let us suppose between -Inf and +Inf. I have tried the following:
f <- approxfun(density(a)$x, density(a)$y)
integrate (f*f, min(density(a)$x), max(density(a)$x))
There are a couple of problems here. First you have the x and y round the wrong way in approxfun. Secondly, you can't multiply function names together. You need to specify a new function that gives you the square of your original function:
set.seed(2020)
a <- rnorm(100, 0, 1)
f <- approxfun(density(a)$x, density(a)$y)
f2 <- function(v) ifelse(is.na(f(v)), 0, f(v)^2)
integrate (f2, -Inf, Inf)
#> 0.2591153 with absolute error < 0.00011
We can also plot the original density function and the squared density function:
curve(f, -3, 3)
curve(f2, -3, 3, add = TRUE, col = "red")
I think you should write the objective function as function(x) f(x)**2, rather than f*f, e.g.,
> integrate (function(x) f(x)**2, min(density(a)$x), max(density(a)$x))
0.2331793 with absolute error < 6.6e-06
Here is a way using package caTools, function trapz. It computes the integral given a vector x and its corresponding image y using the trapezoidal rule.
I also include a function trapzf based on the original to have the integral computed with the function returned by approxfun
library(caTools)
trapzf <- function(x, FUN) trapz(x, FUN(x))
set.seed(2020) # make the results reproducible
a <- rnorm(100, 0, 1)
d <- density(a)
f <- approxfun(d$x, d$y)
int1 <- trapz(d$x, d$y^2)
int2 <- trapzf(d$x, function(x) f(x)^2)
int1
#[1] 0.2591226
identical(int1, int2)
#[1] TRUE
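As a reference point for the estimates above: the sample was drawn from a standard normal, whose squared density integrates to 1 / (2 * sqrt(pi)), roughly 0.282. A short sketch checking that value numerically (the variable names are illustrative):

```r
# Exact value of the integral of dnorm(x)^2 over the real line
exact <- 1 / (2 * sqrt(pi))

# Numerical integration of the squared true density
numeric_val <- integrate(function(x) dnorm(x)^2, -Inf, Inf)$value

c(exact = exact, numeric = numeric_val)
```

A kernel density estimate is a smoothed version of the data, so its integrated square will generally differ somewhat from this value.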

Is there a way that I can store a polynomial or the coefficients of a polynomial in a single element in an R vector?

I want to create an R function that generates the cyclic finite group F[x].
Basically, I need to find a way to store polynomials, or at the least a polynomial's coefficients, in a single element in an R vector.
For example, if I have a set F={0,1,x,1+x}, I would like to save these four polynomials into an R vector such as
F[1] <- 0 + 0x
F[2] <- 1 + 0x
F[3] <- 0 + x
F[4] <- 1 + x
But I keep getting the error: "number of items to replace is not a multiple of replacement length"
Is there a way that I can at least do something like:
F[1] <- (0,0)
F[2] <- (1,0)
F[3] <- (0,1)
F[4] <- (1,1)
For reference in case anyone is interested in the mathematical problem I am trying to work with, my entire R function so far is
gf <- function(q,p){
### First we need to create an irreducible polynomial of degree p
poly <- polynomial(coef=c(floor(runif(p,min=0,max=q)),1)) #This generates the first polynomial of degree p with coefficients ranging between the integer values of 0,1,...,q
for(i in 1:(q^5*p)){ #we generate/check our polynomial a sufficient amount of times to ensure that we get an irreducible polynomial
poly.x <- as.function(poly) #we coerce the generated polynomial into a function
for(j in 0:q){ #we check if the generated polynomial is irreducible
if(poly.x(j) %% q == 0){ #if we find that a polynomial is reducible, then we generate a new polynomial
poly <- polynomial(coef=c(floor(runif(p,min=0,max=q)),1)) #...and go through the loop again
}
}
}
list(poly.x=poly.x,poly=poly)
### Now, we need to construct the cyclic group F[x] given the irreducible polynomial "poly"
F <- c(rep(0,q^p)) #initialize the vector F
for(j in 0:(q^p-1)){
#F[j] <- polynomial(coef = c(rep(j,p)))
F[j] <- c(rep(0,3))
}
F
}
Make sure F is a list and then use [[]] to place the values
F<-list()
F[[1]] <- c(0,0)
F[[2]] <- c(1,0)
F[[3]] <- c(0,1)
F[[4]] <- c(1,1)
Lists can hold heterogeneous data types. If everything will be a constant and a coefficient for x, then you can also use a matrix. Just set each row value with the [row, col] type subsetting. You will need to initialize the size at the time you create it. It will not grow automatically like a list.
F <- matrix(ncol=2, nrow=4)
F[1, ] <- c(0,0)
F[2, ] <- c(1,0)
F[3, ] <- c(0,1)
F[4, ] <- c(1,1)
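Once the coefficients live in a list, arithmetic on the elements becomes straightforward. The sketch below assumes coefficients are stored lowest degree first and uses a hypothetical helper add_mod (not from any package) to add two polynomials coefficient-wise modulo q:

```r
# Each list element holds one polynomial's coefficients, lowest degree first
elements <- list(c(0, 0), c(1, 0), c(0, 1), c(1, 1))

# Hypothetical helper: coefficient-wise addition modulo q
add_mod <- function(f, g, q) (f + g) %% q

add_mod(elements[[2]], elements[[3]], q = 2)  # 1 + x
add_mod(elements[[4]], elements[[4]], q = 2)  # (1 + x) + (1 + x) = 0 mod 2
```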
You will have to store those as strings, since otherwise R will try to interpret the operators. You can have
F[1] <- "0 + 0x"
Or even a matrix, which is more flexible for apply and other operations you might want to do:
mat <- matrix(c(0,1,0,1,0,0,1,1), ncol=2)

Generating random sample from the quantiles of unknown density in R [duplicate]

This question already has answers here:
How do I best simulate an arbitrary univariate random variate using its probability function?
(4 answers)
Closed 9 years ago.
How can I generate random sample data from the quantiles of the unknown density f(x) for x between 0 and 4 in R?
f = function(x) ((x-1)^2) * exp(-(x^3/3-2*x^2/2+x))
If I understand you correctly, you want to generate random samples from the distribution whose density function is given by f(x). One way to do this is to generate a random sample from a uniform distribution, U[0,1], and then transform this sample to your density. This is done using the inverse CDF of f, a methodology which has been described before, here.
So, let
f(x) = your density function,
F(x) = cdf of f(x), and
F.inv(y) = inverse cdf of f(x).
In R code:
f <- function(x) {((x-1)^2) * exp(-(x^3/3-2*x^2/2+x))}
F <- function(x) {integrate(f,0,x)$value}
F <- Vectorize(F)
F.inv <- function(y){uniroot(function(x){F(x)-y},interval=c(0,10))$root}
F.inv <- Vectorize(F.inv)
x <- seq(0,5,length.out=1000)
y <- seq(0,1,length.out=1000)
par(mfrow=c(1,3))
plot(x,f(x),type="l",main="f(x)")
plot(x,F(x),type="l",main="CDF of f(x)")
plot(y,F.inv(y),type="l",main="Inverse CDF of f(x)")
In the code above, since f(x) is only defined on [0,Inf], we calculate F(x) as the integral of f(x) from 0 to x. Then we invert that using the uniroot(...) function on F-y. The use of Vectorize(...) is needed because, unlike almost all R functions, integrate(...) and uniroot(...) do not operate on vectors. You should look up the help files on these functions for more information.
Now we just generate a random sample X drawn from U[0,1] and transform it with Z = F.inv(X)
X <- runif(1000,0,1) # random sample from U[0,1]
Z <- F.inv(X)
Finally, we demonstrate that Z is indeed distributed as f(x).
par(mfrow=c(1,2))
plot(x,f(x),type="l",main="Density function")
hist(Z, breaks=20, xlim=c(0,5))
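For this particular f, the numerical inversion can apparently be avoided. The exponent g(x) = x^3/3 - x^2 + x has derivative (x-1)^2, so f(x) = g'(x) * exp(-g(x)) and the CDF on [0, Inf) is 1 - exp(-g(x)); since g(x) = ((x-1)^3 + 1)/3, the inverse follows in closed form. This derivation is not part of the original answer, and the names F_closed, F_inv_closed and cbrt are illustrative. A sketch that also checks the result against integrate():

```r
f <- function(x) ((x - 1)^2) * exp(-(x^3 / 3 - 2 * x^2 / 2 + x))

# Closed-form CDF on [0, Inf)
F_closed <- function(x) 1 - exp(-(x^3 / 3 - x^2 + x))

# Real cube root that also handles negative arguments
cbrt <- function(t) sign(t) * abs(t)^(1 / 3)

# Closed-form inverse CDF: solve ((x - 1)^3 + 1) / 3 = -log(1 - u) for x
F_inv_closed <- function(u) 1 + cbrt(-3 * log(1 - u) - 1)

u <- c(0.1, 0.5, 0.9)
all.equal(F_closed(F_inv_closed(u)), u)            # round trip
all.equal(F_closed(2), integrate(f, 0, 2)$value)   # agrees with numerical CDF

set.seed(1)
Z <- F_inv_closed(runif(1000))  # vectorized, no uniroot() needed
```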
Rejection sampling is easy enough:
drawF <- function(n) {
f <- function(x) ((x-1)^2) * exp(-(x^3/3-2*x^2/2+x))
x <- runif(n, 0 ,4)
z <- runif(n)
subset(x, z < f(x)) # Rejection
}
Not the most efficient, but it gets the job done. Note that it returns fewer than n draws, because rejected candidates are dropped.
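If exactly n draws are needed, the rejection step can be repeated until enough candidates have been accepted. The sketch below is one way to do that; draw_n and its arguments are hypothetical names, and f_max = 1 relies on the assumption that f is bounded by 1 on [0, 4] (it equals 1 at x = 0).

```r
f <- function(x) ((x - 1)^2) * exp(-(x^3 / 3 - 2 * x^2 / 2 + x))

# Hypothetical helper: keep proposing until n candidates have been accepted
draw_n <- function(n, lower = 0, upper = 4, f_max = 1) {
  out <- numeric(0)
  while (length(out) < n) {
    x <- runif(n, lower, upper)   # candidate points
    z <- runif(n, 0, f_max)       # uniform heights under the envelope
    out <- c(out, x[z < f(x)])    # accept candidates that fall under f
  }
  out[seq_len(n)]                 # trim any surplus
}

set.seed(1)
samp <- draw_n(500)
hist(samp, breaks = 20)
```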
Use sample. Generate a vector of probabilities from your existing function f, normalized properly. From the help page:
sample(x, size, replace = FALSE, prob = NULL)
Arguments
x Either a vector of one or more elements from which to choose, or a positive integer. See ‘Details.’
n a positive number, the number of items to choose from. See ‘Details.’
size a non-negative integer giving the number of items to choose.
replace Should sampling be with replacement?
prob A vector of probability weights for obtaining the elements of the vector being sampled.
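A minimal sketch of this approach, assuming a fine grid on [0, 4] is an acceptable discretization; the grid resolution and the names grid, w and draws are illustrative choices.

```r
f <- function(x) ((x - 1)^2) * exp(-(x^3 / 3 - 2 * x^2 / 2 + x))

grid <- seq(0, 4, length.out = 4001)  # candidate values
w <- f(grid)                          # unnormalized weights from the density

set.seed(1)
# Draw with replacement, weighting each grid point by its normalized density
draws <- sample(grid, size = 1000, replace = TRUE, prob = w / sum(w))

hist(draws, breaks = 20)
```

The draws can only take values on the grid, so this approximates the continuous distribution rather than sampling from it exactly.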
