I've been trying to code this problem:
https://sg.answers.yahoo.com/question/index?qid=20110127015240AA9RjyZ
I believe there is a R function somewhere to solve for the root values of the following equations:
(x+3)^2 + (y-50)^2 = 1681
(x-11)^2 + (y+2)^2 = 169
(x-13)^2 + (y-34)^2 = 625
I tried using the 'solve' function but they're only for linear equations(?)
Also tried 'nls'
dt = data.frame(a=c(-3,11,13), b = c(50, -2, 34), c = c(1681,169,625))
nls(c~(x-a)^2 + (y-b)^2, data = dt, start = list(x = 1, y = 1))
but getting an error all the time. (and yes I already tried changing the max iteration)
Error in nls(c ~ (x - a)^2 + (y - b)^2, data = dt, start = list(x = 1, :
number of iterations exceeded maximum of 50
how do you solve the root values in R?
nls does not work with zero residual data -- see ?nls where this is mentioned. nlxb in the nlmrt package is mostly similar to nls in terms of input arguments and does support zero residual data. Using dt from the question just replace nls with nlxb:
library(nlmrt)
nlxb(c~(x-a)^2 + (y-b)^2, data = dt, start = list(x = 1, y = 1))
giving:
nlmrt class object: x
residual sumsquares = 2.6535e-20 on 3 observations
after 5 Jacobian and 6 function evaluations
name coeff SE tstat pval gradient JSingval
x 6 7.21e-12 8.322e+11 7.649e-13 -1.594e-09 96.93
y 10 1.864e-12 5.366e+12 1.186e-13 -1.05e-08 22.45
You cannot always solve three equations for two variables.You can solve two equations for two variables and test if the solution satisfies the third equation.
Use package nleqslv as follows.
library(nleqslv)
f1 <- function(z) {
f <- numeric(2)
x <- z[1]
y <- z[2]
f[1] <- (x+3)^2 + (y-50)^2 - 1681
f[2] <- (x-11)^2 + (y+2)^2 - 169
f
}
f2 <- function(z) {
x <- z[1]
y <- z[2]
(x-13)^2 + (y-34)^2 - 625
}
zstart <- c(0,0)
z1 <- nleqslv(zstart,f1)
z1
f2(z1$x)
which gives you the following output:
>z1
$x
[1] 6 10
$fvec
[1] 7.779818e-09 7.779505e-09
$termcd
[1] 1
$message
[1] "Function criterion near zero"
$scalex
[1] 1 1
$nfcnt
[1] 9
$njcnt
[1] 1
$iter
[1] 9
>f2(z1$x)
[1] 5.919242e-08
So a solution has been found and the solution follows from the vector z$x. Inserting z$x in function f2 also gives almost zero.
So a solution has been found.
You could also try package BB.
Just go through rootSolve package and you will be done:
https://cran.r-project.org/web/packages/rootSolve/vignettes/rootSolve.pdf
Related
I am trying to change the value range of a variable (array, set of values) while keeping its properties. I don't know the exact name in math, but I mean such a kind of transformation that the variable array has exactly the same properties, the spacing between the values is the same, but the range is different. Maybe the code below will explain what I mean.
I just want to "linearly transpose" (or smth?) values to some other range and the distribution should remain same. In other words - I'll just change the scope of the variable using the regression equation y = a * x + b. I assume that the transformation will be completely linear, the correlation between the variables is exactly 1, and I calculate new variable (array) from a regression equation, actually a system of equations where I simply substitute the maximum ranges of both variables:
minimum.y1 = minimum.x1 * a + b
maximum.y2 = maximum.x2 * a + b
from which I can work out the following code to obtain a and b coefficients:
# this is my input variable
x <- c(-1, -0.5, 0, 0.5, 1)
# this is the range i want to obtain
y.pred <- c(1,2,3,4,5)
max_y = 5
min_y = 1
min_x = min(x)
max_x = max(x)
c1 = max_x-min_x
c2 = max_y-min_y
a.coeff = c2/c1
b.coeff = a.coeff-min_x
y = x * a.coeff + b.coeff
y
# hey, it works! :)
[1] 1 2 3 4 5
the correlation between the variable before and after the transformation is exactly 1. So we have a basis for further action. Let's get it as a function:
linscale.to.int <- function(max.lengt, vector) {
max_y = max.lengt
min_y = 1
min_x = min(vector)
max_x = max(vector)
c1 = max_x-min_x
c2 = max_y-min_y
a.coeff = c2/c1
b.coeff = a.coeff-min_x
return(vector * a.coeff + b.coeff)
}
x <- c(-1, -0.5, 0, 0.5, 1)
linscale.to.int(5,x)
[1] 1 2 3 4 5
and it works again. But here's the thing: when i aplly this function to random distribution, like this:
x.rand <- rnorm(50)
y.rand <- linscale.to.int(5,x.rand)
plot(x.rand, y.rand)
or better seable this:
x.rand <- rnorm(500)
y.rand <- linscale.to.int(20,x.rand)
plot(x.rand, y.rand)
I get the values of the second variable completely out of range; it should be between 1 and 20 but i get scope of valuest about -1 to 15:
And now the question arises - what am I doing wrong here? Where do I go wrong with such a transformation?
What you are trying to do is very straightforward using rescale from the scales package (which you will already have installed if you have ggplot2 / tidyverse installed). Simply give it the new minimum / maximum values:
x <- c(-1, -0.5, 0, 0.5, 1)
scales::rescale(x, c(1, 5))
#> [1] 1 2 3 4 5
If you want to have your own function written in base R, the following one-liner should do what you want:
linscale_to_int <- function(y, x) (x - min(x)) * (y - 1) / diff(range(x)) + 1
(Note that it is good practice in R to avoid periods in function names because this can cause confusion with S3 method dispatch)
Testing, we have:
x <- c(-1, -0.5, 0, 0.5, 1)
linscale_to_int(5, x)
#> [1] 1 2 3 4 5
x.rand <- rnorm(50)
y.rand <- linscale_to_int(5, x.rand)
plot(x.rand, y.rand)
y.rand <- linscale_to_int(20, x.rand)
plot(x.rand, y.rand)
Created on 2022-08-31 with reprex v2.0.2
I'm using the stats::filter function in R in order to understand ARIMA simulations in R (as in the function stats::arima.sim) and estiamtion. I know that stats::filter applies a linear filter to a vector or time series, but I'm not sure how to "unfilter" my series.
Consider the following example: I want to use a recursive filter with value 0.7 to my series x = 1:5 (which is essentially generating an AR(1) with phi=0.7). I can do so by:
x <- 1:5
ar <-0.7
filt <- filter(x, ar, method="recursive")
filt
Time Series:
Start = 1
End = 5
Frequency = 1
[1] 1.0000 2.7000 4.8900 7.4230 10.1961
Which returns me essentially c(y1,y2,y3,y4,y5) where:
y1 <- x[1]
y2 <- x[2] + ar*y1
y3 <- x[3] + ar*y2
y4 <- x[4] + ar*y3
y5 <- x[5] + ar*y4
Now imagine I have the y = c(y1,y2,y3,y4,y5) series. How can I use the filter function to return me the original series x = 1:5?
I can write a code to do it like:
unfilt <- rep(NA, 5)
unfilt[1] <- filt[1]
for(i in 2:5){
unfilt[i] <- filt[i] - ar*filt[i-1]
}
unfilt
[1] 1 2 3 4 5
But I do want to use the filter function to do so, instead of writing my own function. How can I do so? I tried stats::filter(filt, -ar, method="recursive"), which returns me [1] 1.0000 2.0000 3.4900 4.9800 6.7101 not what I desire.
stats::filter used with the recursive option is a particular case of an ARMA filter.
a[1]*y[n] + a[2]*y[n-1] + … + a[n]*y[1] = b[1]*x[n] + b[2]*x[m-1] + … + b[m]*x[1]
You could implement this filter with the signal package which allows more options than stat::filter :
a = c(1,-ar)
b = 1
filt_Arma <- signal::filter(signal::Arma(b = b, a = a),x)
filt_Arma
# Time Series:
# Start = 1
# End = 5
# Frequency = 1
# [1] 1.0000 2.7000 4.8900 7.4230 10.1961
identical(filt,filt_Arma)
# [1] TRUE
Reverting an ARMA filter can be done by switching b and a, provided that the inverse filter stays stable (which is the case here):
signal::filter(signal::Arma(b = a, a = b),filt)
# Time Series:
# Start = 2
# End = 6
# Frequency = 1
# [1] 1 2 3 4 5
This corresponds to switching numerator and denominator in the z-transform:
Y(z) = a(z)/b(z) X(z)
X(z) = b(z)/a(z) Y(z)
The code below estimates pi in R, now I am trying to find the minimum number of terms N_Min
you would have to include in your estimate of pie to make it accurate to three decimal places.
pi_Est<- function(NTerms){
NTerms = 5 # start with an estimate of just five terms
pi_Est = 0 # initialise the value of pi to zero
Sum_i = NA # initialise the summation variable to null
for(ii in 1:NTerms)
{
Sum_i[ii] = (-1)^(ii+1)/(2*ii - 1) # this is the series equation for calculating pi
}
Sum_i = 4*Sum_i # multiply by four as required in the formula (see lecture notes)
pi_Est = sum(Sum_i)
cat('\nThe estimate of pi with terms = ', NTerms ,' is ',pi_Est)
}
First of all, I would change some things about your function. Instead of getting it to print out a message, get it to return a value. Otherwise it becomes very difficult to do anything with its output, including testing it for convergence to pi.
Also, no matter what the value of NTerms is you feed this function, you are immediately over-writing NTerms inside the function.
You could rewrite the function like this:
pi_Est <- function(NTerms) {
pi_Est <- 0
Sum_i <- numeric()
for(ii in seq(NTerms))
{
Sum_i[ii] <- (-1)^(ii+1)/(2*ii - 1)
}
return(sum(4 * Sum_i))
}
And to show it converges to pi, let's test it with 50,000 terms:
pi_Est(50000)
#> [1] 3.141573
Now, if we want to find the first value of NTerms that is correct to 3 decimal places, we are going to need to be able to call this function on a vector of NTerms - at the moment it is only working on a single number. So let's define the function f that vectorizes pi_Est:
f <- Vectorize(pi_Est)
Now, let's create the estimate for all values of NTerms between 1 and 2,000 and store them in a vector:
estimates <- f(1:2000)
We can see that the values of estimates seem to oscillate round and converge to pi if we plot the first 100 values:
plot(estimates[1:100], type = 'l')
abline(h = pi)
Our answer is just the first value which, when rounded to three decimal places, is the same as pi rounded to three decimal places:
result <- which(round(estimates, 3) == round(pi, 3))[1]
result
#> [1] 1103
And we can check this is correct by feeding 1103 into our original function:
pi_Est(result)
#> [1] 3.142499
You will see that this gives us 3.142, which is the same as pi rounded to 3 decimal places.
Created on 2022-01-31 by the reprex package (v2.0.1)
1000 terms are required to make the estimate accurate to within 0.001:
pi_Est1 <- function(n) {
if (n == 0) return(0)
neg <- 1/seq(3, 2*n + 1, 4)
if (n%%2) neg[length(neg)] <- 0
4*sum(1/seq(1, 2*n, 4) - neg)
}
pi_Est2 <- function(tol) {
for (i in ceiling(1/tol + 0.5):0) {
est <- pi_Est1(i)
if (abs(est - pi) > tol) break
est1 <- est
}
list(NTerms = i + 1, Estimate = est1)
}
tol <- 1e-3
pi_Est2(tol)
#> $NTerms
#> [1] 1000
#>
#> $Estimate
#> [1] 3.140593
tol - abs(pi - pi_Est2(tol)$Estimate)
#> [1] 2.500001e-10
tol - abs(pi - pi_Est1(pi_Est2(tol)$NTerms - 1))
#> [1] -1.00075e-06
Created on 2022-01-31 by the reprex package (v2.0.1)
Perhaps we can try the code below
pi_Est <- function(digits = 3) {
s <- 0
ii <- 1
repeat {
s <- s + 4 * (-1)^(ii + 1) / (2 * ii - 1)
if (round(s, digits) == round(pi, digits)) break
ii <- ii + 1
}
list(est = s, iter = ii)
}
and you will see
> pi_Est()
$est
[1] 3.142499
$iter
[1] 1103
> pi_Est(5)
$est
[1] 3.141585
$iter
[1] 130658
Why not use a single line of code for the calculation?
Pi <- tail(cumsum(4*(1/seq(1,4*50000000,2))*rep(c(1,-1), 50000000)),1)
My predictor (x) has U-shaped distribution in relation to the binary outcome (y), with positive outcomes at both low and high values of x, leading to a biconcave roc curve with a poor area under the curve (auc).
To maximise its ability to discriminate the outcome, I am trying to optimise the parameters of a second grade polynomial of x, by using optim and 1 - auc as the cost function to minimise.
x = c(13,7,7,7,1,100,3,4,4,2,2,7,14,8,3,14,5,12,8,
13,9,4,9,4,8,3,13,9,4,4,5,9,10,10,7,6,12,7,2,
6,6,4,3,2,3,10,5,2,5,8,3,5,4,2,7,5,7,6,79,9)
y = c(0,0,1,0,0,1,0,0,1,1,0,1,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0)
theta = c(0, 0, 0)
min_auc <- function(theta, x, y) {
(1 - roc(y, (theta[1] + theta[2]*x + theta[3]*x^2))$auc)
}
optim(theta, min_auc, x = x, y = y)
The results are as follow:
$par
[1] 0.0 0.1 0.0
$value
[1] 0.4380054
$counts
function gradient
8 NA
$convergence
[1] 0
$message
NULL
However, from a manual definition of the parameters, I know that min_auc can be further minimised.
theta = c(0, -40, 1)
(1 - roc(y, (theta[1] + theta[2]*x + theta[3]*(x^2)))$auc)
[1] 0.2762803
Could anyone explain to me what I am doing wrong, please? Is it possibly due to a non-convex cost function?
One possibility is it's a collinearity problem. Scaling the inputs helps:
min_auc <- function(theta, x, y) {
(1 - roc(y, (theta[1] + theta[2]*scale(x) + theta[3]*scale(x)^2))$auc)
}
optim(theta, min_auc, x = x, y = y)
# $par
# [1] -0.02469136 -0.03117284 0.11049383
#
# $value
# [1] 0.2762803
#
# $counts
# function gradient
# 30 NA
#
# $convergence
# [1] 0
#
# $message
# NULL
#
Another potential problem is that the surface over which you're optimizing has some flat spots. Let's say, for example, that we fix the intercept in this equation to -2. This is about what you get if you do qlogis(mean(y)). Then, you're only optimizing over 2 parameters so the surface is easier to see. Here's what it looks like with the two remaining theta terms on the two horizontal axes and the 1-AUC value on the y-axis.
min_auc <- function(theta, x, y) {
(1 - roc(y, (-2 + theta[1]*scale(x) + theta[2]*scale(x)^2))$auc)
}
s <- seq(-.25, .25, length=50)
o <- outer(s, s, Vectorize(function(z,w)min_auc(c(z,w),x,y)))
library(plotly)
plot_ly(x = ~s, y = ~s, z = ~o) %>% add_surface()
As you may have noticed above, there is no unique solution to the problem. There are lots of solutions that seem to get to the minimum value.
Finding root for Hermite spline
I have a question regarding one discussion that already in StackOverflow
get x-value given y-value: general root finding for linear / non-linear interpolation function
They told that for cubic interpolation splines returned by stats::splinefun with methods "fmm", "natrual", "periodic" and "hyman", the function RootSpline3 provides a stable numerical solution. Now if I use "monoH.FC", would the RootSpline3 function work? I have actually tried it, but seems like it doesn't work. Can you tell what is wrong in my code(why the length argument is invalid)? Either my code is wrong or it will not work for this particular method? If yes, what I need to do?).
kne<-c(10,15,18,18,15,14,13,13,15,21,26,39,52,64,70,66,57,40,22,11)
t<-seq(0,1,len=20)
s <- splinefun(t, kne, method = "monoH.FC")
RootSpline3 <- function (s, y0 = 0, verbose = TRUE) {
## extract piecewise construction info
info <- environment(s)$z
print(info)
n_pieces <- info$n - 1L
x <- info$x; y <- info$y
print(x)
b <- info$b; c <- info$c; d <- info$d
## list of roots on each piece
xr <- vector("list", n_pieces)
## loop through pieces
i <- 1L
while (i <= n_pieces) {
## complex roots
croots <- polyroot(c(y[i] - y0, b[i], c[i], d[i]))
## real roots (be careful when testing 0 for floating point numbers)
rroots <- Re(croots)[round(Im(croots), 10) == 0]
## the parametrization is for (x - x[i]), so need to shift the roots
rroots <- rroots + x[i]
## real roots in (x[i], x[i + 1])
xr[[i]] <- rroots[(rroots >= x[i]) & (rroots <= x[i + 1])]
## next piece
i <- i + 1L
}
## collapse list to atomic vector
xr <- unlist(xr)
## make a plot?
if (verbose) {
curve(f, from = x[1], to = x[n_pieces + 1], xlab = "x", ylab = "f(x)")
abline(h = y0, lty = 2)
points(xr, rep.int(y0, length(xr)))
}
## return roots
xr
}
RootSpline3(s, 10)
Depending on which spline method you use, you'll get a different environment. In the case of method = 'monoH.FC', there are no environment variables available (like $z, $c, $d, etc) as there are in the default spline method ('fmm').
However, there is still an x and y argument that can be called with a monoH.FC splinefun(), both of which will produce vectors, and which can be used to plot the spline fit.
Using your data:
s(kne)
# -1870 -14201 -3542 -3751 -2915 -2706 -2497 -3333 -2288 -4169 -5214
# -7931 -10648 -13156 -14410 -13574 -11693 -8140 -4378 -2079
s(t)
# 10 69 18 19 15 14 13 17 12 21 26 39 52 64 70 66 57 40 22 11
plot(t, kne)
curve(s(x), add = TRUE)