I was manually creating a negative exponential distribution today and was trying to figure out a faster/easier solution. First, I just manually crafted a geometric sequence like this one, multiplying repeatedly by .60 until I neared zero:
x <- 400
x*.60
Doing this about 20 times, I got this vector of values and plotted the distribution:
y <- c(400,240,144,86.4, 51.84, 31.104, 18.6624, 11.19744, 6.718464, 4.031078,
2.418647, 1.451188, .8707129, .5224278, .3134567, .188074, .1128444,
.06770664, .04062398, .02437439)
plot(y)
However, I was trying to figure out an easier way of doing this with seq(), but I only know how to use it for arithmetic sequences. My attempt to reproduce the sequence is below:
plot(seq(from=400,
to=1,
by=-.60))
This obviously doesn't produce the same effect; it gives a very linear decline when plotted.
Is there an easier solution? I have to imagine that this is a rather basic function within R.
You may use dexp.
(x <- dexp(1:20, rate=.5)*1000)
# [1] 303.26532986 183.93972059 111.56508007 67.66764162 41.04249931 24.89353418 15.09869171 9.15781944 5.55449827
# [10] 3.36897350 2.04338572 1.23937609 0.75171960 0.45594098 0.27654219 0.16773131 0.10173418 0.06170490
# [19] 0.03742591 0.02269996
plot(x)
To make it start exactly at 400, we can minimize (400 - dexp(1, rate=.5)*x)^2 using optimize.
f <- function(x, a) (a - dexp(1, rate=.5)*x)^2
xmin <- optimize(f, c(0, 4000), a=400)
(x <- dexp(seq_len(20), rate=.5)*xmin$minimum)
# [1] 400.00000000 242.61226389 147.15177647 89.25206406 54.13411329 32.83399945 19.91482735 12.07895337 7.32625556
# [10] 4.44359862 2.69517880 1.63470858 0.99150087 0.60137568 0.36475279 0.22123375 0.13418505 0.08138735
# [19] 0.04936392 0.02994073
Note that if you want a different rate=, you should use it both in optimize and when creating the values.
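Side note (my addition, separate from the dexp approach): since the original vector was built by repeatedly multiplying 400 by 0.6, the same 20 values can also be generated directly as a geometric sequence, with no optimization needed:
# geometric sequence: 400 * 0.6^k for k = 0, ..., 19
y <- 400 * 0.6^(0:19)
plot(y)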
Taking the ideas from the following links:
the local minimum between the two peaks
How to explain ...
I am looking for the local minimum or minima, while avoiding functions already built for this purpose [local or global max/min].
Here is my progress so far:
#DATA
simulate <- function(lambda=0.3, mu=c(0, 4), sd=c(1, 1), n.obs=10^5) {
x1 <- rnorm(n.obs, mu[1], sd[1])
x2 <- rnorm(n.obs, mu[2], sd[2])
return(ifelse(runif(n.obs) < lambda, x1, x2))
}
data <- simulate()
hist(data)
d <- density(data)
#
#https://stackoverflow.com/a/25276661/8409550
##Since the x-values are equally spaced, we can estimate dy using diff(d$y)
d$x[which.min(abs(diff(d$y)))]
#With our data we did not obtain the expected value
#
d$x[which(diff(sign(diff(d$y)))>0)+1]#pit
d$x[which(diff(sign(diff(d$y)))<0)+1]#peak
#we check
#1. with optimize()
optimize(approxfun(d$x, d$y), interval = c(0, 4))$minimum
optimize(approxfun(d$x, d$y), interval = c(0, 4), maximum = TRUE)$maximum
#2. with pastecs::turnpoints()
tp <- pastecs::turnpoints(d$y)
summary(tp)
ind <- seq_along(d$y)[pastecs::extract(tp, no.tp = FALSE, peak = TRUE, pit = TRUE)]
d$x[ind[2]] # pit
d$x[ind[1]] # first peak
d$x[ind[3]] # second peak
My questions and request for help:
Why did this line fail:
d$x[which.min(abs(diff(d$y)))]
Is it possible to eliminate the need to add one to the index in these lines:
d$x[which(diff(sign(diff(d$y)))>0)+1]#pit
d$x[which(diff(sign(diff(d$y)))<0)+1]#peak
How to get the optimize function to return the two expected maximum values?
Question 1
The answer to the first question is straightforward. The line d$x[which.min(abs(diff(d$y)))] asks for the x value at which there was the smallest change in y between two consecutive points. The answer is that this happened at the extreme right of the plot, where the density curve is essentially flat:
which.min(abs(diff(d$y)))
#> [1] 511
length(abs(diff(d$y)))
#> [1] 511
This is not only smaller than the difference at your local maxima/minima points; it is orders of magnitude smaller. Let's zoom in on the peak value of d$y, including only the peak and the point on each side:
which.max(d$y)
#> [1] 324
plot(d$x[323:325], d$y[323:325])
We can see that the smallest difference is around 0.00005, or 5e-5, between two consecutive points. Now look at the end of the plot, where it is flattest:
plot(d$x[510:512], d$y[510:512])
The difference is about 1e-7, which is why this is the flattest point.
Question 2
The answer to your second question is "no, not really". You are taking a double diff, which is two elements shorter than x, and if x is n elements long, a double diff will correspond to elements 2 to (n - 1) in x. You can remove the +1 from the index, but you will have an off-by-one error if you do that. If you really wanted to, you could concatenate dummy zeros at each stage of the diff, like this:
d$x[which(c(0, diff(sign(diff(c(d$y, 0))))) > 0)]
which gives the same result, but this is longer, harder to read and harder to justify, so why would you?
Question 3
The answer to the third question is that you could use the "pit" as the dividing point between the minimum and maximum value of d$x to find the two "peaks". If you really want a single call to get both at once, you could do it inside an sapply:
pit <- optimize(approxfun(d$x,d$y),interval=c(0,4))$minimum
peaks <- sapply(1:2, function(i) {
optimize(approxfun(d$x, d$y),
interval = c(min(d$x), pit, max(d$x))[i:(i + 1)],
maximum = TRUE)$maximum
})
pit
#> [1] 1.691798
peaks
#> [1] -0.02249845 3.99552521
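As a quick visual check (my addition, not part of the answer above), the detected turning points can be overlaid on the density curve:
plot(d)                          # the kernel density estimate
abline(v = peaks, col = "blue")  # the two detected peaks
abline(v = pit, col = "red")     # the pit between them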
I have a dataset from a biological experiment:
x = c(0.488, 0.977, 1.953, 3.906, 7.812, 15.625, 31.250, 62.500, 125.000, 250.000, 500.000, 1000.000)
y = c(0.933, 1.036, 1.112, 1.627, 2.646, 5.366, 11.115, 2.355, 1.266, 0, 0, 0)
plot(log(x),y)
x represents a concentration and y represents the response in our assay.
How can I predict the x-value (concentration) of a pre-defined y-value (in my case 1.5)?
After a loess smoothing I can predict the y-value at a defined x-value. See the example:
smooth_data <- loess(y~log(x))
predict(smooth_data, 1.07) # which gives 1.5
Using the predict function, both x = 1.07 and x = 5.185 result in y = 1.5
Is there a convenient way to get the estimates from the loess regression at y = 1.5 without manually typing some x values into the predict function?
Any suggestions?
I guess your x and y values are pairs, so f(0.488) = 0.933 and so on?
This is more of a math problem, in my opinion :).
If you could define a function that describes your graph it would be pretty easy.
You could also draw a straight line between each pair of adjacent points, and for every line that intersects your y value you could read off the corresponding x values. But straight lines wouldn't be very precise.
If you have enough pairs you could also train a neural network. That might get you the best results but takes some time and a lot of pairs to train well.
Could you clarify your question a bit and tell us what you are looking for? A way to do it or a code example?
I hope this helps you at least a little bit :)
Since your function is not monotonic, there is no true inverse, but if you split it into two functions - one for x < maximum and one for x > maximum - you can just create two inverse functions and solve for whatever values of y you want.
smooth_data <- loess(y ~ log(x))
X = seq(0, 6.9, 0.1)         # grid on the log(x) scale
P = predict(smooth_data, X)  # fitted y at each grid point
M = which.max(P)             # index of the fitted maximum
# Invert each monotonic branch: map fitted y back to log(x)
Inverse1 = approxfun(P[1:M], X[1:M])                  # left (increasing) branch
Inverse2 = approxfun(P[M:length(X)], X[M:length(X)])  # right (decreasing) branch
Inverse1(1.5)
[1] 1.068267
predict(smooth_data, 1.068267)
[1] 1.498854
Inverse2(1.5)
[1] 5.185876
predict(smooth_data, 5.185876)
[1] 1.499585
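A caveat worth adding (my note, not part of the answer above): these inverse functions work on the log(x) scale used in the loess fit, so exp() is needed to recover the concentration in the original units:
exp(Inverse1(1.5))  # ~ 2.91 (concentration units)
exp(Inverse2(1.5))  # ~ 179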
How do I plot decision boundary from weight vector?
My original data is 2-dimensional but non-linearly separable so I used a polynomial transformation of order 2 and therefore I ended up with a 6-dimensional weight vector.
Here's the code I used to generate my data:
# Convert polar coordinates to a 2 x 1000 matrix of Cartesian points
# (row 1: x, row 2: y), offset by the centre (x, y)
polar2cart <- function(theta, R, x, y){
  x <- x + cos(theta) * R
  y <- y + sin(theta) * R
  c <- matrix(x, ncol = 1000)
  rbind(c, y)
}
# Convert Cartesian coordinates to polar (radius, angle)
cart2polar <- function(x, y) {
  r <- sqrt(x^2 + y^2)
  t <- atan(y/x)
  c(r, t)
}
R=5
eps=5
sep=-5
c1<-polar2cart(pi*runif(1000,0,1),runif(1000,0,eps)+R,0,0)
c2<-polar2cart(-pi*runif(1000,0,1),runif(1000,0,eps)+R,R+eps/2,-sep)
data <- data.frame("x" = append(c1[1,], c2[1,]), "y" = append(c1[2,], c2[2,]))
labels <- append(rep(1,1000), rep(-1, 1000))
and here's how it is displayed (using ggplot2).
Thank you in advance.
EDIT: I'm sorry if I didn't provide enough information about the weight vector. The algorithm I'm using is pocket, which is a variation of the perceptron, which means that the output weight vector is the vector w perpendicular to the hyperplane in the feature space, plus the bias w0. Therefore, the hyperplane equation is w0 + w1*x1 + ... + wd*xd = 0, where x1, ..., xd are the variables. Now, since I used a polynomial transformation of order 2 to go from a 2-dimensional space to a 5-dimensional feature space, my variables are x1, x2, x1^2, x1*x2, x2^2, and thus the equation for my decision boundary is:
w0 + w1*x1 + w2*x2 + w3*x1^2 + w4*x1*x2 + w5*x2^2 = 0
So basically, my question is: how do I go about drawing my decision boundary given w = (w0, w1, ..., w5)?
PS: I've found a solution while waiting, it might not be the best approach but, it gives the expected results. I'll share it as soon as I finish my project if anyone is interested. Meanwhile, I'd love to hear a better alternative.
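In the meantime, here is a minimal sketch of one common approach (not necessarily the asker's own solution): evaluate the polynomial decision function on a grid and trace its zero contour with contour(). The weight values below are placeholders, and the assumed ordering is (bias, x1, x2, x1^2, x1*x2, x2^2):
# hypothetical weight vector, for illustration only
w <- c(-1, 0.5, 0.5, 0.1, 0.2, 0.1)
gx <- seq(min(data$x), max(data$x), length.out = 200)
gy <- seq(min(data$y), max(data$y), length.out = 200)
# decision function evaluated on the grid
z <- outer(gx, gy, function(x1, x2)
  w[1] + w[2]*x1 + w[3]*x2 + w[4]*x1^2 + w[5]*x1*x2 + w[6]*x2^2)
plot(data$x, data$y, col = ifelse(labels == 1, "blue", "red"), pch = 20)
contour(gx, gy, z, levels = 0, add = TRUE, lwd = 2, drawlabels = FALSE)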
I've got the first line down, which is defining the function:
f <- function(x) 3034*log(x)+2305.84*log(1-x)-1517*log(1-x)
Now the problem I'm having is that I need to find all the x values where f(x) = -1947.92, but I've got no idea what command will do this.
Normally I would say you should use uniroot(), after modifying the function to return zero at the target, but that will be problematic here:
target <- -1947.92
f <- function(x) 3034*log(x)+2305.84*log(1-x)-1517*log(1-x)
g <- function(x) f(x)-target
uniroot(g,interval=c(1e-4,1-1e-4))
## Error in uniroot(g, interval = c(1e-04, 1 - 1e-04)) :
## f() values at end points not of opposite sign
What's going on is that your curve crosses zero in two places, and uniroot() requires an interval that brackets a single root, i.e. one where the function has opposite signs at the two endpoints. Let's take a look:
curve(g(x))
abline(h=0,col=2)
Zoom in:
curve(g(x),from=0.75,to=0.85)
abline(h=0,col=2)
Now we can either just eyeball this (i.e. use interval=c(1e-4,0.8) or interval=c(0.8,1-1e-4), depending on which root we're interested in) or find the location of the maximum numerically:
opt1 <- optim(par=0.5, fn=g, method="L-BFGS-B", lower=1e-4, upper=1-1e-4,
              control=list(fnscale=-1)) ## fnscale=-1: maximize rather than minimize
then use opt1$par as your cut-point. (Or you could do some simple calculus: the maximum [point where the derivative wrt x is zero] is much easier to compute than the roots ...)
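For completeness, a short sketch (my addition, building on the code above) of using that cut-point to bracket and recover both roots:
cut <- opt1$par                            # location of the maximum of g
uniroot(g, interval=c(1e-4, cut))$root     # left root
uniroot(g, interval=c(cut, 1-1e-4))$root   # right root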
Alternatively, you could ask Wolfram Alpha ...
Is there a function for solving transcendental equations in R?
For example, I want to solve the following equation
x = 1/tan(x)
Any suggestions? I know the equation has multiple roots, so I also want to be able to recover all the answers for a given interval.
I would first plot the function to see what it looks like:
R > y = function(x) { x - 1/tan(x) }
R > curve(y, xlim = c(-10, 10))
R > abline(h = 0, col = 'red')
Having seen that there is a root between 0 and 3, I would use uniroot to get the root I want:
R > uniroot(y, interval = c(0, 3))
$root
[1] 0.8603
$f.root
[1] 6.612e-06
$iter
[1] 7
$estim.prec
[1] 6.104e-05
You can use uniroot to find roots of any 1D equation within a given range. However, getting multiple roots seems like a very hard problem in general (e.g. see the relevant chapter of Numerical Recipes for some background: chapter 9 at http://apps.nrbook.com/c/index.html ). Which root is found when there are multiple roots is hard to predict. If you know enough about the problem to subdivide the space into subregions with zero or one root each, or if you're willing to divide it into lots of regions and hope that you found all the roots, you can do it. Otherwise I look forward to other people's solutions ...
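To illustrate that divide-and-scan idea, here is a rough sketch (my own, a heuristic rather than a guaranteed method): scan a grid for sign changes, bracket each one with uniroot(), and filter out the spurious "roots" that uniroot() reports at the poles of 1/tan(x):
find_roots <- function(f, lower, upper, n = 1000, tol = 1e-3) {
  xs <- seq(lower, upper, length.out = n)
  ys <- f(xs)
  idx <- which(diff(sign(ys)) != 0)  # each sign change brackets a root or a pole
  roots <- sapply(idx, function(i) uniroot(f, c(xs[i], xs[i + 1]))$root)
  roots[abs(f(roots)) < tol]         # at a pole |f| stays huge, so it is dropped
}
y <- function(x) x - 1/tan(x)
find_roots(y, 0.01, 10)  # should recover roots near 0.8603, 3.4256, 6.4373, 9.5293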
In this particular case, as shown by @liuminzhao's solution, there's (at most? exactly?) one solution between n*pi and (n+1)*pi:
y = function(x) x-1/tan(x)
curve(y,xlim=c(-10,10),n=501,ylim=c(-5,5))
abline(v=(-3:3)*pi,col="gray")
abline(h=0,col=2)
This is a bit of a hack, but it will find roots of your equation (provided they are not too close to a multiple of pi: you can reduce eps if you like ...). However, if you want to solve a different multi-root transcendental equation you might need another (specialized) strategy ...
f <- function(n,eps=1e-6) uniroot(y,c(n*pi+eps,(n+1)*pi-eps))$root
sapply(0:3,f)
## [1] 0.8603337 3.4256204 6.4372755 9.5293334