I stumbled upon this site and found a very interesting transformation of the time axis using a sigmoid.
I modified the horizontal-axis distortion code to suit my needs.
set.seed(123)
vec <- cumsum(rnorm(50))
sigmoid <- function(x) 1 / (1 + exp(-x))
y1 <- sigmoid(seq(-2,2,length.out = 25))
y2 <- sigmoid(seq(-5,5,length.out = 25))
fu <- function(y1,y2, vec){
n <- length(vec)
s1 <- splinefun(y1, y2, method = "natural")(vec)
s2 <- splinefun(y1, y2, method = "natural")(seq(min(vec),max(vec),by=ceiling(length(vec)/50)))
if (! all(s2 == sort(s2))) warning("Non monotonic transformation on y axis!")
d <- cbind(x=1:length(vec), y=s1)
par(mfrow=c(2,2), mar=rep(2,4))
plot(vec,t="b",lwd=2) ; abline(h=seq(-10,10,by = 1),col=8,lty=2)
plot(d, type="b", lwd=2) ; abline(h=s2,col=8,lty=2)
matplot(cbind(y1,y2),t="l",col=c(2,4))
}
fu(y1,y2,vec)
I would like to replace the sigmoid with more complex functions such as a sine wave or the sum of sine waveforms. But when I try to do this, I get unsatisfactory results.
my.sin <- function(ve,a,f,p) a*sin(f*ve+p)
y1 <- my.sin(1:20,a = 2,f = 0.3,p = 0)+2
y2 <- my.sin(1:20,a = 2,f = 0.6,p = 3)+2
fu(y1,y2,vec)
QUESTION: How can I make the graph only expand or contract along the horizontal axis, without being distorted in any other way?
That is, the graph should retain its shape in every respect except along the horizontal axis.
In other words, I want the same kind of distortion as in my picture with the sigmoid, but with the sigmoid replaced by a sinusoid.
I apologize for the unclear wording of the question, but I do not know how to explain it better.
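One possible reading of the requirement is that only the x positions of the points should move, and that their order must never change. Below is a minimal sketch under that assumption. The helper warp.time and its parameters are hypothetical (not from the original code). It builds the time map as the cumulative sum of a strictly positive sinusoidal "speed", which is monotone by construction, so the series can stretch and compress but never fold back.

```r
set.seed(123)
vec <- cumsum(rnorm(50))

# Hypothetical helper: monotone time map built from a sinusoidal speed profile.
# The speed stays positive as long as abs(a) < 1, so cumsum() is increasing.
warp.time <- function(n, a = 0.8, f = 0.3, p = 0) {
  speed <- 1 + a * sin(f * seq_len(n) + p)
  stopifnot(all(speed > 0))
  cumsum(speed)
}

tw <- warp.time(length(vec))

par(mfrow = c(1, 2), mar = rep(2, 4))
plot(seq_along(vec), vec, type = "b", lwd = 2)  # original time axis
plot(tw, vec, type = "b", lwd = 2)              # warped time axis, same y values
```

The y values are untouched, so the shape is preserved vertically. A sum of sine waves can be used for the speed in the same way, provided it stays above zero.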
I am trying to compute the turning angles of sequential vectors in the complex plane. Please see the code below for a demo data frame and my attempt at calculating the angles.
The sign of the angles seems correct: left turns are positive and right turns are negative. However, the turning angles do not look right when I compare them with the plot. NOTE: I want the turning angles, not the angles between the vectors. Image for reference:
set.seed(123)
# Generate a random path and plot it
path.short.random <- function(points = 6) {
x <- runif(points, -1, 1)
y <- rnorm(points, 0, 0.25)
i <- order(x, y)
x <- x[i]
y <- y[i]
path <- data.frame(x = x, y = y)
plot(x, y, main = "Random Path", asp = 1)
# draw arrows from point to point
s <- seq(length(x) - 1) # one shorter than data
arrows(x[s], y[s], x[s + 1], y[s + 1], col = 1:points)
path
}
# Save the path as a data frame
df <- path.short.random()
# Compute sequential turning angles
get.angles <- function(df) {
df$polar <- complex(real = df$x, imaginary = df$y)
df$displacement <- c(0, diff(df$polar))
diff(Arg(df$displacement[2:nrow(df)]))
}
get.angles(df)
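A difference of two Arg() values can fall outside (-pi, pi] when the heading crosses the branch cut, and the results are in radians, which may not match a visual estimate in degrees. As a sketch only, the hypothetical helper turning.angles below takes the argument of each step multiplied by the conjugate of the previous step, which gives the signed heading change already wrapped to (-pi, pi].

```r
# Hypothetical helper: signed turning angle (radians) at each interior vertex.
turning.angles <- function(x, y) {
  z <- complex(real = x, imaginary = y)
  step <- diff(z)                        # displacement between consecutive points
  n <- length(step)
  # Arg(next * Conj(previous)) is the rotation from the previous step to the next
  Arg(step[-1] * Conj(step[-n]))
}

# Small check: go east, turn left (north), then turn right (east again)
ang <- turning.angles(c(0, 1, 1, 2), c(0, 0, 1, 1))
ang * 180 / pi   # degrees: positive = left turn, negative = right turn
```

With the data frame from the question, the call would be turning.angles(df$x, df$y).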
I am trying to plot the following function in R Studio using the curve function as follows:
loglikelihood.func = function(x, mu){
n = length(x)
n*mu - sum(x) - sum(exp(mu)/(exp(x)))
}
curve(expr = loglikelihood.func(x = data, mu), xname = "mu", from = 0, to = 15)
Now, I have a vector of data that contains 50 data points and in the function, it is currently summing both mu and x (my data) i.e. sum(exp(mu)/(exp(x))) is equivalent to sum(exp(mu))/sum(exp(x))
I want my function to work so that I have the fraction sum for each different data point in x while keeping mu constant i.e. exp(mu)/exp(x1) + exp(mu)/exp(x2) + exp(mu)/exp(x3) + ... and repeat this for each separate mu when it plots in the curve function.
If I change my function to exp(mu)/sum(exp(x)) it's doing exp(mu)/[exp(x1)+exp(x2)+...] which is not what I want. Can someone offer some advice here?
Edit:
This is a subset of my data,
data = c(8.5,8.9,9.1,8.9,8.4,9.7,9.1,9.6,8.7,9.3,9.6,9.3,8.7,9.0,8.8,8.9,8.9,12.2)
Not sure I get this right... But curve() can't do as much by itself. You can define your function:
loglikelihood.func = function(x, mu) {
length(x) * mu - sum(x) - sum(exp(mu)/(exp(x)))
}
Then define a range (for x) over which you'd like to plot it, and specify the limits (xlim, ylim) on a first plot. If you don't want to have a sequence, you can use your data instead.
xrange <- seq(from=1, to=10, by = 0.1)
plot(x=xrange, y = sapply(xrange, function(x) loglikelihood.func(x, mu=0)),
xlim = c(1, 10),
ylim = c(-10, 0),
type = "l")
Then add other curves, specifying different mu's:
lines(x=xrange, y = sapply(xrange, function(x) loglikelihood.func(x, mu=1)))
lines(x=xrange, y = sapply(xrange, function(x) loglikelihood.func(x, mu=2)))
(A loop is more practical if you don't need extra graphical parameters.)
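If the goal is instead a single curve in mu for fixed data, the same sapply idea can be turned around. curve() passes a whole vector of mu values at once, and that vector is then recycled against x inside exp(mu)/exp(x). A minimal sketch follows, using the data subset from the question. The wrapper name ll.vec is an assumption. It evaluates the function one mu at a time, so each value is exp(mu)/exp(x1) + exp(mu)/exp(x2) + ... for a constant mu.

```r
data <- c(8.5, 8.9, 9.1, 8.9, 8.4, 9.7, 9.1, 9.6, 8.7, 9.3,
          9.6, 9.3, 8.7, 9.0, 8.8, 8.9, 8.9, 12.2)

loglikelihood.func <- function(x, mu) {
  length(x) * mu - sum(x) - sum(exp(mu) / exp(x))
}

# Hypothetical wrapper: one call per mu, so mu is never recycled against x
ll.vec <- function(mu) {
  sapply(mu, function(m) loglikelihood.func(x = data, mu = m))
}

curve(ll.vec(mu), xname = "mu", from = 0, to = 15)
```

Vectorize(loglikelihood.func, "mu") would be an equivalent way to get the same behaviour.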
I calculated a distribution function numerically. First I plot the function. It looks wrong around 0.05. Is this due to rounding errors, please?
Second, I need to find the corresponding first and second non-central moments. That is,
EX = int x dF(x)
EX^2 = int x^2 dF(x)
Can I do this type of Lebesgue–Stieltjes integration in R, please? Is there a built-in method, please? If not in base R, what package offers such a calculation, please?
I guess that, alternatively, I can find the numerical derivative f(x) of F(x) and then carry out the usual integration like
EX = int x f(x) dx
But I remember from somewhere that numerical differentiation is much less stable. Which is the right way, please?
FYI my functions are attached below.
library(mvtnorm)
library(matrixcalc)
VAR <- matrix(c(1.043856e-03, 5.044899e-04, 3.239951e-04, 2.330992e-04, 0.0001779055, 0.0001403866, 0.0001127118, 9.074962e-05, 7.157144e-05,
5.044899e-04, 5.485889e-04, 3.523165e-04, 2.534751e-04, 0.0001934568, 0.0001526582, 0.0001225642, 9.868232e-05, 7.782773e-05,
3.239951e-04, 3.523165e-04, 3.878844e-04, 2.790645e-04, 0.0002129870, 0.0001680697, 0.0001349376, 1.086447e-04, 8.568475e-05,
2.330992e-04, 2.534751e-04, 2.790645e-04, 3.123147e-04, 0.0002383642, 0.0001880950, 0.0001510153, 1.215896e-04, 9.589399e-05,
1.779055e-04, 1.934568e-04, 2.129870e-04, 2.383642e-04, 0.0002728857, 0.0002153361, 0.0001728863, 1.391990e-04, 1.097820e-04,
1.403866e-04, 1.526582e-04, 1.680697e-04, 1.880950e-04, 0.0002153361, 0.0002548851, 0.0002046389, 1.647645e-04, 1.299447e-04,
1.127118e-04, 1.225642e-04, 1.349376e-04, 1.510153e-04, 0.0001728863, 0.0002046389, 0.0002555744, 2.057751e-04, 1.622886e-04,
9.074962e-05, 9.868232e-05, 1.086447e-04, 1.215896e-04, 0.0001391990, 0.0001647645, 0.0002057751, 2.840218e-04, 2.239993e-04,
7.157144e-05, 7.782773e-05, 8.568475e-05, 9.589399e-05, 0.0001097820, 0.0001299447, 0.0001622886, 2.239993e-04, 3.974881e-04),
nrow=9, ncol=9, byrow=TRUE)
is.symmetric.matrix(VAR)
is.positive.definite(VAR)
kappa(VAR)
CDF <- function(x){
summand <- rep(0, 5)
for(j in 5:9){
choice <- combn(9, j)
for(i in 1:ncol(choice)){
ub <- rep(Inf, 9)
ub[choice[, i]] <- x
summand[j-4] <- summand[j-4] + as.numeric(pmvnorm(lower=rep(-Inf, 9), upper=ub, sigma=VAR))
}
}
l <- c(1, -5, 15, -35, 70)
as.numeric(t(l)%*%summand)
}
CDF <- Vectorize(CDF)
x <- seq(-0.1, 0.1, by=0.01)
y <- CDF(x)
plot(x, y, type="l", lwd=2)
I initially plotted the result I got from taking first differences from numCDF <- CDF( seq(-10, 10, length=100) ), but that was rather disappointing, since only one value was different than 0. So I restricted the focus to:
numCDF <- CDF( seq(-.10, .10, length=100) )
plot( diff(numCDF) )
Simply plotting the values of numCDF produces similar chaotic results in the region where you expressed concern.
So I think maybe your function is not sufficiently well-behaved to yield good results.
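For the moments asked about in the question, one option that avoids differentiating F is a plain Riemann–Stieltjes sum: evaluate F on a grid, take the probability mass of each cell with diff(), and weight it by x or x^2 at the cell midpoint. The sketch below is only an illustration. The helper name stieltjes and the grid settings are assumptions, and it is checked against a normal CDF with known moments rather than against the CDF above. Any noise in F carries straight into diff(F), so a noisy F will still give noisy moments.

```r
# Hypothetical helper: approximate the integral of g(x) dF(x) over [lower, upper]
stieltjes <- function(g, F, lower, upper, n = 2000) {
  grid <- seq(lower, upper, length.out = n + 1)
  dF   <- diff(F(grid))                       # probability mass in each cell
  mid  <- (grid[-1] + grid[-(n + 1)]) / 2     # cell midpoints
  sum(g(mid) * dF)
}

# Check with a known CDF: N(mean = 1, sd = 0.5), so EX = 1 and EX^2 = 1.25
F.test <- function(x) pnorm(x, mean = 1, sd = 0.5)
m1 <- stieltjes(function(x) x,   F.test, -4, 6)
m2 <- stieltjes(function(x) x^2, F.test, -4, 6)
c(EX = m1, EX2 = m2)
```

For the CDF in the question, the call would look like stieltjes(function(x) x, CDF, -0.1, 0.1) with a much smaller n, since every evaluation of CDF is expensive.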
I would like to compute the convolution of two probability distributions in R and I need some help. For the sake of simplicity, let's say I have a variable x that is normally distributed with mean = 1.0 and stdev = 0.5, and y that is log-normally distributed with mean = 1.5 and stdev = 0.75. I want to determine z = x + y. I understand that the distribution of z is not known a priori.
As an aside, the real-world example I am working with requires the addition of two random variables that follow a number of different distributions.
Does anyone know how to add two random variables by convolving the probability density functions of x and y?
I have tried generating n normally distributed random values (with above parameters) and adding them to n log-normally distributed random values. However, I wish to know if I can use the convolution method instead. Any help would be greatly appreciated.
EDIT
Thank you for these answers. I define a pdf, and try to do the convolution integral, but R complains on the integration step. My pdfs are Log Pearson 3 and are as follows
dlp3 <- function(x, a, b, g) {
p1 <- 1/(x*abs(b) * gamma(a))
p2 <- ((log(x)-g)/b)^(a-1)
p3 <- exp(-1* (log(x)-g) / b)
d <- p1 * p2 * p3
return(d)
}
f.m <- function(x) dlp3(x,3.2594,-0.18218,0.53441)
f.s <- function(x) dlp3(x,9.5645,-0.07676,1.184)
f.t <- function(z) integrate(function(x,z) f.s(z-x)*f.m(x),-Inf,Inf,z)$value
f.t <- Vectorize(f.t)
integrate(f.t, lower = 0, upper = 3.6)
R complains at the last step since the f.t function is bounded and my integration limits are probably not correct. Any ideas on how to solve this?
Here is one way.
f.X <- function(x) dnorm(x,1,0.5) # normal (mu=1, sigma=0.5)
f.Y <- function(y) dlnorm(y,1.5, 0.75) # log-normal (meanlog=1.5, sdlog=0.75)
# convolution integral
f.Z <- function(z) integrate(function(x,z) f.Y(z-x)*f.X(x),-Inf,Inf,z)$value
f.Z <- Vectorize(f.Z) # need to vectorize the resulting fn.
set.seed(1) # for reproducible example
X <- rnorm(1000,1,0.5)
Y <- rlnorm(1000,1.5,0.75)
Z <- X + Y
# compare the methods
hist(Z,freq=F,breaks=50, xlim=c(0,30))
z <- seq(0,50,0.01)
lines(z,f.Z(z),lty=2,col="red")
Same thing using package distr.
library(distr)
N <- Norm(mean=1, sd=0.5) # N is signature for normal dist
L <- Lnorm(meanlog=1.5,sdlog=0.75) # same for log-normal
conv <- convpow(L+N,1) # object of class AbscontDistribution
f.Z <- d(conv) # density function
hist(Z,freq=F,breaks=50, xlim=c(0,30))
z <- seq(0,50,0.01)
lines(z,f.Z(z),lty=2,col="red")
I was having trouble getting integrate() to work for different density parameters, so I came up with an alternative to @jlhoward's answer using a Riemann-sum approximation:
set.seed(1)
#densities to be convolved. could also put these in the function below
d1 <- function(x) dnorm(x,1,0.5) #
d2 <- function(y) dlnorm(y,1.5, 0.75)
#Riemann approximation of convolution
conv <- function(t, a, b, d) { # a to b must cover the range of the densities above; d must be small for an accurate approximation
  x <- seq(a, b, d)         # integration grid
  z <- numeric(length(t))   # preallocate the result
  for (i in seq_along(t)) {
    z[i] <- sum(d1(x) * d2(t[i] - x) * d)
  }
  return(z)
}
#check against sampled convolution
X <- rnorm(1000, 1, 0.5)
Y <- rlnorm(1000, 1.5, 0.75)
Z <- X + Y
t <- seq(0, 50, 0.05) #range to evaluate t, smaller increment -> smoother curve
hist(Z, breaks = 50, freq = F, xlim = c(0,30))
lines(t, conv(t, -100, 100, 0.1), type = "s", col = "red")
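For the Log Pearson 3 densities in the question's edit, one likely source of trouble is that with b < 0 each density lives only on 0 < x < exp(g), so integrating over (-Inf, Inf) feeds log() negative arguments. The following is a sketch under that assumption, not a definitive fix. dlp3.safe is a hypothetical variant of dlp3 that returns 0 outside the support, and the convolution integral is restricted to the range where both factors can be nonzero.

```r
# Hypothetical variant of dlp3: returns 0 outside the support
dlp3.safe <- function(x, a, b, g) {
  d  <- numeric(length(x))
  u  <- suppressWarnings((log(x) - g) / b)  # NaN for x < 0, infinite for x == 0
  ok <- is.finite(u) & u > 0
  d[ok] <- u[ok]^(a - 1) * exp(-u[ok]) / (x[ok] * abs(b) * gamma(a))
  d
}

f.m <- function(x) dlp3.safe(x, 3.2594, -0.18218, 0.53441)
f.s <- function(x) dlp3.safe(x, 9.5645, -0.07676, 1.184)

up.m <- exp(0.53441)  # upper end of the support of f.m (because b < 0)
up.s <- exp(1.184)    # upper end of the support of f.s

f.t <- function(z) {
  lo <- max(0, z - up.s)   # need z - x < up.s
  hi <- min(up.m, z)       # need x < up.m and z - x > 0
  if (lo >= hi) return(0)
  integrate(function(x) f.s(z - x) * f.m(x), lo, hi)$value
}
f.t <- Vectorize(f.t)

# The density of the sum should integrate to about 1 over its full support
total <- integrate(f.t, lower = 0, upper = up.m + up.s)$value
total
```

The question's original limits (0 to 3.6) can be passed to the last integrate() call in the same way.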
It seems that a statistical problem that I am working on requires doing something known in computational geometry as "offline orthogonal range counting":
Suppose I have a set of n points (for the moment, in the plane). For every pair of points i and j, I would like to count the number of remaining points in the set that are in the rectangle whose diagonal is the segment with endpoints i and j. The overall output then is a vector of n(n-1) values each in [0, 1, 2, ... , n-2].
I've seen that a rich literature on the problem (or at least a very similar problem) exists, but I cannot find an implementation. I would prefer an R (a statistical computing language) package, but I guess that's asking too much. An open source C/C++ implementation will also work.
Thanks.
I hope I have understood your problem correctly. Here is an implementation in R using the geometry package. I use the
mesh.drectangle function, which computes a signed distance from points p to the boundary of a rectangle.
I create all combinations of two points using combn.
For each pair of points, I compute the signed distance from the rectangle defined by that pair to the other points.
If the distance is < 0, the point lies inside the rectangle and I keep it.
For example:
library(geometry)
## I generate some data
set.seed(1234)
p.x <- sample(1:100,size=30,replace=T)
p.y <- sample(1:100,size=30,replace=T)
points <- cbind(p.x,p.y)
## the algorithm
ll <- combn(1:nrow(points),2,function(x){
x1<- p.x[x[1]]; y1 <- p.y[x[1]]
x2<- p.x[x[2]]; y2 <- p.y[x[2]]
p <- points[-x,]
d <- mesh.drectangle(p,x1,y1,x2,y2)
res <- NA
if(length(which(d <0))){
points.in = as.data.frame(p,ncol=2)[ d < 0 , ]
res <- list(n = nrow(points.in),
rect = list(x1=x1,x2=x2,y1=y1,y2=y2),
points.in = points.in)
}
res
},simplify=F)
ll <- ll[!is.na(ll)]
## the result
nn <- do.call(rbind,lapply(ll,'[[','n'))
To visualize the results, I plot the rectangles that contain 5 points, for example.
library(grid)
grid.newpage()
vp <- plotViewport(xscale = extendrange(p.x),
yscale = extendrange(p.y))
pushViewport(vp)
grid.xaxis()
grid.yaxis()
grid.points(x=points[,'p.x'],y=points[,'p.y'],pch='*')
cols <- rainbow(length(ll))
ll <- ll[nn == 5] ## here I plot only the rectangle with 5 points
lapply(seq_along(ll),function(i){
x <- ll[[i]]
col <- sample(cols,1)
x1<- x$rect$x1; x2<- x$rect$x2
y1<- x$rect$y1; y2<- x$rect$y2
grid.rect(x=(x1+x2)*.5,y=(y1+y2)*.5,
width= x2-x1,height = y2-y1,
default.units ='native',
gp=gpar(fill=col,col='red',alpha=0.2)
)
grid.points(x=x$points.in$p.x,y=x$points.in$p.y,pch=19,
gp=gpar(col=rep(col,x$n)))
}
)
upViewport()
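If only the counts are needed, the same idea can be written in base R without any package. This is a brute-force sketch that does O(n) work for each pair, not one of the efficient range-counting algorithms from the literature. The helper name count.in.rect is an assumption, and it returns one row per unordered pair, so n(n-1)/2 values rather than n(n-1).

```r
# Hypothetical helper: for each pair (i, j), count the other points lying
# strictly inside the axis-aligned rectangle whose diagonal is the segment i-j.
count.in.rect <- function(px, py) {
  pairs <- combn(length(px), 2)
  counts <- apply(pairs, 2, function(ij) {
    xr <- range(px[ij])
    yr <- range(py[ij])
    inside <- px > xr[1] & px < xr[2] & py > yr[1] & py < yr[2]
    sum(inside[-ij])          # never count the two corner points themselves
  })
  data.frame(i = pairs[1, ], j = pairs[2, ], n = counts)
}

# Small example with five points
res <- count.in.rect(c(0, 4, 1, 2, 5), c(0, 4, 1, 3, 5))
res
```

Points on the rectangle boundary are not counted here, which matches the d < 0 test used above.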