time <- 1:12
y <- c(0,0,0,0,0,0,0,12,34,69,100,100)
mdl <- gcFitModel(time,y,control = grofit.control(fit.opt = "m", model.type = "gompertz"))
I take the parameters from the fitted model (mdl) above and plug them into the Gompertz equation:
y <- A* exp(-exp(mu * e * (lambda - time)/mu + 1))
mu <- 36.162016
lambda <- 7.9800164
A <- 100
time <- 1:12
Here time is in steps of 15 days: time = 1 is the mid-day of the first 15-day period, time = 2 is the mid-day of the next 15-day period, and so on.
I fitted the following function:
e <- exp(1)
y <- 100 * exp(-exp(mu * e * (lambda - time)/mu + 1))
plot(time,y)
The lambda controls the movement along the x-axis.
I would like to modify this curve to get more data points by converting the time index into weeks, i.e. instead of one value at the mid-point of every 15 days, I want y for every 7 days. How can I do this?
Your code doesn't include any fitting routines, so I am assuming this is a question about plotting.
Here is an example of a plot with 100 points between time = 1 and time = 12.
time <- seq(1, 12, length.out = 100)
e <- exp(1)
mu <- 34.55844
y <- 100 * exp(-exp(mu * e * (3 - time)/mu + 1))
plot(time, y)
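To get one value per week rather than a fixed number of points, the step can be expressed in the original time unit: one week is 7/15 of a 15-day step. The sketch below is only an illustration under two assumptions. First, it uses the Zwietering form of the Gompertz curve, which divides by A rather than by mu inside the exponent; I am assuming that this is the form the fitted parameters belong to. Second, `gompertz_curve`, `days_per_step` and `day_of_year` are names made up for this example.

```r
# Zwietering-style Gompertz curve (assumed form; note the division by A).
gompertz_curve <- function(time, A, mu, lambda) {
  A * exp(-exp(mu * exp(1) * (lambda - time) / A + 1))
}

days_per_step <- 15
# One week expressed in units of the original 15-day time index.
time_weekly <- seq(1, 12, by = 7 / days_per_step)

y_weekly <- gompertz_curve(time_weekly, A = 100,
                           mu = 36.162016, lambda = 7.9800164)

# time = 1 is the mid-day of the first period, i.e. day 7.5.
day_of_year <- (time_weekly - 0.5) * days_per_step

plot(day_of_year, y_weekly, type = "b", xlab = "day", ylab = "y")
```

The curve itself does not change; only the grid it is evaluated on does, so no refitting is needed.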
I'm trying to recreate something similar to an image in modern actuarial risk theory using R: https://www.academia.edu/37238799/Modern_Actuarial_Risk_Theory (page 89)
In my case, the drop sizes follow an exponential distribution with rate 1/2000 (mean 2000), and the drops arrive as a Poisson process, so the inter-arrival times are exponentially distributed with rate 0.25 (in my model).
The value of U is given by an initial surplus plus a premium income (c) per unit time (for an amount of time determined by the inter arrival distribution) minus a claim amount which would be random from the exponential distribution mentioned above.
I have a feeling a loop will need to be used and this is what I have so far:
lambda <- 0.25
EX <- 2000
theta <- 0.5
c <- lambda*EX*(1+theta)
x <- rexp(1, 1/2000)
s <- function(t1){for(t1 in 1:10){v <- c(rep(rexp(t1,1/2000)))
print(sum(v))}}
u <- function(t){10000+c*t}
plot(u, xlab = "t", xlim = c(-1,10), ylim = c(0,20000))
abline(v=0)
for(t1 in 1:10){v <- c(rep(rexp(t1,1/2000)))
print(sum(v))}
The end goal is to run this simulation, say, 10,000 times over a 10-year span and use it as a visual representation of the rate of ruin for an insurance company.
Any help appreciated.
I think you're looking for something like this, all wrapped up in a neat function which by default draws the plot, but if wanted simply returns "ruin" or "safe" so you can run it in simulation:
simulate_ruin <- function(lambda = 0.25, EX = 2000,
                          theta = 0.5, initial_amount = 10000,
                          max_time = 10, draw = TRUE) {
  income_per_year <- lambda * EX * (1 + theta)
  # Simulate a Poisson process. Include the initial time 0,
  # and replicate every other time point so we have values "before" and
  # "after" each drop
  times <- c(0, rep(cumsum(rexp(1000, lambda)), each = 2))
  times <- c(times[times < max_time], max_time)
  # This would be our income if there were no drops (a straight line)
  total_without_drops <- initial_amount + (income_per_year * times)
  # Now simulate some drops: one per arrival, with mean size EX.
  drop_size <- rexp((length(times) - 1) %/% 2, 1 / EX)
  # Starting from times[3], we apply our cumulative drops every second index:
  payout_total <- rep(c(0, cumsum(drop_size)), each = 2)
  total <- total_without_drops - payout_total
  if (draw) {
    plot(times, total, type = "l", ylim = c(-1000, 20000))
    abline(h = 0, lty = 2)
  } else {
    if (any(total < 0))
      return("ruin")
    else
      return("safe")
  }
}
So we can call it once for a simulation:
simulate_ruin()
And again for a different simulation
simulate_ruin()
And table the results of 10,000 simulations to find the rate of ruin, which turns out to be around 3%
table(sapply(1:10000, function(x) simulate_ruin(draw = FALSE)))
#>
#> ruin safe
#> 305 9695
Created on 2022-04-06 by the reprex package (v2.0.1)
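If only the ruin rate is needed, the surplus has to be checked only just after each claim, because it rises between claims. Here is a minimal sketch of that idea, with a rough standard error attached to the estimate. The helper name `ruin_once` and the choice of 5,000 runs are my own for this illustration, and the interval uses a plain normal approximation.

```r
# Hypothetical helper: TRUE if the surplus drops below zero before max_time.
ruin_once <- function(lambda = 0.25, EX = 2000, theta = 0.5,
                      initial_amount = 10000, max_time = 10) {
  premium <- lambda * EX * (1 + theta)
  # Draw far more arrivals than can plausibly occur, then keep those in range.
  arrivals <- cumsum(rexp(1000, lambda))
  arrivals <- arrivals[arrivals < max_time]
  claims <- rexp(length(arrivals), 1 / EX)
  # Surplus immediately after each claim; ruin can only happen at these times.
  surplus_after_claim <- initial_amount + premium * arrivals - cumsum(claims)
  any(surplus_after_claim < 0)
}

set.seed(1)
n_sim <- 5000
ruined <- replicate(n_sim, ruin_once())

p_hat <- mean(ruined)
se <- sqrt(p_hat * (1 - p_hat) / n_sim)
c(estimate = p_hat, lower = p_hat - 1.96 * se, upper = p_hat + 1.96 * se)
```

The interval gives a feel for how much of the difference between two runs of 10,000 simulations is just Monte Carlo noise.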
I have a fairly simple equation, in which I have direct measurements of the variables through time, and two different unknown parameters I need to solve for, but which I know can be considered constants over the time periods I'm studying.
Both of these "constants" have fairly narrow ranges of variability in nature. In principle, it seems like some kind of optimization procedure/function should be able to do this easily, by finding the pair of values that minimizes the standard deviation of each of the constant values across the time series.
However, I am new to optimization and parameter fitting. Any help figuring out how to use R code to find the pair (or pairs) of values in this situation would be greatly appreciated.
Below is a simplified form of the equation I'm dealing with:
A * x + B * z - B * d = c + e
A and B are the constants I need to solve for.
Possible real-world values of A are 0.4-0.8
Possible real-world values of B are 0.85-0.99
To create a reasonable mock data set, assuming perfect measurements of all variables, and known values of A and B:
### Generate mock data
library(dplyr)
library(tidyr)
library(ggplot2)
### Variables all have a daily cycle and are strongly autocorrelated,
# and so can be approximated via sin function,
# with unique noise added to each to simulate variability:
# Variability for each variable
n <- 1000 # number of data points
t <- seq(0, 4*pi, length.out = n)
a <- 3
b <- 2
x.unif <- runif(n)
z.norm <- rnorm(n)
c.unif <- runif(n)
d.norm <- rnorm(n)
d.unif <- runif(n)
e.norm <- rnorm(n)
amp <- 1
# Create reasonable values of mock variable data for all variables except e;
# I will calculate from known fixed values for A and B.
x <- a*sin(b*t)+x.unif*amp + 10 # uniform error
z <- a*sin(b*t)+z.norm*amp + 10 # Gaussian/normal error
c <- ((a*sin(b*t)+c.unif*amp) + 10)/4
d <- ((a*sin(b*t)+d.norm*amp)+(a*sin(b*t)+d.unif*amp)+10)/2
# Put vectors in dataframe
dat <- data.frame("t" = t, "x" = x, "z" = z, "c" = c, "d" = d)
# Equation: A*x + B*z - B*d = c + e
# Solve for e:
# e = A*x + B*z - B*d - c
# Specify "true" values for A and B:
A = 0.6
B = 0.9
# Solve for e:
dat <- dat %>%
  mutate(e = A*x + B*z - B*d - c)
# Gather data for easy visualizing of results for e:
dat_gathered <- dat %>%
  gather(-t, value = "value", key = "key")
# Plot all variables
ggplot(dat_gathered, aes(x = t, y = value, color = key)) + geom_line()
# Add small error (to simulate measurement error) to all variables except A and B.
# Dividing by 1/x multiplies the noise by x, so this is a 2% relative error:
dat <- dat %>%
  mutate(x_j = x + rnorm(x, sd=0.02)/(1/x)) %>%
  mutate(z_j = z + rnorm(z, sd=0.02)/(1/z)) %>%
  mutate(c_j = c + rnorm(c, sd=0.02)/(1/c)) %>%
  mutate(d_j = d + rnorm(d, sd=0.02)/(1/d)) %>%
  mutate(e_j = e + rnorm(e, sd=0.02)/(1/e))
The variables in dat with the _j suffix represent real world data (since they have measurement error added). Knowing the constraint that:
A is within 0.4-0.8
B is within 0.85-0.99
Is it possible to use the noisy "_j" data to optimize for the pair of constant values that minimize deviation of A and B across the entire time series?
A little bit of algebra and setting this up as a linear regression problem with no intercept seems to work fine:
m1 <- lm(e_j+c_j ~ 0 + x_j + I(z_j-d_j), data=dat)
coef(m1) ## A =0.6032, B = 0.8916
It doesn't do anything to constrain the solution, though.
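If the bounds do need to be enforced, one option is to minimise the same sum of squared residuals with a box-constrained optimiser. Below is a minimal sketch using `optim()` with `method = "L-BFGS-B"` on a small synthetic data set. The variables `x`, `w` (standing in for z - d) and `rhs` (standing in for c + e) are made up for this illustration and are not the `dat` columns from the question.

```r
set.seed(42)
n <- 1000
x <- runif(n, 8, 12)     # stands in for x_j
w <- rnorm(n)            # stands in for z_j - d_j
A_true <- 0.6
B_true <- 0.9
rhs <- A_true * x + B_true * w + rnorm(n, sd = 0.05)  # stands in for c_j + e_j

# Sum of squared residuals of A*x + B*w against the right-hand side.
sse <- function(par) sum((rhs - par[1] * x - par[2] * w)^2)

lower <- c(A = 0.4, B = 0.85)
upper <- c(A = 0.8, B = 0.99)

fit <- optim(par = (lower + upper) / 2, fn = sse, method = "L-BFGS-B",
             lower = lower, upper = upper)
fit$par
```

When the unconstrained least-squares solution already lies inside the box, as it does here, this should give essentially the same estimates as `lm()`; the bounds only matter when noise pushes the estimate outside the plausible range.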
Suppose I have data like y, and I fit a smooth function to it with a Fourier basis:
library(fda)
y <- c(1,2,5,8,9,2,5)
x <- seq_along(y)
Fo <- create.fourier.basis(c(0, 7), 4)
precfd <- smooth.basis(x, y, Fo)
plotfit.fd(y, x, precfd$fd)
coef(precfd)
The output of the last line is:
const 411.1060285
sin1 -30.5584033
cos1 6.5740933
sin2 26.2855849
cos2 -26.0153965
I know what the sin and cos coefficients are, but what is const? The original formula has no constant term, as this link says:
http://lampx.tugraz.at/~hadley/num/ch3/3.3a.php
The first basis function in create.fourier.basis is a constant function to allow for a non-zero mean (intercept) in the data. From the documentation of the create.fourier.basis function:
The first basis function is the unit function with the value one everywhere. The next two are the sine/cosine pair with period defined in the argument period. The fourth and fifth are the sin/cosine series with period one half of period. And so forth. The number of basis functions is usually odd.
You can drop the first (unit) basis function in create.fourier.basis with the argument dropind = 1. Below some example code that illustrates which basis functions are used in create.fourier.basis. Note: the scaling of the basis functions depends on the period argument in create.fourier.basis.
Example 1: non-zero mean
library(fda)
## time sequence
tt <- seq(from = 0, to = 1, length = 100)
## basis functions
phi_0 <- 1
phi_1 <- function(t) sin(2 * pi * t) / sqrt(1 / 2)
phi_2 <- function(t) cos(2 * pi * t) / sqrt(1 / 2)
## signal
f1 <- 10 * phi_0 + 5 * phi_1(tt) - 5 * phi_2(tt)
## noise
eps <- rnorm(100)
## data
X1 <- f1 + eps
## create Fourier basis with intercept
four.basis1 <- create.fourier.basis(rangeval = range(tt), nbasis = 3)
## evaluate values basis functions
## eval.basis(tt, four.basis1)
## fit Fourier basis to data
four.fit1 <- smooth.basis(tt, X1, four.basis1)
coef(four.fit1)
Example 2: zero mean
## signal
f2 <- 5 * phi_1(tt) - 5 * phi_2(tt)
## data
X2 <- f2 + eps
## create Fourier basis without intercept
four.basis2 <- create.fourier.basis(rangeval = range(tt), nbasis = 3, dropind = 1)
## evaluate values basis functions
## eval.basis(tt, four.basis2)
## fit Fourier basis to data
four.fit2 <- smooth.basis(tt, X2, four.basis2)
coef(four.fit2)
I'm trying to visualize a signal and its frequency spectrum in Julia.
I found the FFTW package, which provides the FFT, and the DSP package for the frequencies.
Here is what I'm trying, with a sinusoidal signal:
using Plots
using FFTW
using DSP
# Number of points
N = 2^14 - 1
# Sample rate
fs = 1 / (1.1 * N)
# Start time
t0 = 0
tmax = t0 + N * fs
# time coordinate
t = [t0:fs:tmax;]
# signal
signal = sin.(2π * 60 * t) # sin (2π f t)
# Fourier Transform of it
F = fft(signal)
freqs = fftfreq(length(t), fs)
freqs = fftshift(freqs)
# plots
time_domain = plot(t, signal, title = "Signal")
freq_domain = plot(freqs, abs.(F), title = "Spectrum")
plot(time_domain, freq_domain, layout = 2)
savefig("Wave.pdf")
I expected to see a nice plot with a peak at 60 Hz, but all I got was a weird result:
I'm ignoring the negative frequencies for now.
How should I do that in Julia?
What you call fs in your code is not your sampling rate but the inverse of it: the sampling period.
The function fftfreq takes the sampling rate as its second argument. Since you pass the sampling period instead, the returned frequencies are too small by a factor of Ts^2 (they equal the correct values multiplied by Ts^2).
I renamed fs to Ts and changed the second argument of fftfreq to the sampling rate 1.0/Ts. You also need to apply fftshift to the result of fft, so that it lines up with the shifted frequencies.
using Plots
using FFTW
# Number of points
N = 2^14 - 1
# Sample period
Ts = 1 / (1.1 * N)
# Start time
t0 = 0
tmax = t0 + N * Ts
# time coordinate
t = t0:Ts:tmax
# signal
signal = sin.(2π * 60 .* t) # sin (2π f t)
# Fourier Transform of it
F = fft(signal) |> fftshift
freqs = fftfreq(length(t), 1.0/Ts) |> fftshift
# plots
time_domain = plot(t, signal, title = "Signal")
freq_domain = plot(freqs, abs.(F), title = "Spectrum", xlim=(-1000, +1000))
plot(time_domain, freq_domain, layout = 2)
savefig("Wave.pdf")
The spectrum of my data set shows 3 periodic components in the time series. I would like to subtract these periodic components to keep the data without the periodicities.
The spectrum points to periodic events with frequencies of (1/144 = daily), (1/72 = 1/2daily), and (1/6 = hourly).
My idea was to compute the Fourier components (magnitude and phase) of my dataset, extract the components at these 3 specific frequencies, and create a new signal, which is:
Data - PeriodicSignal_1h - PeriodicSignal_1/2day - PeriodicSignal_1day
I tried with fft, but I do not know how to extract the signal at these specific frequencies.
My dataset is complicated but I'm working on an example to understand the process. Here is the example :
samplingFrequency = 1000;
timeInterval = 1/samplingFrequency;
signalIndex = seq(0, 1, by=timeInterval);
N = 1000
a1 = 2;
a2 = 3;
f1 = 10;
f2 = 20;
signal1 = a1 * sin(2 * pi * f1 * signalIndex);
signal2 = a2 * sin(2 * pi * f2 * signalIndex);
inputSignal = signal1 + signal2;
Y <- fft(inputSignal)
mag <-sqrt(Re(Y)^2+Im(Y)^2)*2/length(inputSignal)
phase <-atan(Im(Y)/Re(Y))
Yr <- Re(Y)
Yi <- Im(Y)
I'm trying to extract the magnitude and the phase of the signal with frequency f1, and I would like to generate a new signal, which is:
InputSignal - Signal_f1
I believe the following does what you are looking for ... I changed some of the variable names. At the bottom is the frequency selection that you asked about.
Set up the time and frequency parameters
samplingFrequency = 1000;
f_Hz = samplingFrequency
N = 1000
df_Hz = f_Hz / N
T = 1 / df_Hz
dt=T/N
t = dt*(seq(1,N)-1)
Generate a fake signal, no noise
a1 = 2;
a2 = 3;
f1 = 10;
f2 = 20;
signal1 = a1 * sin(2 * pi * f1 * t);
signal2 = a2 * sin(2 * pi * f2 * t);
inputSignal = signal1 + signal2;
Plot the fake signal
plot(t, signal1,type='l',col='green',ylim=c(-6,6))
lines(t, signal2,col='red')
lines(t, inputSignal,col='black')
Get the fft, and plot positive frequency portion
Y <- fft(inputSignal)
m <- floor(N/2)-1
posFreqIndices <- 2:(m+1)
negFreqIndices <- N:(m+3)
mag <- Mod(Y)*2/length(inputSignal)
phase <- Arg(Y) # four-quadrant phase; atan(Im(Y)/Re(Y)) loses the quadrant
Yr <- Re(Y)
Yi <- Im(Y)
freq <- seq(df_Hz,f_Hz/2-df_Hz,df_Hz)
plot(freq,mag[posFreqIndices],type='l',xlab='Freq (Hz)', ylab='Magnitude',xlim=c(0,30))
# plot(freq,10*log10(mag[posFreqIndices]),type='l',xlab='Freq (Hz)', ylab='Magnitude (db)',xlim=c(0,30))
# plot(freq,phase[posFreqIndices]*180/pi,type='l',xlab='Freq (Hz)', ylab='Phase (deg)',xlim=c(0,30))
Identify the frequencies for the filtered signal based on amplitude
ampSelectIndices <- which(mag>1.9 & mag < 2.1)
Generate the filtered fft for the selected frequencies
YAmpSelect <- Y*0
YAmpSelect[ampSelectIndices] = Y[ampSelectIndices]
Calculate the inverse fft
yAmpSelect = Re(fft(YAmpSelect, inverse = TRUE))/length(YAmpSelect)
Plot the filtered signal
plot(t,yAmpSelect,type='l',xlab='t (sec)',ylab='Filtered for mag ~ 2')
Plot the original signal minus the filtered signal
plot(t,inputSignal-yAmpSelect,type='l')
The fft returns the negative frequencies folded into the upper half of the output vector. The following checks the unfolding used above. The check itself only holds for real-valued time signals, but the selection process is also correct for complex-valued ones.
checkFreqWrapping = all.equal(mag[posFreqIndices], mag[negFreqIndices])
stopifnot(isTRUE(checkFreqWrapping))
Select fft values by frequency
freqSelectIndices_a <- which(9.95 < freq & freq < 10.05)
freqSelectIndices = union(posFreqIndices[freqSelectIndices_a],negFreqIndices[freqSelectIndices_a])
Create the fft for selected frequencies
YFreqSelect <- Y*0
YFreqSelect[freqSelectIndices ] = Y[freqSelectIndices ]
Calculate the time signal, plot it.
yFreqSelect = Re(fft(YFreqSelect, inverse = TRUE))/length(YFreqSelect)
plot(t,yFreqSelect,type='l',xlab='t (sec)',ylab='Filtered for f ~ 10 Hz')
plot(t,inputSignal-yFreqSelect,type='l')
OK, I think that explains how to select fft values based on frequencies... Good luck...
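For a frequency that falls exactly on an FFT bin, the same selection can be written more compactly by computing the bin index directly. The sketch below assumes that f1 * N / fs is an integer, as it is for the fake signal above; otherwise the energy leaks into neighbouring bins and a small range of bins is needed. It also reads off the magnitude and phase at f1. The names `posBin`, `negBin`, `amp_f1` and `phase_f1` are made up for this illustration.

```r
fs <- 1000                      # sampling frequency (Hz)
N <- 1000                       # number of samples
t <- (0:(N - 1)) / fs
f1 <- 10
f2 <- 20
signal1 <- 2 * sin(2 * pi * f1 * t)
signal2 <- 3 * sin(2 * pi * f2 * t)
inputSignal <- signal1 + signal2

Y <- fft(inputSignal)

# R's fft puts frequency k * fs / N at index k + 1,
# and the matching negative frequency at index N - k + 1.
k <- round(f1 * N / fs)
posBin <- k + 1
negBin <- N - k + 1

amp_f1 <- 2 * Mod(Y[posBin]) / N   # amplitude of the f1 component (close to 2)
phase_f1 <- Arg(Y[posBin])         # phase relative to a cosine

# Zero both bins and transform back: this is inputSignal minus the f1 part.
Y_removed <- Y
Y_removed[c(posBin, negBin)] <- 0
residual <- Re(fft(Y_removed, inverse = TRUE)) / N

max(abs(residual - signal2))       # should be at rounding-error level
```

Both the positive and the negative bin have to be zeroed; removing only one of them leaves a complex-valued result after the inverse transform.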