Question about delayed sampled sinusoid math expression - math

I have been studying digital audio processing using the book Designing Audio Effect Plugins in C++.
For an analog sinusoid:
Complex sinusoid = e^(jωt)
Delayed sinusoid = e^(jω(t−n)) = e^(jωt) * e^(−jωn), a delay of n seconds
For the digital sampled version:
Sampled complex sinusoid = e^(jωnT), where T is the sampling interval and n is the sample index
I understand all of the above, but I am confused about the delayed sampled sinusoid, which is described as e^(jω(nT − M)), M = samples of delay.
But I think it should be written as e^(jωT(n − M)), since T is a constant for a fixed sample rate and n and M have the same unit (samples).
Can anyone explain this to me?

You are right: e^(jωT(n − M)) is the correct form when M represents the delay as a sample count.
The formula e^(jω(nT − M)) is valid only when M is a delay expressed in seconds.
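A quick numerical check of this equivalence (a minimal sketch in R with made-up values; w, Fs, n, and M below are hypothetical, not from the book):
w  <- 2 * pi * 440        # example angular frequency (440 Hz tone)
Fs <- 48000               # example sample rate
T  <- 1 / Fs              # sample interval in seconds
n  <- 0:9                 # sample indices
M  <- 3                   # delay expressed in samples
M_time <- M * T           # the same delay expressed in seconds
a <- exp(1i * w * T * (n - M))       # delay given as a sample count
b <- exp(1i * w * (n * T - M_time))  # delay given as a time in seconds
all.equal(a, b)                      # TRUE: the two forms agree once the units are consistent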

Related

Simulating a process n times in R

I've written an R script (sourced from here) simulating the path of a geometric Brownian motion of a stock price, and I need the simulation to run 1000 times so that I generate 1000 paths of the process U_t = S_t * e^(−mu*t), by discretizing the law of motion derived from U_t, which is the bottom line of the solution to the question posted here.
The process has n = 252 steps and a discretization step of 1/252, with volatility sigma = 0.4 and instantaneous drift mu, which I've treated as zero (although I'm not sure about this). I can generate a single path, but I'm struggling to simulate 1000 of them; I'm unsure which variables I need to change, or whether there's an issue in my for loop that's restricting me from generating all 1000 paths. Could it also be that the script is simulating each individual point for 252 realizations instead of simulating the full process, and if so, would this restrict me from generating all 1000 paths? Is it also possible that the array I'm generating, defined as U, hasn't been generated correctly? U[0] must equal 1, and so must the first realization U(1) = 1. The code is below; I'm pretty stuck trying to figure this out, so any help is appreciated.
#Simulating Geometric Brownian motion (GMB)
tau <- 1 #time to expiry
N <- 253 #number of sub intervals
dt <- tau/N #length of each time sub interval
time <- seq(from=0, to=N, by=dt) #time moments in which we simulate the process
length(time) #it should be N+1
mu <- 0 #GBM parameter 1
sigma <- 0.4 #GBM parameter 2
s0 <- 1 #GBM parameter 3
#simulate Geometric Brownian motion path
dwt <- rnorm(N, mean = 0, sd = 1) #standard normal sample of N elements
dW <- dwt*sqrt(dt) #Brownian motion increments
W <- c(0, cumsum(dW)) #Brownian motion at each time instant N+1 elements
#Define U Array and set initial values of U
U <- array(0, c(N,1)) #array of U
U[0] = 1
U[1] <- s0 #first element of U is s0. with the for loop we find the other N elements
for(i in 2:length(U)){
U[i] <- (U[1]*exp(mu - 0.5*sigma^2*i*dt + sigma*W[i-1]))*exp(-mu*i)
}
#Plot
plot(ts(U), main = expression(paste("Simulation of Ut")))
This question is quite difficult to answer since there are a lot of unclear things, at least to me.
To begin with, length(time) is equal to 64010, not N + 1, which would be 254.
If I understand correctly, the Brownian motion function returns the position in one dimension at a given time. Hence, to calculate this position for each time point, the following can be enough:
s0*exp((mu - 0.5*sigma^2)*time + sigma*rnorm(length(time),0,time))
However, this calculates 64010 points, not 253. If you replicate it 1000 times, it gives 64010000 points, which is quite a lot.
> B <- 1000
> res <- replicate(B, {
+ s0*exp((mu - 0.5*sigma^2)*time + sigma*rnorm(length(time),0,time))
+ })
> length(res)
[1] 64010000
> dim(res)
[1] 64010 1000
I know I'm missing the second part, the one explained here, but I don't fully understand what you need there. If you can write out the formula, maybe I can help you.
In general, avoid programming in R with for loops that iterate over vectors. R is a vectorized language, so there is usually no need for them. If you want to run the same code B times, the replicate(B, { your code }) function is your friend.
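As a minimal sketch (using the exact GBM solution S_t = s0*exp((mu - 0.5*sigma^2)*t + sigma*W_t) on a 252-step grid rather than the OP's exact formula for U; with mu = 0 the process U_t = S_t*exp(-mu*t) reduces to S_t), 1000 paths can be generated like this:
tau <- 1; N <- 252; dt <- tau / N
t_grid <- seq(0, tau, by = dt)    # N + 1 time points
mu <- 0; sigma <- 0.4; s0 <- 1
B <- 1000
paths <- replicate(B, {
  W <- c(0, cumsum(rnorm(N, sd = sqrt(dt))))   # Brownian motion on the grid
  s0 * exp((mu - 0.5 * sigma^2) * t_grid + sigma * W)
})
dim(paths)   # 253 x 1000: one column per simulated path
matplot(t_grid, paths[, 1:20], type = "l", lty = 1,
        xlab = "t", ylab = "U_t", main = "20 of 1000 simulated paths")
Each column of paths is one full realization of the process, so summary statistics across paths are simple row-wise operations (e.g. rowMeans(paths)).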

Simple American Option Pricing via Monte Carlo Simulation in R - Results are too high

I am more of a novice in R and have been trying to build a formula to price American-type options (call or put) using a simple Monte Carlo simulation (no regressions etc.). While the code works well for European-type options, it appears to overvalue American-type options (in comparison to binomial/trinomial trees and other pricing models).
I would greatly appreciate your input!
The steps I take are outlined below.
1.) Simulate n stock price paths with m+1 steps (Geometric Brownian Motion):
n = 10000; m = 100; T = 5; S = 100; X = 100; r = 0.1; v = 0.1; d = 0
pat = matrix(NA,n,m+1)
pat[,1] = S
dt = T/m
for(i in 1:n)
{
for (j in seq(2,m+1))
{
pat[i,j] = pat[i,j-1] + pat[i,j-1]*((r-d)* dt + v*sqrt(dt)*rnorm(1))
}
}
2.) I calculate the payoff matrix for call options and put options and discount both via backwards induction:
# Put option
payP = matrix(NA,n,m+1)
payP[,m+1] = pmax(X-pat[,m+1],0)
for (j in seq(m,1)){
payP[,j] = pmax(X-pat[,j],payP[,j+1]*exp(-r*dt))
}
# Call option
payC = matrix(NA,n,m+1)
payC[,m+1] = pmax(pat[,m+1]-X,0)
for (j in seq(m,1)){
payC[,j] = pmax(pat[,j]-X,payC[,j+1]*exp(-r*dt))
}
3.) I calculate the Option Price as the average (mean) payoff at time 0:
mean(payC[,1])
mean(payP[,1])
In the example above, a call price of approximately 44.83 and a put price of approximately 3.49 is found. However, following a trinomial tree approach (n = 250 steps), the prices should be closer to 39.42 (call) and 1.75 (put).
Black Scholes Call Price (since no dividend yield) is 39.42.
As I said, any input is highly appreciated. Thank you very much in advance!
All the best!
I think your problem is a conceptual one rather than an actual coding problem.
What your code currently does is pick, with hindsight, the best point in time to exercise the American option over the whole simulated stock price path. It does not take into account that once the intrinsic value of an American option is higher than its calculated option price, you exercise it - which means that you forego the chance to exercise it in the future, where the difference between the intrinsic value and the option price might be even larger (depending on the realized stock price movements).
Hence, you overestimate the option prices.
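A minimal sketch of this effect (same parameters as the question, but using an exact log-normal step instead of the Euler step, so it is an illustration rather than a reproduction of the original code): the pmax recursion is equivalent to taking the best discounted intrinsic value along each path with perfect hindsight, which lies well above the European value of roughly 39.4.
set.seed(1)
n <- 10000; m <- 100; T <- 5; S <- 100; X <- 100; r <- 0.1; v <- 0.1
dt <- T / m
Z  <- matrix(rnorm(n * m), n, m)
logret  <- (r - 0.5 * v^2) * dt + v * sqrt(dt) * Z
S_paths <- cbind(S, S * exp(t(apply(logret, 1, cumsum))))  # n x (m+1) price paths
disc    <- exp(-r * dt * (0:m))                            # discount factor per step
# "perfect hindsight" call value: best discounted intrinsic value along each path
hindsight_call <- mean(apply(sweep(pmax(S_paths - X, 0), 2, disc, "*"), 1, max))
# European call value: discounted payoff at maturity only
european_call  <- mean(pmax(S_paths[, m + 1] - X, 0)) * exp(-r * T)
c(hindsight = hindsight_call, european = european_call)    # hindsight is clearly larger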

How to calculate amplitude from spectrum()

I have a signal and I need to get the actual magnitude of a frequency found with spectrum().
Consider the following signal
f <- 5
n <- 500
signal <- 4*sin(2*pi*f*seq(0,10,1/n))
S.signal <- spectrum(signal, log="no")
Using spectrum() I get a spectrum with a clear peak, and I can verify the amplitude of that peak using:
> max(S.signal$spec)
[1] 16698.45
How can I convert this value 16698.45 to the actual magnitude of the signal at that frequency 4 - or something close?
There is no direct relation between the amplitude of your signal and the amplitude of your spectrum here. The Fourier transform of a sine is a delta function at the corresponding frequency, that is, an infinitely narrow peak with infinite amplitude.
The fact that you find a finite value for the amplitude of your spectrum is due to the sampling of your signal, which causes a loss of information. You can see that here:
f <- 5
n <- 1000
signal <- 4*sin(2*pi*f*seq(0,10,1/n))
S.signal <- spectrum(signal, log="no")
max(S.signal$spec)
[1] 25261.03
With better sampling, you get a value closer to the true value of the spectrum (which is infinity here).
A late answer, but in case it helps others. As previous answers state, it is not a question of how to convert the spectral density to an amplitude, but rather, having found a peak in our density spectrum, how to extract the amplitude at the dominant frequency. I found the custom function proposed in this post useful.
An example implementing it with original poster's example:
power_spec = function(y, samp.freq, ...){
  N  <- length(y)
  fk <- fft(y)
  fk <- fk[2:length(fk)/2 + 1]
  fk <- 2*fk[seq(1, length(fk), by = 2)]/N
  freq <- (1:length(fk)) * samp.freq/(2*length(fk))
  data.frame(amplitude = Mod(fk), freq = freq)
}
f <- 5
n <- 500
signal <- 4*sin(2*pi*f*seq(0,10,1/n))
x = power_spec(signal, samp.freq = n)  # the sampling frequency is n = 500 samples per time unit, not 1/n
plot(x$amplitude~x$freq,type='l',xlim=c(0,10))
We find a peak with an amplitude of 4 at f = 5.
Please up-vote the original post where this custom function came from if it helps you too!
If your signal is really like what you show in your code, a pure sin() function, then you should only get an impulse/peak at one location, and everywhere else the spectrum is essentially zero.

How to solve a portfolio optimization with a generalised objective function?

I have a portfolio of 5 stocks for which I want to find an optimal mix of minimizing portfolio variance and maximizing expected future dividends. The latter comes from analysts' forecasts. My problem is that I know how to solve a minimum-variance problem, but I am not sure how to put the quadratic form into the right matrix form for the objective function of quadprog.
The standard minimum variance problem reads
Min! ( portfolio volatility )
where r has the 252 daily returns of the five stocks and d has the expected yearly dividend yields (firm_A pays 1%, firm_B pays 2%, etc.),
and I have programmed it as follows
dat = rep( rnorm( 10, mean = 0, sd = 1 ), 252*5 )
r = matrix( dat, nr = 252, nc = 5 )
d = matrix( c( 1, 2, 1, 2, 2 ) )
library(quadprog)
# Dmat (covariance) and dvec (penalized returns) are generated easily
risk.param = 0.5
Dmat = cov(r)
Dmat[is.na(Dmat)]=0
dvec = matrix(colMeans(r) * risk.param)
dvec[is.na(dvec)]=1e-5
# The weights sum up to 1
n = 5
A = matrix( rep( 1, n ), nr = n )
b = 1
meq = 1
res = solve.QP( Dmat, dvec, A, b, meq = 1 )
Obviously, the returns in r are standard normal, hence each stock gets about a 20% weight.
Q1: How can I account for the fact that firm_A pays a dividend of 1, firm_B a dividend of 2, etc?
The new objective function reads:
Max! ( 0.5 * Portfolio_div - 0.5 * Portfolio_variance )
but I don't know how to hard-code it. The portfolio variance was easy to put into Dmat, but the new objective function has the Portfolio_div element, defined as Portfolio_div = w'd, where w holds the five weights.
Thanks a lot.
EDIT: Maybe it makes sense to add a higher-level description of the problem:
I am able to run a minimum-variance optimization with the code above. Minimizing the portfolio variance means optimizing the weights over the variance-covariance matrix Dmat (of dimension 5x5). However, I want to add an additional part to the objective, namely the dividends in d multiplied by the weights (hence of dimension 5x1). The same weights are also used for Dmat.
Q2: How can I add the vector d to the code?
EDIT2: I guess the answer is to simply use
dvec = -1/d
as I maximize expected dividends by minimizing the inverse of the negative.
Q3: Could someone please tell me if that's right?
Opening a can of worms:
TL;DR: While I respect the great work Harry Markowitz (1990 Nobel prize) has performed, I appreciate his wonderful CACI Simulations spin-off deterministic simulation framework COMET III much more than the portfolio-theory assumption that variance per se is the ruling minimiser driver for the portfolio optimisation process.
Driving this principal point of view further, and closer to your idea (a view which may still suit the somewhat ill-formed motivation of big funds that live happily from their 2-and-20 fees; given the nature and scale of "their" skewed perception of losses, what they recognise as a direct loss is the non-acquired hefty & risk-free management fee associated with crowd-panic-churn AUM erosion, rather than the real profits & losses arising from their (in)ability to deliver any above-average AUM returns): the problem is in the proper formulation of the { penalty | utility } function.
While variance is taken in classical efficient-frontier theory as a penalty factor, operated on in a min! global search, it has not much to do with real profit generation. You get penalised even for positive-side variance components, which is nonsense per se.
On the contrary, the dividend is a direct benefit, an absolute utility, entering the max! optimisation process.
So the first step in Q3 & Q1 ought to be the design of a consistent utility function, isolated from relative, revenue-unrelated factors, but containing all other absolute factors -- a cost of entry, transaction costs, rebalancing costs -- as otherwise your utility model would be misleading your portfolio wealth-management strategy.
A2: Without this a-priori designed property, no one may claim a model is worth a single CPU-hour to even start the model's global optimisation efforts.
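On the mechanical side of Q2/Q3 (a sketch only, with toy data, and not a judgement on the utility-function design above): quadprog::solve.QP minimizes 0.5 * b'Dmat b - dvec'b, so maximizing 0.5 * w'd - 0.5 * w'Sigma w corresponds to Dmat = Sigma and dvec = 0.5 * d, i.e. the dividend vector enters dvec directly (scaled by the trade-off weight) rather than as -1/d:
library(quadprog)
set.seed(1)
r <- matrix(rnorm(252 * 5), nrow = 252, ncol = 5)  # toy daily returns, not the OP's data
d <- c(1, 2, 1, 2, 2)                              # expected dividend yields
Dmat <- cov(r)                                     # quadratic (variance) part
dvec <- 0.5 * d                                    # linear (dividend) part
A    <- cbind(rep(1, 5))                           # weights sum to one
res  <- solve.QP(Dmat, dvec, A, bvec = 1, meq = 1)
round(res$solution, 3)                             # weights tilt towards the 2%-dividend stocks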

Rstan on RStudio: MCMC running time too long (limited use of available CPU and RAM)

I am a newbie to the Rstan world, but I really need it for my thesis. I am using the script and a similar dataset from a guy from NYU, who reports an estimated time of about 18 hours for a similar dataset. However, when I try to run my model, it won't get past 10% in 18 hours. So I am asking for a little help to understand what I am doing wrong and how to improve the efficiency.
I am running a model with 500 iterations, 100 of which are warmup, and 2 chains, with a bernoulli_logit likelihood over 5 parameters, trying to estimate 2 of them through a No-U-Turn sampling procedure (at each step it draws each parameter from a random normal, then estimates y and compares it with the actual data to see if the new parameters are a better fit to the data).
y[n] ~ bernoulli_logit( alpha[kk[n]] + beta[jj[n]] - gamma * square( theta[jj[n]] - phi[kk[n]] ) );
(n being about 10mln)
My data is a 10,000 x 1,004 matrix of 0s and 1s. To wrap it up, it is a matrix about people following politicians on Twitter, and I want to estimate their political ideas given whom they follow. I run the model in RStudio with R x64 3.1.1 on Windows 8 Professional, 64-bit, an i7 quad core with 16 GB of RAM.
Checking the performance, rsession uses no more than 14% of the CPU and 6 GB of RAM, although 7 more GB are free. When subsampling to a 10,000 x 250 matrix, I noticed that it uses less than 1.5 GB instead. However, I have tried the procedure with a 50x50 dataset and it worked just fine, so there is no mistake in the procedure itself.
Rsession opens 8 threads; I see activity on each core, but none is fully occupied.
I wonder why my PC does not work at the best of its capabilities and whether there might be some bottleneck, cap, or setting that prevents it from doing so. R is 64-bit (just checked), and so Rstan should be too (even though I had some difficulties installing it, which might have messed up some parameters).
This is what happens when I compile it:
Iteration: 1 / 1 [100%] (Sampling)
# Elapsed Time: 0 seconds (Warm-up)
# 11.451 seconds (Sampling)
# 11.451 seconds (Total)
SAMPLING FOR MODEL 'stan.code' NOW (CHAIN 2).
Iteration: 1 / 1 [100%] (Sampling)
# Elapsed Time: 0 seconds (Warm-up)
# 12.354 seconds (Sampling)
# 12.354 seconds (Total)
while when I run it, it just works for hours but never goes beyond 10% of the first chain (mainly because I interrupted it when my PC was about to melt down):
Iteration: 1 / 500 [ 0%] (Warmup)
and has this setting:
stan.model <- stan(model_code=stan.code, data = stan.data, init=inits, iter=1, warmup=0, chains=2)
## running model
stan.fit <- stan(fit=stan.model, data = stan.data, iter=500, warmup=100, chains=2, thin=thin, init=inits)
Please help me find what is slowing down the procedure (and, if nothing wrong is happening, what I can manipulate to still get some reasonable result in a shorter time).
I thank you in advance,
ML
Here's the model (from Pablo Barbera, NYU):
n.iter <- 500
n.warmup <- 100
thin <- 2 ## this will give up to 200 effective samples for each chain and par
Adjmatrix <- read.csv("D:/TheMatrix/Adjmatrix_1004by10000_20150424.txt", header=FALSE)
##10.000x1004 matrix of {0, 1} with the relationship "user i follows politician j"
StartPhi <- read.csv("D:/TheMatrix/StartPhi_20150424.txt", header=FALSE)
##1004 vector of values [-1, 1] that should be a good prior for the Phi I want to estimate
start.phi<-ba<-c(do.call("cbind",StartPhi))
y<-Adjmatrix
J <- dim(y)[1]
K <- dim(y)[2]
N <- J * K
jj <- rep(1:J, times=K)
kk <- rep(1:K, each=J)
stan.data <- list(J=J, K=K, N=N, jj=jj, kk=kk, y=c(as.matrix(y)))
## rest of starting values
colK <- colSums(y)
rowJ <- rowSums(y)
normalize <- function(x){ (x-mean(x))/sd(x) }
inits <- rep(list(list(alpha=normalize(log(colK+0.0001)),
beta=normalize(log(rowJ+0.0001)),
theta=rnorm(J), phi=start.phi,mu_beta=0, sigma_beta=1,
gamma=abs(rnorm(1)), mu_phi=0, sigma_phi=1, sigma_alpha=1)),2)
##alpha and beta are the popularity of the politician j and the propensity to follow people of user i;
##phi and theta are the position on the political spectrum of pol j and user i; phi has a prior given by expert surveys
##gamma is just a weight on the importance of political closeness
library(rstan)
stan.code <- '
data {
int<lower=1> J; // number of twitter users
int<lower=1> K; // number of elite twitter accounts
int<lower=1> N; // N = J x K
int<lower=1,upper=J> jj[N]; // twitter user for observation n
int<lower=1,upper=K> kk[N]; // elite account for observation n
int<lower=0,upper=1> y[N]; // dummy if user i follows elite j
}
parameters {
vector[K] alpha;
vector[K] phi;
vector[J] theta;
vector[J] beta;
real mu_beta;
real<lower=0.1> sigma_beta;
real mu_phi;
real<lower=0.1> sigma_phi;
real<lower=0.1> sigma_alpha;
real gamma;
}
model {
alpha ~ normal(0, sigma_alpha);
beta ~ normal(mu_beta, sigma_beta);
phi ~ normal(mu_phi, sigma_phi);
theta ~ normal(0, 1);
for (n in 1:N)
y[n] ~ bernoulli_logit( alpha[kk[n]] + beta[jj[n]] -
gamma * square( theta[jj[n]] - phi[kk[n]] ) );
}
'
## compiling model
stan.model <- stan(model_code=stan.code,
data = stan.data, init=inits, iter=1, warmup=0, chains=2)
## running model
stan.fit <- stan(fit=stan.model, data = stan.data,
iter=n.iter, warmup=n.warmup, chains=2,
thin=thin, init=inits)
samples <- extract(stan.fit, pars=c("alpha", "phi", "gamma", "mu_beta",
"sigma_beta", "sigma_alpha"))
First, my apologies: I would have introduced this as a comment, but I don't have enough reputation.
Here's the question you asked: "what can I manipulate to have still some reasonable result in shorter time?"
The answer is: it depends. Instead of representing things as a binary matrix, have you tried reducing the size of the matrix by using counts? Based on the type of model you're trying to run, I imagine there is some non-identifiability in the posterior. Could you try reparameterizing?
Also, you may want to run in CmdStan if R is causing problems with memory management.
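One more hedged suggestion (assuming a reasonably recent rstan; the cores argument and rstan_options call are not in the original script): by default the two chains run one after another on a single core each, so running them in parallel at least halves the wall-clock time:
library(rstan)
rstan_options(auto_write = TRUE)               # cache the compiled model on disk
options(mc.cores = parallel::detectCores())    # let rstan use the available cores
stan.fit <- stan(fit = stan.model, data = stan.data,
                 iter = n.iter, warmup = n.warmup, chains = 2,
                 thin = thin, init = inits, cores = 2)  # one core per chain
This does not reduce the per-iteration cost of a 10-million-row bernoulli_logit loop, but it stops the chains from queueing behind each other.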
