Reconstruct Time Series from FFT frequency and strength data using R - r

After applying a Fourier Transform to an EEG measurement, I want to compare the approximation by FFT with the original signal in the form of a plot. I have to convert the data (frequency and strength) from the FFT back to a time series.
To transform the original time series I use the eegfft method of the eegkit package. I get a list of frequencies and amplitudes to approximate the original signal.
Here the two results of the FFT are shown as shortened examples:
# Frequency in Hz
freq <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
# Strength in uV
ampl <- c(4.1135352, 5.1272713, 3.2069741, 1.5336438, 2.4301334, 1.0974758, 1.8238327, 0.9637886, 1.1401306, 0.2224472)
Is there a package or method that I can use to reconstruct a time series from the frequency and amplitude data that has been approximated by FFT?
EDIT:
For the reconstruction of the original signal, do I also need the phase information that the eegfft method returns in the result?
# Phase shift in range -pi to pi
phase <- c(0.0000000, 1.1469542, -2.1930702, 2.7361738,1.1597980, 2.6118647, -0.6609641, -2.1508755,1.6584852, -1.2906986)

I expect something like this should work.
Edit: I have set phases to default to zero if missing and not passed into data_from_fft.
freq <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
ampl <- c(4.1135352, 5.1272713, 3.2069741, 1.5336438, 2.4301334, 1.0974758, 1.8238327, 0.9637886, 1.1401306, 0.2224472)
phase <- c(0.0000000, 1.1469542, -2.1930702, 2.7361738,1.1597980, 2.6118647, -0.6609641, -2.1508755,1.6584852, -1.2906986)
sampl_freq = 1000
data_from_fft <- function(xmin, xmax, sample_freq,
frequencies, amplitudes, phases = 0) {
x_vals <- seq(xmin, xmax, length.out = sample_freq * (xmax-xmin))
y_vals <- x_vals * 0
for (i in seq_along(x_vals)) {
# Note, I don't understand why the pi/2 phase adjustment is needed here,
# but I couldn't get the right answers out eegfft without it... :-(
y_vals[i] <- sum(amplitudes * sin(2*pi*frequencies * x_vals[i] + phase + pi/2))
}
data.frame(x_vals, y_vals)
}
library(tidyverse)
plot_from_FFT <- data_from_fft(0, 1, sampl_freq, freq, ampl, phase)
ggplot(plot_from_FFT, aes(x_vals, y_vals)) +
geom_line()
Now, let's see if we can use that output to reconstruct the inputs:
eegkit::eegfft(plot_from_FFT$y_vals, lower = 1, upper = 20, Fs = sampl_freq) %>%
filter(abs(strength) > 0.1)
frequency strength phase.shift
1 1 4.1158607 0.004451123
2 2 5.1177070 1.154553861
3 3 3.2155744 -2.185185998
4 4 1.5319350 2.739953054
5 5 2.4283426 1.173258629
6 6 1.0813858 2.645126993
7 7 1.8323207 -0.644216053
8 8 0.9598727 -2.138381646
9 9 1.1427380 1.685081744
10 10 0.2312619 -1.265466418
Yes! These are pretty close to the inputs.
eegkit::eegfft(plot_from_FFT$y_vals, lower = 1, upper = 20, Fs = sampl_freq) %>%
filter(abs(strength) > 0.1) %>%
left_join(
tibble(frequency = freq,
strength_orig = ampl,
phase_orig = phase)
) %>%
gather(stat, value, -frequency) %>%
mutate(category = if_else(stat %>% str_detect("str"), "strength", "phase"),
version = if_else(stat %>% str_detect("orig"), "plot inputs", "reconstructed inputs"),) %>%
ggplot(aes(frequency, value, shape = version, size = version)) +
geom_point() +
scale_x_continuous(breaks = 1:10, minor_breaks = NULL) +
scale_shape_manual(values = c(16, 21)) +
scale_size_manual(values = c(1,5)) +
facet_wrap(~category)

Related

Multiply probability distributions in R [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 7 months ago.
Improve this question
I'm trying to multiply some probability functions as to update the probability given certain factors. I've tried several things using the pdqr and bayesmeta packages, but they all work out not the way I intend, what am I missing?
A reproducible example showing two different distributions, a and b, which I want to multiply. That is because, as you notice, b doesn't have measurements in the low values, so a probability of 0. This should be reflected in the updated distribution.
library(tidyverse)
library(pdqr)
library(bayesmeta)
#measurements
a <- c(1, 2, 2, 4, 5, 5, 6, 6, 7, 7, 7, 8, 7, 8, 2, 6, 9, 10)
b <- c(5, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 9, 9, 9, 7)
#create probability distribution functions
distr_a <- new_d(a, type = "continuous")
distr_b <- new_d(b, type = "continuous")
#try to combine distributions
summarized <- distr_a + distr_b
multiplied <- distr_a * distr_b
mixture <- form_mix(list(distr_a, distr_b))
convolution <- convolve(distr_a, distr_b)
The resulting PDF's are plotted like this:
The bayesmeta::convolve() does the same as summarizing two pdqr PDF's and seem to oddly shift the distributions to the right and make them not as high as supposed to be.
Ordinarily multiplying the pdqr PDF's leaves a very low probablity overall.
Using the pdqr::form_mix() seems to even the PDF's out in between, but leaving probabilies above 0 for the lower x-values.
So, I tried to gain some insight in what I wanted to do, by using the PDF's for a and b to generate probabilities for each x value and multiply that:
#multiply distributions manually
x <- c(1:10)
manual <- data.frame(x) %>%
mutate(a = distr_a(x),
b = distr_b(x),
multiplied = a*b)
This indeed gives a resulting shape I am after, it however (logically) has too low probabilities:
I would like to multiply (multiple) PDF's. What am I doing wrong? Are my statistics wrong, or am I missing a usefull function?
UPDATE:
It seems I am a stats noob on this subject, but I would like to achieve something like the below distribution. Given that both situation a and b are true, I would expect the distribution te be something like the dotted line. Is that possible?
multiplied is the correct one. One can check with log-normal distributions. The sum of two independant log-normal random variables is log-normal with µ = µ_a + µ_b and sigma² = sigma²_a + sigma²_b.
a <- rlnorm(25000, meanlog = 0, sdlog = 1)
b <- rlnorm(25000, meanlog = 1, sdlog = 1)
distr_a <- new_d(a, type = "continuous")
distr_b <- new_d(b, type = "continuous")
distr_ab <- form_trans(
list(distr_a, distr_b), trans = function(x, y) x*y
)
# or: distr_ab <- distr_a * distr_b
plot(distr_ab, xlim = c(0, 40))
curve(dlnorm(x, meanlog = 1, sdlog = sqrt(2)), add = TRUE, col = "red")
As demonstrated here:
https://www.r-bloggers.com/2019/05/bayesian-models-in-r-2/
# Example distributions
probs <- seq(0,1,length.out= 100)
prior <- dbinom(x = 8, prob = probs, size = 10)
lik <- dnorm(x = probs, mean = .5, sd = .1)
# Multiply distributions
unstdPost <- lik * prior
# If you wanted to get an actual posterior, it must be a probability
# distribution (integrate to 1), so we can divide by the sum:
stdPost <- unstdPost / sum(unstdPost)
# Plot
plot(probs, prior, col = "black", # rescaled
type = "l", xlab = "P(Black)", ylab = "Density")
lines(probs, lik / 15, col = "red")
lines(probs, unstdPost, col = "green")
lines(probs, stdPost, col = "blue")
legend("topleft", legend = c("Lik", "Prior", "Unstd Post", "Post"),
text.col = 1:4, bty = "n")
Created on 2022-08-06 by the reprex package (v2.0.1)

R-hat against iterations RStan

I am trying to generate a similar plot as below to show the change in R-hat over iterations:
I have tried the following options :
summary(fit1)$summary : gives R-hat all chains are merged
summary(fit1)$c_summary : gives R-hat for each chain individually
Can you please help me to get R-hat for each iteration for a given parameter?
rstan provides the Rhat() function, which takes a matrix of iterations x chains and returns R-hat. We can extract this matrix from the fitted model and apply Rhat() cumulatively over it. The code below uses the 8 schools model as an example (copied from the getting started guide).
library(tidyverse)
library(purrr)
library(rstan)
theme_set(theme_bw())
# Fit the 8 schools model.
schools_dat <- list(J = 8,
y = c(28, 8, -3, 7, -1, 1, 18, 12),
sigma = c(15, 10, 16, 11, 9, 11, 10, 18))
fit <- stan(file = 'schools.stan', data = schools_dat)
# Extract draws for mu as a matrix; columns are chains and rows are iterations.
mu_draws = as.array(fit)[,,"mu"]
# Get the cumulative R-hat as of each iteration.
mu_rhat = map_dfr(
1:nrow(mu_draws),
function(i) {
return(data.frame(iteration = i,
rhat = Rhat(mu_draws[1:i,])))
}
)
# Plot iteration against R-hat.
mu_rhat %>%
ggplot(aes(x = iteration, y = rhat)) +
geom_line() +
labs(x = "Iteration", y = expression(hat(R)))

Creating subplot (facets) with custom x,y position of the subplots in ggplot2

How can we custom the position of the panels/subplot in ggplot2?
Concretely I have a grouped times series and I want to produce 1 subplot per time series with custom positions of the subplot, not necessarily in a grid.
The facet_grid() or facet_wrap() functions do not provide a full customization of the position of the panel as it uses grid.
library(tidyverse)
df = data.frame(group = LETTERS[1:5],
x = c(1,2,3,1.5,2.5),
y =c(2,1,2,3,3),
stringsAsFactors = F)%>%
group_by(group)%>%
expand_grid(time = 1:20)%>%
ungroup()%>%
mutate(dv = rnorm(n()))%>%
arrange(group,time)
## plot in grid
df%>%
ggplot()+
geom_line(aes(x=time,y=dv))+
facet_grid(~group)
## plot with custom x, y position
## Is there an equivalent of facet_custom()?
df%>%
ggplot()+
geom_line(aes(x=time,y=dv))+
facet_custom(~group, x.subplot = x, y.subplot = y)
FYI: This dataset is only an example. My data are EEG data where each group represents an electrode (up to 64) and I want to plot the EEG signals of each electrode accordingly to the position of the electrode on the head.
Well, I guess this would not really be a 'facet plot' any more. I therefore don't think there is a specific function out there.
But you can use the fantastic patchwork package for that, in particular the layout option in wrap_plots.
As the main package author Thomas describes in the vignette, the below option using area() may be a bit verbose, but it would give you full programmatic options about positioning all your plots.
library(tidyverse)
library(patchwork)
mydf <- data.frame(
group = LETTERS[1:5],
x = c(1, 2, 3, 1.5, 2.5),
y = c(2, 1, 2, 3, 3),
stringsAsFactors = F
) %>%
group_by(group) %>%
expand_grid(time = 1:20) %>%
ungroup() %>%
mutate(dv = rnorm(n())) %>%
arrange(group, time)
## plot in grid
mylist <-
mydf %>%
split(., .$group)
p_list <-
map(1:length(mylist), function(i){
ggplot(mylist[[i]]) +
geom_line(aes(x = time, y = dv)) +
ggtitle(names(mylist)[i])
}
)
layout <- c(
area(t = 1, l = 1, b = 2, r = 2),
area(t = 2, l = 3, b = 3, r = 4),
area(t = 3, l = 5, b = 4, r = 6),
area(t = 4, l = 3, b = 5, r = 4),
area(t = 5, l = 1, b = 6, r = 2)
)
wrap_plots(p_list, design = layout)
#> result not shown, it's the same as below
For a more programmatic approach, one option is to create the required "patch_area" object manually.
t = 1:5
b = t+1
l = c(1,3,5,3,1)
r = l+1
list_area <- list(t = t, b = b, l = l, r = r)
class(list_area) <- "patch_area"
wrap_plots(p_list, design = list_area)
Created on 2020-04-22 by the reprex package (v0.3.0)

How to automatically set-up and add functions to a model in R?

I am setting up a model, and I am trying to reduce the amount of writing I have to do.
Concretely, I am using the coala R-package to do coalescent simulations, and I am trying to easily implement a stepping-stone migration model.
A reproducible example: 4 linearly distributed populations exchange migrants according to stepping-stone pattern (only the adjacent populations).
model <- coal_model(sample_size = c(5, 5, 5, 5),
loci_number = 1,
loci_length = 10,
ploidy = 1) +
feat_mutation(rate = mut_rate, # e.g. 0.1
model = "HKY",
base_frequencies = c(0.25,0.25,0.25,0.25),
tstv_ratio = 4) +
feat_migration(mig_rate, 1, 2) + # mig_rate can be e.g. 0.5
feat_migration(mig_rate, 2, 1) +
feat_migration(mig_rate, 2, 3) +
feat_migration(mig_rate, 3, 2) +
feat_migration(mig_rate, 3, 4) +
feat_migration(mig_rate, 4, 3) +
sumstat_dna(name = "dna", transformation = identity)
This example works, but the downside is that I have to write many 'feat_migration' lines, although there is a clear pattern that could be automated. It is fine for a small number of populations, but I want to do a large simulation with about 70 populations. Does someone has a good idea how to automate this? The documentation has not helped me so far.
I tried two things that didn't work:
feat_migration(mig_rate, c(1,2,2,3,3,4), c(2,1,3,2,4,3))
and something like this:
migration_model <- function(){
for(i in 1:n_pops){
feat_migration(mig_rate, i, i+1) +
feat_migration(mig_rate, i+1, i))
}
In the latter case, I don't really know how I can correctly create and parse all functions correctly into my model.
Good ideas are very welcome! :)
Consider the higher-order functions: Map (wrapper to mapply) and Reduce to build a list of function calls and add them iteratively into model. Specifically, Reduce helps for function accumulating needs where result of each iteration needs to be passed into the next iteration to reduce to a single final result.
n_pops <- 4
start_pts <- as.vector(sapply(seq(n_pops-1), function(x) c(x, x+1)))
start_pts
# [1] 1 2 2 3 3 4
end_pts <- as.vector(sapply(seq(n_pops-1), function(x) c(x+1, x)))
end_pts
# [1] 2 1 3 2 4 3
# LIST OF feat_migration()
feats <- Map(function(x, y) feat_migration(mig_rate, x, y), start_pts, end_pts)
# LIST OF FUNCTIONS
funcs <- c(coal_model(sample_size = c(5, 5, 5, 5),
loci_number = 1,
loci_length = 10,
ploidy = 1),
feat_mutation(rate = mut_rate, # e.g. 0.1
model = "HKY",
base_frequencies = c(0.25,0.25,0.25,0.25),
tstv_ratio = 4),
feats,
sumstat_dna(name = "dna", transformation = identity)
)
# MODEL CALL
model <- Reduce(`+`, funcs)
As an aside, the functional form for ggplot + calls is Reduce:
gp <- ggplot(df) + aes_string(x='Time', y='Data') +
geom_point() + scale_x_datetime(limits=date_range)
# EQUIVALENTLY
gp <- Reduce(ggplot2:::`+.gg`, list(ggplot(df), aes_string(x='Time', y='Data'),
geom_point(), scale_x_datetime(limits=date_range)))
The answer is a slight edit by the solution proposed by Parfait. The model initializes without errors, and can be run in the simulator without errors.
n_pops <- 4
start_pts <- as.vector(sapply(seq(n_pops-1), function(x) c(x, x+1)))
end_pts <- as.vector(sapply(seq(n_pops-1), function(x) c(x+1, x)))
# LIST OF feat_migration()
feats <- Map(function(x, y) feat_migration(mig_rate, x, y), start_pts, end_pts)
# LIST OF FUNCTIONS
funcs <- c(list(coal_model(sample_size = c(5, 5, 5, 5),
loci_number = 1,
loci_length = 10,
ploidy = 1),
feat_mutation(rate = mut_rate, # e.g. 0.1
model = "HKY",
base_frequencies = c(0.25,0.25,0.25,0.25),
tstv_ratio = 4),
sumstat_dna(name = "dna", transformation = identity)),
feats)
)
# MODEL CALL
model <- Reduce(`+`, funcs)

Separate Bayesian parameter estimates for multiple groups in JAGS/rjags

I am trying to perform a hierarchical analysis in JAGS, extrapolating from Kruschke's Doing Bayesian Data Analysis, chapter 9. I wish to obtain posterior parameter estimates for the proportion of heads for four coins (theta's 1,2,3 and 4), coming from two mints, and also the estimates for average bias of the coins that come from each mint (mint bias: omega). I have kept the variability of each mint's bias, kappa, as a constant. The trouble is that I cannot get a posterior estimate from the second mint, it seems to just be sampling the prior. Does anyone know how to fix the model string text (see step 3 below) so as to generate the posterior estimate for the second mint?
Entire script for the analysis below
library(rjags)
library(runjags)
library(coda)
############### 1. Generate the data
flips <- c(sample(c(rep(1,3), rep(0,9))), # coin 1, mint 1, 12 flips total
sample(c(rep(1,1), rep(0,4))), # coin 2, mint 1, 5 flips total
sample(c(rep(1,10), rep(0,5))), # coin 1, mint 2, 15 flips
sample(c(rep(1,17), rep(0,6)))) # coin 2, mint 2, 23 flips
coins <- factor(c(rep(1,12), rep(2,5), rep(3, 15), rep(4, 23)))
mints <- factor(c(rep(1,17), rep(2,38)))
nFlips <- length(flips)
nCoins <- length(unique(coins))
nMints <- length(unique(mints))
#################### 2. Pass data into a list
dataList <- list(
flips = flips,
coins = coins,
mints = mints,
nFlips = nFlips,
nCoins = nCoins,
nMints = nMints)
################### 3. specify and save the model
modelString <- "
model{
# start with nested likelihood function
for (i in 1:nFlips) {
flips[i] ~ dbern(theta[coins[i]])
}
# next the prior on theta
for (coins in 1:nCoins) {
theta[coins] ~ dbeta(omega[mints[coins]]*(kappa - 2) + 1, (1 - omega[mints[coins]])*(kappa - 2) + 1)
}
# next we specify the prior for the higher-level parameters on the mint, omega and kappa
for (mints in 1:nMints) {
omega[mints] ~ dbeta(2,2)
}
kappa <- 5
}
"
writeLines(modelString, "tempModelHier4CoinTwoMint.txt")
############################### Step 4: Initialise Chains
initsList <- list(theta1 = mean(flips[coins==1]),
theta2 = mean(flips[coins==2]),
theta3 = mean(flips[coins==3]),
theta4 = mean(flips[coins==4]),
omega1 = mean(c(mean(flips[coins==1]),
mean(flips[coins==2]))),
omega2 = mean(c(mean(flips[coins==3]),
mean(flips[coins==4]))))
initsList
############################### Step 5: Generate Chains
runJagsOut <- run.jags(method = "simple",
model = "tempModelHier4CoinTwoMint.txt",
monitor = c("theta[1]", "theta[2]", "theta[3]", "theta[4]", "omega[1]", "omega[2]"),
data = dataList,
inits = initsList,
n.chains = 1,
adapt = 500,
burnin = 1000,
sample = 50000,
thin = 1,
summarise = FALSE,
plots = FALSE)
############################### Step 6: Convert to Coda Object
codaSamples <- as.mcmc.list(runJagsOut)
head(codaSamples)
############################### Step 7: Make Graphs
df <- data.frame(as.matrix(codaSamples))
theta1 <- ggplot(df, aes(x = df$theta.1.)) + geom_density()
theta2 <- ggplot(df, aes(x = df$theta.2.)) + geom_density()
theta3 <- ggplot(df, aes(x = df$theta.3.)) + geom_density()
theta4 <- ggplot(df, aes(x = df$theta.4.)) + geom_density()
omega1 <- ggplot(df, aes(x = df$omega.1.)) + geom_density()
omega2 <- ggplot(df, aes(x = df$omega.2.)) + geom_density()
require(gridExtra)
ggsave("coinsAndMintsHier/hierPropFourCoinsTwoMints.pdf", grid.arrange(theta1, theta2, theta3, theta4, omega1, omega2, ncol = 2), device = "pdf", height = 30, width = 10, units = "cm")
The problem was the way you were attempting to index the mints of the coins when setting the prior on theta. There are only 4 theta's in this case, not nFlips. Your nested indexing mints[coins] was accessing the mints data vector, not a vector of which mint each coin belongs to. I've created a corrected version below. Notice the explicit construction of a vector that indexes which mint each coin belongs to. Notice also in the model specification each for-loop index has its own index name distinct from data names.
graphics.off() # This closes all of R's graphics windows.
rm(list=ls()) # Careful! This clears all of R's memory!
library(runjags)
library(coda)
#library(rjags)
############### 1. Generate the data
flips <- c(sample(c(rep(1,3), rep(0,9))), # coin 1, mint 1, 12 flips total
sample(c(rep(1,1), rep(0,4))), # coin 2, mint 1, 5 flips total
sample(c(rep(1,10), rep(0,5))), # coin 1, mint 2, 15 flips
sample(c(rep(1,17), rep(0,6)))) # coin 2, mint 2, 23 flips
# NOTE: I got rid of `factor` because it was unneeded and got in the way
coins <- c(rep(1,12), rep(2,5), rep(3, 15), rep(4, 23))
# NOTE: I got rid of `factor` because it was unneeded and got in the way
mints <- c(rep(1,17), rep(2,38))
nFlips <- length(flips)
nCoins <- length(unique(coins))
nMints <- length(unique(mints))
# NEW: Create vector that specifies the mint of each coin. There must be a more
# elegant way to do this, but here is a logical brute-force approach. This
# assumes that coins are consecutively numbered from 1 to nCoins.
mintOfCoin = NULL
for ( cIdx in 1:nCoins ) {
mintOfCoin = c( mintOfCoin , unique(mints[coins==cIdx]) )
}
#################### 2. Pass data into a list
dataList <- list(
flips = flips,
coins = coins,
mints = mints,
nFlips = nFlips,
nCoins = nCoins,
nMints = nMints,
mintOfCoin = mintOfCoin # NOTE
)
################### 3. specify and save the model
modelString <- "
model{
# start with nested likelihood function
for (fIdx in 1:nFlips) {
flips[fIdx] ~ dbern( theta[coins[fIdx]] )
}
# next the prior on theta
# NOTE: Here we use the mintOfCoin index.
for (cIdx in 1:nCoins) {
theta[cIdx] ~ dbeta( omega[mintOfCoin[cIdx]]*(kappa - 2) + 1 ,
( 1 - omega[mintOfCoin[cIdx]])*(kappa - 2) + 1 )
}
# next we specify the prior for the higher-level parameters on the mint,
# omega and kappa
# NOTE: I changed the name of the mint index so it doesn't conflict with
# mints data vector.
for (mIdx in 1:nMints) {
omega[mIdx] ~ dbeta(2,2)
}
kappa <- 5
}
"
writeLines(modelString, "tempModelHier4CoinTwoMint.txt")
############################### Step 4: Initialise Chains
initsList <- list(theta1 = mean(flips[coins==1]),
theta2 = mean(flips[coins==2]),
theta3 = mean(flips[coins==3]),
theta4 = mean(flips[coins==4]),
omega1 = mean(c(mean(flips[coins==1]),
mean(flips[coins==2]))),
omega2 = mean(c(mean(flips[coins==3]),
mean(flips[coins==4]))))
initsList
############################### Step 5: Generate Chains
runJagsOut <- run.jags(method = "parallel",
model = "tempModelHier4CoinTwoMint.txt",
# NOTE: theta and omega are vectors:
monitor = c( "theta", "omega" , "kappa" ),
data = dataList,
#inits = initsList, # NOTE: Let JAGS initialize.
n.chains = 4, # NOTE: Not only 1 chain.
adapt = 500,
burnin = 1000,
sample = 10000,
thin = 1,
summarise = FALSE,
plots = FALSE)
############################### Step 6: Convert to Coda Object
codaSamples <- as.mcmc.list(runJagsOut)
head(codaSamples)
########################################
## NOTE: Important step --- Check MCMC diagnostics
# Display diagnostics of chain, for specified parameters:
source("DBDA2E-utilities.R") # For function diagMCMC()
parameterNames = varnames(codaSamples) # from coda package
for ( parName in parameterNames ) {
diagMCMC( codaObject=codaSamples , parName=parName )
}
############################### Step 7: Make Graphs
# ...

Resources