I have data that follows a sigmoid curve and I would like to fit a logistic function to extract the three (or two) parameters for each participant. I have found some methods online, but I'm not sure which is the correct option.
This tutorial explains that you should use the nls() function like this:
fitmodel <- nls(y~a/(1 + exp(-b * (x-c))), start=list(a=1,b=.5,c=25))
## get the coefficients using the coef function
params=coef(fitmodel)
... where you clearly need the starting values to find the best-fitting values (?).
And then this post explains that to get the starting values, you can use a self-starting model, which "can estimate good starting values for you, so you don't have to specify them":
fit <- nls(y ~ SSlogis(x, Asym, xmid, scal), data = data.frame(x, y))
However, somewhere else I also read that you should use the SSlogis function for fitting a logistic function. Could someone please confirm whether these two approaches are the best way to go about it? Or should I use values that I have extracted from previous similar data as the starting values?
Additionally, what should I do if I don't want the logistic function to be defined by the asymptote at all?
Thank you!
There isn't a single best way, but SSlogis does eliminate having to set starting values, whereas specifying the formula yourself gives you more control over the parameterization.
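As a quick sketch of the difference (this assumes x and y are your data vectors, and the starting values below are only rough illustrative guesses), note that the self-start parameters map directly onto the hand-written ones:
## self-starting logistic: no starting values needed
fit1 <- nls(y ~ SSlogis(x, Asym, xmid, scal), data = data.frame(x, y))
## hand-written parameterization: you choose the form, but must supply starts
fit2 <- nls(y ~ a/(1 + exp(-b * (x - c))), data = data.frame(x, y),
            start = list(a = max(y), b = 1, c = mean(x)))
coef(fit1)  # Asym corresponds to a, xmid to c and scal to 1/b
coef(fit2)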
If the question is really how to fix a at a predetermined level, here the value 1, without rewriting the formula, then set a before running nls and omit it from the starting values.
a <- 1
fo <- y ~ a / (1 + exp(-b * (x-c)))
nls(fo, start = list(b = 0.5, c = 25))
Alternatively, this substitutes a = 1 into the formula fo, giving fo2, without your having to rewrite the formula yourself.
fo2 <- do.call("substitute", list(fo, list(a = 1)))
nls(fo2, start = list(b = 0.5, c = 25))
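Printing fo2 confirms the substitution:
fo2
## y ~ 1/(1 + exp(-b * (x - c)))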
As @G. Grothendieck writes, there is no general "best way"; it always depends on your particular aims. Use of SSlogis is a good idea, as you don't need to specify start values, but defining your own function is more flexible. See the following example, where we use heuristics to derive start values ourselves instead of specifying them manually. Then we fit a logistic model and, as a small bonus, the Baranyi growth model with an explicit lag phase.
# time (t)
x <- c(0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20)
# Algae cell counts (Mio cells per ml)
y <- c(0.88, 1.02, 1.43, 2.79, 4.61, 7.12,
6.47, 8.16, 7.28, 5.67, 6.91)
## we now plot the data linearly and logarithmically
## the layout function is another way to subdivide the plotting area
nf <- layout(matrix(c(1,2,3,3), 2, 2, byrow = TRUE), respect = TRUE)
layout.show(nf) # this shows how the plotting area is subdivided
plot(x, y)
plot(x, log(y))
## we see that the first points show the steepest increase,
## so we can estimate a start value of the growth rate
r <- (log(y[5]) - log(y[1])) / (x[5] - x[1])
abline(a=log(y[1]), b=r)
## this way, we have heuristics for all start parameters:
## r: steepest increase of y in log scale
## K: maximum value
## N0: first value
## we can check this by plotting the function with the start values
f <- function(x, r, K, N0) {K /(1 + (K/N0 - 1) * exp(-r *x))}
plot(x, y, pch=16, xlab="time (days)", ylab="algae (Mio cells)")
lines(x, f(x, r=r, K=max(y), N0=y[1]), col="blue")
pstart <- c(r=r, K=max(y), N0=y[1])
aFit <- nls(y ~ f(x, r, K,N0), start = pstart, trace=TRUE)
x1 <- seq(0, 25, length = 100)
lines(x1, predict(aFit, data.frame(x = x1)), col = "red")
legend("topleft",
legend = c("data", "start parameters", "fitted parameters"),
col = c("black", "blue", "red"),
lty = c(0, 1, 1),
pch = c(16, NA, NA))
summary(aFit)
(Rsquared <- 1 - var(residuals(aFit))/var(y))
## =============================================================================
## Approach with Baranyi-Roberts model
## =============================================================================
## sometimes, a logistic is not good enough. In this case, use another growth
## model
baranyi <- function(x, r, K, N0, h0) {
A <- x + 1/r * log(exp(-r * x) + exp(-h0) - exp(-r * x - h0))
y <- exp(log(N0) + r * A - log(1 + (exp(r * A) - 1)/exp(log(K) - log(N0))))
y
}
pstart <- c(r=0.5, K=7, N0=1, h0=2)
fit2 <- nls(y ~ baranyi(x, r, K, N0, h0), start = pstart, trace=TRUE)
lines(x1, predict(fit2, data.frame(x = x1)), col = "forestgreen", lwd=2)
legend("topleft",
legend = c("data", "logistic model", "Baranyi-Roberts model"),
col = c("black", "red", "forestgreen"),
lty = c(0, 1, 1),
pch = c(16, NA, NA))
Related
I would like to do a corrplot, but instead of using the correlation coefficient, it would display the slope of a linear regression between each pair of variables.
And if possible, it would do the same as the corrplot function, showing which slopes are significant or not. And for comparison between the variables, I guess it would be preferable to normalise all the slopes.
I want to do that because I sometimes have a bad correlation/R2, but still a significant slope. So having both the correlation matrix and the "slope" matrix would be great.
Do you know if there is any existing function like this? Or how to do it?
Thank you.
EDIT:
Here is a link explaining why I have a difference between the slope and R2/correlation: https://statisticsbyjim.com/regression/low-r-squared-regression/
Here is an example of what I get using corrplot. And what I would like to do is a similar function but with the slope instead of the correlation.
library(corrplot)
M <- cor(mtcars)
test <- cor.mtest(mtcars, conf.level = 0.95)
corrplot(M, order = "hclust", tl.col = "black",
         p.mat = test$p, sig.level = 0.10)
Here you have points with the best fit line (lower panel) and the regression parameters (upper panel):
#Panel of regression coefficients
panel.corr <- function(x, y, ...){
    usr <- par("usr"); on.exit(par(usr))
    par(usr = c(0, 1, 0, 1))
    ## in the upper panel x and y are swapped relative to the mirrored lower panel,
    ## so lm(x ~ y) here matches the abline(lm(y ~ x)) drawn below the diagonal
    fit <- lm(x ~ y)
    a <- round(coef(fit)[1], 3)
    b <- round(coef(fit)[2], 3)
    txt <- paste0("y = ", a, " + (", b, ")*x")
    text(0.5, 0.5, txt, cex = 1)
}
#Panel of histograms
panel.hist <- function(x, ...){
usr <- par("usr"); on.exit(par(usr))
par(usr = c(usr[1:2], 0, 1.5) )
h <- hist(x, plot = FALSE)
breaks <- h$breaks
len <- length(breaks)
y <- h$counts/max(h$counts)
rect(breaks[-len], 0, breaks[-1], y, col = "lightblue")
}
panel.scat <- function(x, y, ...) {
    points(x, y, ...)      # pairs() has already set up the panel coordinates
    abline(lm(y ~ x))
}
#Plot
pairs(mtcars[, c(1,3:7)],
lower.panel = panel.scat,
upper.panel = panel.corr,
diag.panel = panel.hist,
gap = 0.3,
main = "Scatterplot matrix of `mtcars`")
Following the tutorial on this page, and to answer your question:
library(tidyverse)
library(ggpubr)
theme_set(theme_pubr())
# Load the data
data("marketing", package = "datarium")
head(marketing, 4)
ggplot(marketing, aes(x = youtube, y = sales)) +
geom_point() +
stat_smooth()
cor(marketing$sales, marketing$youtube)
model <- lm(sales ~ youtube, data = marketing)
model
The output of calling model is:
##
## Call:
## lm(formula = sales ~ youtube, data = marketing)
##
## Coefficients:
## (Intercept) youtube
## 8.4391 0.0475
And there is the information you're looking for:
The intercept is quite self-explanatory.
The slope is the value of the youtube coefficient here.
If you are working on multiple regression, you need to take into account all the coefficients from your model when linking the formula to the R output; see the sketch just below.
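For instance, a multiple regression on the same dataset returns one coefficient per predictor (this sketch assumes the other datarium marketing columns, facebook and newspaper):
model2 <- lm(sales ~ youtube + facebook + newspaper, data = marketing)
coef(model2)  # the intercept plus one slope per predictor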
If you want to compare just two features for which you previously computed the correlation, just swap them into the formula and you'll get a simple regression model for them. I have to advise you to check the prerequisites of a linear regression first, just in case ...
Hope it helps.
I am trying to fit the Michaelis-Menten equation to a dataset to determine the rate of disappearance as well as the IC50 (Km), if the data permit. I am getting a good fit except for the first point at concentration 0; however, I am getting a negative value of Km, which is not correct. I am using the following code.
x <- c(0, 2.5, 5.0, 10.0, 25.0)
y <- c(4.91, 1.32, 1.18, 1.12, 1.09)
#########################Fit General Michaelis Menten Equation########################################################
model.mm <- nls(y ~ (Vmax*x/(Km+x)), data = data.frame(x,y),
start = list(Km=max(y)/2, Vmax = max(y)))
print(summary(model.mm))
#plot it
plot(y~x, type="p", lwd=2,
xlab="Lopinavir Concentrations (uM)", ylab="Efflux Ratio")
title("Lopinavir Transport in MDCK-MDR1 Cells")
lines(fitted(model.mm)~x, col="red")
Any suggestions for improving the fit and parameter estimates would be much appreciated.
thanks,
Krina
The Michaelis-Menten hyperbola, with substrate concentration on the x-axis and velocity on the y-axis, usually looks like a hyperbola rising to a maximum. The values you are measuring look like an exponential decay as the substrate concentration increases, so I'm not sure the Michaelis-Menten equation works so well here. Also, you shouldn't be using the lines function: with only five x values it doesn't give you a smooth curve. You should be using the curve function.
x <- c(0, 2.5, 5.0, 10.0, 25.0)
y <- c(4.91, 1.32, 1.18, 1.12, 1.09)
mm <- data.frame(x, y)
I think you should make the data frame before you pass it to the function.
model.mm <- nls(y ~ Vmax*x / (Km + x), data = mm, start = list(Km = max(mm$y)/2, Vmax = max(mm$y)))
plot(y ~ x, type = "p", lwd = 2, xlab = "Lopinavir Concentrations (uM)", ylab = "Efflux Ratio", pch = 16, main = "Lopinavir Transport in MDCK-MDR1 Cells")
summary(model.mm)
Formula: y ~ Vmax * x/(Km + x)
Parameters:
Estimate Std. Error t value Pr(>|t|)
Km -0.4772 6.6246 -0.072 0.947
Vmax 1.0678 2.1382 0.499 0.652
Residual standard error: 2.835 on 3 degrees of freedom
Number of iterations to convergence: 4
Achieved convergence tolerance: 6.634e-06
Now I think the curve function is pretty self-explanatory:
?curve
curve(x * 1.0678 / (x + -0.4772), col = "red", lwd = 2, add = TRUE)
fx <- function(x){x * 1.0678 / (x + -0.4772)}
range(x)
[1]  0 25
We can integrate this Michaelis-Menten function and calculate the area under the curve:
require(pracma)
integrate(fx, lower = 0, upper = 25)
Error in integrate(fx, lower = 0, upper = 25) :
the integral is probably divergent
This divergence is because the fitted Km is negative, so the curve has a vertical asymptote at x = -Km ≈ 0.477, which lies inside the integration range (much like y = 1/x near zero).
If you push the lower limit a little past that asymptote, where y no longer blows up, we can get a finite answer.
integrate(fx, lower = 0.5, upper = 25)
29.71809 with absolute error < 0.00069
but this integral is questionable for divergence reasons.
You can estimate the area under your scatterplot points using the trapezoidal approximation for integration:
trapz(mm$x, mm$y)
[1] 33.2375
Note: I tried fitting exponential functions to your plot, but that doesn't work. The curve drops too fast, missing most of the points.
I think I figured out what's wrong with your function.
Type: y = (x * v) / (x + K)
into https://www.desmos.com/calculator
and see what happens when you make K negative and K positive, and when you make both K and v negative, etc.
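The same exploration can be done directly in R with curve(); the parameter values here are only illustrative:
## v = 1 throughout; K > 0 gives the usual saturating hyperbola,
## K < 0 gives a pole at x = -K and a branch that decays towards v
curve(x * 1/(x + 1), from = 0, to = 25, ylim = c(0, 5), col = "blue", ylab = "y")
curve(x * 1/(x - 1), from = 1.2, to = 25, col = "red", add = TRUE)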
Thank you for your help, really appreciated.
I was able to resolve this by using the Hill equation.
fo <- y ~ (Vmax*x^hill/((VC50^hill) + (x^hill)))
st <- c(Vmax=0.5, hill=1, VC50=0.3)
model.hill <- nls(fo, data = data.frame(x,y), start = st)
print(summary(model.hill))
co <- coef(model.hill)
plot(y~x, type="p", lwd=2,
xlab="Lopinavir Concentrations (uM)", ylab="Efflux Ratio")
title("Lopinavir Transport in MDCK-MDR1 Cells")
lines(fitted(model.hill)~x, col="red")
I have the following plot of some experimental data (see below). The red line is a fitting curve of the black dots, which are experimental values. Now, the first three dots at 0, 0.583, and 1.916 form a baseline and the next two, 2.083 and 2.416, seem to be outliers. How can I program the fitting curve so that it doesn't take the baseline and outliers into account? At the moment, R is clearly trying to optimize for those irrelevant values as well.
x <- c(0, 0.583333, 1.916666, 2.083333, 2.416666, 2.5, 3.666666, 5.916666, 9, 16.75, 20)
y <- c(0.05464, 0.05453, 0.0544, 0.18043, 0.18151, 0.12551, 0.18792, 0.2497, 0.28359, 0.31734, 0.3263)
plot(x,y, ylim = range(c(0,0.45)), pch = 1)
fit <- nls(y ~ -p1 / exp(x) + p1, start = list(p1 = 1))
xx <- seq(0,20, length = 200)
lines(xx, predict(fit, data.frame(x = xx)), col = "red")
To avoid fitting the first 5 points, use the subset= argument of nls, giving a vector of the negative positions to exclude:
nls(y ~ -p1 / exp(x) + p1, start = list(p1 = 1), subset = -seq(5))
Note that this model is actually linear in its single parameter so we could use lm instead of nls:
lm(y ~ I(1-exp(-x)) - 1, subset = -seq(5))
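If you want to check that the two parameterizations agree, compare the estimates (this reuses the x and y defined in the question):
fit.nls <- nls(y ~ -p1/exp(x) + p1, start = list(p1 = 1), subset = -seq(5))
fit.lm  <- lm(y ~ I(1 - exp(-x)) - 1, subset = -seq(5))
coef(fit.nls)  # p1
coef(fit.lm)   # same value, as the coefficient of I(1 - exp(-x))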
I am using the following code to create a standard normal distribution in R:
x <- seq(-4, 4, length=200)
y <- dnorm(x, mean=0, sd=1)
plot(x, y, type="l", lwd=2)
I need the x-axis to be labeled at the mean and at points three standard deviations above and below the mean. How can I add these labels?
The easiest (but not general) way is to restrict the limits of the x-axis. The +/- 1 to 3 sigma points will be labeled as such, and the mean will be labeled as 0, indicating 0 deviations from the mean.
plot(x,y, type = "l", lwd = 2, xlim = c(-3.5,3.5))
Another option is to use more specific labels:
plot(x,y, type = "l", lwd = 2, axes = FALSE, xlab = "", ylab = "")
axis(1, at = -3:3, labels = c("-3s", "-2s", "-1s", "mean", "1s", "2s", "3s"))
Using the code in this answer, you could skip creating x and just use curve() on the dnorm function:
curve(dnorm, -3.5, 3.5, lwd=2, axes = FALSE, xlab = "", ylab = "")
axis(1, at = -3:3, labels = c("-3s", "-2s", "-1s", "mean", "1s", "2s", "3s"))
But this doesn't use the given code anymore.
If you like the hard way of doing things without using R's built-in functions, or you want to do this outside R, you can use the normal density formula directly.
x<-seq(-4,4,length=200)
s = 1
mu = 0
y <- (1/(s * sqrt(2*pi))) * exp(-((x-mu)^2)/(2*s^2))
plot(x,y, type="l", lwd=2, col = "blue", xlim = c(-3.5,3.5))
An extremely inefficient and unusual, but beautiful, solution, based on the ideas of Monte Carlo simulation, is this:
simulate many draws (or samples) from a given distribution (say the normal), then
plot the density of these draws. The rnorm function takes arguments (A, B, C) and returns a vector of A samples from a normal distribution centered at B, with standard deviation C.
Thus, to take a sample of size 50,000 from a standard normal (i.e., a normal with mean 0 and standard deviation 1) and plot its density, we do the following:
x = rnorm(50000,0,1)
plot(density(x))
As the number of draws goes to infinity this will converge in distribution to the normal. To illustrate this, see the image below, which shows, from left to right and top to bottom, 5,000, 50,000, 500,000, and 5 million samples.
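A minimal sketch reproducing panels like that (using the sample sizes mentioned above):
par(mfrow = c(2, 2))
for (n in c(5e3, 5e4, 5e5, 5e6)) {
  plot(density(rnorm(n, 0, 1)), main = paste(format(n, scientific = FALSE), "draws"))
}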
In the general case, for example Normal(2, 1):
f <- function(x) dnorm(x, 2, 1)
plot(f, -1, 5)
This is very general: f can be defined freely, with any given parameters, for example:
f <- function(x) dbeta(x, 0.1, 0.1)
plot(f, 0, 1)
I particularly love lattice for this goal. It easily adds graphical information such as specific areas under the curve, the kind you usually require when dealing with probability problems such as finding P(a < X < b), etc.
Please have a look:
library(lattice)
e4a <- seq(-4, 4, length = 10000) # Data to set up our normal
e4b <- dnorm(e4a, 0, 1)
xyplot(e4b ~ e4a, # Lattice xyplot
type = "l",
main = "Plot 2",
panel = function(x,y, ...){
panel.xyplot(x,y, ...)
panel.abline( v = c(0, 1, 1.5), lty = 2) #set z and lines
xx <- c(1, x[x>=1 & x<=1.5], 1.5) #Color area
yy <- c(0, y[x>=1 & x<=1.5], 0)
panel.polygon(xx,yy, ..., col='red')
})
In this example I make the area between z = 1 and z = 1.5 stand out. You can easily change these parameters according to your problem.
Axis labels are automatic.
This is how to write it as a function:
normalCriticalTest <- function(mu, s) {
  x <- seq(-4, 4, length = 200)                           # x extends from -4 to 4
  y <- (1/(s * sqrt(2*pi))) * exp(-((x - mu)^2)/(2*s^2))  # the normal density f(y)
  plot(x, y, type = "l", lwd = 2, xlim = c(-3.5, 3.5))
  abline(v = c(-1.96, 1.96), col = "red")                 # vertical lines cutting off 2.5% in each tail
}
normalCriticalTest(0, 1)  # draw a standard normal with the critical-value lines
Final result:
I am a beginner in R and I have tried to find information about the following without finding anything.
The green graph in the picture is composed of the red and yellow graphs. But let's say that I only have the data points of something like the green graph. How do I extract the low/high frequencies (i.e. approximately the red/yellow graphs) using a low-pass/high-pass filter?
Update: The graph was generated with
number_of_cycles = 2
max_y = 40
x = 1:500
a = number_of_cycles * 2*pi/length(x)
y = max_y * sin(x*a)
noise1 = max_y * 1/10 * sin(x*a*10)
plot(x, y, type="l", col="red", ylim=range(-1.5*max_y,1.5*max_y,5))
points(x, y + noise1, col="green", pch=20)
points(x, noise1, col="yellow", pch=20)
Update 2: Using the Butterworth filter in the signal package, as suggested, I get the following:
library(signal)
bf <- butter(2, 1/50, type="low")
b <- filter(bf, y+noise1)
points(x, b, col="black", pch=20)
bf <- butter(2, 1/25, type="high")
b <- filter(bf, y+noise1)
points(x, b, col="black", pch=20)
The calculations were a bit of work; signal.pdf gave next to no hints about what values W should have, but the original Octave documentation at least mentioned radians, which got me going. The values in my original graph were not chosen with any specific frequency in mind, so I ended up with the following not-so-simple frequencies: f_low = 1/500 * 2 = 1/250, f_high = 1/500 * 2*10 = 1/25, and the sampling frequency f_s = 500/500 = 1. Then I chose an f_c somewhere in between the low and high frequencies for the low-pass and high-pass filters (1/100 and 1/50 respectively).
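To spell out how these cut-offs map onto the W argument of butter() (W is the cut-off frequency expressed as a fraction of the Nyquist frequency f_s/2):
f_s <- 1              # sampling frequency: 1 sample per x unit
f_c_low  <- 1/100     # chosen cut-off for the low-pass filter
f_c_high <- 1/50      # chosen cut-off for the high-pass filter
f_c_low  / (f_s/2)    # = 1/50, as used in butter(2, 1/50, type="low")
f_c_high / (f_s/2)    # = 1/25, as used in butter(2, 1/25, type="high")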
I bumped into a similar problem recently and did not find the answers here particularly helpful. Here is an alternative approach.
Let's start by defining the example data from the question:
number_of_cycles = 2
max_y = 40
x = 1:500
a = number_of_cycles * 2*pi/length(x)
y = max_y * sin(x*a)
noise1 = max_y * 1/10 * sin(x*a*10)
y <- y + noise1
plot(x, y, type="l", ylim=range(-1.5*max_y,1.5*max_y,5), lwd = 5, col = "green")
So the green line is the dataset we want to low-pass and high-pass filter.
Side note: The line in this case could be expressed as a function by using cubic spline (spline(x,y, n = length(x))), but with real world data this would rarely be the case, so let's assume that it is not possible to express the dataset as a function.
The easiest way to smooth such data that I have come across is to use loess or smooth.spline with an appropriate span/spar. According to statisticians, loess/smooth.spline is probably not the right approach here, as it does not really present a defined model of the data in that sense. An alternative is to use Generalized Additive Models (the gam() function from package mgcv). My argument for using loess or a smoothed spline here is that it is easier and does not make a difference, as we are interested in the visible resulting pattern. Real-world datasets are more complicated than in this example, and finding a defined function for filtering several similar datasets might be difficult. If the visible fit is good, why make it more complicated with R2 and p values? To me the application is visual, for which loess/smoothed splines are appropriate methods. Both methods assume polynomial relationships, with the difference that loess is more flexible, also using higher-degree polynomials, while a cubic spline is always cubic (x^3). Which one to use depends on the trends in the dataset. That said, the next step is to apply a low-pass filter on the dataset by using loess() or smooth.spline():
lowpass.spline <- smooth.spline(x,y, spar = 0.6) ## Control spar for amount of smoothing
lowpass.loess <- loess(y ~ x, data = data.frame(x = x, y = y), span = 0.3) ## control span to define the amount of smoothing
lines(predict(lowpass.spline, x), col = "red", lwd = 2)
lines(predict(lowpass.loess, x), col = "blue", lwd = 2)
The red line is the smoothed-spline filter and the blue one the loess filter. As you can see, the results differ slightly. I guess one argument for using GAM would be to find the best fit, if the trends really were this clear and consistent among datasets, but for this application both of these fits are good enough for me.
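If you do want the model-based alternative, a minimal GAM sketch with mgcv (reusing x and y from above) would be:
library(mgcv)
fit.gam <- gam(y ~ s(x))             # penalized regression spline, smoothness chosen automatically
lines(x, predict(fit.gam), col = "purple", lwd = 2)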
After finding a fitting low-pass filter, the high-pass filtering is as simple as subtracting the low-pass filtered values from y:
highpass <- y - predict(lowpass.loess, x)
lines(x, highpass, lwd = 2)
This answer comes late, but I hope it helps someone else struggling with a similar problem.
Use the filtfilt function instead of filter (package signal) to get rid of the signal shift.
library(signal)
bf <- butter(2, 1/50, type="low")
b1 <- filtfilt(bf, y+noise1)
points(x, b1, col="red", pch=20)
One method is to use the fast Fourier transform, implemented in R as fft. Here is an example of a high-pass filter. From the plots above, the idea implemented in this example is to recover the series in yellow starting from the series in green (your real data).
# I've changed the data a bit so it's easier to see in the plots
par(mfrow = c(1, 1))
number_of_cycles = 2
max_y = 40
N <- 256
x = 0:(N-1)
a = number_of_cycles * 2 * pi/length(x)
y = max_y * sin(x*a)
noise1 = max_y * 1/10 * sin(x*a*10)
plot(x, y, type="l", col="red", ylim=range(-1.5*max_y,1.5*max_y,5))
points(x, y + noise1, col="green", pch=20)
points(x, noise1, col="yellow", pch=20)
### Apply the fft to the noisy data
y_noise = y + noise1
fft.y_noise = fft(y_noise)
# Plot the series and spectrum
par(mfrow = c(1, 2))
plot(x, y_noise, type='l', main='original series', col='green4')
plot(Mod(fft.y_noise), type='l', main='Raw series - fft spectrum')
### The following code removes the first spike in the spectrum
### This would be the high pass filter
inx_filter = 15
FDfilter = rep(1, N)
FDfilter[1:inx_filter] = 0
FDfilter[(N-inx_filter):N] = 0
fft.y_noise_filtered = FDfilter * fft.y_noise
par(mfrow = c(2, 1))
plot(x, noise1, type='l', main='original noise')
plot(x, y=Re( fft( fft.y_noise_filtered, inverse=TRUE) / N ) , type='l',
main = 'filtered noise')
Per request of OP:
The signal package contains all kinds of filters for signal processing. Most of it is comparable to / compatible with the signal processing functions in Matlab/Octave.
Check out this link where there's R code for filtering (medical signals). It's by Matt Shotwell and the site is full of interesting R/stats info with a medical bent:
biostattmat.com
The fftfilt function in the signal package (FFT-based FIR filtering) should help too.
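For example, a simple FIR filter applied with fftfilt (this reuses x, y and noise1 from the question; the 20-point moving average is just an illustrative low-pass):
library(signal)
b  <- rep(1/20, 20)            # 20-point moving-average FIR coefficients
lp <- fftfilt(b, y + noise1)   # FIR filtering carried out via the FFT
points(x, lp, col = "purple", pch = 20)  # note: like filter(), this introduces a phase delay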
I also struggled to figure out how the W parameter in the butter function maps onto the filter cut-off, in part because the documentation for filter and filtfilt is incorrect as of posting (it suggests that W = .1 would result in a 10 Hz low-pass filter when combined with filtfilt when the signal sampling rate Fs = 100, but actually it's only a 5 Hz low-pass filter: the half-amplitude cut-off is 5 Hz when you use filtfilt, but the half-power cut-off is 5 Hz when you only apply the filter once, using the filter function). I'm posting some demo code I wrote below that helped me confirm how this all works, and that you could use to check a filter is doing what you want.
#Example usage of butter, filter, and filtfilt functions
#adapted from https://rdrr.io/cran/signal/man/filtfilt.html
library(signal)
Fs <- 100; #sampling rate
bf <- butter(3, 0.1);
#when apply twice with filtfilt,
#results in a 0 phase shift
#5 Hz half-amplitude cut-off LP filter
#
#W * (Fs/2) == half-amplitude cut-off when combined with filtfilt
#
#when apply only one time, using the filter function (non-zero phase shift),
#W * (Fs/2) == half-power cut-off
t <- seq(0, .99, len = 100) # 1 second sample
#generate a 5 Hz sine wave
x <- sin(2*pi*t*5)
#filter it with filtfilt
y <- filtfilt(bf, x)
#filter it with filter
z <- filter(bf, x)
#plot original and filtered signals
plot(t, x, type='l')
lines(t, y, col="red")
lines(t,z,col="blue")
#estimate signal attenuation (proportional reduction in signal amplitude)
1 - mean(abs(range(y[t > .2 & t < .8]))) #~50% attenuation at 5 Hz using filtfilt
1 - mean(abs(range(z[t > .2 & t < .8]))) #~30% attenuation at 5 Hz using filter
#demonstration that the half-amplitude cut-off is 6 Hz when applying the filter only once
x6hz <- sin(2*pi*t*6)
z6hz <- filter(bf, x6hz)
1 - mean(abs(range(z6hz[t > .2 & t < .8]))) #~50% attenuation at 6 Hz using filter
#plot the filter attenuation profile (for a single application, as with the filter function):
hf <- freqz(bf, Fs = Fs);
plot(c(0, 20, 20, 0, 0), c(0, 0, 1, 1, 0), type = "l",
xlab = "Frequency (Hz)", ylab = "Attenuation (abs)")
lines(hf$f[hf$f<=20], abs(hf$h)[hf$f<=20])
plot(c(0, 20, 20, 0, 0), c(0, 0, -50, -50, 0),
type = "l", xlab = "Frequency (Hz)", ylab = "Attenuation (dB)")
lines(hf$f[hf$f<=20], 20*log10(abs(hf$h))[hf$f<=20])
hf$f[which(abs(hf$h) - .5 < .001)[1]] #half-amplitude cutoff, around 6 Hz
hf$f[which(20*log10(abs(hf$h))+6 < .2)[1]] #half-amplitude cutoff, around 6 Hz
hf$f[which(20*log10(abs(hf$h))+3 < .2)[1]] #half-power cutoff, around 5 Hz
There is a package on CRAN named fastICA. It computes an approximation of the independent source signals; however, in order to recover both signals you need a matrix of at least 2 x n mixed observations (for this example). The algorithm can't determine the two independent signals from just a 1 x n vector. See the example below; hope this can help you.
number_of_cycles = 2
max_y = 40
x = 1:500
a = number_of_cycles * 2*pi/length(x)
y = max_y * sin(x*a)
noise1 = max_y * 1/10 * sin(x*a*10)
plot(x, y, type="l", col="red", ylim=range(-1.5*max_y,1.5*max_y,5))
points(x, y + noise1, col="green", pch=20)
points(x, noise1, col="yellow", pch=20)
######################################################
library(fastICA)
S <- cbind(y, noise1)  # assuming that y is source 1 and noise1 is source 2
A <- matrix(c(0.291, 0.6557, -0.5439, 0.5572), 2, 2) #This is a mixing matrix
X <- S %*% A
a <- fastICA(X, 2, alg.typ = "parallel", fun = "logcosh", alpha = 1,
method = "R", row.norm = FALSE, maxit = 200,
tol = 0.0001, verbose = TRUE)
par(mfcol = c(2, 3))
plot(S[,1 ], type = "l", main = "Original Signals",
xlab = "", ylab = "")
plot(S[,2 ], type = "l", xlab = "", ylab = "")
plot(X[,1 ], type = "l", main = "Mixed Signals",
xlab = "", ylab = "")
plot(X[,2 ], type = "l", xlab = "", ylab = "")
plot(a$S[,1 ], type = "l", main = "ICA source estimates",
xlab = "", ylab = "")
plot(a$S[, 2], type = "l", xlab = "", ylab = "")
I am not sure that any filter is the best way for you. A more useful instrument for that aim is the fast Fourier transform.
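A minimal sketch of that idea as a low-pass step, complementing the high-pass example above (it reuses x, y and noise1 from the question; the number of retained bins is illustrative):
sig <- y + noise1
N   <- length(sig)
ft  <- fft(sig)
keep <- 5                                  # keep the DC bin plus the 4 lowest frequencies
ft[(keep + 1):(N - keep + 1)] <- 0         # zero all higher-frequency bins and their conjugates
low <- Re(fft(ft, inverse = TRUE)/N)       # inverse transform: the low-frequency component
plot(x, sig, type = "l", col = "green")
lines(x, low, col = "red", lwd = 2)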