Convolution for Digital Signal Processing in R

I have a simple digital system which has an input x(n) = u(n) - u(n-4).
I am trying to find the output y(n) with the conv() function from the 'signal' package or the convolve() function from the 'stats' package, and plot y(n) versus n for -10 ≤ n ≤ 10.
So far I have the following code:
library(signal)
n <- c(-10:10) # Time index
x <- c(rep(0, 10), rep(1, 4), rep(0, 7)) # Input Signal
h1 <- c(rep(0, 11), 0.5, rep(0, 9)) # Filter 1
h2 <- 0.8^n # Filter 2
h2[0:11] <- 0 # NB: R ignores index 0, so this zeroes entries 1:11 (n = -10..0)
system <- data.frame(n, x, h1, h2)
y <- conv(x + conv(x, h1), h2) # Output Signal
system <- transform(system, y=y[1:21])
plot(system$n, system$y)
I checked this plot and it is very wrong. I think there is some recycling of the vectors when I do the convolution, and the output of the conv() function doesn't seem to line up with the original time index. I just can't seem to figure out how to fix my logic here. I realize that conv() applied to vectors of lengths m and n returns a vector of length m + n - 1; is there a good way to easily match this vector to a time index vector?
This would require some knowledge of Digital Signal Processing as well as coding in R, and it would be great if someone had experience in using R for this purpose and could give a few pointers. Thanks in advance.

I figured it out. The center of the output of the conv() function lines up with the center of the time index vector (here both inputs have length 21, so the center of the 41-sample full convolution falls on the same sample). As such:
library(signal)
n <- c(-10:10) # Time index
x <- c(rep(0, 10), rep(1, 4), rep(0, 7)) # Input Signal, square pulse
h1 <- c(rep(0, 11), 0.5, rep(0, 9)) # Filter 1
h2 <- 0.8^n # Filter 2
h2[1:10] <- 0 # Zero out n < 0 so the filter is causal
system <- data.frame(n, x, h1, h2)
y <- conv(x + conv(x, h1)[11:31], h2) # Output Signal
system <- transform(system, y=y[11:31])
plot(system$n, system$y)
I'll work on a general form to accomplish this, as I will be doing this regularly and wouldn't want to do this manually every time. If someone beats me to it, please share. :)
UPDATE
Created a general form of the conv() function to automatically line up the indices of the input and output vectors. This comes at the cost of not getting the full convolution, so you have to set up your input to cover the full region of interest first.
library(signal) # Should this be inside the func. with attach(), detach()?
conv2 <- function(x, y){
  # Keep the centre of the full convolution, aligned with x's indices
  conv(x, y)[ceiling(length(x)/2):(length(x) + floor(length(x)/2))]
}
# so
y <- conv2(x + conv2(x, h1), h2)
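As a side note, the alignment can also be handled without discarding samples. This is just a sketch of the general rule, not part of the original post: if x starts at time index n1 and h starts at n2, the full conv(x, h) starts at n1 + n2 and has length(x) + length(h) - 1 samples, so the index vector can always be rebuilt. The helper name conv_indexed is hypothetical.
# Hypothetical helper: carry the time index alongside the full convolution.
conv_indexed <- function(x, nx, h, nh) {
  y <- conv(x, h)
  list(y = y, n = seq(nx[1] + nh[1], by = 1, length.out = length(y)))
}
out <- conv_indexed(x, -10:10, h2, -10:10)
plot(out$n, out$y)  # full result on indices -20..20, no manual slicing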
UPDATE 2
I wanted a function to compare against the FFT approach. I'm not exactly happy with this version (I wanted to use sapply()), but it works. For now, it'll do; I'll work on improvements.
conv3 <- function(x, h){
  m <- length(x)
  n <- length(h)
  Y <- numeric(m + n - 1)            # full convolution length
  for(i in 1:(m + n - 1)){           # note: 1:(m+n-1), not 1:n+m-1
    for(j in 1:m){
      k <- i - j + 1                 # index into h
      if(k >= 1 && k <= n) Y[i] <- Y[i] + x[j] * h[k]
    }
  }
  Y[ceiling(m/2):(m + floor(m/2))]   # centre-aligned slice, as in conv2()
}
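To sanity-check conv3() against the FFT route, a sketch using the idiom from the convolve() help page (convolve(x, rev(h), type = "open") matches conv()); the agreement is up to floating-point error:
x <- c(rep(0, 10), rep(1, 4), rep(0, 7))
h <- 0.8^(-10:10); h[1:10] <- 0
full <- convolve(x, rev(h), type = "open")  # FFT-based full convolution
same <- full[ceiling(length(x)/2):(length(x) + floor(length(x)/2))]
all.equal(conv3(x, h), same)                # TRUE up to floating-point error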
Next, I think I need to work on making it multidimensional.

Related

Brownian Motion / loop in R

I want to implement the Wiener representation of Brownian motion in R (see https://en.wikipedia.org/wiki/Wiener_process#Wiener_representation), i.e. the first formula there:
W(t) = xi_0*t + sqrt(2) * sum_{j=1}^{n} xi_j * sin(pi*j*t)/(pi*j), with the xi_j i.i.d. standard normal.
When plotting this function, the higher the dimension n of the random vector is, the more the path should resemble standard Brownian motion, and the lower it is, the smoother the path should look.
I have tried to implement it, but I think there is a mistake somewhere in the loop, because the graphs do not look much more like Brownian motion when n is high; I even went as high as n = 10000 and there still isn't enough fluctuation inside each graph:
brownmotion <- function(n, time=1000){
  W <- rep(0, time)
  Wp1 <- rep(0, time)
  Wp2 <- 0
  X <- seq(0, 1, length.out = time)
  xsi <- rnorm(n)
  for (i in 1:length(X)){
    for (j in 1:n){
      Wp1[i] <- X[i]*xsi[1]
      Wp2 <- Wp2 + xsi[j]*sin(j*X[i]*pi)/(j*pi)
      W[i] <- Wp1[i] + sqrt(2)*Wp2
    }
  }
  return(W)
}
Since this is R, this is better done without loops:
brownmotion <- function(n, time=1000){
  X <- seq(0, 1, length.out = time)
  xsi <- rnorm(n + 1)
  W <- xsi[1] * X + sqrt(2) * colSums(xsi[-1] * sin(pi * 1:n %*% t(X)) / (pi * 1:n))
  return(W)
}
When coding this, I noticed a small error in your original code in that you use xsi[1] twice. I avoided this by making xsi length n + 1, so xsi[1] could be the initial value and there are still n values left. The bigger problem in your loop, though, is that Wp2 is never reset between time points, so each W[i] also accumulates the partial sums from all earlier i.
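A quick visual check of the vectorised version (a sketch; the seed and the two n values are illustrative): the large-n path should look rough and Brownian, the small-n path noticeably smoother.
set.seed(1)
t <- seq(0, 1, length.out = 1000)
plot(t, brownmotion(1000), type = "l", xlab = "t", ylab = "W(t)")
lines(t, brownmotion(10), col = "red")  # small n: visibly smoother path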

Is it possible to use vector math in R for a summation involving intervals?

Title's a little rough, open to suggestions to improve.
I'm trying to calculate time-average covariances for a 500 length vector.
This is the equation we're using:
gamma(k) = (1/(n-k)) * sum_{t=k+1}^{n} (x_t - xbar) * (x_{t-k} - xbar)
The result I'm hoping for is a vector with an entry for k from 0 to 500 (0 would just be the variance of the whole set).
I've started with something like this, but I know I'll need to reference the gap (i) in the first mean comparison as well:
x <- rnorm(500)
xMean <-mean(x)
i <- seq(1, 500)
dfGam <- data.frame(i)
dfGam$gamma <- (1/(500-dfGam$i))*(sum((x-xMean)*(x[-dfGam$i]-xMean)))
Is it possible to do this using vector math or will I need to use some sort of for loop?
Here's the for loop that I've come up with for the solution:
gamma_func <- function(input_vec) {
  output_vec <- c()
  input_mean <- mean(input_vec)
  iter <- seq(1, length(input_vec)-1)
  for(val in iter){
    iter2 <- seq((val+1), length(input_vec))
    gamma_sum <- 0
    for(val2 in iter2){
      gamma_sum <- gamma_sum + (input_vec[val2]-input_mean)*(input_vec[val2-val]-input_mean)
    }
    output_vec[val] <- (1/length(iter2))*gamma_sum
  }
  return(output_vec)
}
Thanks
Using data.table, mostly for the shift function to make x_{t - k}, you can do this:
library(data.table)
gammabar <- function(k, x){
  xbar <- mean(x)
  n <- length(x)
  df <- data.table(xt = x, xtk = shift(x, k))[!is.na(xtk)]
  df[, sum((xt - xbar)*(xtk - xbar))/n]
}
gammabar(k = 10, x)
# [1] -0.1553118
The filter [!is.na(xtk)] starts the sum at t = k + 1, because xtk will be NA for the first k indices due to being shifted by k.
Reproducible x
x <- c(0.376972124936433, 0.301548373935665, -1.0980231706536, -1.13040590360378,
-2.79653431987176, 0.720573498411587, 0.93912102300901, -0.229377746707471,
1.75913134696347, 0.117366786802848, -0.853122822287008, 0.909259181618213,
1.19637295955276, -0.371583903741348, -0.123260233287436, 1.80004311672545,
1.70399587729432, -3.03876460529759, -2.28897494991878, 0.0583034949929225,
2.17436525195634, 1.09818265352131, 0.318220322390854, -0.0731475581637693,
0.834268741278827, 0.198750636733429, 1.29784138432631, 0.936718306241348,
-0.147433193833294, 0.110431994640128, -0.812504663900505, -0.743702167768748,
1.09534507180741, 2.43537370755095, 0.38811846676708, 0.290627670295127,
-0.285598287083935, 0.0760147178373681, -0.560298603759627, 0.447188372143361,
0.908501134499943, -0.505059597708343, -0.301004012157305, -0.726035976548133,
-1.18007702699501, 0.253074712637114, -0.370711296884049, 0.0221795637601637,
0.660044122429767, 0.48879363533552)
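For comparison, here is a base-R sketch of the question's own estimator (hypothetical name gamma_vec); note it uses the question's 1/(n - k) divisor, whereas gammabar() above divides by n.
gamma_vec <- function(x) {
  n <- length(x)
  xbar <- mean(x)
  sapply(1:(n - 1), function(k)
    sum((x[(k + 1):n] - xbar) * (x[1:(n - k)] - xbar)) / (n - k))
}
gamma_vec(x)[10]  # same sum as gammabar(k = 10, x), scaled by n/(n - k)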

Trying to understand how some function works

I was given a task to write a function, which I name my_mode_k.
The input consists of two arguments:
(x, k)
where x is a vector of natural numbers of length n whose largest element can be at most k, given that k < n.
my_mode_k outputs the most frequent element of x; if more than one element occurs in x the same (highest) number of times, the function outputs the smallest of them.
For example:
my_mode_k(x = c(1, 1, 2, 3, 3), k = 3)
[1] 1
This is code I wrote:
my_mode_k <- function(x, k){
n <- length(x)
x_lemma <- rep(0, k)
for(i in 1:n){
x_lemma[i] < x_lemma[i] +1
}
x_lem2 <- 1
for( j in 2:k){
if(x_lemma[x_lem2] < x_lemma[j]){
x_lem2 <- j
}
}
x_lem2
}
which isn't working properly.
for example:
my_mode_k(x = c(2,3,4,3,2,2,5,5,5,5,5,5,5,5), k=5)
[1] 1
as the function is supposed to return 5.
I don't understand why, or what intuition to have in order to even know whether a function is working properly (it took me some time to realize that it wasn't executing the needed task), so that I could fix the mistake in it.
Here are a few steps on how you can achieve this. (As for why your version fails: the line x_lemma[i] < x_lemma[i] + 1 uses the comparison operator < rather than the assignment <-, so the counts are never updated, and the count being incremented should be indexed by the value x[i], not by the position i.)
k <- 5
input <- c(2,3,4,3,3,3,3,3,3,3,2,2,5,5,5,5,5,5,5,5)
# Calculate frequencies of elements.
tbl <- table(input[input <= k])
# Find which is max. Notice that it returns the minimum if there is a tie.
tbl.max <- which.max(tbl)
# Find which value is your result.
names(tbl.max)
input <- c(2,2,3,3,3,5,5,5)
names(which.max(table(input[input <= k])))
# 3
input <- c(2,2,5,5,5,3,3,3)
names(which.max(table(input[input <= k])))
# 3
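Putting those steps together in a function with the question's signature (a sketch; table() on numeric input orders the values ascending and which.max() returns the first maximum, which yields the required minimum on ties):
my_mode_k <- function(x, k){
  tbl <- table(x[x <= k])             # frequency of each eligible value
  as.numeric(names(which.max(tbl)))   # first max = smallest value on ties
}
my_mode_k(x = c(2,3,4,3,2,2,5,5,5,5,5,5,5,5), k = 5)
# [1] 5
my_mode_k(x = c(1, 1, 2, 3, 3), k = 3)
# [1] 1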

re-expressing a simple operation as a function in R

I am trying to construct a new variable, z, using two pre-existing variables - x and y.  Suppose for simplicity that there are only 5 observations (corresponding to 5 time periods) and that x=c(5,7,9,10,14) and y=c(0,2,1,2,3). I'm really only using the first observation in x as the initial value, and then constructing the new variable z using depreciated values of x[1] (depreciation rate of 0.05 per annum) and each of the observations over time in the vector, y. In other words, z[i] = 0.95^(i-1)*x[1] + sum_{n=1}^{i} 0.95^(i-n)*y[n]. The variable I am constructing takes the form of a new 5 by 1 vector, z, and it can be obtained using the following simple commands in R:
z=NULL
for(i in 1:length(x)){
  n=seq(1,i,by=1)
  z[i]=sum(c(0.95^(i-1)*x[1],0.95^(i-n)*y[n]))
}
The problem I am having is that I need to define this operation as a function. That is, I need to create a function f that will spit out the vector z whenever any arbitrary vectors x and y are plugged into the function, f(x,y). I’ve been going around in circles for days now and I was wondering if someone would be kind enough to provide me with a suggestion about how to proceed. Thanks in advance.
I hope the following will work for you...
x = c(5,7,9,10,14)
y = c(0,2,1,2,3)
getZ = function(x, y){
  z = NULL
  for(i in 1:length(x)){
    n = seq(1, i, by=1)
    z[i] = sum(c(0.95^(i-1)*x[1], 0.95^(i-n)*y[n]))
  }
  return(z)
}
z = getZ(x, y)
z
5.000000 6.750000 7.412500 9.041875 11.589781
This will allow .05 (or any other value) passed in as r.
ConstructZ <- function(x, y, r){
  n <- length(y)
  d <- 1 - r
  Z <- vector(length = n)
  for(i in seq_along(x)){
    n = seq_len(i)
    Z[i] = sum(c(d^(i-1)*x[1], d^(i-n)*y[n]))
  }
  return(Z)
}
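A quick check with the example data (matching the output shown above):
ConstructZ(c(5, 7, 9, 10, 14), c(0, 2, 1, 2, 3), 0.05)
# [1]  5.000000  6.750000  7.412500  9.041875 11.589781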
Here is a cool (if I say so myself) way to implement this as an infix operator (since you called it an operation).
ff = function (x, y, i) {
  n = seq.int(i)
  sum(c(0.95 ^ (i - 1) * x[[1]],
        0.95 ^ (i - n) * y[n]))
}
`%dep%` = function (x, y) sapply(seq_along(x), ff, x=x, y=y)
x %dep% y
[1] 5.000000 6.750000 7.412500 9.041875 11.589781
Doing the loop multiple times and recalculating the exponents every time may be inefficient. Here's another way to implement your calculation
getval <- function(x, y, lambda=.95) {
  n <- length(y)
  pp <- lambda^(1:n-1)
  yy <- sapply(1:n, function(i) {
    sum(y * c(pp[i:1], rep.int(0, n-i)))
  })
  pp*x[1] + yy
}
Testing with @vrajs5's sample data
x=c(5,7,9,10,14)
y=c(0,2,1,2,3)
getval(x,y)
# [1] 5.000000 6.750000 7.412500 9.041875 11.589781
but appears to be about 10x faster when testing on larger data such as
set.seed(15)
x <- rpois(200,20)
y <- rpois(200,20)
I'm not sure of how often you will run this or on what size of data so perhaps efficiency isn't a concern for you. I guess readability is often more important long-term for maintenance.
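Another angle worth noting (a sketch, not from the original answers; the name getZ2 is hypothetical): z satisfies the recursion z[i] = 0.95*z[i-1] + y[i] with z[1] = x[1] + y[1], so base R's stats::filter() computes it in one vectorised call.
getZ2 <- function(x, y, r = 0.05) {
  yy <- y
  yy[1] <- yy[1] + x[1]   # fold the initial value into the first observation
  as.numeric(stats::filter(yy, 1 - r, method = "recursive"))
}
getZ2(c(5, 7, 9, 10, 14), c(0, 2, 1, 2, 3))
# [1]  5.000000  6.750000  7.412500  9.041875 11.589781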

How to make a loop run faster in R?

I want to use arms() to get one sample each time and make a loop like the following one in my function. It runs very slowly. How could I make it run faster? Thanks.
library(HI)
dmat <- matrix(0, nrow=100, ncol=30)
system.time(
  for (d in 1:100){
    for (j in 1:30){
      y <- rep(0, 101)
      for (i in 2:100){
        y[i] <- arms(0.3, function(x) (3.5+0.000001*d*j*y[i-1])*log(x)-x,
                     function(x) (x>1e-4)*(x<20), 1)
      }
      dmat[d, j] <- sum(y)
    }
  }
)
This is a version based on Tommy's answer but avoiding all loops:
library(multicore) # or library(parallel) in 2.14.x
set.seed(42)
m = 100
n = 30
system.time({
  arms.C <- getNativeSymbolInfo("arms")$address
  bounds <- 0.3 + convex.bounds(0.3, dir = 1, function(x) (x>1e-4)*(x<20))
  if (diff(bounds) < 1e-07) stop("pointless!")
  # create the vector of z values
  zval <- 0.00001 * rep(seq.int(n), m) * rep(seq.int(m), each = n)
  # apply the inner function to each grid point and return the matrix
  dmat <- matrix(unlist(mclapply(zval, function(z)
    sum(unlist(lapply(seq.int(100), function(i)
      .Call(arms.C, bounds, function(x) (3.5 + z * i) * log(x) - x,
            0.3, 1L, parent.frame())
    )))
  )), m, byrow=TRUE)
})
On a multicore machine this will be really fast since it spreads the loads across cores. On a single-core machine (or for poor Windows users) you can replace mclapply above with lapply and get only a slight speedup compared to Tommy's answer. But note that the result will be different for parallel versions since it will use different RNG sequences.
Note that any C code that needs to evaluate R functions will be inherently slow (because interpreted code is slow). I have added the arms.C just to remove all R->C overhead to make moli happy ;), but it doesn't make any difference.
You could squeeze out a few more milliseconds by using column-major processing (the question code was row-major which requires re-copying as R matrices are always column-major).
Edit: I noticed that moli changed the question slightly since Tommy answered - so instead of the sum(...) part you have to use a loop since y[i] are dependent, so the function(z) would look like
function(z) {
  y <- 0
  for (i in seq.int(99))
    y <- y + .Call(arms.C, bounds, function(x) (3.5 + z * y) * log(x) - x,
                   0.3, 1L, parent.frame())
  y
}
Well, one effective way is to get rid of the overhead inside arms. It does some checks and calls the indFunc every time even though the result is always the same in your case.
Some other evaluations can also be done outside the loop. These optimizations bring the time down from 54 secs to around 6.3 secs on my machine. ...and the answer is identical.
set.seed(42)
#dmat2 <- ##RUN ORIGINAL CODE HERE##
# Now try this:
set.seed(42)
dmat <- matrix(0, nrow=100, ncol=30)
system.time({
  e <- new.env()
  bounds <- 0.3 + convex.bounds(0.3, dir = 1, function(x) (x>1e-4)*(x<20))
  f <- function(x) (3.5+z*i)*log(x)-x
  if (diff(bounds) < 1e-07) stop("pointless!")
  for (d in seq_len(nrow(dmat))) {
    for (j in seq_len(ncol(dmat))) {
      y <- 0
      z <- 0.00001*d*j
      for (i in 1:100) {
        y <- y + .Call("arms", bounds, f, 0.3, 1L, e)
      }
      dmat[d, j] <- y
    }
  }
})
all.equal(dmat, dmat2) # TRUE
why not like this?
dat <- expand.grid(d=1:10, j=1:3, i=1:10)
arms.func <- function(vec) {
  require(HI)
  dji <- vec[1]*vec[2]*vec[3]
  arms.out <- arms(0.3,
                   function(x, params) (3.5 + 0.00001*params)*log(x) - x,
                   function(x, params) (x>1e-4)*(x<20),
                   n.sample=1,
                   params=dji)
  return(arms.out)
}
dat$arms <- apply(dat, 1, arms.func)
library(plyr)
out <- ddply(dat, .(d,j), summarise, arms=sum(arms))
matrix(out$arms, nrow=length(unique(out$d)), ncol=length(unique(out$j)))
However, it's still single-core and time consuming. But that isn't R being slow, it's the arms function.
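For completeness, a sketch of spreading the same per-cell loop across cores with the base parallel package. This is not from the original answers: the 1e-5 constant follows the answers above, mc.cores = 2 is illustrative, and the draw-by-draw dependence mirrors the edited question (each density uses the previous sample). mclapply() forks, so it falls back to serial on Windows.
library(parallel)
library(HI)
grid <- expand.grid(d = 1:100, j = 1:30)   # d varies fastest: column-major order
cells <- mclapply(seq_len(nrow(grid)), function(r) {
  z <- 0.00001 * grid$d[r] * grid$j[r]
  yprev <- 0; total <- 0
  for (i in 1:99) {                        # each draw depends on the previous one
    yi <- arms(0.3, function(x) (3.5 + z * yprev) * log(x) - x,
               function(x) (x > 1e-4) * (x < 20), 1)
    total <- total + yi
    yprev <- yi
  }
  total
}, mc.cores = 2)
dmat <- matrix(unlist(cells), nrow = 100)  # 100 x 30, matching the question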
