Smoothing directional (angular) data in R

I'm trying to deal with some motion analysis software tracking errors after the data is exported. For some frames the direction is rotated by 180 degrees from the "true" direction.
I would like to smooth the data set so that when the direction changes by ~180 degrees in a single frame, it is transformed to reflect the actual angle.
Is anyone aware of a way to solve this using any of the circular statistics packages in R, such as CircStats? Alternatively, I could imagine a script that checks whether the frame-to-frame variation is near 180 degrees, subtracts 180 if so, then moves to the next frame. Does this sound like a reasonable approach, and would it be easy to implement in R?
I'm afraid I don't have the rep to upload a figure describing the problem (it's very easy to see), but here is an example dataset.
Thanks for the help. I've been a long-time user of Stack Overflow, but this is the first time I haven't been able to find my answer without asking.
David
edit - attached image

This was an interesting problem to solve! It needs to be iterative, since changing one value can fix one problem but create another... Let me know if it does the trick.
threshold  <- 90    # a frame-to-frame change larger than this is treated as a flip
correction <- 180   # amount to add or subtract when a flip is detected

dat <- read.table("angle_data.txt", header = TRUE)
dat <- ts(dat)

repeat {
  # frame-to-frame differences of the direction (second column)
  diffs <- dat - lag(dat, k = 1)
  probl <- which(abs(diffs[, 2]) > threshold)
  if (length(probl) == 0)
    break
  # shift the later observation 180 degrees towards the earlier one
  obs.1 <- dat[probl[1], 2]
  obs.2 <- dat[probl[1] + 1, 2]
  dat[probl[1] + 1, 2] <- obs.2 + sign(obs.1 - obs.2) * correction
}
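A quick way to eyeball the result (a sketch only; it assumes, as the loop above does, that the direction sits in the second column of angle_data.txt):
raw <- read.table("angle_data.txt", header = TRUE)    # untouched copy for comparison
plot(raw[, 2], type = "l", col = "grey",
     xlab = "frame", ylab = "direction (degrees)")    # original, with the 180-degree flips
lines(as.numeric(dat[, 2]), col = "red")              # corrected series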

Related

Interpolating blinks in eyetracking data - start/end intervals as time points

So, I apologise in advance for my poor attempt at explaining myself. I am rather lost.
Summary:
I am working with the eyelinker package in R to analyse pupil size data in a time-series fashion.
I have managed to create a set of intervals where blinks start and end (extendedBlinks); they are extended by 150 milliseconds in each direction (the sampling rate is 1000 Hz).
# Define set of intervals for blinks
Blk <- cbind(df$blinks$stime, df$blinks$etime)
# Extend blinks (150 milliseconds each way)
extendedBlinks <- Intervals(Blk) %>% expand(150, "absolute")
head(extendedBlinks)
output:
Object of class Intervals
6 intervals over R:
[4485724, 4486141]
[4485984, 4486657]
[4486549, 4486853]
[4486595, 4487040]
[4486800, 4489142]
[4498990, 4499339]
In my data frame, I have PSL (Pupil Size Left), PSR (Pupil Size Right), and time (relative to the eye tracker, on the same scale as the intervals shown above).
So, I want to interpolate the PSL/PSR across those intervals (for the sake of the example, let's just stick to the PSL).
I've tried many things, and nothing seems to work for me. I want to replace the given values in y1 with extendedBlinks[1,1] and extendedBlinks[1,2] respectively, and then iterate over the intervals to interpolate the blinks.
# Interpolation
x1 <- c(extendedBlinks[1,1],extendedBlinks[1,2])
y1 <- c(500, 550)
interp <- approx(x1,y1, n = extendedBlinks[1,2]-extendedBlinks[1,1])
plot(interp)
Again, sorry for the poorly worded question. I'll edit as I receive feedback to try and make it clearer.
Any ideas?
Cheers!
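For what it's worth, here is one way the loop over intervals could look. This is a sketch only: it assumes the raw samples sit in a data frame called samples with columns time and PSL (neither name comes from the original post), and that no blink touches the very first or last sample of the recording.
for (i in seq_len(nrow(extendedBlinks))) {
  # samples falling inside the i-th extended blink
  idx <- which(samples$time >= extendedBlinks[i, 1] &
               samples$time <= extendedBlinks[i, 2])
  before <- min(idx) - 1   # last clean sample before the blink
  after  <- max(idx) + 1   # first clean sample after the blink
  # linear interpolation between the clean samples on either side
  samples$PSL[idx] <- approx(x    = samples$time[c(before, after)],
                             y    = samples$PSL[c(before, after)],
                             xout = samples$time[idx])$y
}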

Normalizing an R stars object by grid area?

first post :)
I've been transitioning my R code from sp to sf/stars, and one thing I'm still trying to grasp is accounting for the area of my grid cells.
Here's some example code to explain what I mean.
library(stars)
library(tidyverse)
# Reading in an example tif file, from stars() vignette
tif = system.file("tif/L7_ETMs.tif", package = "stars")
x = read_stars(tif)
x
# Get areas for each grid of the x object. Returns stars object with "area" in units of [m^2]
x_area <- st_area(x)
x_area
I tried loosely adopting code from this vignette (https://github.com/r-spatial/stars/blob/master/vignettes/stars5.Rmd) to divide each value in x by its cell area, but it's not working as expected (perhaps because my objects are stars and not sf?).
x$test1 = x$L7_ETMs.tif / x_area # Some computationally intensive calculation seems to happen, but doesn't produce the results I expect?
x$test1 = x$L7_ETMs.tif / x_area$area # Throws error, "non-conformable arrays"
What does seem to work is the following.
x %>%
  mutate(test1 = L7_ETMs.tif / units::set_units(as.numeric(x_area$area), m^2))
Here are the concerns I have with this code.
I worry that as I turn x_area$area (a matrix of areas over the grid) into a numeric vector, I may mess up the matching between each grid cell and its area. I did some rough testing to see if the areas match up the way I expect them to, but I can't escape the worry that this could lead to errors that are difficult to catch.
It also doesn't seem clean that I start with x_area in the correct units, only to remove and then set the units again during the computation.
Can someone suggest a "cleaner" implementation for what I'm trying to do, i.e. multiplying or dividing grid cells by their areas while maintaining units throughout? Or convince me that the code I have is fine?
Thanks!
I do not know how to improve the stars code, but you can compare the results you get with this
tif <- system.file("tif/L7_ETMs.tif", package = "stars")
library(terra)
r <- rast(tif)
a <- cellSize(r, sum=FALSE)
x <- r / a
With planar data you could do the following, when it is safe to assume there is no distortion (generally that is not the case, but sometimes it is):
y <- r / prod(res(r))
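If the grid is regular and in a projected CRS (so the no-distortion assumption above is acceptable), the same planar shortcut can be written on the stars side. This is only a sketch, using the cell size from st_dimensions() and the mutate pattern from the question; per_area is just an illustrative name.
library(stars)
library(dplyr)
s <- read_stars(system.file("tif/L7_ETMs.tif", package = "stars"))
d <- st_dimensions(s)
# one scalar cell area, with units attached once and never dropped
cell_area <- units::set_units(abs(d$x$delta * d$y$delta), m^2)
s <- s %>% mutate(per_area = L7_ETMs.tif / cell_area)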

R: Writing a recursive function for a Random Walk (initial values)

I'm a new user to R, and I am trying to create a function that will simulate a random walk. The issue for me is trying to integrate some initial values smoothly. Say I have this basic function.
y(t) = y(t-2) + eps(t)
Epsilon (or eps(t)) will be the randomness factor. I want to define y(-1)=0, and y(0)=0.
Here is my code:
ran.walk = function(n){       # 'n' steps will be the input
  eps = rnorm(n)              # creates a vector taking random values from N(0,1)
  y = c(eps[1], eps[2])       # this will set up my initial vector
  for (i in 3:n){
    ytemp = y[i-2] + eps[i]   ## !!! problem is here. Details below !!!
    y = c(y, ytemp)
  }
  return(y)
}
I'm trying to get this to start adding y3, y4, y5, and so on, but I think there is a flaw in this design... I'm not sure if I should just set up two separate lines with an if statement testing whether i is even or odd, perhaps with:
if (i %% 2 == 1)  # using modulus
Since,
y1= eps1,
y2= eps2,
y3= y1 + eps3,
y4= y2 + eps4,
y5= y3 + eps5 and so on...
Currently, I see the error in my code.
I have y1 and y2 concatenated, but I don't think the loop knows how to incorporate y[1].
Can I somehow define y[-1] = 0 and y[0] = 0 beforehand? I tried this as well and got an error.
Thank you kindly in advance for any assistance. This is my first time attempting a for loop with recursion.
-N (sorry for any formatting issues, I had a lot of problems getting this question to go through)
I found that your odd and even series are independent of each other. Assuming that is the case, I just split the problem into two separate series and use cumsum to get each random walk. The final data frame includes both the random numbers and the random walk, so you can check that it is working properly.
Hope it helps.
ran.walk = function(n) {
  eps <- rnorm(ceiling(n / 2) * 2)
  dim(eps) <- c(2, ceiling(n / 2))
  # since each series is independent, we can tally each one on its own
  eps2 <- apply(eps, 1, cumsum)
  # and just reorganize it back into the original order
  eps2 <- as.numeric(t(eps2))
  rndwlk <- data.frame(rnd = as.numeric(eps), walk = eps2)
  # remove the extra value if needed
  rndwlk <- rndwlk[1:n, ]
  return(rndwlk)
}
ran.walk(13)
After taking a break with my piano, it came to me. It's funny how simple the answer becomes once you discover it... almost trivial.
Setting the initial value to be a vector, that is:
[y(1) = y(-1) + eps(1), y(2)= y(0) + eps(2)]
everything works out. It is still true that the evens and odds don't interact, but there is no reason to specify any of that.
The method to split the iterations with modulus, then concatenating it back into the main vector would also work, but is unnecessary and more complicated. Shorter is better for users and computers. As Einstein said, make it as simple as possible, but no simpler.
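For completeness, a minimal sketch of the function as the fix above describes it, with the two seed values written out explicitly and the vector filled by index instead of grown with c():
ran.walk <- function(n) {          # assumes n >= 3
  eps <- rnorm(n)
  y <- numeric(n)
  y[1] <- 0 + eps[1]               # y(1) = y(-1) + eps(1), with y(-1) = 0
  y[2] <- 0 + eps[2]               # y(2) = y(0)  + eps(2), with y(0)  = 0
  for (i in 3:n) {
    y[i] <- y[i - 2] + eps[i]
  }
  y
}
ran.walk(10)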

R/ImageJ: Measuring shortest distance between points and curves

I have some experience with R as a statistics platform, but I am inexperienced in image-based maths. I have a series of photographs (TIFF format; px/µm is known) with holes and irregular curves. I'd like to measure the shortest distance between a hole and the closest curve for that particular hole, and I'd like to do this for each hole in a photograph. The holes are not regular either, so maybe I'd need to tell the program what counts as a hole and what counts as a curve (ImageJ has point and segmented-line functions).
Any ideas how to do this? Which package should I use in R? Would you recommend another program for this kind of task?
EDIT: Doing this is now possible using the sclero package. The package is currently available on GitHub, and the procedure is described in detail in the tutorial. Just to illustrate, I use an example from the tutorial:
library(devtools)
install_github("MikkoVihtakari/sclero", dependencies = TRUE)
library(sclero)
path <- file.path(system.file("extdata", package = "sclero"), "shellspots.zip")
dat <- read.ijdata(path, scale = 0.7812, unit = "um")
shell <- convert.ijdata(dat)
aligned <- spot.dist(shell)
plot(aligned)
It is also possible to add sample spot sizes using the functions provided by the sclero package. Please see Section 2.5 in the tutorial.
There's an edge-detection tool written for ImageJ that might help you first find the holes and the lines and clean them up. You can find it at
http://imagejdocu.tudor.lu/doku.php?id=plugin:filter:edge_detection:start
Playing around with the settings for the thresholding and the hysteresis can help to get the lines and holes detected. It's difficult to tell whether this has much chance of working without seeing your actual photographs, but a colleague of mine had good results using this tool on FRAP images. I programmed an ImageJ tool that can calculate recoveries in FRAP analysis based on those images. You might get some ideas for yourself when looking at the code (see: http://imagejdocu.tudor.lu/doku.php?id=plugin:analysis:frap_normalization:start )
The only way I know of to work with images in R is EBImage, which is part of the Bioconductor system. The package Rimage is orphaned, so it is no longer maintained.
To find the shortest distance: once you have the coordinates of the lines and holes, you can go for the shotgun approach: calculate the distances between all pairs of points and then take the minimum. An illustration of that in R:
x  <- -100:100
x2 <- seq(-70, -50, length.out = length(x)/4)

# a straight line and a circular "hole" as x/y coordinate lists
a.line <- list(x = x,
               y = 4*x + 5)
a.hole <- list(
  x = c(x2, rev(x2)),
  y = c(200 + sqrt(100 - (x2 + 60)^2),
        rev(200 - sqrt(100 - (x2 + 60)^2)))
)
plot(a.line, type = 'l')
lines(a.hole, col = 'red')

calc.distance <- function(line, hole){
  mline <- matrix(unlist(line), ncol = 2)
  mhole <- matrix(unlist(hole), ncol = 2)
  # indices for every combination of a line point and a hole point
  id1 <- rep(1:nrow(mline), nrow(mhole))
  id2 <- rep(1:nrow(mhole), each = nrow(mline))
  # smallest Euclidean distance over all pairs
  min(
    sqrt(
      (mline[id1, 1] - mhole[id2, 1])^2 +
      (mline[id1, 2] - mhole[id2, 2])^2
    )
  )
}
Then:
> calc.distance(a.line,a.hole)
[1] 95.51649
You can check this mathematically by deriving the distance from the equations of the circle and the line. The approach is fast enough as long as you don't have millions of points describing thousands of lines and holes.
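For this example the analytic check is short: the hole is a circle of radius 10 centred at (-60, 200), and the line is 4x - y + 5 = 0, so the true shortest distance is the point-to-line distance from the centre minus the radius.
# analytic check of the shotgun result above
abs(4 * (-60) - 200 + 5) / sqrt(4^2 + (-1)^2) - 10
# roughly 95.50; the shotgun value (95.51649) is slightly larger because the
# line is only sampled at 201 discrete points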

R Time series - having trouble making bollinger lines - need simple example please

I'm learning R. I know how to do a moving average, but I need to do more, and I am not a statistician; unfortunately all the docs seem to be written for statisticians.
I do this in excel a lot, it's really handy for analysis of operational activities.
Here are the fields on each row needed to make Bollinger bands:
TimeStamp | Value | Moving Average | Moving STDEVP | Lower Control | Upper Control
Value could be # of calls, a complaint ratio, anything.
Briefly, the moving average and the moving STDEVP look at the prior 8 or so values in the series. Lower control at a given point in time = moving average - 2*moving STDEVP, and upper control = moving average + 2*moving STDEVP.
This can easily be done in Excel for a single file, but if I can find a way to make it work, R will be better for my needs: hopefully faster and more reliable when automated, too.
Links or tips would be appreciated.
You could use the function rollapply() from the zoo package, provided you work with a zoo series:
TimeSeries <- cumsum(rnorm(1000))
ZooSeries  <- as.zoo(TimeSeries)
BollLines  <- rollapply(ZooSeries, 9, function(x){
  M  <- mean(x)
  SD <- sd(x)
  c(M, M + SD*2, M - SD*2)
})
Now you have to remember that rollapply uses a centred window by default, meaning that it takes values to the left and to the right of the current day. This is also more convenient, and more true to the definition of the Bollinger Band, than your suggestion of taking the x prior values.
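If you do want the trailing window from the question (each point looking back at the prior values only), rollapply can do that too via its align argument; for example (BollTrailing is just an illustrative name):
# same calculation with a trailing window: each window ends at the current
# observation, and fill = NA pads the start so the result keeps its length
BollTrailing <- rollapply(ZooSeries, 9, function(x){
  M  <- mean(x)
  SD <- sd(x)
  c(M, M + SD*2, M - SD*2)
}, align = "right", fill = NA)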
If you don't want to convert to zoo, you can work with plain vectors as well and write your own function. I added an S3-based plotting function that lets you plot the calculations easily as well. With these functions, you could do something like:
TimeSeries <- cumsum(rnorm(1000))
X <- BollingerBands(TimeSeries,80)
plot(X,TimeSeries,type="l",main="An Example")
to get:
The function code:
BollingerBands <- function(x, width){
  Start <- width + 1
  Stop  <- length(x)
  # NA padding so the output lines up with the original series
  Trail <- rep(NA, ceiling(width/2))
  Tail  <- rep(NA, floor(width/2))
  Lines <- sapply(Start:Stop, function(i){
    M  <- mean(x[(i-width):i])
    SD <- sd(x[(i-width):i])
    c(M, M + 2*SD, M - 2*SD)
  })
  Lines <- apply(Lines, 1, function(i) c(Trail, i, Tail))
  Out <- data.frame(Lines)
  names(Out) <- c("Mean", "Upper", "Lower")
  class(Out) <- c("BollingerBands", class(Out))
  Out
}
plot.BollingerBands <- function(x, data, lcol = c("red", "blue", "blue"), ...){
  plot(data, ...)
  for(i in 1:3){
    lines(x[, i], col = lcol[i])
  }
}
There is an illustration in the R Graph Gallery (65) giving code both for calculating the bands and for plotting share prices.
The 2005 code still seems to work six years later and will give IBM's current share price, going back several months.
The most obvious bug is that the widths of the lower bandwidth and volume charts have been narrowed; there may be another issue with the number of days covered.
