How can I fit a smooth hysteresis in R? - r

I have a measurment of which should fit an hysteresis. For visualisation purpose I would like to plot a line approximating the hysteresis to help explain this pattern.
I created an example in the following image using the code below.
I would like to have an output similar to the green curve - however I don't have this data directly available, and I don't care whether it is pointy.
However most smoothing functions such as smooth.spline which I plotted in blue - allow no loops. The closest I can find is from the bezier library - plotted in red. Not nicely visible here but it produces a loop, however it fits poorly (and gives some warnings and takes quite some time).
Can you suggest a method?
set.seed(12345)
up <- seq(0,1,length.out=100)^3
down <- sqrt(seq(1,0,length.out=100))
x <- c(seq(0,1,length.out=length(up)),
seq(1,0, length.out=length(down)))
data <- data.frame(x=x, y=c(up,down),
measuredx=x + rnorm(length(x))*0.01,
measuredy=c(up,down) + rnorm(length(up)+length(down))*0.03)
with(data,plot(measuredx,measuredy, type = "p"))
with(data,lines(x,y, col='green'))
sp <- with(data,smooth.spline(measuredx, measuredy))
with(sp, lines(x,y, col="blue"))
library(bezier)
bf <- bezierCurveFit(as.matrix(data[,c(1,3)]))
lines(bezier(t=seq(0, 1, length=500), p=bf$p), col="red", cex=0.25)
UPDATE
As it turns out my actual problem is slightly different I ask another question to reflect my actual issue in the question: How to fit a smooth hysteresis in a poorly distributed data set?

set.seed(12345)
up <- seq(0,1,length.out=100)^3
down <- sqrt(seq(1,0,length.out=100))
x <- c(seq(0,1,length.out=length(up)),
seq(1,0, length.out=length(down)))
data <- data.frame(x=x, y=c(up,down),
measuredx=x + rnorm(length(x))*0.01,
measuredy=c(up,down) + rnorm(length(up)+length(down))*0.03)
Instead of smoothing data$measuredy directly over data$measuredx, do two separate smoothing, by smoothing each against a time stamp variable. Then combine the fitted values from two smoothing. This is a general way for smoothing a closed curve or a loop. (See also Q & A: Smoothing Continuous 2D Points)
t <- seq_len(nrow(data) + 1)
xs <- smooth.spline(t, c(data$measuredx, data$measuredx[1]))$y
ys <- smooth.spline(t, c(data$measuredy, data$measuredy[1]))$y
with(data, plot(measuredx, measuredy))
lines(xs, ys)
c(data$measuredx, data$measuredx[1]) for example is just to ensure that the last value in the vector agrees with the first, so that it completes a cycle.
The curve is not really closed at the bottom left corner, because smooth.spline is doing smoothing not interpolation, so even if we have ensure that data vector completes a cycle, the fitted one may not be a closed one. A practical workaround is to use weighted regression, imposing heavy weight on this spot to make it closed.
t <- seq_len(nrow(data) + 1)
w <- rep(1, length(t)) ## initially identical weight everywhere
w[c(1, length(w))] <- 100000 ## give heavy weight
xs <- smooth.spline(t, c(data$measuredx, data$measuredx[1]), w)$y
ys <- smooth.spline(t, c(data$measuredy, data$measuredy[1]), w)$y
with(data, plot(measuredx, measuredy), col = 8)
lines(xs, ys, lwd = 2)

Related

How to smooth a curve in R?

location diffrence<-c(0,0.5,1,1.5,2)
Power<-c(0,0.2,0.4,0.6,0.8,1)
plot(location diffrence,Power)
The guy which has written the paper said he has smoothed the curve using a weighted moving average with weights vector w = (0.25,0.5,0.25) but he did not explained how he did this and with which function he achieved that.i am really confused
Up front, as #MartinWettstein cautions, be careful in when you smooth data and what you do with it (infer from it). Having said that, a simple exponential moving average might look like this.
# replacement data
x <- seq(0, 2, len=5)
y <- c(0, 0.02, 0.65, 1, 1)
# smoothed
ysm <-
zoo::rollapply(c(NA, y, NA), 3,
function(a) Hmisc::wtd.mean(a, c(0.25, 0.5, 0.25), na.rm = TRUE),
partial = FALSE)
# plot
plot(x, y, type = "b", pch = 16)
lines(x, ysm, col = "red")
Notes:
the zoo:: package provides a rolling window (3-wide here), calling the function once for indices 1-3, then again for indices 2-4, then 3-5, 4-6, etc.
with rolling-window operations, realize that they can be center-aligned (default of zoo::rollapply) or left/right aligned. There are some good explanations here: How to calculate 7-day moving average in R?)
I surround the y data with NAs so that I can mimic a partial window. Normally with rolling-window ops, if k=3, then the resulting vector is length(y) - (k-1) long. I'm inferring that you want to include data on the ends, so the first smoothed data point would be effectively (0.5*0 + 0.25*0.02)/0.75, the second smoothed data point (0.25*0 + 0.5*0.02 + 0.25*0.65)/1, and the last smoothed data point (0.25*1 + 0.5*1)/0.75. That is, omitting the 0.25 times a missing data point. That's a guess and can easily be adjusted based on your real needs.
I'm using Hmisc::wtd.mean, though it is trivial to write this weighted-mean function yourself.
This is suggestive only, and not meant to be authoritative. Just to help you begin exploring your smoothing processes.

Using R and Sensor Accelerometer Data to Detect a Jump

I'm fascinated by sensor data. I used my iPhone and an app called SensorLog to capture
accelerometer data while I stand and push my legs to jump.
My goal is to use R to create a model which can identify jumps and how long I'm in the air.
I'm unsure how to proceed in such a challenge. I have a timeseries with accelerometer data.
https://drive.google.com/file/d/0ByWxsCBUWbqRcGlLVTVnTnZIVVk/view?usp=sharing
Some questions:
How can a jump be detected in timeseries data?
How to identify the air time part?
How to train such a model?
Below is the R code used to create the graphs above, which is me standing and doing a simple jump.
Thanks!
# Training set
sample <- read.csv("sample-data.csv")
# Sum gravity
sample$total_gravity <- sqrt(sample$accelerometerAccelerationX^2+sample$accelerometerAccelerationY^2+sample$accelerometerAccelerationZ^2)
# Smooth our total gravity to remove noise
f <- rep(1/4,4)
sample$total_gravity_smooth <- filter(sample$total_gravity, f, sides=2)
# Removes rows with NA from smoothing
sample<-sample[!is.na(sample$total_gravity_smooth),]
#sample$test<-rollmaxr(sample$total_gravity_smooth, 10, fill = NA, align = "right")
# Plot gravity
plot(sample$total_gravity, type="l", col=grey(.2), xlab="Series", ylab="Gravity", main="Accelerometer Gravitational Force")
lines(sample$total_gravity_smooth, col="red")
stdevs <- mean(sample$total_gravity_smooth)+c(-2,-1,+1,+2)*sd(sample$total_gravity_smooth)
abline(h=stdevs)
This is probably less than perfect solution, but it might be enough to get you started. The first part relies on a small modification of the find_peaks function from the gazetools package.
find_maxima <- function(x, threshold)
{
ranges <- find_peak_ranges(x, threshold)
peaks <- NULL
if (!is.null(ranges)) {
for (i in 1:nrow(ranges)) {
rnge <- ranges[i, 1]:ranges[i, 2]
r <- x[rnge]
peaks <- c(peaks, rnge[which(r == max(r))])
}
}
peaks
}
find_minima <- function(x, threshold)
{
ranges <- find_peak_ranges(x, threshold)
peaks <- NULL
if (!is.null(ranges)) {
for (i in 1:nrow(ranges)) {
rnge <- ranges[i, 1]:ranges[i, 2]
r <- x[rnge]
peaks <- c(peaks, rnge[which(r == min(r))])
}
}
peaks
}
In order to get the find_maxima and find_minima functions to give us what we're looking for we are going to need to smooth the total_gravity data even further:
spline <- smooth.spline(sample$loggingSample, y = sample$total_gravity, df = 30)
Note: I 'zeroed out' total gravity (sample$total_gravity <- sample$total_gravity - 1)
Next, pull out the smoothed x and y values:
out <- as.data.frame(cbind(spline$x,spline$y))
Then find our local maxima and minima
max <- find_maxima(out$y, threshold = 0.4)
min <- find_minima(out$y, threshold = -0.4)
And then plot the data to make sure everything looks legit:
plot(out$y, type="l", col=grey(.2), xlab="Series", ylab="Gravity", main="Accelerometer Gravitational Force")
lines(out$y, col="red")
stdevs <- mean(out$y)+c(-2,-1,+1,+2)*sd(out$y)
abline(h=stdevs)
abline(v=max[1], col = 'green')
abline(v=max[2], col = 'green')
abline(v=min[1], col = 'blue')
And finally, we can see how long you were off the ground.
print(hangtime <- min[1] - max[1])
[1] 20
You can reduce your thresholds to get additional datapoints (changes in acceleration).
Hope this helps!
I would consider a few things:
Smooth the data by collecting median values every 100ms - accelerometer data on iPhones is not perfectly accurate, so this approach will help.
Identify turningpoints as #scribbles suggests.
There is code available in my github repository that could be modified to help with both of these issues. A PDF with some explanation is here: https://github.com/MonteShaffer/mPowerEI/blob/master/mPowerEI/example/challenge-1a.pdf
Specifically, take a look at:
library(devtools);
install_github("MonteShaffer/mPowerEI", subdir="mPowerEI");
library(mPowerEI);
# data smoothing
?scaleToTimeIncrement
# turning points
?pastecs::turnpoints

Graphing a polynomial output of calc.poly

I apologize first for bringing what I imagine to be a ridiculously simple problem here, but I have been unable to glean from the help file for package 'polynom' how to solve this problem. For one out of several years, I have two vectors of x (d for day of year) and y (e for an index of egg production) data:
d=c(169,176,183,190,197,204,211,218,225,232,239,246)
e=c(0,0,0.006839425,0.027323127,0.024666883,0.005603878,0.016599262,0.002810977,0.00560387 8,0,0.002810977,0.002810977)
I want to, for each year, use the poly.calc function to create a polynomial function that I can use to interpolate the timing of maximum egg production. I want then to superimpose the function on a plot of the data. To begin, I have no problem with the poly.calc function:
egg1996<-poly.calc(d,e)
egg1996
3216904000 - 173356400*x + 4239900*x^2 - 62124.17*x^3 + 605.9178*x^4 - 4.13053*x^5 +
0.02008226*x^6 - 6.963636e-05*x^7 + 1.687736e-07*x^8
I can then simply
plot(d,e)
But when I try to use the lines function to superimpose the function on the plot, I get confused. The help file states that the output of poly.calc is an object of class polynomial, and so I assume that "egg1996" will be the "x" in:
lines(x, len = 100, xlim = NULL, ylim = NULL, ...)
But I cannot seem to, based on the example listed:
lines (poly.calc( 2:4), lty = 2)
Or based on the arguments:
x an object of class "polynomial".
len size of vector at which evaluations are to be made.
xlim, ylim the range of x and y values with sensible defaults
Come up with a command that successfully graphs the polynomial "egg1996" onto the raw data.
I understand that this question is beneath you folks, but I would be very grateful for a little help. Many thanks.
I don't work with the polynom package, but the resultant data set is on a completely different scale (both X & Y axes) than the first plot() call. If you don't mind having it in two separate panels, this provides both plots for comparison:
library(polynom)
d <- c(169,176,183,190,197,204,211,218,225,232,239,246)
e <- c(0,0,0.006839425,0.027323127,0.024666883,0.005603878,
0.016599262,0.002810977,0.005603878,0,0.002810977,0.002810977)
egg1996 <- poly.calc(d,e)
par(mfrow=c(1,2))
plot(d, e)
plot(egg1996)

Levy Walk simulation in R

I am trying to generate a series of numbers to simulate a Levy Walk in R. Currently I am using the following code:
alpha=2
n=1000
x=rep(0,n)
y=rep(0,n)
for (i in 2:n){
theta=runif(1)*2*pi
f=runif(1)^(-1/alpha)
x[i]=x[i-1]+f*cos(theta)
y[i]=y[i-1]+f*sin(theta)
}
The code is working as expected and I am able to generate the numbers according to my requirements. The figure below shows on such Levy Walk:
The following histogram confirms that the numbers generated (i.e. f) actually belong to a power law:
My question is as follows:
The step lengths generated (i.e. f) are quite large. Haw can I modify the code so that the step lengths only fall within some bound [fmin, fmax]?
P.S. I have intentionally not vectorized the code.
Try using this:
f=runif(1, fmax^(-alpha), fmin^(-alpha))^(-1/alpha)
Note that you need 0 < fmin < fmax.
BTW, you can vectorize your code like this:
theta <- runif(n-1)*2*pi
f <- runif(n-1, fmax^(-alpha), fmin^(-alpha))^(-1/alpha)
x <- c(0, cumsum(f*cos(theta)))
y <- c(0, cumsum(f*sin(theta)))
Just for precision, what you're simmulating here is a Lévy flight. For it to be a Lévy walk, you should allow the particle to "walk" from the beginning to the end of each flight (with a for, for example). If you plot your resulting simmulation with plot(x, y, type = "o") you will see that there are no positions within flights (no walking) using your code.
library(ggplot2)
library(gridExtra)
alpha= 5
n= 1000
x= rep(0,n)
y= rep(0,n)
fmin= 1
fmax= n
for (i in 2:n){
theta= runif(n-1)*2*pi
f= runif(n-1, fmax^(-alpha), fmin^(-alpha))^(-1/alpha)
x= c(0, cumsum(f*cos(theta)))
y= c(0, cumsum(f*sin(theta)))
}
ggplot(data.frame(x=x, y=y), aes(x, y))+geom_point()+geom_path()

How to plot a violin scatter boxplot (in R)?

I just came by the following plot:
And wondered how can it be done in R? (or other softwares)
Update 10.03.11: Thank you everyone who participated in answering this question - you gave wonderful solutions! I've compiled all the solution presented here (as well as some others I've came by online) in a post on my blog.
Make.Funny.Plot does more or less what I think it should do. To be adapted according to your own needs, and might be optimized a bit, but this should be a nice start.
Make.Funny.Plot <- function(x){
unique.vals <- length(unique(x))
N <- length(x)
N.val <- min(N/20,unique.vals)
if(unique.vals>N.val){
x <- ave(x,cut(x,N.val),FUN=min)
x <- signif(x,4)
}
# construct the outline of the plot
outline <- as.vector(table(x))
outline <- outline/max(outline)
# determine some correction to make the V shape,
# based on the range
y.corr <- diff(range(x))*0.05
# Get the unique values
yval <- sort(unique(x))
plot(c(-1,1),c(min(yval),max(yval)),
type="n",xaxt="n",xlab="")
for(i in 1:length(yval)){
n <- sum(x==yval[i])
x.plot <- seq(-outline[i],outline[i],length=n)
y.plot <- yval[i]+abs(x.plot)*y.corr
points(x.plot,y.plot,pch=19,cex=0.5)
}
}
N <- 500
x <- rpois(N,4)+abs(rnorm(N))
Make.Funny.Plot(x)
EDIT : corrected so it always works.
I recently came upon the beeswarm package, that bears some similarity.
The bee swarm plot is a
one-dimensional scatter plot like
"stripchart", but with closely-packed,
non-overlapping points.
Here's an example:
library(beeswarm)
beeswarm(time_survival ~ event_survival, data = breast,
method = 'smile',
pch = 16, pwcol = as.numeric(ER),
xlab = '', ylab = 'Follow-up time (months)',
labels = c('Censored', 'Metastasis'))
legend('topright', legend = levels(breast$ER),
title = 'ER', pch = 16, col = 1:2)
(source: eklund at www.cbs.dtu.dk)
I have come up with the code similar to Joris, still I think this is more than a stem plot; here I mean that they y value in each series is a absolute value of a distance to the in-bin mean, and x value is more about whether the value is lower or higher than mean.
Example code (sometimes throws warnings but works):
px<-function(x,N=40,...){
x<-sort(x);
#Cutting in bins
cut(x,N)->p;
#Calculate the means over bins
sapply(levels(p),function(i) mean(x[p==i]))->meansl;
means<-meansl[p];
#Calculate the mins over bins
sapply(levels(p),function(i) min(x[p==i]))->minl;
mins<-minl[p];
#Each dot is one value.
#X is an order of a value inside bin, moved so that the values lower than bin mean go below 0
X<-rep(0,length(x));
for(e in levels(p)) X[p==e]<-(1:sum(p==e))-1-sum((x-means)[p==e]<0);
#Y is a bin minum + absolute value of a difference between value and its bin mean
plot(X,mins+abs(x-means),pch=19,cex=0.5,...);
}
Try the vioplot package:
library(vioplot)
vioplot(rnorm(100))
(with awful default color ;-)
There is also wvioplot() in the wvioplot package, for weighted violin plot, and beanplot, which combines violin and rug plots. They are also available through the lattice package, see ?panel.violin.
Since this hasn't been mentioned yet, there is also ggbeeswarm as a relatively new R package based on ggplot2.
Which adds another geom to ggplot to be used instead of geom_jitter or the like.
In particular geom_quasirandom (see second example below) produces really good results and I have in fact adapted it as default plot.
Noteworthy is also the package vipor (VIolin POints in R) which produces plots using the standard R graphics and is in fact also used by ggbeeswarm behind the scenes.
set.seed(12345)
install.packages('ggbeeswarm')
library(ggplot2)
library(ggbeeswarm)
ggplot(iris,aes(Species, Sepal.Length)) + geom_beeswarm()
ggplot(iris,aes(Species, Sepal.Length)) + geom_quasirandom()
#compare to jitter
ggplot(iris,aes(Species, Sepal.Length)) + geom_jitter()

Resources