Perona-Malik model in R for smoothing time series of data - r

Recently, I used the Savitzky-Golay filter from the signal package to smooth my data, but it does not work well. I have heard that Perona-Malik is a good smoothing method for this task, but I could not implement it. My question: is it possible to smooth the data with the Perona-Malik (P&M) model in R?
Thanks
hees
Simple example.
library(signal)
bf <- butter(5,1/3)
x <- c(rep(0,15), rep(10, 10), rep(0, 15))
###
sg <- sgolayfilt(x) # replace at here
plot(sg, type="l")
lines(filtfilt(rep(1, 5)/5,1,x), col = "red") # averaging filter
lines(filtfilt(bf,x), col = "blue") # butterworth
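A minimal sketch of 1-D Perona-Malik (anisotropic) diffusion in base R, in case it helps as a starting point. It uses the explicit finite-difference scheme with the edge-stopping function g(d) = 1 / (1 + (d / kappa)^2) and zero-flux boundaries. The function name perona_malik_1d and the parameters n_iter, kappa and lambda are illustrative choices of mine, not part of the signal package or any other library, and the default values are assumptions that would need tuning for real data.

```r
# Sketch of 1-D Perona-Malik diffusion (hypothetical helper, not a package API).
# kappa  : gradient scale; differences much larger than kappa are treated as edges
# lambda : step size; the explicit 1-D scheme needs lambda <= 0.5 to stay stable
perona_malik_1d <- function(x, n_iter = 50, kappa = 2, lambda = 0.25) {
  stopifnot(is.numeric(x), length(x) >= 3, lambda > 0, lambda <= 0.5)
  g <- function(d) 1 / (1 + (d / kappa)^2)  # edge-stopping function
  u <- as.numeric(x)
  for (k in seq_len(n_iter)) {
    d_east <- c(diff(u), 0)    # u[i + 1] - u[i], zero flux at the right end
    d_west <- c(0, -diff(u))   # u[i - 1] - u[i], zero flux at the left end
    u <- u + lambda * (g(d_east) * d_east + g(d_west) * d_west)
  }
  u
}

# Step signal from the question, with some added noise (assumed for illustration)
set.seed(1)
x <- c(rep(0, 15), rep(10, 10), rep(0, 15))
noisy <- x + rnorm(length(x), sd = 0.5)
sm <- perona_malik_1d(noisy)
```

To compare with the filters in the question, one could add lines(perona_malik_1d(x), col = "darkgreen") to the existing plot. Smaller kappa protects more edges but removes less noise; more iterations smooth more strongly.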

Related

How do I plot multiple lines on the same graph?

I am using R. I am trying to use the "lines" command in ggplot2 to show the predicted values vs. the actual values for a statistical model (ARIMA, time series). Yet when I run the code, I can only see a line of one color.
I simulated some data in R and then tried to make plots that show actual vs predicted:
#set seed
set.seed(123)
#load libraries
library(xts)
library(stats)
#create data
date_decision_made = seq(as.Date("2014/1/1"), as.Date("2016/1/1"),by="day")
date_decision_made <- format(as.Date(date_decision_made), "%Y/%m/%d")
property_damages_in_dollars <- rnorm(731,100,10)
final_data <- data.frame(date_decision_made, property_damages_in_dollars)
#aggregate
y.mon<-aggregate(property_damages_in_dollars~format(as.Date(date_decision_made),
format="%W-%y"),data=final_data, FUN=sum)
y.mon$week = y.mon$`format(as.Date(date_decision_made), format = "%W-%y")`
ts = ts(y.mon$property_damages_in_dollars, start = c(2014,1), frequency = 12)
#statistical model
fit = arima(ts, order = c(4, 1, 1))
Here were my attempts at plotting the graphs:
#first attempt at plotting (no second line?)
plot(fit$residuals, col="red")
lines(fitted(fit),col="blue")
#second attempt at plotting (no second line?)
par(mfrow = c(2,1),
oma = c(0,0,0,0),
mar = c(2,4,1,1))
plot(ts, main="as-is") # plot original sim
lines(fitted(fit), col = "red") # plot fitted values
legend("topleft", legend = c("original","fitted"), col = c("black","red"),lty = 1)
#third attempt (plot actual, predicted and 5 future values - here, the actual and future values show up, but not the predicted)
pred = predict(fit, n.ahead = 5)
ts.plot(ts, pred$pred, lty = c(1,3), col=c(5,2))
However, none of these seem to be working correctly. Could someone please tell me what I am doing wrong? (note: the computer I am using for my work does not have an internet connection or a usb port - it only has R with some preloaded packages. I do not have access to the forecast package.)
Thanks
Sources:
In R plot arima fitted model with the original series
R fitted ARIMA off by one timestep? pkg:Forecast
Plotting predicted values in ARIMA time series in R
You seem to be confusing a couple of things:
fitted() usually does not work on an object of class arima. Usually, you can load the forecast package first and then use fitted(). But since you do not have access to the forecast package, you cannot use fitted(fit): it always returns NULL. I have had problems with fitted() before.
You want to compare the actual series (x) to the fitted series (y), yet in your first attempt you work with the residuals (e = x - y).
You say you are using ggplot2, but actually you are not.
So here is a small example of how to plot the actual series and the fitted series without ggplot2.
set.seed(1)
x <- cumsum(rnorm(10))
y <- stats::arima(x, order = c(1, 0, 0))
plot(x, col = "red", type = "l")
lines(x - y$residuals, col = "blue")
I hope this answer helps you get back on track.
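The same idea can be extended to the third attempt in the question (actual, fitted and a few forecast values in one plot). This is only a sketch on simulated data, using the stats package alone; the series, the ARIMA order and the colors are assumptions chosen for illustration, not the asker's real setup.

```r
# Simulated monthly series (illustrative only)
set.seed(123)
x <- ts(cumsum(rnorm(60)), start = c(2014, 1), frequency = 12)

fit <- arima(x, order = c(1, 1, 0))

# Fitted values recovered as actual minus residuals (no forecast package needed)
fitted_vals <- x - residuals(fit)

# Five-step-ahead forecast
pred <- predict(fit, n.ahead = 5)

# Actual (black), fitted (red) and forecast (blue, dotted) on one set of axes
ts.plot(x, fitted_vals, pred$pred,
        gpars = list(col = c("black", "red", "blue"), lty = c(1, 1, 3)))
legend("topleft", legend = c("actual", "fitted", "forecast"),
       col = c("black", "red", "blue"), lty = c(1, 1, 3))
```

Because the fitted series is built from the residuals, it has the same time index as the original series, so both lines overlay correctly.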

Trying to find a way to combine IRT info plots from 3 different mirt models in R in the same

I am looking to combine all three "test information function" lines (one for each model) into one and the same graph. I have a data set of category 1-5 Likert responses in 400 rows, in sets of 8 columns (one for each item). I have run three IRT models on these sets using the mirt package in R and produced test info plots. I would like to combine the IRT test info plots from three different (graded response) models, as three lines, in one and the same grid.
plot(PFgrmodel29, type = 'info', xlim = c(-4, 4), ylim=c(0,85))
plot(PFgrmodel43, type = 'info', xlim = c(-4, 4), ylim=c(0,85))
plot(PFgrmodel57, type = 'info', xlim = c(-4, 4), ylim=c(0,85))
Example of test info plot:
How can I achieve this with mirt, lattice, ggplot2 or similar?
Your plots from the mirt package are lattice objects, so you can try using latticeExtra. Since you did not provide your dataset, the example code below uses the example dataset in the package:
library(mirt)
library(latticeExtra)
fulldata <- expand.table(LSAT7)
mod1 <- mirt(fulldata,1,SE=TRUE)
mod2 <- mirt(fulldata,1, itemtype = 'Rasch')
mod3 <- mirt(fulldata,1,itemtype='ideal')
key=list(columns=2,
text=list(lab=c("mod1","mod2","mod3")),
lines=list(lwd=4, col=c("blue","orange","red"))
)
p1 = plot(mod1,type="info",key=key)
p2 = update(plot(mod2,type="info"),col="orange")
p3 = update(plot(mod3,type="info"),col="red")
p1+p2+p3
That is just beautiful! Works like a charm, except I needed to add ylim=c(0,100) to make the y axis taller to fit the data. I thought that placing the model with the highest info curve first (as mod1) would do it, but no. Thank you Stupidwolf so much for providing the code!! No need for latticeExtra package.
Also, I had to retain the "model" part of the code for this to work:
model <- 'F = 1-5 PRIOR = (5, g, norm, -1.5, 3)'
My code looks like this now:
library(mirt)
library(latticeExtra)
model <- 'F = 1-5 PRIOR = (5, g, norm, -1.5, 3)'
mod1 <- mirt(PFdata57,1,itemtype="graded", SE=TRUE)
mod2 <- mirt(PFdata43,1,itemtype="graded", SE=TRUE)
mod3 <- mirt(PFdata29,1,itemtype="graded", SE=TRUE)
key=list(columns=1,
text=list(lab=c("P57/PF Short form 8a","P43/PF Short form 6a","P29/PF Short form 4a")),
lines=list(lwd=4, col=c("blue","orange","red")))
p1 = plot(mod1,type="info",key=key,xlim=c(-4,4),ylim=c(0,85))
p2 = update(plot(mod2,type="info"),col="orange")
p3 = update(plot(mod3,type="info"),col="red")
p1+p2+p3

R: Find outliers with mvBACON

I'm new to R and working on an assignment where I am supposed to replicate the results from a linear regression (time series data with 1360 observations and 52 variables, 11 of them in the regression model). In the original study the researchers identified outliers with the Hadi method. It seems that this is best done in R with the mvBACON function; is this correct? I cannot seem to find a good explanation of how to use it, though. Could anyone please tell me how I can use this function to find the outliers?
(I would very much appreciate an answer that is explained as simply as possible since R is very new to me).
Thank you very much!
Yes, mvBACON identifies outliers based on a distance measure. The default is the Mahalanobis distance.
The following code walks you through a simple example, using a subset of the mtcars dataset, of how to identify outliers with mvBACON:
# load packages
library(dplyr)
library(magrittr)
# Use mtcars (sub)dataset and plot it
data <- mtcars %>% select(mpg, disp)
plot(data, main = "mtcars")
# Add some outliers and plot again
data <- rbind(data,
data.frame(mpg = c(1, 80), disp = c(800, 1000)))
plot(data, main = "mtcars")
# Use mvBACON to calculate the distances and get the outliers
# install.packages("robustX") # uncomment line to install package
library(robustX)
# compute distance - default is Mahalanobis
distances <- mvBACON(data)
# Plot it again...
plot(data, main = "mtcars")
# ...with highlighting the outliers
points(data[!distances$subset, ], col = "red", pch = 19)
# Some fine tuning, since many of the outliers seem to be still good for regression
distances <- mvBACON(data, alpha = 0.6)
# update plot
plot(data, main = "mtcars")
points(data[!distances$subset, ], col = "red", pch = 19)
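For intuition about the distance involved, here is a base-R sketch of the classical (non-robust) Mahalanobis check on the same toy data. This is not the BACON algorithm: mvBACON starts from a small clean subset and grows it iteratively, which makes it more resistant to outliers masking each other. The 0.975 chi-square cutoff is a common convention that I am assuming here, not something taken from the original study.

```r
# Same toy data as above: mtcars subset plus two artificial outliers
data <- mtcars[, c("mpg", "disp")]
data <- rbind(data, data.frame(mpg = c(1, 80), disp = c(800, 1000)))

# Squared Mahalanobis distance of every row from the sample mean
m  <- as.matrix(data)
d2 <- mahalanobis(m, center = colMeans(m), cov = cov(m))

# Flag rows whose squared distance exceeds a chi-square quantile (assumed cutoff)
cutoff     <- qchisq(0.975, df = ncol(m))
is_outlier <- d2 > cutoff

# Row indices of the flagged observations
which(is_outlier)
```

Because the mean and covariance here are computed from all rows, including the outliers themselves, this classical version can miss outliers that a robust method such as mvBACON would catch.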

How to increase the size of the text in a Bayesian network plot with bnlearn in R

I am trying to draw a Bayesian network in R with bnlearn. Here is my R code:
library(bnlearn)
library(Rgraphviz)
first_variable <- rnorm(100)
second_variable <- rnorm(100)
third_variable <- rnorm(100)
v <- data.frame(first_variable,second_variable,third_variable)
b <- hc(v)
hlight <- list(nodes = nodes(b), arcs = arcs(b),col = "grey", textCol = "red")
pp <- graphviz.plot(b, highlight = hlight)
The code above works, but the text in the plot is much smaller than I expected. Here it is:
I think that is because my variables have long names. In my real data, the variable names are even longer. Here is the BN plot for my real dataset:
Is there any way to increase the size of the text in the plot?
This is basically answered in the post here (although that wasn't the OP's only question).
The two approaches suggested are to change the text size globally:
par(cex=0.05)
graphviz.plot(res, highlight =
list(nodes=nodes(res), fill="lightgreen", col="black"))
But I don't find that this works.
Alternatively (and this is what I have been doing), change the node characteristics separately:
g <- Rgraphviz::layoutGraph(bnlearn::as.graphNEL(b))
graph::nodeRenderInfo(g) <- list(fontsize=20)
Rgraphviz::renderGraph(g)

Using R and Sensor Accelerometer Data to Detect a Jump

I'm fascinated by sensor data. I used my iPhone and an app called SensorLog to capture
accelerometer data while I stand and push my legs to jump.
My goal is to use R to create a model which can identify jumps and how long I'm in the air.
I'm unsure how to proceed in such a challenge. I have a timeseries with accelerometer data.
https://drive.google.com/file/d/0ByWxsCBUWbqRcGlLVTVnTnZIVVk/view?usp=sharing
Some questions:
How can a jump be detected in timeseries data?
How to identify the air time part?
How to train such a model?
Below is the R code used to create the graphs above, which is me standing and doing a simple jump.
Thanks!
# Training set
sample <- read.csv("sample-data.csv")
# Sum gravity
sample$total_gravity <- sqrt(sample$accelerometerAccelerationX^2+sample$accelerometerAccelerationY^2+sample$accelerometerAccelerationZ^2)
# Smooth our total gravity to remove noise
f <- rep(1/4,4)
sample$total_gravity_smooth <- stats::filter(sample$total_gravity, f, sides=2) # stats:: avoids a clash with dplyr::filter if dplyr is loaded
# Removes rows with NA from smoothing
sample<-sample[!is.na(sample$total_gravity_smooth),]
#sample$test<-rollmaxr(sample$total_gravity_smooth, 10, fill = NA, align = "right")
# Plot gravity
plot(sample$total_gravity, type="l", col=grey(.2), xlab="Series", ylab="Gravity", main="Accelerometer Gravitational Force")
lines(sample$total_gravity_smooth, col="red")
stdevs <- mean(sample$total_gravity_smooth)+c(-2,-1,+1,+2)*sd(sample$total_gravity_smooth)
abline(h=stdevs)
This is probably a less-than-perfect solution, but it might be enough to get you started. The first part relies on a small modification of the find_peaks function from the gazetools package.
find_maxima <- function(x, threshold)
{
ranges <- find_peak_ranges(x, threshold)
peaks <- NULL
if (!is.null(ranges)) {
for (i in 1:nrow(ranges)) {
rnge <- ranges[i, 1]:ranges[i, 2]
r <- x[rnge]
peaks <- c(peaks, rnge[which(r == max(r))])
}
}
peaks
}
find_minima <- function(x, threshold)
{
ranges <- find_peak_ranges(x, threshold)
peaks <- NULL
if (!is.null(ranges)) {
for (i in 1:nrow(ranges)) {
rnge <- ranges[i, 1]:ranges[i, 2]
r <- x[rnge]
peaks <- c(peaks, rnge[which(r == min(r))])
}
}
peaks
}
In order to get the find_maxima and find_minima functions to give us what we're looking for we are going to need to smooth the total_gravity data even further:
spline <- smooth.spline(sample$loggingSample, y = sample$total_gravity, df = 30)
Note: before fitting the spline, I 'zeroed out' total gravity (sample$total_gravity <- sample$total_gravity - 1)
Next, pull out the smoothed x and y values:
out <- data.frame(x = spline$x, y = spline$y) # name the columns explicitly so that out$y exists below
Then find our local maxima and minima
max <- find_maxima(out$y, threshold = 0.4)
min <- find_minima(out$y, threshold = -0.4)
And then plot the data to make sure everything looks legit:
plot(out$y, type="l", col=grey(.2), xlab="Series", ylab="Gravity", main="Accelerometer Gravitational Force")
lines(out$y, col="red")
stdevs <- mean(out$y)+c(-2,-1,+1,+2)*sd(out$y)
abline(h=stdevs)
abline(v=max[1], col = 'green')
abline(v=max[2], col = 'green')
abline(v=min[1], col = 'blue')
And finally, we can see how long you were off the ground.
print(hangtime <- min[1] - max[1])
[1] 20
You can reduce your thresholds to get additional datapoints (changes in acceleration).
Hope this helps!
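A simpler threshold-based alternative, sketched in base R: while you are in the air the phone is roughly in free fall, so the total acceleration should drop close to 0 g. The helper below finds the longest run of samples under a threshold and converts its length to seconds. The function name detect_air_time, the 0.3 g threshold and the 100 Hz sampling rate are all assumptions for illustration; the real rate has to come from the SensorLog settings, and the input here is the raw magnitude (not the zeroed-out version).

```r
# Hypothetical helper: longest run of near-free-fall samples.
# total_g   : total acceleration magnitude in g (about 1 when standing still)
# fs        : sampling rate in Hz (assumed known)
# threshold : values below this count as "in the air" (assumed, needs tuning)
detect_air_time <- function(total_g, fs, threshold = 0.3) {
  runs   <- rle(total_g < threshold)      # consecutive TRUE/FALSE runs
  ends   <- cumsum(runs$lengths)          # last index of each run
  starts <- ends - runs$lengths + 1       # first index of each run
  flight <- which(runs$values)            # runs that are below the threshold
  if (length(flight) == 0) return(NULL)   # no airborne phase found
  longest <- flight[which.max(runs$lengths[flight])]
  list(start = starts[longest],
       end = ends[longest],
       air_time_s = runs$lengths[longest] / fs)
}

# Synthetic jump: standing, push-off, flight, landing, standing
total_g <- c(rep(1, 50), rep(2.5, 10), rep(0.1, 30), rep(3, 8), rep(1, 50))
res <- detect_air_time(total_g, fs = 100)
```

On real data it would probably make sense to smooth first (as in the question) so that single noisy samples do not split the airborne run in two.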
I would consider a few things:
Smooth the data by taking the median value of every 100 ms window. Accelerometer data on iPhones is not perfectly accurate, so this approach will help.
Identify turning points, as @scribbles suggests.
There is code available in my github repository that could be modified to help with both of these issues. A PDF with some explanation is here: https://github.com/MonteShaffer/mPowerEI/blob/master/mPowerEI/example/challenge-1a.pdf
Specifically, take a look at:
library(devtools);
install_github("MonteShaffer/mPowerEI", subdir="mPowerEI");
library(mPowerEI);
# data smoothing
?scaleToTimeIncrement
# turning points
?pastecs::turnpoints
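If installing those packages is not an option, the two suggestions can be roughly approximated in base R. This is a sketch only, not the implementation used in mPowerEI or pastecs; bin_median and turning_points are hypothetical helper names, and timestamps are assumed to be available in integer milliseconds.

```r
# Median of the signal in consecutive windows of width_ms milliseconds
bin_median <- function(t_ms, x, width_ms = 100L) {
  bin <- as.integer(t_ms) %/% as.integer(width_ms)   # window index per sample
  data.frame(
    time_ms = as.numeric(tapply(t_ms, bin, min)),    # start time of each window
    value   = as.numeric(tapply(x, bin, median))     # median within each window
  )
}

# Indices where the slope changes sign (local maxima and minima).
# Flat stretches (zero slope) are not handled specially in this sketch.
turning_points <- function(x) {
  which(diff(sign(diff(x))) != 0) + 1
}

# Synthetic example: one sine period sampled every 10 ms
t_ms   <- seq(0L, 990L, by = 10L)
x      <- sin(2 * pi * t_ms / 1000)
binned <- bin_median(t_ms, x)
tp     <- turning_points(binned$value)
```

Working with integer milliseconds avoids floating-point surprises when assigning samples to windows.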
