Construct a specific plot of time series using R - r

My problem is this: I generate a time series from a normal distribution and plot it, but I want to color in red the positive area between the series and the x-axis, and likewise color the negative area between the x-axis and the series.
This is the code I use, but it does not work:
x1<-rnorm(250,0.4,0.9)
x <- as.matrix(x1)
t <- ts(x[,1], start=c(1,1), frequency=30)
plot(t,main="Daily closing price of Walterenergie",ylab="Adjusted close Returns",xlab="Times",col="blue")
plot(t,xlim=c(2,4),main="Daily closing price of Walterenergie",ylab="Adjusted close Returns",xlab="Times",col="blue")
abline(0,0)
z1<-seq(2,4,0.001)
cord.x <- c(2,z1,4)
cord.y <- c(0,t(z1),0)
polygon(cord.x,cord.y,col='red')

Edit: In response to OP's additional query.
library(ggplot2)
df <- data.frame(t=1:nrow(x),y=x)
df$fill <- ifelse(x>0,"Above","Below")
ggplot(df)+geom_line(aes(t,y),color="grey")+
geom_ribbon(aes(x=t,ymin=0,ymax=ifelse(y>0,y,0)),fill="red")+
geom_ribbon(aes(x=t,ymin=0,ymax=ifelse(y<0,y,0)),fill="blue")+
labs(title="Daily closing price of Walterenergie",
y="Adjusted close Returns",
x="Times")
Original response:
Is this what you had in mind?
library(ggplot2)
df <- data.frame(t=1:nrow(x),y=x)
ggplot(df)+geom_line(aes(t,y),color="grey")+
geom_ribbon(aes(x=t,ymin=0,ymax=y),fill="red")+
labs(title="Daily closing price of Walterenergie",
y="Adjusted close Returns",
x="Times")

This is some code I wrote a while ago for someone. It uses two different colors for the positive and negative areas. Although it is not exactly what you're after, I thought I'd share it.
# Set a seed to get a reproducible example
set.seed(12345)
num.points <- 100
# Create some data
x.vals <- 1:num.points
values <- rnorm(n=num.points, mean=0, sd=10)
# Plot the graph
plot(x.vals, values, t="o", pch=20, xlab="", ylab="", las=1)
abline(h=0, col="darkgray", lwd=2)
# We need to find the intersections of the curve with the x axis
# Those lie between positive and negative points
# When the sign changes the product between subsequent elements
# will be negative
crossings <- values[-length(values)] * values[-1]
crossings <- which(crossings < 0)
# You can draw the points to check (uncomment following line)
# points(x.vals[crossings], values[crossings], col="red", pch="X")
# We now find the exact intersections using a proportion
# See? Those high school geometry problems finally come in handy
intersections <- NULL
for (cr in crossings)
{
  new.int <- cr + abs(values[cr])/(abs(values[cr])+abs(values[cr+1]))
  intersections <- c(intersections, new.int)
}
# Again, let's check the intersections
# points(intersections, rep(0, length(intersections)), pch=20, col="red", cex=0.7)
last.intersection <- 0
for (i in intersections)
{
  ids <- which(x.vals<=i & x.vals>last.intersection)
  poly.x <- c(last.intersection, x.vals[ids], i)
  poly.y <- c(0, values[ids], 0)
  if (max(poly.y) > 0)
  {
    col="green"
  }
  else
  {
    col="red"
  }
  polygon(x=poly.x, y=poly.y, col=col)
  last.intersection <- i
}
And here's the result!

Base plotting solution:
x1<-rnorm(250,0.4,0.9)
x <- as.matrix(x1)
# t <- ts(x[,1], start=c(1,1), frequency=30)
plot(x1,main="Daily closing price of Walterenergie",ylab="Adjusted close Returns",xlab="Times",col="blue", type="l")
polygon( c(0,1:250,251), c(0, x1, 0) , col="red")
Note that this doesn't handle the time-series plotting method, which is harder to follow because the x values are scaled by the frequency and start at 1. A version that works with the ts object is below:
plot(t,main="Daily closing price of Walterenergie",
ylab="Adjusted close Returns",xlab="Times",col="blue", type="l")
polygon( c(1,1+(0:250)/30), c(0, t, 0) , col="red")
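Since the question also asks for the negative area to be filled separately, here is a rough sketch of one way to do it in base graphics. It assumes the x positions are taken from time() rather than computed by hand, and it uses pmax()/pmin() to clamp the series at zero; the seed and the variable names y, tt and px are only for illustration. Because of the clamping, the fill edges near each sign change are approximate, so use the interpolation approach from the other answer if exact crossings matter.

```r
set.seed(1)                                  # only to make the illustration reproducible
y  <- ts(rnorm(250, 0.4, 0.9), start = c(1, 1), frequency = 30)
tt <- as.numeric(time(y))                    # the x positions plot.ts actually uses
n  <- length(y)

plot(y, col = "blue", ylab = "Adjusted close Returns", xlab = "Times")
abline(h = 0)

# Close each polygon on the x-axis at the first and last time points
px <- c(tt[1], tt, tt[n])
polygon(px, c(0, pmax(y, 0), 0), col = "red",  border = NA)  # part above zero
polygon(px, c(0, pmin(y, 0), 0), col = "blue", border = NA)  # part below zero
lines(y)                                     # redraw the series on top of the fill
```

Using time(y) sidesteps the frequency and start offset arithmetic, so the same lines should work for other start and frequency values.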

Related

Way to progressively overlap line plots in R

I have a for loop from which I call a function grapher() which extracts certain columns from a dataframe (position and w, both continuous variables) and plots them. My code changes the Y variable (called w here) each time it runs and so I'd like to plot it as an overlay progressively. If I run the grapher() function 4 times for example, I'd like to have 4 plots where the first plot has only 1 line, and the 4th has all 4 overlain on each other (as different colours).
I've already tried points() as suggested in other posts, but for some reason it only generates a new graph.
grapher <- function(){
  position.2L <- data[data$V1=='2L', 'V2']
  w.2L <- data[data$V1=='2L', 'w']
  plot(position.2L, w.2L)
  points(position.2L, w.2L, col='green')
}
# example of my for loop #
for (t in 1:200){
  #code here changes the 'w' variable each iteration of 't'
  if (t%%50==0){
    grapher()
  }
}
Not knowing any details about your situation I can only assume something like this might be applicable.
# Example data set
d <- data.frame(V1=rep(1:2, each=6), V2=rep(1:6, 2), w=rep(1:6, each=2))
# Prepare the matrix we will write to.
n <- 200
m <- matrix(d$w, nrow(d), n)
# Loop progressively adding more noise to the data
set.seed(1)
for (i in 2:n) {
  m[,i] <- m[,i-1] + rnorm(nrow(d), 0, 0.05)
}
# We can now plot the matrix, selecting the relevant rows and columns
matplot(m[d$V1 == 1, seq(1, n, by=50)], type="o", pch=16, lty=1)
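If you want the plots to build up progressively (one line on the first plot, four on the fourth), one possible variation is to save a snapshot of w every 50 iterations and redraw all snapshots collected so far. This is only a sketch: the starting values, the noise step standing in for "code that changes w", and the names snapshots and it are made up for illustration.

```r
set.seed(1)
n_iter <- 200
w <- 1:6                                   # hypothetical starting values of w
snapshots <- NULL                          # one column per saved state of w

for (it in 1:n_iter) {
  w <- w + rnorm(length(w), 0, 0.05)       # stand-in for the code that changes w
  if (it %% 50 == 0) {
    snapshots <- cbind(snapshots, w)
    # Redraw everything saved so far: the k-th plot shows k overlaid lines
    matplot(snapshots, type = "o", pch = 16, lty = 1,
            col = seq_len(ncol(snapshots)), xlab = "position", ylab = "w")
  }
}
```

Each call to matplot() starts a new plot, which is the intent here: every new plot repeats the earlier lines and adds the latest one in a new colour.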

New outliers appear after I remove existing ones using QQ Plot Results

I'm working on the PCA section from Julian Faraway's Linear Models with R (chapter 11, page 164).
PCA is sensitive to outliers, and the Mahalanobis distance helps us identify them.
The author checks for outliers by plotting the Mahalanobis distance against the quantiles of a chi-squared distribution.
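if (!require(faraway)) install.packages("faraway"); require(faraway)
require(MASS) # provides cov.rob()
data(fat, package='faraway')
cfat <- fat[,9:18]
n <- nrow(cfat); p <- ncol(cfat)
# md was not defined at this point in the post; it is assumed to be computed
# the same way as in the second block further down
robfat <- cov.rob(cfat)
md <- mahalanobis(cfat, center=robfat$center, cov=robfat$cov)
plot(qchisq(1:n/(n+1),p), sort(md), xlab=expression(paste(chi^2,
"quantiles")),
ylab = "Sorted Mahalanobis distances")
abline(0,1)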
I identify the points:
identify(qchisq(1:n/(n+1),p), sort(md))
It appears that the outliers are in rows 242:252. I remove these outliers and re-create the QQ Plot:
cfat.mod <- cfat[-c(242:252),] #remove outliers
robfat <- cov.rob(cfat.mod)
md <- mahalanobis(cfat.mod, center=robfat$center, cov=robfat$cov)
n <- nrow(cfat.mod); p <- ncol(cfat.mod)
plot(qchisq(1:n/(n+1),p), sort(md), xlab=expression(paste(chi^2,
"quantiles")),
ylab = "Sorted Mahalanobis distances")
abline(0,1)
identify(qchisq(1:n/(n+1),p), sort(md))
Alas, a new set of points (rows 234:241) now appear to be outliers. This keeps happening every time I remove additional outliers.
I look forward to understanding what I'm doing wrong.
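By default, identify() labels each point with its position in the plotted vector. Because you plot sort(md), those labels are ranks in the sorted distances, not row numbers in cfat. The largest distances therefore always carry the highest labels (242:252 for 252 rows, then 234:241 after 11 rows are removed), whichever rows they really belong to. To identify the points correctly, make sure the labels correspond to the positions of the points in the data. The functions order, or sort with index.return=TRUE, will give the sorted indices. Here is an example, arbitrarily removing the points with md greater than a threshold.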
## Your data
data(fat, package='faraway')
cfat <- fat[, 9:18]
n <- nrow(cfat)
p <- ncol(cfat)
md <- sort(mahalanobis(cfat, colMeans(cfat), cov(cfat)), index.return=TRUE)
xs <- qchisq(1:n/(n+1), p)
plot(xs, md$x, xlab=expression(paste(chi^2, 'quantiles')))
## Use indices in data as labels for interactive identify
identify(xs, md$x, labels=md$ix)
## remove those with md>25, for example
inds <- md$x > 25
cfat.mod <- cfat[-md$ix[inds], ]
nn <- nrow(cfat.mod)
md1 <- mahalanobis(cfat.mod, colMeans(cfat.mod), cov(cfat.mod))
## Plot the new data
par(mfrow=c(1, 2))
plot(qchisq(1:nn/(nn+1), p), sort(md1), xlab='chisq quantiles', ylab='')
abline(0, 1, col='red')
car::qqPlot(md1, distribution='chisq', df=p, line='robust', main='With car::qqPlot')
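To make the difference between "position in the sorted vector" and "row in the data" concrete, here is a tiny sketch with five made-up distances (the numbers and variable names are purely illustrative):

```r
d <- c(2.1, 9.7, 0.4, 5.0, 12.3)        # toy distances, one per data row
s <- sort(d, index.return = TRUE)        # s$x: sorted values, s$ix: original rows

# The two largest values sit at sorted positions 4 and 5 ...
top2_positions <- tail(seq_along(s$x), 2)
# ... but they come from rows 2 and 5 of the data
top2_rows <- tail(s$ix, 2)

wrong <- d[-top2_positions]              # drops rows 4 and 5: 9.7 is still there
right <- d[-top2_rows]                   # drops rows 2 and 5: both large values gone
```

Dropping by sorted position removes whatever happens to sit in the last rows, which is why a fresh batch of "outliers" shows up at the end of the data after every removal.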

Using R and Sensor Accelerometer Data to Detect a Jump

I'm fascinated by sensor data. I used my iPhone and an app called SensorLog to capture
accelerometer data while I stand and push my legs to jump.
My goal is to use R to create a model which can identify jumps and how long I'm in the air.
I'm unsure how to proceed in such a challenge. I have a timeseries with accelerometer data.
https://drive.google.com/file/d/0ByWxsCBUWbqRcGlLVTVnTnZIVVk/view?usp=sharing
Some questions:
How can a jump be detected in timeseries data?
How to identify the air time part?
How to train such a model?
Below is the R code used to create the graphs above, which is me standing and doing a simple jump.
Thanks!
# Training set
sample <- read.csv("sample-data.csv")
# Sum gravity
sample$total_gravity <- sqrt(sample$accelerometerAccelerationX^2+sample$accelerometerAccelerationY^2+sample$accelerometerAccelerationZ^2)
# Smooth our total gravity to remove noise
f <- rep(1/4,4)
sample$total_gravity_smooth <- filter(sample$total_gravity, f, sides=2)
# Removes rows with NA from smoothing
sample<-sample[!is.na(sample$total_gravity_smooth),]
#sample$test<-rollmaxr(sample$total_gravity_smooth, 10, fill = NA, align = "right")
# Plot gravity
plot(sample$total_gravity, type="l", col=grey(.2), xlab="Series", ylab="Gravity", main="Accelerometer Gravitational Force")
lines(sample$total_gravity_smooth, col="red")
stdevs <- mean(sample$total_gravity_smooth)+c(-2,-1,+1,+2)*sd(sample$total_gravity_smooth)
abline(h=stdevs)
This is probably a less than perfect solution, but it might be enough to get you started. The first part relies on a small modification of the find_peaks function from the gazetools package.
find_maxima <- function(x, threshold)
{
  ranges <- find_peak_ranges(x, threshold)
  peaks <- NULL
  if (!is.null(ranges)) {
    for (i in 1:nrow(ranges)) {
      rnge <- ranges[i, 1]:ranges[i, 2]
      r <- x[rnge]
      peaks <- c(peaks, rnge[which(r == max(r))])
    }
  }
  peaks
}
find_minima <- function(x, threshold)
{
  ranges <- find_peak_ranges(x, threshold)
  peaks <- NULL
  if (!is.null(ranges)) {
    for (i in 1:nrow(ranges)) {
      rnge <- ranges[i, 1]:ranges[i, 2]
      r <- x[rnge]
      peaks <- c(peaks, rnge[which(r == min(r))])
    }
  }
  peaks
}
In order to get the find_maxima and find_minima functions to give us what we're looking for we are going to need to smooth the total_gravity data even further:
spline <- smooth.spline(sample$loggingSample, y = sample$total_gravity, df = 30)
Note: I 'zeroed out' total gravity (sample$total_gravity <- sample$total_gravity - 1)
Next, pull out the smoothed x and y values:
out <- data.frame(x = spline$x, y = spline$y) # name the columns so that out$y exists
Then find our local maxima and minima
max <- find_maxima(out$y, threshold = 0.4)
min <- find_minima(out$y, threshold = -0.4)
And then plot the data to make sure everything looks legit:
plot(out$y, type="l", col=grey(.2), xlab="Series", ylab="Gravity", main="Accelerometer Gravitational Force")
lines(out$y, col="red")
stdevs <- mean(out$y)+c(-2,-1,+1,+2)*sd(out$y)
abline(h=stdevs)
abline(v=max[1], col = 'green')
abline(v=max[2], col = 'green')
abline(v=min[1], col = 'blue')
And finally, we can see how long you were off the ground.
print(hangtime <- min[1] - max[1])
[1] 20
You can reduce your thresholds to get additional datapoints (changes in acceleration).
Hope this helps!
I would consider a few things:
Smooth the data by collecting median values every 100ms - accelerometer data on iPhones is not perfectly accurate, so this approach will help.
Identify turning points as #scribbles suggests.
There is code available in my github repository that could be modified to help with both of these issues. A PDF with some explanation is here: https://github.com/MonteShaffer/mPowerEI/blob/master/mPowerEI/example/challenge-1a.pdf
Specifically, take a look at:
library(devtools);
install_github("MonteShaffer/mPowerEI", subdir="mPowerEI");
library(mPowerEI);
# data smoothing
?scaleToTimeIncrement
# turning points
?pastecs::turnpoints
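If you would rather not install anything, the two ideas above (median values per 100 ms, then turning points) can be roughed out in base R. This is only a sketch on a synthetic signal: the 100 Hz sampling rate, the sine-shaped test signal and all variable names are assumptions for illustration, and the sign-change trick below is a simple stand-in for pastecs::turnpoints, not a reimplementation of it.

```r
set.seed(42)
# Hypothetical signal: 3 s sampled at 100 Hz, oscillating around 1 g with noise
time_s <- (0:299) / 100
g <- 1 + 0.8 * sin(2 * pi * time_s) + rnorm(length(time_s), 0, 0.02)

# 1) Smooth by taking the median of every 100 ms window (10 samples at 100 Hz)
bin <- (seq_along(g) - 1) %/% 10
med <- as.numeric(tapply(g, bin, median))

# 2) Turning points: places where the slope of the smoothed series changes sign
slope_sign <- sign(diff(med))
peaks   <- which(diff(slope_sign) < 0) + 1   # local maxima (indices into med)
troughs <- which(diff(slope_sign) > 0) + 1   # local minima (indices into med)

plot(med, type = "o", xlab = "100 ms window", ylab = "Median acceleration (g)")
abline(v = peaks, col = "green")
abline(v = troughs, col = "blue")
```

On real accelerometer data you would replace g with the total acceleration column and set the window size from the actual sampling rate; the distance between a peak and the following trough, times the window length, then gives a rough duration.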

how do i fit unique curves on each unique plot in a for loop

I have written this code (see below) for my data frame kleaf.df to combine multiple plots of the variable press_mV, with one plot for each unique ID.
I need some help fitting curves to my plots. When I run this code I get the same fitted curve (the one fitted for the first plot) on all the plots, whereas I want each plot to show its own fitted curve.
Thanks in advance for any help.
f <- function(t,a,b) {a * exp(b * t)}
par(mfrow = c(5, 8), mar = c(1,1,1,1), srt = 0, oma = c(1,6,5,1))
for (i in unique(kleaf.df$ID))
{
  d <- subset(kleaf.df, kleaf.df$ID == i)
  plot(c(1:length(d$press_mV)),d$press_mV)
  #----tp: turning point. the last maximum value before the values start to decrease
  tp <- tail(which( d$press_mV == max(d$press_mV) ),1)
  #----set the end points (A,B) to fit the curve to
  A <- tp+5
  B <- A+20
  #----t = time, p = press_mV
  # n.b.: shift by 5 to accommodate the time before attachment
  t <- A:B+5
  p <- d$press_mV[A:B]
  fit <- nls(p ~ f(t,a,b), start = c(a=d$press_mV[A], b=-0.01))
  #----draw a curve on plot using the above coefficients
  curve(f(x, a=co[1], b=co[2]), add = TRUE, col="green", lwd=2)
}
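Judging only from the code as posted, one plausible cause is that co is never assigned inside the loop, so curve() keeps reusing whatever co happens to exist in the workspace. A minimal sketch of the loop with the coefficients refreshed on every pass is below. The data frame here is a made-up stand-in for kleaf.df (two IDs, a time column t, and noisy exponential decay), and taking the starting values from a log-linear lm() fit is an addition for robustness, not something in the original post.

```r
set.seed(7)
# Hypothetical stand-in for kleaf.df: two IDs with different decay curves
make_id <- function(id, a, b) {
  t <- 1:30
  data.frame(ID = id, t = t,
             press_mV = a * exp(b * t) + rnorm(length(t), 0, 0.3),
             stringsAsFactors = FALSE)
}
kleaf.df <- rbind(make_id("leaf1", 100, -0.05), make_id("leaf2", 60, -0.10))

f <- function(t, a, b) a * exp(b * t)
fits <- list()
par(mfrow = c(1, 2))
for (i in unique(kleaf.df$ID)) {
  d <- kleaf.df[kleaf.df$ID == i, ]
  plot(d$t, d$press_mV, main = i)
  # Starting values from a log-linear fit (assumes press_mV > 0)
  init <- coef(lm(log(press_mV) ~ t, data = d))
  fit <- nls(press_mV ~ f(t, a, b), data = d,
             start = list(a = exp(init[[1]]), b = init[[2]]))
  co <- coef(fit)                        # refreshed for this ID on every pass
  curve(f(x, a = co[["a"]], b = co[["b"]]), add = TRUE, col = "green", lwd = 2)
  fits[[i]] <- co                        # keep the coefficients for inspection
}
```

In the original loop the equivalent change would be a single line, co <- coef(fit), placed between the nls() call and the curve() call.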

Make histograms of stacked rectangles rather than columns

With the following code, I get a histogram as below
x <- rnorm(100)
hist(x,col="gray")
What can I do to display the bars as stacked rectangles (visible by their outlines, rather than a change in fill color) instead of uniform columns? Each rectangle represents a frequency of, for example, 1, although I want to be able to change this through a parameter.
From an answer to this question (h/t Vincent Zoonekynd).
x <- rnorm(100)
hist(x,col="gray")
abline(h=seq(5,40,5),col="white")
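To control how many observations each rectangle stands for, the same trick can be wrapped in a small function. This is just a sketch; the function name stacked_hist and its step argument are made up for illustration.

```r
# Draw a histogram and cut its bars into rectangles of `step` counts each
stacked_hist <- function(x, step = 1, ...) {
  h <- hist(x, col = "gray", ...)
  top <- max(h$counts)
  if (top >= step) {
    # White horizontal lines at every multiple of `step` up to the tallest bar
    abline(h = seq(step, top, by = step), col = "white")
  }
  invisible(h)                           # return the histogram object quietly
}

set.seed(1)
h <- stacked_hist(rnorm(100), step = 2)  # each rectangle represents 2 observations
```

The white lines span the whole plot width, so this relies on a white plot background, as in the snippet above.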
Here is a function to get you started (it is actually a modification of part of the examples for the tkBrush function in the TeachingDemos package):
rechist <- function(x,...){
  tmp <- hist(x,plot=FALSE)
  br <- tmp$breaks
  w <- as.numeric(cut(x,br,include.lowest=TRUE))
  sy <- unlist(lapply(tmp$counts,function(x)seq(length=x)))
  my <- max(sy)
  sy <- sy/my
  my <- 1/my
  sy <- sy[order(order(x))]
  plot.new()
  plot.window(xlim=range(br), ylim=c(0,1))
  rect(br[w], sy-my, br[w+1], sy,
       border=TRUE, col='grey')
  rect(br[-length(br)], 0, br[-1], tmp$counts*my)
  axis(1)
}
rechist( iris$Petal.Length )
