R timeseries jumps and plotting in dygraph - r

I have a time series of stock data spanning several days, with big jumps between the days because there is no data while the stock market is closed.
The picture shows what I mean:
[image: Graph]
The time series used for the plot is an xts object, which looks like this:
[image: xts object]
The graph is plotted using the following code:
dygraph(stocks, main="Closing Stock Prices") %>%
dyAxis("y", label="Value") %>%
dySeries("..1",label="IBM") %>%
dyOptions(colors = c("blue"), connectSeparatedPoints=TRUE) %>%
dyRangeSelector()
What I really want is to "ignore" the time between the recorded dates and plot the graph in one go, without the gaps. Is this possible somehow?
I was thinking of manipulating the time series and treating it as a plain sequence of points, since I don't necessarily need the time, only a properly displayed graph. But is that even possible, given that an xts object requires a time-based index?
Thanks in advance!

Apparently there is no way to hide the gap itself. Inserting NA values hides the line across the closing period, but the gap still exists.
I ended up using a generated timestamp as a substitute index to simulate a continuous graph.
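The workaround described above could look roughly like the following. This is only a minimal sketch: the sample prices, the timestamps and the names real_times, fake_times and stocks_flat are made up for illustration, and the one-minute spacing is an assumption about the data.

```r
library(xts)

# Hypothetical sample: two trading days with an overnight gap in between
real_times <- as.POSIXct(c("2020-01-02 15:58:00", "2020-01-02 15:59:00",
                           "2020-01-03 09:30:00", "2020-01-03 09:31:00"),
                         tz = "UTC")
prices <- c(135.1, 135.3, 136.0, 135.8)

# Generated, evenly spaced timestamps: one per observation, no overnight gap
fake_times <- seq(from = real_times[1], by = "1 min",
                  length.out = length(prices))
stocks_flat <- xts(prices, order.by = fake_times)

# Plot only if dygraphs is installed; print(p) renders the widget
if (requireNamespace("dygraphs", quietly = TRUE)) {
  p <- dygraphs::dyRangeSelector(
    dygraphs::dygraph(stocks_flat, main = "Closing Stock Prices")
  )
}
```

The x-axis labels then no longer show the real times, which is the price of this trick.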

Related

plotly and week number line graph

I am trying to graph the weekly evolution of the number of downloads.
I have similar data for every week of the year, but Plotly cuts the line.
Example:
Downloads_weekly=Downloads.groupby(['Week_Number']).agg({'Daily Installs' : 'sum'})
fig1=px.line(Downloads_weekly.reset_index() , x='Week_Number' , y= 'Daily Installs')
fig1.update_xaxes(
tickformat="%Y-%W"
)
fig1
I have the exact same issue and am still trying to figure it out. I think Plotly is converting the dates to months; in any case, it becomes a big mess. I work around the problem by creating my own Year-Week column:
df['Year-Week'] = df['Date'].dt.strftime('%Y.%U')
Then I sort the data frame and plot that:
df.sort_values(['Year-Week'], ascending=True, inplace=True)

Graph time series in seconds with dygraph

I get data from a device recording roughly every 8 to 10 ms. Here is what the raw data looks like (with Time in seconds):
> data
Time Raw
1 0.004061294 0.7384716
2 0.012264294 1.9172491
3 0.020467294 2.2663045
4 0.028670294 1.1509293
5 0.037858294 1.6099277
6 0.048030294 0.7834101
...
The file is very long, and I would like to inspect it first with a scrollable x-axis. Therefore, I treated this data as a time series and used the dygraphs package to visualize it.
data_ts <- as.ts(data$Raw)
dygraph(data_ts, main="Raw signal") %>%
dyRangeSelector()
The result is great, but I would like to see the x-axis values in seconds (or milliseconds). So far, nothing is displayed and I can't wrap my head around the problem. I guess it may have something to do with the fact that I haven't specified the unit of the time series, or that my data does not meet the requirements of a time series, since the time intervals are slightly irregular.
Thanks for your help!
Any other solutions that would allow to scroll the x-axis are also welcome.
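One possible direction, sketched under assumptions: as far as I understand the dygraphs documentation, dygraph() also accepts a plain data frame whose first column holds numeric x-axis values, which would avoid converting to a ts object and keep the irregular spacing. The column name Time_ms and the object plot_df are hypothetical; the six rows are the sample shown above.

```r
# Sample rows copied from the question (Time in seconds)
data <- data.frame(
  Time = c(0.004061294, 0.012264294, 0.020467294,
           0.028670294, 0.037858294, 0.048030294),
  Raw  = c(0.7384716, 1.9172491, 2.2663045,
           1.1509293, 1.6099277, 0.7834101)
)

# Numeric x-axis in milliseconds, placed in the first column
data$Time_ms <- data$Time * 1000
plot_df <- data[, c("Time_ms", "Raw")]

# Assumption: dygraph() treats the first column of a data frame as numeric x
if (requireNamespace("dygraphs", quietly = TRUE)) {
  p <- dygraphs::dyRangeSelector(
    dygraphs::dygraph(plot_df, main = "Raw signal", xlab = "Time (ms)")
  )
}
```

Calling print(p) should then render the scrollable plot with the x-axis labelled in milliseconds.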

Plot occurrences over time in R from a dataset of H:M:S timestamps

I'm trying to plot, essentially, a timeline. I have two lists of timestamps in %H:%M:%S format, and I want to show them as occurrences over an x-axis of continuous time from 0 to 8 minutes.
Everything I can find is either a specialized package for a timeline that bins data by year instead of showing it continuously scatterplot style, or it requires a y-axis. I'm not plotting a function, so I don't have a y-axis, just two lines made up of dots something like this: mock excel plot. I got this by putting in fake values for the missing axis. It's also vertical instead of horizontal, but that's easily fixed.
I'm trying to show all the times in those 8 minutes a person pronounced an S and all the times she deleted it so I can see if she starts deleting more as time goes on.
Edit: here's some of my data as requested.
s deleted   s pronounced
0:02:32     0:00:49
0:04:01     0:00:50
0:04:02     0:00:51
0:04:10     0:00:56
0:04:11     0:00:58
0:04:44     0:00:59
0:05:59     0:01:09
0:06:03     0:01:10
0:06:31     0:01:12
0:06:33     0:05:59
0:06:40     0:06:00
            0:06:06
            0:06:06
            0:06:32
            0:06:43
            0:06:44
            0:06:47
            0:06:51
            0:06:52
            0:07:10
            0:07:14
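A minimal sketch of one way to draw such a plot in base R, using only the first few values from each column above. The helper to_seconds and the variable names are hypothetical, and stripchart() is just one option for a plot without a y-axis.

```r
# Convert "H:MM:SS" strings to seconds (hypothetical helper)
to_seconds <- function(x) {
  parts <- do.call(rbind, strsplit(x, ":", fixed = TRUE))
  storage.mode(parts) <- "numeric"
  as.vector(parts %*% c(3600, 60, 1))
}

# A few values taken from the table above
deleted    <- to_seconds(c("0:02:32", "0:04:01", "0:04:02", "0:04:10"))
pronounced <- to_seconds(c("0:00:49", "0:00:50", "0:00:51", "0:00:56"))

# One horizontal row of dots per category, x-axis in minutes from 0 to 8
stripchart(list("s pronounced" = pronounced / 60,
                "s deleted"    = deleted / 60),
           xlim = c(0, 8), pch = 16,
           xlab = "Time (minutes)")
```

Each category gets its own horizontal row, so a drift towards more deletions later in the recording should show up as dots clustering on the right of the "s deleted" row.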

Weekly time series plot in R

I am trying to create a plot of weekly data. Though this is not the exact problem I am having, it illustrates it well. Imagine you want to plot the values 1, 2, ..., 7 for 7 weeks starting Jan 1 2015. My plot should just be a line that trends upward, but instead I get 7 different lines. I tried the code below (and some others, to no avail). Help would be greatly appreciated.
startDate = "2015-01-01"
endDate = "2015-02-19"
y=c(1,2,3,4,5,6,7)
tsy=ts(y,start=as.Date(startDate),end=as.Date(endDate))
plot(tsy)
You are plotting both the time and y together as individual plots.
Instead use:
plot(y)
lines(y)
Also, create a date column based on the specifics you gave, which will serve as the time index. From there you can put the date on the x-axis to easily see how your variable changes over time.
To make your life easier, I think your first step should be to create an xts time series object (install and load the xts package). Then it is a piece of cake to plot, subset, or do whatever you like with the series.
Build your vector of dates as a sequence from a start date:
seq( as.Date("2011-07-01"), by=1, len=7)
and your data vector: 1:7
A one-liner builds and plots the above time series object:
plot(as.xts(1:7,order.by=seq( as.Date("2011-07-01"), by=1, len=7)))
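Note that the one-liner above uses daily steps from 2011. A hedged adaptation to the weekly case in the question, assuming 7 observations one week apart starting on 2015-01-01 (the names dates and weekly are illustrative only):

```r
library(xts)

# Seven weekly dates starting 2015-01-01 (assumed from the question)
dates <- seq(as.Date("2015-01-01"), by = "week", length.out = 7)

# One value per week, indexed by date
weekly <- xts(1:7, order.by = dates)

# A single upward-trending line with dates on the x-axis
plot(weekly, main = "Weekly values")
```

Seven weekly steps from 2015-01-01 end on 2015-02-12, so the end date given in the question (2015-02-19) would correspond to an eighth point.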

Plotting truncated times from zoo time series

Let's say I have a data frame with lots of values under these headers:
df <- data.frame(c("Tid", "Value"))
#Tid.format = %Y-%m-%d %H:%M
Then I turn that data frame over to zoo, because I want to handle it as a time series:
library("zoo")
df <- zoo(df$Value, df$Tid)
Now I want to produce a smooth scatter plot over which time of day each measurement was taken (i.e. discard date information and only keep time) which supposedly should be done something like this: https://stat.ethz.ch/pipermail/r-help/2009-March/191302.html
But it seems the time() function doesn't produce any time at all; instead it just produces a number sequence. Whatever I do from that link, I can't get a scatter plot of values over an average day. The data.frame code that actually does work (without using zoo time series) looks like this (i.e. extracting the hour from the time and converting it to numeric):
smoothScatter(data.frame(as.numeric(format(df$Tid,"%H")),df$Value))
Another thing I want to do is produce a density plot of how many measurements I have per hour. I have plotted on hours using a regular data.frame with no problems, so the data I have is fine. But when I try to do it using zoo then I either get errors or I get the wrong results when trying what I have found through Google.
I did manage to get something plotted through this line:
plot(density(as.numeric(trunc(time(df),"01:00:00"))))
But it is not correct. It seems again that it is just producing a sequence from 1 to 217, where I wanted it to be truncating any date information and just keep the time rounded off to hours.
I am able to plot this:
plot(density(df))
Which produces a density plot of the Values. But I want a density plot over how many values were recorded per hour of the day.
So, if someone could please help me sort this out, that would be great. In short, what I want to do is:
1) smoothScatter(x-axis: time of day (0-24), y-axis: value)
2) plot(density(x-axis: time of day (0-24)))
EDIT:
library("zoo")
df <- data.frame(Tid=strptime(c("2011-01-14 12:00:00","2011-01-31 07:00:00","2011-02-05 09:36:00","2011-02-27 10:19:00"),"%Y-%m-%d %H:%M"),Values=c(50,52,51,52))
df <- zoo(df$Values,df$Tid)
summary(df)
df.hr <- aggregate(df, trunc(df, "hours"), mean)
summary(df.hr)
png("temp.png")
plot(df.hr)
dev.off()
This code uses some actual values that I have. I would have expected the plot of "df.hr" to be an hourly average, but instead I get some weird new index that is not time at all...
There are three problems with the aggregate statement in the question:
1) We wish to truncate the times, not df.
2) trunc.POSIXt unfortunately returns a POSIXlt result, so it needs to be converted back to POSIXct.
3) It seems you did not intend to truncate to the hour in the first place but wanted to extract the hours.
To address the first two points the aggregate statement needs to be changed to:
tt <- as.POSIXct(trunc(time(df), "hours"))
aggregate(df, tt, mean)
but to address the last point it needs to be changed entirely to
tt <- as.POSIXlt(time(df))$hour
aggregate(df, tt, mean)
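Putting the last variant together with the sample data from the question gives the following sketch. The time zone is fixed to UTC here as an assumption so that the extracted hours are reproducible, and the names z, hr and by_hour are illustrative.

```r
library(zoo)

# Sample values from the question; tz = "UTC" is an assumption
tid <- as.POSIXct(c("2011-01-14 12:00:00", "2011-01-31 07:00:00",
                    "2011-02-05 09:36:00", "2011-02-27 10:19:00"),
                  tz = "UTC")
z <- zoo(c(50, 52, 51, 52), tid)

# Hour of day (0-23) for each observation, discarding the date
hr <- as.POSIXlt(time(z))$hour

# Mean value per hour of day; the index is now the hour, not a timestamp
by_hour <- aggregate(z, hr, mean)

# Density of the number of measurements over the hours of the day
plot(density(hr, from = 0, to = 24), main = "Measurements per hour of day")
```

With real data containing many observations per hour, by_hour would hold the average value for each hour of the day.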

Resources