I have the following data that I am trying to plot with dygraphs in R:
ts.rmean ts.rmax
0001-01-01 3.163478 5.86
0002-01-01 3.095909 4.67
0003-01-01 3.112000 6.01
0004-01-01 2.922800 5.44
0005-01-01 2.981154 5.21
0006-01-01 3.089167 5.26
0007-01-01 3.168000 6.28
0008-01-01 3.040400 5.00
0009-01-01 2.809130 6.04
0010-01-01 3.002174 4.64
0011-01-01 3.002000 4.93
0012-01-01 3.081250 5.28
0013-01-01 2.687083 4.62
Each line represents a daily value between 01 Jan and 31 Dec for ts.rmean and ts.rmax. Since I have not specified the date, the x-axis of the plot shows the index of each line from 1 to 366. Is it possible to modify the data so that the x-axis shows Month-Day?
You could do something like this:
library(dygraphs)
library(xts)
# convert the row names of your data frame to year-month-day dates;
# 2012 is used because it has 366 days, and the result is subsetted to fit the example
rownames(data)<-strptime(paste0("2012-",1:366),format="%Y-%j")[1:nrow(data)]
#transform to xts
data<-as.xts(data)
#plot
dygraph(data)
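To see what the row name conversion above produces, here is a minimal base R sketch. It assumes, as the answer does, that row i is day i of the leap year 2012; the names `doy`, `dates` and `md_labels` are only illustrative.

```r
# Day-of-year values 1..366 mapped onto the leap year 2012
# (an assumption taken from the answer above; any leap year would do).
doy <- 1:366
dates <- as.Date(doy - 1, origin = "2012-01-01")

# Month-Day labels derived from those dates
md_labels <- format(dates, "%m-%d")

# First day, leap day and last day of the year
md_labels[c(1, 60, 366)]
```

The `dates` vector could be used as the index of the xts object in place of the `strptime()` call, since both describe the same days.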
I am dealing with a dataset with dates and various response values at different time intervals, as shown below:
Id Date Response
1 2008-03-12 4.88
1 2009-06-06 5.39
2 2015-10-22 8.61
2 2019-09-26 6.20
3 2006-09-28 7.40
3 2009-07-15 7.25
3 2011-01-19 9.50
Dates are the x values and Response the y values.
I am interested in estimating the AUC for each Id. Any suggestions for accomplishing this are much appreciated.
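One possible approach, sketched here under the assumption that a trapezoidal rule over the observed points is acceptable and that time is measured in days. The helper name `trapz_auc` is hypothetical, not from any package.

```r
# Example data copied from the question
df <- data.frame(
  Id = c(1, 1, 2, 2, 3, 3, 3),
  Date = as.Date(c("2008-03-12", "2009-06-06", "2015-10-22", "2019-09-26",
                   "2006-09-28", "2009-07-15", "2011-01-19")),
  Response = c(4.88, 5.39, 8.61, 6.20, 7.40, 7.25, 9.50)
)

# Trapezoidal rule: interval width times the mean of the two endpoints,
# summed over consecutive observations. x is in days, so the result is
# in "response units x days". (Hypothetical helper.)
trapz_auc <- function(date, response) {
  o <- order(date)
  x <- as.numeric(date[o])
  y <- response[o]
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

# One AUC value per Id
auc_by_id <- sapply(split(df, df$Id),
                    function(d) trapz_auc(d$Date, d$Response))
auc_by_id
```

Dividing each result by the length of the observation window would give a time-averaged response, which may be easier to compare across Ids with different follow-up lengths.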
I have this data that I want to convert to dates, but I doubt it is possible with years below 0. The snippets are below:
library(datasets)
library(quantmod)
data(treering)
tree_df = data.frame(ds=index(treering), y=as.numeric(treering))
> head(tree_df)
ds y
1 -6000 1.345
2 -5999 1.077
3 -5998 1.545
4 -5997 1.319
5 -5996 1.413
6 -5995 1.069
> tail(tree_df)
ds y
7975 1974 1.031
7976 1975 1.027
7977 1976 1.173
7978 1977 1.471
7979 1978 1.444
7980 1979 1.160
?treering
Yearly Treering Data, -6000–1979
Description
Contains normalized tree-ring widths in dimensionless units.
Usage
treering
Format
A univariate time series with 7981 observations. The object is of class "ts".
Each tree ring corresponds to one year.
Is there a way to convert the data into dates with negative years, for example "-6000-01-01"?
Apparently converting a negative integer to a Date does the trick. In this case, -2910983 days from the origin 1970-01-01 lands in the year -6000, so a yearly sequence starting from that date and ending at the last year of the series gives the dates:
sequences = seq(as.Date(-2910983,origin="1970-01-01"),as.Date(paste0(max(index(treering)),"-01-01")),by="1 years")
tail(sequences)
[1] "1974-01-01" "1975-01-01" "1976-01-01" "1977-01-01" "1978-01-01" "1979-01-01"
head(sequences)
[1] "-6000-01-01" "-5999-01-01" "-5998-01-01" "-5997-01-01" "-5996-01-01" "-5995-01-01"
I am trying to import a text file with data that looks like:
Jan 1998 4.36
Feb 1998 4.34
Mar 1998 4.35
Apr 1998 4.37
May 1998 4.45
Jun 1998 4.54
Jul 1998 4.52
Aug 1998 4.68
Sep 1998 4.82
Oct 1998 4.72
Nov 1998 4.80
...
as a zoo in R. I have tried importing it directly as a zoo:
install.packages("zoo")
library("zoo")
FMAGX_prices <- read.csv.zoo("filepath.../FMAGX_prices.csv", format = "%m/%Y")
and importing it as a data frame and then converting it to a zoo object. The reason I create the dates vector and re-assign it to the front of the data frame is that, by default, I get a three-column data frame: one column with the month abbreviation, one with the year, and one with the price:
install.packages("zoo")
library("zoo")
FMAGX_prices <-read.table("filepath.../FMAGX_prices.txt")
dates <- paste(FMAGX_prices$V1, FMAGX_prices$V2, sep = " ")
FMAGX_prices$V3 <- as.numeric(as.character(FMAGX_prices$V3))
FMAGX_prices$dates <- dates
FMAGX_prices <- subset(FMAGX_prices, select= c(dates, V3))
FMAGX_prices <- read.zoo(FMAGX_prices, "%b %Y")
Neither method works. I always get the error below:
Error in read.zoo(FMAGX_prices, format = "%b %Y") :
index has 144 bad entries at data rows: 1 2 3 4 5 6 7 8 9 10 11...
My assumption is that there is something wrong with my date format, but I am not sure what it would be.
I've tried various combinations of arguments in the read statements, added headers, reformatted the data as a CSV, and changed the dates to 01/1998, 02/1998, etc. (with the corresponding format arguments), but I always get the same error.
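One likely cause, stated as an assumption because the original file is not available: a string such as "Jan 1998" has no day, so `as.Date()` cannot turn it into a full date and returns NA, which would explain every row being reported as a bad index entry. zoo's `yearmon` class represents a month without a day. A minimal sketch that builds the zoo object directly from the three columns:

```r
library(zoo)

# A few rows in the same layout as the question's file (month, year, price)
raw <- read.table(text = "
Jan 1998 4.36
Feb 1998 4.34
Mar 1998 4.35
", stringsAsFactors = FALSE)

# "%b %Y" carries no day of the month, so as.Date() gives NA here
bad_idx <- as.Date(paste(raw$V1, raw$V2), format = "%b %Y")

# yearmon does not need a day.
# Note: %b matches month abbreviations in the current locale
# (English month names are assumed here).
idx <- as.yearmon(paste(raw$V1, raw$V2), format = "%b %Y")
FMAGX_prices <- zoo(raw$V3, order.by = idx)
FMAGX_prices
```

If a Date index is needed later, `as.Date(index(FMAGX_prices))` converts each month to its first day.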
I am stuck on why this is happening and have searched everywhere for the answer. When I try to plot a time series object in R, the resulting plot comes out in reverse.
I have the following code:
library(sqldf)
stock_prices <- read.csv('~/stockPrediction/input/REN.csv')
colnames(stock_prices) <- tolower(colnames(stock_prices))
colnames(stock_prices)[7] <- 'adjusted_close'
stock_prices <- sqldf('SELECT date, adjusted_close FROM stock_prices')
head(stock_prices)
date adjusted_close
1 2014-10-20 3.65
2 2014-10-17 3.75
3 2014-10-16 4.38
4 2014-10-15 3.86
5 2014-10-14 3.73
6 2014-10-13 4.09
tail(stock_prices)
date adjusted_close
1767 2007-10-15 8.99
1768 2007-10-12 9.01
1769 2007-10-11 9.02
1770 2007-10-10 9.06
1771 2007-10-09 9.06
1772 2007-10-08 9.08
But when I try the following code:
stock_prices_ts <- ts(stock_prices$adjusted_close, start=c(2007, 1), end=c(2014, 10), frequency=12)
plot(stock_prices_ts, col='blue', lwd=2, type='l')
The resulting image is:
And even if I reverse the time series object with this code:
plot(rev(stock_prices_ts), col='blue', lwd=2, type='l')
I get this, which has arbitrary numbers.
Any idea why this is happening? Any help is much appreciated.
This happens because your object loses its time series structure once you apply the rev function.
For example :
set.seed(1)
gnp <- ts(cumsum(1 + round(rnorm(100), 2)),
start = c(1954, 7), frequency = 12)
gnp ## gnp has a real time series structure
        Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct    Nov    Dec
1954                                             0.37   1.55   1.71   4.31   5.64   5.82
1955   7.31   9.05  10.63  11.32  13.83  15.22  15.60  14.39  16.51  17.47  18.45  20.39
1956  22.21  23.80  25.72  27.50  28.57  27.58  29.20  30.14  30.98  30.51  31.03  32.45
1957
rev(gnp) ## the reversal is just a vector
[1] 110.91 110.38 110.60 110.17 110.45 108.89 106.30 104.60 102.44 ....
In general, it is a little bit painful to manipulate the class ts. One idea is to use an xts object, which "generally" conserves its structure once you apply common operations to it.
Even though in this case the generic method rev is not implemented for an xts object, it is easy to coerce the resulting zoo time series to an xts one using as.xts.
par(mfrow=c(2,2))
plot(gnp,col='red',main='gnp')
plot(rev(gnp),type='l',col='red',main='rev(gnp)')
library(xts)
xts_gnp <- as.xts(gnp)
plot(xts_gnp)
## note that I apply as.xts again after the rev operation;
## otherwise I lose the xts structure
rev_xts_gnp = as.xts(rev(as.xts(gnp)))
plot(rev_xts_gnp)
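Separately from rev(), the rows shown in the question are ordered newest first and are daily prices, so another option is to sort by date and index the values by their real dates instead of passing them to ts() with a monthly frequency. This is a sketch under that reading of the question, using only the six rows printed by head(); `stock_xts` is an illustrative name.

```r
library(xts)

# First rows from the question, newest first as in the original CSV
stock_prices <- data.frame(
  date = c("2014-10-20", "2014-10-17", "2014-10-16",
           "2014-10-15", "2014-10-14", "2014-10-13"),
  adjusted_close = c(3.65, 3.75, 4.38, 3.86, 3.73, 4.09),
  stringsAsFactors = FALSE
)

# Sort oldest first, then index the prices by their actual dates
stock_prices <- stock_prices[order(as.Date(stock_prices$date)), ]
stock_xts <- xts(stock_prices$adjusted_close,
                 order.by = as.Date(stock_prices$date))
stock_xts
```

Calling `plot(stock_xts)` then draws the series left to right in calendar order, with dates rather than arbitrary index numbers on the x-axis.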
I just downloaded a lot of temperature data from one of our dataloggers. The data frame gives me mean hourly observations of temperature for 1691 hours for 87 temperature sensors (so there is a lot of data here). It looks something like this:
D1_A D1_B D1_C
13.43 14.39 12.33
12.62 13.53 11.56
11.67 12.56 10.36
10.83 11.62 9.47
I would like to reshape this dataset into a matrix that looks like this:
#create a blank matrix 5 columns 131898 rows
matrix1<-matrix(nrow=131898, ncol=5)
colnames(matrix1)<- c("year", "ID", "Soil_Layer", "Hour", "Temperature")
where:
year is always "2012"
ID corresponds to the header ID (e.g. D1)
Soil_Layer corresponds to the second bit of the header (e.g. A, B, or C)
Hour= 1:1691 for each sensor
and Temperature= the observed values in the original dataframe.
Can this be done with the reshape package in R? Does this need to be done as a loop? Any input on how to handle this dataset would be useful. Cheers!
I think this does what you want...you can take advantage of the colsplit() and melt() functions in package reshape2. It's not clear where you identify the Hour for the data, so I assumed it was ordered from the original dataset. If that's not the case, update your question:
library(reshape2)
#read in your data
x <- read.table(text = "
D1_A D1_B D1_C
13.43 14.39 12.33
12.62 13.53 11.56
11.67 12.56 10.36
10.83 11.62 9.47
9.98 10.77 9.04
9.24 10.06 8.65
8.89 9.55 8.78
9.01 9.39 9.88
", header = TRUE)
#add hour index, if data isn't ordered, replace this with whatever
#tells you which hour goes where
x$hour <- 1:nrow(x)
#Melt into long format
x.m <- melt(x, id.vars = "hour")
#Split into two columns
x.m[, c("ID", "Soil_Layer")] <- colsplit(x.m$variable, "_", c("ID", "Soil_Layer"))
#Add the year
x.m$year <- 2012
#Return the first 6 rows
head(x.m[, c("year", "ID", "Soil_Layer", "hour", "value")])
#----
year ID Soil_Layer hour value
1 2012 D1 A 1 13.43
2 2012 D1 A 2 12.62
3 2012 D1 A 3 11.67
4 2012 D1 A 4 10.83
5 2012 D1 A 5 9.98
6 2012 D1 A 6 9.24