I can't understand why my code is producing an undesired output, since I've done this in the past with similar datasets and good results.
Below are the two dataframes I would like to left_join():
> head(datagps)
Date & Time [Local] Latitude Longitude DateTime meters
1: 06/11/2018 08:44 -2.434986 34.85387 2018-11-06 08:44:00 1.920190
2: 06/11/2018 08:48 -2.434993 34.85386 2018-11-06 08:48:00 3.543173
3: 06/11/2018 08:52 -2.435014 34.85388 2018-11-06 08:52:00 1.002979
4: 06/11/2018 08:56 -2.435011 34.85389 2018-11-06 08:56:00 3.788024
5: 06/11/2018 09:00 -2.434986 34.85387 2018-11-06 09:00:00 1.262584
6: 06/11/2018 09:04 -2.434994 34.85386 2018-11-06 09:04:00 3.012679
> head(datasensorraw)
# A tibble: 6 x 4
TimeGroup x y z
<dttm> <int> <int> <dbl>
1 2000-01-01 00:04:00 0 0 0
2 2000-01-01 00:08:00 1 0 1
3 2000-01-01 00:12:00 0 0 0
4 2000-01-01 00:20:00 0 0 0
5 2000-01-01 00:24:00 0 0 0
6 2018-06-09 05:04:00 4 14 14.6
And below is my code. There are no errors, but for some reason I get NAs under x, y and z. This should not happen, since there are registered values in the datasensorraw dataframe for those time stamps:
> library(dplyr)
> dataresults<-datagps %>%
+ mutate(`Date & Time [Local]` = as.POSIXct(`Date & Time [Local]`,
+ format = "%d/%m/%Y %H:%M")) %>%
+ left_join(datasensorraw, by = c("Date & Time [Local]" = "TimeGroup"))
> #Left join the data frames
> head(dataresults)
Date & Time [Local] Latitude Longitude DateTime meters x y z
1 2018-11-06 07:44:00 -2.434986 34.85387 2018-11-06 08:44:00 1.920190 NA NA NA
2 2018-11-06 07:48:00 -2.434993 34.85386 2018-11-06 08:48:00 3.543173 NA NA NA
3 2018-11-06 07:52:00 -2.435014 34.85388 2018-11-06 08:52:00 1.002979 NA NA NA
4 2018-11-06 07:56:00 -2.435011 34.85389 2018-11-06 08:56:00 3.788024 NA NA NA
5 2018-11-06 08:00:00 -2.434986 34.85387 2018-11-06 09:00:00 1.262584 NA NA NA
6 2018-11-06 08:04:00 -2.434994 34.85386 2018-11-06 09:04:00 3.012679 NA NA NA
I can also upload a small dput() sample of datagps and datasensorraw.
I am learning R, so I'm wondering if I'm doing something wrong. As the dput() samples show, I shouldn't be getting NAs in those columns. Any input is appreciated!
Looks like a mix-up in your date format. Try switching format = "%d/%m/%Y %H:%M" to format = "%m/%d/%Y %H:%M", or switch the other dataset to d/m/y.
dataresults<- datagps_sample %>%
mutate(`Date & Time [Local]` = as.POSIXct(`Date & Time [Local]`, format = "%m/%d/%Y %H:%M")) %>%
left_join(datasensorraw_sample, by = c("Date & Time [Local]" = "TimeGroup"))
> head(dataresults)
Date & Time [Local] Latitude Longitude DateTime meters x y z
1 2018-06-11 12:44:00 -2.434986 34.85387 2018-11-06 08:44:00 1.920190 17 12 21.59363
2 2018-06-11 12:48:00 -2.434993 34.85386 2018-11-06 08:48:00 3.543173 6 0 6.00000
3 2018-06-11 12:52:00 -2.435014 34.85388 2018-11-06 08:52:00 1.002979 47 25 53.24351
4 2018-06-11 12:56:00 -2.435011 34.85389 2018-11-06 08:56:00 3.788024 0 0 0.00000
5 2018-06-11 13:00:00 -2.434986 34.85387 2018-11-06 09:00:00 1.262584 48 53 72.23108
6 2018-06-11 13:04:00 -2.434994 34.85386 2018-11-06 09:04:00 3.012679 139 113 179.24589
EDIT: basically, left_join was finding no matches, so it returned the rows from your original dataframe with the new columns filled with NA. If you format your column before joining, you can check for common keys with something simple like datagps$`Date & Time [Local]` %in% datasensorraw$TimeGroup.
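A quick way to see exactly which rows fail to match is dplyr's anti_join() (a sketch, using the dataframes from the question):

```r
library(dplyr)

# Rows of datagps whose timestamp has no counterpart in datasensorraw;
# an empty result means every row will join successfully
datagps %>%
  mutate(`Date & Time [Local]` = as.POSIXct(`Date & Time [Local]`,
                                            format = "%m/%d/%Y %H:%M")) %>%
  anti_join(datasensorraw, by = c("Date & Time [Local]" = "TimeGroup"))
```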
I'm borrowing the reproducible example given here:
Aggregate daily level data to weekly level in R
since it's pretty close to what I want to do.
Interval value
1 2012-06-10 552
2 2012-06-11 4850
3 2012-06-12 4642
4 2012-06-13 4132
5 2012-06-14 4190
6 2012-06-15 4186
7 2012-06-16 1139
8 2012-06-17 490
9 2012-06-18 5156
10 2012-06-19 4430
11 2012-06-20 4447
12 2012-06-21 4256
13 2012-06-22 3856
14 2012-06-23 1163
15 2012-06-24 564
16 2012-06-25 4866
17 2012-06-26 4421
18 2012-06-27 4206
19 2012-06-28 4272
20 2012-06-29 3993
21 2012-06-30 1211
22 2012-07-01 698
23 2012-07-02 5770
24 2012-07-03 5103
25 2012-07-04 775
26 2012-07-05 5140
27 2012-07-06 4868
28 2012-07-07 1225
29 2012-07-08 671
30 2012-07-09 5726
31 2012-07-10 5176
That question asks how to aggregate on weekly intervals; what I'd like to do is aggregate on a "day of the week" basis.
So I'd like to have a table similar to that one, adding the values of all the same day of the week:
Day of the week value
1 "Sunday" 60000
2 "Monday" 50000
3 "Tuesday" 60000
4 "Wednesday" 50000
5 "Thursday" 60000
6 "Friday" 50000
7 "Saturday" 60000
You can try:
aggregate(d$value, list(weekdays(as.Date(d$Interval))), sum)
We can group by day of the week using weekdays():
library(dplyr)
df %>%
group_by(Day_Of_The_Week = weekdays(as.Date(Interval))) %>%
summarise(value = sum(value))
# Day_Of_The_Week value
# <chr> <int>
#1 Friday 16903
#2 Monday 26368
#3 Saturday 4738
#4 Sunday 2975
#5 Thursday 17858
#6 Tuesday 23772
#7 Wednesday 13560
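Note the rows come back in alphabetical order. If you want calendar order instead, one option (a sketch) is to turn the weekday names into a factor with explicit levels before grouping:

```r
library(dplyr)

# factor levels fix the ordering of the summarised rows
df %>%
  group_by(Day_Of_The_Week = factor(weekdays(as.Date(Interval)),
                                    levels = c("Sunday", "Monday", "Tuesday",
                                               "Wednesday", "Thursday",
                                               "Friday", "Saturday"))) %>%
  summarise(value = sum(value))
```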
We can do this with data.table
library(data.table)
setDT(df1)[, .(value = sum(value)), .(Dayofweek = weekdays(as.Date(Interval)))]
# Dayofweek value
#1: Sunday 2975
#2: Monday 26368
#3: Tuesday 23772
#4: Wednesday 13560
#5: Thursday 17858
#6: Friday 16903
#7: Saturday 4738
Using lubridate (https://cran.r-project.org/web/packages/lubridate/vignettes/lubridate.html):
library(lubridate)
library(data.table)
df1$Weekday <- wday(as.Date(df1$Interval), label = TRUE)
df1 <- data.table(df1)
df1[, sum(value), Weekday]
I have a data frame like below, where Error is 1 if there is an error in DOB, and the corrected DOB for the same record follows with no error (NA). I want to extract only the records that have Error 1 and were never corrected. Could anyone out there help me on this?
ID Date1 Date2 DOB Code Error
381 2002-10-01 2015-10-01 1967-01-22 4 1
381 2002-10-01 2015-10-01 1967-01-20 4 NA
381 2011-10-01 2015-10-01 1969-05-13 11 1
381 2011-10-01 2015-10-01 1968-05-13 11 NA
837 2005-12-07 2015-12-07 1987-11-19 8 1
837 2005-12-08 2015-12-08 1989-12-07 8 1
837 2001-04-15 2015-04-15 1984-08-11 18 1
840 2001-04-23 2015-04-23 1999-03-14 18 NA
The output table will have the details below.
ID Date1 Date2 DOB Code Error
837 2005-12-07 2015-12-07 1987-11-19 8 1
837 2005-12-08 2015-12-08 1989-12-07 8 1
837 2001-04-15 2015-04-15 1984-08-11 18 1
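One possible approach (a sketch, assuming a corrected record shares the same ID and Code as the erroneous one and carries Error = NA) is to keep only the ID/Code groups that contain no corrected row:

```r
library(dplyr)

# A group (same ID and Code) containing an NA row was corrected,
# so we drop it and keep only the never-corrected error records
df %>%
  group_by(ID, Code) %>%
  filter(!any(is.na(Error))) %>%
  ungroup()
```

On the sample above this keeps the three ID 837 rows and drops the ID 381 pairs (each has a corrected NA sibling) and the ID 840 row (Error is NA).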
I have an R data.frame containing one value for every quarter of an hour:
Date A B
1 2015-11-02 00:00:00 0 0 //day start
2 2015-11-02 00:15:00 0 0
3 2015-11-02 00:30:00 0 0
4 2015-11-02 00:45:00 0 0
...
96 2015-11-02 23:45:00 0 0 //day end
97 2015-11-03 00:00:00 0 0 //new day
...
6 2016-03-23 01:15:00 0 0 //last record
I use xts to construct a time series
xtsA <- xts(data$A,data$Date)
By using apply.daily I get the result I expect:
apply.daily(xtsA, sum)
Date A
1 2015-11-02 23:45:00 400
2 2015-11-03 23:45:00 400
3 2015-11-04 23:45:00 500
but apply.weekly seems to use Monday as the last day of the week:
Date A
19 2016-03-07 00:45:00 6500 //Monday
20 2016-03-14 00:45:00 5500 //Monday
21 2016-03-21 00:45:00 5000 //Monday
and I do not understand why it uses 00:45:00. Does anyone know?
Data is imported from a CSV file; the Date column looks like this:
data <- read.csv("...", header=TRUE)
Date A
1 151102 0000 0
...
The error was in the date-time parsing. Using
data$Date <- as.POSIXct(strptime(data$Date, "%y%m%d %H%M"), tz = "GMT")
solves it, and apply.weekly now returns
Date A
1 2015-11-08 23:45:00 3500 //Sunday
2 2015-11-15 23:45:00 4000 //Sunday
...
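A quick sanity check on the format string (a minimal sketch, using one of the raw strings from the sample above):

```r
# "%y%m%d %H%M" matches strings like "151102 0000" (2 Nov 2015, midnight)
strptime("151102 0000", "%y%m%d %H%M", tz = "GMT")
```

Without an explicit format, these strings were being misinterpreted, which is what shifted the week boundaries.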
I have a data set that has dates and times for in and out. Each line is an in/out pair, but some are blank. I can remove the blanks with na.omit and a clean read-in (it was a CSV, and na.strings = c("") works with read.csv).
Of course, because the real world is never like the tutorial, some of the entries are dates only, so my as.POSIXlt(Dataset$In, format="%m/%d/%Y %H:%M") returns NA on the date-only values.
na.omit does not remove these lines, so I have two questions:
1. Why doesn't na.omit work, or how can I get it to work?
2. Better: how can I convert one column into both dates and times (in the POSIX format) without two calls, or with some sort of optional parameter in the format string? (Or is this even possible?)
This is a sample of the dates and times. I can't share the real file, 1 it's huge, 2 it's PII.
Id,In,Out
1,8/15/2015 8:00,8/15/2015 17:00
1,8/16/2015 8:04,8/16/2015
1,8/17/2015 8:50,8/17/2015 18:00
1,8/18/2015,8/18/2015 17:00
2,8/15/2015,8/15/2015 13:00
2,8/16/2015 8:00,8/16/2015 17:00
3,8/15/2015 4:00,8/15/2015 11:00
3,8/16/2015 9:00,8/16/2015 19:00
3,8/17/2015,8/17/2015 17:00
3,,
4,,
4,8/16/2015 6:00,8/16/2015 20:00
DF <- read.table(text = "Id,In,Out
1,8/15/2015 8:00,8/15/2015 17:00
1,8/16/2015 8:04,8/16/2015
1,8/17/2015 8:50,8/17/2015 18:00
1,8/18/2015,8/18/2015 17:00
2,8/15/2015,8/15/2015 13:00
2,8/16/2015 8:00,8/16/2015 17:00
3,8/15/2015 4:00,8/15/2015 11:00
3,8/16/2015 9:00,8/16/2015 19:00
3,8/17/2015,8/17/2015 17:00", header = TRUE, sep = ",",
stringsAsFactors = FALSE) #set this option during import
DF$In[nchar(DF$In) < 13] <- paste(DF$In[nchar(DF$In) < 13], "0:00")
DF$Out[nchar(DF$Out) < 13] <- paste(DF$Out[nchar(DF$Out) < 13], "0:00")
DF$In <- as.POSIXct(DF$In, format = "%m/%d/%Y %H:%M", tz = "GMT")
DF$Out <- as.POSIXct(DF$Out, format = "%m/%d/%Y %H:%M", tz = "GMT")
# Id In Out
#1 1 2015-08-15 08:00:00 2015-08-15 17:00:00
#2 1 2015-08-16 08:04:00 2015-08-16 00:00:00
#3 1 2015-08-17 08:50:00 2015-08-17 18:00:00
#4 1 2015-08-18 00:00:00 2015-08-18 17:00:00
#5 2 2015-08-15 00:00:00 2015-08-15 13:00:00
#6 2 2015-08-16 08:00:00 2015-08-16 17:00:00
#7 3 2015-08-15 04:00:00 2015-08-15 11:00:00
#8 3 2015-08-16 09:00:00 2015-08-16 19:00:00
#9 3 2015-08-17 00:00:00 2015-08-17 17:00:00
na.omit doesn't work with POSIXlt objects because it is documented to "handle vectors, matrices and data frames comprising vectors and matrices (only)." (see help("na.omit")). And in the strict sense, POSIXlt objects are not vectors:
unclass(as.POSIXlt(DF$In))
#$sec
#[1] 0 0 0 0 0 0 0 0 0
#
#$min
#[1] 0 4 50 0 0 0 0 0 0
#
#$hour
#[1] 8 8 8 0 0 8 4 9 0
#
#$mday
#[1] 15 16 17 18 15 16 15 16 17
#
#$mon
#[1] 7 7 7 7 7 7 7 7 7
#
#$year
#[1] 115 115 115 115 115 115 115 115 115
#
#$wday
#[1] 6 0 1 2 6 0 6 0 1
#
#$yday
#[1] 226 227 228 229 226 227 226 227 228
#
#$isdst
#[1] 0 0 0 0 0 0 0 0 0
#
#attr(,"tzone")
#[1] "GMT"
There is hardly any reason to prefer POSIXlt over POSIXct (which is stored internally as a plain numeric vector giving the number of seconds since the origin, and thus needs less memory).
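For comparison, a POSIXct vector is an ordinary atomic vector under the hood, so na.omit() handles it directly (a minimal sketch):

```r
x <- as.POSIXct(c("2015-08-15 08:00", NA), tz = "GMT")
class(unclass(x))  # underlying storage is plain numeric
na.omit(x)         # drops the NA entry and keeps the POSIXct class
```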
You've been given a couple of strategies that bring these character values in and process them "in place". I almost never use as.POSIXlt, since there are so many pitfalls in dealing with the list-in-list structure it returns, especially its effective incompatibility with dataframes. Here's a method that does the testing and coercion at the read.* level by defining an as() method:
setOldClass("inTime", prototype="POSIXct")
setAs("character", "inTime",
function(from) structure( ifelse( is.na(as.POSIXct(from, format="%m/%d/%Y %H:%M") ),
as.POSIXct(from, format="%m/%d/%Y") ,
as.POSIXct(from, format="%m/%d/%Y %H:%M") ),
class="POSIXct" ) )
read.csv(text=txt, colClasses=c("numeric", 'inTime','inTime') )
Id In Out
1 1 2015-08-15 08:00:00 2015-08-15 17:00:00
2 1 2015-08-16 08:04:00 2015-08-16 00:00:00
3 1 2015-08-17 08:50:00 2015-08-17 18:00:00
4 1 2015-08-18 00:00:00 2015-08-18 17:00:00
5 2 2015-08-15 00:00:00 2015-08-15 13:00:00
6 2 2015-08-16 08:00:00 2015-08-16 17:00:00
7 3 2015-08-15 04:00:00 2015-08-15 11:00:00
8 3 2015-08-16 09:00:00 2015-08-16 19:00:00
9 3 2015-08-17 00:00:00 2015-08-17 17:00:00
The structure() "envelope" is needed because of the rather strange behavior of ifelse, which would otherwise return a numeric object rather than an object of class 'POSIXct'.
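That behavior is easy to demonstrate (a minimal sketch): ifelse() shapes its result from the test argument, so the class attribute of the yes/no values is dropped:

```r
x <- as.POSIXct("2015-08-15 08:00", tz = "GMT")
ifelse(TRUE, x, x)  # returns the bare number of seconds since the epoch

# wrapping the result restores the datetime class
as.POSIXct(ifelse(TRUE, x, x), origin = "1970-01-01", tz = "GMT")
```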