How to bind two xts data environments in R

I am completely new to R and I am learning how to program in R to get historical stock index data. I am planning to build a daily update script to keep historical index data current. I use an environment called "indexData" to store the data as xts objects. Unfortunately, rbind and merge do not support environments; they only support objects. I am wondering whether there are any workarounds or packages I can use to solve this. My code is the following:
library(quantmod)
indexData <- new.env()
startDate <- "2013-11-02"
getSymbols(Symbols=indexList, src='yahoo', from=startDate, to="2013-11-12", env=indexData)
startDate <- end(indexData$FTSE) + 1
NewIndexData <- new.env()
getSymbols(Symbols=indexList, src='yahoo', from=startDate, env=NewIndexData)
rbind(indexData, NewIndexData) # does not work: rbind() operates on objects, not environments
I'd much appreciate any suggestions!

If you use auto.assign=FALSE, you get explicit variables -- and those you can extend as shown below.
First we get two IBM price series:
R> IBM <- getSymbols("IBM",from="2013-01-01",to="2013-12-01",auto.assign=FALSE)
R> IBM2 <- getSymbols("IBM",from="2013-12-02",to="2013-12-28",auto.assign=FALSE)
Then we check their dates:
R> c(start(IBM), end(IBM))
[1] "2013-01-02" "2013-11-29"
R> c(start(IBM2), end(IBM2))
[1] "2013-12-02" "2013-12-27"
Finally, we merge them and check the dates of the combined series:
R> allIBM <- merge(IBM, IBM2)
R> c(start(allIBM), end(allIBM))
[1] "2013-01-02" "2013-12-27"
R>
You can approximate this by pulling the individual symbols out of the two environments, but I think that is harder for you as someone beginning in R. So my recommendation would be to not work with environments -- look into lists instead.
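If you do want to keep working with environments as in the question, a minimal sketch of that per-symbol workaround (assuming both environments hold xts objects under the same names and the date ranges do not overlap):
# combine each symbol's old and new rows into a third environment
combined <- new.env()
for (sym in ls(indexData)) {
  assign(sym,
         rbind(get(sym, envir = indexData),
               get(sym, envir = NewIndexData)),
         envir = combined)
}
head(combined$FTSE)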


Converting to Date Time Using as.POSIXct

I have a column "DateTime".
Example value: 2016-12-05-16.25.54.875000
When I import this, R reads it as a Factor.
Now, when I sort the dataset by decreasing "DateTime", the maximum DateTime is 23 June 2017. When I use DateTime = as.POSIXct(DateTime), it changes to 22 June 2017. How is this happening?
P.S. I am running this R script in Power BI.
So, some comments first. When you read strings into R, unless you specify otherwise they are imported as factors. You can use the option stringsAsFactors = FALSE when importing to prevent this.
Trying what @Disco Superfly has suggested works if you define the data as a string in R:
> a <- "2016-12-05-16.25.54.875000"
> as.POSIXct(a, format="%Y-%m-%d-%H.%M.%S")
[1] "2016-12-05 16:25:54 CET"
> as.POSIXct(a)
[1] "2016-12-05 CET"
It is not clear what you mean about the data being changed. Can you give a reproducible example?
To summarize, if your dates are strings, then what others have already suggested works perfectly. I suspect you are trying to do more than what you have explained, so I don't understand exactly what is going wrong.
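If the fractional seconds matter, a small follow-on sketch (assuming the same string layout as above): parse them with %OS and raise digits.secs so they are displayed:
options(digits.secs = 3)
a <- "2016-12-05-16.25.54.875000"
# %OS reads seconds together with their fractional part
as.POSIXct(a, format = "%Y-%m-%d-%H.%M.%OS")
# e.g. "2016-12-05 16:25:54.875 CET" (timezone depends on your system)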

Obtain a package's first version's date of publication

Documentation of R packages only includes the date of the last update/publication.
Version numbering does not follow a pattern common to all packages.
Therefore, it is quite difficult to know at a glance whether a package is old or new. Sometimes you need to decide between two packages with similar functions, and knowing the age of a package could guide the decision.
My first approach was to plot downloads per year by tracking CRAN downloads. This method also provides the relative popularity/usage of a package. However, it requires a lot of memory and time to run, so I would rather have a faster way to look into the history of one package.
Is there a quick way to know or visualize the date of the first release of one specific package, or even to compare several packages at once?
The purpose is to facilitate a mental map of all available packages in R, especially for newcomers. Getting to know packages and managing them is probably the main reason people give up on R.
Just for fun:
## not all repositories have the same archive structure!
archinfo <- function(pkgname, repos="http://www.cran.r-project.org") {
    pkg.url <- paste(contrib.url(repos), "Archive", pkgname, sep="/")
    r <- readLines(pkg.url)
    ## lame scraping code
    r2 <- gsub("<[^>]+>", " ", r)                ## drop HTML tags
    r2 <- r2[-(1:grep("Parent Directory", r2))]  ## drop header
    r2 <- r2[grep(pkgname, r2)]                  ## drop footer
    strip.white <- function(x) gsub("(^ +| +$)", "", x)
    r2 <- strip.white(gsub("&nbsp;", "", r2))    ## more cleaning
    r3 <- do.call(rbind, strsplit(r2, " +"))     ## pull out data frame
    data.frame(
        pkgvec=gsub(paste0("(", pkgname, "_|\\.tar\\.gz)"), "", r3[,1]),
        pkgdate=as.Date(r3[,2], format="%d-%b-%Y"),
        ## assumes English locale for month abbreviations
        size=r3[,4])
}
AERinfo <- archinfo("AER")
lme4info <- archinfo("lme4")
comb <- rbind(data.frame(pkg="AER", AERinfo),
              data.frame(pkg="lme4", lme4info))
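To answer the original question directly, the first release date is then simply the earliest pkgdate the function returns, e.g.:
# date of the earliest archived release of each package
min(AERinfo$pkgdate)
min(lme4info$pkgdate)
(Keep in mind the Archive directory holds superseded versions only; a package with a single release has no archive entry.)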
We can't compare version numbers directly because everyone uses different numbering schemes ...
library(dplyr) ## overkill
comb2 <- comb %>% group_by(pkg) %>% mutate(numver=seq(n()))
If you want to arrange by package date:
comb2 <- arrange(comb2,pkg,pkgdate)
Pretty pictures ...
library(ggplot2); theme_set(theme_bw())
ggplot(comb2,aes(x=pkgdate,y=numver,colour=pkg))+geom_line()
As Andrew Taylor suggested, the CRAN Archive contains all previous versions, with the date of each release indicated.

Loading intraday data into R for handling it with quantmod

I need to modify this example code to use it with intraday data, which I should get from here and from here. As I understand it, the code in that example works well with any historical data (or not?), so my problem boils down to loading the initial data in the necessary format (I mean daily or intraday).
As I also understand from answers to this question, it is impossible to load intraday data with getSymbols(). I tried to download the data to my hard drive and then read it with read.csv(), but this approach didn't work either. Finally, I found a few solutions to this problem in various articles (e.g. here), but all of them seem very complicated and "artificial".
So, my question is how to load the given intraday data into the given code elegantly and correctly from a programmer's point of view, without reinventing the wheel.
P.S. I am very new to time-series analysis in R and quantstrat, so if my question seems obscure, let me know what else you need in order to answer it.
I don't know how to do this without "reinventing the wheel" because I'm not aware of any existing solutions. It's pretty easy to do with a custom function though.
intradataYahoo <- function(symbol, ...) {
    # ensure xts is available
    stopifnot(require(xts))
    # construct URL
    URL <- paste0("http://chartapi.finance.yahoo.com/instrument/1.0/",
                  symbol, "/chartdata;type=quote;range=1d/csv")
    # read the metadata from the top of the file and put it into a usable list
    metadata <- readLines(paste(URL, collapse=""), 17)[-1L]
    # split into name/value pairs, set the names as the first element of the
    # result and the values as the remaining elements
    metadata <- strsplit(metadata, ":")
    names(metadata) <- sub("-", "_", sapply(metadata, `[`, 1))
    metadata <- lapply(metadata, function(x) strsplit(x[-1L], ",")[[1]])
    # convert GMT offset to numeric
    metadata$gmtoffset <- as.numeric(metadata$gmtoffset)
    # read data into an xts object; timestamps are in GMT, so we don't set it
    # explicitly. I would set it explicitly, but timezones are provided in
    # an ambiguous format (e.g. "CST", "EST", etc).
    Data <- as.xts(read.zoo(paste(URL, collapse=""), sep=",", header=FALSE,
                            skip=17, FUN=function(i) .POSIXct(as.numeric(i))))
    # set column names and metadata (as xts attributes)
    colnames(Data) <- metadata$values[-1L]
    xtsAttributes(Data) <- metadata[c("ticker", "Company_Name",
                                      "Exchange_Name", "unit", "timezone", "gmtoffset")]
    Data
}
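A usage sketch, assuming the chartapi endpoint is still reachable (it may since have been retired by Yahoo):
SPY <- intradataYahoo("SPY")
head(SPY)
str(xtsAttributes(SPY))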
I'd consider adding something like this to quantmod, but it would need to be tested. I wrote this in under 15 minutes, so I'm sure there will be some issues.

Processing Accelerometer Data in R

I have an accelerometer data log; each record contains (accX, accY, accZ, timestamp).
I am confused about how to process it in R.
I have two questions:
Is there any library for handling this data? I want to plot it like time-series data and analyze it.
How do I process the timestamp? The timestamps are not whole seconds; they have millisecond resolution.
Can anyone shed some light?
Thank you in advance.
To answer your two specific questions:
1) There are many packages for time series analysis. See here: http://cran.r-project.org/web/views/TimeSeries.html
2) There are many ways to process time-stamp data. @bjoseph gives you some very good advice in his response. The lubridate package (http://cran.r-project.org/web/packages/lubridate/index.html) is very good at handling time data, with somewhat more sensible functions than the POSIX set. ggplot2 (http://ggplot2.org/) plots time-series data quite sensibly as well.
The classes POSIXlt and POSIXct exist in R for manipulating calendar times and dates. See more information here: http://stat.ethz.ch/R-manual/R-devel/library/base/html/as.POSIXlt.html
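For instance, a minimal lubridate sketch (assuming a timestamp whose milliseconds follow a decimal point rather than a colon):
library(lubridate)
options(digits.secs = 3)
ymd_hms("2014-07-09 15:03:33.252")  # POSIXct, UTC by default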
In your specific case you may need to modify your system options to display milliseconds:
test <- "2014-07-09 15:03:33:252"
test
[1]"2014-07-09 15:03:33:252"
options("digits.secs"=6)
Sys.time()
[1] "2014-07-23 11:16:32.480932 EDT"
as.POSIXlt(test)
[1] "2014-07-09 15:03:33 EDT"
test
[1] "2014-07-09 15:03:33:252"
time.test <- as.POSIXlt(test)
class(time.test)
[1] "POSIXlt" "POSIXt"
time.test
[1] "2014-07-09 15:03:33 EDT"
To apply this to the entire column of your data.table you can run:
dt$timecolumn <- as.POSIXlt(dt$timecolumn)
where timecolumn is the name of the column with the times in it.
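Note that as.POSIXlt() silently dropped the ":252" in the transcript above. A sketch for actually keeping the milliseconds (assuming the "HH:MM:SS:mmm" layout shown): turn the final colon into a decimal point and parse with %OS:
options(digits.secs = 3)
test <- "2014-07-09 15:03:33:252"
test2 <- sub(":(\\d+)$", ".\\1", test)  # "2014-07-09 15:03:33.252"
as.POSIXct(test2, format = "%Y-%m-%d %H:%M:%OS")
The same sub() call can be applied to the whole column before converting it.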
If you need help importing the data from Excel, a good starting point:
> library(gdata) # load gdata package
> help(read.xls) # documentation
> mydata = read.xls("mydata.xls") # read from first sheet

Downloading FRED data with quantmod: can dates be specified?

I am downloading data from FRED with the quantmod library (author Jeffrey A. Ryan). With Yahoo and Google data, I am able to set start and end dates. Can the same be done for FRED data?
The help page does not list "from" and "to" as options of quantmod's getSymbols function, from which I'm inferring that it is not currently possible.
Is there a way to set a range for the data to be downloaded or do I need to download the entire dataset and discard the data I don't need?
Thanks for your help. Below is the code that illustrates the context:
The dates are ignored when downloading from FRED:
# environment in which to store data
data <- new.env()
# set dates
date.start <- "2000-01-01"
date.end <- "2012-12-31"
# set tickers
tickers <- c("FEDFUNDS", "GDPPOT", "DGS10")
# import data from FRED database
library("quantmod")
getSymbols(tickers,
           src  = "FRED",      # needed!
           from = date.start,  # ignored
           to   = date.end,    # ignored
           env  = data,
           adjust = TRUE)
head(data$FEDFUNDS)
FEDFUNDS
1954-07-01 0.80
1954-08-01 1.22
1954-09-01 1.06
1954-10-01 0.85
1954-11-01 0.83
1954-12-01 1.28
EDIT: Solution
Thanks to GSee's suggestion below, I am using the following code to subset the data to within the range of dates specified above:
# subset data to within time range
dtx <- data$FEDFUNDS
dtx[paste(date.start,date.end,sep="/")]
Here I extracted the xts data from the environment before acting upon it. My follow-up question explores alternatives.
Follow-Up Question
I have asked some follow-up questions there: get xts objects from within an environment
You have to download all the data and subset later. getSymbols.FRED does not support the from argument like getSymbols.yahoo does.
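A small sketch of applying that subset across the whole environment in one pass, using eapply() (names and dates are those from the question; the result is a plain list of xts objects):
rng <- paste(date.start, date.end, sep = "/")
subsetted <- eapply(data, function(x) x[rng])
head(subsetted$FEDFUNDS)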
Alternatively, you can download FRED data from Quandl (http://www.quandl.com/help/r), which offers more than 4 million datasets, including all of the FRED data. There is an API and an R package available ("Quandl"). Data can be returned in several formats, e.g. data frame ("raw"), ts ("ts"), zoo ("zoo"), and xts ("xts").
For example, to download GDPPOT, specify the dates, and have it returned as an xts object, all you have to do is:
require(Quandl)
mydata = Quandl("FRED/GDPPOT", start_date="2005-01-03",end_date="2013-04-10",type="xts")
Quandl doesn't seem to offer all data from FRED, at least in terms of data frequency. Quandl most likely offers only annual data, which is not useful in many circumstances.
