I am very new to R. I watched a YouTube video that walks through various time series analyses, but it downloaded its data from Yahoo, and my data is in Excel. I wanted to follow the same analysis with data from a .csv file exported from Excel. I spent two days finding out that the dates must be in US style. Now I am stuck again on a basic step: loading the data so it can be analysed. This seems to be the biggest hurdle with R. Can someone please explain why the command shown below does not compute the returns for the complete set of columns? I tried the zoo format, but it didn't work; then I tried xts and it worked only partially. I suspect the original import from Excel is the major problem.
> AllPrices <- as.zoo(AllPrices)
> head(AllPrices)
Index1 Index2 Index3 Index4 Index5 Index6 Index7 Index8 Index9 Index10
> AllRets <- dailyReturn(AllPrices)
Error in NextMethod("[<-") : incorrect number of subscripts on matrix
> AllPrices<- as.xts(AllPrices)
> AllRets <- dailyReturn(AllPrices)
> head(AllRets)
daily.returns
2012-11-06 0.000000e+00
2012-11-07 -2.220249e-02
2012-11-08 1.379504e-05
2012-11-09 2.781961e-04
2012-11-12 -2.411128e-03
2012-11-13 7.932869e-03
Try to load your data using the readr package.
library(readr)
Then, look at the documentation by running ?read_csv in the console.
I recommend reading in your data this way. Specify the column types. For instance, if your first column is the date, read it in as a character "c" and if your other columns are numeric use "n".
library(xts) # provides xts()
data <- read_csv('YOUR_DATA.csv', col_types = "cnnnnn") # date in left column, 5 numeric columns
# make the dates column a Date class; change "Dates" to the name of your date column and adjust the format if needed
data$Dates <- as.Date(data$Dates, format = "%Y-%m-%d")
data <- as.data.frame(data) # turn the result into a data frame
data <- xts(data[,-1], order.by = data[,1]) # then make an xts: data is everything but the date column, order.by is the date column
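On the original question of why only one column of returns comes back: the output above suggests that dailyReturn produces a single daily.returns series even when the input has several columns. A minimal sketch of one way to get returns for every column, assuming the prices are already in an xts object, is to divide each price by the previous day's price. The column names and prices here are made up for illustration.

```r
library(xts)

# Hypothetical prices for two indices over three days (illustrative values only)
dates  <- as.Date(c("2012-11-06", "2012-11-07", "2012-11-08"))
prices <- xts(cbind(Index1 = c(100, 110, 99),
                    Index2 = c(50, 50, 55)),
              order.by = dates)

# For xts objects, lag(x, k = 1) moves each value to the next date,
# so this is today's price over the previous price, minus 1, per column
all_rets <- prices / stats::lag(prices, k = 1) - 1

# The first row is NA because there is no previous price
print(all_rets)
```

This keeps one return column per price column, so the result can be passed on as a single multi-column xts object.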
I have a dataset with, let's say, 2 variables. I want to do some regression testing, but quite a few of the numeric observations are "NULL". I want to keep these as a value, but I don't want to convert them to a specific number, e.g. 99999.
I have tried the different approaches I found by googling, and none of them work.
Benny2 <- read_excel("C:/Users/EH9508/Desktop/Benny2.xlsx")
I have two variables, "Days" and "Amount"; both contain numeric values and "NULL".
Any help would be appreciated.
You can convert the file/sheet to csv from Excel (save as > csv) and then:
mydata <- read.csv("path/to/file.csv")
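If you don't have access to Excel, this is how it goes with the xlsx library (read.xlsx needs to know which sheet to read, so pass sheetIndex or sheetName):
library("xlsx")
mydata <- read.xlsx("path/to/file.xlsx", sheetIndex = 1)
If you put the csv/xlsx file in the same folder as your R script, and that folder is your working directory (check with getwd()), you can type the file name without the path, as in read.xlsx("file.xlsx", sheetIndex = 1).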
If you already have your data in R, with the missing values stored as NA, and are wondering how to convert them to a given value, try this:
mydata <- matrix(rnorm(10),5,2) # Your data
mydata[2,1] <- NA # Some NA
mydata[5,2] <- NA
mydata[is.na(mydata)] <- 99999 # Replaces every NA in mydata with 99999
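If the goal is regression, another option is to read the "NULL" cells in as NA at import time rather than replacing them with a number. This is a minimal sketch using base R's read.csv and its na.strings argument. The inline data is made up, and the column names Days and Amount are taken from the question. readxl's read_excel has a similar na argument.

```r
# Made-up data standing in for the spreadsheet; "NULL" marks a missing cell
csv_text <- "Days,Amount
1,10
2,NULL
NULL,30
4,41
5,49"

# na.strings tells read.csv which strings to treat as missing (NA)
benny <- read.csv(text = csv_text, na.strings = "NULL")
str(benny)  # both columns are read as numbers, with NA where "NULL" was

# lm() drops rows containing NA under the default na.action (na.omit)
fit <- lm(Amount ~ Days, data = benny)
print(nobs(fit))  # number of complete rows actually used in the fit
```

Keeping the gaps as NA means the missing cells never enter the regression as a fake number such as 99999.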
I'm trying to create an xts object, but after formatting on load I have a different number of rows for the dates than for the data. My data has many columns with varying numbers of rows, anywhere from 20 to 200. I want to create a separate variable after loading, and which one will depend on the composite I want to look at. So I want a full data.frame with NAs first, and then create a variable on which I call na.omit to reduce the dimensions.
Here is the code:
#load file with desired composite
allcomposites <- read.csv("Composites 2014.08.31.csv", header = T)
compositebench <- allcomposites[1, 2:ncol(allcomposites)]
dates1 <- as.Date(allcomposites$Name, format = "%m/%d/%Y")
allcomposites <- as.data.frame(lapply(allcomposites[2:nrow(allcomposites),2:ncol(allcomposites)], as.numeric))
allcomposites <- as.xts(allcomposites, order.by = dates1)
## Error in xts(x, order.by = order.by, frequency = frequency, ...) :
## NROW(x) must match length(order.by)
Edit to show what allcomposites looks like:
Name Composite1 Composite2 Composite3 Composite4 Composite5
Bmark 229 229 982 612 995
8/31/2014 0.9979 0.9404 4.3808 3.9296
7/31/2014 -0.4563 -0.3038 -1.7817 -1.7248
6/30/2014 0.205 0.2234 2.2184 2.7304
5/31/2014 1.311 1.5771 3.4824 1.7601
4/30/2014 0.9096 1.0187 -1.9195 1.2964
You need to be more careful when removing the first row: you dropped it from allcomposites but not from dates1, so dates1 has one more element than allcomposites has rows.
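A minimal sketch of that fix applied to the question's own approach, assuming a comma-separated file with the layout shown in the question (the inline data here is a shortened, made-up stand-in): drop the "Bmark" row from the dates and from the values in the same way, so both end up with the same length.

```r
library(xts)

# Shortened stand-in for the CSV file; the first data row is the benchmark row
csv_text <- "Name,Composite1,Composite2
Bmark,229,229
8/31/2014,0.9979,0.9404
7/31/2014,-0.4563,-0.3038"

allcomposites <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)

# Drop the first row from BOTH the dates and the values
dates1 <- as.Date(allcomposites$Name[-1], format = "%m/%d/%Y")
values <- as.data.frame(lapply(allcomposites[-1, -1], as.numeric))

# Row count and date count now match, so xts() no longer complains
composites_xts <- xts(as.matrix(values), order.by = dates1)
print(composites_xts)
```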
Here's another way to accomplish your goal:
Lines <- "Name Composite1 Composite2 Composite3 Composite4 Composite5
Bmark 229 229 982 612 995
8/31/2014 0.9979 0.9404 4.3808 3.9296
7/31/2014 -0.4563 -0.3038 -1.7817 -1.7248
6/30/2014 0.205 0.2234 2.2184 2.7304
5/31/2014 1.311 1.5771 3.4824 1.7601
4/30/2014 0.9096 1.0187 -1.9195 1.2964"
library(xts)
# use fill=TRUE because you only provided data for 4 composites
allcomp <- read.table(text=Lines, header=TRUE, fill=TRUE)
# remove the first row that contains "Bmark"
allcomp <- allcomp[-1,]
# create an xts object from the remaining data
allcomp_xts <- xts(allcomp[,-1], as.Date(allcomp[,1], "%m/%d/%Y"))
## Error in xts(x, order.by = order.by, frequency = frequency, ...
## NROW(x) must match length(order.by)
I wasted hours running into this error. Regardless of whether or not I had the exact same problem, I'll show how I solved for this error message in case it saves you the pain I had.
I imported an Excel or CSV file (tried both) through several importing functions, then tried to convert my data (as either a data.frame or a zoo object) into an xts object and kept getting errors, this one included.
I tried creating a vector of dates separately to pass in as the order.by parameter. I tried making sure the date vector and the rows of the data.frame were the same length. Sometimes it worked and sometimes it didn't, for reasons I can't explain. Even when it did work, R had "coerced" all my numeric data into character data. (This caused me endless problems later. Watch for coercion, I learned.)
These errors kept happening until:
For xts conversion I used the date column from the imported Excel sheet as the order.by parameter with an as.Date() modifier, AND I *dropped the date column during the conversion to xts.*
Here's the working code:
library(readxl) # provides read_excel()
library(xts)
xl_sheet <- read_excel("../path/to/my_excel_file.xlsx")
sheet_xts <- xts(xl_sheet[-1], order.by = as.Date(xl_sheet$date))
Note my date column was the first column, so the xl_sheet[-1] removed the first column.
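The coercion mentioned above can be reproduced without any Excel file. This sketch, using a made-up data frame in place of the imported sheet, shows why leaving the date column in the data turns everything into character, and why dropping it keeps the values numeric. The names sheet, price, and volume are illustrative only.

```r
library(xts)

# Made-up stand-in for an imported sheet with a date column first
sheet <- data.frame(date = c("2020-01-01", "2020-01-02", "2020-01-03"),
                    price = c(1.5, 2.5, 3.5),
                    volume = c(10, 20, 30),
                    stringsAsFactors = FALSE)

# An xts object stores its data in a matrix, and a matrix holds one type only,
# so mixing a date column with numbers forces everything to character
with_date <- as.matrix(sheet)
print(mode(with_date))

# Dropping the date column and passing it as order.by keeps the data numeric
sheet_xts <- xts(as.matrix(sheet[-1]), order.by = as.Date(sheet$date))
print(is.numeric(coredata(sheet_xts)))
```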
I'm currently downloading stock data using getSymbols from the quantmod package, calculating the daily stock returns, and then combining the data into a data frame. I would like to do this for a very large set of stock symbols. See the example below. Instead of doing this manually, I would like to use a for loop if possible, or maybe one of the apply functions, but I cannot find the solution.
This is what I currently do:
Symbols<-c ("XOM","MSFT","JNJ","GE","CVX","WFC","PG","JPM","VZ","PFE","T","IBM","MRK","BAC","DIS","ORCL","PM","INTC","SLB")
length(Symbols)
#daily returns for selected stocks & SP500 Index
SP500<-as.xts(dailyReturn(na.omit(getSymbols("^GSPC",from=StartDate,auto.assign=FALSE))))
S1<-as.xts(dailyReturn(na.omit(getSymbols(Symbols[1],from=StartDate,auto.assign=FALSE))))
S2<-as.xts(dailyReturn(na.omit(getSymbols(Symbols[2],from=StartDate,auto.assign=FALSE))))
S3<-as.xts(dailyReturn(na.omit(getSymbols(Symbols[3],from=StartDate,auto.assign=FALSE))))
S4<-as.xts(dailyReturn(na.omit(getSymbols(Symbols[4],from=StartDate,auto.assign=FALSE))))
S5<-as.xts(dailyReturn(na.omit(getSymbols(Symbols[5],from=StartDate,auto.assign=FALSE))))
S6<-as.xts(dailyReturn(na.omit(getSymbols(Symbols[6],from=StartDate,auto.assign=FALSE))))
S7<-as.xts(dailyReturn(na.omit(getSymbols(Symbols[7],from=StartDate,auto.assign=FALSE))))
S8<-as.xts(dailyReturn(na.omit(getSymbols(Symbols[8],from=StartDate,auto.assign=FALSE))))
S9<-as.xts(dailyReturn(na.omit(getSymbols(Symbols[9],from=StartDate,auto.assign=FALSE))))
S10<-as.xts(dailyReturn(na.omit(getSymbols(Symbols[10],from=StartDate,auto.assign=FALSE))))
....
S20<-as.xts(dailyReturn(na.omit(getSymbols(Symbols[20],from=StartDate,auto.assign=FALSE))))
SPportD<-cbind(SP500,S1,S2,S3,S4,S5,S6,S7,S8,S9,S10,S11,S12,S13,S14,S15,S16,S17,S18,S19,S20)
names(SPportD)[1:(length(Symbols)+1)]<-c("SP500",Symbols)
SPportD.df<-data.frame(index(SPportD),coredata(SPportD),stringsAsFactors=FALSE)
names(SPportD.df)[1:(length(Symbols)+2)]<-c(class(StartDate),"SP500",Symbols)
Any suggestions?
Thanks!
dailyReturn uses close prices, so I would recommend you either use a different function (e.g. TTR::ROC on the Adjusted column), or adjust the close prices for dividends/splits (using adjustOHLC) before calling dailyReturn.
library(quantmod)
Symbols <- c("XOM","MSFT","JNJ","GE","CVX","WFC","PG","JPM","VZ","PFE",
"T","IBM","MRK","BAC","DIS","ORCL","PM","INTC","SLB")
# create environment to load data into
Data <- new.env()
getSymbols(c("^GSPC",Symbols), from="2007-01-01", env=Data)
# calculate returns, merge, and create data.frame (eapply loops over all
# objects in an environment, applies a function, and returns a list)
Returns <- eapply(Data, function(s) ROC(Ad(s), type="discrete"))
ReturnsDF <- as.data.frame(do.call(merge, Returns))
# adjust column names and re-order columns
colnames(ReturnsDF) <- gsub(".Adjusted","",colnames(ReturnsDF))
ReturnsDF <- ReturnsDF[,c("GSPC",Symbols)]
lapply is your friend:
Stocks = lapply(Symbols, function(sym) {
dailyReturn(na.omit(getSymbols(sym, from=StartDate, auto.assign=FALSE)))
})
Then to merge:
do.call(merge, Stocks)
Similar application for the other assignments
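The same lapply and do.call(merge, ...) pattern can be tried offline. This is a sketch under stated assumptions: the tickers are hypothetical and the prices are simulated rather than downloaded with getSymbols, so only the looping and merging steps carry over to real data.

```r
library(xts)

symbols <- c("AAA", "BBB", "CCC")        # hypothetical tickers
dates   <- as.Date("2020-01-01") + 0:4   # five consecutive days

set.seed(1)
returns_list <- lapply(symbols, function(sym) {
  # Simulated prices stand in for the getSymbols() download
  prices <- xts(100 + cumsum(rnorm(5)), order.by = dates)
  rets   <- prices / stats::lag(prices, k = 1) - 1
  colnames(rets) <- sym                  # label the column with its symbol
  rets
})

# Merge the list of one-column xts objects into one multi-column object
returns <- do.call(merge, returns_list)

# Convert to a data frame with an explicit date column
returns_df <- data.frame(Date = index(returns), coredata(returns))
print(returns_df)
```

Naming each column inside the function avoids renaming the merged result by position afterwards.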
Packages are quantmod for data download and PerformanceAnalytics for analysis/plotting.
Care must be taken with time series date alignment.
Code
require(quantmod)
require(PerformanceAnalytics)
Symbols<-c ("XOM","MSFT","JNJ","GE","CVX","WFC","PG","JPM","VZ","PFE","T","IBM","MRK","BAC","DIS","ORCL","PM","INTC","SLB")
length(Symbols)
#Set start date
start_date=as.Date("2014-01-01")
#Create New environment to contain stock price data
dataEnv<-new.env()
#download data
getSymbols(Symbols,env=dataEnv,from=start_date)
#You have 19 symbols, the time series data for all the symbols might not be aligned
#Load Systematic investor toolbox for helpful functions
setInternet2(TRUE) # Windows-only helper from older R versions; skip this line if the function is not available
con = gzcon(url('https://github.com/systematicinvestor/SIT/raw/master/sit.gz', 'rb'))
source(con)
close(con)
#helper function for extracting Closing price of getsymbols output and for date alignment
bt.prep(dataEnv,align='remove.na')
#Now all your time series are correctly aligned
#prices data
stock_prices = dataEnv$prices
head(stock_prices[,1:3])
# head(stock_prices[,1:3])
# BAC CVX DIS
#2014-01-02 16.10 124.14 76.27
#2014-01-03 16.41 124.35 76.11
#2014-01-06 16.66 124.02 75.82
#2014-01-07 16.50 125.07 76.34
#2014-01-08 16.58 123.29 75.22
#2014-01-09 16.83 123.29 74.90
#calculate returns
stock_returns = Return.calculate(stock_prices, method = c("discrete"))
head(stock_returns[,1:3])
# head(stock_returns[,1:3])
# BAC CVX DIS
#2014-01-02 NA NA NA
#2014-01-03 0.019254658 0.001691638 -0.002097810
#2014-01-06 0.015234613 -0.002653800 -0.003810275
#2014-01-07 -0.009603842 0.008466376 0.006858349
#2014-01-08 0.004848485 -0.014232030 -0.014671208
#2014-01-09 0.015078408 0.000000000 -0.004254188
#Plot Performance for first three stocks
charts.PerformanceSummary(stock_returns[,1:3],main='Stock Absolute Performance',legend.loc="bottomright")
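The date alignment step can also be illustrated with xts alone. This is not the bt.prep helper used above, just a sketch of the same idea on two made-up series with different trading dates: merge them on the union of dates, then keep only the dates where every series has a value.

```r
library(xts)

# Two made-up price series; B has no observation on 2020-01-02
a <- xts(c(1, 2, 3),
         order.by = as.Date(c("2020-01-01", "2020-01-02", "2020-01-03")))
b <- xts(c(10, 30),
         order.by = as.Date(c("2020-01-01", "2020-01-03")))
colnames(a) <- "A"
colnames(b) <- "B"

# merge() lines the series up by date and fills the gaps with NA
merged <- merge(a, b)

# na.omit() keeps only the dates present in every series
aligned <- na.omit(merged)
print(aligned)
```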
I'm trying to automate some seasonal adjustment with the x12 package. To do this I need a ts object. However, I do not need a simple ts object, but one whose start date and frequency have been set. For any given series I could type that in, but I will be feeding in a mix of monthly and weekly data. I can get the data from quantmod as an xts object, but can't seem to figure out how to extract the frequency from the xts.
Here is some sample code that works the whole way through, but I would like to pull the frequency info from the xts rather than set it explicitly:
getSymbols("WILACR3URN",src="FRED", from="2000-01-01") # get data as an XTS
lax <- WILACR3URN #shorten name
laxts <- ts(lax$WILACR3URN, start=c(2000,1), frequency=12) #explicitly it works
plot.ts(laxts)
x12out <- x12(laxts,x12path="c:\\x12arima\\x12a.exe",transform="auto", automdl=TRUE)
laxadj <- as.ts(x12out$d11) # extract seasonally adjusted series
Any suggestions? Or is it not possible and I should determine/feed the frequency explicitly?
Thanks
This is untested for this specific case, but try using xts::periodicity for the frequency:
freq <- switch(periodicity(lax)$scale,
daily=365,
weekly=52,
monthly=12,
quarterly=4,
yearly=1)
And use the year and mon elements of POSIXlt objects to calculate the start year and month.
pltStart <- as.POSIXlt(start(lax))
Start <- c(pltStart$year+1900,pltStart$mon+1)
laxts <- ts(lax$WILACR3URN, start=Start, frequency=freq)
plot.ts(laxts)
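The approach above can be checked offline on a simulated monthly series in place of the FRED download. This is a sketch only. Note also that the second element of start is the period within the year, so the month-based value used here fits monthly data; weekly data would presumably need a week number instead.

```r
library(xts)

# Simulated monthly series standing in for the downloaded data
monthly <- xts(1:24,
               order.by = seq(as.Date("2000-01-01"), by = "month",
                              length.out = 24))

# Map the periodicity reported by xts to a ts frequency
freq <- switch(periodicity(monthly)$scale,
               daily = 365,
               weekly = 52,
               monthly = 12,
               quarterly = 4,
               yearly = 1)

# Build the ts start vector from the first index value
first_date <- as.POSIXlt(start(monthly))
start_vec  <- c(first_date$year + 1900, first_date$mon + 1)

monthly_ts <- ts(as.numeric(coredata(monthly)), start = start_vec,
                 frequency = freq)
print(monthly_ts)
```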
The xts::periodicity suggestion was helpful to me. I've also found the following approach using xts::convertIndex works well for monthly and quarterly data. It is untested for weekly data.
require("quantmod")
require("dplyr")
getSymbols("WILACR3URN",src="FRED", from="2000-01-01") # get data as an XTS
lax <- WILACR3URN #shorten name
laxts <- lax %>%
convertIndex("yearmon") %>% # change index of xts object
as.ts(start = start(.), end = end(.)) # convert to ts
plot.ts(laxts)