I'm trying to create an xts object, but after formatting the data on load I end up with a different number of rows for the dates than for the data. My data has many columns with varying numbers of rows, anywhere from 20 to 200. I want to create a separate variable after loading, and that variable will depend on the composite I want to look at, so I want a full data.frame with NAs first. I will then na.omit the chosen variable to reduce its dimensions.
Here is the code:
#load file with desired composite
allcomposites <- read.csv("Composites 2014.08.31.csv", header = T)
compositebench <- allcomposites[1, 2:ncol(allcomposites)]
dates1 <- as.Date(allcomposites$Name, format = "%m/%d/%Y")
allcomposites <- as.data.frame(lapply(allcomposites[2:nrow(allcomposites),2:ncol(allcomposites)], as.numeric))
allcomposites <- as.xts(allcomposites, order.by = dates1)
## Error in xts(x, order.by = order.by, frequency = frequency, ...) :
## NROW(x) must match length(order.by)
Edit to show what allcomposites looks like:
Name Composite1 Composite2 Composite3 Composite4 Composite5
Bmark 229 229 982 612 995
8/31/2014 0.9979 0.9404 4.3808 3.9296
7/31/2014 -0.4563 -0.3038 -1.7817 -1.7248
6/30/2014 0.205 0.2234 2.2184 2.7304
5/31/2014 1.311 1.5771 3.4824 1.7601
4/30/2014 0.9096 1.0187 -1.9195 1.2964
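You need to remove the first row from dates1 as well as from allcomposites. As written, dates1 still contains an entry for the "Bmark" row (which as.Date() converts to NA), so it is one element longer than the data.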
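To make the mismatch concrete, here is a minimal sketch using a small made-up data frame. The `raw` object and its values are stand-ins for the CSV, not the asker's file, and the sketch assumes the "Bmark" row is the first data row, as in the sample shown.

```r
library(xts)

# Hypothetical stand-in for the CSV: the first data row is the "Bmark" row
raw <- data.frame(
  Name = c("Bmark", "8/31/2014", "7/31/2014", "6/30/2014"),
  Composite1 = c(229, 0.9979, -0.4563, 0.205),
  Composite2 = c(229, 0.9404, -0.3038, 0.2234),
  stringsAsFactors = FALSE
)

dates_all <- as.Date(raw$Name, format = "%m/%d/%Y")  # "Bmark" becomes NA
values <- raw[-1, -1]                                # drops the Bmark row and the Name column

length(dates_all)  # 4 entries
nrow(values)       # 3 rows, hence the NROW/length mismatch

# Drop the same row from the dates before building the xts object
dates_ok <- dates_all[-1]
composites_xts <- xts(values, order.by = dates_ok)
```

Checking `length()` against `nrow()` right before the `xts()` call is a cheap way to catch this class of error early.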
Here's another way to accomplish your goal:
Lines <- "Name Composite1 Composite2 Composite3 Composite4 Composite5
Bmark 229 229 982 612 995
8/31/2014 0.9979 0.9404 4.3808 3.9296
7/31/2014 -0.4563 -0.3038 -1.7817 -1.7248
6/30/2014 0.205 0.2234 2.2184 2.7304
5/31/2014 1.311 1.5771 3.4824 1.7601
4/30/2014 0.9096 1.0187 -1.9195 1.2964"
library(xts)
# use fill=TRUE because you only provided data for 4 composites
allcomp <- read.table(text=Lines, header=TRUE, fill=TRUE)
# remove the first row that contains "Bmark"
allcomp <- allcomp[-1,]
# create an xts object from the remaining data
allcomp_xts <- xts(allcomp[,-1], as.Date(allcomp[,1], "%m/%d/%Y"))
## Error in xts(x, order.by = order.by, frequency = frequency, ...
## NROW(x) must match length(order.by)
I wasted hours running into this error. Regardless of whether or not I had the exact same problem, I'll show how I solved for this error message in case it saves you the pain I had.
I imported an Excel or CSV file (tried both) through several importing functions, then tried to convert my data (as either a data.frame or a zoo object) into an xts object and kept getting errors, this one included.
I tried creating a vector of dates separately to pass in as the order.by parameter. I tried making sure the date vector and the rows of the data.frame were the same length. Sometimes it worked and sometimes it didn't, for reasons I can't explain. Even when it did work, R had "coerced" all my numeric data into character data. (This caused me endless problems later. Watch for coercion, I learned.)
These errors kept happening until:
For xts conversion I used the date column from the imported Excel sheet as the order.by parameter with an as.Date() modifier, AND I *dropped the date column during the conversion to xts.*
Here's the working code:
library(readxl) # provides read_excel()
library(xts)
xl_sheet <- read_excel("../path/to/my_excel_file.xlsx")
sheet_xts <- xts(xl_sheet[-1], order.by = as.Date(xl_sheet$date))
Note my date column was the first column, so the xl_sheet[-1] removed the first column.
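The coercion problem mentioned above can be reproduced in a few lines. This is only a sketch with made-up data (the `sheet` data frame and its column names are hypothetical), comparing what happens when the date column is kept versus dropped.

```r
library(xts)

# Hypothetical data standing in for the imported sheet
sheet <- data.frame(
  date  = as.Date(c("2020-01-01", "2020-01-02", "2020-01-03")),
  price = c(10.5, 11.0, 10.8)
)

# Keeping the date column: xts stores one matrix, so every value
# is converted to character to accommodate the dates
with_date <- xts(sheet, order.by = sheet$date)
is.character(coredata(with_date))

# Dropping the date column keeps the remaining data numeric
without_date <- xts(sheet[-1], order.by = sheet$date)
is.numeric(coredata(without_date))
```

Because an xts object holds a single matrix, all columns must share one type, which is why one non-numeric column is enough to turn everything into character.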
I am very new to R. I watched a YouTube video showing various time series analyses, but it downloaded data from Yahoo, and my data is in Excel. I wanted to follow the same analysis with data from a CSV file exported from Excel. I spent two days finding out that the date must be in USA style. Now I am stuck again on a basic step, loading the data so it can be analysed, which seems to be the biggest hurdle with R. Can someone please explain why the command shown below does not compute the returns for the complete set of columns? I tried the zoo format, but it didn't work; then I tried xts and it worked partially. I suspect the original import from Excel is the major problem. Can I get some guidance, please?
> AllPrices <- as.zoo(AllPrices)
> head(AllPrices)
Index1 Index2 Index3 Index4 Index5 Index6 Index7 Index8 Index9 Index10
> AllRets <- dailyReturn(AllPrices)
Error in NextMethod("[<-") : incorrect number of subscripts on matrix
> AllPrices<- as.xts(AllPrices)
> AllRets <- dailyReturn(AllPrices)
> head(AllRets)
daily.returns
2012-11-06 0.000000e+00
2012-11-07 -2.220249e-02
2012-11-08 1.379504e-05
2012-11-09 2.781961e-04
2012-11-12 -2.411128e-03
2012-11-13 7.932869e-03
Try to load your data using the readr package.
library(readr)
Then, look at the documentation by running ?read_csv in the console.
I recommend reading in your data this way. Specify the column types. For instance, if your first column is the date, read it in as a character "c" and if your other columns are numeric use "n".
data <- read_csv('YOUR_DATA.csv', col_types = "cnnnnn") # date in left column, 5 numeric columns
# convert the date column to Date class; change "Dates" to your date column's name and adjust the format if needed
data$Dates <- as.Date(data$Dates, format = "%Y-%m-%d")
data <- as.data.frame(data) # turn the result into a data frame
library(xts)
data <- xts(data[,-1], order.by = data[,1]) # data is everything but the date column, order.by is the date column
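On the original question of getting returns for every column: the output shown in the question has a single daily.returns column, which suggests dailyReturn works on one series at a time. One hedged alternative, sketched here with made-up prices (the `prices` object and its values are hypothetical), is to compute simple returns for all columns at once with plain xts arithmetic.

```r
library(xts)

# Hypothetical price data: two index columns over three days
prices <- xts(
  cbind(Index1 = c(100, 110, 121), Index2 = c(50, 55, 44)),
  order.by = as.Date("2012-11-06") + 0:2
)

# Simple return per column: today's price over yesterday's, minus 1.
# lag(prices, 1) shifts each column down by one observation,
# so the first row has no previous price and comes out as NA.
rets <- prices / lag(prices, 1) - 1
rets
```

This keeps the column names of the price object, so each return series stays attached to its index.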
I have a really long list of EMG data that I need to convert to a vector or data frame before using the biosignal EMG package in R. It doesn't work with lists. The EMG data is .csv and is in the form shown in the picture.
I tried using the as.data.frame function, but it still gave me a list.
I also tried unlisting it, but it gave me an integer instead.
There are 2 columns and 647 rows.
I need to plot the data in the 2nd column and starting from row 8 till row 647.
How do I do this?
Below is the code I used:
library(biosignalEMG) # ReadCSV
MyEMGdata151517 <- read.csv(file="C:\\Users\\zyous\\OneDrive\\Desktop\\AsciiTraceDump_190211_151517.csv", header=TRUE, sep=",")
MyEMGdata151543<-read.csv(file="C:\\Users\\zyous\\OneDrive\\Desktop\\AsciiTraceDump_190211_151743.csv", header=TRUE, sep=",")
Rectified_EMG_151517<-rectification(MyEMGdata151517,rtype = "fullwave")
M<-as.data.frame.array(MyEMGdata151543)
is.data.frame(M)
#Rectifying 151517
Rectified_EMG_151517 <- rectification(MyEMGdata151517, rtype = "fullwave")
Rectified_plot_151517<-plot(MyEMGdata151517, main = "Rectified EMG")
When I try to rectify, I get this error: Error in rectification(MyEMGdata151517, rtype = "fullwave") : an object of class 'emg' is required.
I think that error occurs because my data is not a vector. But how do I convert it when unlist won't work? I want to see peaks like the ones you would get by plotting this in Excel.
It appears that you are not converting the variable created with read.csv() into an emg object. The documentation gives the following example:
x <- rnorm(10000, 0, 1)
emg1 <- emg(x, samplingrate=1000, units="mV", data.name="")
summary(emg1)
EMG Object
Total number of samples: 10000
Number of channels: 1
Duration (seconds): 10
Samplingrate (Hertz): 1000
Channel information:
Units: mV
plot(emg1, main="Simulated EMG")
Which yields a plot of the simulated EMG signal.
Based on the attached image in your question I think you should do something similar with emg2 <- emg(MyEMGdata151517$X8, samplingrate=1000, units="mV", data.name="")
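Putting this together with the row and column selection from the question, a sketch might look like the following. The `raw` data frame is a made-up stand-in for the CSV (the real file's layout may differ), and the sampling rate and units are assumptions that should be replaced with the values from your recording.

```r
library(biosignalEMG)

# Hypothetical stand-in for the CSV: 647 rows, 2 columns, with the
# first 7 rows of column 2 holding non-numeric header text
set.seed(1)
raw <- data.frame(
  V1 = seq_len(647),
  V2 = c(rep("header", 7), as.character(rnorm(640))),
  stringsAsFactors = FALSE
)

# Take rows 8 to 647 of the second column as a plain numeric vector
signal <- as.numeric(raw[8:647, 2])

# Wrap the vector in an emg object; samplingrate and units are assumptions
emg_signal <- emg(signal, samplingrate = 1000, units = "mV", data.name = "")

# rectification() now receives the class it asks for
rectified <- rectification(emg_signal, rtype = "fullwave")
plot(rectified, main = "Rectified EMG")
```

The key step is the explicit `as.numeric()` on a single column: indexing with `raw[8:647, 2]` returns a vector rather than a list, which is what `emg()` expects.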
I was trying to use quantmod to download historical stock price data. Here's my code:
Nasdaq100_Symbols <- c('GE','PG','MSFT','AAPL','PFE','AMD','DELL','GRPN','FB','CSCO','INTC',
'EZJ.L','BP','HSBC','MKS')
getSymbols(Nasdaq100_Symbols)
Warning messages:
1: DELL contains missing values. Some functions will not work if objects contain missing values in the middle of the series. Consider using na.omit(), na.approx(), na.fill(), etc to remove or replace them.
How can I remove these NA values? I'm trying to merge the series together and turn the result into a time series data type:
nasdaq100 <- data.frame(as.xts(merge(GE,PG,MSFT,AAPL,PFE,AMD,DELL,GRPN,FB,CSCO,INTC,
EZJ.L,BP,HSBC,MKS)))
head(nasdaq100[,1:12],2)
GE.Open GE.High GE.Low GE.Close GE.Volume GE.Adjusted PG.Open PG.High PG.Low
2007-01-02 NA NA NA NA NA NA NA NA NA
2007-01-03 37.41 38.15 37.38 37.97 43222800 24.48669 63.72 64.66 63.7
PG.Close PG.Volume PG.Adjusted
2007-01-02 NA NA NA
2007-01-03 64.54 9717900 44.56958
class(nasdaq100)
[1] "data.frame"
# set outcome variable
outcomeSymbol <- 'FISV.Volume'
# shift outcome value to be on same line as predictors
library(xts)
nasdaq100 <- xts(nasdaq100,order.by=as.Date(rownames(nasdaq100)))
nasdaq100 <- as.data.frame(merge(nasdaq100,lm1=lag(nasdaq100[,outcomeSymbol],-1)))
Error in `[.xts`(nasdaq100, , outcomeSymbol) : subscript out of bounds
I'm stuck here. I found a tutorial on YouTube (https://www.youtube.com/watch?v=lDgvaJFpybU&t=32s) but can't move forward because of these warnings and errors. Can someone tell me how to fix them?
If you reuse part of an example, make sure you adjust everything consistently. At the end you set outcomeSymbol to a column from the stock FISV, which you never downloaded at the beginning of your script; that is what causes the "subscript out of bounds" error. I must also say that the code in the tutorial's script could be written better: there are far too many unnecessary switches between xts and data.frame. I'm not going to rewrite the whole script, but the code below fixes your errors.
First, instead of polluting your work environment with 100 stocks, I put everything in one list object and then merge it all together with Reduce and merge. The missing data in the DELL ticker will merge cleanly with everything else, but will be NA where there is no data. If you want to deal with this, either do not download the DELL data or fill it with 0 using the na.fill function. The latter might not be a good solution if you are going to use this data for training a model. I also show how to turn an xts object into a data.frame without having to use as.Date later on.
library(quantmod)
Nasdaq100_Symbols <- c('GE','PG','MSFT','AAPL','PFE','AMD','DELL')
# put all stocks in one list object
stocks <- lapply(Nasdaq100_Symbols, getSymbols, auto.assign = FALSE)
# following is not needed but if you want to use the list for other purposes
# it is a good practice to name all the different list objects.
# names(stocks) <- Nasdaq100_Symbols
# merge all stocks into 1 xts object
nasdaq100 <- Reduce(merge, stocks)
# fill NA's with 0
nasdaq100 <- na.fill(nasdaq100, 0)
outcomeSymbol <- "GE.Volume" # <-- used GE as that data is available in the downloaded data set
# merge outcome to data
nasdaq100 <- merge(nasdaq100, lm1 = lag(nasdaq100[, outcomeSymbol], -1))
# turn into data.frame
nasdaq100_df <- data.frame(date = index(nasdaq100), coredata(nasdaq100))
I am not exactly sure why you want to remove NA before merge.
I do it after the merge and it works perfectly for me, because xts objects are merged based on their date index. I only keep the Adjusted Close, so my usual code looks like:
yahoo_symbols <- c(share1, share2, share3,...)
qts_env <- new.env()
getSymbols(yahoo_symbols,
env = qts_env,
from = start_date,
to = end_date,
periodicity = "daily"
)
shares_cl <- do.call(merge, eapply(qts_env, Ad))
shares_cl <- na.omit(shares_cl)
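The merge-then-na.omit behaviour can be checked offline with two toy series. This is a small sketch with made-up values (the objects `a` and `b` are hypothetical), not downloaded data.

```r
library(xts)

# Two hypothetical series with different start dates
a <- xts(c(1, 2, 3), order.by = as.Date("2020-01-01") + 0:2)
b <- xts(c(10, 20),  order.by = as.Date("2020-01-02") + 0:1)

# merge() aligns on the date index; 2020-01-01 has no value for b, so it gets NA
merged <- merge(a, b)

# na.omit() then keeps only the dates where every series has a value
complete <- na.omit(merged)

nrow(merged)    # all dates from either series
nrow(complete)  # only the dates both series share
```

Removing NAs after the merge means every series ends up on the same set of dates, which is usually what later modelling code assumes.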
I hope that it helps.
I read a CSV file using read.csv() command and I want to convert into xts and graph with chartSeries().
I changed into a matrix by doing:
MyData <- as.matrix(MyData)
When I convert to xts using
MyData_xts <- xts(MyData[,-1], order.by=as.POSIXct(MyData[,1]))
I get the following error message:
Error in as.POSIXlt.character(as.character(x), ...) :
character string is not in a standard unambiguous format
The column that has my index is in the yyyymm format. I've read that that may be a problem, but I haven't been able to find a way around it.
EDIT 1
Before converting to a matrix, the data read from the CSV looks like this. All of the columns are of class factor:
X |Mkt.RF|SMB
------|------|---
196307|-0.39 |-.046
196308|5.07 |-0.81
196308|-1.57 |-.048
You should use read.zoo to import your CSV directly into a zoo object. If you want, you can use as.xts to convert the zoo object to xts. You should also use a yearmon index, since your index only has years and months.
Text <- "X,Mkt.RF,SMB
196307,-0.39,-0.046
196308, 5.07,-0.810
196309,-1.57,-0.048"
library(zoo) # provides read.zoo() and as.yearmon()
# function adapted from examples in ?read.zoo
z <- read.zoo(text=Text, header=TRUE, sep=",",
FUN=function(x) as.yearmon(format(x), "%Y%m"))
z
# Mkt.RF SMB
# Jul 1963 -0.39 -0.046
# Aug 1963 5.07 -0.810
# Sep 1963 -1.57 -0.048
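As a follow-up to the as.xts suggestion, here is a minimal sketch of the full path from text to xts, assuming both zoo and xts are installed. It repeats the sample data so that it runs on its own.

```r
library(zoo)
library(xts)

Text <- "X,Mkt.RF,SMB
196307,-0.39,-0.046
196308,5.07,-0.810
196309,-1.57,-0.048"

# Read straight into a zoo object with a yearmon index
z <- read.zoo(text = Text, header = TRUE, sep = ",",
              FUN = function(x) as.yearmon(format(x), "%Y%m"))

# Convert the zoo object to xts; the data columns stay numeric
x <- as.xts(z)
is.numeric(coredata(x))
```

Going through read.zoo avoids the as.matrix() step from the question, which is where the numeric columns were being turned into character.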
Since you do not provide any data, I will use a small test example that matches your description. I do not think that as.POSIXct will work without specific days. You can make this work by using the first day of each month.
x = c("201701", "201702", "201703")
xt = as.POSIXct(paste(x, "01", sep=""), format="%Y%m%d")
xts(xt, order.by=xt)
[,1]
2017-01-01 1483246800
2017-02-01 1485925200
2017-03-01 1488344400
Updated:
I see that you have now provided data and say that you are getting NAs. I am using the data that you provided, reading it as a csv, processing it with my code and not getting NAs. Please look again at this version of the code.
Input = read.csv(text="X,Mkt.RF,SMB
196307,-0.39 ,-.046
196308,5.07 ,-0.81
196308,-1.57 ,-.048",
header=TRUE, stringsAsFactors=FALSE)
library(xts)
Input$xt = as.POSIXct(paste(Input$X, "01", sep=""), format="%Y%m%d")
xts(Input, order.by=Input$xt)
X Mkt.RF SMB xt
1963-07-01 "196307" "-0.39" "-0.046" "1963-07-01"
1963-08-01 "196308" " 5.07" "-0.810" "1963-08-01"
1963-08-01 "196308" "-1.57" "-0.048" "1963-08-01"