Declare yearly data in read.zoo - r

I am trying to read in yearly data with gaps using the read.zoo function from the zoo package. I am having some trouble finding the FUN that declares the data to be yearly data. The data set is located here.
The function call I am trying is
tsGDP <- read.zoo("us-gross-domestic-product-192919.csv", sep=",", format="%Y",
regular=FALSE, header=TRUE, index.column=1)
plot(log(tsGDP))
This works fine, but it chokes when I try to plot the ACF of the series
> acf(tsGDP)
Error in na.fail.default(as.ts(x)) : missing values in object
This R-list posting seems to indicate that this is because I am not declaring yearly data correctly.

Without the data, it is hard to reproduce the problem.
However, from the documentation of acf:
By default, no missing values are allowed. If the na.action function passes through missing values (as na.pass does), the covariances are computed from the complete cases.
So why not try
acf(x = tsGDP, na.action = na.pass)
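To illustrate the quoted documentation, here is a minimal sketch using only base R. The series x is made-up data standing in for the GDP series (an assumption, since the original file is not available); it only shows that acf accepts missing values once na.action = na.pass is supplied.

```r
# Hypothetical yearly series with one missing year (not the asker's data)
x <- ts(c(1.0, 1.2, NA, 1.5, 1.7, 1.6, 2.0, 2.2), start = 2000, frequency = 1)

# With the default na.action = na.fail this call would stop with
# "missing values in object"; na.pass lets the NA through and the
# covariances are computed from the complete cases.
res <- acf(x, na.action = na.pass, plot = FALSE)

# The lag-0 autocorrelation is always 1
res$acf[1]
```

Whether passing the gaps through is appropriate depends on why the gaps exist; if they come from the index being read as dates rather than years, fixing the index may be the better route.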

Related

Error in x[is.na(x)] <- na.string : replacement has length zero when exporting data frame to openxlsx in R

I have an issue when I try to export a data frame to Excel with the openxlsx library. When I try, this error occurs:
openxlsx::write.xlsx(usertl_lp, file = "Mi_Exportación.xlsx")
Error in x[is.na(x)] <- na.string : replacement has length zero
library(dplyr)  # provides %>%, mutate and across
usertl_lp_clean <- usertl_lp %>% mutate(across(where(is.list), as.character))
openxlsx::write.xlsx(usertl_lp_clean, file = "Mi_Exportación.xlsx")
This error may be caused by cells containing vectors (list columns), so the code above uses across to convert every list column to character.
I posted this here for others in need.
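For readers without dplyr, the same idea can be sketched in base R. The data frame df and its tags column are hypothetical, and pasting the elements together with ", " is a swapped-in choice rather than what as.character does in the answer above.

```r
# Hypothetical data frame with a list column (cells that hold vectors)
df <- data.frame(id = 1:2)
df$tags <- list(c("a", "b"), "c")

# Find the list columns
is_list <- vapply(df, is.list, logical(1))

# Collapse each cell of a list column into a single string
df[is_list] <- lapply(df[is_list], function(col) {
  vapply(col, paste, character(1), collapse = ", ")
})

str(df)
```

After this step every column is an atomic vector, which is the shape write.xlsx appears to expect.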
I think you are looking for the writeData function from the same package.
Check out writeFormula from the same package as well or even write_xlsx from the writexl package.
I was having a similar problem with a data frame, but in my case I was using the related openxlsx::writeData.
The data frame was generated using sapply, with functions that could throw errors depending on the data, so I coded them to return NA when an error occurred. I ended up with NaN and NA values in the same column.
What worked for me was applying the following treatment before writeData:
df[is.na(df)]<-''
So, for your problem, the following may work:
df[is.na(df)]<-''
openxlsx::write.xlsx(as.data.frame(df), file = "df.xlsx", colNames = TRUE, rowNames = FALSE, append = FALSE)

Why does the error "duplicated name in data frame using '.'" happen?

I have a data frame with 30 rows and 850 columns (features).
When I try to use svm or another classifier with the caret and e1071 packages, I get this error:
Error in terms.formula(formula, data = data) :
duplicated name 'X10Percentile' in data frame using '.'
Even when I use a feature selection method such as Boruta, I get the same error.
I double-checked my features and found nothing. I thought I must have a repeated column name in the data frame, so I created sample data and checked as follows:
library(caret)  # createDataPartition, trainControl
library(e1071)  # svm
test<-data.frame("w1"=c(1:6),"w1.1"=c(2:7),"w1"=c(3:8), "ta"=c("T","F","T","F","F","T"))
set.seed(100)
train <- createDataPartition(y=test$ta,p=0.6,list = FALSE)
TrainSet <- test[train,]
TestSet <- test[-train,]
trcontrol_rcv<- trainControl(method="cv", number=10)
svm_test<-svm(ta ~., data=TrainSet,trControl=trcontrol_rcv)
It works well and no error occurs.
As far as I can see, no error happens with the test data even though it has exactly the same column name twice.
I want to know why the error "Error in terms.formula(formula, data = data) :
duplicated name 'X10Percentile' in data frame using '.'" happens for my data, and how I can eliminate it.
Thank you in advance.
Thank you, everyone. Fortunately, I found the cause of this error.
R was treating the variables as factors, so it built a data.frame (which is in fact a list). To solve this problem, I converted the data to numeric in the following way:
test1<-sapply(test,function(x) as.numeric(as.character(x)))
For me that was not the solution. I had a LargeMatrix object containing only numeric vectors.
The problem was that some of the dimnames(MyLargeMatrix) were duplicated. I changed them and the error went away.
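A short base R sketch may help explain why the sample data in the question did not reproduce the error, and how duplicated names can be detected and repaired. The objects test and raw are illustrative only; the assumption is that the real data acquired its duplicate names by a route that skips name checking (for example a matrix with repeated dimnames).

```r
# data.frame() uses check.names = TRUE by default, so a repeated "w1"
# is silently renamed and the resulting names are unique.
test <- data.frame("w1" = 1:6, "w1.1" = 2:7, "w1" = 3:8)
names(test)
anyDuplicated(names(test))   # 0 means no duplicates

# With check.names = FALSE the duplicate survives
raw <- data.frame("w1" = 1:6, "w1" = 3:8, check.names = FALSE)
dup_before <- anyDuplicated(names(raw))   # index of the first duplicate

# make.unique() appends suffixes until every name is distinct
names(raw) <- make.unique(names(raw))
dup_after <- anyDuplicated(names(raw))
```

Running anyDuplicated(names(x)) or anyDuplicated(colnames(x)) on the real object before fitting is a quick way to confirm whether duplicated names are the cause.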

how to get tsclean working on data frame with multiple time series

I'm in the process of creating a forecast based on the hts package, but before getting that far I need to clean the data for outliers and missing values.
For this I thought of using the tsclean function in the forecast package. My data is stored in a data frame with multiple columns (time series) that I wish to clean. I can get the function to work with a single time series, but since I have quite a lot of them I'm looking for a smart way to do this.
When running the code:
SFA5 <- ts(SFA4, frequency=12, start=c(2012,1), end=c(2017,10))
ggt <- tsclean(SFA5[1:70, 1:94], replace.missing = TRUE)
I get this error message:
Error in na.interp(x, lambda = lambda) : The time series is not univariate.
The data is here:
https://www.dropbox.com/s/dow2jpuv5unmtgd/Data1850.xlsx?dl=0
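My question is: what am I doing wrong, or is the only solution to loop over the series?
The error message says that the function only takes a univariate time series as its first argument. So you need to apply tsclean to each column, as you might have guessed.
library(forecast)
# sapply() over a matrix visits single elements, so loop over the columns instead;
# SFA5[, j] keeps the ts attributes (start, frequency) of each column
ggt <- SFA5
for (j in seq_len(ncol(SFA5))) ggt[, j] <- tsclean(SFA5[, j], replace.missing = TRUE)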

How to convert a zoo object in a ts object in order to use strucchange

OK. I've tried several forums and threads, but I couldn't find this. I imported my data into R using this:
teste <- read.zoo("bitcoin2.csv", header=TRUE, sep=",", format = "%m/%d/%Y")
Which worked fine. My xyplot gave me the right plot. So I tried to convert it to ts in order to use strucchange and other outlier/breakpoints packages.
aba <- as.ts(zoo(z$Weighted_Price))
When I did this, the time index seems to have been lost. The plot still has the same shape, but the x-axis doesn't look like that of a regular time series plot.
Anyway, I tried strucchange. After loading it, I ran this simple test:
test<-breakpoints(teste$Weighted_Price~1)
But R returned:
Error in my.RSS.table[as.character(i), 3:4] <- c(pot.index[opt], break.RSS[opt]) :
replacement has length zero
I presume my mistake is that the coercion from zoo to ts was not correct. Any help would be great.

In R cannot use AdjustedSharpeRatio() from 'Performance Analytics'

I am having some trouble using the function AdjustedSharpeRatio() from the package PerformanceAnalytics. The following code sample in R 3.0.0:
library(PerformanceAnalytics)
logrets = array(dim=c(3,2),c(1,2,3,4,5,6))
weights = c(0.4,0.6)
AdjustedSharpeRatio(rowSums(weights*logrets),0.01)
gives the following error:
Error in checkData(R) :
The data cannot be converted into a time series. If you are trying to pass in
names from a data object with one column, you should use the form 'data[rows,
columns, drop = FALSE]'. Rownames should have standard date formats, such as
'1985-03-15'.
Replacing the last line with zoo gives the same error:
AdjustedSharpeRatio(zoo(rowSums(weights*logrets)),0.01)
Am I missing something obvious?
Hmm... I'm not too sure what you are trying to achieve with the logrets and weights objects there, but if logrets are already in percentages, then maybe something like this:
AdjustedSharpeRatio(xts(rowSums(weights*logrets)/100,Sys.Date()-(c(3:1)*365)), Rf=0.01)
This might work:
a <- rowSums(weights*logrets)
names(a) <- c('1985-03-15', '1985-03-16', '1985-03-17')
AdjustedSharpeRatio(a,0.01)
