backtest package R extracting returns from object - r

I would like to extract the returns from the backtest package, which according to the manual are stored in a 5-dimensional array called 'results'.
This is the backtest package:
https://cran.r-project.org/web/packages/backtest/backtest.pdf
A simple example looks like this:
library(backtest)
data(starmine)
bt <- backtest(starmine, in.var = c("smi"),
ret.var = "ret.0.1.m", date.var = "date",
id.var = "id", buckets = 10,
natural = TRUE, by.period = TRUE)
summary(bt)
When you run the summary command, it prints the return series for each decile. I would like to extract those into a data frame that I can use for further analysis.
Does anyone know how I can access or extract the return series?

The bt object has class backtest (as class(bt) shows). The summary() function has a method defined for backtest objects, but it only prints the information to the screen. If you try to assign the information via stuff <- summary(bt), the stuff object will be NULL. To access the data that summary(bt) prints to the screen, you should use the accessor functions created for that object (they are described in ?'backtest-class'). These functions include:
means()
counts()
summary()
marginals()
summaryStats()
turnover()
In order to access the data frame of summary statistics by month printed as the side effect of summary(bt), you can run summaryStats(bt). Please see pages 5-8 of the backtest help files for more information.

Related

Extract data from signac RegionMatrix?

I'm trying to extract the data matrix computed by Signac r package when running RegionMatrix(). I know I can plot the data with RegionHeatmap(), but I want to plot the data with another package.
So, if I created a region matrix like this:
my_object <- RegionMatrix(my_object
, key = "my_region_matrix"
, regions = StringToGRanges(top_cluster_genes$gene)
, upstream = 2500, downstream = 2500)
How do I go about extracting the data stored within the object under the key "my_region_matrix"? I know it's a pretty basic question, but I've been browsing the returned object and can't find any assay/matrix with that key name. (I'm putting this under Seurat because apparently there is no tag for Signac and the object is a Seurat object anyway)

Using a for loop to create dynamic objects in R

I'm trying to use a for loop to create a set of dynamic objects in R. These will contain a list of organisations and values against a certain metric--each output will be the values of an individual metric.
In practice, this will be used to create chart objects using ggplot2, which I'll then use in RMarkdown. For the example below, it's just a sample using a head() function for each metric.
I tried using the paste function to create this name, but it gives the following error:
Error in paste("organisation_short", "_", MetricIDs[x]) <-
head(organisationdata_Jan2021) : target of assignment expands to
non-language object
I understand that the assign function might help, but I'm not sure how to use it. (My attempts also produced errors). I found a similar question in the link below, but it's set up in a way that pipes data directly into assign. I'm also not clear what "value = ." is doing. This query is below:
dynamically name objects in R
I believe the "value = ." refers to the data being piped into the assign function. I created an alternative version, shown in the code below, which fails with:
Error in assign(x = organisationdata_Jan2021, value = paste0("sampledata", :
invalid first argument
The idea is to create output files along the lines of: organisation_short_ABC123, organisation_short_ABC323, organisation_short_KJM088
I would be grateful for any guidance you might have!
MetricIDs <- c('ABC123','ABC323','KJM088')
# Attempt using paste
for (x in 1:3)
{
organisationdata_Jan2021 <- organisationdata_CM0040_Jan2021 %>% filter(Metric_ID==MetricIDs[x]) # Filter data to specific Metric ID
paste("organisation_short","_", MetricIDs[x]) <- head(organisationdata_Jan2021) # Goal: Create object that includes the Metric ID.
}
# Attempt using assign
for (x in 1:3)
{
organisationdata_Jan2021 <- organisationdata_CM0040_Jan2021 %>% filter(Metric_ID==MetricIDs[x]) # Filter data to specific Metric ID
assign(x=organisationdata_Jan2021, value=paste0("sampledata",MetricIDs[x]))
}
# Expected object names: organisation_short_ABC123, organisation_short_ABC323, organisation_short_KJM088
# This will be used to create chart objects using ggplot2, and those objects will be used in an R MarkDown document.
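A minimal sketch of how the assign attempt above could be made to work, assuming a small stand-in data frame (organisationdata and its columns are hypothetical, and base subsetting replaces filter() so the example is self-contained). assign() takes the object name as its first argument and the value as its second, which is the reverse of the attempt above; paste0() is used because paste() inserts spaces by default. A named list is shown as an alternative that avoids creating many separate objects.

```r
# Hypothetical stand-in for organisationdata_CM0040_Jan2021.
organisationdata <- data.frame(
  Organisation = c("Org A", "Org B", "Org C", "Org A"),
  Metric_ID = c("ABC123", "ABC323", "KJM088", "ABC123"),
  Value = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)
MetricIDs <- c("ABC123", "ABC323", "KJM088")

# Option 1: assign() takes the object NAME first and the VALUE second.
for (id in MetricIDs) {
  metric_data <- organisationdata[organisationdata$Metric_ID == id, ]
  assign(paste0("organisation_short_", id), head(metric_data))
}

# Option 2: keep everything in one named list instead of separate objects.
organisation_short <- lapply(
  setNames(MetricIDs, MetricIDs),
  function(id) head(organisationdata[organisationdata$Metric_ID == id, ])
)
```

With the list version, an individual result is available as organisation_short[["ABC123"]], which tends to be easier to loop over when building charts later.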

How do I export a textstat_simil document without losing observations or variables?

I'm new to quanteda and I am having issues exporting my documents. I am comparing two document-feature matrices: "dfm_latam", with more than 27k observations, and "dfm_cosines", which is built from two corpora whose texts are to be compared with each of the 27k observations in dfm_latam.
corpus_cosine_2 <- corpus(cosine_2_pdf)
corpus_cosines <- corpus_cosine_1 + corpus_cosine_2
dfm_cosines <- dfm(corpus_cosines, case_insensitive = TRUE)
corpus_latam <- corpus(latam_review)
docvars(corpus_latam, "Text") <- names(corpus_latam$text)
dfm_latam <- dfm(corpus_latam, case_insensitive = TRUE)
simil_latam <- textstat_simil(dfm_latam, dfm_cosines, method = "cosine", margin = "documents", case_insensitive = TRUE)
view(simil_latam)
The view() function in R provides me with the first 1000 rows and everything is fine. Both numeric variables from the dfm_cosines are showing up. But, when I try to export it as an Excel document, the output looks completely different from the view() 1000 rows preview. One of the variables is missing, and the .xlsx output only shows "corpus_cosine_1's" results. The dfm "dfm_cosines" is made after both "corpus_cosine_1" and "corpus_cosine_2". Why does it happen when I export it?
openxlsx::write.xlsx(simil_latam, file = "F:\\path\\simil_latam.xlsx")
So, I tried exporting along with the view() function:
openxlsx::write.xlsx(view(simil_latam), file = "F:\\path\\simil_latam.xlsx")
With write.xlsx(view()), the variables that show up are correct, but only 1,000 of my 27,000+ observations are exported. How do I automatically export all of the observations of the table with all variables showing up?
You need to convert the textstat_simil object to something more spreadsheet-like. Try
as.matrix(simil_latam)
before you call write.xlsx() or if you prefer this format,
as.data.frame(simil_latam)
I suggest you inspect both coerced objects before exporting them, and also see the help pages for these methods (found in the quanteda.textstats package).

Loop Causal Impact in R over multiple datasets and automatically export results

I need some suggestions on how to solve this problem. I have a number of zoo objects on which I want to perform a Causal Impact analysis in R, using the package of the same name developed by Google. To automate the process, I want to run a loop over the zoo objects and automatically save the results to a file that can be exported as either Word or CSV.
So far, my solution has been to include the zoo objects into a zoo list by
zoolist<-list(ts1,
ts2,
ts3
)
and then run a for loop like:
for (i in zoolist)
{
experiment_impact<-CausalImpact(i,
pre.period,
post.period,
model.args = list(nseasons = 7, season.duration = 1))
summary(experiment_impact)
}
The code seems to work, but I have no idea how to export all the outputs to CSV, a Word document, or any other format, as long as it is compact and readable.
Any idea? Thank you for your help!
If the only thing you want to do is capture the summary, exactly as printed to the screen, you can use capture.output. Replace the second line in your loop with:
capture.output(summary(experiment_impact), file = 'example.txt', append = TRUE)
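As a rough illustration of the capture.output approach, the sketch below appends the printed summary of each item to a single text file, with a header line per item so the entries can be told apart. The numeric vectors are stand-ins for the CausalImpact results, and the names and temporary file path are hypothetical.

```r
# Stand-in objects; in the question these would be CausalImpact results.
results <- list(first = c(1, 2, 3, 4), second = c(10, 20, 30))
out_file <- tempfile(fileext = ".txt")  # hypothetical output path

for (name in names(results)) {
  # Write a header line, then the printed summary, appending each time.
  cat("==== ", name, " ====\n", sep = "", file = out_file, append = TRUE)
  capture.output(summary(results[[name]]), file = out_file, append = TRUE)
}

# Read the file back to check what was written.
report_lines <- readLines(out_file)
```

Because every call appends, the file keeps growing across runs; deleting it (or using a fresh path) before the loop avoids mixing old and new output.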
A more elegant solution might be to use lapply to run the analysis on each item in the list, so that you end up with a list of output items:
resultList =
lapply(
zoolist,
CausalImpact,
pre.period,
post.period,
model.args = list(nseasons = 7, season.duration = 1)
)
You could then extract desired values from each of the CausalImpact objects in the list and format the values in a data.frame, which you could output using write.csv.
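The extraction step could look roughly like the sketch below. It assumes each result exposes a summary data frame with "Average" and "Cumulative" rows and columns such as AbsEffect and p, which is how CausalImpact objects are laid out as far as I know; the objects here are hand-built stand-ins so the example runs without the package, and the list names and make_fake_result() helper are hypothetical.

```r
# Hypothetical helper that mimics the `summary` element of a fitted result.
make_fake_result <- function(abs_effect, p_value) {
  list(summary = data.frame(
    AbsEffect = c(abs_effect, abs_effect * 10),
    p = c(p_value, p_value),
    row.names = c("Average", "Cumulative")
  ))
}

# Stand-in for the list returned by lapply(); names are assumed for labelling.
resultList <- list(
  ts1 = make_fake_result(1.5, 0.01),
  ts2 = make_fake_result(-0.3, 0.40)
)

# Pull one row of selected values per result, then stack the rows.
rows <- lapply(names(resultList), function(name) {
  s <- resultList[[name]]$summary
  data.frame(
    series = name,
    abs_effect = s["Average", "AbsEffect"],
    p_value = s["Average", "p"]
  )
})
summary_table <- do.call(rbind, rows)

# Write the compact table to a CSV file (temporary path used here).
out_csv <- tempfile(fileext = ".csv")
write.csv(summary_table, out_csv, row.names = FALSE)
```

If the input list is unnamed, as in the question, seq_along(resultList) can be used in place of names(resultList) to label the rows.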

ts object not recognised in hybridModel of forecastHybrid package

Data is something like this:
df <- tribble(
~y, ~timestamp,
18.74682, 1500256800,
19.00424, 1500260400,
18.86993, 1500264000,
18.74960, 1500267600,
18.99854, 1500271200,
18.85443, 1500274800,
18.78031, 1500278400,
18.97948, 1500282000,
18.86576, 1500285600,
18.55633, 1500289200,
18.79052, 1500292800,
18.74790, 1500296400,
18.62743, 1500300000,
19.04696, 1500303600,
18.97851, 1500307200,
18.70956, 1500310800,
18.92302, 1500314400,
18.91465, 1500318000,
18.61556, 1500321600,
19.03535, 1500325200 )
I'm trying to apply hybridModel to time series data to build an ensemble. Below is my code:
library(tidyquant)
library(forecast)
library(timetk)
library(sweep)
library(forecastHybrid)
df <- mutate(df, timestamp = as_datetime(timestamp))
tk_ts_df <- tk_ts(df, start = 1, freq = 3600, silent = TRUE)
fit <- hybridModel(tk_ts_df)
Passing the time series object tk_ts_df (a ts object) to hybridModel gives the error: "The time series must be numeric and may not be a matrix or dataframe object."
But the vignette at https://cran.r-project.org/web/packages/forecastHybrid/vignettes/forecastHybrid.html
clearly states: The workhorse function of the package is hybridModel(), a function that combines several component models from the “forecast” package. At a minimum, the user must supply a ts or numeric vector for y
Please suggest what I'm doing wrong.
The "forecastHybrid" package requires that the input time series be a numeric vector or a ts object. While the "timekit" package does return a ts object, it also adds attributes that regular ts objects do not have, so the input checks fail.
See the discussion here and the fixing commit here.
The latest version from GitHub, which incorporates the fix, can be installed with
devtools::install_github("ellisp/forecastHybrid/pkg")
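For anyone who cannot upgrade, one possible workaround (an assumption, not something the answer above confirms) is to rebuild a plain univariate ts from the values, so that no extra attributes or matrix dimensions remain. The sketch below uses a hand-made stand-in for the tk_ts() output; the "index" attribute name and the frequency are illustrative only.

```r
# Stand-in for a ts object that carries extra baggage:
# a one-column matrix plus a non-standard attribute.
values <- c(18.75, 19.00, 18.87, 18.75, 19.00, 18.85)
decorated <- ts(matrix(values, ncol = 1), start = 1, frequency = 24)
attr(decorated, "index") <- seq_along(values)  # hypothetical extra attribute

# Rebuild a plain ts: as.numeric() drops dimensions and all attributes,
# and ts() re-attaches only the standard time series properties.
plain <- ts(
  as.numeric(decorated),
  start = start(decorated),
  frequency = frequency(decorated)
)
```

The resulting plain object could then be passed to the model-fitting function in place of the original; whether that satisfies the input check depends on the installed package version.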
