Calculate changing date for a Donchian Channel technical indicator - r

I am trying to create an indicator that has a dynamic n that changes each day. Basically I am making a strategy that enters a trade when a stock's price reaches its all-time high.
The best way I can think to do this is by using a Donchian Channel and entering when the closing price is equal to or greater than all previous DC highs. To do this I need:
n = (Current date of algo - start date).
This way the indicator will start working from day 1 and it won't "forget" about previous highs as the strategy runs through years of data. The problem I am having is that I don't know how to write a code/function that will express the current date of strategy in a way that I can turn it into a simple calculation. The best code I can come up with is:
##Problem in line below##
dcn <- difftime(initdate, as.Date(datePos), units = c("days"))
### This part will work fine once dcn is working
BuySig <- function(price, DC, ...)
{ifelse(price >= DC, 1, 0)}
add.indicator(strategy=strategyname,name="DonchianChannel",
arguments=list(HL=quote(mktdata$Close),n=dcn),label="DC")
dcn of course is going to be my Donchian Channel n. The problem I am having is that no matter what I try to use in place of as.Date(datePos), it keeps telling me "object 'datePos' not found". I have tried using other things that I specify earlier in my code, such as: Dates, timestamp.
Any advice would be really helpful.

You can't use DonchianChannel with an n that varies. n must be a fixed integer for that function. You need to create your own function that trades 'highest highs' since the start of your data set.
This achieves what you want; just wrap it in a function and pass that function to add.indicator:
library(quantmod)
getSymbols("SPY")
SPY_max <- runMax(Cl(SPY), n = 1, cumulative = TRUE)
SPY$all_time_high <- Cl(SPY) >= SPY_max
chart_Series(SPY["2018/", 1:4])
tail(SPY[SPY$all_time_high == 1,], 10)
# SPY.Open SPY.High SPY.Low SPY.Close SPY.Volume SPY.Adjusted all_time_high
# 2018-01-19 279.80 280.41 279.14 280.41 140920100 273.9762 1
# 2018-01-22 280.17 282.69 280.11 282.69 91322400 276.2038 1
# 2018-01-23 282.74 283.62 282.37 283.29 97084700 276.7901 1
# 2018-01-25 284.16 284.27 282.40 283.30 84587300 276.7998 1
# 2018-01-26 284.25 286.63 283.96 286.58 107743100 280.0046 1
# 2018-08-24 286.44 287.67 286.38 287.51 57487400 283.3048 1
# 2018-08-27 288.86 289.90 288.68 289.78 57072400 285.5416 1
# 2018-08-28 290.30 290.42 289.40 289.92 46943500 285.6796 1
# 2018-08-29 290.16 291.74 289.89 291.48 61485500 287.2167 1
# 2018-09-20 292.64 293.94 291.24 293.58 100360600 289.2860 1
When the column all_time_high returns 1, you're at an all time high for the time series in question.
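Wrapped up as a function, the same idea might look like the sketch below. It is only an illustration: the name allTimeHigh is hypothetical, it works on a plain numeric vector of closes, and it uses base R's cummax() in place of runMax(..., cumulative = TRUE) so that it runs without any packages. For use with add.indicator you would likely want it to accept and return an xts column instead.

```r
# Hypothetical helper (not part of quantstrat or TTR): flags bars whose close
# is at or above every close seen so far in the series.
allTimeHigh <- function(price) {
  price <- as.numeric(price)
  running_max <- cummax(price)      # highest close so far, including the current bar
  as.numeric(price >= running_max)  # 1 at a new (or equalled) all-time high, else 0
}

# Made-up closing prices, for illustration only
closes <- c(10, 12, 11, 12, 15, 14)
allTimeHigh(closes)
```

Because the running maximum starts at the first observation, no n has to be supplied or updated as the strategy moves through time.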

Related

How can I express Closing Price above or below SMA in R?

Glad to have found this community!
As a beginning project, I wish to examine what happens when you open a position when the price is above an SMA and close it once the price is below.
I am currently working on a script in R as follows:
`##############################
# 0 - Load libraries
##############################
require(quantmod)
require(PerformanceAnalytics)
require(TTR)
# Step 1: Get the data
loadSymbols("^GSPC")
# Step 2: Create your indicator
SMA <- SMA(x = (GSPC), n = 200)
# Step 3: Construct your trading rule
signal <- Lag(ifelse(SMA$SMA > CLV(GSPC), 1, -1))
# Step 4: The trading rules/equity curve
ret <- ROC(Cl(GSPC))*signal
ret <- ret[2009-06-02/2020-09-07]
eq <- exp(cumsum(ret))
plot(eq)
# Step 5: Evaluate strategy performance
table.Drawdowns(ret, top=10)
table.DownsideRisk(ret)
charts.PerformanceSummary(ret)
plot(cl(GSPC))
lines(sma, col = "Blue")`
The script does not produce any output when I run it. I have re-installed all packages and I am running R version 3.6.
Please can somebody tell me why?
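A few things in the script above stand out and may be worth checking. R is case-sensitive, so cl(GSPC) and sma are not the same names as Cl() and the SMA object created earlier. The subset ret[2009-06-02/2020-09-07] is unquoted, so R evaluates it as arithmetic rather than as a date range. CLV() is TTR's Close Location Value, not the closing price. The sketch below shows the intended rule (long when the close is above its SMA, short otherwise) in base R on synthetic prices. The prices, the 20-bar window and the variable names are assumptions for illustration, not the poster's data.

```r
# Synthetic closing prices (assumption: stands in for Cl(GSPC))
set.seed(42)
close <- 100 * exp(cumsum(rnorm(300, mean = 0, sd = 0.01)))

n <- 20  # shorter window than the question's 200, to suit 300 synthetic bars

# Trailing simple moving average; the first n - 1 values are NA
sma <- as.numeric(stats::filter(close, rep(1 / n, n), sides = 1))

# +1 when the close is above the SMA, -1 otherwise (NA until the SMA exists)
signal <- ifelse(close > sma, 1, -1)

# Lag by one bar so that today's signal is applied to tomorrow's return
position <- c(NA, head(signal, -1))
ret <- c(NA, diff(log(close))) * position

# Equity curve from log returns, skipping the warm-up NAs
eq <- exp(cumsum(ret[!is.na(ret)]))
tail(eq, 1)
```

With quantmod loaded, the equivalent comparison would presumably be written against Cl(GSPC) and the SMA of that single column.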

Anomaly detection In R

I am used to using the qcc package in R to detect outliers in the data. I recently came across the AnomalyDetection package. Found here: https://github.com/twitter/AnomalyDetection
My dataset is below:
date_start<-as.Date(c('2017-10-17','2017-10-18',
'2017-10-19','2017-10-20',
'2017-10-21','2017-10-22',
'2017-10-23','2017-10-24',
'2017-10-25','2017-10-26',
'2017-10-27','2017-10-28',
'2017-10-29','2017-10-30',
'2017-10-31','2017-11-01',
'2017-11-02','2017-11-03',
'2017-11-04','2017-11-05',
'2017-11-06','2017-11-07',
'2017-11-08','2017-11-09',
'2017-11-10','2017-11-11',
'2017-11-12'))
count <- c(NA, 3828,
3532,3527,
3916,4303,
3867,3699,
3439,3099,
3148,3310,
3904,3525,
2962,3398,
2935,3013,
3005,3516,
3010,2848,
2689,2573,
2569,2946,
2713)
df<-data.frame(date_start,count)
head(df)
date_start count
1 2017-10-17 NA
2 2017-10-18 3828
3 2017-10-19 3532
4 2017-10-20 3527
5 2017-10-21 3916
6 2017-10-22 4303
When I test out this dataset with the AnomalyDetection package, the response is NULL and no plot appears. Any idea why this may be?
library(AnomalyDetection)
res = AnomalyDetectionTs(df, max_anoms=0.02, direction='both', plot=TRUE)
res$plot
NULL
This is caused by the fact no anomalies were detected.
When one manually changes:
count[13] <- 5671
it is detected.
Additionally for the plot to work the time stamps need to be class POSIXct
df <- data.frame(date_start = as.POSIXct(date_start),
count)
res <- AnomalyDetectionTs(df,
max_anoms = 0.02,
direction = 'both',
plot = TRUE)
#output
$anoms
timestamp anoms
1 2017-10-29 02:00:00 5671
$plot
When using POSIXct I get the following error: "Error: Column x is a date/time and must be stored as POSIXct, not POSIXlt".
However, changing to POSIXlt solves the problem.

Error in if ((location <= 1) | (location >= length(x)) - R - Eventstudies

I am trying my best at a simple event study in R, with some data retrieved from the Wharton Research Data Service (WRDS). I am not completely new to R, but I would describe my expertise level as intermediate. So, here is the problem. I am using the eventstudies package and one of the steps is converting the physical dates to event time frame dates with the phys2eventtime(..) function. This function takes multiple arguments:
z : time series data for which event frame is to be generated. In the form of an xts object.
Events : it is a data frame with two columns: unit and when. unit has the column name of the series whose response is to be measured on the event date, while when has the event date.
Width : width corresponds to the number of days on each side of the event date. For a given width, if there is any NA in the event window then the last observation is carried forward.
The authors of the package have provided an example for the xts object (StockPriceReturns) and for Events (SplitDates). This looks like the following:
> data(StockPriceReturns)
> data(SplitDates)
> head(SplitDates)
unit when
5 BHEL 2011-10-03
6 Bharti.Airtel 2009-07-24
8 Cipla 2004-05-11
9 Coal.India 2010-02-16
10 Dr.Reddy 2001-10-10
11 HDFC.Bank 2011-07-14
> head(StockPriceReturns)
Mahindra.&.Mahindra
2000-04-03 -8.3381609
2000-04-04 0.5923550
2000-04-05 6.8097616
2000-04-06 -0.9448889
2000-04-07 7.6843828
2000-04-10 4.1220462
2000-04-11 -1.9078480
2000-04-12 -8.3286900
2000-04-13 -3.8876847
2000-04-17 -8.2886060
So I have constructed my data in the same way, an xts object (DS_xts) and a data.frame (cDS) with the columns "unit" and "when". This is how it looks:
> head(DS_xts)
61241
2011-01-03 0.024247
2011-01-04 0.039307
2011-01-05 0.010589
2011-01-06 -0.022172
2011-01-07 0.018057
2011-01-10 0.041488
> head(cDS)
unit when
1 11754 2012-01-05
2 10104 2012-01-24
3 61241 2012-01-31
4 13928 2012-02-07
5 14656 2012-02-08
6 60097 2012-02-14
These are similar in my opinion, but how it looks does not tell the whole story. I am quite certain that my problem is in how I have constructed these two objects. Below is my R code:
#install.packages("eventstudies")
library("eventstudies")
DS = read.csv("ReturnData.csv")
cDS = read.csv("EventData.csv")
#Calculate Abnormal Returns
DS$AR = DS$RET - DS$VWRETD
#Clean up and let only necessary columns remain
DS = DS[, c("PERMNO", "DATE", "AR")]
cDS = cDS[, c("PERMNO", "DATE")]
#Generate correct date format according to R's as.Date
for (i in 1:nrow(DS)) {
DS$DATE[i] = format(as.Date(toString(DS$DATE[i]), format = "%Y %m %d"), format = "%Y-%m-%d")
}
for (i in 1:nrow(cDS)) {
cDS$DATE[i] = format(as.Date(toString(cDS$DATE[i]), format = "%Y %m %d"), format = "%Y-%m-%d")
}
#Rename cDS columns according to phys2eventtime format
colnames(cDS)[1] = "unit"
colnames(cDS)[2] = "when"
#Create list of unique PERMNO's
PERMNO <- unique(DS$PERMNO)
for (i in 1:length(PERMNO)) {
#Subset based on PERMNO
DStmp <- DS[DS$PERMNO == PERMNO[i], ]
#Remove PERMNO column and rename AR to PERMNO
DStmp <- DStmp[, c("DATE", "AR")]
colnames(DStmp)[2] = as.character(PERMNO[i])
dates <- as.Date(DStmp$DATE)
DStmp <- DStmp[, -c(1)]
#Create a temporary XTS object
DStmp_xts <- xts(DStmp, order.by = dates)
#If first iteration, just create new variable, otherwise merge
if (i == 1) {
DS_xts <- DStmp_xts
} else {
DS_xts <- merge(DS_xts, DStmp_xts, all = TRUE)
}
}
#Renaming columns for matching
colnames(DS_xts) <- c(PERMNO)
#Making sure classes are the same
cDS$unit <- as.character(cDS$unit)
eventList <- phys2eventtime(z = DS_xts, events = cDS, width = 10)
So, if I run phys2eventtime(..) it returns:
> eventList <- phys2eventtime(z = DS_xts, events = cDS, width = 10)
Error in if ((location <= 1) | (location >= length(x))) { :
missing value where TRUE/FALSE needed
In addition: Warning message:
In findInterval(when, index(x)) : NAs introduced by coercion
I have looked at the original function (it is available at their GitHub, can't use more than two links yet) to figure out this error, but I ran out of ideas how to debug it. I hope someone can help me sort it out. As a final note, I have also looked at another (magnificent) answer related to this R package (question: "format a zoo object with “dimnames”=List of 2"), but it wasn't enough to help me solve it (or I couldn't yet comprehend it).
Here is the link for the two CSV files if you would like to reproduce my error (or solve it!).
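One thing that may be worth checking, given the warning from findInterval(when, index(x)): format() returns character strings, so after the loops above the when column holds text such as "2012-01-31" rather than Date values. The base R sketch below shows how a character date and a real Date behave differently in findInterval(). The vectors are hypothetical stand-ins for index(DS_xts) and cDS$when, and the raw 8-digit date layout is an assumption about the CSV files.

```r
# Hypothetical stand-ins for index(DS_xts) and one value of cDS$when
index_dates <- as.Date(c("2012-01-30", "2012-01-31", "2012-02-01"))
when_chr  <- "2012-01-31"        # what format(...) leaves behind: a string
when_date <- as.Date(when_chr)   # a real Date

# A character date cannot be coerced to a number, so the lookup yields NA
# (R warns "NAs introduced by coercion"; suppressed here to keep output clean)
loc_chr <- suppressWarnings(findInterval(when_chr, index_dates))

# A Date is stored as a day count, so the lookup succeeds
loc_date <- findInterval(when_date, index_dates)

# Vectorised conversion that keeps the column as Date (no loop, no format())
raw  <- c(20120105, 20120124, 20120131)  # assumed shape of the raw DATE column
when <- as.Date(as.character(raw), format = "%Y%m%d")
class(when)
```

If this is the cause, converting cDS$when with as.Date() before calling phys2eventtime() would be the first thing to try.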

SMA using R & TTR Package

Afternoon! I'm just starting out with R and learning about data frames, packages, etc... read a lot of the messages here but couldn't find an answer.
I have a table I'm accessing with R that has the following fields:
[Symbol],[Date],[Open],[High],[Low],[Close],[Volume]
And, I'm calculating SMAs on the close prices:
sqlQuery <- "Select * from [dbo].[Stock_Data]"
conn <- odbcDriverConnect(connectionString)
dfSMA <- sqlQuery(conn, sqlQuery)
sma20 <- SMA(dfSMA$Close, n = 20)
dfSMA["SMA20"] <- sma20
When I look at the output, it appears to be calculating the SMA without any regard for what the symbol is. I haven't tried to replicate the calculation, but I would suspect it's just doing it by 20 moving rows, regardless of date/symbol.
How do I restrict the calculation to a given symbol?
Any help is appreciated - just need to be pointed in the right direction.
Thanks
You're far more likely to get answers if you provide reproducible examples. First, let's replicate your data:
library(quantmod)
library(stringr)  # provides str_replace(), used below
symbols <- c("GS", "MS")
getSymbols(symbols)
# Create example data:
dGS <- data.frame("Symbol" = "GS", "Date" = index(GS), coredata(OHLCV(GS)))
names(dGS) <- str_replace(names(dGS), "GS\\.", "")
dMS <- data.frame("Symbol" = "MS", "Date" = index(MS), coredata(OHLCV(MS)))
names(dMS) <- str_replace(names(dMS), "MS\\.", "")
dfSMA <- rbind(dGS, dMS)
> head(dfSMA)
Symbol Date Open High Low Close Volume Adjusted
1 GS 2007-01-03 200.60 203.32 197.82 200.72 6494900 178.6391
2 GS 2007-01-04 200.22 200.67 198.07 198.85 6460200 176.9748
3 GS 2007-01-05 198.43 200.00 197.90 199.05 5892900 177.1528
4 GS 2007-01-08 199.05 203.95 198.10 203.73 7851000 181.3180
5 GS 2007-01-09 203.54 204.90 202.00 204.08 7147100 181.6295
6 GS 2007-01-10 203.40 208.44 201.50 208.11 8025700 185.2161
What you want to do is subset your long data object, and then apply technical indicators to each symbol in isolation. Here is one approach to guide you toward achieving your desired result.
You could do this using a list, building the indicators on an xts object for each symbol rather than on a data.frame as in your example. (You can apply the TTR functions to columns in a data.frame, but it is ugly; working with xts objects is much easier.) This is a template for how you could do it. The final output l.data should be intuitive to work with. Keep each symbol in a separate "container" (an element of the list) rather than combining all the symbols in one data.frame, which isn't easy to work with.
make_xts_from_long_df <- function(x) {
# Subset the symbol you desire
res <- dfSMA[dfSMA$Symbol == x, ]
#Create xts, then allow easy merge of technical indicators
x_res <- xts(OHLCV(res), order.by = res$Date)
merge(x_res, SMA(Cl(x_res), n = 20))
}
l.data <- setNames(lapply(symbols, make_xts_from_long_df), symbols)
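If you would rather keep everything in the long data.frame, a grouped calculation is another option. The sketch below is an alternative to the list approach above, not a replacement for it: it uses base R's ave() to run a moving average within each Symbol. The sma() helper, the toy prices and the 3-row window are assumptions chosen so the example runs without TTR or a database. It also assumes the rows are already sorted by date within each symbol.

```r
# Hypothetical helper: trailing simple moving average, NA for the first n - 1 rows
sma <- function(x, n) as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))

# Toy long-format data (made-up prices), two symbols stacked as in dfSMA
df <- data.frame(
  Symbol = rep(c("GS", "MS"), each = 5),
  Close  = c(200, 199, 201, 204, 206, 80, 81, 79, 82, 84)
)

# ave() applies FUN separately to the Close values of each Symbol,
# so the window never spans two different symbols
df$SMA3 <- ave(df$Close, df$Symbol, FUN = function(x) sma(x, 3))
df
```

The first two rows of each symbol come out as NA, which shows that the window restarts at every symbol boundary.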

Scrape number of articles on a topic per year from NYT and WSJ?

I would like to create a data frame that scrapes the NYT and WSJ and has the number of articles on a given topic per year. That is:
     NYT WSJ
2011   2   3
2012  10   7
I found this tutorial for the NYT but is not working for me :_(. When I get to line 30 I get this error:
> cts <- as.data.frame(table(dat))
Error in provideDimnames(x) :
length of 'dimnames' [1] not equal to array extent
Any help would be much appreciated.
Thanks!
PS: This is my code that is not working (A NYT api key is needed http://developer.nytimes.com/apps/register)
# Need to install from source http://www.omegahat.org/RJSONIO/RJSONIO_0.2-3.tar.gz
# then load:
library(RJSONIO)
### set parameters ###
api <- "API key goes here" ###### <<<API key goes here!!
q <- "MOOCs" # Query string, use + instead of space
records <- 500 # total number of records to return, note limitations above
# calculate parameter for offset
os <- 0:(records/10-1)
# read first set of data in
uri <- paste ("http://api.nytimes.com/svc/search/v1/article?format=json&query=", q, "&offset=", os[1], "&fields=date&api-key=", api, sep="")
raw.data <- readLines(uri, warn="F") # get them
res <- fromJSON(raw.data) # tokenize
dat <- unlist(res$results) # convert the dates to a vector
# read in the rest via loop
for (i in 2:length(os)) {
# concatenate URL for each offset
uri <- paste ("http://api.nytimes.com/svc/search/v1/article?format=json&query=", q, "&offset=", os[i], "&fields=date&api-key=", api, sep="")
raw.data <- readLines(uri, warn="F")
res <- fromJSON(raw.data)
dat <- append(dat, unlist(res$results)) # append
}
# aggregate counts for dates and coerce into a data frame
cts <- as.data.frame(table(dat))
# establish date range
dat.conv <- strptime(dat, format="%Y%m%d") # need to convert dat into POSIX format for this
daterange <- c(min(dat.conv), max(dat.conv))
dat.all <- seq(daterange[1], daterange[2], by="day") # all possible days
# compare dates from counts dataframe with the whole data range
# assign 0 where there is no count, otherwise take count
# (take out PSD at the end to make it comparable)
dat.all <- strptime(dat.all, format="%Y-%m-%d")
# cant' seem to be able to compare Posix objects with %in%, so coerce them to character for this:
freqs <- ifelse(as.character(dat.all) %in% as.character(strptime(cts$dat, format="%Y%m%d")), cts$Freq, 0)
plot (freqs, type="l", xaxt="n", main=paste("Search term(s):",q), ylab="# of articles", xlab="date")
axis(1, 1:length(freqs), dat.all)
lines(lowess(freqs, f=.2), col = 2)
UPDATE: the repo is now at https://github.com/rOpenGov/rtimes
There is a RNYTimes package created by Duncan Temple-Lang https://github.com/omegahat/RNYTimes - but it is outdated because the NYTimes API is on v2 now. I've been working on one for political endpoints only, but not relevant for you.
I'm rewiring RNYTimes right now...Install from github. You need to install devtools first to get install_github
install.packages("devtools")
library(devtools)
install_github("rOpenGov/RNYTimes")
Then try your search with that, e.g.:
library(RNYTimes); library(plyr)
moocs <- searchArticles("MOOCs", key = "<yourkey>")
This gives you number of articles found
moocs$response$meta$hits
[1] 121
You could get word counts for each article by
as.numeric(sapply(moocs$response$docs, "[[", 'word_count'))
[1] 157 362 1316 312 2936 2973 355 1364 16 880
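Once you have one publication date per article, getting to the per-year counts asked for in the question is a small step. The sketch below assumes the dates arrive as "YYYYMMDD" strings, as in the question's strptime() call. The dat vector here is made up for illustration and does not come from the API.

```r
# Hypothetical article dates in the "%Y%m%d" layout used in the question
dat <- c("20110314", "20111102", "20120105", "20120220", "20120917")

# Reduce each date to its year, then tabulate
years <- format(as.Date(dat, format = "%Y%m%d"), "%Y")
cts <- as.data.frame(table(year = years), stringsAsFactors = FALSE)
cts
```

Doing the same for a second source and merging the two tables on year would give one column per newspaper.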
