I'm trying to compute (MACD - signal) / signal on the prices of the Russell 1000 (an index of 1,000 US large-cap stocks). I keep getting the error message below and simply can't figure out why:
Error in EMA(c(49.85, 48.98, 48.6, 49.15, 48.85, 50.1, 50.85, 51.63, 53.5, :
  n = 360 is outside valid range: [1, 198]
I'm still relatively new to R, although I'm proficient in Python. I suppose I could have used try() to work around this error, but I do want to understand at least what is causing it.
Without further ado, here is the code:
N <- 1000
DF_t <- data.frame(ticker = rep("", N), macd = rep(NA, N), stringsAsFactors = FALSE)
stock <- test[['Ticker']]
i <- 0
for (val in stock) {
  dfpx <- bdh(c(val), c("px_last"), start.date = as.Date("2018-01-01"),
              end.date = as.Date("2019-12-30"))
  macd <- MACD(dfpx[, "px_last"], 60, 360, 45, maType = "EMA")
  num <- dim(macd)[1]
  ma <- (macd[num, ][1] - macd[num, ][2]) / macd[num, ][2]
  i <- i + 1
  DF_t[i, ] <- list(val, ma)
}
For your information, bdh() is a Bloomberg command to fetch historical data, dfpx is a data frame, and MACD() is a function that takes a time series of prices and outputs a matrix whose first column holds the MACD values and whose second column holds the signal values.
Thank you very much! Any advice would be really appreciated. By the way, the code works on a small sample of a few stocks, but it throws the error message when I apply it to the universe of one thousand stocks. In addition, each series has about 500 data points, which should be large enough for my parameter choices when computing the MACD.
Q: "...and error handling"

If I may add a grain of salt to this: error prevention is way better than any ex-post error handling.

There is a cheap step, constant O(1) in both the time and space domains, that prevents any such error-related crash in principle: just prepend the instantiated time-series vector with as many constant, process-invariant value cells as the maximum depth of any vector processing requires, and any such error or exception is avoided by construction.

The process-invariant value is, in most cases, the first known value, repeated as many times as needed backwards in time towards the older (pre-sample) bars. This avoids relying on NaNs, and on the trouble NaNs can cause in methods that are sensitive to missing data, as described above. Q.E.D.
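A minimal sketch of that padding idea in R (pad_front is a hypothetical helper name, and the rows produced by the synthetic prefix should be dropped before the results are used):

# Prepend enough copies of the first known price that the deepest
# lookback (360 bars here) never runs out of data.
pad_front <- function(px, depth) c(rep(px[1], depth), px)

px_padded <- pad_front(dfpx[, "px_last"], 360)
macd <- MACD(px_padded, 60, 360, 45, maType = "EMA")
macd <- macd[-(1:360), ]  # discard rows that overlap the synthetic prefix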
For those who are interested, I have found the cause of the error: some stocks have missing prices. It's as simple as that. For instance, Dow US Equity has only about 180 daily prices (for whatever reason) over the past one and a half years, which definitely isn't enough to compute a 360-day moving average.
I basically ran small samples until I eventually pinpointed what caused the error message. Generally speaking, unless you are trying to extract data for more than about 6,000 stocks while querying, say, 50 fields, you are okay. A rule of thumb for the daily usage limit of Bloomberg is said to be around 500,000 for a school console. A PhD colleague working at a trading firm also told me that professional Bloomberg consoles are more forgiving.
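For completeness, a minimal sketch of a guard based on that diagnosis (assuming the fetched series has no internal NA gaps; the 360-row threshold matches the nSlow parameter used above):

for (val in stock) {
  dfpx <- bdh(c(val), c("px_last"), start.date = as.Date("2018-01-01"),
              end.date = as.Date("2019-12-30"))
  # Skip tickers with too few observations for the 360-bar slow EMA.
  if (nrow(dfpx) < 360) next
  macd <- MACD(dfpx[, "px_last"], 60, 360, 45, maType = "EMA")
  num <- dim(macd)[1]
  ma <- (macd[num, ][1] - macd[num, ][2]) / macd[num, ][2]
  i <- i + 1
  DF_t[i, ] <- list(val, ma)
}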
Initially I was using the poissrnd command to generate Poisson-distributed numbers, but I had no information on how to make them 'arrive' in my code. So I decided to generate the inter-arrival times instead, as below.
t = exprnd(1/0.1);
for i = 1:5
    t(end+1) = t(end) + exprnd(1/0.1);
end
% t is like 31.3654 47.1014 72.0024 77.5162 102.3227 104.5794
% Even if this way of producing arrival times is wrong, my question remains the same.
That is all fine to look at, but how can I actually use these times in my code to say that arrival number 1 occurs at time 31.3654, arrival number 2 at time 47.1014, and so on? Ultimately I have to take an arrival, perform some action, then receive another call. I cannot use a loop to increment through such varying numbers (even if I use ceil()).
So, what do people mean when they say they generated arrivals using a Poisson distribution, and how do they actually make use of it? After all, the code can't know whether a number came from a Poisson distribution or a uniform one. I am tired of puzzling over this question; please suggest something.
I'm trying to create a list of conjoint cards using R.
I have followed the professor's introduction with my own dataset, but I'm stuck on this issue and have no idea how to resolve it.
library(conjoint)
experiment <- expand.grid(
  ServiceRange = c("RA", "Active", "Passive", "Basic"),
  IdentProce = c("high", "mid", "low"),
  Fee = c(1000, 500, 100),
  Firm = c("KorFin", "KorComp", "KorStrt", "ForComp")
)
print(experiment)
design <- caFactorialDesign(data = experiment, type = "orthogonal")
print(design)
At the "design" line, I keep getting the following error message:
Error in optFederov(~., data, nTrials = i, approximate = FALSE, nRepeats = 50) :
nTrials must not be greater than the number of rows in data
How do I address this issue?
You're getting this error because you have 144 rows in experiment, but the nTrials mentioned in the error gets bigger than 144. This causes an error for optFederov(), which is called inside caFactorialDesign(). The problem stems from the fact that your Fee column has relatively large values.
I'm not familiar with how the conjoint package is set up, but I can show you how to troubleshoot this error. You can read the conjoint documentation for more on how to select appropriate experimental data.
(Note that the example data in the documentation always has very low numeric values, usually between 1 and 10. Compare that with your Fee vector, which has values up to 1000.)
You can see the source code for a function loaded into your R session by highlighting the function name (e.g. caFactorialDesign) in RStudio and hitting Command-Return (on a Mac - probably something similar on a PC). You can also just look at the source code on GitHub.
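For instance, any R session will print a function's source if you evaluate its name (a quick alternative to the RStudio shortcut):

library(conjoint)
caFactorialDesign              # typing the name prints the function body
# or, without attaching the package:
conjoint::caFactorialDesign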
caFactorialDesign() is implemented here. That link highlights the line (26) that is throwing the error for you:
temp.design<-optFederov(~., data, nTrials=i, approximate=FALSE, nRepeats=50)
Recall the error message:
nTrials must not be greater than the number of rows in data
You've passed in experiment as the data parameter, so nrow(experiment) will tell us what the upper limit on nTrials is:
nrow(experiment) # 144
We can actually just think of the error for this dataset as:
nTrials must not be greater than 144
Ok, so how is the value for nTrials determined? We can see nTrials is actually an argument to optFederov(), and its value is set as i - often a sign that there's a for-loop wrapping an operation. And in fact, that's what we see:
for (i in ca.number:profiles.number)
{
  temp.design <- optFederov(~., data, nTrials = i, approximate = FALSE, nRepeats = 50)
  ...
}
This tells us that optFederov() is going to get called for each value of i in the loop, which will start at ca.number and will go up to profiles.number (inclusive).
How are these two variables assigned? If we look a little higher up in the caFactorialDesign() definition, ca.number is defined on lines 5-9:
num <- data.frame(data.matrix(data))
vars.number <- length(num)
levels.number <- 0
for (i in 1:length(num)) levels.number <- levels.number + max(num[i])
ca.number <- levels.number - vars.number + 1
You can run these calculations outside of the function - just remember that data == experiment. So change that first line to num <- data.frame(data.matrix(experiment)), and then run that chunk of code. You can see that ca.number == 1008!
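Working through those lines with experiment makes the arithmetic explicit (data.matrix() replaces each factor column with its integer level codes, so the Fee column dominates the sum):

num <- data.frame(data.matrix(experiment))
sapply(num, max)
# ServiceRange   IdentProce          Fee         Firm
#            4            3         1000            4
# levels.number = 4 + 3 + 1000 + 4 = 1011; vars.number = 4
# ca.number = 1011 - 4 + 1 = 1008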
In other words, the very first value of i in the for-loop which calls optFederov() is already way bigger than the max limit: 1008 >> 144.
It's possible you can include these numeric values as factors or strings in your definition of experiment - I'm not sure whether that is an appropriate way to do this analysis. But I hope it's clear that you won't be able to use such large numeric values in caFactorialDesign() unless you have a much larger number of total observations in your data.
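For example, one possible workaround (a sketch I haven't validated against the rest of the conjoint workflow) is to code Fee as a factor, so that data.matrix() sees level codes 1-3 rather than the raw prices:

experiment <- expand.grid(
  ServiceRange = c("RA", "Active", "Passive", "Basic"),
  IdentProce = c("high", "mid", "low"),
  Fee = factor(c(1000, 500, 100)),  # level codes 1-3 instead of 100-1000
  Firm = c("KorFin", "KorComp", "KorStrt", "ForComp")
)
design <- caFactorialDesign(data = experiment, type = "orthogonal")

With that change, ca.number works out to (4 + 3 + 3 + 4) - 4 + 1 = 11, comfortably below the 144-row limit.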
I have multiple symbols in a portfolio, but when running a blotter trading strategy the end equity only updates for the last symbol that has been run. Looking into how the equity updates on each transaction, it seems that whenever a new symbol is introduced the equity goes back to the original value set at the beginning (1 million).
This is how the portfolio is getting updated for a symbol month by month:
updatePortf(portfolioName,Symbols=symbolName, Dates=currentDate)
updateAcct(accountName,Dates=currentDate)
updateEndEq(accountName, currentDate)
Why is this happening?
Hope my question makes sense, and thank you in advance.
This is a good question. If you look at applyStrategy you'll see that each symbol in the loop is run in isolation -- independently. You may want to check out applyStrategy.rebalancing, which does a nested loop of the form:
for(i in 2:length(pindex)){
  # the proper endpoints for each symbol will vary, so we need to get
  # them separately, and subset each one
  for (symbol in symbols){
    #sret<-ret[[portfolio]]
This means it loops over blocks of timestamps, and then within each block over every symbol, which is what you want when you need interactions between symbols. (applyStrategy simply does an outer loop over symbols with an inner loop over timestamps, so you'll never get any interaction between symbols in terms of equity.)
When I first started using quantstrat I had the same frustration. My solution was to modify applyStrategy.rebalancing to become a (slower) double loop: for each timestamp, an inner loop across each symbol.
Yes, this means you can't directly compute portfolio P&L accurately in quantstrat, so things like opening positions sized as a proportion of current portfolio equity can't be done directly. (But you can modify the code to do it if you want.)
Why does quantstrat behave this way by default? The authors will give you good reasons. In short, my view (after some brief discussions with the authors) is that if a signal has predictive power and gives you an edge in a strategy, it'll work regardless of how you combine it with other symbols later. quantstrat is about identifying whether signals are good or not in relation to the mktdata you pass to it.
Logically, if a signal is good on a per-symbol level, then it will likely do okay on a portfolio level too (if not better, with smoother portfolio P&L). quantstrat's current approach gives you a reasonable approximation of what portfolio P&L will look like, but not in a true "compounding return" sense. To do that, you'd want to scale your positions according to the current portfolio P&L, which isn't possible in applyStrategy as noted above. This per-symbol simplification also makes simulations much, much faster. Note that you can still introduce interactions with other symbols in applyStrategy by adding additional columns to the symbol data that relate to other symbols (e.g. in pairs trading, etc.), as in the sketch below.
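A rough sketch of that idea using quantmod (the symbols here are purely illustrative; this is not a quantstrat API for it):

library(quantmod)

# Hypothetical example: append SPY's close to AAPL's price history, so
# indicators and signals computed on AAPL's mktdata can reference SPY.
getSymbols(c("AAPL", "SPY"))
AAPL <- merge(AAPL, Cl(SPY))  # adds a SPY.Close column to AAPL's xts object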
At the end of the day, backtest results are always simplifications of trading in reality, so there isn't a big motivation to get "super" accurate backtest results that project profit/trading revenue very precisely.
We have 4 data series, and once in a while one of the 4 has a null because we missed reading that data point. This makes the graph look as if we have awful spikes of lost volume, which is not true; we were just missing the data point.
I am doing a basic sumSeries(server*.InboundCount) right now for servers 1, 2, 3 and 4, where the * matches the server number.
Is there a way Graphite can NOT sum those points on the line and instead leave the sum for those points in time as null, so the line connects from the last point with data to the next point with data?
NOTE: We also display the graphs server*.InboundCount individually to watch for spikes on individual servers.
Or perhaps there is a function that looks at all the series and, if any of the values is null, returns null for every series at that point in time, so the sum receives null+null+null+null, which hopefully doesn't produce a spike and just shows null.
thanks,
Dean
This is an old question but it still deserves an answer as a point of reference. What you're after, I believe, is the function keepLastValue:
Takes one metric or a wildcard seriesList, and optionally a limit to the number of ‘None’ values to skip over. Continues the line with the last received value when gaps (‘None’ values) appear in your data, rather than breaking your line.
This would make your function
sumSeries(keepLastValue(server*.InboundCount))
This will work okay if you have a single null data point here and there. If you have multiple consecutive null data points, you can specify how many null values keepLastValue may skip over before the line breaks. For example, the following will look back over up to 10 values before the sumSeries breaks:
sumSeries(keepLastValue(server*.InboundCount, 10))
I'm sure you've since solved your problems, but I hope this helps someone.
I am currently attempting to implement a trading idea that I have been playing around with. It involves 50+ securities and a strategy very similar to this one (the package I am currently using is quantmod):
http://www.r-bloggers.com/backtesting-a-simple-stock-trading-strategy/
For those who aren't interested in clicking: it is a strategy that looks at the past X days (in his case 200) and enters a position depending on the peak the stock reached. I understand how to apply this strategy to my idea, but I cannot grasp how to aggregate my data into one summary.
Is there a way I can consolidate the summary for all the positions I have entered into one larger portfolio summary and chart that against the S&P 500?
Any advice on where I can find resources, or on being pointed to the right information, would help. I have looked at a portfolio analysis package for R and I do not believe it will be much help to me.
Thank you in advance.
Edit: In the link, at the bottom, there are 3 indexes: FTSE, N225 and DJIA. Could I combine those 3 summaries to show the same output as below, but combined?
FTSE:
Me Index
Cumulative Return 3.56248582 3.8404476
Annual Return 0.05667121 0.0589431
Annualized Sharpe Ratio 0.45907768 0.3298633
Win % 0.53216374 0.5239884
Annualized Volatility 0.12344579 0.1786895
Maximum Drawdown -0.39653398 -0.5256991
Max Length Drawdown 1633.00000 2960.0000
Could I get that same output but for the 3 securities' data combined? Is there an effective way of doing that? Thank you so much, and happy holidays.
It's a little unclear to me what you mean by "combine" in this case. If you want a single column representing the combined returns from all three exchanges as if they were a single unified market, that's really tricky, because the exchanges trade in different currencies (British pounds, U.S. dollars and Japanese yen). The underlying analysis would have to be modified substantially to take fluctuating daily foreign exchange rates into account.
I suspect that this is NOT what you want. Rather, you are simply asking how to take three sequential two-column outputs and turn them into a single parallel six-column output.
If that is indeed what you want, then you need to rewrite the testStrategy() function shown near the bottom of the link. As currently written, that function takes three inputs: an index name myStock (with allowed values of FTSE, DJIA, or N225) and two integer values, nHold and nHigh. You would need to change it so that it instead accepts five inputs, e.g. myStockA, myStockB and myStockC, plus the two integer values already mentioned. Then each of the lines currently referring to myStock would have to be replicated three times. Finally, the two cbind() lines that you see at the bottom would have to be modified so that instead of merging the data into only two columns, you include all six.
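If you'd rather not rewrite the function internals, a wrapper gives the same six-column shape (a sketch only; it assumes testStrategy() returns the two-column summary shown above, and the nHold/nHigh values here are illustrative):

# Run the linked post's testStrategy() once per market and bind the
# three two-column summaries into one six-column table.
testStrategy3 <- function(myStockA, myStockB, myStockC, nHold, nHigh) {
  cbind(testStrategy(myStockA, nHold, nHigh),
        testStrategy(myStockB, nHold, nHigh),
        testStrategy(myStockC, nHold, nHigh))
}

testStrategy3("FTSE", "DJIA", "N225", nHold = 20, nHigh = 200)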
For a good intro tutorial on how to write and modify your own R functions, please see this. To understand how to use the cbind() function, which you will have to call with six rather than two inputs, please see this.