I wanted to try out the tawny package (v2.1.6) for its portfolio optimization with shrinkage estimators and ran the following example from the documentation page (R 3.4.1 on Win 7):
require(tawny)
require(tawny.types)
require(PerformanceAnalytics)
# Select a portfolio using 200 total observations
data(sp500.subset)
h <- sp500.subset
p <- TawnyPortfolio(h, 150)
b <- BenchmarkPortfolio('^GSPC', 150, nrow(h), end=end(h))
# Optimize using a window of length 200 (there will be 51 total iterations)
ws <- optimizePortfolio(p, RandomMatrixDenoiser())
rs <- PortfolioReturns(p, ws)
o <- zoo(cbind(portfolio=rs, benchmark=b$returns), index(rs))
charts.PerformanceSummary(o)
At the line rs <- PortfolioReturns(p, ws) I get stuck with this error:
Error in UseFunction(type.fn, type.name, ...) :
No valid function for 'PortfolioReturns(TawnyPortfolio,xts)'
The only conclusion I can draw from debugging UseFunction in the lambda.r package is that the PortfolioReturns function expects the second argument as numeric, while I am supplying an xts object. I tried supplying a numeric matrix instead of the xts, i.e. as.numeric(coredata(ws)), without success. My R/lambda.r expertise is not good enough to take it any further.
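(For reference, this is roughly how the dispatch can be inspected; I believe printing a lambda.r multipart function lists its registered clauses and type guards, although the exact output may differ by version:)
library(tawny)
PortfolioReturns   # no parentheses: should print the registered variants and their type guards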
I have two questions:
1) Am I wasting my time with the tawny package (alpha release)? Are there better alternatives you can recommend?
2) Alternatively, is there a way to fix and use that example?
For now, I went the manual route. I replaced the PortfolioReturns() function with MyPortfolioReturns(), using the original as a blueprint:
MyPortfolioReturns <- function(h, weights) {
  # Shift each weight's timestamp forward to the next rebalancing date:
  # weights decided at time t are applied to the returns of the next period
  w.index <- c(index(weights[2:nrow(weights)]), end(weights) + 1)
  index(weights) <- w.index
  # Keep only the return rows for which we have (shifted) weights
  h.trim <- h[index(h) %in% index(weights)]
  # Portfolio return = row-wise sum of asset returns times weights
  ts.rets <- apply(xts(h.trim) * weights, 1, sum)
  ts.rets <- xts(ts.rets, order.by=index(h.trim))
  if (any(is.na(ts.rets))) {
    cat("WARNING: Filling NA returns with 0\n")
    ts.rets[is.na(ts.rets)] <- 0
  }
  return(ts.rets)
}
Note that tawny shifts the weights by one day, as they are applied the following day.
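A minimal sketch of what that shift means, with made-up weights and dates (not from the demo):
library(xts)
w <- xts(matrix(c(0.5, 0.5), nrow = 1), order.by = as.Date("2012-01-02"))
index(w) <- index(w) + 1   # weights decided on Jan 2 are applied to Jan 3 returns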
The rest of the demo code needs some slight adjustments:
rs <- MyPortfolioReturns(p$returns, ws)
o <- xts(cbind(portfolio=rs, benchmark=b$returns[151:200,]), index(rs))
charts.PerformanceSummary(o)
At least now I can chart the results from the portfolio optimization.
I am trying to run the MSClaio2008 function of the nsRFA library in R. Following the documentation, I have successfully run the examples given on the web site. However, when I create my own data set and run it with the MSClaio2008 function, as seen below:
library(nsRFA)
data <- c(91,84,42,30,66,95,65,12,27,61,31,101,48,52,53,55,80,23,87,46,50,33,75,88,54,17,57,39,10,89,59,24,11,43,13,93,105,28,104,18,103,62,22,58,15,34,74,51,97,44,99,76,14,16,109,92,110,40,78,83,60,49,96,36,32,81,68,20,56,25,63,47,37,29,100,71,106,41,90,70,85,38,19,108,73,102,26,82,98,45,77,35,94,79,86,72,107,21,67,69,64)
MSC <- MSClaio2008(data)
it throws this error:
Error in optim(par = Tm1, fn = .logLgam, x = x) : object 'Tm1' not found
I have searched for topics on this platform and found this, but it did not solve my problem.
This looks like a bug in the package, but it is related to the fact that your data are weird. This function is supposedly fitting distributions to observations of hydrological extremes: you have given it an integer sequence from 10 to 110.
As it turns out, when computing the fit for some of the possible distributions (P3 and GEV, I think), the function internally computes the skewness of the data and has a test for what to do if the skewness is positive or negative. It doesn't consider the possibility that the skewness could be exactly 0!
Inside nsRFA:::ML_estimation:
skx <- sum((x - mean(x))^3)/(length(x) * sd(x)^3)
if (skx > 0) {
    Tm1 <- min(x) - 1
    ...
} else if (skx < 0) {
    Tm1 <- max(x) + 1
    ...
}
If the skewness is exactly zero, then the variable Tm1 doesn't get set and the next step that relies on Tm1 breaks.
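You can verify this with the data from the question: it is a permutation of the consecutive integers 10 to 110, which are symmetric about their mean, so the same formula as in ML_estimation gives exactly zero:
all(sort(data) == 10:110)   # TRUE: consecutive integers, perfectly symmetric
skx <- sum((data - mean(data))^3)/(length(data) * sd(data)^3)
skx                         # exactly 0, so Tm1 is never assigned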
If you perturb the data in some way so that the skewness is not exactly zero, this should work, e.g. MSClaio2008(c(8,data)) or MSClaio2008(data+rnorm(length(data), sd=0.001)) ...
It's a weird case, but it would be worth contacting the maintainer (maintainer("nsRFA")) to let them know.
I am getting an Error in 1:T : argument of length 0 when running the PerformanceAnalytics package in R. Am I missing a package? Below is my code with the error.
#clean z, all features, alpha = .01, run below
setwd("D:/LocalData/casaler/Documents/R/RESULTS/PLOTS_PCA/CLN_01")
PGFZ_ALL <- read.csv("D:/LocalData/casaler/Documents/R/PG_DEUX_Z.csv", header=TRUE)
options(max.print = 100000) #Sets ability to view all dealer records
pgfzc_all <- PGFZ_ALL
#head(pgfzc_all,10)
library("PerformanceAnalytics")
library("RGraphics")
# Loading required package: grid   (console output from loading RGraphics)
pgfzc_elev <- pgfzc_all$ELEV
#head(pgfzc_elev,5)
#View(pgfzc_elev)
set.seed(123) #for replication purposes; always use same seed value
cln_elev <- clean.boudt(pgfzc_elev, alpha = 0.01) #set alpha = 0.01 to flag the most extreme outliers
Error in 1:T : argument of length 0
It's hard to answer your question without knowing what your data looks like, but I can tell you what throws that error. Looking into the source code of the clean.boudt function, I find the following cause:
T = dim(R)[1]
...
for (t in c(1:T)) {
    d2t = as.matrix(R[t, ] - mu) %*% invSigma %*% t(as.matrix(R[t, ] - mu))
    vd2t = c(vd2t, d2t)
}
...
...
The dim(R)[1] extracts the number of rows in the data supplied to the R argument of the function. It appears that your data has no rows, so check the data type of pgfzc_elev.
The cause of the error is likely from your use of $ to subset pgfzc_all.
pgfzc_elev <- pgfzc_all$ELEV
I reckon it is an atomic vector of class integer; a vector has no dim attribute, so dim(R)[1] returns NULL and 1:T fails in the function.
Rather subset your object like this:
pgfzc_elev <- pgfzc_all[, "ELEV", drop = FALSE]
Try that and see if it works.
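A minimal sketch of the difference, using a made-up data frame in place of your data:
df <- data.frame(ELEV = c(120L, 135L, 128L))   # hypothetical stand-in for PGFZ_ALL
v <- df$ELEV                      # $ returns an atomic vector
dim(v)                            # NULL, so dim(R)[1] is NULL and 1:T fails
m <- df[, "ELEV", drop = FALSE]   # one-column data frame keeps its dim
dim(m)                            # 3 1, so T = 3 and the loop runs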
I am using the function plkhci from library Bhat to construct profile-likelihood based confidence intervals, and I got this warning:
Warning message: In dqstep(list(label = x$label, est = btrf(xt, x$low,
x$upp), low = x$low, : oops: unable to find stepsize, use default
when I run
r <- dfp(x,f=nlogf)
Can I ignore this warning, as I still get the output?
Here is the complete code:
library(Bhat)
beta0 <- -8
beta1 <- 0.03
gamma <- 0.0105
alpha <- 0.05
n <- 100
u<-runif(n)
u
x<-rnorm(n)
x
c<-rexp(100,1/1515)
c
t1<-(1/gamma)*log(1-((gamma/(exp(beta0+beta1*x)))*(log(1-u))))
t1
t<-pmin(t1,c)
t
delta<-1*(t1>c)
delta
length(delta)
cp<-length(delta[delta==1])/n
cp
delta[delta==1]<-ifelse(rbinom(length(delta[delta==1]),1,0.5),1,2)
delta
deltae<-ifelse(delta==0, 1,0)
deltar<-ifelse(delta==1, 1,0)
deltai<-ifelse(delta==2, 1,0)
dat=data.frame(t,delta, deltae,deltar,deltai,x)
dat$interval[delta==2] <- as.character(cut(dat$t[delta==2], breaks=seq(0, 600, 100)))
labs <- cut(dat$t[delta==2], breaks=seq(0, 600, 100))
dat$lower[delta==2]<-as.numeric( sub("\\((.+),.*", "\\1", labs) )
dat$upper[delta==2]<-as.numeric( sub("[^,]*,([^]]*)\\]", "\\1", labs) )
data0<-dat[which(dat$delta==0),]#uncensored data
data1<-dat[which(dat$delta==1),]#right censored data
data2<-dat[which(dat$delta==2),]#interval censored data
nlogf <- function(para) {
  b0 <- para[1]
  b1 <- para[2]
  g <- para[3]
  # Log-likelihood contributions: uncensored (e), right-censored (r)
  # and interval-censored (i) observations
  e <- sum((b0+b1*data0$x)+g*data0$t+(1/g)*exp(b0+b1*data0$x)*(1-exp(g*data0$t)))
  r <- sum((1/g)*exp(b0+b1*data1$x)*(1-exp(g*data1$t)))
  i <- sum(log(exp((1/g)*exp(b0+b1*data2$x)*(1-exp(g*data2$lower)))-exp((1/g)*exp(b0+b1*data2$x)*(1-exp(g*data2$upper)))))
  l <- e+r+i
  return(-l)  # negative log-likelihood, to be minimized
}
x <- list(label=c("beta0","beta1","gamma"),est=c(-8,0.03,0.0105),low=c(-10,0,0),upp=c(10,1,1))
r <- dfp(x,f=nlogf)
x$est <- r$est
plkhci(x,nlogf,"beta0")
plkhci(x,nlogf,"beta1")
plkhci(x,nlogf,"gamma")
I am giving you a super long answer, but it will help you see that you can chase down your own error messages (most of the time; sometimes this way of looking at functions will not work). It is good to see what is happening inside a method when it throws a warning, because sometimes it is fine and sometimes you need to fix your data.
This function is REALLY involved! You can look at it by typing dfp into the R command line (NO TRAILING PARENTHESES) and it will print out the whole function.
17 lines from the end, you will see an assignment:
del <- dqstep(x, f, sens = 0.01)
You can see that this calls the function dqstep, which is reflected in your warning.
You can see this function by typing dqstep into the R command line again. Reading through this function, which is also long but not so tedious, there is this section of boolean logic:
if (r < 0 | is.na(r) | b == 0) {
    warning("oops: unable to find stepsize, use default")
    cat("problem with ", x$label[i], "\n")
    break
}
This is the culprit; it returns the message you are getting. The line right above it spells out how r is calculated. You are feeding this function your default x from the prior function plus a sensitivity argument (sens = 0.01; dfp generates the rest, and it is huge and ugly, so I did not untangle all of it). When the nested computation returns an r value below zero, an r value of NA, or a b value of zero, that message is displayed.
The second message ("problem with ...") tells you that it was likely b == 0: b is in the denominator, so r came back infinite, and NO STEP SIZE IS RETURNED FROM THIS NESTED FUNCTION to the variable del in dfp.
The step is fed into THIS equation:
h <- logit.hessian(x, f, del, dapprox = FALSE, nfcn)
which you can look into by typing logit.hessian into the R commandline.
When you do, you see that del is a step size on a logit scale, with a default value of del = rep(0.002, length(x$est)) ... which the function set for you because running dqstep returned no value.
So, you now get to decide if using that step size in the calculation of your confidence interval seems right or if there is a problem with your data which needs resolving to make this work better for you.
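One more thing worth noting: your simulation draws u, x and c with runif, rnorm and rexp but never sets a seed, so whether the warning appears can change from run to run. Fixing the seed before the simulation (any value will do) makes the behaviour reproducible while you investigate:
n <- 100
set.seed(1)            # any fixed value
u <- runif(n)
x <- rnorm(n)
c <- rexp(n, 1/1515)   # then continue with the rest of your script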
When I ran it, line by line, I got this message:
Error in if (denom <= 0) { : missing value where TRUE/FALSE needed
at this line of code:
r <- dfp(x,f=nlogf(x))
Which makes me think I was correct.
That is how I chase down issues I have with messages from packages when I get a message like yours.
In a nutshell, I will explain the code:
I am trying to forecast by creating 24 hourly models in a single day and collating the results in a data frame. The basic issue is that I am not able to reproduce the output even after setting the seed. I have made some custom functions and objects, and there is no randomization in them (just FYI). Please, can anyone help me?
f <- as.formula("actual ~ lag.1 + last3.avg + monsoon + mon.thurs + wdaySaturday + wdaySunday + holiday") # Defining the formula for the neural network
require(dplyr); require(neuralnet)
require(lubridate)  # for ymd() below
set.seed(123456)
nnet.hour <- data.frame()  # initialize the results data frame
#k=0
#x=list()
for (i in 1:24) {  # one model per hour of the day
  sub <- new.day.ahead[new.day.ahead$hour == i, ]
  sub$lag.1 <- lag(sub$actual, 1)
  # NB: use a separate index j here so the outer hour counter i
  # is not overwritten by the inner loop
  for (j in 1:nrow(sub)) {
    sub$last3.avg[j] <- sum(lag(sub$actual, 1)[j], lag(sub$actual, 2)[j],
                            lag(sub$actual, 3)[j], na.rm = TRUE)/3
  }
  ind <- which(sub$mod.date == ymd(t[1]))  # t[1] is basically a date initialisation; get its index
  monsoon <- as.factor(sub$Monsoon.Dummy)
  wday <- as.factor(sub$wday.dummy)
  holiday <- as.factor(sub$holiday)
  sub <- as.data.frame(cbind(sub[, c(4, 16, 17)], cbind(
    monsoon = model.matrix(~monsoon)[, -1],
    wday = model.matrix(~wday)[, -1],
    holiday = model.matrix(~holiday)[, -1]
  )))
  names(sub)[5] <- "mon.thurs"
  ## Normalising the data for training in a neural net
  sub[, 2][1] <- 0
  maxs <- apply(sub, 2, max)
  mins <- apply(sub, 2, min)
  scaled <- as.data.frame(scale(sub, center = mins, scale = maxs - mins))
  train <- scaled[1:I(ind - 1), ]
  test <- scaled[ind, ]
  set.seed(123456)
  nn <- neuralnet(f, data = train, hidden = 7, linear.output = TRUE)
  pr.nn <- neuralnet::compute(nn, test[, -1])
  # Normalising back to the original units
  pr.nn. <- pr.nn$net.result * (max(sub$actual) - min(sub$actual)) + min(sub$actual)
  test.r <- test$actual * (max(sub$actual) - min(sub$actual)) + min(sub$actual)
  u <- mape(as.numeric(test.r), as.numeric(pr.nn.))  # Mean Absolute Percentage Error
  if (i == 1) {
    nnet.hour <- data.frame(actual = as.numeric(test.r),
                            forecast1 = as.numeric(pr.nn.), mape = u)
  } else {
    nnet.hour <- rbind(nnet.hour,
                       data.frame(actual = as.numeric(test.r),
                                  forecast1 = as.numeric(pr.nn.), mape = u))
  }
}
Yes, this is solved. For some iterations I had failed to load the 'dplyr' package, so the lag variables I was creating with lag() (the function 'lag' exists in both base R and the dplyr package) were returning just the same series as the variable I was trying to forecast, courtesy of which the errors were ~negligible.
Once I load the dplyr package, the results are reproducible.
Thanks.
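For anyone else hitting this: base stats::lag only shifts the time-series attribute, so on a plain vector the values come back unchanged, while dplyr::lag genuinely shifts the data. A minimal sketch:
x <- c(10, 20, 30, 40)
as.numeric(stats::lag(x, 1))   # 10 20 30 40: same values, only the tsp attribute moves
dplyr::lag(x, 1)               # NA 10 20 30: a genuinely lagged series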
In my statistics class we use Stata and since I'm an R user I want to do the same things in R. I've gotten the right results but it seems like a somewhat awkward way of getting something as simple as confidence intervals.
Here's my crude solution:
library(quantreg)
na <- round(runif(100, min = 127, max = 144))
f <- rq(na ~ 1, tau = .5)                   # na is in the workspace, so no data= argument is needed
s <- summary.rq(f, se = "boot", R = 1000)   # bootstrap standard error
coef(s)[1]                                  # the estimated median
coef(s)[1] + c(-1, 1) * 1.96 * coef(s)[2]   # normal-approximation 95% CI
I've also experimented a little with the boot package, but I haven't gotten it to work:
library(boot)
b <- boot(na, function(w, i) {
  rand_bootstrap_sample <- w[i]
  f <- rq(rand_bootstrap_sample ~ 1, tau = .5)
  return(coef(f))
}, R = 100)
boot.ci(b)
Gives an error:
Error in bca.ci(boot.out, conf, index[1L], L = L, t = t.o, t0 = t0.o, :
estimated adjustment 'a' is NA
My questions:
1) Is there another, better way of getting the confidence interval?
2) Why is the bootstrap code complaining?
Your example does not give an error message for me (Windows 7/64, R 2.14.2), so it could be a problem of random seeds. If you post an example that uses some random method, it is better to add a set.seed line; see the example below.
Note that the error message refers to the bca type of boot.ci; since this one often complains, deselect it by giving type explicitly.
I do not know exactly why you use the rather complex rq in the bootstrap. If you really want to bootstrap rq itself, forget the simple example below, but please give some more details.
library(boot)
set.seed(4711)
na <- round(runif(100, min = 127, max = 144))
b <- boot(na, function(w, i) median(w[i]), R = 1000)  # bootstrap the median directly
boot.ci(b, type = c("norm", "basic", "perc"))         # leave out "bca", which often complains