EnvStats simulateVector function - r

I'm using the EnvStats package, more specifically the simulateVector function, to generate random samples from pdfs.
I've tried using a Normal pdf and varying the parameters that truncate this pdf:
> vfy <- simulateVector(10, distribution = "norm",
+ param.list = list(mean = 400, sd = 40), seed = 47,
+ sort = FALSE, left.tail.cutoff = 1, right.tail.cutoff = 1)
> vfy
[1] 479.7879 428.4457 407.4162 388.7294 404.3510 356.5705 360.5807 400.6052 389.9182 341.3700
> vfy <- simulateVector(10, distribution = "norm",
+ param.list = list(mean = 400, sd = 40), seed = 47,
+ sort = FALSE, left.tail.cutoff = 0, right.tail.cutoff = 0)
> vfy
[1] 479.7879 428.4457 407.4162 388.7294 404.3510 356.5705 360.5807 400.6052 389.9182 341.3700
To my surprise, the results do not vary. What's wrong? Thanks

The left.tail.cutoff and right.tail.cutoff arguments are only relevant when you use sample.method = "LHS" for Latin Hypercube sampling.
The default is sample.method = "SRS" for simple random sampling, which uses the rnorm() function. The help file states "This argument is ignored if sample.method="SRS"."
See ?simulateVector() for the default arguments.
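As a minimal sketch of the point above, the call from the question can be repeated with sample.method = "LHS" so that the cutoffs take effect. The 0.05 cutoff values are my own illustrative choice, not taken from the question. The help file describes the cutoffs as proportions of each tail to omit, so values such as 0.05 are more meaningful than 0 or 1.

```r
library(EnvStats)

# Same distribution and seed as in the question, but with Latin Hypercube
# sampling so that the tail cutoffs are actually used.
# The 0.05 cutoffs are illustrative values (an assumption, not from the question).
vfy_lhs <- simulateVector(10, distribution = "norm",
  param.list = list(mean = 400, sd = 40), seed = 47,
  sample.method = "LHS",
  left.tail.cutoff = 0.05, right.tail.cutoff = 0.05)

# The draws are expected to fall between the 5th and 95th percentiles
# of a Normal(mean = 400, sd = 40) distribution.
bounds <- qnorm(c(0.05, 0.95), mean = 400, sd = 40)
range(vfy_lhs)
bounds
```

Changing the two cutoff values and re-running should now change the range of the simulated values, unlike with the default "SRS" method.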

Related

Error while running WTC (Wavelet Coherence) Codes in R

I am doing Wavelet Analysis in R using Biwavelet. However, I receive the error message:
Error in check.datum(y) :
The step size must be constant (see approx function to interpolate)
When I run the following code:
wtc.AB = wtc(t1, t2, nrands = nrands)
Please share your help here. Complete Code is:
# Import your data
Data <- read.csv("https://dl.dropboxusercontent.com/u/18255955/Tutorials/Commodities.csv")
# Attach your data so that you can access variables directly using their
# names
attach(Data)
# Define two sets of variables with time stamps
t1 = cbind(DATE, ISLX)
t2 = cbind(DATE, GOLD)
# Specify the number of iterations. The more, the better (>1000). For the
# purpose of this tutorial, we just set it = 10
nrands = 10
wtc.AB = wtc(t1, t2, nrands = nrands)
# Plotting a graph
par(oma = c(0, 0, 0, 1), mar = c(5, 4, 5, 5) + 0.1)
plot(wtc.AB, plot.phase = TRUE, lty.coi = 1, col.coi = "grey", lwd.coi = 2,
lwd.sig = 2, arrow.lwd = 0.03, arrow.len = 0.12, ylab = "Scale", xlab = "Period",
plot.cb = TRUE, main = "Wavelet Coherence: A vs B")

Using target risk or target return in R package fPortfolio

I use the R package fPortfolio for portfolio optimization for a rolling portfolio (adaptive asset allocation). Therefore, I use the backtesting function.
I aim at constructing a portfolio for a set of assets for a predefined target return (and minimized risk) or for a predefined target risk and maximized returns.
Even allowing for short selling (as proposed in another post 5 years ago) seems not to work. Besides, I do not want to allow for short selling in my approach.
I cannot figure out why changing the values for target return or target risk does not influence the solution at all.
Where do I go wrong?
require(quantmod)
require(fPortfolio)
require(PortfolioAnalytics)
tickers= c("SPY","TLT","GLD","VEIEX","QQQ","SHY")
getSymbols(tickers)
data.raw = as.timeSeries(na.omit(cbind(Ad(SPY),Ad(TLT),Ad(GLD),Ad(VEIEX),Ad(QQQ),Ad(SHY))))
data.arith = na.omit(Return.calculate(data.raw, method="simple"))
colnames(data.arith) = c("SPY","TLT","GLD","VEIEX","QQQ","SHY")
cvarSpec <- portfolioSpec(
model = list(
type = "CVAR",
optimize = "maxReturn",
estimator = "covEstimator",
tailRisk = list(),
params = list(alpha = 0.05, a = 1)),
portfolio = list(
weights = NULL,
targetReturn = NULL,
targetRisk = 0.08,
riskFreeRate = 0,
nFrontierPoints = 50,
status = 0),
optim = list(
solver = "solveRglpk.CVAR",
objective = NULL,
params = list(),
control = list(),
trace = FALSE))
backtest = portfolioBacktest()
setWindowsHorizon(backtest) = "12m"
assets <- SPY ~ SPY + TLT + GLD + VEIEX + QQQ + SHY
portConstraints ="LongOnly"
myPortfolio = portfolioBacktesting(
formula = assets,
data = data.arith,
spec = cvarSpec,
constraints = portConstraints,
backtest = backtest,
trace = TRUE)
setSmootherLambda(myPortfolio$backtest) <- "1m"
myPortfolioSmooth <- portfolioSmoothing(myPortfolio)
backtestPlot(myPortfolioSmooth, cex = 0.6, font = 1, family = "mono")

How do I specify numerical and categorical variables in catboost with R?

The tutorial for catboost with R says this:
library(catboost)
countries = c('RUS','USA','SUI')
years = c(1900,1896,1896)
phone_codes = c(7,1,41)
domains = c('ru','us','ch')
dataset = data.frame(countries, years, phone_codes, domains)
label_values = c(0,1,1)
fit_params <- list(iterations = 100,
loss_function = 'Logloss',
ignored_features = c(4,9),
border_count = 32,
depth = 5,
learning_rate = 0.03,
l2_leaf_reg = 3.5)
pool = catboost.load_pool(dataset, label = label_values, cat_features = c(0,3))
model <- catboost.train(pool, params = fit_params)
However, this results in:
Error in catboost.from_data_frame(data, label, pairs, weight, group_id, :
Unsupported column type: character
Many thanks,

Generating correlated variables

I am studying the effects of skewness and kurtosis on the Pearson corrections to bivariate correlations for range restriction. Currently I am using R and "rcorrvar" as it should allow me to generate correlated vectors with a specifiable skew and kurtosis. When I run it as below
rcorrvar(n = 100, k_cont = 2, k_CAT = 2,pois = 2, k_nb = 0,
method = c("Fleishman", "Polynomial"), means = 0, vars = 1,
skews = 2,skurts = 4,fifths = NULL, sixths = NULL,
Six = list(), marginal = list(), support = list(), nrand = 100,
lam = NULL, size = NULL, prob = NULL, mu = NULL, Sigma = NULL,
rho = NULL, cstart = NULL, seed = 1234, errorloop = FALSE,
epsilon = 0.001, maxit = 1000, extra_correct = TRUE)
Error in rcorrvar(n = 100, k_cont = 2, k_CAT = 2, pois = 2, k_nb = 0, :
unused arguments (k_CAT = 2, pois = 2)
How do I correct these errors?
Assuming that the rcorrvar function you're using is from the SimMultiCorrData package, it appears that you have misspelled two argument names: k_CAT should be k_cat, and pois should be k_pois.
Please note that names in R, including argument names, are case-sensitive.
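The same error can be reproduced without the package. The sketch below uses toy_sim, a hypothetical stand-in function that I made up for illustration (it is not the real rcorrvar), to show how R reports miscased argument names and how formals() lists the names a function accepts.

```r
# Hypothetical stand-in with two similarly named arguments; NOT the real rcorrvar
toy_sim <- function(n, k_cat = 0, k_pois = 0) {
  c(n = n, k_cat = k_cat, k_pois = k_pois)
}

# Miscased or misspelled names trigger the "unused arguments" error;
# tryCatch captures the message instead of stopping the script
msg <- tryCatch(toy_sim(n = 100, k_CAT = 2, pois = 2),
                error = function(e) conditionMessage(e))
msg

# Listing the accepted argument names makes the mismatch easy to spot
names(formals(toy_sim))

# With the correct spelling the call succeeds
res <- toy_sim(n = 100, k_cat = 2, k_pois = 2)
res
```

For the real function, names(formals(rcorrvar)) or args(rcorrvar) should give the same kind of listing once SimMultiCorrData is loaded.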

ddply to ksmooth function

I have a data frame with several columns. The relevant three are chr, pos and ratio. I want to use ddply to ksmooth based on chr (chromosome) but keep getting a wrong data frame with lots of NA values. Here is my reproducible data frame:
d=data.frame(chr=c(rep.int(1,24),rep.int(2,15),rep.int(3,30),rep.int(4,20),rep.int(5,11)),
pos=c(sort(sample(1:1000, size = 24, replace = FALSE),decreasing = FALSE), sort(sample(1:1000, size = 15, replace = FALSE),decreasing = FALSE), sort(sample(1:1000, size = 30, replace = FALSE),decreasing = FALSE), sort(sample(1:1000, size = 20, replace = FALSE),decreasing = FALSE), sort(sample(1:1000, size = 11, replace = FALSE),decreasing = FALSE)),
ratio=seq(1:100))
and ddply function
f <- ddply(d, .(chr),
function(e) {
as.data.frame(ksmooth(e$pos,e$ratio,"normal",bandwidth=10))
})
Obviously I'm doing something wrong.
Thanks for the help,
Guy
This has nothing to do with plyr::ddply. The issue is with ksmooth. You want:
ksmooth(e$pos, e$ratio, "normal", bandwidth=10, x.points = e$pos)
Read ?ksmooth for what x.points means. By default, this is NULL, and ksmooth will use n.points instead. This is the source of all your trouble.
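To illustrate the difference, here is a small base R sketch on a single made-up group (the seed and sample are my own, not the data from the question). Without x.points, ksmooth evaluates the fit on an evenly spaced grid of max(100, length(x)) points, and with a narrow bandwidth many grid points have no observations nearby, which typically yields NA. With x.points = pos, the fit is evaluated only at the observed positions.

```r
set.seed(1)  # arbitrary seed, chosen only to make the sketch reproducible

# One illustrative group: 24 sorted positions and a simple response
pos   <- sort(sample(1:1000, size = 24))
ratio <- seq_along(pos)

# Default behaviour: evaluation on an evenly spaced grid across range(pos)
grid_fit <- ksmooth(pos, ratio, "normal", bandwidth = 10)

# Evaluation at the observed positions only
obs_fit <- ksmooth(pos, ratio, "normal", bandwidth = 10, x.points = pos)

length(grid_fit$y)  # number of grid points, not number of observations
sum(is.na(grid_fit$y))  # grid points with no data within kernel range
length(obs_fit$y)   # one fitted value per observation
anyNA(obs_fit$y)
```

Inside the ddply call, the only change needed is adding x.points = e$pos to the ksmooth call, so that each chromosome returns one row per observed position.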
