Error in unscale function in R

The following is a column that I normalised using R's scale function:
Percentage
18.12807882
11.91957875
0
9.523809524
6.789250354
1.654411765
8.027327071
10.13061485
7.576493443
8.230958231
14.82579689
12.27436823
26.59574468
15.7537155
I used scale to normalise this column. I then predicted the following output using a decision tree:
-0.5041396 0.6601254 -0.2216793
How do I convert this back to its original units? I have tried
unscale(norm.data = DF$Percentage, vals = Q, col.ids = Q)
which yields the error: Incorrect dimension of data to unscale.
What should I be doing exactly? I need some guidance.
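For what it's worth, here is a minimal sketch of undoing scale() by hand, assuming the column was scaled with the default center and scale (the names scaled and preds are placeholders, not from the question): scale() stores its centering and scaling values as attributes on its result, so the transformation can be inverted directly.
scaled <- scale(DF$Percentage)                  # keeps "scaled:center" and "scaled:scale"
preds  <- c(-0.5041396, 0.6601254, -0.2216793)  # the decision-tree predictions
preds * attr(scaled, "scaled:scale") + attr(scaled, "scaled:center")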

Related

New Probability Model fitting using R Code

I have developed a new probability model using a generalized mixturing technique. Now I want to fit it to a discrete data set, but I am getting the error
Error in seq.default(0, x) : 'to' must be of length 1
and I don't understand how to handle this.
The R code is below:
# rm(list=ls(all=TRUE))
obs <- rep(seq(0, 6), c(260, 87, 32, 4, 1, 0, 0))
NBWED <- function(x, r, alpha, beta) {
  j <- seq(0, x)
  C <- function(n, x) {
    factorial(n) / (factorial(n - x) * factorial(x))
  }
  C(x + r - 1, x) * sum(C(x, j) * (-1)^j * (alpha^2 / (alpha + beta)) *
    ((r + j + alpha) + beta) / (r + j + alpha)^2)
}
library(MASS)
fit09 <- fitdistr(x = obs, densfun = NBWED,
                  start = list(r = 1, alpha = 0.5, beta = 9.4),
                  lower = list(a = 0.1, 0.001, 0.001),
                  upper = c(Inf, Inf, Inf))
fit09
seq(0, n) creates a vector from 0 to n, so the to argument must be a single number. Probably your x is a vector or something similar (fitdistr() evaluates the density function on the whole data vector), therefore it throws an error.
Just try seq(0, 5), then seq(0, c(5, 6)), and see the result. It would help.
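A possible fix, sketched under the assumption that NBWED is meant to be evaluated pointwise: vectorize it over x so that fitdistr() can call it on the full data vector. Whether the fit then converges depends on the density itself, which is a separate question.
NBWED_vec <- Vectorize(NBWED, vectorize.args = "x")  # applies NBWED elementwise over x
NBWED_vec(0:3, r = 1, alpha = 0.5, beta = 9.4)       # one value per element of x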

R non-linear model fitting using fitModel function

I want to fit a non-linear model to real data. The data consist of two known numerical vectors: thickness as 'x' and fh as 'y'.
thickness=seq(0.15,2.00,by=0.05)
fh = c(5.17641, 4.20461, 3.31091, 2.60899, 2.23541, 1.97771, 1.88141, 1.62821, 1.50138, 1.51075, 1.40850, 1.26222, 1.09432, 1.13202, 1.12918, 1.10355, 1.11867, 1.09740, 1.08324, 1.05687, 1.19422, 1.22984, 1.34516, 1.19713, 1.25398, 1.29885, 1.33658, 1.31166, 1.40332, 1.39550, 1.37855, 1.41491, 1.59549, 1.56027, 1.63925, 1.72440, 1.74192, 1.82049)
plot(thickness,fh)
This is apparently non-linear, so I am trying to fit the model
y = x*2/3 + (2 + 2*a)/(3*x)
where a is an unknown constant; I am trying to find the constant a that minimizes the sum of squared errors between the regression curve and the real data.
I first used the function fitModel, which I found in a YouTube video, Fitting Functions to Data in R.
library(TIMP)
f=fitModel(fh~thickness^2/3+(2+2*A)/(3*thickness)) #it finds the coefficient 'A'
coef(f) # to represent just the coefficient
However, this throws an error:
Error in modelspec[[datasetind[i]]] : subscript out of bounds
So, as an alternative, I want to plot 'a' against the sum of squared errors, but I am having a hard time finding 'a' and producing this plot. By manual work I figured out that 'a' is somewhere near 0.2, but this is not a precise value.
It would be helpful if someone could explain either:
why the fitModel function didn't work, or
how to find the value a and plot the graph.
You could try this instead:
yf <- function(a, xv) xv*(2/3) + (2 + 2*a)/(3*xv)  # the model as a function of a
yf(2, thickness)                                   # sanity check at a = 2
f <- function(a, y, xv) sum((y - yf(a, xv))^2)     # sum of squared errors
f(2, fh, thickness)
xmin <- optimize(f, c(0, 10), tol = 0.0001, y = fh, xv = thickness)  # minimise over a
xmin
plot(thickness, fh)
lines(thickness, yf(xmin$minimum, thickness), col = 3)  # overlay the fitted curve
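The question also asks for the plot of 'a' against the sum of squared errors; here is a minimal sketch of that plot built on the code above (the grid range 0 to 1 is an assumption, chosen because the minimum is near 0.2):
a_grid <- seq(0, 1, by = 0.01)                    # assumed search grid for a
sse <- sapply(a_grid, f, y = fh, xv = thickness)  # SSE at each candidate a
plot(a_grid, sse, type = "l", xlab = "a", ylab = "sum of squared errors")
abline(v = xmin$minimum, col = 3)                 # minimum found by optimize()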

Unknown error message when attempting to find MLE in R

I'm trying to find the MLE of a distribution whose pdf is specified as mixture in the code. I've provided the code below, which gives the error
"Error in optim(start, f, method = method, hessian = TRUE, ...) :
L-BFGS-B needs finite values of 'fn'"
claims is the dataset I'm using. I tried the same code with just the first two values of claims and encountered the same problem, so for a reproducible example the first two values are 1536.77007 and 1946.92409.
The limits on the parameters of the distribution are 0 < p < 1, a > 0 and b > 0, hence the lower and upper bounds in the mle() call. Any help is much appreciated.
# create a mixture of two exponential distributions
mixture <- function(x, p, a, b) {
  d <- p*a*exp(-a*x) + (1 - p)*b*exp(-b*x)
  d
}
# find the MLE of the mixture distribution
library(stats4)  # mle() lives here
LL <- function(p, a, b) {
  X <- mixture(claims, p, a, b)
  -sum(log(X))
}
mle(LL, start = list(p = 0.5, a = 1/100, b = 1/100), method = "L-BFGS-B",
    lower = c(0, 0, 0), upper = c(1, Inf, Inf))
Edit: I'm not really sure why dput() was requested, but anyway,
#first two values of claims put into dput() (the actual values are above)
dput(claims[1:2])
c(307522.103, 195633.5205)
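One thing worth checking (a sketch, not a definitive diagnosis): with claims values this large, exp(-a*x) underflows to exactly 0 at the starting values, so log(X) is -Inf and L-BFGS-B sees a non-finite objective; the same thing happens if the optimizer reaches the lower bound p = a = b = 0, where the density is 0.
claims <- c(307522.103, 195633.5205)      # the dput() values above
mixture(claims, 0.5, 1/100, 1/100)        # underflows to c(0, 0) for claims this size
log(mixture(claims, 0.5, 1/100, 1/100))   # -Inf, hence "needs finite values of 'fn'"
Rescaling the data (e.g. claims/1000) or keeping the lower bounds strictly inside the parameter space are the usual workarounds.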

How to weight observations in mxnet?

I am new to neural networks and the mxnet package in R. I want to do a logistic regression on my predictors, since my observations are probabilities varying between 0 and 1. I'd like to weight my observations by a vector obsWeights I have, but I'm not sure where to implement the weights. There seems to be a weight= option in mx.symbol.FullyConnected, but if I try weight=obsWeights I get the following error message:
Error in mx.varg.symbol.FullyConnected(list(...)) :
Cannot find argument 'weight', Possible Arguments:
----------------
num_hidden : int, required
Number of hidden nodes of the output.
no_bias : boolean, optional, default=False
Whether to disable bias parameter.
How should I proceed to weight my observations? Here is my code at the moment.
# Prepare data
train.mm = model.matrix(obs ~ . , data = train_data)
train_label = train_data$obs
# Normalize
train.mm = apply(train.mm, 2, function(x) (x-min(x))/(max(x)-min(x)))
# Create MXDataIter compatible iterator
batch_size = 128
train.iter = mx.io.arrayiter(data=t(train.mm), label=train_label,
batch.size=batch_size, shuffle=T)
# Symbolic model definition
data = mx.symbol.Variable('data')
fc1 = mx.symbol.FullyConnected(data=data, num.hidden=128, name='fc1')
act1 = mx.symbol.Activation(data=fc1, act.type='relu', name='act1')
final = mx.symbol.FullyConnected(data=act1, num.hidden=1, name='final')
logistic = mx.symbol.LogisticRegressionOutput(data=final, name='logistic')
# Run model
mxnet_train = mx.model.FeedForward.create(
symbol = logistic,
X = train.iter,
initializer = mx.init.Xavier(rnd_type = 'gaussian', factor_type = 'avg', magnitude = 2),
num.round = 25)
Assigning the fully connected weight argument is not what you want to do at any rate. That weight is a reference to the parameters of the layer, i.e. what you multiply the inputs by to get output values. These are the parameter values you're trying to learn.
If you want to make some samples matter more than others, you'll need to adjust the loss function, for example by multiplying the usual loss by your weights so that down-weighted samples contribute less to the overall average loss.
I do not believe the standard MXNet loss functions have a spot for assigning weights (that is, LogisticRegressionOutput won't cover this). However, you can make your own cost function that does. This would involve passing your final layer through a sigmoid activation function to first generate the usual logistic regression output value, then passing that into the loss function you define. You could use squared error, but for logistic regression you'll probably want the cross-entropy function:
-(l * log(y) + (1 - l) * log(1 - y)),
where l is the label and y is the predicted value.
Ideally, you'd write a symbol with an efficient definition of the gradient (MXNet has a cross-entropy function, but it's for softmax input, not a binary output; you could translate your output to two outputs with softmax as an alternative, but that seems less easy to work with in this case). The easiest path, however, is to let MXNet do its autodiff. Then you multiply that cross-entropy loss by the weights.
I haven't tested this code, but you'd ultimately have something like this (this is what you'd do in Python; it should be similar in R):
label = mx.sym.Variable('label')
out = mx.sym.Activation(data=final, act_type='sigmoid')
ce = -(label * mx.sym.log(out) + (1 - label) * mx.sym.log(1 - out))  # binary cross-entropy
weights = mx.sym.Variable('weights')
loss = mx.sym.MakeLoss(weights * ce, normalization='batch')
Then you want to input your weight vector into the weights Variable along with your normal input data and labels.
As an added tip, the output of an MXNet network with a custom loss via MakeLoss is the loss, not the prediction. You'll probably want both in practice, in which case it's useful to group the loss with a gradient-blocked version of the prediction so that you can get both. You'd do that like this:
pred_loss = mx.sym.Group([mx.sym.BlockGrad(out), loss])
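Since the question is in R, here is a rough, untested translation of the Python sketch above, assuming the R binding exposes the same operators under its usual dot-separated names:
label   <- mx.symbol.Variable('label')
weights <- mx.symbol.Variable('weights')
out  <- mx.symbol.Activation(data = final, act.type = 'sigmoid')
ce   <- -(label * mx.symbol.log(out) + (1 - label) * mx.symbol.log(1 - out))
loss <- mx.symbol.MakeLoss(weights * ce)
pred_loss <- mx.symbol.Group(mx.symbol.BlockGrad(out), loss)  # loss plus blocked prediction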

Portfolio Optimization using non-linear optimizer

I have been attempting to optimize a portfolio of returns (maximum return subject to a target risk) using a known mu and covariance matrix, subject to box and group constraints. It seems that the best solution is to create my own function using the Rdonlp2 package. However, when testing this package under very basic (long-only) constraints, it only produces an equal-weighted portfolio. When I added a box and a group constraint, it produced an equal-weighted portfolio subject to the box constraint but did not register the group constraint.
install.packages("fPortfolio")
install.packages("Rdonlp2", repos="http://R-Forge.R-project.org")
If you do not already have the packages ^
library(fPortfolio)
library(Rdonlp2)
lppData=100*LPP2005.RET[,1:6]
mr1Spec = portfolioSpec()
setTargetRisk(mr1Spec) = 0.1
setSolver(mr1Spec) = "solveRdonlp2"
efficientPortfolio(data=lppData, spec=mr1Spec, constraints="LongOnly")
efficientPortfolio(data=lppData, spec=mr1Spec, constraints= c("maxsum[1:6]=.75", "maxW[1:6]=.1"))
Has anyone had success using this optimizer? Any help getting the portfolio to optimize correctly, or setting up my own function, would be greatly appreciated!
@WaltS If I use only box constraints and don't specify the optimizer, I get a solution (code below):
library(fPortfolio)
lppData=100*LPP2005.RET[,1:6]
mr1Spec = portfolioSpec()
portfolioFrontier(data=lppData, spec=mr1Spec, constraints=c("minW[1:6]=-1", "maxW[1:6]=2"))
However, if I add group constraints...
portfolioFrontier(data=lppData, spec=mr1Spec, constraints= c("maxsumW[1:6]=.2", "maxsumW[1:6]=.75"))
and run it, I get the following error:
Error in `colnames<-`(`*tmp*`, value = c("SBI", "SPI", "SII", "LMI", "MPI", :
attempt to set 'colnames' on an object with less than two dimensions
In the box-only case, do you know if it's possible to pull out the portfolio weights corresponding to the vol closest to .15, which would be a var of about .38?
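For that last point, a hedged sketch, assuming fPortfolio's frontier accessors frontierPoints() and getWeights() behave as documented:
frontier <- portfolioFrontier(data = lppData, spec = mr1Spec,
                              constraints = c("minW[1:6]=-1", "maxW[1:6]=2"))
pts <- frontierPoints(frontier)                  # targetRisk / targetReturn pairs
i <- which.min(abs(pts[, "targetRisk"] - 0.15))  # frontier point nearest vol .15
getWeights(frontier)[i, ]                        # weights at that point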
