zfit straight line fitting for 2 dim dataset

I would like to fit a 2D plot with a straight line (a*x+b) using zfit, as in the following figure.
This is very easy with the probfit package, but it has been deprecated by scikit-hep. https://nbviewer.jupyter.org/github/scikit-hep/probfit/blob/master/tutorial/tutorial.ipynb
How can I fit such 2D plots with an arbitrary function?
I've checked the zfit examples, but they seem to assume a distribution (histogram), so zfit requires a dataset like a 1D array, and I couldn't figure out how to pass 2D data to zfit.

There is currently no out-of-the-box (one-line) way to do this in zfit, since a corresponding loss has simply not been added.
However, SimpleLoss (zfit.loss.SimpleLoss) allows you to construct any loss you can think of (have a look at the example in its docstring as well). In your case, this would look something like this:
import tensorflow as tf
import zfit

x = your_data
y = your_targets  # y-value
obs = zfit.Space('x', (lower, upper))
param1 = zfit.Parameter(...)
param2 = zfit.Parameter(...)
model = Func(...)  # a function is the way to go here
data = zfit.Data.from_numpy(array=x, obs=obs)

def mse():
    prediction = model.func(data)
    value = tf.reduce_mean((prediction - y) ** 2)  # or whatever you want to have
    return value

loss = zfit.loss.SimpleLoss(mse, [param1, param2])
# etc.
On another note, it would be a good idea to add such a loss. If you're interested in contributing, I recommend getting in contact with the authors; they will gladly help and guide you.
The loss function itself presumably consists of three to four things: x, y, a model, and maybe an uncertainty on y. The chi2 loss looks like this:
def chi2():
    y_pred = model.func(x)
    # sum of squared residuals, each normalized by its uncertainty
    return tf.reduce_sum(((y_pred - y) / y_error) ** 2)

loss = zfit.loss.SimpleLoss(chi2, model.get_params())
That's all, 4 lines of code. x is a zfit.Data object, model is in this case a Func.
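To make explicit what the chi2 loss above measures, here is a minimal sketch in plain Python. It does not use zfit at all; the function names chi2 and fit_line and the sample numbers are made up for illustration. For a straight line a*x + b the minimum of this chi2 has a closed form (weighted least squares), which can be handy as a cross-check of whatever a minimizer returns.

```python
def chi2(a, b, x, y, y_error):
    """Sum of squared residuals of the line a*x + b, each divided by its uncertainty."""
    return sum(((a * xi + b - yi) / si) ** 2 for xi, yi, si in zip(x, y, y_error))


def fit_line(x, y, y_error):
    """Closed-form weighted least-squares minimizer of chi2 for a straight line."""
    w = [1.0 / si ** 2 for si in y_error]           # weights 1 / sigma^2
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * sxx - sx ** 2                        # zero if all x are identical
    a = (sw * sxy - sx * sy) / det                  # slope
    b = (sxx * sy - sx * sxy) / det                 # intercept
    return a, b


# Illustrative data only (not from the question).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 5.2, 6.8, 9.1]
y_error = [0.2, 0.2, 0.3, 0.3, 0.4]

a_hat, b_hat = fit_line(x, y, y_error)
print(a_hat, b_hat, chi2(a_hat, b_hat, x, y, y_error))
```

With zfit, the same quantity would be computed inside the function handed to SimpleLoss, and the minimizer would search for a and b numerically instead of using the closed form.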
Does that work?
That's all.


Split output of LSTM to do computation on each vector in keras/tensorflow

I'm trying to implement a custom layer in Keras. In the end it should implement an attention layer. So what I want to do is take the output of an LSTM and perform a computation on every vector of the output.
I've got an LSTM with return_sequences=True, so I get an output with a shape like (batch_size, num_vectors, dim_vector).
How can I access a single vector in the call function of the custom layer? Or better, how can I split the input tensor to get a list of tensors with shape (dim_vector)?
So there should be batch_size * num_vectors vectors/tensors in this list.
What I want to do looks roughly like this:
for i in range(num_vectors):
    x_i = list_of_vectors[i]  # the list of vectors/tensors
    W = self.W.eval().transpose()
    W1 = self.W1.eval()
    b = self.b.eval()
    b1 = self.b1.eval()
    activated = self.kernel_activation(W1.dot(x_i) + b1)
    score = W.dot(activated) + b
What looks kind of promising but is poorly documented is K.gather(). Maybe someone could explain how it works, or has a better idea of how to deal with my problem.
I also tried tf.unstack() to get a list, but this doesn't work because the dimensions of my input tensor are unknown, except for dim_vector.
I'm working with Keras on the TensorFlow backend.
Thanks in advance
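To pin down the shapes involved in the computation described above, here is a plain-Python sketch with no Keras calls. It assumes tanh as the kernel activation, W1 as a list of rows, W as a single vector and b as a scalar; the name score_vectors is made up for illustration. The point is that the same W1, b1, W and b are applied to every vector independently, so the result has shape (batch_size, num_vectors). In a backend, this usually means the vectors do not need to be pulled out into a Python list; flattening the first two axes (for example with K.reshape(inputs, (-1, dim_vector)), given here as an assumption rather than tested code) and applying one matrix product would give the same numbers.

```python
import math


def score_vectors(batch, W1, b1, W, b):
    """Score every vector in a nested list shaped (batch_size, num_vectors, dim_vector).

    W1: list of rows, shape (hidden, dim_vector); b1: list of length hidden;
    W: list of length hidden; b: scalar. tanh stands in for the kernel activation.
    """
    scores = []
    for sequence in batch:                      # loop over batch_size
        row = []
        for x in sequence:                      # loop over num_vectors
            hidden = [
                math.tanh(sum(w * xi for w, xi in zip(w_row, x)) + bi)
                for w_row, bi in zip(W1, b1)
            ]
            row.append(sum(w * h for w, h in zip(W, hidden)) + b)
        scores.append(row)
    return scores


# Illustrative shapes: batch_size=1, num_vectors=2, dim_vector=2, hidden=1.
example = score_vectors([[[1.0, 2.0], [0.0, 0.0]]], [[1.0, 0.0]], [0.0], [2.0], 0.5)
print(example)
```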

How to weight observations in mxnet?

I am new to neural networks and the mxnet package in R. I want to do a logistic regression on my predictors since my observations are probabilities varying between 0 and 1. I'd like to weight my observations by a vector obsWeights I have, but I'm not sure where to implement the weights. There seems to be a weight= option in mx.symbol.FullyConnected but if I try weight=obsWeights I get the following error message
Error in mx.varg.symbol.FullyConnected(list(...)) :
Cannot find argument 'weight', Possible Arguments:
num_hidden : int, required
Number of hidden nodes of the output.
no_bias : boolean, optional, default=False
Whether to disable bias parameter.
How should I proceed to weight my observations? Here is my code at the moment.
# Prepare data
train.mm = model.matrix(obs ~ . , data = train_data)
train_label = train_data$obs
# Normalize
train.mm = apply(train.mm, 2, function(x) (x-min(x))/(max(x)-min(x)))
# Create MXDataIter compatible iterator
batch_size = 128
train.iter = mx.io.arrayiter(data=t(train.mm), label=train_label,
                             batch.size=batch_size, shuffle=T)
# Symbolic model definition
data = mx.symbol.Variable('data')
fc1 = mx.symbol.FullyConnected(data=data, num.hidden=128, name='fc1')
act1 = mx.symbol.Activation(data=fc1, act.type='relu', name='act1')
final = mx.symbol.FullyConnected(data=act1, num.hidden=1, name='final')
logistic = mx.symbol.LogisticRegressionOutput(data=final, name='logistic')
# Run model
mxnet_train = mx.model.FeedForward.create(
  symbol = logistic,
  X = train.iter,
  initializer = mx.init.Xavier(rnd_type = 'gaussian', factor_type = 'avg', magnitude = 2),
  num.round = 25)
Assigning the fully connected weight argument is not what you want to do in any case. That weight is a reference to the parameters of the layer, i.e., what you multiply the inputs by to get the output values. These are the parameter values you're trying to learn.
If you want to make some samples matter more than others, then you'll need to adjust the loss function. For example, multiply the usual per-sample loss by your weights, so that samples with small weights contribute less to the overall average loss and samples with large weights contribute more.
I do not believe the standard MXNet loss functions have a spot for assigning weights (that is, LogisticRegressionOutput won't cover this). However, you can make your own cost function that does. This would involve passing your final layer through a sigmoid activation function to first generate the usual logistic regression output value, then passing that into the loss function you define. You could use squared error, but for logistic regression you'll probably want to use the cross entropy function:
-(l * log(y) + (1 - l) * log(1 - y)),
where l is the label and y is the predicted value.
Ideally, you'd write a symbol with an efficient definition of the gradient, but the easiest path would be to let MXNet do its autodiff on it. (MXNet has a cross entropy function, but it's for softmax input, not a binary output. You could translate your output to two outputs with softmax as an alternative, but that seems less easy to work with in this case.) Then you multiply that cross entropy loss by the weights.
I haven't tested this code, but you'd ultimately have something like this (this is what you'd do in python, should be similar in R):
label = mx.sym.Variable('label')
out = mx.sym.Activation(data=final, act_type='sigmoid')
ce = -(label * mx.sym.log(out) + (1 - label) * mx.sym.log(1 - out))  # negated so that minimizing the loss fits the labels
weights = mx.sym.Variable('weights')
loss = mx.sym.MakeLoss(weights * ce, normalization='batch')
Then you want to input your weight vector into the weights Variable along with your normal input data and labels.
As an added tip, an MXNet network with a custom loss via MakeLoss outputs the loss, not the prediction. You'll probably want both in practice, in which case it's useful to group the loss with a gradient-blocked version of the prediction so that you can get both. You'd do that like this:
pred_loss = mx.sym.Group([mx.sym.BlockGrad(out), loss])
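To make the effect of the observation weights concrete, here is a small plain-Python sketch of the weighted loss, independent of MXNet. The function name weighted_bce and the eps clamp are assumptions added for illustration. The cross entropy is written with a leading minus sign so that a smaller value means a better fit, and each sample's term is simply scaled by its weight before averaging.

```python
import math


def weighted_bce(labels, preds, weights, eps=1e-12):
    """Average of per-sample binary cross entropy, each term scaled by its weight."""
    total = 0.0
    for l, y, w in zip(labels, preds, weights):
        y = min(max(y, eps), 1.0 - eps)          # keep log() away from 0
        ce = -(l * math.log(y) + (1.0 - l) * math.log(1.0 - y))
        total += w * ce
    return total / len(labels)


# Illustrative values only: the second sample counts three times as much.
labels = [0.9, 0.2]
preds = [0.8, 0.4]
print(weighted_bce(labels, preds, [1.0, 1.0]))
print(weighted_bce(labels, preds, [1.0, 3.0]))
```

A weight of 0 removes a sample from the loss entirely, and doubling all weights doubles the loss without moving its minimum.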

Estimate parameters of Frechet distribution using mmedist or fitdist(with mme) error

I'm relatively new to R and I would appreciate it if you could take a look at the following code. I'm trying to estimate the shape parameter of the Frechet distribution (or inverse Weibull) using mmedist (I also tried fitdist, which calls mmedist), but I get the following error:
Error in mmedist(data, distname, start = start, fix.arg = fix.arg, ...) :
the empirical moment function must be defined.
The code that I use is the below:
scale = 1
# simulate a sample
data_fre = rinvweibull(n, shape, scale)
memp=minvweibull(c(1,2), shape=3, rate=1, scale=1)
# estimating the parameters
para_lm = mmedist(data_fre,"invweibull",start=c(shape=3,scale=1),order=c(1,2),memp = "memp")
Please note that I tried changing the code many times to see whether my mistake was in the syntax, but I always get the same error.
I'm aware of the example in the documentation. I've tried that as well, but with no luck. Please note that for the method to work, the order of the moment must be smaller than the shape parameter (i.e. shape).
The example is the following:
#simulate a sample
x4 <- rpareto(1000, 6, 2)
#empirical raw moment
memp <- function(x, order)
  ifelse(order == 1, mean(x), sum(x^order)/length(x))
mmedist(x4, "pareto", order=c(1, 2), memp="memp",
        start=c(shape=10, scale=10), lower=1, upper=Inf)
Thank you in advance for any help.
You will need to make non-trivial changes to the source of mmedist -- I recommend that you copy out the code, and make your own function foo_mmedist.
The first change you need to make is on line 94 of mmedist:
if (!exists("memp", mode = "function"))
That line checks whether a function literally named "memp" exists, as opposed to whether the argument that you have actually passed exists as a function. Change it to:
if (!exists(as.character(expression(memp)), mode = "function"))
The second issue, as I have already noted, is that the optim routine calls funobj, which calls DIFF2, which in turn calls (see line 112) the user-supplied memp function (minvweibull in your case) with two arguments: obs, which resolves to the data, and order. Since minvweibull does not take the data as its first argument, this fails.
This is expected, as the help page tells you:
memp A function implementing empirical moments, raw or centered but
has to be consistent with distr argument. This function must have
two arguments : as a first one the numeric vector of the data and as a
second the order of the moment returned by the function.
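As an illustration of that two-argument contract (data first, order second), here is a minimal Python sketch of an empirical raw moment. It only mirrors the shape of the function that mmedist expects and is not part of fitdistrplus; the name memp is reused purely for readability.

```python
def memp(x, order):
    """Empirical raw moment: the mean of x**order over the sample x."""
    return sum(v ** order for v in x) / len(x)


sample = [1.0, 2.0, 3.0]
print(memp(sample, 1))  # first raw moment (the sample mean)
print(memp(sample, 2))  # second raw moment
```

A theoretical moment function such as minvweibull takes the order and the distribution parameters instead of the data, which is why it cannot be passed in this slot.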
How can you fix this? Pass the function moment from the moments package. Here is complete code (assuming that you have made the change above, and created a new function called foo_mmedist):
# values
n = 100
scale = 1
shape = 3
# simulate a sample
data_fre = rinvweibull(n, shape, scale)
# estimating the parameters
para_lm = foo_mmedist(data_fre, "invweibull",
start= c(shape=5,scale=2), order=c(1, 2), memp = moment)
You can check that optimization has occurred as expected:
> para_lm$estimate
shape scale
2.490816 1.004128
Note, however, that this actually reduces to a crude way of doing an overdetermined method of moments, and I am not sure that this is theoretically appropriate.

What are x1_step1_xoffset, x1_step1_gain and x1_step1_ymin in a neural network generated by genFunction in Matlab?

I'm working with Matlab's Neural Network toolbox and I have generated a neural network function with genFunction.
I would like to know what the mapminmax_apply function does, what these variables are used for, and what they mean in the neural network:
% Input 1
x1_step1_xoffset = [0.151979470539401;-89.4008362047824;0.387909026651698;0.201508462422352];
x1_step1_gain = [2.67439342164766;0.0112020512930696;3.56055585104964;4.09080417195814];
x1_step1_ymin = -1;
Here is the mapminmax_apply function:
% Map Minimum and Maximum Input Processing Function
function y = mapminmax_apply(x,settings_gain,settings_xoffset,settings_ymin)
  y = bsxfun(@minus,x,settings_xoffset);
  y = bsxfun(@times,y,settings_gain);
  y = bsxfun(@plus,y,settings_ymin);
And here is the call to the function with the above variables:
% Input 1
Xp1 = mapminmax_apply(X{1,ts},x1_step1_gain,x1_step1_xoffset,x1_step1_ymin);
I think:
The mapminmax function can also return the settings it uses (among others, the offset, gain and ymin). For some reason, in the code generated by the NN function, these settings are given at the beginning of the file, under Input 1, in the form of x1_step1_xoffset, etc.
mapminmax('apply',X,PS) applies the settings in PS to X.
So I think the generated code has more steps than you necessarily need. You could get rid of the Input 1 steps and just use a simple xp1 = mapminmax(x1') instead of mapminmax_apply.
The Matlab NN toolbox automatically normalizes the features of the dataset.
The functions mapminmax_apply and mapminmax_reverse are related to normalizing the features.
The function mapminmax_apply normalizes each input from its original range to the range -1 to 1.
Since the output also comes out as a normalized vector/value (between -1 and 1), it needs to be denormalized using the function mapminmax_reverse.
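The arithmetic in mapminmax_apply can be sketched in plain Python using the values from the question. Reading the three statements of the generated function, each feature is shifted by xoffset, scaled by gain and then shifted by ymin. Under the assumption that the target range is the default [-1, 1], xoffset is the per-feature minimum seen in training and gain is 2 / (max - min), so the training maximum can be recovered as xoffset + 2 / gain. The reverse function below is an assumed inverse of the same mapping, written for illustration rather than copied from generated code.

```python
# Settings copied from the question (Input 1).
X_OFFSET = [0.151979470539401, -89.4008362047824, 0.387909026651698, 0.201508462422352]
GAIN = [2.67439342164766, 0.0112020512930696, 3.56055585104964, 4.09080417195814]
Y_MIN = -1.0


def mapminmax_apply(x, gain, xoffset, ymin):
    """y = (x - xoffset) * gain + ymin, element by element."""
    return [(xi - oi) * gi + ymin for xi, gi, oi in zip(x, gain, xoffset)]


def mapminmax_reverse(y, gain, xoffset, ymin):
    """Inverse mapping: x = (y - ymin) / gain + xoffset."""
    return [(yi - ymin) / gi + oi for yi, gi, oi in zip(y, gain, xoffset)]


# Assumed training maxima, valid only if the target range was [-1, 1].
x_max = [o + 2.0 / g for o, g in zip(X_OFFSET, GAIN)]

print(mapminmax_apply(X_OFFSET, GAIN, X_OFFSET, Y_MIN))  # each feature at its minimum
print(mapminmax_apply(x_max, GAIN, X_OFFSET, Y_MIN))     # each feature at its assumed maximum
```

Inputs outside the training range map outside [-1, 1], since the settings are fixed and are not recomputed from new data.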

"Vectorizing" this for-loop in R? (suppressing interaction main effects in lm)

When interactions are specified in lm, R includes main effects by default, with no option to suppress them. This is usually appropriate and convenient, but there are certain instances (within estimators, ratio LHS variables, among others) where this isn't appropriate.
I've got this code that fits a log-transformed variable to a response variable, independently within subsets of the data.
Here is a silly yet reproducible example:
id = as.factor(c(1,2,2,3,3,3,4,4,4,4,5,5,5,5,6,7,7,8,8,8,9,9,9,9,10))
x = rexp(length(id))
y = rnorm(length(id))
logx = log(x)
data = data.frame(id,y,logx)
for (i in data$id){
  sub = subset(data, id==i) #This splits the data by id
  m = lm(y~logx-1,data=sub) #This gives me the linear (log) fit for one of my id's
  sub$x.tilde = log(1+3)*m$coef #This linearizes it and gives me the expected value for x=3
  data$x.tilde[data$id==i] = sub$x.tilde #This puts it back into the main dataset
  data$tildecoeff[data$id==i] = m$coef #This saves the coefficient (I use it elsewhere for plotting)
}
I want to fit a model like the following:
Y = B(X*id) +e
with no intercept and no main effect of id. As you can see from the loop, I'm interested in the expectation of Y when X=3, with the fit constrained through the origin (because Y is a (logged) ratio of Y[X=something]/Y[X=0]).
But if I specify
m = lm(Y~X*as.factor(id)-1)
there is no means of suppressing the main effects of id. I need to run this loop several hundred times in an iterative algorithm, and as a loop it is far too slow.
The other upside of de-looping this code is that it'll be much more convenient to get prediction intervals.
(Please, I don't need pious comments about how leaving out main effects and intercepts is improper -- it usually is, but I can promise that it isn't in this instance).
Thanks in advance for any ideas!
I think you want
m <- lm(y ~ 0 + logx : as.factor(id))
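See R-intro '11.1 Defining statistical models; formulae'. In a formula, a:b denotes only the interaction, whereas a*b expands to a + b + a:b, which is where the main effects come from.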
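As a cross-check of what the interaction-only model estimates: with no intercept and no main effects, the design matrix is block-diagonal in id, so each id gets its own slope through the origin, sum(x*y) / sum(x^2) within that id. This is the same number the original loop computes one subset at a time. A plain-Python sketch follows; the function name slopes_through_origin and the data are made up for illustration.

```python
from collections import defaultdict


def slopes_through_origin(ids, x, y):
    """Per-group least-squares slope of y on x with no intercept.

    Returns {group: sum(x*y) / sum(x*x)}; a group whose x values are all
    zero would raise ZeroDivisionError.
    """
    sxy = defaultdict(float)
    sxx = defaultdict(float)
    for g, xi, yi in zip(ids, x, y):
        sxy[g] += xi * yi
        sxx[g] += xi * xi
    return {g: sxy[g] / sxx[g] for g in sxx}


# Illustrative data: group 1 follows y = 2x, group 2 follows y = 3x.
ids = [1, 1, 2, 2]
x = [1.0, 2.0, 1.0, 2.0]
y = [2.0, 4.0, 3.0, 6.0]
print(slopes_through_origin(ids, x, y))
```

The point estimates agree with the per-subset fits, but the single lm call pools the residual variance across ids, so standard errors and prediction intervals will generally differ from those of separate fits.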
