Error estimation of maximum likelihood fit in zfit with weight - zfit

I am looking at the zfit tutorial and saw the comment about the likelihood with weights (see attached image).
Does anyone know how the uncertainty is estimated in this case? How would the weights contribute here? Thanks in advance!

A weighted likelihood is indeed not a "real likelihood" anymore, however, the point estimation at the maximum is still valid.
The method implemented in zfit is the "asymptotically correct approach" as described in Parameter uncertainties in weighted unbinned maximum likelihood fits.
It will be taken into account automatically if your dataset is weighted.
(since this info was missing in the hesse docs, it just got updated)
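As a rough illustration of what the asymptotically correct approach does (this is a numpy sketch of the sandwich formula, not zfit's actual code), take the simplest case of a Gaussian mean with known width: the naive error uses only the Hessian of the weighted NLL, while the corrected error sandwiches a squared-weight score term between two inverse Hessians.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 1.0                       # known width of the Gaussian model
x = rng.normal(0.0, sigma, 5000)  # simulated events
w = rng.uniform(0.5, 1.5, 5000)   # per-event weights

# Weighted ML estimate of the mean (maximum of the weighted log likelihood)
mu_hat = np.sum(w * x) / np.sum(w)

# Naive error: treats the weighted NLL as a true likelihood (Hessian only)
H = np.sum(w) / sigma**2          # second derivative of the weighted NLL
err_naive = 1.0 / np.sqrt(H)

# Asymptotically correct (sandwich) error: V = H^-1 * S * H^-1,
# where S is built from squared weights times the per-event score
score = (x - mu_hat) / sigma**2
S = np.sum(w**2 * score**2)
err_corrected = np.sqrt(S) / H

print(mu_hat, err_naive, err_corrected)
```

With non-uniform weights the corrected error is generally somewhat larger than the naive one; for unit weights the two coincide.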

Related

Coefficient value of covariate in Cox PH model is too big

I am trying to develop a Cox PH model with time-varying covariates in R. I use the coxph function from the survival package. There was no trouble during the estimation process, but the coefficient of one covariate is extremely large, namely 2.5e+32.
I can't work out the reason for this problem or how to tackle it. This variable is nonstationary and the proportional hazards assumption is violated. Could either of these facts cause such a large coefficient?
More information would help in framing your problem.
Anyway, I doubt non-proportionality is to blame. It would imply that you have some outliers heavily biasing your coefficient beyond reasonable expectations. You could give this a quick look by plotting the output of cox.zph.
Another possible explanation is that this rather depends on the unit of measure you used to define your covariate. Can the magnitude of the coefficient be meaningfully interpreted? If so, you could simply re-scale/standardise/log-transform that covariate to obtain a 'more manageable' coefficient (if this is theoretically appropriate).
This could also be due to the so-called 'complete separation', which has been discussed here and here.
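The unit-of-measure point is easy to demonstrate with any regression: dividing a covariate by a constant multiplies its coefficient by that same constant. A small numpy sketch, with ordinary least squares standing in for the Cox model and made-up data:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.normal(0.0, 1e-6, n)             # covariate measured in tiny units
y = 3.0e6 * x + rng.normal(0.0, 0.1, n)  # the per-unit effect is then huge

# OLS slope: beta = cov(x, y) / var(x)
xc, yc = x - x.mean(), y - y.mean()
beta = np.sum(xc * yc) / np.sum(xc**2)

# Standardise the covariate to unit standard deviation
xs = x / x.std()
xsc = xs - xs.mean()
beta_scaled = np.sum(xsc * yc) / np.sum(xsc**2)

print(beta, beta_scaled)  # beta is enormous; beta_scaled is moderate
```

The same mechanics apply to a Cox coefficient, which is why rescaling or standardising the covariate gives a more interpretable number without changing the model's fit.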

How should you use scaled weights with the svydesign() function in the survey package in R?

I am using the survey package in R to analyse the "Understanding Society" social survey. The main user guide for the survey specifies (on page 45) that the weights have been scaled to have a mean of 1. When using the svydesign() function, I am passing the weight variable to the weight argument.
In the survey package documentation, under the surveysummary() function, it states:
Note that the design effect will be incorrect if the weights have been rescaled so that they are not reciprocals of sampling probabilities.
Will I therefore get incorrect estimates and/or standard errors when using functions such as svyglm() etc?
This came to my attention because, when using the psrsq() function to get the Pseudo R-Squared of a model, I received the following warning:
Weights appear to be scaled: rsquared may be wrong
Any help would be greatly appreciated! Thanks!
No, you don't need to worry
The warning is only about design effect estimation, and only about without-replacement design effects (DEFF rather than DEFT). Most people don't need design effects; they just need estimates and standard errors, and those are fine: there is no problem.
If you want to estimate the design effects, R needs to estimate the standard errors (which is fine) and also estimate what the standard errors would be under simple random sampling without replacement, with the same sample size. That second part is the problem: working out the variance under SRSWoR requires knowing the population size. If you have scaled the weights, R can no longer work out the population size.
If you do need design effects (eg, to do a power calculation for another survey), you can still get the DEFT design effects that compare to simple random sampling with replacement. It's only if you want design effects compared to simple random sampling without replacement that you need to worry about the scaling of weights. Very few people are in that situation.
As a final note, surveysummary isn't a function; it's a help page.
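To see why rescaling the weights by a constant leaves the estimates and (with-replacement linearization) standard errors unchanged, here is a small numpy sketch of a weighted mean. The estimator and SE formula are a textbook simplification, not the survey package's internals, and the data are made up:

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.normal(50.0, 10.0, 1000)
w_raw = rng.uniform(100.0, 300.0, 1000)  # pretend 1/sampling-probability weights
w_scaled = w_raw / w_raw.mean()          # rescaled to mean 1, as in Understanding Society

def weighted_mean_and_se(y, w):
    """Hajek-style weighted mean with a with-replacement linearization SE."""
    mu = np.sum(w * y) / np.sum(w)
    u = w * (y - mu) / np.sum(w)         # linearized contributions (sum to zero)
    n = len(y)
    se = np.sqrt(n / (n - 1) * np.sum(u**2))
    return mu, se

mu1, se1 = weighted_mean_and_se(y, w_raw)
mu2, se2 = weighted_mean_and_se(y, w_scaled)
print(mu1, se1, mu2, se2)
```

Both calls return identical results because the constant cancels between numerator and denominator; what the scaling destroys is only the implied population size, sum(w), which the SRSWoR design effect needs.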

Is there an R-package to calculate pseudo R-squared measures for conditional (fixed effects) logistic models using clogit or bife?

Is there an R package to calculate pseudo R-squared measures for my model? rcompanion supports neither clogit nor bife (due to the missing intercept?).
Originally that was one question out of a larger context, which I edited to make it more readable.
Thanks in advance for your help!
Relative to question 5: Definitions for pseudo r-squared values based (mostly) on log likelihood values are given by UCLA IDRE. If you are able to extract the log likelihood from the fitted model and the null model, calculating these measures is fairly straightforward. I don't know which, if any, of these might make sense for this kind of model. If log likelihood values are not available for this kind of model, Efron's pseudo r-squared, which is based on the predicted and actual y values, might be serviceable in this case; I don't know.
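The arithmetic for the measures mentioned above is short; here is a hedged Python sketch (the formulas follow the UCLA IDRE definitions, but the function names and toy numbers are mine):

```python
import numpy as np

def mcfadden_r2(ll_model, ll_null):
    """McFadden's pseudo R-squared from model and null log likelihoods."""
    return 1.0 - ll_model / ll_null

def efron_r2(y, p_hat):
    """Efron's pseudo R-squared from observed 0/1 outcomes and predictions."""
    y = np.asarray(y, dtype=float)
    p_hat = np.asarray(p_hat, dtype=float)
    return 1.0 - np.sum((y - p_hat)**2) / np.sum((y - y.mean())**2)

# Toy numbers, purely illustrative
print(mcfadden_r2(-80.0, -100.0))
print(efron_r2([0, 0, 1, 1], [0.1, 0.3, 0.7, 0.9]))
```

If logLik() works on the fitted clogit or bife object and you can fit the corresponding null model, the first function is all you need; Efron's version only needs predictions, which is why it may remain usable when log likelihoods are not exposed.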

Nonlinear regression / Curve fitting with L-infinity norm

I am looking into time series data compression at the moment.
The idea is to fit a curve on a time series of n points so that the maximum deviation on any of the points is not greater than a given threshold. In other words, none of the values that the curve takes at the points where the time series is defined, should be "further away" than a certain threshold from the actual values.
So far I have found out how to do nonlinear regression using the least squares estimation method in R (the nls function) and other languages, but I haven't found any packages that implement nonlinear regression with the L-infinity norm.
I have found literature on the subject:
http://www.jstor.org/discover/10.2307/2006101?uid=3737864&uid=2&uid=4&sid=21100693651721
or
http://www.dtic.mil/dtic/tr/fulltext/u2/a080454.pdf
I could try to implement this in R, for instance, but first I am looking to see whether this has already been done so that I could reuse it.
I have found a solution that I don't believe to be "very scientific": I use nonlinear least squares regression to find the starting values of the parameters which I subsequently use as starting points in the R "optim" function that minimizes the maximum deviation of the curve from the actual points.
Any help would be appreciated. The idea is to be able to find out if this type of curve-fitting is possible on a given time series sequence and to determine the parameters that allow it.
I hope others out there have already encountered this problem and can help me ;)
Thank you.
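The two-stage workaround described above can be sketched in Python, with scipy standing in for R's nls and optim; the exponential model and the data here are made up for illustration:

```python
import numpy as np
from scipy.optimize import minimize

# Toy time series: exponential decay plus a deterministic ripple
t = np.linspace(0.0, 1.0, 50)
y = 2.0 * np.exp(-1.5 * t) + 0.05 * np.sin(17.0 * t)

def residuals(p):
    a, b = p
    return a * np.exp(b * t) - y

# Stage 1: least-squares fit (stands in for R's nls) gives starting values
p_lsq = minimize(lambda p: np.sum(residuals(p)**2),
                 x0=[1.0, -1.0], method="Nelder-Mead").x

# Stage 2: minimize the maximum absolute deviation (L-infinity norm),
# as with R's optim, started from the least-squares solution
p_minimax = minimize(lambda p: np.max(np.abs(residuals(p))),
                     x0=p_lsq, method="Nelder-Mead").x

max_dev_lsq = np.max(np.abs(residuals(p_lsq)))
max_dev_inf = np.max(np.abs(residuals(p_minimax)))
print(max_dev_lsq, max_dev_inf)
```

Comparing max_dev_inf against the compression threshold then answers whether this curve family can represent the series within tolerance. Note the L-infinity objective is non-smooth, which is why a derivative-free method like Nelder-Mead is used here; for serious use, the Chebyshev-approximation literature linked above offers more robust algorithms.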

Error estimation with matlab

I have a data set which I need to fit to two quadratic equations:
f1(x) = a*x + b*x^2
f2(x) = b*x^2
Is there a way to estimate the error where I take into account both the standard error in the measurement and the error in the curve fitting?
I guess you mean that "error due to measurement" is the distribution of the measured values around the "true" predicted values by some physical law, and "error in curve fitting" is caused by fitting the data to a model that does not fully capture the physical law.
There is no way to know which kind of error you are seeing unless you already know the physical law. For example:
Suppose you have a perfect amplifier whose transfer function is Vo = Vi^2. You input a range of voltages Vi and measure the output Vo for each.
If you fit a quadratic to the data, you know that any error is caused by measurement.
If you fit a line to the data, your error is caused by both measurement and your choice of curve fitting. But you'd have to know that the behavior is actually quadratic in order to measure the error source. And you'd do it by... fitting a quadratic.
In the real world, nothing ever behaves perfectly, so you're always stuck with your best approximation to the physical reality.
If you have errors in your measurements as well as in your response variable, you might try fitting your models using Orthogonal Regression. There's a demo illustrating exactly this process that ships as part of MATLAB's Statistics Toolbox.
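For the curve-fitting side of the error, the standard least-squares recipe turns the residual scatter into parameter standard errors via cov = s^2 (X^T X)^-1. Here is a numpy sketch (Python rather than MATLAB, with made-up data) for the model f1(x) = a*x + b*x^2, which is linear in its parameters:

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(0.1, 2.0, 40)
y = 1.2 * x + 0.8 * x**2 + rng.normal(0.0, 0.05, x.size)  # noisy measurements

# Design matrix for f1(x) = a*x + b*x^2 (no intercept term)
X = np.column_stack([x, x**2])

# Least-squares parameter estimates
coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
a_hat, b_hat = coef

# Residual variance -> parameter covariance matrix s^2 (X^T X)^-1
resid = y - X @ coef
s2 = np.sum(resid**2) / (len(y) - X.shape[1])
cov = s2 * np.linalg.inv(X.T @ X)
a_err, b_err = np.sqrt(np.diag(cov))

print(a_hat, a_err)
print(b_hat, b_err)
```

Note these standard errors lump measurement noise and model misfit together through the residuals, which is exactly the point of the answer above: without knowing the true physical law you cannot separate the two contributions.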