Closed. This question is off-topic. It is not currently accepting answers.
Closed 9 years ago.
The help system says that the Variance[] function is equivalent to:
Total[(list-Mean[list])^2]/(Length[list]-1)
But I think the right definition should be:
Total[(list-Mean[list])^2]/(Length[list])
I can't figure this out.
Both definitions are correct:
The first formula gives an unbiased estimator of the population variance when the population mean is unknown.
The second formula, dividing by Length[list], gives an unbiased estimator of the population variance only when the population mean is known and the deviations are taken from that known mean rather than from Mean[list].
When the true mean is unknown and has to be estimated from the data, the second formula systematically underestimates the variance. The intuition is that a given sample tends to have lower dispersion around its own estimated mean than around the true mean. The -1 in the denominator corrects for that.
See Point estimation of the variance.
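The bias can be checked exactly on a toy case. The sketch below is in Python rather than Mathematica, and the four-value population and the sample size of 2 are assumptions chosen only so that every possible sample can be enumerated. It averages each estimator over all samples drawn with replacement and compares the result with the true population variance.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy population and sample size, small enough to enumerate.
population = [1, 2, 3, 4]
n = 2

# True population mean and variance, kept as exact fractions.
mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)


def sum_sq(sample, center):
    """Sum of squared deviations of the sample from a given center."""
    return sum((x - center) ** 2 for x in sample)


def sample_mean(sample):
    return Fraction(sum(sample), len(sample))


def average(values):
    values = list(values)
    return sum(values) / len(values)


# Every ordered sample of size n drawn with replacement (equally likely).
samples = list(product(population, repeat=n))

# Deviations from the sample mean, divided by n - 1 (first formula).
avg_n_minus_1 = average(sum_sq(s, sample_mean(s)) / (n - 1) for s in samples)
# Deviations from the sample mean, divided by n (second formula).
avg_n = average(sum_sq(s, sample_mean(s)) / n for s in samples)
# Deviations from the known population mean, divided by n.
avg_known_mean = average(sum_sq(s, mu) / n for s in samples)

print("population variance:        ", sigma2)
print("mean of n-1 estimator:      ", avg_n_minus_1)
print("mean of n estimator:        ", avg_n)
print("mean of known-mean estimator:", avg_known_mean)
```

Under these assumptions the n-1 estimator and the known-mean estimator both average out to the population variance, while dividing by n around the sample mean averages to (n-1)/n times it.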
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 7 months ago.
I am trying to follow chapter 2 on SDT in
https://link.springer.com/chapter/10.1007/978-3-030-03499-3_2
It basically says
d'emp = z(HIT) - z(FA)
and that if you don't know z(), you should let your computer compute it.
But how? Is there a function for this in R? It cannot be scale(), because Hit and FA are single values.
In this book, the z-transformation z() is defined as "the inverse cumulative Gaussian function". I think the sentence "If you are not familiar with the z-transformation just treat it as a function you can find on your computer" is telling readers not to spend too much time on what the z-transformation means, and to focus instead on how d_emp and b_emp are calculated as the difference and the average.
However, if you want to compute the inverse cumulative Gaussian (normal) function, you can use qnorm() from the stats package. It takes mean and sd arguments, which default to mean = 0 and sd = 1 (the standard normal), so you only need to specify them for a different distribution.
To know more:
Inverse of the cumulative gaussian distribution in R
https://www.statology.org/dnorm-pnorm-rnorm-qnorm-in-r/
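For readers who want to see the calculation spelled out, here is a minimal sketch of d'emp = z(HIT) - z(FA). It is written in Python rather than R, using `statistics.NormalDist().inv_cdf` as the counterpart of qnorm() with its default mean and sd. The hit and false-alarm rates are made-up values, and `z` and `d_prime` are hypothetical helper names.

```python
from statistics import NormalDist


def z(p):
    """Inverse cumulative standard normal (mean 0, sd 1), like qnorm(p) in R."""
    return NormalDist(mu=0.0, sigma=1.0).inv_cdf(p)


def d_prime(hit_rate, fa_rate):
    """Empirical sensitivity: z(HIT) - z(FA)."""
    return z(hit_rate) - z(fa_rate)


# Hypothetical rates, for illustration only.
hit_rate = 0.8
fa_rate = 0.2

d_emp = d_prime(hit_rate, fa_rate)
print(round(d_emp, 4))
```

Both rates must lie strictly between 0 and 1; `inv_cdf` raises an error at exactly 0 or 1, where the z-transformation is infinite.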
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
How can I constrain my regression coefficient (only the slope, not the intercept) to be positive? It's a general statistical question, but specifically I would like an R solution, and even more specifically one for model II regression (major axis regression).
You could do linear regression with nls and limit the parameter range there.
Example: using the nl2sol algorithm from the Port library, we fit y = a + b*x with an unbounded intercept a and a slope b constrained to lie between 1.4 and 1.6. For a slope that only has to be positive, set the lower bound of b to 0 and the upper bound to Inf.
nls(y ~ a + b * x, algorithm = "port", start = c(a = 0, b = 1.5), lower = c(a = -Inf, b = 1.4), upper = c(a = Inf, b = 1.6))
This solution and others are explained in the more general question at https://stats.stackexchange.com/questions/61733/linear-regression-with-slope-constraint
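For ordinary least squares with a single predictor, a bound on the slope does not need an iterative solver. The sum of squares is a convex quadratic in the slope once the intercept is set to its optimum, so the constrained slope is the unconstrained one clipped to the allowed interval. The sketch below shows that shortcut in Python. It is not the nls call above and not major axis regression, and the function name and data are hypothetical.

```python
from statistics import mean


def fit_line_with_slope_bounds(x, y, lower=0.0, upper=float("inf")):
    """Least-squares line y = a + b*x with the slope b restricted to [lower, upper].

    Assumes ordinary least squares with one predictor, where clipping the
    unconstrained slope gives the constrained optimum.
    """
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope_ols = sxy / sxx
    # Clip the unconstrained slope to the allowed interval.
    slope = min(max(slope_ols, lower), upper)
    # For any fixed slope, the best intercept passes through the means.
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data: one increasing and one decreasing trend.
x = [1.0, 2.0, 3.0, 4.0]
y_up = [2.1, 3.9, 6.2, 7.8]
y_down = [4.0, 3.0, 2.5, 1.0]

a_up, b_up = fit_line_with_slope_bounds(x, y_up)
a_down, b_down = fit_line_with_slope_bounds(x, y_down)
print(a_up, b_up)
print(a_down, b_down)
```

When the data already trend upward the constraint is inactive and the ordinary fit is returned. When they trend downward the slope is pinned at 0 and the intercept becomes the mean of y.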
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 5 years ago.
Suppose the lifetime of a bulb (in hours) can be modeled with an exponential distribution with parameter 1.
What is the expected value of a bulb’s remaining life if it has already survived 2 hours?
The exponential distribution is memoryless. Therefore, the time that has passed so far is irrelevant, and the expected value of the bulb's remaining life is 1 hour (the expected value of an exponential distribution with rate parameter c is 1/c).
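The memoryless property can be checked numerically. The sketch below, in Python, uses the identity E[X - s | X > s] = integral over t >= 0 of P(X > s + t | X > s) and approximates the integral with a midpoint rule. The step count and the integration horizon are arbitrary assumptions, and the function names are hypothetical.

```python
import math


def survival(t, rate=1.0):
    """P(X > t) for an exponential distribution with the given rate."""
    return math.exp(-rate * t)


def expected_remaining_life(survived, rate=1.0, steps=50_000, horizon=50.0):
    """Approximate E[X - s | X > s] by integrating the conditional survival
    function P(X > s + t | X > s) over t in [0, horizon] (midpoint rule)."""
    step = horizon / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * step
        total += survival(survived + t, rate) / survival(survived, rate) * step
    return total


remaining_new = expected_remaining_life(0.0)       # brand-new bulb
remaining_after_2h = expected_remaining_life(2.0)  # bulb that survived 2 hours
print(round(remaining_new, 4), round(remaining_after_2h, 4))
```

Both values come out at about 1, because the ratio P(X > s + t) / P(X > s) reduces to exp(-t) regardless of s.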
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
Summing up the posterior probabilities of a discrete distribution gives a value of more than one. Where am I going wrong?
This is the posterior generated by JAGS.
My guess is that the histogram is supposed to be interpreted as a density function, and the probability mass of each bar is therefore the width of the bar times the height of the bar.
Given that interpretation, it looks like the masses sum to approximately 1. The width of each bar appears to be 1/2 and the sum of the heights is about 2 (to judge by eyeball).
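As a small illustration of that reading, the Python sketch below uses made-up bar heights that sum to 2 with a bar width of 1/2. The actual heights in the plot are not known, so these numbers are assumptions.

```python
# Hypothetical density histogram: equal-width bars with heights read off a plot.
bar_width = 0.5
bar_heights = [0.2, 0.6, 0.8, 0.4]

# Probability mass of each bar is width times height.
masses = [height * bar_width for height in bar_heights]

total_height = sum(bar_heights)  # what you get by adding up the bar heights
total_mass = sum(masses)         # what should be compared with 1

print(total_height, total_mass)
```

Adding up the heights alone gives a number larger than 1, while the masses sum to 1.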
If that's not it, you'll have to give more information, e.g. show your R script and any data.
Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
I’ve got the following variables:
Response: number of quota units leased (in and out) by fishers.
Explanatory: number of quota units owned by fishers.
I fitted a Poisson GLM, but I'm not sure it's right, given that the explanatory variable is a count as well. I've found examples of Poisson regression only with categorical and continuous explanatory variables, not with count variables.
So:
1. Am I right to use Poisson regression with my data? If not, what alternatives do I have?
2. The residual variances of my model are not homogeneous. I understand that Poisson regression can cope with this, or should I pay attention to the issue and address it (using weights, for example)?
Any help would be much appreciated.
The problem seems like it could be well modeled with Poisson regression. The residual variance should NOT be "homogeneous": the Poisson model assumes that the variance is equal to the mean, so it grows with the fitted values. You have options if that assumption is violated. The quasi-Poisson and negative binomial models can also be used; they relax it by estimating a dispersion parameter.
If the number of quota units owned by a fisher sets an upper bound on the number leased, then I would not use it as an ordinary explanatory variable; it might better be entered as offset=log(quota_units). That changes the interpretation of the estimates: they become estimates of log(usage_rate).
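To make the offset interpretation concrete, here is a minimal Python sketch of the simplest case, an intercept-only Poisson model with offset log(quota_units). In that special case the maximum likelihood estimate has a closed form: exp(intercept) is total units leased divided by total units owned. The data are made up, and the Pearson dispersion statistic at the end is one common way to check the variance assumption, shown here as an assumption about how you might diagnose it rather than as part of the original advice.

```python
import math

# Hypothetical data: units leased and units owned per fisher.
leased = [2, 5, 9, 4]
owned = [10, 20, 40, 15]

# Intercept-only Poisson model: log(mu_i) = intercept + log(owned_i).
# Its maximum likelihood solution is total leased / total owned.
usage_rate = sum(leased) / sum(owned)
intercept = math.log(usage_rate)  # this is log(usage_rate)

# Fitted means under the offset model.
fitted = [math.exp(intercept + math.log(q)) for q in owned]

# Pearson dispersion: values near 1 are consistent with Poisson variance,
# values well above 1 suggest overdispersion.
pearson = sum((y - mu) ** 2 / mu for y, mu in zip(leased, fitted))
dispersion = pearson / (len(leased) - 1)

print(round(usage_rate, 4), round(intercept, 4), round(dispersion, 4))
```

With covariates added, the coefficients describe how the log usage rate changes, and the fit needs an iterative GLM routine instead of this closed form.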