Estimate the variance of an estimator - r

Need some help with a homework question.
I suck at math, and our prof doesn't show us how to break the equations down into R code or discuss R at all, but she expects us to use it. Can someone show me how to work through this problem in R? A classmate and I have been working on homework all day; we're exhausted and I have a migraine setting in. Any help is much appreciated!
A population consists of N=10 primary units, each of which consists of Mi=6 secondary units. A two-stage sampling design selects 2 primary units by simple random sampling (without replacement) and 3 secondary units from each selected primary unit, also by simple random sampling. The observed values of the variable of interest are 7, 5, 3 from the first primary unit selected and 4, 2, 3 from the second primary unit selected.
The population mean per secondary unit was 4.
Estimate the variance of the estimator above.
I'm not sure where to start.
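For what it's worth, here is a minimal sketch of how the standard unbiased (expansion) estimator for two-stage SRSWOR sampling and its estimated variance could be computed in R, assuming "the estimator above" is that textbook estimator of the population total / mean per secondary unit; all object names are made up:

# A sketch assuming two-stage sampling with SRSWOR at both stages and the
# usual unbiased (expansion) estimator; object names are made up.
N  <- 10                       # primary units in the population
n  <- 2                        # primary units sampled
Mi <- c(6, 6)                  # secondary units in each selected primary unit
mi <- c(3, 3)                  # secondary units sampled from each selected primary unit
y1 <- c(7, 5, 3)               # observations from the first selected primary unit
y2 <- c(4, 2, 3)               # observations from the second selected primary unit

ybar <- c(mean(y1), mean(y2))  # per-primary-unit sample means
s2i  <- c(var(y1), var(y2))    # per-primary-unit sample variances

tau_i   <- Mi * ybar                # estimated totals of the selected primary units
tau_hat <- (N / n) * sum(tau_i)     # estimated population total

# Estimated variance: between-primary-unit term plus within-primary-unit term
su2     <- var(tau_i)
var_tau <- N * (N - n) * su2 / n + (N / n) * sum(Mi * (Mi - mi) * s2i / mi)

# Rescale if the target is the mean per secondary unit
# (10 primary units x 6 secondary units = 60 secondary units in the population)
M_total <- N * 6
mu_hat  <- tau_hat / M_total
var_mu  <- var_tau / M_total^2

c(mu_hat = mu_hat, var_mu = var_mu, tau_hat = tau_hat, var_tau = var_tau)

With these data mu_hat works out to 4, matching the mean quoted in the question, and var_mu (or var_tau, if the estimator targets the total) is the variance estimate being asked for.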

Related

Different results using ANOSIM and SIMPER in R [closed]

I have run ANOSIM and SIMPER to analyse community similarity at two treatments.
When I run ANOSIM the output is:
ANOSIM statistic R: -0.04465
Significance: 0.749
meaning the two groups are similar to each other?
But when I run SIMPER, it says the composition of the two treatments is 62% different?
I'm not sure how to interpret the outputs of the two tests. Why are they saying different things?
I looked up the documentation.
Are they similar to each other? Yes:
The divisor is chosen so that R will be in the interval -1 … +1, value
0 indicating completely random grouping.
Does 62% mean the groups are different from each other? No, greater than 70% is required.
The function displays most important species for each pair of groups. These species contribute at least to 70 % of the differences between groups.
There is also this note:
The results of simper can be very difficult to interpret. The method
very badly confounds the mean between group differences and within
group variation, and seems to single out variable species instead of
distinctive species (Warton et al. 2012). Even if you make groups that
are copies of each other, the method will single out species with high
contribution, but these are not contributions to non-existing
between-group differences but to within-group variation in species
abundance.
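For reference, a minimal sketch of the two vegan calls being discussed; the community matrix and grouping factor here are just the example data shipped with the package, standing in for the poster's data:

library(vegan)

# vegan's example data stand in for the poster's community matrix and
# treatment factor
data(dune)
data(dune.env)
comm  <- dune
treat <- dune.env$Management

# ANOSIM: an R statistic near 0 with a large significance value means the
# grouping is essentially random, i.e. no evidence the treatments differ
ano <- anosim(comm, treat, distance = "bray", permutations = 999)
ano$statistic
ano$signif

# SIMPER: the "overall" value is the average between-group Bray-Curtis
# dissimilarity (the "62%" in the question), not a test that the groups differ
sim <- simper(comm, treat, permutations = 99)
summary(sim)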

How do I analyze movement between points in R? [closed]

So I have a lot of points, kind of like this:
animalid1;A;time
animalid1;B;time
animalid1;C;time
animalid2;A;time
animalid2;B;time
animalid2;A;time
animalid2;B;time
animalid2;C;time
animalid3;A;time
animalid3;B;time
animalid3;C;time
animalid3;B;time
animalid3;A;time
What I want to do is first make R understand that the points A, B, C are connected. Then I want to compare movements from A to C: how long they take, how many steps were used, and so on. So maybe I have a movement sequence like ABC for 20 animals, ABABC for 10 animals, and ABCBA for 5 animals, and I want some sort of statistical test to see whether the total time differs between these groups, and so on.
I bet this has been done before. But my Google skills are not good enough to find it.
Look at the msm package (msm stands for multi-state model). Given observations of states at different times, it will estimate transition probabilities and the average time spent in each state.
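A minimal sketch of that approach, assuming the observations above are in a semicolon-separated file moves.csv with columns animalid, state, and a numeric time (the file name and column names are assumptions):

library(msm)

# Read the semicolon-separated observations; read.csv2 uses ";" as separator
d <- read.csv2("moves.csv", header = FALSE,
               col.names = c("animalid", "state", "time"))

# msm needs numeric states and data ordered by subject and time
d$state <- as.integer(factor(d$state, levels = c("A", "B", "C")))
d <- d[order(d$animalid, d$time), ]

# Allowed instantaneous transitions (here A <-> B <-> C); adjust to your system
Q <- rbind(c(0, 1, 0),
           c(1, 0, 1),
           c(0, 1, 0))

fit <- msm(state ~ time, subject = animalid, data = d, qmatrix = Q)
pmatrix.msm(fit, t = 1)   # transition probabilities over one time unit
sojourn.msm(fit)          # mean time spent in each state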

svm in R, train data set [closed]

This is more of a general question, but since I am using R, I've tagged it accordingly.
My training data set has 15,000 entries, of which around 20 I would like to use as the positive set for building up the SVM. I wanted to use the remaining, resampled data as my negative set, but I was wondering whether it might be better to take the same size (around 20) as the negative set, since otherwise it's highly imbalanced. Is there an easy approach in R to then pool the classifiers (ensemble-based) after 1000 rounds of resampling, perhaps with the e1071 package?
Follow-up question: I would like to calculate a score for each prediction afterwards; is it fine to just take the probabilities times 100?
Thanks!
You can try "class weight" approach in which the smaller class gets more weight, thus taking more cost to mis-classify the positive labelled class.

Covariates in correlation? [closed]

I'm not sure whether this is a question for Stack Overflow or Cross Validated.
I'm looking for a way to include covariate measures when calculating the correlation between two measures. For example, let's say I have 100 samples, for which I have two measurements, x and y. Now let's say I also have a third measure, a covariate (say, age). I want to measure the correlation between x and y, but I also want to ignore any of that correlation that comes from the covariate, age.
If I'm fitting a linear model, I could simply add the term to the model:
lm(y~x+age)
I know you can't calculate correlation with this kind of model in R (using ~).
So I want to know:
Does what I'm asking even make sense to do? I suspect it may not.
If it does, what R packages should I be using?
It sounds like you're asking for a semipartial correlation. You want the correlation between x and y partialling out the correlation between x and z. You need to read about partial and semipartial correlations.
The ppcor package in R will then help you with the calculations.
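A minimal sketch with ppcor, using simulated x, y, and age purely to show the two calls (the data are made up):

library(ppcor)

# Simulated data in which both x and y depend on age
set.seed(1)
age <- rnorm(100, mean = 50, sd = 10)
x   <- 0.5 * age + rnorm(100)
y   <- 0.3 * age + 0.4 * x + rnorm(100)

cor(x, y)             # raw correlation, inflated by the shared dependence on age
pcor.test(x, y, age)  # partial correlation: age removed from both x and y
spcor.test(x, y, age) # semipartial correlation: age removed from one variable only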

Uniform Random Selection with Replacement [closed]

Suppose you have a deck of 100 cards, with the numbers 1-100 on one side. You select a card, note the number, replace the card, shuffle, and repeat.
Question #1: How many cards (on average) must you select to have drawn the same card twice? Why?
Question #2: How many cards (on average) must you select to have drawn all of the cards at least once? Why?
(Thanks! This has to do with random music playlists and adding an option not to repeat songs in a shuffle, as it were.)
Q1 relates to the birthday paradox problem.
As you can see in the collision problem section (in the Wikipedia link above), your question maps onto it exactly.
Cast as a collision problem
The birthday problem can be generalized as follows: given n random integers drawn from a discrete uniform distribution with range [1,d], what is the probability p(n;d) that at least two numbers are the same? (d=365 gives the usual birthday problem.)
You have a range [1, 100] from which you select random cards. The probability of a collision (two selected cards being the same) is given as p(n; d) = ...
Further down, there is a formula for the average/expected number of selections; Q(100) gives your answer.
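If a closed-form answer isn't required, both questions are easy to approximate by simulation in R; the function names below are made up:

# Q1: draws needed until some card appears for the second time
draws_until_repeat <- function(d = 100) {
  seen <- integer(0)
  repeat {
    card <- sample.int(d, 1)
    if (card %in% seen) return(length(seen) + 1)
    seen <- c(seen, card)
  }
}

# Q2 (coupon collector): draws needed until every card has been seen at least once
draws_until_all_seen <- function(d = 100) {
  seen <- logical(d)
  n <- 0
  while (!all(seen)) {
    seen[sample.int(d, 1)] <- TRUE
    n <- n + 1
  }
  n
}

set.seed(1)
mean(replicate(10000, draws_until_repeat()))    # Q1: roughly 13 draws on average
mean(replicate(10000, draws_until_all_seen()))  # Q2: roughly 100 * sum(1 / (1:100)), about 519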
